We used a series of production rules based on knowledge of the game of Pokémon to build an inference engine that:
- Uses its knowledge of a Pokémon's moves and current status effects to model the decision-making of a Pokémon Trainer
- Scores each of its current Pokémon's move options
- For status moves, the Pokémon's current HP, existing buffs/debuffs, and effectiveness are considered in scoring
- For damage moves, move power, type effectiveness, same-type bonus, secondary effect application, and self-harm application are considered in scoring
- Chooses the highest-scoring move option to use against the opponent
- Recommends the highest-scoring move to the Pokémon Trainer each turn in battle
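The damage-move scoring described above can be sketched as follows. This is a minimal illustration, not the engine's actual rules: the type chart is a two-entry excerpt, and the 1.5× same-type bonus, secondary-effect bonus, and self-harm penalty values are assumptions for demonstration.

```python
# Minimal sketch of damage-move scoring (illustrative constants, not the
# project's actual production rules).
TYPE_CHART = {("Water", "Fire"): 2.0, ("Fire", "Water"): 0.5}  # tiny excerpt

def score_damage_move(power, move_type, attacker_types, defender_type,
                      secondary_effect_bonus=0.0, self_harm_penalty=0.0):
    """Score a damaging move: power x type effectiveness x same-type bonus,
    adjusted for secondary effects and self-harm (e.g. recoil)."""
    effectiveness = TYPE_CHART.get((move_type, defender_type), 1.0)
    stab = 1.5 if move_type in attacker_types else 1.0  # same-type attack bonus
    return power * effectiveness * stab + secondary_effect_bonus - self_harm_penalty

def best_move(scored_moves):
    """Pick the highest-scoring (move_name, score) pair."""
    return max(scored_moves, key=lambda m: m[1])
```

For example, a Water-type Pokémon using Surf (power 90) against a Fire type scores 90 × 2.0 × 1.5 = 270.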
High-Level Logic Diagram: https://app.diagrams.net/#G1qa-3euYMihw4ZtDVu00VMMIEIYdfPlNI
Or check out RBES.png in the main directory
Data Used: Pokemon Database, Pokemon Moves Database, Pokemon Battle Damage Calculation Formulas
Damage Calculations: https://bulbapedia.bulbagarden.net/wiki/Damage
Pokemon Stat Calculations: https://pokemon.fandom.com/wiki/Statistics
Pokedex Data: https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_National_Pok%C3%A9dex_number
Type Bonuses: https://www.ign.com/wikis/pokemon-red-blue-yellow-version/Pokemon_Types
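For reference, the Generation I damage formula documented at the Bulbapedia link above can be sketched as follows. This is a simplified version: it omits critical hits and the random 217–255/255 factor.

```python
import math

def gen1_damage(level, power, attack, defense, stab=1.0, type_eff=1.0):
    """Simplified Generation I damage formula (no critical hits, no random
    factor), following the Bulbapedia damage-calculation reference."""
    base = math.floor(math.floor((2 * level) / 5 + 2) * power * attack / defense)
    base = math.floor(base / 50) + 2
    return math.floor(base * stab * type_eff)
```

For example, a level-50 attacker with 100 Attack using a 90-power move against 80 Defense deals a base 51 damage, or 153 with the same-type bonus against a weak defender (51 × 1.5 × 2.0).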
We used state-based Q-learning to find the best move to choose based on rewards.
In each state, the type of each Pokémon is checked. Then both Pokémon use moves (actions) to deduct each other's HP.
One of two state-reward formulas is chosen based on the type-counter relationship between the two Pokémon and their HP difference.
The Pokémon is rewarded more if it survives hard combat (it was type-countered) and less if it survives easy combat (it type-countered the opponent).
Until the game ends (either Pokémon reaches zero HP), the state-reward is calculated and aggregated.
After the game ends (final state), a score bonus or penalty is added to the sum of state-rewards depending on whether the Pokémon won or lost.
The lose_counter increments by 1 each time the Pokémon loses, and the Pokémon is awarded a large reward if it lost multiple rounds before making a comeback to win.
The reward changes are recorded and fed to the Q-learning module to decide the policy: which action (Pokémon move) is optimal in each state.
Data Used: Pokemon Database, Pokemon Moves Database, Pokemon Battle Damage Calculation Formulas, Charizard combat data: train_data_output.csv in folder train_data
High-Level Logic Diagram:
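The reward shaping and Q-update described above can be sketched as follows. The state encoding, learning rate, discount factor, and bonus magnitudes here are illustrative assumptions, not the project's actual constants.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor (illustrative)

def step_reward(hp_diff, type_countered):
    """Per-state reward: larger for surviving hard combat (being
    type-countered), smaller for easy combat. Weights are assumptions."""
    return 2.0 * hp_diff if type_countered else 0.5 * hp_diff

def final_bonus(won, lose_counter):
    """End-of-game bonus or penalty, with an extra comeback reward when
    the Pokemon wins after multiple losses."""
    bonus = 100.0 if won else -100.0
    if won and lose_counter > 0:
        bonus += 50.0 * lose_counter  # comeback reward
    return bonus

def q_update(Q, state, action, reward, next_state, actions):
    """Standard Q-learning update for one (state, action) pair."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

Q = defaultdict(float)  # Q-table: (state, action) -> estimated value
```

The learned policy then picks, in each state, the action with the highest Q-value.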
We generated some data by having Pokémon engage in random battles, including information such as their own health points, opponent's health points, opponent's type, and so on. From this data, we selected only the winning instances and examined them for any hidden patterns. We utilized a decision tree to provide us with interpretable decisions. Through repeated validation, we found that 1,000 battles and a decision tree depth of 5 were optimal to avoid overfitting. Below is the automatically generated decision tree, where the "value" represents the occurrence count of a particular situation at each node.
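A minimal sketch of the decision-tree step, assuming scikit-learn and a synthetic stand-in for the winning battle instances (the real features and labels come from the generated battle data, e.g. own HP, opponent HP, opponent type):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for the battle log: [own_hp, opp_hp, opp_type_id]
X = rng.integers(0, 100, size=(1000, 3))
# Toy label purely for illustration (real labels come from winning instances)
y = (X[:, 0] > X[:, 1]).astype(int)

# Depth 5 matched the validation result described above (avoids overfitting)
clf = DecisionTreeClassifier(max_depth=5, random_state=0)
clf.fit(X, y)
```

The fitted tree can then be exported (e.g. with `sklearn.tree.plot_tree`) to produce the interpretable diagram described above, where "value" at each node is the occurrence count of that situation.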
This section will guide you on how to set up and run the Pokémon battle simulation using different AI strategies.
Follow the steps below to get started:
Before installing the required packages, it's a good idea to create a virtual environment. This will help isolate the dependencies for this project from other projects you may have on your system. You can create a virtual environment using the following command:
username$ python -m venv pokemon_ai
Activate the virtual environment:
- For Windows:
username$ pokemon_ai\Scripts\activate
- For macOS/Linux:
username$ source pokemon_ai/bin/activate
To install the required packages for this project, run the following command:
username$ pip install -r requirements.txt
To run the main script, use the following command:
python main.py
When creating a Pokémon object, you can choose your desired AI strategy from the following options: `random`, `rbes`, `DT` (decision tree), or `RL` (reinforcement learning). If you want to test your own model, you can provide the model's path through the `model_path` parameter when creating the Pokémon object.
You can also adjust the number of battles by changing the parameters in the for-loop. This will allow you to obtain the potential win rate, which will be printed as the output.
Example:
pokemon = Pokemon(pokemon_id=6, ai_strategy="DT", model_path="path_to_your_model")

Adjust the number of battles:

for i in range(1000):  # Change 1000 to the desired number of battles
    # Run the simulation

## Step 4: Analyze the results
After running the main script, you will see the win rate for the selected AI strategy. Use this information to compare different strategies and models, and to fine-tune your approach.