## Experiment Goal

The goal of this experiment is to research what a complete observation space would look like in terms of trainer parties and battle fields.

In [2]:
import gymnasium as gym
import numpy as np

In [3]:
import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

from services.helper_methods import *

## Informally Defined Observation Space

The observation space should include the following:
- The agent's party
- The NPC's party
- Lingering effects on the battlefield, like:
  - Weather
  - Stealth rocks and other entry hazards
  - Reflect and Light screen effects
  - etc
- Volatile status effects applied to each player's Pokémon out in the field, like:
  - Stat changes from buffs and debuffs (like attack up, defense down, etc)
  - Effects like confusion, leech seed, flinching, etc
  - Trapping moves like whirlpool, fire spin, etc

### Party
A party consists of one to six Pokémon in an ordered fashion.

### Pokemon
A Pokémon is a tuple with:
- Its level
- Its base stats
- Its EVs (effort values)
- Its IVs (individual values)
- Its nature
- Its effective stats (computed from EVs, IVs, base stats, level and nature)
- Types (one or two types)
- Its ability
- Available moves
- Non-volatile status effects (one of sleep, poisoned, badly-poisoned, burn, freeze, paralysis)
- Held item
- Its weight (used for moves like low kick and grass knot)
- Current friendship value (used for moves like return and frustration)
- Its gender (used for moves like attract)

### Move
A single move is a tuple with:
- A unique identifier
- The move's type
- Its category (physical, special, status)
- Its base power
- Its accuracy
- Its PP
- Its priority
- What it targets

## Finding min and max for every stat

| Stat | Min | Max |
|------|-----|-----|
| HP   | 0   | 714 |
| Atk  | 4   | 471 |
| Def  | 4   | 614 |
| SpA  | 4   | 447 |
| SpD  | 4   | 614 |
| Spe  | 4   | 460 |
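
As a rough illustration of how these bounds could be used, the sketch below min-max scales a stat into the range 0 to 1. The names `STAT_BOUNDS` and `scale_stat` are hypothetical and not part of the notebook or the simulator, and clipping out-of-range values is an assumption about how one might want to handle them.

```python
# Bounds copied from the table above: stat name -> (min, max).
STAT_BOUNDS = {
    "hp": (0, 714),
    "attack": (4, 471),
    "defense": (4, 614),
    "sp. atk": (4, 447),
    "sp. def": (4, 614),
    "speed": (4, 460),
}


def scale_stat(name: str, value: int) -> float:
    """Min-max scale a stat into [0, 1]; values outside the bounds are clipped."""
    low, high = STAT_BOUNDS[name]
    clipped = max(low, min(high, value))
    return (clipped - low) / (high - low)


print(scale_stat("hp", 714))    # upper bound maps to 1.0
print(scale_stat("attack", 4))  # lower bound maps to 0.0
```

Whether scaling is needed at all depends on the model; the raw integer bounds may be enough for defining the space.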

---

One might be tempted to just use the described values from the dataframe:

In [4]:
stat_columns = ['hp', 'attack', 'defense', 'sp. atk', 'sp. def', 'speed']

In [5]:
pokemon_stats_df[stat_columns].describe()

Unnamed: 0,hp,attack,defense,sp. atk,sp. def,speed
count,493.0,493.0,493.0,493.0,493.0,493.0
mean,67.730223,73.496957,70.109533,67.981744,69.158215,65.440162
std,27.580375,29.168464,30.703012,28.515038,27.884112,27.223685
min,1.0,5.0,5.0,10.0,20.0,5.0
25%,50.0,50.0,50.0,45.0,50.0,45.0
50%,65.0,72.0,65.0,65.0,65.0,65.0
75%,80.0,90.0,85.0,90.0,85.0,85.0
max,255.0,165.0,230.0,154.0,230.0,160.0


However, stats are not actually bounded by these values, as the game calculates the stats with the following formulas:

$$
hp(p) = \left\lfloor \frac{\left(2 \times \text{base} + \text{iv} + \left\lfloor \frac{\text{ev}}{4} \right\rfloor\right) \times \text{level}}{100} \right\rfloor + \text{level} + 10
$$
All other stats:
$$
f(p) = \left\lfloor \left( \left\lfloor \frac{\left(2 \times \text{base} + \text{iv} + \left\lfloor \frac{\text{ev}}{4} \right\rfloor\right) \times \text{level}}{100} \right\rfloor + 5 \right) \times \text{nature} \right\rfloor
$$

We could use the theoretical maximum integer value for the stats, but that would waste space, as no Pokémon will ever reach those values. Instead, we will use the data from the dataframe to get the maximum and minimum values that are possible in practice for each stat.

In [6]:
def calc_hp(base: int, ev = 252, iv = 31, lvl = 100):
    # Taken from %ENV-DIR%/poke_battle_sim/poke_sim/pokemon.py::Pokemon::calculate_stats_actual
    # stats_actual.append(
    #     ((2 * self.base[0] + self.ivs[0] + self.evs[0] // 4) * self.level) // 100 + 10
    # )
    return int(((2 * base + iv + ev // 4) * lvl) // 100 + lvl + 10)

def calc_stat(base: int, ev = 252, iv = 31, lvl = 100, nature = 1.1): 
    # Taken from %ENV-DIR%/poke_battle_sim/poke_sim/pokemon.py::Pokemon::calculate_stats_actual
    # stats_actual.append(
    #     (
    #         ((2 * self.base[s] + self.ivs[s] + self.evs[s] // 4) * self.level)
    #         // 100
    #         + 5
    #     )
    #     * nature_stat_changes[s]
    # )
    return int((((2 * base + iv + ev // 4) * lvl) // 100 + 5) * nature)

In [7]:
# quick sanity check
turtwig = pokemon_stats_df[pokemon_stats_df['name'] == 'turtwig'][stat_columns]
turtwig['hp'] = calc_hp(turtwig['hp'].values[0])
turtwig['attack'] = calc_stat(turtwig['attack'].values[0])
turtwig['defense'] = calc_stat(turtwig['defense'].values[0])
turtwig['sp. atk'] = calc_stat(turtwig['sp. atk'].values[0])
turtwig['sp. def'] = calc_stat(turtwig['sp. def'].values[0])
turtwig['speed'] = calc_stat(turtwig['speed'].values[0])
assert all(turtwig.values[0] == [314, 258, 249, 207, 229, 177])

In [8]:
max_computer_stats = pokemon_stats_df[stat_columns].copy()
max_computer_stats['hp'] = max_computer_stats['hp'].apply(lambda x: calc_hp(x))
max_computer_stats['attack'] = max_computer_stats['attack'].apply(lambda x: calc_stat(x))
max_computer_stats['defense'] = max_computer_stats['defense'].apply(lambda x: calc_stat(x))
max_computer_stats['sp. atk'] = max_computer_stats['sp. atk'].apply(lambda x: calc_stat(x))
max_computer_stats['sp. def'] = max_computer_stats['sp. def'].apply(lambda x: calc_stat(x))
max_computer_stats['speed'] = max_computer_stats['speed'].apply(lambda x: calc_stat(x))
max_computer_stats.describe()

Unnamed: 0,hp,attack,defense,sp. atk,sp. def,speed
count,493.0,493.0,493.0,493.0,493.0,493.0
mean,339.460446,269.787018,262.330629,257.634888,260.235294,252.060852
std,55.16075,64.15683,67.53122,62.732855,61.338006,59.882815
min,206.0,119.0,119.0,130.0,152.0,119.0
25%,304.0,218.0,218.0,207.0,218.0,207.0
50%,334.0,267.0,251.0,251.0,251.0,251.0
75%,364.0,306.0,295.0,306.0,295.0,295.0
max,714.0,471.0,614.0,447.0,614.0,460.0


In [9]:
max_stats_row = max_computer_stats.loc[max_computer_stats.idxmax()]
pokemon_stats_df.loc[max_stats_row.index][['name']]

Unnamed: 0,name
241,blissey
408,rampardos
212,shuckle
149,mewtwo
212,shuckle
290,ninjask


The output above is correct: I know from experience that these are the Pokémon with the highest achievable value for each stat. Now for the minimum values. We can skip HP, since its minimum is already known (0).

In [10]:
min_computer_stats = pokemon_stats_df[['attack', 'defense', 'sp. atk', 'sp. def', 'speed']].copy()
min_computer_stats['attack'] = min_computer_stats['attack'].apply(lambda x: calc_stat(x, ev=0, iv=0, lvl=1, nature=0.9))
min_computer_stats['defense'] = min_computer_stats['defense'].apply(lambda x: calc_stat(x, ev=0, iv=0, lvl=1, nature=0.9))
min_computer_stats['sp. atk'] = min_computer_stats['sp. atk'].apply(lambda x: calc_stat(x, ev=0, iv=0, lvl=1, nature=0.9))
min_computer_stats['sp. def'] = min_computer_stats['sp. def'].apply(lambda x: calc_stat(x, ev=0, iv=0, lvl=1, nature=0.9))
min_computer_stats['speed'] = min_computer_stats['speed'].apply(lambda x: calc_stat(x, ev=0, iv=0, lvl=1, nature=0.9))
min_computer_stats.describe()

Unnamed: 0,attack,defense,sp. atk,sp. def,speed
count,493.0,493.0,493.0,493.0,493.0
mean,5.004057,4.945233,4.89858,4.941176,4.843813
std,0.668684,0.706418,0.685104,0.655332,0.635939
min,4.0,4.0,4.0,4.0,4.0
25%,5.0,5.0,4.0,5.0,4.0
50%,5.0,5.0,5.0,5.0,5.0
75%,5.0,5.0,5.0,5.0,5.0
max,7.0,8.0,7.0,8.0,7.0


In [11]:
min_stats_row = min_computer_stats.loc[min_computer_stats.idxmin()]
pokemon_stats_df.loc[min_stats_row.index][['name']]

Unnamed: 0,name
0,bulbasaur
0,bulbasaur
9,caterpie
9,caterpie
0,bulbasaur


This did not seem right at first, because I expected Kricketune to have the lowest defensive stats in the game. But after manual verification via various other calculators, I can confirm that these are the correct values.
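
The result is less surprising once the formula is evaluated at level 1 with zero IVs and EVs: the base stat then only contributes $\lfloor 2 \times \text{base} / 100 \rfloor$, so every base stat below 50 produces the same value. Since `idxmin` returns the first row that holds the minimum, Bulbasaur and Caterpie are presumably just the first entries in the dataframe with a base stat below 50 in the respective column. The sketch below redefines `calc_stat` with the minimum-case defaults so that it runs on its own.

```python
def calc_stat(base: int, ev: int = 0, iv: int = 0, lvl: int = 1, nature: float = 0.9) -> int:
    # Same arithmetic as the notebook's calc_stat, but with the minimum-case defaults.
    return int((((2 * base + iv + ev // 4) * lvl) // 100 + 5) * nature)


# Every base stat below 50 collapses to the same value at level 1.
low_bases = {calc_stat(base) for base in range(1, 50)}
print(low_bases)

# Base stats from 50 to 99 land exactly one point higher.
mid_bases = {calc_stat(base) for base in range(50, 100)}
print(mid_bases)
```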

## Label Encoding Types

| Type | Encoding |
|------|----------|
| bug | 0 |
| dark | 1 |
| dragon | 2 |
| electric | 3 |
| fighting | 4 |
| fire | 5 |
| flying | 6 |
| ghost | 7 |
| grass | 8 |
| ground | 9 |
| ice | 10 |
| normal | 11 |
| poison | 12 |
| psychic | 13 |
| rock | 14 |
| steel | 15 |
| water | 16 |
| nan | 17 |

Luckily, these are a lot easier to obtain. Since types form a finite set of strings (17, to be exact), we can simply label encode them; `nan`, which marks a missing type, gets its own code (17). For future reference and consistency's sake, I put the encodings in the markdown table above, just in case the behavior of sklearn's `LabelEncoder` ever changes.

In [12]:
# from services.helper_methods
print(type_encoder.classes_)
print(type_encoder.transform(type_encoder.classes_))

['bug' 'dark' 'dragon' 'electric' 'fighting' 'fire' 'flying' 'ghost'
 'grass' 'ground' 'ice' 'normal' 'poison' 'psychic' 'rock' 'steel' 'water'
 nan]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17]
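
The table can also be reproduced without sklearn, since `LabelEncoder` assigns codes in sorted order. The following is a minimal, dependency-free sketch under that assumption; `TYPE_NAMES`, `type_codes` and `NAN_TYPE_CODE` are hypothetical names, not part of `services.helper_methods`.

```python
# The 17 type names, deliberately listed out of order.
TYPE_NAMES = [
    "normal", "fire", "water", "electric", "grass", "ice", "fighting",
    "poison", "ground", "flying", "psychic", "bug", "rock", "ghost",
    "dragon", "dark", "steel",
]

# Sorting alphabetically and enumerating reproduces the codes in the table.
type_codes = {name: code for code, name in enumerate(sorted(TYPE_NAMES))}

# The missing-type marker takes the next free code after the real types.
NAN_TYPE_CODE = len(TYPE_NAMES)

print(type_codes["bug"], type_codes["water"], NAN_TYPE_CODE)
```

Pinning the mapping in a plain dictionary like this is one way to stay independent of library updates.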


## Abilities

Abilities form a discrete set of numbers: $\{0, 1, \ldots, 121\}$

In [13]:
abilities_df

Unnamed: 0,ability_id,ability_name,gen
0,1,stench,3
1,2,drizzle,3
2,3,speed-boost,3
3,4,battle-armor,3
4,5,sturdy,3
...,...,...,...
117,118,honey-gather,4
118,119,frisk,4
119,120,reckless,4
120,121,multitype,4


## Moves

In [14]:
moves_df

Unnamed: 0,id,identifier,generation_id,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
0,1,pound,1,normal,40.0,35,100.0,0,10,2,1,,,
1,2,karate-chop,1,fighting,50.0,25,100.0,0,10,2,8,,,
2,3,double-slap,1,normal,15.0,10,85.0,0,10,2,10,,,
3,4,comet-punch,1,normal,18.0,15,85.0,0,10,2,10,,,
4,5,mega-punch,1,normal,80.0,20,85.0,0,10,2,1,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
462,463,magma-storm,4,fire,100.0,5,75.0,0,10,3,24,,,7.0
463,464,dark-void,4,dark,,10,80.0,0,11,1,13,,,5.0
464,465,seed-flare,4,grass,120.0,5,85.0,0,10,3,3,40.0,-2.0,4.0
465,466,ominous-wind,4,ghost,60.0,5,100.0,0,10,3,112,10.0,,


Since we want the agent to learn moves by their properties rather than by their names, we do not care about a move's ID or generation and can simply drop those columns. We keep the identifier so that we can still look a move up later.

In [15]:
moves_df.drop(columns=['id', 'generation_id'], inplace=True)

### Power

To summarize the research below:
- Status moves will have a power of 0
- All other moves with a power of `np.nan` will be -1

This makes the `min` and `max` values for power `-1` and `250` respectively.

We can safely assume that status moves will have a power of 0, as there are no status moves with a power greater than 0.

In [16]:
moves_df.loc[moves_df['move_class'] == 1, 'power'] = 0

In [17]:
unique_effect_moves = moves_df.drop_duplicates(subset=['effect_id'])
unique_effect_moves

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
0,pound,normal,40.0,35,100.0,0,10,2,1,,,
1,karate-chop,fighting,50.0,25,100.0,0,10,2,8,,,
2,double-slap,normal,15.0,10,85.0,0,10,2,10,,,
5,pay-day,normal,40.0,20,100.0,0,10,2,0,,,
6,fire-punch,fire,75.0,15,100.0,0,10,2,5,10.0,,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...
447,chatter,flying,65.0,20,100.0,0,10,3,215,,,
448,judgment,normal,100.0,10,100.0,0,10,3,216,,,
456,head-smash,rock,150.0,5,80.0,0,10,2,217,,,
460,lunar-dance,psychic,0.0,10,,0,7,1,218,,,


In [18]:
unique_effect_moves_with_nan_power = unique_effect_moves[unique_effect_moves['power'].isna()].sort_values(by='effect_id')
unique_effect_moves_with_nan_power

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
11,guillotine,normal,,5,-1.0,0,10,2,20,,,
48,sonic-boom,normal,,20,90.0,0,10,3,31,,20.0,
66,low-kick,fighting,,20,100.0,0,10,2,35,,,
67,counter,fighting,,20,100.0,-5,1,2,36,,,
68,seismic-toss,fighting,,20,100.0,0,10,2,37,,,
116,bide,normal,,10,,1,7,2,52,,,
148,psywave,psychic,,15,100.0,0,10,3,60,,,
161,super-fang,normal,,10,90.0,0,10,2,66,,,
174,flail,normal,,15,100.0,0,10,2,78,,,
215,return,normal,,20,100.0,0,10,2,96,,,


In [19]:
# https://www.serebii.net/attackdex-dp/normal.shtml
move_url = 'https://pokemondb.net/move/{0}'
moves_with_weird_powers = [
    (
        move_url.format(i[0]), # A url of the moves page to check its effect
        i[1], # The effect_id
        moves_df[moves_df['effect_id'] == i[1]]['identifier'].values # Moves with same effect_id for comparison
    ) 
    for i in unique_effect_moves_with_nan_power[['identifier', 'effect_id']].sort_values(by='effect_id').values
]
len(moves_with_weird_powers)

24

In [20]:
moves_with_weird_powers[23]

('https://pokemondb.net/move/punishment',
 198,
 array(['punishment'], dtype=object))

In [21]:
effect_id_power_mapping = {
    20: -1, # OHKO moves
    31: -1, # deal constant damage (sonic boom = 20, dragon rage = 40)
    35: -1, # deal damage based on weight
    36: -1, # deal damage based on damage taken by previous physical move (counter)
    37: -1, # deal damage based on users level (night shade, seismic toss)
    52: -1, # deals twice the hp damage user took (bide)
    60: -1, # deals random amount of HP damage, varying between 50% and 150% of the user's level
    66: -1, # always deals half of the target's current HP
    78: -1, # deals more damage the lower the user's HP
    96: -1, # deals damage based on friendship (return)
    97: -1, # deals damage or heals the target (present)
    98: -1, # deals damage based on friendship (frustration)
    101: -1, # base power of Magnitude is one of 7 random values
    110: -1, # When hit by a Special Attack, user strikes back with 2x power. (mirror coat)
    114: -1, # deals damage from each Pokémon on your team that does not have a status ailment (nor is fainted)
    117: -1, # deals varying damage depending on how many times the user used Stockpile (spit up)
    142: -1, # reduces the opponent's HP to equal the user's HP (endeavor)
    172: -1, # the slower the user compared to the opponent, the higher the damage, up to a maximum base power of 150 (gyro ball)
    175: -1, # its type and base power vary depending on the user's held Berry (natural gift)
    180: -1, # deals 1.5x the damage the user last took from the opponent's attack (metal burst)
    186: -1, # Power depends on held item (fling)
    188: -1, # inflicts more damage when fewer PP are left (trump card)
    190: -1, # inflicts more damage when the opponent's HP is higher 
    198: -1, # deals varying damage based on the opponent's stat increases
}

After some thought, the mapping dictionary above turned out to be redundant, since every entry maps to `-1`. I'll keep it here for archiving purposes. If anyone wants to use different values for these effect IDs in the future, they can just change the values in the dictionary.

In [22]:
print(moves_df['power'].isna().sum())

for k in effect_id_power_mapping.keys():
    moves_df.loc[moves_df['effect_id'] == k, 'power'] = effect_id_power_mapping[k]

print(moves_df['power'].isna().sum())

32
0


In [23]:
moves_df['power'].describe()

count    467.00000
mean      42.16060
std       45.86332
min       -1.00000
25%        0.00000
50%       35.00000
75%       75.00000
max      250.00000
Name: power, dtype: float64

### PP

The `min` and `max` values for PP are `0` and `64` respectively.

In [24]:
moves_df['pp'].describe()

count    467.000000
mean      15.903640
std        8.833299
min        1.000000
25%       10.000000
50%       15.000000
75%       20.000000
max       40.000000
Name: pp, dtype: float64

In [25]:
moves_df['pp'].isna().sum()

0

If we take into account the items `PP Up` and `PP Max`, the max value for PP becomes $40 \times \frac{8}{5} = 64$.
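
The factor $\frac{8}{5}$ can be spelled out as a small function. This sketch assumes that each PP Up adds one fifth of the base PP (rounded down), that at most three can be applied, and that a PP Max is equivalent to three PP Ups; `max_pp` is a hypothetical helper, not part of the simulator.

```python
def max_pp(base_pp: int, pp_ups: int = 3) -> int:
    """PP after applying `pp_ups` PP Ups (0 to 3) to a move with `base_pp` PP."""
    if not 0 <= pp_ups <= 3:
        raise ValueError("a move can receive between 0 and 3 PP Ups")
    # Each PP Up adds a fifth of the base PP, rounded down.
    return base_pp + pp_ups * (base_pp // 5)


print(max_pp(40))  # highest base PP in moves_df, fully boosted
print(max_pp(5))   # a low-PP move, fully boosted
```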

### Accuracy

In [26]:
sorted(moves_df['accuracy'].unique())

[-1.0, 50.0, 55.0, 60.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0, nan]

In [27]:
moves_df[moves_df['accuracy'] < 0]

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
11,guillotine,normal,-1.0,5,-1.0,0,10,2,20,,,
31,horn-drill,normal,-1.0,5,-1.0,0,10,2,20,,,
89,fissure,ground,-1.0,5,-1.0,0,10,2,20,,,
328,sheer-cold,ice,-1.0,5,-1.0,0,10,3,20,,,


It seems that all moves with `accuracy < 0` are [OHKO moves](https://bulbapedia.bulbagarden.net/wiki/One-hit_knockout_move#Generation_III_onward). We can simply use these values as is, since all OHKO moves follow the same accuracy rules.
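
For reference, the shared rule from Generation III onward, as described in the linked article, is that the hit chance depends only on the level difference. The sketch below encodes that rule; `ohko_hit_chance` is a hypothetical helper, and it deliberately ignores abilities and other effects that can override the check.

```python
def ohko_hit_chance(user_level: int, target_level: int) -> int:
    """Hit chance in percent for an OHKO move (Generation III onward rule)."""
    # The move always fails against a higher-level target.
    if target_level > user_level:
        return 0
    # Otherwise the chance is 30% plus the level difference, capped at 100%.
    return min(100, 30 + user_level - target_level)


print(ohko_hit_chance(50, 50))  # equal levels
print(ohko_hit_chance(40, 50))  # target is higher level
```

Because the rule is identical for all four moves, a single sentinel value such as `-1` is enough to tell the agent that this special check applies.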

In [28]:
unique_effect_moves_with_nan_accuracy = unique_effect_moves[unique_effect_moves['accuracy'].isna()].sort_values(by='effect_id')
unique_effect_moves_with_nan_accuracy

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
13,swords-dance,normal,0.0,20,,0,7,1,16,,2.0,1.0
53,mist,ice,0.0,30,,0,4,1,33,,,
101,mimic,normal,0.0,10,,0,10,1,44,,,
104,recover,normal,0.0,10,,0,7,1,46,,,
106,minimize,normal,0.0,10,,0,7,1,47,,,
...,...,...,...,...,...,...,...,...,...,...,...,...
392,magnet-rise,electric,0.0,10,,0,7,1,205,,,
431,defog,flying,0.0,15,,0,10,1,211,,,
432,trick-room,psychic,0.0,5,,-7,12,1,212,,,
445,stealth-rock,rock,0.0,20,,0,6,1,214,,,


There seem to be a few reasons why accuracy would be `np.nan`:
- The move is a self-buffing move (like Swords Dance), a move that alters the battlefield (like Stealth Rock), or a move that is designed to bypass accuracy checks (like Aerial Ace, Transform, etc.)
  - These moves are unaffected by accuracy and will get an accuracy of `-1`
- The move has hardcoded rules for targeting (like [Snatch](https://bulbapedia.bulbagarden.net/wiki/Snatch_(move)#Generations_III_and_IV), for example)
- The move copies another move and is thus dependent on the copied move's accuracy
  - Moves in this category will get an accuracy of `-2`
- Whether the move hits or misses is uniquely defined for that move (like Rest, Curse)
  - Moves in this category will get an accuracy of `-3`

Special cases:
- splash
- destiny-bond
- nature-power

In [29]:
moves_effectid_unaffected_by_accuracy = [ 0, 16, 17, 33, 44, 46, 47, 48, 49, 50, 51, 52, 59, 68, 72, 82, 83, 84, 87, 95, 99, 100, 102, 105, 108, 109, 119, 127 ]
moves_effectid_copying_moves = [ 53, 54, 69 ]
moves_effectid_uniquedefinedhitmis = [ 63, 64, 67, 73, 77, 79, 81, 85, 86, 88, 94, 111, 116, 118 ]

In [30]:
moves_df.loc[moves_df['accuracy'].isna() & moves_df['effect_id'].isin(moves_effectid_unaffected_by_accuracy), 'accuracy'] = -1
moves_df.loc[moves_df['accuracy'].isna() & moves_df['effect_id'].isin(moves_effectid_copying_moves), 'accuracy'] = -2
moves_df.loc[moves_df['accuracy'].isna() & moves_df['effect_id'].isin(moves_effectid_uniquedefinedhitmis), 'accuracy'] = -3

In [31]:
unique_effect_moves_with_nan_accuracy = moves_df[moves_df['accuracy'].isna()].sort_values(by='effect_id')
unique_effect_moves_with_nan_accuracy

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
128,swift,normal,60.0,20,,0,11,3,1,,,
442,magnet-bomb,steel,60.0,20,,0,10,2,1,,,
344,magical-leaf,grass,60.0,20,,0,10,3,1,,,
331,aerial-ace,flying,60.0,20,,0,10,2,1,,,
324,shadow-punch,ghost,60.0,20,,0,10,2,1,,,
350,shock-wave,electric,60.0,20,,0,10,3,1,,,
395,aura-sphere,fighting,80.0,20,,0,10,3,1,,,
184,feint-attack,dark,60.0,20,,0,10,2,1,,,
232,vital-throw,fighting,70.0,10,,-1,10,2,1,,,
149,splash,normal,0.0,40,,0,7,1,61,,,


### Priority

The `min` and `max` values for priority are `-7` and `5` respectively.

In [32]:
moves_df['priority'].unique()

array([ 0, -6, -5,  1,  4, -1,  2,  3, -3,  5, -4, -7], dtype=int64)

In [46]:
min(moves_df['priority'].unique()), max(moves_df['priority'].unique())

(-7, 5)

### Target ID

In [33]:
sorted(moves_df['target_id'].unique())

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

After some research, I found that target id dictates what the move targets. So for stat boosting on self target id = 7, stealth rock and other entry hazards target id = 6, etc. This is not a very useful feature for the agent to learn, so we will not include it in the observation space.

### Move class

In [34]:
print(moves_df['move_class'].unique())
move_category_labels = {
    1: 'status',
    2: 'physical',
    3: 'special'
}
print(move_category_labels)

[2 3 1]
{1: 'status', 2: 'physical', 3: 'special'}


I thought I would need to encode move categories as well, but the simulator already represents them as numerical values.

### Effect ID

In [35]:
moves_df[moves_df['effect_id'] == 1]

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
0,pound,normal,40.0,35,100.0,0,10,2,1,,,
4,mega-punch,normal,80.0,20,85.0,0,10,2,1,,,
9,scratch,normal,40.0,35,100.0,0,10,2,1,,,
10,vice-grip,normal,55.0,30,100.0,0,10,2,1,,,
14,cut,normal,50.0,30,95.0,0,10,2,1,,,
16,wing-attack,flying,60.0,35,100.0,0,10,2,1,,,
20,slam,normal,80.0,20,75.0,0,10,2,1,,,
21,vine-whip,grass,45.0,25,100.0,0,10,2,1,,,
24,mega-kick,normal,120.0,5,75.0,0,10,2,1,,,
29,horn-attack,normal,65.0,25,100.0,0,10,2,1,,,


### Effect Chance

In [36]:
sorted(moves_df['effect_chance'].unique())

[nan, 10.0, 20.0, 30.0, 40.0, 50.0, 70.0, 100.0]

In [37]:
moves_df[moves_df['effect_chance'].isna()]

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
0,pound,normal,40.0,35,100.0,0,10,2,1,,,
1,karate-chop,fighting,50.0,25,100.0,0,10,2,8,,,
2,double-slap,normal,15.0,10,85.0,0,10,2,10,,,
3,comet-punch,normal,18.0,15,85.0,0,10,2,10,,,
4,mega-punch,normal,80.0,20,85.0,0,10,2,1,,,
...,...,...,...,...,...,...,...,...,...,...,...,...
460,lunar-dance,psychic,0.0,10,,0,7,1,218,,,
461,crush-grip,normal,-1.0,5,100.0,0,10,2,190,,,
462,magma-storm,fire,100.0,5,75.0,0,10,3,24,,,7.0
463,dark-void,dark,0.0,10,80.0,0,11,1,13,,,5.0


### Effect Amount

In [38]:
list(moves_df[moves_df['effect_amt'].notna()]['identifier'])

['swords-dance',
 'sand-attack',
 'tail-whip',
 'leer',
 'growl',
 'sonic-boom',
 'acid',
 'bubble-beam',
 'aurora-beam',
 'growth',
 'string-shot',
 'dragon-rage',
 'psychic',
 'meditate',
 'agility',
 'screech',
 'double-team',
 'harden',
 'smokescreen',
 'withdraw',
 'barrier',
 'constrict',
 'amnesia',
 'kinesis',
 'bubble',
 'flash',
 'acid-armor',
 'sharpen',
 'flame-wheel',
 'cotton-spore',
 'scary-face',
 'mud-slap',
 'octazooka',
 'icy-wind',
 'charm',
 'steel-wing',
 'sacred-fire',
 'sweet-scent',
 'iron-tail',
 'metal-claw',
 'crunch',
 'shadow-ball',
 'rock-smash',
 'tail-glow',
 'luster-purge',
 'mist-ball',
 'feather-dance',
 'crush-claw',
 'meteor-mash',
 'fake-tears',
 'overheat',
 'rock-tomb',
 'metal-sound',
 'muddy-water',
 'iron-defense',
 'howl',
 'mud-shot',
 'psycho-boost',
 'hammer-arm',
 'rock-polish',
 'night-slash',
 'bug-buzz',
 'focus-blast',
 'energy-ball',
 'earth-power',
 'nasty-plot',
 'mud-bomb',
 'psycho-cut',
 'mirror-shot',
 'flash-cannon',
 'draco-

Almost all the moves in the output above have a primary or secondary effect that alters a Pokémon's stats by some amount, and the `effect_amt` column holds that amount. The exceptions are sonic-boom and dragon-rage, where the column appears to hold the fixed damage instead (20 for sonic-boom in the NaN-power table earlier). This is a very useful feature for the agent to learn, so we will include it in the observation space.
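
To make concrete what a stat change of a given amount means in battle, the sketch below converts a stat stage into its multiplier. It assumes the standard stage rule for attack, defense, special attack, special defense and speed (accuracy and evasion use a different ratio) and is not taken from the simulator; `stage_multiplier` is a hypothetical helper.

```python
from fractions import Fraction


def stage_multiplier(stage: int) -> Fraction:
    """Multiplier applied to a stat at the given stage (-6 to +6)."""
    if not -6 <= stage <= 6:
        raise ValueError("stat stages are limited to the range -6..+6")
    # Positive stages grow the numerator, negative stages grow the denominator.
    if stage >= 0:
        return Fraction(2 + stage, 2)
    return Fraction(2, 2 - stage)


print(stage_multiplier(2))   # e.g. after one Swords Dance (effect_amt = 2)
print(stage_multiplier(-1))  # e.g. after one Growl
```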

In [39]:
moves_df['effect_stat'].unique()

array([nan,  1.,  2.,  3.,  6.,  4.,  5.,  7.])

In [40]:
moves_df[moves_df['effect_stat'].notna()]

Unnamed: 0,identifier,type_id,power,pp,accuracy,priority,target_id,move_class,effect_id,effect_chance,effect_amt,effect_stat
6,fire-punch,fire,75.0,15,100.0,0,10,2,5,10.0,,1.0
7,ice-punch,ice,75.0,15,100.0,0,10,2,5,10.0,,2.0
8,thunder-punch,electric,75.0,15,100.0,0,10,2,5,10.0,,3.0
13,swords-dance,normal,0.0,20,-1.0,0,7,1,16,,2.0,1.0
19,bind,normal,15.0,20,85.0,0,10,2,24,,,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...
440,gunk-shot,poison,120.0,5,80.0,0,10,2,5,30.0,,4.0
450,charge-beam,electric,50.0,10,90.0,0,10,3,2,70.0,1.0,3.0
462,magma-storm,fire,100.0,5,75.0,0,10,3,24,,,7.0
463,dark-void,dark,0.0,10,80.0,0,11,1,13,,,5.0


### Effect Stat

I could not find out what `effect_stat` is for, so I will not include it in the observation space.

## The empty move

The empty move will be defined as follows:

| Column          | Value                                          |
|-----------------|------------------------------------------------|
| `id`            | `starter_move_list['id'].min() - 1`            |
| `type_id`       | 17 (the `np.nan` encoded value)                |
| `power`         | `starter_move_list['power'].min() - 1`         |
| `pp`            | `starter_move_list['pp'].min() - 1`            |
| `accuracy`      | `starter_move_list['accuracy'].min() - 1`      |
| `priority`      | `starter_move_list['priority'].min() - 1`      |
| `target_id`     | `starter_move_list['target_id'].min() - 1`     |
| `move_class`    | `starter_move_list['move_class'].min() - 1`    |
| `effect_id`     | `starter_move_list['effect_id'].min() - 1`     |
| `effect_chance` | `starter_move_list['effect_chance'].min() - 1` |
| `effect_amt`    | `starter_move_list['effect_amt'].min() - 1`    |
| `effect_stat`   | `starter_move_list['effect_stat'].min() - 1`   |

This results in the following tuple:
> $\lambda = (0, 17, -1, -1, -1, 0, -1, -1, -1, -1, -1, -1)$
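
One likely use for the empty move is padding a moveset to a fixed length, since a Pokémon can know at most four moves. The sketch below assumes that usage; `pad_moves` and `MAX_MOVES` are hypothetical names, and the encoded `pound` tuple is only illustrative (its missing values are filled with `-1` as an assumption).

```python
# The empty move tuple from above.
EMPTY_MOVE = (0, 17, -1, -1, -1, 0, -1, -1, -1, -1, -1, -1)
MAX_MOVES = 4  # a Pokémon can know at most four moves


def pad_moves(moves: list) -> list:
    """Pad a moveset with the empty move so it always has MAX_MOVES entries."""
    if len(moves) > MAX_MOVES:
        raise ValueError(f"a moveset holds at most {MAX_MOVES} moves")
    return list(moves) + [EMPTY_MOVE] * (MAX_MOVES - len(moves))


# Illustrative encoding of pound, in the column order of the table above.
pound = (1, 11, 40, 35, 100, 0, 10, 2, 1, -1, -1, -1)
padded = pad_moves([pound])
print(len(padded))
```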

## Non-volatile status effects (like sleep, poison, etc)

| Status         | Encoding |
|----------------|----------|
| Burn           | 0        |
| Freeze         | 1        |
| Paralysis      | 2        |
| Poison         | 3        |
| Badly Poisoned | 4        |
| Sleep          | 5        |

See `%ENV-DIR%/poke_battle_sim/conf/global_settings.py`.
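
If these codes need to be referenced from Python, one option is an `IntEnum` that mirrors the table. This is only a sketch: `NonVolatileStatus` is a hypothetical name and not something the simulator provides. Note that the table defines no code for a Pokémon without a status, so a healthy Pokémon would need an additional value.

```python
from enum import IntEnum


class NonVolatileStatus(IntEnum):
    """Status codes copied from the table above."""
    BURN = 0
    FREEZE = 1
    PARALYSIS = 2
    POISON = 3
    BADLY_POISONED = 4
    SLEEP = 5


print(int(NonVolatileStatus.SLEEP))
print(NonVolatileStatus(3).name)
```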

## Held item

TODO: work out how to encode held items.

## Weight, friendship and gender

| Name       | Min | Max   |
|------------|-----|-------|
| weight     | 1   | 9500  |
| friendship | 0   | 254   |
| gender     | 0   | 2     |

There is nothing special about these. Every Pokémon has a weight defined in the data files. Friendship is an integer between 0 and 254 (inclusive). And there are three possible values for gender (male, female, or genderless), which need to be label encoded.

In [41]:
pokemon_stats_df.describe()

Unnamed: 0,ndex,hp,attack,defense,sp. atk,sp. def,speed,height,weight,base exp.,gen
count,493.0,493.0,493.0,493.0,493.0,493.0,493.0,493.0,493.0,493.0,493.0
mean,247.0,67.730223,73.496957,70.109533,67.981744,69.158215,65.440162,11.845842,590.900609,145.955375,2.401623
std,142.461106,27.580375,29.168464,30.703012,28.515038,27.884112,27.223685,11.344592,960.959391,81.698347,1.135602
min,1.0,1.0,5.0,5.0,10.0,20.0,5.0,2.0,1.0,36.0,1.0
25%,124.0,50.0,50.0,50.0,45.0,50.0,45.0,6.0,99.0,66.0,1.0
50%,247.0,65.0,72.0,65.0,65.0,65.0,65.0,10.0,295.0,147.0,2.0
75%,370.0,80.0,90.0,85.0,90.0,85.0,85.0,15.0,608.0,178.0,3.0
max,493.0,255.0,165.0,230.0,154.0,230.0,160.0,145.0,9500.0,635.0,4.0


In [42]:
pokemon_stats_df.isna().sum()

ndex           0
name           0
type 1         0
type 2       270
hp             0
attack         0
defense        0
sp. atk        0
sp. def        0
speed          0
height         0
weight         0
base exp.      0
gen            0
dtype: int64

In [43]:
pb.conf.global_settings.POSSIBLE_GENDERS

['male', 'female', 'genderless']

Pokemon are either:
- male or female
- always genderless

In [44]:
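import random
from sklearn.preprocessing import LabelEncoder

gender_encoder = LabelEncoder()
gender_encoder.fit(pb.conf.global_settings.POSSIBLE_GENDERS)

def get_random_gender_mf():
    # transform expects an array-like, so wrap the choice in a list and unwrap the result
    return gender_encoder.transform([random.choice(['male', 'female'])])[0]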

def get_gender_encoding(gender: str):
    return gender_encoder.transform([gender])[0]

def get_gender_decoding(gender: int):
    return gender_encoder.inverse_transform([gender])[0]

# for c in gender_encoder.classes_:
#     print(f'{c} -> {get_gender_encoding(c)} -> {get_gender_decoding(get_gender_encoding(c))}')
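
Because `LabelEncoder` assigns codes in sorted order, the resulting gender codes can be predicted without running sklearn. The following dependency-free sketch reproduces them under that assumption; `POSSIBLE_GENDERS` is copied from the output above and `gender_codes` is a hypothetical name.

```python
# Copied from pb.conf.global_settings.POSSIBLE_GENDERS as printed above.
POSSIBLE_GENDERS = ["male", "female", "genderless"]

# Sorting alphabetically and enumerating mirrors what LabelEncoder does.
gender_codes = {gender: code for code, gender in enumerate(sorted(POSSIBLE_GENDERS))}
print(gender_codes)
```

This matches the 0 to 2 range listed in the table for gender.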

## Lingering effects on the battlefield

TODO: work out how to encode lingering battlefield effects.

## Volatile status effects

TODO: work out how to encode volatile status effects.

## Incorporation into environment

TODO: incorporate this into the environment.

## Conclusion

### On removing stuff from the observation space

It might be interesting to see what removing some features from the observation space would do to the model. For example: would it be able to learn on its own that Swords Dance increases attack? Or would it be able to infer that a move is physical from the move's power, the Pokémon's attack stat, and its own defense stat? This would be an interesting experiment to run.
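
Such an ablation could be as simple as dropping selected features before the observation is built. The sketch below is purely hypothetical: `ablate` is not part of the environment, and the move is represented as a plain dictionary keyed by `moves_df` column names, with the values for swords-dance taken from the tables above.

```python
# swords-dance as a feature dictionary, keyed by moves_df column names.
swords_dance = {
    "type_id": 11,
    "power": 0,
    "pp": 20,
    "accuracy": -1,
    "priority": 0,
    "move_class": 1,
    "effect_id": 16,
    "effect_amt": 2,
}


def ablate(features: dict, dropped: set) -> dict:
    """Return a copy of the feature dictionary without the dropped features."""
    return {name: value for name, value in features.items() if name not in dropped}


# Hide the explicit hints and see whether the agent still learns the effect.
observation = ablate(swords_dance, {"effect_amt", "move_class"})
print(sorted(observation))
```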