<a href="https://colab.research.google.com/github/NicolasChagnet/pokemon-team-optimization/blob/main/TeamOptimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Pokemon Team Optimization

The goal of this notebook is to find an optimal Pokemon team using the [Pokemon dataset](https://www.kaggle.com/datasets/rounakbanik/pokemon) and PuLP, a Python modelling package for linear and integer optimization problems.

The problem is defined by the following requirements:
- A team has at most 6 Pokemon in total; the models below use a full team of exactly 6.
- The coverage of the team should be maximized.
- The weaknesses of the team should be minimized.
- The base total (sum of all stats of each Pokemon) should be maximized.

## Initializations

In [4]:
%cd /content/drive/MyDrive/Colab\ Notebooks
# !git clone https://github.com/NicolasChagnet/pokemon-team-optimization.git
%cd pokemon-team-optimization
!git pull

/content/drive/MyDrive/Colab Notebooks
/content/drive/MyDrive/Colab Notebooks/pokemon-team-optimization
Already up to date.


In [8]:
# Download dependencies
!pip install pandas numpy matplotlib seaborn pulp



In [9]:
!black ./TeamOptimization.ipynb

All done! ✨ 🍰 ✨
1 file left unchanged.


In [6]:
# Imports
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
import requests
import os
import pulp

Let us start by loading the Pokemon data. An important note: each column `against_XXX` gives the damage multiplier the Pokemon takes from an attack of type XXX, so values below 1 are resistances and values above 1 are weaknesses.
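
To make the reading of these columns concrete, here is a small sketch using a few multipliers copied from Bulbasaur's row in the `pks.head()` output of this notebook. The `classify` helper and its labels are hypothetical and only serve the illustration.

```python
# Damage multipliers taken by bulbasaur, copied from its row in the dataset
bulbasaur = {
    "against_bug": 1.0,
    "against_electric": 0.5,
    "against_fire": 2.0,
    "against_flying": 2.0,
    "against_grass": 0.25,
}


def classify(multiplier):
    """Hypothetical helper: turn a damage multiplier into a readable label."""
    if multiplier == 0:
        return "immune"
    if multiplier < 1:
        return "resistant"
    if multiplier == 1:
        return "neutral"
    return "weak"


labels = {col: classify(val) for col, val in bulbasaur.items()}
for col, label in labels.items():
    print(f"{col}: {label}")
```

A grass/poison Pokemon such as Bulbasaur therefore resists grass and electric attacks but is weak to fire and flying.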

In [None]:
GENERATION_CAP = 4
NTYPES = 18
pks = pd.read_csv("data/pokemon.csv")
# Drop the Pokemon above the generation cap and the columns we will not use
pks = pks.drop(pks.loc[pks["generation"] > GENERATION_CAP].index)
pks = pks.drop(['abilities', 'attack',
       'base_egg_steps', 'base_happiness',  'capture_rate',
       'classfication', 'defense', 'experience_growth', 'height_m', 'hp',
       'japanese_name',  'percentage_male',
       'sp_attack', 'sp_defense', 'speed',  'weight_kg'], axis=1)
pks["name"] = pks["name"].str.lower()

In [None]:
display(pks.columns)
type_columns = [col for col in pks.columns if "against" in col]
pks.head()

Index(['against_bug', 'against_dark', 'against_dragon', 'against_electric',
       'against_fairy', 'against_fight', 'against_fire', 'against_flying',
       'against_ghost', 'against_grass', 'against_ground', 'against_ice',
       'against_normal', 'against_poison', 'against_psychic', 'against_rock',
       'against_steel', 'against_water', 'base_total', 'name',
       'pokedex_number', 'type1', 'type2', 'generation', 'is_legendary'],
      dtype='object')

Unnamed: 0,against_bug,against_dark,against_dragon,against_electric,against_fairy,against_fight,against_fire,against_flying,against_ghost,against_grass,...,against_rock,against_steel,against_water,base_total,name,pokedex_number,type1,type2,generation,is_legendary
0,1.0,1.0,1.0,0.5,0.5,0.5,2.0,2.0,1.0,0.25,...,1.0,1.0,0.5,318,bulbasaur,1,grass,poison,1,0
1,1.0,1.0,1.0,0.5,0.5,0.5,2.0,2.0,1.0,0.25,...,1.0,1.0,0.5,405,ivysaur,2,grass,poison,1,0
2,1.0,1.0,1.0,0.5,0.5,0.5,2.0,2.0,1.0,0.25,...,1.0,1.0,0.5,625,venusaur,3,grass,poison,1,0
3,0.5,1.0,1.0,1.0,0.5,1.0,0.5,1.0,1.0,0.5,...,2.0,0.5,2.0,309,charmander,4,fire,,1,0
4,0.5,1.0,1.0,1.0,0.5,1.0,0.5,1.0,1.0,0.5,...,2.0,0.5,2.0,405,charmeleon,5,fire,,1,0


## First optimization problem

Let us start with the first part of the optimization problem using the dataset as it is. Let $x_i \in \{0, 1\}$ indicate whether Pokemon $i$ is in the team, so that the constraint of having a full team is $\sum_i x_i = 6$.

Furthermore, denoting by $b_i$ the base total of Pokemon $i$, maximizing the base total of the team means finding $x^* = \arg \max_{x} \sum_i x_i b_i$ under the previous constraint.
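
As a rough illustration of this objective on its own, the sketch below brute-forces every team on a toy set of five Pokemon, with base totals copied from the first rows of the dataset. The team size of 3 and the helper `best_team` are assumptions made for the example; the notebook itself relies on PuLP because enumerating all teams of 6 over the full dataset is not practical.

```python
from itertools import combinations

# Base totals b_i copied from the first five rows of the dataset
base_total = {
    "bulbasaur": 318,
    "ivysaur": 405,
    "venusaur": 625,
    "charmander": 309,
    "charmeleon": 405,
}

TEAM_SIZE = 3  # toy value for the example; the notebook uses 6


def best_team(totals, size):
    """Enumerate every team of the given size and keep the largest sum of b_i."""
    return max(
        combinations(sorted(totals), size),
        key=lambda team: sum(totals[name] for name in team),
    )


team = best_team(base_total, TEAM_SIZE)
score = sum(base_total[name] for name in team)
print(team, score)
```

With only the size constraint, the optimum is simply the Pokemon with the highest base totals, which is why the weakness constraints are needed to make the problem interesting.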

Finally, we want to reduce the weaknesses of our team. We denote the damage multiplier taken by Pokemon $i$ from an attack of type $A$ as $w_{iA}$. We want the team to contain at least one Pokemon resistant to every type, i.e. for every $A \in \{1, \dots, 18\}$ we want to impose $\min_{i \,:\, x_i = 1} w_{iA} \leq 0.5$, where $0.5$ is the threshold used in the code below.
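
The check that this condition expresses can be sketched directly for a fixed team. The example below uses three multipliers copied from the Bulbasaur and Charmander rows of the dataset and the threshold of 0.5 that appears in the `optimize_team_weakness` cell; the `coverage` helper is hypothetical.

```python
# Multipliers w_iA copied from the bulbasaur and charmander rows of the dataset
team = {
    "bulbasaur": {"against_fire": 2.0, "against_flying": 2.0, "against_grass": 0.25},
    "charmander": {"against_fire": 0.5, "against_flying": 1.0, "against_grass": 0.5},
}
THRESHOLD = 0.5  # same value as in the optimization cell


def coverage(team, threshold=THRESHOLD):
    """For each type, find the team member with the smallest multiplier."""
    types = next(iter(team.values()))
    report = {}
    for attack_type in types:
        best = min(team, key=lambda name: team[name][attack_type])
        value = team[best][attack_type]
        report[attack_type] = (best, value, value <= threshold)
    return report


report = coverage(team)
for attack_type, (name, value, ok) in report.items():
    print(f"{attack_type}: best={name} ({value}) covered={ok}")
```

Here fire and grass attacks are covered, but no member of this two-Pokemon team resists flying attacks, so the team would violate the condition for that type.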

Note that such a minimum cannot be written directly as a linear constraint in PuLP. The trick explained [here](https://stackoverflow.com/questions/51939363/pulp-milp-constraint-at-least-one-variable-must-be-below-0) is to define extra boolean variables $y_{A i}$ such that for all $i,A$, we impose $x_i w_{i A} \leq 0.5 + m (1-y_{Ai})$ and $\sum_i y_{Ai} \geq 1$. The last constraint means that for each $A$, there must be at least one $i$ with $y_{Ai} = 1$, and for that $i$ the first inequality reduces to $x_i w_{iA} \leq 0.5$. For $m$ sufficiently large, the inequality is trivially satisfied whenever $y_{Ai} = 0$, so it imposes nothing on the other Pokemon.
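
The behaviour of this "big-M" inequality can be checked by hand for binary values. The sketch below evaluates it with $m = 100$ and a right-hand side of 0.5, the values used in the code cell that follows; the function name `big_m_holds` is made up for the example.

```python
M = 100          # big-M constant, as in the optimization cell
THRESHOLD = 0.5  # right-hand side used in the optimization cell


def big_m_holds(x, w, y, m=M, threshold=THRESHOLD):
    """Check x * w <= threshold + m * (1 - y) for binary x and y."""
    return x * w <= threshold + m * (1 - y)


checks = {
    # y = 1 switches the bound on: a selected Pokemon must resist the type
    "selected, resistant, y=1": big_m_holds(x=1, w=0.5, y=1),
    "selected, weak, y=1": big_m_holds(x=1, w=2.0, y=1),
    # y = 0 relaxes the bound: even a fourfold weakness passes
    "selected, very weak, y=0": big_m_holds(x=1, w=4.0, y=0),
}
for case, ok in checks.items():
    print(f"{case}: {ok}")
```

The constant $m$ only has to exceed the largest multiplier that can appear on the left-hand side, so 100 is comfortably large here.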

In [None]:
%%black
def present_solution_weaknesses(team, types):
  team_by_name = team.set_index("name")
  team_by_types = team_by_name[types].transpose()
  team_by_types["min_val"] = team_by_types.min(axis=1)
  team_by_types["min_pkmn"] = team_by_types.idxmin(axis=1)
  team_by_types["type1"] = team_by_name.loc[team_by_types["min_pkmn"].values, "type1"].values
  team_by_types["type2"] = team_by_name.loc[team_by_types["min_pkmn"].values, "type2"].values
  return team_by_types

def optimize_team_weakness(pkms, types, size_team=6):
  prob = pulp.LpProblem("Pokemon_Team_Optimization", pulp.LpMaximize)

  # Define the boolean variables
  x = pulp.LpVariable.dicts("x", range(len(pkms)), cat='Binary')
  y = pulp.LpVariable.dicts("y", (range(len(types)), range(len(pkms))), cat='Binary')

  # Define the size constraint
  prob += pulp.lpSum(x[i] for i in range(len(pkms))) == size_team, "Team Size"

  # Define the base total sum
  prob += pulp.lpSum(pkmn["base_total"] * x[i] for i, pkmn in pkms.iterrows()), "Maximal base total"

  # Define the weakness sum bound for each type
  # Based on https://stackoverflow.com/questions/51939363/pulp-milp-constraint-at-least-one-variable-must-be-below-0
  m=100
  for a,type_col in enumerate(types):
    prob += pulp.lpSum(y[a][i] for i in range(len(pkms))) >= 1 # Overall constraint for each type
    for i, pkmn in pkms.iterrows():
      prob += x[i] * pkmn[type_col]  <= 0.5 + m*(1-y[a][i]), f"Weakness {type_col} for pokemon {i}"

  out_code = prob.solve()
  display(print(f"Out code: {out_code}"))

  if out_code == 1:
    idxs_sol = [i for i in range(len(pkms)) if pulp.value(x[i]) == 1]
    pkms_selected = pkms.loc[idxs_sol]
    return pkms_selected
  return None

In [None]:
pkmns_weakness = optimize_team_weakness(pks, type_columns)
display(pkmns_weakness["name"])
display(present_solution_weaknesses(pkmns_weakness, type_columns)[["min_val", "min_pkmn", "type1", "type2"]])

Out code: 1


None

149      mewtwo
381      kyogre
382     groudon
383    rayquaza
444    garchomp
492      arceus
Name: name, dtype: object

name,min_val,min_pkmn,type1,type2
against_bug,0.5,rayquaza,dragon,flying
against_dark,1.0,kyogre,water,
against_dragon,1.0,mewtwo,psychic,
against_electric,0.0,groudon,ground,
against_fairy,1.0,mewtwo,psychic,
against_fight,0.5,mewtwo,psychic,
against_fire,0.5,kyogre,water,
against_flying,1.0,mewtwo,psychic,
against_ghost,0.0,arceus,normal,
against_grass,0.25,rayquaza,dragon,flying


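We see an issue here: for several types (dark, dragon, fairy and flying among the rows shown) the smallest multiplier in the team is still 1.0, so no team member resists them.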

The constraint to have one $y_{Ai} = 1$ for every $A$ does not force that $i$ to be one of the Pokemon selected for the team: an unselected Pokemon ($x_i = 0$) satisfies the inequality trivially. So we need more variables. The idea is to create variables $z_{Ai}$ subject to the constraints:
$$
\begin{aligned}
  z_{Ai} \leq x_i~,\\
  z_{Ai} \leq y_{Ai}~,\\
  z_{Ai} \geq x_i + y_{Ai} - 1~.
\end{aligned}
$$
More can be found [here](https://stackoverflow.com/questions/31173983/python-pulp-integer-linear-program-with-dynamic-constraint). For a given $A$ and $i$, we see that if $x_i = y_{Ai} = 1$, then $z_{Ai} = 1$. Otherwise, if one of them is $0$, the first two constraints impose that $z_{Ai} = 0$.
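
This claim is easy to verify exhaustively. The following sketch enumerates the four combinations of $x_i$ and $y_{Ai}$ and lists which binary values of $z_{Ai}$ satisfy all three inequalities; `feasible_z` is a hypothetical helper.

```python
from itertools import product


def feasible_z(x, y):
    """Return the binary values of z allowed by the three linking constraints."""
    return [z for z in (0, 1) if z <= x and z <= y and z >= x + y - 1]


table = {(x, y): feasible_z(x, y) for x, y in product((0, 1), repeat=2)}
for (x, y), zs in table.items():
    print(f"x={x}, y={y} -> z in {zs}")
```

In every case exactly one value of $z_{Ai}$ survives, and it equals the product $x_i y_{Ai}$, i.e. the logical AND of the two variables.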

Finally, the constraint $\sum_i y_{Ai} \geq 1$ should be changed to $\sum_i z_{Ai} \geq 1$ to impose that **both** $y_{Ai}$ and $x_i$ are equal to 1 for some $i$, for every $A$.
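
To see the difference between the two formulations, here is a brute-force sketch on a made-up instance with two Pokemon, a single attack type and a team of size 1. The multipliers, base totals and the `solve` helper are assumptions for the illustration; since the linking constraints force $z_{Ai} = x_i y_{Ai}$ for binary variables, the product is used directly instead of a separate variable.

```python
from itertools import product

M, THRESHOLD = 100, 0.5
w = [2.0, 0.5]  # toy multipliers against a single attack type
b = [600, 300]  # toy base totals
N = len(w)


def solve(link_to_team):
    """Brute-force the toy problem; link_to_team switches on the z variables."""
    best = None
    for x in product((0, 1), repeat=N):
        if sum(x) != 1:  # team of size 1
            continue
        for y in product((0, 1), repeat=N):
            # Big-M inequality for every Pokemon
            if any(x[i] * w[i] > THRESHOLD + M * (1 - y[i]) for i in range(N)):
                continue
            # Sum over y (first formulation) or over z = x * y (improved one)
            marks = [x[i] * y[i] for i in range(N)] if link_to_team else list(y)
            if sum(marks) < 1:
                continue
            score = sum(b[i] * x[i] for i in range(N))
            if best is None or score > best[0]:
                best = (score, x)
    return best


first = solve(link_to_team=False)
improved = solve(link_to_team=True)
print("first formulation:   ", first)
print("improved formulation:", improved)
```

The first formulation picks the stronger but weak Pokemon, because the unselected one can carry $y = 1$. The improved formulation has to pick the resistant Pokemon, at the cost of a lower base total.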

In [None]:
def optimize_team_weakness_improved(pkms, types, size_team=6):
  prob = pulp.LpProblem("Pokemon_Team_Optimization", pulp.LpMaximize)

  # Define the boolean variables
  x = pulp.LpVariable.dicts("x", range(len(pkms)), cat='Binary')
  y = pulp.LpVariable.dicts("y", (range(len(types)), range(len(pkms))), cat='Binary')
  z = pulp.LpVariable.dicts("z", (range(len(types)), range(len(pkms))), cat='Binary')

  # Define the size constraint
  prob += pulp.lpSum(x[i] for i in range(len(pkms))) == size_team, "Team Size"

  # Define the base total sum
  prob += pulp.lpSum(pkmn["base_total"] * x[i] for i, pkmn in pkms.iterrows()), "Maximal base total"

  # Define the weakness sum bound for each type
  # Based on https://stackoverflow.com/questions/51939363/pulp-milp-constraint-at-least-one-variable-must-be-below-0
  m=100
  for a,type_col in enumerate(types):
    prob += pulp.lpSum(z[a][i] for i in range(len(pkms))) >= 1 # Overall constraint for each type
    for i, pkmn in pkms.iterrows():
      prob += z[a][i] <= x[i], f"Constraint z 1 for {a},{i}"
      prob += z[a][i] <= y[a][i], f"Constraint z 2 for {a},{i}"
      prob += z[a][i] >= x[i] + y[a][i] - 1, f"Constraint z 3 for {a},{i}"
      prob += x[i] * pkmn[type_col]  <= 0.5 + m*(1-y[a][i]), f"Weakness {type_col} for pokemon {i}"

  out_code = prob.solve()
  display(print(f"Out code: {out_code}"))

  if out_code == 1:
    idxs_sol = [i for i in range(len(pkms)) if pulp.value(x[i]) == 1]
    pkms_selected = pkms.loc[idxs_sol]
    return pkms_selected
  return None

In [None]:
pkmns_weakness_optimized = optimize_team_weakness_improved(pks, type_columns)
display(pkmns_weakness_optimized["name"])
display(present_solution_weaknesses(pkmns_weakness_optimized, type_columns)[["min_val", "min_pkmn", "type1", "type2"]])

Out code: 1


None

149       mewtwo
247    tyranitar
375    metagross
381       kyogre
382      groudon
383     rayquaza
Name: name, dtype: object

name,min_val,min_pkmn,type1,type2
against_bug,0.5,rayquaza,dragon,flying
against_dark,0.5,tyranitar,rock,dark
against_dragon,0.5,metagross,steel,psychic
against_electric,0.0,groudon,ground,
against_fairy,0.5,metagross,steel,psychic
against_fight,0.5,mewtwo,psychic,
against_fire,0.5,tyranitar,rock,dark
against_flying,0.5,tyranitar,rock,dark
against_ghost,0.5,tyranitar,rock,dark
against_grass,0.25,rayquaza,dragon,flying


We finally have a formulation that works: every type shown is now resisted by at least one team member. However, as expected, the team is full of legendaries and pseudo-legendaries. The base total objective almost guarantees this, so let us try removing the legendaries from the dataset.

In [None]:
pks_no_legendaries = pks.loc[pks["is_legendary"] == 0].reset_index(drop=True)
pkmns_weakness_optimized_nolegendaries = optimize_team_weakness_improved(pks_no_legendaries, type_columns)
display(pkmns_weakness_optimized_nolegendaries["name"])
display(present_solution_weaknesses(pkmns_weakness_optimized_nolegendaries, type_columns)[["min_val", "min_pkmn", "type1", "type2"]])

Out code: 1


None

129     gyarados
239    tyranitar
277      slaking
361    salamence
364    metagross
423     garchomp
Name: name, dtype: object

name,min_val,min_pkmn,type1,type2
against_bug,0.5,gyarados,water,flying
against_dark,0.5,tyranitar,rock,dark
against_dragon,0.5,metagross,steel,psychic
against_electric,0.0,garchomp,dragon,ground
against_fairy,0.5,metagross,steel,psychic
against_fight,0.5,gyarados,water,flying
against_fire,0.5,gyarados,water,flying
against_flying,0.5,tyranitar,rock,dark
against_ghost,0.0,slaking,normal,
against_grass,0.25,salamence,dragon,flying


This is great, but the team is now, as expected, dominated by dragons because they have the most resistances. It would be interesting to restrict ourselves to non-dragons.

In [None]:
list_pseudo_legendaries = ["Dragonite", "Tyranitar", "Salamence", "Metagross", "Garchomp", "Hydreigon", "Goodra", "Kommo-o", "Dragapult", "Hisuian Goodra", "Baxcalibur"]