# Statistically, what would be the best Pokemon team Ash could assemble?

Members: Sathvik Inteti, Arnav Mittal, Daneil Nguyen

## Introduction

Pokemon is a famous video game series beloved by fans both young and old. It revolves around "trainers" who catch creatures called Pokemon and use them in battles against other trainers. It has become an international sensation and has spawned many other forms of entertainment, such as a trading card game, movies, and an anime series. Most notably, many fans know 10-year-old Ash Ketchum, the protagonist of the Pokemon anime. The anime follows Ash and his friends on his journey to become a Pokemon Champion, the best Pokemon trainer of them all.

The Pokemon anime has been on the air for more than 20 years, and Ash has not aged a single bit (because of cartoon logic). Until recently, he had also failed to become a Pokemon Champion despite all the time he has trained and the number of Pokemon he has encountered. In 2019, 22 years after the show's initial release, Ash's dream came to fruition when he conquered the Alola League and was crowned a Pokemon Champion ([source](https://www.pokemonfanclub.net/did-ash-become-a-pokemon-master/#:~:text=Pokemons%20Ash%20Finally%20Becomes%20A,master%20for%20the%20first%20time.)).

Here, a question arises that we would like to explore. Ash has had 22 years of experience and has accumulated a grand roster over this time. What is the best Pokemon team that Ash could assemble in terms of strength?

In our case, "strength" is determined by a Pokemon's total stats and by how its type fares against other types (i.e., which types it is weak to and which it is strong against).

To test how "strong" our team is, we will face it against Ash's original rival in the Pokemon anime and video game: Gary Oak. 

We will use several data sets containing the stats of various Pokemon, along with combat data that we will feed to a learning model to predict the winner of this grand battle between Pokemon champions.






Link to datasets used:

Pokemon Stats Data: https://www.kaggle.com/datasets/mariotormo/complete-pokemon-dataset-updated-090420?select=pokedex_%28Update.04.20%29.csv

Pokemon Combat Data: https://www.kaggle.com/datasets/terminus7/pokemon-challenge?select=tests.csv

Ash Pokemon Data: https://www.epicdope.com/all-of-ashs-pokemon-in-the-anime/

Gary Pokemon Data: https://pokemon.fandom.com/wiki/Blue%27s_Pok%C3%A9mon_Teams_(Red/Green/Blue)#Charmander


## Data Wrangling

In [24]:
#import all necessary packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier

In [32]:
#import pokemon data from csv and filter by necessary columns
all_pokemon = pd.read_csv('data/pokedex_(Update.04.20).csv')
all_pokemon = all_pokemon[['pokedex_number', 'name', 'type_1', 'type_2', 'total_points', 'against_normal', 'against_fire', 'against_water', 'against_electric', 'against_grass', 'against_ice', 'against_fight', 'against_poison', 'against_ground', 'against_flying', 'against_psychic', 'against_bug', 'against_rock', 'against_ghost', 'against_dragon', 'against_dark', 'against_steel', 'against_fairy']]
all_pokemon.head()

Unnamed: 0,pokedex_number,name,type_1,type_2,total_points,against_normal,against_fire,against_water,against_electric,against_grass,...,against_ground,against_flying,against_psychic,against_bug,against_rock,against_ghost,against_dragon,against_dark,against_steel,against_fairy
0,1,Bulbasaur,Grass,Poison,318.0,1.0,2.0,0.5,0.5,0.25,...,1.0,2.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5
1,2,Ivysaur,Grass,Poison,405.0,1.0,2.0,0.5,0.5,0.25,...,1.0,2.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5
2,3,Venusaur,Grass,Poison,525.0,1.0,2.0,0.5,0.5,0.25,...,1.0,2.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5
3,3,Mega Venusaur,Grass,Poison,625.0,1.0,1.0,0.5,0.5,0.25,...,1.0,2.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5
4,4,Charmander,Fire,,309.0,1.0,0.5,2.0,1.0,0.5,...,2.0,1.0,1.0,0.5,2.0,1.0,1.0,1.0,0.5,0.5
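
The introduction defines "strength" as a combination of total stats and type matchups, and the columns loaded above are enough to sketch one possible score. The block below is only an illustration: the `strength_score` function and its formula (total stats divided by the mean damage multiplier a Pokemon takes) are our own assumptions, not a method used elsewhere in this notebook. The toy rows are copied from the `head()` output above and use only the five `against_*` columns that are visible there.

```python
import pandas as pd

# Toy rows copied from the all_pokemon.head() output above.
# Only the first five against_* columns are included, because the rest are elided in the output.
toy = pd.DataFrame({
    "name": ["Bulbasaur", "Venusaur", "Charmander"],
    "total_points": [318.0, 525.0, 309.0],
    "against_normal": [1.0, 1.0, 1.0],
    "against_fire": [2.0, 2.0, 0.5],
    "against_water": [0.5, 0.5, 2.0],
    "against_electric": [0.5, 0.5, 1.0],
    "against_grass": [0.25, 0.25, 0.5],
})

def strength_score(df):
    """Hypothetical score: total stats divided by the mean damage multiplier taken."""
    against_cols = [c for c in df.columns if c.startswith("against_")]
    # A lower mean multiplier means the Pokemon takes less damage on average.
    mean_taken = df[against_cols].mean(axis=1)
    return df["total_points"] / mean_taken

toy["strength"] = strength_score(toy)
ranked = toy.sort_values("strength", ascending=False)
print(ranked[["name", "total_points", "strength"]])
```

Dividing by the mean multiplier rewards Pokemon that resist many types, but it treats every attacking type as equally likely, which may not hold against a specific opponent such as Gary's team.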


In [26]:
#get Gary's list of Pokemon and retrieve the stats row for each, in the order of Gary's list
gary_dex = pd.read_csv('data/Gary_pokemon.csv')
#collect the matching rows first, then concatenate once instead of growing a DataFrame inside a loop
gary_pokemon = pd.concat([all_pokemon[all_pokemon['name'] == name] for name in gary_dex['Name']])
gary_pokemon

Unnamed: 0,pokedex_number,name,type_1,type_2,total_points,against_normal,against_fire,against_water,against_electric,against_grass,...,against_ground,against_flying,against_psychic,against_bug,against_rock,against_ghost,against_dragon,against_dark,against_steel,against_fairy
22,18,Pidgeot,Normal,Flying,479.0,1.0,1.0,1.0,2.0,0.5,...,0.0,1.0,1.0,0.5,2.0,0.0,1.0,1.0,1.0,1.0
83,65,Alakazam,Psychic,,500.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,0.5,2.0,1.0,2.0,1.0,2.0,1.0,1.0
144,112,Rhydon,Ground,Rock,485.0,0.5,0.5,4.0,0.0,4.0,...,2.0,0.5,1.0,1.0,0.5,1.0,1.0,1.0,2.0,1.0
77,59,Arcanine,Fire,,555.0,1.0,0.5,2.0,1.0,0.5,...,2.0,1.0,1.0,0.5,2.0,1.0,1.0,1.0,0.5,0.5
132,103,Exeggutor,Grass,Psychic,530.0,1.0,2.0,0.5,0.5,0.5,...,0.5,2.0,0.5,4.0,1.0,2.0,1.0,2.0,1.0,1.0
11,9,Blastoise,Water,,530.0,1.0,0.5,0.5,2.0,2.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5,1.0


In [27]:
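#get Ash's list of Pokemon and retrieve the stats row(s) for each, in the order of Ash's list
ash_dex = pd.read_csv('data/Ash_pokemon.csv')
#note: matching on pokedex_number returns every form that shares a number (e.g. Venusaur and Mega Venusaur)
#collect the matching rows first, then concatenate once instead of growing a DataFrame inside a loop
ash_pokemon = pd.concat([all_pokemon[all_pokemon['pokedex_number'] == num] for num in ash_dex['Pokedex Number']])
ash_pokemon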

Unnamed: 0,pokedex_number,name,type_1,type_2,total_points,against_normal,against_fire,against_water,against_electric,against_grass,...,against_ground,against_flying,against_psychic,against_bug,against_rock,against_ghost,against_dragon,against_dark,against_steel,against_fairy
193,153,Bayleef,Grass,,405.0,1.0,2.0,0.5,0.5,0.50,...,0.5,2.0,1.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0
616,525,Boldore,Rock,,390.0,0.5,0.5,2.0,1.0,2.00,...,2.0,0.5,1.0,1.0,1.0,1.0,1.0,1.0,2.0,1.0
497,418,Buizel,Water,,330.0,1.0,0.5,0.5,2.0,2.00,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5,1.0
0,1,Bulbasaur,Grass,Poison,318.0,1.0,2.0,0.5,0.5,0.25,...,1.0,2.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5
15,12,Butterfree,Bug,Flying,395.0,1.0,2.0,1.0,2.0,0.25,...,0.0,2.0,1.0,0.5,4.0,1.0,1.0,1.0,1.0,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
163,128,Tauros,Normal,,490.0,1.0,1.0,1.0,1.0,1.00,...,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0
384,324,Torkoal,Fire,,470.0,1.0,0.5,2.0,1.0,0.50,...,2.0,1.0,1.0,0.5,2.0,1.0,1.0,1.0,0.5,0.5
466,389,Torterra,Grass,Ground,525.0,1.0,2.0,1.0,0.0,1.00,...,0.5,2.0,1.0,2.0,0.5,1.0,1.0,1.0,1.0,1.0
198,158,Totodile,Water,,314.0,1.0,0.5,0.5,2.0,2.00,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.5,1.0
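
Because Ash's roster is matched on `pokedex_number`, any Pokemon with several forms under one number (for example Venusaur and Mega Venusaur, both number 3 in the `head()` output) contributes more than one row. The sketch below shows one way to do the lookup in a single step and keep one row per number. The `roster` frame is a made-up stand-in for `Ash_pokemon.csv`, and keeping the first row per number assumes the base form is listed first, which holds in the rows shown earlier but is not verified for the whole file.

```python
import pandas as pd

# Rows copied from the all_pokemon.head() output (pokedex_number, name, total_points).
all_pokemon_toy = pd.DataFrame({
    "pokedex_number": [1, 2, 3, 3, 4],
    "name": ["Bulbasaur", "Ivysaur", "Venusaur", "Mega Venusaur", "Charmander"],
    "total_points": [318.0, 405.0, 525.0, 625.0, 309.0],
})

# Hypothetical roster standing in for Ash_pokemon.csv (same column name as in the cell above).
roster = pd.DataFrame({"Pokedex Number": [3, 1]})

# One vectorised lookup; rows come back in the order of the stats table, not of the roster.
matches = all_pokemon_toy[all_pokemon_toy["pokedex_number"].isin(roster["Pokedex Number"])]

# Keep a single row per Pokedex number (assumes the base form appears first).
base_forms = matches.drop_duplicates(subset="pokedex_number", keep="first")

print(matches["name"].tolist())
print(base_forms["name"].tolist())
```

Whether alternate forms such as Mega Evolutions should count toward Ash's team is a modelling choice; dropping them keeps the comparison with Gary's team, which is matched by name, on equal footing.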


## Data Exploration and Analysis

In [34]:
#include stats vs typing scatter plot?
#specify list of types that a team is specifically weak to or not strong against?
#mean stats and type?

combats_df = pd.read_csv("data/combats.csv")
combats_df.head()

Unnamed: 0,First_pokemon,Second_pokemon,Winner
0,266,298,298
1,702,701,701
2,191,668,668
3,237,683,683
4,151,231,151
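
Before the combat records can be fed to a learning model, each row needs numeric features and a label. The sketch below shows one possible shape for that step: it joins a stats table onto both combatants and encodes the winner as a binary `first_wins` label. The combat rows are copied from the output above, but `stats_toy`, its `id` column, and its `total_points` values are placeholders invented for illustration; how the ids in `combats.csv` map onto the stats data set is not shown in this notebook and is assumed here.

```python
import pandas as pd

# Three rows copied from the combats_df.head() output above.
combats_toy = pd.DataFrame({
    "First_pokemon": [266, 702, 151],
    "Second_pokemon": [298, 701, 231],
    "Winner": [298, 701, 151],
})

# Hypothetical stats lookup keyed by the ids used in combats.csv.
# The total_points values are made-up placeholders, not real stats.
stats_toy = pd.DataFrame({
    "id": [266, 298, 702, 701, 151, 231],
    "total_points": [300.0, 400.0, 350.0, 450.0, 600.0, 330.0],
})

def build_training_table(combats, stats):
    """Attach stats for both combatants and derive a binary label."""
    first = stats.add_suffix("_first")
    second = stats.add_suffix("_second")
    table = combats.merge(first, left_on="First_pokemon", right_on="id_first")
    table = table.merge(second, left_on="Second_pokemon", right_on="id_second")
    # Label is 1 when the first Pokemon won, 0 when the second one did.
    table["first_wins"] = (table["Winner"] == table["First_pokemon"]).astype(int)
    # A single illustrative feature: the gap in total stats.
    table["total_points_diff"] = table["total_points_first"] - table["total_points_second"]
    return table

table = build_training_table(combats_toy, stats_toy)
X = table[["total_points_diff"]]
y = table["first_wins"]
print(table[["First_pokemon", "Second_pokemon", "total_points_diff", "first_wins"]])
```

With the full data, `X` and `y` could then be split with `train_test_split` and passed to the `RandomForestClassifier` imported at the top of the notebook; adding the `against_*` columns as features would bring type matchups into the model as well.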


## Conclusion