<a href="https://colab.research.google.com/github/AseiSugiyama/PokemonAnalytics/blob/add-pokemon-analysis-notebook/pokemon_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Road to Pokémon Master!

Welcome to this data science hands-on! This notebook contains following contents;

1. Exploratory data analysis with [The Complete Pokemon Dataset](https://www.kaggle.com/rounakbanik/pokemon)
2. Legendary pokémons detection
3. Pokémon battle dataset analysis

## Download datasets

In [1]:
!rm -rf PokemonAnalytics
!git clone https://github.com/AseiSugiyama/PokemonAnalytics.git

Cloning into 'PokemonAnalytics'...
remote: Enumerating objects: 65, done.[K
remote: Counting objects:   1% (1/65)[Kremote: Counting objects:   3% (2/65)[Kremote: Counting objects:   4% (3/65)[Kremote: Counting objects:   6% (4/65)[Kremote: Counting objects:   7% (5/65)[Kremote: Counting objects:   9% (6/65)[Kremote: Counting objects:  10% (7/65)[Kremote: Counting objects:  12% (8/65)[Kremote: Counting objects:  13% (9/65)[Kremote: Counting objects:  15% (10/65)[Kremote: Counting objects:  16% (11/65)[Kremote: Counting objects:  18% (12/65)[Kremote: Counting objects:  20% (13/65)[Kremote: Counting objects:  21% (14/65)[Kremote: Counting objects:  23% (15/65)[Kremote: Counting objects:  24% (16/65)[Kremote: Counting objects:  26% (17/65)[Kremote: Counting objects:  27% (18/65)[Kremote: Counting objects:  29% (19/65)[Kremote: Counting objects:  30% (20/65)[Kremote: Counting objects:  32% (21/65)[Kremote: Counting objects:  33% (22/65)[Kremote: 

In [2]:
!ls -R ./PokemonAnalytics/data

./PokemonAnalytics/data:
battle	pokedex

./PokemonAnalytics/data/battle:
gen8battlestadiumsingles_battle_logs.json
gen8battlestadiumsingles_dataset.json
gen8battlestadiumsingles_parsed_battle_logs.json
gen8battlestadiumsingles_replay_ids.json
gen8battlestadiumsingles_top_users.json

./PokemonAnalytics/data/pokedex:
height_and_weight.csv  Pokemon.csv  README.md


In [3]:
!pip install pandas-profiling[notebook,html] swifter



In [0]:
import pandas as pd
import pandas_profiling as pdf
import swifter

In [0]:
usecols = [
           "pokedex_number",
           "name",
           "japanese_name",
           "type1",
           "type2",
           "height_m",
           "weight_kg",
           "hp",
           "attack",
           "defense",
           "sp_attack",
           "sp_defense",
           "speed",
           "base_egg_steps",
           "base_happiness",
           "capture_rate",
           "base_total",
           "classfication",
           "experience_growth",
           "generation",
           "is_legendary",
]

raw_pokedex = pd.read_csv(
    "./PokemonAnalytics/data/pokedex/Pokemon.csv",
    usecols=usecols,
    index_col="pokedex_number",
)[usecols[1:]].rename(columns={
    "height_m":"height",
    "weight_kg":"weight",
})

In [6]:
raw_pokedex.head()

Unnamed: 0_level_0,name,japanese_name,type1,type2,height,weight,hp,attack,defense,sp_attack,sp_defense,speed,base_egg_steps,base_happiness,capture_rate,base_total,classfication,experience_growth,generation,is_legendary
pokedex_number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1
1,Bulbasaur,Fushigidaneフシギダネ,grass,poison,0.7,6.9,45,49,49,65,65,45,5120,70,45,318,Seed Pokémon,1059860,1,0
2,Ivysaur,Fushigisouフシギソウ,grass,poison,1.0,13.0,60,62,63,80,80,60,5120,70,45,405,Seed Pokémon,1059860,1,0
3,Venusaur,Fushigibanaフシギバナ,grass,poison,2.0,100.0,80,100,123,122,120,80,5120,70,45,625,Seed Pokémon,1059860,1,0
4,Charmander,Hitokageヒトカゲ,fire,,0.6,8.5,39,52,43,60,50,65,5120,70,45,309,Lizard Pokémon,1059860,1,0
5,Charmeleon,Lizardoリザード,fire,,1.1,19.0,58,64,58,80,65,80,5120,70,45,405,Flame Pokémon,1059860,1,0


In [7]:
raw_pokedex.dtypes

name                  object
japanese_name         object
type1                 object
type2                 object
height               float64
weight               float64
hp                     int64
attack                 int64
defense                int64
sp_attack              int64
sp_defense             int64
speed                  int64
base_egg_steps         int64
base_happiness         int64
capture_rate          object
base_total             int64
classfication         object
experience_growth      int64
generation             int64
is_legendary           int64
dtype: object

In [8]:
def is_int(value):
  try:
    int(value)
    return True
  except ValueError:
    return False

capture_rate_int_values = raw_pokedex.capture_rate.swifter.apply(is_int)
raw_pokedex[~capture_rate_int_values]

HBox(children=(FloatProgress(value=0.0, description='Pandas Apply', max=801.0, style=ProgressStyle(description…




Unnamed: 0_level_0,name,japanese_name,type1,type2,height,weight,hp,attack,defense,sp_attack,sp_defense,speed,base_egg_steps,base_happiness,capture_rate,base_total,classfication,experience_growth,generation,is_legendary
pokedex_number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1
774,Minior,Metenoメテノ,rock,flying,0.3,40.0,60,100,60,100,60,120,6400,70,30 (Meteorite)255 (Core),500,Meteor Pokémon,1059860,7,0


In [9]:
raw_pokedex.loc[~capture_rate_int_values, ["name", "japanese_name", "capture_rate"]]

Unnamed: 0_level_0,name,japanese_name,capture_rate
pokedex_number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
774,Minior,Metenoメテノ,30 (Meteorite)255 (Core)


In [10]:
raw_pokedex.at[774, "capture_rate"] = 30
raw_pokedex = raw_pokedex.astype({
    "capture_rate":int
})
raw_pokedex.dtypes

name                  object
japanese_name         object
type1                 object
type2                 object
height               float64
weight               float64
hp                     int64
attack                 int64
defense                int64
sp_attack              int64
sp_defense             int64
speed                  int64
base_egg_steps         int64
base_happiness         int64
capture_rate           int64
base_total             int64
classfication         object
experience_growth      int64
generation             int64
is_legendary           int64
dtype: object

In [11]:
raw_pokedex.isna().sum()

name                   0
japanese_name          0
type1                  0
type2                384
height                20
weight                20
hp                     0
attack                 0
defense                0
sp_attack              0
sp_defense             0
speed                  0
base_egg_steps         0
base_happiness         0
capture_rate           0
base_total             0
classfication          0
experience_growth      0
generation             0
is_legendary           0
dtype: int64

In [12]:
raw_pokedex[raw_pokedex.height.isna()]

Unnamed: 0_level_0,name,japanese_name,type1,type2,height,weight,hp,attack,defense,sp_attack,sp_defense,speed,base_egg_steps,base_happiness,capture_rate,base_total,classfication,experience_growth,generation,is_legendary
pokedex_number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1
19,Rattata,Korattaコラッタ,normal,dark,,,30,56,35,25,35,72,3840,70,255,253,Mouse Pokémon,1000000,1,0
20,Raticate,Rattaラッタ,normal,dark,,,75,71,70,40,80,77,3840,70,127,413,Mouse Pokémon,1000000,1,0
26,Raichu,Raichuライチュウ,electric,electric,,,60,85,50,95,85,110,2560,70,75,485,Mouse Pokémon,1000000,1,0
27,Sandshrew,Sandサンド,ground,ice,,,50,75,90,10,35,40,5120,70,255,300,Mouse Pokémon,1000000,1,0
28,Sandslash,Sandpanサンドパン,ground,ice,,,75,100,120,25,65,65,5120,70,90,450,Mouse Pokémon,1000000,1,0
37,Vulpix,Rokonロコン,fire,ice,,,38,41,40,50,65,65,5120,70,190,299,Fox Pokémon,1000000,1,0
38,Ninetales,Kyukonキュウコン,fire,ice,,,73,67,75,81,100,109,5120,70,75,505,Fox Pokémon,1000000,1,0
50,Diglett,Digdaディグダ,ground,ground,,,10,55,30,35,45,90,5120,70,255,265,Mole Pokémon,1000000,1,0
51,Dugtrio,Dugtrioダグトリオ,ground,ground,,,35,100,60,50,70,110,5120,70,50,425,Mole Pokémon,1000000,1,0
52,Meowth,Nyarthニャース,normal,dark,,,40,35,35,50,40,90,5120,70,255,290,Scratch Cat Pokémon,1000000,1,0


In [13]:
raw_pokedex[raw_pokedex.weight.isna()]

Unnamed: 0_level_0,name,japanese_name,type1,type2,height,weight,hp,attack,defense,sp_attack,sp_defense,speed,base_egg_steps,base_happiness,capture_rate,base_total,classfication,experience_growth,generation,is_legendary
pokedex_number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1
19,Rattata,Korattaコラッタ,normal,dark,,,30,56,35,25,35,72,3840,70,255,253,Mouse Pokémon,1000000,1,0
20,Raticate,Rattaラッタ,normal,dark,,,75,71,70,40,80,77,3840,70,127,413,Mouse Pokémon,1000000,1,0
26,Raichu,Raichuライチュウ,electric,electric,,,60,85,50,95,85,110,2560,70,75,485,Mouse Pokémon,1000000,1,0
27,Sandshrew,Sandサンド,ground,ice,,,50,75,90,10,35,40,5120,70,255,300,Mouse Pokémon,1000000,1,0
28,Sandslash,Sandpanサンドパン,ground,ice,,,75,100,120,25,65,65,5120,70,90,450,Mouse Pokémon,1000000,1,0
37,Vulpix,Rokonロコン,fire,ice,,,38,41,40,50,65,65,5120,70,190,299,Fox Pokémon,1000000,1,0
38,Ninetales,Kyukonキュウコン,fire,ice,,,73,67,75,81,100,109,5120,70,75,505,Fox Pokémon,1000000,1,0
50,Diglett,Digdaディグダ,ground,ground,,,10,55,30,35,45,90,5120,70,255,265,Mole Pokémon,1000000,1,0
51,Dugtrio,Dugtrioダグトリオ,ground,ground,,,35,100,60,50,70,110,5120,70,50,425,Mole Pokémon,1000000,1,0
52,Meowth,Nyarthニャース,normal,dark,,,40,35,35,50,40,90,5120,70,255,290,Scratch Cat Pokémon,1000000,1,0


In [0]:
weight_and_heights = pd.read_csv(
    "./PokemonAnalytics/data/pokedex/height_and_weight.csv",
    usecols=["ndex", "height", "weight"],
).drop_duplicates(
    subset=['ndex'],
    keep='first',
).rename(columns={
    "height":"height_from_feet",
    "weight":"weight_from_ponds",
})
weight_and_heights.index = weight_and_heights.ndex
weight_and_heights = weight_and_heights[["height_from_feet", "weight_from_ponds"]]

In [0]:
joined_pokedex = pd.merge(
    raw_pokedex.rename(columns={
        "height":"raw_height",
        "weight":"raw_weight",
        }),
    weight_and_heights,
    left_on="pokedex_number",
    right_on="ndex",
    left_index=True,
    right_index=True,
)

In [41]:
joined_pokedex.head()

Unnamed: 0,pokedex_number,name,japanese_name,type1,type2,raw_height,raw_weight,hp,attack,defense,sp_attack,sp_defense,speed,base_egg_steps,base_happiness,capture_rate,base_total,classfication,experience_growth,generation,is_legendary,height_from_feet,weight_from_ponds
1,1,Bulbasaur,Fushigidaneフシギダネ,grass,poison,0.7,6.9,45,49,49,65,65,45,5120,70,45,318,Seed Pokémon,1059860,1,0,0.7112,6.894604
2,2,Ivysaur,Fushigisouフシギソウ,grass,poison,1.0,13.0,60,62,63,80,80,60,5120,70,45,405,Seed Pokémon,1059860,1,0,0.9906,13.018101
3,3,Venusaur,Fushigibanaフシギバナ,grass,poison,2.0,100.0,80,100,123,122,120,80,5120,70,45,625,Seed Pokémon,1059860,1,0,2.0066,100.017118
4,4,Charmander,Hitokageヒトカゲ,fire,,0.6,8.5,39,52,43,60,50,65,5120,70,45,309,Lizard Pokémon,1059860,1,0,0.6096,8.482177
5,5,Charmeleon,Lizardoリザード,fire,,1.1,19.0,58,64,58,80,65,80,5120,70,45,405,Flame Pokémon,1059860,1,0,1.0922,19.00552


In [45]:
height_is_na = joined_pokedex.raw_height.isna().astype(int)
weight_is_na = joined_pokedex.raw_weight.isna().astype(int)
pokedex_without_na = joined_pokedex.fillna(value={
    "type2":"NONE",
    "raw_height":0.0,
    "raw_weight":0.0,
})
pokedex_without_na["height_m"] = (1 - height_is_na) * pokedex_without_na.raw_height + height_is_na * pokedex_without_na.height_from_feet
pokedex_without_na["weight_kg"] = (1 - weight_is_na) * pokedex_without_na.raw_weight + weight_is_na * pokedex_without_na.weight_from_ponds

pokedex = pokedex_without_na.drop(columns=[
                                           "height_from_feet",
                                           "weight_from_ponds",
                                           "raw_height",
                                           "raw_weight",
                                           ]
                                  )[usecols[1:]].rename(columns={
                                      "height_m":"height",
                                      "weight_kg":"weight",})
pokedex.head()

Unnamed: 0,name,japanese_name,type1,type2,height,weight,hp,attack,defense,sp_attack,sp_defense,speed,base_egg_steps,base_happiness,capture_rate,base_total,classfication,experience_growth,generation,is_legendary
1,Bulbasaur,Fushigidaneフシギダネ,grass,poison,0.7,6.9,45,49,49,65,65,45,5120,70,45,318,Seed Pokémon,1059860,1,0
2,Ivysaur,Fushigisouフシギソウ,grass,poison,1.0,13.0,60,62,63,80,80,60,5120,70,45,405,Seed Pokémon,1059860,1,0
3,Venusaur,Fushigibanaフシギバナ,grass,poison,2.0,100.0,80,100,123,122,120,80,5120,70,45,625,Seed Pokémon,1059860,1,0
4,Charmander,Hitokageヒトカゲ,fire,NONE,0.6,8.5,39,52,43,60,50,65,5120,70,45,309,Lizard Pokémon,1059860,1,0
5,Charmeleon,Lizardoリザード,fire,NONE,1.1,19.0,58,64,58,80,65,80,5120,70,45,405,Flame Pokémon,1059860,1,0


In [48]:
pokedex[pokedex.index == 26] # Raichu height: 0.8m, weight: 30kg

Unnamed: 0,name,japanese_name,type1,type2,height,weight,hp,attack,defense,sp_attack,sp_defense,speed,base_egg_steps,base_happiness,capture_rate,base_total,classfication,experience_growth,generation,is_legendary
26,Raichu,Raichuライチュウ,electric,electric,0.7874,29.982456,60,85,50,95,85,110,2560,70,75,485,Mouse Pokémon,1000000,1,0


## More datsets!

Following datasets are available to explore the pokèmon world.

- Pokemon Sun and Moon (Gen 7) Stats https://www.kaggle.com/mylesoneill/pokemon-sun-and-moon-gen-7-stats
- Pokemon with stats Kaggle https://www.kaggle.com/abcsds/pokemon
