### Imports

In [1]:
import numpy as np
import pandas as pd
import os

### Load the data
We are going to load the data into a `pandas` dataframe and take a quick look at the file. As a reminder, the data was extracted from https://www.kaggle.com/abcsds/pokemon (last update: 2020.03.26).

In [108]:
# Dataframe
df = pd.read_csv("../Pokemon.csv")
df.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


The dataframe has 13 columns. The first column (`#`) is the Pokemon's `ID` and the second is its `Name`. Next come the `Type`s (a Pokemon can have one or two types), followed by some statistics.

For the statistics, the column `Total` is the sum of the six base statistics that follow it: `HP`, `Attack`, `Defense`, `Sp. Atk` (Special Attack), `Sp. Def` (Special Defense) and `Speed`. The generation number and the legendary status are not part of this sum.
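
As a quick sanity check of that definition, the sketch below recomputes `Total` from the six base statistics. It uses only the five rows shown in the `df.head()` output above, so it does not prove the property for the whole file; the names `sample`, `base_stats`, `computed_total` and `totals_match` are introduced here for illustration.

```python
import pandas as pd

# Sample rows copied from the `df.head()` output above (not the full dataset).
sample = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Ivysaur", "Venusaur", "VenusaurMega Venusaur", "Charmander"],
        "Total": [318, 405, 525, 625, 309],
        "HP": [45, 60, 80, 80, 39],
        "Attack": [49, 62, 82, 100, 52],
        "Defense": [49, 63, 83, 123, 43],
        "Sp. Atk": [65, 80, 100, 122, 60],
        "Sp. Def": [65, 80, 100, 120, 50],
        "Speed": [45, 60, 80, 80, 65],
    }
)

base_stats = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]

# Row-wise sum of the six base statistics.
computed_total = sample[base_stats].sum(axis=1)

# True when every `Total` matches the recomputed sum.
totals_match = bool((computed_total == sample["Total"]).all())
print(totals_match)
```

On the real data, the same comparison could be run on `df` directly by replacing `sample` with `df`.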

Finally, we have the Pokemon's `Generation` number (i.e. their "season") and a `Legendary` status.

Now, let's dive into the data...

### Analysis

#### ID and Name

In [129]:
id_name = df[["#","Name"]]

In [130]:
print("We have", id_name["#"].nunique(), "unique Pokemon `ID`s for", id_name["Name"].nunique(), "unique Pokemon `Name`s.")

We have 721 unique Pokemon `ID`s for 800 unique Pokemon `Name`s.


There's a catch. We have 721 unique Pokemon `ID`s and yet 800 unique `Name`s. So what's going on?

In [131]:
id_name = id_name.groupby(["#"]).filter(lambda x: len(x) > 1)
id_name.head()

Unnamed: 0,#,Name
2,3,Venusaur
3,3,VenusaurMega Venusaur
6,6,Charizard
7,6,CharizardMega Charizard X
8,6,CharizardMega Charizard Y


It turns out that some Pokemon have multiple forms, and each form gets its own row under the same `ID`. For instance, [`Charizard` has four forms](https://bulbapedia.bulbagarden.net/wiki/Charizard_(Pok%C3%A9mon)). We will have to take that into account when showing these Pokemon!

In [89]:
print("There are", id_name["#"].nunique(), "Pokemon with multiple forms.")

There are 65 Pokemon with multiple forms.
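
The multi-form rows can also be selected without a `groupby`, using `Series.duplicated` with `keep=False`. The sketch below is an alternative to the filter above, not a replacement for it. It runs on a handful of (`#`, `Name`) pairs copied from the outputs above, and the names `sample`, `multi_form` and `forms_per_id` are introduced here for illustration.

```python
import pandas as pd

# Sample (`#`, `Name`) pairs copied from the outputs above (not the full dataset).
sample = pd.DataFrame(
    {
        "#": [1, 2, 3, 3, 4, 6, 6, 6],
        "Name": [
            "Bulbasaur",
            "Ivysaur",
            "Venusaur",
            "VenusaurMega Venusaur",
            "Charmander",
            "Charizard",
            "CharizardMega Charizard X",
            "CharizardMega Charizard Y",
        ],
    }
)

# keep=False marks every row whose `#` appears more than once,
# not only the second and later occurrences.
multi_form = sample[sample["#"].duplicated(keep=False)]

# Number of rows (forms) recorded for each shared `ID`.
forms_per_id = multi_form.groupby("#")["Name"].count()
print(forms_per_id.to_dict())
```

On this sample, `ID` 3 has two rows and `ID` 6 has three, which matches the `id_name.head()` output shown earlier.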


#### Types

In [90]:
# TODO
types = df[["Type 1","Type 2"]]

#### Statistics

In [None]:
# TODO
stats = df[["Total","HP","Attack","Defense","Sp. Atk","Sp. Def","Speed"]]

#### Generation

In [125]:
gen = df.groupby("Generation")["Generation"].count()
gen

Generation
1    166
2    106
3    160
4    121
5    165
6     82
Name: Generation, dtype: int64

These counts are inflated, because they include the multiple forms of some Pokemon, as noted in [ID and Name](#ID-and-Name). The dataset does not state when a new form of a Pokemon was added.

One thing that we can do is remove every non-unique Pokemon. As a first step, the cell below lists the rows that share an `ID` within a generation.

In [145]:
# TODO
gen2 = df.groupby(["Generation","#"]).filter(lambda x: len(x["#"]) > 1)
gen2

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
6,6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False
7,6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
8,6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
787,711,GourgeistSuper Size,Ghost,Grass,494,85,100,122,58,75,54,6,False
795,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True
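
One possible way to finish this step is to count each `ID` once per generation with `nunique`, instead of counting rows. The sketch below is an assumption about how the TODO could be resolved, not the notebook's final method. It uses only (`#`, `Generation`) pairs visible in the outputs above, and the names `sample`, `per_gen_rows` and `per_gen_unique` are introduced here for illustration.

```python
import pandas as pd

# (`#`, `Generation`) pairs copied from the outputs above (not the full dataset).
sample = pd.DataFrame(
    {
        "#": [1, 2, 3, 3, 4, 6, 6, 6, 711, 719, 719, 720],
        "Generation": [1, 1, 1, 1, 1, 1, 1, 1, 6, 6, 6, 6],
    }
)

# Row count per generation: every extra form is counted again.
per_gen_rows = sample.groupby("Generation")["#"].count()

# Distinct `ID`s per generation: each Pokemon is counted once,
# however many forms it has.
per_gen_unique = sample.groupby("Generation")["#"].nunique()

print(per_gen_rows.to_dict())
print(per_gen_unique.to_dict())
```

This counts every form under the generation of its base `ID`, which is all the dataset allows, since it does not record when a form was introduced.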


#### Legendary status

In [123]:
legend = df.groupby("Legendary")["Legendary"].count()
legend

Legendary
False    735
True      65
Name: Legendary, dtype: int64

We have 65 Legendary Pokemon and 735 common Pokemon. Note that in this dataset `Legendary` denotes both [Legendary](https://bulbapedia.bulbagarden.net/wiki/Legendary_Pok%C3%A9mon) and [Mythical](https://bulbapedia.bulbagarden.net/wiki/Mythical_Pok%C3%A9mon) Pokemon. Additional work could separate the two.
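
A minimal sketch of that separation is given below, assuming a hand-maintained list of Mythical species names. The tuple `MYTHICAL_SPECIES`, the function `rarity` and the column `Rarity` are hypothetical and not part of the dataset. The list is deliberately incomplete and only covers Diancie and Hoopa, the two Mythical Pokemon that appear in the outputs above.

```python
import pandas as pd

# Names and `Legendary` flags copied from the outputs above (not the full dataset).
sample = pd.DataFrame(
    {
        "Name": ["Charmander", "Diancie", "DiancieMega Diancie", "HoopaHoopa Confined"],
        "Legendary": [False, True, True, True],
    }
)

# Hypothetical, deliberately incomplete list of Mythical species names.
MYTHICAL_SPECIES = ("Diancie", "Hoopa")


def rarity(row):
    """Return "Common", "Mythical" or "Legendary" for one row."""
    if not row["Legendary"]:
        return "Common"
    # Form names in this dataset start with the species name,
    # e.g. "DiancieMega Diancie", so a prefix match covers all forms.
    if row["Name"].startswith(MYTHICAL_SPECIES):
        return "Mythical"
    return "Legendary"


sample["Rarity"] = sample.apply(rarity, axis=1)
print(sample["Rarity"].tolist())
```

Prefix matching is a shortcut that needs care on the full dataset: a species name that is a prefix of another (for example "Mew" and "Mewtwo") would be mislabelled, so an exact match on the species part of the name would be safer.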