# Explore The Pokemon From Generation 1 To Generation 6

### Description

This notebook explores Pokemon data covering generations 1 through 6. The dataset has 800 rows and 13 columns, including each Pokemon's ID, name, types, and stat attributes.

Additional

- "#" = Pokemon ID
- "Type 1" = Primary Pokemon type
- "Type 2" = Secondary Pokemon type (missing when the Pokemon has only one type)
- "Total" = Sum of the Pokemon's stat attributes
- "HP" = Hit Points, how much damage the Pokemon can take
- "Attack" = Strength of the Pokemon's physical attacks
- "Defense" = Resistance to physical attacks
- "Sp. Atk" = Special Attack, strength of the Pokemon's non-physical attacks
- "Sp. Def" = Special Defense, resistance to non-physical attacks
- "Speed" = Speed of the Pokemon
- "Generation" = The generation (1 to 6) the Pokemon belongs to
- "Legendary" = Whether the Pokemon is legendary (True or False)

# Loading Data

In [1]:
from plotly import subplots
from plotly import graph_objs as go
from plotly.offline import iplot, init_notebook_mode
import numpy as np
import pandas as pd
init_notebook_mode(connected=True)

In [2]:
data = pd.read_csv("Pokemon.csv")
data

,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True
798,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,True


We'll rename the "#" column to "ID" because it holds each Pokemon's ID.

In [3]:
data.rename(columns={"#":"ID"},inplace=True)

In [4]:
data.head()

,ID,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 800 entries, 0 to 799
Data columns (total 13 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   ID          800 non-null    int64 
 1   Name        800 non-null    object
 2   Type 1      800 non-null    object
 3   Type 2      414 non-null    object
 4   Total       800 non-null    int64 
 5   HP          800 non-null    int64 
 6   Attack      800 non-null    int64 
 7   Defense     800 non-null    int64 
 8   Sp. Atk     800 non-null    int64 
 9   Sp. Def     800 non-null    int64 
 10  Speed       800 non-null    int64 
 11  Generation  800 non-null    int64 
 12  Legendary   800 non-null    bool  
dtypes: bool(1), int64(9), object(3)
memory usage: 75.9+ KB


The "Type 2" column has missing values (only 414 of its 800 entries are non-null). We'll handle that below.

In [6]:
data.describe()

,ID,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation
count,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0
mean,362.81375,435.1025,69.25875,79.00125,73.8425,72.82,71.9025,68.2775,3.32375
std,208.343798,119.96304,25.534669,32.457366,31.183501,32.722294,27.828916,29.060474,1.66129
min,1.0,180.0,1.0,5.0,5.0,10.0,20.0,5.0,1.0
25%,184.75,330.0,50.0,55.0,50.0,49.75,50.0,45.0,2.0
50%,364.5,450.0,65.0,75.0,70.0,65.0,70.0,65.0,3.0
75%,539.25,515.0,80.0,100.0,90.0,95.0,90.0,90.0,5.0
max,721.0,780.0,255.0,190.0,230.0,194.0,230.0,180.0,6.0


# Handling Missing Value

The "Type 2" column has missing values. We fill them with the string "None", because a NaN in "Type 2" means the Pokemon has no second type.

In [7]:
data.fillna("None",inplace=True)
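
The call above fills every column, which is safe here because only "Type 2" has missing values. As a minimal sketch of a more targeted variant, the example below fills just that one column. The three-row `sample` frame is a hypothetical stand-in for the real dataset, not part of the original notebook.

```python
import pandas as pd

# Hypothetical three-row sample standing in for Pokemon.csv.
sample = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander", "Squirtle"],
    "Type 1": ["Grass", "Fire", "Water"],
    "Type 2": ["Poison", None, None],
})

# Fill only the "Type 2" column, leaving every other column untouched.
sample["Type 2"] = sample["Type 2"].fillna("None")

print(sample["Type 2"].tolist())
```

Restricting the fill to one column avoids accidentally writing the string "None" into a numeric column if the data ever gains missing stats.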

# Data Exploration

In [8]:
data["Name"].unique().shape

(800,)

The dataset contains 800 unique Pokemon names.

In [9]:
list_type1 = data["Type 1"].unique()
list_type1

array(['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric',
       'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice',
       'Dragon', 'Dark', 'Steel', 'Flying'], dtype=object)

There are 18 distinct values for type 1: 'Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'.

In [10]:
list_type2 = data["Type 2"].unique()
list_type2

array(['Poison', 'None', 'Flying', 'Dragon', 'Ground', 'Fairy', 'Grass',
       'Fighting', 'Psychic', 'Steel', 'Ice', 'Rock', 'Dark', 'Water',
       'Electric', 'Fire', 'Ghost', 'Bug', 'Normal'], dtype=object)

There are 19 distinct values for type 2 (the same 18 types plus 'None'): 'Poison', 'None', 'Flying', 'Dragon', 'Ground', 'Fairy', 'Grass', 'Fighting', 'Psychic', 'Steel', 'Ice', 'Rock', 'Dark', 'Water', 'Electric', 'Fire', 'Ghost', 'Bug', 'Normal'.

'None' means the Pokemon has no second type.

In [11]:
list_generation = data["Generation"].unique()
list_generation

array([1, 2, 3, 4, 5, 6], dtype=int64)

The dataset covers 6 generations, numbered 1 to 6.

In [12]:
list_legendary = data["Legendary"].unique()
list_legendary

array([False,  True])

There are 2 kinds of Pokemon: legendary and non-legendary.

In [13]:
list_columns = data.columns.values
list_columns

array(['ID', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack',
       'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Generation',
       'Legendary'], dtype=object)

The Pokemon dataset has 13 columns.

# Data Visualization

### 1. Identifying The Number Of Pokemon Of Each Type

First, we count how many Pokemon there are of each type and visualize the result, so the distribution is easy to see.

In [14]:
colours = ['#00FFFF', '#7FFFD4', '#000000', '#0000FF', '#8A2BE2', '#A52A2A','#DEB887', '#5F9EA0', '#7FFF00', '#D2691E',
            '#FF7F50', '#6495ED', '#DC143C', '#00FFFF', '#00008B', '#008B8B', '#B8860B', '#A9A9A9', '#006400', '#BDB76B',
            '#8B008B', '#556B2F', '#FF8C00', '#9932CC', '#8B0000', '#E9967A', '#8FBC8F', '#483D8B', '#2F4F4F', '#00CED1',
            '#9400D3', '#FF1493', '#00BFFF', '#696969', '#1E90FF', '#B22222', '#228B22', '#FF00FF', '#FFD700', '#DAA520',
            '#808080', '#008000', '#ADFF2F', '#FF69B4', '#CD5C5C', '#4B0082', '#F0E68C', '#7CFC00', '#ADD8E6', '#F08080',
            '#90EE90', '#FFB6C1', '#FFA07A', '#20B2AA', '#87CEFA', '#778899', '#B0C4DE', '#00FF00', '#32CD32', '#FF00FF',
            '#800000', '#66CDAA', '#0000CD', '#BA55D3', '#9370DB', '#3CB371', '#7B68EE', '#00FA9A', '#48D1CC', '#C71585',
            '#191970', '#FFE4B5', '#FFDEAD', '#000080', '#808000', '#6B8E23', '#FFA500', '#FF4500', '#DA70D6', '#EEE8AA',
            '#98FB98', '#AFEEEE', '#DB7093', '#CD853F', '#FFC0CB', '#DDA0DD', '#B0E0E6', '#800080', '#663399', '#FF0000',
            '#BC8F8F', '#4169E1', '#8B4513', '#FA8072', '#F4A460', '#2E8B57', '#A0522D', '#C0C0C0', '#87CEEB', '#6A5ACD',
            '#708090', '#00FF7F', '#4682B4', '#D2B48C', '#008080', '#D8BFD8', '#FF6347', '#40E0D0', '#EE82EE', '#F5DEB3',
            '#FFFF00', '#9ACD32']

In [15]:
amount_type1 = []
for a in list_type1:
    type1_list = data[data["Type 1"] == a]
    amount_type1.append(len(type1_list))
type1_bar = go.Bar(x=list_type1,y=amount_type1,marker=dict(color=colours[:len(amount_type1)]))
type1_layout = go.Layout(title={"text":"<b>Amount Of Pokemon Based On Type 1</b>","y":0.92,"x":0.5,
                        "xanchor":"center","yanchor":"top"},height=600,margin=go.layout.Margin())
data_type1 = [type1_bar]
type1_figure = go.Figure(data=data_type1,layout=type1_layout)
iplot(type1_figure)

Based on the graph, 'Water' is the most common type 1 and 'Flying' is the least common.
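
The per-type counts behind this bar chart can also be obtained without an explicit loop. The following is a minimal sketch using pandas `value_counts`; the six-row `sample` frame is hypothetical and only illustrates the idea.

```python
import pandas as pd

# Hypothetical sample of primary types (not the real dataset).
sample = pd.DataFrame(
    {"Type 1": ["Water", "Fire", "Water", "Grass", "Water", "Fire"]}
)

# value_counts() returns one count per distinct value, sorted descending.
type1_counts = sample["Type 1"].value_counts()

most_common = type1_counts.idxmax()   # label with the highest count
least_common = type1_counts.idxmin()  # label with the lowest count

print(type1_counts.to_dict(), most_common, least_common)
```

On the real data, `type1_counts.index` and `type1_counts.values` could be passed to `go.Bar` as `x` and `y`, although the bars would then be ordered by count rather than by order of first appearance.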

In [16]:
amount_type2 = []
for b in list_type1:
    type2_list = data[data["Type 2"] == b]
    amount_type2.append(len(type2_list))
type2_bar = go.Bar(x=list_type1,y=amount_type2,marker=dict(color=colours[:len(amount_type2)]))
type2_layout = go.Layout(title={"text":"<b>Amount Of Pokemon Based On Type 2</b>","y":0.92,"x":0.5,
                        "xanchor":"center","yanchor":"top"},height=600,margin=go.layout.Margin())
data_type2 = [type2_bar]
type2_figure = go.Figure(data=data_type2,layout=type2_layout)
iplot(type2_figure)

Interestingly, the most common type 2 is 'Flying', which is the least common type 1, and the least common type 2 is 'Bug'.

Now let's combine type 1 and type 2 to get an overall count for each type. A Pokemon counts toward a type if that type appears in either column. For example, if type 1 is 'Fire' and type 2 is 'Flying', the Pokemon is counted as both 'Fire' and 'Flying'. If type 1 is 'Water' and type 2 is 'None', it is counted only as 'Water'. If both columns held the same type, the Pokemon would still be counted once for that type.

In [17]:
amount_type_all = []
for c in list_type1:
    type_list_1 = data["Type 1"] == c
    type_list_2 = data["Type 2"] == c
    amount_type_all.append(len(data[type_list_1 | type_list_2]))
type_bar = go.Bar(x=list_type1,y=amount_type_all,marker=dict(color=colours[:len(amount_type_all)]))
type_layout = go.Layout(title={"text":"<b>Amount Of Pokemon Based On All Type</b>","y":0.92,"x":0.5,
                        "xanchor":"center","yanchor":"top"},height=600,margin=go.layout.Margin())
data_type = [type_bar]
type_figure = go.Figure(data=data_type,layout=type_layout)
iplot(type_figure)

So 'Water' still has the highest count overall, followed by 'Normal' and 'Flying'.

Now let's look at the number of Pokemon of each type within each generation.
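
The counting rule used for this combined chart (a Pokemon counts toward a type if the type appears in either column) can be sketched on a small example. The `sample` frame and the helper name `count_by_type` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical four-row sample; "None" marks a missing second type.
sample = pd.DataFrame({
    "Name": ["Charizard", "Squirtle", "Gyarados", "Pidgey"],
    "Type 1": ["Fire", "Water", "Water", "Normal"],
    "Type 2": ["Flying", "None", "Flying", "Flying"],
})


def count_by_type(frame, type_name):
    """Count rows whose primary or secondary type equals type_name."""
    mask = (frame["Type 1"] == type_name) | (frame["Type 2"] == type_name)
    # Summing a boolean mask counts the True entries.
    return int(mask.sum())


combined = {t: count_by_type(sample, t) for t in ["Fire", "Water", "Flying", "Normal"]}
print(combined)
```

Because the two conditions are joined with `|`, a row matches at most once per type, so a Pokemon is never double-counted for the same type.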

In [18]:
def make_graph(e):
    amount_type = []
    amount_type_gen = []
    for d in list_type1:
        type1_list = data["Type 1"] == d
        type2_list = data["Type 2"] == d
        amount_gen = type1_list | type2_list
        amount_type.append(len(data[amount_gen]))
        amount_type_gen.append(len(data[amount_gen & (data["Generation"] == e)]))
    gen_bar = go.Bar(x=list_type1,y=amount_type_gen,marker=dict(color=colours[:len(amount_type_gen)]))
    gen_layout = go.Layout(title={"text":f"<b>Amount Of Pokemon Based On Type In Generation {e}</b>","y":0.92,"x":0.5,
                        "xanchor":"center","yanchor":"top"},height=600,margin=go.layout.Margin())
    data_gen = [gen_bar]
    gen_figure = go.Figure(data=data_gen,layout=gen_layout)
    iplot(gen_figure)

In [19]:
make_graph(1)

Based on the graph, the most common type in generation 1 is 'Poison' and the least common is 'Dark'.

In [20]:
make_graph(2)

Based on the graph, the most common type in generation 2 is 'Flying' and the least common is 'Ghost'.

In [21]:
make_graph(3)

Based on the graph, the most common type in generation 3 is 'Water' and the least common are 'Poison' and 'Electric'.

In [22]:
make_graph(4)

Based on the graph, the most common type in generation 4 is 'Normal' and the least common is 'Fairy'.

In [23]:
make_graph(5)

Based on the graph, the most common type in generation 5 is 'Flying' and the least common is 'Fairy'.

In [24]:
make_graph(6)

Based on the graph, the most common types in generation 6 are 'Grass' and 'Ghost' and the least common are 'Poison' and 'Ground'.

Now let's combine all of these into one figure so we can compare the generations.

In [25]:
amount_type_join = []
amount_type_join1 = []
amount_type_join2 = []
amount_type_join3 = []
amount_type_join4 = []
amount_type_join5 = []
amount_type_join6 = []
for f in list_type1:
    type1_join = data["Type 1"] == f
    type2_join = data["Type 2"] == f
    amount_type_all_join = type1_join | type2_join
    amount_type_join.append(len(data[amount_type_all_join]))
    amount_type_join1.append(len(data[amount_type_all_join & (data["Generation"] == 1)]))
    amount_type_join2.append(len(data[amount_type_all_join & (data["Generation"] == 2)]))
    amount_type_join3.append(len(data[amount_type_all_join & (data["Generation"] == 3)]))
    amount_type_join4.append(len(data[amount_type_all_join & (data["Generation"] == 4)]))
    amount_type_join5.append(len(data[amount_type_all_join & (data["Generation"] == 5)]))
    amount_type_join6.append(len(data[amount_type_all_join & (data["Generation"] == 6)]))
gen_bar1 = go.Bar(x=list_type1,y=amount_type_join1,marker=dict(color=colours[:len(amount_type_join)]),name="Gen 1")
gen_bar2 = go.Bar(x=list_type1,y=amount_type_join2,marker=dict(color=colours[:len(amount_type_join)]),name="Gen 2")
gen_bar3 = go.Bar(x=list_type1,y=amount_type_join3,marker=dict(color=colours[:len(amount_type_join)]),name="Gen 3")
gen_bar4 = go.Bar(x=list_type1,y=amount_type_join4,marker=dict(color=colours[:len(amount_type_join)]),name="Gen 4")
gen_bar5 = go.Bar(x=list_type1,y=amount_type_join5,marker=dict(color=colours[:len(amount_type_join)]),name="Gen 5")
gen_bar6 = go.Bar(x=list_type1,y=amount_type_join6,marker=dict(color=colours[:len(amount_type_join)]),name="Gen 6")
figure = subplots.make_subplots(rows=3,cols=2,subplot_titles=("Generation 1","Generation 2","Generation 3","Generation 4",
                                                                "Generation 5","Generation 6"))
figure.append_trace(gen_bar1, 1,1)
figure.append_trace(gen_bar2, 1,2)
figure.append_trace(gen_bar3, 2,1)
figure.append_trace(gen_bar4, 2,2)
figure.append_trace(gen_bar5, 3,1)
figure.append_trace(gen_bar6, 3,2)
figure["layout"].update(title={"text":"<b>Amount Of Pokemon Based On Type In Each Generation</b>","y":0.96,"x":0.5,
                                "xanchor":"center","yanchor":"top"},height=1000,margin=go.layout.Margin(),showlegend=False)
iplot(figure)
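
The six per-generation lists built above can also be produced as a single table. The sketch below is an alternative formulation, not the notebook's own method: it reshapes the two type columns into one with `melt` and counts with `pd.crosstab`. The `sample` frame is hypothetical, and the `drop_duplicates` step assumes names are unique, as they are in this dataset.

```python
import pandas as pd

# Hypothetical four-row sample spanning two generations.
sample = pd.DataFrame({
    "Name": ["Charizard", "Squirtle", "Hoothoot", "Marill"],
    "Type 1": ["Fire", "Water", "Normal", "Water"],
    "Type 2": ["Flying", "None", "Flying", "Fairy"],
    "Generation": [1, 1, 2, 2],
})

# Stack "Type 1" and "Type 2" into a single "Type" column.
long_form = sample.melt(
    id_vars=["Name", "Generation"],
    value_vars=["Type 1", "Type 2"],
    value_name="Type",
)

# Drop the "None" placeholder and guard against counting a Pokemon
# twice for the same type.
long_form = long_form[long_form["Type"] != "None"]
long_form = long_form.drop_duplicates(subset=["Name", "Type"])

# Rows are types, columns are generations, cells are counts.
counts = pd.crosstab(long_form["Type"], long_form["Generation"])
print(counts)
```

Each column of `counts` corresponds to one of the `amount_type_join1` to `amount_type_join6` lists, with rows in alphabetical order rather than in the order of `list_type1`.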

### 2. Identifying Average Pokemon Stats For Each Type

We will compare the average stats for each type. As before, we combine type 1 and type 2, so a Pokemon contributes to the average of every type it has.

In [26]:
def make_graph_hor(h):
    mean_type = []
    for g in list_type1:
        type_1 = data["Type 1"] == g
        type_2 = data["Type 2"] == g
        type_value = type_1 | type_2
        data_type = data[type_value]
        total_mean = data_type[h].mean()
        mean_type.append(total_mean)
    type_mean_bar = go.Bar(x = mean_type,y = list_type1,orientation = "h",marker=dict(color=colours[:len(mean_type)]))
    type_mean_layout = go.Layout(title={"text":f"<b>Average Stats Of {h} Based On Type</b>","y":0.93,"x":0.5,
                                    "xanchor":"center","yanchor":"top"},height=700,margin=go.layout.Margin())
    data_type_bar = [type_mean_bar]
    mean_fig = go.Figure(data=data_type_bar, layout=type_mean_layout)
    iplot(mean_fig)

In [27]:
make_graph_hor("Total")

The highest average total belongs to the 'Dragon' type and the lowest to the 'Bug' type. That is no surprise, since 'Dragon' is generally regarded as one of the strongest types.

In [28]:
make_graph_hor("HP")

The highest average HP belongs to the 'Dragon' type and the lowest to the 'Bug' type, which again fits the view of 'Dragon' as the strongest type.

In [29]:
make_graph_hor("Attack")

The pattern repeats: the highest average attack belongs to the 'Dragon' type, but this time the lowest belongs to the 'Fairy' type.

In [30]:
make_graph_hor("Defense")

The highest average defense belongs to the 'Steel' type and the lowest to the 'Normal' type. This makes sense, since 'Steel' Pokemon are portrayed as having the toughest bodies of all.

In [31]:
make_graph_hor("Sp. Atk")

As with the attack attribute, the highest average special attack belongs to the 'Dragon' type and the lowest to the 'Bug' type.

In [32]:
make_graph_hor("Sp. Def")

You might expect the same result as for defense, but it isn't. Surprisingly, the highest average special defense belongs to the 'Dragon' type, while the lowest again belongs to the 'Normal' type.

In [33]:
make_graph_hor("Speed")

For the speed attribute, the highest average belongs to the 'Flying' type and the lowest to the 'Rock' type. That is not surprising: Pokemon that can fly would be expected to move fast.

Let's put all of these into one figure so we can compare them side by side.

In [34]:
mean_type_all = []
mean_type_hp = []
mean_type_atk = []
mean_type_def = []
mean_type_sp_atk = []
mean_type_sp_def = []
mean_type_speed = []
for i in list_type1:
    type1_all = data["Type 1"] == i
    type2_all = data["Type 2"] == i
    type_value_all = type1_all | type2_all
    data_type_all = data[type_value_all]
    hp_mean_all = data_type_all["HP"].mean()
    atk_mean_all = data_type_all["Attack"].mean()
    def_mean_all = data_type_all["Defense"].mean()
    sp_atk_mean_all = data_type_all["Sp. Atk"].mean()
    sp_def_mean_all = data_type_all["Sp. Def"].mean()
    speed_mean_all = data_type_all["Speed"].mean()
    mean_type_all.append(data_type_all)
    mean_type_hp.append(hp_mean_all)
    mean_type_atk.append(atk_mean_all)
    mean_type_def.append(def_mean_all)
    mean_type_sp_atk.append(sp_atk_mean_all)
    mean_type_sp_def.append(sp_def_mean_all)
    mean_type_speed.append(speed_mean_all)
hp_bar = go.Bar(x=mean_type_hp,y=list_type1,orientation = "h",marker=dict(color=colours[:len(mean_type_all)]),
                name="HP")
atk_bar = go.Bar(x=mean_type_atk,y=list_type1,orientation = "h",marker=dict(color=colours[:len(mean_type_all)]),
                name="Attack")
def_bar = go.Bar(x=mean_type_def,y=list_type1,orientation = "h",marker=dict(color=colours[:len(mean_type_all)]),
                name="Defense")
sp_atk_bar = go.Bar(x=mean_type_sp_atk,y=list_type1,orientation = "h",marker=dict(color=colours[:len(mean_type_all)]),
                    name="Sp. Atk")
sp_def_bar = go.Bar(x=mean_type_sp_def,y=list_type1,orientation = "h",marker=dict(color=colours[:len(mean_type_all)]),
                    name="Sp. Def")
speed_bar = go.Bar(x=mean_type_speed,y=list_type1,orientation = "h",marker=dict(color=colours[:len(mean_type_all)]),
                    name="Speed")
figure_all = subplots.make_subplots(rows=3,cols=2,vertical_spacing=0.05,subplot_titles=("Average HP","Average Attack",
                                                                                        "Average Defense","Average Sp. Atk",
                                                                                        "Average Sp. Def","Average Speed",))
figure_all.append_trace(hp_bar, 1,1)
figure_all.append_trace(atk_bar, 1,2)
figure_all.append_trace(def_bar, 2,1)
figure_all.append_trace(sp_atk_bar, 2,2)
figure_all.append_trace(sp_def_bar, 3,1)
figure_all.append_trace(speed_bar, 3,2)
figure_all["layout"].update(title={"text":"<b>Comparison Average Stats Based On Type</b>","y":0.97,"x":0.5,
                                "xanchor":"center","yanchor":"top"},height=1500,margin=go.layout.Margin(),showlegend=False)
iplot(figure_all)
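
The per-type averages computed in the loop above can be collected into one table instead of six separate lists. This is a minimal sketch on hypothetical data; `sample`, `stat_columns`, and `mean_stats_for_type` are illustrative names that do not appear in the original notebook.

```python
import pandas as pd

# Hypothetical three-row sample with two of the six stat columns.
sample = pd.DataFrame({
    "Type 1": ["Fire", "Water", "Water"],
    "Type 2": ["Flying", "None", "Flying"],
    "HP": [78, 44, 95],
    "Speed": [100, 43, 81],
})

stat_columns = ["HP", "Speed"]


def mean_stats_for_type(frame, type_name):
    """Average the stat columns over rows having type_name in either slot."""
    mask = (frame["Type 1"] == type_name) | (frame["Type 2"] == type_name)
    return frame.loc[mask, stat_columns].mean()


# One row per type, one column per stat.
summary = pd.DataFrame(
    {t: mean_stats_for_type(sample, t) for t in ["Fire", "Water", "Flying"]}
).T
print(summary)
```

With this layout, `summary["HP"].idxmax()` gives the type with the highest average HP directly, which is a quick way to double-check what the bar charts show.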

### 3. Identifying Average Pokemon Stats For Each Generation

Here we compute the average stats for every generation to see which generation has the highest values.

In [35]:
data_gen1 = data[data["Generation"]==1]
data_gen2 = data[data["Generation"]==2]
data_gen3 = data[data["Generation"]==3]
data_gen4 = data[data["Generation"]==4]
data_gen5 = data[data["Generation"]==5]
data_gen6 = data[data["Generation"]==6]

mean_total_gen1 = data_gen1["Total"].mean()
mean_total_gen2 = data_gen2["Total"].mean()
mean_total_gen3 = data_gen3["Total"].mean()
mean_total_gen4 = data_gen4["Total"].mean()
mean_total_gen5 = data_gen5["Total"].mean()
mean_total_gen6 = data_gen6["Total"].mean()

mean_hp_gen1 = data_gen1["HP"].mean()
mean_hp_gen2 = data_gen2["HP"].mean()
mean_hp_gen3 = data_gen3["HP"].mean()
mean_hp_gen4 = data_gen4["HP"].mean()
mean_hp_gen5 = data_gen5["HP"].mean()
mean_hp_gen6 = data_gen6["HP"].mean()

mean_attack_gen1 = data_gen1["Attack"].mean()
mean_attack_gen2 = data_gen2["Attack"].mean()
mean_attack_gen3 = data_gen3["Attack"].mean()
mean_attack_gen4 = data_gen4["Attack"].mean()
mean_attack_gen5 = data_gen5["Attack"].mean()
mean_attack_gen6 = data_gen6["Attack"].mean()

mean_defense_gen1 = data_gen1["Defense"].mean()
mean_defense_gen2 = data_gen2["Defense"].mean()
mean_defense_gen3 = data_gen3["Defense"].mean()
mean_defense_gen4 = data_gen4["Defense"].mean()
mean_defense_gen5 = data_gen5["Defense"].mean()
mean_defense_gen6 = data_gen6["Defense"].mean()

mean_sp_atk_gen1 = data_gen1["Sp. Atk"].mean()
mean_sp_atk_gen2 = data_gen2["Sp. Atk"].mean()
mean_sp_atk_gen3 = data_gen3["Sp. Atk"].mean()
mean_sp_atk_gen4 = data_gen4["Sp. Atk"].mean()
mean_sp_atk_gen5 = data_gen5["Sp. Atk"].mean()
mean_sp_atk_gen6 = data_gen6["Sp. Atk"].mean()

mean_sp_def_gen1 = data_gen1["Sp. Def"].mean()
mean_sp_def_gen2 = data_gen2["Sp. Def"].mean()
mean_sp_def_gen3 = data_gen3["Sp. Def"].mean()
mean_sp_def_gen4 = data_gen4["Sp. Def"].mean()
mean_sp_def_gen5 = data_gen5["Sp. Def"].mean()
mean_sp_def_gen6 = data_gen6["Sp. Def"].mean()

mean_speed_gen1 = data_gen1["Speed"].mean()
mean_speed_gen2 = data_gen2["Speed"].mean()
mean_speed_gen3 = data_gen3["Speed"].mean()
mean_speed_gen4 = data_gen4["Speed"].mean()
mean_speed_gen5 = data_gen5["Speed"].mean()
mean_speed_gen6 = data_gen6["Speed"].mean()

stats_type = ["HP","Attack","Defense","Sp. Atk","Sp. Def","Speed"]
data_total_mean = [mean_total_gen1,mean_total_gen2,mean_total_gen3,mean_total_gen4,mean_total_gen5,mean_total_gen6]
data_hp_mean = [mean_hp_gen1,mean_hp_gen2,mean_hp_gen3,mean_hp_gen4,mean_hp_gen5,mean_hp_gen6]
data_attack_mean = [mean_attack_gen1,mean_attack_gen2,mean_attack_gen3,mean_attack_gen4,mean_attack_gen5,mean_attack_gen6]
data_defense_mean = [mean_defense_gen1,mean_defense_gen2,mean_defense_gen3,mean_defense_gen4,mean_defense_gen5,mean_defense_gen6]
data_sp_atk_mean = [mean_sp_atk_gen1,mean_sp_atk_gen2,mean_sp_atk_gen3,mean_sp_atk_gen4,mean_sp_atk_gen5,mean_sp_atk_gen6]
data_sp_def_mean = [mean_sp_def_gen1,mean_sp_def_gen2,mean_sp_def_gen3,mean_sp_def_gen4,mean_sp_def_gen5,mean_sp_def_gen6]
data_speed_mean = [mean_speed_gen1,mean_speed_gen2,mean_speed_gen3,mean_speed_gen4,mean_speed_gen5,mean_speed_gen6]

def make_scatter(j):
    # Map each stat name to its list of per-generation averages.
    means_by_stat = {
        "Total": data_total_mean,
        "HP": data_hp_mean,
        "Attack": data_attack_mean,
        "Defense": data_defense_mean,
        "Sp. Atk": data_sp_atk_mean,
        "Sp. Def": data_sp_def_mean,
        "Speed": data_speed_mean,
    }
    # Reject unknown stat names, as the original if/elif chain did.
    if j not in means_by_stat:
        print("Wrong Type")
        return
    k = means_by_stat[j]
    data_scatter = go.Scatter(x=list_generation,y=k,name=f"Avg {j} Stats")
    data_layout = go.Layout(title={"text":f"<b>Average {j} Stats For Each Generation</b>","y":0.91,"x":0.5,
                                "xanchor":"center","yanchor":"top"},height=500,margin=go.layout.Margin(),showlegend=True)
    data_fig = go.Figure(data=data_scatter,layout=data_layout)
    iplot(data_fig)
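
The per-generation averages assigned one variable at a time in this cell can also be computed in a single step. The following is a minimal sketch using pandas `groupby` on a hypothetical four-row frame, shown as an alternative formulation rather than the notebook's own method.

```python
import pandas as pd

# Hypothetical sample with two generations and two stat columns.
sample = pd.DataFrame({
    "Generation": [1, 1, 2, 2],
    "HP": [45, 39, 60, 80],
    "Attack": [49, 52, 62, 82],
})

# One row per generation, one column per stat, each cell a mean.
gen_means = sample.groupby("Generation")[["HP", "Attack"]].mean()

# Generation with the highest average HP.
best_hp_generation = gen_means["HP"].idxmax()

print(gen_means)
print(best_hp_generation)
```

On the real data, a column such as `gen_means["HP"]` would hold the same values as `data_hp_mean`, indexed by generation.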

In [36]:
make_scatter("Total")

Based on the scatter graph, Generation 4 has the highest average total and Generation 2 has the lowest.

In [37]:
make_scatter("HP")

Again, Generation 4 has the highest average HP, but the lowest belongs to Generation 1.

In [38]:
make_scatter("Attack")

The pattern repeats: Generation 4 has the highest average attack and Generation 2 has the lowest.

In [39]:
make_scatter("Defense")

Unsurprisingly, Generation 4 has the highest average defense, and Generation 1 has the lowest.

In [40]:
make_scatter("Sp. Atk")

The pattern repeats again: Generation 4 has the highest average special attack and Generation 2 has the lowest.

In [41]:
make_scatter("Sp. Def")

Generation 4 still has the highest average special defense, but now the lowest belongs to Generation 1.

In [42]:
make_scatter("Speed")

Now that's a different result: Generation 1 has the highest average speed and Generation 2 has the lowest.

Now let's combine all of these into one scatter graph to compare all the attributes.

In [43]:
mean_all_stats1 = [mean_hp_gen1,mean_attack_gen1,mean_defense_gen1,mean_sp_atk_gen1,mean_sp_def_gen1,mean_speed_gen1]
mean_all_stats2 = [mean_hp_gen2,mean_attack_gen2,mean_defense_gen2,mean_sp_atk_gen2,mean_sp_def_gen2,mean_speed_gen2]
mean_all_stats3 = [mean_hp_gen3,mean_attack_gen3,mean_defense_gen3,mean_sp_atk_gen3,mean_sp_def_gen3,mean_speed_gen3]
mean_all_stats4 = [mean_hp_gen4,mean_attack_gen4,mean_defense_gen4,mean_sp_atk_gen4,mean_sp_def_gen4,mean_speed_gen4]
mean_all_stats5 = [mean_hp_gen5,mean_attack_gen5,mean_defense_gen5,mean_sp_atk_gen5,mean_sp_def_gen5,mean_speed_gen5]
mean_all_stats6 = [mean_hp_gen6,mean_attack_gen6,mean_defense_gen6,mean_sp_atk_gen6,mean_sp_def_gen6,mean_speed_gen6]

all_stats_scatter1 = go.Scatter(x=stats_type,y=mean_all_stats1,name="Gen 1")
all_stats_scatter2 = go.Scatter(x=stats_type,y=mean_all_stats2,name="Gen 2")
all_stats_scatter3 = go.Scatter(x=stats_type,y=mean_all_stats3,name="Gen 3")
all_stats_scatter4 = go.Scatter(x=stats_type,y=mean_all_stats4,name="Gen 4")
all_stats_scatter5 = go.Scatter(x=stats_type,y=mean_all_stats5,name="Gen 5")
all_stats_scatter6 = go.Scatter(x=stats_type,y=mean_all_stats6,name="Gen 6")

stats_data = [all_stats_scatter1,all_stats_scatter2,all_stats_scatter3,all_stats_scatter4,all_stats_scatter5,all_stats_scatter6]
all_stats_layout = go.Layout(title={"text":"<b>Comparison Every Average Stats For Each Generation</b>","y":0.91,"x":0.5,
                                    "xanchor":"center","yanchor":"top"},height=500,margin=go.layout.Margin(),showlegend=True)
all_stats_figure = go.Figure(data=stats_data,layout=all_stats_layout)
iplot(all_stats_figure)

Unsurprisingly, Generation 4 leads overall: it has the highest average for every attribute except speed, where Generation 1 leads.

### 4. Identifying 1 Type Pokemon And 2 Type Pokemon

Here we find the percentage (%) of Pokemon that have 1 type and that have 2 types.

In [44]:
amount_1type = len(data[data["Type 2"] == "None"])
amount_2type = len(data[data["Type 2"] != "None"])
type_label = ["1 Type","2 Type"]
type_value_data = [amount_1type,amount_2type]
type_pie = go.Pie(labels=type_label,values=type_value_data)
type_pie_layout = go.Layout(title={"text":f"<b>Comparison 1 Type And 2 Type Pokemon in %</b>","y":0.90,"x":0.5,
                                    "xanchor":"center","yanchor":"top"},showlegend=True)
type_pie_data = [type_pie]
type_pie_fig = go.Figure(data=type_pie_data,layout=type_pie_layout)
iplot(type_pie_fig)

Interestingly, most Pokemon have 2 types: they make up 51.7% of all Pokemon.
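
The percentage on the pie chart can be cross-checked by hand. The sketch below first computes the dual-type share on a hypothetical four-row sample, then repeats the arithmetic with the counts reported by `data.info()` earlier (414 non-null "Type 2" entries out of 800 rows).

```python
import pandas as pd

# Hypothetical four-row sample; "None" marks a missing second type.
sample = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander", "Squirtle", "Charizard"],
    "Type 2": ["Poison", "None", "None", "Flying"],
})

# The mean of a boolean mask is the fraction of True values.
dual_type_share = (sample["Type 2"] != "None").mean() * 100

# Same arithmetic with the counts from the data.info() output:
# 414 non-null "Type 2" entries out of 800 rows.
full_dataset_share = 414 / 800 * 100

print(f"sample: {dual_type_share:.1f}%, full dataset: {full_dataset_share:.2f}%")
```

The full-dataset value comes out at about 51.75%, in line with the 51.7% shown on the pie chart.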

### 5. Identifying Legendary and Non-Legendary Pokemon

Here we find the percentage (%) of legendary and non-legendary Pokemon.

In [45]:
amount_non_legendary = len(data[data["Legendary"] == False])
amount_legendary = len(data[data["Legendary"] == True])
legendary_pie_label = ["Non-Legendary","Legendary"]
legendary_pie_value = [amount_non_legendary,amount_legendary]
legendary_pie_trace = go.Pie(labels=legendary_pie_label,values=legendary_pie_value)
legendary_pie_layout = go.Layout(title={"text":f"<b>Comparison Non-Legendary And Legendary Pokemon in %</b>","y":0.90,"x":0.5,
                                        "xanchor":"center","yanchor":"top"},showlegend=True)
legendary_pie_data = [legendary_pie_trace]
legendary_pie_fig = go.Figure(data=legendary_pie_data,layout=legendary_pie_layout)
iplot(legendary_pie_fig)

Unsurprisingly, most Pokemon are non-legendary. Legendary Pokemon are rare, so their number is naturally small.

Let's see how this looks for each generation.

In [46]:
def make_pie(l):
    amount_non_legen = len(data[(data["Generation"]==l) & (data["Legendary"]==False)])
    amount_legen = len(data[(data["Generation"] == l) & (data["Legendary"]==True)])
    legen_pie_label = ["Non-Legendary","Legendary"]
    legen_pie_value = [amount_non_legen,amount_legen]
    legen_pie_trace = go.Pie(labels=legen_pie_label,values=legen_pie_value)
    legen_pie_layout = go.Layout(title={"text":f"<b>Comparison Non-Legendary And Legendary Pokemon in % On Generation {l}</b>",
                                        "y":0.90,"x":0.5,"xanchor":"center","yanchor":"top"},showlegend=True)
    legen_pie_data = [legen_pie_trace]
    legen_pie_fig = go.Figure(data=legen_pie_data,layout=legen_pie_layout)
    iplot(legen_pie_fig)

In [47]:
make_pie(1)

Only 3.61% of the Pokemon in Generation 1 are legendary.

In [48]:
make_pie(2)

Only 4.72% of the Pokemon in Generation 2 are legendary, slightly more than in Generation 1.

In [49]:
make_pie(3)

The percentage rises to 11.3% in Generation 3.

In [50]:
make_pie(4)

Interestingly, in Generation 4 the percentage of legendary Pokemon drops to 10.7%.

In [51]:
make_pie(5)

It drops further in Generation 5, to 9.09%.

In [52]:
make_pie(6)

The percentage of legendary Pokemon rises again from Generation 5 to Generation 6, reaching 9.76%.

Now let's compare all of these in one figure so we can see the differences between them.
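
The legendary percentages read off the six pie charts can also be computed directly. This is a minimal sketch on a hypothetical six-row frame: because "Legendary" is a boolean column, its mean within each generation is the legendary fraction.

```python
import pandas as pd

# Hypothetical sample: four Pokemon in generation 1, two in generation 2.
sample = pd.DataFrame({
    "Generation": [1, 1, 1, 1, 2, 2],
    "Legendary": [False, False, False, True, False, True],
})

# Mean of a boolean column per group, scaled to a percentage.
legendary_share = sample.groupby("Generation")["Legendary"].mean() * 100

print(legendary_share.round(2).to_dict())
```

Applied to the real data, the resulting series could be compared against the percentages quoted above, one value per generation.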

In [53]:
generations = range(1, 7)

# One "domain" cell per pie chart, laid out on a 3x2 grid.
specs = [[{"type":"domain"},{"type":"domain"}] for _ in range(3)]
fig_pie = subplots.make_subplots(rows=3,cols=2,specs=specs,
                                 subplot_titles=[f"Generation {g}" for g in generations])

for index, g in enumerate(generations):
    # Count non-legendary and legendary pokemon in the current generation.
    amount_non_legen = len(data[(data["Generation"]==g) & (data["Legendary"]==False)])
    amount_legen = len(data[(data["Generation"]==g) & (data["Legendary"]==True)])
    legen_pie_label = [f"Non-Legendary Gen {g}",f"Legendary Gen {g}"]
    legen_pie_value = [amount_non_legen,amount_legen]
    # Fill the grid row by row: Generation 1 -> (1,1), Generation 2 -> (1,2), ...
    fig_pie.add_trace(go.Pie(labels=legen_pie_label,values=legen_pie_value,name=f"Pie Gen {g}"),
                      row=index // 2 + 1,col=index % 2 + 1)

fig_pie.update_traces(textposition="inside")
fig_pie["layout"].update(title={"text":"<b>Percentage Of Non-Legendary And Legendary Pokemon In Each Generation</b>","y":0.96,
                                "x":0.5,"xanchor":"center","yanchor":"top"},height=1000,margin=go.layout.Margin(),showlegend=False)
iplot(fig_pie)

### 7. Identifying Non-Legendary And Legendary Pokemon's Stats

In this section we will find the average of every stat for non-legendary and legendary pokemon.

In [54]:
data_non_legendary = data[data["Legendary"]==False]
stats_non_legendary = []
for m in stats_type:
    mean_non_legendary = data_non_legendary[m].mean()
    stats_non_legendary.append(mean_non_legendary)

data_legendary = data[data["Legendary"]==True]
stats_legendary = []
for n in stats_type:
    mean_legendary = data_legendary[n].mean()
    stats_legendary.append(mean_legendary)

data_scatter_polar = [go.Scatterpolar(r=stats_legendary,theta=stats_type,fill="toself",name="Legendary"),
                      go.Scatterpolar(r=stats_non_legendary,theta=stats_type,fill="toself",name="Non-Legendary")]
data_scatter_layout = go.Layout(title={"text":"<b>Comparison Of Non-Legendary And Legendary Pokemon's Stats</b>",
                                        "y":0.91,"x":0.5,"xanchor":"center","yanchor":"top"},showlegend=True)

scatter_polar_fig = go.Figure(data=data_scatter_polar,layout=data_scatter_layout)
iplot(scatter_polar_fig)

On average, legendary pokemon are clearly stronger than non-legendary pokemon.

Let's look at the pokemon stats in each generation.
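The averages behind the radar chart can also be inspected as a table. This is a minimal sketch with made-up numbers; `sample` and `stat_columns` are hypothetical stand-ins for the notebook's `data` and `stats_type`.

```python
import pandas as pd

# Hypothetical sample with only two stat columns (values are made up).
sample = pd.DataFrame({
    "Legendary": [False, False, True, True],
    "HP":        [40, 60, 90, 110],
    "Attack":    [50, 70, 100, 140],
})
stat_columns = ["HP", "Attack"]  # assumed stand-in for `stats_type`

# One row per group, one column per stat, each cell holding the mean.
mean_stats = sample.groupby("Legendary")[stat_columns].mean()
mean_stats.index = mean_stats.index.map({False: "Non-Legendary", True: "Legendary"})
print(mean_stats)

# Positive differences mean the legendary average is higher for that stat.
difference = mean_stats.loc["Legendary"] - mean_stats.loc["Non-Legendary"]
print(difference)
```

If every entry of `difference` is positive on the real data, the legendary trace encloses the non-legendary one on the radar chart.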

In [55]:
def make_scatter_polar(o):
    non_legendary_data = data[(data["Generation"]==o)&(data["Legendary"]==False)]
    non_legendary_stats = []
    for p in stats_type:
        non_legendary_mean = non_legendary_data[p].mean()
        non_legendary_stats.append(non_legendary_mean)
    
    legendary_data = data[(data["Generation"]==o)&(data["Legendary"]==True)]
    legendary_stats = []
    for q in stats_type:
        legendary_mean = legendary_data[q].mean()
        legendary_stats.append(legendary_mean)
    
    scatter_polar = [go.Scatterpolar(r=legendary_stats,theta=stats_type,fill="toself",name=f"Legendary Generation {o}"),
                     go.Scatterpolar(r=non_legendary_stats,theta=stats_type,fill="toself",name=f"Non-Legendary Generation {o}")]
    scatter_layout = go.Layout(title={"text":f"<b>Comparison Of Non-Legendary And Legendary Pokemon's Stats Generation {o}</b>",
                                            "y":0.91,"x":0.5,"xanchor":"center","yanchor":"top"},showlegend=True)
    scatter_polar_figure = go.Figure(data=scatter_polar,layout=scatter_layout)
    iplot(scatter_polar_figure)

In [56]:
make_scatter_polar(1)

For Generation 1, the non-legendary and legendary profiles are not similar in shape: the special attack and defense attributes stand out as different.

In [57]:
make_scatter_polar(2)

For Generation 2, the two profiles are similar. You can see this by looking at the shapes of the non-legendary and legendary scatter polar traces.

In [58]:
make_scatter_polar(3)

For Generation 3, the two profiles are not similar in shape: the HP and special attack attributes stand out as different.

In [59]:
make_scatter_polar(4)

For Generation 4, the two profiles are similar. You can see this by looking at the shapes of the non-legendary and legendary scatter polar traces.

In [60]:
make_scatter_polar(5)

For Generation 5, the two profiles are not similar in shape: the attack and defense attributes stand out as different.

In [61]:
make_scatter_polar(6)

For Generation 6, the two profiles are not similar in shape: the speed and special attack attributes stand out as different.

Now let's combine all of these into one figure so we can compare them with each other.
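Judging "similar shape" by eye is subjective. One possible way to make it concrete (an interpretation, not something the notebook computes) is to divide the legendary mean by the non-legendary mean for each stat: if the ratio is roughly the same for every stat, the two radar shapes are scaled copies of each other. The sketch below uses made-up numbers, and `sample`, `stat_columns` and `legendary_ratio` are hypothetical names.

```python
import pandas as pd

# Hypothetical sample with two generations and two stats (values are made up).
sample = pd.DataFrame({
    "Generation": [1, 1, 1, 1, 2, 2, 2, 2],
    "Legendary":  [False, False, True, True, False, False, True, True],
    "Attack":     [40, 60, 90, 110, 50, 50, 100, 100],
    "Defense":    [40, 60, 60, 90, 40, 60, 90, 110],
})
stat_columns = ["Attack", "Defense"]  # assumed stand-in for `stats_type`

def legendary_ratio(frame, generation):
    """Ratio of legendary to non-legendary mean for each stat in one generation."""
    subset = frame[frame["Generation"] == generation]
    legendary_mean = subset.loc[subset["Legendary"], stat_columns].mean()
    non_legendary_mean = subset.loc[~subset["Legendary"], stat_columns].mean()
    return legendary_mean / non_legendary_mean

for generation in (1, 2):
    ratio = legendary_ratio(sample, generation)
    # A small spread means the ratio is nearly constant, i.e. similar shapes.
    spread = ratio.max() - ratio.min()
    print(f"Generation {generation}: spread of ratios = {spread:.2f}")
```

In this toy data the second generation has a constant ratio (similar shapes) while the first does not. How small a spread counts as "similar" is a judgment call.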

In [62]:
generations = range(1, 7)

# One "polar" cell per radar chart, laid out on a 3x2 grid.
specs = [[{"type":"polar"},{"type":"polar"}] for _ in range(3)]
fig_scatter = subplots.make_subplots(rows=3,cols=2,specs=specs,
                                     subplot_titles=[f"Generation {g}" for g in generations])

for index, g in enumerate(generations):
    # Split the current generation into non-legendary and legendary pokemon.
    non_legendary_data = data[(data["Generation"]==g)&(data["Legendary"]==False)]
    legendary_data = data[(data["Generation"]==g)&(data["Legendary"]==True)]
    # Average every stat for both groups.
    non_legendary_stats = [non_legendary_data[s].mean() for s in stats_type]
    legendary_stats = [legendary_data[s].mean() for s in stats_type]

    scatter_polar = [go.Scatterpolar(r=legendary_stats,theta=stats_type,fill="toself",name=f"Legendary Generation {g}"),
                     go.Scatterpolar(r=non_legendary_stats,theta=stats_type,fill="toself",name=f"Non-Legendary Generation {g}")]
    # Fill the grid row by row: Generation 1 -> (1,1), Generation 2 -> (1,2), ...
    fig_scatter.add_traces(scatter_polar, index // 2 + 1, index % 2 + 1)

fig_scatter["layout"].update(title={"text":"<b>Comparison Non-Legendary And Legendary Pokemon's Stats Every Generation</b>",
                                    "y":0.96,"x":0.5,"xanchor":"center","yanchor":"top"},height=1000,
                                    margin=go.layout.Margin(),showlegend=False)
iplot(fig_scatter)

### 8. Identifying Correlation Between Pokemon's Type and Pokemon's Generation

In this section we will look at the relationship between pokemon type and generation, first for Type 1 and then for Type 2. Each heatmap cell counts the pokemon of a given type in a given generation.

A. Correlation between Type 1 and Generation

In [63]:
array1 = np.array([])
array2 = []
array3 = np.array([])
for t in list_generation: 
    for u in list_type1:
        type1_criteria = data["Type 1"] == u
        generation_criteria = data["Generation"] == t
        array2_value = len(data[type1_criteria & generation_criteria])
        array2.append(array2_value)
    array1 = np.append(array1,[array2])
    array2.clear()
array3 = np.reshape(array1,(len(list_generation),len(list_type1)))

heatmap_trace = go.Heatmap(z=array3,x=list_type1,y=list_generation,colorscale="Hot",reversescale=True)

heatmap_layout = go.Layout(title={"text":"<b>Correlation Between Type 1 And Generations</b>","y":0.89,"x":0.5,
                                "xanchor":"center","yanchor":"top"},xaxis=dict(title="<b>Type 1</b>"),
                                yaxis=dict(title="<b>Generation</b>"))

data_heatmap=[heatmap_trace]
heatmap_figure = go.Figure(data=data_heatmap, layout=heatmap_layout)
iplot(heatmap_figure)

As you can see from its colour, type 'Water' has the highest correlation with Generation 1.

B. Correlation between Type 2 and Generation
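The count matrix behind the heatmap can also be built in one call with `pd.crosstab`. This is a minimal sketch on a made-up `sample` frame that only assumes the `Type 1` and `Generation` columns.

```python
import pandas as pd

# Hypothetical miniature stand-in for `data` (values are made up).
sample = pd.DataFrame({
    "Type 1":     ["Water", "Water", "Fire", "Grass", "Water", "Fire"],
    "Generation": [1, 1, 1, 2, 2, 2],
})

# Rows are generations, columns are types, cells are pokemon counts.
type1_counts = pd.crosstab(sample["Generation"], sample["Type 1"])
print(type1_counts)

# The most frequent type within a generation is the column with the largest count.
most_common_gen1 = type1_counts.loc[1].idxmax()
print(most_common_gen1)
```

Reading the numbers from such a table is a convenient way to confirm what the colours suggest. The table's values, columns and index could also serve as `z`, `x` and `y` for `go.Heatmap`.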

In [64]:
array1 = np.array([])
array2 = []
array3 = np.array([])
for v in list_generation: 
    for w in list_type2:
        type2_criteria = data["Type 2"] == w
        generation_criteria = data["Generation"] == v
        array2_value = len(data[type2_criteria & generation_criteria])
        array2.append(array2_value)
    array1 = np.append(array1,[array2])
    array2.clear()
array3 = np.reshape(array1,(len(list_generation),len(list_type2)))

heatmap_trace = go.Heatmap(z=array3,x=list_type2,y=list_generation,colorscale="Hot",reversescale=True)

heatmap_layout = go.Layout(title={"text":"<b>Correlation Between Type 2 And Generations</b>","y":0.89,"x":0.5,
                                "xanchor":"center","yanchor":"top"},xaxis=dict(title="<b>Type 2</b>"),
                                yaxis=dict(title="<b>Generation</b>"))

data_heatmap=[heatmap_trace]
heatmap_figure = go.Figure(data=data_heatmap, layout=heatmap_layout)
iplot(heatmap_figure)

As you can see from its colour, type 'None' has the highest correlation with Generation 1 and Generation 5.
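Because generations differ in size, raw counts favour the larger generations. A minimal sketch of one alternative view, assuming 'None' marks pokemon without a second type, is to convert each generation's counts into shares; `sample` is made up for illustration.

```python
import pandas as pd

# Hypothetical sample; a missing Type 2 is stored as None here (values are made up).
sample = pd.DataFrame({
    "Type 2":     ["Poison", None, None, "Flying", None, "Flying", "Fairy", None],
    "Generation": [1, 1, 1, 1, 2, 2, 2, 2],
})

# Label missing secondary types explicitly so they get their own column.
type2 = sample["Type 2"].fillna("None")

# normalize="index" divides every row by its total, so each generation sums to 1.
type2_share = pd.crosstab(sample["Generation"], type2, normalize="index")
print(type2_share)
```

With shares instead of counts, a large 'None' value says that many pokemon of that generation have a single type, independent of how many pokemon the generation contains.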

# Conclusion

So if you want to play pokemon, we suggest choosing a legendary pokemon from Generation 4, which has the highest attributes of all.