# Challenge 1

In this challenge you will be working on **Pokemon**. You will answer a series of questions in order to practice dataframe calculation, aggregation, and transformation.

![Pokemon](../images/pokemon.jpg)

Follow the instructions below and enter your code.

#### Import all required libraries.

In [1]:
# import libraries
import pandas as pd
import numpy as np
import math

#### Import data set.

Read the dataset `pokemon.csv` into a dataframe called `pokemon`.

*Data set attributed to [Alberto Barradas](https://www.kaggle.com/abcsds/pokemon/)*

In [2]:
# import dataset
pokedex = pd.read_csv("pokemon.csv")

In [3]:
# Turn every column name into snake_case
pokedex.columns = pokedex.columns.str.lower().str.replace(' ', '_')

#### Print first 10 rows of `pokemon`.

In [4]:
# your code here
display(pokedex.head(10))

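|  | # | name | type_1 | type_2 | total | hp | attack | defense | sp._atk | sp._def | speed | generation | legendary |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
| 5 | 5 | Charmeleon | Fire | NaN | 405 | 58 | 64 | 58 | 80 | 65 | 80 | 1 | False |
| 6 | 6 | Charizard | Fire | Flying | 534 | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False |
| 7 | 6 | CharizardMega Charizard X | Fire | Dragon | 634 | 78 | 130 | 111 | 130 | 85 | 100 | 1 | False |
| 8 | 6 | CharizardMega Charizard Y | Fire | Flying | 634 | 78 | 104 | 78 | 159 | 115 | 100 | 1 | False |
| 9 | 7 | Squirtle | Water | NaN | 314 | 44 | 48 | 65 | 50 | 64 | 43 | 1 | False |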


#### Obtain the distinct values across `Type 1` and `Type 2`.

Extract all the values in `Type 1` and `Type 2`. Then create an array containing the distinct values across both fields.

In [5]:
# distinct individual types across both columns (NaN excluded)
types1 = list(pokedex['type_1'].unique())
types2 = list(pokedex['type_2'].dropna().unique())
types = types1.copy()

# `type_name` avoids shadowing the built-in `type`
for type_name in types2:
    if type_name not in types:
        types.append(type_name)

# all combinations of types
combs = []
for i, pokemon in pokedex.iterrows():
    # a missing type_2 is read as NaN
    if pd.isna(pokemon['type_2']):
        combs.append(pokemon['type_1'])  # only type_1 is available
    else:
        # always place the "bigger" (alphabetically later) type first so that
        # there are no duplicates (for example Fire Psychic and Psychic Fire)
        if pokemon['type_1'] > pokemon['type_2']:
            combs.append(pokemon['type_1'] + ' ' + pokemon['type_2'])
        elif pokemon['type_1'] < pokemon['type_2']:
            combs.append(pokemon['type_2'] + ' ' + pokemon['type_1'])

unique_combs = pd.Series(combs).unique()
unique_combs

array(['Poison Grass', 'Fire', 'Flying Fire', 'Fire Dragon', 'Water',
       'Bug', 'Flying Bug', 'Poison Bug', 'Normal Flying', 'Normal',
       'Poison', 'Electric', 'Ground', 'Poison Ground', 'Fairy',
       'Normal Fairy', 'Poison Flying', 'Grass Bug', 'Fighting',
       'Water Fighting', 'Psychic', 'Water Poison', 'Rock Ground',
       'Water Psychic', 'Steel Electric', 'Water Ice', 'Poison Ghost',
       'Psychic Grass', 'Grass', 'Psychic Fairy', 'Psychic Ice',
       'Water Flying', 'Water Dark', 'Water Rock', 'Rock Flying',
       'Ice Flying', 'Flying Electric', 'Dragon', 'Flying Dragon',
       'Psychic Fighting', 'Water Electric', 'Flying Fairy',
       'Psychic Flying', 'Electric Dragon', 'Water Fairy', 'Rock',
       'Grass Flying', 'Water Ground', 'Dark', 'Flying Dark', 'Ghost',
       'Psychic Normal', 'Steel Bug', 'Ground Flying', 'Steel Ground',
       'Rock Bug', 'Fighting Bug', 'Ice Dark', 'Rock Fire', 'Ice Ground',
       'Steel Flying', 'Fire Dark', 'Water Dragon',
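
The cell above lists type combinations. For the distinct individual types across both columns, a minimal sketch could stack the two columns and deduplicate them. The three-row `sample` frame below is a hypothetical stand-in for `pokedex`, not the real dataset.

```python
import pandas as pd

# Hypothetical three-row sample standing in for the full dataset.
sample = pd.DataFrame({
    "type_1": ["Grass", "Fire", "Fire"],
    "type_2": ["Poison", None, "Flying"],
})

# Stack both columns into one Series, drop missing values, then deduplicate.
distinct_types = pd.concat([sample["type_1"], sample["type_2"]]).dropna().unique()
print(distinct_types)
```

`unique()` keeps values in order of first appearance, so all `type_1` values come before any value that only occurs in `type_2`.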

#### Clean up `Name` values that contain "Mega".

If you look through the pokemon names carefully, you will find that names containing "Mega" carry junk text in front of it. We want to clean up these names. For instance, "VenusaurMega Venusaur" should be "Mega Venusaur", and "CharizardMega Charizard X" should be "Mega Charizard X".

In [6]:
# your code here
# write through .loc so the assignment reaches the dataframe itself;
# chained indexing (pokedex['name'][i] = ...) raises SettingWithCopyWarning
for i, pokemon in pokedex.iterrows():
    if 'Mega ' in pokemon['name']:
        pokedex.loc[i, 'name'] = 'Mega ' + pokemon['name'].split(' ', 1)[1]

pokedex.head(10)


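|  | # | name | type_1 | type_2 | total | hp | attack | defense | sp._atk | sp._def | speed | generation | legendary |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | Mega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
| 5 | 5 | Charmeleon | Fire | NaN | 405 | 58 | 64 | 58 | 80 | 65 | 80 | 1 | False |
| 6 | 6 | Charizard | Fire | Flying | 534 | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False |
| 7 | 6 | Mega Charizard X | Fire | Dragon | 634 | 78 | 130 | 111 | 130 | 85 | 100 | 1 | False |
| 8 | 6 | Mega Charizard Y | Fire | Flying | 634 | 78 | 104 | 78 | 159 | 115 | 100 | 1 | False |
| 9 | 7 | Squirtle | Water | NaN | 314 | 44 | 48 | 65 | 50 | 64 | 43 | 1 | False |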
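
The same cleanup can also be written without an explicit loop. The sketch below is one possible vectorized variant, not the notebook's own method: it uses a regular expression to delete everything in front of the first "Mega ". The `names` Series is a hypothetical stand-in for `pokedex['name']`.

```python
import pandas as pd

# Hypothetical sample of names; the real column is pokedex['name'].
names = pd.Series([
    "Venusaur",
    "VenusaurMega Venusaur",
    "CharizardMega Charizard X",
])

# Lazily match from the start of the string up to (but not including)
# the first "Mega ", and replace that prefix with an empty string.
# Names without "Mega " do not match and stay unchanged.
cleaned = names.str.replace(r"^.*?(?=Mega )", "", regex=True)
print(cleaned.tolist())
```

Because the lookahead `(?=Mega )` requires the trailing space, a name that merely starts with the letters "Mega" and has no space after them is left alone.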


#### Create a new column called `A/D Ratio` whose value equals `Attack` divided by `Defense`.

For instance, if a pokemon has an Attack score of 49 and a Defense score of 49, the corresponding `A/D Ratio` is 49/49=1.

In [7]:
pokedex['a/d_ratio'] = pokedex['attack']/pokedex['defense']

#### Identify the pokemon with the highest `A/D Ratio`.

In [8]:
pokedex[pokedex['a/d_ratio']==pokedex['a/d_ratio'].max()]

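|  | # | name | type_1 | type_2 | total | hp | attack | defense | sp._atk | sp._def | speed | generation | legendary | a/d_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 429 | 386 | DeoxysAttack Forme | Psychic | NaN | 600 | 50 | 180 | 20 | 180 | 20 | 150 | 3 | True | 9.0 |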


#### Identify the pokemon with the lowest `A/D Ratio`.

In [9]:
pokedex[pokedex['a/d_ratio']==pokedex['a/d_ratio'].min()]

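|  | # | name | type_1 | type_2 | total | hp | attack | defense | sp._atk | sp._def | speed | generation | legendary | a/d_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 230 | 213 | Shuckle | Bug | Rock | 505 | 20 | 10 | 230 | 10 | 230 | 5 | 2 | False | 0.043478 |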
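
As an alternative to comparing the whole column against its maximum or minimum, `idxmax` and `idxmin` return the index label of the extreme row directly. The sketch below assumes a small hypothetical `sample` frame whose attack and defense values are copied from the rows shown in this notebook.

```python
import pandas as pd

# Hypothetical three-row sample; scores copied from the tables in this notebook.
sample = pd.DataFrame({
    "name": ["Bulbasaur", "Shuckle", "DeoxysAttack Forme"],
    "attack": [49, 10, 180],
    "defense": [49, 230, 20],
})
sample["a/d_ratio"] = sample["attack"] / sample["defense"]

# idxmax / idxmin give the index label of the first extreme value;
# .loc then fetches that single row as a Series.
strongest = sample.loc[sample["a/d_ratio"].idxmax()]
weakest = sample.loc[sample["a/d_ratio"].idxmin()]
print(strongest["name"], weakest["name"])
```

One trade-off: `idxmax` and `idxmin` report only the first match, whereas the boolean-mask comparison used in the cells above returns every row that ties for the extreme value.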


#### Create a new column called `Combo Type` whose value combines `Type 1` and `Type 2`.

Rules:

* If both `Type 1` and `Type 2` have valid values, the `Combo Type` value should contain both values in the form `<Type 1>-<Type 2>`. For example, if `Type 1` is `Grass` and `Type 2` is `Poison`, `Combo Type` will be `Grass-Poison`.

* If `Type 1` has a valid value but `Type 2` does not, `Combo Type` will be the same as `Type 1`. For example, if `Type 1` is `Fire` whereas `Type 2` is `NaN`, `Combo Type` will be `Fire`.

In [10]:
def combining_types(pokemon):
    # a missing type_2 is read as NaN
    if pd.isna(pokemon['type_2']):
        return pokemon['type_1']
    else:
        # I always place the "bigger" (alphabetically later) type first so that
        # I don't have duplicates (for example Fire-Psychic and Psychic-Fire);
        # this means some pairs appear as <Type 2>-<Type 1>
        if pokemon['type_1'] > pokemon['type_2']:
            return pokemon['type_1'] + '-' + pokemon['type_2']
        elif pokemon['type_1'] < pokemon['type_2']:
            return pokemon['type_2'] + '-' + pokemon['type_1']

pokedex['combo_type'] = pokedex.apply(combining_types, axis=1)
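
For comparison, the sketch below applies the two rules literally, always keeping `Type 1` in front of `Type 2`, and does so without a row-wise `apply`. The two-row `sample` frame and the names `has_second` and `combo` are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical two-row sample matching the examples in the rules.
sample = pd.DataFrame({
    "type_1": ["Grass", "Fire"],
    "type_2": ["Poison", None],
})

# Start from type_1, then overwrite only the rows that have a second type.
has_second = sample["type_2"].notna()
combo = sample["type_1"].copy()
combo[has_second] = (
    sample.loc[has_second, "type_1"] + "-" + sample.loc[has_second, "type_2"]
)
print(combo.tolist())
```

Keeping the original order means pairs such as `Bug-Poison` and `Poison-Bug` would count as different combo types, which is the duplication the sorted variant in the cell above avoids.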

#### Identify the pokemon whose `A/D Ratio` is among the top 5.

In [11]:
# your code here
pokedex.sort_values(by=['a/d_ratio'],ascending=False).head(5)

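|  | # | name | type_1 | type_2 | total | hp | attack | defense | sp._atk | sp._def | speed | generation | legendary | a/d_ratio | combo_type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 429 | 386 | DeoxysAttack Forme | Psychic | NaN | 600 | 50 | 180 | 20 | 180 | 20 | 150 | 3 | True | 9.0 | Psychic |
| 347 | 318 | Carvanha | Water | Dark | 305 | 45 | 90 | 20 | 65 | 20 | 65 | 3 | False | 4.5 | Water-Dark |
| 19 | 15 | Mega Beedrill | Bug | Poison | 495 | 65 | 150 | 40 | 15 | 80 | 145 | 1 | False | 3.75 | Poison-Bug |
| 453 | 408 | Cranidos | Rock | NaN | 350 | 67 | 125 | 40 | 30 | 30 | 58 | 4 | False | 3.125 | Rock |
| 348 | 319 | Sharpedo | Water | Dark | 460 | 70 | 120 | 40 | 95 | 40 | 95 | 3 | False | 3.0 | Water-Dark |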


#### For the 5 pokemon printed above, aggregate `Combo Type` and use a list to store the unique values.

Your end product is a list containing the distinct `Combo Type` values of the 5 pokemon with the highest `A/D Ratio`.

In [12]:
answer = pokedex.sort_values(by=['a/d_ratio'],ascending=False).head(5)['combo_type']
# the task asks for a list, so convert the array returned by unique()
answer = answer.unique().tolist()
answer

['Psychic', 'Water-Dark', 'Poison-Bug', 'Rock']
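
The sort-then-head pattern can also be expressed with `DataFrame.nlargest`. The sketch below is one possible shortcut, shown on a hypothetical `sample` frame. Its five highest ratios are copied from the top-5 table in this notebook, and Bulbasaur is added as a row that should be filtered out.

```python
import pandas as pd

# Hypothetical sample: ratios and combo types copied from this notebook.
sample = pd.DataFrame({
    "name": ["Bulbasaur", "DeoxysAttack Forme", "Carvanha",
             "Mega Beedrill", "Cranidos", "Sharpedo"],
    "a/d_ratio": [1.0, 9.0, 4.5, 3.75, 3.125, 3.0],
    "combo_type": ["Poison-Grass", "Psychic", "Water-Dark",
                   "Poison-Bug", "Rock", "Water-Dark"],
})

# nlargest returns the five rows with the highest ratio, already sorted
# in descending order; unique() then drops the repeated 'Water-Dark'.
top_combo_types = sample.nlargest(5, "a/d_ratio")["combo_type"].unique().tolist()
print(top_combo_types)
```

The result has four entries rather than five because two of the top five pokemon share the same combo type.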

#### For each of the `Combo Type` values obtained from the previous question, calculate the mean scores of all numeric fields across all pokemon.

Your output should look like the one below:

![Aggregate](../images/aggregated-mean.png)

In [13]:
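# keep the numeric columns plus combo_type, which is needed for grouping
pokedex_num = pd.concat([pokedex.select_dtypes(np.number),pokedex['combo_type']],axis=1)
# .mean() gives the same result as .agg(np.mean), which recent pandas versions warn about
pokedex_num[pokedex_num['combo_type'].isin(answer)].groupby(['combo_type']).mean()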

| combo_type | # | total | hp | attack | defense | sp._atk | sp._def | speed | generation | a/d_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Poison-Bug | 218.538462 | 346.538462 | 52.692308 | 66.923077 | 60.538462 | 41.538462 | 59.0 | 65.846154 | 2.461538 | 1.257494 |
| Psychic | 381.973684 | 464.552632 | 72.552632 | 64.947368 | 67.236842 | 98.552632 | 82.394737 | 78.868421 | 3.342105 | 1.164196 |
| Rock | 410.111111 | 409.444444 | 67.111111 | 103.333333 | 107.222222 | 40.555556 | 58.333333 | 32.888889 | 3.888889 | 1.260091 |
| Water-Dark | 347.666667 | 493.833333 | 69.166667 | 120.0 | 65.166667 | 88.833333 | 63.5 | 87.166667 | 3.166667 | 2.291949 |
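
The filter-then-group step can be seen in isolation on a toy frame. In the sketch below every value is hypothetical and chosen only so the means are easy to check by hand. It also passes `numeric_only=True` to `mean`, which is one way to skip text columns without building a separate numeric frame first.

```python
import pandas as pd

# Hypothetical toy data; the scores are NOT taken from pokemon.csv.
sample = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "combo_type": ["Rock", "Rock", "Psychic", "Fire"],
    "attack": [100, 50, 60, 52],
    "defense": [80, 40, 30, 43],
})
wanted = ["Rock", "Psychic"]

# Keep only the wanted combo types, then average each numeric column per group.
means = (
    sample[sample["combo_type"].isin(wanted)]
    .groupby("combo_type")
    .mean(numeric_only=True)
)
print(means)
```

The grouping column becomes the index of the result, sorted alphabetically, which matches the layout of the aggregated table above.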
