# Challenge 1

In this challenge you will be working on **Pokemon**. You will answer a series of questions in order to practice dataframe calculation, aggregation, and transformation.

![Pokemon](../images/pokemon.jpg)

Follow the instructions below and enter your code.

#### Import all required libraries.

In [76]:
import numpy as np
import pandas as pd

#### Import data set.

Read the dataset `pokemon.csv` into a dataframe called `pokemon`.

*Data set attributed to [Alberto Barradas](https://www.kaggle.com/abcsds/pokemon/)*

In [77]:
pokemon = pd.read_csv('pokemon.csv')

#### Print first 10 rows of `pokemon`.

In [78]:
pokemon.head(10)

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
5,5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False
6,6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False
7,6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
8,6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
9,7,Squirtle,Water,,314,44,48,65,50,64,43,1,False


When you look at a data set, you often wonder what each column means. Some open-source data sets provide descriptions of the data set. In many cases, data descriptions are extremely useful for data analysts to perform work efficiently and successfully.

Fortunately, the owner of the `pokemon.csv` data set provides column descriptions, which you can see [here](https://www.kaggle.com/abcsds/pokemon/home). For your convenience, the descriptions are included below. Read them and make sure you understand what each column means; this knowledge will help you work with the data.

| Column | Description |
| --- | --- |
| # | ID for each pokemon |
| Name | Name of each pokemon |
| Type 1 | Each pokemon has a type, which determines its weakness/resistance to attacks |
| Type 2 | Some pokemon are dual type and have a second type |
| Total | A general guide to how strong a pokemon is |
| HP | Hit points, or health; defines how much damage a pokemon can withstand before fainting |
| Attack | The base modifier for normal attacks (e.g. Scratch, Punch) |
| Defense | The base damage resistance against normal attacks |
| Sp. Atk | Special attack, the base modifier for special attacks (e.g. Fire Blast, Bubble Beam) |
| Sp. Def | The base damage resistance against special attacks |
| Speed | Determines which pokemon attacks first each round |
| Generation | Generation number |
| Legendary | `True` if the pokemon is Legendary, `False` if not |

#### Obtain the distinct values across `Type 1` and `Type 2`.

Extract all the values in `Type 1` and `Type 2`. Then create an array containing the distinct values across both fields.

In [79]:
# concatenate both columns so the distinct values cover Type 1 and Type 2 together
types = pd.concat([pokemon["Type 1"], pokemon["Type 2"]]).unique()

print(types)

['Grass' 'Fire' 'Water' 'Bug' 'Normal' 'Poison' 'Electric' 'Ground'
 'Fairy' 'Fighting' 'Psychic' 'Rock' 'Ghost' 'Ice' 'Dragon' 'Dark' 'Steel'
 'Flying' nan]
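
The array above still contains `nan`, because `Type 2` is empty for single-type pokemon. A minimal sketch of dropping the missing values before taking the distinct ones, using a small hand-made frame instead of the real file (the `sample` name and its rows are illustrative assumptions):

```python
import numpy as np
import pandas as pd

# Hand-made stand-in for the pokemon dataframe (illustrative rows only)
sample = pd.DataFrame({
    "Type 1": ["Grass", "Fire", "Fire", "Water"],
    "Type 2": ["Poison", np.nan, "Flying", np.nan],
})

# Stack both columns, drop missing values, then keep the distinct values
# in first-seen order
distinct_types = pd.concat([sample["Type 1"], sample["Type 2"]]).dropna().unique()

print(distinct_types)
```

Whether `nan` should count as a distinct value depends on how the question is read, so treat this as one possible variant rather than the expected answer.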


#### Clean up `Name` values that contain "Mega".

If you have looked at the pokemon names carefully, you will have noticed that the names containing "Mega" carry junk text. We want to clean up these names. For instance, "VenusaurMega Venusaur" should be "Mega Venusaur", and "CharizardMega Charizard X" should be "Mega Charizard X".

In [100]:

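# keep only the rows whose Name contains "Mega"; this overwrites `pokemon` with that subset,
# and .copy() avoids a SettingWithCopyWarning on the assignments that follow
pokemon = pokemon[pokemon['Name'].str.contains('Mega')].copy()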
print(pokemon)


       #                       Name    Type 1    Type 2  Total   HP  Attack   
3      3      VenusaurMega Venusaur     Grass    Poison    625   80     100  \
7      6  CharizardMega Charizard X      Fire    Dragon    634   78     130   
8      6  CharizardMega Charizard Y      Fire    Flying    634   78     104   
12     9    BlastoiseMega Blastoise     Water       NaN    630   79     103   
19    15      BeedrillMega Beedrill       Bug    Poison    495   65     150   
23    18        PidgeotMega Pidgeot    Normal    Flying    579   83      80   
71    65      AlakazamMega Alakazam   Psychic       NaN    590   55      50   
87    80        SlowbroMega Slowbro     Water   Psychic    590   95      75   
102   94          GengarMega Gengar     Ghost    Poison    600   60      65   
124  115  KangaskhanMega Kangaskhan    Normal       NaN    590  105     125   
137  127          PinsirMega Pinsir       Bug    Flying    600   65     155   
141  130      GyaradosMega Gyarados     Water      D

In [102]:
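# regex=True is required here: since pandas 2.0, str.replace treats the pattern
# as a literal string by default, so without it the names are left unchanged
pokemon['Name'] = pokemon['Name'].str.replace('.*Mega', 'Mega', regex=True)

pokemon.head()

# alternative with the standard-library re module: strip everything before "Mega"
import re

pokemon['Name'] = pokemon['Name'].apply(lambda x: re.sub(r'^.*(?=Mega)', '', x))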

pokemon.head()


Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
3,3,Mega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
7,6,Mega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
8,6,Mega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
12,9,Mega Blastoise,Water,,630,79,103,120,135,115,78,1,False
19,15,Mega Beedrill,Bug,Poison,495,65,150,40,15,80,145,1,False
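
The cleanup can be checked on a few names without loading the file. The sketch below assumes a hand-made Series called `names` (illustrative values only) and shows that the lookahead pattern removes the prefix before "Mega" while leaving other names alone, including one that merely starts with "Mega":

```python
import pandas as pd

# Hand-made names (illustrative); "Meganium" merely starts with "Mega"
names = pd.Series([
    "VenusaurMega Venusaur",
    "CharizardMega Charizard X",
    "Meganium",
    "Pikachu",
])

# Remove everything that precedes "Mega"; the lookahead (?=Mega) keeps "Mega" itself.
# regex=True is passed explicitly because the default became False in pandas 2.0.
cleaned = names.str.replace(r"^.*(?=Mega)", "", regex=True)

print(cleaned.tolist())
```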


#### Create a new column called `A/D Ratio` whose value equals `Attack` divided by `Defense`.

For instance, if a pokemon has the Attack score 49 and Defense score 49, the corresponding `A/D Ratio` is 49/49=1.

In [105]:
pokemon['A/D Ratio'] = pokemon['Attack'] / pokemon['Defense']
pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary,A/D Ratio
3,3,Mega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False,0.813008
7,6,Mega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False,1.171171
8,6,Mega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False,1.333333
12,9,Mega Blastoise,Water,,630,79,103,120,135,115,78,1,False,0.858333
19,15,Mega Beedrill,Bug,Poison,495,65,150,40,15,80,145,1,False,3.75


#### Identify the pokemon with the highest `A/D Ratio`.

In [106]:
highest_ratio_pokemon = pokemon.loc[pokemon['A/D Ratio'].idxmax(), 'Name']
print(highest_ratio_pokemon)

Mega Beedrill


#### Identify the pokemon with the lowest `A/D Ratio`.

In [107]:
lowest_ratio_pokemon = pokemon.loc[pokemon['A/D Ratio'].idxmin(), 'Name']
print(lowest_ratio_pokemon)

Mega Slowbro
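
Both lookups rely on `idxmax` and `idxmin` returning index labels rather than row positions, which is why they are paired with `.loc`. A minimal sketch on a hand-made frame (the names, scores, and index labels are illustrative assumptions):

```python
import pandas as pd

# Hand-made rows with a non-contiguous index, as left behind by filtering
sample = pd.DataFrame(
    {"Name": ["A", "B", "C"], "Attack": [150, 75, 50], "Defense": [40, 180, 50]},
    index=[19, 87, 3],
)
sample["A/D Ratio"] = sample["Attack"] / sample["Defense"]

# idxmax/idxmin return index labels (19 and 87 here), so .loc is the matching accessor
highest = sample.loc[sample["A/D Ratio"].idxmax(), "Name"]
lowest = sample.loc[sample["A/D Ratio"].idxmin(), "Name"]

print(highest, lowest)
```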


#### Create a new column called `Combo Type` whose value combines `Type 1` and `Type 2`.

Rules:

* If both `Type 1` and `Type 2` have valid values, the `Combo Type` value should contain both values in the form of `<Type 1>-<Type 2>`. For example, if the `Type 1` value is `Grass` and the `Type 2` value is `Poison`, `Combo Type` will be `Grass-Poison`.

* If `Type 1` has a valid value but `Type 2` does not, `Combo Type` will be the same as `Type 1`. For example, if `Type 1` is `Fire` whereas `Type 2` is `NaN`, `Combo Type` will be `Fire`.

In [111]:
def combo_type(row):
    if pd.isna(row['Type 2']):
        return row['Type 1']
    else:
        return row['Type 1'] + '-' + row['Type 2']
    
pokemon['Combo Type'] = pokemon.apply(combo_type, axis=1)

pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary,A/D Ratio,Combo Type
3,3,Mega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False,0.813008,Grass-Poison
7,6,Mega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False,1.171171,Fire-Dragon
8,6,Mega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False,1.333333,Fire-Flying
12,9,Mega Blastoise,Water,,630,79,103,120,135,115,78,1,False,0.858333,Water
19,15,Mega Beedrill,Bug,Poison,495,65,150,40,15,80,145,1,False,3.75,Bug-Poison
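
The row-wise `apply` works, but the same two rules can also be written with vectorized string methods. A minimal sketch of that alternative, assuming a small hand-made frame (`sample` and its rows are illustrative):

```python
import numpy as np
import pandas as pd

# Hand-made stand-in for the pokemon dataframe (illustrative rows only)
sample = pd.DataFrame({
    "Type 1": ["Grass", "Fire", "Water"],
    "Type 2": ["Poison", "Dragon", np.nan],
})

# str.cat yields NaN wherever Type 2 is missing; fillna then falls back to Type 1
sample["Combo Type"] = (
    sample["Type 1"].str.cat(sample["Type 2"], sep="-").fillna(sample["Type 1"])
)

print(sample["Combo Type"].tolist())
```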


#### Identify the pokemon whose `A/D Ratio` is among the top 5.

In [85]:
# your code here
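
One possible way to answer this question is `DataFrame.nlargest`, which returns the rows with the largest values in a column, ordered from highest to lowest. A minimal sketch on a hand-made frame (the names and ratios are illustrative assumptions, not values from the data set):

```python
import pandas as pd

# Hand-made stand-in with six rows (illustrative values only)
sample = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "A/D Ratio": [3.75, 0.4, 2.5, 2.2, 1.0, 2.0],
})

# Keep the 5 rows with the highest A/D Ratio, highest first
top5 = sample.nlargest(5, "A/D Ratio")

print(top5)
```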

#### For the 5 pokemon printed above, aggregate `Combo Type` and use a list to store the unique values.

Your end product is a list containing the distinct `Combo Type` values of the 5 pokemon with the highest `A/D Ratio`.

In [113]:
# the 5 rows with the highest A/D Ratio (named top5 so it does not shadow the combo_type function)
top5 = pokemon.sort_values('A/D Ratio', ascending=False).head(5)

print(top5)

       #           Name   Type 1    Type 2  Total   HP  Attack  Defense   
19    15  Mega Beedrill      Bug    Poison    495   65     150       40  \
393  359     Mega Absol     Dark       NaN    565   65     150       60   
387  354   Mega Banette    Ghost       NaN    555   64     165       75   
164  150  Mega Mewtwo Y  Psychic       NaN    780  106     150       70   
279  257  Mega Blaziken     Fire  Fighting    630   80     160       80   

     Sp. Atk  Sp. Def  Speed  Generation  Legendary  A/D Ratio     Combo Type  
19        15       80    145           1      False   3.750000     Bug-Poison  
393      115       60    115           3      False   2.500000           Dark  
387       93       83     75           3      False   2.200000          Ghost  
164      194      120    140           1       True   2.142857        Psychic  
279      130       80    100           3      False   2.000000  Fire-Fighting  
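
The printed frame shows the five rows, but the question asks for a list of distinct `Combo Type` values. A minimal sketch of that last step, rebuilding only the `Name` and `Combo Type` columns from the output above (the `top5_sample` and `combo_types` names are assumptions for illustration):

```python
import pandas as pd

# Name and Combo Type copied from the five rows printed above
top5_sample = pd.DataFrame({
    "Name": ["Mega Beedrill", "Mega Absol", "Mega Banette", "Mega Mewtwo Y", "Mega Blaziken"],
    "Combo Type": ["Bug-Poison", "Dark", "Ghost", "Psychic", "Fire-Fighting"],
})

# unique() removes duplicates in first-seen order; tolist() converts the array to a list
combo_types = top5_sample["Combo Type"].unique().tolist()

print(combo_types)
```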


#### For each of the `Combo Type` values obtained from the previous question, calculate the mean scores of all numeric fields across all pokemon.

Your output should look like below:

![Aggregate](../images/aggregated-mean.png)

In [None]:
#IDK 
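
A possible approach, sketched under assumptions: keep only the rows whose `Combo Type` is in the list from the previous question, group by `Combo Type`, and take the mean of the numeric columns. The frame below is hand-made (`sample`, its rows, and the shortened `combo_types` list are illustrative), so the numbers are not results from the data set:

```python
import pandas as pd

# Hand-made stand-in for the pokemon dataframe (illustrative rows only)
sample = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Combo Type": ["Dark", "Dark", "Ghost", "Water"],
    "Attack": [150, 90, 165, 103],
    "Defense": [60, 70, 75, 120],
})

# Stand-in for the list of distinct Combo Type values from the previous question
combo_types = ["Dark", "Ghost"]

# Keep the matching rows, then average every numeric column per Combo Type;
# numeric_only=True skips text columns such as Name
selected = sample[sample["Combo Type"].isin(combo_types)]
means = selected.groupby("Combo Type").mean(numeric_only=True)

print(means)
```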