# Pandas and Pokemon

### Learning objectives:
- Review importing data in pandas
- Review renaming columns
- Basic EDA and data cleaning in pandas
- Basic sorting and filtering in pandas
- Groupbys in pandas

## Read in the data using pandas:

In [26]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

%matplotlib inline

In [27]:
df = pd.read_csv('./pokedex_basic.csv')

In [50]:
df.head(100)

Unnamed: 0,PokedexNumber,Name,Type,Total,HP,Attack,Defense,Special Attack,Special Defense,Speed
0,1,Bulbasaur,GrassPoison,318,45,49,49,65,65,45
1,2,Ivysaur,GrassPoison,405,60,62,63,80,80,60
2,3,Venusaur,GrassPoison,525,80,82,83,100,100,80
3,3,VenusaurMega Venusaur,GrassPoison,625,80,100,123,122,120,80
4,4,Charmander,Fire,309,39,52,43,60,50,65
5,5,Charmeleon,Fire,405,58,64,58,80,65,80
6,6,Charizard,FireFlying,534,78,84,78,109,85,100
7,6,CharizardMega Charizard X,FireDragon,634,78,130,111,130,85,100
8,6,CharizardMega Charizard Y,FireFlying,634,78,104,78,159,115,100
9,7,Squirtle,Water,314,44,48,65,50,64,43
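
The `Unnamed: 0` column in the output above is what pandas produces when the first header cell of a CSV is blank, which usually means a row index was saved along with the data. The sketch below assumes that is the case for `pokedex_basic.csv` (it uses a three-row stand-in built from the rows above, not the real file) and shows how `index_col=0` reads that column as the index instead:

```python
import io

import pandas as pd

# A three-row stand-in for pokedex_basic.csv, copied from the output above.
# The blank first header cell is an assumption about how the file was saved.
csv_text = """,PokedexNumber,Name,Type,Total
0,1,Bulbasaur,GrassPoison,318
1,2,Ivysaur,GrassPoison,405
2,3,Venusaur,GrassPoison,525
"""

# Default read: the unnamed first column is kept as data, labelled "Unnamed: 0".
raw = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that first column as the row index instead.
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(raw.columns.tolist())
print(indexed.columns.tolist())
```

The rest of this notebook keeps the default read, so `Unnamed: 0` still appears in the later outputs.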


## Rename the `SpecialAttack` and `SpecialDefense` columns to `Special Attack` and `Special Defense`:

In [29]:
df = df.rename(columns={"SpecialAttack": "Special Attack", "SpecialDefense": "Special Defense"})
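
One thing to watch with `rename`: by default it silently ignores keys that are not existing column names, so a typo leaves the frame unchanged without any warning. A minimal sketch on a toy frame (the frame and the `new_names` variable are illustrative, not part of the notebook) shows how `errors="raise"` turns that into a `KeyError`:

```python
import pandas as pd

# Toy frame using the column names mentioned in the heading (assumed, not read from the CSV).
stats = pd.DataFrame({"Name": ["Bulbasaur"], "SpecialAttack": [65], "SpecialDefense": [65]})

new_names = {"SpecialAttack": "Special Attack", "SpecialDefense": "Special Defense"}

# errors="raise" makes rename fail loudly if a key is not an existing column;
# the default (errors="ignore") would return the frame unchanged instead.
renamed = stats.rename(columns=new_names, errors="raise")
print(renamed.columns.tolist())

# Renaming a second time finds no matching columns, so it raises KeyError.
try:
    renamed.rename(columns=new_names, errors="raise")
except KeyError as exc:
    print("rename failed:", exc)
```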

## Use inbuilt methods to figure out how many 'repeated' Pokemon there are:

In [49]:
df[df.duplicated(keep = False)]

Unnamed: 0,PokedexNumber,Name,Type,Total,HP,Attack,Defense,Special Attack,Special Defense,Speed


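Looks like there are no fully identical rows (`duplicated` compares every column, not just `Name`). However, the Mega evolutions share a `PokedexNumber` with their base forms, so those are the "duplicates" we care about.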
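
To count the repeats that matter here, one option is to pass `subset="PokedexNumber"` so that only that column is compared. This is a sketch on a five-row sample copied from the `df.head` output earlier in the notebook, not on the full dataframe:

```python
import pandas as pd

# First rows of the Pokedex, copied from the df.head output earlier in the notebook.
sample = pd.DataFrame({
    "PokedexNumber": [1, 2, 3, 3, 4],
    "Name": ["Bulbasaur", "Ivysaur", "Venusaur", "VenusaurMega Venusaur", "Charmander"],
})

# Whole-row comparison: no two rows match in every column.
print(sample.duplicated(keep=False).sum())

# Comparing PokedexNumber only: keep=False flags every row in a repeated group.
shared_number = sample.duplicated(subset="PokedexNumber", keep=False)
print(sample[shared_number])

# keep="first" (the default) flags only the later repeats, i.e. the extra forms.
print(sample.duplicated(subset="PokedexNumber").sum())
```

On the real dataframe, `df.duplicated(subset="PokedexNumber").sum()` would give the number of extra forms in the same way.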

## Some of our 'repeated' Pokemon are Mega Pokemon or Size-variant Pokemon. Use boolean masks to filter them out of the dataframe:

In [53]:
# True for every row whose Name does not mark a Mega form.
# The trailing space in 'Mega ' avoids matching ordinary names such as "Meganium".
mega_mask = ~df['Name'].str.contains('Mega ', regex=False)

In [104]:
df.drop_duplicates(subset = "PokedexNumber").head(10)

Unnamed: 0,PokedexNumber,Name,Type,Total,HP,Attack,Defense,Special Attack,Special Defense,Speed
0,1,Bulbasaur,GrassPoison,318,45,49,49,65,65,45
1,2,Ivysaur,GrassPoison,405,60,62,63,80,80,60
2,3,Venusaur,GrassPoison,525,80,82,83,100,100,80
4,4,Charmander,Fire,309,39,52,43,60,50,65
5,5,Charmeleon,Fire,405,58,64,58,80,65,80
6,6,Charizard,FireFlying,534,78,84,78,109,85,100
9,7,Squirtle,Water,314,44,48,65,50,64,43
10,8,Wartortle,Water,405,59,63,80,65,80,58
11,9,Blastoise,Water,530,79,83,100,85,105,78
13,10,Caterpie,Bug,195,45,30,35,20,20,45


In [105]:
# Substrings that mark a Mega or size-variant form. The trailing space in
# "Mega " keeps ordinary names such as "Meganium" from matching.
extra_markers = ["Mega ", "Small Size", "Large Size", "Super Size", "Unbound"]

# Collect every name that contains at least one marker.
remove_extras = [
    name for name in df["Name"]
    if any(marker in name for marker in extra_markers)
]

# Boolean mask: keep only the rows whose Name is not in remove_extras.
df[~df['Name'].isin(remove_extras)].head(10)

Unnamed: 0,PokedexNumber,Name,Type,Total,HP,Attack,Defense,Special Attack,Special Defense,Speed
0,1,Bulbasaur,GrassPoison,318,45,49,49,65,65,45
1,2,Ivysaur,GrassPoison,405,60,62,63,80,80,60
2,3,Venusaur,GrassPoison,525,80,82,83,100,100,80
4,4,Charmander,Fire,309,39,52,43,60,50,65
5,5,Charmeleon,Fire,405,58,64,58,80,65,80
6,6,Charizard,FireFlying,534,78,84,78,109,85,100
9,7,Squirtle,Water,314,44,48,65,50,64,43
10,8,Wartortle,Water,405,59,63,80,65,80,58
11,9,Blastoise,Water,530,79,83,100,85,105,78
13,10,Caterpie,Bug,195,45,30,35,20,20,45
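
The same filter can be written as a single vectorized mask with `Series.str.contains`. The sketch below uses a handful of names rather than the full dataframe. `Meganium` does not appear in the outputs above and is included only to show why the marker is written as `"Mega "` with a trailing space, which is a choice made for this sketch:

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": [
        "Venusaur",
        "VenusaurMega Venusaur",
        "Charizard",
        "CharizardMega Charizard X",
        "Squirtle",
        "Meganium",  # an ordinary Pokemon whose name happens to start with "Mega"
    ],
})

# Substrings that mark a variant form.
variant_markers = ["Mega ", "Small Size", "Large Size", "Super Size", "Unbound"]

# "|" joins the markers into one regex alternation; the match is case-sensitive.
is_variant = sample["Name"].str.contains("|".join(variant_markers))

# ~ inverts the mask, keeping only the base forms.
base_forms = sample[~is_variant]
print(base_forms["Name"].tolist())
```

Because the mask is a boolean Series aligned with the dataframe, it can be combined with other conditions using `&` and `|`.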


## Pancham

How does [Pancham the panda Pokemon](https://bulbapedia.bulbagarden.net/wiki/Pancham_(Pok%C3%A9mon)) stack up as a Fighting-type Pokemon? Display Pancham's stats, and compare them to the average stats for all Fighting-type Pokemon.

In [90]:
df[df['Name'] == 'Pancham']

Unnamed: 0,PokedexNumber,Name,Type,Total,HP,Attack,Defense,Special Attack,Special Defense,Speed
742,674,Pancham,Fighting,348,67,82,62,46,48,43


In [93]:
df[df["Type"].isin(["Fighting"])]

Unnamed: 0,PokedexNumber,Name,Type,Total,HP,Attack,Defense,Special Attack,Special Defense,Speed
61,56,Mankey,Fighting,305,40,80,35,35,45,70
62,57,Primeape,Fighting,455,65,105,60,60,70,95
72,66,Machop,Fighting,305,70,80,50,35,35,35
73,67,Machoke,Fighting,405,80,100,70,50,60,45
74,68,Machamp,Fighting,505,90,130,80,65,85,55
114,106,Hitmonlee,Fighting,455,50,120,53,35,110,87
115,107,Hitmonchan,Fighting,455,50,105,79,35,110,76
255,236,Tyrogue,Fighting,210,35,35,35,35,35,35
256,237,Hitmontop,Fighting,455,50,95,95,35,110,70
320,296,Makuhita,Fighting,237,72,60,30,20,30,25
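
The cells above display the rows but do not compute the average yet. One way to do the comparison is sketched below on a small sample taken from the outputs above, with only three stat columns, so the numbers it prints are not the averages for the full dataset. The names `stat_cols` and `comparison` are illustrative:

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Bulbasaur", "Mankey", "Primeape", "Machop", "Pancham"],
    "Type": ["GrassPoison", "Fighting", "Fighting", "Fighting", "Fighting"],
    "Total": [318, 305, 455, 305, 348],
    "HP": [45, 40, 65, 70, 67],
    "Attack": [49, 80, 105, 80, 82],
})
stat_cols = ["Total", "HP", "Attack"]

# Mean of each stat over the pure Fighting rows (Pancham included).
fighting_mean = sample.loc[sample["Type"] == "Fighting", stat_cols].mean()

# Pancham's row as a Series, so it aligns with fighting_mean by stat name.
pancham = sample.loc[sample["Name"] == "Pancham", stat_cols].iloc[0]

comparison = pd.DataFrame({
    "Pancham": pancham,
    "Fighting mean": fighting_mean,
    "Difference": pancham - fighting_mean,
})
print(comparison)
```

Note that `Type == "Fighting"` selects only single-type Fighting Pokemon. Dual types are stored as one concatenated string in this dataset (for example `GrassPoison`), so `sample["Type"].str.contains("Fighting")` would be needed to include them.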


## Best Stats

Which Pokemon has the highest value for each of the Total, HP, Attack, Defense, Special Attack, Special Defense, and Speed stats?
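
A possible approach, sketched on four rows and three stats from the outputs above, is to index the frame by `Name` and call `idxmax`, which returns the row label of each column's maximum. The names `by_name` and `best` are illustrative:

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Bulbasaur", "Venusaur", "Charizard", "Squirtle"],
    "Total": [318, 525, 534, 314],
    "HP": [45, 80, 78, 44],
    "Speed": [45, 80, 100, 43],
})
stat_cols = ["Total", "HP", "Speed"]

# Index by Name so idxmax returns the Pokemon's name rather than a row number.
by_name = sample.set_index("Name")[stat_cols]

best = pd.DataFrame({
    "Pokemon": by_name.idxmax(),  # first row label holding each column's maximum
    "Value": by_name.max(),
})
print(best)
```

When several Pokemon tie for the maximum, `idxmax` reports only the first one, so a tie would need a separate check such as `by_name[by_name["HP"] == by_name["HP"].max()]`.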

## What Pokemon type is most common?
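
`value_counts` is one straightforward way to answer this. The sketch below runs on a short list of type labels taken from the first rows of the dataset, so its result says nothing about the full Pokedex:

```python
import pandas as pd

types = pd.Series(
    ["GrassPoison", "GrassPoison", "GrassPoison", "Fire", "Fire", "FireFlying", "Water"],
    name="Type",
)

# value_counts sorts by frequency, most common first.
type_counts = types.value_counts()
print(type_counts)

# The single most common label and its count.
print(type_counts.idxmax(), type_counts.max())
```

Since dual types are stored as one string, `FireFlying` is counted separately from `Fire` here.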

## Of the 12 most common Pokemon types, which type has the strongest Pokemon on average?
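
A sketch of one approach: take the most frequent type labels from `value_counts`, filter to those rows, then group by `Type` and average. It assumes that "strongest" means the highest mean `Total`, which is only one reasonable reading, and it uses a hypothetical `n_types` variable set to 2 so that the seven-row sample stays readable:

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Venusaur", "Charmander",
             "Charmeleon", "Charizard", "Squirtle"],
    "Type": ["GrassPoison", "GrassPoison", "GrassPoison", "Fire",
             "Fire", "FireFlying", "Water"],
    "Total": [318, 405, 525, 309, 405, 534, 314],
})

n_types = 2  # the exercise uses 12; 2 keeps this toy example readable

# Labels of the n_types most frequent types.
top_types = sample["Type"].value_counts().head(n_types).index

# Keep only rows of those types, then average Total within each type.
mean_total = (
    sample[sample["Type"].isin(top_types)]
    .groupby("Type")["Total"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_total)
```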

## Bonus: Plot a heatmap of Pokemon type vs. average stats:
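
A minimal sketch of one way to do this: build a type-by-stat table of means with `groupby`, then draw it as an image. It uses plain matplotlib (`imshow`) on a five-row sample with three stats. The variable names are illustrative:

```python
import matplotlib.pyplot as plt
import pandas as pd

sample = pd.DataFrame({
    "Type": ["GrassPoison", "GrassPoison", "Fire", "Fire", "Water"],
    "HP": [45, 60, 39, 58, 44],
    "Attack": [49, 62, 52, 64, 48],
    "Speed": [45, 60, 65, 80, 43],
})

# Rows are types, columns are stats, cells are the per-type mean.
type_means = sample.groupby("Type")[["HP", "Attack", "Speed"]].mean()

fig, ax = plt.subplots()
image = ax.imshow(type_means.to_numpy(), aspect="auto")

# Label the axes with the stat and type names.
ax.set_xticks(range(len(type_means.columns)))
ax.set_xticklabels(type_means.columns)
ax.set_yticks(range(len(type_means.index)))
ax.set_yticklabels(type_means.index)
fig.colorbar(image, ax=ax, label="Mean stat value")
```

Since this notebook already imports seaborn, `sns.heatmap(type_means, annot=True)` is a shorter alternative that labels the axes from the table automatically.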

## Bonus: Write a function that plots a histogram of a stat for a given Pokemon type:

In [None]:
# note: why would it be inadvisable to use `type` as a variable name here?
def plot_stat_of_type(stat, poke_type, df):
    pass
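
One possible way to fill in the stub is sketched below. On the question in the comment: `type` is a Python built-in, so using it as a variable name would shadow the built-in inside the function. The bin count, the axis labels, and returning the `Axes` are choices made for this sketch, and the sample frame is a few rows taken from the outputs above:

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_stat_of_type(stat, poke_type, df):
    """Plot a histogram of `stat` for rows whose Type equals `poke_type`."""
    values = df.loc[df["Type"] == poke_type, stat]
    fig, ax = plt.subplots()
    ax.hist(values, bins=10)  # bins=10 is an arbitrary choice for this sketch
    ax.set_xlabel(stat)
    ax.set_ylabel("Number of Pokemon")
    ax.set_title(f"{stat} of {poke_type} Pokemon")
    return ax


sample = pd.DataFrame({
    "Name": ["Mankey", "Primeape", "Machop", "Machoke", "Machamp", "Squirtle"],
    "Type": ["Fighting"] * 5 + ["Water"],
    "Attack": [80, 105, 80, 100, 130, 48],
})
ax = plot_stat_of_type("Attack", "Fighting", sample)
```

Returning the `Axes` lets the caller adjust the title or limits afterwards.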