In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

**Overview:**

For my project, I am examining a dataset from the Pokemon video games. The dataset contains information from a website called "Serebii.net" and provides strategic information about creature stats, such as type effectiveness, gender ratio, and base speed.

The full "pokedex", or dataset, has 801 rows and 41 columns.

In [None]:
pokedex = pd.read_csv("../input/pokemon/pokemon.csv")
pokedex

**Overview cont.**

All Pokemon have a 'type', which influences how well they fare against other creatures during Pokemon battles. As a player, I have gravitated to certain Pokemon types, like Dragon types and Psychic types; conversely, I have always avoided certain types, like Bug types and Normal types. I was curious if my type preferences in game are matched by the average base stats (attack, special attack, defense, special defense, speed) that different pokemon types have, and have structured the code below to explore this question.

Additionally, in game, there are rare, one-of-a-kind pokemon called "Legendary Pokemon". Legendary pokemon typically have stats that are much higher than basic pokemon, and each generation introduces different legendary pokemon to the franchise. I am curious as to which generation had the most powerful legendary pokemon, and I explore this at the end.

**Data Profile**

The data used in this notebook is publicly available on Kaggle from user Rounak Banik. The information was scraped from a Pokemon website called Serebii.net.

First, I reduced the columns in the pokedex dataset down from the original 41 to 13 columns.

In [None]:
#Step one: Filtering out the irrelevant columns.

pokedex.drop([col for col in pokedex.columns if 'against' in col],axis=1,inplace=True)
pokedex.drop([col for col in pokedex.columns if 'base' in col],axis=1,inplace=True)
pokedex.drop(['capture_rate', 'classfication', 'experience_growth', 'japanese_name', 'percentage_male', 'weight_kg', 'height_m'], axis=1, inplace=True)
pokedex

Next, I wanted to create a list of all the Pokemon types in the 801 rows, found in columns "type1" and "type2". There are 18 individual Pokemon types.
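
Because a type can appear in either column, one way to list every type is to combine both columns before taking the unique values. The following is a minimal sketch, assuming a small made-up table (`toy_pokedex`, with illustrative rows that are not taken from the dataset) in place of the real pokedex:

```python
import pandas as pd

# Toy stand-in for the pokedex; the real data comes from pokemon.csv.
toy_pokedex = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'type1': ['grass', 'fire', 'water', 'bug'],
    'type2': ['poison', None, 'flying', 'poison'],
})

# Stack type1 and type2 into one Series, drop missing second types,
# then keep each type once, sorted alphabetically.
all_types = sorted(
    pd.concat([toy_pokedex['type1'], toy_pokedex['type2']]).dropna().unique()
)
print(all_types)
```

In this toy table, "poison" and "flying" only ever appear as a second type, so checking `type1` alone would miss them.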

In [None]:
#What are the different pokemon types?

type_list = pokedex.type1.unique()
type_list

Once I knew what the 18 types were, I created 18 different groupings, one for each type of Pokemon.
For example, all Pokemon with "grass" listed as either type1 or type2 were filtered into a "grass_types" group.
This was repeated for each of the types.
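
The groupings can also be built in a loop by storing each filtered DataFrame in a dictionary keyed by type name, rather than trying to create a new variable name on each pass. The following is a minimal sketch, assuming a toy table with made-up rows and a hypothetical dictionary name `type_groups`:

```python
import pandas as pd

# Toy stand-in for the pokedex (illustrative rows only).
toy_pokedex = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'type1': ['grass', 'fire', 'water', 'grass'],
    'type2': ['poison', None, 'grass', None],
})

type_list = toy_pokedex['type1'].unique()

# One dictionary entry per type; a Pokemon is included if the type
# appears in either its type1 or its type2 column.
type_groups = {
    t: toy_pokedex[(toy_pokedex['type1'] == t) | (toy_pokedex['type2'] == t)]
    for t in type_list
}

print(type_groups['grass']['name'].tolist())
```

With this layout, `type_groups['grass']` plays the role of a `grass_types` variable, and any later step can iterate over `type_groups.items()`.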

In [None]:
#Create 18 groups of pokemon based on their type.

#I think it is possible to write a for loop related to the array above, but I couldn't figure it out.
#I tried versions of the below, but that isn't right.
    ###for i in type_list:
        ### i = i+'_type' (? trying to make an concatenation)
        ###i = pokedex[(pokedex['type1'] == i) | (pokedex['type2'] == i)]

grass_types = pokedex[(pokedex['type1'] == 'grass') | (pokedex['type2'] == 'grass')]
fire_types = pokedex[(pokedex['type1'] == 'fire') | (pokedex['type2'] == 'fire')]
water_types = pokedex[(pokedex['type1'] == 'water') | (pokedex['type2'] == 'water')]
bug_types = pokedex[(pokedex['type1'] == 'bug') | (pokedex['type2'] == 'bug')]
normal_types = pokedex[(pokedex['type1'] == 'normal') | (pokedex['type2'] == 'normal')]
poison_types = pokedex[(pokedex['type1'] == 'poison') | (pokedex['type2'] == 'poison')]
electric_types = pokedex[(pokedex['type1'] == 'electric') | (pokedex['type2'] == 'electric')]
ground_types = pokedex[(pokedex['type1'] == 'ground') | (pokedex['type2'] == 'ground')]
fairy_types = pokedex[(pokedex['type1'] == 'fairy') | (pokedex['type2'] == 'fairy')]
fighting_types = pokedex[(pokedex['type1'] == 'fighting') | (pokedex['type2'] == 'fighting')]
psychic_types = pokedex[(pokedex['type1'] == 'psychic') | (pokedex['type2'] == 'psychic')]
rock_types = pokedex[(pokedex['type1'] == 'rock') | (pokedex['type2'] == 'rock')]
ghost_types = pokedex[(pokedex['type1'] == 'ghost') | (pokedex['type2'] == 'ghost')]
ice_types = pokedex[(pokedex['type1'] == 'ice') | (pokedex['type2'] == 'ice')]
dragon_types = pokedex[(pokedex['type1'] == 'dragon') | (pokedex['type2'] == 'dragon')]
dark_types = pokedex[(pokedex['type1'] == 'dark') | (pokedex['type2'] == 'dark')]
steel_types = pokedex[(pokedex['type1'] == 'steel') | (pokedex['type2'] == 'steel')]
flying_types = pokedex[(pokedex['type1'] == 'flying') | (pokedex['type2'] == 'flying')]

Once I had the data isolated into groups by type, I used the mean function to find the average statistics for each type grouping.
First, I found the **average special attack** for each Pokemon type and placed the results in a bar graph to quickly see which types had the highest averages.

Fire, Electric, Psychic and Dragon types on average had the highest special attacks - which is not surprising to me.
Bug types are the lowest.
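
Per-type averages like these can also be computed without one variable per type. The following is a minimal sketch, assuming toy data (made-up values) and a hypothetical helper named `mean_stat_by_type`; it counts a dual-type Pokemon under both of its types, which matches the type1-or-type2 groupings used in this notebook:

```python
import pandas as pd

# Toy stand-in for the pokedex (illustrative values only).
toy_pokedex = pd.DataFrame({
    'type1': ['grass', 'fire', 'water', 'grass'],
    'type2': ['poison', None, 'grass', None],
    'sp_attack': [60, 100, 80, 40],
})

def mean_stat_by_type(df, stat):
    """Return the mean of `stat` for every type found in type1 or type2."""
    # Reshape so that each (Pokemon, type) pair gets its own row,
    # then drop the rows created by a missing second type.
    long_form = df.melt(id_vars=[stat], value_vars=['type1', 'type2'],
                        value_name='type').dropna(subset=['type'])
    return long_form.groupby('type')[stat].mean()

averages = mean_stat_by_type(toy_pokedex, 'sp_attack')
print(averages)
# averages.plot.barh() would draw a horizontal bar chart of these values.
```

Calling the helper with `'sp_defense'`, `'attack'`, `'defense'`, or `'speed'` would give the other four sets of averages.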

In [None]:
#What is the average special attack of all of the pokemon in one type class?
#I think it is also possible to use a for loop to reduce the repetitive code for this cell, but I also could not figure out how to.
#Another idea that I had was using local variables to reduce repetition, but I also couldn't figure that out :(

sp_attack_grass_types = grass_types['sp_attack'].mean()
sp_attack_fire_types = fire_types['sp_attack'].mean()
sp_attack_water_types = water_types['sp_attack'].mean()
sp_attack_bug_types = bug_types['sp_attack'].mean()
sp_attack_normal_types = normal_types['sp_attack'].mean()
sp_attack_poison_types = poison_types['sp_attack'].mean()
sp_attack_electric_types = electric_types['sp_attack'].mean()
sp_attack_ground_types = ground_types['sp_attack'].mean()
sp_attack_fairy_types = fairy_types['sp_attack'].mean()
sp_attack_fighting_types = fighting_types['sp_attack'].mean()
sp_attack_psychic_types = psychic_types['sp_attack'].mean()
sp_attack_rock_types = rock_types['sp_attack'].mean()
sp_attack_ghost_types = ghost_types['sp_attack'].mean()
sp_attack_ice_types = ice_types['sp_attack'].mean()
sp_attack_dragon_types = dragon_types['sp_attack'].mean()
sp_attack_dark_types = dark_types['sp_attack'].mean()
sp_attack_steel_types = steel_types['sp_attack'].mean()
sp_attack_flying_types = flying_types['sp_attack'].mean()


sp_attack = pd.DataFrame({'Pkmn Types':['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'], 'Average Special Attack':[sp_attack_grass_types, sp_attack_fire_types, sp_attack_water_types, sp_attack_bug_types, sp_attack_normal_types, sp_attack_poison_types, sp_attack_electric_types, sp_attack_ground_types, sp_attack_fairy_types, sp_attack_fighting_types, sp_attack_psychic_types, sp_attack_rock_types, sp_attack_ghost_types, sp_attack_ice_types, sp_attack_dragon_types, sp_attack_dark_types, sp_attack_steel_types, sp_attack_flying_types]})
ax = sp_attack.plot.barh(x='Pkmn Types', y='Average Special Attack', rot=0)

The same exercise was repeated for the **special defense** statistic.

Steel, Dragon, Psychic, and Fairy have the highest statistics.
Ground has the lowest.

In [None]:
#What is the average special defense of all of the pokemon in one type class?


sp_defense_grass_types = grass_types['sp_defense'].mean()
sp_defense_fire_types = fire_types['sp_defense'].mean()
sp_defense_water_types = water_types['sp_defense'].mean()
sp_defense_bug_types = bug_types['sp_defense'].mean()
sp_defense_normal_types = normal_types['sp_defense'].mean()
sp_defense_poison_types = poison_types['sp_defense'].mean()
sp_defense_electric_types = electric_types['sp_defense'].mean()
sp_defense_ground_types = ground_types['sp_defense'].mean()
sp_defense_fairy_types = fairy_types['sp_defense'].mean()
sp_defense_fighting_types = fighting_types['sp_defense'].mean()
sp_defense_psychic_types = psychic_types['sp_defense'].mean()
sp_defense_rock_types = rock_types['sp_defense'].mean()
sp_defense_ghost_types = ghost_types['sp_defense'].mean()
sp_defense_ice_types = ice_types['sp_defense'].mean()
sp_defense_dragon_types = dragon_types['sp_defense'].mean()
sp_defense_dark_types = dark_types['sp_defense'].mean()
sp_defense_steel_types = steel_types['sp_defense'].mean()
sp_defense_flying_types = flying_types['sp_defense'].mean()


sp_defense = pd.DataFrame({'Pkmn Types':['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'], 'Average Special Defense':[sp_defense_grass_types, sp_defense_fire_types, sp_defense_water_types, sp_defense_bug_types, sp_defense_normal_types, sp_defense_poison_types, sp_defense_electric_types, sp_defense_ground_types, sp_defense_fairy_types, sp_defense_fighting_types, sp_defense_psychic_types, sp_defense_rock_types, sp_defense_ghost_types, sp_defense_ice_types, sp_defense_dragon_types, sp_defense_dark_types, sp_defense_steel_types, sp_defense_flying_types]})
ax = sp_defense.plot.barh(x='Pkmn Types', y='Average Special Defense', rot=0)

The same exercise was repeated for the **attack** statistic.

Fighting, Dragon, and Steel have the highest statistics. Fairy has the lowest.

In [None]:
#What is the average attack of all of the pokemon in one type class?

attack_grass_types = grass_types['attack'].mean()
attack_fire_types = fire_types['attack'].mean()
attack_water_types = water_types['attack'].mean()
attack_bug_types = bug_types['attack'].mean()
attack_normal_types = normal_types['attack'].mean()
attack_poison_types = poison_types['attack'].mean()
attack_electric_types = electric_types['attack'].mean()
attack_ground_types = ground_types['attack'].mean()
attack_fairy_types = fairy_types['attack'].mean()
attack_fighting_types = fighting_types['attack'].mean()
attack_psychic_types = psychic_types['attack'].mean()
attack_rock_types = rock_types['attack'].mean()
attack_ghost_types = ghost_types['attack'].mean()
attack_ice_types = ice_types['attack'].mean()
attack_dragon_types = dragon_types['attack'].mean()
attack_dark_types = dark_types['attack'].mean()
attack_steel_types = steel_types['attack'].mean()
attack_flying_types = flying_types['attack'].mean()


attack = pd.DataFrame({'Pkmn Types':['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'], 'Average Attack':[attack_grass_types, attack_fire_types, attack_water_types, attack_bug_types, attack_normal_types, attack_poison_types, attack_electric_types, attack_ground_types, attack_fairy_types, attack_fighting_types, attack_psychic_types, attack_rock_types, attack_ghost_types, attack_ice_types, attack_dragon_types, attack_dark_types, attack_steel_types, attack_flying_types]})
ax = attack.plot.barh(x='Pkmn Types', y='Average Attack', rot=0)

The same exercise was repeated for the **defense** statistic.

Steel and Rock have the highest statistics. Normal has the lowest.

In [None]:
#What is the average defense of all of the pokemon in one type class?

defense_grass_types = grass_types['defense'].mean()
defense_fire_types = fire_types['defense'].mean()
defense_water_types = water_types['defense'].mean()
defense_bug_types = bug_types['defense'].mean()
defense_normal_types = normal_types['defense'].mean()
defense_poison_types = poison_types['defense'].mean()
defense_electric_types = electric_types['defense'].mean()
defense_ground_types = ground_types['defense'].mean()
defense_fairy_types = fairy_types['defense'].mean()
defense_fighting_types = fighting_types['defense'].mean()
defense_psychic_types = psychic_types['defense'].mean()
defense_rock_types = rock_types['defense'].mean()
defense_ghost_types = ghost_types['defense'].mean()
defense_ice_types = ice_types['defense'].mean()
defense_dragon_types = dragon_types['defense'].mean()
defense_dark_types = dark_types['defense'].mean()
defense_steel_types = steel_types['defense'].mean()
defense_flying_types = flying_types['defense'].mean()


defense = pd.DataFrame({'Pkmn Types':['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'], 'Average Defense':[defense_grass_types, defense_fire_types, defense_water_types, defense_bug_types, defense_normal_types, defense_poison_types, defense_electric_types, defense_ground_types, defense_fairy_types, defense_fighting_types, defense_psychic_types, defense_rock_types, defense_ghost_types, defense_ice_types, defense_dragon_types, defense_dark_types, defense_steel_types, defense_flying_types]})
ax = defense.plot.barh(x='Pkmn Types', y='Average Defense', rot=0)

The same exercise was repeated for the **speed** statistic.

Flying, Electric, Dark, and Dragon have the highest statistics. Rock has the lowest.

In [None]:
#What is the average speed of all of the pokemon in one type class?

speed_grass_types = grass_types['speed'].mean()
speed_fire_types = fire_types['speed'].mean()
speed_water_types = water_types['speed'].mean()
speed_bug_types = bug_types['speed'].mean()
speed_normal_types = normal_types['speed'].mean()
speed_poison_types = poison_types['speed'].mean()
speed_electric_types = electric_types['speed'].mean()
speed_ground_types = ground_types['speed'].mean()
speed_fairy_types = fairy_types['speed'].mean()
speed_fighting_types = fighting_types['speed'].mean()
speed_psychic_types = psychic_types['speed'].mean()
speed_rock_types = rock_types['speed'].mean()
speed_ghost_types = ghost_types['speed'].mean()
speed_ice_types = ice_types['speed'].mean()
speed_dragon_types = dragon_types['speed'].mean()
speed_dark_types = dark_types['speed'].mean()
speed_steel_types = steel_types['speed'].mean()
speed_flying_types = flying_types['speed'].mean()


speed = pd.DataFrame({'Pkmn Types':['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'], 'Speed':[speed_grass_types, speed_fire_types, speed_water_types, speed_bug_types, speed_normal_types, speed_poison_types, speed_electric_types, speed_ground_types, speed_fairy_types, speed_fighting_types, speed_psychic_types, speed_rock_types, speed_ghost_types, speed_ice_types, speed_dragon_types, speed_dark_types, speed_steel_types, speed_flying_types]})
ax = speed.plot.barh(x='Pkmn Types', y='Speed', rot=0)

When I was trying to make generalizations about which types were the best, I thought it was valuable to be able to quickly view a graph that had all of the stats
on one bar graph. So, I created another bar graph that condenses all of the information above.

I also added if/else statements that calculated and listed the value for the lowest stat for each type.
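
The lowest stat per type can also be read directly from a table of averages with `idxmin`, which returns the column label of the smallest value in each row. The following is a minimal sketch, assuming a made-up table of averages (`toy_averages`; the numbers are illustrative, not results from the dataset):

```python
import pandas as pd

# Toy table of per-type averages: one row per type, one column per stat.
toy_averages = pd.DataFrame(
    {'attack': [70.0, 80.0], 'sp_attack': [75.0, 90.0],
     'defense': [72.0, 65.0], 'sp_defense': [71.0, 70.0],
     'speed': [60.0, 75.0]},
    index=['grass', 'fire'],
)

worst_stat = toy_averages.idxmin(axis=1)   # name of the lowest column per row
worst_value = toy_averages.min(axis=1)     # the lowest value itself

for type_name in toy_averages.index:
    print("The worst stat on average for", type_name, "pokemon is",
          worst_stat[type_name], "-->", worst_value[type_name])
```

If two stats tie for the lowest value, `idxmin` reports the first one in column order, so the column order decides which name is printed.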

In [None]:
#Put all of the average stats on one bar graph. First, make a new dataframe with the averages above, and then place it into a graph.
#Then, print which stat is the worst for each type.

average_all_stats = {'Type':['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'],
    'Av. Attack':[attack_grass_types, attack_fire_types, attack_water_types, attack_bug_types, attack_normal_types, attack_poison_types, attack_electric_types, attack_ground_types, attack_fairy_types, attack_fighting_types, attack_psychic_types, attack_rock_types, attack_ghost_types, attack_ice_types, attack_dragon_types, attack_dark_types, attack_steel_types, attack_flying_types],
    'Av. Special Attack':[sp_attack_grass_types, sp_attack_fire_types, sp_attack_water_types, sp_attack_bug_types, sp_attack_normal_types, sp_attack_poison_types, sp_attack_electric_types, sp_attack_ground_types, sp_attack_fairy_types, sp_attack_fighting_types, sp_attack_psychic_types, sp_attack_rock_types, sp_attack_ghost_types, sp_attack_ice_types, sp_attack_dragon_types, sp_attack_dark_types, sp_attack_steel_types, sp_attack_flying_types],
    'Av. Defense':[defense_grass_types, defense_fire_types, defense_water_types, defense_bug_types, defense_normal_types, defense_poison_types, defense_electric_types, defense_ground_types, defense_fairy_types, defense_fighting_types, defense_psychic_types, defense_rock_types, defense_ghost_types, defense_ice_types, defense_dragon_types, defense_dark_types, defense_steel_types, defense_flying_types],
    'Av. Special Defense':[sp_defense_grass_types, sp_defense_fire_types, sp_defense_water_types, sp_defense_bug_types, sp_defense_normal_types, sp_defense_poison_types, sp_defense_electric_types, sp_defense_ground_types, sp_defense_fairy_types, sp_defense_fighting_types, sp_defense_psychic_types, sp_defense_rock_types, sp_defense_ghost_types, sp_defense_ice_types, sp_defense_dragon_types, sp_defense_dark_types, sp_defense_steel_types, sp_defense_flying_types],
    'Av. Speed':[speed_grass_types, speed_fire_types, speed_water_types, speed_bug_types, speed_normal_types, speed_poison_types, speed_electric_types, speed_ground_types, speed_fairy_types, speed_fighting_types, speed_psychic_types, speed_rock_types, speed_ghost_types, speed_ice_types, speed_dragon_types, speed_dark_types, speed_steel_types, speed_flying_types]}
index = ['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying'];

df = pd.DataFrame(average_all_stats, index)

df.plot.bar(rot=75, title="Average Stats for each Pokemon Type");

################
#Each type has one average that is strictly lower than the others, so the chain below reports a unique worst stat.
#If two stats tied for the lowest, only the first match in the if/elif order would be reported.

grass_stats = (grass_types['attack'].mean(), grass_types['sp_attack'].mean(), grass_types['defense'].mean(), grass_types['sp_defense'].mean(), grass_types['speed'].mean())
if min(grass_stats) == grass_types['attack'].mean():
    x = "attack"
elif min(grass_stats) == grass_types['sp_attack'].mean():
    x = "sp. attack"
elif min(grass_stats) == grass_types['defense'].mean():
    x = "defense"
elif min(grass_stats) == grass_types['sp_defense'].mean():
    x = "sp. defense"
elif min(grass_stats) == grass_types['speed'].mean():
    x = "speed"
print("The worst stat on average for grass pokemon is", x, "-->", (min(grass_stats)))


fire_stats = (fire_types['attack'].mean(), fire_types['sp_attack'].mean(), fire_types['defense'].mean(), fire_types['sp_defense'].mean(), fire_types['speed'].mean())
if min(fire_stats) == fire_types['attack'].mean():
    x = "attack"
elif min(fire_stats) == fire_types['sp_attack'].mean():
    x = "sp. attack"
elif min(fire_stats) == fire_types['defense'].mean():
    x = "defense"
elif min(fire_stats) == fire_types['sp_defense'].mean():
    x = "sp. defense"
elif min(fire_stats) == fire_types['speed'].mean():
    x = "speed"
print("The worst stat on average for fire pokemon is", x, "-->", (min(fire_stats)))


water_stats = (water_types['attack'].mean(), water_types['sp_attack'].mean(), water_types['defense'].mean(), water_types['sp_defense'].mean(), water_types['speed'].mean())
if min(water_stats) == water_types['attack'].mean():
    x = "attack"
elif min(water_stats) == water_types['sp_attack'].mean():
    x = "sp. attack"
elif min(water_stats) == water_types['defense'].mean():
    x = "defense"
elif min(water_stats) == water_types['sp_defense'].mean():
    x = "sp. defense"
elif min(water_stats) == water_types['speed'].mean():
    x = "speed"
print("The worst stat on average for water pokemon is", x, "-->", (min(water_stats)))


bug_stats = (bug_types['attack'].mean(), bug_types['sp_attack'].mean(), bug_types['defense'].mean(), bug_types['sp_defense'].mean(), bug_types['speed'].mean()) 
if min(bug_stats) == bug_types['attack'].mean():
    x = "attack"
elif min(bug_stats) == bug_types['sp_attack'].mean():
    x = "sp. attack"
elif min(bug_stats) == bug_types['defense'].mean():
    x = "defense"
elif min(bug_stats) == bug_types['sp_defense'].mean():
    x = "sp. defense"
elif min(bug_stats) == bug_types['speed'].mean():
    x = "speed"
print("The worst stat on average for bug pokemon is", x, "-->", (min(bug_stats)))


normal_stats = (normal_types['attack'].mean(), normal_types['sp_attack'].mean(), normal_types['defense'].mean(), normal_types['sp_defense'].mean(), normal_types['speed'].mean()) 
if min(normal_stats) == normal_types['attack'].mean():
    x = "attack"
elif min(normal_stats) == normal_types['sp_attack'].mean():
    x = "sp. attack"
elif min(normal_stats) == normal_types['defense'].mean():
    x = "defense"
elif min(normal_stats) == normal_types['sp_defense'].mean():
    x = "sp. defense"
elif min(normal_stats) == normal_types['speed'].mean():
    x = "speed"
print("The worst stat on average for normal pokemon is", x, "-->", (min(normal_stats)))


poison_stats = (poison_types['attack'].mean(), poison_types['sp_attack'].mean(), poison_types['defense'].mean(), poison_types['sp_defense'].mean(), poison_types['speed'].mean()) 
if min(poison_stats) == poison_types['attack'].mean():
    x = "attack"
elif min(poison_stats) == poison_types['sp_attack'].mean():
    x = "sp. attack"
elif min(poison_stats) == poison_types['defense'].mean():
    x = "defense"
elif min(poison_stats) == poison_types['sp_defense'].mean():
    x = "sp. defense"
elif min(poison_stats) == poison_types['speed'].mean():
    x = "speed"
print("The worst stat on average for poison pokemon is", x, "-->", (min(poison_stats)))


electric_stats = (electric_types['attack'].mean(), electric_types['sp_attack'].mean(), electric_types['defense'].mean(), electric_types['sp_defense'].mean(), electric_types['speed'].mean()) 
if min(electric_stats) == electric_types['attack'].mean():
    x = "attack"
elif min(electric_stats) == electric_types['sp_attack'].mean():
    x = "sp. attack"
elif min(electric_stats) == electric_types['defense'].mean():
    x = "defense"
elif min(electric_stats) == electric_types['sp_defense'].mean():
    x = "sp. defense"
elif min(electric_stats) == electric_types['speed'].mean():
    x = "speed"
print("The worst stat on average for electric pokemon is", x, "-->", (min(electric_stats)))


ground_stats = (ground_types['attack'].mean(), ground_types['sp_attack'].mean(), ground_types['defense'].mean(), ground_types['sp_defense'].mean(), ground_types['speed'].mean()) 
if min(ground_stats) == ground_types['attack'].mean():
    x = "attack"
elif min(ground_stats) == ground_types['sp_attack'].mean():
    x = "sp. attack"
elif min(ground_stats) == ground_types['defense'].mean():
    x = "defense"
elif min(ground_stats) == ground_types['sp_defense'].mean():
    x = "sp. defense"
elif min(ground_stats) == ground_types['speed'].mean():
    x = "speed"
print("The worst stat on average for ground pokemon is", x, "-->", (min(ground_stats)))


fairy_stats = (fairy_types['attack'].mean(), fairy_types['sp_attack'].mean(), fairy_types['defense'].mean(), fairy_types['sp_defense'].mean(), fairy_types['speed'].mean()) 
if min(fairy_stats) == fairy_types['attack'].mean():
    x = "attack"
elif min(fairy_stats) == fairy_types['sp_attack'].mean():
    x = "sp. attack"
elif min(fairy_stats) == fairy_types['defense'].mean():
    x = "defense"
elif min(fairy_stats) == fairy_types['sp_defense'].mean():
    x = "sp. defense"
elif min(fairy_stats) == fairy_types['speed'].mean():
    x = "speed"
print("The worst stat on average for fairy pokemon is", x, "-->", (min(fairy_stats)))


fighting_stats = (fighting_types['attack'].mean(), fighting_types['sp_attack'].mean(), fighting_types['defense'].mean(), fighting_types['sp_defense'].mean(), fighting_types['speed'].mean()) 
if min(fighting_stats) == fighting_types['attack'].mean():
    x = "attack"
elif min(fighting_stats) == fighting_types['sp_attack'].mean():
    x = "sp. attack"
elif min(fighting_stats) == fighting_types['defense'].mean():
    x = "defense"
elif min(fighting_stats) == fighting_types['sp_defense'].mean():
    x = "sp. defense"
elif min(fighting_stats) == fighting_types['speed'].mean():
    x = "speed"
print("The worst stat on average for fighting pokemon is", x, "-->", (min(fighting_stats)))


psychic_stats = (psychic_types['attack'].mean(), psychic_types['sp_attack'].mean(), psychic_types['defense'].mean(), psychic_types['sp_defense'].mean(), psychic_types['speed'].mean()) 
if min(psychic_stats) == psychic_types['attack'].mean():
    x = "attack"
elif min(psychic_stats) == psychic_types['sp_attack'].mean():
    x = "sp. attack"
elif min(psychic_stats) == psychic_types['defense'].mean():
    x = "defense"
elif min(psychic_stats) == psychic_types['sp_defense'].mean():
    x = "sp. defense"
elif min(psychic_stats) == psychic_types['speed'].mean():
    x = "speed"
print("The worst stat on average for psychic pokemon is", x, "-->", (min(psychic_stats)))


rock_stats = (rock_types['attack'].mean(), rock_types['sp_attack'].mean(), rock_types['defense'].mean(), rock_types['sp_defense'].mean(), rock_types['speed'].mean()) 
if min(rock_stats) == rock_types['attack'].mean():
    x = "attack"
elif min(rock_stats) == rock_types['sp_attack'].mean():
    x = "sp. attack"
elif min(rock_stats) == rock_types['defense'].mean():
    x = "defense"
elif min(rock_stats) == rock_types['sp_defense'].mean():
    x = "sp. defense"
elif min(rock_stats) == rock_types['speed'].mean():
    x = "speed"
print("The worst stat on average for rock pokemon is", x, "-->", (min(rock_stats)))


ghost_stats = (ghost_types['attack'].mean(), ghost_types['sp_attack'].mean(), ghost_types['defense'].mean(), ghost_types['sp_defense'].mean(), ghost_types['speed'].mean()) 
if min(ghost_stats) == ghost_types['attack'].mean():
    x = "attack"
elif min(ghost_stats) == ghost_types['sp_attack'].mean():
    x = "sp. attack"
elif min(ghost_stats) == ghost_types['defense'].mean():
    x = "defense"
elif min(ghost_stats) == ghost_types['sp_defense'].mean():
    x = "sp. defense"
elif min(ghost_stats) == ghost_types['speed'].mean():
    x = "speed"
print("The worst stat on average for ghost pokemon is", x, "-->", (min(ghost_stats)))


ice_stats = (ice_types['attack'].mean(), ice_types['sp_attack'].mean(), ice_types['defense'].mean(), ice_types['sp_defense'].mean(), ice_types['speed'].mean()) 
if min(ice_stats) == ice_types['attack'].mean():
    x = "attack"
elif min(ice_stats) == ice_types['sp_attack'].mean():
    x = "sp. attack"
elif min(ice_stats) == ice_types['defense'].mean():
    x = "defense"
elif min(ice_stats) == ice_types['sp_defense'].mean():
    x = "sp. defense"
elif min(ice_stats) == ice_types['speed'].mean():
    x = "speed"
print("The worst stat on average for ice pokemon is", x, "-->", (min(ice_stats)))


dragon_stats = (dragon_types['attack'].mean(), dragon_types['sp_attack'].mean(), dragon_types['defense'].mean(), dragon_types['sp_defense'].mean(), dragon_types['speed'].mean()) 
if min(dragon_stats) == dragon_types['attack'].mean():
    x = "attack"
elif min(dragon_stats) == dragon_types['sp_attack'].mean():
    x = "sp. attack"
elif min(dragon_stats) == dragon_types['defense'].mean():
    x = "defense"
elif min(dragon_stats) == dragon_types['sp_defense'].mean():
    x = "sp. defense"
elif min(dragon_stats) == dragon_types['speed'].mean():
    x = "speed"
print("The worst stat on average for dragon pokemon is", x, "-->", (min(dragon_stats)))


dark_stats = (dark_types['attack'].mean(), dark_types['sp_attack'].mean(), dark_types['defense'].mean(), dark_types['sp_defense'].mean(), dark_types['speed'].mean()) 
if min(dark_stats) == dark_types['attack'].mean():
    x = "attack"
elif min(dark_stats) == dark_types['sp_attack'].mean():
    x = "sp. attack"
elif min(dark_stats) == dark_types['defense'].mean():
    x = "defense"
elif min(dark_stats) == dark_types['sp_defense'].mean():
    x = "sp. defense"
elif min(dark_stats) == dark_types['speed'].mean():
    x = "speed"
print("The worst stat on average for dark pokemon is", x, "-->", (min(dark_stats)))


steel_stats = (steel_types['attack'].mean(), steel_types['sp_attack'].mean(), steel_types['defense'].mean(), steel_types['sp_defense'].mean(), steel_types['speed'].mean()) 
if min(steel_stats) == steel_types['attack'].mean():
    x = "attack"
elif min(steel_stats) == steel_types['sp_attack'].mean():
    x = "sp. attack"
elif min(steel_stats) == steel_types['defense'].mean():
    x = "defense"
elif min(steel_stats) == steel_types['sp_defense'].mean():
    x = "sp. defense"
elif min(steel_stats) == steel_types['speed'].mean():
    x = "speed"
print("The worst stat on average for steel pokemon is", x, "-->", (min(steel_stats)))


flying_stats = (flying_types['attack'].mean(), flying_types['sp_attack'].mean(), flying_types['defense'].mean(), flying_types['sp_defense'].mean(), flying_types['speed'].mean()) 
if min(flying_stats) == flying_types['attack'].mean():
    x = "attack"
elif min(flying_stats) == flying_types['sp_attack'].mean():
    x = "sp. attack"
elif min(flying_stats) == flying_types['defense'].mean():
    x = "defense"
elif min(flying_stats) == flying_types['sp_defense'].mean():
    x = "sp. defense"
elif min(flying_stats) == flying_types['speed'].mean():
    x = "speed"
print("The worst stat on average for flying pokemon is", x, "-->", (min(flying_stats)))

Looking at the best and worst Pokemon types on average in this manner made a ton of sense!

Generally speaking, I never use Bug, Normal, or Poison Pokemon, and those type groupings have abysmally low stats on average.
I rarely use Flying Pokemon, and when I do, it is because I anticipate that they will be fast, which the stats bear out.

I most frequently use Dragon, Steel, and Psychic types, and they have very high stats on average. Additionally, the final bosses in Pokemon often use Dragon types, which also makes sense based on the statistics.


Then, I was curious which Pokemon were the best & worst in their type categories, and tried to sort the type groupings to see that in the dataframe.

In [None]:
#NOTE: I wasn't able to figure this out. My code kept triggering the pandas warning:

    #A value is trying to be set on a copy of a slice from a DataFrame.
    #Try using .loc[row_indexer,col_indexer] = value instead

#and I was unable to debug it. My attempt was:

        #dragon_average_column = dragon_types[['sp_attack']].add(dragon_types['sp_defense'], axis='index')
        #dragon_average_column = dragon_average_column.add(dragon_types['speed'], axis='index')
        #dragon_average_column = dragon_average_column.add(dragon_types['attack'], axis='index')
        #dragon_average_column = dragon_average_column.add(dragon_types['defense'], axis='index')
        #dragon_average_column = dragon_average_column / 5

        #dragon_types['stat_averages'] = dragon_average_column
        #dragon_types
        #dragon_types.sort_values(by='stat_averages', ascending=False)
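
The message quoted above is pandas' `SettingWithCopyWarning`, which can appear when a new column is assigned to a DataFrame that was itself produced by filtering another DataFrame. One common way around it is to take an explicit `.copy()` of the filtered rows before adding the column. The following is a minimal sketch, assuming toy data with made-up values:

```python
import pandas as pd

# Toy stand-in for the pokedex (illustrative values only).
toy_pokedex = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'type1': ['dragon', 'fire', 'dragon'],
    'type2': [None, 'dragon', 'flying'],
    'attack': [100, 90, 120], 'sp_attack': [90, 110, 100],
    'defense': [80, 70, 90], 'sp_defense': [70, 60, 100],
    'speed': [60, 80, 90],
})
stat_columns = ['attack', 'sp_attack', 'defense', 'sp_defense', 'speed']

# .copy() makes an independent DataFrame, so adding a column is unambiguous.
dragon_types = toy_pokedex[(toy_pokedex['type1'] == 'dragon') |
                           (toy_pokedex['type2'] == 'dragon')].copy()

# Row-wise mean of the five stats, then sort from highest to lowest.
dragon_types['stat_averages'] = dragon_types[stat_columns].mean(axis=1)
ranked = dragon_types.sort_values(by='stat_averages', ascending=False)
print(ranked[['name', 'stat_averages']])
```

Here `mean(axis=1)` replaces the chain of `.add(...)` calls followed by a division by 5.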

After this problem, I moved on to a new question.
I wanted to determine which generation of legendary pokemon has the highest overall stats. First, I updated the Pokedex dataset to only provide me with rows
where column "is_legendary" was 1, signifying that the Pokemon was a legendary pokemon.

In [None]:
#On average, which generation of legendary pokemon has the highest stats? 
#Step One: how many legendary pokemon are in the original dataset?

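# Drop columns that are not needed for this comparison.
pokedex.drop(['abilities', 'hp', 'pokedex_number'], axis=1, inplace=True)
# Keep only legendary Pokemon by dropping every row where is_legendary is 0.
pokedex.drop(pokedex[pokedex['is_legendary'] == 0].index, inplace=True)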
pokedex

I was pleasantly surprised to learn that 70 Pokemon fit that criterion.

I was curious to see how many Pokemon in each generation were legendary, so I broke out the dataset a bit more to quickly see that.
I was surprised that the 6th generation has so few, as the trend indicates that the number of legendary Pokemon increases each generation.

In [None]:
#On average, which generation of legendary pokemon has the highest stats? 
#Step Two: how many legendary pokemon are in each generation of pokemon?

for i in pokedex.groupby(['generation']):
    print(i)
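
If only the counts per generation are needed, `groupby(...).size()` gives them directly instead of printing every group. This is a minimal sketch on a small made-up frame (`legendary_demo` and its values are hypothetical, not the real legendary rows); it assumes the same `generation` column as the dataset.

```python
import pandas as pd

# Made-up stand-in for the filtered legendary rows; values are illustrative only.
legendary_demo = pd.DataFrame({
    "name": ["Demo A", "Demo B", "Demo C", "Demo D", "Demo E"],
    "generation": [1, 1, 2, 3, 3],
})

# size() returns the number of rows in each group, indexed by generation.
legendary_counts = legendary_demo.groupby("generation").size()
print(legendary_counts)
```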

Finally, using the same techniques as before, I split the legendary Pokemon into groups by generation, and then placed each group's average stats on a bar graph for a quick comparison.

In [None]:
#On average, which generation of legendary pokemon has the highest stats? 
#Step Three: Separate out the generations into their own sets.

l_gen_1 = pokedex[(pokedex['generation'] == 1)]
l_gen_2 = pokedex[(pokedex['generation'] == 2)]
l_gen_3 = pokedex[(pokedex['generation'] == 3)]
l_gen_4 = pokedex[(pokedex['generation'] == 4)]
l_gen_5 = pokedex[(pokedex['generation'] == 5)]
l_gen_6 = pokedex[(pokedex['generation'] == 6)]
l_gen_7 = pokedex[(pokedex['generation'] == 7)]

In [None]:
#On average, which generation of legendary pokemon has the highest stats? 
#Step Four: Grab the mean of all of the legendary stats and put onto a bar graph.

av_attack_l_gen_1 = l_gen_1['attack'].mean()
av_sp_attack_l_gen_1 = l_gen_1['sp_attack'].mean()
av_defense_l_gen_1 = l_gen_1['defense'].mean()
av_sp_defense_l_gen_1 = l_gen_1['sp_defense'].mean()
av_speed_l_gen_1 = l_gen_1['speed'].mean()

av_attack_l_gen_2 = l_gen_2['attack'].mean()
av_sp_attack_l_gen_2 = l_gen_2['sp_attack'].mean()
av_defense_l_gen_2 = l_gen_2['defense'].mean()
av_sp_defense_l_gen_2 = l_gen_2['sp_defense'].mean()
av_speed_l_gen_2 = l_gen_2['speed'].mean()

av_attack_l_gen_3 = l_gen_3['attack'].mean()
av_sp_attack_l_gen_3 = l_gen_3['sp_attack'].mean()
av_defense_l_gen_3 = l_gen_3['defense'].mean()
av_sp_defense_l_gen_3 = l_gen_3['sp_defense'].mean()
av_speed_l_gen_3 = l_gen_3['speed'].mean()

av_attack_l_gen_4 = l_gen_4['attack'].mean()
av_sp_attack_l_gen_4 = l_gen_4['sp_attack'].mean()
av_defense_l_gen_4 = l_gen_4['defense'].mean()
av_sp_defense_l_gen_4 = l_gen_4['sp_defense'].mean()
av_speed_l_gen_4 = l_gen_4['speed'].mean()

av_attack_l_gen_5 = l_gen_5['attack'].mean()
av_sp_attack_l_gen_5 = l_gen_5['sp_attack'].mean()
av_defense_l_gen_5 = l_gen_5['defense'].mean()
av_sp_defense_l_gen_5 = l_gen_5['sp_defense'].mean()
av_speed_l_gen_5 = l_gen_5['speed'].mean()

av_attack_l_gen_6 = l_gen_6['attack'].mean()
av_sp_attack_l_gen_6 = l_gen_6['sp_attack'].mean()
av_defense_l_gen_6 = l_gen_6['defense'].mean()
av_sp_defense_l_gen_6 = l_gen_6['sp_defense'].mean()
av_speed_l_gen_6 = l_gen_6['speed'].mean()

av_attack_l_gen_7 = l_gen_7['attack'].mean()
av_sp_attack_l_gen_7 = l_gen_7['sp_attack'].mean()
av_defense_l_gen_7 = l_gen_7['defense'].mean()
av_sp_defense_l_gen_7 = l_gen_7['sp_defense'].mean()
av_speed_l_gen_7 = l_gen_7['speed'].mean()

average_all_l_stats = {'Generation':['1', '2', '3', '4', '5', '6', '7'],
    'Av. L Attack':[av_attack_l_gen_1, av_attack_l_gen_2, av_attack_l_gen_3, av_attack_l_gen_4, av_attack_l_gen_5, av_attack_l_gen_6, av_attack_l_gen_7],
    'Av. L Special Attack':[av_sp_attack_l_gen_1, av_sp_attack_l_gen_2, av_sp_attack_l_gen_3, av_sp_attack_l_gen_4, av_sp_attack_l_gen_5, av_sp_attack_l_gen_6, av_sp_attack_l_gen_7],
    'Av. L Defense':[av_defense_l_gen_1, av_defense_l_gen_2, av_defense_l_gen_3, av_defense_l_gen_4, av_defense_l_gen_5, av_defense_l_gen_6,av_defense_l_gen_7],
    'Av. L Special Defense':[av_sp_defense_l_gen_1, av_sp_defense_l_gen_2, av_sp_defense_l_gen_3, av_sp_defense_l_gen_4, av_sp_defense_l_gen_5, av_sp_defense_l_gen_6, av_sp_defense_l_gen_7],
    'Av. L Speed':[av_speed_l_gen_1, av_speed_l_gen_2, av_speed_l_gen_3, av_speed_l_gen_4, av_speed_l_gen_5, av_speed_l_gen_6, av_speed_l_gen_7]}
index = ['Gen 1', 'Gen 2', 'Gen 3', 'Gen 4', 'Gen 5', 'Gen 6', 'Gen 7']

df2 = pd.DataFrame(average_all_l_stats, index=index)

df2.plot.bar(rot=75, title="Average Stats for Legendary Pokemon in Each Generation");
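
The per-generation averages computed one variable at a time above could also be produced in a single step with `groupby` and `mean`. The sketch below is one way to do that, not the approach used in this notebook; `legendary_demo`, `gen_means`, and all the numbers are made up for illustration, and only the column names are taken from the dataset.

```python
import pandas as pd

# Made-up stand-in for the legendary-only rows; values are illustrative only.
legendary_demo = pd.DataFrame({
    "generation": [1, 1, 2, 2],
    "attack": [100, 120, 90, 110],
    "sp_attack": [130, 150, 80, 100],
    "defense": [90, 90, 100, 120],
    "sp_defense": [100, 110, 95, 105],
    "speed": [95, 105, 85, 95],
})
stat_cols = ["attack", "sp_attack", "defense", "sp_defense", "speed"]

# One row per generation, one column per stat, each cell the group mean.
gen_means = legendary_demo.groupby("generation")[stat_cols].mean()

# Relabel the index so the rows read "Gen 1", "Gen 2", ...
gen_means.index = [f"Gen {g}" for g in gen_means.index]
print(gen_means)
```

Calling `gen_means.plot.bar(rot=75)` on the result would draw the same style of grouped bar chart, and the code would not need to change if a new generation were added to the data.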

An interesting takeaway: I'm surprised by how high the Sp. Attack is for generation 1 Pokemon. Thinking more about it, the generation 1 legendaries have always been extremely popular with fans, enough that many of them received upgrades in the newest Pokemon games, 6 generations later. I wondered if their popularity was because of nostalgia, but the average stats suggest that they are actually quite good.

I had a lot of fun analyzing this and checking my assumptions! After I finished this assignment, I called my younger brother, because we grew up playing Pokemon together. We chatted for 15 minutes about the assumptions, and started to think about ways that this dataset could be used to objectively help "create the perfect Pokemon team."

For example:

* I didn't play with type matchups in this dataset, but it would be super interesting to figure out which Pokemon teams have the fewest weaknesses based on type matchups and other pairings.
* Most Pokemon evolve at least once, which changes their maximum statistics. This dataset did not include an indicator for evolutions, so my averages were slightly skewed. If there were a column that allowed me to filter out unevolved forms, that might change how the bar graphs look. For example, Weedle is an extremely weak Bug Pokemon that eventually evolves into Beedrill. Including Weedle in with "Bug" definitely lowers the average, so I would have liked to filter it out and only include the stronger evolution, especially because in a game, I would evolve it.
* I wonder which Gym Leader/final Bosses have the best teams, and which ones have the worst? That data would also be super fun to pull.

Etc., etc. This project wasn't as serious as some of the other ones, but I actually really liked doing this to better understand why I play the way that I play.