# **Little Sister Pokemon Task**

### **The issue**: 

#### My little sister wants to play Pokemon Sword. She really likes the new Fairy types, but she feels they are weak in battle compared to other types. She wants to know which pokemon she should use if she sticks with a Fairy type.

#### **Questions**:

#### -How do the stats of fairy types compare to other types across all games?
#### -What is the best offensive option for a fairy type pokemon in Pokemon Sword?

#### In order to answer these questions, I first need to gather some pokemon data. I use a dataset which I created beforehand with PokeAPI. It is a general dataset and not comprehensive, so we will need to gather more data from other sources later in this project.

##### Step 1: Gather necessary tools and materials

In [166]:
# important libraries
import pandas as pd
import requests as rq
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
import ipywidgets as wdg
from IPython.display import display
from bs4 import BeautifulSoup as bs
import re

##### Step 2: Import previously made dataset

In [167]:
# this dataset was gathered from pokeapi in preparation for this project
poke_df = pd.read_csv('Pokemon Data but in CSV.csv')
poke_df = poke_df.drop(columns=['Unnamed: 0'])

In [168]:
display(poke_df)

Unnamed: 0,ID,Name,Type 1,Type 2,Ability 1,Ability 2,Hidden Ability,HP,Attack,Defense,Special Attack,Special Defense,Speed,Legendary,Mythical,Psuedo-Legendary
0,1,bulbasaur,grass,poison,overgrow,,chlorophyll,45,49,49,65,65,45,False,False,False
1,2,ivysaur,grass,poison,overgrow,,chlorophyll,60,62,63,80,80,60,False,False,False
2,3,venusaur,grass,poison,overgrow,,chlorophyll,80,82,83,100,100,80,False,False,False
3,4,charmander,fire,,blaze,,solar-power,39,52,43,60,50,65,False,False,False
4,5,charmeleon,fire,,blaze,,solar-power,58,64,58,80,65,80,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
893,894,regieleki,electric,,transistor,,,80,100,50,100,50,200,True,False,False
894,895,regidrago,dragon,,dragons-maw,,,200,100,50,100,50,80,True,False,False
895,896,glastrier,ice,,chilling-neigh,,,100,145,130,65,110,30,True,False,False
896,897,spectrier,ghost,,grim-neigh,,,100,65,60,145,80,130,True,False,False


#### Now that we have our dataset we would normally go through it and start cleaning it, but since this dataset was made beforehand with this project in mind it has already been cleaned. So we can instead move things forward and start looking for solutions to the questions given.

### **How do the stats of fairy types compare to other types across all games?**

#### In order to answer this, we will create a bar chart showing the average stat of all pokemon types. This can be further navigated by changing which stat you would like to compare.

##### Step 3: Organize the data to suit this specific task. My approach is to create a dataframe for each pokemon type, and put them in a dictionary to use later for visualizations

In [169]:
# create an array of all the types
type_list = poke_df['Type 1'].unique()

# create a dictionary of dataframes, one per type, using dictionary comprehension
# a pokemon is included if the type appears as either its Type 1 or its Type 2
## link to stack overflow post used:
## https://stackoverflow.com/questions/40482738/how-to-name-dataframe-with-variables-in-pandas
dfs_dict = {f'{type_name}_df': poke_df.loc[(poke_df['Type 1'] == type_name) | (poke_df['Type 2'] == type_name)] for type_name in type_list}

In [170]:
dfs_dict['dragon_df'].head()

Unnamed: 0,ID,Name,Type 1,Type 2,Ability 1,Ability 2,Hidden Ability,HP,Attack,Defense,Special Attack,Special Defense,Speed,Legendary,Mythical,Psuedo-Legendary
146,147,dratini,dragon,,shed-skin,,marvel-scale,41,64,45,50,50,50,False,False,False
147,148,dragonair,dragon,,shed-skin,,marvel-scale,61,84,65,70,70,70,False,False,False
148,149,dragonite,dragon,flying,inner-focus,,multiscale,91,134,95,100,100,80,False,False,True
229,230,kingdra,water,dragon,swift-swim,sniper,damp,75,95,95,95,95,85,False,False,False
328,329,vibrava,ground,dragon,levitate,,,50,70,50,50,50,70,False,False,False


##### Step 4: We then get the mean for each stat (HP, Speed, Attack, etc.) in every dataframe, and output it into one cohesive dataframe showing the average stat for each pokemon type.

In [171]:
# Create list of pokemon stat names in order to iterate through
poke_stats_name = dfs_dict['grass_df'].columns.tolist()
poke_stats_name = poke_stats_name[7:13]

# This function creates a list of averages for the given stat
# The list holds one average per pokemon type, in the same order as type_list
def stat_avg_all_types(stat):
  avg_stat_list = []
  for type_name in type_list:
    avg_stat = int(round(dfs_dict[f'{type_name}_df'][stat].mean()))
    avg_stat_list.append(avg_stat)
  return avg_stat_list

# Create a dictionary with stat names as keys and the per-type averages from the previous function as values, then turn it into a dataframe
avg_stat_by_type = pd.DataFrame({stat: stat_avg_all_types(stat) for stat in poke_stats_name})

# Inserting pokemon type as column and cleaning it
avg_stat_by_type['Type'] = type_list
avg_stat_by_type['Type'] = avg_stat_by_type['Type'].str.title()
cols = list(avg_stat_by_type.columns)
avg_stat_by_type = avg_stat_by_type[[cols[-1]] + cols[0: 6]]

In [172]:
display(avg_stat_by_type)

Unnamed: 0,Type,HP,Attack,Defense,Special Attack,Special Defense,Speed
0,Grass,66,74,72,70,71,58
1,Fire,69,81,66,85,70,74
2,Water,69,71,73,71,69,64
3,Bug,56,67,69,57,64,61
4,Normal,76,73,60,58,63,69
5,Poison,64,69,64,70,68,65
6,Electric,65,74,63,85,68,82
7,Ground,74,88,85,57,62,57
8,Fairy,67,63,69,78,83,60
9,Fighting,75,101,74,63,69,72
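
#### As a cross-check on Step 4, the same per-type averages can be sketched without a dictionary of dataframes by stacking Type 1 and Type 2 into one column and grouping on it. This is only a sketch on a handful of rows copied from the dataset preview (restricted to HP and Attack); the names `toy`, `long`, and `avg_by_type` are made up for the illustration and are not part of the notebook.

```python
import pandas as pd

# A few rows copied from the dataset preview, restricted to two stats
toy = pd.DataFrame({
    'Name': ['bulbasaur', 'charmander', 'dragonite', 'kingdra', 'vibrava'],
    'Type 1': ['grass', 'fire', 'dragon', 'water', 'ground'],
    'Type 2': ['poison', None, 'flying', 'dragon', 'dragon'],
    'HP': [45, 39, 91, 75, 50],
    'Attack': [49, 52, 134, 95, 70],
})

# Stack Type 1 and Type 2 into a single 'Type' column so that a dual-type
# pokemon contributes one row to each of its types; single types lose the empty row
long = toy.melt(id_vars=['Name', 'HP', 'Attack'],
                value_vars=['Type 1', 'Type 2'],
                value_name='Type').dropna(subset=['Type'])

# Average every stat per type, rounded to whole numbers like the table above
avg_by_type = long.groupby('Type')[['HP', 'Attack']].mean().round().astype(int)
print(avg_by_type)
```

#### On the full dataset the grouped result should line up with `avg_stat_by_type`, apart from the row order.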


##### Step 5: Create a function for showing y values above each container

In [173]:
# Having issues showing the value of the container right above the container itself
# Since the function 'bar_label()' doesn't want to work for some reason this solution is ripped straight from the internet
# link: https://www.statology.org/seaborn-barplot-show-values/
def show_values(axs, orient='v', space=1):
    def _single(ax):
        if orient == "v":
            for p in ax.patches:
                _x = p.get_x() + p.get_width() / 2
                _y = p.get_y() + p.get_height() + (p.get_height()*0.01)
                value = '{:.0f}'.format(p.get_height())
                ax.text(_x, _y, value, ha="center") 
        elif orient == "h":
            for p in ax.patches:
                _x = p.get_x() + p.get_width() + float(space)
                _y = p.get_y() + p.get_height() - (p.get_height()*0.5)
                value = '{:.0f}'.format(p.get_width())
                ax.text(_x, _y, value, ha="left")

    if isinstance(axs, np.ndarray):
        for idx, ax in np.ndenumerate(axs):
            _single(ax)
    else:
        _single(axs)
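
#### For comparison, here is a minimal sketch of the built-in route mentioned in the comment above, assuming Matplotlib 3.4 or newer (the release where `Axes.bar_label` was added). It uses plain Matplotlib bars and three Special Attack averages from the table above; whether this also works in the environment the notebook was written in is not something I can confirm.

```python
import matplotlib.pyplot as plt

# Three Special Attack averages taken from the table above
types = ['Fire', 'Fairy', 'Bug']
values = [85, 78, 57]

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(types, values)

# Every group of bars drawn on the axes is stored in ax.containers;
# bar_label writes each bar's height just above it
labels = []
for container in ax.containers:
    labels.extend(ax.bar_label(container, fmt='%.0f', padding=2))

print([label.get_text() for label in labels])
```

#### Looping over `ax.containers` rather than a single return value keeps the sketch usable when a plot holds more than one group of bars.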

##### Step 6: Create the plot

In [197]:
# One of my sources was a kaggle journal which I used as a guideline since our projects were so similar.
# They had a list of pokemon type colors already so I just used theirs and made it into a dictionary.
# link: https://www.kaggle.com/code/joaopdrg/discovering-the-best-pok-mon
# Type Color Palette
palette = {'Grass': '#7AC852', 'Fire': '#EF812E', 'Water': '#6991F0', 'Bug': '#A7B822', 'Normal': '#A8AA79', 'Poison': '#A0429F', 'Electric': '#F6D030', 'Ground': '#BCA23B', 'Fairy': '#FF65D5',
           'Fighting': '#C12F27', 'Psychic': '#F85887', 'Rock': '#F6D030', 'Ghost': '#70589A', 'Ice': '#9AD7D9', 'Dragon': '#6B3EE3', 'Dark': '#6D5947', 'Steel': '#B6B8D0', 'Flying': '#A991F0'}

# Drop down widget
stat_options = wdg.Dropdown(
    options= poke_stats_name,
    value='Special Attack',
    description='Stat:',
    disabled=False,
    )

# Generates plot of the average stat value of each type filtered by stat
def plot_graph(stat):
    plt.figure(figsize= (14,5))
    ax = sns.barplot(x= 'Type', 
                     y= stat, 
                     data= avg_stat_by_type, 
                     palette= palette,
                     order= avg_stat_by_type.sort_values(stat, ascending= False).Type)
    sns.set_style('dark')
    sns.set_context('paper', font_scale= 1.4)
    ax.set_xticklabels(ax.get_xticklabels(),rotation = 30)
    ax.set_title(f'Highest Average {stat} by Type', loc= 'left', pad= 10, fontweight='bold')
    ax.set_xlabel(xlabel='')
    show_values(ax)

##### Step 6.2: Show plot

In [198]:
display(wdg.interact(plot_graph, stat=stat_options))

interactive(children=(Dropdown(description='Stat:', index=3, options=('HP', 'Attack', 'Defense', 'Special Atta…

<function __main__.plot_graph(stat)>

#### From this plot we can conclude a few things about Fairy type pokemon across all games:
- Fairy types are, on average, the weakest physical attackers of any type
- Fairy types rank highest in Special Defense, where they are second only to Psychic types
- The best offensive stat for Fairy types is Special Attack, with an average of 78

#### After viewing this, many people will probably want to see how the stats of a specific pokemon type compare as a whole.

##### Step 7: Use the previous dataframe and transpose it to fit our needs

In [199]:
# Makes a new dataframe to use for the same graph but in a different format
avg_total_stat = avg_stat_by_type.transpose().reset_index()
avg_total_stat = avg_total_stat.iloc[1: , :]
# Capitalize the type names in place; later cells rely on type_list being capitalized
for i in range(len(type_list)):
    type_list[i] = type_list[i].capitalize()
col_names = ['Stat'] + list(type_list)
avg_total_stat.columns = col_names
avg_total_stat.head()

Unnamed: 0,Stat,Grass,Fire,Water,Bug,Normal,Poison,Electric,Ground,Fairy,Fighting,Psychic,Rock,Ghost,Ice,Dragon,Dark,Steel,Flying
1,HP,66,69,69,56,76,64,65,74,67,75,72,68,64,77,82,73,70,70
2,Attack,74,81,71,67,73,69,74,88,63,101,66,87,77,79,93,90,92,76
3,Defense,72,66,73,69,60,64,63,85,69,74,72,105,77,76,81,67,111,66
4,Special Attack,70,85,71,57,58,70,85,57,78,63,87,58,82,77,86,72,71,71
5,Special Defense,71,70,69,64,63,68,68,62,83,69,84,72,79,78,77,65,80,69
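
#### The conclusions drawn from the first plot can be checked against this table by ranking the types within each stat. The sketch below retypes three rows of the output above by hand (so it does not depend on the notebook state) and ranks them from highest to lowest; `averages`, `ranks`, and `fairy_ranks` are names made up for the illustration.

```python
import pandas as pd

types = ['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric', 'Ground', 'Fairy',
         'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice', 'Dragon', 'Dark', 'Steel', 'Flying']

# Values copied from the Attack, Special Attack, and Special Defense rows above
rows = {
    'Attack': [74, 81, 71, 67, 73, 69, 74, 88, 63, 101, 66, 87, 77, 79, 93, 90, 92, 76],
    'Special Attack': [70, 85, 71, 57, 58, 70, 85, 57, 78, 63, 87, 58, 82, 77, 86, 72, 71, 71],
    'Special Defense': [71, 70, 69, 64, 63, 68, 68, 62, 83, 69, 84, 72, 79, 78, 77, 65, 80, 69],
}
averages = pd.DataFrame(rows, index=types)

# Rank 1 is the highest average; tied types share the better rank
ranks = averages.rank(ascending=False, method='min').astype(int)
fairy_ranks = ranks.loc['Fairy']
print(fairy_ranks)
```

#### With 18 types in the table, the ranks come out as 18th in Attack, 6th in Special Attack, and 2nd in Special Defense for Fairy.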


##### Step 8: Create the plot just like the last one, but inversed

In [200]:
# Drop down widget (type_list was already capitalized in Step 7)
type_options = wdg.Dropdown(
    options= type_list,
    value='Fairy',
    description='Type:',
    disabled=False,
    )


# Same as the previous graph creation function, just tweaked so that it shows the average of all stats filtered by type
def graph_plot(type_):
    plt.figure(figsize= (10,5))
    ax = sns.barplot(x= 'Stat', 
                     y= type_, 
                     data= avg_total_stat,
                     color= palette[f'{type_}'])
    sns.set_style('dark')
    sns.set_context('paper', font_scale= 1.4)
    ax.set_xticklabels(ax.get_xticklabels())
    ax.set_title(f'Average {type_} Type Stats', loc= 'left', pad= 10, fontweight='bold')
    ax.set_xlabel(xlabel='')
    ax.set_ylabel(ylabel='')
    show_values(ax)

##### Step 8.2: Show plot

In [201]:
display(wdg.interact(graph_plot, type_=type_options))

interactive(children=(Dropdown(description='Type:', index=8, options=('Grass', 'Fire', 'Water', 'Bug', 'Normal…

<function __main__.graph_plot(type_)>

#### This plot confirms that the average Fairy type has Special Attack and Special Defense as its highest stats.
#### With this information we can hypothesize that the best offensive option for a Fairy type pokemon in Pokemon Sword will most likely be a Special Attacker.

### **What is the best offensive option for a fairy type pokemon in pokemon sword?**

#### In order to answer this, we need to know which pokemon are in Pokemon Sword. This is information which my general dataset does not include, so it will be necessary to webscrape pokemon information from established sites. The ones used for this project are serebii.net and pokemondb.net.

##### Step 9: Defining quick links to use for our webscraping endeavors

In [202]:
# Downloads the HTML of each webpage for us to use, so I only want to run this cell occasionally
serebii_url = 'https://www.serebii.net/swordshield/galarpokedex.shtml'
serebii_soup = bs(rq.get(serebii_url).text, 'html.parser')
pokemondb_url = 'https://pokemondb.net/pokedex/game/sword-shield'
pokedb_soup = bs(rq.get(pokemondb_url).text, 'html.parser')
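
#### Since the pages should only be downloaded occasionally, one option is to save each page to disk the first time and read the saved copy afterwards. The sketch below is an assumption about how that could look, not what this notebook does; `fetch_html` and the cache file name in the usage note are hypothetical.

```python
from pathlib import Path

import requests


def fetch_html(url, cache_path, timeout=10):
    """Return the page HTML, reusing a saved copy when one exists (hypothetical helper)."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Saved copy found, so the site is not contacted again
        return cache_path.read_text(encoding='utf-8')

    response = requests.get(url, timeout=timeout)
    # Stop on 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    cache_path.write_text(response.text, encoding='utf-8')
    return response.text
```

#### It could then be used as `serebii_soup = bs(fetch_html(serebii_url, 'serebii_galar.html'), 'html.parser')`, and deleting the saved file forces a fresh download.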

##### Step 11: Scrape a table from serebii to prepare for our dataframe

In [203]:
# Gets HTML version of Galar Pokedex
galar_table = serebii_soup.find_all('td', class_= 'fooinfo')

# Turns HTML Contents into list
poke_info = []
for item in galar_table:
    poke_info.append(item.text.strip())

##### Step 12: Create, clean, and organize our dataframe

In [204]:
# Turns List into a structured array and then into dataframe
poke_info = np.array(poke_info)
galar_df = pd.DataFrame(poke_info.reshape(400,11))

# Cleaning and Organizing Dataframe
col_names = ['ID', 'NULL_1', 'Name', 'Abilities', 'NULL_2', 'HP', 'Attack', 'Defense', 'Special Attack', 'Special Defense', 'Speed']
galar_df.columns = col_names
galar_df['ID'] = range(1,401)
# Only English names: drop every character at or above U+3000, which removes the Japanese part of each name
english_name = []
for string in galar_df['Name']:
    english_name.append(''.join(filter(lambda character: ord(character) < 0x3000, string)))
galar_df['Name'] = english_name
    
galar_df.head()

Unnamed: 0,ID,NULL_1,Name,Abilities,NULL_2,HP,Attack,Defense,Special Attack,Special Defense,Speed
0,1,,Grookey,Overgrow Grassy Surge,,50,65,50,40,40,65
1,2,,Thwackey,Overgrow Grassy Surge,,70,85,70,55,60,80
2,3,,Rillaboom,Overgrow Grassy Surge,,100,125,90,60,70,85
3,4,,Scorbunny,Blaze Libero,,50,71,40,40,40,69
4,5,,Raboot,Blaze Libero,,65,86,60,55,60,94
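
#### Step 12 hard-codes the table shape as 400 rows of 11 cells. If the page layout ever changes, that reshape fails with a fairly opaque error, so a sketch of a more defensive version is shown below. The helper name `cells_to_rows` is hypothetical, and the first cell of each demo row is a placeholder since the raw ID text is not shown in this notebook.

```python
import numpy as np


def cells_to_rows(cells, n_cols):
    """Reshape a flat list of table cells into rows of n_cols, failing loudly on a mismatch."""
    if len(cells) % n_cols != 0:
        raise ValueError(f'{len(cells)} cells cannot be split into rows of {n_cols}')
    # -1 lets NumPy work out the number of rows from the list length
    return np.array(cells, dtype=object).reshape(-1, n_cols)


# Two rows in the same 11-cell layout as the scraped table (values from the output above)
cells = ['1', '', 'Grookey', 'Overgrow Grassy Surge', '', '50', '65', '50', '40', '40', '65',
         '2', '', 'Thwackey', 'Overgrow Grassy Surge', '', '70', '85', '70', '55', '60', '80']
rows = cells_to_rows(cells, 11)
print(rows.shape)
```

#### With this, `galar_df['ID'] = range(1, len(galar_df) + 1)` would follow the row count instead of assuming 400.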


##### Step 13: Get list of Galar pokemon types, and introduce into our dataframe

In [205]:
# Gets HTML version of Galar Pokedex, and turns contents to list
galar_table = pokedb_soup.find_all('span', class_= 'infocard-lg-data text-muted')

# Gets the Pokemon Types from the list
galar_type_list = []
for item in galar_table:
    galar_type_list.append(item.find_all('small')[1].text)

## Since Type 1 and Type 2 are joined together into the same string, we need to separate them
# This eliminates any special characters or spaces in our list
galar_type_list_2 = []
for string in galar_type_list:
    galar_type_list_2.append(''.join(char for char in string if char.isalnum()))

# This finally separates our types list into 2 lists of type 1 and 2, with a missing type 2 recorded as 0
type_1 = []
type_2 = []
for item in galar_type_list_2:
    if len(re.findall(r'[A-Z]',item)) == 2:
        item_list = [s for s in re.split("([A-Z][^A-Z]*)", item) if s]
        type_1.append(item_list[0])
        type_2.append(item_list[1])
    else:
        type_1.append(item)
        type_2.append(0)
        
galar_df['Type 1'] = type_1
galar_df['Type 2'] = type_2
galar_df['Type 2'] = galar_df['Type 2'].replace({'0':np.nan, 0:np.nan})
galar_df = galar_df[['ID', 'Name', 'Type 1', 'Type 2', 'HP', 'Attack', 'Defense', 'Special Attack', 'Special Defense', 'Speed']]

galar_df.head()

Unnamed: 0,ID,Name,Type 1,Type 2,HP,Attack,Defense,Special Attack,Special Defense,Speed
0,1,Grookey,Grass,,50,65,50,40,40,65
1,2,Thwackey,Grass,,70,85,70,55,60,80
2,3,Rillaboom,Grass,,100,125,90,60,70,85
3,4,Scorbunny,Fire,,50,71,40,40,40,69
4,5,Raboot,Fire,,65,86,60,55,60,94
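
#### The splitting logic in Step 13 relies on each type name starting with a capital letter. A compact sketch of the same idea is shown below, using `re.findall` to pull out each capitalized word directly. The function name `split_types` is hypothetical, and it returns `None` rather than 0 for a missing second type.

```python
import re


def split_types(joined):
    """Split a string such as 'DragonGhost' into (type_1, type_2); type_2 is None for single types."""
    # Each type name is one capital letter followed by lowercase letters
    parts = re.findall(r'[A-Z][a-z]+', joined)
    if not parts:
        raise ValueError(f'no type name found in {joined!r}')
    return parts[0], (parts[1] if len(parts) > 1 else None)


print(split_types('DragonGhost'))
print(split_types('Fire'))
```

#### Applied to the whole list, `type_1, type_2 = zip(*map(split_types, galar_type_list_2))` would give the two columns in one pass.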


#### Now we almost have a dataframe suited to the task given to us. The only thing left to consider is that we only want pokemon that can be caught in Pokemon Sword. Currently our dataframe contains pokemon from both Sword and Shield.

##### Step 14: Get a list of pokemon exclusive to Pokemon Shield, and delete them from our dataframe.

In [206]:
# This is such a short list that using an API or webscraping would not be as efficient as a quick google search and manually typing it in.
shield_exc = ['Goomy', 'Sliggoo', 'Goodra', 'Larvitar', 'Pupitar', 'Tyranitar', 'Ponyta', 'Rapidash', 'Solosis', 'Duosion', 'Reuniclus', 'Drampa', 'Vullaby', 'Mandibuzz', 'Corsola', 'Cursola', 'Lotad', 'Lombre', 'Ludicolo', 'Sableye', 'Lunatone', 'Croagunk', 'Toxicroak', 'Spritzee', 'Aromatisse', 'Oranguru', 'Eiscue', 'Appletun']
sword_df = galar_df[~galar_df['Name'].isin(shield_exc)]

display(sword_df)


Unnamed: 0,ID,Name,Type 1,Type 2,HP,Attack,Defense,Special Attack,Special Defense,Speed
0,1,Grookey,Grass,,50,65,50,40,40,65
1,2,Thwackey,Grass,,70,85,70,55,60,80
2,3,Rillaboom,Grass,,100,125,90,60,70,85
3,4,Scorbunny,Fire,,50,71,40,40,40,69
4,5,Raboot,Fire,,65,86,60,55,60,94
...,...,...,...,...,...,...,...,...,...,...
395,396,Drakloak,Dragon,Ghost,68,80,50,60,50,102
396,397,Dragapult,Dragon,Ghost,88,120,75,100,75,142
397,398,Zacian,Fairy,Steel,92,130,115,80,115,138
398,399,Zamazenta,Fighting,Steel,92,130,115,80,115,138
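
#### Because the exclusion list is typed by hand, a misspelled name would silently stay in the dataframe. The sketch below shows one way to surface that, by reporting any listed name that matched nothing. `drop_by_name` and the demo data are hypothetical, and the typo in the demo list is deliberate.

```python
import pandas as pd


def drop_by_name(df, names):
    """Return (filtered copy, listed names that matched no row) so typos become visible."""
    unmatched = sorted(set(names) - set(df['Name']))
    kept = df[~df['Name'].isin(names)].copy()
    return kept, unmatched


demo = pd.DataFrame({'Name': ['Grookey', 'Goomy', 'Eiscue', 'Zacian']})

# 'Sligoo' is a deliberate misspelling to show what the check reports
kept, unmatched = drop_by_name(demo, ['Goomy', 'Eiscue', 'Sligoo'])
print(kept['Name'].tolist(), unmatched)
```

#### An empty `unmatched` list means every name in the exclusion list was found in the dataframe.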


##### Step 15: Delete the legendary pokemon from the dataframe

In [207]:
# Get list of legendary pokemon from premade dataset
poke_df['Name'] = poke_df['Name'].str.capitalize()
all_legendary_poke = poke_df[poke_df['Legendary'] == True]['Name'].values.tolist()

# Delete them from our sword dataframe (copy so the assignments below do not act on a slice)
sword_df = sword_df[~sword_df['Name'].isin(all_legendary_poke)].copy()

# Turns all stat values into integers
for stat in poke_stats_name:
    sword_df[stat] = sword_df[stat].astype(int)

##### Step 16: Create the plot

In [210]:
# Create a dictionary of dataframes based on types (type_list is capitalized by this point, matching the scraped type names)
sword_dict = {f'{type_name}_df': sword_df.loc[(sword_df['Type 1'] == type_name) | (sword_df['Type 2'] == type_name)] for type_name in type_list}

# Generates plot of best pokemon by stat and type
def sword_plot_graph(stat, type_):
    pal = sns.color_palette(f'light:{palette[type_]}', 15)
    pal.reverse()
    plt.figure(figsize= (14,5))
    ax = sns.barplot(x = 'Name', 
                     y = stat,
                     data = sword_dict[f'{type_}_df'],
                     palette = pal,
                     order= sword_dict[f'{type_}_df'].sort_values(stat, ascending= False).Name[0:10])
    sns.set_style('dark')
    sns.set_context('paper', font_scale= 1.4)
    ax.set_xticklabels(ax.get_xticklabels(),rotation = 30)
    ax.set_title(f'Non-legendary {type_} type Pokemon with the Highest {stat} in Pokemon Sword', loc= 'left', pad= 10, fontweight='bold')
    ax.set_xlabel(xlabel='')
    show_values(ax)

##### Step 16.2: Show plot

In [211]:
display(wdg.interact(sword_plot_graph, stat=stat_options, type_=type_options))

interactive(children=(Dropdown(description='Stat:', index=3, options=('HP', 'Attack', 'Defense', 'Special Atta…

<function __main__.sword_plot_graph(stat, type_)>

#### Using this plot we can confirm that the Fairy type pokemon with the highest offensive stat in Pokemon Sword is a Special Attacker: Hatterene.
#### A closer look at both the Special Attack and Attack plots shows that Grimmsnarl appears on both, with a respectable 95 Special Attack and an oppressive 120 Attack. Hatterene has a similar but inverted distribution, with a frightening 136 Special Attack and a menacing 90 Attack. Combined, that is 226 offensive stat points for Hatterene against 215 for Grimmsnarl, a small difference of 11 points in favor of Hatterene.
#### For a purely offensive option, Hatterene is the best Fairy type pokemon in Pokemon Sword.
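
#### The comparison between the two pokemon comes down to simple arithmetic on the four stats quoted above, sketched here so the totals are easy to check. The variable names are made up for the illustration.

```python
# Offensive base stats quoted in the text above
offense = {
    'Hatterene': {'Attack': 90, 'Special Attack': 136},
    'Grimmsnarl': {'Attack': 120, 'Special Attack': 95},
}

# Sum of both offensive stats, and the single best offensive stat, per pokemon
combined = {name: sum(stats.values()) for name, stats in offense.items()}
best_single = {name: max(stats.values()) for name, stats in offense.items()}

gap = combined['Hatterene'] - combined['Grimmsnarl']
print(combined, best_single, gap)
```

#### Hatterene leads on both measures: 226 against 215 combined, and 136 against 120 for the single best stat.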

### **Conclusion**

#### In general, Fairy type pokemon are strongest in the two special stats, Special Attack and Special Defense, with averages of 78 and 83 respectively. Fairy types rank 6th overall in average Special Attack and 2nd in average Special Defense (losing only to Psychic types). In every other stat, Fairy types fall below the 50th percentile. So my little sister is right to be concerned that Fairy types may be weaker in battle compared to other types, but there are still some viable options out there among Special Attackers. Looking at the pokemon available to her in Pokemon Sword, the best purely offensive option is Hatterene, with 136 Special Attack and 90 Attack. The runner-up, Grimmsnarl, is short by 11 combined offensive stat points (95 SPA, 120 ATT) and, most notably, is a physical attacker. This hints that there are more variables, such as moveset, other stats, and strategy, that go into choosing a strong pokemon which best matches your team.

### **Limitations and Future Work**

#### Some of the issues I had with this project came from the API I used to build this dataset, which was not complete and accurate. Most notably, it did not have the different forms some pokemon have which change their type and stats. I also could not find any information about mega-evolutions or gigantamax. Another piece of information I failed to extract was whether a pokemon appears in certain games or not. This led me to realize that this information is not readily available to the public, so making a database of it may prove useful to many others like myself. Though I would have liked to widen the scope of the project by including things such as pokemon trainers, gym leaders, movesets, and team synergy when considering which pokemon to use, that would have been too big of a bite for my first project. In the future I would like to do a similar project exploring pokemon strategies in the competitive environment of pokemon showdown.