## 2021: Week 25 - The Worst Pokémon

Often our stakeholders can have very niche knowledge and their requests may baffle us. Luckily, as data preppers we have the tools to tackle any dataset, no matter how bizarre. Yes, Carl and Tom have allowed me to create another Pokémon challenge!

The idea came from a YouTube video I stumbled across, "Who Is Pokemon’s LEAST Favorite Pokémon?". The logical steps applied in the video were screaming out for a Preppin' Data challenge to verify the results! Be warned, though: the answer to this challenge differs from the video's conclusion because the datasets differ.

### Input
We have multiple inputs for this challenge:
1. Gen 1 Pokémon (from Pokémon Database)
2. Evolution Group (from Bulbapedia - also see Preppin' Data 2021 week 10) 
3. Evolutions (Bulbapedia)
4. Mega Evolutions (Pokémon DB)
5. Alolan Pokémon (Pokémon DB)
6. Galarian Pokémon (Pokémon DB)
7. Gigantamax Pokémon (from IGN)
8. Unattainable Pokémon in Sword & Shield (Pokémon DB)
9. Anime appearances for Pokémon (First 116 episodes webscraped from Bulbapedia)

### Challenge
Remember: once a Pokémon meets a condition, their whole evolution group is excluded from consideration. For example, since there is a Mega Beedrill, Weedle and Kakuna cannot be the worst Pokémon since they all belong to the Weedle evolution group.

- Input the data
- Clean up the list of Gen 1 Pokémon so we have 1 row per Pokémon
- Clean up the Evolution Group input so that we can join it to the Gen 1 list 
    - Filter out Starter and Legendary Pokémon
- Using the Evolutions input, exclude any Pokémon that evolves from a Pokémon that is not part of Gen 1 or can evolve into a Pokémon outside of Gen 1
- Exclude any Pokémon with a mega evolution, Alolan, Galarian or Gigantamax form
- It's not possible to catch certain Pokémon in the most recent games. These are the only ones we will consider from this point on
- We're left with 10 evolution groups. Rank them in ascending order of how many times they've appeared in the anime to see who the worst Pokémon is!
- Output the data
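
The "whole evolution group" rule from the reminder above can be sketched as follows. This is only a minimal illustration on made-up rows, not the challenge inputs; the `Name` column, the `flagged` set and the toy mapping are assumptions made for the example.

```python
import pandas as pd

# Toy mapping of Pokémon to evolution group (illustrative rows only)
groups = pd.DataFrame({
    "Name": ["Weedle", "Kakuna", "Beedrill", "Doduo", "Dodrio"],
    "Evolution Group": ["Weedle", "Weedle", "Weedle", "Doduo", "Doduo"],
})

# Pokémon that meet an exclusion condition, e.g. having a Mega form
flagged = {"Beedrill"}

# Any group containing a flagged Pokémon is excluded as a whole
excluded_groups = set(groups.loc[groups["Name"].isin(flagged), "Evolution Group"])
remaining = groups[~groups["Evolution Group"].isin(excluded_groups)]
remaining_names = remaining["Name"].tolist()
```

Here Weedle and Kakuna drop out along with Beedrill, leaving only the Doduo group.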

### Output
![img](https://lh3.googleusercontent.com/-8kUZERJNZZY/YMzNoNRQwvI/AAAAAAAAA1M/pFn3S00eEoI2WeFPyEIJ17U2tvx-X4uYQCLcBGAsYHQ/image.png)

3 fields
   - Worst Pokémon
   - Evolution Group
   - Appearances

10 rows (11 including headers)

In [209]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Input the data

In [210]:
# Passing a list to sheet_name returns a dict of DataFrames keyed by sheet name
sheet_names = ["Gen 1", "Evolution Group", "Evolutions", "Mega Evolutions", "Alolan",
               "Galarian", "Gigantamax", "Unattainable in Sword & Shield", "Anime Appearances"]
data = pd.read_excel("./data/2021W25 Input.xlsx", sheet_name=sheet_names)

### Clean up the list of Gen 1 Pokémon so we have 1 row per Pokémon

In [211]:
gen_1 = data["Gen 1"].copy()
gen_1.shape

(218, 10)

In [212]:
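# Drop exact duplicate rows; rows with a missing Name remain here and are
# dropped later via Name.dropna()
gen_1 = gen_1.drop_duplicates()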
gen_1.shape

(162, 10)
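
One possible way to reach one row per Pokémon in a single step is sketched below. It assumes the extra rows are continuation rows with no `Name`, which is consistent with the later `Name.dropna()` call but has not been checked against the sheet; the toy columns and values are made up.

```python
import pandas as pd
import numpy as np

# Toy stand-in for the Gen 1 sheet: continuation rows carry no Name (assumed layout)
toy = pd.DataFrame({
    "#": [1, 1, 4, 4, 4],
    "Name": ["Bulbasaur", np.nan, "Charmander", "Charmander", np.nan],
    "Type": ["Grass", "Poison", "Fire", "Fire", np.nan],
})

# Drop rows without a Name, then keep the first row for each remaining Name
one_per_pokemon = toy.dropna(subset=["Name"]).drop_duplicates(subset="Name")
names = one_per_pokemon["Name"].tolist()
```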

### Clean up the Evolution Group input so that we can join it to the Gen 1 list
- Filter out Starter and Legendary Pokémon

In [213]:
evolution_group = data["Evolution Group"].copy()
evolution_group.shape

(158, 4)

In [214]:
evolution_group.head()

,Evolution Group,#,Starter?,Legendary?
0,Bulbasaur,1,1,0
1,Bulbasaur,2,1,0
2,Bulbasaur,3,1,0
3,Charmander,4,1,0
4,Charmander,5,1,0


In [215]:
evolution_group = evolution_group[(evolution_group["Starter?"] == 0) & (evolution_group["Legendary?"] == 0)]
evolution_group.shape

(134, 4)

In [216]:
evolution_group.sample(10)

,Evolution Group,#,Starter?,Legendary?
36,Vulpix,37,0,0
107,Lickitung,108,0,0
67,Machop,68,0,0
83,Doduo,84,0,0
116,Horsea,117,0,0
82,Farfetch'd,83,0,0
49,Diglett,50,0,0
111,Rhyhorn,112,0,0
38,Jigglypuff,39,0,0
22,Ekans,23,0,0
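
The heading mentions joining the Evolution Group input to the Gen 1 list. A minimal sketch of such a join is shown here on toy rows; it assumes both tables share the Pokédex number column `#`, which is visible in the Evolution Group input but is an assumption for the Gen 1 sheet.

```python
import pandas as pd

# Toy stand-ins for the two inputs (illustrative rows only)
gen_1_toy = pd.DataFrame({"#": [13, 14, 84], "Name": ["Weedle", "Kakuna", "Doduo"]})
groups_toy = pd.DataFrame({"#": [13, 14, 84], "Evolution Group": ["Weedle", "Weedle", "Doduo"]})

# Inner join keeps only Pokémon present in both tables
joined = gen_1_toy.merge(groups_toy, on="#", how="inner")

# Lookup from Pokémon name to its evolution group
group_of = dict(zip(joined["Name"], joined["Evolution Group"]))
```

A name-to-group lookup like `group_of` makes it possible to apply later exclusions to every member of a group rather than to single rows.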


### Using the Evolutions input, exclude any Pokémon that evolves from a Pokémon that is not part of Gen 1 or can evolve into a Pokémon outside of Gen 1

In [217]:
evolutions = data["Evolutions"].copy()
evolutions

,Evolving from,Evolving to,Level,Condition,Evolution Type
0,Bulbasaur,Ivysaur,16.0,,Level
1,Ivysaur,Venusaur,32.0,,Level
2,Charmander,Charmeleon,16.0,,Level
3,Charmeleon,Charizard,36.0,,Level
4,Squirtle,Wartortle,16.0,,Level
...,...,...,...,...,...
385,Chingling,Chimecho,,Nighttime,Happiness
386,Buneary,Lopunny,,,Happiness
387,Riolu,Lucario,,Daytime,Happiness
388,Woobat,Swoobat,,,Happiness


In [218]:
gen_1_pokemon = gen_1.Name.dropna()
gen_1_pokemon

0       Bulbasaur
2         Ivysaur
4        Venusaur
6      Charmander
7      Charmeleon
          ...    
212       Dratini
213     Dragonair
214     Dragonite
216        Mewtwo
217           Mew
Name: Name, Length: 151, dtype: object

In [219]:
# exclude any Pokémon that evolves from a Pokémon that is not part of Gen 1
not_gen_1 = ~evolutions["Evolving from"].isin(gen_1_pokemon)
not_gen_1_idx = evolutions[not_gen_1].index

In [220]:
# can evolve into a Pokémon outside of Gen 1
evolve_out_gen_1 = ~evolutions["Evolving to"].isin(gen_1_pokemon)
evolve_out_gen_1_idx = evolutions[evolve_out_gen_1].index

In [221]:
exclude_idx = not_gen_1_idx.union(evolve_out_gen_1_idx)
evolutions = evolutions.drop(exclude_idx, axis=0)
evolutions.shape

(72, 5)
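
The same row-level filter can also be written as a single boolean mask instead of a union of index labels. The sketch below uses made-up rows and a hypothetical `gen_1_names` list; it is an alternative formulation, not the notebook's code.

```python
import pandas as pd

# Hypothetical list of Gen 1 names and a toy Evolutions table
gen_1_names = ["Pikachu", "Raichu", "Eevee", "Vaporeon"]
evo = pd.DataFrame({
    "Evolving from": ["Pichu", "Pikachu", "Eevee", "Eevee"],
    "Evolving to": ["Pikachu", "Raichu", "Vaporeon", "Espeon"],
})

# Keep a row only when both ends of the evolution are Gen 1 Pokémon
both_gen_1 = evo["Evolving from"].isin(gen_1_names) & evo["Evolving to"].isin(gen_1_names)
kept = evo[both_gen_1].reset_index(drop=True)
```

Note that a row-level filter leaves the Eevee to Vaporeon row in place even though Eevee also evolves into Espeon in this toy table, so excluding a Pokémon or its whole group needs a further step.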

### Exclude any Pokémon with a mega evolution, Alolan, Galarian or Gigantamax form

In [223]:
mega = data["Mega Evolutions"].copy()
alolan = data["Alolan"].copy()
galarian = data["Galarian"].copy()
gigant = data["Gigantamax"].copy()

In [224]:
four_types = pd.concat([mega, alolan, galarian, gigant], axis=0)["Name"]
four_types.shape

(75,)

In [225]:
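# Keep the second word of each form name, e.g. "Alolan Rattata" -> "Rattata".
# Note: a multi-word name such as "Mr. Mime" would be truncated to "Mr."
four_types = four_types.map(lambda x: x.split(" ")[1])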
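
If the form names follow the pattern "label + base name (+ X/Y)", a regex-based alternative avoids depending on word position. This is a sketch on example strings; the exact spelling of the names in the input sheets is an assumption.

```python
import pandas as pd

# Example form names (assumed format, not read from the input sheets)
forms = pd.Series(["Mega Charizard X", "Alolan Rattata", "Galarian Mr. Mime", "Gigantamax Pikachu"])

# Remove a leading form label, then a trailing " X" / " Y" Mega suffix
base_names = (forms.str.replace(r"^(?:Mega|Alolan|Galarian|Gigantamax)\s+", "", regex=True)
                   .str.replace(r"\s+[XY]$", "", regex=True))
result = base_names.tolist()
```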

In [226]:
four_types_1_idx = evolutions[evolutions["Evolving to"].isin(four_types)].index
four_types_2_idx = evolutions[evolutions["Evolving from"].isin(four_types)].index
four_types_idx = four_types_1_idx.union(four_types_2_idx)

In [227]:
evolutions = evolutions.drop(four_types_idx, axis=0)
evolutions.shape

(44, 5)

### It's not possible to catch certain Pokémon in the most recent games. These are the only ones we will consider from this point on

In [228]:
unattainable = data["Unattainable in Sword & Shield"].copy().Name
unattainable

0         Weedle
1         Pidgey
2        Rattata
3        Spearow
4          Ekans
5          Paras
6        Venonat
7         Mankey
8     Bellsprout
9        Geodude
10         Doduo
11          Seel
12        Grimer
13       Drowzee
14       Voltorb
Name: Name, dtype: object

In [229]:
evolutions = evolutions[evolutions["Evolving from"].isin(unattainable)]
evolutions.shape

(12, 5)

### We're left with 10 evolution groups. Rank them in ascending order of how many times they've appeared in the anime to see who the worst Pokémon is!

In [230]:
# Keep one row per evolution group (the first row of each group)
evolution_group = evolution_group.drop_duplicates(subset="Evolution Group")

In [231]:
evolutions = evolutions.merge(evolution_group, how="left", left_on="Evolving from", right_on="Evolution Group")
evolutions

,Evolving from,Evolving to,Level,Condition,Evolution Type,Evolution Group,#,Starter?,Legendary?
0,Weedle,Kakuna,7.0,,Level,Weedle,13,0,0
1,Pidgey,Pidgeotto,18.0,,Level,Pidgey,16,0,0
2,Spearow,Fearow,20.0,,Level,Spearow,21,0,0
3,Ekans,Arbok,22.0,,Level,Ekans,23,0,0
4,Paras,Parasect,24.0,,Level,Paras,46,0,0
5,Venonat,Venomoth,31.0,,Level,Venonat,48,0,0
6,Mankey,Primeape,28.0,,Level,Mankey,56,0,0
7,Bellsprout,Weepinbell,21.0,,Level,Bellsprout,69,0,0
8,Doduo,Dodrio,31.0,,Level,Doduo,84,0,0
9,Seel,Dewgong,34.0,,Level,Seel,86,0,0


In [232]:
appearances = data["Anime Appearances"].copy()
appearances = appearances.groupby(["Pokemon"])["Episode"].count().sort_values(ascending=False)
appearances = appearances.reset_index().rename(columns={"Episode": "Appearances"})
appearances

,Pokemon,Appearances
0,Pikachu,128
1,Meowth,119
2,Togepi,66
3,Bulbasaur,66
4,Squirtle,66
...,...,...
148,Wigglytuff,1
149,Hypno,1
150,Omastar,1
151,Venustoise,1
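
The count above is per Pokémon. If appearances should instead be totalled per evolution group, one option is to map each Pokémon to its group before counting. The sketch below assumes a hypothetical `group_of` lookup and counts distinct episodes; the episode labels are made up, and whether distinct episodes is the intended measure is also an assumption.

```python
import pandas as pd

# Toy appearances table using the same column names as the input
appearances_toy = pd.DataFrame({
    "Pokemon": ["Doduo", "Dodrio", "Dodrio", "Seel", "Dewgong"],
    "Episode": ["EP001", "EP001", "EP002", "EP003", "EP004"],
})

# Hypothetical lookup from Pokémon name to evolution group
group_of = {"Doduo": "Doduo", "Dodrio": "Doduo", "Seel": "Seel", "Dewgong": "Seel"}

# Map each Pokémon to its group, then count distinct episodes per group
per_group = (appearances_toy.assign(**{"Evolution Group": appearances_toy["Pokemon"].map(group_of)})
                            .groupby("Evolution Group")["Episode"]
                            .nunique()
                            .rename("Appearances")
                            .reset_index())
counts = dict(zip(per_group["Evolution Group"], per_group["Appearances"]))
```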


In [233]:
evolutions = (evolutions.merge(appearances, how="left", left_on="Evolution Group", right_on="Pokemon")
                        .drop(["Evolving from", "Evolving to", "Level", "Condition", "Evolution Type", "#",
                               "Starter?", "Legendary?", "Pokemon"], axis=1)
                        .sort_values(by="Appearances", ascending=True))
evolutions["The Worst Pokemon"] = evolutions["Appearances"].rank(method="min").astype(int)
evolutions = evolutions.set_index("The Worst Pokemon")
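
The `method="min"` argument decides how ties are ranked. A small sketch, using four of the counts from the result table, compares it with `method="dense"`:

```python
import pandas as pd

# Appearance counts for four evolution groups, two of them tied
counts = pd.Series([3, 3, 5, 7], index=["Doduo", "Drowzee", "Weedle", "Seel"])

# method="min": tied values share the lowest rank and the next rank is skipped
min_rank = counts.rank(method="min").astype(int).tolist()

# method="dense": tied values share a rank and no rank is skipped
dense_rank = counts.rank(method="dense").astype(int).tolist()
```

With `method="min"` the ranks are 1, 1, 3, 4, which matches the gaps in the ranking produced here; `method="dense"` would give 1, 1, 2, 3.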

In [236]:
evolutions

The Worst Pokemon,Evolution Group,Appearances
1,Doduo,3
1,Drowzee,3
3,Weedle,5
4,Seel,7
5,Paras,10
6,Ekans,11
7,Mankey,12
7,Bellsprout,12
7,Voltorb,12
10,Spearow,14


In [235]:
evolutions.to_csv("./output/Week25_output.csv")