## 2021: Week 25 - The Worst Pokémon

Our stakeholders often have very niche knowledge, and their requests may baffle us. Luckily, as data preppers we have the tools to tackle any dataset, no matter how bizarre. Yes, Carl and Tom have allowed me to create another Pokémon challenge!

The idea came from a YouTube video that I stumbled across: "Who Is Pokemon’s LEAST Favorite Pokémon?" The logical steps applied in the video felt like they were screaming out for a Preppin' Data challenge to verify the results! Be warned, though: the answer to this challenge will differ from the video's conclusion, because the datasets differ.

### Input
We have multiple inputs for this challenge:
1. Gen 1 Pokémon (from Pokémon Database)
2. Evolution Group (from Bulbapedia - also see Preppin' Data 2021 week 10) 
3. Evolutions (Bulbapedia)
4. Mega Evolutions (Pokémon DB)
5. Alolan Pokémon (Pokémon DB)
6. Galarian Pokémon (Pokémon DB)
7. Gigantamax Pokémon (from IGN)
8. Unattainable Pokémon in Sword & Shield (Pokémon DB)
9. Anime appearances for Pokémon (First 116 episodes webscraped from Bulbapedia)

### Challenge
Remember: once a Pokémon meets an exclusion condition, its whole evolution group is excluded from consideration. For example, since there is a Mega Beedrill, Weedle and Kakuna cannot be the worst Pokémon either, because all three belong to the Weedle evolution group.

- Input the data
- Clean up the list of Gen 1 Pokémon so we have 1 row per Pokémon
- Clean up the Evolution Group input so that we can join it to the Gen 1 list 
    - Filter out Starter and Legendary Pokémon
- Using the Evolutions input, exclude any Pokémon that evolves from, or can evolve into, a Pokémon outside of Gen 1
- Exclude any Pokémon with a mega evolution, Alolan, Galarian or Gigantamax form
- It's not possible to catch certain Pokémon in the most recent games (see the Unattainable Pokémon in Sword & Shield input). These are the only ones we will consider from this point on
- We're left with 10 evolution groups. Rank them in ascending order of how many times they've appeared in the anime to see who the worst Pokémon is!
- Output the data
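
The group-level exclusion rule described above can be sketched in pandas as follows. This is a minimal illustration on toy data, not the challenge solution: the frames `pokemon` and `mega` and their column names are assumptions made up for the example.

```python
import pandas as pd

# Toy data: names and columns are illustrative assumptions,
# not the real challenge inputs.
pokemon = pd.DataFrame({
    "Name": ["Weedle", "Kakuna", "Beedrill", "Ditto", "Seel", "Dewgong"],
    "Evolution Group": ["Weedle", "Weedle", "Weedle", "Ditto", "Seel", "Seel"],
})
mega = pd.DataFrame({"Name": ["Beedrill"]})  # Pokémon that meet a condition

# Step 1: find every evolution group with at least one flagged member.
is_flagged = pokemon["Name"].isin(mega["Name"])
flagged_groups = pokemon.loc[is_flagged, "Evolution Group"].unique()

# Step 2: drop the whole group, not just the flagged Pokémon.
remaining = pokemon[~pokemon["Evolution Group"].isin(flagged_groups)]
remaining = remaining.reset_index(drop=True)
print(remaining)
```

Filtering on the group rather than on the individual name is what removes Weedle and Kakuna along with Beedrill.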

### Output
![img](https://lh3.googleusercontent.com/-8kUZERJNZZY/YMzNoNRQwvI/AAAAAAAAA1M/pFn3S00eEoI2WeFPyEIJ17U2tvx-X4uYQCLcBGAsYHQ/image.png)

- 3 fields
    - Worst Pokémon
    - Evolution Group
    - Appearances
- 10 rows (11 including headers)

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
### Input the data

In [5]:
# Passing a list to sheet_name returns a dict of DataFrames keyed by sheet name
sheets = ["Gen 1", "Evolution Group", "Evolutions",
          "Mega Evolutions", "Alolan", "Galarian",
          "Gigantamax", "Unattainable in Sword & Shield",
          "Anime Appearances"]
data = pd.read_excel("./data/2021W25 Input.xlsx", sheet_name=sheets)

In [None]:
### Clean up the list of Gen 1 Pokémon so we have 1 row per Pokémon

In [9]:
gen_1 = data["Gen 1"].copy()
gen_1.shape

(218, 10)

In [10]:
gen_1 = gen_1.drop_duplicates()
gen_1.shape

(162, 10)
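
`drop_duplicates()` with no arguments only removes rows that match in every column. If some repeats differ only by noise such as stray whitespace (an assumption; the source does not show which rows remain), a key-based check can confirm whether the result really is one row per Pokémon. The sketch below uses toy data with hypothetical `#` and `Name` columns.

```python
import pandas as pd

# Toy data: "#" and "Name" are assumed column names, and the
# trailing space is a made-up example of a near-duplicate.
toy = pd.DataFrame({
    "#": [1, 1, 2, 3, 3],
    "Name": ["Bulbasaur", "Bulbasaur", "Ivysaur", "Venusaur", "Venusaur "],
})

# Exact-row deduplication keeps the trailing-space variant.
exact = toy.drop_duplicates()

# Normalise the text first, then deduplicate on the identifying columns.
toy["Name"] = toy["Name"].str.strip()
by_key = toy.drop_duplicates(subset=["#", "Name"])

# Verify the goal directly: no Pokédex number appears twice.
assert not by_key["#"].duplicated().any()
print(len(exact), len(by_key))
```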

In [None]:
### Clean up the Evolution Group input so that we can join it to the Gen 1 list
### Filter out Starter and Legendary Pokémon

In [26]:
evolution_group = data["Evolution Group"].copy()
evolution_group.shape

(158, 4)

In [27]:
evolution_group.head()

| | Evolution Group | # | Starter? | Legendary? |
|---|---|---|---|---|
| 0 | Bulbasaur | 1 | 1 | 0 |
| 1 | Bulbasaur | 2 | 1 | 0 |
| 2 | Bulbasaur | 3 | 1 | 0 |
| 3 | Charmander | 4 | 1 | 0 |
| 4 | Charmander | 5 | 1 | 0 |


In [28]:
evolution_group = evolution_group[(evolution_group["Starter?"] == 0) & (evolution_group["Legendary?"] == 0)]
evolution_group.shape

(134, 4)

In [29]:
evolution_group.sample(10)

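| | Evolution Group | # | Starter? | Legendary? |
|---|---|---|---|---|
| 27 | Sandshrew | 28 | 0 | 0 |
| 67 | Machop | 68 | 0 | 0 |
| 95 | Drowzee | 96 | 0 | 0 |
| 125 | Magmar | 126 | 0 | 0 |
| 62 | Abra | 63 | 0 | 0 |
| 71 | Tentacool | 72 | 0 | 0 |
| 106 | Hitmonchan | 107 | 0 | 0 |
| 119 | Staryu | 120 | 0 | 0 |
| 103 | Cubone | 104 | 0 | 0 |
| 107 | Lickitung | 108 | 0 | 0 |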
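
With Starter and Legendary groups filtered out, the next step is joining the evolution groups to the Gen 1 list. A minimal sketch of that join is shown here on toy frames. It assumes both inputs share a Pokédex number column called `#`; the source confirms that column only for the Evolution Group sheet, so the key and the `gen_1_toy` / `groups_toy` names are hypothetical.

```python
import pandas as pd

# Toy frames standing in for the cleaned inputs (column names assumed).
gen_1_toy = pd.DataFrame({
    "#": [27, 28, 63, 150],
    "Name": ["Sandshrew", "Sandslash", "Abra", "Mewtwo"],
})
groups_toy = pd.DataFrame({
    "Evolution Group": ["Sandshrew", "Sandshrew", "Abra"],
    "#": [27, 28, 63],
    "Starter?": [0, 0, 0],
    "Legendary?": [0, 0, 0],
})

# An inner join drops Pokémon whose group was filtered out earlier.
# validate raises an error if either side has duplicate keys.
joined = gen_1_toy.merge(
    groups_toy[["#", "Evolution Group"]],
    on="#",
    how="inner",
    validate="one_to_one",
)
print(joined)
```

The `validate="one_to_one"` argument doubles as a check that the "1 row per Pokémon" clean-up actually held before the join.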
