[YouTube Course Link](https://www.youtube.com/watch?v=gtjxAH8uaP0&t=2006s)
[DataWars Course Link](https://app.datawars.io/project/54b07e96-f0da-4b5d-ba40-c87475e42b8e?page=1)

In [None]:
!pip install pandas==2.1.3 --force-reinstall
!pip install matplotlib==3.8.2 --force-reinstall
!pip install seaborn==0.13.0 --force-reinstall

In [None]:
import pandas
import matplotlib.pyplot
import seaborn

In [None]:
dataframe = pandas.read_csv("data.csv")

In [None]:
dataframe.head()

In [None]:
dataframe.info()

In [None]:
dataframe.describe()

In [None]:
dataframe['Type 1'].value_counts()

In [None]:
dataframe['Type 1'].value_counts().plot(
    kind='pie',
    autopct='%1.1f%%',
    cmap='tab20c',
    figsize=(10, 8)
)

#### Distribution of Pokemon Totals:

In [None]:
dataframe['Total'].plot(kind='hist', figsize=(10,8))

In [None]:
dataframe['Total'].plot(kind='box', vert=False, figsize=(10, 5))

In [None]:
seaborn.boxplot(data=dataframe, x='Total')

#### Distribution of Legendary Pokemons:

In [None]:
dataframe['Legendary'].value_counts()

In [None]:
dataframe['Legendary'].value_counts().plot(
    kind='pie',
    autopct='%1.1f%%',
    cmap='Set3',
    figsize=(10, 8)
)

### Basic filtering

Let's start with a few simple activities regarding filtering.

##### 1. How many Pokemons exist with an `Attack` value greater than 150?

In [None]:
seaborn.boxplot(data=dataframe, x='Attack')

In [None]:
print(dataframe['Attack'] > 150)
print(dataframe[dataframe['Attack'] > 150])
len(dataframe[dataframe['Attack'] > 150])
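The count can also be read straight off the boolean mask, since `True` counts as 1 when summed. Below is a minimal sketch on a toy frame; the names and `Attack` values are made up for illustration and are not taken from `data.csv`.

```python
import pandas

# Toy data for illustration only; values are invented, not from data.csv.
toy = pandas.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Attack": [160, 90, 155, 150],
})

# A comparison yields a boolean Series; summing it counts the True entries.
mask = toy["Attack"] > 150
count_via_sum = int(mask.sum())

# Equivalent to filtering first and measuring the filtered frame.
count_via_len = len(toy[mask])

print(count_via_sum, count_via_len)
```

Both approaches give the same number; summing the mask simply avoids building the filtered DataFrame.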

##### 2. Select all pokemons with a Speed of `10` or less

In [None]:
seaborn.boxplot(data=dataframe, x='Speed')

In [None]:
print(dataframe.query('Speed <= 10'))
dataframe[dataframe["Speed"] <= 10]

##### 3. How many Pokemons have a `Sp. Def` value of 25 or less?

In [None]:
print(dataframe[dataframe["Sp. Def"] <= 25])
len(dataframe[dataframe["Sp. Def"] <= 25])

##### 4. Select all the Legendary pokemons

In [None]:
dataframe[dataframe['Legendary'] == True]

##### 5. Find the outlier

One pokemon is a clear outlier in terms of Attack / Defense: it has a very strong Defense but a very low Attack. What is its Name?

In [None]:
plot = seaborn.scatterplot(data=dataframe, x='Attack', y='Defense')

In [None]:
print(dataframe.query('Attack < 20 and Defense > 200'))
dataframe.query('Attack < 20 and Defense > 200')["Name"].tolist()[0]

### Advanced selection

Now let's use boolean operators to create more advanced expressions.

##### 6. How many Fire-Flying Pokemons are there?

In [None]:
print(dataframe.query('`Type 1` == "Fire" and `Type 2` == "Flying"'))
len(dataframe.query('`Type 1` == "Fire" and `Type 2` == "Flying"'))

##### 7. How many Poison pokemons are there across both types?

How many pokemons exist that are of type Poison in either Type 1 or Type 2?

In [None]:
print(len(dataframe.query('`Type 1` == "Poison"')))
print(len(dataframe.query('`Type 2` == "Poison"')))
len(dataframe.query('`Type 1` == "Poison" or `Type 2` == "Poison"'))
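The same "either type" condition can be written with boolean masks and the `|` operator. The sketch below uses a toy frame (invented rows, not from `data.csv`) and also shows why the `or` count equals the two individual counts minus any overlap.

```python
import pandas

# Toy data for illustration only; rows are invented, not from data.csv.
toy = pandas.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Type 1": ["Poison", "Grass", "Bug", "Poison", "Fire"],
    "Type 2": [None, "Poison", "Poison", "Flying", None],
})

is_type1 = toy["Type 1"] == "Poison"
is_type2 = toy["Type 2"] == "Poison"   # missing values compare as False

either = is_type1 | is_type2           # element-wise OR
both = is_type1 & is_type2             # element-wise AND

n_either = int(either.sum())
# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
n_check = int(is_type1.sum() + is_type2.sum() - both.sum())

print(n_either, n_check)
```

In this toy frame no row is Poison in both columns, so the overlap term is zero and the two individual counts simply add up.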

##### 8. What pokemon of `Type 1` *Ice* has the strongest defense?

In [None]:
# idxmax() returns an index label, so look the row up with .loc rather than .iloc
idx = dataframe.query('`Type 1` == "Ice"')["Defense"].idxmax()
dataframe.loc[idx, "Name"]
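One caveat worth sketching: `idxmax()` returns an index label rather than a position, so a label-based lookup with `.loc` is the safer pairing whenever the index is not the default 0..n-1. The toy frame below (invented values, with a deliberately non-default index) illustrates the difference.

```python
import pandas

# Toy data for illustration only; values and index labels are invented.
toy = pandas.DataFrame(
    {
        "Name": ["A", "B", "C"],
        "Type 1": ["Ice", "Ice", "Water"],
        "Defense": [50, 120, 200],
    },
    index=[10, 20, 30],  # deliberately not 0, 1, 2
)

ice = toy[toy["Type 1"] == "Ice"]
label = ice["Defense"].idxmax()   # an index label, not a row position
name = toy.loc[label, "Name"]     # label-based lookup

# A positional lookup with the same value would fail on this frame.
try:
    toy.iloc[label]
    positional_ok = True
except IndexError:
    positional_ok = False

print(label, name, positional_ok)
```

With the default `RangeIndex` produced by `read_csv`, labels and positions coincide, which is why either accessor appears to work on the original data.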

##### 9. What's the most common type 1 of Legendary Pokemons?

In [None]:
dataframe.query('`Legendary` == True')

In [None]:
dataframe.query('`Legendary` == True')["Type 1"]

In [None]:
dataframe.query('`Legendary` == True')["Type 1"].mode().tolist()[0]

##### 10. What's the most powerful pokemon from the first 3 generations, of type water?

Find the most powerful Pokemon (by Total) from the first 3 generations that is of Type 1 'Water'. Enter its name below. In case of multiple names, enter the first name.

In [None]:
dataframe['Generation'].min()

In [None]:
dataframe.query('`Type 1` == "Water" and `Generation` <= 3')

In [None]:
dataframe.query('`Type 1` == "Water" and `Generation` <= 3').sort_values(by='Total', ascending=False)

In [None]:
# idxmax() returns the label of the first maximum, so pair it with .loc
idx = dataframe.query('`Type 1` == "Water" and `Generation` <= 3')["Total"].idxmax()
dataframe.loc[idx, "Name"]

##### 11. What's the most powerful Dragon from the last two generations?

Find the most powerful pokemon (by Total) that is of type Dragon (either Type 1 or Type 2) and from the last two generations. Enter its name below. If several are tied, enter the name of the Dragon with the highest index value.

In [None]:
dataframe["Generation"].max()

In [None]:
(dataframe
 .query('`Type 1` == "Dragon" or `Type 2` == "Dragon"')
 .query('`Generation` >= 5')
 .sort_values(by='Total', ascending=False))
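To go from the sorted table to a single answer, the tie-breaking rule (highest index among the strongest) can be applied explicitly. This is a minimal sketch on a toy frame with invented rows; it also derives the "last two generations" cutoff from the data instead of hard-coding it, which is an assumption about how one might generalize the filter.

```python
import pandas

# Toy data for illustration only; rows are invented, not from data.csv.
toy = pandas.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type 1": ["Dragon", "Ice", "Dragon", "Fire"],
    "Type 2": [None, "Dragon", "Flying", None],
    "Generation": [5, 6, 6, 6],
    "Total": [700, 700, 600, 680],
})

# Last two generations: the maximum generation and the one before it.
cutoff = toy["Generation"].max() - 1

is_dragon = (toy["Type 1"] == "Dragon") | (toy["Type 2"] == "Dragon")
dragons = toy[is_dragon & (toy["Generation"] >= cutoff)]

# Keep every row tied for the highest Total, then take the highest index.
strongest = dragons[dragons["Total"] == dragons["Total"].max()]
name = strongest.loc[strongest.index.max(), "Name"]

print(name)
```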

##### 12. Select most powerful Fire-type pokemons

Select all pokemons that have an Attack value above 100 and Type 1 equal to Fire (ignore Type 2 in this activity).

In [None]:
dataframe.query('`Type 1` == "Fire" and `Attack` > 100')

##### 13. Select all Water-type, Flying-type pokemons

Select those pokemons that are of Type 1 Water and Type 2 Flying.

In [None]:
dataframe.query('`Type 1` == "Water" and `Type 2` == "Flying"')

##### 14. Select specific columns of Legendary pokemons of type Fire

Perform a selection in your Dataframe of all the Legendary pokemons that are of Type 1 Fire. But select only the columns Name, Attack and Generation.

In [None]:
dataframe.query('`Type 1` == "Fire" and `Legendary` == True')[["Name", "Attack", "Generation"]]
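Row filtering and column selection can also be done in one step with `.loc[rows, columns]`. The sketch below uses a toy frame with invented values, not rows from `data.csv`.

```python
import pandas

# Toy data for illustration only; values are invented, not from data.csv.
toy = pandas.DataFrame({
    "Name": ["A", "B", "C"],
    "Type 1": ["Fire", "Fire", "Water"],
    "Attack": [100, 130, 90],
    "Generation": [1, 2, 1],
    "Legendary": [True, False, True],
})

# A boolean column can be used as a mask directly, without "== True".
mask = (toy["Type 1"] == "Fire") & toy["Legendary"]

# First argument selects rows, second selects columns.
selected = toy.loc[mask, ["Name", "Attack", "Generation"]]

print(selected)
```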

##### 15. Select Slow and Fast pokemons

Select those pokemons that are either very slow (Speed below the 5th percentile) or very fast (Speed above the 95th percentile).

In [None]:
matplotlib.pyplot.figure(figsize=(10, 5))
histogram = seaborn.histplot(
    data=dataframe["Speed"],
    bins=100,
)
histogram.axvline(x=dataframe["Speed"].quantile(q=0.05), color="red")
histogram.axvline(x=dataframe["Speed"].quantile(q=0.95), color="green")

In [None]:
bottom = dataframe["Speed"].quantile(q=0.05)
top = dataframe["Speed"].quantile(q=0.95)
dataframe.query('`Speed` < @bottom or `Speed` > @top')
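The same tail selection can be expressed without `query`, either with two comparisons or by negating `Series.between`, which is inclusive on both ends by default. The sketch below uses a toy `Speed` column of the integers 1 to 100, chosen only to make the percentiles easy to follow.

```python
import pandas

# Toy data for illustration only: Speed values 1..100, not from data.csv.
toy = pandas.DataFrame({"Speed": list(range(1, 101))})

bottom = toy["Speed"].quantile(q=0.05)  # 5th percentile (linear interpolation)
top = toy["Speed"].quantile(q=0.95)     # 95th percentile

# Explicit comparisons: strictly below the bottom or strictly above the top.
via_mask = toy[(toy["Speed"] < bottom) | (toy["Speed"] > top)]

# between() keeps bottom <= Speed <= top, so its negation selects both tails.
via_between = toy[~toy["Speed"].between(bottom, top)]

print(len(via_mask), len(via_between))
```

On this toy column each tail holds five values, so both selections return ten rows.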

##### 16. Find the Ultra Powerful Legendary Pokemon

Create a scatter plot of Defense against Attack, marking Legendary and non-Legendary pokemons in different colours. Add an arrow pointing to the pokemon with Defense 140 and Attack 150, labelled "Who's That?". Then print the name of that pokemon.

In [None]:
scatterplot = seaborn.scatterplot(
    data=dataframe,
    x='Defense',
    y='Attack',
    hue='Legendary'
)
scatterplot.annotate(
    xy=(140, 150),
    xytext=(150, 130),
    text="Who's That?",
    color="red",
    arrowprops=dict(arrowstyle="->", color="red")
)

In [None]:
dataframe.query('`Attack` == 150 and `Defense` == 140')["Name"].tolist()[0]