Exploring the world of Pokemon in order to determine which one is the strongest, based on my criteria.

First of all, let's import all the necessary libraries.

In [None]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style('whitegrid')
%matplotlib inline

Let's take a look at the dataset

In [None]:
#loading the dataset as pokemon_df
pokemon_df = pd.read_csv('../input/pokemonGO.csv')

#taking a look at the first 10 rows
pokemon_df.head(n=10)

What are the different types of pokemon?

In [None]:
#factorplot was renamed to catplot (and its size argument to height) in seaborn 0.9
sns.catplot(x='Type 2', data=pokemon_df, kind='count', height=7)
sns.catplot(x='Type 1', data=pokemon_df, kind='count', height=8)

In [None]:
#catplot replaces the older factorplot; height replaces size
sns.catplot(x='Type 1', data=pokemon_df, kind='count', hue='Type 2', height=10)

As you can see, Fairy and Fighting type pokemons don't have a second type, and most of the Water type pokemons have a second type.
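
The count plot is easy to eyeball, but a small table can back it up. Here is a minimal sketch, assuming a tiny hand-made sample rather than the real CSV (the `toy_df` name and its five rows are only for illustration). It computes, for each first type, the share of pokemons that also have a second type.

```python
import numpy as np
import pandas as pd

# Toy sample, NOT the real dataset: just enough rows to show the idea
toy_df = pd.DataFrame({
    'Name': ['Clefairy', 'Machop', 'Squirtle', 'Gyarados', 'Psyduck'],
    'Type 1': ['Fairy', 'Fighting', 'Water', 'Water', 'Water'],
    'Type 2': [np.nan, np.nan, np.nan, 'Flying', np.nan],
})

# True where a second type exists; the mean of booleans is the share of True
has_second = toy_df['Type 2'].notna()
dual_share = has_second.groupby(toy_df['Type 1']).mean()
print(dual_share)
```

A share of 0 means no pokemon of that first type has a second type. Running the same two lines on `pokemon_df` instead of `toy_df` should give the figures for the full dataset.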

How many pokemons are there (single-type, double-type)?

In [None]:
pokemon_df[['Type 1','Type 2']].count()

As you can see, we have 151 pokemons in total. Since count() ignores missing values, the Type 2 count tells us that 67 of them have two types, which leaves 84 single-type pokemons.
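
The subtraction above (151 - 67 = 84) can also be done by pandas directly. This is a small sketch on a hypothetical four-row sample (`toy_df` is not the real dataset), assuming that a missing second type is stored as NaN, which is how `read_csv` loads empty cells by default.

```python
import numpy as np
import pandas as pd

# Toy sample, NOT the real dataset
toy_df = pd.DataFrame({
    'Type 1': ['Grass', 'Fire', 'Water', 'Water'],
    'Type 2': ['Poison', np.nan, np.nan, 'Flying'],
})

total = len(toy_df)                               # every row is one pokemon
dual_type = int(toy_df['Type 2'].notna().sum())   # rows with a second type
single_type = int(toy_df['Type 2'].isna().sum())  # rows without one
print(total, dual_type, single_type)
```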

Now let's take a look at battle performances

1. Who has the highest Combat Power (CP)?
2. Who has the lowest Combat Power (CP)?
3. Who has the highest Hit Points (HP)?
4. Who has the lowest Hit Points (HP)?
5. Who is the strongest Pokemon?

In [None]:
#Question 1
pokemon_df[pokemon_df['Max CP'] == pokemon_df['Max CP'].max()]

As we could have expected, Mewtwo is the pokemon with the highest Combat Power in our dataset.

In [None]:
#Question 2
pokemon_df[pokemon_df['Max CP'] == pokemon_df['Max CP'].min()]

Magikarp turns out to be the pokemon with the lowest CP. However, once it evolves into Gyarados, I can guarantee that it's among the strongest.
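
As a side note, the boolean mask used for these questions returns every row tied for the extreme value. If a single row is enough, `idxmax()` and `idxmin()` are an alternative. A minimal sketch with made-up names and values (nothing here comes from the real dataset):

```python
import pandas as pd

# Hypothetical names and values, for illustration only
toy_df = pd.DataFrame({
    'Name': ['A', 'B', 'C'],
    'Max CP': [300, 1200, 800],
})

# idxmax/idxmin return the index label of the first extreme value
strongest = toy_df.loc[toy_df['Max CP'].idxmax()]
weakest = toy_df.loc[toy_df['Max CP'].idxmin()]
print(strongest['Name'], weakest['Name'])
```

Keep in mind that with ties only the first match is returned, so the mask is the safer choice when you want to see all of them.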

In [None]:
#Question 3
pokemon_df[pokemon_df['Max HP'] == pokemon_df['Max HP'].max()]

No wonder Chansey has the highest HP. Still, this pokemon is just not suited for battle, since it is known to be a medical assistant.

In [None]:
#Question 4
pokemon_df[pokemon_df['Max HP'] == pokemon_df['Max HP'].min()]

Rather surprising. What caused Diglett to have the lowest HP? Is it due to the fact that it's a Ground type pokemon?
Usually, Ground type pokemons/monsters/characters are known to be tough.

Let's dig a little deeper to study this possibility!

In [None]:
#First off, what is the mean HP value?
mean_hp = pokemon_df['Max HP'].mean()
mean_hp

Let's split the dataframe into two smaller dataframes: one for pokemons whose first type is Ground and one for those whose second type is Ground.

In [None]:
first_type_ground_df = pokemon_df[pokemon_df['Type 1'] == 'Ground']

second_type_ground_df = pokemon_df[pokemon_df['Type 2'] == 'Ground']

In [None]:
first_type_ground_df.count()

In [None]:
second_type_ground_df.count()

In [None]:
#the dataframes don't have many rows, so we can look at their full content
first_type_ground_df

In [None]:
second_type_ground_df

Let's plot each of the two dataframes

In [None]:
#first dataframe
first_type_ground_df.plot(x='Name',y='Max HP',marker='o',figsize=(12,6),linestyle='--')
#setting the title of the plot
plt.title('Max HP for pokemons whose first type is Ground')
#setting the y axis label
plt.ylabel('Max HP')
#red horizontal line marking the mean Max HP of the whole dataset
plt.axhline(mean_hp,color='red',linewidth=2)

#second dataframe
second_type_ground_df.plot(x='Name',y='Max HP',marker='o',figsize=(12,6),linestyle='--')
#setting the title of the plot
plt.title('Max HP for pokemons whose second type is Ground')
#setting the y axis label
plt.ylabel('Max HP')
#same reference line as in the first plot
plt.axhline(mean_hp,color='red',linewidth=2)

As you can see, only 3 of the pokemons whose first type is Ground have a Max HP higher than the average (the red line).

As for the double-type pokemons whose second type is Ground, we have an even split: 3 pokemons are above average and 3 are below.

Overall, we have 8 Ground type pokemons whose Max HP is below average and 6 whose Max HP is above average. The difference is very small, so the most we can conclude is that being a Ground type pokemon makes a below-average Max HP slightly more likely. It doesn't explain Diglett's low HP on its own.
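
Reading these counts off the plots works, but they can also be computed. A rough sketch on a made-up five-row sample (placeholder names and HP values, not the real dataset), assuming we treat a pokemon as Ground type when either of its types is Ground:

```python
import numpy as np
import pandas as pd

# Hypothetical rows, for illustration only
toy_df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Type 1': ['Ground', 'Ground', 'Rock', 'Water', 'Ground'],
    'Type 2': [np.nan, np.nan, 'Ground', np.nan, np.nan],
    'Max HP': [20, 90, 60, 100, 40],
})

# Mean over ALL pokemons, like mean_hp in the notebook (here 310 / 5 = 62)
mean_hp = toy_df['Max HP'].mean()

# Ground as first OR second type; NaN compares as False, so it is excluded
is_ground = (toy_df['Type 1'] == 'Ground') | (toy_df['Type 2'] == 'Ground')
ground_df = toy_df[is_ground]

above = int((ground_df['Max HP'] > mean_hp).sum())
below = int((ground_df['Max HP'] < mean_hp).sum())
print(above, below)
```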

Now, to answer the fifth and last question, let's think about what it actually means to be the strongest pokemon.
By my criteria, being the strongest pokemon means having a CP that is above average but also an HP that is below average.
Let's gather all the pokemons whose CP and HP meet both conditions.

In [None]:
#finding the mean CP
mean_cp = pokemon_df['Max CP'].mean()
mean_cp

In [None]:
#creating a dataframe containing the strongest pokemons
strong_pokemon_df = pokemon_df[pokemon_df['Max CP'] >= mean_cp]
strong_pokemon_df = strong_pokemon_df[strong_pokemon_df['Max HP'] <= mean_hp]

In [None]:
strong_pokemon_df

In [None]:
strong_pokemon_df.count()

We have 17 pokemons in total that can compete for the title of strongest pokemon. Now let's repeat the same process on this shortlist, using its own averages, to narrow the field further.
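
Since the same filter gets applied more than once, it could be wrapped in a small function. This is only a sketch: `filter_round` is a hypothetical helper name, and the sample rows use placeholder names and invented values rather than the real dataset. Both averages are computed on the frame that is passed in, before any row is dropped.

```python
import pandas as pd

def filter_round(df, cp_col='Max CP', hp_col='Max HP'):
    """Keep rows with CP >= the frame's mean CP and HP <= its mean HP."""
    mean_cp = df[cp_col].mean()
    mean_hp = df[hp_col].mean()
    return df[(df[cp_col] >= mean_cp) & (df[hp_col] <= mean_hp)]

# Hypothetical rows, for illustration only
toy_df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E', 'F'],
    'Max CP': [500, 1500, 2000, 2500, 1800, 900],
    'Max HP': [60, 80, 70, 200, 90, 150],
})

round_one = filter_round(toy_df)     # first pass over every row
round_two = filter_round(round_one)  # second pass, using the shortlist's own means
print(round_one['Name'].tolist(), round_two['Name'].tolist())
```

Each call recomputes the averages from whatever is left, so the bar rises with every round.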

In [None]:
mean_hp2 = strong_pokemon_df['Max HP'].mean()
mean_cp2 = strong_pokemon_df['Max CP'].mean()

In [None]:
strong_pokemon_df = strong_pokemon_df[strong_pokemon_df['Max CP'] >= mean_cp2]
strong_pokemon_df = strong_pokemon_df[strong_pokemon_df['Max HP'] <= mean_hp2]

In [None]:
#Let's take a look at the new dataframe
strong_pokemon_df

We've finally reached the end of our research and, contrary to what I personally expected, Cloyster, with a CP of 2067 and an HP of 91, is the strongest pokemon in this dataset, again based on my criteria.