In [1]:
# Import libraries

import pandas as pd
from scipy.stats import ttest_rel, ttest_ind

#### Import dataset

In this challenge we will work with the Pokemon dataset you used last week. The goal is to test whether different groups of Pokemon (e.g. Legendary vs Normal, Generation 1 vs 2, single-type vs dual-type) have different stats (e.g. HP, Attack, Defense, etc.).

In [2]:
# Import dataset

pokemon = pd.read_csv('../../lab-df-calculation-and-transformation/your-code/Pokemon.csv')

pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


#### First, define a function that tests whether the means of a set of features differ between two samples.

The next cell contains the function's annotation, which explains what the function does, its arguments, and its return value. This kind of annotation is called a **docstring**, a convention used by Python developers. Docstrings let developers document their code consistently so that others can read it, and they allow tools and websites to parse the documentation automatically and display it in a user-friendly form.

Follow the specifications of the docstring and complete the function.
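As a small illustration of why docstrings are useful, the sketch below shows that Python stores the docstring on the function object, where tools can read it. The function `mean_difference` is a hypothetical example and is not part of the lab.

```python
import inspect


def mean_difference(a, b):
    """Return the difference between the means of two numeric sequences.

    Args:
        a (list): first sample
        b (list): second sample

    Returns:
        float: mean(a) - mean(b)
    """
    return sum(a) / len(a) - sum(b) / len(b)


# The docstring travels with the function object, so it can be read at runtime.
doc = inspect.getdoc(mean_difference)
print(doc.splitlines()[0])
```

The same text is what `help(mean_difference)` displays in an interactive session.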

In [3]:
def t_test_features(s1, s2, features=['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total']):
    
    """Test means of a feature set of two samples
    
    Args:
        s1 (dataframe): sample 1
        s2 (dataframe): sample 2
        features (list): names of the columns (features) to test
    
    Returns:
        dict: maps each feature name to the p-value of its two-sample t-test
    """
        
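    # Run an independent two-sample t-test per feature and keep only the p-value
    results = {}
    for feature in features:
        results[feature] = ttest_ind(s1[feature], s2[feature])[1]
    return results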
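The function above calls `ttest_ind` with its defaults, which assume that both groups have equal variances. When group sizes or spreads differ a lot, Welch's t-test (`equal_var=False`) is a common alternative. The sketch below is an assumption-laden variant, not the lab's required solution: `welch_t_test_features` and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind


def welch_t_test_features(s1, s2, features):
    """Return {feature: p-value} using Welch's t-test (unequal variances)."""
    return {f: ttest_ind(s1[f], s2[f], equal_var=False).pvalue for f in features}


# Synthetic groups with different sizes and spreads (not the Pokemon data)
rng = np.random.default_rng(0)
group_a = pd.DataFrame({"HP": rng.normal(100, 10, 60), "Speed": rng.normal(80, 5, 60)})
group_b = pd.DataFrame({"HP": rng.normal(70, 25, 700), "Speed": rng.normal(80, 5, 700)})

p_values = welch_t_test_features(group_a, group_b, ["HP", "Speed"])
print(p_values)
```

Whether the equal-variance assumption is reasonable depends on the data, so this is an option to consider rather than a replacement.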

#### Using the `t_test_features` function, conduct a t-test for Legendary vs non-Legendary Pokemon.

*Hint: your output should look like below:*

```
{'HP': 1.0026911708035284e-13,
 'Attack': 2.520372449236646e-16,
 'Defense': 4.8269984949193316e-11,
 'Sp. Atk': 1.5514614112239812e-21,
 'Sp. Def': 2.2949327864052826e-15,
 'Speed': 1.049016311882451e-18,
 'Total': 9.357954335957446e-47}
 ```

In [4]:
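# Split the data into Legendary and non-Legendary Pokemon
pop1=pokemon.loc[pokemon['Legendary']==True]
pop2=pokemon.loc[pokemon['Legendary']==False]
print(pop1["Legendary"].value_counts())
print(pop2["Legendary"].value_counts())
# Draw a random sample of 50 rows from each group (results vary between runs)
s1=pop1.sample(50)
s2=pop2.sample(50)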

True    65
Name: Legendary, dtype: int64
False    735
Name: Legendary, dtype: int64


In [5]:
pop1.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
156,144,Articuno,Ice,Flying,580,90,85,100,95,125,85,1,True
157,145,Zapdos,Electric,Flying,580,90,90,85,125,90,100,1,True
158,146,Moltres,Fire,Flying,580,90,100,90,125,85,90,1,True
162,150,Mewtwo,Psychic,,680,106,110,90,154,90,130,1,True
163,150,MewtwoMega Mewtwo X,Psychic,Fighting,780,106,190,100,154,100,130,1,True


In [6]:
t_test_features(s1, s2)

{'HP': 5.631510900943808e-05,
 'Attack': 1.4650835274553293e-09,
 'Defense': 2.609044482138479e-05,
 'Sp. Atk': 8.709616094828683e-15,
 'Sp. Def': 1.4456065561789447e-10,
 'Speed': 2.6292468799222054e-12,
 'Total': 2.7113864121550267e-22}
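These p-values differ from the hinted output, presumably because `sample(50)` draws a different random subset on every run (the hint looks consistent with testing the full groups, though that is an assumption). If reproducible samples are wanted, `DataFrame.sample` accepts a `random_state` argument. A minimal sketch on synthetic data, with a hypothetical `Attack` column:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a stats table (not the Pokemon data)
rng = np.random.default_rng(42)
df = pd.DataFrame({"Attack": rng.integers(20, 150, size=200)})

# The same seed selects the same rows, so downstream p-values are repeatable.
first = df.sample(50, random_state=1)
second = df.sample(50, random_state=1)

print(first.index.equals(second.index))
```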

#### From the test results above, what can you conclude? Do Legendary and non-Legendary Pokemon have significantly different stats on each feature?

In [7]:
'''
All p-values are below the 0.05 significance level, so we reject the null hypothesis of equal
means for every feature. In these samples, Legendary and non-Legendary Pokemon differ
significantly on every stat tested.
'''

'\nAll p-values are below the 0.05 significance level, so we reject the null hypothesis of equal\nmeans for every feature. In these samples, Legendary and non-Legendary Pokemon differ\nsignificantly on every stat tested.\n'

#### Next, conduct a t-test for Generation 1 and Generation 2 Pokemon.

In [8]:
pop1=pokemon.loc[pokemon['Generation']==1]
pop2=pokemon.loc[pokemon['Generation']==2]
print(pop1['Generation'].value_counts())
print(pop2['Generation'].value_counts())
s1=pop1.sample(80)
s2=pop2.sample(80)

1    166
Name: Generation, dtype: int64
2    106
Name: Generation, dtype: int64


In [9]:
t_test_features(s1, s2)

{'HP': 0.17296498559389759,
 'Attack': 0.08366624899926849,
 'Defense': 0.6911807568021451,
 'Sp. Atk': 0.5591680701048336,
 'Sp. Def': 0.005610845744973601,
 'Speed': 0.25851332177936975,
 'Total': 0.7713533656003516}

#### What conclusions can you make?

In [10]:
'''
The p-values are above the 0.05 significance level for every feature except Sp. Def (p is about
0.006). For Sp. Def we reject the null hypothesis of equal means. For the other features we fail
to reject it: these samples give no evidence that Generation 1 and Generation 2 differ in mean.
'''

'\nThe p-values are above the 0.05 significance level for every feature except Sp. Def (p is about\n0.006). For Sp. Def we reject the null hypothesis of equal means. For the other features we fail\nto reject it: these samples give no evidence that Generation 1 and Generation 2 differ in mean.\n'
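Reading seven p-values by eye is error-prone, so a small helper that applies the threshold can make the check explicit. The sketch below reuses the Generation 1 vs Generation 2 p-values printed above; `significant_features` and `ALPHA` are hypothetical names introduced for illustration.

```python
ALPHA = 0.05  # conventional significance level, assumed here

# p-values copied from the Generation 1 vs Generation 2 output above
p_values = {
    'HP': 0.17296498559389759,
    'Attack': 0.08366624899926849,
    'Defense': 0.6911807568021451,
    'Sp. Atk': 0.5591680701048336,
    'Sp. Def': 0.005610845744973601,
    'Speed': 0.25851332177936975,
    'Total': 0.7713533656003516,
}


def significant_features(p_values, alpha=ALPHA):
    """Return the features whose p-value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]


print(significant_features(p_values))  # ['Sp. Def']
```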

#### Compare Pokemon that have a single type with those that have two types.

In [11]:
pop1=pokemon.loc[pokemon['Type 2'].isnull()==True]
pop2=pokemon.loc[pokemon['Type 2'].isnull()==False]
print(len(pop1))
print(len(pop2))
s1=pop1.sample(350)
s2=pop2.sample(350)


386
414


In [12]:
t_test_features(s1, s2)

{'HP': 0.1791682829593831,
 'Attack': 0.00019885800211162503,
 'Defense': 4.622442332268e-08,
 'Sp. Atk': 0.00021372665278887254,
 'Sp. Def': 0.0001895490060406737,
 'Speed': 0.03512959329284859,
 'Total': 3.661086129800477e-07}

#### What conclusions can you make?

In [13]:
'''
HP is the only feature with a p-value above the 0.05 significance level (p is about 0.18), so for HP
we fail to reject the null hypothesis: these samples give no evidence that mean HP differs between
single-type and dual-type Pokemon.

For Attack, Defense, Sp. Atk, Sp. Def, Speed and Total, p is below 0.05, so we reject the null
hypothesis: the means of the two samples differ significantly on these features.
'''

'\nHP is the only feature with a p-value above the 0.05 significance level (p is about 0.18), so for HP\nwe fail to reject the null hypothesis: these samples give no evidence that mean HP differs between\nsingle-type and dual-type Pokemon.\n\nFor Attack, Defense, Sp. Atk, Sp. Def, Speed and Total, p is below 0.05, so we reject the null\nhypothesis: the means of the two samples differ significantly on these features.\n'

#### Now test whether there are significant differences between `Attack` and `Defense`, and between `Sp. Atk` and `Sp. Def`, across all Pokemon. Please write your code below.

*Hint: are you comparing different populations or the same population?*

In [26]:
pop1=pokemon['Attack']
pop2=pokemon['Defense']
print(len(pop1))
print(len(pop2))
s1=pop1.sample(600)
s2=pop2.sample(600)

800
800


In [27]:
ttest_rel(s1,s2)[1]

0.01446018025557416

In [28]:
pop3=pokemon['Sp. Atk']
pop4=pokemon['Sp. Def']
print(len(pop3))
print(len(pop4))
s3=pop3.sample(600)
s4=pop4.sample(600)

800
800


In [29]:
ttest_rel(s3,s4)[1]

0.8595805409145195
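One caveat about the cells above: `ttest_rel` pairs observations by position, so both inputs should come from the same rows. Sampling each column separately draws different Pokemon for each side, which breaks that pairing. A minimal sketch of keeping the pairs intact, using synthetic data with hypothetical `Attack` and `Defense` columns (not the Pokemon values):

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_rel

# Synthetic paired measurements: Defense is correlated with Attack in each row
rng = np.random.default_rng(0)
attack = rng.normal(80, 30, 800)
df = pd.DataFrame({"Attack": attack, "Defense": attack - 5 + rng.normal(0, 10, 800)})

# Sample rows once so each Attack value stays paired with the Defense of the same row.
rows = df.sample(600, random_state=0)
paired_p = ttest_rel(rows["Attack"], rows["Defense"]).pvalue

# Testing the whole table needs no sampling at all.
full_p = ttest_rel(df["Attack"], df["Defense"]).pvalue
print(paired_p, full_p)
```

Since the question is about all Pokemon, running the paired test on the full columns is arguably the simpler choice.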

#### What conclusions can you make?

In [None]:
'''
For Attack vs Defense, p is about 0.014, which is below 0.05, so we reject the null hypothesis of equal
means. For Sp. Atk vs Sp. Def, p is about 0.86, so we fail to reject it: these samples give no evidence
of a difference in means. The smaller p-value in the first case reflects stronger evidence against equal
means; it does not by itself measure how far apart the means are.
'''