# Before you start:
- Read the README.md file
- Comment as much as you can and use the resources (README.md file)
- Happy learning!

In [1]:
import numpy as np
import pandas as pd

# Challenge 1 - Independent Sample T-tests

In this challenge, we will be using the Pokemon dataset. Before applying statistical methods to this data, let's first examine the data.

To load the data, run the code below.

In [2]:
# Run this code:

pokemon = pd.read_csv('../pokemon.csv')

Let's start by looking at the first few rows with the `head` function in the cell below.

In [3]:
pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


In [67]:
pokemon.describe()

Unnamed: 0,#,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation
count,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0,800.0
mean,362.81375,435.1025,69.25875,79.00125,73.8425,72.82,71.9025,68.2775,3.32375
std,208.343798,119.96304,25.534669,32.457366,31.183501,32.722294,27.828916,29.060474,1.66129
min,1.0,180.0,1.0,5.0,5.0,10.0,20.0,5.0,1.0
25%,184.75,330.0,50.0,55.0,50.0,49.75,50.0,45.0,2.0
50%,364.5,450.0,65.0,75.0,70.0,65.0,70.0,65.0,3.0
75%,539.25,515.0,80.0,100.0,90.0,95.0,90.0,90.0,5.0
max,721.0,780.0,255.0,190.0,230.0,194.0,230.0,180.0,6.0


The first thing we would like to do is compare the legendary Pokemon to the regular Pokemon. To do this, we should examine the data further. What is the count of legendary vs. non-legendary Pokemon?

In [4]:
pokemon['Legendary'].value_counts()

False    735
True      65
Name: Legendary, dtype: int64

Compute the mean and standard deviation of the total points for both legendary and non-legendary Pokemon.

In [12]:
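# numeric_only=True skips text columns such as Name, which recent pandas versions refuse to average
pokemon.groupby('Legendary').mean(numeric_only=True)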

Legendary,#,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation
False,353.315646,417.213605,67.182313,75.669388,71.559184,68.454422,68.892517,65.455782,3.284354
True,470.215385,637.384615,92.738462,116.676923,99.661538,122.184615,105.938462,100.184615,3.769231


In [16]:
legendary = pokemon[pokemon['Legendary']==True]
legendary_mean = legendary['Total'].mean()
legendary_mean

637.3846153846154

In [17]:
non_legendary = pokemon[pokemon['Legendary']==False]
non_legendary_mean = non_legendary['Total'].mean()
non_legendary_mean

417.21360544217686

In [13]:
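# numeric_only=True skips text columns such as Name (requires a pandas version that supports it for std)
pokemon.groupby('Legendary').std(numeric_only=True)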

Legendary,#,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation
False,208.590419,106.760417,24.808849,30.490153,30.408194,29.091705,25.66931,27.843038,1.673471
True,173.651095,60.937389,21.722164,30.348037,28.255131,31.104608,28.827004,22.952323,1.455262


In [18]:
legendary_std = legendary['Total'].std()
legendary_std

60.93738905315346

In [19]:
non_legendary_std = non_legendary['Total'].std()
non_legendary_std

106.76041745713022
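
Both statistics can also be computed in a single pass with `groupby(...).agg(...)`. The sketch below is only an illustration: it uses a small made-up frame (`toy`, with hypothetical values) instead of the real dataset, so the printed numbers mean nothing beyond showing the shape of the result.

```python
import pandas as pd

# Hypothetical toy data standing in for the real pokemon frame.
toy = pd.DataFrame({
    "Legendary": [False, False, False, True, True],
    "Total": [300, 400, 500, 600, 700],
})

# One row per group, one column per statistic.
# "std" is the sample standard deviation (ddof=1), the same as Series.std().
summary = toy.groupby("Legendary")["Total"].agg(["mean", "std", "count"])
print(summary)
```

On the real data, replacing `toy` with `pokemon` should reproduce the means and standard deviations computed in the separate cells above.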

The computation of the mean might give us a clue regarding how the statistical test may turn out; however, it certainly does not prove whether there is a significant difference between the two groups.

In the cell below, use the `ttest_ind` function in `scipy.stats` to compare the total points for legendary and non-legendary Pokemon. Since we do not have any information about the population, assume the variances are not equal.

In [14]:
from scipy.stats import ttest_ind
# variance parameter: equal_var=False (unequal variances, i.e. Welch's t-test)
# ttest_ind calculates the t-test for the means of two independent samples of scores.

In [57]:
t, p = ttest_ind(legendary['Total'], non_legendary['Total'], equal_var=False)

In [58]:
print('p =', p)
print('t =', t)

p = 9.357954335957446e-47
t = 25.8335743895517
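
As a sanity check, the Welch statistic can be recomputed by hand from the group means, standard deviations and counts shown earlier: the difference in means divided by the unpooled standard error. The sketch below copies those summary numbers from the outputs above; the variable names (`leg_mean`, `non_mean`, and so on) are made up for this example. `ttest_ind_from_stats` is the SciPy helper that runs the same test from summary statistics alone.

```python
import math
from scipy.stats import ttest_ind_from_stats

# Summary statistics copied from the outputs above.
leg_mean, leg_std, leg_n = 637.3846153846154, 60.93738905315346, 65
non_mean, non_std, non_n = 417.21360544217686, 106.76041745713022, 735

# Welch's statistic: difference in means over the unpooled standard error.
se = math.sqrt(leg_std**2 / leg_n + non_std**2 / non_n)
t_manual = (leg_mean - non_mean) / se

# The same test, computed by SciPy from the summary statistics.
t_scipy, p_scipy = ttest_ind_from_stats(
    leg_mean, leg_std, leg_n, non_mean, non_std, non_n, equal_var=False
)
print(t_manual, t_scipy, p_scipy)
```

Both values should agree with the `t` printed by `ttest_ind` above, up to floating-point rounding.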


What do you conclude from this test? Write your conclusions below.

In [38]:
p > 0.05

False

In [6]:
'''
The test measures whether the average (expected) value differs significantly across samples.
If we observe a large p-value, for example larger than 0.05 or 0.1,
then we cannot reject the null hypothesis of identical average scores.
If the p-value is smaller than the chosen threshold, e.g. 1%, 5% or 10%,
then we reject the null hypothesis of equal averages.

In this case the p-value (about 9.4e-47) is far smaller than 0.05, so we reject the null hypothesis of equal averages.
The legendary and non-legendary means are significantly different.
Because t is positive, the legendary mean is higher than the non-legendary mean.
'''

How about we try to compare the different types of Pokemon? In the cell below, list the types of Pokemon from column `Type 1` and the count of each type.

In [40]:
pokemon['Type 1'].value_counts()

Water       112
Normal       98
Grass        70
Bug          69
Psychic      57
Fire         52
Electric     44
Rock         44
Dragon       32
Ghost        32
Ground       32
Dark         31
Poison       28
Steel        27
Fighting     27
Ice          24
Fairy        17
Flying        4
Name: Type 1, dtype: int64

Since water is the largest group of Pokemon, compare the mean and standard deviation of water Pokemon to all other Pokemon.

In [49]:
water = pokemon[pokemon['Type 1']=='Water']
water_mean = water['Total'].mean()
water_mean

430.45535714285717

In [50]:
water_std = water['Total'].std()
water_std

113.18826606431458

In [53]:
all_wo_water = pokemon[pokemon['Type 1']!='Water']

In [54]:
all_wo_water_mean = all_wo_water['Total'].mean()
all_wo_water_mean

435.85901162790697

In [55]:
all_wo_water_std = all_wo_water['Total'].std()
all_wo_water_std

121.09168230208066

Perform a hypothesis test comparing the mean of total points for water Pokemon to all non-water Pokemon. Assume the variances are equal. 

In [59]:
# equal_var defaults to True, so this is the pooled-variance (Student's) t-test
t_w, p_w = ttest_ind(water['Total'], all_wo_water['Total'])

In [60]:
print('p =', p_w)
print('t =', t_w)

p = 0.6587140317488793
t = -0.4418547448849676
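
With equal variances assumed, the test pools the two sample variances into one estimate before forming the standard error. The sketch below redoes that arithmetic from the summary numbers printed above (112 water Pokemon, so 688 others out of 800); the variable names are made up for this example.

```python
import math
from scipy.stats import ttest_ind_from_stats

# Summary statistics copied from the outputs above.
w_mean, w_std, w_n = 430.45535714285717, 113.18826606431458, 112
o_mean, o_std, o_n = 435.85901162790697, 121.09168230208066, 688

# Pooled variance: the two sample variances weighted by their degrees of freedom.
dof = w_n + o_n - 2
pooled_var = ((w_n - 1) * w_std**2 + (o_n - 1) * o_std**2) / dof
t_manual = (w_mean - o_mean) / math.sqrt(pooled_var * (1 / w_n + 1 / o_n))

# equal_var=True is the default, matching the ttest_ind call above.
t_scipy, p_scipy = ttest_ind_from_stats(w_mean, w_std, w_n, o_mean, o_std, o_n)
print(t_manual, t_scipy, p_scipy)
```

The results should match the `t` and `p` printed above, up to floating-point rounding.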


Write your conclusion below.

In [61]:
p_w > 0.05

True

In [10]:
'''
In this case the p-value (about 0.66) is larger than 0.05, so we fail to reject the null hypothesis of equal averages.
The water and non-water Pokemon means are not significantly different.
t is negative because the sample mean of water Pokemon is slightly lower than the non-water mean, but the difference is not statistically significant.
'''

# Challenge 2 - Matched Pairs Test

In this challenge we will compare dependent samples of data describing our Pokemon. Our goal is to see whether there is a significant difference between each Pokemon's defense and attack scores. Our hypothesis is that the defense and attack scores are equal. In the cell below, import the `ttest_rel` function from `scipy.stats` and compare the two columns to see if there is a statistically significant difference between them.

In [62]:
from scipy.stats import ttest_rel
# ttest_rel >> Calculate the t-test on TWO RELATED samples of scores, a and b.
# This is a two-sided test for the null hypothesis that 2 related or repeated samples have identical average (expected) values.

In [64]:
t_scores, p_scores = ttest_rel(pokemon['Attack'], pokemon['Defense'])

In [65]:
print('p =', p_scores)
print('t =', t_scores)

p = 1.7140303479358558e-05
t = 4.325566393330478


In [66]:
p_scores > 0.05

False

Describe the results of the test in the cell below.

In [12]:
'''
The test measures whether the average score differs significantly between two related samples.
If we observe a large p-value, for example greater than 0.05 or 0.1,
then we cannot reject the null hypothesis of identical average scores.
If the p-value is smaller than the chosen threshold, e.g. 1%, 5% or 10%,
then we reject the null hypothesis of equal averages. Small p-values are associated with large t-statistics.

In this case the p-value (about 1.7e-05) is smaller than 0.05, so we reject the null hypothesis of equal averages.
The Attack and Defense means are significantly different.
Because t is positive, the mean Attack score is higher than the mean Defense score.
'''

We are also curious about whether there is a significant difference between the mean of special defense and the mean of special attack. Perform the hypothesis test in the cell below.

In [68]:
t_sp, p_sp = ttest_rel(pokemon['Sp. Atk'], pokemon['Sp. Def'])
print('p =', p_sp)
print('t =', t_sp)

p = 0.3933685997548122
t = 0.853986188453353


In [69]:
p_sp > 0.05

True

Describe the results of the test in the cell below.

In [14]:
'''
In this case the p-value (about 0.39) is larger than 0.05, so we fail to reject the null hypothesis of equal averages.
The Special Attack and Special Defense means are not significantly different.
t is positive, so the sample mean of Sp. Atk is slightly higher than that of Sp. Def, but the difference is not statistically significant.
'''


As you may recall, a two sample matched pairs test can also be expressed as a one sample test of the difference between the two dependent columns.

Import the `ttest_1samp` function and perform a one sample t-test of the difference between defense and attack. Test the hypothesis that the difference between the means is zero. Confirm that the results of the test are the same.

In [70]:
from scipy.stats import ttest_1samp
# Calculate the T-test for the mean of ONE group of scores.
# This is a two-sided test for the null hypothesis that the expected value (mean) of a sample of independent 
# observations "a" is equal to the given population mean, popmean.   
# ttest_1samp(a(series), popmean)

In [73]:
t_f, p_f = ttest_1samp(pokemon['Attack']-pokemon['Defense'], 0)

In [74]:
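# Check both values in one expression (only the last expression of a cell is displayed),
# using a tolerance rather than exact float equality.
np.allclose([t_f, p_f], [t_scores, p_scores])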

True

In [None]:
# confirmation, the results of the test are the same as our ttest_rel test.
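
The equivalence holds for any paired data, not just this dataset: `ttest_rel(a, b)` and `ttest_1samp(a - b, 0)` compute the same statistic. The sketch below illustrates this on synthetic scores. The arrays `attack` and `defense` are randomly generated stand-ins, not the real columns, and the seed is fixed only so the run is repeatable.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

# Hypothetical paired scores (synthetic, not the Pokemon data).
rng = np.random.default_rng(0)
attack = rng.normal(80, 30, size=50)
defense = attack + rng.normal(-5, 20, size=50)

# Paired test on the two columns...
t_pair, p_pair = ttest_rel(attack, defense)
# ...and a one-sample test on their differences against a mean of zero.
t_one, p_one = ttest_1samp(attack - defense, 0)

same = bool(np.isclose(t_pair, t_one) and np.isclose(p_pair, p_one))
print(same)
```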

# Bonus Challenge - The Chi-Square Test

The Chi-Square test is used to determine whether there is a statistically significant difference in frequencies. In other words, we are testing whether there is a relationship between categorical variables or whether the variables are independent. This test is an alternative to Fisher's exact test and is typically used when sample sizes are larger; with a large enough sample size, both tests produce similar results. Read more about the Chi-Squared test [here](https://en.wikipedia.org/wiki/Chi-squared_test).

In the cell below, create a contingency table using `pd.crosstab` comparing whether a Pokemon is legendary or not and whether the Type 1 of a Pokemon is water or not.
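
As a rough illustration of the shape of such a table, here is a minimal sketch on a made-up six-row frame (`toy`, with hypothetical values). It is not the solution on the real data; the boolean series `is_water` is a helper introduced for this example.

```python
import pandas as pd

# Hypothetical toy data with the same two columns as the real dataset.
toy = pd.DataFrame({
    "Type 1": ["Water", "Fire", "Water", "Grass", "Water", "Fire"],
    "Legendary": [False, False, True, False, False, True],
})

# Rows: legendary or not; columns: water or not.
is_water = (toy["Type 1"] == "Water").rename("Water")
table = pd.crosstab(toy["Legendary"], is_water)
print(table)
```

Each cell counts the rows that fall into one combination of the two yes/no categories, which is the input format a chi-squared test expects.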

In [17]:
# Your code here:



Perform a chi-squared test using the `chi2_contingency` function in `scipy.stats`. You can read the documentation of the function [here](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.chi2_contingency.html).
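
For orientation, `chi2_contingency` returns the test statistic, the p-value, the degrees of freedom, and the table of expected counts under independence. The sketch below runs it on a made-up 2x2 table (`observed`, hypothetical counts) rather than the Pokemon table. Note that for 2x2 tables SciPy applies Yates' continuity correction by default (`correction=True`).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table of counts for two yes/no variables.
observed = np.array([[10, 20],
                     [20, 10]])

# Unpack the four returned values.
chi2, p, dof, expected = chi2_contingency(observed)
print("chi2 =", chi2, "p =", p, "dof =", dof)
print(expected)
```

A p-value below the chosen significance level would lead us to reject the null hypothesis that the two variables are independent.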

In [18]:
# Your code here:



At the 95% confidence level (a significance level of 0.05), should we reject the null hypothesis?

In [19]:
# Your answer here:

