# Bonus Challenge 1 - T-test

In statistics, a t-test is used to test whether the means of two data samples differ significantly. This challenge uses two types of t-test:

* **Student's t-test** (a.k.a. independent or uncorrelated t-test). This type of t-test compares samples drawn from **two independent populations** (e.g. test scores of students in two different classes). `scipy` provides the [`ttest_ind`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_ind.html) function to conduct Student's t-test.

* **Paired t-test** (a.k.a. dependent or correlated t-test). This type of t-test compares paired measurements taken on **the same population** (e.g. scores of different tests of students in the same class). `scipy` provides the [`ttest_rel`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_rel.html) function to conduct a paired t-test.
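
As a rough illustration of the difference between the two tests, the sketch below runs both on the same small set of scores. The numbers and variable names are invented for this example and are not part of the Pokemon dataset. When the measurements really are paired, the paired test works on the per-student differences and can detect a consistent shift that the independent test misses.

```python
from scipy.stats import ttest_ind, ttest_rel

# Hypothetical scores of the same six students on two tests (illustrative data).
test_1 = [72, 85, 90, 66, 78, 88]
test_2 = [75, 88, 91, 70, 80, 93]

# Independent test: treats the two lists as unrelated samples.
independent_p = ttest_ind(test_1, test_2).pvalue

# Paired test: compares each student's two scores with each other.
paired_p = ttest_rel(test_1, test_2).pvalue

print(f"independent p-value: {independent_p:.4f}")
print(f"paired p-value:      {paired_p:.4f}")
```

Every student improves by a few points, so the paired p-value comes out much smaller than the independent one, where the small shift is hidden by the spread between students.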

Both functions return a test statistic and a number called the **p-value**. If the p-value is below 0.05, we reject the null hypothesis and declare the difference significant. If the p-value is between 0.05 and 0.1, we may also reject the null hypothesis, but with less confidence. If the p-value is above 0.1, we do not reject the null hypothesis.
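
The three-way reading of the p-value described above can be sketched as a small helper. The function name `interpret_p_value` and its parameter names are hypothetical, and treating a p-value of exactly 0.1 as "do not reject" is an assumption, since the text does not say which side the boundary falls on.

```python
def interpret_p_value(p_value, strong=0.05, weak=0.1):
    """Map a p-value to the three-way reading used in this notebook."""
    if p_value < strong:
        return "reject the null hypothesis (confident)"
    if p_value < weak:
        return "reject the null hypothesis (less confident)"
    # Assumption: values at or above the weak threshold are not rejected.
    return "do not reject the null hypothesis"


for p in (0.003, 0.07, 0.4):
    print(p, "->", interpret_p_value(p))
```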

Read more about the t-test in [this article](http://b.link/test50) and [this Quora](http://b.link/unpaired97). Make sure you understand when to use which type of t-test. 

In [3]:
# Import libraries
import pandas as pd
from sqlalchemy import create_engine
import pymysql

#### Import dataset

In this challenge we will work on the Pokemon dataset you have already used. The goal is to test whether different groups of pokemon (e.g. Legendary vs Normal, Generation 1 vs 2, single-type vs dual-type) have different stats (e.g. HP, Attack, Defense, etc.). Use Ironhack's database to load the data (db: pokemon, table: pokemon_stats). 

In [4]:
# Your code here:
driver   = 'mysql+pymysql:'
user     = 'data-guest_viewer'
password = 'guest_ironhack'
ip       = '127.0.0.1'
database = 'pokemon'

In [5]:
connection_string = f'{driver}//{user}:{password}@{ip}/{database}'

In [6]:
engine = create_engine(connection_string)
print(engine)

Engine(mysql+pymysql://data-guest_viewer:***@127.0.0.1/pokemon)


In [8]:
pokemon_show = pd.read_sql('SHOW TABLES', engine)

pokemon_show

| | Tables_in_pokemon |
|---|---|
| 0 | pokemon_stats |


In [9]:
pokemon = pd.read_sql('SELECT * FROM pokemon_stats', engine)

pokemon

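| | # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 795 | 719 | Diancie | Rock | Fairy | 600 | 50 | 100 | 150 | 100 | 150 | 50 | 6 | True |
| 796 | 719 | DiancieMega Diancie | Rock | Fairy | 700 | 50 | 160 | 110 | 160 | 110 | 110 | 6 | True |
| 797 | 720 | HoopaHoopa Confined | Psychic | Ghost | 600 | 80 | 110 | 60 | 150 | 130 | 70 | 6 | True |
| 798 | 720 | HoopaHoopa Unbound | Psychic | Dark | 680 | 80 | 160 | 60 | 170 | 130 | 80 | 6 | True |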


#### First we want to define a function with which we can test the means of a feature set of two samples. 

In the next cell you'll see an annotation of the Python function that explains what the function does, its arguments, and its return value. This type of annotation is called a **docstring**, a convention used among Python developers. The docstring convention lets developers write consistent technical documentation for their code so that others can read it. It also allows some tools to parse the docstrings automatically and display user-friendly documentation.

Follow the specifications of the docstring and complete the function.

In [14]:
# Import the independent two-sample t-test from scipy
from scipy.stats import ttest_ind

# ttest_ind is a two-sided test of the null hypothesis that two independent
# samples have identical average (expected) values

In [26]:
def t_test_features(s1, s2, features=['HP', 'Attack', 'Defense', 'Sp. Atk', 
                                      'Sp. Def', 'Speed', 'Total']):
    """Test the means of a feature set of two samples with ttest_ind
    
    Args:
        s1 (dataframe): sample 1
        s2 (dataframe): sample 2
        features (list): a list of features to test
    
    Returns:
        dict: a dictionary of t-test p-values for each feature 
        where the feature name is the key and the p-value is the value
    """
    results = {}
    
    # Read the p-value from the .pvalue attribute of the ttest_ind result.
    # We do not assume that both samples have the same variance
    # => equal_var=False (Welch's t-test)
    
    for feature in features:
        results[feature] = ttest_ind(s1[feature], s2[feature], 
                                     equal_var=False).pvalue
        
    return results
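
The `equal_var=False` argument switches `ttest_ind` from Student's t-test, which pools the two variances, to Welch's t-test, which does not. A minimal sketch on made-up samples with very different spreads and sizes (the data below is illustrative only, not taken from the Pokemon table) suggests how the two settings can give different results:

```python
from scipy.stats import ttest_ind

# Illustrative samples: similar means, very different variances and sizes.
tight = [10, 11, 9, 10, 10, 11, 9, 10]
wide = [5, 15, 20, 0, 12]

student = ttest_ind(tight, wide, equal_var=True)   # pooled variance
welch = ttest_ind(tight, wide, equal_var=False)    # separate variances

print("Student p-value:", student.pvalue)
print("Welch p-value:  ", welch.pvalue)
```

Welch's version is a reasonable default when there is no particular reason to believe the two groups share the same variance, which is the choice made in `t_test_features`.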

#### Using the `t_test_features` function, conduct a t-test for Legendary vs non-Legendary pokemons.

*Hint: your output should look like below:*

```
{'HP': 1.0026911708035284e-13,
 'Attack': 2.520372449236646e-16,
 'Defense': 4.8269984949193316e-11,
 'Sp. Atk': 1.5514614112239812e-21,
 'Sp. Def': 2.2949327864052826e-15,
 'Speed': 1.049016311882451e-18,
 'Total': 9.357954335957446e-47}
 ```

In [28]:
#creating 2 sub df for legendary & non-legendary
legendary     = pokemon[pokemon['Legendary'] == 'True']
non_legendary = pokemon[pokemon['Legendary'] == 'False']

#Ho = for each of the 7 features, legendary & non-legendary have the same mean
#Ha = for each of the 7 features, legendary & non-legendary have different means

pokemon_t_test = t_test_features(legendary, non_legendary)

pokemon_t_test

{'HP': 1.0026911708035284e-13,
 'Attack': 2.520372449236646e-16,
 'Defense': 4.826998494919331e-11,
 'Sp. Atk': 1.5514614112239816e-21,
 'Sp. Def': 2.2949327864052826e-15,
 'Speed': 1.0490163118824507e-18,
 'Total': 9.357954335957444e-47}

In [33]:
alpha = 0.05

for feature in pokemon_t_test:
    p_value = pokemon_t_test[feature]
    if p_value < alpha:
        print(feature,': \n Null hypothesis rejected for', feature)
    else:
        print(feature,': \n Null hypothesis cannot be rejected for', feature)

HP : 
 Null hypothesis rejected for HP
Attack : 
 Null hypothesis rejected for Attack
Defense : 
 Null hypothesis rejected for Defense
Sp. Atk : 
 Null hypothesis rejected for Sp. Atk
Sp. Def : 
 Null hypothesis rejected for Sp. Def
Speed : 
 Null hypothesis rejected for Speed
Total : 
 Null hypothesis rejected for Total


#### From the test results above, what conclusion can you make? Do Legendary and non-Legendary pokemons have significantly different stats on each feature?

In [None]:
# We can reject the null hypothesis for every tested feature (all p-values < 0.05)
# and adopt the alternative hypothesis:
# legendary & non-legendary pokemons have significantly different means on each feature

#### Next, conduct t-test for Generation 1 and Generation 2 pokemons.

In [35]:
#creating 2 sub df for 1st & 2nd generations
generation1 = pokemon[pokemon['Generation'] == 1]
generation2 = pokemon[pokemon['Generation'] == 2]

In [37]:
#Ho = for each of the 7 features, generation 1 & generation 2 have the same mean
#Ha = for each of the 7 features, generation 1 & generation 2 have different means

generation_t_test = t_test_features(generation1, generation2)

generation_t_test

{'HP': 0.14551697834219626,
 'Attack': 0.24721958967217725,
 'Defense': 0.5677711011725426,
 'Sp. Atk': 0.12332165977104394,
 'Sp. Def': 0.18829872292645752,
 'Speed': 0.00239265937312135,
 'Total': 0.5631377907941676}

In [47]:
alpha = 0.05

for feature in generation_t_test:
    p_value = generation_t_test[feature]
    if p_value < alpha:
        print(feature, ': p_value', round(p_value, ndigits=5),\
              '\n Null hypothesis rejected for', feature)
    else:
        print(feature, ': p_value', round(p_value, ndigits=5),\
              '\n Null hypothesis cannot be rejected for', feature)

HP : p_value 0.14552 
 Null hypothesis cannot be rejected for HP
Attack : p_value 0.24722 
 Null hypothesis cannot be rejected for Attack
Defense : p_value 0.56777 
 Null hypothesis cannot be rejected for Defense
Sp. Atk : p_value 0.12332 
 Null hypothesis cannot be rejected for Sp. Atk
Sp. Def : p_value 0.1883 
 Null hypothesis cannot be rejected for Sp. Def
Speed : p_value 0.00239 
 Null hypothesis rejected for Speed
Total : p_value 0.56314 
 Null hypothesis cannot be rejected for Total
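
As an alternative to printing one line per feature, the same decision can be collected in a small table. The sketch below copies the p-values from the Generation 1 vs Generation 2 output above; the names `generation_p_values`, `summary` and `reject_null` are chosen for this illustration only.

```python
import pandas as pd

# p-values copied from the Generation 1 vs Generation 2 output above.
generation_p_values = {
    'HP': 0.14551697834219626,
    'Attack': 0.24721958967217725,
    'Defense': 0.5677711011725426,
    'Sp. Atk': 0.12332165977104394,
    'Sp. Def': 0.18829872292645752,
    'Speed': 0.00239265937312135,
    'Total': 0.5631377907941676,
}
alpha = 0.05

# One row per feature, with a boolean column for the test decision.
summary = pd.Series(generation_p_values, name='p_value').to_frame()
summary['reject_null'] = summary['p_value'] < alpha

print(summary.sort_values('p_value'))
```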


#### What conclusions can you make?

In [None]:
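# For every feature except Speed, the p-value is above alpha = 0.05, so the
# difference between Generation 1 and Generation 2 is not statistically significant.
# We cannot reject the null hypothesis for those features.

# For Speed, the p-value is 0.00239 < 0.05, so we reject the null hypothesis:
# the two generations differ significantly in mean Speed.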

#### Compare pokemons who have single type vs those having two types.

In [53]:
#creating 2 sub df for pokemon with 1 type & those with 2 types
type_1 = pokemon[pokemon['Type 2'] == '']
type_2 = pokemon[pokemon['Type 2'] != '']

In [54]:
# Ho : pokemons with 1 or 2 types have similar feature stats
# Ha : pokemons with 1 or 2 types have different feature stats

type_t_test = t_test_features(type_1, type_2)
type_t_test

{'HP': 0.11314389855379421,
 'Attack': 0.00014932578145948305,
 'Defense': 2.7978540411514693e-08,
 'Sp. Atk': 0.00013876216585667845,
 'Sp. Def': 0.00010730610934512779,
 'Speed': 0.02421703281819094,
 'Total': 1.1157056505229964e-07}

In [55]:
alpha = 0.05

for feature in type_t_test:
    p_value = type_t_test[feature]
    if p_value < alpha:
        print(feature, ': p_value', round(p_value, ndigits=5),\
              '\n Null hypothesis rejected for', feature)
    else:
        print(feature, ': p_value', round(p_value, ndigits=5),\
              '\n Null hypothesis cannot be rejected for', feature)

HP : p_value 0.11314 
 Null hypothesis cannot be rejected for HP
Attack : p_value 0.00015 
 Null hypothesis rejected for Attack
Defense : p_value 0.0 
 Null hypothesis rejected for Defense
Sp. Atk : p_value 0.00014 
 Null hypothesis rejected for Sp. Atk
Sp. Def : p_value 0.00011 
 Null hypothesis rejected for Sp. Def
Speed : p_value 0.02422 
 Null hypothesis rejected for Speed
Total : p_value 0.0 
 Null hypothesis rejected for Total
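
The `p_value 0.0` lines for Defense and Total come from rounding very small numbers to five decimals; the p-values themselves are not zero. One way to keep them readable, sketched here with the two values copied from the dictionary above, is to format with significant digits instead of decimal places:

```python
# p-values copied from the single-type vs dual-type output above.
p_defense = 2.7978540411514693e-08
p_total = 1.1157056505229964e-07

# Rounding to 5 decimals collapses both values to 0.0.
print(round(p_defense, ndigits=5), round(p_total, ndigits=5))

# The 'g' format keeps 3 significant digits and switches to scientific notation.
print(f"{p_defense:.3g}")
print(f"{p_total:.3g}")
```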


#### What conclusions can you make?

In [None]:
# For HP the p-value is above alpha = 0.05: the result is not statistically significant.
# For the other 6 features the p-value is below 0.05: the results are statistically significant.

# When comparing 1-type & 2-type pokemons,
# we can reject the null hypothesis for all features
# EXCEPT HP

#### Now, we want to test whether there are significant differences between `Attack` and `Defense`, and between `Sp. Atk` and `Sp. Def`, across all pokemons. Please write your code below.

*Hint: are you comparing different populations or the same population?*

In [62]:
# Here we are comparing two features measured on the same population
# We need a different t-test than before: the ttest_rel function,
# i.e. a two-sided test of the null hypothesis that 2 related samples have the same mean

In [63]:
from scipy.stats import ttest_rel

In [70]:
alpha = 0.05

# Ho : Attack & Defense have the same mean
# Ha : Attack & Defense have significantly different means

ttest_rel(pokemon['Attack'], pokemon['Defense'])

Ttest_relResult(statistic=4.325566393330478, pvalue=1.7140303479358558e-05)

In [71]:
# Ho : Sp. Atk & Sp. Def have the same mean
# Ha : Sp. Atk & Sp. Def have significantly different means

ttest_rel(pokemon['Sp. Atk'], pokemon['Sp. Def'])

Ttest_relResult(statistic=0.853986188453353, pvalue=0.3933685997548122)
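
It may help to see why pairing matters: a paired t-test is equivalent to a one-sample t-test on the per-row differences, testing whether their mean is zero. The sketch below checks this on invented numbers (the arrays `first` and `second` are illustrative and not taken from the Pokemon table).

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

# Illustrative paired measurements on the same seven individuals.
first = np.array([50, 60, 80, 100, 55, 65, 85])
second = np.array([48, 63, 83, 120, 45, 58, 78])

# Paired test on the two columns.
paired = ttest_rel(first, second)

# One-sample test on the differences against a mean of zero.
one_sample = ttest_1samp(first - second, popmean=0)

print("paired:    ", paired.statistic, paired.pvalue)
print("one-sample:", one_sample.statistic, one_sample.pvalue)
```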

In [72]:
#creating dict to store p_value for both test
results = {'Attack VS. Def':
           ttest_rel(pokemon['Attack'], pokemon['Defense']).pvalue,
          'Sp. Atk VS. Sp. Def':
           ttest_rel(pokemon['Sp. Atk'], pokemon['Sp. Def']).pvalue}

In [73]:
# Hypothesis testing result
for feature in results:
    p_value = results[feature]
    if p_value < alpha:
        print(feature, ': p_value', round(p_value, ndigits=5),\
              '\n Null hypothesis rejected for', feature)
    else:
        print(feature, ': p_value', round(p_value, ndigits=5),\
              '\n Null hypothesis cannot be rejected for', feature)

Attack VS. Def : p_value 2e-05 
 Null hypothesis rejected for Attack VS. Def
Sp. Atk VS. Sp. Def : p_value 0.39337 
 Null hypothesis cannot be rejected for Sp. Atk VS. Sp. Def


#### What conclusions can you make?

In [None]:
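# Attack & Defense have a statistically significant difference in means
# (p-value ~ 1.7e-05 < 0.05), so we reject the null hypothesis

# The null hypothesis cannot be rejected for Sp. Atk VS. Sp. Def (p-value ~ 0.393):
# the data show no significant difference between their means,
# which is not the same as proving that the means are equal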