# Challenge 1 - T-test

In statistics, a t-test is used to check whether the means of two samples differ significantly. This challenge uses two types of t-test:

* **Student's t-test** (also known as the independent or uncorrelated t-test). This test compares samples from two independent populations (for example, the test scores of students in two different classes). `scipy` provides the [`ttest_ind`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_ind.html) function to perform Student's t-test.

* **Paired t-test** (also known as the dependent or correlated t-test). This test compares samples from **the same population** (for example, the scores on different tests taken by students in the same class). `scipy` provides the [`ttest_rel`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_rel.html) function to perform the paired t-test.
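As a minimal sketch of how the two tests differ in practice, the block below runs both on a small made-up set of scores. The numbers and variable names are illustrative assumptions and are not part of the Pokémon data.

```python
from scipy.stats import ttest_ind, ttest_rel

# Hypothetical scores for the same six students on two tests
# (illustrative values only).
test_1 = [72, 65, 80, 58, 90, 77]
test_2 = [75, 69, 82, 63, 93, 81]

# Independent test: treats the two lists as unrelated samples.
stat_ind, p_ind = ttest_ind(test_1, test_2)

# Paired test: works on the per-student differences, so it accounts
# for each pair of scores coming from the same student.
stat_rel, p_rel = ttest_rel(test_1, test_2)

print(f"independent p-value: {p_ind:.4f}")
print(f"paired p-value:      {p_rel:.4f}")
```

On paired data like this, the paired test picks up the consistent per-student shift, while the independent test is dominated by the large spread between students.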

Both functions return a test statistic and a **p-value**. The null hypothesis is that the two means are equal. If the p-value is below 0.05, we can confidently reject the null hypothesis and declare the difference significant. If the p-value is between 0.05 and 0.1, we can also reject the null hypothesis, but not with high confidence. If the p-value is above 0.1, we do not reject the null hypothesis.
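The decision rule above can be sketched as a small helper. The function name `interpret_p_value` is hypothetical, and the 0.05 and 0.1 cut-offs are simply the conventions stated in the text.

```python
def interpret_p_value(p_value):
    """Map a p-value to the decision described in the text (0.05 / 0.1 cut-offs)."""
    if p_value < 0.05:
        return "reject H0 (significant difference)"
    if p_value <= 0.1:
        return "reject H0 with low confidence"
    return "fail to reject H0"


# Try the helper on three example p-values, one per branch.
for p in (0.003, 0.07, 0.4):
    print(p, "->", interpret_p_value(p))
```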

Read more about the t-test in [this article](https://researchbasics.education.uconn.edu/t-test/) and [this Quora page](https://www.quora.com/What-is-the-difference-between-a-paired-and-unpaired-t-test). Make sure you understand when to use each type of t-test.

In [78]:
# Import libraries

import pandas as pd
import scipy
from scipy.stats import levene, ttest_ind, bartlett, f_oneway, f, ttest_rel

#### Import the dataset

In this challenge we will work with the Pokémon dataset you used last week. The goal is to test whether different groups of Pokémon (for example, Legendary vs. normal, Generation 1 vs. Generation 2, single-type vs. dual-type) have different stats (for example, HP, Attack, Defense, etc.).

In [3]:
pokemon = pd.read_csv('/home/ubuntu/Ironhack_all/Ironhack_21thLab_A-B-testing/data/Pokemon.csv')

pokemon.head()

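|   | # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary |
|---|---|------|--------|--------|-------|----|--------|---------|---------|---------|-------|------------|-----------|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |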


#### First, we want to define a function that tests the means of a set of features across two samples.

In the next cell you will see the annotations of the Python function, which explain what the function does, its arguments, and its return value. This kind of annotation is called a **docstring**, a convention used among Python developers. The docstring convention lets developers write consistent technical documentation for their code so that others can read it. It also allows some websites to parse docstrings automatically and display user-friendly documentation.
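As a minimal illustration of the convention, here is a toy function with a docstring. The name `mean_gap` and the `Args`/`Returns` layout are assumptions chosen for this sketch; other docstring styles exist.

```python
def mean_gap(values_a, values_b):
    """Return the difference between the means of two samples.

    Args:
        values_a (list of float): first sample.
        values_b (list of float): second sample.

    Returns:
        float: mean(values_a) - mean(values_b).
    """
    return sum(values_a) / len(values_a) - sum(values_b) / len(values_b)


# The docstring is stored on the function object; help() displays the same text.
print(mean_gap.__doc__)
print(mean_gap([1, 2, 3], [1, 1, 1]))
```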

Follow the docstring specification and complete the function.

In [61]:
Legendaries = pokemon[pokemon['Legendary'] == True]
Not_Legendaries = pokemon[pokemon['Legendary'] == False]


def t_test_features(columns, group1, group2):
    """Test whether the means of a set of features differ between two samples.

    Args:
        columns (list): names of the feature columns to test.
        group1 (pandas.DataFrame): first sample.
        group2 (pandas.DataFrame): second sample.

    Returns:
        dict: p-value of the t-test for each column, keyed by column name.
    """
    # Build a fresh dict on every call so that results from different
    # comparisons never share (and overwrite) the same object.
    p_values = {}
    for column in columns:
        # equal_var=False runs Welch's t-test; the reference solution
        # assumed unequal variances for every column.
        # Alternative (not used here): run levene(group1[column], group2[column])
        # first and pass equal_var=True when its p-value is above 0.05.
        stat, p_value = ttest_ind(group1[column], group2[column], equal_var=False)
        p_values[column] = p_value
    return p_values


def mean_differences(columns, group1, group2):
    """Return mean(group1) - mean(group2) for each column, keyed by column name."""
    mean_difference_dict = {}
    for column in columns:
        mean_difference_dict[column] = group1[column].mean() - group2[column].mean()
    return mean_difference_dict


columns_for_function = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total']

t_test_features(columns_for_function, Legendaries, Not_Legendaries)

{'HP': 1.0026911708035284e-13,
 'Attack': 2.520372449236646e-16,
 'Defense': 4.8269984949193316e-11,
 'Sp. Atk': 1.5514614112239812e-21,
 'Sp. Def': 2.2949327864052826e-15,
 'Speed': 1.049016311882451e-18,
 'Total': 9.357954335957446e-47}

In [60]:
mean_differences (columns_for_function,Legendaries,Not_Legendaries)

{'HP': 25.55614861329147,
 'Attack': 41.00753532182104,
 'Defense': 28.102354788069064,
 'Sp. Atk': 53.73019361590791,
 'Sp. Def': 37.04594453165882,
 'Speed': 34.72883307169022,
 'Total': 220.1710099424385}
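Passing `equal_var=False` to `ttest_ind` selects Welch's t-test, which does not pool the two variances. As a rough sketch of what that call computes, the block below rebuilds the statistic and p-value by hand and compares them with `scipy`. The samples are synthetic and the variable names are assumptions made for this example.

```python
import numpy as np
from scipy.stats import t, ttest_ind

# Synthetic samples with different sizes and spreads (illustrative only).
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=100, scale=10, size=40)
sample_b = rng.normal(loc=90, scale=25, size=120)

# Welch's t statistic: mean difference over the unpooled standard error.
var_a, var_b = sample_a.var(ddof=1), sample_b.var(ddof=1)
n_a, n_b = len(sample_a), len(sample_b)
se_sq = var_a / n_a + var_b / n_b
t_manual = (sample_a.mean() - sample_b.mean()) / np.sqrt(se_sq)

# Welch-Satterthwaite approximation for the degrees of freedom.
dof = se_sq**2 / ((var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1))

# Two-sided p-value from the t distribution.
p_manual = 2 * t.sf(abs(t_manual), dof)

# Compare the hand-computed values with scipy's result.
t_scipy, p_scipy = ttest_ind(sample_a, sample_b, equal_var=False)
print(np.isclose(t_manual, t_scipy), np.isclose(p_manual, p_scipy))
```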

H0: there is no difference in variance between the groups. If p_value < 0.05, we reject H0 and conclude that the variances DO differ.

In [31]:
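# The reference solution assumed unequal variances throughout without testing.
# Here the assumption is checked for HP with three variance tests.

import numpy as np

# F-test: ratio of the two sample variances (ddof=1 gives the unbiased estimate)
var_legendary = np.var(Legendaries['HP'], ddof=1)
var_not_leg = np.var(Not_Legendaries['HP'], ddof=1)
F = var_legendary/var_not_leg
dof_leg = len(Legendaries['HP']) - 1
dof_not_leg = len(Not_Legendaries['HP']) - 1

# One-sided p-value: probability of an F ratio at least this large under H0
p_value_f = 1 - f.cdf(F, dof_leg,dof_not_leg)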

stat_levene, p_value_levene = levene(Legendaries['HP'], Not_Legendaries['HP'])
stat_bartlett, p_value_bartlett = bartlett(Legendaries['HP'], Not_Legendaries['HP'])

print(p_value_levene)
print(p_value_bartlett)
print(p_value_f)

0.5758428070138398
0.1657970515126399
0.9092354722941887
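One way to act on a variance check is to let it choose between Student's and Welch's t-test. The sketch below assumes that Levene's test at the 0.05 level is the deciding criterion. The helper name `t_test_with_variance_check` and the toy arrays are hypothetical.

```python
import numpy as np
from scipy.stats import levene, ttest_ind


def t_test_with_variance_check(sample_a, sample_b, alpha=0.05):
    """Choose Student's or Welch's t-test from Levene's test.

    Returns a (label, p_value) tuple, where label names the test that was run.
    """
    _, p_levene = levene(sample_a, sample_b)
    # Failing to reject equal variances -> pooled (Student's) t-test.
    equal_var = bool(p_levene > alpha)
    _, p_value = ttest_ind(sample_a, sample_b, equal_var=equal_var)
    return ("student" if equal_var else "welch"), p_value


# Toy data: a shifted copy has the same spread, a scaled copy does not.
base = np.arange(50.0)
same_spread = t_test_with_variance_check(base, base + 10)
different_spread = t_test_with_variance_check(base, base * 10)
print(same_spread[0], different_spread[0])
```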


#### Using the `t_test_features` function, run the t-test for Legendary vs. non-Legendary Pokémon.

*Hint: your result should look like the following:*

```
{'HP': 1.0026911708035284e-13,
 'Attack': 2.520372449236646e-16,
 'Defense': 4.8269984949193316e-11,
 'Sp. Atk': 1.5514614112239812e-21,
 'Sp. Def': 2.2949327864052826e-15,
 'Speed': 1.049016311882451e-18,
 'Total': 9.357954335957446e-47}
 ```

#### From the test results above, what conclusion can you draw? Do Legendary and non-Legendary Pokémon have significantly different stats for each feature?

Every p-value above is far below 0.05, so there are ***SIGNIFICANT*** differences between Legendary and non-Legendary Pokémon in every stat tested. The mean differences are all positive, so Legendary Pokémon score higher on average in each feature.

#### Next, run the t-test for Generation 1 and Generation 2 Pokémon.

In [45]:
gen1 = pokemon[pokemon['Generation'] == 1]
gen2 = pokemon[pokemon['Generation'] == 2]

result_gen = t_test_features(columns_for_function,gen1,gen2)

result_gen


{'HP': 0.14551697834219623,
 'Attack': 0.24721958967217725,
 'Defense': 0.5677711011725426,
 'Sp. Atk': 0.12332165977104394,
 'Sp. Def': 0.18829872292645752,
 'Speed': 0.00239265937312135,
 'Total': 0.5631377907941676}

In [66]:
result_mean_gen = mean_differences(columns_for_function,gen1,gen2)

result_mean_gen

{'HP': -5.3882700613775825,
 'Attack': 4.610252330075028,
 'Defense': -2.5253466696976687,
 'Sp. Atk': 5.875880882018649,
 'Sp. Def': -4.815298931575356,
 'Speed': 10.773016594680605,
 'Total': 8.530234144123654}
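The p-values and mean differences above are easier to read side by side. As a sketch, the block below copies the two Generation 1 vs. Generation 2 dictionaries into a single table and flags the features below the 0.05 threshold. The variable names and the table layout are assumptions made for this example.

```python
import pandas as pd

# Values copied from the Generation 1 vs Generation 2 results above.
p_values = {'HP': 0.14551697834219623, 'Attack': 0.24721958967217725,
            'Defense': 0.5677711011725426, 'Sp. Atk': 0.12332165977104394,
            'Sp. Def': 0.18829872292645752, 'Speed': 0.00239265937312135,
            'Total': 0.5631377907941676}
mean_diffs = {'HP': -5.3882700613775825, 'Attack': 4.610252330075028,
              'Defense': -2.5253466696976687, 'Sp. Atk': 5.875880882018649,
              'Sp. Def': -4.815298931575356, 'Speed': 10.773016594680605,
              'Total': 8.530234144123654}

# One row per feature, one column per quantity.
summary = pd.DataFrame({'p_value': pd.Series(p_values),
                        'mean_diff': pd.Series(mean_diffs)})
summary['significant'] = summary['p_value'] < 0.05
print(summary)
```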

#### What conclusions can you draw?

In [73]:
for key, value in result_gen.items():
    if value < 0.05:
        print(f'There is a significant difference in {key} between generations')
        if key in result_mean_gen:
            print(f'The difference between gen1 and gen2 for {key} is {result_mean_gen[key]}')
    else:
        print(f'We fail to reject the null hypothesis, there is no significant difference in {key}')
    print()


We fail to reject the null hypothesis, there is no significant difference in HP

We fail to reject the null hypothesis, there is no significant difference in Attack

We fail to reject the null hypothesis, there is no significant difference in Defense

We fail to reject the null hypothesis, there is no significant difference in Sp. Atk

We fail to reject the null hypothesis, there is no significant difference in Sp. Def

There is a significant difference in Speed between generations
The difference between gen1 and gen2 for Speed is 10.773016594680605

We fail to reject the null hypothesis, there is no significant difference in Total

#### Compare Pokémon that have a single type vs. those that have two types.

In [54]:
# Label each Pokémon with its number of types:
# 2 when both 'Type 1' and 'Type 2' are present, otherwise 1.
pokemon['one_or_two'] = np.select([(pokemon['Type 1'].notna()) & (pokemon['Type 2'].notna())], [2], default=1)

one_type_only = pokemon[pokemon['one_or_two'] == 1]
two_types = pokemon[pokemon['one_or_two'] == 2]

result_one_or_two = t_test_features(columns_for_function,one_type_only,two_types)

result_one_or_two

{'HP': 0.11314389855379421,
 'Attack': 0.00014932578145948305,
 'Defense': 2.7978540411514693e-08,
 'Sp. Atk': 0.00013876216585667842,
 'Sp. Def': 0.00010730610934512779,
 'Speed': 0.02421703281819094,
 'Total': 1.1157056505229961e-07}

In [75]:
mean_result_one_or_two = mean_differences(columns_for_function,one_type_only,two_types)
mean_result_one_or_two

{'HP': -2.8829190758679317,
 'Attack': -8.648006307726973,
 'Defense': -12.090836274436185,
 'Sp. Atk': -8.76333508547971,
 'Sp. Def': -7.591124127055636,
 'Speed': -4.636254411654278,
 'Total': -44.612475282220714}

#### What conclusions can you draw?

In [77]:
for key, value in result_one_or_two.items():
    if value < 0.05:
        print(f'There is a significant difference in {key} between single-type and dual-type Pokémon')
        if key in mean_result_one_or_two:
            print(f'The difference between one type and two types for {key} is {mean_result_one_or_two[key]}')
    else:
        print(f'We fail to reject the null hypothesis, there is no significant difference in {key}')
    print()

We fail to reject the null hypothesis, there is no significant difference in HP

There is a significant difference in Attack between single-type and dual-type Pokémon
The difference between one type and two types for Attack is -8.648006307726973

There is a significant difference in Defense between single-type and dual-type Pokémon
The difference between one type and two types for Defense is -12.090836274436185

There is a significant difference in Sp. Atk between single-type and dual-type Pokémon
The difference between one type and two types for Sp. Atk is -8.76333508547971

There is a significant difference in Sp. Def between single-type and dual-type Pokémon
The difference between one type and two types for Sp. Def is -7.591124127055636

There is a significant difference in Speed between single-type and dual-type Pokémon
The difference between one type and two types for Speed is -4.636254411654278

There is a significant difference in Total between single-type and dual-type Pokémon
The difference between one type and two types for Total is -44.612475282220714

#### Now we want to test whether there are significant differences between `Attack` vs `Defense` and `Sp. Atk` vs `Sp. Def` across all Pokémon. Please write your code below.

*Hint: are you comparing different populations or the same population?*

#### What conclusions can you draw?

In [85]:
# We have to use paired sample t-tests:

stat, p_value = ttest_rel(pokemon['Attack'], pokemon['Defense'])

print(p_value)
print(pokemon['Attack'].mean() - pokemon['Defense'].mean())

print('We reject the null hypothesis, there IS statistical significance. Defense seems to be on average 5 pts lower than Attack')

1.7140303479358558e-05
5.158749999999998
We reject the null hypothesis, there IS statistical significance. Defense seems to be on average 5 pts lower than Attack


In [87]:
stat, p_value = ttest_rel(pokemon['Sp. Atk'], pokemon['Sp. Def'])

print(p_value)
print(pokemon['Sp. Atk'].mean() - pokemon['Sp. Def'].mean())
print('We fail to reject the null hypothesis, there IS NO significant difference')

0.3933685997548122
0.9174999999999898
We fail to reject the null hypothesis, there IS NO significant difference
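The paired tests above compare two stats measured on the same Pokémon. A paired t-test is equivalent to a one-sample t-test on the per-row differences against a mean of zero, which the sketch below checks. The arrays are synthetic stand-ins and not the real `Attack` and `Defense` columns.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

# Synthetic paired measurements (illustrative only).
rng = np.random.default_rng(1)
attack = rng.normal(80, 30, size=200)
defense = attack - rng.normal(5, 25, size=200)

# Paired test on the two columns...
paired = ttest_rel(attack, defense)
# ...and a one-sample test on their row-wise differences.
one_sample = ttest_1samp(attack - defense, popmean=0)

print(np.isclose(paired.statistic, one_sample.statistic))
print(np.isclose(paired.pvalue, one_sample.pvalue))
```

This is why the paired test suits the comparison: only the within-Pokémon differences enter the statistic, not the spread across Pokémon.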
