#  T-tests and P-values

In statistics, a t-test is used to test whether the means of two data samples differ significantly. This lab uses two types of t-test:

* **Student's t-test** (a.k.a. independent or uncorrelated t-test). This type of t-test compares samples from **two independent populations** (e.g. test scores of students in two different classes). `scipy` provides the [`ttest_ind`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_ind.html) function to conduct Student's t-test.

* **Paired t-test** (a.k.a. dependent or correlated t-test). This type of t-test compares two sets of measurements taken on **the same population** (e.g. scores on two different tests taken by the students of one class). `scipy` provides the [`ttest_rel`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_rel.html) function to conduct the paired t-test.

Both types of t-test return a test statistic and a number called the **p-value**. By convention, if the p-value is below 0.05 we reject the null hypothesis and call the difference significant. If the p-value is between 0.05 and 0.1, the evidence against the null hypothesis is weaker: we may still reject it, but with low confidence. If the p-value is above 0.1, we do not reject the null hypothesis.
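
As a minimal sketch of how the p-value drives the decision, the snippet below runs `st.ttest_ind` on two small made-up samples and maps the p-value to the thresholds described above. The scores and the helper name `interpret_p_value` are illustrative assumptions, not part of the lab.

```python
import scipy.stats as st


def interpret_p_value(p_value):
    """Map a p-value to the decision rule described above (hypothetical helper)."""
    if p_value < 0.05:
        return "reject H0"
    if p_value < 0.1:
        return "reject H0 with low confidence"
    return "do not reject H0"


# Made-up test scores for two independent classes (illustrative data only)
class_a = [70, 72, 68, 71, 69, 73]
class_b = [80, 82, 79, 81, 83, 78]

# Independent two-sample t-test; the result exposes .statistic and .pvalue
result = st.ttest_ind(class_a, class_b)
decision = interpret_p_value(result.pvalue)
print(result.statistic, result.pvalue, decision)
```

The 0.05 and 0.1 cut-offs are conventions rather than fixed rules, so the helper should be read as one possible encoding of them.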

Read more about the t-test in [this article](http://b.link/test50) and [this Quora](http://b.link/unpaired97). Make sure you understand when to use which type of t-test. 

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt
import seaborn as sns
pd.set_option('display.max_columns', None)


### One tailed t-test 
In a packing plant, a machine packs cartons with jars. A new machine is expected to pack faster on average than the machine currently used. To test that hypothesis, the time each machine takes to pack ten cartons is recorded. The results, in seconds, are given in the file files_for_lab/ttest_machine.xlsx. Assuming the conditions for a t-test are met, does the data provide sufficient evidence that one machine is better than the other?

#### Import dataset

In [2]:
# Your code here:
data=pd.read_csv(r"C:\Users\torra\IH-Labs\lab-t-tests-p-values\files_for_lab\ttest_machine.txt", sep=' ')

In [3]:
data.head()

Unnamed: 0,New_machine,Old_machine
0,42.1,42.7
1,41.0,43.6
2,41.3,43.8
3,41.8,43.3
4,42.4,42.5


In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   New_machine  10 non-null     float64
 1   Old_machine  10 non-null     float64
dtypes: float64(2)
memory usage: 292.0 bytes


H0: New_machine = Old_machine
H1: New_machine != Old_machine



In [5]:
st.ttest_ind(data['New_machine'], data['Old_machine'])

TtestResult(statistic=-3.3972307061176026, pvalue=0.0032111425007745158, df=18.0)

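We can reject the null hypothesis (p ≈ 0.003 < 0.05), therefore the new machine and the old machine do not have the same mean packing time. The test as run is two-sided, so on its own it only says that the means differ. The sign of the statistic gives the direction: it is negative, so the new machine's mean time in the sample is lower than the old machine's.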
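
Since the section title asks for a one-tailed test, here is a hedged sketch of how the one-sided version could be run with the `alternative` keyword of `st.ttest_ind`. It uses only the five rows shown by `data.head()` above, so the numbers will not match the result for the full ten-row file.

```python
import scipy.stats as st

# First five rows shown by data.head() above; the full file has ten rows
new_machine = [42.1, 41.0, 41.3, 41.8, 42.4]
old_machine = [42.7, 43.6, 43.8, 43.3, 42.5]

# Two-sided test: H1 is "the mean packing times differ"
two_sided = st.ttest_ind(new_machine, old_machine)

# One-sided test: H1 is "the new machine's mean time is lower"
# (the `alternative` keyword requires SciPy 1.6 or later)
one_sided = st.ttest_ind(new_machine, old_machine, alternative='less')

print(two_sided.pvalue, one_sided.pvalue)
```

When the statistic points in the direction of the one-sided alternative, as it does here, the one-sided p-value is half the two-sided one.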

#### Import dataset

In this challenge we will work on the Pokemon dataset you have already used. The goal is to test whether different groups of pokemon (e.g. Legendary vs Normal, Generation 1 vs 2, single-type vs dual-type) have different stats (e.g. HP, Attack, Defense, etc.). Use pokemon.csv

In [6]:
data_p=pd.read_csv(r"C:\Users\torra\IH-Labs\lab-t-tests-p-values\files_for_lab\pokemon.txt")


#### First we want to define a function with which we can test the means of a feature set of two samples. 

In the next cell you'll see the annotation of the Python function, which explains what the function does, its arguments, and its return value. This type of annotation is called a **docstring**, a convention used among Python developers. The docstring convention lets developers write consistent technical documentation for their code so that others can read it. It also allows some tools and websites to parse the docstrings automatically and display user-friendly documentation.

Follow the specifications of the docstring and complete the function.

#### Using the `t_test_features` function, conduct a t-test for Legendary vs non-Legendary pokemons.

*Hint: your output should look like below:*

```
{'HP': 1.0026911708035284e-13,
 'Attack': 2.520372449236646e-16,
 'Defense': 4.8269984949193316e-11,
 'Sp. Atk': 1.5514614112239812e-21,
 'Sp. Def': 2.2949327864052826e-15,
 'Speed': 1.049016311882451e-18,
 'Total': 9.357954335957446e-47}
 ```

In [7]:
# Inspect the columns, dtypes and non-null counts before taking the two samples
data_p.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 800 entries, 0 to 799
Data columns (total 13 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   #           800 non-null    int64 
 1   Name        800 non-null    object
 2   Type 1      800 non-null    object
 3   Type 2      414 non-null    object
 4   Total       800 non-null    int64 
 5   HP          800 non-null    int64 
 6   Attack      800 non-null    int64 
 7   Defense     800 non-null    int64 
 8   Sp. Atk     800 non-null    int64 
 9   Sp. Def     800 non-null    int64 
 10  Speed       800 non-null    int64 
 11  Generation  800 non-null    int64 
 12  Legendary   800 non-null    bool  
dtypes: bool(1), int64(9), object(3)
memory usage: 75.9+ KB


In [8]:
data_p.isna().sum()

#               0
Name            0
Type 1          0
Type 2        386
Total           0
HP              0
Attack          0
Defense         0
Sp. Atk         0
Sp. Def         0
Speed           0
Generation      0
Legendary       0
dtype: int64

First we are going to transform the null values of "Type 2" to "None"

In [9]:
data_p["Type 2"]=data_p["Type 2"].fillna(value="None")

In [27]:
s1=data_p[data_p['Legendary']==True]
s2=data_p[data_p['Legendary']==False]
# True/False are booleans, so they must not be written in quotes

In [28]:
s1

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary,number_types
156,144,Articuno,Ice,Flying,580,90,85,100,95,125,85,1,True,two
157,145,Zapdos,Electric,Flying,580,90,90,85,125,90,100,1,True,two
158,146,Moltres,Fire,Flying,580,90,100,90,125,85,90,1,True,two
162,150,Mewtwo,Psychic,,680,106,110,90,154,90,130,1,True,one
163,150,MewtwoMega Mewtwo X,Psychic,Fighting,780,106,190,100,154,100,130,1,True,two
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True,two
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True,two
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True,two
798,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,True,two


T-test on the following features 'HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total'.

H0: legendary = non legendary
H1: legendary != non legendary

In [29]:



def t_test_features(s1, s2, features=['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total']):
    """Test means of a feature set of two samples

    Args:
        s1 (dataframe): sample 1
        s2 (dataframe): sample 2
        features (list): an array of features to test

    Returns:
        dict: a dictionary of t-test scores for each feature where the feature name is the key and the p-value is the value
    """
    results = {}

    # Run an independent two-sample t-test on each feature column
    for fe in features:
        t_stat, p_value = st.ttest_ind(s1[fe], s2[fe])
        results[fe] = p_value

    # Return the dictionary itself; `return print(results)` would return None
    return results

In [30]:
t_test_features(s1,s2)

{'HP': 3.330647684846191e-15, 'Attack': 7.827253003205333e-24, 'Defense': 1.5842226094427259e-12, 'Sp. Atk': 6.314915770427266e-41, 'Sp. Def': 1.8439809580409594e-26, 'Speed': 2.3540754436898437e-21, 'Total': 3.0952457469652825e-52}


#### From the test results above, what conclusion can you make? Do Legendary and non-Legendary pokemons have significantly different stats on each feature?

We can conclude that Legendary and non-Legendary pokemons have significantly different stats: the p-values for all features are far below 0.05, so we reject the null hypothesis for each one.
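
The p-values printed above do not match the hint exactly. One plausible explanation, which is an assumption and not confirmed by the source, is that the hint was produced with Welch's t-test (`equal_var=False`), which does not assume equal variances in the two groups. The sketch below shows how such a variant could look; the function name `t_test_features_welch` and the data are hypothetical.

```python
import pandas as pd
import scipy.stats as st


def t_test_features_welch(s1, s2, features, equal_var=False):
    """Return {feature: p-value}; hypothetical variant with a configurable equal_var."""
    return {
        fe: st.ttest_ind(s1[fe], s2[fe], equal_var=equal_var).pvalue
        for fe in features
    }


# Made-up samples with different sizes and spreads (illustrative data only)
small = pd.DataFrame({'HP': [90, 100, 110, 95, 105]})
large = pd.DataFrame({'HP': [40, 60, 50, 70, 45, 55, 65, 35, 80, 50]})

p_student = t_test_features_welch(small, large, ['HP'], equal_var=True)
p_welch = t_test_features_welch(small, large, ['HP'], equal_var=False)
print(p_student, p_welch)
```

With unequal group sizes and variances the two variants give different p-values, although in this example both lead to the same decision.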

#### Next, conduct t-test for Generation 1 and Generation 2 pokemons.

In [14]:
data_p.head(15)


Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
5,5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False
6,6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False
7,6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
8,6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
9,7,Squirtle,Water,,314,44,48,65,50,64,43,1,False


T-test on the following features 'HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total'.

H0: Generation 1 = Generation 2
H1: Generation 1 != Generation 2

In [15]:
#First we select the segment of information that we want to use
features_list=['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total']
generation_1=data_p[data_p['Generation']==1]
generation_2=data_p[data_p['Generation']==2]
result=pd.DataFrame(st.ttest_ind(generation_1[features_list],generation_2[features_list]),columns=features_list, index=['statistics','pvalue'])

result
#link: https://stackoverflow.com/questions/77568379/how-to-run-t-test-on-multiple-pandas-columns



Unnamed: 0,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Total
statistics,-1.487996,1.176304,-0.612439,1.475669,-1.382984,3.012633,0.583693
pvalue,0.137919,0.24051,0.540763,0.141198,0.167812,0.002836,0.559914


#### What conclusions can you make?

At the 0.05 level, Speed is the only feature that differs significantly between first- and second-generation pokemons (p ≈ 0.003). For every other feature we cannot reject the null hypothesis.
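
The cell above relies on `st.ttest_ind` testing each column separately when it receives two-dimensional input (`axis=0` is the default). A minimal sketch of the same idea on synthetic data, building the result table from the named `statistic` and `pvalue` attributes, might look like this; the groups and column values are made up for illustration.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

rng = np.random.default_rng(0)
cols = ['HP', 'Speed']

# Synthetic groups (illustrative only): 30 and 40 rows, two features each
group_a = pd.DataFrame(rng.normal(50, 10, size=(30, 2)), columns=cols)
group_b = pd.DataFrame(rng.normal(55, 10, size=(40, 2)), columns=cols)

# axis=0 (the default) runs one t-test per column
res = st.ttest_ind(group_a[cols].to_numpy(), group_b[cols].to_numpy(), axis=0)

# One row for the statistics, one row for the p-values
table = pd.DataFrame(
    [res.statistic, res.pvalue],
    columns=cols,
    index=['statistic', 'pvalue'],
)
print(table)
```

Each column of `table` equals the result of calling `st.ttest_ind` on that single feature.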

#### Compare pokemons who have single type vs those having two types.

Before comparing the pokemons that have one type with those that have two types, let's look at the unique values of the two type columns.


In [16]:
data_p['Type 1'].unique()

array(['Grass', 'Fire', 'Water', 'Bug', 'Normal', 'Poison', 'Electric',
       'Ground', 'Fairy', 'Fighting', 'Psychic', 'Rock', 'Ghost', 'Ice',
       'Dragon', 'Dark', 'Steel', 'Flying'], dtype=object)

In [17]:
data_p['Type 2'].unique()

array(['Poison', 'None', 'Flying', 'Dragon', 'Ground', 'Fairy', 'Grass',
       'Fighting', 'Psychic', 'Steel', 'Ice', 'Rock', 'Dark', 'Water',
       'Electric', 'Fire', 'Ghost', 'Bug', 'Normal'], dtype=object)

In [18]:
data_p['number_types']=data_p['Type 2'].apply(lambda x: 'one' if x=='None'  else 'two')
# 'None' is the string placeholder that replaced the NaN values of "Type 2" in an earlier cell

In [19]:
data_p.head(25)

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary,number_types
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False,two
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False,two
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False,two
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False,two
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False,one
5,5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False,one
6,6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False,two
7,6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False,two
8,6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False,two
9,7,Squirtle,Water,,314,44,48,65,50,64,43,1,False,one


In [20]:
data_p['number_types'].unique()

array(['two', 'one'], dtype=object)

T-test on the following features 'HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total'.

H0: One type= Two types
H1: One type != Two types

In [21]:
#We select the segment of information that we want to use
features_list=['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total']
single_type=data_p[data_p['number_types']=='one']
double_type=data_p[data_p['number_types']=='two']
result=pd.DataFrame(st.ttest_ind(single_type[features_list],double_type[features_list]),columns=features_list, index=['statistics','pvalue'])

result

Unnamed: 0,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Total
statistics,-1.597247,-3.797241,-5.58248,-3.817082,-3.889235,-2.26062,-5.345971
pvalue,0.110606,0.000157,3.250594e-08,0.000145,0.000109,0.024051,1.174904e-07


#### What conclusions can you make?

Single-type and dual-type pokemons differ significantly on most features, with p-values below 0.05. HP is the only feature for which we cannot reject the null hypothesis (p ≈ 0.11).

#### Now, we want to compare whether there are significant differences of `Attack` vs `Defense`  and  `Sp. Atk` vs `Sp. Def` of all pokemons. Please write your code below.

*Hint: are you comparing different populations or the same population?*

T-test Attack vs Defense on all pokemons

H0: Attack= Defense
H1: Attack != Defense

In [22]:
st.ttest_ind(data_p['Attack'], data_p['Defense'])


TtestResult(statistic=3.241764074042312, pvalue=0.0012123980547321489, df=1598.0)

T-test Special Attack (`Sp. Atk`) vs Special Defense (`Sp. Def`) on all pokemons

H0: Special Attack = Special Defense
H1: Special Attack != Special Defense

In [23]:
st.ttest_ind(data_p['Sp. Atk'], data_p['Sp. Def'])

TtestResult(statistic=0.6041290031014401, pvalue=0.5458436328840358, df=1598.0)

#### What conclusions can you make?

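With the independent-samples test used above, mean Attack and mean Defense differ significantly across all pokemons (p ≈ 0.001, below 0.05).
By contrast, we cannot reject the null hypothesis for Special Attack vs Special Defense (`Sp. Atk` and `Sp. Def` stand for Special, not Speed): with p ≈ 0.55 we cannot affirm that their means differ.
Note that both stats in each comparison are measured on the same pokemons, so, as the hint suggests, the paired test `st.ttest_rel` is the appropriate choice here, and its p-values may differ from the `ttest_ind` results shown.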
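
To illustrate why the choice between a paired and an independent test matters for same-population comparisons like these, here is a small sketch on made-up paired data (not the Pokemon dataset). Each subject's second value is about one unit higher than its first, while the subjects themselves differ widely.

```python
import numpy as np
import scipy.stats as st

# Made-up paired measurements (illustrative data only)
first = np.arange(10) * 10.0
shift = np.array([0.9, 1.1, 1.0, 0.8, 1.2, 1.0, 0.9, 1.1, 1.0, 1.0])
second = first + shift

# Paired test: works on the per-subject differences
paired = st.ttest_rel(first, second)

# Independent test: ignores the pairing and compares the two group means
independent = st.ttest_ind(first, second)

print(paired.pvalue, independent.pvalue)
```

The paired test detects the small but consistent shift, whereas the independent test does not, because the spread between subjects swamps the difference in means. On the notebook's data the equivalent calls would be `st.ttest_rel(data_p['Attack'], data_p['Defense'])` and `st.ttest_rel(data_p['Sp. Atk'], data_p['Sp. Def'])`.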