# Hypothesis Testing with Python

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Parametric-hypothesis-testing" data-toc-modified-id="Parametric-hypothesis-testing-1">Parametric hypothesis testing</a></span><ul class="toc-item"><li><span><a href="#Testing-the-mean" data-toc-modified-id="Testing-the-mean-1.1">Testing the mean</a></span><ul class="toc-item"><li><span><a href="#Comparing-the-mean---to-a-given-value." data-toc-modified-id="Comparing-the-mean---to-a-given-value.-1.1.1">Comparing the mean   to a given value.</a></span><ul class="toc-item"><li><span><a href="#The-theory" data-toc-modified-id="The-theory-1.1.1.1">The theory</a></span></li><li><span><a href="#Practice-with-Python" data-toc-modified-id="Practice-with-Python-1.1.1.2">Practice with Python</a></span></li></ul></li></ul></li><li><span><a href="#Comparing-two-means" data-toc-modified-id="Comparing-two-means-1.2">Comparing two means</a></span><ul class="toc-item"><li><span><a href="#Paired-sample-t-test" data-toc-modified-id="Paired-sample-t-test-1.2.1">Paired sample t-test</a></span></li><li><span><a href="#A-two-sample-t-test" data-toc-modified-id="A-two-sample-t-test-1.2.2">A two-sample t-test</a></span></li></ul></li></ul></li></ul></div>

Hypothesis testing is a data analysis method used to test one hypothesis (called the null hypothesis, $H_0$) against another hypothesis (the alternative hypothesis, $H_1$).

We use the random sample $X_1,\ldots, X_n$ (the data) to decide between $H_0$ and $H_1$ at a fixed level of significance $\alpha$, which is the probability of rejecting $H_0$ when $H_0$ is true (the type I error).

$$\alpha = \mathbb{P}\left(H_0\mbox{ rejected } \mid H_0 \mbox{ true}\right)$$
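The meaning of $\alpha$ can be checked by simulation. The sketch below is only an illustration: it draws many samples for which $H_0$ is true, applies the one-sample t-test (`ttest_1samp`, presented later in this chapter) to each of them, and counts how often $H_0$ is rejected. The seed, the number of simulations, and the Normal parameters are assumptions chosen for the example.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
alpha = 0.05
n_sim, n = 5000, 15              # illustrative simulation settings

# Each row is one sample of size n drawn under H0 (the true mean is -2)
samples = rng.normal(loc=-2, scale=2, size=(n_sim, n))

# One t-test per row: axis=1 runs the test along each sample
pvalues = ttest_1samp(samples, -2, axis=1).pvalue

# Fraction of samples for which H0 is wrongly rejected
type1_rate = np.mean(pvalues < alpha)
print(type1_rate)
```

The printed rejection rate should be close to $\alpha$, up to simulation noise.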

Any hypothesis testing procedure should provide the following:

+ The test statistic $T$: a random variable computed from the random sample, whose probability distribution is known when $H_0$ is true.

+ The observed statistic $t_\mbox{obs}$: the value of $T$ computed from the observed sample $x_1,\ldots,x_n$:

$$t_\mbox{obs}=T(x_1,\ldots,x_n)$$

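+ The **p-value**: the probability, computed assuming that $H_0$ is true, of observing a value of $T$ at least as extreme as $t_\mbox{obs}$. Equivalently, it is the smallest level of significance at which $H_0$ would be rejected. A smaller p-value means stronger evidence against $H_0$, in favor of the alternative hypothesis.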



There are two types of hypothesis testing procedures: parametric and non-parametric.

+ Parametric hypothesis testing is used when the hypotheses are statements about a population parameter, for example when comparing it to a given value.

+ Non-parametric hypothesis testing is used when the hypotheses are not statements about a population parameter. It tests one assumption about the data against its opposite.


## Parametric hypothesis testing 


+ Parametric tests are generally more powerful than non-parametric tests, provided that their distributional assumptions hold.

+ The hypothesis is developed on the parameters of the population distribution. 

+ We will see in this chapter how to perform hypothesis testing in the following cases:

  + The mean: comparing to a given value, comparing two means
  + The proportion: comparing to a given value, comparing two proportions



### Testing the mean 

#### Comparing the mean   to a given value. 

##### The theory

We would like to test the following null hypothesis 
$$ H_0: \; \mu=\mu_0$$
versus 
$$ H_1: \; \mu\not=\mu_0$$
where $\mu_0$ is a given value. The test is based on a random sample $X_1,\ldots,X_n$ assumed to be generated from a Normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$.

The test statistic is
$$T=\sqrt{n}\,\displaystyle\frac{\overline{X}-\mu}{S}$$
where $\overline{X}$ is the sample mean and $S$ is the sample standard deviation. 

Under $H_0$, $T$ is equal to 
$$T=\sqrt{n}\,\displaystyle\frac{\overline{X}-\mu_0}{S}$$
and follows a $t-$distribution with $n-1$ degrees of freedom. 
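The distribution of $T$ under $H_0$ gives a decision rule for the two-sided test: reject $H_0$ at level $\alpha$ when $|t_\mbox{obs}|$ exceeds the quantile of order $1-\alpha/2$ of the $t$-distribution with $n-1$ degrees of freedom. Here is a minimal sketch of that rule. The helper `t_statistic`, the data, and the values of `mu0` and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy import stats

def t_statistic(sample, mu0):
    """Return sqrt(n) * (mean - mu0) / S, with S the sample standard deviation."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    s = sample.std(ddof=1)          # ddof=1 gives the sample standard deviation S
    return np.sqrt(n) * (sample.mean() - mu0) / s

# Hypothetical data, for illustration only
sample = [4.9, 5.6, 5.1, 4.7, 5.3, 5.8, 5.0, 5.4]
mu0, alpha = 5.0, 0.05

t_obs = t_statistic(sample, mu0)
# Quantile of order 1 - alpha/2 of the t-distribution with n-1 degrees of freedom
t_crit = stats.t.ppf(1 - alpha / 2, df=len(sample) - 1)
reject_h0 = abs(t_obs) > t_crit
print(t_obs, t_crit, reject_h0)
```

Comparing $|t_\mbox{obs}|$ to the critical value leads to the same decision as comparing the two-sided p-value to $\alpha$.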

##### Practice with Python

Let's generate a random sample of size 15 from a Normal distribution with mean $\mu=-2$ and standard deviation $\sigma=2$ (hence variance $\sigma^2=4$).

In [15]:
import numpy as np
np.random.seed(7654567)
x=np.random.normal(size=15,loc=-2,scale=2)

In [16]:
x

array([-3.89395351, -2.55250403, -2.39746865,  0.98703995, -2.8607237 ,
       -1.39481847, -0.74237156, -4.63886095, -3.81258843, -2.89188048,
       -1.50048335, -3.31429764, -3.28882943, -0.96256977, -2.0684448 ])

We assume that the random sample `x` is generated from a Normal probability distribution with unknown mean and variance. We now test the following hypotheses:

$$ H_0: \; \mu=-2$$
versus 
$$ H_1: \; \mu\not=-2$$

In [17]:
from scipy.stats import ttest_1samp

In [23]:
test_mean=ttest_1samp(x, -2, alternative='two-sided')

The observed statistic $t_\mbox{obs}$ is

In [25]:
test_mean.statistic

-0.948081719359044

and the p-value is

In [26]:
test_mean.pvalue

0.3591669622802579

Since this p-value is greater than 0.05 (5%), we cannot reject $H_0$ at the 5% level of significance.

**How are the test statistic and the p-value computed?**

In [31]:
m=np.mean(x)
m

-2.3555169880044056

In [34]:
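# np.std divides by n by default (ddof=0), so dividing by sqrt(n-1)
# gives S/sqrt(n), where S is the sample standard deviation
std_error=np.std(x)/np.sqrt(len(x)-1)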
std_error

0.3749855953817513

Under $ H_0: \; \mu=-2$, the test statistic is then

In [35]:
(m+2)/std_error

-0.948081719359044

In [36]:
test_mean.statistic

-0.948081719359044

The p-value is then computed as follows

In [37]:
from scipy import stats
X = stats.t(len(x)-1)

In [40]:
# use -|t_obs| so that the formula is valid whatever the sign of the statistic
2*X.cdf(-abs(test_mean.statistic))

0.3591669622802579

which matches the p-value returned by `ttest_1samp`:

In [41]:
test_mean.pvalue

0.3591669622802579

We can also test $H_0$ against a one-sided alternative. The following is called the **lower-tail** alternative test:

$$ H_0: \; \mu=-2$$
versus 
$$ H_1: \; \mu<-2$$

In [27]:
test_mean1=ttest_1samp(x, -2, alternative='less')

In [29]:
test_mean1.statistic

-0.948081719359044

In [30]:
test_mean1.pvalue

0.17958348114012895

The p-value is computed as follows

In [42]:
X.cdf(test_mean.statistic)

0.17958348114012895

Similarly, the following is called the **upper-tail** alternative test:

$$ H_0: \; \mu=-2$$
versus 
$$ H_1: \; \mu>-2$$

In [47]:
test_mean2=ttest_1samp(x, -2, alternative='greater')

In [48]:
test_mean2.statistic

-0.948081719359044

In [49]:
test_mean2.pvalue

0.8204165188598711

The p-value is computed as follows 

In [50]:
1-X.cdf(test_mean.statistic)

0.8204165188598711
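The three computations above can be gathered in one function. The following is a minimal sketch: `one_sample_t` is a hypothetical helper (not a SciPy function), and the sample is generated with an arbitrary seed for illustration. It is checked against `ttest_1samp` for each alternative.

```python
import numpy as np
from scipy import stats

def one_sample_t(sample, mu0, alternative="two-sided"):
    """Hand-written one-sample t-test (hypothetical helper, not a SciPy function)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    t_obs = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))
    dist = stats.t(n - 1)                  # distribution of T under H0
    if alternative == "less":
        pvalue = dist.cdf(t_obs)           # P(T <= t_obs)
    elif alternative == "greater":
        pvalue = dist.sf(t_obs)            # P(T >= t_obs) = 1 - cdf
    elif alternative == "two-sided":
        pvalue = 2 * dist.sf(abs(t_obs))   # both tails, valid for any sign of t_obs
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return t_obs, pvalue

# Illustrative sample (arbitrary seed), tested against mu0 = -2
rng = np.random.default_rng(1)
sample = rng.normal(loc=-2, scale=2, size=15)

for alt in ("two-sided", "less", "greater"):
    t_obs, pvalue = one_sample_t(sample, -2, alternative=alt)
    ref = stats.ttest_1samp(sample, -2, alternative=alt)
    print(alt, np.isclose(t_obs, ref.statistic), np.isclose(pvalue, ref.pvalue))
```

Using the survival function `sf` instead of `1 - cdf` is a design choice: it is more accurate numerically when the p-value is very small.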

### Comparing two means 

There are two types of hypothesis tests for comparing two means:

+ A paired sample t-test is a dependent sample t-test, used to decide whether the mean difference between two sets of observations on the same group is zero.

**Example:**  Compare the difference in blood pressure level for a group of patients before and after some drug treatment.



+ A two-sample t-test is used to test whether the means of two independent groups differ significantly. This test is also known as an independent samples t-test.

**Example:** Comparing the salaries of a sample of men and a sample of women employees.

#### Paired sample t-test

We are going to test the following hypotheses:
    
+ $H_0:$ Mean difference between the two dependent samples is 0. 
+ $H_1$: Mean difference between the two dependent samples is not 0.



**Example:** We compare the grades of the same students in Quiz 1 and Quiz 2.

In [53]:
import pandas as pd

In [56]:
df=pd.read_csv('student_grades.csv')

In [57]:
df

Unnamed: 0,Student,Section,Quiz 1,Quiz 2,Midterm 1
0,1,Section 1,17.5,13.4575,22.5
1,2,Section 1,16.5,14.2125,25.0
2,3,Section 2,12.5,14.615,23.5
3,4,Section 2,10.5,15.21,25.0
4,5,Section 2,5.5,12.2225,24.5
5,6,Section 1,14.0,14.5875,23.5
6,7,Section 2,11.0,,20.5
7,8,Section 2,15.0,18.75,25.0
8,9,Section 2,17.0,16.5,25.0
9,10,Section 1,14.0,19.25,19.5


We remove the rows with missing values from the data

In [58]:
df=df.dropna()

In [59]:
df

Unnamed: 0,Student,Section,Quiz 1,Quiz 2,Midterm 1
0,1,Section 1,17.5,13.4575,22.5
1,2,Section 1,16.5,14.2125,25.0
2,3,Section 2,12.5,14.615,23.5
3,4,Section 2,10.5,15.21,25.0
4,5,Section 2,5.5,12.2225,24.5
5,6,Section 1,14.0,14.5875,23.5
7,8,Section 2,15.0,18.75,25.0
8,9,Section 2,17.0,16.5,25.0
9,10,Section 1,14.0,19.25,19.5
10,11,Section 2,11.0,10.9625,24.5


In [61]:
from scipy.stats import ttest_rel
test_diffmean1=ttest_rel(df['Quiz 1'],df['Quiz 2'])

In [62]:
test_diffmean1.statistic

-1.1156139485204906

In [63]:
test_diffmean1.pvalue

0.2701415971643917

We have tested here the following hypotheses:

$$ H_0:\,\mbox{ the averages of the grades in Q1 and Q2 are equal}$$
versus 
$$ H_1:\,\mbox{ the averages of the grades in Q1 and Q2 are different}$$



We used a paired t-test, and we conclude that $H_0$ can't be rejected since the p-value is greater than the given level of significance, 0.05 (5%).
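A paired t-test is equivalent to a one-sample t-test applied to the differences between the two measurements, with $\mu_0=0$. The following sketch checks this on hypothetical paired scores (the arrays `quiz1` and `quiz2` are made up for illustration and are not the course data).

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_1samp

# Hypothetical paired scores (same students, two quizzes), for illustration only
quiz1 = np.array([17.5, 16.5, 12.5, 10.5, 5.5, 14.0, 15.0, 17.0])
quiz2 = np.array([13.5, 14.0, 14.5, 15.0, 12.0, 14.5, 18.5, 16.5])

paired = ttest_rel(quiz1, quiz2)
on_differences = ttest_1samp(quiz1 - quiz2, 0)   # H0: mean difference is 0

# Both lines print two identical values (up to rounding)
print(paired.statistic, on_differences.statistic)
print(paired.pvalue, on_differences.pvalue)
```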

We can also test whether the mean $\mu_1$ of the Quiz 1 grades is lower than the mean $\mu_2$ of the Quiz 2 grades:

$$ H_0: \, \mu_1\geq \mu_2$$
versus
$$ H_1:\, \mu_1<\mu_2 $$

In [64]:
test_diffmean1a=ttest_rel(df['Quiz 1'],df['Quiz 2'],alternative='less')

In [66]:
test_diffmean1a.statistic

-1.1156139485204906

In [67]:
test_diffmean1a.pvalue

0.13507079858219584

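Conclusion: $H_0$ can't be rejected at the 5% level of significance.

Similarly, we can test against the upper-tail alternative $H_1:\, \mu_1>\mu_2$: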

In [68]:
test_diffmean1b=ttest_rel(df['Quiz 1'],df['Quiz 2'],alternative='greater')

In [69]:
test_diffmean1b.pvalue

0.8649292014178042
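The three p-values are related: the lower-tail and upper-tail p-values add up to 1, and the two-sided p-value is twice the smaller of the two. This agrees with the values reported above (0.1351 + 0.8649 = 1 and 2 × 0.1351 ≈ 0.2701). Here is a short check of these relations on hypothetical paired data, made up for illustration.

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical paired measurements, for illustration only
first = np.array([12.0, 15.5, 9.0, 14.0, 16.5, 11.0, 13.5])
second = np.array([13.5, 15.0, 11.5, 14.5, 18.0, 10.5, 15.0])

p_two = ttest_rel(first, second).pvalue
p_less = ttest_rel(first, second, alternative="less").pvalue
p_greater = ttest_rel(first, second, alternative="greater").pvalue

# The one-sided p-values are complementary
print(np.isclose(p_less + p_greater, 1.0))
# The two-sided p-value doubles the smaller one-sided p-value
print(np.isclose(p_two, 2 * min(p_less, p_greater)))
```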

#### A two-sample t-test

We now compare the mean grade in Quiz 1 between Section 1 and Section 2. The null hypothesis is that the two sections have the same mean.

In [78]:
x1=df['Quiz 1'].loc[df['Section  ']=='Section 1']
x1

0     17.5
1     16.5
5     14.0
9     14.0
11    10.5
12    16.5
13    14.5
14    14.5
15     8.0
16    14.5
18    11.0
20    18.5
22    10.5
23    10.0
28    15.0
29    18.0
32    14.0
36    11.0
37    15.0
41    14.0
44    16.0
45    10.5
46    16.5
48    17.5
50    14.5
51    16.5
Name: Quiz 1, dtype: float64

In [79]:
x2=df['Quiz 1'].loc[df['Section  ']=='Section 2']
x2

2     12.5
3     10.5
4      5.5
7     15.0
8     17.0
10    11.0
17    14.0
19    17.0
21    14.0
24    11.0
25    15.0
26    16.0
27    19.5
30    14.5
31    18.5
33    11.5
34    10.0
35    20.0
38    18.5
40     9.5
43    13.0
47    10.5
49    13.5
Name: Quiz 1, dtype: float64

In [80]:
from scipy.stats import ttest_ind
test_diffmean2=ttest_ind(x1,x2)

In [81]:
test_diffmean2.statistic

0.4200033340360798

In [82]:
test_diffmean2.pvalue

0.6763969243532744

Since the p-value (about 0.68) is greater than 0.05, we cannot reject the hypothesis that both sections have the same mean grade in Quiz 1.
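By default, `ttest_ind` assumes that the two groups have the same variance and uses a pooled variance estimate. When this assumption is doubtful, passing `equal_var=False` runs Welch's t-test instead. The sketch below compares both versions on simulated groups. The seed, group sizes, and Normal parameters are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)   # arbitrary seed
# Hypothetical groups with different sizes and spreads
group1 = rng.normal(loc=14, scale=3, size=26)
group2 = rng.normal(loc=14, scale=5, size=23)

pooled = ttest_ind(group1, group2)                   # equal_var=True: pooled variance
welch = ttest_ind(group1, group2, equal_var=False)   # Welch's t-test

print(pooled.statistic, pooled.pvalue)
print(welch.statistic, welch.pvalue)
```

The two statistics differ only through the standard error in the denominator: the pooled version uses a single variance estimate, while Welch's version uses $s_1^2/n_1+s_2^2/n_2$.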