# Beginner's Guide to the t-test with Jupyter Notebook

# Table of contents

- Introduction
- The pooled two-sample t-test

# Introduction



- https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faq-what-are-the-differences-between-one-tailed-and-two-tailed-tests/

# The pooled two-sample t-test

In IB Diploma Mathematics: Applications and Interpretation, you are required to carry out the pooled two-sample t-test. This test compares the means of two independent samples drawn from normally distributed populations, and it assumes that the two populations have the same variance.

## One-tailed tests

A one-tailed test is a hypothesis test whose alternative hypothesis considers only one side of the distribution curve; for example, $H_1:\mu<\mu_0$ or $H_1:\mu>\mu_0$.

\begin{align}
\text{If } H_0:\mu_1 \geq \mu_2 \text{, then } H_1:\mu_1<\mu_2 \tag{1}\\
\text{If } H_0:\mu_1 \leq \mu_2 \text{, then } H_1:\mu_1>\mu_2 \tag{2}
\end{align}


We are going to use the [statsmodels](https://www.statsmodels.org/dev/index.html) module to run the pooled two-sample t-test.

First, install it from a terminal.

If you are using Anaconda:

`conda install -c conda-forge statsmodels`

Otherwise, you can install it with `pip`:

`pip install statsmodels`

We set the alternative hypothesis with the `alternative` option. With the default `value=0`, the choices are:

- `'two-sided'` (default): the difference in means is not equal to `value`, $H_1:\mu_1\ne\mu_2$
- `'larger'`: the difference in means is larger than `value`, $H_1:\mu_1>\mu_2$
- `'smaller'`: the difference in means is smaller than `value`, $H_1:\mu_1<\mu_2$

In the following cell we set it to `'smaller'`, which means
$$H_0:\mu_1\geq\mu_2$$
$$H_1:\mu_1<\mu_2$$

Since IB requires the pooled test, we set `usevar='pooled'`.

In [47]:
# https://gist.github.com/shinokada/88dc8e2a2c868c97e15dc0cdc2f63572

import statsmodels.stats.weightstats as sm

significance = 0.05
list1=[3,5,4,6,6,5,3,2,3,4,5,3,4]
list2=[4,6,6,7,6,4,4,4,3,6,5,4,5]
tstat, pvalue, df = sm.ttest_ind(
    list1,list2,
    alternative='smaller',
    usevar='pooled')

print("""Test statistic=%.2f, 
p-value=%.4f, 
degrees of freedom=%.0f\n""" % (tstat,pvalue,df))

# A p-value below the significance level is evidence against H0
if pvalue < significance:
    print("""At %.2f level of significance, 
we reject the null hypothesis. 
The mean 1 is less than the mean 2.""" % (significance))
else:
    print("""At %.2f level of significance, 
we fail to reject the null hypothesis. 
There is not enough evidence that the mean 1 is less than the mean 2.""" % (significance))

Test statistic=-1.77, 
p-value=0.0451, 
degrees of freedom=24

At 0.05 level of significance, 
we reject the null hypothesis. 
The mean 1 is less than the mean 2.
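
The test statistic and the degrees of freedom printed above can be checked by hand. The following is a minimal sketch, assuming the usual pooled formulas $s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}$ and $t=\frac{\bar{x}_1-\bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$. It uses only the Python standard library, and the helper name `pooled_t` is made up for this illustration; it is not part of statsmodels.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance is the sample variance (it divides by n - 1)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


list1 = [3, 5, 4, 6, 6, 5, 3, 2, 3, 4, 5, 3, 4]
list2 = [4, 6, 6, 7, 6, 4, 4, 4, 3, 6, 5, 4, 5]
tstat, df = pooled_t(list1, list2)
print("Test statistic=%.2f, degrees of freedom=%d" % (tstat, df))
```

The printed values should agree with the test statistic and degrees of freedom reported by `ttest_ind`. The p-value is not reproduced here because the standard library has no t-distribution.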


If you want to test $H_1:\mu_1>\mu_2$, set `alternative` to `'larger'`, which means

$$H_0:\mu_1\leq\mu_2$$
$$H_1:\mu_1>\mu_2$$

In [54]:
# https://gist.github.com/shinokada/29b65b084a1aa05a016d8081faaa40a2

import statsmodels.stats.weightstats as sm

significance = 0.05
list1=[3,5,4,6,6,5,3,2,3,4,5,3,4]
list2=[4,6,6,7,6,4,4,4,3,6,5,4,5]
tstat, pvalue, df = sm.ttest_ind(
    list1,list2,
    alternative='larger',
    usevar='pooled')

print("""Test statistic=%.2f, 
p-value=%.4f, 
degrees of freedom=%.0f\n""" % (tstat,pvalue,df))

# A p-value below the significance level is evidence against H0
if pvalue < significance:
    print("""At %.2f level of significance, 
we reject the null hypothesis. 
The mean 1 is greater than the mean 2.""" % (significance))
else:
    print("""At %.2f level of significance, 
we fail to reject the null hypothesis. 
There is not enough evidence that the mean 1 is greater than the mean 2.""" % (significance))

Test statistic=-1.77, 
p-value=0.9549, 
degrees of freedom=24

At 0.05 level of significance, 
we fail to reject the null hypothesis. 
There is not enough evidence that the mean 1 is greater than the mean 2.
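
Both cells above apply the same decision rule: compare the p-value with the significance level. A minimal sketch of that rule as a reusable function follows. The name `decide` and the returned strings are hypothetical, chosen only for this illustration. A p-value above the significance level means the data do not give enough evidence against $H_0$, which is weaker than showing that $H_0$ is true.

```python
def decide(pvalue, significance=0.05):
    """Return the decision of a hypothesis test as a short string."""
    if pvalue < significance:
        return "reject H0"
    # A large p-value is a lack of evidence against H0, not proof of H0
    return "fail to reject H0"


# p-values reported by the two one-tailed tests above
print(decide(0.0451))  # alternative='smaller'
print(decide(0.9549))  # alternative='larger'
```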


## Two-tailed tests

A two-tailed test is a hypothesis test whose alternative hypothesis considers both sides of the distribution curve; for example, $H_1:\mu\ne\mu_0$.
Here it tests whether the mean of the first data set differs significantly from the mean of the second data set, in either direction.

This means:

$$H_0:\mu_1=\mu_2$$
$$H_1:\mu_1\ne\mu_2$$


In [56]:
# https://gist.github.com/shinokada/967875874850b70c1a7950bfb12202f5

import statsmodels.stats.weightstats as sm

significance = 0.05
list1=[3,5,4,6,6,5,3,2,3,4,5,3,4]
list2=[4,6,6,7,6,4,4,4,3,6,5,4,5]
tstat, pvalue, df = sm.ttest_ind(
    list1,list2,
    alternative='two-sided',
    usevar='pooled')

print("""Test statistic=%.2f, 
p-value=%.4f, 
degrees of freedom=%.0f\n""" % (tstat,pvalue,df))

# Rejecting H0 (equal means) supports H1 (the means differ)
if pvalue < significance:
    print("""At %.2f level of significance, 
we reject the null hypothesis. 
The mean 1 is not equal to the mean 2.""" % (significance))
else:
    print("""At %.2f level of significance, 
we fail to reject the null hypothesis. 
There is not enough evidence that the mean 1 differs from the mean 2.""" % (significance))

Test statistic=-1.77, 
p-value=0.0903, 
degrees of freedom=24

At 0.05 level of significance, 
we fail to reject the null hypothesis. 
There is not enough evidence that the mean 1 differs from the mean 2.
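
Because the t-distribution is symmetric, the three p-values reported in this section are tied together: the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them. The sketch below illustrates that relationship, taking the rounded `'smaller'` p-value printed earlier as its input. The function name `related_pvalues` is hypothetical.

```python
def related_pvalues(p_smaller):
    """Derive the other two p-values from the 'smaller' one-sided p-value."""
    p_larger = 1 - p_smaller
    p_two_sided = 2 * min(p_smaller, p_larger)
    return p_larger, p_two_sided


p_larger, p_two_sided = related_pvalues(0.0451)
print("larger: %.4f, two-sided: %.4f" % (p_larger, p_two_sided))
```

Since the input is already rounded to four decimal places, the two-sided result may differ from the value printed by `ttest_ind` in the last digit.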


## Real-life example

In the internal assessment or the extended essay, students often use a large data set.
Here we use the "Brain Size and Intelligence" data set. You can find more details at this [link](https://www3.nd.edu/~busiforc/handouts/Data%20and%20Stories/correlation/Brain%20Size/brainsize.html).

The file separates its fields with semicolons and marks missing values with a period. Passing `sep=';'` and `na_values="."` to `read_csv` handles both: the fields are split on the semicolons and each period is read as a missing value (`NaN`).
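
To make the effect of these two options concrete, here is a small sketch that does the analogous parsing with Python's standard `csv` module instead of pandas. The three-line sample text is a hypothetical excerpt written for this illustration, not the real file, and `None` stands in for the `NaN` that pandas would produce.

```python
import csv
import io

# Hypothetical excerpt in the same style as the data file:
# fields separated by ';' and missing values written as '.'
raw = 'Gender;FSIQ;Weight\nFemale;133;118\nMale;140;.\n'

rows = []
for record in csv.DictReader(io.StringIO(raw), delimiter=';'):
    # Replace the '.' placeholder with None to mark the value as missing
    rows.append({key: (None if value == '.' else value)
                 for key, value in record.items()})

print(rows[1])
```

The second record ends up with a missing `Weight`; the row itself is kept.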

In [58]:
# https://gist.github.com/shinokada/a4489f80a6d32d7955bc37a008cea79e

import pandas as pd
import numpy as np

brain_size = pd.read_csv('https://raw.githubusercontent.com/shinokada/python-for-ib-diploma-mathematics/master/Data/brain_size.csv', sep=';', na_values=".")
brain_size.head()

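|   | Unnamed: 0 | Gender | FSIQ | VIQ | PIQ | Weight | Height | MRI_Count |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Female | 133 | 132 | 124 | 118.0 | 64.5 | 816932 |
| 1 | 2 | Male | 140 | 150 | 124 | NaN | 72.5 | 1001121 |
| 2 | 3 | Male | 139 | 123 | 150 | 143.0 | 73.3 | 1038437 |
| 3 | 4 | Male | 133 | 129 | 128 | 172.0 | 68.8 | 965353 |
| 4 | 5 | Female | 137 | 132 | 134 | 147.0 | 65.0 | 951545 |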


In [59]:
brain_size.columns

Index(['Unnamed: 0', 'Gender', 'FSIQ', 'VIQ', 'PIQ', 'Weight', 'Height',
       'MRI_Count'],
      dtype='object')

The column labels are described below.

1. Gender: Male or Female
2. FSIQ: Full Scale IQ scores based on the four Wechsler (1981) subtests
3. VIQ: Verbal IQ scores based on the four Wechsler (1981) subtests
4. PIQ: Performance IQ scores based on the four Wechsler (1981) subtests
5. Weight: body weight in pounds
6. Height: height in inches
7. MRI_Count: total pixel count from the 18 MRI scans

Let's run the pooled two-sample t-test on FSIQ and VIQ. We need to select all rows of the columns at zero-based positions 2 and 3, which are `FSIQ` and `VIQ`.
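
`iloc` follows the same zero-based, end-exclusive slicing rules as Python lists. As a small sketch, a plain list of the column labels (no pandas needed) shows which columns the slices `2:3` and `3:4` pick out:

```python
# Column labels as printed by brain_size.columns
columns = ['Unnamed: 0', 'Gender', 'FSIQ', 'VIQ', 'PIQ',
           'Weight', 'Height', 'MRI_Count']

# Positions are zero-based and the end of a slice is excluded,
# so 2:3 picks only position 2 and 3:4 picks only position 3
print(columns[2:3])
print(columns[3:4])
```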

In [68]:
# https://gist.github.com/shinokada/352bab2b15235a6925a7d08c7906e999

fsiq = brain_size.iloc[:,2:3]
viq = brain_size.iloc[:,3:4]
print(fsiq.head())
print(viq.head())

   FSIQ
0   133
1   140
2   139
3   133
4   137
   VIQ
0  132
1  150
2  123
3  129
4  132


In [71]:
# https://gist.github.com/shinokada/84fbb9ea56766f5f82bef799391878a5

import statsmodels.stats.weightstats as sm

significance = 0.05

tstat, pvalue, df = sm.ttest_ind(
    fsiq,viq,
    alternative='smaller',
    usevar='pooled')

print("""Test statistic=%.2f, 
p-value=%.4f, 
degrees of freedom=%.0f\n""" % (tstat,pvalue,df))

# A p-value below the significance level is evidence against H0
if pvalue < significance:
    print("""At %.2f level of significance, 
we reject the null hypothesis. 
The mean FSIQ is less than the mean VIQ.""" % (significance))
else:
    print("""At %.2f level of significance, 
we fail to reject the null hypothesis. 
There is not enough evidence that the mean FSIQ is less than the mean VIQ.""" % (significance))

Test statistic=0.21, 
p-value=0.5814, 
degrees of freedom=78

At 0.05 level of significance, 
we fail to reject the null hypothesis. 
There is not enough evidence that the mean FSIQ is less than the mean VIQ.


# References

- https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

- https://www.statsmodels.org/dev/generated/statsmodels.stats.weightstats.ttest_ind.html

- https://scipy-lectures.org/packages/statistics/index.html