# Introduction to Sampling and Hypothesis Testing

## Hypothesis Tests | Parametric tests

In [None]:
# Select this cell and type Ctrl-Enter to execute the code below.

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

Parametric tests rely on a probability distribution of known form as a model for the null hypothesis. 

Here we will look at some commonly-encountered examples.

### How surprising is my result? Calculating a p-value

There are many circumstances where we simply want to check whether an observation is compatible with the null hypothesis, $H_{0}$.

Having decided on a significance level $\alpha$ and whether the situation warrants a one-tailed or a two-tailed test, we can use the cumulative distribution function (cdf) of the null distribution to calculate a p-value for the observation.
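
As a minimal sketch of this idea, assuming a binomial null distribution, the helper below turns an observed count into a one-tailed or two-tailed p-value. The function name `binom_p_value` and its `tail` argument are illustrative choices, not part of SciPy; only `stats.binom.cdf` and `stats.binom.sf` are library calls. The two-tailed value uses the common convention of doubling the smaller tail, which is one of several possible definitions.

```python
from scipy import stats


def binom_p_value(x, n, p, tail="upper"):
    """p-value for x successes in n trials under H0: success probability p.

    Hypothetical helper for illustration only.
    """
    lower = stats.binom.cdf(x, n, p)     # P(X <= x)
    upper = stats.binom.sf(x - 1, n, p)  # P(X >= x) = 1 - P(X <= x - 1)
    if tail == "lower":
        return lower
    if tail == "upper":
        return upper
    if tail == "two-sided":
        # double the smaller tail, capped at 1
        return min(1.0, 2 * min(lower, upper))
    raise ValueError("tail must be 'lower', 'upper' or 'two-sided'")


# Made-up counts: 8 successes in 10 trials, H0: p = 0.5
print(binom_p_value(8, 10, 0.5, tail="upper"))
print(binom_p_value(8, 10, 0.5, tail="two-sided"))
```

Note the `x - 1` in the upper tail: because the binomial distribution is discrete, $P(X \geq x)$ includes the observed value itself.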

#### Example: probability of rolling a six

Your arch-nemesis Blofeld always seems to win at ludo, and you have started to suspect him of using a loaded die.

You observe the following outcomes from 100 rolls of his die:


In [None]:
data = np.array([6, 1, 5, 6, 2, 6, 4, 3, 4, 6, 1, 2, 5, 6, 6, 3, 6, 2, 6, 4, 6, 2,
       5, 4, 2, 3, 3, 6, 6, 1, 2, 5, 6, 4, 6, 2, 1, 3, 6, 5, 4, 5, 6, 3,
       6, 6, 1, 4, 6, 6, 6, 6, 6, 2, 3, 1, 6, 4, 3, 6, 2, 4, 6, 6, 6, 5,
       6, 2, 1, 6, 6, 4, 3, 6, 5, 6, 6, 2, 6, 3, 6, 6, 1, 4, 6, 4, 2, 6,
       6, 5, 2, 6, 6, 4, 3, 1, 6, 6, 5, 5])

Do you have enough evidence to confront him?

In [None]:
# We will work with the binomial distribution for the observed number of sixes

# Write down the hypotheses
# H0: p = 1/6
# H1: p > 1/6

# choose a significance level
# alpha = 0.01

In [None]:
# code the data as 6=success and {1-5}=failure
six = np.where(data==6,1,0)
print(six)

# how many sixes were observed?
x = np.sum(six)
print(x)

# check number of trials
n = len(data)
print(n)

In [None]:
# now use H0 to find the p-value of the observed number of sixes
# one-tailed test: P(X >= x) = 1 - P(X <= x-1), hence k = (observed value - 1)
pval = 1 - stats.binom.cdf(k=x-1, n=n, p=1/6)
print(pval)

In [None]:
# pval is less than alpha, so we reject H0 at the 1% level: the die appears to favour sixes.
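
The one-tailed p-value for the die can be cross-checked in two other ways. The sketch below assumes a SciPy version that provides `stats.binomtest` (added in SciPy 1.7) and hard-codes the counts from the example (43 sixes in 100 rolls) so that it runs on its own.

```python
from scipy import stats

x, n = 43, 100  # observed sixes and number of rolls, from the example above
p0 = 1 / 6      # probability of a six under H0

# 1 - cdf, as in the example
p_cdf = 1 - stats.binom.cdf(x - 1, n, p0)

# survival function: the same quantity, but numerically more accurate
# when the p-value is very small
p_sf = stats.binom.sf(x - 1, n, p0)

# exact binomial test with a one-sided alternative
p_test = stats.binomtest(x, n, p=p0, alternative="greater").pvalue

print(p_cdf, p_sf, p_test)
```

All three routes should agree to within floating-point error; `sf` is usually the safer choice for tiny tail probabilities because it avoids subtracting a number very close to 1.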

#### Example: is the coin fair?

Dr Vogel has challenged you to a game of ludo and you have agreed to flip a coin to decide who starts.

You're not sure whether the coin she is using is fair or not.

She flips it 50 times for you, with the following results (1 = heads, 0 = tails):


In [None]:
data = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1])

Does the coin appear to be fair? If not, should you pick heads or tails?

In [None]:
# Again, this is a binomial distribution for the observed number of heads, but this time the test is two-tailed

# Write down the hypotheses
# H0: p = 1/2
# H1: p != 1/2

# choose a significance level
# alpha = 0.05

In [None]:
# find the number of heads
h = np.sum(data)
print(h)

# check number of trials
n = len(data)
print(n)

In [None]:
# expected number of heads under H0, to see which tail the observation falls in
ex = n * 0.5
print(ex)

In [None]:
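# the observed count (20) is below the expected value, so we use the lower tail
x1 = h
p1 = stats.binom.cdf(k=x1, n=n, p=0.5)  # P(X <= x1) under H0
pval = 2 * p1  # double the one-tailed probability for a two-tailed test
print(pval)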

In [None]:
# pval is greater than alpha, so we fail to reject H0: there is insufficient evidence that the coin is biased, at the 5% level.
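
As a cross-check on the doubling step, the sketch below compares it with SciPy's exact two-sided binomial test. It assumes `stats.binomtest` is available (SciPy 1.7 or later) and hard-codes the counts from the example (20 heads in 50 flips).

```python
from scipy import stats

h, n = 20, 50  # observed heads and number of flips, from the example above

# doubled lower-tail probability, as in the example
p_doubled = 2 * stats.binom.cdf(h, n, 0.5)

# exact two-sided binomial test
p_exact = stats.binomtest(h, n, p=0.5, alternative="two-sided").pvalue

print(p_doubled, p_exact)
```

The two values agree here because the null distribution with $p = 1/2$ is symmetric. For an asymmetric null distribution, doubling one tail and the exact two-sided test can give different answers.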

### Difference between two means: independent 2-sample t-test

We use the **t-test** to assess whether two independent samples drawn from normal distributions have significantly different means.

The test statistic follows a Student's t-distribution, provided that the variances of the two groups are equal.

Other variants of the t-test (for example Welch's t-test for unequal variances, or the paired t-test for matched samples) are applicable under different conditions.

The test statistic is

$$ t = \frac{\bar{X}_{1} - \bar{X}_{2}}{s_p \cdot \sqrt{\frac{1}{n_{1}} + \frac{1}{n_{2}}}} $$

where

$$ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$

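is the pooled estimate of the common standard deviation, $\bar{X}_{i}$ and $s_i^2$ are the sample mean and unbiased sample variance of group $i$, and $n_i$ is its size.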

Under the null hypothesis of equal means, the statistic follows a Student's t-distribution with $(n_{1} + n_{2} - 2)$ degrees of freedom.
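
The two formulas above can be sketched as a small function. The name `pooled_t_statistic` and the toy data are hypothetical, chosen only for illustration; the two-tailed p-value is then read off the t-distribution with `stats.t.sf`.

```python
import numpy as np
from scipy import stats


def pooled_t_statistic(sample1, sample2):
    """Return (t, df) for the equal-variance two-sample t-test.

    Hypothetical helper for illustration only.
    """
    x1 = np.asarray(sample1, dtype=float)
    x2 = np.asarray(sample2, dtype=float)
    n1, n2 = len(x1), len(x2)
    var1 = np.var(x1, ddof=1)  # unbiased sample variances
    var2 = np.var(x2, ddof=1)
    # pooled standard deviation s_p
    sp = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    t = (x1.mean() - x2.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Made-up numbers, for illustration only
a = [1.0, 2.0, 3.0]
b = [2.0, 3.0, 4.0]
t, df = pooled_t_statistic(a, b)
p_two_sided = 2 * stats.t.sf(abs(t), df)  # two-tailed p-value
print(t, df, p_two_sided)
```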

#### Example: difference in birth weight

The birth weights of babies (in kg) have been measured for a sample of mothers split into two categories: nonsmoking and heavy smoking.

- The two categories are measured independently of each other.
- Both come from normal distributions.
- The two groups are assumed to have the same unknown variance.

In [None]:
data_nonsmoking = np.array([3.99, 3.79, 3.60, 3.73, 3.21, 3.60, 4.08, 3.61, 3.83, 3.31, 4.13, 3.26, 3.54])
data_heavysmoking = np.array([3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23, 2.76, 3.60, 3.75, 3.59, 3.63, 2.38, 2.34, 2.44])


We want to know whether there is a significant difference in mean birth weight between the two categories.


In [None]:
# Write down the hypotheses
# let d = (mean birth weight, nonsmoking) - (mean birth weight, heavy smoking)
# H0: there is no difference in mean birth weight between groups: d == 0
# H1: there is a difference, d != 0

# choose a significance level
# alpha = 0.05

In [None]:

n_ns = len(data_nonsmoking)
n_hs = len(data_heavysmoking)

mean_ns = np.mean(data_nonsmoking)
mean_hs = np.mean(data_heavysmoking)

s_ns = np.std(data_nonsmoking,ddof=1)
s_hs = np.std(data_heavysmoking,ddof=1)

print(n_ns,mean_ns,s_ns)
print(n_hs,mean_hs,s_hs)

In [None]:
# difference between the two sample means:
d_obs = mean_ns - mean_hs
print(d_obs)

In [None]:
# the pooled standard deviation
sp = np.sqrt(((n_ns - 1)*s_ns**2 + (n_hs - 1)*s_hs**2)/(n_ns + n_hs - 2))
print(sp)

In [None]:
# the test statistic
t_obs = d_obs/(sp * np.sqrt(1/n_ns + 1/n_hs))
print(t_obs)

In [None]:
# the degrees of freedom are given by n1 + n2 - 2
df = n_ns + n_hs - 2
print(df)

In [None]:
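# find the critical values for a two-tailed test at alpha = 0.05
# (the level is passed positionally because its keyword name differs between SciPy versions)
print(stats.t.interval(0.95, df=df))  # central interval containing 95% of the null distribution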

In [None]:
# t_obs lies outside the central 95% interval of the null t-distribution, so we reject H0 at the 5% level
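
As a cross-check on the manual calculation, SciPy's `stats.ttest_ind` with `equal_var=True` performs the same pooled-variance test and also returns a two-tailed p-value. The sketch below repeats the data so that it runs on its own; the variable names `t_crit` and `p_manual` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

data_nonsmoking = np.array([3.99, 3.79, 3.60, 3.73, 3.21, 3.60, 4.08,
                            3.61, 3.83, 3.31, 4.13, 3.26, 3.54])
data_heavysmoking = np.array([3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23, 2.76,
                              3.60, 3.75, 3.59, 3.63, 2.38, 2.34, 2.44])

# pooled-variance (Student's) two-sample t-test, two-tailed by default
result = stats.ttest_ind(data_nonsmoking, data_heavysmoking, equal_var=True)
print(result.statistic, result.pvalue)

# the same decision via the critical value and via the t-distribution tails
df = len(data_nonsmoking) + len(data_heavysmoking) - 2
t_crit = stats.t.ppf(0.975, df)                       # upper 2.5% point
p_manual = 2 * stats.t.sf(abs(result.statistic), df)  # two-tailed p-value
print(t_crit, p_manual)
```

Comparing `abs(result.statistic)` with `t_crit` and comparing `result.pvalue` with $\alpha$ are equivalent ways of reaching the same decision.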