In [1]:
import numpy as np
from scipy.stats import norm

# ONE-SAMPLE TESTS

In this notebook, you'll practice one-sample tests, namely z- and t-tests for testing whether the population mean equals the hypothesized mean.

## P-VALUES REMINDER

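Remember that a p-value is the probability, computed under the assumption that $H_0$ is true, of observing a value of the test statistic at least as extreme as the one you actually observed. What 'at least as extreme' means in practice depends on the alternative hypothesis of your test (here $T$ is the test statistic and $x$ is its observed value):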

* One-sided 'greater': $p = P(T > x)$
* One-sided 'less': $p = P(T \leq x)$
* Two-sided: $p = 2\min\left(P(T > x), P(T \leq x)\right)$
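As a quick illustration of the three formulas, here is a minimal sketch that assumes the test statistic follows a standard normal distribution under $H_0$. The helper name `p_value_from_statistic` and the observed value 1.5 are made up for this example.

```python
from scipy.stats import norm


def p_value_from_statistic(x, alternative="two-sided"):
    """p-value for an observed statistic x, assuming T ~ N(0, 1) under H0."""
    upper = norm.sf(x)   # P(T > x)
    lower = norm.cdf(x)  # P(T <= x)
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        return 2 * min(upper, lower)
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical observed value of the statistic
for alternative in ("greater", "less", "two-sided"):
    print(alternative, p_value_from_statistic(1.5, alternative))
```

Note that the two one-sided p-values always sum to 1 for a continuous statistic, so at most one of them can be small.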

## Z-TEST

As you remember, a z-test is used to test if the population mean $\mu$ equals the hypothesized mean $\mu_0$ when the following assumptions are satisfied:

1. the observations $X_1, ..., X_n$ are independent and come from a normal distribution $N(\mu, \sigma^2)$ and
2. the true variance of the underlying distribution $\sigma^2$ is known.

We test the null hypothesis

$H_0: \mu = \mu_0$

against one of the following alternatives:

1. (two-sided) $H_1: \mu \neq \mu_0$
2. (one-sided-greater) $H_1: \mu > \mu_0$
3. (one-sided-smaller) $H_1: \mu < \mu_0$

The test statistic is as follows:

$T(X) = \frac{(\bar{X} - \mu_0)\sqrt{n}}{\sigma} $

Assuming that $H_0$ holds, $T(X) \sim N(0, 1)$. 



Define a function *z_test()* that performs the z-test as described above.

The function should take in 
* data samples;
* hypothesized value of the mean $\mu_0$;
* true value of the $\sigma$ parameter;
* significance level $\alpha$;
* indicator for the type of the alternative hypothesis that $H_0$ is being tested against (*two-sided, one-sided-greater* or *one-sided-less*)

and return the resulting p-value.

In [2]:
# Your code here

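def z_test(data, mu, sigma, a=0.05, kind='two-sided'):
  # a (significance level) is part of the requested signature;
  # the function itself only returns the p-value
  sample_mean = np.mean(data)
  n = len(data)
  # z statistic: standardized distance between the sample mean and mu
  t = (sample_mean - mu)*np.sqrt(n)/sigma
  print(t)
  if kind == 'two-sided':
    p_value = 2*min(1 - norm.cdf(t), norm.cdf(t))
  elif kind == 'greater':
    p_value = (1 - norm.cdf(t))  # P(T > t)
  elif kind in ('less', 'smaller'):
    p_value = norm.cdf(t)  # P(T <= t)
  else:
    raise ValueError(f"unknown kind: {kind!r}")

  return p_value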

Now, it's time to test your function! 

Let's solve the problem from yesterday's lecture once again.

A supermarket gets bread from a bakery. They expect each loaf to weigh about 2 kg with a standard deviation of 0.1 kg, but a random sample of 20 loaves has an average weight of 1.97 kg.

The code below generates the data for the experiment.

In [3]:
loaves = [2, 1.97, 1.94, 2, 1.97, 1.94, 2, 1.97, 1.94, 1.97,
          2, 1.97, 1.94, 2, 1.97, 1.94, 2, 1.97, 1.94, 1.97]

sigma = 0.1

mu = 2

Is there evidence strong enough to believe that the bakery isn't delivering bread with the mean weight of 2 kg?

Apply the function you've defined above to run z-test on this problem. Specify the null hypothesis and possible alternatives (two-sided and two one-sided ones). Test all of them. 

Based on the p-value output by your function, do you reject the null hypothesis in each of the cases?

In [4]:
# Your code here

z_test(loaves, mu, sigma, a=0.05, kind='two-sided')

-1.341640786499895


0.179712494878993

In [5]:
z_test(loaves, mu, sigma, a=0.05, kind='greater')

-1.341640786499895


0.9101437525605035

In [6]:
z_test(loaves, mu, sigma, a=0.05, kind='smaller')

-1.341640786499895


0.0898562474394965
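To answer the question about rejecting $H_0$, each p-value is compared with the significance level. The sketch below recomputes the three p-values directly, so it runs on its own, and assumes $\alpha = 0.05$ as in the calls above. The dictionary names `p_values` and `decisions` are introduced only for this illustration.

```python
import numpy as np
from scipy.stats import norm

loaves = [2, 1.97, 1.94, 2, 1.97, 1.94, 2, 1.97, 1.94, 1.97,
          2, 1.97, 1.94, 2, 1.97, 1.94, 2, 1.97, 1.94, 1.97]
mu_0, sigma, alpha = 2, 0.1, 0.05  # alpha = 0.05 is an assumed significance level

# z statistic for H0: mu = mu_0 with known sigma
z = (np.mean(loaves) - mu_0) * np.sqrt(len(loaves)) / sigma

p_values = {
    "two-sided": 2 * min(norm.sf(z), norm.cdf(z)),
    "greater": norm.sf(z),
    "less": norm.cdf(z),
}

# Reject H0 only when the p-value falls below alpha
decisions = {kind: p < alpha for kind, p in p_values.items()}
for kind, p in p_values.items():
    verdict = "reject H0" if decisions[kind] else "fail to reject H0"
    print(f"{kind}: p = {p:.4f} -> {verdict}")
```

With the p-values shown above (about 0.18, 0.91 and 0.09), none is below 0.05, so at this level $H_0$ is not rejected against any of the three alternatives.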

## T-TEST

A **t-test** is used to test if the population mean $\mu$ equals the hypothesized mean $\mu_0$ when the following assumptions are satisfied:

1. the observations $X_1, ..., X_n$ are independent and come from a normal distribution $N(\mu, \sigma^2)$ and
2. the true variance of the underlying distribution $\sigma^2$ is **unknown**.

We test the null hypothesis

$H_0: \mu = \mu_0$

against one of the following alternatives:

1. (two-sided) $H_1: \mu \neq \mu_0$
2. (one-sided-greater) $H_1: \mu > \mu_0$
3. (one-sided-smaller) $H_1: \mu < \mu_0$

The test statistic is as follows:

$T(X) = \frac{(\bar{X} - \mu_0)\sqrt{n}}{s} $, where $s^2 = \frac{1}{n-1}\sum_{i=1}^n{(X_i-\bar{X})^2}$ is the sample variance.

Assuming that $H_0$ holds, $T(X) \sim t(n-1)$. 

A one-sample t-test is implemented in Python as the *ttest_1samp()* function in the SciPy library.

See details: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_1samp.html#scipy.stats.ttest_1samp
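To connect the formula with the library call, the sketch below computes the t statistic and the two-sided p-value by hand and compares them with what *ttest_1samp()* returns. It reuses the bread data from the z-test section, this time treating $\sigma$ as unknown. The variable names are illustrative only.

```python
import numpy as np
from scipy.stats import t as t_dist, ttest_1samp

loaves = [2, 1.97, 1.94, 2, 1.97, 1.94, 2, 1.97, 1.94, 1.97,
          2, 1.97, 1.94, 2, 1.97, 1.94, 2, 1.97, 1.94, 1.97]
mu_0 = 2

n = len(loaves)
s = np.std(loaves, ddof=1)  # ddof=1 gives the 1/(n-1) sample standard deviation
t_manual = (np.mean(loaves) - mu_0) * np.sqrt(n) / s

# Two-sided p-value from the t distribution with n-1 degrees of freedom
p_manual = 2 * min(t_dist.sf(t_manual, df=n - 1), t_dist.cdf(t_manual, df=n - 1))

# ttest_1samp returns the statistic and the two-sided p-value by default
t_scipy, p_scipy = ttest_1samp(loaves, mu_0)

print("statistic:", t_manual, t_scipy)
print("p-value:  ", p_manual, p_scipy)
```

The two pairs of numbers should agree up to floating-point error.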

Let's run an experiment to see how sample size influences the outcome of the statistical test.

For a sample size $n = 20$, generate a random sample from the normal distribution $N(1.97, 0.1^2)$, that is, with mean 1.97 and standard deviation 0.1.

Pretend now that you don't know the true parameters of the underlying distribution. With the t-test, test the hypothesis $H_0: \mu = 2$ against a one-sided alternative  $H_1: \mu < 2$ (treat $\sigma$ as unknown).

What should be the "correct" outcome of the test?

Repeat this $N = 1000$ times. How many times do you reject the null hypothesis?

In [10]:
# Your code here

from scipy.stats import ttest_1samp

n = 20
mu = 2

count = 0
a = 0.05

N = 1000

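for _ in range(N):
  data = np.random.normal(1.97, 0.1, n)
  # for H1: mu < 2, the one-sided p-value is half the two-sided one,
  # provided the statistic is negative (points in the direction of H1)
  stat, p_two_sided = ttest_1samp(data, mu)
  if stat < 0 and p_two_sided/2 < a:
    count += 1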

print('Share of times we reject H0: ', count/N)

Share of times we reject H0:  0.363


Now, gradually increase the sample size $n$ and repeat the experiment. What do you observe?

In [11]:
# Your code here

n = 1000
mu = 2

count = 0
a = 0.05

N = 1000

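for _ in range(N):
  data = np.random.normal(1.97, 0.1, n)
  # for H1: mu < 2, the one-sided p-value is half the two-sided one,
  # provided the statistic is negative (points in the direction of H1)
  stat, p_two_sided = ttest_1samp(data, mu)
  if stat < 0 and p_two_sided/2 < a:
    count += 1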

print('Share of times we reject H0: ', count/N)

Share of times we reject H0:  1.0
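To see the trend across several sample sizes at once, the experiment can be wrapped in a loop over $n$. This is a minimal sketch: the grid of sample sizes and the seed are arbitrary choices, and the `axis=1` argument of *ttest_1samp()* runs one test per row, so all 1000 experiments for a given $n$ are evaluated in a single call.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility
mu_0, alpha, n_experiments = 2, 0.05, 1000

rejection_share = {}
for n in (20, 50, 100, 200, 1000):  # assumed grid of sample sizes
    # One row per experiment, n observations per row
    samples = rng.normal(1.97, 0.1, size=(n_experiments, n))
    stat, p_two_sided = ttest_1samp(samples, mu_0, axis=1)
    # One-sided test of H1: mu < mu_0 at level alpha
    rejected = (stat < 0) & (p_two_sided / 2 < alpha)
    rejection_share[n] = rejected.mean()
    print(f"n = {n:4d}: share of rejections = {rejection_share[n]:.3f}")
```

Since the true mean is 1.97, $H_0: \mu = 2$ is false in every experiment, so the share of rejections estimates the power of the test, which grows with the sample size.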
