# T-test

A t-test is a basic statistical test for comparing mean values. There are three types of t-tests:

- One-sample t-test: determines whether the mean of a sample differs significantly from a known reference mean (expected value).
- Two independent samples t-test: determines if the mean values of two independent groups differ significantly.
- Paired samples t-test: determines if the mean values of two paired (and consequently related) groups differ significantly.

## One sample

- $H_0$: the mean of the population from which the sample was drawn is equal to the given reference mean.
- $H_1$: the mean of the population from which the sample was drawn is not equal to the given reference mean.

Suppose we have a sample, $X$, and want to test whether its mean $\overline{X}$ differs significantly from the given reference mean $\mu$.

Introduce the $t$ statistic:

$$t=\frac{\overline{X} - \mu}{\frac{s}{\sqrt{n}}}$$

Here:
- $s$: the sample standard deviation of $X$ (computed with $n - 1$ in the denominator).
- $n$: the sample size, $n = |X|$.

Under $H_0$, and assuming the observations are normally distributed, the variable $t$ follows a Student's $t$ distribution with $n-1$ degrees of freedom: $t \sim T(n - 1)$.

---

As an example, the test is computed below without special tools, and the result is compared with that of the specialized t-test function, `scipy.stats.ttest_1samp`.

The following cell generates the sample used in the experiment.

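```python
import numpy as np
from scipy import stats

# Fix the seed so the sample is reproducible.
np.random.seed(11)
n = 500
# Draw n observations from a standard normal distribution (true mean 0).
sample = np.random.normal(0, 1, n)
```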

The following code computes the t-statistic using only `numpy`, and the two-sided p-value using only the cumulative distribution function of the Student's $t$ distribution.

```python
# t = (sample mean - reference mean) / (s / sqrt(n)), with reference mean 0.
t_stat = (np.mean(sample) - 0) / (np.std(sample, ddof=1) / np.sqrt(n))
# Two-sided p-value: twice the upper-tail probability beyond |t|.
p_value = (1 - stats.t.cdf(np.abs(t_stat), n - 1)) * 2
float(t_stat), float(p_value)
```

```
(-0.7531988423186389, 0.4516856402737408)
```

The following cell performs the same computation with the specialized function; the results match.

```python
# Same test, delegated to SciPy.
t_stat, p_value = stats.ttest_1samp(sample, popmean=0)
float(t_stat), float(p_value)
```

```
(-0.7531988423186389, 0.4516856402737408)
```
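
To turn the p-value into a decision, it can be compared with a significance level. The sketch below assumes $\alpha = 0.05$ (a conventional choice, not something fixed by the text above) and shows that comparing the p-value with $\alpha$ is equivalent to comparing $|t|$ with the critical value of the $T(n - 1)$ distribution. The names `alpha`, `t_crit`, `reject_by_p`, and `reject_by_t` are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

# Same sample as in the experiment above.
np.random.seed(11)
n = 500
sample = np.random.normal(0, 1, n)

alpha = 0.05  # assumed significance level

t_stat, p_value = stats.ttest_1samp(sample, popmean=0)

# Two-sided critical value: the (1 - alpha / 2) quantile of T(n - 1).
t_crit = stats.t.ppf(1 - alpha / 2, n - 1)

# The two decision rules should agree.
reject_by_p = bool(p_value < alpha)
reject_by_t = bool(abs(t_stat) > t_crit)
print(reject_by_p, reject_by_t)
```

With the p-value reported above (about 0.45), $H_0$ is not rejected at this level, which is expected because the sample was generated with a true mean of 0.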

## Two sample

- $H_0$: the mean values in both groups are the same.
- $H_1$: the mean values in the two groups differ.
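
As a rough illustration of these hypotheses, the sketch below tests two independent groups. The text above does not specify which variant of the statistic is meant, so this example assumes the classical Student version with equal variances in both groups (pooled variance, $n_a + n_b - 2$ degrees of freedom) and compares a manual computation with `scipy.stats.ttest_ind`. The data, seed, and variable names are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # assumed seed, for reproducibility only
a = rng.normal(0.0, 1.0, 300)   # hypothetical group A
b = rng.normal(0.2, 1.0, 400)   # hypothetical group B

n_a, n_b = len(a), len(b)
df = n_a + n_b - 2

# Pooled variance under the equal-variance assumption.
s2_pooled = ((n_a - 1) * np.var(a, ddof=1) + (n_b - 1) * np.var(b, ddof=1)) / df

# t statistic: difference of means over its estimated standard error.
t_manual = (np.mean(a) - np.mean(b)) / np.sqrt(s2_pooled * (1 / n_a + 1 / n_b))

# Two-sided p-value from the survival function of T(df).
p_manual = 2 * stats.t.sf(np.abs(t_manual), df)

# Same test via SciPy; equal_var=True selects the pooled-variance version.
t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=True)

print(float(t_manual), float(p_manual))
print(float(t_scipy), float(p_scipy))
```

If the equal-variance assumption is doubtful, passing `equal_var=False` to `stats.ttest_ind` performs Welch's t-test instead.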

## Paired samples

Paired samples arise when a single group of objects is observed under two different conditions or at two points in time, and the question is whether the mean of their measurements changes.

* $X_1$, $X_2$: sets of observations under two conditions.
* $x_{i1} \in X_1$: the $i$-th observation from the first condition.
* $x_{i2} \in X_2$: the $i$-th observation from the second condition.

Since $X_1$ and $X_2$ are related (i.e., paired), the standard two-sample t-test is not appropriate. Instead, the problem can be reduced to a one-sample t-test by computing the differences $\delta_i = x_{i1} - x_{i2}$ and testing the null hypothesis $H_0$: the mean of $\delta_i$ is equal to zero.
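
The reduction described above can be sketched as follows: a one-sample t-test on the differences $\delta_i$ should give the same result as SciPy's dedicated paired test, `scipy.stats.ttest_rel`. The data here are hypothetical (the same objects measured twice, with a small simulated shift), and the seed and variable names are assumptions made for the illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # assumed seed, for reproducibility only
n = 200

# Hypothetical paired data: the same n objects measured under two conditions.
x1 = rng.normal(10.0, 2.0, n)
x2 = x1 + rng.normal(0.3, 1.0, n)

# Reduce the problem to a one-sample test on the differences.
delta = x1 - x2
t_one, p_one = stats.ttest_1samp(delta, popmean=0)

# SciPy's paired-samples test on the original measurements.
t_rel, p_rel = stats.ttest_rel(x1, x2)

print(float(t_one), float(p_one))
print(float(t_rel), float(p_rel))
```

Both calls should print the same statistic and p-value, up to floating-point rounding.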