# Parametric Statistical Hypothesis Tests (to compare data samples)

## 1. Student’s t-test
Tests whether the average (expected) values of two independent samples are significantly different.

**Assumptions**
- Observations in each sample are independent and identically distributed.
- Observations in each sample are normally distributed.
- Observations in each sample have the same variance.

**Interpretation**
- H0: the means of the samples are identical.
- H1: the means of the samples are not identical.

**More Information**
- [scipy.stats.ttest_ind](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html)
- [Student’s t-test on Wikipedia](https://en.wikipedia.org/wiki/Student%27s_t-test)

# Example of the Student's t-test
from scipy.stats import ttest_ind

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]

stat, p = ttest_ind(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))

if p > 0.05:
    print('Probably the same average values')
else:
    print('Probably different average values')

stat=-0.326, p=0.748
Probably the same average values
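
To make the role of the equal-variance assumption concrete, the following is a minimal sketch that recomputes the t statistic by hand from the pooled variance, using only the standard library. The variable names (`pooled_var`, `std_err`, `dof`) are illustrative and not part of SciPy; the result should agree with the `stat` value reported above.

```python
# Sketch: Student's t statistic computed by hand (equal-variance form)
from math import sqrt
from statistics import mean, variance

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]

n1, n2 = len(data1), len(data2)

# Pooled variance: sample variances (n - 1 denominator) weighted by their degrees of freedom
pooled_var = ((n1 - 1) * variance(data1) + (n2 - 1) * variance(data2)) / (n1 + n2 - 2)

# Standard error of the difference between the two sample means
std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = (mean(data1) - mean(data2)) / std_err
dof = n1 + n2 - 2  # degrees of freedom of the reference t distribution

print('stat=%.3f, dof=%d' % (t_stat, dof))
```

If the equal-variance assumption is doubtful, `ttest_ind` also accepts `equal_var=False`, which runs Welch's t-test instead of pooling the variances.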


## 2. Paired Student’s t-test
Tests whether two related or repeated samples have identical average (expected) values.
Typical uses are the scores of the same set of students in different exams, or repeated measurements taken from the same units. The test measures whether the average score differs significantly between the two samples (e.g. exams).

**Assumptions**
- Observations across the two samples are paired.
- The paired differences are independent and identically distributed.
- The paired differences are normally distributed.

**Interpretation**
- H0: the means of the paired samples are equal (the mean of the paired differences is zero).
- H1: the means of the paired samples are not equal.

**More Information**
- [scipy.stats.ttest_rel](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_rel.html)
- [Student’s t-test for paired samples on Wikipedia](https://en.wikipedia.org/wiki/Student%27s_t-test#Dependent_t-test_for_paired_samples)

# Example of the Paired Student's t-test
from scipy.stats import ttest_rel
data1 = [0.673, 2.617, 0.021, -0.745, -0.067, -1.235, 0.163, -1.287, -1.435, -1.768]
data2 = [1.132, -0.421, -0.927, -0.738, -0.926, -0.130, 0.481, 1.161, -1.102, -0.147]
stat, p = ttest_rel(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same average values')
else:
    print('Probably different average values')

stat=-0.300, p=0.771
Probably the same average values
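
The paired test is equivalent to a one-sample t-test on the per-pair differences. The sketch below recomputes the statistic that way with the standard library only; the names `diffs` and `dof` are illustrative, and the result should agree with the `stat` value reported above.

```python
# Sketch: paired t statistic computed from the per-pair differences
from math import sqrt
from statistics import mean, stdev

data1 = [0.673, 2.617, 0.021, -0.745, -0.067, -1.235, 0.163, -1.287, -1.435, -1.768]
data2 = [1.132, -0.421, -0.927, -0.738, -0.926, -0.130, 0.481, 1.161, -1.102, -0.147]

# Each observation in data1 is paired with the observation at the same index in data2
diffs = [a - b for a, b in zip(data1, data2)]
n = len(diffs)

# Mean difference divided by its standard error (sample standard deviation / sqrt(n))
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
dof = n - 1  # degrees of freedom of the reference t distribution

print('stat=%.3f, dof=%d' % (t_stat, dof))
```

Because only the differences enter the calculation, the test depends on how the two samples vary together, not on each sample's variance separately.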


## 3. Analysis of Variance Test (ANOVA)
Tests whether the means of two or more independent samples are significantly different.

**Assumptions**
- Observations in each sample are independent and identically distributed.
- Observations in each sample are normally distributed.
- The standard deviations of the data samples are equal.

**Interpretation**
- H0: the means of the samples are equal.
- H1: at least one of the sample means differs from the others.

**More Information**
- [scipy.stats.f_oneway](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html)
- [Analysis of variance on Wikipedia](https://en.wikipedia.org/wiki/Analysis_of_variance)

# Example of the Analysis of Variance Test (ANOVA)
from scipy.stats import f_oneway
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]
stat, p = f_oneway(data1, data2, data3)

print('stat=%.3f, p=%.3f' % (stat, p))

if p > 0.05:
    print('Probably the same average values')
else:
    print('Probably at least one different average value')

stat=0.096, p=0.908
Probably the same average values
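
The F statistic is the ratio of the variability between the group means to the variability within the groups. The following is a minimal standard-library sketch of that calculation; the names `ss_between` and `ss_within` are illustrative, and the result should agree with the `stat` value reported above.

```python
# Sketch: one-way ANOVA F statistic computed by hand
from statistics import mean

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]

groups = [data1, data2, data3]
k = len(groups)                        # number of groups
n_total = sum(len(g) for g in groups)  # total number of observations

group_means = [mean(g) for g in groups]
grand_mean = mean([x for g in groups for x in g])

# Between-group sum of squares: spread of the group means around the grand mean
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Within-group sum of squares: spread of the observations around their own group mean
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

df_between = k - 1
df_within = n_total - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print('stat=%.3f, df=(%d, %d)' % (f_stat, df_between, df_within))
```

An F value well below 1, as here, means the group means differ less than the within-group noise alone would suggest.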


## 4. Repeated Measures ANOVA Test
Tests whether there is a statistically significant difference between the means of three or more groups in which the same subjects appear in each group. It is typically used in two situations:
1. Measuring the mean scores of subjects at three or more time points.
2. Measuring the mean scores of subjects under three or more different conditions.

**Assumptions**
- Subjects are independent of one another, and observations across samples are paired by subject.
- Observations in each sample are normally distributed.
- The variances of the differences between all pairs of samples are equal (sphericity).

**Interpretation**
- H0: the means of the samples are equal.
- H1: at least one of the means is different from the rest.

**More Information**
- [Analysis of variance on Wikipedia](https://en.wikipedia.org/wiki/Analysis_of_variance)

**Python Code**
SciPy does not currently provide a repeated measures ANOVA; implementations exist in statsmodels (`AnovaRM`) and pingouin (`rm_anova`).
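
To show what such an implementation computes, here is a minimal standard-library sketch of the one-way repeated measures F statistic. The `scores` table is hypothetical data made up for illustration (rows are subjects, columns are conditions), and the variable names are not taken from any library. The sketch does not test or correct for sphericity.

```python
# Sketch: one-way repeated measures ANOVA F statistic computed by hand
# Hypothetical data: each row is one subject, each column one condition (or time point)
scores = [
    [45, 50, 55],
    [42, 42, 45],
    [36, 41, 43],
    [39, 35, 40],
]

n_subjects = len(scores)
n_conditions = len(scores[0])

grand_mean = sum(sum(row) for row in scores) / (n_subjects * n_conditions)
condition_means = [sum(row[j] for row in scores) / n_subjects for j in range(n_conditions)]
subject_means = [sum(row) / n_conditions for row in scores]

# Variability explained by the conditions (the effect of interest)
ss_conditions = n_subjects * sum((m - grand_mean) ** 2 for m in condition_means)

# Variability explained by stable differences between subjects (removed from the error term)
ss_subjects = n_conditions * sum((m - grand_mean) ** 2 for m in subject_means)

# Remaining variability after removing condition and subject effects
ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_error = ss_total - ss_conditions - ss_subjects

df_conditions = n_conditions - 1
df_error = (n_conditions - 1) * (n_subjects - 1)
f_stat = (ss_conditions / df_conditions) / (ss_error / df_error)

print('F(%d, %d)=%.3f' % (df_conditions, df_error, f_stat))
```

A p-value can then be read from the F distribution, for example with `scipy.stats.f.sf(f_stat, df_conditions, df_error)`. For real analyses, a library routine such as statsmodels' `AnovaRM`, which takes a long-format table with one row per subject and condition, is likely the safer choice.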