# Non-parametric Statistical Hypothesis Tests (to compare sample distributions)

## 1. Mann-Whitney U Test
Tests whether two independent samples come from the same distribution.

**Assumptions**
- Observations in each sample are independent and identically distributed.
- Observations in each sample can be ranked.
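
As a small illustration of what "can be ranked" means in practice, the sketch below ranks a pooled set of observations with `scipy.stats.rankdata`. The values in `pooled` are hypothetical and chosen only to show how tied observations share an average rank.

```python
from scipy.stats import rankdata

# Hypothetical pooled observations; the two 0.5 values are tied.
pooled = [0.5, -1.2, 0.5, 2.0]

# The default method='average' gives tied values the mean of the ranks they span.
ranks = rankdata(pooled)
print(ranks)  # [2.5 1.  2.5 4. ]
```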

**Interpretation**
- H0: the distributions of the two samples are equal.
- H1: the distributions of the two samples are not equal.

**More Information**
- [scipy.stats.mannwhitneyu](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html)
- [Mann-Whitney U test on Wikipedia](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test)

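```python
# Example of the Mann-Whitney U Test
from scipy.stats import mannwhitneyu

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]

stat, p = mannwhitneyu(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))

# Fail to reject H0 at the 5% significance level when p > 0.05
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```

```
stat=40.000, p=0.236
Probably the same distribution
```

The p-value shown matches the default of SciPy releases before 1.7, which reported half of the two-sided p-value. From SciPy 1.7 onward `mannwhitneyu` defaults to `alternative='two-sided'`, so the same call reports a p-value about twice as large; the conclusion at the 0.05 level is unchanged.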
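
To make the statistic less of a black box, here is a minimal sketch that counts U directly from pairwise comparisons and then calls `mannwhitneyu` with `alternative` passed explicitly. It assumes SciPy 1.7 or later, where the returned statistic is U for the first sample; the names `u1`, `u2`, and `result` are illustrative only.

```python
from scipy.stats import mannwhitneyu

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]

# U1 counts the (x, y) pairs with x from data1 greater than y from data2;
# a tied pair would count 0.5 (there are no ties in this data).
u1 = sum((x > y) + 0.5 * (x == y) for x in data1 for y in data2)
u2 = len(data1) * len(data2) - u1

# Passing alternative explicitly avoids relying on a version-specific default.
result = mannwhitneyu(data1, data2, alternative='two-sided')

print('U1=%.1f, U2=%.1f' % (u1, u2))  # U1=40.0, U2=60.0
print('stat=%.3f, p=%.3f' % (result.statistic, result.pvalue))
```

The two counts always sum to the number of pairs, `len(data1) * len(data2)`, so either one determines the other.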


## 2. Kruskal-Wallis H Test
Tests whether two or more independent samples come from the same distribution.

**Assumptions**
- Observations in each sample are independent and identically distributed.
- Observations in each sample can be ranked.

**Interpretation**
- H0: the distributions of all samples are equal.
- H1: the distribution of at least one sample differs from the others.
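
The hypotheses above are assessed through the H statistic, which compares the rank sums of the groups after all observations are ranked together. The following is a rough sketch of that calculation for data without ties, cross-checked against `scipy.stats.kruskal`. The three samples in `groups` are hypothetical and not part of the original example.

```python
from scipy.stats import kruskal, rankdata

# Hypothetical samples with no tied values.
groups = [[1.0, 4.0, 6.0], [2.0, 3.0, 5.0, 9.0], [7.0, 8.0, 10.0]]

# Rank all observations together, then sum the ranks within each group.
pooled = [x for g in groups for x in g]
ranks = rankdata(pooled)
n_total = len(pooled)

rank_sums = []
start = 0
for g in groups:
    rank_sums.append(ranks[start:start + len(g)].sum())
    start += len(g)

# H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), valid when there are no ties.
h = 12.0 / (n_total * (n_total + 1)) * sum(
    r ** 2 / len(g) for r, g in zip(rank_sums, groups)
) - 3 * (n_total + 1)

stat, p = kruskal(*groups)
print('manual H=%.3f, scipy H=%.3f, p=%.3f' % (h, stat, p))
```

When ties are present, SciPy applies a tie correction, so the uncorrected formula in this sketch would no longer match exactly.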

**More Information**
- [scipy.stats.kruskal](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kruskal.html)
- [Kruskal-Wallis one-way analysis of variance on Wikipedia](https://en.wikipedia.org/wiki/Kruskal%E2%80%93Wallis_one-way_analysis_of_variance)

```python
# Example of the Kruskal-Wallis H Test
from scipy.stats import kruskal

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]

stat, p = kruskal(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))

if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```

```
stat=0.571, p=0.450
Probably the same distribution
```
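
With exactly two samples, the Kruskal-Wallis test is closely related to the Mann-Whitney U test: its p-value coincides with the two-sided, normal-approximation Mann-Whitney p-value when no continuity correction is applied. The sketch below checks this on the same data, assuming SciPy 1.7 or later (the `method` argument of `mannwhitneyu` is not available in older releases).

```python
from scipy.stats import kruskal, mannwhitneyu

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]

h_stat, p_kw = kruskal(data1, data2)

# Normal approximation without continuity correction, two-sided.
mw = mannwhitneyu(
    data1,
    data2,
    use_continuity=False,
    alternative='two-sided',
    method='asymptotic',
)

print('Kruskal-Wallis p=%.3f' % p_kw)
print('Mann-Whitney p=%.3f' % mw.pvalue)
```

The practical consequence is that Kruskal-Wallis adds nothing over Mann-Whitney for two samples; its value lies in comparing three or more groups in a single test.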
