Non-parametric tests are for data where we do not know, or do not want to assume, a specific distribution (such as the Gaussian/normal).

# <span style="color:Orange"> Functions Summary </span>


In [28]:
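from scipy.stats import mannwhitneyu, wilcoxon, kruskal, friedmanchisquare
stat, p = mannwhitneyu(data1, data2)               # two independent samples
stat, p = wilcoxon(data1, data2)                   # two paired samples
stat, p = kruskal(data1, data2, data3)             # more than two independent samples
stat, p = friedmanchisquare(data1, data2, data3)   # more than two paired samples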

# <span style="color:deeppink"> Mann-Whitney U test </span>
- for two independent data samples
- non-parametric counterpart of the Student t-test for independent samples
- needs at least 20 observations (n=20) in each sample


1. Combine the two samples and rank all values together.
2. Check whether the values from the two samples are randomly mixed in the rank ordering or cluster at opposite ends.
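The two steps above can be sketched by hand. This is only an illustration under assumptions: the names `sample_a`, `sample_b`, `u_a` and `u_b` are made up for the example, and the data come from NumPy's `default_rng`, so the numbers will not match the cell below. It uses the standard relation U = R - n(n+1)/2, where R is the rank sum of one sample in the pooled ranking.

```python
import numpy as np
from scipy.stats import mannwhitneyu, rankdata

# Hypothetical example data: two independent samples with slightly different means
rng = np.random.default_rng(1)
sample_a = 5 * rng.standard_normal(100) + 50
sample_b = 5 * rng.standard_normal(100) + 51

# Step 1: pool both samples and rank every value in the pooled data
combined = np.concatenate([sample_a, sample_b])
ranks = rankdata(combined)

# Step 2: turn the rank sum of the first sample into a U statistic
n_a, n_b = len(sample_a), len(sample_b)
rank_sum_a = ranks[:n_a].sum()
u_a = rank_sum_a - n_a * (n_a + 1) / 2
u_b = n_a * n_b - u_a  # the two U values always add up to n_a * n_b

# Well-mixed samples give U close to n_a * n_b / 2;
# samples clustered at opposite ends push U towards 0 or n_a * n_b
stat, p = mannwhitneyu(sample_a, sample_b, alternative='two-sided')
print('U (first sample)=%.1f, U (second sample)=%.1f, SciPy statistic=%.1f'
      % (u_a, u_b, stat))
```

Depending on the SciPy version, the reported statistic is either the U of the first sample or the smaller of the two U values, so it should equal one of `u_a` and `u_b`.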

In [21]:
# Mann-Whitney U test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import mannwhitneyu
# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# compare samples
stat, p = mannwhitneyu(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distribution (fail to reject H0)')
else:
	print('Different distribution (reject H0)')

Statistics=4025.000, p=0.009
Different distribution (reject H0)


# <span style="color:mediumorchid"> Wilcoxon signed-rank test </span>
- for two paired data samples
- non-parametric counterpart of the paired Student t-test
- needs at least 20 observations (n=20) in each sample
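To show what "paired" means for this test, here is a rough sketch that computes the signed-rank statistic by hand and compares it with SciPy. The names `before`, `after`, `w_plus` and `w_minus` are hypothetical, and the paired data are invented for the illustration (a second measurement built from the first plus a shift and noise).

```python
import numpy as np
from scipy.stats import rankdata, wilcoxon

# Hypothetical paired data: each element of `after` belongs to the same
# subject as the element of `before` at the same index
rng = np.random.default_rng(1)
before = 5 * rng.standard_normal(100) + 50
after = before + rng.standard_normal(100) + 1

# Rank the absolute pairwise differences
diff = before - after
abs_ranks = rankdata(np.abs(diff))

# Sum the ranks separately for positive and negative differences
w_plus = abs_ranks[diff > 0].sum()
w_minus = abs_ranks[diff < 0].sum()
w = min(w_plus, w_minus)  # statistic of the two-sided test

stat, p = wilcoxon(before, after)
print('W by hand=%.1f, SciPy statistic=%.1f, p=%.3g' % (w, stat, p))
```

If the pairing carried no systematic shift, `w_plus` and `w_minus` would be of similar size; a small minimum indicates that the differences lean mostly in one direction.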

In [23]:
# Wilcoxon signed-rank test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import wilcoxon
# seed the random number generator
seed(1)
# generate two samples (the test pairs them element by element)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# compare samples
stat, p = wilcoxon(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distribution (fail to reject H0)')
else:
	print('Different distribution (reject H0)')

Statistics=1886.000, p=0.028
Different distribution (reject H0)


# <span style="color:deepskyblue"> Kruskal-Wallis H test </span>
- for more than two independent data samples
- needs at least 5 observations (n=5) in each sample
- data samples can differ in size
- non-parametric counterpart of the one-way ANOVA
- generalisation of the Mann-Whitney U test
- does not identify where, or how many, differences occur
- To identify which sample pairs differ, a researcher can use sample contrasts (post hoc tests) that analyse specific pairs for significant differences. The Mann-Whitney U test is a useful method for performing these contrasts between individual samples.
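The post hoc idea in the last point could look roughly like this: run the Kruskal-Wallis H test first, then compare every pair of samples with the Mann-Whitney U test. The Bonferroni correction (dividing alpha by the number of comparisons) is an assumption added here to account for multiple testing; the text above does not prescribe a particular correction. The `samples` dictionary and its data are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.stats import kruskal, mannwhitneyu

# Hypothetical example data: three independent samples, one with a shifted mean
rng = np.random.default_rng(1)
samples = {
    'data1': 5 * rng.standard_normal(100) + 50,
    'data2': 5 * rng.standard_normal(100) + 50,
    'data3': 5 * rng.standard_normal(100) + 52,
}
alpha = 0.05

# Omnibus test: is there any difference among the samples?
h_stat, h_p = kruskal(*samples.values())
print('Kruskal-Wallis: H=%.3f, p=%.3f' % (h_stat, h_p))

# Post hoc contrasts: Mann-Whitney U test on every pair of samples
pairs = list(combinations(samples, 2))
adjusted_alpha = alpha / len(pairs)  # Bonferroni correction (an assumption)
results = {}
for name_a, name_b in pairs:
    u_stat, p = mannwhitneyu(samples[name_a], samples[name_b],
                             alternative='two-sided')
    results[(name_a, name_b)] = p
    verdict = 'different' if p < adjusted_alpha else 'no evidence of difference'
    print('%s vs %s: U=%.1f, p=%.3f -> %s' % (name_a, name_b, u_stat, p, verdict))
```

In practice the pairwise tests are usually run only when the omnibus test rejects H0.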

In [26]:
# Kruskal-Wallis H-test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import kruskal
# seed the random number generator
seed(1)
# generate three independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 50
data3 = 5 * randn(100) + 52
# compare samples
stat, p = kruskal(data1, data2, data3)
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distributions (fail to reject H0)')
else:
	print('Different distributions (reject H0)')

Statistics=6.051, p=0.049
Different distributions (reject H0)


# <span style="color:lightseagreen"> Friedman test </span>
- for more than two paired data samples
- non-parametric counterpart of the repeated measures ANOVA
- paired (repeated measures) counterpart of the Kruskal-Wallis H test
- needs at least 10 observations (n=10) in each sample
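As a sketch of how the pairing enters the test: the Friedman statistic ranks the values within each subject (row) and then compares the rank sums of the conditions (columns). The code below computes it by hand, assuming no tied values, and checks it against SciPy. The `measurements` array and the other names are hypothetical.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Hypothetical repeated measures: n subjects (rows), k conditions (columns)
rng = np.random.default_rng(1)
n, k = 100, 3
measurements = 5 * rng.standard_normal((n, k)) + np.array([50, 50, 52])

# Rank the k values within each subject, then sum the ranks per condition
within_ranks = np.apply_along_axis(rankdata, 1, measurements)
rank_sums = within_ranks.sum(axis=0)

# Friedman chi-square statistic (valid as written only when there are no ties)
chi2_manual = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)

stat, p = friedmanchisquare(measurements[:, 0], measurements[:, 1],
                            measurements[:, 2])
print('By hand=%.3f, SciPy=%.3f, p=%.3f' % (chi2_manual, stat, p))
```

Because ranking happens within each row, differences between subjects drop out and only the ordering of the conditions within a subject matters.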

In [27]:
# Friedman test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import friedmanchisquare
# seed the random number generator
seed(1)
# generate three samples (the test treats them as repeated measures, matched by index)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 50
data3 = 5 * randn(100) + 52
# compare samples
stat, p = friedmanchisquare(data1, data2, data3)
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distributions (fail to reject H0)')
else:
	print('Different distributions (reject H0)')

Statistics=9.360, p=0.009
Different distributions (reject H0)
