# Testing the Null Hypothesis

$$p < 0.05 \implies \text{reject } H_0$$

- __p <= alpha: reject H0; the data give evidence that the distributions differ.__
- __p > alpha: fail to reject H0; the data give no evidence that the distributions differ.__
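
The decision rule above can be wrapped in a small helper. This is a minimal sketch: the name `interpret` and the default `alpha=0.05` are assumptions for illustration, and the returned strings mirror the messages printed in the cells below.

```python
def interpret(p, alpha=0.05):
    """Return the decision for a p-value at significance level alpha."""
    # hypothetical helper: p <= alpha rejects H0, otherwise we fail to reject
    if p <= alpha:
        return 'Different distribution (reject H0)'
    return 'Same distribution (fail to reject H0)'

print(interpret(0.009))
print(interpret(0.200))
```

Writing the test as `p <= alpha` is equivalent to the `if p > alpha: ... else: ...` form used in the cells below, with the two branches swapped.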



In [6]:
# generate gaussian data samples
from numpy.random import seed
from numpy.random import randn
from numpy import mean
from numpy import std
# seed the random number generator
seed(1)
# generate two sets of univariate observations
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
data3 = 5 * randn(100) + 52
# summarize
print('data1: mean=%.3f stdv=%.3f' % (mean(data1), std(data1)))
print('data2: mean=%.3f stdv=%.3f' % (mean(data2), std(data2)))
print('data3: mean=%.3f stdv=%.3f' % (mean(data3), std(data3)))

data1: mean=50.303 stdv=4.426
data2: mean=51.764 stdv=4.660
data3: mean=52.049 stdv=5.025


# Mann-Whitney U test

The Mann-Whitney U test is a nonparametric statistical significance test for determining whether two independent samples were drawn from populations with the same distribution.

>The two samples are combined and rank ordered together. The strategy is to determine if the values from the two samples are randomly mixed in the rank ordering or if they are clustered at opposite ends when combined. A random rank order would mean that the two samples are not different, while a cluster of one sample values would indicate a difference between them.

- __Fail to Reject H0:__ Sample distributions are equal.
- __Reject H0:__ Sample distributions are not equal.
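
The rank-ordering idea described above can be sketched by hand: pool the two samples, rank them together, and turn the rank sum of one sample into a U statistic. The sample names `a` and `b` and the generator `rng` are assumptions for illustration. Which of the two U values SciPy reports appears to depend on the SciPy version, so the sketch prints both.

```python
import numpy as np
from scipy.stats import rankdata, mannwhitneyu

rng = np.random.default_rng(1)
a = 5 * rng.standard_normal(100) + 50
b = 5 * rng.standard_normal(100) + 51

# rank the pooled observations together (ties would receive average ranks)
ranks = rankdata(np.concatenate([a, b]))

# rank sum of the first sample, converted to its U statistic
r1 = ranks[:len(a)].sum()
u1 = r1 - len(a) * (len(a) + 1) / 2
u2 = len(a) * len(b) - u1  # the two U values always sum to n1 * n2

stat, p = mannwhitneyu(a, b, alternative='two-sided')
print('U1=%.1f U2=%.1f scipy=%.1f p=%.3f' % (u1, u2, stat, p))
```

If the samples were randomly mixed in the rank ordering, `u1` and `u2` would both be close to `len(a) * len(b) / 2`; a large gap between them reflects clustering of one sample at one end.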


In [14]:
# Mann-Whitney U test
from scipy.stats import mannwhitneyu

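# compare samples
# note: recent SciPy releases default to a two-sided test, so the p-value
# may differ from the output shown below, which came from an older release
stat, p = mannwhitneyu(data1, data2)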
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distribution (fail to reject H0)')
else:
	print('Different distribution (reject H0)')

Statistics=4025.000, p=0.009
Different distribution (reject H0)


# Wilcoxon signed-rank test

>The Wilcoxon signed ranks test is a nonparametric statistical procedure for comparing two samples that are paired, or related. The parametric equivalent to the Wilcoxon signed ranks test goes by names such as the Student’s t-test, t-test for matched pairs, t-test for paired samples, or t-test for dependent samples.

- __Fail to Reject H0:__ Sample distributions are equal.
- __Reject H0:__ Sample distributions are not equal.
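
A minimal sketch of the test on data that are actually paired, with the signed-rank statistic recomputed by hand and the paired t-test (the parametric equivalent mentioned above) run alongside it. The `before`/`after` arrays and the size of the shift are assumptions made up for illustration.

```python
import numpy as np
from scipy.stats import rankdata, wilcoxon, ttest_rel

rng = np.random.default_rng(1)
# hypothetical paired measurements: each subject is measured twice
before = 5 * rng.standard_normal(100) + 50
after = before + rng.standard_normal(100) + 0.5

# signed-rank statistic by hand: rank the absolute differences,
# then sum the ranks of the positive and the negative differences
d = before - after
ranks = rankdata(np.abs(d))
w_plus = ranks[d > 0].sum()
w_minus = ranks[d < 0].sum()

w_stat, w_p = wilcoxon(before, after)    # nonparametric, paired
t_stat, t_p = ttest_rel(before, after)   # parametric equivalent
print('manual W=%.1f' % min(w_plus, w_minus))
print('wilcoxon: W=%.1f, p=%.3f' % (w_stat, w_p))
print('paired t: t=%.3f, p=%.3f' % (t_stat, t_p))
```

For the default two-sided test, `wilcoxon` reports the smaller of the two rank sums, which is why the manual value is taken as `min(w_plus, w_minus)`.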


In [15]:
# Wilcoxon signed-rank test
from scipy.stats import wilcoxon

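# compare samples
# note: data1 and data2 were generated independently; they are paired
# by index here only to demonstrate the call
stat, p = wilcoxon(data1, data2)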
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distribution (fail to reject H0)')
else:
	print('Different distribution (reject H0)')

Statistics=1886.000, p=0.028
Different distribution (reject H0)


# Kruskal-Wallis H-test

The Kruskal-Wallis H-test is a nonparametric test for comparing more than two independent samples; its parametric equivalent is the one-way analysis of variance (ANOVA).

>When the Kruskal-Wallis H-test leads to significant results, then at least one of the samples is different from the other samples. However, the test does not identify where the difference(s) occur. Moreover, it does not identify how many differences occur. To identify the particular differences between sample pairs, a researcher might use sample contrasts, or post hoc tests, to analyze the specific sample pairs for significant difference(s). The Mann-Whitney U-test is a useful method for performing sample contrasts between individual sample sets.

- __Fail to Reject H0:__ All sample distributions are equal.
- __Reject H0:__ One or more sample distributions are not equal.
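
The post hoc step described above can be sketched with pairwise Mann-Whitney U tests. The Bonferroni correction used here is an assumption (the quoted text does not name a correction method), and the sample names and the generator `rng` are made up for illustration. In practice the pairwise tests would normally be run only after a significant H-test; the sketch runs them unconditionally to keep it short.

```python
from itertools import combinations

import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(1)
samples = {
    'data1': 5 * rng.standard_normal(100) + 50,
    'data2': 5 * rng.standard_normal(100) + 51,
    'data3': 5 * rng.standard_normal(100) + 52,
}

# omnibus test across all samples
h_stat, h_p = kruskal(*samples.values())
print('Kruskal-Wallis: H=%.3f, p=%.3f' % (h_stat, h_p))

# pairwise contrasts, with alpha divided by the number of comparisons
pairs = list(combinations(samples, 2))
alpha = 0.05
adjusted_alpha = alpha / len(pairs)  # Bonferroni correction (assumed choice)
results = {}
for name_a, name_b in pairs:
    stat, p = mannwhitneyu(samples[name_a], samples[name_b],
                           alternative='two-sided')
    results[(name_a, name_b)] = p
    verdict = 'reject H0' if p <= adjusted_alpha else 'fail to reject H0'
    print('%s vs %s: U=%.1f, p=%.3f (%s)' % (name_a, name_b, stat, p, verdict))
```

Dividing `alpha` by the number of pairs keeps the overall chance of a false rejection across the three contrasts at or below the original 0.05.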

In [16]:
# Kruskal-Wallis H-test
from scipy.stats import kruskal

# compare samples
stat, p = kruskal(data1, data2, data3)
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distributions (fail to reject H0)')
else:
	print('Different distributions (reject H0)')

Statistics=7.576, p=0.023
Different distributions (reject H0)


# Friedman test

>The Friedman test is a nonparametric statistical procedure for comparing more than two samples that are related. The parametric equivalent to this test is the repeated measures analysis of variance (ANOVA). When the Friedman test leads to significant results, at least one of the samples is different from the other samples.

- __Fail to Reject H0:__ All paired sample distributions are equal.
- __Reject H0:__ One or more paired sample distributions are not equal.
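
A minimal sketch of how the Friedman statistic is built for related samples: the values are ranked within each subject (row), and the test compares the rank sums of the conditions (columns). The repeated-measures data below, including the `subject` baseline and the shifts, are assumptions made up for illustration. The hand computation omits the tie correction, which is harmless here because continuous data contain no ties.

```python
import numpy as np
from scipy.stats import rankdata, friedmanchisquare

rng = np.random.default_rng(1)
n, k = 100, 3  # n subjects, k related measurements per subject

# hypothetical repeated measures: a per-subject baseline plus a shift per condition
subject = 5 * rng.standard_normal(n) + 50
data = np.column_stack([subject + shift + rng.standard_normal(n)
                        for shift in (0.0, 0.5, 1.0)])

# rank the k measurements within each subject, then sum ranks per condition
ranks = np.apply_along_axis(rankdata, 1, data)
rank_sums = ranks.sum(axis=0)

# Friedman chi-square statistic (no tie correction)
chi2 = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)

stat, p = friedmanchisquare(*data.T)
print('manual=%.3f scipy=%.3f p=%.3f' % (chi2, stat, p))
```

Because the ranking happens within each row, the samples passed to `friedmanchisquare` must have equal length and the i-th values of every sample must belong to the same subject.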

In [12]:
# Friedman test
from scipy.stats import friedmanchisquare

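# compare samples
# note: data1, data2 and data3 were generated independently; they are treated
# as repeated measures by index here only to demonstrate the call
stat, p = friedmanchisquare(data1, data2, data3)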
print('Statistics=%.3f, p=%.3f' % (stat, p))
# interpret
alpha = 0.05
if p > alpha:
	print('Same distributions (fail to reject H0)')
else:
	print('Different distributions (reject H0)')

Statistics=9.780, p=0.008
Different distributions (reject H0)
