### t2.micro, AWS Marketplace -> Anaconda with Python 3

Although there are hundreds of statistical hypothesis tests that you could use, there is only a small subset that you may need to use in a machine learning project.

Note that when an assumption is violated, such as the expected distribution of the data or the minimum sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable.

Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.

In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance.
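
One such degrees-of-freedom correction is Welch's t-test, which SciPy exposes through the `equal_var` argument of `stats.ttest_ind`. The sketch below is illustrative only: the seed, sample sizes, and spreads are arbitrary assumptions, not values used elsewhere in this notebook.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)             # arbitrary seed, only for repeatability
a = rng.normal(loc=5, scale=1, size=50)    # small spread
b = rng.normal(loc=5, scale=10, size=50)   # same mean, much larger spread

# Student's t-test: assumes both samples share one variance
student = stats.ttest_ind(a, b, equal_var=True)
# Welch's t-test: adjusts the degrees of freedom for unequal variances
welch = stats.ttest_ind(a, b, equal_var=False)

print('Student: t = {:.4f}, p = {:.4f}'.format(student.statistic, student.pvalue))
print('Welch:   t = {:.4f}, p = {:.4f}'.format(welch.statistic, welch.pvalue))
```

Because both samples have the same size here, the t statistic is identical in the two calls; only the degrees of freedom, and therefore the p-value, change.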

Finally, there may be multiple tests for a given concern, e.g. normality. We cannot get crisp answers to questions with statistics; instead, we get probabilistic answers. As such, we can arrive at different answers to the same question by considering the question in different ways. Hence the need for multiple tests for some questions we may have about data.
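
As a rough illustration of asking the same question in different ways, the three normality tests covered below can be run on a single sample. The seed and the sample parameters are assumptions chosen for the sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(1)                     # arbitrary seed
sample = rng.normal(loc=5, scale=3, size=200)      # drawn from a Gaussian

w_stat, w_p = stats.shapiro(sample)                # Shapiro-Wilk
k2_stat, k2_p = stats.normaltest(sample)           # D'Agostino's K^2
ad = stats.anderson(sample, dist='norm')           # Anderson-Darling

print('Shapiro-Wilk:     W   = {:.4f}, p = {:.4f}'.format(w_stat, w_p))
print("D'Agostino K^2:   K^2 = {:.4f}, p = {:.4f}".format(k2_stat, k2_p))
print('Anderson-Darling: A^2 = {:.4f}'.format(ad.statistic))
```

Each test reports a different statistic, and the p-values generally differ, even though all three address the same null hypothesis.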

### 1) Shapiro-Wilk (Normality)

Tests whether a data sample has a Gaussian distribution.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).

<b>Interpretation:</b>

<b>H0:</b> the sample has a Gaussian distribution.<br>
<b>H1:</b> the sample does not have a Gaussian distribution.

In [1]:
from scipy import stats

stats.shapiro(
              stats.norm.rvs(loc=5, scale=3, size=100)
             )

(0.9899657964706421, 0.6618269681930542)

### 2) D’Agostino’s K^2 (Normality)

Tests whether a data sample has a Gaussian distribution.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).

<b>Interpretation:</b>

<b>H0:</b> the sample has a Gaussian distribution.<br>
<b>H1:</b> the sample does not have a Gaussian distribution.

In [2]:
import numpy as np
from scipy import stats

k2, p = stats.normaltest(
                         np.concatenate((
                                         np.random.normal(0, 1, size=1000),
                                         np.random.normal(2, 1, size=1000)
                                       ))
                        )

print('p = {:g}'.format(p))

alpha = 1e-3
if p < alpha:  # null hypothesis: the sample comes from a normal distribution
    print('The null hypothesis can be rejected')
else:
    print('The null hypothesis cannot be rejected')

p = 3.61427e-12
The null hypothesis can be rejected


### 3) Anderson-Darling (Normality)

Tests whether a data sample has a Gaussian distribution.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).

<b>Interpretation:</b>

<b>H0:</b> the sample has a Gaussian distribution.<br>
<b>H1:</b> the sample does not have a Gaussian distribution.

In [3]:
from scipy import stats

stats.anderson(
               stats.norm.rvs(loc=5, scale=3, size=100)
              )

AndersonResult(statistic=0.2765968022971208, critical_values=array([0.555, 0.632, 0.759, 0.885, 1.053]), significance_level=array([15. , 10. ,  5. ,  2.5,  1. ]))
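
The `AndersonResult` shown above has no p-value, so the statistic has to be compared with the critical value at each significance level. A minimal sketch of that comparison follows; the seed is an assumption, and H0 is rejected at a given level when the statistic exceeds the corresponding critical value.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(2)   # arbitrary seed
result = stats.anderson(rng.normal(loc=5, scale=3, size=100), dist='norm')

# One decision per significance level (given in percent)
decisions = []
for crit, sig in zip(result.critical_values, result.significance_level):
    reject = result.statistic > crit
    decisions.append(reject)
    verdict = 'reject H0' if reject else 'fail to reject H0'
    print('{:>4.1f}%: critical value {:.3f} -> {}'.format(sig, crit, verdict))
```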

### 4) Pearson’s Coefficient (Correlation)

Tests whether two samples have a linear relationship.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample are normally distributed.<br>
Observations in each sample have the same variance.

<b>Interpretation:</b>

<b>H0:</b> the two samples are independent.<br>
<b>H1:</b> there is a dependency between the samples.

In [4]:
import numpy as np
from scipy import stats

stats.pearsonr(
               np.array([0, 0, 0, 1, 1, 1, 1]),
               np.arange(7)
              )

(0.8660254037844386, 0.011724811003954654)

### 5) Spearman’s Rank (Correlation)

Tests whether two samples have a monotonic relationship.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample can be ranked.

<b>Interpretation:</b>

<b>H0:</b> the two samples are independent.<br>
<b>H1:</b> there is a dependency between the samples.

In [5]:
from scipy import stats

stats.spearmanr(
                [1,2,3,4,5],
                [5,6,7,8,7]
               )

SpearmanrResult(correlation=0.8207826816681233, pvalue=0.08858700531354381)

### 6) Kendall’s Rank (Correlation)

Tests whether two samples have a monotonic relationship.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample can be ranked.

<b>Interpretation:</b>

<b>H0:</b> the two samples are independent.<br>
<b>H1:</b> there is a dependency between the samples.

In [6]:
from scipy import stats

stats.kendalltau(
                 [12, 2, 1, 12, 2],
                 [1, 4, 7, 1, 0]
                )

KendalltauResult(correlation=-0.4714045207910316, pvalue=0.2827454599327748)
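
To make the difference between a linear and a monotonic relationship concrete, the three correlation measures can be compared on data that is strictly increasing but far from linear. The data below are made up for this sketch.

```python
import numpy as np
from scipy import stats

x = np.arange(1, 11)
y = np.exp(x)   # strictly increasing in x, but strongly non-linear

r, r_p = stats.pearsonr(x, y)        # linear association
rho, rho_p = stats.spearmanr(x, y)   # rank-based, monotonic association
tau, tau_p = stats.kendalltau(x, y)  # rank-based, monotonic association

print('Pearson r    = {:.4f}'.format(r))
print('Spearman rho = {:.4f}'.format(rho))
print('Kendall tau  = {:.4f}'.format(tau))
```

Both rank-based coefficients equal 1 because the ordering of `x` and `y` agrees perfectly, while Pearson's coefficient is noticeably lower because the relationship is not a straight line.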

### 7) Chi-Squared (Correlation)

Tests whether two categorical variables are related or independent.

<b>Assumptions:</b>

Observations used in the calculation of the contingency table are independent.<br>
A common rule of thumb is an expected count of at least 5 in each cell of the contingency table.

<b>Interpretation:</b>

<b>H0:</b> the two samples are independent.<br>
<b>H1:</b> there is a dependency between the samples.

In [7]:
import numpy as np
from scipy.stats import chi2_contingency

chi2_contingency(
                 np.array([[10, 10, 20], [20, 20, 20]])
                )

(2.7777777777777777, 0.24935220877729622, 2, array([[12., 12., 16.],
        [18., 18., 24.]]))
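
The returned tuple is easier to read when unpacked into named values. The sketch below reuses the table from the cell above and also prints the smallest expected count, so the cell-count assumption can be checked; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[10, 10, 20], [20, 20, 20]])

# statistic, p-value, degrees of freedom, expected counts under independence
chi2, p, dof, expected = chi2_contingency(table)

print('chi2 = {:.4f}, p = {:.4f}, dof = {}'.format(chi2, p, dof))
print('smallest expected count = {:.1f}'.format(expected.min()))

alpha = 0.05   # significance level, an assumption for this sketch
if p < alpha:
    print('Reject H0: the variables appear dependent')
else:
    print('Fail to reject H0: no evidence of dependence')
```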

### 8) Student’s t (Parametric)

Tests whether the means of two independent samples are significantly different.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample are normally distributed.<br>
Observations in each sample have the same variance.

<b>Interpretation:</b>

<b>H0:</b> the means of the samples are equal.<br>
<b>H1:</b> the means of the samples are unequal.

In [8]:
from scipy import stats

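# Both samples are drawn from the same N(5, 10) distribution,
# so H0 is true by construction and a small p-value is a chance rejection.
stats.ttest_ind(
                stats.norm.rvs(loc=5,scale=10,size=500),
                stats.norm.rvs(loc=5,scale=10,size=500)
               )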

Ttest_indResult(statistic=-2.1782521980565073, pvalue=0.02962074232246727)

### 9) Paired Student’s t (Parametric)

Tests whether the means of two paired samples are significantly different.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample are normally distributed.<br>
Observations in each sample have the same variance.<br>
Observations across each sample are paired.

<b>Interpretation:</b>

<b>H0:</b> the means of the samples are equal.<br>
<b>H1:</b> the means of the samples are unequal.

In [9]:
from scipy import stats

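# Note: the two samples are drawn independently here, so they are
# paired only by position, not by a shared subject or measurement.
stats.ttest_rel(
                stats.norm.rvs(loc=5,scale=10,size=500),
                (stats.norm.rvs(loc=5,scale=10,size=500) + stats.norm.rvs(scale=0.2,size=500))
               )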

Ttest_relResult(statistic=0.13447592064180153, pvalue=0.8930804720165942)
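
For contrast, here is a sketch with genuinely paired data, where the second sample is the first plus a small per-observation change. The seed, shift, and noise level are assumptions. It also shows that the paired test is equivalent to a one-sample t-test on the differences.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(3)   # arbitrary seed
before = rng.normal(loc=5, scale=10, size=500)
after = before + rng.normal(loc=0.5, scale=1.0, size=500)   # same units, small shift

paired = stats.ttest_rel(before, after)
one_sample = stats.ttest_1samp(before - after, 0.0)   # same test, stated on differences
unpaired = stats.ttest_ind(before, after)             # ignores the pairing

print('paired:     t = {:.4f}, p = {:.3g}'.format(paired.statistic, paired.pvalue))
print('one-sample: t = {:.4f}, p = {:.3g}'.format(one_sample.statistic, one_sample.pvalue))
print('unpaired:   t = {:.4f}, p = {:.3g}'.format(unpaired.statistic, unpaired.pvalue))
```

The paired test detects the small shift because the large between-observation variance cancels in the differences, whereas the unpaired test does not.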

### 10) Analysis of Variance or ANOVA (Parametric)

Tests whether the means of two or more independent samples are significantly different.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample are normally distributed.<br>
Observations in each sample have the same variance.

<b>Interpretation:</b>

<b>H0:</b> the means of the samples are equal.<br>
<b>H1:</b> one or more of the means of the samples are unequal.

In [10]:
import scipy.stats as stats

stats.f_oneway(
               [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735, 0.0659, 0.0923, 0.0836],
               [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835, 0.0725]                ,
               [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]                         ,
               [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764, 0.0689]                ,
               [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]
              )

F_onewayResult(statistic=7.121019471642447, pvalue=0.0002812242314534544)
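
With exactly two samples, one-way ANOVA reduces to the Student's t-test: the F statistic equals the square of the t statistic and the p-values coincide. The sketch below checks this on the first two groups from the cell above; the variable names are illustrative.

```python
from scipy import stats

group_a = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735, 0.0659, 0.0923, 0.0836]
group_b = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835, 0.0725]

anova = stats.f_oneway(group_a, group_b)
ttest = stats.ttest_ind(group_a, group_b)   # equal variances assumed by default

print('F   = {:.6f}, p = {:.6f}'.format(anova.statistic, anova.pvalue))
print('t^2 = {:.6f}, p = {:.6f}'.format(ttest.statistic ** 2, ttest.pvalue))
```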

### 11) Mann-Whitney U (Nonparametric)

Tests whether the distributions of two independent samples are equal or not.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample can be ranked.

<b>Interpretation:</b>

<b>H0:</b> the distributions of both samples are equal.<br>
<b>H1:</b> the distributions of both samples are not equal.

In [11]:
from scipy import stats

stats.mannwhitneyu(
                   stats.norm.rvs(loc=5,scale=10,size=500),
                   (stats.norm.rvs(loc=5,scale=10,size=500) + stats.norm.rvs(scale=0.2,size=500))
                  )

MannwhitneyuResult(statistic=120860.0, pvalue=0.1823446553487712)
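
The default for the `alternative` argument of `stats.mannwhitneyu` appears to have differed between SciPy releases, so the p-value from a bare call can depend on the installed version. Passing `alternative` explicitly avoids that ambiguity. The seed and sample parameters below are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(4)   # arbitrary seed
a = rng.normal(loc=5, scale=10, size=500)
b = rng.normal(loc=5, scale=10, size=500)

# State the alternative hypothesis explicitly in each call
two_sided = stats.mannwhitneyu(a, b, alternative='two-sided')
less = stats.mannwhitneyu(a, b, alternative='less')
greater = stats.mannwhitneyu(a, b, alternative='greater')

print('two-sided: U = {:.1f}, p = {:.4f}'.format(two_sided.statistic, two_sided.pvalue))
print('less:      p = {:.4f}'.format(less.pvalue))
print('greater:   p = {:.4f}'.format(greater.pvalue))
```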

### 12) Wilcoxon Signed-Rank (Nonparametric)

Tests whether the distributions of two paired samples are equal or not.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample can be ranked.<br>
Observations across each sample are paired.

<b>Interpretation:</b>

<b>H0:</b> the distributions of both samples are equal.<br>
<b>H1:</b> the distributions of both samples are not equal.

In [12]:
from scipy import stats

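# Note: the two samples are drawn independently here, so they are
# paired only by position, not by a shared subject or measurement.
stats.wilcoxon(
               stats.norm.rvs(loc=5,scale=10,size=500),
               (stats.norm.rvs(loc=5,scale=10,size=500) + stats.norm.rvs(scale=0.2,size=500))
              )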

WilcoxonResult(statistic=60064.0, pvalue=0.4281808375651276)
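
The signed-rank test operates on the paired differences, so passing two samples gives the same result as passing their difference directly. A small sketch with genuinely paired data follows; the seed and noise level are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(5)   # arbitrary seed
first = rng.normal(loc=5, scale=10, size=500)
second = first + rng.normal(loc=0, scale=0.2, size=500)   # paired with `first`

two_arg = stats.wilcoxon(first, second)    # differences computed internally
one_arg = stats.wilcoxon(first - second)   # differences supplied directly

print('two samples: W = {:.1f}, p = {:.4f}'.format(two_arg.statistic, two_arg.pvalue))
print('differences: W = {:.1f}, p = {:.4f}'.format(one_arg.statistic, one_arg.pvalue))
```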

### 13) Kruskal-Wallis H (Nonparametric)

Tests whether the distributions of two or more independent samples are equal or not.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample can be ranked.

<b>Interpretation:</b>

<b>H0:</b> the distributions of all samples are equal.<br>
<b>H1:</b> the distributions of one or more samples are not equal.

In [13]:
from scipy import stats

stats.kruskal(
              [1, 3, 5, 7, 9],
              [2, 4, 6, 8, 10]
             )

KruskalResult(statistic=0.2727272727272734, pvalue=0.6015081344405895)

### 14) Friedman (Nonparametric)

Tests whether the distributions of two or more paired samples are equal or not.

<b>Assumptions:</b>

Observations in each sample are independent and identically distributed (IID).<br>
Observations in each sample can be ranked.<br>
Observations across each sample are paired.

<b>Interpretation:</b>

<b>H0:</b> the distributions of all samples are equal.<br>
<b>H1:</b> the distributions of one or more samples are not equal.

In [14]:
from scipy import stats

stats.friedmanchisquare(
                        [1, 3, 5, 7, 9],
                        [2, 4, 6, 8, 10],
                        [3, 5, 7, 9, 11]
                       )

FriedmanchisquareResult(statistic=10.0, pvalue=0.006737946999085468)
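
To see what the pairing assumption contributes, the same three samples can be passed to both the Friedman test and the Kruskal-Wallis H test, which treats them as independent. This is an illustrative comparison, not a recommendation to run both on real data.

```python
from scipy import stats

s1 = [1, 3, 5, 7, 9]
s2 = [2, 4, 6, 8, 10]
s3 = [3, 5, 7, 9, 11]

friedman = stats.friedmanchisquare(s1, s2, s3)   # uses the row-wise pairing
kruskal = stats.kruskal(s1, s2, s3)              # ignores the pairing

print('Friedman: statistic = {:.4f}, p = {:.4f}'.format(friedman.statistic, friedman.pvalue))
print('Kruskal:  statistic = {:.4f}, p = {:.4f}'.format(kruskal.statistic, kruskal.pvalue))
```

Within every row the three values are ordered the same way, which the Friedman test picks up. The pooled values overlap heavily, so the Kruskal-Wallis test yields a much larger p-value.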