# Normality Tests

These tests check whether your data has a Gaussian distribution.
 
## 1. Shapiro-Wilk Test

Tests whether a data sample has a Gaussian distribution.

**Assumptions**
- Observations in each sample are independent and identically distributed.

**Interpretation**
- H0: the sample has a Gaussian distribution.
- H1: the sample does not have a Gaussian distribution.

**More information**
- [scipy.stats.shapiro](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.shapiro.html)
- [Shapiro-Wilk test on Wikipedia](https://en.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test)

In [1]:
# Example of the Shapiro-Wilk Normality Test
from scipy.stats import shapiro
data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = shapiro(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')

stat=0.895, p=0.193
Probably Gaussian
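
For contrast, here is a minimal sketch of the same test on data that is clearly not Gaussian. The sample is hypothetical (it is not part of the example above): 200 seeded draws from an exponential distribution generated with NumPy. Such a strongly skewed sample should lead to rejecting H0.

```python
# Sketch: Shapiro-Wilk on a deliberately non-Gaussian sample (assumed data, not from the example above)
import numpy as np
from scipy.stats import shapiro

# Hypothetical sample: 200 draws from an exponential distribution (right-skewed)
rng = np.random.default_rng(0)
skewed = rng.exponential(scale=1.0, size=200)

stat, p = shapiro(skewed)
print('stat=%.3f, p=%.3g' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
```

The decision rule is unchanged; only the data differs, which makes it easier to see what a rejection looks like.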


## 2. Anderson-Darling Test

Tests whether a data sample has a Gaussian distribution.

**Assumptions**
- Observations in each sample are independent and identically distributed.

**Interpretation**
- H0: the sample has a Gaussian distribution.
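- H1: the sample does not have a Gaussian distribution.
- As used here, `scipy.stats.anderson` returns a test statistic and a set of critical values rather than a single p-value: H0 is rejected at a given significance level when the statistic exceeds the corresponding critical value.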

**More information**
- [scipy.stats.anderson](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.anderson.html)
- [Anderson-Darling test on Wikipedia](https://en.wikipedia.org/wiki/Anderson%E2%80%93Darling_test)

In [3]:
# Example of the Anderson-Darling Normality Test
from scipy.stats import anderson
data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
result = anderson(data)
print('stat=%.3f' % (result.statistic))

for i in range(len(result.critical_values)):
    sl, cv = result.significance_level[i], result.critical_values[i]
    if result.statistic < cv:
        print('Probably Gaussian at the %.1f%% level' % (sl))
    else:
        print('Probably not Gaussian at the %.1f%% level' % (sl))

stat=0.424
Probably Gaussian at the 15.0% level
Probably Gaussian at the 10.0% level
Probably Gaussian at the 5.0% level
Probably Gaussian at the 2.5% level
Probably Gaussian at the 1.0% level
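
The comparison loop above can also be wrapped in a small helper that returns one decision per significance level. This is only a sketch: `anderson_normality` is a hypothetical name introduced for illustration, not part of SciPy, and it relies on the same `statistic`, `critical_values`, and `significance_level` attributes used in the example above.

```python
# Sketch: collect the Anderson-Darling decisions in a dictionary
from scipy.stats import anderson

def anderson_normality(sample):
    """Hypothetical helper: return (statistic, {level in %: True if H0 is not rejected})."""
    result = anderson(sample, dist='norm')
    decisions = {
        # H0 is not rejected at this level when the statistic is below the critical value
        float(sl): bool(result.statistic < cv)
        for sl, cv in zip(result.significance_level, result.critical_values)
    }
    return float(result.statistic), decisions

data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, decisions = anderson_normality(data)
print('stat=%.3f' % stat)
for sl, not_rejected in decisions.items():
    print('%.1f%% level: %s' % (sl, 'fail to reject H0' if not_rejected else 'reject H0'))
```

Returning the decisions as data instead of printing them directly makes it easier to reuse the result, for example to check a single level such as `decisions[5.0]`.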


## 3. D’Agostino’s K^2 Test

Tests whether a sample differs from a normal distribution.

**Assumptions**
- Observations are independent and identically distributed in each sample.

**Interpretation**
- H0: a sample comes from a normal distribution.
- H1: a sample does not come from a normal distribution.

**More information**
- [scipy.stats.normaltest](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.normaltest.html)
- [D’Agostino’s K-squared test on Wikipedia](https://en.wikipedia.org/wiki/D%27Agostino%27s_K-squared_test)

In [1]:
# Example of the D'Agostino's K^2 Normality Test
from scipy.stats import normaltest
data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869, 
       0.973, 3.817, 0.131, -0.846, -0.075, -2.436, 0.560, -1.378, -1.151, -2.000]
stat, p = normaltest(data)
print('stat=%.3f, p=%.3f' % (stat, p))

if p > 0.05: # null hypothesis: x comes from a normal distribution
    print('Probably Gaussian (the null hypothesis cannot be rejected)')
else:
    print('Probably not Gaussian (the null hypothesis can be rejected)')

stat=5.984, p=0.050
Probably Gaussian (the null hypothesis cannot be rejected)
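
To show where the K^2 statistic comes from, the sketch below rebuilds it from its two components. According to the SciPy documentation, `normaltest` returns `s^2 + k^2`, where `s` is the z-score from `skewtest` and `k` is the z-score from `kurtosistest`; the p-value is then taken from a chi-squared distribution with 2 degrees of freedom. The variable names (`z_skew`, `z_kurt`, `k2`, `p_manual`) are introduced here for illustration only.

```python
# Sketch: rebuild the K^2 statistic from the skewness and kurtosis z-scores
from scipy.stats import chi2, kurtosistest, normaltest, skewtest

data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869,
        0.973, 3.817, 0.131, -0.846, -0.075, -2.436, 0.560, -1.378, -1.151, -2.000]

z_skew, _ = skewtest(data)       # z-score for sample skewness
z_kurt, _ = kurtosistest(data)   # z-score for sample kurtosis
k2 = z_skew**2 + z_kurt**2       # combined statistic
p_manual = chi2.sf(k2, df=2)     # upper-tail probability of chi-squared with 2 degrees of freedom

stat, p = normaltest(data)
print('manual:     stat=%.3f, p=%.3f' % (k2, p_manual))
print('normaltest: stat=%.3f, p=%.3f' % (stat, p))
```

Both lines should report the same values, which illustrates that the test reacts to departures in skewness, in kurtosis, or in both.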
