#### New to Plotly?
Plotly's Python library is free and open source! [Get started](https://plot.ly/python/getting-started/) by downloading the client and [reading the primer](https://plot.ly/python/getting-started/).
<br>You can set up Plotly to work in [online](https://plot.ly/python/getting-started/#initialization-for-online-plotting) or [offline](https://plot.ly/python/getting-started/#initialization-for-offline-plotting) mode, or in [Jupyter notebooks](https://plot.ly/python/getting-started/#start-plotting-online).
<br>We also have a quick-reference [cheatsheet](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) (new!) to help you get started!

#### Imports
The tutorial below imports [NumPy](http://www.numpy.org/), [Pandas](https://plot.ly/pandas/intro-to-pandas-tutorial/), and [SciPy](https://www.scipy.org/).

In [1]:
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.tools import FigureFactory as FF

import numpy as np
import pandas as pd
import scipy.stats  # import the stats submodule explicitly so scipy.stats.* calls below resolve

#### Import Data

To look at various normality tests, we will import some data of average wind speed sampled every 10 minutes:

In [2]:
data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/wind_speed_laurel_nebraska.csv')
df = data[0:10]

table = FF.create_table(df)
py.iplot(table, filename='wind-data-sample')

In statistical analysis, it is always important to be as precise as possible in our language. In general, for a normality test we are testing the `null hypothesis` that our 1D data is sampled from a population that has a `Normal Distribution`. We assume a significance level of $0.05$ (equivalently, a $95\%$ confidence level) for our tests unless otherwise stated.

For more information on the choice of 0.05 for a significance level, check out [this page](http://www.investopedia.com/exam-guide/cfa-level-1/quantitative-methods/hypothesis-testing.asp).
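The decision rule behind every test in this tutorial can be written in a few lines. The sketch below is only an illustration: the helper name `decide` and the example p-values are hypothetical and do not come from the wind speed data.

```python
def decide(p_value, alpha=0.05):
    """Return the conclusion of a hypothesis test at significance level alpha.

    `decide` is a hypothetical helper used only for illustration.
    """
    # A p-value below alpha means the observed data would be unlikely
    # if the null hypothesis (here: normality) were true.
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Made-up p-values, not results from the tutorial's dataset
print(decide(0.003))
print(decide(0.40))
```

Note that the comparison is always between the p-value and the significance level, never between the p-value and the test statistic.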

#### Shapiro-Wilk

The Shapiro-Wilk normality test is reputedly better suited to smaller datasets.

In [3]:
x = data['10 Min Sampled Avg']

shapiro_results = scipy.stats.shapiro(x)

matrix_sw = [
    ['', 'DF', 'Test Statistic', 'p-value'],
    ['Sample Data', len(x) - 1, shapiro_results[0], shapiro_results[1]]
]

shapiro_table = FF.create_table(matrix_sw, index=True)
py.iplot(shapiro_table, filename='shapiro-table')

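To interpret the result, we compare the `p-value` with the significance level, not with the `Test Statistic`. If the `p-value` in the table is below 0.05, we reject the null hypothesis that the data is normally distributed; otherwise we fail to reject it.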
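As a hedged illustration of reading a Shapiro-Wilk result, the sketch below runs `scipy.stats.shapiro` on a synthetic, deliberately skewed sample. The data is generated here purely for demonstration (an assumption, not the wind speed dataset), and the 0.05 threshold matches the significance level stated earlier.

```python
import numpy as np
from scipy import stats

# Synthetic, clearly non-normal data (exponential); illustration only
rng = np.random.default_rng(0)
skewed_sample = rng.exponential(scale=1.0, size=200)

# shapiro returns the W statistic and the p-value
w_stat, p_value = stats.shapiro(skewed_sample)

alpha = 0.05
reject_normality = p_value < alpha  # compare p-value to alpha, not to W

print("W = {:.4f}, p-value = {:.3g}".format(w_stat, p_value))
print("Reject normality at alpha = 0.05:", reject_normality)
```

The W statistic always lies between 0 and 1, with values near 1 indicating data that looks more normal, so comparing it directly to the p-value carries no meaning.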

#### Kolmogorov-Smirnov

The Kolmogorov-Smirnov test can be applied more broadly than Shapiro-Wilk, since it compares any two distributions against each other, not necessarily one distribution to a normal one. The test can be one-sided or two-sided, but the latter applies only if both distributions are continuous.

In [15]:
ks_results = scipy.stats.kstest(x, cdf='norm')

matrix_ks = [
    ['', 'DF', 'Test Statistic', 'p-value'],
    ['Sample Data', len(x) - 1, ks_results[0], ks_results[1]]
]

ks_table = FF.create_table(matrix_ks, index=True)
py.iplot(ks_table, filename='ks-table')

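Since our p-value is read as 0.0 (meaning it is "practically" 0 given the decimal accuracy of the test), it is below the 0.05 significance level and we have strong evidence to reject the null hypothesis. Note that `cdf='norm'` refers to the standard normal distribution (mean 0, standard deviation 1), so this result also reflects the location and scale of the data, not only its shape.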
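Because `scipy.stats.kstest(x, 'norm')` compares the sample against a standard normal distribution, data that is normal but not centered at 0 with unit variance will still be rejected. The sketch below shows this on synthetic data (an assumption, not the wind speed dataset) and then standardizes the sample before testing. Estimating the mean and standard deviation from the same sample makes the resulting p-value only approximate, so treat this as a rough check rather than a definitive procedure.

```python
import numpy as np
from scipy import stats

# Synthetic normal data with mean 10 and standard deviation 2; illustration only
rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=2.0, size=500)

# Raw comparison against N(0, 1): fails because of location and scale alone
raw_stat, raw_p = stats.kstest(sample, 'norm')

# Standardize with the sample mean and standard deviation, then test again
standardized = (sample - sample.mean()) / sample.std(ddof=1)
std_stat, std_p = stats.kstest(standardized, 'norm')

print("raw:          statistic = {:.4f}, p-value = {:.3g}".format(raw_stat, raw_p))
print("standardized: statistic = {:.4f}, p-value = {:.3g}".format(std_stat, std_p))
```

The KS statistic is the largest vertical distance between the empirical and reference cumulative distribution functions, which is why it drops sharply once the location and scale mismatch is removed.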

#### Anderson-Darling

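The Anderson-Darling test is a modification of the Kolmogorov-Smirnov test and is used in a similar way to test the null hypothesis that the data is sampled from a population that follows a particular distribution. Instead of a p-value, `scipy.stats.anderson` returns a test statistic together with critical values at several significance levels.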

In [4]:
anderson_results = scipy.stats.anderson(x)
print(anderson_results)

AndersonResult(statistic=2.653698947239036, critical_values=array([ 0.566,  0.645,  0.773,  0.902,  1.073]), significance_level=array([ 15. ,  10. ,   5. ,   2.5,   1. ]))


In [5]:
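# anderson() returns critical values, not a p-value;
# index 2 of the critical values corresponds to the 5% significance level
matrix_ad = [
    ['', 'DF', 'Test Statistic', 'Critical Value (5%)'],
    ['Sample Data', len(x) - 1, anderson_results[0], anderson_results[1][2]]
]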

anderson_table = FF.create_table(matrix_ad, index=True)
py.iplot(anderson_table, filename='anderson-table')

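The test statistic (about 2.65) exceeds the critical value at every listed significance level, including 0.773 at the 5% level, so we reject the null hypothesis that the data is normally distributed.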
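Since the Anderson-Darling result carries critical values rather than a p-value, the decision comes from comparing the statistic with the critical value at each significance level. The sketch below hard-codes the numbers from the `AndersonResult` printed above, so it runs without the dataset; the variable names are chosen here for illustration.

```python
# Values copied from the AndersonResult output shown above
statistic = 2.653698947239036
critical_values = [0.566, 0.645, 0.773, 0.902, 1.073]
significance_levels = [15.0, 10.0, 5.0, 2.5, 1.0]  # in percent

# Reject normality at a given level when the statistic exceeds the critical value
decisions = {
    level: statistic > critical
    for level, critical in zip(significance_levels, critical_values)
}

for level, rejected in decisions.items():
    verdict = "reject" if rejected else "fail to reject"
    print("{:>4}% level: {} the null hypothesis".format(level, verdict))
```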

#### D’Agostino and Pearson

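The D'Agostino and Pearson test combines two measures of shape into a single omnibus statistic: the `skewness`, which measures the asymmetry of the distribution, and the `kurtosis`, which measures how heavy its tails are relative to a normal distribution.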

In [6]:
dagostino_results = scipy.stats.mstats.normaltest(x)

matrix_dp = [
    ['', 'DF', 'Test Statistic', 'p-value'],
    ['Sample Data', len(x) - 1, dagostino_results[0], dagostino_results[1]]
]

dagostino_table = FF.create_table(matrix_dp, index=True)
py.iplot(dagostino_table, filename='dagostino-table')

Our p-value is very close to 0 and therefore below the 0.05 significance level, so once again we reject the null hypothesis that the data is normally distributed.
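To make the "combined" nature of this test concrete, the sketch below checks that the statistic from `scipy.stats.normaltest` equals the sum of the squared z-scores from `scipy.stats.skewtest` and `scipy.stats.kurtosistest`, and that the p-value comes from a chi-squared distribution with 2 degrees of freedom. It uses synthetic data and the plain `scipy.stats` functions rather than the `scipy.stats.mstats` variant called above, which is an assumption made for simplicity.

```python
import numpy as np
from scipy import stats

# Synthetic sample; illustration only, not the wind speed data
rng = np.random.default_rng(1)
sample = rng.normal(size=300)

# z-scores of the individual skewness and kurtosis tests
z_skew, _ = stats.skewtest(sample)
z_kurt, _ = stats.kurtosistest(sample)

# Omnibus statistic and p-value reported by normaltest
k2, p_value = stats.normaltest(sample)

# Rebuild both by hand
combined = z_skew ** 2 + z_kurt ** 2
p_from_chi2 = stats.chi2.sf(combined, df=2)

print("normaltest statistic:", k2)
print("z_skew^2 + z_kurt^2: ", combined)
print("p-values:", p_value, p_from_chi2)
```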



