#Shapiro-Wilk statistic for normality:
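
For reference, the W statistic is conventionally written as below. The notation is the standard one and is assumed here rather than taken from this section: $x_{(i)}$ is the $i$-th smallest value in the sample, $\bar{x}$ is the sample mean, and the $a_i$ are weights derived from the expected order statistics of a standard normal sample of size $n$.

$$
W = \frac{\left(\sum_{i=1}^{n} a_i \, x_{(i)}\right)^2}{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
$$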

The equation computes the ratio between the variability that would be expected for a normally distributed sample of that size (the numerator of the fraction above, built from the ordered sample values weighted by coefficients derived from the normal distribution) and the actual sum of squared differences between each value in the variable and the sample mean (the denominator). Values close to 1 indicate that the distribution is similar to a normal distribution. The smaller the W statistic becomes, the more the distribution of the data diverges from the normal distribution.

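Normality tests are very sensitive to sample size: with small samples they have little power to detect departures from normality, and with large samples they flag even trivial departures as significant.

These tests should therefore be accompanied by visualizations (for example a histogram or a Q-Q plot), since the test will detect very small instances of non-normality, especially in large samples.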
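
As a rough illustration of the points above, the sketch below runs `scipy.stats.shapiro` on two samples and computes the points of a normal Q-Q plot with `scipy.stats.probplot`. The samples, the seed, and the variable names are hypothetical and chosen only for demonstration; they do not come from the data used later in this notebook.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical samples (not from the text): one normal, one clearly skewed
normal_sample = rng.normal(loc=0.0, scale=1.0, size=200)
skewed_sample = rng.exponential(scale=1.0, size=200)

# shapiro() returns the W statistic and the p-value of the test
w_normal, p_normal = stats.shapiro(normal_sample)
w_skewed, p_skewed = stats.shapiro(skewed_sample)
print(f"normal sample: W={w_normal:.3f}, p={p_normal:.3f}")
print(f"skewed sample: W={w_skewed:.3f}, p={p_skewed:.3g}")

# probplot() returns the points of a normal Q-Q plot and a fitted line;
# points that stray far from the line are a visual sign of non-normality
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(
    skewed_sample, dist="norm")
print(f"Q-Q correlation for skewed sample: r={r:.3f}")
```

Plotting `theoretical_q` against `ordered_values` (for instance with `plt.scatter`) gives the Q-Q plot itself, which helps judge whether a significant result reflects a departure large enough to matter.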

#Multiple Testing Correction: Tukey's Honestly Significant Difference (HSD)

This could be used instead of running multiple pairwise t-tests, which would inflate the chance of at least one false positive across the comparisons.
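
To see why many separate tests are a problem, the sketch below computes the chance of at least one false positive when every pair of groups is tested at the same uncorrected significance level. It assumes the tests are independent, which is a simplification, and the function name `familywise_error_rate` is made up for this illustration.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise tests,
    assuming the tests are independent (a simplification)."""
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests

for n_groups in (3, 5, 10):
    rate = familywise_error_rate(n_groups)
    print(f"{n_groups} groups, {comb(n_groups, 2)} tests: {rate:.3f}")
```

With three groups there are three pairwise tests, and under this assumption the error rate is already about 0.143 rather than 0.05.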

HSD performs pairwise tests whose variability estimate is drawn from all groups (the pooled within-group variance, as in the ANOVA F-test) rather than only from the two groups being compared.

When calculating the probability of getting this ratio, the test statistic, Q, is evaluated against a modified probability distribution (the studentized range distribution) that takes into account the number of means being compared across all pairwise tests.
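
The calculation described above can be sketched by hand for a balanced design. The toy data and the helper `tukey_q` are hypothetical, and `scipy.stats.studentized_range` is assumed to be available (it was added in SciPy 1.7). This is a minimal sketch of the idea, not the code statsmodels runs internally.

```python
import numpy as np
from scipy import stats

# Hypothetical toy data (not from the text): three equally sized groups
groups = {
    "A": np.array([9.0, 10.0, 11.0, 12.0, 13.0]),
    "B": np.array([14.0, 15.0, 16.0, 17.0, 18.0]),
    "C": np.array([10.0, 11.0, 12.0, 13.0, 14.0]),
}
k = len(groups)          # number of group means being compared
n = 5                    # observations per group (balanced design)
df_within = k * (n - 1)  # degrees of freedom of the pooled variance

# Pooled within-group variance, using the spread of all groups
mse = sum(((g - g.mean()) ** 2).sum() for g in groups.values()) / df_within

def tukey_q(a, b):
    """Return Q and its p-value from the studentized range distribution."""
    q = abs(a.mean() - b.mean()) / np.sqrt(mse / n)
    p = stats.studentized_range.sf(q, k, df_within)
    return q, p

q_ab, p_ab = tukey_q(groups["A"], groups["B"])
q_ac, p_ac = tukey_q(groups["A"], groups["C"])
print(f"A vs B: Q={q_ab:.3f}, p={p_ab:.4f}")
print(f"A vs C: Q={q_ac:.3f}, p={p_ac:.4f}")
```

Because `k` is passed to the distribution, the same Q value yields a larger p-value as the number of groups grows, which is how the correction for multiple comparisons enters.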

Running Tukey's HSD using Python's statsmodels package gives us a table with the difference between each pair of means, the adjusted p-value, the lower and upper bounds of the confidence interval for that difference, and whether we should reject the null hypothesis that the two groups' means do not differ.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd



In [0]:
coaster_heights = pd.DataFrame()

steel_heights = [
    18.5, 14, 30.2, 25.2024, 15, 16, 13.5, 30, 20, 17, 13.716, 8.5, 16.1, 18,
    41, 30.3, 32.004, 28.004, 30.48, 34
    ]

wood_heights = [
    38.70, 46, 27.8, 43.52, 33.77, 29.26, 16.764, 45, 48.1, 16.764, 24.384,
    24.5, 40, 35.96, 22.24, 21.33, 27.73, 23.46, 21.64, 30.12
    ]

plastic_heights = [
    9, 8.2, 12, 21, 6.3, 11.7, 19.44, 4.75, 13, 18, 15.5, 15.6, 10, 11.77, 29,
    5, 3.2, 14.75, 18.2, 17.7
    ]

coaster_heights['Steel'] = steel_heights
coaster_heights['Wood'] = wood_heights
coaster_heights['Plastic'] = plastic_heights

heights = np.asarray(
    coaster_heights['Steel'].tolist() +
    coaster_heights['Wood'].tolist() +
    coaster_heights['Plastic'].tolist())

materials = np.array(['Steel', 'Wood','Plastic'])
materials = np.repeat(materials, 20)

In [3]:
tukey = pairwise_tukeyhsd(endog=heights,
                          groups=materials,
                          alpha=0.01)  # significance level
tukey.summary()

| group1 | group2 | meandiff | p-adj | lower | upper | reject |
|---|---|---|---|---|---|---|
| Plastic | Steel | 9.3698 | 0.0027 | 1.2031 | 17.5365 | True |
| Plastic | Wood | 17.6466 | 0.001 | 9.4799 | 25.8133 | True |
| Steel | Wood | 8.2768 | 0.0089 | 0.1101 | 16.4435 | True |


Above, "reject" is True for all three pairs of materials. This means that, at the 0.01 significance level, the null hypothesis that there is no difference between the two means has been rejected for every pair.