# Shapiro-Wilk normality test

In [1]:
from scipy.stats import shapiro
from tqdm.notebook import trange
from random import gauss, choice

A non-real-world example of the Shapiro-Wilk test: perfect draws from a Gaussian distribution.

In [2]:
alpha = 0.05
tries = 10_000

n = 166
average = 35.34181107
sd = 6.867662223  

fail = 0
for i in trange(tries):
    data = [gauss(average, sd) for _ in range(n)]
    stat, p = shapiro(data)
    if p < alpha:  # normality rejected although the data are truly Gaussian
        fail += 1
        
print(f"This test has given a WRONG answer in the {round(fail*100/tries, 2)}% of cases")

This test has given a WRONG answer in the 4.82% of cases


As expected, for numbers drawn from a clean mathematical distribution, the observed Type I error rate (4.82%) closely matches the 5% threshold we set with alpha.
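To put a number on "closely matches", here is a minimal sketch that treats each of the 10,000 tries as an independent Bernoulli trial with success probability alpha and computes the standard error of the resulting rejection rate. The variable names `observed`, `se` and `z` are illustrative and not part of the notebook above.

```python
from math import sqrt

alpha = 0.05       # nominal Type I error rate, as in the cell above
tries = 10_000     # number of simulated samples, as in the cell above
observed = 0.0482  # rejection rate printed by the cell above

# Standard error of a binomial proportion when the true rate is alpha
se = sqrt(alpha * (1 - alpha) / tries)

# Distance of the observed rate from alpha, measured in standard errors
z = (observed - alpha) / se

print(f"standard error of the rejection rate: {se:.4%}")
print(f"observed rate is {z:.2f} standard errors away from alpha")
```

The observed 4.82% sits within one standard error of 5%, so under this simple binomial model the gap is compatible with sampling noise alone.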

In the real world, outcomes can be influenced by a multitude of other factors: animal behaviour, eating, social interactions. To be fair, let's jitter each value in each individual experiment by a random amount of the same order as the SD (that's around 20% of the average).
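As a quick check of the scale involved, the sketch below compares the typical size of a 20% relative jitter with the SD of the sample. It assumes the jitter is proportional to each value, as in the `jitter()` function of the following cell; `noise_sd` is an illustrative name.

```python
average = 35.34181107
sd = 6.867662223
amount = 0.2  # relative jitter, the default of jitter() in the next cell

# jitter() adds (plus or minus) value * gauss(0, amount), so for a value
# near the average the added noise has an SD of roughly amount * average.
noise_sd = amount * average

print(f"noise SD near the average: {noise_sd:.2f}")
print(f"ratio to the sample SD:    {noise_sd / sd:.2f}")
```

The two quantities are nearly equal, which is what "of the same order as the SD" means here.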

In [3]:
alpha = 0.05
tries = 10_000

n = 166
average = 35.34181107
sd = 6.867662223

def jitter(numlike, amount=0.2):
    """Add noise proportional to numlike, with relative SD `amount`."""
    # 50% chance to flip sign
    modifier = numlike if choice([True, False]) else numlike * -1
    return numlike + modifier * gauss(0, amount)
    

fail = 0
for i in trange(tries):
    data = [jitter(gauss(average, sd)) for _ in range(n)]
    stat, p = shapiro(data)
    if p < alpha:  # normality rejected for the jittered sample
        fail += 1
        
print(f"This test has given a WRONG answer in the {round(fail*100/tries, 2)}% of cases")

This test has given a WRONG answer in the 46.02% of cases


In [4]:
alpha = 0.05
tries = 10_000

n = 166
average = 35.34181107
sd = 6.867662223

def jitter(numlike, amount=0.1):
    """Add noise proportional to numlike, with relative SD `amount`."""
    # 50% chance to flip sign
    modifier = numlike if choice([True, False]) else numlike * -1
    return numlike + modifier * gauss(0, amount)
    

fail = 0
for i in trange(tries):
    data = [jitter(gauss(average, sd)) for _ in range(n)]
    stat, p = shapiro(data)
    if p < alpha:  # normality rejected for the jittered sample
        fail += 1
        
print(f"This test has given a WRONG answer in the {round(fail*100/tries, 2)}% of cases")

This test has given a WRONG answer in the 16.84% of cases


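Even a modest jitter around the values (10% of each value, applied to an otherwise perfect Gaussian distribution) is enough to make the test reject normality far more often than the 5% threshold: 16.84% of cases with amount=0.1 and 46.02% with amount=0.2.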
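One possible reading of these numbers is that a jitter proportional to each value changes the shape of the distribution itself: larger values receive larger noise, which stretches the right tail. The sketch below is not part of the original experiment. It uses only the standard library, an arbitrary fixed seed, a much larger sample than n = 166, and a hypothetical `skewness` helper to estimate the asymmetry of the jittered values.

```python
from random import Random

rng = Random(0)  # arbitrary fixed seed, for reproducibility only
average, sd = 35.34181107, 6.867662223
n = 200_000      # far larger than the notebook's n, to stabilise the estimate

def skewness(xs):
    """Sample skewness: third central moment over variance ** 1.5."""
    m = sum(xs) / len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5

def jitter(value, amount):
    # Same distribution as the notebook's jitter(): the random sign flip is
    # redundant because gauss(0, amount) is already symmetric around zero.
    return value * (1 + rng.gauss(0, amount))

clean = [rng.gauss(average, sd) for _ in range(n)]

results = {}
for amount in (0.0, 0.1, 0.2):
    results[amount] = skewness([jitter(x, amount) for x in clean])
    print(f"amount={amount}: skewness = {results[amount]:.3f}")
```

A Gaussian sample has skewness close to zero, while the jittered samples come out right-skewed, more so for the larger amount. Under these assumptions, part of the increased rejection rate reflects a real departure from normality introduced by the jitter.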