
# Hypothesis Testing - ANOVA (Analysis of Variance)

# Contents

The focus of this notebook is ANOVA testing.

The notebook will cover the following:



1. Description
2. Manual Calculation
3. Practical Examples using Scipy Library
4. Assumptions

# 1. Description

ANOVA (Analysis of Variance) tests the null hypothesis that the populations from which the datasets (groups) were sampled all have the same mean.  

If we reject the null hypothesis when performing an ANOVA test, this tells us that at least one of the group means differs from the others.  

However, it does not tell us which of the groups differ.
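
As a minimal sketch of this limitation, the example below runs an ANOVA on three small made-up groups and then follows up with pairwise t-tests at a Bonferroni-adjusted significance level. The toy data, the variable names and the choice of Bonferroni correction are assumptions for illustration only; they are not part of this notebook's dataset or method.

```python
from itertools import combinations
from scipy.stats import f_oneway, ttest_ind

# Hypothetical toy groups: a and b are similar, c is clearly higher
toy_groups = {
    'a': [10.1, 9.8, 10.3, 10.0, 9.9],
    'b': [10.2, 10.0, 9.7, 10.1, 10.0],
    'c': [15.2, 14.8, 15.1, 15.0, 14.9],
}

# ANOVA only tells us whether at least one group mean differs
anova_result = f_oneway(*toy_groups.values())
print('ANOVA p_val: {:.6f}'.format(anova_result.pvalue))

# Follow-up: pairwise t-tests, with alpha divided by the number of comparisons
pairs = list(combinations(toy_groups, 2))
adjusted_alpha = 0.05 / len(pairs)
pairwise_p_vals = {}
for first, second in pairs:
    p_val = ttest_ind(toy_groups[first], toy_groups[second]).pvalue
    pairwise_p_vals[(first, second)] = p_val
    print('{} vs {}: p_val={:.6f}, differs={}'.format(
        first, second, p_val, p_val < adjusted_alpha))
```

In this toy case the ANOVA rejects the null hypothesis, and only the follow-up comparisons show that group c is the one that stands apart.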

# 2. Manual Calculation

Using the same dataset as the Scipy section below, we are going to perform a one-way ANOVA hypothesis test to determine whether there is a difference between the mean subscriber sales figures of the 3 different websites.

## Step 1- Define Null and Alternative Hypothesis

Null Hypothesis = $H_0$  
Alternative Hypothesis = $H_A$  

$H_0$: The population means of website a, website b and website c are equal.

$H_0: \mu_\text{website_a} = \mu_\text{website_b} = \mu_\text{website_c}$

$H_A$: At least one website mean is different from the others.

$H_A: \mu_i \neq \mu_j \text{ for at least one pair } i \neq j \text{ of websites}$

In [0]:
website_a_figures = [73.57195018, 38.36736256, 49.36398786, 61.9617142, 
                     38.73959044, 55.94532269, 36.65062484, 60.67437231, 
                     63.07900236, 87.32085001, 50.34422982, 57.1090334, 
                     78.67520953, 61.03927418, 82.28774307, 53.58957582, 
                     72.92461536, 74.5603031, 55.02980576, 41.25844438, 
                     53.79588118, 64.79609893, 70.6964892, 66.74072317, 
                     75.0132205,  95.1255286, 49.455128, 66.03612649, 
                     53.02736305, 73.36372418, 40.25571098, 71.04422625, 
                     50.5013845, 38.22366664, 42.75767497, 52.50694334, 
                     38.604658, 59.67850535, 44.19604564, 46.92727224, 
                     55.24050064, 64.52773077, 34.09865429, 42.23778758, 
                     52.86937388, 90.10958086, 59.77157363, 65.57718324, 
                     67.40180559, 56.73021714, 63.26785746, 45.37055306, 
                     80.38995288, 87.65807685, 51.45634914, 65.99748438, 
                     72.47729986, 64.30071533, 19.73984606, 46.23986878, 
                     52.34828788, 61.11952527, 56.20838268, 39.34468135, 
                     57.93250947, 53.37617284, 48.81742261, 80.03593773, 
                     42.25474002, 44.4620247, 63.2401429, 53.75811252, 
                     41.12354869, 70.37251822, 58.0428706, 53.80533131, 
                     33.5540081, 50.05772819, 59.01472301, 63.18681147, 
                     56.36447661, 79.54804111, 57.58182513, 41.80650266, 
                     63.29608989, 69.20391058, 79.07999732, 68.87071256, 
                     54.61550389, 41.62384273, 58.05435004, 57.19652908, 
                     69.06753866, 76.73936006, 61.71461742, 90.44575864, 
                     44.97285945, 14.93461323, 60.22982158, 37.03612975, 
                     38.57973403, 56.3595887, 95.82780236, 82.89304549, 
                     48.08376399, 41.40360494, 39.62295003, 68.15395305, 
                     62.5134759, 48.39594647, 43.4393662, 53.8693371, 
                     45.40197091, 42.65484356, 77.56769986, 42.21598943, 
                     80.22825438, 52.2077973, 41.85889516, 83.40105978, 
                     63.19638331, 61.35383468, 80.02929924, 48.89037458, 
                     53.97640552, 56.44664214, 50.13546236, 41.93267706, 
                     62.23540804, 60.02470794, 71.94323655, 59.379194, 
                     42.88128137, 79.18722897, 79.31010058, 48.44544746, 
                     51.91236908, 41.13612282, 65.07530571, 49.21085783, 
                     82.5097768, 60.94264609, 56.83480824, 64.73765846, 
                     69.44225076, 47.86210011, 72.52226994, 68.98808623, 
                     58.23601966, 63.84862398]

In [0]:
website_b_figures = [166.22395047, 172.09005618, 152.32762949, 172.507172, 
             140.27311528, 140.36043391, 111.86744002, 141.70177057, 
             148.98833575, 146.91675049, 123.54945067, 123.40978504, 
             155.76569046, 142.96865798, 135.58264198, 169.68754397, 
             146.65975496, 150.9707252, 134.86971979, 149.60776063, 
             148.28307388, 145.88817947, 162.38303137, 135.00981504, 
             171.85180822, 130.07093286, 153.57912977, 138.00376972, 
             125.15058192, 165.07780817, 167.06076974, 148.21345784, 
             142.13482631, 161.74471113, 153.08759695, 127.46149326, 
             144.98488931, 168.04532604, 146.62742461, 169.08085481, 
             147.93072944, 177.82468602, 156.75914532, 149.6819541, 
             133.91578629, 131.58089602, 159.56290262, 160.28930871, 
             182.15691569, 141.93026478, 153.67681774, 124.85813365, 
             135.17597596, 151.92017019, 156.76937056, 151.0351934, 
             147.07835781, 163.30459951, 172.75128135, 132.2530624, 
             140.86317146, 153.71741024, 128.43960204, 150.16932458, 
             146.08145551, 171.98296242, 151.61688295, 159.94892745, 
             137.01951468, 139.07827094, 153.0684031, 151.07737089, 
             136.709464, 186.87288108, 146.4899794, 163.81305555, 
             135.53650919, 152.58431842, 149.59211758, 146.01451472, 
             159.67903997, 168.08994565, 155.03120985, 147.90284861, 
             133.29003945, 137.50441404, 179.52358583, 152.88678396, 
             124.53005843, 145.94780164, 150.74413238, 152.11680648, 
             147.87257702, 164.07964198, 141.96900012, 152.51128825, 
             145.32057291, 148.96494736, 146.39303683, 168.97273101, 
             126.22804995, 146.78461973, 152.16877455, 145.67293588, 
             146.75954115, 152.23049295, 172.39283402, 159.6911547, 
             153.36245345, 165.01578242, 150.94090222, 146.15422137, 
             140.50921191, 140.04764151, 153.15240977, 152.43495213, 
             152.45494029, 165.79913392, 126.14429818, 149.14747073, 
             134.03760893, 137.90312396, 134.40634288, 146.15657776, 
             139.77627365, 149.09244734, 152.204507, 159.88654681, 
             130.79661858, 143.78037133, 140.67739551, 138.27404966, 
             144.97202726, 129.27957612, 125.36927788, 156.07923218, 
             163.05308354, 151.10088824, 119.00762725, 147.40755788, 
             151.08377314, 139.35183159, 157.26332153, 117.17654103, 
             153.10114854, 133.8973409, 147.25560303, 170.96561247, 
             135.18641327, 123.07647302]

In [0]:
website_c_figures = [62.80215823, 54.62836519, 49.87267902, 71.37979508, 
             85.12967256, 49.81457321, 75.35136467, 54.45595669, 
             86.7241256, 75.56591744, 63.03183392, 57.3843091, 
             39.23348399, 82.69868909, 55.96390617, 61.76459869, 
             62.1037224, 66.92666631, 49.93339614, 51.13778227, 
             31.67316587, 38.49802083, 49.36472683, 69.04306874, 
             45.20762281, 73.5836671, 100.61092317, 30.92480424, 
             37.15912948, 53.73782208, 69.36703357, 60.15384459, 
             43.20003949, 51.12609883, 64.77507512, 63.28721074, 
             62.754003, 71.7590419, 73.08977513, 75.48485174, 
             47.8968874, 62.25258739, 66.33673312, 43.71093919, 
             80.65634624, 72.25758668, 100.91480345, 32.02761357, 
             58.31892089, 71.4399215, 45.07120452, 69.71137689, 
             85.37226652, 55.67710588, 70.25367706, 55.81488767, 
             61.21107415, 55.42183671, 66.80712926, 36.99284828, 
             42.37050081, 79.61120896, 58.88769703, 79.59158327, 
             59.16570772, 70.02097967, 85.29993197, 32.41236279, 
             52.6081084, 68.17342096, 65.32976073, 60.00672926, 
             26.30035248, 87.44943179, 55.35068819, 60.28778429, 
             33.03668105, 80.18693884, 77.27496626, 76.7616852, 
             100.94978198, 59.46503936, 78.07629437, 51.6102307, 
             86.95385235, 85.41984014, 54.83564532, 58.06315164, 
             66.17243082, 62.27966342, 83.35735441, 44.7213871, 
             42.8362959, 71.72428838, 68.4553881, 55.93152855, 
             52.33224863, 65.53344277, 48.19362864, 64.92300871, 
             56.23992888, 78.11260866, 55.80999334, 82.06126322, 
             67.03112813, 51.22917649, 51.6408127, 63.48194033, 
             71.77803695, 71.3696262, 45.12124272, 82.24703749, 
             70.91202908, 62.51210475, 71.71280187, 66.37758047, 
             49.28266126, 41.29074798, 61.81010589, 39.62654933, 
             73.54046019, 71.08493364, 61.88315174, 57.41521882, 
             69.83162524, 65.90217343, 77.11136925, 86.72149744, 
             81.81406049, 65.85430935, 94.96433541, 69.97078382, 
             73.34687737, 75.05530607, 57.51618582, 62.3665881, 
             58.80615169, 63.38469779, 35.87580831, 46.22850701, 
             56.05164877, 55.33599773, 45.84709985, 51.93706339, 
             70.16258619, 65.97686424, 50.51337502, 46.76635411, 
             70.39019472, 42.0636888]

## Step 2 - Calculate SST (Sum of Squares Total)

In [0]:
# Convert to numpy arrays to take advantage of in-built statistical functions
import numpy as np
website_a_figures_np = np.asarray(website_a_figures)
website_b_figures_np = np.asarray(website_b_figures)
website_c_figures_np = np.asarray(website_c_figures)

In [5]:
# print mean values and take mean of means to get grand mean
website_a_mean = website_a_figures_np.mean()
website_b_mean = website_b_figures_np.mean()
website_c_mean = website_c_figures_np.mean()

print('website a mean: {0:.3f}'.format(website_a_mean))
print('website b mean: {0:.3f}'.format(website_b_mean))
print('website c mean: {0:.3f}'.format(website_c_mean))

website a mean: 58.350
website b mean: 148.355
website c mean: 62.361


In [6]:
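# Calculate grand mean (mean of the group means)
# Note: this equals the overall mean only because every group has the same number of samples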
mean_of_means = np.asarray([website_a_mean, website_b_mean, website_c_mean])
grand_mean = mean_of_means.mean()
print('grand mean: {0:.3f}'.format(grand_mean))

grand mean: 89.689


In [0]:
# Alternative approach for calculating grand mean
website_a_total = website_a_figures_np.sum()
website_a_samples = len(website_a_figures)
website_b_total = website_b_figures_np.sum()
website_b_samples = len(website_b_figures)
website_c_total = website_c_figures_np.sum()
website_c_samples = len(website_c_figures)

In [8]:
grand_mean_alt = (website_a_total + website_b_total + website_c_total)/(website_a_samples + website_b_samples + website_c_samples)
print('grand mean alt: {0:.3f}'.format(grand_mean_alt))

grand mean alt: 89.689


In [9]:
# Calculating SST
SST = 0
all_website_figures = [website_a_figures, website_b_figures, website_c_figures]
for website_figures in all_website_figures:
  for figure in website_figures:
    SST += (figure - grand_mean)**2
print('SST: {:.3f}'.format(SST))

SST: 871657.188
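
The nested loop above can also be written without explicit loops. A minimal sketch using NumPy, on hypothetical toy groups rather than the website figures (the names `toy_groups`, `toy_grand_mean` and `toy_sst` are made up for this example):

```python
import numpy as np

# Hypothetical toy groups standing in for the website figures
toy_groups = [np.array([1.0, 2.0, 3.0]),
              np.array([2.0, 3.0, 4.0]),
              np.array([5.0, 6.0, 7.0])]

# Pool every observation, then sum the squared deviations from the grand mean
all_values = np.concatenate(toy_groups)
toy_grand_mean = all_values.mean()
toy_sst = np.sum((all_values - toy_grand_mean) ** 2)
print('toy SST: {:.3f}'.format(toy_sst))
```

Pooling the observations first means the grand mean is correct even when the groups have different sizes.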


In [10]:
# SST Degrees of freedom - m*n -1 
# m is the number of groups, n is the number of samples per group
# In this example there are 3 websites so m=3, 150 samples per website so n=150
degrees_of_freedom = 3 * 150 -1
print('Degrees of freedom: {:.3f}'.format(degrees_of_freedom))

Degrees of freedom: 449.000
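
The hard-coded `3 * 150 - 1` relies on every group having the same size. A small sketch of deriving all three degrees of freedom from the group sizes instead; the group sizes are the ones stated in this notebook, while the variable names are made up for the example:

```python
# Group sizes taken from this notebook: 150 samples for each of the 3 websites
group_sizes = [150, 150, 150]

total_samples = sum(group_sizes)       # N
number_of_groups = len(group_sizes)    # m

sst_dof_from_sizes = total_samples - 1                 # N - 1
ssw_dof_from_sizes = total_samples - number_of_groups  # N - m
ssb_dof_from_sizes = number_of_groups - 1              # m - 1

print('SST DOF: {}'.format(sst_dof_from_sizes))
print('SSW DOF: {}'.format(ssw_dof_from_sizes))
print('SSB DOF: {}'.format(ssb_dof_from_sizes))
```

The same expressions apply when the groups have unequal sizes, in which case `m*n - 1` and `m(n-1)` no longer have a single `n` to plug in.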


## Step 3 - Calculate SSW (Sum of Squares Within)

In [11]:
# Find the mean for each group (website a,b & c)
# This has already been calculated above, so just printing out here
print('website a mean: {0:.3f}'.format(website_a_mean))
print('website b mean: {0:.3f}'.format(website_b_mean))
print('website c mean: {0:.3f}'.format(website_c_mean))

website a mean: 58.350
website b mean: 148.355
website c mean: 62.361


In [16]:
# Find SSW for each group and then total
ssw_a_total = 0
for figure in website_a_figures:
  ssw_a_total+= (figure-website_a_mean)**2
print('ssw_a_total: {0:.3f}'.format(ssw_a_total))

ssw_b_total = 0
for figure in website_b_figures:
  ssw_b_total+= (figure - website_b_mean)**2
print('ssw_b_total: {0:.3f}'.format(ssw_b_total))

ssw_c_total = 0
for figure in website_c_figures:
  ssw_c_total+= (figure - website_c_mean)**2
print('ssw_c_total: {0:.3f}'.format(ssw_c_total))

ssw_total = ssw_a_total + ssw_b_total + ssw_c_total
print('ssw_total: {0:.3f}'.format(ssw_total))

ssw_a_total: 32650.767
ssw_b_total: 29240.317
ssw_c_total: 34167.346
ssw_total: 96058.430
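
The three loops above follow the same pattern, so they can be collapsed into one expression. A minimal NumPy sketch on hypothetical toy groups (not the website figures; `toy_groups` and `toy_ssw` are illustrative names):

```python
import numpy as np

# Hypothetical toy groups standing in for the website figures
toy_groups = [np.array([1.0, 2.0, 3.0]),
              np.array([2.0, 3.0, 4.0]),
              np.array([5.0, 6.0, 7.0])]

# For each group, sum the squared deviations from that group's own mean,
# then add the per-group totals together
toy_ssw = sum(np.sum((group - group.mean()) ** 2) for group in toy_groups)
print('toy SSW: {:.3f}'.format(toy_ssw))
```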


In [19]:
# SSW Degrees of freedom - m(n-1)
# m = number of groups = 3, n is number of samples per group = 150
ssw_dof = 3*(150-1)
print('ssw dof: {0:.3f}'.format(ssw_dof))

ssw dof: 447.000


## Step 4 - Calculate SSB (Sum of Squares Between)

In [20]:
# To calculate SSB, we need the grand mean and the sample means
print('grand mean: {0:.3f}'.format(grand_mean))
print('website a mean: {0:.3f}'.format(website_a_mean))
print('website b mean: {0:.3f}'.format(website_b_mean))
print('website c mean: {0:.3f}'.format(website_c_mean))

grand mean: 89.689
website a mean: 58.350
website b mean: 148.355
website c mean: 62.361


In [21]:
# Find SSB for each group and then total
# Every sample in a group contributes (group mean - grand mean)^2, so each
# group's total is its sample count times that squared difference
ssb_a_total = 0
for figure in website_a_figures:
  ssb_a_total+= (website_a_mean-grand_mean)**2
print('ssb_a_total: {0:.3f}'.format(ssb_a_total))

ssb_b_total = 0
for figure in website_b_figures:
  ssb_b_total+= (website_b_mean-grand_mean)**2
print('ssb_b_total: {0:.3f}'.format(ssb_b_total))

ssb_c_total = 0
for figure in website_c_figures:
  ssb_c_total+= (website_c_mean-grand_mean)**2
print('ssb_c_total: {0:.3f}'.format(ssb_c_total))

ssb_total = ssb_a_total + ssb_b_total + ssb_c_total
print('ssb_total: {0:.3f}'.format(ssb_total))

ssb_a_total: 147319.440
ssb_b_total: 516261.217
ssb_c_total: 112018.100
ssb_total: 775598.758
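
Since each sample in a group adds the same squared difference, SSB can be computed as the group size times the squared distance between the group mean and the grand mean. A minimal sketch with hypothetical toy groups of unequal size (the data and names are assumptions for illustration), which also shows why the grand mean should then come from the pooled observations rather than from the mean of the group means:

```python
import numpy as np

# Hypothetical toy groups with different sizes
toy_groups = [np.array([1.0, 2.0, 3.0]),
              np.array([4.0, 6.0]),
              np.array([7.0, 8.0, 9.0, 10.0])]

# Grand mean from the pooled observations (valid for unequal group sizes)
toy_grand_mean = np.concatenate(toy_groups).mean()

# Weight each group's squared distance from the grand mean by its size
toy_ssb = sum(len(group) * (group.mean() - toy_grand_mean) ** 2
              for group in toy_groups)
print('toy SSB: {:.3f}'.format(toy_ssb))
```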


In [22]:
# ssb degrees of freedom is m-1, where m is number of groups
ssb_dof = 2
print('ssb dof: {0:.3f}'.format(ssb_dof))

ssb dof: 2.000


In [25]:
# Summary
print('SST : {0:.3f}'.format(SST))
print('SST DOF: {0:.3f}'.format(degrees_of_freedom))
print('SSW : {0:.3f}'.format(ssw_total))
print('SSW DOF: {0:.3f}'.format(ssw_dof))
print('SSB : {0:.3f}'.format(ssb_total))
print('SSB DOF: {0:.3f}'.format(ssb_dof))


SST : 871657.188
SST DOF: 449.000
SSW : 96058.430
SSW DOF: 447.000
SSB : 775598.758
SSB DOF: 2.000
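
A useful sanity check on the summary is that the total variation splits exactly into the within-group and between-group parts, SST = SSW + SSB, and that the degrees of freedom split the same way. A small sketch using the rounded values printed above (the `*_reported` names are made up for this example):

```python
import math

# Values copied from the summary output above (rounded to 3 decimal places)
sst_reported = 871657.188
ssw_reported = 96058.430
ssb_reported = 775598.758

# SST should equal SSW + SSB, up to the rounding of the printed values
partition_holds = math.isclose(sst_reported, ssw_reported + ssb_reported,
                               abs_tol=0.01)
print('SST == SSW + SSB: {}'.format(partition_holds))

# The degrees of freedom partition in the same way: 449 = 447 + 2
dof_partition_holds = (449 == 447 + 2)
print('DOF partition holds: {}'.format(dof_partition_holds))
```

If this check fails, one of the three sums of squares has most likely been computed against the wrong mean.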


## Step 5 - Calculate F-Statistic

$F\text{-}statistic = \frac{\dfrac{SSB}{\text{degrees of freedom ssb}}}{{\dfrac{SSW}{\text{degrees of freedom ssw}}}}$ 

In [26]:
f_statistic = (ssb_total/ssb_dof)/(ssw_total/ssw_dof)
print('F-Statistic: {0:.3f}'.format(f_statistic))

F-Statistic: 1804.592


We are going to set the significance level for this test to 5% (0.05).

This means that, assuming the null hypothesis is true, if the probability of obtaining an F-statistic at least as large as the one we observed is less than 5%, we reject the null hypothesis in favour of the alternative hypothesis.
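
That probability is the p-value, and it can be read directly from the F distribution instead of a table. A minimal sketch, assuming Scipy's `scipy.stats.f` distribution and its survival function `sf`; the input values are the F-statistic and degrees of freedom reported above, and the `*_reported` names are made up for this example:

```python
from scipy.stats import f

# Values reported earlier in this notebook
f_statistic_reported = 1804.592
ssb_dof_reported = 2      # numerator degrees of freedom
ssw_dof_reported = 447    # denominator degrees of freedom
significance_level = 0.05

# Survival function: P(F >= observed value) under the null hypothesis
p_val_from_f = f.sf(f_statistic_reported, ssb_dof_reported, ssw_dof_reported)
reject_null = p_val_from_f < significance_level

print('p_val: {:.6f}'.format(p_val_from_f))
print('reject null hypothesis: {}'.format(reject_null))
```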

## Step 6 - Find the F-Table Value

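Using the F-Table: http://socr.ucla.edu/Applets.dir/F_Table.html#FTable0.05

With an SSB DOF (numerator) of 2 and an SSW DOF (denominator) of 447, the F critical value at the 5% significance level is approximately 2.9957. F-tables do not list 447 denominator degrees of freedom, so this value is read from the row for the largest denominator degrees of freedom (∞), which is the closest available.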
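
The critical value can also be computed rather than looked up, which avoids rounding the degrees of freedom to the nearest table row. A minimal sketch, assuming Scipy's `scipy.stats.f.ppf` (the inverse cumulative distribution function); `critical_value` is an illustrative name:

```python
from scipy.stats import f

significance_level = 0.05
ssb_dof = 2    # numerator degrees of freedom
ssw_dof = 447  # denominator degrees of freedom

# Value that an F-distributed statistic exceeds with probability 0.05
critical_value = f.ppf(1 - significance_level, ssb_dof, ssw_dof)
print('F critical value: {:.4f}'.format(critical_value))
```

The computed value is expected to be slightly larger than the table entry for ∞, which does not change the comparison in the next step.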

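## Step 7 - Compare F-Statistic Value to Critical Value from F-Table

The F-statistic (1804.592) is far greater than the critical value from the F-table (2.9957).

## Step 8 - Results

Because the F-statistic exceeds the critical value, we reject the null hypothesis at the 5% significance level: at least one of the websites has a mean that differs from the others.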

# 3. Practical Examples using Scipy Library

To perform an ANOVA test using the Scipy library we can use the function *f_oneway*, which runs a one-way ANOVA test on two or more datasets.  

The *f_oneway* function takes the datasets as separate arguments and returns the F-statistic and the p-value.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html
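
Before running it on the full dataset, here is a minimal sketch of the call on hypothetical toy groups that are small enough to verify by hand (SSB = 26 with 2 degrees of freedom, SSW = 6 with 6 degrees of freedom, so F = 13). The toy data and variable names are assumptions for illustration:

```python
from scipy.stats import f_oneway

# Hypothetical toy groups, small enough to check by hand
toy_group_1 = [1.0, 2.0, 3.0]
toy_group_2 = [2.0, 3.0, 4.0]
toy_group_3 = [5.0, 6.0, 7.0]

# Each dataset is passed as its own positional argument
toy_result = f_oneway(toy_group_1, toy_group_2, toy_group_3)

# The result can be read by attribute or unpacked as a pair
toy_fstat, toy_p_val = toy_result
print('fstat: {:.3f}'.format(toy_result.statistic))
print('p_val: {:.6f}'.format(toy_result.pvalue))
```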

## Example 1 - Subscriber Figures Sales

The following dataset contains subscriber sales figures for video streaming content from 3 different websites.

In [0]:
website_a_figures = [73.57195018, 38.36736256, 49.36398786, 61.9617142, 
                     38.73959044, 55.94532269, 36.65062484, 60.67437231, 
                     63.07900236, 87.32085001, 50.34422982, 57.1090334, 
                     78.67520953, 61.03927418, 82.28774307, 53.58957582, 
                     72.92461536, 74.5603031, 55.02980576, 41.25844438, 
                     53.79588118, 64.79609893, 70.6964892, 66.74072317, 
                     75.0132205,  95.1255286, 49.455128, 66.03612649, 
                     53.02736305, 73.36372418, 40.25571098, 71.04422625, 
                     50.5013845, 38.22366664, 42.75767497, 52.50694334, 
                     38.604658, 59.67850535, 44.19604564, 46.92727224, 
                     55.24050064, 64.52773077, 34.09865429, 42.23778758, 
                     52.86937388, 90.10958086, 59.77157363, 65.57718324, 
                     67.40180559, 56.73021714, 63.26785746, 45.37055306, 
                     80.38995288, 87.65807685, 51.45634914, 65.99748438, 
                     72.47729986, 64.30071533, 19.73984606, 46.23986878, 
                     52.34828788, 61.11952527, 56.20838268, 39.34468135, 
                     57.93250947, 53.37617284, 48.81742261, 80.03593773, 
                     42.25474002, 44.4620247, 63.2401429, 53.75811252, 
                     41.12354869, 70.37251822, 58.0428706, 53.80533131, 
                     33.5540081, 50.05772819, 59.01472301, 63.18681147, 
                     56.36447661, 79.54804111, 57.58182513, 41.80650266, 
                     63.29608989, 69.20391058, 79.07999732, 68.87071256, 
                     54.61550389, 41.62384273, 58.05435004, 57.19652908, 
                     69.06753866, 76.73936006, 61.71461742, 90.44575864, 
                     44.97285945, 14.93461323, 60.22982158, 37.03612975, 
                     38.57973403, 56.3595887, 95.82780236, 82.89304549, 
                     48.08376399, 41.40360494, 39.62295003, 68.15395305, 
                     62.5134759, 48.39594647, 43.4393662, 53.8693371, 
                     45.40197091, 42.65484356, 77.56769986, 42.21598943, 
                     80.22825438, 52.2077973, 41.85889516, 83.40105978, 
                     63.19638331, 61.35383468, 80.02929924, 48.89037458, 
                     53.97640552, 56.44664214, 50.13546236, 41.93267706, 
                     62.23540804, 60.02470794, 71.94323655, 59.379194, 
                     42.88128137, 79.18722897, 79.31010058, 48.44544746, 
                     51.91236908, 41.13612282, 65.07530571, 49.21085783, 
                     82.5097768, 60.94264609, 56.83480824, 64.73765846, 
                     69.44225076, 47.86210011, 72.52226994, 68.98808623, 
                     58.23601966, 63.84862398]

In [0]:
website_b_figures = [166.22395047, 172.09005618, 152.32762949, 172.507172, 
             140.27311528, 140.36043391, 111.86744002, 141.70177057, 
             148.98833575, 146.91675049, 123.54945067, 123.40978504, 
             155.76569046, 142.96865798, 135.58264198, 169.68754397, 
             146.65975496, 150.9707252, 134.86971979, 149.60776063, 
             148.28307388, 145.88817947, 162.38303137, 135.00981504, 
             171.85180822, 130.07093286, 153.57912977, 138.00376972, 
             125.15058192, 165.07780817, 167.06076974, 148.21345784, 
             142.13482631, 161.74471113, 153.08759695, 127.46149326, 
             144.98488931, 168.04532604, 146.62742461, 169.08085481, 
             147.93072944, 177.82468602, 156.75914532, 149.6819541, 
             133.91578629, 131.58089602, 159.56290262, 160.28930871, 
             182.15691569, 141.93026478, 153.67681774, 124.85813365, 
             135.17597596, 151.92017019, 156.76937056, 151.0351934, 
             147.07835781, 163.30459951, 172.75128135, 132.2530624, 
             140.86317146, 153.71741024, 128.43960204, 150.16932458, 
             146.08145551, 171.98296242, 151.61688295, 159.94892745, 
             137.01951468, 139.07827094, 153.0684031, 151.07737089, 
             136.709464, 186.87288108, 146.4899794, 163.81305555, 
             135.53650919, 152.58431842, 149.59211758, 146.01451472, 
             159.67903997, 168.08994565, 155.03120985, 147.90284861, 
             133.29003945, 137.50441404, 179.52358583, 152.88678396, 
             124.53005843, 145.94780164, 150.74413238, 152.11680648, 
             147.87257702, 164.07964198, 141.96900012, 152.51128825, 
             145.32057291, 148.96494736, 146.39303683, 168.97273101, 
             126.22804995, 146.78461973, 152.16877455, 145.67293588, 
             146.75954115, 152.23049295, 172.39283402, 159.6911547, 
             153.36245345, 165.01578242, 150.94090222, 146.15422137, 
             140.50921191, 140.04764151, 153.15240977, 152.43495213, 
             152.45494029, 165.79913392, 126.14429818, 149.14747073, 
             134.03760893, 137.90312396, 134.40634288, 146.15657776, 
             139.77627365, 149.09244734, 152.204507, 159.88654681, 
             130.79661858, 143.78037133, 140.67739551, 138.27404966, 
             144.97202726, 129.27957612, 125.36927788, 156.07923218, 
             163.05308354, 151.10088824, 119.00762725, 147.40755788, 
             151.08377314, 139.35183159, 157.26332153, 117.17654103, 
             153.10114854, 133.8973409, 147.25560303, 170.96561247, 
             135.18641327, 123.07647302]

In [0]:
website_c_figures = [62.80215823, 54.62836519, 49.87267902, 71.37979508, 
             85.12967256, 49.81457321, 75.35136467, 54.45595669, 
             86.7241256, 75.56591744, 63.03183392, 57.3843091, 
             39.23348399, 82.69868909, 55.96390617, 61.76459869, 
             62.1037224, 66.92666631, 49.93339614, 51.13778227, 
             31.67316587, 38.49802083, 49.36472683, 69.04306874, 
             45.20762281, 73.5836671, 100.61092317, 30.92480424, 
             37.15912948, 53.73782208, 69.36703357, 60.15384459, 
             43.20003949, 51.12609883, 64.77507512, 63.28721074, 
             62.754003, 71.7590419, 73.08977513, 75.48485174, 
             47.8968874, 62.25258739, 66.33673312, 43.71093919, 
             80.65634624, 72.25758668, 100.91480345, 32.02761357, 
             58.31892089, 71.4399215, 45.07120452, 69.71137689, 
             85.37226652, 55.67710588, 70.25367706, 55.81488767, 
             61.21107415, 55.42183671, 66.80712926, 36.99284828, 
             42.37050081, 79.61120896, 58.88769703, 79.59158327, 
             59.16570772, 70.02097967, 85.29993197, 32.41236279, 
             52.6081084, 68.17342096, 65.32976073, 60.00672926, 
             26.30035248, 87.44943179, 55.35068819, 60.28778429, 
             33.03668105, 80.18693884, 77.27496626, 76.7616852, 
             100.94978198, 59.46503936, 78.07629437, 51.6102307, 
             86.95385235, 85.41984014, 54.83564532, 58.06315164, 
             66.17243082, 62.27966342, 83.35735441, 44.7213871, 
             42.8362959, 71.72428838, 68.4553881, 55.93152855, 
             52.33224863, 65.53344277, 48.19362864, 64.92300871, 
             56.23992888, 78.11260866, 55.80999334, 82.06126322, 
             67.03112813, 51.22917649, 51.6408127, 63.48194033, 
             71.77803695, 71.3696262, 45.12124272, 82.24703749, 
             70.91202908, 62.51210475, 71.71280187, 66.37758047, 
             49.28266126, 41.29074798, 61.81010589, 39.62654933, 
             73.54046019, 71.08493364, 61.88315174, 57.41521882, 
             69.83162524, 65.90217343, 77.11136925, 86.72149744, 
             81.81406049, 65.85430935, 94.96433541, 69.97078382, 
             73.34687737, 75.05530607, 57.51618582, 62.3665881, 
             58.80615169, 63.38469779, 35.87580831, 46.22850701, 
             56.05164877, 55.33599773, 45.84709985, 51.93706339, 
             70.16258619, 65.97686424, 50.51337502, 46.76635411, 
             70.39019472, 42.0636888]

## Step 1 - Define Null and Alternative Hypothesis

Null Hypothesis = $H_0$  
Alternative Hypothesis = $H_A$  

$H_0$: The population means of website a, website b and website c are equal.

$H_0: \mu_\text{website_a} = \mu_\text{website_b} = \mu_\text{website_c}$

$H_A$: At least one website mean is different from the others.

$H_A: \mu_i \neq \mu_j \text{ for at least one pair } i \neq j \text{ of websites}$

## Step 2 - Prepare Data & Run Test

In [0]:
from scipy.stats import f_oneway
import numpy as np

In [0]:
# Collect Statistics about each website
website_a_mean = np.mean(website_a_figures)
website_b_mean = np.mean(website_b_figures)
website_c_mean = np.mean(website_c_figures)

# np.std defaults to the population standard deviation (ddof=0)
website_a_std = np.std(website_a_figures)
website_b_std = np.std(website_b_figures)
website_c_std = np.std(website_c_figures)

In [0]:
# Print statistics
print('website a mean: {}'.format(website_a_mean))
print('website a std:  {}'.format(website_a_std))
print('website b mean: {}'.format(website_b_mean))
print('website b std:  {}'.format(website_b_std))
print('website c mean: {}'.format(website_c_mean))
print('website c std:  {}'.format(website_c_std))

website a mean: 58.34963608399999
website a std:  14.753704052425187
website b mean: 148.35494018599996
website b std:  13.96192849097182
website c mean: 62.361173186000016
website c std:  15.092458510871044


In [0]:
# Run test using scipy lib f_oneway
fstat, p_val = f_oneway(website_a_figures, website_b_figures, website_c_figures)

In [0]:
# print values returned from hypothesis test
print('fstat: {0:.3f}'.format(fstat))
print('p_val: {0:.6f}'.format(p_val))

fstat: 1804.592
p_val: 0.000000


## Step 3 - Collect & Analyse Results

In [0]:
print('p_val: {:.6f}'.format(p_val))
if p_val < 0.05:
  print("Result is statistically significant! There is a statistically significant difference between two or more website figures sample means!")
else:
  print("Result is not statistically significant! There is no statistically significant difference between any of the website figures datasets sample means!")

p_val: 0.000000
Result is statistically significant! There is a statistically significant difference between two or more website figures sample means!


# 4. Assumptions
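
One-way ANOVA is commonly described as relying on three assumptions: the observations are independent, the values within each group are approximately normally distributed, and the groups have roughly equal variances. Independence comes from how the data was collected, but the other two can be checked. A minimal sketch using Scipy's `shapiro` (Shapiro-Wilk normality test) and `levene` (equal-variance test); the randomly generated groups, the variable names and the 0.05 threshold are assumptions for illustration, not this notebook's data:

```python
import numpy as np
from scipy.stats import levene, shapiro

# Hypothetical groups drawn from normal distributions with equal spread
rng = np.random.default_rng(0)
toy_groups = {
    'a': rng.normal(loc=60, scale=15, size=150),
    'b': rng.normal(loc=150, scale=15, size=150),
    'c': rng.normal(loc=62, scale=15, size=150),
}

# Normality: a small p-value suggests the group is not normally distributed
shapiro_p_vals = {}
for name, values in toy_groups.items():
    shapiro_stat, shapiro_p_val = shapiro(values)
    shapiro_p_vals[name] = shapiro_p_val
    print('group {} shapiro p_val: {:.3f}'.format(name, shapiro_p_val))

# Equal variances: a small p-value suggests the group variances differ
levene_stat, levene_p_val = levene(*toy_groups.values())
print('levene p_val: {:.3f}'.format(levene_p_val))
```

A p-value below the chosen threshold in either test is a signal to treat the ANOVA result with caution rather than proof that it is invalid.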