Chapter 13
# Significance Tests

Parametric statistical tests assume that a data sample was drawn from a specific population distribution.

A typical question we may have about 2 or more samples of data is whether they have the same distribution.

Parametric statistical significance tests are statistical methods that assume the data comes from a Gaussian distribution; the tests below ask whether two or more samples come from the same Gaussian distribution, i.e. one with the same mean and standard deviation.

In general, each test calculates a test statistic that must be interpreted with some background in statistics and a deeper knowledge of the statistical test itself.

Tests also return a p-value: the probability of observing a test statistic at least as extreme as the one calculated, given the null hypothesis that the samples were drawn from a population with the same distribution.
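
One way to build intuition for the p-value is to simulate it. The sketch below is an illustration only: it assumes two samples generated the same way as the test data used in this chapter, and an arbitrary count of 5000 simulated sample pairs for which the null hypothesis is true. The simulated p-value is the fraction of those pairs whose statistic is at least as extreme as the observed one.

```python
# Monte Carlo illustration of a two-sided p-value for the t-test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_ind

# seed the random number generator
seed(1)

# observed samples (assumed: same recipe as the chapter's test data)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
obs_stat, obs_p = ttest_ind(data1, data2)

# simulate many sample pairs for which H0 is true (same mean and stdv)
n_sims = 5000  # assumption: arbitrary number of simulations
null1 = 5 * randn(n_sims, 100) + 50
null2 = 5 * randn(n_sims, 100) + 50
null_stats, _ = ttest_ind(null1, null2, axis=1)

# fraction of null statistics at least as extreme as the observed one
sim_p = (abs(null_stats) >= abs(obs_stat)).mean()
print('Statistics=%.3f, analytic p=%.3f, simulated p=%.3f' % (obs_stat, obs_p, sim_p))
```

The simulated value should land close to the analytic one, with some Monte Carlo error that shrinks as n_sims grows.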

The p-value can be interpreted in the context of a chosen significance level (alpha), commonly 5% or 0.05:
- p-value <= alpha - significant result, so reject null hypothesis i.e. distributions differ
- p-value > alpha - non-significant result, so fail to reject null hypothesis i.e. no evidence that the distributions differ
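
The decision rule above can be written as a small helper. This is a minimal sketch; the function name interpret() is hypothetical and is not part of SciPy.

```python
# decision rule for a p-value at a chosen significance level
def interpret(p, alpha=0.05):
    """Return the decision for p-value p at significance level alpha."""
    if p <= alpha:
        return 'reject H0'
    return 'fail to reject H0'

# example p-values on either side of alpha
print(interpret(0.025))
print(interpret(0.679))
```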

# Test Data
Define a test dataset we can use to demonstrate each test.

Generate two samples from different Gaussian distributions:
- the first is scaled to have a mean of 50 and standard deviation of 5
- the second is scaled to have a mean of 51 and a standard deviation of 5

We are using a small sample size of 100 observations per sample, which adds some noise to the decision of whether or not the samples have been drawn from differing distributions.

In [1]:
# generate gaussian data samples
from numpy.random import seed
from numpy.random import randn
from numpy import mean
from numpy import std

# seed the random number generator
seed(1)

# generate two sets of univariate observations
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# summarize
print('data1: mean=%.3f stdv=%.3f' % (mean(data1), std(data1)))
print('data2: mean=%.3f stdv=%.3f' % (mean(data2), std(data2)))

data1: mean=50.303 stdv=4.426
data2: mean=51.764 stdv=4.660


# Student's t-Test
This is a statistical hypothesis test that 2 independent data samples, known to have a Gaussian distribution, have the same Gaussian distribution.

Null hypothesis (H0) is that the means of the two populations are equal.  A rejection of the hypothesis indicates there is sufficient evidence that the means are different, and therefore the distributions are not equal.

The t-Test is available via the SciPy function ttest_ind(), which accepts two data samples as arguments, and returns the calculated statistic and p-value.

The test assumes both samples have the same variance.  If this is not the case, a corrected version of the test (Welch's t-test) can be used by setting equal_var=False.
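
As a minimal sketch of the corrected version, the example below runs ttest_ind() with and without equal_var=False on samples generated with the same recipe as the test data. The variable names are illustrative.

```python
# compare the pooled-variance t-test with the unequal-variance version
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_ind

# seed the random number generator
seed(1)

# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# default: assumes equal variances
stat_pooled, p_pooled = ttest_ind(data1, data2)

# corrected: does not assume equal variances
stat_welch, p_welch = ttest_ind(data1, data2, equal_var=False)

print('pooled: Statistics=%.3f, p=%.3f' % (stat_pooled, p_pooled))
print('welch:  Statistics=%.3f, p=%.3f' % (stat_welch, p_welch))
```

Because both samples here have the same size, the two statistics coincide and only the degrees of freedom, and therefore the p-value, can differ slightly. The correction matters more when sample sizes and variances are both unequal.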

In [2]:
# student's t-test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_ind

# seed the random number generator
seed(1)

# generate two independent samples as above
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# compare samples
stat, p = ttest_ind(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))

# interpret using a 5% significance level
alpha = 0.05
if p > alpha:
    print('Same distributions (fail to reject H0)')
else:
    print('Different distributions (reject H0)')

Statistics=-2.262, p=0.025
Different distributions (reject H0)
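
To show what the test computes, the statistic can be reproduced by hand. The sketch below assumes the standard pooled-variance form of the test, which is the behavior of ttest_ind() when equal variances are assumed, and checks the result against SciPy.

```python
# student's t-test calculated by hand
from math import sqrt
from numpy import mean
from numpy import var
from numpy.random import seed
from numpy.random import randn
from scipy.stats import t
from scipy.stats import ttest_ind

# seed the random number generator
seed(1)

# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# sample sizes and unbiased sample variances
n1, n2 = len(data1), len(data2)
v1, v2 = var(data1, ddof=1), var(data2, ddof=1)

# pooled variance and standard error of the difference in means
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
se = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))

# test statistic and two-sided p-value from the t distribution
stat = (mean(data1) - mean(data2)) / se
p = 2.0 * t.sf(abs(stat), df)
print('Statistics=%.3f, p=%.3f' % (stat, p))

# reference values from SciPy
ref_stat, ref_p = ttest_ind(data1, data2)
print('SciPy: Statistics=%.3f, p=%.3f' % (ref_stat, ref_p))
```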


# Paired Student's t-Test
We may wish to compare the means between two data samples that are related in some way, e.g. two independent measures or evaluations of the same object.

Because the samples are not independent, we cannot use the Student's t-test.  Instead, we must use a modified version of the test that corrects for the fact that the samples are dependent, called the paired Student's t-test.

The null hypothesis is that there is no difference in the means between the samples.  The rejection of the null hypothesis indicates that there is enough evidence that the sample means are different:
- fail to reject H0 - paired sample distributions are equal
- reject H0 - paired sample distributions are not equal

The paired Student's t-test is available via the SciPy function ttest_rel(), which accepts two data samples as arguments, and returns the calculated statistic and p-value.

In [3]:
# paired student's t-test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_rel

# seed the random number generator
seed(1)

# generate two independent samples, as before.
# Although the samples are independent, not paired, we can pretend for the
# sake of the demonstration that the observations are paired and calculate
# the statistic
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# compare samples
stat, p = ttest_rel(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))

# interpret using a 5% significance level   
alpha = 0.05
if p > alpha:
    print('Same distributions (fail to reject H0)')
else:
    print('Different distributions (reject H0)')

Statistics=-2.372, p=0.020
Different distributions (reject H0)
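
One way to see how the paired test differs from the independent test is that it works on the per-pair differences. The sketch below relies on the fact that a paired t-test is equivalent to a one-sample t-test of the differences against a mean of zero, using SciPy's ttest_1samp().

```python
# paired t-test expressed as a one-sample t-test on the differences
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_rel
from scipy.stats import ttest_1samp

# seed the random number generator
seed(1)

# generate two samples and treat them as paired for the demonstration
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# paired test on the two samples
stat_rel, p_rel = ttest_rel(data1, data2)

# one-sample test that the mean per-pair difference is zero
stat_one, p_one = ttest_1samp(data1 - data2, 0.0)

print('ttest_rel:   Statistics=%.3f, p=%.3f' % (stat_rel, p_rel))
print('ttest_1samp: Statistics=%.3f, p=%.3f' % (stat_one, p_one))
```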


# Analysis of Variance Test (ANOVA)
There are sometimes situations where we may have multiple independent data samples.  We can perform the Student's t-test pairwise on each combination of the data samples to get an idea of which samples have different means.  This can be onerous if we are only interested in whether or not all samples have the same distribution, so we can use the Analysis of Variance test (ANOVA) instead.

ANOVA is a statistical test whose null hypothesis is that the means across 2 or more groups are equal:
- fail to reject H0 - all sample distributions are equal
- reject H0 - one or more sample distributions are not equal

Importantly, the test can only comment on whether or not all samples are the same: it cannot identify which samples differ, or by how much.

The test requires that:
- the data samples have a Gaussian distribution
- the samples are independent
- all data samples have the same standard deviation
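
These requirements can be checked informally before running the test. The sketch below is one possible approach and is not part of the original example: it uses SciPy's shapiro() (Shapiro-Wilk test of normality) on each sample and levene() (Levene's test for equal variances) across the samples. Independence depends on how the data was collected and cannot be checked this way.

```python
# informal checks of the ANOVA requirements
from numpy.random import seed
from numpy.random import randn
from scipy.stats import shapiro
from scipy.stats import levene

# seed the random number generator
seed(1)

# generate three independent samples
samples = [5 * randn(100) + 50, 5 * randn(100) + 50, 5 * randn(100) + 52]

# Gaussian check for each sample (H0: the sample is Gaussian)
for i, sample in enumerate(samples, start=1):
    stat, p = shapiro(sample)
    print('data%d: Shapiro-Wilk Statistics=%.3f, p=%.3f' % (i, stat, p))

# equal variance check across samples (H0: all variances are equal)
lev_stat, lev_p = levene(*samples)
print('Levene Statistics=%.3f, p=%.3f' % (lev_stat, lev_p))
```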

The ANOVA test is available via the SciPy function f_oneway(), which takes two or more data samples as arguments and returns the test statistic and p-value.

In [5]:
# analysis of variance test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import f_oneway

# seed the random number generator
seed(1)

# generate three independent samples - two the same and one different
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 50
data3 = 5 * randn(100) + 52

# compare samples
stat, p = f_oneway(data1, data2, data3)
print('Statistics=%.3f, p=%.3f' % (stat, p))

# interpret using a 5% significance level
alpha = 0.05
if p > alpha:
    print('Same distributions (fail to reject H0)')
else:
    print('Different distributions (reject H0)')

Statistics=3.655, p=0.027
Different distributions (reject H0)
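
Since ANOVA cannot say which samples differ, a follow-up is to run the Student's t-test on each pair of samples, as mentioned at the start of this section. The sketch below does this and, as an added assumption not taken from the original text, divides alpha by the number of comparisons (a Bonferroni correction) to account for running several tests.

```python
# pairwise student's t-tests as a follow-up to ANOVA
from itertools import combinations
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_ind

# seed the random number generator
seed(1)

# generate three independent samples - two the same and one different
samples = {
    'data1': 5 * randn(100) + 50,
    'data2': 5 * randn(100) + 50,
    'data3': 5 * randn(100) + 52,
}

# every pair of sample names
pairs = list(combinations(samples, 2))

# assumption: Bonferroni-adjusted significance level
alpha = 0.05
adjusted_alpha = alpha / len(pairs)

# compare each pair of samples
results = {}
for a, b in pairs:
    stat, p = ttest_ind(samples[a], samples[b])
    results[(a, b)] = (stat, p)
    decision = 'reject H0' if p <= adjusted_alpha else 'fail to reject H0'
    print('%s vs %s: Statistics=%.3f, p=%.3f (%s)' % (a, b, stat, p, decision))
```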


# Repeated Measures ANOVA Test
We may have multiple data samples that are related or dependent in some way, e.g. repeating the same measurements on a subject at different time periods.

As we will have multiple paired samples, we could repeat the paired Student's t-test multiple times.

Alternatively, we can use a single test to check if all of the samples have the same mean, called the repeated measures ANOVA test.

The null hypothesis is that all paired samples have the same mean, and therefore the same distribution:
- fail to reject H0 - all paired sample distributions are equal
- reject H0 - one or more paired sample distributions are not equal

Unfortunately, at the time of writing, there is no version of the repeated measures ANOVA test available in SciPy.
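
The test can still be sketched by hand with NumPy and the F distribution from SciPy. The function below, rm_anova(), is a hypothetical helper written for illustration, not a SciPy function. It assumes one within-subject factor, complete data with one observation per subject per condition in the same subject order, and it applies no sphericity correction. Other libraries, such as statsmodels with its AnovaRM class, provide maintained implementations.

```python
# one-way repeated measures ANOVA calculated by hand
from numpy import column_stack
from numpy.random import seed
from numpy.random import randn
from scipy.stats import f

def rm_anova(*samples):
    """Hypothetical helper: return (F statistic, p-value) for paired samples."""
    # rows are subjects, columns are conditions
    data = column_stack(samples)
    n, k = data.shape
    grand_mean = data.mean()
    # variation between conditions and between subjects
    ss_cond = n * ((data.mean(axis=0) - grand_mean) ** 2).sum()
    ss_subj = k * ((data.mean(axis=1) - grand_mean) ** 2).sum()
    # remaining variation after removing both effects
    ss_total = ((data - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    stat = (ss_cond / df_cond) / (ss_error / df_error)
    p = f.sf(stat, df_cond, df_error)
    return stat, p

# seed the random number generator
seed(1)

# generate three samples and treat them as paired for the demonstration
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 50
data3 = 5 * randn(100) + 52

# compare samples
stat, p = rm_anova(data1, data2, data3)
print('Statistics=%.3f, p=%.3f' % (stat, p))
```

With only two samples the F statistic equals the square of the paired Student's t-test statistic, which gives a simple check of the calculation.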

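# Extensions
The cells below rerun each test on samples drawn from the same distribution.  These cells do not reseed the random number generator, so the exact values depend on the state left by earlier cells.
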
In [8]:
# update student's t-test example to operate on data samples with the same distribution
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_ind

# generate two independent samples from the same distribution
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 50

# compare samples
stat, p = ttest_ind(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))

# interpret using a 5% significance level
alpha = 0.05
if p > alpha:
    print('Same distributions (fail to reject H0)')
else:
    print('Different distributions (reject H0)')

Statistics=-0.415, p=0.679
Same distributions (fail to reject H0)


In [9]:
# update paired student's t-test example to operate on data samples with the same distribution
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_rel

# generate two independent samples from the same distribution.
# Although the samples are independent, not paired, we can pretend for the
# sake of the demonstration that the observations are paired and calculate
# the statistic
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 50

# compare samples
stat, p = ttest_rel(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))

# interpret using a 5% significance level   
alpha = 0.05
if p > alpha:
    print('Same distributions (fail to reject H0)')
else:
    print('Different distributions (reject H0)')

Statistics=0.290, p=0.772
Same distributions (fail to reject H0)


In [10]:
# update analysis of variance test example to operate on data samples with the same distribution
from numpy.random import seed
from numpy.random import randn
from scipy.stats import f_oneway

# generate three independent samples - all from the same distribution
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 50
data3 = 5 * randn(100) + 50

# compare samples
stat, p = f_oneway(data1, data2, data3)
print('Statistics=%.3f, p=%.3f' % (stat, p))

# interpret using a 5% significance level
alpha = 0.05
if p > alpha:
    print('Same distributions (fail to reject H0)')
else:
    print('Different distributions (reject H0)')

Statistics=0.166, p=0.847
Same distributions (fail to reject H0)
