Chapter 11
# Critical Values

Not all implementations of statistical tests return p-values.  In some cases, you must use an alternative method for interpreting the calculated test statistic directly, such as critical values.  In addition, critical values are used when estimating the expected intervals for observations from a population, such as tolerance intervals.

Some examples of statistical hypothesis tests and their distributions from which critical values can be calculated are:
- Z-Test - Gaussian distribution
- Student's t-Test - Student's t-distribution
- Chi-Squared Test - Chi-Squared distribution
- ANOVA - F-distribution

Critical values are also used when defining intervals for expected (or unexpected) observations in a distribution.
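As an illustration of such an interval, the sketch below finds the central region expected to contain 95% of observations from a Gaussian population. The mean of 50 and standard deviation of 5 are made-up values for the example. This is a simple population interval with known parameters, not a full tolerance interval estimated from a sample.

```python
# central interval expected to contain 95% of observations (illustrative values)
from scipy.stats import norm

mean, std = 50.0, 5.0   # hypothetical population parameters
coverage = 0.95
alpha = 1.0 - coverage

# split alpha evenly between the two tails
lower = norm.ppf(alpha / 2, loc=mean, scale=std)
upper = norm.ppf(1 - alpha / 2, loc=mean, scale=std)
print('Expected interval:', lower, upper)

# confirm the coverage using the cdf
covered = norm.cdf(upper, loc=mean, scale=std) - norm.cdf(lower, loc=mean, scale=std)
print('Coverage:', covered)
```

Observations falling outside this interval would be the "unexpected" ones at the chosen coverage.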

Note that a p-value can be calculated from a test statistic using the cumulative distribution function (CDF) of the test statistic's distribution. For a right-tailed test, for example, the p-value is 1 - CDF(test statistic).
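A minimal sketch of this idea for a z statistic, assuming a standard Gaussian null distribution. The helper name `p_value_from_z` and its `tail` argument are hypothetical, introduced only for illustration; `norm.sf` is SciPy's survival function, equal to 1 - CDF.

```python
# p-value from a z statistic (illustrative helper, not a library function)
from scipy.stats import norm

def p_value_from_z(z, tail='two-sided'):
    if tail == 'upper':
        return norm.sf(z)            # P(Z >= z) = 1 - CDF(z)
    if tail == 'lower':
        return norm.cdf(z)           # P(Z <= z)
    if tail == 'two-sided':
        return 2 * norm.sf(abs(z))   # probability in both tails
    raise ValueError('tail must be upper, lower or two-sided')

print('Upper-tail p-value:', p_value_from_z(1.96, 'upper'))
print('Two-sided p-value:', p_value_from_z(1.96))
```

Comparing the p-value to alpha leads to the same decision as comparing the test statistic to the matching critical value.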

For most common distributions, the critical value cannot be calculated analytically; instead it must be estimated using numerical methods.  Historically, tables of pre-calculated critical values were provided in statistics textbooks.

A critical value is a value from the distribution of the test statistic beyond which the result is significant and H0 can be rejected. For a critical value in the right tail:
- test statistic <= critical value: non-significant result, so fail to reject H0
- test statistic > critical value: significant result, so reject null hypothesis and accept H1

The observation values in the population beyond the critical value are often called the critical region, or the region of rejection.

Results are presented as either significance level (alpha) or confidence level:
- confidence level = 1 - significance level

Standard alpha values, chosen for historical reasons and retained for consistency, are often used when calculating critical values.  These alpha values include:
- 1% (alpha = 0.01) i.e. confidence 99%
- 5% (alpha = 0.05) i.e. confidence 95%
- 10% (alpha = 0.10) i.e. confidence 90%

# One-Tailed Test
A one-tailed test has a single critical value, on either the left or the right of the distribution.

Often, a one-tailed test has a critical value on the right of the distribution for non-symmetrical distributions (such as Chi-Squared).
- test statistic <= critical value: non-significant result, so fail to reject H0
- test statistic > critical value: significant result, so reject null hypothesis and accept H1
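The right-tailed rule above can be sketched as follows for a Chi-Squared statistic. The function name `right_tailed_decision`, the test statistic of 20.5 and the 10 degrees of freedom are all made up for the example.

```python
# right-tailed decision rule for a chi-squared statistic (illustrative sketch)
from scipy.stats import chi2

def right_tailed_decision(statistic, df, alpha=0.05):
    # all of alpha sits in the right tail
    critical = chi2.ppf(1 - alpha, df)
    reject = bool(statistic > critical)
    return critical, reject

critical, reject = right_tailed_decision(20.5, df=10)
print('Critical Value:', critical)
print('Reject H0:', reject)
```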

# Two-Tailed Test
A two-tailed test has 2 critical values, one on each side of the distribution, which is often assumed to be symmetrical (e.g. Gaussian and Student's t-distributions).

When using a two-tailed test, the significance level (alpha) used in the calculation of critical values must be divided by 2.  Each critical value then uses half of this alpha, one on each side of the distribution.

For example, with alpha = 5%, the acceptance area in the middle of the distribution covers 95%, leaving 2.5% in each tail:
- lower critical value <= test statistic <= upper critical value: non-significant result, so fail to reject H0
- test statistic < lower critical value or test statistic > upper critical value: significant result, so reject null hypothesis and accept H1

If the distribution is symmetric around a mean of zero, we can shortcut the check by comparing the absolute value of the test statistic to the upper critical value: 
- |test statistic| <= upper critical value: non-significant result, so fail to reject H0
- |test statistic| > upper critical value: significant result, so reject null hypothesis and accept H1
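A minimal sketch of the two-tailed rule and its absolute-value shortcut, assuming a Student's t-distribution, which is symmetric around zero. The function name `two_tailed_decision`, the 10 degrees of freedom and the test statistics are hypothetical values for illustration.

```python
# two-tailed decision rule for a t statistic (illustrative sketch)
from scipy.stats import t

def two_tailed_decision(statistic, df, alpha=0.05):
    # half of alpha goes in each tail
    upper = t.ppf(1 - alpha / 2, df)
    lower = -upper   # valid because the distribution is symmetric around zero
    reject = bool(abs(statistic) > upper)
    return lower, upper, reject

lower, upper, reject = two_tailed_decision(2.5, df=10)
print('Critical Values:', lower, upper)
print('Reject H0:', reject)
```

For a non-symmetric distribution the shortcut does not apply, and the lower critical value would need its own `ppf(alpha / 2, ...)` call.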


# How to Calculate Critical Values
Remember that the Cumulative Distribution Function (CDF) returns the probability of an observation being less than or equal to a specific value from the distribution.

We want the opposite: given a specific probability, return the observation value at or below which that proportion of the distribution lies.

This is known as the Percent Point Function (PPF).  A value from the distribution will be less than or equal to the value returned from the PPF with the specified probability.

The PPF can be calculated using the ppf() function in SciPy. The inverse survival function, isf(), gives the same result when called with the complementary probability: isf(alpha) equals ppf(1 - alpha).
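As a quick check of how the two functions relate, the sketch below computes the same right-tail critical value both ways for a standard Gaussian. The alpha of 0.05 is just an example value.

```python
# ppf and isf give the same right-tail critical value
from scipy.stats import norm

alpha = 0.05
upper_from_ppf = norm.ppf(1 - alpha)   # takes the probability to the left
upper_from_isf = norm.isf(alpha)       # takes the probability to the right
print('ppf(1 - alpha):', upper_from_ppf)
print('isf(alpha):', upper_from_isf)
```

Using isf() with alpha directly can read more naturally for right-tailed tests, since no subtraction from 1 is needed.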

Note the 68-95-99.7 rule, which states:
- ~68% of values lie within one standard deviation of the mean
- ~95% of values lie within two standard deviations of the mean 
- ~99.7% of values lie within three standard deviations of the mean

The above is an approximation for simplicity:
- 95.45% is a more accurate % corresponding to 2 standard deviations
- 1.96 standard deviations is a more accurate figure corresponding to 95%
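The figures above can be checked with the Gaussian CDF and PPF. This is a small sketch; the helper name `central_coverage` is hypothetical.

```python
# check the 68-95-99.7 rule for the standard Gaussian
from scipy.stats import norm

def central_coverage(k):
    # probability of falling within k standard deviations of the mean
    return norm.cdf(k) - norm.cdf(-k)

for k in (1, 2, 3):
    print('Within', k, 'standard deviations:', central_coverage(k))

# number of standard deviations that covers exactly 95% (2.5% in each tail)
print('Standard deviations for 95%:', norm.ppf(0.975))
```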

In [26]:
# gaussian percent point function
from scipy.stats import norm

# define probability
p = 0.95

# retrieve the value that marks 95% or less of the observations from the distribution
value = norm.ppf(p)
print('Critical Value:', value)

# confirm by using cdf to retrieve the probability
p = norm.cdf(value)
print('Probability:', p)

Critical Value: 1.6448536269514722
Probability: 0.95


In [27]:
# student t-distribution percent point function
from scipy.stats import t

# define probability and degrees of freedom
p = 0.95
df = 10

# retrieve the value at or below which 95% of the distribution lies
value = t.ppf(p, df)
print('Critical Value:', value)

# confirm by using cdf to retrieve the probability
p = t.cdf(value, df)
print('Probability:', p)

Critical Value: 1.8124611228107335
Probability: 0.949999999999923


In [28]:
# chi-squared percent point function
from scipy.stats import chi2

# define probability and degrees of freedom
p = 0.95
df = 10

# retrieve the value at or below which 95% of the distribution lies
value = chi2.ppf(p, df)
print('Critical Value:', value)

# confirm by using cdf to retrieve the probability
p = chi2.cdf(value, df)
print('Probability:', p)

Critical Value: 18.307038053275146
Probability: 0.95


# Extensions

In [29]:
# calculate critical values for 90%, 95% and 99% for the Gaussian distribution
from scipy.stats import norm

def get_critical_value(p):
    return norm.ppf(p)

print('Probability 0.90 : Critical Value', get_critical_value(0.90))
print('Probability 0.95 : Critical Value', get_critical_value(0.95))
print('Probability 0.99 : Critical Value', get_critical_value(0.99))

Probability 0.90 : Critical Value 1.2815515655446004
Probability 0.95 : Critical Value 1.6448536269514722
Probability 0.99 : Critical Value 2.3263478740408408


In [30]:
# calculate the cumulative probability at a critical value for the Gaussian distribution
from scipy.stats import norm

def get_probability(critical_value):
    return norm.cdf(critical_value)

print('Critical Value 1.96 : Probability', get_probability(1.96))

Critical Value 1.96 : Probability 0.9750021048517795


In [31]:
# calculate critical values for 90%, 95% and 99% for the Student's t-distribution
from scipy.stats import t

def get_critical_value(p, df):
    return t.ppf(p, df)

print('Probability 0.90 : Df 10 : Critical Value', get_critical_value(0.90, 10))
print('Probability 0.95 : Df 10 : Critical Value', get_critical_value(0.95, 10))
print('Probability 0.99 : Df 10 : Critical Value', get_critical_value(0.99, 10))

Probability 0.90 : Df 10 : Critical Value 1.3721836411102863
Probability 0.95 : Df 10 : Critical Value 1.8124611228107335
Probability 0.99 : Df 10 : Critical Value 2.763769457447889


In [32]:
# calculate the cumulative probability at a critical value for the Student's t-distribution
from scipy.stats import t

def get_probability(critical_value, df):
    return t.cdf(critical_value, df)

print('Critical Value 2.0 : Df 60 : Probability', get_probability(2.0, 60))
print('Critical Value 1.812 : Df 10 : Probability', get_probability(1.812, 10))

Critical Value 2.0 : Df 60 : Probability 0.9749834781742712
Critical Value 1.812 : Df 10 : Probability 0.9499623689670763


In [33]:
# calculate critical values for 90%, 95% and 99% for the Chi-Squared distribution
from scipy.stats import chi2

def get_critical_value(p, df):
    return chi2.ppf(p, df)

print('Probability 0.90 : Df 10 : Critical Value', get_critical_value(0.90, 10))
print('Probability 0.95 : Df 10 : Critical Value', get_critical_value(0.95, 10))
print('Probability 0.99 : Df 10 : Critical Value', get_critical_value(0.99, 10))

Probability 0.90 : Df 10 : Critical Value 15.987179172105265
Probability 0.95 : Df 10 : Critical Value 18.307038053275146
Probability 0.99 : Df 10 : Critical Value 23.209251158954356


In [34]:
# calculate the cumulative probability at a critical value for the Chi-Squared distribution
from scipy.stats import chi2

def get_probability(critical_value, df):
    return chi2.cdf(critical_value, df)

print('Critical Value 18.31 : Df 10 : Probability', get_probability(18.31, 10))
print('Critical Value 20 : Df 5 : Probability', get_probability(20, 5))

Critical Value 18.31 : Df 10 : Probability 0.9500458336563032
Critical Value 20 : Df 5 : Probability 0.9987502694369687


In [35]:
# calculate critical values for 90%, 95% and 99% for the F distribution
from scipy.stats import f

def get_critical_value(p):
    return f.ppf(p, dfn=4, dfd=26)

print('Degrees of Freedom: d1 = 4, d2 = 26')
print('Probability 0.90 : Critical Value', get_critical_value(0.90))
print('Probability 0.95 : Critical Value', get_critical_value(0.95))
print('Probability 0.99 : Critical Value', get_critical_value(0.99))

Degrees of Freedom: d1 = 4, d2 = 26
Probability 0.90 : Critical Value 2.174469193431442
Probability 0.95 : Critical Value 2.7425941372218587
Probability 0.99 : Critical Value 4.1399604836950115
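
Following the pattern of the earlier cells, the cumulative probability at an F critical value could be recovered with the CDF. This is a sketch that reuses the same degrees of freedom (4 and 26); unlike the cell above, it passes them as arguments rather than fixing them inside the function.

```python
# calculate the cumulative probability at a critical value for the F distribution
from scipy.stats import f

def get_probability(critical_value, dfn, dfd):
    return f.cdf(critical_value, dfn, dfd)

dfn, dfd = 4, 26
for p in (0.90, 0.95, 0.99):
    critical = f.ppf(p, dfn, dfd)
    print('Critical Value', critical, ': Probability', get_probability(critical, dfn, dfd))
```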
