# Understanding Critical Values

Interpreting the result of a statistical hypothesis test usually means interpreting a p-value.
- However, not all statistical tests return a p-value. In such cases, the result is interpreted with an alternative measure known as a critical value.
- Critical values are also used when estimating the expected interval for observations from a population, as in tolerance intervals.

This tutorial aims to introduce you to critical values, explaining their significance, application, and how to calculate them using Python and SciPy.

By the end of this tutorial, you will gain knowledge about:

1. Real-world examples of statistical hypothesis tests and the associated distributions from which critical values can be derived and applied.

2. A clear understanding of how critical values are used in one-tail and two-tail statistical hypothesis tests.

3. Step-by-step guidance on calculating critical values for three important distributions: Gaussian, Student’s t, and Chi-Squared.


---


## Why Do We Need Critical Values?

- Critical values are derived from the distribution of the test statistic and offer a benchmark against which we can compare our calculated statistic. Here are some instances where we use critical values:

  1. **Z-Test:** For tests involving a Gaussian distribution.
  2. **Student’s t-Test:** Used with Student’s t-distribution.
  3. **Chi-Squared Test:** Applicable when dealing with Chi-Squared distribution.
  4. **ANOVA:** Appropriate when working with an F-distribution.

- Furthermore, critical values find application in defining intervals for expected or unexpected observations in distributions.
  - They are particularly useful when we aim to quantify the uncertainty in estimated statistics, confidence intervals, or tolerance intervals.

- It's worth noting that if you have a test statistic and want a p-value, you can compute it from the cumulative distribution function (CDF) of the test statistic's distribution under the null hypothesis. For an upper-tailed test, for example, the p-value is one minus the CDF evaluated at the observed statistic.
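
As a minimal sketch of that last point, the snippet below turns a z statistic into one-tailed and two-tailed p-values. The statistic `z_stat = 2.1` is a made-up value for illustration, and the use of SciPy's survival function `sf()` (one minus the CDF) is our choice here, not something the tutorial prescribes.

```python
from scipy.stats import norm

# Hypothetical z statistic, chosen only for illustration
z_stat = 2.1

# Upper-tailed p-value: P(Z >= z_stat); sf() is 1 - cdf()
p_upper = norm.sf(z_stat)

# Two-tailed p-value: a result at least this extreme in either tail
p_two_tailed = 2 * norm.sf(abs(z_stat))

print(f"upper-tailed p-value: {p_upper:.4f}")
print(f"two-tailed p-value:   {p_two_tailed:.4f}")
```

Comparing either p-value with alpha should lead to the same decision as comparing `z_stat` with the matching critical value.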

---


## What Is a Critical Value?

- A critical value is defined with respect to a distribution and a probability.
  - An observation from the distribution falls at or below the critical value with the given probability. Here's how we write it down: $P(X \leq \text{critical value}) = \text{probability}$

    - $P$ is the probability of the event in parentheses.
    - $X$ is an observation drawn from the distribution.
    - "Critical value" is the threshold we want to find.
    - "Probability" is the chance that an observation is less than or equal to that threshold.

- Critical values usually can't be written down with a simple formula. For most common distributions they are computed numerically, which is why statistics textbooks traditionally printed tables of them in the back.

- Critical values are central to statistical tests. There, the probability of interest is expressed through the "significance level," written $\alpha$. It is the complement of the probability above: $\text{probability} = 1 - \alpha$

- Usually, we use one of a few conventional values for $\alpha$, such as:
  - 1% (alpha = 0.01)
  - 5% (alpha = 0.05)
  - 10% (alpha = 0.10)
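
To make the relationship between alpha and the probability concrete, here is a small sketch that converts each conventional alpha into its upper-tail critical value. It assumes a standard Gaussian test statistic, which is our choice for illustration; the same pattern should apply to other distributions.

```python
from scipy.stats import norm

# Conventional significance levels listed above
alphas = [0.01, 0.05, 0.10]

# Upper-tail critical value: the point with probability 1 - alpha at or below it
critical_values = {alpha: norm.ppf(1 - alpha) for alpha in alphas}

for alpha, cv in critical_values.items():
    print(f"alpha = {alpha:.2f} -> probability = {1 - alpha:.2f}, critical value = {cv:.3f}")
```

A smaller alpha pushes the critical value further into the tail, so the test demands stronger evidence before rejecting the null hypothesis.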

Critical values give us a different way to understand statistical tests, kind of like p-values.


---



## How to Use Critical Values

Critical values act as a threshold for interpreting the results of a statistical test.
- When conducting a statistical analysis, you compare your calculated statistic to the critical value to determine if your findings support your hypothesis.
- If your statistic falls beyond this critical value, it enters what's known as the "critical region" or the "region of rejection."

- **Critical Value**: This is the threshold for a given test and significance level, traditionally looked up in a table and now usually computed in software.
  - It marks the point at which you can reject the null hypothesis, because a calculated statistic at or beyond it lies in the rejection region.

- Remember, statistical tests can be either one-tailed or two-tailed, depending on the nature of your research question and hypothesis.

---


## Understanding One-Tailed Tests

- In a one-tailed test, there's just one critical value, positioned either on the left or the right side of a distribution.
  - It's particularly useful for non-symmetrical distributions, like the Chi-Squared distribution.

- Here's how it works:

  - We have a statistic that we've calculated.
  - We compare this statistic to the critical value.
  - If the statistic is less than the critical value, we can't reject the null hypothesis (H0). We consider the result not significant.
  - If the statistic is greater than or equal to the critical value, we have a significant result, and we reject the null hypothesis (H0).
  - These comparisons describe a right-tailed test. For a left-tailed test, the inequalities are reversed.

- In simple terms:
  - If Test Statistic < Critical Value: Not a significant result, we can't reject the null hypothesis (H0).
  - If Test Statistic ≥ Critical Value: It's a significant result, and we reject the null hypothesis (H0).
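
The decision rule above can be sketched as a small helper. The function name `one_tailed_decision`, the degrees of freedom, and the two test statistics are hypothetical values picked for illustration; the rule shown is the upper-tailed one.

```python
from scipy.stats import chi2

def one_tailed_decision(statistic, critical_value):
    """Upper-tailed rule: reject H0 when the statistic reaches the critical value."""
    return "reject H0" if statistic >= critical_value else "fail to reject H0"

alpha = 0.05
df = 10  # hypothetical degrees of freedom

# Upper-tail critical value of the Chi-Squared distribution
critical_value = chi2.ppf(1 - alpha, df)
print(f"critical value: {critical_value:.3f}")

# Hypothetical test statistics
decisions = {stat: one_tailed_decision(stat, critical_value) for stat in (12.0, 20.5)}
for stat, decision in decisions.items():
    print(f"statistic = {stat}: {decision}")
```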


---


## Understanding Two-Tailed Tests

- In a two-tailed test, there are two critical values, one on each side of the distribution, which is usually assumed to be symmetrical (e.g., the Gaussian and Student’s t-distributions).

- Here's how it works:

  - We typically use a significance level (alpha) when calculating the critical values.
    - For a two-tailed test, we split alpha into two equal parts.
    - Imagine a 5% alpha, which gets divided into two alpha values of 2.5% on each side of the distribution. The middle 95% represents the acceptance area.
  - We call these critical values the "lower" and "upper" critical values for the left and right sides of the distribution.

- Now, interpreting the results:

  - If the test statistic falls between the lower and upper critical values: It's not a significant result, and we don't reject the null hypothesis (H0).
  - If the test statistic is less than the lower critical value or greater than the upper critical value: It's a significant result, and we reject the null hypothesis (H0).

- Simplifying further, if the distribution is symmetric around a mean of zero:

  - If the absolute value of the test statistic is less than the upper critical value, it's not a significant result, and we don't reject the null hypothesis (H0).
  - If the absolute value of the test statistic is greater than or equal to the upper critical value, it's a significant result, and we reject the null hypothesis (H0). In a test that compares two samples, this is evidence that their distributions differ.

- In summary:
  - If Lower Critical Value < Test Statistic < Upper Critical Value: Not significant, we don't reject the null hypothesis (H0).
  - If Test Statistic ≤ Lower Critical Value OR Test Statistic ≥ Upper Critical Value: Significant result, we reject the null hypothesis (H0).
  - If |Test Statistic| < Upper Critical Value: Not significant, we don't reject the null hypothesis (H0); there is no evidence that the distributions differ.
  - If |Test Statistic| ≥ Upper Critical Value: Significant result, we reject the null hypothesis (H0); the distributions appear to differ.

    - Where |Test Statistic| represents the absolute value of the calculated test statistic.
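
A minimal sketch of the two-tailed rule follows, assuming a standard Gaussian test statistic. The helper names and the three test statistics are hypothetical. It also checks that the absolute-value shortcut gives the same answer as the two-sided comparison.

```python
from scipy.stats import norm

def two_tailed_critical_values(alpha):
    """Return (lower, upper) critical values of the standard Gaussian."""
    return norm.ppf(alpha / 2), norm.ppf(1 - alpha / 2)

def is_significant(statistic, lower, upper):
    """Reject H0 when the statistic falls at or beyond either critical value."""
    return bool(statistic <= lower or statistic >= upper)

alpha = 0.05
lower, upper = two_tailed_critical_values(alpha)
print(f"lower = {lower:.3f}, upper = {upper:.3f}")

# Hypothetical test statistics
results = {stat: is_significant(stat, lower, upper) for stat in (-2.3, 0.8, 1.7)}
for stat, significant in results.items():
    # Shortcut for a distribution that is symmetric around zero
    shortcut = bool(abs(stat) >= upper)
    print(f"statistic = {stat}: significant = {significant}, shortcut agrees = {shortcut == significant}")
```

Note that 1.7 would be significant in an upper-tailed test at the same alpha (critical value about 1.645) but is not significant here, because each tail only receives half of alpha.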

---


## Calculating Critical Values

Density functions help us understand the likelihood of an observation in a distribution. Here are two important concepts:

- **Probability Density Function (PDF)**: It gives the relative likelihood (density) of an observation taking a particular value from the distribution.
- **Cumulative Distribution Function (CDF)**: It gives the probability of an observation being less than or equal to a particular value from the distribution. It is sometimes loosely called the "cumulative density function."

To calculate a critical value, we need a function that can take a probability (or significance level) and provide the observation value from the distribution. This function is called the **Percent Point Function (PPF)** or, more broadly, the **Quantile Function**.

- **Percent Point Function (PPF)**: It returns the value at or below which an observation from the distribution falls with the given probability. It is the inverse of the CDF.

In simple terms, if the PPF returns a value for a specific probability, then an observation from the distribution is less than or equal to that value with that probability.

Let's illustrate this with three commonly used distributions for which we often need to calculate critical values: the Gaussian distribution, Student’s t-distribution, and the Chi-Squared distribution. You can calculate the PPF using the `ppf()` function in SciPy. You may also come across the inverse survival function, `isf()`, in third-party code. It inverts the survival function (one minus the CDF), so `isf(alpha)` returns the same value as `ppf(1 - alpha)`.
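
As a quick check of how `ppf()`, `cdf()`, and `isf()` relate in SciPy, the sketch below uses the standard Gaussian with alpha = 0.05. The variable names are ours.

```python
from scipy.stats import norm

alpha = 0.05

# Upper-tail critical value via the percent point function
cv_from_ppf = norm.ppf(1 - alpha)

# Same value via the inverse survival function, which takes the tail probability directly
cv_from_isf = norm.isf(alpha)

# The CDF maps the critical value back to the probability
round_trip = norm.cdf(cv_from_ppf)

print(cv_from_ppf, cv_from_isf, round_trip)
```

Using `isf(alpha)` can read more naturally when you think in terms of the significance level rather than the cumulative probability.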


---


## Calculating Critical Values for the Gaussian Distribution

- In this example, we calculate the critical value at a cumulative probability of 95% on the standard Gaussian distribution. This is the one-tailed (upper) critical value for a significance level of 5%: 95% of observations fall at or below it.

- When you run this code, it first prints a value, approximately 1.645, which means that 95% of the observations from the standard Gaussian distribution fall at or below this value.
  - This value is then confirmed by passing it to the cumulative distribution function (CDF), which returns 95%, as expected.

  - Note that 1.645 standard deviations above the mean bounds the lower 95% of the distribution, not the middle 95%. The middle 95% lies within about 1.96 standard deviations of the mean, which is the "about 2 standard deviations" of the common "68-95-99.7 rule."

---

```python
# Gaussian Percent Point Function
from scipy.stats import norm

# Define the cumulative probability (95%)
p = 0.95

# Retrieve the value that corresponds to the probability
value = norm.ppf(p)
print(value)

# Confirm the probability with the cumulative distribution function (CDF)
p = norm.cdf(value)
print(p)
```

```
1.6448536269514722
0.95
```
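
To connect this with the "68-95-99.7 rule," the following sketch computes how much of the standard Gaussian lies within 1, 2, and 3 standard deviations of the mean, and then the two-sided critical value that bounds exactly the middle 95%. The variable names are ours.

```python
from scipy.stats import norm

# Share of the distribution within k standard deviations of the mean
coverage = {k: norm.cdf(k) - norm.cdf(-k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} standard deviation(s): {share:.4f}")

# Two-sided critical value: 2.5% in each tail leaves the middle 95%
two_sided_cv = norm.ppf(0.975)
print(f"two-sided 95% critical value: {two_sided_cv:.3f}")
```

The rule's "95%" is a rounding of the coverage at 2 standard deviations; the exact middle 95% ends slightly earlier, at about 1.96.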


## Calculating Critical Values for the Student’s t-Distribution

- In this example, we calculate the critical value at a cumulative probability of 95% on the standard Student’s t-distribution with 10 degrees of freedom. This is the one-tailed (upper) critical value for a significance level of 5%: 95% of observations fall at or below it.

- Running this code returns a value, approximately 1.812, at or below which 95% of the observations from the Student’s t-distribution with 10 degrees of freedom fall.
  - The probability of this value is then confirmed (with minor rounding error) via the cumulative distribution function (CDF).


---



```python
# Student’s t-Distribution Percent Point Function
from scipy.stats import t

# Define the cumulative probability (95%) and degrees of freedom
p = 0.95
df = 10

# Retrieve the value that corresponds to the probability
value = t.ppf(p, df)
print(value)

# Confirm the probability with the cumulative distribution function (CDF)
p = t.cdf(value, df)
print(p)
```

```
1.8124611228107335
0.949999999999923
```
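
For a two-tailed t-test, alpha is split across both tails, so the upper critical value sits at a cumulative probability of 1 - alpha / 2. The sketch below, using degrees of freedom we picked for illustration, shows how that critical value shrinks toward the Gaussian one as the degrees of freedom grow.

```python
from scipy.stats import norm, t

alpha = 0.05

# Upper critical value of a two-tailed test for several degrees of freedom
dfs = (5, 10, 30, 1000)
t_critical = {df: t.ppf(1 - alpha / 2, df) for df in dfs}
for df, cv in t_critical.items():
    print(f"df = {df:>4}: upper critical value = {cv:.3f}")

# Gaussian counterpart for comparison
z_critical = norm.ppf(1 - alpha / 2)
print(f"Gaussian: upper critical value = {z_critical:.3f}")
```

With few degrees of freedom the t-distribution has heavier tails, so a larger statistic is needed to reach significance.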


## Calculating Critical Values for the Chi-Squared Distribution

- In this example, we calculate the critical value at a cumulative probability of 95% on the Chi-Squared distribution with 10 degrees of freedom. This is the upper-tail critical value for a significance level of 5%: 95% of observations fall at or below it.

- Running this code first calculates a value, approximately 18.3, at or below which 95% of the observations from the Chi-Squared distribution with 10 degrees of freedom fall.
  - The probability associated with this value is confirmed by using it as input to the cumulative distribution function (CDF).

---



```python
# Chi-Squared Distribution Percent Point Function
from scipy.stats import chi2

# Define the cumulative probability (95%) and degrees of freedom
p = 0.95
df = 10

# Retrieve the value that corresponds to the probability
value = chi2.ppf(p, df)
print(value)

# Confirm the probability with the cumulative distribution function (CDF)
p = chi2.cdf(value, df)
print(p)
```

```
18.307038053275146
0.95
```
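
The same `ppf()` pattern should carry over to the F-distribution mentioned earlier for ANOVA, which SciPy exposes as `scipy.stats.f`. The sketch below is an illustration only: the numerator and denominator degrees of freedom (3 and 20) are made-up values, not taken from a worked example in this tutorial.

```python
# F-Distribution Percent Point Function
from scipy.stats import f

# Define the cumulative probability (95%) and hypothetical degrees of freedom
p = 0.95
dfn = 3   # numerator degrees of freedom (assumed for illustration)
dfd = 20  # denominator degrees of freedom (assumed for illustration)

# Retrieve the upper-tail critical value
f_value = f.ppf(p, dfn, dfd)
print(f_value)

# Confirm the probability with the cumulative distribution function (CDF)
f_check = f.cdf(f_value, dfn, dfd)
print(f_check)
```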


## Further Reading

### Books

- [Handbook of Research Methods: A Guide for Practitioners and Students in the Social Sciences, 2003](http://amzn.to/2G4vG4k)

### API Documentation

- [scipy.stats.norm API](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html)
- [scipy.stats.t API](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html)
- [scipy.stats.chi2 API](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2.html)

### Articles

- [Critical value on Wikipedia](https://en.wikipedia.org/wiki/Critical_value)
- [P-value on Wikipedia](https://en.wikipedia.org/wiki/P-value)
- [One- and two-tailed tests on Wikipedia](https://en.wikipedia.org/wiki/One-_and_two-tailed_tests)
- [Quantile function on Wikipedia](https://en.wikipedia.org/wiki/Quantile_function)
- [68-95-99.7 rule on Wikipedia](https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule)


---
