# Problem 4: ANOVA

## Problem Statement

The objective of this analysis is to investigate whether there is a statistically significant difference between the means of three independent samples.

> *"Generate three independent samples, each of size 30, from normal distributions with means 0, 0.5, and 1, each with standard deviation 1. Perform a one-way ANOVA to test whether all three means are equal. Perform three independent two-sample t-tests: samples 1 vs 2, 1 vs 3, and 2 vs 3. Compare the conclusions and write a short note on why ANOVA is preferred over running several t-tests."*

To solve this, we will:

- Generate **three independent samples**, each of size **n = 30**, from normal distributions with means **0**, **0.5**, and **1**, and a common standard deviation of **1**.
- Use a **one-way ANOVA** to test the null hypothesis that all three population means are equal.
- Perform **three independent two-sample *t*-tests** to compare the sample pairs:
  - Sample 1 vs Sample 2  
  - Sample 1 vs Sample 3  
  - Sample 2 vs Sample 3
- Compare the conclusions drawn from the one-way ANOVA and the pairwise *t*-tests.
- Explain why **one-way ANOVA** is generally preferred over performing multiple independent *t*-tests when comparing more than two sample means.


---

```python
# Libraries Required

# Numerical arrays.
import numpy as np

# Plotting.
import matplotlib.pyplot as plt

# Statistical functions.
import scipy.stats as stats
```

---

## One-Way ANOVA

One-way analysis of variance (ANOVA) is a statistical test of whether the population means of several independent groups differ, in this case three groups. A single hypothesis test assesses all the means simultaneously, rather than comparing the samples pairwise.

The null hypothesis for the one-way ANOVA is:

$$
H_0: \mu_1 = \mu_2 = \mu_3
$$

The alternative hypothesis is that at least one of the population means differs from the others.

A significance level of $\alpha = 0.05$ is used for this test. The null hypothesis is rejected if the *p*-value returned by the ANOVA is less than $\alpha$, which indicates that at least one population mean differs from the others. The ANOVA itself does not identify which means differ.
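
For reference, the test statistic behind this decision rule can be written out as follows. The notation is an assumption introduced here for illustration: $k$ is the number of groups, $n_i$ the size of group $i$, $N$ the total number of observations, $y_{ij}$ observation $j$ in group $i$, $\bar{y}_i$ the mean of group $i$, and $\bar{y}$ the overall mean.

$$
F = \frac{\sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2 / (k - 1)}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 / (N - k)}
$$

Under $H_0$, $F$ follows an $F$ distribution with $k - 1$ and $N - k$ degrees of freedom. With three samples of size 30, these are 2 and 87.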


### References for this section
1. One Way Anova - https://www.statology.org/null-hypothesis-for-anova/
2. One Way Anova - https://www.datacamp.com/tutorial/anova-test
3. One Way Anova - https://sarowarahmed.medium.com/understanding-anova-analysis-of-variance-test-in-hypothesis-testing-4a050a1f7599
4. One Way Anova - https://courses.lumenlearning.com/introstats1/chapter/one-way-anova/
5. One Way Anova - https://www.geeksforgeeks.org/machine-learning/one-way-anova/
6. One Way Anova - https://towardsdatascience.com/anova-explained-for-beginners-with-the-bachelorette-tv-show-8503c4aaba10/

---

## Independent Sample *t*-Test

An independent two-sample *t*-test assesses whether there is a statistically significant difference between the means of two independent samples. In this analysis, the test is applied to each pair of samples, which are drawn from normal distributions with identical variances but potentially different means.

The null and alternative hypotheses are defined as:

$$
H_0: \mu_1 = \mu_2
$$

$$
H_1: \mu_1 \ne \mu_2
$$

A two-sided test is performed at a significance level of $\alpha = 0.05$. If the resulting *p*-value is less than $\alpha$, the null hypothesis is rejected.

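The independent two-sample *t*-test assumes that the observations are independent, that each sample comes from a normal distribution, and, in its standard pooled-variance form, that the two populations have equal variances. In this experiment, these assumptions hold by construction, as observations are generated independently from normal distributions with equal standard deviations.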
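
For reference, the pooled-variance (Student's) form of the test statistic is given below. The notation is an assumption introduced here for illustration: $\bar{x}_1, \bar{x}_2$ are the sample means, $s_1^2, s_2^2$ the sample variances, and $n_1, n_2$ the sample sizes.

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}
$$

Under $H_0$, $t$ follows a *t* distribution with $n_1 + n_2 - 2$ degrees of freedom, which is 58 for two samples of size 30.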


### References for this section
1. Sample T-Test - https://www.statology.org/two-sample-t-test/
2. Sample T-Test - https://www.jmp.com/en/statistics-knowledge-portal/t-test/two-sample-t-test
3. Sample T-Test - https://statistics.laerd.com/statistical-guides/independent-t-test-statistical-guide.php
4. Sample T-Test - https://www.geeksforgeeks.org/r-language/what-is-the-differences-between-the-two-sample-t-test-and-paired-t-test/
5. Sample T-Test - https://www.qualitygurus.com/two-sample-t-test/
6. Sample T-Test - https://www.geeksforgeeks.org/machine-learning/how-to-conduct-a-two-sample-t-test-in-python/
7. Sample T-Test - https://www.statology.org/two-sample-t-test-python/
8. Sample T-Test - https://www.datacamp.com/tutorial/an-introduction-to-python-t-tests
9. Sample T-Test - https://www.jonathanbossio.com/post/two-sample-t-test-with-python
10. Alternative Hypothesis - https://www.geeksforgeeks.org/maths/alternative-hypothesis-definition-types-and-examples/
11. Alternative Hypothesis - https://courses.lumenlearning.com/introstats1/chapter/null-and-alternative-hypotheses/

---

## Generating the Samples
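
The following is a minimal sketch of how the three samples described in the problem statement could be generated with NumPy. The seed value (42) and the variable names `sample1`, `sample2`, and `sample3` are assumptions chosen for illustration, not part of the original problem.

```python
import numpy as np

# Assumption: a fixed seed (42) is used only to make this sketch reproducible.
rng = np.random.default_rng(seed=42)

n = 30                    # size of each sample
means = [0.0, 0.5, 1.0]   # population means from the problem statement
sd = 1.0                  # common standard deviation

# Draw one independent sample per population mean.
sample1, sample2, sample3 = (rng.normal(loc=m, scale=sd, size=n) for m in means)

# Summarise each sample; the sample means will differ from the population means.
for name, s in [("Sample 1", sample1), ("Sample 2", sample2), ("Sample 3", sample3)]:
    print(f"{name}: mean = {s.mean():.3f}, sd = {s.std(ddof=1):.3f}")
```

Because each sample is a random draw of only 30 values, its mean and standard deviation will generally not equal the population values exactly.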



---

## Running the One-Way ANOVA
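
A minimal sketch of the one-way ANOVA using `scipy.stats.f_oneway` is shown below. To keep the block self-contained, it regenerates the three samples; the seed (42) and variable names are assumptions for illustration, so the printed values are specific to that seed.

```python
import numpy as np
import scipy.stats as stats

# Assumption: seed 42 is an arbitrary choice for reproducibility.
rng = np.random.default_rng(seed=42)
sample1 = rng.normal(loc=0.0, scale=1.0, size=30)
sample2 = rng.normal(loc=0.5, scale=1.0, size=30)
sample3 = rng.normal(loc=1.0, scale=1.0, size=30)

alpha = 0.05  # significance level

# One-way ANOVA: H0 is that all three population means are equal.
f_stat, p_value = stats.f_oneway(sample1, sample2, sample3)

print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: at least one mean differs.")
else:
    print("Fail to reject H0: no evidence that the means differ.")
```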



---

## Running the Pairwise *t*-Tests
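
The three pairwise comparisons could be run with `scipy.stats.ttest_ind`, as in the sketch below. The block regenerates the samples so that it stands alone; the seed (42), the dictionary `samples`, and the `results` container are assumptions for illustration. By default `ttest_ind` performs a two-sided test with pooled variances (`equal_var=True`), which matches the equal-variance setup of this experiment.

```python
from itertools import combinations

import numpy as np
import scipy.stats as stats

# Assumption: seed 42 is an arbitrary choice for reproducibility.
rng = np.random.default_rng(seed=42)
samples = {
    "Sample 1": rng.normal(loc=0.0, scale=1.0, size=30),
    "Sample 2": rng.normal(loc=0.5, scale=1.0, size=30),
    "Sample 3": rng.normal(loc=1.0, scale=1.0, size=30),
}

alpha = 0.05  # significance level for each individual test
results = {}

# Compare every pair: 1 vs 2, 1 vs 3, and 2 vs 3.
for (name_a, a), (name_b, b) in combinations(samples.items(), 2):
    t_stat, p_value = stats.ttest_ind(a, b)  # two-sided, pooled variance
    results[(name_a, name_b)] = (t_stat, p_value)
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"{name_a} vs {name_b}: t = {t_stat:.3f}, p = {p_value:.4f} ({decision})")
```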


---

## Comparing the Conclusions


---

## Why ANOVA Is Preferred Over Several *t*-Tests
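
A standard argument for preferring ANOVA is that each additional *t*-test adds another chance of a false positive. If three tests were independent, each at $\alpha = 0.05$, the probability of at least one false rejection would be $1 - (1 - 0.05)^3 \approx 0.143$, whereas a single ANOVA keeps it at about 0.05. The sketch below computes this figure and checks it by simulation under the null hypothesis. The seed (42) and the number of repetitions (5000) are arbitrary assumptions. Because the three pairwise tests share samples, they are not strictly independent, so the analytic figure should be read as an approximation.

```python
import numpy as np
import scipy.stats as stats

alpha = 0.05
n_tests = 3

# Analytic approximation, treating the three tests as independent:
# 1 - 0.95**3 = 0.142625
fwer_analytic = 1 - (1 - alpha) ** n_tests

# Monte Carlo check under the null hypothesis (all population means equal to 0).
# Assumptions: seed 42 and 5000 repetitions are illustrative choices.
rng = np.random.default_rng(seed=42)
n_sims, n = 5000, 30
a, b, c = (rng.normal(loc=0.0, scale=1.0, size=(n_sims, n)) for _ in range(3))

# One p-value per simulated experiment for each pairwise comparison.
p_ab = stats.ttest_ind(a, b, axis=1).pvalue
p_ac = stats.ttest_ind(a, c, axis=1).pvalue
p_bc = stats.ttest_ind(b, c, axis=1).pvalue

# Fraction of experiments in which at least one t-test falsely rejects H0.
any_rejection = (p_ab < alpha) | (p_ac < alpha) | (p_bc < alpha)
fwer_simulated = any_rejection.mean()

# Fraction of experiments in which the single ANOVA falsely rejects H0.
anova_rate = (stats.f_oneway(a, b, c, axis=1).pvalue < alpha).mean()

print(f"Analytic family-wise error rate (independent tests): {fwer_analytic:.4f}")
print(f"Simulated family-wise error rate (three t-tests):    {fwer_simulated:.4f}")
print(f"Simulated false-positive rate (one-way ANOVA):       {anova_rate:.4f}")
```

A correction such as Bonferroni (testing each pair at $\alpha / 3$) is one common way to control the family-wise error rate when pairwise tests are still required.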




---

## Interpretation of Results


