# 🧮 Problem 4: ANOVA

## Objective

Generate three independent samples, each of size 30, from normal distributions with means 0, 0.5, and 1, each with standard deviation 1.

1. Perform a one-way ANOVA to test whether all three means are equal.
2. Perform three independent two-sample t-tests: samples 1 vs 2, 1 vs 3, and 2 vs 3.
3. Compare the conclusions.

Write a short note on why ANOVA is preferred over running several t-tests.

## Problem Execution: One-Way ANOVA vs Multiple t-Tests

I generate three independent samples, each of size 30, from:

- Sample 1: \( X_1 \sim \mathcal{N}(0, 1) \)
- Sample 2: \( X_2 \sim \mathcal{N}(0.5, 1) \)
- Sample 3: \( X_3 \sim \mathcal{N}(1, 1) \)

Then I:

1. Perform a **one-way ANOVA** to test  
   \[
   H_0: \mu_1 = \mu_2 = \mu_3 \quad \text{vs} \quad H_A: \text{at least one mean differs}.
   \]
2. Perform **three independent two-sample t-tests**:
   - sample 1 vs 2  
   - sample 1 vs 3  
   - sample 2 vs 3  
3. Compare conclusions and explain why ANOVA is preferred over several t-tests.


```python
# Generate data and run ANOVA + t-tests

import numpy as np
from scipy.stats import f_oneway, ttest_ind

# Reproducibility
rng = np.random.default_rng(12345)

n = 30

# Generate samples
sample1 = rng.normal(loc=0.0, scale=1.0, size=n)
sample2 = rng.normal(loc=0.5, scale=1.0, size=n)
sample3 = rng.normal(loc=1.0, scale=1.0, size=n)

# One-way ANOVA
F_stat, p_anova = f_oneway(sample1, sample2, sample3)
print("One-way ANOVA:")
print(f"  F-statistic = {F_stat:.3f}, p-value = {p_anova:.4f}")

# Pairwise t-tests (two-sample, independent, equal variance by default)
t_12, p_12 = ttest_ind(sample1, sample2, equal_var=True)
t_13, p_13 = ttest_ind(sample1, sample3, equal_var=True)
t_23, p_23 = ttest_ind(sample2, sample3, equal_var=True)

print("\nPairwise two-sample t-tests:")
print(f"  Sample 1 vs 2: t = {t_12:.3f}, p = {p_12:.4f}")
print(f"  Sample 1 vs 3: t = {t_13:.3f}, p = {p_13:.4f}")
print(f"  Sample 2 vs 3: t = {t_23:.3f}, p = {p_23:.4f}")
```


```text
One-way ANOVA:
  F-statistic = 3.677, p-value = 0.0293

Pairwise two-sample t-tests:
  Sample 1 vs 2: t = -1.302, p = 0.1980
  Sample 1 vs 3: t = -2.531, p = 0.0141
  Sample 2 vs 3: t = -1.548, p = 0.1271
```


## Detailed Interpretation: Comparing ANOVA and t-tests, and Why ANOVA Is Preferred

### 1. Comparing the Conclusions (ANOVA vs. t-tests)

In this exercise, I generated three independent samples, each of size \(n = 30\), from normal distributions with true means 0, 0.5, and 1. Because the population means really do differ, the global null hypothesis is false, and a test with enough power should detect that. Whether it does in any single run still depends on sampling variability.

The one-way ANOVA tests the global null hypothesis:

\[
H_0: \mu_1 = \mu_2 = \mu_3.
\]

ANOVA compares the variation **between groups** to the variation **within groups**. In this run, the ANOVA gives \(F = 3.677\) (2 and 87 degrees of freedom) with \(p = 0.0293 < 0.05\), so the observed differences among the sample means are larger than sampling variability alone would plausibly produce. ANOVA therefore rejects the null hypothesis and concludes that **at least one** population mean differs from the others. It does not say which one.
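
The between-group versus within-group comparison can be written out by hand. The following is a minimal sketch, not the SciPy implementation: the helper `one_way_f` is a hypothetical name introduced here, and the samples are re-drawn on the assumption that using seed 12345 with the same sequence of calls reproduces the data from the cell above. It computes the two sums of squares, forms the F ratio, and checks the result against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy.stats import f, f_oneway


def one_way_f(*groups):
    """Compute the one-way ANOVA F statistic and p-value from sums of squares.

    Hypothetical helper for illustration only.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = np.concatenate(groups).mean()

    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)    # pooled variance estimate across all groups
    f_stat = ms_between / ms_within
    p_value = f.sf(f_stat, k - 1, n_total - k)  # upper tail of the F distribution
    return f_stat, p_value


# Assumption: same seed and call order as the main cell
rng = np.random.default_rng(12345)
demo_groups = [rng.normal(loc=m, scale=1.0, size=30) for m in (0.0, 0.5, 1.0)]

f_manual, p_manual = one_way_f(*demo_groups)
f_scipy, p_scipy = f_oneway(*demo_groups)
print(f"manual: F = {f_manual:.3f}, p = {p_manual:.4f}")
print(f"scipy : F = {f_scipy:.3f}, p = {p_scipy:.4f}")
```

The within-group mean square is a variance estimate pooled over all three groups, which is why the F test uses 87 error degrees of freedom rather than the 58 available to each two-sample t-test.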

The three independent t-tests examine pairwise comparisons:

- Sample 1 vs Sample 2 (difference ≈ 0.5)
- Sample 1 vs Sample 3 (difference ≈ 1.0)
- Sample 2 vs Sample 3 (difference ≈ 0.5)

The true difference is largest for samples 1 and 3 (one standard deviation), so that t-test is significant in most repetitions of the experiment, and it is here (\(p = 0.0141\)). The two comparisons with a true difference of 0.5 (samples 1 vs 2 and samples 2 vs 3) have much less power at \(n = 30\) and may or may not reach significance depending on sampling variability; in this run neither does (\(p = 0.1980\) and \(p = 0.1271\)). The t-tests are therefore consistent with the ANOVA result: ANOVA gives a single global verdict that some difference exists, while the t-tests give pair-specific conclusions, here pointing to the 1 vs 3 contrast.
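
How often each test rejects can be estimated by repeating the experiment many times. The sketch below is an illustration under assumptions that are not part of the original exercise: the seed (2024) and the number of repetitions (1000) are arbitrary choices. It redraws the three samples on every repetition and records the fraction of runs in which each test gives \(p < 0.05\).

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

rng = np.random.default_rng(2024)   # arbitrary seed, chosen for this illustration
n, n_sims, alpha = 30, 1000, 0.05   # n_sims is an arbitrary choice
means = (0.0, 0.5, 1.0)

rejections = {"ANOVA": 0, "1 vs 2": 0, "1 vs 3": 0, "2 vs 3": 0}
for _ in range(n_sims):
    # Fresh samples from the same three populations on every repetition
    s1, s2, s3 = (rng.normal(loc=m, scale=1.0, size=n) for m in means)
    rejections["ANOVA"] += int(f_oneway(s1, s2, s3).pvalue < alpha)
    rejections["1 vs 2"] += int(ttest_ind(s1, s2, equal_var=True).pvalue < alpha)
    rejections["1 vs 3"] += int(ttest_ind(s1, s3, equal_var=True).pvalue < alpha)
    rejections["2 vs 3"] += int(ttest_ind(s2, s3, equal_var=True).pvalue < alpha)

# Fraction of repetitions in which each test rejected its null hypothesis
power = {name: count / n_sims for name, count in rejections.items()}
for name, value in power.items():
    print(f"{name}: estimated power = {value:.3f}")
```

The estimates vary a little with the seed, but the ordering should be stable: the 1 vs 3 comparison rejects far more often than either comparison with a true difference of 0.5.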

---

### 2. Why ANOVA Is Preferred Over Running Multiple t-tests

While t-tests can be used to examine specific pairwise differences, running several of them to compare multiple groups creates a serious statistical problem. Each t-test conducted at the \(\alpha = 0.05\) level has a 5% chance of producing a false positive (Type I error) when its null hypothesis is true. If the tests were independent, the probability of making **at least one** false positive, the familywise error rate (FWER), would grow rapidly:

\[
\text{FWER} = 1 - (1 - \alpha)^k,
\]

where \(k\) is the number of comparisons.  
With three groups there are \(3\) pairwise t-tests, and the formula gives about 14.3%.  
With four groups there are \(6\) pairwise t-tests, and it gives about 26.5%.  
Pairwise tests that share samples are not fully independent, so these figures are approximations and the actual rate is typically somewhat lower. Even so, this inflation of the familywise error rate makes uncorrected multiple t-tests statistically unreliable.
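
The formula is easy to evaluate, and the inflation can also be checked by simulation. The sketch below is illustrative only: `fwer_independent` is a hypothetical helper name, and the seed (7) and repetition count (2000) are arbitrary. The second part draws three groups from the *same* normal distribution, so every rejection is a false positive, and counts how often at least one of the three uncorrected pairwise t-tests rejects.

```python
from itertools import combinations
from math import comb

import numpy as np
from scipy.stats import ttest_ind


def fwer_independent(k, alpha=0.05):
    """FWER for k independent tests, each run at level alpha (hypothetical helper)."""
    return 1 - (1 - alpha) ** k


# Part 1: evaluate the formula for all pairwise comparisons among 3, 4, 5 groups
for n_groups in (3, 4, 5):
    k = comb(n_groups, 2)   # number of pairwise comparisons
    print(f"{n_groups} groups, {k} tests: FWER = {fwer_independent(k):.4f}")

# Part 2: simulate three groups with identical means (global null is true)
rng = np.random.default_rng(7)      # arbitrary seed
n, n_sims, alpha = 30, 2000, 0.05   # n_sims is an arbitrary choice
any_false_positive = 0
for _ in range(n_sims):
    samples = [rng.normal(loc=0.0, scale=1.0, size=n) for _ in range(3)]
    p_values = [ttest_ind(a, b, equal_var=True).pvalue
                for a, b in combinations(samples, 2)]
    # Count the run if any of the three uncorrected tests rejects
    any_false_positive += int(min(p_values) < alpha)

empirical_fwer = any_false_positive / n_sims
print(f"Empirical FWER for 3 uncorrected pairwise t-tests: {empirical_fwer:.4f}")
```

The simulated rate should land well above the nominal 5%, though not necessarily at the value given by the independence formula, because the three pairwise tests reuse the same samples.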

ANOVA solves this problem by:

1. **Performing a single global test** that evaluates all means simultaneously while keeping the Type I error rate fixed at \(\alpha\).
2. **Separating the analysis into two steps**:  
   - First, determine whether any differences exist (global ANOVA).  
   - Only if significant, follow with post-hoc tests (e.g., Tukey, Bonferroni) that properly control for multiple comparisons.
3. **Using all data efficiently** by pooling variance estimates across groups rather than estimating variance separately for each t-test.
4. **Scaling well as the number of groups increases**, unlike the rapidly growing number of pairwise t-tests.
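
The two-step procedure in point 2 can be sketched as follows. This is a minimal illustration, not a recommended pipeline: `anova_then_bonferroni` is a hypothetical helper, the Bonferroni adjustment is done by hand (each raw p-value is multiplied by the number of comparisons and capped at 1), and the samples are re-drawn assuming seed 12345 with the same call order as the main cell.

```python
from itertools import combinations

import numpy as np
from scipy.stats import f_oneway, ttest_ind


def anova_then_bonferroni(samples, alpha=0.05):
    """Run a global ANOVA, then Bonferroni-adjusted pairwise t-tests if it rejects.

    `samples` maps a group label to a 1-D array. Hypothetical helper for illustration.
    Returns the global p-value and a dict of adjusted pairwise p-values
    (empty when the global test does not reject).
    """
    p_global = f_oneway(*samples.values()).pvalue
    adjusted = {}
    if p_global < alpha:
        pairs = list(combinations(samples, 2))   # all pairs of group labels
        for a, b in pairs:
            p_raw = ttest_ind(samples[a], samples[b], equal_var=True).pvalue
            # Bonferroni: multiply by the number of comparisons, cap at 1
            adjusted[(a, b)] = min(1.0, p_raw * len(pairs))
    return p_global, adjusted


# Assumption: same seed and call order as the main cell
rng = np.random.default_rng(12345)
data = {
    "sample 1": rng.normal(loc=0.0, scale=1.0, size=30),
    "sample 2": rng.normal(loc=0.5, scale=1.0, size=30),
    "sample 3": rng.normal(loc=1.0, scale=1.0, size=30),
}

p_global, adjusted = anova_then_bonferroni(data)
print(f"Global ANOVA p-value: {p_global:.4f}")
for (a, b), p_adj in adjusted.items():
    print(f"  {a} vs {b}: Bonferroni-adjusted p = {p_adj:.4f}")
```

Bonferroni is the simplest correction and tends to be conservative; Tukey's procedure, also named in point 2, is designed specifically for all pairwise comparisons after ANOVA.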

In short, while both ANOVA and t-tests can detect differences between means, **ANOVA is the correct first step** whenever more than two groups are compared because it maintains statistical validity and prevents inflated false-positive rates.

---

### Conclusion

ANOVA provides a statistically sound, global assessment of whether group means differ, and it is the appropriate first step before any pairwise comparisons. Follow-up t-tests can help identify which groups differ, but they need a multiple-comparison correction: without ANOVA and a proper correction, multiple t-tests inflate the Type I error rate and lead to invalid inference.


## References

1. **NumPy Random Number Generation**
   - `numpy.random.Generator.normal`  
     Used to generate samples from the normal distribution in the exercise.  
     https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.normal.html

2. **SciPy Statistical Tests**
   - `scipy.stats.f_oneway`  
     Function used to perform the one-way ANOVA on the three samples.  
     https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html

   - `scipy.stats.ttest_ind`  
     Function used for the three independent two-sample t-tests.  
     https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

3. **NumPy Arrays**
   - `numpy.array`  
     Reference for NumPy’s core array data structure used to store the samples.  
     https://numpy.org/doc/stable/reference/generated/numpy.array.html

4. **Python Built-in Functions**
   - `print()`
     Built-in Python function used to display the ANOVA and t-test results.  
     https://docs.python.org/3/library/functions.html#print

5. **Theoretical Reference for ANOVA**
   - Fisher, R. A. (1925). *Statistical Methods for Research Workers*  
     Foundational text introducing the analysis of variance (ANOVA) and the logic behind comparing multiple group means.  
     Free online version: https://archive.org/details/in.ernet.dli.2015.176850

6. **Theoretical Reference for Independent t-tests**
   - Student (1908). *The Probable Error of a Mean*  
     Classic paper introducing the t-test for comparing means between two samples.  
     Accessible summary: https://en.wikipedia.org/wiki/Student%27s_t-test

7. **Multiple Comparisons and Type I Error Inflation**
   - Tukey, J. W. (1953). *The Problem of Multiple Comparisons.*  
     Seminal work explaining why performing several t-tests inflates the probability of false positives, motivating the use of ANOVA followed by corrected post-hoc tests.  
     Summary overview: https://en.wikipedia.org/wiki/Multiple_comparisons_problem

8. **Theoretical Foundation of ANOVA (F-Distribution)**
   - Snedecor, G. W. (1934). *Calculation and Interpretation of the F-Statistic.*  
     One of the earliest formal treatments of the F-distribution, which is the core of ANOVA’s hypothesis testing framework.  
     Modern explanation: https://en.wikipedia.org/wiki/F-distribution

