### ANOVA (Analysis of Variance)
Definition:
ANOVA (Analysis of Variance) is a statistical test used to determine whether there are statistically significant differences between the means of three or more independent groups.

### Why do we need ANOVA?

If you have 2 groups, you can use a t-test.

If you have 3 or more groups, doing multiple t-tests increases the risk of Type I error (false positives).

ANOVA solves this by checking all groups at once.

Note: ANOVA doesn’t tell which group is different, only that a difference exists. To find out which groups differ, follow up with a post-hoc test such as Tukey's HSD.

### Hypotheses in ANOVA

Null Hypothesis (H₀): All group means are equal

μ₁ = μ₂ = … = μₖ (for k groups)

Alternative Hypothesis (H₁): At least one group mean is different

### Why not multiple t-tests?
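Each t-test has a 5% chance (if α = 0.05) of incorrectly rejecting a true null hypothesis (false positive).

More tests = more chances of error. Treating the m comparisons as independent, the chance of at least one false positive is 1 − (1 − α)^m.

With 3 comparisons (3 groups), error rate ≈ 14%.

With 6 comparisons (4 groups), error rate ≈ 26%.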

Multiple t-tests → more false positives.

ANOVA → one clean test, keeps error rate under control.
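The error rates quoted above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `familywise_error_rate` is made up for illustration, and it assumes the pairwise tests are independent, which is only an approximation when the tests share groups.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across all pairwise t-tests.

    Assumes the comparisons are independent (an approximation, since
    pairwise tests on the same groups reuse the same data).
    """
    n_comparisons = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_comparisons


for k in (3, 4):
    rate = familywise_error_rate(k)
    print(f"{k} groups, {comb(k, 2)} comparisons: error rate ≈ {rate:.0%}")
```

For 3 and 4 groups this gives roughly 14% and 26%, matching the figures above.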

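### Assumptions of ANOVA

Independence of observations: each observation is unrelated to the others, within and across groups.

Normality: the values in each group are approximately normally distributed.

Homogeneity of variances: all groups have roughly the same variance.

What if assumptions are violated?

If normality or equal variances don’t hold → use a non-parametric alternative such as the Kruskal-Wallis test. If only the equal-variance assumption fails, Welch's ANOVA is another option.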

ANOVA assumes → independent samples + normal distribution + equal variances
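One possible way to check these assumptions in practice is sketched below, using the same sample scores as the one-way ANOVA example later in this notebook. The choice of Shapiro-Wilk for normality and Levene's test for equal variances is an assumption of this sketch, not something the notes prescribe, and with only 5 scores per group these checks have little power.

```python
from scipy import stats

# Sample exam scores (same values as the notebook's one-way ANOVA example)
groups = {
    "A": [85, 88, 90, 75, 95],
    "B": [70, 65, 80, 72, 68],
    "C": [88, 90, 92, 85, 87],
}
alpha = 0.05  # assumed significance level for every check

# Normality: Shapiro-Wilk per group (H0: the sample comes from a normal distribution)
normality_p = {}
for name, values in groups.items():
    _, p = stats.shapiro(values)
    normality_p[name] = p
    print(f"Shapiro-Wilk, method {name}: p = {p:.3f}")

# Equal variances: Levene's test (H0: all groups have the same variance)
levene_stat, levene_p = stats.levene(*groups.values())
print(f"Levene: p = {levene_p:.3f}")

# Non-parametric fallback: Kruskal-Wallis (H0: all groups share the same distribution)
kw_stat, kw_p = stats.kruskal(*groups.values())
print(f"Kruskal-Wallis: H = {kw_stat:.3f}, p = {kw_p:.3f}")

if min(normality_p.values()) < alpha or levene_p < alpha:
    print("An assumption looks doubtful → prefer the Kruskal-Wallis result.")
else:
    print("No evidence against the assumptions → one-way ANOVA is reasonable.")
```

Independence cannot be tested this way; it depends on how the data were collected.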

### Types of ANOVA

### One-Way ANOVA ✅ (what we’re learning today)

Used when comparing means of 3 or more groups based on 1 independent variable (factor).

Example: Compare exam scores across 3 teaching methods.

Factor = teaching method.

Groups = Method A, Method B, Method C.

### Two-Way ANOVA

Used when comparing means across groups with 2 independent variables (factors).

Example: Compare exam scores across teaching method (A, B, C) and gender (Male, Female).

Factor 1 = teaching method.

Factor 2 = gender.

It can also test for an interaction effect (does the effect of teaching method depend on gender?).
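To make the two factors and the interaction concrete, here is a minimal pure-Python sketch of how a balanced two-way ANOVA splits the total variation into method, gender, interaction, and within-cell parts. The scores are hypothetical values invented for illustration, and the sketch assumes every method/gender cell has the same number of observations.

```python
from statistics import mean

# Hypothetical exam scores (illustration only): (method, gender) -> observations
scores = {
    ("A", "Male"): [85, 88, 90], ("A", "Female"): [78, 82, 95],
    ("B", "Male"): [70, 65, 80], ("B", "Female"): [72, 68, 75],
    ("C", "Male"): [88, 90, 92], ("C", "Female"): [85, 87, 91],
}
methods = sorted({m for m, _ in scores})
genders = sorted({g for _, g in scores})
n = 3  # observations per cell (balanced design assumed)

grand = mean(x for cell in scores.values() for x in cell)
method_mean = {m: mean(x for g in genders for x in scores[(m, g)]) for m in methods}
gender_mean = {g: mean(x for m in methods for x in scores[(m, g)]) for g in genders}
cell_mean = {key: mean(cell) for key, cell in scores.items()}

# Sums of squares for each source of variation
ss_method = n * len(genders) * sum((method_mean[m] - grand) ** 2 for m in methods)
ss_gender = n * len(methods) * sum((gender_mean[g] - grand) ** 2 for g in genders)
ss_inter = n * sum(
    (cell_mean[(m, g)] - method_mean[m] - gender_mean[g] + grand) ** 2
    for m in methods for g in genders
)
ss_within = sum((x - cell_mean[key]) ** 2 for key, cell in scores.items() for x in cell)
ss_total = sum((x - grand) ** 2 for cell in scores.values() for x in cell)

# Degrees of freedom
df_method = len(methods) - 1
df_gender = len(genders) - 1
df_inter = df_method * df_gender
df_within = len(methods) * len(genders) * (n - 1)

# Each F-statistic compares one effect's mean square to the within-cell mean square
ms_within = ss_within / df_within
f_method = (ss_method / df_method) / ms_within
f_gender = (ss_gender / df_gender) / ms_within
f_inter = (ss_inter / df_inter) / ms_within

print(f"F (teaching method): {f_method:.3f}")
print(f"F (gender):          {f_gender:.3f}")
print(f"F (interaction):     {f_inter:.3f}")
```

Turning each F-statistic into a p-value requires the F distribution with the matching degrees of freedom, for example via `scipy.stats.f.sf`.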

The following cell runs a one-way ANOVA on the three teaching methods using `scipy.stats.f_oneway`:

```python
from scipy import stats

# Sample data: exam scores for 3 teaching methods
method_A = [85, 88, 90, 75, 95]
method_B = [70, 65, 80, 72, 68]
method_C = [88, 90, 92, 85, 87]

# Perform one-way ANOVA
f_stat, p_val = stats.f_oneway(method_A, method_B, method_C)
print("F-statistic:", f_stat)
print("p-value:", p_val)

# Interpretation at significance level alpha
alpha = 0.05
if p_val < alpha:
    print("Reject H0 → At least one group mean is different.")
else:
    print("Fail to reject H0 → No significant evidence that the group means differ.")
```

Output:

```
F-statistic: 14.517970401691326
p-value: 0.000625318256583017
Reject H0 → At least one group mean is different.
```
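The F-statistic is the ratio of between-group variance to within-group variance. As a rough sketch of what `stats.f_oneway` computes, the same number can be rebuilt by hand from the sample scores using only the standard library (the variable names here are illustrative):

```python
from statistics import mean

# Same sample data as the one-way ANOVA example
method_A = [85, 88, 90, 75, 95]
method_B = [70, 65, 80, 72, 68]
method_C = [88, 90, 92, 85, 87]
groups = [method_A, method_B, method_C]

all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

# Between-group sum of squares: how far each group mean is from the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: how far each score is from its own group mean
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1               # k - 1
df_within = len(all_scores) - len(groups)  # N - k

manual_f = (ss_between / df_between) / (ss_within / df_within)
print(f"F-statistic (by hand): {manual_f:.4f}")
```

This should reproduce the F-statistic of about 14.518 reported above. A large F means the group means are spread out far more than the scatter within each group would explain.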
