# Non-Parametric Hypothesis Testing

Parametric tests (t-tests, ANOVA) rely on assumptions such as normality
and equal variances. When these assumptions are violated, results may
be misleading.

This notebook introduces **non-parametric alternatives** that:
- rely on ranks rather than raw values
- are robust to outliers and skewness
- test differences in distributions rather than means
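
The first two bullets can be illustrated with a small sketch. The five values below are made up for illustration (they are not from the notebook's dataset); the point is that inflating the largest value changes the mean but leaves the ranks untouched.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical toy sample; the last value plays the role of an outlier.
sample = np.array([12.0, 15.0, 11.0, 14.0, 40.0])
inflated = sample.copy()
inflated[-1] = 4000.0  # make the outlier 100x more extreme

ranks_sample = rankdata(sample)
ranks_inflated = rankdata(inflated)

print("means:", sample.mean(), inflated.mean())  # the mean moves a lot
print("ranks:", ranks_sample, ranks_inflated)    # the ranks are identical
```

Any statistic built only from the ranks therefore gives the same answer for both versions of the sample.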


### 🟦 Imports & Data

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from scipy import stats

from src.data_generation import generate_student_dataset
from src.utils import check_normality, check_equal_variance

sns.set_theme(style="whitegrid")  # sns.set is an alias of set_theme

df = generate_student_dataset(n=4000, random_state=42)


### 🟦 When to Use Non-Parametric Tests

Non-parametric tests are preferred when:

- distributions are strongly skewed
- outliers dominate the data
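- sample sizes are small, so normality cannot be checked reliably
- measurement scale is ordinal

They test for **distributional shifts** (or median differences, when the
groups have similarly shaped distributions), not means.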


## 🟩 Part I: Normality Diagnostics

### 🟦 Normality Testing

To decide whether parametric assumptions hold, we test normality using
the **Shapiro–Wilk test**.

$$
H_0: \text{The data are drawn from a normal distribution}
$$


### 🟦 Shapiro–Wilk Test

In [None]:
scores = df["score"].values

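# Use a 500-observation subset: with all 4,000 scores the test would flag
# even trivial departures from normality. (SciPy itself only warns that
# the p-value may be inaccurate for N > 5000.)
stat, p_value = stats.shapiro(scores[:500])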
stat, p_value


### 🟦 Interpretation

- With large samples, the test has enough power to reject for
  deviations from normality that are too small to matter in practice
- Visual inspection is often more informative

This motivates robust testing approaches.
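
The sample-size effect can be sketched with simulated data. The gamma distribution, its shape parameter, and the seed below are assumptions chosen for illustration: a gamma with shape 20 is only mildly skewed, so a small sample will often pass the test while the full sample is rejected decisively.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Mildly skewed population: gamma with shape 20 has skewness 2/sqrt(20), about 0.45.
population = rng.gamma(shape=20.0, scale=1.0, size=5000)

small = population[:50]   # first 50 draws
large = population        # all 5000 draws

_, p_small = stats.shapiro(small)
_, p_large = stats.shapiro(large)

print(f"n=50:   p = {p_small:.4f}")
print(f"n=5000: p = {p_large:.2e}")
```

The shape of the distribution is the same in both cases; only the amount of evidence against normality changes.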


### 🟦 Visual Normality Check

In [None]:
plt.figure(figsize=(6, 4))
stats.probplot(scores, dist="norm", plot=plt)
plt.title("Q-Q Plot of Exam Scores")
plt.show()
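
A numeric companion to the Q-Q plot can be useful when comparing several variables. When called without `plot`, `stats.probplot` returns the fitted line together with the correlation coefficient `r` between the ordered data and the theoretical quantiles; values close to 1 mean the points hug the line. The two samples below are simulated for illustration and are not the notebook's scores.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
normal_sample = rng.normal(loc=70, scale=10, size=1000)   # hypothetical, roughly normal
skewed_sample = rng.exponential(scale=10, size=1000)      # hypothetical, strongly skewed

# Without plot=..., probplot returns
# (theoretical quantiles, ordered values), (slope, intercept, r)
_, (_, _, r_normal) = stats.probplot(normal_sample, dist="norm")
_, (_, _, r_skewed) = stats.probplot(skewed_sample, dist="norm")

print(f"r (normal sample): {r_normal:.4f}")
print(f"r (skewed sample): {r_skewed:.4f}")
```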


## 🟩 Part II: Mann–Whitney U Test (Two-Sample)

The Mann–Whitney U test is a non-parametric alternative to
the two-sample t-test.

$$
H_0: P(X > Y) = 0.5
$$

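Under the null, both samples come from the same distribution. The test
is sensitive mainly to one group tending to produce larger values than
the other, so it can miss differences in spread or shape alone.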
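
To show where the statistic comes from, the sketch below computes U for the first sample in two equivalent ways: from the rank sum of the pooled data, and by counting the pairs in which a value from `x` exceeds a value from `y`. The two small samples are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical toy samples (not the notebook's data).
x = np.array([72.0, 85.0, 90.0, 66.0, 78.0])
y = np.array([70.0, 64.0, 81.0, 59.0])
n_x, n_y = len(x), len(y)

# Route 1: rank the pooled data, then U_x = R_x - n_x (n_x + 1) / 2
ranks = rankdata(np.concatenate([x, y]))  # ties would receive average ranks
rank_sum_x = ranks[:n_x].sum()
u_from_ranks = rank_sum_x - n_x * (n_x + 1) / 2

# Route 2: count pairs where x beats y (a tie counts as half a win)
diff = x[:, None] - y[None, :]
u_from_pairs = (diff > 0).sum() + 0.5 * (diff == 0).sum()

# U / (n_x * n_y) estimates P(X > Y), the quantity in the null hypothesis
p_x_greater = u_from_ranks / (n_x * n_y)

print(u_from_ranks, u_from_pairs, p_x_greater)
```

An estimate of P(X > Y) far from 0.5 is what drives a small p-value.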


### 🟦 Data Preparation

In [None]:
# Select scores by gender as NumPy arrays
group_m = df.loc[df["gender"] == "M", "score"].to_numpy()
group_f = df.loc[df["gender"] == "F", "score"].to_numpy()


### 🟦 Mann–Whitney U Test

In [None]:
u_stat, p_mw = stats.mannwhitneyu(
    group_m,
    group_f,
    alternative="two-sided"
)

u_stat, p_mw


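### 🟦 Interpretation

- A small p-value indicates that one group tends to score higher than the other
- The test does not assume normality
- Results should be compared with the corresponding parametric test (here, the two-sample t-test)

Agreement between the two strengthens the conclusion; disagreement
usually points to outliers or skewness.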
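
A constructed example shows how the two tests can disagree. The data below are hypothetical: `group_high` sits ten points above `group_low`, but `group_low` contains a single wildly wrong entry. Welch's t-test is used as the parametric comparison here, which is an assumption of this sketch rather than something the notebook prescribes.

```python
import numpy as np
from scipy import stats

# Hypothetical scores: 1..20 plus one data-entry error, versus 11..30.
group_low = np.append(np.arange(1, 21), 1000).astype(float)
group_high = np.arange(11, 31).astype(float)

# Welch's t-test (no equal-variance assumption)
t_stat, p_t = stats.ttest_ind(group_low, group_high, equal_var=False)

# Mann-Whitney U test on the same data
u_stat, p_mw = stats.mannwhitneyu(group_low, group_high, alternative="two-sided")

print(f"Welch t-test p-value:   {p_t:.3f}")
print(f"Mann-Whitney U p-value: {p_mw:.5f}")
```

The outlier inflates the variance of `group_low` enough to mask the shift from the t-test, while the rank-based test still detects it.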


## 🟩 Part III: Kruskal–Wallis Test (ANOVA Alternative)

### 🟦 Kruskal–Wallis Test

The Kruskal–Wallis test is the non-parametric alternative
to one-way ANOVA.

$$
H_0: \text{All group distributions are identical}
$$

It operates on ranked data.
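
As a sketch of what "operates on ranked data" means, the H statistic can be rebuilt by hand from the pooled ranks and checked against `stats.kruskal`. The three groups are simulated (group means, spread, and seed are assumptions), and the formula below omits the tie correction, which is harmless for continuous data without ties.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Three hypothetical groups of continuous scores (no ties expected)
samples = [rng.normal(loc=mu, scale=10, size=30) for mu in (70, 72, 75)]

pooled = np.concatenate(samples)
ranks = stats.rankdata(pooled)
n_total = len(pooled)

# Split the pooled ranks back into their groups
boundaries = np.cumsum([len(s) for s in samples])[:-1]
group_ranks = np.split(ranks, boundaries)

# H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
h_manual = (
    12.0 / (n_total * (n_total + 1))
    * sum(r.sum() ** 2 / len(r) for r in group_ranks)
    - 3 * (n_total + 1)
)
# Under H0, H is approximately chi-square with k - 1 degrees of freedom
p_manual = stats.chi2.sf(h_manual, df=len(samples) - 1)

h_scipy, p_scipy = stats.kruskal(*samples)
print(h_manual, h_scipy)
```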


In [None]:
stats.kruskal(
    df[df.teaching_method == "A"]["score"],
    df[df.teaching_method == "B"]["score"],
    df[df.teaching_method == "C"]["score"]
)


### 🟦 Kruskal–Wallis Interpretation

- A significant result indicates at least one group differs
- The test does not specify which groups differ
- Post-hoc testing is required


## 🟩 Part IV: Post-hoc Tests for Non-Parametric ANOVA

### Post-hoc Testing

After a significant Kruskal–Wallis test, we must identify
which groups differ.

Pairwise Mann–Whitney tests with a multiple-comparison correction
control the family-wise Type I error rate.

### 🟦 Pairwise Mann–Whitney with Bonferroni

In [None]:
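from itertools import combinations

# Scores for each teaching method, keyed by group label
groups = {
    "A": df[df.teaching_method == "A"]["score"],
    "B": df[df.teaching_method == "B"]["score"],
    "C": df[df.teaching_method == "C"]["score"]
}

pairs = list(combinations(groups.keys(), 2))
n_comparisons = len(pairs)  # 3 pairwise tests for 3 groups

results = []

for g1, g2 in pairs:
    stat, p = stats.mannwhitneyu(
        groups[g1],
        groups[g2],
        alternative="two-sided"
    )
    # Bonferroni: multiply by the number of comparisons, capped at 1
    results.append((g1, g2, min(p * n_comparisons, 1.0)))

results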
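
Bonferroni is simple but conservative. A common alternative that also controls the family-wise error rate is the Holm step-down procedure, sketched below as a hand-rolled helper. The function name `holm_adjust` and the raw p-values are hypothetical; the notebook itself uses Bonferroni.

```python
import numpy as np

def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)                     # smallest p-value first
    scaled = p[order] * (m - np.arange(m))    # multiply by m, m-1, ..., 1
    # Enforce monotonicity along the sorted order and cap at 1
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted         # undo the sort
    return adjusted

raw = [0.01, 0.04, 0.03]  # hypothetical raw p-values for A-B, A-C, B-C
bonferroni = np.minimum(np.array(raw) * len(raw), 1.0)
holm = holm_adjust(raw)

print("Bonferroni:", bonferroni)
print("Holm:      ", holm)
```

Holm-adjusted p-values are never larger than the Bonferroni ones, so the procedure can only gain power.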


### Post-hoc Interpretation

- Bonferroni-adjusted p-values control the family-wise false-positive rate, at some cost in power
- Consistency with ANOVA results increases confidence
- Effect size should still be reported where possible
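
One effect size that pairs naturally with the Mann–Whitney test is the rank-biserial correlation, which rescales the estimate of P(X > Y) to the range -1 to 1. The helper below is a minimal sketch; the function name `rank_biserial` and the two small groups are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

def rank_biserial(x, y):
    """Rank-biserial correlation: 2 * P(X > Y) - 1, ties counted as half."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_x, n_y = len(x), len(y)
    ranks = rankdata(np.concatenate([x, y]))
    u_x = ranks[:n_x].sum() - n_x * (n_x + 1) / 2  # U statistic for x
    return 2.0 * u_x / (n_x * n_y) - 1.0

# Hypothetical scores for two teaching methods
method_a = [61, 70, 74, 68, 65]
method_b = [72, 80, 77, 69, 85]

effect = rank_biserial(method_a, method_b)
print(f"rank-biserial r = {effect:.2f}")
```

A value near 0 means neither group tends to score higher; values near -1 or 1 mean almost every pairwise comparison favors the same group.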


## 🟩 Part V: Parametric vs Non-Parametric Summary


| Scenario | Parametric | Non-Parametric |
|----------|------------|----------------|
| Two groups | t-test | Mann–Whitney U |
| >2 groups | ANOVA | Kruskal–Wallis |
| Assumptions | Normality, equal variances | Minimal (independent observations) |
| Interpretation | Mean difference | Distribution shift |


## Summary

This notebook demonstrated:

- Why and when parametric assumptions fail
- How to test normality and inspect distributions
- Non-parametric alternatives for two-sample and multi-group comparisons
- Robust post-hoc testing strategies

Non-parametric tests complement parametric methods: when both point to
the same result, the conclusion does not hinge on distributional assumptions.
