[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/wasim/Data-Science/blob/main/data-analyst-roadmap/05_statistics_for_data_analysis/08_non_parametric_tests.ipynb)

# Non-Parametric Tests

Use non-parametric tests when the assumptions of parametric tests are violated (e.g., the data are not normally distributed).

## When to use?
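- The data distribution is unknown or non-normal
- The sample is small, so normality cannot be checked reliably
- The data are ordinal (ranks)
- Outliers are present that would distort means and variances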
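
One way to check the first condition is a formal normality test. The sketch below applies the Shapiro-Wilk test (`scipy.stats.shapiro`) to two made-up samples. The helper name `looks_normal`, the seed, and the 0.05 threshold are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# Hypothetical samples: one roughly normal, one strongly right-skewed
normal_sample = rng.normal(loc=50, scale=5, size=200)
skewed_sample = rng.exponential(scale=10, size=200)

def looks_normal(sample, alpha=0.05):
    """Return True if Shapiro-Wilk does not reject normality at level alpha."""
    _, p_value = stats.shapiro(sample)
    return bool(p_value >= alpha)

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    verdict = "parametric test is plausible" if looks_normal(sample) else "consider a non-parametric test"
    print(f"{name}: {verdict}")
```

Shapiro-Wilk has little power on very small samples, so failing to reject normality is not proof that the data are normal. A histogram or Q-Q plot is a useful companion check.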

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

sns.set_style('whitegrid')
np.random.seed(42)

## 1. Mann-Whitney U Test
**Alternative to:** Independent t-test.
Tests whether two independent samples come from the same distribution, i.e., whether values in one group tend to be larger than values in the other.

In [None]:
# Generate non-normal data (exponential)
group_a = np.random.exponential(scale=10, size=30)
group_b = np.random.exponential(scale=15, size=30)

# Visualize
plt.figure(figsize=(10, 5))
sns.histplot(group_a, color='blue', alpha=0.5, label='A')
sns.histplot(group_b, color='red', alpha=0.5, label='B')
plt.legend()
plt.title('Non-Normal Distributions')
plt.show()

# Perform test (two-sided: the groups may differ in either direction)
u_stat, p_val = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')
print(f"Mann-Whitney U Test p-value: {p_val:.4f}")
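
To see what the U statistic measures, it can be computed by hand from ranks. The sketch below uses small made-up arrays `a` and `b` (assumptions chosen so the arithmetic is easy to verify) and shows that U for a group equals the number of cross-group pairs in which that group's value is larger.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, small enough to verify by hand
a = np.array([1.1, 2.3, 3.8, 4.0])
b = np.array([2.9, 5.5, 6.1, 7.4, 8.0])

# Rank all observations together (ties would receive average ranks)
ranks = stats.rankdata(np.concatenate([a, b]))
rank_sum_a = ranks[:len(a)].sum()

# U for group a: its rank sum minus the smallest rank sum len(a) values can have
u_a = rank_sum_a - len(a) * (len(a) + 1) / 2

# Equivalent definition: count the (a, b) pairs in which a is larger
pairs_a_larger = (a[:, None] > b[None, :]).sum()

# Share of pairs in which a exceeds b, usable as a simple effect size
prob_a_larger = u_a / (len(a) * len(b))

print(f"U for group a: {u_a}")
print(f"Pairs with a > b: {pairs_a_larger}")
print(f"Estimated P(a > b): {prob_a_larger:.2f}")
```

Reporting the share of pairs alongside the p-value indicates how strongly one group tends to exceed the other, which the p-value alone does not show.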

## 2. Wilcoxon Signed-Rank Test
**Alternative to:** Paired t-test.
Tests whether repeated measurements on the same subjects (e.g., before/after) differ, using the signed ranks of the paired differences.

In [None]:
# Paired data (before/after)
before = np.array([10, 20, 30, 40, 50])
after = np.array([12, 25, 35, 42, 58])

w_stat, p_val = stats.wilcoxon(before, after)
print(f"Wilcoxon Test p-value: {p_val:.4f}")
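
The signed-rank statistic can also be worked out by hand for the same before/after values. This is a minimal sketch that assumes no pair has a zero difference. The names `w_plus` and `w_minus` are illustrative.

```python
import numpy as np
from scipy import stats

before = np.array([10, 20, 30, 40, 50])
after = np.array([12, 25, 35, 42, 58])

# Paired differences: [2, 5, 5, 2, 8]
diffs = after - before

# Rank the magnitudes; tied magnitudes share an average rank
ranks = stats.rankdata(np.abs(diffs))

# Sum the ranks separately for positive and negative differences
w_plus = ranks[diffs > 0].sum()
w_minus = ranks[diffs < 0].sum()

print(f"Ranks of |differences|: {ranks}")
print(f"Sum of positive ranks: {w_plus}")
print(f"Sum of negative ranks: {w_minus}")
```

For a two-sided test, `stats.wilcoxon` reports the smaller of the two rank sums. Here every subject increased, so the negative rank sum is zero.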

## 3. Kruskal-Wallis H Test
**Alternative to:** One-Way ANOVA.
Compares three or more independent groups.

In [None]:
# Three independent groups with slightly different means
g1 = np.random.normal(10, 3, 20)
g2 = np.random.normal(12, 3, 20)
g3 = np.random.normal(11, 3, 20)

h_stat, p_val = stats.kruskal(g1, g2, g3)
print(f"Kruskal-Wallis Test p-value: {p_val:.4f}")
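
A significant Kruskal-Wallis result indicates only that at least one group differs, not which one. One common follow-up, sketched here as an assumption rather than the notebook's own method, is to run pairwise Mann-Whitney U tests with a Bonferroni correction. The group data and seed below are made up for illustration.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # seed chosen only for reproducibility

# Hypothetical groups mirroring the setup above
groups = {
    "g1": rng.normal(10, 3, 20),
    "g2": rng.normal(12, 3, 20),
    "g3": rng.normal(11, 3, 20),
}

pairs = list(combinations(groups, 2))  # every unordered pair of group names
results = []
for name_x, name_y in pairs:
    _, p_raw = stats.mannwhitneyu(groups[name_x], groups[name_y], alternative="two-sided")
    # Bonferroni: multiply by the number of comparisons, capped at 1
    p_adj = min(float(p_raw) * len(pairs), 1.0)
    results.append((name_x, name_y, float(p_raw), p_adj))

for name_x, name_y, p_raw, p_adj in results:
    print(f"{name_x} vs {name_y}: raw p = {p_raw:.4f}, Bonferroni p = {p_adj:.4f}")
```

Bonferroni is conservative. It keeps the family-wise error rate at the chosen level, at the cost of some power when many pairs are compared.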

## Practice Exercise
Test a hypothesis on a skewed dataset.

In [None]:
# Load the tips dataset
# Test whether the tip amount differs by 'sex' using the Mann-Whitney U test
# (in seaborn's tips dataset the column is named 'tip')
# Your code here