# Statistical Hypothesis Testing Cookbook

A reference covering common statistical tests, each with a short worked example on simulated data.

**Topics Covered:**
- t-tests (one-sample, two-sample, paired)
- Chi-squared tests
- ANOVA
- Non-parametric alternatives

In [None]:
import numpy as np
import scipy.stats as stats
import pandas as pd

## 1. One-Sample t-Test

**Question:** Is the sample mean significantly different from a hypothesized population mean?

**Example:** Testing if average student test scores differ from 75.

In [None]:
# Sample data
np.random.seed(42)
scores = np.random.normal(78, 8, 100)

# Perform test
t_stat, p_value = stats.ttest_1samp(scores, 75)

print('One-Sample t-Test Results:')
print(f't-statistic: {t_stat:.2f}')
print(f'p-value: {p_value:.4f}')
print(f'Result: {"REJECT" if p_value < 0.05 else "FAIL TO REJECT"} null hypothesis (mean score equals 75)')

## 2. Two-Sample t-Test

**Question:** Do two independent groups have different means?

**Example:** Comparing test scores between two classes.

In [None]:
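# Simulated scores for two independent classes (note the different spreads)
class_a = np.random.normal(78, 8, 50)
class_b = np.random.normal(72, 10, 50)

# Welch's t-test: does not assume equal variances across the two groups
t_stat, p_value = stats.ttest_ind(class_a, class_b, equal_var=False)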

print('Two-Sample t-Test Results:')
print(f'Class A mean: {class_a.mean():.1f}')
print(f'Class B mean: {class_b.mean():.1f}')
print(f't-statistic: {t_stat:.2f}')
print(f'p-value: {p_value:.4f}')
print(f'Result: {"REJECT" if p_value < 0.05 else "FAIL TO REJECT"} null hypothesis (class means are equal)')

## 3. Chi-Squared Test

**Question:** Is there an association between two categorical variables?

**Example:** Testing independence between gender and course preference.

In [None]:
# Contingency table
observed = np.array([
    [30, 20, 15],  # Male preferences
    [15, 35, 25]   # Female preferences
])

chi2, p_value, dof, expected = stats.chi2_contingency(observed)

print('Chi-Squared Test Results:')
print(f'Chi² statistic: {chi2:.2f}')
print(f'p-value: {p_value:.4f}')
print(f'Result: {"REJECT" if p_value < 0.05 else "FAIL TO REJECT"} null hypothesis (variables are independent)')

## 4. ANOVA (Analysis of Variance)

**Question:** Do three or more groups have different means?

**Example:** Comparing performance across multiple teaching methods.

In [None]:
method_a = np.random.normal(75, 8, 40)
method_b = np.random.normal(80, 7, 40)
method_c = np.random.normal(78, 9, 40)

f_stat, p_value = stats.f_oneway(method_a, method_b, method_c)

print('One-Way ANOVA Results:')
print(f'F-statistic: {f_stat:.2f}')
print(f'p-value: {p_value:.4f}')
print(f'Result: {"REJECT" if p_value < 0.05 else "FAIL TO REJECT"} null hypothesis (all group means are equal)')

## Interpretation Guidelines

### p-value Interpretation
- **p < 0.01:** Very strong evidence against the null hypothesis
- **0.01 ≤ p < 0.05:** Strong evidence (0.05 is the conventional threshold)
- **0.05 ≤ p < 0.10:** Weak evidence
- **p ≥ 0.10:** Insufficient evidence to reject the null hypothesis

### Effect Size
Always report effect size alongside p-values:
- **Cohen's d** for t-tests
- **η²** (eta-squared) for ANOVA
- **Cramér's V** for chi-squared
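The three effect sizes listed above can be computed in a few lines. The following is a minimal sketch, not a library API: the helper names `cohens_d`, `eta_squared`, and `cramers_v` are hypothetical, Cohen's d is shown in its pooled-standard-deviation form for two independent samples, and Cramér's V is computed from the uncorrected chi-squared statistic.

```python
import numpy as np
import scipy.stats as stats


def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)


def eta_squared(*groups):
    """Eta-squared for one-way ANOVA: between-group sum of squares / total sum of squares."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_total = ((all_values - grand_mean) ** 2).sum()
    return ss_between / ss_total


def cramers_v(table):
    """Cramér's V for an r x c contingency table (no continuity correction)."""
    table = np.asarray(table)
    chi2 = stats.chi2_contingency(table, correction=False)[0]
    n = table.sum()
    k = min(table.shape) - 1  # smaller table dimension minus one
    return np.sqrt(chi2 / (n * k))


# Example call on the gender-by-course-preference table from section 3
observed = np.array([[30, 20, 15], [15, 35, 25]])
print(f"Cramér's V: {cramers_v(observed):.2f}")
```

Called as `cohens_d(class_a, class_b)` or `eta_squared(method_a, method_b, method_c)`, these helpers would sit next to the p-values printed in the earlier cells.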

### Assumptions
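1. **t-test:** Normality, independence; the standard two-sample test also assumes equal variances (Welch's variant, `equal_var=False`, relaxes this)
2. **ANOVA:** Normality, homogeneity of variance, independence
3. **Chi-squared:** Independence, expected frequency ≥ 5 in each cell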
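The assumptions above can be checked before running a test. The sketch below is one possible approach, not a prescribed workflow: it uses Shapiro-Wilk for normality, Levene's test for homogeneity of variance, and the expected counts returned by `chi2_contingency`. The groups `group_a`, `group_b`, and `group_c` are illustrative simulated data, and the rank-based Mann-Whitney U and Kruskal-Wallis tests are shown as common fallbacks when normality looks doubtful.

```python
import numpy as np
import scipy.stats as stats

# Illustrative simulated groups (assumed for this sketch only)
rng = np.random.default_rng(0)
group_a = rng.normal(78, 8, 50)
group_b = rng.normal(72, 10, 50)
group_c = rng.normal(75, 9, 50)

# Normality: Shapiro-Wilk on each group (a small p-value suggests non-normality)
for name, group in [('A', group_a), ('B', group_b), ('C', group_c)]:
    w_stat, shapiro_p = stats.shapiro(group)
    print(f'Group {name}: Shapiro-Wilk W = {w_stat:.3f}, p = {shapiro_p:.4f}')

# Homogeneity of variance: Levene's test across all groups
lev_stat, lev_p = stats.levene(group_a, group_b, group_c)
print(f'Levene: W = {lev_stat:.2f}, p = {lev_p:.4f}')

# Chi-squared: every expected cell count should be at least 5
observed = np.array([[30, 20, 15], [15, 35, 25]])
expected = stats.chi2_contingency(observed)[3]
min_expected = expected.min()
print(f'Smallest expected count: {min_expected:.1f}')

# Rank-based fallbacks that do not assume normality
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')
h_stat, h_p = stats.kruskal(group_a, group_b, group_c)
print(f'Mann-Whitney U: U = {u_stat:.1f}, p = {u_p:.4f}')
print(f'Kruskal-Wallis: H = {h_stat:.2f}, p = {h_p:.4f}')
```

Independence is not covered here because it follows from the study design and cannot be verified from the data alone.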