
---

## ✅ **What is Hypothesis Testing?**

Hypothesis Testing is a **statistical method to make decisions** using sample data. We test a claim (hypothesis) about a population parameter.

---

## 🔑 **Basic Terms**

| Term                                  | Meaning                                                                                 |
| ------------------------------------- | --------------------------------------------------------------------------------------- |
| **Null Hypothesis (H₀)**              | No effect, no difference, or no relationship                                            |
| **Alternative Hypothesis (H₁ or Ha)** | There **is** an effect, difference, or relationship                                     |
| **p-value**                           | Probability of observing data at least as extreme as the sample, assuming H₀ is true    |
| **α (alpha)**                         | Significance level: the p-value threshold for rejecting H₀ (usually 0.05)               |
| **Test Statistic**                    | Numeric value calculated from the sample and used to decide whether to reject H₀        |

---

## 🎯 **Steps in Hypothesis Testing**

1. **State the Hypotheses**

   * H₀: µ₁ = µ₂ (no difference)
   * H₁: µ₁ ≠ µ₂ (two-sided) or µ₁ > µ₂ / µ₁ < µ₂ (one-sided)

2. **Set Significance Level (α)**

   * Common choices: 0.05, 0.01

3. **Choose the Right Test**

   * Depends on the variable types (numeric vs categorical), the number of groups, and whether the test's assumptions (e.g., normality) hold

4. **Calculate Test Statistic & p-value**

5. **Make Decision**

   * If `p < α`: Reject H₀ → statistically significant
   * If `p ≥ α`: Fail to reject H₀ → not statistically significant
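
The five steps above can be sketched end to end as follows. This is a minimal illustration, not a prescribed workflow: the two samples are synthetic, the names `group_a` and `group_b` are placeholders, and Welch's t-test is just one reasonable choice for step 3.

```python
import numpy as np
from scipy.stats import ttest_ind

# Synthetic samples, generated only for illustration (not real measurements)
rng = np.random.default_rng(42)
group_a = rng.normal(loc=50.0, scale=5.0, size=200)
group_b = rng.normal(loc=55.0, scale=5.0, size=200)

# Step 1: H0: mean_a == mean_b, H1: mean_a != mean_b (two-sided)
# Step 2: set the significance level
alpha = 0.05

# Steps 3-4: Welch's two-sample t-test (does not assume equal variances)
t_stat, p = ttest_ind(group_a, group_b, equal_var=False)

# Step 5: compare p with alpha and state the decision
decision = "Reject H0" if p < alpha else "Fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p:.3g} -> {decision}")
```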

---

## 🔍 **Which Test to Use?**

| Variable Type       | Test                               | Use Case                                                     |
| ------------------- | ---------------------------------- | ------------------------------------------------------------ |
| Num vs Num          | **Pearson / Spearman Correlation** | Relationship between 2 continuous variables                  |
| Num vs Binary Cat   | **T-Test / Point Biserial**        | Compare means between 2 groups (e.g., Male vs Female on BMI) |
| Num vs Multi-Cat    | **One-Way ANOVA / Kruskal-Wallis** | Compare means (or distributions) across 3+ groups            |
| Cat vs Cat          | **Chi-Square / Fisher's Exact**    | Association between 2 categorical variables                  |
| 1 Num vs Population | **One-sample t-test**              | Compare a sample mean with a known or hypothesized mean      |
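
The one-sample t-test in the last row has no example elsewhere in these notes, so here is a small sketch. The sample values and the hypothesized mean `mu0` are made up for illustration. The manual lines show how the test statistic and two-sided p-value arise, and should agree with `scipy.stats.ttest_1samp`.

```python
import numpy as np
from scipy import stats

# Made-up sample and hypothesized population mean (illustrative only)
sample = np.array([5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.9, 5.3, 5.5])
mu0 = 5.0

# Library call: H0 is "population mean == mu0"
t_stat, p = stats.ttest_1samp(sample, popmean=mu0)

# Same computation by hand: t = (mean - mu0) / (s / sqrt(n))
n = sample.size
t_manual = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

print(f"t = {t_stat:.3f}, p = {p:.4f}")
```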

---

## 📌 **Examples of Hypotheses**

### 🧪 1. T-Test (Numeric vs Binary Categorical)

* **Scenario**: Does Smoking affect BMI?
* H₀: Mean BMI (smoker) = Mean BMI (non-smoker)
* H₁: Mean BMI (smoker) ≠ Mean BMI (non-smoker)

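```python
from scipy.stats import ttest_ind

# `data` is assumed to be a pandas DataFrame with 'Smoking' and 'BMI' columns
group1 = data.loc[data['Smoking'] == 'Yes', 'BMI'].dropna()
group2 = data.loc[data['Smoking'] == 'No', 'BMI'].dropna()

# Student's t-test by default; pass equal_var=False for Welch's test if variances differ
t_stat, p = ttest_ind(group1, group2)
print(f"T = {t_stat:.3f}, p = {p:.3f}")
```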

---

### 🧪 2. ANOVA (Numeric vs Categorical)

* **Scenario**: Does Region affect Income?
* H₀: All region means are equal
* H₁: At least one region mean is different

```python
from scipy.stats import f_oneway

# One array of incomes per region; `data` is assumed to be a pandas DataFrame
groups = [g['Income'].dropna() for _, g in data.groupby('Region')]
f_stat, p = f_oneway(*groups)
print(f"F = {f_stat:.3f}, p = {p:.3f}")
```

---

### 🧪 3. Chi-Square (Cat vs Cat)

* **Scenario**: Is there a relationship between Gender and Purchase?
* H₀: Gender and Purchase are independent
* H₁: Gender and Purchase are dependent

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Contingency table of observed counts (rows: Gender, columns: Purchased)
table = pd.crosstab(data['Gender'], data['Purchased'])
chi2, p, dof, expected = chi2_contingency(table)
print(f"Chi² = {chi2:.3f}, p = {p:.3f}")
```

---

## ⚠️ Common Mistakes to Avoid

* Don’t say **“accept the null”** → say **“fail to reject”**
* A small `p` does not imply a large effect: **check the effect size**
* Hypothesis tests **do not prove** anything; they only measure the evidence against H₀
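
To make the effect-size point concrete, here is a minimal sketch of Cohen's d for two independent groups using the pooled standard deviation. The helper name `cohens_d` and the BMI values are hypothetical; other effect-size measures may suit your data better.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    # Pooled variance weights each sample variance by its degrees of freedom
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


# Made-up BMI values, for illustration only
smokers_bmi = [24.1, 26.3, 25.0, 27.2, 23.8, 26.9]
non_smokers_bmi = [23.5, 24.8, 24.0, 25.9, 23.1, 25.2]

d = cohens_d(smokers_bmi, non_smokers_bmi)
print(f"Cohen's d = {d:.2f}")
```

A common rule of thumb reads |d| ≈ 0.2 as small, 0.5 as medium, and 0.8 as large, but these cutoffs are conventions rather than rules.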

---

## 🧠 Final Tip: Visualize Before You Test!

* Use boxplots, scatterplots, and histograms to understand your data's distribution before applying any test.

---

### ✅ 1. **Numeric vs Numeric**

**Test:** Pearson Correlation

**Use When:** Checking **linear** relationship between two continuous variables (e.g., Age vs. Income)

```python
from scipy.stats import pearsonr

# `data` is assumed to be a pandas DataFrame with numeric 'Age' and 'Income' columns
r, p = pearsonr(data['Age'], data['Income'])
print(f"Pearson r = {r:.3f}, p = {p:.3f}")
```

**Example Output:**

📊 **Pearson r = 0.312, p = 0.001**

✅ There is a statistically significant positive linear relationship between Age and Income (moderate in strength, since r ≈ 0.31).

❌ If `p ≥ 0.05`: No significant linear relationship detected.

---

### ✅ 2. **Numeric vs Binary Categorical**

**Test:** Point Biserial Correlation

**Use When:** One variable is numeric and the other is a binary category (e.g., BMI vs Smoking Yes/No)

```python
from scipy.stats import pointbiserialr

# Encode the binary category as 0/1 before computing the correlation
data['Smoking_binary'] = data['Smoking'].map({'Yes': 1, 'No': 0})
r, p = pointbiserialr(data['Smoking_binary'], data['BMI'])
print(f"Point Biserial r = {r:.3f}, p = {p:.3f}")
```

**Output Interpretation:**

✅ If `p < 0.05`: Significant association between smoking status and BMI (equivalent to a difference in mean BMI between smokers and non-smokers).

❌ If `p ≥ 0.05`: No significant difference in BMI detected.

---

### ✅ 3. **Numeric vs Categorical (>2 levels)**

**Test:** One-Way ANOVA

**Use When:** Comparing means of numeric variable across 3+ groups (e.g., BMI by Education Level)


```python
from scipy.stats import f_oneway

# One array of BMI values per education level
grouped_data = [group['BMI'].values for _, group in data.groupby('Education')]
f_stat, p = f_oneway(*grouped_data)
print(f"ANOVA F = {f_stat:.3f}, p = {p:.3f}")
```

**Output Interpretation:**

✅ If `p < 0.05`: At least one group has a different mean BMI.

❌ If `p ≥ 0.05`: No significant mean difference detected across education levels.

---

### ✅ 4. **Numeric vs Multi-Categorical (non-ordinal)**

**Test Options:**

* **One-Way ANOVA** if each group is roughly normal with similar variances

* **Kruskal-Wallis Test** if the data is non-normal or has outliers

```python
from scipy.stats import kruskal

# One array of BMI values per region; the test is rank-based, so no normality assumption
groups = [group['BMI'].values for _, group in data.groupby('Region')]
h_stat, p = kruskal(*groups)
print(f"Kruskal-Wallis H = {h_stat:.3f}, p = {p:.3f}")
```

**Interpretation:**

✅ If `p < 0.05`: Significant difference in BMI distribution across regions.

❌ If `p ≥ 0.05`: No statistically significant difference detected.
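
One possible way to choose between the two options above is to check the assumptions first. The sketch below uses Shapiro-Wilk for per-group normality and Levene's test for equal variances. The helper `compare_groups` and the synthetic skewed data are assumptions for illustration, and formal pre-tests like these are a rough guide, not a substitute for looking at the data.

```python
import numpy as np
from scipy.stats import f_oneway, kruskal, levene, shapiro


def compare_groups(groups, alpha=0.05):
    """Run ANOVA if assumption checks pass, otherwise fall back to Kruskal-Wallis."""
    # Shapiro-Wilk: H0 is "this group is normally distributed"
    looks_normal = all(shapiro(g).pvalue >= alpha for g in groups)
    # Levene: H0 is "all groups have equal variances"
    equal_variances = levene(*groups).pvalue >= alpha
    if looks_normal and equal_variances:
        stat, p = f_oneway(*groups)
        return "ANOVA", stat, p
    stat, p = kruskal(*groups)
    return "Kruskal-Wallis", stat, p


# Synthetic, strongly right-skewed groups (illustrative only)
rng = np.random.default_rng(1)
skewed_groups = [rng.exponential(scale=s, size=100) for s in (1.0, 1.0, 3.0)]

test_name, stat, p = compare_groups(skewed_groups)
print(f"{test_name}: statistic = {stat:.3f}, p = {p:.3g}")
```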

---

### ✅ 5. **Categorical vs Categorical (both binary or low cardinality)**

**Test:** Chi-Square Test of Independence

**Use When:** You want to test if two categorical variables are independent (e.g., Gender vs. Smoking)

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Contingency table of observed counts (rows: Gender, columns: Smoking)
contingency_table = pd.crosstab(data['Gender'], data['Smoking'])
chi2, p, dof, expected = chi2_contingency(contingency_table)
print(f"Chi2 = {chi2:.3f}, p = {p:.3f}")
```

**Interpretation:**

✅ If `p < 0.05`: Gender and Smoking status are dependent.

❌ If `p ≥ 0.05`: No significant association detected between Gender and Smoking.


---

### ✅ 6. **Categorical vs Multi-Categorical**

**Test:** Chi-Square Test (same as above)

**But watch for:**

* High cardinality categories (e.g., 10+ unique values)

* Sparse data in the contingency table (a common rule of thumb: expected counts below 5) → Use **Fisher’s Exact Test** if the table is 2x2 and small.


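```python
import pandas as pd
from scipy.stats import chi2_contingency

# Chi-Square still works; inspect `expected` to spot sparse cells
contingency = pd.crosstab(data['Region'], data['Product_Category'])
chi2, p, dof, expected = chi2_contingency(contingency)
print(f"Chi2 = {chi2:.3f}, p = {p:.3f}")
```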

**Interpretation:**

✅ If `p < 0.05`: Region and Product Category are associated.

❌ If `p ≥ 0.05`: No significant association detected.
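
For the small 2x2 case mentioned above, Fisher's Exact Test can be run with `scipy.stats.fisher_exact`. The counts below are made up for illustration; with real data you would build the table from `pd.crosstab`.

```python
from scipy.stats import fisher_exact

# Made-up 2x2 table of counts (rows: group A / group B, columns: Yes / No)
table = [[8, 2],
         [1, 5]]

# Returns the sample odds ratio and the exact p-value
odds_ratio, p = fisher_exact(table, alternative="two-sided")
print(f"Odds ratio = {odds_ratio:.2f}, p = {p:.4f}")
```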

---

### 🧠 Summary Table


| Type                           | Test                        | Function (in `scipy.stats`)         |
| ------------------------------ | --------------------------- | ----------------------------------- |
| Numeric vs Numeric             | Pearson Correlation         | `pearsonr`                          |
| Numeric vs Binary Categorical  | Point Biserial              | `pointbiserialr`                    |
| Numeric vs Categorical (>2)    | ANOVA / Kruskal-Wallis      | `f_oneway` / `kruskal`              |
| Categorical vs Categorical     | Chi-Square / Fisher         | `chi2_contingency` / `fisher_exact` |
| Categorical (high cardinality) | Chi-Square (watch sparsity) | `chi2_contingency`                  |

---

### 🔁 Bonus Tips (Different Scenarios):

* **Spearman Correlation**: Use instead of Pearson if the relationship is **nonlinear but monotonic**, or if the data has outliers.
* **T-test**: Can be used instead of Point Biserial for two-group mean comparison.
* **Mann-Whitney U**: Non-parametric alternative to the two-sample t-test when normality is doubtful.
* **Post-hoc Tests**: Use after a significant ANOVA (e.g., Tukey's HSD) to find which groups differ.
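
The alternatives in the tips above can be sketched with small made-up arrays. All values are illustrative, and `tukey_hsd` assumes a reasonably recent SciPy (it was added in version 1.8).

```python
import numpy as np
from scipy.stats import mannwhitneyu, spearmanr, tukey_hsd

# Spearman: a monotonic but nonlinear relationship (y = x^3)
x = np.arange(1, 11)
y = x ** 3
rho, p_rho = spearmanr(x, y)
print(f"Spearman rho = {rho:.3f}, p = {p_rho:.3g}")

# Mann-Whitney U: two independent groups, no normality assumption
a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]
u_stat, p_u = mannwhitneyu(a, b, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_u:.4f}")

# Tukey's HSD: pairwise comparisons after ANOVA
g1 = [10.0, 11.0, 12.0, 13.0, 14.0]
g2 = [10.5, 11.5, 12.5, 13.5, 14.5]
g3 = [30.0, 31.0, 32.0, 33.0, 34.0]
res = tukey_hsd(g1, g2, g3)
# res.pvalue[i, j] is the adjusted p-value for group i vs group j
print(res.pvalue.round(4))
```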

---

