# Lab: One-Way ANOVA – Comparing Three Weight-Loss Interventions

## Scenario  
A wellness clinic is evaluating three interventions for weight loss over 8 weeks:
1. **Control**: no intervention  
2. **Diet + Exercise**  
3. **Diet Pill**  

They want to know whether the **average weight loss** differs among these three groups.  Use a one-way ANOVA to test for overall differences, then follow up with post-hoc comparisons if appropriate.

---

## Dataset  
Simulate or load data with columns:

| ParticipantID | Group             | WeightLoss |
|---------------|-------------------|------------|
| 1             | Control           | 1.8        |
| 2             | Diet+Exercise     | 4.7        |
| 3             | DietPill          | 5.2        |
| …             | …                 | …          |

**Python simulation example:**


In [5]:
# Run this Code
import numpy as np
import pandas as pd

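np.random.seed(42)  # fixed seed so the results are reproducible
n = 30              # participants per group

# Simulated weight loss: normal(mean, sd = 1.0) for each group
control       = np.random.normal(2.0, 1.0, n)
diet_exercise = np.random.normal(4.5, 1.0, n)
diet_pill     = np.random.normal(5.0, 1.0, n)

df = pd.DataFrame({
    "ParticipantID": np.arange(1, 3 * n + 1),
    "Group": ["Control"]*n + ["Diet+Exercise"]*n + ["DietPill"]*n,
    "WeightLoss": np.concatenate([control, diet_exercise, diet_pill])
})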

---

## Steps

### 1. State Hypotheses & Significance Level  
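- **H₀:** μ(Control) = μ(Diet+Exercise) = μ(DietPill), i.e. mean weight loss is the same in all three groups.
- **H₁:** At least one group mean differs from the others.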
- **α = 0.05**

---

### 2. Check Assumptions  
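1. **Independence:** Participants are randomly assigned to groups, and each participant contributes one observation.  
2. **Normality:** Residuals within each group are approximately normal (with n ≥ 30 per group, the F-test is robust to mild departures).  
3. **Homogeneity of Variance:** Group variances are similar (check with Levene’s test).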
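
The normality and equal-variance checks listed above can be sketched as follows. This is a minimal example, not part of the original lab: it assumes SciPy is installed, re-simulates the same data so the cell runs on its own, and uses the Shapiro-Wilk test as one common choice for checking normality.

```python
import numpy as np
from scipy import stats

# Re-create the simulated data so this cell is self-contained
np.random.seed(42)
n = 30
groups = {
    "Control": np.random.normal(2.0, 1.0, n),
    "Diet+Exercise": np.random.normal(4.5, 1.0, n),
    "DietPill": np.random.normal(5.0, 1.0, n),
}

# Normality: Shapiro-Wilk on each group's residuals (value minus group mean)
shapiro_p = {}
for name, values in groups.items():
    residuals = values - values.mean()
    stat, p = stats.shapiro(residuals)
    shapiro_p[name] = p
    print(f"{name}: Shapiro-Wilk W = {stat:.3f}, p = {p:.3f}")

# Homogeneity of variance: Levene's test (median-centered variant)
levene_stat, levene_p = stats.levene(*groups.values(), center="median")
print(f"Levene: W = {levene_stat:.3f}, p = {levene_p:.3f}")
```

A p-value above α = 0.05 in either test means there is no evidence against the corresponding assumption; it does not prove that the assumption holds.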


In [6]:
df.groupby("Group")['WeightLoss'].agg(['mean','std','count'])

| Group         | mean     | std      | count |
|---------------|----------|----------|-------|
| Control       | 1.811853 | 0.900006 | 30    |
| Diet+Exercise | 4.378838 | 0.931102 | 30    |
| DietPill      | 5.012885 | 0.991983 | 30    |



---

### 3. Compute Summary Statistics
Compute summary statistics (count, mean, standard deviation) per group using pandas' `groupby` method:

In [None]:
# Your Code Here

| Group          | count | mean | std  |
|----------------|-------|------|------|
| Control        | 30    | 1.81 | 0.90 |
| Diet+Exercise  | 30    | 4.38 | 0.93 |
| DietPill       | 30    | 5.01 | 0.99 |

---

### 4. Perform One-Way ANOVA


In [None]:
!pip install statsmodels

In [7]:

import statsmodels.api as sm
from statsmodels.formula.api import ols

# Fit a linear model with Group as a categorical predictor
model = ols('WeightLoss ~ C(Group)', data=df).fit()

# Type II sums of squares (note: the keyword is `typ`, not `type`)
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)

              sum_sq    df          F        PR(>F)
C(Group)  172.380287   2.0  97.170646  6.720959e-23
Residual   77.168804  87.0        NaN           NaN


**Example output:**

|           | sum_sq | df | F      | PR(>F)  |
|-----------|--------|----|--------|---------|
| C(Group)  | 172.38 | 2  | 97.17  | <0.0001 |
| Residual  | 77.17  | 87 |        |         |

- **F** ≈ 97.17, **p** < 0.0001 → reject H₀.
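
To see where the F statistic comes from, the sums of squares can be computed by hand and compared with a library routine. This is an illustrative sketch, not part of the original lab: it assumes SciPy is available and re-simulates the same data so the cell runs on its own.

```python
import numpy as np
from scipy import stats

# Re-create the simulated data so this cell is self-contained
np.random.seed(42)
n = 30
samples = [
    np.random.normal(2.0, 1.0, n),  # Control
    np.random.normal(4.5, 1.0, n),  # Diet+Exercise
    np.random.normal(5.0, 1.0, n),  # DietPill
]

all_values = np.concatenate(samples)
grand_mean = all_values.mean()

# Between-group SS: group size times squared distance of group mean from grand mean
ss_between = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
# Within-group (residual) SS: squared deviations from each group's own mean
ss_within = sum(((s - s.mean()) ** 2).sum() for s in samples)

df_between = len(samples) - 1               # k - 1
df_within = len(all_values) - len(samples)  # N - k

# F is the ratio of the two mean squares
f_stat = (ss_between / df_between) / (ss_within / df_within)
p_value = stats.f.sf(f_stat, df_between, df_within)  # upper-tail probability

# Cross-check against SciPy's one-way ANOVA
f_check, p_check = stats.f_oneway(*samples)

print(f"F({df_between}, {df_within}) = {f_stat:.2f}, p = {p_value:.3g}")
```

The hand-computed sums of squares should match the `sum_sq` column of the statsmodels table, and `f_stat` should agree with `f_check`.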

---

### 5. Post-Hoc Tests (if ANOVA significant)

Use Tukey’s HSD to see which pairs differ:


In [9]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Compare every pair of groups while holding the family-wise error rate at 0.05
tukey = pairwise_tukeyhsd(df["WeightLoss"], df["Group"], alpha=0.05)
print(tukey)

      Multiple Comparison of Means - Tukey HSD, FWER=0.05       
    group1        group2    meandiff p-adj  lower  upper  reject
----------------------------------------------------------------
      Control Diet+Exercise    2.567    0.0 1.9871 3.1468   True
      Control      DietPill    3.201    0.0 2.6212 3.7809   True
Diet+Exercise      DietPill    0.634 0.0287 0.0542 1.2139   True
----------------------------------------------------------------
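
The `meandiff`, `lower`, and `upper` columns can be reproduced by hand, which helps when reading the table. The sketch below is an illustration under stated assumptions: it relies on `scipy.stats.studentized_range` (available in SciPy 1.7 and later), uses the equal-group-size form of the Tukey margin, and re-simulates the data so the cell runs on its own.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Re-create the simulated data so this cell is self-contained
np.random.seed(42)
n = 30
groups = {
    "Control": np.random.normal(2.0, 1.0, n),
    "Diet+Exercise": np.random.normal(4.5, 1.0, n),
    "DietPill": np.random.normal(5.0, 1.0, n),
}

k = len(groups)
df_within = k * n - k
# Pooled within-group variance (the residual mean square from the ANOVA)
mse = sum(((v - v.mean()) ** 2).sum() for v in groups.values()) / df_within

# Critical value of the studentized range distribution at alpha = 0.05
q_crit = stats.studentized_range.ppf(0.95, k, df_within)
# Half-width of each simultaneous 95% interval when all groups have size n
margin = q_crit * np.sqrt(mse / n)

mean_diffs = {}
for (name1, v1), (name2, v2) in combinations(groups.items(), 2):
    diff = v2.mean() - v1.mean()  # group2 minus group1, as in the table
    mean_diffs[(name1, name2)] = diff
    print(f"{name1} vs {name2}: diff = {diff:.3f}, "
          f"95% CI = [{diff - margin:.3f}, {diff + margin:.3f}]")
```

A pair is flagged `reject = True` exactly when its interval excludes zero, that is, when the absolute mean difference exceeds `margin`.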


---

### 6. Interpretation & Reporting

1. **ANOVA conclusion:**  
   > “F(2, 87) = 97.17, p < 0.001 → there is a significant difference in mean weight loss among the three groups.”

2. **Post-hoc insights:**  
   > “Tukey HSD shows DietPill > Diet+Exercise > Control (all pairwise adjusted p < 0.05; DietPill vs. Diet+Exercise: p = 0.029).”

3. **Business implications:**  
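   - The diet pill produces the largest mean weight loss: 3.20 more than control and 0.63 more than diet+exercise (95% CI 0.05 to 1.21).  
   - Diet+exercise is also significantly better than doing nothing (mean difference 2.57).  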
   - Recommend offering the pill for maximal effect or combining interventions in future studies.
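
A report often pairs the F-test with an effect size. As a hedged illustration (not part of the original lab), eta squared, the share of total variance explained by group membership, can be read straight off the ANOVA table printed in step 4:

```python
# Sums of squares copied from the ANOVA table in step 4
ss_between = 172.380287   # C(Group)
ss_residual = 77.168804   # Residual

# Eta squared = between-group SS divided by total SS
eta_squared = ss_between / (ss_between + ss_residual)
print(f"eta squared = {eta_squared:.3f}")
```

In this simulated data set, roughly 69% of the variance in weight loss is associated with the intervention group.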

---

### 7. Visualization (Optional)


In [None]:
# Your Code Here
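
One possible starting point is a box plot of weight loss by group. This is only a sketch: it assumes matplotlib is installed, re-simulates the data so the cell runs on its own, and leaves the y-axis unit unspecified because the lab does not state one.

```python
import matplotlib.pyplot as plt
import numpy as np

# Re-create the simulated data so this cell is self-contained
np.random.seed(42)
n = 30
groups = {
    "Control": np.random.normal(2.0, 1.0, n),
    "Diet+Exercise": np.random.normal(4.5, 1.0, n),
    "DietPill": np.random.normal(5.0, 1.0, n),
}

fig, ax = plt.subplots(figsize=(6, 4))
# One box per group; boxes are drawn at x positions 1, 2, 3
ax.boxplot(list(groups.values()))
ax.set_xticklabels(list(groups.keys()))
ax.set_xlabel("Group")
ax.set_ylabel("Weight loss")
ax.set_title("Weight loss by intervention group")
fig.tight_layout()
```

In a notebook the figure renders when the cell finishes; in a plain script, call `plt.show()` or `fig.savefig(...)` afterwards.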