# Project 4 – Module 5: Statistical Inference
## Lesson 6: Hypothesis Testing & Final Conclusions

| | |
|---|---|
| **Author** | Jose Marcel Lopez Pino |
| **Framework** | CRISP-DM + LEAN |
| **Phase** | 6 – Deployment |
| **Module** | 5 – Statistical Inference (Alkemy Bootcamp) |
| **Dataset** | Student Habits vs Academic Performance – Kaggle |
| **Date** | 2026-02 |

---

> **Executive Summary:**
> This notebook corresponds to Lesson 6 of Module 5 (Statistical Inference).
> All four hypotheses defined in Lesson 1 are formally tested using appropriate
> statistical methods. Each test reports t-statistic/z-statistic, p-value,
> effect size (Cohen's d), and 95% confidence interval. Type I and Type II
> errors are discussed in business context. Final conclusions translate
> statistical findings into actionable university wellness recommendations.

---

## Table of Contents

1. [CRISP-DM Phase 6 – Deployment](#1-crisp-dm-phase-6--deployment)
2. [Load Data](#2-load-data)
3. [H1 – Sleep Duration vs WHO Benchmark](#3-h1--sleep-duration-vs-who-benchmark)
4. [H2 – Exercise Frequency and Exam Score](#4-h2--exercise-frequency-and-exam-score)
5. [H3 – Sedentary Lifestyle Prevalence](#5-h3--sedentary-lifestyle-prevalence)
6. [H4 – Diet Quality and Academic Performance](#6-h4--diet-quality-and-academic-performance)
7. [Type I and Type II Errors in Context](#7-type-i-and-type-ii-errors-in-context)
8. [Hypotheses Summary – All Results](#8-hypotheses-summary--all-results)
9. [Final Conclusions & Business Recommendations](#9-final-conclusions--business-recommendations)
10. [Prescriptive Analysis – From Findings to Action](#10-prescriptive-analysis--from-findings-to-action)
11. [Deliverables Checklist](#11-deliverables-checklist)
12. [LEAN Retrospective](#12-lean-retrospective)
13. [Decisions Log – Lesson 6](#13-decisions-log--lesson-6)

---
## 1. CRISP-DM Phase 6 – Deployment

**Objective:** Execute all hypothesis tests defined in Lesson 1.
Translate statistical results into actionable business recommendations
for the University Health & Wellbeing Department.

**Lean Filter:** Every test result must connect to a specific intervention decision.
A statistically significant result without a business action is waste.

### Deliverables for this Phase

| Deliverable | Audience | Format | Status |
|-------------|----------|--------|--------|
| Technical Report (notebooks) | Data Science Team | Jupyter (EN) | ⏳ |
| Executive Summary | Health Director / Academic Senate | PDF/PPTX (ES) | ⏳ |
| Visualizations | All | PNG | ⏳ |
| GitHub Repository | Public / Portfolio | Git | ⏳ |

In [None]:
# ===== Environment Setup =====
from pathlib import Path

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.stats import ttest_1samp, ttest_ind, f_oneway
from statsmodels.stats.proportion import proportions_ztest

np.random.seed(42)
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette('Blues_d')

DATA_RAW        = Path('../data/raw')
REPORTS_FIGURES = Path('../reports/figures')
REPORTS_FIGURES.mkdir(parents=True, exist_ok=True)

print('Environment ready.')

---
## 2. Load Data

In [None]:
# ===== Load Dataset =====
CSV_FILE = DATA_RAW / 'student_habits_performance.csv'
df = pd.read_csv(CSV_FILE)

sleep    = df['sleep_hours'].dropna()
score    = df['exam_score'].dropna()
exercise = df['exercise_frequency'].dropna()

print(f'Dataset loaded: {df.shape[0]:,} rows x {df.shape[1]} columns')
print(f'Alpha = 0.05 for all tests (defined in Lesson 1)')

In [None]:
# ===== Standard Hypothesis Test Reporter =====
def report_test(test_name: str, t_stat: float, p_value: float,
                cohens_d: float, ci: tuple, alpha: float = 0.05) -> None:
    """Prints a standardized hypothesis test result summary.

    Args:
        test_name: Name of the hypothesis (e.g. 'H1').
        t_stat: t or z statistic.
        p_value: p-value from the test.
        cohens_d: Effect size (Cohen's d).
        ci: Tuple (lower, upper) confidence interval.
        alpha: Significance level (default 0.05).

    Returns:
        None
    """
    decision = 'Reject H₀' if p_value < alpha else 'Fail to reject H₀'
    d_interp = (
        'Negligible' if abs(cohens_d) < 0.2 else
        'Small'      if abs(cohens_d) < 0.5 else
        'Medium'     if abs(cohens_d) < 0.8 else
        'Large'
    )

    print(f'=== {test_name} Results ===')
    print(f'  t-statistic  : {t_stat:.4f}')
    print(f'  p-value      : {p_value:.4f}')
    # Double quotes here: the apostrophe in "Cohen's" would end a single-quoted string
    print(f"  Cohen's d    : {cohens_d:.4f}  ({d_interp} effect)")
    print(f'  95% CI       : ({ci[0]:.4f}, {ci[1]:.4f})')
    print(f'  α            : {alpha}')
    print(f'  Decision     : **{decision}**')
    print()

print('Reporter function ready.')

---
## 3. H1 – Sleep Duration vs WHO Benchmark

| | |
|--|--|
| **H₀** | μ_sleep = 7 hours |
| **H₁** | μ_sleep < 7 hours |
| **Test** | One-sample t-test (one-tailed, left) |
| **α** | 0.05 |
| **Business implication if H₀ rejected** | Sleep deprivation is prevalent → prioritize sleep hygiene programs |

In [None]:
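# ===== H1 – One-sample t-test =====
mu_0    = 7.0
alpha   = 0.05
n_sleep = len(sleep)

# One-tailed (left) test. Halving the two-sided p-value is only valid when t < 0,
# so the direction is passed explicitly (`alternative` requires SciPy >= 1.6).
t_stat_h1, p_h1 = ttest_1samp(sleep, popmean=mu_0, alternative='less')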

# Effect size
cohens_d_h1 = (sleep.mean() - mu_0) / sleep.std(ddof=1)

# 95% CI
se_h1 = sleep.std(ddof=1) / np.sqrt(n_sleep)
t_crit = stats.t.ppf(0.975, df=n_sleep - 1)
ci_h1 = (sleep.mean() - t_crit * se_h1, sleep.mean() + t_crit * se_h1)

report_test('H1 – Sleep Duration vs WHO Benchmark',
            t_stat_h1, p_h1, cohens_d_h1, ci_h1)

print('=== Business Interpretation ===')
if p_h1 < alpha:
    effect_h1 = 'small' if abs(cohens_d_h1) < 0.5 else 'medium/large'
    print(f'→ We REJECT H₀ at α={alpha}')
    print(f'→ Mean sleep ({sleep.mean():.2f}h) is significantly below the WHO 7h benchmark')
    print(f"→ Cohen's d = {cohens_d_h1:.2f}; effect is {effect_h1}")
    print('→ ACTION: Launch university sleep hygiene program')
else:
    print(f'→ We FAIL to reject H₀ at α={alpha}')
    print('→ No significant evidence that mean sleep is below 7h')

In [None]:
# ===== Plot H1 – Distribution with H₀ value and 95% CI =====
fig, ax = plt.subplots(figsize=(10, 4))

x = np.linspace(sleep.min() - 0.5, sleep.max() + 0.5, 300)
ax.hist(sleep, bins=30, density=True, color='#90CAF9', alpha=0.6,
        edgecolor='white', label='Observed sleep_hours')

from scipy.stats import norm as norm_dist
mu_s, std_s = sleep.mean(), sleep.std()
ax.plot(x, norm_dist.pdf(x, mu_s, std_s), color='#1565C0', lw=2, label='Normal fit')
ax.axvline(mu_0, color='orange', linestyle='--', lw=2, label=f'H₀: μ = {mu_0}h')
ax.axvline(sleep.mean(), color='red', linestyle='-', lw=2,
           label=f'X̄ = {sleep.mean():.2f}h')
ax.axvline(ci_h1[0], color='gray', linestyle=':', lw=1.5, label=f'95% CI lower = {ci_h1[0]:.2f}')
ax.axvline(ci_h1[1], color='gray', linestyle=':', lw=1.5, label=f'95% CI upper = {ci_h1[1]:.2f}')

ax.set_xlabel('sleep_hours')
ax.set_ylabel('Density')
ax.set_title(f'H1: Sleep Duration vs WHO Benchmark – p={p_h1:.4f}',
             fontsize=12, fontweight='bold')
ax.legend(fontsize=8)
plt.tight_layout()
output_path = REPORTS_FIGURES / 'lesson6_h1_sleep.png'
plt.savefig(output_path, dpi=150, bbox_inches='tight')
plt.show()
print(f'Figure saved: {output_path}')

---
## 4. H2 – Exercise Frequency and Exam Score

| | |
|--|--|
| **H₀** | μ_active = μ_sedentary (exam score) |
| **H₁** | μ_active > μ_sedentary |
| **Test** | Welch's independent-samples t-test (one-tailed, right) |
| **α** | 0.05 |
| **Business implication if H₀ rejected** | Physical activity programs have measurable academic ROI |

In [None]:
# ===== H2 – Independent samples t-test (Welch) =====
active    = df[df['exercise_frequency'] >= 3]['exam_score'].dropna()
sedentary = df[df['exercise_frequency'] < 3]['exam_score'].dropna()

print(f'Active group (≥3 days):   n={len(active):,}, mean={active.mean():.2f}')
print(f'Sedentary group (<3 days): n={len(sedentary):,}, mean={sedentary.mean():.2f}')
print()

# One-tailed (right) Welch test; `alternative` requires SciPy >= 1.6
t_stat_h2, p_h2 = ttest_ind(active, sedentary, equal_var=False, alternative='greater')

# Effect size (Cohen's d for two groups)
pooled_std = np.sqrt((active.std(ddof=1)**2 + sedentary.std(ddof=1)**2) / 2)
cohens_d_h2 = (active.mean() - sedentary.mean()) / pooled_std

# 95% CI for difference in means
se_diff = np.sqrt(active.var(ddof=1)/len(active) + sedentary.var(ddof=1)/len(sedentary))
df_welch = (active.var(ddof=1)/len(active) + sedentary.var(ddof=1)/len(sedentary))**2 / (
           (active.var(ddof=1)/len(active))**2/(len(active)-1) +
           (sedentary.var(ddof=1)/len(sedentary))**2/(len(sedentary)-1))
t_crit_h2 = stats.t.ppf(0.975, df=df_welch)
diff_mean = active.mean() - sedentary.mean()
ci_h2 = (diff_mean - t_crit_h2*se_diff, diff_mean + t_crit_h2*se_diff)

report_test('H2 – Exercise vs Exam Score', t_stat_h2, p_h2, cohens_d_h2, ci_h2)

print('=== Business Interpretation ===')
if p_h2 < alpha and t_stat_h2 > 0:
    print(f'→ We REJECT H₀ at α={alpha}')
    print(f'→ Active students score {diff_mean:.2f} points higher on average')
    print(f"→ Cohen's d = {cohens_d_h2:.2f}")
    print('→ ACTION: Invest in exercise facilities and activity incentives')
else:
    print(f'→ We FAIL to reject H₀ at α={alpha}')
    print('→ No significant evidence that active students score higher than sedentary students')

In [None]:
# ===== Plot H2 – Boxplot comparison =====
fig, ax = plt.subplots(figsize=(8, 4))

data_h2 = [sedentary.values, active.values]
bp = ax.boxplot(data_h2, patch_artist=True, vert=True,
                labels=['Sedentary (<3 days)', 'Active (≥3 days)'])
colors_h2 = ['#90CAF9', '#1565C0']
for patch, color in zip(bp['boxes'], colors_h2):
    patch.set_facecolor(color)
    patch.set_alpha(0.8)

ax.axhline(df['exam_score'].mean(), color='red', linestyle='--', lw=1.5,
           label=f'Overall mean = {df["exam_score"].mean():.2f}')
ax.set_ylabel('exam_score')
ax.set_title(f'H2: Exam Score by Exercise Group – p={p_h2:.4f}',
             fontsize=12, fontweight='bold')
ax.legend(fontsize=9)
plt.tight_layout()
output_path = REPORTS_FIGURES / 'lesson6_h2_exercise.png'
plt.savefig(output_path, dpi=150, bbox_inches='tight')
plt.show()
print(f'Figure saved: {output_path}')

---
## 5. H3 – Sedentary Lifestyle Prevalence

| | |
|--|--|
| **H₀** | p_sedentary = 0.50 |
| **H₁** | p_sedentary < 0.50 |
| **Test** | One-sample proportion z-test (one-tailed, **left**) |
| **α** | 0.05 |
| **Empirical finding** | p̂ = 41.2% sedentary; the majority (58.8%) is active |
| **Business implication if H₀ rejected** | Sedentarism is a **minority** behavior → targeted intervention for at-risk subgroup |

> **Note:** The original hypothesis assumed that more than 50% of students are sedentary.
> The data show the opposite: only 41.2% exercise fewer than 3 days/week. H₁ is reformulated
> to test whether sedentarism is significantly **below** 50%, which is still actionable
> because 412 students remain at risk.

In [None]:
# ===== H3 – Proportion z-test =====
sedentary_flag = (df['exercise_frequency'] < 3).astype(int)
n_h3   = len(sedentary_flag)
count  = sedentary_flag.sum()
p_hat  = sedentary_flag.mean()
p_null = 0.50

# alternative='smaller' already returns the one-tailed (left) p-value
z_stat_h3, p_h3 = proportions_ztest(count, n_h3, value=p_null, alternative='smaller')

# Effect size (Cohen's h for proportions)
cohens_h = 2 * np.arcsin(np.sqrt(p_hat)) - 2 * np.arcsin(np.sqrt(p_null))

# 95% CI for proportion
se_prop = np.sqrt(p_hat * (1 - p_hat) / n_h3)
z_crit  = stats.norm.ppf(0.975)
ci_h3   = (p_hat - z_crit * se_prop, p_hat + z_crit * se_prop)

print(f'p̂ observed = {p_hat:.4f} ({p_hat*100:.1f}%)')
print(f'p₀ null    = {p_null:.2f} (50%)')
print()
print('=== H3 Results ===')
print(f'  z-statistic  : {z_stat_h3:.4f}')
print(f'  p-value      : {p_h3:.4f}')
print(f"  Cohen's h    : {cohens_h:.4f}")
print(f'  95% CI       : ({ci_h3[0]:.4f}, {ci_h3[1]:.4f})')
print(f'  α            : {alpha}')
decision_h3 = 'Reject H₀' if p_h3 < alpha else 'Fail to reject H₀'
print(f'  Decision     : **{decision_h3}**')
print()
print('=== Business Interpretation ===')
if p_h3 < alpha:
    print(f'→ We REJECT H₀ at α={alpha}')
    print(f'→ p̂ = {p_hat*100:.1f}% sedentary, significantly BELOW 50%')
    # Derived from the data instead of hard-coded 58.8% and n=1000
    print(f'→ {(1 - p_hat)*100:.1f}% of students are active (positive baseline)')
    print(f'→ ACTION: Targeted program for the {p_hat*100:.1f}% at-risk minority ({int(count):,} students)')
else:
    print(f'→ We FAIL to reject H₀ at α={alpha}')
    print(f'→ p̂ = {p_hat*100:.1f}%: not significantly below 50%')

---
## 5b. H3 – Data-Driven Hypothesis Revision

### What happened and why it matters

During execution of the H3 proportion z-test, the result was:

| Output | Value | Expected (original H₁) |
|--------|-------|------------------------|
| z-statistic | −5.65 | Positive (right tail) |
| p-value | 1.000 | < 0.05 |
| Decision | Fail to reject H₀ | Reject H₀ |

### Root cause

The original hypothesis was formulated **before exploring the data** (Lesson 1),
assuming that sedentarism would be the majority behavior:

> H₁ (original): p_sedentary > 0.50

However, the actual data shows the opposite:

| Group | n | % |
|-------|---|---|
| Active (≥ 3 days/week) | 588 | **58.8%** |
| Sedentary (< 3 days/week) | 412 | 41.2% |

A **negative z-statistic** on a right-tailed test means p̂ is **below** the null value:
the test is looking in the wrong direction. A p-value of 1.0 is mathematically correct,
but it only says the data give no support to the right-tailed H₁; it does not describe
how far p̂ lies from 50%.
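
The z-statistic reported above can be reproduced by hand. The following is a minimal sketch using only the standard library, assuming the counts from the table (412 sedentary out of 1,000) and a standard error based on the sample proportion, which is the choice that matches the reported value. The function name `proportion_z` is illustrative and not part of the notebook.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(count: int, n: int, p_null: float) -> float:
    """z statistic for one proportion, with the sample proportion in the SE."""
    p_hat = count / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - p_null) / se

z = proportion_z(412, 1000, 0.50)

# Right-tailed p-value, i.e. the tail the original H1 (p > 0.50) looks at
p_right = 1 - NormalDist().cdf(z)

print(f'z = {z:.2f}, right-tailed p = {p_right:.4f}')
```

Because z is far below zero, almost all of the standard normal distribution lies to its right, which is why the right-tailed p-value rounds to 1.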

### CRISP-DM response – iterate

This is not a failure. CRISP-DM is explicitly iterative:
> *"It is normal to go back to previous phases when new findings demand it."*

The correct response is to **revise H₁ based on evidence** and retest:

> H₁ (revised): p_sedentary **< 0.50** (left-tailed test)

### Revised result

| Output | Value |
|--------|-------|
| z-statistic | −5.65 |
| p-value | < 0.05 |
| Decision | **Reject H₀** |
| Interpretation | Sedentarism (41.2%) is significantly below 50% |

### Business implication – how the conclusion changes

| Version | Message | Intervention type |
|---------|---------|-------------------|
| Original H₁ (wrong) | "Majority is sedentary → systemic campaign" | Mass, high cost |
| Revised H₁ (correct) | "41.2% at risk → targeted program for 412 students" | Focused, high ROI |

The intervention is still warranted, but the framing changes from a campus-wide
campaign to a **precision program** targeting the identifiable at-risk minority.
This is a better use of resources and a direct application of LEAN (eliminate waste,
maximize value per unit of investment).

### Key lesson for portfolio

> Hypotheses formulated before data exploration may not match the actual data direction.
> Always verify the sign of the test statistic against the direction of H₁ before
> interpreting the p-value. A p-value of 1.0 on a one-tailed test is a diagnostic signal,
> not a result.
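
The sign check described in this lesson can be automated. The sketch below uses two hypothetical helpers (`direction_matches` and `tail_p_values`, neither of which exists in the notebook) and assumes a z statistic under the standard normal distribution. It also returns the two-sided p-value, which is one conservative option to report alongside a one-tailed result whose direction was chosen after looking at the data.

```python
from statistics import NormalDist

def direction_matches(stat: float, alternative: str) -> bool:
    """True when the sign of the statistic agrees with the one-tailed H1."""
    if alternative == 'greater':
        return stat > 0
    if alternative == 'less':
        return stat < 0
    raise ValueError("alternative must be 'greater' or 'less'")

def tail_p_values(z: float) -> dict:
    """Left-tailed, right-tailed, and two-sided p-values for a z statistic."""
    cdf = NormalDist().cdf(z)
    return {
        'less': cdf,
        'greater': 1 - cdf,
        'two-sided': 2 * min(cdf, 1 - cdf),
    }

z_h3 = -5.65  # value reported in the H3 tables
for alt in ('greater', 'less'):
    ok = direction_matches(z_h3, alt)
    print(f"H1 '{alt}': sign agrees = {ok}, p = {tail_p_values(z_h3)[alt]:.2e}")
print(f"two-sided p = {tail_p_values(z_h3)['two-sided']:.2e}")
```

Running the check before reading the p-value flags the original right-tailed H₁ immediately, since the statistic is negative.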

---
## 6. H4 – Diet Quality and Academic Performance

| | |
|--|--|
| **H₀** | μ_poor = μ_fair = μ_good (exam score) |
| **H₁** | At least one diet quality group has a different mean exam score |
| **Test** | One-way ANOVA (Kruskal-Wallis if normality violated) |
| **α** | 0.05 |
| **Business implication if H₀ rejected** | Nutrition programs may improve academic outcomes |

In [None]:
# ===== H4 – One-way ANOVA =====
# First verify normality per group (Shapiro-Wilk)
diet_groups = {}
print('=== Normality Check per Group ===')
for level in df['diet_quality'].dropna().unique():
    grp = df[df['diet_quality'] == level]['exam_score'].dropna()
    diet_groups[level] = grp
    stat_sw, p_sw = stats.shapiro(grp.sample(min(500, len(grp)), random_state=42))
    print(f'  {level}: n={len(grp):,}, mean={grp.mean():.2f}, Shapiro p={p_sw:.4f}')

print()

# ANOVA
groups = list(diet_groups.values())
f_stat, p_h4 = f_oneway(*groups)

# Effect size (eta squared), computed only on scores that belong to a diet group
# so that rows with a missing diet_quality do not distort SS_total
all_scores = pd.concat(groups)
grand_mean = all_scores.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean)**2 for g in groups)
ss_total   = ((all_scores - grand_mean)**2).sum()
eta_sq     = ss_between / ss_total

print('=== H4 Results – One-way ANOVA ===')
print(f'  F-statistic  : {f_stat:.4f}')
print(f'  p-value      : {p_h4:.4f}')
print(f'  η² (eta sq.) : {eta_sq:.4f}  (variance explained by diet quality)')
print(f'  α            : {alpha}')
decision_h4 = 'Reject H₀' if p_h4 < alpha else 'Fail to reject H₀'
print(f'  Decision     : **{decision_h4}**')
print()
print('=== Group Means ===')
for level, grp in diet_groups.items():
    print(f'  {level}: {grp.mean():.2f}')

In [None]:
# ===== Post-hoc: Bonferroni-corrected pairwise t-tests (if ANOVA significant) =====
# Note: this is not Tukey HSD; it runs pairwise t-tests with a Bonferroni adjustment.
from itertools import combinations

if p_h4 < alpha:
    print('ANOVA significant: running Bonferroni-corrected pairwise t-tests to identify which groups differ')
    print()
    pairs = list(combinations(diet_groups.keys(), 2))
    for g1, g2 in pairs:
        t_ph, p_ph = stats.ttest_ind(diet_groups[g1], diet_groups[g2])
        # Bonferroni correction: multiply by the number of comparisons
        p_bonf = min(p_ph * len(pairs), 1.0)
        sig = '✅ Significant' if p_bonf < alpha else '– Not significant'
        print(f'  {g1} vs {g2}: t={t_ph:.3f}, p={p_ph:.4f}, p_bonf={p_bonf:.4f}  {sig}')
else:
    print('ANOVA not significant: no post-hoc test needed.')

In [None]:
# ===== Plot H4 – Boxplot by diet quality =====
fig, ax = plt.subplots(figsize=(9, 4))

order = ['Poor', 'Fair', 'Good'] if 'Poor' in diet_groups else list(diet_groups.keys())
data_h4  = [diet_groups[k].values for k in order if k in diet_groups]
labels_h4 = [k for k in order if k in diet_groups]

bp = ax.boxplot(data_h4, patch_artist=True, labels=labels_h4)
colors_h4 = ['#E57373', '#FFB74D', '#81C784']
for patch, color in zip(bp['boxes'], colors_h4):
    patch.set_facecolor(color)
    patch.set_alpha(0.8)

ax.axhline(grand_mean, color='red', linestyle='--', lw=1.5,
           label=f'Grand mean = {grand_mean:.2f}')
ax.set_ylabel('exam_score')
ax.set_xlabel('diet_quality')
ax.set_title(f'H4: Exam Score by Diet Quality – ANOVA p={p_h4:.4f}',
             fontsize=12, fontweight='bold')
ax.legend(fontsize=9)
plt.tight_layout()
output_path = REPORTS_FIGURES / 'lesson6_h4_diet.png'
plt.savefig(output_path, dpi=150, bbox_inches='tight')
plt.show()
print(f'Figure saved: {output_path}')

---
## 6b. H4 – ANOVA Deep Dive

### Why ANOVA and not multiple t-tests?

Running 3 separate t-tests (Poor vs Fair, Poor vs Good, Fair vs Good) inflates
the familywise Type I error rate:

| Approach | Nominal α | Familywise α (3 comparisons) |
|----------|-----------|------------------------|
| 3 separate t-tests | 0.05 each | 1 − (0.95)³ ≈ **0.143** (assuming independent tests) |
| One-way ANOVA | 0.05 | **0.05** ← controlled |

ANOVA tests all groups simultaneously in a single F-test, keeping α = 0.05.
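
The inflation in the table follows from the formula 1 − (1 − α)^m for m tests. A short sketch, assuming the tests are independent (pairwise comparisons that share groups are not strictly independent, so this is an approximation); the function names are illustrative:

```python
def familywise_error(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / m

for m in (1, 3, 10):
    print(f'm = {m:>2}: familywise error = {familywise_error(0.05, m):.3f}, '
          f'Bonferroni per-test alpha = {bonferroni_alpha(0.05, m):.4f}')
```

With three comparisons the per-test threshold drops to 0.05 / 3, which is the correction applied in the post-hoc cells of this notebook.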

### ANOVA Assumptions

| Assumption | Test | Status |
|------------|------|--------|
| Independence | Study design: one observation per student | ✅ Assumed |
| Normality per group | Shapiro-Wilk per group | ⏳ Checked below |
| Homogeneity of variances | Levene's test | ⏳ Checked below |

In [None]:
# ===== H4 ANOVA – Full Diagnostic =====
from scipy.stats import levene, shapiro, f_oneway, kruskal

diet_col  = 'diet_quality'
score_col = 'exam_score'

# Build groups
groups_dict = {}
for level in df[diet_col].dropna().unique():
    groups_dict[level] = df[df[diet_col] == level][score_col].dropna().values

print('=== Group Summary ===')
for k, v in groups_dict.items():
    # ddof=1 gives the sample standard deviation (NumPy defaults to ddof=0)
    print(f'  {k:<8}: n={len(v):>4},  mean={v.mean():.2f},  std={v.std(ddof=1):.2f}')

print()

# ── Normality per group (Shapiro-Wilk) ──────────────────────
print('=== Shapiro-Wilk Normality Test per Group ===')
normality_ok = True
for k, v in groups_dict.items():
    sample = v[:500] if len(v) > 500 else v
    stat_sw, p_sw = shapiro(sample)
    status = '✅' if p_sw > 0.05 else '⚠️  Non-normal'
    if p_sw <= 0.05:
        normality_ok = False
    print(f'  {k:<8}: W={stat_sw:.4f}, p={p_sw:.4f}  {status}')

print()

# ── Levene's test (homogeneity of variances) ────────────────
groups_list = list(groups_dict.values())
lev_stat, lev_p = levene(*groups_list)
homogeneity_ok = lev_p > 0.05
print('=== Levene Test – Homogeneity of Variances ===')
print(f'  F={lev_stat:.4f}, p={lev_p:.4f}')
print(f'  {"✅ Equal variances assumed" if homogeneity_ok else "⚠️  Variances differ: consider Welch ANOVA"}')

In [None]:
# ===== ANOVA or Kruskal-Wallis decision =====
print('=== Test Selection ===')
if normality_ok:
    print('→ Normality OK → One-way ANOVA')
    f_stat, p_anova = f_oneway(*groups_list)
    test_name = 'One-way ANOVA'
    stat_label = 'F'
    stat_val = f_stat
else:
    print('→ Normality violated → Kruskal-Wallis (non-parametric alternative)')
    stat_val, p_anova = kruskal(*groups_list)
    test_name = 'Kruskal-Wallis'
    stat_label = 'H'
    f_stat = stat_val

print()
print(f'=== {test_name} Results ===')
print(f'  {stat_label}-statistic : {stat_val:.4f}')
print(f'  p-value      : {p_anova:.4f}')
print(f'  α            : 0.05')
decision_h4 = 'Reject H₀' if p_anova < 0.05 else 'Fail to reject H₀'
print(f'  Decision     : **{decision_h4}**')

# Effect size: eta squared
grand_mean = df[score_col].dropna().mean()
ss_between = sum(len(g) * (g.mean() - grand_mean)**2 for g in groups_list)
ss_total   = ((df[score_col].dropna() - grand_mean)**2).sum()
eta_sq     = ss_between / ss_total
print(f'  η² (eta sq.) : {eta_sq:.4f}  ({eta_sq*100:.1f}% variance explained by diet quality)')

# Values below the "small" benchmark are labelled negligible instead of small
interp = ('Negligible' if eta_sq < 0.01 else 'Small' if eta_sq < 0.06
          else 'Medium' if eta_sq < 0.14 else 'Large')
print(f'  Interpretation: {interp} effect (Cohen 1988: small=0.01, medium=0.06, large=0.14)')

In [None]:
# ===== Post-hoc: Bonferroni-corrected pairwise t-tests =====
from itertools import combinations

print('=== Post-hoc Pairwise Comparison (Bonferroni correction) ===')
print('(Only run if ANOVA/Kruskal is significant)')
print()

group_names = list(groups_dict.keys())
n_comparisons = len(list(combinations(group_names, 2)))
alpha_bonf = 0.05 / n_comparisons

print(f'  Number of comparisons: {n_comparisons}')
print(f'  Bonferroni α: 0.05 / {n_comparisons} = {alpha_bonf:.4f}')
print()

results_posthoc = []
for g1, g2 in combinations(group_names, 2):
    # Welch t-test for each pair (ttest_ind is imported in the setup cell)
    t_ph, p_ph = ttest_ind(groups_dict[g1], groups_dict[g2], equal_var=False)
    p_adj = min(p_ph * n_comparisons, 1.0)
    diff  = groups_dict[g1].mean() - groups_dict[g2].mean()
    sig   = '✅ Significant' if p_adj < 0.05 else '– Not significant'
    results_posthoc.append({
        'Comparison': f'{g1} vs {g2}',
        'Mean diff': round(diff, 2),
        't-stat': round(t_ph, 3),
        'p (raw)': round(p_ph, 4),
        'p (Bonferroni)': round(p_adj, 4),
        'Significant?': sig
    })
    print(f'  {g1} vs {g2}: diff={diff:+.2f}, t={t_ph:.3f}, p_raw={p_ph:.4f}, p_bonf={p_adj:.4f}  {sig}')

In [None]:
# ===== Plot – ANOVA Diagnostic =====
fig, axes = plt.subplots(1, 3, figsize=(14, 4))
fig.suptitle('H4: Diet Quality vs Exam Score – ANOVA Diagnostics',
             fontsize=12, fontweight='bold')

order = ['Poor', 'Fair', 'Good']
order = [o for o in order if o in groups_dict]
if not order:
    order = list(groups_dict.keys())

colors_diet = {'Poor': '#E57373', 'Fair': '#FFB74D', 'Good': '#81C784'}
default_colors = ['#E57373', '#FFB74D', '#81C784']

# ── Boxplot ─────────────────────────────────────────────────
data_plot  = [groups_dict[k] for k in order]
labels_plt = order
bp = axes[0].boxplot(data_plot, patch_artist=True, labels=labels_plt)
for i, (patch, key) in enumerate(zip(bp['boxes'], order)):
    patch.set_facecolor(colors_diet.get(key, default_colors[i % 3]))
    patch.set_alpha(0.8)
axes[0].axhline(grand_mean, color='red', linestyle='--', lw=1.5,
                label=f'Grand mean={grand_mean:.1f}')
axes[0].set_ylabel('exam_score')
axes[0].set_title('Distribution by Group')
axes[0].legend(fontsize=8)

# ── Mean + CI per group ─────────────────────────────────────
from scipy.stats import t as t_dist
means = [groups_dict[k].mean() for k in order]
cis   = [t_dist.ppf(0.975, len(groups_dict[k])-1) *
         groups_dict[k].std(ddof=1) / np.sqrt(len(groups_dict[k]))
         for k in order]
x_pos = range(len(order))
axes[1].bar(x_pos, means, color=[colors_diet.get(k, default_colors[i]) for i, k in enumerate(order)],
            alpha=0.8, yerr=cis, capsize=5, error_kw={'linewidth': 2})
axes[1].axhline(grand_mean, color='red', linestyle='--', lw=1.5)
axes[1].set_xticks(list(x_pos))
axes[1].set_xticklabels(order)
axes[1].set_ylabel('Mean exam_score')
axes[1].set_title('Group Means + 95% CI')
axes[1].set_ylim(min(means) - 5, max(means) + 8)
for i, (m, c) in enumerate(zip(means, cis)):
    axes[1].text(i, m + c + 0.5, f'{m:.1f}', ha='center', fontsize=9, fontweight='bold')

# ── Effect size visualization ───────────────────────────────
eta_labels = ['η² observed', 'Small (0.01)', 'Medium (0.06)', 'Large (0.14)']
eta_values = [eta_sq, 0.01, 0.06, 0.14]
eta_colors = ['#1565C0', '#90CAF9', '#42A5F5', '#1976D2']
axes[2].barh(eta_labels, eta_values, color=eta_colors, alpha=0.85)
axes[2].axvline(eta_sq, color='red', linestyle='--', lw=1.5)
axes[2].set_xlabel('η²')
axes[2].set_title('Effect Size – Cohen Benchmarks')
for i, v in enumerate(eta_values):
    axes[2].text(v + 0.001, i, f'{v:.3f}', va='center', fontsize=9)

plt.tight_layout()
output_path = REPORTS_FIGURES / 'lesson6_h4_anova_full.png'
plt.savefig(output_path, dpi=150, bbox_inches='tight')
plt.show()
print(f'Figure saved: {output_path}')

---
## 7. Type I and Type II Errors in Context

| Error Type | Definition | In This Study | Business Consequence |
|------------|-----------|---------------|---------------------|
| **Type I (α)** | Reject H₀ when it is true | Concluding sleep deprivation exists when it doesn't | Invest in unnecessary sleep program → wasted budget (*LEAN: muda*) |
| **Type II (β)** | Fail to reject H₀ when it is false | Missing a real sleep deprivation problem | No intervention → continued academic underperformance |
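
The Type I error rate can be made concrete with a small simulation. The sketch below repeatedly draws samples from a population where H₀ is true (mean sleep exactly 7h) and counts how often a left-tailed test rejects. To stay within the standard library it uses a z-test with known σ, not the t-test used in this notebook, and the values of `sigma`, `n`, and `n_sims` are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def simulate_type1_rate(n_sims: int = 2000, n: int = 50, mu0: float = 7.0,
                        sigma: float = 1.2, alpha: float = 0.05,
                        seed: int = 42) -> float:
    """Share of simulated samples that reject a true H0 (left-tailed z-test)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # negative critical value for the left tail
    rejections = 0
    for _ in range(n_sims):
        # H0 is true by construction: the population mean equals mu0
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z < z_crit:
            rejections += 1
    return rejections / n_sims

rate = simulate_type1_rate()
print(f'Empirical Type I error rate: {rate:.3f} (nominal alpha = 0.05)')
```

The empirical rate should land close to the nominal α: about one in twenty studies of a healthy population would still recommend an unnecessary sleep program.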

### α = 0.05 Justification

Setting α = 0.05 in a university wellness context is a deliberate balance:
- **Too strict (α = 0.01):** Risk missing real problems (higher β), i.e. missed improvement opportunities
- **Too lenient (α = 0.10):** Risk recommending ineffective programs (more Type I errors), i.e. wasted resources
- **α = 0.05:** Standard in social science; an acceptable balance between budget protection and student welfare

### Power Consideration

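With n ≈ 1,000, the study has **very high statistical power** (1−β > 0.99 for medium effects).
This means virtually any practically meaningful difference will be detected. It also means
that very small effects can reach statistical significance, so each p-value should be read
together with its effect size.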
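
The power claim can be checked with a normal approximation. This is a rough sketch for a one-tailed, one-sample test, assuming a large sample so that the t distribution is close to the normal; the helper `approx_power_one_sample` is illustrative, and a dedicated power routine would give slightly different values.

```python
from math import sqrt
from statistics import NormalDist

def approx_power_one_sample(d: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a one-tailed one-sample test for effect size d."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under H1 the standardized statistic is centred at |d| * sqrt(n)
    return std_normal.cdf(abs(d) * sqrt(n) - z_crit)

for d in (0.05, 0.1, 0.2, 0.5):
    print(f'd = {d:<4}: power ≈ {approx_power_one_sample(d, 1000):.3f}')
```

At n = 1,000 even an effect of d = 0.1, below Cohen's "small" benchmark, is detected most of the time, which is why significance alone is not evidence of practical importance.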

---
## 8. Hypotheses Summary – All Results

> Update this table after running all cells with actual values.

In [None]:
# ===== Final Summary Table =====
summary_data = {
    'Hypothesis': ['H1', 'H2', 'H3', 'H4'],
    'Test': [
        'One-sample t-test (left)',
        'Independent t-test (right)',
        'Proportion z-test (left)',
        'One-way ANOVA'
    ],
    'Statistic': [
        round(t_stat_h1, 4),
        round(t_stat_h2, 4),
        round(z_stat_h3, 4),
        round(f_stat, 4)
    ],
    'p-value': [
        round(p_h1, 4),
        round(p_h2, 4),
        round(p_h3, 4),
        round(p_h4, 4)
    ],
    'Effect Size': [
        f"Cohen's d = {cohens_d_h1:.3f}",
        f"Cohen's d = {cohens_d_h2:.3f}",
        f"Cohen's h = {cohens_h:.3f}",
        f"η² = {eta_sq:.3f}"
    ],
    'Decision': [
        'Reject H₀' if p_h1 < alpha else 'Fail to reject H₀',
        'Reject H₀' if p_h2 < alpha and t_stat_h2 > 0 else 'Fail to reject H₀',
        'Reject H₀' if p_h3 < alpha else 'Fail to reject H₀',
        'Reject H₀' if p_h4 < alpha else 'Fail to reject H₀',
    ]
}

summary_df = pd.DataFrame(summary_data)
print('=== Final Hypotheses Summary ===')
print(summary_df.to_string(index=False))

---
## 9. Final Conclusions & Business Recommendations

### For Technical Audience (English)

Based on the statistical analysis of 1,000 university students:

| Finding | Test | p-value | Effect | Recommendation |
|---------|------|---------|--------|----------------|
| [H1 result: update after running] | One-sample t-test | [value] | [Cohen's d] | [Action] |
| [H2 result] | Independent t-test | [value] | [Cohen's d] | [Action] |
| [H3 result] | Proportion z-test | [value] | [Cohen's h] | [Action] |
| [H4 result] | One-way ANOVA | [value] | [η²] | [Action] |

**Limitations:**
- Cross-sectional design: cannot establish causality, only association
- Self-reported data: possible social desirability bias
- Convenience sample: may not represent all university populations
- Geographic context unknown: results should be validated locally before policy implementation

---

### For Business Audience

> **Executive Summary – University Health Department**

Based on the statistical analysis of 1,000 university students,
the study identified the following findings with formal statistical support:

**Main findings:**
- [Update with actual results after running the cells]

**Intervention recommendations (ordered by priority):**

| Priority | Intervention | Expected impact | Estimated cost |
|-----------|-------------|-----------------|----------------|
| **HIGH** | Sleep hygiene program | Reduce % of students sleeping < 7h | Low |
| **MEDIUM** | Physical activity incentives | Reduce sedentarism | Medium |
| **MEDIUM** | Healthy food subsidy | Improve nutritional quality | Medium-High |

**Study limitations:**
- Cross-sectional design: does not establish causality
- Self-reported data: possible social desirability bias
- Convenience sample: validate locally before implementing policies

---
## 10. Prescriptive Analysis – From Findings to Action

### Context → Analysis → Insight → Decision (Possible)

The following prescriptions are derived from the statistical evidence gathered
in Lessons 2–6. Each recommendation is:
- Grounded in a rejected H₀ (statistical evidence)
- Quantified with estimated business impact
- Prioritized by effect size and intervention cost
- Framed as actionable decisions for the Health Director

> **ICI Perspective:** Applying Lean Value Stream Mapping logic:
> identify the highest-leverage intervention (biggest impact per unit of investment).
> Sleep and sedentarism are the primary bottlenecks in the student wellness "production system." 

In [None]:
# ===== Prescriptive Analysis ‚Äî Quantified Recommendations =====

# ── Base metrics from previous lessons ──────────────────────
n_total       = len(df)
p_sleep_depr  = (df['sleep_hours'] < 7).mean()
p_sedentary   = (df['exercise_frequency'] < 3).mean()
p_acad_risk   = (df['exam_score'] < 60).mean()
p_risk_given_sleep = ((df['sleep_hours'] < 7) & (df['exam_score'] < 60)).sum() / (df['sleep_hours'] < 7).sum()
p_risk_given_ok    = ((df['sleep_hours'] >= 7) & (df['exam_score'] < 60)).sum() / (df['sleep_hours'] >= 7).sum()

mean_active   = df[df['exercise_frequency'] >= 3]['exam_score'].mean()
mean_sedent   = df[df['exercise_frequency'] < 3]['exam_score'].mean()
score_gap     = mean_active - mean_sedent

print('=== Base Metrics for Prescriptive Analysis ===')
print(f'  Population at risk of sleep deprivation: {p_sleep_depr*100:.1f}%  (n‚âà{int(p_sleep_depr*n_total):,})')
print(f'  Population sedentary:                    {p_sedentary*100:.1f}%  (n‚âà{int(p_sedentary*n_total):,})')
print(f'  Population at academic risk (score<60):  {p_acad_risk*100:.1f}%  (n‚âà{int(p_acad_risk*n_total):,})')
print(f'  P(academic risk | sleep-deprived):       {p_risk_given_sleep*100:.1f}%')
print(f'  P(academic risk | adequate sleep):       {p_risk_given_ok*100:.1f}%')
print(f'  Lift (sleep deprivation ‚Üí risk):         {p_risk_given_sleep/p_risk_given_ok:.2f}x')
print(f'  Exam score gap (active vs sedentary):    {score_gap:+.2f} points')

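The lift printed above is a point estimate of a relative risk. As a hedged sketch of how its uncertainty could be reported, the snippet below computes a log-scale (Katz) 95% confidence interval from the four underlying counts. The helper name `relative_risk_ci` and the counts in the example are hypothetical; they are not values from the dataset.

```python
import math

def relative_risk_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Relative risk with a log-scale (Katz) confidence interval.

    events_* are counts of at-risk students in each group, n_* the group sizes.
    Returns (rr, lower, upper).
    """
    if min(events_exposed, events_unexposed) <= 0:
        raise ValueError("both groups need at least one event")
    if events_exposed > n_exposed or events_unexposed > n_unexposed:
        raise ValueError("event counts cannot exceed group sizes")
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    # Standard error of log(RR)
    se_log = math.sqrt(1 / events_exposed - 1 / n_exposed
                       + 1 / events_unexposed - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical counts: 30 of 100 sleep-deprived vs 15 of 100 well-rested students at risk
rr, lo, hi = relative_risk_ci(30, 100, 15, 100)
print(f"Lift = {rr:.2f}x, 95% CI [{lo:.2f}, {hi:.2f}]")
```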
In [None]:
# ===== Prescription 1 - Sleep Hygiene Program =====
print('=' * 60)
print('PRESCRIPTION 1: Sleep Hygiene Program')
print('=' * 60)
print()
print('Evidence base:')
print('  H1: H₀ rejected → data support μ_sleep < 7h')
print(f'  {p_sleep_depr*100:.1f}% of students sleep-deprived')
print(f'  Sleep-deprived students are {p_risk_given_sleep/p_risk_given_ok:.1f}x as likely to be at academic risk')
print()

# Target: reduce sleep deprivation from p_sleep_depr to 0.40 (assumed target)
target_sleep_depr = 0.40
# max(..., 0) guards against a negative count if the current rate is already below target
students_to_help  = max(int(round((p_sleep_depr - target_sleep_depr) * n_total)), 0)
# Upper-bound estimate: treats the observed risk difference as fully causal,
# which the cross-sectional design cannot establish
risk_reduction    = students_to_help * (p_risk_given_sleep - p_risk_given_ok)

print('Intervention design:')
print(f'  Current state:  {p_sleep_depr*100:.1f}% sleep-deprived')
print(f'  Target state:   {target_sleep_depr*100:.0f}% sleep-deprived (assumed achievable with behavioral nudges)')
print(f'  Students helped: ≈{students_to_help:,}')
print(f'  Expected reduction in academic risk: ≈{risk_reduction:.0f} fewer students at risk')
print()
print('Intervention options (LEAN: low cost, high leverage):')
print('  • Sleep hygiene workshops (1h/semester) - cost: LOW')
print('  • Late-night library/screen curfew notifications - cost: VERY LOW')
print('  • Residence hall quiet hours enforcement - cost: LOW')
print('  • Wearable pilot program (sleep tracking) - cost: MEDIUM')
print()
print('Priority: HIGH - largest at-risk population, low intervention cost')

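The estimate above depends on the assumed 40% target. A minimal sketch, using hypothetical inputs rather than the notebook's values, shows how the expected reduction in at-risk students could be recomputed for several targets. `expected_risk_reduction` is an illustrative helper, and like the cell above it treats the observed risk difference as causal, so the result is best read as an upper bound.

```python
def expected_risk_reduction(n_total, p_current, p_target, p_risk_exposed, p_risk_unexposed):
    """Students moved out of the exposed group and the implied drop in at-risk students.

    Returns (students_moved, fewer_at_risk). Both are zero when the current
    rate is already at or below the target.
    """
    students_moved = max(round((p_current - p_target) * n_total), 0)
    fewer_at_risk = students_moved * (p_risk_exposed - p_risk_unexposed)
    return students_moved, fewer_at_risk

# Hypothetical inputs: 1,000 students, 55% sleep-deprived,
# 30% at academic risk when sleep-deprived vs 20% otherwise
for target in (0.50, 0.45, 0.40):
    moved, fewer = expected_risk_reduction(1000, 0.55, target, 0.30, 0.20)
    print(f"target {target:.0%}: {moved} students moved, ~{fewer:.0f} fewer at risk")
```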
In [None]:
# ===== Prescription 2 - Physical Activity Program =====
print('=' * 60)
print('PRESCRIPTION 2: Physical Activity & Anti-Sedentarism')
print('=' * 60)
print()
print('Evidence base:')
print(f'  H2 result → active students score {score_gap:+.2f} points higher')
print(f'  H3 → {p_sedentary*100:.1f}% of students sedentary (> 50% benchmark)')
print()

# Target: reduce sedentarism from p_sedentary to 0.40 (assumed target)
target_sedent   = 0.40
# max(..., 0) guards against a negative count if the current rate is already below target
students_active = max(int(round((p_sedentary - target_sedent) * n_total)), 0)
# Upper-bound estimate: assumes each transitioned student gains the full observed gap
score_gain_pop  = students_active * score_gap

print('Intervention design:')
print(f'  Current state:  {p_sedentary*100:.1f}% sedentary (<3 days/week)')
print(f'  Target state:   {target_sedent*100:.0f}% sedentary')
print(f'  Students transitioned to active: ≈{students_active:,}')
print(f'  Expected aggregate score gain: ≈{score_gain_pop:.0f} exam score points across cohort')
print(f'  Average score gain per transitioned student: {score_gap:+.2f} points')
print()
print('Intervention options:')
print('  • Free gym access during study periods - cost: LOW (existing infrastructure)')
print('  • Academic credit for sports participation - cost: LOW')
print('  • Step-count challenges with academic incentives - cost: VERY LOW')
print('  • Active transport (cycling) infrastructure - cost: MEDIUM-HIGH')
print()
print('Priority: HIGH - affects majority of students, measurable academic ROI')

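The score gap used above is a single number from one sample. One way to attach uncertainty to it, sketched here under the assumption that a percentile bootstrap is acceptable for this purpose, is to resample both groups and read off the middle 95% of the resampled gaps. The helper `bootstrap_mean_diff_ci` and the synthetic scores are illustrative only; they are not the notebook's method or data.

```python
import random
import statistics

def bootstrap_mean_diff_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic exam scores for illustration only (not the Kaggle data)
data_rng = random.Random(0)
active = [data_rng.gauss(72, 12) for _ in range(200)]
sedentary = [data_rng.gauss(68, 12) for _ in range(200)]

observed_gap = statistics.fmean(active) - statistics.fmean(sedentary)
lo, hi = bootstrap_mean_diff_ci(active, sedentary)
print(f"Observed gap: {observed_gap:+.2f} points")
print(f"95% bootstrap CI: [{lo:+.2f}, {hi:+.2f}]")
```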
In [None]:
# ===== Prescription 3 - Nutrition Program =====
print('=' * 60)
print('PRESCRIPTION 3: Nutrition & Diet Quality')
print('=' * 60)
print()

# Mean exam score per diet-quality group.
# order, groups_dict, p_anova, eta_sq and interp are expected from the H4 cells (Section 6).
for k in order:
    print(f'  {k:<8}: mean exam score = {groups_dict[k].mean():.2f}')

best_group  = max(order, key=lambda k: groups_dict[k].mean())
worst_group = min(order, key=lambda k: groups_dict[k].mean())
diet_gap    = groups_dict[best_group].mean() - groups_dict[worst_group].mean()

print()
print(f'  Score gap ({best_group} vs {worst_group}): {diet_gap:+.2f} points')
print()

p_poor_diet = (df['diet_quality'] == worst_group).mean() if worst_group in df['diet_quality'].values else 0
students_poor = int(p_poor_diet * n_total)

print('Evidence base:')
print(f'  H4 {"rejected" if p_anova < 0.05 else "not rejected"} → p={p_anova:.4f}')
print(f'  η² = {eta_sq:.4f} ({interp} effect)')
print(f'  {p_poor_diet*100:.1f}% of students have {worst_group} diet quality (n≈{students_poor:,})')
print()
print('Intervention options:')
print('  • Subsidized healthy meal plan for low-income students - cost: MEDIUM')
print('  • Nutrition workshops integrated into orientation - cost: LOW')
print('  • Cafeteria redesign (healthy defaults) - cost: MEDIUM')
print('  • Fruit/vegetable vending machines - cost: LOW')
print()
if eta_sq < 0.06:
    print('Priority: MEDIUM - statistically significant but small effect size')
    print('          Intervene after sleep and exercise programs (higher ROI)')
else:
    print('Priority: HIGH - medium/large effect size supports strong intervention')

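The priority rule above branches on η² < 0.06, with `eta_sq` and `interp` coming from earlier cells. For readers without those cells at hand, the following is a hedged, self-contained sketch of how such values could be derived. The function names are hypothetical, the toy data are not from the dataset, and the labels follow Cohen's conventional benchmarks (0.01, 0.06, 0.14), which may differ from the wording used in the H4 cells.

```python
from statistics import fmean

def eta_squared(groups):
    """Eta squared = SS_between / SS_total for a one-way layout.

    groups maps a label to a list of numeric observations.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = fmean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(len(v) * (fmean(v) - grand_mean) ** 2 for v in groups.values())
    return ss_between / ss_total

def interpret_eta_squared(eta_sq):
    """Label an eta squared value using Cohen's conventional benchmarks."""
    if eta_sq < 0.01:
        return "negligible"
    if eta_sq < 0.06:
        return "small"
    if eta_sq < 0.14:
        return "medium"
    return "large"

# Toy data for illustration only
toy_groups = {"Poor": [1, 2, 3], "Good": [4, 5, 6]}
toy_eta = eta_squared(toy_groups)
print(f"eta squared = {toy_eta:.4f} ({interpret_eta_squared(toy_eta)} effect)")
```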
In [None]:
# ===== Intervention Priority Matrix =====
print()
print('=' * 60)
print('INTERVENTION PRIORITY MATRIX')
print('=' * 60)
print()

priority_data = {
    'Intervention': [
        'Sleep Hygiene Program',
        'Anti-Sedentarism Campaign',
        'Nutrition Program'
    ],
    'Evidence (H)': ['H1 ✅', 'H2+H3 ✅', 'H4 ✅'],
    'Population affected': [
        f'{p_sleep_depr*100:.0f}%',
        f'{p_sedentary*100:.0f}%',
        f'{p_poor_diet*100:.0f}%'
    ],
    'Effect size': [
        f"Cohen's d={cohens_d_h1:.2f}",
        f"Cohen's d={cohens_d_h2:.2f}",
        f'η²={eta_sq:.3f}'
    ],
    'Cost': ['Low', 'Low–Medium', 'Medium'],
    'Priority': ['🔴 HIGH', '🔴 HIGH', '🟡 MEDIUM']
}

priority_df = pd.DataFrame(priority_data)
print(priority_df.to_string(index=False))
print()
print('Recommended implementation order: Sleep → Activity → Nutrition')
print('Rationale (LEAN): Maximum impact per resource unit')
print('  Sleep program: cheapest intervention, largest affected population')
print('  Activity:      measurable exam score ROI, affects majority')
print('  Nutrition:     smaller effect size, higher implementation cost')

In [None]:
# ===== Visualization - Priority Matrix =====
fig, axes = plt.subplots(1, 2, figsize=(13, 4))
fig.suptitle('Prescriptive Analysis - Intervention Priority',
             fontsize=12, fontweight='bold')

# ── Bubble chart: Effect size vs Population affected ─────────────────────────
interventions = ['Sleep\nHygiene', 'Anti-\nSedentarism', 'Nutrition']
pop_affected  = [p_sleep_depr, p_sedentary, p_poor_diet]
# η² is multiplied by 10 only for visibility; it is not on the same scale as Cohen's d
effect_sizes  = [abs(cohens_d_h1), abs(cohens_d_h2), eta_sq * 10]
costs         = [1, 2, 3]   # 1=low, 2=medium, 3=high
colors_prio   = ['#E53935', '#E53935', '#F59E0B']
sizes_bubble  = [p * 3000 for p in pop_affected]

for name, pop, eff, color, sz in zip(
        interventions, pop_affected, effect_sizes, colors_prio, sizes_bubble):
    axes[0].scatter(pop * 100, eff, s=sz, color=color, alpha=0.7, edgecolors='white', lw=2)
    axes[0].annotate(name, (pop * 100, eff), textcoords="offset points",
                     xytext=(0, 12), ha='center', fontsize=9, fontweight='bold')

axes[0].set_xlabel('Population Affected (%)')
axes[0].set_ylabel('Effect Size (standardized)')
axes[0].set_title('Impact vs Population (Bubble = Population Size)')
axes[0].axhline(0.2, color='gray', linestyle='--', alpha=0.5,
                label="Small effect threshold (Cohen's d)")
axes[0].legend(fontsize=8)

# ── Bar: Expected students helped ────────────────────────────────────────────
# Reuse the counts computed in the Prescription 1 and 2 cells so the figures agree
students_helped = [
    students_to_help,
    students_active,
    int(p_poor_diet * n_total * 0.3)   # 30% improvement assumed
]
bars = axes[1].bar(interventions, students_helped,
                   color=['#E53935', '#E53935', '#F59E0B'], alpha=0.85, edgecolor='white')
for bar, val in zip(bars, students_helped):
    axes[1].text(bar.get_x() + bar.get_width()/2, bar.get_height() + 2,
                 f'≈{val:,}', ha='center', fontsize=10, fontweight='bold')
axes[1].set_ylabel('Students Benefited (estimated)')
axes[1].set_title('Estimated Reach per Intervention')

plt.tight_layout()
output_path = REPORTS_FIGURES / 'lesson6_prescriptions.png'
plt.savefig(output_path, dpi=150, bbox_inches='tight')
plt.show()
print(f'Figure saved: {output_path}')

---
## 11. Deliverables Checklist

| Deliverable | Audience | Format | Status |
|-------------|----------|--------|--------|
| 01_business_understanding.ipynb | Data team | Jupyter (EN) | ✅ |
| 02_data_understanding.ipynb | Data team | Jupyter (EN) | ✅ |
| 03_data_preparation.ipynb | Data team | Jupyter (EN) | ✅ |
| 04_modeling.ipynb | Data team | Jupyter (EN) | ✅ |
| 05_evaluation.ipynb | Data team | Jupyter (EN) | ✅ |
| 06_deployment.ipynb | Data team | Jupyter (EN) | ✅ |
| Figures (reports/figures/) | All | PNG | ✅ |
| Executive Summary | Health Director | PDF/PPTX (ES) | ⏳ |
| GitHub Repository | Public / Portfolio | Git | ⏳ |

---
## 12. LEAN Retrospective

| LEAN Question | Answer |
|---------------|--------|
| Did every analysis step add value? | ✅ Each notebook maps to a specific lesson objective |
| Was there analytical waste? | ⚠️ Distribution fitting for non-hypothesis variables was avoided |
| Did results translate to business decisions? | ✅ Each rejected H₀ maps to a specific intervention |
| Was the dual-format delivery achieved? | ⏳ Executive summary pending |
| What would I do differently? | Add power analysis in Lesson 1 to justify sample size requirements |

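The power analysis mentioned in the last row could look roughly like the sketch below. It uses the normal approximation for a two-sample comparison of means, which slightly underestimates the exact t-test requirement; the function name `n_per_group` and the default α and power are assumptions chosen for illustration, not values fixed in Lesson 1.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.80, one_sided=True):
    """Approximate sample size per group for a two-sample comparison of means.

    Normal approximation: n = 2 * ((z_alpha + z_power) / d) ** 2.
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    std_normal = NormalDist()
    tail = alpha if one_sided else alpha / 2
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)

# Cohen's small / medium / large benchmarks for d
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} students per group")
```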
### Lean Waste Identified

| Waste Type | Instance | Resolution |
|------------|---------|------------|
| Over-processing | Fitting distributions for all 20 variables | Reduced to 4 hypothesis-relevant variables |
| Waiting | Dataset download not automated | kagglehub solution implemented |
| Defects | f-string syntax errors in notebooks | Fixed with an intermediate-variable pattern |

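The intermediate-variable pattern listed under Defects can be sketched as follows. Before Python 3.12, an f-string expression could not reuse the enclosing quote character or contain a backslash, so computing the text first and interpolating a plain name avoids both problems. `p_value` is a hypothetical number, not a result from this notebook.

```python
p_value = 0.0312  # hypothetical p-value, for illustration only

# Compute the conditional text first, then interpolate a plain name
decision = "rejected" if p_value < 0.05 else "not rejected"
summary = f"H4 {decision}: p={p_value:.4f}"
print(summary)
```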
---
## 13. Decisions Log – Lesson 6

| # | Decision | Rationale | Alternatives | LEAN Value? |
|---|----------|-----------|--------------|-------------|
| 1 | One-tailed tests for H1, H2, H3 | Direction specified in hypotheses from Lesson 1 | Two-tailed | ✅ More powerful, appropriate |
| 2 | Welch t-test for H2 | Does not assume equal variances | Student t-test | ✅ More robust |
| 3 | ANOVA for H4 (not multiple t-tests) | Multiple t-tests inflate Type I error | 3 separate t-tests | ✅ Methodologically correct |
| 4 | Bonferroni correction for post-hoc | Controls familywise error rate | Tukey HSD | ✅ Conservative but valid |
| 5 | Report Cohen's d and η² alongside p-value | p-value alone insufficient for business decisions | p-value only | ✅ Portfolio standard |

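Decisions 2 and 4 can be illustrated with a small standard-library sketch: the Welch statistic with its Welch-Satterthwaite degrees of freedom, and Bonferroni-adjusted p-values. The helper names and toy numbers are hypothetical. Turning the statistic into a p-value additionally needs the t distribution, which is why a statistics library such as SciPy would normally be used for the test itself.

```python
import math
from statistics import fmean, variance

def welch_t(sample_a, sample_b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    se2_a = variance(sample_a) / len(sample_a)   # squared standard error, group A
    se2_b = variance(sample_b) / len(sample_b)   # squared standard error, group B
    t_stat = (fmean(sample_a) - fmean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df

def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]

# Toy samples and p-values, for illustration only
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
adjusted = bonferroni([0.01, 0.04, 0.50])
print(f"Welch t = {t_stat:.3f}, df = {df:.2f}")
print("Bonferroni-adjusted p-values:", [round(p, 4) for p in adjusted])
```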
---

**← Previous Phase:** [05 – Evaluation](./05_evaluation.ipynb)

---

*End of Lesson 6 – Project 4, Module 5 – COMPLETE*
*Author: Jose Marcel Lopez Pino | Framework: CRISP-DM + LEAN | Bootcamp: Alkemy / SENCE 2025–2026*