# 🔧 Debug Drill: The Suspiciously Perfect Model

**Scenario:**
Your colleague built a churn prediction model and is excited to show you the results.

"Look at this AUC!" they say. "It's almost perfect!"

**Your Task:**
1. Run the notebook
2. Find the bug (hint: why is performance SO good?)
3. Fix it
4. Write a 3-bullet postmortem

---

In [None]:
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, precision_score

# Load data
try:
    df = pd.read_csv('https://raw.githubusercontent.com/189investmentai/ml-foundations-interactive/main/streamcart_customers.csv')
except Exception:  # Fall back to the local copy if the download fails
    df = pd.read_csv('../../data/streamcart_customers.csv')

print(f"Loaded {len(df):,} customers")

In [None]:
# ===== COLLEAGUE'S CODE (CONTAINS BUG) =====

# Feature engineering
df['has_cancel_reason'] = df['cancel_reason'].notna().astype(int)
df['days_until_churn'] = pd.to_datetime(df['churn_date']).sub(pd.to_datetime(df['snapshot_date'])).dt.days
df['days_until_churn'] = df['days_until_churn'].fillna(999)  # Non-churners get 999

# Select features
features = [
    'tenure_months',
    'logins_last_30d',
    'support_tickets_last_30d',
    'has_cancel_reason',      # <-- Colleague added this
    'days_until_churn'        # <-- And this
]

X = df[features].fillna(0)
y = df['churn_30d']

# Split and train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=42)
model.fit(X_train, y_train)

# Evaluate
y_proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, y_proba)

print(f"\n🎉 Amazing results!")
print(f"AUC: {auc:.3f}")
print(f"\nThis model is ready for production, right?")

---

## Your Investigation

Something is wrong. An AUC this high is suspicious.

### Step 1: What's the problem?

In [None]:
# TODO: Investigate the features
# Hint: For each feature, ask "Would I have this at prediction time?"

# Check the suspicious features:
print("=== Investigating has_cancel_reason ===")
print(df.groupby('churn_30d')['has_cancel_reason'].mean())

print("\n=== Investigating days_until_churn ===")
print(df.groupby('churn_30d')['days_until_churn'].describe())

In [None]:
# What do you notice? Write your diagnosis:

diagnosis = """
YOUR DIAGNOSIS HERE:

The problem is...

This is called...

"""
print(diagnosis)

### Step 2: Fix the code

In [None]:
# TODO: Fix the feature selection - remove leaky features

features_fixed = [
    # List only non-leaky features here
]

# Retrain with fixed features
# X_fixed = ...
# model_fixed = ...
# auc_fixed = ...

In [None]:
# ============================================
# SELF-CHECK: Did you fix it?
# ============================================

# A properly trained model should land roughly between 0.60 and 0.85 AUC,
# NOT above 0.95! The asserts below use slightly looser bounds (0.55 and 0.90).

# assert auc_fixed < 0.90, "AUC still too high - check for remaining leakage!"
# assert auc_fixed > 0.55, "AUC too low - did you remove good features by mistake?"
# print(f"✓ Fixed AUC: {auc_fixed:.3f} - This looks realistic!")

### Step 3: Write your postmortem

In [None]:
postmortem = """
## Postmortem: Churn Model Leakage Bug

### What happened:
- (Your answer)

### Root cause:
- (Your answer)

### How to prevent:
- (Your answer)
"""

print(postmortem)

---

## ✅ Drill Complete!

**Key lesson:** If your model performance seems too good to be true, it probably is. Always ask "Would I have this data at prediction time?"
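
One way to put that question into practice is a quick single-feature screen: score each feature on its own against the target and treat anything close to AUC 1.0 as a candidate for leakage. The sketch below is a minimal illustration under stated assumptions, not part of the original drill. The helper `single_feature_auc` and the toy columns (`honest_signal`, `pure_noise`, `recorded_after_outcome`) are hypothetical, and the data is synthetic rather than the StreamCart dataset.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score


def single_feature_auc(frame, feature_cols, target_col):
    """Return each feature's standalone AUC against the target, highest first."""
    scores = {}
    for col in feature_cols:
        auc = roc_auc_score(frame[target_col], frame[col])
        # A feature that ranks the classes in reverse is just as informative,
        # so fold the score around 0.5.
        scores[col] = max(auc, 1 - auc)
    return pd.Series(scores).sort_values(ascending=False)


# Synthetic data for illustration only (hypothetical column names).
rng = np.random.default_rng(0)
n = 2000
label = rng.integers(0, 2, size=n)
toy = pd.DataFrame({
    "label": label,
    "honest_signal": label + rng.normal(0, 2.0, size=n),  # weak, noisy signal
    "pure_noise": rng.normal(0, 1.0, size=n),             # unrelated to the label
    "recorded_after_outcome": label,                      # copy of the label: leaks
})

screen = single_feature_auc(
    toy,
    ["honest_signal", "pure_noise", "recorded_after_outcome"],
    "label",
)
print(screen.round(3))
```

A high standalone score is a prompt to investigate, not proof of leakage: a legitimately strong predictor can also score well, so the final check is still whether the value exists at prediction time.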