# Real-World Use Case: HR Employee Attrition

## 1. The Problem
The HR department wants to understand *why* employees are quitting (attrition) and to predict who might leave next.

## 2. Why Decision Trees?
*   **Interpretability**: This is the most important factor. If the model says "John will quit", HR needs to know *why* (e.g., "Because he travels frequently and hasn't had a raise"). A Decision Tree provides clear IF-THEN rules that non-technical managers can understand.
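
As a rough illustration of the IF-THEN rules described above, scikit-learn's `export_text` prints a fitted tree as nested conditions. The sketch below uses a tiny hand-made dataset; the `toy` frame, its column names, and its values are invented for illustration and are not taken from the HR data.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: OverTime_Yes is 1 for overtime, Satisfaction is on a 1-4 scale.
toy = pd.DataFrame({
    "OverTime_Yes": [1, 1, 1, 0, 0, 0, 1, 0],
    "Satisfaction": [1, 2, 1, 3, 4, 2, 4, 1],
    "Quit":         [1, 1, 1, 0, 0, 0, 0, 0],
})

X_toy = toy[["OverTime_Yes", "Satisfaction"]]
y_toy = toy["Quit"]

# A shallow tree keeps the rule list short enough to read aloud.
toy_tree = DecisionTreeClassifier(max_depth=2, random_state=0)
toy_tree.fit(X_toy, y_toy)

# export_text renders each split as an indented condition ending in a class label.
rules = export_text(toy_tree, feature_names=list(X_toy.columns))
print(rules)
```

Each indented line in the printed output is one condition, so a path from the top to a `class:` line reads as a single rule a manager could follow.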

## 3. Data Simulation (IBM HR Analytics Proxy)
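Simulated features: `Age`, `OverTime`, `Distance` (a stand-in for `DistanceFromHome`), and `Satisfaction` (a stand-in for `JobSatisfaction`). `DailyRate` from the IBM dataset is not simulated here.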

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
import matplotlib.pyplot as plt

# 1. Generate Data
np.random.seed(42)
n = 500
age = np.random.randint(20, 60, n)
overtime = np.random.choice(['Yes', 'No'], n)
distance = np.random.randint(1, 30, n)
satisfaction = np.random.randint(1, 5, n)

# Attrition logic: 10% base risk, +30% with overtime, +40% at the lowest satisfaction (1), +20% under age 30
attrition = []
for i in range(n):
    prob = 0.1
    if overtime[i] == 'Yes': prob += 0.3
    if satisfaction[i] == 1: prob += 0.4
    if age[i] < 30: prob += 0.2
    attrition.append('Yes' if np.random.rand() < prob else 'No')

df = pd.DataFrame({'Age': age, 'OverTime': overtime, 'Distance': distance, 'Satisfaction': satisfaction, 'Attrition': attrition})

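# 2. Preprocessing: one-hot encode the categorical columns with pd.get_dummies
# drop_first=True keeps a single indicator per binary column: OverTime_Yes and Attrition_Yes
df_encoded = pd.get_dummies(df, drop_first=True)
# Target: Attrition_Yes (1/True = Yes, 0/False = No)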

X = df_encoded.drop('Attrition_Yes', axis=1)
y = df_encoded['Attrition_Yes']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# 3. Train Constrained Tree (Max Depth 3 for readability)
dt = DecisionTreeClassifier(max_depth=3, random_state=42)
dt.fit(X_train, y_train)

# 4. Visualize the Explanation
plt.figure(figsize=(14, 8))
# Pass feature names as a plain list; some scikit-learn releases reject a pandas Index here
plot_tree(dt, feature_names=list(X.columns), class_names=['Stay', 'Quit'], filled=True, fontsize=10)
plt.title("Why are employees quitting?")
plt.show()
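
The plot explains the model as a whole. To answer "why this employee?", one possible approach is to walk the path a single row takes through the tree and list the conditions it satisfied. The sketch below assumes a hypothetical helper, `explain_row`, and a small invented `demo_X` dataset so that it runs on its own. In the notebook, the same helper could be called as `explain_row(dt, X_test.iloc[[0]])`.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

def explain_row(tree, X_row):
    """List the split conditions one sample satisfies, from root to leaf.

    X_row is a one-row DataFrame with the same columns used for fitting.
    (Hypothetical helper, not part of scikit-learn.)
    """
    structure = tree.tree_
    # decision_path returns a sparse indicator matrix; for a single row,
    # .indices holds the ids of the visited nodes in root-to-leaf order.
    path = tree.decision_path(X_row).indices
    conditions = []
    for node, next_node in zip(path[:-1], path[1:]):
        name = X_row.columns[structure.feature[node]]
        # Going to the left child means the sample met "feature <= threshold".
        op = "<=" if next_node == structure.children_left[node] else ">"
        conditions.append(f"{name} {op} {structure.threshold[node]:.2f}")
    return conditions

# Hypothetical demo data, invented for illustration only.
demo_X = pd.DataFrame({
    "Age":          [25, 45, 28, 50, 23, 38, 41, 27],
    "OverTime_Yes": [1, 0, 1, 0, 1, 0, 1, 0],
})
demo_y = [1, 0, 1, 0, 1, 0, 0, 0]  # 1 = Quit, 0 = Stay

demo_tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(demo_X, demo_y)

employee = demo_X.iloc[[0]]  # double brackets keep a one-row DataFrame
print("Prediction:", demo_tree.predict(employee)[0])
print("Because:", " AND ".join(explain_row(demo_tree, employee)))
```

Reading the direction from the path itself, rather than re-comparing values against thresholds, keeps the explanation consistent with the branch the model actually took.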