# Kolecto Churn Prediction - Complete Analysis

**Objective**: Predict conversion from 15-day trial to paid subscription

**Models Implemented**:
1. Logistic Regression (baseline)
2. XGBoost (gradient boosting)
3. LightGBM (fast gradient boosting)
4. GRU (recurrent sequence model, reported below as "LSTM/GRU")
5. Transformer (attention-based)

**Data**: 
- 503 trials (filtered to 15-day duration)
- ~60% baseline conversion rate
- 20 daily usage features

All visualizations saved to `../results/figures/`

## 1. Setup & Imports

In [None]:
# Core libraries
import pandas as pd
import numpy as np
import json
from datetime import datetime

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score, roc_auc_score, roc_curve, 
    precision_recall_curve, auc, brier_score_loss,
    confusion_matrix, classification_report
)

# Tree models
import xgboost as xgb
import lightgbm as lgb
import shap

# Deep learning
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Settings
%matplotlib inline
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

import warnings
warnings.filterwarnings('ignore')

print("✅ All libraries imported successfully")

## 2. Data Loading & Preprocessing

In [None]:
# Load data
print("Loading data from ../data/raw/...")
subscriptions = pd.read_csv('../data/raw/subscriptions.csv')
daily_usage = pd.read_csv('../data/raw/daily_usage.csv')

print(f"Subscriptions: {subscriptions.shape}")
print(f"Daily usage: {daily_usage.shape}")

# Convert dates
date_cols = ['trial_starts_at', 'trial_ends_at', 'first_paid_invoice_paid_at']
for col in date_cols:
    subscriptions[col] = pd.to_datetime(subscriptions[col], errors='coerce')

# Calculate trial duration
subscriptions['trial_duration'] = (
    subscriptions['trial_ends_at'] - subscriptions['trial_starts_at']
).dt.days

# Filter to 15-day trials only (as per case study)
print(f"\nTrial duration distribution:\n{subscriptions['trial_duration'].value_counts()}")
subscriptions_15d = subscriptions[subscriptions['trial_duration'] == 15].copy()
print(f"\nAfter filtering to 15-day trials: {len(subscriptions_15d)} trials")

# Define target: converted if they have a paid invoice
subscriptions_15d['converted'] = subscriptions_15d['first_paid_invoice_paid_at'].notna().astype(int)
conversion_rate = subscriptions_15d['converted'].mean()
print(f"\n✅ Conversion rate: {conversion_rate:.2%}")
print(f"   Converted: {subscriptions_15d['converted'].sum()}")
print(f"   Not converted: {(~subscriptions_15d['converted'].astype(bool)).sum()}")
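The `trial_duration == 15` filter depends on how `.dt.days` treats partial days. The sketch below uses made-up timestamps (not rows from `subscriptions.csv`) to show that `.dt.days` keeps only the whole-day component of each duration, so a trial of 15 days and 6 hours still counts as 15, while one of 14 days and 23 hours counts as 14.

```python
import pandas as pd

# Hypothetical trial windows, for illustration only
toy = pd.DataFrame({
    "trial_starts_at": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 00:00", "2024-01-01 12:00"]
    ),
    "trial_ends_at": pd.to_datetime(
        ["2024-01-16 00:00", "2024-01-16 06:00", "2024-01-16 11:00"]
    ),
})

# .dt.days returns the whole-day component of each Timedelta
toy["trial_duration"] = (toy["trial_ends_at"] - toy["trial_starts_at"]).dt.days
durations = toy["trial_duration"].tolist()
print(durations)
```

If the raw timestamps carry a time of day, it may be worth checking how many trials fall just under the 15-day boundary before filtering.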

### 2.1 Feature Engineering

In [None]:
# Aggregate usage features per trial
usage_cols = [col for col in daily_usage.columns if col.startswith('nb_')]
print(f"Found {len(usage_cols)} usage features")

# Aggregate: sum, mean, max, std for each trial
usage_agg = daily_usage.groupby('subscription_id')[usage_cols].agg(
    ['sum', 'mean', 'max', 'std']
).reset_index()

# Flatten column names
usage_agg.columns = ['subscription_id'] + [
    f'{col[0]}_{col[1]}' for col in usage_agg.columns[1:]
]

# Fill NaN std with 0
usage_agg = usage_agg.fillna(0)

print(f"Aggregated usage features: {usage_agg.shape}")

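# Merge with subscriptions
df = subscriptions_15d.merge(usage_agg, on='subscription_id', how='left')
# Trials with no usage rows get 0 for every aggregate; date columns keep their NaT values
agg_cols = [col for col in usage_agg.columns if col != 'subscription_id']
df[agg_cols] = df[agg_cols].fillna(0)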

print(f"\n✅ Final dataset: {df.shape}")
print(f"   Features: {df.shape[1]}")
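To make the aggregation step concrete, here is a minimal sketch on a toy frame. The column `nb_logins` and the trial ids are hypothetical. It shows how the `(feature, statistic)` column pairs are flattened into names such as `nb_logins_sum`, and why `std` needs the `fillna(0)`: a trial with a single usage row has an undefined sample standard deviation.

```python
import pandas as pd

# Hypothetical daily usage: trial "a" has two rows, trial "b" only one
toy_usage = pd.DataFrame({
    "subscription_id": ["a", "a", "b"],
    "nb_logins": [1, 3, 5],
})

agg = toy_usage.groupby("subscription_id")[["nb_logins"]].agg(
    ["sum", "mean", "max", "std"]
)
# Flatten the (feature, statistic) MultiIndex into single column names
agg.columns = [f"{feature}_{stat}" for feature, stat in agg.columns]
agg = agg.reset_index()

# The sample std of a single observation is NaN before filling
std_is_nan_for_single_row = bool(
    agg.loc[agg["subscription_id"] == "b", "nb_logins_std"].isna().iloc[0]
)
agg = agg.fillna(0)
print(agg)
```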

### 2.2 Train/Test Split

In [None]:
# Select features for modeling
# Use only numerical usage features
feature_cols = [col for col in df.columns if col.startswith('nb_')]
X = df[feature_cols].values
y = df['converted'].values

print(f"Features: {X.shape[1]}")
print(f"Samples: {len(y)}")

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"\nTrain set: {X_train.shape}")
print(f"Test set: {X_test.shape}")
print(f"Train conversion: {y_train.mean():.2%}")
print(f"Test conversion: {y_test.mean():.2%}")

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("\n✅ Data prepared for modeling")
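The scaler above is fitted on the training split only, which keeps test-set statistics out of the model. One way to make that harder to get wrong is to bundle scaling and the classifier in a scikit-learn pipeline. This is a sketch on synthetic data, not the notebook's features, and the pipeline is an alternative layout rather than what the cells here do.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features and labels, for illustration only
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(200, 4))
y_toy = (X_toy[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# fit() learns the scaling statistics from X_tr only;
# predict_proba() reuses them when transforming X_te
pipe = make_pipeline(
    StandardScaler(), LogisticRegression(max_iter=1000, random_state=42)
)
pipe.fit(X_tr, y_tr)
proba = pipe.predict_proba(X_te)[:, 1]
print(proba[:5])
```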

## 3. Model 1: Logistic Regression (Baseline)

In [None]:
# Train Logistic Regression
print("Training Logistic Regression...")
lr_model = LogisticRegression(max_iter=1000, random_state=42)
lr_model.fit(X_train_scaled, y_train)

# Predictions
lr_pred_proba = lr_model.predict_proba(X_test_scaled)[:, 1]
lr_pred = (lr_pred_proba >= 0.5).astype(int)

# Metrics
lr_accuracy = accuracy_score(y_test, lr_pred)
lr_auc = roc_auc_score(y_test, lr_pred_proba)

lr_precision, lr_recall, _ = precision_recall_curve(y_test, lr_pred_proba)
lr_pr_auc = auc(lr_recall, lr_precision)

lr_brier = brier_score_loss(y_test, lr_pred_proba)

# Store results
lr_results = {
    'accuracy': lr_accuracy,
    'roc_auc': lr_auc,
    'pr_auc': lr_pr_auc,
    'brier': lr_brier
}

print(f"✅ Logistic Regression Results:")
print(f"   Accuracy: {lr_accuracy:.3f}")
print(f"   ROC-AUC: {lr_auc:.3f}")
print(f"   PR-AUC: {lr_pr_auc:.3f}")
print(f"   Brier Score: {lr_brier:.3f}")
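The same four metrics are recomputed for every model in this notebook. A small helper could hold that logic in one place; `evaluate_binary` is a hypothetical name, not a function used in the cells here. The toy example also gives a reference point for the Brier score: a constant prediction equal to a 60% base rate scores 0.6 × 0.4 = 0.24, so a useful model should land below that.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, roc_auc_score, precision_recall_curve, auc, brier_score_loss
)

def evaluate_binary(y_true, y_proba, threshold=0.5):
    """Return accuracy, ROC-AUC, trapezoidal PR-AUC, and Brier score."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba).ravel()
    precision, recall, _ = precision_recall_curve(y_true, y_proba)
    return {
        "accuracy": accuracy_score(y_true, (y_proba >= threshold).astype(int)),
        "roc_auc": roc_auc_score(y_true, y_proba),
        "pr_auc": auc(recall, precision),
        "brier": brier_score_loss(y_true, y_proba),
    }

# Toy labels with a 60% positive rate (illustrative, not the real data)
y_toy = np.array([1, 1, 1, 0, 0, 1, 0, 1, 1, 0])

# Constant prediction at the base rate: Brier = p * (1 - p)
baseline_brier = brier_score_loss(y_toy, np.full(len(y_toy), y_toy.mean()))

# Perfectly separated scores: 0.9 for positives, 0.1 for negatives
perfect = evaluate_binary(y_toy, y_toy * 0.8 + 0.1)
print(baseline_brier, perfect)
```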

## 4. Model 2: XGBoost

In [None]:
# Train XGBoost
print("Training XGBoost...")
xgb_model = xgb.XGBClassifier(
    n_estimators=100,
    max_depth=5,
    learning_rate=0.1,
    random_state=42,
    eval_metric='logloss'
)
xgb_model.fit(X_train, y_train)

# Predictions
xgb_pred_proba = xgb_model.predict_proba(X_test)[:, 1]
xgb_pred = (xgb_pred_proba >= 0.5).astype(int)

# Metrics
xgb_accuracy = accuracy_score(y_test, xgb_pred)
xgb_auc = roc_auc_score(y_test, xgb_pred_proba)

xgb_precision, xgb_recall, _ = precision_recall_curve(y_test, xgb_pred_proba)
xgb_pr_auc = auc(xgb_recall, xgb_precision)

xgb_brier = brier_score_loss(y_test, xgb_pred_proba)

# Store results
xgb_results = {
    'accuracy': xgb_accuracy,
    'roc_auc': xgb_auc,
    'pr_auc': xgb_pr_auc,
    'brier': xgb_brier
}

print(f"✅ XGBoost Results:")
print(f"   Accuracy: {xgb_accuracy:.3f}")
print(f"   ROC-AUC: {xgb_auc:.3f}")
print(f"   PR-AUC: {xgb_pr_auc:.3f}")
print(f"   Brier Score: {xgb_brier:.3f}")

# Feature importance plot
fig, ax = plt.subplots(1, 1, figsize=(10, 6))
xgb.plot_importance(xgb_model, ax=ax, max_num_features=15)
plt.title('XGBoost Feature Importance (Top 15)', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.savefig('../results/figures/xgb_feature_importance.png', dpi=300, bbox_inches='tight')
print("✅ Saved: ../results/figures/xgb_feature_importance.png")
plt.show()

## 5. Model 3: LightGBM

In [None]:
# Train LightGBM
print("Training LightGBM...")
lgb_model = lgb.LGBMClassifier(
    n_estimators=100,
    max_depth=5,
    learning_rate=0.1,
    random_state=42
)
lgb_model.fit(X_train, y_train)

# Predictions
lgb_pred_proba = lgb_model.predict_proba(X_test)[:, 1]
lgb_pred = (lgb_pred_proba >= 0.5).astype(int)

# Metrics
lgb_accuracy = accuracy_score(y_test, lgb_pred)
lgb_auc = roc_auc_score(y_test, lgb_pred_proba)

lgb_precision, lgb_recall, _ = precision_recall_curve(y_test, lgb_pred_proba)
lgb_pr_auc = auc(lgb_recall, lgb_precision)

lgb_brier = brier_score_loss(y_test, lgb_pred_proba)

# Store results
lgb_results = {
    'accuracy': lgb_accuracy,
    'roc_auc': lgb_auc,
    'pr_auc': lgb_pr_auc,
    'brier': lgb_brier
}

print(f"✅ LightGBM Results:")
print(f"   Accuracy: {lgb_accuracy:.3f}")
print(f"   ROC-AUC: {lgb_auc:.3f}")
print(f"   PR-AUC: {lgb_pr_auc:.3f}")
print(f"   Brier Score: {lgb_brier:.3f}")

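## Note: Loading Deep Learning Results from a Previous Run

The LSTM/GRU and Transformer models are trained in Section 6 below, which writes their test metrics to `../results/metrics/`. The cell below loads those files into the `lstm_results` and `transformer_results` dictionaries, so it only works once Section 6 has been run at least once.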

In [None]:
# Load pre-trained deep learning model results
print("Loading deep learning model results...")

with open('../results/metrics/lstm_results.json', 'r') as f:
    lstm_results = json.load(f)

with open('../results/metrics/transformer_results.json', 'r') as f:
    transformer_results = json.load(f)

# Rename keys to match
lstm_results = {
    'accuracy': lstm_results['test_accuracy'],
    'roc_auc': lstm_results['test_auc'],
    'pr_auc': lstm_results['test_pr_auc'],
    'brier': lstm_results['test_brier']
}

transformer_results = {
    'accuracy': transformer_results['test_accuracy'],
    'roc_auc': transformer_results['test_auc'],
    'pr_auc': transformer_results['test_pr_auc'],
    'brier': transformer_results['test_brier']
}

print("✅ LSTM/GRU Results:")
for k, v in lstm_results.items():
    print(f"   {k}: {v:.3f}")

print("\n✅ Transformer Results:")
for k, v in transformer_results.items():
    print(f"   {k}: {v:.3f}")

## 6. Deep Learning Models - Full Training

Training LSTM/GRU and Transformer models from scratch on sequential daily usage data.

In [None]:
# Import deep learning components
import sys
sys.path.append('..')

from models.gru_model import GRUChurnModel
from models.transformer_model import TransformerChurnModel
from config.model_config import MODEL_CONFIG, DATA_CONFIG, TRAINING_CONFIG

print("✅ Imported model classes and configuration")

### 6.1 Prepare Sequential Data

In [None]:
# Prepare sequential data for deep learning
def prepare_sequence_data(subscriptions_df, daily_usage_df, usage_cols, seq_length=15):
    """Convert daily usage to sequences for each trial."""
    
    sequences = []
    labels = []
    trial_ids_list = []
    
    for trial_id in subscriptions_df['subscription_id'].values:
        # Get usage for this trial
        trial_usage = daily_usage_df[daily_usage_df['subscription_id'] == trial_id].copy()
        
        if len(trial_usage) < seq_length:
            # Pad if needed
            padding = pd.DataFrame(0, index=range(seq_length - len(trial_usage)), columns=usage_cols)
            trial_usage = pd.concat([padding, trial_usage[usage_cols]], ignore_index=True)
        elif len(trial_usage) > seq_length:
            # Truncate if needed
            trial_usage = trial_usage[usage_cols].iloc[:seq_length]
        else:
            trial_usage = trial_usage[usage_cols]
        
        # Store sequence
        sequences.append(trial_usage.values)
        
        # Get label
        label = subscriptions_df[subscriptions_df['subscription_id'] == trial_id]['converted'].values[0]
        labels.append(label)
        trial_ids_list.append(trial_id)
    
    return np.array(sequences), np.array(labels), trial_ids_list

print("Preparing sequential data...")
X_seq, y_seq, trial_ids = prepare_sequence_data(subscriptions_15d, daily_usage, usage_cols, seq_length=15)

print(f"✅ Sequential data prepared:")
print(f"   Shape: {X_seq.shape}")  
print(f"   (trials, days, features)")
print(f"   Labels: {y_seq.shape}")
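The padding rule in `prepare_sequence_data` can be sketched in plain NumPy: short sequences are left-padded with zeros so the observed days sit at the end, and long sequences keep only their first `seq_length` rows. `left_pad_or_truncate` is a hypothetical helper, and it assumes the rows are already in chronological order.

```python
import numpy as np

def left_pad_or_truncate(seq, seq_length):
    """Return an array of shape (seq_length, n_features)."""
    seq = np.asarray(seq, dtype=float)
    if len(seq) >= seq_length:
        return seq[:seq_length]            # keep the first seq_length days
    padded = np.zeros((seq_length, seq.shape[1]))
    padded[seq_length - len(seq):] = seq   # zeros first, observed days last
    return padded

# Two observed days with two features each, padded to four days
short = np.array([[1, 10], [2, 20]])
out = left_pad_or_truncate(short, seq_length=4)
print(out)

# Five observed days truncated to four
long_seq = np.arange(10).reshape(5, 2)
truncated = left_pad_or_truncate(long_seq, seq_length=4)
```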

In [None]:
# Split sequential data
from sklearn.model_selection import train_test_split

X_seq_train, X_seq_test, y_seq_train, y_seq_test = train_test_split(
    X_seq, y_seq, test_size=0.2, random_state=42, stratify=y_seq
)

# Further split train into train/val
X_seq_train, X_seq_val, y_seq_train, y_seq_val = train_test_split(
    X_seq_train, y_seq_train, test_size=0.2, random_state=42, stratify=y_seq_train
)

print(f"Train: {X_seq_train.shape}")
print(f"Val: {X_seq_val.shape}")
print(f"Test: {X_seq_test.shape}")

# Convert to PyTorch tensors
X_seq_train_t = torch.FloatTensor(X_seq_train)
y_seq_train_t = torch.FloatTensor(y_seq_train)
X_seq_val_t = torch.FloatTensor(X_seq_val)
y_seq_val_t = torch.FloatTensor(y_seq_val)
X_seq_test_t = torch.FloatTensor(X_seq_test)
y_seq_test_t = torch.FloatTensor(y_seq_test)

# Create DataLoaders
from torch.utils.data import TensorDataset, DataLoader

train_dataset = TensorDataset(X_seq_train_t, y_seq_train_t)
val_dataset = TensorDataset(X_seq_val_t, y_seq_val_t)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

print("\n✅ Data loaders created")
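Applying a 20% split twice leaves roughly 64% of the trials for training, 16% for validation, and 20% for testing. The sketch below reproduces the two-stage split on index positions, using the 503 trials quoted in the introduction and synthetic labels at about a 60% positive rate (not the real labels), to show how few samples each split actually holds.

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_trials = 503  # trial count quoted in the introduction
idx = np.arange(n_trials)
# Synthetic labels, roughly 60% positive; stand-ins for the real `converted` column
y_fake = (idx % 5 < 3).astype(int)

# Stage 1: hold out 20% for testing
train_idx, test_idx = train_test_split(
    idx, test_size=0.2, random_state=42, stratify=y_fake
)
# Stage 2: hold out 20% of the remainder for validation
train_idx, val_idx = train_test_split(
    train_idx, test_size=0.2, random_state=42, stratify=y_fake[train_idx]
)

sizes = (len(train_idx), len(val_idx), len(test_idx))
print(sizes)
```

With a validation set this small, the validation AUC used for early stopping is likely to be noisy from epoch to epoch.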

### 6.2 Train GRU Model

In [None]:
# Initialize GRU model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

input_size = X_seq_train.shape[2]  # Number of features
gru_config = MODEL_CONFIG['gru']

gru_model = GRUChurnModel(
    input_size=input_size,
    hidden_size=gru_config['hidden_size'],
    num_layers=gru_config['num_layers'],
    dropout=gru_config['dropout']
).to(device)

print(f"✅ GRU Model initialized:")
print(f"   Input size: {input_size}")
print(f"   Hidden size: {gru_config['hidden_size']}")
print(f"   Layers: {gru_config['num_layers']}")

# Loss and optimizer
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(gru_model.parameters(), lr=gru_config['learning_rate'])

print("\n✅ Optimizer configured")

In [None]:
# Training loop for GRU
def train_model(model, train_loader, val_loader, criterion, optimizer, epochs=50, patience=10):
    """Train model with early stopping."""
    
    history = {'train_loss': [], 'val_loss': [], 'val_auc': []}
    best_val_auc = 0
    patience_counter = 0
    
    for epoch in range(epochs):
        # Training
        model.train()
        train_loss = 0
        for X_batch, y_batch in train_loader:
            X_batch, y_batch = X_batch.to(device), y_batch.to(device)
            
            optimizer.zero_grad()
            outputs = model(X_batch)
            loss = criterion(outputs, y_batch)
            loss.backward()
            optimizer.step()
            
            train_loss += loss.item()
        
        train_loss /= len(train_loader)
        
        # Validation
        model.eval()
        val_loss = 0
        val_preds = []
        val_targets = []
        
        with torch.no_grad():
            for X_batch, y_batch in val_loader:
                X_batch, y_batch = X_batch.to(device), y_batch.to(device)
                
                outputs = model(X_batch)
                loss = criterion(outputs, y_batch)
                
                val_loss += loss.item()
                val_preds.extend(outputs.cpu().numpy())
                val_targets.extend(y_batch.cpu().numpy())
        
        val_loss /= len(val_loader)
        val_auc = roc_auc_score(val_targets, val_preds)
        
        history['train_loss'].append(train_loss)
        history['val_loss'].append(val_loss)
        history['val_auc'].append(val_auc)
        
        if (epoch + 1) % 5 == 0:
            print(f"Epoch {epoch+1}/{epochs}: Train Loss={train_loss:.4f}, Val Loss={val_loss:.4f}, Val AUC={val_auc:.4f}")
        
        # Early stopping
        if val_auc > best_val_auc:
            best_val_auc = val_auc
            patience_counter = 0
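            # Clone the tensors: dict.copy() is shallow, so the saved weights would keep changing as training continues
            best_state = {k: v.detach().clone() for k, v in model.state_dict().items()}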
        else:
            patience_counter += 1
            if patience_counter >= patience:
                print(f"Early stopping at epoch {epoch+1}")
                break
    
    # Restore best model
    model.load_state_dict(best_state)
    
    return history, best_val_auc

print("Training GRU model...")
gru_history, gru_best_val_auc = train_model(
    gru_model, train_loader, val_loader, criterion, optimizer,
    epochs=gru_config['epochs'],
    patience=gru_config['early_stopping_patience']
)

print(f"\n✅ GRU training complete")
print(f"   Best validation AUC: {gru_best_val_auc:.4f}")
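The patience logic in `train_model` can be isolated in a framework-free sketch. `EarlyStopper` is a hypothetical class, the validation AUCs are invented, and a plain dict stands in for `model.state_dict()`. The point it illustrates is that the best state must be snapshotted with a deep copy, because training keeps mutating the live state in place.

```python
import copy

class EarlyStopper:
    """Track the best validation score and keep a deep copy of the best state."""

    def __init__(self, patience):
        self.patience = patience
        self.best_score = float("-inf")
        self.best_state = None
        self.bad_epochs = 0

    def step(self, score, state):
        """Record one epoch; return True when training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.best_state = copy.deepcopy(state)  # snapshot, not a reference
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hypothetical validation AUCs; the dict stands in for model.state_dict()
val_aucs = [0.60, 0.70, 0.68, 0.69, 0.71, 0.65]
state = {"weights": [0.0]}
stopper = EarlyStopper(patience=2)
stopped_at = None
for epoch, score in enumerate(val_aucs, start=1):
    state["weights"][0] = float(epoch)  # "training" mutates the state in place
    if stopper.step(score, state):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_score, stopper.best_state)
```

In this toy run a patience of 2 stops before the later, higher score of 0.71 is reached, which is the usual trade-off when choosing `patience`.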

In [None]:
# Evaluate GRU on test set
gru_model.eval()
with torch.no_grad():
    gru_test_preds = gru_model(X_seq_test_t.to(device)).cpu().numpy()

# Compute metrics
gru_test_accuracy = accuracy_score(y_seq_test, (gru_test_preds >= 0.5).astype(int))
gru_test_auc = roc_auc_score(y_seq_test, gru_test_preds)

gru_precision, gru_recall, _ = precision_recall_curve(y_seq_test, gru_test_preds)
gru_test_pr_auc = auc(gru_recall, gru_precision)

gru_test_brier = brier_score_loss(y_seq_test, gru_test_preds)

# Store results
lstm_results = {  # Named lstm_results to match previous code
    'accuracy': gru_test_accuracy,
    'roc_auc': gru_test_auc,
    'pr_auc': gru_test_pr_auc,
    'brier': gru_test_brier
}

print(f"✅ GRU/LSTM Test Results:")
print(f"   Accuracy: {gru_test_accuracy:.3f}")
print(f"   ROC-AUC: {gru_test_auc:.3f}")
print(f"   PR-AUC: {gru_test_pr_auc:.3f}")
print(f"   Brier Score: {gru_test_brier:.3f}")

# Save model
torch.save(gru_model.state_dict(), '../results/models/lstm_best_model.pt')
print("\n✅ Model saved")

### 6.3 Train Transformer Model

In [None]:
# Initialize Transformer model
trans_config = MODEL_CONFIG['transformer']

transformer_model = TransformerChurnModel(
    input_size=input_size,
    d_model=trans_config['d_model'],
    nhead=trans_config['nhead'],
    num_layers=trans_config['num_layers'],
    dropout=trans_config['dropout']
).to(device)

print(f"✅ Transformer Model initialized:")
print(f"   Input size: {input_size}")
print(f"   d_model: {trans_config['d_model']}")
print(f"   Attention heads: {trans_config['nhead']}")

# Optimizer
trans_optimizer = torch.optim.Adam(transformer_model.parameters(), lr=trans_config['learning_rate'])

print("Training Transformer model...")
trans_history, trans_best_val_auc = train_model(
    transformer_model, train_loader, val_loader, criterion, trans_optimizer,
    epochs=trans_config['epochs'],
    patience=trans_config['early_stopping_patience']
)

print(f"\n✅ Transformer training complete")
print(f"   Best validation AUC: {trans_best_val_auc:.4f}")

In [None]:
# Evaluate Transformer on test set
transformer_model.eval()
with torch.no_grad():
    trans_test_preds = transformer_model(X_seq_test_t.to(device)).cpu().numpy()

# Compute metrics
trans_test_accuracy = accuracy_score(y_seq_test, (trans_test_preds >= 0.5).astype(int))
trans_test_auc = roc_auc_score(y_seq_test, trans_test_preds)

trans_precision, trans_recall, _ = precision_recall_curve(y_seq_test, trans_test_preds)
trans_test_pr_auc = auc(trans_recall, trans_precision)

trans_test_brier = brier_score_loss(y_seq_test, trans_test_preds)

# Store results
transformer_results = {
    'accuracy': trans_test_accuracy,
    'roc_auc': trans_test_auc,
    'pr_auc': trans_test_pr_auc,
    'brier': trans_test_brier
}

print(f"✅ Transformer Test Results:")
print(f"   Accuracy: {trans_test_accuracy:.3f}")
print(f"   ROC-AUC: {trans_test_auc:.3f}")
print(f"   PR-AUC: {trans_test_pr_auc:.3f}")
print(f"   Brier Score: {trans_test_brier:.3f}")

# Save model
torch.save(transformer_model.state_dict(), '../results/models/transformer_best_model.pt')

# Save results to JSON under the `test_*` keys that the loading cell expects
key_map = {'accuracy': 'test_accuracy', 'roc_auc': 'test_auc', 'pr_auc': 'test_pr_auc', 'brier': 'test_brier'}

with open('../results/metrics/lstm_results.json', 'w') as f:
    json.dump({key_map[k]: float(v) for k, v in lstm_results.items()}, f, indent=2)

with open('../results/metrics/transformer_results.json', 'w') as f:
    json.dump({key_map[k]: float(v) for k, v in transformer_results.items()}, f, indent=2)

print("\n✅ All results saved")

## 7. Comprehensive Model Comparison

In [None]:
# Collect ALL model results
all_model_results = {
    'Logistic Regression': lr_results,
    'XGBoost': xgb_results,
    'LightGBM': lgb_results,  
    'LSTM/GRU': lstm_results,
    'Transformer': transformer_results
}

# Create comparison dataframe
comparison_df = pd.DataFrame(all_model_results).T
comparison_df.columns = ['Accuracy', 'ROC-AUC', 'PR-AUC', 'Brier Score']

print("="*70)
print("COMPLETE MODEL COMPARISON")
print("="*70)
print(comparison_df.round(4))
print("="*70)

# Save to CSV
comparison_df.to_csv('../results/metrics/all_models_comparison.csv')
print("\n✅ Saved: ../results/metrics/all_models_comparison.csv")
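When reading the comparison table, remember that the Brier score is the only metric here where lower is better. A short sketch of picking the winner per metric follows; the model names and scores are invented for illustration and are not results from this analysis.

```python
import pandas as pd

# Made-up scores for illustration; not results from this analysis
toy = pd.DataFrame(
    {
        "Accuracy": [0.70, 0.74, 0.72],
        "ROC-AUC": [0.75, 0.78, 0.80],
        "PR-AUC": [0.80, 0.83, 0.82],
        "Brier Score": [0.21, 0.19, 0.20],
    },
    index=["Model A", "Model B", "Model C"],
)

# Brier score is an error measure, so the minimum wins
lower_is_better = {"Brier Score"}
best = {
    metric: (toy[metric].idxmin() if metric in lower_is_better else toy[metric].idxmax())
    for metric in toy.columns
}
print(best)
```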

### 7.1 ROC Curves - All Models

In [None]:
# Plot ROC curves for all models
fig, ax = plt.subplots(1, 1, figsize=(10, 8))

# Compute ROC curves
lr_fpr, lr_tpr, _ = roc_curve(y_test, lr_pred_proba)
xgb_fpr, xgb_tpr, _ = roc_curve(y_test, xgb_pred_proba)
lgb_fpr, lgb_tpr, _ = roc_curve(y_test, lgb_pred_proba)
# Deep learning models: use the Section 6 test predictions rather than placeholder markers
gru_fpr, gru_tpr, _ = roc_curve(y_seq_test, np.ravel(gru_test_preds))
trans_fpr, trans_tpr, _ = roc_curve(y_seq_test, np.ravel(trans_test_preds))

# Plot each model
ax.plot(lr_fpr, lr_tpr, label=f"Logistic Regression (AUC={lr_auc:.3f})", linewidth=2)
ax.plot(xgb_fpr, xgb_tpr, label=f"XGBoost (AUC={xgb_auc:.3f})", linewidth=2)
ax.plot(lgb_fpr, lgb_tpr, label=f"LightGBM (AUC={lgb_auc:.3f})", linewidth=2)
ax.plot(gru_fpr, gru_tpr, label=f"LSTM/GRU (AUC={lstm_results['roc_auc']:.3f})", linewidth=2)
ax.plot(trans_fpr, trans_tpr, label=f"Transformer (AUC={transformer_results['roc_auc']:.3f})", linewidth=2)

# Diagonal line
ax.plot([0, 1], [0, 1], 'k--', label='Random Classifier', linewidth=1)

ax.set_xlabel('False Positive Rate', fontsize=12, fontweight='bold')
ax.set_ylabel('True Positive Rate', fontsize=12, fontweight='bold')
ax.set_title('ROC Curves - All 5 Models', fontsize=14, fontweight='bold')
ax.legend(loc='lower right', fontsize=10)
ax.grid(alpha=0.3)

plt.tight_layout()
plt.savefig('../results/figures/roc_curves_all_models.png', dpi=300, bbox_inches='tight')
print("✅ Saved: ../results/figures/roc_curves_all_models.png")
plt.show()

### 7.2 Precision-Recall Curves

In [None]:
# Plot PR curves
fig, ax = plt.subplots(1, 1, figsize=(10, 8))

# Plot each model
ax.plot(lr_recall, lr_precision, label=f"Logistic Regression (AUC={lr_pr_auc:.3f})", linewidth=2)
ax.plot(xgb_recall, xgb_precision, label=f"XGBoost (AUC={xgb_pr_auc:.3f})", linewidth=2)
ax.plot(lgb_recall, lgb_precision, label=f"LightGBM (AUC={lgb_pr_auc:.3f})", linewidth=2)

# Deep learning models: plot the PR curves computed in Section 6 rather than placeholder markers
ax.plot(gru_recall, gru_precision, label=f"LSTM/GRU (AUC={lstm_results['pr_auc']:.3f})", linewidth=2)
ax.plot(trans_recall, trans_precision, label=f"Transformer (AUC={transformer_results['pr_auc']:.3f})", linewidth=2)

# Baseline
baseline = y_test.mean()
ax.axhline(y=baseline, color='k', linestyle='--', label=f'Baseline ({baseline:.3f})', linewidth=1)

ax.set_xlabel('Recall', fontsize=12, fontweight='bold')
ax.set_ylabel('Precision', fontsize=12, fontweight='bold')
ax.set_title('Precision-Recall Curves - All 5 Models', fontsize=14, fontweight='bold')
ax.legend(loc='best', fontsize=10)
ax.grid(alpha=0.3)

plt.tight_layout()
plt.savefig('../results/figures/pr_curves_all_models.png', dpi=300, bbox_inches='tight')
print("✅ Saved: ../results/figures/pr_curves_all_models.png")
plt.show()

### 7.3 Metrics Comparison Bar Charts

In [None]:
# Create metrics comparison visualization
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

metrics = ['Accuracy', 'ROC-AUC', 'PR-AUC', 'Brier Score']
colors = ['steelblue', 'coral', 'mediumseagreen', 'plum', 'gold']

for idx, metric in enumerate(metrics):
    ax = axes[idx // 2, idx % 2]
    
    # Sort by metric
    sorted_df = comparison_df.sort_values(metric, ascending=(metric == 'Brier Score'))
    
    bars = ax.barh(sorted_df.index, sorted_df[metric], color=colors[:len(sorted_df)])
    
    # Add value labels
    for i, (model_name, value) in enumerate(sorted_df[metric].items()):
        ax.text(value + 0.01, i, f'{value:.3f}', va='center', fontweight='bold')
    
    # Highlight best
    best_idx = sorted_df[metric].idxmax() if metric != 'Brier Score' else sorted_df[metric].idxmin()
    best_pos = list(sorted_df.index).index(best_idx)
    bars[best_pos].set_edgecolor('darkgreen')
    bars[best_pos].set_linewidth(3)
    
    ax.set_xlabel(metric, fontsize=11, fontweight='bold')
    ax.set_title(f'{metric} Comparison', fontsize=12, fontweight='bold')
    ax.grid(axis='x', alpha=0.3)

plt.suptitle('Model Performance - All Metrics', fontsize=16, fontweight='bold', y=0.995)
plt.tight_layout()
plt.savefig('../results/figures/metrics_comparison_all.png', dpi=300, bbox_inches='tight')
print("✅ Saved: ../results/figures/metrics_comparison_all.png")
plt.show()

## 8. Key Findings & Recommendations

### Best Models by Metric
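The winner for each metric depends on the run; read it from `comparison_df` (Section 7) or from the final summary cell below:
- **Best ROC-AUC**: `comparison_df['ROC-AUC'].idxmax()`
- **Best PR-AUC**: `comparison_df['PR-AUC'].idxmax()`
- **Best Accuracy**: `comparison_df['Accuracy'].idxmax()`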

### Business Recommendations
1. **Deploy LSTM/GRU model** for production scoring (best PR-AUC)
2. **Use XGBoost SHAP** analysis to explain predictions to the CX team
3. **Target trials at high risk of not converting** (predicted conversion probability < 0.4) for intervention
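The third recommendation amounts to a simple threshold rule on the model's predicted conversion probability. Here is a sketch; `flag_at_risk` is a hypothetical helper, and the ids and scores are invented. It returns the trials below the 0.4 cutoff, ordered so that the least likely converters come first.

```python
import numpy as np

def flag_at_risk(trial_ids, conversion_proba, threshold=0.4):
    """Return ids with predicted conversion probability below the threshold."""
    proba = np.asarray(conversion_proba, dtype=float)
    mask = proba < threshold
    order = np.argsort(proba[mask])  # lowest conversion probability first
    return [trial_ids[i] for i in np.flatnonzero(mask)[order]]

# Invented ids and scores, for illustration only
ids = ["t1", "t2", "t3", "t4"]
scores = [0.85, 0.35, 0.10, 0.40]
at_risk = flag_at_risk(ids, scores)
print(at_risk)
```

The comparison is strict, so a trial scored at exactly 0.40 is not flagged. The cutoff itself is a business choice and would normally be tuned against intervention capacity.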

### All Plots Generated
- ✅ `../results/figures/xgb_feature_importance.png`
- ✅ `../results/figures/roc_curves_all_models.png`
- ✅ `../results/figures/pr_curves_all_models.png`
- ✅ `../results/figures/metrics_comparison_all.png`

In [None]:
# Final summary
print("="*70)
print("KOLECTO CHURN PREDICTION - ANALYSIS COMPLETE")
print("="*70)
print(f"\n📊 Models Trained: 5")
print(f"   1. Logistic Regression")
print(f"   2. XGBoost")
print(f"   3. LightGBM")
print(f"   4. LSTM/GRU")
print(f"   5. Transformer")

print(f"\n🏆 Best Performance:")
print(f"   ROC-AUC: {comparison_df['ROC-AUC'].max():.3f} ({comparison_df['ROC-AUC'].idxmax()})")
print(f"   PR-AUC: {comparison_df['PR-AUC'].max():.3f} ({comparison_df['PR-AUC'].idxmax()})")

print(f"\n✅ All results saved to ../results/")
print("="*70)