# Threshold Optimization & ROI Analysis

**Purpose:** Optimize the classification threshold to maximize business value (net savings from retention campaigns, with ROI reported alongside) rather than statistical metrics such as accuracy or F1.

**Context:** The default threshold (0.5) may not be optimal for business objectives. We want to find the threshold that maximizes net savings from retention campaigns.

**Author:** Noah Gallagher  
**Date:** November 2025

## 1. Setup & Imports

In [None]:
import sys
sys.path.append('..')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import joblib
from pathlib import Path

from sklearn.metrics import (
    confusion_matrix, precision_recall_curve, roc_curve, 
    accuracy_score, precision_score, recall_score, f1_score
)

# Project imports
from src import config

# Visualization settings
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette('Set2')
%matplotlib inline

## 2. Load Model & Test Data

In [None]:
# Load best model
model = joblib.load(config.MODEL_FILE)
print(f"Loaded model: {type(model).__name__}")

# Load test data
test_data = pd.read_csv(config.TEST_DATA_FILE)
X_test = test_data.drop(config.TARGET_COLUMN, axis=1)
y_test = test_data[config.TARGET_COLUMN]

print(f"Test set size: {len(X_test)} customers")
print(f"Churn rate: {y_test.mean():.1%}")

## 3. Generate Predictions

In [None]:
# Get predicted probabilities
y_pred_proba = model.predict_proba(X_test)[:, 1]

# Default predictions (threshold = 0.5)
y_pred_default = (y_pred_proba >= 0.5).astype(int)

print("Prediction distribution:")
print(pd.Series(y_pred_proba).describe())

# Visualize probability distribution
plt.figure(figsize=(10, 5))
plt.hist(y_pred_proba[y_test == 0], bins=50, alpha=0.5, label='No Churn (Actual)', color='green')
plt.hist(y_pred_proba[y_test == 1], bins=50, alpha=0.5, label='Churn (Actual)', color='red')
plt.axvline(x=0.5, color='black', linestyle='--', label='Default Threshold')
plt.xlabel('Predicted Churn Probability')
plt.ylabel('Number of Customers')
plt.title('Distribution of Predicted Probabilities by Actual Churn Status')
plt.legend()
plt.show()

## 4. Business Cost Parameters

Define the economic framework for evaluating model performance:

- **Customer Lifetime Value (CLV):** Average revenue from a retained customer
- **Retention Cost:** Cost of intervention (outreach + discount)
- **Churn Cost:** Lost revenue when a customer leaves

**Cost-Benefit Matrix** (per customer):

| Actual \ Predicted | Predict No Churn | Predict Churn |
|--------------------|------------------|---------------|
| **No Churn** | $0 (True Negative) | -$100 (False Positive - wasted intervention) |
| **Churn** | -$1,500 (False Negative - missed churner) | +$1,400 (True Positive - saved customer, net of intervention cost) |

The +$1,400 figure assumes the intervention always succeeds. The calculations in Section 5 discount the true-positive benefit by the retention success rate.
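
To make the matrix concrete, here is a minimal sketch of the expected value of each cell once a retention success rate is applied, and of the churn probability at which targeting a customer breaks even. It mirrors the accounting used by `calculate_business_metrics` later in this notebook (targeted churners who still leave are not charged). The function names are hypothetical, the dollar figures are the ones from the table, and the break-even result only carries over to a model threshold if the predicted probabilities are reasonably well calibrated.

```python
def expected_cell_values(churn_cost=1500.0, retention_cost=100.0, success_rate=0.70):
    """Expected per-customer value of each confusion-matrix cell (hypothetical helper)."""
    return {
        "TN": 0.0,                                           # no action, no cost
        "FP": -retention_cost,                               # wasted intervention
        "FN": -churn_cost,                                   # missed churner
        "TP": success_rate * churn_cost - retention_cost,    # expected saved revenue net of cost
    }


def break_even_probability(churn_cost=1500.0, retention_cost=100.0, success_rate=0.70):
    """Churn probability p above which targeting beats not targeting.

    Targeting:      p * TP + (1 - p) * FP
    Not targeting:  p * FN + (1 - p) * TN
    Setting the two equal and solving for p gives the expression below.
    """
    v = expected_cell_values(churn_cost, retention_cost, success_rate)
    gain_if_churner = v["TP"] - v["FN"]   # benefit of targeting an actual churner
    loss_if_loyal = v["TN"] - v["FP"]     # cost of targeting a non-churner
    return loss_if_loyal / (gain_if_churner + loss_if_loyal)


cell_values = expected_cell_values()
p_star = break_even_probability()
print(f"Expected TP value: ${cell_values['TP']:,.0f}")
print(f"Break-even churn probability: {p_star:.3f}")
```

Because a missed churner is so much more expensive than a wasted intervention, the break-even probability under these assumptions sits far below 0.5, which is the intuition behind sweeping thresholds instead of accepting the default.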

In [None]:
# Business parameters from config
CLV = config.CUSTOMER_LIFETIME_VALUE  # $2,000 (reported for context; not used in the calculations below)
RETENTION_COST = config.RETENTION_COST  # $100
CHURN_COST = config.CHURN_COST  # $1,500

# Retention success rate (% of targeted churners who actually stay)
RETENTION_SUCCESS_RATE = 0.70

print(f"Customer Lifetime Value: ${CLV:,.0f}")
print(f"Retention Intervention Cost: ${RETENTION_COST:,.0f}")
print(f"Cost of Lost Customer: ${CHURN_COST:,.0f}")
print(f"Retention Success Rate: {RETENTION_SUCCESS_RATE:.0%}")
print(f"\nValue of saving one churner: ${CHURN_COST - RETENTION_COST:,.0f}")

## 5. ROI Calculation Function

In [None]:
def calculate_business_metrics(y_true, y_pred, retention_success_rate=0.70):
    """
    Calculate business impact metrics based on confusion matrix.
    
    Args:
        y_true: Actual churn labels (0 = no churn, 1 = churn)
        y_pred: Predicted churn labels (0 = no churn, 1 = churn)
        retention_success_rate: % of targeted churners who are successfully retained
    
    Returns:
        Dictionary with business metrics
    """
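    # Confusion matrix (labels fixed so the result is always 2x2,
    # even when every prediction falls into a single class)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()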
    
    # Statistical metrics
    accuracy = accuracy_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred, zero_division=0)
    recall = recall_score(y_true, y_pred, zero_division=0)
    f1 = f1_score(y_true, y_pred, zero_division=0)
    
    # Business calculations
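    # True Positives: correctly identified churners.
    # With intervention, a fraction retention_success_rate of them is retained.
    # Note: targeted churners who still leave, tp * (1 - retention_success_rate),
    # are not charged as lost revenue in this model.
    customers_saved = tp * retention_success_rate
    revenue_saved = customers_saved * CHURN_COST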
    
    # False Positives: Incorrectly flagged non-churners (wasted intervention cost)
    wasted_cost = fp * RETENTION_COST
    
    # False Negatives: Missed churners (lost revenue)
    lost_revenue = fn * CHURN_COST
    
    # True Negatives: Correctly identified non-churners (no cost, no benefit)
    # No action taken, no cost
    
    # Total intervention cost
    total_intervention_cost = (tp + fp) * RETENTION_COST
    
    # Net savings
    net_savings = revenue_saved - total_intervention_cost - lost_revenue
    
    # ROI
    roi = (revenue_saved - total_intervention_cost) / total_intervention_cost if total_intervention_cost > 0 else 0
    
    return {
        # Confusion matrix
        'true_negatives': tn,
        'false_positives': fp,
        'false_negatives': fn,
        'true_positives': tp,
        
        # Statistical metrics
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        
        # Business metrics
        'customers_targeted': tp + fp,
        'customers_saved': customers_saved,
        'revenue_saved': revenue_saved,
        'wasted_cost': wasted_cost,
        'lost_revenue': lost_revenue,
        'total_intervention_cost': total_intervention_cost,
        'net_savings': net_savings,
        'roi': roi
    }

# Test with default threshold
default_metrics = calculate_business_metrics(y_test, y_pred_default, RETENTION_SUCCESS_RATE)

print("=" * 60)
print("DEFAULT THRESHOLD (0.5) PERFORMANCE")
print("=" * 60)
print(f"\nConfusion Matrix:")
print(f"  True Negatives:  {default_metrics['true_negatives']:>4}  (Correctly identified non-churners)")
print(f"  False Positives: {default_metrics['false_positives']:>4}  (Wasted interventions)")
print(f"  False Negatives: {default_metrics['false_negatives']:>4}  (Missed churners)")
print(f"  True Positives:  {default_metrics['true_positives']:>4}  (Correctly identified churners)")
print(f"\nStatistical Metrics:")
print(f"  Accuracy:  {default_metrics['accuracy']:.1%}")
print(f"  Precision: {default_metrics['precision']:.1%}")
print(f"  Recall:    {default_metrics['recall']:.1%}")
print(f"  F1 Score:  {default_metrics['f1_score']:.1%}")
print(f"\nBusiness Impact:")
print(f"  Customers Targeted:     {default_metrics['customers_targeted']:>4}")
print(f"  Customers Saved:        {default_metrics['customers_saved']:>6.0f}")
print(f"  Revenue Saved:          ${default_metrics['revenue_saved']:>9,.0f}")
print(f"  Intervention Cost:      ${default_metrics['total_intervention_cost']:>9,.0f}")
print(f"  Lost Revenue (FN):      ${default_metrics['lost_revenue']:>9,.0f}")
print(f"  Net Savings:            ${default_metrics['net_savings']:>9,.0f}")
print(f"  ROI:                    {default_metrics['roi']:>9.1%}")
print("=" * 60)

## 6. Threshold Optimization

Test a range of thresholds (0.05 to 0.99 in steps of 0.01) to find the one that maximizes net savings.
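
The sweep can also be written without a Python loop. The following is a minimal sketch, not the notebook's own implementation: `sweep_net_savings` is a hypothetical helper that applies the same net-savings formula as `calculate_business_metrics`, and the default dollar amounts and success rate are the values quoted in Section 4.

```python
import numpy as np


def sweep_net_savings(y_true, y_score, thresholds,
                      churn_cost=1500.0, retention_cost=100.0, success_rate=0.70):
    """Net savings at each threshold, computed with NumPy broadcasting (hypothetical helper)."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)

    # flagged[i, j] is True when customer j is targeted at thresholds[i]
    flagged = y_score[None, :] >= thresholds[:, None]

    tp = (flagged & y_true).sum(axis=1)    # targeted churners
    fp = (flagged & ~y_true).sum(axis=1)   # targeted non-churners
    fn = y_true.sum() - tp                 # churners left untargeted

    revenue_saved = success_rate * tp * churn_cost
    intervention_cost = (tp + fp) * retention_cost
    lost_revenue = fn * churn_cost
    return revenue_saved - intervention_cost - lost_revenue


# Tiny illustrative input (made-up labels and scores, not project data)
demo_thresholds = np.array([0.05, 0.5])
demo_savings = sweep_net_savings([0, 0, 1, 1], [0.1, 0.6, 0.4, 0.9], demo_thresholds)
best = demo_thresholds[np.argmax(demo_savings)]
print(dict(zip(demo_thresholds.tolist(), demo_savings.tolist())), "best:", best)
```

This builds a thresholds-by-customers boolean matrix, so it trades memory for speed. For a test set of a few thousand rows and about a hundred thresholds that is negligible.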

In [None]:
# Test range of thresholds
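# Integer steps avoid the floating-point drift of np.arange with a fractional step
thresholds = np.arange(5, 100) / 100  # 0.05, 0.06, ..., 0.99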
results = []

for threshold in thresholds:
    # Make predictions with this threshold
    y_pred = (y_pred_proba >= threshold).astype(int)
    
    # Calculate metrics
    metrics = calculate_business_metrics(y_test, y_pred, RETENTION_SUCCESS_RATE)
    metrics['threshold'] = threshold
    results.append(metrics)

# Convert to DataFrame
results_df = pd.DataFrame(results)

# Find optimal threshold
optimal_idx = results_df['net_savings'].idxmax()
optimal_threshold = results_df.loc[optimal_idx, 'threshold']
optimal_net_savings = results_df.loc[optimal_idx, 'net_savings']

print(f"Optimal Threshold: {optimal_threshold:.2f}")
print(f"Maximum Net Savings: ${optimal_net_savings:,.0f}")
print(f"\nImprovement over default (0.5):")
print(f"  Default Net Savings: ${default_metrics['net_savings']:,.0f}")
print(f"  Optimal Net Savings: ${optimal_net_savings:,.0f}")
print(f"  Additional Savings:  ${optimal_net_savings - default_metrics['net_savings']:,.0f}")

# Display top 10 thresholds
print("\nTop 10 Thresholds by Net Savings:")
print(results_df.nlargest(10, 'net_savings')[[
    'threshold', 'recall', 'precision', 'customers_targeted', 
    'customers_saved', 'net_savings', 'roi'
]].to_string(index=False))

## 7. Visualize Threshold Impact

In [None]:
# Create comprehensive visualization
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# Plot 1: Net Savings vs Threshold
ax1 = axes[0, 0]
ax1.plot(results_df['threshold'], results_df['net_savings'] / 1000, linewidth=2, color='#2ca02c')
ax1.axvline(x=optimal_threshold, color='red', linestyle='--', label=f'Optimal: {optimal_threshold:.2f}')
ax1.axvline(x=0.5, color='gray', linestyle=':', label='Default: 0.50')
ax1.set_xlabel('Classification Threshold', fontsize=12)
ax1.set_ylabel('Net Savings ($000s)', fontsize=12)
ax1.set_title('Net Savings vs. Classification Threshold', fontsize=14, fontweight='bold')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Plot 2: Precision & Recall vs Threshold
ax2 = axes[0, 1]
ax2.plot(results_df['threshold'], results_df['precision'], linewidth=2, label='Precision', color='#1f77b4')
ax2.plot(results_df['threshold'], results_df['recall'], linewidth=2, label='Recall', color='#ff7f0e')
ax2.axvline(x=optimal_threshold, color='red', linestyle='--', alpha=0.5)
ax2.axvline(x=0.5, color='gray', linestyle=':', alpha=0.5)
ax2.set_xlabel('Classification Threshold', fontsize=12)
ax2.set_ylabel('Score', fontsize=12)
ax2.set_title('Precision & Recall vs. Threshold', fontsize=14, fontweight='bold')
ax2.legend()
ax2.grid(True, alpha=0.3)

# Plot 3: ROI vs Threshold
ax3 = axes[1, 0]
ax3.plot(results_df['threshold'], results_df['roi'] * 100, linewidth=2, color='#9467bd')
ax3.axvline(x=optimal_threshold, color='red', linestyle='--', label=f'Optimal: {optimal_threshold:.2f}')
ax3.axvline(x=0.5, color='gray', linestyle=':', label='Default: 0.50')
ax3.set_xlabel('Classification Threshold', fontsize=12)
ax3.set_ylabel('ROI (%)', fontsize=12)
ax3.set_title('Return on Investment vs. Threshold', fontsize=14, fontweight='bold')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Plot 4: Customers Targeted vs Customers Saved
ax4 = axes[1, 1]
ax4.plot(results_df['threshold'], results_df['customers_targeted'], linewidth=2, label='Customers Targeted', color='#d62728')
ax4.plot(results_df['threshold'], results_df['customers_saved'], linewidth=2, label='Customers Saved', color='#2ca02c')
ax4.axvline(x=optimal_threshold, color='red', linestyle='--', alpha=0.5)
ax4.axvline(x=0.5, color='gray', linestyle=':', alpha=0.5)
ax4.set_xlabel('Classification Threshold', fontsize=12)
ax4.set_ylabel('Number of Customers', fontsize=12)
ax4.set_title('Customer Volume vs. Threshold', fontsize=14, fontweight='bold')
ax4.legend()
ax4.grid(True, alpha=0.3)

plt.tight_layout()
Path('../outputs/figures').mkdir(parents=True, exist_ok=True)  # ensure the output folder exists
plt.savefig('../outputs/figures/threshold_optimization.png', dpi=300, bbox_inches='tight')
plt.show()

print("Visualization saved to outputs/figures/threshold_optimization.png")

## 8. Optimal Threshold Analysis

In [None]:
# Make predictions with optimal threshold
y_pred_optimal = (y_pred_proba >= optimal_threshold).astype(int)
optimal_metrics = calculate_business_metrics(y_test, y_pred_optimal, RETENTION_SUCCESS_RATE)

print("=" * 60)
print(f"OPTIMAL THRESHOLD ({optimal_threshold:.2f}) PERFORMANCE")
print("=" * 60)
print(f"\nConfusion Matrix:")
print(f"  True Negatives:  {optimal_metrics['true_negatives']:>4}  (Correctly identified non-churners)")
print(f"  False Positives: {optimal_metrics['false_positives']:>4}  (Wasted interventions)")
print(f"  False Negatives: {optimal_metrics['false_negatives']:>4}  (Missed churners)")
print(f"  True Positives:  {optimal_metrics['true_positives']:>4}  (Correctly identified churners)")
print(f"\nStatistical Metrics:")
print(f"  Accuracy:  {optimal_metrics['accuracy']:.1%}")
print(f"  Precision: {optimal_metrics['precision']:.1%}")
print(f"  Recall:    {optimal_metrics['recall']:.1%}")
print(f"  F1 Score:  {optimal_metrics['f1_score']:.1%}")
print(f"\nBusiness Impact:")
print(f"  Customers Targeted:     {optimal_metrics['customers_targeted']:>4}")
print(f"  Customers Saved:        {optimal_metrics['customers_saved']:>6.0f}")
print(f"  Revenue Saved:          ${optimal_metrics['revenue_saved']:>9,.0f}")
print(f"  Intervention Cost:      ${optimal_metrics['total_intervention_cost']:>9,.0f}")
print(f"  Lost Revenue (FN):      ${optimal_metrics['lost_revenue']:>9,.0f}")
print(f"  Net Savings:            ${optimal_metrics['net_savings']:>9,.0f}")
print(f"  ROI:                    {optimal_metrics['roi']:>9.1%}")
print("=" * 60)

# Comparison table
comparison = pd.DataFrame({
    'Metric': ['Threshold', 'Accuracy', 'Precision', 'Recall', 'F1 Score',
               'Customers Targeted', 'Customers Saved', 'Net Savings', 'ROI'],
    'Default (0.5)': [
        0.50,
        default_metrics['accuracy'],
        default_metrics['precision'],
        default_metrics['recall'],
        default_metrics['f1_score'],
        default_metrics['customers_targeted'],
        default_metrics['customers_saved'],
        default_metrics['net_savings'],
        default_metrics['roi']
    ],
    f'Optimal ({optimal_threshold:.2f})': [
        optimal_threshold,
        optimal_metrics['accuracy'],
        optimal_metrics['precision'],
        optimal_metrics['recall'],
        optimal_metrics['f1_score'],
        optimal_metrics['customers_targeted'],
        optimal_metrics['customers_saved'],
        optimal_metrics['net_savings'],
        optimal_metrics['roi']
    ]
})

print("\n\nCOMPARISON: DEFAULT VS. OPTIMAL THRESHOLD")
print(comparison.to_string(index=False))

## 9. Sensitivity Analysis

Test how robust the optimal threshold is to changes in the retention success rate, varied from 50% to 90% with the other business parameters held fixed.
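
Other business parameters can be varied the same way. As a rough sketch, the block below sweeps both the intervention cost and the retention success rate on synthetic data. Everything in it is an assumption made for illustration: the `best_threshold` helper, the randomly generated labels and scores, and the grid of costs and rates. None of it comes from the project data.

```python
import numpy as np


def best_threshold(y_true, y_score, thresholds, churn_cost, retention_cost, success_rate):
    """Return (threshold, net_savings) with the highest net savings (hypothetical helper)."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    best_t, best_value = None, -np.inf
    for t in thresholds:
        flagged = y_score >= t
        tp = np.sum(flagged & y_true)
        fp = np.sum(flagged & ~y_true)
        fn = np.sum(~flagged & y_true)
        value = success_rate * tp * churn_cost - (tp + fp) * retention_cost - fn * churn_cost
        if value > best_value:  # strict ">" keeps the lowest threshold on ties
            best_t, best_value = float(t), float(value)
    return best_t, best_value


# Synthetic stand-in for model output (assumption: roughly 25% churners)
rng = np.random.default_rng(0)
y_true_demo = rng.random(2000) < 0.25
y_score_demo = np.clip(0.25 + 0.35 * y_true_demo + rng.normal(0, 0.15, 2000), 0, 1)

grid = np.arange(5, 100) / 100
sensitivity_grid = {
    (cost, rate): best_threshold(y_true_demo, y_score_demo, grid, 1500.0, cost, rate)
    for cost in (50.0, 100.0, 200.0)
    for rate in (0.5, 0.7, 0.9)
}

for (cost, rate), (t, value) in sensitivity_grid.items():
    print(f"cost=${cost:>5.0f}  success={rate:.0%}  best threshold={t:.2f}  net savings=${value:,.0f}")
```

If the best threshold barely moves across the grid, the recommendation does not hinge on any single cost estimate. A large shift would suggest pinning that parameter down more carefully before deployment.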

In [None]:
# Test different retention success rates
retention_rates = [0.50, 0.60, 0.70, 0.80, 0.90]
sensitivity_results = []

for rate in retention_rates:
    # Find optimal threshold for this retention rate
    temp_results = []
    for threshold in thresholds:
        y_pred = (y_pred_proba >= threshold).astype(int)
        metrics = calculate_business_metrics(y_test, y_pred, rate)
        metrics['threshold'] = threshold
        temp_results.append(metrics)
    
    temp_df = pd.DataFrame(temp_results)
    optimal_idx = temp_df['net_savings'].idxmax()
    
    sensitivity_results.append({
        'retention_rate': rate,
        'optimal_threshold': temp_df.loc[optimal_idx, 'threshold'],
        'net_savings': temp_df.loc[optimal_idx, 'net_savings'],
        'roi': temp_df.loc[optimal_idx, 'roi']
    })

sensitivity_df = pd.DataFrame(sensitivity_results)

print("SENSITIVITY ANALYSIS: Retention Success Rate Impact")
print(sensitivity_df.to_string(index=False))

# Visualize
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Optimal threshold vs retention rate
axes[0].plot(sensitivity_df['retention_rate'] * 100, sensitivity_df['optimal_threshold'], 
             marker='o', linewidth=2, markersize=8, color='#1f77b4')
axes[0].set_xlabel('Retention Success Rate (%)', fontsize=12)
axes[0].set_ylabel('Optimal Classification Threshold', fontsize=12)
axes[0].set_title('Optimal Threshold vs. Retention Success Rate', fontsize=14, fontweight='bold')
axes[0].grid(True, alpha=0.3)

# Net savings vs retention rate
axes[1].plot(sensitivity_df['retention_rate'] * 100, sensitivity_df['net_savings'] / 1000, 
             marker='o', linewidth=2, markersize=8, color='#2ca02c')
axes[1].set_xlabel('Retention Success Rate (%)', fontsize=12)
axes[1].set_ylabel('Net Savings ($000s)', fontsize=12)
axes[1].set_title('Net Savings vs. Retention Success Rate', fontsize=14, fontweight='bold')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('../outputs/figures/sensitivity_analysis.png', dpi=300, bbox_inches='tight')
plt.show()

print("\nVisualization saved to outputs/figures/sensitivity_analysis.png")

## 10. Recommendations

Based on this analysis:

In [None]:
print("=" * 70)
print("THRESHOLD OPTIMIZATION RECOMMENDATIONS")
print("=" * 70)
print(f"\n1. RECOMMENDED THRESHOLD: {optimal_threshold:.2f}")
print(f"   - This threshold maximizes net savings at ${optimal_net_savings:,.0f}")
print(f"   - Represents ${optimal_net_savings - default_metrics['net_savings']:,.0f} improvement over default (0.5)")

print(f"\n2. BUSINESS JUSTIFICATION:")
print(f"   - Recall: {optimal_metrics['recall']:.1%} (catches {optimal_metrics['recall']:.0%} of churners)")
print(f"   - Precision: {optimal_metrics['precision']:.1%} ({optimal_metrics['precision']:.0%} of targeted customers are actual churners)")
print(f"   - ROI: {optimal_metrics['roi']:.0%} (every $1 spent returns ${optimal_metrics['roi'] + 1:.2f})")

print(f"\n3. TRADE-OFFS:")
if optimal_threshold < 0.5:
    print(f"   - Lower threshold means MORE customers targeted ({optimal_metrics['customers_targeted']} vs {default_metrics['customers_targeted']})")
    print(f"   - Higher recall ({optimal_metrics['recall']:.1%}) but lower precision ({optimal_metrics['precision']:.1%})")
    print(f"   - Justified by high cost of missing churners (${CHURN_COST} vs ${RETENTION_COST} intervention)")
else:
    print(f"   - Higher threshold means FEWER customers targeted ({optimal_metrics['customers_targeted']} vs {default_metrics['customers_targeted']})")
    print(f"   - Lower recall ({optimal_metrics['recall']:.1%}) but higher precision ({optimal_metrics['precision']:.1%})")
    print(f"   - Focused on high-confidence predictions to minimize wasted interventions")

print(f"\n4. SENSITIVITY:")
threshold_spread = sensitivity_df['optimal_threshold'].max() - sensitivity_df['optimal_threshold'].min()
print(f"   - Optimal threshold ranges from {sensitivity_df['optimal_threshold'].min():.2f} to {sensitivity_df['optimal_threshold'].max():.2f}")
print(f"     across retention success rates of {sensitivity_df['retention_rate'].min():.0%}-{sensitivity_df['retention_rate'].max():.0%}")
print(f"   - Spread of {threshold_spread:.2f}: a small spread indicates the recommendation is robust to retention effectiveness")

print(f"\n5. IMPLEMENTATION:")
print(f"   - Update dashboard to use threshold = {optimal_threshold:.2f}")
print(f"   - Monitor actual retention success rate and re-optimize quarterly")
print(f"   - Consider A/B testing: {optimal_threshold:.2f} vs. 0.50 to validate in production")

print(f"\n6. PROJECTED IMPACT (linear scaling from the test set to the full dataset):")
FULL_DATASET_SIZE = 7043  # rows in the full dataset; the test set is about 20% of it
test_size_ratio = len(X_test) / FULL_DATASET_SIZE
projected_net_savings = optimal_net_savings / test_size_ratio
print(f"   - Test set net savings: ${optimal_net_savings:,.0f}")
print(f"   - Projected net savings (full dataset): ${projected_net_savings:,.0f}")
print(f"   - Additional savings vs default threshold: ${(optimal_net_savings - default_metrics['net_savings']) / test_size_ratio:,.0f}")

print("=" * 70)

## 11. Save Results

In [None]:
# Save threshold optimization results
Path('../outputs/reports').mkdir(parents=True, exist_ok=True)  # ensure the output folder exists
results_df.to_csv('../outputs/reports/threshold_optimization_results.csv', index=False)
print("Saved: outputs/reports/threshold_optimization_results.csv")

# Save sensitivity analysis
sensitivity_df.to_csv('../outputs/reports/sensitivity_analysis_results.csv', index=False)
print("Saved: outputs/reports/sensitivity_analysis_results.csv")

# Save optimal threshold configuration
optimal_config = {
    'optimal_threshold': optimal_threshold,
    'default_threshold': 0.5,
    'retention_success_rate': RETENTION_SUCCESS_RATE,
    'business_parameters': {
        'CLV': CLV,
        'retention_cost': RETENTION_COST,
        'churn_cost': CHURN_COST
    },
    'optimal_metrics': optimal_metrics,
    'default_metrics': default_metrics
}

import json


def convert_types(obj):
    """Recursively convert NumPy scalars to native Python types for JSON serialization."""
    if isinstance(obj, dict):
        return {k: convert_types(v) for k, v in obj.items()}
    if isinstance(obj, np.integer):   # covers np.int32, np.int64, and other integer widths
        return int(obj)
    if isinstance(obj, np.floating):  # covers np.float32, np.float64
        return float(obj)
    return obj


with open('../outputs/reports/optimal_threshold_config.json', 'w') as f:
    json.dump(convert_types(optimal_config), f, indent=2)

print("Saved: outputs/reports/optimal_threshold_config.json")
print("\nAll results saved successfully!")
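
As an alternative to converting the dictionary before writing it, `json.dump` and `json.dumps` accept a `default` callable that is invoked only for objects the encoder cannot serialize on its own. The sketch below assumes a hypothetical `numpy_default` helper and a made-up `demo` dictionary. It is one possible approach, not the method used in this notebook.

```python
import json

import numpy as np


def numpy_default(obj):
    """Fallback for the json encoder: handle NumPy scalars and arrays (hypothetical helper)."""
    if isinstance(obj, np.integer):
        return int(obj)
    if isinstance(obj, np.floating):
        return float(obj)
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    # Anything else is a genuine error, so raise as the json module expects
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Made-up values for illustration only
demo = {
    "true_positives": np.int64(3),
    "recall": np.float32(0.5),
    "thresholds": np.array([0.25, 0.5]),
}

encoded = json.dumps(demo, default=numpy_default)
decoded = json.loads(encoded)
print(decoded)
```

The hook leaves nested structures to the encoder, so no recursive walk is needed, and unsupported types still fail loudly with a `TypeError`.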

## Summary

This notebook demonstrated:

1. **Business-Driven Optimization**: Optimized the classification threshold based on business costs (retention cost, churn cost, and retention success rate) rather than statistical metrics alone.

2. **Net Savings Maximization**: Found the threshold that maximizes net savings and reported the ROI at that threshold, accounting for intervention costs and retention success rates.

3. **Sensitivity Analysis**: Checked how the optimal threshold shifts as the assumed retention success rate varies from 50% to 90%.

4. **Actionable Recommendations**: Provided clear implementation guidance with expected business impact.

**Key Takeaway:** The default threshold of 0.5 is rarely optimal for business applications. By incorporating domain knowledge about costs and benefits, we can significantly improve the value delivered by the model.

---

**Next Steps:**
- Implement optimal threshold in production dashboard
- Run A/B test to validate in real-world conditions (see [A_B_TEST_PLAN.md](../A_B_TEST_PLAN.md))
- Monitor actual retention success rate and re-optimize quarterly
- Consider cost-sensitive learning algorithms that incorporate business costs during training