# Financial Risk Assessment: Credit Scoring Calibration

This example demonstrates rank-preserving calibration for financial risk assessment using simulated credit data. We'll show how credit scoring models need calibration when deployed across different market segments with varying risk profiles.

## Financial Motivation

Credit scoring models face several calibration challenges:
- **Market segment shifts**: Models trained on one population may be poorly calibrated for others
- **Economic conditions**: Default rates vary with economic cycles
- **Portfolio composition**: Different lending strategies lead to different risk distributions
- **Regulatory requirements**: Basel III capital rules and CECL loss accounting both depend on well-calibrated probability-of-default estimates

Rank-preserving calibration maintains credit ranking while adjusting absolute probabilities to match target portfolios.
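
One way to state the problem precisely is sketched below. The notation ($P$ for the original $N \times J$ probability matrix, $Q$ for the calibrated one, $M$ for the target column sums) is introduced here for illustration. The formulation assumes that "rank-preserving" means each class column of $Q$ keeps the borrower ordering of the same column of $P$; treat that as an assumption to confirm against the package documentation rather than a description of its internals.

```latex
\begin{aligned}
\min_{Q \in \mathbb{R}^{N \times J}} \quad & \lVert Q - P \rVert_F^2 \\
\text{s.t.} \quad & Q_{ij} \ge 0, \qquad \sum_{j=1}^{J} Q_{ij} = 1 \quad \text{for every borrower } i, \\
& \sum_{i=1}^{N} Q_{ij} = M_j \quad \text{for every class } j, \\
& P_{ij} \le P_{kj} \;\Rightarrow\; Q_{ij} \le Q_{kj} \quad \text{for all } i, k, j.
\end{aligned}
```

Because every row sums to 1, the targets are only feasible when $\sum_j M_j = N$.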

In [None]:
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

warnings.filterwarnings('ignore')

# Import the calibration routine
from rank_preserving_calibration import calibrate_dykstra

# Set style
plt.style.use('seaborn-v0_8')
sns.set_palette(["#2ecc71", "#e74c3c", "#f39c12"])  # Green, Red, Orange
np.random.seed(42)

print("Libraries loaded successfully!")

## Synthetic Credit Dataset Creation

We'll create a realistic credit dataset with multiple risk segments to simulate real-world credit portfolio management scenarios.

In [None]:
print("🏦 CREATING SYNTHETIC CREDIT DATASET")
print("="*50)

# Create realistic credit features
n_samples = 3000
n_features = 15

# Generate base dataset with class imbalance (most loans don't default)
X_base, y_base = make_classification(
    n_samples=n_samples,
    n_features=n_features,
    n_classes=3,  # Good, Moderate Risk, High Risk
    n_clusters_per_class=2,
    class_sep=1.2,
    weights=[0.7, 0.2, 0.1],  # Most customers are good credit
    random_state=42
)

# Create feature names
feature_names = [
    'credit_score', 'income', 'debt_to_income', 'employment_years',
    'loan_amount', 'payment_history', 'utilization_rate', 'inquiries',
    'accounts_open', 'delinquencies', 'public_records', 'age',
    'education_level', 'housing_status', 'geographic_risk'
]

# Create DataFrame
df = pd.DataFrame(X_base, columns=feature_names)
df['risk_class'] = y_base

# Add realistic transformations to make it look like credit data
df['credit_score'] = (df['credit_score'] * 100 + 650).clip(300, 850).astype(int)
df['income'] = np.exp(df['income'] * 0.5 + 10).clip(20000, 200000).astype(int)
df['debt_to_income'] = (df['debt_to_income'] * 0.2 + 0.3).clip(0, 1)
df['employment_years'] = (df['employment_years'] * 3 + 5).clip(0, 40).astype(int)
df['loan_amount'] = (df['loan_amount'] * 50000 + 25000).clip(1000, 500000).astype(int)

print(f"Dataset created with {len(df)} samples and {len(feature_names)} features")
print("Risk class distribution:")
risk_labels = ['Good Credit (0)', 'Moderate Risk (1)', 'High Risk (2)']
for i, label in enumerate(risk_labels):
    count = np.sum(y_base == i)
    pct = count / len(y_base) * 100
    print(f"  {label}: {count} ({pct:.1f}%)")

# Show sample of the data
print("\nSample of generated credit data:")
display_cols = ['credit_score', 'income', 'debt_to_income', 'loan_amount', 'risk_class']
print(df[display_cols].head(10))

## Model Training and Initial Performance

In [None]:
# Prepare features and target
X = df[feature_names].values
y = df['risk_class'].values

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("🎯 TRAINING CREDIT RISK MODEL")
print("="*40)

# Train gradient boosting model (common in finance)
model = GradientBoostingClassifier(
    n_estimators=100,
    max_depth=6,
    learning_rate=0.1,
    random_state=42
)
model.fit(X_train_scaled, y_train)

# Get predictions and probabilities
y_pred = model.predict(X_test_scaled)
y_proba = model.predict_proba(X_test_scaled)

# Calculate performance metrics
accuracy = accuracy_score(y_test, y_pred)

# Multi-class AUC (one-vs-rest)
auc_scores = []
for i in range(3):
    y_binary = (y_test == i).astype(int)
    if len(np.unique(y_binary)) > 1:  # Only calculate if both classes exist
        auc = roc_auc_score(y_binary, y_proba[:, i])
        auc_scores.append(auc)
        print(f"AUC for class {i} ({risk_labels[i]}): {auc:.3f}")

mean_auc = np.mean(auc_scores) if auc_scores else 0
print("\nOverall model performance:")
print(f"  Accuracy: {accuracy:.3f}")
print(f"  Mean AUC: {mean_auc:.3f}")
print(f"  Test samples: {len(y_test)}")

# Show current probability distributions
current_marginals = np.mean(y_proba, axis=0)
print("\nCurrent model probability marginals:")
for i, (label, marginal) in enumerate(zip(risk_labels, current_marginals, strict=False)):
    print(f"  {label}: {marginal:.3f} ({marginal*100:.1f}%)")

## Economic Scenario and Target Portfolio

We'll simulate a scenario where economic conditions change, requiring the model to be recalibrated for a different risk environment.

In [None]:
print("📈 ECONOMIC SCENARIO ANALYSIS")
print("="*45)

# Simulate economic downturn scenario with increased defaults
# Based on historical credit loss data during recessions
recession_risk_distribution = np.array([
    0.55,   # Good Credit: Reduced from 70% to 55%
    0.30,   # Moderate Risk: Increased from 20% to 30%
    0.15    # High Risk: Increased from 10% to 15%
])

print("📊 SCENARIO: Economic Downturn Portfolio Rebalancing")
print("Target risk distribution reflects:")
print("  • Increased unemployment affecting credit quality")
print("  • Business cycle impact on default rates")
print("  • Regulatory stress testing requirements")
print("  • Portfolio-specific risk appetite changes")

print("\nTarget portfolio composition:")
for i, (label, target_pct) in enumerate(zip(risk_labels, recession_risk_distribution, strict=False)):
    current_pct = current_marginals[i]
    change = target_pct - current_pct
    direction = "↑" if change > 0 else "↓" if change < 0 else "→"
    print(f"  {label}: {target_pct:.1%} (change: {change:+.1%} {direction})")

# Business justification
print("\n💼 BUSINESS JUSTIFICATION:")
justifications = [
    "CECL accounting requires forward-looking loss estimates",
    "Basel III stress testing mandates adverse scenario modeling",
    "Economic indicators suggest increased default risk",
    "Portfolio rebalancing to maintain target risk-adjusted returns",
    "Regulatory examiner expectations for downturn preparedness"
]

for justification in justifications:
    print(f"   • {justification}")

# Calculate target marginals for calibration
n_test_samples = len(y_test)
target_marginals = recession_risk_distribution * n_test_samples

print("\nCalibration parameters:")
print(f"  Test samples: {n_test_samples}")
print(f"  Target marginals: {target_marginals}")
print(f"  Sum verification: {np.sum(target_marginals):.1f} (should equal {n_test_samples})")
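
The scaling step above turns a target class mix into column-sum targets. The sketch below isolates that step and adds basic sanity checks. `proportions_to_marginals` is a hypothetical helper written for this example and is not part of `rank_preserving_calibration`. The 900 rows correspond to the 30% test split of 3,000 samples.

```python
import numpy as np


def proportions_to_marginals(proportions, n_samples, atol=1e-9):
    """Scale a target class mix to column-sum targets for n_samples rows."""
    proportions = np.asarray(proportions, dtype=float)
    if np.any(proportions < 0):
        raise ValueError("Target proportions must be non-negative.")
    if not np.isclose(proportions.sum(), 1.0, atol=atol):
        raise ValueError("Target proportions must sum to 1.")
    # Each row of a probability matrix sums to 1, so the column sums
    # must add up to the number of rows for the targets to be feasible.
    return proportions * n_samples


demo_marginals = proportions_to_marginals([0.55, 0.30, 0.15], n_samples=900)
print(demo_marginals)
```

Validating the mix before calibration gives a clearer error than a solver that fails to converge on infeasible targets.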

## Rank-Preserving Calibration

In [None]:
print("⚖️ APPLYING RANK-PRESERVING CALIBRATION")
print("="*50)

# Apply calibration
result = calibrate_dykstra(
    P=y_proba,
    M=target_marginals,
    max_iters=2000,
    tol=1e-7,
    verbose=True
)

y_proba_calibrated = result.Q
print("\n✅ Calibration finished")
print(f"   Converged: {result.converged}")
print(f"   Iterations: {result.iterations}")
print(f"   Final objective: {result.objective:.2e}")

# Verify calibration accuracy
calibrated_marginals = np.sum(y_proba_calibrated, axis=0)
print("\n🎯 CALIBRATION VERIFICATION:")
print("Target vs Achieved marginals:")
for i, label in enumerate(risk_labels):
    target = target_marginals[i]
    achieved = calibrated_marginals[i]
    error = abs(achieved - target)
    print(f"  {label}: {target:.1f} → {achieved:.1f} (error: {error:.2e})")

max_marginal_error = np.max(np.abs(calibrated_marginals - target_marginals))
print(f"\nMaximum marginal constraint violation: {max_marginal_error:.2e}")

# Check probability validity
row_sums = np.sum(y_proba_calibrated, axis=1)
print("\n🔍 PROBABILITY VALIDITY CHECK:")
print(f"   Row sums range: [{np.min(row_sums):.6f}, {np.max(row_sums):.6f}]")
print(f"   Max deviation from 1.0: {np.max(np.abs(row_sums - 1.0)):.2e}")
print(f"   All probabilities non-negative: {np.all(y_proba_calibrated >= 0)}")

## Financial Impact Analysis

In [None]:
# Analyze ranking preservation
from scipy.stats import spearmanr

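print("📊 RANKING PRESERVATION ANALYSIS")
print("="*45)

# Spearman correlation of the class ordering within each sample (row),
# before vs. after calibration; with only three classes this is a coarse measure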
spearman_correlations = []
for i in range(len(y_test)):
    corr, _ = spearmanr(y_proba[i], y_proba_calibrated[i])
    if not np.isnan(corr):  # Handle edge cases
        spearman_correlations.append(corr)

spearman_correlations = np.array(spearman_correlations)
perfect_preservation = np.sum(np.isclose(spearman_correlations, 1.0, atol=1e-10))

print("RANK PRESERVATION METRICS:")
print(f"  Perfect rank preservation: {perfect_preservation}/{len(spearman_correlations)}")
print(f"  Mean Spearman correlation: {np.mean(spearman_correlations):.6f}")
print(f"  Min Spearman correlation: {np.min(spearman_correlations):.6f}")

# Prediction stability analysis
original_predictions = np.argmax(y_proba, axis=1)
calibrated_predictions = np.argmax(y_proba_calibrated, axis=1)
prediction_changes = np.sum(original_predictions != calibrated_predictions)

print("\nPREDICTION STABILITY:")
print(f"  Prediction changes: {prediction_changes}/{len(y_test)}")
print(f"  Stability rate: {(1 - prediction_changes/len(y_test))*100:.1f}%")

# Show examples of changed predictions
if prediction_changes > 0:
    changed_indices = np.where(original_predictions != calibrated_predictions)[0]
    print("  Examples of prediction changes:")
    for idx in changed_indices[:3]:  # Show first 3
        orig_risk = risk_labels[original_predictions[idx]]
        calib_risk = risk_labels[calibrated_predictions[idx]]
        true_risk = risk_labels[y_test[idx]]
        print(f"    Sample {idx}: {orig_risk} → {calib_risk} (true: {true_risk})")

# Performance comparison
original_accuracy = accuracy_score(y_test, original_predictions)
calibrated_accuracy = accuracy_score(y_test, calibrated_predictions)

print("\nPERFORMANCE COMPARISON:")
print(f"  Original accuracy: {original_accuracy:.4f}")
print(f"  Calibrated accuracy: {calibrated_accuracy:.4f}")
print(f"  Accuracy change: {calibrated_accuracy - original_accuracy:+.4f}")
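
The Spearman loop above compares class orderings within each row. If the calibration's guarantee concerns the ordering of borrowers within each class column instead (an assumption about the method, worth confirming against the package documentation), a column-wise check is the more relevant one. A minimal NumPy sketch follows; `count_column_rank_violations` is a hypothetical helper, not a library function.

```python
import numpy as np


def count_column_rank_violations(P, Q, tol=1e-9):
    """Count, per class, adjacent pairs whose order flips after calibration.

    Rows are sorted by the original score in each column. A violation is a
    step where the calibrated score decreases by more than tol while moving
    up the original ranking. Ties in P are ordered by row index, so tied
    borrowers with different calibrated scores may be flagged.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    violations = []
    for j in range(P.shape[1]):
        order = np.argsort(P[:, j], kind="stable")
        steps = np.diff(Q[order, j])
        violations.append(int(np.sum(steps < -tol)))
    return violations


demo_P = np.array([[0.7, 0.2, 0.1],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
demo_Q_ok = np.array([[0.6, 0.3, 0.1],
                      [0.4, 0.4, 0.2],
                      [0.1, 0.4, 0.5]])
demo_Q_bad = demo_Q_ok[[1, 0, 2]]  # swap the first two borrowers

print(count_column_rank_violations(demo_P, demo_Q_ok))
print(count_column_rank_violations(demo_P, demo_Q_bad))
```

In the notebook, the same helper could be called as `count_column_rank_violations(y_proba, y_proba_calibrated)`; a result of all zeros would indicate that no adjacent pair changed order in any class.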

In [None]:
# Create comprehensive visualization
fig, axes = plt.subplots(2, 3, figsize=(18, 12))

# 1. Risk distribution comparison
x_pos = np.arange(3)
width = 0.25

current_dist = current_marginals
target_dist = recession_risk_distribution
achieved_dist = calibrated_marginals / n_test_samples

axes[0, 0].bar(x_pos - width, current_dist, width, label='Original Model', alpha=0.8, color='#3498db')
axes[0, 0].bar(x_pos, target_dist, width, label='Target (Recession)', alpha=0.8, color='#e74c3c')
axes[0, 0].bar(x_pos + width, achieved_dist, width, label='Calibrated', alpha=0.8, color='#2ecc71')

axes[0, 0].set_xlabel('Risk Class')
axes[0, 0].set_ylabel('Probability Mass')
axes[0, 0].set_title('Risk Distribution Comparison')
axes[0, 0].legend()
axes[0, 0].set_xticks(x_pos)
axes[0, 0].set_xticklabels(['Good', 'Moderate', 'High'])

# 2. Calibration accuracy
calibration_errors = np.abs(achieved_dist - target_dist)
colors = ['#2ecc71', '#f39c12', '#e74c3c']

bars = axes[0, 1].bar(x_pos, calibration_errors, color=colors, alpha=0.7)
axes[0, 1].set_xlabel('Risk Class')
axes[0, 1].set_ylabel('Absolute Error')
axes[0, 1].set_title('Calibration Accuracy by Risk Class')
axes[0, 1].set_xticks(x_pos)
axes[0, 1].set_xticklabels(['Good', 'Moderate', 'High'])
axes[0, 1].set_yscale('log')

# 3. Probability changes distribution
prob_changes = y_proba_calibrated - y_proba
axes[0, 2].hist(prob_changes.flatten(), bins=50, alpha=0.7, density=True, color='#9b59b6')
axes[0, 2].axvline(0, color='black', linestyle='--', alpha=0.7)
axes[0, 2].set_xlabel('Probability Change')
axes[0, 2].set_ylabel('Density')
axes[0, 2].set_title('Distribution of Probability Changes')

# 4. Risk score distribution (max probability)
max_probs_original = np.max(y_proba, axis=1)
max_probs_calibrated = np.max(y_proba_calibrated, axis=1)

axes[1, 0].hist(max_probs_original, bins=30, alpha=0.7, label='Original', density=True, color='#3498db')
axes[1, 0].hist(max_probs_calibrated, bins=30, alpha=0.7, label='Calibrated', density=True, color='#e74c3c')
axes[1, 0].set_xlabel('Maximum Probability (Confidence)')
axes[1, 0].set_ylabel('Density')
axes[1, 0].set_title('Confidence Score Distribution')
axes[1, 0].legend()

# 5. Class-wise probability changes
for i, (risk_class, color) in enumerate(zip(['Good', 'Moderate', 'High'], colors, strict=False)):
    changes = prob_changes[:, i]
    axes[1, 1].hist(changes, bins=30, alpha=0.6, label=f'{risk_class} Credit',
                   color=color, density=True)

axes[1, 1].axvline(0, color='black', linestyle='--', alpha=0.7)
axes[1, 1].set_xlabel('Probability Change')
axes[1, 1].set_ylabel('Density')
axes[1, 1].set_title('Changes by Risk Class')
axes[1, 1].legend()

# 6. Confusion matrix comparison
from sklearn.metrics import confusion_matrix

cm_original = confusion_matrix(y_test, original_predictions)
cm_calibrated = confusion_matrix(y_test, calibrated_predictions)

# Show difference matrix
cm_diff = cm_calibrated - cm_original
im = axes[1, 2].imshow(cm_diff, interpolation='nearest', cmap='RdBu',
                      vmin=-np.max(np.abs(cm_diff)), vmax=np.max(np.abs(cm_diff)))
axes[1, 2].set_title('Prediction Changes\n(Calibrated - Original)')
axes[1, 2].set_xlabel('Predicted Risk Class')
axes[1, 2].set_ylabel('True Risk Class')
axes[1, 2].set_xticks([0, 1, 2])
axes[1, 2].set_xticklabels(['Good', 'Mod', 'High'])
axes[1, 2].set_yticks([0, 1, 2])
axes[1, 2].set_yticklabels(['Good', 'Mod', 'High'])

# Add colorbar
plt.colorbar(im, ax=axes[1, 2], shrink=0.6)

plt.tight_layout()
plt.show()

## Financial Business Impact

In [None]:
# Comprehensive business impact analysis
print("💰 FINANCIAL BUSINESS IMPACT ANALYSIS")
print("="*50)

# Portfolio parameters
portfolio_size = 100_000  # Number of loans
avg_loan_amount = 75_000  # Average loan size
total_portfolio_value = portfolio_size * avg_loan_amount

# Financial parameters
interest_margin = 0.035  # 3.5% net interest margin
loss_given_default = 0.45  # 45% loss rate on defaults
regulatory_capital_ratio = 0.08  # 8% risk-weighted capital requirement

print("📊 PORTFOLIO CHARACTERISTICS:")
print(f"   • Portfolio size: {portfolio_size:,} loans")
print(f"   • Average loan amount: ${avg_loan_amount:,}")
print(f"   • Total portfolio value: ${total_portfolio_value:,.0f}")
print(f"   • Net interest margin: {interest_margin:.1%}")
print(f"   • Loss given default: {loss_given_default:.1%}")

# Expected loss calculations
# Probability of default by risk class (simplified)
pd_by_class = np.array([0.02, 0.08, 0.25])  # 2%, 8%, 25% annual default rates

# Original portfolio expected losses
original_portfolio_dist = current_marginals
original_expected_pd = np.sum(original_portfolio_dist * pd_by_class)
original_expected_loss = portfolio_size * avg_loan_amount * original_expected_pd * loss_given_default

# Calibrated portfolio expected losses
calibrated_portfolio_dist = achieved_dist
calibrated_expected_pd = np.sum(calibrated_portfolio_dist * pd_by_class)
calibrated_expected_loss = portfolio_size * avg_loan_amount * calibrated_expected_pd * loss_given_default

print("\n📈 EXPECTED LOSS ANALYSIS:")
print(f"   • Original portfolio PD: {original_expected_pd:.2%}")
print(f"   • Calibrated portfolio PD: {calibrated_expected_pd:.2%}")
print(f"   • PD increase: {calibrated_expected_pd - original_expected_pd:+.2%}")
print(f"   • Original expected loss: ${original_expected_loss:,.0f}")
print(f"   • Calibrated expected loss: ${calibrated_expected_loss:,.0f}")
print(f"   • Additional loss provision: ${calibrated_expected_loss - original_expected_loss:+,.0f}")

# Regulatory capital impact
risk_weighted_assets_original = portfolio_size * avg_loan_amount * 0.75  # Typical RWA multiplier
capital_requirement_original = risk_weighted_assets_original * regulatory_capital_ratio

# Higher risk portfolio requires more capital
risk_weight_increase = (calibrated_expected_pd / original_expected_pd) ** 0.5  # Simplified risk weight adjustment
risk_weighted_assets_calibrated = risk_weighted_assets_original * risk_weight_increase
capital_requirement_calibrated = risk_weighted_assets_calibrated * regulatory_capital_ratio

additional_capital = capital_requirement_calibrated - capital_requirement_original
cost_of_capital = 0.12  # 12% cost of equity capital
annual_capital_cost = additional_capital * cost_of_capital

print("\n🏦 REGULATORY CAPITAL IMPACT:")
print(f"   • Original capital requirement: ${capital_requirement_original:,.0f}")
print(f"   • Calibrated capital requirement: ${capital_requirement_calibrated:,.0f}")
print(f"   • Additional capital needed: ${additional_capital:+,.0f}")
print(f"   • Annual cost of additional capital: ${annual_capital_cost:,.0f}")

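# CECL accounting impact
# CECL reserves cover lifetime expected losses; the 1.5x factor below is an
# illustrative assumption for scaling the annual estimate, not a regulatory value
cecl_multiplier = 1.5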
cecl_provision_original = original_expected_loss * cecl_multiplier
cecl_provision_calibrated = calibrated_expected_loss * cecl_multiplier
cecl_impact = cecl_provision_calibrated - cecl_provision_original

print("\n📋 CECL ACCOUNTING IMPACT:")
print(f"   • Original CECL provision: ${cecl_provision_original:,.0f}")
print(f"   • Calibrated CECL provision: ${cecl_provision_calibrated:,.0f}")
print(f"   • Additional CECL provision: ${cecl_impact:+,.0f}")

# Total financial impact (first-year view: adds the one-time provision
# increase to the recurring annual cost of the additional capital)
total_annual_impact = annual_capital_cost + cecl_impact
impact_as_percent_of_portfolio = total_annual_impact / total_portfolio_value * 100

print("\n💰 TOTAL FINANCIAL IMPACT:")
print(f"   • Total annual financial impact: ${total_annual_impact:+,.0f}")
print(f"   • Impact as % of portfolio: {impact_as_percent_of_portfolio:+.3f}%")
print(f"   • Impact per loan: ${total_annual_impact / portfolio_size:+,.0f}")

# Risk management benefits
print("\n✅ RISK MANAGEMENT BENEFITS:")
benefits = [
    "Accurate forward-looking loss estimates for CECL compliance",
    "Better alignment with economic cycle expectations",
    "Improved regulatory examination outcomes",
    "Enhanced stress testing capabilities",
    "More accurate pricing of credit risk",
    "Better portfolio management and diversification insights"
]

for benefit in benefits:
    print(f"   • {benefit}")

print("\n📊 KEY PERFORMANCE INDICATORS:")
print(f"   • Rank preservation: {np.mean(spearman_correlations):.6f} correlation")
print(f"   • Constraint satisfaction: {max_marginal_error:.2e} max error")
print(f"   • Convergence: {result.iterations} iterations")
print(f"   • Prediction stability: {(1-prediction_changes/len(y_test))*100:.1f}%")

print("\n🚀 IMPLEMENTATION ROADMAP:")
roadmap = [
    "Validate calibrated model against recent economic data",
    "Implement in CECL calculation engine",
    "Update risk-based pricing models",
    "Integrate with regulatory reporting systems",
    "Establish quarterly recalibration process",
    "Train risk management team on new methodology"
]

for i, step in enumerate(roadmap, 1):
    print(f"   {i}. {step}")
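
The expected-loss figures in the cell above follow the usual decomposition of expected loss into exposure, probability of default, and loss given default. The sketch below isolates that arithmetic. It uses the nominal class mixes from this example (70/20/10 and 55/30/15) rather than the model's actual marginals, so its numbers will differ slightly from the notebook output. `portfolio_expected_loss` is a hypothetical helper.

```python
def portfolio_expected_loss(class_mix, pd_by_class, exposure, lgd):
    """Return (blended PD, expected loss), with loss = exposure x PD x LGD."""
    blended_pd = sum(w * p for w, p in zip(class_mix, pd_by_class))
    return blended_pd, exposure * blended_pd * lgd


demo_pd_by_class = [0.02, 0.08, 0.25]  # per-class annual default rates from this example
demo_exposure = 100_000 * 75_000       # number of loans x average loan amount
demo_lgd = 0.45                        # loss given default

base_pd, base_el = portfolio_expected_loss(
    [0.70, 0.20, 0.10], demo_pd_by_class, demo_exposure, demo_lgd
)
stress_pd, stress_el = portfolio_expected_loss(
    [0.55, 0.30, 0.15], demo_pd_by_class, demo_exposure, demo_lgd
)

print(f"Baseline PD {base_pd:.2%}, expected loss ${base_el:,.0f}")
print(f"Stressed PD {stress_pd:.2%}, expected loss ${stress_el:,.0f}")
print(f"Additional provision ${stress_el - base_el:,.0f}")
```

With these inputs the blended PD rises from 5.50% to 7.25%, which corresponds to roughly $59.1 million of additional expected loss on the $7.5 billion portfolio.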

## Next Steps

This example demonstrated rank-preserving calibration for financial risk assessment. The methodology applies broadly across financial services:

- **CECL accounting**: Forward-looking credit loss estimation
- **Stress testing**: Economic scenario modeling for regulatory compliance
- **Portfolio management**: Risk-adjusted pricing and capital allocation
- **Insurance**: Catastrophe modeling and reserve adequacy
- **Trading**: Market risk calibration across different volatility regimes

Key advantages of rank-preserving calibration in finance:
1. Maintains credit ranking while adjusting absolute probabilities
2. Ensures mathematical consistency with portfolio constraints
3. Produces probability estimates consistent with a stated scenario, which supports regulatory reporting
4. Enables accurate capital requirement calculations

For more examples in different domains:
- Medical diagnosis with population health shifts
- Text classification with domain adaptation
- Computer vision with deployment environment changes
- Survey reweighting for demographic representation