# Credit Risk Prediction Model - Demonstration

This notebook walks through the credit risk prediction framework end to end: creating a sample dataset, preprocessing it, training several models, evaluating them, and estimating business impact.

## 1. Import Required Libraries

In [None]:
import sys
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Add src to the import path (relative path; assumes the notebook runs from its own directory)
sys.path.append('../src')

from credit_risk_model import DataPreprocessor, FeatureEngineer, ModelTrainer, ModelEvaluator

## 2. Data Creation and Preprocessing

In [None]:
# Initialize data preprocessor
preprocessor = DataPreprocessor(random_state=42)

# Create sample dataset
X, y = preprocessor.create_sample_dataset(n_samples=1000, n_features=8)

print(f"Dataset shape: {X.shape}")
print(f"Target distribution:\n{y.value_counts()}")
print(f"Features: {list(X.columns)}")
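
The internals of `create_sample_dataset` are not shown in this notebook. As a rough illustration of what a generator like this might do, the sketch below draws Gaussian features and labels the highest-scoring rows as defaults. The function name `make_synthetic_credit_data`, the `default_rate` parameter, and the `feature_i` column names are hypothetical and not part of the framework.

```python
import numpy as np
import pandas as pd


def make_synthetic_credit_data(n_samples=1000, n_features=8, default_rate=0.2, seed=42):
    """Hypothetical stand-in for a sample-dataset generator (not the framework's code)."""
    rng = np.random.default_rng(seed)
    # Standard-normal features with generic column names.
    X = pd.DataFrame(
        rng.normal(size=(n_samples, n_features)),
        columns=[f"feature_{i}" for i in range(n_features)],
    )
    # Linear risk score with random weights, plus noise so the classes overlap.
    weights = rng.normal(size=n_features)
    score = X.to_numpy() @ weights + rng.normal(scale=0.5, size=n_samples)
    # Label the top `default_rate` share of scores as defaults (1), the rest as 0.
    cutoff = np.quantile(score, 1.0 - default_rate)
    y = pd.Series((score > cutoff).astype(int), name="default")
    return X, y


X_demo, y_demo = make_synthetic_credit_data()
print(X_demo.shape)
print(y_demo.value_counts())
```

Fixing the class ratio through a quantile cutoff keeps the imbalance predictable, which makes later steps such as resampling easier to reason about.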

In [None]:
# Display sample data
X.head()

In [None]:
# Preprocess the data
X_train, X_test, y_train, y_test = preprocessor.preprocess_pipeline(X, y, test_size=0.3)

print(f"Training set shape: {X_train.shape}")
print(f"Test set shape: {X_test.shape}")
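
What `preprocess_pipeline` does internally is not shown here. A common pattern for this step is a stratified train/test split followed by standardization fitted on the training rows only. The sketch below illustrates that pattern with NumPy; the function name `stratified_split_and_scale` is hypothetical, and whether the framework stratifies or scales this way is an assumption.

```python
import numpy as np


def stratified_split_and_scale(X, y, test_size=0.3, seed=42):
    """Hypothetical sketch: stratified split, then z-score scaling fitted on train only."""
    rng = np.random.default_rng(seed)
    test_mask = np.zeros(len(y), dtype=bool)
    # Sample the test rows class by class so both splits keep the class ratio.
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        n_test = int(round(test_size * len(idx)))
        test_mask[idx[:n_test]] = True
    X_train, X_test = X[~test_mask], X[test_mask]
    # Fit scaling statistics on the training split only to avoid leakage.
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (X_train - mean) / std, (X_test - mean) / std, y[~test_mask], y[test_mask]


rng = np.random.default_rng(0)
X_raw = rng.normal(loc=5.0, scale=2.0, size=(1000, 8))
y_raw = (rng.random(1000) < 0.2).astype(int)
X_tr, X_te, y_tr, y_te = stratified_split_and_scale(X_raw, y_raw)
print(X_tr.shape, X_te.shape)
```

Applying the training mean and standard deviation to the test split means the test data never influences the scaling parameters.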

## 3. Model Training

In [None]:
# Initialize model trainer
trainer = ModelTrainer(random_state=42)

# Train multiple models
models = trainer.train_all_models(
    X_train, y_train,
    models_to_train=['logistic_regression', 'random_forest', 'xgboost'],
    use_grid_search=False,
    handle_imbalance='smote'
)

# Display training summary
summary = trainer.get_model_summary()
print("Training Summary:")
print(summary)
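
The `handle_imbalance='smote'` option suggests that the trainer oversamples the minority class before fitting. To show the idea behind SMOTE, the sketch below creates synthetic minority rows by interpolating between a minority sample and one of its nearest minority neighbors. It is a simplified NumPy illustration for binary labels, not the imbalanced-learn implementation and not necessarily what `ModelTrainer` does; the function name `smote_like_oversample` is hypothetical.

```python
import numpy as np


def smote_like_oversample(X, y, minority_label=1, k=5, seed=42):
    """Hypothetical SMOTE-style sketch for binary labels (not the framework's code)."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_needed = int((y != minority_label).sum() - len(X_min))
    if n_needed <= 0 or len(X_min) < 2:
        return X, y  # already balanced, or too few minority rows to interpolate
    k = min(k, len(X_min) - 1)
    # Pairwise squared distances among minority rows; exclude self-matches.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    neighbors = np.argsort(d2, axis=1)[:, :k]
    # Pick a random base row and one of its k nearest minority neighbors.
    base = rng.integers(0, len(X_min), size=n_needed)
    chosen = neighbors[base, rng.integers(0, k, size=n_needed)]
    # Place each synthetic row at a random point on the segment between the two.
    gap = rng.random((n_needed, 1))
    synthetic = X_min[base] + gap * (X_min[chosen] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_needed, minority_label, dtype=y.dtype)])
    return X_out, y_out


rng = np.random.default_rng(0)
X_imb = rng.normal(size=(500, 4))
y_imb = (rng.random(500) < 0.2).astype(int)
X_bal, y_bal = smote_like_oversample(X_imb, y_imb)
print(np.bincount(y_imb), np.bincount(y_bal))
```

Resampling of this kind should be applied to the training split only, so that the test set keeps the original class distribution.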

## 4. Model Evaluation

In [None]:
# Initialize evaluator
evaluator = ModelEvaluator()

# Compare models
comparison = evaluator.compare_models(models, X_test, y_test)
print("Model Comparison:")
print(comparison)
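
The columns returned by `compare_models` are not listed in this notebook. Comparisons of this kind typically report precision, recall, and ROC AUC, so the sketch below computes those three from true labels and predicted probabilities. The function name `binary_metrics` and the 0.5 threshold are assumptions for illustration.

```python
import numpy as np


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Hypothetical sketch of precision, recall, and ROC AUC for binary labels."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    y_pred = (y_prob >= threshold).astype(int)
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # ROC AUC as the share of (positive, negative) pairs ranked correctly;
    # ties count as half. This pairwise form is O(n_pos * n_neg).
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    if diff.size:
        auc = float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)
    else:
        auc = float("nan")  # undefined when one class is absent
    return {"precision": precision, "recall": recall, "roc_auc": auc}


metrics_demo = binary_metrics([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(metrics_demo)
```

ROC AUC does not depend on the threshold, while precision and recall do, which is why the two kinds of metric can rank models differently.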

## 5. Visualizations

In [None]:
# Plot model comparison
plt.figure(figsize=(15, 10))
evaluator.plot_model_comparison(models, X_test, y_test)
plt.show()

## 6. Best Model Analysis

In [None]:
# Get best model (assumes compare_models returns rows ranked best-first)
best_model_name = comparison.iloc[0]['Model']
best_model = models[best_model_name]

print(f"Best Model: {best_model_name}")

# Generate detailed report
report = evaluator.generate_evaluation_report(best_model_name)
print(report)

## 7. Feature Importance

In [None]:
# Plot feature importance if available
if hasattr(best_model, 'feature_importances_'):
    importance_df = pd.DataFrame({
        'feature': X_test.columns,
        'importance': best_model.feature_importances_
    }).sort_values('importance', ascending=False)
    
    plt.figure(figsize=(10, 6))
    plt.barh(range(len(importance_df)), importance_df['importance'])
    plt.yticks(range(len(importance_df)), importance_df['feature'])
    plt.xlabel('Feature Importance')
    plt.title(f'Feature Importance - {best_model_name}')
    plt.gca().invert_yaxis()
    plt.tight_layout()
    plt.show()
    
    print("Top 5 Features:")
    print(importance_df.head(5))

## 8. Predictions on Held-Out Test Data

In [None]:
# Make predictions on test set
y_pred = best_model.predict(X_test)
y_prob = best_model.predict_proba(X_test)[:, 1]

# Create predictions dataframe
predictions_df = pd.DataFrame({
    'actual': y_test,
    'predicted': y_pred,
    'probability': y_prob
})

print("Sample Predictions:")
print(predictions_df.head(10))

## 9. Business Impact Analysis

In [None]:
# Calculate business metrics
business_metrics = evaluator.calculate_business_metrics(y_test, y_pred, y_prob)

print("Business Impact Analysis:")
print(f"Net Profit: ${business_metrics['net_profit']:,.2f}")
print(f"Total Revenue: ${business_metrics['total_revenue']:,.2f}")
print(f"Total Losses: ${business_metrics['total_losses']:,.2f}")
print(f"Approval Rate: {business_metrics['approval_rate']:.2%}")
print(f"Profit Margin: {business_metrics['profit_margin']:.2%}")
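
The formulas behind `calculate_business_metrics` are not shown in this notebook. The sketch below shows one plausible way to derive the same five figures, assuming label 1 means default, every applicant predicted as 0 is approved, and each loan has the same size. The function name `sketch_business_metrics` and the parameters `loan_amount`, `interest_rate`, and `loss_given_default` (with their default values) are hypothetical.

```python
import numpy as np


def sketch_business_metrics(y_true, y_pred, loan_amount=10_000.0,
                            interest_rate=0.10, loss_given_default=0.60):
    """Hypothetical sketch; the framework's actual formulas may differ."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    approved = y_pred == 0  # assumption: predicted non-defaulters get the loan
    good_approved = int((approved & (y_true == 0)).sum())
    bad_approved = int((approved & (y_true == 1)).sum())
    # Revenue: interest earned on approved loans that are repaid.
    total_revenue = good_approved * loan_amount * interest_rate
    # Losses: unrecovered principal on approved loans that default.
    total_losses = bad_approved * loan_amount * loss_given_default
    net_profit = total_revenue - total_losses
    return {
        "net_profit": net_profit,
        "total_revenue": total_revenue,
        "total_losses": total_losses,
        "approval_rate": float(approved.mean()),
        "profit_margin": net_profit / total_revenue if total_revenue else 0.0,
    }


demo_metrics = sketch_business_metrics([0, 0, 0, 1, 1], [0, 0, 1, 0, 1])
print(demo_metrics)
```

Under these assumptions one approved default wipes out the interest from several good loans, which is why recall on the default class often matters more than overall accuracy.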

## Conclusion

This notebook demonstrates the complete workflow of the credit risk prediction model:

1. **Data Creation**: Generate a sample credit risk dataset
2. **Preprocessing**: Handle missing values, outliers, and feature scaling
3. **Model Training**: Train logistic regression, random forest, and XGBoost models, using SMOTE to handle class imbalance
4. **Evaluation**: Comprehensive performance evaluation with business metrics
5. **Visualization**: Generate plots for model comparison and analysis
6. **Interpretation**: Feature importance and business impact analysis

The framework serves as a starting point for credit risk modeling and can be extended with additional features and models as needed.