# Lending Club Credit Risk Model Development
## Purpose
This notebook develops a credit risk model on Lending Club loan data. The model estimates each loan's probability of default to support credit decisions.

## Table of Contents
1. Data Loading and Exploration
   - Lending Club Data Overview
   - Feature Analysis
   - Missing Value Assessment
2. Preprocessing Steps
   - Credit Score Normalization
   - Income Verification
   - Debt-to-Income Ratio Calculation
   - Feature Engineering
3. Model Development
   - Model Selection (Gradient Boosting)
   - Hyperparameter Tuning
   - Cross-Validation
4. Performance Evaluation
   - ROC-AUC Analysis
   - Precision-Recall Curves
   - Population Stability Index
5. Model Validation
   - Out-of-Time Validation
   - Demographic Analysis
   - Model Stability Tests

In [None]:
# Required imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, precision_recall_curve

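# Seed NumPy's global RNG for reproducibility. scikit-learn falls back on this
# global state only when random_state is left unset, so also pass RANDOM_STATE
# explicitly to splitters and estimators.
RANDOM_STATE = 42
np.random.seed(RANDOM_STATE)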

## 1. Data Loading and Exploration

The Lending Club dataset contains historical loan records. Its features include:
- Loan amount
- Interest rate
- Annual income
- Debt-to-income ratio
- Credit score
- Employment length
- Home ownership
- Loan purpose

We'll focus on these key features for our credit risk assessment model.
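
As a minimal sketch of how the feature selection and the missing value assessment from the outline might look, the cell below keeps only the key features and reports missingness per column. The column names (`loan_amnt`, `int_rate`, `annual_inc`, `dti`, `fico_range_low`, `emp_length`, `home_ownership`, `purpose`) are assumptions based on the commonly distributed Lending Club export and may differ in your file. The helpers `select_key_features` and `missing_value_summary` are hypothetical names introduced here, and the demo frame is synthetic, not real loan data.

```python
import numpy as np
import pandas as pd

# Assumed column names for the eight key features; adjust to match your file.
KEY_FEATURES = [
    "loan_amnt",       # loan amount
    "int_rate",        # interest rate
    "annual_inc",      # annual income
    "dti",             # debt-to-income ratio
    "fico_range_low",  # credit score (assumed: lower bound of the FICO range)
    "emp_length",      # employment length
    "home_ownership",  # home ownership
    "purpose",         # loan purpose
]


def select_key_features(df: pd.DataFrame, columns=KEY_FEATURES) -> pd.DataFrame:
    """Return a copy with only the key feature columns; raise if any are absent."""
    absent = [c for c in columns if c not in df.columns]
    if absent:
        raise KeyError(f"Columns not found in data: {absent}")
    return df[list(columns)].copy()


def missing_value_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, highest percentage first."""
    summary = pd.DataFrame({
        "n_missing": df.isna().sum(),
        "pct_missing": df.isna().mean() * 100,
    })
    return summary.sort_values("pct_missing", ascending=False)


# Synthetic rows for illustration only. With a real file you would instead use
# something like: demo = pd.read_csv("path/to/lending_club.csv")  # hypothetical path
demo = pd.DataFrame({
    "loan_amnt": [10000, 5000, 20000, 15000],
    "int_rate": [11.5, 7.9, np.nan, 14.2],
    "annual_inc": [55000, 42000, 98000, np.nan],
    "dti": [18.2, 9.5, 22.1, 30.4],
    "fico_range_low": [690, 720, 665, 700],
    "emp_length": ["10+ years", None, "3 years", None],
    "home_ownership": ["RENT", "OWN", "MORTGAGE", "RENT"],
    "purpose": ["debt_consolidation", "car", "credit_card", "other"],
    "unused_column": [1, 2, 3, 4],
})

features = select_key_features(demo)
summary = missing_value_summary(features)
print(summary)
```

Raising on absent columns, rather than silently dropping them, surfaces a schema mismatch before any preprocessing runs.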