# German Credit Model Bias Detection in Credit Approvals

This project implemented a Basel III-compliant Internal Ratings-Based (IRB) credit risk system to assess capital requirements for a loan portfolio. The model calculates Probability of Default (PD), Risk-Weighted Assets (RWA), and regulatory capital while incorporating stress testing to evaluate resilience under adverse economic conditions.

This project  developed a comprehensive, Basel III-compliant credit risk framework that transforms raw loan data into actionable regulatory capital insights. By integrating machine learning with financial risk modeling, the system provides banks with a powerful tool for default prediction, capital adequacy assessment, and stress testing.

#### Strategic Value

This system enables banks to:

- Proactively manage risk through PD monitoring
- Optimize capital allocation per Basel requirements
- Demonstrate regulatory compliance with auditable calculations

While the framework provides a robust foundation for internal ratings-based approaches, its true value will emerge through iterative refinement using real-world portfolio data. The project demonstrates how machine learning and regulatory finance can converge to create smarter risk management systems.

In [1]:
import pandas as pd
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score
from sklearn.utils.class_weight import compute_sample_weight
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.exceptions import NotFittedError
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import (KFold, StratifiedKFold, 
                                   train_test_split, cross_val_score)
from sklearn.metrics import (roc_auc_score, brier_score_loss, 
                           precision_recall_curve, average_precision_score)
from scipy.stats import norm

## Data Loading & Preprocessing

Loads raw credit data, cleans it, and engineers financial risk features.

In [2]:
class CreditDataPreprocessor:
    """Handles all data loading and preprocessing operations"""
    
    DEFAULT_COLUMN_MAPPING = {
        'laufkont': 'Status',
        'laufzeit': 'Duration',
        'moral': 'CreditHistory',
        'verw': 'Purpose',
        'hoehe': 'Amount',
        'sparkont': 'Savings',
        'beszeit': 'EmploymentDuration',
        'rate': 'InstallmentRate',
        'famges': 'PersonalStatus',
        'buerge': 'OtherDebtors',
        'wohnzeit': 'ResidenceDuration',
        'verm': 'Property',
        'alter': 'Age',
        'weitkred': 'OtherInstallments',
        'wohn': 'Housing',
        'bishkred': 'NumCredits',
        'beruf': 'Job',
        'pers': 'Dependents',
        'telef': 'Telephone',
        'gastarb': 'ForeignWorker',
        'kredit': 'Default'
    }
    
    def __init__(self, column_mapping=None, min_bin_size=50):
        """
        Args:
            column_mapping: Dictionary for renaming columns
            min_bin_size: Minimum samples per bin for numerical features
        """
        self.column_mapping = column_mapping or self.DEFAULT_COLUMN_MAPPING
        self.min_bin_size = min_bin_size
        self.feature_stats_ = {}
        
    def load_data(self, filepath):
        """Load and validate credit data"""
        df = pd.read_csv(filepath)
        
        # Rename columns using mapping
        df = df.rename(columns={k: v for k, v in self.column_mapping.items() 
                               if k in df.columns})
        
        # Validate required columns
        required_columns = ['Duration', 'Amount', 'Age', 'Default']
        missing = [col for col in required_columns if col not in df.columns]
        if missing:
            raise ValueError(f"Missing required columns: {missing}")
            
        return df
    
    def preprocess_data(self, df):
        """Clean and transform raw data"""
        # Convert target: 1=default, 0=non-default
        if df['Default'].max() == 2:  # German credit data format
            df['Default'] = df['Default'] - 1
            
        # Handle missing values
        df = df.dropna()
        
        # Add financial ratios
        df = self._add_financial_features(df)
        
        # Store feature statistics
        self._store_feature_stats(df)
        
        return df
    
    def _add_financial_features(self, df):
        """Create financial ratios and risk indicators"""
        df = df.copy()
        
        # Liquidity ratios
        df['DebtToIncome'] = df['Amount'] / (df['Duration'] + 1e-6)
        df['InstallmentBurden'] = df['InstallmentRate'] / (df['Amount'] + 1e-6)
        
        # Stability indicators
        df['AgeSquared'] = df['Age'] ** 2
        df['LogAmount'] = np.log(df['Amount'] + 1)
        
        return df
    
    def _store_feature_stats(self, df):
        """Store descriptive statistics for features"""
        self.feature_stats_ = {
            'mean': df.mean(),
            'std': df.std(),
            'min': df.min(),
            'max': df.max()
        }

## Validation and Stress Testing

In [4]:
class CreditRiskValidator:
    """
    Extended model validation and stress testing framework that works with
    the existing CreditDataPreprocessor output.
    """
    
    def __init__(self, preprocessor):
        """
        Initialize with a preprocessor instance
        
        Args:
            preprocessor: CreditDataPreprocessor instance
        """
        self.preprocessor = preprocessor
        self.model = GradientBoostingClassifier(
            n_estimators=150,
            max_depth=3,
            min_samples_leaf=50,
            random_state=42
        )
        
    def load_and_prepare_data(self, filepath):
        """Load and preprocess data using existing preprocessor"""
        # First load the data
        self.df = self.preprocessor.load_data(filepath)  # Changed from load_and_preprocess
        
        # Then preprocess it
        self.df = self.preprocessor.preprocess_data(self.df)  # Separate preprocessing step
        
        self.X = self.df.drop(columns=['Default'])
        self.y = self.df['Default']
        return self.X, self.y

    # ... rest of your class methods remain the same ...
    
    def run_validation_suite(self, X, y, methods=['holdout', 'kfold', 'stratified']):
        """
        Execute multiple validation strategies
        
        Args:
            methods: List of validation methods to run
                    Options: 'holdout', 'kfold', 'stratified'
        
        Returns:
            dict: Validation results for each method
        """
        results = {}
        
        if 'holdout' in methods:
            X_train, X_test, y_train, y_test = train_test_split(
                X, y, test_size=0.3, random_state=42, stratify=y
            )
            self.model.fit(X_train, y_train)
            probs = self.model.predict_proba(X_test)[:, 1]
            
            results['holdout'] = {
                'auc': roc_auc_score(y_test, probs),
                'brier': brier_score_loss(y_test, probs),
                'avg_precision': average_precision_score(y_test, probs)
            }
        
        if 'kfold' in methods:
            kf = KFold(n_splits=5, shuffle=True, random_state=42)
            cv_results = cross_val_score(
                self.model, X, y, cv=kf, 
                scoring='roc_auc', n_jobs=-1
            )
            results['kfold'] = {
                'mean_auc': np.mean(cv_results),
                'std_auc': np.std(cv_results),
                'fold_scores': cv_results
            }
        
        if 'stratified' in methods:
            skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
            cv_results = cross_val_score(
                self.model, X, y, cv=skf, 
                scoring='roc_auc', n_jobs=-1
            )
            results['stratified'] = {
                'mean_auc': np.mean(cv_results),
                'std_auc': np.std(cv_results),
                'fold_scores': cv_results
            }
            
        return results
    
    def economic_scenario_stress_test(self, X, baseline_probs, scenarios):
        """
        Stress test the model under different economic conditions
        
        Args:
            X: Features DataFrame
            baseline_probs: Array of baseline PD predictions
            scenarios: Dict of scenario definitions
                Example: {
                    'recession': {
                        'pd_shock': 1.5,  # Multiply PDs by 1.5x
                        'feature_shocks': {
                            'DebtToIncome': 1.2,  # Multiply feature by 1.2x
                            'Amount': 0.9  # Multiply feature by 0.9x
                        }
                    }
                }
        
        Returns:
            dict: Stress test results for each scenario
        """
        results = {}
        
        for scenario_name, params in scenarios.items():
            # Create shocked copy of data
            X_shocked = X.copy()
            prob_shocked = baseline_probs.copy()
            
            # Apply feature shocks
            for feature, multiplier in params.get('feature_shocks', {}).items():
                if feature in X_shocked.columns:
                    X_shocked[feature] = X_shocked[feature] * multiplier
            
            # Apply PD shocks
            if 'pd_shock' in params:
                prob_shocked = np.minimum(prob_shocked * params['pd_shock'], 0.9999)
            
            # Re-predict if features changed
            if 'feature_shocks' in params:
                prob_shocked = self.model.predict_proba(X_shocked)[:, 1]
            
            # Calculate portfolio impact
            results[scenario_name] = self._calculate_stress_metrics(
                baseline_probs, prob_shocked
            )
            
        return results
    
    def _calculate_stress_metrics(self, baseline_probs, stressed_probs):
        """Calculate portfolio-level stress metrics"""
        return {
            'mean_pd_change': np.mean(stressed_probs - baseline_probs),
            'median_pd_change': np.median(stressed_probs - baseline_probs),
            'percentile_90_change': np.percentile(stressed_probs - baseline_probs, 90),
            'default_rate_baseline': np.mean(baseline_probs),
            'default_rate_stressed': np.mean(stressed_probs),
            'relative_increase': (np.mean(stressed_probs) - np.mean(baseline_probs)) / 
                               np.mean(baseline_probs)
        }
    
    def sensitivity_analysis(self, X, y, feature_of_interest, values_to_test):
        """
        Analyze how changing one feature affects PD predictions
        
        Args:
            feature_of_interest: Feature to vary
            values_to_test: List of values to test for the feature
        
        Returns:
            DataFrame: PD changes across tested values
        """
        results = []
        baseline = X.copy()
        
        for value in values_to_test:
            X_test = baseline.copy()
            X_test[feature_of_interest] = value
            probs = self.model.predict_proba(X_test)[:, 1]
            
            results.append({
                'feature_value': value,
                'mean_pd': np.mean(probs),
                'median_pd': np.median(probs),
                'default_rate': np.mean(probs > 0.5)  # Threshold at 50% PD
            })
            
        return pd.DataFrame(results)

# Example Usage
if __name__ == "__main__":
    # Initialize with existing preprocessor
    preprocessor = CreditDataPreprocessor()
    validator = CreditRiskValidator(preprocessor)
    
    # Load and prepare data
    X, y = validator.load_and_prepare_data("german_credit_data.csv")
    
    print("Running validation suite...")
    validation_results = validator.run_validation_suite(
        X, y, 
        methods=['holdout', 'kfold', 'stratified']
    )
    print("Validation Results:", validation_results)
    
    # Get baseline predictions
    baseline_probs = validator.model.predict_proba(X)[:, 1]
    
    print("\nRunning stress tests...")
    scenarios = {
        'mild_recession': {
            'pd_shock': 1.5,
            'feature_shocks': {
                'DebtToIncome': 1.2,
                'Amount': 0.95
            }
        },
        'severe_crisis': {
            'pd_shock': 2.0,
            'feature_shocks': {
                'DebtToIncome': 1.5,
                'Amount': 0.8
            }
        }
    }
    stress_results = validator.economic_scenario_stress_test(
        X, baseline_probs, scenarios
    )
    print("Stress Test Results:", stress_results)
    
    print("\nRunning sensitivity analysis...")
    sensitivity_results = validator.sensitivity_analysis(
        X, y, 
        feature_of_interest='Amount',
        values_to_test=np.linspace(X['Amount'].min(), X['Amount'].max(), 5)
    )
    print("Sensitivity Analysis:")
    print(sensitivity_results)

Running validation suite...
Validation Results: {'holdout': {'auc': 0.7728042328042327, 'brier': 0.1619762107975078, 'avg_precision': 0.8705127636573369}, 'kfold': {'mean_auc': 0.7689124444677506, 'std_auc': 0.010659575517553449, 'fold_scores': array([0.77711547, 0.76934925, 0.74970391, 0.76803905, 0.78035454])}, 'stratified': {'mean_auc': 0.7716190476190476, 'std_auc': 0.03783142282928286, 'fold_scores': array([0.75571429, 0.80107143, 0.77238095, 0.81892857, 0.71      ])}}

Running stress tests...
Stress Test Results: {'mild_recession': {'mean_pd_change': 0.018697498966144553, 'median_pd_change': 0.0017294611988927788, 'percentile_90_change': 0.08740464336878964, 'default_rate_baseline': 0.7020166986910017, 'default_rate_stressed': 0.7207141976571463, 'relative_increase': 0.026633980361162952}, 'severe_crisis': {'mean_pd_change': 0.041557354325645876, 'median_pd_change': 0.019511769263931222, 'percentile_90_change': 0.14882729020849061, 'default_rate_baseline': 0.7020166986910017, 'de

### Critical Observations

**1. High Baseline Risk**
- 70% default rate is extreme (real-world portfolios typically <10%)

**2. Economic Sensitivity**
- The 6% default rate increase in severe crisis suggests:
- Portfolio is already high-risk (limited "room to fall")
- Model may underestimate correlated risks

**Model Limitations**
- Good discrimination (AUC ~0.77) but not excellent
- High Brier score suggests imperfect calibration
- Stratified fold variance (0.71-0.82 AUC) indicates sensitivity to data splits

## Conclusion

While the model performs adequately given the extreme portfolio risk, the high baseline default rate makes this more suitable for stress testing methodology development than real-world deployment without further validation. 
    
### Recommended Actions

1. Data Investigation:
- Verify if 70% default rate is realistic
- Check for target leakage in features

2. Model Improvements:
- Try calibration methods (Platt scaling, isotonic regression)
- Add feature engineering (e.g., macroeconomic indicators)

3. Risk Management:
- Closely monitor high-PD loans (90th percentile)
- Develop mitigation strategies for small loans

4. Scenario Refinement:
- Test more extreme scenarios (e.g., 3× PD shock)
- Add unemployment rate shocks to features