# Risk Model Pipeline - Dual Pipeline Example

## ⚠️ IMPORTANT: Installation from GitHub

### Known Issues and Solutions

#### If you get an `llvmlite` uninstall error:
```bash
# Option 1: Ignore the installed version
pip install --ignore-installed llvmlite
pip install git+https://github.com/selimoksuz/risk-model-pipeline.git

# Option 2: Use conda to manage llvmlite
conda update llvmlite
pip install git+https://github.com/selimoksuz/risk-model-pipeline.git

# Option 3: Force reinstall without dependencies
pip install --force-reinstall --no-deps git+https://github.com/selimoksuz/risk-model-pipeline.git
pip install numpy==1.24.3 pandas==1.5.3 scikit-learn==1.3.0
```

### Standard Installation
```bash
pip install git+https://github.com/selimoksuz/risk-model-pipeline.git
```

### Create Clean Environment (Recommended)
```bash
# Create a new environment
python -m venv risk_env

# Activate it (run only the line that matches your OS)
risk_env\Scripts\activate       # Windows
source risk_env/bin/activate    # Linux/Mac

# Install in clean environment
pip install git+https://github.com/selimoksuz/risk-model-pipeline.git
```
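
Before installing, it can help to confirm that the notebook kernel is actually running inside the new environment. The check below is a minimal sketch using only the standard library; the helper name `in_virtual_env` is illustrative and not part of the package. It detects `venv`/`virtualenv` environments only, so a conda environment will report `False`.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(f"Interpreter: {sys.executable}")
print(f"Inside virtual environment: {in_virtual_env()}")
```

If the interpreter path does not point into `risk_env`, the kernel is probably still using the base installation.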

## 1. Environment Check

In [1]:
# Check Python and package versions
import sys
print(f"Python: {sys.version}")
print(f"Python executable: {sys.executable}")
print("-" * 50)

# Try importing packages and show versions
packages = ['numpy', 'pandas', 'sklearn']

import_success = True
for package_name in packages:
    try:
        module = __import__(package_name)
        print(f"✓ {package_name}: {module.__version__}")
    except ImportError:
        print(f"✗ {package_name}: Not installed")
        import_success = False
    except Exception as e:
        # Installed but broken (for example a binary incompatibility)
        print(f"✗ {package_name}: Error - {e}")
        import_success = False

if not import_success:
    print("\n⚠️ Please install missing packages:")
    print("pip install git+https://github.com/selimoksuz/risk-model-pipeline.git")
else:
    print("\n✓ All packages imported successfully!")


Python: 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)]
Python executable: C:\Users\Acer\anaconda3\python.exe
--------------------------------------------------
✓ numpy: 1.24.3


✓ pandas: 2.3.2
✓ sklearn: 1.6.1

✓ All packages imported successfully!
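
Importing a package just to read its version can trigger import-time warnings. As an alternative sketch, the installed versions can be read from package metadata without importing anything. The helper name `installed_version` is illustrative; the distribution names are the ones used by `pip` (note `scikit-learn`, not `sklearn`).

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for dist in ("numpy", "pandas", "scikit-learn", "risk-model-pipeline"):
    version = installed_version(dist)
    print(f"{dist}: {version if version else 'not installed'}")
```

This reports what `pip` believes is installed, so it will not catch a package that is present but fails at import time; the import-based check above still covers that case.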


## 2. Setup and Imports

In [2]:
# Reinstall package from GitHub to get latest changes
import subprocess
import sys

print("Updating risk-model-pipeline from GitHub...")

# Uninstall existing version
print("1. Uninstalling existing version...")
subprocess.run([sys.executable, "-m", "pip", "uninstall", "-y", "risk-model-pipeline"], capture_output=True)

# Install fresh from GitHub (will install all requirements automatically)
print("2. Installing from GitHub (with all requirements)...")
result = subprocess.run(
    [sys.executable, "-m", "pip", "install", "git+https://github.com/selimoksuz/risk-model-pipeline.git"],
    capture_output=True, text=True
)

if result.returncode == 0:
    print("✓ Package installed successfully!")
else:
    print(f"✗ Installation failed: {result.stderr}")

# Clear import cache to ensure fresh import (sys is already imported above)
modules_to_clear = ['risk_pipeline', 'risk_pipeline.pipeline', 'risk_pipeline.core']
for module in modules_to_clear:
    if module in sys.modules:
        del sys.modules[module]

print("✓ Ready to import pipeline")

Updating risk-model-pipeline from GitHub...
1. Uninstalling existing version...
2. Installing from GitHub (with all requirements)...
✓ Package installed successfully!
✓ Ready to import pipeline
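
The cell above clears three hard-coded module names. A slightly more general sketch removes the package and every cached submodule by prefix. The helper name `purge_modules` is an assumption made for illustration, not part of the package.

```python
import sys


def purge_modules(prefix):
    """Remove `prefix` and all of its submodules from sys.modules.

    Returns the list of module names that were removed.
    """
    stale = [name for name in sys.modules
             if name == prefix or name.startswith(prefix + ".")]
    for name in stale:
        del sys.modules[name]
    return stale


removed = purge_modules("risk_pipeline")
print(f"Cleared {len(removed)} cached modules")
```

Objects created before the purge still reference the old code, so restarting the kernel after a reinstall remains the more reliable option.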


In [3]:
# Import pipeline components
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

from risk_pipeline.pipeline import Config, RiskModelPipeline

print("✓ Pipeline imported successfully!")

✓ Pipeline imported successfully!


## 3. Generate Sample Data

In [4]:
def create_sample_data(n_samples=10000, seed=42, oot_shift=True):
    """
    Create synthetic credit risk data with controlled characteristics for testing
    
    Parameters:
    -----------
    n_samples : int
        Total number of samples
    seed : int
        Random seed for reproducibility
    oot_shift : bool
        If True, create distribution shift in OOT period for some features
    """
    np.random.seed(seed)
    import random
    random.seed(seed)
    
    # Time periods (70% train+test, 30% OOT)
    train_test_size = int(n_samples * 0.7)
    oot_size = n_samples - train_test_size
    
    # === STRONG PREDICTIVE FEATURES (stable) ===
    # These will have high IV and remain stable
    risk_score = np.concatenate([
        np.random.beta(2, 5, train_test_size),
        np.random.beta(2, 5, oot_size)  # Same distribution in OOT
    ])
    
    payment_score = np.concatenate([
        np.random.beta(3, 2, train_test_size),
        np.random.beta(3, 2, oot_size)  # Same distribution in OOT
    ])
    
    debt_ratio = np.concatenate([
        np.random.beta(2, 3, train_test_size),
        np.random.beta(2, 3, oot_size)  # Same distribution in OOT
    ])
    
    # === MODERATE PREDICTIVE FEATURES (with PSI shift) ===
    # These will have decent IV but high PSI in OOT
    income_level = np.concatenate([
        np.random.lognormal(10, 1.5, train_test_size),
        np.random.lognormal(10.5, 1.2, oot_size) if oot_shift else np.random.lognormal(10, 1.5, oot_size)
    ])
    
    credit_history_months = np.concatenate([
        np.random.gamma(3, 10, train_test_size),
        np.random.gamma(4, 12, oot_size) if oot_shift else np.random.gamma(3, 10, oot_size)
    ])
    
    # === WEAK/NOISY FEATURES ===
    # These should be filtered out by feature selection
    noise_feature1 = np.random.randn(n_samples)
    noise_feature2 = np.random.uniform(0, 1, n_samples)
    
    # === CATEGORICAL FEATURES ===
    employment_type = np.concatenate([
        np.random.choice(['Full-time', 'Part-time', 'Self-employed', 'Unemployed'], 
                        train_test_size, p=[0.6, 0.2, 0.15, 0.05]),
        np.random.choice(['Full-time', 'Part-time', 'Self-employed', 'Unemployed'], 
                        oot_size, p=[0.6, 0.2, 0.15, 0.05])
    ])
    
    # Region (shifts in OOT - new categories appear)
    region_train = np.random.choice(['North', 'South', 'East', 'West'], 
                                   train_test_size, p=[0.3, 0.3, 0.2, 0.2])
    if oot_shift:
        # Introduce new categories in OOT
        region_oot = np.random.choice(['North', 'South', 'East', 'West', 'Central', 'International'], 
                                     oot_size, p=[0.2, 0.2, 0.15, 0.15, 0.2, 0.1])
    else:
        region_oot = np.random.choice(['North', 'South', 'East', 'West'], 
                                     oot_size, p=[0.3, 0.3, 0.2, 0.2])
    region = np.concatenate([region_train, region_oot])
    
    product_type = np.random.choice(['A', 'B', 'C'], n_samples, p=[0.5, 0.3, 0.2])
    
    # === HIGHLY CORRELATED FEATURES (for correlation filtering) ===
    utilization_rate = debt_ratio + np.random.normal(0, 0.1, n_samples)
    utilization_rate = np.clip(utilization_rate, 0, 1)
    
    num_credit_lines = (credit_history_months / 10 + np.random.poisson(2, n_samples)).astype(int)
    num_credit_lines = np.clip(num_credit_lines, 0, 20)
    
    num_inquiries = np.random.poisson(2, n_samples)
    
    # === TARGET VARIABLE ===
    # Create target with strong signal from stable features
    risk_factor = (
        3.0 * risk_score +                    # Strong positive (bad is high)
        2.5 * payment_score +                  # Strong positive
        2.0 * debt_ratio +                     # Strong positive
        1.0 * utilization_rate +               # Moderate positive
        0.5 * (income_level < np.median(income_level)).astype(float) +
        0.3 * (credit_history_months < 24).astype(float) +
        0.5 * (employment_type == 'Unemployed').astype(float) +
        0.2 * (employment_type == 'Part-time').astype(float) +
        0.1 * noise_feature1 + 0.1 * noise_feature2
    )
    
    # Convert to probability
    default_prob = 1 / (1 + np.exp(-2 * (risk_factor - np.median(risk_factor))))
    target = np.random.binomial(1, default_prob)
    
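    # Adjust to get ~20-30% default rate.
    # Note: when either branch below fires, the sampled labels are replaced by a
    # hard cut-off on default_prob, so the target becomes a deterministic function
    # of the features (this happens in the run shown here: the rate is exactly 30.00%).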
    if target.mean() > 0.30:
        threshold = np.percentile(default_prob, 70)
        target = (default_prob > threshold).astype(int)
    elif target.mean() < 0.20:
        threshold = np.percentile(default_prob, 80)
        target = (default_prob > threshold).astype(int)
    
    # === ADD MISSING VALUES ===
    missing_idx = np.random.choice(n_samples, size=int(0.05 * n_samples), replace=False)
    income_level[missing_idx] = np.nan
    
    missing_idx = np.random.choice(n_samples, size=int(0.03 * n_samples), replace=False)
    credit_history_months[missing_idx] = np.nan
    
    # === CREATE DATAFRAME ===
    df = pd.DataFrame({
        'app_id': range(1, n_samples + 1),
        # Hourly timestamps; pandas >= 2.2 prefers the lowercase alias freq='h'
        'app_dt': pd.date_range(start='2022-01-01', periods=n_samples, freq='H'),
        'risk_score': risk_score,
        'payment_score': payment_score,
        'debt_ratio': debt_ratio,
        'income_level': income_level,
        'credit_history_months': credit_history_months,
        'noise_feature1': noise_feature1,
        'noise_feature2': noise_feature2,
        'employment_type': employment_type,
        'region': region,
        'product_type': product_type,
        'utilization_rate': utilization_rate,
        'num_credit_lines': num_credit_lines,
        'num_inquiries': num_inquiries,
        'target': target
    })
    
    print(f"Dataset created:")
    print(f"  Shape: {df.shape}")
    print(f"  Default rate: {df['target'].mean():.2%}")
    print(f"  Date range: {df['app_dt'].min().date()} to {df['app_dt'].max().date()}")
    print(f"  Missing values: {df.isnull().sum().sum()}")
    
    # Show feature characteristics
    print(f"\nFeature characteristics:")
    print(f"  Strong predictors: risk_score, payment_score, debt_ratio")
    print(f"  PSI shift features: income_level, credit_history_months, region")
    print(f"  Noise features: noise_feature1, noise_feature2")
    print(f"  Correlated pairs: (debt_ratio, utilization_rate)")
    
    return df

# Generate data with fixed seed
try:
    df = create_sample_data(n_samples=10000, seed=42, oot_shift=True)
    print("\n✓ Data generated successfully!")
    display(df.head())
except Exception as e:
    print(f"✗ Error generating data: {e}")

Dataset created:
  Shape: (10000, 16)
  Default rate: 30.00%
  Date range: 2022-01-01 to 2023-02-21
  Missing values: 800

Feature characteristics:
  Strong predictors: risk_score, payment_score, debt_ratio
  PSI shift features: income_level, credit_history_months, region
  Noise features: noise_feature1, noise_feature2
  Correlated pairs: (debt_ratio, utilization_rate)

✓ Data generated successfully!


| | app_id | app_dt | risk_score | payment_score | debt_ratio | income_level | credit_history_months | noise_feature1 | noise_feature2 | employment_type | region | product_type | utilization_rate | num_credit_lines | num_inquiries | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2022-01-01 00:00:00 | 0.353677 | 0.576435 | 0.604789 | 66096.391145 | 7.45465 | 0.162789 | 0.210952 | Full-time | South | C | 0.666422 | 2 | 0 | 1 |
| 1 | 2 | 2022-01-01 01:00:00 | 0.248558 | 0.935896 | 0.245044 | 13009.413385 | 26.72538 | -2.072867 | 0.277412 | Unemployed | North | B | 0.308693 | 6 | 1 | 1 |
| 2 | 3 | 2022-01-01 02:00:00 | 0.415959 | 0.92408 | 0.405689 | 65966.324008 | 27.662485 | 0.282163 | 0.837427 | Full-time | South | A | 0.591985 | 2 | 6 | 1 |
| 3 | 4 | 2022-01-01 03:00:00 | 0.159968 | 0.626639 | 0.602617 | 46467.378787 | 44.511552 | 0.550439 | 0.929937 | Full-time | East | B | 0.439253 | 10 | 4 | 0 |
| 4 | 5 | 2022-01-01 04:00:00 | 0.550283 | 0.854347 | 0.577085 | 14137.477402 | 11.4831 | 0.385806 | 0.915711 | Self-employed | North | A | 0.421884 | 4 | 1 | 1 |
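
To see how large the `income_level` shift built into `create_sample_data` is, PSI can be computed by hand. The sketch below uses only the standard library and makes its own choices: decile bins taken from the reference sample and a small `eps` floor for empty bins. The pipeline's internal PSI implementation may bin differently, so the numbers are indicative only. It compares a reference sample with one drawn from the same lognormal distribution and one drawn from the shifted parameters used above.

```python
import math
import random


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges are quantiles of `expected`; `eps` guards against empty bins.
    """
    ref = sorted(expected)
    # Interior quantile cut points taken from the reference sample
    edges = [ref[int(len(ref) * i / n_bins)] for i in range(1, n_bins)]

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            # Bin index = number of cut points at or below v
            counts[sum(v >= e for e in edges)] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    # PSI = sum over bins of (actual - expected) * ln(actual / expected)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


rng = random.Random(42)
train = [rng.lognormvariate(10, 1.5) for _ in range(7000)]
same = [rng.lognormvariate(10, 1.5) for _ in range(3000)]
shifted = [rng.lognormvariate(10.5, 1.2) for _ in range(3000)]

print(f"PSI, same distribution:    {psi(train, same):.4f}")
print(f"PSI, shifted distribution: {psi(train, shifted):.4f}")
```

Whether a shifted feature crosses a cut-off such as 0.25 depends on the binning and on which rows end up in the comparison window, so it is worth checking the reported PSI values rather than assuming a feature will be dropped.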


## 4. Configure Pipeline

In [5]:
# Advanced Configuration with All Feature Selection Options
try:
    config = Config(
        # ========== CORE COLUMNS ==========
        id_col='app_id',
        time_col='app_dt',
        target_col='target',
        
        # ========== DUAL PIPELINE ==========
        enable_dual_pipeline=True,  # Enable both WOE and RAW pipelines
        
        # ========== RAW PIPELINE SETTINGS ==========
        # Imputation strategies: 'median', 'mean', 'mode', 'multiple', 'target_mean', 'forward_fill', 'interpolate'
        raw_imputation_strategy='multiple',  # Use multiple imputation (creates ensemble features)
        raw_outlier_method='iqr',           # Outlier detection: 'iqr', 'zscore', 'percentile', 'none'
        raw_outlier_threshold=1.5,          # IQR multiplier for outlier detection
        
        # ========== DATA SPLITTING ==========
        use_test_split=True,         # Create train/test/OOT splits
        test_size_row_frac=0.2,      # 20% for test set
        oot_window_months=3,         # Last 3 months as OOT (~22% of this hourly sample: 2,209 of 10,000 rows)
        
        # ========== FEATURE ENGINEERING THRESHOLDS ==========
        
        # 1. PSI (Population Stability Index) - Feature stability monitoring
        psi_threshold=0.25,          # Features with PSI > 0.25 are dropped (unstable)
        # PSI < 0.10: No significant change
        # 0.10 <= PSI < 0.25: Some change, monitor
        # PSI >= 0.25: Significant change, drop feature
        
        # 2. IV (Information Value) - Feature importance
        iv_min=0.02,                 # Minimum IV to keep feature (filters weak predictors)
        # IV < 0.02: Not useful
        # 0.02 <= IV < 0.1: Weak predictor
        # 0.1 <= IV < 0.3: Medium predictor
        # 0.3 <= IV < 0.5: Strong predictor
        # IV >= 0.5: Very strong (check for overfitting)
        
        # 3. Correlation & Multicollinearity
        rho_threshold=0.90,          # Max correlation between features (drops redundant)
        vif_threshold=5.0,           # Variance Inflation Factor threshold
        cluster_top_k=2,             # Keep top K features from each correlation cluster
        
        # 4. Rare Categories
        rare_threshold=0.01,         # Categories with < 1% frequency are grouped as "RARE"
        
        # ========== FEATURE SELECTION METHODS ==========
        # The pipeline uses multiple methods in sequence:
        # 1. PSI filtering (stability check)
        # 2. IV filtering (importance check)
        # 3. Correlation clustering (redundancy removal)
        # 4. Boruta algorithm (all-relevant features)
        # 5. Forward selection with 1SE rule
        # 6. Noise sentinel (final sanity check)
        
        # ========== MODEL SELECTION CRITERIA ==========
        # How to select the best model from all trained models
        model_selection_method='balanced',   # Options: 'gini_oot', 'stable', 'balanced', 'conservative'
        
        # For 'balanced' method: weighted score = (1-weight)*performance + weight*stability
        model_stability_weight=0.3,          # 30% weight on stability, 70% on performance
        
        # For 'conservative' method: max allowed Train-OOT gap
        max_train_oot_gap=0.15,              # Models with gap > 15% are excluded
        
        # For 'stable' method: minimum acceptable performance
        min_gini_threshold=0.5,              # Only consider models with Gini >= 0.5
        
        # ========== MODEL TRAINING ==========
        cv_folds=3,                  # Cross-validation folds
        hpo_method='random',         # Hyperparameter optimization: 'random', 'optuna', 'grid'
        hpo_timeout_sec=30,          # Time limit for HPO per model
        hpo_trials=10,               # Number of HPO trials
        
        # ========== OUTPUT SETTINGS ==========
        output_folder='outputs_dual_example',
        output_excel_path='dual_pipeline_results.xlsx',
        write_parquet=True,          # Also save data in Parquet format
        
        # ========== OTHER SETTINGS ==========
        random_state=42,             # For reproducibility
        n_jobs=-1,                   # Use all CPU cores
        use_noise_sentinel=True,     # Final check to remove noise features
        use_benchmarks=True,         # Compare with benchmark models
    )
    
    print("✓ Configuration created successfully!")
    print("\n" + "="*60)
    print("CONFIGURATION SUMMARY")
    print("="*60)
    
    print("\n📊 Feature Selection Thresholds:")
    print(f"  • PSI Threshold: {config.psi_threshold} (stability check)")
    print(f"  • IV Minimum: {config.iv_min} (importance filter)")
    print(f"  • Correlation Threshold: {config.rho_threshold} (redundancy removal)")
    print(f"  • VIF Threshold: {config.vif_threshold} (multicollinearity)")
    print(f"  • Rare Category Threshold: {config.rare_threshold} (1% minimum)")
    
    print("\n🎯 Model Selection Strategy:")
    print(f"  • Method: {config.model_selection_method}")
    if config.model_selection_method == 'balanced':
        print(f"  • Stability Weight: {config.model_stability_weight} ({int(config.model_stability_weight*100)}% stability, {int((1-config.model_stability_weight)*100)}% performance)")
    if config.max_train_oot_gap:
        print(f"  • Max Train-OOT Gap: {config.max_train_oot_gap} (stability constraint)")
    
    print("\n🔧 Pipeline Settings:")
    print(f"  • Dual Pipeline: {config.enable_dual_pipeline}")
    print(f"  • RAW Imputation: {config.raw_imputation_strategy}")
    print(f"  • HPO Trials: {config.hpo_trials} trials in {config.hpo_timeout_sec}s")
    print(f"  • Output: {config.output_folder}")
    
except Exception as e:
    print(f"✗ Error creating configuration: {e}")

✓ Configuration created successfully!

CONFIGURATION SUMMARY

📊 Feature Selection Thresholds:
  • PSI Threshold: 0.25 (stability check)
  • IV Minimum: 0.02 (importance filter)
  • Correlation Threshold: 0.9 (redundancy removal)
  • VIF Threshold: 5.0 (multicollinearity)
  • Rare Category Threshold: 0.01 (1% minimum)

🎯 Model Selection Strategy:
  • Method: balanced
  • Stability Weight: 0.3 (30% stability, 70% performance)
  • Max Train-OOT Gap: 0.15 (stability constraint)

🔧 Pipeline Settings:
  • Dual Pipeline: True
  • RAW Imputation: multiple
  • HPO Trials: 10 trials in 30s
  • Output: outputs_dual_example
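
The `iv_min` bands in the configuration comments refer to Information Value. As a rough illustration of what that number measures, the sketch below computes IV for an already-binned feature with the standard WOE formula. The function name, the additive smoothing constant `eps`, and the toy data are assumptions for this example; the pipeline's own binning and smoothing are not shown in this notebook.

```python
import math
from collections import Counter


def information_value(bins, target, eps=0.5):
    """IV of a binned feature against a 0/1 target.

    WOE_i = ln(non-event share in bin i / event share in bin i)
    IV    = sum_i (non-event share_i - event share_i) * WOE_i
    `eps` is added to every cell count so empty cells do not produce log(0).
    """
    events = Counter(b for b, y in zip(bins, target) if y == 1)
    non_events = Counter(b for b, y in zip(bins, target) if y == 0)
    labels = set(events) | set(non_events)
    total_events = sum(events.values()) + eps * len(labels)
    total_non_events = sum(non_events.values()) + eps * len(labels)

    iv = 0.0
    for label in labels:
        p_event = (events[label] + eps) / total_events
        p_non_event = (non_events[label] + eps) / total_non_events
        iv += (p_non_event - p_event) * math.log(p_non_event / p_event)
    return iv


# Toy data: two bins of 100 rows each
bins = ["low"] * 100 + ["high"] * 100
informative = [0] * 90 + [1] * 10 + [0] * 50 + [1] * 50    # 10% vs 50% bad rate
uninformative = [0] * 70 + [1] * 30 + [0] * 70 + [1] * 30  # 30% bad rate in both

print(f"IV informative:   {information_value(bins, informative):.4f}")
print(f"IV uninformative: {information_value(bins, uninformative):.4f}")
```

Some references define WOE with events in the numerator instead; that flips the sign of each WOE value but leaves IV unchanged.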


## 5. Run Pipeline

In [6]:
# Run pipeline with error handling
print("Preparing to run pipeline...")

try:
    # Import time for elapsed time calculation
    import time
    import numpy as np
    import random
    
    # Set random seed before pipeline run for consistency
    np.random.seed(42)
    random.seed(42)
    
    # Create pipeline instance
    pipeline = RiskModelPipeline(config)
    print("✓ Pipeline instance created")
    
    # Run pipeline
    print("\n" + "="*60)
    print("STARTING DUAL PIPELINE EXECUTION")
    print("="*60 + "\n")
    
    start_time = time.time()
    pipeline.run(df)
    elapsed = time.time() - start_time
    
    print(f"\n✓ Pipeline completed in {elapsed:.2f} seconds")
    
except Exception as e:
    print(f"\n✗ Pipeline error: {e}")
    print("\nPossible solutions:")
    print("  1. Check if all required packages are installed")
    print("  2. Verify numpy/pandas compatibility")
    print("  3. Run: pip install git+https://github.com/selimoksuz/risk-model-pipeline.git")
    print("\nDetailed error:")
    import traceback
    traceback.print_exc()


Preparing to run pipeline...
✓ Pipeline instance created

STARTING DUAL PIPELINE EXECUTION

[21:57:50] >> 1) Data loading & preparation starting | CPU=1% RAM=23%
   - Data size: 10,000 rows x 16 columns
   - Target rate: 30.00%
   - Random seed: 42
[21:57:50] 1) Data loading & preparation finished (0.10s) - OK | CPU=1% RAM=23%
[21:57:50] >> 2) Input validation & fixing starting | CPU=0% RAM=23%
[21:57:50] 2) Input validation & fixing finished (0.13s) - OK | CPU=5% RAM=23%
[21:57:50] >> 3) Variable classification starting | CPU=5% RAM=23%
   - numeric=10, categorical=4
[21:57:50] 3) Variable classification finished (0.11s) - OK | CPU=0% RAM=23%
[21:57:50] >> 4) Missing & rare value policy starting | CPU=0% RAM=23%
[21:57:50] 4) Missing & rare value policy finished (0.11s) - OK | CPU=0% RAM=23%
[21:57:50] >> 5) Time split (Train/Test/OOT) starting | CPU=2% RAM=23%
   - Train=6233, Test=1558, OOT=2209
[21:57:51] 5) Time split (Train/Test/OOT) finished (0.12s) - OK | CPU=4%
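
The split sizes in the log can be reproduced with plain date arithmetic. The sketch below is not the pipeline's code: it assumes the OOT window is every row at or after "latest timestamp minus 3 calendar months" and that the test share is truncated to an integer. Those assumptions were chosen because they match the logged counts. `minus_months` is an illustrative helper.

```python
import calendar
from datetime import datetime, timedelta


def minus_months(ts, months):
    """Shift a datetime back by whole calendar months, clamping the day."""
    index = ts.year * 12 + (ts.month - 1) - months
    year, month0 = divmod(index, 12)
    day = min(ts.day, calendar.monthrange(year, month0 + 1)[1])
    return ts.replace(year=year, month=month0 + 1, day=day)


# Same timestamps as the sample data: 10,000 hourly rows from 2022-01-01
start = datetime(2022, 1, 1)
timestamps = [start + timedelta(hours=i) for i in range(10_000)]

cutoff = minus_months(timestamps[-1], 3)        # oot_window_months=3
oot = [t for t in timestamps if t >= cutoff]
pre_oot = len(timestamps) - len(oot)
test = int(pre_oot * 0.2)                       # test_size_row_frac=0.2
train = pre_oot - test

print(f"cutoff={cutoff}, train={train}, test={test}, oot={len(oot)}")
```

With hourly data the OOT window covers 2,209 of 10,000 rows (about 22%), while `create_sample_data` applies its distribution shift from row 7,000 onward. Rows 7,000 to 7,790 therefore carry the shifted distributions but land in the train/test portion.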

## 6. Review Results

In [7]:
# Complete Results Display - Full Output
try:
    print("="*80)
    print("COMPLETE MODEL PERFORMANCE SUMMARY")
    print("="*80)
    
    if hasattr(pipeline, 'models_summary_') and pipeline.models_summary_ is not None:
        summary = pipeline.models_summary_
        
        # Display full model summary
        print("\n📊 ALL MODELS TRAINED:")
        print("-"*80)
        
        # Show all columns for complete information
        pd.set_option('display.max_columns', None)
        pd.set_option('display.width', None)
        pd.set_option('display.max_rows', None)
        
        # Sort by Gini_OOT for better readability
        summary_sorted = summary.sort_values('Gini_OOT', ascending=False)
        
        # Display all models with all metrics
        metric_cols = ['Gini_Train', 'Gini_Test', 'Gini_OOT',
                       'AUC_Train', 'AUC_Test', 'AUC_OOT']
        for idx, row in summary_sorted.iterrows():
            print(f"\nModel: {row['model_name']}")
            print(f"  Pipeline: {row.get('pipeline', 'N/A')}")
            for col in metric_cols:
                value = row.get(col)
                # Format only real numbers; a missing metric prints as N/A
                # instead of raising on the ':.4f' format spec
                text = f"{value:.4f}" if pd.notna(value) else "N/A"
                print(f"  {col}: {text}")
            
            # Calculate stability metrics
            if 'Gini_Train' in row and 'Gini_OOT' in row:
                train_oot_gap = abs(row['Gini_Train'] - row['Gini_OOT'])
                print(f"  Train-OOT Gap: {train_oot_gap:.4f}")
                
            if 'KS_OOT' in row:
                print(f"  KS_OOT: {row['KS_OOT']:.4f}")
        
        # Pipeline Comparison
        print("\n" + "="*80)
        print("PIPELINE COMPARISON ANALYSIS")
        print("="*80)
        
        for pipeline_type in ['WOE', 'RAW']:
            # .copy() avoids SettingWithCopyWarning when a column is added below
            pipeline_models = summary[summary['model_name'].str.contains(pipeline_type)].copy()
            if not pipeline_models.empty:
                print(f"\n{pipeline_type} Pipeline Performance:")
                print(f"  Total Models: {len(pipeline_models)}")
                print(f"  Best Gini_OOT: {pipeline_models['Gini_OOT'].max():.4f}")
                print(f"  Mean Gini_OOT: {pipeline_models['Gini_OOT'].mean():.4f}")
                print(f"  Std Gini_OOT: {pipeline_models['Gini_OOT'].std():.4f}")
                
                # Find most stable model (smallest Train-OOT gap)
                pipeline_models['train_oot_gap'] = abs(
                    pipeline_models['Gini_Train'] - pipeline_models['Gini_OOT']
                )
                most_stable = pipeline_models.nsmallest(1, 'train_oot_gap').iloc[0]
                print(f"  Most Stable Model: {most_stable['model_name']}")
                print(f"    - Train-OOT Gap: {most_stable['train_oot_gap']:.4f}")
                
        # Feature Analysis
        print("\n" + "="*80)
        print("FEATURE SELECTION DETAILED ANALYSIS")
        print("="*80)
        
        if hasattr(pipeline, 'final_vars'):
            print(f"\n✅ Final Variables Selected ({len(pipeline.final_vars)}):")
            for i, var in enumerate(pipeline.final_vars, 1):
                print(f"  {i}. {var}")
        
        if hasattr(pipeline, 'raw_final_vars'):
            print(f"\n✅ RAW Final Variables ({len(pipeline.raw_final_vars)}):")
            for i, var in enumerate(pipeline.raw_final_vars, 1):
                print(f"  {i}. {var}")
                
        # PSI Analysis
        print("\n" + "="*80)
        print("PSI (POPULATION STABILITY INDEX) ANALYSIS")
        print("="*80)
        
        if hasattr(pipeline, 'psi_summary'):
            psi_df = pipeline.psi_summary
            if not psi_df.empty:
                print(f"\nTotal Features Analyzed: {len(psi_df)}")
                
                # Group by status
                status_counts = psi_df['status'].value_counts()
                for status, count in status_counts.items():
                    print(f"  {status}: {count} features")
                
                # Show all features with their PSI values
                print("\nDetailed PSI Values:")
                for _, row in psi_df.iterrows():
                    psi_val = row['psi']
                    status_icon = "✅" if row['status'] == "KEEP" else "❌"
                    print(f"  {status_icon} {row['variable']}: PSI={psi_val:.4f} [{row['status']}]")
        
        # IV Analysis
        print("\n" + "="*80)
        print("INFORMATION VALUE (IV) ANALYSIS")
        print("="*80)
        
        if hasattr(pipeline, 'woe_map'):
            iv_values = []
            for var, woe_obj in pipeline.woe_map.items():
                if hasattr(woe_obj, 'iv'):
                    iv_values.append((var, woe_obj.iv))
            
            if iv_values:
                iv_values.sort(key=lambda x: x[1], reverse=True)
                print(f"\nTop Features by IV:")
                for i, (var, iv) in enumerate(iv_values[:10], 1):
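                    # Labels follow the IV bands listed in the configuration comments
                    strength = "Very Strong" if iv >= 0.5 else "Strong" if iv >= 0.3 else "Medium" if iv >= 0.1 else "Weak" if iv >= 0.02 else "Not useful"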
                    print(f"  {i}. {var}: IV={iv:.4f} ({strength})")
        
        # Model Selection Criteria
        print("\n" + "="*80)
        print("BEST MODEL SELECTION ANALYSIS")
        print("="*80)
        
        if hasattr(pipeline, 'best_model_name'):
            best_model_row = summary[summary['model_name'] == pipeline.best_model_name].iloc[0]
            print(f"\n🏆 Selected Best Model: {pipeline.best_model_name}")
            print(f"  Gini_OOT: {best_model_row['Gini_OOT']:.4f}")
            print(f"  Gini_Train: {best_model_row['Gini_Train']:.4f}")
            print(f"  Train-OOT Gap: {abs(best_model_row['Gini_Train'] - best_model_row['Gini_OOT']):.4f}")
            
            # Alternative selection criteria
            print("\n📈 Alternative Best Models:")
            
            # By pure performance
            best_perf = summary.nlargest(1, 'Gini_OOT').iloc[0]
            print(f"\n1. By Highest OOT Performance:")
            print(f"   Model: {best_perf['model_name']}")
            print(f"   Gini_OOT: {best_perf['Gini_OOT']:.4f}")
            
            # By stability
            summary['stability_gap'] = abs(summary['Gini_Train'] - summary['Gini_OOT'])
            best_stable = summary.nsmallest(1, 'stability_gap').iloc[0]
            print(f"\n2. By Best Stability (Smallest Train-OOT Gap):")
            print(f"   Model: {best_stable['model_name']}")
            print(f"   Train-OOT Gap: {best_stable['stability_gap']:.4f}")
            print(f"   Gini_OOT: {best_stable['Gini_OOT']:.4f}")
            
            # By balanced criteria
            summary['balanced_score'] = (
                0.7 * summary['Gini_OOT'] - 0.3 * summary['stability_gap']
            )
            best_balanced = summary.nlargest(1, 'balanced_score').iloc[0]
            print(f"\n3. By Balanced Criteria (70% Performance + 30% Stability):")
            print(f"   Model: {best_balanced['model_name']}")
            print(f"   Gini_OOT: {best_balanced['Gini_OOT']:.4f}")
            print(f"   Train-OOT Gap: {best_balanced['stability_gap']:.4f}")
        
        # Summary Statistics
        print("\n" + "="*80)
        print("OVERALL PIPELINE STATISTICS")
        print("="*80)
        
        print(f"\nTotal Models Trained: {len(summary)}")
        print(f"WOE Models: {len(summary[summary['model_name'].str.contains('WOE')])}")
        print(f"RAW Models: {len(summary[summary['model_name'].str.contains('RAW')])}")
        
        print(f"\nPerformance Range:")
        print(f"  Gini_OOT: {summary['Gini_OOT'].min():.4f} - {summary['Gini_OOT'].max():.4f}")
        print(f"  Mean Gini_OOT: {summary['Gini_OOT'].mean():.4f}")
        print(f"  Std Gini_OOT: {summary['Gini_OOT'].std():.4f}")
        
        # Reset display options
        pd.reset_option('display.max_columns')
        pd.reset_option('display.width')
        pd.reset_option('display.max_rows')
        
    else:
        print("❌ No model summary available")
        print("\nPossible reasons:")
        print("  1. Pipeline execution failed")
        print("  2. All features were filtered out")
        print("  3. Check error logs above")
        
except Exception as e:
    print(f"❌ Error displaying results: {e}")
    import traceback
    traceback.print_exc()

COMPLETE MODEL PERFORMANCE SUMMARY

📊 ALL MODELS TRAINED:
--------------------------------------------------------------------------------

Model: RAW_GAM
  Pipeline: RAW
  Gini_Train: 0.9861
  Gini_Test: 0.9865
  Gini_OOT: 0.9816
  AUC_Train: 0.9931
  AUC_Test: 0.9933
  AUC_OOT: 0.9908
  Train-OOT Gap: 0.0045
  KS_OOT: 0.9021

Model: RAW_XGBoost
  Pipeline: RAW
  Gini_Train: 1.0000
  Gini_Test: 0.9821
  Gini_OOT: 0.9779
  AUC_Train: 1.0000
  AUC_Test: 0.9911
  AUC_OOT: 0.9890
  Train-OOT Gap: 0.0221
  KS_OOT: 0.8978

Model: RAW_LightGBM
  Pipeline: RAW
  Gini_Train: 1.0000
  Gini_Test: 0.9809
  Gini_OOT: 0.9753
  AUC_Train: 1.0000
  AUC_Test: 0.9905
  AUC_OOT: 0.9877
  Train-OOT Gap: 0.0247
  KS_OOT: 0.8839

Model: RAW_RandomForest
  Pipeline: RAW
  Gini_Train: 1.0000
  Gini_Test: 0.9758
  Gini_OOT: 0.9721
  AUC_Train: 1.0000
  AUC_Test: 0.9879
  AUC_OOT: 0.9861
  Train-OOT Gap: 0.0279
  KS_OOT: 0.8735

Model: RAW_ExtraTrees
  Pipeline: RAW
  Gini_Train: 1.0000
  Gini_Test: 0.9747
  G
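
The "balanced" comparison in the results cell can be checked by hand against the figures printed above. The sketch below applies the same formula the cell uses (70% weight on `Gini_OOT`, 30% penalty on the Train-OOT gap) to the four models whose output is complete; RAW_ExtraTrees is left out because its output is cut off. The function name `balanced_score` is illustrative, and the pipeline's internal `'balanced'` selection may be computed differently.

```python
# (Gini_Train, Gini_OOT) copied from the printed model summary
results = {
    "RAW_GAM": (0.9861, 0.9816),
    "RAW_XGBoost": (1.0000, 0.9779),
    "RAW_LightGBM": (1.0000, 0.9753),
    "RAW_RandomForest": (1.0000, 0.9721),
}


def balanced_score(gini_train, gini_oot, stability_weight=0.3):
    """Reward OOT performance and penalise the absolute Train-OOT gap."""
    gap = abs(gini_train - gini_oot)
    return (1 - stability_weight) * gini_oot - stability_weight * gap


ranked = sorted(results, key=lambda name: balanced_score(*results[name]), reverse=True)
for name in ranked:
    print(f"{name}: {balanced_score(*results[name]):.4f}")
```

Because the gap term is small for all four models, the ranking here is driven almost entirely by `Gini_OOT`; a larger `stability_weight` would be needed before stability changes the order.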

## 7. Export Reports

In [8]:
# Export reports
try:
    pipeline.export_reports()
    print("✓ Reports exported successfully!")
    
    # List generated files
    import os
    if os.path.exists(config.output_folder):
        files = os.listdir(config.output_folder)
        print(f"\nGenerated {len(files)} files in '{config.output_folder}':")
        for f in sorted(files)[:10]:
            size = os.path.getsize(os.path.join(config.output_folder, f)) / 1024
            print(f"  - {f} ({size:.1f} KB)")
        if len(files) > 10:
            print(f"  ... and {len(files)-10} more files")
            
except Exception as e:
    print(f"Error exporting reports: {e}")

✓ Reports exported successfully!

Generated 4 files in 'outputs_dual_example':
  - best_model_20250904_215720_0300ba8a.joblib (210.1 KB)
  - dual_pipeline_results.xlsx (28.8 KB)
  - final_vars_20250904_215720_0300ba8a.json (0.2 KB)
  - woe_mapping_20250904_215720_0300ba8a.json (32.9 KB)


## Troubleshooting

If you encounter any errors:

1. **numpy.dtype size changed error**:
   ```bash
   pip install git+https://github.com/selimoksuz/risk-model-pipeline.git
   ```

2. **Import errors**:
   ```bash
   pip install --force-reinstall git+https://github.com/selimoksuz/risk-model-pipeline.git
   ```

3. **Memory issues**:
   - Reduce n_samples in create_sample_data()
   - Reduce hpo_trials in config

4. **Create fresh environment**:
   ```bash
   python -m venv fresh_env
   fresh_env\Scripts\activate       # Windows
   source fresh_env/bin/activate    # Linux/Mac
   pip install git+https://github.com/selimoksuz/risk-model-pipeline.git
   ```