# Credit Risk Modeling for Mal Bank (Sharia-Compliant)

## Case Study: Credit Scoring Model Development

This notebook implements a comprehensive credit risk modeling pipeline for Mal Bank, a Sharia-compliant financial institution. The project includes:

1. **Credit Scoring Model Development** - Baseline and advanced models
2. **Islamic Lending Context** - PD/EAD/LGD considerations for Murabaha, Ijara, etc.
3. **Behavioural & Limit Management** - Early warning systems and limit recommendations
4. **Production & Monitoring** - Architecture and monitoring strategies
5. **Ethical & Bias Considerations** - Fairness and bias mitigation

---

## Part 1: Credit Scoring Model Development


In [None]:
# Import libraries
import sys
import os
import warnings
warnings.filterwarnings('ignore')

# Add src to path
sys.path.append('../src')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

# Import custom modules
from data_loading import load_all_data
from feature_engineering import build_feature_table
from modeling import (
    get_preprocessing_pipeline, 
    train_logistic_regression, 
    train_lightgbm,
    prepare_features_and_target
)
from evaluation import evaluate_model, plot_roc_curve, plot_pr_curve, plot_confusion_matrix, print_metrics
from explainability import (
    extract_logistic_coefficients, 
    plot_logistic_coefficients,
    plot_shap_summary,
    plot_shap_waterfall
)
from utils import identify_target_column, get_column_types

# Set style
# Try style names from newest to oldest matplotlib; an unknown style raises OSError
for style_name in ('seaborn-v0_8-darkgrid', 'seaborn-darkgrid', 'ggplot'):
    try:
        plt.style.use(style_name)
        break
    except OSError:
        continue
sns.set_palette("husl")

print("Libraries imported successfully!")


### 1.1 Data Loading & Initial Exploration

We start by loading all data files and performing initial exploration to understand the structure and identify key columns.


In [None]:
# Load data - adjust path if needed
# Option 1: If data is in ../data/ directory
# data_dir = "../data"

# Option 2: If data is in ../malbank_case_data/ directory
data_dir = "../malbank_case_data"

print("Loading data files...")
data = load_all_data(data_dir)

print("\nData loaded successfully!")
print(f"Keys: {list(data.keys())}")

# Extract individual tables
apps = data['applications']
prev_apps = data['previous_applications']
inst = data['installments']
bureau = data['bureau']
bureau_bal = data['bureau_balance']
cc_bal = data['credit_card_balance']
pos_bal = data['pos_cash_balance']
cols_desc = data['columns_description']

print(f"\nApplications shape: {apps.shape}")
print(f"Previous applications shape: {prev_apps.shape}")
print(f"Installments shape: {inst.shape}")
print(f"Bureau shape: {bureau.shape}")
print(f"Bureau balance shape: {bureau_bal.shape}")
print(f"Credit card balance shape: {cc_bal.shape}")
print(f"POS cash balance shape: {pos_bal.shape}")


In [None]:
# Explore applications table
print("Applications table info:")
print(apps.info())
print("\nFirst few rows:")
apps.head()


In [None]:
# Identify target column
target_col = identify_target_column(apps)
print(f"Target column identified: {target_col}")

if target_col:
    print(f"\nTarget distribution:")
    print(apps[target_col].value_counts())
    print(f"\nTarget proportion:")
    print(apps[target_col].value_counts(normalize=True))
    
    # Visualize target distribution
    plt.figure(figsize=(8, 5))
    apps[target_col].value_counts().plot(kind='bar', color=['skyblue', 'coral'])
    plt.title('Target Distribution', fontsize=14, fontweight='bold')
    plt.xlabel('Target (0=No Default, 1=Default)', fontsize=12)
    plt.ylabel('Count', fontsize=12)
    plt.xticks(rotation=0)
    plt.grid(alpha=0.3, axis='y')
    plt.tight_layout()
    plt.show()


### 1.2 Feature Engineering - Aggregating Supporting Tables

We aggregate features from supporting tables (previous applications, installments, bureau, etc.) to create a comprehensive feature set at the application level.


In [None]:
# Build feature table by aggregating all supporting tables
print("Building feature table...")
print("This may take a few minutes...")

feature_df = build_feature_table(
    apps=apps,
    prev_apps=prev_apps,
    inst=inst,
    bureau=bureau,
    bureau_bal=bureau_bal,
    cc_bal=cc_bal,
    pos_bal=pos_bal
)

print(f"\nFeature table shape: {feature_df.shape}")
print(f"Original applications shape: {apps.shape}")
print(f"New features added: {feature_df.shape[1] - apps.shape[1]}")

# Check for missing values
print(f"\nMissing values in aggregated features:")
agg_cols = [col for col in feature_df.columns if col not in apps.columns]
missing_counts = feature_df[agg_cols].isnull().sum()
print(missing_counts[missing_counts > 0].head(20))


### 1.3 Target and Basic Cleaning

We identify the target, handle missing data, cap outliers, and prepare for modeling.


In [None]:
# Prepare features and target
X, y = prepare_features_and_target(feature_df, target_col=target_col)

print(f"Features shape: {X.shape}")
print(f"Target shape: {y.shape}")
print(f"\nTarget distribution:")
print(y.value_counts())

# Identify column types
categorical_cols, numerical_cols = get_column_types(X)
print(f"\nCategorical columns: {len(categorical_cols)}")
print(f"Numerical columns: {len(numerical_cols)}")
print(f"\nFirst 10 categorical: {categorical_cols[:10]}")
print(f"First 10 numerical: {numerical_cols[:10]}")


In [None]:
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, 
    test_size=0.2, 
    random_state=42, 
    stratify=y
)

print(f"Training set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")
print(f"\nTraining target distribution:")
print(y_train.value_counts(normalize=True))
print(f"\nTest target distribution:")
print(y_test.value_counts(normalize=True))


### 1.4 Model Training

We train two models:
1. **Logistic Regression** - Interpretable baseline model
2. **LightGBM** - Non-linear gradient boosting model

Both models use the same preprocessing pipeline with proper handling of missing values, outliers, and class imbalance.


In [None]:
# Build preprocessing pipeline
print("Building preprocessing pipeline...")
preprocessing = get_preprocessing_pipeline(
    X_train,
    categorical_cols=categorical_cols,
    numerical_cols=numerical_cols,
    cap_outliers=True
)

print("Preprocessing pipeline created successfully!")


In [None]:
# Train Logistic Regression
print("Training Logistic Regression model...")
model_lr = train_logistic_regression(
    X_train, 
    y_train, 
    preprocessing,
    class_weight='balanced',
    random_state=42
)

print("Logistic Regression trained successfully!")


In [None]:
# Calculate scale_pos_weight for LightGBM
neg_count = (y_train == 0).sum()
pos_count = (y_train == 1).sum()
scale_pos_weight = neg_count / (pos_count + 1e-6)
print(f"Scale pos weight: {scale_pos_weight:.2f}")

# Train LightGBM
print("Training LightGBM model...")
model_lgb = train_lightgbm(
    X_train,
    y_train,
    preprocessing,
    scale_pos_weight=scale_pos_weight,
    random_state=42,
    n_estimators=200,
    learning_rate=0.05,
    max_depth=7
)

print("LightGBM trained successfully!")


### 1.5 Model Evaluation

We evaluate both models using comprehensive metrics: AUC-ROC, AUC-PR, KS statistic, Brier score, and confusion matrices.
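
For reference, the metrics named above can be computed with scikit-learn roughly as follows. This is a minimal sketch, not the code inside `evaluation.evaluate_model` (which is not shown in this notebook); the helper name `summarize_scores` and the toy arrays are hypothetical.

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    brier_score_loss,
    roc_auc_score,
    roc_curve,
)


def summarize_scores(y_true, y_score):
    """Return AUC-ROC, AUC-PR, KS and Brier score for binary labels and predicted PDs."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return {
        "auc_roc": roc_auc_score(y_true, y_score),
        # Average precision is a common estimate of the area under the PR curve
        "auc_pr": average_precision_score(y_true, y_score),
        # KS = largest gap between the cumulative score distributions of the two classes
        "ks": float(np.max(tpr - fpr)),
        # Brier score = mean squared error of the predicted probabilities
        "brier": brier_score_loss(y_true, y_score),
    }


# Toy data, for illustration only
y_true = np.array([0, 0, 0, 0, 1, 1])
y_score = np.array([0.10, 0.20, 0.30, 0.60, 0.70, 0.90])
toy_metrics = summarize_scores(y_true, y_score)
print(toy_metrics)
```

AUC-ROC and KS measure rank ordering only, while the Brier score also reflects calibration, so a model can look strong on the first two and still be poorly calibrated.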


In [None]:
# Evaluate Logistic Regression
print("Evaluating Logistic Regression...")
metrics_lr = evaluate_model(model_lr, X_test, y_test, threshold=0.5, model_name="Logistic Regression")
print_metrics(metrics_lr, "Logistic Regression")


In [None]:
# Evaluate LightGBM
print("Evaluating LightGBM...")
metrics_lgb = evaluate_model(model_lgb, X_test, y_test, threshold=0.5, model_name="LightGBM")
print_metrics(metrics_lgb, "LightGBM")


In [None]:
# Plot ROC curves
plot_roc_curve(
    y_test.values, 
    metrics_lr['y_pred_proba'], 
    "Logistic Regression",
    save_path="../plots/roc_logreg.png"
)

plot_roc_curve(
    y_test.values, 
    metrics_lgb['y_pred_proba'], 
    "LightGBM",
    save_path="../plots/roc_lgbm.png"
)


In [None]:
# Plot Precision-Recall curves
plot_pr_curve(
    y_test.values,
    metrics_lr['y_pred_proba'],
    "Logistic Regression",
    save_path="../plots/prc_logreg.png"
)

plot_pr_curve(
    y_test.values,
    metrics_lgb['y_pred_proba'],
    "LightGBM",
    save_path="../plots/prc_lgbm.png"
)


In [None]:
# Plot confusion matrices
plot_confusion_matrix(
    metrics_lr['confusion_matrix'],
    "Logistic Regression",
    threshold=0.5,
    save_path="../plots/confusion_logreg.png"
)

plot_confusion_matrix(
    metrics_lgb['confusion_matrix'],
    "LightGBM",
    threshold=0.5,
    save_path="../plots/confusion_lgbm.png"
)


### 1.6 Model Explainability

We analyze model interpretability using:
- **Logistic Regression**: Coefficient analysis
- **LightGBM**: SHAP values for global and local explanations


In [None]:
# Extract and plot Logistic Regression coefficients
top_positive_lr, top_negative_lr = extract_logistic_coefficients(
    model_lr,
    feature_names=list(X_train.columns),
    top_n=15
)

print("Top 15 Positive Coefficients (increase default risk):")
print(top_positive_lr)

print("\nTop 15 Negative Coefficients (decrease default risk):")
print(top_negative_lr)

plot_logistic_coefficients(
    top_positive_lr,
    top_negative_lr,
    model_name="Logistic Regression",
    save_path="../plots/coefficients_logreg.png"
)


In [None]:
# SHAP summary plot for LightGBM
print("Computing SHAP values for LightGBM (this may take a few minutes)...")
plot_shap_summary(
    model_lgb,
    X_test.sample(min(500, len(X_test)), random_state=42),
    model_name="LightGBM",
    save_path="../plots/shap_summary_lgbm.png",
    n_samples=100
)


In [None]:
# Find examples for local explanation
# High-risk customer (predicted default with high probability)
high_risk_idx = np.where((metrics_lgb['y_pred_proba'] > 0.7) & (y_test.values == 1))[0]
if len(high_risk_idx) > 0:
    high_risk_idx = high_risk_idx[0]
    print(f"High-risk customer (actual default): Index {high_risk_idx}, Probability: {metrics_lgb['y_pred_proba'][high_risk_idx]:.3f}")
else:
    # Use highest probability default prediction
    high_risk_idx = np.argmax(metrics_lgb['y_pred_proba'])
    print(f"High-risk customer (highest predicted probability): Index {high_risk_idx}, Probability: {metrics_lgb['y_pred_proba'][high_risk_idx]:.3f}")

# Low-risk customer (predicted non-default with low probability)
low_risk_idx = np.where((metrics_lgb['y_pred_proba'] < 0.2) & (y_test.values == 0))[0]
if len(low_risk_idx) > 0:
    low_risk_idx = low_risk_idx[0]
    print(f"Low-risk customer (actual non-default): Index {low_risk_idx}, Probability: {metrics_lgb['y_pred_proba'][low_risk_idx]:.3f}")
else:
    # Use lowest probability
    low_risk_idx = np.argmin(metrics_lgb['y_pred_proba'])
    print(f"Low-risk customer (lowest predicted probability): Index {low_risk_idx}, Probability: {metrics_lgb['y_pred_proba'][low_risk_idx]:.3f}")


In [None]:
# SHAP waterfall plots for individual customers
print("\nGenerating SHAP waterfall for high-risk customer...")
plot_shap_waterfall(
    model_lgb,
    X_test.reset_index(drop=True),
    high_risk_idx,
    model_name="LightGBM",
    save_path="../plots/shap_waterfall_high_risk.png"
)

print("\nGenerating SHAP waterfall for low-risk customer...")
plot_shap_waterfall(
    model_lgb,
    X_test.reset_index(drop=True),
    low_risk_idx,
    model_name="LightGBM",
    save_path="../plots/shap_waterfall_low_risk.png"
)


#### Local Explanation: High-Risk Customer

**Customer Profile Analysis:**

Based on the SHAP waterfall plot above, this customer shows high default risk due to:

- **Key Risk Factors**: [Analyze top positive SHAP values from the plot]
  - High utilization of credit facilities
  - History of late payments (DPD > 0)
  - Low income relative to credit amount
  - Previous application rejections or cancellations

**Credit Officer Action:**

1. **Immediate Review**: Flag for manual underwriting review
2. **Additional Documentation**: Request proof of income stability, employment verification
3. **Risk Mitigation**: Consider:
   - Lower credit limit
   - Shorter repayment term
   - Require co-signer or collateral (for Murabaha/Ijara)
   - Stricter monitoring of payment behavior

**Sharia-Compliant Considerations:**

- For Murabaha: Ensure asset valuation is conservative
- For Ijara: Verify lessee's ability to maintain regular payments
- No interest-based penalties; focus on restructuring if needed


#### Local Explanation: Low-Risk Customer

**Customer Profile Analysis:**

Based on the SHAP waterfall plot above, this customer shows low default risk due to:

- **Key Positive Factors**: [Analyze top negative SHAP values from the plot]
  - Strong payment history (no late payments)
  - Low credit utilization
  - Stable income and employment
  - Good bureau credit history
  - Previous successful loan completions

**Credit Officer Action:**

1. **Standard Processing**: Approve with standard terms
2. **Potential Upsell**: Consider offering:
   - Higher credit limit (if utilization is low)
   - Additional Sharia-compliant products (Murabaha, Ijara)
   - Preferred customer benefits

**Sharia-Compliant Considerations:**

- Good candidates for profit-sharing products (Mudarabah/Musharakah) if applicable
- Can support higher-value asset financing (Murabaha)
- Eligible for longer-term Ijara contracts


---

## Part 2: Islamic Lending Context

### 2.1 Differences Between Conventional and Islamic Lending

**Fundamental Principles:**

1. **Prohibition of Riba (Interest)**: Islamic finance prohibits charging or paying interest. Instead, transactions must be asset-backed and involve profit-sharing or cost-plus structures.

2. **Asset-Backed Financing**: All financing must be tied to real assets or services, ensuring tangible value creation.

3. **Risk-Sharing**: Islamic finance emphasizes sharing both profits and losses between the bank and customer, rather than fixed interest payments.

**Key Islamic Financing Products:**

- **Murabaha**: Cost-plus sale where the bank purchases an asset and sells it to the customer at a marked-up price, payable in installments.
- **Ijara**: Leasing arrangement where the bank owns the asset and leases it to the customer for regular payments.
- **Mudarabah**: Profit-sharing partnership where the bank provides capital and the customer provides expertise.
- **Musharakah**: Joint venture partnership with shared profits and losses.

### 2.2 Impact on Credit Risk Modeling

**Feature Engineering Adjustments:**

- **No Interest-Based Features**: Remove interest rate, APR, or interest payment history
- **Asset Valuation**: Include features for asset value, depreciation, and market conditions
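- **Profit Payment History**: Track payments of the deferred profit component, which plays the role that interest payments play in conventional lending but arises from a sale or lease contract rather than a loan
- **Installment Structure**: Track the principal and profit components of each installment separately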

**Target Definition:**

- **Default**: Failure to meet agreed installment/profit payments (not unpaid interest)
- **Delinquency**: Days past due on installments (principal + profit), not interest arrears
- **Loss**: Outstanding principal + unearned profit at default, minus asset recovery value

### 2.3 PD, EAD, and LGD in Islamic Context

**Probability of Default (PD):**

- **Definition**: Probability that a customer fails to meet agreed installment/profit payments
- **Modeling**: Similar to conventional, but payment obligations are structured as:
  - Principal repayment (return of capital)
  - Profit margin (deferred profit in Murabaha, rental in Ijara)
- **Features**: Payment history, income stability, asset value trends, economic conditions

**Exposure at Default (EAD):**

- **Murabaha**: 
  - Outstanding principal (remaining cost)
  - Unearned profit (profit not yet realized)
  - Formula: `EAD = Outstanding Principal + (Total Profit × Remaining Installments / Total Installments)`
  
- **Ijara**:
  - Remaining lease payments (rental + principal if applicable)
  - Residual asset value (if customer defaults, bank retains asset)
  - Formula: `EAD = Sum of Remaining Lease Payments - Expected Residual Value`

- **Mudarabah/Musharakah**:
  - Outstanding capital contribution
  - Expected profit share (if applicable)

**Loss Given Default (LGD):**

- **Components**:
  1. **Outstanding Amount**: EAD at default
  2. **Recovery Value**: 
     - Asset sale proceeds (Murabaha: sell asset; Ijara: re-lease or sell)
     - Collateral liquidation (if applicable)
  3. **Recovery Costs**: Legal, administrative, asset disposal costs
  
- **Formula**: `LGD = 1 - (Recovery Value - Recovery Costs) / EAD`

- **Sharia Considerations**:
  - No penalty interest on overdue amounts
  - Focus on asset repossession and resale
  - Restructuring options (rescheduling, resale) preferred over foreclosure

### 2.4 Murabaha-Specific Modeling

**Delinquency Measurement:**

- **Days Past Due (DPD)**: Days elapsed since the due date of the oldest unpaid installment
- **Installment Components**:
  - Principal component (return of capital)
  - Profit component (deferred profit margin)
- **Tracking**: Monitor both components separately, but DPD applies to total installment

**Loss Severity Calculation:**

1. **At Default**:
   - Outstanding Principal = Original Cost - Principal Paid
   - Unearned Profit = Total Profit × (Remaining Installments / Total Installments)
   - EAD = Outstanding Principal + Unearned Profit

2. **Recovery Process**:
   - Repossess asset (if customer defaults)
   - Sell asset at market value (may be depreciated)
   - Net Recovery Value = Sale Price - Selling Costs

3. **LGD Calculation**:
   ```
   LGD = (EAD - Net Recovery Value) / EAD
   ```
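
The loss severity steps above can be sketched in a few lines of Python. The function names and the figures in the example are hypothetical, and clipping LGD to the range [0, 1] is an assumption that the text does not state.

```python
def murabaha_ead(original_cost, principal_paid, total_profit,
                 remaining_installments, total_installments):
    """EAD = outstanding principal + unearned profit, as defined in the steps above."""
    outstanding_principal = original_cost - principal_paid
    unearned_profit = total_profit * remaining_installments / total_installments
    return outstanding_principal + unearned_profit


def murabaha_lgd(ead, sale_price, selling_costs):
    """LGD = (EAD - recovery net of selling costs) / EAD."""
    net_recovery = sale_price - selling_costs
    lgd = (ead - net_recovery) / ead
    # Assumption: clip to [0, 1] so that an over-recovery does not produce a negative LGD
    return min(max(lgd, 0.0), 1.0)


# Hypothetical contract: cost 100,000, profit 20,000, 24 of 60 installments remaining
ead = murabaha_ead(100_000, 40_000, 20_000, 24, 60)
lgd = murabaha_lgd(ead, sale_price=50_000, selling_costs=5_000)
print(f"EAD = {ead:,.0f}, LGD = {lgd:.1%}")
```

In this example the outstanding principal is 60,000 and the unearned profit is 8,000, so EAD is 68,000. A net recovery of 45,000 leaves a loss of 23,000, or about 33.8% of EAD.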

**Key Risk Factors for Murabaha:**

- Asset depreciation rate (affects recovery value)
- Market liquidity for asset resale
- Customer's payment behavior on profit component
- Economic conditions affecting asset values
- Asset condition at default (wear and tear)

**Modeling Recommendations:**

- Include asset type and depreciation features
- Track profit payment separately from principal
- Model recovery rates by asset category
- Consider market conditions in LGD models
- Use conservative asset valuations for EAD calculations


---

## Part 3: Behavioural & Limit Management

### 3.1 Behavioural Variables from Transactional History

**DPD Trends:**

- **Current DPD**: Days past due at current time
- **DPD Buckets**: 0, 1-30, 31-60, 61-90, 91+
- **DPD Trend**: Direction and magnitude of DPD changes over time
- **DPD Volatility**: Standard deviation of DPD over rolling windows
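
A minimal pandas sketch of these DPD features, assuming one DPD snapshot per customer per month. The table layout and column names (`customer_id`, `month`, `dpd`) are hypothetical, and the top bucket is labeled 91+ so that bucket boundaries do not overlap.

```python
import pandas as pd

# Hypothetical monthly DPD snapshots for two customers
dpd = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 2],
    "month": [1, 2, 3, 1, 2, 3],
    "dpd": [0, 12, 45, 95, 30, 0],
})

# Right-closed bins: (-1, 0], (0, 30], (30, 60], (60, 90], (90, inf)
bins = [-1, 0, 30, 60, 90, float("inf")]
labels = ["0", "1-30", "31-60", "61-90", "91+"]
dpd["dpd_bucket"] = pd.cut(dpd["dpd"], bins=bins, labels=labels)

dpd = dpd.sort_values(["customer_id", "month"])
per_customer = dpd.groupby("customer_id")["dpd"]

# DPD trend: month-on-month change (positive means deteriorating)
dpd["dpd_change"] = per_customer.diff()

# DPD volatility: rolling standard deviation over a 3-month window
dpd["dpd_vol_3m"] = per_customer.transform(
    lambda s: s.rolling(3, min_periods=2).std()
)

print(dpd)
```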

**Payment Behavior:**

- **Missed Payments**: Count of missed payments in rolling windows (3, 6, 12 months)
- **Late Payments**: Count of payments with DPD > 0 in rolling windows
- **Payment-to-Scheduled Ratio**: Actual payment / scheduled payment
- **Payment Patterns**:
  - Full payment rate
  - Partial payment rate
  - Above-minimum payment rate
  - Payment timing consistency

**Utilization Metrics:**

- **Current Utilization**: Current balance / Credit limit
- **Peak Utilization**: Maximum utilization in last N months
- **Average Utilization**: Mean utilization over rolling windows
- **Utilization Trend**: Increasing, stable, or decreasing
- **Utilization Volatility**: Standard deviation of utilization
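
The utilization metrics can be derived in the same way from monthly balance snapshots. The sketch below assumes a hypothetical table with `customer_id`, `month`, `balance` and `credit_limit` columns and uses a 3-month window purely for illustration.

```python
import pandas as pd

# Hypothetical monthly balance snapshots for two customers
bal = pd.DataFrame({
    "customer_id": [1] * 4 + [2] * 4,
    "month": [1, 2, 3, 4] * 2,
    "balance": [200, 400, 600, 800, 900, 900, 500, 100],
    "credit_limit": [1000] * 8,
})
bal = bal.sort_values(["customer_id", "month"])

# Current utilization: balance / limit
bal["utilization"] = bal["balance"] / bal["credit_limit"]

window = 3
per_customer = bal.groupby("customer_id")["utilization"]

# Peak, average and volatility over the rolling window
bal["util_peak_3m"] = per_customer.transform(
    lambda s: s.rolling(window, min_periods=1).max()
)
bal["util_mean_3m"] = per_customer.transform(
    lambda s: s.rolling(window, min_periods=1).mean()
)
bal["util_std_3m"] = per_customer.transform(
    lambda s: s.rolling(window, min_periods=2).std()
)

# Trend: change against the first month of the window (positive = increasing)
bal["util_trend_3m"] = per_customer.diff(window - 1)

print(bal)
```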

**Income Stability (Hypothetical):**

- **Income Trend**: Increasing, stable, or decreasing
- **Income Volatility**: Coefficient of variation of income
- **Income-to-Payment Ratio**: Monthly income / Monthly payment obligation
- **Payment Coverage**: Ability to cover payments from income

### 3.2 Early Warning System for Delinquency

**Approach 1: Behavioural Score Model**

A small predictive model (e.g., logistic regression or simple tree) predicting 3-6 month delinquency:

```python
# Sketch of a behavioural score; `model` is any fitted classifier
# that exposes predict_proba (e.g., a scikit-learn estimator)
def behavioral_score_model(model, features):
    """
    Predicts probability of delinquency in next 3-6 months.
    
    Features:
    - DPD trend (increasing = risk)
    - Payment ratio (declining = risk)
    - Utilization (high = risk)
    - Missed payments count (increasing = risk)
    - Income stability indicators
    """
    # Column 1 of predict_proba holds the probability of the positive (delinquent) class
    risk_score = model.predict_proba(features)[:, 1]
    return risk_score
```

**Approach 2: Rule-Based Early Warning System**

Define thresholds that trigger alerts:

- **Yellow Alert** (Monitor):
  - DPD: 1-15 days
  - Payment ratio: 0.8-0.95
  - Utilization: 80-90%
  - 1-2 missed payments in last 6 months

- **Orange Alert** (Intervene):
  - DPD: 16-30 days
  - Payment ratio: 0.6-0.8
  - Utilization: 90-95%
  - 3-4 missed payments in last 6 months
  - Declining income trend

- **Red Alert** (Immediate Action):
  - DPD: 31+ days
  - Payment ratio: < 0.6
  - Utilization: > 95%
  - 5+ missed payments in last 6 months
  - Significant income decline
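
The alert tiers above can be expressed as a small rule function. This sketch makes two assumptions that the thresholds do not settle: any single criterion is enough to trigger a tier, and the most severe triggered tier wins. Values that fall exactly on a boundary go to the milder tier. The income criteria are left out because they are not quantified, and the function name is hypothetical.

```python
def alert_level(dpd, payment_ratio, utilization, missed_6m):
    """Return 'red', 'orange', 'yellow' or 'none' for one account.

    dpd           -- current days past due
    payment_ratio -- actual payment / scheduled payment
    utilization   -- current balance / credit limit (0.90 means 90%)
    missed_6m     -- missed payments in the last 6 months
    """
    if dpd >= 31 or payment_ratio < 0.6 or utilization > 0.95 or missed_6m >= 5:
        return "red"
    if dpd >= 16 or payment_ratio < 0.8 or utilization > 0.90 or missed_6m >= 3:
        return "orange"
    if dpd >= 1 or payment_ratio < 0.95 or utilization >= 0.80 or missed_6m >= 1:
        return "yellow"
    return "none"


print(alert_level(dpd=0, payment_ratio=1.0, utilization=0.50, missed_6m=0))
print(alert_level(dpd=10, payment_ratio=1.0, utilization=0.50, missed_6m=0))
print(alert_level(dpd=20, payment_ratio=0.9, utilization=0.50, missed_6m=0))
print(alert_level(dpd=0, payment_ratio=0.5, utilization=0.50, missed_6m=0))
```

Checking the most severe tier first keeps the rules mutually exclusive without spelling out both ends of every range.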

**Intervention Strategies (Sharia-Compliant):**

1. **Early Contact**: Reach out to customer to understand situation
2. **Restructuring**: Reschedule installments (extend term, reduce payment amount)
3. **Resale (Murabaha)**: If customer cannot continue, facilitate asset resale
4. **Limit Adjustment**: Reduce or freeze credit limits
5. **No Penalties**: Avoid interest-based penalties; focus on solutions

### 3.3 Limit Management Framework

**Limit Increase Conditions:**

- **Good Behavioural Score**: Low risk of delinquency
- **Low DPD**: No or minimal days past due
- **Stable Payment History**: Consistent on-time payments
- **Low Utilization**: Current utilization < 60%
- **Stable/Increasing Income**: Positive income trend
- **Long Relationship**: Successful history with bank

**Limit Decrease or Freeze Conditions:**

- **High DPD**: 30+ days past due
- **Frequent Late Payments**: Multiple late payments in recent months
- **High Utilization**: Utilization > 90%
- **Volatile/Decreasing Income**: Negative income trend
- **Missed Payments**: Multiple missed payments
- **Negative Bureau Updates**: New adverse credit bureau information

**Sharia-Compliant Levers:**

- **No Interest-Based Penalties**: Cannot charge interest on overdue amounts
- **Limit Adjustments**: Reduce available credit to prevent further exposure
- **Restructuring**: Modify payment terms (extend term, reduce amount)
- **Asset Repossession**: For Murabaha/Ijara, repossess asset if default occurs
- **Early Settlement Incentives**: Offer discounts for early full payment (discount on profit, not interest)

### 3.4 Evaluation Strategy

**Backtesting Framework:**

1. **Historical Data Analysis**:
   - Apply behavioural score to historical data
   - Identify customers who would have triggered alerts
   - Compare outcomes: Did they default? How many were false positives?

2. **Metrics**:
   - **Bad Rate Reduction**: % reduction in default rate among intervened customers
   - **Losses Avoided**: Estimated losses prevented through early intervention
   - **Retention Rate**: % of customers retained after intervention
   - **False Alert Rate**: % of alerts that did not lead to default (1 - precision)
   - **Cost-Benefit**: Cost of intervention vs. losses avoided

3. **A/B Testing** (if possible):
   - Randomize customers into intervention vs. control groups
   - Measure impact of interventions on default rates
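
If a randomized test is run, the difference in default rates between the two groups can be checked with a two-proportion z-test. The sketch below uses only the standard library; the function name and the counts in the example are hypothetical, and the normal approximation assumes reasonably large groups.

```python
from math import erf, sqrt


def two_proportion_ztest(defaults_treat, n_treat, defaults_ctrl, n_ctrl):
    """Compare default rates of the intervention and control groups."""
    p_treat = defaults_treat / n_treat
    p_ctrl = defaults_ctrl / n_ctrl

    # Pooled default rate under the null hypothesis of no difference
    pooled = (defaults_treat + defaults_ctrl) / (n_treat + n_ctrl)
    se = sqrt(pooled * (1 - pooled) * (1 / n_treat + 1 / n_ctrl))
    z = (p_treat - p_ctrl) / se

    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

    return {
        "default_rate_treat": p_treat,
        "default_rate_ctrl": p_ctrl,
        # Bad rate reduction relative to the control group
        "bad_rate_reduction": (p_ctrl - p_treat) / p_ctrl,
        "z": z,
        "p_value": p_value,
    }


# Hypothetical result: 40 of 1,000 intervened customers default vs. 60 of 1,000 controls
ab_result = two_proportion_ztest(40, 1000, 60, 1000)
print(ab_result)
```

Because assignment is random, the bad rate reduction measured here can be attributed to the intervention, which a comparison of alerted and non-alerted customers in historical data cannot show.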

**Implementation Pseudo-code:**

```python
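import numpy as np

def evaluate_limit_strategy(historical_data, model, feature_cols, alert_threshold,
                            target_col='default'):
    """
    Backtest the alert rule on historical data with known outcomes.

    Without a randomized control group this measures how well alerts pick
    out future defaulters, not the causal effect of intervening. Losses
    avoided and retention require intervention outcome data (see A/B testing).
    """
    # Apply behavioural score to the feature columns only (exclude the target)
    scores = model.predict_proba(historical_data[feature_cols])[:, 1]
    
    # Identify alerts
    alerts = scores >= alert_threshold
    
    # Compare outcomes
    defaulted = historical_data[target_col].to_numpy().astype(bool)
    alerted_bad_rate = defaulted[alerts].mean() if alerts.any() else np.nan
    non_alerted_bad_rate = defaulted[~alerts].mean() if (~alerts).any() else np.nan
    
    # Calculate metrics
    return {
        'alert_rate': alerts.mean(),
        'alerted_bad_rate': alerted_bad_rate,
        'non_alerted_bad_rate': non_alerted_bad_rate,
        # Share of all defaults that the alerts caught (recall)
        'capture_rate': (alerts & defaulted).sum() / max(defaulted.sum(), 1),
        # Share of alerts that did not lead to default
        'false_alert_rate': 1 - alerted_bad_rate,
    }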
```


---

## Part 4: Production & Monitoring

### 4.1 Production Architecture

**Batch Scoring (Daily Portfolio Scoring):**

```
┌─────────────────┐
│  Data Sources   │
│  (Applications, │
│   Payments,     │
│   Bureau)       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Feature Store   │
│ (Historical +   │
│  Real-time)     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Feature Prep    │
│ Pipeline        │
│ (Same as train) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Model Artifact  │
│ (LightGBM/      │
│  Logistic Reg)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Scoring Engine  │
│ (Batch Job)     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Decision Rules  │
│ (Approve/       │
│  Reject/        │
│  Manual Review) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Results Store   │
│ (Scores,        │
│  Decisions)     │
└─────────────────┘
```

**Real-Time Scoring (API):**

```
Client Request → API Gateway → Feature Extraction → 
Model Scoring → Decision Engine → Response
```

**Key Components:**

1. **Feature Preparation Pipeline**: 
   - Reuse same preprocessing as training (imputation, encoding, scaling)
   - Aggregate supporting tables in real-time or from feature store
   - Ensure consistency with training data

2. **Model Artifact Loading**:
   - Store trained models in model registry (MLflow, S3, etc.)
   - Version control for model artifacts
   - Load latest approved model version

3. **Scoring Endpoint**:
   - REST API (Flask/FastAPI) for real-time scoring
   - Batch job (Airflow, cron) for daily portfolio scoring
   - Return: probability, risk score, decision recommendation

4. **Decision Rules**:
   - Automated: Approve (score < threshold_low), Reject (score > threshold_high)
   - Manual Review: threshold_low ≤ score ≤ threshold_high
   - Policy rules: Override based on business rules (e.g., minimum income)

5. **Logging & Audit Trail**:
   - Log all predictions with timestamps, features, scores, decisions
   - Store in database for compliance and monitoring
   - Enable model explainability queries (SHAP values on demand)
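
The decision rules in component 4 can be sketched as a single function. The threshold values, the function name, and the choice to turn a failed minimum-income policy rule into a rejection (rather than a manual review) are assumptions for illustration.

```python
def decide(pd_score, threshold_low, threshold_high,
           monthly_income=None, min_income=None):
    """Map a predicted default probability to approve / reject / manual_review."""
    if not threshold_low < threshold_high:
        raise ValueError("threshold_low must be below threshold_high")

    # Policy rule: applied before the score-based decision (assumed to reject)
    if (min_income is not None and monthly_income is not None
            and monthly_income < min_income):
        return "reject"

    if pd_score < threshold_low:
        return "approve"
    if pd_score > threshold_high:
        return "reject"
    # Scores on or between the thresholds go to a credit officer
    return "manual_review"


print(decide(0.03, threshold_low=0.05, threshold_high=0.20))
print(decide(0.12, threshold_low=0.05, threshold_high=0.20))
print(decide(0.35, threshold_low=0.05, threshold_high=0.20))
print(decide(0.03, 0.05, 0.20, monthly_income=800, min_income=1000))
```

Keeping this logic outside the model artifact means thresholds and policy rules can be changed and audited without retraining.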

### 4.2 Monitoring & Retraining

**Data Drift Detection:**

- **Population Stability Index (PSI)**:
  - Compare feature distributions between training and production
  - Threshold: PSI > 0.25 indicates significant drift
  - Monitor key features: income, credit amount, utilization, DPD

- **Feature Drift Metrics**:
  ```python
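  import numpy as np
  def calculate_psi(expected, actual, bins=10, eps=1e-6):
      """Calculate Population Stability Index (numeric inputs without NaNs)."""
      # Bin data: quantile edges taken from the expected (training) sample
      edges = np.unique(np.quantile(expected, np.linspace(0, 1, bins + 1)))
      # Compare distributions: bin shares, with production values clipped into range
      e = np.histogram(expected, bins=edges)[0] / len(expected) + eps
      a = np.histogram(np.clip(actual, edges[0], edges[-1]), bins=edges)[0] / len(actual) + eps
      # Return PSI value: sum over bins of (actual - expected) * ln(actual / expected)
      return float(np.sum((a - e) * np.log(a / e)))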
  ```

- **Action**: If PSI > threshold, investigate data quality, feature engineering, or retrain model

**Performance Monitoring:**

- **AUC-ROC**: Track over time (weekly/monthly)
  - Threshold: AUC drops > 0.05 from baseline → investigate
- **KS Statistic**: Monitor discrimination power
  - Threshold: KS drops > 0.05 → investigate
- **Brier Score**: Monitor calibration
  - Threshold: Brier increases > 0.05 → investigate
- **Precision/Recall**: Track at chosen threshold
  - Monitor false positive/negative rates

**Fairness Monitoring:**

- **Group-Wise Metrics**: If protected attributes are available (gender, age group, region):
  - Calculate AUC, precision, recall by group
  - Monitor for significant disparities
  - Threshold: Difference > 0.05 in AUC between groups → investigate

- **Proxy Variable Detection**:
  - Identify features highly correlated with protected attributes
  - Monitor impact on fairness metrics
  - Consider removal or regularization if bias detected

**Retraining Triggers:**

1. **Performance Degradation**: AUC/KS drops below threshold
2. **Data Drift**: PSI > 0.25 for multiple key features
3. **Time-Based**: Quarterly or semi-annual retraining
4. **Significant Events**: Economic shocks, regulatory changes, product changes
5. **Fairness Issues**: Significant bias detected across groups
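
The quantitative triggers (1 to 3) can be checked automatically, as in the sketch below. The function name, the input layout, and the defaults for "multiple" features (two or more) and model age (six months) are assumptions. Triggers 4 and 5 need human judgment and are not covered.

```python
def retraining_reasons(baseline, current, psi_by_feature, months_since_training,
                       max_metric_drop=0.05, psi_threshold=0.25,
                       min_drifted_features=2, max_age_months=6):
    """Return the list of triggers that fired; an empty list means none did."""
    reasons = []

    # Trigger 1: performance degradation against the baseline
    for metric in ("auc", "ks"):
        if baseline[metric] - current[metric] > max_metric_drop:
            reasons.append(f"{metric} dropped by more than {max_metric_drop}")

    # Trigger 2: drift in several key features
    drifted = sorted(f for f, psi in psi_by_feature.items() if psi > psi_threshold)
    if len(drifted) >= min_drifted_features:
        reasons.append("data drift in: " + ", ".join(drifted))

    # Trigger 3: time-based retraining
    if months_since_training >= max_age_months:
        reasons.append("scheduled retraining is due")

    return reasons


# Hypothetical monitoring snapshot
reasons = retraining_reasons(
    baseline={"auc": 0.75, "ks": 0.40},
    current={"auc": 0.68, "ks": 0.38},
    psi_by_feature={"income": 0.30, "utilization": 0.27, "dpd": 0.05},
    months_since_training=3,
)
print(reasons)
```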

**Retraining Process:**

1. Collect new training data (last 12-24 months)
2. Re-run feature engineering pipeline
3. Train new model version
4. Validate on holdout set
5. Compare with current model (A/B test if possible)
6. Deploy if performance improved or maintained
7. Monitor post-deployment performance

**Model Versioning:**

- Use MLflow or similar for model versioning
- Track: training date, data version, hyperparameters, performance metrics
- Enable rollback to previous version if issues arise


---

## Part 5: Ethical & Bias Considerations

### 5.1 Ensuring Non-Discriminatory Models

**Protected Attributes Exclusion:**

- **Explicit Exclusion**: Remove protected attributes from features:
  - Gender, race, religion, ethnicity, age (if protected)
  - National origin, marital status (if protected by regulation)
  
- **Implementation**:
  ```python
  # Column names are illustrative; the elided entries depend on the actual schema
  protected_attributes = ['CODE_GENDER', 'AGE', 'RELIGION']  # ...
  X_model = X.drop(columns=protected_attributes, errors='ignore')
  ```

**Proxy Variable Detection:**

- **Identification**: Features highly correlated with protected attributes may act as proxies:
  - Example: ZIP code → race/ethnicity
  - Example: Occupation type → gender
  - Example: Education level → socioeconomic status (may correlate with protected attributes)

- **Detection Method**:
  1. Calculate correlation between features and protected attributes
  2. Flag features with absolute correlation > threshold (e.g., 0.3)
  3. Evaluate impact on group-wise performance
  4. Consider removal or regularization if bias detected
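
Steps 1 and 2 of the detection method can be sketched with pandas. The function name and toy data are hypothetical, and the protected attribute is assumed to be numerically encoded (for example 0/1). Pearson correlation only captures linear association, so categorical features would need a different measure of association.

```python
import pandas as pd


def flag_proxy_features(X, protected, threshold=0.3):
    """Return numeric features whose absolute correlation with `protected` exceeds threshold."""
    numeric = X.select_dtypes(include="number")
    # Column-wise Pearson correlation with the protected attribute
    abs_corr = numeric.corrwith(protected).abs()
    return abs_corr[abs_corr > threshold].sort_values(ascending=False)


# Toy data: `a` tracks the protected attribute closely, `b` does not
protected = pd.Series([0, 0, 0, 0, 1, 1, 1, 1])
X_toy = pd.DataFrame({
    "a": [1, 2, 1, 2, 5, 6, 5, 6],
    "b": [1, 2, 1, 2, 1, 2, 1, 2],
})
flagged = flag_proxy_features(X_toy, protected, threshold=0.3)
print(flagged)
```

A flagged feature is a candidate for review rather than automatic removal; steps 3 and 4 still apply.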

**Group-Wise Evaluation:**

- **Metrics by Group**: Calculate performance metrics separately for each protected group:
  - AUC-ROC by gender, age group, region
  - Precision, recall, F1 by group
  - Default rates by group

- **Disparity Detection**:
  - Compare metrics across groups
  - Flag if difference > threshold (e.g., 0.05 in AUC)
  - Investigate root causes
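
A minimal sketch of the group-wise comparison, using AUC as the metric. The function name and toy data are hypothetical; the 0.05 gap threshold is the example value given above.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score


def auc_by_group(y_true, y_score, groups):
    """Compute AUC-ROC separately for each group label."""
    df = pd.DataFrame({"y": y_true, "score": y_score, "group": groups})
    result = {}
    for name, part in df.groupby("group"):
        # AUC is undefined when a group contains only one class
        if part["y"].nunique() < 2:
            result[name] = float("nan")
        else:
            result[name] = roc_auc_score(part["y"], part["score"])
    return pd.Series(result, name="auc")


# Toy data: the model ranks group A perfectly and group B less well
y_true = [0, 1, 0, 1, 0, 1, 0, 1]
y_score = [0.1, 0.9, 0.2, 0.8, 0.6, 0.4, 0.3, 0.7]
groups = ["A"] * 4 + ["B"] * 4

aucs = auc_by_group(y_true, y_score, groups)
auc_gap = aucs.max() - aucs.min()
print(aucs)
print(f"AUC gap: {auc_gap:.2f}, flag: {auc_gap > 0.05}")
```

Small groups give noisy AUC estimates, so a gap just above the threshold may warrant a confidence interval before it is treated as a disparity.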

### 5.2 Handling Correlated Features

**Options for Features Correlated with Protected Attributes:**

1. **Removal**: 
   - Remove features with high correlation to protected attributes
   - Pros: Simple, eliminates direct proxy
   - Cons: May lose predictive power

2. **Regularization**:
   - Use L1/L2 regularization to reduce feature importance
   - Pros: Retains some predictive power
   - Cons: May not fully eliminate bias

3. **Reweighting**:
   - Adjust sample weights to balance representation
   - Pros: Can improve fairness
   - Cons: May reduce overall accuracy

4. **Post-Processing**:
   - Adjust thresholds by group to equalize outcomes
   - Pros: Can achieve equalized odds or demographic parity
   - Cons: May reduce accuracy, requires careful calibration

5. **Fairness-Aware Algorithms**:
   - Use algorithms that explicitly optimize for fairness
   - Examples: Fairness constraints in optimization, adversarial debiasing
   - Pros: Built-in fairness
   - Cons: More complex, may reduce accuracy

**Recommendation for Mal Bank:**

- **Primary**: Remove protected attributes and high-correlation proxies
- **Secondary**: Use regularization to reduce impact of remaining correlated features
- **Tertiary**: Monitor group-wise metrics and apply post-processing if needed
- **Governance**: Review by model risk committee before deployment

### 5.3 Balancing Fairness, Accuracy, and Business Impact

**Trade-Offs:**

- **Fairness vs. Accuracy**:
  - Removing biased features may reduce AUC by 0.01-0.03
  - Post-processing for fairness may reduce precision/recall
  - **Decision**: Accept small accuracy loss (e.g., < 0.02 AUC) for significant fairness gain

- **Fairness vs. Business Impact**:
  - Equalizing approval rates may increase default rate
  - Equalizing default rates may reduce approval rate
  - **Decision**: Balance based on business objectives and regulatory requirements

**Hypothetical Example:**

- **Baseline Model**: AUC = 0.75, but 0.10 AUC difference between gender groups
- **Fairness-Adjusted Model**: AUC = 0.73, 0.02 AUC difference between groups
- **Trade-Off**: 0.02 AUC loss for a 0.08 reduction in the between-group AUC gap
- **Decision**: Accept trade-off if regulatory compliance requires it

**Governance Framework:**

1. **Model Risk Committee**:
   - Review model for fairness before deployment
   - Approve trade-offs between accuracy and fairness
   - Set thresholds for acceptable disparities

2. **Regular Audits**:
   - Quarterly reviews of group-wise performance
   - Investigate and remediate if disparities detected
   - Document decisions and rationale

3. **Transparency**:
   - Document excluded features and rationale
   - Report group-wise metrics to stakeholders
   - Maintain audit trail of model decisions

4. **Regulatory Compliance**:
   - Ensure compliance with local regulations (e.g., anti-discrimination laws)
   - Consider Sharia principles (fairness, justice)
   - Align with bank's ethical guidelines

**Sharia-Compliant Considerations:**

- **Justice (Adl)**: Ensure fair treatment of all customers
- **No Exploitation**: Avoid discriminatory practices that exploit vulnerable groups
- **Transparency**: Clear, explainable decisions
- **Social Responsibility**: Consider impact on community and society
