# AI Model Governance Toolkit - Bias Detection Demo

This notebook demonstrates how to use the Bias Detection Module to analyze
potential bias in a credit scoring model. We'll:
1. Generate and prepare synthetic credit data with protected attributes
2. Train a credit scoring model
3. Analyze the model for several types of bias
4. Visualize the bias metrics
5. Review recommendations for mitigating bias

In [None]:
import sys
sys.path.append('..')

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
import seaborn as sns

from bias_detection.fairness_metrics import BiasDetector

## 1. Load and Prepare Sample Credit Data

We'll create a synthetic dataset that includes protected attributes like gender and age,
along with other credit-related features.

In [None]:
# Generate synthetic credit data with protected attributes
np.random.seed(42)
n_samples = 1000

# Generate protected attributes
gender = np.random.choice(['male', 'female'], size=n_samples, p=[0.5, 0.5])
age = np.random.normal(35, 10, n_samples)
age = np.clip(age, 18, 80)  # Clip age to reasonable range

# Generate other features
data = {
    'gender': gender,
    'age': age,
    # Clip the normal draws so no feature takes an impossible value
    'income': np.clip(np.random.normal(50000, 20000, n_samples), 0, None),
    'employment_length': np.clip(np.random.normal(5, 3, n_samples), 0, None),
    'debt_to_income': np.clip(np.random.normal(0.3, 0.1, n_samples), 0, None),
    'credit_score': np.random.normal(700, 50, n_samples),
    'payment_history': np.clip(np.random.normal(0.95, 0.05, n_samples), 0, 1),
    'loan_amount': np.clip(np.random.normal(10000, 5000, n_samples), 0, None)
}

df = pd.DataFrame(data)

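# Generate target (loan approval) with some bias.
# Features are standardized first: on raw units (credit_score ~700,
# income ~50000) the sigmoid saturates at 1 and every applicant is approved.
def zscore(s):
    return (s - s.mean()) / s.std()

logit = (
    0.1 * zscore(df['credit_score']) +
    0.05 * zscore(df['income']) -
    0.2 * zscore(df['debt_to_income']) +
    0.1 * zscore(df['payment_history']) +  # Better payment history raises approval odds
    0.1 * (df['gender'] == 'male').astype(int)  # Introduce gender bias
)
prob = 1 / (1 + np.exp(-logit))
df['approved'] = (prob > 0.5).astype(int)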

# Split data
X = df.drop('approved', axis=1)
y = df['approved']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

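# Scale numerical features. 'age' is left in raw units so that the
# privileged-group value used for it later (35) refers to actual years.
scaler = StandardScaler()
numerical_features = ['income', 'employment_length', 'debt_to_income',
                      'credit_score', 'payment_history', 'loan_amount']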
X_train_scaled = X_train.copy()
X_test_scaled = X_test.copy()
X_train_scaled[numerical_features] = scaler.fit_transform(X_train[numerical_features])
X_test_scaled[numerical_features] = scaler.transform(X_test[numerical_features])

print("Dataset shape:", df.shape)
print("\nSample of the data:")
display(df.head())
print("\nClass distribution:")
display(df['approved'].value_counts(normalize=True))
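
Before training, it can help to confirm that the injected bias is actually visible in the labels. The sketch below compares approval rates across groups. The helper `approval_rate_by_group` and the toy frame are illustrative names, not part of the toolkit; in this notebook you would pass `df`, `'gender'`, and `'approved'` instead.

```python
import pandas as pd

def approval_rate_by_group(frame, group_col, target_col):
    """Return the mean of a binary target for each group (hypothetical helper)."""
    return frame.groupby(group_col)[target_col].mean()

# Toy data standing in for the notebook's `df`
toy = pd.DataFrame({
    'gender': ['male'] * 4 + ['female'] * 4,
    'approved': [1, 1, 1, 0, 1, 0, 0, 1],
})

rates = approval_rate_by_group(toy, 'gender', 'approved')
gap = rates['male'] - rates['female']
print(rates)
print(f"Approval-rate gap (male - female): {gap:.2f}")
```

A gap near zero would suggest the bias term is too weak to show up in the labels, in which case the later metrics have nothing to detect.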

## 2. Train Credit Scoring Model

We'll use a Random Forest classifier as our credit scoring model.

In [None]:
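# Train model. RandomForestClassifier cannot consume the string-valued 'gender'
# column, so encode it inside a pipeline; the DataFrame keeps its labels.
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

model = make_pipeline(
    make_column_transformer((OneHotEncoder(), ['gender']), remainder='passthrough'),
    RandomForestClassifier(n_estimators=100, random_state=42),
)
model.fit(X_train_scaled, y_train)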

# Evaluate model
train_score = model.score(X_train_scaled, y_train)
test_score = model.score(X_test_scaled, y_test)

print(f"Training accuracy: {train_score:.3f}")
print(f"Test accuracy: {test_score:.3f}")

## 3. Bias Detection Analysis

Now we'll use our BiasDetector to analyze potential bias in the model.

In [None]:
# Initialize bias detector
protected_attributes = ['gender', 'age']
privileged_groups = {
    'gender': 'male',
    'age': 35  # Using mean age as privileged value
}

detector = BiasDetector(
    model=model,
    protected_attributes=protected_attributes,
    privileged_groups=privileged_groups
)

# Generate comprehensive bias report
report = detector.generate_bias_report(X_test_scaled, y_test)

# Display report
print("Bias Analysis Report:")
print("\n1. Disparate Impact Ratios:")
for attr, ratio in report['disparate_impact'].items():
    print(f"{attr}: {ratio:.3f}")
    # Four-fifths rule applied symmetrically: 1 / 0.8 = 1.25
    print(f"Interpretation: {'Fair' if 0.8 <= ratio <= 1.25 else 'Potential bias detected'}")

print("\n2. Demographic Parity Differences:")
for attr, diff in report['demographic_parity'].items():
    print(f"{attr}: {diff:.3f}")
    print(f"Interpretation: {'Fair' if abs(diff) < 0.1 else 'Potential bias detected'}")

print("\n3. Equal Opportunity Differences:")
for attr, diff in report['equal_opportunity'].items():
    print(f"{attr}: {diff:.3f}")
    print(f"Interpretation: {'Fair' if abs(diff) < 0.1 else 'Potential bias detected'}")

print("\n4. Feature Correlations with Protected Attributes:")
for attr, correlations in report['feature_correlations'].items():
    if correlations:
        print(f"\n{attr} correlations:")
        for feature, corr in correlations.items():
            print(f"{feature}: {corr:.3f}")
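
To make the first three metrics in the report concrete, here is a minimal sketch of how they are conventionally defined for a binary classifier and a single protected attribute. It assumes the standard textbook definitions; `BiasDetector` may compute them differently (for example, with a different sign convention), and the function names below are illustrative only.

```python
import numpy as np

def selection_rate(y_pred, mask):
    """Share of positive predictions within the masked group."""
    return float(np.mean(y_pred[mask]))

def true_positive_rate(y_true, y_pred, mask):
    """Share of actual positives in the masked group that were predicted positive."""
    positives = mask & (y_true == 1)
    return float(np.mean(y_pred[positives]))

def fairness_metrics(y_true, y_pred, group, privileged):
    """Three common group-fairness metrics (illustrative, not BiasDetector's code)."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    priv = group == privileged
    unpriv = ~priv
    sr_priv = selection_rate(y_pred, priv)
    sr_unpriv = selection_rate(y_pred, unpriv)
    tpr_priv = true_positive_rate(y_true, y_pred, priv)
    tpr_unpriv = true_positive_rate(y_true, y_pred, unpriv)
    return {
        # Ratio of selection rates: unprivileged over privileged
        'disparate_impact': sr_unpriv / sr_priv,
        # Difference of selection rates: privileged minus unprivileged
        'demographic_parity_diff': sr_priv - sr_unpriv,
        # Difference of true positive rates: privileged minus unprivileged
        'equal_opportunity_diff': tpr_priv - tpr_unpriv,
    }

# Toy labels and predictions for eight applicants
y_true = np.array([1, 1, 0, 1, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
group = np.array(['male'] * 4 + ['female'] * 4)

metrics = fairness_metrics(y_true, y_pred, group, privileged='male')
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Disparate impact is a ratio, so its neutral value is 1; the two difference metrics are neutral at 0, which is why the report applies different thresholds to them.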

## 4. Visualize Bias Metrics

Let's visualize the bias metrics to better understand the model's behavior.

In [None]:
# Plot all bias metrics
detector.plot_bias_report(report)
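
If you want a custom chart rather than the built-in one, a report shaped like the one printed earlier (a dict of metric name to a dict of attribute to value) can be plotted directly. This is a rough sketch: `plot_metric_bars` is a hypothetical helper, and the numbers in `example_report` are made up solely to exercise it.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit inside a notebook
import matplotlib.pyplot as plt

def plot_metric_bars(report, metric, reference, title):
    """Bar chart of one metric per protected attribute, with a reference line."""
    values = report[metric]
    fig, ax = plt.subplots(figsize=(5, 3))
    ax.bar(list(values.keys()), list(values.values()), color='steelblue')
    ax.axhline(reference, color='red', linestyle='--',
               label=f'reference = {reference}')
    ax.set_ylabel(metric)
    ax.set_title(title)
    ax.legend()
    fig.tight_layout()
    return fig, ax

# Made-up values, not results from this notebook
example_report = {'disparate_impact': {'gender': 0.76, 'age': 0.95}}
fig, ax = plot_metric_bars(
    example_report, 'disparate_impact', 0.8, 'Disparate impact by attribute'
)
```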

## 5. Mitigation Recommendations

Based on the bias analysis, here are some recommendations for mitigating bias:

1. **Data Collection and Preprocessing**
   - Ensure balanced representation of protected groups in training data
   - Consider removing or transforming features that strongly correlate with protected attributes

2. **Model Training**
   - Use fairness-aware algorithms or constraints during training
   - Consider reweighting samples to balance protected groups

3. **Post-processing**
   - Implement threshold adjustment for different protected groups
   - Calibrate predicted probabilities within each group so that a given score carries the same meaning across groups

4. **Monitoring and Maintenance**
   - Regularly monitor bias metrics on new data
   - Implement automated bias detection in production
   - Maintain documentation of bias mitigation efforts
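
Two of the recommendations above, sample reweighting and per-group threshold adjustment, can be sketched in a few lines. The reweighting follows the commonly used "reweighing" scheme, where each sample gets the weight P(group) * P(label) / P(group, label); the threshold helper simply picks a score quantile per group. Both function names are hypothetical and neither is part of the toolkit.

```python
import numpy as np
import pandas as pd

def reweighing_weights(group, label):
    """Weights w(g, y) = P(g) * P(y) / P(g, y), which make group and label
    statistically independent in the weighted training set."""
    frame = pd.DataFrame({'g': np.asarray(group), 'y': np.asarray(label)})
    p_g = frame['g'].map(frame['g'].value_counts(normalize=True))
    p_y = frame['y'].map(frame['y'].value_counts(normalize=True))
    p_gy = frame.groupby(['g', 'y'])['y'].transform('count') / len(frame)
    return (p_g * p_y / p_gy).to_numpy()

def group_thresholds(scores, group, target_rate):
    """Per-group score cut-offs that select roughly `target_rate` of each group."""
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(group)
    return {g: float(np.quantile(scores[group == g], 1 - target_rate))
            for g in np.unique(group)}

# Toy data: males are approved more often than females
group = np.array(['male'] * 4 + ['female'] * 4)
label = np.array([1, 1, 1, 0, 1, 0, 0, 0])
weights = reweighing_weights(group, label)

# Weighted approval rate per group after reweighing
for g in ('male', 'female'):
    mask = group == g
    rate = np.sum(weights[mask] * label[mask]) / np.sum(weights[mask])
    print(f"{g}: weighted approval rate = {rate:.2f}")

# Toy scores: pick cut-offs that approve about half of each group
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])
thresholds = group_thresholds(scores, group, target_rate=0.5)
print(thresholds)
```

The weights can be passed through the `sample_weight` argument that scikit-learn's `RandomForestClassifier.fit` accepts. Group-specific thresholds change who is approved at decision time, so they should be reviewed against the legal and policy constraints that apply to the lending context.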