# AI Governance: Practical Examples

**Session 3: Responsible AI Implementation**

This notebook demonstrates three core concepts in AI governance:
1. **Fairness Evaluation** - Detecting and measuring bias in ML models
2. **Bias Mitigation** - Techniques to reduce unfairness
3. **Model Monitoring** - Tracking governance metrics in production

## Introduction

AI governance aims to keep ML systems fair, transparent, accountable, and safe. In this notebook, you'll build practical skills for:

- Evaluating fairness across demographic groups
- Detecting bias in model predictions
- Implementing monitoring dashboards

We'll use a **credit scoring scenario** as our working example.

---
## Example 1: Fairness Evaluation

**Objective**: Measure whether a credit scoring model treats different demographic groups fairly.

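Three fairness metrics are relevant here. The code below computes demographic parity and a simplified, TPR-only version of equalized odds:
- **Demographic Parity**: Equal approval rates across groups
- **Equalized Odds**: Equal true positive rate (TPR) and false positive rate (FPR) across groups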
- **Predictive Parity**: Equal precision across groups
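
In symbols, with $\hat{Y}$ the model's decision, $Y$ the true outcome, and $A$ the sensitive attribute (notation assumed here for illustration, not taken from the notebook), the three criteria can be stated for any two groups $a$ and $b$ roughly as:

```latex
\begin{aligned}
\text{Demographic parity:}\quad & P(\hat{Y}=1 \mid A=a) = P(\hat{Y}=1 \mid A=b) \\
\text{Equalized odds:}\quad & P(\hat{Y}=1 \mid Y=y,\, A=a) = P(\hat{Y}=1 \mid Y=y,\, A=b), \quad y \in \{0, 1\} \\
\text{Predictive parity:}\quad & P(Y=1 \mid \hat{Y}=1,\, A=a) = P(Y=1 \mid \hat{Y}=1,\, A=b)
\end{aligned}
```

Setting $y=1$ in the equalized odds condition gives equal TPR, and $y=0$ gives equal FPR. Predictive parity is the same as equal precision across groups.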

In [None]:
# Install required libraries
!pip install -q scikit-learn pandas numpy matplotlib

import pandas as pd
import numpy as np
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt

# Simulate credit scoring data
np.random.seed(42)
n_samples = 1000

# Create synthetic dataset
data = {
    'income': np.random.normal(50000, 20000, n_samples),
    'credit_history': np.random.uniform(300, 850, n_samples),
    'age': np.random.randint(22, 70, n_samples),
    'gender': np.random.choice(['M', 'F'], n_samples),
    'approved': np.random.choice([0, 1], n_samples, p=[0.4, 0.6])
}

# Introduce bias: males get higher approval at same credit score
for i in range(n_samples):
    if data['gender'][i] == 'M' and data['credit_history'][i] > 600:
        data['approved'][i] = 1 if np.random.random() > 0.2 else 0
    elif data['gender'][i] == 'F' and data['credit_history'][i] > 600:
        data['approved'][i] = 1 if np.random.random() > 0.4 else 0

df = pd.DataFrame(data)
print("Dataset shape:", df.shape)
print("\nFirst few rows:")
print(df.head())
print("\nApproval rate by gender:")
print(df.groupby('gender')['approved'].mean())

In [None]:
def evaluate_fairness(df, sensitive_attr='gender'):
    """
    Calculate fairness metrics across demographic groups.
    """
    results = {}
    
    # 1. DEMOGRAPHIC PARITY
    # Equal approval rates across groups
    approval_rates = df.groupby(sensitive_attr)['approved'].mean()
    results['approval_rates'] = approval_rates
    results['demographic_parity_diff'] = approval_rates.max() - approval_rates.min()
    
    # 2. EQUALIZED ODDS (simplified: just TPR for this demo)
    for group in df[sensitive_attr].unique():
        group_data = df[df[sensitive_attr] == group]
        # Simulate ground truth (in real scenario, you'd have actual labels)
        # For demo: assume approval correlates with credit history
        y_true = (group_data['credit_history'] > 650).astype(int)
        y_pred = group_data['approved']
        
        if len(y_true) > 0 and y_true.sum() > 0:
            tpr = ((y_pred == 1) & (y_true == 1)).sum() / y_true.sum()
            results[f'TPR_{group}'] = tpr
    
    return results

# Evaluate fairness
fairness_results = evaluate_fairness(df)

print("="*50)
print("FAIRNESS EVALUATION RESULTS")
print("="*50)
for metric, value in fairness_results.items():
    if isinstance(value, (int, float)):
        print(f"{metric}: {value:.3f}")
    else:
        print(f"{metric}:")
        print(value)

# Visualize approval rates
plt.figure(figsize=(8, 5))
fairness_results['approval_rates'].plot(kind='bar', color=['#FF6B6B', '#4ECDC4'])
plt.title('Approval Rates by Gender', fontsize=14, fontweight='bold')
plt.ylabel('Approval Rate')
plt.xlabel('Gender')
plt.xticks(rotation=0)
plt.axhline(y=df['approved'].mean(), color='gray', linestyle='--', label='Overall Rate')
plt.legend()
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

# Interpretation
print("\n📊 INTERPRETATION:")
parity_diff = fairness_results['demographic_parity_diff']
if parity_diff < 0.1:
    print(f"✅ Demographic parity difference ({parity_diff:.3f}) is acceptable (< 0.1)")
else:
    print(f"⚠️ Demographic parity difference ({parity_diff:.3f}) exceeds threshold!")
    print("   This indicates potential bias in approval rates.")
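
The evaluation above reports only TPR. As a minimal sketch of the remaining pieces, the helper below computes TPR, FPR, and precision per group, which covers full equalized odds and predictive parity. The function name `group_rates` and the toy arrays are hypothetical. In the notebook you could pass the same proxy ground truth used above (`df['credit_history'] > 650`) as `y_true`.

```python
import numpy as np

def group_rates(y_true, y_pred, groups):
    """Return {group: {'tpr', 'fpr', 'precision'}} for binary labels and predictions."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    out = {}
    for g in np.unique(groups):
        mask = groups == g
        t, p = y_true[mask], y_pred[mask]
        # Confusion-matrix counts for this group
        tp = int(np.sum((t == 1) & (p == 1)))
        fp = int(np.sum((t == 0) & (p == 1)))
        fn = int(np.sum((t == 1) & (p == 0)))
        tn = int(np.sum((t == 0) & (p == 0)))
        out[str(g)] = {
            'tpr': tp / (tp + fn) if (tp + fn) else float('nan'),
            'fpr': fp / (fp + tn) if (fp + tn) else float('nan'),
            'precision': tp / (tp + fp) if (tp + fp) else float('nan'),
        }
    return out

# Toy data (hypothetical): four applicants per group
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ['M', 'M', 'M', 'M', 'F', 'F', 'F', 'F']

rates = group_rates(y_true, y_pred, groups)
for g, r in rates.items():
    print(g, {k: round(v, 3) for k, v in r.items()})
```

Large gaps in `tpr` or `fpr` between groups point to an equalized odds violation, and gaps in `precision` point to a predictive parity violation.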

---
## Example 2: Bias Mitigation

**Objective**: Apply a simple bias mitigation technique - threshold adjustment.

When we detect unfairness, common mitigation options include:
1. **Reweighting**: Adjust training data weights
2. **Threshold Optimization**: Use different decision thresholds per group
3. **Adversarial Debiasing**: Train model to be fair
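
To make the first option concrete, here is a small sketch of one common weighting scheme, often called reweighing: each (group, label) cell gets the weight P(group) × P(label) / P(group, label), so that group and label look statistically independent in the weighted data. The function name and toy arrays are hypothetical. In a real pipeline the weights would typically be passed as `sample_weight` when fitting a model.

```python
import numpy as np

def reweighing_weights(groups, labels):
    """Weight each sample by expected / observed frequency of its (group, label) cell."""
    groups, labels = np.asarray(groups), np.asarray(labels)
    weights = np.ones(len(labels), dtype=float)
    for g in np.unique(groups):
        for y in np.unique(labels):
            cell = (groups == g) & (labels == y)
            if cell.any():
                # Frequency the cell would have if group and label were independent
                expected = (groups == g).mean() * (labels == y).mean()
                weights[cell] = expected / cell.mean()
    return weights

# Toy data (hypothetical): group M is approved 4/5 times, group F 2/5 times
groups = np.array(['M'] * 5 + ['F'] * 5)
labels = np.array([1, 1, 1, 1, 0, 1, 1, 0, 0, 0])

w = reweighing_weights(groups, labels)
weighted_rates = {
    g: float(np.average(labels[groups == g], weights=w[groups == g]))
    for g in ('M', 'F')
}
print(weighted_rates)
```

After weighting, both groups have the same weighted approval rate, equal to the overall rate in the data.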

We'll demonstrate **threshold adjustment** as it's simple yet effective.
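
When a model outputs probability scores, threshold adjustment can be sketched as choosing a separate cutoff per group so that each group is approved at about the same rate. The example below is an illustration under assumptions: the scores are synthetic, and `per_group_thresholds`, `apply_thresholds`, and `target_rate` are hypothetical names that do not appear in the notebook.

```python
import numpy as np

def per_group_thresholds(scores, groups, target_rate):
    """Pick, for each group, the score cutoff that approves about target_rate of it."""
    scores, groups = np.asarray(scores, dtype=float), np.asarray(groups)
    thresholds = {}
    for g in np.unique(groups):
        # The (1 - target_rate) quantile leaves roughly target_rate of scores above it
        thresholds[str(g)] = float(np.quantile(scores[groups == g], 1 - target_rate))
    return thresholds

def apply_thresholds(scores, groups, thresholds):
    """Approve (1) every applicant whose score exceeds their group's cutoff."""
    scores, groups = np.asarray(scores, dtype=float), np.asarray(groups)
    cutoffs = np.array([thresholds[str(g)] for g in groups])
    return (scores > cutoffs).astype(int)

rng = np.random.default_rng(0)
groups = np.array(['M'] * 500 + ['F'] * 500)
# Hypothetical model scores: group F is scored lower on average
scores = np.concatenate([rng.normal(0.60, 0.15, 500), rng.normal(0.50, 0.15, 500)])

single = (scores > 0.55).astype(int)  # one cutoff for everyone
thr = per_group_thresholds(scores, groups, target_rate=0.5)
adjusted = apply_thresholds(scores, groups, thr)

for name, decisions in [('single threshold', single), ('per-group thresholds', adjusted)]:
    rates = {g: round(float(decisions[groups == g].mean()), 3) for g in ('M', 'F')}
    print(name, rates)
```

Equalizing approval rates this way can shift error rates between groups, so it is worth re-checking TPR and FPR after the adjustment.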

In [None]:
# Bias Mitigation: Threshold Adjustment
def adjust_thresholds(df, sensitive_attr='gender'):
    """
    Simulate a per-group threshold adjustment by approving additional
    applicants in the disadvantaged group.
    """
    # Calculate current approval rates
    overall_rate = df['approved'].mean()
    
    print(f"Overall approval rate: {overall_rate:.3f}")
    print("\nCurrent approval rates by group:")
    for group in df[sensitive_attr].unique():
        group_rate = df[df[sensitive_attr] == group]['approved'].mean()
        print(f"  {group}: {group_rate:.3f}")
    
    # Simulate threshold adjustment.
    # In practice you would lower the decision threshold on the model's
    # probability scores. This demo has no scores, so it approves a random
    # share of the disadvantaged group's applicants instead.
    df_adjusted = df.copy()
    df_adjusted['approved_adjusted'] = df_adjusted['approved']
    
    disadvantaged = 'F'  # Based on the analysis in Example 1
    flip_rate = 0.15     # Share of the group's cases set to approved
    
    # Pick ~15% of the disadvantaged group's rows at random and approve them;
    # all other rows keep their original decision
    is_disadvantaged = df_adjusted[sensitive_attr] == disadvantaged
    flip = is_disadvantaged & (np.random.random(len(df_adjusted)) < flip_rate)
    df_adjusted.loc[flip, 'approved_adjusted'] = 1
    
    return df_adjusted

# Apply mitigation
df_mitigated = adjust_thresholds(df)

# Re-evaluate fairness
print("\n" + "="*50)
print("AFTER BIAS MITIGATION")
print("="*50)

approval_rates_after = df_mitigated.groupby('gender')['approved_adjusted'].mean()
parity_diff_after = approval_rates_after.max() - approval_rates_after.min()

print(f"\nNew approval rates:")
for group, rate in approval_rates_after.items():
    print(f"  {group}: {rate:.3f}")
print(f"\nDemographic parity difference: {parity_diff_after:.3f}")

# Visualize before/after
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Before
fairness_results['approval_rates'].plot(kind='bar', ax=ax1, color=['#FF6B6B', '#4ECDC4'])
ax1.set_title('Before Mitigation', fontsize=12, fontweight='bold')
ax1.set_ylabel('Approval Rate')
ax1.set_xlabel('Gender')
ax1.set_ylim(0, 1)
ax1.grid(axis='y', alpha=0.3)
ax1.set_xticklabels(ax1.get_xticklabels(), rotation=0)

# After
approval_rates_after.plot(kind='bar', ax=ax2, color=['#95E1D3', '#FFB6B9'])
ax2.set_title('After Mitigation', fontsize=12, fontweight='bold')
ax2.set_ylabel('Approval Rate')
ax2.set_xlabel('Gender')
ax2.set_ylim(0, 1)
ax2.grid(axis='y', alpha=0.3)
ax2.set_xticklabels(ax2.get_xticklabels(), rotation=0)

plt.tight_layout()
plt.show()

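parity_diff_before = fairness_results['demographic_parity_diff']
if parity_diff_after < parity_diff_before:
    print(f"\n✅ Parity difference reduced from {parity_diff_before:.3f} to {parity_diff_after:.3f}")
else:
    print(f"\n⚠️ Parity difference did not improve: {parity_diff_before:.3f} -> {parity_diff_after:.3f}")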

---
## Example 3: Model Monitoring Dashboard

**Objective**: Build a simple monitoring system to track governance metrics over time.

In production, you must continuously monitor:
- Performance metrics (accuracy, precision)
- Fairness metrics (demographic parity)
- Data drift (distribution changes)
- Security alerts (adversarial attacks)
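
For the data drift item, one widely used score is the Population Stability Index (PSI), which compares the binned distribution of a feature in production against a baseline. The sketch below is a minimal version under assumptions: the function name, the quantile-based binning, and the synthetic income samples are illustrative choices, not part of the notebook.

```python
import numpy as np

def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Bin edges come from baseline quantiles, so each bin holds about equal baseline mass
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))
    inner = edges[1:-1]  # values outside the baseline range fall into the first or last bin
    e_idx = np.searchsorted(inner, expected, side='right')
    a_idx = np.searchsorted(inner, actual, side='right')
    e_frac = np.bincount(e_idx, minlength=n_bins) / len(expected)
    a_frac = np.bincount(a_idx, minlength=n_bins) / len(actual)
    # Clip to avoid log(0) and division by zero for empty bins
    e_frac = np.clip(e_frac, eps, None)
    a_frac = np.clip(a_frac, eps, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(42)
baseline = rng.normal(50000, 20000, 5000)  # e.g. income at training time
same = rng.normal(50000, 20000, 5000)      # production sample, no shift
shifted = rng.normal(60000, 20000, 5000)   # production sample, mean shifted upward

psi_same = population_stability_index(baseline, same)
psi_shifted = population_stability_index(baseline, shifted)
print(f"PSI, no shift: {psi_same:.4f}")
print(f"PSI, shifted:  {psi_shifted:.4f}")
```

A PSI near zero means the production sample looks like the baseline, and larger values indicate a bigger distribution change.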

In [None]:
# Simulate production monitoring data
import pandas as pd
from datetime import datetime, timedelta

# Simulate 30 days of monitoring data, oldest day first
end_date = datetime.now()
dates = [end_date - timedelta(days=x) for x in range(30, 0, -1)]
monitoring_data = []

for day, date in enumerate(dates, start=1):
    # Degradation grows from ~0 on the oldest day to 1 on the most recent day,
    # so the simulated metrics worsen over time
    degradation = day / len(dates)
    
    monitoring_data.append({
        'date': date,
        'accuracy': 0.85 + np.random.normal(0, 0.02) - (0.01 * degradation),
        'demographic_parity_diff': 0.08 + np.random.normal(0, 0.01) + (0.05 * degradation),
        'prediction_latency_ms': 45 + np.random.normal(0, 5) + (10 * degradation),
        'throughput_qps': 1000 + np.random.normal(0, 50),
        'drift_score': 0.15 + (0.3 * degradation) + np.random.normal(0, 0.02)
    })

monitor_df = pd.DataFrame(monitoring_data)

print("Monitoring Data (Last 5 days):")
print(monitor_df.tail())

In [None]:
# Create monitoring dashboard
fig, axes = plt.subplots(2, 2, figsize=(16, 10))
fig.suptitle('AI Governance Monitoring Dashboard', fontsize=16, fontweight='bold')

# 1. Accuracy over time
axes[0, 0].plot(monitor_df['date'], monitor_df['accuracy'], 
                marker='o', linewidth=2, color='#4ECDC4')
axes[0, 0].axhline(y=0.85, color='green', linestyle='--', label='Baseline')
axes[0, 0].axhline(y=0.80, color='red', linestyle='--', label='Alert Threshold')
axes[0, 0].set_title('Model Accuracy', fontweight='bold')
axes[0, 0].set_ylabel('Accuracy')
axes[0, 0].legend()
axes[0, 0].grid(alpha=0.3)
axes[0, 0].tick_params(axis='x', rotation=45)

# 2. Fairness metrics
axes[0, 1].plot(monitor_df['date'], monitor_df['demographic_parity_diff'], 
                marker='s', linewidth=2, color='#FF6B6B')
axes[0, 1].axhline(y=0.1, color='orange', linestyle='--', label='Threshold')
axes[0, 1].axhline(y=0.2, color='red', linestyle='--', label='Critical')
axes[0, 1].set_title('Fairness: Demographic Parity Difference', fontweight='bold')
axes[0, 1].set_ylabel('Parity Difference')
axes[0, 1].legend()
axes[0, 1].grid(alpha=0.3)
axes[0, 1].tick_params(axis='x', rotation=45)

# 3. Latency
axes[1, 0].plot(monitor_df['date'], monitor_df['prediction_latency_ms'], 
                marker='^', linewidth=2, color='#95E1D3')
axes[1, 0].axhline(y=50, color='orange', linestyle='--', label='SLA Limit')
axes[1, 0].set_title('Prediction Latency', fontweight='bold')
axes[1, 0].set_ylabel('Latency (ms)')
axes[1, 0].set_xlabel('Date')
axes[1, 0].legend()
axes[1, 0].grid(alpha=0.3)
axes[1, 0].tick_params(axis='x', rotation=45)

# 4. Drift detection
axes[1, 1].plot(monitor_df['date'], monitor_df['drift_score'], 
                marker='D', linewidth=2, color='#FFB6B9')
axes[1, 1].axhline(y=0.2, color='orange', linestyle='--', label='Warning')
axes[1, 1].axhline(y=0.4, color='red', linestyle='--', label='Alert')
axes[1, 1].set_title('Data Drift Score', fontweight='bold')
axes[1, 1].set_ylabel('Drift Score (PSI)')
axes[1, 1].set_xlabel('Date')
axes[1, 1].legend()
axes[1, 1].grid(alpha=0.3)
axes[1, 1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

# Alert system
print("\n" + "="*50)
print("ALERT SUMMARY")
print("="*50)

alerts = []
if monitor_df['accuracy'].iloc[-1] < 0.80:
    alerts.append("🚨 CRITICAL: Accuracy below threshold!")
if monitor_df['demographic_parity_diff'].iloc[-1] > 0.2:
    alerts.append("🚨 CRITICAL: Fairness violation detected!")
if monitor_df['drift_score'].iloc[-1] > 0.4:
    alerts.append("⚠️ WARNING: Significant data drift detected!")
if monitor_df['prediction_latency_ms'].iloc[-1] > 50:
    alerts.append("⚠️ WARNING: Latency exceeds SLA!")

if alerts:
    for alert in alerts:
        print(alert)
else:
    print("✅ All metrics within acceptable ranges")

# Recommendations
print("\n📋 RECOMMENDATIONS:")
if monitor_df['drift_score'].iloc[-1] > 0.3:
    print("• Consider retraining model on recent data")
if monitor_df['demographic_parity_diff'].iloc[-1] > 0.15:
    print("• Review and adjust fairness mitigation strategies")
if monitor_df['accuracy'].iloc[-1] < monitor_df['accuracy'].iloc[0] - 0.05:
    print("• Investigate performance degradation root cause")
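
The checks above look only at the most recent day, so a single noisy reading can trigger or hide an alert. One possible refinement, sketched here under assumptions, is to keep the thresholds in a rule table and compare them against a short rolling mean. `RULES`, `check_alerts`, and the sample `history` values are hypothetical. The thresholds mirror the ones used in the alert cell above.

```python
import numpy as np

# Hypothetical rule table: metric -> (direction, threshold, message)
RULES = {
    'accuracy': ('below', 0.80, 'Accuracy below threshold'),
    'demographic_parity_diff': ('above', 0.20, 'Fairness violation'),
    'drift_score': ('above', 0.40, 'Significant data drift'),
    'prediction_latency_ms': ('above', 50, 'Latency exceeds SLA'),
}

def check_alerts(history, rules=RULES, window=3):
    """history maps metric name -> sequence of daily values, oldest first."""
    alerts = []
    for metric, (direction, threshold, message) in rules.items():
        values = np.asarray(history.get(metric, []), dtype=float)
        if values.size == 0:
            continue  # metric not reported; nothing to check
        recent = values[-window:].mean()  # mean of the last `window` readings
        breached = recent < threshold if direction == 'below' else recent > threshold
        if breached:
            alerts.append(f"{metric}: {message} (recent mean {recent:.3f})")
    return alerts

# Toy history (hypothetical): five days of readings per metric
history = {
    'accuracy': [0.86, 0.84, 0.79, 0.78, 0.77],
    'demographic_parity_diff': [0.08, 0.09, 0.25, 0.10, 0.09],
    'drift_score': [0.15, 0.20, 0.30, 0.45, 0.50],
    'prediction_latency_ms': [44, 46, 47, 48, 49],
}

alerts = check_alerts(history)
for alert in alerts:
    print(alert)
```

With the notebook's data, `monitor_df.to_dict('list')` would produce a dictionary in the shape `check_alerts` expects. The one-day fairness spike in the toy history does not fire an alert because the three-day mean stays under the threshold, which is the trade-off this design makes: fewer false alarms in exchange for slower detection.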

---
## Summary & Key Takeaways

In this notebook, you learned to:

✅ **Evaluate fairness** using demographic parity and a simplified equalized odds check (TPR by group)  
✅ **Detect bias** in ML models across demographic groups  
✅ **Mitigate unfairness** using threshold adjustment techniques  
✅ **Monitor governance metrics** in production systems with dashboards  

### Next Steps

1. **Practice**: Apply these techniques to your own datasets
2. **Explore**: Try other fairness libraries (AI Fairness 360, Fairlearn)
3. **Extend**: Build automated alerting systems for governance violations
4. **Study**: Review the full architecture documentation for deeper concepts

### Additional Resources

- [AI Fairness 360](https://aif360.mybluemix.net/) - IBM's fairness toolkit
- [Fairlearn](https://fairlearn.org/) - Open-source fairness library, originally developed at Microsoft
- [NIST AI RMF](https://www.nist.gov/itl/ai-risk-management-framework) - Risk management framework
- [EU AI Act](https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai) - Regulatory guidance

---
**Session 3: AI Governance** | Built with ❤️ for responsible AI