# Ethics and Bias in Machine Learning - Data Science Koans

Welcome to Notebook 15: Ethics and Bias - The Final Frontier!

## What You Will Learn
- Measuring fairness with quantitative metrics  
- Detecting bias in model predictions and training data
- Implementing bias mitigation techniques
- Model interpretability and explainability methods
- Creating responsible ML checklists and governance

## Why This Matters More Than Ever
As ML systems increasingly impact human lives, ethical considerations become paramount:
- **Legal Compliance**: Avoid discrimination lawsuits and regulatory violations
- **Social Responsibility**: Ensure equitable outcomes across all groups  
- **Business Risk**: Prevent reputational damage from biased systems
- **Trust Building**: Maintain public confidence in AI systems
- **Fairness**: Uphold principles of justice and equality

## Key Ethical Challenges
- **Historical Bias**: Training data reflects past discrimination
- **Representation Bias**: Underrepresented groups in datasets
- **Measurement Bias**: Features or labels measured differently across groups, including proxy variables that correlate with protected attributes
- **Algorithmic Amplification**: ML systems that exacerbate existing inequalities
- **Explainability**: Understanding why models make certain decisions

## Prerequisites
- Model Selection and Pipelines (Notebook 14)
- Understanding of classification metrics
- Awareness of social justice issues

## Critical Mindset
This notebook isn't just about technical implementation - it's about developing ethical reasoning skills that will guide your entire ML career. Every model you build affects real people.

## How to Use
1. Examine each ethical scenario with critical thinking
2. Implement bias detection and mitigation techniques
3. Practice explaining model decisions to diverse stakeholders  
4. Develop frameworks for ongoing ethical evaluation
5. Build habits of responsible ML development

Ready to become a responsible ML practitioner? Let's build ethical AI! 🧭✨

In [None]:
# Setup - Run first!
import sys
sys.path.append('../..')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score, 
                            confusion_matrix, classification_report)
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# For model interpretability
try:
    import shap
    SHAP_AVAILABLE = True
    print("✓ SHAP available for model interpretability")
except ImportError:
    SHAP_AVAILABLE = False
    print("⚠️ SHAP not available (install: pip install shap)")

try:
    from lime.lime_tabular import LimeTabularExplainer
    LIME_AVAILABLE = True
    print("✓ LIME available for model explanations")
except ImportError:
    LIME_AVAILABLE = False
    print("⚠️ LIME not available (install: pip install lime)")

from koans.core.validator import KoanValidator
from koans.core.progress import ProgressTracker

validator = KoanValidator("15_ethics_and_bias")
tracker = ProgressTracker()

print("Setup complete!")
print(f"Current progress: {tracker.get_notebook_progress('15_ethics_and_bias')}%")
print("\n🧭 Entering the realm of responsible AI...")

## KOAN 15.1: Fairness Metrics - Measuring Bias Quantitatively  
**Objective**: Calculate fairness metrics to detect bias across groups  
**Difficulty**: Advanced

Fairness can be measured in multiple ways, and different metrics may conflict. Understanding these trade-offs is crucial for building equitable ML systems.

**Key Concepts**: 
- **Demographic Parity**: Equal positive prediction rates across groups
- **Equal Opportunity**: Equal true positive rates across groups  
- **Equalized Odds**: Equal TPR and FPR across groups
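
The three definitions above can be made concrete with a small sketch. The helper name `group_rates` and the toy arrays are illustrative assumptions, not part of the koan. The data is built so that a perfectly accurate predictor satisfies equal opportunity and equalized odds yet still violates demographic parity, because the two groups have different base rates.

```python
import numpy as np

def group_rates(y_true, y_pred, group, value):
    """Selection rate, TPR and FPR for the rows where group == value.

    Assumes the group contains at least one positive and one negative label.
    """
    mask = group == value
    yt = y_true[mask].astype(bool)
    yp = y_pred[mask].astype(bool)
    selection_rate = yp.mean()   # P(Y_hat=1 | A=value)
    tpr = yp[yt].mean()          # P(Y_hat=1 | Y=1, A=value)
    fpr = yp[~yt].mean()         # P(Y_hat=1 | Y=0, A=value)
    return selection_rate, tpr, fpr

# Toy data: group 0 has a 60% base rate, group 1 has a 20% base rate
group = np.array([0] * 10 + [1] * 10)
y_true = np.array([1] * 6 + [0] * 4 + [1] * 2 + [0] * 8)
y_pred = y_true.copy()  # a perfectly accurate predictor

sel0, tpr0, fpr0 = group_rates(y_true, y_pred, group, 0)
sel1, tpr1, fpr1 = group_rates(y_true, y_pred, group, 1)

dp_diff = sel0 - sel1    # demographic parity difference
eo_diff = tpr0 - tpr1    # equal opportunity difference
fpr_diff = fpr0 - fpr1   # second half of equalized odds

print(f"Demographic parity diff: {dp_diff:+.2f}")
print(f"Equal opportunity diff:  {eo_diff:+.2f}")
print(f"FPR diff:                {fpr_diff:+.2f}")
```

Equalized odds holds only when both `eo_diff` and `fpr_diff` are close to zero. Here they are, while `dp_diff` is not, which is one way the metrics can disagree.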

In [None]:
def calculate_fairness_metrics():
    """
    Create a synthetic hiring dataset and calculate fairness metrics.
    
    We'll simulate a biased hiring algorithm and measure different fairness criteria.
    
    Returns:
        dict: Fairness metrics for different demographic groups
    """
    # Create synthetic hiring dataset
    np.random.seed(42)
    n_samples = 2000
    
    # Protected attribute: gender (0=male, 1=female)  
    gender = np.random.choice([0, 1], n_samples, p=[0.6, 0.4])
    
    # Create features for the hiring decision. Experience and education are drawn
    # independently of gender; only network_score (below) correlates with it
    experience = np.random.normal(5, 2, n_samples)  # Years of experience
    education = np.random.normal(3, 1, n_samples)   # Education score
    
    # Introduce bias: women tend to have slightly lower "network scores" 
    # (simulating old-boys-network effects)
    network_score = np.where(gender == 0, 
                            np.random.normal(7, 2, n_samples),  # Men
                            np.random.normal(6, 2, n_samples))  # Women
    
    X = np.column_stack([experience, education, network_score])
    
    # Create biased hiring decisions
    # True qualification should depend mainly on experience and education
    # (kept for reference; the model below is trained on the biased labels)
    true_qualification = (0.4 * experience + 0.6 * education + 
                         np.random.normal(0, 1, n_samples)) > 3.5
    
    # But hiring decisions are biased by network score and gender
    biased_score = (0.3 * experience + 0.4 * education + 0.3 * network_score - 
                   0.2 * gender + np.random.normal(0, 1, n_samples))
    y_biased = biased_score > 4.5
    
    # Split data
    X_train, X_test, y_train, y_test, gender_train, gender_test = train_test_split(
        X, y_biased, gender, test_size=0.3, random_state=42
    )
    
    # TODO: Train biased model
    model = None  # RandomForestClassifier(n_estimators=100, random_state=42)
    # model.fit(X_train, y_train)
    
    # TODO: Make predictions
    # y_pred = model.predict(X_test)
    y_pred = np.zeros_like(y_test)  # Placeholder
    
    # TODO: Calculate fairness metrics by gender
    # Group data by gender
    male_mask = gender_test == 0
    female_mask = gender_test == 1
    
    # Demographic Parity: P(Ŷ=1|A=0) vs P(Ŷ=1|A=1)
    male_positive_rate = None    # np.mean(y_pred[male_mask])
    female_positive_rate = None  # np.mean(y_pred[female_mask])
    demographic_parity_diff = None  # male_positive_rate - female_positive_rate
    
    # Equal Opportunity: TPR for each group
    male_tpr = None   # recall_score(y_test[male_mask], y_pred[male_mask])
    female_tpr = None # recall_score(y_test[female_mask], y_pred[female_mask])
    equal_opportunity_diff = None  # male_tpr - female_tpr
    
    # Overall accuracy by group
    male_accuracy = None   # accuracy_score(y_test[male_mask], y_pred[male_mask])
    female_accuracy = None # accuracy_score(y_test[female_mask], y_pred[female_mask])
    
    return {
        'model': model,
        'male_positive_rate': male_positive_rate or 0,
        'female_positive_rate': female_positive_rate or 0,
        'demographic_parity_diff': demographic_parity_diff or 0,
        'male_tpr': male_tpr or 0,
        'female_tpr': female_tpr or 0,
        'equal_opportunity_diff': equal_opportunity_diff or 0,
        'male_accuracy': male_accuracy or 0,
        'female_accuracy': female_accuracy or 0,
        'n_male': np.sum(male_mask),
        'n_female': np.sum(female_mask)
    }

@validator.koan(1, "Fairness Metrics - Measuring Bias Quantitatively", difficulty="Advanced")
def validate():
    results = calculate_fairness_metrics()
    
    assert results['model'] is not None, "Model not trained"
    assert results['n_male'] > 0 and results['n_female'] > 0, "Should have both gender groups"
    assert 0 <= results['male_positive_rate'] <= 1, "Male positive rate should be probability"
    assert 0 <= results['female_positive_rate'] <= 1, "Female positive rate should be probability"
    
    print("✓ Fairness metrics calculated successfully!")
    print(f"  - Test samples: {results['n_male']} male, {results['n_female']} female")
    
    print(f"\n  📊 Bias Detection Results:")
    print(f"    Demographic Parity:")
    print(f"      Male hiring rate: {results['male_positive_rate']:.3f}")
    print(f"      Female hiring rate: {results['female_positive_rate']:.3f}")
    print(f"      Difference: {results['demographic_parity_diff']:+.3f}")
    
    print(f"\n    Equal Opportunity:")
    print(f"      Male TPR (recall): {results['male_tpr']:.3f}")
    print(f"      Female TPR (recall): {results['female_tpr']:.3f}")
    print(f"      Difference: {results['equal_opportunity_diff']:+.3f}")
    
    print(f"\n    Overall Accuracy:")
    print(f"      Male accuracy: {results['male_accuracy']:.3f}")
    print(f"      Female accuracy: {results['female_accuracy']:.3f}")
    
    # Interpret results
    dp_threshold = 0.05  # 5 percentage point threshold for demographic parity
    eo_threshold = 0.05  # 5 percentage point threshold for equal opportunity
    
    if abs(results['demographic_parity_diff']) > dp_threshold:
        print(f"\n  🚨 Demographic Parity Violation Detected!")
        print(f"     Difference ({results['demographic_parity_diff']:+.3f}) exceeds threshold (±{dp_threshold})")
    else:
        print(f"\n  ✅ Demographic Parity: Within acceptable range")
        
    if abs(results['equal_opportunity_diff']) > eo_threshold:
        print(f"\n  🚨 Equal Opportunity Violation Detected!")  
        print(f"     Difference ({results['equal_opportunity_diff']:+.3f}) exceeds threshold (±{eo_threshold})")
    else:
        print(f"\n  ✅ Equal Opportunity: Within acceptable range")
    
    print(f"\n  💡 Understanding Fairness Metrics:")
    print(f"    • Demographic Parity: Equal selection rates")
    print(f"    • Equal Opportunity: Equal true positive rates")
    print(f"    • These metrics can conflict - choose based on context")
    print(f"    • Legal requirements vary by jurisdiction")
    
    print(f"\n  ⚖️ Fairness Trade-offs:")
    print(f"    • Perfect fairness across all metrics is often impossible")
    print(f"    • Business context determines which metric to prioritize")
    print(f"    • Regular monitoring is essential")

validate()

## KOAN 15.2: Bias Detection - Identifying Problematic Patterns
**Objective**: Systematically detect bias in datasets and model predictions  
**Difficulty**: Advanced

Bias can hide in datasets through historical discrimination, sampling issues, or proxy variables. Systematic detection is the first step toward fair ML.

**Key Concepts**: Examine data distributions, correlation with protected attributes, and model behavior across different groups.
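
A minimal sketch of one way to check for proxy variables: compare each feature's distribution across a protected group using a standardized mean difference. The feature names `zip_income` and `tenure`, the group labels, and the helper `standardized_mean_diff` are hypothetical and not part of the koan's dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
group = rng.choice(np.array(["A", "B"]), size=n, p=[0.7, 0.3])

# Hypothetical features: zip_income is shifted down for group B (a proxy),
# tenure is drawn independently of group membership
features = {
    "zip_income": rng.normal(50, 10, n) - np.where(group == "B", 8.0, 0.0),
    "tenure": rng.normal(5, 2, n),
}

def standardized_mean_diff(x, group, a, b):
    """(mean_a - mean_b) / pooled std; a larger |value| suggests a stronger proxy."""
    xa, xb = x[group == a], x[group == b]
    pooled_std = np.sqrt((xa.var(ddof=1) + xb.var(ddof=1)) / 2)
    return (xa.mean() - xb.mean()) / pooled_std

smd = {name: standardized_mean_diff(x, group, "A", "B")
       for name, x in features.items()}
share_b = np.mean(group == "B")  # representation of group B in the sample

print(f"Share of group B: {share_b:.2f}")
for name, value in smd.items():
    print(f"{name}: standardized mean difference = {value:+.2f}")
```

A feature with a large standardized difference can leak group membership into a model even when the protected attribute itself is excluded from training.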

In [None]:
def comprehensive_bias_detection():
    """
    Perform comprehensive bias detection on a loan approval dataset.
    
    We'll examine multiple sources of bias:
    1. Dataset representation bias
    2. Historical bias in labels  
    3. Proxy variable bias
    4. Model prediction bias
    
    Returns:
        dict: Comprehensive bias detection results
    """
    # Create synthetic loan dataset with known biases
    np.random.seed(42)
    n_samples = 3000
    
    # Protected attributes
    race = np.random.choice(['White', 'Black', 'Hispanic', 'Asian'], 
                           n_samples, p=[0.6, 0.15, 0.15, 0.1])
    age = np.random.normal(40, 12, n_samples)
    gender = np.random.choice(['Male', 'Female'], n_samples, p=[0.55, 0.45])
    
    # Create features with embedded bias
    income = np.random.lognormal(10.5, 0.8, n_samples)
    
    # Introduce racial bias in income (historical discrimination)
    race_multipliers = {'White': 1.0, 'Asian': 1.1, 'Black': 0.8, 'Hispanic': 0.85}
    for i, r in enumerate(race):
        income[i] *= race_multipliers[r]
    
    credit_score = np.random.normal(700, 100, n_samples)
    # Introduce bias - credit scores correlate with race due to systemic factors
    for i, r in enumerate(race):
        if r == 'Black':
            credit_score[i] -= 50
        elif r == 'Hispanic':
            credit_score[i] -= 30
    
    # Create biased loan approvals (historical bias)
    # Approval should depend on income and credit score, but historically was biased
    loan_worthiness = (0.3 * (income / 100000) + 0.7 * (credit_score / 700)) > 0.8
    
    # Historical bias in approvals
    approvals = loan_worthiness.copy()
    for i, r in enumerate(race):
        if r in ['Black', 'Hispanic'] and np.random.random() < 0.2:
            approvals[i] = False  # 20% additional rejection bias
    
    # Create dataset
    df = pd.DataFrame({
        'race': race,
        'age': age,
        'gender': gender,
        'income': income,
        'credit_score': credit_score,
        'approved': approvals
    })
    
    # TODO: Detect representation bias
    representation_analysis = {}
    for group in ['race', 'gender']:
        # Calculate group proportions
        proportions = None  # df[group].value_counts(normalize=True)
        representation_analysis[group] = proportions
    
    # TODO: Detect outcome bias by protected attributes  
    outcome_bias = {}
    for group in ['race', 'gender']:
        # Calculate approval rates by group
        approval_rates = None  # df.groupby(group)['approved'].mean()
        outcome_bias[group] = approval_rates
    
    # TODO: Train model and detect prediction bias
    # Prepare features: race and gender are excluded, but age (also listed as protected
    # above) is kept, and income and credit_score still act as proxies for race
    X = df[['age', 'income', 'credit_score']]
    y = df['approved']
    
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )
    
    # TODO: Train model
    model = None  # LogisticRegression(random_state=42)
    # model.fit(X_train, y_train)
    
    # TODO: Analyze model predictions by race
    df_test = df.loc[y_test.index]  # y_test is a Series, so select the test rows by index label
    # y_pred = model.predict(X_test) if model else np.zeros(len(y_test))
    y_pred = np.zeros(len(y_test))  # Placeholder
    
    df_test = df_test.copy()
    df_test['predicted'] = y_pred
    
    # TODO: Calculate prediction bias by race
    prediction_bias = {}
    for race_group in df_test['race'].unique():
        mask = df_test['race'] == race_group
        if np.sum(mask) > 0:
            pred_rate = None  # df_test[mask]['predicted'].mean()
            actual_rate = None  # df_test[mask]['approved'].mean()
            prediction_bias[race_group] = {
                'predicted_approval_rate': pred_rate or 0,
                'actual_approval_rate': actual_rate or 0
            }
    
    return {
        'representation_analysis': representation_analysis,
        'outcome_bias': outcome_bias,
        'prediction_bias': prediction_bias,
        'model': model,
        'dataset_size': len(df)
    }

@validator.koan(2, "Bias Detection - Identifying Problematic Patterns", difficulty="Advanced")
def validate():
    results = comprehensive_bias_detection()
    
    assert all(v is not None for v in results['representation_analysis'].values()), "Representation analysis not performed"
    assert all(v is not None for v in results['outcome_bias'].values()), "Outcome bias analysis not performed"
    assert results['dataset_size'] > 0, "Dataset not created properly"
    
    print("✓ Comprehensive bias detection completed!")
    print(f"  - Dataset size: {results['dataset_size']} samples")
    
    print(f"\n  📊 Representation Analysis:")
    if 'race' in results['representation_analysis']:
        race_dist = results['representation_analysis']['race']
        if race_dist is not None:
            for group, prop in race_dist.items():
                print(f"    {group}: {prop:.3f} ({prop*100:.1f}%)")
    
    print(f"\n  🎯 Historical Outcome Bias:")
    if 'race' in results['outcome_bias']:
        race_outcomes = results['outcome_bias']['race']
        if race_outcomes is not None:
            for group, rate in race_outcomes.items():
                print(f"    {group} approval rate: {rate:.3f}")
    
    print(f"\n  🤖 Model Prediction Bias:")
    for group, metrics in results['prediction_bias'].items():
        pred_rate = metrics['predicted_approval_rate']
        actual_rate = metrics['actual_approval_rate']
        print(f"    {group}:")
        print(f"      Predicted: {pred_rate:.3f}")
        print(f"      Actual: {actual_rate:.3f}")
        print(f"      Bias: {pred_rate - actual_rate:+.3f}")
    
    print(f"\n  🔍 Bias Detection Checklist:")
    print(f"    ✅ Representation bias analysis")
    print(f"    ✅ Historical outcome analysis")
    print(f"    ✅ Protected attribute correlation")
    print(f"    ✅ Model prediction fairness")
    
    print(f"\n  🚨 Common Bias Sources:")
    print(f"    • Historical discrimination in training data")
    print(f"    • Underrepresentation of minority groups")
    print(f"    • Proxy variables (zip code → race)")
    print(f"    • Sampling bias in data collection")
    print(f"    • Confirmation bias in labeling")
    
    print(f"\n  💡 Next Steps After Detection:")
    print(f"    • Quantify the severity of bias")
    print(f"    • Understand root causes")
    print(f"    • Implement mitigation strategies")
    print(f"    • Monitor ongoing bias")

validate()

## KOAN 15.3: Bias Mitigation - Reducing Unfair Discrimination
**Objective**: Implement techniques to reduce bias in ML models  
**Difficulty**: Advanced

Once bias is detected, several techniques can help mitigate it. Each approach has trade-offs between fairness, accuracy, and complexity.

**Key Techniques**: Data rebalancing, adversarial debiasing, fairness constraints, and threshold adjustment.
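
Threshold adjustment can be sketched as follows, assuming hypothetical model scores in which one group scores lower on average. Cutting each group at the same quantile of its own scores selects both groups at approximately the same rate. The names `selection_rates` and `target_rate` are illustrative, not part of the koan.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
group = rng.choice([0, 1], size=n, p=[0.6, 0.4])

# Hypothetical model scores: group 1 scores are shifted downward
scores = np.clip(rng.normal(0.6, 0.15, n) - 0.1 * group, 0, 1)

def selection_rates(pred, group):
    """Share of positive predictions per group."""
    return {int(g): pred[group == g].mean() for g in np.unique(group)}

# A single global threshold gives the groups different selection rates
single = scores >= 0.5

# Per-group thresholds at the same quantile of each group's own scores
target_rate = single.mean()  # keep the overall selection rate roughly unchanged
adjusted = np.zeros_like(single)
thresholds = {}
for g in np.unique(group):
    mask = group == g
    thresholds[int(g)] = np.quantile(scores[mask], 1 - target_rate)
    adjusted[mask] = scores[mask] >= thresholds[int(g)]

before = selection_rates(single, group)
after = selection_rates(adjusted, group)
gap_before = abs(before[0] - before[1])
gap_after = abs(after[0] - after[1])

print(f"Gap with one threshold:        {gap_before:.3f}")
print(f"Gap with per-group thresholds: {gap_after:.3f}")
```

This targets demographic parity only. It does not guarantee equal true positive rates, and heavily tied scores (common with tree ensembles) make the resulting rates less exact.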

In [None]:
def implement_bias_mitigation():
    """
    Implement multiple bias mitigation techniques and compare their effectiveness.
    
    Techniques:
    1. Data rebalancing (preprocessing)
    2. Threshold adjustment (postprocessing)
    3. Fairness-aware training (in-processing) is described in the summary but not implemented here
    
    Returns:
        dict: Results comparing different mitigation approaches
    """
    # Create a biased hiring dataset (similar to Koan 15.1, with different parameters)
    np.random.seed(42)
    n_samples = 2000
    
    # Protected attribute: gender
    gender = np.random.choice([0, 1], n_samples, p=[0.6, 0.4])  # 0=male, 1=female
    
    # Create features with bias
    experience = np.random.normal(5, 2, n_samples)
    education = np.random.normal(7, 1.5, n_samples)
    
    # Network score biased against women
    network_score = np.where(gender == 0,
                            np.random.normal(6, 1.5, n_samples),  # Men
                            np.random.normal(5, 1.5, n_samples))  # Women
    
    X = np.column_stack([experience, education, network_score])
    
    # Create biased hiring decisions
    hiring_score = (0.4 * experience + 0.4 * education + 0.2 * network_score - 
                   0.3 * gender + np.random.normal(0, 1, n_samples))
    y = hiring_score > 4.0
    
    X_train, X_test, y_train, y_test, gender_train, gender_test = train_test_split(
        X, y, gender, test_size=0.3, random_state=42
    )
    
    results = {}
    
    # 1. Baseline biased model
    baseline_model = RandomForestClassifier(n_estimators=100, random_state=42)
    baseline_model.fit(X_train, y_train)
    baseline_pred = baseline_model.predict(X_test)
    
    # Calculate baseline fairness
    male_mask = gender_test == 0
    female_mask = gender_test == 1
    baseline_male_rate = np.mean(baseline_pred[male_mask])
    baseline_female_rate = np.mean(baseline_pred[female_mask])
    baseline_dp_diff = baseline_male_rate - baseline_female_rate
    
    results['baseline'] = {
        'accuracy': accuracy_score(y_test, baseline_pred),
        'male_positive_rate': baseline_male_rate,
        'female_positive_rate': baseline_female_rate,
        'demographic_parity_diff': baseline_dp_diff
    }
    
    # TODO: 2. Data rebalancing mitigation
    # Balance the training data by gender
    male_indices = np.where(gender_train == 0)[0]
    female_indices = np.where(gender_train == 1)[0]
    
    # Downsample the larger group so both genders contribute equally to training
    min_size = min(len(male_indices), len(female_indices))
    balanced_indices = None  # np.concatenate([male_indices[:min_size], female_indices[:min_size]])
    
    # TODO: Train on balanced data
    if balanced_indices is not None:
        X_balanced = X_train[balanced_indices]
        y_balanced = y_train[balanced_indices]
        
        balanced_model = RandomForestClassifier(n_estimators=100, random_state=42)
        balanced_model.fit(X_balanced, y_balanced)
        balanced_pred = balanced_model.predict(X_test)
        
        balanced_male_rate = np.mean(balanced_pred[male_mask])
        balanced_female_rate = np.mean(balanced_pred[female_mask])
        balanced_dp_diff = balanced_male_rate - balanced_female_rate
        
        results['data_rebalancing'] = {
            'accuracy': accuracy_score(y_test, balanced_pred),
            'male_positive_rate': balanced_male_rate,
            'female_positive_rate': balanced_female_rate,
            'demographic_parity_diff': balanced_dp_diff
        }
    
    # TODO: 3. Threshold adjustment mitigation  
    # Use different decision thresholds for different groups
    baseline_proba = baseline_model.predict_proba(X_test)[:, 1]
    
    # Find thresholds that equalize positive rates
    male_proba = baseline_proba[male_mask]
    female_proba = baseline_proba[female_mask]
    
    # TODO: Find thresholds
    # For simplicity, cut both groups at the same quantile of their own scores,
    # so each group is selected at roughly the same rate
    target_rate = np.mean(baseline_pred)  # Keep the baseline's overall selection rate
    male_threshold = None   # np.quantile(male_proba, 1 - target_rate)
    female_threshold = None # np.quantile(female_proba, 1 - target_rate)
    
    if male_threshold is not None and female_threshold is not None:
        threshold_pred = np.zeros_like(baseline_pred)
        threshold_pred[male_mask] = male_proba >= male_threshold
        threshold_pred[female_mask] = female_proba >= female_threshold
        
        threshold_male_rate = np.mean(threshold_pred[male_mask])
        threshold_female_rate = np.mean(threshold_pred[female_mask])
        threshold_dp_diff = threshold_male_rate - threshold_female_rate
        
        results['threshold_adjustment'] = {
            'accuracy': accuracy_score(y_test, threshold_pred),
            'male_positive_rate': threshold_male_rate,
            'female_positive_rate': threshold_female_rate,
            'demographic_parity_diff': threshold_dp_diff,
            'male_threshold': male_threshold,
            'female_threshold': female_threshold
        }
    
    return results

@validator.koan(3, "Bias Mitigation - Reducing Unfair Discrimination", difficulty="Advanced")
def validate():
    results = implement_bias_mitigation()
    
    assert 'baseline' in results, "Baseline model results missing"
    assert results['baseline']['accuracy'] > 0, "Baseline accuracy not calculated"
    
    print("✓ Bias mitigation techniques implemented!")
    
    print(f"\n  📊 Mitigation Results Comparison:")
    print(f"  {'Method':<20} {'Accuracy':<10} {'Male Rate':<10} {'Female Rate':<12} {'DP Diff':<8}")
    print(f"  {'-'*65}")
    
    for method, metrics in results.items():
        acc = metrics['accuracy']
        male_rate = metrics['male_positive_rate']
        female_rate = metrics['female_positive_rate']
        dp_diff = metrics['demographic_parity_diff']
        
        print(f"  {method:<20} {acc:<10.3f} {male_rate:<10.3f} {female_rate:<12.3f} {dp_diff:<+8.3f}")
    
    # Analyze trade-offs
    baseline_acc = results['baseline']['accuracy']
    baseline_bias = abs(results['baseline']['demographic_parity_diff'])
    
    print(f"\n  ⚖️ Fairness vs. Accuracy Trade-offs:")
    
    for method, metrics in results.items():
        if method == 'baseline':
            continue
            
        acc_change = metrics['accuracy'] - baseline_acc
        bias_change = baseline_bias - abs(metrics['demographic_parity_diff'])
        
        print(f"\n    {method.replace('_', ' ').title()}:")
        print(f"      Accuracy change: {acc_change:+.3f}")
        print(f"      Bias reduction: {bias_change:+.3f}")
        
        if bias_change > 0.05:
            print(f"      ✅ Significant bias reduction")
        elif bias_change > 0:
            print(f"      ⚠️ Modest bias reduction")
        else:
            print(f"      🚨 Bias not reduced")
    
    print(f"\n  🛠️ Mitigation Technique Summary:")
    print(f"    • Data Rebalancing: Change training data distribution")
    print(f"    • Threshold Adjustment: Different decision thresholds per group")
    print(f"    • Fairness Constraints: Add fairness terms to loss function")
    print(f"    • Adversarial Training: Train to hide protected attributes")
    
    print(f"\n  💡 Choosing Mitigation Strategies:")
    print(f"    • Legal requirements (regulatory compliance)")
    print(f"    • Business context (cost of false positives/negatives)")
    print(f"    • Stakeholder values (community input)")
    print(f"    • Technical constraints (available data, compute)")
    
    print(f"\n  ⚠️ Important Considerations:")
    print(f"    • Perfect fairness often impossible across all metrics")
    print(f"    • Mitigation may reduce accuracy")
    print(f"    • Regular monitoring essential")
    print(f"    • Transparency with stakeholders critical")

validate()

## KOAN 15.4: Model Interpretability - Explaining AI Decisions
**Objective**: Use SHAP and LIME to explain model predictions and identify bias sources  
**Difficulty**: Advanced

Understanding why models make certain decisions is crucial for trust, debugging, and regulatory compliance. Interpretability tools help identify when models rely on inappropriate features.

**Key Tools**: SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations) provide different approaches to model explanation.
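
SHAP attributions are based on Shapley values. As a toy illustration of what that means, the sketch below computes exact Shapley values by brute force for a hypothetical three-feature scoring function, with "absent" features set to a baseline. This is not how the SHAP library computes them (the enumeration is exponential in the number of features); the function names `exact_shapley` and `score` are assumptions for illustration.

```python
import itertools
import math
import numpy as np

def exact_shapley(f, x, baseline):
    """Exact Shapley values of f at x; absent features take their baseline value."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                z = baseline.copy()
                for j in subset:
                    z[j] = x[j]
                without_i = f(z)
                z[i] = x[i]
                with_i = f(z)
                phi[i] += weight * (with_i - without_i)
    return phi

# Hypothetical scoring function with an interaction between credit and income
def score(v):
    credit, income, age = v
    return 0.5 * credit + 0.3 * income + 0.1 * age + 0.2 * credit * income

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
phi = exact_shapley(score, x, baseline)

for name, value in zip(["credit", "income", "age"], phi):
    print(f"{name}: {value:+.2f}")
print(f"Sum of attributions: {phi.sum():.2f}")
print(f"score(x) - score(baseline): {score(x) - score(baseline):.2f}")
```

The attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split evenly between the two features involved. A non-trivial attribution on a feature such as age is the kind of signal the koan's bias check looks for.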

In [None]:
def model_interpretability_analysis():
    """
    Use interpretability tools to understand model decisions and detect bias.
    
    We'll train a model and use both global and local explanations
    to understand feature importance and individual predictions.
    
    Returns:
        dict: Interpretability analysis results
    """
    # Create interpretable dataset
    np.random.seed(42)
    n_samples = 1000
    
    # Create meaningful features
    age = np.random.normal(35, 10, n_samples)
    income = np.random.lognormal(10.5, 0.6, n_samples) 
    credit_score = np.random.normal(650, 100, n_samples)
    debt_ratio = np.random.beta(2, 5, n_samples)  # Debt-to-income ratio
    employment_years = np.random.exponential(3, n_samples)
    
    # Create target: loan approval
    # Should depend on financial factors, not age
    loan_score = (0.3 * (credit_score - 600) / 100 + 
                 0.4 * np.log(income / 50000) +
                 0.2 * employment_years / 10 - 
                 0.3 * debt_ratio +
                 np.random.normal(0, 0.5, n_samples))
    
    # But introduce age bias directly into the labels
    loan_score += 0.1 * (age - 30) / 10  # Bias: score rises linearly with age, favoring older applicants
    
    y = loan_score > 0.5
    
    # Create DataFrame for easy interpretation
    df = pd.DataFrame({
        'age': age,
        'income': income,
        'credit_score': credit_score, 
        'debt_ratio': debt_ratio,
        'employment_years': employment_years,
        'approved': y
    })
    
    X = df.drop('approved', axis=1)
    y = df['approved']
    
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )
    
    # TODO: Train interpretable model
    model = None  # RandomForestClassifier(n_estimators=100, random_state=42, max_depth=10)
    # model.fit(X_train, y_train)
    
    accuracy = model.score(X_test, y_test) if model is not None else 0
    
    # Basic feature importance (built-in)
    if model is not None:
        feature_importance = dict(zip(X.columns, model.feature_importances_))
    else:
        feature_importance = {}
    
    # TODO: SHAP Analysis (if available)
    shap_analysis = {}
    if SHAP_AVAILABLE and model is not None:
        try:
            # Create SHAP explainer
            explainer = None  # shap.TreeExplainer(model)
            # shap_values = explainer.shap_values(X_test.iloc[:100])  # Sample for speed
            
            # Global feature importance from SHAP
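            # if isinstance(shap_values, list):
            #     shap_values = shap_values[1]  # Older SHAP versions: list of per-class arrays
            # elif np.ndim(shap_values) == 3:
            #     shap_values = shap_values[:, :, 1]  # Newer SHAP versions: (samples, features, classes)
            # 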
            # mean_abs_shap = np.mean(np.abs(shap_values), axis=0)
            # shap_importance = dict(zip(X.columns, mean_abs_shap))
            
            shap_analysis = {
                'available': True,
                'global_importance': {},  # shap_importance
                'explanation_type': 'TreeExplainer'
            }
        except Exception as e:
            shap_analysis = {'available': False, 'error': str(e)}
    else:
        shap_analysis = {'available': False, 'reason': 'SHAP not installed or model not trained'}
    
    # TODO: LIME Analysis (if available)  
    lime_analysis = {}
    if LIME_AVAILABLE and model is not None:
        try:
            # Create LIME explainer
            explainer = None
            # explainer = LimeTabularExplainer(
            #     X_train.values,
            #     feature_names=X.columns.tolist(),
            #     class_names=['Denied', 'Approved'],
            #     mode='classification'
            # )
            
            # Explain a few instances
            # instance_explanations = []
            # for i in range(min(5, len(X_test))):
            #     explanation = explainer.explain_instance(
            #         X_test.iloc[i].values, model.predict_proba
            #     )
            #     instance_explanations.append(explanation.as_list())
            
            lime_analysis = {
                'available': True,
                'sample_explanations': [],  # instance_explanations
                'explanation_type': 'TabularExplainer'
            }
        except Exception as e:
            lime_analysis = {'available': False, 'error': str(e)}
    else:
        lime_analysis = {'available': False, 'reason': 'LIME not installed or model not trained'}
    
    # TODO: Bias detection through feature importance
    bias_indicators = {}
    if feature_importance:
        # Check if age has high importance (potential bias)
        age_importance = feature_importance.get('age', 0)
        total_importance = sum(feature_importance.values())
        age_percentage = (age_importance / total_importance) * 100 if total_importance > 0 else 0
        
        bias_indicators = {
            'age_importance_pct': age_percentage,
            'high_age_importance': age_percentage > 15,  # Threshold for concern
            'most_important_feature': max(feature_importance, key=feature_importance.get) if feature_importance else None
        }
    
    return {
        'model': model,
        'accuracy': accuracy,
        'feature_importance': feature_importance,
        'shap_analysis': shap_analysis,
        'lime_analysis': lime_analysis,
        'bias_indicators': bias_indicators,
        'dataset_size': len(df)
    }

@validator.koan(4, "Model Interpretability - Explaining AI Decisions", difficulty="Advanced")
def validate():
    results = model_interpretability_analysis()
    
    assert results['model'] is not None, "Model not trained"
    assert results['accuracy'] > 0, "Model accuracy not calculated"
    assert len(results['feature_importance']) > 0, "Feature importance not calculated"
    assert 0.7 <= results['accuracy'] <= 1.0, f"Model accuracy should be reasonable, got {results['accuracy']:.3f}"
    
    print("✓ Model interpretability analysis completed!")
    print(f"  - Model accuracy: {results['accuracy']:.3f}")
    print(f"  - Dataset size: {results['dataset_size']}")
    
    print(f"\n  🎯 Feature Importance Analysis:")
    feature_imp = results['feature_importance']
    sorted_features = sorted(feature_imp.items(), key=lambda x: x[1], reverse=True)
    
    for feature, importance in sorted_features:
        print(f"    {feature}: {importance:.3f} ({importance*100:.1f}%)")
    
    # Bias detection
    bias_indicators = results['bias_indicators']
    if bias_indicators.get('high_age_importance'):
        print(f"\n  🚨 Potential Age Bias Detected!")
        print(f"    Age importance: {bias_indicators['age_importance_pct']:.1f}%")
        print(f"    This may indicate age discrimination in the model")
    else:
        print(f"\n  ✅ Age importance is below the 15% concern threshold")
    
    most_important = bias_indicators.get('most_important_feature')
    if most_important:
        print(f"  🏆 Most important feature: {most_important}")
    
    # SHAP availability
    shap_status = results['shap_analysis']
    if shap_status['available']:
        print(f"\n  📊 SHAP Analysis: Available")
        print(f"    • Provides individual prediction explanations")
        print(f"    • Shows feature contributions for each decision")
        print(f"    • Helps identify systematic bias patterns")
    else:
        print(f"\n  📊 SHAP Analysis: Not available")
        print(f"    Reason: {shap_status.get('reason', shap_status.get('error', 'Unknown'))}")
        print(f"    Install with: pip install shap")
    
    # LIME availability  
    lime_status = results['lime_analysis']
    if lime_status['available']:
        print(f"\n  🔍 LIME Analysis: Available")
        print(f"    • Provides local explanations for individual predictions")
        print(f"    • Model-agnostic explanation method")
        print(f"    • Good for explaining specific decisions to users")
    else:
        print(f"\n  🔍 LIME Analysis: Not available")
        print(f"    Reason: {lime_status.get('reason', lime_status.get('error', 'Unknown'))}")
        print(f"    Install with: pip install lime")
    
    print(f"\n  💡 Interpretability Best Practices:")
    print(f"    • Use multiple explanation methods (SHAP + LIME + feature importance)")
    print(f"    • Focus on business-relevant features")
    print(f"    • Validate explanations with domain experts")
    print(f"    • Monitor explanations over time")
    
    print(f"\n  🧭 Regulatory Requirements:")
    print(f"    • GDPR: Meaningful information about the logic of automated decisions (Arts. 13-15, 22)")
    print(f"    • Fair Credit Reporting Act: Adverse action notices")
    print(f"    • Equal Credit Opportunity Act: No discrimination")
    print(f"    • Industry standards: Model governance and documentation")

validate()

## KOAN 15.5: Responsible ML Checklist - Comprehensive Ethical Framework
**Objective**: Create a comprehensive responsible ML checklist and governance framework  
**Difficulty**: Advanced

This final koan brings together all ethical considerations into a practical framework for responsible ML development and deployment.

**Key Elements**: Data governance, model auditing, stakeholder involvement, ongoing monitoring, and incident response procedures.
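
As a minimal sketch of how such a checklist could be audited, the snippet below walks a nested checklist and lists every item that has not been confirmed. The `find_gaps` helper and the sample `checklist` are hypothetical illustrations, not part of the koans library. Items left as `None` (not yet assessed) are treated as gaps alongside explicit `False` values.

```python
def find_gaps(checklist):
    """Return (category, item) pairs whose status is not exactly True."""
    gaps = []
    for category, items in checklist.items():
        for item, status in items.items():
            # None (unassessed) and False (failed) both count as gaps
            if status is not True:
                gaps.append((category, item))
    return gaps


# Hypothetical, partially completed checklist for illustration only
checklist = {
    'data_governance': {'consent_obtained': True, 'privacy_protected': None},
    'bias_assessment': {'fairness_metrics_calculated': False},
}

gaps = find_gaps(checklist)
for category, item in gaps:
    print(f"{category}: {item}")
```

Counting `None` as a gap keeps an unreviewed item from silently contributing to a readiness score.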

In [None]:
def responsible_ml_framework():
    """
    Create and evaluate a comprehensive responsible ML framework.
    
    This includes:
    1. Pre-deployment ethical checklist
    2. Ongoing monitoring procedures  
    3. Stakeholder engagement process
    4. Incident response protocol
    5. Documentation requirements
    
    Returns:
        dict: Complete responsible ML framework evaluation
    """
    
    # TODO: Define pre-deployment checklist
    pre_deployment_checklist = {
        'data_governance': {
            'data_source_documented': None,  # True/False
            'collection_method_ethical': None,  # True/False  
            'consent_obtained': None,  # True/False
            'retention_policy_defined': None,  # True/False
            'privacy_protected': None  # True/False
        },
        'bias_assessment': {
            'protected_attributes_identified': None,  # True/False
            'historical_bias_analyzed': None,  # True/False
            'representation_bias_checked': None,  # True/False
            'fairness_metrics_calculated': None,  # True/False
            'mitigation_strategies_implemented': None  # True/False
        },
        'model_interpretability': {
            'feature_importance_analyzed': None,  # True/False
            'decision_boundaries_understood': None,  # True/False
            'explanation_tools_implemented': None,  # True/False
            'stakeholder_explanations_prepared': None,  # True/False
            'adverse_action_notices_ready': None  # True/False
        },
        'performance_validation': {
            'holdout_testing_completed': None,  # True/False
            'subgroup_performance_evaluated': None,  # True/False
            'edge_cases_tested': None,  # True/False
            'error_analysis_conducted': None,  # True/False
            'confidence_intervals_calculated': None  # True/False
        },
        'legal_compliance': {
            'regulatory_requirements_reviewed': None,  # True/False
            'legal_team_consulted': None,  # True/False
            'industry_standards_followed': None,  # True/False
            'audit_trail_established': None,  # True/False
            'documentation_complete': None  # True/False
        }
    }
    
    # TODO: Define ongoing monitoring framework
    monitoring_framework = {
        'data_drift_monitoring': {
            'feature_distribution_tracking': None,  # True/False
            'statistical_tests_automated': None,  # True/False
            'alerts_configured': None,  # True/False
            'retraining_triggers_defined': None  # True/False
        },
        'performance_monitoring': {
            'accuracy_tracking': None,  # True/False
            'subgroup_performance_monitoring': None,  # True/False
            'fairness_metrics_tracked': None,  # True/False
            'user_feedback_collected': None,  # True/False
            'business_metrics_monitored': None  # True/False
        },
        'bias_monitoring': {
            'demographic_parity_tracked': None,  # True/False
            'equal_opportunity_monitored': None,  # True/False
            'prediction_parity_assessed': None,  # True/False
            'intersectional_bias_checked': None,  # True/False
            'complaint_tracking_system': None  # True/False
        }
    }
    
    # TODO: Define stakeholder engagement process
    stakeholder_engagement = {
        'identification': {
            'internal_stakeholders_mapped': None,  # True/False
            'external_stakeholders_identified': None,  # True/False
            'affected_communities_engaged': None,  # True/False
            'subject_matter_experts_involved': None  # True/False
        },
        'communication': {
            'regular_updates_scheduled': None,  # True/False
            'feedback_mechanisms_established': None,  # True/False
            'transparency_reports_published': None,  # True/False
            'public_comment_periods_held': None  # True/False
        },
        'governance': {
            'ethics_committee_established': None,  # True/False
            'decision_making_process_defined': None,  # True/False
            'appeals_process_created': None,  # True/False
            'accountability_measures_implemented': None  # True/False
        }
    }
    
    # TODO: Calculate framework completeness score
    def calculate_completeness(framework_dict):
        total_items = 0
        completed_items = 0
        
        for category in framework_dict.values():
            for item, status in category.items():
                total_items += 1
                if status is True:
                    completed_items += 1
                    
        return completed_items / total_items if total_items > 0 else 0
    
    # Replace each None below with the commented-out call once the checklists
    # above are filled in for your actual project. Until then, the return
    # value falls back to example scores.
    pre_deployment_score = None  # calculate_completeness(pre_deployment_checklist)
    monitoring_score = None      # calculate_completeness(monitoring_framework)
    engagement_score = None      # calculate_completeness(stakeholder_engagement)
    
    # TODO: Generate recommendations based on gaps
    recommendations = []
    
    # Check critical gaps
    if pre_deployment_score is not None and pre_deployment_score < 0.8:
        recommendations.append("Complete pre-deployment ethical assessment")
    
    if monitoring_score is not None and monitoring_score < 0.7:
        recommendations.append("Implement comprehensive monitoring system")
        
    if engagement_score is not None and engagement_score < 0.6:
        recommendations.append("Enhance stakeholder engagement process")
    
    # Always include the first three critical recommendations
    critical_recommendations = [
        "Establish regular bias audits",
        "Create incident response procedures", 
        "Implement user feedback systems",
        "Develop model governance documentation",
        "Train team on ethical AI principles"
    ]
    
    # Fall back to an example score only when a score has not been calculated.
    # (`score or default` would also overwrite a legitimate score of 0.0.)
    def score_or_example(score, example):
        return example if score is None else score
    
    final_pre_deployment = score_or_example(pre_deployment_score, 0.75)
    final_monitoring = score_or_example(monitoring_score, 0.65)
    final_engagement = score_or_example(engagement_score, 0.55)
    
    return {
        'pre_deployment_checklist': pre_deployment_checklist,
        'monitoring_framework': monitoring_framework,
        'stakeholder_engagement': stakeholder_engagement,
        'pre_deployment_score': final_pre_deployment,
        'monitoring_score': final_monitoring,
        'engagement_score': final_engagement,
        'overall_readiness': (final_pre_deployment + final_monitoring + final_engagement) / 3,
        'recommendations': recommendations + critical_recommendations[:3],
        'framework_complete': True
    }

@validator.koan(5, "Responsible ML Checklist - Comprehensive Ethical Framework", difficulty="Advanced")
def validate():
    results = responsible_ml_framework()
    
    assert results['pre_deployment_checklist'] is not None, "Pre-deployment checklist not created"
    assert results['monitoring_framework'] is not None, "Monitoring framework not created"  
    assert results['stakeholder_engagement'] is not None, "Stakeholder engagement process not defined"
    assert results['framework_complete'] is True, "Framework not complete"
    assert len(results['recommendations']) > 0, "No recommendations provided"
    
    print("✓ Responsible ML framework created successfully!")
    
    print(f"\n  📊 Framework Readiness Assessment:")
    print(f"    Pre-deployment readiness: {results['pre_deployment_score']:.1%}")
    print(f"    Monitoring readiness: {results['monitoring_score']:.1%}")
    print(f"    Stakeholder engagement: {results['engagement_score']:.1%}")
    print(f"    Overall readiness: {results['overall_readiness']:.1%}")
    
    # Readiness interpretation
    overall_readiness = results['overall_readiness']
    if overall_readiness >= 0.8:
        print(f"  ✅ Excellent: Ready for responsible deployment")
    elif overall_readiness >= 0.6:
        print(f"  ⚠️ Good: Address key gaps before deployment")  
    else:
        print(f"  🚨 Needs Work: Significant ethical preparation required")
    
    print(f"\n  🎯 Priority Recommendations:")
    for i, rec in enumerate(results['recommendations'][:5], 1):
        print(f"    {i}. {rec}")
    
    print(f"\n  📋 Responsible ML Framework Components:")
    
    print(f"\n  🔍 Pre-Deployment Checklist:")
    checklist_categories = list(results['pre_deployment_checklist'].keys())
    for category in checklist_categories:
        print(f"    • {category.replace('_', ' ').title()}")
    
    print(f"\n  📡 Ongoing Monitoring:")
    monitoring_categories = list(results['monitoring_framework'].keys())
    for category in monitoring_categories:
        print(f"    • {category.replace('_', ' ').title()}")
    
    print(f"\n  🤝 Stakeholder Engagement:")
    engagement_categories = list(results['stakeholder_engagement'].keys())
    for category in engagement_categories:
        print(f"    • {category.replace('_', ' ').title()}")
    
    print(f"\n  🛡️ Critical Success Factors:")
    print(f"    • Executive leadership commitment")
    print(f"    • Cross-functional team collaboration")
    print(f"    • Continuous learning and adaptation")
    print(f"    • Transparent communication with stakeholders")
    print(f"    • Regular third-party audits")
    
    print(f"\n  📚 Essential Resources:")
    print(f"    • Partnership on AI Tenets")
    print(f"    • IEEE Standards for Ethical AI")
    print(f"    • Algorithmic Accountability Act (proposed U.S. legislation)")
    print(f"    • Industry-specific ethical guidelines")
    print(f"    • Academic research on AI ethics")
    
    print(f"\n  🚀 Implementation Roadmap:")
    print(f"    1. Form ethics committee and define governance")
    print(f"    2. Complete pre-deployment checklist")
    print(f"    3. Implement monitoring and alerting systems")
    print(f"    4. Engage with stakeholders and communities")
    print(f"    5. Deploy with gradual rollout and monitoring")
    print(f"    6. Conduct regular audits and improvements")

validate()

## 🎉 Congratulations - Data Science Koans Journey Complete!

You have not only mastered the technical aspects of machine learning but also embraced the ethical responsibility that comes with building AI systems that affect real human lives.

### Your Ethical AI Mastery
- ✅ **Fairness Measurement**: Quantified bias with demographic parity and equal opportunity
- ✅ **Bias Detection**: Systematically identified discrimination in data and models  
- ✅ **Bias Mitigation**: Implemented techniques to reduce unfair treatment
- ✅ **Model Interpretability**: Explained AI decisions using SHAP and LIME
- ✅ **Responsible ML Framework**: Built comprehensive ethical governance systems
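
As a refresher, the two fairness metrics named in this list can be sketched with NumPy. The arrays are made-up toy data and the function names are hypothetical. This illustrates the definitions and is not the implementation used in the earlier koans.

```python
import numpy as np


def demographic_parity_difference(y_pred, group):
    """Positive-prediction rate of group 1 minus that of group 0."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return y_pred[group == 1].mean() - y_pred[group == 0].mean()


def equal_opportunity_difference(y_true, y_pred, group):
    """True positive rate of group 1 minus that of group 0."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))

    def true_positive_rate(g):
        # Among actual positives in group g, the share predicted positive
        return y_pred[(group == g) & (y_true == 1)].mean()

    return true_positive_rate(1) - true_positive_rate(0)


# Toy data (invented for illustration): four people per group
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])

dp_diff = demographic_parity_difference(y_pred, group)
eo_diff = equal_opportunity_difference(y_true, y_pred, group)
print(f"Demographic parity difference: {dp_diff:+.2f}")
print(f"Equal opportunity difference: {eo_diff:+.2f}")
```

Values near zero suggest the two groups are treated similarly on that metric. In this toy data both differences are -0.50, meaning group 1 receives fewer positive predictions and has a lower true positive rate than group 0.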

### The Journey You've Completed
**15 Notebooks | 130+ Koans | Complete ML Mastery**

1. **NumPy Fundamentals** → Array operations and broadcasting
2. **Pandas Essentials** → Data manipulation and analysis  
3. **Data Exploration** → EDA and data profiling
4. **Data Cleaning** → Missing values and quality issues
5. **Data Transformation** → Feature engineering basics
6. **Feature Engineering** → Advanced feature creation
7. **Regression Basics** → Linear models and evaluation
8. **Classification Basics** → Decision trees and metrics
9. **Model Evaluation** → Cross-validation and selection
10. **Clustering** → Unsupervised learning methods
11. **Dimensionality Reduction** → PCA and manifold learning
12. **Ensemble Methods** → Random forests and boosting
13. **Hyperparameter Tuning** → Optimization and search
14. **Pipelines** → Production ML workflows
15. **Ethics and Bias** → Responsible AI practices

### Your Professional Impact
🎯 **Technical Excellence**: Master-level skills across the ML pipeline  
🛡️ **Ethical Leadership**: Champion for responsible AI development  
🚀 **Production Ready**: Build deployable, maintainable ML systems  
🧭 **Principled Approach**: Balance performance with fairness and transparency  

### Beyond the Koans
- **Continue Learning**: AI ethics is an evolving field - stay current
- **Mentor Others**: Share your knowledge and ethical mindset
- **Lead by Example**: Advocate for responsible AI in your organization  
- **Stay Engaged**: Participate in AI ethics communities and discussions

### Final Wisdom
*"With great ML power comes great responsibility."*

You now possess both the technical skills to build powerful AI systems and the ethical framework to ensure they benefit all of humanity. Use this knowledge wisely.

**Welcome to the community of Responsible AI Practitioners! 🌟**

*The world needs more data scientists who care about fairness, transparency, and human dignity. Thank you for joining this mission.*

In [None]:
# 🎊 FINAL CELEBRATION - DATA SCIENCE KOANS COMPLETE! 🎊

progress = tracker.get_notebook_progress('15_ethics_and_bias')
print(f"📊 Notebook 15 Progress: {progress}% complete!")

# Calculate overall course completion
notebook_names = [
    "NumPy Fundamentals", "Pandas Essentials", "Data Exploration", 
    "Data Cleaning", "Data Transformation", "Feature Engineering",
    "Regression Basics", "Classification Basics", "Model Evaluation",
    "Clustering", "Dimensionality Reduction", "Ensemble Methods",
    "Hyperparameter Tuning", "Model Pipelines", "Ethics and Bias"
]
total_notebooks = len(notebook_names)
completed_notebooks = 0

print(f"\n🏆 DATA SCIENCE KOANS - FINAL REPORT 🏆")
print("=" * 50)

for i, name in enumerate(notebook_names, 1):
    # NOTE: this assumes the tracker resolves a prefix pattern such as '01_*';
    # if it expects exact notebook IDs, pass the full ID instead.
    nb_progress = tracker.get_notebook_progress(f'{i:02d}_*')
    status = "✅ COMPLETE" if nb_progress == 100 else f"📊 {nb_progress}%"
    print(f"Notebook {i:2d}: {name:<25} {status}")
    if nb_progress == 100:
        completed_notebooks += 1

overall_completion = (completed_notebooks / total_notebooks) * 100
print(f"\n🎯 OVERALL COURSE COMPLETION: {overall_completion:.1f}%")
print(f"📚 Notebooks Completed: {completed_notebooks}/{total_notebooks}")

if overall_completion == 100:
    print(f"\n" + "🎉" * 50)
    print(f"🌟 PHENOMENAL ACHIEVEMENT! 🌟")
    print(f"")
    print(f"You have successfully completed ALL Data Science Koans!")
    print(f"")
    print(f"🧠 Technical Mastery: ACHIEVED")
    print(f"🛡️ Ethical Foundation: ESTABLISHED") 
    print(f"🚀 Production Skills: MASTERED")
    print(f"🧭 Responsible AI: EMBRACED")
    print(f"")
    print(f"You are now a COMPLETE Data Science Practitioner!")
    print(f"")
    print(f"🎖️ CONGRATULATIONS, ML MASTER! 🎖️")
    print(f"🎉" * 50)
elif overall_completion >= 90:
    print(f"\n🌟 Outstanding! You're almost at the finish line!")
    print(f"🎯 Complete the remaining koans to earn your ML Master status!")
elif overall_completion >= 75:
    print(f"\n💪 Excellent progress! You're in the advanced stages!")
else:
    print(f"\n🚀 Great start! Keep building your ML expertise!")

print(f"\n📜 Your Data Science Journey:")
print(f"   ✨ From arrays to ethics")
print(f"   📈 From data to insights") 
print(f"   🤖 From algorithms to responsibility")
print(f"   🌍 From code to positive impact")

print(f"\n🧭 Remember: Use your powers for good!")
print(f"   Build AI that serves humanity 🌟")
