# 🔧 Bias Mitigation in ML Models

<div style="background-color: #e3f2fd; padding: 15px; border-radius: 5px; border-left: 5px solid #2196F3;">
<b>📓 Information</b><br>
<b>Level:</b> Advanced<br>
<b>Duration:</b> 25 minutes<br>
<b>Dataset:</b> Adult Income (synthetic)<br>
<b>Prerequisite:</b> 02_complete_fairness_analysis.ipynb
</div>

## 🎯 Objectives
- ✅ Identify bias in the model
- ✅ Learn mitigation techniques (Pre/In/Post-processing)
- ✅ Implement practical mitigations
- ✅ Re-validate fairness after mitigation
- ✅ Compare Before vs After
- ✅ Understand trade-offs (accuracy vs fairness)

## 📚 Types of Bias Mitigation

### 1. **Pre-processing** (Before Training)
Modify the **data** to remove bias
- Reweighting
- Resampling
- Feature transformation

**Advantages**: Works with any model

**Disadvantages**: May discard or distort information in the original data
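
As a rough illustration of the resampling idea, the sketch below oversamples every (group, label) cell with replacement until all cells are as large as the biggest one. The helper name `oversample_cells` and the toy arrays are hypothetical and not part of this notebook; a real pipeline would apply the returned indices to the training split only.

```python
import numpy as np

def oversample_cells(groups, labels, seed=0):
    """Return row indices that oversample every (group, label) cell
    up to the size of the largest cell (sampling with replacement)."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    labels = np.asarray(labels)

    # Collect the row indices belonging to each non-empty cell
    cells = []
    for g in np.unique(groups):
        for y in np.unique(labels):
            idx = np.flatnonzero((groups == g) & (labels == y))
            if idx.size > 0:
                cells.append(idx)

    target = max(idx.size for idx in cells)
    picked = []
    for idx in cells:
        # Keep every original row, then draw the missing rows at random
        extra = rng.choice(idx, size=target - idx.size, replace=True)
        picked.append(np.concatenate([idx, extra]))
    return np.concatenate(picked)

# Toy data (illustrative only): 6 males, 4 females
groups = np.array(['Male'] * 6 + ['Female'] * 4)
labels = np.array([1, 1, 1, 1, 0, 0, 1, 0, 0, 0])
resampled_idx = oversample_cells(groups, labels)
print(len(resampled_idx))  # 4 cells x 4 rows each
```

Oversampling keeps every original row, at the cost of duplicating rows from small cells, which can encourage overfitting to those rows.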

### 2. **In-processing** (During Training)
Modify the **training algorithm**
- Fairness constraints
- Adversarial debiasing
- Fairness regularization

**Advantages**: Integrated into training

**Disadvantages**: Requires a model or training procedure that supports the constraint

### 3. **Post-processing** (After Training)
Modify the model's **predictions**
- Threshold optimization
- Calibrated equalized odds
- Reject option classification

**Advantages**: No need to retrain

**Disadvantages**: May be suboptimal, since the underlying model is unchanged
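
To make one of these post-processing ideas concrete, here is a simplified sketch of reject option classification: predictions whose score falls in an uncertainty band around the decision threshold are reassigned, favorably for the unprivileged group and unfavorably for the privileged group. The function name `reject_option_predict`, the `margin` parameter, and the toy scores are assumptions made for illustration; library implementations (for example in AIF360) also search for the band width, which this sketch does not.

```python
import numpy as np

def reject_option_predict(proba, is_unprivileged, threshold=0.5, margin=0.1):
    """Flip uncertain predictions in favor of the unprivileged group.

    Scores within `margin` of `threshold` form the critical region;
    everything outside it keeps the ordinary thresholded prediction.
    """
    proba = np.asarray(proba, dtype=float)
    is_unprivileged = np.asarray(is_unprivileged, dtype=bool)

    y_pred = (proba >= threshold).astype(int)
    critical = np.abs(proba - threshold) <= margin

    y_pred[critical & is_unprivileged] = 1   # favorable outcome
    y_pred[critical & ~is_unprivileged] = 0  # unfavorable outcome
    return y_pred

# Toy scores (illustrative only)
proba = np.array([0.90, 0.55, 0.45, 0.10, 0.58, 0.42])
is_female = np.array([False, False, True, True, True, False])
print(reject_option_predict(proba, is_female))  # -> [1 0 1 0 1 0]
```

Confident predictions (0.90 and 0.10 here) are left untouched, so the intervention is limited to cases where the model is least sure.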

## 1️⃣ Setup and Biased Model

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from deepbridge import DBDataset, Experiment

# Configure visualizations
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("Set2")

print("📊 Creating dataset with intentional bias...\n")

In [None]:
# Synthetic dataset with BIAS
np.random.seed(42)
n = 2000

df = pd.DataFrame({
    'age': np.random.randint(18, 70, n),
    'education_years': np.random.randint(8, 20, n),
    'hours_per_week': np.random.randint(20, 60, n),
    'gender': np.random.choice(['Male', 'Female'], n, p=[0.6, 0.4])
})

# Target WITH BIAS: gender significantly affects outcome
base_prob = 0.3
prob = base_prob + \
       (df['age'] - 40) * 0.005 + \
       (df['education_years'] - 12) * 0.03 + \
       (df['hours_per_week'] - 40) * 0.01 + \
       (df['gender'] == 'Male') * 0.15  # ← STRONG BIAS

df['high_income'] = (prob + np.random.normal(0, 0.1, n) > 0.5).astype(int)

print(f"✅ Dataset created: {df.shape}")
print(f"\n📊 Overall rate: {df['high_income'].mean():.1%}")
print(f"\nBy gender (data):")
for g in ['Male', 'Female']:
    rate = df[df['gender']==g]['high_income'].mean()
    print(f"  {g}: {rate:.1%}")

## 2️⃣ Train ORIGINAL Model (with bias)

In [None]:
# Prepare data
df_encoded = df.copy()
df_encoded['gender_enc'] = (df['gender'] == 'Male').astype(int)

feature_cols = ['age', 'education_years', 'hours_per_week', 'gender_enc']
X = df_encoded[feature_cols]
y = df_encoded['high_income']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# ORIGINAL model
clf_original = RandomForestClassifier(n_estimators=100, random_state=42)
clf_original.fit(X_train, y_train)

y_pred_original = clf_original.predict(X_test)
acc_original = accuracy_score(y_test, y_pred_original)

print(f"✅ ORIGINAL model trained")
print(f"📊 Accuracy: {acc_original:.3f}")

## 3️⃣ Measure Bias in ORIGINAL Model
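
Disparate impact (DI) is the ratio of positive prediction rates, $P(\hat{y}=1 \mid \text{unprivileged}) / P(\hat{y}=1 \mid \text{privileged})$. Under the four-fifths (80%) rule, a value below 0.8 is treated as a sign of adverse impact. A minimal sketch of that calculation as a reusable helper is shown here; the name `disparate_impact` and the toy arrays are hypothetical and not part of this notebook or of DeepBridge.

```python
import numpy as np

def disparate_impact(y_pred, groups, unprivileged, privileged):
    """Ratio of positive prediction rates: unprivileged / privileged.

    Returns NaN when the privileged group receives no positive
    predictions, because the ratio is undefined in that case.
    """
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)
    rate_unpriv = y_pred[groups == unprivileged].mean()
    rate_priv = y_pred[groups == privileged].mean()
    if rate_priv == 0:
        return float('nan')
    return float(rate_unpriv / rate_priv)

# Toy predictions (illustrative only): 3 of 4 males and 1 of 4 females selected
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
groups = np.array(['Male'] * 4 + ['Female'] * 4)
di = disparate_impact(y_pred, groups, unprivileged='Female', privileged='Male')
print(f"DI = {di:.3f}, passes 80% rule: {di >= 0.8}")
```

Returning NaN rather than 0 for the undefined case is a design choice of this sketch: it keeps an undefined ratio from being mistaken for a measured failure.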

In [None]:
# Calculate disparate impact
test_indices = X_test.index
test_gender = df_encoded.loc[test_indices, 'gender']

male_rate_orig = y_pred_original[test_gender == 'Male'].mean()
female_rate_orig = y_pred_original[test_gender == 'Female'].mean()
di_orig = female_rate_orig / male_rate_orig if male_rate_orig > 0 else 0

print("\n📊 ORIGINAL MODEL - FAIRNESS:\n" + "="*50)
print(f"\n👥 Positive prediction rates:")
print(f"   Male: {male_rate_orig:.1%}")
print(f"   Female: {female_rate_orig:.1%}")
print(f"   Difference: {abs(male_rate_orig - female_rate_orig):.1%}")
print(f"\n⚖️  Disparate Impact: {di_orig:.3f}")
print(f"   EEOC 80% Rule: {'✅ PASS' if di_orig >= 0.8 else '❌ FAIL'}")

if di_orig < 0.8:
    print(f"\n⚠️  PROBLEM DETECTED: Model has bias!")
    print(f"   Mitigation needed")

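## 4️⃣ TECHNIQUE 1: Pre-processing - Reweighting

Give higher weights to samples from underrepresented (gender, target) combinations, so that each of the four combinations carries the same total weight during training.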
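
A widely used alternative weighting is the reweighing scheme of Kamiran and Calders, which sets $w(g, y) = P(g)\,P(y) / P(g, y)$. It makes the label statistically independent of the group in the weighted data while leaving the group sizes and the overall base rate unchanged. The sketch below is an illustration under that formula only; the helper name `reweighing_weights` and the toy arrays are hypothetical and are not used elsewhere in this notebook.

```python
import numpy as np

def reweighing_weights(groups, labels):
    """Per-sample weights w(g, y) = P(g) * P(y) / P(g, y)."""
    groups = np.asarray(groups)
    labels = np.asarray(labels)
    n = len(labels)
    weights = np.ones(n, dtype=float)

    for g in np.unique(groups):
        for y in np.unique(labels):
            cell = (groups == g) & (labels == y)
            n_cell = cell.sum()
            if n_cell > 0:
                # Cell size expected if group and label were independent
                expected = (groups == g).sum() * (labels == y).sum() / n
                weights[cell] = expected / n_cell
    return weights

# Toy data (illustrative only): males are positive 4/6, females 1/4
groups = np.array(['Male'] * 6 + ['Female'] * 4)
labels = np.array([1, 1, 1, 1, 0, 0, 1, 0, 0, 0])
w = reweighing_weights(groups, labels)

for g in ('Male', 'Female'):
    m = groups == g
    weighted_rate = (w[m] * labels[m]).sum() / w[m].sum()
    print(f"{g}: weighted positive rate = {weighted_rate:.2f}")
```

On this toy data both groups end up with a weighted positive rate of 0.50, which equals the overall base rate. The resulting array can be passed to any estimator that accepts `sample_weight`.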

In [None]:
print("\n🔧 TECHNIQUE 1: REWEIGHTING (Pre-processing)\n" + "="*50)

# Calculate weights by group
train_gender = df_encoded.loc[X_train.index, 'gender']
train_target = y_train

# Count combinations (gender, target)
total = len(X_train)
weights = np.ones(len(X_train))

for gender in ['Male', 'Female']:
    for target in [0, 1]:
        mask = (train_gender == gender) & (train_target == target)
        n_samples = mask.sum()
        if n_samples > 0:
            # Weight = ideal proportion / actual proportion
            ideal_prop = 0.25  # 4 equal groups
            actual_prop = n_samples / total
            weight = ideal_prop / actual_prop
            weights[mask] = weight
            print(f"  {gender}, target={target}: n={n_samples:4d}, weight={weight:.2f}")

# Train with weights
clf_reweighted = RandomForestClassifier(n_estimators=100, random_state=42)
clf_reweighted.fit(X_train, y_train, sample_weight=weights)

y_pred_reweighted = clf_reweighted.predict(X_test)
acc_reweighted = accuracy_score(y_test, y_pred_reweighted)

print(f"\n✅ Model with REWEIGHTING trained")
print(f"📊 Accuracy: {acc_reweighted:.3f} (original: {acc_original:.3f})")

In [None]:
# Measure fairness
male_rate_rew = y_pred_reweighted[test_gender == 'Male'].mean()
female_rate_rew = y_pred_reweighted[test_gender == 'Female'].mean()
di_rew = female_rate_rew / male_rate_rew if male_rate_rew > 0 else 0

print(f"\n⚖️  Disparate Impact (Reweighted): {di_rew:.3f}")
print(f"   EEOC 80% Rule: {'✅ PASS' if di_rew >= 0.8 else '❌ FAIL'}")
print(f"\n📈 Improvement: {di_rew - di_orig:+.3f}")

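## 5️⃣ TECHNIQUE 2: In-processing - Fairness Constraint

In-processing adds a fairness constraint or regularization term during training. The cell below trains only an unconstrained `LogisticRegression` baseline; applying a real constraint requires a dedicated library such as Fairlearn or AIF360.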
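
To show what a fairness regularizer can look like, the sketch below fits a logistic regression by gradient descent on the mean log-loss plus `lam` times the squared demographic-parity gap (the difference in mean predicted probability between the two groups). Everything in it is an assumption made for illustration: the function names `fit_fair_logreg` and `parity_gap`, the hyperparameters, and the synthetic toy data. It is not the method used by Fairlearn or AIF360.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_fair_logreg(X, y, sensitive, lam=0.0, lr=0.2, n_iter=3000):
    """Gradient descent on: mean log-loss + lam * (parity gap)^2."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    a1 = np.asarray(sensitive) == 1
    a0 = ~a1
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad_loss = Xb.T @ (p - y) / len(y)
        slope = p * (1.0 - p)  # derivative of the sigmoid w.r.t. the logit
        gap = p[a1].mean() - p[a0].mean()
        grad_gap = ((Xb[a1] * slope[a1][:, None]).mean(axis=0)
                    - (Xb[a0] * slope[a0][:, None]).mean(axis=0))
        w -= lr * (grad_loss + 2.0 * lam * gap * grad_gap)
    return w

def parity_gap(w, X, sensitive):
    """Absolute difference in mean predicted probability between groups."""
    p = sigmoid(np.column_stack([np.ones(len(X)), X]) @ w)
    return abs(p[sensitive == 1].mean() - p[sensitive == 0].mean())

# Synthetic toy data (illustrative only): the outcome depends on the group
rng = np.random.default_rng(0)
n = 1000
sensitive = rng.integers(0, 2, n)
x = rng.normal(size=n)
y = (rng.random(n) < sigmoid(1.5 * x + 1.5 * sensitive - 0.75)).astype(int)
X = np.column_stack([x, sensitive])

gap_plain = parity_gap(fit_fair_logreg(X, y, sensitive, lam=0.0), X, sensitive)
gap_fair = parity_gap(fit_fair_logreg(X, y, sensitive, lam=5.0), X, sensitive)
print(f"parity gap without penalty: {gap_plain:.3f}")
print(f"parity gap with penalty:    {gap_fair:.3f}")
```

Raising `lam` trades predictive fit for a smaller gap, so the value would normally be chosen on a validation split rather than fixed in advance.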

In [None]:
print("\n🔧 TECHNIQUE 2: FAIRNESS CONSTRAINT (In-processing)\n" + "="*50)

# Use LogisticRegression as the base learner
# (RandomForest has no built-in fairness constraints)
# NOTE: this cell fits a plain, unconstrained baseline only;
# no fairness term is applied here.
clf_fair = LogisticRegression(random_state=42, max_iter=1000, C=1.0)
clf_fair.fit(X_train, y_train)

y_pred_fair = clf_fair.predict(X_test)
acc_fair = accuracy_score(y_test, y_pred_fair)

print("✅ Baseline LogisticRegression trained (no fairness constraint applied)")
print(f"📊 Accuracy: {acc_fair:.3f}")
print("\n💡 Note: To apply actual fairness constraints, use libraries like:")
print("   - Fairlearn (Microsoft)")
print("   - AIF360 (IBM)")
print("   - Themis-ml")

## 6️⃣ TECHNIQUE 3: Post-processing - Threshold Optimization

Adjust the decision threshold separately for each group so that positive prediction rates are equalized.
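
One direct way to equalize positive prediction rates is to pick, for each group, the score quantile that selects the same share of that group. The sketch below assumes the cutoffs are tuned on a held-out validation split, since tuning and evaluating on the same data gives an optimistic fairness estimate. The function name `thresholds_for_equal_rate`, the `target_rate` parameter, and the synthetic scores are hypothetical.

```python
import numpy as np

def thresholds_for_equal_rate(scores, groups, target_rate):
    """Per-group score cutoffs that select roughly `target_rate` of each group."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    return {
        str(g): float(np.quantile(scores[groups == g], 1.0 - target_rate))
        for g in np.unique(groups)
    }

# Synthetic validation scores (illustrative only): males score higher on average
rng = np.random.default_rng(42)
val_scores = np.concatenate([rng.beta(5, 3, 500), rng.beta(3, 5, 500)])
val_groups = np.array(['Male'] * 500 + ['Female'] * 500)

cutoffs = thresholds_for_equal_rate(val_scores, val_groups, target_rate=0.30)
selected = val_scores >= np.array([cutoffs[g] for g in val_groups])
rates = {g: float(selected[val_groups == g].mean()) for g in ('Male', 'Female')}

print("cutoffs:", {g: round(c, 3) for g, c in cutoffs.items()})
print("selection rates:", rates)
```

Because both groups are selected at the same rate, disparate impact on the tuning data is close to 1.0 by construction; the rate on new data will differ somewhat.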

In [None]:
print("\n🔧 TECHNIQUE 3: THRESHOLD OPTIMIZATION (Post-processing)\n" + "="*50)

# Get probabilities
y_proba = clf_original.predict_proba(X_test)[:, 1]

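# Find per-group thresholds that bring disparate impact as close to 1.0 as possible
def find_optimal_thresholds(y_proba, test_gender):
    """Grid-search per-group thresholds that maximize DI (capped at 1.0).

    Accuracy is not part of the objective. The thresholds are tuned on the
    same data they are later evaluated on, so the reported DI is optimistic.
    """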
    best_di = 0
    best_thresholds = {'Male': 0.5, 'Female': 0.5}
    
    # Simple grid search
    for thresh_male in np.arange(0.3, 0.7, 0.05):
        for thresh_female in np.arange(0.3, 0.7, 0.05):
            y_pred = np.zeros(len(y_proba), dtype=int)
            
            male_mask = test_gender == 'Male'
            female_mask = test_gender == 'Female'
            
            y_pred[male_mask] = (y_proba[male_mask] >= thresh_male).astype(int)
            y_pred[female_mask] = (y_proba[female_mask] >= thresh_female).astype(int)
            
            male_rate = y_pred[male_mask].mean()
            female_rate = y_pred[female_mask].mean()
            di = female_rate / male_rate if male_rate > 0 else 0
            
            if di > best_di and di <= 1.0:
                best_di = di
                best_thresholds = {'Male': thresh_male, 'Female': thresh_female}
    
    return best_thresholds, best_di

# Find optimal thresholds
optimal_thresholds, opt_di = find_optimal_thresholds(y_proba, test_gender)

print(f"\n📊 Optimal Thresholds:")
print(f"   Male: {optimal_thresholds['Male']:.2f}")
print(f"   Female: {optimal_thresholds['Female']:.2f}")

# Apply thresholds
y_pred_optimized = np.zeros(len(y_proba), dtype=int)
male_mask = test_gender == 'Male'
female_mask = test_gender == 'Female'

y_pred_optimized[male_mask] = (y_proba[male_mask] >= optimal_thresholds['Male']).astype(int)
y_pred_optimized[female_mask] = (y_proba[female_mask] >= optimal_thresholds['Female']).astype(int)

acc_optimized = accuracy_score(y_test, y_pred_optimized)

male_rate_opt = y_pred_optimized[male_mask].mean()
female_rate_opt = y_pred_optimized[female_mask].mean()
di_opt = female_rate_opt / male_rate_opt if male_rate_opt > 0 else 0

print(f"\n✅ Optimized predictions")
print(f"📊 Accuracy: {acc_optimized:.3f}")
print(f"⚖️  Disparate Impact: {di_opt:.3f}")
print(f"   EEOC 80% Rule: {'✅ PASS' if di_opt >= 0.8 else '❌ FAIL'}")

## 7️⃣ COMPARISON: Before vs After

In [None]:
# Comparative summary
comparison = pd.DataFrame([
    {
        'Method': 'Original (with bias)',
        'Accuracy': acc_original,
        'Male Rate': male_rate_orig,
        'Female Rate': female_rate_orig,
        'Disparate Impact': di_orig,
        'Passes EEOC': di_orig >= 0.8
    },
    {
        'Method': 'Reweighting',
        'Accuracy': acc_reweighted,
        'Male Rate': male_rate_rew,
        'Female Rate': female_rate_rew,
        'Disparate Impact': di_rew,
        'Passes EEOC': di_rew >= 0.8
    },
    {
        'Method': 'Threshold Optimization',
        'Accuracy': acc_optimized,
        'Male Rate': male_rate_opt,
        'Female Rate': female_rate_opt,
        'Disparate Impact': di_opt,
        'Passes EEOC': di_opt >= 0.8
    }
])

print("\n📊 METHOD COMPARISON\n" + "="*90 + "\n")
print(comparison.to_string(index=False))

print("\n" + "="*90)

In [None]:
# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Disparate Impact
methods = comparison['Method'].tolist()
dis = comparison['Disparate Impact'].tolist()
colors = ['red' if di < 0.8 else 'green' for di in dis]

axes[0].barh(methods, dis, color=colors, alpha=0.7, edgecolor='black')
axes[0].axvline(x=0.8, color='red', linestyle='--', linewidth=2, 
                label='EEOC 80% Threshold')
axes[0].set_xlabel('Disparate Impact', fontsize=12)
axes[0].set_title('Disparate Impact by Method', fontsize=13, fontweight='bold')
axes[0].set_xlim(0, 1.1)
axes[0].legend()
axes[0].grid(axis='x', alpha=0.3)

# Plot 2: Accuracy
accs = comparison['Accuracy'].tolist()

axes[1].bar(methods, accs, color='skyblue', alpha=0.7, edgecolor='navy')
axes[1].set_ylabel('Accuracy', fontsize=12)
axes[1].set_title('Accuracy by Method', fontsize=13, fontweight='bold')
axes[1].set_ylim(0, 1)
axes[1].tick_params(axis='x', rotation=15)
axes[1].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

print("\n📈 INSIGHTS:\n")
if di_opt >= 0.8:
    print("✅ Threshold Optimization passes the EEOC 80% rule")
    print(f"   DI change: {di_orig:.3f} → {di_opt:.3f} ({di_opt-di_orig:+.3f})")
if di_rew >= 0.8:
    print("✅ Reweighting passes the EEOC 80% rule")
    print(f"   DI change: {di_orig:.3f} → {di_rew:.3f} ({di_rew-di_orig:+.3f})")

## 8️⃣ Trade-offs: Accuracy vs Fairness
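
The trade-off can be seen by sweeping a single knob and recording both metrics at each setting. The sketch below lowers or raises the threshold for one group while holding the other at 0.5, and reports accuracy and disparate impact for each value. The function name `sweep_group_threshold` and the synthetic scores are hypothetical; the numbers it prints say nothing about the models trained in this notebook.

```python
import numpy as np

def sweep_group_threshold(y_true, scores, groups, group, grid, base_threshold=0.5):
    """Return (threshold, accuracy, DI) for each candidate threshold of `group`."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    in_group = np.asarray(groups) == group
    rows = []
    for t in grid:
        # Swept threshold for `group`, fixed threshold for everyone else
        cutoffs = np.where(in_group, t, base_threshold)
        y_pred = (scores >= cutoffs).astype(int)
        rate_group = y_pred[in_group].mean()
        rate_other = y_pred[~in_group].mean()
        di = rate_group / rate_other if rate_other > 0 else float('nan')
        accuracy = (y_pred == y_true).mean()
        rows.append((float(t), float(accuracy), float(di)))
    return rows

# Synthetic scores (illustrative only): females are scored lower on average
rng = np.random.default_rng(1)
n = 400
groups = np.array(['Male'] * 200 + ['Female'] * 200)
shift = np.where(groups == 'Male', 0.1, -0.1)
scores = np.clip(rng.normal(0.5, 0.2, n) + shift, 0, 1)
y_true = (rng.random(n) < scores).astype(int)

rows = sweep_group_threshold(y_true, scores, groups, 'Female',
                             grid=np.linspace(0.30, 0.50, 5))
for t, acc, di in rows:
    print(f"female threshold={t:.2f}  accuracy={acc:.3f}  DI={di:.3f}")
```

Lower thresholds for the disadvantaged group raise DI; whether accuracy falls, and by how much, depends on how well calibrated the scores are for that group.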

In [None]:
print("\n⚖️  TRADE-OFFS: ACCURACY vs FAIRNESS\n" + "="*60 + "\n")

acc_loss_rew = acc_original - acc_reweighted
acc_loss_opt = acc_original - acc_optimized

print("REWEIGHTING:")
print(f"  Accuracy loss: {acc_loss_rew:+.3f} ({acc_loss_rew/acc_original*100:+.1f}%)")
print(f"  Fairness gain: {di_rew - di_orig:+.3f}")
print(f"  Trade-off: {'Acceptable' if abs(acc_loss_rew) < 0.05 else 'Significant'}")

print("\nTHRESHOLD OPTIMIZATION:")
print(f"  Accuracy loss: {acc_loss_opt:+.3f} ({acc_loss_opt/acc_original*100:+.1f}%)")
print(f"  Fairness gain: {di_opt - di_orig:+.3f}")
print(f"  Trade-off: {'Acceptable' if abs(acc_loss_opt) < 0.05 else 'Significant'}")

print("\n💡 RECOMMENDATION:")
if di_opt >= 0.8 and abs(acc_loss_opt) < 0.05:
    print("  ✅ Use Threshold Optimization")
    print("     - Passes the EEOC 80% rule")
    print("     - Accuracy maintained")
    print("     - Easy to implement")
elif di_rew >= 0.8:
    print("  ✅ Use Reweighting")
    print("     - Passes the EEOC 80% rule")
    print("     - Applied through sample weights at training time")
else:
    print("  ⚠️  No simple method fully solves the problem")
    print("     - Consider advanced techniques (Fairlearn, AIF360)")
    print("     - Review training data")
    print("     - Consult experts")

## 9️⃣ Other Advanced Techniques

In [None]:
print("\n🔬 ADVANCED MITIGATION TECHNIQUES\n" + "="*60 + "\n")

print("1. DISPARATE IMPACT REMOVER (Pre-processing)")
print("   Library: AIF360 (IBM)")
print("   How: Removes correlation between features and protected attributes")
print("   Usage: aif360.algorithms.preprocessing.DisparateImpactRemover()")

print("\n2. ADVERSARIAL DEBIASING (In-processing)")
print("   Library: AIF360 (IBM)")
print("   How: Adversarial network that removes protected attribute information")
print("   Usage: aif360.algorithms.inprocessing.AdversarialDebiasing()")

print("\n3. EQUALIZED ODDS POST-PROCESSING (Post-processing)")
print("   Library: AIF360 (IBM)")
print("   How: Adjusts predictions to equalize TPR and FPR")
print("   Usage: aif360.algorithms.postprocessing.EqOddsPostprocessing()")

print("\n4. EXPONENTIATED GRADIENT (In-processing)")
print("   Library: Fairlearn (Microsoft)")
print("   How: Optimization with fairness constraints")
print("   Usage: fairlearn.reductions.ExponentiatedGradient()")

print("\n5. THRESHOLD OPTIMIZER (Post-processing)")
print("   Library: Fairlearn (Microsoft)")
print("   How: Learns group-specific thresholds on a trained model's scores")
print("   Usage: fairlearn.postprocessing.ThresholdOptimizer()")

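print("\n💡 Example using Fairlearn:")
print("""
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from sklearn.linear_model import LogisticRegression

# Mitigator wrapping a base estimator that supports sample_weight
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000),
    constraints=DemographicParity()
)

# Train with fairness (train_gender holds the protected attribute for X_train)
mitigator.fit(X_train, y_train, sensitive_features=train_gender)

# Predict
y_pred_fair = mitigator.predict(X_test)
""")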

## 🎉 Conclusion

### What you learned:
- ✅ **3 Mitigation Types**: Pre/In/Post-processing
- ✅ **Reweighting**: Balance data by group
- ✅ **Threshold Optimization**: Adjust decisions by group
- ✅ **Comparison**: Evaluate trade-offs
- ✅ **Advanced Techniques**: Fairlearn, AIF360

### When to Use Each Technique:

#### **Pre-processing (Reweighting/Resampling)**
✅ Use when:
- You can retrain the model
- You want to keep the model architecture
- The bias originates in the data

❌ Avoid when:
- Data is scarce
- You need theoretical guarantees

#### **In-processing (Fairness Constraints)**
✅ Use when:
- Training new model
- Have control over algorithm
- Want theoretical guarantees

❌ Avoid when:
- Model is black-box
- Cannot retrain

#### **Post-processing (Threshold Optimization)**
✅ Use when:
- Model already trained
- Cannot retrain
- Need quick solution
- Have access to probabilities

❌ Avoid when:
- Bias is very strong
- Don't have probabilities

### Mitigation Checklist:
- [ ] Identify bias in the original model
- [ ] Choose an appropriate technique
- [ ] Implement the mitigation
- [ ] Validate fairness after mitigation
- [ ] Check trade-offs (accuracy vs fairness)
- [ ] Compare multiple techniques
- [ ] Document the process
- [ ] Re-validate periodically in production

### Next Steps:
- 📘 Explore Fairlearn: https://fairlearn.org
- 📘 Explore AIF360: https://aif360.mybluemix.net
- 📘 `../05_casos_uso/01_credit_scoring.ipynb` - Complete case study

<div style="background-color: #e8f5e9; padding: 15px; border-radius: 5px; border-left: 5px solid #4caf50;">
<b>💡 Remember:</b> Fairness is not a checkbox - it's a continuous process of validation and improvement!
</div>

<div style="background-color: #fff3e0; padding: 15px; border-radius: 5px; border-left: 5px solid #ff9800;">
<b>⚠️  IMPORTANT:</b> Always re-validate fairness after any mitigation to ensure the problem was actually solved!
</div>