
# Module B: Algorithmic Bias and Fair Machine Learning — Demo Notebook

This notebook provides runnable examples for the key concepts covered in Module B. Each section pairs conceptual explanations with executable code snippets that you can use for live demonstrations or hands-on activities.



## 8.0 Module Overview

We will explore:

- How bias can emerge in machine learning pipelines.
- Legal and ethical framing around discrimination.
- Quantitative fairness metrics used to audit models.
- Practical strategies to mitigate unfair outcomes.

To keep the focus on ideas rather than data wrangling, we will work with a synthetic dataset whose structure mimics a typical tabular prediction task.



## 8.1 Module Introduction — Setup

The first code cell imports the Python libraries used throughout the module and prints their versions. It serves as a simple smoke test that the environment (for example, Google Colab) has the required packages installed.


In [None]:

import numpy as np
import pandas as pd
import sklearn
import matplotlib
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Seed NumPy's global RNG so later random draws are reproducible
np.random.seed(42)

print("Library versions:")
print({
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "scikit-learn": sklearn.__version__,
    "matplotlib": matplotlib.__version__,
    "seaborn": sns.__version__
})



## 8.2 Legal Definitions of Discrimination — Building a Scenario

Discrimination law typically compares outcomes across protected groups. To simulate this setting, we create a dataset with an explicit sensitive attribute. The goal is to classify whether an applicant receives a positive decision (`approved = 1`) based on several features.

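The sensitive attribute `group` splits applicants into Group A and Group B using a median split on a score built from two features, so group membership is correlated with the model inputs. On top of that, a random 10% of Group A applicants have their label set to approved. This raises Group A's base rate of positive outcomes, mirroring historical advantages that can propagate into automated systems.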


In [None]:

# Create a synthetic binary classification dataset
X, y = make_classification(
    n_samples=2000,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_clusters_per_class=2,
    weights=[0.55, 0.45],
    class_sep=1.0,
    random_state=42
)

feature_names = [f"feature_{i}" for i in range(X.shape[1])]
df = pd.DataFrame(X, columns=feature_names)
df['approved'] = y

# Construct a sensitive attribute from two features (median split on a linear score),
# so that group membership is correlated with the model inputs
score = X[:, 0] + 0.5 * X[:, 1]
threshold = np.percentile(score, 50)
df['group'] = np.where(score >= threshold, 'Group A', 'Group B')

# Introduce a historical bias: a random ~10% of Group A applicants are marked
# approved regardless of their original label
bias_mask = (df['group'] == 'Group A') & (np.random.rand(len(df)) < 0.1)
df.loc[bias_mask, 'approved'] = 1

print(df.head())
print()
print('Group distribution:')
print(df['group'].value_counts(normalize=True))
print()
print('Approval rate by group:')
print(df.groupby('group')['approved'].mean())
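
The effect of the boost on Group A's base rate can be worked out directly. As a sketch, write $p_A$ for Group A's approval rate before the boost and $q = 0.1$ for the probability that a Group A row is selected by `bias_mask` (both symbols are introduced here for illustration). A selected row ends up approved regardless of its original label, so the expected rate after the boost is

$$
p_A' = q \cdot 1 + (1 - q)\, p_A = p_A + q\,(1 - p_A).
$$

For example, if $p_A$ were 0.5, the expected rate would rise to 0.55. Group B's labels are left untouched, so any change in the gap between the groups comes entirely from this term.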



### Visualizing Group Differences

Plots help us communicate disparities to stakeholders. The following cell compares the approval rate of each group and shows a feature distribution to highlight how data characteristics can vary with group membership.


In [None]:

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

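sns.barplot(
    data=df,
    x='group',
    y='approved',
    hue='group',      # passing palette without hue is deprecated in seaborn 0.13+
    legend=False,     # requires seaborn >= 0.13
    estimator=np.mean,
    ax=axes[0],
    palette='viridis'
)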
axes[0].set_title('Approval Rate by Group')
axes[0].set_ylabel('Approval Probability')

sns.kdeplot(
    data=df,
    x='feature_0',
    hue='group',
    common_norm=False,
    fill=True,
    alpha=0.4,
    ax=axes[1]
)
axes[1].set_title('Feature 0 Distribution by Group')

plt.tight_layout()
plt.show()



## 8.3 Algorithmic Bias — Motivating Examples

With the dataset in place, we train a logistic regression model. Logistic regression is easy to interpret and common in high-stakes domains like credit scoring and admissions.

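We build a pipeline that standardizes the features and fits the model. Standardization puts the coefficients on a comparable scale and helps the regularized solver converge. The evaluation reports overall accuracy, a classification report, and a confusion matrix over all applicants. Group-level differences in errors are examined in Section 8.5.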


In [None]:

features = feature_names
target = 'approved'
sensitive = 'group'

X_train, X_test, y_train, y_test, group_train, group_test = train_test_split(
    df[features],
    df[target],
    df[sensitive],
    test_size=0.3,
    stratify=df[[target, sensitive]],
    random_state=42
)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=1000, solver='lbfgs'))
])

pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)
y_proba = pipeline.predict_proba(X_test)[:, 1]

print(f"Accuracy: {accuracy_score(y_test, y_pred):.3f}")
print()
print('Classification report:')
print(classification_report(y_test, y_pred))

cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title('Confusion Matrix (All Applicants)')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()



## 8.5 Fairness Metrics — Measuring Bias

We examine several common fairness metrics:

- **Statistical Parity Difference (SPD):** Difference in positive prediction rates between groups. Values near zero indicate parity.
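- **Disparate Impact (DI):** Ratio of positive prediction rates (protected group over reference group). A ratio of 1 indicates parity; the "80% rule" (also called the four-fifths rule) flags ratios below 0.8.
- **Equal Opportunity Difference (EOD):** Difference in true positive rates (sensitivity) between groups. Values near zero indicate parity.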
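
Written out formally, the three definitions look as follows. The notation is an assumption introduced here: $\hat{Y}$ is the predicted label, $Y$ the true label, and $G$ the group, with $B$ the protected group and $A$ the reference group. The sign convention (protected minus reference) matches the helper functions in this section.

$$
\begin{aligned}
\mathrm{SPD} &= P(\hat{Y} = 1 \mid G = B) - P(\hat{Y} = 1 \mid G = A) \\
\mathrm{DI}  &= \frac{P(\hat{Y} = 1 \mid G = B)}{P(\hat{Y} = 1 \mid G = A)} \\
\mathrm{EOD} &= P(\hat{Y} = 1 \mid Y = 1,\, G = B) - P(\hat{Y} = 1 \mid Y = 1,\, G = A)
\end{aligned}
$$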

The helper functions below compute these metrics from predictions and ground-truth labels.


In [None]:

import numpy.typing as npt

def group_positive_rate(y_pred: npt.NDArray[np.int_], group: pd.Series, label: str) -> float:
    """Share of members of `label` that receive a positive prediction."""
    mask = np.asarray(group == label)  # plain boolean array, safe for NumPy indexing
    if mask.sum() == 0:
        return np.nan
    return y_pred[mask].mean()


def true_positive_rate(y_true: npt.NDArray[np.int_], y_pred: npt.NDArray[np.int_], group: pd.Series, label: str) -> float:
    """Share of actual positives in `label` that are predicted positive."""
    mask = np.asarray(group == label)
    positives = (y_true[mask] == 1)
    if positives.sum() == 0:
        return np.nan
    return (y_pred[mask][positives] == 1).mean()


def statistical_parity_difference(y_pred: npt.NDArray[np.int_], group: pd.Series, reference: str, protected: str) -> float:
    """Positive prediction rate of `protected` minus that of `reference`."""
    return group_positive_rate(y_pred, group, protected) - group_positive_rate(y_pred, group, reference)


def disparate_impact_ratio(y_pred: npt.NDArray[np.int_], group: pd.Series, reference: str, protected: str) -> float:
    """Positive prediction rate of `protected` divided by that of `reference`."""
    ref_rate = group_positive_rate(y_pred, group, reference)
    prot_rate = group_positive_rate(y_pred, group, protected)
    # Guard against division by zero when the reference group has no positive predictions
    return prot_rate / ref_rate if ref_rate else np.nan


def equal_opportunity_difference(y_true: npt.NDArray[np.int_], y_pred: npt.NDArray[np.int_], group: pd.Series, reference: str, protected: str) -> float:
    """True positive rate of `protected` minus that of `reference`."""
    return true_positive_rate(y_true, y_pred, group, protected) - true_positive_rate(y_true, y_pred, group, reference)

reference_group = 'Group A'
protected_group = 'Group B'

y_pred_binary = y_pred.astype(int)

target_array = y_test.values.astype(int)

spd = statistical_parity_difference(y_pred_binary, group_test, reference_group, protected_group)
di = disparate_impact_ratio(y_pred_binary, group_test, reference_group, protected_group)
eod = equal_opportunity_difference(target_array, y_pred_binary, group_test, reference_group, protected_group)

print(f"Statistical Parity Difference (Group B - Group A): {spd:.3f}")
print(f"Disparate Impact Ratio (Group B / Group A): {di:.3f}")
print(f"Equal Opportunity Difference (Group B - Group A): {eod:.3f}")
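
To sanity-check the metric definitions, it can help to run them on a dataset small enough to verify by hand. The sketch below uses a hypothetical eight-row table (the `toy` frame and its values are invented for illustration) and computes the same three quantities with `groupby` rather than the helper functions.

```python
import pandas as pd

# Hypothetical toy data: 4 applicants per group, small enough to check by hand.
toy = pd.DataFrame({
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 1, 0, 0, 1, 1, 0, 0],
    "y_pred": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Positive prediction rate per group: A = 3/4, B = 1/4.
pos_rate = toy.groupby("group")["y_pred"].mean()

# True positive rate per group (actual positives only): A = 2/2, B = 1/2.
tpr = toy[toy["y_true"] == 1].groupby("group")["y_pred"].mean()

toy_spd = pos_rate["B"] - pos_rate["A"]  # 0.25 - 0.75 = -0.50
toy_di = pos_rate["B"] / pos_rate["A"]   # 0.25 / 0.75, about 0.33 (below 0.8)
toy_eod = tpr["B"] - tpr["A"]            # 0.50 - 1.00 = -0.50

print(f"SPD: {toy_spd:.2f}, DI: {toy_di:.2f}, EOD: {toy_eod:.2f}")
```

In this toy case all three metrics point the same way: Group B receives fewer positive predictions overall and among applicants who actually qualify.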



### Disaggregated Confusion Matrices

Disaggregating errors by group makes disparities tangible. The cell below prints group-specific confusion matrices to highlight where the model struggles.


In [None]:

for label in sorted(group_test.unique()):
    mask = group_test == label
    # labels=[0, 1] guarantees a 2x2 matrix even if a group lacks one of the classes
    cm_group = confusion_matrix(y_test[mask], y_pred[mask], labels=[0, 1])
    print()
    print(f'Confusion matrix for {label}:')
    display(pd.DataFrame(
        cm_group,
        index=pd.Index(['Actual 0', 'Actual 1']),
        columns=pd.Index(['Predicted 0', 'Predicted 1'])
    ))
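
A confusion matrix is easier to compare across groups once it is converted into rates. As a minimal sketch, the example below takes hypothetical labels and predictions for a single group (the `*_demo` arrays are invented for illustration) and derives the true positive, false positive, and false negative rates from the matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels and predictions for a single group.
y_true_demo = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred_demo = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# For binary labels, ravel() returns the cells in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1]).ravel()

tpr_demo = tp / (tp + fn)  # share of actual positives predicted positive
fpr_demo = fp / (fp + tn)  # share of actual negatives predicted positive
fnr_demo = fn / (fn + tp)  # share of actual positives predicted negative

print(f"TPR: {tpr_demo:.2f}, FPR: {fpr_demo:.2f}, FNR: {fnr_demo:.2f}")
```

Repeating this per group and comparing the rates shows whether one group bears more false negatives (missed approvals) or false positives than the other.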



## 8.7 Mitigating Bias — Simple Reweighting Strategy

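One mitigation tactic is **reweighting**, where we increase the influence of underrepresented or disadvantaged examples during training. Here each (group, label) cell receives the weight N / (K × n_cell), where N is the number of training rows, K the number of cells, and n_cell the number of rows in that cell. We then fit a new model using those weights. Because the weight is inversely proportional to cell size, every subgroup contributes the same total weight to the training objective.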

> **Teaching tip:** Reweighting is a transparent intervention that maps closely to policy levers such as affirmative action or targeted outreach.
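
As a small illustration of the weighting rule described above, the sketch below applies it to hypothetical subgroup counts (the numbers in `toy_counts` are invented) and confirms that each cell ends up with the same total weight.

```python
import pandas as pd

# Hypothetical subgroup counts, chosen to be unbalanced.
toy_counts = pd.DataFrame({
    "group":    ["Group A", "Group A", "Group B", "Group B"],
    "approved": [0, 1, 0, 1],
    "count":    [100, 400, 300, 200],
})

n_total = toy_counts["count"].sum()  # 1000 rows overall
n_cells = len(toy_counts)            # 4 (group, label) cells

# weight = N / (K * n_cell): smaller cells receive larger per-row weights.
toy_counts["weight"] = n_total / (n_cells * toy_counts["count"])

# Total weight per cell is count * weight, which equals N / K for every cell.
toy_counts["weighted_total"] = toy_counts["count"] * toy_counts["weight"]

print(toy_counts)
```

The smallest cell (100 rows) gets a per-row weight of 2.5 and the largest (400 rows) gets 0.625, so both contribute a total of 250.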


In [None]:

train_df = X_train.copy()
train_df[target] = y_train
train_df[sensitive] = group_train

# Inverse-frequency weight for each (group, label) cell: N / (K * n_cell),
# where N is the number of training rows and K the number of cells
counts = train_df.groupby([sensitive, target]).size().rename('count').reset_index()
counts['weight'] = counts['count'].sum() / (len(counts) * counts['count'])

# A left merge keeps train_df's row order (the index is reset), so the weights stay aligned with X_train
train_df = train_df.merge(counts[[sensitive, target, 'weight']], on=[sensitive, target], how='left')

sample_weights = train_df['weight'].to_numpy()

mitigated_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=1000, solver='lbfgs'))
])

mitigated_pipeline.fit(X_train, y_train, clf__sample_weight=sample_weights)
mitigated_pred = mitigated_pipeline.predict(X_test)

print(f"Mitigated accuracy: {accuracy_score(y_test, mitigated_pred):.3f}")
print()
print('Classification report (mitigated model):')
print(classification_report(y_test, mitigated_pred))



### Comparing Fairness Metrics Before and After Mitigation

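To assess whether the intervention helped, we recompute the fairness metrics for the mitigated model and compare them side by side with the baseline results. When reading the table, note that SPD and EOD are ideal at 0 while DI is ideal at 1, so the sign of the change alone does not indicate improvement. Changes are often small, and metrics can move in different directions, which illustrates the trade-offs and iterative nature of fairness work.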


In [None]:

mitigated_pred_binary = mitigated_pred.astype(int)

mitigated_spd = statistical_parity_difference(mitigated_pred_binary, group_test, reference_group, protected_group)
mitigated_di = disparate_impact_ratio(mitigated_pred_binary, group_test, reference_group, protected_group)
mitigated_eod = equal_opportunity_difference(target_array, mitigated_pred_binary, group_test, reference_group, protected_group)

comparison = pd.DataFrame({
    'Metric': ['Statistical Parity Difference', 'Disparate Impact Ratio', 'Equal Opportunity Difference'],
    'Baseline': [spd, di, eod],
    'Mitigated': [mitigated_spd, mitigated_di, mitigated_eod]
})
comparison['Change (Mitigated - Baseline)'] = comparison['Mitigated'] - comparison['Baseline']
comparison
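
One way to read such a table is to measure each metric's distance from its parity value. The sketch below does this on hypothetical numbers (the values in `demo` are invented and are not results from this notebook); the column names are assumptions chosen to mirror the comparison table.

```python
import pandas as pd

# Hypothetical metric values, for illustration only.
demo = pd.DataFrame({
    "Metric": [
        "Statistical Parity Difference",
        "Disparate Impact Ratio",
        "Equal Opportunity Difference",
    ],
    "Baseline": [-0.30, 0.55, -0.20],
    "Mitigated": [-0.25, 0.62, -0.22],
})

# Value under exact parity: 0 for the two differences, 1 for the ratio.
ideal = pd.Series([0.0, 1.0, 0.0])

demo["Baseline gap"] = (demo["Baseline"] - ideal).abs()
demo["Mitigated gap"] = (demo["Mitigated"] - ideal).abs()
demo["Closer to parity"] = demo["Mitigated gap"] < demo["Baseline gap"]

print(demo[["Metric", "Baseline gap", "Mitigated gap", "Closer to parity"]])
```

In this made-up case, SPD and DI move closer to parity while EOD moves slightly away, a reminder that a single intervention rarely improves every metric at once.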



## 8.8 Closing Discussion

Fairness assessments are inherently contextual. This notebook showed how to:

1. Surface disparities in model outcomes.
2. Quantify those disparities with multiple metrics.
3. Apply a lightweight mitigation strategy and evaluate its effect.

Encourage students to experiment with alternative mitigation ideas (such as different weighting schemes, threshold adjustments, or feature auditing) to see how each approach shifts the balance between fairness and accuracy.
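
As a starting point for the threshold-adjustment idea, here is a minimal sketch under stated assumptions. The scores are hypothetical draws from two beta distributions rather than outputs of the notebook's model, and `target_rate` is an arbitrary choice. It picks a separate threshold per group so that both groups receive positive predictions at roughly the same rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predicted probabilities; Group B's scores tend to be lower.
scores_a = rng.beta(5, 3, size=500)
scores_b = rng.beta(3, 5, size=500)

# A single 0.5 threshold gives the groups different positive rates.
rate_a_single = (scores_a >= 0.5).mean()
rate_b_single = (scores_b >= 0.5).mean()

# Group-specific thresholds: take each group's (1 - target_rate) quantile so
# that about the same share of each group is predicted positive.
target_rate = 0.4
thr_a = np.quantile(scores_a, 1 - target_rate)
thr_b = np.quantile(scores_b, 1 - target_rate)

rate_a_adj = (scores_a >= thr_a).mean()
rate_b_adj = (scores_b >= thr_b).mean()

print(f"Single threshold:  A = {rate_a_single:.2f}, B = {rate_b_single:.2f}")
print(f"Group thresholds:  A = {rate_a_adj:.2f}, B = {rate_b_adj:.2f}")
```

This approach targets statistical parity only; true positive rates can still differ between groups, so students should recompute all three metrics after any threshold change.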
