<a href="https://colab.research.google.com/github/temahm/AiCon/blob/main/Income_Fairness_Evaluation_with_Logistic_Regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Dataset A: Adult Income (UCI Census Income)**

Use case: hiring / income proxy fairness
Sensitive attributes: sex, race

Why good: no licensing issues

In [None]:
from sklearn.datasets import fetch_openml
adult = fetch_openml("adult", version=2, as_frame=True)
df = adult.frame
df.head()


**Dataset B (Optional / advanced): COMPAS Recidivism (ProPublica)**

Use case: justice risk scoring fairness

Sensitive attribute: race (and sex)

Why good: powerful story

Colab load (direct CSV from ProPublica repo)

In [None]:
import pandas as pd
url = "https://raw.githubusercontent.com/propublica/compas-analysis/master/compas-scores-two-years.csv"
# Use a separate name so the Adult DataFrame `df` loaded above is not overwritten
df_compas = pd.read_csv(url)
df_compas.head()


Installing Tools

In [None]:
!pip -q install fairlearn scikit-learn pandas numpy matplotlib

Also install XGBoost for the model comparison

In [None]:
!pip -q install xgboost

In [None]:
print(df.columns)

Preprocess the Adult dataset

In [None]:
import pandas as pd
import numpy as np

# Clean missing values represented as '?'
df = df.replace("?", np.nan).dropna()

# Define target y (binary)
y = (df["class"] == ">50K").astype(int)

# Sensitive features (kept separately for fairness evaluation)
A_sex = df["sex"]
A_race = df["race"]

# Features X: drop target + sensitive columns (you can keep sensitive columns OUT of training)
X = df.drop(columns=["class", "sex", "race"])
X = pd.get_dummies(X, drop_first=True)  # one-hot encode categoricals


Train and Test data split

In [None]:
from sklearn.model_selection import train_test_split
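# Split both sensitive attributes alongside X and y so either can be used for evaluation
X_train, X_test, y_train, y_test, Asex_train, Asex_test, Arace_train, Arace_test = train_test_split(
    X, y, A_sex, A_race, test_size=0.2, random_state=42, stratify=y
)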

Baseline model (Logistic Regression)

In [None]:
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(max_iter=2000, n_jobs=-1)
model.fit(X_train, y_train)

Predictions:

In [None]:
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:,1]

Overall accuracy might look “good”, but error rates can differ across groups. That difference is a fairness risk.

METRICS TO COMPUTE:

- Selection rate: how often the model predicts “positive”
- False Positive Rate (FPR): the share of actual negatives incorrectly flagged positive (harm from being wrongly flagged)
- False Negative Rate (FNR): the share of actual positives incorrectly predicted negative (harm from being wrongly denied)
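
As a quick sanity check on these three definitions, the sketch below computes them by hand on a tiny made-up example. The arrays and the helper name `toy_rates` are illustrative assumptions and are not part of the notebook's pipeline.

```python
import numpy as np

# Toy labels and predictions (made up for illustration): 1 = positive, 0 = negative.
y_true_toy = np.array([1, 0, 0, 1, 0, 1, 0, 0])
y_pred_toy = np.array([1, 1, 0, 0, 0, 1, 0, 1])

def toy_rates(y_true, y_pred):
    """Return (selection rate, FPR, FNR) for binary 0/1 arrays."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    selection = (tp + fp) / len(y_true)   # share predicted positive
    fpr = fp / (fp + tn)                  # share of actual negatives flagged positive
    fnr = fn / (fn + tp)                  # share of actual positives predicted negative
    return float(selection), float(fpr), float(fnr)

sel_toy, fpr_toy, fnr_toy = toy_rates(y_true_toy, y_pred_toy)
print(f"selection rate={sel_toy:.2f}, FPR={fpr_toy:.2f}, FNR={fnr_toy:.2f}")
```

Here 4 of 8 cases are predicted positive, 2 of 5 actual negatives are flagged, and 1 of 3 actual positives is missed.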

**Code: group metrics with Fairlearn MetricFrame**

In [None]:
from fairlearn.metrics import MetricFrame, selection_rate
from sklearn.metrics import accuracy_score, confusion_matrix

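def false_positive_rate(y_true, y_hat):
    # labels=[0, 1] keeps the matrix 2x2 even if a group contains only one class
    tn, fp, fn, tp = confusion_matrix(y_true, y_hat, labels=[0, 1]).ravel()
    return fp / (fp + tn)

def false_negative_rate(y_true, y_hat):
    # Same 2x2 guarantee as above
    tn, fp, fn, tp = confusion_matrix(y_true, y_hat, labels=[0, 1]).ravel()
    return fn / (fn + tp)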

metrics = {
    "accuracy": accuracy_score,
    "selection_rate": selection_rate,
    "FPR": false_positive_rate,
    "FNR": false_negative_rate
}

mf = MetricFrame(
    metrics=metrics,
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=Asex_test
)

mf.by_group


Disparity: max difference

In [None]:
mf.difference()

This returns one number per metric: the gap between the highest and lowest group value.
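
To make the disparity number concrete, here is a minimal sketch of the same max-minus-min calculation on a hand-built table. The group names and metric values are hypothetical placeholders, not results from this notebook.

```python
import numpy as np

# Hypothetical per-group metric values (rows = groups, columns = metrics).
groups_toy = ["Female", "Male"]
metric_names_toy = ["selection_rate", "FPR"]
by_group_toy = np.array([
    [0.08, 0.02],   # Female
    [0.26, 0.09],   # Male
])

# Largest between-group gap per metric: max over groups minus min over groups.
disparity_toy = by_group_toy.max(axis=0) - by_group_toy.min(axis=0)
for name, gap in zip(metric_names_toy, disparity_toy):
    print(f"{name}: {gap:.2f}")
```

With only two groups this is simply the absolute difference between them; with more groups it is the spread between the best and worst group.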

A) Dataset statement

“Adult Income dataset; target is income bracket; sensitive attributes include sex/race; used for fairness benchmarking.”

B) Model statement

“Baseline logistic regression; chosen for transparency and stable behavior; sensitive attributes excluded from training features.”

C) Fairness definition statement

“We evaluate fairness by comparing selection rates and error rates across groups.”

“Large gaps indicate the model may treat groups differently.”


D) Interpretation statement

“Disparities can stem from historical patterns in data, feature proxies, and model decision boundaries.”

**Human-in-the-loop demo (quick)**

This simulates a human review of “borderline cases” and shows how the group metrics change.

Pick borderline cases near the threshold

In [None]:
threshold = 0.5
borderline = (y_prob > 0.45) & (y_prob < 0.55)

Apply a simple “human review policy”

Example: if borderline negative but strong indicators, flip to positive:

In [None]:
y_pred_h = y_pred.copy()

rule = (
    borderline &
    (y_pred == 0) &
    (X_test.get("education-num", pd.Series(0, index=X_test.index)) >= 13) &
    (X_test.get("hours-per-week", pd.Series(0, index=X_test.index)) >= 40)
)

y_pred_h[rule] = 1

Recompute fairness metrics (before vs after)

In [None]:
mf_after = MetricFrame(
    metrics=metrics,
    y_true=y_test,
    y_pred=y_pred_h,
    sensitive_features=Asex_test
)

print("BEFORE (by group):")
display(mf.by_group)
print("\nBEFORE (disparity):")
display(mf.difference())

print("\nAFTER (by group):")
display(mf_after.by_group)
print("\nAFTER (disparity):")
display(mf_after.difference())

# XGBoost (comparison only)
- Gradient-boosted models like XGBoost are common in real-world tabular systems such as applicant tracking systems (ATS).
- “better accuracy ≠ more fair.”

In [None]:
from xgboost import XGBClassifier
xgb = XGBClassifier(
    n_estimators=300, max_depth=4, learning_rate=0.05,
    subsample=0.8, colsample_bytree=0.8, random_state=42, eval_metric="logloss"
)
xgb.fit(X_train, y_train)
y_pred_xgb = xgb.predict(X_test)

Now running MetricFrame on y_pred_xgb

In [None]:
from fairlearn.metrics import MetricFrame

mf_xgb = MetricFrame(
    metrics=metrics,              # same metrics dictionary
    y_true=y_test,
    y_pred=y_pred_xgb,
    sensitive_features=Asex_test  # or the race test split, if A_race was split as well
)

Exporting a clean scorecard

In [None]:
scorecard = mf.by_group.copy()
scorecard.loc["DISPARITY (max-min)"] = mf.difference()
scorecard

Save CSV

In [None]:
scorecard.to_csv("AII_scorecard_adult_sex.csv")

Group-Level Results

In [None]:
print("XGBoost — Metrics by Group:")
display(mf_xgb.by_group)

Disparity (max difference across groups)

In [None]:
print("XGBoost — Disparity:")
display(mf_xgb.difference())

# Logistic vs XGBoost

In [None]:
print("Logistic Regression Disparity:")
display(mf.difference())

print("\nXGBoost Disparity:")
display(mf_xgb.difference())

# “Although XGBoost may improve predictive accuracy, fairness disparities across groups may increase or persist. Higher accuracy does not guarantee equitable outcomes.”

Higher predictive accuracy does not mean equal error distribution across groups.

XGBoost optimizes global accuracy. Fairness depends on how errors are distributed across subpopulations.

XGBoost:

- Captures nonlinear relationships
- Detects complex feature interactions
- Fits fine-grained decision boundaries
- Minimizes total loss aggressively

Logistic regression:

- Assumes linear relationships
- Has a single global decision boundary
- Is less flexible

**So XGBoost typically finds patterns logistic regression cannot.**

Accuracy is an aggregate metric

Accuracy = (Correct predictions) / (Total predictions)

It does NOT tell you:

- Who is being misclassified
- Which group has higher false positives
- Which group has higher false negatives

Two models can have:

- Same accuracy
- Very different group-level errors
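
A small made-up example of that situation: two sets of predictions with identical accuracy, where one spreads its false positives evenly and the other concentrates them in a single group. All arrays and helper names here are illustrative assumptions.

```python
import numpy as np

# Made-up example: 8 people in two groups with the same true labels per group.
group_eq = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
y_true_eq = np.array([0, 0, 1, 1, 0, 0, 1, 1])

# Model 1 makes one false positive in each group.
pred_m1 = np.array([1, 0, 1, 1, 1, 0, 1, 1])
# Model 2 makes both of its false positives in group "b".
pred_m2 = np.array([0, 0, 1, 1, 1, 1, 1, 1])

def toy_accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def toy_fpr_by_group(y_true, y_pred, group):
    """FPR per group: share of that group's actual negatives predicted positive."""
    out = {}
    for g in np.unique(group):
        negatives = (group == g) & (y_true == 0)
        out[str(g)] = float(np.mean(y_pred[negatives] == 1))
    return out

for name, pred in [("model 1", pred_m1), ("model 2", pred_m2)]:
    print(name, "accuracy:", toy_accuracy(y_true_eq, pred),
          "FPR by group:", toy_fpr_by_group(y_true_eq, pred, group_eq))
```

Both models score 0.75 accuracy, yet the second one places every false positive on group "b".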

Or:

- Higher accuracy
- Worse disparity

# Why XGBoost can amplify disparity

XGBoost builds trees that:

- Partition data into increasingly specific regions
- Exploit subtle correlations

If features correlate with sensitive attributes (even indirectly), the model may:

- Learn proxies for protected characteristics
- Create decision boundaries that disproportionately affect one group

Example: education, zip code, work history, and income can all act as proxies.

The more powerful the model, the more precisely it can exploit those patterns.

That increases accuracy, but it may also increase disparity.
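
One rough way to look for proxies is to check how strongly each feature correlates with the sensitive attribute, even when that attribute is excluded from training. The sketch below uses synthetic data only; `proxy_feature`, `neutral_feature`, and `abs_corr` are hypothetical names, and a simple correlation is just a first-pass screen, not a complete proxy analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_toy = 2000

# Synthetic sensitive attribute (0/1), not taken from the Adult data.
sensitive_toy = rng.integers(0, 2, size=n_toy)
# "proxy_feature" is shifted by group membership, so it leaks the attribute.
proxy_feature = rng.normal(loc=sensitive_toy * 1.5, scale=1.0)
# "neutral_feature" is drawn independently of the attribute.
neutral_feature = rng.normal(loc=0.0, scale=1.0, size=n_toy)

def abs_corr(feature, attribute):
    """Absolute Pearson correlation between a feature and a 0/1 attribute."""
    return abs(float(np.corrcoef(feature, attribute)[0, 1]))

print("proxy  :", round(abs_corr(proxy_feature, sensitive_toy), 2))
print("neutral:", round(abs_corr(neutral_feature, sensitive_toy), 2))
```

A feature with a high value here can let a flexible model reconstruct group membership indirectly, which is the mechanism described above.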

# The Key Insight

More predictive power = more ability to learn structural inequalities in the data.

If historical bias exists in data:

A more powerful model will reproduce it more efficiently.

Accuracy measures fit to historical reality.

Fairness evaluates whether that reality should be reproduced.
That’s the core philosophical tension.

**Accuracy measures how well the model predicts the past. Fairness measures how evenly the model distributes its mistakes. A more powerful model can predict the past better, including past inequities.**

In [None]:
comparison = pd.DataFrame({
    "Logistic_Disparity": mf.difference(),
    "XGBoost_Disparity": mf_xgb.difference()
})

comparison

Chart: False Positive Rate by group, Logistic Regression vs XGBoost

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from fairlearn.metrics import MetricFrame, false_positive_rate

# Define metric dictionary (only FPR for clean visualization)
metrics_fpr = {"FPR": false_positive_rate}

# Logistic model
mf_log = MetricFrame(
    metrics=metrics_fpr,
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=Asex_test
)

# XGBoost model (separate name so the full-metric mf_xgb above is not overwritten)
mf_xgb_fpr = MetricFrame(
    metrics=metrics_fpr,
    y_true=y_test,
    y_pred=y_pred_xgb,
    sensitive_features=Asex_test
)

# Convert to dataframe
df_plot = pd.DataFrame({
    "Logistic Regression": mf_log.by_group["FPR"],
    "XGBoost": mf_xgb_fpr.by_group["FPR"]
})

df_plot

In [None]:
df_plot.plot(kind="bar", figsize=(8,5))
plt.title("False Positive Rate by Group")
plt.ylabel("False Positive Rate")
plt.xlabel("Group")
plt.xticks(rotation=0)
plt.legend()
plt.tight_layout()
plt.show()

“A higher FPR for one group means the model is more likely to incorrectly label individuals from that group as positive (in a risk-scoring setting such as COMPAS, as high risk). That is a measurable bias.”

Computing disparity: The maximum difference in FPR between groups is X%.

In [None]:
print("Logistic FPR Disparity:", mf_log.difference()["FPR"])
print("XGBoost FPR Disparity:", mf_xgb.difference()["FPR"])

# Before vs After Human-in-the-Loop

Using the Adult dataset

(To evaluate by race instead, split A_race in train_test_split as well and pass its test split in place of Asex_test.)

Get probabilities and baseline predictions

In [None]:
# Baseline (Logistic) predictions
y_prob = model.predict_proba(X_test)[:, 1]
y_pred = (y_prob >= 0.5).astype(int)

**Define a Human-in-the-Loop (HITL) “borderline review” policy**

We will:

- Identify borderline cases near the decision boundary (uncertain)
- Apply a consistent “human review guideline” to a small subset
- Recompute fairness metrics
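
The three steps can be sketched end to end on synthetic data before running them on the real predictions. Everything below is an assumption made for illustration: the scores, labels, groups, and the `strong_toy` flag are randomly generated stand-ins, not Adult features.

```python
import numpy as np

rng = np.random.default_rng(42)
n_demo = 1000

# Synthetic stand-ins: model scores, true labels, a group label, and a
# hypothetical "strong indicator" flag that a reviewer would look for.
prob_toy = rng.uniform(0, 1, size=n_demo)
true_toy = (rng.uniform(0, 1, size=n_demo) < prob_toy).astype(int)
group_toy = rng.choice(["A", "B"], size=n_demo)
strong_toy = rng.uniform(0, 1, size=n_demo) < 0.3

# Step 1: baseline predictions and the borderline band around the threshold.
pred_toy = (prob_toy >= 0.5).astype(int)
borderline_toy = (prob_toy >= 0.45) & (prob_toy <= 0.55)

# Step 2: one consistent review guideline, applied only inside the band.
override_toy = borderline_toy & (pred_toy == 0) & strong_toy
pred_reviewed = pred_toy.copy()
pred_reviewed[override_toy] = 1

# Step 3: recompute FPR per group before and after review.
def group_fpr(y_true, y_pred, group):
    return {
        str(g): float(np.mean(y_pred[(group == g) & (y_true == 0)] == 1))
        for g in np.unique(group)
    }

fpr_before_toy = group_fpr(true_toy, pred_toy, group_toy)
fpr_after_toy = group_fpr(true_toy, pred_reviewed, group_toy)
print("overrides:", int(override_toy.sum()))
print("before:", fpr_before_toy)
print("after :", fpr_after_toy)
```

Because this guideline only flips negatives to positives, each group's FPR can stay the same or rise but never fall, so the before and after comparison is what shows whether the gap between groups narrows.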

In [None]:
import numpy as np

threshold = 0.5
band_low, band_high = 0.45, 0.55
borderline = (y_prob >= band_low) & (y_prob <= band_high)

**Human review rule**

For borderline cases predicted negative, flip to positive if “strong indicators” exist.

Adult dataset features typically include education-num, hours-per-week, and capital-gain; these are numeric, so one-hot encoding leaves their names unchanged.

Important: If your preprocessing changed column names, you may need to adjust feature access.
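
If you are unsure what the encoded columns are called, a quick look at a tiny frame shows the naming pattern. The frame below is made up; only `pd.get_dummies` with `drop_first=True` mirrors the preprocessing used earlier.

```python
import pandas as pd

# Tiny made-up frame with one numeric and one categorical column.
toy = pd.DataFrame({
    "education-num": [9, 13, 16],
    "workclass": ["Private", "State-gov", "Private"],
})
toy_encoded = pd.get_dummies(toy, drop_first=True)

# Numeric columns keep their names; categoricals become "<column>_<level>".
print(list(toy_encoded.columns))

# Look columns up by prefix instead of guessing exact names.
workclass_cols = [c for c in toy_encoded.columns if c.startswith("workclass")]
print(workclass_cols)
```

The same prefix lookup works on `X_test.columns` when a rule needs a one-hot encoded feature.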

In [None]:
import pandas as pd

y_pred_h = y_pred.copy()

# Safe gets (won't crash if column missing)
edu = X_test["education-num"] if "education-num" in X_test.columns else pd.Series(0, index=X_test.index)
hrs = X_test["hours-per-week"] if "hours-per-week" in X_test.columns else pd.Series(0, index=X_test.index)
cap = X_test["capital-gain"] if "capital-gain" in X_test.columns else pd.Series(0, index=X_test.index)

human_rule = (
    borderline &
    (y_pred == 0) &
    (edu >= 13) &
    (hrs >= 40) &
    (cap > 0)
)

y_pred_h[human_rule] = 1

print("Borderline cases:", borderline.sum())
print("Human overrides applied:", human_rule.sum())

Compute fairness metrics Before vs After (FPR by group)

In [None]:
from fairlearn.metrics import MetricFrame, false_positive_rate

mf_before = MetricFrame(
    metrics={"FPR": false_positive_rate},
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=Asex_test
)

mf_after = MetricFrame(
    metrics={"FPR": false_positive_rate},
    y_true=y_test,
    y_pred=y_pred_h,
    sensitive_features=Asex_test
)

print("FPR by group (BEFORE):")
display(mf_before.by_group)

print("FPR by group (AFTER):")
display(mf_after.by_group)

print("FPR disparity BEFORE:", mf_before.difference()["FPR"])
print("FPR disparity AFTER :", mf_after.difference()["FPR"])

One Chart: Before vs After HITL Adjustment

In [None]:
import matplotlib.pyplot as plt
import pandas as pd

df_hitl_plot = pd.DataFrame({
    "Before (Baseline)": mf_before.by_group["FPR"],
    "After (HITL Review)": mf_after.by_group["FPR"]
})

df_hitl_plot.plot(kind="bar", figsize=(8,5))
plt.title("False Positive Rate by Group — Before vs After Human-in-the-Loop Review")
plt.ylabel("False Positive Rate")
plt.xlabel("Group")
plt.xticks(rotation=0)
plt.legend()
plt.tight_layout()
plt.show()

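First, we measure fairness as differences in error rates across groups. Then we apply a targeted human-in-the-loop review only to borderline, uncertain cases, using transparent guidelines. Because this rule only flips negatives to positives, it cannot lower any group's FPR, so compare the before and after disparity values (along with FNR and selection rate) to confirm whether the gap actually narrows. Most of the model stays automated. In AII, these human decisions are logged, auditable, and policy-driven, not arbitrary.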
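
As a minimal sketch of what "logged and auditable" could look like, the snippet below records each override with the rule that justified it. The notebook does not specify AII's actual log format, so the field names, the `log_override` helper, and the example values are all hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

def log_override(log, case_id, model_score, old_label, new_label, rule_id, reviewer):
    """Append one human-review decision to an in-memory audit log."""
    log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "case_id": case_id,
        "model_score": model_score,
        "old_label": old_label,
        "new_label": new_label,
        "rule_id": rule_id,      # which written guideline justified the change
        "reviewer": reviewer,
    })

audit_log = []
log_override(audit_log, case_id=101, model_score=0.47, old_label=0, new_label=1,
             rule_id="borderline-strong-indicators", reviewer="reviewer_1")

# Serialize to CSV text so the log can be stored and reviewed later.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(audit_log[0].keys()))
writer.writeheader()
writer.writerows(audit_log)
print(buffer.getvalue())
```

Tying every override to a named rule is what lets a later audit separate policy-driven changes from ad hoc ones.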