# Zemel et al. pre-processing fairness intervention

Zemel et al. (2013) propose a clustering method that transforms the original data set by expressing each point as a linear combination of learnt cluster centres (prototypes). The transformed data set is kept as close as possible to the original while retaining as little information as possible about the sensitive attributes. In this way the method aims to achieve demographic parity.

Besides a fair data representation, the method also outputs fair label predictions, which lets us compare it with other approaches using the usual fairness metrics. We apply the approach as implemented in IBM's AIF360 fairness toolkit, where it is available as `LFR`.

In [None]:
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
from aif360.algorithms.preprocessing.lfr import LFR  # noqa
from aif360.datasets import StandardDataset
from fairlearn.metrics import (
    demographic_parity_difference,
    demographic_parity_ratio,
)
from helpers.metrics import accuracy
from helpers.plot import group_bar_plots

In [None]:
from helpers import export_plot

## Load data

We have committed preprocessed data to the repository for reproducibility and we load it here. Check out the preprocessing notebook for details on how this data was obtained.

In [None]:
artifacts_dir = Path("../../../artifacts")

In [None]:
# override data_dir in source notebook
# this is stripped out for the hosted notebooks
artifacts_dir = Path("../../../../artifacts")

In [None]:
data_dir = artifacts_dir / "data" / "recruiting"

In [None]:
train = pd.read_csv(data_dir / "processed" / "train.csv")
val = pd.read_csv(data_dir / "processed" / "val.csv")
test = pd.read_csv(data_dir / "processed" / "test.csv")

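AIF360 requires each data set to be wrapped in its `StandardDataset` class, which records the label column, the favourable label value, and the protected attribute together with its privileged value.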

In [None]:
train_sds = StandardDataset(
    train,
    label_name="employed_yes",
    favorable_classes=[1],
    protected_attribute_names=["race_white"],
    privileged_classes=[[1]],
)
test_sds = StandardDataset(
    test,
    label_name="employed_yes",
    favorable_classes=[1],
    protected_attribute_names=["race_white"],
    privileged_classes=[[1]],
)
val_sds = StandardDataset(
    val,
    label_name="employed_yes",
    favorable_classes=[1],
    protected_attribute_names=["race_white"],
    privileged_classes=[[1]],
)
index = train_sds.feature_names.index("race_white")

In [None]:
privileged_groups = [{"race_white": 1.0}]
unprivileged_groups = [{"race_white": 0.0}]

## Train unfair model

For maximum reproducibility we load the baseline model from disk. The code used to train it can be found in the baseline model notebook.

In [None]:
bl_model = joblib.load(
    artifacts_dir / "models" / "recruiting" / "baseline.pkl"
)

bl_test_probs = bl_model.predict_proba(test_sds.features)[:, 1]
bl_test_pred = bl_test_probs > 0.5

## Learn fair representation

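We chose the hyperparameters $A_x$, $A_y$, $A_z$ and $k$ by grid search, and load a pretrained model from disk for reproducibility. Here $k$ is the number of prototypes, while $A_x$, $A_y$ and $A_z$ weight the reconstruction, prediction and fairness terms of the objective respectively. We encourage you to experiment with other values of these hyperparameters.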
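
To make the roles of these weights concrete, the following is a minimal NumPy sketch of the objective described in Zemel et al. (2013), evaluated for fixed prototypes. It is an illustration under simplifying assumptions, not AIF360's implementation: the function name `lfr_objective`, the use of a plain squared Euclidean distance, and the toy data are all ours, and the real method additionally optimises the prototypes `V` and prototype labels `w`.

```python
import numpy as np


def lfr_objective(X, y, s, V, w, Ax=0.01, Ay=1.0, Az=1500.0):
    """Evaluate a simplified LFR-style objective for fixed V and w.

    X: (n, d) features, y: (n,) binary labels, s: (n,) binary sensitive
    attribute, V: (k, d) prototypes, w: (k,) prototype label scores in (0, 1).
    """
    # Squared Euclidean distance from every point to every prototype: (n, k).
    # (Assumption: the paper also allows learnt per-feature distance weights.)
    dists = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)

    # Soft assignment of points to prototypes: softmax over negative distance.
    logits = -dists
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    M = np.exp(logits)
    M /= M.sum(axis=1, keepdims=True)

    X_hat = M @ V  # reconstruction as a combination of prototypes
    y_hat = np.clip(M @ w, 1e-6, 1 - 1e-6)  # predicted label probability

    L_x = ((X - X_hat) ** 2).sum()  # reconstruction error
    L_y = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)).sum()  # log loss
    # Gap between the groups' average prototype memberships (parity term).
    L_z = np.abs(M[s == 1].mean(axis=0) - M[s == 0].mean(axis=0)).sum()

    total = Ax * L_x + Ay * L_y + Az * L_z
    return {"L_x": L_x, "L_y": L_y, "L_z": L_z, "total": total}


# Toy usage with random data (values are illustrative only).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 3))
y_toy = rng.integers(0, 2, size=20)
s_toy = np.arange(20) % 2  # alternate groups so both are non-empty
V_toy = rng.normal(size=(5, 3))  # k = 5 prototypes
w_toy = rng.uniform(0.1, 0.9, size=5)
terms = lfr_objective(X_toy, y_toy, s_toy, V_toy, w_toy)
print(terms)
```

With a large $A_z$ relative to $A_x$ and $A_y$, as in the values above, the optimiser is pushed strongly towards representations in which both groups occupy the prototypes in equal proportions, at some cost to reconstruction and predictive accuracy.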

In [None]:
TR = joblib.load(artifacts_dir / "models" / "recruiting" / "zemel.pkl")

# TR = LFR(
#     unprivileged_groups=unprivileged_groups,
#     privileged_groups=privileged_groups,
#     k=5,
#     Ax=0.01,
#     Ay=1.0,
#     Az=1500.0,
# )
# TR = TR.fit(train_sds)  # , maxiter=500, maxfun=500)

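Apply the learnt transformation to the test data. The transformed data set carries the fair label predictions in its `labels` attribute.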

In [None]:
transf_test_sds = TR.transform(test_sds)
test_fair_labels = transf_test_sds.labels.flatten()

Evaluate fairness and accuracy for both the baseline predictions and the fair labels.

In [None]:
bl_acc = bl_model.score(test.drop(columns="employed_yes"), test.employed_yes)
bl_dpd = demographic_parity_difference(
    test.employed_yes, bl_test_pred, sensitive_features=test.race_white,
)
bl_dpr = demographic_parity_ratio(
    test.employed_yes, bl_test_pred, sensitive_features=test.race_white,
)

acc = accuracy(test.employed_yes, test_fair_labels)
dpd = demographic_parity_difference(
    test.employed_yes, test_fair_labels, sensitive_features=test.race_white,
)
dpr = demographic_parity_ratio(
    test.employed_yes, test_fair_labels, sensitive_features=test.race_white,
)

print(f"Baseline accuracy: {bl_acc:.3f}")
print(f"Accuracy: {acc:.3f}\n")

print(f"Baseline demographic parity difference: {bl_dpd:.3f}")
print(f"Demographic parity difference: {dpd:.3f}\n")

print(f"Baseline demographic parity ratio: {bl_dpr:.3f}")
print(f"Demographic parity ratio: {dpr:.3f}")
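
For intuition about the two fairness metrics above, here is a small hand-rolled sketch of how they can be computed from per-group selection rates. To our understanding this matches how fairlearn defines them (largest minus smallest rate, and smallest divided by largest rate), but the helper names and toy arrays below are hypothetical and the code is not taken from fairlearn.

```python
import numpy as np


def selection_rates(y_pred, sensitive):
    """Return the fraction of positive predictions within each group."""
    y_pred = np.asarray(y_pred, dtype=float)
    sensitive = np.asarray(sensitive)
    return {g: y_pred[sensitive == g].mean() for g in np.unique(sensitive)}


def dp_difference(y_pred, sensitive):
    """Largest minus smallest group selection rate (0 means parity)."""
    rates = selection_rates(y_pred, sensitive).values()
    return max(rates) - min(rates)


def dp_ratio(y_pred, sensitive):
    """Smallest divided by largest group selection rate (1 means parity).

    Assumes at least one group receives a positive prediction.
    """
    rates = selection_rates(y_pred, sensitive).values()
    return min(rates) / max(rates)


# Toy example: 3 of 4 positives in group 1, 1 of 4 positives in group 0.
toy_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
toy_group = np.array([1, 1, 1, 1, 0, 0, 0, 0])
print(dp_difference(toy_pred, toy_group))  # 0.75 - 0.25
print(dp_ratio(toy_pred, toy_group))  # 0.25 / 0.75
```

Note that neither metric depends on the true labels, only on the predictions and the sensitive attribute.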

We visualise the difference in mean outcomes using a bar chart.

In [None]:
dp_bar = group_bar_plots(
    np.concatenate([bl_test_pred, test_fair_labels]),
    np.tile(test.race_white.map({0: "Black", 1: "White"}), 2),
    groups=np.concatenate(
        [np.zeros_like(bl_test_pred), np.ones_like(test_fair_labels)]
    ),
    group_names=["Baseline", "Zemel"],
    title="Proportion of positive (employed) predictions by race",
    xlabel="Proportion of positive (employed) predictions",
    ylabel="Method",
)
dp_bar

In [None]:
export_plot(dp_bar, "zemel-dp.json")