# Zemel et al. pre-processing fairness intervention

This notebook applies the pre-processing fairness intervention introduced in [Learning Fair Representations](http://proceedings.mlr.press/v28/zemel13.html) by Zemel et al. (2013), using the implementation that is part of the IBM AIF360 fairness toolbox (github.com/IBM/AIF360).

The intervention targets demographic parity through a clustering method: it transforms the original data set by expressing each point as a linear combination of learnt cluster centres. The transformed data set should stay as close as possible to the original while retaining as little information as possible about the sensitive attribute.

Besides a fair data representation, the method also outputs fair label predictions. Labels for the transformed data set are predicted so that similar points receive similar predictions. In that sense, individual fairness is promoted, too.
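
To make the representation described above concrete, here is a minimal sketch of the soft-assignment idea from the paper: each point gets a softmax weight per cluster centre (called a prototype in the paper), and both the transformed point and its predicted label are weighted sums over the prototypes. The function names, the plain squared Euclidean distance, and the per-prototype label scores are assumptions made for illustration; this is not the AIF360 implementation.

```python
import math


def soft_assignments(x, prototypes):
    """Softmax over negative squared Euclidean distances to each prototype."""
    dists = [sum((xi - vi) ** 2 for xi, vi in zip(x, v)) for v in prototypes]
    # Subtract the smallest distance before exponentiating for numerical stability.
    d_min = min(dists)
    weights = [math.exp(-(d - d_min)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def reconstruct(x, prototypes):
    """Express x as a weighted combination of the prototypes."""
    m = soft_assignments(x, prototypes)
    return [sum(m_k * v[j] for m_k, v in zip(m, prototypes)) for j in range(len(x))]


def predict_label(x, prototypes, prototype_labels):
    """Predicted label score: assignment-weighted average of per-prototype scores."""
    m = soft_assignments(x, prototypes)
    return sum(m_k * w_k for m_k, w_k in zip(m, prototype_labels))


# Hypothetical prototypes and per-prototype label scores (illustrative only).
prototypes = [[0.0, 0.0], [1.0, 1.0]]
prototype_labels = [0.1, 0.9]

x = [0.5, 0.5]
m = soft_assignments(x, prototypes)
x_hat = reconstruct(x, prototypes)
y_hat = predict_label(x, prototypes, prototype_labels)
```

Because the point in this example is equidistant from both prototypes, it receives equal weights. Nearby points receive nearby weights and therefore similar predictions, which is the sense in which the method relates to individual fairness.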

Here, we consider fairness defined with respect to sex. There is another notebook considering fairness with respect to race using Zemel et al.'s intervention method.

In [None]:
from pathlib import Path

import joblib
import pandas as pd
import plotly.graph_objs as go

from aif360.algorithms.preprocessing.lfr import LFR
from aif360.datasets import StandardDataset

from helpers.fairness_measures import accuracy, disparate_impact_d
from helpers.finance import preprocess
from helpers import export_plot

## Load data

Location of artifacts (model and data)

In [None]:
artifacts_dir = Path("../../../artifacts")

In [None]:
# override data_dir in source notebook
# this is stripped out for the hosted notebooks
artifacts_dir = Path("../../../../artifacts")

In [None]:
data_dir = artifacts_dir / "data" / "adult"
preprocess(data_dir)

Location of the data

In [None]:
train = pd.read_csv(data_dir / "processed" / "train-one-hot.csv")
val = pd.read_csv(data_dir / "processed" / "val-one-hot.csv")
test = pd.read_csv(data_dir / "processed" / "test-one-hot.csv")

To apply the fairness intervention, we wrap the data in special dataset objects, which are part of every intervention pipeline within the IBM AIF360 toolbox. These objects hold the original data together with further useful information, e.g., which feature is the protected attribute and which column corresponds to the label.

In [None]:
train_sds = StandardDataset(
    train,
    label_name="salary",
    favorable_classes=[1],
    protected_attribute_names=["sex"],
    privileged_classes=[[1]],
)
test_sds = StandardDataset(
    test,
    label_name="salary",
    favorable_classes=[1],
    protected_attribute_names=["sex"],
    privileged_classes=[[1]],
)
val_sds = StandardDataset(
    val,
    label_name="salary",
    favorable_classes=[1],
    protected_attribute_names=["sex"],
    privileged_classes=[[1]],
)
index = train_sds.feature_names.index("sex")

In [None]:
privileged_groups = [{"sex": 1.0}]
unprivileged_groups = [{"sex": 0.0}]

## Demographic parity

Given the original, unfair data set, we apply Zemel et al.'s intervention to obtain a fair data set including fair labels. More precisely, we either load an already learnt mitigation or learn a new mitigation procedure based on the true and predicted labels of the training data. We then apply the learnt procedure to transform the test data and analyse fairness and accuracy on the transformed test data.

The degree of fairness and accuracy can be controlled by the choice of the parameters $A_x, A_y, A_z$ and $k$ when setting up the mitigation procedure. Here, $k$ is the number of cluster centres, $A_x$ weights the loss associated with the distance between the original and transformed data set, $A_y$ weights the accuracy loss, and $A_z$ weights the fairness loss. The larger one of these weights is relative to the others, the higher the priority given to minimising its associated loss. Hence, leaving $A_x$ and $A_y$ fixed, we can increase the degree of fairness achieved by increasing $A_z$.
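
The trade-off described above can be sketched as a weighted sum of the three loss terms. The function below and the loss values for the two candidate solutions are hypothetical numbers chosen purely for illustration; only the parameter names and the default weights (taken from the commented-out training cell later in this notebook) come from the notebook itself.

```python
def weighted_objective(loss_x, loss_y, loss_z, Ax=0.01, Ay=1.0, Az=25.0):
    """Combine reconstruction, accuracy and fairness losses into one objective."""
    return Ax * loss_x + Ay * loss_y + Az * loss_z


# Hypothetical candidates: (reconstruction loss, accuracy loss, fairness loss).
accurate_but_unfair = (1.0, 0.30, 0.10)
fair_but_less_accurate = (2.0, 0.40, 0.01)

# With a small fairness weight, the more accurate candidate has the lower objective.
low_az_a = weighted_objective(*accurate_but_unfair, Az=0.1)
low_az_b = weighted_objective(*fair_but_less_accurate, Az=0.1)

# With a large fairness weight, the fairer candidate has the lower objective.
high_az_a = weighted_objective(*accurate_but_unfair, Az=25.0)
high_az_b = weighted_objective(*fair_but_less_accurate, Az=25.0)
```

With $A_x$ and $A_y$ held fixed, raising $A_z$ flips which candidate the optimiser would prefer, which is the mechanism by which a larger $A_z$ buys more fairness at the cost of accuracy.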

As differences in fairness between independently learnt mitigations with the same parameter choice can sometimes be significant, we load a pre-trained intervention which achieves reasonable results. The user is still encouraged to train interventions themselves (see the commented-out code below) and to compare the achieved fairness, potentially over a number of independent runs.
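
One way to compare independent runs is to collect a fairness score per run and look at its mean and spread. The helper below is a hypothetical sketch: `summarise_runs` and `fake_train_and_score` are not part of this project or of AIF360, and the stand-in scoring function only generates placeholder numbers. A real version would fit the intervention on the training data with the given seed and return, for example, the demographic parity gap on the validation data.

```python
import random
import statistics


def summarise_runs(train_and_score, seeds):
    """Call train_and_score(seed) once per seed and summarise the scores."""
    scores = [train_and_score(seed) for seed in seeds]
    return {
        "scores": scores,
        "mean": statistics.mean(scores),
        # The sample standard deviation needs at least two runs.
        "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }


def fake_train_and_score(seed):
    """Placeholder for 'train with this seed, return a fairness score'."""
    rng = random.Random(seed)
    return 0.05 + 0.02 * rng.random()


summary = summarise_runs(fake_train_and_score, seeds=range(5))
```

A large standard deviation relative to the mean would indicate that a single trained intervention is not representative of what the chosen parameters typically achieve.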

### Load or learn intervention

a) Load the intervention previously learnt on the training data.

In [None]:
TR = joblib.load(artifacts_dir / "models" / "finance" / "zemel-sex.pkl")

b) Learn the intervention on the training data.

In [None]:
# TR = LFR(
#     unprivileged_groups=unprivileged_groups,
#     privileged_groups=privileged_groups,
#     k=5,
#     Ax=0.01,
#     Ay=1.0,
#     Az=25.0,
# )
# TR = TR.fit(train_sds)

### Apply intervention

Apply the learnt intervention to the test set.

In [None]:
transf_test_sds = TR.transform(test_sds)
test_fair_labels = transf_test_sds.labels.flatten()

## Analyse fairness and accuracy

Evaluate the fair labels on the test data.

In [None]:
print("Accuracy =", accuracy(test_fair_labels, test.salary))
print(
    "Female accuracy =",
    accuracy(test_fair_labels[test.sex == 0], test.salary[test.sex == 0]),
)
print(
    "Male accuracy =",
    accuracy(test_fair_labels[test.sex == 1], test.salary[test.sex == 1]),
)
print("Mean female score =", test_fair_labels[test.sex == 0].mean())
print("Mean male score =", test_fair_labels[test.sex == 1].mean())

dp_d = disparate_impact_d(test_fair_labels, test.sex)
print("Demographic parity =", dp_d)
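
The definition of the `disparate_impact_d` helper is not shown in this notebook. As a rough illustration of what a demographic parity measure captures, here is a minimal sketch that computes the difference in positive-prediction rates between the two groups. The function name, the sign convention (privileged minus unprivileged), and the toy data are assumptions; the project helper may differ, for instance by using the opposite sign or a ratio.

```python
def demographic_parity_difference(labels, group):
    """Positive-prediction rate of group 1 minus that of group 0."""
    priv = [y for y, g in zip(labels, group) if g == 1]
    unpriv = [y for y, g in zip(labels, group) if g == 0]
    if not priv or not unpriv:
        raise ValueError("both groups must be non-empty")
    return sum(priv) / len(priv) - sum(unpriv) / len(unpriv)


# Toy data (illustrative only): predicted labels and group membership.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]

gap = demographic_parity_difference(labels, group)
```

A gap of zero means both groups receive the favourable label at the same rate, which is exactly demographic parity.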

### Demographic parity

In [None]:
dp_bar = go.Figure(
    data=[
        go.Bar(
            x=[sex],
            y=[test_fair_labels[test.sex == sex].mean()],
            name="Male" if sex else "Female",
        )
        for sex in range(2)
    ],
    layout={"yaxis": {"range": [0, 1]}},
)
dp_bar

In [None]:
export_plot(dp_bar, "zemel-sex-dp.json")