# Part 4: Improve fairness

In the previous notebook we saw how to measure fairness. In this notebook we'll use a `ThresholdOptimizer` to improve the fairness of our model. We'll cover the following topics:

* Loading the dataset and model
* Using the ThresholdOptimizer
* Comparing the performance between the old and the new model

## Loading the dataset and model
Before we work on improving the models, let's load the existing models and the dataset.
We'll perform the following steps:

* First, we load the models.
* Then, we load the training and test datasets.

In [1]:
import joblib
import pandas as pd

In [2]:
model = joblib.load('../models/model.bin')

In [3]:
model_1 = joblib.load('../models/model_1.bin')

In [4]:
model_2 = joblib.load('../models/model_2.bin')

In [5]:
model_3 = joblib.load('../models/model_3.bin')

In [6]:
df_train = pd.read_csv('../data/processed/train.csv')
df_test = pd.read_csv('../data/processed/test.csv')

In [7]:
x_train = df_train.drop(['SEX', 'default.payment.next.month'], axis=1)
y_train = df_train['default.payment.next.month']

In [8]:
x_test = df_test.drop(['SEX', 'default.payment.next.month'], axis=1)
y_test = df_test['default.payment.next.month']

Previously, our dataset was unbalanced because we didn't spend any time on class balance. Before we improve the model with the `ThresholdOptimizer`, we need to make sure that the input contains as many samples with the output value `0` as samples with the output value `1`. This is a requirement of the algorithm that's used by the optimizer.

In [9]:
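# Keep every row labelled 1, then add an equally sized random sample of rows labelled 0.
balanced_idx = y_train[y_train == 1].index
balanced_idx = balanced_idx.union(
    y_train[y_train == 0].sample(n=balanced_idx.size, random_state=42).index
)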

In [10]:
x_train_balanced = df_train.loc[balanced_idx, :].drop(['SEX', 'default.payment.next.month'], axis=1)
y_train_balanced = df_train.loc[balanced_idx, 'default.payment.next.month']
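
As a quick sanity check, the balancing step can be tried on a small toy series first. This is only a sketch: `y_toy` and the other names are made up for illustration and stand in for `y_train`, and the fixed `random_state` is an assumption added for reproducibility.

```python
import pandas as pd

# Toy stand-in for the training labels (hypothetical data, not the real dataset).
y_toy = pd.Series([0, 0, 0, 0, 0, 0, 1, 1, 0, 1], name="default.payment.next.month")

# Keep every positive row and draw an equally sized random sample of negatives.
positive_idx = y_toy[y_toy == 1].index
negative_idx = y_toy[y_toy == 0].sample(n=positive_idx.size, random_state=0).index
toy_balanced_idx = positive_idx.union(negative_idx)

# Both classes should now appear the same number of times.
class_counts = y_toy.loc[toy_balanced_idx].value_counts()
print(class_counts.to_dict())
```

The same `value_counts()` call on `y_train_balanced` should show two equal counts if the balancing worked as intended.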

Once we have a prepared dataset, we can start working on optimizing the model using the `ThresholdOptimizer`.

## Using the ThresholdOptimizer
We can improve existing models using the `ThresholdOptimizer`. This component wraps the original model and adds a post-processing step on top of it that reduces the performance difference between demographic groups. It treats the original model as a black box.
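
To get a feel for what a post-processing layer on top of a black-box model can look like, here is a deliberately simplified sketch that applies a different cut-off to the model's scores for each group. It is not the algorithm fairlearn implements; the function `predict_with_group_thresholds`, the scores, and the cut-off values are all hypothetical and chosen by hand.

```python
import numpy as np

def predict_with_group_thresholds(scores, groups, thresholds):
    """Turn scores into 0/1 labels using one cut-off per group."""
    scores = np.asarray(scores, dtype=float)
    # Look up the cut-off that applies to each individual sample.
    cutoffs = np.array([thresholds[g] for g in groups], dtype=float)
    return (scores >= cutoffs).astype(int)

# Hypothetical scores from some black-box model, for two groups of three people.
scores = [0.2, 0.6, 0.7, 0.4, 0.45, 0.9]
groups = [1, 1, 1, 2, 2, 2]

# One shared cut-off: group 1 is selected more often than group 2.
single = predict_with_group_thresholds(scores, groups, {1: 0.5, 2: 0.5})

# Hand-picked per-group cut-offs that equalize the selection rates.
per_group = predict_with_group_thresholds(scores, groups, {1: 0.5, 2: 0.42})

print(single.tolist())
print(per_group.tolist())
```

In this toy example the shared cut-off selects two of three people in group 1 but only one of three in group 2, while the per-group cut-offs select two of three in both. The underlying scores never change, which is what treating the model as a black box means here.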

There are two methods of optimization that you can use:

* Equalized odds
* Demographic parity

Both are fairness constraints that the optimizer uses to reduce the differences between demographic groups. Note that we can't fully fix the problem: we want to maximize performance (and thus minimize model loss) while also minimizing the differences between the demographic groups specified by a set of sensitive features. These two goals pull against each other, so the result is a balancing act in which we can only reach a trade-off between the disparity metric and the performance metric.
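
The two criteria can be made concrete with a small hand-rolled sketch. Demographic parity compares the selection rate (the share of positive predictions) between groups, while equalized odds compares the true positive rate and the false positive rate. The helper functions and the toy data below are assumptions for illustration; taking the larger of the two rate gaps as the equalized odds difference is one common convention.

```python
import pandas as pd

def demographic_parity_difference(y_pred, groups):
    """Largest gap in selection rate between any two groups."""
    frame = pd.DataFrame({"y_pred": list(y_pred), "group": list(groups)})
    rates = frame.groupby("group")["y_pred"].mean()
    return float(rates.max() - rates.min())

def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the true positive rate gap and the false positive rate gap."""
    frame = pd.DataFrame(
        {"y_true": list(y_true), "y_pred": list(y_pred), "group": list(groups)}
    )
    # Among actual positives, the mean prediction is the true positive rate.
    tpr = frame[frame["y_true"] == 1].groupby("group")["y_pred"].mean()
    # Among actual negatives, the mean prediction is the false positive rate.
    fpr = frame[frame["y_true"] == 0].groupby("group")["y_pred"].mean()
    return float(max(tpr.max() - tpr.min(), fpr.max() - fpr.min()))

# Hypothetical labels and predictions for two groups of four people each.
y_true_toy = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred_toy = [1, 1, 1, 0, 1, 0, 0, 0]
groups_toy = [1, 1, 1, 1, 2, 2, 2, 2]

dp_gap = demographic_parity_difference(y_pred_toy, groups_toy)
eo_gap = equalized_odds_difference(y_true_toy, y_pred_toy, groups_toy)
print(dp_gap, eo_gap)
```

A value of `0` for either measure would mean the groups are treated identically under that criterion; larger values mean a larger disparity.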

In [11]:
from fairlearn.postprocessing import ThresholdOptimizer

In [12]:
optimizer = ThresholdOptimizer(estimator=model, constraints='demographic_parity')
optimizer_1 = ThresholdOptimizer(estimator=model_1, constraints='demographic_parity')
optimizer_2 = ThresholdOptimizer(estimator=model_2, constraints='demographic_parity')
optimizer_3 = ThresholdOptimizer(estimator=model_3, constraints='demographic_parity')

In [13]:
optimizer.fit(x_train_balanced, y_train_balanced, sensitive_features=df_train.loc[balanced_idx, 'SEX'])
optimizer_1.fit(x_train_balanced, y_train_balanced, sensitive_features=df_train.loc[balanced_idx, 'SEX'])
optimizer_2.fit(x_train_balanced, y_train_balanced, sensitive_features=df_train.loc[balanced_idx, 'SEX'])
optimizer_3.fit(x_train_balanced, y_train_balanced, sensitive_features=df_train.loc[balanced_idx, 'SEX'])

## Measuring fairness of the new model
Once we have an improved model, we can look at its fairness and compare it to the old model. First, we'll measure the fairness of the new model.

In [14]:
from fairlearn.widget import FairlearnDashboard

In [15]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['SEX'],
    y_true=df_test['default.payment.next.month'],
    y_pred=optimizer.predict(x_test, sensitive_features=df_test['SEX'])
)

In [16]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['SEX'],
    y_true=df_test['default.payment.next.month'],
    y_pred=optimizer_1.predict(x_test, sensitive_features=df_test['SEX'])
)

In [17]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['SEX'],
    y_true=df_test['default.payment.next.month'],
    y_pred=optimizer_2.predict(x_test, sensitive_features=df_test['SEX'])
)

In [18]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['SEX'],
    y_true=df_test['default.payment.next.month'],
    y_pred=optimizer_3.predict(x_test, sensitive_features=df_test['SEX'])
)

Next, we compare each original model with its optimized counterpart.

In [19]:
comparison = {
    'Original model': model.predict(x_test),
    'ThresholdOptimizer': optimizer.predict(x_test, sensitive_features=df_test['SEX'])
}

In [20]:
comparison_1 = {
    'Original model': model_1.predict(x_test),
    'ThresholdOptimizer': optimizer_1.predict(x_test, sensitive_features=df_test['SEX'])
}

In [21]:
comparison_2 = {
    'Original model': model_2.predict(x_test),
    'ThresholdOptimizer': optimizer_2.predict(x_test, sensitive_features=df_test['SEX'])
}

In [22]:
comparison_3 = {
    'Original model': model_3.predict(x_test),
    'ThresholdOptimizer': optimizer_3.predict(x_test, sensitive_features=df_test['SEX'])
}
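
Because the dashboard is interactive, it can also help to have the same comparison as a plain table. The sketch below summarizes a dictionary of predictions, shaped like the `comparison` dictionaries above, into accuracy and selection rate per group. The helper `summarize_by_group` and the toy data are hypothetical and only illustrate the idea.

```python
import pandas as pd

def summarize_by_group(y_true, predictions, groups):
    """Return accuracy and selection rate per (model, group) pair."""
    tables = []
    for y_pred in predictions.values():
        frame = pd.DataFrame(
            {"y_true": list(y_true), "y_pred": list(y_pred), "group": list(groups)}
        )
        # 1.0 where the prediction matches the label, 0.0 otherwise.
        frame["correct"] = (frame["y_true"] == frame["y_pred"]).astype(float)
        stats = frame.groupby("group")[["correct", "y_pred"]].mean()
        tables.append(
            stats.rename(columns={"correct": "accuracy", "y_pred": "selection_rate"})
        )
    # Stack the per-model tables under a (model, group) index.
    return pd.concat(tables, keys=list(predictions), names=["model", "group"])

# Toy labels, groups, and predictions (made up for illustration).
y_true_toy = [0, 1, 0, 1, 0, 1, 0, 1]
groups_toy = [1, 1, 1, 1, 2, 2, 2, 2]
toy_comparison = {
    "Original model": [0, 1, 0, 1, 0, 0, 0, 0],
    "ThresholdOptimizer": [0, 1, 0, 0, 0, 1, 0, 0],
}

summary = summarize_by_group(y_true_toy, toy_comparison, groups_toy)
print(summary)
```

With the real data, a call such as `summarize_by_group(y_test, comparison, df_test['SEX'])` should give one row per model and group. In the toy data the second set of predictions gives up some accuracy in group 1 in exchange for equal selection rates, which mirrors the trade-off described earlier.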

In [23]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['Gender'],
    y_true=df_test['default.payment.next.month'],
    y_pred=comparison
)

In [24]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['Gender'],
    y_true=df_test['default.payment.next.month'],
    y_pred=comparison_1
)

In [25]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['Gender'],
    y_true=df_test['default.payment.next.month'],
    y_pred=comparison_2
)

In [26]:
FairlearnDashboard(
    sensitive_features=df_test['SEX'],
    sensitive_feature_names=['Gender'],
    y_true=df_test['default.payment.next.month'],
    y_pred=comparison_3
)
