# Removing Unfair Bias in Machine Learning

**[Code@Think UKI](https://www.ibm.com/uk-en/events/think-summit/codeatthink.html) workshop - 7 October 2020 -  Margriet Groenendijk** 

*Code from [these examples](https://github.com/Trusted-AI/AIF360/tree/master/examples) is used and adapted.*

## Outline

AI can embed human and societal bias and then be deployed at scale.

Many algorithms are now being reexamined due to illegal bias. 

So how do you remove bias & discrimination in the machine learning pipeline? 

In this workshop debiasing techniques will be explored that can be implemented by using the open source toolkit [AI Fairness 360](https://github.com/IBM/AIF360). 

[1. Introduction](#intro)

[2. Bias definitions](#definitions)

* [2.1 Bias and variance](#stats)
* [2.2 Statistical vs. cognitive bias](#defs)

[3. AI fairness metrics](#metrics)

* [3.1 Install aif360 and import packages](#install)
* [3.2 Exploring data](#explore)
* [3.3 Exploring bias](#bias)

[4. Model building](#model)
 
* [4.1 Train on the original data](#original) 
 
[5. AI fairness algorithms](#algorithms)

* [5.1 Pre-processing](#preproc)
* [5.2 In-processing](#inproc)
* [5.3 Post-processing](#postproc)


<a class="anchor" id="intro"></a>
# 1. Introduction

**In the UK this happened...**


<img src="https://github.com/MargrietGroenendijk/gitbooks2/blob/master/files/Alevels.png?raw=true" width="1000" align="left">

* This model was used to estimate students' grades
* It was based on their previous grades, the school they went to, the average for the school, etc.
* The estimates affected their chances of being accepted by universities

# Is this fair?

# How can you avoid this?

## Thinking about bias

<span style="font-family:Comic Sans MS">Going back now to the start of my [PhD research](https://research.vu.nl/en/publications/boxing-nature-global-generalities-in-terrestrial-ecosystem-photos), I clearly knew then there was a big risk in using these observations to explore global scale relationships as I found in one of the first slides I made. But looking back now, I am not so sure I fully explored this enough. </span>

<img src="https://github.com/MargrietGroenendijk/gitbooks2/blob/master/files/points.png?raw=true"  width="1000" align="left">

## [Emb(race)](https://www.ibm.org/responsibility/2019/case-studies/embrace)

> IBM and IBMers stand with the Black community and call for change to ensure racial equality.

How is this all connected? Can technology be used, even if it can only be a small part of a possible solution?

### Let's explore fairness, and how to define and reduce bias in data and models. 

<a class="anchor" id="definitions"></a>
# 2. Bias definitions

<a class="anchor" id="stats"></a>
## 2.1 Bias and variance

Some definitions that match how I used to think of bias (and that are the first ones to come up in a Google search):

The [**bias error**](https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff) is an error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).

The [**bias**](https://en.wikipedia.org/wiki/Bias_of_an_estimator) (or bias function) of an estimator is the difference between this estimator's expected value and the true value of the parameter being estimated.

[**Statistical bias**](https://en.wikipedia.org/wiki/Bias_(statistics)) is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.

Probably clearer in this form with an example from [here](https://machinelearningmastery.com/calculate-the-bias-variance-trade-off/):

Error = Bias² + Variance + Noise (for squared-error loss)

* The bias is a measure of how closely the model can capture the mapping function between inputs and outputs.
* The variance of the model is the amount the performance of the model changes when it is fit on different training data.
* Bias-variance trade-off: reducing bias can easily be achieved by increasing variance, and the other way around.
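
To make the trade-off a bit more tangible, here is a minimal simulation sketch. Everything in it is an assumption for illustration (the sine target, the noise level, the polynomial degrees, the name `bias_variance`); it only shows the general pattern that a stiff model has more bias and less variance than a flexible one.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # hypothetical "true" relationship between input and output
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0, 1, 30)
x_test = np.linspace(0, 1, 100)
noise_sd = 0.3
n_repeats = 200

def bias_variance(degree):
    """Estimate squared bias and variance of a polynomial fit of a given degree."""
    preds = np.empty((n_repeats, x_test.size))
    for r in range(n_repeats):
        # a fresh noisy training set for every repeat
        y_train = true_fn(x_train) + rng.normal(0, noise_sd, x_train.size)
        coefs = np.polyfit(x_train, y_train, degree)
        preds[r] = np.polyval(coefs, x_test)
    # squared bias: how far the average prediction is from the truth
    bias_sq = np.mean((preds.mean(axis=0) - true_fn(x_test)) ** 2)
    # variance: how much predictions move between training sets
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance

results = {d: bias_variance(d) for d in (1, 5)}
for d, (b, v) in results.items():
    print(f"degree {d}: bias^2 = {b:.4f}, variance = {v:.4f}")
```

The straight line (degree 1) cannot follow the sine curve, so its squared bias is large, while the degree 5 polynomial follows it closely but reacts more to the noise in each training set.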

This is not the bias that we will discuss in this workshop.

> <span style="font-family:Comic Sans MS">The bias rabbit hole... </span> links to links to even more definitions: [inductive bias](https://en.wikipedia.org/wiki/Inductive_bias), [calibration bias](https://onlinelibrary.wiley.com/doi/10.1002/9781118736890.ch7) or [precision bias](https://en.wikipedia.org/wiki/Precision_bias).

<a class="anchor" id="defs"></a>
## 2.2 Statistical vs. cognitive bias

There are clearly a lot of definitions. Do these have anything to do with fairness? Some of the definitions from [this page](https://en.wikipedia.org/wiki/Bias_(statistics)) are types of bias that can cause unfairness:

* **Selection bias** involves individuals being more likely to be selected for study than others, biasing the sample
* **Spectrum bias** arises from evaluating diagnostic tests on biased patient samples, leading to an overestimate of the sensitivity and specificity of the test
* The **bias of an estimator** is the difference between an estimator's expected value and the true value of the parameter being estimated
* **Omitted-variable bias** is the bias that appears in estimates of parameters in regression analysis when the assumed specification omits an independent variable that should be in the model
* **Detection bias** occurs when a phenomenon is more likely to be observed for a particular set of study subjects. 
* **Funding bias** may lead to the selection of outcomes, test samples, or test procedures that favor a study's financial sponsor.
* **Reporting bias** involves a skew in the availability of data, such that observations of a certain kind are more likely to be reported.
* **Analytical bias** arises due to the way that the results are evaluated.
* **Exclusion bias** arises due to the systematic exclusion of certain individuals from the study.
* **Attrition bias** arises due to a loss of participants, e.g. loss to follow-up during a study.
* **Recall bias** arises due to differences in the accuracy or completeness of participant recollections of past events, e.g. a patient cannot recall exactly how many cigarettes they smoked last week, leading to over-estimation or under-estimation.
* **Observer bias** arises when the researcher subconsciously influences the experiment due to cognitive bias where judgment may alter how an experiment is carried out / how results are recorded.

For fair models, metrics are needed that can help explore if any of the above biases is present in data and models. 
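
As a small illustration of the first item in the list, selection bias, here is a toy simulation. The population, the "income" variable and the selection rule are all made up for this sketch; the point is only that a sample drawn with unequal selection probabilities no longer represents the population.

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical population: 10,000 incomes in arbitrary units
population = rng.normal(loc=50, scale=10, size=10_000)

# unbiased sample: every individual is equally likely to be selected
random_sample = rng.choice(population, size=500, replace=False)

# biased sample: the chance of being selected grows with income
weights = population - population.min() + 1e-9
weights = weights / weights.sum()
biased_sample = rng.choice(population, size=500, replace=False, p=weights)

print(f"population mean:    {population.mean():.2f}")
print(f"random sample mean: {random_sample.mean():.2f}")
print(f"biased sample mean: {biased_sample.mean():.2f}")
```

The biased sample overestimates the population mean because high incomes are over-represented in it.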

<a class="anchor" id="metrics"></a>
# 3. AI fairness metrics

<a class="anchor" id="install"></a>
## 3.1 Install aif360 and import packages

In [None]:
!pip install aif360
!pip install cvxpy
!pip install tqdm

In [None]:
import warnings
warnings.filterwarnings('ignore')

import sys
sys.path.insert(1, "../")  

# data exploration
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from tqdm import tqdm

np.random.seed(0)

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

# aif360 data, metrics and algorithms
from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.metrics import ClassificationMetric
from aif360.algorithms.preprocessing import Reweighing
from aif360.algorithms.preprocessing.optim_preproc import OptimPreproc
from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import load_preproc_data_german
from aif360.algorithms.preprocessing.optim_preproc_helpers.distortion_functions import get_distortion_german
from aif360.algorithms.preprocessing.optim_preproc_helpers.opt_tools import OptTools
from aif360.algorithms.postprocessing.reject_option_classification import RejectOptionClassification

from IPython.display import Markdown, display
%matplotlib inline

<a class="anchor" id="explore"></a>
## 3.2 Exploring data

A [dataset](https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29) that classifies people described by a set of attributes as good or bad credit risks.

This data is one of the example [datasets](https://aif360.readthedocs.io/en/latest/modules/datasets.html#module-aif360.datasets) used in aif360 and has its own [class](https://aif360.readthedocs.io/en/latest/modules/generated/aif360.datasets.GermanDataset.html#aif360.datasets.GermanDataset) that will be used. 

### Load data

It is assumed that the dataset can be found within a specific location. Let's create this folder and then download the data to this new folder. The advantage is that this works both locally and when running in Watson Studio.

In [None]:
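# locate the installed aif360 package and the folder where it expects the raw data
# (distutils is no longer available in recent Python versions)
import os
import aif360
install_loc = os.path.join(os.path.dirname(aif360.__file__), "data", "raw", "german")
os.makedirs(install_loc, exist_ok=True)
%cd $install_loc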

In [None]:
!wget ftp://ftp.ics.uci.edu/pub/machine-learning-databases/statlog/german/german.data
!wget ftp://ftp.ics.uci.edu/pub/machine-learning-databases/statlog/german/german.doc
%cd -

In [None]:
dataset_german = GermanDataset()

### AIF360 data format

All variables of this dataset are described in the [documentation](https://aif360.readthedocs.io/en/latest/modules/generated/aif360.datasets.GermanDataset.html) with more details in the description of the [`StandardDataset`](https://aif360.readthedocs.io/en/latest/modules/generated/aif360.datasets.StandardDataset.html). In short, the dataset class contains a numpy array or pandas DataFrame with several variables. 

In [None]:
type(dataset_german)

In [None]:
type(dataset_german.features)

In [None]:
print(f'labels: {dataset_german.label_names}')
print(f'protected attributes: {dataset_german.protected_attribute_names}')
print(f'number of features: {len(dataset_german.feature_names)}')

### Explore with pandas

Convert the data to a `features` DataFrame and `labels` Series:

In [None]:
features = pd.DataFrame(dataset_german.features, columns=dataset_german.feature_names)
labels = pd.Series(dataset_german.labels.ravel(), name=dataset_german.label_names[0])

In [None]:
features.describe().transpose().head(12)

From the [data description](https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29):

Attribute 1: (qualitative) \
Status of existing checking account \
A11 : ... < 0 DM \
A12 : 0 <= ... < 200 DM \
A13 : ... >= 200 DM / salary assignments for at least 1 year \
A14 : no checking account 

In [None]:
labels.unique()

### The distribution of the features

In [None]:
plt.rcParams["figure.figsize"] = (18,18)

features.hist();
plt.tight_layout()

Most features are binary, but a few are continuous:

In [None]:
features[['credit_amount', 'month', 'number_of_credits']].plot(
    subplots=True,
    kind='hist',
    layout=(2, 2),
    sharex=False,
    figsize=(10, 10));

<a class="anchor" id="bias"></a>
## 3.3 Exploring bias (the aif360 way)

Let's review the basics of how a model is created in a supervised machine learning process to understand how bias can enter a machine learning model:

<img src="https://nbviewer.jupyter.org/github/IBM/AIF360/blob/master/examples/images/Complex_NoProc_V3.jpg"   width="500" align="left">

Bias can enter the system in any of these three steps:
1. The process starts with a training dataset, which contains a sequence of instances, where each instance has two components: the features and the correct prediction for those features. 
2. A machine learning algorithm is trained on this training dataset to produce a machine learning model. This generated model can be used to make a prediction when given a new instance. 
3. A second dataset with features and correct predictions, called a test dataset, is used to assess the accuracy of the model. Because the test dataset has the same format as the training dataset (instances of feature and prediction pairs), the two often derive from the same initial dataset, which a random partitioning algorithm splits into training and test datasets.

* The training data set may be biased in that its outcomes may be biased towards particular kinds of instances
* The algorithm that creates the model may be biased in that it may generate models that are weighted towards particular features in the input
* The test data set may be biased in that it has expectations on correct answers that may be biased

These three points in the machine learning process represent points for testing and mitigating bias. In AI Fairness 360 these are called:

* pre-processing 
* in-processing
* post-processing

### Bias in a credit dataset

Bias could occur based on age or sex in this dataset. 

* set the protected attribute to be `age`, where `age >=25` is considered privileged
* the protected attribute for `sex` is not considered in this evaluation
* split the original dataset into training and testing datasets
* set two variables for the privileged (1) and unprivileged (0) values for the age attribute. These are key inputs for detecting and mitigating bias
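
To see what the privileged (1) and unprivileged (0) values mean in practice, here is a tiny sketch of how a rule like `age >= 25` turns a numeric age into a binary protected attribute. The ages and the names `is_privileged` and `age_binary` are made up for this illustration.

```python
import numpy as np

# hypothetical ages in years
ages = np.array([19, 24, 25, 31, 67])

def is_privileged(x):
    # same rule as the one described above: 25 and older is privileged
    return x >= 25

# 1.0 = privileged, 0.0 = unprivileged
age_binary = is_privileged(ages).astype(float)
print(age_binary)  # [0. 0. 1. 1. 1.]
```

Note that someone who is exactly 25 falls in the privileged group.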

<div class="alert alert-success">
 <b>OPTIONAL EXERCISE</b> <br/> 
 To explore the gender bias in this dataset, edit the below code to use `sex` as the protected attribute and assign new privileged and unprivileged groups.
</div>

### Metrics based on a single `BinaryLabelDataset`

<div class="alert alert-info" style="font-size:100%">
<b>Read <a href="https://aif360.readthedocs.io/en/latest/modules/generated/aif360.metrics.BinaryLabelDatasetMetric.html">the documentation</a> for a full overview of this class and a list of all bias metrics. <a href="http://aif360.mybluemix.net/data">This demo</a> provides definitions of the metrics as well.</b><br>
</div>

In [None]:
dataset_german = GermanDataset(protected_attribute_names=['age'],
                    privileged_classes=[lambda x: x >= 25],      
                    features_to_drop=['personal_status', 'sex']) 

# Split into train (70%) and test (30%)
dataset_german_train, dataset_german_test = dataset_german.split([0.7], shuffle=True)

privileged_groups = [{'age': 1}]
unprivileged_groups = [{'age': 0}]

In [None]:
metric_german_train = BinaryLabelDatasetMetric(dataset_german_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

metric_german_test = BinaryLabelDatasetMetric(dataset_german_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)



In [None]:
help(metric_german_train)

### Bias metrics

<img src="https://github.com/MargrietGroenendijk/gitbooks2/blob/master/files/metrics.png?raw=true" width="1000" align="left">

* `mean_difference`: alias of `statistical_parity_difference` 
    * Difference between the rate of favorable outcomes received by the unprivileged group and that of the privileged group
    * A negative value indicates less favorable outcomes for the unprivileged group
    * The ideal value of this metric is 0
    * Values between -0.1 and 0.1 are considered fair for this metric
    

* `disparate_impact`: ratio of rate of favorable outcome for the unprivileged group to that of the privileged group

    $\frac{Pr(Y = 1 | D = \text{unprivileged})}{Pr(Y = 1 | D = \text{privileged})}$
     
     
* `consistency`: individual fairness metric that measures how similar the labels are for similar instances.

    $1 - \frac{1}{n}\sum_{i=1}^n \left|\hat{y}_i - \frac{1}{\text{n_neighbors}}\sum_{j\in\mathcal{N}_{\text{n_neighbors}}(x_i)} \hat{y}_j\right|$


* `base_rate`

    $Pr(Y = 1) = P/(P+N)$       
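
The group-based metrics are simple enough to compute by hand, which can help to get a feeling for them. This is a minimal sketch on made-up data (the arrays `labels` and `protected` are hypothetical), not the aif360 implementation.

```python
import numpy as np

# hypothetical toy data: label 1 = favorable, protected 1 = privileged group
labels    = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0])
protected = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

# rate of favorable outcomes per group
rate_priv = labels[protected == 1].mean()    # 4 out of 5
rate_unpriv = labels[protected == 0].mean()  # 1 out of 5

mean_difference = rate_unpriv - rate_priv    # difference of the two rates
disparate_impact = rate_unpriv / rate_priv   # ratio of the two rates
base_rate = labels.mean()                    # P / (P + N)

print(f"mean_difference  = {mean_difference:.2f}")   # -0.60
print(f"disparate_impact = {disparate_impact:.2f}")  # 0.25
print(f"base_rate        = {base_rate:.2f}")         # 0.50
```

In this toy example the unprivileged group receives the favorable label far less often, so the mean difference is clearly negative and the disparate impact is well below 1.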

In [None]:
display(Markdown("#### Original training dataset"))
print("mean_difference = %f" % metric_german_train.mean_difference())
print("disparate_impact = %f" % metric_german_train.disparate_impact())
print("consistency = %f" % metric_german_train.consistency())
print("base_rate = %f" % metric_german_train.base_rate())
print("num_negatives = %f" % metric_german_train.num_negatives())
print("num_positives = %f" % metric_german_train.num_positives())
print("smoothed_empirical_differential_fairness = %f" % metric_german_train.smoothed_empirical_differential_fairness())
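
The `consistency` value is less intuitive than the group metrics, so here is a hedged numpy sketch of the idea: compare each label with the average label of its nearest neighbours. The function below is my own illustration and makes two assumptions: distances are Euclidean, and each point counts as one of its own neighbours. The toy data is made up.

```python
import numpy as np

def consistency(X, y, n_neighbors=5):
    """1 - mean |y_i - mean label of the n_neighbors points closest to x_i|."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # pairwise squared Euclidean distances, shape (n, n)
    dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # indices of the closest points for every row (includes the point itself)
    neighbours = np.argsort(dists, axis=1)[:, :n_neighbors]
    neighbour_mean = y[neighbours].mean(axis=1)
    return 1.0 - np.abs(y - neighbour_mean).mean()

# hypothetical toy data: two well-separated clusters of three points
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y_consistent = np.array([1, 1, 1, 0, 0, 0])  # similar points, same label
y_mixed = np.array([1, 0, 1, 0, 1, 0])       # similar points, different labels

print(consistency(X, y_consistent, n_neighbors=3))  # 1.0
print(consistency(X, y_mixed, n_neighbors=3))       # about 0.556
```

A value of 1 means that similar instances always get the same label; the more the labels differ between neighbours, the lower the value.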

<div class="alert alert-info" style="font-size:200%">
<b>Question / discussion time</b> <br> 
</div>

* How would you visualise these metrics?
* Do you think these are meaningful?
* What if you are looking at multiple classes or a regression problem?
* Or where do you start when you cannot easily define the privileged and unprivileged groups?    
    

<a class="anchor" id="model"></a>
## 4. Model building

### Binary Classification

Some of the model options:
* Logistic regression
* Decision trees
* Random forests
* Bayesian networks
* Support vector machines
* Neural networks

### But first: scale and normalise features

* the dataset is tidy, so this is going to be unrealistically easy, e.g. there are no missing values
* one-hot encoding for multiple classes (already done, e.g., [features A11-A14](https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29))
* features need to be standardised so that they are on a comparable scale

[StandardScaler](https://scikit-learn.org/stable/modules/preprocessing.html) - 
*Standardization of datasets is a common requirement for many machine learning estimators implemented in scikit-learn; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance. `StandardScaler` implements the Transformer API to compute the mean and standard deviation on a training set so as to be able to later reapply the same transformation on the testing set.*
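
What this standardisation does can be written out in a few lines of numpy. This is only a sketch on made-up data to show the principle: the mean and standard deviation come from the training set and are then reused for the test set.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical data: two features on very different scales
train = rng.normal(loc=[1000.0, 2.0], scale=[500.0, 0.5], size=(200, 2))
test = rng.normal(loc=[1000.0, 2.0], scale=[500.0, 0.5], size=(50, 2))

# statistics are estimated on the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

# the same transformation is applied to both sets
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0).round(3), train_scaled.std(axis=0).round(3))
print(test_scaled.mean(axis=0).round(3), test_scaled.std(axis=0).round(3))
```

The scaled training features have exactly zero mean and unit variance; the test features only approximately, because no information from the test set was used.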

The aif360 format can be used with scikit-learn!

In [None]:
scale_german = StandardScaler().fit(dataset_german_train.features)

X_train = scale_german.transform(dataset_german_train.features)
y_train = dataset_german_train.labels.ravel()
w_train = dataset_german_train.instance_weights.ravel()

X_test = scale_german.transform(dataset_german_test.features)
y_test = dataset_german_test.labels.ravel()
w_test = dataset_german_test.instance_weights.ravel()

In [None]:
# what does the data look like now?
plt.rcParams["figure.figsize"] = (18,18)

scaled_features = pd.DataFrame(X_train, columns=dataset_german.feature_names)

scaled_features.hist();
plt.tight_layout()

<div class="alert alert-info" style="font-size:100%">
<b>If you are new to scikit-learn read this <a href="https://developer.ibm.com/series/learning-path-machine-learning-for-developers/">practical introduction</a> for a quick overview.</b><br>
</div>
    
    
<a class="anchor" id="original"></a>
### 4.1 Train on the original data

In [None]:
# Logistic regression classifier and predictions

# create an instance of the model
lmod = LogisticRegression()

# train the model
lmod.fit(X_train, y_train, 
         sample_weight=dataset_german_train.instance_weights)

# calculate predicted labels
y_train_pred = lmod.predict(X_train)

# assign positive class index
pos_ind = np.where(lmod.classes_ == dataset_german_train.favorable_label)[0][0]

# add predicted labels to predictions dataset
dataset_german_train_pred = dataset_german_train.copy()
# aif360 stores labels as a column vector of shape (n, 1)
dataset_german_train_pred.labels = y_train_pred.reshape(-1, 1)

dataset_german_test_pred = dataset_german_test.copy(deepcopy=True)
X_test = scale_german.transform(dataset_german_test_pred.features)
# scikit-learn expects a 1-D array of labels
y_test = dataset_german_test_pred.labels.ravel()
dataset_german_test_pred.scores = lmod.predict_proba(X_test)[:,pos_ind].reshape(-1,1)


In [None]:
# model accuracy
score = lmod.score(X_test, y_test)
print(score)

In [None]:
# confusion matrix
cm = metrics.confusion_matrix(y_test, lmod.predict(X_test))

plt.figure(figsize=(5,5))
sns.heatmap(cm, annot=True, fmt=".3f", linewidths=.5, square = True, cmap = 'Blues_r');
plt.ylabel('Actual label');
plt.xlabel('Predicted label');

In [None]:
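# plot_confusion_matrix was removed in scikit-learn 1.2;
# ConfusionMatrixDisplay.from_estimator (scikit-learn >= 1.0) replaces it
from sklearn.metrics import ConfusionMatrixDisplay

def plot_confusion_matrix(estimator, X, y, **kwargs):
    # thin wrapper so the cells below can keep the same call
    return ConfusionMatrixDisplay.from_estimator(estimator, X, y, **kwargs)

fig, ax = plt.subplots(1, figsize=(5, 5))
plot_confusion_matrix(lmod, X_test, y_test,
                      cmap=plt.cm.Blues,
                      display_labels=['good credit','bad credit'],
                      normalize='true', ax=ax);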

In [None]:
plt.rcParams["figure.figsize"] = (5,5)
# plot_roc_curve was removed in scikit-learn 1.2; use RocCurveDisplay instead
metrics.RocCurveDisplay.from_estimator(lmod, X_test, y_test);

In [None]:
df_test = pd.DataFrame(dataset_german_test.features, columns=dataset_german_test.feature_names)

[fig, ax] = plt.subplots(1,2, figsize=(15, 5));
plot_confusion_matrix(lmod, X_test[df_test['age']==0], y_test[df_test['age']==0],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[0]);
ax[0].set_title('Age < 25')

plot_confusion_matrix(lmod, X_test[df_test['age']==1], y_test[df_test['age']==1],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[1]);
ax[1].set_title('Age >= 25');

<a class="anchor" id="algorithms"></a>
## 5. AI fairness algorithms

<img src="https://github.com/MargrietGroenendijk/gitbooks2/blob/master/files/pipeline.png?raw=true" width="1000" align="left">

<img src="https://github.com/MargrietGroenendijk/gitbooks2/blob/master/files/algorithms.png?raw=true"  width="1000" align="left">

<a class="anchor" id="preproc"></a>
### 5.1 Pre-processing algorithms

### Remove bias by reweighing data

**Reweighing** is a preprocessing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification.

<div class="alert alert-info" style="font-size:100%">
<b>Read the <a href="https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.preprocessing.Reweighing.html">aif360 documentation</a> for a full overview</b><br>
</div>
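
To see where the weights come from, here is a small sketch of the reweighing idea on made-up data. It assumes the common formulation in which each (group, label) combination gets the weight "expected share if group and label were independent" divided by "observed share"; the arrays and the helper `weighted_rate` are hypothetical and this is not the aif360 code itself.

```python
import numpy as np

# hypothetical toy data: label 1 = favorable, protected 1 = privileged group
labels    = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0])
protected = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

weights = np.empty(labels.size)
for g in (0, 1):
    for l in (0, 1):
        cell = (protected == g) & (labels == l)
        # share expected under independence of group and label
        expected = (protected == g).mean() * (labels == l).mean()
        # share actually observed in the data
        observed = cell.mean()
        weights[cell] = expected / observed

def weighted_rate(group):
    """Weighted rate of favorable labels within one group."""
    mask = protected == group
    return np.average(labels[mask], weights=weights[mask])

print(weights)
print(weighted_rate(1), weighted_rate(0))  # both 0.5 after reweighing
```

Rare combinations (here: unprivileged with a favorable label) get a weight above 1, common ones a weight below 1, so that the weighted rate of favorable labels is the same in both groups.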

In [None]:
RW = Reweighing(unprivileged_groups=unprivileged_groups,
               privileged_groups=privileged_groups)

# compute the weights for reweighing the dataset
RW.fit(dataset_german_train)

# transform the dataset to a new dataset based on the estimated transformation
dataset_rw_train = RW.transform(dataset_german_train)
dataset_rw_test = RW.transform(dataset_german_test)

In [None]:
display(Markdown("#### Original training dataset"))
print("mean_difference = %f" % metric_german_train.mean_difference())
print("disparate_impact = %f" % metric_german_train.disparate_impact())
print("consistency = %f" % metric_german_train.consistency())
print("base_rate = %f" % metric_german_train.base_rate())
print("num_negatives = %f" % metric_german_train.num_negatives())
print("num_positives = %f" % metric_german_train.num_positives())
print("smoothed_empirical_differential_fairness = %f" % metric_german_train.smoothed_empirical_differential_fairness())

metric_rw_train = BinaryLabelDatasetMetric(dataset_rw_train, 
                                         unprivileged_groups=unprivileged_groups,
                                         privileged_groups=privileged_groups)

display(Markdown("#### Reweighted training dataset"))
print("mean_difference = %f" % metric_rw_train.mean_difference())
print("disparate_impact = %f" % metric_rw_train.disparate_impact())
print("consistency = %f" % metric_rw_train.consistency())
print("base_rate = %f" % metric_rw_train.base_rate())
print("num_negatives = %f" % metric_rw_train.num_negatives())
print("num_positives = %f" % metric_rw_train.num_positives())
print("smoothed_empirical_differential_fairness = %f" % metric_rw_train.smoothed_empirical_differential_fairness())

#### Train on reweighted data

In [None]:
# scale data
scale_rw = StandardScaler().fit(dataset_rw_train.features)

X_train_rw = scale_rw.transform(dataset_rw_train.features)
y_train_rw = dataset_rw_train.labels.ravel()
w_train_rw = dataset_rw_train.instance_weights.ravel()

X_test_rw = scale_rw.transform(dataset_rw_test.features)
y_test_rw = dataset_rw_test.labels.ravel()
w_test_rw = dataset_rw_test.instance_weights.ravel()

In [None]:
dataset_rw_train.instance_weights

In [None]:
# create a new instance of the model
lmod_rw = LogisticRegression()

# train the model
lmod_rw.fit(X_train_rw, y_train_rw, 
         sample_weight=dataset_rw_train.instance_weights)

# calculate predicted labels
y_train_pred_rw = lmod_rw.predict(X_train_rw)

# assign positive class index
pos_ind_rw = np.where(lmod_rw.classes_ == dataset_rw_train.favorable_label)[0][0]

# add predicted labels to predictions dataset
dataset_rw_train_pred = dataset_rw_train.copy()
dataset_rw_train_pred.labels = y_train_pred_rw.reshape(-1, 1)

In [None]:
# model accuracy
print(score)
score_rw = lmod_rw.score(X_test_rw, y_test_rw)
print(score_rw)

In [None]:
plt.rcParams["figure.figsize"] = (5,5)
metrics.RocCurveDisplay.from_estimator(lmod_rw, X_test_rw, y_test_rw);
plt.title('Reweighted model');

In [None]:
plt.rcParams["figure.figsize"] = (5,5)
metrics.RocCurveDisplay.from_estimator(lmod, X_test, y_test);
plt.title('Original model');


In [None]:
# confusion matrix
cm_rw = metrics.confusion_matrix(y_test_rw, lmod_rw.predict(X_test_rw))

plt.figure(figsize=(5,5))
sns.heatmap(cm_rw, annot=True, fmt=".3f", linewidths=.5, square = True, cmap = 'Blues_r');
plt.ylabel('Actual label');
plt.xlabel('Predicted label');
plt.title('Reweighted model');

In [None]:
[fig, ax] = plt.subplots(2,2, figsize=(15, 12));
plot_confusion_matrix(lmod, X_test[df_test['age']==0], y_test[df_test['age']==0],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[0,0]);
ax[0,0].set_title('Age < 25')

plot_confusion_matrix(lmod, X_test[df_test['age']==1], y_test[df_test['age']==1],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[0,1]);
ax[0,1].set_title('Age >= 25');

plot_confusion_matrix(lmod_rw, X_test_rw[df_test['age']==0], y_test_rw[df_test['age']==0],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[1,0]);
ax[1,0].set_title('Age < 25 (Reweighted)')

plot_confusion_matrix(lmod_rw, X_test_rw[df_test['age']==1], y_test_rw[df_test['age']==1],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[1,1]);
ax[1,1].set_title('Age >= 25 (Reweighted)');

In [None]:
[fig, ax] = plt.subplots(2,2, figsize=(15, 12));
plot_confusion_matrix(lmod, X_test[df_test['age']==0], y_test[df_test['age']==0],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[0,0],normalize='true');
ax[0,0].set_title('Age < 25')

plot_confusion_matrix(lmod, X_test[df_test['age']==1], y_test[df_test['age']==1],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[0,1],normalize='true');
ax[0,1].set_title('Age >= 25');

plot_confusion_matrix(lmod_rw, X_test_rw[df_test['age']==0], y_test_rw[df_test['age']==0],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[1,0],normalize='true');
ax[1,0].set_title('Age < 25 (Reweighted)')

plot_confusion_matrix(lmod_rw, X_test_rw[df_test['age']==1], y_test_rw[df_test['age']==1],
                      cmap=plt.cm.Blues, 
                      display_labels=['good credit','bad credit'],
                      ax=ax[1,1],normalize='true');
ax[1,1].set_title('Age >= 25 (Reweighted)');

### Remove bias with the optimized data pre-processing algorithm 


The debiasing function used is implemented in the [OptimPreproc](https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.preprocessing.OptimPreproc.html?highlight=get%20distortion) class. It modifies training data features & labels.

* Define parameters for optimized pre-processing specific to the dataset.
* Divide the dataset into training, validation, and testing partitions.
* Learn the optimized pre-processing transformation from the training data.
* Train classifier on original training data.
* Estimate the optimal classification threshold, that maximizes balanced accuracy without fairness constraints (from the original validation set).
* Determine the prediction scores for original testing data. Using the estimated optimal classification threshold, compute accuracy and fairness metrics.
* Transform the testing set using the learned probabilistic transformation.
* Determine the prediction scores for transformed testing data. Using the estimated optimal classification threshold, compute accuracy and fairness metrics.
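
The step "estimate the optimal classification threshold that maximizes balanced accuracy" can be sketched as a simple sweep over candidate thresholds. The validation labels and scores below are made up, and `balanced_accuracy` is a helper written for this illustration.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of the true positive rate and the true negative rate."""
    tpr = np.mean(y_pred[y_true == 1] == 1)
    tnr = np.mean(y_pred[y_true == 0] == 0)
    return 0.5 * (tpr + tnr)

# hypothetical validation labels and scores (probability of the favorable class)
y_val = np.array([0, 0, 0, 0, 1, 1, 0, 1, 1, 1])
scores_val = np.array([0.05, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.70, 0.80, 0.95])

# try a grid of thresholds and keep the one with the best balanced accuracy
thresholds = np.linspace(0.01, 0.99, 99)
bal_acc = np.array([balanced_accuracy(y_val, (scores_val > t).astype(int))
                    for t in thresholds])
best_threshold = thresholds[np.argmax(bal_acc)]

print(f"best threshold = {best_threshold:.2f}, balanced accuracy = {bal_acc.max():.2f}")
```

Because the threshold is chosen on a validation set, the test set stays untouched for the final accuracy and fairness metrics.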

See the example notebook here: https://github.com/Trusted-AI/AIF360/blob/master/examples/demo_optim_data_preproc.ipynb

#### Train with and transform the original training data

This algorithm does not yet use the privileged and unprivileged groups that are specified during initialization. Instead, it automatically attempts to reduce statistical parity difference between all possible combinations of groups in the dataset.

<div class="alert alert-warning">
 <b>This can take a very long time to run, so you might want to skip it.</b> <br/> 
</div>

In [None]:
optim_options = {
            "distortion_fun": get_distortion_german,
            "epsilon": 0.1,
            "clist": [0.99, 1.99, 2.99],
            "dlist": [.1, 0.05, 0]
        } 

In [None]:
#OP = OptimPreproc(OptTools, optim_options,
#                  unprivileged_groups = unprivileged_groups,
#                  privileged_groups = privileged_groups)

#OP = OP.fit(dataset_german_train)

# Transform training data and align features
#dataset_op_train = OP.transform(dataset_german_train, transform_Y=True)
#dataset_op_train = dataset_german_train.align_datasets(dataset_op_train)

#### Metric with the transformed training data

In [None]:
#metric_op_train = BinaryLabelDatasetMetric(dataset_op_train, 
#                                         unprivileged_groups=unprivileged_groups,
#                                         privileged_groups=privileged_groups)

#display(Markdown("#### Optimized training dataset"))
#print("mean_difference = %f" % metric_op_train.mean_difference())
#print("disparate_impact = %f" % metric_op_train.disparate_impact())
#print("consistency = %f" % metric_op_train.consistency())
#print("base_rate = %f" % metric_op_train.base_rate())
#print("num_negatives = %f" % metric_op_train.num_negatives())
#print("num_positives = %f" % metric_op_train.num_positives())
#print("smoothed_empirical_differential_fairness = %f" % metric_op_train.smoothed_empirical_differential_fairness())

#### Load, clean up original test data and compute metric

In [None]:
#dataset_orig_test = dataset_op_train.align_datasets(dataset_german_test)
#display(Markdown("#### Testing Dataset shape"))
#print(dataset_orig_test.features.shape)

#metric_orig_test = BinaryLabelDatasetMetric(dataset_orig_test, 
#                                         unprivileged_groups=unprivileged_groups,
#                                         privileged_groups=privileged_groups)
#display(Markdown("#### Original test dataset"))
#print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_test.mean_difference())


<a class="anchor" id="inproc"></a>
### 5.2 In-processing algorithms

* [Adversarial Debiasing](https://github.com/Trusted-AI/AIF360/blob/master/examples/demo_adversarial_debiasing.ipynb) - Uses adversarial techniques to maximize accuracy & reduce evidence of protected attributes in predictions
* Prejudice Remover - Adds a discrimination-aware regularization term to the learning objective
![image-2.png](attachment:image-2.png)

<a class="anchor" id="postproc"></a>
### 5.3 Post-processing algorithms

[Reject option classification (ROC)](https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.postprocessing.RejectOptionClassification.html?highlight=reject) is a post-processing technique that works within a confidence band around the decision boundary, where uncertainty is highest. Inside that band it gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups.

* The debiasing function used is implemented in the RejectOptionClassification class.
* Divide the dataset into training, validation, and testing partitions.
* Train classifier on original training data.
* Estimate the optimal classification threshold that maximizes balanced accuracy without fairness constraints.
* Estimate the optimal classification threshold and the critical region boundary (ROC margin) using a validation set for the desired constraint on fairness. The best parameters are those that maximize balanced accuracy while satisfying the fairness constraints.
* The constraints can be used on the following fairness measures:
    * Statistical parity difference on the predictions of the classifier
    * Average odds difference for the classifier
    * Equal opportunity difference for the classifier
* Determine the prediction scores for testing data. Using the estimated optimal classification threshold, compute accuracy and fairness metrics.
* Using the determined optimal classification threshold and the ROC margin, adjust the predictions. Report accuracy and fairness metrics on the new predictions.
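
The adjustment in the last step can be sketched with plain NumPy. This is a simplified illustration of the relabelling rule under stated assumptions, not the `RejectOptionClassification` source: the function name `reject_option_labels`, the boolean `privileged` mask, and the toy scores are hypothetical.

```python
import numpy as np

def reject_option_labels(scores, privileged, class_thresh, roc_margin,
                         favorable=1, unfavorable=0):
    """Relabel predictions that fall inside the critical region."""
    scores = np.asarray(scores, dtype=float)
    privileged = np.asarray(privileged, dtype=bool)

    # Ordinary thresholding applies outside the critical region
    labels = np.where(scores > class_thresh, favorable, unfavorable)

    # Critical region: scores within roc_margin of the threshold
    critical = (scores > class_thresh - roc_margin) & (scores <= class_thresh + roc_margin)

    # Inside the region, the group decides the label instead of the score
    labels[critical & ~privileged] = favorable
    labels[critical & privileged] = unfavorable
    return labels

# Hypothetical scores for 3 privileged and 3 unprivileged individuals
scores = np.array([0.90, 0.55, 0.45, 0.55, 0.45, 0.10])
privileged = np.array([True, True, True, False, False, False])

adjusted = reject_option_labels(scores, privileged, class_thresh=0.5, roc_margin=0.1)
print(adjusted)
```

Only the four scores near the threshold (0.45 and 0.55) are affected; the confident predictions at 0.90 and 0.10 keep the label given by the threshold.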

In [None]:
# Sweep candidate thresholds and record the balanced accuracy for each one
num_thresh = 100
ba_arr = np.zeros(num_thresh)
class_thresh_arr = np.linspace(0.01, 0.99, num_thresh)
for idx, class_thresh in enumerate(class_thresh_arr):

    # Relabel the predictions in place: scores above the threshold are favorable
    fav_inds = dataset_german_test_pred.scores > class_thresh
    dataset_german_test_pred.labels[fav_inds] = dataset_german_test_pred.favorable_label
    dataset_german_test_pred.labels[~fav_inds] = dataset_german_test_pred.unfavorable_label
    
    classified_metric_german_test = ClassificationMetric(dataset_german_test,
                                             dataset_german_test_pred, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
    
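    # Balanced accuracy is the mean of the true positive and true negative rates
    ba_arr[idx] = 0.5 * (classified_metric_german_test.true_positive_rate()
                         + classified_metric_german_test.true_negative_rate())

# First threshold that reaches the highest balanced accuracy
best_ind = np.argmax(ba_arr)
best_class_thresh = class_thresh_arr[best_ind]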

print("Best balanced accuracy (no fairness constraints) = %.4f" % np.max(ba_arr))
print("Optimal classification threshold (no fairness constraints) = %.4f" % best_class_thresh)

In [None]:
metric_name = "Statistical parity difference"

# Upper and lower bound on the fairness metric used
metric_ub = 0.05
metric_lb = -0.05

ROC = RejectOptionClassification(unprivileged_groups=unprivileged_groups,
                                 privileged_groups=privileged_groups,
                                 low_class_thresh=0.01, high_class_thresh=0.99,
                                 num_class_thresh=100, num_ROC_margin=50,
                                 metric_name=metric_name,
                                 metric_ub=metric_ub, metric_lb=metric_lb)
# Note: the search runs on the test split here; the steps above call for a separate validation split
ROC = ROC.fit(dataset_german_test, dataset_german_test_pred)

In [None]:
print("Optimal classification threshold (with fairness constraints) = %.4f" % ROC.classification_threshold)
print("Optimal ROC margin = %.4f" % ROC.ROC_margin)
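
To make the fairness constraint more concrete, the three metrics that can be bounded are differences in prediction rates between the unprivileged and privileged groups. The following is a minimal NumPy sketch, not the `ClassificationMetric` implementation; the helper names `rates` and `fairness_gaps` and the toy arrays are hypothetical, and labels are assumed to be binary with 1 as the favorable outcome.

```python
import numpy as np

def rates(y_true, y_pred):
    """Return (selection rate, true positive rate, false positive rate)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    return y_pred.mean(), y_pred[y_true].mean(), y_pred[~y_true].mean()

def fairness_gaps(y_true, y_pred, privileged):
    """Unprivileged minus privileged differences for the three metrics."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    privileged = np.asarray(privileged, dtype=bool)
    sel_u, tpr_u, fpr_u = rates(y_true[~privileged], y_pred[~privileged])
    sel_p, tpr_p, fpr_p = rates(y_true[privileged], y_pred[privileged])
    return {
        "Statistical parity difference": sel_u - sel_p,
        "Equal opportunity difference": tpr_u - tpr_p,
        "Average odds difference": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }

# Hypothetical toy data: first 4 individuals privileged, last 4 unprivileged
privileged = np.array([True, True, True, True, False, False, False, False])
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 1, 0])

gaps = fairness_gaps(y_true, y_pred, privileged)
lower, upper = -0.05, 0.05  # same bounds as in the cell above
for name, value in gaps.items():
    print("%s = %.4f (within bounds: %s)" % (name, value, lower <= value <= upper))
```

A value of 0 means both groups are treated alike on that metric. The reject option search looks for a threshold and margin that bring the chosen metric inside the bounds.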

* [Odds Equalizing](https://github.com/Trusted-AI/AIF360/blob/master/examples/demo_calibrated_eqodds_postprocessing.ipynb) - Modifies the predicted label using an optimization scheme to make predictions fairer
![image.png](attachment:image.png)


Another full example: [tutorial_medical_expenditure.ipynb](https://nbviewer.jupyter.org/github/IBM/AIF360/blob/master/examples/tutorial_medical_expenditure.ipynb)

Copyright © 2020 IBM. This notebook and its source code are released under the terms of the MIT License.