# **Trustworthy AI - Fairness**
## Case study - Predictive risk assessment tool

Use this notebook to 1) **assess** the fairness of your ML predictions and 2) **address** possible biases using mitigation strategies.

**About the use case**

Loans are an integral part of banking operations. However, not all loans are repaid on time, so a bank needs to monitor and understand loan applications closely in order to decide which loans to approve and which to reject.

This notebook assesses the fairness of machine learning models that use the German credit data set (https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)). The data set contains details of 1000 loan applicants with 20 attributes, plus a classification of each applicant as a "good" or a "bad" credit risk (the target).


**Summary**

This notebook demonstrates how the AIF360 toolkit by IBM can be used to detect and mitigate bias throughout different stages of machine learning. We assess the dataset and model for bias with respect to the age of loan applicants, and cover various metrics to assess fairness and algorithms to mitigate bias.

_**With fairness in ML, please keep in mind that fairness is a multifaceted, context-dependent social construct that defies simple definition. We can try to approach machine learning fairness by using metrics to assess fairness and algorithms to mitigate bias. The metrics and algorithms we cover here do not cover the full scope of fairness in all situations.**_

_**If you are considering applying fairness metrics and algorithms, keep in mind that the toolkit should only be used in a very limited setting. For more guidance on when and whether to use fairness metrics and algorithms, please refer to http://aif360.mybluemix.net/resources#guidance**_




### Notebook overview

1. [Import statements](#Import_statements)

  1a. [Data set and model scope](#Dataset_and_model)
  
  
2. [Import data and specify protected attribute](#Import_data_and_specify_protected_attribute)


# Fairness in ML

Spoiler alert: defining fairness is quite complicated, as fairness is a multifaceted, context-dependent social construct that defies simple definition.

We can, however, try to approach machine learning fairness by using metrics to assess fairness and algorithms to mitigate bias. 

Which metric is suitable depends on the use case. Several considerations to keep in mind:
- Do you need a metric for individual or group fairness or both?
- For group fairness, do you want to focus on the data or the model?
- For group fairness, do you follow the "we're all equal" (WAE) or the "what you see is what you get" (WYSIWYG) worldview? 

For mitigation, you should keep in mind the following guidelines and considerations:
- At which part of the machine learning pipeline do you want to intervene?
- If you can modify training data, you can use pre-processing
- If you can change the learning algorithm, you can use in-processing
- If your model is a black box and you can't modify the training data or the learning algorithm, you can use post-processing
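
As a rough sketch, the guidelines above can be written as a small lookup. The helper `suggest_stage` and its arguments are hypothetical and exist only for illustration; the class names listed per stage are examples from the `aif360.algorithms` subpackages, not an exhaustive or recommended list.

```python
# Example AIF360 algorithms per pipeline stage (illustrative, not exhaustive).
STAGE_EXAMPLES = {
    "pre-processing": ["Reweighing", "DisparateImpactRemover"],
    "in-processing": ["PrejudiceRemover", "AdversarialDebiasing"],
    "post-processing": ["EqOddsPostprocessing", "RejectOptionClassification"],
}


def suggest_stage(can_modify_training_data, can_change_learning_algorithm):
    """Hypothetical helper: return the earliest stage the guidelines allow."""
    if can_modify_training_data:
        return "pre-processing"
    if can_change_learning_algorithm:
        return "in-processing"
    # Black-box model: only the predictions can be adjusted.
    return "post-processing"


stage = suggest_stage(can_modify_training_data=False,
                      can_change_learning_algorithm=False)
print(stage, STAGE_EXAMPLES[stage])
```

If more than one stage is available, the helper simply returns the earliest one; in practice the choice also depends on the fairness metric you care about.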

If you are interested in learning more about the different fairness concepts and the guidelines for choosing specific metrics and algorithms, please refer to http://aif360.mybluemix.net/resources#guidance

As you can see in the image below, fairness can be applied at various stages of the machine learning cycle: to the data, to the model, or post-hoc if you don't have access to the data or the model. At each stage there exist different metrics and mitigation strategies to choose from. 

![AI Fairness - Source: IBM Research AIF360](img/IBM-Research-AI-Fairness-360.png "AIF360")

##### Focus of this tutorial 

For this tutorial, we will focus on:
- the BinaryLabelDatasetMetric, which is a class for computing several metrics based on a single binary label dataset
- the Reweighing preprocessing algorithm, which weights the examples in each combination of group (in our case age >= 25 or age < 25) and label (favourable or unfavourable) differently to ensure fairness before classification

For this tutorial we will be using IBM Research's AIF360 package (https://github.com/Trusted-AI/AIF360); see also http://aif360.mybluemix.net/.

Other libraries to assess fairness include VerifyML and Fairlearn.

# Import statements

In [2]:
import os
import pickle

import numpy as np
np.random.seed(0)

# Note: if you get a warning about a Tensorflow dependency when running this cell, you can ignore it as we won't need it for this tutorial
# if you re-run the cell it should disappear
from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Load dataset and specify protected attribute

For this training, the German credit dataset (https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)) from aif360.datasets has already been loaded and prepared for you in *data/processed*, with the following preprocessing steps already taken:
- The protected attribute has been set to **age**
- **Age >= 25** is considered privileged
- The other protected attributes, personal status and sex, have been dropped
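
As a minimal sketch of the first two steps, assuming the raw age is available in years, the privileged/unprivileged encoding amounts to a threshold at 25. The `raw_age` values below are made up for illustration and are not taken from the data set.

```python
import numpy as np

# Hypothetical raw ages in years (not from the German credit data).
raw_age = np.array([19, 24, 25, 31, 67])

# 1 = privileged (age >= 25), 0 = unprivileged (age < 25).
age_binary = (raw_age >= 25).astype(float)
print(age_binary)
```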

We can now simply load the preprocessed data pickle file:

In [3]:
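# Build the path to the preprocessed pickle and close the file after loading
data_path = os.path.join(os.getcwd(), os.pardir, "data", "processed", "german_data_aif_360.pkl")
with open(data_path, "rb") as f:
    dataset_orig = pickle.load(f)
dataset_orig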

               instance weights features                \
                                                         
                                   month credit_amount   
instance names                                           
0                           1.0      6.0        1169.0   
1                           1.0     48.0        5951.0   
2                           1.0     12.0        2096.0   
3                           1.0     42.0        7882.0   
4                           1.0     24.0        4870.0   
...                         ...      ...           ...   
995                         1.0     12.0        1736.0   
996                         1.0     30.0        3857.0   
997                         1.0     12.0         804.0   
998                         1.0     45.0        1845.0   
999                         1.0     45.0        4576.0   


We can access the data as a dataframe as well:

In [4]:
df = dataset_orig.convert_to_dataframe()[0]
df.head()

| | month | credit_amount | investment_as_income_percentage | residence_since | age | number_of_credits | people_liable_for | status=A11 | status=A12 | status=A13 | ... | housing=A153 | skill_level=A171 | skill_level=A172 | skill_level=A173 | skill_level=A174 | telephone=A191 | telephone=A192 | foreign_worker=A201 | foreign_worker=A202 | credit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6.0 | 1169.0 | 4.0 | 4.0 | 1.0 | 2.0 | 1.0 | 1.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 |
| 1 | 48.0 | 5951.0 | 2.0 | 2.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 2.0 |
| 2 | 12.0 | 2096.0 | 2.0 | 3.0 | 1.0 | 1.0 | 2.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 |
| 3 | 42.0 | 7882.0 | 2.0 | 4.0 | 1.0 | 1.0 | 2.0 | 1.0 | 0.0 | 0.0 | ... | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 |
| 4 | 24.0 | 4870.0 | 3.0 | 4.0 | 1.0 | 2.0 | 2.0 | 1.0 | 0.0 | 0.0 | ... | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 2.0 |


For this tutorial, the original target labels of the dataset have been kept: 1 = Good, 2 = Bad.

We then split the data into a train, validation and test set. The cut points 0.7 and 0.9 give 70%, 20% and 10% of the rows respectively.

In this tutorial we will only cover the train dataset, but during the development of a machine learning model we would also use the validation and test set. 

We set two variables, one for the privileged group (age = 1, i.e. age>=25) and one for the unprivileged group (age = 0, i.e. age<25). These are the key inputs for detecting and mitigating bias.

In [5]:
dataset_orig_train, dataset_orig_val, dataset_orig_test = dataset_orig.split([0.7, 0.9], shuffle=True)

privileged_groups = [{'age': 1}]
unprivileged_groups = [{'age': 0}]

In [6]:
print(dataset_orig_train.convert_to_dataframe()[0].shape)
print(dataset_orig_val.convert_to_dataframe()[0].shape)
print(dataset_orig_test.convert_to_dataframe()[0].shape)

(700, 58)
(200, 58)
(100, 58)
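
The sizes above follow from the two cut points: `[0.7, 0.9]` marks the positions at which the shuffled rows are cut. A minimal NumPy sketch of that idea, working on row indices only (an illustration, not AIF360's actual implementation):

```python
import numpy as np

n_rows = 1000
fractions = [0.7, 0.9]  # cumulative cut points, as passed to split() above

# Turn the fractions into row positions: [700, 900].
cut_points = [round(f * n_rows) for f in fractions]

# Shuffle the row indices, then cut them at those positions.
rng = np.random.default_rng(0)
shuffled = rng.permutation(n_rows)
train_idx, val_idx, test_idx = np.split(shuffled, cut_points)

print(len(train_idx), len(val_idx), len(test_idx))
```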


# Compute fairness metric on original dataset

The BinaryLabelDatasetMetric is a class for computing several metrics based on a single binary label dataset.

We focus on the following metrics from BinaryLabelDatasetMetric:
- **Difference in mean outcomes**, an alias for **statistical parity difference**: the rate of favourable outcomes (here $Y=1$) received by the unprivileged group minus the rate received by the privileged group:

$Pr(Y=1 | D=\text{unprivileged})-Pr(Y=1 | D=\text{privileged})$

- **Disparate impact**, the ratio of the rate of favourable outcomes (here $Y=1$) received by the unprivileged group to the rate received by the privileged group:

$\frac{Pr(Y=1 | D=\text{unprivileged})}{Pr(Y=1 | D=\text{privileged})}$

In our case, this means we compare the distribution of our target variable, credit risk, between entries in the privileged group (age>=25) and entries in the unprivileged group (age<25).

In [14]:
protected_attribute = "age"
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
print("Original training dataset: \n")
# values between [-1;1], closer to 0 is more 'fair'
print("Difference in mean outcomes (statistical parity difference) between unprivileged and privileged groups for protected attribute %s = %f \n" % (protected_attribute, metric_orig_train.mean_difference()))
# values in [0, inf), closer to 1 is more 'fair'
print("Disparate impact between unprivileged and privileged groups for protected attribute %s = %f" %(protected_attribute, metric_orig_train.disparate_impact()))

Original training dataset: 

Difference in mean outcomes (statistical parity difference) between unprivileged and privileged groups for protected attribute age = -0.169905 

Disparate impact between unprivileged and privileged groups for protected attribute age = 0.766430
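
Both metrics follow directly from their definitions. The sketch below recomputes them with plain NumPy on a small made-up sample (the toy arrays are not taken from the German credit data, and the helper names are hypothetical). The optional `weights` argument plays the role of the instance weights column shown in the dataset output earlier.

```python
import numpy as np


def favourable_rate(labels, weights, favourable=1.0):
    """Weighted share of examples that carry the favourable label."""
    return weights[labels == favourable].sum() / weights.sum()


def group_metrics(labels, protected, weights=None, favourable=1.0):
    """Return (statistical parity difference, disparate impact)."""
    labels = np.asarray(labels, dtype=float)
    protected = np.asarray(protected)
    if weights is None:
        weights = np.ones_like(labels)
    weights = np.asarray(weights, dtype=float)
    unpriv = protected == 0
    priv = protected == 1
    rate_unpriv = favourable_rate(labels[unpriv], weights[unpriv], favourable)
    rate_priv = favourable_rate(labels[priv], weights[priv], favourable)
    return rate_unpriv - rate_priv, rate_unpriv / rate_priv


# Toy data (made up): label 1 = good, 2 = bad; age 1 = privileged, 0 = unprivileged.
toy_labels = np.array([1, 1, 1, 2, 1, 2, 2, 1])
toy_age = np.array([1, 1, 1, 1, 0, 0, 0, 0])

spd, di = group_metrics(toy_labels, toy_age)
print(f"statistical parity difference = {spd:.4f}")  # 0.50 - 0.75 = -0.2500
print(f"disparate impact = {di:.4f}")                # 0.50 / 0.75 = 0.6667
```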


# Mitigating bias by transforming the original dataset

We now apply the reweighing algorithm to the dataset, which is a pre-processing fairness algorithm applied to the data.

Reweighing assigns a weight to every training example so that, in the weighted data, the outcome is statistically independent of the protected attribute. It does not change any feature or label.

Reweighing aims to correct for historical bias with respect to the protected attribute. Each combination of group and label gets one weight: combinations that are rarer than they would be if outcome and protected attribute were independent (here, for example, unprivileged applicants with a favourable outcome) are weighted more heavily, and over-represented combinations are weighted less.

In our case, this means that Reweighing will transform the dataset to ensure that the statistical parity difference on the weighted training data equals 0 and the disparate impact equals 1.


In [15]:
RW = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)

dataset_transf_train = RW.fit_transform(dataset_orig_train)

The computed weights are as follows: 

In [16]:
# weight_privileged_favourable, weight_privileged_unfavourable, weight_unprivileged_favourable, weight_unprivileged_unfavourable
(RW.w_p_fav, RW.w_p_unfav, RW.w_up_fav, RW.w_up_unfav)

(0.9622950819672131, 1.100625, 1.2555555555555555, 0.678)
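
To make the origin of such weights concrete, here is a minimal sketch of the standard reweighing formula: for each (group, label) combination, the weight is the count expected if group and label were independent, divided by the observed count. The toy arrays are made up, the function name is hypothetical, and the sketch assumes every combination occurs at least once.

```python
import numpy as np


def reweighing_weights(labels, protected):
    """Weight per (group, label) cell: expected count under independence / observed count."""
    labels = np.asarray(labels)
    protected = np.asarray(protected)
    n = len(labels)
    weights = {}
    for group in np.unique(protected):
        for label in np.unique(labels):
            in_group = protected == group
            has_label = labels == label
            observed = np.sum(in_group & has_label)  # assumed > 0
            expected = in_group.sum() * has_label.sum() / n
            weights[(int(group), int(label))] = float(expected / observed)
    return weights


# Toy data (made up): label 1 = good, 2 = bad; age 1 = privileged, 0 = unprivileged.
toy_labels = np.array([1, 1, 1, 2, 1, 2, 2, 1])
toy_age = np.array([1, 1, 1, 1, 0, 0, 0, 0])

w = reweighing_weights(toy_labels, toy_age)
print(w)

# Attach the cell weight to each example and compare weighted favourable rates.
inst_w = np.array([w[(int(g), int(y))] for g, y in zip(toy_age, toy_labels)])
rates = {}
for group in (1, 0):
    mask = toy_age == group
    rates[group] = inst_w[mask & (toy_labels == 1)].sum() / inst_w[mask].sum()
print(rates)
```

On this toy sample both groups end up with the same weighted favourable rate, which is what drives the statistical parity difference to 0 and the disparate impact to 1.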

# Compute fairness metric on transformed dataset

If we now recalculate the fairness metrics on the transformed dataset, we can see that the instance weights have been set such that both metrics reach their ideal values: a statistical parity difference of 0 and a disparate impact of 1.

In [19]:
protected_attribute = "age"

metric_transf_train = BinaryLabelDatasetMetric(dataset_transf_train, 
                                               unprivileged_groups=unprivileged_groups,
                                               privileged_groups=privileged_groups)
print("Transformed training dataset: \n")
# values between [-1;1], closer to 0 is more 'fair'
print("Difference in mean outcomes between unprivileged and privileged groups for protected attribute %s = %f \n" % (protected_attribute, metric_transf_train.mean_difference())) 
# values in [0, inf), closer to 1 is more 'fair'
print("Disparate impact between unprivileged and privileged groups for protected attribute %s = %f" %(protected_attribute, metric_transf_train.disparate_impact()))

Transformed training dataset: 

Difference in mean outcomes between unprivileged and privileged groups for protected attribute age = 0.000000 

Disparate impact between unprivileged and privileged groups for protected attribute age = 1.000000


###### For questions about this notebook please reach out to ellen.hoeven@ibm.com