# Homework 2: Discover, Measure, and Mitigate Bias in Bank Marketing

## Background

In this homework, we use data from a bank's marketing campaign. It consists of several individual-level variables such as age, gender, credit default, and job, which can serve as input variables in the prediction model. The outcome variable that the bank is interested in is whether a person subscribed to the term deposit. Hence, the outcome variable is categorical: subscribed or did not subscribe. The objective of training a model is to predict whether someone would subscribe to the term deposit offered by the bank. Because the cost and time needed to contact all possible leads are enormous, financial institutions like to identify the most promising leads, that is, the profiles of people who are most likely to subscribe to a term deposit. Once identified, these leads are contacted through direct marketing channels (e.g., phone calls) and provided with all the details about the term deposit.

But the bank also wants to make sure that the prediction model is not biased against any group. They are cognizant that a prediction model built on prior data has the potential to display bias against some groups, which would preclude members of those groups from appearing in the list of promising leads. Considering that term deposits can help secure financial stability in the long term, a biased prediction model can adversely affect some groups. For the purpose of this project, we will consider marital status (married, not married) as the protected variable of interest. We will refer to married people as the privileged group and examine whether there are differences between the privileged group and the unprivileged group.

| Protected Variable | Privileged Group | Unprivileged Group |
| ------------------ | ---------------- | ------------------ |
| Marital status     | Married          | Unmarried          |

## Data Description
The dataset consists of $5000$ rows and $12$ columns. Run the code below to show a subset of the data.

In [None]:
import pandas as pd
bank_data = pd.read_csv('bank.csv', delimiter=';')
bank_data.head(n=100)

The table is referred to as the Original data because it is the data before any analysis has been performed on it. The outcome variable $subscribed$ denotes whether the client has subscribed to a term deposit. For ease of explanation, we will refer to the two classes of the outcome variable as yes and no, indicating whether a person subscribed (yes) or did not subscribe (no). The columns are:

* $age$: How old this client is. 
* $job$: Type of job. 
* $marital$: Marital status.
* $education$: Highest education.
* $default$: Has credit in default.
* $housing$: Has housing loan?
* $loan$: Has personal loan? 
* $contact$: Contact communication type.
* $month$: Last contact month of year.
* $day\_of\_week$: Last contact day of the week.
* $duration$: Last contact duration, in seconds.
* $subscribed$: Has the client subscribed to a term deposit?

## Steps to Discover, Measure, and Mitigate Bias

![image](../Images/MLWorkflow.png)

* Specify protected variable, privileged group, and unprivileged group
* Split the data into training and test data.
* Check fairness metrics of training data.
* Build a model without mitigation methods. (Baseline)
    * Train a Logistic Regression model using the training data.
    * Make predictions on the test data using the trained model.
    * Check fairness metrics and accuracy of the prediction.
* Apply different mitigation methods to get debiased predictions.
    * **Pre-processing** (2 methods: Reweighing, Learning Fair Representations)
    * **In-processing** (2 methods: Prejudice Remover, Adversarial Debiasing)
    * **Post-processing** (1 method: Reject Option Classification)
* Compare the debiased prediction with the baseline prediction w.r.t. accuracy and fairness metrics.
* Flexibly combine different techniques to generate debiased predictions.


### Import libraries

In [12]:
from aif360.datasets import BankDataset
from aif360.metrics import BinaryLabelDatasetMetric
from sklearn.linear_model import LogisticRegression
from aif360.metrics import ClassificationMetric
from aif360.algorithms.preprocessing import Reweighing
from aif360.algorithms.preprocessing import LFR
from aif360.algorithms.inprocessing import PrejudiceRemover
from aif360.algorithms.postprocessing import RejectOptionClassification

### Load the bank data and specify the protected variable, privileged group, and unprivileged group

In [2]:
# human-readable names for the encoded protected attribute values
protected_attribute_maps = [{1.0: 'married', 0.0: 'unmarried'}]
dataset_orig = BankDataset(
    # marital status is the protected variable; 'married' is the privileged class
    protected_attribute_names=['marital'],
    privileged_classes=[['married']],
    # drop the columns that are not used in this homework
    features_to_drop=['campaign', 'pdays', 'previous', 'poutcome', 'emp.var.rate',
                      'cons.price.idx', 'cons.conf.idx', 'euribor3m', 'nr.employed'],
    categorical_features=['job', 'education', 'default',
                          'housing', 'loan', 'contact', 'month', 'day_of_week'],
    metadata={'protected_attribute_maps': protected_attribute_maps}
)
# group definitions used by all metrics and mitigation methods below
privileged_groups = [{'marital': 1}]
unprivileged_groups = [{'marital': 0}]
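
To make the encoding concrete, here is a minimal sketch of what the `privileged_classes` argument implies: values in the privileged set become 1 and all other values become 0, which is why the groups are written as `{'marital': 1}` and `{'marital': 0}`. The sample values and the helper `encode_protected` are hypothetical and only illustrate the idea; they are not part of AIF360.

```python
# Hypothetical raw marital values (illustration only, not read from bank.csv).
raw_marital = ["married", "single", "divorced", "married", "single"]
privileged_classes = {"married"}


def encode_protected(values, privileged):
    """Return 1.0 for values in the privileged set and 0.0 for all others."""
    return [1.0 if value in privileged else 0.0 for value in values]


marital_encoded = encode_protected(raw_marital, privileged_classes)
print(marital_encoded)  # [1.0, 0.0, 0.0, 1.0, 0.0]
```

Under this encoding, "single" and "divorced" both fall into the unprivileged (unmarried) group.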

### Split the dataset into training data and test data

In [3]:
# 70% of the rows go to training and the remaining 30% to testing, in file order (no shuffling)
dataset_orig_train, dataset_orig_test = dataset_orig.split([0.7], shuffle=False)
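
Because the split above does not shuffle, the result depends on the row order of the data. The sketch below shows an ordered 70/30 split on a plain list. The helper `split_in_order` is a hypothetical stand-in used only for illustration, not the AIF360 implementation.

```python
def split_in_order(rows, train_fraction):
    """Split rows into (train, test) without shuffling: the first part is training."""
    cut = round(train_fraction * len(rows))
    return rows[:cut], rows[cut:]


rows = list(range(10))  # stand-in for 10 dataset rows
train_rows, test_rows = split_in_order(rows, 0.7)
print(train_rows)  # [0, 1, 2, 3, 4, 5, 6]
print(test_rows)   # [7, 8, 9]
```

If the file happens to be sorted (for example by contact date), an ordered split can give training and test sets with different characteristics, which is one reason to experiment with shuffling and with other split ratios.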

### Check fairness metrics of training data.

In [4]:
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                                     unprivileged_groups=unprivileged_groups,
                                     privileged_groups=privileged_groups)
# print the metric values 
print('SPD', round(metric_orig_train.mean_difference(), 2))
print('DI', round(metric_orig_train.disparate_impact(), 2))

SPD 0.18
DI 1.7
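
The two dataset metrics compare the rate of favorable outcomes (here, subscribed = yes) between the groups. Statistical parity difference (SPD) is the unprivileged rate minus the privileged rate, and disparate impact (DI) is the unprivileged rate divided by the privileged rate. A fair dataset has SPD close to 0 and DI close to 1. The following is a small sketch on made-up toy data; the labels, groups, and the helper `favorable_rate` are assumptions for illustration only.

```python
def favorable_rate(labels, groups, group_value, favorable=1):
    """Fraction of members of one group that have the favorable label."""
    selected = [y for y, g in zip(labels, groups) if g == group_value]
    return sum(1 for y in selected if y == favorable) / len(selected)


# Toy data: label 1 = subscribed; group 1 = married, group 0 = unmarried.
labels = [1, 0, 0, 0, 1, 1, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]

rate_priv = favorable_rate(labels, groups, 1)    # 1 of 4 married subscribed
rate_unpriv = favorable_rate(labels, groups, 0)  # 2 of 4 unmarried subscribed

spd = rate_unpriv - rate_priv  # statistical parity difference
di = rate_unpriv / rate_priv   # disparate impact
print("SPD", spd)  # SPD 0.25
print("DI", di)    # DI 2.0
```

A positive SPD (and a DI above 1) means the unprivileged group has the higher favorable rate, as in the training data output above.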


### Build a model without mitigation methods. (Baseline)

#### Train a Logistic Regression model using the training data. Make predictions on the test data using the trained model.

In [5]:
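# train a Logistic Regression model on the training data and predict on the test data
def Logistic_Regression(training_data, test_data):
    model = LogisticRegression(random_state=0, max_iter=1000)
    # train the model; instance weights (e.g., from Reweighing) are not passed to fit(),
    # so weight-based pre-processing has no effect on this model. To apply such weights,
    # pass sample_weight=training_data.instance_weights to fit().
    model.fit(training_data.features, training_data.labels.ravel())
    # predict labels for the test data
    prediction_label = model.predict(test_data.features)
    # copy the test data passed in (not the global dataset) and overwrite its labels,
    # keeping the (n_samples, 1) label shape used by the dataset
    prediction = test_data.copy()
    prediction.labels = prediction_label.reshape(-1, 1)
    # return the prediction on the test data
    return prediction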

prediction = Logistic_Regression(dataset_orig_train, dataset_orig_test)

#### Check fairness metrics and accuracy of the prediction.

In [6]:
# measure the accuracy and the fairness metrics of a prediction on the test data
def get_prediction_metrics(prediction):
    # compare the true test labels with the predicted labels, per group
    metric = ClassificationMetric(
                        dataset_orig_test, prediction,
                        unprivileged_groups=unprivileged_groups,
                        privileged_groups=privileged_groups)

    accuracy = metric.accuracy()
    print('accuracy', accuracy)
    # the four fairness metrics are printed in this order
    print(round(metric.statistical_parity_difference(), 2))  # statistical parity difference
    print(round(metric.disparate_impact(), 2))               # disparate impact
    print(round(metric.equal_opportunity_difference(), 2))   # equal opportunity difference
    print(round(metric.average_odds_difference(), 2))        # average odds difference

get_prediction_metrics(prediction)

accuracy 0.7973333333333333
0.16
1.7
0.08
0.05
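
The last two metrics depend on the true labels as well as the predictions. Equal opportunity difference is the true positive rate (TPR) of the unprivileged group minus that of the privileged group. Average odds difference is the mean of the TPR difference and the false positive rate (FPR) difference. Both are 0 for a classifier that treats the groups alike. Below is a minimal sketch on made-up toy predictions; the data and the helper `tpr_fpr` are hypothetical.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


# Toy data per group: true labels and predicted labels.
tpr_priv, fpr_priv = tpr_fpr([1, 1, 0, 0], [1, 0, 0, 0])      # TPR 0.5, FPR 0.0
tpr_unpriv, fpr_unpriv = tpr_fpr([1, 1, 0, 0], [1, 1, 0, 0])  # TPR 1.0, FPR 0.0

equal_opportunity_difference = tpr_unpriv - tpr_priv
average_odds_difference = 0.5 * ((fpr_unpriv - fpr_priv) + (tpr_unpriv - tpr_priv))
print(equal_opportunity_difference)  # 0.5
print(average_odds_difference)       # 0.25
```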


### Apply different mitigation methods to get debiased prediction.

#### Pre-processing

In [7]:
# Pre-processing: reweighing method
RW_model = Reweighing(unprivileged_groups=unprivileged_groups,
            privileged_groups=privileged_groups)
dataset_RW_train = RW_model.fit_transform(dataset_orig_train)
metric_RW_train = BinaryLabelDatasetMetric(dataset_RW_train, 
                                     unprivileged_groups=unprivileged_groups,
                                     privileged_groups=privileged_groups)

# print the metric values 
print('SPD', round(metric_RW_train.mean_difference(), 2))
print('DI', round(metric_RW_train.disparate_impact(), 2))

SPD 0.0
DI 1.0
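
Reweighing leaves features and labels unchanged and only assigns a weight to each row. In the standard formulation, a row with group $g$ and label $y$ gets the weight $P(g)P(y)/P(g, y)$, so that group and label become independent in the weighted data. That is why the weighted SPD is 0 and the weighted DI is 1 above. The sketch below reproduces this on toy data; the data and helper names are hypothetical and this is not the AIF360 code itself.

```python
from collections import Counter


def reweighing_weights(labels, groups):
    """Weight each row by P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
            for g, y in zip(groups, labels)]


def weighted_favorable_rate(labels, groups, weights, group_value, favorable=1):
    """Weighted fraction of one group's rows that have the favorable label."""
    total = sum(w for g, w in zip(groups, weights) if g == group_value)
    fav = sum(w for y, g, w in zip(labels, groups, weights)
              if g == group_value and y == favorable)
    return fav / total


# Toy data: label 1 = subscribed; group 1 = married, group 0 = unmarried.
labels = [1, 0, 0, 0, 1, 1, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]

weights = reweighing_weights(labels, groups)
weighted_spd = (weighted_favorable_rate(labels, groups, weights, 0)
                - weighted_favorable_rate(labels, groups, weights, 1))
print(weights[0])               # 1.5 (married subscribers are up-weighted)
print(round(weighted_spd, 10))  # 0.0
```

The weights only matter downstream if the learner actually uses them, for example through a `sample_weight` argument when fitting.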


In [8]:
# Pre-processing: Learning fair representations
LFR_model = LFR(unprivileged_groups=unprivileged_groups, 
    privileged_groups=privileged_groups,
    verbose=0, seed=10)
LFR_model = LFR_model.fit(dataset_orig_train)
dataset_LFR_train = LFR_model.transform(dataset_orig_train)

metric_LFR_train = BinaryLabelDatasetMetric(dataset_LFR_train, 
                                        unprivileged_groups=unprivileged_groups,
                                        privileged_groups=privileged_groups)
# print the metric values of the LFR-transformed training data
print('SPD', round(metric_LFR_train.mean_difference(), 2))
print('DI', round(metric_LFR_train.disparate_impact(), 2))

SPD 0.0
DI 1.0


In [9]:
# train the baseline model on the pre-processed training data
# first block of output: LFR-transformed training data
prediction = Logistic_Regression(dataset_LFR_train, dataset_orig_test)
get_prediction_metrics(prediction)

# second block of output: reweighed training data
prediction = Logistic_Regression(dataset_RW_train, dataset_orig_test)
get_prediction_metrics(prediction)

accuracy 0.7013333333333334
0.0
1.01
-0.11
-0.09
accuracy 0.7973333333333333
0.16
1.7
0.08
0.05


#### In-processing

In [10]:
# in-processing: Prejudice remover
model = PrejudiceRemover(eta=0.1)
model.fit(dataset_orig_train)
prediction = model.predict(dataset_orig_test)
get_prediction_metrics(prediction)

accuracy 0.798
0.17
1.84
0.13
0.08


In [11]:
# in-processing: Adversarial debiasing
from aif360.algorithms.inprocessing.adversarial_debiasing import AdversarialDebiasing
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

tf.reset_default_graph()
sess = tf.Session()
num_epochs = 50
classifier_num_hidden_units = 200
model = AdversarialDebiasing(privileged_groups = privileged_groups,
                            unprivileged_groups = unprivileged_groups,
                            scope_name='debiased_classifier',
                            num_epochs=num_epochs,
                            classifier_num_hidden_units=classifier_num_hidden_units,
                            debias=True,
                            sess=sess)
model.fit(dataset_RW_train)
prediction = model.predict(dataset_orig_test)
get_prediction_metrics(prediction)

epoch 0; iter: 0; batch classifier loss: 7.859415; batch adversarial loss: 0.769911
epoch 1; iter: 0; batch classifier loss: 3.744248; batch adversarial loss: 0.769484
epoch 2; iter: 0; batch classifier loss: 2.387394; batch adversarial loss: 0.729065
epoch 3; iter: 0; batch classifier loss: 5.968015; batch adversarial loss: 0.746093
epoch 4; iter: 0; batch classifier loss: 2.796527; batch adversarial loss: 0.724543
epoch 5; iter: 0; batch classifier loss: 1.637925; batch adversarial loss: 0.700866
epoch 6; iter: 0; batch classifier loss: 1.673924; batch adversarial loss: 0.711516
epoch 7; iter: 0; batch classifier loss: 1.789078; batch adversarial loss: 0.711088
epoch 8; iter: 0; batch classifier loss: 0.960733; batch adversarial loss: 0.715673
epoch 9; iter: 0; batch classifier loss: 0.887707; batch adversarial loss: 0.710739
epoch 10; iter: 0; batch classifier loss: 0.733396; batch adversarial loss: 0.724761
epoch 11; iter: 0; batch classifier loss: 0.860014; batch adversarial loss:

#### Post-processing

In [14]:
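# Post-processing: Reject option classification
# `prediction` here is the output of the adversarial debiasing cell above;
# the method relies on the prediction scores stored in it.
model = RejectOptionClassification(privileged_groups = privileged_groups,
                                unprivileged_groups = unprivileged_groups, num_class_thresh=500)
# note: the thresholds are tuned on the test data itself, so the reported metrics
# are optimistic; a separate validation split avoids this
model = model.fit(dataset_orig_test, prediction)
prediction = model.predict(prediction)
get_prediction_metrics(prediction)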

accuracy 0.756
0.05
1.27
-0.05
-0.04
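
Reject option classification changes only the predictions the classifier is least sure about. Scores within a margin of the decision threshold form a "critical region"; inside it, unprivileged instances receive the favorable label and privileged instances the unfavorable one, while predictions outside it are left as they are. The sketch below uses a fixed threshold and margin, whereas the AIF360 class searches over them. The scores, the margin value, and the helper `reject_option_predict` are assumptions for illustration.

```python
def reject_option_predict(scores, groups, threshold=0.5, margin=0.1,
                          privileged_value=1):
    """Relabel uncertain predictions in favor of the unprivileged group."""
    predictions = []
    for score, group in zip(scores, groups):
        if abs(score - threshold) <= margin:
            # critical region: favorable (1) for unprivileged, unfavorable (0) for privileged
            predictions.append(0 if group == privileged_value else 1)
        else:
            # confident prediction: apply the usual threshold rule
            predictions.append(1 if score > threshold else 0)
    return predictions


# Toy scores (predicted probability of "yes") and group membership (1 = married).
scores = [0.9, 0.55, 0.45, 0.2, 0.58, 0.42]
groups = [1, 1, 0, 0, 0, 1]

adjusted = reject_option_predict(scores, groups)
print(adjusted)  # [1, 0, 1, 0, 1, 0]
```

Only the four scores between 0.4 and 0.6 are affected; the confident predictions (0.9 and 0.2) keep their original labels, which is why the accuracy loss from this method is usually limited.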


### Flexibly combine different techniques to generate debiased prediction.

Give one example, then let students try more combinations in code.
Also try different train/test split ratios.