# Prejudice removal

**Uses the Prejudice Remover in-processing algorithm from the AIF360 toolkit, which adds a discrimination-aware regularization term to the learning objective.**

See https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.inprocessing.PrejudiceRemover.html <br />
See http://aif360.mybluemix.net/resources#guidance for guidance on metrics and mitigation algorithms
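
As a rough sketch of what "adding a regularization term" means here, the objective can be written as follows. The notation is ours and follows the general formulation in Kamishima et al., on which the AIF360 implementation is based; it is an illustration, not a transcription of the library code.

$$
\min_{\Theta}\; -\mathcal{L}(\mathcal{D};\Theta) \;+\; \eta\,\mathrm{R}_{\mathrm{PR}}(\mathcal{D},\Theta) \;+\; \frac{\lambda}{2}\lVert\Theta\rVert_2^2
$$

Here $\mathcal{L}$ is the log-likelihood of a logistic regression model with parameters $\Theta$, $\mathrm{R}_{\mathrm{PR}}$ is the prejudice index (a mutual-information-style measure of dependence between the predicted label and the sensitive attribute), $\eta$ is the fairness penalty weight, and $\lambda$ is the usual L2 regularization weight. Larger values of $\eta$ push the model toward predictions that depend less on the sensitive attribute, typically at some cost in accuracy.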

In [61]:
%matplotlib inline
import os
import sys
sys.path.insert(0, os.path.abspath('../datasets'))
import numpy as np
from tqdm import tqdm
from warnings import warn
import pandas as pd

from IPython.display import Markdown, display
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.metrics import ClassificationMetric
from common_utils import compute_metrics
from sklearn.preprocessing import MaxAbsScaler

from aif360.algorithms.inprocessing import PrejudiceRemover

# Import employment dataset 
from EmploymentDataset import EmploymentDataset
from util import preprocess_employment

In [51]:
privileged_groups = [{'Sex': 1}]
unprivileged_groups = [{'Sex': 0}]

# Fairness penalty parameters (candidate values for eta)
eta = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 15.0, 20.0, 30.0, 40.0, 50.0, 100.0,
       150.0, 200.0, 250.0, 300.0]

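# Debiased model: eta=10 sets the weight of the fairness (prejudice) penalty
prejudice_remover = PrejudiceRemover(eta=10, sensitive_attr='Sex', class_attr='EmploymentStatus')
# Baseline model: eta is left at the library default. In AIF360 that default is
# believed to be 1.0, not 0.0; pass eta=0.0 explicitly for a strictly penalty-free baseline.
prejudice_no_penalty = PrejudiceRemover(sensitive_attr='Sex', class_attr='EmploymentStatus')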
 
# Import the dataset
dataset_orig = preprocess_employment(['Sex'])

# Split dataset 70/30
dataset_orig_train, dataset_orig_test = dataset_orig.split([0.7], shuffle=True)

[1. 0.]
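
The `eta` grid defined in this cell is not used by the cells shown here, which train a single model with `eta=10`. A sweep over the grid could be sketched as below. Everything in this sketch is an assumption: `sweep_eta`, `fake_evaluate`, and the accuracy tolerance are hypothetical names and choices, and the numbers returned by `fake_evaluate` are invented to exercise the selection logic. In practice `evaluate` would fit `PrejudiceRemover(eta=...)` and score it on a validation split rather than the test set.

```python
def sweep_eta(etas, evaluate, max_accuracy_drop=0.02):
    """Pick the eta with the smallest |disparity| whose accuracy stays
    within max_accuracy_drop of the first (baseline) eta in the list."""
    # Each result is (eta, accuracy, disparity)
    results = [(eta, *evaluate(eta)) for eta in etas]
    baseline_accuracy = results[0][1]
    admissible = [r for r in results if r[1] >= baseline_accuracy - max_accuracy_drop]
    best = min(admissible, key=lambda r: abs(r[2]))
    return best, results


def fake_evaluate(eta):
    """Hypothetical stand-in for 'fit the model with this eta and score it'.
    Returns invented (accuracy, mean difference) values."""
    accuracy = 0.80 - 0.001 * eta
    disparity = -0.10 / (1.0 + eta)
    return accuracy, disparity


candidate_etas = [0.0, 1.0, 5.0, 10.0, 50.0]
best, results = sweep_eta(candidate_etas, fake_evaluate)
print("selected eta:", best[0])
```

With these made-up numbers, `eta=50.0` gives the smallest disparity but loses too much accuracy, so the sketch selects `eta=10.0`.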


In [52]:
# Metric for the original dataset
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Original training dataset"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())
metric_orig_test = BinaryLabelDatasetMetric(dataset_orig_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_test.mean_difference())

#### Original training dataset

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.095891
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.094418
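
The "difference in mean outcomes" reported above is the statistical parity difference: the favorable-outcome rate of the unprivileged group minus that of the privileged group, so a negative value means the unprivileged group receives the favorable label less often. A minimal sketch of that computation follows. The function name mirrors the AIF360 method, but this is a standalone illustration on made-up toy data, not the library implementation or the employment dataset.

```python
def mean_difference(labels, protected, favorable=1, unprivileged=0, privileged=1):
    """P(label = favorable | unprivileged) - P(label = favorable | privileged)."""
    def base_rate(group):
        members = [y for y, s in zip(labels, protected) if s == group]
        return sum(1 for y in members if y == favorable) / len(members)

    return base_rate(unprivileged) - base_rate(privileged)


# Toy data: the first four rows are the unprivileged group (Sex = 0)
toy_labels = [1, 0, 0, 1, 1, 1, 1, 0]
toy_sex = [0, 0, 0, 0, 1, 1, 1, 1]

print(mean_difference(toy_labels, toy_sex))  # 0.5 - 0.75 = -0.25
```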


In [53]:
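# Scale each feature by its maximum absolute value; fit on the training set only
# and reuse the same scaling on the test set
max_abs_scaler = MaxAbsScaler()
dataset_orig_train.features = max_abs_scaler.fit_transform(dataset_orig_train.features)
dataset_orig_test.features = max_abs_scaler.transform(dataset_orig_test.features)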
metric_scaled_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
display(Markdown("#### Scaled dataset - Verify that the scaling does not affect the group label statistics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_train.mean_difference())
metric_scaled_test = BinaryLabelDatasetMetric(dataset_orig_test, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_test.mean_difference())

#### Scaled dataset - Verify that the scaling does not affect the group label statistics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.095891
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.094418
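
The values are unchanged because max-abs scaling only divides each feature column by a constant, while the metric is computed from the labels and the protected attribute. A binary 0/1 column such as `Sex` has a maximum absolute value of 1, so it is left as is. The sketch below reimplements the scaling by hand on made-up numbers to show this; `max_abs_scale` is a hypothetical helper, not part of scikit-learn.

```python
def max_abs_scale(rows):
    """Divide every column by its largest absolute value.
    All-zero columns are divided by 1.0 so they stay zero."""
    n_cols = len(rows[0])
    scales = [max(abs(row[j]) for row in rows) or 1.0 for j in range(n_cols)]
    scaled = [[row[j] / scales[j] for j in range(n_cols)] for row in rows]
    return scaled, scales


# Toy feature matrix: two numeric columns plus a binary Sex column
toy_features = [
    [2.0, -10.0, 0.0],
    [4.0, 5.0, 1.0],
    [1.0, 0.0, 1.0],
]
scaled_features, column_scales = max_abs_scale(toy_features)

print(column_scales)    # [4.0, 10.0, 1.0]
print(scaled_features)  # every value now lies in [-1, 1]
```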


In [54]:
# Train the model
prejudice_no_penalty.fit(dataset_orig_train)

<aif360.algorithms.inprocessing.prejudice_remover.PrejudiceRemover at 0x7fa5c6439710>

In [55]:
# Apply the baseline model (no explicit eta) to the training and test data
dataset_train_transformed_plain = prejudice_no_penalty.predict(dataset_orig_train)
dataset_test_transformed_plain = prejudice_no_penalty.predict(dataset_orig_test)

In [56]:
# Metrics for the dataset from model without debiasing
display(Markdown("## No fairness constraints - dataset metrics"))
display(Markdown("#### Model without debiasing - dataset metrics"))
metric_dataset_prejudiceremover_train = BinaryLabelDatasetMetric(dataset_train_transformed_plain, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_prejudiceremover_train.mean_difference())

metric_dataset_prejudiceremover_test = BinaryLabelDatasetMetric(dataset_test_transformed_plain, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_prejudiceremover_test.mean_difference())

display(Markdown("#### Model without debiasing - classification metrics"))
classified_metric_prejudiceremover_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_test_transformed_plain,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_prejudiceremover_test.accuracy())
TPR = classified_metric_prejudiceremover_test.true_positive_rate()
TNR = classified_metric_prejudiceremover_test.true_negative_rate()
bal_acc_prejudiceremover_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_prejudiceremover_test)
print("Test set: Disparate impact = %f" % classified_metric_prejudiceremover_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_prejudiceremover_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_prejudiceremover_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_prejudiceremover_test.theil_index())

## No fairness constraints - dataset metrics

#### Model without debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.025344
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.016847


#### Model without debiasing - classification metrics

Test set: Classification accuracy = 0.815274
Test set: Balanced classification accuracy = 0.660981
Test set: Disparate impact = 0.981027
Test set: Equal opportunity difference = -0.020949
Test set: Average odds difference = 0.054591
Test set: Theil_index = 0.076198
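
For reference, the group fairness metrics printed above can be derived from per-group confusion counts. The sketch below follows our understanding of the AIF360 definitions (unprivileged value minus, or divided by, privileged value) and runs on made-up toy data; `rates` and `fairness_report` are hypothetical helper names, not library functions.

```python
def rates(y_true, y_pred, mask):
    """Return (TPR, FPR, selection rate) over the rows where mask is True."""
    tp = fp = tn = fn = 0
    for truth, pred, keep in zip(y_true, y_pred, mask):
        if not keep:
            continue
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn), (tp + fp) / (tp + fp + tn + fn)


def fairness_report(y_true, y_pred, sex):
    everyone = [True] * len(y_true)
    unpriv = [s == 0 for s in sex]
    priv = [s == 1 for s in sex]
    tpr, fpr, _ = rates(y_true, y_pred, everyone)
    tpr_u, fpr_u, sel_u = rates(y_true, y_pred, unpriv)
    tpr_p, fpr_p, sel_p = rates(y_true, y_pred, priv)
    return {
        # Mean of TPR and TNR, where TNR = 1 - FPR
        "balanced_accuracy": 0.5 * (tpr + (1 - fpr)),
        # Ratio of favorable prediction rates; 1.0 means parity
        "disparate_impact": sel_u / sel_p,
        # Difference in TPR; 0.0 means parity
        "equal_opportunity_difference": tpr_u - tpr_p,
        # Mean of the FPR and TPR differences; 0.0 means parity
        "average_odds_difference": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }


toy_sex = [0, 0, 0, 0, 1, 1, 1, 1]
toy_true = [1, 1, 0, 0, 1, 1, 0, 0]
toy_pred = [1, 0, 0, 0, 1, 1, 1, 0]
report = fairness_report(toy_true, toy_pred, toy_sex)
for name, value in report.items():
    print("%s = %f" % (name, value))
```

Read against these definitions, the recorded disparate impact of 0.981027 is close to the parity value of 1.0, and the equal opportunity difference of -0.020949 is close to the parity value of 0.0.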


In [57]:
# Train the model with the fairness penalty (eta=10)
prejudice_remover.fit(dataset_orig_train)

<aif360.algorithms.inprocessing.prejudice_remover.PrejudiceRemover at 0x7fa5c6439780>

In [58]:
# Apply the prejudice remover model to the training and test data
dataset_train_transformed = prejudice_remover.predict(dataset_orig_train)
dataset_test_transformed = prejudice_remover.predict(dataset_orig_test)

In [62]:
display(Markdown("## No fairness constraints - dataset metrics"))
# Metrics for the dataset from model without debiasing
display(Markdown("#### Model without debiasing - dataset metrics"))
metric_dataset_prejudiceremover_train = BinaryLabelDatasetMetric(dataset_train_transformed_plain, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_prejudiceremover_train.mean_difference())

metric_dataset_prejudiceremover_test = BinaryLabelDatasetMetric(dataset_test_transformed_plain, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_prejudiceremover_test.mean_difference())

display(Markdown("#### Model without debiasing - classification metrics (test set)"))
classified_metric_prejudiceremover_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_test_transformed_plain,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_prejudiceremover_test.accuracy())
TPR = classified_metric_prejudiceremover_test.true_positive_rate()
TNR = classified_metric_prejudiceremover_test.true_negative_rate()
bal_acc_prejudiceremover_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_prejudiceremover_test)
print("Test set: Disparate impact = %f" % classified_metric_prejudiceremover_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_prejudiceremover_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_prejudiceremover_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_prejudiceremover_test.theil_index())

# ---- Debiasing begins here ----


display(Markdown("## With fairness constraints - dataset metrics"))

# Metrics for the dataset from model with debiasing
display(Markdown("#### Model with debiasing - dataset metrics"))
metric_dataset_prejudiceremover_train = BinaryLabelDatasetMetric(dataset_train_transformed, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_prejudiceremover_train.mean_difference())

metric_dataset_prejudiceremover_test = BinaryLabelDatasetMetric(dataset_test_transformed, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_prejudiceremover_test.mean_difference())

display(Markdown("#### Model with debiasing - classification metrics (test set)"))
classified_metric_prejudiceremover_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_test_transformed,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_prejudiceremover_test.accuracy())
TPR = classified_metric_prejudiceremover_test.true_positive_rate()
TNR = classified_metric_prejudiceremover_test.true_negative_rate()
bal_acc_prejudiceremover_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_prejudiceremover_test)
print("Test set: Disparate impact = %f" % classified_metric_prejudiceremover_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_prejudiceremover_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_prejudiceremover_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_prejudiceremover_test.theil_index())


## No fairness constraints - dataset metrics

#### Model without debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.025344
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.016847


#### Model without debiasing - classification metrics (test set)

Test set: Classification accuracy = 0.815274
Test set: Balanced classification accuracy = 0.660981
Test set: Disparate impact = 0.981027
Test set: Equal opportunity difference = -0.020949
Test set: Average odds difference = 0.054591
Test set: Theil_index = 0.076198


## With fairness constraints - dataset metrics

#### Model with debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.015708
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.005376


#### Model with debiasing - classification metrics (test set)

Test set: Classification accuracy = 0.815890
Test set: Balanced classification accuracy = 0.653882
Test set: Disparate impact = 0.993981
Test set: Equal opportunity difference = -0.012431
Test set: Average odds difference = 0.066897
Test set: Theil_index = 0.070934
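
The Theil index reported in both result blocks measures individual-level inequality in how the classifier distributes "benefit"; 0 means every individual receives the same benefit. As we understand the AIF360 definition, it is the generalized entropy index with alpha = 1 over benefits b_i = prediction_i - label_i + 1. The standalone sketch below is written under that assumption and uses made-up toy data; it is not the library code.

```python
import math


def theil_index(y_true, y_pred):
    """Generalized entropy index (alpha = 1) over benefits b = pred - true + 1."""
    benefits = [pred - truth + 1 for truth, pred in zip(y_true, y_pred)]
    mean_benefit = sum(benefits) / len(benefits)
    total = 0.0
    for b in benefits:
        if b > 0:  # a zero benefit contributes nothing to the sum
            ratio = b / mean_benefit
            total += ratio * math.log(ratio)
    return total / len(benefits)


# Perfect predictions: everyone gets benefit 1, so there is no inequality
print(theil_index([1, 0, 1], [1, 0, 1]))

# One false negative (benefit 0) and one false positive (benefit 2)
print(theil_index([1, 1, 0, 0], [1, 0, 1, 0]))
```

Under this definition, the drop from 0.076198 to 0.070934 in the recorded outputs corresponds to a slightly more even distribution of benefit for the model trained with `eta=10`.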
