## Step 1 install package using github repository: https://github.com/Trusted-AI/AIF360.git

In [1]:
!pip install git+https://github.com/pg2374/AIF.git

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting git+https://github.com/pg2374/AIF.git
  Cloning https://github.com/pg2374/AIF.git to /tmp/pip-req-build-8ufx28dy
  Running command git clone -q https://github.com/pg2374/AIF.git /tmp/pip-req-build-8ufx28dy
Collecting scipy<1.6.0,>=1.2.0
  Downloading scipy-1.5.4-cp38-cp38-manylinux1_x86_64.whl (25.8 MB)
[K     |████████████████████████████████| 25.8 MB 1.9 MB/s 
Collecting tempeh
  Downloading tempeh-0.1.12-py3-none-any.whl (39 kB)
Collecting shap
  Downloading shap-0.41.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (575 kB)
[K     |████████████████████████████████| 575 kB 58.7 MB/s 
Collecting memory-profiler
  Downloading memory_profiler-0.61.0-py3-none-any.whl (31 kB)
Collecting slicer==0.0.7
  Downloading slicer-0.0.7-py3-none-any.whl (14 kB)
Building wheels for collected packages: aif360
  Building wheel for aif360 (setup.py) ... [?25l[?25hdone
  Created w

In [2]:
!pip install fairlearn
!pip install tensorflow
!pip install scikit-plot

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting fairlearn
  Downloading fairlearn-0.8.0-py3-none-any.whl (235 kB)
[K     |████████████████████████████████| 235 kB 14.3 MB/s 
Installing collected packages: fairlearn
Successfully installed fairlearn-0.8.0
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting scikit-plot
  Downloading scikit_plot-0.3.7-py3-none-any.whl (33 kB)
Installing collected packages: scikit-plot
Successfully installed scikit-plot-0.3.7


# Step 2: Download the data files

In [3]:
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names

--2022-12-23 06:04:05--  https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3974305 (3.8M) [application/x-httpd-php]
Saving to: ‘adult.data’


2022-12-23 06:04:08 (3.07 MB/s) - ‘adult.data’ saved [3974305/3974305]

--2022-12-23 06:04:08--  https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2003153 (1.9M) [application/x-httpd-php]
Saving to: ‘adult.test’


2022-12-23 06:04:10 (1.77 MB/s) - ‘adult.test’ saved [2003153/2003153]

--2022-12-23 06:04:10--  https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adu

# Step 3: Copy the data files to the location specified in the package

In [4]:
!cp adult.data /usr/local/lib/python3.8/dist-packages/aif360/data/raw/adult/
!cp adult.test /usr/local/lib/python3.8/dist-packages/aif360/data/raw/adult/
!cp adult.names /usr/local/lib/python3.8/dist-packages/aif360/data/raw/adult/

#### This notebook demonstrates the use of adversarial debiasing algorithm to learn a fair classifier.
Adversarial debiasing [1] is an in-processing technique that learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary's ability to determine the protected attribute from the predictions. This approach leads to a fair classifier as the predictions cannot carry any group discrimination information that the adversary can exploit. We will see how to use this algorithm for learning models with and without fairness constraints and apply them on the Adult dataset.

In [5]:
%matplotlib inline
# Load all necessary packages
import sys
sys.path.append("../")
from aif360.datasets import BinaryLabelDataset
from aif360.datasets import AdultDataset, GermanDataset, CompasDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.metrics import ClassificationMetric
from aif360.metrics.utils import compute_boolean_conditioning_vector

from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import load_preproc_data_adult, load_preproc_data_compas, load_preproc_data_german

from aif360.algorithms.inprocessing.adversarial_debiasing import AdversarialDebiasing

from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, MaxAbsScaler, MinMaxScaler
from sklearn.metrics import accuracy_score
from sklearn import metrics

from IPython.display import Markdown, display
import matplotlib.pyplot as plt

import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

import scikitplot as skplt

#### Load dataset and set options

In [6]:
# Get the dataset and split into train and test
dataset_orig = load_preproc_data_adult()

privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]

dataset_orig_train, dataset_orig_test = dataset_orig.split([0.7], shuffle=True)

In [7]:
# print out some labels, names, etc.
display(Markdown("#### Training Dataset shape"))
print(dataset_orig_train.features.shape)
display(Markdown("#### Test Dataset shape"))
print(dataset_orig_test.features.shape)
display(Markdown("#### Favorable and unfavorable labels"))
print(dataset_orig_train.favorable_label, dataset_orig_train.unfavorable_label)
display(Markdown("#### Protected attribute names"))
print(dataset_orig_train.protected_attribute_names)
display(Markdown("#### Privileged and unprivileged protected attribute values"))
print(dataset_orig_train.privileged_protected_attributes, 
      dataset_orig_train.unprivileged_protected_attributes)
display(Markdown("#### Dataset feature names"))
print(dataset_orig_train.feature_names)


#### Training Dataset shape

(34189, 18)


#### Test Dataset shape

(14653, 18)


#### Favorable and unfavorable labels

1.0 0.0


#### Protected attribute names

['sex', 'race']


#### Privileged and unprivileged protected attribute values

[array([1.]), array([1.])] [array([0.]), array([0.])]


#### Dataset feature names

['race', 'sex', 'Age (decade)=10', 'Age (decade)=20', 'Age (decade)=30', 'Age (decade)=40', 'Age (decade)=50', 'Age (decade)=60', 'Age (decade)=>=70', 'Education Years=6', 'Education Years=7', 'Education Years=8', 'Education Years=9', 'Education Years=10', 'Education Years=11', 'Education Years=12', 'Education Years=<6', 'Education Years=>12']


#### Metric for original training data

In [8]:
# Metric for the original dataset
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Original training dataset"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())
metric_orig_test = BinaryLabelDatasetMetric(dataset_orig_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_test.mean_difference())

#### Original training dataset

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.191948
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.200418


# Part 1: Original work by author
##In each part, we have added addtional metrics like precision and recall to gauge performance on imbalanced data
### Preprocessing dataset using MaxAbsScaler</br>

In [9]:
max_abs_scaler = MaxAbsScaler()
dataset_orig_train.features = max_abs_scaler.fit_transform(dataset_orig_train.features)
dataset_orig_test.features = max_abs_scaler.transform(dataset_orig_test.features)
metric_scaled_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
display(Markdown("#### Scaled dataset - Verify that the scaling does not affect the group label statistics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_train.mean_difference())
metric_scaled_test = BinaryLabelDatasetMetric(dataset_orig_test, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_test.mean_difference())


#### Scaled dataset - Verify that the scaling does not affect the group label statistics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.191948
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.200418


### Learn plain classifier without debiasing

In [10]:
# Load post-processing algorithm that equalizes the odds
# Learn parameters with debias set to False
sess = tf.Session()
plain_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='plain_classifier_part_1',
                          debias=False,
                          sess=sess)

In [11]:
plain_model.fit(dataset_orig_train)

Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


epoch 0; iter: 0; batch classifier loss: 0.720402
epoch 0; iter: 200; batch classifier loss: 0.407042
epoch 1; iter: 0; batch classifier loss: 0.361959
epoch 1; iter: 200; batch classifier loss: 0.437485
epoch 2; iter: 0; batch classifier loss: 0.480875
epoch 2; iter: 200; batch classifier loss: 0.456051
epoch 3; iter: 0; batch classifier loss: 0.405031
epoch 3; iter: 200; batch classifier loss: 0.341224
epoch 4; iter: 0; batch classifier loss: 0.335127
epoch 4; iter: 200; batch classifier loss: 0.392714
epoch 5; iter: 0; batch classifier loss: 0.491046
epoch 5; iter: 200; batch classifier loss: 0.359802
epoch 6; iter: 0; batch classifier loss: 0.407765
epoch 6; iter: 200; batch classifier loss: 0.339861
epoch 7; iter: 0; batch classifier loss: 0.412609
epoch 7; iter: 200; batch classifier loss: 0.466955
epoch 8; iter: 0; batch classifier loss: 0.375506
epoch 8; iter: 200; batch classifier loss: 0.368174
epoch 9; iter: 0; batch classifier loss: 0.452794
epoch 9; iter: 200; batch classi

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x7feab461ba60>

In [12]:
# Apply the plain model to test data
dataset_nodebiasing_train = plain_model.predict(dataset_orig_train)
dataset_nodebiasing_test = plain_model.predict(dataset_orig_test)

In [13]:
# Metrics for the dataset from plain model (without debiasing)
display(Markdown("#### Plain model - without debiasing - dataset metrics"))
metric_dataset_nodebiasing_train = BinaryLabelDatasetMetric(dataset_nodebiasing_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_train.mean_difference())

metric_dataset_nodebiasing_test = BinaryLabelDatasetMetric(dataset_nodebiasing_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_test.mean_difference())

display(Markdown("#### Plain model - without debiasing - classification metrics"))
classified_metric_nodebiasing_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_nodebiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_nodebiasing_test.accuracy())
print("Test set: Classification precision = %f" % classified_metric_nodebiasing_test.precision())
print("Test set: Classification recall = %f" % classified_metric_nodebiasing_test.recall())

TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
FPR = classified_metric_nodebiasing_test.false_positive_rate()
FNR = classified_metric_nodebiasing_test.false_negative_rate()
print("TPR: True Positive Rate = %f" % TPR)
print("TNR: True Negative Rate = %f" % TNR)
print("FPR: False Positive Rate = %f" % FPR)
print("FNR: False Negative Rate = %f" % FNR)
bal_acc_nodebiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_nodebiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_nodebiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_nodebiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_nodebiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_nodebiasing_test.theil_index())

#### Plain model - without debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.209740
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.214823


#### Plain model - without debiasing - classification metrics

Test set: Classification accuracy = 0.798403
Test set: Classification precision = 0.656886
Test set: Classification recall = 0.383743
TPR: True Positive Rate = 0.383743
TNR: True Negative Rate = 0.934306
FPR: False Positive Rate = 0.065694
FNR: False Negative Rate = 0.616257
Test set: Balanced classification accuracy = 0.659025
Test set: Disparate impact = 0.000000
Test set: Equal opportunity difference = -0.451235
Test set: Average odds difference = -0.279242
Test set: Theil_index = 0.184736


### Apply in-processing algorithm based on adversarial learning

In [14]:
sess.close()
tf.reset_default_graph()
sess = tf.Session()

In [15]:
# Learn parameters with debias set to True
debiased_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='debiased_classifier_part_1',
                          debias=True,
                          sess=sess)

In [16]:
debiased_model.fit(dataset_orig_train)

epoch 0; iter: 0; batch classifier loss: 0.759828; batch adversarial loss: 0.661721
epoch 0; iter: 200; batch classifier loss: 0.543180; batch adversarial loss: 0.649956
epoch 1; iter: 0; batch classifier loss: 0.491863; batch adversarial loss: 0.682094
epoch 1; iter: 200; batch classifier loss: 0.459154; batch adversarial loss: 0.649397
epoch 2; iter: 0; batch classifier loss: 0.490993; batch adversarial loss: 0.623625
epoch 2; iter: 200; batch classifier loss: 0.464788; batch adversarial loss: 0.646605
epoch 3; iter: 0; batch classifier loss: 0.444911; batch adversarial loss: 0.641664
epoch 3; iter: 200; batch classifier loss: 0.454854; batch adversarial loss: 0.589226
epoch 4; iter: 0; batch classifier loss: 0.409471; batch adversarial loss: 0.563411
epoch 4; iter: 200; batch classifier loss: 0.408660; batch adversarial loss: 0.608686
epoch 5; iter: 0; batch classifier loss: 0.409663; batch adversarial loss: 0.638049
epoch 5; iter: 200; batch classifier loss: 0.423042; batch adversa

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x7feab41df790>

In [17]:
# Apply the plain model to test data
dataset_debiasing_train = debiased_model.predict(dataset_orig_train)
dataset_debiasing_test = debiased_model.predict(dataset_orig_test)

In [18]:
# Metrics for the dataset from plain model (without debiasing)
display(Markdown("#### Plain model - without debiasing - dataset metrics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_train.mean_difference())
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_test.mean_difference())

# Metrics for the dataset from model with debiasing
display(Markdown("#### Model - with debiasing - dataset metrics"))
metric_dataset_debiasing_train = BinaryLabelDatasetMetric(dataset_debiasing_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_debiasing_train.mean_difference())

metric_dataset_debiasing_test = BinaryLabelDatasetMetric(dataset_debiasing_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_debiasing_test.mean_difference())



display(Markdown("#### Plain model - without debiasing - classification metrics"))
print("Test set: Classification accuracy = %f" % classified_metric_nodebiasing_test.accuracy())
print("Test set: Classification precision = %f" % classified_metric_nodebiasing_test.precision())
print("Test set: Classification recall = %f" % classified_metric_nodebiasing_test.recall())
TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
FPR = classified_metric_nodebiasing_test.false_positive_rate()
FNR = classified_metric_nodebiasing_test.false_negative_rate()
print("TPR: True Positive Rate = %f" % TPR)
print("TNR: True Negative Rate = %f" % TNR)
print("FPR: False Positive Rate = %f" % FPR)
print("FNR: False Negative Rate = %f" % FNR)
bal_acc_nodebiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_nodebiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_nodebiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_nodebiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_nodebiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_nodebiasing_test.theil_index())



display(Markdown("#### Model - with debiasing - classification metrics"))
classified_metric_debiasing_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_debiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_debiasing_test.accuracy())
TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
FPR = classified_metric_nodebiasing_test.false_positive_rate()
FNR = classified_metric_nodebiasing_test.false_negative_rate()
print("TPR: True Positive Rate = %f" % TPR)
print("TNR: True Negative Rate = %f" % TNR)
print("FPR: False Positive Rate = %f" % FPR)
print("FNR: False Negative Rate = %f" % FNR)
bal_acc_debiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_debiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_debiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_debiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_debiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_debiasing_test.theil_index())

#### Plain model - without debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.209740
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.214823


#### Model - with debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.136748
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.142406


#### Plain model - without debiasing - classification metrics

Test set: Classification accuracy = 0.798403
Test set: Classification precision = 0.656886
Test set: Classification recall = 0.383743
TPR: True Positive Rate = 0.383743
TNR: True Negative Rate = 0.934306
FPR: False Positive Rate = 0.065694
FNR: False Negative Rate = 0.616257
Test set: Balanced classification accuracy = 0.659025
Test set: Disparate impact = 0.000000
Test set: Equal opportunity difference = -0.451235
Test set: Average odds difference = -0.279242
Test set: Theil_index = 0.184736


#### Model - with debiasing - classification metrics

Test set: Classification accuracy = 0.791169
TPR: True Positive Rate = 0.383743
TNR: True Negative Rate = 0.934306
FPR: False Positive Rate = 0.065694
FNR: False Negative Rate = 0.616257
Test set: Balanced classification accuracy = 0.659025
Test set: Disparate impact = 0.316061
Test set: Equal opportunity difference = -0.244574
Test set: Average odds difference = -0.148872
Test set: Theil_index = 0.182832


# Part 2: Preprocessing dataset using MinMaxScaler</br>
StandardScaler also gives approximately the same results.

# Part 3: changing the architecture of the neural network, added 2 additonal layer with the activation function relu 

In [19]:
from aif360.algorithms.inprocessing.adversarial_debiasing_pg1 import AdversarialDebiasing

In [20]:
# Add more data preprocessing here 
min_max_scaler = MinMaxScaler()
dataset_orig_train.features = min_max_scaler.fit_transform(dataset_orig_train.features)
dataset_orig_test.features = min_max_scaler.transform(dataset_orig_test.features)
metric_scaled_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
display(Markdown("#### Scaled dataset - Verify that the scaling does not affect the group label statistics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_train.mean_difference())
metric_scaled_test = BinaryLabelDatasetMetric(dataset_orig_test, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_test.mean_difference())

#### Scaled dataset - Verify that the scaling does not affect the group label statistics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.191948
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.200418


### Learn plain classifier without debiasing

In [21]:
# Load post-processing algorithm that equalizes the odds
# Learn parameters with debias set to False
sess = tf.Session()
plain_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='plain_classifier_part_3',
                          debias=False,
                          sess=sess)

In [22]:
plain_model.fit(dataset_orig_train)

epoch 0; iter: 0; batch classifier loss: 0.687325
epoch 0; iter: 200; batch classifier loss: 0.470102
epoch 1; iter: 0; batch classifier loss: 0.550684
epoch 1; iter: 200; batch classifier loss: 0.497577
epoch 2; iter: 0; batch classifier loss: 0.397570
epoch 2; iter: 200; batch classifier loss: 0.399183
epoch 3; iter: 0; batch classifier loss: 0.416078
epoch 3; iter: 200; batch classifier loss: 0.402656
epoch 4; iter: 0; batch classifier loss: 0.422276
epoch 4; iter: 200; batch classifier loss: 0.425741
epoch 5; iter: 0; batch classifier loss: 0.457593
epoch 5; iter: 200; batch classifier loss: 0.377042
epoch 6; iter: 0; batch classifier loss: 0.515300
epoch 6; iter: 200; batch classifier loss: 0.409753
epoch 7; iter: 0; batch classifier loss: 0.427897
epoch 7; iter: 200; batch classifier loss: 0.430777
epoch 8; iter: 0; batch classifier loss: 0.376762
epoch 8; iter: 200; batch classifier loss: 0.433482
epoch 9; iter: 0; batch classifier loss: 0.465370
epoch 9; iter: 200; batch classi

<aif360.algorithms.inprocessing.adversarial_debiasing_pg1.AdversarialDebiasing at 0x7feaa0234ac0>

In [23]:
# Apply the plain model to test data
dataset_nodebiasing_train = plain_model.predict(dataset_orig_train)
dataset_nodebiasing_test = plain_model.predict(dataset_orig_test)

In [24]:
# Metrics for the dataset from plain model (without debiasing)
display(Markdown("#### Plain model - without debiasing - dataset metrics"))
metric_dataset_nodebiasing_train = BinaryLabelDatasetMetric(dataset_nodebiasing_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_train.mean_difference())

metric_dataset_nodebiasing_test = BinaryLabelDatasetMetric(dataset_nodebiasing_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_test.mean_difference())

display(Markdown("#### Plain model - without debiasing - classification metrics"))
classified_metric_nodebiasing_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_nodebiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_nodebiasing_test.accuracy())
print("Test set: Classification precision = %f" % classified_metric_nodebiasing_test.precision())
print("Test set: Classification recall = %f" % classified_metric_nodebiasing_test.recall())
TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
FPR = classified_metric_nodebiasing_test.false_positive_rate()
FNR = classified_metric_nodebiasing_test.false_negative_rate()
print("TPR: True Positive Rate = %f" % TPR)
print("TNR: True Negative Rate = %f" % TNR)
print("FPR: False Positive Rate = %f" % FPR)
print("FNR: False Negative Rate = %f" % FNR)
bal_acc_nodebiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_nodebiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_nodebiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_nodebiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_nodebiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_nodebiasing_test.theil_index())

#### Plain model - without debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = 0.000000
Test set: Difference in mean outcomes between unprivileged and privileged groups = 0.000000


#### Plain model - without debiasing - classification metrics

Test set: Classification accuracy = 0.753156
Test set: Classification precision = 0.000000
Test set: Classification recall = 0.000000
TPR: True Positive Rate = 0.000000
TNR: True Negative Rate = 1.000000
FPR: False Positive Rate = 0.000000
FNR: False Negative Rate = 1.000000
Test set: Balanced classification accuracy = 0.500000
Test set: Disparate impact = nan
Test set: Equal opportunity difference = 0.000000
Test set: Average odds difference = 0.000000
Test set: Theil_index = 0.283482


invalid value encountered in double_scalars


### Apply in-processing algorithm based on adversarial learning

In [25]:
sess.close()
tf.reset_default_graph()
sess = tf.Session()

In [26]:
# Learn parameters with debias set to True
debiased_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='debiased_classifier_part_2',
                          debias=True,
                          sess=sess)

In [27]:
debiased_model.fit(dataset_orig_train)

epoch 0; iter: 0; batch classifier loss: 0.681401; batch adversarial loss: 0.649939
epoch 0; iter: 200; batch classifier loss: nan; batch adversarial loss: nan
epoch 1; iter: 0; batch classifier loss: nan; batch adversarial loss: nan
epoch 1; iter: 200; batch classifier loss: nan; batch adversarial loss: nan
epoch 2; iter: 0; batch classifier loss: nan; batch adversarial loss: nan
epoch 2; iter: 200; batch classifier loss: nan; batch adversarial loss: nan
epoch 3; iter: 0; batch classifier loss: nan; batch adversarial loss: nan
epoch 3; iter: 200; batch classifier loss: nan; batch adversarial loss: nan
epoch 4; iter: 0; batch classifier loss: nan; batch adversarial loss: nan
epoch 4; iter: 200; batch classifier loss: nan; batch adversarial loss: nan
epoch 5; iter: 0; batch classifier loss: nan; batch adversarial loss: nan
epoch 5; iter: 200; batch classifier loss: nan; batch adversarial loss: nan
epoch 6; iter: 0; batch classifier loss: nan; batch adversarial loss: nan
epoch 6; iter: 2

<aif360.algorithms.inprocessing.adversarial_debiasing_pg1.AdversarialDebiasing at 0x7feaa02cef40>

In [28]:
# Apply the plain model to test data
dataset_debiasing_train = debiased_model.predict(dataset_orig_train)
dataset_debiasing_test = debiased_model.predict(dataset_orig_test)

In [29]:
# Metrics for the dataset from plain model (without debiasing)
display(Markdown("#### Plain model - without debiasing - dataset metrics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_train.mean_difference())
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_test.mean_difference())

# Metrics for the dataset from model with debiasing
display(Markdown("#### Model - with debiasing - dataset metrics"))
metric_dataset_debiasing_train = BinaryLabelDatasetMetric(dataset_debiasing_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_debiasing_train.mean_difference())

metric_dataset_debiasing_test = BinaryLabelDatasetMetric(dataset_debiasing_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_debiasing_test.mean_difference())



display(Markdown("#### Plain model - without debiasing - classification metrics"))
print("Test set: Classification accuracy = %f" % classified_metric_nodebiasing_test.accuracy())
print("Test set: Classification precision = %f" % classified_metric_nodebiasing_test.precision())
print("Test set: Classification recall = %f" % classified_metric_nodebiasing_test.recall())
TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
FPR = classified_metric_nodebiasing_test.false_positive_rate()
FNR = classified_metric_nodebiasing_test.false_negative_rate()
print("TPR: True Positive Rate = %f" % TPR)
print("TNR: True Negative Rate = %f" % TNR)
print("FPR: False Positive Rate = %f" % FPR)
print("FNR: False Negative Rate = %f" % FNR)
bal_acc_nodebiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_nodebiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_nodebiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_nodebiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_nodebiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_nodebiasing_test.theil_index())



display(Markdown("#### Model - with debiasing - classification metrics"))
classified_metric_debiasing_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_debiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_debiasing_test.accuracy())
TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
FPR = classified_metric_nodebiasing_test.false_positive_rate()
FNR = classified_metric_nodebiasing_test.false_negative_rate()

print("TPR: True Positive Rate = %f" % TPR)
print("TNR: True Negative Rate = %f" % TNR)
print("FPR: False Positive Rate = %f" % FPR)
print("FNR: False Negative Rate = %f" % FNR)
bal_acc_debiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_debiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_debiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_debiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_debiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_debiasing_test.theil_index())

#### Plain model - without debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = 0.000000
Test set: Difference in mean outcomes between unprivileged and privileged groups = 0.000000


#### Model - with debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = 0.000000
Test set: Difference in mean outcomes between unprivileged and privileged groups = 0.000000


#### Plain model - without debiasing - classification metrics

Test set: Classification accuracy = 0.753156
Test set: Classification precision = 0.000000
Test set: Classification recall = 0.000000
TPR: True Positive Rate = 0.000000
TNR: True Negative Rate = 1.000000
FPR: False Positive Rate = 0.000000
FNR: False Negative Rate = 1.000000
Test set: Balanced classification accuracy = 0.500000
Test set: Disparate impact = nan
Test set: Equal opportunity difference = 0.000000
Test set: Average odds difference = 0.000000
Test set: Theil_index = 0.283482


#### Model - with debiasing - classification metrics

Test set: Classification accuracy = 0.753156
TPR: True Positive Rate = 0.000000
TNR: True Negative Rate = 1.000000
FPR: False Positive Rate = 0.000000
FNR: False Negative Rate = 1.000000
Test set: Balanced classification accuracy = 0.500000
Test set: Disparate impact = nan
Test set: Equal opportunity difference = 0.000000
Test set: Average odds difference = 0.000000
Test set: Theil_index = 0.283482


In [30]:
# #PLOT ROC curve
# # plt.plot(FPR,TPR)
# plt.plot(FPR, TPR, 'k--', label='ROC curve (area = %0.3f)')
# # plt.plot([0, 1], [0, 1], 'k--')  # random predictions curve
# plt.xlim([0.0, 1.0])
# plt.ylim([0.0, 1.0])
# plt.xlabel('False Positive Rate or (1 - Specifity)')
# plt.ylabel('True Positive Rate or (Sensitivity)')
# plt.title('Receiver Operating Characteristic')
# plt.legend(loc="lower right")
# plt.show()

# # from sklearn.metrics import roc_curve
# # from sklearn.metrics import auc




    References:
    [1] B. H. Zhang, B. Lemoine, and M. Mitchell, "Mitigating UnwantedBiases with Adversarial Learning," 
    AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2018.