#### This notebook demonstrates the use of adversarial debiasing algorithm to learn a fair classifier.
Adversarial debiasing [1] is an in-processing technique that learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary's ability to determine the protected attribute from the predictions. This approach leads to a fair classifier as the predictions cannot carry any group discrimination information that the adversary can exploit. We will see how to use this algorithm for learning models with and without fairness constraints and apply them on the Adult dataset.

In [1]:
from __future__ import print_function
%matplotlib inline
# Load all necessary packages
import sys
sys.path.append("../")

from aif360.datasets import BinaryLabelDataset
from aif360.datasets import AdultDataset, GermanDataset, CompasDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.metrics import ClassificationMetric
from aif360.metrics.utils import compute_boolean_conditioning_vector

from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import load_preproc_data_adult, load_preproc_data_compas, load_preproc_data_german

#from aif360.algorithms.inprocessing.adversarial_debiasing import AdversarialDebiasing
from aif360.algorithms.inprocessing.adversarial_debiasing_v2 import AdversarialDebiasing

from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, MaxAbsScaler
from sklearn.metrics import accuracy_score

from aif360.algorithms.postprocessing.reject_option_classification\
        import RejectOptionClassification

from common_utils import compute_metrics
from IPython.display import Markdown, display
import matplotlib.pyplot as plt

import numpy as np
import tensorflow as tf

#### Load dataset and set options

In [2]:
# Get the dataset and split into train and test
dataset_orig = load_preproc_data_adult()

privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]

#dataset_orig_train, dataset_orig_test = dataset_orig.split([0.7], shuffle=True)

In [3]:
# Get the dataset and split into train and test
dataset_orig_train, dataset_orig_vt = dataset_orig.split([0.7], shuffle=True)
dataset_orig_valid, dataset_orig_test = dataset_orig_vt.split([0.5], shuffle=True)

In [4]:
# print out some labels, names, etc.
display(Markdown("#### Training Dataset shape"))
print(dataset_orig_train.features.shape)
display(Markdown("#### Favorable and unfavorable labels"))
print(dataset_orig_train.favorable_label, dataset_orig_train.unfavorable_label)
display(Markdown("#### Protected attribute names"))
print(dataset_orig_train.protected_attribute_names)
display(Markdown("#### Privileged and unprivileged protected attribute values"))
print(dataset_orig_train.privileged_protected_attributes, 
      dataset_orig_train.unprivileged_protected_attributes)
display(Markdown("#### Dataset feature names"))
print(dataset_orig_train.feature_names)

#### Training Dataset shape

(34189, 18)


#### Favorable and unfavorable labels

1.0 0.0


#### Protected attribute names

['sex', 'race']


#### Privileged and unprivileged protected attribute values

[array([1.]), array([1.])] [array([0.]), array([0.])]


#### Dataset feature names

['race', 'sex', 'Age (decade)=10', 'Age (decade)=20', 'Age (decade)=30', 'Age (decade)=40', 'Age (decade)=50', 'Age (decade)=60', 'Age (decade)=>=70', 'Education Years=6', 'Education Years=7', 'Education Years=8', 'Education Years=9', 'Education Years=10', 'Education Years=11', 'Education Years=12', 'Education Years=<6', 'Education Years=>12']


#### Metric for original training data

In [5]:
# Metric for the original dataset
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Original training dataset"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())
metric_orig_test = BinaryLabelDatasetMetric(dataset_orig_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_test.mean_difference())

#### Original training dataset

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.191848
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.199053


In [6]:
min_max_scaler = MaxAbsScaler()
dataset_orig_train.features = min_max_scaler.fit_transform(dataset_orig_train.features)
dataset_orig_test.features = min_max_scaler.transform(dataset_orig_test.features)
metric_scaled_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
display(Markdown("#### Scaled dataset - Verify that the scaling does not affect the group label statistics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_train.mean_difference())
metric_scaled_test = BinaryLabelDatasetMetric(dataset_orig_test, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_test.mean_difference())


#### Scaled dataset - Verify that the scaling does not affect the group label statistics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.191848
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.199053


### Learn plan classifier without debiasing

In [7]:
# Load post-processing algorithm that equalizes the odds
# Learn parameters with debias set to False
sess = tf.Session()
plain_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='debiased_classifier',
                          debias=True,
                          sess=sess)

In [8]:
#tf.reset_default_graph()
plain_model.fit(dataset_orig_train)




The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.




epoch 0; iter: 0; batch classifier loss: 0.402223; batch adversarial loss: 0.923451
epoch 0; iter: 200; batch classifier loss: 0.562122; batch adversarial loss: 0.955835
epoch 1; iter: 0; batch classifier loss: 0.795309; batch adversarial loss: 0.941956
epoch 1; iter: 200; batch classifier loss: 0.478142; batch adversarial loss: 0.898928
epoch 2; iter: 0; batch classifier loss: 0.630311; batch adversarial loss: 0.953077
epoch 2; iter: 200; batch classifier loss: 0.643377; batch adversarial loss: 0.888095
epoc

epoch 37; iter: 0; batch classifier loss: 3.549010; batch adversarial loss: 0.833943
epoch 37; iter: 200; batch classifier loss: 4.056004; batch adversarial loss: 0.996399
epoch 38; iter: 0; batch classifier loss: 2.787884; batch adversarial loss: 0.877264
epoch 38; iter: 200; batch classifier loss: 2.900218; batch adversarial loss: 0.855604
epoch 39; iter: 0; batch classifier loss: 3.637752; batch adversarial loss: 0.974738
epoch 39; iter: 200; batch classifier loss: 3.876347; batch adversarial loss: 0.877264
epoch 40; iter: 0; batch classifier loss: 4.137067; batch adversarial loss: 0.909756
epoch 40; iter: 200; batch classifier loss: 3.669425; batch adversarial loss: 0.888095
epoch 41; iter: 0; batch classifier loss: 4.584616; batch adversarial loss: 0.920586
epoch 41; iter: 200; batch classifier loss: 3.576030; batch adversarial loss: 0.844773
epoch 42; iter: 0; batch classifier loss: 4.525702; batch adversarial loss: 0.888095
epoch 42; iter: 200; batch classifier loss: 3.447188; b

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x13ccd6860>

In [9]:
plain_model.pred_labels

<tf.Tensor 'plain_classifier/classifier_model/Softmax:0' shape=(?, 2) dtype=float32>

In [11]:
plain_model.y_pred_cls

AttributeError: 'AdversarialDebiasing' object has no attribute 'y_pred_cls'

In [9]:
pos_ind = np.where(np.asarray([0., 1.])== dataset_orig_train.favorable_label)[0][0]
pos_ind

1

In [46]:
# Apply the plain model to test data
#dataset_nodebiasing_train = plain_model.predict(dataset_orig_train)
dataset_nodebiasing_test = plain_model.predict(dataset_orig_test)
#dataset_valid_pred = plain_model.predict(dataset_orig_valid)

In [47]:
print(dataset_nodebiasing_test.probs)

[[0.8976953 ]
 [0.82502651]
 [0.82502651]
 ...
 [0.77253067]
 [0.81346083]
 [0.78133065]]


In [48]:
print(dataset_nodebiasing_test.labels)

[[0.10230471]
 [0.17497346]
 [0.17497346]
 ...
 [0.22746934]
 [0.18653922]
 [0.21866938]]


In [49]:
scale_orig = StandardScaler()
X_train = scale_orig.fit_transform(dataset_orig_train.features)
y_train = dataset_orig_train.labels.ravel()


dataset_orig_test_pred = dataset_orig_test.copy(deepcopy=True)
X_valid = scale_orig.transform(dataset_orig_test_pred.features)
y_valid = dataset_orig_test_pred.labels
#dataset_orig_test_pred.scores = dataset_nodebiasing_test.labels.reshape(-1,1)
#dataset_orig_valid_pred.scores = plain_model.predict((X_valid)[:,pos_ind].reshape(-1,1))
dataset_orig_test_pred.scores = dataset_nodebiasing_test.probs.reshape(-1,1)
dataset_orig_test_pred.scores

array([[0.8976953 ],
       [0.82502651],
       [0.82502651],
       ...,
       [0.77253067],
       [0.81346083],
       [0.78133065]])

In [60]:
scale_orig = StandardScaler()
X_train = scale_orig.fit_transform(dataset_orig_train.features)
y_train = dataset_orig_train.labels.ravel()


dataset_orig_valid_pred = dataset_orig_valid.copy(deepcopy=True)
X_valid = scale_orig.transform(dataset_orig_valid_pred.features)
y_valid = dataset_orig_valid_pred.labels
#dataset_orig_test_pred.scores = dataset_nodebiasing_test.labels.reshape(-1,1)
#dataset_orig_valid_pred.scores = plain_model.predict((X_valid)[:,pos_ind].reshape(-1,1))
dataset_orig_valid_pred.scores = dataset_valid_pred.probs.reshape(-1,1)
dataset_orig_valid_pred.scores

array([[0.85585117],
       [0.83548355],
       [0.85293543],
       ...,
       [0.89478946],
       [0.85585117],
       [0.715424  ]])

In [50]:
num_thresh = 100
ba_arr = np.zeros(num_thresh)
class_thresh_arr = np.linspace(0.01, 0.99, num_thresh)
for idx, class_thresh in enumerate(class_thresh_arr):
    
    fav_inds = dataset_orig_test_pred.scores > class_thresh
    dataset_orig_test_pred.labels[fav_inds] = dataset_orig_test_pred.favorable_label
    dataset_orig_test_pred.labels[~fav_inds] = dataset_orig_test_pred.unfavorable_label
    
    classified_metric_orig_valid = ClassificationMetric(dataset_orig_test,
                                             dataset_orig_test_pred, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
    
    ba_arr[idx] = 0.5*(classified_metric_orig_valid.true_positive_rate()\
                       +classified_metric_orig_valid.true_negative_rate())

best_ind = np.where(ba_arr == np.max(ba_arr))[0][0]
best_class_thresh = class_thresh_arr[best_ind]

print("Best balanced accuracy (no fairness constraints) = %.4f" % np.max(ba_arr))
print("Optimal classification threshold (no fairness constraints) = %.4f" % best_class_thresh)

Best balanced accuracy (no fairness constraints) = 0.6759
Optimal classification threshold (no fairness constraints) = 0.8316


In [51]:
ROC = RejectOptionClassification(unprivileged_groups=unprivileged_groups, 
                                 privileged_groups=privileged_groups, 
                                 low_class_thresh=0.01, high_class_thresh=0.99,
                                  num_class_thresh=100, num_ROC_margin=50,
                                  metric_name="Statistical parity difference",
                                  metric_ub=0.001, metric_lb=-0.001)
ROC = ROC.fit(dataset_orig_test, dataset_orig_test_pred)

In [52]:
print("Optimal classification threshold (with fairness constraints) = %.4f" % ROC.classification_threshold)
print("Optimal ROC margin = %.4f" % ROC.ROC_margin)

Optimal classification threshold (with fairness constraints) = 0.7920
Optimal ROC margin = 0.0340


In [55]:
# Metrics for the test set
fav_inds = dataset_orig_test_pred.scores > best_class_thresh
dataset_orig_test_pred.labels[fav_inds] = dataset_orig_test_pred.favorable_label
dataset_orig_test_pred.labels[~fav_inds] = dataset_orig_test_pred.unfavorable_label

display(Markdown("#### Validation set"))
display(Markdown("##### Raw predictions - No fairness constraints, only maximizing balanced accuracy"))

metric_valid_bef = compute_metrics(dataset_orig_test, dataset_orig_test_pred, 
                unprivileged_groups, privileged_groups)

#### Validation set

##### Raw predictions - No fairness constraints, only maximizing balanced accuracy

Balanced accuracy = 0.6759
Statistical parity difference = -0.4856
Disparate impact = 0.1348
Average odds difference = -0.4423
Equal opportunity difference = -0.4512
Theil index = 0.1397


In [58]:
# Transform the validation set
dataset_transf_test_pred = ROC.predict(dataset_orig_test_pred)

display(Markdown("#### Validation set"))
display(Markdown("##### Transformed predictions - With fairness constraints"))
metric_valid_aft = compute_metrics(dataset_orig_test, dataset_transf_test_pred, 
                unprivileged_groups, privileged_groups)

#### Validation set

##### Transformed predictions - With fairness constraints

Balanced accuracy = 0.5727
Statistical parity difference = 0.0003
Disparate impact = 1.0005
Average odds difference = 0.0363
Equal opportunity difference = 0.0442
Theil index = 0.1161


In [59]:
assert np.abs(metric_valid_aft["Statistical parity difference"]) <= np.abs(metric_valid_bef["Statistical parity difference"])

## Predictions with Validation set

In [61]:
num_thresh = 100
ba_arr = np.zeros(num_thresh)
class_thresh_arr = np.linspace(0.01, 0.99, num_thresh)
for idx, class_thresh in enumerate(class_thresh_arr):
    
    fav_inds = dataset_orig_valid_pred.scores > class_thresh
    dataset_orig_valid_pred.labels[fav_inds] = dataset_orig_valid_pred.favorable_label
    dataset_orig_valid_pred.labels[~fav_inds] = dataset_orig_valid_pred.unfavorable_label
    
    classified_metric_orig_valid = ClassificationMetric(dataset_orig_valid,
                                             dataset_orig_valid_pred, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
    
    ba_arr[idx] = 0.5*(classified_metric_orig_valid.true_positive_rate()\
                       +classified_metric_orig_valid.true_negative_rate())

best_ind = np.where(ba_arr == np.max(ba_arr))[0][0]
best_class_thresh = class_thresh_arr[best_ind]

print("Best balanced accuracy (no fairness constraints) = %.4f" % np.max(ba_arr))
print("Optimal classification threshold (no fairness constraints) = %.4f" % best_class_thresh)

Best balanced accuracy (no fairness constraints) = 0.6835
Optimal classification threshold (no fairness constraints) = 0.8316


In [62]:
ROC = RejectOptionClassification(unprivileged_groups=unprivileged_groups, 
                                 privileged_groups=privileged_groups, 
                                 low_class_thresh=0.01, high_class_thresh=0.99,
                                  num_class_thresh=100, num_ROC_margin=50,
                                  metric_name="Statistical parity difference",
                                  metric_ub=0.001, metric_lb=-0.001)
ROC = ROC.fit(dataset_orig_valid, dataset_orig_valid_pred)

In [63]:
print("Optimal classification threshold (with fairness constraints) = %.4f" % ROC.classification_threshold)
print("Optimal ROC margin = %.4f" % ROC.ROC_margin)

Optimal classification threshold (with fairness constraints) = 0.5940
Optimal ROC margin = 0.0414


In [64]:
# Metrics for the test set
fav_inds = dataset_orig_valid_pred.scores > best_class_thresh
dataset_orig_valid_pred.labels[fav_inds] = dataset_orig_valid_pred.favorable_label
dataset_orig_valid_pred.labels[~fav_inds] = dataset_orig_valid_pred.unfavorable_label

display(Markdown("#### Validation set"))
display(Markdown("##### Raw predictions - No fairness constraints, only maximizing balanced accuracy"))

metric_valid_bef = compute_metrics(dataset_orig_valid, dataset_orig_valid_pred, 
                unprivileged_groups, privileged_groups)

#### Validation set

##### Raw predictions - No fairness constraints, only maximizing balanced accuracy

Balanced accuracy = 0.6835
Statistical parity difference = -0.4668
Disparate impact = 0.1573
Average odds difference = -0.4324
Equal opportunity difference = -0.4595
Theil index = 0.1351


In [65]:
# Transform the validation set
dataset_transf_valid_pred = ROC.predict(dataset_orig_valid_pred)

display(Markdown("#### Validation set"))
display(Markdown("##### Transformed predictions - With fairness constraints"))
metric_valid_aft = compute_metrics(dataset_orig_valid, dataset_transf_valid_pred, 
                unprivileged_groups, privileged_groups)

#### Validation set

##### Transformed predictions - With fairness constraints

Balanced accuracy = 0.5017
Statistical parity difference = 0.0005
Disparate impact = 1.0005
Average odds difference = 0.0013
Equal opportunity difference = 0.0013
Theil index = 0.0341


In [66]:
assert np.abs(metric_valid_aft["Statistical parity difference"]) <= np.abs(metric_valid_bef["Statistical parity difference"])

### Apply in-processing algorithm based on adversarial learning

In [20]:
sess.close()
tf.reset_default_graph()
sess = tf.Session()

In [21]:
# Learn parameters with debias set to True
debiased_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='debiased_classifier',
                          debias=True,
                          sess=sess)

In [22]:
debiased_model.fit(dataset_orig_train)

epoch 0; iter: 0; batch classifier loss: 0.707958; batch adversarial loss: 0.673177
epoch 0; iter: 200; batch classifier loss: 0.386025; batch adversarial loss: 0.627082
epoch 1; iter: 0; batch classifier loss: 0.379295; batch adversarial loss: 0.651693
epoch 1; iter: 200; batch classifier loss: 0.480181; batch adversarial loss: 0.614757
epoch 2; iter: 0; batch classifier loss: 0.448960; batch adversarial loss: 0.632887
epoch 2; iter: 200; batch classifier loss: 0.437593; batch adversarial loss: 0.612206
epoch 3; iter: 0; batch classifier loss: 0.404891; batch adversarial loss: 0.639086
epoch 3; iter: 200; batch classifier loss: 0.479020; batch adversarial loss: 0.642401
epoch 4; iter: 0; batch classifier loss: 0.383114; batch adversarial loss: 0.583473
epoch 4; iter: 200; batch classifier loss: 0.481365; batch adversarial loss: 0.651561
epoch 5; iter: 0; batch classifier loss: 0.374620; batch adversarial loss: 0.603537
epoch 5; iter: 200; batch classifier loss: 0.412906; batch adversa

epoch 48; iter: 200; batch classifier loss: 0.413488; batch adversarial loss: 0.552195
epoch 49; iter: 0; batch classifier loss: 0.418760; batch adversarial loss: 0.593718
epoch 49; iter: 200; batch classifier loss: 0.446568; batch adversarial loss: 0.588043


<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x1399bcfd0>

In [25]:
debiased_model.pred_probs

<tf.Tensor 'debiased_classifier/Softmax:0' shape=(?, 1) dtype=float32>

In [13]:
# Apply the plain model to test data
dataset_debiasing_train = debiased_model.predict(dataset_orig_train)
dataset_debiasing_test = debiased_model.predict(dataset_orig_test)

In [14]:
# Metrics for the dataset from plain model (without debiasing)
display(Markdown("#### Plain model - without debiasing - dataset metrics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_train.mean_difference())
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_nodebiasing_test.mean_difference())

# Metrics for the dataset from model with debiasing
display(Markdown("#### Model - with debiasing - dataset metrics"))
metric_dataset_debiasing_train = BinaryLabelDatasetMetric(dataset_debiasing_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_debiasing_train.mean_difference())

metric_dataset_debiasing_test = BinaryLabelDatasetMetric(dataset_debiasing_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_dataset_debiasing_test.mean_difference())



display(Markdown("#### Plain model - without debiasing - classification metrics"))
print("Test set: Classification accuracy = %f" % classified_metric_nodebiasing_test.accuracy())
TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
bal_acc_nodebiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_nodebiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_nodebiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_nodebiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_nodebiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_nodebiasing_test.theil_index())



display(Markdown("#### Model - with debiasing - classification metrics"))
classified_metric_debiasing_test = ClassificationMetric(dataset_orig_test, 
                                                 dataset_debiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_debiasing_test.accuracy())
TPR = classified_metric_debiasing_test.true_positive_rate()
TNR = classified_metric_debiasing_test.true_negative_rate()
bal_acc_debiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_debiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_debiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_debiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_debiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_debiasing_test.theil_index())

#### Plain model - without debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.217876
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.221187


#### Model - with debiasing - dataset metrics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.090157
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.094732


#### Plain model - without debiasing - classification metrics

Test set: Classification accuracy = 0.804955
Test set: Balanced classification accuracy = 0.666400
Test set: Disparate impact = 0.000000
Test set: Equal opportunity difference = -0.470687
Test set: Average odds difference = -0.291055
Test set: Theil_index = 0.175113


#### Model - with debiasing - classification metrics

Test set: Classification accuracy = 0.792056
Test set: Balanced classification accuracy = 0.672481
Test set: Disparate impact = 0.553746
Test set: Equal opportunity difference = -0.090716
Test set: Average odds difference = -0.053841
Test set: Theil_index = 0.170358



    References:
    [1] B. H. Zhang, B. Lemoine, and M. Mitchell, "Mitigating UnwantedBiases with Adversarial Learning," 
    AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2018.