# Fairness in healthcare utilization scoring model

<b> Reference - https://nbviewer.jupyter.org/github/IBM/AIF360/blob/master/examples/tutorial_medical_expenditure.ipynb
    </b>
  

### This tutorial demonstrates classification model learning with bias mitigation as a part of a Care Management use case using Medical Expenditure data.

The notebook demonstrates how the AIF 360 toolkit can be used to detect and reduce bias when learning classifiers, using a variety of fairness metrics and algorithms. It also demonstrates how LIME can be used to generate explanations for predictions made by models learnt with the toolkit.

Classifiers are built using Logistic Regression as well as Random Forests.

Bias detection is demonstrated using several metrics, including disparate impact, average odds difference, statistical parity difference, equal opportunity difference, and Theil index.
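To make the two dataset-level metrics concrete, here is a minimal NumPy sketch of disparate impact and statistical parity difference. It is an illustration rather than the AIF 360 implementation: the helper names `base_rate` and `group_fairness` and the toy data are assumptions introduced here, and the privileged group is assumed to be encoded as `1.0`.

```python
import numpy as np

def base_rate(labels, weights, favorable=1.0):
    """Weighted share of instances that carry the favorable label."""
    labels = np.asarray(labels, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights[labels == favorable].sum() / weights.sum()

def group_fairness(labels, protected, weights=None, privileged_value=1.0):
    """Compare favorable-label rates of the unprivileged and privileged groups."""
    labels = np.asarray(labels, dtype=float)
    protected = np.asarray(protected, dtype=float)
    if weights is None:
        weights = np.ones_like(labels)  # unweighted unless instance weights are given
    else:
        weights = np.asarray(weights, dtype=float)
    priv = protected == privileged_value
    rate_priv = base_rate(labels[priv], weights[priv])
    rate_unpriv = base_rate(labels[~priv], weights[~priv])
    return {
        "disparate_impact": rate_unpriv / rate_priv,               # ideal value: 1
        "statistical_parity_difference": rate_unpriv - rate_priv,  # ideal value: 0
    }

# Toy data (made up): four privileged rows followed by four unprivileged rows.
labels = [1, 1, 0, 0, 1, 0, 0, 0]
protected = [1, 1, 1, 1, 0, 0, 0, 0]
result = group_fairness(labels, protected)
print(result)
```

A disparate impact below 1 (or a negative statistical parity difference) means the unprivileged group receives the favorable label less often than the privileged group.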

Bias alleviation is explored via a variety of methods, including reweighing (pre-processing algorithm), prejudice remover (in-processing algorithm), and disparate impact remover (pre-processing technique).

Data from the [Medical Expenditure Panel Survey](https://meps.ahrq.gov/mepsweb/) is used in this tutorial. See [Section 2](#2.-Data-used) below for more details.


## [1.](#Table-of-Contents) Use case

In order to demonstrate how AIF 360 can be used to detect and mitigate bias in classifier models, we adopt the following use case:

1. a data scientist develops a 'fair' healthcare utilization scoring model with respect to defined protected classes. Fairness may be dictated by legal or government regulations, such as a requirement that additional care decisions not be predicated on factors such as the patient's race.


2. a developer takes the model and its performance characteristics/specs (e.g. accuracy and fairness tests; in essence, the model factsheet) and deploys the model in an enterprise app that prioritizes cases for care management.


3. the app is put into production and starts scoring people and making recommendations. 


4. explanations are generated for each recommendation


5. both recommendations and associated explanations are given to nurses as a part of the care management process. The nurses can evaluate the recommendations for quality and correctness and provide feedback.


6. nurse feedback, together with an analysis of usage data against the model's accuracy and fairness specs, is communicated periodically to the AI Ops specialist and the LOB user.


7. when significant drift in model specs relative to the model factsheet is observed, the model is sent back for retraining.

## [2.](#Table-of-Contents) Data used

The specific data used is the [2015 Full Year Consolidated Data File](https://meps.ahrq.gov/mepsweb/data_stats/download_data_files_detail.jsp?cboPufNumber=HC-181) as well as the [2016 Full Year Consolidated Data File](https://meps.ahrq.gov/mepsweb/data_stats/download_data_files_detail.jsp?cboPufNumber=HC-192).

The 2015 file contains data from rounds 3,4,5 of panel 19 (2014) and rounds 1,2,3 of panel 20 (2015). The 2016 file contains data from rounds 3,4,5 of panel 20 (2015) and rounds 1,2,3 of panel 21 (2016).

For this demonstration, three datasets were constructed: one from panel 19, round 5 (used for learning models); one from panel 20, round 3 (used for deployment/testing of the model); and one from panel 21, round 3 (used for re-training and deployment/testing of the updated model).

## [3.](#Table-of-Contents) Training models on original 2015 Panel 19 data

First, load all necessary packages

In [4]:
import os 
import pandas as pd

In [5]:
filepath = os.path.join(os.path.dirname(os.path.abspath("__file__")),
                                '..', 'data', 'raw', 'meps', 'h201.csv')

In [7]:
filepath

'/Users/dhanush/Documents/USC/Fall-2019/INF599/projects/AIF360/examples/../data/raw/meps/h201.csv'

In [10]:
df = pd.read_csv("/Users/dhanush/Documents/USC/Fall-2019/INF599/projects/AIF360/aif360/data/raw/meps/h201.csv", sep=',')

In [45]:
for s in df.columns.values:
    if s.startswith('FTSTU'):
        print(s)

FTSTU31X
FTSTU42X
FTSTU53X
FTSTU17X


In [23]:
cols = df.columns.values

In [38]:
for col in ['FTSTU','ACTDTY','HONRDC','RTHLTH','MNHLTH','HIBPDX','CHDDX','ANGIDX','EDUCYR','HIDEG',
                     'MIDX','OHRTDX','STRKDX','EMPHDX','CHBRON','CHOLDX','CANCERDX','DIABDX',
                     'JTPAIN','ARTHDX','ARTHTYPE','ASTHDX','ADHDADDX','PREGNT','SOCLIM','COGLIM','DFHEAR42','DFSEE42','ADSMOK42',
                     'PHQ242']:
    if col not in cols:
        print(col)
#         print(col in cols)

FTSTU
ACTDTY
HONRDC
RTHLTH
MNHLTH
CHBRON
JTPAIN
PREGNT
SOCLIM
COGLIM


In [14]:
default_mappings = {
    'label_maps': [{1.0: '>= 10 Visits', 0.0: '< 10 Visits'}],
    'protected_attribute_maps': [{1.0: 'White', 0.0: 'Non-White'}]
}

In [32]:
def default_preprocessing(df):
    """
    1. Create a new column, RACE, that is 'White' if RACEV2X = 1 and HISPANX = 2 (i.e. non-Hispanic White)
       and 'Non-White' otherwise
    2. Restrict to Panel 21
    3. RENAME all columns that are PANEL/ROUND SPECIFIC
    4. Drop rows based on certain values of individual features that correspond to missing/unknown - generally < -1
    5. Compute UTILIZATION, binarize it to 0 (< 10) and 1 (>= 10)
    """
    def race(row):
        if ((row['HISPANX'] == 2) and (row['RACEV2X'] == 1)):  #non-Hispanic Whites are marked as WHITE; all others as NON-WHITE
            return 'White'
        return 'Non-White'

    df['RACEV2X'] = df.apply(lambda row: race(row), axis=1)
    df = df.rename(columns = {'RACEV2X' : 'RACE'})

    df = df[df['PANEL'] == 21]

    # RENAME COLUMNS
    df = df.rename(columns = {'FTSTU53X' : 'FTSTU', 'ACTDTY53' : 'ACTDTY', 'HONRDC53' : 'HONRDC', 'RTHLTH53' : 'RTHLTH',
                              'MNHLTH53' : 'MNHLTH', 'CHBRON53' : 'CHBRON', 'JTPAIN53' : 'JTPAIN', 'PREGNT53' : 'PREGNT',
                              'WLKLIM53' : 'WLKLIM', 'ACTLIM53' : 'ACTLIM', 'SOCLIM53' : 'SOCLIM', 'COGLIM53' : 'COGLIM',
                              'EMPST53' : 'EMPST', 'REGION53' : 'REGION', 'MARRY53X' : 'MARRY', 'AGE53X' : 'AGE',
                              'POVCAT16' : 'POVCAT', 'INSCOV16' : 'INSCOV'})

    df = df[df['REGION'] >= 0] # remove values -1
    df = df[df['AGE'] >= 0] # remove values -1

    df = df[df['MARRY'] >= 0] # remove values -1, -7, -8, -9

    df = df[df['ASTHDX'] >= 0] # remove values -1, -7, -8, -9

    df = df[(df[['FTSTU','ACTDTY','HONRDC','RTHLTH','MNHLTH','HIBPDX','CHDDX','ANGIDX','EDUCYR','HIDEG',
                     'MIDX','OHRTDX','STRKDX','EMPHDX','CHBRON','CHOLDX','CANCERDX','DIABDX',
                     'JTPAIN','ARTHDX','ARTHTYPE','ASTHDX','ADHDADDX','PREGNT','SOCLIM','COGLIM','DFHEAR42','DFSEE42','ADSMOK42',
                     'PHQ242']] >= -1).all(1)]  #for all other categorical features, remove values < -1

    def utilization(row):
        return row['OBTOTV16'] + row['OPTOTV16'] + row['ERTOT16'] + row['IPNGTD16'] + row['HHTOTD16']

    df['TOTEXP16'] = df.apply(lambda row: utilization(row), axis=1)
    lessE = df['TOTEXP16'] < 10.0
    df.loc[lessE,'TOTEXP16'] = 0.0
    moreE = df['TOTEXP16'] >= 10.0
    df.loc[moreE,'TOTEXP16'] = 1.0

    df = df.rename(columns = {'TOTEXP16' : 'UTILIZATION'})
    return df
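
The row-wise `apply` calls in `default_preprocessing` can be slow on a file of this size. The sketch below shows one possible vectorized equivalent of the RACE and UTILIZATION steps only. The function name `add_race_and_utilization` and the toy frame are assumptions made for illustration; the column names and the threshold of 10 visits are taken from the cell above.

```python
import numpy as np
import pandas as pd

# Visit-count columns summed by utilization() in the cell above.
VISIT_COLS = ['OBTOTV16', 'OPTOTV16', 'ERTOT16', 'IPNGTD16', 'HHTOTD16']

def add_race_and_utilization(df, visit_cols=VISIT_COLS, threshold=10):
    """Hypothetical helper: vectorized RACE and binarized UTILIZATION columns."""
    out = df.copy()
    # Non-Hispanic Whites are marked 'White'; everyone else 'Non-White'.
    is_white = (out['HISPANX'] == 2) & (out['RACEV2X'] == 1)
    out['RACE'] = np.where(is_white, 'White', 'Non-White')
    # Total utilization, binarized to 1.0 (>= threshold) or 0.0 (< threshold).
    total = out[visit_cols].sum(axis=1)
    out['UTILIZATION'] = (total >= threshold).astype(float)
    return out

# Toy rows (made up) to exercise both branches.
toy = pd.DataFrame({
    'HISPANX': [2, 1, 2],
    'RACEV2X': [1, 1, 2],
    'OBTOTV16': [9, 3, 0],
    'OPTOTV16': [1, 0, 0],
    'ERTOT16': [0, 0, 0],
    'IPNGTD16': [0, 0, 0],
    'HHTOTD16': [0, 0, 12],
})
processed = add_race_and_utilization(toy)
print(processed[['RACE', 'UTILIZATION']])
```

Unlike the original function, this sketch leaves `RACEV2X` in place and writes to new columns, which keeps the raw values available for checking.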

In [28]:
label_name='UTILIZATION'
favorable_classes=[1.0]
protected_attribute_names=['RACE']
privileged_classes=[['White']]
instance_weights_name='PERWT16F'
categorical_features=['REGION','SEX','MARRY', 'FTSTU','ACTDTY','HONRDC','RTHLTH','MNHLTH','HIBPDX','CHDDX','ANGIDX',
                      'MIDX','OHRTDX','STRKDX','EMPHDX','CHBRON','CHOLDX','CANCERDX','DIABDX',
                      'JTPAIN','ARTHDX','ARTHTYPE','ASTHDX','ADHDADDX','PREGNT','WLKLIM',
                      'ACTLIM','SOCLIM','COGLIM','DFHEAR42','DFSEE42', 'ADSMOK42', 'PHQ242',
                      'EMPST','POVCAT','INSCOV']
features_to_keep=['REGION','AGE','SEX','RACE','MARRY',
                     'FTSTU','ACTDTY','HONRDC','RTHLTH','MNHLTH','HIBPDX','CHDDX','ANGIDX',
                     'MIDX','OHRTDX','STRKDX','EMPHDX','CHOLDX','CANCERDX','DIABDX',
                     'ARTHDX','ARTHTYPE','ASTHDX','ADHDADDX','DFHEAR42','DFSEE42','ADSMOK42', 'PCS42',
                     'MCS42','K6SUM42','PHQ242','EMPST','UTILIZATION', 'PERWT16F']
features_to_drop=[]
na_values=[]
custom_preprocessing=default_preprocessing
metadata=default_mappings

In [29]:
filepath="/Users/dhanush/Documents/USC/Fall-2019/INF599/projects/AIF360/aif360/data/raw/meps/h201.csv"

In [19]:
df = pd.read_csv(filepath, sep=',', na_values=na_values)

In [20]:
from aif360.datasets import StandardDataset

In [33]:
x = StandardDataset(df=df, label_name=label_name,
            favorable_classes=favorable_classes,
            protected_attribute_names=protected_attribute_names,
            privileged_classes=privileged_classes,
            instance_weights_name=instance_weights_name,
            categorical_features=categorical_features,
            features_to_keep=features_to_keep,
            features_to_drop=features_to_drop, na_values=na_values,
            custom_preprocessing=custom_preprocessing, metadata=metadata)

KeyError: "['WLKLIM', 'ACTLIM', 'COGLIM', 'JTPAIN', 'POVCAT', 'CHBRON', 'INSCOV', 'SOCLIM', 'PREGNT'] not in index"
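
The `KeyError` says that some expected columns are absent from the DataFrame. The earlier probe found only round-specific names such as `FTSTU17X`, so the rename map in `default_preprocessing` may simply not match this file. A small check like the following, run before building the dataset, makes the mismatch visible. The helper names `missing_columns` and `columns_with_prefix` are assumptions introduced here, and the toy frame only mimics the column names printed above.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the required column names that are absent from df, in order."""
    present = set(df.columns)
    return [c for c in required if c not in present]

def columns_with_prefix(df, prefix):
    """List the columns whose names start with the given prefix."""
    return [c for c in df.columns if c.startswith(prefix)]

# Toy frame with column names like those seen in the probe above.
toy = pd.DataFrame(columns=['FTSTU31X', 'FTSTU42X', 'FTSTU53X', 'FTSTU17X', 'HIBPDX'])

absent = missing_columns(toy, ['FTSTU', 'HIBPDX', 'CHBRON'])
candidates = columns_with_prefix(toy, 'FTSTU')
print(absent)
print(candidates)
```

Listing the candidates for each missing name shows which suffixed column would have to be renamed, or whether the variable is absent from the file altogether.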

In [38]:
import sys
sys.path.insert(0, '../')

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import Markdown, display

# Datasets
from aif360.datasets import MEPSDataset19
from aif360.datasets import MEPSDataset20
from aif360.datasets import MEPSDataset21

# Fairness metrics
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.metrics import ClassificationMetric

# Explainers
from aif360.explainers import MetricTextExplainer

# Scalers
from sklearn.preprocessing import StandardScaler

# Classifiers
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Bias mitigation techniques
from aif360.algorithms.preprocessing import Reweighing

np.random.seed(1)

In [39]:
from sklearn.preprocessing import StandardScaler, MaxAbsScaler
import tensorflow as tf
from aif360.algorithms.inprocessing.adversarial_debiasing import AdversarialDebiasing

### 3.1. Load data & create splits for learning/validating/testing model

Get the dataset and split into train (70%), test (30%)

In [40]:
# NOTE: the variables keep the "panel19" names, but the data loaded here is MEPSDataset21.
(dataset_orig_panel19_train,
 dataset_orig_panel19_test) = MEPSDataset21().split([0.7], shuffle=True)

sens_ind = 0
sens_attr = dataset_orig_panel19_train.protected_attribute_names[sens_ind]

unprivileged_groups = [{sens_attr: v} for v in
                       dataset_orig_panel19_train.unprivileged_protected_attributes[sens_ind]]
privileged_groups = [{sens_attr: v} for v in
                     dataset_orig_panel19_train.privileged_protected_attributes[sens_ind]]
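
The group definitions built above are plain lists of dictionaries that map the protected attribute name to one value. The sketch below reproduces that shape without AIF 360, using the attribute name and values this notebook reports for the dataset (`RACE`, with 1.0 privileged and 0.0 unprivileged). The helpers `make_groups` and `in_group` are assumptions for illustration and not part of the toolkit.

```python
import numpy as np

def make_groups(attr_name, values):
    """One {attribute: value} dict per protected-attribute value."""
    return [{attr_name: float(v)} for v in values]

def in_group(protected_column, groups, attr_name):
    """Boolean mask of rows whose attribute value matches any group dict."""
    wanted = [g[attr_name] for g in groups]
    return np.isin(protected_column, wanted)

sens_attr = 'RACE'
privileged_groups = make_groups(sens_attr, np.array([1.0]))
unprivileged_groups = make_groups(sens_attr, np.array([0.0]))
print(privileged_groups, unprivileged_groups)

# Toy protected-attribute column (made up).
race = np.array([1.0, 0.0, 0.0, 1.0])
mask = in_group(race, privileged_groups, sens_attr)
print(mask)
```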



This function will be used throughout the notebook to print out some labels, names, etc.

In [41]:
def describe(train=None, val=None, test=None):
    """Print shapes, labels, and protected-attribute details of the given datasets."""
    if train is not None:
        display(Markdown("#### Training Dataset shape"))
        print(train.features.shape)
    if val is not None:
        display(Markdown("#### Validation Dataset shape"))
        print(val.features.shape)
    if test is None:
        return  # the remaining details are read from the test set
    display(Markdown("#### Test Dataset shape"))
    print(test.features.shape)
    display(Markdown("#### Favorable and unfavorable labels"))
    print(test.favorable_label, test.unfavorable_label)
    display(Markdown("#### Protected attribute names"))
    print(test.protected_attribute_names)
    display(Markdown("#### Privileged and unprivileged protected attribute values"))
    print(test.privileged_protected_attributes,
          test.unprivileged_protected_attributes)
    display(Markdown("#### Dataset feature names"))
    print(test.feature_names)

Show dataset details (the split above loads `MEPSDataset21`, despite the `panel19` variable names)

In [42]:
describe(dataset_orig_panel19_train, None, dataset_orig_panel19_test)

#### Training Dataset shape

(10972, 138)


#### Test Dataset shape

(4703, 138)


#### Favorable and unfavorable labels

1.0 0.0


#### Protected attribute names

['RACE']


#### Privileged and unprivileged protected attribute values

[array([1.])] [array([0.])]


#### Dataset feature names

['AGE', 'RACE', 'PCS42', 'MCS42', 'K6SUM42', 'REGION=1', 'REGION=2', 'REGION=3', 'REGION=4', 'SEX=1', 'SEX=2', 'MARRY=1', 'MARRY=2', 'MARRY=3', 'MARRY=4', 'MARRY=5', 'MARRY=6', 'MARRY=7', 'MARRY=8', 'MARRY=9', 'MARRY=10', 'FTSTU=-1', 'FTSTU=1', 'FTSTU=2', 'FTSTU=3', 'ACTDTY=1', 'ACTDTY=2', 'ACTDTY=3', 'ACTDTY=4', 'HONRDC=1', 'HONRDC=2', 'HONRDC=3', 'HONRDC=4', 'RTHLTH=-1', 'RTHLTH=1', 'RTHLTH=2', 'RTHLTH=3', 'RTHLTH=4', 'RTHLTH=5', 'MNHLTH=-1', 'MNHLTH=1', 'MNHLTH=2', 'MNHLTH=3', 'MNHLTH=4', 'MNHLTH=5', 'HIBPDX=-1', 'HIBPDX=1', 'HIBPDX=2', 'CHDDX=-1', 'CHDDX=1', 'CHDDX=2', 'ANGIDX=-1', 'ANGIDX=1', 'ANGIDX=2', 'MIDX=-1', 'MIDX=1', 'MIDX=2', 'OHRTDX=-1', 'OHRTDX=1', 'OHRTDX=2', 'STRKDX=-1', 'STRKDX=1', 'STRKDX=2', 'EMPHDX=-1', 'EMPHDX=1', 'EMPHDX=2', 'CHBRON=-1', 'CHBRON=1', 'CHBRON=2', 'CHOLDX=-1', 'CHOLDX=1', 'CHOLDX=2', 'CANCERDX=-1', 'CANCERDX=1', 'CANCERDX=2', 'DIABDX=-1', 'DIABDX=1', 'DIABDX=2', 'JTPAIN=-1', 'JTPAIN=1', 'JTPAIN=2', 'ARTHDX=-1', 'ARTHDX=1', 'ARTHDX=2', 'ARTHTYPE=-1'

Metrics for original data

In [43]:
metric_orig_panel19_train = BinaryLabelDatasetMetric(
        dataset_orig_panel19_train,
        unprivileged_groups=unprivileged_groups,
        privileged_groups=privileged_groups)
explainer_orig_panel19_train = MetricTextExplainer(metric_orig_panel19_train)

print(explainer_orig_panel19_train.disparate_impact())

Disparate impact (probability of favorable outcome for unprivileged instances / probability of favorable outcome for privileged instances): 0.4820224780986613


# Adversarial Debiasing without debias

In [44]:
# Metric for the original dataset
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_panel19_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Original training dataset"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())
metric_orig_test = BinaryLabelDatasetMetric(dataset_orig_panel19_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_test.mean_difference())

#### Original training dataset

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.131976
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.115898


In [45]:
min_max_scaler = MaxAbsScaler()
dataset_orig_panel19_train.features = min_max_scaler.fit_transform(dataset_orig_panel19_train.features)
dataset_orig_panel19_test.features = min_max_scaler.transform(dataset_orig_panel19_test.features)
metric_scaled_train = BinaryLabelDatasetMetric(dataset_orig_panel19_train, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
display(Markdown("#### Scaled dataset - Verify that the scaling does not affect the group label statistics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_train.mean_difference())
metric_scaled_test = BinaryLabelDatasetMetric(dataset_orig_panel19_test, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_test.mean_difference())


#### Scaled dataset - Verify that the scaling does not affect the group label statistics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.131976
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.115898
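
The mean difference is unchanged because it is computed from labels and group membership, while the scaler only rescales feature values. As a rough sketch of what max-abs scaling does, the NumPy function below divides each column by its largest absolute value in the training set and reuses that scale for the test set. `max_abs_scale` is a stand-in written for illustration, not the scikit-learn class itself.

```python
import numpy as np

def max_abs_scale(train, test):
    """Scale columns by the training set's per-column maximum absolute value."""
    scale = np.abs(train).max(axis=0)
    scale[scale == 0] = 1.0  # leave all-zero columns untouched
    return train / scale, test / scale

# Toy feature matrices (made up): an age-like column, a 0/1 column, a constant column.
train = np.array([[50.0, 1.0, 0.0],
                  [25.0, 0.0, 0.0],
                  [-100.0, 1.0, 0.0]])
test = np.array([[75.0, 1.0, 0.0]])

train_scaled, test_scaled = max_abs_scale(train, test)
print(train_scaled)
print(test_scaled)
```

Columns that are already 0/1, such as the one-hot encoded categorical features, pass through unchanged, so only the numeric columns are affected.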


In [46]:
tf.reset_default_graph()
sess = tf.Session()
# Learn parameters with debias set to False (plain classifier, no adversary)
debiased_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='debiased_classifier',
                          debias=False,
                          sess=sess)

In [47]:
debiased_model.fit(dataset_orig_panel19_train)

epoch 0; iter: 0; batch classifier loss: 0.681122
epoch 1; iter: 0; batch classifier loss: 0.376445
epoch 2; iter: 0; batch classifier loss: 0.285735
epoch 3; iter: 0; batch classifier loss: 0.398125
epoch 4; iter: 0; batch classifier loss: 0.225464
epoch 5; iter: 0; batch classifier loss: 0.453644
epoch 6; iter: 0; batch classifier loss: 0.348539
epoch 7; iter: 0; batch classifier loss: 0.433853
epoch 8; iter: 0; batch classifier loss: 0.363735
epoch 9; iter: 0; batch classifier loss: 0.247386
epoch 10; iter: 0; batch classifier loss: 0.229941
epoch 11; iter: 0; batch classifier loss: 0.420820
epoch 12; iter: 0; batch classifier loss: 0.279161
epoch 13; iter: 0; batch classifier loss: 0.267856
epoch 14; iter: 0; batch classifier loss: 0.178618
epoch 15; iter: 0; batch classifier loss: 0.268303
epoch 16; iter: 0; batch classifier loss: 0.314311
epoch 17; iter: 0; batch classifier loss: 0.321154
epoch 18; iter: 0; batch classifier loss: 0.255264
epoch 19; iter: 0; batch classifier loss:

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x133faf358>

In [48]:
dataset_debiasing_test = debiased_model.predict(dataset_orig_panel19_test)

In [49]:
# Metrics for the test-set predictions of the model trained without debiasing

display(Markdown("#### Model - without debiasing - classification metrics"))
classified_metric_nodebiasing_test = ClassificationMetric(dataset_orig_panel19_test, 
                                                 dataset_debiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_nodebiasing_test.accuracy())
TPR = classified_metric_nodebiasing_test.true_positive_rate()
TNR = classified_metric_nodebiasing_test.true_negative_rate()
bal_acc_debiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_debiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_nodebiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_nodebiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_nodebiasing_test.average_odds_difference())
print("Test set: Statistical parity difference = %f" % classified_metric_nodebiasing_test.statistical_parity_difference())
print("Test set: Theil_index = %f" % classified_metric_nodebiasing_test.theil_index())

#### Model - without debiasing - classification metrics

Test set: Classification accuracy = 0.827398
Test set: Balanced classification accuracy = 0.639508
Test set: Disparate impact = 0.399601
Test set: Equal opportunity difference = -0.144565
Test set: Average odds difference = -0.096133
Test set: Statistical parity difference = -0.092544
Test set: Theil_index = 0.128564
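
The gap between plain accuracy (0.83) and balanced accuracy (0.64) is typical when the favorable class is rare: a model can score well overall while missing most positives. The unweighted sketch below shows the calculation on made-up labels. The function name is an assumption, and instance weights are ignored here, whereas the toolkit's metrics are expected to take them into account.

```python
import numpy as np

def accuracy_and_balanced_accuracy(y_true, y_pred):
    """Return (accuracy, balanced accuracy) for binary 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = (tp + tn) / len(y_true)
    tpr = tp / (tp + fn)  # true positive rate (recall on the positive class)
    tnr = tn / (tn + fp)  # true negative rate (recall on the negative class)
    return accuracy, 0.5 * (tpr + tnr)

# Toy data (made up): 8 negatives, 2 positives; the model finds only one positive.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]
acc, bal_acc = accuracy_and_balanced_accuracy(y_true, y_pred)
print(acc, bal_acc)
```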


# Adversarial Debiasing

In [50]:
# Metric for the original dataset
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_panel19_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Original training dataset"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())
metric_orig_test = BinaryLabelDatasetMetric(dataset_orig_panel19_test, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_test.mean_difference())

#### Original training dataset

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.131976
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.115898


In [51]:
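# If the scaling cell above has already run, the features are already in [-1, 1]
# and fitting MaxAbsScaler again leaves them unchanged.
min_max_scaler = MaxAbsScaler()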
dataset_orig_panel19_train.features = min_max_scaler.fit_transform(dataset_orig_panel19_train.features)
dataset_orig_panel19_test.features = min_max_scaler.transform(dataset_orig_panel19_test.features)
metric_scaled_train = BinaryLabelDatasetMetric(dataset_orig_panel19_train, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
display(Markdown("#### Scaled dataset - Verify that the scaling does not affect the group label statistics"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_train.mean_difference())
metric_scaled_test = BinaryLabelDatasetMetric(dataset_orig_panel19_test, 
                             unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)
print("Test set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_scaled_test.mean_difference())


#### Scaled dataset - Verify that the scaling does not affect the group label statistics

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.131976
Test set: Difference in mean outcomes between unprivileged and privileged groups = -0.115898


In [52]:
tf.reset_default_graph()
sess = tf.Session()
# Learn parameters with debias set to True
debiased_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='debiased_classifier',
                          debias=True,
                          sess=sess)

In [53]:
debiased_model.fit(dataset_orig_panel19_train)

epoch 0; iter: 0; batch classifier loss: 0.488154; batch adversarial loss: 0.697556
epoch 1; iter: 0; batch classifier loss: 0.284598; batch adversarial loss: 0.685454
epoch 2; iter: 0; batch classifier loss: 0.449048; batch adversarial loss: 0.690505
epoch 3; iter: 0; batch classifier loss: 0.306396; batch adversarial loss: 0.661280
epoch 4; iter: 0; batch classifier loss: 0.363295; batch adversarial loss: 0.653064
epoch 5; iter: 0; batch classifier loss: 0.317632; batch adversarial loss: 0.675618
epoch 6; iter: 0; batch classifier loss: 0.400940; batch adversarial loss: 0.661843
epoch 7; iter: 0; batch classifier loss: 0.361443; batch adversarial loss: 0.676482
epoch 8; iter: 0; batch classifier loss: 0.282645; batch adversarial loss: 0.636950
epoch 9; iter: 0; batch classifier loss: 0.264637; batch adversarial loss: 0.651169
epoch 10; iter: 0; batch classifier loss: 0.212423; batch adversarial loss: 0.635829
epoch 11; iter: 0; batch classifier loss: 0.369639; batch adversarial loss:

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x13a20b978>

In [54]:
dataset_debiasing_test = debiased_model.predict(dataset_orig_panel19_test)

In [55]:
# Metrics for the dataset from model with debiasing

display(Markdown("#### Model - with debiasing - classification metrics"))
classified_metric_debiasing_test = ClassificationMetric(dataset_orig_panel19_test, 
                                                 dataset_debiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_debiasing_test.accuracy())
TPR = classified_metric_debiasing_test.true_positive_rate()
TNR = classified_metric_debiasing_test.true_negative_rate()
bal_acc_debiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_debiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_debiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_debiasing_test.equal_opportunity_difference())
print("Test set: Average odds difference = %f" % classified_metric_debiasing_test.average_odds_difference())
print("Test set: Statistical parity difference = %f" % classified_metric_debiasing_test.statistical_parity_difference())
print("Test set: Theil_index = %f" % classified_metric_debiasing_test.theil_index())

#### Model - with debiasing - classification metrics

Test set: Classification accuracy = 0.829221
Test set: Balanced classification accuracy = 0.596823
Test set: Disparate impact = 1.254862
Test set: Equal opportunity difference = 0.114900
Test set: Average odds difference = 0.071529
Test set: Statistical parity difference = 0.017232
Test set: Theil_index = 0.135146
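
For readers unfamiliar with the Theil index, the sketch below assumes the definition commonly attributed to AIF 360: the generalized entropy index with alpha = 1, computed over per-instance benefits b_i = ŷ_i − y_i + 1. That definition is an assumption here and should be checked against the toolkit documentation; the function is an unweighted illustration, not the library code.

```python
import numpy as np

def theil_index(y_true, y_pred):
    """Theil index over benefits b = y_pred - y_true + 1 (assumed definition)."""
    b = 1.0 + np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    ratio = b / b.mean()
    # By convention 0 * log(0) is taken as 0, so zero-benefit rows contribute nothing.
    terms = np.zeros_like(ratio)
    positive = ratio > 0
    terms[positive] = ratio[positive] * np.log(ratio[positive])
    return terms.mean()

# Perfect predictions give every instance the same benefit, so the index is 0.
perfect = theil_index([1, 0, 1, 0], [1, 0, 1, 0])

# One false negative (made-up data) makes the benefits unequal.
one_miss = theil_index([1, 1, 0, 0], [1, 0, 0, 0])
print(perfect, one_miss)
```

Under this definition the index measures how unevenly prediction errors are spread over individuals, which is why it changes little between the three models in this section.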


# Adversarial Debiasing with weighted dataset

In [56]:
RW = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)
dataset_transf_panel19_train = RW.fit_transform(dataset_orig_panel19_train)

In [57]:
# Metric for the transformed (reweighed) dataset
metric_orig_train = BinaryLabelDatasetMetric(dataset_transf_panel19_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Transformed training dataset"))
print("Train set: Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())

#### Transformed training dataset

Train set: Difference in mean outcomes between unprivileged and privileged groups = -0.000000
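
The mean difference drops to zero because reweighing assigns each (group, label) cell the weight P(group) · P(label) / P(group, label), which makes the label statistically independent of the group under the new weights. The sketch below illustrates the idea on made-up data, starting from uniform weights. The function names are assumptions, and the real algorithm also takes the existing instance weights into account.

```python
import numpy as np

def reweighing_weights(labels, protected):
    """Weight per row: expected cell probability under independence / observed."""
    labels = np.asarray(labels)
    protected = np.asarray(protected)
    weights = np.ones(len(labels), dtype=float)
    for g in np.unique(protected):
        for y in np.unique(labels):
            cell = (protected == g) & (labels == y)
            if cell.any():
                expected = (protected == g).mean() * (labels == y).mean()
                weights[cell] = expected / cell.mean()
    return weights

def weighted_mean_difference(labels, protected, weights,
                             privileged_value=1, favorable=1):
    """Favorable rate of the unprivileged group minus that of the privileged group."""
    def rate(mask):
        return weights[mask & (labels == favorable)].sum() / weights[mask].sum()
    priv = protected == privileged_value
    return rate(~priv) - rate(priv)

# Toy data (made up): the privileged group gets the favorable label more often.
labels = np.array([1, 1, 1, 0, 1, 0, 0, 0])
protected = np.array([1, 1, 1, 1, 0, 0, 0, 0])

w = reweighing_weights(labels, protected)
before = weighted_mean_difference(labels, protected, np.ones(len(labels)))
after = weighted_mean_difference(labels, protected, w)
print(before, after)
```

Note that only the training weights change; features and labels stay as they are, which is why the test-set metrics still have to be checked after training.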


In [58]:
tf.reset_default_graph()
sess = tf.Session()
# Learn parameters with debias set to True
weighted_debiased_model = AdversarialDebiasing(privileged_groups = privileged_groups,
                          unprivileged_groups = unprivileged_groups,
                          scope_name='debiased_classifier',
                          debias=True,
                          sess=sess)

In [59]:
weighted_debiased_model.fit(dataset_transf_panel19_train)

epoch 0; iter: 0; batch classifier loss: 0.706876; batch adversarial loss: 0.724897
epoch 1; iter: 0; batch classifier loss: 0.267081; batch adversarial loss: 0.688839
epoch 2; iter: 0; batch classifier loss: 0.299851; batch adversarial loss: 0.692279
epoch 3; iter: 0; batch classifier loss: 0.277771; batch adversarial loss: 0.685785
epoch 4; iter: 0; batch classifier loss: 0.292031; batch adversarial loss: 0.657146
epoch 5; iter: 0; batch classifier loss: 0.404674; batch adversarial loss: 0.645057
epoch 6; iter: 0; batch classifier loss: 0.328264; batch adversarial loss: 0.643620
epoch 7; iter: 0; batch classifier loss: 0.376575; batch adversarial loss: 0.664279
epoch 8; iter: 0; batch classifier loss: 0.264868; batch adversarial loss: 0.654066
epoch 9; iter: 0; batch classifier loss: 0.382271; batch adversarial loss: 0.673713
epoch 10; iter: 0; batch classifier loss: 0.288312; batch adversarial loss: 0.663045
epoch 11; iter: 0; batch classifier loss: 0.358902; batch adversarial loss:

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x135e5d668>

In [60]:
dataset_debiasing_test = weighted_debiased_model.predict(dataset_orig_panel19_test)

In [65]:
# Metrics for the dataset from model with debiasing

display(Markdown("#### Model - with weighted + debiasing - classification metrics"))
classified_metric_weighted_debiasing_test = ClassificationMetric(dataset_orig_panel19_test, 
                                                 dataset_debiasing_test,
                                                 unprivileged_groups=unprivileged_groups,
                                                 privileged_groups=privileged_groups)
print("Test set: Classification accuracy = %f" % classified_metric_weighted_debiasing_test.accuracy())
TPR = classified_metric_weighted_debiasing_test.true_positive_rate()
TNR = classified_metric_weighted_debiasing_test.true_negative_rate()
bal_acc_debiasing_test = 0.5*(TPR+TNR)
print("Test set: Balanced classification accuracy = %f" % bal_acc_debiasing_test)
print("Test set: Disparate impact = %f" % classified_metric_weighted_debiasing_test.disparate_impact())
print("Test set: Equal opportunity difference = %f" % classified_metric_weighted_debiasing_test.equal_opportunity_difference())
print("Test set: Statistical parity difference = %f" % classified_metric_weighted_debiasing_test.statistical_parity_difference())
print("Test set: Average odds difference = %f" % classified_metric_weighted_debiasing_test.average_odds_difference())
print("Test set: Theil_index = %f" % classified_metric_weighted_debiasing_test.theil_index())

#### Model - with weighted + debiasing - classification metrics

Test set: Classification accuracy = 0.832999
Test set: Balanced classification accuracy = 0.604049
Test set: Disparate impact = 1.023957
Test set: Equal opportunity difference = 0.091768
Test set: Statistical parity difference = 0.001787
Test set: Average odds difference = 0.053679
Test set: Theil_index = 0.133067


In [64]:
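# Equality of odds check: print FPR, then FNR, for the privileged (True) and unprivileged (False) race groups;
# the closer the two values in each row, the closer the model is to equalized odds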
print("Equality of odds \nWith Weighted + Debiasing")
print(classified_metric_weighted_debiasing_test.false_positive_rate(True), 
      classified_metric_weighted_debiasing_test.false_positive_rate(False))
print(classified_metric_weighted_debiasing_test.false_negative_rate(True), 
      classified_metric_weighted_debiasing_test.false_negative_rate(False))

print("\nWith Debiasing")
print(classified_metric_debiasing_test.false_positive_rate(True), 
      classified_metric_debiasing_test.false_positive_rate(False))
print(classified_metric_debiasing_test.false_negative_rate(True), 
      classified_metric_debiasing_test.false_negative_rate(False))


print("\nWithout Debiasing")
print(classified_metric_nodebiasing_test.false_positive_rate(True), 
      classified_metric_nodebiasing_test.false_positive_rate(False))
print(classified_metric_nodebiasing_test.false_negative_rate(True), 
      classified_metric_nodebiasing_test.false_negative_rate(False))

Equality of odds 
With Weighted + Debiasing
0.030981145983376583 0.046572472080316826
0.7768704393550616 0.6851028694534136

With Debiasing
0.027327885013855174 0.055485851503682536
0.7952258799301948 0.6803255530258768

Without Debiasing
0.08738166664117543 0.03968100112631606
0.6185764304283332 0.7631418948880946
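
The printed rates can be tied back to the summary metrics reported earlier. Taking the first value in each row as the privileged group (the `True` argument), the sketch below recomputes the equal opportunity difference and the average odds difference from the FPR/FNR pairs copied from the output above. The helper `odds_gaps` is an assumption introduced for this check.

```python
# (FPR, FNR) per group, copied from the printed output; "priv" is the True argument.
rates = {
    "weighted + debiasing": {
        "priv": (0.030981145983376583, 0.7768704393550616),
        "unpriv": (0.046572472080316826, 0.6851028694534136),
    },
    "debiasing": {
        "priv": (0.027327885013855174, 0.7952258799301948),
        "unpriv": (0.055485851503682536, 0.6803255530258768),
    },
    "no debiasing": {
        "priv": (0.08738166664117543, 0.6185764304283332),
        "unpriv": (0.03968100112631606, 0.7631418948880946),
    },
}

def odds_gaps(priv, unpriv):
    """Unprivileged-minus-privileged gaps in FPR and TPR, plus their average."""
    fpr_p, fnr_p = priv
    fpr_u, fnr_u = unpriv
    fpr_gap = fpr_u - fpr_p
    tpr_gap = (1 - fnr_u) - (1 - fnr_p)  # equal opportunity difference
    return {
        "fpr_gap": fpr_gap,
        "equal_opportunity_difference": tpr_gap,
        "average_odds_difference": 0.5 * (fpr_gap + tpr_gap),
    }

gaps = {name: odds_gaps(**groups) for name, groups in rates.items()}
for name, values in gaps.items():
    print(name, {k: round(v, 6) for k, v in values.items()})
```

The recomputed values agree with the reported metrics to the printed precision. They also show that the debiased models reverse the sign of the gaps rather than closing them completely, with a TPR gap of roughly 0.09 to 0.11 remaining.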


In [20]:
## End of Adv Learning

## [10.](#Table-of-Contents) SUMMARY

In [75]:
results = [lr_orig_metrics, lr_transf_metrics,
           lr_transf_metrics_panel20_deploy,
           lr_transf_metrics_panel21_deploy,
           lr_transf_metrics_panel20_test,
           lr_transf_panel20_metrics_panel21_deploy]
debias = pd.Series([''] + ['Reweighing']*5, name='Bias Mitigator')
clf = pd.Series(['Logistic Regression']*6, name='Classifier')
tr = pd.Series(['Panel19']*4 + ['Panel20']*2, name='Training set')
te = pd.Series(['Panel19']*2 + ['Panel20', 'Panel21']*2, name='Testing set')
pd.concat([pd.DataFrame(m) for m in results], axis=0).set_index([debias, clf, tr, te])

| Bias Mitigator | Classifier | Training set | Testing set | avg_odds_diff | bal_acc | disp_imp | eq_opp_diff | stat_par_diff | theil_ind |
|---|---|---|---|---|---|---|---|---|---|
| | Logistic Regression | Panel19 | Panel19 | -0.205706 | 0.775935 | 0.426176 | -0.222779 | -0.261207 | 0.092122 |
| Reweighing | Logistic Regression | Panel19 | Panel19 | -0.015104 | 0.753893 | 0.751755 | -0.003518 | -0.087196 | 0.096575 |
| Reweighing | Logistic Regression | Panel19 | Panel20 | 0.007135 | 0.731136 | 0.805724 | 0.030262 | -0.059602 | 0.10191 |
| Reweighing | Logistic Regression | Panel19 | Panel21 | -0.01434 | 0.737916 | 0.744126 | -0.004405 | -0.081262 | 0.09942 |
| Reweighing | Logistic Regression | Panel20 | Panel20 | 0.004045 | 0.731345 | 0.825168 | 0.041814 | -0.069257 | 0.096305 |
| Reweighing | Logistic Regression | Panel20 | Panel21 | -0.010875 | 0.734998 | 0.80959 | -0.004093 | -0.075011 | 0.095536 |
