# Bias Evaluation: AIF360
***

Quantifies model bias, in terms of fairness toward protected groups, before and after mitigation methods are applied.

## Terminology
***

***Favorable label:*** A label whose value corresponds to an outcome that provides an advantage to the recipient (such as receiving a loan, being hired for a job, not being arrested)

***Protected attribute:*** An attribute that partitions a population into groups whose outcomes should have parity (such as race, gender, caste, and religion)

***Privileged value (of a protected attribute):*** A protected attribute value indicating a group that has historically been at a systemic advantage

***Fairness metric:*** A quantification of unwanted bias in training data or models

***Discrimination/unwanted bias:*** Although bias can refer to any form of preference, fair or unfair, our focus is on undesirable bias or discrimination, which is when specific privileged groups are placed at a systematic advantage and specific unprivileged groups are placed at a systematic disadvantage. This relates to attributes such as race, gender, age, and sexual orientation.


## Structure of Evaluation & Intervention
***
<img src="images/aif360_pipeline.png" width="700" height="500" align="center"/>

### Three Perspectives of Fairness in ML algorithms
***

[LinkedIn article](https://www.linkedin.com/pulse/whats-new-deep-learning-research-reducing-bias-models-jesus-rodriguez/)

***1. Data vs Model***

Fairness may be quantified in the training dataset or in the learned model.

***2. Group vs Individual***

Group fairness partitions a population into groups defined by protected attributes and seeks for some statistical measure to be equal across all groups. Individual fairness seeks for similar individuals to be treated similarly.


***3. WAE vs WYSIWYG (We are all equal vs What you see is what you get)***

WAE holds that skills and opportunities are distributed equally among the participants in an ML task, so differences in outcome distributions are attributed to structural bias rather than to differences in ability. WYSIWYG holds that observations reflect ability with respect to the task.

> If the application follows the WAE worldview, then the demographic parity metrics should be used: disparate_impact and statistical_parity_difference.  If the application follows the WYSIWYG worldview, then the equality of odds metrics should be used: average_odds_difference and average_abs_odds_difference.  Other group fairness metrics (some are often labeled equality of opportunity) lie in-between the two worldviews and may be used appropriately: false_negative_rate_ratio, false_negative_rate_difference, false_positive_rate_ratio, false_positive_rate_difference, false_discovery_rate_ratio, false_discovery_rate_difference, false_omission_rate_ratio, false_omission_rate_difference, error_rate_ratio, and error_rate_difference. 

### Evaluation of Bias in Models

<center><b>Average Odds Difference:</b> $ \tfrac{1}{2}\left[(FPR_{D = \text{unprivileged}} - FPR_{D = \text{privileged}}) + (TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}})\right] $ </center>


<br><br>


<center><b>Statistical Parity Difference:    </b>$ Pr(\hat{Y} = 1 | D = \text{unprivileged}) - Pr(\hat{Y} = 1 | D = \text{privileged}) $</center>


<br><br>


<center><b>Equal Opportunity Difference:    </b>$ TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}} $</center>


<br><br>


<center><b>Theil Index:    </b>$ \frac{1}{n}\sum_{i=1}^n\frac{b_{i}}{\mu}\ln\frac{b_{i}}{\mu}, \text{ with } b_i = \hat{y}_i - y_i + 1 $</center>


<br><br>


<center><b>Disparate Impact:    </b> $ \frac{Pr(\hat{Y} = 1 | D = \text{unprivileged})} {Pr(\hat{Y} = 1 | D = \text{privileged})}$</center>
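The formulas above can be checked by hand on a small example. The following is a minimal NumPy sketch, not the AIF360 implementation: the helper names (`rate`, `group_fairness`, `theil_index`) and the eight toy records are assumptions made for illustration, and every difference is taken as unprivileged minus privileged.

```python
import numpy as np

def rate(mask, values):
    """Mean of `values` over the rows selected by the boolean `mask`."""
    return float(np.mean(values[mask]))

def group_fairness(y_true, y_pred, privileged):
    """Group fairness metrics for binary labels (illustrative helper)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    priv = np.asarray(privileged).astype(bool)
    unpriv = ~priv

    # Selection rates Pr(Y_hat = 1 | D)
    sel_u, sel_p = rate(unpriv, y_pred), rate(priv, y_pred)
    # True positive rates: predictions among actual positives
    tpr_u = rate(unpriv & (y_true == 1), y_pred)
    tpr_p = rate(priv & (y_true == 1), y_pred)
    # False positive rates: predictions among actual negatives
    fpr_u = rate(unpriv & (y_true == 0), y_pred)
    fpr_p = rate(priv & (y_true == 0), y_pred)

    return {
        "statistical_parity_difference": sel_u - sel_p,
        "disparate_impact": sel_u / sel_p,
        "equal_opportunity_difference": tpr_u - tpr_p,
        "average_odds_difference": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }

def theil_index(y_true, y_pred):
    """Theil index with benefit b_i = y_hat_i - y_i + 1."""
    b = np.asarray(y_pred) - np.asarray(y_true) + 1
    ratio = b / b.mean()
    # Use the convention 0 * ln(0) = 0 for individuals with zero benefit
    safe = np.where(ratio > 0, ratio, 1.0)
    return float(np.mean(ratio * np.log(safe)))

# Toy data (hypothetical): four privileged and four unprivileged individuals
privileged = [1, 1, 1, 1, 0, 0, 0, 0]
y_true     = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred     = [1, 1, 1, 0, 1, 0, 0, 0]

metrics = group_fairness(y_true, y_pred, privileged)
theil = theil_index(y_true, y_pred)
print(metrics, theil)
```

On this toy data the selection rates are 0.25 (unprivileged) and 0.75 (privileged), so the statistical parity difference is -0.5 and the disparate impact is 1/3.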

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
sns.set(context='talk', style='whitegrid')
from IPython.display import display, Markdown

from tensorflow import keras
from sklearn.metrics import RocCurveDisplay
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, MaxAbsScaler, RobustScaler
from sklearn.metrics import accuracy_score

import tensorflow as tf
from tensorflow.keras.optimizers  import Adam, Adagrad, SGD, RMSprop

# from aif360.sklearn.metrics import mdss_bias_scan, mdss_bias_score
import aif360
import utilities
import global_variables as gv

In [2]:
from aif360.datasets import StandardDataset, BinaryLabelDataset
from aif360.sklearn import metrics as mt
from aif360.explainers import MetricTextExplainer, MetricJSONExplainer

### load data

#### configure datasets according to aif360 library
<ol>
    <li> Binarize the sensitive attribute: 1 for the privileged group, 0 for the unprivileged group</li>
    <li> Binarize the label column: 1 is the positive outcome and 0 otherwise</li>
    <li> Set the sensitive attribute as the index</li>
</ol>
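The three configuration steps can be sketched on a toy frame as follows. This is only an illustration: the records, the `diagnosis` and `bmi` columns, and the choice of `"M"` as the privileged value are assumptions, not the notebook's actual data or the mapping stored in `gv.binary_sex`.

```python
import pandas as pd

# Hypothetical toy records; column names and encodings are illustrative only
toy = pd.DataFrame({
    "sex": ["F", "M", "F", "M", "F", "M"],
    "diagnosis": ["CVD", "none", "none", "CVD", "none", "CVD"],
    "bmi": [24.1, 31.0, 22.5, 28.3, 26.7, 29.9],
})

# 1. Binarize the sensitive attribute (assumed: "M" is the privileged group)
toy["sex-binary"] = toy["sex"].map({"M": 1, "F": 0})

# 2. Binarize the label: 1 is the positive outcome, 0 otherwise
toy["CVD"] = (toy["diagnosis"] == "CVD").astype(int)

# 3. Set the sensitive attribute as the index
prepared = toy[["bmi", "CVD", "sex-binary"]].set_index("sex-binary")

# With the attribute in the index, per-group base rates are a one-liner
base_rates = prepared.groupby(level="sex-binary")["CVD"].mean()
ratio = base_rates.loc[0] / base_rates.loc[1]
print(base_rates, ratio)
```

Keeping the protected attribute in the index rather than in a column lets group-wise metrics look it up by name while the feature matrix stays free of the binarized attribute.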

In [3]:
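df = pd.read_csv('data/binary_full.csv')
pd.set_option('display.max_columns', None)
df.drop('Unnamed: 0', axis=1, inplace=True)
# Binarize sex with the mapping defined in global_variables
df['sex-binary'] = df['sex'].map(gv.binary_sex)

# The first 61 columns are the model inputs; 'CVD' is the label
input_cols = df.iloc[:, :61].columns.to_list()
# One frame per protected attribute, indexed by that attribute
X1 = df.loc[:, input_cols + ['CVD', 'sex-binary']].set_index('sex-binary')
X2 = df.loc[:, input_cols + ['CVD', 'race-binary']].set_index('race-binary')
# .copy() avoids a SettingWithCopyWarning when the new column is added
X3 = df.loc[:, input_cols + ['CVD']].copy()
# Ages 50 to 69 form group 1; all other ages form group 0
X3['age-binary'] = np.where((df['age'] >= 50) & (df['age'] < 70), 1, 0)
X3 = X3.set_index('age-binary')
# Same frame indexed by the unbinarized age
X4 = df.loc[:, input_cols + ['CVD', 'age']].set_index('age')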

# X1.drop(columns=gv.protected_attributes[0], axis=1, inplace=True)
# X2.drop(columns=gv.protected_attributes[1], axis=1, inplace=True)
# X3.drop(columns=gv.protected_attributes[2], axis=1, inplace=True)
# X4.drop(columns=gv.protected_attributes[2], axis=1, inplace=True)

In [4]:
gv.protected_attributes # sex, race, age

['31-0.0', '21000-0.0', '21003-0.0']

In [5]:
X3.head()

age-binary,1319-0.0,1408-0.0,1329-0.0,1448-0.0,1538-0.0,6142-0.0,2050-0.0,1508-0.0,1339-0.0,30710-0.0,1349-0.0,30750-0.0,1468-0.0,20117-0.0,30740-0.0,1160-0.0,2090-0.0,31-0.0,1488-0.0,30850-0.0,4080-0.0,1369-0.0,21000-0.0,1200-0.0,1289-0.0,30790-0.0,845-0.0,48-0.0,30630-0.0,1299-0.0,1220-0.0,1548-0.0,1528-0.0,23099-0.0,49-0.0,30690-0.0,1389-0.0,2654-0.0,1249-0.0,1309-0.0,1379-0.0,1239-0.0,21003-0.0,30780-0.0,1438-0.0,30870-0.0,1359-0.0,30770-0.0,21001-0.0,1458-0.0,23100-0.0,6138-0.0,1418-0.0,1478-0.0,4079-0.0,30760-0.0,23101-0.0,2100-0.0,1428-0.0,30640-0.0,hypertension,CVD
1,0.0,1.0,2.0,3.0,2.0,1.0,2.0,3.0,2.0,0.34,1.0,34.937,3.0,2.0,5.622,7.0,1.0,0.0,6.0,0.508,110.0,1.0,1001.0,3.0,6.0,54.4035,20.9,74.0,1.593,10.0,0.0,2.0,2.0,35.6,102.0,6.477,1.0,6.0,1.0,2.0,1.0,0.0,54.0,3.888,10.0,0.977,2.0,26.339,24.579,3.86,25.0,1.0,3.0,1.0,77.0,1.706,45.2,1.0,0.0,1.211,0,1
1,0.0,3.0,2.0,1.0,0.0,1.0,1.0,2.0,2.0,3.94,4.0,40.9,5.0,2.0,5.052,9.0,0.0,1.0,2.0,13.088,166.0,2.0,1001.0,2.0,2.0,15.4,16.0,120.0,1.39,2.0,0.0,2.0,2.47,36.5,113.0,5.512,1.0,7.0,1.0,1.0,2.0,0.0,65.0,3.52,12.0,2.358,3.0,10.701,35.0861,7.0,42.9,3.0,2.0,1.0,91.0,1.173,74.6,0.0,1.0,1.019,1,0
1,0.0,3.0,3.0,2.0,1.0,2.0,1.0,2.0,2.0,0.55,1.0,40.0,1.0,0.0,5.31,5.0,0.0,0.0,0.0,0.515,132.0,1.0,1001.0,3.0,2.0,32.1,16.0,66.0,2.005,4.0,0.0,1.0,1.0,29.5,88.0,7.079,1.0,7.0,3.0,4.0,2.0,0.0,69.0,4.227,8.0,0.655,2.0,10.693,19.3835,7.0,15.2,3.0,2.0,1.0,67.0,2.49,36.3,0.0,1.0,1.097,0,0
1,3.0,3.0,3.0,3.0,0.0,2.0,1.0,2.0,2.0,0.45,2.0,37.3,4.0,2.0,4.449,7.0,0.0,1.0,5.0,4.675,178.0,2.0,1001.0,1.0,3.0,43.562,18.0,110.0,1.474,2.0,0.0,1.0,2.0,28.5,117.0,5.028,0.0,7.0,1.0,1.0,2.0,1.0,66.0,3.041,10.0,3.108,2.0,25.317,35.1281,7.0,31.7,3.0,2.0,1.0,84.0,1.169,79.6,0.0,3.0,0.923,0,0
0,0.0,3.0,2.0,1.0,0.0,5.0,2.0,2.0,2.0,0.75,2.0,32.2,1.0,2.0,4.616,6.0,0.0,1.0,3.04,20.162,178.0,1.0,1001.0,3.0,1.0,71.11,22.38,94.0,2.149,1.0,0.0,2.0,2.0,24.8,100.0,7.958,1.0,7.0,2.0,1.0,1.0,0.0,48.0,4.983,8.0,1.173,1.0,26.523,25.8866,1.0,20.1,1.0,2.0,1.0,88.0,2.053,61.0,0.0,3.0,1.443,0,0


In [6]:
X1_train = X1.iloc[:,:-1]
y1_train = X1.iloc[:,-1]

(X_train, X_test,
 y_train, y_test) = train_test_split(X1_train, y1_train, train_size=0.7, random_state=42)

In [7]:
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
print(y_train.value_counts(), '\n', y_test.value_counts())

(351736, 61) (150745, 61) (351736,) (150745,)
0    319546
1     32190
Name: CVD, dtype: int64 
 0    136805
1     13940
Name: CVD, dtype: int64


In [8]:
print(X_train.index.value_counts())

0    191331
1    160405
Name: sex-binary, dtype: int64


### Evaluate bias in original dataset: Disparate impact ratio

#### Interpretation:

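<ul>
    <li> output range = [0, ∞); a value of 1 means both groups receive the positive label at the same rate</li>
    <li> the closer the value is to 1, the fairer the outcome with respect to the given protected attribute; values below 1 mean the unprivileged group receives the positive label less often</li>
    <li> by the four-fifths rule, a value above 0.8 (and, symmetrically, below 1.25) is commonly considered acceptable</li>
</ul>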

In [9]:
mt.disparate_impact_ratio(X1['CVD'], prot_attr='sex-binary')

0.5661578163243346

In [10]:
mt.disparate_impact_ratio(X2['CVD'], prot_attr='race-binary')

0.5438220418530659

In [11]:
mt.disparate_impact_ratio(X3['CVD'], prot_attr='age-binary')

0.31036008621940697

> All three disparate impact ratios (0.57 for sex, 0.54 for race, 0.31 for age) fall well below the 0.8 threshold, indicating significant bias with respect to every protected attribute investigated.
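As a quick check of the ratios reported above, the sketch below applies a four-fifths style band to each value. The function name `interpret_disparate_impact` and the symmetric upper bound of 1/0.8 = 1.25 are assumptions chosen for illustration; the 0.8 lower threshold is the one stated in the interpretation notes.

```python
def interpret_disparate_impact(ratio, lower=0.8):
    """Classify a disparate impact ratio against a four-fifths style band."""
    upper = 1.0 / lower  # symmetric upper bound, 1.25 when lower is 0.8
    if ratio < lower:
        return "below band"
    if ratio > upper:
        return "above band"
    return "within band"

# Ratios reported in the cells above, rounded to three decimals
observed = {"sex-binary": 0.566, "race-binary": 0.544, "age-binary": 0.310}
verdicts = {attr: interpret_disparate_impact(r) for attr, r in observed.items()}
print(verdicts)
```

All three attributes land below the band, which matches the conclusion drawn from the raw ratios.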

### load model

In [12]:
model = keras.models.load_model('saved_models/mlp_binary_1.h5')
model.compile(loss='categorical_hinge',
              optimizer=SGD(learning_rate=0.0005),
              metrics=['acc', tf.keras.metrics.AUC(), tf.keras.metrics.Recall()])


### functions for evaluating bias of selected protected attribute

In [94]:
def set_protected_attribute(df, label_col):
    df = df.copy()
    y = df.set_index

In [13]:
def fairness_report(y_test, y_pred, prot_attr, return_df=True):
    """Collect group fairness metrics for one protected attribute.

    y_test must carry the protected attribute in its index under the name prot_attr.
    Returns a DataFrame when return_df is True, otherwise a nested dict.
    """
    fairness_report_dict = {
        prot_attr: {"disparate_impact_ratio":
                        mt.disparate_impact_ratio(y_test, y_pred, prot_attr=prot_attr),
                    "statistical_parity_difference":
                        mt.statistical_parity_difference(y_test, y_pred, prot_attr=prot_attr),
                    "equal_opportunity_difference":
                        mt.equal_opportunity_difference(y_test, y_pred, prot_attr=prot_attr),
                    "average_odds_difference":
                        mt.average_odds_difference(y_test, y_pred, prot_attr=prot_attr),
                    "average_odds_error":
                        mt.average_odds_error(y_test, y_pred, prot_attr=prot_attr),
                   }
    }
    if return_df:
        return pd.DataFrame(fairness_report_dict)
    return fairness_report_dict

### Protected Attribute: Sex

In [14]:
X1_inputs = X1.loc[:,input_cols]
privileged_groups = [{'sex-binary': 1}]
unprivileged_groups = [{'sex-binary': 0}]

In [66]:
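import h5py

# Inspect the architecture stored in the saved model; the context manager closes the file
with h5py.File('saved_models/mlp_binary_1.h5', 'r') as f:
    model_config = f.attrs.get('model_config')
model_config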

'{"class_name": "Sequential", "config": {"name": "sequential_9", "layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 105], "dtype": "float32", "sparse": false, "ragged": false, "name": "dense_44_input"}}, {"class_name": "Dense", "config": {"name": "dense_44", "trainable": true, "batch_input_shape": [null, 105], "dtype": "float32", "units": 1000, "activation": "tanh", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_45", "trainable": true, "dtype": "float32", "units": 500, "activation": "tanh", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer

In [64]:
! pip install --user h5py

In [15]:
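def get_fairness(df, protected_attribute):
    """Score the loaded model on the test split of `df` and report its fairness metrics."""
    # Only the test features and labels returned by the project helper are needed here
    _, _, X_test, _, _, y_test = utilities.process_features(df, 'CVD', RobustScaler(), one_hot=True)

    # Convert predicted probabilities to hard labels at a 0.8 decision threshold
    y_prob = model.predict(X_test)
    y_pred = np.where(y_prob > 0.8, 1, 0)

    return fairness_report(y_test, y_pred, prot_attr=protected_attribute)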
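The thresholding step in `get_fairness` can be illustrated in isolation. The sketch below assumes the model emits a single probability column of shape `(n, 1)`, which the truncated model configuration does not confirm; the probabilities are made up. Under that assumption, flattening the result gives a 1-D label vector that lines up with `y_test`.

```python
import numpy as np

# Hypothetical probabilities shaped like a single-output model: (n, 1)
y_prob = np.array([[0.95], [0.60], [0.81], [0.10]])

def to_labels(probabilities, threshold):
    """Turn an (n, 1) probability array into a flat vector of 0/1 labels."""
    return (probabilities > threshold).astype(int).ravel()

strict = to_labels(y_prob, 0.8)   # threshold used in get_fairness
lenient = to_labels(y_prob, 0.5)  # a more common default, for comparison

print(strict, strict.mean())
print(lenient, lenient.mean())
```

Raising the threshold from 0.5 to 0.8 lowers the overall selection rate here from 0.75 to 0.5, so the chosen cut-off feeds directly into every selection-rate-based fairness metric.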

In [None]:
result = get_fairness(X1, 'sex-binary')
result



In [None]:
result = get_fairness(X2, 'race-binary')
result

In [None]:
result = get_fairness(X3, 'age-binary')
result