# Baseline measures

Step 1. Import packages

The baseline measures are computed with the `aif360.sklearn` sub-package, which lets users apply the bias metrics to their own datasets. For more information, please refer to https://github.com/Trusted-AI/AIF360/tree/master/aif360/sklearn.

In [None]:
import numpy as np
import pandas as pd

!pip install 'aif360[OptimPreproc]' 

from sklearn.linear_model import LogisticRegression, LogisticRegressionCV, SGDClassifier
from sklearn.model_selection import train_test_split, KFold, cross_val_score
from aif360.sklearn.metrics import (consistency_score, generalized_entropy_error,
                                    generalized_entropy_index, theil_index,
                                    coefficient_of_variation)
from aif360.sklearn.metrics import (statistical_parity_difference, disparate_impact_ratio,
                                    equal_opportunity_difference, average_odds_difference)
from aif360.sklearn.datasets import standardize_dataset, to_dataframe

Step 2. Import and preprocess the dataset. Initialize objects.

Attribute 8 is *Gender* and attribute 12 is *Age*. For more information about preprocessing, please refer to https://aif360.readthedocs.io/en/latest/modules/generated/aif360.sklearn.datasets.standardize_dataset.html#aif360.sklearn.datasets.standardize_dataset.
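
The `aif360.sklearn` documentation describes protected attributes as being stored in the index of `X` and `y`. The snippet below is only a rough pandas analogy of that idea, not the library's code: the toy frame and its column names (`gender`, `age`, `feature`, `label`) are hypothetical and stand in for columns 8, 12 and 20 of the real dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
toy = pd.DataFrame({
    "gender": [1, 0, 1],
    "age": [0.1, 0.6, 0.3],
    "feature": [5.0, 7.0, 6.0],
    "label": [0, 1, 0],
})

# Keep the protected attributes as extra index levels while leaving
# them in place as ordinary feature columns (drop=False).
indexed = toy.set_index(["gender", "age"], drop=False, append=True)

# Split off the target; X_toy and y_toy share the same index.
y_toy = indexed.pop("label")
X_toy = indexed

# A group can then be selected from the index alone, without touching X_toy.
is_gender_1 = y_toy.index.get_level_values("gender") == 1
print(is_gender_1)
```

Because the group membership lives in the index, a feature column can be dropped from `X_toy` while the group information stays available to the metrics.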

In [168]:
# import dataset
df = pd.read_csv("german_data_normalized.csv", header = None)

# preprocess data following aif360.sklearn instructions
X,y = standardize_dataset(df,prot_attr=[8,12],target=20)

# initialize objects
dataset = [] # scenario
consistency = [] 
generalized_entropy = []

Step 3. Compute individual and group fairness baseline measures

Individual fairness metrics:
- Consistency score: measures how similar the labels are for similar instances.
- Generalised entropy error: measures inequality over a population by comparing the predictions made by a classifier with the ground truth. To that end, a LogisticRegression model is fitted. Note that neither a train-test split nor hyperparameter tuning is performed, so the model is fitted and evaluated on the same data.
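
To make the two individual metrics more concrete, here is a minimal NumPy sketch of how they are commonly defined. It is an illustration under stated assumptions, not the `aif360` implementation: the function names are hypothetical, neighbours are found by brute-force Euclidean distance, each point is assumed to count as one of its own neighbours, and the benefit is assumed to be b_i = ŷ_i − y_i + 1 with alpha = 2.

```python
import numpy as np

def consistency_sketch(X, y, n_neighbors=5):
    """1 minus the mean gap between each label and its neighbours' mean label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Brute-force pairwise Euclidean distances (fine for small data only).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Indices of the closest rows; assumption: a point is its own nearest neighbour.
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :n_neighbors]
    return 1.0 - np.abs(y - y[nearest].mean(axis=1)).mean()

def generalized_entropy_error_sketch(y_true, y_pred, alpha=2, pos_label=1):
    """Generalized entropy index of the benefits b_i = y_pred_i - y_true_i + 1."""
    true_pos = (np.asarray(y_true) == pos_label).astype(float)
    pred_pos = (np.asarray(y_pred) == pos_label).astype(float)
    b = 1.0 + pred_pos - true_pos
    mu = b.mean()
    # Formula for alpha not in {0, 1}.
    return np.mean((b / mu) ** alpha - 1.0) / (alpha * (alpha - 1))

# Two tight clusters with uniform labels inside each cluster.
X_demo = [[0.0], [0.1], [5.0], [5.1]]
y_demo = [1, 1, 0, 0]
print(consistency_sketch(X_demo, y_demo, n_neighbors=2))
print(generalized_entropy_error_sketch([1, 0, 1, 0], [1, 1, 0, 0]))
```

A consistency of 1 means every instance has the same label as its neighbours, and a generalized entropy error of 0 means all individuals receive the same benefit, as happens when every prediction is correct.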

First, we compute measures using all attributes in the dataset. 

In [169]:
# Feature combination 1
dataset.append('German_all_attributes')
consistency.append(consistency_score(X, y))

#X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state = 1,test_size=0.3)
model = LogisticRegression(max_iter=1000,random_state=1)
model.fit(X,y)
y_pred = model.predict(X)
print(model.score(X,y))

generalized_entropy.append(generalized_entropy_error(y, y_pred,pos_label=0))

0.777


Second, we exclude the attribute gender from the dataset and compute measures once more.

In [170]:
# Feature combination 2.1
dataset.append('German_excl_gender')

X,y = standardize_dataset(df,prot_attr=[8,12], dropcols=[8], target=20)

consistency.append(consistency_score(X, y))

#X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state = 1,test_size=0.3)
model = LogisticRegression(max_iter=1000,random_state=1)
model.fit(X,y)
y_pred = model.predict(X)
print(model.score(X,y))

generalized_entropy.append(generalized_entropy_error(y, y_pred,pos_label=0))

0.771


Third, we exclude the attribute age from the dataset and compute measures once more.

In [171]:
# Feature combination 2.2
dataset.append('German_excl_age')

X,y = standardize_dataset(df,prot_attr=[8,12], dropcols=[12], target=20)

consistency.append(consistency_score(X, y))

#X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state = 1,test_size=0.3)
model = LogisticRegression(max_iter=1000,random_state=1)
model.fit(X,y)
y_pred = model.predict(X)
print(model.score(X,y))

generalized_entropy.append(generalized_entropy_error(y, y_pred,pos_label=0))

0.776


In [172]:
# concatenate column-wise; axis is passed by keyword, as recent pandas versions require
baseline = pd.concat((np.round(pd.Series(consistency, name='Consistency'),3),np.round(pd.Series(generalized_entropy, name='GEE'),3)), axis=1)
baseline.index = dataset
baseline

| | Consistency | GEE |
|---|---|---|
| German_all_attributes | 0.746 | 0.094 |
| German_excl_gender | 0.743 | 0.095 |
| German_excl_age | 0.746 | 0.093 |


Group fairness metrics:
- Statistical parity difference
- Disparate impact
- Equal opportunity difference
- Average odds difference
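
The four group metrics can be sketched from per-group selection, true positive and false positive rates. The NumPy code below is an illustration of the usual definitions (unprivileged minus privileged, or their ratio), not the `aif360` source. The helper names and the toy arrays are hypothetical.

```python
import numpy as np

def group_rates(y_true, y_pred, mask, pos_label=1):
    """Selection rate, TPR and FPR for the rows selected by mask."""
    actual = np.asarray(y_true)[mask] == pos_label
    predicted = np.asarray(y_pred)[mask] == pos_label
    selection = predicted.mean()      # P(pred = pos)
    tpr = predicted[actual].mean()    # P(pred = pos | true = pos)
    fpr = predicted[~actual].mean()   # P(pred = pos | true != pos)
    return selection, tpr, fpr

def group_fairness_sketch(y_true, y_pred, group, priv_group=1, pos_label=1):
    """Compare the unprivileged group against the privileged one."""
    group = np.asarray(group)
    sel_p, tpr_p, fpr_p = group_rates(y_true, y_pred, group == priv_group, pos_label)
    sel_u, tpr_u, fpr_u = group_rates(y_true, y_pred, group != priv_group, pos_label)
    return {
        "statistical_parity_difference": sel_u - sel_p,
        "disparate_impact_ratio": sel_u / sel_p,
        "equal_opportunity_difference": tpr_u - tpr_p,
        "average_odds_difference": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }

# Toy example: the first four rows belong to the privileged group.
y_true_demo = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred_demo = [1, 1, 1, 0, 1, 0, 0, 0]
group_demo = [1, 1, 1, 1, 0, 0, 0, 0]
demo_scores = group_fairness_sketch(y_true_demo, y_pred_demo, group_demo)
print(demo_scores)
```

Under these definitions, difference metrics equal 0 and the ratio equals 1 when both groups are treated identically. Negative differences and ratios below 1 mean the unprivileged group receives the positive label less often.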

In [173]:
df = pd.read_csv("german_data_normalized.csv", header = None)

# preprocess data following aif360.sklearn instructions
X,y = standardize_dataset(df,prot_attr=[8,12],target=20)
# binarize the age column: values above 0.25 become 0, all others become 1
X[12] = np.where(X[12] > 0.25, 0, 1)

model = LogisticRegression(max_iter=1000,random_state=1)
model.fit(X,y)
y_pred = model.predict(X)

# initialize objects
dataset = [] # scenario
stat_par = [] 
disp_im = []
eq_opp = []
ave_odds = []

We compute the four group fairness measures by setting the `prot_attr` parameter to the index of the protected attribute.

First, we compute the metrics focusing on gender.

In [174]:
dataset.append('gender/female')
# prot_attr is passed by keyword in every call
stat_par.append(statistical_parity_difference(y, y_pred, prot_attr=8, pos_label=0, priv_group=1))
disp_im.append(disparate_impact_ratio(y, y_pred, prot_attr=8, pos_label=0, priv_group=1))
eq_opp.append(equal_opportunity_difference(y, y_pred, prot_attr=8, pos_label=0, priv_group=1))
ave_odds.append(average_odds_difference(y, y_pred, prot_attr=8, pos_label=0, priv_group=1))

Second, we compute the metrics focusing on age.

In [175]:
dataset.append('age/young')
# prot_attr is passed by keyword in every call
stat_par.append(statistical_parity_difference(y, y_pred, prot_attr=12, pos_label=0, priv_group=0))
disp_im.append(disparate_impact_ratio(y, y_pred, prot_attr=12, pos_label=0, priv_group=0))
eq_opp.append(equal_opportunity_difference(y, y_pred, prot_attr=12, pos_label=0, priv_group=0))
ave_odds.append(average_odds_difference(y, y_pred, prot_attr=12, pos_label=0, priv_group=0))

Finally, we merge the two.

In [176]:
pd.DataFrame(np.array([stat_par, disp_im, eq_opp, ave_odds]).T, 
             columns = ['Statistical Parity', 'Disparate Impact', 
             'Equal Opportunity', 'Average Odds'], index = dataset)

| | Statistical Parity | Disparate Impact | Equal Opportunity | Average Odds |
|---|---|---|---|---|
| gender/female | -0.116597 | 0.856079 | -0.056102 | -0.103055 |
| age/young | -0.226453 | 0.773547 | -0.104435 | -0.30807 |
