# Submission instructions 

All code that you write should be in this notebook.
Submit:

* This notebook with your code added. Make sure to add enough documentation.
* A short report, max 2 pages including any figures and/or tables (it is likely that you won't need the full 2 pages). Use [this template](https://www.overleaf.com/read/mvskntycrckw). 
* The deadline is Monday 17th of May, 17.00.

For questions, make use of the "Lab" session (see schedule).
Questions can also be posted to the MS teams channel called "Lab". 


# Installing AIF360

In this assignment, we're going to use the AIF360 library.
For documentation, take a look at:

* https://aif360.mybluemix.net/
* https://aif360.readthedocs.io/en/latest/ (API documentation)
* https://github.com/Trusted-AI/AIF360 (installation instructions)

We recommend using a dedicated Python environment for this assignment, for example
by using Conda (https://docs.conda.io/en/latest/).
You could also use Google Colab (https://colab.research.google.com/).

When installing AIF360, you only need to install the stable, basic version (e.g., `pip install aif360`).
You don't need to install the additional optional dependencies.

The library itself provides some examples in the GitHub repository, see:
https://github.com/Trusted-AI/AIF360/tree/master/examples.

**Notes**
* The lines below starting with ! can be used in Google Colab by uncommenting them, or in your console.
* The first time you run the import statements, you may get a warning "No module named tensorflow".
  This can be ignored; we don't need it for this assignment. Just run the code block again, and it should disappear.

In [1]:
# !pip install aif360[all]

In [2]:
# !pip install aif360
# !pip install fairlearn
from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions\
        import load_preproc_data_compas

from aif360.metrics import BinaryLabelDatasetMetric

# Exploring the data

**COMPAS dataset**

In this assignment we're going to use the COMPAS dataset.

If you haven't done so already, take a look at this article: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.
For background on the dataset, see https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

**Reading in the COMPAS dataset**

The AIF360 library already has built-in code to read in this dataset.
However, you'll first need to manually download the COMPAS dataset 
and put it into a specified directory. 
See: https://github.com/Trusted-AI/AIF360/blob/master/aif360/data/raw/compas/README.md.
If you try to load in the dataset for the first time, the library will give you instructions on the steps to download the data.

The protected attributes in this dataset are 'sex' and 'race'. 
For this assignment, we'll only focus on race.

The label codes recidivism, which they defined as a new arrest within 2 years. 
Note that in this dataset, the label is coded with 0 (no recidivism) being the favorable label and 1 (recidivism) the unfavorable label.

In [3]:
# !wget -c https://raw.githubusercontent.com/propublica/compas-analysis/master/compas-scores-two-years.csv
# !mv compas-scores-two-years.csv data/compas-scpre

compas_data = load_preproc_data_compas(protected_attributes=['race'])

Now let's take a look at the data:

In [4]:
compas_data

               instance weights features                                       \
                                         protected attribute                    
                                     sex                race age_cat=25 to 45   
instance names                                                                  
3                           1.0      0.0                 0.0              1.0   
4                           1.0      0.0                 0.0              0.0   
8                           1.0      0.0                 1.0              1.0   
10                          1.0      1.0                 1.0              1.0   
14                          1.0      0.0                 1.0              1.0   
...                         ...      ...                 ...              ...   
10994                       1.0      0.0                 0.0              1.0   
10995                       1.0      0.0                 0.0              0.0   
10996                       

**Creating a train and test split**

We'll create a train (80%) and test split (20%). 

Note: *Usually when carrying out machine learning experiments,
we also need a dev set for developing and selecting our models (incl. tuning of hyper-parameters).
However, in this assignment, the goal is not to optimize 
the performance of models so we'll only use a train and test split.*

Note: *Due to the random division into train and test sets, the actual output of your runs may differ slightly from the statistics shown in the rest of this notebook.*

In [5]:
train_data, test_data = compas_data.split([0.8], shuffle=True)
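The split above is shuffled without a fixed seed, which is why results vary between runs. As a minimal sketch of what a reproducible 80/20 split looks like, the snippet below shuffles row indices with a seeded generator. The helper `split_indices` and the seed value are hypothetical and only illustrate the idea; in AIF360 itself, passing a `seed` to `split` should have a similar effect, but check the API documentation for your installed version.

```python
import random

def split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle range(n) with a fixed seed and cut it into train and test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # local generator, global random state is untouched
    cut = int(train_fraction * n)         # number of training rows
    return indices[:cut], indices[cut:]

# 5278 is the number of rows in the preprocessed COMPAS data used in this notebook.
train_idx, test_idx = split_indices(5278, train_fraction=0.8, seed=42)
print(len(train_idx), len(test_idx))
```

Running the helper twice with the same seed returns the same index lists, so every statistic computed afterwards is repeatable.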

In this assignment, we'll focus on protected attribute: race.
This is coded as a binary variable with "Caucasian" coded as 1 and "African-American" coded as 0.

In [6]:
priv_group   = [{'race': 1}]  # Caucasian
unpriv_group = [{'race': 0}]  # African-American

Now let's look at some statistics:

In [7]:
print("Training set shape: %s, %s" % train_data.features.shape)
print("Favorable (not recid) and unfavorable (recid) labels: %s; %s" % (train_data.favorable_label, train_data.unfavorable_label))
print("Protected attribute names: %s" % train_data.protected_attribute_names)
# labels of privileged (1) and unprivileged groups (0)
print("Privileged (Caucasian) and unprivileged (African-American) protected attribute values: %s, %s" % (train_data.privileged_protected_attributes, 
      train_data.unprivileged_protected_attributes))
print("Feature names: %s" % train_data.feature_names)

Training set shape: 4222, 10
Favorable (not recid) and unfavorable (recid) labels: 0.0; 1.0
Protected attribute names: ['race']
Privileged (Caucasian) and unprivileged (African-American) protected attribute values: [array([1.])], [array([0.])]
Feature names: ['sex', 'race', 'age_cat=25 to 45', 'age_cat=Greater than 45', 'age_cat=Less than 25', 'priors_count=0', 'priors_count=1 to 3', 'priors_count=More than 3', 'c_charge_degree=F', 'c_charge_degree=M']


Now, let's take a look at the test data and compute the following difference:

$$P(Y=\text{favorable} \mid D=\text{unprivileged}) - P(Y=\text{favorable} \mid D=\text{privileged})$$


In [8]:
metric_test_data = BinaryLabelDatasetMetric(test_data, 
                             unprivileged_groups = unpriv_group,
                             privileged_groups   = priv_group)
print("Mean difference (statistical parity difference) = %f" % 
      metric_test_data.statistical_parity_difference())


Mean difference (statistical parity difference) = -0.132312


To be clear: because we're looking at the original label distribution, this is the difference in base rates between the two groups.

In [9]:
metric_test_data.base_rate(False)  # Base rate of the unprivileged group

0.48148148148148145

In [10]:
metric_test_data.base_rate(True)   # Base rate of the privileged group

0.6137931034482759

To explore the data, it can also help to convert it to a dataframe.
Note that the group means below correspond to the base rates reported above:
base rates are computed for the favorable label (which is actually 0), whereas the means are the rates of label 1, so each mean equals 1 minus the corresponding base rate.

In [11]:
test_data.convert_to_dataframe()[0].groupby(['race'])['two_year_recid'].describe()

| race | count | mean | std | min | 25% | 50% | 75% | max |
|------|-------|------|-----|-----|-----|-----|-----|-----|
| 0.0 | 621.0 | 0.518519 | 0.50006 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| 1.0 | 435.0 | 0.386207 | 0.48744 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
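The relation between these group means, the base rates, and the statistical parity difference can be checked by hand. The sketch below uses counts reconstructed from the test-set table above (count times mean, rounded to whole people); these numbers are specific to this run and will differ for another random split.

```python
# Counts reconstructed from the test-set table above (count x mean, rounded).
n_unpriv, recid_unpriv = 621, 322   # race = 0 (African-American)
n_priv, recid_priv = 435, 168       # race = 1 (Caucasian)

# The favorable label is 0 (no recidivism), so base rate = 1 - recidivism rate.
base_rate_unpriv = 1 - recid_unpriv / n_unpriv
base_rate_priv = 1 - recid_priv / n_priv

# Statistical parity difference: unprivileged base rate minus privileged base rate.
spd = base_rate_unpriv - base_rate_priv
print(round(base_rate_unpriv, 6), round(base_rate_priv, 6), round(spd, 6))
```

The three printed values agree with the outputs of `base_rate(False)`, `base_rate(True)`, and `statistical_parity_difference()` reported earlier, up to rounding.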


In [12]:
train_data.convert_to_dataframe()[0].groupby(['race'])['two_year_recid'].describe()

| race | count | mean | std | min | 25% | 50% | 75% | max |
|------|-------|------|-----|-----|-----|-----|-----|-----|
| 0.0 | 2554.0 | 0.524276 | 0.499508 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| 1.0 | 1668.0 | 0.392086 | 0.488362 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |


In [13]:
train_data.convert_to_dataframe()[0].groupby(['sex'])['two_year_recid'].describe()

| sex | count | mean | std | min | 25% | 50% | 75% | max |
|-----|-------|------|-----|-----|-----|-----|-----|-----|
| 0.0 | 3402.0 | 0.495885 | 0.500057 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| 1.0 | 820.0 | 0.373171 | 0.483942 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |


In [14]:
train_data.convert_to_dataframe()[0].groupby(['sex','race'])['two_year_recid'].describe()

| sex | race | count | mean | std | min | 25% | 50% | 75% | max |
|-----|------|-------|------|-----|-----|-----|-----|-----|-----|
| 0.0 | 0.0 | 2108.0 | 0.554554 | 0.497133 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 1.0 | 1294.0 | 0.400309 | 0.49015 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| 1.0 | 0.0 | 446.0 | 0.381166 | 0.486219 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| 1.0 | 1.0 | 374.0 | 0.363636 | 0.48169 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |


In [15]:
test_data.convert_to_dataframe()[0].groupby(['sex','race'])['two_year_recid'].describe()

| sex | race | count | mean | std | min | 25% | 50% | 75% | max |
|-----|------|-------|------|-----|-----|-----|-----|-----|-----|
| 0.0 | 0.0 | 518.0 | 0.557915 | 0.497115 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 1.0 | 327.0 | 0.409786 | 0.492548 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| 1.0 | 0.0 | 103.0 | 0.320388 | 0.468908 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| 1.0 | 1.0 | 108.0 | 0.314815 | 0.466607 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |


In [16]:
compas_data.convert_to_dataframe()[0].groupby(['sex','race'])['two_year_recid'].describe()

| sex | race | count | mean | std | min | 25% | 50% | 75% | max |
|-----|------|-------|------|-----|-----|-----|-----|-----|-----|
| 0.0 | 0.0 | 2626.0 | 0.555217 | 0.497036 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 1.0 | 1621.0 | 0.402221 | 0.490497 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| 1.0 | 0.0 | 549.0 | 0.369763 | 0.483181 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
| 1.0 | 1.0 | 482.0 | 0.352697 | 0.478306 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |


In [17]:
compas_data.convert_to_dataframe()[0].describe()

| | sex | race | age_cat=25 to 45 | age_cat=Greater than 45 | age_cat=Less than 25 | priors_count=0 | priors_count=1 to 3 | priors_count=More than 3 | c_charge_degree=F | c_charge_degree=M | two_year_recid |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 5278.0 | 5278.0 | 5278.0 | 5278.0 | 5278.0 | 5278.0 | 5278.0 | 5278.0 | 5278.0 | 5278.0 | 5278.0 |
| mean | 0.195339 | 0.398446 | 0.573323 | 0.207654 | 0.219022 | 0.315839 | 0.370027 | 0.314134 | 0.651762 | 0.348238 | 0.470443 |
| std | 0.396499 | 0.489625 | 0.494641 | 0.405666 | 0.413623 | 0.464893 | 0.482857 | 0.464214 | 0.476457 | 0.476457 | 0.499173 |
| min | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 50% | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 75% | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| max | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |


In [18]:
print('Test data size:',len(test_data.convert_to_dataframe()[0].index))
print('Train data size:',len(train_data.convert_to_dataframe()[0].index))

Test data size: 1056
Train data size: 4222


**Report**

Report basic statistics in your report, such as the size of the training and test set.

Now let's explore the *training* data further.
In your report include a short analysis of the training data. Look at the base rates of the outcome variable (two year recidivism) for the combination of both race and sex categories. What do you see?

# Classifiers

**Training classifiers**

Now, train the following classifiers:

1. A logistic regression classifier making use of all features 
2. A logistic regression classifier without the race feature
3. A classifier after reweighting instances in the training set https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.preprocessing.Reweighing.html.
    * Report the weights that are used for reweighing and a short interpretation/discussion.
4. A classifier after post-processing 
https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.postprocessing.EqOddsPostprocessing.html#aif360.algorithms.postprocessing.EqOddsPostprocessing 

For training the classifiers we recommend using scikit-learn (https://scikit-learn.org/stable/).
AIF360 contains a sklearn wrapper, but it is still in development and incomplete.
We therefore recommend using the base AIF360 library rather than the sklearn wrapper.

**Report**

For each of these classifiers, report the following:
* Overall precision, recall, F1 and accuracy.
* The statistical parity difference. Does this classifier satisfy statistical parity? How does this difference compare to the original dataset?
* Difference of true positive rates between the two groups. Does the classifier satisfy the equal opportunity criterion? 
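The quantities in this list can all be derived from the four confusion-matrix counts. The following is a minimal sketch in plain Python on hypothetical toy labels, not the notebook's implementation; the helper names `confusion_counts`, `summary`, and `tpr_difference` are made up for illustration. The `positive` argument selects which class counts as positive; for the equal opportunity criterion in this dataset that would presumably be the favorable label 0.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def summary(y_true, y_pred, positive=1):
    """Overall precision, recall (true positive rate), F1 and accuracy."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

def tpr_difference(y_true, y_pred, group, positive=1):
    """TPR of the unprivileged group (0) minus TPR of the privileged group (1)."""
    def tpr(g):
        yt = [t for t, gg in zip(y_true, group) if gg == g]
        yp = [p for p, gg in zip(y_pred, group) if gg == g]
        return summary(yt, yp, positive)["recall"]
    return tpr(0) - tpr(1)

# Hypothetical toy data: four people per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

print(summary(y_true, y_pred))
print(tpr_difference(y_true, y_pred, group))
```

A difference of zero would mean both groups have the same true positive rate, which is what equal opportunity asks for.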



## A logistic regression classifier making use of all features

In [19]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score,recall_score,accuracy_score,f1_score,confusion_matrix
from aif360.datasets import BinaryLabelDataset


train_df_baseline = train_data.convert_to_dataframe()[0]
test_df_baseline = test_data.convert_to_dataframe()[0]
X_train_b = train_df_baseline[train_df_baseline.columns[:-1]]
y_train_b = train_df_baseline[train_df_baseline.columns[-1]]
X_test_b = test_df_baseline[test_df_baseline.columns[:-1]]
y_test_b = test_df_baseline[test_df_baseline.columns[-1]]


lr_model_baseline = LogisticRegression(random_state=0).fit(X_train_b,y_train_b)
predictions_baseline = lr_model_baseline.predict(X_test_b)

print('f1:',f1_score(y_test_b,predictions_baseline))
print('recall:',recall_score(y_test_b,predictions_baseline))
print('accuracy:',accuracy_score(y_test_b,predictions_baseline))
print('precision:',precision_score(y_test_b,predictions_baseline))
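cm = confusion_matrix(y_test_b, predictions_baseline)
# Rows are true labels, columns are predictions: cm[1][1] = true positives,
# cm[1][0] = false negatives (true label 1, predicted 0).
TP = cm[1][1]
FN = cm[1][0]
print("True positives:",TP,'rate:',str(round(TP*100/(TP+FN),2))+'%')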

f1: 0.6053475935828877
recall: 0.5775510204081633
accuracy: 0.6505681818181818
precision: 0.6359550561797753
True positives: 283 rate: 57.76%


### Statistical parity

In [20]:
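priv_group   = [{'race': 1}]  # Caucasian
unpriv_group = [{'race': 0}]  # African-American
# Attach the predictions as the label column so AIF360 can compute group metrics on them.
X_test_b['two_year_recid'] = predictions_baseline
# Note: favorable_label=1.0 treats "recidivism" as the favorable outcome here, the opposite
# of the dataset's own coding (favorable = 0.0), so the sign of the difference is flipped
# relative to In [8].
bld_test = BinaryLabelDataset(favorable_label=1.0, unfavorable_label=0.0,df=X_test_b,label_names=['two_year_recid'],protected_attribute_names=['race'])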
metric_test_data = BinaryLabelDatasetMetric(bld_test, 
                             unprivileged_groups = unpriv_group,
                             privileged_groups   = priv_group)
print("Mean difference (statistical parity difference) = %f" % 
      metric_test_data.statistical_parity_difference())

Mean difference (statistical parity difference) = 0.349125
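The cell above builds the `BinaryLabelDataset` with `favorable_label=1.0`, while the dataset statistics printed earlier report 0.0 as the favorable label. As a small sketch of why this matters when comparing numbers, the snippet below computes the statistical parity difference on hypothetical toy predictions under both conventions. The function is a plain-Python stand-in written for illustration, not the AIF360 implementation.

```python
def statistical_parity_difference(y_pred, group, favorable):
    """P(pred == favorable | group == 0) - P(pred == favorable | group == 1)."""
    def rate(g):
        preds = [p for p, gg in zip(y_pred, group) if gg == g]
        return sum(p == favorable for p in preds) / len(preds)
    return rate(0) - rate(1)

# Hypothetical predictions (1 = predicted recidivism) for two groups of four.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]   # 0 = unprivileged, 1 = privileged

spd_fav1 = statistical_parity_difference(y_pred, group, favorable=1)
spd_fav0 = statistical_parity_difference(y_pred, group, favorable=0)
print(spd_fav1, spd_fav0)  # prints 0.5 -0.5
```

With binary labels the two conventions give the same magnitude with opposite signs, so a positive value under `favorable_label=1.0` describes the same disparity as a negative value under the dataset's own coding.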


## A logistic regression classifier without the race feature

In [21]:
train_df = train_data.convert_to_dataframe()[0]
test_df = test_data.convert_to_dataframe()[0]
columns_without_race = train_df.columns[2:-1]
columns_without_race =columns_without_race.insert(0,train_df.columns[0])

In [22]:
X_train_wo_race = train_df[columns_without_race]
y_train_wo_race = train_df[train_df.columns[-1]]
X_test_wo_race = test_df[columns_without_race]
y_test_wo_race = test_df[test_df.columns[-1]]


lr_model = LogisticRegression(random_state=0).fit(X_train_wo_race,y_train_wo_race)
predictions = lr_model.predict(X_test_wo_race)

print('f1:',f1_score(y_test_wo_race,predictions))
print('recall:',recall_score(y_test_wo_race,predictions))
print('accuracy:',accuracy_score(y_test_wo_race,predictions))
print('precision:',precision_score(y_test_wo_race,predictions))
cm = confusion_matrix(y_test_wo_race, predictions)
# cm[1][1] = true positives, cm[1][0] = false negatives (true label 1, predicted 0).
TP = cm[1][1]
FN = cm[1][0]
print("True positives:",TP,'rate:',str(round(TP*100/(TP+FN),2))+'%')

f1: 0.6143790849673203
recall: 0.5755102040816327
accuracy: 0.6647727272727273
precision: 0.6588785046728972
True positives: 282 rate: 57.55%


### Statistical parity

In [23]:
priv_group   = [{'race': 1}]  # Caucasian
unpriv_group = [{'race': 0}]   # African-American
X_test_b['two_year_recid'] = predictions
bld_test = BinaryLabelDataset(favorable_label=1.0, unfavorable_label=0.0,df=X_test_b,label_names=['two_year_recid'],protected_attribute_names=['race'])
metric_test_data = BinaryLabelDatasetMetric(bld_test, 
                             unprivileged_groups = unpriv_group,
                             privileged_groups   = priv_group)
print("Mean difference (statistical parity difference) = %f" % 
      metric_test_data.statistical_parity_difference())

Mean difference (statistical parity difference) = 0.259204


## A classifier after reweighting instances in the training set

In [28]:
from aif360.algorithms.preprocessing import Reweighing

privileged_group = [{'race':1}]
unprivileged_group = [{'race':0}]
rw = Reweighing(unprivileged_group,privileged_group)

rw_train_data = rw.fit_transform(train_data)
rw_train_df = rw_train_data.convert_to_dataframe()[0]

test_df = test_data.convert_to_dataframe()[0]
X_train_rw = rw_train_df[rw_train_df.columns[:-1]]
y_train_rw = rw_train_df[rw_train_df.columns[-1]]
X_test = test_df[test_df.columns[:-1]]
y_test = test_df[test_df.columns[-1]]

# Pass the reweighing weights to scikit-learn explicitly as per-sample weights.
instance_weights = rw_train_data.convert_to_dataframe()[1]['instance_weights']
lr_model_rw = LogisticRegression(random_state=0).fit(X_train_rw,y_train_rw,sample_weight=instance_weights)
predictions = lr_model_rw.predict(X_test)

print('f1:',f1_score(y_test,predictions))
print('recall:',recall_score(y_test,predictions))
print('accuracy:',accuracy_score(y_test,predictions))
print('precision:',precision_score(y_test,predictions))
cm = confusion_matrix(y_test, predictions)
# cm[1][1] = true positives, cm[1][0] = false negatives (true label 1, predicted 0).
TP = cm[1][1]
FN = cm[1][0]
print("True positives:",TP,'rate:',str(round(TP*100/(TP+FN),2))+'%')

f1: 0.6291666666666667
recall: 0.6163265306122448
accuracy: 0.6628787878787878
precision: 0.6425531914893617
True positives: 302 rate: 61.63%
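The assignment also asks for the weights used by the reweighing step. As a rough sketch, assuming the library follows the Kamiran and Calders scheme (weight = expected count under independence of group and label, divided by the observed count), the four weights can be reproduced from group and label counts. The counts below are reconstructed from the training-set table by race earlier in this notebook (count times mean, rounded) and are specific to this run.

```python
# Training-set counts keyed by (race, two_year_recid), reconstructed from the
# training table by race (count x mean, rounded).
counts = {
    (0, 1): 1339, (0, 0): 2554 - 1339,   # race 0: recidivism / no recidivism
    (1, 1): 654,  (1, 0): 1668 - 654,    # race 1: recidivism / no recidivism
}
n = sum(counts.values())
n_group = {g: sum(c for (gg, _), c in counts.items() if gg == g) for g in (0, 1)}
n_label = {y: sum(c for (_, yy), c in counts.items() if yy == y) for y in (0, 1)}

# Weight = (count expected if race and label were independent) / (observed count).
weights = {
    (g, y): n_group[g] * n_label[y] / (n * counts[(g, y)])
    for (g, y) in counts
}
for key in sorted(weights):
    print(key, round(weights[key], 4))

# After weighting, the recidivism rate is the same in both groups.
def weighted_recid_rate(g):
    pos = weights[(g, 1)] * counts[(g, 1)]
    neg = weights[(g, 0)] * counts[(g, 0)]
    return pos / (pos + neg)

print(weighted_recid_rate(0), weighted_recid_rate(1))
```

Combinations that are over-represented relative to independence get a weight below 1 and under-represented ones a weight above 1, which is the interpretation the report should discuss. The values actually used in a run can be read from `rw_train_data.instance_weights`.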


### Statistical parity

In [29]:
priv_group   = [{'race': 1}]  # Caucasian
unpriv_group = [{'race': 0}]  # African-American
X_test['two_year_recid'] = predictions
bld_test = BinaryLabelDataset(favorable_label=1.0, unfavorable_label=0.0,df=X_test,label_names=['two_year_recid'],protected_attribute_names=['race'])
metric_test_data = BinaryLabelDatasetMetric(bld_test, 
                             unprivileged_groups = unpriv_group,
                             privileged_groups   = priv_group)
print("Mean difference (statistical parity difference) = %f" % 
      metric_test_data.statistical_parity_difference())

Mean difference (statistical parity difference) = 0.100106


## A classifier after post-processing

In [30]:
from aif360.algorithms.postprocessing import EqOddsPostprocessing

privileged_group = [{'race':1}]
unprivileged_group = [{'race':0}]
pp = EqOddsPostprocessing(unprivileged_group,privileged_group)

train_dfpp = train_data.convert_to_dataframe()[0]
predictions = lr_model_baseline.predict(train_dfpp[train_dfpp.columns[:-1]])
new_train_dfpp = train_data.convert_to_dataframe()[0]
new_train_dfpp['two_year_recid'] = predictions

bld_train_true = BinaryLabelDataset(favorable_label=1.0, unfavorable_label=0.0,df=train_dfpp,label_names=['two_year_recid'],protected_attribute_names=['race'])
bld_train_pred = BinaryLabelDataset(favorable_label=1.0, unfavorable_label=0.0,df=new_train_dfpp,label_names=['two_year_recid'],protected_attribute_names=['race'])



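# Caution: test_df still holds the true labels, so the post-processor below is applied to
# ground-truth labels rather than to the baseline classifier's test predictions.
bld_test = BinaryLabelDataset(favorable_label=1.0, unfavorable_label=0.0,df=test_df,label_names=['two_year_recid'],protected_attribute_names=['race'])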


pp = pp.fit(bld_train_true,bld_train_pred)

predictions = pp.predict(bld_test).convert_to_dataframe()[0]['two_year_recid'].values




print('f1:',f1_score(y_test_b,predictions))
print('recall:',recall_score(y_test_b,predictions))
print('accuracy:',accuracy_score(y_test_b,predictions))
print('precision:',precision_score(y_test_b,predictions))
cm = confusion_matrix(y_test, predictions)
# cm[1][1] = true positives, cm[1][0] = false negatives (true label 1, predicted 0).
TP = cm[1][1]
FN = cm[1][0]
print("True positives:",TP,'rate:',str(round(TP*100/(TP+FN),2))+'%')

f1: 0.8524590163934426
recall: 0.7959183673469388
accuracy: 0.8721590909090909
precision: 0.9176470588235294
True positives: 390 rate: 79.59%


### Statistical parity

In [31]:
priv_group   = [{'race': 1}]  # Caucasian
unpriv_group = [{'race': 0}]  # African-American

df = bld_test.convert_to_dataframe()[0]
df['two_year_recid'] = predictions
bld_test = BinaryLabelDataset(favorable_label=1.0, unfavorable_label=0.0,df=df,label_names=['two_year_recid'],protected_attribute_names=['race'])
metric_test_data = BinaryLabelDatasetMetric(bld_test, 
                             unprivileged_groups = unpriv_group,
                             privileged_groups   = priv_group)
print("Mean difference (statistical parity difference) = %f" % 
      metric_test_data.statistical_parity_difference())

Mean difference (statistical parity difference) = -0.109179


# Discussion

**Report**
* Briefly discuss your results. For example, how do the different classifiers compare against each other? 
* Also include a short ethical discussion (1 or 2 paragraphs) reflecting on these two aspects: 1) The use of a ML system to try to predict recidivism; 2) The public release of a dataset like this.
