<a href="https://colab.research.google.com/github/danielbauer1979/CAS_PredMod/blob/main/pa_pynb_sess9_AlgFairness.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Session 9: Algorithmic Bias and Fairness

Jim Guszcza and Dani Bauer, 10/2023

In this tutorial, we discuss how to analyze a predictive algorithm with regard to "fairness." We do so in the context of a well-known case study on an algorithm that assists judges in parole decisions. We go over different notions of fairness, discuss tradeoffs, and explain the intuition behind the results of the analyses. We also discuss how to adjust algorithms to enforce fairness. A second case study, which we will ask you to work on, illustrates how these ideas apply in the actuarial context.

### Load required packages

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report, precision_score, roc_curve, auc

In [None]:
!pip install aequitas
# Another popular library is AI Fairness 360:
#!pip install aif360

In [21]:
from aequitas.group import Group
from aequitas.bias import Bias
from aequitas.fairness import Fairness
import aequitas.plot as ap

# COMPAS Case Study

The COMPAS data originates from a [well-known case study on algorithmic bias](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing). The background is [as follows](https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm): Across the nation, judges, probation and parole officers are increasingly using algorithms to assess a criminal defendant's likelihood of becoming a recidivist, a term used to describe criminals who re-offend. One of the commercial tools, made by Northpointe, Inc., is called COMPAS (which stands for Correctional Offender Management Profiling for Alternative Sanctions). The case study compares outcomes and risk scores for individuals of different races. In what follows, we will go through some of these analyses ourselves.

## The COMPAS Data

The data is in our GitHub repository. Let's take a look:

In [None]:
!git clone https://github.com/danielbauer1979/CAS_PredMod.git

In [None]:
dat = pd.read_csv('CAS_PredMod/pa_data_compasdata.csv')
dat.head()

In [None]:
dat.describe()

The data contains information on recidivism for 6,172 individuals, along with each individual's age, sex, criminal history, and ethnicity, and the risk score they received from the COMPAS algorithm.

To simplify the situation, we focus on two ethnicity levels only: African-Americans and Caucasians.

In [None]:
dat = dat.loc[(dat['ethnicity'] == 'Caucasian') | (dat['ethnicity'] == 'African_American')]
dat.describe()

So the majority of individuals are still included. We will treat African-Americans as the "protected" class.
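As a quick check on the claim that most individuals survive the filter, the share of retained rows can be computed from the filter mask. This is a minimal sketch on a made-up toy frame (the rows below are not the COMPAS data; only the column name `ethnicity` and the two group labels come from the notebook). In the notebook itself, the mask would have to be computed on `dat` before it is overwritten by the filtered version.

```python
import pandas as pd

# Toy stand-in for the unfiltered frame (made-up rows, not the real data)
toy = pd.DataFrame({
    "ethnicity": ["Caucasian", "African_American", "Hispanic",
                  "African_American", "Other", "Caucasian"],
})

keep = ["Caucasian", "African_American"]
mask = toy["ethnicity"].isin(keep)   # equivalent to the two == tests joined by |
subset = toy.loc[mask]

share_kept = mask.mean()             # fraction of rows retained by the filter
group_sizes = subset["ethnicity"].value_counts()

print(f"share of rows kept: {share_kept:.2f}")
print(group_sizes)
```

Using `isin` keeps the filter readable if more groups are added to the comparison later.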

We begin by exploring the data and assessing fairness manually; we will then explore how to use the Aequitas package in this setting.

### COMPAS scores vs. recidivism rates

Let's start by comparing the COMPAS scores between the two ethnic groups (see also the density plots from the fairness package above):

In [None]:
dat.loc[dat['ethnicity'] == 'Caucasian', 'probability']

In [None]:
# Scores for each group, in the order used on the x-axis
scores_cauc = dat.loc[dat['ethnicity'] == 'Caucasian', 'probability']
scores_afam = dat.loc[dat['ethnicity'] == 'African_American', 'probability']

plt.title("Score distribution by group")
plt.xlabel("Group: 1 = Caucasian, 2 = African-American")
plt.boxplot([scores_cauc, scores_afam])
plt.show()

The distribution of scores in the African-American group has a higher median and higher percentiles than the scores in the Caucasian group.
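The comparison of medians and percentiles read off the boxplot can also be tabulated directly. The sketch below uses a made-up toy frame with the notebook's column names `ethnicity` and `probability`; the numbers are illustrative only and say nothing about the actual COMPAS scores.

```python
import pandas as pd

# Toy scores (made-up values, chosen only to illustrate the computation)
toy = pd.DataFrame({
    "ethnicity": ["Caucasian"] * 4 + ["African_American"] * 4,
    "probability": [0.1, 0.2, 0.3, 0.4, 0.3, 0.4, 0.5, 0.6],
})

# Quartiles of the score within each group: one row per group,
# one column per quantile level
quartiles = (
    toy.groupby("ethnicity")["probability"]
       .quantile([0.25, 0.5, 0.75])
       .unstack()
)
print(quartiles)
```

Applied to `dat`, the same call would give the numbers behind the boxes in the plot, which makes the group comparison easier to quote than a visual reading.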

However, when we consider the number of re-offenders in each group, we see the following:

In [None]:
aq_palette = sns.diverging_palette(225, 35, n=2)
by_race = sns.countplot(x="ethnicity", hue="Two_yr_Recidivism", data=dat[dat.ethnicity.isin(['African_American', 'Caucasian'])], palette=aq_palette)

Let's consider the raw share of re-offenders in each of the two groups. For Caucasians:

In [None]:
np.average(dat.loc[dat['ethnicity'] == 'Caucasian']['Two_yr_Recidivism'] == 'yes')

And for the protected group:

In [None]:
np.average(dat.loc[dat['ethnicity'] == 'African_American']['Two_yr_Recidivism'] == 'yes')

So the higher COMPAS scores in the African-American group can be interpreted, at least in part, as reflecting the higher observed recidivism rate for that group in this particular population.
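The two group-level rates can also be obtained in a single step, together with the counts that underlie the bar chart. This is a minimal sketch on made-up toy rows; it assumes, as in the cells above, that `Two_yr_Recidivism` is coded as the strings `'yes'` and `'no'`.

```python
import pandas as pd

# Toy outcomes (made-up rows, not the real data)
toy = pd.DataFrame({
    "ethnicity": ["Caucasian"] * 4 + ["African_American"] * 4,
    "Two_yr_Recidivism": ["no", "no", "no", "yes", "no", "yes", "yes", "no"],
})

# Boolean indicator of re-offending; its group mean is the recidivism rate
reoffended = toy["Two_yr_Recidivism"].eq("yes")
rates = reoffended.groupby(toy["ethnicity"]).mean()

# Counts per group and outcome, the table behind a count plot
counts = pd.crosstab(toy["ethnicity"], toy["Two_yr_Recidivism"])

print(rates)
print(counts)
```

Keeping the rates in one Series makes it straightforward to compare groups or compute differences without repeating the filter for each group.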

Let's look at this in more detail: We plot the average scores for those who do not re-offend and for those who do, and we compare them to the percentage of re-offenders based on the decision that was suggested by the algorithm.