# **Method 2 - Normalizing the Predictions:**

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import plotly.figure_factory as ff
from sklearn.metrics import accuracy_score
pd.options.mode.chained_assignment = None  # default='warn'
dataURL = 'https://raw.githubusercontent.com/propublica/compas-analysis/master/compas-scores-two-years.csv'
data = pd.read_csv(dataURL)

## **Data description:**
ProPublica obtained the COMPAS scores of pretrial defendants from the Broward County Sheriff’s Office in Florida for 2013 – 2014.
Each pretrial defendant received at least three COMPAS scores, each ranging from 1 to 10, with 10 being the highest risk:
1. **decile_score**- Risk of recidivism
2. **v_decile_score**- Risk of violence
3. Risk of Failure to Appear
<br>

We are also given two category-based evaluations, labeled **“High”** (8 – 10), **“Medium”** (5 – 7) and **“Low”** (1 – 4):
1. **score_text**-  Risk of recidivism category
2. **v_score_text**- Risk of violence category
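
As a minimal sketch of how the decile scores relate to the category labels, the snippet below bins a 1 to 10 score into the three ranges listed above. The helper `decile_to_category` is hypothetical and is not part of the dataset or of this notebook; it only illustrates the ranges.

```python
import pandas as pd

# Hypothetical helper (an assumption for illustration only):
# map a 1-10 decile score to the Low / Medium / High ranges listed above.
def decile_to_category(deciles):
    return pd.cut(
        pd.Series(deciles),
        bins=[0, 4, 7, 10],               # (0,4] -> Low, (4,7] -> Medium, (7,10] -> High
        labels=["Low", "Medium", "High"],
    )

example = decile_to_category([1, 4, 5, 7, 8, 10])
print(example.tolist())
```

The bin edges are right-inclusive, so 4 falls in "Low" and 7 in "Medium", matching the ranges in the text.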


**days_b_screening_arrest**- number of days between the arrest and the COMPAS assessment

**c_charge_degree**- the degree of the charge

**priors_count**- number of prior offences

**is_recid**- yes/no indicator of whether the defendant reoffended (-1 marks rows that are filtered out in the preprocessing step)

**two_year_recid**- actual result over a two-year period 

**is_violent_recid**- yes/no indicator of whether the defendant committed a violent offence after the assessment

**juv_misd_count**- number of juvenile misdemeanor crimes

**juv_fel_count**- number of juvenile felony crimes

**juv_other_count**- number of juvenile crimes with a degree other than misdemeanor or felony



## **Data preprocessing:**
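The underlying data from Broward County include only rows representing people who had either recidivated within two years or had at least two years outside of a correctional facility. In the cell below we additionally keep only rows where the COMPAS screening took place within 30 days of the arrest, drop rows where `is_recid` is -1 or the charge degree is 'O', and restrict the data to African-American and Caucasian defendants.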

In [None]:
df = (data
      .loc[(data['days_b_screening_arrest'] <= 30) & (data['days_b_screening_arrest'] >= -30), :]
      .loc[data['is_recid'] != -1, :]
      .loc[data['c_charge_degree'] != 'O', :])
df.reset_index(inplace = True)
df=df[['age', 'c_charge_degree', 'race', 'age_cat', 'score_text', 'sex', 'priors_count', 'days_b_screening_arrest', 'decile_score', 'is_recid', 'two_year_recid', 'c_jail_in', 'c_jail_out','is_violent_recid','v_decile_score', 'v_score_text','juv_misd_count', 'juv_other_count','juv_fel_count']]
df=df[(df['race']=='African-American') | (df['race']=='Caucasian')]
cat = ['score_text','age_cat','sex','race','c_charge_degree','v_score_text']

df.loc[:,cat] = df.loc[:,cat].astype('category')
df = pd.get_dummies(data = df, columns=cat)
new_column_names = [col.lstrip().rstrip().lower().replace(" ", "_").replace("-", "_") for col in df.columns]
df.columns = new_column_names
df['v_score_text_high'] = df['v_score_text_medium'] + df['v_score_text_high']
df['score_text_high'] = df['score_text_medium'] + df['score_text_high']
sensetive_feat='race'
df.head()


Unnamed: 0,age,priors_count,days_b_screening_arrest,decile_score,is_recid,two_year_recid,c_jail_in,c_jail_out,is_violent_recid,v_decile_score,...,age_cat_less_than_25,sex_female,sex_male,race_african_american,race_caucasian,c_charge_degree_f,c_charge_degree_m,v_score_text_high,v_score_text_low,v_score_text_medium
1,34,0,-1.0,3,1,1,2013-01-26 03:45:27,2013-02-05 05:36:53,1,1,...,0,0,1,1,0,1,0,0,1,0
2,24,4,-1.0,4,1,1,2013-04-13 04:58:34,2013-04-14 07:02:04,0,3,...,1,0,1,1,0,1,0,0,1,0
4,41,14,-1.0,6,1,1,2014-02-18 05:08:24,2014-02-24 12:18:30,0,2,...,0,0,1,0,1,1,0,0,1,0
6,39,0,-1.0,1,0,0,2014-03-15 05:35:34,2014-03-18 04:28:46,0,1,...,0,1,0,0,1,0,1,0,1,0
7,27,0,-1.0,4,0,0,2013-11-25 06:31:06,2013-11-26 08:26:57,0,4,...,0,0,1,0,1,1,0,0,1,0


## **Useful Functions:**

In [None]:
# plot the distribution of the predictions of African-American and Caucasian defendants:
def plot_dist(df,pred):
  black=df[df['race_african_american']==1]
  white=df[df['race_caucasian']==1]
  y_hat_b = black[pred]
  y_hat_w = white[pred]
  hist_data = [y_hat_b, y_hat_w]
  group_labels = ['black', 'white']
  colors = ['#A56CC1', '#A6ACEC']

  # Create distplot with curve_type set to 'normal'
  fig = ff.create_distplot(hist_data, group_labels, colors=colors,bin_size=.1, show_rug=False,curve_type='normal')
  fig.update_layout(title_text='Distribution Of The Predictions Of African-American & Caucasian Defendants',title_x=0.5)
  fig.show()

# compute P[predicted recidivism | race, actual outcome]:
def compute_prob(df,race,prediction,recid):
    numerator=len(df[(df[prediction]>0.5) & (df[race]==1) & (df["two_year_recid"]==recid)])
    denominator=len(df[(df[race]==1) & (df["two_year_recid"]==recid)])
    return numerator/float(denominator)

# print the probabilities for Equalized Odds parity:
def print_prob(df,pred='probability'):
  black_recid=compute_prob(df,'race_african_american',pred,1)
  print("P[recidivism predicted | african_american,recidivism]={}".format(black_recid))

  white_recid=compute_prob(df,'race_caucasian',pred,1)
  print("P[recidivism predicted | caucasian, recidivism]={}".format(white_recid))

  print("The diffrence:{}".format(np.abs(black_recid-white_recid)))

  print("\n")

  black_no_recid=compute_prob(df,'race_african_american',pred,0)
  print("P[recidivism predicted | african_american,no recidivism]={}".format(black_no_recid))

  white_no_recid=compute_prob(df,'race_caucasian',pred,0)
  print("P[recidivism predicted | caucasian, no recidivism]={}".format(white_no_recid))

  print("The diffrence:{}".format(np.abs(black_no_recid-white_no_recid)))


# **Logistic Regression:**
Logistic regression models the probabilities for classification problems with two possible outcomes. For example: given the parameters, will the defendant reoffend or not?

It’s an extension of the linear regression model to classification problems.
The logistic regression model uses the logistic (sigmoid) function:

$$\sigma(w^T x_i) = \frac{\exp(w^T x_i)}{1+\exp(w^T x_i)}$$

If we feed the linear score $w^T x_i$ to the sigmoid function, it returns a value between 0 and 1, which we interpret as the probability that the label is 1. This probability serves as the model's confidence in its prediction.

If $\sigma(w^T x_i) \geq 0.5$ then the label will be 1 (will reoffend), otherwise it will be 0 (won't reoffend).
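
As a small illustration of the decision rule above, the following sketch evaluates the sigmoid on a single example. The weights `w` and the feature vector `x` are made-up numbers chosen for illustration, not values learned from the COMPAS data.

```python
import numpy as np

def sigmoid(z):
    # Algebraically equal to exp(z) / (1 + exp(z))
    return 1.0 / (1.0 + np.exp(-z))

# Made-up weights and features (assumptions for illustration only)
w = np.array([0.8, -0.5])
x = np.array([1.0, 2.0])

score = w @ x                # linear score w^T x = 0.8*1.0 - 0.5*2.0
prob = sigmoid(score)        # probability that the label is 1
label = int(prob >= 0.5)     # apply the 0.5 decision threshold

print(score, prob, label)
```

Because the linear score is negative here, the probability falls below 0.5 and the predicted label is 0.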

## **Train The Model:**

In [None]:
features=['sex_female', 'age_cat_greater_than_45', 'age_cat_less_than_25','race_african_american','race_caucasian','priors_count' ,'c_charge_degree_m','juv_misd_count', 'juv_other_count','juv_fel_count']
cols=features.copy()
cols.append("two_year_recid")
X=df[cols]

# Split into train and test datasets:
train,test=train_test_split(X,test_size=0.2, random_state=42)
X_train=train[features]
y_train=train["two_year_recid"]
X_test=test[features]
y_test=test["two_year_recid"]

# Train with logistic regression:
model=LogisticRegression()
model.fit(X_train, y_train)

# Add the predictions for the training dataset:
train['probability']=model.predict_proba(X_train)[:, 1]


<h3>Let's observe the distribution of the predictions for African-American defendants and Caucasian defendants:</h3>


In [None]:
plot_dist(train,'probability')

<h4>We can clearly see that the distributions of the predicted probabilities differ substantially between the two races.</h4>

## **Equalized Odds parity:**
We saw that there is a clear bias in the dataset: Black defendants were more likely than their white counterparts to be misclassified as higher risk.

In order to achieve fairness we will try to achieve Equalized Odds parity.

**Reminder:** Equalized Odds parity requires that, among defendants with label 1 in the training set, both races have the same rate of positive predictions, and likewise among defendants with label 0 in the training set.

This means that defendants who reoffended are equally likely to be predicted to reoffend regardless of race. Similarly, defendants who did not reoffend are equally likely to be predicted to reoffend regardless of race.

In mathematical terms:

\begin{align}
       TPR_{\text{African-American}}=TPR_{\text{Caucasian}}
    \end{align}

<h4> <center> P[predicted recidivism | African-American, recidivism] = P[predicted recidivism | Caucasian, recidivism] </center></h4>

<h4> <center>and</center></h4>

\begin{align}
       FPR_{\text{African-American}}=FPR_{\text{Caucasian}}
    \end{align}
<h4> <center>P[predicted recidivism | African-American, no recidivism] = P[predicted recidivism | Caucasian, no recidivism] </center></h4>


**We can note that** TPR = 1 - FNR, thus minimizing the difference between the TPR of the two groups will also minimize the difference between the FNR of the two groups.
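
To make the two conditions concrete, here is a minimal sketch that computes TPR and FPR per group on a tiny made-up table and reports the gaps between the groups. The column names `group`, `y_true`, `y_pred` and the helper `group_rates` are hypothetical and are not taken from the COMPAS data.

```python
import numpy as np
import pandas as pd

def group_rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels and binary predictions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tpr = y_pred[y_true == 1].mean()   # P[pred = 1 | label = 1]
    fpr = y_pred[y_true == 0].mean()   # P[pred = 1 | label = 0]
    return tpr, fpr

# Made-up toy data (an assumption for illustration only)
toy = pd.DataFrame({
    "group":  ["a", "a", "a", "a", "b", "b", "b", "b"],
    "y_true": [1, 1, 0, 0, 1, 1, 0, 0],
    "y_pred": [1, 1, 1, 0, 1, 0, 0, 0],
})

rates = {g: group_rates(sub["y_true"], sub["y_pred"]) for g, sub in toy.groupby("group")}
tpr_gap = abs(rates["a"][0] - rates["b"][0])
fpr_gap = abs(rates["a"][1] - rates["b"][1])
fnr_gap = abs((1 - rates["a"][0]) - (1 - rates["b"][0]))  # same as tpr_gap

print(rates, tpr_gap, fpr_gap, fnr_gap)
```

Equalized Odds parity corresponds to both `tpr_gap` and `fpr_gap` being zero. The FNR gap always equals the TPR gap, which is the identity noted above.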

In [None]:
print_prob(train)

P[recidivism predicted | african_american,recidivism]=0.6849212303075769
P[recidivism predicted | caucasian, recidivism]=0.37155963302752293
The diffrence:0.31336159728005397


P[recidivism predicted | african_american,no recidivism]=0.34111759799833197
P[recidivism predicted | caucasian, no recidivism]=0.15154440154440155
The diffrence:0.18957319645393042


With the traditional model the TPR gap is about 0.31 and the FPR gap is about 0.19, so we clearly can't achieve Equalized Odds parity.

# **So what can we do?**


## **Normalize the prediction:**
If we want to make the distributions of the predictions more similar, we can normalize the predictions for African-American defendants: divide them by the African-American prior and multiply them by the Caucasian prior, leaving the predictions for Caucasian defendants as they are.

This means we use only the probability of recidivism given race as an estimator of our bias, so we can normalize the predictions **"Post-Training"**, without retraining the model.

<h4><b>African-American-Prior:</b></h4>
\begin{align}
        P(\text{Recidivism} \mid \text{African-American}) = \frac{P(\text{African-American}, \text{Recidivism})}{P(\text{African-American})}
    \end{align}

<br><br>
<h4><b>Caucasian-Prior:</b></h4>

\begin{align}
      P(\text{Recidivism} \mid \text{Caucasian}) = \frac{P(\text{Caucasian}, \text{Recidivism})}{P(\text{Caucasian})}
    \end{align}

In [None]:
black_prior=len(train[(train['race_african_american']==1) & (train["two_year_recid"]==1)])/len(train[(train['race_african_american']==1)])
white_prior=len(train[(train['race_caucasian']==1) & (train["two_year_recid"]==1)])/len(train[(train['race_caucasian']==1)])
print('P[recidivism|african american]={:.2f}'.format(black_prior))
print('P[recidivism|caucasian]={:.2f}'.format(white_prior))

P[recidivism|african american]=0.53
P[recidivism|caucasian]=0.39


In [None]:
# rescale the prediction of African-American defendants by white_prior/black_prior:
def norm_prior(row):
  if row['race_african_american']==1:
    return (row['probability']/black_prior)*white_prior
  return row['probability']

# add the raw and the normalized predictions to df:
def compute(X,df):
  df['probability']=model.predict_proba(X)[:, 1]
  df['old_predection']=model.predict(X)
  df['norm_probability']=df.apply(norm_prior,axis=1)
  df['new_predection']=np.where(df['norm_probability']>0.5, 1,0)
  return df
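
One way to read this normalization: since the rescaled probability is compared with 0.5, it is equivalent to raising the decision threshold on the raw probability for African-American defendants. The sketch below works this out using the rounded priors printed above (0.53 and 0.39), so the numbers are approximate. The names `scale` and `effective_threshold` are introduced here for illustration only.

```python
# Rounded priors taken from the printed output above (approximate values)
black_prior = 0.53
white_prior = 0.39

# Factor applied to the raw probability of African-American defendants
scale = white_prior / black_prior

# norm_probability > 0.5  <=>  probability > 0.5 / scale
effective_threshold = 0.5 / scale

print(round(scale, 3), round(effective_threshold, 3))

# A raw probability of 0.6 is no longer classified as "will reoffend",
# while a raw probability of 0.7 still is.
print(0.6 * scale > 0.5, 0.7 * scale > 0.5)
```

Under these rounded priors, an African-American defendant is predicted to reoffend only when the raw probability exceeds roughly 0.68, while the threshold for Caucasian defendants stays at 0.5.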

In [None]:
train=compute(X_train,train)
print_prob(train,pred='norm_probability')

P[recidivism predicted | african_american,recidivism]=0.29632408102025504
P[recidivism predicted | caucasian, recidivism]=0.37155963302752293
The diffrence:0.07523555200726789


P[recidivism predicted | african_american,no recidivism]=0.08006672226855713
P[recidivism predicted | caucasian, no recidivism]=0.15154440154440155
The diffrence:0.07147767927584442


In [None]:
plot_dist(train,'norm_probability')

<h3>We can see that the differences between the probabilities in the train dataset are now small (about 0.075 and 0.071). </h3>

<h3><b>Did we achieve Equalized Odds parity?</b></h3>




## **Validate the Result on the Test dataset:**
To determine whether the normalization generalizes and is not fitted only to the train dataset, let's check whether we also get small differences between the probabilities in the test dataset.

In [None]:
test=compute(X_test,test)
print_prob(test,pred='norm_probability')

P[recidivism predicted | african_american,recidivism]=0.2682926829268293
P[recidivism predicted | caucasian, recidivism]=0.32142857142857145
The diffrence:0.05313588850174217


P[recidivism predicted | african_american,no recidivism]=0.07936507936507936
P[recidivism predicted | caucasian, no recidivism]=0.1510204081632653
The diffrence:0.07165532879818595


In [None]:
plot_dist(test,'norm_probability')

We can see from the graph that the distributions for African-American and Caucasian defendants are closely aligned, which is consistent with the small differences between the probabilities in the test dataset.

Since the differences in both the train dataset and the test dataset are small (all below 0.08), we can say that we have approximately reached Equalized Odds parity.

# **What about the accuracy?**

Let's observe what happens to the accuracy after changing the distribution of the predicted risk of reoffending for African-American defendants.

In [None]:
# compare the accuracy of the raw ('old') and normalized ('new') predictions:
def acc_table(df):
  index=['old','new']
  preds=['probability','norm_probability']

  label=df['two_year_recid']
  groups = ["overall", "race_african_american", "race_caucasian"]
  table = pd.DataFrame(index=index, columns=groups)
  for group in groups:
    # boolean mask selecting the rows of the current group
    if group in ["race_african_american", "race_caucasian"]:
      subset=(df[group]==1)
    else:
      subset=np.full(label.shape, True)
    acc_lst=[]
    for pred in preds:
      y_true=label[subset]
      y_pred=np.where(df[pred]>=0.5,1,0)
      y_sub_pred=y_pred[subset]
      acc_sub=accuracy_score(y_true, y_sub_pred)
      acc_lst.append(acc_sub)
    table[group] = acc_lst
  table.columns=["overall", "African-American", "Caucasian"]
  return table

acc_table(test)

| |overall|African-American|Caucasian|
|---|---|---|---|
|old|0.650568|0.660964|0.634383|
|new|0.606061|0.587869|0.634383|


What can we infer from those results?