# ADS Project 4 - Group 1

In this project, we draw on two papers to study the trade-off between fairness and accuracy. We apply their algorithms to the COMPAS dataset to explore whether some features lead to discrimination or affect accuracy when predicting whether an individual will be arrested again in the future.

Papers:
* Algorithm 1: Information Theoretic Measures for Fairness-aware Feature selection https://arxiv.org/abs/2106.00772.
* Algorithm 2: Learning Fair Representations http://proceedings.mlr.press/v28/zemel13.html

Dataset: COMPAS dataset
* Response variable: "two_year_recid" indicating whether the individual was arrested for a crime within 2 years of release
* Sensitive feature: "race", African-American = 0, Caucasian = 1
* Other Features:
 * "sex", female = 0, male = 1
 * "age", less than 25 = 0, age 25-45 = 1, age older than 45 = 2
 * "c_charge_degree", misdemeanor = 0, felony = 1
 * "priors_count", number of prior offenses
 * "length_of_stay", days elapsed between jail entry and release
 * "decile_score", the COMPAS score, which measures both dynamic risk (criminogenic factors) and static risk (historical factors)

# Data importing and data cleaning
We select fields for severity of charge, number of priors, demographics, age, sex, COMPAS scores, and whether each person was accused of a crime within two years.

Link: https://github.com/propublica/compas-analysis/

In [54]:
import pandas as pd
import numpy as np

In [63]:
data = pd.read_csv('compas-scores-two-years.csv')
mask = (data.race.str.contains('African-American') ) | (data.race.str.contains('Caucasian'))
data = data[mask].copy()  # copy so the column assignment below does not write to a view
# Generate feature 'length_of_stay' (in days)
length_of_stay = pd.to_datetime(data["c_jail_out"]) - pd.to_datetime(data["c_jail_in"])
data.loc[:, "length_of_stay"] = length_of_stay / pd.Timedelta(days=1)
data = data.drop(columns=['c_jail_in', 'c_jail_out'])
raw = data
# Select features
data = data[['two_year_recid','age', 'c_charge_degree', 'race', 'age_cat', 'score_text', 'sex', 'priors_count',
                    'days_b_screening_arrest', 'decile_score', 'is_recid','length_of_stay']]

There are several reasons to remove rows because of missing data:

1. If the charge date of a defendant's COMPAS-scored crime was not within 30 days of the arrest date, we assume that, for data quality reasons, we do not have the right offense.
2. The recidivist flag, is_recid, is coded as -1 if no COMPAS case could be found at all.
3. In a similar vein, ordinary traffic offenses (those with a c_charge_degree of 'O'), which do not result in jail time, are removed (only two of them).
4. We filtered the underlying data from Broward County to include only those rows representing people who had either recidivated within two years or had at least two years outside of a correctional facility.

Reference: https://github.com/propublica/compas-analysis/blob/master/Compas%20Analysis.ipynb

In [64]:
data = data[
    (data['days_b_screening_arrest'] <= 30) &
    (data['days_b_screening_arrest'] >= -30) &
    (data['is_recid'] != -1) &
    (data['c_charge_degree'] != "O") &
    (data['score_text'] != 'N/A')
]
data = data.reset_index(drop = True)

# Baseline Model

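We choose logistic regression and SVM as baseline models. We measure fairness with calibration, computed here as the difference in prediction accuracy between the two racial groups, in percentage points. The results show that the predictions are not equally accurate for both groups, which indicates discrimination within these features.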

In [65]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

## Generate training and testing dataset

In [66]:
data = data[['two_year_recid',"race","sex","age_cat","c_charge_degree","priors_count","decile_score","length_of_stay"]]
age_map = {"Less than 25": 0, "25 - 45": 1, "Greater than 45": 2}
data.loc[:,'age_cat']= data['age_cat'].replace(age_map)
data.loc[:,'race'] = data['race'].replace({'African-American': 0, 'Caucasian': 1})
data.loc[:,'sex']= data['sex'].replace({'Female': 0, 'Male': 1})
data.loc[:,'c_charge_degree'] = data['c_charge_degree'].replace({'M': 0, 'F': 1})
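# Note: these bins are defined but never applied (for example with np.select),
# so 'priors_count' stays a raw count in the output below.
conditions_priors = [
  data['priors_count'] == 0,
  data['priors_count'] > 3
]
choices_priors = [0, 2]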


data.head()

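| | two_year_recid | race | sex | age_cat | c_charge_degree | priors_count | decile_score | length_of_stay |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 1 | 1 | 1 | 0 | 3 | 10.077384 |
| 1 | 1 | 0 | 1 | 0 | 1 | 4 | 4 | 1.085764 |
| 2 | 1 | 1 | 1 | 1 | 1 | 14 | 6 | 6.298681 |
| 3 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 2.953611 |
| 4 | 0 | 1 | 1 | 1 | 1 | 0 | 4 | 1.080451 |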


In [67]:
# Columns used for the split; the sensitive attribute is kept only to evaluate fairness
sensitive = "race"
features = ["sex","age_cat","c_charge_degree","priors_count","length_of_stay","decile_score","race"]

x_train, x_test, y_train, y_test = train_test_split(data[features], data['two_year_recid'], test_size=0.3, random_state=42)

# iloc[:, 0:5] keeps the first five columns as model inputs,
# which drops "decile_score" and "race"
race_train = x_train[sensitive].reset_index(drop=True)
x_train = x_train.iloc[:,0:5].reset_index(drop=True)
y_train = y_train.reset_index(drop=True)
race_test = x_test[sensitive].reset_index(drop=True)
x_test = x_test.iloc[:,0:5].reset_index(drop=True)
y_test = y_test.reset_index(drop=True)

## Logistic Regression

In [68]:
def calculate_calibration(sensitive_attr, predictions, true_labels):
  # Indices for different groups
  caucasian_idx = np.where(sensitive_attr == 1)[0]
  african_idx = np.where(sensitive_attr == 0)[0]

  # Predictions for Caucasians
  pred_caucasian = predictions[caucasian_idx]
  true_caucasian = true_labels[caucasian_idx]
  accuracy_caucasian = np.mean(pred_caucasian == true_caucasian)

  # Predictions for African-Americans
  pred_african = predictions[african_idx]
  true_african = true_labels[african_idx]
  accuracy_african = np.mean(pred_african == true_african)

  # Calibration calculation (accuracy difference)
  calibration = (accuracy_caucasian - accuracy_african) * 100
  return calibration
# Create logistic regression model
logistic_model = LogisticRegression(random_state=0).fit(x_train, y_train)
# Evaluate the model
train_accuracy = logistic_model.score(x_train, y_train)
test_accuracy = logistic_model.score(x_test, y_test)
train_calibration = calculate_calibration(race_train, logistic_model.predict(x_train), y_train)
test_calibration = calculate_calibration(race_test, logistic_model.predict(x_test), y_test)
# Summary DataFrame
summary = {
    "Methods": ["Logistic Regression", "Logistic Regression"],
    "Set": ["Train", "Test"],
    "Accuracy": [train_accuracy, test_accuracy],
    "Calibration": [train_calibration, test_calibration]
}

summary_df = pd.DataFrame(summary)
print(summary_df)

               Methods    Set  Accuracy  Calibration
0  Logistic Regression  Train  0.671630    -0.717329
1  Logistic Regression   Test  0.662247    -1.398018


## SVM

In [69]:
def calculate_calibration(sensitive_attr, predictions, true_labels):
    """
    Calculate calibration difference between groups defined in sensitive attributes.

    Parameters:
        sensitive_attr (array): Array containing the sensitive attribute (binary) where 1 and 0 represent groups.
        predictions (array): Array containing the model predictions.
        true_labels (array): Array containing the actual labels.

    Returns:
        float: The calibration value as the percentage difference in accuracy between the two groups.
    """
    # Group indices
    group1_idx = np.where(sensitive_attr == 1)[0]  # e.g., Caucasian
    group0_idx = np.where(sensitive_attr == 0)[0]  # e.g., African-American

    # Accuracy for group 1
    pred_group1 = predictions[group1_idx]
    true_group1 = true_labels[group1_idx]
    accuracy_group1 = np.mean(pred_group1 == true_group1)

    # Accuracy for group 0
    pred_group0 = predictions[group0_idx]
    true_group0 = true_labels[group0_idx]
    accuracy_group0 = np.mean(pred_group0 == true_group0)

    # Calibration as the difference in accuracies
    calibration = (accuracy_group1 - accuracy_group0) * 100
    return calibration
# Train the SVM model
svm_model = SVC(kernel='linear', probability=True, random_state=0)
svm_model.fit(x_train, y_train)

# Model evaluation
train_accuracy = svm_model.score(x_train, y_train)
test_accuracy = svm_model.score(x_test, y_test)
train_calibration = calculate_calibration(race_train, svm_model.predict(x_train), y_train)
test_calibration = calculate_calibration(race_test, svm_model.predict(x_test), y_test)

# Summary dictionary
summary = {
    "Methods": ["SVM", "SVM"],
    "Set": ["Train", "Test"],
    "Accuracy": [train_accuracy, test_accuracy],
    "Calibration": [train_calibration, test_calibration]
}

# Create DataFrame and display
summary_df = pd.DataFrame(summary)
print(summary_df)

  Methods    Set  Accuracy  Calibration
0     SVM  Train  0.677044     0.179035
1     SVM   Test  0.657197    -0.835193


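From the logistic regression and SVM baseline outputs, we observe an accuracy gap between the two groups on the test set (about -1.40 and -0.84 percentage points, respectively), so the fairness of the predictions needs to improve. Next, we apply the two algorithms to explore how features affect discrimination and accuracy.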

# Algorithm 1

Information Theoretic Measures for Fairness-aware Feature selection https://arxiv.org/abs/2106.00772.

In [10]:
from collections import defaultdict
from itertools import chain, combinations
import itertools
import math
import copy

In [5]:
df = raw[["two_year_recid","race","sex","age","c_charge_degree","priors_count","length_of_stay"]]
df = df.dropna()         # drop NA
df = df.loc[(df["length_of_stay"] > 0)]

df.loc[:,'race'] = df['race'].replace({'African-American': 0, 'Caucasian': 1})
df.loc[:,'sex']= df['sex'].replace({'Female': 0, 'Male': 1})
df.loc[:,'c_charge_degree'] = df['c_charge_degree'].replace({'M': 0, 'F': 1})
df['age'] = df['age'].apply(lambda a: 0 if a < 25 else (2 if a > 45 else 1))
df['priors_count'] = df['priors_count'].apply(lambda x: 0 if x == 0 else (2 if x > 3 else 1))
df['length_of_stay'] = pd.cut(df['length_of_stay'], bins = [0, 7, 90, np.inf], labels = [0, 1, 2])


In [6]:
def unique_values(arr):
    return [np.unique(arr[:, col]).tolist() for col in range(arr.shape[1])]
def info(x, y):
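    """
    Compute an information score between X and Y.
    Parameters: x, y are 2-D np.array with one row per sample.
    Note: each term below is |p(x,y) * log(p(x,y) / p(x)) / p(x)|, which differs
    from the textbook mutual information term p(x,y) * log(p(x,y) / (p(x) * p(y))).
    """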
    concated = np.concatenate((x, y), axis=1)
    uniarr = unique_values(concated)
    cartesian_product = list(itertools.product(*uniarr))
    nrows = x.shape[0]
    ncol_x = x.shape[1]

    info = 0
    for i in cartesian_product:
        pxy = len(np.where((concated == i).all(axis=1))[0]) / nrows
        px = len(np.where((x == i[:ncol_x]).all(axis=1))[0]) / nrows
        py = len(np.where((y == i[ncol_x:]).all(axis=1))[0]) / nrows

        if pxy == 0 or px == 0 or py == 0:
            ivalue = 0
        else:
            ivalue = pxy * np.log(pxy / px) / px
        info += np.abs(ivalue)
    return info

def conditional_info(x, y, conditional):
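    """
    Compute a conditional information score for X and Y given `conditional`.
    Parameters: x, y, conditional are 2-D np.array with one row per sample.
    Note: each term below is |p * log(p / py) / x_gy|, where p is the joint
    probability of the full row. This differs from the textbook conditional
    mutual information term p(x,y,c) * log(p(x,y|c) / (p(x|c) * p(y|c))).
    """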
    ycond = np.concatenate((y, conditional), axis=1)
    xycond= np.concatenate((x, ycond), axis=1)
    uniarr = unique_values(xycond)
    cartesian_product = list(itertools.product(*uniarr))
    nrows = x.shape[0]
    ncol_x = x.shape[1]
    ncol_y = y.shape[1]

    ci = 0
    for i in cartesian_product:
        pxy = len(np.where((xycond == i).all(axis=1))[0]) / nrows
        px    = len(np.where((x == i[:ncol_x]).all(axis=1))[0]) / nrows
        py    = len(np.where((xycond[:, ncol_x: -ncol_y] == i[ncol_x: -ncol_y]).all(axis=1))[0]) / nrows
        s1 = (xycond[:, :ncol_x] == i[ :ncol_x]).all(axis=1)
        s2 = (xycond[:, -ncol_y:] == i[-ncol_y:]).all(axis=1)

        if(len(np.where(s2)[0])==0):
          x_gy = 0
        else:
          x_gy = len(np.where(s1 & s2)[0]) / len(np.where(s2)[0])

        if pxy == 0 or px == 0 or py == 0 or x_gy == 0:
            cvalue = 0
        else:
            cvalue = pxy * np.log(pxy / py) / x_gy
        ci += np.abs(cvalue)
    return ci

def acc_coef(y, x_s, x_s_c, a):
    """
    Compute accuracy coefficient.
    """
    conditional = np.concatenate((x_s_c, a), axis=1)
    return conditional_info(y, x_s, conditional)

def disc_coef(y, x_s, a):
    """
    Compute discrimination coefficient.
    """
    x_s_a = np.concatenate((x_s, a), axis=1)
    si = info(y,x_s_a)-info(y, a)-info(y, x_s)
    ui = info(x_s, a)
    uic = conditional_info(x_s, a, y)
    return np.abs(si* ui * uic)
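For comparison, the textbook mutual information is $I(X;Y) = \sum_{x,y} p(x,y)\log\frac{p(x,y)}{p(x)p(y)}$. The block below is a minimal sketch of a plug-in estimator for that definition, assuming numeric 2-D arrays with one row per sample. The name `mutual_information` is hypothetical and is not used elsewhere in this notebook. For a binary $Y$, this quantity is bounded above by $\log 2 \approx 0.693$ nats, which can serve as a sanity check on coefficient values.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for 2-D arrays of discrete columns."""
    x = np.asarray(x)
    y = np.asarray(y)
    n = x.shape[0]
    # Map each distinct row to an integer code.
    _, x_codes = np.unique(x, axis=0, return_inverse=True)
    _, y_codes = np.unique(y, axis=0, return_inverse=True)
    x_codes = x_codes.ravel()
    y_codes = y_codes.ravel()
    # Empirical joint distribution p(x, y).
    joint = np.zeros((x_codes.max() + 1, y_codes.max() + 1))
    np.add.at(joint, (x_codes, y_codes), 1.0)
    joint /= n
    # Marginals p(x) and p(y).
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells, where the term is defined as 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

same = np.array([[0], [0], [1], [1]])
indep = np.array([[0], [1], [0], [1]])
mi_same = mutual_information(same, same)    # equals the entropy of a fair coin
mi_indep = mutual_information(same, indep)  # independent columns
```

Identical columns give the entropy of the variable, and independent columns give zero, which are the two limiting cases of the definition.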


Aggregation of Effects

We use the Shapley value to aggregate the marginal accuracy and discrimination coefficients of each feature, where the value functions are defined over the power set of the features.

In [7]:

# Generate the power set
def power_set(seq):
    """
    Generate the power set of the input sequence.
    Output: list
    """
    if not seq:
        return [[]] # Base case
    rest = power_set(seq[1:])
    with_first = [[seq[0]] + x for x in rest]
    return rest + with_first


def marginal_accuracy(y, x, a, set_tracker):
    """
    Calculate marginal accuracy coefficient
    """
    n_features = x.shape[1]
    feature_idx = list(range(n_features))
    feature_idx.pop(set_tracker)
    power_set_features = [x for x in power_set(feature_idx) if len(x) > 0]

    shapley_value =0
    for sc_idx in power_set_features:
            weight = math.factorial(len(sc_idx)) * math.factorial(n_features - len(sc_idx) - 1) / math.factorial(n_features)

            # Compute v(T ∪ {i})
            idx_xs_ui = copy.copy(sc_idx) # create copy of subset list
            idx_xs_ui.append(set_tracker) # append feature index
            idx_xsc_ui = list(set(list(range(n_features))).difference(set(idx_xs_ui))) # complement of x_s
            v_with_i = acc_coef(y.reshape(-1, 1), x[:, idx_xs_ui], x[:, idx_xsc_ui], a.reshape(-1, 1))

            # Compute v(T)
            idx_xsc = list(range(n_features))
            idx_xsc.pop(set_tracker)
            idx_xsc = list(set(idx_xsc).difference(set(sc_idx)))
            v_without_i = acc_coef(y.reshape(-1, 1), x[:, sc_idx], x[:, idx_xsc], a.reshape(-1, 1))

            marginal = v_with_i - v_without_i
            shapley_value = shapley_value + weight * marginal
    return shapley_value

def marginal_discrimination(y, x, a, set_tracker):
    """
    Calculate marginal discrimination coefficient
    """
    n_features = x.shape[1]
    fidx = list(range(n_features))
    fidx.remove(set_tracker)
    subsets = list(power_set(fidx))
    a = a.reshape(-1, 1)
    y=y.reshape(-1, 1)
    shapley_value = 0
    for subset in subsets:
        if len(subset) == 0:
            continue
        subset_with_i = subset + [set_tracker]
        v_with_i = disc_coef(y, x[:, subset_with_i], a)
        v_without_i = disc_coef(y, x[:, subset], a)
        weight = math.factorial(len(subset)) * math.factorial(n_features - len(subset) - 1) / math.factorial(n_features)
        shapley_value += weight * (v_with_i - v_without_i)


    return shapley_value
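The Shapley value of feature $i$ is $\phi_i = \sum_{T \subseteq N \setminus \{i\}} \frac{|T|!\,(n-|T|-1)!}{n!}\,\big(v(T \cup \{i\}) - v(T)\big)$. The following is a minimal sketch of that formula on a toy game, where `shapley_values` and the additive value function are illustrative assumptions rather than part of the notebook. Unlike the two functions above, which skip the empty subset, this sketch also sums over the empty coalition, as the standard definition does.

```python
import math
from itertools import combinations

def shapley_values(n_players, value):
    """Exact Shapley values for a game given by value(frozenset) -> float."""
    players = range(n_players)
    phi = [0.0] * n_players
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):  # size 0 is the empty coalition
            weight = (math.factorial(size)
                      * math.factorial(n_players - size - 1)
                      / math.factorial(n_players))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Toy additive game: a coalition is worth the sum of its members' weights.
weights = [1.0, 2.0, 3.0]
phi = shapley_values(3, lambda s: sum(weights[p] for p in s))
```

In an additive game each player's Shapley value equals its own weight, and the values sum to the worth of the full coalition, which makes this a convenient check of the weighting.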

Generate the coefficient table

In [51]:
split_idx = int(len(df) * 0.7)
train1 = df[:split_idx]
test1 = df[split_idx:]

output = "two_year_recid"
protect = "race"
unprotect = ["sex","age","c_charge_degree","priors_count","length_of_stay"]
x_train1, y_train1, race_train1 = train1[unprotect], train1[output].to_numpy(), train1[protect]
x_test1, y_test1, race_test1 = test1[unprotect], test1[output].to_numpy(), test1[protect]

In [52]:

shap_acc = []
shap_disc = []
race_train1 = race_train1.to_numpy()
x_train1_arr = x_train1.to_numpy()
for track in range(5):
    acc_i = marginal_accuracy(y_train1, x_train1_arr, race_train1, track)
    disc_i = marginal_discrimination(y_train1, x_train1_arr, race_train1, track)
    shap_acc.append(acc_i)
    shap_disc.append(disc_i)

# DataFrame to compare Shapley values
shapley_df = pd.DataFrame({
    "Feature": x_train1.columns.tolist(),
    "Accuracy": shap_acc,
    "Discrimination": shap_disc
})

shapley_df

| | Feature | Accuracy | Discrimination |
|---|---|---|---|
| 0 | sex | 1.001571 | 580.438967 |
| 1 | age | 1.196479 | 735.033619 |
| 2 | c_charge_degree | 1.053048 | 575.387615 |
| 3 | priors_count | 1.231018 | 731.584021 |
| 4 | length_of_stay | 1.133595 | 693.71562 |


A large accuracy coefficient indicates that the feature contributes to predictive accuracy, and a large discrimination coefficient indicates that the feature is associated with the sensitive feature. 'priors_count' has the highest accuracy coefficient (1.231), so it is the most beneficial feature for accuracy. 'age' and 'priors_count' have the highest discrimination coefficients (735.0 and 731.6), so they are the features most strongly associated with race. We use the marginal accuracy coefficient $\phi_i^{Acc}$ and the marginal discrimination coefficient $\phi_i^D$ of feature $i$ to define the fairness-utility score $$\mathcal{F}_i = \phi_i^{Acc}-\alpha\phi_i^D$$ where $\alpha$ is a positive hyperparameter.

In [53]:
# Fairness-utility score
alpha  = 0.1
fairness = [shap_acc[i] - alpha * shap_disc[i] for i in range(len(shap_acc))]
shapley_df = pd.DataFrame({
    "Feature": x_train1.columns.tolist(),
    "Fairness_utility_score" : fairness
})
shapley_df

| | Feature | Fairness_utility_score |
|---|---|---|
| 0 | sex | -57.042325 |
| 1 | age | -72.306883 |
| 2 | c_charge_degree | -56.485714 |
| 3 | priors_count | -71.927384 |
| 4 | length_of_stay | -68.237967 |


With a large alpha, the fairness-utility score weights discrimination more heavily at the expense of accuracy. With alpha = 0.1, 'age' and 'priors_count' have the lowest scores, so they are the features to remove.

In [49]:
alpha  = 0.0001
fairness = [shap_acc[i] - alpha * shap_disc[i] for i in range(len(shap_acc))]
shapley_df = pd.DataFrame({
    "Feature": x_train1.columns.tolist(),
    "Fairness_utility_score" : fairness
})
shapley_df


| | Feature | Fairness_utility_score |
|---|---|---|
| 0 | sex | 0.943528 |
| 1 | age | 1.122976 |
| 2 | c_charge_degree | 0.995509 |
| 3 | priors_count | 1.157859 |
| 4 | length_of_stay | 1.064224 |


With a small alpha, the fairness-utility score is dominated by accuracy. With alpha = 0.0001, 'sex' and 'c_charge_degree' have the lowest scores, so they are the features to remove.
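The effect of alpha on feature selection can be reproduced directly from the coefficient table. The sketch below copies the reported accuracy and discrimination coefficients by hand and ranks features by their score. The helper name `lowest_scoring` and the choice of removing the two lowest-scoring features are assumptions made for illustration.

```python
# Coefficients copied from the Shapley table reported above.
features = ["sex", "age", "c_charge_degree", "priors_count", "length_of_stay"]
acc = [1.001571, 1.196479, 1.053048, 1.231018, 1.133595]
disc = [580.438967, 735.033619, 575.387615, 731.584021, 693.71562]

def lowest_scoring(alpha, k=2):
    """Return the k features with the lowest score acc - alpha * disc."""
    scores = {f: a - alpha * d for f, a, d in zip(features, acc, disc)}
    return sorted(scores, key=scores.get)[:k]

drop_large_alpha = lowest_scoring(0.1)      # discrimination dominates
drop_small_alpha = lowest_scoring(0.0001)   # accuracy dominates
```

The two settings select different pairs of features, which shows how strongly the recommendation depends on the chosen alpha.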

# Algorithm 2
Learning Fair Representations http://proceedings.mlr.press/v28/zemel13.html

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

In [None]:
data =pd.read_csv('https://github.com/propublica/compas-analysis/raw/master/compas-scores-two-years.csv')

In [None]:
df_filtered = data[(data['race'] == 'Caucasian') | (data['race'] == 'African-American')]

df_filtered = df_filtered[(df_filtered['days_b_screening_arrest'] <= 30) & (df_filtered['days_b_screening_arrest'] >= -30)]

df_filtered = df_filtered[df_filtered['is_recid'] != -1]

df_filtered = df_filtered[df_filtered['c_charge_degree'] != 'O']

missing_values_count = df_filtered.isna().sum()
columns_to_drop = missing_values_count.sort_values(ascending=False).index.tolist()[:14]

specified_columns = ['sex', 'race', 'decile_score', 'two_year_recid']
df_cleaned = df_filtered.drop(columns=columns_to_drop).dropna().loc[:, specified_columns]

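# Assign the result back instead of using inplace=True on a column selection,
# which may not modify df_cleaned in recent pandas versions
df_cleaned['sex'] = df_cleaned['sex'].replace(['Male', 'Female'], [1, 0])
df_cleaned['race'] = df_cleaned['race'].replace(['Caucasian', 'African-American'], [1, 0])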

In [None]:
data_train,data_test = train_test_split(df_cleaned, test_size=0.3, random_state=15)
data_train, data_val= train_test_split(data_train, test_size=0.5, random_state=15)

$d(x_n, v_k) = \lVert x_n - v_k \rVert_2$
<br>
$d(x_n, v_k, \alpha) = \sum\limits_{i=1}^{D} \alpha_i(x_{ni} - v_{ki})^2$

In [None]:
def d(X, v, alpha):
  N, D =X.shape
  K = v.shape[0]
  dists = np.zeros((N,K))
  for n in range(N):
    for k in range(K):
      for d in range(D):
        dists[n,k]+=alpha[d]*(X[n,d]-v[k,d])**2
  return dists
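The triple loop can be replaced by NumPy broadcasting, which matters because the loss is evaluated many times during optimization. This is a sketch of an equivalent computation, assuming `X`, `v`, and `alpha` are NumPy arrays (a DataFrame would first need `.to_numpy()`). The name `d_vectorized` is hypothetical.

```python
import numpy as np

def d_vectorized(X, v, alpha):
    """Weighted squared distances, shape (N, K), without Python loops."""
    diff = X[:, None, :] - v[None, :, :]   # shape (N, K, D) via broadcasting
    # Sum alpha[d] * diff[n, k, d] ** 2 over the feature axis d.
    return np.einsum("nkd,d->nk", diff ** 2, alpha)

# Small random example: 6 samples, 4 prototypes, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
v = rng.normal(size=(4, 3))
alpha = rng.uniform(size=3)
dists = d_vectorized(X, v, alpha)
```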

$M_{n,k}=P(Z=k|x_n)=\exp(-d(x_n,v_k))/\sum\limits_{j=1}^{K}\exp(-d(x_n,v_j))$

In [None]:
def M_nk(dist):
  exp_neg_dist = np.exp(-dist)
  sum_exp_neg_dist = np.sum(exp_neg_dist, axis=1, keepdims=True)
  M_nk = exp_neg_dist/sum_exp_neg_dist
  return M_nk
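When all distances in a row are large, `np.exp(-dist)` can underflow to zero and the division then yields NaN. A common remedy, sketched here as an assumption rather than as part of the original method, is to subtract the row minimum before exponentiating. The softmax is unchanged by this shift. The name `memberships_stable` is hypothetical.

```python
import numpy as np

def memberships_stable(dist):
    """Row-wise softmax of -dist, shifted so the largest exponent is 0."""
    shifted = dist - dist.min(axis=1, keepdims=True)
    weights = np.exp(-shifted)
    return weights / weights.sum(axis=1, keepdims=True)

# exp(-1000) underflows to 0 in float64, so the unshifted form would give 0/0.
far = np.array([[1000.0, 1001.0]])
M = memberships_stable(far)
```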

$M_k = \frac{1}{|X_0|} \sum_{n \in X_0} M_{nk}$

In [None]:
def M_k(M_nk):
  M_k = M_nk.mean(axis=0)
  return M_k

$
\hat{x}_n = \sum^K_{k=1}M_{nk}v_k
$


$
L_x = \sum_{n=1}^N (x_n - \hat{x}_n)^2
$

In [None]:
def xnhat(X, M_nk,v):
  N,D = X.shape
  K = v.shape[0]

  x_hat=np.zeros((N,D))
  for n in range(N):
    for k in range(K):
      x_hat[n]+=M_nk[n,k]*v[k]
  Lx=np.sum((X-x_hat)**2)
  return x_hat, Lx


$\hat{y}_n=\sum\limits_{k = 1}^{K}M_{n,k}w_k$, where we constrain the $w_k$ values to be between 0 and 1.

$L_y = \sum_{n=1}^N -y_n \log \hat{y}_n - (1-y_n)\log(1- \hat{y}_n)$

In [None]:
def ynhat(M_nk,w):
  y_hat=np.dot(M_nk,w)
  return y_hat

In [None]:
def calculate_Ly(y,y_hat):
  log_loss=-y*np.log(y_hat)-(1-y)*np.log(1-y_hat)
  L_y=log_loss.sum()
  return L_y
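Because each $w_k$ may take the values 0 or 1, $\hat{y}_n$ can reach exactly 0 or 1, in which case the logarithm is infinite. One possible safeguard, shown here as a sketch with an assumed tolerance `eps`, is to clip the predictions before taking logs. The name `cross_entropy_clipped` is hypothetical.

```python
import numpy as np

def cross_entropy_clipped(y, y_hat, eps=1e-12):
    """Summed binary cross-entropy with predictions clipped to [eps, 1 - eps]."""
    y = np.asarray(y, dtype=float)
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1.0 - eps)
    return float(-np.sum(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat)))

loss_interior = cross_entropy_clipped([1, 0], [0.5, 0.5])   # 2 * log(2)
loss_boundary = cross_entropy_clipped([1, 0], [0.0, 1.0])   # finite, not inf
```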

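$L = A_z L_z + A_x L_x + A_y L_y$, where $L_z = \sum_{k=1}^{K} |M_k^{\text{sen}} - M_k^{\text{nonsen}}|$ is the absolute difference between the mean prototype memberships of the two groups.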

In [None]:
def L(params, sen_data, nonsen_data, sen_label, nonsen_label, K, A_z, A_x, A_y):
  sen_features=sen_data.shape[1]
  alpha_sen=params[:sen_features]
  alpha_nonsen=params[sen_features:2*sen_features]
  w=params[2*sen_features:(2*sen_features)+K]
  v=params[(2*sen_features)+K:].reshape((K,sen_features))

  dist_sen = d(sen_data, v, alpha_sen)
  dist_nonsen = d(nonsen_data, v, alpha_nonsen)

  M_nk_sen = M_nk(dist_sen)
  M_nk_nonsen = M_nk(dist_nonsen)

  M_k_sen = M_k(M_nk_sen)
  M_k_nonsen = M_k(M_nk_nonsen)

  L_z = np.sum(np.abs(M_k_sen - M_k_nonsen))

  _, L_x_sen = xnhat(sen_data, M_nk_sen, v)
  _, L_x_nonsen = xnhat(nonsen_data, M_nk_nonsen, v)
  L_x = L_x_sen + L_x_nonsen

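  # ynhat returns only the predictions, so the loss is computed with calculate_Ly
  L_y_sen = calculate_Ly(sen_label, ynhat(M_nk_sen, w))
  L_y_nonsen = calculate_Ly(nonsen_label, ynhat(M_nk_nonsen, w))
  L_y = L_y_sen + L_y_nonsen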

  total_loss = A_z * L_z + A_x * L_x + A_y * L_y

  return total_loss
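The objective still has to be minimized over the parameter vector. The following sketch shows one way this could be done with `scipy.optimize.minimize`, using a compact vectorized loss with the same parameter layout as `L` (two alpha vectors, then `w`, then the prototypes `v`). Everything here is an assumption for illustration: the synthetic data, the weights `A_z = 50`, `A_x = 0.01`, `A_y = 1`, the bounds on alpha, and the name `lfr_loss`. L-BFGS-B is chosen only because it supports box constraints, which keeps each $w_k$ in $[0, 1]$.

```python
import numpy as np
from scipy.optimize import minimize

def lfr_loss(params, X0, X1, y0, y1, K, A_z, A_x, A_y):
    """Two-group LFR objective; parameter layout mirrors L() above."""
    D = X0.shape[1]
    alpha0, alpha1 = params[:D], params[D:2 * D]
    w = params[2 * D:2 * D + K]
    v = params[2 * D + K:].reshape(K, D)

    def memberships(X, alpha):
        dist = (((X[:, None, :] - v[None, :, :]) ** 2) * alpha).sum(axis=2)
        e = np.exp(-(dist - dist.min(axis=1, keepdims=True)))
        return e / e.sum(axis=1, keepdims=True)

    L_x = L_y = 0.0
    means = []
    for X, y, alpha in ((X0, y0, alpha0), (X1, y1, alpha1)):
        M = memberships(X, alpha)
        means.append(M.mean(axis=0))                 # M_k for this group
        L_x += np.sum((X - M @ v) ** 2)              # reconstruction error
        y_hat = np.clip(M @ w, 1e-6, 1 - 1e-6)       # avoid log(0)
        L_y += -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    L_z = np.abs(means[0] - means[1]).sum()
    return A_z * L_z + A_x * L_x + A_y * L_y

# Synthetic stand-in data: 3 features, 4 prototypes.
rng = np.random.default_rng(0)
D, K = 3, 4
X0, X1 = rng.normal(size=(30, D)), rng.normal(size=(40, D))
y0, y1 = rng.integers(0, 2, 30), rng.integers(0, 2, 40)

x0 = rng.uniform(size=2 * D + K + K * D)
# alpha and w are kept in [0, 1]; the prototypes v are unconstrained.
bounds = [(0, 1)] * (2 * D + K) + [(None, None)] * (K * D)
args = (X0, X1, y0, y1, K, 50.0, 0.01, 1.0)
initial_loss = lfr_loss(x0, *args)
res = minimize(lfr_loss, x0, args=args, method="L-BFGS-B",
               bounds=bounds, options={"maxiter": 50})
```

After fitting, `res.x` can be unpacked with the same slicing to recover `w` and `v`, and predictions for new rows follow from the membership matrix times `w`.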

Logistic regression model:

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score

features = ['sex', 'race', 'decile_score']
target = 'two_year_recid'

X_train = data_train[features]
y_train = data_train[target]

X_val = data_val[features]
y_val = data_val[target]

X_test = data_test[features]
y_test = data_test[target]

# logistic regression model
model = LogisticRegression(random_state=42)
model.fit(X_train, y_train)

y_val_pred = model.predict(X_val)

# evaluate on the validation set
accuracy_val = accuracy_score(y_val, y_val_pred)
roc_auc_val = roc_auc_score(y_val, y_val_pred)


# predict on the test set and evaluate
y_test_pred = model.predict(X_test)
accuracy_test = accuracy_score(y_test, y_test_pred)
roc_auc_test = roc_auc_score(y_test, y_test_pred)


accuracy_val, roc_auc_val, accuracy_test, roc_auc_test

(0.6419284940411701,
 0.6414273388389005,
 0.6858407079646017,
 0.6775882880980892)

In [None]:
%pip install fairlearn



In [None]:
from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import MetricFrame, false_positive_rate, true_positive_rate
from sklearn.metrics import accuracy_score

optimizer = ThresholdOptimizer(
    estimator=model,
    constraints="equalized_odds",
    prefit=True,
)

# fit the optimizer on validation data
optimizer.fit(X_val, y_val, sensitive_features=data_val['race'])

# prediction
y_test_pred_optimized = optimizer.predict(X_test, sensitive_features=data_test['race'])

# evaluate optimized model
accuracy_optimized = accuracy_score(y_test, y_test_pred_optimized)
mf = MetricFrame(
    metrics={
        'False Positive Rate': false_positive_rate,
        'True Positive Rate': true_positive_rate,
        'Accuracy': accuracy_score
    },
    y_true=y_test,
    y_pred=y_test_pred_optimized,
    sensitive_features=data_test['race']
)

print(f"Optimized Accuracy: {accuracy_optimized}")
print("Fairness Metrics:")
print(mf.by_group)




Optimized Accuracy: 0.6352718078381795
Fairness Metrics:
      False Positive Rate  True Positive Rate  Accuracy
race                                                   
0                0.220302            0.495798  0.635783
1                0.275000            0.485597  0.634526


In conclusion, our decision is to choose Algorithm 2. Removing the features flagged by Algorithm 1 would decrease accuracy. Algorithm 2, in contrast, reduces discrimination (the optimized test accuracy is nearly equal for the two groups, 0.636 versus 0.635) while keeping accuracy similar to that of the baseline models. Algorithm 2 therefore performs better than Algorithm 1.