# Project 4: Machine Learning Fairness Algorithms Evaluation

## Outline

* Part 1: Introduce algorithms A2 (**Maximizing accuracy under fairness constraints (C-SVM and C-LR)**) and A7 (**Information Theoretic Measures for Fairness-aware Feature selection (FFS)**).

* Part 2: Show how the evaluation was carried out.

* Part 3: Show the main results of these two methods in light of fairness theory.

### Preprocessing the data and modules

The data we use come from the **COMPAS Dataset**.

Moreover, in this project, we selected 'race', 'sex', 'age', 'juv_misd_count' and 'priors_count' as the features to investigate.

In [12]:
# Preprocessing of the data

import copy
import itertools
import math
from sklearn.svm import SVC
from sklearn.utils import shuffle

from new_helper_1 import *

# Imported after the star import so these names cannot be shadowed by new_helper_1
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('../data/compas-scores-two-years.csv')
features = ['race', 'age', 'sex', 'juv_misd_count', 'priors_count']
#features chosen for our C-Logistic Regression
to_predict = 'two_year_recid'
races_to_filter = ['Caucasian', 'African-American']
filtered = df.loc[df['race'].isin(races_to_filter), features + [to_predict]].reset_index(drop=True)

#replace categorical data with boolean numbers, 0 and 1
filtered['race'] = filtered['race'].apply(lambda race: 0 if race == 'Caucasian' else 1)
filtered['sex'] = filtered['sex'].apply(lambda sex: 0 if sex == 'Male' else 1)
#x=filtered[['race', 'age', 'sex', 'juv_misd_count', 'priors_count']]
#y=filtered[['two_year_recid']]
#x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=42)

#Normalizing the data, so each variable is similar in weight
normalized_df = (filtered-filtered.mean())/filtered.std()
filtered['age'] = normalized_df['age']
filtered['juv_misd_count'] = normalized_df['juv_misd_count']
filtered['priors_count'] = normalized_df['priors_count']
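
The normalization step above is a z-score transform. As a minimal sketch on made-up toy values (the column names mirror the notebook, the numbers are hypothetical), the following shows that each column ends up with mean 0 and sample standard deviation 1:

```python
import pandas as pd

# Toy data: column names mirror the notebook, values are invented for illustration.
toy = pd.DataFrame({"age": [20, 30, 40, 50], "priors_count": [0, 1, 2, 9]})

# Same formula as in the cell above: subtract the column mean and divide by
# the column standard deviation (pandas uses the sample std, ddof=1).
z = (toy - toy.mean()) / toy.std()

print(z.mean().abs().max())  # very close to 0
print(z.std().tolist())      # each entry very close to 1
```

Because every column is rescaled independently, a feature with a large raw range such as `priors_count` no longer dominates features with a small range.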

In [13]:
from utils import train_model
import loss_funcs

train_size = 5000
x_train = filtered.loc[:train_size, features]
y_train = filtered.loc[:train_size, to_predict]
x_control = {'race': x_train['race'].to_list()}

apply_fairness_constraints = 1
apply_accuracy_constraint = 0
sep_constraint = 0
gamma = 0
sensitive_attrs = ['race']
sensitive_attrs_to_cov_thresh = {'race': 0}

w = train_model(x_train.to_numpy(),
                y_train.to_numpy(),
                x_control,
                loss_funcs._logistic_loss,
                apply_fairness_constraints,
                apply_accuracy_constraint,
                sep_constraint,
                sensitive_attrs,
                sensitive_attrs_to_cov_thresh,
                gamma)

In [14]:
import numpy as np
np.unique(x_train['juv_misd_count'], return_counts=True)

(array([-0.19550997,  1.7853571 ,  3.76622418,  5.74709125,  7.72795833,
         9.7088254 , 11.68969247, 15.65142662, 23.57489492, 25.55576199]),
 array([4688,  222,   51,   22,    6,    4,    4,    2,    1,    1]))
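
The counts above sum to 5001 rather than 5000. This is because label-based slicing with `.loc` includes the end label, unlike position-based slicing with `.iloc`. A small sketch on a toy frame (not the COMPAS data) illustrates the difference:

```python
import pandas as pd

toy = pd.DataFrame({"x": range(10)})  # default integer index 0..9
n = 5

by_label = toy.loc[:n]      # label-based: includes the row labelled 5
by_position = toy.iloc[:n]  # position-based: stops before position 5

print(len(by_label), len(by_position))  # 6 5

# Splitting with .loc[:n] and .loc[n:] therefore shares the row labelled n.
overlap = toy.loc[:n].index.intersection(toy.loc[n:].index)
print(list(overlap))  # [5]
```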

### Logistic Regression 

We adapted the code from the repository that accompanies the paper "Fairness Constraints: Mechanisms for Fair Classification".

The `utils` module, from which we import the `train_model` function, is also in the doc folder of our repository.
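
For intuition, the paper measures unfairness as the covariance between the sensitive attribute and the signed distance to the decision boundary, and constrains its magnitude to stay below a threshold. The sketch below computes that covariance for a linear boundary on toy data. The function name `boundary_covariance` is hypothetical and is not part of the `utils` module; we assume the `sensitive_attrs_to_cov_thresh` argument plays the role of this threshold.

```python
import numpy as np

def boundary_covariance(w, X, z):
    """Empirical covariance between sensitive attribute z and the score w . x."""
    z = np.asarray(z, dtype=float)
    scores = np.asarray(X, dtype=float) @ np.asarray(w, dtype=float)
    return float(np.mean((z - z.mean()) * scores))

# Toy data: the second feature is the sensitive attribute itself,
# the first feature is balanced across the two groups.
X = np.array([[1.0, 0.0], [1.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
z = np.array([0, 1, 0, 1])

cov_neutral = boundary_covariance([1.0, 0.0], X, z)  # ignores the sensitive feature
cov_biased = boundary_covariance([0.0, 1.0], X, z)   # relies on it entirely

print(cov_neutral, cov_biased)  # 0.0 0.25
```

A threshold of 0, as used in this notebook, asks the optimizer for a boundary whose scores are uncorrelated with the sensitive attribute.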

In [15]:
#Logistic Regression

from utils import train_model
import loss_funcs

train_size = 5000
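# NOTE: .loc slicing is label-inclusive, so training uses rows 0..5000 (5001 rows)
# and row 5000 also appears in the test set.
x_train = filtered.loc[:train_size, features]
y_train = filtered.loc[:train_size, to_predict]
x_test = filtered.loc[train_size:, features]
y_test = filtered.loc[train_size:, to_predict]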
x_control = {'race': x_train['race'].to_list()}



apply_fairness_constraints = 0
apply_accuracy_constraint = 0
sep_constraint = 0
gamma = 0
sensitive_attrs = ['race']
sensitive_attrs_to_cov_thresh = {'race': 0}

#coefficients from training the model
w = train_model(x_train.to_numpy(),
                y_train.to_numpy(),
                x_control,
                loss_funcs._logistic_loss,
                apply_fairness_constraints,
                apply_accuracy_constraint,
                sep_constraint,
                sensitive_attrs,
                sensitive_attrs_to_cov_thresh,
                gamma)


print("The coefficients from training the model are:", w)

The coefficients from training the model are: [24.54716154 -0.06274797 16.72642233 -0.33462877  0.16826255]


In [16]:
# Load the trained coefficients into an sklearn LogisticRegression (no refitting) so that we can use its predict method

from sklearn.linear_model import LogisticRegression
m = LogisticRegression()
m.coef_= w.reshape((1,-1))
m.intercept_ = 0
m.classes_ = np.array([0, 1])
acc = (m.predict(x_test[features]) == y_test).sum() / len(y_test)
print("Logistic Regression Accuracy::", acc)

#Accuracy of 54% 

Logistic Regression Accuracy:: 0.5365217391304348


In [17]:
#section applying fairness constraints

apply_fairness_constraints = 1
apply_accuracy_constraint = 0
sep_constraint = 0
gamma = 0
sensitive_attrs = ['race']
sensitive_attrs_to_cov_thresh = {'race': 0}

w = train_model(x_train.to_numpy(),
                y_train.to_numpy(),
                x_control,
                loss_funcs._logistic_loss,
                apply_fairness_constraints,
                apply_accuracy_constraint,
                sep_constraint,
                sensitive_attrs,
                sensitive_attrs_to_cov_thresh,
                gamma)

print("The coefficients:",w)

The coefficients: [ 2.02536978e+01 -6.27560107e-02  3.46271309e+02 -3.34488239e-01
  1.68227794e-01]


### Maximizing accuracy under fairness constraints (C-SVM and C-LR)

In [18]:
import utils as ut
import loss_funcs as lf
def test_data():
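    # `filtered` is a DataFrame and cannot be unpacked directly, so build the inputs explicitly
    X = filtered[features].to_numpy()
    y = filtered[to_predict].to_numpy()
    x_control = {"sex": filtered["sex"].to_numpy()}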
    ut.compute_p_rule(x_control["sex"], y) # compute the p-rule in the original data
    
    """ Split the data into train and test """
    X = ut.add_intercept(X) # add intercept to X before applying the linear classifier
    train_fold_size = 0.7
    x_train, y_train, x_control_train, x_test, y_test, x_control_test = ut.split_into_train_test(X, y, x_control, train_fold_size)
    
    apply_fairness_constraints = None
    apply_accuracy_constraint = None
    sep_constraint = None

    loss_function = lf._logistic_loss
    sensitive_attrs = ["sex"]
    sensitive_attrs_to_cov_thresh = {}
    gamma = None
    
    def train_test_classifier():
        w = ut.train_model(x_train, y_train, x_control_train, loss_function, apply_fairness_constraints, apply_accuracy_constraint, sep_constraint, sensitive_attrs, sensitive_attrs_to_cov_thresh, gamma)
        train_score, test_score, correct_answers_train, correct_answers_test = ut.check_accuracy(w, x_train, y_train, x_test, y_test, None, None)
        distances_boundary_test = (np.dot(x_test, w)).tolist()
        all_class_labels_assigned_test = np.sign(distances_boundary_test)
        correlation_dict_test = ut.get_correlations(None, None, all_class_labels_assigned_test, x_control_test, sensitive_attrs)
        cov_dict_test = ut.print_covariance_sensitive_attrs(None, x_test, distances_boundary_test, x_control_test, sensitive_attrs)
        p_rule = ut.print_classifier_fairness_stats([test_score], [correlation_dict_test], [cov_dict_test], sensitive_attrs[0])	
        return w, p_rule, test_score
    
 
    print("== Unconstrained (original) classifier ==")
    # all constraint flags are set to 0 since we want to train an unconstrained (original) classifier
    apply_fairness_constraints = 0
    apply_accuracy_constraint = 0
    sep_constraint = 0
    w_uncons, p_uncons, acc_uncons = train_test_classifier()
    
    """ Now classify such that we optimize for accuracy while achieving perfect fairness """
    apply_fairness_constraints = 1 # set this flag to one since we want to optimize accuracy subject to fairness constraints
    apply_accuracy_constraint = 0
    sep_constraint = 0
    sensitive_attrs_to_cov_thresh = {"sex":0}
    print()
    print("== Classifier with fairness constraint ==")
    w_f_cons, p_f_cons, acc_f_cons  = train_test_classifier()
    
    
    """ Classify such that we optimize for fairness subject to a certain loss in accuracy """
    apply_fairness_constraints = 0 # flag for fairness constraint is set back to 0 since we want to apply the accuracy constraint now
    apply_accuracy_constraint = 1 # now, we want to optimize fairness subject to accuracy constraints
    sep_constraint = 0
    gamma = 0.5 # gamma controls how much loss in accuracy we are willing to incur to achieve fairness -- increase gamma to allow more loss in accuracy
    print("== Classifier with accuracy constraint ==")
    w_a_cons, p_a_cons, acc_a_cons = train_test_classifier()	
    
    """ 
    Classify such that we optimize for fairness subject to a certain loss in accuracy 
    In addition, make sure that no points classified as positive by the unconstrained (original) classifier are misclassified.
    """
    apply_fairness_constraints = 0 # flag for fairness constraint is set back to 0 since we want to apply the accuracy constraint now
    apply_accuracy_constraint = 1 # now, we want to optimize fairness subject to accuracy constraints
    sep_constraint = 1 # set the separate constraint flag to one, since in addition to accuracy constraints, we also want no misclassifications for certain points (details in demo README.md)
    gamma = 1000.0
    print("== Classifier with accuracy constraint (no +ve misclassification) ==")
    w_a_cons_fine, p_a_cons_fine, acc_a_cons_fine  = train_test_classifier()
    
    return
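
The fairness statistic reported by the helper functions above is the p%-rule. As a rough sketch of the idea (the function `p_rule_percent` is a hypothetical stand-in written for this illustration, not the `ut.compute_p_rule` implementation), it compares the rate of positive predictions between the two groups of a binary sensitive attribute:

```python
import numpy as np

def p_rule_percent(sensitive, predictions):
    """Ratio (in percent) of the smaller to the larger positive-prediction rate."""
    sensitive = np.asarray(sensitive)
    predictions = np.asarray(predictions)
    rate_0 = np.mean(predictions[sensitive == 0] == 1)
    rate_1 = np.mean(predictions[sensitive == 1] == 1)
    if max(rate_0, rate_1) == 0:
        return float("nan")  # no positive predictions at all
    return 100.0 * min(rate_0, rate_1) / max(rate_0, rate_1)

# Toy example: group 0 gets 3 of 4 positives, group 1 gets 1 of 4.
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]
predictions = [1, 1, 1, -1, 1, -1, -1, -1]
score = p_rule_percent(sensitive, predictions)
print(round(score, 2))  # 33.33
```

A value of 100 means both groups receive positive predictions at the same rate; lower values indicate a larger disparity.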




### Classification

In [20]:
#SVM

#Preprocessing

features = ['race', 'age', 'sex', 'juv_misd_count', 'priors_count']
to_predict = 'two_year_recid'
races_to_filter = ['Caucasian', 'African-American']
# df.loc[df['race'].isin(races_to_filter), features + [to_predict]]
df = df.loc[df['race'].isin(races_to_filter), features + [to_predict]]


#transform race and sex into 0 and 1 
#African-American will be 0 and Caucasian will be 1
#Male will be 0 and Female will be 1

df['race'] = df['race'].replace(['African-American'],0)
df['race'] = df['race'].replace(['Caucasian'],1)
df['sex'] = df['sex'].replace(['Male'],0)
df['sex'] = df['sex'].replace(['Female'],1)

#normalize age, juv_misd_count, and priors_count

normalized_df = (df-df.mean())/df.std()
df['age'] = normalized_df['age']
df['juv_misd_count'] = normalized_df['juv_misd_count']
df['priors_count'] = normalized_df['priors_count']
# normalized_df.head()
df.head()


from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

X = df.drop('two_year_recid',axis=1)
Y = df['two_year_recid']

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.20)

svclassifier = SVC(kernel='linear')
svclassifier.fit(X_train, y_train)

y_pred = svclassifier.predict(X_test)

print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))

#Accuracy of 0.65 (see the classification report in the output)


[[549 121]
 [310 250]]
              precision    recall  f1-score   support

           0       0.64      0.82      0.72       670
           1       0.67      0.45      0.54       560

    accuracy                           0.65      1230
   macro avg       0.66      0.63      0.63      1230
weighted avg       0.65      0.65      0.64      1230
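
The summary numbers in the report can be checked directly against the confusion matrix printed above. A short sketch, using only the four counts from that output:

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true class, columns: predicted class).
cm = np.array([[549, 121],
               [310, 250]])

accuracy = float(np.trace(cm) / cm.sum())          # correct predictions / all predictions
recall_1 = float(cm[1, 1] / cm[1].sum())           # share of true class 1 that was found
precision_1 = float(cm[1, 1] / cm[:, 1].sum())     # share of predicted class 1 that was right

print(round(accuracy, 2), round(recall_1, 2), round(precision_1, 2))  # 0.65 0.45 0.67
```

The low recall for class 1 shows that the linear SVM misses more than half of the actual recidivists in this split, which the overall accuracy alone does not reveal.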



In [None]:
#SVM with fairness constraints

import math
import numpy
from numpy.linalg import norm
import random
import SVM_utils as utils
#from utils import sign

DEFAULT_NUM_ROUNDS = 1
DEFAULT_LAMBDA = 1.0
DEFAULT_GAMMA = 0.1


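def hyperplaneToHypothesis(w):
   # `sign` is not imported in this cell; we assume SVM_utils provides it (see the commented-out import above)
   return lambda x: utils.sign(numpy.dot(w,x))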


# use scikit-learn to do the svm for us
def svmDetailedSKL(data, gamma=DEFAULT_GAMMA, verbose=False, kernel='rbf'):
  # if verbose:
   #  print("Loading scikit-learn")
   from sklearn import svm
   points, labels = zip(*data)
   clf = svm.SVC(kernel=kernel, gamma=gamma)

   #if verbose:
   #   print("Training classifier")

   skClassifier = clf.fit(points, labels)
   hypothesis = lambda x: skClassifier.predict([x])[0]
   bulkHypothesis = lambda data: skClassifier.predict(data)

   alphas = skClassifier.dual_coef_[0]
   supportVectors = skClassifier.support_vectors_
   error = lambda data: 1 - skClassifier.score(*zip(*data))

   intercept = skClassifier.intercept_
   margin = lambda y: skClassifier.decision_function([y])[0]
   bulkMargin = lambda pts: skClassifier.decision_function(pts)

   #if verbose:
   #   print("Done")

   return (hypothesis, bulkHypothesis, skClassifier, error, alphas, intercept,
            gamma, supportVectors, bulkMargin, margin)


def svmSKL(data, gamma=DEFAULT_GAMMA, verbose=False, kernel='rbf'):
   return svmDetailedSKL(data, gamma, verbose, kernel)[0]

def svmLinearSKL(data, verbose=False):
   return svmDetailedSKL(data, 0, verbose, 'linear')[0]

# compute the margin of a point
def margin(point, hyperplane):
   return numpy.dot(hyperplane, point)

# compute the absolute value of the margin of a point
def absMargin(point, hyperplane):
   return abs(margin(point, hyperplane))

### Information Theoretic Measures for Fairness-aware Feature Selection

The Fairness-aware Feature Selection (FFS) framework depends on the joint statistics of the data. It utilizes information decomposition to calculate two information-theoretic measures that separately quantify the accuracy and discriminatory impact of every subset of features. 
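
Measures of this kind are estimated from the empirical joint distribution of the variables. The sketch below shows only a basic building block, plain mutual information between two discrete variables, and not the information decomposition that FFS itself relies on. The function `mutual_information` is written for this illustration and is not part of `new_helper_1`.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information I(X; Y) in bits, estimated from paired discrete samples."""
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)        # empirical joint counts
    joint /= joint.sum()                   # joint probabilities p(x, y)
    px = joint.sum(axis=1, keepdims=True)  # marginal p(x) as a column
    py = joint.sum(axis=0, keepdims=True)  # marginal p(y) as a row
    nonzero = joint > 0                    # empty cells contribute nothing
    return float(np.sum(joint[nonzero] * np.log2(joint[nonzero] / (px * py)[nonzero])))

# A feature that copies the label carries 1 bit; an independent one carries 0 bits.
label = [0, 0, 1, 1]
mi_copy = mutual_information([0, 0, 1, 1], label)
mi_independent = mutual_information([0, 1, 0, 1], label)
print(mi_copy, mi_independent)  # 1.0 0.0
```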

Subsequently, based on the two information-theoretic measures of each subset, the authors deduce an accuracy coefficient and a discrimination coefficient for each feature using Shapley-value analysis. The two coefficients capture the marginal impacts on accuracy and discrimination of each feature, respectively.

Note that the two coefficients are deduced using the information-theoretic measures for all subsets of features. This allows consideration for the interdependencies among the features. 
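
To make the role of "all subsets" concrete, here is a minimal sketch of an exact Shapley-value computation. The value function `toy_value` is invented for illustration and stands in for an information-theoretic measure of a feature subset; it is not the measure computed by `shapley_Cal`.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley value of each feature for a set function `value`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weight of a subset of size k in the Shapley average.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

def toy_value(subset):
    """Hypothetical measure: individual worths plus a bonus when a and b co-occur."""
    worth = {"a": 1.0, "b": 2.0, "c": 3.0}
    bonus = 2.0 if {"a", "b"} <= subset else 0.0
    return sum(worth[f] for f in subset) + bonus

phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)  # a and b each receive half of the interaction bonus
```

The interaction bonus is split evenly between `a` and `b`, which is the sense in which the coefficients account for interdependencies among features. The cost is exponential in the number of features, since every subset is evaluated.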

Finally, combining the two coefficients, a fairness-utility score is assigned for each feature, and we can do feature selection based on this score. It’s worth noting that both the calculation of the fairness-utility scores and the feature selection process rely on personal judgement.  

In [21]:
#Load data 

fp = '../data/compas-scores-two-years.csv'
df = pd.read_csv(fp)

In [22]:
##split data

# Process the data once and reuse the result for both splits
processed = data_process(df)
train_set = set_split_train(processed[0], processed[1], processed[2])
test_set = set_split_test(processed[0], processed[1], processed[2])

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  compas_subset["two_year_recid"] = compas_subset["two_year_recid"].apply(lambda x: -1 if x==0 else 1)

In [23]:
# Calculation

# Compute the Shapley coefficients once and unpack both results
shapley_result = shapley_Cal(train_set)
accuracy, discriminate = shapley_result[0], shapley_result[1]
#print result
shapley_print(discriminate,accuracy)

# 68%

          Feature  Discrimination  Accuracy
0            race    7.174550e+06  2.459747
1             age    4.921024e+06  1.366085
2             sex    3.070006e+06  1.236041
3  juv_misd_count    3.675583e+06  1.311511
4    priors_count    4.033772e+06  1.293547
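
One possible way to turn the two columns above into a single fairness-utility score is to subtract a weighted discrimination coefficient from the accuracy coefficient. The sketch below does this with the values copied from the table. Two parts are our own assumptions made only for illustration: the trade-off weight `alpha`, and the rescaling of each column to sum to 1 (used here because the two columns are on very different scales). The resulting ranking should therefore not be read as a result of this project.

```python
import pandas as pd

# Coefficients copied from the output table above.
coeffs = pd.DataFrame({
    "Feature": ["race", "age", "sex", "juv_misd_count", "priors_count"],
    "Discrimination": [7.174550e06, 4.921024e06, 3.070006e06, 3.675583e06, 4.033772e06],
    "Accuracy": [2.459747, 1.366085, 1.236041, 1.311511, 1.293547],
})

alpha = 1.0  # hypothetical trade-off weight: larger values penalize discrimination more

# Assumption: put both columns on a comparable scale by converting them to shares.
acc_share = coeffs["Accuracy"] / coeffs["Accuracy"].sum()
disc_share = coeffs["Discrimination"] / coeffs["Discrimination"].sum()

coeffs["Score"] = acc_share - alpha * disc_share
ranked = coeffs.sort_values("Score", ascending=False)
print(ranked[["Feature", "Score"]])
```

Changing `alpha` changes the ranking, which is one place where the personal judgement mentioned earlier enters the feature selection.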


In [25]:
test_acc,test_cal = test_result(train_set, test_set)  
test_print(test_acc)

# Reported final accuracy: 68% (not reproduced in this run, which raised the NameError shown below)

NameError: name 'test_result' is not defined

We can see that the accuracy differs from that of the baseline model, for both A2 and A7. We attribute this to the fact that accuracy improves when fairness is taken into account. Moreover, for A2 (maximizing accuracy under fairness constraints), the results imply that dropping the two_year_recid factor is an efficient way to decrease discrimination. For A7 (Information Theoretic Measures for Fairness-aware Feature Selection, FFS), as predicted by the marginal discrimination coefficient, removing age or priors_count results in the lowest bias in the classifier output.