# Grid Search for Binary Classification - Admissions

In this notebook, we apply the `GridSearch` algorithm in Fairlearn to a binary classification problem with a binary protected attribute. This algorithm comes from the paper ["A Reductions Approach to Fair Classification" (Agarwal et al. 2018)](https://arxiv.org/abs/1803.02453). Grid search is a simplified version of the full algorithm (see section 3.4 of the paper), and it works best for binary classification with a binary protected attribute.

The specific problem we consider is a biased college admissions problem. We assume that we have a group of males and females (gender will be our protected attribute), whose standardised test scores follow the same distribution, plus one other irrelevant feature which is correlated with gender. We also have a set of labels denoting whether or not each individual was admitted, and we will make this (generated) historical data biased by setting a higher admission threshold for females than for males. Because the standardised test scores are independent of gender, unbiased admissions would admit both genders in equal proportions.
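
The idea of admitting both genders "in equal proportions" is what the `DemographicParity` constraint imported below is meant to capture. As a sketch, with notation assumed here rather than taken from this notebook ($h$ for a classifier, $X$ for the features, $A$ for the protected attribute), demographic parity asks that:

$$
P[h(X) = 1 \mid A = a] = P[h(X) = 1] \quad \text{for all } a.
$$

In this scenario, that means the predicted admission rate should be the same for females and males.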

In [None]:
from fairlearn.reductions import GridSearch
from fairlearn.reductions import DemographicParity

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

## Generating the Data

We are going to synthesise data for this scenario. We will generate a dataset with two features - "score" and "irrelevant." Both will follow a normal distribution, but while the "score" feature will be parameterised by a single mean and standard deviation, the mean of the "irrelevant" feature will depend on gender. Similarly, in the biased historical data, we will set different admission thresholds for the two genders (with a small amount of normally distributed jitter around each threshold).

The following class implements this:

In [None]:
class DataGenerator:
    def __init__(self):
        # We use label and index 0 for female and 1 for male
        self.number = [100, 300]
        
        self.score_mean = 0.5
        self.score_std_dev = 0.1
        
        self.score_threshold = [0.6, 0.4]
        self.score_threshold_jitter = [0.05, 0.05]
        
        self.irrelevant_mean = [0.3, 0.7]
        self.irrelevant_std_dev = [0.2, 0.2]
        
    def generate(self):
        genders = []
        scores = []
        admissions = []
        irrelevants = []

        for g in range(2):
            s, a, ir = self._generate_single_dataset(self.number[g],
                                                     self.score_threshold[g],
                                                     self.score_threshold_jitter[g],
                                                     self.irrelevant_mean[g],
                                                     self.irrelevant_std_dev[g])
            genders.append(np.full(self.number[g], g))
            scores.append(s)
            admissions.append(a)
            irrelevants.append(ir)
        
        all_scores = np.concatenate( (scores[0], scores[1]), axis=None)
        all_admissions = np.concatenate( (admissions[0], admissions[1]), axis=None)
        all_irrelevants = np.concatenate( (irrelevants[0], irrelevants[1]), axis=None)
        all_genders = np.concatenate( (genders[0], genders[1]), axis=None)
        
        A = pd.Series(data=all_genders, name="Gender")
        X = pd.DataFrame({"score":all_scores,
                          "irrelevant": all_irrelevants})
        Y = pd.Series(data=all_admissions, name="Admitted")
        
        return X, Y, A
        
    def _generate_single_dataset(self,
                                 number_samples,
                                 threshold, threshold_jitter,
                                 irr_mean, irr_std_dev):
        scores = np.random.normal(loc=self.score_mean,
                                  scale=self.score_std_dev,
                                  size=number_samples)
        scores[ scores < 0 ] = 0
        scores[ scores > 1 ] = 1
    
        threshold = np.random.normal(loc=threshold, scale=threshold_jitter, size=number_samples)
        threshold[ threshold < 0 ] = 0
        threshold[ threshold > 1 ] = 1
    
        # Admit each individual whose score exceeds their (jittered) threshold
        admitted = (scores > threshold).astype(int)
    
        irrelevant = np.random.normal(loc=irr_mean, scale=irr_std_dev, size=number_samples)
    
        return scores, admitted, irrelevant
        

We then use this class to generate the data:

In [None]:
dg = DataGenerator()

X, Y, A = dg.generate()
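
Before looking at the data, it can help to estimate what admission rates the generator should produce. The following is a minimal sketch, not part of the original notebook. It assumes that scores and thresholds are independent normal variables with the parameters set in `DataGenerator`, and it ignores the clipping to [0, 1], which should rarely trigger with these values. The helper names `normal_cdf` and `expected_admission_rate` are hypothetical.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_admission_rate(score_mean, score_std, threshold_mean, threshold_std):
    """P(score > threshold) for independent normal score and threshold.

    The difference (score - threshold) is normal with mean
    score_mean - threshold_mean and variance score_std**2 + threshold_std**2.
    """
    diff_std = math.sqrt(score_std ** 2 + threshold_std ** 2)
    return normal_cdf((score_mean - threshold_mean) / diff_std)

# Parameter values copied from DataGenerator.__init__ (index 0 is female, 1 is male)
female_rate = expected_admission_rate(0.5, 0.1, 0.6, 0.05)
male_rate = expected_admission_rate(0.5, 0.1, 0.4, 0.05)

print(f"Expected female admission rate: {female_rate:.3f}")
print(f"Expected male admission rate:   {male_rate:.3f}")
```

Under these assumptions the expected rates come out at roughly 0.19 for females and 0.81 for males, even though both groups share the same score distribution.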

We will use `matplotlib` to examine some of the data. First we examine the distribution of the features in `X`. As expected, the "score" feature has the same distribution for both genders (the counts differ only because there are three times as many males as females), but the "irrelevant" feature shows a clear gender difference.

In [None]:
import matplotlib.pyplot as plt
plot_width = 12
plot_height = 8
plt.rcParams["figure.figsize"] = (plot_width, plot_height) # (w, h)

# Nice caption text
gender_labels = ["Female", "Male"]

# Plot two histograms for the given column
def histograms(X_s, A_s, col_name):
    
    sep_data = [X_s[col_name][A_s==0], X_s[col_name][A_s==1]]
    
    plt.hist(sep_data, histtype="step", bins=20, label=gender_labels)
    plt.xlabel(col_name)
    plt.ylabel("Counts")
    plt.legend()
    plt.show()
    
histograms(X, A, "score")
histograms(X, A, "irrelevant")
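
The histograms can be complemented with a numeric summary. The sketch below is an illustration rather than part of the original notebook: the helper `feature_gender_summary` is a hypothetical name, and the stand-in data only mimics the generator's parameters so that the block runs on its own. In the notebook, the same function could be called with `X` and `A` directly.

```python
import numpy as np
import pandas as pd

def feature_gender_summary(X_s, A_s, col_name):
    """Return per-gender means of a feature and its Pearson correlation with gender."""
    group_means = X_s[col_name].groupby(A_s).mean()
    correlation = X_s[col_name].corr(A_s)
    return group_means, correlation

# Stand-in data (assumption): 100 females (0) and 300 males (1),
# using the same means and standard deviations as DataGenerator
rng = np.random.default_rng(0)
A_demo = pd.Series(np.repeat([0, 1], [100, 300]), name="Gender")
X_demo = pd.DataFrame({
    "score": rng.normal(0.5, 0.1, size=400),
    "irrelevant": np.concatenate([rng.normal(0.3, 0.2, size=100),
                                  rng.normal(0.7, 0.2, size=300)]),
})

for col in X_demo.columns:
    means, corr = feature_gender_summary(X_demo, A_demo, col)
    print(f"{col}: female mean={means[0]:.3f}, male mean={means[1]:.3f}, "
          f"correlation with gender={corr:.3f}")
```

A correlation near zero for "score" and a clearly positive one for "irrelevant" would match what the histograms suggest.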

We can also examine whether each individual was admitted as a function of their test score. This clearly shows the bias against females:

In [None]:
def plot_admissions_vs_scores(X_s, Y_s, A_s):
    markers=[".", "x"]
    for i in range(2):
        mask = A_s == i
        plt.scatter(X_s[mask].score, Y_s[mask], label=gender_labels[i], marker=markers[i])
    plt.xlabel("Score")
    plt.ylabel("Admitted")
    plt.legend()
    plt.show()
    
plot_admissions_vs_scores(X, Y, A)
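
The bias visible in the plot can also be expressed as a single number: the gap between the admission rates of the two groups. The following is a minimal sketch with a hypothetical helper, `admission_rates`, and a tiny hand-made dataset so that it runs on its own. In the notebook, it could be applied to the generated `Y` and `A` instead.

```python
import pandas as pd

def admission_rates(Y_s, A_s):
    """Fraction of admitted individuals within each gender group."""
    return Y_s.groupby(A_s).mean()

# Tiny illustrative dataset (assumption): 0 is female, 1 is male
Y_demo = pd.Series([1, 0, 0, 0, 1, 1, 1, 0], name="Admitted")
A_demo = pd.Series([0, 0, 0, 0, 1, 1, 1, 1], name="Gender")

rates = admission_rates(Y_demo, A_demo)
gap = rates.max() - rates.min()

print(rates)
print("Difference in admission rates:", gap)
```

A gap of zero would mean both groups are admitted at the same rate. The larger the gap, the further the labels are from demographic parity.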