# Week 6 Homework: Refining Our Notion of Algorithmic Bias For COMPAS

In this notebook, we will continue our exploration of the COMPAS dataset. We'll more closely examine what it means for an algorithm to be fair using the [What-If Tool](https://pair-code.github.io/what-if-tool/index.html), a visual interface for probing machine learning models produced by the PAIR group at Google AI. 

Before starting this notebook, read [this article](https://pair-code.github.io/what-if-tool/ai-fairness.html) introducing the tool in the context of predicting loan risk. The article describes five different notions of algorithmic fairness that will come up in our analysis.

After reading the article, run the cell below to get started.

In [0]:
%%capture
import google.colab
!pip install --upgrade witwidget
import pandas as pd
import numpy as np
import tensorflow as tf
import functools

In [0]:
#@title Run this cell to import helpers { display-mode: "form" }

# Creates a tf feature spec from the dataframe and columns specified.
def create_feature_spec(df, columns=None):
    feature_spec = {}
    if columns is None:
        columns = df.columns.values.tolist()
    for f in columns:
        if df[f].dtype == np.dtype(np.int64):
            feature_spec[f] = tf.io.FixedLenFeature(shape=(), dtype=tf.int64)
        elif df[f].dtype == np.dtype(np.float64):
            feature_spec[f] = tf.io.FixedLenFeature(shape=(), dtype=tf.float32)
        else:
            feature_spec[f] = tf.io.FixedLenFeature(shape=(), dtype=tf.string)
    return feature_spec

# Creates simple numeric and categorical feature columns from a feature spec and a
# list of columns from that spec to use.
#
# NOTE: Models might perform better with some feature engineering such as bucketed
# numeric columns and hash-bucket/embedding columns for categorical features.
def create_feature_columns(columns, feature_spec):
    ret = []
    for col in columns:
        if feature_spec[col].dtype is tf.int64 or feature_spec[col].dtype is tf.float32:
            ret.append(tf.feature_column.numeric_column(col))
        else:
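            # NOTE: the vocabulary is read from the global dataframe `df`, which must
            # already be loaded when this function is called.
            ret.append(tf.feature_column.indicator_column(
                tf.feature_column.categorical_column_with_vocabulary_list(col, list(df[col].unique()))))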
    return ret

# An input function for providing input to a model from tf.Examples
def tfexamples_input_fn(examples, feature_spec, label, mode=tf.estimator.ModeKeys.EVAL,
                       num_epochs=None, 
                       batch_size=64):
    def ex_generator():
        for i in range(len(examples)):
            yield examples[i].SerializeToString()
    dataset = tf.data.Dataset.from_generator(
      ex_generator, tf.dtypes.string, tf.TensorShape([]))
    if mode == tf.estimator.ModeKeys.TRAIN:
        dataset = dataset.shuffle(buffer_size=2 * batch_size + 1)
    dataset = dataset.batch(batch_size)
    dataset = dataset.map(lambda tf_example: parse_tf_example(tf_example, label, feature_spec))
    dataset = dataset.repeat(num_epochs)
    return dataset

# Parses tf.Example protos into features for the input function.
def parse_tf_example(example_proto, label, feature_spec):
    parsed_features = tf.io.parse_example(serialized=example_proto, features=feature_spec)
    target = parsed_features.pop(label)
    return parsed_features, target

# Converts a dataframe into a list of tf.Example protos.
def df_to_examples(df, columns=None):
    examples = []
    if columns is None:
        columns = df.columns.values.tolist()
    for index, row in df.iterrows():
        example = tf.train.Example()
        for col in columns:
            if df[col].dtype == np.dtype(np.int64):
                example.features.feature[col].int64_list.value.append(int(row[col]))
            elif df[col].dtype == np.dtype(np.float64):
                example.features.feature[col].float_list.value.append(row[col])
            # NaN is the only value not equal to itself, so this skips missing strings.
            elif row[col] == row[col]:
                example.features.feature[col].bytes_list.value.append(row[col].encode('utf-8'))
        examples.append(example)
    return examples

# Converts a dataframe column into a column of 0's and 1's based on the provided test.
# Used to force label columns to be numeric for binary classification using a TF estimator.
def make_label_column_numeric(df, label_column, test):
    df[label_column] = np.where(test(df[label_column]), 1, 0)

Run the cell below to load the COMPAS data. Note that the column names aren't exactly the same as before because we are using a different data source. Some columns to highlight: *recidivism_within_2_years* is the ground-truth recidivism, *decile_score* is the predicted recidivism score according to COMPAS, and *COMPAS_determination* is a binary number representing whether the predicted score is low (0) or not (1).

In [0]:
df = pd.read_csv('https://storage.googleapis.com/what-if-tool-resources/computefest2019/cox-violent-parsed_filt.csv')
df = df.drop(columns=[
    'id', 'dob', 'screening_date', 'age_cat', 'event', 'v_type_of_assessment',
    'v_decile_score', 'v_score_text', 'r_jail_in', 'vr_charge_degree',
    'vr_offense_date', 'vr_charge_desc', 'r_charge_degree', 'r_days_from_arrest',
    'r_offense_date', 'r_charge_desc', 'c_days_from_compas', 'c_charge_degree',
    'c_charge_desc', 'days_b_screening_arrest', 'c_jail_in', 'c_jail_out',
    'violent_recid', 'is_violent_recid', 'type_of_assessment', 'decile_score.1',
    'priors_count.1'])

# Filter out entries with no indication of recidivism or no COMPAS score
df = df[df['is_recid'] != -1]
df = df[df['decile_score'] != -1]

# Copy the recidivism column under a more descriptive name
df['recidivism_within_2_years'] = df['is_recid']

# Make the COMPAS label column numeric (0 and 1), for use in our model
df['COMPAS_determination'] = np.where(df['score_text'] == 'Low', 0, 1)

df.head()
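
To make the filtering and labeling steps in the cell above concrete, here is a minimal, dependency-free sketch on a few made-up rows. The rows and the helper name `prepare_rows` are hypothetical; only the `-1` sentinel and the "Low maps to 0, everything else maps to 1" rule come from the cell above.

```python
# Toy rows mimicking the relevant columns; the values are invented for illustration.
rows = [
    {"is_recid": 1, "decile_score": 8, "score_text": "High"},
    {"is_recid": 0, "decile_score": 2, "score_text": "Low"},
    {"is_recid": -1, "decile_score": 5, "score_text": "Medium"},  # no recidivism info
    {"is_recid": 0, "decile_score": -1, "score_text": "Low"},     # no COMPAS score
    {"is_recid": 1, "decile_score": 6, "score_text": "Medium"},
]

def prepare_rows(rows):
    """Drop rows carrying a -1 sentinel, then add the two derived columns."""
    prepared = []
    for row in rows:
        if row["is_recid"] == -1 or row["decile_score"] == -1:
            continue
        new_row = dict(row)
        new_row["recidivism_within_2_years"] = row["is_recid"]
        # 'Low' maps to 0; every other score_text value maps to 1.
        new_row["COMPAS_determination"] = 0 if row["score_text"] == "Low" else 1
        prepared.append(new_row)
    return prepared

prepared = prepare_rows(rows)
for row in prepared:
    print(row["score_text"], row["COMPAS_determination"], row["recidivism_within_2_years"])
```

Note that a "Medium" score ends up in the same class as a "High" score, so the binary label is really "low" versus "not low".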

We will now train a classifier to predict the *COMPAS_determination* field. In other words, we are training a classifier to replicate COMPAS rather than to predict the ground truth (as we did last week). We do this with the goal of better understanding the original COMPAS model that we are replicating!

In [0]:
# Set column to predict
label_column = 'COMPAS_determination'

# Get list of all columns from the dataset we will use for model input or output.
input_features = ['sex', 'age', 'race', 'priors_count', 'juv_fel_count', 'juv_misd_count', 'juv_other_count']
features_and_labels = input_features + [label_column]

features_for_file = input_features + ['recidivism_within_2_years', 'COMPAS_determination']

examples = df_to_examples(df, features_for_file)

In [0]:
#@title Create and train the classifier (run this cell!) {display-mode: "form"}

num_steps = 2000  #@param {type: "number"}
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.DEBUG)

# Create a feature spec for the classifier
feature_spec = create_feature_spec(df, features_and_labels)

# Define and train the classifier
train_inpf = functools.partial(tfexamples_input_fn, examples, feature_spec, label_column)
classifier = tf.estimator.LinearClassifier(
    feature_columns=create_feature_columns(input_features, feature_spec))
classifier.train(train_inpf, steps=num_steps)

Run the cell below to launch the What-If Tool. Note that this tool works best when using Google Chrome.

In [0]:
#@title Run this to launch What-If Tool for test data and the trained models {display-mode: "form"}


num_datapoints = 10000
tool_height_in_px = 700

from witwidget.notebook.visualization import WitConfigBuilder
from witwidget.notebook.visualization import WitWidget

# Set up the tool with the test examples and the trained classifier
config_builder = WitConfigBuilder(examples[0:num_datapoints]).set_estimator_and_feature_spec(
    classifier, feature_spec)
WitWidget(config_builder, height=tool_height_in_px)

# Exploration

Once you've launched the What-If Tool, begin by exploring the "Datapoint editor" tab (on the top-left), which lets you visualize datapoints and even modify individual ones. Try clicking on one of the red or blue dots on the right panel to examine that example. You can view the example's features, the ground truth (*recidivism_within_2_years*), and the inference value (predicted by the model). You can modify a feature and click the "Run inference" button to see the effect it has on the model's prediction. What happens if you change an example's race from "Caucasian" to "African-American"? To "Asian"? What happens if you change age and sex? Put your answers to these questions in the cell below, in a few sentences. If you like, you can also play around with the visualization tool on the right panel to explore different ways of slicing and presenting the data.
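
The edit-and-rerun loop described above can be sketched in plain Python: copy a datapoint, change one feature, and score both versions. This is only an illustration. The weights are invented and are not those of the trained classifier, and `score` and `counterfactual` are hypothetical stand-ins for a linear model followed by a sigmoid.

```python
import math

# Hypothetical weights for a tiny logistic model; NOT taken from the trained classifier.
WEIGHTS = {"age": -0.04, "priors_count": 0.25, "sex=Male": 0.3}
BIAS = 0.5

def score(example):
    """Return a probability-like score from a linear model plus a sigmoid."""
    z = BIAS
    z += WEIGHTS["age"] * example["age"]
    z += WEIGHTS["priors_count"] * example["priors_count"]
    z += WEIGHTS["sex=Male"] * (1 if example["sex"] == "Male" else 0)
    return 1.0 / (1.0 + math.exp(-z))

def counterfactual(example, feature, new_value):
    """Copy the example, change a single feature, and return both scores."""
    edited = dict(example)  # leave the original datapoint untouched
    edited[feature] = new_value
    return score(example), score(edited)

original = {"age": 25, "priors_count": 3, "sex": "Male"}
before, after = counterfactual(original, "age", 45)
print(f"age 25: {before:.3f}  ->  age 45: {after:.3f}")
```

Because only one feature changes between the two calls, any difference in the score is attributable to that feature alone, which is what makes this kind of edit a useful probe.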






In [0]:
'''
YOUR ANSWERS HERE
'''

After you've explored the "Datapoint editor" tab, click on the "Features" tab (near the top). This view shows the distribution of each feature across the examples, another useful way to better understand your data. We can see that certain demographics are overrepresented in our data (e.g., African-Americans and males). We can also see that about 25% of examples have 0 prior offenses (*priors_count*).
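
If you want to check a distribution by hand, the computation behind such a summary is just a normalized count. Below is a small sketch using only the standard library. The values are invented, and `value_shares` is a hypothetical helper, not part of the What-If Tool.

```python
from collections import Counter

def value_shares(values):
    """Return each distinct value's share of the total, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(value, count / total) for value, count in counts.most_common()]

# Invented values standing in for the priors_count column.
priors_count = [0, 0, 1, 3, 0, 2, 1, 7]
for value, share in value_shares(priors_count):
    print(f"priors_count={value}: {share:.0%}")
```

On the real dataframe, `df['priors_count'].value_counts(normalize=True)` gives the same kind of table.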

Finally, click on the "Performance and Fairness" tab at the top of the tool. On the left, under "Ground Truth Feature", select *recidivism_within_2_years*. Leave the cost ratio at 1. The cost ratio is the cost of a false positive relative to the cost of a false negative, and the tool uses it when choosing the threshold for binary classification (which we have previously set to 0.5). With a cost ratio of 1, we are effectively telling the tool that false positives are just as bad as false negatives, and it sets the threshold accordingly. Given the threshold, if the model predicts a value lower than the threshold, we say that recidivism risk is low; otherwise the risk is not low. Note that in the past we have arbitrarily used 0.5 as the threshold, but 0.5 is not necessarily the best threshold when both kinds of error count equally. A cost ratio of 1 may well be undesirable for this problem, but for consistency we leave it at 1 for now (feel free to play around with this later and see the effect on the thresholds). Lastly, under "Slice by", choose "race".
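
To build intuition for how a cost ratio can drive the choice of threshold, here is a minimal sketch that scans candidate thresholds and keeps the one with the lowest total cost. It reads the cost ratio as the cost of one false positive relative to one false negative. The scores and labels are invented, the function names are hypothetical, and this is one plausible way to do the search, not the tool's actual implementation.

```python
def confusion_counts(scores, labels, threshold):
    """Count false positives and false negatives at a given threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn

def best_threshold(scores, labels, cost_ratio=1.0, steps=100):
    """Scan thresholds 0.00, 0.01, ..., 1.00 and return the cheapest one.

    cost_ratio is the cost of one false positive relative to one false negative.
    Returns (threshold, total_cost, false_positives, false_negatives).
    """
    best = None
    for i in range(steps + 1):
        t = i / steps
        fp, fn = confusion_counts(scores, labels, t)
        cost = cost_ratio * fp + fn
        # Strict comparison keeps the lowest threshold among ties.
        if best is None or cost < best[1]:
            best = (t, cost, fp, fn)
    return best

# Invented model scores and ground-truth labels.
scores = [0.10, 0.30, 0.45, 0.55, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

print(best_threshold(scores, labels, cost_ratio=1.0))
print(best_threshold(scores, labels, cost_ratio=0.2))  # false negatives cost 5x more
```

Lowering the cost ratio makes false negatives relatively more expensive, so the search settles on a lower threshold that flags more people as "not low" risk.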

Now you can view performance information for each race. Click "Single threshold" to start. The threshold value (0.64) is chosen for the data as a whole at a cost ratio of 1, but you'll notice that different races have different ratios of false positives to false negatives. In particular, African-Americans have a higher false positive rate than other races.
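
The per-group numbers in this view come from splitting the confusion matrix by group while keeping one shared threshold. The sketch below shows that computation on invented data. The group names "A" and "B", the records, and `rates_by_group` are all hypothetical; only the threshold value 0.64 is borrowed from the text above.

```python
from collections import defaultdict

def rates_by_group(records, threshold):
    """Return {group: (false_positive_rate, false_negative_rate)} at one shared threshold."""
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, score, label in records:
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            counts[group]["fp"] += 1
        elif predicted == 0 and label == 0:
            counts[group]["tn"] += 1
        elif predicted == 0 and label == 1:
            counts[group]["fn"] += 1
        else:
            counts[group]["tp"] += 1
    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]  # people who did not reoffend
        positives = c["fn"] + c["tp"]  # people who did reoffend
        fpr = c["fp"] / negatives if negatives else float("nan")
        fnr = c["fn"] / positives if positives else float("nan")
        rates[group] = (fpr, fnr)
    return rates

# Invented (group, model score, ground-truth label) triples.
records = [
    ("A", 0.9, 1), ("A", 0.7, 0), ("A", 0.8, 0), ("A", 0.3, 0), ("A", 0.6, 1),
    ("B", 0.7, 1), ("B", 0.4, 0), ("B", 0.2, 0), ("B", 0.5, 1), ("B", 0.3, 0),
]
for group, (fpr, fnr) in rates_by_group(records, threshold=0.64).items():
    print(f"group {group}: FPR={fpr:.2f}  FNR={fnr:.2f}")
```

Even with an identical threshold, the two toy groups end up with different false positive rates because their score distributions differ.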

Note that "Single threshold" corresponds most closely to "Group unaware" from the reading, although in the "Single threshold" case, race is still used as a feature, even if thresholds are the same for all races. The other notions of algorithmic fairness from the reading appear as the remaining options.

Play around with the different options, noticing how the thresholds, rates of false positives and false negatives, and accuracies change for each.

Given how the thresholds differ from option to option, you might suspect that it is impossible to satisfy all of these notions of fairness at the same time. In fact, researchers [have shown](https://arxiv.org/pdf/1609.05807.pdf) that, except in highly constrained special cases, several natural fairness conditions cannot all hold at once, so this intuition is borne out.

Given that we cannot fulfill all notions of algorithmic fairness at once for the problem of predicting recidivism, which would you choose? In the cell below, pick one of the following, and defend your choice in a few sentences. There is no right answer! Be sure to reference figures from the What-If Tool in your analysis.

1.   Group unaware
2.   Group thresholds
3.   Demographic parity
4.   Equal opportunity
5.   Equal accuracy
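
When comparing these options, it can help to see which quantity each one tries to equalize across groups. The sketch below computes three of them on invented predictions: the positive prediction rate (compared under demographic parity), the true positive rate (compared under equal opportunity), and accuracy (compared under equal accuracy). The data, the group names, and the helpers `group_metrics` and `gap` are hypothetical, and the definitions follow our reading of the article, so treat this as an aid to intuition, not as the tool's own computation.

```python
def group_metrics(records):
    """Return {group: {"positive_rate", "true_positive_rate", "accuracy"}}.

    Each record is (group, predicted_label, true_label) with labels in {0, 1}.
    """
    by_group = {}
    for group, predicted, actual in records:
        by_group.setdefault(group, []).append((predicted, actual))
    metrics = {}
    for group, pairs in by_group.items():
        n = len(pairs)
        # Predictions for the members of this group whose true label is 1.
        preds_for_positives = [p for p, a in pairs if a == 1]
        metrics[group] = {
            # Demographic parity compares this across groups.
            "positive_rate": sum(p for p, _ in pairs) / n,
            # Equal opportunity compares this across groups.
            "true_positive_rate": (sum(preds_for_positives) / len(preds_for_positives)
                                   if preds_for_positives else float("nan")),
            # Equal accuracy compares this across groups.
            "accuracy": sum(1 for p, a in pairs if p == a) / n,
        }
    return metrics

def gap(metrics, name):
    """Largest difference in one metric between any two groups."""
    values = [m[name] for m in metrics.values()]
    return max(values) - min(values)

# Invented (group, predicted label, true label) triples.
records = [
    ("A", 1, 1), ("A", 1, 0), ("A", 0, 0), ("A", 1, 1),
    ("B", 1, 1), ("B", 0, 1), ("B", 0, 0), ("B", 0, 0),
]
metrics = group_metrics(records)
for name in ("positive_rate", "true_positive_rate", "accuracy"):
    print(f"{name}: gap between groups = {gap(metrics, name):.2f}")
```

In this toy data the two groups have identical accuracy yet differ on the other two measures, which illustrates how one notion of fairness can be met while others are not.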


In [0]:
'''
YOUR ANSWER HERE
'''

# Acknowledgements

This notebook is adapted from [code from the PAIR group at Google](https://github.com/PAIR-code/what-if-tool/blob/master/WIT_COMPAS.ipynb). If you liked this homework, you can use the What-If Tool to [perform a similar analysis on an income classification model built on top of Census Bureau data](https://colab.research.google.com/github/pair-code/what-if-tool/blob/master/WIT_Model_Comparison.ipynb).