#### Copyright 2018 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# Adversarial Multi-task Learning
Authors: Zhe Zhao, Ben Hutchinson, Andrew Zaldivar, Emanuel Schorsch, Jilin Chen, Alex Beutel, Margaret Mitchell

***

Click [here](https://colab.research.google.com/github/google/ml_fairness/blob/master/colabs/adversarial_multitask_learning.ipynb) to run this colab interactively on colab.research.google.com.

***

## Summary of This Notebook

This notebook demonstrates a handy technique for fairness, using TensorFlow: adversarial multi-task learning.
It covers some of the work in the paper ([arxiv](https://arxiv.org/pdf/1707.00075.pdf)):

> Alex Beutel, Jilin Chen, Zhe Zhao, Ed H. Chi.  Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations.  FAT/ML, 2017.

For further notebooks walking through other fairness-relevant techniques, check out the [ML Fairness web page](https://developers.google.com/machine-learning/fairness-overview/).

We first walk through how to set up a multi-task learning model in TensorFlow, then extend it to the adversarial case, where we negate the gradient of one of the tasks.  This adversarial technique can help remove the effect of signals that you don't want your model to pick up on, such as gender or sex.

## Intro Statement of Problem

Multi-task learning is a technique we can use in neural networks to make multiple predictions at the same time from the same model.
For example, a multi-task learning model can be trained to make predictions for both education level and income from the same input features.

This can be useful when the tasks are related, and so being able to make predictions about one helps with the ability to make predictions about the others.  In this colab, we walk through making predictions about sex and income.

## About the UCI Census Income (Adult) Data Set

The data set used throughout this notebook comes from the [1994 Census Income database](https://archive.ics.uci.edu/ml/datasets/Census+Income). Here's some information about each feature:

|Column Name|Type|Description|
|:---|:---|:---|
|age|Continuous|The age of the individual|
|workclass|Categorical|The type of employer the individual has|
|fnlwgt|Continuous|# of people census takers believe that observation represents|
|education|Categorical|The highest level of education achieved for that individual|
|education_num|Continuous|The highest level of education in numerical form|
|marital_status|Categorical|Marital status of the individual|
|occupation|Categorical|The occupation of the individual|
|relationship|Categorical|Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried|
|race|Categorical|White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black|
|sex|Categorical|Female, Male|
|capital_gain|Continuous|Capital gains recorded|
|capital_loss|Continuous|Capital losses recorded|
|hours_per_week|Continuous|Hours worked per week|
|native_country|Categorical|Country of origin of the individual|
|income_bracket|Categorical|Whether the person makes more than $50,000 annually|

## Let's Get Started!

We will read in and preprocess the data using basic TensorFlow functions under the hood.

First, we'll import all the packages that we'll need.


In [0]:
!pip install -U -q git+https://github.com/google/ml_fairness
  
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from ml_fairness import simple_multitask_model

import os
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tempfile
import tensorflow as tf

tf.logging.set_verbosity(tf.logging.ERROR)

Now we'll copy the required data files from Google Cloud Storage to a local directory.

To do this we'll first need to authenticate to Google Cloud Storage. Executing the following cell will generate a link which you'll need to follow to get a verification code.

In [0]:
from google.colab import auth
auth.authenticate_user()

Now we'll sync the data for the colab to a tmp directory from Google Cloud Storage.

In [0]:
project_id = 'mledu-fairness'
!gcloud config set project {project_id}

gcs_bucket_name = 'mledu-fairness/colabs/multitask_learning'
local_dir_name = '/tmp/multitask_learning'
if not os.path.exists(local_dir_name):
  print("creating dir %s" % local_dir_name)
  !mkdir {local_dir_name}
  
!gsutil rsync gs://{gcs_bucket_name} {local_dir_name}

!ls -al {local_dir_name}  

COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num",
           "marital_status", "occupation", "relationship", "race", "sex",
           "capital_gain", "capital_loss", "hours_per_week", "native_country",
           "income_bracket"]

TRAIN_FILE = os.path.join(local_dir_name, "adult.data")
TEST_FILE = os.path.join(local_dir_name, "adult.test")

train_df = pd.read_csv(TRAIN_FILE, names=COLUMNS, 
                       skipinitialspace=True)
test_df = pd.read_csv(TEST_FILE, names=COLUMNS, 
                      skipinitialspace=True, skiprows=1)

print('UCI Census Income (Adult) Data Set loaded.')

Run this cell to give the plots nicer styling than matplotlib's default settings.

In [0]:
file_contents = """
axes.linewidth: 1.5
axes.grid: True
axes.titlesize: x-large
axes.labelsize: large
axes.axisbelow: True
axes.facecolor: F7F7F7

axes.prop_cycle: cycler('color', ['4285F4', 'DB4437', '0F9D58', 'F4B400', '9E9E9E', 'D2691E', 'FF6347', '87CEEB', 'FFC0CB', '32CD32', 'DA70D6', '808000', 'FFD700'])

font.size: 14
font.family: sans-serif

xtick.labelsize: large
ytick.labelsize: large

grid.linewidth: 1.5

figure.autolayout: True
figure.figsize: 10, 10
figure.titleweight: bold

legend.fontsize: large
legend.loc: best
legend.fancybox: True
legend.facecolor: FAFBFC
legend.frameon: True
figure.dpi: 1000

lines.linewidth: 2

text.antialiased: True
text.hinting : auto
"""
# Use a tempfile because matplotlib has to load a file by name or URL.
# Open it in text mode so that writing a str also works on Python 3.
f = tempfile.NamedTemporaryFile(mode='w', delete=False)
name = f.name
f.write(file_contents)
f.close()

# Load the style sheet.
plt.style.use(name)

# Make plots bigger.
plt.rcParams["figure.figsize"] = (8, 8)

# Make plots higher resolution.
%config InlineBackend.figure_format='retina'

print('Done!')

# Multi-task Learning

Now let's define some of the constants that we'll use throughout the colab.

In [0]:
# Multi-task variables ======================================================
MAIN = "main"
AUX = "aux"
TASKS = (MAIN, AUX)
IDS = "ids"
WEIGHTS = "weights"
# These names are built/assumed by the multi-task model;
# they are defined here so that we can access them in the output.
H_MAIN = "h_" + MAIN
H_AUX = "h_" + AUX
W_MAIN = WEIGHTS + "_" + MAIN
W_AUX = WEIGHTS + "_" + AUX

The basic idea behind Multi-task Learning (MTL) is to use the same model to make predictions about multiple variables.

This might include a main variable you want to make predictions for, like income bracket; and another sensitive variable that you want -- or do not want -- to be related to the predictions for the main variable.

This is often cast as a **main task** and **auxiliary tasks**.  

The *main task* is what you'd like the model to perform best on.  

The *auxiliary tasks* can help to improve performance on the main task, or change the way the parameters are learned for the main task.

If the tasks are related, then sometimes the model can improve performance on several of the tasks, by considering them all at the same time.
As we will see, we can also use multi-task learning to explicitly *discourage* the model from learning an auxiliary task, while doing well on a main task -- effectively reshaping the signal for the main task.
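
To make the idea of one model serving several tasks concrete, here is a minimal pure-Python sketch of a "shared bottom" network: a single hidden layer feeds two task-specific heads. The weights, layer sizes and function names are illustrative assumptions and are not taken from `simple_multitask_model`.

```python
import math

def relu(x):
  return max(0.0, x)

def sigmoid(x):
  return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
  """One fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
  return [sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
          for j in range(len(biases))]

def shared_bottom_forward(features, params):
  # The shared layer is computed once and reused by every task head.
  shared = [relu(h)
            for h in dense(features, params["shared_w"], params["shared_b"])]
  # Each head has its own weights but reads the same shared representation.
  return {task: sigmoid(dense(shared, head["w"], head["b"])[0])
          for task, head in params["heads"].items()}

# Hypothetical hand-picked weights: 2 input features, 2 shared units, 2 heads.
params = {
    "shared_w": [[1.0, -1.0], [0.5, 0.5]],
    "shared_b": [0.0, 0.0],
    "heads": {
        "main": {"w": [[1.0], [1.0]], "b": [0.0]},
        "aux": {"w": [[-1.0], [1.0]], "b": [0.0]},
    },
}
predictions = shared_bottom_forward([1.0, 2.0], params)
print(predictions)
```

Because both heads read the same `shared` values, any gradient that reaches the shared layer from one task also changes what the other task sees.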

First, we will begin by defining two tasks:


1.   Income bracket, the main task.
2.   Sex, the auxiliary task.

We set the main task to predict income bracket in two buckets: > 50K and <= 50K.


In [0]:
train_df[MAIN] = (train_df["income_bracket"].apply(
    lambda x: ">50K" in x)).astype(np.float32)
test_df[MAIN] = (test_df["income_bracket"].apply(
    lambda x: ">50K" in x)).astype(np.float32)
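
A short sketch of why a substring check is used here rather than an exact comparison. In the UCI distribution of this data set, the labels in `adult.test` end with a period (`>50K.`), unlike those in `adult.data`; this sketch assumes the copies used in this colab follow the same convention. The helper name `is_high_income` is hypothetical.

```python
def is_high_income(raw_label):
  """Mirrors the lambda above: True when the raw label contains '>50K'."""
  return ">50K" in raw_label

# Label spellings as found in the UCI train and test files respectively.
train_style = [">50K", "<=50K"]
test_style = [">50K.", "<=50K."]

main_train = [float(is_high_income(x)) for x in train_style]
main_test = [float(is_high_income(x)) for x in test_style]
print(main_train, main_test)

# An exact comparison would silently label every test row as 0.
exact_match_test = [float(x == ">50K") for x in test_style]
print(exact_match_test)
```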

Now we set up the auxiliary task, sex. 

In [0]:
train_df[AUX] = train_df["sex"].apply(
    lambda x: "Male" in x).astype(np.float32)
test_df[AUX] = test_df["sex"].apply(
    lambda x: "Male" in x).astype(np.float32)

Now we remove the variables (tasks) that we are predicting from the input data -- without doing this, the model will be able to see the answers in its input.

In [0]:
train_df = train_df.drop(['income_bracket', 'sex'], axis=1)
test_df = test_df.drop(['income_bracket', 'sex'], axis=1)

Now we create train and test data for the multi-task setup.

## Negative Sampling

Now we're going to modify how we sample from the data by using a **negative sampling** technique. 

Using this technique, the data is sampled to balance the weight of _positive_ examples and _negative_ examples.

In [0]:
# To balance the data set, call this function.
def negative_sampling(labels, neg_pos_ratio):
  weight = tf.where(
      labels > 0.5, tf.ones_like(labels), tf.ones_like(labels) * neg_pos_ratio)
  if neg_pos_ratio > 1.0:
    pos_sample_rate = 1.0
    neg_sample_rate = 1.0 / neg_pos_ratio
  else:
    pos_sample_rate = neg_pos_ratio
    neg_sample_rate = 1.0
  sampling = tf.random_uniform(labels.shape) < tf.where(
      labels>0.5, 
      tf.ones_like(labels) * pos_sample_rate,
      tf.ones_like(labels) * neg_sample_rate)
  return weight, sampling
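
To see what these rates and weights look like on concrete numbers, here is a pure-Python sketch that mirrors the branching in `negative_sampling` on a toy label list. The helper name `sampling_plan` and the toy labels are assumptions made for illustration.

```python
def sampling_plan(labels):
  """Returns (neg_pos_ratio, pos_rate, neg_rate, per-example weights)."""
  num_pos = sum(1 for y in labels if y > 0.5)
  num_neg = len(labels) - num_pos
  neg_pos_ratio = float(num_neg) / num_pos
  if neg_pos_ratio > 1.0:
    # More negatives than positives: keep all positives, thin out negatives.
    pos_rate, neg_rate = 1.0, 1.0 / neg_pos_ratio
  else:
    # More positives than negatives: keep all negatives, thin out positives.
    pos_rate, neg_rate = neg_pos_ratio, 1.0
  # As in negative_sampling(): positives get weight 1, negatives get the ratio.
  weights = [1.0 if y > 0.5 else neg_pos_ratio for y in labels]
  return neg_pos_ratio, pos_rate, neg_rate, weights

# Toy labels: 2 positives and 6 negatives.
labels = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
ratio, pos_rate, neg_rate, weights = sampling_plan(labels)
expected_kept_pos = pos_rate * 2
expected_kept_neg = neg_rate * 6
print(ratio, pos_rate, neg_rate)
print(expected_kept_pos, expected_kept_neg)
```

With these toy labels, negatives are kept with probability 1/3, so a sampled batch is balanced in expectation. One way to read the weight of 3 on each kept negative is that it compensates in the loss for the negatives that were dropped.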

## FairAware Note

The way that you sample from your data can influence your results. 

Training your model to pay more or less attention to different examples in your training data affects which variables it cares the most about.

***

Next, we'll set up the `input_fn` for Tensorflow to read the data.

Run this cell to load `input_fn` for handling the multi-task case.

In [0]:
def input_fn(df, batch_size=100, sampling_task_name=None):
  # Creates a dictionary mapping from each continuous feature column name (k) to
  # the values of that column stored in a constant Tensor.
  continuous_columns = list(df.select_dtypes(exclude=['object']))
  categorical_columns = list(df.select_dtypes(include=['object']))
  continuous_cols = {k: tf.constant(df[k].values)
                     for k in continuous_columns}
  # Creates a dictionary mapping from each categorical feature column name (k)
  # to the values of that column stored in a tf.SparseTensor.
  #scipy.sparse.csr_matrix(df.values)
  categorical_cols = {k: tf.SparseTensor(
      indices=[[i, 0] for i in range(df[k].size)],
      values=df[k].values,
      dense_shape=[df[k].size, 1])
                      for k in categorical_columns}
  # Merges the two dictionaries into one (works on both Python 2 and 3).
  features = dict(continuous_cols)
  features.update(categorical_cols)
  # Convert the label column into a constant Tensor for each task.
  targets = {k: tf.constant(df[k].values) for k in TASKS}
  if batch_size <= 0:
    return features, targets
  ids = tf.reshape(tf.where(tf.ones(shape=[len(df),]) > 0), shape=[-1,])
  inputs = {}
  inputs.update(features)
  inputs.update(targets)
  inputs[IDS] = ids
  if sampling_task_name:
    neg_pos_ratio = (
        float(len(df[sampling_task_name]) - sum(df[sampling_task_name])) /
        sum(df[sampling_task_name]))
    weight, sampling = negative_sampling(
        inputs[sampling_task_name], neg_pos_ratio)
    inputs[WEIGHTS] = tf.reshape(weight, shape=[-1, 1])
    batched_inputs = tf.train.maybe_batch(
        inputs, sampling, batch_size, enqueue_many=True)
  else:
    inputs[WEIGHTS] = tf.ones(shape=[len(df), 1])
    batched_inputs = tf.train.batch(inputs, batch_size, enqueue_many=True)
 
  batched_targets = {MAIN: tf.reshape(batched_inputs.pop(MAIN), shape=[-1, 1]), 
                     AUX: tf.reshape(batched_inputs.pop(AUX),
                                         shape=[-1, 1])}
  
  return batched_inputs, batched_targets

print('input_fn() loaded.')

Next, we build one input function per task (income and sex) and merge them into a single multi-task input function.

In [0]:
def make_multitask_input_fn(df, batch_sizes=[100, 100], tasks=TASKS):
  def make_current_input_fn(df, batch_size, task):
    def current_input_fn():
      return input_fn(df, batch_size, task)
    return current_input_fn
  input_fn_dict = { task : make_current_input_fn(df, batch_size, task)
                    for (batch_size, task) in zip(batch_sizes, tasks) }
  return simple_multitask_model.merge_multiple_input_fn(input_fn_dict, WEIGHTS)

Now we set up the `input_fn` for a few comparisons.

We'll compare a baseline (single-task) with multi-task and adversarial multi-task.

In [0]:
base_train_input_fn = make_multitask_input_fn(
    train_df, batch_sizes=[100], tasks=[MAIN])

multi_train_input_fn = make_multitask_input_fn(
    train_df, batch_sizes=[100, 100], tasks=TASKS)

eval_input_fn = make_multitask_input_fn(
    test_df, batch_sizes=[5000, 5000], tasks=TASKS)

Now let's define the tasks for the multi-task models, and initialize the weights and shared hidden units.

In [0]:
# Dict mapping target/head names to the dimensionality of predicted targets.
num_classes_tasks = {MAIN: 2, AUX: 2}

task_names_dict = {MAIN: (H_MAIN, W_MAIN),
                   AUX: (H_AUX, W_AUX)}
# List of `_Head` instances.
all_heads = []
for task_name in (MAIN, AUX):
  head_name = task_names_dict[task_name][0]
  weight_column_name = task_names_dict[task_name][1]
  multi_head = tf.contrib.learn.multi_class_head(
      num_classes_tasks[task_name],
      label_name=task_name,
      head_name=head_name,
      weight_column_name=weight_column_name)
  all_heads.append(multi_head)
shared_hidden_units = [64]
hidden_units_dict = {H_MAIN: [], H_AUX: []}
baseline_heads = [all_heads[0]]
multi_heads = all_heads

Next, we get the features for our different variables.  We use a different formulation than above because this part of the code works with an older version of TensorFlow.

Here, we work with an **embedding** for each variable.  This requires defining columns as either sparse or real-valued.
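
As a rough picture of what a hashed sparse column with an embedding does, the pure-Python sketch below maps a category string to a bucket id and looks up a vector for that bucket. MD5 is only a stand-in here; TensorFlow uses its own hash function, and in a real model the table rows are learned rather than fixed random values. All helper names are hypothetical.

```python
import hashlib
import random

def hash_bucket(value, hash_bucket_size):
  """Maps a string to a stable bucket id in [0, hash_bucket_size)."""
  digest = hashlib.md5(value.encode("utf-8")).hexdigest()
  return int(digest, 16) % hash_bucket_size

def make_embedding_table(hash_bucket_size, embedding_dim, seed=0):
  """Builds a randomly initialised table with one row per bucket."""
  rng = random.Random(seed)
  return [[rng.uniform(-0.1, 0.1) for _ in range(embedding_dim)]
          for _ in range(hash_bucket_size)]

hash_bucket_size = 100
embedding_dim = 10
table = make_embedding_table(hash_bucket_size, embedding_dim)

# Every occurrence of the same string lands in the same bucket,
# and therefore shares the same embedding row.
bucket_id = hash_bucket("Bachelors", hash_bucket_size)
embedding = table[bucket_id]
print(bucket_id, len(embedding))
```

Two different strings can collide in one bucket, which is one reason to pick a `hash_bucket_size` well above the number of distinct categories.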

## FairAware Note

Below, we use `race` as an input feature.  

This is a sensitive characteristic that may or may not make sense to include.

Try removing this and see what changes.

What other features might you use that are included/encompassed in what we think of as `race`, but are not sensitive?

***

Run the cell below to load the feature columns function for the multi-task case. 


In [0]:
# Get feature columns
def get_feature_columns(embedding_dim=10):
  race = tf.contrib.layers.sparse_column_with_hash_bucket(
      "race", hash_bucket_size=100)

  education = tf.contrib.layers.sparse_column_with_hash_bucket(
      "education", hash_bucket_size=1000)
  marital_status = tf.contrib.layers.sparse_column_with_hash_bucket(
      "marital_status", hash_bucket_size=100)
  relationship = tf.contrib.layers.sparse_column_with_hash_bucket(
      "relationship", hash_bucket_size=100)
  workclass = tf.contrib.layers.sparse_column_with_hash_bucket(
      "workclass", hash_bucket_size=100)
  occupation = tf.contrib.layers.sparse_column_with_hash_bucket(
      "occupation", hash_bucket_size=1000)
  native_country = tf.contrib.layers.sparse_column_with_hash_bucket(
      "native_country", hash_bucket_size=1000)

  age = tf.contrib.layers.real_valued_column("age")
  age_buckets = tf.contrib.layers.bucketized_column(
      age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
  education_num = tf.contrib.layers.real_valued_column("education_num")
  capital_gain = tf.contrib.layers.real_valued_column("capital_gain")
  capital_loss = tf.contrib.layers.real_valued_column("capital_loss")
  hours_per_week = tf.contrib.layers.real_valued_column("hours_per_week")

  education_embedding = tf.contrib.layers.embedding_column(
      education, embedding_dim)
  marital_status_embedding = tf.contrib.layers.embedding_column(
      marital_status, embedding_dim)
  relationship_embedding = tf.contrib.layers.embedding_column(
      relationship, embedding_dim)
  workclass_embedding = tf.contrib.layers.embedding_column(
      workclass, embedding_dim)
  occupation_embedding = tf.contrib.layers.embedding_column(
      occupation, embedding_dim)
  native_country_embedding = tf.contrib.layers.embedding_column(
      native_country, embedding_dim)
  
  race_embedding = tf.contrib.layers.embedding_column(
      race, embedding_dim)

  feature_columns = [age_buckets, education_num, capital_gain,
                     capital_loss, hours_per_week,
                     education_embedding, marital_status_embedding, 
                     relationship_embedding, workclass_embedding, 
                     occupation_embedding, native_country_embedding]
  # This is a sensitive characteristic!
  feature_columns += [race_embedding]
  return feature_columns

print('get_feature_columns() loaded.')
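
The `age_buckets` column turns a continuous age into a bucket index. The sketch below reproduces that mapping in pure Python with `bisect`, assuming the left-inclusive bucket convention described in the TensorFlow documentation for bucketized columns. The helper names are hypothetical.

```python
import bisect

AGE_BOUNDARIES = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]

def age_bucket(age, boundaries=AGE_BOUNDARIES):
  """Returns the index of the bucket that `age` falls into."""
  # bisect_right sends a value equal to a boundary into the bucket on its
  # right, so the buckets are (-inf, 18), [18, 25), ..., [65, +inf).
  return bisect.bisect_right(boundaries, age)

def one_hot(index, size):
  """Encodes a bucket index as a one-hot list of floats."""
  return [1.0 if i == index else 0.0 for i in range(size)]

# Ten boundaries produce eleven buckets.
num_buckets = len(AGE_BOUNDARIES) + 1
for age in (17, 18, 39, 65):
  print(age, age_bucket(age))
encoded = one_hot(age_bucket(39), num_buckets)
```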

Now set up the feature columns for some TensorFlow multi-taskin'!

In [0]:
feature_columns = get_feature_columns()

Define some helper functions for creating the estimators.

In [0]:
def train_op_fn(loss):
  """Returns the op to optimize the loss."""
  return tf.contrib.layers.optimize_loss(
      loss, tf.contrib.framework.get_global_step(), 1.0, "Adagrad")

def create_model_fn(feature_columns, heads, shared_hidden_units,
                    hidden_units_dict, scale_gradient_targets_dict={}):
  def model_fn(features, labels, mode):
    targets_dict = {head.head_name: head.logits_dimension for head in heads}
    predictions = simple_multitask_model.build_shared_bottom_model(
        features,
        feature_columns,
        targets_dict,
        shared_hidden_units,
        hidden_units_dict,
        is_training=(mode == tf.contrib.learn.ModeKeys.TRAIN),
        activation_fn=tf.nn.relu,
        scale_gradient_targets_dict=scale_gradient_targets_dict)
    multihead = tf.contrib.learn.multi_head(heads)
    return multihead.create_model_fn_ops(features, mode, labels, train_op_fn,
                                      predictions, scale_gradient_targets_dict)
  return model_fn

Now we can create the Estimators!

In [0]:
estimator_base = tf.contrib.learn.Estimator(
    create_model_fn(
        feature_columns,
        baseline_heads,
        shared_hidden_units, 
        hidden_units_dict={H_MAIN: []}))

estimator_multi = tf.contrib.learn.Estimator(
    create_model_fn(
        feature_columns,
        multi_heads,
        shared_hidden_units, 
        hidden_units_dict))

# Adversarial Multi-task Learning

To compare against a model trained adversarially, we scale the gradient of our chosen adversarial auxiliary task by a negative value.

This negates the gradient for that task, backpropagating the negation through the network.

This effectively tells the network that the *better* it is at making predictions for the auxiliary task, the more it should negate (penalize) the effect of the parameters driving the prediction.
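
The effect of a negative gradient scale can be worked through by hand on a tiny model. The sketch below is a scalar example with one shared parameter `w` and one auxiliary head parameter `v`; it assumes the scale is applied to the gradient as it flows from the auxiliary head back into the shared representation. Whether `simple_multitask_model` applies the scale at exactly this point is an assumption, and all names here are illustrative.

```python
def aux_gradients(w, v, x, y, scale):
  """Gradients of a squared-error auxiliary loss for a two-parameter model.

  Forward pass: h = w * x (shared representation), p = v * h (aux head).
  The gradient flowing from the head back into h is multiplied by `scale`.
  """
  h = w * x
  p = v * h
  d_loss_d_p = p - y                   # derivative of 0.5 * (p - y) ** 2
  grad_v = d_loss_d_p * h              # the head itself is trained normally
  d_loss_d_h = scale * d_loss_d_p * v  # scaled (here: reversed) at the boundary
  grad_w = d_loss_d_h * x
  return grad_w, grad_v

w, v, x, y = 0.5, 2.0, 1.0, 3.0
plain_w, plain_v = aux_gradients(w, v, x, y, scale=1.0)
adv_w, adv_v = aux_gradients(w, v, x, y, scale=-0.01)
print(plain_w, plain_v)
print(adv_w, adv_v)
```

With `scale=1.0` the shared parameter receives a gradient of -4.0, so gradient descent moves it in the direction that helps the auxiliary task. With `scale=-0.01` it receives 0.04: a small push in the opposite direction, while the head's own gradient is unchanged.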

In [0]:
scale_gradient_targets_dict = {H_AUX: -0.01}

This is easy to do with the `simple_multitask_model` module!  Now let's create the adversarial estimator.

In [0]:
estimator_adv = tf.contrib.learn.Estimator(
    create_model_fn(
        feature_columns,
        multi_heads,
        shared_hidden_units, 
        hidden_units_dict,
        scale_gradient_targets_dict))

Now we'll step through training for 5 rounds (epochs), fitting the model for a further 300 steps in each round.

In [0]:
num_epochs = 5
train_steps = 300

# Hack to alter behavior when running unit tests.
if 'COLAB_NOTEBOOK_TEST' in os.environ:
  # Use a small number of steps when running unit tests, for faster
  # tests and less memory usage.
  train_steps = 2
  num_epochs = 2
  tf.logging.info('Running in test mode, using %d steps and %d epochs' % (train_steps, num_epochs))

Each time we fit, we evaluate on the test set using the baseline and the adversarial network to predict the income bracket. 

We will calculate how well it's performing using the common metric **AUC**, or Area Under the Curve.

The 'Curve' usually refers to the [Receiver Operating Characteristic (ROC)](https://en.wikipedia.org/wiki/Receiver_operating_characteristic) curve, which is handy to know about when working on fairness.

The ROC curve measures the tradeoff between the False Positive Rate (FPR) and the True Positive Rate (TPR).

True Positive Rate = 1 - False Negative Rate, so knowing the TPR also lets you know the FNR.
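
One way to build intuition for AUC is its ranking interpretation: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way on toy scores. This is for intuition only; TensorFlow's AUC metric is computed differently (as an approximation over thresholds), and the function name and data are made up for illustration.

```python
def pairwise_auc(labels, scores):
  """AUC as the share of (positive, negative) pairs ranked correctly.

  Ties between a positive and a negative score count as half a correct pair.
  """
  pos = [s for y, s in zip(labels, scores) if y == 1]
  neg = [s for y, s in zip(labels, scores) if y == 0]
  correct = 0.0
  for p in pos:
    for n in neg:
      if p > n:
        correct += 1.0
      elif p == n:
        correct += 0.5
  return correct / (len(pos) * len(neg))

# Toy data: 3 positives and 3 negatives, so 9 pairs in total.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.1, 0.8, 0.3]
auc = pairwise_auc(labels, scores)
print(auc)
```

Here 8 of the 9 pairs are ordered correctly (only the positive scored 0.4 falls below the negative scored 0.5), giving an AUC of 8/9.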

## FairAware Note

The True Positive Rate/False Negative Rate and the False Positive Rate are metrics that tell you how often your model predicts a variable's value incorrectly:

* Failing to predict it (False Negative Rate), or 
* Over-predicting it (False Positive Rate).
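
These rates can be computed directly from the four confusion-matrix counts. A minimal sketch on toy binary labels and predictions (the data and the function name are illustrative):

```python
def confusion_rates(labels, predictions):
  """Returns TPR, FNR and FPR for binary labels and binary predictions."""
  pairs = list(zip(labels, predictions))
  tp = sum(1 for y, p in pairs if y == 1 and p == 1)
  fn = sum(1 for y, p in pairs if y == 1 and p == 0)
  fp = sum(1 for y, p in pairs if y == 0 and p == 1)
  tn = sum(1 for y, p in pairs if y == 0 and p == 0)
  # TPR and FNR are both normalized by the number of actual positives,
  # FPR by the number of actual negatives.
  tpr = float(tp) / (tp + fn)
  fnr = float(fn) / (tp + fn)
  fpr = float(fp) / (fp + tn)
  return {"tpr": tpr, "fnr": fnr, "fpr": fpr}

labels =      [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
rates = confusion_rates(labels, predictions)
print(rates)
```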


In [0]:
def fit(estimator, train_input_fn, eval_input_fn, has_aux=False, train_steps=300, num_epochs=5):
  loss_main_list = []
  loss_aux_list = []
  auc_main_list = []
  auc_aux_list = []

  # For each epoch, advances train_steps.
  for i in range(num_epochs):
    # Fit the model...
    estimator.fit(input_fn=train_input_fn, steps=train_steps)
    results = estimator.evaluate(input_fn=eval_input_fn, steps=1)

    print("\n=== Step %s:" % (str((i+1)*train_steps)))
    print("    AUC for main task: %s; main task loss: %s;" %
          (results["auc/" + H_MAIN], results["loss/" + H_MAIN]))
    # AUC for income bracket. 
    auc_main_list.append(results["auc/" + H_MAIN])
    # Loss for income bracket.
    loss_main_list.append(results["loss/" + H_MAIN])
    if has_aux:
      print("    AUC for aux task: %s; aux task loss: %s;" %
            (results["auc/" + H_AUX], results["loss/" + H_AUX]))
      # AUC for sex with the multi-task learning
      auc_aux_list.append(results["auc/" + H_AUX])
      # Loss for sex with multi-task learning
      loss_aux_list.append(results["loss/" + H_AUX])
  if has_aux:
    return auc_main_list, loss_main_list, auc_aux_list, loss_aux_list
  else:
    return auc_main_list, loss_main_list


print("Fitting Baseline!")
base_out = fit(estimator_base, base_train_input_fn, eval_input_fn,
               train_steps=train_steps, num_epochs=num_epochs)
base_auc_main_list, base_loss_main_list = base_out

print("Fitting Multi-task model!")
multi_out = fit(estimator_multi, multi_train_input_fn, eval_input_fn, has_aux=True,
                train_steps=train_steps, num_epochs=num_epochs)
multi_auc_main_list, multi_loss_main_list, multi_auc_aux_list, multi_loss_aux_list = multi_out

print("Fitting Adversarial Multi-task model!")
adv_out = fit(estimator_adv, multi_train_input_fn, eval_input_fn, has_aux=True,
              train_steps=train_steps, num_epochs=num_epochs)
adv_auc_main_list, adv_loss_main_list, adv_auc_aux_list, adv_loss_aux_list = adv_out

Check out what's happened here:  The AUC for all models should have improved, and the loss should have gone down with each iteration.

The baseline might achieve a higher AUC than the adversarial network.



---



## Averaging Multiple Runs

If you run this multiple times, each time it will be a bit different.
In cases like this, we take the **average** over multiple runs. 

Because this is a colab, we'll just run it once.

A general rule of thumb is to average over 5-10 runs.
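
A minimal sketch of what averaging over runs could look like, assuming each run yields one list of per-epoch AUC values (as `fit` returns). The numbers are placeholders for illustration only, not measured results, and `average_runs` is a hypothetical helper.

```python
import statistics

def average_runs(runs):
  """Averages per-epoch metric lists across runs.

  `runs` is a list of equal-length lists, one list of per-epoch values per run.
  Returns (means, standard deviations), each with one entry per epoch.
  """
  per_epoch = list(zip(*runs))
  means = [statistics.mean(values) for values in per_epoch]
  stdevs = [statistics.stdev(values) for values in per_epoch]
  return means, stdevs

# Placeholder numbers purely for illustration; these are NOT measured results.
placeholder_runs = [
    [0.80, 0.84, 0.86],
    [0.82, 0.85, 0.87],
    [0.78, 0.86, 0.88],
]
mean_auc, std_auc = average_runs(placeholder_runs)
print(mean_auc)
print(std_auc)
```

Reporting the standard deviation alongside the mean helps show whether a gap between two models is larger than the run-to-run variation.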

We can visualize the change in the AUC between the models using matplotlib.


In [0]:
from matplotlib import pyplot as plt
xax = [n for n in range(num_epochs)]
plt.ylabel("AUC")
plt.xlabel("Num Epochs")
plt.plot(xax, adv_auc_main_list, label="Adversarial Multi-task")
plt.plot(xax, multi_auc_main_list, label="Multi-task")
plt.plot(xax, base_auc_main_list, label="Baseline")
plt.xticks(xax)
plt.legend(loc=5)

## Evaluation

Now let's get ready to evaluate.  We will store the false negatives in three variables:


1.   Overall false negatives
2.   False negatives for the group whose auxiliary label (sex) is '0' (Female)
3.   False negatives for the group whose auxiliary label (sex) is '1' (Male)



In [0]:
# Counting numbers.
def evaluate(estimator, input_fn, test_df, false_positive=True):
  """Calculates values for the binary confusion matrix."""
  predictions = list(estimator.predict(
      input_fn=lambda: input_fn(test_df, batch_size=-1)))
  print(predictions)
  n_0 = 0; n_1 = 0; pos = 0; neg = 0
  tn = 0; tn_0 = 0; tn_1 = 0
  tp = 0; tp_0 = 0; tp_1 = 0
  fp = 0; fp_0 = 0; fp_1 = 0
  fn = 0; fn_0 = 0; fn_1 = 0.0
  # For each of the testing instances...
  output_predictions = [dict_out[(H_MAIN, 'logistic')] for dict_out in predictions]
  print(output_predictions)
  for i in range(len(test_df)):
    # If the ground truth is 0 for the aux task (sex),
    # which means sex is '0'...
    if test_df[AUX][i] == 0.0:
      # Increment 0 count
      n_0 += 1
      if predictions[i][(H_MAIN, 'classes')] == int(test_df[MAIN][i]):
        # If the ground truth is 0 for task 1 (income bracket),
        # it's a 'negative' in a binary confusion matrix.
        if test_df[MAIN][i] == 0.0:
          tn_0 += 1; tn += 1; neg += 1
        # Otherwise it's a positive.
        else:
          tp_0 += 1; tp += 1; pos += 1
      else:
        if test_df[MAIN][i] == 0.0:
          fp_0 += 1; fp += 1; neg += 1
        else:
          fn_0 += 1; fn += 1; pos += 1
    # When sex is '1'....
    else:
      # Increment 1 count
      n_1 += 1
      if predictions[i][(H_MAIN, 'classes')] == int(test_df[MAIN][i]):
        if test_df[MAIN][i] == 0.0:
          tn_1 += 1; tn += 1; neg += 1
        else:
          tp_1 += 1; tp += 1; pos += 1
      else:
        if test_df[MAIN][i] == 0.0:
          fp_1 += 1; fp += 1; neg += 1
        else:
          fn_1 += 1; fn += 1; pos += 1
  print("num pos %d, neg %d" % (pos, neg))
  print("num n_0 %d, n_1 %d" % (n_0, n_1))
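  # Note: each count below is divided by the number of rows in the group
  # (n_0 or n_1), so these values are the share of each group that is a false
  # positive (or false negative), not FP / (FP + TN) or FN / (FN + TP).
  if false_positive:
    normalized_false_positives = float(fp) / len(test_df)
    fpr_0 = float(fp_0) / n_0
    fpr_1 = float(fp_1) / n_1
    return output_predictions, normalized_false_positives, fpr_0, fpr_1
  else:
    normalized_false_negatives = float(fn) / len(test_df)
    fnr_0 = float(fn_0) / n_0
    fnr_1 = float(fn_1) / n_1
    return output_predictions, normalized_false_negatives, fnr_0, fnr_1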
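
The per-group values returned by `evaluate` are normalized by the size of each group. The sketch below contrasts that normalization with the usual definition of the False Negative Rate, FN / (FN + TP), on toy data. The function name and the data are assumptions made for illustration.

```python
def group_false_negative_stats(labels, predictions, groups):
  """Per-group false negative statistics under two normalizations.

  Returns {group: (share_of_group, false_negative_rate)} where
    share_of_group      = FN / (size of the group), as in evaluate() above
    false_negative_rate = FN / (FN + TP), the usual definition of FNR
  """
  stats = {}
  for g in sorted(set(groups)):
    rows = [(y, p) for y, p, gg in zip(labels, predictions, groups) if gg == g]
    fn = sum(1 for y, p in rows if y == 1 and p == 0)
    tp = sum(1 for y, p in rows if y == 1 and p == 1)
    stats[g] = (float(fn) / len(rows), float(fn) / (fn + tp))
  return stats

# Toy data: group 0 has 1 actual positive out of 4 rows, group 1 has 3 out of 4.
labels =      [1, 0, 0, 0, 1, 1, 1, 0]
predictions = [0, 0, 0, 0, 1, 1, 0, 0]
groups =      [0, 0, 0, 0, 1, 1, 1, 1]
stats = group_false_negative_stats(labels, predictions, groups)
print(stats)
```

Both groups have the same share of false negatives (0.25), yet group 0 misses its only positive (FNR of 1.0) while group 1 misses one of three (FNR of 1/3). When the groups have different base rates, the two normalizations can tell different stories.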

On to evaluating the baseline!  We're printing out some of the details to give you a sense of what's getting captured 'under the hood'.

In [0]:
baseline_pred, baseline_fn, baseline_fn_0, baseline_fn_1 = evaluate(
    estimator_base, input_fn, test_df, False)

Now we'll do the same for the multi-task networks.

In [0]:
multi_pred, multi_fn, multi_fn_0, multi_fn_1 = evaluate(
    estimator_multi, input_fn, test_df, False)

adv_pred, adv_fn, adv_fn_0, adv_fn_1 = evaluate(
    estimator_adv, input_fn, test_df, False)

## Plotting Results

Let's plot it to see the False Negative Rate on the income bracket predictions.

In [0]:
plt.bar([1, 3], [baseline_fn_0, baseline_fn_1], [0.4, 0.4], color='red', label="Baseline")
plt.bar([1.5, 3.5], [multi_fn_0, multi_fn_1], [0.4, 0.4], color='blue', label="Multi-task")
plt.bar([2, 4], [adv_fn_0, adv_fn_1], [0.4, 0.4], color='purple', label="Adversarial")
# The two groups are split on the auxiliary task (sex): 0 is Female, 1 is Male.
xax = ['Sex = 0 (Female)', 'Sex = 1 (Male)']
plt.xlim(0, 5)
plt.ylabel("False Negative Rate")
plt.xticks([1.5, 3.5], xax)
plt.legend(loc=2)

You might see different kinds of results:  For example, the adversarial approach working better for one group than the other.

Now we'll do the same evaluation for False Positive Rate.

In [0]:
base_pred, base_fp, base_fp_0, base_fp_1 = evaluate(
    estimator_base, input_fn, test_df, True)

multi_pred, multi_fp, multi_fp_0, multi_fp_1 = evaluate(
    estimator_multi, input_fn, test_df, True)

adv_pred, adv_fp, adv_fp_0, adv_fp_1 = evaluate(
    estimator_adv, input_fn, test_df, True)

plt.xlim(0, 5)
plt.bar([1, 3], [base_fp_0, base_fp_1], [0.4, 0.4], color='red', label="Baseline")
plt.bar([1.5, 3.5], [multi_fp_0, multi_fp_1], [0.4, 0.4], color='blue', label="Multi-task")
plt.bar([2, 4], [adv_fp_0, adv_fp_1], [0.4, 0.4], color='purple', label="Adversarial")
# The two groups are split on the auxiliary task (sex): 0 is Female, 1 is Male.
xax = ['Sex = 0 (Female)', 'Sex = 1 (Male)']
plt.ylabel("False Positive Rate")
plt.xticks([1.5, 3.5], xax)
plt.legend(loc=2)

## FairAware Note

There are trade-offs between different evaluation metrics, and what to prioritize depends on your application.  For example, a lower False Positive Rate might be more important for your application than a lower False Negative Rate.  You may care a lot about true negatives, and so prefer an overall accuracy metric.

To visualize how the true positive rate and the false positive rate trade off, we can create a **receiver operating characteristic curve**, commonly called an ROC curve.

In [0]:
from sklearn.metrics import roc_curve

fpr, tpr, _ = roc_curve(test_df[MAIN], base_pred)
fpr2, tpr2, _ = roc_curve(test_df[MAIN], multi_pred)
fpr3, tpr3, _ = roc_curve(test_df[MAIN], adv_pred)

def draw_roc(fpr, tpr, fpr2, tpr2, fpr3, tpr3, sweet_spot=False):
  plt.figure()
  plt.plot(fpr, tpr, color='red', label='Baseline')
  plt.plot(fpr2, tpr2, color='blue', label='Multitask')
  plt.plot(fpr3, tpr3, color='purple', label='Adversarial')
  if sweet_spot:
    plt.plot(sweet_spot[0], sweet_spot[1], 'o', color='orange', label='Sweet Spot!')
  plt.legend(loc=4)
  plt.xlim([0.0, 1.0])
  plt.ylim([0.0, 1.05])
  plt.xlabel('False Positive Rate')
  plt.ylabel('True Positive Rate')
  plt.title('Receiver operating characteristic for income.')
  plt.show()

draw_roc(fpr, tpr, fpr2, tpr2, fpr3, tpr3)
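
For intuition about what `roc_curve` returns, the pure-Python sketch below builds ROC points by sweeping a threshold over toy scores and then integrates the curve with the trapezoid rule. It is a simplified illustration and not scikit-learn's implementation; the function names and data are made up.

```python
def roc_points(labels, scores):
  """Returns ROC points (fpr, tpr), one per distinct threshold, sorted by fpr."""
  num_pos = sum(1 for y in labels if y == 1)
  num_neg = len(labels) - num_pos
  points = [(0.0, 0.0)]
  # Lower the threshold step by step; each step predicts more rows as positive.
  for threshold in sorted(set(scores), reverse=True):
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    points.append((float(fp) / num_neg, float(tp) / num_pos))
  return points

def trapezoid_area(points):
  """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
  return sum((x1 - x0) * (y0 + y1) / 2.0
             for (x0, y0), (x1, y1) in zip(points, points[1:]))

labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
area = trapezoid_area(points)
print(points)
print(area)
```

Each point on the curve corresponds to one choice of decision threshold, so picking an operating point on the ROC curve amounts to picking the threshold that gives an acceptable balance of false positives and true positives for your application.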