**RDS (DS-UA 202) Spring 2022: Homework 1 Template**


This notebook is a template for problem 2. Save a copy of this notebook and write your code in that copy. The code to set up the analysis is provided for you here. Do not edit or add to the setup code.

Some suggested steps are included as comments in the code cells below. You do not need to follow these suggestions; other solutions or approaches are acceptable.

# Setup

## Packages

In [1]:
!git clone https://github.com/lurosenb/superquail
!pip install aif360==0.3.0 
!pip install BlackBoxAuditing
!pip install tensorflow==1.13.1
!pip install folktables

Cloning into 'superquail'...
remote: Enumerating objects: 24, done.
remote: Counting objects: 100% (24/24), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 24 (delta 1), reused 20 (delta 1), pack-reused 0
Unpacking objects: 100% (24/24), done.
Collecting aif360==0.3.0
  Downloading aif360-0.3.0-py3-none-any.whl (165 kB)
Installing collected packages: aif360
Successfully installed aif360-0.3.0
Collecting BlackBoxAuditing
  Downloading BlackBoxAuditing-0.1.54.tar.gz (2.6 MB)
Building wheels for collected packages: BlackBoxAuditing
  Building wheel for BlackBoxAuditing (setup.py) ... done
  Created wheel for BlackBoxAuditing: filename=BlackBoxAuditing-0.1.54-py2.py3-none-any.whl size=1394770 sha256=111815441729803e05baa792374570cd47dc8a07db99746fc897ecccfd2f512f
  Stored in directory: /root/.cache/pip/wheels/05/9f/ee/541a74b

In [62]:
import random
random.seed(6)

import sys
import warnings

import numpy as np
import pandas as pd
import tensorflow as tf
import json
import time 
from tqdm import tqdm

import matplotlib.pyplot as plt 
import seaborn as sns

from folktables import ACSDataSource, ACSEmployment, ACSIncome, ACSPublicCoverage, ACSTravelTime
from superquail.data.acs_helper import ACSData

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.preprocessing import MinMaxScaler

from aif360.datasets import BinaryLabelDataset, StandardDataset
from aif360.algorithms.preprocessing import DisparateImpactRemover
from aif360.algorithms.inprocessing import PrejudiceRemover
from aif360.algorithms.postprocessing import CalibratedEqOddsPostprocessing, RejectOptionClassification
from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric

from aif360.sklearn.preprocessing import ReweighingMeta
from aif360.sklearn.inprocessing import AdversarialDebiasing
from aif360.sklearn.postprocessing import CalibratedEqualizedOdds, PostProcessingMeta
from aif360.sklearn.datasets import fetch_adult
from aif360.sklearn.metrics import disparate_impact_ratio, average_odds_error, generalized_fpr
from aif360.sklearn.metrics import generalized_fnr, difference

import BlackBoxAuditing
%matplotlib inline

## Load data

We have included code to read in the Folktables dataset. Folktables is derived from US Census data and defines several simple prediction tasks. Here we pull a sample of 30,000 records from the 2018 California data. The column names are described in the table below. Note that certain categorical variables have been mapped to integer values, which we keep as is for the following analyses.

For more information on this dataset, see the following paper:
https://eaamo2021.eaamo.org/accepted/acceptednonarchival/EAMO21_paper_16.pdf

| Column Name | Feature | Description/Notes |
| --- | ----------- | --- |
| PINCP | Total person’s income | (Target) 1 if > $50k, 0 otherwise |
| SEX | Sex | (Sensitive Attribute) Male=1, Female=2 |
| RAC1P | Race | Dropped from this analysis to focus on one sensitive attribute |
| AGEP | Age | Ranges from 0-99 |
| COW | Class of Worker | Ranges 1-9, see paper for description |
| SCHL | Education Level | Ranges 1-24, see paper for description |
| MAR | Marital Status | Ranges 1-5, see paper for description |
| OCCP | Occupation | Codes taken from Public Use Microdata Sample (PUMS) from the US Census, see paper |
| POBP | Place of Birth | Codes taken from Public Use Microdata Sample (PUMS) from the US Census, see paper |
| RELP | Relationship | Relationship of individual to person who responded to the Census taker. Ranges 0-17, see paper for description |
| WKHP | Hours worked per week | Ranges from 0-99, averaged over previous year |

In [4]:
np.random.seed(13) # do not change the seed

# read in the folktables dataset 
full_df, features_df, target_df, groups_df = ACSData().return_acs_data_scenario(scenario="ACSIncome", subsample=30000)
full_df = full_df.drop(columns='RAC1P') # drop race (another protected attribute) so the analysis focuses on sex

print(full_df.shape)
full_df.head()

Downloading data for 2018 1-Year person survey for CA...
(30000, 10)


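|  | AGEP | COW | SCHL | MAR | OCCP | POBP | RELP | WKHP | SEX | PINCP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 44.0 | 1.0 | 1.0 | 1.0 | 4220.0 | 6.0 | 10.0 | 40.0 | 1.0 | 0.0 |
| 1 | 66.0 | 2.0 | 20.0 | 2.0 | 4720.0 | 42.0 | 0.0 | 32.0 | 2.0 | 0.0 |
| 2 | 72.0 | 6.0 | 18.0 | 1.0 | 10.0 | 6.0 | 1.0 | 8.0 | 2.0 | 1.0 |
| 3 | 53.0 | 1.0 | 21.0 | 1.0 | 1460.0 | 457.0 | 0.0 | 40.0 | 1.0 | 1.0 |
| 4 | 55.0 | 1.0 | 16.0 | 1.0 | 220.0 | 6.0 | 1.0 | 40.0 | 1.0 | 0.0 |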


## Set protected attribute and target

In [5]:
protected_attr = 'SEX' # set sex as the protected attribute
target = 'PINCP' # personal income as the target, note that [1 = >50k]

In [6]:
# convert this dataframe into an aif360 dataset
original_data = BinaryLabelDataset(
    favorable_label=1,
    unfavorable_label=0,
    df=full_df,
    label_names=[target],
    protected_attribute_names=[protected_attr])
privileged_groups = [{protected_attr: 1}] 
unprivileged_groups = [{protected_attr: 2}]

## Split data

In [7]:
seed = 50
train_data, test_data = original_data.split([0.8], shuffle=True, seed=seed)
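
To make the 80/20 split above concrete, here is a minimal pure-Python sketch of a seeded shuffle split. The helper `shuffle_split` and the toy row indices are hypothetical, and this is not how AIF360 implements `split`; it only illustrates that a fixed seed makes the partition reproducible.

```python
import random

def shuffle_split(items, train_fraction, seed):
    """Shuffle a copy of `items` with a fixed seed, then cut it into two parts."""
    rng = random.Random(seed)          # private generator, so global state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(10))                 # stand-in for row indices of a dataset
train_rows, test_rows = shuffle_split(rows, 0.8, seed=50)
print(len(train_rows), len(test_rows))
```

Calling `shuffle_split` again with the same seed returns the same partition, which is why the seed in the setup code should stay fixed.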

## Scale features in the data

In [8]:
scaler = MinMaxScaler()

train_data.features = scaler.fit_transform(train_data.features)
test_data.features = scaler.transform(test_data.features)

# convert to dataframes
train_df, _ = train_data.convert_to_dataframe()
test_df, _ = test_data.convert_to_dataframe()
print("Training set: ", train_df.shape)
print("Test set: ", test_df.shape)

# extract x (features) and y (target)
train_x = train_df.drop([target, protected_attr], axis=1)
train_y = train_df[target]
test_x = test_df.drop([target, protected_attr], axis=1)
test_y = test_df[target]

Training set:  (24000, 10)
Test set:  (6000, 10)
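
The scaling cell fits the scaler on the training features only and then reuses it on the test features. The sketch below shows the same idea for a single column in plain Python. The helpers `fit_min_max` and `apply_min_max` and the toy ages are hypothetical; in the notebook, `MinMaxScaler` does this for every feature column.

```python
def fit_min_max(column):
    """Learn the scaling parameters (minimum and maximum) from training values."""
    return min(column), max(column)

def apply_min_max(column, lo, hi):
    """Rescale values with previously learned parameters: (v - lo) / (hi - lo)."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in column]   # constant column: nothing to rescale
    return [(v - lo) / span for v in column]

train_ages = [20.0, 35.0, 60.0]
test_ages = [18.0, 40.0]

lo, hi = fit_min_max(train_ages)                    # parameters come from training data only
scaled_train = apply_min_max(train_ages, lo, hi)
scaled_test = apply_min_max(test_ages, lo, hi)      # test data reuses the training parameters
print(scaled_train, scaled_test)
```

Because the parameters come from the training set, a test value below the training minimum (18 here) scales to slightly below 0. This is expected and avoids leaking information from the test set into preprocessing.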


# 2 (a)

## Train a random forest model (baseline)

In [12]:
# use these hyperparameters in your call to RandomForestClassifier
n_estimators = 20
max_depth = 10

# set up the random forest model, using the hyperparameters
model = RandomForestClassifier(n_estimators=n_estimators,max_depth=max_depth) 

# fit the model using the training data (train_x, train_y)
model.fit(train_x,train_y)


RandomForestClassifier(max_depth=10, n_estimators=20)

## Calculate metrics

In [13]:
# the below function has been provided for you. You can use this function to
# convert your data to a StandardDataset format for use in AIF360
def transform_to_aif(df, target=target, protected_attr=protected_attr):
  '''convert a pandas.DataFrame to a StandardDataset used in AIF360'''

  sd = StandardDataset(
      df,
      label_name = target,
      favorable_classes = [1],
      protected_attribute_names = [protected_attr],
      privileged_classes = [[1]]
  )

  return sd

In [83]:
# calculate predictions from baseline RF model
y_predict = model.predict(test_x)

# convert predictions data to AIF StandardDataset
predict_df = test_df.copy()
predict_df[target] = y_predict
predict_sd = transform_to_aif(predict_df)
print(predict_sd)

# also create AIF StandardDataset versions of training and test data
train_sd = transform_to_aif(train_df)
test_sd = transform_to_aif(test_df)
print(test_sd)

               instance weights  features  ...                     labels
                                           ... protected attribute       
                                     AGEP  ...                 SEX       
instance names                             ...                           
4252                        1.0  0.077922  ...                 1.0    0.0
17560                       1.0  0.480519  ...                 2.0    1.0
13863                       1.0  0.519481  ...                 2.0    0.0
19541                       1.0  0.376623  ...                 1.0    1.0
19879                       1.0  0.571429  ...                 2.0    1.0
...                         ...       ...  ...                 ...    ...
15649                       1.0  0.194805  ...                 2.0    0.0
22637                       1.0  0.233766  ...                 2.0    0.0
10123                       1.0  0.480519  ...                 1.0    1.0
5600                        1.0  0.467

In [98]:
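# calculate metrics
overall_accu = accuracy_score(test_y, y_predict)

# ClassificationMetric needs the group definitions to compute group-wise metrics
res_cl = ClassificationMetric(
    test_sd,
    predict_sd,
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups)

# accuracy, disparate_impact and false_positive_rate are methods, so call them
ov_accu = res_cl.accuracy()                  # overall accuracy
pg_accu = res_cl.accuracy(privileged=True)   # privileged group (SEX=1)
ug_accu = res_cl.accuracy(privileged=False)  # unprivileged group (SEX=2)
di = res_cl.disparate_impact()
fpr = res_cl.false_positive_rate()           # overall false positive rate
print(ov_accu)
print(di)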
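
As a cross-check on what the metrics in this part measure, the following sketch computes group-wise accuracy, disparate impact, and the false positive rate difference by hand. The helper names and the eight toy records are hypothetical. It assumes the usual definitions: disparate impact is the unprivileged group's rate of favorable predictions divided by the privileged group's rate, and the FPR difference is unprivileged minus privileged.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def selection_rate(y_pred):
    """Fraction of instances predicted as the favorable class (1)."""
    return sum(y_pred) / len(y_pred)

def false_positive_rate(y_true, y_pred):
    """Among true negatives (label 0), the fraction predicted positive."""
    preds_for_negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    return sum(preds_for_negatives) / len(preds_for_negatives)

def select(values, groups, wanted):
    """Keep only the values whose group code equals `wanted`."""
    return [v for v, g in zip(values, groups) if g == wanted]

# toy data: SEX coded as in the notebook (1 = privileged, 2 = unprivileged)
sex    = [1, 1, 1, 1, 2, 2, 2, 2]
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

priv_true, priv_pred = select(y_true, sex, 1), select(y_pred, sex, 1)
unpriv_true, unpriv_pred = select(y_true, sex, 2), select(y_pred, sex, 2)

priv_acc = accuracy(priv_true, priv_pred)
unpriv_acc = accuracy(unpriv_true, unpriv_pred)
disparate_impact = selection_rate(unpriv_pred) / selection_rate(priv_pred)
fpr_difference = (false_positive_rate(unpriv_true, unpriv_pred)
                  - false_positive_rate(priv_true, priv_pred))
print(priv_acc, unpriv_acc, disparate_impact, fpr_difference)
```

In this toy example both groups have the same accuracy, yet the disparate impact is well below 1, which shows why accuracy alone does not describe fairness.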

# 2 (b)

## Transform the original data using Disparate Impact Remover at five repair levels and calculate metrics

In [19]:
# the below function has been provided for you. You can use this function to
# plot the repair_level (on the x-axis) against a given metric,
# e.g. accuracy, on the y-axis
def plot_metric_repair(repair_levels, metric_values, metric_name):
  '''Creates a line plot showing how the metric changed for different values of repair level'''

  # Plot the metrics
  plt.plot(repair_levels, metric_values, color='#0384fc', linewidth=3, label=metric_name)

  # Create labels, etc. 
  plt.xlabel('Repair level')
  plt.ylabel(metric_name)
  plt.legend()
  plt.show()

In [33]:
# use these repair levels
repair_levels = [0, 0.25, 0.5, 0.75, 1]

# transform the test and training data using DI-remover at the above repair
# levels and calculate metrics (you may wish to use a for loop)
DIs = []  # disparate impact at each repair level (to be filled in)
train_repds = []
test_repds = []
for level in repair_levels:
  # a separate name avoids shadowing `di`, the disparate impact value from 2 (a)
  di_remover = DisparateImpactRemover(repair_level=level, sensitive_attribute=protected_attr)
  train_repds.append(di_remover.fit_transform(train_data))
  test_repds.append(di_remover.fit_transform(test_data))
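
For intuition about what the repair level controls, here is a simplified sketch of partial repair on a single feature. It is an illustration only, not the algorithm that AIF360's `DisparateImpactRemover` runs. The function `repair_feature` and the toy hours are hypothetical, and the sketch assumes groups of equal size so that values can be matched by rank. Each value moves toward the median, across groups, of the values at the same rank: a repair level of 0 leaves the data unchanged and a level of 1 makes the group distributions identical.

```python
from statistics import median

def repair_feature(values_by_group, repair_level):
    """Move each value toward the cross-group median at the same within-group rank.

    values_by_group: dict mapping a group code to a list of feature values
    (all lists must have the same length in this simplified sketch).
    """
    sorted_groups = {g: sorted(v) for g, v in values_by_group.items()}
    n = len(next(iter(sorted_groups.values())))
    # target value at each rank: the median of the groups' values at that rank
    targets = [median(vals[i] for vals in sorted_groups.values()) for i in range(n)]

    repaired = {}
    for group, vals in values_by_group.items():
        order = sorted(range(n), key=lambda i: vals[i])   # indices from smallest to largest
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # linear interpolation between the original value and the target
            out[idx] = (1 - repair_level) * vals[idx] + repair_level * targets[rank]
        repaired[group] = out
    return repaired

# toy weekly hours, keyed by the SEX codes used in this notebook
hours = {1: [50.0, 40.0, 60.0], 2: [20.0, 30.0, 40.0]}
for level in [0, 0.5, 1]:
    print(level, repair_feature(hours, level))
```

The rank order within each group is preserved at every level, so individuals keep their relative position while the feature carries less information about group membership.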

    

In [30]:
# plot each metric against the repair level
# (you can use the plot_metric_repair above)
#plot_metric_repair(repair_levels,)


TypeError: ignored

# 2 (c) 


## Train a Prejudice Remover model at three eta values and calculate metrics


In [None]:
etas = [0.01, 0.1, 1] # eta is the weight applied to the fairness regularization term

# train a Prejudice Remover model at these eta values and calculate metrics
# (you may wish to use a for loop)
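
To see how a regularization weight such as eta trades accuracy against fairness, consider the toy sketch below. It is not the Prejudice Remover objective. It only assumes a generic objective of the form loss + eta * penalty, and the three candidate models with their loss and unfairness values are made up for illustration.

```python
# hypothetical candidates: (name, prediction loss, unfairness penalty)
candidates = [
    ("A", 0.30, 0.40),   # most accurate, least fair
    ("B", 0.31, 0.20),
    ("C", 0.45, 0.02),   # least accurate, most fair
]

def objective(loss, penalty, eta):
    """Generic regularized objective: prediction loss plus weighted fairness penalty."""
    return loss + eta * penalty

def best_candidate(eta):
    """Name of the candidate that minimizes the regularized objective for this eta."""
    return min(candidates, key=lambda c: objective(c[1], c[2], eta))[0]

etas = [0.01, 0.1, 1]
chosen = [best_candidate(eta) for eta in etas]
print(chosen)
```

As eta grows, the penalty dominates the objective and the preferred model shifts from the most accurate candidate toward the fairest one, so you would expect accuracy and fairness metrics to move in opposite directions across the eta values.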


In [None]:
# plot how one or more of the metrics vary across the different values of eta
