## Running a Cortex Certifai fairness evaluation on an xgboost model that predicts adult income

- Description: Each dataset row represents the attribute values for a de-identified individual. The models predict the income bracket of the person as <=50K or >50K
- Dataset Source: UCI [Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/census+income)
- In the example below we show how to create an xgboost model and evaluate its fairness using Cortex Certifai
- The example can be run locally by installing the dependencies listed below
- Dependencies:
    - python>=3.6.2,<=3.7
    - scikit-learn=0.20.3
    - xgboost (`conda install -c conda-forge xgboost`)
    - numpy=1.16.2
    - pandas
    - ipython
    - matplotlib
    - jupyter


In [1]:
# necessary imports
import xgboost as xgb
from sklearn.metrics import accuracy_score
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

In [2]:
# special import:
# for multiprocessing to work in a notebook, pickled classes must be in a separate package or notebook,
# hence the model encoder/decoder classes have to be defined somewhere other than the current notebook

import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join('.')))
from cat_encoder import CatEncoder
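
`cat_encoder.py` is not reproduced in this notebook. As a rough illustration only, an encoder with the same call pattern (`CatEncoder(cat_columns, X_train)`, then `encoder(array)`) could be sketched as below. The class name `SketchCatEncoder`, the integer-code scheme, and the `-1` code for unseen categories are all assumptions, not the actual implementation.

```python
import numpy as np
import pandas as pd


class SketchCatEncoder:
    """Hypothetical stand-in for CatEncoder: maps category strings to integer codes."""

    def __init__(self, cat_columns, data):
        # Learn one {category: code} lookup per categorical column position
        self.mappings = {}
        for col in cat_columns:
            idx = data.columns.get_loc(col)
            categories = sorted(data[col].astype(str).unique())
            self.mappings[idx] = {cat: code for code, cat in enumerate(categories)}

    def __call__(self, x):
        x = np.asarray(x, dtype=object)
        encoded = np.empty(x.shape, dtype=float)
        for j in range(x.shape[1]):
            if j in self.mappings:
                lookup = self.mappings[j]
                # Unseen categories get -1 (an assumption of this sketch)
                encoded[:, j] = [lookup.get(str(v), -1) for v in x[:, j]]
            else:
                # Numeric columns pass through unchanged
                encoded[:, j] = x[:, j].astype(float)
        return encoded


# Toy data, made up for illustration
toy = pd.DataFrame({"age": [30, 45, 22], "gender": ["Male", "Female", "Male"]})
sketch_encoder = SketchCatEncoder(["gender"], toy)
toy_encoded = sketch_encoder(toy.values)
```

Because the class is callable on a plain numpy array, it fits the way `encoder(X_train.values)` is used later in the notebook.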

### prepare data for model training

In [3]:
# load data into dataframe
df = pd.read_csv('adult_income-prepped.csv')

In [4]:
# Separate outcome
label_column = 'income'
y = df[label_column]
X_raw = df.drop(label_column, axis=1)

# remove some additional unhelpful columns
rm=["fnlwgt", "capital-loss"]
dropped_indexes_list = [i for i,col in enumerate(X_raw.columns.to_list()) if col in rm]
final_list=X_raw.columns.to_list()
for i in rm:   
    final_list.remove(i)
X = X_raw[final_list]
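
The same column removal can also be written as a single `DataFrame.drop` call. A minimal sketch on a toy frame (the values are made up for illustration) comparing the loop used in the cell above with the one-liner:

```python
import pandas as pd

toy_raw = pd.DataFrame({
    "age": [39, 50],
    "fnlwgt": [77516, 83311],
    "capital-loss": [0, 0],
    "gender": ["Male", "Male"],
})
rm = ["fnlwgt", "capital-loss"]

# Loop-based approach, as in the cell above
final_list = toy_raw.columns.to_list()
for col in rm:
    final_list.remove(col)
X_loop = toy_raw[final_list]

# Equivalent single call
X_drop = toy_raw.drop(columns=rm)
```

Both versions keep the remaining columns in their original order.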

In [5]:
# create train/test sets from the cleaned dataframe (after removing the unhelpful columns)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

In [6]:
# create encoder for categorical columns
from cat_encoder import CatEncoder
cat_columns = [
   'workclass', 
   'education', 
   'marital-status', 
   'occupation', 
   'relationship',
   'race',
   'gender',
   'native-country'
           ]
encoder = CatEncoder(cat_columns, X_train)

### set hyperparams and start model train

In [7]:
# define hyperparams for training xgboost model
params = {"objective":"reg:squarederror",'colsample_bytree': 0.3,'learning_rate': 0.1,
                'max_depth': 5, 'alpha': 10}

In [8]:
# encode training data to be used for model training
data_dmatrix = xgb.DMatrix(data=encoder(X_train.values),label=y_train)

In [9]:
# train the xgboost model
xg_reg = xgb.train(params=params, dtrain=data_dmatrix, num_boost_round=10)

### calculate model accuracy on test set

In [10]:
# calculate accuracy on test-set. using 0.46 as threshold for scoring
threshold = 0.46
dtest = xgb.DMatrix(encoder(X_test.values))
preds = xg_reg.predict(dtest)
best_preds = map(lambda x: int(x > threshold), preds)
acc = accuracy_score(y_test, list(best_preds))
acc

0.8493192752584706
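
The notebook fixes the threshold at 0.46 without showing how it was chosen. One possible way to pick such a cut-off, sketched here with made-up scores and labels rather than the notebook's data, is to sweep a few candidate thresholds and keep the one with the highest accuracy. The helper `best_threshold` is hypothetical, and in practice the sweep would be run on a held-out validation split rather than on the test set.

```python
import numpy as np


def best_threshold(scores, labels, candidates):
    """Return the (threshold, accuracy) pair with the highest accuracy."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    best = (None, -1.0)
    for t in candidates:
        # Same rule as the notebook: a score strictly above t maps to class 1
        acc = float(np.mean((scores > t).astype(int) == labels))
        if acc > best[1]:
            best = (float(t), acc)
    return best


# Toy scores and labels, made up for illustration
toy_scores = np.array([0.10, 0.35, 0.40, 0.55, 0.70, 0.90])
toy_labels = np.array([0, 0, 1, 1, 1, 1])
chosen_t, chosen_acc = best_threshold(toy_scores, toy_labels, [0.3, 0.38, 0.5])
```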

### cortex certifai updates required before initiating scan

### wrapping model to create xgboost.DMatrix obj from numpy arrays for certifai predicts

- cortex certifai invokes the model's (black-box) predicts using numpy arrays from the evaluation dataset provided
- since the xgboost model requires a DMatrix obj for prediction, we create a `TransformedPredict` wrapper class
- the `TransformedPredict` wrapper creates a DMatrix object before calling the wrapped model's (here xgboost) predict

In [11]:
%%writefile override_model_predict.py

import numpy as np
import xgboost as xgb
class TransformedPredict:
    def __init__(self,model):
        self.model = model
    def predict(self,arr):
        dtest = xgb.DMatrix(data=arr)
        return self.model.predict(dtest)

Overwriting override_model_predict.py


### soft scoring models additionally need to provide a decoder callable to get outcomes

- soft scoring models like xgboost return scores (e.g. probability) that need to be passed through a threshold filter to get final outcomes
- just as we did above to create a threshold to filter binary outcomes for calculating accuracy metrics, we create a `Decoder` class with overridden `__call__` method to add decoding rules for xgboost model scores


In [12]:
%%writefile decoder.py
import numpy as np

class Decoder:
    def __init__(self,threshold):
        self.threshold = threshold
    
    def __call__(self,x):
        if not isinstance(x, np.ndarray):
             x = np.array(x)
        return (x > self.threshold).astype(int)

Overwriting decoder.py
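
A quick check of the decoder on hand-written scores (values chosen only for illustration) shows two details of the rule above: plain lists are accepted, and the comparison is strict, so a score exactly equal to the threshold decodes to 0. The class is repeated here so the snippet runs on its own.

```python
import numpy as np


class Decoder:
    def __init__(self, threshold):
        self.threshold = threshold

    def __call__(self, x):
        # Accept lists as well as numpy arrays
        if not isinstance(x, np.ndarray):
            x = np.array(x)
        return (x > self.threshold).astype(int)


demo_decoder = Decoder(0.46)
demo_outcomes = demo_decoder([0.10, 0.46, 0.47, 0.90])
```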


In [13]:
# test to verify model predicts with new wrapper model class == model predicts from raw model
from decoder import Decoder
from override_model_predict import TransformedPredict
decoder = Decoder(threshold)
transformed_model = TransformedPredict(xg_reg)
# note: .all() must be called; a bare `.all` is a bound method and is always truthy
assert (decoder(transformed_model.predict(encoder(X_test.values))) == 
        decoder(xg_reg.predict(xgb.DMatrix(encoder(X_test.values))))).all()

### using cortex certifai scan APIs to set up model scanning

- before running the section below, make sure you have the necessary packages for cortex certifai installed
- copy the toolkit path to the `certifai_toolkit_path` variable and run the cell below to install the packages required to initiate a certifai model scan

In [213]:
certifai_toolkit_path = 'path_to_certifai_toolkit'
!find $certifai_toolkit_path/packages/all       -type f ! -name "*console-*" ! -name "*client-*" | xargs -I % sh -c 'pip install % ' ;
!find $certifai_toolkit_path/packages/python3.6 -type f   -name "*engine-*"                      | xargs -I % sh -c 'pip install % ' ;

In [14]:
# check version of certifai installed
from certifai.scanner.version import  get_version
get_version()

'1.2.13'

In [15]:
# imports for building certifai scan
from certifai.scanner.builder import (CertifaiScanBuilder, CertifaiPredictorWrapper, CertifaiModel, CertifaiModelMetric,
                                      CertifaiDataset, CertifaiGroupingFeature, CertifaiDatasetSource,
                                      CertifaiPredictionTask, CertifaiTaskOutcomes, CertifaiOutcomeValue)
from certifai.scanner.report_utils import scores, construct_scores_dataframe

### create a CertifaiPredictorWrapper object from transformed model created above

- this CertifaiPredictorWrapper object is passed to the CertifaiModel constructor and will be used by certifai to perform model predicts
- run the assert test below to confirm predictions from raw model and certifai wrapped model are identical

In [16]:
xbg_certifai_wrapped_model = CertifaiPredictorWrapper(transformed_model,encoder=encoder,decoder=decoder)

In [17]:
# test to assert wrapped certifai model predicts == raw model predicts
# note: .all() must be called for the comparison to be checked; a bare `.all` is always truthy
assert (xbg_certifai_wrapped_model.model.predict(X_test.values) == 
        decoder(xg_reg.predict(xgb.DMatrix(encoder(X_test.values))))).all()

### creating a certifai evaluation dataset

- earlier we modified our dataset to drop certain unhelpful columns
- and we built our encoder on the cleaned dataset
- we pass the cleaned dataframe (with those columns removed) to certifai for evaluation
- this is needed because the encoder and model were built without the dropped columns, so the model does not expect them as input for predicts

In [18]:
# cleaned dataframe `X_raw[final_list]` or X
dataframe_certifai = X

In [None]:
# Create the scan object from scratch using the ScanBuilder class

# First define the possible prediction outcomes
task = CertifaiPredictionTask(CertifaiTaskOutcomes.classification(
    [
        CertifaiOutcomeValue(1, name='income > 50K', favorable=True),
        CertifaiOutcomeValue(0, name='income <= 50K')
    ]),
    prediction_description='Determine whether income greater than 50K or less')

scan = CertifaiScanBuilder.create('test_user_case',
                                  prediction_task=task)

# Add our local models
first_model = CertifaiModel('XGB',
                            local_predictor=xbg_certifai_wrapped_model)
scan.add_model(first_model)

# Add the eval dataset
eval_dataset = CertifaiDataset('evaluation',
                               CertifaiDatasetSource.dataframe(dataframe_certifai))
scan.add_dataset(eval_dataset)

# Setup an evaluation for fairness on the above dataset using the model
# We'll look at disparity between groups defined by race and gender
scan.add_fairness_grouping_feature(CertifaiGroupingFeature('race'))
scan.add_fairness_grouping_feature(CertifaiGroupingFeature('gender'))
scan.add_evaluation_type('fairness')
scan.evaluation_dataset_id = 'evaluation'

# Because the dataset contains a ground truth outcome column which the model does not
# expect to receive as input we need to state that in the dataset schema (since it cannot
# be inferred from the CSV)
scan.dataset_schema.outcome_feature_name = 'income'

# Run the scan.
# By default this will write the results into individual report files (one per model and evaluation
# type) in the 'reports' directory relative to the Jupyter root.  This may be disabled by specifying
# `write_reports=False` as below
# The result is a dictionary of dictionaries of reports.  The top level dict key is the evaluation type
# and the second level key is model id.
# Reports saved as JSON (which `write_reports=True` will do) may be visualized in the console app
result = scan.run(write_reports=False)

2020-05-27 17:10:56,860 root   INFO     Validating license...
2020-05-27 17:10:56,861 root   INFO     License is valid - expires: n/a
2020-05-27 17:10:56,882 root   INFO     Generated unique scan id: 3ff9bc936ff6
2020-05-27 17:10:56,884 root   INFO     Validating input data...
2020-05-27 17:10:56,885 root   INFO     Creating dataset with id: evaluation
       'occupation', 'relationship', 'race', 'gender', 'capital-gain',
       'hours-per-week', 'native-country'],
      dtype='object')`.
2020-05-27 17:10:56,888 root   INFO     Inferring dataset features and applying user overrides
2020-05-27 17:10:56,889 root   INFO     Reading configs from: /Users/akumar/.certifai/certifai_config.ini
2020-05-27 17:10:56,892 root   INFO     Reading default config (fallback) from: /Users/akumar/miniconda/envs/visa3.7/lib/python3.7/site-packages/certifai/common/utils/default_certifai_config.ini
2020-05-27 17:10:56,906 root   INFO     Read config marker: config['default']['marker'] = 0.1
2020-05-27 17:10

In [None]:
result
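
Per the comments in the scan cell, `result` is a dictionary keyed by evaluation type whose values are dictionaries keyed by model id. A minimal sketch for walking that two-level structure follows. It uses a stand-in dictionary because the contents of each report are not shown here; the helper `list_reports` and the placeholder report value are assumptions for illustration.

```python
def list_reports(result):
    """Return (evaluation_type, model_id) pairs found in a nested result dict."""
    pairs = []
    for evaluation_type, reports_by_model in result.items():
        for model_id in reports_by_model:
            pairs.append((evaluation_type, model_id))
    return pairs


# Stand-in with the same two-level shape; a real report holds the scan details
stand_in_result = {"fairness": {"XGB": {"placeholder": "report contents"}}}
report_index = list_reports(stand_in_result)
```

For the scan configured above, one would expect a single `'fairness'` entry containing the `'XGB'` model id.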