# Building Explainable Machine Learning Models
## Generating Counterfactual Explanations for Machine Learning Models

We have seen how feature importances can be used to approximate a machine learning model. Counterfactuals, on the other hand, indicate the changes that must be made to an input instance to flip the model's prediction. In other words, counterfactual explanations tell us which features need to change, and by how much, to reverse an unfavourable outcome.

<div class="alert alert-block alert-info">

For more information, read the paper: [Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations](https://arxiv.org/pdf/1905.07697.pdf)
    
</div>
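
To make the idea concrete, here is a minimal sketch using a hypothetical rule-based scorer. The function `toy_predict` and its weights are invented for illustration and are not part of this case study. The sketch searches for the smallest increase in weekly working hours that flips the prediction from `0` to `1`.

```python
# Hypothetical rule-based "model": predicts 1 (high income) when a simple score reaches 60.
def toy_predict(age, hours_per_week):
    score = 0.5 * age + 1.0 * hours_per_week
    return int(score >= 60)


def smallest_hours_change(age, hours_per_week, max_hours=99):
    """Return the smallest increase in hours_per_week that flips the prediction, or None."""
    original = toy_predict(age, hours_per_week)
    for extra in range(0, max_hours - hours_per_week + 1):
        if toy_predict(age, hours_per_week + extra) != original:
            return extra
    return None


query = {"age": 30, "hours_per_week": 40}
original_prediction = toy_predict(**query)   # 0.5 * 30 + 40 = 55, below 60, so 0
delta = smallest_hours_change(**query)       # 5 extra hours reach a score of 60
print(original_prediction, delta)
```

The statement "working 5 more hours per week would change the outcome" is the counterfactual explanation for this toy query.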

---

## Case Study: Predicting income of an individual

The objective of the problem is to predict whether a given adult makes more than $50,000 a year based on attributes such as education, hours of work per week, etc.


<img src="https://imgur.com/iQDKbXM.png" width="500" height="600" class="center">

## Importing Necessary Libraries

Let’s start by importing the necessary libraries.

In [1]:
import numpy as np   
import pandas as pd   
import matplotlib.pyplot as plt
import seaborn as sns

# Sklearn imports
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
np.random.seed(123) #ensure reproducibility
RANDOM_STATE = 42

# DiCE imports
import dice_ml
from dice_ml.utils import helpers  # helper functions

# suppress deprecation warnings from TF
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)


import warnings  
warnings.filterwarnings("ignore")

In [2]:
# setting up the styling for the plots in this notebook
sns.set(style="white", palette="colorblind", font_scale=1.2, rc={"figure.figsize":(10,6)})

##  Reading in the Dataset

The dataset, curated by Ronny Kohavi and Barry Becker, was drawn from the 1994 United States Census Bureau data. The task is to use personal details such as education level to predict whether an individual earns more than $50,000 per year.

We will use a preprocessed sample of the dataset. The original data is available from the [UCI Machine Learning Repository (Irvine, CA: University of California, School of Information and Computer Science)](https://archive.ics.uci.edu/ml/datasets/adult).

Let’s read in the data and look at the first few rows.

In [3]:
data = pd.read_csv('Adult Income Dataset.csv')
data.head()

| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 28 | Private | Bachelors | Single | White-Collar | White | Female | 60 | 0 |
| 1 | 30 | Self-Employed | Assoc | Married | Professional | White | Male | 65 | 1 |
| 2 | 32 | Private | Some-college | Married | White-Collar | White | Male | 50 | 0 |
| 3 | 20 | Private | Some-college | Single | Service | White | Female | 35 | 0 |
| 4 | 41 | Self-Employed | Some-college | Married | White-Collar | White | Male | 50 | 0 |


- `age`: continuous.
- `workclass`: Private, Government, Self-Employed, Other/Unknown.
- `education`: HS-grad, Some-college, Bachelors, School, Assoc, Masters, Prof-school, Doctorate.
- `marital_status`: Married, Single, Divorced, Separated, Widowed.
- `occupation`: Blue-Collar, White-Collar, Service, Professional, Sales, Other/Unknown.
- `race`: White, Other.
- `gender`: Female, Male.
- `hours_per_week`: continuous.
- `income`: >50K : 1, <=50K : 0

As can be seen, the dataset consists of predictor variables such as age, gender and occupation, and one target variable called `income`. The target is 1 if the income is greater than 50K and 0 if it is less than or equal to 50K, which makes this a classic binary classification problem.

In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26048 entries, 0 to 26047
Data columns (total 9 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   age             26048 non-null  int64 
 1   workclass       26048 non-null  object
 2   education       26048 non-null  object
 3   marital_status  26048 non-null  object
 4   occupation      26048 non-null  object
 5   race            26048 non-null  object
 6   gender          26048 non-null  object
 7   hours_per_week  26048 non-null  int64 
 8   income          26048 non-null  int64 
dtypes: int64(3), object(6)
memory usage: 1.8+ MB


## Data Preparation 

We will retain the demographic features for the sake of this example.

### Train/Test Split
We separate the target column from the features and split the dataset into train and test sets, stratifying on the target so that both sets keep the same class balance.

In [5]:
# Separate the target from the features and split into stratified train and test sets.
target = data["income"]
train_data, test_data, y_train, y_test = train_test_split(data, target, test_size=0.2, random_state=RANDOM_STATE, stratify=target)
x_train = train_data.drop('income', axis=1)
x_test = test_data.drop('income', axis=1)
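
The effect of `stratify` can be checked on a small synthetic frame. The sketch below uses made-up data (a hypothetical `toy` DataFrame with 25% positives) rather than the Adult Income file, and compares the positive rate in the two resulting splits.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the income column: 250 positives and 750 negatives.
toy = pd.DataFrame({"feature": range(1000), "income": [1] * 250 + [0] * 750})

# Same call pattern as above: 80/20 split, stratified on the target.
train, test = train_test_split(
    toy, test_size=0.2, random_state=42, stratify=toy["income"]
)

train_rate = train["income"].mean()
test_rate = test["income"].mean()
print(len(train), len(test), train_rate, test_rate)
```

Both rates should match the overall 25% closely, which is the point of stratifying.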

## Training a Random Forest Model
Now let's fit a [Random Forest classifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) inside a scikit-learn pipeline that one-hot encodes the categorical features.

In [6]:
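# Build a preprocessing + Random Forest pipeline
numerical = ["age", "hours_per_week"]
categorical = x_train.columns.difference(numerical)

# One-hot encode the categorical features; categories unseen during training are ignored
categorical_transformer = Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))])

# Note: only the categorical columns are listed here. ColumnTransformer defaults to
# remainder='drop', so `age` and `hours_per_week` do not reach the classifier as written.
transformations = ColumnTransformer(transformers=[('cat', categorical_transformer, categorical)])

# Append the classifier to the preprocessing pipeline to get a full prediction pipeline.
clf = Pipeline(steps=[('preprocessor', transformations), ('classifier', RandomForestClassifier())])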
model = clf.fit(x_train, y_train)
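
A detail worth knowing about the pipeline above: its `ColumnTransformer` names only the categorical columns, and scikit-learn's default is `remainder='drop'`, so columns that are not listed are discarded before the classifier sees them. The sketch below shows this on a tiny made-up frame (`toy` is hypothetical) and contrasts it with `remainder='passthrough'`. It illustrates the behaviour only and does not change the model trained above.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Tiny illustrative frame with one numerical and one categorical column.
toy = pd.DataFrame({
    "age": [22, 38, 45, 60],
    "education": ["HS-grad", "Bachelors", "HS-grad", "Doctorate"],
})

# Only the categorical column is listed; 'age' is dropped by default.
cat_only = ColumnTransformer(
    transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), ["education"])]
)

# Same transformer, but unlisted columns are passed through unchanged.
with_numeric = ColumnTransformer(
    transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), ["education"])],
    remainder="passthrough",
)

n_cols_cat_only = cat_only.fit_transform(toy).shape[1]          # 3 one-hot columns
n_cols_with_numeric = with_numeric.fit_transform(toy).shape[1]  # 3 one-hot columns + age
print(n_cols_cat_only, n_cols_with_numeric)
```

If the numerical features should influence the model, one option is to pass `remainder='passthrough'` (or add an explicit numerical transformer) and retrain; the counterfactuals shown later would then differ.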

We now have a trained model. Let's explore how counterfactual explanations, generated with DiCE (Diverse Counterfactual Explanations), can help us understand its predictions in a more meaningful way.

### 1. Construct a data object for DiCE

We first construct a data object for DiCE. Because continuous and categorical features are perturbed in different ways, we need to specify the names of the continuous features. DiCE also requires the name of the output variable that the ML model predicts.

In [7]:
d = dice_ml.Data(dataframe=train_data, continuous_features=['age', 'hours_per_week'], outcome_name='income')
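
Why the two feature types need different treatment can be sketched in a few lines. The code below is an illustration of the general idea, not DiCE's implementation: the helper `perturb`, the numeric ranges and the reduced category lists are assumptions made for this example. A continuous feature is resampled from a numeric range, while a categorical feature is swapped for a different level.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature descriptions, loosely mirroring the Adult Income columns.
continuous_ranges = {"age": (17, 90), "hours_per_week": (1, 99)}
categorical_levels = {
    "workclass": ["Private", "Government", "Self-Employed", "Other/Unknown"],
    "education": ["HS-grad", "Some-college", "Bachelors", "Masters", "Doctorate"],
}


def perturb(instance, feature):
    """Return a copy of `instance` with a single feature resampled."""
    changed = dict(instance)
    if feature in continuous_ranges:
        # Continuous: draw an integer uniformly from the inclusive range.
        low, high = continuous_ranges[feature]
        changed[feature] = int(rng.integers(low, high + 1))
    else:
        # Categorical: pick any level other than the current one.
        options = [v for v in categorical_levels[feature] if v != instance[feature]]
        changed[feature] = str(rng.choice(options))
    return changed


instance = {"age": 22, "workclass": "Private", "education": "HS-grad", "hours_per_week": 16}
new_age = perturb(instance, "age")
new_edu = perturb(instance, "education")
print(new_age, new_edu)
```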

### 2. Initialize the DiCE explainer

To initialize the DiCE explainer, both a dataset and a model are needed. DiCE provides local explanations for the model, so it also requires an input datapoint whose outcome needs to be explained.

In [8]:
# DiCE supports sklearn, TensorFlow and PyTorch backends
m = dice_ml.Model(model=model, backend="sklearn")

In [9]:
# Using method=random for generating CFs
explainer = dice_ml.Dice(d, m, method="random")

### 3. Generate counterfactuals for the black-box model

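We can now generate the counterfactual explanations. The first argument of the `generate_counterfactuals` method is the set of query instances for which counterfactuals are desired. This can be a dataframe with one or more rows. `total_CFs` sets how many counterfactuals to generate per query instance, and `desired_class="opposite"` asks for the class opposite to the model's current prediction.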

In [10]:
input_datapoint = x_test[0:1]
exp1 = explainer.generate_counterfactuals(input_datapoint, total_CFs=5, desired_class="opposite")

100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  5.09it/s]
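
The idea behind a random search for counterfactuals can be sketched without DiCE. The code below is a simplified illustration under stated assumptions, not the library's algorithm: `toy_model`, the synthetic data and the helper `random_counterfactuals` are all invented for this example. It resamples features at random and keeps candidates whose prediction differs from that of the query.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy data with columns [age, hours_per_week]; the label is 1 exactly when hours >= 40.
X = np.array([[age, hours] for age in range(20, 70, 10) for hours in range(1, 100)])
y = (X[:, 1] >= 40).astype(int)
toy_model = DecisionTreeClassifier(random_state=0).fit(X, y)


def random_counterfactuals(model, query, total_cfs, max_tries=1000):
    """Randomly resample features until `total_cfs` candidates flip the prediction."""
    original = model.predict(query.reshape(1, -1))[0]
    found = []
    for _ in range(max_tries):
        candidate = query.copy()
        if rng.random() < 0.5:                 # maybe resample age
            candidate[0] = rng.integers(17, 91)
        if rng.random() < 0.5:                 # maybe resample hours_per_week
            candidate[1] = rng.integers(1, 100)
        if model.predict(candidate.reshape(1, -1))[0] != original:
            found.append(candidate)
        if len(found) == total_cfs:
            break
    return found


query = np.array([22, 16])                     # predicted class 0 by the toy model
cfs = random_counterfactuals(toy_model, query, total_cfs=5)
print(cfs)
```

Because the toy label depends only on hours, every counterfactual found this way raises `hours_per_week` to 40 or more; any change in age is incidental.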


### 4. Visualizing the Counterfactuals

The query instance below has outcome `0` (low income) according to the ML model wrapped in `m`. The counterfactual set shows perturbed versions of this input for which the model outputs class `1` (high income). With `show_only_changes=True`, features that keep their original value are shown as `-`.

In [11]:
exp1.visualize_as_dataframe(show_only_changes=True)

Query instance (original outcome : 0)


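| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 22 | Private | HS-grad | Single | Blue-Collar | White | Male | 16 | 0 |

Diverse Counterfactual set (new outcome: 1.0)

| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | - | - | Doctorate | Married | - | - | - | - | 1 |
| 1 | 38.0 | - | Prof-school | - | - | - | - | - | 1 |
| 2 | - | - | Prof-school | Married | - | - | - | - | 1 |
| 3 | - | - | Prof-school | - | - | - | - | - | 1 |
| 4 | 84.0 | - | Prof-school | - | - | - | - | - | 1 |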
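
The "only changes" view can be reproduced with plain pandas, which helps when post-processing counterfactuals outside DiCE. This is a sketch of the display logic only, using a subset of the columns and the first two counterfactuals from the output above; it makes no claim about how DiCE renders its tables internally.

```python
import pandas as pd

# Query instance and two counterfactuals, restricted to four columns for brevity.
query = pd.Series(
    {"age": 22, "education": "HS-grad", "marital_status": "Single", "hours_per_week": 16}
)
counterfactuals = pd.DataFrame([
    {"age": 22, "education": "Doctorate", "marital_status": "Married", "hours_per_week": 16},
    {"age": 38, "education": "Prof-school", "marital_status": "Single", "hours_per_week": 16},
])

# True where a counterfactual value differs from the query value.
changed = counterfactuals.ne(query, axis="columns")

# Keep changed values and replace unchanged ones with "-".
only_changes = counterfactuals.astype(object).where(changed, "-")
print(only_changes)
```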


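### 5. Restricting the features to vary while generating the counterfactuals

By default every feature may change, which can produce suggestions that are not actionable. The `features_to_vary` argument restricts the search to the listed features.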

In [12]:
exp2 = explainer.generate_counterfactuals(input_datapoint,
                                  total_CFs=4,
                                  desired_class="opposite",
                                  features_to_vary=['age','education', 'occupation'])
exp2.visualize_as_dataframe(show_only_changes=True)

100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  6.25it/s]

Query instance (original outcome : 0)

| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 22 | Private | HS-grad | Single | Blue-Collar | White | Male | 16 | 0 |

Diverse Counterfactual set (new outcome: 1.0)

| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 63.0 | - | Prof-school | - | - | - | - | - | 1 |
| 1 | 60.0 | - | Prof-school | - | - | - | - | - | 1 |
| 2 | 79.0 | - | Prof-school | - | - | - | - | - | 1 |
| 3 | 43.0 | - | Prof-school | - | - | - | - | - | 1 |
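
A quick sanity check is to confirm that each counterfactual changed only the allowed features. The sketch below re-types the query and counterfactuals from the output above as plain dictionaries (the helper `changed_features` is hypothetical) and reports any feature that changed without being listed in `features_to_vary`.

```python
query = {
    "age": 22, "workclass": "Private", "education": "HS-grad",
    "marital_status": "Single", "occupation": "Blue-Collar",
    "race": "White", "gender": "Male", "hours_per_week": 16,
}

# Counterfactuals from the table above: only age and education differ from the query.
counterfactuals = [
    dict(query, age=63, education="Prof-school"),
    dict(query, age=60, education="Prof-school"),
    dict(query, age=79, education="Prof-school"),
    dict(query, age=43, education="Prof-school"),
]

features_to_vary = ["age", "education", "occupation"]


def changed_features(query, cf):
    """Return the set of feature names whose value differs from the query."""
    return {name for name in query if cf[name] != query[name]}


# For each counterfactual, the features that changed but were not allowed to.
violations = [changed_features(query, cf) - set(features_to_vary) for cf in counterfactuals]
print(violations)
```

An empty set for every row means the restriction was respected. Note that `occupation` was allowed to vary but did not need to.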


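### 6. Specifying the permitted range of features

The `permitted_range` argument limits the values a feature may take in a counterfactual: a `[min, max]` pair for a continuous feature, or a list of allowed categories for a categorical one.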

In [13]:
exp3 = explainer.generate_counterfactuals(input_datapoint,
                                  total_CFs=4,
                                  desired_class="opposite",
                                  permitted_range={'age': [40, 50], 'education': ['Doctorate', 'Prof-school']})
exp3.visualize_as_dataframe(show_only_changes=True)

100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  4.72it/s]

Query instance (original outcome : 0)

| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 22 | Private | HS-grad | Single | Blue-Collar | White | Male | 16 | 0 |

Diverse Counterfactual set (new outcome: 1.0)

| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | - | - | Prof-school | Separated | - | - | - | - | 1 |
| 1 | - | - | Prof-school | - | Sales | - | - | - | 1 |
| 2 | 41.0 | - | Prof-school | - | - | - | - | - | 1 |
| 3 | - | - | Prof-school | - | - | - | - | 52.0 | 1 |
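
In the output above, `age` stays at 22 in three of the four rows even though 22 lies outside `[40, 50]`. One reading, based only on this output and not on DiCE's internals, is that the permitted range constrains the values a feature may take when it changes, while unchanged features keep their query value. The sketch below encodes that reading as a check; the helper `respects_range` is hypothetical.

```python
query = {"age": 22, "education": "HS-grad", "marital_status": "Single",
         "occupation": "Blue-Collar", "hours_per_week": 16}

# Counterfactuals re-typed from the table above.
counterfactuals = [
    dict(query, education="Prof-school", marital_status="Separated"),
    dict(query, education="Prof-school", occupation="Sales"),
    dict(query, education="Prof-school", age=41),
    dict(query, education="Prof-school", hours_per_week=52),
]

permitted_range = {"age": [40, 50], "education": ["Doctorate", "Prof-school"]}


def respects_range(query, cf, permitted_range):
    """True if every changed, range-restricted feature takes a permitted value."""
    for feature, allowed in permitted_range.items():
        value = cf[feature]
        if value == query[feature]:
            continue  # unchanged features keep the query value
        if isinstance(value, (int, float)):
            low, high = allowed  # continuous: inclusive [min, max]
            if not low <= value <= high:
                return False
        elif value not in allowed:  # categorical: list of allowed levels
            return False
    return True


checks = [respects_range(query, cf, permitted_range) for cf in counterfactuals]
print(checks)
```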


## Advantages 
* Easy to understand and implement.
* Does not require access to the training data or the model's internals, only to its prediction function.
* Works even for non-ML systems, e.g. rule-based ones.


## Drawbacks
* Suffers from the Rashomon effect, i.e. a single instance can have multiple, possibly contradictory, counterfactual explanations.
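
One common way to cope with several competing explanations is to rank them by a simple criterion, for example sparsity (how few features change). The sketch below applies this to the five counterfactuals from step 4, re-typed as dictionaries. The ranking rule is an illustrative choice for this example and not something DiCE applies by default.

```python
query = {"age": 22, "education": "HS-grad", "marital_status": "Single", "hours_per_week": 16}

# The five counterfactuals from step 4, restricted to the columns that appear in them.
counterfactuals = [
    dict(query, education="Doctorate", marital_status="Married"),
    dict(query, age=38, education="Prof-school"),
    dict(query, education="Prof-school", marital_status="Married"),
    dict(query, education="Prof-school"),
    dict(query, age=84, education="Prof-school"),
]


def n_changes(cf):
    """Number of features whose value differs from the query."""
    return sum(cf[name] != query[name] for name in query)


# Indices of the counterfactuals, sparsest first (ties keep their original order).
ranked = sorted(range(len(counterfactuals)), key=lambda i: n_changes(counterfactuals[i]))
sparsest = counterfactuals[ranked[0]]
print(ranked, sparsest)
```

Here the sparsest explanation changes only `education`, which may be easier to communicate than the alternatives that also alter age or marital status.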