## This notebook shows how to generate Diverse Counterfactual Explanations

In [1]:
# import libraries

# Python libraries
import numpy as np
import pandas as pd
import os

# DiCE libraries
import dice_ml
from dice_ml import dice # dice interface
from dice_ml import data # data interface
from dice_ml import model # model interface
from dice_ml.utils import helpers # helper functions

### Loading dataset

We use the popular 'adult' income dataset from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/adult). For demonstration purposes, we transform the data as detailed in the **dice_ml.utils.helpers** module.

In [2]:
dataset = helpers.load_adult_income_dataset()

In [3]:
dataset.head()

,age,workclass,education,marital_status,occupation,race,gender,hours_per_week,income
0,39,Government,Bachelors,Single,White-Collar,White,Male,40,0
1,50,Self-Employed,Bachelors,Married,White-Collar,White,Male,13,0
2,38,Private,HS-grad,Divorced,Blue-Collar,White,Male,40,0
3,53,Private,School,Married,Blue-Collar,Other,Male,40,0
4,28,Private,Bachelors,Married,Professional,Other,Female,40,0


In [4]:
d = data.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')
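The `continuous_features` argument names the numeric columns. As a rough illustration of the split this implies, the sketch below uses only the column names shown in the `dataset.head()` output and assumes that every remaining non-outcome column is handled as categorical. The helper `split_features` is hypothetical and not part of `dice_ml`.

```python
# Hypothetical helper, not part of dice_ml: splits column names into
# continuous and categorical groups, assuming every non-outcome column
# that is not listed as continuous is treated as categorical.
def split_features(columns, continuous_features, outcome_name):
    missing = [c for c in continuous_features + [outcome_name] if c not in columns]
    if missing:
        raise ValueError(f"columns not found: {missing}")
    categorical = [c for c in columns
                   if c not in continuous_features and c != outcome_name]
    return list(continuous_features), categorical


# Column names copied from the dataset.head() output above.
columns = ['age', 'workclass', 'education', 'marital_status', 'occupation',
           'race', 'gender', 'hours_per_week', 'income']

continuous, categorical = split_features(
    columns, continuous_features=['age', 'hours_per_week'], outcome_name='income')
print("continuous: ", continuous)
print("categorical:", categorical)
```

Raising an error for unknown column names catches a misspelled feature before it reaches the data interface.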

### Loading trained ML model

Below, we load a trained ML model that achieves high accuracy on test datasets, comparable to other popular baselines. This sample model ships with our package.

In [5]:
ML_modelpath = helpers.get_adult_income_modelpath()
m = model.Model(model_path=ML_modelpath)

### Generate diverse counterfactuals

In [6]:
# initiate DiCE
exp = dice.Dice(d, m)

In [7]:
# query instance in the form of a dictionary; keys: feature name, values: feature value
query_instance = {'age':22, 
                  'workclass':'Private', 
                  'education':'HS-grad', 
                  'marital_status':'Single', 
                  'occupation':'Service',
                  'race': 'White', 
                  'gender':'Female', 
                  'hours_per_week': 45}
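Since the query instance is a plain dictionary keyed by feature name, a quick sanity check before generating counterfactuals can catch a missing or misspelled key. The sketch below assumes the query should contain exactly the dataset columns other than the outcome `income`; `check_query` is a hypothetical helper, not a `dice_ml` function.

```python
# Hypothetical check, not part of dice_ml: compares the keys of a query
# dictionary with the expected feature names (assumed here to be all
# dataset columns except the outcome column 'income').
def check_query(query, feature_names):
    missing = [f for f in feature_names if f not in query]
    unexpected = [k for k in query if k not in feature_names]
    return missing, unexpected


feature_names = ['age', 'workclass', 'education', 'marital_status',
                 'occupation', 'race', 'gender', 'hours_per_week']

query_instance = {'age': 22,
                  'workclass': 'Private',
                  'education': 'HS-grad',
                  'marital_status': 'Single',
                  'occupation': 'Service',
                  'race': 'White',
                  'gender': 'Female',
                  'hours_per_week': 45}

missing, unexpected = check_query(query_instance, feature_names)
print("missing:", missing)
print("unexpected:", unexpected)
```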

In [8]:
# generate counterfactuals
dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=4, desired_class="opposite")

Diverse Counterfactuals found! total time taken: 00 min 11 sec
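With `desired_class="opposite"`, the counterfactuals target the class that the model does not currently predict for the query. A minimal sketch of that flip for a binary classifier follows. It assumes the model output is a probability for class 1 and that the decision threshold is 0.5; both are assumptions, and `opposite_class` is a hypothetical helper rather than a `dice_ml` function.

```python
# Hypothetical helper, not part of dice_ml: returns the class opposite to
# the one predicted, assuming a binary classifier whose output is the
# probability of class 1 and an assumed decision threshold of 0.5.
def opposite_class(probability, threshold=0.5):
    predicted = 1 if probability >= threshold else 0
    return 1 - predicted


# Model output for the query instance, as reported in this notebook's
# visualization of the query.
query_probability = 0.004471
print("desired class:", opposite_class(query_probability))
```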


In [9]:
# visualize the results
dice_exp.visualize_as_dataframe()

Query instance:


,age,workclass,education,marital_status,occupation,race,gender,hours_per_week,income
0,22.0,Private,HS-grad,Single,Service,White,Female,45.0,0.004471



Diverse Counterfactual set:


,age,workclass,education,marital_status,occupation,race,gender,hours_per_week,income
0,90.0,Private,Assoc,Separated,Service,White,Female,99.0,0.943
1,36.0,Government,Bachelors,Married,Sales,White,Female,63.0,0.994
2,59.0,Private,HS-grad,Single,Other/Unknown,White,Male,99.0,0.977
3,25.0,Private,Bachelors,Single,White-Collar,Other,Female,41.0,0.994
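To read the counterfactual set more easily, it can help to list only the features that differ from the query. The sketch below copies the values from the two tables above into plain dictionaries and compares them; `changed_features` is a hypothetical helper, and the 0.5 threshold used to check that the prediction flipped is an assumption.

```python
# Values copied from the "Query instance" and "Diverse Counterfactual set"
# tables above; 'income' holds the model output for each row.
query = {'age': 22.0, 'workclass': 'Private', 'education': 'HS-grad',
         'marital_status': 'Single', 'occupation': 'Service', 'race': 'White',
         'gender': 'Female', 'hours_per_week': 45.0}

counterfactuals = [
    {'age': 90.0, 'workclass': 'Private', 'education': 'Assoc',
     'marital_status': 'Separated', 'occupation': 'Service', 'race': 'White',
     'gender': 'Female', 'hours_per_week': 99.0, 'income': 0.943},
    {'age': 36.0, 'workclass': 'Government', 'education': 'Bachelors',
     'marital_status': 'Married', 'occupation': 'Sales', 'race': 'White',
     'gender': 'Female', 'hours_per_week': 63.0, 'income': 0.994},
    {'age': 59.0, 'workclass': 'Private', 'education': 'HS-grad',
     'marital_status': 'Single', 'occupation': 'Other/Unknown', 'race': 'White',
     'gender': 'Male', 'hours_per_week': 99.0, 'income': 0.977},
    {'age': 25.0, 'workclass': 'Private', 'education': 'Bachelors',
     'marital_status': 'Single', 'occupation': 'White-Collar', 'race': 'Other',
     'gender': 'Female', 'hours_per_week': 41.0, 'income': 0.994},
]


# Hypothetical helper, not part of dice_ml: maps each changed feature to
# its (query value, counterfactual value) pair.
def changed_features(query, counterfactual):
    return {name: (value, counterfactual[name])
            for name, value in query.items()
            if counterfactual[name] != value}


for i, cf in enumerate(counterfactuals):
    changes = changed_features(query, cf)
    flipped = cf['income'] >= 0.5  # assumed decision threshold
    print(f"CF {i}: {len(changes)} features changed, flipped={flipped}")
    for name, (old, new) in changes.items():
        print(f"    {name}: {old} -> {new}")
```

In this particular set, `age` and `hours_per_week` change in every counterfactual, while the categorical changes vary from row to row.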
