# DiCE Demo

This demo is adapted from the official [DiCE repository](https://github.com/interpretml/DiCE).

In [1]:
# import DiCE
import dice_ml
from dice_ml.utils import helpers # helper functions

# suppress deprecation warnings from TF
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)

## Dataset for training an ML model

In [2]:
dataset = helpers.load_adult_income_dataset()
dataset.head()

|   | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 39 | Government | Bachelors | Single | White-Collar | White | Male | 40 | 0 |
| 1 | 50 | Self-Employed | Bachelors | Married | White-Collar | White | Male | 13 | 0 |
| 2 | 38 | Private | HS-grad | Divorced | Blue-Collar | White | Male | 40 | 0 |
| 3 | 53 | Private | School | Married | Blue-Collar | Other | Male | 40 | 0 |
| 4 | 28 | Private | Bachelors | Married | Professional | Other | Female | 40 | 0 |


In [3]:
d = dice_ml.Data(dataframe=dataset,
    continuous_features=['age', 'hours_per_week'],
    outcome_name='income')

## Pre-trained ML model

In [4]:
backend = 'TF' + tf.__version__[0]  # 'TF1' or 'TF2', depending on the installed TensorFlow major version
ML_modelpath = helpers.get_adult_income_modelpath(backend=backend)
m = dice_ml.Model(model_path=ML_modelpath, backend=backend)

## DiCE explanation instance

In [5]:
exp = dice_ml.Dice(d,m)

In [6]:
# query instance in the form of a dictionary; keys: feature name, values: feature value
query_instance = {'age':22, 
                  'workclass':'Private', 
                  'education':'HS-grad', 
                  'marital_status':'Single', 
                  'occupation':'Service',
                  'race': 'White', 
                  'gender':'Female', 
                  'hours_per_week': 45}
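
Before passing the dictionary to DiCE, it can help to confirm that its keys match the feature columns of the dataset. The check below is a minimal sketch in plain Python; the `feature_names` list is copied by hand from the `dataset.head()` output above (every column except the outcome `income`), and the names `missing` and `unexpected` are illustrative, not part of the DiCE API.

```python
# Feature columns taken from the dataset preview above (outcome column excluded).
feature_names = ["age", "workclass", "education", "marital_status",
                 "occupation", "race", "gender", "hours_per_week"]

# Same query instance as in the cell above.
query_instance = {"age": 22,
                  "workclass": "Private",
                  "education": "HS-grad",
                  "marital_status": "Single",
                  "occupation": "Service",
                  "race": "White",
                  "gender": "Female",
                  "hours_per_week": 45}

# Features the query forgot to specify, and keys that are not known features.
missing = [name for name in feature_names if name not in query_instance]
unexpected = [key for key in query_instance if key not in feature_names]

print("missing:", missing)
print("unexpected:", unexpected)
```

Both lists should be empty for a well-formed query; a typo in a key such as `marital-status` would show up in both.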

In [7]:
# generate counterfactuals
dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=4, desired_class="opposite")

Diverse Counterfactuals found! total time taken: 00 min 42 sec


In [8]:
# visualize the results
dice_exp.visualize_as_dataframe(show_only_changes=True)

Query instance (original outcome : 0)


|   | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 22.0 | Private | HS-grad | Single | Service | White | Female | 45.0 | 0.01904 |



Diverse Counterfactual set (new outcome : 1)


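|   | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 70.0 | - | Masters | - | White-Collar | - | - | 51.0 | 0.534 |
| 1 | - | Self-Employed | Doctorate | Married | - | - | - | - | 0.861 |
| 2 | 47.0 | - | - | Married | - | - | - | - | 0.589 |
| 3 | 36.0 | - | Prof-school | Married | - | - | - | 62.0 | 0.937 |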


The table above shows several counterfactuals that could be useful to the individual, each suggesting a possible course of action. For example, the last counterfactual says that, to earn more than $50,000 a year, she would need to complete a Prof-school degree, work 62 hours a week, and be married by the age of 36.

Some of the changes recommended in this initial set may be hard to act on, such as getting married. In general, counterfactuals that stay closer to the individual's current profile are more useful. One way to achieve this is to specify which features are allowed to vary.
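
One rough way to judge how close each counterfactual is to the original profile is to count how many features it changes. The sketch below does this for the first set of counterfactuals, with values copied by hand from the table above. The helper `changed_features` is hypothetical, and a simple change count is only an illustration; it is not necessarily the proximity measure DiCE uses internally.

```python
# Query instance, as defined earlier in the demo.
query = {"age": 22, "workclass": "Private", "education": "HS-grad",
         "marital_status": "Single", "occupation": "Service",
         "race": "White", "gender": "Female", "hours_per_week": 45}

# Counterfactuals from the first table; "-" cells (unchanged features) are omitted.
counterfactuals = [
    {"age": 70, "education": "Masters", "occupation": "White-Collar", "hours_per_week": 51},
    {"workclass": "Self-Employed", "education": "Doctorate", "marital_status": "Married"},
    {"age": 47, "marital_status": "Married"},
    {"age": 36, "education": "Prof-school", "marital_status": "Married", "hours_per_week": 62},
]

def changed_features(query, cf):
    """Return the names of the features whose value differs from the query."""
    return [name for name, value in cf.items() if query[name] != value]

for i, cf in enumerate(counterfactuals):
    changes = changed_features(query, cf)
    print(f"counterfactual {i}: {len(changes)} change(s) -> {changes}")
```

By this count, counterfactual 2 (a change in age and marital status only) is the sparsest of the four, although neither of its changes is something the individual can easily act on.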

In [9]:
dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=4, desired_class="opposite", 
                                        features_to_vary=['workclass','education','occupation','hours_per_week'])

Diverse Counterfactuals found! total time taken: 04 min 45 sec


Observation: in this run, restricting the set of features that can vary increased the run time from 42 seconds to 4 minutes 45 seconds.
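
To put a number on the difference, the two log lines printed by `generate_counterfactuals` can be converted to seconds and compared. This is a small sketch using only the messages shown in this demo; `to_seconds` is a hypothetical helper, and a single pair of runs is not a benchmark.

```python
import re

def to_seconds(message):
    """Convert the 'MM min SS sec' part of a log line to a number of seconds."""
    minutes, seconds = re.search(r"(\d+) min (\d+) sec", message).groups()
    return int(minutes) * 60 + int(seconds)

# Log lines copied from the two runs above.
unconstrained = to_seconds("Diverse Counterfactuals found! total time taken: 00 min 42 sec")
constrained = to_seconds("Diverse Counterfactuals found! total time taken: 04 min 45 sec")

slowdown = round(constrained / unconstrained, 1)
print(f"{unconstrained} s vs {constrained} s, about {slowdown}x slower")
```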

In [10]:
# visualize the results
dice_exp.visualize_as_dataframe(show_only_changes=True)

Query instance (original outcome : 0)


|   | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 22.0 | Private | HS-grad | Single | Service | White | Female | 45.0 | 0.01904 |



Diverse Counterfactual set (new outcome : 1)


|   | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | - | Self-Employed | Doctorate | - | White-Collar | - | - | 59.0 | 0.501 |
| 1 | - | Self-Employed | Doctorate | - | - | - | - | 75.0 | 0.5 |
| 2 | - | Self-Employed | Doctorate | - | - | - | - | 86.0 | 0.518 |
| 3 | - | Self-Employed | Doctorate | - | - | - | - | 99.0 | 0.539 |
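
A quick sanity check on the constrained run is to confirm that every change falls within `features_to_vary`, and to see how varied the suggestions are. The sketch below works on values copied by hand from the table above; `violations` is a hypothetical helper, not a DiCE function.

```python
# The features that were allowed to change in the constrained run.
features_to_vary = {"workclass", "education", "occupation", "hours_per_week"}

# Counterfactuals from the second table; "-" cells (unchanged features) are omitted.
constrained_counterfactuals = [
    {"workclass": "Self-Employed", "education": "Doctorate",
     "occupation": "White-Collar", "hours_per_week": 59},
    {"workclass": "Self-Employed", "education": "Doctorate", "hours_per_week": 75},
    {"workclass": "Self-Employed", "education": "Doctorate", "hours_per_week": 86},
    {"workclass": "Self-Employed", "education": "Doctorate", "hours_per_week": 99},
]

def violations(cf, allowed):
    """Return the changed features that were not allowed to vary."""
    return sorted(set(cf) - allowed)

all_violations = [violations(cf, features_to_vary) for cf in constrained_counterfactuals]
print("violations per counterfactual:", all_violations)

# How many distinct (workclass, education) combinations were suggested?
combos = {(cf["workclass"], cf["education"]) for cf in constrained_counterfactuals}
print("distinct workclass/education combinations:", len(combos))
```

The constraint is respected in all four rows, but the set is much less diverse than the first one: every counterfactual suggests the same workclass and education and differs mainly in the number of hours worked per week.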


## Limitations

DiCE is still in its early stages, so the counterfactuals it generates may not always be feasible.

At the time of writing, DiCE **only works with models built using TensorFlow and Keras**. However, the team has indicated that support for other models, such as those built with scikit-learn, is on their roadmap.