## Quick introduction to generating counterfactual explanations using DiCE

In [1]:
# import DiCE
import dice_ml
from dice_ml.utils import helpers # helper functions

DiCE requires two inputs: a training dataset and a pre-trained ML model. It can also work without access to the full dataset (see this [notebook](DiCE_with_private_data.ipynb) for advanced examples).

### Loading dataset

We use the "adult" income dataset from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/adult). For demonstration purposes, we transform the data as described in the **dice_ml.utils.helpers** module.

In [2]:
dataset = helpers.load_adult_income_dataset()

This dataset has 8 features. The outcome is income which is binarized to 0 (low-income, <=50K) or 1 (high-income, >50K). 
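
The binarization described above can be sketched roughly as follows. The raw label strings are an assumption about the UCI file format, and `binarize_income` is a hypothetical helper; the transformation in `dice_ml.utils.helpers` may be implemented differently.

```python
def binarize_income(raw_label: str) -> int:
    """Map a raw income label to 0 (<=50K) or 1 (>50K)."""
    # Strip whitespace and a possible trailing period before comparing.
    label = raw_label.strip().rstrip(".")
    if label == ">50K":
        return 1
    if label == "<=50K":
        return 0
    raise ValueError(f"unexpected income label: {raw_label!r}")


labels = [binarize_income(s) for s in ["<=50K", " >50K", ">50K."]]
```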

In [3]:
dataset.head()

| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 39 | Government | Bachelors | Single | White-Collar | White | Male | 40 | 0 |
| 1 | 50 | Self-Employed | Bachelors | Married | White-Collar | White | Male | 13 | 0 |
| 2 | 38 | Private | HS-grad | Divorced | Blue-Collar | White | Male | 40 | 0 |
| 3 | 53 | Private | School | Married | Blue-Collar | Other | Male | 40 | 0 |
| 4 | 28 | Private | Bachelors | Married | Professional | Other | Female | 40 | 0 |


In [4]:
# description of transformed features
adult_info = helpers.get_adult_data_info()
adult_info

{'age': 'age',
 'workclass': 'type of industry (Government, Other/Unknown, Private, Self-Employed)',
 'education': 'education level (Assoc, Bachelors, Doctorate, HS-grad, Masters, Prof-school, School, Some-college)',
 'marital_status': 'marital status (Divorced, Married, Separated, Single, Widowed)',
 'occupation': 'occupation (Blue-Collar, Other/Unknown, Professional, Sales, Service, White-Collar)',
 'race': 'white or other race?',
 'gender': 'male or female?',
 'hours_per_week': 'total work hours per week',
 'income': '0 (<=50K) vs 1 (>50K)'}

Given this dataset, we construct a data object for DiCE. Since continuous and discrete features are perturbed in different ways, we need to specify the names of the continuous features. DiCE also requires the name of the output variable that the ML model will predict.

In [5]:
d = dice_ml.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')
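
To make the distinction between the two feature types concrete, here is a rough sketch of how they are commonly prepared before perturbation: continuous features are min-max scaled and categorical features are one-hot encoded. This is not DiCE's internal code. `min_max_scale` and `one_hot` are hypothetical helpers, and the age range used below is invented for illustration.

```python
def min_max_scale(value, lo, hi):
    """Scale a continuous value into [0, 1] given the feature's observed range."""
    return (value - lo) / (hi - lo)


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over a fixed category list."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


# Levels taken from the feature description printed earlier in this notebook.
workclass_levels = ["Government", "Other/Unknown", "Private", "Self-Employed"]

age_scaled = min_max_scale(41, lo=17, hi=90)  # assumed range, illustration only
workclass_vec = one_hot("Private", workclass_levels)
```

A continuous feature can then be nudged by a small amount, while a categorical feature can only switch which position holds the 1.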

### Loading the ML model

In [6]:
ML_modelpath = helpers.get_adult_income_modelpath()
print(ML_modelpath)
m = dice_ml.model.Model(model_path= ML_modelpath, backend='PYT')

/mnt/c/Users/t-dimaha/Desktop/DiCE/env/lib/python3.6/site-packages/dice_ml-0.2-py3.6.egg/dice_ml/utils/sample_trained_models/adult.h5


AttributeError: 'NoneType' object has no attribute 'load_state_dict'

Below, we use a pre-trained ML model which produces high accuracy comparable to other baselines. For convenience, we include the sample trained model with the DiCE package.

In [None]:
# Only the torch modules used in the cells below are imported
import torch
from torch import nn, optim

import numpy as np

from dice_ml.model_interfaces.pytorch_model import PyTorchModel

In [None]:
inp_shape= len(d.encoded_feature_names)

ML_modelpath = helpers.get_adult_income_modelpath()
m = PyTorchModel(inp_shape)

learning_rate = 0.001
# Default Batch Size of Keras
batch_size = 32
optimizer = optim.Adam([
    {'params': filter(lambda p: p.requires_grad, m.ann_model.parameters()) }
], lr=learning_rate)
crieterion= nn.CrossEntropyLoss()

In [None]:
#Pre Trained
base_model_dir= '../dice_ml/utils/sample_trained_models/'
dataset_name= 'adult'
path=base_model_dir+dataset_name+'-pytorch.pth'
m.load_state_dict(torch.load(path))
m.eval()

For an example of how to train your own model, see [this notebook](DiCE_with_advanced_options.ipynb) or the next three cells.

In [None]:
# Dataset for training Black Box Model
train_data_vae= d.data_df.copy()

#Creating list of encoded categorical and continuous feature indices
encoded_categorical_feature_indexes = d.get_data_params()[2]     
encoded_continuous_feature_indexes=[]
data_size= len(d.encoded_feature_names)
for i in range(data_size):
    valid=1
    for v in encoded_categorical_feature_indexes:
        if i in v:
            valid=0
    if valid:
        encoded_continuous_feature_indexes.append(i)            
encoded_start_cat = len(encoded_continuous_feature_indexes)
        
#One Hot Encoding for categorical features
encoded_data = d.one_hot_encode_data(train_data_vae)

# One-hot encoding changes the column order to (continuous features, outcome, categorical features).
# Rearrange the columns so that the outcome variable comes last.
cols = list(encoded_data.columns)
cols = cols[:encoded_start_cat] + cols[encoded_start_cat+1:] + [cols[encoded_start_cat]]
encoded_data = encoded_data[cols]     

#Normalization for continuous features
encoded_data= d.normalize_data(encoded_data)
print(encoded_data.columns)
dataset = encoded_data.to_numpy()

# Train/validation split: 20% validation, 80% training
np.random.shuffle(dataset)
test_size= int(0.2*dataset.shape[0])
val_dataset= dataset[:test_size]
train_dataset= dataset[test_size:]

In [None]:
#Training
for epoch in range(50):
    np.random.shuffle(train_dataset)
    train_batches= np.array_split( train_dataset, train_dataset.shape[0]//batch_size ,axis=0 )    
    print('Epoch: ', epoch)
    train_acc=0.0
    for i in range(len(train_batches)):    
        optimizer.zero_grad()
        train_x= torch.tensor( train_batches[i][:,:-1] ).float() 
        train_y= torch.tensor( train_batches[i][:,-1], dtype=torch.int64 )
        
        out= m(train_x)
        train_acc += torch.sum( torch.argmax(out, axis=1) == train_y )
        
        # Cross Entropy Loss
        loss= crieterion(out, train_y)
        # Regularization: penalty on the L2 norm (not squared) of each parameter tensor
        weight_norm = torch.tensor(0.)
        for w in m.ann_model.parameters():
            weight_norm += w.norm()
        loss+= 0.001*weight_norm
        
        loss.backward()
        optimizer.step()
    print(train_acc, len(train_dataset))     
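
The regularization term added to the loss in the loop above is the sum of the L2 norms of the parameter tensors, scaled by 0.001. A framework-free sketch of that arithmetic follows. The weights are made up, and `l2_norm` and `norm_penalty` are illustrative names, not part of DiCE or PyTorch.

```python
import math


def l2_norm(values):
    """L2 norm of a flat list of numbers."""
    return math.sqrt(sum(v * v for v in values))


def norm_penalty(parameters, weight=0.001):
    """Sum the L2 norm of every parameter group and scale by `weight`."""
    return weight * sum(l2_norm(p) for p in parameters)


# Two toy parameter groups with norms 5 and 13.
toy_parameters = [[3.0, 4.0], [0.0, 5.0, 12.0]]
penalty = norm_penalty(toy_parameters)
```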

In [None]:
# Validation
np.random.shuffle(val_dataset)
val_batches = np.array_split(val_dataset, val_dataset.shape[0]//batch_size, axis=0)
val_acc = 0.0
# Gradients are not needed when only evaluating the model
with torch.no_grad():
    for batch in val_batches:
        val_x = torch.tensor(batch[:, :-1]).float()
        val_y = torch.tensor(batch[:, -1], dtype=torch.int64)
        out = m(val_x)
        val_acc += torch.sum(torch.argmax(out, axis=1) == val_y)
print(val_acc, len(val_dataset))

#Saving the Black Box Model
base_model_dir= '../dice_ml/utils/sample_trained_models/'
dataset_name= 'adult'
path=base_model_dir+dataset_name+'-pytorch.pth'
torch.save(m.state_dict(), path)   
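
The validation cell above prints the number of correct predictions next to the dataset size. As a small sketch of the underlying computation, the predicted class is the index of the largest output score, and accuracy is the fraction of rows where it matches the label. The scores below are invented, and `argmax` and `accuracy` are illustrative helpers.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(outputs, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    correct = sum(1 for row, y in zip(outputs, labels) if argmax(row) == y)
    return correct / len(labels)


toy_outputs = [[2.0, -1.0], [0.1, 0.3], [1.5, 1.4]]  # made-up class scores
toy_labels = [0, 1, 1]
toy_accuracy = accuracy(toy_outputs, toy_labels)
```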

### Generate feasible counterfactuals

Based on the data object *d* and the model object *m*, we can now instantiate the DiCE class for generating explanations. 
We present a variational-inference-based approach to generating counterfactuals, in which we first train an encoder-decoder framework. More details about our framework can be found here: https://arxiv.org/abs/1912.03277

The DiceBaseGenCF class has a .train() method, which trains the variational encoder-decoder framework on the data in *d*. It takes an argument, 'pre_trained': setting it to 0 trains the framework for generating CFs, while setting it to 1 avoids repeated training and loads the latest optimal model instead.

In [None]:
from dice_ml.dice_interfaces.dice_base_gencf import DiceBaseGenCF

In [None]:
# initiate DiCE
exp = DiceBaseGenCF(d, m)
exp.train(pre_trained=0)
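
The `pre_trained` switch resembles a common train-or-load pattern. The sketch below illustrates that pattern in isolation, under the assumption that a trained model can be saved to and restored from a single file. `train_or_load` and `toy_train` are hypothetical and do not reflect how DiCE stores its models.

```python
import os
import pickle
import tempfile


def train_or_load(path, train_fn, pre_trained):
    """Load a saved model if pre_trained is truthy, otherwise train and save one."""
    if pre_trained:
        with open(path, "rb") as fh:
            return pickle.load(fh)
    model = train_fn()
    with open(path, "wb") as fh:
        pickle.dump(model, fh)
    return model


calls = []


def toy_train():
    """Stand-in for an expensive training run; records that it was called."""
    calls.append("trained")
    return {"weights": [0.1, 0.2]}


with tempfile.TemporaryDirectory() as tmp:
    checkpoint = os.path.join(tmp, "toy.pkl")
    first = train_or_load(checkpoint, toy_train, pre_trained=0)   # trains and saves
    second = train_or_load(checkpoint, toy_train, pre_trained=1)  # loads from disk
```

With `pre_trained=1` a saved model must already exist, so the first run has to use `pre_trained=0`.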

DiCE is a form of local explanation and requires a query input whose outcome needs to be explained. Below we provide a sample input whose outcome is 0 (low-income) as per the ML model object *m*.

In [None]:
# query instance in the form of a dictionary; keys: feature name, values: feature value
query_instance = {'age':41, 
                  'workclass':'Private', 
                  'education':'HS-grad', 
                  'marital_status':'Single', 
                  'occupation':'Service',
                  'race': 'White', 
                  'gender':'Female', 
                  'hours_per_week': 45}
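
Before passing a query to the explainer, it can help to check that every feature is present and that categorical values match the levels listed in the feature description above. The sketch below is a hypothetical pre-check, not a DiCE function. It only validates the four categorical features whose levels are spelled out in that description.

```python
FEATURES = ["age", "workclass", "education", "marital_status",
            "occupation", "race", "gender", "hours_per_week"]
CONTINUOUS = {"age", "hours_per_week"}
LEVELS = {
    "workclass": {"Government", "Other/Unknown", "Private", "Self-Employed"},
    "education": {"Assoc", "Bachelors", "Doctorate", "HS-grad", "Masters",
                  "Prof-school", "School", "Some-college"},
    "marital_status": {"Divorced", "Married", "Separated", "Single", "Widowed"},
    "occupation": {"Blue-Collar", "Other/Unknown", "Professional", "Sales",
                   "Service", "White-Collar"},
}


def check_query(query):
    """Return a list of human-readable problems; an empty list means the query looks fine."""
    problems = [f"missing feature: {name}" for name in FEATURES if name not in query]
    for name, value in query.items():
        if name not in FEATURES:
            problems.append(f"unknown feature: {name}")
        elif name in CONTINUOUS and not isinstance(value, (int, float)):
            problems.append(f"{name} should be numeric, got {value!r}")
        elif name in LEVELS and value not in LEVELS[name]:
            problems.append(f"{name} has unlisted level {value!r}")
    return problems


sample_query = {"age": 41, "workclass": "Private", "education": "HS-grad",
                "marital_status": "Single", "occupation": "Service",
                "race": "White", "gender": "Female", "hours_per_week": 45}
issues = check_query(sample_query)
bad_issues = check_query({**sample_query, "education": "PhD"})
```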

Given the query input, we can now generate counterfactual explanations to show perturbed inputs from the original input where the ML model outputs class 1 (high-income). 

In [None]:
# generate counterfactuals
dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=5, desired_class="opposite")
# visualize the results
dice_exp.visualize_as_dataframe()

That's it! You can try generating counterfactual explanations for other examples using the same code. 

However, you might notice that for some examples, the above method can still return infeasible counterfactuals. The base framework therefore needs to be adapted to produce feasible counterfactuals. A detailed description of how we adapt our method under different assumptions is provided in [our paper](https://arxiv.org/pdf/1912.03277).

In the section below, we show an adaptation of our base approach for preserving the Age-Ed constraint: Age and Education can never decrease, and increasing Education implies an increase in Age. This approach is called **ModelApprox**, where we adapt our base approach for simple unary and binary constraints.
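
The Age-Ed constraint can be written as a simple feasibility check on a (query, counterfactual) pair. The sketch below is illustrative only. The ordering of education levels is an assumption that this notebook does not define, `satisfies_age_ed` is a hypothetical helper, and the example values are invented.

```python
# Assumed ordering of education levels, lowest to highest (not defined by the notebook).
EDUCATION_ORDER = ["School", "HS-grad", "Some-college", "Assoc",
                   "Bachelors", "Masters", "Prof-school", "Doctorate"]


def satisfies_age_ed(query, cf):
    """True if the counterfactual respects the Age-Ed constraint relative to the query."""
    age_delta = cf["age"] - query["age"]
    ed_delta = (EDUCATION_ORDER.index(cf["education"])
                - EDUCATION_ORDER.index(query["education"]))
    if age_delta < 0 or ed_delta < 0:
        return False  # neither Age nor Education may decrease
    if ed_delta > 0 and age_delta <= 0:
        return False  # more Education requires a strictly higher Age
    return True


person = {"age": 41, "education": "HS-grad"}
ok_cf = {"age": 45, "education": "Masters"}    # invented, feasible
bad_cf = {"age": 41, "education": "Masters"}   # invented, Education rises but Age does not
feasible = satisfies_age_ed(person, ok_cf)
infeasible = satisfies_age_ed(person, bad_cf)
```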

### ModelApprox 

Similar to the DiceBaseGenCF class above, the DiceModelApproxGenCF class has a .train() method with the argument 'pre_trained', which determines whether to train the framework again or load the latest optimal model. However, the .train() method takes additional arguments:

1. The first argument determines whether the constraint to be preserved is unary or monotonic.
2. The second argument provides the list of constraint variable names: [[Effect, Cause_1,..,Cause_n]]. For a unary constraint there are no causes, only a single constrained variable.
3. The third argument provides the intended direction of change for the constrained variables: a value of 1 means only an increase in the constrained variable is allowed when moving from a data point to its counterfactual, and vice versa.
4. The fourth argument is the penalty weight for infeasibility under the given constraint.
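
To give a feel for how a direction and a penalty weight could combine, here is a minimal hinge-style penalty for a unary constraint. This is an assumption made for illustration and not the loss DiCE actually uses. `unary_constraint_penalty` is a hypothetical function, and the use of -1 for "decrease only" is likewise assumed.

```python
def unary_constraint_penalty(original, counterfactual, direction, weight):
    """Zero when the change follows `direction`, otherwise weight times the size of the violation.

    direction = 1 allows only increases; direction = -1 (assumed) allows only decreases.
    """
    change = counterfactual - original
    violation = max(0.0, -direction * change)
    return weight * violation


# Normalized feature values, invented for illustration.
allowed = unary_constraint_penalty(0.5, 0.7, direction=1, weight=100)
violated = unary_constraint_penalty(0.5, 0.3, direction=1, weight=100)
```

A larger weight makes violations more costly relative to the other loss terms, which pushes the generator toward counterfactuals that respect the constraint.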

In [None]:
from dice_ml.dice_interfaces.dice_model_approx_gencf import DiceModelApproxGenCF

In [None]:
# initiate DiCE
exp = DiceModelApproxGenCF(d, m)
exp.train(1, [[0]], 1, 100, pre_trained=1)

In [None]:
# generate counterfactuals
dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=5, desired_class="opposite")
# visualize the results
dice_exp.visualize_as_dataframe()

The results for ModelApprox show that, unlike the base approach, Age also increases when Education increases in the counterfactual explanations. You can experiment with ModelApprox to preserve unary and monotonic constraints on other datasets too. Examples of more advanced approaches such as **SCMGenCF** and **OracleGenCF**, where we learn to generate feasible counterfactuals for complex feasibility constraints, will be added to this repository soon.