# Contrastive Explanation Method (CEM) applied to the Heart dataset


The Contrastive Explanation Method (CEM) is an XAI method that gives local explanations for a black-box model. It applies to classification problems. CEM produces two kinds of explanations:

Pertinent Positives (PP): For a PP, the method finds the features that should be minimally and sufficiently present (e.g. important pixels in an image) to predict the same class as on the original instance. PPs work similarly to Anchors.

Pertinent Negatives (PN): PNs, on the other hand, identify which features should be minimally and necessarily absent from the instance being explained in order to maintain the original prediction class. The aim of a PN is not to provide the full set of characteristics that should be absent from the explained instance, but to identify a minimal set of features that is enough to differentiate it from the nearest different class. PNs work similarly to Counterfactuals.






In [1]:
!pip install alibi


In [2]:
import tensorflow as tf
tf.get_logger().setLevel(40)
tf.compat.v1.disable_v2_behavior() 
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.utils import to_categorical

import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from alibi.explainers import CEM

print('TF version: ', tf.__version__)
print('Eager execution enabled: ', tf.executing_eagerly()) # False

TF version:  2.3.0
Eager execution enabled:  False


Reading the dataset:

In [3]:
dataset = pd.read_csv('/content/heartu.csv')
# To display the top 5 rows
dataset.head(5)

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,condition
0,69,1,0,160,234,1,2,131,0,0.1,1,1,0,0
1,69,0,0,140,239,0,0,151,0,1.8,0,2,0,0
2,66,0,0,150,226,0,0,114,0,2.6,2,0,0,0
3,65,1,0,138,282,1,2,174,0,1.4,1,1,0,1
4,64,1,0,110,211,0,2,144,1,1.8,1,0,0,0


In [4]:
heart = dataset.copy()

In [5]:
target = 'condition'
feature_names = list(heart.columns)
feature_names.remove(target)

In [6]:
y = heart.pop('condition')

In [7]:
heart = (heart - heart.mean(axis=0)) / heart.std(axis=0)
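
The line above is a column-wise z-score standardization. As a minimal sketch of what it does, the snippet below applies the same expression to a small toy frame (`toy` is a hypothetical name; the values are taken from a few of the rows displayed earlier) and checks that each column ends up with mean 0 and standard deviation 1.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical name) built from a few values shown in dataset.head()
toy = pd.DataFrame({"age": [69.0, 66.0, 65.0, 64.0],
                    "chol": [234.0, 226.0, 282.0, 211.0]})

# Same expression as in the notebook: subtract the column mean, divide by the column std
toy_std = (toy - toy.mean(axis=0)) / toy.std(axis=0)

# pandas uses the sample standard deviation (ddof=1) by default,
# so the standardized columns have sample std equal to 1
col_means = toy_std.mean(axis=0).to_numpy()
col_stds = toy_std.std(axis=0).to_numpy()
print(col_means, col_stds)
```

Note that the notebook computes the mean and standard deviation on the full dataset before splitting it; computing them on the training split only is a common alternative when test-set leakage is a concern.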

In [8]:
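# 80/20 train/test split with a fixed seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(heart, y, test_size=0.2, random_state=33)
# NumPy copies of the features, used later by the CEM explainer
x_train = X_train.to_numpy()
x_test = X_test.to_numpy()
# one-hot encode the binary target for the 2-unit softmax output
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)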
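
As a rough illustration of what the one-hot encoding step produces, the following NumPy-only sketch mimics the result of `to_categorical` for binary labels. The labels are the `condition` values of the five rows displayed earlier; indexing into `np.eye` is a stand-in used here for illustration, not the Keras implementation.

```python
import numpy as np

# 'condition' values of the five rows shown by dataset.head(5)
labels = np.array([0, 0, 0, 1, 0])

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(2)[labels]
print(one_hot)

# argmax along the class axis recovers the original integer labels
recovered = one_hot.argmax(axis=1)
```

Each row has a single 1 in the column of its class, which is the target format expected by the `categorical_crossentropy` loss used later.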

Training the model:

In [9]:
def nn_model():
    # 13 standardized input features
    x_in = Input(shape=(13,))
    # two hidden layers with 40 ReLU units each
    x = Dense(40, activation='relu')(x_in)
    x = Dense(40, activation='relu')(x)
    # softmax over the two classes of 'condition'
    x_out = Dense(2, activation='softmax')(x)
    nn = Model(inputs=x_in, outputs=x_out)
    nn.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
    return nn

In [10]:
nn = nn_model()
nn.summary()
nn.fit(X_train, y_train, batch_size=64, epochs=500, verbose=0)
nn.save('nn_heart.h5', save_format='h5')

Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 13)]              0         
_________________________________________________________________
dense (Dense)                (None, 40)                560       
_________________________________________________________________
dense_1 (Dense)              (None, 40)                1640      
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 82        
Total params: 2,282
Trainable params: 2,282
Non-trainable params: 0
_________________________________________________________________


Generating a contrastive explanation for the pertinent negative:

Consider the second instance of the test data (index 1).

In [11]:
idx = 1
X = x_test[idx].reshape((1,) + x_test[idx].shape)
print('Prediction on instance to be explained: {}'.format([np.argmax(nn.predict(X))]))
print('Prediction probabilities for each class on the instance: {}'.format(nn.predict(X)))

Prediction on instance to be explained: [0]
Prediction probabilities for each class on the instance: [[0.9962864  0.00371358]]


The original predicted class is 0, since class 0 has the higher predicted probability (about 0.996).
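
The two steps in the cell above (adding a batch dimension and taking the arg max of the class probabilities) can be sketched in isolation. Here `x_demo` is a hypothetical stand-in for `x_test`, and `probs` is copied from the printed output rather than recomputed with the model.

```python
import numpy as np

# Hypothetical stand-in for x_test: 5 instances with 13 features each
x_demo = np.arange(65, dtype=float).reshape(5, 13)

idx = 1
# Same reshape as in the notebook: (13,) becomes (1, 13), i.e. a batch of one instance
X_demo = x_demo[idx].reshape((1,) + x_demo[idx].shape)
print(X_demo.shape)  # (1, 13)

# Class probabilities copied from the notebook output
probs = np.array([[0.9962864, 0.00371358]])

# The predicted class is the index of the largest probability in the row
pred_class = int(np.argmax(probs, axis=1)[0])
print(pred_class)
```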

In [12]:
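mode = 'PN'  # 'PN' (pertinent negative) or 'PP' (pertinent positive)
shape = (1,) + x_train.shape[1:]  # instance shape, with a batch dimension of 1
kappa = .2  # minimum margin between the probability of the original class and the
            # highest probability among the other classes for the perturbed instance
beta = .1  # weight of the L1 loss term
c_init = 10.  # initial weight c of the prediction loss term
c_steps = 10  # number of updates of c
max_iterations = 1000  # number of optimization steps per value of c
feature_range = (x_train.min(axis=0).reshape(shape)-.1,  # per-feature lower bound
                 x_train.max(axis=0).reshape(shape)+.1)  # per-feature upper bound
clip = (-1000.,1000.)  # gradient clipping range
lr_init = 1e-2  # initial learning rate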

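Here,

*   mode : 'PN' (Pertinent Negative) or 'PP' (Pertinent Positive)
*   shape : shape of the instance to explain, including a leading batch dimension of 1, since CEM explains one instance at a time
*   kappa : confidence margin required between the probability of the original class and the highest probability among the other classes
*   beta : weight of the L1 (sparsity) term in the loss
*   c_init, c_steps : initial weight c of the prediction loss term and the number of times c is updated
*   gamma : weight of an optional autoencoder reconstruction term; it is not set in this notebook, so the library default is used
*   max_iterations : number of loss optimization steps for each value of c
*   feature_range : global or feature-wise minimum and maximum values allowed for the perturbed instance
*   clip : minimum and maximum gradient values (gradient clipping)
*   lr_init : initial learning rate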
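
To make the roles of kappa and beta more concrete, here is a minimal sketch of two of the loss terms as they are usually described for CEM in PN mode: a hinge-like prediction term bounded below by `-kappa`, and an elastic-net regularizer on the perturbation. The function names `pn_prediction_loss` and `elastic_net` are hypothetical, the `[0.3, 0.7]` probabilities are made up for illustration, and this is not the code alibi runs internally.

```python
import numpy as np

def pn_prediction_loss(probs, orig_class, kappa):
    """Hinge-like PN term: positive while the original class still wins,
    and bounded below by -kappa once another class leads by at least kappa."""
    probs = np.asarray(probs, dtype=float)
    other_max = np.max(np.delete(probs, orig_class))
    return max(probs[orig_class] - other_max, -kappa)

def elastic_net(delta, beta):
    """beta * L1 norm plus squared L2 norm of the perturbation delta."""
    delta = np.asarray(delta, dtype=float)
    return beta * np.abs(delta).sum() + np.square(delta).sum()

# Probabilities of the original instance (from the notebook output): class 0 wins
loss_before = pn_prediction_loss([0.9962864, 0.00371358], orig_class=0, kappa=0.2)

# Assumed probabilities after a perturbation that flips the class
loss_after = pn_prediction_loss([0.3, 0.7], orig_class=0, kappa=0.2)

# Regularizer for a hypothetical 3-feature perturbation, with beta = 0.1 as above
reg = elastic_net([0.5, 0.0, -1.0], beta=0.1)
print(loss_before, loss_after, reg)
```

In this picture, the weight c scales the prediction term against the regularizer, and a larger kappa asks for a more confident class flip before the prediction term stops decreasing.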



In [13]:
lr = load_model('nn_heart.h5')

# initialize CEM explainer and explain instance
cem = CEM(lr, mode, shape, kappa=kappa, beta=beta, feature_range=feature_range,
          max_iterations=max_iterations, c_init=c_init, c_steps=c_steps,
          learning_rate_init=lr_init, clip=clip)
cem.fit(x_train, no_info_type='median')  
explanation = cem.explain(X, verbose=False)

In [14]:
print('Feature names: {}'.format(feature_names))
print('Original instance: {}'.format(explanation.X))
print('Predicted class: {}'.format([explanation.X_pred]))
print('Pertinent negative: {}'.format(explanation.PN))
print('Predicted class: {}'.format([explanation.PN_pred]))

Feature names: ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']
Original instance: [[-1.3859065  -1.44454157 -0.16401266 -0.65831955 -0.73753753 -0.41075703
  -1.00172798  1.02001221 -0.695246   -0.90518389  0.64269638 -0.72075958
  -0.87281841]]
Predicted class: [0]
Pertinent negative: [[-1.4474772  -1.4445416   0.24188966 -0.65831953 -0.7375375  -0.41075704
  -1.0017279   1.0200123  -0.5739004  -0.9051839   0.6426964   1.7650422
   0.5487773 ]]
Predicted class: [1]


The output shows that the pertinent negative flips the prediction from the original class 0 to class 1.

Comparing the pertinent negative with the original instance, the features whose (standardized) values changed are age, cp, exang, ca and thal, with ca, thal and cp changing the most. The remaining features are unchanged up to rounding. These changes are what should necessarily be absent for the prediction to stay at class 0, because applying them is enough to flip the predicted class.
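
One way to read off the changed features is to subtract the original instance from the pertinent negative. The sketch below does this with the two arrays copied from the output above; the threshold `tol` is an assumption chosen to separate rounding noise from actual changes.

```python
import numpy as np

feature_names = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg',
                 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']

# Values copied from the printed output (standardized scale)
x_orig = np.array([-1.3859065, -1.44454157, -0.16401266, -0.65831955,
                   -0.73753753, -0.41075703, -1.00172798, 1.02001221,
                   -0.695246, -0.90518389, 0.64269638, -0.72075958,
                   -0.87281841])
x_pn = np.array([-1.4474772, -1.4445416, 0.24188966, -0.65831953,
                 -0.7375375, -0.41075704, -1.0017279, 1.0200123,
                 -0.5739004, -0.9051839, 0.6426964, 1.7650422,
                 0.5487773])

delta = x_pn - x_orig
tol = 1e-3  # assumed threshold: smaller differences are treated as rounding noise

# Keep only the features that actually moved, largest change first
changed = [(name, float(d)) for name, d in zip(feature_names, delta) if abs(d) > tol]
changed.sort(key=lambda item: abs(item[1]), reverse=True)

for name, d in changed:
    print(f"{name}: {d:+.3f}")
```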

Generating pertinent positive:

In [15]:
mode = 'PP'

In [16]:
# reload the trained model
lr = load_model('nn_heart.h5')

# initialize CEM explainer and explain instance
cem = CEM(lr, mode, shape, kappa=kappa, beta=beta, feature_range=feature_range,
          max_iterations=max_iterations, c_init=c_init, c_steps=c_steps,
          learning_rate_init=lr_init, clip=clip)
cem.fit(x_train, no_info_type='median')
explanation = cem.explain(X, verbose=False)

In [17]:
print('Original instance: {}'.format(explanation.X))
print('Predicted class: {}'.format([explanation.X_pred]))
print('Pertinent positive: {}'.format(explanation.PP))
print('Predicted class: {}'.format([explanation.PP_pred]))

Original instance: [[-1.3859065  -1.44454157 -0.16401266 -0.65831955 -0.73753753 -0.41075703
  -1.00172798  1.02001221 -0.695246   -0.90518389  0.64269638 -0.72075958
  -0.87281841]]
Predicted class: [0]
Pertinent positive: [[-4.63393008e-08 -1.09832214e-02 -4.27903346e-09 -8.10524396e-02
  -2.21655790e-08  9.42199779e-09 -9.06142362e-02  1.00736806e-01
  -2.08339260e-08 -2.68793621e-01 -3.37093253e-09 -8.50149595e-09
  -3.48234908e-09]]
Predicted class: [0]


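The output shows that the predicted class stays at 0 for the pertinent positive. Most entries of the PP array are numerically zero, while a few features (sex, trestbps, restecg, thalach and oldpeak) keep clearly non-zero values. According to CEM, these retained values are what should be minimally and sufficiently present for the model to still predict the original class 0.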
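
The retained features can be listed by filtering the pertinent positive for entries that are not numerically zero. The array below is copied from the output above, and the cutoff `tol` is an assumption for what counts as "non-zero" here.

```python
import numpy as np

feature_names = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg',
                 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']

# Pertinent positive copied from the printed output (standardized scale)
pp = np.array([-4.63393008e-08, -1.09832214e-02, -4.27903346e-09, -8.10524396e-02,
               -2.21655790e-08, 9.42199779e-09, -9.06142362e-02, 1.00736806e-01,
               -2.08339260e-08, -2.68793621e-01, -3.37093253e-09, -8.50149595e-09,
               -3.48234908e-09])

tol = 1e-3  # assumed cutoff below which an entry is treated as zero
kept = [name for name, value in zip(feature_names, pp) if abs(value) > tol]
print(kept)
```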