# Explaining Keras text classifier predictions with Grad-CAM

We will explain text classification predictions using Grad-CAM. We will use the IMDB dataset available in Keras and the financial dataset, loading pretrained models for both.

For images, see the other notebook.

First, some imports.

In [1]:
import numpy as np
import pandas as pd
from IPython.display import display, HTML

# you may want to keep logging enabled when doing your own work
import logging
import tensorflow as tf
tf.get_logger().setLevel(logging.ERROR) # disable Tensorflow warnings for this tutorial
import warnings
warnings.simplefilter("ignore") # disable Keras warnings for this tutorial
import keras

import eli5

Using TensorFlow backend.


## Explaining sentiment classification

Sentiment classification is a common tutorial task: binary classification with a single output, where a high score (close to 1) means positive and a low score (close to 0) means negative. We will use the IMDB dataset and a recurrent model with word-level tokenization.

Load our model (available in ELI5).

In [2]:
model = keras.models.load_model('../tests/estimators/keras_sentiment_classifier.h5')
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, None, 8)           80000     
_________________________________________________________________
masking_1 (Masking)          (None, None, 8)           0         
_________________________________________________________________
masking_2 (Masking)          (None, None, 8)           0         
_________________________________________________________________
masking_3 (Masking)          (None, None, 8)           0         
_________________________________________________________________
bidirectional_1 (Bidirection (None, None, 128)         37376     
_________________________________________________________________
bidirectional_2 (Bidirection (None, None, 64)          41216     
_________________________________________________________________
bidirectional_3 (Bidirection (None, 32)                10368     
__________

Load some sample data. We have a module that does the preprocessing for us; check the relevant package to learn more. For your own models, you will have to do your own preprocessing.

In [3]:
import tests.estimators.keras_sentiment_classifier as keras_sentiment_classifier
# FIXME: does this always work?

In [4]:
(x_train, y_train), (x_test, y_test) = keras_sentiment_classifier.prepare_train_test_dataset()

Confirm the accuracy of the model.

In [5]:
print(model.metrics_names)
model.evaluate(x_test, y_test)

['loss', 'acc']


[0.4319177031707764, 0.81504]

An accuracy of about 0.82 is good enough for this tutorial. Let's go on and get one of the samples.

In [6]:
doc = x_test[0:1]
print(doc)

tokens = keras_sentiment_classifier.vectorized_to_tokens(doc)
print(tokens)

[[   1  591  202   14   31    6  717   10   10    2    2    5    4  360
     7    4  177 5760  394  354    4  123    9 1035 1035 1035   10   10
    13   92  124   89  488 7944  100   28 1668   14   31   23   27 7479
    29  220  468    8  124   14  286  170    8  157   46    5   27  239
    16  179    2   38   32   25 7944  451  202   14    6  717    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0]]
[['<START>', 'please', 'give', 'this', 'one', 'a', 'miss', 'br', 'br', '<OOV>', '<OOV>', 'and', 'the', 'rest', 'of', 'the', 'cast', 'rendered', 'terrible', 'performances', 'the', 'show', 'is', 'flat', 'flat', 'flat', 'br', 'br', 'i', "don't", 'know', 'how', 'michael', 'madison', 'could', 'have', 'allowed', 'this', 'one', 'on', 'his', 'p
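For your own models you need a similar mapping from ids back to tokens. The following is a minimal sketch of how such a helper might look, not the actual `vectorized_to_tokens` implementation. The vocabulary is hypothetical and contains only ids visible in the printed sample; the mapping of `0` to `'<PAD>'` and the name `ids_to_tokens` are assumptions made for illustration.

```python
# Hypothetical vocabulary: only ids that are visible in the printed sample.
# A real helper would cover the full vocabulary of the tokenizer.
ID_TO_TOKEN = {
    0: '<PAD>',    # assumption: the padding id maps to a '<PAD>' token
    1: '<START>',
    2: '<OOV>',
    4: 'the',
    5: 'and',
    6: 'a',
    10: 'br',
    14: 'this',
    31: 'one',
    202: 'give',
    591: 'please',
    717: 'miss',
}


def ids_to_tokens(batch, id_to_token, unknown='<OOV>'):
    """Map a batch of id sequences to a batch of token sequences."""
    return [[id_to_token.get(int(i), unknown) for i in row] for row in batch]


# First ids of the sample, followed by two padding positions.
sample = [[1, 591, 202, 14, 31, 6, 717, 10, 10, 2, 2, 5, 4, 0, 0]]
print(ids_to_tokens(sample, ID_TO_TOKEN))
```

The token batch must have the same shape as the id batch, one token per id, so that each weight in the explanation can be attached to the right token.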

Check the prediction.

In [7]:
model.predict(doc)

array([[0.1622659]], dtype=float32)

As expected, the predicted score is low (about 0.16), so the model considers this review negative.
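Because the model has a single output where high means positive and low means negative, a score can be turned into a label by thresholding. The sketch below assumes a threshold of 0.5, which the notebook does not state; `score_to_label` is a hypothetical helper, not part of ELI5 or Keras.

```python
import numpy as np


def score_to_label(scores, threshold=0.5):
    """Turn single-output scores into 'positive' / 'negative' labels.

    The 0.5 default is an assumption; pick whatever suits your model.
    """
    scores = np.asarray(scores, dtype=float)
    return np.where(scores >= threshold, 'positive', 'negative')


# Score returned by model.predict(doc) above.
print(score_to_label([[0.1622659]]))
```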

Now let's explain what got us this result with ELI5. We need to pass the model, the input, and the associated tokens that will be highlighted.

In [8]:
eli5.show_prediction(model, doc, tokens=tokens)

Let's try a custom input.

In [9]:
s = "hello this is great but not so great"
doc_s, tokens_s = keras_sentiment_classifier.string_to_vectorized(s)
print(doc_s, tokens_s)

[[   1 4825   14    9   87   21   24   38   87]] [['<START>' 'hello' 'this' 'is' 'great' 'but' 'not' 'so' 'great']]


Notice that this model does not require fixed length input. We do not need to pad this sample.

In [10]:
model.predict(doc_s)

array([[0.5912496]], dtype=float32)

In [11]:
eli5.show_prediction(model, doc_s, tokens=tokens_s)

## Modifying binary classification explanations with the `counterfactual` and `relu` arguments

What did we see in the last section? Grad-CAM shows what makes a class score go up, so we only see the "positive" parts.

To change this, we can pass two boolean arguments.

`counterfactual` shows the opposite: what makes the score go down (set to `True` to enable).

In [12]:
eli5.show_prediction(model, doc, tokens=tokens, counterfactual=True)

`relu` filters out the negative scores so that only what makes the predicted score go up is shown. It is enabled by default; set it to `False` to see the negative scores as well.

In [13]:
eli5.show_prediction(model, doc, tokens=tokens, relu=False)

Green is positive, red is negative, white is neutral. We can see what made the network decide that this is a negative example.

What happens if we pass both `counterfactual=True` and `relu=False`?

In [14]:
eli5.show_prediction(model, doc, tokens=tokens, relu=False, counterfactual=True)

Notice how the colors (green and red) are inverted.
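To make the effect of the two flags concrete, here is a minimal NumPy sketch of the Grad-CAM computation on a toy sequence. It follows the general Grad-CAM recipe: average the gradients over positions to get one weight per channel, then take the weighted sum of the activations at each position. It is not ELI5's actual implementation; the function name `grad_cam_1d` and the toy values are made up for illustration.

```python
import numpy as np


def grad_cam_1d(activations, grads, relu=True, counterfactual=False):
    """Grad-CAM heatmap for one sequence.

    activations, grads: arrays of shape (timesteps, channels), where grads
    holds the gradient of the class score w.r.t. each activation.
    """
    activations = np.asarray(activations, dtype=float)
    grads = np.asarray(grads, dtype=float)
    if counterfactual:
        grads = -grads                      # look at what lowers the score
    weights = grads.mean(axis=0)            # one weight per channel
    heatmap = activations @ weights         # one value per timestep
    if relu:
        heatmap = np.maximum(heatmap, 0.0)  # keep only positive evidence
    return heatmap


# Toy values: 3 timesteps, 2 channels.
acts = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])
grads = np.array([[0.5, -1.0],
                  [0.5, -1.0],
                  [0.5, -1.0]])

print(grad_cam_1d(acts, grads))                                    # [0.5 0.  0. ]
print(grad_cam_1d(acts, grads, relu=False))                        # [ 0.5 -1.  -0.5]
print(grad_cam_1d(acts, grads, counterfactual=True))               # [0.  1.  0.5]
print(grad_cam_1d(acts, grads, relu=False, counterfactual=True))   # [-0.5  1.   0.5]
```

In this sketch, with `relu=False` the counterfactual heatmap is the exact negation of the regular one, which matches the inverted colors seen above.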

## Removing padding from explanations

Often when working with text, each example is padded, either because the model expects input of a fixed length or so that all samples in a batch have the same length.

We can remove padding by specifying two arguments. The first is `pad_value`: either the padding token as it appears in `tokens` (such as `<PAD>`) or the numeric padding value as it appears in `doc` (such as `0`). The second is `padding`, which should be set to either `'pre'` (padding comes before the actual text) or `'post'` (padding comes after the actual text).

In [15]:
eli5.show_prediction(model, doc, tokens=tokens, relu=False, pad_value='<PAD>', padding='post')

Now the explanation is shorter. This is useful if the input has a lot of padding.
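Conceptually, removing padding means trimming the padded positions from the tokens and from the matching explanation weights. The sketch below shows one way this could be done for a single sample. It is an illustration under assumptions, not ELI5's code: `remove_padding` is a hypothetical helper, and the tokens and weights are made-up values.

```python
def remove_padding(tokens, heatmap, pad_value='<PAD>', padding='post'):
    """Drop padded positions from one sample's tokens and matching weights."""
    if padding not in ('pre', 'post'):
        raise ValueError("padding must be 'pre' or 'post'")
    n = len(tokens)
    if padding == 'post':
        # Walk backwards past the trailing padding.
        end = n
        while end > 0 and tokens[end - 1] == pad_value:
            end -= 1
        return tokens[:end], heatmap[:end]
    # 'pre': walk forwards past the leading padding.
    start = 0
    while start < n and tokens[start] == pad_value:
        start += 1
    return tokens[start:], heatmap[start:]


# Made-up example with post-padding.
toks = ['good', 'film', '<PAD>', '<PAD>']
weights = [0.7, 0.2, 0.0, 0.0]
print(remove_padding(toks, weights))  # (['good', 'film'], [0.7, 0.2])
```

Only the leading or trailing run of padding is trimmed, so a padding-like token in the middle of the text stays in place.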

## Choosing the layer to explain with

Grad-CAM requires a hidden layer to do its calculations on. This is controlled by the `layer` argument. We can pass the layer explicitly (as an int index, a string name, or a Keras `Layer` instance), or let ELI5 attempt to find a suitable layer automatically.

In [16]:
for layer in model.layers:
    name = layer.name
    print(name)
    if 'masking' not in layer.name:
        e = eli5.show_prediction(model,
                                 doc,
                                 tokens=tokens,
                                 layer=layer,
                                 relu=False, 
                                 pad_value='<PAD>', 
                                 padding='post')
        display(e)  # inside a loop, the explanation must be displayed explicitly with IPython's display()

embedding_1


masking_1
masking_2
masking_3
bidirectional_1


bidirectional_2


bidirectional_3


dense_1


dense_2


If you don't get good explanations from ELI5 out of the box, it may be worth looking into this parameter. We advise picking layers that contain "spatial or temporal" information, i.e. NOT dense/fully-connected or merge layers.

Notice that when explaining the final dense layer (which has only one output node), we get an "all green" explanation. Hover over the explanation to see the actual values. It looks off because there are no "negative" values here and the colouring is not gradual.
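One rough way to shortlist candidate layers is to keep only those whose output still has a time axis, i.e. a `(batch, timesteps, channels)` shape. The sketch below applies that heuristic to the shapes copied from the model summary printed earlier. The dense layers are left out because their shapes are not visible in the truncated summary. The helper `temporal_layers` is hypothetical, and the rank-3 rule is an assumption for illustration, not how ELI5 picks a layer.

```python
# (name, output shape) pairs copied from the model summary printed earlier.
LAYER_SHAPES = [
    ('embedding_1', (None, None, 8)),
    ('masking_1', (None, None, 8)),
    ('masking_2', (None, None, 8)),
    ('masking_3', (None, None, 8)),
    ('bidirectional_1', (None, None, 128)),
    ('bidirectional_2', (None, None, 64)),
    ('bidirectional_3', (None, 32)),
]


def temporal_layers(layer_shapes, skip=('masking',)):
    """Keep layers whose output shape is (batch, timesteps, channels).

    Layers whose name contains one of the `skip` substrings are dropped,
    mirroring the loop above, which skips the masking layers.
    """
    return [name for name, shape in layer_shapes
            if len(shape) == 3 and not any(s in name for s in skip)]


print(temporal_layers(LAYER_SHAPES))
# ['embedding_1', 'bidirectional_1', 'bidirectional_2']
```

`bidirectional_3` is filtered out because its output shape `(None, 32)` no longer has a time axis.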

## Explaining multiclass model predictions

A multi-class model trained on the financial dataset, using character-level tokenization and a convolutional network.

In [17]:
# multiclass model (*target, layer - conv/others, diff. types of expls, padding and its effect)

## Choosing a classification target to focus on

## How it works: `explain_prediction` and `format_as_html`

In [18]:
# heatmap, tokens, weighted_spans, interpolation_kind, etc.

## ELI5 applied - improving the accuracy of a model

In [19]:
# improve acc of existing - check with eli5 what's wrong

In [20]:
# train on reuters for multilabel, use eli5 as keras callback - check explanation!