Model interpretability - Making your model confess: Saliency maps
=================================================================

This notebook shows how to use AllenNLP to interpret a model built with HuggingFace transformers using saliency maps. First, let's install the required libraries:

In [1]:
%pip install transformers allennlp eli5 --quiet
%pip install -U google-cloud-storage==1.40.0 --quiet

Loading NLP HuggingFace models into AllenNLP framework
-------------------------------------------------------

AllenNLP is a general-purpose deep learning framework for NLP, built on PyTorch and developed by the Allen Institute for AI (AI2). Its team envisions language-centered AI that equitably serves humanity, with researchers and engineers working closely together to publish research and to identify common investments that benefit the field as a whole.

In this case we will use nlptown/bert-base-multilingual-uncased-sentiment. This is a bert-base-multilingual-uncased model finetuned for sentiment analysis on product reviews in six languages: English, Dutch, German, French, Spanish and Italian. It predicts the sentiment of the review as a number of stars (between 1 and 5).

The model can be used directly for sentiment analysis of product reviews in any of the six languages, or fine-tuned further on related sentiment analysis tasks. To keep the example small, we won't fine-tune it on our own data here.

In [1]:
model_uri = 'nlptown/bert-base-multilingual-uncased-sentiment'
model_name = model_uri

In [2]:
from transformers import AutoModelForSequenceClassification

classifier = AutoModelForSequenceClassification.from_pretrained(model_name)

## Loading the model using AllenNLP

AllenNLP supports a declarative way of building models: the entire architecture can be specified in Jsonnet. This speeds up iteration, since trying a new combination of components only requires changing the relevant part of the configuration. The following configuration (written here as a Params object, although the same structure could be saved in a Jsonnet file) is equivalent to the model we loaded before.
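
To make the declarative idea concrete, here is a minimal sketch of how a "type" key can select a registered class while the remaining keys become constructor arguments. It only illustrates the pattern: the names REGISTRY, register and from_params, and the two toy classes, are hypothetical and do not reproduce AllenNLP's actual implementation.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping a "type" string to a constructor.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str):
    """Class decorator that stores the class under the given type name."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


def from_params(params: Dict[str, Any]) -> Any:
    """Build an object from a nested config, recursing into sub-configs."""
    kwargs = {}
    for key, value in params.items():
        if key == "type":
            continue
        # A nested dict carrying its own "type" describes another component.
        if isinstance(value, dict) and "type" in value:
            value = from_params(value)
        kwargs[key] = value
    return REGISTRY[params["type"]](**kwargs)


@register("toy_encoder")
class ToyEncoder:
    def __init__(self, hidden_size: int):
        self.hidden_size = hidden_size


@register("toy_classifier")
class ToyClassifier:
    def __init__(self, encoder: ToyEncoder, num_labels: int):
        self.encoder = encoder
        self.num_labels = num_labels


toy_model = from_params({
    "type": "toy_classifier",
    "encoder": {"type": "toy_encoder", "hidden_size": 768},
    "num_labels": 5,
})
print(type(toy_model).__name__, toy_model.encoder.hidden_size, toy_model.num_labels)
```

Swapping the encoder in this sketch only means changing the nested "encoder" entry, which is the kind of quick iteration the declarative style is meant to enable.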

In [3]:
from allennlp.common import Params
from allennlp.data.dataset_readers import DatasetReader

params = Params({
      "type": "text_classification_json",
      "tokenizer": {
          "type": "pretrained_transformer",
          "model_name": model_name,
      },
      "token_indexers": {
          "tokens": {
              "type": "pretrained_transformer",
              "model_name": model_name,
          }
      }
})

dataset_reader = DatasetReader.from_params(params)

In [4]:
from allennlp.common import Params
from allennlp.models import Model

params = Params({
    "type": "basic_classifier",
    "vocab": {
        "type": "from_pretrained_transformer",
        "model_name": model_name,
    },
    "text_field_embedder": {
        "type": "basic",
        "token_embedders": {
            "tokens": {
                "type": "pretrained_transformer",
                "model_name": model_name
            }
        }
    },
    "seq2vec_encoder": {
        "type": "bert_pooler",
        "pretrained_model": model_name
    },
    "dropout": 0.1,
    "num_labels": 5,
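})

model = Model.from_params(params)
# The AllenNLP classifier is created with a freshly initialized classification
# layer, so copy the fine-tuned head over from the HuggingFace classifier.
model._classification_layer.weight = classifier.classifier.weight
model._classification_layer.bias = classifier.classifier.bias
_ = model.eval()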

Some weights of the model checkpoint at nlptown/bert-base-multilingual-uncased-sentiment were not used when initializing BertModel: ['classifier.bias', 'classifier.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Next, we create a predictor from the dataset reader and the model:

In [5]:
from allennlp.predictors import TextClassifierPredictor

predictor = TextClassifierPredictor(model, dataset_reader)

Interpreting predictions using AllenNLP and Saliency Maps
---------------------------------------------------------

In [6]:
labels = {
    "1-stars": 0,
    "2-stars": 1,
    "3-stars": 2,
    "4-stars": 3,
    "5-stars": 4
}
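
The dictionary above maps label names to class indices, while a prediction comes back as one probability per class. A minimal sketch of the reverse lookup follows, assuming the class order of the model matches the dictionary. The helper predicted_label and the example probabilities are hypothetical and made up for illustration; they are not output of the model.

```python
labels = {"1-stars": 0, "2-stars": 1, "3-stars": 2, "4-stars": 3, "5-stars": 4}

# Invert the mapping so a class index can be turned back into its label.
index_to_label = {index: name for name, index in labels.items()}


def predicted_label(probs):
    """Return the label whose probability is highest."""
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return index_to_label[best_index]


# Made-up probabilities for illustration only, not real model output.
example_probs = [0.55, 0.30, 0.10, 0.03, 0.02]
print(predicted_label(example_probs))
```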

Let's consider the following example:

In [54]:
sample_text = "Not buying it any sooner! Other brands do a much better job"

In [55]:
inputs = {"sentence": sample_text }

### Using Integrated Gradients

In [56]:
from allennlp.interpret.saliency_interpreters import IntegratedGradient, SmoothGradient

interpreter = IntegratedGradient(predictor)
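
For intuition, integrated gradients attributes a prediction to each input by accumulating gradients along a straight path from a baseline to the actual input, then scaling by the difference between input and baseline. The sketch below applies the idea to a toy two-variable function with an analytic gradient and a zero baseline. The function, the baseline and the number of steps are assumptions chosen for illustration; AllenNLP applies the technique to token embeddings, and its exact settings are not covered here.

```python
def f(x):
    """Toy differentiable 'model': f(x) = x0^2 + 3 * x0 * x1."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    """Analytic gradient of f."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def integrated_gradients(x, baseline, steps=100):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    totals = [0.0] * len(x)
    for k in range(steps):
        # Point on the straight line between the baseline and the input.
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_f(point)):
            totals[i] += g
    # Average gradient along the path, scaled by (input - baseline).
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(x, baseline)
print(attributions, sum(attributions), f(x) - f(baseline))
```

A useful sanity check is that the attributions add up (approximately) to the difference between the function value at the input and at the baseline.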

Calculate the gradients:

In [57]:
inputs = {"sentence": sample_text }

In [58]:
import numpy as np

interpretation = interpreter.saliency_interpret_from_json(inputs)
outputs = predictor.predict(sample_text)



In [59]:
outputs.keys()

dict_keys(['logits', 'probs', 'token_ids', 'label', 'tokens'])

In [60]:
grads = np.array(interpretation['instance_1']['grad_input_1'])
probs = np.array(outputs['probs'])
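
The grads array holds one score per token. As a rough sketch of how such scores can be obtained from a matrix with one row per token and one column per embedding dimension: sum each row, take the absolute value, and divide by the L1 norm so the scores add up to 1. This is believed to resemble the normalization used by AllenNLP's saliency interpreters, but treat that as an assumption; the function token_scores and the toy matrix are hypothetical.

```python
def token_scores(grad_matrix):
    """Collapse a (tokens x embedding_dim) matrix to one score per token.

    Assumes at least one row has a non-zero sum, otherwise the norm is zero.
    """
    per_token = [sum(row) for row in grad_matrix]
    l1_norm = sum(abs(value) for value in per_token)
    return [abs(value) / l1_norm for value in per_token]


# Toy gradient-times-embedding values for three tokens, made up for illustration.
toy_grads = [
    [0.2, -0.1, 0.1],   # row sum is about 0.2
    [-0.5, -0.2, 0.1],  # row sum is about -0.6
    [0.1, 0.1, 0.0],    # row sum is about 0.2
]
scores = token_scores(toy_grads)
print([round(s, 2) for s in scores])
```

Because of the absolute value, scores of this kind say how strongly a token influences the prediction, not in which direction.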

Let's plot the results:

In [61]:
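from IPython.display import HTML
from eli5.formatters import format_as_html
# Note: eli5_allennlp is not one of the packages installed above; it is assumed
# to be a helper module available alongside this notebook.
from eli5_allennlp import get_explanation_from_grads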

In [62]:
expl = get_explanation_from_grads(estimator_name="transformer",
                                  estimator_description="NLP transformer explanation",
                                  text=sample_text,
                                  tokens=outputs['tokens'],
                                  grads=grads,
                                  probas=probs,
                                  labels=list(labels.keys()))

In [63]:
HTML(format_as_html(expl))

Contribution?,Feature
1.0,Highlighted in text


In [65]:
expl = get_explanation_from_grads(estimator_name="transformer",
                                  estimator_description="NLP transformer explanation",
                                  text=sample_text,
                                  tokens=outputs['tokens'],
                                  grads=grads,
                                  probas=probs,
                                  labels=list(labels.keys()),
                                  force_weights=True)

In [66]:
HTML(format_as_html(expl, force_weights=True))

Weight?,Feature
0.003,[CLS]
0.299,not
0.099,buying
0.025,it
0.006,any
0.017,soon
0.015,##er
0.029,!
0.04,other
0.022,brands


### Using Smooth Gradients

In [67]:
from allennlp.interpret.saliency_interpreters import IntegratedGradient, SmoothGradient

interpreter = SmoothGradient(predictor)
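
For intuition, smooth gradients reduces noise in a saliency map by averaging the gradient over many copies of the input perturbed with Gaussian noise. The sketch below does this for a toy two-variable function with an analytic gradient. The function, noise scale, sample count and seed are assumptions chosen for illustration; AllenNLP perturbs token embeddings, and its exact settings are not covered here.

```python
import random


def grad_f(x):
    """Analytic gradient of the toy 'model' f(x) = x0^2 + 3 * x0 * x1."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def smooth_grad(x, num_samples=2000, stdev=0.1, seed=0):
    """Average the gradient over Gaussian-perturbed copies of the input."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = [0.0] * len(x)
    for _ in range(num_samples):
        noisy = [xi + rng.gauss(0.0, stdev) for xi in x]
        for i, g in enumerate(grad_f(noisy)):
            totals[i] += g
    return [t / num_samples for t in totals]


x = [1.0, 2.0]
smoothed = smooth_grad(x)
print(smoothed, grad_f(x))
```

For this toy function the gradient is linear in the input, so the smoothed gradient stays close to the plain gradient; on a real network the averaging is what suppresses sharp local fluctuations.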

Calculate the gradients:

In [68]:
import numpy as np

interpretation = interpreter.saliency_interpret_from_json(inputs)
outputs = predictor.predict(sample_text)



In [69]:
outputs.keys()

dict_keys(['logits', 'probs', 'token_ids', 'label', 'tokens'])

In [70]:
grads = np.array(interpretation['instance_1']['grad_input_1'])
probs = np.array(outputs['probs'])

Let's plot the results:

In [71]:
from IPython.display import HTML
from eli5.formatters import format_as_html
from eli5_allennlp import get_explanation_from_grads

In [72]:
expl = get_explanation_from_grads(estimator_name="transformer",
                                  estimator_description="NLP transformer explanation",
                                  text=sample_text,
                                  tokens=outputs['tokens'],
                                  grads=grads,
                                  probas=probs,
                                  labels=list(labels.keys()))

In [73]:
HTML(format_as_html(expl))

Contribution?,Feature
1.0,Highlighted in text


In [74]:
expl = get_explanation_from_grads(estimator_name="transformer",
                                  estimator_description="NLP transformer explanation",
                                  text=sample_text,
                                  tokens=outputs['tokens'],
                                  grads=grads,
                                  probas=probs,
                                  labels=list(labels.keys()),
                                  force_weights=True)

In [75]:
HTML(format_as_html(expl, force_weights=True))

Weight?,Feature
0.028,[CLS]
0.028,not
0.028,buying
0.007,it
0.057,any
0.255,soon
0.071,##er
0.0,!
0.078,other
0.057,brands
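
The two methods rank the tokens of this sentence quite differently. The sketch below compares them using only the ten weights listed in the two tables above; the remaining tokens of the sentence are not listed there, so the comparison is partial. The helper top_tokens is hypothetical.

```python
tokens = ["[CLS]", "not", "buying", "it", "any", "soon", "##er", "!", "other", "brands"]
# Weights copied from the Integrated Gradients and Smooth Gradients tables.
integrated = [0.003, 0.299, 0.099, 0.025, 0.006, 0.017, 0.015, 0.029, 0.04, 0.022]
smooth = [0.028, 0.028, 0.028, 0.007, 0.057, 0.255, 0.071, 0.0, 0.078, 0.057]


def top_tokens(weights, k=3):
    """Return the k tokens with the largest weights, highest first."""
    ranked = sorted(zip(tokens, weights), key=lambda pair: pair[1], reverse=True)
    return [token for token, _ in ranked[:k]]


print("Integrated Gradients:", top_tokens(integrated))
print("Smooth Gradients:    ", top_tokens(smooth))
```

Among the listed tokens, Integrated Gradients puts most weight on "not" and "buying", while Smooth Gradients concentrates on "soon", so it is worth inspecting more than one method before drawing conclusions.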
