<img width="150" alt="Logo_ER10" src="https://user-images.githubusercontent.com/3244249/151994514-b584b984-a148-4ade-80ee-0f88b0aefa45.png">

### Interpreting a movie review sentiment model with RISE
This notebook demonstrates the use of DIANNA with the RISE method on the [Stanford Sentiment Treebank dataset](https://nlp.stanford.edu/sentiment/index.html), which contains one-sentence movie reviews. See also [their paper](https://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf). A pre-trained neural network classifier is used to identify whether a movie review is positive or negative.

RISE is short for Randomized Input Sampling for Explanation of Black-box Models. It estimates each word's relevance to the model's decision empirically by probing the model with randomly masked versions of the input text and obtaining the corresponding outputs.  

More details about this method can be found in the paper https://arxiv.org/abs/1806.07421.
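
The masking idea described above can be illustrated with a minimal sketch. This is not DIANNA's implementation: the function `rise_text_relevance`, the toy scorer `toy_score`, the placeholder `mask_token`, and the parameter defaults are all assumptions made for illustration. Each random mask keeps a token with probability `p_keep`, the model is scored on the masked sentence, and the masks are averaged with the scores as weights.

```python
import numpy as np

def rise_text_relevance(tokens, score_fn, n_masks=2000, p_keep=0.5,
                        mask_token="<masked>", seed=0):
    """Estimate per-token relevance by scoring randomly masked copies of the input."""
    rng = np.random.default_rng(seed)
    # Each row is one random mask: 1.0 keeps the token, 0.0 replaces it.
    masks = (rng.random((n_masks, len(tokens))) < p_keep).astype(float)
    scores = np.array([
        score_fn([tok if keep else mask_token for tok, keep in zip(tokens, mask)])
        for mask in masks
    ])
    # Score-weighted sum of the masks, normalised by the expected
    # number of times each token is kept.
    return scores @ masks / (n_masks * p_keep)

# Hypothetical stand-in for a sentiment model: the fraction of
# "positive" words that survive the masking.
POSITIVE_WORDS = {"delectable", "intriguing"}

def toy_score(tokens):
    return sum(tok in POSITIVE_WORDS for tok in tokens) / len(POSITIVE_WORDS)

tokens = "A delectable and intriguing thriller".split()
relevance = rise_text_relevance(tokens, toy_score)
for token, value in zip(tokens, relevance):
    print(f"{token:<12}{value:.2f}")
```

In this sketch, a token that does not influence the score still receives a relevance close to the average score over all masks, so informative words show up as values above that baseline rather than as the only non-zero entries.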

*NOTE*: This tutorial is still a work in progress. The final results need to be improved by tweaking the RISE parameters.

#### 1. Imports and paths

In [1]:
import os
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path
import spacy
from torchtext.vocab import Vectors
from scipy.special import expit as sigmoid

import dianna
from dianna import visualization
from dianna import utils
from dianna.utils.tokenizers import SpacyTokenizer

In [2]:
model_path = Path('models', 'movie_review_model.onnx')
word_vector_path = Path('data', 'movie_reviews_word_vectors.txt')
labels = ("negative", "positive")

#### 2. Loading the model

The classifier is stored in ONNX format. It accepts numerical tokens as input and outputs a single logit; after a sigmoid is applied, this becomes a score between 0 and 1, where 0 means the review is negative and 1 means it is positive.  
Here we define a class to run the model. It accepts a sentence (i.e. a string) as input instead and returns scores for two classes: negative and positive.

In [None]:
# ensure the tokenizer for english is available
spacy.cli.download('en_core_web_sm')

In [4]:
class MovieReviewsModelRunner:
    def __init__(self, model, word_vectors, max_filter_size):
        self.run_model = utils.get_function(model)
        self.vocab = Vectors(word_vectors, cache=os.path.dirname(word_vectors))
        self.max_filter_size = max_filter_size
        
        self.tokenizer = SpacyTokenizer(name='en_core_web_sm')

    def __call__(self, sentences):
        # ensure the input has a batch axis
        if isinstance(sentences, str):
            sentences = [sentences]

        output = []
        for sentence in sentences:
            # tokenize and pad to minimum length
            tokens = self.tokenizer.tokenize(sentence)
            if len(tokens) < self.max_filter_size:
                tokens += ['<pad>'] * (self.max_filter_size - len(tokens))
            
            # numericalize the tokens
            tokens_numerical = [self.vocab.stoi[token] if token in self.vocab.stoi else self.vocab.stoi['<unk>']
                                for token in tokens]

            # run the model, applying a sigmoid because the model outputs logits, remove any remaining batch axis
            pred = float(sigmoid(self.run_model([tokens_numerical])))
            output.append(pred)

        # output two classes
        positivity = np.array(output)
        negativity = 1 - positivity
        return np.transpose([negativity, positivity])
            

In [5]:
# define model runner. max_filter_size is a property of the model
model_runner = MovieReviewsModelRunner(model_path, word_vector_path, max_filter_size=5)
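
The last step of `__call__`, turning one logit per sentence into a two-column array of class scores, can be sketched in isolation. The helper name `logits_to_two_classes` and the example logits are made up for illustration, and the sigmoid is written out with NumPy instead of `scipy.special.expit` so the snippet has no extra dependencies.

```python
import numpy as np

def logits_to_two_classes(logits):
    """Map one logit per sentence to [negative, positive] scores."""
    # Sigmoid squashes each logit into the range (0, 1).
    positivity = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    negativity = 1.0 - positivity
    # Stack as columns: one row per sentence, one column per class.
    return np.transpose([negativity, positivity])

# Hypothetical logits for three sentences.
scores = logits_to_two_classes([-2.0, 0.0, 3.0])
print(scores)
```

The column order matches the `labels` tuple defined earlier, so `labels.index('positive')` selects the second column.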

#### 3. Applying RISE with DIANNA
The simplest way to run DIANNA on text data is with `dianna.explain_text`. The arguments are:
* The function that runs the model (a path to a model in ONNX format is also accepted)
* The text we want to explain
* The name of the explainable-AI method we want to use, here RISE
* The numerical indices of the classes we want an explanation for

For each requested class, `dianna.explain_text` returns a list of tuples. Each tuple contains a word, its location in the input text, and its relevance for that class.

In [6]:
review = "A delectable and intriguing thriller filled with surprises"

In [7]:
# An explanation is returned for each label, but we ask for just one label so the output is a list of length one.
explanation_relevances = dianna.explain_text(model_runner, review, model_runner.tokenizer, 'RISE',
                                             labels=[labels.index('positive')])[0]
explanation_relevances

Rise parameter p_keep was automatically determined at 0.2


Explaining: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:17<00:00,  1.72s/it]


[('A', 0, 0.7158780014514923),
 ('delectable', 1, 0.913871341049671),
 ('and', 2, 0.6892129376530648),
 ('intriguing', 3, 1.0620161551237106),
 ('thriller', 4, 0.840078490972519),
 ('filled', 5, 0.6051010835170746),
 ('with', 6, 0.6926153092086315),
 ('surprises', 7, 0.6697717276215553)]
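
Because the scores above are all positive and fairly close together, ranking them can make the output easier to read. The sketch below copies the tuples from the output above and sorts them by relevance. Subtracting the mean is an illustrative choice made here, not part of DIANNA or RISE, to show which words score above the average.

```python
# Relevance tuples copied from the output above: (word, position, relevance).
explanation = [
    ('A', 0, 0.7158780014514923),
    ('delectable', 1, 0.913871341049671),
    ('and', 2, 0.6892129376530648),
    ('intriguing', 3, 1.0620161551237106),
    ('thriller', 4, 0.840078490972519),
    ('filled', 5, 0.6051010835170746),
    ('with', 6, 0.6926153092086315),
    ('surprises', 7, 0.6697717276215553),
]

# Sort from most to least relevant.
ranked = sorted(explanation, key=lambda item: item[2], reverse=True)

# Difference from the mean relevance (illustrative only).
mean_relevance = sum(item[2] for item in explanation) / len(explanation)
centered = {word: relevance - mean_relevance for word, _, relevance in explanation}

for word, position, relevance in ranked:
    print(f"{word:<12}{position:>2}  {relevance:.2f}  {centered[word]:+.2f}")
```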

#### 4. Visualization
DIANNA includes a visualization package that can highlight each word of a text according to its relevance score. The visualization is in HTML format.
In this visualization, words in favour of the selected class are highlighted in red. Words against the selected class would be highlighted in blue, but none occur in this example.

In [8]:
visualization.highlight_text(explanation_relevances, model_runner.tokenizer.tokenize(review))

The visualization is not very clear, as all words seem relevant to the review's outcome. The numerical values above confirm that, according to RISE, every word contributes positively, with "intriguing" as the most important word at a score of 1.06.