# Quantus + NLP
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/understandable-machine-intelligence-lab/Quantus/main?labpath=tutorials%2FTutorial_NLP_Demonstration.ipynb)


This tutorial demonstrates how to use the library for robustness evaluation explanation of text classification models.
For this purpose, we use a pre-trained `Distilbert` model from [Huggingface](https://huggingface.co/models) and `GLUE/SST2` dataset [here](https://huggingface.co/datasets/sst2).

Author: Artem Sereda

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eWK9ebfMUVRG4mrOAQvXdJ452SMLfffv?usp=sharing)

In [35]:
import numpy as np
import pandas as pd
from datasets import load_dataset
import tensorflow as tf
import logging
from IPython.core.display import HTML
from quantus.nlp import (
    AvgSensitivity,
    MaxSensitivity,
    visualise_explanations_as_html,
    TFHuggingFaceTextClassifier,
    explain,
    normalize_sum_to_1,
    normalise_attributions,
)


# Suppress debug logs.
logging.getLogger("absl").setLevel(logging.WARNING)
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

## 1) Preliminaries

### 1.1 Load pre-trained model and tokenizer from [huggingface](https://huggingface.co/models) hub

In [36]:
model = TFHuggingFaceTextClassifier.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)

Some layers from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing TFDistilBertForSequenceClassification: ['dropout_19']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english and are newly initialized: ['dropout_39']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


### 1.2 Load test split of [GLUE/SST2](https://huggingface.co/datasets/sst2) dataset

In [37]:
BATCH_SIZE = 8
dataset = load_dataset("sst2")["test"]
x_batch = dataset["sentence"][:BATCH_SIZE]

Found cached dataset sst2 (/Users/artemsereda/.cache/huggingface/datasets/sst2/default/2.0.0/9896208a8d85db057ac50c72282bcb8fe755accc671a57dd8059d4e130961ed5)


  0%|          | 0/3 [00:00<?, ?it/s]

Run an example inference, and demonstrate models predictions.

In [38]:
CLASS_NAMES = ["negative", "positive"]


def decode_labels(y_batch: np.ndarray, class_names: [str]) -> [str]:
    """A helper function to map integer labels to human-readable class names."""
    return [class_names[i] for i in y_batch]


y_batch = model.predict(x_batch).argmax(axis=-1)

# Show the x, y data.
pd.DataFrame([x_batch, decode_labels(y_batch, CLASS_NAMES)]).T

2023-01-31 01:28:51.850370: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.


Unnamed: 0,0,1
0,uneasy mishmash of styles and genres .,negative
1,this film 's relationship to actual tension is...,negative
2,"by the end of no such thing the audience , lik...",positive
3,director rob marshall went out gunning to make...,positive
4,lathan and diggs have considerable personal ch...,positive
5,a well-made and often lovely depiction of the ...,positive
6,none of this violates the letter of behan 's b...,negative
7,although it bangs a very cliched drum at times...,positive


### 1.5 Visualise the explanations.

In [39]:
labels = list(
    map(lambda i: "Predicted label: " + i, decode_labels(y_batch, CLASS_NAMES))
)

In [40]:
# Visualise GradNorm.
a_batch_grad_norm = normalise_attributions(
    explain(model, x_batch, y_batch, method="GradNorm"), normalize_sum_to_1
)
HTML(
    visualise_explanations_as_html(
        a_batch_grad_norm, labels=labels, ignore_special_tokens=True
    )
)

In [41]:
# Visualise Integrated Gradients explanations.
a_batch_input_x_grad = normalise_attributions(
    explain(model, x_batch, y_batch, method="GradXInput"), normalize_sum_to_1
)
HTML(
    visualise_explanations_as_html(
        a_batch_input_x_grad, labels=labels, ignore_special_tokens=True
    )
)

## 2) Quantitative analysis using Quantus
For this example, we compute [Sensitivity](https://arxiv.org/abs/1901.09392) metric.

- Average Sensitivity captures the average change in explanations under slight perturbation.
- Maximum Sensitivity captures the maximal change in explanations under slight perturbation.

In [42]:
results = []
for metric in (AvgSensitivity, MaxSensitivity):
    metric_scores = []
    for xai_method in ("GradNorm", "GradXInput"):
        metric_instance = metric(
            display_progressbar=True,
            disable_warnings=True,
            nr_samples=10,
            normalise=True,
        )
        scores = metric_instance(
            model=model,
            x_batch=x_batch,
            y_batch=y_batch,
            a_batch=a_batch_grad_norm,
            explain_func_kwargs={"method": xai_method},
        )
        metric_scores.append(scores)
    results.append(metric_scores)

pd.DataFrame(
    np.asarray(results).mean(axis=-1),
    columns=["Gradient Norm", "Input X Gradient"],
    index=["Average Sensitivity", "Max Sensitivity"],
)

  0%|          | 0/1 [00:00<?, ?it/s]



  0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/1 [00:00<?, ?it/s]

Unnamed: 0,Gradient Norm,Input X Gradient
Average Sensitivity,0.007996,0.035785
Max Sensitivity,0.014371,0.043779
