# Tutorial: Zero-Shot Text Classification

In this short tutorial, we show how to use *ferret* to apply and evaluate different explainability approaches on the task of Zero-Shot Text Classification.

We will use `MoritzLaurer/mDeBERTa-v3-base-mnli-xnli` as the model checkpoint.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from ferret import (
    Benchmark,
    GradientExplainer,
    IntegratedGradientExplainer,
    LIMEExplainer,
)

device = "cuda:0" if torch.cuda.is_available() else "cpu"

In [3]:
model_name = "MoritzLaurer/mDeBERTa-v3-base-mnli-xnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)

The sentencepiece tokenizer that you are converting to a fast tokenizer uses the byte fallback option which is not implemented in the fast tokenizers. In practice this means that the fast version of the tokenizer can produce unknown tokens whereas the sentencepiece version would have converted these unknown tokens into a sequence of byte tokens matching the original piece of text.


In [4]:
ig = IntegratedGradientExplainer(model, tokenizer, multiply_by_inputs=True)
g = GradientExplainer(model, tokenizer, multiply_by_inputs=True)
l = LIMEExplainer(model, tokenizer)

No helper provided. Using default 'text-classification' helper.


In [5]:
bench = Benchmark(
    model, tokenizer, task_name="zero-shot-text-classification", explainers=[ig, g, l]
)

Overriding helper for explainer <ferret.explainers.gradient.IntegratedGradientExplainer object at 0x2ab4f4ee0>
Overriding helper for explainer <ferret.explainers.gradient.GradientExplainer object at 0x1079faf80>
Overriding helper for explainer <ferret.explainers.lime.LIMEExplainer object at 0x1079f80d0>


In [6]:
sequence_to_classify = (
    "Amanda ha cucinato la più buona torta pecan che abbia mai provato!"
)
candidate_labels = ["politics", "economy", "bakery"]
sample = (sequence_to_classify, candidate_labels)

In [7]:
sample

('Amanda ha cucinato la più buona torta pecan che abbia mai provato!',
 ['politics', 'economy', 'bakery'])

In [14]:
# get the prediction from our model
bench.score(sample[0], options=candidate_labels, return_probs=True)

{'politics': 0.2317681610584259,
 'economy': 0.23227249085903168,
 'bakery': 0.5359593629837036}
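
The probabilities above sum to one across the candidate labels. A common way to obtain such scores with an NLI model is to pair the input with one hypothesis per label, take the entailment logit for each pair, and apply a softmax over the labels. The sketch below illustrates that idea only; it is an assumption about the general recipe, not ferret's actual implementation. The template `"This is {}"` is inferred from the tokens shown in the explanation table later in this tutorial, the helper names `build_hypotheses` and `zero_shot_probs` are hypothetical, and the logits are made-up numbers rather than model outputs.

```python
import math


def build_hypotheses(labels, template="This is {}"):
    """Map each candidate label to an NLI hypothesis (template is an assumption)."""
    return {label: template.format(label) for label in labels}


def zero_shot_probs(entailment_logits):
    """Softmax over per-label entailment logits, so the scores sum to 1."""
    max_logit = max(entailment_logits.values())  # subtract max for numerical stability
    exps = {label: math.exp(v - max_logit) for label, v in entailment_logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


hypotheses = build_hypotheses(["politics", "economy", "bakery"])

# Hypothetical entailment logits, NOT produced by the model in this tutorial.
toy_logits = {"politics": 0.0, "economy": 0.0, "bakery": 1.0}
probs = zero_shot_probs(toy_logits)

print(hypotheses["bakery"])
print(max(probs, key=probs.get))
```

With a real model, each logit would come from a forward pass on the (input, hypothesis) pair, reading the logit at the index of the entailment class.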

In [15]:
# explain the entailment class for the "bakery" option
exp = bench.explain(sample[0], target="entailment", target_option="bakery")

In [16]:
# show explanations
bench.show_table(exp)

Explainer,[CLS],▁_0,Amanda,▁ha,▁_1,cucina,to_0,▁la,▁p,iù,▁buon,a_0,▁tort,a_1,▁pe,can,▁che,▁_2,abbia,▁mai,▁prova,to_1,!,[SEP]_0,▁This,▁is,▁_3,baker,y,[SEP]_1
Integrated Gradient (x Input),0.0,0.96,0.37,-0.22,-0.14,-0.11,0.03,0.08,-0.03,-0.1,-0.09,-0.09,0.01,0.11,-0.16,-0.11,-0.16,1.05,-0.29,0.06,-0.03,0.03,-0.3,-0.1,-1.48,-0.51,0.02,0.99,-0.37,0.0
Gradient (x Input),-0.0,-0.0,-0.0,-0.0,-0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,-0.0,-0.0,0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,-0.0,-0.0,-0.0,-0.0,0.0,0.0,0.0,-0.0
LIME,0.0,-0.01,0.01,-0.03,-0.0,-0.1,0.08,0.02,0.01,0.03,0.0,0.01,0.25,0.14,0.04,0.01,-0.02,0.01,0.01,0.01,0.01,0.04,0.06,0.0,-0.05,-0.1,-0.05,0.01,-0.03,0.0
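
With thirty token columns, the table is easier to read once the tokens are ranked by score. The following is a minimal sketch that copies the token names and the LIME row from the table above and lists the highest-scoring tokens. The helper `top_tokens` is hypothetical and not part of ferret.

```python
# Token names and LIME scores copied from the table above.
tokens = [
    "[CLS]", "▁_0", "Amanda", "▁ha", "▁_1", "cucina", "to_0", "▁la", "▁p", "iù",
    "▁buon", "a_0", "▁tort", "a_1", "▁pe", "can", "▁che", "▁_2", "abbia", "▁mai",
    "▁prova", "to_1", "!", "[SEP]_0", "▁This", "▁is", "▁_3", "baker", "y", "[SEP]_1",
]
lime_scores = [
    0.0, -0.01, 0.01, -0.03, -0.0, -0.1, 0.08, 0.02, 0.01, 0.03,
    0.0, 0.01, 0.25, 0.14, 0.04, 0.01, -0.02, 0.01, 0.01, 0.01,
    0.01, 0.04, 0.06, 0.0, -0.05, -0.1, -0.05, 0.01, -0.03, 0.0,
]


def top_tokens(tokens, scores, k=2):
    """Return the k (token, score) pairs with the highest attribution scores."""
    ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


top = top_tokens(tokens, lime_scores, k=2)
print(top)
```

For LIME, the two highest-scoring pieces are the subwords of "torta", which matches the intuition that the cake is what ties the sentence to the "bakery" label.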


In [17]:
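# evaluate explanations and show faithfulness metrics
# (note: target here is "contradiction", while the explanations above were computed for "entailment")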
bench.show_evaluation_table(bench.evaluate_explanations(exp, target="contradiction"))

Explainer,aopc_compr,aopc_suff,taucorr_loo
Integrated Gradient (x Input),0.48,0.61,0.06
Gradient (x Input),0.6,0.8,-0.07
LIME,0.77,0.77,0.27
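
The column names refer to AOPC comprehensiveness (`aopc_compr`), AOPC sufficiency (`aopc_suff`), and the Kendall tau correlation with leave-one-out scores (`taucorr_loo`). To give an intuition for the first one, the sketch below computes a comprehensiveness-style score on a toy example: remove the top-scoring tokens at several thresholds, measure how much the predicted probability drops, and average the drops. This illustrates the general idea under stated assumptions and is not ferret's implementation; `toy_prob`, the token weights, the scores, and the thresholds are all hypothetical.

```python
def toy_prob(tokens):
    """Hypothetical stand-in for a model: probability is the sum of token weights."""
    weights = {"torta": 0.5, "pecan": 0.3, "buona": 0.1, "la": 0.1}
    return sum(weights.get(t, 0.0) for t in tokens)


def aopc_comprehensiveness(tokens, scores, predict, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Average probability drop after removing the top-scoring tokens at each fraction."""
    full = predict(tokens)
    # Token indices ordered from most to least important according to the explanation.
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    drops = []
    for frac in fractions:
        k = max(1, round(frac * len(tokens)))  # number of tokens to remove
        removed = set(order[:k])
        kept = [t for i, t in enumerate(tokens) if i not in removed]
        drops.append(full - predict(kept))
    return sum(drops) / len(drops)


tokens = ["la", "buona", "torta", "pecan"]
scores = [0.0, 0.1, 0.6, 0.3]  # hypothetical attribution scores

aopc = aopc_comprehensiveness(tokens, scores, toy_prob)
print(round(aopc, 2))  # 0.8 for this toy setup
```

A higher value means that removing the tokens the explainer ranks as important hurts the prediction more, which is the behavior expected of a faithful explanation. Sufficiency is the mirror image: keep only the top-scoring tokens and measure how much of the original probability is lost, so lower is better.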
