# Introduction to *ferret*

Hi there! This notebook will guide you through the basic functionalities of *ferret*, using as an example the Sentiment Classification task.

Specifically, you will see how to:

- load a model from the Hugging Face Hub into our `Benchmark` client interface;
- use the class to explain a text query using all the supported post-hoc feature attribution methods;
- visualize the explanations in tabular format;
- **evaluate** all the explanations over the metrics (faithfulness and plausibility).

Scroll down to learn more 😉

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from ferret import Benchmark
import numpy as np

For the purpose of this tutorial, we will use the sentiment classification model `cardiffnlp/twitter-xlm-roberta-base-sentiment`.

In [None]:
name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

## Explain a single instance

The fastest way to get started with *ferret* is using the `Benchmark` interface class.

In [None]:
bench = Benchmark(model, tokenizer)

Extracting post-hoc explanations with all the supported methods and standard parameters is as easy as:

In [None]:
explanations = bench.explain("I love your style!", target=2)

In [None]:
explanations

Let's visualize the results.

In [None]:
t = bench.show_table(explanations)
t

## Evaluate explanation of a single instance

Evaluating explanations with all the supported evaluators is straightforward. Remember to set the `target` parameter to the same value used when computing the explanations!

In [None]:
explanation_evaluations = bench.evaluate_explanations(explanations, target=2)

Again, we can look at the results in a tabular format.

In [None]:
bench.show_evaluation_table(explanation_evaluations)

Area Over the Perturbation Curve (AOPC) Comprehensiveness (aopc_compr), AOPC Sufficiency (aopc_suff) and Correlation with Leave-One-Out scores (taucorr_loo) are three measures of faithfulness.

**AOPC Comprehensiveness**. Comprehensiveness measures the drop in the model probability when the tokens that the explanation marks as relevant are removed. We measure comprehensiveness via the Area Over the Perturbation Curve by progressively removing the $k$ most important tokens, with $k$ ranging from 1 to the number of tokens (by default), and then averaging the results. The higher the value, the better the explainer is at selecting the tokens that are relevant for the prediction.
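To make the definition concrete, here is a minimal sketch of the computation on a toy example. It is not *ferret*'s implementation: `aopc_comprehensiveness` and `toy_predict_proba` are hypothetical names, and the toy "model" is a hand-written function standing in for a real classifier.

```python
from typing import Callable, List, Sequence


def aopc_comprehensiveness(
    tokens: Sequence[str],
    scores: Sequence[float],
    predict_proba: Callable[[List[str]], float],
) -> float:
    """Average probability drop after removing the top-k tokens, for k = 1..len(tokens)."""
    # Token indices sorted from most to least important.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    full_prob = predict_proba(list(tokens))
    drops = []
    for k in range(1, len(tokens) + 1):
        removed = set(ranked[:k])
        reduced = [tok for i, tok in enumerate(tokens) if i not in removed]
        drops.append(full_prob - predict_proba(reduced))
    return sum(drops) / len(drops)


# Hypothetical stand-in for a classifier: the probability of the target
# class depends only on two cue words.
def toy_predict_proba(tokens: List[str]) -> float:
    weights = {"love": 0.6, "style": 0.2}
    return 0.1 + sum(weights.get(tok, 0.0) for tok in tokens)


tokens = ["I", "love", "your", "style", "!"]
good_scores = [0.0, 0.9, 0.1, 0.5, 0.0]  # ranks "love" and "style" first
poor_scores = [0.9, 0.0, 0.5, 0.1, 0.0]  # ranks "I" and "your" first

good_aopc = aopc_comprehensiveness(tokens, good_scores, toy_predict_proba)
poor_aopc = aopc_comprehensiveness(tokens, poor_scores, toy_predict_proba)
```

In this toy setting, the explanation that ranks the cue words first obtains the higher score, which is the behavior the metric is meant to reward.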

**AOPC Sufficiency**. Sufficiency captures whether the tokens in the explanation are sufficient for the model to make the prediction. As for comprehensiveness, we use the AOPC score.
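A sufficiency score can be sketched in the same spirit by keeping only the top-$k$ tokens instead of removing them. The snippet below is an illustration under assumptions, not *ferret*'s code: the function names are hypothetical, the toy "model" is hand-written, and the score is taken as the gap between the full-input probability and the probability on the kept tokens (the ERASER-style formulation, where a smaller gap means a more sufficient explanation).

```python
from typing import Callable, List, Sequence


def aopc_sufficiency(
    tokens: Sequence[str],
    scores: Sequence[float],
    predict_proba: Callable[[List[str]], float],
) -> float:
    """Average gap between the full-input probability and the probability
    obtained from the top-k tokens alone, for k = 1..len(tokens)."""
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    full_prob = predict_proba(list(tokens))
    gaps = []
    for k in range(1, len(tokens) + 1):
        kept = set(ranked[:k])
        only_top_k = [tok for i, tok in enumerate(tokens) if i in kept]
        gaps.append(full_prob - predict_proba(only_top_k))
    return sum(gaps) / len(gaps)


# Hypothetical stand-in for a classifier: the probability of the target
# class depends only on two cue words.
def toy_predict_proba(tokens: List[str]) -> float:
    weights = {"love": 0.6, "style": 0.2}
    return 0.1 + sum(weights.get(tok, 0.0) for tok in tokens)


tokens = ["I", "love", "your", "style", "!"]
good_scores = [0.0, 0.9, 0.1, 0.5, 0.0]  # ranks "love" and "style" first
poor_scores = [0.9, 0.0, 0.5, 0.1, 0.0]  # ranks "I" and "your" first

good_suff = aopc_sufficiency(tokens, good_scores, toy_predict_proba)
poor_suff = aopc_sufficiency(tokens, poor_scores, toy_predict_proba)
```

Here the explanation that ranks the cue words first yields the smaller gap, because the model recovers its original probability from very few tokens.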

**Correlation with Leave-One-Out scores**. We first compute the leave-one-out scores, i.e., the difference in the prediction when one feature at a time is omitted. We then measure the Spearman correlation between these scores and the explanation.
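The two steps can be sketched in plain Python as follows. This is an illustration under assumptions rather than *ferret*'s implementation: all function names are hypothetical, the toy "model" is hand-written, and the rank correlation follows the Spearman definition mentioned in the text (Pearson correlation of the ranks, with ties sharing their mean rank).

```python
from typing import Callable, List, Sequence


def leave_one_out_scores(
    tokens: Sequence[str], predict_proba: Callable[[List[str]], float]
) -> List[float]:
    """Probability drop observed when each token is omitted in turn."""
    full_prob = predict_proba(list(tokens))
    return [
        full_prob - predict_proba([t for j, t in enumerate(tokens) if j != i])
        for i in range(len(tokens))
    ]


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ascending 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of the ranks (undefined if either input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical stand-in for a classifier driven by two cue words.
def toy_predict_proba(tokens: List[str]) -> float:
    weights = {"love": 0.6, "style": 0.2}
    return 0.1 + sum(weights.get(tok, 0.0) for tok in tokens)


tokens = ["I", "love", "your", "style", "!"]
loo = leave_one_out_scores(tokens, toy_predict_proba)
good_corr = spearman([0.0, 0.9, 0.1, 0.5, 0.0], loo)  # ranks cue words first
poor_corr = spearman([0.9, 0.0, 0.5, 0.1, 0.0], loo)  # ranks cue words last
```

An explanation whose ranking agrees with the leave-one-out effects gets a positive correlation, while one that ranks uninformative tokens first gets a negative one.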

### Plausibility

We can also specify a human rationale and evaluate plausibility.

In [None]:
explanation_evaluations = bench.evaluate_explanations(
    explanations,
    target=2,  # same target used when computing the explanations
    human_rationale=[0, 1, 0, 0, 0],
    top_k_rationale=1
)
bench.show_evaluation_table(explanation_evaluations)

Plausibility evaluates how well the explanations agree with the human rationale. We evaluate plausibility via the Area Under the Precision-Recall Curve (AUPRC) (auprc_plau), the token-level F1 score (token_f1_plau), and the average Intersection-Over-Union (IOU) at the token level (token_iou_plau).

**Area Under the Precision Recall curve (AUPRC)** is computed by sweeping a threshold over token scores.
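As an illustration of the threshold sweep, the sketch below computes average precision, one common estimate of the area under the precision-recall curve. Whether *ferret* uses this exact estimator is not confirmed here; `average_precision` is a hypothetical helper, and the scores are made up and assumed to contain no ties.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], gold: Sequence[int]) -> float:
    """Mean of the precision values measured at the rank of each gold token.

    Lowering the threshold one token at a time is equivalent to walking
    through the tokens from the highest to the lowest score.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if gold[i]:
            hits += 1
            precision_sum += hits / rank  # precision at this recall level
    return precision_sum / sum(gold)


human_rationale = [0, 1, 0, 0, 0]  # only the second token is marked by annotators

ap_good = average_precision([0.05, 0.80, 0.10, 0.30, 0.02], human_rationale)
ap_poor = average_precision([0.90, 0.05, 0.50, 0.10, 0.00], human_rationale)
```

The first explanation places the annotated token at rank 1, while the second one places it at rank 4, so the first one scores higher.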

The token-level F1 score and the average Intersection-Over-Union consider discrete rationales. We derive a discrete rationale by taking the top-$k$ values; in the example, $k$ is set to 1 via `top_k_rationale`.\*

**Token-level F1 score** is the F1 score derived from the token-level precision and recall.

**Intersection-Over-Union (IOU)** is the size of the overlap between the tokens covered by the explanation and those covered by the human rationale, divided by the size of their union.

\*When the set of human rationales for the dataset is available, $k$ is set to the average rationale length (as in ERASER).
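The discretization and the two token-level metrics can be sketched as follows. This is a simplified illustration, not *ferret*'s code: the helper names are hypothetical and the scores are made up.

```python
from typing import List, Sequence


def top_k_rationale(scores: Sequence[float], k: int) -> List[int]:
    """Binary mask that selects the k highest-scoring tokens."""
    top = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
    return [1 if i in top else 0 for i in range(len(scores))]


def token_f1(pred: Sequence[int], gold: Sequence[int]) -> float:
    """F1 score computed from token-level precision and recall."""
    true_pos = sum(1 for p, g in zip(pred, gold) if p and g)
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(pred)
    recall = true_pos / sum(gold)
    return 2 * precision * recall / (precision + recall)


def token_iou(pred: Sequence[int], gold: Sequence[int]) -> float:
    """Number of tokens in both rationales divided by the number in either."""
    inter = sum(1 for p, g in zip(pred, gold) if p and g)
    union = sum(1 for p, g in zip(pred, gold) if p or g)
    return inter / union if union else 0.0


scores = [0.05, 0.80, 0.10, 0.30, 0.02]  # made-up attribution scores
human_rationale = [0, 1, 0, 0, 0]

top1 = top_k_rationale(scores, k=1)
top2 = top_k_rationale(scores, k=2)
f1_top1, iou_top1 = token_f1(top1, human_rationale), token_iou(top1, human_rationale)
f1_top2, iou_top2 = token_f1(top2, human_rationale), token_iou(top2, human_rationale)
```

With $k = 1$ the discrete rationale matches the human one exactly; with $k = 2$ an extra token is selected, which lowers precision and therefore both scores.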

# Evaluating explainers on a supported XAI dataset

We can directly load a dataset with rationales using our Dataset API -- since we use Hugging Face's [datasets](https://huggingface.co/datasets), you will download the dataset just once and cache it 🚀

In [None]:
hatexdata = bench.load_dataset("hatexplain")

Here we show an example of text and its human rationales.

In [None]:
hatexdata[2]["text"], hatexdata[2]["rationale"]

We can compute and evaluate explanations for a subset of the dataset's samples.

By default, explanations and their evaluations are computed w.r.t. the predicted class. Alternatively, we can specify the target class via the `target` parameter.

In [None]:
# Compute and average the evaluation scores over samples of a supported dataset
samples = np.arange(1)  # indices of the samples to evaluate (here, only the first one)
sample_evaluations = bench.evaluate_samples(hatexdata, samples)

Finally, we can visualize the evaluation results.

In [None]:
bench.show_samples_evaluation_table(sample_evaluations)

# Bonus!

There is more! You can:

- use *ferret* built-in explainers to have fine-grained control over their *init* and *call* parameters (please refer to our [doc](https://ferret.readthedocs.io/en/latest/?version=latest) to know more)
- compute individual faithfulness and plausibility metrics over explanations

**Interface to individual explainers**

You can also use individual explainers through an object-oriented interface.

In [None]:
from ferret import SHAPExplainer, LIMEExplainer

In [None]:
exp = LIMEExplainer(model, tokenizer)
exp("hello my friend", target=1)

In [None]:
exp = SHAPExplainer(model, tokenizer)
exp("hello my friend", target=1)

In [None]:
exp = SHAPExplainer(model, tokenizer)
e = exp("I love your style!", target=0)

In [None]:
bench.show_table([e])

You can also compute an individual evaluation measure.

In [None]:
from ferret import AOPC_Comprehensiveness_Evaluation
# from ferret.evaluators import ModelHelper

aopc_compr_eval = AOPC_Comprehensiveness_Evaluation(
    model,
    tokenizer,
    'text-classification'  # Example task name: it's needed in the constructor.
)

In [None]:
aopc_compr_eval.compute_evaluation(e, target=0)

In [None]:
ev = bench.evaluate_explanation(e, target=0)
bench.show_evaluation_table([ev])