# Text Data Explanation Benchmarking: Machine Translation

This notebook demonstrates how to use the benchmark utility to evaluate an explainer on text data. Here we measure the explanation performance of the Partition explainer on a machine translation model, using the "keep positive" and "keep negative" metrics with the Text masker.

The benchmark utility builds on the new API: it wraps the user-supplied model in a MaskedModel and evaluates the model on masked versions of the inputs.
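
To make the idea of evaluating masked inputs concrete, here is a minimal sketch that does not use SHAP at all. The names `MASK_TOKEN`, `mask_tokens`, and `toy_model` are hypothetical, introduced only for illustration; the real Text masker and MaskedModel work on tokenizer output and are more involved.

```python
from typing import List, Sequence

MASK_TOKEN = "..."  # hypothetical placeholder; a real masker uses the tokenizer's own mask token


def mask_tokens(tokens: Sequence[str], keep: Sequence[bool],
                mask_token: str = MASK_TOKEN) -> List[str]:
    """Return a copy of `tokens` where every position with keep=False is masked."""
    if len(tokens) != len(keep):
        raise ValueError("tokens and keep must have the same length")
    return [tok if k else mask_token for tok, k in zip(tokens, keep)]


def toy_model(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in for a real model: counts the unmasked tokens."""
    return float(sum(tok != MASK_TOKEN for tok in tokens))


tokens = "the cat sat on the mat".split()
keep = [True, True, False, False, True, True]

masked = mask_tokens(tokens, keep)
score = toy_model(masked)
print(masked, score)
```

The benchmark repeats this kind of evaluation many times per input, each time with a different subset of tokens kept.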

In [None]:
import numpy as np
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nlp  # Hugging Face `nlp` package (since renamed to `datasets`)
import shap
import shap.benchmark as benchmark
import torch

### Load Data and Model

In [None]:
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-es")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-es")

In [None]:
dataset = nlp.load_dataset("xsum", split="train")

In [None]:
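# Take the first 10 summaries as the sentences to translate and explain.
# Slicing the rows first avoids reloading the whole column on every iteration.
s = dataset[:10]["summary"]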

### Create Explainer Object

In [None]:
explainer = shap.Explainer(model, tokenizer)

### Run SHAP Explanation

In [None]:
shap_values = explainer(s)

### Define Metrics (Sort Order & Perturbation Method)

In [None]:
sort_order = 'positive'
perturbation = 'keep'
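
The following is a rough sketch of what a "keep" perturbation with a given sort order might compute, not SHAP's actual implementation. It assumes that `'positive'` means features are added back starting from the most positive attribution and `'negative'` means starting from the most negative. The helper `keep_curve` and the linear `toy_model` are hypothetical.

```python
import numpy as np


def keep_curve(attributions, model_fn, sort_order="positive"):
    """Start fully masked, then reveal features one at a time in attribution order.

    Returns the fraction of features kept (xs) and the model output at each step (ys).
    """
    attributions = np.asarray(attributions, dtype=float)
    if sort_order == "positive":
        order = np.argsort(-attributions, kind="stable")  # most positive first
    elif sort_order == "negative":
        order = np.argsort(attributions, kind="stable")   # most negative first
    else:
        raise ValueError(f"unsupported sort_order: {sort_order!r}")

    n = len(attributions)
    keep = np.zeros(n, dtype=bool)        # nothing kept yet
    xs, ys = [0.0], [model_fn(keep)]
    for step, idx in enumerate(order, start=1):
        keep[idx] = True                  # reveal the next feature
        xs.append(step / n)
        ys.append(model_fn(keep))
    return np.array(xs), np.array(ys)


# Hypothetical linear model: the output is the sum of the weights of kept features,
# so the exact attributions are the weights themselves.
weights = np.array([2.0, -1.0, 0.5, -3.0])


def toy_model(keep):
    return float(weights[keep].sum())


xs_pos, ys_pos = keep_curve(weights, toy_model, "positive")
xs_neg, ys_neg = keep_curve(weights, toy_model, "negative")
print(ys_pos)
print(ys_neg)
```

With accurate attributions, the "keep positive" curve should rise quickly and the "keep negative" curve should fall quickly, which is why the two settings are used as complementary metrics.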

### Benchmark Explainer

In [None]:
sp = benchmark.perturbation.SequentialPerturbation(explainer.model, explainer.masker, sort_order, perturbation)
xs, ys, auc = sp.model_score(shap_values, s)
sp.plot(xs, ys, auc)

In [None]:
sort_order = 'negative'
perturbation = 'keep'

In [None]:
sp = benchmark.perturbation.SequentialPerturbation(explainer.model, explainer.masker, sort_order, perturbation)
xs, ys, auc = sp.model_score(shap_values, s)
sp.plot(xs, ys, auc)
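
To compare the two runs, each curve can be reduced to a single area-under-curve number. The sketch below applies the plain trapezoidal rule to made-up illustrative curves; the exact normalization behind the `auc` returned by `model_score` is not shown in this notebook, so treat `trapezoid_auc` as a hypothetical helper rather than the library's method.

```python
import numpy as np


def trapezoid_auc(xs, ys):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    widths = xs[1:] - xs[:-1]
    mean_heights = (ys[1:] + ys[:-1]) / 2.0
    return float(np.sum(widths * mean_heights))


# Made-up toy curves, for illustration only (not results from the model above).
xs_demo = [0.0, 0.25, 0.5, 0.75, 1.0]
ys_keep_positive = [0.0, 2.0, 2.5, 1.5, -1.5]
ys_keep_negative = [0.0, -3.0, -4.0, -3.5, -1.5]

auc_pos = trapezoid_auc(xs_demo, ys_keep_positive)
auc_neg = trapezoid_auc(xs_demo, ys_keep_negative)
print(auc_pos, auc_neg)
```

Under this reading, a larger area for "keep positive" and a smaller (more negative) area for "keep negative" both indicate that the attributions rank tokens in a way that matches the model's behavior.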