# Anchor explanations for movie sentiment

In this example, we explain why a logistic regression classifier assigns negative or positive sentiment to a given sentence. The classifier is trained on negative and positive movie reviews.

In [1]:
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import spacy
import string
from alibi.explainers import AnchorText
from alibi.datasets import fetch_movie_sentiment
from alibi.utils.download import spacy_model

%load_ext autoreload
%autoreload 2



### Load movie review dataset

The `fetch_movie_sentiment` function returns a `Bunch` object containing the features, the targets and the target names for the dataset.

In [2]:
movies = fetch_movie_sentiment()
movies.keys()

dict_keys(['data', 'target', 'target_names'])

In [3]:
data = movies.data
labels = movies.target
target_names = movies.target_names

Define shuffled training, validation and test sets:

In [4]:
train, test, train_labels, test_labels = train_test_split(data, labels, test_size=.2, random_state=42)
train, val, train_labels, val_labels = train_test_split(train, train_labels, test_size=.1, random_state=42)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)
val_labels = np.array(val_labels)
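
Because the second split is taken from what remains after the first, the validation set is 10% of the remaining 80%, not 10% of the full dataset. A minimal sketch of the resulting fractions (the variable names here are illustrative only):

```python
# Fractions implied by the two successive splits above (no data needed).
test_frac = 0.2                        # first split: test_size=.2
val_frac = (1 - test_frac) * 0.1       # second split: test_size=.1 of the remainder
train_frac = 1 - test_frac - val_frac  # whatever is left is used for training

print(f"train={train_frac:.2f}, val={val_frac:.2f}, test={test_frac:.2f}")
```

So roughly 72% of the reviews are used for training, 8% for validation and 20% for testing.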

### Apply CountVectorizer to training set

In [5]:
vectorizer = CountVectorizer(min_df=1)
vectorizer.fit(train)

CountVectorizer()
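
To make the representation concrete, here is a simplified pure-Python sketch of what a bag-of-words vectorizer does: build a vocabulary from the training text, then count token occurrences per document. It assumes lowercasing and scikit-learn's default token pattern (runs of two or more word characters); the helper names `fit_toy_vocabulary` and `toy_transform` are hypothetical, and the real `CountVectorizer` returns a sparse matrix rather than lists.

```python
import re
from collections import Counter

# scikit-learn's default token pattern: runs of two or more word characters
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def fit_toy_vocabulary(docs):
    """Map each distinct lowercased token to a column index (alphabetical order)."""
    tokens = {tok for doc in docs for tok in TOKEN_PATTERN.findall(doc.lower())}
    return {tok: idx for idx, tok in enumerate(sorted(tokens))}

def toy_transform(docs, vocabulary):
    """Return one dense count vector per document; unseen tokens are ignored."""
    rows = []
    for doc in docs:
        counts = Counter(TOKEN_PATTERN.findall(doc.lower()))
        rows.append([counts.get(tok, 0) for tok in vocabulary])
    return rows

toy_docs = ["a flashy but vapid exercise", "a moving , moving film"]
toy_vocab = fit_toy_vocabulary(toy_docs)
print(toy_vocab)
print(toy_transform(toy_docs, toy_vocab))
```

Note that single-character tokens such as `a` and punctuation never enter the vocabulary, and that a token absent from the training vocabulary contributes nothing to the feature vector.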

### Fit model

In [6]:
np.random.seed(0)
clf = LogisticRegression(solver='liblinear')
clf.fit(vectorizer.transform(train), train_labels)

LogisticRegression(solver='liblinear')

### Define prediction function

In [7]:
predict_fn = lambda x: clf.predict(vectorizer.transform(x))

### Make predictions on train, validation and test sets

In [8]:
preds_train = predict_fn(train)
preds_val = predict_fn(val)
preds_test = predict_fn(test)
print('Train accuracy', accuracy_score(train_labels, preds_train))
print('Validation accuracy', accuracy_score(val_labels, preds_val))
print('Test accuracy', accuracy_score(test_labels, preds_test))

Train accuracy 0.9801624284382905
Validation accuracy 0.7544910179640718
Test accuracy 0.7589841878294202


### Load spaCy model

We use `en_core_web_md`, an English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. It assigns word vectors, context-specific token vectors, POS tags, dependency parses and named entities.

In [9]:
model = 'en_core_web_md'
spacy_model(model=model)
nlp = spacy.load(model)

### Initialize anchor text explainer

In [10]:
explainer = AnchorText(nlp=nlp, predictor=predict_fn)

### Explain a prediction

In [11]:
class_names = movies.target_names

In [12]:
text = data[4]
print(text)

a visually flashy but narratively opaque and emotionally vapid exercise in style and mystification .


Prediction:

In [13]:
pred = class_names[predict_fn([text])[0]]
alternative = class_names[1 - predict_fn([text])[0]]
print('Prediction: %s' % pred)

Prediction: negative


Explanation:

In [14]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, sampling_method='unknown')

`sampling_method='unknown'` means we perturb examples by replacing words with UNKs. Let us now take a look at the anchor. The word 'flashy' basically guarantees a negative prediction.

In [15]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))

Anchor: flashy
Precision: 0.99

Examples where anchor applies and model predicts negative:
a UNK flashy UNK UNK opaque and emotionally vapid exercise in style UNK mystification .
a UNK flashy UNK UNK UNK and emotionally UNK exercise UNK UNK and UNK UNK
a UNK flashy UNK narratively opaque UNK UNK UNK exercise in style and UNK UNK
UNK visually flashy UNK narratively UNK and emotionally UNK UNK UNK UNK UNK mystification .
UNK UNK flashy UNK UNK opaque and emotionally UNK UNK in UNK and UNK .
a visually flashy but UNK UNK and UNK UNK UNK in style UNK mystification .
a visually flashy but UNK opaque UNK emotionally vapid UNK in UNK and mystification .
a UNK flashy but narratively UNK UNK emotionally vapid exercise in style UNK mystification UNK
a UNK flashy but narratively opaque UNK emotionally vapid exercise in style and mystification .
a visually flashy UNK UNK opaque UNK UNK UNK exercise in UNK UNK UNK .

Examples where anchor applies and model predicts positive:
UNK UNK flashy but narr
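
The precision reported above is the fraction of perturbed sentences that contain the anchor and still receive the original prediction. The sketch below illustrates that idea with plain Monte Carlo sampling. It is not the search procedure the library uses, and `toy_predict`, `NEGATIVE_CUES`, `perturb_unk` and `estimate_precision` are hypothetical stand-ins rather than part of alibi.

```python
import random

def perturb_unk(words, anchor_idx, sample_proba, rng):
    """Replace each non-anchor word with 'UNK' with probability sample_proba."""
    return [w if i in anchor_idx or rng.random() >= sample_proba else "UNK"
            for i, w in enumerate(words)]

def estimate_precision(words, anchor_idx, predict, n_samples=1000, sample_proba=0.5, seed=0):
    """Fraction of perturbed sentences that keep the original prediction."""
    rng = random.Random(seed)
    original = predict(words)
    same = sum(predict(perturb_unk(words, anchor_idx, sample_proba, rng)) == original
               for _ in range(n_samples))
    return same / n_samples

# Hypothetical stand-in classifier: negative (0) if any cue word is present, else positive (1)
NEGATIVE_CUES = {"flashy", "vapid"}

def toy_predict(words):
    return 0 if NEGATIVE_CUES & set(words) else 1

sentence = "a visually flashy but emotionally vapid exercise".split()
print(estimate_precision(sentence, anchor_idx={2}, predict=toy_predict))    # anchor: 'flashy'
print(estimate_precision(sentence, anchor_idx=set(), predict=toy_predict))  # empty anchor
```

With 'flashy' held fixed the toy classifier never changes its mind, whereas with an empty anchor it flips whenever both cue words happen to be replaced, so the estimated precision drops.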

### Changing the perturbation distribution
Let's try this with another perturbation distribution, namely one that replaces words by similar words instead of UNKs.

Explanation:

In [16]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, sampling_method='similarity', sample_proba=0.5)

The anchor now shows that we need more to guarantee the negative prediction:

In [17]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))

We can make the token perturbation distribution sample words that are more similar to the ground truth word via the `top_n` argument. Smaller values (default=100) should result in sentences that are more coherent and thus closer to the distribution of natural language, which could influence the returned anchor. Setting `use_probability_proba` to True makes the sampling distribution for perturbed tokens proportional to the similarity score between the possible perturbations and the original word. We can also put more weight on similar words via the `temperature` argument: lower values of `temperature` increase the sampling weight of more similar words.

The following example perturbs tokens in the original sentence with probability equal to `sample_proba`. The sampling distribution for the perturbed tokens is proportional to the similarity score between the ground truth word and each of the `top_n` words.

In [18]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, sampling_method='similarity', sample_proba=0.5,
                                use_unk=False, top_n=20, temperature=.2)

print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))

Anchor: emotionally
Precision: 0.95

Examples where anchor applies and model predicts negative:
a visually flashy but narratively opaque and emotionally vapid exercise arround style and mystification .
any visually flashy but culturally translucent and emotionally ludicrous exercise in style and mystification .
a visually flashy but functionally opaque and emotionally vapid isometric near design and mystification .
an visually blocky but stylistically opaque and emotionally vapid exercise in style and immaturity .
a visually snazzy but anatomically opaque and emotionally vacuous cardio in fashion and immorality .
another aesthetically cumbersome but narratively visible and emotionally vapid weightloss arround style and mystification .
a visually gaudy but narratively reflective and emotionally vapid workout in sass and mystification .
another visually flashy but stylistically opaque and emotionally boilerplate training around style and mystification .
a graphically flashy but strikingl
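
To see why a lower `temperature` favours the most similar replacements, the sketch below turns similarity scores into sampling weights. It assumes a softmax-style weighting, which may not match the formula the library applies internally; the candidate words and scores are made up for illustration.

```python
import math

def sampling_weights(similarities, temperature):
    """Softmax of similarity / temperature: lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical similarity scores of three candidate replacements for one word
candidates = ["gaudy", "snazzy", "blocky"]
similarities = [0.9, 0.7, 0.4]

for temperature in (1.0, 0.2):
    weights = sampling_weights(similarities, temperature)
    print(temperature, dict(zip(candidates, (round(w, 3) for w in weights))))
```

At `temperature=1.0` the three candidates receive fairly even weights, while at `temperature=0.2` most of the probability mass moves to the candidate with the highest similarity.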

### Language models

In [19]:
from alibi.utils.lang_model import DistilbertBaseUncased, BertBaseUncased, RobertaBase

In [20]:
# initialize model
# model = DistilbertBaseUncased()
# model = BertBaseUncased()
model = RobertaBase()

# initialize explainer
explainer = AnchorText(language_model=model, predictor=predict_fn)

In [21]:
text = "This is an excellent movie. I highly recommend it. Glad I watched it!"
pred = class_names[predict_fn([text])[0]]
alternative = class_names[1 - predict_fn([text])[0]]
print('Prediction: %s' % pred)

Prediction: positive


In [26]:
np.random.seed(0)
explanation = explainer.explain(
    text, 
    threshold=0.95, 
    sampling_method="language_model",
    filling_method="parallel",
    sample_proba=0.5,
    stopwords=['It', 'in', 'the', 'a', 'and', 'i', 'an'],
    punctuation=string.punctuation,
    top_n=50,
    prec_mask_templates=0.1,
    temperature=1.0,
    use_lm_proba=False,
)

print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))

Anchor:  highly AND  is
Precision: 0.96

Examples where anchor applies and model predicts positive:
This is an awesome movie. I highly value it. Wish I buy it!
This is an essential movie. I highly enjoyed it. Yes I admit it!
This is an exciting movie. I highly appreciate it. Really I hated it!
This is an okay movie. I highly advise it. No I wanted it!
This is an insane movie. I highly liked it. Love I want it!
This is an unforgettable movie. I highly loved it. … I ordered it!
This is an engaging movie. I highly underrated it. How I recommended it!
This is an awesome movie. I highly ordered it. Love I watched it!
This is an unbelievable movie. I highly fan it. Yes I recommended it!
This is an addictive movie. I highly respect it. Hope I know it!

Examples where anchor applies and model predicts negative:
This is an original movie. I highly doubt it. Overall I watched it!
This is an epic movie. I highly value it. Tonight I admit it!
This is an excellent movie. I highly admire it. God I w

In [27]:
explanation = explainer.explain(
    text, 
    threshold=0.95, 
    sampling_method="language_model",
    filling_method="autoregressive",
    sample_proba=0.5,
    stopwords=['It', 'in', 'the', 'a', 'and', 'i'],
    punctuation=string.punctuation,
    top_n=50,
    prec_mask_templates=1.0,
    temperature=1.0,
    use_lm_proba=False,
)

print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))

Anchor:  highly AND  an AND  Glad AND  is
Precision: 0.99

Examples where anchor applies and model predicts positive:
This is an unbelievable movie. I highly recommend it. Glad I tried it!
This is an excellent job. I highly highly it. Glad I watched it!
This is an excellent sequel. I highly promote it. Glad I watched it!
This is an informative movie. I highly loved it. Glad I watched it!
This is an enjoyable movie. I highly recommend it. Glad I ordered it!
This is an excellent review. I highly endorse it. Glad I appreciated it!
This is an enjoyable movie. I highly suggested it. Glad I chose it!
This is an excellent movie. I highly value it. Glad I watched it!
This is an extraordinary movie. I highly recommend it. Glad I finished it!
This is an excellent movie. I highly predict it. Glad I watched it!

Examples where anchor applies and model predicts negative:
This is an excellent series. I highly rate it. Glad I watched it!
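
The two runs above differ mainly in `filling_method`. As a rough intuition, and under the assumption that "parallel" fills all masked positions from a single pass over the masked sentence while "autoregressive" fills them one at a time using the words generated so far, the difference can be sketched with a toy next-word table. `NEXT_WORD`, `fill_parallel` and `fill_autoregressive` are hypothetical and stand in for a real masked language model.

```python
MASK = "<mask>"

# Hypothetical toy "language model": most likely word given the previous token
NEXT_WORD = {"I": "highly", "highly": "recommend", "recommend": "it", MASK: "movie"}

def fill_parallel(tokens):
    """Fill every mask in one pass; each prediction sees the original, still masked, context."""
    return [NEXT_WORD.get(tokens[i - 1], "UNK") if tok == MASK and i > 0 else tok
            for i, tok in enumerate(tokens)]

def fill_autoregressive(tokens):
    """Fill masks left to right; each prediction sees the words filled in so far."""
    filled = list(tokens)
    for i, tok in enumerate(filled):
        if tok == MASK and i > 0:
            filled[i] = NEXT_WORD.get(filled[i - 1], "UNK")
    return filled

masked = ["I", MASK, MASK, "it"]
print(fill_parallel(masked))        # second mask is predicted from a masked neighbour
print(fill_autoregressive(masked))  # second mask is predicted from the first filled word
```

In this toy setting the autoregressive variant produces the more coherent sentence because the second mask is conditioned on the word chosen for the first, at the cost of one prediction step per mask.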


In [32]:
test_text = "This was an excellent movie. I highly recommend it. Glad I watched it!"
predict_fn([test_text])

array([1])