In this example, we explain why a logistic regression model classifies a given sentence as having negative or positive sentiment.

In [1]:
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import spacy
from alibi.explainers import AnchorText
from alibi.datasets import movie_sentiment

### Load movie review dataset

The movie review dataset can be found here: https://www.kaggle.com/nltkdata/sentence-polarity#sentence_polarity.zip.
Extract the files into a folder of your choosing (the notebook assumes *'sentence_polarity'* in the current path).

In [2]:
data, labels = movie_sentiment(path='sentence_polarity')

### Define shuffled training and test sets

In [3]:
train, test, train_labels, test_labels = train_test_split(data, labels, test_size=.2, random_state=0)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)

### Apply CountVectorizer to training set

In [4]:
vectorizer = CountVectorizer(min_df=1)
vectorizer.fit(train)

CountVectorizer(analyzer='word', binary=False, decode_error='strict',
        dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
        lowercase=True, max_df=1.0, max_features=None, min_df=1,
        ngram_range=(1, 1), preprocessor=None, stop_words=None,
        strip_accents=None, token_pattern='(?u)\\b\\w\\w+\\b',
        tokenizer=None, vocabulary=None)
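
To make the printed settings concrete, here is a minimal pure-Python sketch of what the `lowercase=True` and `token_pattern` defaults shown above do to a sentence before the words are counted. It is an illustration only, not scikit-learn's implementation; `toy_tokenize` and `toy_counts` are hypothetical helper names introduced for this sketch.

```python
import re
from collections import Counter

# Same pattern as in the repr above: tokens of two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def toy_tokenize(text):
    """Lowercase the text, then extract tokens (hypothetical helper)."""
    return TOKEN_PATTERN.findall(text.lower())

def toy_counts(text):
    """Bag-of-words counts for a single sentence (hypothetical helper)."""
    return Counter(toy_tokenize(text))

tokens = toy_tokenize('This is a good book .')
print(tokens)
print(toy_counts('good good book'))
```

Under these defaults, single-character tokens such as 'a' and the trailing '.' do not appear among the extracted tokens, so they cannot contribute features to the classifier.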

### Fit model

In [5]:
clf = LogisticRegression()
clf.fit(vectorizer.transform(train), train_labels)



LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='warn',
          n_jobs=None, penalty='l2', random_state=None, solver='warn',
          tol=0.0001, verbose=0, warm_start=False)

### Define prediction function

In [6]:
def predict_fn(texts):
    return clf.predict(vectorizer.transform(texts))

### Make predictions on train and test sets

In [7]:
preds_train = predict_fn(train)
preds_test = predict_fn(test)
print('Train accuracy', accuracy_score(train_labels, preds_train))
print('Test accuracy', accuracy_score(test_labels, preds_test))

Train accuracy 0.976316098018525
Test accuracy 0.7585560243788092


### Load spaCy model

The `en_core_web_lg` model is an English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. It assigns word vectors, context-specific token vectors, POS tags, dependency parses and named entities.

Note: spaCy and the `en_core_web_lg` model must be installed. Run:

        pip install spacy && python -m spacy download en_core_web_lg

In [8]:
nlp = spacy.load('en_core_web_lg')

### Initialize anchor text explainer

In [9]:
explainer = AnchorText(nlp, predict_fn)

### Explain a prediction

In [10]:
class_names = ['negative', 'positive']

Prediction:

In [11]:
np.random.seed(0)
text = 'This is a good book .'
pred = class_names[predict_fn([text])[0]]
alternative = class_names[1 - predict_fn([text])[0]]  # the other class (labels are binary: 0 or 1)
print('Prediction: %s' % pred)

Prediction: positive


Explanation:

In [12]:
explanation = explainer.explain(text, threshold=0.95, use_proba=False, use_unk=True)

`use_unk=True` means that examples are perturbed by replacing words with the token UNK. Let us now take a look at the anchor. Under these perturbations, the word 'good' on its own is enough to guarantee a positive prediction. This is because replacing words with UNK never produces instances like 'not good'.
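
The idea of UNK perturbation and anchor precision can be sketched in a few lines of plain Python. This is a simplified illustration, not alibi's implementation: `toy_predict` is a hypothetical stand-in for the classifier, and the 50% replacement probability and the helper names `perturb_with_unk` and `estimate_precision` are assumptions made for this sketch.

```python
import random

def toy_predict(texts):
    """Hypothetical stand-in classifier: 1 (positive) if 'good' occurs, else 0."""
    return [1 if 'good' in t.split() else 0 for t in texts]

def perturb_with_unk(words, anchor_idx, rng, p_replace=0.5):
    """Keep anchor words fixed; replace each other word with UNK with probability p_replace."""
    return ' '.join(
        w if i in anchor_idx or rng.random() >= p_replace else 'UNK'
        for i, w in enumerate(words)
    )

def estimate_precision(text, anchor_idx, predict, n_samples=1000, seed=0):
    """Fraction of perturbed samples whose prediction matches that of the original text."""
    rng = random.Random(seed)
    words = text.split()
    target = predict([text])[0]
    samples = [perturb_with_unk(words, anchor_idx, rng) for _ in range(n_samples)]
    preds = predict(samples)
    return sum(p == target for p in preds) / n_samples

text = 'This is a good book .'
precision_good = estimate_precision(text, anchor_idx={3}, predict=toy_predict)
precision_book = estimate_precision(text, anchor_idx={4}, predict=toy_predict)
print('anchor "good":', precision_good)
print('anchor "book":', precision_book)
```

With this toy classifier, fixing 'good' keeps every perturbed sample positive, whereas fixing only 'book' lets 'good' be replaced in roughly half of the samples, so its estimated precision is much lower.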

In [13]:
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x[0] for x in explanation['raw']['examples'][0]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x[0] for x in explanation['raw']['examples'][0]['covered_false']]))

Anchor: good
Precision: 1.00

Examples where anchor applies and model predicts positive:
UNK UNK UNK good bo
UNK is a good book 
UNK is a good book 
UNK is UNK good boo
UNK UNK UNK good bo
UNK is a good book 
UNK is UNK good UNK
UNK UNK UNK good UN
This UNK a good UNK
This is a good UNK 

Examples where anchor applies and model predicts negative:



### Changing the perturbation distribution
Let's try this with another perturbation distribution, namely one that replaces words with similar words instead of UNKs.
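
As a rough illustration of similarity-based replacement, the sketch below draws substitutes for a word from a small candidate list. It assumes that sampling with `use_proba=True` means candidates are drawn with probability proportional to a similarity score; the scores in `TOY_SIMILAR` and the helper `sample_replacement` are hypothetical and do not come from the spaCy model or from alibi's code.

```python
import random

# Hypothetical similarity scores; in the notebook, similarities would be
# derived from the spaCy word vectors instead.
TOY_SIMILAR = {
    'book': [('book', 1.0), ('novel', 0.8), ('paper', 0.6), ('blog', 0.4)],
}

def sample_replacement(word, rng, use_proba=True):
    """Draw a replacement for `word` from its list of similar words."""
    candidates = TOY_SIMILAR.get(word, [(word, 1.0)])
    words = [w for w, _ in candidates]
    if use_proba:
        # Weight each candidate by its similarity score.
        weights = [s for _, s in candidates]
        return rng.choices(words, weights=weights, k=1)[0]
    # Otherwise pick uniformly among the candidates.
    return rng.choice(words)

rng = random.Random(0)
draws = [sample_replacement('book', rng) for _ in range(2000)]
freq = {w: draws.count(w) / len(draws) for w, _ in TOY_SIMILAR['book']}
print(freq)
```

Because replacements such as 'paper' or 'blog' can change the model's prediction, a perturbation distribution of this kind is a harder test for an anchor than UNK replacement.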

Explanation:

In [14]:
explanation = explainer.explain(text, threshold=0.95, use_proba=True, use_unk=False)

The anchor now needs more than the word 'good' to guarantee the positive prediction:

In [15]:
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x[0] for x in explanation['raw']['examples'][0]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x[0] for x in explanation['raw']['examples'][0]['covered_false']]))

Anchor: good AND book
Precision: 0.98

Examples where anchor applies and model predicts positive:
This knows a good website .
THis stays this good audiobook .
Any appears a good anthology .
This makes any good excerpt .
THis produces a good booklet .
Some believes another good poem .
WHAT seems a good tome .
This comes any good tale .
Another allows that good edition .
Every contains an good synopsis .

Examples where anchor applies and model predicts negative:
ANOTHER constitutes a good narrative .
THOSE requires some good hardcover .
This continues every good blog .
This appears another good reader .
Any comes this good presentation .
This belongs that good paper .
That feels another good text .
This deserves a good textbook .
THE makes some good paper .
THis believes this good idea .
