<a href="https://colab.research.google.com/github/victor-roris/mediumseries/blob/master/NLP/ModelInterpretability_ELI5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ELI5 - Explain Like I am 5

ELI5 provides both global and local model interpretation. 
Global interpretation works for scikit-learn, Keras, xgboost, LightGBM, CatBoost, lightning and sklearn-crfsuite. 
For local model interpretation, ELI5 uses the LIME algorithm. The display format differs from LIME's, but the idea is the same. 

DOCUMENTATION: https://eli5.readthedocs.io/en/latest/index.html

GITHUB: https://github.com/TeamHG-Memex/eli5

ADAPTED FROM: https://github.com/makcedward/nlp/blob/master/sample/nlp-model_interpretation.ipynb


## INSTALLATION

In [0]:
!pip install eli5

## EXAMPLE OF USAGE

In [14]:
import random
import pandas as pd
import numpy as np 
import IPython
import xgboost
import keras

import eli5
print('ELI5 Version:', eli5.__version__)
print('XGBoost Version:', xgboost.__version__)
print('Keras Version:', keras.__version__)

ELI5 Version: 0.10.1
XGBoost Version: 0.90
Keras Version: 2.2.5


### Train NLP models

We train a set of NLP models to test the ELI5 library.

#### Fetch data

We use the 20 Newsgroups dataset, which scikit-learn can download directly.

In [4]:
from sklearn.datasets import fetch_20newsgroups
train_raw_df = fetch_20newsgroups(subset='train')
test_raw_df = fetch_20newsgroups(subset='test')

Downloading 20news dataset. This may take a few minutes.
Downloading dataset from https://ndownloader.figshare.com/files/5975967 (14 MB)


In [8]:
print(f'Number of raw training examples: {len(train_raw_df.data)}')
print(f'Number of raw test examples: {len(test_raw_df.data)}')

Number of raw training examples: 11314
Number of raw test examples: 7532


In [17]:
category_names = np.unique(np.array(train_raw_df.target_names))
print(f'Number of different categories : {len(category_names)}')
print(f'Category list: {category_names}')

Number of different categories : 20
Category list: ['alt.atheism' 'comp.graphics' 'comp.os.ms-windows.misc'
 'comp.sys.ibm.pc.hardware' 'comp.sys.mac.hardware' 'comp.windows.x'
 'misc.forsale' 'rec.autos' 'rec.motorcycles' 'rec.sport.baseball'
 'rec.sport.hockey' 'sci.crypt' 'sci.electronics' 'sci.med' 'sci.space'
 'soc.religion.christian' 'talk.politics.guns' 'talk.politics.mideast'
 'talk.politics.misc' 'talk.religion.misc']


In [11]:
print('Example of entry:')
# Look up the category name through the numeric target of the first entry
print(f'\t - LABEL : {train_raw_df.target[0]} - {train_raw_df.target_names[train_raw_df.target[0]]}')
print(f'\t - {train_raw_df.data[0]}')

Example of entry:
	 - LABEL : 7 - rec.autos
	 - From: lerxst@wam.umd.edu (where's my thing)
Subject: WHAT car is this!?
Nntp-Posting-Host: rac3.wam.umd.edu
Organization: University of Maryland, College Park
Lines: 15

 I was wondering if anyone out there could enlighten me on this car I saw
the other day. It was a 2-door sports car, looked to be from the late 60s/
early 70s. It was called a Bricklin. The doors were really small. In addition,
the front bumper was separate from the rest of the body. This is 
all I know. If anyone can tellme a model name, engine specs, years
of production, where this car is made, history, or whatever info you
have on this funky looking car, please e-mail.

Thanks,
- IL
   ---- brought to you by your neighborhood Lerxst ----







#### Prepare data for the model

In [0]:
x_train = train_raw_df.data
y_train = train_raw_df.target

x_test = test_raw_df.data
y_test = test_raw_df.target

#### Training models

* **Models definition**

We are going to use 4 different models: Logistic Regression, Random Forest, XGBoost and a Keras model.

In [0]:
# TF-IDF vectorization of the text (sparse term weights, not a learned word embedding)
from sklearn.feature_extraction.text import TfidfVectorizer

# Pipeline execution to combine vectorization and model execution
from sklearn.pipeline import make_pipeline 

# Definition of generic models
from sklearn.linear_model import LogisticRegression  # Logistic Regression
from sklearn.ensemble import RandomForestClassifier  # Random Forest
from xgboost import XGBClassifier                    # XGBoost

# Definition of a custom model in Keras
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.base import BaseEstimator, TransformerMixin

from keras.models import Model, Input
from keras.layers import Dense, LSTM, Dropout, Embedding, SpatialDropout1D, Bidirectional, concatenate
from keras.layers import GlobalAveragePooling1D, GlobalMaxPooling1D
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

class KerasTextClassifier:
    __author__ = "Edward Ma"
    __copyright__ = "Copyright 2018, Edward Ma"
    __credits__ = ["Edward Ma"]
    __license__ = "Apache"
    __version__ = "2.0"
    __maintainer__ = "Edward Ma"
    __email__ = "makcedward@gmail.com"
    
    OOV_TOKEN = "UnknownUnknown"
    
    def __init__(self, 
                 max_word_input, word_cnt, word_embedding_dimension, labels, 
                 batch_size, epoch, validation_split,
                 verbose=0):
        self.verbose = verbose
        self.max_word_input = max_word_input
        self.word_cnt = word_cnt
        self.word_embedding_dimension = word_embedding_dimension
        self.labels = labels
        self.batch_size = batch_size
        self.epoch = epoch
        self.validation_split = validation_split
        
        self.label_encoder = None
        self.classes_ = None
        self.tokenizer = None
        
        self.model = self._init_model()
        self._init_label_encoder(y=labels)
        self._init_tokenizer()
        
    def _init_model(self):
        input_layer = Input((self.max_word_input,))
        text_embedding = Embedding(
            input_dim=self.word_cnt+2, output_dim=self.word_embedding_dimension,
            input_length=self.max_word_input, mask_zero=False)(input_layer)
        
        text_embedding = SpatialDropout1D(0.5)(text_embedding)
        
        bilstm = Bidirectional(LSTM(units=256, return_sequences=True, recurrent_dropout=0.5))(text_embedding)
        x = concatenate([GlobalAveragePooling1D()(bilstm), GlobalMaxPooling1D()(bilstm)])
        x = Dropout(0.5)(x)
        x = Dense(128, activation="relu")(x)
        x = Dropout(0.5)(x)
        
        output_layer = Dense(units=len(self.labels), activation="softmax")(x)
        model = Model(input_layer, output_layer)
        model.compile(
            optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
        return model
    
    def _init_tokenizer(self):
        self.tokenizer = Tokenizer(
            num_words=self.word_cnt+1, split=' ', oov_token=self.OOV_TOKEN)
    
    def _init_label_encoder(self, y):
        self.label_encoder = LabelEncoder()
        self.label_encoder.fit(y)
        self.classes_ = self.label_encoder.classes_
        
    def _encode_label(self, y):
        return self.label_encoder.transform(y)
        
    def _decode_label(self, y):
        return self.label_encoder.inverse_transform(y)
    
    def _get_sequences(self, texts):
        seqs = self.tokenizer.texts_to_sequences(texts)
        return pad_sequences(seqs, maxlen=self.max_word_input, value=0)
    
    def _preprocess(self, texts):
        # Placeholder only.
        return [text for text in texts]
        
    def _encode_feature(self, x):
        self.tokenizer.fit_on_texts(self._preprocess(x))
        self.tokenizer.word_index = {e: i for e,i in self.tokenizer.word_index.items() if i <= self.word_cnt}
        self.tokenizer.word_index[self.tokenizer.oov_token] = self.word_cnt + 1
        return self._get_sequences(self._preprocess(x))
        
    def fit(self, X, y):
        """
            Train the model, using X as features and y as labels.
        
            :param X: List of sentences
            :param y: List of labels
        """
        
        encoded_x = self._encode_feature(X)
        encoded_y = self._encode_label(y)
        
        self.model.fit(encoded_x, encoded_y, 
                       batch_size=self.batch_size, epochs=self.epoch, 
                       validation_split=self.validation_split)
        
    def predict_proba(self, X, y=None):
        encoded_x = self._get_sequences(self._preprocess(X))
        return self.model.predict(encoded_x)
    
    def predict(self, X, y=None):
        y_pred = np.argmax(self.predict_proba(X), axis=1)
        return self._decode_label(y_pred)

In [0]:
model_names = ['Logistic Regression', 'Random Forest', 'XGBoost Classifier', 'Keras']

* **Build models**

We are going to create an object for each of the models.

In [0]:
def build_model(names, x, y):
    pipelines = []
    vec = TfidfVectorizer()
    vec.fit(x)

    for name in names:
        print('train %s' % name)
        
        if name == 'Logistic Regression':
            estimator = LogisticRegression(solver='newton-cg', n_jobs=-1)
            pipeline = make_pipeline(vec, estimator)

        elif name == 'Random Forest':
            estimator = RandomForestClassifier(n_jobs=-1)
            pipeline = make_pipeline(vec, estimator)

        elif name == 'XGBoost Classifier':
            estimator = XGBClassifier()
            pipeline = make_pipeline(vec, estimator)
            
        elif name == 'Keras':
            pipeline = KerasTextClassifier(
                max_word_input=100, word_cnt=30000, word_embedding_dimension=100, 
                # Use the labels passed to build_model rather than the global y_train
                labels=list(set(y.tolist())), batch_size=128, epoch=1, validation_split=0.1)
        
        
        pipeline.fit(x, y)
        pipelines.append({
            'name': name,
            'pipeline': pipeline
        })
        
    return pipelines, vec

In [26]:
pipelines, vec = build_model(model_names, x_train, y_train)

train Logistic Regression
train Random Forest
train XGBoost Classifier
train Keras
Train on 10182 samples, validate on 1132 samples
Epoch 1/1

### ELI5 - Model interpretability

ELI5 provides both global model interpretation and local model interpretation. 





#### Global Interpretation

You can think of global model interpretation as feature importance, but it supports **not only tree-based algorithms** such as Random Forest and XGBoost; it also covers many other scikit-learn estimators, such as linear models.

The ELI5 authors call their model-agnostic approach to global interpretation **Permutation Importance**. To calculate the score, a feature (a word) is replaced by noise (another word) and the model predicts again. The idea is that a feature's importance can be deduced from how much the score decreases when that word is no longer available. For example, "I like apple" may be changed to "I like orange"; classifying the new record shows how important "apple" is. This assumes that the replacement word (here "orange") is noise and does not itself change the score much. Note that the cell below calls `eli5.show_weights`, which reports each fitted model's own weights or feature importances rather than running a permutation test.
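The permutation idea described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not ELI5's implementation: the helper names (`accuracy`, `permutation_importance`), the toy count features and the rule-based `toy_model` are all hypothetical. Each feature column is shuffled in turn, and the mean drop in accuracy is taken as that feature's importance.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows the model labels correctly."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only; every other column stays untouched
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

# Hypothetical toy data: feature 0 counts "apple", feature 1 counts "the" (noise)
rows = [[1, 3], [0, 2], [2, 0], [0, 5], [1, 1], [0, 0], [3, 4], [0, 1]]
labels = [1 if r[0] > 0 else 0 for r in rows]

def toy_model(row):
    """Hypothetical classifier that only looks at the "apple" count."""
    return 1 if row[0] > 0 else 0

baseline, importances = permutation_importance(toy_model, rows, labels, n_features=2)
print('baseline accuracy:', baseline)
print('importance per feature:', importances)
```

Shuffling the noise column never changes a prediction, so its importance is zero, while shuffling the "apple" column lowers accuracy.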


In [34]:
for pipeline in pipelines:
  
    print()
    print('Estimator: %s' % (pipeline['name']))
    # labels = pipeline['pipeline'].classes_.tolist()
    labels = category_names
    
    if pipeline['name'] in ['Logistic Regression', 'Random Forest']:
        estimator = pipeline['pipeline']

    elif pipeline['name'] == 'XGBoost Classifier':
        estimator = pipeline['pipeline'].steps[1][1].get_booster()
    
    # Not support Keras
    elif pipeline['name'] == 'Keras':
        # estimator = pipeline['pipeline']
        print('\t - Global Interpretation is not supported to Keras')
        continue

    else:
        continue
    
    IPython.display.display(
        eli5.show_weights(estimator=estimator, top=10, target_names=labels, vec=vec))


Estimator: Logistic Regression


Weight?,Feature
+4.858,keith
+4.175,atheism
+3.819,atheists
+3.043,caltech
+2.983,islamic
+2.722,islam
+2.692,okcforum
+2.665,mathew
+2.616,god
… 8135 more positive …,… 8135 more positive …

Weight?,Feature
+7.746,graphics
+4.409,image
+4.060,3d
+3.402,polygon
+3.288,tiff
+3.262,images
+3.093,cview
+2.772,files
+2.767,points
+2.669,format

Weight?,Feature
+12.026,windows
+4.202,file
+3.395,ax
+3.353,driver
+3.142,drivers
+3.040,files
+2.654,cica
+2.543,dos
+2.460,mouse
… 38121 more positive …,… 38121 more positive …

Weight?,Feature
+3.780,drive
+3.686,scsi
+3.578,card
+3.509,pc
+3.492,bus
+3.439,ide
+3.291,gateway
+3.199,controller
+2.768,monitor
… 8323 more positive …,… 8323 more positive …

Weight?,Feature
+8.315,mac
+6.742,apple
+4.110,quadra
+3.667,duo
+3.567,centris
+3.311,powerbook
+3.309,se
+3.091,lc
+2.761,lciii
… 7564 more positive …,… 7564 more positive …

Weight?,Feature
+6.204,window
+5.833,motif
+4.776,server
+4.107,widget
+4.043,mit
+3.759,xterm
+3.752,x11r5
+2.975,lcs
+2.743,xlib
+2.722,application

Weight?,Feature
+9.782,sale
+4.200,for
+3.886,shipping
+3.806,offer
+3.401,forsale
+3.105,sell
+2.878,00
+2.704,condition
… 8952 more positive …,… 8952 more positive …
… 121146 more negative …,… 121146 more negative …

Weight?,Feature
+10.633,car
+6.166,cars
+3.508,engine
+3.026,oil
+2.788,dealer
+2.616,ford
+2.443,toyota
+2.366,automotive
+2.262,auto
… 8595 more positive …,… 8595 more positive …

Weight?,Feature
+8.775,bike
+7.799,dod
+4.318,bikes
+4.108,motorcycle
+3.802,bmw
+3.722,ride
+3.532,riding
+3.020,helmet
+2.987,motorcycles
… 9032 more positive …,… 9032 more positive …

Weight?,Feature
+6.604,baseball
+4.255,he
+3.700,phillies
+3.225,players
+3.193,year
+3.170,cubs
+3.055,runs
+2.988,braves
+2.959,pitching
… 7542 more positive …,… 7542 more positive …

Weight?,Feature
+7.203,hockey
+4.754,nhl
+4.690,team
+4.448,game
+3.727,ca
+3.225,play
+2.887,playoff
+2.745,players
+2.698,leafs
… 8839 more positive …,… 8839 more positive …

Weight?,Feature
+6.962,clipper
+6.708,key
+5.555,encryption
+4.492,chip
+3.474,nsa
+3.175,pgp
+3.149,keys
+3.113,gtoal
+3.062,security
… 11050 more positive …,… 11050 more positive …

Weight?,Feature
+3.834,circuit
+3.339,electronics
+3.019,power
+2.897,radar
+2.861,voltage
+2.561,tv
+2.531,amp
+2.493,ground
+2.453,audio
… 8870 more positive …,… 8870 more positive …

Weight?,Feature
+4.782,msg
+4.163,doctor
+3.944,pitt
+3.703,disease
+3.244,medical
+3.030,geb
+2.971,banks
+2.930,gordon
+2.846,dyer
… 11611 more positive …,… 11611 more positive …

Weight?,Feature
+10.061,space
+4.212,nasa
+4.205,moon
+4.116,orbit
+3.361,launch
+3.346,henry
+3.139,alaska
+3.029,digex
+2.964,pat
… 10829 more positive …,… 10829 more positive …

Weight?,Feature
+5.027,god
+4.630,church
+4.199,rutgers
+3.964,christians
+3.493,athos
+3.360,christ
+3.117,christian
+2.757,christianity
+2.686,clh
… 9346 more positive …,… 9346 more positive …

Weight?,Feature
+7.998,gun
+5.487,guns
+3.540,waco
+3.519,fbi
+3.404,firearms
+3.398,weapons
+3.315,atf
+3.275,batf
+2.553,stratus
… 12291 more positive …,… 12291 more positive …

Weight?,Feature
+6.923,israel
+6.207,israeli
+3.950,turkish
+3.674,jews
+3.325,armenians
+3.183,armenian
+2.862,armenia
+2.805,arab
+2.792,serdar
… 10996 more positive …,… 10996 more positive …

Weight?,Feature
+4.088,clinton
+3.861,kaldis
+3.745,cramer
+3.451,tax
+3.390,optilink
+3.007,drugs
+2.912,gay
+2.853,president
+2.479,government
… 9743 more positive …,… 9743 more positive …

Weight?,Feature
+3.548,christian
+3.531,sandvik
+3.037,morality
+2.976,koresh
+2.471,jesus
+2.468,objective
+2.191,god
+2.113,hudson
+2.048,biblical
… 8520 more positive …,… 8520 more positive …



Estimator: Random Forest


Weight,Feature
0.0060  ± 0.0093,space
0.0059  ± 0.0121,baseball
0.0056  ± 0.0089,sale
0.0045  ± 0.0046,windows
0.0042  ± 0.0047,of
0.0039  ± 0.0094,bike
0.0035  ± 0.0080,graphics
0.0034  ± 0.0077,chip
0.0034  ± 0.0132,encryption
0.0032  ± 0.0080,israel



Estimator: XGBoost Classifier


Weight,Feature
0.0294,geb
0.0185,msg
0.0116,dod
0.0094,clipper
0.0090,cramer
0.0083,sale
0.0079,hendricks
0.0071,israel
0.0070,hockey
0.0063,armenia



Estimator: Keras
	 - Global Interpretation is not supported to Keras


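The tables above follow the order of the category list, so the first Logistic Regression table is y=0 (alt.atheism) and the seventh is y=6 (misc.forsale). A weight of +4.858 for "keith" means that, when the input includes "keith", the score of y=0 rises by 4.858 per unit of the word's TF-IDF value. Negative weights work the opposite way: "the" has a weight of -5.674 for y=6 (misc.forsale), which lowers that score, although negative weights do not appear among the top 10 rows shown here.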
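How one weight shifts a class score can be illustrated with a small linear-score sketch. The three weights are copied from the alt.atheism table above; the bias, the TF-IDF values and the helper `class_score` are made up for illustration and are not taken from the trained model.

```python
def class_score(weights, bias, features):
    """Linear score for one class: bias + sum(weight * feature value)."""
    return bias + sum(weights.get(term, 0.0) * value
                      for term, value in features.items())

# Weights copied from the alt.atheism table; everything else is hypothetical
atheism_weights = {"keith": 4.858, "atheism": 4.175, "god": 2.616}
bias = -4.0  # hypothetical bias term

doc_without = {"god": 0.2}               # hypothetical TF-IDF values
doc_with = {"god": 0.2, "keith": 0.5}    # same document plus "keith"

score_without = class_score(atheism_weights, bias, doc_without)
score_with = class_score(atheism_weights, bias, doc_with)
increase = score_with - score_without    # equals 4.858 * 0.5

print('score without "keith":', score_without)
print('score with "keith":', score_with)
print('increase:', increase)
```

The increase equals the weight multiplied by the feature value, so a word moves the score by the full 4.858 only if its TF-IDF value is 1.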

#### Local Interpretation

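For local model interpretation, ELI5 uses the LIME algorithm. The display format differs from LIME's, but the idea is the same. Note that `show_prediction`, used below, does not need LIME for estimators that ELI5 supports natively: it reads the contributions directly from the fitted model.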
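The intuition behind perturbation-based local explanations can be sketched with a leave-one-word-out loop. This is a deliberate simplification and not LIME itself, which samples many perturbed texts and fits a weighted linear surrogate model. The scorer `toy_score`, its weights and the sample sentence are hypothetical.

```python
def leave_one_out_contributions(score_fn, text):
    """Score change caused by deleting each word from the text in turn."""
    words = text.split()
    full_score = score_fn(" ".join(words))
    contributions = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        contributions.append((word, full_score - score_fn(reduced)))
    return contributions

def toy_score(text):
    """Hypothetical black-box score for one class (made-up weights)."""
    weights = {"israel": 0.6, "armenian": 0.3, "windows": -0.4}
    return sum(weights.get(w.lower(), 0.0) for w in text.split())

contribs = leave_one_out_contributions(toy_score, "Israel and the Armenian question")
for word, delta in contribs:
    print(f'{word}: {delta:+.3f}')
```

Words whose removal lowers the score receive a positive contribution; words the scorer ignores receive zero.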


* **Local Interpretation using show_prediction method**

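`eli5.show_prediction()` is a convenience wrapper: it invokes `explain_prediction`, formats the explanation as HTML, and displays it in an IPython cell. For each sample, the output below contains one contribution table per class, in the order of the category list.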

In [42]:
number_of_sample = 1
sample_ids = [random.randint(0, len(x_test) -1 ) for p in range(0, number_of_sample)]

for idx in sample_ids:
    print('Index: %d' % (idx))
    
    for pipeline in pipelines:
        print('-' * 50)
        print('Estimator: %s' % (pipeline['name']))
        
        print('True Label: %s, Predicted Label: %s' % (y_test[idx], pipeline['pipeline'].predict([x_test[idx]])[0]))
        # labels = pipeline['pipeline'].classes_.tolist()
        labels = category_names
  
        if pipeline['name'] in ['Logistic Regression', 'Random Forest']:
            estimator = pipeline['pipeline'].steps[1][1]
        elif pipeline['name'] == 'XGBoost Classifier':
            estimator = pipeline['pipeline'].steps[1][1].get_booster()
        # Keras is not supported: skip it so the previous estimator is not displayed again
        elif pipeline['name'] == 'Keras':
            # estimator = pipeline['pipeline'].model
            print('\t - Local Interpretation, "show_prediction" method doesnt support Keras')
            continue
        else:
            continue

        IPython.display.display(
            eli5.show_prediction(estimator, x_test[idx], top=10, vec=vec, target_names=labels))

Index: 5364
--------------------------------------------------
Estimator: Logistic Regression
True Label: 17, Predicted Label: 17


Contribution?,Feature
+0.645,Highlighted in text (sum)
… 42 more positive …,… 42 more positive …
… 42 more negative …,… 42 more negative …
-4.334,<BIAS>

Contribution?,Feature
… 22 more positive …,… 22 more positive …
… 62 more negative …,… 62 more negative …
-0.978,Highlighted in text (sum)
-2.584,<BIAS>

Contribution?,Feature
… 19 more positive …,… 19 more positive …
… 65 more negative …,… 65 more negative …
-1.163,Highlighted in text (sum)
-2.776,<BIAS>

Contribution?,Feature
… 27 more positive …,… 27 more positive …
… 57 more negative …,… 57 more negative …
-0.844,Highlighted in text (sum)
-2.949,<BIAS>

Contribution?,Feature
… 25 more positive …,… 25 more positive …
… 59 more negative …,… 59 more negative …
-0.746,Highlighted in text (sum)
-3.036,<BIAS>

Contribution?,Feature
… 15 more positive …,… 15 more positive …
… 69 more negative …,… 69 more negative …
-1.286,Highlighted in text (sum)
-2.488,<BIAS>

Contribution?,Feature
… 18 more positive …,… 18 more positive …
… 66 more negative …,… 66 more negative …
-1.667,Highlighted in text (sum)
-2.232,<BIAS>

Contribution?,Feature
… 31 more positive …,… 31 more positive …
… 53 more negative …,… 53 more negative …
-0.260,Highlighted in text (sum)
-3.216,<BIAS>

Contribution?,Feature
… 24 more positive …,… 24 more positive …
… 60 more negative …,… 60 more negative …
-0.336,Highlighted in text (sum)
-3.166,<BIAS>

Contribution?,Feature
… 22 more positive …,… 22 more positive …
… 62 more negative …,… 62 more negative …
-0.474,Highlighted in text (sum)
-3.012,<BIAS>

Contribution?,Feature
… 18 more positive …,… 18 more positive …
… 66 more negative …,… 66 more negative …
-0.519,Highlighted in text (sum)
-3.213,<BIAS>

Contribution?,Feature
… 41 more positive …,… 41 more positive …
… 43 more negative …,… 43 more negative …
-0.004,Highlighted in text (sum)
-4.107,<BIAS>

Contribution?,Feature
… 27 more positive …,… 27 more positive …
… 57 more negative …,… 57 more negative …
-0.444,Highlighted in text (sum)
-2.770,<BIAS>

Contribution?,Feature
… 24 more positive …,… 24 more positive …
… 60 more negative …,… 60 more negative …
-0.670,Highlighted in text (sum)
-3.130,<BIAS>

Contribution?,Feature
… 20 more positive …,… 20 more positive …
… 64 more negative …,… 64 more negative …
-0.083,Highlighted in text (sum)
-3.291,<BIAS>

Contribution?,Feature
+0.354,Highlighted in text (sum)
… 36 more positive …,… 36 more positive …
… 48 more negative …,… 48 more negative …
-3.875,<BIAS>

Contribution?,Feature
+0.676,Highlighted in text (sum)
… 40 more positive …,… 40 more positive …
… 44 more negative …,… 44 more negative …
-4.065,<BIAS>

Contribution?,Feature
+3.674,Highlighted in text (sum)
… 51 more positive …,… 51 more positive …
… 33 more negative …,… 33 more negative …
-3.881,<BIAS>

Contribution?,Feature
+0.607,Highlighted in text (sum)
… 35 more positive …,… 35 more positive …
… 49 more negative …,… 49 more negative …
-4.130,<BIAS>

Contribution?,Feature
+0.932,Highlighted in text (sum)
… 36 more positive …,… 36 more positive …
… 48 more negative …,… 48 more negative …
-4.061,<BIAS>


--------------------------------------------------
Estimator: Random Forest
True Label: 17, Predicted Label: 17


Contribution?,Feature
+0.043,<BIAS>
+0.013,Highlighted in text (sum)
+0.010,let
+0.008,genocide
… 434 more positive …,… 434 more positive …
… 89 more negative …,… 89 more negative …
-0.010,based
-0.017,robert
-0.033,form
-0.033,reality

Contribution?,Feature
+0.055,Highlighted in text (sum)
+0.052,<BIAS>
+0.050,polk
+0.017,reality
+0.013,405
+0.008,robert
… 430 more positive …,… 430 more positive …
… 47 more negative …,… 47 more negative …
-0.008,course

Contribution?,Feature
+0.052,<BIAS>
… 400 more positive …,… 400 more positive …
… 40 more negative …,… 40 more negative …
-0.005,want
-0.053,Highlighted in text (sum)

Contribution?,Feature
+0.052,<BIAS>
+0.027,405
+0.007,upgraded
… 412 more positive …,… 412 more positive …
… 60 more negative …,… 60 more negative …
-0.008,local
-0.010,trying
-0.056,Highlighted in text (sum)

Contribution?,Feature
+0.051,<BIAS>
+0.007,trying
+0.005,opinion
… 378 more positive …,… 378 more positive …
… 54 more negative …,… 54 more negative …
-0.004,Highlighted in text (sum)
-0.010,upgraded
-0.040,405

Contribution?,Feature
+0.052,<BIAS>
… 397 more positive …,… 397 more positive …
… 43 more negative …,… 43 more negative …
-0.004,6277
-0.049,Highlighted in text (sum)

Contribution?,Feature
+0.051,<BIAS>
… 417 more positive …,… 417 more positive …
… 38 more negative …,… 38 more negative …
-0.007,usenet
-0.007,thanks
-0.049,Highlighted in text (sum)

Contribution?,Feature
+0.052,<BIAS>
… 416 more positive …,… 416 more positive …
… 63 more negative …,… 63 more negative …
-0.006,very
-0.006,com
-0.006,warning
-0.014,hp
-0.023,Highlighted in text (sum)

Contribution?,Feature
+0.053,<BIAS>
… 361 more positive …,… 361 more positive …
… 45 more negative …,… 45 more negative …
-0.004,wanted
-0.007,ongoing
-0.046,Highlighted in text (sum)

Contribution?,Feature
+0.054,<BIAS>
… 410 more positive …,… 410 more positive …
… 42 more negative …,… 42 more negative …
-0.007,bob
-0.008,liberalizer
-0.009,baseball
-0.010,washington
-0.011,course
-0.013,opinion
-0.015,Highlighted in text (sum)

Contribution?,Feature
+0.053,<BIAS>
… 402 more positive …,… 402 more positive …
… 57 more negative …,… 57 more negative …
-0.004,rutgers
-0.004,game
-0.005,they
-0.047,Highlighted in text (sum)

Contribution?,Feature
+0.052,<BIAS>
… 388 more positive …,… 388 more positive …
… 60 more negative …,… 60 more negative …
-0.004,clipper
-0.004,chip
-0.004,key
-0.005,protections
-0.023,Highlighted in text (sum)

Contribution?,Feature
+0.053,<BIAS>
+0.017,reality
+0.008,robert
… 434 more positive …,… 434 more positive …
… 44 more negative …,… 44 more negative …
-0.012,company
-0.047,Highlighted in text (sum)
-0.050,polk

Contribution?,Feature
+0.053,<BIAS>
+0.009,company
+0.008,they
+0.007,want
+0.006,elijah
… 391 more positive …,… 391 more positive …
… 64 more negative …,… 64 more negative …
-0.015,Highlighted in text (sum)

Contribution?,Feature
+0.053,<BIAS>
… 411 more positive …,… 411 more positive …
… 61 more negative …,… 61 more negative …
-0.007,sf
-0.007,elijah
-0.007,spencer
-0.008,day
-0.010,boom
-0.022,Highlighted in text (sum)

Contribution?,Feature
+0.082,Highlighted in text (sum)
+0.053,<BIAS>
… 417 more positive …,… 417 more positive …
… 69 more negative …,… 69 more negative …
-0.009,rom
-0.009,one
-0.025,genocide

Contribution?,Feature
+0.095,Highlighted in text (sum)
+0.048,<BIAS>
+0.033,form
+0.017,genocide
… 410 more positive …,… 410 more positive …
… 84 more negative …,… 84 more negative …
-0.030,almost
-0.050,they

Contribution?,Feature
+0.209,Highlighted in text (sum)
+0.050,<BIAS>
+0.021,one
… 422 more positive …,… 422 more positive …
… 97 more negative …,… 97 more negative …

Contribution?,Feature
+0.081,Highlighted in text (sum)
+0.040,<BIAS>
… 414 more positive …,… 414 more positive …
… 75 more negative …,… 75 more negative …
-0.006,was
-0.007,country

Contribution?,Feature
+0.050,they
+0.033,<BIAS>
+0.030,almost
+0.029,Highlighted in text (sum)
… 430 more positive …,… 430 more positive …
… 81 more negative …,… 81 more negative …
-0.010,carried
-0.013,anybody
-0.020,let


--------------------------------------------------
Estimator: XGBoost Classifier
True Label: 17, Predicted Label: 17


Contribution?,Feature
+0.445,Highlighted in text (sum)
… 4 more positive …,… 4 more positive …
… 42 more negative …,… 42 more negative …
-0.061,islamic
-0.064,atheism
-0.092,keith
-0.149,atheists
-0.182,god
-0.202,<BIAS>

Contribution?,Feature
+0.045,was
… 3 more positive …,… 3 more positive …
… 52 more negative …,… 52 more negative …
-0.033,file
-0.037,files
-0.045,3d
-0.067,image
-0.236,graphics
-0.289,Highlighted in text (sum)

Contribution?,Feature
+0.038,they
… 4 more positive …,… 4 more positive …
… 44 more negative …,… 44 more negative …
-0.030,microsoft
-0.033,files
-0.042,driver
-0.043,win
-0.062,file
-0.186,Highlighted in text (sum)
-0.800,windows

Contribution?,Feature
+0.090,in
… 2 more positive …,… 2 more positive …
… 61 more negative …,… 61 more negative …
-0.041,dos
-0.042,thanks
-0.044,windows
-0.063,drive
-0.066,pc
-0.093,card
-0.289,Highlighted in text (sum)

Contribution?,Feature
+0.077,was
+0.036,com
… 34 more negative …,… 34 more negative …
-0.031,se
-0.037,thanks
-0.037,powerbook
-0.049,drive
-0.049,centris
-0.053,quadra
-0.176,apple

Contribution?,Feature
+0.072,was
+0.053,he
… 1 more positive …,… 1 more positive …
… 39 more negative …,… 39 more negative …
-0.052,widget
-0.069,server
-0.073,mit
-0.091,motif
-0.192,window
-0.366,Highlighted in text (sum)

Contribution?,Feature
… 6 more positive …,… 6 more positive …
… 36 more negative …,… 36 more negative …
-0.049,wanted
-0.056,forsale
-0.076,sell
-0.079,shipping
-0.109,<BIAS>
-0.625,sale
-1.211,Highlighted in text (sum)

Contribution?,Feature
… 43 more negative …,… 43 more negative …
-0.021,usa
-0.023,auto
-0.023,toyota
-0.029,dealer
-0.031,ford
-0.032,automotive
-0.037,warning
-0.042,engine
-0.153,cars

Contribution?,Feature
+0.072,Highlighted in text (sum)
… 1 more positive …,… 1 more positive …
… 33 more negative …,… 33 more negative …
-0.048,bmw
-0.059,riding
-0.060,<BIAS>
-0.060,motorcycle
-0.064,com
-0.065,ride
-0.071,bikes

Contribution?,Feature
… 2 more positive …,… 2 more positive …
… 52 more negative …,… 52 more negative …
-0.049,season
-0.052,players
-0.069,year
-0.074,game
-0.074,team
-0.137,he
-0.164,Highlighted in text (sum)
-0.228,baseball

Contribution?,Feature
… 2 more positive …,… 2 more positive …
… 29 more negative …,… 29 more negative …
-0.052,cup
-0.056,season
-0.057,playoff
-0.065,toronto
-0.077,go
-0.090,ca
-0.118,nhl
-0.135,game

Contribution?,Feature
… 29 more negative …,… 29 more negative …
-0.054,cryptography
-0.063,crypto
-0.064,nsa
-0.070,<BIAS>
-0.080,chip
-0.081,security
-0.091,tapped
-0.215,key
-0.228,encryption

Contribution?,Feature
+0.050,<BIAS>
+0.027,his
… 4 more positive …,… 4 more positive …
… 69 more negative …,… 69 more negative …
-0.022,an
-0.027,anyone
-0.047,electronics
-0.052,Highlighted in text (sum)
-0.056,power
-0.056,circuit

Contribution?,Feature
+0.107,Highlighted in text (sum)
… 2 more positive …,… 2 more positive …
… 47 more negative …,… 47 more negative …
-0.028,symptoms
-0.029,diet
-0.031,treatment
-0.044,medical
-0.045,health
-0.046,gordon
-0.064,pitt

Contribution?,Feature
+0.038,Highlighted in text (sum)
… 32 more negative …,… 32 more negative …
-0.031,spacecraft
-0.038,sci
-0.039,pat
-0.053,launch
-0.066,<BIAS>
-0.086,moon
-0.118,orbit
-0.129,nasa

Contribution?,Feature
+0.389,nntp
+0.282,Highlighted in text (sum)
+0.218,article
+0.087,distribution
… 8 more positive …,… 8 more positive …
… 34 more negative …,… 34 more negative …
-0.071,church
-0.090,christian
-0.092,christians
-0.210,rutgers

Contribution?,Feature
+0.388,Highlighted in text (sum)
+0.076,be
… 4 more positive …,… 4 more positive …
… 43 more negative …,… 43 more negative …
-0.057,fbi
-0.066,atf
-0.070,government
-0.106,waco
-0.110,<BIAS>
-0.202,guns

Contribution?,Feature
+5.746,Highlighted in text (sum)
+0.047,david
… 33 more positive …,… 33 more positive …
… 31 more negative …,… 31 more negative …
-0.044,arab
-0.148,<BIAS>

Contribution?,Feature
+0.701,Highlighted in text (sum)
… 5 more positive …,… 5 more positive …
… 59 more negative …,… 59 more negative …
-0.086,we
-0.092,clinton
-0.110,government
-0.141,article
-0.194,<BIAS>

Contribution?,Feature
+0.752,Highlighted in text (sum)
… 7 more positive …,… 7 more positive …
… 51 more negative …,… 51 more negative …
-0.056,jesus
-0.057,morality
-0.141,christian
-0.183,god
-0.359,<BIAS>


--------------------------------------------------
Estimator: Keras
True Label: 17, Predicted Label: 17
	 - Local Interpretation is not supported to Keras



* **Local prediction using LIME TextExplainer**

https://eli5.readthedocs.io/en/latest/tutorials/black-box-text-classifiers.html#lime-tutorial

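ELI5's direct local interpretation does not support the Keras model (see above), so we explain it with `TextExplainer` instead. It treats the classifier as a black box: it generates perturbed versions of the document, fits a white-box model to the probabilities the black box predicts for them, and displays the weights of that white-box model.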
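
The idea behind this approach can be sketched without any library. The following is a simplified illustration, not ELI5's implementation: `black_box_proba` is a hypothetical toy classifier, the perturbed samples are created by randomly dropping words, and the surrogate is a plain least-squares linear model fitted by gradient descent. ELI5 and LIME use more refined sampling, sample weighting, and surrogate classifiers.

```python
import math
import random


def black_box_proba(text):
    """Hypothetical toy black box: probability that a text is about religion."""
    words = set(text.split())
    logit = 2.0 * ("god" in words) + 1.0 * ("church" in words) - 1.5
    return 1.0 / (1.0 + math.exp(-logit))


def lime_sketch(text, predict, n_samples=500, lr=0.3, epochs=500, seed=42):
    """Fit a local linear surrogate; return (bias, [(word, weight), ...])."""
    rng = random.Random(seed)
    words = text.split()

    # 1. Perturb the document: each mask keeps (1) or drops (0) every word.
    masks = [[rng.randint(0, 1) for _ in words] for _ in range(n_samples)]

    # 2. Ask the black box for its probability on every perturbed text.
    targets = [
        predict(" ".join(w for w, keep in zip(words, mask) if keep))
        for mask in masks
    ]

    # 3. Fit bias + one weight per word by gradient descent on squared error.
    bias, weights = 0.0, [0.0] * len(words)
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * len(words)
        for mask, y in zip(masks, targets):
            err = bias + sum(w for w, keep in zip(weights, mask) if keep) - y
            grad_b += err
            for j, keep in enumerate(mask):
                if keep:
                    grad_w[j] += err
        bias -= lr * grad_b / n_samples
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]

    return bias, list(zip(words, weights))


bias, weights = lime_sketch("i believe god and church matter", black_box_proba)

# Print the surrogate's weights in the same spirit as the ELI5 tables.
for word, weight in sorted(weights, key=lambda pair: -pair[1]):
    print(f"{weight:+.3f}  {word}")
print(f"{bias:+.3f}  <BIAS>")
```

In this toy setting `god` receives the largest weight, followed by `church`, while the filler words stay close to zero. The tables in this notebook rank words in the same way, by their contribution to each class.
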

In [59]:
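import eli5
from eli5.lime import TextExplainer

# Class labels of the fitted pipeline, used to name the output tables
target_names = pipeline['pipeline'].classes_.tolist()

# Fit a local white-box surrogate around one test document
te = TextExplainer(random_state=42)
te.fit(x_test[idx], pipeline['pipeline'].predict_proba)
te.show_prediction(target_names=target_names)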

Contribution?,Feature
-0.67,<BIAS>
-1.388,Highlighted in text (sum)

Contribution?,Feature
-0.403,<BIAS>
-3.265,Highlighted in text (sum)

Contribution?,Feature
-0.409,<BIAS>
-3.94,Highlighted in text (sum)

Contribution?,Feature
-0.409,<BIAS>
-1.288,Highlighted in text (sum)

Contribution?,Feature
-0.454,<BIAS>
-2.676,Highlighted in text (sum)

Contribution?,Feature
-0.363,<BIAS>
-5.043,Highlighted in text (sum)

Contribution?,Feature
-0.338,<BIAS>
-4.597,Highlighted in text (sum)

Contribution?,Feature
-0.547,<BIAS>
-2.41,Highlighted in text (sum)

Contribution?,Feature
-0.602,<BIAS>
-2.356,Highlighted in text (sum)

Contribution?,Feature
-0.502,<BIAS>
-2.445,Highlighted in text (sum)

Contribution?,Feature
-0.464,<BIAS>
-2.552,Highlighted in text (sum)

Contribution?,Feature
-0.59,<BIAS>
-1.715,Highlighted in text (sum)

Contribution?,Feature
-0.6,<BIAS>
-2.119,Highlighted in text (sum)

Contribution?,Feature
-0.502,Highlighted in text (sum)
-0.578,<BIAS>

Contribution?,Feature
-0.689,<BIAS>
-1.738,Highlighted in text (sum)

Contribution?,Feature
-0.535,Highlighted in text (sum)
-0.678,<BIAS>

Contribution?,Feature
-0.679,<BIAS>
-1.377,Highlighted in text (sum)

Contribution?,Feature
-0.687,<BIAS>
-1.593,Highlighted in text (sum)

Contribution?,Feature
-0.675,<BIAS>
-1.367,Highlighted in text (sum)

Contribution?,Feature
-0.576,<BIAS>
-1.762,Highlighted in text (sum)
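
Each table above contains only two rows, so the class scores of the surrogate model can be recomputed by hand. The sketch below copies the `<BIAS>` and `Highlighted in text (sum)` values from the 20 tables and adds them up. It assumes that the tables are printed in the order of `target_names` and that, as in ELI5's explanations of linear models, the rows of a table sum to the score of that class.

```python
# (bias, highlighted-in-text sum) for each of the 20 tables, in printed order.
lime_rows = [
    (-0.670, -1.388), (-0.403, -3.265), (-0.409, -3.940), (-0.409, -1.288),
    (-0.454, -2.676), (-0.363, -5.043), (-0.338, -4.597), (-0.547, -2.410),
    (-0.602, -2.356), (-0.502, -2.445), (-0.464, -2.552), (-0.590, -1.715),
    (-0.600, -2.119), (-0.578, -0.502), (-0.689, -1.738), (-0.678, -0.535),
    (-0.679, -1.377), (-0.687, -1.593), (-0.675, -1.367), (-0.576, -1.762),
]

# Score of each class = bias + summed contribution of the highlighted words.
scores = [round(bias_term + text_term, 3) for bias_term, text_term in lime_rows]

# Position of the table with the highest surrogate score.
best = max(range(len(scores)), key=scores.__getitem__)

for position, score in enumerate(scores):
    marker = "  <-- highest" if position == best else ""
    print(f"table {position:2d}: score {score:+.3f}{marker}")
```

These are scores of the local surrogate, not of the Keras model itself. The ELI5 documentation describes a `metrics_` attribute on a fitted `TextExplainer` (for example `te.metrics_`) that reports how closely the surrogate follows the black-box classifier, which is worth checking before trusting the explanation.
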
