# eli5 Text Explainer Example

<table align="left"><td>
  <a target="_blank"  href="https://colab.research.google.com/github/TannerGilbert/Model-Interpretation/blob/master/eli5/ELI5_Text_Explainer_Example.ipynb">
    <img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab
  </a>
</td><td>
  <a target="_blank"  href="https://github.com/TannerGilbert/Model-Interpretation/blob/master/eli5/ELI5_Text_Explainer_Example.ipynb">
    <img width=32px src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
</td></table>

## Installation

In [1]:
!pip install eli5

Collecting eli5
  Downloading https://files.pythonhosted.org/packages/d1/54/04cab6e1c0ae535bec93f795d8403fdf6caf66fa5a6512263202dbb14ea6/eli5-0.11.0-py2.py3-none-any.whl (106kB)
Installing collected packages: eli5
Successfully installed eli5-0.11.0


## Downloading the dataset

In [2]:
from sklearn.datasets import fetch_20newsgroups

# Restrict the 20 Newsgroups corpus to four categories
categories = ['alt.atheism', 'soc.religion.christian',
              'comp.graphics', 'sci.med']
twenty_train = fetch_20newsgroups(
    subset='train',
    categories=categories,
    shuffle=True,
    random_state=42,
    remove=('headers', 'footers'),
)
twenty_test = fetch_20newsgroups(
    subset='test',
    categories=categories,
    shuffle=True,
    random_state=42,
    remove=('headers', 'footers'),
)

Downloading 20news dataset. This may take a few minutes.
Downloading dataset from https://ndownloader.figshare.com/files/5975967 (14 MB)


## Training a Support Vector Machine

In [3]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import make_pipeline

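# Unigram and bigram TF-IDF features; drop English stop words and
# terms that appear in fewer than 3 documents
vec = TfidfVectorizer(min_df=3, stop_words='english',
                      ngram_range=(1, 2))
# Reduce the TF-IDF matrix to 100 latent dimensions (latent semantic analysis)
svd = TruncatedSVD(n_components=100, n_iter=7, random_state=42)
lsa = make_pipeline(vec, svd)

# probability=True enables predict_proba, which TextExplainer calls later
clf = SVC(C=150, gamma=2e-2, probability=True)
pipe = make_pipeline(lsa, clf)
pipe.fit(twenty_train.data, twenty_train.target)
# Mean accuracy on the held-out test split
pipe.score(twenty_test.data, twenty_test.target)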

0.8901464713715047

## TextExplainer
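
`TextExplainer` follows the LIME idea: it generates many variants of the document with some words removed, asks the black-box model for probabilities on each variant, and fits an interpretable model to those answers. The sketch below illustrates only the perturb-and-query part in plain Python. It is not eli5's implementation: `black_box_proba`, `sample_neighbours`, and `token_effects` are hypothetical names, the keyword scorer stands in for `pipe.predict_proba`, and a simple difference of means replaces the linear white-box classifier that the library fits.

```python
import math
import random


def black_box_proba(texts):
    """Hypothetical stand-in for pipe.predict_proba.

    Takes a list of strings and returns one [P(other), P(medical)] row per text.
    """
    rows = []
    for text in texts:
        tokens = text.lower().split()
        score = 2.0 * tokens.count("kidney") + 1.5 * tokens.count("pain") - 1.0
        p = 1.0 / (1.0 + math.exp(-score))
        rows.append([1.0 - p, p])
    return rows


def sample_neighbours(doc, n_samples, rng, keep_prob=0.7):
    """Create perturbed copies of doc by randomly dropping tokens."""
    tokens = doc.split()
    samples, masks = [], []
    for _ in range(n_samples):
        mask = [rng.random() < keep_prob for _ in tokens]
        samples.append(" ".join(t for t, keep in zip(tokens, mask) if keep))
        masks.append(mask)
    return tokens, samples, masks


def token_effects(doc, n_samples=2000, seed=42):
    """Estimate each token's effect on P(medical).

    The effect is the mean probability over samples that keep the token
    minus the mean over samples that drop it.
    """
    rng = random.Random(seed)
    tokens, samples, masks = sample_neighbours(doc, n_samples, rng)
    probs = [row[1] for row in black_box_proba(samples)]
    effects = {}
    for i, token in enumerate(tokens):
        kept = [p for p, m in zip(probs, masks) if m[i]]
        dropped = [p for p, m in zip(probs, masks) if not m[i]]
        effects[token] = sum(kept) / len(kept) - sum(dropped) / len(dropped)
    return effects


effects = token_effects("the kidney pain started after lunch")
for token, effect in sorted(effects.items(), key=lambda kv: -kv[1]):
    print(f"{token:>8}  {effect:+.3f}")
```

Words the toy model relies on ("kidney", "pain") should come out with clearly positive effects, while the remaining words stay near zero. The real explainer reports the analogous quantities as per-word contributions highlighted in the text.
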

In [4]:
from eli5.lime import TextExplainer

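# Explain the pipeline's prediction for the first test document
doc = twenty_test.data[0]
te = TextExplainer(random_state=42)
# Fit a local white-box model that approximates pipe.predict_proba around doc
te.fit(doc, pipe.predict_proba)
te.show_prediction(target_names=twenty_train.target_names)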



Contribution?,Feature
-0.395,<BIAS>
-8.479,Highlighted in text (sum)

Contribution?,Feature
-0.283,<BIAS>
-8.102,Highlighted in text (sum)

Contribution?,Feature
6.123,Highlighted in text (sum)
-0.068,<BIAS>

Contribution?,Feature
-0.349,<BIAS>
-4.851,Highlighted in text (sum)
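
Each table above lists the contributions for one class, and the class score is the sum of the bias term and the highlighted-text sum. The short sketch below redoes that arithmetic with the numbers copied from the four tables, in the order they appear. The step from scores to probabilities is an assumption: it uses per-class sigmoids followed by normalization, which is how scikit-learn's `SGDClassifier` with log loss derives multiclass probabilities, and which is believed to be the default white-box model in `TextExplainer`. The class labels are not visible in the exported output, so the code refers to tables by position only.

```python
import math

# (bias, highlighted-text sum) pairs copied from the four tables above, in order.
contributions = [
    (-0.395, -8.479),
    (-0.283, -8.102),
    (-0.068, 6.123),
    (-0.349, -4.851),
]

# Score of each class = bias + sum of the per-word contributions.
scores = [bias + text_sum for bias, text_sum in contributions]

# Assumed conversion: one-vs-rest sigmoids, then normalize so they sum to 1.
sigmoids = [1.0 / (1.0 + math.exp(-s)) for s in scores]
total = sum(sigmoids)
probabilities = [s / total for s in sigmoids]

# Position (0-based) of the table with the highest score.
best = max(range(len(scores)), key=scores.__getitem__)

for i, (score, prob) in enumerate(zip(scores, probabilities)):
    print(f"table {i + 1}: score={score:+.3f}  probability={prob:.4f}")
print("highest-scoring table:", best + 1)
```

Only the third table has a positive score, so under this assumption it receives almost all of the probability mass, and the highlighted words in the text are what push the prediction toward that class.
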
