# Movie Review Explanations


We will use a scikit-learn classifier trained on [movie sentiment data](http://www.cs.cornell.edu/people/pabo/movie%2Dreview%2Ddata/) that predicts positive or negative sentiment for review text.

The KFServing resource provides:
   * A pretrained scikit-learn model stored in a Google Cloud Storage bucket
   * A [Seldon Alibi](https://github.com/SeldonIO/alibi) text explainer (`AnchorText`). See the [Alibi Docs](https://docs.seldon.io/projects/alibi/en/stable/) for further details.

In [2]:
!pygmentize moviesentiment.yaml

apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "moviesentiment"
spec:
  default:
    predictor:
      minReplicas: 1
      sklearn:
        storageUri: "gs://seldon-models/sklearn/moviesentiment"
        resources:
          requests:
            cpu: 0.1
    explainer:
      minReplicas: 1
      alibi:
        type: AnchorText
        resources:
          requests:

In [4]:
!kubectl apply -f moviesentiment.yaml

inferenceservice.serving.kubeflow.org/moviesentiment created


In [1]:
CLUSTER_IPS=!(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
CLUSTER_IP=CLUSTER_IPS[0]
print(CLUSTER_IP)

35.204.158.239


In [2]:
SERVICE_HOSTNAMES=!(kubectl get inferenceservice moviesentiment -o jsonpath='{.status.url}' | cut -d "/" -f 3)
SERVICE_HOSTNAME=SERVICE_HOSTNAMES[0]
print(SERVICE_HOSTNAME)

moviesentiment.default.example.com
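The two values retrieved above are what a client needs to reach the service: requests are sent to the ingress gateway IP, and the `Host` header carries the service hostname so the gateway can route them. The following is a minimal sketch of how the request URL and headers could be assembled. The helper name `build_request` is hypothetical, and the URL pattern is copied from the `predict` and `explain` helpers defined later in this notebook.

```python
def build_request(cluster_ip, svc_hostname, model_name, verb="predict"):
    """Return (url, headers) for a call routed through the ingress gateway.

    `build_request` is an illustrative helper, not part of KFServing.
    """
    if verb not in ("predict", "explain"):
        raise ValueError("verb must be 'predict' or 'explain'")
    # The request targets the gateway IP; the Host header selects the service.
    url = "http://{}/v1/models/{}:{}".format(cluster_ip, model_name, verb)
    headers = {"Host": svc_hostname}
    return url, headers


url, headers = build_request(
    "35.204.158.239", "moviesentiment.default.example.com", "moviesentiment"
)
print(url)
print(headers)
```

Passing `verb="explain"` yields the endpoint served by the explainer instead of the predictor.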


In [8]:
import sys
sys.path.append('../')
from alibi_helper import *

In [4]:
from alibi.datasets import fetch_movie_sentiment
movies = fetch_movie_sentiment()

In [40]:
import numpy as np
import requests
import pandas as pd
import plotly.graph_objects as go
from IPython.display import display, Markdown

def getFeatures(X,cmap):
    return pd.DataFrame(X).replace(cmap).values.squeeze().tolist()

def predict(X, name, ds, svc_hostname, cluster_ip):
    formData = {
    'instances': X
    }
    headers = {}
    headers["Host"] = svc_hostname
    res = requests.post('http://'+cluster_ip+'/v1/models/'+name+':predict', json=formData, headers=headers)
    if res.status_code == 200:
        return ds.target_names[np.array(res.json()["predictions"])[0]]
    else:
        print("Failed with ",res.status_code)
        return []
    
def explain(X, name, svc_hostname, cluster_ip):
    formData = {
    'instances': X
    }
    headers = {}
    headers["Host"] = svc_hostname
    res = requests.post('http://'+cluster_ip+'/v1/models/'+name+':explain', json=formData, headers=headers)
    if res.status_code == 200:
        return res.json()
    else:
        print("Failed with ",res.status_code)
        return []

def show_bar(X, labels, title):
    fig = go.Figure(go.Bar(x=X,y=labels,orientation='h',width=[0.5]))
    fig.update_layout(autosize=False,width=700,height=300,
                      xaxis=dict(range=[0, 1]),
                      title_text=title,  
                      font=dict(family="Courier New, monospace",size=18,color="#7f7f7f"
    ))
    fig.show()

    
def show_feature_coverage(exp):
    data = []
    for idx, name in enumerate(exp["names"]):
        data.append(go.Bar(name=name, x=["coverage"], y=[exp['raw']['coverage'][idx]]))
    fig = go.Figure(data=data)
    fig.update_layout(yaxis=dict(range=[0, 1]))
    fig.show()
    
def show_anchors(names):
    display(Markdown('# Explanation:'))
    display(Markdown('## {}'.format(names)))
    
def show_examples(exp,fidx,ds,covered=True):
    if covered:
        cname = 'covered'
        display(Markdown("## Examples covered by Anchors: {}".format(exp['names'][0:fidx+1])))
    else:
        cname = 'covered_false'
        display(Markdown("## Examples not covered by Anchors: {}".format(exp['names'][0:fidx+1])))
    if "feature_names" in ds:
        return pd.DataFrame(exp['raw']['examples'][fidx][cname],columns=ds.feature_names)
    else:
        return pd.DataFrame(exp['raw']['examples'][fidx][cname])

def show_prediction(prediction):
    display(Markdown('## Prediction: {}'.format(prediction)))
    
def show_row(X,ds):
    display(pd.DataFrame(X,columns=ds.feature_names))
                        
                        

In [12]:
movies.target

[0,
 0,
 0,
 ...


In [13]:
np.argwhere(np.array(movies.target)>0)

array([[ 5212],
       [ 5213],
       [ 5214],
       ...,
       [10430],
       [10431],
       [10432]])
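The output above suggests that the labels are grouped by class, with the positive examples starting at index 5212. Rather than hard-coding indices, one could look up the first example of each class. This is a small sketch on a toy label list; `first_index_per_class` is a hypothetical helper, and it assumes every label in the array occurs at least once (which `np.unique` guarantees).

```python
import numpy as np


def first_index_per_class(target):
    """Map each label to the index of its first occurrence in `target`."""
    target = np.asarray(target)
    return {
        int(label): int(np.flatnonzero(target == label)[0])
        for label in np.unique(target)
    }


toy_target = [0, 0, 0, 1, 1]
print(first_index_per_class(toy_target))  # {0: 0, 1: 3}
```

Applied to `movies.target`, the same call would return one index per sentiment class.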

In [70]:
idxNeg = 17
idxPos = 5212
for idx in [idxNeg,idxPos]:
    print(movies.data[idx])
    show_prediction(predict(movies.data[idx:idx+1],'moviesentiment',movies,SERVICE_HOSTNAME,CLUSTER_IP))

such master screenwriting comes courtesy of john pogue , the yale grad who previously gave us " the skulls " and last year's " rollerball . " enough said , except : film overboard !


## Prediction: negative

the rock is destined to be the 21st century's new " conan " and that he's going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .


## Prediction: positive

# Get Explanation for Negative Prediction

In [71]:
exp = explain(movies.data[idxNeg:idxNeg+1],"moviesentiment",SERVICE_HOSTNAME,CLUSTER_IP)

In [73]:
show_anchors(exp['names'])

# Explanation:

## ['john', 'rollerball']

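Show precision and coverage. Precision is the fraction of perturbed samples containing the anchor words for which the model returns the same prediction; coverage is the fraction of samples to which the anchor applies.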

In [74]:
show_bar([exp['precision']],[''],"Precision")
show_bar([exp['coverage']],[''],"Coverage")
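To make the two plotted quantities concrete, here is a toy computation of precision and coverage for a word anchor over a handful of samples. It is only an illustration under simplifying assumptions: the classifier `toy_predict` and the sample list are made up, and Alibi estimates these values from sampled perturbations rather than by enumerating a fixed list.

```python
def anchor_precision_coverage(samples, anchor, predict_fn, target_label):
    """Return (precision, coverage) of `anchor` over tokenised `samples`.

    coverage:  share of samples that contain every anchor word.
    precision: share of those covered samples predicted as `target_label`.
    """
    covered = [s for s in samples if anchor.issubset(s)]
    coverage = len(covered) / len(samples)
    if not covered:
        return 0.0, coverage
    same = sum(1 for s in covered if predict_fn(s) == target_label)
    return same / len(covered), coverage


def toy_predict(tokens):
    # Made-up classifier: any review containing 'bad' is negative.
    return "negative" if "bad" in tokens else "positive"


samples = [
    ["bad", "plot"],
    ["bad", "acting"],
    ["good", "plot"],
    ["plot", "twist"],
]
precision, coverage = anchor_precision_coverage(
    samples, {"bad"}, toy_predict, "negative"
)
print(precision, coverage)  # 1.0 0.5
```

Here the anchor `{"bad"}` applies to half of the samples (coverage 0.5), and every sample it applies to keeps the negative prediction (precision 1.0).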

In [75]:
show_feature_coverage(exp)

In [76]:
show_examples(exp,0,movies)

## Examples covered by Anchors: ['john']

Unnamed: 0,0
0,UNK master screenwriting comes courtesy of joh...
1,UNK master screenwriting UNK UNK UNK john UNK ...
2,"UNK UNK UNK comes courtesy UNK john pogue , th..."
3,such UNK screenwriting comes courtesy of john ...
4,such UNK UNK comes UNK UNK john UNK UNK UNK UN...
5,such UNK UNK comes UNK UNK john pogue UNK the ...
6,such master UNK UNK courtesy UNK john pogue UN...
7,UNK UNK UNK comes courtesy UNK john pogue UNK ...
8,UNK UNK screenwriting UNK UNK of john pogue UN...
9,"UNK UNK UNK UNK courtesy of john UNK , the UNK..."


In [77]:
show_examples(exp,0,movies,False)

## Examples not covered by Anchors: ['john']

Unnamed: 0,0
0,UNK UNK screenwriting UNK courtesy of john pog...
1,such master screenwriting comes courtesy UNK j...
2,"such UNK UNK UNK UNK of john UNK , UNK UNK UNK..."
3,UNK master UNK UNK UNK UNK john pogue UNK UNK ...
4,"UNK master UNK UNK courtesy of john pogue , UN..."
5,"such UNK UNK comes UNK UNK john UNK , the yale..."
6,"such master UNK comes courtesy UNK john UNK , ..."
7,"UNK UNK UNK comes courtesy UNK john pogue , th..."
8,"UNK master UNK comes courtesy of john UNK , th..."
9,such master screenwriting comes UNK UNK john U...
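The `UNK` tokens in the example tables come from perturbing the original review: words outside the anchor are randomly masked while the anchor word is always kept. The sketch below imitates that idea in plain Python. It is not Alibi's sampler; `perturb_with_unk` and its `keep_prob` parameter are hypothetical names chosen for illustration, and the real sampling strategy may differ.

```python
import random


def perturb_with_unk(tokens, anchor, keep_prob=0.5, rng=None):
    """Replace non-anchor tokens with 'UNK' with probability 1 - keep_prob."""
    if rng is None:
        rng = random.Random()
    return [
        t if (t in anchor or rng.random() < keep_prob) else "UNK"
        for t in tokens
    ]


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
tokens = "such master screenwriting comes courtesy of john pogue".split()
for _ in range(3):
    print(" ".join(perturb_with_unk(tokens, {"john"}, rng=rng)))
```

Each printed line keeps `john` in place and masks a random subset of the other words, which is the same shape as the rows in the tables above.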


# Get Explanation for Positive Prediction

In [79]:
exp = explain(movies.data[idxPos:idxPos+1],"moviesentiment",SERVICE_HOSTNAME,CLUSTER_IP)

In [80]:
show_anchors(exp['names'])

# Explanation:

## ['steven', 'conan', 'and', 'claud']

Show precision and coverage for this anchor, as for the negative example.

In [81]:
show_bar([exp['precision']],[''],"Precision")
show_bar([exp['coverage']],[''],"Coverage")

In [82]:
show_feature_coverage(exp)

In [84]:
show_examples(exp,0,movies)

## Examples covered by Anchors: ['steven']

Unnamed: 0,0
0,the rock is destined UNK be UNK UNK century 's...
1,UNK rock is destined UNK UNK the UNK UNK UNK U...
2,UNK UNK is UNK UNK be the UNK century 's UNK U...
3,UNK UNK is UNK UNK be the 21st UNK UNK UNK UNK...
4,"the rock is UNK UNK be UNK 21st UNK UNK new "" ..."
5,the UNK is UNK to UNK UNK UNK century 's UNK U...
6,"the UNK is UNK to be UNK 21st UNK UNK new "" co..."
7,the rock UNK UNK to UNK the 21st century UNK U...
8,the UNK UNK destined to be the 21st UNK 's UNK...
9,the UNK UNK destined UNK UNK UNK 21st century ...


In [85]:
show_examples(exp,0,movies,False)

## Examples not covered by Anchors: ['steven']

Unnamed: 0,0
0,UNK UNK is UNK UNK be UNK 21st century 's UNK ...
1,UNK rock is UNK to be the UNK century 's UNK U...
2,the UNK is UNK UNK be the UNK century UNK UNK ...
3,UNK rock UNK destined to be UNK 21st UNK UNK U...
4,UNK rock UNK UNK to be the UNK century UNK new...
5,the rock UNK UNK to be UNK 21st UNK UNK new UN...
6,the rock UNK UNK to UNK UNK 21st UNK 's UNK UN...
7,UNK rock is destined to UNK UNK UNK UNK UNK ne...
8,UNK rock UNK UNK UNK UNK UNK UNK UNK UNK UNK U...
9,UNK UNK is UNK to be the 21st UNK UNK new UNK ...
