# Movie Review Explanations


We will use a RoBERTa classifier built on [movie sentiment data](http://www.cs.cornell.edu/people/pabo/movie%2Dreview%2Ddata/) that predicts positive or negative sentiment for review text.

The KFServing resource provides:
   * A pretrained RoBERTa model served from a custom container image (`seldonio/kf_movie_sentiment_roberta:0.1`)
   * A [Seldon Alibi](https://github.com/SeldonIO/alibi) text explainer. See the [Alibi Docs](https://docs.seldon.io/projects/alibi/en/stable/) for further details.

In [1]:
!pygmentize roberta_explainer.yaml

apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: roberta
spec:
  default:
    predictor:
      minReplicas: 1
      custom:
        container:
          image: seldonio/kf_movie_sentiment_roberta:0.1
          resources:
            requests:
              cpu: 6
              memory: 6Gi
              nvidia.com/gpu: 1
            limits:
              cpu: 6
              memory: 20Gi
              nvidia.com/gpu: 1
    explainer:

In [11]:
!kubectl apply -f roberta_explainer.yaml

inferenceservice.serving.kubeflow.org/roberta created


In [3]:
CLUSTER_IPS=!(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
CLUSTER_IP=CLUSTER_IPS[0]
print(CLUSTER_IP)

35.204.158.239


In [5]:
SERVICE_HOSTNAMES=!(kubectl get inferenceservice roberta -o jsonpath='{.status.url}' | cut -d "/" -f 3)
SERVICE_HOSTNAME=SERVICE_HOSTNAMES[0]
print(SERVICE_HOSTNAME)

roberta.default.example.com
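
The cluster IP and the service hostname are used together when calling the service: the request is sent to the ingress IP, and the `Host` header routes it to the right InferenceService. The sketch below builds such a request in the style of the KFServing V1 data plane (`:predict` and `:explain` verbs). It is an illustration only: `build_request` is a hypothetical helper, not the `predict`/`explain` functions from `alibi_helper` (whose implementation is not shown here), and the `instances` payload shape is an assumption about what the custom predictor accepts.

```python
import json


def build_request(cluster_ip, service_hostname, model_name, texts, verb="predict"):
    """Return (url, headers, body) for a KFServing V1-style call.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    if verb not in ("predict", "explain"):
        raise ValueError("verb must be 'predict' or 'explain'")
    # The request targets the ingress IP; the Host header selects the service.
    url = f"http://{cluster_ip}/v1/models/{model_name}:{verb}"
    headers = {"Host": service_hostname, "Content-Type": "application/json"}
    # Assumed payload shape: a list of raw review strings under "instances".
    body = json.dumps({"instances": list(texts)})
    return url, headers, body


url, headers, body = build_request(
    "35.204.158.239",
    "roberta.default.example.com",
    "roberta",
    ["a good place to start ."],
    verb="explain",
)
print(url)
```

With the `requests` library installed, the result could then be sent with `requests.post(url, headers=headers, data=body)`.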


In [6]:
import sys
sys.path.append('../')
from alibi_helper import *

In [7]:
from alibi.datasets import fetch_movie_sentiment
movies = fetch_movie_sentiment()

In [70]:
# Indices of one negative and one positive review in the dataset
idxNeg = 24
idxPos = 5215
for idx in [idxNeg,idxPos]:
    print(movies.data[idx])
    show_prediction(predict(movies.data[idx:idx+1],'roberta',movies,SERVICE_HOSTNAME,CLUSTER_IP))

as exciting as all this exoticism might sound to the typical pax viewer , the rest of us will be lulled into a coma .


## Prediction: negative

if you sometimes like to go to the movies to have fun , wasabi is a good place to start .


## Prediction: positive

# Get Explanation for Negative Prediction

In [71]:
exp = explain(movies.data[idxNeg:idxNeg+1],"roberta",SERVICE_HOSTNAME,CLUSTER_IP)

In [72]:
show_anchors(exp['names'])

# Explanation:

## ['lulled']

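Show the precision and coverage of the anchor. Precision is how likely it is that predictions on samples containing the anchor features produce the same result; coverage is the share of samples to which the anchor applies.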

In [53]:
show_bar([exp['precision']],[''],"Precision")
show_bar([exp['coverage']],[''],"Coverage")
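
To make the two bars concrete, here is a minimal sketch of how precision and coverage of a word anchor can be estimated over a set of samples. It is a simplification under stated assumptions: Alibi estimates these quantities by sampling perturbations, whereas this example uses a small fixed list, and `anchor_metrics` and `toy_predict` are hypothetical names standing in for the real explainer and model.

```python
def anchor_metrics(samples, predict_fn, anchor_words, original_label):
    """Estimate (precision, coverage) of a word anchor over `samples`."""
    # A sample is covered when it contains every anchor word.
    covered = [s for s in samples if all(w in s.split() for w in anchor_words)]
    coverage = len(covered) / len(samples) if samples else 0.0
    if not covered:
        return 0.0, coverage
    # Precision: share of covered samples that keep the original prediction.
    same = sum(1 for s in covered if predict_fn(s) == original_label)
    return same / len(covered), coverage


def toy_predict(text):
    # Stand-in classifier (assumption): negative if a "bad" word occurs.
    return "negative" if {"lulled", "useless"} & set(text.split()) else "positive"


samples = [
    "the rest of us will be lulled into a coma",
    "UNK rest UNK us UNK be lulled UNK a UNK",
    "wasabi is a good place to start",
    "another useless recycling of a brutal drama",
]
precision, coverage = anchor_metrics(samples, toy_predict, ["lulled"], "negative")
print(precision, coverage)  # → 1.0 0.5
```

Here the anchor `lulled` applies to two of the four samples (coverage 0.5), and both of those keep the negative prediction (precision 1.0).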

In [54]:
show_feature_coverage(exp)

In [26]:
show_examples(exp,0,movies)

## Examples covered by Anchors: ['useless']

| | 0 |
|---|---|
| 0 | another useless recycling UNK UNK brutal mid-'... |
| 1 | another useless UNK of UNK brutal mid-'70s ame... |
| 2 | another useless recycling of UNK brutal mid-'7... |
| 3 | UNK useless UNK of a brutal mid-'70s american ... |
| 4 | UNK useless UNK UNK UNK brutal UNK UNK sports ... |
| 5 | UNK useless recycling UNK a brutal mid-'70s UN... |
| 6 | another useless UNK UNK UNK brutal UNK UNK spo... |
| 7 | UNK useless recycling of UNK UNK UNK american ... |
| 8 | UNK useless UNK UNK a UNK mid-'70s UNK sports ... |
| 9 | UNK useless recycling of UNK brutal UNK UNK UN... |


In [27]:
show_examples(exp,0,movies,False)

## Examples not covered by Anchors: ['useless']

| | 0 |
|---|---|
| 0 | UNK useless UNK UNK a brutal mid-'70s american... |
| 1 | UNK useless UNK UNK a brutal mid-'70s american... |
| 2 | UNK useless UNK UNK a brutal UNK american spor... |
| 3 | UNK useless UNK UNK a brutal mid-'70s american... |

# Get Explanation for Positive Prediction

In [29]:
exp = explain(movies.data[idxPos:idxPos+1],"roberta",SERVICE_HOSTNAME,CLUSTER_IP)

In [30]:
show_anchors(exp['names'])

# Explanation:

## ['good']

Show precision: how likely it is that predictions on samples containing the anchor features produce the same result.

In [31]:
show_bar([exp['precision']],[''],"Precision")
show_bar([exp['coverage']],[''],"Coverage")

In [32]:
show_feature_coverage(exp)

In [33]:
show_examples(exp,0,movies)

## Examples covered by Anchors: ['good']

| | 0 |
|---|---|
| 0 | UNK UNK sometimes UNK to go UNK UNK UNK to UNK... |
| 1 | if you UNK like to UNK UNK the UNK to have UNK... |
| 2 | UNK you UNK like UNK go UNK UNK movies UNK UNK... |
| 3 | if you sometimes UNK UNK go UNK the UNK UNK UN... |
| 4 | if UNK UNK UNK to UNK to the UNK to have fun U... |
| 5 | UNK you sometimes UNK to UNK UNK UNK UNK UNK U... |
| 6 | UNK you UNK UNK UNK go to UNK UNK UNK have UNK... |
| 7 | UNK you sometimes UNK to UNK to UNK UNK to UNK... |
| 8 | if UNK UNK like to UNK UNK the movies to have ... |
| 9 | if you UNK UNK UNK go UNK UNK movies to UNK UN... |


In [34]:
show_examples(exp,0,movies,False)

## Examples not covered by Anchors: ['good']

| | 0 |
|---|---|
| 0 | if UNK UNK UNK UNK go to UNK movies to have UN... |
| 1 | UNK UNK sometimes UNK to go to the UNK to UNK ... |
| 2 | if UNK sometimes UNK UNK go to UNK movies to U... |
| 3 | UNK you sometimes UNK to UNK to UNK UNK to UNK... |
| 4 | if you sometimes UNK UNK go UNK the UNK UNK UN... |
