# Sentiment analysis
<img src="./screencast.gif"/>

In this sample, we will build a sentiment annotator for the [Movie Review](http://www.cs.cornell.edu/people/pabo/movie-review-data/) dataset from Cornell.

In [1]:
import json
import tarfile

with tarfile.open('data.json.tgz') as tar:
    # The archive holds a single JSON file with the reviews.
    file = tar.extractfile('data.json')
    data = json.loads(file.read().decode('utf8'))

There's a lot of data here, so let's process only a subset of it: the first 10 reviews, each scored with NLTK's VADER sentiment analyzer.

In [3]:
import nltk
nltk.download('vader_lexicon')

from nltk.sentiment.vader import SentimentIntensityAnalyzer
sentiment_analyzer = SentimentIntensityAnalyzer()

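# Keep the first 10 reviews and attach VADER's compound polarity score to each.
data = [
    {
        'text': review['text'],
        'sentiment': sentiment_analyzer.polarity_scores(review['text'])['compound'],
    }
    for review in list(data.values())[:10]
]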

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /Users/alexkuk/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!
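
The compound score is a float between -1 and 1, while the annotation buttons later in this sample work with the discrete values `+1`, `0` and `-1`. As a minimal sketch of how one might bridge the two, the helper below maps a score to a label. The function name `compound_to_label` is hypothetical, and the `0.05` cutoff is an assumption based on the threshold commonly suggested for VADER, not something this sample uses.

```python
def compound_to_label(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to -1, 0 or +1.

    The threshold is an assumed cutoff: scores closer to zero
    than the threshold are treated as neutral.
    """
    if compound >= threshold:
        return 1
    if compound <= -threshold:
        return -1
    return 0


# Illustrative scores, not taken from the dataset.
labels = [compound_to_label(score) for score in (0.8, 0.01, -0.4)]
print(labels)
```

A wider threshold sends more borderline reviews to the neutral bucket, which may be preferable when a human annotator will review them anyway.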


We will write a simple display formatter to make the output look nice.

In [4]:
from IPython.display import display, HTML

def display_record(record):
    # Reduce the score to its sign (+1, 0 or -1) so a neutral value is not shown as negative.
    score = record['sentiment']
    sentiment = (score > 0) - (score < 0)
    color = {1: 'green', 0: 'gray', -1: 'red'}[sentiment]
    display(HTML('<span style="color:{};">{}</span>'.format(color, sentiment)))
    print(record['text'])

display_record(data[2])

have you ever been in an automobile accident where you've miraculously walked away with only a few scratches , yet the car has been obliterated into an unrecognizable , mangled wreck ? 
well , that has never actually happened to me and i hope that none of us will ever experience this situation . 
but after watching this inane exercise of a movie , i certainly feel that i've miraculously walked away unscathed after a two-hour ride that mercilessly careens back and forth before finally plummeting into an icy pond . 
oddly , `eye of the beholder , ' which is a psychological-romance-thriller , starts off promisingly enough when the opening sequence introduces us to a british intelligence agent , called the eye ( ewan mcgregor ) , working in washington dc . 
in the humorous opening scene , he eyes a top lawyer across the street in his office with his pants down . 
using an array of high-tech surveillance and communications equipment , he proceeds to transmit pictures of the bared lawyer to 

## Assemble our annotator
Now we can assemble our annotator using `ipyannotate`. For each task, we show the user the model-evaluated sentiment and let them override it with the `+`, `o` and `-` buttons (values `+1`, `0` and `-1`). Clicking a button writes its value into the `sentiment` field of the current task's output.

In [6]:
from ipyannotate.buttons import OkButton as Button, NextButton, BackButton
from ipyannotate.toolbar import Toolbar
from ipyannotate.tasks import Task, Tasks
from ipyannotate.canvas import OutputCanvas
from ipyannotate.annotation import Annotation


def handle_click(button):
    annotation.tasks.current.output['sentiment'] = button.value

tasks = Tasks(Task(_) for _ in data)

pos = Button(label='+', shortcut='1', value=1, color='green', icon='')
neu = Button(label='o', shortcut='2', value=0, color='gray', icon='')
neg = Button(label='-', shortcut='3', value=-1, color='red', icon='')

for button in [pos, neu, neg]:
    button.on_click(handle_click)

buttons = [pos, neu, neg, BackButton(shortcut='j'), NextButton(shortcut='k')]
toolbar = Toolbar(buttons)

canvas = OutputCanvas(display=display_record)

annotation = Annotation(toolbar, tasks, canvas=canvas)
annotation

# annotation.tasks

In [None]:
annotation.tasks
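
Once annotation is done, it can be useful to see how the labels are distributed. The sketch below is one possible way to do that on plain dictionaries shaped like the records in this sample. The `summarize` helper and the example records are hypothetical, and collecting each task's `output` from `annotation.tasks` into such a list is assumed rather than shown. Records the annotator never touched still hold a raw compound score, so the helper reduces every value to its sign before counting.

```python
from collections import Counter


def summarize(records):
    """Count records per sentiment sign (+1, 0 or -1)."""
    def sign(score):
        # Works for both raw compound scores and button values.
        return (score > 0) - (score < 0)

    return Counter(sign(record['sentiment']) for record in records)


# Hypothetical records standing in for the outputs of the annotation tasks.
records = [
    {'text': 'a joy from start to finish', 'sentiment': 0.93},  # untouched model score
    {'text': 'an inane exercise of a movie', 'sentiment': -1},  # overridden with "-"
    {'text': 'it is a film', 'sentiment': 0},                   # overridden with "o"
    {'text': 'surprisingly good', 'sentiment': 1},              # overridden with "+"
]

counts = summarize(records)
print(counts)
```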