Note: To run a code cell, select it and press `SHIFT+ENTER`. To restart the notebook, select Kernel -> Restart & Clear Output from the menu above.

## Sentiment Analysis of Movie Reviews


In this exercise, we will build a model that reads movie reviews from IMDB and classifies each one as positive or negative.

The IMDB dataset consists of 25,000 reviews, each with a binary label (1 = positive, 0 = negative).

Here is an example of a POSITIVE review:

> "The pace is steady and constant, the characters full and engaging, the relationships and interactions natural showing that you do not need floods of tears to show emotion, screams to show fear, shouting to show dispute or violence to show anger. Naturally Joyce's short story lends the film a ready made structure as perfect as a polished diamond, but the small changes Huston makes such as the inclusion of the poem fit in neatly. It is truly a masterpiece of tact, subtlety and overwhelming beauty."

Here is an example of a NEGATIVE review:

> "Beautiful attracts excellent idea, but ruined with a bad selection of the actors. The main character is a loser and his woman friend and his friend upset viewers. Apart from the first episode all the other become more boring and boring. First, it considers it illogical behavior. No one normal would not behave the way the main character behaves. It all represents a typical Halmark way to endear viewers to the reduced amount of intelligence. Does such a scenario, or the casting director and destroy this question is on Halmark producers. Cat is the main character is wonderful. The main character behaves according to his friend selfish."

In [None]:
print("Hello World!")

1. Setup
--------

We first import the packages we need to run this script.

In [None]:
import mxnet as mx
import numpy as np
import logging
import imdb as IMDB
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning) 

2. Dataset
----------

We have to preprocess the dataset to convert the words into numbers. We take our vocabulary of words and assign a number to each word. For example, a sentence such as:

> "Hello world, my name is Intel and my location is Santa Clara"

will be converted to a list of 12 numbers, one per word:

> [24, 784, 4, 98, 22, 143, 15, 4, 314, 22, 488, 2894] 
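The mapping from words to numbers can be sketched in a few lines of Python. This is only an illustration: `build_vocab` and `encode` are hypothetical helpers, not part of the `imdb` module, and the vocabulary here is built from the example sentence alone, so the ids will not match the ones shown above. Reserving 0 for unknown words is also an assumption of this sketch.

```python
import re

def tokenize(sentence):
    # Lowercase the text and keep runs of letters (and apostrophes) as words.
    return re.findall(r"[a-z']+", sentence.lower())

def build_vocab(sentences):
    # Assign each distinct word an integer id in order of first appearance.
    # Id 0 is reserved for words outside the vocabulary (assumption).
    vocab = {}
    for sentence in sentences:
        for word in tokenize(sentence):
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab

def encode(sentence, vocab, unknown_id=0):
    # Replace each word with its id; unseen words map to unknown_id.
    return [vocab.get(word, unknown_id) for word in tokenize(sentence)]

sentence = "Hello world, my name is Intel and my location is Santa Clara"
vocab = build_vocab([sentence])
ids = encode(sentence, vocab)
print(ids)
```

Note that repeated words ("my", "is") receive the same id each time they appear, just as in the example above.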

We have already done this for you; the preprocessed dataset is loaded in the code below:

In [None]:
imdb = IMDB.IMDB()

3. Build the Model
-----------------

The network consists of the following layers:

1. `Embedding` transforms each word into a vector of numbers. 
2. `LSTMCell` is a recurrent layer with “long short-term memory” units. LSTM networks are good at learning temporal dependencies in the data.
3. `sum` sums the activations over time.
4. `Dropout` randomly silences a subset of the units during training.
5. `FullyConnected` is a layer with two outputs, for the two target classes.

Below we construct the graph of operations by first creating placeholders for the input data.

In [None]:
data = mx.sym.Variable('data')
label = mx.sym.Variable('softmax_label')
label = mx.sym.Reshape(data=label, shape=(-1,))

Then, we generate our graph by passing the data through the layers of the network, starting with the embedding layer:

In [None]:
net = mx.sym.Embedding(data=data, input_dim=20000, output_dim=128, name='embed')
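Conceptually, an embedding layer is a lookup table with one row per word in the vocabulary. The NumPy sketch below illustrates the idea using the same sizes as the cell above (`input_dim=20000`, `output_dim=128`). The table values are random stand-ins for the learned weights, and the word ids are made up for illustration.

```python
import numpy as np

vocab_size, embed_dim = 20000, 128  # input_dim and output_dim of the layer

# Random stand-in for the learned embedding weights (assumption).
rng = np.random.RandomState(0)
table = rng.uniform(-1.0 / 128, 1.0 / 128, size=(vocab_size, embed_dim))

# A batch of 2 short sequences of word ids (hypothetical values).
word_ids = np.array([[24, 784, 4, 98],
                     [22, 143, 15, 4]])

# Indexing the table with the ids replaces each id with its 128-dim vector.
vectors = table[word_ids]
print(vectors.shape)  # (2, 4, 128)
```

The same word id always maps to the same vector, no matter where it appears in the batch.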

For the LSTM layer, we unroll the layer over time, then pass the outputs to the rest of the network.

In [None]:
lstm_cell = mx.rnn.LSTMCell(num_hidden=64)
net, _ = lstm_cell.unroll(length=128, inputs=net, merge_outputs=True)
net = mx.sym.sum(data=net, axis=1)
net = mx.sym.Dropout(data=net, p=0.5)
net = mx.sym.FullyConnected(data=net, num_hidden=2)
net = mx.sym.SoftmaxOutput(data=net, label=label, name='softmax')

# Use 1.0/128 rather than 1/128: under Python 2, 1/128 is integer division and evaluates to 0.
init = mx.init.Mixed(patterns=['embed', '.*'],
                     initializers=[mx.init.Uniform(scale=1.0/128), mx.init.Xavier()])
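To make the `sum` and `Dropout` steps more concrete, here is a NumPy sketch of what they do to the shapes and values. The LSTM outputs are replaced by random numbers, and the batch size of 2 is arbitrary. The dropout shown follows the common "inverted dropout" convention (zero out units with probability `p`, scale the rest by `1/(1-p)`); treat that as an assumption about the details rather than a description of MXNet's internals.

```python
import numpy as np

rng = np.random.RandomState(0)
batch, length, hidden = 2, 128, 64  # length and num_hidden match the cell above

# Stand-in for the unrolled LSTM outputs: one 64-dim vector per time step.
lstm_out = rng.randn(batch, length, hidden)

# Summing over axis 1 (time) collapses each review to a single 64-dim vector.
summed = lstm_out.sum(axis=1)
print(summed.shape)  # (2, 64)

# Inverted dropout (training time only): drop units with probability p,
# then rescale the survivors so the expected activation is unchanged.
p = 0.5
mask = rng.uniform(size=summed.shape) >= p
dropped = summed * mask / (1.0 - p)
```

At inference time no units are dropped, so the summed activations pass through unchanged.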

Next, we use the Module API, which provides helper functions to train the model.

In [None]:
model = mx.mod.Module(net, context=mx.cpu(0))

Callbacks allow the model to report its progress during the course of training. Here we tell MXNet to plot the training cost.

In [None]:
from VisCostCallback import CostVisCallback

callbacks = CostVisCallback(nepochs=2.0, y_range=(0, 4.5), total_batches=156).get_callbacks()

4. Train the Model
------------------
Now we are ready to train the model. Recall what happens during the training process:

<img src="images/train_schematic.png" width="700px">

To train the model, we call the `fit()` function and pass in the training set, and other settings. Here we train for 2 epochs, meaning two rounds through the dataset.

In [None]:
model.fit(
        train_data          = imdb.train_set,
        optimizer           = 'Adagrad',
        eval_metric         = mx.metric.CrossEntropy(),
        optimizer_params    = {'learning_rate': 0.01},
        initializer         = init,
        num_epoch           = 2,
        batch_end_callback  = callbacks['train_cost'])

Accuracy
--------

We can then measure the model's accuracy on the validation data -- data that the model was not trained on.


In [None]:
score = model.score(imdb.valid_set, mx.metric.Accuracy())
print("Validation Accuracy - {}".format(100 * score[0][1]))
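As a reminder of what the accuracy metric measures, the sketch below computes it by hand with NumPy: the predicted class is the one with the highest probability, and accuracy is the fraction of predictions that match the labels. The probabilities and labels are made-up values, not outputs of our model.

```python
import numpy as np

# Hypothetical softmax outputs for 4 reviews: [P(negative), P(positive)].
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4],
                  [0.3, 0.7]])
labels = np.array([0, 1, 1, 1])  # 1 = positive, 0 = negative

# Pick the most likely class for each review and compare with the labels.
predicted = probs.argmax(axis=1)
accuracy = (predicted == labels).mean()
print("Accuracy - {:.1f}%".format(100 * accuracy))  # Accuracy - 75.0%
```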

Inference
--------

Now let's do something fun with the trained model! We create a UI below where you can type in your movie review (or any other text) and have it classified into positive or negative.

In [None]:
from imdb import preprocess, text_window
from ipywidgets import interact, interactive

def inference(x):
    inputs = preprocess(x)
    output = model.predict(eval_data=inputs).asnumpy()
    score = output[0][1]
    print("Sentiment: {:.1f}% Positive".format(100*score))

z = interact(inference, x=text_window())

In [None]:
inference("This movie was great!")
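The score printed by `inference` is the second entry of the model's softmax output, i.e. the probability assigned to the positive class. The sketch below shows how a softmax turns two raw scores into probabilities that sum to 1. The raw scores are hypothetical; in the real model they come from the `FullyConnected` layer.

```python
import math

def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for one review: [negative, positive].
logits = [-1.0, 1.0]
probs = softmax(logits)

# probs[1] plays the role of output[0][1] in the inference function above.
print("Sentiment: {:.1f}% Positive".format(100 * probs[1]))
```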