# High-level LSTM Keras (CNTK) Example

In this lab we perform sentiment analysis on the IMDB dataset of movie reviews.
We embed the words of each review, then use a Gated Recurrent Unit (GRU, a gated recurrent layer closely related to the LSTM) to process the sequence.
Lastly, the output needs to indicate either positive or negative sentiment (binary classification), so we add a fully connected layer to perform the classification.

## Import the required libraries
Note that the line that sets CNTK as the Keras backend is commented out in the cell below, so this run uses the default backend, TensorFlow, as the cell output confirms.
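
Keras reads the `KERAS_BACKEND` environment variable when it is first imported, so the variable has to be set before the `import keras` line to have any effect. A minimal sketch using only the standard library; the `select_backend` helper is hypothetical and not part of Keras:

```python
import os


def select_backend(name):
    """Request a Keras backend by name.

    This only takes effect if it runs before `import keras`.
    """
    os.environ["KERAS_BACKEND"] = name
    return os.environ["KERAS_BACKEND"]


chosen = select_backend("cntk")
print("Requested backend:", chosen)
```

After importing Keras, `K.backend.backend()` (used in the version cell) shows which backend was actually loaded.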

In [1]:
import os
import sys
import numpy as np
## os.environ['KERAS_BACKEND'] = "cntk"
import keras as K
#import cntk
from keras.models import Sequential
from keras.layers import Dense, Embedding, GRU
from common.params_lstm import *
from common.utils import *

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


In [2]:
print("OS: ", sys.platform)
print("Python: ", sys.version)
print("Keras: ", K.__version__)
print("Numpy: ", np.__version__)
#print("CNTK: ", cntk.__version__)
print(K.backend.backend())

OS:  darwin
Python:  3.6.2 |Anaconda custom (x86_64)| (default, Sep 21 2017, 18:29:43) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
Keras:  2.0.8
Numpy:  1.14.2
tensorflow


## Setup the network
A helper function that builds our network: Embedding -> GRU -> classifier.

In [3]:
def create_symbol():
    model = Sequential()
    model.add(Embedding(MAXFEATURES, EMBEDSIZE, input_length=MAXLEN))
    model.add(GRU(NUMHIDDEN))
    model.add(Dense(2, activation='softmax'))
    return model
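
To make the Embedding -> GRU -> classifier pipeline concrete, here is a rough NumPy sketch of a forward pass with toy sizes and random weights. It uses the textbook GRU equations with a plain sigmoid for the gates, which may differ in detail from what the Keras `GRU` layer does internally. All names and sizes below are illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def gru_forward(x_seq, W, U, b):
    """Run a textbook GRU over x_seq (batch, time, embed); return the last hidden state."""
    hidden = U["z"].shape[0]
    h = np.zeros((x_seq.shape[0], hidden))
    for t in range(x_seq.shape[1]):
        x_t = x_seq[:, t, :]
        z = sigmoid(x_t @ W["z"] + h @ U["z"] + b["z"])  # update gate
        r = sigmoid(x_t @ W["r"] + h @ U["r"] + b["r"])  # reset gate
        h_cand = np.tanh(x_t @ W["h"] + (r * h) @ U["h"] + b["h"])  # candidate state
        h = z * h + (1.0 - z) * h_cand  # blend old state and candidate
    return h


rng = np.random.RandomState(0)
vocab, embed, hidden, seq_len, batch = 50, 8, 6, 10, 4  # toy sizes, not the lab's

E = rng.normal(scale=0.1, size=(vocab, embed))  # embedding table
W = {k: rng.normal(scale=0.1, size=(embed, hidden)) for k in "zrh"}
U = {k: rng.normal(scale=0.1, size=(hidden, hidden)) for k in "zrh"}
b = {k: np.zeros(hidden) for k in "zrh"}
W_out = rng.normal(scale=0.1, size=(hidden, 2))
b_out = np.zeros(2)

tokens = rng.randint(0, vocab, size=(batch, seq_len))  # fake word indexes
embedded = E[tokens]                                   # (batch, seq_len, embed)
h_last = gru_forward(embedded, W, U, b)                # (batch, hidden)
probs = softmax(h_last @ W_out + b_out)                # (batch, 2)
print(probs.shape)
```

The embedding is a table lookup, the GRU reduces each sequence to one hidden vector, and the dense softmax layer turns that vector into two class probabilities.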

Define loss, optimizer and metrics

In [4]:
def init_model(m):
    # Two-class softmax output, so use categorical cross-entropy
    m.compile(
        loss="categorical_crossentropy",
        optimizer=K.optimizers.Adam(lr=LR, beta_1=BETA_1, beta_2=BETA_2, epsilon=EPS),
        metrics=['accuracy'])
    return m
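
For intuition about the loss chosen here, the sketch below computes categorical cross-entropy by hand in NumPy for one-hot targets and softmax-style predictions. It is an illustrative re-implementation and not the code Keras runs; the `eps` clipping value is an assumption.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)


y_true = np.array([[0, 1], [1, 0]])          # one-hot labels
y_pred = np.array([[0.5, 0.5], [0.9, 0.1]])  # predicted class probabilities
losses = categorical_crossentropy(y_true, y_pred)
print(losses)
```

An undecided prediction (0.5 / 0.5) costs ln 2, while a confident correct prediction costs much less.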

## Download the dataset (pre-defined in Keras)

In [5]:
%%time
# Data into format for library
x_train, x_test, y_train, y_test = imdb_for_library(seq_len=MAXLEN, max_features=MAXFEATURES, one_hot=True)
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)
print(x_train.dtype, x_test.dtype, y_train.dtype, y_test.dtype)

Downloading https://s3.amazonaws.com/text-datasets/imdb.npz
Done.
Extracting files...
Done.
Trimming to 20000 max-features
Padding to length 150
(25000, 150) (25000, 150) (25000, 2) (25000, 2)
int32 int32 int32 int32
CPU times: user 6.05 s, sys: 558 ms, total: 6.6 s
Wall time: 18.9 s
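
The `imdb_for_library` helper lives in `common.utils` and is not shown here. The sketch below is only a guess at what the two logged steps ("Trimming to 20000 max-features", "Padding to length 150") might look like. The function name `trim_and_pad`, the out-of-vocabulary index `2`, and the choice to keep the end of long reviews are all assumptions.

```python
import numpy as np


def trim_and_pad(sequences, max_features, seq_len, oov_index=2, pad_value=0):
    """Map rare word indexes to oov_index, then pad or truncate to seq_len."""
    out = np.full((len(sequences), seq_len), pad_value, dtype=np.int32)
    for i, seq in enumerate(sequences):
        # Replace any index outside the vocabulary with the OOV marker
        trimmed = [w if w < max_features else oov_index for w in seq]
        # Keep the last seq_len tokens of long sequences (assumed behaviour)
        kept = trimmed[-seq_len:]
        # Short sequences keep trailing pad_value entries
        out[i, :len(kept)] = kept
    return out


toy = [[1, 14, 22, 9999], [1, 5, 6, 7, 8, 9]]
padded = trim_and_pad(toy, max_features=100, seq_len=5)
print(padded)
```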


The training and test sets each contain 25,000 IMDB movie reviews, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that, for instance, the integer "3" encodes the 3rd most frequent word in the data.
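
As a toy illustration of frequency-rank encoding, the sketch below builds an index from three made-up sentences. It is not how the real dataset was produced: the `build_index` helper and the sample reviews are hypothetical, and the Keras loader additionally reserves a few low indexes for padding, start-of-sequence and unknown words.

```python
from collections import Counter


def build_index(texts):
    """Rank words by descending frequency; rank 1 is the most frequent word."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Ties are broken alphabetically so the result is deterministic
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ranked, start=1)}


reviews = ["the movie was great", "the plot was thin", "the cast was great fun"]
index = build_index(reviews)
encoded = [[index[w] for w in r.lower().split()] for r in reviews]
print(encoded)
```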

In [6]:
x_train.view(), y_train.view()

(array([[   1,    2,  312, ...,    0,    0,    0],
        [ 460,  343,    8, ...,    8,  339,  409],
        [   1,  530,  120, ...,    9,  179,  400],
        ...,
        [ 131,  713,   75, ...,   20,   99,   78],
        [   7,    6,    2, ...,    2, 1198,  798],
        [   1,   14,    9, ...,    0,    0,    0]], dtype=int32),
 array([[0, 1],
        [0, 1],
        [0, 1],
        ...,
        [1, 0],
        [1, 0],
        [1, 0]], dtype=int32))
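
The labels are one-hot encoded because the data was loaded with `one_hot=True`, so each row has a 1 in the column of its class. A minimal NumPy sketch of that encoding and its inverse; the `one_hot` helper here is illustrative and not taken from `common.utils`:

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    return np.eye(num_classes, dtype=np.int32)[labels]


labels = np.array([1, 1, 0])
encoded = one_hot(labels, 2)
decoded = np.argmax(encoded, axis=-1)  # back to integer labels
print(encoded)
print(decoded)
```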

## Instantiate the model

In [7]:
%%time
# Load symbol
sym = create_symbol()

Instructions for updating:
keep_dims is deprecated, use keepdims instead
CPU times: user 405 ms, sys: 153 ms, total: 558 ms
Wall time: 587 ms


In [8]:
%%time
# Initialise model
model = init_model(sym)

Instructions for updating:
keep_dims is deprecated, use keepdims instead
CPU times: user 68.4 ms, sys: 4.59 ms, total: 73 ms
Wall time: 71.1 ms


In [9]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, 150, 125)          2500000   
_________________________________________________________________
gru_1 (GRU)                  (None, 100)               67800     
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 202       
Total params: 2,568,002
Trainable params: 2,568,002
Non-trainable params: 0
_________________________________________________________________
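
The parameter counts in the summary can be checked by hand. The sketch below assumes the sizes visible in the outputs above (20,000 features, embedding size 125, 100 hidden units, 2 classes) and a GRU with three weight blocks (update gate, reset gate, candidate state), each with input weights, recurrent weights and a bias.

```python
def embedding_params(vocab_size, embed_size):
    # One embedding vector per word index
    return vocab_size * embed_size


def gru_params(embed_size, hidden_size):
    # Three blocks: input weights + recurrent weights + bias
    return 3 * (embed_size * hidden_size + hidden_size * hidden_size + hidden_size)


def dense_params(inputs, outputs):
    # Weight matrix plus one bias per output unit
    return inputs * outputs + outputs


emb = embedding_params(20000, 125)
gru = gru_params(125, 100)
dense = dense_params(100, 2)
total = emb + gru + dense
print(emb, gru, dense, total)
```

Nearly all of the parameters sit in the embedding table, which is typical for models with a large vocabulary.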


## Train the model
Watch the training accuracy that Keras reports for each epoch.

In [10]:
%%time
# Train model
model.fit(x_train,
          y_train,
          batch_size=BATCHSIZE,
          epochs=EPOCHS,
          verbose=1)

Epoch 1/3
Epoch 2/3
Epoch 3/3
CPU times: user 15min 36s, sys: 4min 6s, total: 19min 43s
Wall time: 4min 4s


<keras.callbacks.History at 0x131a72940>

## Predict on the test set

In [11]:
%%time
y_guess = model.predict(x_test, batch_size=BATCHSIZE)
y_guess = np.argmax(y_guess, axis=-1)
y_truth = np.argmax(y_test, axis=-1)

CPU times: user 1min 15s, sys: 8.96 s, total: 1min 24s
Wall time: 21 s


In [12]:
print("Accuracy: ", np.mean(y_guess == y_truth))

Accuracy:  0.85964
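
The figure computed above is the share of test reviews whose predicted class matches the true label, which is usually called accuracy. Precision is a different metric: of the reviews predicted positive, the share that really are positive. A small sketch with made-up labels shows the two can differ. The helper names and the choice of class `1` as "positive" are assumptions.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return float(np.mean(y_true == y_pred))


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    predicted_pos = y_pred == positive
    if not predicted_pos.any():
        return 0.0
    return float(np.mean(y_true[predicted_pos] == positive))


y_true = np.array([1, 1, 0, 0, 1, 0, 0, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1, 0, 0])
acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
print("Accuracy:", acc, "Precision:", prec)
```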
