### Recurrent Neural Network, LSTM Sentiment Classification
Source: Pan, Chao. Deep Learning With Python: Step By Step Guide With Keras and Pytorch (Kindle Locations 1799-1801). 

Task - Sentiment classification
The order of words in a sentence is crucial to estimating how a user feels about the subject matter. The task is to classify users' movie reviews as positive or negative. We use the IMDB movie review dataset for this binary classification task.
The IMDB dataset contains 25,000 movie reviews in each of the training and test sets, for a total of 50,000 samples. We use Keras because it bundles the dataset in preprocessed form: each review is already encoded as a sequence of word indices.
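
To make "a sequence of word indices" concrete, here is a minimal sketch of a frequency-ranked encoding on a toy corpus. The helper names (`build_word_index`, `encode`) and the two toy reviews are hypothetical. The reserved indices (0 for padding, 1 for start, 2 for unknown words, real words shifted by 3) are assumed to mirror the Keras IMDB defaults, so treat the details as illustrative rather than as the library's implementation.

```python
from collections import Counter

# Reserved indices in this toy scheme (assumed to mirror the Keras IMDB defaults).
PAD, START, OOV = 0, 1, 2
INDEX_FROM = 3  # frequency ranks are shifted by this offset


def build_word_index(texts):
    """Rank words by frequency: the most common word gets rank 1."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ranked, start=1)}


def encode(text, word_index, num_words=None):
    """Turn a review into a list of integer indices."""
    encoded = [START]
    for word in text.lower().split():
        rank = word_index.get(word)
        idx = OOV if rank is None else rank + INDEX_FROM
        # Words outside the num_words most frequent ones are marked as unknown.
        if num_words is not None and idx >= num_words:
            idx = OOV
        encoded.append(idx)
    return encoded


toy_reviews = ["a great great movie", "a dull movie"]
word_index = build_word_index(toy_reviews)
print(encode("a great film", word_index))               # [1, 4, 5, 2]
print(encode("a great film", word_index, num_words=5))  # [1, 4, 2, 2]
```

The second call shows the effect of a vocabulary cap, analogous to `num_words = max_features` in the notebook: rarer words collapse into the single unknown-word index.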


In [1]:
from keras.preprocessing import sequence 
from keras.models import Sequential 
from keras.layers import Dense, Embedding 
from keras.layers import LSTM 
from keras.datasets import imdb 

# Set hyperparameters 
max_features = 20000 # use 20,000 most common words in corpus 
maxlen = 80 # pad or truncate every review to this many word indices
batch_size = 32


Using TensorFlow backend.


Load the dataset and print the number of samples. Note that on the first run the dataset is downloaded and cached on disk (by default under ~/.keras/datasets), so later runs load the local copy.

In [3]:
print(' Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

 Loading data...
(25000, 'train sequences')
(25000, 'test sequences')


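To make sure every review fed into the model contains exactly 80 word indices, we truncate reviews longer than 80 words and pad shorter reviews with dummy (0) values. By default, pad_sequences both pads and truncates at the start of a sequence, so the last 80 words of a long review are kept.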

In [5]:
print(' Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print(' x_train shape:', x_train.shape)
print(' x_test shape:', x_test.shape)


 Pad sequences (samples x time)
(' x_train shape:', (25000, 80))
(' x_test shape:', (25000, 80))
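
The padding step can be mimicked in plain Python to see what happens to an individual review. This is a sketch of the default behaviour of `pad_sequences` (padding and truncating at the start of the sequence), not the library's own code. `pad_or_truncate` is a hypothetical helper name, and `maxlen` is assumed to be positive.

```python
def pad_or_truncate(seq, maxlen, value=0):
    """Return exactly maxlen items: keep the last maxlen, left-pad with value."""
    trimmed = seq[-maxlen:]                      # drop items from the start if too long
    padding = [value] * (maxlen - len(trimmed))  # empty list if nothing to pad
    return padding + trimmed


print(pad_or_truncate([5, 6, 7], maxlen=5))              # [0, 0, 5, 6, 7]
print(pad_or_truncate([1, 2, 3, 4, 5, 6, 7], maxlen=5))  # [3, 4, 5, 6, 7]
```

Padding at the front keeps the actual words adjacent to the final time step, which is the step whose LSTM output the dense layer sees.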


The next step is to define our model. It consists of a Keras Embedding layer that maps each word index to a 128-dimensional vector, an LSTM layer with 128 units, and finally a dense layer with one unit.

In [7]:
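# model definition
model = Sequential()
# map each of the max_features word indices to a 128-dimensional vector
model.add(Embedding(max_features, 128))
# LSTM with 128 units; dropout on both the inputs and the recurrent state
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
# a single sigmoid unit outputs the probability of a positive review
model.add(Dense(1, activation='sigmoid'))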
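
As a sanity check on the size of this model, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard LSTM formulation (four gates, each with input weights, recurrent weights and one bias vector). Calling `model.summary()` on the real model is the authoritative way to confirm the figures.

```python
vocab_size = 20000   # max_features
embed_dim = 128      # output size of the Embedding layer
lstm_units = 128

# One embed_dim-sized vector per word index.
embedding_params = vocab_size * embed_dim

# Four gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * (lstm_units * (embed_dim + lstm_units) + lstm_units)

# One weight per LSTM unit plus a single bias.
dense_params = lstm_units * 1 + 1

total = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total)
# 2560000 131584 129 2691713
```

Almost all of the parameters sit in the embedding table, so the vocabulary size (`max_features`) dominates the model's memory footprint.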


Notice that the dense layer uses a sigmoid activation function because there are only two classes to predict, positive or negative, so a single output between 0 and 1 is enough. We are almost done; the next step is to compile our model.

In [8]:
# compile model by setting learning procedure, objective function and evaluation metric
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
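
To illustrate how the sigmoid output and the `binary_crossentropy` loss fit together, here is a small standard-library sketch for a single sample. The function names and the clipping constant `eps` are illustrative assumptions; Keras computes the loss on whole batches and with its own numerical safeguards.

```python
import math


def sigmoid(z):
    """Squash a raw score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one sample; y_true is 0 or 1, p is the predicted probability of class 1."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


p = sigmoid(2.0)                            # raw score of 2.0 from the dense layer
print(round(p, 4))                          # 0.8808
print(round(binary_crossentropy(1, p), 4))  # 0.1269 (review really is positive)
print(round(binary_crossentropy(0, p), 4))  # 2.1269 (review is actually negative)
```

The loss is small when the predicted probability agrees with the label and grows quickly when the model is confidently wrong.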


Having compiled the model, we fit it on the training data for 15 epochs, using the test set as validation data.

In [9]:
# fit the model
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=15,
          validation_data=(x_test, y_test))


Train on 25000 samples, validate on 25000 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<keras.callbacks.History at 0xb24716590>
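
For a feel of how much work the fit call above does, the number of weight updates follows from the sample count, batch size and epoch count. This is plain arithmetic on the values used in this notebook and assumes one optimizer step per batch.

```python
import math

n_train = 25000
batch_size = 32
epochs = 15

steps_per_epoch = math.ceil(n_train / batch_size)             # batches per epoch
last_batch = n_train - (steps_per_epoch - 1) * batch_size     # size of the final, smaller batch
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 782 8 11730
```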

Finally, we evaluate the model on the test set. The model's weights were never updated on these samples, although the same set served as validation data during training.

In [10]:
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)


('Test score:', 1.248509069707766)
('Test accuracy:', 0.80484)
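
A test loss above 1 alongside roughly 80% accuracy can look contradictory. One plausible reading, offered as an assumption because the run above logs no per-epoch metrics, is that the model makes some very confident wrong predictions, which cross-entropy penalises heavily while accuracy does not. The toy labels and probabilities below are made up purely to illustrate that effect.

```python
import math


def mean_bce(labels, probs, eps=1e-7):
    """Average binary cross-entropy over a list of samples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def accuracy(labels, probs, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(labels, probs))
    return hits / len(labels)


labels = [1, 1, 1, 1, 0]
cautious = [0.7, 0.7, 0.7, 0.7, 0.7]             # wrong on the last sample, mildly
overconfident = [0.99, 0.99, 0.99, 0.99, 0.999]  # wrong on the last sample, very confidently

for name, probs in [("cautious", cautious), ("overconfident", overconfident)]:
    print(name, accuracy(labels, probs), round(mean_bce(labels, probs), 3))
```

Both toy models reach the same accuracy, but the overconfident one has a much higher mean loss, so the two numbers reported by `model.evaluate` measure different things.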
