# Recurrent Neural Networks with Keras
We are going to use an RNN to do sentiment analysis on full-text movie reviews!

Since understanding written language requires keeping track of all the words in a sentence, we need a recurrent neural network to keep a "memory" of the words that have come before as it "reads" sentences over time.
In particular, we'll use LSTM (Long Short-Term Memory) cells because we don't really want to "forget" words too quickly - words early on in a sentence can affect the meaning of that sentence significantly.
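
To make the "memory" idea a bit more concrete, here is a minimal pure-Python sketch of a single LSTM step with scalar state. It is only an illustration: the `lstm_step` function, the weight dictionary and the toy weight values are hypothetical, and a real Keras LSTM layer works on vectors with learned weight matrices rather than single numbers.

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step with scalar input, hidden state and cell state.

    `w` is a hypothetical dict holding, for each gate, an input weight,
    a recurrent weight and a bias: w[gate] = (w_x, w_h, b).
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old memory to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the memory to expose
    g = math.tanh(pre("candidate"))   # the new information itself

    c = f * c_prev + i * g            # updated cell state (the "memory")
    h = o * math.tanh(c)              # updated hidden state (the output)
    return h, c


# Toy weights (assumed values, not learned): a large forget-gate bias and a
# very negative input-gate bias keep the cell state almost unchanged.
weights = {
    "forget": (0.0, 0.0, 5.0),
    "input": (0.0, 0.0, -5.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0
for x in [0.3, -0.8, 0.5]:
    h, c = lstm_step(x, h, c, weights)
print(round(c, 3))
```

With these toy weights the cell state stays close to its starting value of 1.0 after three steps, which is the "don't forget too quickly" behavior in miniature.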

In [8]:
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,LSTM,Embedding
from tensorflow.keras.datasets import imdb

Now import our training and testing data. We specify that we only care about the 20,000 most frequent words in the dataset in order to keep things somewhat manageable. The dataset includes 25,000 training reviews and 25,000 testing reviews.

In [9]:
(x_train,y_train),(x_test,y_test) = imdb.load_data(num_words=20000)

Let's get a feel for what this data looks like. Let's look at the first training feature, which should represent a written movie review:

In [10]:
x_train[0]

[1,
 14,
 22,
 16,
 43,
 530,
 973,
 1622,
 1385,
 65,
 458,
 4468,
 66,
 3941,
 4,
 173,
 36,
 256,
 5,
 25,
 100,
 43,
 838,
 112,
 50,
 670,
 2,
 9,
 35,
 480,
 284,
 5,
 150,
 4,
 172,
 112,
 167,
 2,
 336,
 385,
 39,
 4,
 172,
 4536,
 1111,
 17,
 546,
 38,
 13,
 447,
 4,
 192,
 50,
 16,
 6,
 147,
 2025,
 19,
 14,
 22,
 4,
 1920,
 4613,
 469,
 4,
 22,
 71,
 87,
 12,
 16,
 43,
 530,
 38,
 76,
 15,
 13,
 1247,
 4,
 22,
 17,
 515,
 17,
 12,
 16,
 626,
 18,
 19193,
 5,
 62,
 386,
 12,
 8,
 316,
 8,
 106,
 5,
 4,
 2223,
 5244,
 16,
 480,
 66,
 3785,
 33,
 4,
 130,
 12,
 16,
 38,
 619,
 5,
 25,
 124,
 51,
 36,
 135,
 48,
 25,
 1415,
 33,
 6,
 22,
 12,
 215,
 28,
 77,
 52,
 5,
 14,
 407,
 16,
 82,
 10311,
 8,
 4,
 107,
 117,
 5952,
 15,
 256,
 4,
 2,
 7,
 3766,
 5,
 723,
 36,
 71,
 43,
 530,
 476,
 26,
 400,
 317,
 46,
 7,
 4,
 12118,
 1029,
 13,
 104,
 88,
 4,
 381,
 15,
 297,
 98,
 32,
 2071,
 56,
 26,
 141,
 6,
 194,
 7486,
 18,
 4,
 226,
 22,
 21,
 134,
 476,
 26,
 480,
 5,
 144,
 30,

That doesn't look like a movie review! But this dataset has spared you a lot of trouble: the words have already been converted to integer indices. The actual letters that make up a word don't really matter as far as our model is concerned. What matters are the words themselves, and our model needs numbers to work with, not letters.
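
As a rough illustration of this kind of encoding, the sketch below builds a frequency-ranked vocabulary from a toy corpus and turns a sentence into integers. The corpus, the `build_vocab` and `encode` helpers and the tiny vocabulary size are made up for the example, and the numbering is simplified. The reserved indices (0 for padding, 1 for start-of-sequence, 2 for out-of-vocabulary words) follow the defaults of `imdb.load_data`, which is why the review above starts with 1 and contains several 2s.

```python
from collections import Counter

PAD, START, OOV = 0, 1, 2   # reserved indices, as in imdb.load_data's defaults
FIRST_WORD_INDEX = 3        # toy choice: real words are numbered from here


def build_vocab(texts, num_words):
    """Map the `num_words` most frequent words to integer indices."""
    counts = Counter(word for text in texts for word in text.lower().split())
    ranked = [word for word, _ in counts.most_common(num_words)]
    return {word: FIRST_WORD_INDEX + rank for rank, word in enumerate(ranked)}


def encode(text, vocab):
    """Turn a sentence into a list of indices, starting with START."""
    return [START] + [vocab.get(word, OOV) for word in text.lower().split()]


# Hypothetical three-review corpus, just for illustration.
corpus = [
    "the movie was great",
    "the movie was terrible",
    "the acting was great",
]
vocab = build_vocab(corpus, num_words=4)
encoded = encode("the plot was great", vocab)
print(encoded)
```

The word "plot" never appears in the toy corpus, so it is encoded as the out-of-vocabulary index 2, just like rare words in the real dataset.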

In [11]:
# Label for the first review (1 = positive, 0 = negative)
y_train[0]

1

In [12]:
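# Pad or truncate every review to exactly 80 tokens so all inputs have the same length.
# By default, pad_sequences pads with zeros at the front and keeps the last 80 tokens.
X_train = sequence.pad_sequences(x_train, maxlen=80)
X_test = sequence.pad_sequences(x_test, maxlen=80)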
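
To see what the padding step does to each review, here is a small pure-Python sketch that mimics the default behavior of `pad_sequences` (`padding='pre'`, `truncating='pre'`). The `pad_pre` helper is a hypothetical stand-in written for this example, not part of Keras, and it assumes `maxlen` is at least 1.

```python
def pad_pre(sequences, maxlen, value=0):
    """Pad at the front and truncate from the front, keeping the last `maxlen` items.

    Assumes maxlen >= 1.
    """
    padded = []
    for seq in sequences:
        tail = list(seq)[-maxlen:]                          # keep the final maxlen tokens
        padded.append([value] * (maxlen - len(tail)) + tail)
    return padded


reviews = [[1, 14, 22, 16, 43], [1, 9]]
print(pad_pre(reviews, maxlen=3))
# [[22, 16, 43], [0, 1, 9]]
```

Long reviews lose their beginning and short ones gain leading zeros, so with `maxlen=80` the model only ever sees the last 80 words of a review.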

Now let's set up our neural network model! Considering how complicated an LSTM recurrent neural network is under the hood, it's really amazing how easy this is to do with Keras.

We will start with an Embedding layer. This step converts the integer word indices into dense vectors of fixed size, which are better suited for a neural network. You generally see this in conjunction with index-based text data like we have here. The 20,000 is the vocabulary size (remember we said we only wanted the top 20,000 words) and 128 is the output dimension: each word is mapped to a 128-dimensional vector.
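
Conceptually, an embedding layer is a lookup table with one row per word index. The sketch below shows that idea in pure Python with toy sizes; the table values, the `embed` helper and the constants are assumptions for illustration only, and in the real layer the table is learned during training.

```python
import random

random.seed(0)

VOCAB_SIZE = 10   # toy value; the model in this notebook uses 20,000
EMBED_DIM = 4     # toy value; the model in this notebook uses 128

# One row of EMBED_DIM numbers per word index. In a real Embedding layer
# these values start out random and are adjusted during training.
embedding_table = [
    [random.uniform(-0.05, 0.05) for _ in range(EMBED_DIM)]
    for _ in range(VOCAB_SIZE)
]


def embed(indices):
    """Replace each word index with its row of the embedding table."""
    return [embedding_table[i] for i in indices]


vectors = embed([1, 7, 7, 3])
print(len(vectors), len(vectors[0]))   # 4 4
```

The same index always maps to the same vector, so repeated words in a review get identical representations.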

Next we just have to set up an LSTM layer for the RNN itself. It's that easy. We give it 128 units (this happens to equal the Embedding output size here, although the two are not required to match), and add dropout terms to avoid overfitting, which RNNs are particularly prone to.

Finally we just need to boil it down to a single neuron with a sigmoid activation function to choose our binary sentiment classification of 0 or 1.
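
As a sanity check on the architecture just described, the number of trainable parameters in each layer can be worked out by hand. The sketch below uses the standard formulas for an embedding table, an LSTM layer (four gates, each with input weights, recurrent weights and a bias) and a dense layer; the variable names are only for illustration. Assuming Keras defaults, the totals should match what `model.summary()` reports once the model is built.

```python
vocab_size = 20000   # number of distinct word indices
embed_dim = 128      # length of each word vector
lstm_units = 128     # size of the LSTM hidden state

# Embedding: one embed_dim-long vector per vocabulary entry.
embedding_params = vocab_size * embed_dim

# LSTM: 4 gates, each with weights for the input, weights for the
# previous hidden state, and one bias per unit.
lstm_params = 4 * (embed_dim * lstm_units + lstm_units * lstm_units + lstm_units)

# Dense: one weight per LSTM unit plus a single bias.
dense_params = lstm_units * 1 + 1

total = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total)
# 2560000 131584 129 2691713
```

Almost all of the parameters live in the embedding table, which is one reason for limiting the vocabulary to 20,000 words.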

In [13]:
model = Sequential()
model.add(Embedding(20000, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor


As this is a binary classification problem, we'll use the binary_crossentropy loss function. The Adam optimizer is usually a good choice (feel free to try others).
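
For intuition about what this loss measures, here is a small pure-Python sketch of mean binary cross-entropy. The `binary_crossentropy` function below is a hand-written illustration, not the Keras implementation, and the clipping constant `eps` is an assumed value chosen only to avoid taking the log of zero.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.8, 0.2]
confident_wrong = [0.1, 0.9, 0.2, 0.8]
print(binary_crossentropy(labels, confident_right))
print(binary_crossentropy(labels, confident_wrong))
```

Predictions that are confidently wrong are penalized far more heavily than predictions that are confidently right, so the second printed value is much larger than the first.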

In [14]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


Now let's kick off the training.

In [16]:
model.fit(X_train, y_train,
          batch_size=32,
          epochs=15,
          verbose=2,
          validation_data=(X_test, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/15
25000/25000 - 592s - loss: 0.4579 - acc: 0.7862 - val_loss: 0.4139 - val_acc: 0.8160
Epoch 2/15
25000/25000 - 469s - loss: 0.2964 - acc: 0.8799 - val_loss: 0.4155 - val_acc: 0.8123
Epoch 3/15
25000/25000 - 498s - loss: 0.2134 - acc: 0.9185 - val_loss: 0.4466 - val_acc: 0.8260
Epoch 4/15
25000/25000 - 478s - loss: 0.1513 - acc: 0.9432 - val_loss: 0.4910 - val_acc: 0.8274
Epoch 5/15
25000/25000 - 584s - loss: 0.1076 - acc: 0.9610 - val_loss: 0.5300 - val_acc: 0.8075
Epoch 6/15
25000/25000 - 503s - loss: 0.0823 - acc: 0.9704 - val_loss: 0.6297 - val_acc: 0.8128
Epoch 7/15
25000/25000 - 474s - loss: 0.0559 - acc: 0.9813 - val_loss: 0.7650 - val_acc: 0.8177
Epoch 8/15
25000/25000 - 529s - loss: 0.0446 - acc: 0.9848 - val_loss: 0.7584 - val_acc: 0.8119
Epoch 9/15
25000/25000 - 612s - loss: 0.0441 - acc: 0.9852 - val_loss: 0.8091 - val_acc: 0.8181
Epoch 10/15
25000/25000 - 594s - loss: 0.0277 - acc: 0.9910 - val_loss: 0.9772 - val_a

<tensorflow.python.keras.callbacks.History at 0x21e4f2f70f0>
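
One way to read the log above is to track the validation loss per epoch. The sketch below copies the `loss` and `val_loss` values for the ten epochs shown in the output and finds where the validation loss was lowest; the list and variable names are only for illustration. A validation loss that keeps rising while the training loss keeps falling is the usual sign of overfitting, the tendency the dropout terms were meant to counter.

```python
# Values copied from the training log above (epochs 1 to 10).
val_loss = [0.4139, 0.4155, 0.4466, 0.4910, 0.5300,
            0.6297, 0.7650, 0.7584, 0.8091, 0.9772]
train_loss = [0.4579, 0.2964, 0.2134, 0.1513, 0.1076,
              0.0823, 0.0559, 0.0446, 0.0441, 0.0277]

# Epoch numbers are 1-based, list positions are 0-based.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print("Lowest validation loss at epoch", best_epoch)
# Lowest validation loss at epoch 1

# Gap between validation and training loss for each epoch.
gaps = [round(v - t, 4) for v, t in zip(val_loss, train_loss)]
print(gaps)
```

In this run the validation loss is lowest after the very first epoch, and the gap between the two losses widens with every epoch after that.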

OK, let's evaluate our model's accuracy on the test set. The score reported alongside it is the loss (binary cross-entropy):

In [17]:
score,acc = model.evaluate(X_test,y_test,
                          batch_size=32,
                          verbose=0)
print("Score: %f,Accuracy: %f"%(score,acc))

Score: 1.121280,Accuracy: 0.809120
