In [1]:
import keras
keras.__version__

Using TensorFlow backend.


'2.2.0'

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

# Understanding recurrent neural networks

## 1 - A first recurrent layer in Keras

Now let's try to use a simple RNN on the IMDB movie review classification problem.

### 1.1 - Preprocess the data

In [2]:
from keras.datasets import imdb
from keras.preprocessing import sequence

max_features = 10000  # number of words to consider as features
maxlen = 500  # truncate texts to this many words (among the top max_features most common words)
batch_size = 32

print('Loading data...')
(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=max_features)
print(len(input_train), 'train sequences')
print(len(input_test), 'test sequences')

print('Pad sequences (samples x time)')
input_train = sequence.pad_sequences(input_train, maxlen=maxlen)
input_test = sequence.pad_sequences(input_test, maxlen=maxlen)
print('input_train shape:', input_train.shape)
print('input_test shape:', input_test.shape)

Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
input_train shape: (25000, 500)
input_test shape: (25000, 500)
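
The `(25000, 500)` shapes above follow from how `pad_sequences` treats each review. As a rough illustration, the pure-Python helper below (`pad_pre` is a hypothetical name, not part of Keras) mimics the Keras defaults `padding='pre'` and `truncating='pre'` on a single sequence, assuming `maxlen > 0`: shorter sequences are left-padded with zeros, and longer ones keep only their last `maxlen` entries.

```python
def pad_pre(seq, maxlen, value=0):
    """Hypothetical stand-in for pad_sequences on one sequence,
    assuming the defaults padding='pre' and truncating='pre'."""
    trimmed = seq[-maxlen:]  # keep at most the last maxlen items
    return [value] * (maxlen - len(trimmed)) + trimmed


short_review = [11, 12, 13]
long_review = [1, 2, 3, 4, 5, 6, 7]

print(pad_pre(short_review, maxlen=5))  # [0, 0, 11, 12, 13]
print(pad_pre(long_review, maxlen=5))   # [3, 4, 5, 6, 7]
```

With `maxlen = 500`, every review therefore becomes a row of exactly 500 integer word indices.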


### 1.2 - Train a simple recurrent network

Let's train a simple recurrent network using an `Embedding` layer and a `SimpleRNN` layer:

_QUESTION_ 

Create an architecture with the following elements:
- Embedding: input_dim=max_features, output_dim=32
- SimpleRNN: units=32
- Dense: units=1, activation="sigmoid"

Links to documentation:

- Embedding: https://keras.io/layers/embeddings/
- SimpleRNN: https://keras.io/layers/recurrent/#simplernn
- Dense: https://keras.io/layers/core/#dense

In [None]:
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN, Dense

model = Sequential()
# <YOUR CODE HERE>

In [None]:
model.summary()

In [None]:
assert model.count_params() == 322113
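
To see where the expected count of 322113 comes from, the parameters can be tallied by hand. This is a minimal sketch in plain Python; the helper names are hypothetical, and it assumes each layer uses a bias term, as the Keras defaults do.

```python
def embedding_params(input_dim, output_dim):
    # one output_dim-sized vector per word index
    return input_dim * output_dim


def simple_rnn_params(input_dim, units):
    # input kernel + recurrent kernel + bias
    return input_dim * units + units * units + units


def dense_params(input_dim, units):
    # weight matrix + bias
    return input_dim * units + units


max_features = 10000
total = (embedding_params(max_features, 32)
         + simple_rnn_params(32, 32)
         + dense_params(32, 1))
print(total)  # 322113
```

Nearly all of the parameters (320000) sit in the `Embedding` layer; the recurrent layer itself contributes only 2080.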

In [None]:
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)
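
With `validation_split=0.2`, Keras holds out the last 20% of the samples passed to `fit` (selected before any shuffling) and validates on them after each epoch. The arithmetic can be sketched as follows; `split_index` is a hypothetical helper, not a Keras function.

```python
def split_index(num_samples, validation_split):
    """Cut point: samples [0, idx) are trained on, [idx, n) are held out."""
    return int(num_samples * (1.0 - validation_split))


num_samples = 25000
idx = split_index(num_samples, 0.2)
print(idx, num_samples - idx)  # 20000 5000
```

So the model here trains on 20000 reviews and reports `val_loss` and `val_acc` on the remaining 5000.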

### 1.3 - Display the results

Let's display the training and validation loss and accuracy:

In [None]:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

As a reminder, our very first naive approach to this dataset reached 88% test accuracy. Unfortunately, our small recurrent network falls short of this baseline (only up to 85% validation accuracy).

Part of the problem is that our inputs only consider 500 words of each review rather than the full sequences (with its default settings, `pad_sequences` keeps the last `maxlen` words), so our RNN has access to less information than our earlier baseline model. The remainder of the problem is simply that `SimpleRNN` isn't very good at processing long sequences, such as text. Other types of recurrent layers perform much better.
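
One commonly cited reason for this weakness is the vanishing gradient problem: the influence of an early timestep on the final state tends to shrink multiplicatively with sequence length. The toy below is not the Keras `SimpleRNN` but a one-unit analogue with hand-picked weights (`w`, `u`) and an arbitrary input signal, all of which are assumptions made for illustration. It tracks, in log space, how strongly the final state depends on the initial state.

```python
import math


def log10_sensitivity(steps, w=0.5, u=1.0):
    """log10 of d h_T / d h_0 for a one-unit tanh RNN,
    h_t = tanh(w * h_{t-1} + u * x_t), unrolled for `steps` timesteps."""
    h = 0.0
    log_sens = 0.0
    for t in range(1, steps + 1):
        x_t = math.sin(t)  # arbitrary bounded toy input
        h = math.tanh(w * h + u * x_t)
        # chain rule: each step multiplies the derivative by w * (1 - h_t^2)
        log_sens += math.log10(w * (1.0 - h * h))
    return log_sens


for steps in (10, 100, 500):
    print(steps, round(log10_sensitivity(steps), 1))
```

Because every factor is at most `w = 0.5`, the sensitivity after 500 steps is bounded by `0.5 ** 500`, which is far too small to carry a useful training signal in this toy setting.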

Let's take a look at some more advanced layers.

## 2 - A concrete LSTM example in Keras

Now let's switch to more practical concerns: we will set up a model using an LSTM layer and train it on the IMDB data.

The network is similar to the one with `SimpleRNN` that we just presented. We only specify the output dimensionality of the LSTM layer, and leave every other argument (there are lots) to the Keras defaults.

### 2.1 - Create the model

_QUESTION_ 

Create an architecture with the following elements:
- Embedding: input_dim=max_features, output_dim=32
- LSTM: units=32
- Dense: units=1, activation="sigmoid"

Links to documentation:

- Embedding: https://keras.io/layers/embeddings/
- LSTM: https://keras.io/layers/recurrent/#lstm
- Dense: https://keras.io/layers/core/#dense

In [None]:
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
# <YOUR CODE HERE>

In [None]:
model.summary()

In [None]:
assert model.count_params() == 328353
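
The expected count of 328353 can again be checked by hand. An LSTM layer holds four sets of weights (input gate, forget gate, output gate, and candidate cell state), each shaped like the weights of a simple recurrent layer. A minimal sketch, with a hypothetical helper name and assuming bias terms are enabled as in the Keras defaults:

```python
def lstm_params(input_dim, units):
    # four weight sets, each with an input kernel, a recurrent kernel and a bias
    return 4 * (input_dim * units + units * units + units)


max_features = 10000
embedding = max_features * 32  # Embedding: one 32-d vector per word index
dense = 32 * 1 + 1             # Dense: weights + bias
total = embedding + lstm_params(32, 32) + dense
print(total)  # 328353
```

The LSTM layer has 8320 parameters, four times the 2080 of a 32-unit simple recurrent layer with the same input size.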

### 2.2 - Compile and train the model

In [None]:
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['acc'])

In [None]:
history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)

### 2.3 - Display the results

In [None]:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
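
Alongside the plots, it can be handy to read the peak validation accuracy directly from the history. The sketch below uses a hypothetical helper, `best_epoch`, on a dictionary shaped like `history.history`; the numbers are placeholders for illustration only, not real training results.

```python
def best_epoch(history_dict, metric='val_acc'):
    """Return (1-based epoch, value) at which `metric` peaks."""
    values = history_dict[metric]
    best = max(range(len(values)), key=values.__getitem__)
    return best + 1, values[best]


# Placeholder values, not actual results from this notebook.
toy_history = {'val_acc': [0.70, 0.80, 0.78]}
print(best_epoch(toy_history))  # (2, 0.8)
```

After training, the same call would be `best_epoch(history.history)`.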