# Deep Learning - Recurrent Neural Networks

In [0]:
import tensorflow as tf
import tensorflow.keras as keras
from keras.preprocessing import sequence
from keras.layers import SimpleRNN, GRU, LSTM, Embedding, Dense
from keras import Sequential
from keras.datasets import imdb

We will be using the IMDB dataset outlined in the keras documentation [here](https://keras.io/datasets/#imdb-movie-reviews-sentiment-classification). We will be applying a supervised learning application to text where we predict the sentiment of the IMDB reviews.

Take a look at the imports above. For the RNN based imports see the [RNN documentation](https://keras.io/layers/recurrent). For preprocessing using `sequence` see the [sequence documentation](https://keras.io/preprocessing/sequence). For Embedding, see the [Embedding documentation](https://keras.io/layers/embeddings/).

In [3]:
maxlen = 100 # Only use sentences up to this many words
n = 20000 # Only use the most frequent n words
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=n)

Downloading data from https://s3.amazonaws.com/text-datasets/imdb.npz


In [4]:
x_train.shape

(25000,)

In [5]:
x_test.shape

(25000,)

In [6]:
for i in range(10):
    print(f"Element {i} has a length of {len(x_train[i])}")

Element 0 has a length of 218
Element 1 has a length of 189
Element 2 has a length of 141
Element 3 has a length of 550
Element 4 has a length of 147
Element 5 has a length of 43
Element 6 has a length of 123
Element 7 has a length of 562
Element 8 has a length of 233
Element 9 has a length of 130


In [7]:
x_train[0][:10]

[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65]

In [0]:
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)

In [9]:
x_train.shape, x_test.shape

((25000, 100), (25000, 100))

Each data sample is a sequence of integers that represent the index of the word in our vocabulary. This saves on storage when compared to a vector that's as long as our vocabulary with all 0's and just one 1 as discussed in the lecture. We will be using the [Embedding layer](https://keras.io/layers/embeddings/) to adapt this for our neural network.

In [10]:
print(f"All values of the targets are integers with the following max and min values")
print(f"{y_train.max()}, {y_train.min()}")

All values of the targets are integers with the following max and min values
1, 0


We will build three networks, using basic RNNs, GRUs and LSTMs. We will then compare their performance in predicting the classes of reviews appropriately.

In [0]:
# Define simple_layers which will go into a Sequential model saved as my_simple
# Here we are creating a simple RNN using one SimpleRNN layer with a dropout and recurrent_dropout
# You will need to use an Embedding layer before that to convert the data appropriately
# Determine an embedding size and use that for your SimpleRNN layer's output dimensions as well
# Finally, create an output layer that applies to our dataset task of binary classification

# YOUR CODE HERE
simple_layers = [
    Embedding(20000, 100),# input_length=10),
    SimpleRNN(100, dropout=0.01, recurrent_dropout=0.001),
    Dense(1, activation='tanh'),
]

my_simple = Sequential(simple_layers)

#my_simple.compile()
#output_array = 

In [0]:
assert len(simple_layers) == 3
assert isinstance(simple_layers[0], Embedding)
assert isinstance(simple_layers[1], SimpleRNN)
assert isinstance(simple_layers[2], Dense)
assert simple_layers[0].output_dim == simple_layers[1].units
assert simple_layers[1].dropout > 0
assert simple_layers[1].recurrent_dropout > 0
assert my_simple

In [13]:
%%time
my_simple.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
my_simple.fit(x_train, y_train, batch_size=32, epochs=1)

Epoch 1/1
CPU times: user 1min 39s, sys: 3.88 s, total: 1min 43s
Wall time: 1min 1s


In [0]:
# Define gru_layers which will go into a Sequential model saved as my_gru
# Here we are creating an RNN using GRUs, add 1 GRU layer with a dropout and recurrent_dropout
# You will need to use an Embedding layer before that to convert the data appropriately
# Determine an embedding size and use that for your GRU layer's output dimensions as well
# Finally, create an output layer that applies to our dataset task of binary classification

# YOUR CODE HERE
gru_layers = [
    Embedding(20000, 100),# input_length=10),
    GRU(100, dropout=0.01, recurrent_dropout=0.001),
    Dense(1, activation='tanh'),
]
my_gru = Sequential(gru_layers)

In [0]:
assert len(gru_layers) == 3
assert isinstance(gru_layers[0], Embedding)
assert isinstance(gru_layers[1], GRU)
assert isinstance(gru_layers[2], Dense)
assert gru_layers[0].output_dim == gru_layers[1].units
assert gru_layers[1].dropout > 0
assert gru_layers[1].recurrent_dropout > 0
assert my_gru

In [16]:
%%time
my_gru.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
my_gru.fit(x_train, y_train, batch_size=32, epochs=1)

Epoch 1/1
CPU times: user 3min 21s, sys: 6.42 s, total: 3min 27s
Wall time: 1min 54s


In [0]:
# Define lstm_layers which will go into a Sequential model saved as my_lstm
# Here we are creating an RNN using LSTMs, add 1 LSTM layer with a dropout and recurrent_dropout
# You will need to use an Embedding layer before that to convert the data appropriately
# Determine an embedding size and use that for your LSTM layer's output dimensions as well
# Finally, create an output layer that applies to our dataset task of binary classification

# YOUR CODE HERE
lstm_layers = [
    Embedding(20000,100),
    LSTM(100, dropout=0.01, recurrent_dropout=0.001),
    Dense(1, activation='tanh')
]
my_lstm = Sequential(lstm_layers)

In [0]:
assert len(lstm_layers) == 3
assert isinstance(lstm_layers[0], Embedding)
assert isinstance(lstm_layers[1], LSTM)
assert isinstance(lstm_layers[2], Dense)
assert lstm_layers[0].output_dim == lstm_layers[1].units
assert lstm_layers[1].dropout > 0
assert lstm_layers[1].recurrent_dropout > 0
assert my_lstm

In [19]:
%%time
my_lstm.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
my_lstm.fit(x_train, y_train, batch_size=32, epochs=1)

Epoch 1/1
CPU times: user 3min 58s, sys: 7.62 s, total: 4min 6s
Wall time: 2min 14s


In [0]:
# Evaluate your models on the test set and save the loss and accuracies to 
# the appropriate variables:
# model_name_loss, model_name_acc

# YOUR CODE HERE train fit calculate
x,y = x_test,y_test
my_simple_loss, my_simple_acc = my_simple.evaluate(x,y,verbose=0)

my_gru_loss, my_gru_acc = my_gru.evaluate(x,y,verbose=0)

my_lstm_loss, my_lstm_acc = my_lstm.evaluate(x,y,verbose=0)

In [23]:
print(f"Your simple model achieved an accuracy of {my_simple_acc:.2}.")
print(f"Your GRU model achieved an accuracy of {my_gru_acc:.2}.")
print(f"Your LSTM model achieved an accuracy of {my_lstm_acc:.2}.")

Your simple model achieved an accuracy of 0.15.
Your GRU model achieved an accuracy of 0.76.
Your LSTM model achieved an accuracy of 0.82.


Note that we are only running these models with 1 layer and training them for only 1 epoch. We can easily achieve better results by stacking multiple layers but the model would take a much longer time to train.

In [0]:
assert my_simple_acc > 0.1
assert my_gru_acc > 0.6
assert my_lstm_acc > 0.7

## Feedback

In [0]:
def feedback():
    """Provide feedback on the contents of this exercise
    
    Returns:
        string
    """
    # YOUR CODE HERE
    raise NotImplementedError()