## CSCI 470 Activities and Case Studies

1. For all activities, you are allowed to collaborate with a partner.
2. For case studies, you should work individually and are **not** allowed to collaborate.

By filling out this notebook and submitting it, you acknowledge that you are aware of the above policies and are agreeing to comply with them.

Some considerations with regard to how these notebooks will be graded:

1. You can add more notebook cells or edit existing notebook cells other than "# YOUR CODE HERE" to test out or debug your code. We actually highly recommend you do so to gain a better understanding of what is happening. However, during grading, **these changes are ignored**. 
2. You must ensure that all your code for the particular task is available in the cells that say "# YOUR CODE HERE"
3. Every cell that says "# YOUR CODE HERE" is followed by a "raise NotImplementedError". You need to remove that line. During grading, if an error occurs then you will not receive points for your work in that section.
4. If your code passes the "assert" statements, then no output will result. If your code fails the "assert" statements, you will get an "AssertionError". Getting an assertion error means you will not receive points for that particular task.
5. If you edit the "assert" statements to make your code pass, they will still fail when they are graded since the "assert" statements will revert to the original. Make sure you don't edit the assert statements.
6. We may sometimes have "hidden" tests for grading. This means that passing the visible "assert" statements is not sufficient. The "assert" statements are there as a guide but you need to make sure you understand what you're required to do and ensure that you are doing it correctly. Passing the visible tests is necessary but not sufficient to get the grade for that cell.
7. When you are asked to define a function, make sure you **don't** use any variables outside of the parameters passed to the function. You can think of the parameters being passed to the function as a hint. Make sure you're using all of those variables.
8. Finally, **make sure you run "Kernel > Restart and Run All"** and pass all the asserts before submitting. If you don't restart the kernel, code that you ran earlier and then deleted may still be in memory, and it may be the only reason your asserts are passing.

# Deep Learning - Recurrent Neural Networks

In [1]:
import tensorflow as tf
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.layers import SimpleRNN, GRU, LSTM, Embedding, Dense
from tensorflow.keras import Sequential
from tensorflow.keras.datasets import imdb

We will be using the IMDB dataset outlined in the keras documentation [here](https://keras.io/datasets/#imdb-movie-reviews-sentiment-classification). We will apply supervised learning to text: given a review, we predict its sentiment (positive or negative).

Take a look at the imports above. For the RNN-based imports, see the [RNN documentation](https://keras.io/layers/recurrent). For preprocessing using `sequence`, see the [sequence documentation](https://keras.io/preprocessing/sequence). For Embedding, see the [Embedding documentation](https://keras.io/layers/embeddings/).

From the Keras documentation, linked above:
>"This is a dataset of 25,000 movies reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a list of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words."

In [2]:
maxlen = 100 # Pad or truncate every review to exactly this many words
n = 20000 # Only use the most frequent n words
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=n)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz


In [3]:
x_train.shape

(25000,)

In [4]:
x_test.shape

(25000,)

In [5]:
for i in range(10):
    print(f"Element {i} has a length of {len(x_train[i])}")

Element 0 has a length of 218
Element 1 has a length of 189
Element 2 has a length of 141
Element 3 has a length of 550
Element 4 has a length of 147
Element 5 has a length of 43
Element 6 has a length of 123
Element 7 has a length of 562
Element 8 has a length of 233
Element 9 has a length of 130


In [6]:
x_train[0][:10]

[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65]
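The leading `1` in the slice above is not a word. Assuming the default arguments of `imdb.load_data` (`start_char=1`, `oov_char=2`, `index_from=3`), index 1 marks the start of a review, index 2 replaces any word outside the `num_words` most frequent ones, and the stored word indices are shifted up by 3 to make room for these reserved values. The sketch below imitates only the vocabulary cap on a made-up review; `cap_vocabulary` and `toy_review` are hypothetical names, not part of Keras.

```python
# Reserved indices under the assumed Keras defaults for imdb.load_data.
START_CHAR = 1  # marks the beginning of every review
OOV_CHAR = 2    # stands in for any word outside the kept vocabulary


def cap_vocabulary(review, num_words, oov_char=OOV_CHAR):
    """Replace every index >= num_words with the out-of-vocabulary marker."""
    return [idx if idx < num_words else oov_char for idx in review]


# Hypothetical review: a start marker followed by words of varying rarity.
toy_review = [START_CHAR, 14, 22, 25000, 530, 87000]
capped = cap_vocabulary(toy_review, num_words=20000)
print(capped)  # [1, 14, 22, 2, 530, 2]
```

Because rarer words map to larger integers, a single comparison against `num_words` is enough to drop them.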

In [7]:
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)

In [8]:
x_train.shape, x_test.shape

((25000, 100), (25000, 100))
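Every review now has exactly `maxlen` entries. Assuming the `pad_sequences` defaults (`padding='pre'`, `truncating='pre'`, `value=0`), short reviews are left-padded with zeros and long reviews keep only their last `maxlen` words. The following is a minimal pure-Python imitation of that behavior for a single sequence; `pad_pre` is a hypothetical helper, not the Keras function.

```python
def pad_pre(seq, maxlen, value=0):
    """Keep the last `maxlen` items of `seq`, then pad on the left with `value`."""
    trimmed = list(seq)[-maxlen:]
    return [value] * (maxlen - len(trimmed)) + trimmed


short_row = pad_pre([5, 6, 7], maxlen=5)
long_row = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short_row)  # [0, 0, 5, 6, 7]
print(long_row)   # [3, 4, 5, 6, 7]
```

Padding on the left keeps the real words closest to the final time step, which is the one the recurrent layer's output is read from.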

Each data sample is a sequence of integers, where each integer is the index of a word in our vocabulary. This takes far less storage than a one-hot encoding, which, as discussed in the lecture, represents each word as a vector as long as the vocabulary containing all 0's and a single 1. We will be using the [Embedding layer](https://keras.io/layers/embeddings/) to map these indices to dense vectors that our neural network can use.
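To see why integer indices lose nothing relative to one-hot vectors, note that multiplying a one-hot vector by a weight matrix simply selects one row of that matrix. An embedding lookup performs that row selection directly. The sketch below checks this on a made-up 4-word vocabulary; the matrix values and helper names are illustrative assumptions.

```python
# Hypothetical embedding matrix: 4 words, 3 dimensions per word.
embedding_matrix = [
    [0.0, 0.0, 0.0],
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
]


def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_mat(vec, mat):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


word = 2
via_one_hot = vec_mat(one_hot(word, len(embedding_matrix)), embedding_matrix)
via_lookup = embedding_matrix[word]
print(via_one_hot == via_lookup)  # True
```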

In [9]:
print("All values of the targets are integers with the following max and min values")
print(f"{y_train.max()}, {y_train.min()}")

All values of the targets are integers with the following max and min values
1, 0


We will build three networks, using basic RNNs, GRUs and LSTMs. We will then compare how accurately each one classifies the sentiment of the reviews.
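As a reference point for the three models, a basic recurrent layer updates its hidden state at each time step with the standard recurrence h_t = tanh(W x_t + U h_{t-1} + b). The following is a minimal pure-Python sketch of that recurrence; the weights and inputs are made-up values, and `rnn_step` is a hypothetical helper rather than anything from Keras.

```python
import math


def rnn_step(x_t, h_prev, W, U, b):
    """One step of a basic RNN: h_t = tanh(W x_t + U h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W[i][j] * x_t[j] for j in range(len(x_t)))
        total += sum(U[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t


# Hypothetical weights: 2 input features, 2 hidden units.
W = [[0.5, -0.3], [0.1, 0.8]]
U = [[0.2, 0.0], [0.0, 0.2]]
b = [0.0, 0.0]

# Feed a 3-step sequence through the cell, carrying the state forward.
h = [0.0, 0.0]
for x_t in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]:
    h = rnn_step(x_t, h, W, U, b)
print(h)
```

The same weights are reused at every time step; only the hidden state `h` changes as the sequence is consumed.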

In [56]:
# Define "simple_layers", a list of Keras layers, that you will then use to create a Sequential model
# saved as "my_simple".
# 
# Here you will create a simple RNN using one SimpleRNN layer with dropout and recurrent_dropout
# (see argument options in SimpleRNN documentation).
# 
# You will need to use an Embedding layer as the first layer (to convert the data appropriately)
# followed by the SimpleRNN layer. Select an embedding size of your choice, and use that for your
# SimpleRNN layer's output dimensions as well.
#
# Finally, create an output layer that applies to our dataset task of binary classification

simple_layers = [
    Embedding(input_dim=n, output_dim=64, input_length=maxlen),
    SimpleRNN(units=64, dropout=0.1, recurrent_dropout=0.01, activation="tanh"),
    Dense(1, activation="sigmoid"),
]
my_simple = Sequential(layers=simple_layers)

In [23]:
assert len(simple_layers) == 3
assert isinstance(simple_layers[0], Embedding)
assert isinstance(simple_layers[1], SimpleRNN)
assert isinstance(simple_layers[2], Dense)
assert simple_layers[0].output_dim == simple_layers[1].units
assert simple_layers[1].dropout > 0
assert simple_layers[1].recurrent_dropout > 0
assert my_simple
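In the Keras recurrent layers, `dropout` is applied to the layer's inputs and `recurrent_dropout` to the recurrent state, and both are active only during training. The sketch below illustrates the general idea with "inverted" dropout on a plain list: each value is zeroed with probability `rate`, and the survivors are scaled up so the expected sum is unchanged. `inverted_dropout` is a hypothetical helper, not the Keras implementation.

```python
import random


def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [v * keep_scale if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # seeded so the run is repeatable
activations = [1.0] * 10
dropped = inverted_dropout(activations, rate=0.2, rng=rng)
print(dropped)
```

With `rate=0.2`, every surviving entry becomes 1.25 and the rest become 0.0.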

In [57]:
%%time
my_simple.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
my_simple.fit(x_train, y_train, batch_size=32, epochs=1)

CPU times: user 1min 15s, sys: 3.64 s, total: 1min 18s
Wall time: 47.2 s
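The `compile` call above pairs a single sigmoid output with the `binary_crossentropy` loss. As a rough illustration of what that loss measures for one example, the sketch below squashes a raw score into a probability and computes the negative log-likelihood of the true label. The clipping constant `eps` and the sample score are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, p, eps=1e-7):
    """Negative log-likelihood of label `y_true` (0 or 1) under probability `p`."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


p = sigmoid(2.0)  # hypothetical raw output for one review
loss_if_positive = binary_crossentropy(1, p)
loss_if_negative = binary_crossentropy(0, p)
print(p, loss_if_positive, loss_if_negative)
```

A confident prediction is penalized lightly when it matches the label and heavily when it does not.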


In [54]:
# Define "gru_layers", a list of Keras layers, that you will then use to create a Sequential model
# saved as "my_gru".
#
# Here you will create an RNN using a GRU layer, with dropout and recurrent_dropout.
#
# Use an input Embedding layer and output Dense layer, as in the simple RNN model.

gru_layers = [
    Embedding(n, output_dim=64, input_length=maxlen),
    GRU(64, activation="tanh", dropout=0.2, recurrent_dropout=0.05),
    Dense(1, activation="sigmoid"),
]
my_gru = Sequential(layers=gru_layers)

In [26]:
assert len(gru_layers) == 3
assert isinstance(gru_layers[0], Embedding)
assert isinstance(gru_layers[1], GRU)
assert isinstance(gru_layers[2], Dense)
assert gru_layers[0].output_dim == gru_layers[1].units
assert gru_layers[1].dropout > 0
assert gru_layers[1].recurrent_dropout > 0
assert my_gru

In [55]:
%%time
my_gru.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
my_gru.fit(x_train, y_train, batch_size=32, epochs=1)

CPU times: user 3min 24s, sys: 8.79 s, total: 3min 33s
Wall time: 2min
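For intuition about what the GRU layer computes at each time step, the following scalar sketch uses one input feature and one hidden unit. It follows the common formulation in which the update gate `z` blends the previous state with a candidate state and the reset gate `r` controls how much of the previous state feeds the candidate. All weight values are made up, and `gru_step` is a hypothetical helper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(x_t, h_prev, p):
    """One scalar GRU step using the weights in dict `p`."""
    z = sigmoid(p["wz"] * x_t + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x_t + p["ur"] * h_prev + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x_t + p["uh"] * (r * h_prev) + p["bh"])
    return z * h_prev + (1.0 - z) * h_cand


# Hypothetical weights for illustration only.
params = {"wz": 0.5, "uz": 0.1, "bz": 0.0,
          "wr": 0.3, "ur": 0.2, "br": 0.0,
          "wh": 0.8, "uh": 0.4, "bh": 0.0}

h = 0.0
for x_t in [1.0, -0.5, 0.25]:
    h = gru_step(x_t, h, params)
print(h)
```

When the update gate saturates near 1, the previous state passes through almost unchanged, which is how the cell can carry information across many time steps.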


In [52]:
# Define "lstm_layers", a list of Keras layers, that you will then use to create a Sequential model
# saved as "my_lstm".
#
# Here you will create an RNN using an LSTM layer, again, with dropout and recurrent_dropout.
#
# Use an input Embedding layer and output Dense layer, as in the simple RNN and the GRU model.

lstm_layers = [
    Embedding(n, 64, input_length=maxlen),
    LSTM(64, activation="tanh", dropout=0.1, recurrent_dropout=0.03),
    Dense(1, activation="sigmoid"),
]
my_lstm = Sequential(layers=lstm_layers)

In [38]:
assert len(lstm_layers) == 3
assert isinstance(lstm_layers[0], Embedding)
assert isinstance(lstm_layers[1], LSTM)
assert isinstance(lstm_layers[2], Dense)
assert lstm_layers[0].output_dim == lstm_layers[1].units
assert lstm_layers[1].dropout > 0
assert lstm_layers[1].recurrent_dropout > 0
assert my_lstm

In [53]:
%%time
my_lstm.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
my_lstm.fit(x_train, y_train, batch_size=32, epochs=1)

CPU times: user 4min, sys: 13.1 s, total: 4min 13s
Wall time: 2min 22s
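The LSTM layer keeps a separate cell state alongside the hidden state and uses three gates to manage it. The scalar sketch below follows the standard formulation: the forget gate `f` scales the old cell state, the input gate `i` scales a new candidate `g`, and the output gate `o` decides how much of the cell state is exposed. The weights are made-up values, and `lstm_step` is a hypothetical helper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; returns the new (hidden state, cell state)."""
    i = sigmoid(p["wi"] * x_t + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x_t + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x_t + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x_t + p["ug"] * h_prev + p["bg"])  # candidate
    c_t = f * c_prev + i * g
    h_t = o * math.tanh(c_t)
    return h_t, c_t


# Hypothetical weights for illustration only.
params = {"wi": 0.5, "ui": 0.1, "bi": 0.0,
          "wf": 0.4, "uf": 0.2, "bf": 1.0,
          "wo": 0.3, "uo": 0.1, "bo": 0.0,
          "wg": 0.7, "ug": 0.3, "bg": 0.0}

h, c = 0.0, 0.0
for x_t in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

The cell performs four weighted sums per step, compared with three for the GRU sketch and one for the basic RNN, which is consistent with the ordering of the training times recorded above.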


In [58]:
# Evaluate your models on the test set and save the loss and accuracies to the appropriate variables:
# model_name_loss, model_name_acc (e.g., my_simple_loss and my_simple_acc).

my_simple_loss, my_simple_acc = my_simple.evaluate(x_test, y_test)
my_gru_loss, my_gru_acc = my_gru.evaluate(x_test, y_test)
my_lstm_loss, my_lstm_acc = my_lstm.evaluate(x_test, y_test)



In [59]:
print(f"Your simple model achieved an accuracy of {my_simple_acc:.2}.")
print(f"Your GRU model achieved an accuracy of {my_gru_acc:.2}.")
print(f"Your LSTM model achieved an accuracy of {my_lstm_acc:.2}.")

Your simple model achieved an accuracy of 0.76.
Your GRU model achieved an accuracy of 0.85.
Your LSTM model achieved an accuracy of 0.85.
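The accuracy values above are the fraction of test reviews whose predicted class matches the label. For a sigmoid output, that usually means thresholding the predicted probability at 0.5, which is assumed here. A minimal sketch with made-up labels and probabilities:

```python
def binary_accuracy(y_true, probabilities, threshold=0.5):
    """Fraction of examples whose thresholded prediction equals the label."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    correct = sum(int(pred == label) for pred, label in zip(predictions, y_true))
    return correct / len(y_true)


# Hypothetical labels and model outputs for five reviews.
labels = [1, 0, 1, 1, 0]
probs = [0.9, 0.2, 0.4, 0.7, 0.6]
acc = binary_accuracy(labels, probs)
print(f"{acc:.2}")  # 0.6
```

The `:.2` format specifier used in the print statements keeps two significant digits, so 0.8537 would display as 0.85.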


Note that each of these models has only one recurrent layer and is trained for only 1 epoch. Stacking multiple recurrent layers or training for more epochs may give better results, but the model would take much longer to train.
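One factor behind the different training times is the number of trainable weights in each recurrent layer. The sketch below counts them with the usual formulas for an input size and unit count of 64, matching the models above. The GRU count assumes the `reset_after=True` variant, which keeps two bias vectors per gate; the function names are hypothetical.

```python
def simple_rnn_params(input_dim, units):
    """One weight block: input weights, recurrent weights, and a bias."""
    return units * (input_dim + units + 1)


def gru_params(input_dim, units, reset_after=True):
    """Three weight blocks; the reset_after variant has two biases per block."""
    biases_per_block = 2 if reset_after else 1
    return 3 * units * (input_dim + units + biases_per_block)


def lstm_params(input_dim, units):
    """Four weight blocks: input, forget, and output gates plus the candidate."""
    return 4 * units * (input_dim + units + 1)


print(simple_rnn_params(64, 64))  # 8256
print(gru_params(64, 64))         # 24960
print(lstm_params(64, 64))        # 33024
```

Each additional stacked recurrent layer adds another block of this size, which is part of why deeper models train more slowly.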

In [60]:
assert my_simple_acc > 0.4
assert my_gru_acc > 0.6
assert my_lstm_acc > 0.7

## Feedback

In [61]:
def feedback():
    """Provide feedback on the contents of this exercise
    
    Returns:
        string
    """
    return "Fun assignment"

In [62]:
feedback()

'Fun assignment'