# Instructions

1. Go to https://colab.research.google.com and choose the "Upload" option to upload this notebook file.
1. In the Edit menu, choose "Notebook Settings" and then set the "Hardware Accelerator" dropdown to GPU.
1. Read through the code in the following sections:
  * [IMDB Dataset](#scrollTo=mXcb24B6a03_)
  * [Define model](#scrollTo=kAz68ipVa05_)
  * [Train model](#scrollTo=kIynp1v_a06Y)
  * [Assess model](#scrollTo=ALyNCqx4a06r)
1. Complete at least one of these exercises. Remember to keep notes about what you do!
  * [Exercise Option #1 - Standard Difficulty](#scrollTo=_9dsjJwya06_)
  * [Exercise Option #2 - Advanced Difficulty](#scrollTo=nyZbljLAa09z)

## Documentation/Sources
* [Class Notes](https://jennselby.github.io/MachineLearningCourseNotes/#recurrent-neural-networks)
* [https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/](https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/) for information on sequence classification with Keras
* [https://keras.io/](https://keras.io/) Keras API documentation
* [Keras recurrent tutorial](https://github.com/Vict0rSch/deep_learning/tree/master/keras/recurrent)

In [1]:
# select TensorFlow 2 (this magic command only works in Colab)
%tensorflow_version 2.x
# display matplotlib plots
%matplotlib inline
from tensorflow import test
from tensorflow import device

# IMDB Dataset
The [IMDB dataset](https://keras.io/datasets/#imdb-movie-reviews-sentiment-classification) consists of movie reviews (x_train) that have been marked as positive or negative (y_train). See the [Word Vectors Tutorial](https://github.com/jennselby/MachineLearningTutorials/blob/master/WordVectors.ipynb) for more details on the IMDB dataset.

In [2]:
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

In [3]:
(imdb_x_train, imdb_y_train), (imdb_x_test, imdb_y_test) = imdb.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz




For a standard Keras model, every input has to be the same length, so we need to pick a maximum length and cut off any review that exceeds it. (We also need to pad the shorter reviews with zeros to bring them up to that length.)
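
To make the padding and cutting concrete, here is a minimal sketch in plain Python of what `pad_sequences` does to a single review when called with its default arguments, as in the next cell. The helper name `pad_or_truncate` and the toy reviews are made up for illustration. Note that with the defaults, zeros are added at the front and long reviews lose their first words, not their last.

```python
def pad_or_truncate(seq, maxlen, value=0):
    """Mimic the default behavior of pad_sequences for one sequence.

    Hypothetical helper for illustration only; the notebook itself uses
    sequence.pad_sequences from Keras.
    """
    # Keep only the last `maxlen` items (default truncating='pre')
    truncated = seq[max(0, len(seq) - maxlen):]
    # Fill the remaining slots with `value` at the front (default padding='pre')
    padding = [value] * (maxlen - len(truncated))
    return padding + truncated


short_review = [11, 12, 13]               # toy word indices
long_review = [21, 22, 23, 24, 25, 26]

print(pad_or_truncate(short_review, maxlen=5))  # [0, 0, 11, 12, 13]
print(pad_or_truncate(long_review, maxlen=5))   # [22, 23, 24, 25, 26]
```

If you would rather keep the start of each review, `pad_sequences` accepts `truncating='post'` (and `padding='post'` to put the zeros at the end).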

In [4]:
cutoff = 500
imdb_x_train_padded = sequence.pad_sequences(imdb_x_train, maxlen=cutoff)
imdb_x_test_padded = sequence.pad_sequences(imdb_x_test, maxlen=cutoff)

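# Word indices in the loaded data are shifted by 3, because 0, 1 and 2 are
# reserved for padding, start-of-review and unknown-word markers.
# see https://stackoverflow.com/questions/42821330/restore-original-text-from-keras-s-imdb-dataset
imdb_index_offset = 3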
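
The offset matters whenever you want to turn an encoded review back into words. The sketch below shows the idea with a made-up three-word vocabulary; in the notebook the real mapping would come from `imdb.get_word_index()`. The marker strings `<PAD>`, `<START>` and `<UNK>` are placeholder names chosen for this example, and the reserved indices assume the default arguments of `imdb.load_data()`.

```python
INDEX_OFFSET = 3  # same value as imdb_index_offset

# Hypothetical miniature word index (word -> frequency rank), for illustration
toy_word_index = {"the": 1, "movie": 2, "great": 3}

# Indices 0, 1 and 2 are reserved, so every word rank is shifted by the offset
id_to_word = {0: "<PAD>", 1: "<START>", 2: "<UNK>"}
for word, rank in toy_word_index.items():
    id_to_word[rank + INDEX_OFFSET] = word


def decode(encoded_review):
    """Turn a list of integer word indices back into a readable string."""
    return " ".join(id_to_word.get(i, "<UNK>") for i in encoded_review)


print(decode([0, 0, 1, 4, 5, 2, 6]))
# <PAD> <PAD> <START> the movie <UNK> great
```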

In [5]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Define model

Unlike last time, when we used convolutional layers, we're going to use an LSTM, a special type of recurrent network.

Using recurrent networks means that rather than seeing these reviews as one input happening all at once, with the convolutional layers taking into account which words are next to each other, we are going to see them as a sequence of inputs, with one word occurring at each timestep.
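
As a rough illustration of "one word at each timestep", here is a one-unit recurrent cell in plain Python. It is deliberately much simpler than an LSTM (no gates, a single scalar state, and arbitrary weights `w_in` and `w_rec` picked for the example), but it shows the core loop: each step combines the current input with the state left over from the previous step.

```python
import math


def simple_rnn(inputs, w_in=0.5, w_rec=0.8):
    """Run a toy one-unit recurrent cell over a sequence of numbers.

    Returns the hidden state after every timestep. The weights are
    arbitrary illustration values, not learned ones.
    """
    hidden = 0.0
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state
        hidden = math.tanh(w_in * x + w_rec * hidden)
        states.append(hidden)
    return states


# Same values, different order: the final states differ, because the
# network processes the sequence step by step
states_a = simple_rnn([1.0, 0.0, 0.0])
states_b = simple_rnn([0.0, 0.0, 1.0])
print(states_a[-1], states_b[-1])
```

Returning the whole `states` list is the analogue of `return_sequences=True` in Keras, while returning only `states[-1]` corresponds to the default behavior.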

In [6]:
imdb_lstm_model = Sequential()
imdb_lstm_model.add(Embedding(input_dim=len(imdb.get_word_index()) + imdb_index_offset,
                              output_dim=100,
                              input_length=cutoff))
# return_sequences=True tells the LSTM to output the full sequence, for use by the next LSTM layer. The final
# LSTM layer should return only its last output, for use in the Dense output layer
imdb_lstm_model.add(LSTM(units=32, return_sequences=True))
imdb_lstm_model.add(LSTM(units=32))
imdb_lstm_model.add(Dense(units=1, activation='sigmoid')) # because at the end, we want one yes/no answer
imdb_lstm_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['binary_accuracy'])

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json
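
To see what the sigmoid output and the `binary_crossentropy` loss mean for a single review, here is a small plain-Python sketch. The raw score `2.0` is a made-up value standing in for whatever the final Dense unit computes, and the 0.5 cutoff is the usual default threshold for binary accuracy. Keras also clips probabilities slightly to avoid `log(0)`, which this sketch leaves out.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, p):
    """Loss for one example: small when p is close to the true label."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


p = sigmoid(2.0)              # hypothetical raw score from the Dense unit
predicted_label = int(p > 0.5)

print(p)                               # roughly 0.88: "probably positive"
print(predicted_label)                 # 1
print(binary_crossentropy(1, p))       # low loss if the review really is positive
print(binary_crossentropy(0, p))       # much higher loss if it is negative
```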


# Train model

In [7]:
# Train using GPU acceleration
# (see https://colab.research.google.com/notebooks/gpu.ipynb#scrollTo=Y04m-jvKRDsJ)
device_name = test.gpu_device_name()
if device_name != '/device:GPU:0':
  print(
      '\n\nThis error most likely means that this notebook is not '
      'configured to use a GPU.  Change this in Notebook Settings via the '
      'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
  raise SystemError('GPU device not found')

with device('/device:GPU:0'):
  imdb_lstm_model.fit(imdb_x_train_padded, imdb_y_train, epochs=1, batch_size=64)



# Assess model

In [8]:
with device('/device:GPU:0'):
  imdb_lstm_scores = imdb_lstm_model.evaluate(imdb_x_test_padded, imdb_y_test)
  print('loss: {} accuracy: {}'.format(*imdb_lstm_scores))

loss: 0.3421633541584015 accuracy: 0.855679988861084


# Exercise Option #1 - Standard Difficulty

Experiment with model configurations that differ from the one above: try other recurrent layers, different numbers of layers, or non-default settings. See [Keras Recurrent Layers](https://keras.io/layers/recurrent/).

__Keep notes on what you try and what results you get.__

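- Original model: loss 0.3421, accuracy 0.8556
- Added a Dense layer with relu and 16 units: loss 0.3239, accuracy 0.8661
- Added two Dense layers (the one above and one with 32 units): loss 0.3102, accuracy 0.8697
- Changed 16 units to 32: loss 0.3259, accuracy 0.8724
- Changed units to 32, 16, 16, 8, 1: loss 0.3125, accuracy 0.8741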

In [49]:
m = Sequential()
m.add(Embedding(input_dim=len(imdb.get_word_index()) + imdb_index_offset,
                              output_dim=100,
                              input_length=cutoff))
# return_sequences=True tells the LSTM to output the full sequence, for use by the next LSTM layer. The final
# LSTM layer should return only its last output, for use in the Dense output layer
m.add(LSTM(units=32, return_sequences=True))
# m.add(LSTM(units=32, return_sequences=True))
m.add(LSTM(units=16))
m.add(Dense(units=16, activation='relu'))
m.add(Dense(units=8, activation='relu'))
m.add(Dense(units=1, activation='sigmoid')) # because at the end, we want one yes/no answer
m.compile(loss='binary_crossentropy', optimizer='adam', metrics=['binary_accuracy'])

In [50]:
device_name = test.gpu_device_name()
if device_name != '/device:GPU:0':
  print(
      '\n\nThis error most likely means that this notebook is not '
      'configured to use a GPU.  Change this in Notebook Settings via the '
      'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
  raise SystemError('GPU device not found')

with device('/device:GPU:0'):
  m.fit(imdb_x_train_padded, imdb_y_train, epochs=1, batch_size=64)



In [51]:
with device('/device:GPU:0'):
  ms = m.evaluate(imdb_x_test_padded, imdb_y_test)
  print('loss: {} accuracy: {}'.format(*ms))

loss: 0.3322821855545044 accuracy: 0.8664000034332275


# Exercise Option #2 - Advanced Difficulty

Set up your own RNN model for the Reuters classification problem.

Take the model from exercise 1 (imdb_lstm_model) and modify it to classify the [Reuters data](https://keras.io/datasets/#reuters-newswire-topics-classification).

Think about what you are trying to predict in this case, and how you will have to change your model to deal with this.
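
One way to think about the difference: an IMDB label is a single yes/no value, while a Reuters label is one of 46 topics. A common way to present such labels to a network is one-hot encoding, which is what `to_categorical` produces. The sketch below shows the idea in plain Python; `one_hot` is a hypothetical helper written for this example, not a Keras function.

```python
NUM_TOPICS = 46  # number of topic classes in the Reuters dataset


def one_hot(label, num_classes):
    """Return a vector of zeros with a single 1.0 at position `label`."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


encoded = one_hot(3, NUM_TOPICS)
print(len(encoded))   # 46
print(encoded[:5])    # [0.0, 0.0, 0.0, 1.0, 0.0]
```

With labels in this form, the output layer needs one unit per topic instead of a single unit.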

In [63]:
from tensorflow.keras.datasets import reuters
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.utils import to_categorical

In [64]:
(r_x_train, r_y_train), (r_x_test, r_y_test) = reuters.load_data()
r_y_train = to_categorical(r_y_train)
r_y_test = to_categorical(r_y_test)



In [65]:
rcutoff = 500
r_x_train_padded = sequence.pad_sequences(r_x_train, maxlen=rcutoff)
r_x_test_padded = sequence.pad_sequences(r_x_test, maxlen=rcutoff)
r_index_offset = 3

`high score (with many epochs): 0.6335`

In [121]:
r = Sequential()
r.add(Embedding(input_dim=len(reuters.get_word_index()) + r_index_offset,
                              output_dim=100,
                              input_length=rcutoff))
r.add(LSTM(units=16, return_sequences=True))
r.add(LSTM(units=32))
 
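# One output unit per Reuters topic. Note: softmax is the more usual activation to pair
# with categorical_crossentropy when each example has exactly one label.
r.add(Dense(units=46, activation='sigmoid'))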
r.compile(loss='categorical_crossentropy', optimizer='Nadam', metrics=['categorical_accuracy'])
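
The output layer above uses a sigmoid on each of the 46 units. For problems where every example belongs to exactly one class, softmax is the more common choice, because it turns the raw scores into a single probability distribution. The plain-Python sketch below compares the two on three made-up scores; it is an illustration of the difference, not a claim about how much it would change the results here.

```python
import math


def sigmoid(z):
    """Squash one score into (0, 1), independently of the other scores."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]                       # hypothetical raw outputs for 3 classes
sigmoid_outputs = [sigmoid(s) for s in scores]
softmax_outputs = softmax(scores)

print(sum(sigmoid_outputs))   # more than 1: the units do not compete
print(sum(softmax_outputs))   # 1.0 (up to rounding): a proper distribution
```

Both versions rank the classes in the same order, so the predicted topic for a given set of scores is the same; the difference is in how the loss is computed during training.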

In [126]:
device_name = test.gpu_device_name()
if device_name != '/device:GPU:0':
  print(
      '\n\nThis error most likely means that this notebook is not '
      'configured to use a GPU.  Change this in Notebook Settings via the '
      'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
  raise SystemError('GPU device not found')

with device('/device:GPU:0'):
  r.fit(r_x_train_padded, r_y_train, epochs=10, batch_size=64)

with device('/device:GPU:0'):
  rs = r.evaluate(r_x_test_padded, r_y_test)
  print('loss: {} accuracy: {}'.format(*rs))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
loss: 1.9577536582946777 accuracy: 0.6335707902908325
