# Exercise 6 - Recurrent Neural Networks 

Today you will be taking another look at the IMDB movie review dataset.

First, you must make a typical feed-forward network using the Functional API in Keras.

Second, you must make a recurrent neural network using the Functional API. 

The choice of which kind of recurrent layers you want is up to you.


Consider how you want your input data encoded.

``from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Input, Embedding, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical
import numpy as np``

``numwords = 10000
maxlen = 500``

``(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=numwords)``

``x_train = pad_sequences(x_train, maxlen=maxlen)``
``x_test = pad_sequences(x_test, maxlen=maxlen)``

``y_train = to_categorical(y_train)
y_test = to_categorical(y_test)``


## 1: Feedforward Functional API 

In [137]:
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras import layers
from tensorflow.keras.layers import Input, Embedding, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical
import tensorflow.keras as tfk
import numpy as np

numwords = 10000
maxlen = 500

(train_data, train_labels), (test_data, test_labels) = tfk.datasets.imdb.load_data(num_words=numwords)

### Decoding the reviews

In [138]:
def decode_review(X):
    """Turn a sequence of word indices back into text."""
    # Map each word to its index, then invert to index -> word
    word_index = imdb.get_word_index()
    reverse_word_index = {value: key for (key, value) in word_index.items()}

    # Indices are offset by 3: 0, 1 and 2 are reserved for
    # padding, start of sequence and unknown words
    decoded_review = ' '.join([reverse_word_index.get(i - 3, '?') for i in X])
    return decoded_review

decode_review(train_data[1])


"? big hair big boobs bad music and a giant safety pin these are the words to best describe this terrible movie i love cheesy horror movies and i've seen hundreds but this had got to be on of the worst ever made the plot is paper thin and ridiculous the acting is an abomination the script is completely laughable the best is the end showdown with the cop and how he worked out who the killer is it's just so damn terribly written the clothes are sickening and funny in equal ? the hair is big lots of boobs ? men wear those cut ? shirts that show off their ? sickening that men actually wore them and the music is just ? trash that plays over and over again in almost every scene there is trashy music boobs and ? taking away bodies and the gym still doesn't close for ? all joking aside this is a truly bad film whose only charm is to look back on the disaster that was the 80's and have a good old laugh at how bad everything was back then"

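The `i-3` offset in `decode_review` exists because `imdb.load_data` by default reserves the lowest indices (0 for padding, 1 for the start-of-sequence marker, 2 for out-of-vocabulary words) and shifts every real word index up by 3. This is also why the decoded review starts with a `?`. A minimal sketch with a made-up three-word vocabulary shows the effect; `toy_word_index` and `toy_decode` are hypothetical names, not part of Keras:

```python
# Hypothetical stand-in for imdb.get_word_index(): word -> rank (1 = most frequent).
toy_word_index = {"the": 1, "movie": 2, "bad": 3}

def toy_decode(encoded, word_index, index_from=3):
    """Map encoded integers back to words; reserved indices become '?'."""
    reverse = {rank: word for word, rank in word_index.items()}
    return " ".join(reverse.get(i - index_from, "?") for i in encoded)

# 0 is padding, 1 is the start marker, 2 is an out-of-vocabulary word.
encoded = [0, 1, 4, 5, 2, 6]
decoded = toy_decode(encoded, toy_word_index)
print(decoded)
```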
In [139]:
train_labels

array([1, 0, 0, ..., 0, 1, 0], dtype=int64)

In [140]:
# Pad (or truncate) every review to exactly maxlen tokens
train_data = pad_sequences(train_data, maxlen=maxlen)
test_data = pad_sequences(test_data, maxlen=maxlen)

#train_labels = to_categorical(y_train)
#test_labels = to_categorical(y_test)
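
Leaving the labels as 0/1 integers fits a single sigmoid output unit trained with `binary_crossentropy`. One-hot labels, as produced by `to_categorical`, would instead pair with a two-unit softmax output and `categorical_crossentropy`. The sketch below reproduces the one-hot encoding in plain NumPy to show the shape change; `one_hot` is a hypothetical helper used only for illustration:

```python
import numpy as np

def one_hot(labels, num_classes=2):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels)
    return np.eye(num_classes, dtype="float32")[labels]

labels = np.array([1, 0, 0, 1])
encoded = one_hot(labels)
print(labels.shape)   # one value per review
print(encoded.shape)  # one column per class
```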

In [141]:
decode_review(train_data[1])

'? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ... [output truncated]
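
Padding alone is enough to produce a long run of `?`: `pad_sequences` fills short sequences with zeros, by default at the front, and index 0 has no entry in the word index. A rough pure-Python stand-in for that default behaviour makes it visible; `pad_pre` is a hypothetical helper, not the Keras implementation:

```python
def pad_pre(seq, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` items if too long."""
    seq = list(seq)[-maxlen:]
    return [value] * (maxlen - len(seq)) + seq

short = pad_pre([4, 5, 6], 6)            # short review: zeros in front
long = pad_pre([4, 5, 6, 7, 8, 9], 4)    # long review: the start is cut off
print(short)
print(long)
```

Passing `padding='post'` to `pad_sequences` moves the zeros to the end instead.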

### Encoding the data 

In [142]:
def multi_hot_sequences(sequences, dimension):
    #Create an all-zero matrix of shape (len(sequences), dimension)
    results = np.zeros((len(sequences), dimension))
    for i, word_indices in enumerate(sequences):
        results[i, word_indices] = 1.0  # set specific indices of results[i] to 1s
    return results

x_train = multi_hot_sequences(train_data, dimension=numwords)
x_test = multi_hot_sequences(test_data, dimension=numwords)

#splitting the train_data into train AND validation data
#train_data, validation_data, train_labels, validation_labels = train_test_split(train_data, train_labels, test_size =0.33, random_state=42)

# Decoding again after encoding 
decode_review(train_data[1])

'? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ... [output truncated]
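
Multi-hot encoding records which word indices occur in a review but discards their order and frequency. That suits a feed-forward network with a fixed-size input, but it removes the information a recurrent layer relies on. A small illustration with a made-up vocabulary size of 8, reusing the function from the cell above:

```python
import numpy as np

def multi_hot_sequences(sequences, dimension):
    # One row per sequence, one column per word index
    results = np.zeros((len(sequences), dimension))
    for i, word_indices in enumerate(sequences):
        results[i, word_indices] = 1.0
    return results

# Two toy "reviews" with the same words in a different order and frequency.
a = [3, 5, 5, 7]
b = [7, 3, 5]
encoded = multi_hot_sequences([a, b], dimension=8)
print(encoded)
print(np.array_equal(encoded[0], encoded[1]))
```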

### Building the feedforward model

In [143]:
inputs = Input(shape=(maxlen,), dtype='int32', name='text')
x = Embedding(numwords, 64)(inputs)  # Embedding(input_dim=vocabulary size, output_dim=vector size)
x = Dense(32, activation='relu')(x)  # calling the layer on x connects it to the previous output
x = Dropout(0.5)(x)
x = Dense(32, activation='relu', kernel_regularizer='l1')(x)
x = Dense(32, activation='softmax', kernel_regularizer='l1')(x)
outputs = Dense(1, activation='sigmoid')(x)

model_ff = Model(inputs, outputs)
model_ff.summary()

Model: "functional_13"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
text (InputLayer)            [(None, 500)]             0         
_________________________________________________________________
embedding_23 (Embedding)     (None, 500, 64)           640000    
_________________________________________________________________
dense_37 (Dense)             (None, 500, 32)           2080      
_________________________________________________________________
dropout_9 (Dropout)          (None, 500, 32)           0         
_________________________________________________________________
dense_38 (Dense)             (None, 500, 32)           1056      
_________________________________________________________________
dense_39 (Dense)             (None, 500, 32)           1056      
_________________________________________________________________
dense_40 (Dense)             (None, 500, 1)          

In [144]:
#model_ff.compile("adam", "binary_crossentropy", metrics=["accuracy"])
#model_ff.fit(train_data, train_labels, batch_size=32, epochs=2, validation_data=(test_data, test_labels))
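
A `Dense` layer applied to a 3-D tensor acts on the last axis only, so the sequence axis of length 500 survives to the output. The model therefore emits one prediction per time step, `(None, 500, 1)`, while the labels hold one value per review. Collapsing that axis first, for example with `Flatten` or `GlobalAveragePooling1D`, yields a single prediction per review. The shape arithmetic can be sketched in NumPy with small made-up sizes (random weights, no training; all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, steps, embed_dim, units = 2, 5, 4, 3

embedded = rng.normal(size=(batch, steps, embed_dim))  # stand-in for Embedding output
W = rng.normal(size=(embed_dim, units))                # Dense kernel
b = np.zeros(units)                                    # Dense bias

# Dense on a 3-D input: the same weights are applied at every time step.
per_step = np.maximum(embedded @ W + b, 0.0)
print(per_step.shape)    # the sequence axis is still present

# Averaging over the time axis first (what GlobalAveragePooling1D does).
pooled = embedded.mean(axis=1)
per_review = np.maximum(pooled @ W + b, 0.0)
print(per_review.shape)  # one vector per review
```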


## Building the RNN 

In [149]:
inputs = Input(shape=(maxlen,), dtype='int32', name='text')
x = Embedding(numwords, 64)(inputs)  # Embedding(input_dim=vocabulary size, output_dim=vector size)
x = layers.SimpleRNN(32, return_sequences=True)(x)
x = layers.SimpleRNN(32, return_sequences=True)(x)
x = layers.SimpleRNN(32)(x)  # last recurrent layer returns only its final state
outputs = Dense(1, activation="sigmoid")(x)

model_rnn = Model(inputs, outputs)
model_rnn.summary()

Model: "functional_17"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
text (InputLayer)            [(None, 500)]             0         
_________________________________________________________________
embedding_25 (Embedding)     (None, 500, 64)           640000    
_________________________________________________________________
simple_rnn_5 (SimpleRNN)     (None, 500, 32)           3104      
_________________________________________________________________
simple_rnn_6 (SimpleRNN)     (None, 500, 32)           2080      
_________________________________________________________________
simple_rnn_7 (SimpleRNN)     (None, 32)                2080      
_________________________________________________________________
dense_41 (Dense)             (None, 1)                 33        
Total params: 647,297
Trainable params: 647,297
Non-trainable params: 0
_______________________________________________
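
The parameter counts in the summary can be checked by hand. An embedding stores one vector per vocabulary entry, a dense layer has one weight per input-output pair plus a bias per unit, and a `SimpleRNN` layer adds a square recurrent matrix on top of that. A short sketch of the arithmetic; the helper names are hypothetical:

```python
def embedding_params(vocab_size, embed_dim):
    # One vector of length embed_dim per vocabulary entry
    return vocab_size * embed_dim

def dense_params(input_dim, units):
    # Kernel plus one bias per unit
    return input_dim * units + units

def simple_rnn_params(input_dim, units):
    # Input kernel + recurrent kernel + bias
    return input_dim * units + units * units + units

emb = embedding_params(10000, 64)  # the embedding row
rnn = simple_rnn_params(32, 32)    # a SimpleRNN fed by another 32-unit layer
out = dense_params(32, 1)          # the final sigmoid unit
print(emb, rnn, out)
```

The first recurrent layer follows the same formula, with the embedding width as its input dimension.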

In [148]:
#model_rnn.compile("adam", "binary_crossentropy", metrics=["accuracy"])
#model_rnn.fit(train_data, train_labels, batch_size=32, epochs=2, validation_data=(test_data, test_labels))
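
`return_sequences` decides whether a recurrent layer returns its hidden state at every time step (needed when another recurrent layer follows) or only the final one (suitable as input to a classifier head). A minimal NumPy sketch of the recurrence h_t = tanh(x_t W + h_{t-1} U + b), with random weights and made-up sizes; `simple_rnn_forward` is a hypothetical helper, not the Keras implementation:

```python
import numpy as np

def simple_rnn_forward(x, W, U, b, return_sequences=False):
    """x: (batch, steps, features). Returns all hidden states or only the last."""
    batch, steps, _ = x.shape
    h = np.zeros((batch, U.shape[0]))
    states = []
    for t in range(steps):
        # New state depends on the current input and the previous state
        h = np.tanh(x[:, t, :] @ W + h @ U + b)
        states.append(h)
    return np.stack(states, axis=1) if return_sequences else h

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5, 4))
W, U, b = rng.normal(size=(4, 3)), rng.normal(size=(3, 3)), np.zeros(3)

all_states = simple_rnn_forward(x, W, U, b, return_sequences=True)
last_state = simple_rnn_forward(x, W, U, b)
print(all_states.shape, last_state.shape)
```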
