# Recurrent Neural Networks

Recurrent neural networks (RNNs) are a widely used class of neural networks for sequential data and have been used in products such as Apple's Siri and Google's voice search. An RNN carries an internal state (its memory) from one step to the next, so earlier inputs can influence later outputs. This makes it well suited to machine learning problems that involve sequential data.

RNNs are derived from feedforward networks: they add recurrent connections, so the output at each step depends on the current input and on a state computed from the inputs that came before it. Simply put, recurrent neural networks can pick up order-dependent patterns in sequential data that models treating each input independently miss.

A feed-forward neural network:

1. Cannot handle sequential data
2. Considers only the current input
3. Cannot memorize previous inputs

An RNN, by contrast, can handle sequential data: at each step it takes into account both the current input and the previously received inputs, which it remembers through its internal memory.
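
To make the idea of internal memory concrete, here is a minimal NumPy sketch of the recurrence a simple RNN applies at every time step, h_t = tanh(x_t W + h_{t-1} U + b). The function name `rnn_forward`, the weight names `W`, `U`, `b`, and all sizes are illustrative assumptions, and tanh is used only as a common choice of activation.

```python
import numpy as np

def rnn_forward(inputs, W, U, b):
    """Run a simple RNN over one sequence and return the final hidden state.

    inputs: array of shape (timesteps, input_dim)
    W: input-to-hidden weights, shape (input_dim, hidden_dim)
    U: hidden-to-hidden weights, shape (hidden_dim, hidden_dim)
    b: bias, shape (hidden_dim,)
    """
    h = np.zeros(U.shape[0])  # the memory starts out empty
    for x_t in inputs:
        # The new state depends on the current input AND the previous state.
        h = np.tanh(x_t @ W + h @ U + b)
    return h

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
U = rng.normal(size=(4, 4))
b = np.zeros(4)

seq = rng.normal(size=(5, 3))   # 5 time steps, 3 features each
reversed_seq = seq[::-1]        # same items, opposite order

h_forward = rnn_forward(seq, W, U, b)
h_reversed = rnn_forward(reversed_seq, W, U, b)
print(h_forward.shape)
print(np.allclose(h_forward, h_reversed))
```

Because the state is threaded through the loop, feeding the same items in a different order generally leads to a different final state, which is what lets the network be sensitive to word order.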



In [18]:
import keras
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding
from keras.layers import SimpleRNN
from keras.datasets import imdb
from keras import initializers

In this notebook, I will train a "vanilla" RNN to predict the sentiment of IMDB reviews. The data consists of 25,000 training sequences and 25,000 test sequences. The outcome is binary (positive/negative), and both outcomes are equally represented in both the training and the test set.

Keras provides a convenient interface to load the data and immediately encode the words into integers (based on the most common words).  This will save us a lot of the drudgery that is usually involved when working with raw text.
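
As a rough illustration of what "encoding words into integers based on the most common words" means, here is a small pure-Python sketch. The helpers `build_vocab` and `encode`, the toy reviews, and the choice of reserving ids 0 and 1 are assumptions made for this example; they are not the exact scheme Keras uses for the IMDB data.

```python
from collections import Counter

def build_vocab(texts, num_words):
    """Map the num_words most frequent words to integer ids, most frequent first."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Ids 0 and 1 are reserved here for padding and unknown words
    # (a toy convention for this sketch).
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(num_words))}

def encode(text, vocab, unknown_id=1):
    """Replace every word by its id; words outside the vocabulary get unknown_id."""
    return [vocab.get(word, unknown_id) for word in text.lower().split()]

reviews = ["great great movie", "a dull movie", "great cast"]
vocab = build_vocab(reviews, num_words=2)
encoded = encode("a great dull movie", vocab)
print(vocab)    # {'great': 2, 'movie': 3}
print(encoded)  # [1, 2, 1, 3]
```

Limiting the vocabulary (here `num_words=2`, in the notebook `max_features`) means that rare words collapse into a single "unknown" id.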

In [19]:
# Vocabulary size: keep only the max_features most common words when loading the data
max_features = 20000
# Length that every sequence is padded or truncated to
maxlen = 30
batch_size = 32

In [20]:
## Load in the data.  The function automatically tokenizes the text into distinct integers
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

25000 train sequences
25000 test sequences


In [21]:
x_train.shape

(25000,)

In [22]:
# This pads (or truncates) the sequences so that they are of the maximum length
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

x_train shape: (25000, 30)
x_test shape: (25000, 30)
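
It is worth knowing which end of a review survives this step. By default, `pad_sequences` pads and truncates at the start of a sequence (`padding='pre'`, `truncating='pre'`), so with `maxlen=30` only the last 30 tokens of a long review are kept. The helper `pad_or_truncate` below is a hypothetical pure-Python stand-in that mimics this default on a single sequence; it is a sketch, not the Keras implementation.

```python
def pad_or_truncate(seq, maxlen, value=0):
    """Mimic 'pre' padding and 'pre' truncation on one sequence."""
    seq = list(seq)[-maxlen:]                    # keep only the last maxlen items
    return [value] * (maxlen - len(seq)) + seq   # left-pad short sequences

short = pad_or_truncate([5, 6, 7], maxlen=5)
long = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short)  # [0, 0, 5, 6, 7]
print(long)   # [3, 4, 5, 6, 7]
```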


In [23]:
x_train[10:15,:]  

array([[    4,   277,   199,   166,   281,     5,  1030,     8,    30,
          179,  4442,   444, 13772,     9,     6,   371,    87,   189,
           22,     5,    31,     7,     4,   118,     7,     4,  2068,
          545,  1178,   829],
       [  991,     7,  3002,     4,   425,     9,    73,  2218,   549,
           18,    31,   155,    36,   100,   763,   379,    20,   103,
          351,  5308,    13,   202,    12,  2241,     5,     6,   320,
           46,     7,   457],
       [  218,  4843,   629,    42,  3017,    21,    48,    25,    28,
           35,   534,     5,     6,   320,     8,   516,     5,    42,
           25,   181,     8,   130,    56,   547,  3571,     5,  1471,
          851,    14,  2286],
       [  276,    23,  1456,   255,     4,  3612,   449,    61,   558,
           12,    16,     6,     2,    17,     8,    63,    31,    16,
          433,    51,     9,   170,    23,    11,  1898,   134,   504,
         1195,  1195,  1195],
       [   75,    28,     9,

# Building RNN Model

In [24]:
## Let's build an RNN

rnn_hidden_dim = 5
word_embedding_dim = 50
model_rnn = Sequential()
# This layer takes each integer in the sequence and embeds it in a 50-dimensional vector
model_rnn.add(Embedding(max_features, word_embedding_dim))
# Tiny random input weights, an identity recurrent matrix, and ReLU:
# at initialization the hidden state is mostly carried forward from step to step
model_rnn.add(SimpleRNN(rnn_hidden_dim,
                    kernel_initializer=initializers.RandomNormal(stddev=0.001),
                    recurrent_initializer=initializers.Identity(gain=1.0),
                    activation='relu',
                    input_shape=x_train.shape[1:]))

# A single sigmoid unit outputs the probability of a positive review
model_rnn.add(Dense(1, activation='sigmoid'))
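
To see what these three layers compute for one review, here is a NumPy sketch of the forward pass. All weights are random stand-ins for illustration (assumptions, not the trained Keras weights), and the function name `forward` is made up for this example; only the shapes and the order of operations mirror the model above.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, hidden_dim = 20000, 50, 5

# Randomly initialized stand-ins for the model's weights (assumption).
embedding = rng.normal(scale=0.05, size=(vocab_size, embed_dim))
kernel = rng.normal(scale=0.001, size=(embed_dim, hidden_dim))  # input -> hidden
recurrent = np.eye(hidden_dim)                                  # hidden -> hidden
bias = np.zeros(hidden_dim)
dense_w = rng.normal(scale=0.1, size=(hidden_dim, 1))
dense_b = np.zeros(1)

def forward(token_ids):
    """Return a score in (0, 1) for one integer-encoded review."""
    vectors = embedding[token_ids]   # embedding lookup: one 50-dim row per token
    h = np.zeros(hidden_dim)
    for x_t in vectors:
        # ReLU recurrence: combine the current word vector with the previous state.
        h = np.maximum(0.0, x_t @ kernel + h @ recurrent + bias)
    logit = h @ dense_w + dense_b
    return 1.0 / (1.0 + np.exp(-logit[0]))  # sigmoid

review = np.array([4, 277, 199, 166, 281])  # a few token ids
vectors_shape = embedding[review].shape
p = forward(review)
print(vectors_shape)
print(p)
```

Only the final hidden state reaches the dense layer, which is why the RNN layer's output shape in the summary has no time dimension.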

# Model Summary

In [25]:
## Note that most of the parameters come from the embedding layer
model_rnn.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, None, 50)          1000000   
_________________________________________________________________
simple_rnn_2 (SimpleRNN)     (None, 5)                 280       
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 6         
Total params: 1,000,286
Trainable params: 1,000,286
Non-trainable params: 0
_________________________________________________________________


#### Summary Of Parameters:
    Embedding - each word is a vector of length 50: 20000*50 = 1000000
    SimpleRNN - input to hidden: 50*5 + 5 (bias) = 255; one state to the next: 5*5 = 25; therefore 255 + 25 = 280
    Dense - 5*1 + 1 (bias) = 6
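
The same arithmetic can be checked in a few lines of Python. The variable names follow the notebook; the formulas are the standard parameter counts for an embedding, a simple RNN with bias, and a dense layer with one output.

```python
max_features, word_embedding_dim, rnn_hidden_dim = 20000, 50, 5

embedding_params = max_features * word_embedding_dim
rnn_params = (word_embedding_dim * rnn_hidden_dim   # input-to-hidden weights
              + rnn_hidden_dim * rnn_hidden_dim     # hidden-to-hidden weights
              + rnn_hidden_dim)                     # biases
dense_params = rnn_hidden_dim * 1 + 1               # weights + bias

total = embedding_params + rnn_params + dense_params
print(embedding_params, rnn_params, dense_params, total)
# 1000000 280 6 1000286
```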
    

In [26]:
# `learning_rate` is the current name of this argument; `lr` is the older alias
rmsprop = keras.optimizers.RMSprop(learning_rate=0.0001)

model_rnn.compile(loss='binary_crossentropy',
              optimizer=rmsprop,
              metrics=['accuracy'])

In [27]:
model_rnn.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=10,
          validation_data=(x_test, y_test))

  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Train on 25000 samples, validate on 25000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.callbacks.History at 0x7f846ecdccf8>

In [28]:
score, acc = model_rnn.evaluate(x_test, y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

Test score: 0.44788544396400454
Test accuracy: 0.7893199920654297
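
For reference, the accuracy reported for a sigmoid output is the fraction of reviews whose predicted probability falls on the correct side of 0.5. The sketch below shows that computation on made-up numbers; `binary_accuracy` is a hypothetical helper written for this example, not a call into Keras.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that land on the correct side of the threshold."""
    correct = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return correct / len(y_true)

y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]   # made-up sigmoid outputs
acc_example = binary_accuracy(y_true, y_prob)
print(acc_example)  # 0.5
```

Since both classes are equally represented in the test set, always guessing one class would score 0.5, so a test accuracy of about 0.79 is well above chance.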
