# **RNN**
A recurrent neural network (RNN) is a class of artificial neural network where connections between units form a directed cycle. This creates an internal state of the network which allows it to exhibit dynamic temporal behavior.
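
To make the "internal state" concrete, here is a minimal NumPy sketch of a single-layer recurrence, $h_t = \tanh(W_x x_t + W_h h_{t-1} + b)$. The function name `rnn_forward`, the weight names, and the toy sizes are illustrative assumptions; this is not how Keras implements `SimpleRNN` internally.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a plain tanh recurrence over a sequence and return every hidden state."""
    hidden = np.zeros(W_h.shape[0])           # initial state h_0 = 0
    states = []
    for x_t in inputs:                         # one update per time step
        hidden = np.tanh(W_x @ x_t + W_h @ hidden + b)
        states.append(hidden)
    return np.stack(states)

rng = np.random.default_rng(0)
seq = rng.normal(size=(5, 3))                  # 5 time steps, 3 features each (toy sizes)
W_x = rng.normal(size=(4, 3)) * 0.5            # input-to-hidden weights
W_h = rng.normal(size=(4, 4)) * 0.5            # hidden-to-hidden weights (the "cycle")
b = np.zeros(4)
states = rnn_forward(seq, W_x, W_h, b)
print(states.shape)  # (5, 4)
```

Each hidden state depends on the previous one through `W_h`, which is what carries information from earlier time steps forward.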

IMDB sentiment classification task

This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. IMDB provided a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well. Raw text and already processed bag of words formats are provided.

You can download the dataset from http://ai.stanford.edu/~amaas/data/sentiment/ or load it directly with `from keras.datasets import imdb`.

A few points to note:
Modules such as SimpleRNN, LSTM, Activation layers, Dense layers, and Dropout can be imported directly from Keras.
For preprocessing, you can use required 

In [4]:
#load the imdb dataset 
from keras.datasets import imdb
import numpy as np
import os

vocabulary_size = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words = vocabulary_size)
print('Loaded dataset with {} training samples, {} test samples'.format(len(X_train), len(X_test)))

Loaded dataset with 25000 training samples, 25000 test samples


In [6]:
# Check if the GPU is in place
from keras import backend as k
if k.backend() == 'tensorflow':
    import tensorflow as tf
    device_name = tf.test.gpu_device_name()
    if device_name == '':
        device_name = "None"
    print('Using TensorFlow version:', tf.__version__, ', GPU:', device_name)

Using TensorFlow version: 2.6.0 , GPU: /device:GPU:0


2021-11-27 06:22:58.173094: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

In [7]:
#the review is stored as a sequence of integers. 
# These are word IDs that have been pre-assigned to individual words, and the label is an integer

print('---review---')
print(X_train[2])
print('---label---')
print(y_train[2])

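# To get the actual review text, map the IDs back to words.
# Note: imdb.load_data shifts word IDs by 3 by default (index_from=3; IDs 0, 1, 2 are
# reserved for padding, start-of-sequence, and out-of-vocabulary), so this direct lookup
# is offset. reviewInText further below subtracts 3 before the lookup.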
word2id = imdb.get_word_index()
id2word = {i: word for word, i in word2id.items()}
print('---review with words---')
print([id2word.get(i, ' ') for i in X_train[2]])
print('---label---')
print(y_train[2])

---review---
[1, 14, 47, 8, 30, 31, 7, 4, 249, 108, 7, 4, 2, 54, 61, 369, 13, 71, 149, 14, 22, 112, 4, 2401, 311, 12, 16, 3711, 33, 75, 43, 1829, 296, 4, 86, 320, 35, 534, 19, 263, 4821, 1301, 4, 1873, 33, 89, 78, 12, 66, 16, 4, 360, 7, 4, 58, 316, 334, 11, 4, 1716, 43, 645, 662, 8, 257, 85, 1200, 42, 1228, 2578, 83, 68, 3912, 15, 36, 165, 1539, 278, 36, 69, 2, 780, 8, 106, 14, 2, 1338, 18, 6, 22, 12, 215, 28, 610, 40, 6, 87, 326, 23, 2300, 21, 23, 22, 12, 272, 40, 57, 31, 11, 4, 22, 47, 6, 2307, 51, 9, 170, 23, 595, 116, 595, 1352, 13, 191, 79, 638, 89, 2, 14, 9, 8, 106, 607, 624, 35, 534, 6, 227, 7, 129, 113]
---label---
0
---review with words---
['the', 'as', 'there', 'in', 'at', 'by', 'br', 'of', 'sure', 'many', 'br', 'of', 'and', 'no', 'only', 'women', 'was', 'than', "doesn't", 'as', 'you', 'never', 'of', 'hat', 'night', 'that', 'with', 'ignored', 'they', 'bad', 'out', 'superman', 'plays', 'of', 'how', 'star', 'so', 'stories', 'film', 'comes', 'defense', 'date', 'of', 'wide', 'the
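
The decoded words above look scrambled because `imdb.load_data`, with its default arguments, reserves ID 0 for padding, 1 for the start marker, and 2 for out-of-vocabulary words, and shifts every real word index by 3. Below is a small sketch of an offset-aware lookup. The helper `decode_review`, the placeholder strings, and the three-word `toy_word_index` are made up for illustration; the real dictionary comes from `imdb.get_word_index()`.

```python
def decode_review(encoded, word_index, index_from=3):
    """Map offset word IDs back to words; reserved IDs become placeholders."""
    reverse_index = {idx: word for word, idx in word_index.items()}
    reserved = {0: "<PAD>", 1: "<START>", 2: "<OOV>"}
    words = []
    for token in encoded:
        if token in reserved:
            words.append(reserved[token])
        else:
            # Undo the shift applied by load_data before looking the word up.
            words.append(reverse_index.get(token - index_from, "<UNK>"))
    return " ".join(words)

# Hypothetical miniature vocabulary (1 = most frequent word).
toy_word_index = {"the": 1, "film": 2, "great": 3}
decoded = decode_review([0, 1, 4, 6, 5, 2], toy_word_index)
print(decoded)  # <PAD> <START> the great film <OOV>
```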

In [8]:
# Pad sequences so that every review has the same length (132 tokens).
from keras.preprocessing.sequence import pad_sequences


X_train = pad_sequences(X_train, maxlen=132)
X_test = pad_sequences(X_test, maxlen=132)
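
With its default arguments, `pad_sequences` pads and truncates at the front of each sequence, so with `maxlen=132` a longer review keeps only its last 132 tokens. A pure-Python sketch of that default behaviour follows; the helper name `pad_pre` is hypothetical and assumes `maxlen` is a positive integer.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and keep only the tail of long ones."""
    padded = []
    for seq in sequences:
        tail = list(seq)[-maxlen:]                      # drop tokens from the front
        padded.append([value] * (maxlen - len(tail)) + tail)
    return padded

result = pad_pre([[1, 2, 3, 4, 5], [7, 8]], maxlen=4)
print(result)  # [[2, 3, 4, 5], [0, 0, 7, 8]]
```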

In [19]:
#design a RNN model (write your code)

from keras import Sequential
from keras.layers import Embedding, LSTM, Dense, Dropout, SimpleRNN

embedding_size=32
model = Sequential()
model.add(Embedding(vocabulary_size, embedding_size, input_length=132, mask_zero=True))
model.add(SimpleRNN(128,dropout=0.1, recurrent_dropout=0.1,return_sequences=True))
model.add(SimpleRNN(64,dropout=0.1, recurrent_dropout=0.1,return_sequences=False))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

### Reason:
For `binary classification` (such as **sentiment analysis** in our case), the loss function **`binary_crossentropy`** is the de facto standard, because it directly penalizes the probability assigned to the wrong class. **`Adam`** (adaptive moment estimation) is a reasonable default optimizer for deep learning models: it copes well with **sparse gradients**, and its momentum term helps it keep moving through flat regions and saddle points instead of stalling there.
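
For reference, binary cross-entropy for a label $y \in \{0, 1\}$ and predicted probability $p$ is $-[y \log p + (1 - y) \log(1 - p)]$, averaged over samples. A small sketch of that formula; the clipping constant `eps` is an assumption added here to avoid $\log(0)$, and the inputs are toy values.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)                 # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(round(confident_right, 4), round(confident_wrong, 4))  # 0.1054 2.3026
```

Confident wrong predictions are penalized far more heavily than confident correct ones, which is the behaviour wanted for a sigmoid output.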

In [20]:
#train and evaluate your model
#choose your loss function and optimizer and mention the reason to choose that particular loss function and optimizer
# use accuracy as the evaluation metric

model.compile(loss='binary_crossentropy',optimizer='Adam',metrics=['accuracy'])
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, 132, 32)           160000    
_________________________________________________________________
simple_rnn_4 (SimpleRNN)     (None, 132, 128)          20608     
_________________________________________________________________
simple_rnn_5 (SimpleRNN)     (None, 64)                12352     
_________________________________________________________________
dense_3 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 65        
Total params: 197,185
Trainable params: 197,185
Non-trainable params: 0
_________________________________________________________________


In [22]:
# Define batch size and number of epochs.
batch_size = 512
num_epochs = 10
# Train the SimpleRNN model.
model.fit(X_train, y_train, epochs=num_epochs, batch_size=batch_size, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f6a31b9f6d0>

In [23]:
#evaluate the model using model.evaluate()

score = model.evaluate(X_test, y_test, verbose = 1)
print('Test accuracy:', score[1])

Test accuracy: 0.7670000195503235


# **LSTM**

**Instead of using an RNN, now try using an LSTM model and compare the two. Which one performed better, and why?**

In [24]:
model_lstm = Sequential()
embedding_size=32
model_lstm.add(Embedding(vocabulary_size, embedding_size, input_length=132, mask_zero=True))
model_lstm.add(LSTM(128, dropout=0.1, return_sequences=True))  # recurrent_dropout=0.1 is disabled here
model_lstm.add(LSTM(64, dropout=0.1, return_sequences=False))
model_lstm.add(Dense(32, activation='relu'))
model_lstm.add(Dense(1, activation='sigmoid'))

In [25]:
# Compile the LSTM model and show its summary.
model_lstm.compile(loss='binary_crossentropy',optimizer='Adam',metrics=['accuracy'])
model_lstm.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_3 (Embedding)      (None, 132, 32)           160000    
_________________________________________________________________
lstm (LSTM)                  (None, 132, 128)          82432     
_________________________________________________________________
lstm_1 (LSTM)                (None, 64)                49408     
_________________________________________________________________
dense_5 (Dense)              (None, 32)                2080      
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 33        
Total params: 293,953
Trainable params: 293,953
Non-trainable params: 0
_________________________________________________________________
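
The parameter counts in the two summaries can be reproduced by hand. A simple recurrent layer has `units * (units + input_dim + 1)` weights, and an LSTM layer has four such blocks (three gates plus the candidate cell update). The helper names below are illustrative, not Keras functions.

```python
def simple_rnn_params(units, input_dim):
    """Recurrent kernel + input kernel + bias for a plain RNN layer."""
    return units * (units + input_dim + 1)

def lstm_params(units, input_dim):
    """Four weight blocks: input, forget, and output gates plus the cell candidate."""
    return 4 * simple_rnn_params(units, input_dim)

def dense_params(units, input_dim):
    return units * input_dim + units

embedding = 5000 * 32                                    # vocabulary_size * embedding_size
rnn_total = (embedding + simple_rnn_params(128, 32) + simple_rnn_params(64, 128)
             + dense_params(64, 64) + dense_params(1, 64))
lstm_total = (embedding + lstm_params(128, 32) + lstm_params(64, 128)
              + dense_params(32, 64) + dense_params(1, 32))
print(rnn_total, lstm_total)  # 197185 293953
```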


In [26]:
# Define batch size and number of epochs.
batch_size = 512
num_epochs = 10
# Train the LSTM model.
model_lstm.fit(X_train, y_train, epochs=num_epochs, batch_size=batch_size, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f6a374f2810>

In [27]:
#evaluate the model using model.evaluate()

score_lstm = model_lstm.evaluate(X_test, y_test, verbose = 1)
print('Test accuracy for LSTM:', score_lstm[1])

Test accuracy for LSTM: 0.8404399752616882


##### Observations
1. Training the `LSTM` model takes longer than training the `SimpleRNN` model with the same numbers of recurrent units ($128$ and $64$). This is consistent with the number of **trainable parameters** each model possesses: $293953$ for `LSTM` and $197185$ for `SimpleRNN`. Note that the Dense heads differ slightly ($64$ and $1$ units for `SimpleRNN`, $32$ and $1$ units for `LSTM`). Both models are trained with the same `batch_size=512` and `epochs=10`, the same embedding size ($32$), and the same maximum input length ($132$).
2. The `SimpleRNN` model reached a test accuracy of $76.70\%$, whereas the `LSTM` model reached $84.04\%$. A likely reason is that a simple RNN suffers from **vanishing gradients**, which the LSTM mitigates. In addition, the LSTM cell has several 'gates' that decide how much of the previous state is passed on to the next time step, which lets it retain information over longer spans.
3. LSTMs also maintain a `cell state` alongside the `hidden state`. The cell state helps capture long-term context and the overall meaning of a review, which contributes to the improvement in performance over the Simple RNN.
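
As a rough illustration of the gating described above, here is one LSTM time step in NumPy. The stacked weight layout (input, forget, candidate, output) and all names are assumptions made for this sketch; it is not the Keras implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4*units, input_dim), U: (4*units, units), b: (4*units,)."""
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])      # input gate: how much new content to write
    f = sigmoid(z[1 * units:2 * units])      # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * units:3 * units])      # candidate cell values
    o = sigmoid(z[3 * units:4 * units])      # output gate: how much of the cell to expose
    c_t = f * c_prev + i * g                 # additive cell-state update
    h_t = o * np.tanh(c_t)                   # hidden state passed to the next step/layer
    return h_t, c_t

rng = np.random.default_rng(1)
units, input_dim = 3, 2                      # toy sizes
W = rng.normal(size=(4 * units, input_dim))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)
h, c = np.zeros(units), np.zeros(units)
for x_t in rng.normal(size=(6, input_dim)):
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)  # (3,) (3,)
```

The additive update `f * c_prev + i * g` is what allows information (and gradients) to pass through many time steps when the forget gate stays close to 1.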

**Perform Error analysis and explain using few examples.**

*Load $20$ samples from test data and see how each model performs on them.*

In [32]:
# Look at the first 10 entries of the word-index dictionary before analysing the performance of both models.
# The dictionary maps each word (key) to its index (value).
from itertools import islice
word_index = imdb.get_word_index()
print("Word index dictionary (first 10 entries)")
(list(islice(word_index.items(), 10)))

Word index dictionary (first 10 entries)


[('fawn', 34701),
 ('tsukino', 52006),
 ('nunnery', 52007),
 ('sonja', 16816),
 ('vani', 63951),
 ('woods', 1408),
 ('spiders', 16115),
 ('hanging', 2345),
 ('woody', 2289),
 ('trawling', 52008)]

In [43]:
def reviewInText(vector):
    """
    Convert the review vector to text form.
    """
    reverse_index = dict([(value,key) for (key,value) in imdb.get_word_index().items()])
    review = " ".join([reverse_index.get(i-3, "!") for i in vector])
    return review

# Set a seed and randomly choose 20 samples from the test data.
np.random.seed(25)
random_index = np.random.choice(25000,20,replace=False)
# Store the random samples taken.
random_test_samples = X_test[random_index, :]
random_test_sample_labels = y_test[random_index]

In [45]:
# Predict on Test data set.
# Predict with RNN.
RNN_labels = model.predict(random_test_samples)
RNN_labels[RNN_labels>=0.5] = 1
RNN_labels[RNN_labels<0.5] = 0
# Predict with LSTM.
LSTM_labels = model_lstm.predict(random_test_samples)
LSTM_labels[LSTM_labels>=0.5] = 1
LSTM_labels[LSTM_labels<0.5] = 0

# Show the actual labels (ground truth values.)
print("Randomly selected reviews and their actual labels.\n\n")

# Use `idx` as the loop variable so the Keras backend alias `k` imported earlier is not overwritten.
for idx, review in enumerate(random_test_samples):
    print(f"{idx+1}.", end = " ")
    print(reviewInText(review))
    print()
    print("Actual (Ground Truth) Label :", end = " ")
    if random_test_sample_labels[idx] == 1:
        print("POSITIVE")
    else:
        print("NEGATIVE")
    print("RNN predictions :", end = " ")
    if RNN_labels[idx] == 1:
        print("POSITIVE")
    else:
        print("NEGATIVE")
    print("LSTM predictions :", end = " ")
    if LSTM_labels[idx] == 1:
        print("POSITIVE\n\n")
    else:
        print("NEGATIVE\n\n")

Randomly selected reviews and their actual labels.


1. ! have somehow ! and have become the focus of their lives br br all of them somehow come to terms with and their past demons all of them except the first one who realizes the only way he can move on through life is getting flat ! again during this flat line ! he sees himself getting flat ! the first time and also sees the boy he killed trying to kill him this time round the boy kills him this time for a few minutes and in doing so has ! revenge for a few minutes in the movie one is left wondering if he gets to come back thankfully because most of us like happy endings the boy him of his past and he comes back to life again

Actual (Ground Truth) Label : POSITIVE
RNN predictions : POSITIVE
LSTM predictions : NEGATIVE


2. costumes cinematography and music are gorgeous the acting writing and directing are extremely strong and filled with realism class and originality i loved the film and the novel section iii in the film is much dif

**Error Analysis: Reviewed in Bottom-Up fashion**

1. Sample $20$: The `RNN` appears to focus mostly on small local groups of words, such as "made me smile", "good cast", "great story", and so classified the review as "POSITIVE". The `LSTM` can retain earlier context and draw on the meaning of the whole review. It also encountered the sarcastic "up so badly", and so classified the review as "NEGATIVE".

2. Sample $18$: Again, this review contains short groups of positive words that led the `RNN` to predict "POSITIVE" although the review was not positive.

3. Sample $14$: Although this was a positive review, both models marked it as "NEGATIVE", probably because of the heavy use of negative phrases from start to finish, such as "can't be all bad", "rough day", "revenge", "doesn't list anything", "unfortunately", "not on dvd", etc.

4. Sample $4$: Although the review as a whole reads as "NEGATIVE", the last one or two sentences do not contain any negative wording, which resulted in the review being flagged as "POSITIVE".

5. Sample $5$: Every statement in the review contains negative words except the last one, which is purely positive. Taken as a whole, a human reader would sense the review as "POSITIVE". However, since the `LSTM` tends to remember the meanings of individual words and accumulate their sentiment over time, it flagged the review as "NEGATIVE", whereas the `RNN` predicted "POSITIVE" because of the last statement.
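
The cases above fall into a few buckets: both models right, only one of them wrong, or both wrong. Below is a sketch of how such a tally could be computed from the label arrays. The function name `tally_outcomes` is hypothetical, and the three short lists are made-up placeholders, not the notebook's actual predictions.

```python
from collections import Counter

def tally_outcomes(truth, rnn_pred, lstm_pred):
    """Count how often each model agrees with the ground-truth label."""
    outcomes = Counter()
    for y, r, l in zip(truth, rnn_pred, lstm_pred):
        rnn_ok, lstm_ok = (r == y), (l == y)
        if rnn_ok and lstm_ok:
            outcomes["both correct"] += 1
        elif lstm_ok:
            outcomes["only RNN wrong"] += 1
        elif rnn_ok:
            outcomes["only LSTM wrong"] += 1
        else:
            outcomes["both wrong"] += 1
    return outcomes

# Placeholder labels for illustration only.
truth     = [1, 0, 1, 0, 1]
rnn_pred  = [1, 1, 0, 0, 1]
lstm_pred = [0, 0, 0, 0, 1]
outcomes = tally_outcomes(truth, rnn_pred, lstm_pred)
print(dict(outcomes))
```

With the arrays from this notebook, the call would presumably be `tally_outcomes(random_test_sample_labels, RNN_labels.ravel(), LSTM_labels.ravel())`, since the predictions come back with shape `(20, 1)`.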

# Rough-Work