
# IMDB: recurrent neural networks

## Data preprocessing

### Required imports

In [25]:
from keras.datasets import imdb
from keras.preprocessing import sequence
import numpy as np
from sklearn.model_selection import train_test_split

### Processing

Load the training and test data. To limit computation time, we restrict the vocabulary to roughly the 5,000 most frequent words; rarer words are replaced by an out-of-vocabulary index.

In [26]:
num_words = 5_000
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=num_words)
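To make the effect of num_words concrete, here is a minimal sketch of the capping rule. It assumes the Keras defaults for imdb.load_data (oov_char=2, with indices 0 and 1 reserved for padding and the start-of-review marker). The helper cap_vocabulary and the toy review are hypothetical and only mimic what the loader does internally.

```python
def cap_vocabulary(encoded_review, num_words, oov_char=2):
    """Replace every word index >= num_words by the out-of-vocabulary index."""
    return [index if index < num_words else oov_char for index in encoded_review]


# A made-up encoded review: 1 is the start marker, larger indices are rarer words.
toy_review = [1, 14, 22, 7486, 43, 5000, 4999]
capped = cap_vocabulary(toy_review, num_words=5_000)
print(capped)  # [1, 14, 22, 2, 43, 2, 4999]
```

Because every index stays below num_words, the same value can later be used as the input dimension of the embedding layer.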

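Since the reviews vary in length, and we prefer to limit the computation time, we base the classification on 100 tokens per review. Note that with its default settings pad_sequences truncates longer reviews from the start, so it keeps the last 100 tokens, and pads shorter reviews with zeros on the left.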

In [27]:
feature_length = 100
x_train = sequence.pad_sequences(x_train, maxlen=feature_length)
x_test = sequence.pad_sequences(x_test, maxlen=feature_length)
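The default behaviour of pad_sequences (padding='pre', truncating='pre') can be sketched in plain Python. The helper pad_pre below is hypothetical and handles a single review with a positive maxlen; it illustrates the rule rather than replacing the Keras function.

```python
def pad_pre(tokens, maxlen, value=0):
    """Mimic the pad_sequences defaults for one review (maxlen must be > 0)."""
    truncated = tokens[-maxlen:]                   # keep the LAST maxlen tokens
    padding = [value] * (maxlen - len(truncated))  # zeros needed to reach maxlen
    return padding + truncated                     # zeros go in front


short_review = [1, 14, 22]
long_review = [1, 14, 22, 16, 43, 530, 973]

print(pad_pre(short_review, maxlen=5))  # [0, 0, 1, 14, 22]
print(pad_pre(long_review, maxlen=5))   # [22, 16, 43, 530, 973]
```

To keep the beginning of each review instead, pad_sequences accepts truncating='post'.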

Now the training and test inputs are 2D arrays. We split the training set into a subset for actual training and one for validation. First we seed the random number generator to ensure reproducibility. With the default settings of train_test_split, 25% of the 25,000 training examples are held out as validation data.

In [28]:
np.random.seed(1234)

In [29]:
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train)
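The resulting subset sizes follow from simple arithmetic. The sketch below assumes the train_test_split default of holding out 25% of the data when test_size is not given, and uses the batch size of 64 that appears in the training cells of this notebook.

```python
import math

n_examples = 25_000   # size of the IMDB training set
val_fraction = 0.25   # assumed train_test_split default (test_size not given)
batch_size = 64       # value used in the model.fit calls of this notebook

n_val = math.ceil(n_examples * val_fraction)
n_train = n_examples - n_val
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)  # 18750 6250 293
```

The value 293 agrees with the number of steps per epoch reported in the training logs.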

## GRU

### Required imports & model definition

In [30]:
from tensorflow.keras.layers import Embedding, GRU, Dense, Dropout, Activation
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam


Again, to limit training time, we restrict ourselves to a small embedding dimension and a small number of recurrent units.

In [31]:
vector_length = 64
num_units = 64
model = Sequential()
model.add(Embedding(num_words, vector_length, mask_zero=True,
                    input_length=feature_length))
model.add(GRU(num_units))
model.add(Dropout(rate=0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

In [32]:
model.summary()
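As a cross-check of the summary, the parameter counts can be estimated by hand. This is a sketch that assumes the Keras defaults, in particular that the GRU layer uses reset_after=True and therefore carries two bias vectors per block; the helper functions are hypothetical and the numbers should be compared with the actual summary output.

```python
def embedding_params(vocab_size, vector_length):
    # One vector of length vector_length per word index.
    return vocab_size * vector_length


def gru_params(input_dim, units):
    # 3 blocks (update gate, reset gate, candidate state); each has an input
    # kernel, a recurrent kernel and, assuming reset_after=True, two biases.
    return 3 * (input_dim * units + units * units + 2 * units)


def dense_params(input_dim, outputs):
    # Weight matrix plus one bias per output.
    return input_dim * outputs + outputs


counts = {
    'embedding': embedding_params(5_000, 64),
    'gru': gru_params(64, 64),
    'dense': dense_params(64, 1),
}
print(counts, sum(counts.values()))
```

Under these assumptions the embedding layer dominates with 320,000 of the 345,025 parameters; the dropout and activation layers have none.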

In [33]:
model.compile(loss='binary_crossentropy', optimizer=Adam(),
              metrics=['accuracy'])

### Training

In [34]:
history = model.fit(x_train, y_train, batch_size=64, epochs=10,
                    validation_data=(x_val, y_val))

Epoch 1/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 41s 127ms/step - accuracy: 0.6660 - loss: 0.5792 - val_accuracy: 0.8437 - val_loss: 0.3562
Epoch 2/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 38s 119ms/step - accuracy: 0.8700 - loss: 0.3146 - val_accuracy: 0.8496 - val_loss: 0.3471
Epoch 3/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 38s 110ms/step - accuracy: 0.9036 - loss: 0.2468 - val_accuracy: 0.8474 - val_loss: 0.3438
Epoch 4/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 41s 111ms/step - accuracy: 0.9243 - loss: 0.2081 - val_accuracy: 0.8478 - val_loss: 0.3873
Epoch 5/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 40s 107ms/step - accuracy: 0.9445 - loss: 0.1543 - val_accuracy: 0.8421 - val_loss: 0.4039
Epoch 6/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 40s 105ms/step - accuracy: 0.9659 - loss: 0.1083 - val_accuracy: 0.8344 - val_loss: 0.5178
Epoch 7/10

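By epoch 6 the training accuracy (0.97) is far above the validation accuracy (0.83), and the validation loss has been rising since epoch 3 while the training loss keeps falling, so the model is likely heavily overfitted.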
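One common remedy is to stop training once the validation loss stops improving. Keras offers an EarlyStopping callback for this; the plain-Python sketch below only illustrates the rule on the validation losses copied from the log above. The helper early_stopping and the patience of 2 are hypothetical choices, not part of the original notebook.

```python
def early_stopping(val_losses, patience):
    """Return (best_epoch, stop_epoch), both 1-based.

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs.
    """
    best_epoch, best_loss = 0, float('inf')
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses)


# Validation losses of epochs 1 to 6, copied from the training log above.
gru_val_losses = [0.3562, 0.3471, 0.3438, 0.3873, 0.4039, 0.5178]
print(early_stopping(gru_val_losses, patience=2))  # (3, 5)
```

With this rule the run would have ended after epoch 5, and the weights from epoch 3 would be the ones to keep.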

### Testing

In [35]:
model.evaluate(x_test, y_test)

782/782 ━━━━━━━━━━━━━━━━━━━━ 15s 19ms/step - accuracy: 0.8161 - loss: 0.7625

[0.7382863163948059, 0.819920003414154]
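The returned list holds the loss followed by the accuracy, in the order given to compile. A test loss of about 0.74 next to an accuracy of about 0.82 may look inconsistent; one possible explanation is that binary cross-entropy punishes confident mistakes much more than cautious ones. The sketch below uses made-up labels and probabilities, and hypothetical helper functions, to show two predictors with the same accuracy but very different losses.

```python
import math


def binary_crossentropy(y_true, y_prob):
    """Mean binary cross-entropy of predicted probabilities."""
    losses = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
              for y, p in zip(y_true, y_prob)]
    return sum(losses) / len(losses)


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = [int((p > threshold) == bool(y)) for y, p in zip(y_true, y_prob)]
    return sum(hits) / len(hits)


y_true = [1, 1, 1, 1, 0]                     # made-up labels
cautious = [0.7, 0.7, 0.7, 0.7, 0.6]         # one mild mistake on the last item
confident = [0.99, 0.99, 0.99, 0.99, 0.99]   # one very confident mistake

for name, probs in [('cautious', cautious), ('confident', confident)]:
    print(name, accuracy(y_true, probs), round(binary_crossentropy(y_true, probs), 3))
```

Both predictors classify four of five examples correctly, yet the confident one has roughly twice the loss because of its single near-certain error.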

## LSTM

### Required imports & model definition

In [36]:
from tensorflow.keras.layers import LSTM

As for the GRU model, we limit training time by using a small embedding dimension and a small number of recurrent units.

In [37]:
vector_length = 64
num_units = 64
model = Sequential()
model.add(Embedding(num_words, vector_length, mask_zero=True,
                    input_length=feature_length))
model.add(LSTM(num_units))
model.add(Dropout(rate=0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

In [38]:
model.summary()
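The size of the LSTM layer in the summary can be estimated by hand. This sketch assumes the standard Keras LSTM layout of four blocks with one bias vector each; the helper lstm_params is hypothetical, and the result should be compared with the actual summary output.

```python
def lstm_params(input_dim, units):
    # 4 blocks (input, forget and output gates plus the candidate cell state),
    # each with an input kernel, a recurrent kernel and one bias vector.
    return 4 * (input_dim * units + units * units + units)


vector_length = 64   # embedding dimension feeding the LSTM
num_units = 64       # number of LSTM units
print(lstm_params(vector_length, num_units))  # 33024
```

A GRU layer of the same size has only three such blocks, so the LSTM layer is somewhat larger, although the embedding layer still accounts for most of the model's parameters.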

In [39]:
model.compile(loss='binary_crossentropy', optimizer=Adam(),
              metrics=['accuracy'])

### Training

In [23]:
history = model.fit(x_train, y_train, batch_size=64, epochs=10,
                    validation_data=(x_val, y_val))

Epoch 1/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 35s 110ms/step - accuracy: 0.6839 - loss: 0.5644 - val_accuracy: 0.8390 - val_loss: 0.3672
Epoch 2/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 32s 109ms/step - accuracy: 0.8767 - loss: 0.3009 - val_accuracy: 0.8464 - val_loss: 0.3441
Epoch 3/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 39s 101ms/step - accuracy: 0.9076 - loss: 0.2362 - val_accuracy: 0.8416 - val_loss: 0.3587
Epoch 4/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 43s 109ms/step - accuracy: 0.9252 - loss: 0.1923 - val_accuracy: 0.8368 - val_loss: 0.4084
Epoch 5/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 29s 100ms/step - accuracy: 0.9420 - loss: 0.1577 - val_accuracy: 0.8410 - val_loss: 0.4174
Epoch 6/10
293/293 ━━━━━━━━━━━━━━━━━━━━ 41s 100ms/step - accuracy: 0.9538 - loss: 0.1299 - val_accuracy: 0.8269 - val_loss: 0.4872
Epoch 7/10

The training accuracy is again much higher than the validation accuracy, and the validation loss rises after epoch 2 while the training loss keeps falling, so this model is likely heavily overfitted as well.

### Testing

In [24]:
model.evaluate(x_test, y_test)

782/782 ━━━━━━━━━━━━━━━━━━━━ 16s 20ms/step - accuracy: 0.8176 - loss: 0.7482

[0.7441230416297913, 0.8160399794578552]