# Recurrent Neural Network

## (Book: Mastering Machine Learning with Python in Six Steps)

A recurrent neural network (RNN) is a natural modeling choice for sequential data, with applications such as speech,
text mining, image captioning, time series prediction, robot control, and language modeling.

At each step, the network receives the current input together with feedback (the hidden state) from the previous step.
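
The feedback loop can be sketched in a few lines of NumPy. This is a minimal illustration rather than the book's code: the names `rnn_forward`, `W_xh`, `W_hh`, and `b_h`, the tanh activation, and the toy dimensions are all assumptions chosen for the example.

```python
import numpy as np


def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a plain RNN over one sequence and return the hidden state at every step."""
    h = np.zeros(W_hh.shape[0])  # initial hidden state (assumed to be all zeros)
    states = []
    for x_t in inputs:  # process the sequence one time step at a time
        # the previous hidden state h is fed back into the current step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)


# hypothetical toy sizes, for illustration only
rng = np.random.default_rng(2017)
seq_len, input_dim, hidden_dim = 4, 3, 5
inputs = rng.normal(size=(seq_len, input_dim))
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

states = rnn_forward(inputs, W_xh, W_hh, b_h)
print(states.shape)  # (4, 5)
```

Each row of `states` depends on the row before it through `W_hh`, which is the only place where information from earlier steps enters the current step.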

## Long Short-Term Memory (LSTM)

LSTM is an improved RNN architecture designed to address the issues of a
general RNN, notably vanishing gradients, and it enables learning long-range
dependencies. It has better memory through linear memory cells surrounded by a
set of gate units that control the flow of information: when information should
enter the memory, when to forget it, and when to output it. The memory cell is
updated additively, with no activation function applied along its recurrent path,
so the gradient term is far less prone to vanishing during backpropagation.
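
A single LSTM time step can be sketched as follows. This is a rough NumPy illustration under stated assumptions, not the Keras implementation: the function name `lstm_step`, the stacking order of the gate blocks (`[input, forget, output, candidate]`), and the toy sizes are all chosen for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step; W, U, b stack four blocks as [input, forget, output, candidate]."""
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:n])              # input gate: when information enters the memory
    f = sigmoid(z[n:2 * n])          # forget gate: when to forget
    o = sigmoid(z[2 * n:3 * n])      # output gate: when to output
    g = np.tanh(z[3 * n:4 * n])      # candidate values to write into the cell
    c = f * c_prev + i * g           # additive (linear) memory cell update
    h = o * np.tanh(c)               # hidden state exposed to the next layer
    return h, c


# hypothetical toy setup: forget gate held open, input gate held closed
n_in, n_hid = 3, 4
W = np.zeros((4 * n_hid, n_in))
U = np.zeros((4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
b[0:n_hid] = -50.0           # input gate saturates near 0
b[n_hid:2 * n_hid] = 50.0    # forget gate saturates near 1

x_t = np.ones(n_in)
h_prev = np.zeros(n_hid)
c_prev = np.array([0.5, -1.0, 2.0, 0.0])

h, c = lstm_step(x_t, h_prev, c_prev, W, U, b)
print(np.allclose(c, c_prev))  # True
```

With the forget gate open and the input gate closed, the cell state passes through the step unchanged, which is the linear path that lets gradients survive over long ranges.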

In [1]:
import numpy as np
np.random.seed(2017)

from keras.utils import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, Activation, Embedding
from keras.layers import LSTM
from keras.datasets import imdb



In [2]:
max_features = 20000
maxlen = 80  # pad or truncate each review to this many words (only the top max_features most common words are kept)
batch_size = 32
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print('Pad sequences (samples x time)')
X_train = pad_sequences(X_train, maxlen=maxlen)
X_test = pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
X_train shape: (25000, 80)
X_test shape: (25000, 80)
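
To make the padding step concrete, here is a small pure-Python sketch of what `pad_sequences` does with its default settings (`padding='pre'`, `truncating='pre'`, `value=0`): short sequences get zeros on the left, and long sequences keep only their last `maxlen` items. The helper name `pad_pre` and the toy reviews are hypothetical.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad or left-truncate every sequence to exactly maxlen items (maxlen >= 1)."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]                 # keep only the last maxlen items
        fill = [value] * (maxlen - len(seq))      # zeros needed to reach maxlen
        padded.append(fill + seq)                 # padding goes on the left
    return padded


# hypothetical word-index sequences standing in for two reviews
reviews = [[11, 12], [21, 22, 23, 24, 25, 26]]
print(pad_pre(reviews, maxlen=4))
# [[0, 0, 11, 12], [23, 24, 25, 26]]
```

This is why `X_train` and `X_test` both end up with shape `(25000, 80)` regardless of the original review lengths.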


In [3]:
# Defining the model
model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, recurrent_dropout=0.2, dropout=0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))



In [4]:
# Compiling the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
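
For reference, the two quantities requested in `compile` can be computed by hand. The sketch below is a plain-Python approximation of mean binary cross-entropy and thresholded accuracy; the function names, the clipping constant `eps`, the 0.5 threshold, and the toy labels are assumptions for illustration, not the Keras internals.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


# hypothetical labels and sigmoid outputs
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
loss = binary_crossentropy(labels, probs)
acc = accuracy(labels, probs)
print(round(loss, 4), acc)
```

Confident wrong predictions are penalized heavily by the loss, while accuracy only counts whether each prediction lands on the right side of the threshold.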

In [5]:
# Train the model
model.fit(X_train, y_train, batch_size=batch_size, epochs=5, validation_data=(X_test, y_test))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7ffabdbfc7f0>

In [6]:
train_score, train_acc = model.evaluate(X_train, y_train, batch_size=batch_size)
test_score, test_acc = model.evaluate(X_test, y_test, batch_size=batch_size)



In [7]:
print('Train score:', train_score)
print('Train accuracy:', train_acc)
print('Test score:', test_score)
print('Test accuracy:', test_acc)

Train score: 0.0371478870511055
Train accuracy: 0.9907600283622742
Test score: 0.6249503493309021
Test accuracy: 0.8193600177764893


## Transfer Learning 

Based on our past experience, we humans can learn a new skill easily. We are more
efficient in learning, particularly if the task at hand is similar to what we have done in the
past. For example, learning a new programming language for a computer professional
or driving a new type of vehicle for a seasoned driver is relatively easy, based on our past
experience.

Transfer learning is an area in ML that aims to utilize the knowledge gained while
solving one problem to solve a different but related problem.

In [8]:
import numpy as np
np.random.seed(2017)

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import np_utils
from keras import backend as K

In [9]:
batch_size = 128
nb_classes = 5
nb_epoch = 5
# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
pool_size = 2
# convolution kernel size
kernel_size = 3
input_shape = (img_rows, img_cols, 1)

In [10]:
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [14]:
# create two datasets: one with digits below 5 and one with digits 5 and above
X_train_lt5 = X_train[y_train < 5]
y_train_lt5 = y_train[y_train < 5]

X_test_lt5 = X_test[y_test < 5]
y_test_lt5 = y_test[y_test < 5]
X_train_gte5 = X_train[y_train >= 5]
# shift labels 5..9 down to 0..4 so that classes start at 0 for np_utils.to_categorical
y_train_gte5 = y_train[y_train >= 5] - 5
X_test_gte5 = X_test[y_test >= 5]
y_test_gte5 = y_test[y_test >= 5] - 5
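
The split relies on NumPy boolean masks. A tiny stand-in dataset makes the mechanics easier to follow; the array contents and the variable names below are made up for the example.

```python
import numpy as np

# hypothetical stand-in for MNIST: six 2x2 "images" with digit labels
images = np.arange(6 * 2 * 2).reshape(6, 2, 2)
labels = np.array([0, 7, 3, 5, 9, 4])

low = labels < 5                      # boolean mask, True where the digit is 0..4

X_lt5, y_lt5 = images[low], labels[low]
X_gte5 = images[~low]
y_gte5 = labels[~low] - 5             # shift 5..9 down to 0..4

print(y_lt5)   # [0 3 4]
print(y_gte5)  # [2 0 4]
```

After the shift, both subsets use the same label range 0 to 4, so one network with five output units can serve either task.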

In [15]:
# Helper that preprocesses the data, then compiles, trains, and evaluates a model
def train_model(model, train, test, nb_classes):
    X_train = train[0].reshape((train[0].shape[0],) + input_shape)
    X_test = test[0].reshape((test[0].shape[0],) + input_shape)
    X_train = X_train.astype('float32')
    X_test = X_test.astype('float32')
    X_train /= 255
    X_test /= 255
    print('X_train shape:', X_train.shape)
    print(X_train.shape[0], 'train samples')
    print(X_test.shape[0], 'test samples')
    
    # convert class vectors to binary class matrices
    Y_train = np_utils.to_categorical(train[1], nb_classes)
    Y_test = np_utils.to_categorical(test[1], nb_classes)
    
    model.compile(
        loss='categorical_crossentropy',
        optimizer='adadelta',
        metrics=['accuracy']
    )
    
    model.fit(
        X_train,
        Y_train,
        batch_size=batch_size, 
        epochs=nb_epoch,
        verbose=1,
        validation_data=(X_test, Y_test)
    )
    score = model.evaluate(X_test, Y_test, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])
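
The preprocessing inside `train_model` (reshape to channels-last, scale pixels to [0, 1], one-hot encode the labels) can be reproduced with NumPy alone. The sketch below is an approximation for illustration: `preprocess` is a hypothetical helper, and `np.eye` indexing stands in for `np_utils.to_categorical`.

```python
import numpy as np


def preprocess(images, labels, num_classes):
    """Reshape to (samples, 28, 28, 1), scale to [0, 1], and one-hot encode labels."""
    x = images.reshape(images.shape[0], 28, 28, 1).astype("float32") / 255.0
    y = np.eye(num_classes, dtype="float32")[labels]  # row k of the identity is class k
    return x, y


# hypothetical data: three all-white 28x28 images with labels 0, 2, and 4
images = np.full((3, 28, 28), 255, dtype=np.uint8)
labels = np.array([0, 2, 4])

x, y = preprocess(images, labels, num_classes=5)
print(x.shape)  # (3, 28, 28, 1)
print(y)
```

Each row of `y` contains a single 1 in the column of its class, which is the target format that `categorical_crossentropy` expects.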

In [16]:
# Define two groups of layers: feature (convolutions) and classification (dense)

feature_layers = [
    Conv2D(
        nb_filters,
        kernel_size,
        padding="valid",
        input_shape=input_shape,
    ),
    Activation("relu"),
    Conv2D(nb_filters, kernel_size),
    Activation("relu"),
    MaxPooling2D(pool_size=(pool_size, pool_size)),
    Dropout(0.25),
    Flatten(),
]

classification_layers = [
    Dense(128),
    Activation("relu"),
    Dropout(0.5),
    Dense(nb_classes),
    Activation("softmax")
]

In [17]:
# create complete model
model = Sequential(feature_layers + classification_layers)

In [18]:
# train model for 5-digit classification [0..4]
train_model(model, (X_train_lt5, y_train_lt5), (X_test_lt5, y_test_lt5),
            nb_classes)

X_train shape: (30596, 28, 28, 1)
30596 train samples
5139 test samples




Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test score: 1.2482320070266724
Test accuracy: 0.8606733083724976


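Next, transfer the model trained on digits 0 to 4 to build a classifier for digits 5 to 9: the convolutional feature layers are frozen, and only the dense classification layers are retrained.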

In [19]:
# freeze the feature layers; train_model() recompiles the model, so the change takes effect
for layer in feature_layers:
    layer.trainable = False
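
What freezing means for the weight updates can be shown with a toy model. The sketch below is a hand-written NumPy stand-in, not Keras: the two-layer model, the `trainable` dictionary, the half mean squared error loss, and all names are assumptions made for the example. Gradients are computed for both layers, but only the layers marked trainable are updated.

```python
import numpy as np

rng = np.random.default_rng(2017)

# hypothetical toy model: a "feature" layer followed by a "head" layer
params = {
    "feature": rng.normal(size=(4, 3)),
    "head": rng.normal(size=(3, 1)),
}
trainable = {"feature": False, "head": True}  # the feature layer is frozen


def train_step(X, y, params, trainable, lr=0.01):
    """One gradient-descent step on half mean squared error, skipping frozen layers."""
    hidden = np.maximum(X @ params["feature"], 0.0)  # ReLU features
    err = hidden @ params["head"] - y
    grads = {
        "head": hidden.T @ err / len(X),
        "feature": X.T @ ((err @ params["head"].T) * (hidden > 0)) / len(X),
    }
    for name in params:
        if trainable[name]:  # frozen layers keep their current weights
            params[name] = params[name] - lr * grads[name]
    return 0.5 * float(np.mean(err ** 2))  # loss measured before the update


X = rng.normal(size=(8, 4))
y = rng.normal(size=(8, 1))
feature_before = params["feature"].copy()
head_before = params["head"].copy()

losses = [train_step(X, y, params, trainable) for _ in range(20)]

print(np.array_equal(params["feature"], feature_before))  # True: frozen
print(np.array_equal(params["head"], head_before))        # False: still learning
```

In the notebook, the same idea applies at a larger scale: the convolutional weights learned on digits 0 to 4 stay fixed, and only the dense layers adapt to digits 5 to 9.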

In [20]:
# transfer: train dense layers for new classification task [5..9]
train_model(
    model, 
    (X_train_gte5, y_train_gte5),
    (X_test_gte5, y_test_gte5), 
    nb_classes
)

X_train shape: (29404, 28, 28, 1)
29404 train samples
4861 test samples




Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test score: 1.322437047958374
Test accuracy: 0.6885414719581604
