# dump-rnn

This notebook visualizes recurrent neural network activations. Three networks (a simple RNN, an LSTM, and a GRU) are trained on a small portion of the IMDB dataset. Each network is then run on a sample sentence, and the activations produced during prediction are "dumped" and saved both as images and as an npz file.

In [1]:
import keras
import numpy as np

Using TensorFlow backend.


First, the IMDB dataset is loaded using the keras library. We cap reviews at 100 words (`maxlen`) and the vocabulary at the 10000 most frequent words (`maxword`) to reduce training time. The loader leaves out reviews longer than the cap, which is why only a small portion of the dataset remains. The sequences are then padded to a common length so that the networks can consume them at training time.

In [2]:
from keras.datasets import imdb

maxlen = 100
maxword = 10000

(x_train, y_train), (x_test, y_test) = imdb.load_data(path="imdb.npz", 
                                                      num_words=maxword, 
                                                      maxlen=maxlen)

from keras.preprocessing import sequence


data = sequence.pad_sequences(x_train, maxlen=maxlen)
labels = np.reshape(y_train, len(y_train))

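As a rough illustration of the padding step, the sketch below mimics what `sequence.pad_sequences` does with its default settings: zeros are added at the front, and overlong sequences are cut from the front. The helper `pad_pre` is a hypothetical stand-in written for this example and is not part of keras.

```python
import numpy as np

def pad_pre(seqs, maxlen, value=0):
    """Hypothetical stand-in for pad_sequences with 'pre' padding and truncating."""
    out = np.full((len(seqs), maxlen), value, dtype="int32")
    for i, seq in enumerate(seqs):
        if len(seq) == 0:
            continue  # an empty sequence stays all padding
        trunc = seq[-maxlen:]          # keep only the last maxlen tokens
        out[i, -len(trunc):] = trunc   # right-align, padding on the left
    return out

# One sequence shorter than maxlen, one longer
padded = pad_pre([[5, 6, 7], [1, 2, 3, 4, 5, 6]], maxlen=4)
print(padded)
```
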
Next, the networks whose activations will be dumped are built. Three models are tested in this notebook: a simple RNN, an LSTM, and a GRU. Each stacks an embedding layer, two recurrent layers of 128 units, and a single sigmoid output unit.

In [3]:
from keras.models import Sequential
from keras import layers

rmodels = []

In [4]:
rnn = Sequential()

rnn.add(layers.Embedding(maxword, 32, input_length=maxlen))
rnn.add(layers.SimpleRNN(128, return_sequences=True))
rnn.add(layers.SimpleRNN(128))
rnn.add(layers.Dense(1, activation='sigmoid'))

rmodels.append(rnn)

In [5]:
lstm = Sequential()

lstm.add(layers.Embedding(maxword, 32, input_length=maxlen))
lstm.add(layers.LSTM(128, return_sequences=True))
lstm.add(layers.LSTM(128))
lstm.add(layers.Dense(1, activation='sigmoid'))

rmodels.append(lstm)

In [6]:
gru = Sequential()

gru.add(layers.Embedding(maxword, 32, input_length=maxlen))
gru.add(layers.GRU(128, return_sequences=True))
gru.add(layers.GRU(128))
gru.add(layers.Dense(1, activation='sigmoid'))

rmodels.append(gru)

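To make concrete what a recurrent layer outputs, here is a minimal NumPy sketch of a plain RNN loop, assuming the usual update h_t = tanh(x_t W + h_{t-1} U + b). The weights are random placeholders rather than trained values, and `simple_rnn_forward` is a hypothetical helper, not a keras function. With `return_sequences=True` every hidden state is returned (one row per timestep); otherwise only the last one is.

```python
import numpy as np

def simple_rnn_forward(x, W, U, b, return_sequences=False):
    """Run a plain tanh RNN over x of shape (timesteps, input_dim)."""
    h = np.zeros(U.shape[0])  # initial hidden state
    states = []
    for x_t in x:
        h = np.tanh(x_t @ W + h @ U + b)  # one recurrent step
        states.append(h)
    return np.stack(states) if return_sequences else h

rng = np.random.default_rng(0)
timesteps, input_dim, units = 100, 32, 128  # sizes used in this notebook
x = rng.normal(size=(timesteps, input_dim))          # placeholder embedded sentence
W = rng.normal(scale=0.1, size=(input_dim, units))   # placeholder input weights
U = rng.normal(scale=0.1, size=(units, units))       # placeholder recurrent weights
b = np.zeros(units)

all_states = simple_rnn_forward(x, W, U, b, return_sequences=True)
last_state = simple_rnn_forward(x, W, U, b)
print(all_states.shape, last_state.shape)
```
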
Each of the models is compiled and trained on the training data, with 30% of the samples held out for validation.

In [8]:
for model in rmodels:
    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
    model.fit(data, labels, epochs=5, batch_size=64, validation_split=0.3)

Train on 4015 samples, validate on 1721 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Train on 4015 samples, validate on 1721 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Train on 4015 samples, validate on 1721 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

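The sample counts in the log follow from `validation_split=0.3`. The sketch below reproduces the arithmetic, assuming that keras rounds the split index down and holds out the trailing part of the arrays. The total is simply 4015 + 1721 as reported in the log.

```python
n_samples = 4015 + 1721   # total number of samples reported in the training log
validation_split = 0.3

# Assumed rule: the split index is rounded down and the tail is held out
split_at = int(n_samples * (1 - validation_split))
n_train = split_at
n_val = n_samples - split_at
print(n_train, n_val)
```

Under that assumption the validation samples are the last rows of `data`, so the split is only representative if the rows are not ordered by label.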

Now, the activations are dumped. First, each model is wrapped in a new model that returns the output of every layer. These outputs are collected in a list, which is then saved both as plots and as an npz file.

In [9]:
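from keras import models

# Use the first training review as the sample input (a batch of one)
input_sequence = np.reshape(data[0], (1, maxlen))
dump = []

for model in rmodels:
    # Build a model that returns every layer's output for the same input
    activations = [layer.output for layer in model.layers]
    dump_model = models.Model(inputs=model.input, outputs=activations)
    dump.append(dump_model.predict(input_sequence))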

In [10]:
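from matplotlib import pyplot as plt

# Plot every layer's activations and save one image per model/layer pair
for n, network in enumerate(dump, start=1):
    for l, layer in enumerate(network, start=1):
        image = np.squeeze(layer)  # drop the batch axis
        plt.plot(image)
        plt.savefig("model" + str(n) + "-" + "layer" + str(l))
        plt.clf()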


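As a sketch of what each saved plot contains, the example below uses zero-filled placeholder arrays with the shapes one would expect from the layers defined above for a single 100-word input. These shapes are inferred from the layer sizes, not taken from an actual run, and the dictionary keys are hypothetical labels. `np.squeeze` drops the batch axis, and when `plt.plot` receives a 2-D array it draws one line per column, so a recurrent layer with `return_sequences=True` would give 128 lines over 100 timesteps.

```python
import numpy as np

# Assumed output shapes for one model and a single input sentence
expected_shapes = {
    "embedding": (1, 100, 32),     # one 32-dim vector per word
    "recurrent_1": (1, 100, 128),  # return_sequences=True: one state per timestep
    "recurrent_2": (1, 128),       # only the final state
    "dense": (1, 1),               # sigmoid prediction
}

# Shape of each array after the batch axis is squeezed away
squeezed = {name: np.squeeze(np.zeros(shape)).shape
            for name, shape in expected_shapes.items()}
for name, shape in squeezed.items():
    print(name, shape)
```
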
In [11]:
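# Store every layer's activations under a key such as "model1_layer2" (np.savez with no arrays writes an empty archive)
np.savez("dump", **{"model%d_layer%d" % (n, l): a for n, net in enumerate(dump, 1) for l, a in enumerate(net, 1)})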
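
Because `np.savez` only stores the arrays that are passed to it, a nested list of activations has to be flattened into named arrays before saving. The following is a minimal sketch of one way to do that and to read the arrays back. `fake_dump`, the key scheme, and the temporary path are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the dump: two "models" with two layers each
fake_dump = [
    [np.zeros((1, 100, 32)), np.zeros((1, 128))],
    [np.ones((1, 100, 32)), np.ones((1, 128))],
]

# Give every array its own key so it can be looked up later
arrays = {
    "model%d_layer%d" % (n, l): layer
    for n, network in enumerate(fake_dump, start=1)
    for l, layer in enumerate(network, start=1)
}

path = os.path.join(tempfile.mkdtemp(), "dump.npz")
np.savez(path, **arrays)

# Read the archive back and pull out one layer by name
with np.load(path) as archive:
    keys = sorted(archive.files)
    first = archive["model1_layer1"]
print(keys, first.shape)
```

Named keys keep arrays of different shapes apart, which a single stacked array could not do here.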