<div style="text-align: right"><a href="http://ml-school.uni-koeln.de">Virtual Summer School "Deep Learning for
    Language Analysis"</a> <br/><strong>Text Analysis with Deep Learning</strong><br/>Aug 31 – Sep 4, 2020<br/>Nils Reiter<br/><a href="mailto:nils.reiter@uni-koeln.de">nils.reiter@uni-koeln.de</a></div>

# A Simple Recurrent Neural Network

This notebook demonstrates the sequence labeling capabilities of recurrent neural networks with a very simple (artificial) example.

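The input consists of sequences of 1s and 0s. The output also consists of 0s and 1s, but *shifted* to the right by two positions: the first two output elements are always 0, and the last two input elements are dropped. Thus, `[0,1,0,0]` results in `[0,0,0,1]`.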
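To make the shift concrete, here is a minimal sketch in plain Python. The helper name `shift_right` and its parameters are hypothetical and not part of the notebook.

```python
def shift_right(sequence, positions=2, fill=0):
    """Shift a list to the right: pad the front, drop what falls off the end."""
    padded = [fill] * positions + list(sequence)
    # Cut back to the original length so input and output stay aligned.
    return padded[:len(sequence)]


print(shift_right([0, 1, 0, 0]))  # [0, 0, 0, 1]
```

The output always has the same length as the input, which is what makes this a sequence labeling task: one label per input position.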



We first generate a few random sequences to be used for training.

In [2]:
import numpy as np

number_of_sequences = 40
number_of_symbols = 2
length_of_sequences = 15

# initialize the random number generator
rng = np.random.default_rng()

x_train = np.array([rng.integers(0,number_of_symbols,length_of_sequences) for i in range(number_of_sequences)])
# x_train
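Because the generator above is created without a seed, every run draws different sequences. If repeatable data is wanted, one option is to pass a seed. The sketch below assumes the seed 42, an arbitrary choice, and uses the hypothetical name `x_demo` to avoid overwriting `x_train`.

```python
import numpy as np

number_of_sequences = 40
number_of_symbols = 2
length_of_sequences = 15

# A fixed seed (42 is arbitrary) makes the random draw repeatable.
rng = np.random.default_rng(42)

# Passing a tuple as `size` draws the whole 2D array in one call.
x_demo = rng.integers(0, number_of_symbols,
                      size=(number_of_sequences, length_of_sequences))

print(x_demo.shape)  # (40, 15)
```

Note that the upper bound of `rng.integers` is exclusive by default, so the values are drawn from `{0, 1}`.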

Next, we create the y-data: for each input sequence, we drop the last two elements and insert two 0s at the front, so that the output has the same length as the input.

In [3]:
# drop the last two elements and prepend two zeros (a shift by two positions)
y_train = np.array([np.insert(x_seq[:-2],0,[0,0]) for x_seq in x_train])
# y_train
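A quick way to convince oneself that this construction really is a shift by two is to compare slices of input and output. The following sketch uses a small stand-in array (`x_demo`, `y_demo`, and the seed are assumptions for illustration) instead of the actual training data.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
x_demo = rng.integers(0, 2, size=(4, 15))

# Same construction as in the cell above, with the slice written relative to the end.
y_demo = np.array([np.insert(x_seq[:-2], 0, [0, 0]) for x_seq in x_demo])

# Output and input have identical shapes.
assert y_demo.shape == x_demo.shape
# The first two output positions are always 0.
assert np.all(y_demo[:, :2] == 0)
# From position 2 onwards, the output repeats the input two steps earlier.
assert np.array_equal(y_demo[:, 2:], x_demo[:, :-2])
```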

Up to now, each element of an input sequence is a scalar integer. Recurrent layers in Keras expect each element to be a vector, i.e., input of shape `(sequences, time steps, features)`. This makes it possible to include multiple features per time step (e.g., from an embedding).

Thus, we reshape the data appropriately. In essence, each scalar value becomes a vector of length 1.

In [5]:
# add a trailing feature dimension of size 1
x_train = x_train.reshape(number_of_sequences, length_of_sequences, 1)
y_train = y_train.reshape(number_of_sequences, length_of_sequences, 1)
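The reshape only adds a trailing axis; no values change. As a small illustration (with a stand-in array `x_demo`, not the training data), the following sketch shows two equivalent NumPy idioms that avoid spelling out the sizes.

```python
import numpy as np

x_demo = np.arange(6).reshape(2, 3)  # 2 sequences of length 3

a = x_demo.reshape(2, 3, 1)          # explicit sizes, as in the cell above
b = x_demo[..., np.newaxis]          # append a new axis via indexing
c = np.expand_dims(x_demo, axis=-1)  # the same, as a function call

print(a.shape, b.shape, c.shape)     # (2, 3, 1) (2, 3, 1) (2, 3, 1)
print(a[0].tolist())                 # [[0], [1], [2]]
```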

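The model is set up in the usual way. The RNN layer is provided by [`SimpleRNN` from keras](https://keras.io/api/layers/recurrent_layers/simple_rnn/), which in essence implements the recurrent layer we talked about before. Setting `return_sequences=True` makes the layer emit one output per time step instead of only the last one, which is what sequence labeling requires.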
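As a reminder of what such a layer computes, here is a rough NumPy sketch of the forward pass of a simple recurrent layer with a `tanh` activation. It is an illustration under assumptions, not the Keras implementation: the function name `simple_rnn_forward`, the weight names, and the random weights are all made up for this example.

```python
import numpy as np


def simple_rnn_forward(x, W_in, W_rec, b):
    """Return the hidden state at every time step for a single sequence.

    x:     (timesteps, features)
    W_in:  (features, units)  input weights
    W_rec: (units, units)     recurrent weights
    b:     (units,)           bias
    """
    h = np.zeros(W_rec.shape[0])  # initial hidden state
    states = []
    for x_t in x:
        # The new state depends on the current input and the previous state.
        h = np.tanh(x_t @ W_in + h @ W_rec + b)
        states.append(h)
    return np.stack(states)  # one state per time step


rng = np.random.default_rng(0)  # arbitrary seed
features, units = 1, 5
W_in = rng.normal(size=(features, units))
W_rec = rng.normal(size=(units, units))
b = np.zeros(units)

x = np.array([0, 1, 0, 0], dtype=float).reshape(-1, 1)
states = simple_rnn_forward(x, W_in, W_rec, b)
print(states.shape)  # (4, 5)
```

Returning the full stack of states corresponds to `return_sequences=True`; returning only the final `h` would correspond to the default behaviour.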

## Model building

In [6]:
from tensorflow.keras import models, layers, optimizers

model = models.Sequential()
model.add(layers.Input(shape=(length_of_sequences,1)))
model.add(layers.SimpleRNN(5,return_sequences=True))
model.add(layers.Dense(1))

model.compile(loss="mean_squared_error",
              optimizer="adam",
              metrics=["accuracy"])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn (SimpleRNN)       (None, 15, 5)             35        
_________________________________________________________________
dense (Dense)                (None, 15, 1)             6         
Total params: 41
Trainable params: 41
Non-trainable params: 0
_________________________________________________________________
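The parameter counts in the summary can be reproduced by hand: a simple recurrent layer has one weight per (input feature, unit) pair, one per (unit, unit) pair, and one bias per unit; a dense layer has one weight per (input, output) pair plus one bias per output. The helper names below are hypothetical.

```python
def simple_rnn_params(features, units):
    # input weights + recurrent weights + biases
    return features * units + units * units + units


def dense_params(inputs, outputs):
    # weights + biases
    return inputs * outputs + outputs


rnn = simple_rnn_params(features=1, units=5)
dense = dense_params(inputs=5, outputs=1)
print(rnn, dense, rnn + dense)  # 35 6 41
```

The same dense weights are applied at every one of the 15 time steps, so the sequence length does not affect the number of parameters.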


In [7]:
model.fit(x_train, y_train, epochs=3, batch_size=5, verbose=1)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x148e9f160>

Let's predict a sequence and see what happens. The model outputs one real-valued number per time step.

In [8]:
model.predict(np.array([[1,0,0,1,0,0,0,1,0,0,0,0,0,0,0]]).reshape(1,length_of_sequences,1))

array([[[-0.04548491],
        [ 0.5868582 ],
        [-0.18736522],
        [ 0.2419456 ],
        [ 0.59146625],
        [ 0.09438032],
        [ 0.3664228 ],
        [ 0.02078995],
        [ 0.7117472 ],
        [ 0.02720589],
        [ 0.38891363],
        [ 0.12878738],
        [ 0.30441555],
        [ 0.24255693],
        [ 0.1810411 ]]], dtype=float32)
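To read these numbers as labels, one option is to threshold them and compare the result with the expected shifted sequence. The sketch below copies the predictions printed above; the threshold of 0.5 is an assumption, not something the notebook prescribes.

```python
import numpy as np

inputs = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0])

# Values copied from the prediction output above.
predicted = np.array([-0.04548491, 0.5868582, -0.18736522, 0.2419456,
                      0.59146625, 0.09438032, 0.3664228, 0.02078995,
                      0.7117472, 0.02720589, 0.38891363, 0.12878738,
                      0.30441555, 0.24255693, 0.1810411])

# The target: the input shifted to the right by two positions.
expected = np.concatenate(([0, 0], inputs[:-2]))

# Assumed decision rule: everything at or above 0.5 counts as a 1.
labels = (predicted >= 0.5).astype(int)

print(expected.tolist())
print(labels.tolist())
matches = int((labels == expected).sum())
print(matches, "of", len(expected), "positions match")  # 9 of 15 positions match
```

In this particular run, the thresholded 1s appear one position after each 1 in the input rather than two. A plausible reading is that three epochs on 40 sequences are not enough for the network to learn the full shift; training for more epochs would be the obvious next experiment.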