<div style="text-align: right"><a href="http://ml-school.uni-koeln.de">Summer School "Deep Learning for
    Language Analysis"</a> <br/><strong>Text Analysis with Deep Learning</strong><br/>Sep 5 - 9, 2022<br/>Nils Reiter<br/><a href="mailto:nils.reiter@uni-koeln.de">nils.reiter@uni-koeln.de</a></div>

# A Simple Recurrent Neural Network

This notebook demonstrates the sequence labeling capabilities of recurrent neural networks with a very simple (artificial) example.

The input consists of sequences of 1s and 0s. The output also consists of 0s and 1s: it is the input *shifted* two positions to the right, with 0s filling the first two positions. Thus, `[0,1,0,0]` results in `[0,0,0,1]`.



We first generate a few random sequences to be used for training.

In [None]:
import numpy as np

number_of_sequences = 40
number_of_symbols = 2
length_of_sequences = 15

# initialize the random number generator
rng = np.random.default_rng()

x_train = np.array([rng.integers(0,number_of_symbols,length_of_sequences) for i in range(number_of_sequences)])
# x_train

This creates the y-data by iterating over each sequence, removing its last two elements, and inserting two 0s at the front.

In [None]:
# drop the last two elements and prepend two 0s, so the length stays the same
y_train = np.array([np.insert(x_seq[:-2],0,[0,0]) for x_seq in x_train])
# y_train
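As a sanity check on the shifting step, the following sketch builds the targets in two ways and compares them. The small array `x`, the seed, and the names `y_loop` and `y_pad` are assumptions made for this illustration only; `np.pad` is shown as one possible vectorized alternative, not as the notebook's method.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make this sketch reproducible
x = rng.integers(0, 2, size=(4, 15))  # 4 toy sequences of length 15

offset = 2
# Row-by-row construction, in the same spirit as the notebook
y_loop = np.array([np.insert(row[:-offset], 0, [0] * offset) for row in x])
# Vectorized alternative: drop the last `offset` columns, then left-pad with zeros
y_pad = np.pad(x[:, :-offset], ((0, 0), (offset, 0)))

print(np.array_equal(y_loop, y_pad))  # True
```

Both variants keep the sequence length unchanged, which is what allows the network to emit one label per input position.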

Up to now, the input sequences consist of scalar integers. Recurrent layers in Keras expect each element of a sequence to be a vector, so that a single time step can carry multiple features (e.g., from an embedding).

Thus, we reshape the data appropriately. In essence, each scalar value becomes a vector of length 1.

In [None]:
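# target shape: (number of sequences, time steps, features per time step)
x_train = x_train.reshape(number_of_sequences, length_of_sequences, 1)
y_train = y_train.reshape(number_of_sequences, length_of_sequences, 1)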
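To make the effect of the reshape concrete, here is a minimal sketch on a tiny array. The array `x` and the names `x3d` and `x3d_alt` are made up for this illustration; indexing with `np.newaxis` is shown as an alternative that avoids spelling out the sizes.

```python
import numpy as np

x = np.array([[0, 1, 0, 0],
              [1, 1, 0, 1]])   # shape (2, 4): 2 sequences, 4 time steps
x3d = x.reshape(2, 4, 1)       # shape (2, 4, 1): one feature per time step
x3d_alt = x[..., np.newaxis]   # same result without hard-coding the sizes

print(x3d.shape)               # (2, 4, 1)
print(x3d[0].tolist())         # [[0], [1], [0], [0]]
```

The values themselves are unchanged; only an extra axis of length 1 is added at the end.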

## Model building

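The model is built in the usual way with the Keras `Sequential` API. The RNN layer is provided by [`SimpleRNN` from keras](https://keras.io/api/layers/recurrent_layers/simple_rnn/), which is what we talked about before. Setting `return_sequences=True` makes the layer emit an output at every time step instead of only at the last one, which is what sequence labeling requires.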

In [None]:
from tensorflow.keras import models, layers, optimizers

model = models.Sequential()
model.add(layers.Input(shape=(length_of_sequences,1)))
model.add(layers.SimpleRNN(5,return_sequences=True))
model.add(layers.Dense(1))

model.compile(loss="mean_squared_error",
              metrics=["accuracy"])

model.summary()
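To see what this architecture computes, the sketch below re-implements the forward pass in plain NumPy, assuming the default `tanh` activation of `SimpleRNN` and a zero initial state. The weight names `W`, `U`, `b`, `V`, `c` and the function `forward` are hypothetical, and the weights are random stand-ins rather than trained values, so the outputs are meaningless; the point is the recurrence and the shapes.

```python
import numpy as np

rng = np.random.default_rng(0)
units, features, steps = 5, 1, 15

# Randomly initialised stand-ins for the trained weights (illustration only)
W = rng.normal(size=(features, units))  # input -> hidden
U = rng.normal(size=(units, units))     # hidden -> hidden (the recurrence)
b = np.zeros(units)                     # hidden bias
V = rng.normal(size=(units, 1))         # hidden -> output (the Dense layer)
c = np.zeros(1)                         # output bias

def forward(x_seq):
    """Run one sequence of shape (steps, features) through the RNN and the dense layer."""
    h = np.zeros(units)                   # initial hidden state
    outputs = []
    for x_t in x_seq:
        h = np.tanh(x_t @ W + h @ U + b)  # new state depends on input and previous state
        outputs.append(h @ V + c)         # one prediction per time step
    return np.array(outputs)

x_seq = rng.integers(0, 2, size=(steps, features))
y_hat = forward(x_seq)
print(y_hat.shape)                        # (15, 1)

n_params = W.size + U.size + b.size + V.size + c.size
print(n_params)                           # 41
```

The hidden state `h` is the only channel through which earlier inputs can influence later outputs, so the network has to learn to carry each symbol forward for two steps. The parameter count of 41 (35 for the recurrent layer, 6 for the dense layer) can be compared with the output of `model.summary()`.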

In [None]:
model.fit(x_train, y_train, epochs=3, batch_size=5, verbose=1)

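Let's predict a sequence and see what happens. A well-trained model would output values close to `[0,0,1,0,0,1,0,0,0,1,0,0,0,0,0]`, i.e., the input shifted by two positions.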

In [None]:
model.predict(np.array([[1,0,0,1,0,0,0,1,0,0,0,0,0,0,0]]).reshape(1,length_of_sequences,1))
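Because the model is trained with a regression loss, its outputs are real numbers rather than clean 0s and 1s. One way to compare them with the expected labels is to threshold at 0.5, as in the following sketch. The helper `to_labels` is hypothetical, and `fake_pred` contains made-up numbers standing in for the output of `model.predict`; they are not actual results.

```python
import numpy as np

def to_labels(predictions, threshold=0.5):
    """Map real-valued outputs of shape (batch, steps, 1) to 0/1 labels of shape (batch, steps)."""
    return (np.asarray(predictions)[..., 0] >= threshold).astype(int)

x = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0])
expected = np.concatenate([[0, 0], x[:-2]])  # the input delayed by two steps

# Made-up values standing in for model.predict(...) output; NOT real results
fake_pred = (expected + 0.1).reshape(1, -1, 1)

labels = to_labels(fake_pred)
print(np.array_equal(labels[0], expected))   # True
```

With real predictions, the same comparison shows at which positions the model gets the shifted label right.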