# Simple Recurrent Network (SRN)

## Introduction

The **Simple Recurrent Network (SRN)**, proposed by Jeff Elman (1990), is a basic form of Recurrent Neural Network (RNN) that can learn temporal sequences.

The hidden state at time $t$ depends not only on the current input $x_t$ but also on the previous hidden state $h_{t-1}$, which is held in a context layer and fed back into the hidden layer at the next step.

---

## Mathematical Formulation

The SRN is defined by the following equations:

$$h_t = f(W_{hx} \cdot x_t + W_{hh} \cdot h_{t-1} + b_h)$$

$$y_t = g(W_{yh} \cdot h_t + b_y)$$

### Where:

- $h_t$ : hidden state at time $t$
- $x_t$ : input at time $t$
- $y_t$ : output at time $t$
- $f$ : activation function for hidden layer (usually $\tanh$)
- $g$ : activation function for output (e.g., $\text{softmax}$)
- $W_{hx}$ : weight matrix from input to hidden layer
- $W_{hh}$ : weight matrix from hidden to hidden (recurrent connections)
- $W_{yh}$ : weight matrix from hidden to output layer
- $b_h$, $b_y$ : bias vectors
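
The two equations above can be traced step by step in plain NumPy. The following is a minimal sketch, not the Keras implementation used later: the function name `srn_forward`, the layer sizes, the random weights, and the zero initial state $h_0 = 0$ are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def srn_forward(xs, W_hx, W_hh, W_yh, b_h, b_y):
    """Apply h_t = tanh(W_hx x_t + W_hh h_{t-1} + b_h) and
    y_t = softmax(W_yh h_t + b_y) over a sequence of input vectors."""
    h = np.zeros(W_hh.shape[0])  # assumed initial hidden state h_0 = 0
    hs, ys = [], []
    for x in xs:
        h = np.tanh(W_hx @ x + W_hh @ h + b_h)  # context (previous h) feeds back in
        y = softmax(W_yh @ h + b_y)
        hs.append(h)
        ys.append(y)
    return np.array(hs), np.array(ys)

# Illustrative sizes and random parameters (assumptions, not from the notebook)
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 5, 4, 6
W_hx = rng.normal(scale=0.5, size=(n_hidden, n_in))
W_hh = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
W_yh = rng.normal(scale=0.5, size=(n_out, n_hidden))
b_h = np.zeros(n_hidden)
b_y = np.zeros(n_out)

xs = rng.normal(size=(T, n_in))
hs, ys = srn_forward(xs, W_hx, W_hh, W_yh, b_h, b_y)
print("hidden states:", hs.shape, "outputs:", ys.shape)
```

Each row of `ys` is a probability distribution over the outputs, and each row of `hs` depends on every earlier input through the recurrent term $W_{hh} \cdot h_{t-1}$.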

**1. Install prerequisite packages**

In [1]:
import sys
import subprocess

packages = [
    'tensorflow',
    'pillow',
    'matplotlib',
    'numpy'
]

subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--quiet'] + packages)

0

**2. Imports and Data Preparation**

In [2]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

print("TensorFlow version:", tf.__version__)

# 1. The toy corpus
# Convert the symbolic input into a numeric form that a neural network can process.
# ' '→0, 'e'→1, 'h'→2, 'i'→3, 'l'→4, 'n'→5, 'o'→6, 't'→7
text = "hello hinton hello hinton hello hinton "
chars = sorted(list(set(text)))  # unique characters
stoi  = {ch: i for i, ch in enumerate(chars)}  # char to index
itos  = {i: ch for ch, i in stoi.items()}      # index to char
vocab_size = len(chars)

# 2. Encode the text into integers
# "hello hinton " (with the trailing space) → [2, 1, 4, 4, 6, 0, 2, 3, 5, 7, 6, 5, 0]
encoded = [stoi[c] for c in text]

# 3. Create input-output pairs
seq_len = 10  # window size
inputs, targets = [], []
for i in range(len(encoded) - seq_len):
    inputs.append(encoded[i:i+seq_len])
    targets.append(encoded[i+seq_len])

X = np.array(inputs, dtype=np.int32)
y = np.array(targets, dtype=np.int32)

print("Shape of X:", X.shape, "Shape of y:", y.shape)

TensorFlow version: 2.19.0
Shape of X: (29, 10) Shape of y: (29,)
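
To see what the sliding window produces, it can help to decode a few input-target pairs back into characters. This sketch rebuilds the same vocabulary and windows as the cell above; the variable name `pairs` is introduced here only for illustration.

```python
text = "hello hinton hello hinton hello hinton "
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}
encoded = [stoi[c] for c in text]

seq_len = 10
# Each pair is (10 input indices, index of the character that follows them)
pairs = [(encoded[i:i + seq_len], encoded[i + seq_len])
         for i in range(len(encoded) - seq_len)]

for window, target in pairs[:3]:
    decoded = "".join(itos[i] for i in window)
    print(repr(decoded), "->", repr(itos[target]))
# 'hello hint' -> 'o'
# 'ello hinto' -> 'n'
# 'llo hinton' -> ' '

print("number of pairs:", len(pairs))  # 29, matching the shape of X
```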


**3. Build the SRN Model**

In [3]:
embed_dim = 16
hidden_size = 64
seq_len = 10

model = keras.Sequential([
    layers.Embedding(input_dim=vocab_size, output_dim=embed_dim),  # input_length is deprecated in Keras 3
    layers.SimpleRNN(hidden_size, activation='tanh'),  # Elman SRN
    layers.Dense(vocab_size, activation='softmax')
])

model.compile(
    optimizer=keras.optimizers.Adam(0.01),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

model.summary()
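
The summary output is not reproduced here, but the expected parameter counts can be worked out by hand from the layer sizes. The sketch below assumes `vocab_size = 8` (the eight distinct characters of the toy corpus) and the default bias terms in `SimpleRNN` and `Dense`; the helper variable names are illustrative.

```python
vocab_size = 8    # ' ', 'e', 'h', 'i', 'l', 'n', 'o', 't'
embed_dim = 16
hidden_size = 64

# Embedding: one embed_dim-sized vector per vocabulary entry
embedding_params = vocab_size * embed_dim

# SimpleRNN: input weights W_hx, recurrent weights W_hh, and bias b_h
rnn_params = hidden_size * embed_dim + hidden_size * hidden_size + hidden_size

# Dense: output weights W_yh and bias b_y
dense_params = hidden_size * vocab_size + vocab_size

total_params = embedding_params + rnn_params + dense_params
print(embedding_params, rnn_params, dense_params, total_params)
# 128 5184 520 5832
```

Most of the parameters sit in the recurrent layer, and in particular in the 64 × 64 matrix $W_{hh}$.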



**4. Training (Automatic BPTT)**

In [4]:
batch_size = 32
# For each 10-character sequence, the model predicts the next character.
dataset = tf.data.Dataset.from_tensor_slices((X, y)).shuffle(1000).batch(batch_size)
# TensorFlow's automatic differentiation backpropagates the loss through all 10 time steps (BPTT).
history = model.fit(dataset, epochs=30, verbose=2)

Epoch 1/30
1/1 - 0s - 408ms/step - accuracy: 0.0690 - loss: 2.0946
Epoch 2/30
1/1 - 0s - 4ms/step - accuracy: 0.4138 - loss: 1.7921
Epoch 3/30
1/1 - 0s - 4ms/step - accuracy: 0.7931 - loss: 1.2526
Epoch 4/30
1/1 - 0s - 4ms/step - accuracy: 0.7241 - loss: 0.8587
Epoch 5/30
1/1 - 0s - 4ms/step - accuracy: 0.9310 - loss: 0.5942
Epoch 6/30
1/1 - 0s - 4ms/step - accuracy: 0.8621 - loss: 0.4787
Epoch 7/30
1/1 - 0s - 3ms/step - accuracy: 0.9310 - loss: 0.3649
Epoch 8/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.2677
Epoch 9/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.1952
Epoch 10/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.1262
Epoch 11/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.0994
Epoch 12/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.0737
Epoch 13/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.0466
Epoch 14/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.0317
Epoch 15/30
1/1 - 0s - 3ms/step - accuracy: 1.0000 - loss: 0.0266
Epoch 16/30
1/1 -
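
Keras handles backpropagation through time automatically, but the computation can be illustrated by hand. The following is a rough NumPy sketch under several assumptions: biases are omitted, the initial state is zero, the loss is cross-entropy at the last time step only (as in this notebook's next-character setup), and all sizes and weights are random placeholders. It computes the gradient with respect to $W_{hh}$ by walking backwards over the time steps and compares it against finite differences.

```python
import numpy as np

def forward(xs, target, W_hx, W_hh, W_yh):
    """Forward pass; the loss is taken at the last time step only."""
    hs = [np.zeros(W_hh.shape[0])]  # h_0 = 0 (assumption)
    for x in xs:
        hs.append(np.tanh(W_hx @ x + W_hh @ hs[-1]))
    logits = W_yh @ hs[-1]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    loss = -np.log(probs[target])
    return loss, hs, probs

def bptt_grad_W_hh(xs, target, W_hx, W_hh, W_yh):
    """Gradient of the loss w.r.t. W_hh, accumulated backwards over time."""
    _, hs, probs = forward(xs, target, W_hx, W_hh, W_yh)
    d_logits = probs.copy()
    d_logits[target] -= 1.0               # softmax + cross-entropy gradient
    dh = W_yh.T @ d_logits                # gradient reaching the last hidden state
    dW_hh = np.zeros_like(W_hh)
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)      # back through tanh at step t
        dW_hh += np.outer(da, hs[t - 1])  # contribution of step t
        dh = W_hh.T @ da                  # pass the gradient to step t-1
    return dW_hh

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 4, 5, 6
W_hx = rng.normal(scale=0.5, size=(n_hidden, n_in))
W_hh = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
W_yh = rng.normal(scale=0.5, size=(n_out, n_hidden))
xs = rng.normal(size=(T, n_in))
target = 2

analytic = bptt_grad_W_hh(xs, target, W_hx, W_hh, W_yh)

# Central finite differences as an independent check
eps = 1e-6
numeric = np.zeros_like(W_hh)
for i in range(n_hidden):
    for j in range(n_hidden):
        W_plus, W_minus = W_hh.copy(), W_hh.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        loss_plus = forward(xs, target, W_hx, W_plus, W_yh)[0]
        loss_minus = forward(xs, target, W_hx, W_minus, W_yh)[0]
        numeric[i, j] = (loss_plus - loss_minus) / (2 * eps)

max_diff = np.max(np.abs(analytic - numeric))
print("max |analytic - numeric| =", max_diff)
```

Because the same $W_{hh}$ is reused at every step, its gradient is the sum of one contribution per time step, which is what the backward loop accumulates.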

**5. Generate Text (Prediction)**

In [5]:
def sample_next_char(probs, temperature=1.0):
    probs = np.asarray(probs).astype('float64')
    if temperature != 1.0:
        probs = np.log(probs + 1e-9) / temperature
        probs = np.exp(probs)
    probs = probs / np.sum(probs)
    return np.random.choice(len(probs), p=probs)

def generate(seed="hello ", gen_len=60, temperature=0.8):
    context = [stoi[c] for c in seed[-seq_len:]]  # keep at most the last seq_len chars
    output = list(seed)

    for _ in range(gen_len):
        x = np.array(context[-seq_len:], dtype=np.int32)[None, :]
        probs = model.predict(x, verbose=0)[0]
        idx = sample_next_char(probs, temperature)
        output.append(itos[idx])
        context.append(idx)

    return "".join(output)

print(generate(seed="hello ", gen_len=60, temperature=0.8))

hello  inton hello hinton hello hinton hello hinton hello hinton h


The SRN has learned the short-term temporal structure of the text: after one early slip (a space where `h` was expected), it reproduces the pattern "hello hinton" even though it only sees 10 characters at a time.
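
The `temperature` argument used during generation controls how sharply the sampler favors the most likely character. The sketch below isolates that rescaling step; the function name `apply_temperature` and the example distribution are illustrative assumptions, and the maximum is subtracted before exponentiating only for numerical stability.

```python
import numpy as np

def apply_temperature(probs, temperature):
    """Rescale a probability vector as p_i^(1/T), then renormalize."""
    logits = np.log(np.asarray(probs, dtype=np.float64) + 1e-9) / temperature
    scaled = np.exp(logits - logits.max())
    return scaled / scaled.sum()

probs = np.array([0.6, 0.3, 0.1])  # made-up example distribution
for T in (0.5, 1.0, 2.0):
    print("T =", T, np.round(apply_temperature(probs, T), 3))
```

Temperatures below 1 concentrate probability on the top choice, which makes the output more repetitive and predictable; temperatures above 1 flatten the distribution and make unlikely characters more frequent.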