## RNN Basics

Recurrent Neural Networks (RNNs) are built for handling **sequences** such as text, time series, or signals. Unlike feedforward networks, they carry a hidden state from one step to the next, so what came before can influence the next output.


In [1]:
import random
import math

random.seed(42)

### The Problem: Sequence Prediction

Let’s say we have this repeating pattern:

`[0, 1, 0, 1, 0, 1, ...]`

We want the RNN to learn this pattern — that every time it sees a `0`, it should predict `1`, and every time it sees `1`, it should predict `0`.

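We’ll feed one number at a time and ask the RNN to predict the next one. For this particular pattern the next value is fully determined by the current input, so the task mainly checks that the recurrent update trains correctly; it does not force the network to rely on its memory.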

In [2]:
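sequence = [0, 1] * 20   # 40 values: 0, 1, 0, 1, ...
inputs = sequence[:-1]   # every value except the last
targets = sequence[1:]   # the same sequence shifted left by one step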
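
As a quick check on the shift-by-one pairing, the sketch below prints the first few input/target pairs and tests whether the target is always the opposite of the current input. The name `memoryless_ok` and the print format are ours, added only for illustration.

```python
sequence = [0, 1] * 20
inputs, targets = sequence[:-1], sequence[1:]

# Each target is the value that follows its input in the sequence.
for x, t in list(zip(inputs, targets))[:4]:
    print(f"input {x} -> target {t}")

# For this pattern the target is always the opposite of the current input
# (memoryless_ok is a hypothetical name used only in this check).
memoryless_ok = all(t == 1 - x for x, t in zip(inputs, targets))
print(len(inputs), len(targets), memoryless_ok)
```

Slicing off one element on each side leaves 39 pairs from the 40-value sequence.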

In [3]:
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_derivative(output):
    # Expects the sigmoid *output* s = sigmoid(x), not x itself: ds/dx = s * (1 - s)
    return output * (1 - output)
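
Note that `sigmoid_derivative` takes the sigmoid's output rather than its input. A small sanity check, assuming an arbitrary test point `x = 0.3` and a finite-difference step `eps` (both our choices), compares it against a numerical derivative:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_derivative(output):
    return output * (1 - output)

x = 0.3       # arbitrary test point (assumption, for illustration only)
eps = 1e-6    # finite-difference step (assumption)

# Central-difference estimate of d(sigmoid)/dx at x
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

analytic = sigmoid_derivative(sigmoid(x))  # correct: pass the sigmoid output
wrong = sigmoid_derivative(x)              # mistake: passing the raw input

print("matches when given the output:", abs(numeric - analytic) < 1e-8)
print("matches when given the input: ", abs(numeric - wrong) < 1e-8)
```

This is why the training loop passes `prediction` and `hidden_state_new` (both already sigmoid outputs) to `sigmoid_derivative`.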

In [4]:
class SimpleRNN:
    def __init__(self, lr=0.1, epochs=100):
        # One input, one hidden unit, one output: every weight is a single scalar
        self.w_input_hidden = random.uniform(-1, 1)  # input to hidden
        self.w_hidden_hidden = random.uniform(-1, 1)  # hidden to hidden (recurrent)
        self.w_hidden_output = random.uniform(-1, 1)  # hidden to output
        self.b_hidden = random.uniform(-1, 1)
        self.b_output = random.uniform(-1, 1)
        self.lr = lr
        self.epochs = epochs

    def train(self, inputs, targets):
        for epoch in range(self.epochs):
            hidden_state = 0
            total_loss = 0

            for x, target in zip(inputs, targets):
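                # Forward pass: combine the current input with the previous hidden state
                hidden_input = self.w_input_hidden * x + self.w_hidden_hidden * hidden_state + self.b_hidden
                hidden_state_new = sigmoid(hidden_input)

                output_input = self.w_hidden_output * hidden_state_new + self.b_output
                prediction = sigmoid(output_input)

                # Squared error for this time step
                error = prediction - target
                loss = error ** 2
                total_loss += loss

                # Backward pass, one step only: the previous hidden_state is treated
                # as a constant. The factor 2 from d(error**2) is folded into lr.
                d_output = error * sigmoid_derivative(prediction)
                d_hidden = d_output * self.w_hidden_output * sigmoid_derivative(hidden_state_new)

                # Gradient-descent updates (d_hidden was computed before w_hidden_output changes)
                self.w_hidden_output -= self.lr * d_output * hidden_state_new
                self.b_output -= self.lr * d_output

                self.w_input_hidden -= self.lr * d_hidden * x
                self.w_hidden_hidden -= self.lr * d_hidden * hidden_state
                self.b_hidden -= self.lr * d_hidden

                # Carry the new hidden state to the next time step
                hidden_state = hidden_state_new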

            if epoch % 10 == 0:
                print(f"Epoch {epoch}: Loss = {total_loss:.4f}")

    def predict(self, inputs):
        hidden_state = 0
        outputs = []
        for x in inputs:
            hidden_input = self.w_input_hidden * x + self.w_hidden_hidden * hidden_state + self.b_hidden
            hidden_state = sigmoid(hidden_input)

            output_input = self.w_hidden_output * hidden_state + self.b_output
            prediction = sigmoid(output_input)
            outputs.append(round(prediction))  # threshold the probability at 0.5 to get a 0/1 label
        return outputs

In [5]:
rnn = SimpleRNN(epochs=200)
rnn.train(inputs, targets)

print("\nPredictions:")
predictions = rnn.predict(inputs)
for i in range(20):
    print(f"Input: {inputs[i]} → Predicted: {predictions[i]}, Actual: {targets[i]}")

Epoch 0: Loss = 9.7750
Epoch 10: Loss = 9.0703
Epoch 20: Loss = 7.5908
Epoch 30: Loss = 5.2404
Epoch 40: Loss = 3.2729
Epoch 50: Loss = 2.1369
Epoch 60: Loss = 1.5132
Epoch 70: Loss = 1.1457
Epoch 80: Loss = 0.9109
Epoch 90: Loss = 0.7503
Epoch 100: Loss = 0.6346
Epoch 110: Loss = 0.5477
Epoch 120: Loss = 0.4803
Epoch 130: Loss = 0.4266
Epoch 140: Loss = 0.3830
Epoch 150: Loss = 0.3470
Epoch 160: Loss = 0.3167
Epoch 170: Loss = 0.2909
Epoch 180: Loss = 0.2687
Epoch 190: Loss = 0.2495

Predictions:
Input: 0 → Predicted: 1, Actual: 1
Input: 1 → Predicted: 0, Actual: 0
Input: 0 → Predicted: 1, Actual: 1
Input: 1 → Predicted: 0, Actual: 0
Input: 0 → Predicted: 1, Actual: 1
Input: 1 → Predicted: 0, Actual: 0
Input: 0 → Predicted: 1, Actual: 1
Input: 1 → Predicted: 0, Actual: 0
Input: 0 → Predicted: 1, Actual: 1
Input: 1 → Predicted: 0, Actual: 0
Input: 0 → Predicted: 1, Actual: 1
Input: 1 → Predicted: 0, Actual: 0
Input: 0 → Predicted: 1, Actual: 1
Input: 1 → Predicted: 0, Actual: 0
Input: 

## What We’ve Learned About RNNs

Recurrent Neural Networks are designed for sequence-based problems, meaning anything where the order matters. Unlike feedforward networks, RNNs maintain a small memory of previous steps (called the **hidden state**) and update it as they move through the sequence.
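
To make the hidden state concrete, here is a minimal sketch of the recurrence on its own. The function `run_hidden` and its fixed weights are made up for illustration (they are not the trained values from above). It shows that the same final input can produce different hidden states depending on what came before it.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def run_hidden(xs, w_xh=1.5, w_hh=-2.0, b_h=0.0):
    """Return the hidden state after each step (hypothetical fixed weights)."""
    h = 0.0
    states = []
    for x in xs:
        # The new state mixes the current input with the previous state
        h = sigmoid(w_xh * x + w_hh * h + b_h)
        states.append(h)
    return states

states_a = run_hidden([0, 0])  # history: 0, then 0
states_b = run_hidden([1, 0])  # history: 1, then 0

# Both sequences end with the input 0, yet the final hidden states differ
print(states_a[-1], states_b[-1])
```

The final input is `0` in both runs, so any difference between the two final states comes entirely from the earlier input carried through `h`.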

Here’s what we saw:
- The RNN processed one input at a time, remembering what came before using the hidden state.
- Each prediction depended not just on the current input but on what the RNN had “seen” earlier.
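- Backpropagation in RNNs works through time (BPTT). Because the hidden state at time `t` depends on earlier steps, the error at time `t` also sends gradient back through those earlier hidden states. To keep things simple we only did 1-step BPTT, which treats the previous hidden state as a constant.
- Our model learned the alternating pattern. Since each target is simply the opposite of the current input, this works as a sanity check of the training loop, not as a real test of memory. A pattern like `[0, 0, 1, 1, ...]`, where the same input can be followed by either value, would actually need the hidden state.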
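
For reference, the model and its 1-step update can be written out as equations. The symbols are our own shorthand for the code's names: $w_{xh}$ is `w_input_hidden`, $w_{hh}$ is `w_hidden_hidden`, $w_{hy}$ is `w_hidden_output`, $b_h$ and $b_y$ are the two biases, $\eta$ is `lr`, and $h_0 = 0$.

```latex
\begin{aligned}
a_t &= w_{xh}\,x_t + w_{hh}\,h_{t-1} + b_h, &\quad h_t &= \sigma(a_t),\\
o_t &= w_{hy}\,h_t + b_y, &\quad \hat{y}_t &= \sigma(o_t), \qquad L_t = (\hat{y}_t - y_t)^2 .
\end{aligned}

% Quantities computed in the training loop (d_output and d_hidden)
\delta^{o}_t = (\hat{y}_t - y_t)\,\hat{y}_t\,(1 - \hat{y}_t), \qquad
\delta^{h}_t = \delta^{o}_t\, w_{hy}\, h_t\,(1 - h_t)

% 1-step update used in the code
w_{hh} \leftarrow w_{hh} - \eta\,\delta^{h}_t\,h_{t-1}

% Full BPTT gradient, for comparison
\frac{\partial L_t}{\partial w_{hh}} = \sum_{k=1}^{t} \frac{\partial L_t}{\partial a_k}\,h_{k-1},
\qquad
\frac{\partial L_t}{\partial a_t} = 2\,\delta^{h}_t,
\qquad
\frac{\partial L_t}{\partial a_k} = \frac{\partial L_t}{\partial a_{k+1}}\; w_{hh}\, h_k\,(1 - h_k) \quad (k < t)
```

The code keeps only the $k = t$ term of the sum and folds the constant factor 2 into the learning rate. Full BPTT would also add the $k < t$ terms, which carry the error back through earlier hidden states.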

This scratch implementation gave us a clear view of what’s happening inside — no magic libraries, just real logic.