# Recurrent Neural Network

The following code illustrates an RNN by defining a simple model with:

* One RNN layer and
* One fully connected layer

The RNN layer takes a sequence of input vectors and outputs a sequence of hidden-state vectors, one per time step.
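
To make the shapes concrete, here is a minimal sketch that runs a standalone `nn.RNN` layer on random data. The batch size of 4 and the other sizes are assumptions picked only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration
batch_size, seq_length, input_size, hidden_size = 4, 5, 10, 20

rnn = nn.RNN(input_size, hidden_size, num_layers=1, batch_first=True)
x = torch.randn(batch_size, seq_length, input_size)

# out holds the hidden state at every time step; h_n holds only the final one.
# When no initial hidden state is passed, nn.RNN starts from zeros.
out, h_n = rnn(x)

print(out.shape)  # torch.Size([4, 5, 20]) -> (batch, seq_length, hidden_size)
print(h_n.shape)  # torch.Size([1, 4, 20]) -> (num_layers, batch, hidden_size)

# For a single unidirectional layer, the last time step of out equals h_n
print(torch.allclose(out[:, -1, :], h_n[-1]))  # True
```

With `batch_first=True`, the batch dimension comes first in both the input and `out`, but `h_n` keeps the layer dimension first.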

`nn.Linear`, the fully connected layer, takes the RNN output at the last time step and maps it to a single output vector per sequence. `nn.Linear` is a PyTorch class that applies a linear transformation to the incoming data. It is essentially a simple feed-forward layer with no activation function applied.

The transformation it applies is $y = xA^T + b$, where:

* $x$ is the input
* $A$ is the weight matrix, learned during training
* $b$ is the bias vector, also learned during training
* $y$ is the output

In the context of our code, `nn.Linear(hidden_size, output_size)` creates a fully connected (linear) layer that takes a tensor whose last dimension is `hidden_size` and outputs a tensor whose last dimension is `output_size`. During training, the layer learns a weight matrix $A$ of shape `(output_size, hidden_size)` and a bias vector $b$ of size `output_size`.
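
The formula and the shapes can be checked directly. The sketch below builds a standalone `nn.Linear` layer and compares its output with a hand-written $xA^T + b$. The batch of 32 random vectors is an assumption made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

hidden_size, output_size = 20, 1

fc = nn.Linear(hidden_size, output_size)
print(fc.weight.shape)  # torch.Size([1, 20]) -> (output_size, hidden_size)
print(fc.bias.shape)    # torch.Size([1])     -> (output_size,)

# A hypothetical batch of 32 vectors, standing in for the RNN's last time step
x = torch.randn(32, hidden_size)

y_layer = fc(x)                       # what the layer computes
y_manual = x @ fc.weight.T + fc.bias  # y = x A^T + b written out by hand

print(y_layer.shape)  # torch.Size([32, 1])
print(torch.allclose(y_layer, y_manual, atol=1e-6))
```

The weight is stored as `(output_size, hidden_size)`, which is why the formula uses the transpose $A^T$.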

The model is then applied to a batch of input sequences, and the output is printed.

In [7]:
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleRNN, self).__init__()

        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, num_layers=1, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Initialize the hidden state with zeros: (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size, device=x.device, dtype=x.dtype)

        # Pass the input through the RNN layer
        out, _ = self.rnn(x, h0)

        # Pass the output of the last time step through the fully connected layer
        out = self.fc(out[:, -1, :])

        return out

# Define the dimensions
input_size = 10
hidden_size = 20
output_size = 1
seq_length = 5

# Create the model
model = SimpleRNN(input_size, hidden_size, output_size)

# Create a random tensor to represent a batch of input sequences
x = torch.randn((32, seq_length, input_size))

# Forward propagate the RNN
output = model(x)

print(output.shape)  # Should be [32, 1] because the batch size is 32 and output size is 1
print(output)

torch.Size([32, 1])
tensor([[-0.2066],
        [-0.2378],
        [-0.3100],
        [-0.3621],
        [-0.2376],
        [-0.4694],
        [-0.2193],
        [-0.3081],
        [-0.2183],
        [-0.2774],
        [-0.2603],
        [-0.0779],
        [-0.2969],
        [-0.1407],
        [-0.3621],
        [-0.2641],
        [-0.3416],
        [-0.3610],
        [-0.3359],
        [-0.2387],
        [-0.2761],
        [-0.2700],
        [-0.3523],
        [-0.2503],
        [-0.2999],
        [-0.0874],
        [-0.4693],
        [-0.3114],
        [-0.1293],
        [-0.4395],
        [-0.3414],
        [-0.1399]], grad_fn=<AddmmBackward0>)
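
As a sanity check on what the RNN layer computes, the following sketch unrolls the recurrence by hand and compares it with `nn.RNN`. It assumes the default `tanh` nonlinearity and a zero initial hidden state, so each step is $h_t = \tanh(x_t W_{ih}^T + b_{ih} + h_{t-1} W_{hh}^T + b_{hh})$. The batch size of 3 is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Layer sizes match the model above; the batch size is hypothetical
input_size, hidden_size, seq_length, batch_size = 10, 20, 5, 3

rnn = nn.RNN(input_size, hidden_size, num_layers=1, batch_first=True)
x = torch.randn(batch_size, seq_length, input_size)

with torch.no_grad():
    out, _ = rnn(x)  # reference output from the built-in layer

    # Manual unrolling, starting from a zero hidden state
    h = torch.zeros(batch_size, hidden_size)
    steps = []
    for t in range(seq_length):
        h = torch.tanh(
            x[:, t, :] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
            + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
        )
        steps.append(h)
    manual_out = torch.stack(steps, dim=1)  # (batch, seq_length, hidden_size)

print(manual_out.shape)  # torch.Size([3, 5, 20])
print(torch.allclose(out, manual_out, atol=1e-5))
```

The same weights are reused at every time step. Only the hidden state `h` changes as the sequence is consumed.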


Let's see what the RNN's internals look like.

In [8]:
from torchsummary import summary

# input_size is the shape of one sample without the batch dimension: (seq_length, input_size)
summary(model, input_size=(seq_length, input_size))

----------------------------------------------------------------
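        Layer (type)               Output Shape         Param #
================================================================
               RNN-1  [[-1, 5, 20], [-1, 2, 20]]               0
            Linear-2                    [-1, 1]              21
================================================================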
Total params: 21
Trainable params: 21
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.03
Params size (MB): 0.00
Estimated Total Size (MB): 0.03
----------------------------------------------------------------
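
Note that the summary reports 0 parameters for the RNN layer, although the layer does hold trainable weights. This appears to be a limitation of `torchsummary`, which seems to count only attributes named `weight` and `bias`, whereas `nn.RNN` stores its parameters under names such as `weight_ih_l0`. The `2` in the second RNN output shape is probably the dummy batch size the tool uses internally, not a property of the model. As a cross-check, the sketch below counts parameters directly with PyTorch. It rebuilds the two layers with the same sizes as above so that it runs on its own, and `count_params` is a helper name introduced here for illustration.

```python
import torch.nn as nn

# Same layer sizes as the SimpleRNN model: input 10, hidden 20, output 1
rnn = nn.RNN(10, 20, num_layers=1, batch_first=True)
fc = nn.Linear(20, 1)

def count_params(module):
    """Count the trainable parameters of a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

for name, p in rnn.named_parameters():
    print(name, tuple(p.shape))
# weight_ih_l0 (20, 10)
# weight_hh_l0 (20, 20)
# bias_ih_l0 (20,)
# bias_hh_l0 (20,)

print(count_params(rnn))                     # 640 = 200 + 400 + 20 + 20
print(count_params(fc))                      # 21  = 20 + 1
print(count_params(rnn) + count_params(fc))  # 661
```

The same count for the full model can be obtained by calling `count_params(model)`.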
