<a href="https://colab.research.google.com/github/hissain/mlworks/blob/main/codes/RNN_cells_in_Pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Import
%matplotlib inline
import math
import torch
from torch import nn
from torch.nn import functional as F

In [None]:
torch.manual_seed(0)

<torch._C.Generator at 0x7c627c3299d0>

**Recurrent Neural Networks (RNNs)**:
   - PyTorch provides the `torch.nn.RNN` module for implementing basic RNNs. You can specify the input size, hidden size, number of layers, and whether to use bidirectional connections.
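   - By default, `torch.nn.RNN` expects input of shape `(seq_len, batch, input_size)`, where `seq_len` is the length of the input sequence, `batch` is the batch size, and `input_size` is the dimensionality of the input feature vector at each time step. Passing `batch_first=True`, as the models in this notebook do, changes the expected shape to `(batch, seq_len, input_size)`.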
   - RNNs process input sequences sequentially, maintaining an internal hidden state that captures information from previous time steps. You can use the output at each time step or only the final output, depending on the task.
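
To make the shape conventions above concrete, here is a minimal sketch that runs the same kind of layer in both layouts and inspects what comes back. The sizes are arbitrary illustration values, and the variable names (`rnn_bf`, `last_matches`, and so on) are placeholders rather than anything the notebook defines.

```python
import torch
from torch import nn

torch.manual_seed(0)

seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2

# Default layout (batch_first=False): input is (seq_len, batch, input_size).
rnn = nn.RNN(input_size, hidden_size, num_layers)
x = torch.randn(seq_len, batch, input_size)
out, h_n = rnn(x)  # the initial hidden state defaults to zeros when omitted
print(out.shape)   # (seq_len, batch, hidden_size): top-layer output at every step
print(h_n.shape)   # (num_layers, batch, hidden_size): final state of every layer

# With batch_first=True, input and output put the batch dimension first.
# h_n keeps its (num_layers, batch, hidden_size) layout either way.
rnn_bf = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
x_bf = torch.randn(batch, seq_len, input_size)
out_bf, h_n_bf = rnn_bf(x_bf)
print(out_bf.shape)  # (batch, seq_len, hidden_size)

# For a unidirectional RNN, the last time step of `out` is the final hidden
# state of the top layer, so either can feed a downstream classifier.
last_matches = torch.allclose(out[-1], h_n[-1])
print(last_matches)
```

Using `out` suits per-step tasks such as tagging, while `h_n[-1]` (or the last step of `out`) suits whole-sequence tasks such as classification or regression.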

In [None]:
# Define a simple RNN model
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(SimpleRNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Allocate the initial hidden state directly on the input's device
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.rnn(x, h0)
        out = self.fc(out[:, -1, :])  # Use the output of the last time step
        return out

# Example usage
input_size = 10
hidden_size = 20
num_layers = 2
output_size = 1
seq_length = 5
batch_size = 3

model = SimpleRNN(input_size, hidden_size, num_layers, output_size)
input_tensor = torch.randn(batch_size, seq_length, input_size)
output = model(input_tensor)
print(output.shape)  # Output shape: (batch_size, output_size)


torch.Size([3, 1])
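
The model above is unidirectional. As a hedged sketch of what changes with `bidirectional=True`: the output feature dimension doubles, `h_n` gains one entry per direction, and the last time step of the output no longer summarizes the whole sequence for the backward direction. The names `birnn`, `summary`, and `head` are illustrative only.

```python
import torch
from torch import nn

torch.manual_seed(0)

batch, seq_len, input_size, hidden_size, num_layers = 3, 5, 10, 20, 2

birnn = nn.RNN(input_size, hidden_size, num_layers,
               batch_first=True, bidirectional=True)
x = torch.randn(batch, seq_len, input_size)
out, h_n = birnn(x)

print(out.shape)  # (batch, seq_len, 2 * hidden_size): forward and backward concatenated
print(h_n.shape)  # (2 * num_layers, batch, hidden_size)

# The forward direction finishes at the last time step, the backward
# direction finishes at the first one. h_n[-2] and h_n[-1] hold the final
# forward and backward states of the top layer, respectively.
fwd_final = out[:, -1, :hidden_size]
bwd_final = out[:, 0, hidden_size:]

# One way to build a whole-sequence summary: concatenate both final states.
summary = torch.cat([h_n[-2], h_n[-1]], dim=1)

# The linear head must then accept 2 * hidden_size features.
head = nn.Linear(2 * hidden_size, 1)
pred = head(summary)
print(pred.shape)  # (batch, 1)
```

Indexing `out[:, -1, :]` in a bidirectional model would pair the full forward summary with a backward state that has seen only the final input, which is why the sketch concatenates the two final states instead.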


**Long Short-Term Memory networks (LSTMs)**:
   - PyTorch offers the `torch.nn.LSTM` module for implementing LSTM networks. LSTMs are a type of RNN designed to address the vanishing gradient problem and better capture long-range dependencies in sequences.
   - Similar to `torch.nn.RNN`, you can specify parameters such as input size, hidden size, number of layers, and bidirectionality.
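   - The input shape expected by `torch.nn.LSTM` is the same as that of `torch.nn.RNN`, including the effect of `batch_first`. The difference is in the state: the initial and final states are tuples `(h, c)` holding the hidden state and the cell state, each of shape `(num_layers, batch, hidden_size)` for a unidirectional LSTM.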
   - LSTMs have a more complex architecture than basic RNNs, incorporating input, forget, and output gates, as well as a memory cell that can retain information over long sequences.
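
The gate structure described above can be sketched by recomputing a single `nn.LSTMCell` step by hand and comparing it with the module's own result. This is an illustration of the standard LSTM update, relying on PyTorch's documented convention of stacking the gate weights in the order input, forget, cell candidate, output. Names such as `h_manual` and `c_manual` are placeholders.

```python
import torch
from torch import nn

torch.manual_seed(0)

input_size, hidden_size, batch = 10, 20, 3
cell = nn.LSTMCell(input_size, hidden_size)

x = torch.randn(batch, input_size)   # input at one time step
h = torch.randn(batch, hidden_size)  # previous hidden state
c = torch.randn(batch, hidden_size)  # previous cell (memory) state

with torch.no_grad():
    # All four gate pre-activations are computed in one matrix product,
    # giving a tensor of shape (batch, 4 * hidden_size).
    gates = (x @ cell.weight_ih.T + cell.bias_ih
             + h @ cell.weight_hh.T + cell.bias_hh)
    i, f, g, o = gates.chunk(4, dim=1)

    i = torch.sigmoid(i)  # input gate: how much new content to write
    f = torch.sigmoid(f)  # forget gate: how much old memory to keep
    g = torch.tanh(g)     # candidate content for the memory cell
    o = torch.sigmoid(o)  # output gate: how much memory to expose

    c_manual = f * c + i * g             # updated memory cell
    h_manual = o * torch.tanh(c_manual)  # updated hidden state

    h_ref, c_ref = cell(x, (h, c))       # the module's own result

h_ok = torch.allclose(h_manual, h_ref, atol=1e-5)
c_ok = torch.allclose(c_manual, c_ref, atol=1e-5)
print(h_ok, c_ok)
```

The additive update `f * c + i * g` is what lets information and gradients pass through many time steps with less attenuation than in a plain RNN.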

In [None]:
# Define the LSTM model
class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(LSTMModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Allocate the initial hidden and cell states directly on the input's device
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.lstm(x, (h0, c0))
        out = self.fc(out[:, -1, :])  # Use the output of the last time step
        return out

# Example usage
model = LSTMModel(input_size, hidden_size, num_layers, output_size)
output = model(input_tensor)
print(output.shape)  # Output shape: (batch_size, output_size)


torch.Size([3, 1])


**Gated Recurrent Units (GRUs)**:
   - PyTorch provides the `torch.nn.GRU` module for implementing GRU networks. GRUs are similar to LSTMs but have a simplified architecture: a single update gate takes over the roles of the LSTM's forget and input gates, a reset gate controls how much of the previous state feeds the candidate state, and there is no separate memory cell, only the hidden state.
   - As with `torch.nn.RNN` and `torch.nn.LSTM`, you can specify input size, hidden size, number of layers, and bidirectionality when creating a GRU layer.
   - The input shape expected by `torch.nn.GRU` is the same as that of `torch.nn.RNN` and `torch.nn.LSTM`.
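
One way to see the relative complexity of the three layer types is to count their parameters for identical sizes. The sketch below uses a hypothetical helper, `count_params`, and single-layer modules. An LSTM stacks four weight blocks (three gates plus the cell candidate) and a GRU stacks three (two gates plus the candidate), so they have four and three times the parameters of a plain RNN.

```python
import torch
from torch import nn

input_size, hidden_size = 10, 20


def count_params(module: nn.Module) -> int:
    """Total number of parameters in a module (illustrative helper)."""
    return sum(p.numel() for p in module.parameters())


layers = {
    "RNN": nn.RNN(input_size, hidden_size),
    "GRU": nn.GRU(input_size, hidden_size),
    "LSTM": nn.LSTM(input_size, hidden_size),
}
counts = {name: count_params(layer) for name, layer in layers.items()}

# Per block: hidden*input + hidden*hidden weights, plus two bias vectors.
per_block = hidden_size * (input_size + hidden_size) + 2 * hidden_size

for name, n in counts.items():
    print(f"{name}: {n} parameters ({n // per_block} x {per_block})")

# Like nn.RNN, nn.GRU returns (output, h_n) with no cell state.
x = torch.randn(5, 3, input_size)  # default layout: (seq_len, batch, input_size)
out, h_n = layers["GRU"](x)
print(out.shape, h_n.shape)
```

The GRU therefore sits between the plain RNN and the LSTM in cost, which is one reason it is often tried when an LSTM is more than the task needs.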

In [None]:
# Define the GRU model
class GRUModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(GRUModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Allocate the initial hidden state directly on the input's device
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.gru(x, h0)
        out = self.fc(out[:, -1, :])  # Use the output of the last time step
        return out

# Example usage
model = GRUModel(input_size, hidden_size, num_layers, output_size)
output = model(input_tensor)
print(output.shape)  # Output shape: (batch_size, output_size)


torch.Size([3, 1])
