<a href="https://colab.research.google.com/github/mohamedyosef101/101_learning_area/blob/area/PyTorch/RNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Recurrent** Neural Networks

Recurrent neural networks (RNNs) are a type of artificial neural network architecture designed to **handle sequential data**, where the order of elements matters.

Unlike traditional neural networks that process individual inputs independently, **RNNs have an internal "memory"** that allows them to remember information from previous inputs and use it to influence their predictions for the current input.
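The "memory" can be made concrete with a minimal sketch of the recurrence, assuming the standard update `h_t = tanh(U x_t + W h_(t-1) + b)`. The function names and the toy weights below are made up for illustration, and the two bias vectors that PyTorch keeps are folded into a single `b` here.

```python
import math

def rnn_step(x_t, h_prev, U, W, b):
    """One recurrence step: h_t = tanh(U x_t + W h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        total = b[i]
        total += sum(U[i][j] * x_t[j] for j in range(len(x_t)))
        total += sum(W[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t

def run_rnn(sequence, U, W, b):
    """Feed the sequence one element at a time, carrying the hidden state."""
    h = [0.0] * len(b)  # the memory starts empty (all zeros)
    for x_t in sequence:
        h = rnn_step(x_t, h, U, W, b)
    return h

# Toy weights (hypothetical): 1 input feature, 2 hidden units
U = [[0.5], [-0.3]]
W = [[0.1, 0.2], [0.0, 0.4]]
b = [0.0, 0.1]

h_ab = run_rnn([[1.0], [2.0]], U, W, b)
h_ba = run_rnn([[2.0], [1.0]], U, W, b)
print('same inputs, different order -> same state?', h_ab == h_ba)
```

Because each step mixes the new input with the previous hidden state, feeding the same values in a different order ends in a different final state, which is what "the order of elements matters" means in practice.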

In [1]:
import torch
from torch import nn
import torch.nn.functional as F
import torch.optim as optim

import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader as dl

import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import random

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Device available now:', device)

Device available now: cpu


# Step 1. Get the data

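`ToTensor` converts each PIL image into a float tensor of shape `[channels, height, width]` with pixel values scaled to `[0, 1]`. *Further steps such as normalization or data augmentation can be added to the same `Compose` pipeline.*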

In [2]:
%%capture
# %%capture must be the first line of the cell; it hides the download output

# set the transform to -> ToTensor
transform = transforms.Compose([transforms.ToTensor()])

# Download data
train = torchvision.datasets.MNIST('data', train=True,
                                   download=True, transform=transform)
test = torchvision.datasets.MNIST('data', train=False,
                                  download=True, transform=transform)

In [3]:
# Data loaders that yield batches of 64 images
train_loader = dl(train, batch_size=64)
test_loader = dl(test, batch_size=64)

# Step 2. Vanilla RNN

A **pro tip** I took from [Andrada](https://www.kaggle.com/andradaolteanu) is to use print, *as many times as you can*, because it helps you understand what is happening.

In [17]:
class VanillaRNN(nn.Module):
  def __init__(self, batch_size, input_size, hidden_size, output_size):
    super(VanillaRNN, self).__init__()

    # Keep the sizes on the module so forward() does not rely on globals
    self.batch_size = batch_size
    self.hidden_size = hidden_size
    self.output_size = output_size

    # RNN layer (expects input of shape [seq_len, batch, input_size])
    self.rnn = nn.RNN(input_size, hidden_size)

    # Fully connected layer
    self.fc = nn.Linear(hidden_size, output_size)

  def forward(self, images, prints=False):
    if prints: print('Original Images Shape:', images.shape)

    # Reorder to [seq_len, batch, input_size]: each image row is one time step
    images = images.permute(1, 0, 2)
    if prints: print('Permuted Images Shape', images.shape)

    # Initialize the hidden state with zeros, sized from the actual batch
    # (the last batch of an epoch can be smaller than batch_size)
    hidden_state = torch.zeros(1, images.size(1), self.hidden_size,
                               device=images.device)
    if prints: print('Initial hidden state shape:', hidden_state.shape, '\n')

    # Run the RNN over the whole sequence
    hidden_outputs, hidden_state = self.rnn(images, hidden_state)

    # Class scores (logits) computed from the final hidden state
    out = self.fc(hidden_state)

    if prints:
      print('----hidden outputs shape:', hidden_outputs.shape, '\n' +
            '----final hidden state:', hidden_state.shape, '\n' +
            '----out shape:', out.shape, '\n')

    # Reshape out to [batch, output_size]
    out = out.view(-1, self.output_size)
    if prints: print('Out Final Shape:', out.shape)

    return out

In [18]:
# Hyperparameters
batch_size = 64
input_size = 28
hidden_size = 150
output_size = 10

In [19]:
# take a sample

images_example, labels_example = next(iter(train_loader))
print('original image shape:', images_example.shape)

# Reshape
images_example = images_example.view(-1, 28, 28)
print('changed images shape:', images_example.shape, '\n' +
      'labels shape:', labels_example.shape, '\n')

original image shape: torch.Size([64, 1, 28, 28])
changed images shape: torch.Size([64, 28, 28]) 
labels shape: torch.Size([64]) 



In [20]:
# Creating a small model
model_example = VanillaRNN(batch_size, input_size,
                           hidden_size, output_size)
out = model_example(images_example, prints=True)

Original Images Shape: torch.Size([64, 28, 28])
Permuted Images Shape torch.Size([28, 64, 28])
Initial hidden state shape: torch.Size([1, 64, 150]) 

----hidden outputs shape: torch.Size([28, 64, 150]) 
----final hidden state: torch.Size([1, 64, 150]) 
----out shape: torch.Size([1, 64, 10]) 

Out Final Shape: torch.Size([64, 10])
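The two tensors returned by `nn.RNN` are related: the first holds the hidden state at every time step, the second only the state after the last step. A small sketch, assuming a single-layer, unidirectional `nn.RNN` with the same sizes as above and random input in place of MNIST, checks that the last slice of the first tensor matches the second:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Single-layer, unidirectional RNN with the sizes used in this notebook
rnn = nn.RNN(input_size=28, hidden_size=150)

# Random stand-in for a batch of images: [seq_len, batch, input_size]
x = torch.randn(28, 64, 28)

# When no initial hidden state is passed, nn.RNN starts from zeros
hidden_outputs, hidden_state = rnn(x)

print('all time steps:', hidden_outputs.shape)
print('last time step:', hidden_state.shape)

# For this configuration the last output slice is the final hidden state
same = torch.allclose(hidden_outputs[-1], hidden_state[0])
print('hidden_outputs[-1] equals hidden_state[0]:', same)
```

This is why the classifier can read from `hidden_state` alone: it already summarizes all 28 rows of the image.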


In [27]:
# Understand Model Parameters
params = list(model_example.parameters())
print(f'There are {len(params)} parameter tensors')
print('Parameters 0 - U (input-to-hidden weights):', params[0].shape, '\n' +
      'Parameters 1 - W (hidden-to-hidden weights):', params[1].shape, '\n' +
      'Parameters 2 - Input-to-hidden bias:', params[2].shape, '\n' +
      'Parameters 3 - Hidden-to-hidden bias:', params[3].shape, '\n' +
      'Parameters 4 - Fully connected weights:', params[4].shape, '\n' +
      'Parameters 5 - Fully connected bias:', params[5].shape, '\n')

There are 6 parameter tensors
Parameters 0 - U (input-to-hidden weights): torch.Size([150, 28]) 
Parameters 1 - W (hidden-to-hidden weights): torch.Size([150, 150]) 
Parameters 2 - Input-to-hidden bias: torch.Size([150]) 
Parameters 3 - Hidden-to-hidden bias: torch.Size([150]) 
Parameters 4 - Fully connected weights: torch.Size([10, 150]) 
Parameters 5 - Fully connected bias: torch.Size([10]) 
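The shapes above follow directly from the three sizes chosen earlier. As a rough sketch, the helper below (a hypothetical function, not part of PyTorch) rebuilds them from `input_size`, `hidden_size`, and `output_size` and counts the individual weights; the keys are the names PyTorch gives these tensors in `named_parameters()` for this model.

```python
import math

def rnn_param_shapes(input_size, hidden_size, output_size):
    """Expected shapes for a one-layer nn.RNN followed by one nn.Linear."""
    return {
        'rnn.weight_ih_l0': (hidden_size, input_size),   # U
        'rnn.weight_hh_l0': (hidden_size, hidden_size),  # W
        'rnn.bias_ih_l0': (hidden_size,),
        'rnn.bias_hh_l0': (hidden_size,),
        'fc.weight': (output_size, hidden_size),
        'fc.bias': (output_size,),
    }

shapes = rnn_param_shapes(input_size=28, hidden_size=150, output_size=10)

# Multiply out each shape to get the number of scalars it holds
counts = {name: math.prod(shape) for name, shape in shapes.items()}
total = sum(counts.values())

for name, n in counts.items():
    print(f'{name:18s} {shapes[name]!s:12s} {n:6d}')
print('total trainable values:', total)  # 28510
```

Most of the weights sit in the hidden-to-hidden matrix `W` (150 x 150), so the model size grows roughly with the square of `hidden_size`.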



# **References:**
* Misra Turp. 2022. [Basics of Recurrent Neural Networks](https://youtu.be/M6-AIQnB4_Q?si=TrBq7jsb6L8OMnFV), [LSTM & GRUs](https://youtu.be/E4c_bom0_6Y?si=qGaQ3FYlpRphiIan). YouTube.
* Andrada. 2020. [PyTorch RNNs and LSTM](https://www.kaggle.com/code/andradaolteanu/pytorch-rnns-and-lstms-explained-acc-0-99). Kaggle.