# PyTorch RNNs

Recurrent neural networks (RNNs) are a class of artificial neural networks that maintain a hidden state and feed the output of earlier time steps back in as input. They contain connections between nodes that form cycles, allowing information to persist. This structure lets RNNs exhibit temporal dynamic behaviour, making them well suited for tasks that involve sequential data, such as **time series forecasting**, **natural language processing** and **speech recognition**.

<img src="https://easyai.tech/wp-content/uploads/2022/08/f0116-2019-07-02-rnn-1.gif">

RNNs have an **internal state** (memory) that is updated at each time step from the current input and the previous state, enabling them to capture and remember dependencies over time.

>However, vanilla RNNs often suffer from **vanishing** and **exploding gradients**, which makes it difficult for them to learn long-term dependencies.
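
The effect can be illustrated with a scalar toy model. The sketch below assumes a one-unit RNN, h<sub>t</sub> = tanh(w ⋅ h<sub>t-1</sub> + x<sub>t</sub>), and multiplies the per-step derivatives together to obtain ∂h<sub>T</sub>/∂h<sub>0</sub>. The function name `gradient_through_time` and the chosen weights are illustrative assumptions, not part of this notebook.

```python
import math

def gradient_through_time(w, inputs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return grad

# With zero inputs and h0 = 0 the state stays at 0, so the gradient is exactly w ** T.
inputs = [0.0] * 50
small = gradient_through_time(0.5, inputs)   # |w| < 1: shrinks towards zero
large = gradient_through_time(1.5, inputs)   # |w| > 1: grows very large

print(f"w=0.5 -> {small:.3e}")
print(f"w=1.5 -> {large:.3e}")
```

In a real RNN the scalar `w` becomes the recurrent weight matrix, and saturation of tanh shrinks the product further, but the same repeated multiplication is what drives both failure modes.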

## How Do RNNs Differ from Feedforward Neural Networks?

Unlike traditional feedforward neural networks, RNNs have recurrent connections, allowing information to persist. Each neuron in the network is not only connected to the next layer but also to itself from the previous time step. This cyclic connection enables the network to capture temporal dependencies in the data.

<img src="https://cdn.analyticsvidhya.com/wp-content/uploads/2020/02/assets_-LvBP1svpACTB1R1x_U4_-LwEQnQw8wHRB6_2zYtG_-LwEZT8zd07mLDuaQZwy_image-1.png">

| **Aspect**                | **Feedforward Neural Networks (FNNs)**                                   | **Recurrent Neural Networks (RNNs)**                                                   |
|---------------------------|--------------------------------------------------------------------------|----------------------------------------------------------------------------------------|
| **Direction of Information Flow** | Information flows in one direction, from input nodes to output nodes, in a single pass | Information also loops back through recurrent connections, so the hidden state from one time step feeds into the next |
| **Feedback Connections**  | No feedback connections                                                  | Feedback connections allow the network to maintain an internal state, enabling it to learn from previous outputs |
| **State-Based Representation** | No internal state; only input-output relationships are modeled       | Internal state (or memory) is used to store information, allowing the network to capture dependencies across time steps |
| **Sequential Processing** | Processes data in parallel, without considering sequential relationships | Designed to process sequential data, where the output of the previous time step is used as input for the next time step |
| **Training**              | Trains the network to predict the output for a single input              | Trains the network to predict the output for a sequence of inputs, considering the internal state and feedback connections |
| **Applications**          | Suitable for applications like image recognition, where each input is a fixed-size example processed independently of the others | Well-suited for applications that involve sequential data, such as language translation, speech recognition, and time series forecasting |
| **Complexity**            | Generally simpler to implement and train | More complex to implement and train, because the same weights are reused at every time step and the internal state must be managed |
| **Error Propagation**     | Error is backpropagated through the layers once per input, with no internal state or time dimension involved | Error is propagated backwards through time (backpropagation through time), across every unfolded time step |
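
The difference in state can be seen in a small experiment. The sketch below is an illustration under assumed layer sizes (3 input features, 4 output or hidden units, 5 time steps): it perturbs only the first time step of a sequence and measures how much each layer's output changes at every step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

linear = nn.Linear(3, 4)                # feedforward layer: no state between steps
rnn = nn.RNN(3, 4, batch_first=True)    # recurrent layer: hidden state carried across steps

x = torch.randn(1, 5, 3)                # (batch, time, features)
x_changed = x.clone()
x_changed[0, 0] += 1.0                  # perturb only the first time step

with torch.no_grad():
    # Per-time-step absolute change in the output of each layer
    ff_diff = (linear(x) - linear(x_changed)).abs().sum(dim=-1)[0]
    rnn_diff = (rnn(x)[0] - rnn(x_changed)[0]).abs().sum(dim=-1)[0]

print(ff_diff)   # only step 0 changes: each step is processed independently
print(rnn_diff)  # later steps change too: the perturbation travels through the hidden state
```

The feedforward layer treats each time step as an unrelated input, while the recurrent layer passes the effect of the first step forward through its hidden state.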




### Unfolding in Recurrent Neural Networks (RNNs)

**Unfolding** is a way to visualize and understand how Recurrent Neural Networks (RNNs) process sequential data over time. When we unfold an RNN, we break down its operations into individual time steps, revealing how information flows through the network at each step.

<img src="https://miro.medium.com/v2/resize:fit:1400/1*SKGAqkVVzT6co-sZ29ze-g.png">

#### Key Concepts:

1. **Sequential Processing**:
   - RNNs process input data one time step at a time. At each time step t, the RNN takes the current input x<sub>t</sub> and the hidden state from the previous time step h<sub>t-1</sub>.

2. **Hidden State**:
   - The hidden state h<sub>t</sub> serves as the memory of the network, capturing information from previous inputs. It is updated at each time step based on the current input and the previous hidden state.
   - h<sub>t</sub> = f(W<sub>h</sub> ⋅ h<sub>t-1</sub> + W<sub>x</sub> ⋅ x<sub>t</sub> + b)
   - Here, W<sub>h</sub> and W<sub>x</sub> are weight matrices, b is a bias vector, and f is an activation function (e.g., tanh or ReLU).

3. **Unfolding Process**:
   - When we unfold an RNN, we create a sequence of repeated copies of the network, each corresponding to a different time step. This allows us to see how the network evolves over time.
   - For a sequence of inputs x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>T</sub>, the RNN is unfolded into T time steps, with each step having its own hidden state h<sub>t</sub> and output y<sub>t</sub>.
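
The unfolding described above can be sketched by hand and checked against `nn.RNN`. This is a minimal illustration, assuming arbitrary sizes (3 input features, 4 hidden units, 5 time steps) and a zero initial hidden state. PyTorch stores two bias vectors (`bias_ih_l0` and `bias_hh_l0`); their sum plays the role of b in the formula.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

input_size, hidden_size, seq_len = 3, 4, 5
rnn = nn.RNN(input_size, hidden_size, batch_first=True)  # tanh activation by default

x = torch.randn(1, seq_len, input_size)  # (batch, time, features)
h = torch.zeros(1, hidden_size)          # h_0: initial hidden state

# Unfold by hand: the same weights are reused at every time step.
W_x, W_h = rnn.weight_ih_l0, rnn.weight_hh_l0
b = rnn.bias_ih_l0 + rnn.bias_hh_l0

with torch.no_grad():
    manual_states = []
    for t in range(seq_len):
        # h_t = tanh(W_x x_t + W_h h_{t-1} + b)
        h = torch.tanh(x[:, t] @ W_x.T + h @ W_h.T + b)
        manual_states.append(h)
    manual_out = torch.stack(manual_states, dim=1)  # (batch, time, hidden)

    # Reference: let nn.RNN process the whole sequence in one call.
    ref_out, ref_h_n = rnn(x)

print(torch.allclose(manual_out, ref_out, atol=1e-5))
```

For a single-layer vanilla RNN, the per-step output y<sub>t</sub> returned by `nn.RNN` is the hidden state h<sub>t</sub> itself, which is why the two tensors can be compared directly.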

### [Detailed Explanation on RNNs](https://www.deeplearningbook.org/contents/rnn.html)




## 0. Import Libraries

In [5]:
# Importing libraries
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

## 1. Download and Prepare Datasets

In [4]:
# Download dataset
!gdown https://drive.google.com/uc?id=1eL4Hxm6NMcR6dCwuGDSPmVi-4BkUGKDz

Downloading...
From: https://drive.google.com/uc?id=1eL4Hxm6NMcR6dCwuGDSPmVi-4BkUGKDz
To: /content/language_data.zip
100% 2.88M/2.88M [00:00<00:00, 180MB/s]


In [8]:
import zipfile

# Extract zip data
with zipfile.ZipFile('language_data.zip', 'r') as zip_ref:
    zip_ref.extractall('/content')

In [None]:

import os
import io
import unicodedata, string, glob
import random

# alphabet small + capital letters + ".,;'"
ALL_LETTERS = string.ascii_letters + ".,;'"
N_LETTERS = len(ALL_LETTERS)

# Turn a unicode string to plain ASCII
def unicode_to_ascii(s):
  return ''.join(
      c for c in unicodedata.normalize('NFD', s)
      if unicodedata.category(c) != 'Mn'
      and c in ALL_LETTERS
  )

def load_data():
  # Build category_lines, a dictionary mapping each language to its list of names
  category_lines = {}
  all_categories = []

  def find_files(path):
    return glob.glob(path)

  # Read a file and split it into lines, converting each line to plain ASCII
  def read_lines(filename):
    with io.open(filename, encoding='utf-8') as f:
      lines = f.read().strip().split('\n')
    return [unicode_to_ascii(line) for line in lines]

  # One file per language: the file name without its extension is the category
  for filename in find_files('data/names/*.txt'):
    category = os.path.splitext(os.path.basename(filename))[0]
    all_categories.append(category)
    lines = read_lines(filename)
    category_lines[category] = lines

  return category_lines, all_categories
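
A quick check of the `unicode_to_ascii` helper defined above can make its behaviour concrete. The snippet restates the helper so it runs on its own; the sample names are illustrative and are not taken from the dataset.

```python
import string
import unicodedata

ALL_LETTERS = string.ascii_letters + ".,;'"

def unicode_to_ascii(s):
    # Decompose accented characters (NFD), then drop combining marks ('Mn')
    # and any character that is not in ALL_LETTERS.
    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn' and c in ALL_LETTERS
    )

print(unicode_to_ascii('Ślusàrski'))  # Slusarski
print(unicode_to_ascii("O'Néàl"))     # O'Neal
print(unicode_to_ascii('Van Dyke'))   # VanDyke (the space is not in ALL_LETTERS)
```

Note that `ALL_LETTERS` as defined here contains no space character, so multi-word names are joined together.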

