## Handling Text Data

In [1]:
x = 'The movie was fantastic'

- This is one datapoint. Abstractly, its `sequence length` is 4, because the sentence has 4 words
- Each word is projected into a `d`-dimensional space, which is called the embedding space of the words

`NOTE`: We discuss this at a high level in terms of the words in a sentence, but in practice each sentence is broken into `tokens`, which can be thought of as semantic units somewhere between a character and a word. Some words may also be kept whole as a single token.
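To make the projection into the embedding space concrete, here is a minimal sketch. It assumes a naive whitespace split in place of a real tokenizer, a hypothetical vocabulary built from this single sentence, and an arbitrary embedding dimension `d = 5`; the embedding weights are randomly initialised, not learned.

```python
import torch
import torch.nn as nn

sentence = "The movie was fantastic"

# Naive whitespace split, used here only as a stand-in for a real tokenizer
words = sentence.lower().split()

# Hypothetical vocabulary built from this one sentence: word -> integer id
vocab = {word: idx for idx, word in enumerate(sorted(set(words)))}
token_ids = torch.tensor([vocab[w] for w in words])  # shape: (4,)

d = 5  # embedding dimension, chosen arbitrarily for this sketch
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=d)

# Each of the 4 token ids is looked up as a d-dimensional vector
embedded = embedding(token_ids)
print(embedded.shape)  # torch.Size([4, 5])
```

The result has one row per word, so the first dimension is the sequence length and the second is the embedding dimension.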

#### So, text is sequence data => sequential information can be processed well by a `Recurrent Neural Network`

![Recurrent Neural Network](./RNN.jpg)

- The intermediate outputs of each layer at each time step are called the `hidden states` of that layer. The dimension of these vectors is called the `hidden state dimension`, and it may or may not differ from the embedding dimension
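The way hidden states are carried from one time step to the next can be sketched by hand for a single layer. This is an illustrative loop, not how `nn.RNN` is implemented internally; the weight names `W_xh`, `W_hh`, `b_h` and all sizes are assumptions made for the example, and the update used is the standard $h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$.

```python
import torch

torch.manual_seed(0)

seq_len, d, hidden_dim = 4, 5, 3   # 4 tokens, embedding dim 5, hidden dim 3
x = torch.randn(seq_len, d)        # one embedded sentence

# Randomly initialised parameters of a single recurrent layer (illustrative)
W_xh = torch.randn(hidden_dim, d)           # input-to-hidden weights
W_hh = torch.randn(hidden_dim, hidden_dim)  # hidden-to-hidden weights
b_h = torch.zeros(hidden_dim)               # bias

h = torch.zeros(hidden_dim)  # initial hidden state
hidden_states = []
for t in range(seq_len):
    # The new hidden state depends on the current token and the previous state
    h = torch.tanh(W_xh @ x[t] + W_hh @ h + b_h)
    hidden_states.append(h)

hidden_states = torch.stack(hidden_states)
print(hidden_states.shape)  # torch.Size([4, 3])
```

There is one hidden state per time step, each of size `hidden_dim`, which here differs from the embedding dimension.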

## RNN in PyTorch

#### `nn.RNN(i, h, l)`

- `i` - input dimension
- `h` - hidden dimension
- `l` - number of layers

The output of calling an `nn.RNN` module is a tuple

`out, hidden = rnn(x)`, where `rnn = nn.RNN(...)`

- `out` - output from the final layer at every time step ($output_0, output_1, \ldots, output_n$ of the $M^{th}$ layer from the diagram)
- `hidden` - hidden state at the final time step of every layer ($hidden_N$ from the $0^{th}$ layer to the $M^{th}$ layer from the diagram)

A fully connected neural network (FNN) is stacked on top of `out` for further processing
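The shapes of the two return values, and one common way of stacking a fully connected layer on `out`, can be sketched as follows. All sizes and the names `rnn_demo`, `fc`, and `x_demo` are assumptions made for this example. It also passes `batch_first=True` so that the input is laid out as `(batch, sequence length, input dimension)`, and it feeds only the last time step of `out` to the linear layer, which is one possible choice among several.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, seq_len, input_dim = 2, 7, 5
hidden_dim, num_layers, num_classes = 3, 2, 2

# batch_first=True: inputs and `out` are (batch, seq_len, features)
rnn_demo = nn.RNN(input_dim, hidden_dim, num_layers, batch_first=True)
fc = nn.Linear(hidden_dim, num_classes)  # fully connected head (illustrative)

x_demo = torch.randn(batch_size, seq_len, input_dim)
out, hidden = rnn_demo(x_demo)

print(out.shape)     # torch.Size([2, 7, 3]) -> final layer, every time step
print(hidden.shape)  # torch.Size([2, 2, 3]) -> every layer, final time step

# The last time step of `out` matches the last layer's entry in `hidden`
print(torch.allclose(out[:, -1, :], hidden[-1]))  # True

# Stack the fully connected layer on the last time step of `out`
logits = fc(out[:, -1, :])
print(logits.shape)  # torch.Size([2, 2])
```

Note that `hidden` keeps the layout `(number of layers, batch, hidden dimension)` regardless of `batch_first`.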

To install all the necessary libraries

`pip install torch torchtext portalocker`

In [2]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as f

#### Let us create a random dataset

`X` has 3 example sentences of 10 tokens each. Every token has an embedding dimension of 5 and holds integer values from 1 to 9 (the upper bound of `torch.randint` is exclusive)

In [3]:
X = torch.randint(1, 10, (3, 10, 5), dtype=torch.float32)
y = torch.tensor([1, 0, 1], dtype=torch.long)

`Input Dimension` is the same as `Embedding Dimension`

In [4]:
rnn = nn.RNN(5, 3, 1)

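The number of time steps is decided internally, based on the number of tokens you give as input for a single data point. Note that `nn.RNN` defaults to `batch_first=False`, so it reads its input as `(sequence length, batch size, input dimension)`; pass `batch_first=True` when the input is laid out as `(batch size, sequence length, input dimension)`, as `X` is above.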

- 5 -> Input Dimension
- 3 -> Hidden Dimension
- 1 -> No. of Layers
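As a quick check of how these three arguments determine the size of the layer, the parameter shapes can be inspected directly. This is a small sketch on a separate module, named `rnn_check` here only to avoid clashing with `rnn` above.

```python
import torch.nn as nn

# Same arguments as above: input dim 5, hidden dim 3, 1 layer
rnn_check = nn.RNN(5, 3, 1)

# Collect the shape of every learnable parameter by name
shapes = {name: tuple(p.shape) for name, p in rnn_check.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)

# Total number of learnable values in the layer
total = sum(p.numel() for p in rnn_check.parameters())
print(total)  # 30
```

The input-to-hidden weight `weight_ih_l0` has shape `(3, 5)` and the hidden-to-hidden weight `weight_hh_l0` has shape `(3, 3)`, with one bias vector of length 3 for each. None of these shapes depends on the sequence length, which is why the same layer can process sentences with any number of tokens.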