## About
DeepNLP Starter 01 - RNN and its application to name classification.

- An RNN (recurrent neural network) takes an input x and produces an output, while passing its hidden state on to the next step, as shown in the following diagram.

![RNN](RNN.png "RNN")

The inputs x1, x2, x3 are fed to consecutive cells from left to right and generate y1, y2, y3, with the hidden state passed from each cell to the next.

This is useful for tasks such as time series handling, language modelling and translation, where the data is a sequence of correlated items.

RNNs can be of the following types:

1. one to many, for image captioning tasks (an image as input and a caption as sequential output)
2. many to one, for sentiment analysis tasks (text as input and a sentiment as output)
3. many to many, for language translation tasks, where sequential data in one language is fed as input and the network generates the other language as output.
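
As a rough illustration of how two of these patterns differ in code, the sketch below runs one `nn.RNN` over a sequence and reads its output in two ways. The sizes and variable names are made up for illustration, and one-to-many is left out because it needs a decoding loop.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only: 1 sequence, 5 time steps, 7 features per step
rnn = nn.RNN(input_size=7, hidden_size=2, batch_first=True)
x = torch.randn(1, 5, 7)

out, hidden = rnn(x)  # out holds the hidden state at every time step

# many to many: keep one output per time step
many_to_many = out            # shape (1, 5, 2)

# many to one: keep only the last time step (e.g. to feed a sentiment classifier)
many_to_one = out[:, -1, :]   # shape (1, 2)

print(many_to_many.shape, many_to_one.shape)
```

For a single-layer, one-directional RNN, the last time step of `out` holds the same values as `hidden`.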

![RNN2](RNN2.jpg "RNN2")

The above diagram shows the working of an RNN: the previous state's output (h_t-1) is concatenated with the current state's input x_t, multiplied by the learned weights and passed through a tanh activation function, generating h_t, which is passed to the next state.
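
The update described above can be sketched by hand and compared with what `nn.RNN` computes. This is a minimal check under illustrative sizes, not a full implementation; multiplying the concatenation of h_t-1 and x_t by one stacked weight matrix is equivalent to the sum of two matrix products used here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=7, hidden_size=2, batch_first=True)

x_t = torch.randn(1, 7)      # current input, shape (batch, input_size)
h_prev = torch.randn(1, 2)   # previous hidden state, shape (batch, hidden_size)

# Manual update: h_t = tanh(W_ih x_t + b_ih + W_hh h_prev + b_hh)
h_manual = torch.tanh(
    x_t @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
    + h_prev @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
)

# Same step through the module: add a seq_len axis to x and a num_layers axis to h
_, h_module = rnn(x_t.unsqueeze(1), h_prev.unsqueeze(0))

print(torch.allclose(h_manual, h_module[0], atol=1e-6))
```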

To use an RNN or its more complex derivatives in PyTorch, we use the following syntax:

```
import torch.nn as nn

# hidden_size is the size of the hidden state, and therefore of the output at each step
RNN_cell = nn.RNN(input_size=7, hidden_size=2, batch_first=True)
GRU_cell = nn.GRU(input_size=7, hidden_size=2, batch_first=True)
LSTM_cell = nn.LSTM(input_size=7, hidden_size=2, batch_first=True)
```
- GRU and LSTM add gates that control how much of the hidden state is kept or updated, which often helps them perform better than a plain RNN.
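
One way to see the extra gates is to count learnable parameters for the three modules with the same sizes. The helper `count_params` below is a hypothetical name introduced only for this sketch.

```python
import torch
import torch.nn as nn

def count_params(module):
    # Total number of learnable scalars in the module
    return sum(p.numel() for p in module.parameters())

rnn = nn.RNN(input_size=7, hidden_size=2, batch_first=True)
gru = nn.GRU(input_size=7, hidden_size=2, batch_first=True)
lstm = nn.LSTM(input_size=7, hidden_size=2, batch_first=True)

# GRU stacks 3 weight sets (reset, update, new); LSTM stacks 4 (input, forget, cell, output)
print(count_params(rnn), count_params(gru), count_params(lstm))  # 22 66 88

x = torch.randn(1, 10, 7)
out_rnn, h_rnn = rnn(x)
out_gru, h_gru = gru(x)
out_lstm, (h_lstm, c_lstm) = lstm(x)   # LSTM also returns a cell state
print(h_lstm.shape, c_lstm.shape)
```

Note that `nn.LSTM` returns its state as a tuple of hidden state and cell state, so unpacking differs from `nn.RNN` and `nn.GRU`.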


- To implement a general RNN, we should keep the following input and hidden-state dimensions handy:

```
# inputs: (batch_size, sequence_length, input_size) when batch_first=True
# hidden: (num_layers, batch_size, hidden_size)
out, hidden = cell(inputs, hidden_prev)
# out: (batch_size, sequence_length, hidden_size)
```
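
To make these dimensions concrete, here is a small sketch with a two-layer `nn.RNN`. The batch size, sequence length and layer count are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

batch_size, seq_len, input_size = 3, 10, 7   # illustrative values
hidden_size, num_layers = 2, 2

cell = nn.RNN(input_size, hidden_size, num_layers=num_layers, batch_first=True)

inputs = torch.randn(batch_size, seq_len, input_size)
hidden_prev = torch.zeros(num_layers, batch_size, hidden_size)

out, hidden = cell(inputs, hidden_prev)
print(out.shape)     # shape (batch_size, seq_len, hidden_size)
print(hidden.shape)  # shape (num_layers, batch_size, hidden_size)
```

`batch_first=True` only changes the layout of `inputs` and `out`; the hidden state keeps `num_layers` as its first axis.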


#### 1. Let's teach an RNN to say a phrase - Hello World

1. One-hot encode the unique letters in the phrase
```
H - [1, 0, 0, 0, 0, 0, 0]
E - [0, 1, 0, 0, 0, 0, 0]
L - [0, 0, 1, 0, 0, 0, 0]
O - [0, 0, 0, 1, 0, 0, 0]
W - [0, 0, 0, 0, 1, 0, 0]
R - [0, 0, 0, 0, 0, 1, 0]
D - [0, 0, 0, 0, 0, 0, 1]
```

2. Feed each letter to an RNN node

> input shape to each RNN_cell = 1,1,7
input to first RNN - [[[1, 0, 0, 0, 0, 0, 0]]] # batch_size, seq_len, input_len

> since we define the output size (i.e. hidden_dim) as 2,
the dimension of h_t for this cell shall be [[[x,x]]] # 1,1,2 - num_layers, batch_size, hidden_dim

```
import torch
import torch.nn as nn

RNN_Cell = nn.RNN(input_size=7, hidden_size=2, batch_first=True)

inputs = torch.Tensor([[[1, 0, 0, 0, 0, 0, 0]]])
print(inputs.shape)

# initialise a hidden state to be passed to the first RNN cell
init_hidden_state = torch.randn(1, 1, 2)  # num_layers, batch, hidden_size

out, hidden = RNN_Cell(inputs, init_hidden_state)
print("output of first RNN cell is {}".format(out))
print("hidden state- {}".format(hidden))

# with a single time step and a single layer, out and hidden hold the same values;
# hidden is what gets passed to the next RNN cell
```

Output:

```
torch.Size([1, 1, 7])
output of first RNN cell is tensor([[[-0.3513, -0.5473]]], grad_fn=<TransposeBackward1>)
hidden state- tensor([[[-0.3513, -0.5473]]], grad_fn=<StackBackward0>)
```
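
To show what passing the hidden state to the next cell means, the sketch below feeds "H" and then "E" one step at a time and compares the result with a single call on the two-letter sequence. The variable names and the zero initial state are choices made for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=7, hidden_size=2, batch_first=True)

h_vec = torch.tensor([[[1., 0., 0., 0., 0., 0., 0.]]])  # "H", shape (1, 1, 7)
e_vec = torch.tensor([[[0., 1., 0., 0., 0., 0., 0.]]])  # "E", shape (1, 1, 7)
hidden0 = torch.zeros(1, 1, 2)

# Step by step: the hidden state from "H" is the starting state for "E"
_, hidden1 = rnn(h_vec, hidden0)
_, hidden2 = rnn(e_vec, hidden1)

# Whole sequence in one call: shape (1, 2, 7)
seq = torch.cat([h_vec, e_vec], dim=1)
out_seq, hidden_seq = rnn(seq, hidden0)

print(torch.allclose(hidden2, hidden_seq, atol=1e-6))
```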


#### 2. Let's create a many to many RNN where each hidden state is passed to the next cell

There is one step for each letter: h,e,l,l,o,w,o,r,l,d

shape = (1,10,7) # batch, seq_len, input_len
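
A minimal sketch of building this input, assuming lowercase letters in the same order as the one-hot table above. The names `letters` and `char_to_idx` are introduced only for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

letters = ["h", "e", "l", "o", "w", "r", "d"]          # same order as the one-hot table
char_to_idx = {ch: i for i, ch in enumerate(letters)}

word = "helloworld"
idx = torch.tensor([char_to_idx[ch] for ch in word])   # shape (10,)

# one_hot returns integers, so cast to float before feeding the RNN
inputs = F.one_hot(idx, num_classes=len(letters)).float().unsqueeze(0)
print(inputs.shape)   # shape (1, 10, 7)

rnn = nn.RNN(input_size=7, hidden_size=2, batch_first=True)
out, hidden = rnn(inputs)   # the initial hidden state defaults to zeros when omitted
print(out.shape)      # shape (1, 10, 2): one hidden state per letter
```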

Increasing the batch size means forwarding several sequences at once, for example a second permutation such as

e,l,l,o,w,o,r,l,d,h

The shape thus becomes

2,10,7

- However, this has an impact on computation cost, since each forward pass now does more work.
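
The batched case can be sketched as follows, stacking the original sequence and one permutation along the batch axis. The `encode` helper is a hypothetical name used only here, and both sequences must have the same length for `torch.stack` to work.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

letters = ["h", "e", "l", "o", "w", "r", "d"]
char_to_idx = {ch: i for i, ch in enumerate(letters)}

def encode(text):
    # Map a string to a (seq_len, 7) one-hot float tensor
    idx = torch.tensor([char_to_idx[ch] for ch in text])
    return F.one_hot(idx, num_classes=len(letters)).float()

# Two sequences of equal length, stacked along the batch axis
batch = torch.stack([encode("helloworld"), encode("elloworldh")])
print(batch.shape)    # shape (2, 10, 7)

rnn = nn.RNN(input_size=7, hidden_size=2, batch_first=True)
out, hidden = rnn(batch)
print(out.shape)      # shape (2, 10, 2)
print(hidden.shape)   # shape (1, 2, 2): num_layers, batch, hidden_size
```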

