# Recurrent Neural Network

An RNN is a type of artificial neural network in which hidden nodes are connected by directed edges that form a cycle. It is well suited to processing sequential data such as speech and text.

As the figure below shows, the RNN's greatest advantage is that it can be arranged into different structures as needed, because it accepts inputs and produces outputs regardless of sequence length.

![RNN_architecture](../../../image/rnn_arch.png)

## RNN's basic structure
![RNN_basic](../../../image/s8nYcww.png)

The figure above shows the basic structure of an RNN. The green boxes are hidden states, the red boxes are the inputs $x$, and the blue boxes are the outputs $y$. The hidden state at the current time step, $h_t$, is computed from the previous hidden state $h_{t-1}$ and the current input $x_t$.

The output $y_t$ at the current time step is computed from $h_t$. As the formula in the figure shows, the activation function of the hidden state is the hyperbolic tangent (tanh), a nonlinear function.
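
The update described above can be sketched as a single time step in NumPy. This is a minimal illustration, not the notebook's model: the weight names `W_xh`, `W_hh`, `W_hy`, the biases, and the tiny layer sizes are assumptions chosen for the example.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One vanilla RNN time step: returns (h_t, y_t)."""
    # Hidden state: tanh of the current input combined with the previous hidden state
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    # Output: a linear read-out of the new hidden state
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

# Illustrative sizes and random weights (assumed, not taken from the notebook)
rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 3, 4, 2
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))
b_h = np.zeros(hidden_size)
b_y = np.zeros(output_size)

h0 = np.zeros(hidden_size)            # initial hidden state
x1 = rng.normal(size=input_size)      # first input vector
h1, y1 = rnn_step(x1, h0, W_xh, W_hh, W_hy, b_h, b_y)
print(h1.shape, y1.shape)
```

Because tanh squashes its argument, every entry of `h1` lies strictly between -1 and 1, whatever the scale of the input.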

## RNN's forward propagation
![RNN_forward](../../../image/TIdBDTJ.png)
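
As a rough sketch of forward propagation, the loop below applies the same weights at every time step and collects the hidden states. It assumes a plain tanh RNN written in NumPy; the function name `rnn_forward`, the weight names, and the sizes are illustrative, not taken from the figure.

```python
import numpy as np

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    h = h0
    hs = []
    for x_t in xs:                                   # one iteration per time step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)     # same weights at every step
        hs.append(h)
    return np.stack(hs)                              # shape: (seq_len, hidden_size)

# Illustrative sizes and random weights (assumed for this example)
rng = np.random.default_rng(0)
input_size, hidden_size = 28, 8
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

# The same parameters handle sequences of different lengths
shapes = {}
for seq_len in (5, 28):
    xs = rng.normal(size=(seq_len, input_size))
    hs = rnn_forward(xs, np.zeros(hidden_size), W_xh, W_hh, b_h)
    shapes[seq_len] = hs.shape
    print(seq_len, hs.shape)
```

For a many-to-one task such as classification, only the last row `hs[-1]` would be passed on to an output layer.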

## RNN's backward propagation
![RNN backpropagation](../../../image/XYDxsNs.png)
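
Backpropagation through time can be sketched by walking the unrolled steps in reverse and accumulating gradients for the shared weights. The example below is a hedged illustration in NumPy: it assumes a tanh RNN with a toy loss, half the squared norm of the last hidden state, which is chosen only to keep the code short. It then compares one analytic gradient entry against a finite-difference estimate.

```python
import numpy as np

def forward(xs, W_xh, W_hh, b_h):
    hs = [np.zeros(W_hh.shape[0])]           # hs[0] is the initial hidden state
    for x_t in xs:
        hs.append(np.tanh(W_xh @ x_t + W_hh @ hs[-1] + b_h))
    return hs                                # hs[t] is the hidden state after step t

def loss_fn(xs, W_xh, W_hh, b_h):
    h_last = forward(xs, W_xh, W_hh, b_h)[-1]
    return 0.5 * np.sum(h_last ** 2)         # toy loss (an assumption of this sketch)

def bptt(xs, W_xh, W_hh, b_h):
    hs = forward(xs, W_xh, W_hh, b_h)
    dW_xh = np.zeros_like(W_xh)
    dW_hh = np.zeros_like(W_hh)
    db_h = np.zeros_like(b_h)
    dh = hs[-1].copy()                       # dL/dh_T for the toy loss
    for t in range(len(xs), 0, -1):          # walk backward through time
        da = dh * (1.0 - hs[t] ** 2)         # gradient through tanh
        dW_xh += np.outer(da, xs[t - 1])     # weights are shared, so gradients add up
        dW_hh += np.outer(da, hs[t - 1])
        db_h += da
        dh = W_hh.T @ da                     # gradient flowing to the previous hidden state
    return dW_xh, dW_hh, db_h

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))                 # 5 time steps, 3 features (illustrative)
W_xh = rng.normal(scale=0.5, size=(4, 3))
W_hh = rng.normal(scale=0.5, size=(4, 4))
b_h = np.zeros(4)

dW_xh, dW_hh, db_h = bptt(xs, W_xh, W_hh, b_h)

# Finite-difference check on a single entry of W_hh
eps = 1e-6
W_plus, W_minus = W_hh.copy(), W_hh.copy()
W_plus[0, 1] += eps
W_minus[0, 1] -= eps
numeric = (loss_fn(xs, W_xh, W_plus, b_h) - loss_fn(xs, W_xh, W_minus, b_h)) / (2 * eps)
print(numeric, dW_hh[0, 1])
```

In practice, a framework's automatic differentiation performs this reverse pass, so the loop is rarely written by hand.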
## Reference 
https://ratsgo.github.io/natural%20language%20processing/2017/03/09/rnnlstm/

In [1]:
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

In [3]:
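# Device: use the first GPU when available, otherwise fall back to the CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')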

In [4]:
# Hyper-parameters
sequence_length = 28
input_size = 28
hidden_size = 128
num_layers = 2
num_classes = 10
batch_size = 100
num_epochs = 2
learning_rate = 0.01

In [6]:
# MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='../../data/',
                                           train=True,
                                           transform=transforms.ToTensor(),
                                           download=True)
test_dataset = torchvision.datasets.MNIST(root='../../data/',
                                          train=False,
                                          transform=transforms.ToTensor()
                                          )

In [7]:
# Data loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)

In [8]:
# Recurrent Neural Network (many-to-one); despite the class name, the recurrent layer is an LSTM
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super().__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # Set initial hidden and cell states to zeros on the same device as the input
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)

        # Forward propagate LSTM
        out, _ = self.lstm(x, (h0, c0))  # out: tensor of shape (batch, seq_length, hidden_size)

        # Decode the hidden state of the last time step
        out = self.fc(out[:, -1, :])
        return out

model = RNN(input_size, hidden_size, num_layers, num_classes).to(device)

In [9]:
# Loss, optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

In [11]:
# Train model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Treat each 28x28 image as a sequence of 28 rows with 28 features each
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))

Epoch [1/2], Step [100/600], Loss: 0.3794
Epoch [1/2], Step [200/600], Loss: 0.4424
Epoch [1/2], Step [300/600], Loss: 0.2403
Epoch [1/2], Step [400/600], Loss: 0.1716
Epoch [1/2], Step [500/600], Loss: 0.1249
Epoch [1/2], Step [600/600], Loss: 0.1374
Epoch [2/2], Step [100/600], Loss: 0.1075
Epoch [2/2], Step [200/600], Loss: 0.0668
Epoch [2/2], Step [300/600], Loss: 0.1059
Epoch [2/2], Step [400/600], Loss: 0.0399
Epoch [2/2], Step [500/600], Loss: 0.0180
Epoch [2/2], Step [600/600], Loss: 0.1150


In [12]:
# Test the model
model.eval()  # evaluation mode (good practice, although this model has no dropout or batch norm)
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest score is the predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))
    

Test Accuracy of the model on the 10000 test images: 97.78 %


In [13]:
# Save the model checkpoint
torch.save(model.state_dict(), 'rnn.ckpt')