# Deep Learning Series

<br>
<span style="color:gray">

1. Neural Net - part.1


2. Convolution Neural Network


3. Neural Net - part.2


</span>


<b>

4. Recursive Nerural Network


</b>

# Recurrent Neural Network

## What is Recurrent Neural Network ?

RNN은 RNN이다

## Computation in Recurrent Neural Network Layer

<img src="img/RNN_01.PNG">

In [1]:
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchtext import data, datasets

In [2]:
BATCH_SIZE = 64
lr = 0.001
EPOCHS = 10
USE_CUDA = torch.cuda.is_available()
DEVICE = torch.device("cuda" if USE_CUDA else "cpu")

In [3]:
TEXT = data.Field(sequential=True, batch_first=True, lower=True)
LABEL = data.Field(sequential=False, batch_first=True)

에러 내용 :: RuntimeError: multi-target not supported

원인 :: LABEL Field의 sequential 인자를 True로 지정함.
sequential을 True로 지정하게 되면 Tokenization이 진행됨.
> sequential – Whether the datatype represents sequential data. If False, no tokenization is applied. Default: True.

In [4]:
trainset, testset = datasets.IMDB.splits(TEXT, LABEL)

In [5]:
TEXT.build_vocab(trainset, min_freq=5)
LABEL.build_vocab(trainset)

In [6]:
trainset, valset = trainset.split(split_ratio=0.8)
train_iter, val_iter, test_iter = data.BucketIterator.splits((trainset, valset, testset),
                                                            batch_size=BATCH_SIZE,
                                                            shuffle=True, repeat=False)

In [7]:
vocab_size = len(TEXT.vocab)
n_classes = 2

In [8]:
print("[학습셋]: %d [검증셋]: %d [테스트셋]: %d [단어수]: %d [클래스]: %d" % (len(trainset), len(valset), len(testset), vocab_size, n_classes))

[학습셋]: 20000 [검증셋]: 5000 [테스트셋]: 25000 [단어수]: 46159 [클래스]: 2


In [9]:
class BasicRNN(nn.Module):
    def __init__(self, n_layers, hidden_dim, n_vocab, embed_dim, n_classes, dropout_p=0.2):
        super(BasicRNN, self).__init__()
        print("Building Basic RNN model")
        self.n_layers = n_layers
        self.embed = nn.Embedding(n_vocab, embed_dim)
        self.hidden_dim = hidden_dim
        self.dropout = nn.Dropout(dropout_p)
        self.rnn = nn.RNN(embed_dim, self.hidden_dim,
                         num_layers=self.n_layers,
                         batch_first=True)
        self.out = nn.Linear(self.hidden_dim, n_classes)
    
    def forward(self, x):
        x = self.embed(x)
        h_0 = self._init_state(batch_size=x.size(0))
        x, _ = self.rnn(x, h_0)
        h_t = x[:,-1,:]
        self.dropout(h_t)
        logit = self.out(h_t)
        return logit

    def _init_state(self, batch_size=1):
        weight = next(self.parameters()).data
        return weight.new(self.n_layers, batch_size, self.hidden_dim).zero_()

In [10]:
def train(model, optimizer, train_iter):
    model.train()
    for b, batch in enumerate(train_iter):
        x, y = batch.text.to(DEVICE), batch.label.to(DEVICE)
        y.data.sub_(1)
        optimizer.zero_grad()
        
        logit = model(x)
        loss = F.cross_entropy(logit, y)
        loss.backward()
        optimizer.step()

In [11]:
def evaluate(model, val_iter):
    """evaluate model"""
    model.eval()
    corrects, total_loss = 0, 0
    for batch in val_iter:
        x, y = batch.text.to(DEVICE), batch.label.to(DEVICE)
        y.data.sub_(1) # 레이블값을 0과 1로 변환
        logit = model(x)
        loss = F.cross_entropy(logit, y, reduction='sum')
        total_loss += loss.item()
        corrects += (logit.max(1)[1].view(y.size()).data == y.data).sum()
    size = len(val_iter.dataset)
    avg_loss = total_loss / size
    avg_accuracy = 100.0 * corrects / size
    return avg_loss, avg_accuracy

In [12]:
model = BasicRNN(1, 256, vocab_size, 128, n_classes, 0.5).to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

Building Basic RNN model


In [14]:
best_val_loss = None
for e in range(1, EPOCHS+1):
    train(model, optimizer, train_iter)
    val_loss, val_accuracy = evaluate(model, val_iter)
    
    print("[이폭: %d] 검증 오차:%5.2f | 검증 정확도:%5.2f" % (e, val_loss, val_accuracy))
    
    # 검증 오차가 가장 적은 최적의 모델을 저장
    
    if not best_val_loss or val_loss < best_val_loss:
        if not os.path.isdir("snapshot"):
            os.makedirs("snapshot")
        torch.save(model.state_dict(),
                  './snapshot/txtclassification.pt')
        best_val_loss = val_loss

[이폭: 1] 검증 오차: 0.70 | 검증 정확도:49.00
[이폭: 2] 검증 오차: 0.69 | 검증 정확도:48.00
[이폭: 3] 검증 오차: 0.70 | 검증 정확도:49.00
[이폭: 4] 검증 오차: 0.71 | 검증 정확도:49.00
[이폭: 5] 검증 오차: 0.70 | 검증 정확도:50.00
[이폭: 6] 검증 오차: 0.70 | 검증 정확도:50.00
[이폭: 7] 검증 오차: 0.70 | 검증 정확도:51.00
[이폭: 8] 검증 오차: 0.70 | 검증 정확도:50.00
[이폭: 9] 검증 오차: 0.70 | 검증 정확도:50.00
[이폭: 10] 검증 오차: 0.69 | 검증 정확도:51.00


## BPTT (Back Propagation Through Time)

<img src="img/RNN_02.PNG">

RNN의 $h$는 '상태(state)'를 기억해 시각이 1 스텝 (1단위;1t) 진행될 때마다 $h = \text{tanh}(h_{t-1}W_h + x_tW_x + b)$의 형태로 갱신이 된다.

보통 RNN의 출력 $h_t$를 은닉 상태(hidden state) 혹은 은닉 상태 벡터(hidden state vector)라고 한다.

## Gradient Vanishing & Gradient Exploding problem

## LSTM (Long-Short Term Memory)

## GRU (Gate Recurrent Unit)