# Recurrent Neural Network, RNN
- https://wikidocs.net/22886
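- An RNN (Recurrent Neural Network) is a sequence model: it processes its inputs and outputs as sequences.
- In an RNN, the value produced by the activation function at a hidden-layer node is sent toward the output layer and is also fed back as an input to that hidden node's next computation.
- Key terms: memory cell, hidden state
- Definitions
    - Hidden layer: $h_t = \tanh(W_x x_t + W_h h_{t-1} + b)$
    - Output layer: $y_t = f(W_y h_t + b)$
    - Here $f$ is one of the nonlinear activation functions.
    - $x_t$: $(d \times 1)$, $W_x$: $(D_h \times d)$, $W_h$: $(D_h \times D_h)$, $h_{t-1}$: $(D_h \times 1)$, $b$: $(D_h \times 1)$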

## Import

In [62]:
import numpy as np

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Bidirectional

## RNN

### Build RNN using tf.keras

In [6]:
model = Sequential()
model.add(SimpleRNN(3, input_shape=(2, 10)))
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_1 (SimpleRNN)     (None, 3)                 42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________
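
The `Param #` of 42 in the summary can be reproduced from the weight shapes: $W_x$ has $D_h \times d$ entries, $W_h$ has $D_h \times D_h$, and $b$ has $D_h$. The number of timesteps does not enter the count, because the same weights are reused at every step. A minimal sketch, where `simple_rnn_param_count` is a hypothetical helper written for this note and not part of Keras:

```python
def simple_rnn_param_count(input_dim, hidden_size):
    """Count the trainable parameters of one SimpleRNN layer (hypothetical helper)."""
    wx = hidden_size * input_dim     # W_x: (D_h x d)
    wh = hidden_size * hidden_size   # W_h: (D_h x D_h)
    b = hidden_size                  # b:   (D_h x 1)
    return wx + wh + b

# SimpleRNN(3, input_shape=(2, 10)): d = 10, D_h = 3
print(simple_rnn_param_count(input_dim=10, hidden_size=3))  # 3*10 + 3*3 + 3 = 42
```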


In [7]:
model = Sequential()
model.add(SimpleRNN(3, batch_input_shape=(8, 2, 10)))  # fix the batch size in advance
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_2 (SimpleRNN)     (8, 3)                    42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________


In [8]:
model = Sequential()
model.add(SimpleRNN(3, batch_input_shape=(8, 2, 10), return_sequences=True))  # returns a 3D tensor of shape (batch_size, timesteps, output_dim)
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_3 (SimpleRNN)     (8, 2, 3)                 42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________
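
To see where the `(8, 3)` and `(8, 2, 3)` output shapes come from, here is a rough NumPy sketch of a batched forward pass. It is not how Keras implements the layer. The variable names, the random seed, and the small-scale weight initialization are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, timesteps, input_dim, hidden_size = 8, 2, 10, 3

x = rng.random((batch_size, timesteps, input_dim))
Wx = rng.normal(scale=0.1, size=(hidden_size, input_dim))    # assumed initialization
Wh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # assumed initialization
b = np.zeros(hidden_size)

h = np.zeros((batch_size, hidden_size))
all_states = []
for t in range(timesteps):
    # h_t = tanh(W_x x_t + W_h h_{t-1} + b), applied to the whole batch at once
    h = np.tanh(x[:, t, :] @ Wx.T + h @ Wh.T + b)
    all_states.append(h)

last_state = h                            # corresponds to return_sequences=False
sequence = np.stack(all_states, axis=1)   # corresponds to return_sequences=True
print(last_state.shape, sequence.shape)   # (8, 3) (8, 2, 3)
```

The last time step of `sequence` is the same array as `last_state`, so `return_sequences=True` only changes how much of the history is returned, not what is computed.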


### Build RNN using numpy - hidden state computation only

In [12]:
timesteps = 10
input_dim = 4
hidden_size = 8

inputs = np.random.random((timesteps, input_dim))
inputs.shape

(10, 4)

In [16]:
inputs

array([[0.81111112, 0.66104075, 0.03228098, 0.34490095],
       [0.80023525, 0.09129578, 0.90260154, 0.42486265],
       [0.78124413, 0.36178635, 0.03651335, 0.07727108],
       [0.55444861, 0.28800689, 0.61168066, 0.91052   ],
       [0.58005866, 0.93817878, 0.01588937, 0.58395941],
       [0.05317946, 0.98062232, 0.17607263, 0.29059323],
       [0.8647249 , 0.84604152, 0.09925531, 0.40992413],
       [0.88785773, 0.76683182, 0.69259132, 0.77376166],
       [0.93446808, 0.37283073, 0.8773381 , 0.00553213],
       [0.53330173, 0.95321179, 0.36378881, 0.96775884]])

In [14]:
hidden_state_t = np.zeros((hidden_size, ))
hidden_state_t.shape

(8,)

In [15]:
hidden_state_t

array([0., 0., 0., 0., 0., 0., 0., 0.])

In [17]:
Wx = np.random.random((hidden_size, input_dim))
Wh = np.random.random((hidden_size, hidden_size))
b = np.random.random((hidden_size, ))
Wx.shape, Wh.shape, b.shape

((8, 4), (8, 8), (8,))

In [19]:
total_hidden_states = []

for input_t in inputs:  # input_t: the input at time step t
    # h_t = tanh(W_x x_t + W_h h_{t-1} + b)
    output_t = np.tanh(np.dot(Wx, input_t) + np.dot(Wh, hidden_state_t) + b)
    total_hidden_states.append(list(output_t))

    # the current output becomes the previous hidden state for the next step
    hidden_state_t = output_t
total_hidden_states = np.stack(total_hidden_states, axis=0)
total_hidden_states

array([[0.68861425, 0.92247127, 0.94192747, 0.8869073 , 0.79288545,
        0.85683692, 0.87070896, 0.8751066 ],
       [0.99993224, 0.99998215, 0.99967247, 0.99998001, 0.99926469,
        0.999989  , 0.99996274, 0.99994545],
       [0.99997618, 0.99999326, 0.99978348, 0.9999712 , 0.99925092,
        0.99998931, 0.99998293, 0.99992983],
       [0.99997817, 0.99999881, 0.99987852, 0.99999645, 0.99980928,
        0.99999733, 0.99998659, 0.99997708],
       [0.99998058, 0.99999907, 0.99987841, 0.99999177, 0.99973466,
        0.99999367, 0.99997943, 0.99995494],
       [0.99997426, 0.99999838, 0.99961788, 0.99997958, 0.99959676,
        0.99998428, 0.99994608, 0.99995378],
       [0.9999835 , 0.99999862, 0.9999062 , 0.99999254, 0.99972752,
        0.99999541, 0.99998823, 0.9999619 ],
       [0.99998692, 0.99999942, 0.99994365, 0.99999849, 0.99990081,
        0.99999871, 0.99999363, 0.99998768],
       [0.99998473, 0.99999508, 0.99985344, 0.99999354, 0.99971924,
        0.99999731, 0.999993
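
The loop above stops at the hidden states. As a sketch of the remaining step, the output-layer formula $y_t = f(W_y h_t + b)$ can be added on top. Everything specific here is an assumption for illustration: $f$ is taken to be softmax, `output_dim = 3` is arbitrary, and `Wy` and `by` are hypothetical names. The weights are drawn at a small scale, which is simply a choice for this sketch.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
timesteps, input_dim, hidden_size = 10, 4, 8
output_dim = 3  # assumption: number of output classes

inputs = rng.random((timesteps, input_dim))
Wx = rng.normal(scale=0.1, size=(hidden_size, input_dim))
Wh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)
Wy = rng.normal(scale=0.1, size=(output_dim, hidden_size))  # hypothetical output weights
by = np.zeros(output_dim)                                   # hypothetical output bias

h = np.zeros(hidden_size)
outputs = []
for x_t in inputs:
    h = np.tanh(Wx @ x_t + Wh @ h + b)     # hidden layer
    outputs.append(softmax(Wy @ h + by))   # output layer with f = softmax
outputs = np.stack(outputs, axis=0)
print(outputs.shape)  # (10, 3)
```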

## Deep Recurrent Neural Network

In [59]:
model = Sequential()
model.add(SimpleRNN(8, input_shape=(20,4), return_sequences=True))
model.add(SimpleRNN(8, return_sequences=True))  # input shape (20, 8) is inferred from the previous layer
model.summary()

Model: "sequential_38"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_48 (SimpleRNN)    (None, 20, 8)             104       
_________________________________________________________________
simple_rnn_49 (SimpleRNN)    (None, 20, 8)             136       
Total params: 240
Trainable params: 240
Non-trainable params: 0
_________________________________________________________________
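
Stacking works because the first layer returns its full sequence of hidden states, which the second layer then reads as its input sequence. A NumPy sketch of the same two-layer structure follows. `rnn_layer` and `init` are hypothetical helpers, and the initialization is an assumption. The parameter counts match the 104 and 136 in the summary.

```python
import numpy as np

def rnn_layer(inputs, Wx, Wh, b):
    """Run one SimpleRNN-style layer; return all hidden states, shape (timesteps, hidden)."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states, axis=0)

rng = np.random.default_rng(0)
timesteps, input_dim, hidden_size = 20, 4, 8

def init(in_dim, out_dim):
    """Create (Wx, Wh, b) for one layer (assumed initialization)."""
    return (rng.normal(scale=0.1, size=(out_dim, in_dim)),
            rng.normal(scale=0.1, size=(out_dim, out_dim)),
            np.zeros(out_dim))

layer1 = init(input_dim, hidden_size)    # reads the 4-dimensional inputs
layer2 = init(hidden_size, hidden_size)  # reads the 8-dimensional states of layer 1

x = rng.random((timesteps, input_dim))
h1 = rnn_layer(x, *layer1)    # (20, 8)
h2 = rnn_layer(h1, *layer2)   # (20, 8)

n_params = [sum(p.size for p in layer) for layer in (layer1, layer2)]
print(h2.shape, n_params)  # (20, 8) [104, 136]
```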


## Bidirectional Recurrent Neural Network
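A bidirectional RNN uses two memory cells to predict a single output. The first memory cell works as described earlier: it receives the hidden state from the previous time step (forward states) and computes the current hidden state. The second memory cell works differently: instead of the previous time step, it receives the hidden state from the following time step (backward states) and computes the current hidden state. Both values are then used by the output layer to predict the output.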
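
The two-cell idea can be sketched in NumPy: one pass reads the sequence front to back, a second pass with its own weights reads it back to front, and the two hidden states for each time step are concatenated. This is an illustration rather than the Keras implementation. `rnn_pass` and `init` are hypothetical helpers, and concatenation is assumed as the way the two directions are merged.

```python
import numpy as np

def rnn_pass(inputs, Wx, Wh, b):
    """Return the hidden state at every time step, shape (timesteps, hidden)."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states, axis=0)

rng = np.random.default_rng(0)
timesteps, input_dim, hidden_size = 10, 4, 8
x = rng.random((timesteps, input_dim))

def init():
    """Create (Wx, Wh, b) for one direction (assumed initialization)."""
    return (rng.normal(scale=0.1, size=(hidden_size, input_dim)),
            rng.normal(scale=0.1, size=(hidden_size, hidden_size)),
            np.zeros(hidden_size))

fwd_params, bwd_params = init(), init()  # each direction has its own weights

forward = rnn_pass(x, *fwd_params)
# Reverse time, run the cell, then flip back so row t lines up with time step t.
backward = rnn_pass(x[::-1], *bwd_params)[::-1]

combined = np.concatenate([forward, backward], axis=1)
print(combined.shape)  # (10, 16)
```

The output width is twice `hidden_size`, which is why the summary for the Keras model reports 16 features per time step.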

In [63]:
model = Sequential()
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences=True), input_shape=(timesteps, input_dim)))
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences=True)))
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences=True)))
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences=True)))
model.summary()

Model: "sequential_40"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
bidirectional (Bidirectional (None, 10, 16)            208       
_________________________________________________________________
bidirectional_1 (Bidirection (None, 10, 16)            400       
_________________________________________________________________
bidirectional_2 (Bidirection (None, 10, 16)            400       
_________________________________________________________________
bidirectional_3 (Bidirection (None, 10, 16)            400       
Total params: 1,408
Trainable params: 1,408
Non-trainable params: 0
_________________________________________________________________
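
The parameter counts in this summary can be checked by hand. Each `Bidirectional` wrapper holds two independent SimpleRNN cells, and every layer after the first receives 16-dimensional inputs (forward and backward states concatenated). A small sketch, with hypothetical helper names:

```python
def simple_rnn_param_count(input_dim, hidden_size):
    # W_x (hidden x input) + W_h (hidden x hidden) + b (hidden)
    return hidden_size * input_dim + hidden_size * hidden_size + hidden_size

def bidirectional_param_count(input_dim, hidden_size):
    # the forward and backward cells each have their own weights
    return 2 * simple_rnn_param_count(input_dim, hidden_size)

input_dim, hidden_size = 4, 8
first = bidirectional_param_count(input_dim, hidden_size)       # first layer sees 4 features
rest = bidirectional_param_count(2 * hidden_size, hidden_size)  # later layers see 16 features
total = first + 3 * rest
print(first, rest, total)  # 208 400 1408
```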
