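This notebook accompanies the RNN tutorial from the WikiDocs book "Introduction to Natural Language Processing Using Deep Learning" (딥 러닝을 이용한 자연어 처리 입문).  
Link: https://wikidocs.net/22886

Last tested on October 12, 2021.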

In [None]:
import tensorflow as tf

In [None]:
tf.__version__

'2.6.0'

# 1. Implementing an RNN with Keras

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN

In [None]:
model = Sequential()
model.add(SimpleRNN(3, input_shape=(2,10)))
# Equivalent to model.add(SimpleRNN(3, input_length=2, input_dim=10))
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn (SimpleRNN)       (None, 3)                 42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________
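
The 42 parameters reported in the summary can be checked by hand. A SimpleRNN layer with a bias holds an input weight matrix of shape (input_dim, units), a recurrent weight matrix of shape (units, units), and a bias vector of shape (units,). The helper name `simple_rnn_param_count` below is made up for this illustration and is not part of Keras.

```python
def simple_rnn_param_count(input_dim, units):
    """Count the trainable parameters of one SimpleRNN layer (bias included)."""
    input_weights = input_dim * units   # weights applied to the input x_t
    recurrent_weights = units * units   # weights applied to the previous hidden state
    bias = units                        # one bias term per unit
    return input_weights + recurrent_weights + bias

# input_dim=10 and units=3, as in the model above
print(simple_rnn_param_count(10, 3))  # 42
```

The number of timesteps (2 here) does not appear in the formula, because the same weights are reused at every timestep.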


In [None]:
model = Sequential()
model.add(SimpleRNN(3, batch_input_shape=(8,2,10)))
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_2 (SimpleRNN)     (8, 3)                    42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________


In [None]:
model = Sequential()
model.add(SimpleRNN(3, batch_input_shape=(8,2,10), return_sequences=True))
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_3 (SimpleRNN)     (8, 2, 3)                 42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________
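
The two output shapes above, (8, 3) and (8, 2, 3), differ only in whether the layer returns the last hidden state or the hidden state at every timestep. The following is a minimal NumPy sketch of that behavior, not the Keras implementation. The function `simple_rnn_forward` and the random weights are assumptions made for illustration. The weights are laid out as (input_dim, units) so that a whole batch can be multiplied at once.

```python
import numpy as np

def simple_rnn_forward(inputs, Wx, Wh, b, return_sequences=False):
    """inputs has shape (batch, timesteps, input_dim)."""
    batch_size, timesteps, _ = inputs.shape
    hidden_size = Wh.shape[0]
    h = np.zeros((batch_size, hidden_size))  # initial hidden state
    all_states = []
    for t in range(timesteps):
        # h_t = tanh(x_t Wx + h_{t-1} Wh + b), computed for the whole batch
        h = np.tanh(inputs[:, t, :] @ Wx + h @ Wh + b)
        all_states.append(h)
    if return_sequences:
        return np.stack(all_states, axis=1)  # (batch, timesteps, hidden_size)
    return h                                 # (batch, hidden_size)

rng = np.random.default_rng(0)
x = rng.random((8, 2, 10))             # batch of 8, 2 timesteps, 10 features
Wx = rng.normal(size=(10, 3)) * 0.1    # arbitrary illustrative weights
Wh = rng.normal(size=(3, 3)) * 0.1
b = np.zeros(3)

last = simple_rnn_forward(x, Wx, Wh, b)
seq = simple_rnn_forward(x, Wx, Wh, b, return_sequences=True)
print(last.shape)  # (8, 3)
print(seq.shape)   # (8, 2, 3)
```

The last timestep of the full sequence is the same array that is returned when `return_sequences=False`.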


# 2. Implementing an RNN in Python

In [None]:
import numpy as np

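timesteps = 10 # Number of timesteps. In NLP this is usually the sentence length.
input_dim = 4 # Input dimension. In NLP this is usually the word-vector dimension.
hidden_size = 8 # Size of the hidden state, i.e. the capacity of the memory cell.

inputs = np.random.random((timesteps, input_dim)) # 2D tensor holding the input

# Create a hidden state of size hidden_size.
hidden_state_t = np.zeros((hidden_size,)) # The initial hidden state is a zero vector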

In [None]:
# A hidden state of size 8. It is still the initial hidden state, so every dimension is 0.
print(hidden_state_t)

[0. 0. 0. 0. 0. 0. 0. 0.]


In [None]:
Wx = np.random.random((hidden_size, input_dim))  # 2D tensor of shape (8, 4). Weights for the input.
Wh = np.random.random((hidden_size, hidden_size)) # 2D tensor of shape (8, 8). Weights for the hidden state.
b = np.random.random((hidden_size,)) # 1D tensor of shape (8,). This is the bias.

In [None]:
print(np.shape(Wx))
print(np.shape(Wh))
print(np.shape(b))

(8, 4)
(8, 8)
(8,)


In [None]:
total_hidden_states = []

# Memory cell operation
for input_t in inputs: # One input vector is fed in per timestep.
  output_t = np.tanh(np.dot(Wx,input_t) + np.dot(Wh,hidden_state_t) + b) # h_t = tanh(Wx x_t + Wh h_{t-1} + b)
  total_hidden_states.append(list(output_t)) # Accumulate the hidden state of every timestep
  print(np.shape(total_hidden_states)) # Shape accumulated so far: (timesteps seen, output_dim)
  hidden_state_t = output_t

total_hidden_states = np.stack(total_hidden_states, axis = 0) 
# Stack into a single 2D array so the values print cleanly.

print(total_hidden_states) # Shape (timesteps, output_dim). Here the memory cell outputs a 2D tensor of shape (10, 8).

(1, 8)
(2, 8)
(3, 8)
(4, 8)
(5, 8)
(6, 8)
(7, 8)
(8, 8)
(9, 8)
(10, 8)
[[0.85496939 0.94702668 0.86708324 0.91527352 0.96703696 0.80542573
  0.97073141 0.89202759]
 [0.99944812 0.99999666 0.99995257 0.9979095  0.99999017 0.99999713
  0.99999364 0.99987055]
 [0.99967934 0.99999649 0.99998757 0.99754125 0.99999355 0.99999958
  0.99999087 0.99996529]
 [0.9997036  0.99999849 0.99997919 0.99784333 0.99999326 0.99999922
  0.99999624 0.99995417]
 [0.99907414 0.99999    0.99997213 0.99851461 0.99998889 0.99999835
  0.99998732 0.99981697]
 [0.99983633 0.99999726 0.99998918 0.99874178 0.99999664 0.99999958
  0.99999422 0.99997005]
 [0.99944346 0.99999446 0.99997273 0.99877603 0.99999202 0.99999845
  0.99999257 0.99986234]
 [0.99973524 0.99999846 0.99998268 0.99890153 0.9999961  0.99999915
  0.9999973  0.99994947]
 [0.99948656 0.9999978  0.99998404 0.99844112 0.99999454 0.99999921
  0.99999601 0.99994352]
 [0.99909342 0.99998551 0.99996263 0.9968948  0.99997724 0.99999844
  0.99997329 0.99981045]]
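
Almost every value printed above is close to 1. This follows from the initialization: `np.random.random` draws from [0, 1), so the inputs, weights, and biases are all non-negative, the pre-activation sums are large and positive, and tanh saturates. The sketch below compares that scheme with small weights centered on zero. The helper `run_rnn`, the range of ±0.1, and the zero bias are arbitrary choices made for illustration and are not part of the tutorial.

```python
import numpy as np

def run_rnn(inputs, Wx, Wh, b):
    """Run the memory cell over all timesteps; returns (timesteps, hidden_size)."""
    hidden_state = np.zeros(Wh.shape[0])
    states = []
    for input_t in inputs:
        hidden_state = np.tanh(Wx @ input_t + Wh @ hidden_state + b)
        states.append(hidden_state)
    return np.stack(states, axis=0)

timesteps, input_dim, hidden_size = 10, 4, 8
rng = np.random.default_rng(0)
inputs = rng.random((timesteps, input_dim))

# Same scheme as the tutorial: everything drawn from [0, 1).
positive_states = run_rnn(inputs,
                          rng.random((hidden_size, input_dim)),
                          rng.random((hidden_size, hidden_size)),
                          rng.random(hidden_size))

# Assumed alternative: small weights centered on zero, zero bias.
centered_states = run_rnn(inputs,
                          rng.uniform(-0.1, 0.1, (hidden_size, input_dim)),
                          rng.uniform(-0.1, 0.1, (hidden_size, hidden_size)),
                          np.zeros(hidden_size))

print(positive_states.min())          # every entry is positive
print(np.abs(centered_states).max())  # bounded by tanh(1.2), about 0.83
```

With weights in ±0.1 and inputs in [0, 1), the pre-activation magnitude is at most 4 × 0.1 + 8 × 0.1 = 1.2, which is why the second set of states cannot saturate.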

# 3. Deep Recurrent Neural Network


In [None]:
model = Sequential()
model.add(SimpleRNN(hidden_size, input_length=10, input_dim=5, return_sequences = True))
model.add(SimpleRNN(hidden_size, return_sequences = True))
model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_8 (SimpleRNN)     (None, 10, 8)             112       
_________________________________________________________________
simple_rnn_9 (SimpleRNN)     (None, 10, 8)             136       
Total params: 248
Trainable params: 248
Non-trainable params: 0
_________________________________________________________________
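
In a stacked RNN the first layer has to return its full sequence, so that the second layer receives one input vector per timestep. The data flow can be sketched in NumPy as follows. The helper `rnn_layer` is hypothetical and random weights stand in for trained ones. This is a rough illustration, not what Keras runs internally.

```python
import numpy as np

def rnn_layer(inputs, Wx, Wh, b):
    """inputs: (timesteps, input_dim) -> hidden states (timesteps, hidden_size)."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
timesteps, input_dim, hidden_size = 10, 5, 8
x = rng.random((timesteps, input_dim))

# Layer 1 reads the raw input (5 features per timestep).
Wx1 = rng.normal(scale=0.1, size=(hidden_size, input_dim))
Wh1 = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b1 = np.zeros(hidden_size)

# Layer 2 reads the hidden states of layer 1 (8 features per timestep).
Wx2 = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
Wh2 = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b2 = np.zeros(hidden_size)

h1 = rnn_layer(x, Wx1, Wh1, b1)
h2 = rnn_layer(h1, Wx2, Wh2, b2)
print(h1.shape, h2.shape)  # (10, 8) (10, 8)

# Element counts match the summary: 8*5 + 8*8 + 8 and 8*8 + 8*8 + 8
print(Wx1.size + Wh1.size + b1.size, Wx2.size + Wh2.size + b2.size)  # 112 136
```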


# 4. Bidirectional Recurrent Neural Network

In [None]:
from tensorflow.keras.layers import Bidirectional

In [None]:
timesteps = 10
input_dim = 5

model = Sequential()
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences = True), input_shape=(timesteps, input_dim)))
model.summary()

Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
bidirectional_1 (Bidirection (None, 10, 16)            224       
Total params: 224
Trainable params: 224
Non-trainable params: 0
_________________________________________________________________
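
The output dimension of 16 comes from concatenating two hidden states of size 8: one from an RNN that reads the sequence forward and one from an RNN that reads it backward. A minimal NumPy sketch of that idea follows, assuming concatenation as the merge step. The helpers `rnn_layer`, `bidirectional_rnn`, and `random_params` are hypothetical names, and the weights are random rather than trained.

```python
import numpy as np

def rnn_layer(inputs, Wx, Wh, b):
    """inputs: (timesteps, input_dim) -> hidden states (timesteps, hidden_size)."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states)

def bidirectional_rnn(inputs, forward_params, backward_params):
    forward_states = rnn_layer(inputs, *forward_params)
    # Process the reversed sequence, then flip the result back so that
    # row t of both arrays refers to the same timestep.
    backward_states = rnn_layer(inputs[::-1], *backward_params)[::-1]
    return np.concatenate([forward_states, backward_states], axis=-1)

def random_params(rng, input_dim, hidden_size):
    """Illustrative random weights: (Wx, Wh, b)."""
    return (rng.normal(scale=0.1, size=(hidden_size, input_dim)),
            rng.normal(scale=0.1, size=(hidden_size, hidden_size)),
            np.zeros(hidden_size))

rng = np.random.default_rng(0)
timesteps, input_dim, hidden_size = 10, 5, 8
x = rng.random((timesteps, input_dim))
fwd = random_params(rng, input_dim, hidden_size)
bwd = random_params(rng, input_dim, hidden_size)

out = bidirectional_rnn(x, fwd, bwd)
print(out.shape)  # (10, 16)
```

Each direction has its own weights, which is why the parameter count in the summary (224) is twice that of a single SimpleRNN with the same sizes (112).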


In [None]:
model = Sequential()
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences = True), input_shape=(timesteps, input_dim)))
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences = True)))
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences = True)))
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences = True)))
model.summary()

Model: "sequential_10"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
bidirectional_6 (Bidirection (None, 10, 16)            224       
_________________________________________________________________
bidirectional_7 (Bidirection (None, 10, 16)            400       
_________________________________________________________________
bidirectional_8 (Bidirection (None, 10, 16)            400       
_________________________________________________________________
bidirectional_9 (Bidirection (None, 10, 16)            400       
Total params: 1,424
Trainable params: 1,424
Non-trainable params: 0
_________________________________________________________________
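
The parameter counts in this summary can be reproduced with simple arithmetic. The first bidirectional layer sees 5 input features, while each later layer sees the 16-dimensional concatenated output of the layer before it. The two helper functions below are hypothetical names introduced for this check.

```python
def simple_rnn_params(input_dim, units):
    # input weights + recurrent weights + bias
    return units * (input_dim + units + 1)

def bidirectional_params(input_dim, units):
    # Two independent SimpleRNNs: one forward, one backward.
    return 2 * simple_rnn_params(input_dim, units)

hidden_size, input_dim = 8, 5
first = bidirectional_params(input_dim, hidden_size)
# Later layers receive the 16-dimensional concatenated output.
later = bidirectional_params(2 * hidden_size, hidden_size)
total = first + 3 * later
print(first, later, total)  # 224 400 1424
```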
