All exercises are based on [Introduction to Deep Learning for NLP](https://wikidocs.net/22886).

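- Basic form
`model.add(SimpleRNN(hidden_size))`
- With additional arguments
`model.add(SimpleRNN(hidden_size, input_length=M, input_dim=N))`

    - hidden_size = the size of the hidden state. It also equals the size of the value (output_dim) that the memory cell passes to the next time step's memory cell and to the output layer. A larger value increases the capacity of the RNN; small to mid-sized models typically use values such as 128, 256, 512, or 1024
    - timesteps = the length of the input sequence (input_length), i.e. the number of time steps
    - input_dim = the size of the input at each time step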

In [2]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN

In [3]:
model = Sequential()
model.add(SimpleRNN(3, input_shape = (2,10)))
# model.add(SimpleRNN(3, input_length = 2, input_dim = 10))
model.summary()
# input_shape = (input_length, input_dim): 2 time steps, each a 10-dimensional input
# the batch size is not known yet -> None
# hidden size: 3 (= output dim)

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn (SimpleRNN)       (None, 3)                 42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________
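
The `Param #` value of 42 in the summary above can be checked by hand: the layer holds an input-to-hidden weight matrix, a hidden-to-hidden weight matrix, and a bias vector. A minimal sketch of that arithmetic follows; the helper name `simple_rnn_param_count` is hypothetical and not part of Keras.

```python
def simple_rnn_param_count(input_dim, hidden_size):
    """Trainable parameters of one SimpleRNN layer with a bias term."""
    wx = input_dim * hidden_size    # input-to-hidden weights
    wh = hidden_size * hidden_size  # hidden-to-hidden weights
    b = hidden_size                 # bias
    return wx + wh + b

# 10-dimensional inputs, hidden size 3: 30 + 9 + 3
print(simple_rnn_param_count(input_dim=10, hidden_size=3))  # 42
```

The number of time steps does not appear in the formula because the same weights are reused at every time step.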


In [4]:
# define the batch size in advance
model = Sequential()
model.add(SimpleRNN(3, batch_input_shape = (8,2,10)))
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_1 (SimpleRNN)     (8, 3)                    42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________


In [5]:
# build the model so that it returns a 3D tensor that includes the time steps
model = Sequential()
model.add(SimpleRNN(3, batch_input_shape = (8,2,10), return_sequences = True))
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_2 (SimpleRNN)     (8, 2, 3)                 42        
Total params: 42
Trainable params: 42
Non-trainable params: 0
_________________________________________________________________
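
The three summaries above differ only in their output shape. As a rough summary of the pattern, the sketch below reproduces those shapes with a plain Python function; `simple_rnn_output_shape` is a hypothetical helper written for illustration, not a Keras function.

```python
def simple_rnn_output_shape(batch_size, timesteps, hidden_size, return_sequences=False):
    """Output shape of a SimpleRNN layer for a (batch_size, timesteps, input_dim) input."""
    if return_sequences:
        # one hidden state per time step
        return (batch_size, timesteps, hidden_size)
    # only the hidden state of the last time step
    return (batch_size, hidden_size)

print(simple_rnn_output_shape(None, 2, 3))                      # (None, 3)
print(simple_rnn_output_shape(8, 2, 3))                         # (8, 3)
print(simple_rnn_output_shape(8, 2, 3, return_sequences=True))  # (8, 2, 3)
```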


### Implementing an RNN in Python

In [6]:
import numpy as np

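timesteps = 10 # number of time steps (sentence length)
input_dim = 4 # input dimension (dimension of a word vector)
hidden_size = 8 # size of the hidden state, i.e. the capacity of the memory cell

inputs = np.random.random((timesteps, input_dim)) # 2D tensor that serves as the input

hidden_state_t = np.zeros((hidden_size,)) # the initial hidden state is the zero vector
# the hidden state is created with size hidden_size

print(hidden_state_t) # hidden state of size 8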

[0. 0. 0. 0. 0. 0. 0. 0.]


In [7]:
Wx = np.random.random((hidden_size, input_dim)) # 2D tensor of shape (8, 4): weights for the input
Wh = np.random.random((hidden_size, hidden_size)) # 2D tensor of shape (8, 8): weights for the hidden state
b = np.random.random((hidden_size,)) # 1D tensor of shape (8,): the bias

In [9]:
print(np.shape(Wx))
print(np.shape(Wh))
print(np.shape(b))

(8, 4)
(8, 8)
(8,)


In [10]:
total_hidden_states = []

# memory cell computation
for input_t in inputs:
    output_t = np.tanh(np.dot(Wx, input_t) + np.dot(Wh, hidden_state_t) + b)
    total_hidden_states.append(list(output_t)) # accumulate the hidden state of every time step
    print(np.shape(total_hidden_states))
    hidden_state_t = output_t

total_hidden_states = np.stack(total_hidden_states, axis = 0)

print(total_hidden_states)

(1, 8)
(2, 8)
(3, 8)
(4, 8)
(5, 8)
(6, 8)
(7, 8)
(8, 8)
(9, 8)
(10, 8)
[[0.8509924  0.95215477 0.94177424 0.46921501 0.80442601 0.66239114
  0.71711232 0.91959919]
 [0.99996761 0.99973676 0.99972805 0.99974395 0.99993515 0.9996309
  0.99984282 0.99989549]
 [0.99999569 0.99991491 0.99996002 0.99996578 0.9999905  0.99986491
  0.99997427 0.99998605]
 [0.99999736 0.99992305 0.9999573  0.99997255 0.99998806 0.99990229
  0.99997999 0.99998443]
 [0.9999888  0.99986317 0.99988598 0.99996787 0.99999292 0.99984271
  0.99996274 0.9999767 ]
 [0.99999745 0.99977763 0.99996415 0.99996419 0.99996028 0.99991444
  0.99996425 0.99998142]
 [0.99999655 0.99969116 0.99990581 0.99996511 0.99994811 0.99980613
  0.9999707  0.99989551]
 [0.99999322 0.99869635 0.99991179 0.99993435 0.9998285  0.99963675
  0.99992688 0.9998003 ]
 [0.99999658 0.99950968 0.99990967 0.99996559 0.99992256 0.99988789
  0.99995736 0.99993329]
 [0.99999784 0.99965742 0.99996927 0.99995446 0.99991575 0.99986186
  0.99996269 0.99996127]]
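
The hidden states printed above sit very close to 1 from the second time step onward. This is expected here: `Wx`, `Wh`, `b`, and the inputs are all drawn from [0, 1), so the pre-activation is a sum of positive terms and tanh saturates. The sketch below wraps the same loop in a function and draws the weights from a zero-centered range instead. The function name `simple_rnn_forward`, the seed, and the range [-0.5, 0.5) are assumptions made for illustration, not part of the original exercise.

```python
import numpy as np

def simple_rnn_forward(inputs, Wx, Wh, b):
    """Run a tanh RNN cell over inputs of shape (timesteps, input_dim)."""
    hidden_state = np.zeros(Wh.shape[0])  # initial hidden state is the zero vector
    states = []
    for input_t in inputs:
        hidden_state = np.tanh(Wx @ input_t + Wh @ hidden_state + b)
        states.append(hidden_state)
    all_states = np.stack(states, axis=0)  # (timesteps, hidden_size)
    return all_states, hidden_state

timesteps, input_dim, hidden_size = 10, 4, 8
rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
inputs = rng.random((timesteps, input_dim))
Wx = rng.uniform(-0.5, 0.5, (hidden_size, input_dim))
Wh = rng.uniform(-0.5, 0.5, (hidden_size, hidden_size))
b = rng.uniform(-0.5, 0.5, (hidden_size,))

all_states, last_state = simple_rnn_forward(inputs, Wx, Wh, b)
print(all_states.shape)  # (10, 8), analogous to return_sequences = True
print(last_state.shape)  # (8,), analogous to return_sequences = False
```

The last row of `all_states` is the same vector as `last_state`, which mirrors the difference between the two Keras output modes.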

### Deep Recurrent Neural Network

In [11]:
model = Sequential()
model.add(SimpleRNN(hidden_size, return_sequences = True))
model.add(SimpleRNN(hidden_size, return_sequences = True))
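
In a stacked RNN, every layer below the top one needs `return_sequences = True` so that the next layer receives one input vector per time step. A NumPy sketch of two stacked layers follows. The helpers `rnn_layer` and `init_params`, the seed, and the weight range are assumptions made for illustration.

```python
import numpy as np

def rnn_layer(inputs, Wx, Wh, b):
    """Return the hidden state at every time step, shape (timesteps, hidden_size)."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states, axis=0)

def init_params(rng, in_dim, hidden_size):
    """Zero-centered random weights and a zero bias (an arbitrary choice for this sketch)."""
    Wx = rng.uniform(-0.5, 0.5, (hidden_size, in_dim))
    Wh = rng.uniform(-0.5, 0.5, (hidden_size, hidden_size))
    b = np.zeros(hidden_size)
    return Wx, Wh, b

timesteps, input_dim, hidden_size = 10, 4, 8
rng = np.random.default_rng(0)
inputs = rng.random((timesteps, input_dim))

layer1 = init_params(rng, input_dim, hidden_size)    # reads the raw inputs
layer2 = init_params(rng, hidden_size, hidden_size)  # reads layer 1's hidden states

h1 = rnn_layer(inputs, *layer1)  # (10, 8)
h2 = rnn_layer(h1, *layer2)      # (10, 8)
print(h1.shape, h2.shape)
```

Note that the second layer's input weights have shape `(hidden_size, hidden_size)`, because its input at each time step is the first layer's hidden state rather than the original 4-dimensional input.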

### Bidirectional Recurrent Neural Network

In [12]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Bidirectional

model = Sequential()
model.add(Bidirectional(SimpleRNN(hidden_size, return_sequences = True), input_shape = (timesteps, input_dim)))
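
With its default `merge_mode` of `'concat'`, `Bidirectional` runs one RNN forward in time and a second, separately parameterized RNN backward in time, then concatenates their hidden states at each time step, so the last dimension becomes `2 * hidden_size`. A NumPy sketch of that idea follows. The helpers `rnn_layer` and `init_params`, the seed, and the weight range are assumptions made for illustration.

```python
import numpy as np

def rnn_layer(inputs, Wx, Wh, b):
    """Return the hidden state at every time step, shape (timesteps, hidden_size)."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states, axis=0)

def init_params(rng, in_dim, hidden_size):
    """Zero-centered random weights and a zero bias (an arbitrary choice for this sketch)."""
    Wx = rng.uniform(-0.5, 0.5, (hidden_size, in_dim))
    Wh = rng.uniform(-0.5, 0.5, (hidden_size, hidden_size))
    b = np.zeros(hidden_size)
    return Wx, Wh, b

timesteps, input_dim, hidden_size = 10, 4, 8
rng = np.random.default_rng(0)
inputs = rng.random((timesteps, input_dim))

fwd_params = init_params(rng, input_dim, hidden_size)  # forward-direction weights
bwd_params = init_params(rng, input_dim, hidden_size)  # backward-direction weights

forward_states = rnn_layer(inputs, *fwd_params)
# Process the sequence from last to first, then flip back so rows line up by time step
backward_states = rnn_layer(inputs[::-1], *bwd_params)[::-1]

outputs = np.concatenate([forward_states, backward_states], axis=-1)
print(outputs.shape)  # (10, 16)
```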