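<h2>Recurrent Neural Network (RNN)</h2>
<b>Takes ordered data as input and produces an output in response to the changing input</b><br>
<b>Suited to time series (weather, stock prices, etc.) and natural language: data that changes over time, where the change itself carries meaning</b>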

<h2>Feed Forward Network vs Recurrent Network</h2>
<h2>Feed Forward Network</h2>
<b>The standard neural network structure</b><br>
<b>One-directional flow: input -> hidden -> output layer</b><br>
<b>Not affected by the output of a previous step</b><br>

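<h2>Recurrent Network (feedback structure)</h2>
<b>The output of a previous layer or step is fed back in as input</b><br>
<b>Acts as a memory system that remembers the previous state at each step</b><br>
<b>The current state depends on the previous state</b><br>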
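
<b>The difference between the two structures can be sketched in NumPy. The weights, sizes, and function names below are hypothetical and chosen only for illustration: the feed-forward step sees the current input alone, while the recurrent step also receives a previous state, so the same input can give different outputs.</b><br>

```python
import numpy as np

# Hypothetical, fixed weights chosen only for illustration
W = np.full((4, 3), 0.1)   # input-to-hidden weights
U = np.full((4, 4), 0.1)   # hidden-to-hidden weights (recurrent step only)

def feed_forward_step(x):
    # Output depends on the current input only
    return np.tanh(W @ x)

def recurrent_step(x, state):
    # Output depends on the current input and the previous state
    return np.tanh(W @ x + U @ state)

x = np.array([1.0, 2.0, 3.0])
state_a = np.zeros(4)   # one possible history
state_b = np.ones(4)    # a different history

ff_same = np.allclose(feed_forward_step(x), feed_forward_step(x))
rec_same = np.allclose(recurrent_step(x, state_a), recurrent_step(x, state_b))
print(ff_same, rec_same)
```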

<h2>Recurrent neural network structures</h2>
<b>One to One - RNN</b><br>
<b>One to Many - Image Captioning, generating a description of an image</b><br>
<b>Many to One - Sentiment Classification, judging whether a sentence is positive or negative</b><br>
<b>Many to Many - Machine Translation, translating one language into another</b><br>
<b>Many to Many - Video Classification(Frame Level)</b>

In [2]:
import numpy as np

In [14]:
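timesteps = 100        # number of timesteps in the input sequence
input_features = 32    # dimensionality of each input vector
output_features = 64   # dimensionality of the output (state) vector

# Random input sequence, shape (timesteps, input_features)
inputs = np.random.random((timesteps, input_features))

# Initial state: all zeros
state_t = np.zeros((output_features, ))

# Random weights and bias
W = np.random.random((output_features, input_features))
U = np.random.random((output_features, output_features))
b = np.random.random((output_features, ))

successive_outputs = []

for input_t in inputs:
    # Combine the current input with the previous state
    output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
    successive_outputs.append(output_t)
    # The output becomes the state for the next timestep
    state_t = output_t

# Shape: (timesteps, output_features)
final_output_sequence = np.stack(successive_outputs, axis = 0)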
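
<b>Written as an equation, the loop above computes the following recurrence. The symbols h_t (the state/output at step t) and x_t (the input at step t) are notation introduced here; W, U, and b are the arrays from the code.</b><br>

```latex
\begin{aligned}
h_0 &= \mathbf{0} \in \mathbb{R}^{64}, \\
h_t &= \tanh\left(W x_t + U h_{t-1} + b\right), \quad t = 1, \dots, 100, \\
W &\in \mathbb{R}^{64 \times 32}, \quad U \in \mathbb{R}^{64 \times 64}, \quad b \in \mathbb{R}^{64}.
\end{aligned}
```

<b>Stacking h_1 through h_100 gives an array of shape (100, 64), which is what np.stack returns.</b><br>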

<h2>Keras recurrent layers</h2>
<b>SimpleRNN layer</b><br>
<b>Input: (batch_size, timesteps, input_features)</b><br>

<b>Output: determined by return_sequences</b><br>
<b>3D tensor: returns the full sequence of outputs from every timestep, shape (batch_size, timesteps, output_features)</b><br>
<b>2D tensor: returns only the last output for each input sequence, shape (batch_size, output_features)</b><br>
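
<b>The two output shapes can be illustrated with a small NumPy stand-in for a recurrent layer. This is a sketch, not the Keras implementation: the function simple_rnn and its weight layout (inputs multiplied on the left) are assumptions made for the example.</b><br>

```python
import numpy as np

def simple_rnn(inputs, W, U, b, return_sequences=False):
    """Run a tanh RNN over a batch shaped (batch_size, timesteps, input_features)."""
    batch_size, timesteps, _ = inputs.shape
    state = np.zeros((batch_size, U.shape[0]))
    outputs = []
    for t in range(timesteps):
        # One step for the whole batch: (batch, in) @ (in, out) + (batch, out) @ (out, out)
        state = np.tanh(inputs[:, t, :] @ W + state @ U + b)
        outputs.append(state)
    if return_sequences:
        return np.stack(outputs, axis=1)   # (batch_size, timesteps, output_features)
    return state                           # (batch_size, output_features)

rng = np.random.default_rng(0)
x = rng.random((8, 100, 32))               # 8 sequences, 100 timesteps, 32 features
W = rng.normal(scale=0.1, size=(32, 64))
U = rng.normal(scale=0.1, size=(64, 64))
b = np.zeros(64)

full = simple_rnn(x, W, U, b, return_sequences=True)
last = simple_rnn(x, W, U, b, return_sequences=False)
print(full.shape, last.shape)
```

<b>The 2D result is the final timestep of the 3D result, which is why a layer that feeds another recurrent layer needs the full sequence.</b><br>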

In [9]:
from tensorflow.keras.layers import SimpleRNN, Embedding
from tensorflow.keras.models import Sequential

In [18]:
model = Sequential()
model.add(Embedding(10000, 32))
# model.add(SimpleRNN(32))
model.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_4 (Embedding)      (None, None, 32)          320000    
Total params: 320,000
Trainable params: 320,000
Non-trainable params: 0
_________________________________________________________________


In [None]:
model = Sequential()
model.add(Embedding(10000, 32))
model.add(SimpleRNN(32, return_sequences = True))
model.add(SimpleRNN(16))
model.summary()

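<b>Stacking several recurrent layers in sequence is sometimes useful for increasing the representational power of the network</b><br>
<b>The intermediate layers must be set to return the full output sequence</b>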

In [None]:
model = Sequential()
model.add(Embedding(10000, 32))
model.add(SimpleRNN(32, return_sequences = True))
model.add(SimpleRNN(32, return_sequences = True))
model.add(SimpleRNN(32, return_sequences = True))
model.add(SimpleRNN(32))

model.summary()
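
<b>The parameter counts that model.summary() reports for this stack can be checked by hand. The sketch below assumes the default SimpleRNN configuration with a bias term; the helper names and layer labels are hypothetical.</b><br>

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry
    return vocab_size * embed_dim

def simple_rnn_params(input_dim, units):
    # Input kernel (input_dim x units) + recurrent kernel (units x units) + bias (units)
    return units * (input_dim + units + 1)

layers = [
    ("embedding", embedding_params(10000, 32)),
    ("simple_rnn_1", simple_rnn_params(32, 32)),
    ("simple_rnn_2", simple_rnn_params(32, 32)),
    ("simple_rnn_3", simple_rnn_params(32, 32)),
    ("simple_rnn_4", simple_rnn_params(32, 32)),
]

for name, count in layers:
    print(name, count)

total = sum(count for _, count in layers)
print("total", total)
```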

<h2>Applying it to the IMDB data</h2>

In [23]:
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

In [24]:
num_words = 10000
max_len = 500
batch_size = 32

(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words = num_words)
print(len(input_train))
print(len(input_test))

input_train = sequence.pad_sequences(input_train, maxlen = max_len)
input_test = sequence.pad_sequences(input_test, maxlen = max_len)
print(input_train.shape)
print(input_test.shape)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz




25000
25000
(25000, 500)
(25000, 500)
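
<b>With its default settings, pad_sequences pads short sequences with zeros at the front and, for sequences longer than maxlen, keeps the last maxlen items. The NumPy function below is a hypothetical re-implementation of that default behavior on toy data, not the Keras source.</b><br>

```python
import numpy as np

def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and keep the last `maxlen` items of long ones."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for i, seq in enumerate(sequences):
        if len(seq) == 0:
            continue                      # nothing to copy, row stays all padding
        trimmed = seq[-maxlen:]           # drop items from the front if too long
        out[i, -len(trimmed):] = trimmed  # right-align, padding on the left
    return out

toy = [[1, 2, 3], [4, 5, 6, 7, 8, 9]]
padded = pad_pre(toy, maxlen=4)
print(padded)
```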


<b>Building the model</b>

In [25]:
from tensorflow.keras.layers import Dense

In [None]:
model = Sequential()

model.add(Embedding(num_words, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation = 'sigmoid'))

model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['acc'])

model.summary()
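
<b>The final Dense layer squashes a single value into a probability with a sigmoid, and training minimizes binary cross-entropy against the 0/1 labels. A minimal NumPy sketch of both, using made-up logits and labels purely for illustration:</b><br>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0), then average the per-sample losses
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    per_sample = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(per_sample))

logits = np.array([2.0, -1.0, 0.0])   # made-up pre-activation values
labels = np.array([1.0, 0.0, 1.0])    # made-up ground-truth labels

probs = sigmoid(logits)
loss = binary_crossentropy(labels, probs)
print(probs, loss)
```

<b>A confident correct prediction contributes a small loss, and a confident wrong one a large loss.</b><br>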

<b>Training the model</b>

In [None]:
history = model.fit(input_train, y_train,
                   epochs = 10,
                   batch_size = 128,
                   validation_split = 0.2)
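
<b>In Keras, validation_split holds out the last fraction of the provided samples, taken before any shuffling. The function below is a hypothetical NumPy illustration of that split on toy arrays.</b><br>

```python
import numpy as np

def split_last_fraction(x, y, validation_split):
    """Hold out the trailing fraction of the samples as validation data."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

x = np.arange(10).reshape(10, 1)   # toy inputs: sample i holds the value i
y = np.arange(10) % 2              # toy labels

(x_tr, y_tr), (x_val, y_val) = split_last_fraction(x, y, 0.2)
print(len(x_tr), len(x_val), x_val.ravel())
```

<b>If the data is ordered (for example, by label), shuffling it before calling fit avoids a validation set drawn from only one part of the data.</b><br>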

In [None]:
import matplotlib.pyplot as plt
# Note: Matplotlib 3.6+ renames this style to 'seaborn-v0_8-white'
plt.style.use('seaborn-white')

In [None]:
loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['acc']
val_acc = history.history['val_acc']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'training loss')
plt.plot(epochs, val_loss, 'r:', label = 'validation loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'training accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'validation accuracy')
plt.grid()
plt.legend()

plt.show()

<b>Evaluating the model</b>

In [None]:
model.evaluate(input_test, y_test)

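<b>Performance is low because only 500 tokens of each review are fed in, not the full sequence</b><br>
<b>SimpleRNN is ill-suited to long sequences and too simple for practical use -> layers such as LSTM and GRU were introduced to address this</b><br>
<b>The underlying issue is the vanishing gradient problem</b><br>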
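
<b>The vanishing gradient can be seen in a scalar toy RNN. The sketch below assumes the recurrence h_t = tanh(w * h_{t-1} + x_t) with a hypothetical weight w = 0.9 and random inputs, and tracks how strongly the final state depends on the initial state as the sequence gets longer.</b><br>

```python
import numpy as np

def gradient_wrt_initial_state(w, inputs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x_t in inputs:
        h = np.tanh(w * h + x_t)
        grad *= w * (1.0 - h ** 2)   # chain rule: tanh'(a) = 1 - tanh(a)^2
    return grad

rng = np.random.default_rng(0)
inputs = rng.normal(size=200)

for steps in (10, 50, 200):
    print(steps, abs(gradient_wrt_initial_state(0.9, inputs[:steps])))
```

<b>Each step multiplies the gradient by a factor smaller than 1 in magnitude, so the influence of early timesteps shrinks roughly geometrically with sequence length.</b><br>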

<h2>LSTM (Long Short-Term Memory)</h2>
<b>Long short-term memory algorithm</b><br>
<b>Stores information for later use, which keeps older signals from gradually vanishing</b><br>
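
<b>A single LSTM step can be sketched in NumPy using the common textbook formulation (input, forget, and output gates plus a candidate cell value). The weight layout and the hand-set biases below are assumptions for illustration, not the Keras internals: the biases keep the forget gate open and the input gate closed, so the cell state is carried across 100 steps almost unchanged.</b><br>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4*units, input_dim), U: (4*units, units), b: (4*units,)."""
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])    # input gate
    f = sigmoid(z[1 * units:2 * units])    # forget gate
    g = np.tanh(z[2 * units:3 * units])    # candidate cell values
    o = sigmoid(z[3 * units:4 * units])    # output gate
    c_t = f * c_prev + i * g               # keep part of the old cell, add new info
    h_t = o * np.tanh(c_t)                 # exposed hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
input_dim, units = 32, 64
W = rng.normal(scale=0.1, size=(4 * units, input_dim))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)
b[0 * units:1 * units] = -30.0   # hand-set: input gate almost closed
b[1 * units:2 * units] = 30.0    # hand-set: forget gate almost fully open

c0 = rng.normal(size=units)
h, c = np.zeros(units), c0.copy()
for x_t in rng.random((100, input_dim)):
    h, c = lstm_step(x_t, h, c, W, U, b)

print(h.shape, c.shape, np.max(np.abs(c - c0)))
```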

In [None]:
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

In [None]:
num_words = 10000
max_len = 500

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words = num_words)
print(len(x_train))
print(len(x_test))

pad_x_train = pad_sequences(x_train, maxlen = max_len)
pad_x_test = pad_sequences(x_test, maxlen = max_len)
print(pad_x_train.shape)
print(pad_x_test.shape)

<b>Building the model</b>

In [30]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, GRU, Embedding

In [None]:
model = Sequential()

model.add(Embedding(num_words, 32))
model.add(LSTM(32))
model.add(Dense(1, activation = 'sigmoid'))

model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['acc'])

model.summary()

<b>Training the model</b>

In [None]:
history = model.fit(pad_x_train, y_train,
                   epochs = 10,
                   batch_size = 128,
                   validation_split = 0.2)

<b>Visualization</b>

In [None]:
loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['acc']
val_acc = history.history['val_acc']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'training loss')
plt.plot(epochs, val_loss, 'r:', label = 'validation loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'training accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'validation accuracy')
plt.grid()
plt.legend()

plt.show()

<b>Evaluating the model</b>

In [None]:
model.evaluate(pad_x_test, y_test)

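<h2>GRU (Gated Recurrent Unit)</h2>
<b>A simplified version of the LSTM</b><br>
<b>Has no memory cell, only a hidden state that propagates along the time axis</b><br>

<b>reset gate - the value r decides how much of the past hidden state to ignore</b><br>
<b>update gate - the gate that updates the hidden state (plays the role of the LSTM's forget and input gates)</b><br>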
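
<b>One GRU step can be sketched in NumPy as follows. The parameter names (Wz, Uz, bz, and so on) are hypothetical, and the update uses the convention h_t = (1 - z) * h_prev + z * h_cand; some references swap the roles of z and 1 - z.</b><br>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One GRU step; `p` is a dict of hypothetical weight names (Wz, Uz, bz, ...)."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])   # update gate
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])   # reset gate
    # The reset gate scales how much of the past state enters the candidate
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])
    # The update gate blends the old state with the candidate
    return (1.0 - z) * h_prev + z * h_cand

def init_params(input_dim, units, rng):
    p = {}
    for gate in "zrh":
        p["W" + gate] = rng.normal(scale=0.1, size=(units, input_dim))
        p["U" + gate] = rng.normal(scale=0.1, size=(units, units))
        p["b" + gate] = np.zeros(units)
    return p

rng = np.random.default_rng(0)
params = init_params(input_dim=32, units=64, rng=rng)

h = np.zeros(64)
for x_t in rng.random((100, 32)):
    h = gru_step(x_t, h, params)

print(h.shape)
```

<b>Unlike the LSTM, there is no separate cell state: the single vector h is both the memory and the output.</b><br>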

<h2>Reuters data</h2>
<b>A text dataset similar to IMDB, made up of 46 mutually exclusive topics; a multi-class classification problem</b><br>

In [None]:
from tensorflow.keras.datasets import reuters

In [None]:
num_words = 10000

(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words = num_words)
print(x_train.shape)
print(y_train.shape)

print(x_test.shape)
print(y_test.shape)

<b>Data preprocessing and inspection</b>

In [None]:
from tensorflow.keras.preprocessing.sequence import pad_sequences

In [None]:
max_len = 300

In [None]:
pad_x_train = pad_sequences(x_train, maxlen = max_len)
pad_x_test = pad_sequences(x_test, maxlen = max_len)

print(len(pad_x_train[0]))

In [None]:
pad_x_train[5]

<b>Building the model</b><br>
<b>Like SimpleRNN, LSTM layers also accept the return_sequences argument</b>

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, GRU, Embedding

In [None]:
model = Sequential()
model.add(Embedding(input_dim = num_words, output_dim = 256))
model.add(GRU(256, return_sequences = True))
model.add(GRU(128))
model.add(Dense(46, activation = 'softmax'))

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['acc'])
model.summary()
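
<b>For the 46-way output, softmax turns the final layer's values into class probabilities, and the sparse categorical cross-entropy takes integer class labels directly instead of one-hot vectors. A NumPy sketch with random logits and made-up labels, for illustration only:</b><br>

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    # Pick the predicted probability of the true class for each sample
    picked = probs[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(np.clip(picked, eps, 1.0))))

rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 46))         # 5 samples, 46 topics
labels = np.array([3, 0, 45, 12, 7])      # made-up integer class labels

probs = softmax(logits)
loss = sparse_categorical_crossentropy(labels, probs)

# A model that guesses uniformly over 46 classes scores log(46)
uniform_loss = sparse_categorical_crossentropy(labels, np.full((5, 46), 1 / 46))
print(loss, uniform_loss)
```

<b>The uniform-guess value is a useful baseline: a training loss that stays near it suggests the model has not learned anything yet.</b><br>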

<b>Training the model</b><br>

In [None]:
history = model.fit(pad_x_train, y_train,
                   epochs = 20,
                   batch_size = 32,
                   validation_split = 0.2)

<b>Visualization</b>

In [None]:
loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['acc']
val_acc = history.history['val_acc']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'training loss')
plt.plot(epochs, val_loss, 'r:', label = 'validation loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'training accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'validation accuracy')
plt.grid()
plt.legend()

plt.show()

<b>Evaluating the model</b>

In [None]:
model.evaluate(pad_x_test, y_test)