# Machine Learning and Artificial Neural Networks, Chapter 10: RNN (LSTM) Practice

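An example that uses an RNN (LSTM) to predict text data.

Many similar examples of predicting "hihello" can be found online; this one is taken from reference link 1 below, and reference link 2 gives additional explanation.

Reference links:  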
1.   https://github.com/hunkim/DeepLearningZeroToAll/blob/master/keras/klab-12-1-rnn_hello_char.py
2.   https://docs.google.com/presentation/d/1UpZVnOvouIbXd0MAFBltSra5rRpsiJ-UyBUKGCrfYoo/edit



(1) Packages used

In [1]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, Activation, LSTM
from keras.utils import to_categorical
import matplotlib.pyplot as plt

Using TensorFlow backend.


(2) Create the data and set the input/output formats

In [0]:
# A longer sentence to experiment with.
sentence = ("if you want to build a ship, don't drum up people together to "
            "collect wood and don't assign them tasks and work, but rather "
            "teach them to long for the endless immensity of the sea.")

# The short sentence below overrides the one above for a quick demo.
sentence = "hi hello"

char_set = sorted(set(sentence))  # sorted so the character order is reproducible across runs
char_dic = {w: i for i, w in enumerate(char_set)}

input_dim = len(char_set)
hidden_size = 7 #len(char_set)  # output dimension of RNN(LSTM) cell, can be any arbitrary integer
num_classes = len(char_set)  # output dimension of the model
seq_length = 5  # Number of time steps, Any arbitrary integer
sample_size = len(sentence) - seq_length

print(char_set)
print(char_dic)

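(3) Data preprocessing for the RNN (LSTM)

1. Arrange the string into a Toeplitz-like matrix of sliding windows (each row is the previous row shifted by one character) to prepare the input data
2. In doing so, use char_dic to convert the characters to integers
3. Finally, convert the integers to binary vectors (one-hot encoding)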
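
The three steps above can be sketched with NumPy alone, which makes the intermediate shapes easy to inspect. This is a minimal illustration rather than the notebook's own cell: it assumes a sorted character set so the indices are reproducible, and it one-hot encodes by indexing an identity matrix instead of calling `to_categorical`.

```python
import numpy as np

sentence = "hi hello"
seq_length = 5

# Assumption for this sketch: sort the characters so the mapping is deterministic.
char_set = sorted(set(sentence))
char_dic = {c: i for i, c in enumerate(char_set)}

# Steps 1 and 2: encode the sentence as integers, then take sliding windows.
indices = np.array([char_dic[c] for c in sentence])
sample_size = len(sentence) - seq_length
X_seqs = np.stack([indices[i:i + seq_length] for i in range(sample_size)])
Y_seqs = np.stack([indices[i + 1:i + seq_length + 1] for i in range(sample_size)])

# Step 3: one-hot encode by selecting rows of an identity matrix.
eye = np.eye(len(char_set), dtype=np.float32)
x_onehot = eye[X_seqs]
y_onehot = eye[Y_seqs]

print(char_set)        # [' ', 'e', 'h', 'i', 'l', 'o']
print(X_seqs)          # rows: [2 3 0 2 1], [3 0 2 1 4], [0 2 1 4 4]
print(x_onehot.shape)  # (3, 5, 6)
```

Each target row is the input row shifted one character to the right, so the model learns to predict the next character at every time step.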




In [0]:
X_seqs = np.zeros([sample_size,seq_length])
Y_seqs = np.zeros([sample_size,seq_length])
for i in range(0, sample_size):
   x_str = sentence[i:i + seq_length]
   y_str = sentence[i + 1: i + seq_length + 1]
   print(i, 'th sample: Input -> Target : ', x_str, '->', y_str)

   x = [char_dic[c] for c in x_str]  # x str to index
   y = [char_dic[c] for c in y_str]  # y str to index

   X_seqs[i,] = x
   Y_seqs[i,] = y
    
print('(sample_size, seq_length) = ', X_seqs.shape)
print('Input data converted to integers:')
print(X_seqs)
print(Y_seqs)

# One-hot encoding
x_onehot = to_categorical(X_seqs, num_classes=num_classes)
y_onehot = to_categorical(Y_seqs, num_classes=num_classes)

print('(sample_size, seq_length, input_dim) = ', x_onehot.shape)
print(x_onehot)
print(y_onehot)

(4) Build the model

In [0]:
model = Sequential()
model.add(LSTM(hidden_size, input_shape=(seq_length, input_dim), return_sequences=True))
model.add(TimeDistributed(Dense(num_classes, activation='softmax')))
model.summary()
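# Optionally save the model graph as a PNG.
# (This fails in an interactive Python shell, where __file__ is undefined;
# it also needs `import os` and `from keras.utils import plot_model`.)
# plot_model(model, to_file=os.path.basename(__file__) + '.png', show_shapes=True)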

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop', metrics=['accuracy'])
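
To sanity-check the parameter counts that `model.summary()` reports, the sizes can be worked out by hand. The sketch below assumes Keras's default LSTM configuration (four gates, bias enabled) and the "hi hello" sentence, which has 6 distinct characters; the helper names `lstm_param_count` and `dense_param_count` are made up for this illustration.

```python
def lstm_param_count(input_dim, hidden_size):
    # Four gates (input, forget, cell candidate, output), each with an
    # input kernel, a recurrent kernel and a bias vector.
    return 4 * (hidden_size * (input_dim + hidden_size) + hidden_size)


def dense_param_count(in_features, out_features):
    # Weight matrix plus bias; TimeDistributed shares these across time steps.
    return in_features * out_features + out_features


input_dim = 6      # distinct characters in "hi hello"
hidden_size = 7
num_classes = 6

lstm_params = lstm_param_count(input_dim, hidden_size)
dense_params = dense_param_count(hidden_size, num_classes)
total_params = lstm_params + dense_params
print(lstm_params, dense_params, total_params)  # 392 48 440
```

Note that `seq_length` does not appear in either formula: the same weights are reused at every time step.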

(5) Training (model fitting)

In [0]:
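# Note: with NumPy inputs, steps_per_epoch may be handled differently across
# Keras versions; if training stops early, drop it and raise epochs instead.
history = model.fit(x_onehot, y_onehot, epochs=5, steps_per_epoch=100)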

plt.plot(history.history["loss"])
plt.title("Loss")
plt.show()

(6) Check whether training worked

In [0]:
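def get_idx_n_str(onehot):
    """Return (indices, characters) for a one-hot vector or a sequence of them."""
    index = np.argmax(onehot, axis=-1)  # arg-max over the class axis
    if np.ndim(index) == 0:
        # Single one-hot vector: one index, one character.
        chars = char_set[int(index)]
    else:
        # Sequence of one-hot vectors: one character per time step.
        chars = [char_set[j] for j in index]
    return index, chars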
  

predictions = model.predict(x_onehot, verbose=0)
for i, prediction in enumerate(predictions):
    # Avoid naming a variable `str`, which would shadow the built-in type.
    index, chars = get_idx_n_str(x_onehot[i])
    print(i, 'th Input:     ', index, ' -> ', ''.join(chars))

    index, chars = get_idx_n_str(y_onehot[i])
    print(i, 'th Target:    ', index, ' -> ', ''.join(chars))

    index, chars = get_idx_n_str(prediction)
    print(i, 'th Predicted: ', index, ' -> ', ''.join(chars))
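
Besides reading the printed strings, the check can be summarized as a single number. The following is a minimal sketch of a per-character accuracy; `char_accuracy` is a hypothetical helper, and the small arrays stand in for `y_onehot` and the output of `model.predict`.

```python
import numpy as np


def char_accuracy(y_true_onehot, y_pred_probs):
    """Fraction of time steps whose arg-max prediction matches the target."""
    true_idx = np.argmax(y_true_onehot, axis=-1)
    pred_idx = np.argmax(y_pred_probs, axis=-1)
    return float(np.mean(true_idx == pred_idx))


# Toy stand-in data: 1 sample, 3 time steps, 4 classes.
y_true = np.eye(4)[np.array([[0, 2, 3]])]
y_pred = np.array([[[0.7, 0.1, 0.1, 0.1],    # correct: class 0
                    [0.2, 0.5, 0.2, 0.1],    # wrong: predicts 1, target is 2
                    [0.1, 0.1, 0.2, 0.6]]])  # correct: class 3

acc = char_accuracy(y_true, y_pred)
print(acc)  # two of the three time steps are correct
```

With the notebook's variables, the equivalent call would be `char_accuracy(y_onehot, predictions)`.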