

<p align="center">
  <img src="https://drive.google.com/uc?id=11XY5LPAsz9zuu7RxqlNcDbkAsIgsGveo" width=700
  />
  <center>Conceptual diagram of a stacked RNN model</center>
</p>

## 1: Switch TensorFlow to version 2.x

In [None]:
%tensorflow_version 2.x

## 2: Load packages and datasets

In [None]:
%matplotlib inline
# Import some useful packages
import matplotlib.pyplot as plt
import numpy as np
from ipywidgets import interact, Text

# For DNN
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# For RNN
from tensorflow.keras.layers import SimpleRNN, LSTM, GRU

# For training
from tensorflow.keras.optimizers import SGD, Adam, RMSprop

# For data preprocessing
from tensorflow.keras import datasets
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.sequence import pad_sequences

## 3: Important parameters of SimpleRNN/LSTM/GRU

### Whether to output all hidden states

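By default, a recurrent layer outputs only the hidden state at the last time step, $h_T$. To output the state at every time step, $\{h_1,\cdots, h_T\}$, set ``return_sequences=True``.

Assume $T=5$, each $x_t$ has dimension 3, and the layer uses 4 RNN units.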

In [None]:
timesteps, input_dim = 5, 3
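To make the difference between the two output modes concrete, here is a minimal NumPy sketch of a tanh RNN forward pass. It is an illustration under stated assumptions, not Keras's implementation: `simple_rnn_forward`, the weight names, and the random weights are all hypothetical.

```python
import numpy as np

def simple_rnn_forward(x, W_x, W_h, b, return_sequences=False):
    """Run a tanh RNN over x of shape (timesteps, input_dim)."""
    h = np.zeros(W_h.shape[0])      # initial hidden state h_0
    states = []
    for x_t in x:                   # one step per time point
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    # Either every h_t stacked along time, or only the last one
    return np.stack(states) if return_sequences else h

rng = np.random.default_rng(0)
timesteps, input_dim, units = 5, 3, 4
x = rng.normal(size=(timesteps, input_dim))
W_x = rng.normal(size=(input_dim, units))   # input-to-hidden weights
W_h = rng.normal(size=(units, units))       # hidden-to-hidden weights
b = np.zeros(units)

h_last = simple_rnn_forward(x, W_x, W_h, b)
h_all = simple_rnn_forward(x, W_x, W_h, b, return_sequences=True)
print(h_last.shape)  # (4,)
print(h_all.shape)   # (5, 4)
```

The last row of `h_all` equals `h_last`, so the two modes differ only in how much of the state history is returned.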

#### SimpleRNN Case

In [None]:
# Output only h_T
model_T = Sequential()
model_T.add(SimpleRNN(4, input_shape=(timesteps, input_dim)))

# Output hidden state for the last time step, output dimension 4
model_T.summary()

Model: "sequential_18"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_12 (SimpleRNN)    (None, 4)                 32        
Total params: 32
Trainable params: 32
Non-trainable params: 0
_________________________________________________________________


In [None]:
model_all = Sequential()
model_all.add(SimpleRNN(4, input_shape=(timesteps, input_dim), return_sequences=True))

# Output hidden state at each time step, output dimension 5x4
model_all.summary()

#### LSTM Case

In [None]:
model_LSTM = Sequential()
# model_LSTM.add(LSTM(4, input_shape=(timesteps, input_dim), return_sequences=False)) # Default return_sequences setting
model_LSTM.add(LSTM(4, input_shape=(timesteps, input_dim), return_sequences=True))

# Output hidden state at each time step, output dimension 5x4
model_LSTM.summary()

#### GRU Case

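In Keras, the GRU layer comes in two variants that differ in whether the reset gate is applied before or after the recurrent matrix multiplication. The ``reset_after`` parameter selects the variant. The two variants have different numbers of bias parameters, and only ``reset_after=True`` (the default in TensorFlow 2.x) is compatible with the cuDNN GPU kernel.

Setting ``reset_after=False`` selects the original formulation instead of the default cuDNN-compatible GRU.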

In [None]:
model_GRU = Sequential()
# model_GRU.add(GRU(4, input_shape=(timesteps, input_dim), return_sequences=False, reset_after=False)) # Default return_sequences setting
model_GRU.add(GRU(4, input_shape=(timesteps, input_dim), return_sequences=True, reset_after=False))

# Output hidden state at each time step, output dimension 5x4
model_GRU.summary()
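The parameter counts that `summary()` reports can be checked by hand. The sketch below counts weights from the layer shapes (input kernel, recurrent kernel, bias); the helper names are hypothetical, and the formulas assume the default Keras settings with biases enabled.

```python
def simple_rnn_params(units, input_dim):
    # input kernel + recurrent kernel + one bias vector
    return units * (input_dim + units + 1)

def lstm_params(units, input_dim):
    # three gates plus the cell candidate, each with its own weight set
    return 4 * units * (input_dim + units + 1)

def gru_params(units, input_dim, reset_after=True):
    # two gates plus the candidate state; reset_after=True keeps
    # separate input and recurrent bias vectors
    n_bias = 2 if reset_after else 1
    return 3 * units * (input_dim + units + n_bias)

print(lstm_params(4, 3))                     # 128
print(gru_params(4, 3, reset_after=False))   # 96
print(gru_params(4, 3, reset_after=True))    # 108
print(lstm_params(20, 1))                    # 1760
```

The GRU counts differ by `3 * units`, which is exactly the extra bias vector introduced by `reset_after=True`.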

## 4: Stacking RNN/LSTM/GRU layers with ``return_sequences``

In [None]:
timesteps, input_dim = 5, 3

In [None]:
model_stack = Sequential()
model_stack.add(SimpleRNN(4, input_shape=(timesteps, input_dim), return_sequences=True))
model_stack.add(LSTM(20))  # input shape is inferred from the previous layer

model_stack.summary()
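Stacking works because a recurrent layer consumes a whole sequence of shape (timesteps, features), so every recurrent layer except the last must return its full sequence of hidden states. The sketch below traces per-sample output shapes through such a stack. `trace_shapes` is a hypothetical helper written for illustration, not a Keras function.

```python
def trace_shapes(input_shape, layers):
    """Trace output shapes; layers is a list of (units, return_sequences)."""
    shape = input_shape
    shapes = []
    for units, return_sequences in layers:
        if len(shape) != 2:
            # A recurrent layer cannot consume a single vector
            raise ValueError(f"expected (timesteps, features), got {shape}")
        timesteps, _ = shape
        shape = (timesteps, units) if return_sequences else (units,)
        shapes.append(shape)
    return shapes

# SimpleRNN(4, return_sequences=True) followed by LSTM(20)
print(trace_shapes((5, 3), [(4, True), (20, False)]))  # [(5, 4), (20,)]

# Without return_sequences=True, the second layer gets a single vector
try:
    trace_shapes((5, 3), [(4, False), (20, False)])
except ValueError as err:
    print("stacking failed:", err)
```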

In [None]:
model_classify = Sequential()
model_classify.add(SimpleRNN(4, input_shape=(timesteps, input_dim), return_sequences=True))
# Only the first layer needs input_shape; later layers infer it
model_classify.add(LSTM(20))

model_classify.summary()

## 5: Building an RNN model that performs addition

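We want to build an RNN model that acts as an adder.

First, we randomly generate 60000 samples. Each sample contains 3 to 9 numbers, and each number is an integer from 0 to 99 (the upper bound of ``np.random.randint`` is exclusive).

The first 50000 samples are used as training data and the remaining 10000 as test data.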

In [None]:
X = []
y = []
X_min, X_max = 0, 100
data_size = 60000
for _ in range(data_size):
    random_length = np.random.randint(3, 10)
    X_i = np.random.randint(X_min, X_max, size=random_length)
    X.append(X_i)
    y.append(sum(X_i))
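The ranges actually produced by the generation loop can be checked empirically, since `np.random.randint(low, high)` draws from the half-open interval `[low, high)`. This is a quick sanity check with an arbitrary seed and sample size, not part of the training pipeline.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the check repeatable
lengths = np.random.randint(3, 10, size=100_000)
values = np.random.randint(0, 100, size=100_000)

print(lengths.min(), lengths.max())  # 3 9
print(values.min(), values.max())    # 0 99
```

To include 10-number sequences and the value 100, the calls would need upper bounds of 11 and 101.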

In [None]:
print(X[0], y[0])
print(X[1], y[1])

In [None]:
# Pad variable length sequences to a fixed length
X_train = pad_sequences(X[:50000], maxlen=10, padding='post')
X_test = pad_sequences(X[50000:], maxlen=10, padding='post')

y_train = X_train.sum(axis=1)
y_test = X_test.sum(axis=1)
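Post-padding with zeros fills the tail of each sequence up to `maxlen`, which is why the row sums of the padded array still equal the original sums. The NumPy sketch below imitates this behaviour; `pad_post` is a hypothetical helper for illustration and is not the Keras `pad_sequences` implementation.

```python
import numpy as np

def pad_post(sequences, maxlen, value=0):
    """Right-pad each sequence with `value` up to length maxlen."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for i, seq in enumerate(sequences):
        # Keep the last maxlen items of over-long sequences
        # (assumed to mirror the Keras default truncating='pre')
        seq = np.asarray(seq)[-maxlen:]
        out[i, :len(seq)] = seq
    return out

padded = pad_post([[1, 2, 3], [4, 5, 6, 7]], maxlen=5)
print(padded)
# [[1 2 3 0 0]
#  [4 5 6 7 0]]
print(padded.sum(axis=1))  # [ 6 22]
```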

In [None]:
# Observe some data
print(X_train[0], y_train[0])
print(X_train[1], y_train[1])

In [None]:
# Reshape to (samples, timesteps, 1) and change data type
X_train = X_train.reshape(X_train.shape+(1,))
X_test = X_test.reshape(X_test.shape+(1,))

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

In [None]:
model = Sequential()
model.add(LSTM(20, input_shape=(10, 1)))
model.add(Dense(1))
model.summary()

Model: "sequential_15"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_6 (LSTM)                (None, 20)                1760      
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 21        
Total params: 1,781
Trainable params: 1,781
Non-trainable params: 0
_________________________________________________________________


In [None]:
model.compile(loss='mae',
              optimizer=RMSprop(),
              metrics=['mse'])
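The model is trained on mean absolute error (MAE) while mean squared error (MSE) is only monitored. A small NumPy example with made-up targets and predictions shows how the two quantities differ: MSE weights the single large error much more heavily.

```python
import numpy as np

# Hypothetical targets and predictions, for illustration only
y_true = np.array([100.0, 250.0, 400.0])
y_pred = np.array([110.0, 240.0, 440.0])

errors = y_true - y_pred                # [-10.  10. -40.]
mae = np.mean(np.abs(errors))           # average absolute deviation
mse = np.mean(errors ** 2)              # average squared deviation
print(mae)  # 20.0
print(mse)  # 600.0
```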

In [None]:
training_history = model.fit(X_train, y_train,
                             batch_size=32,
                             epochs=50,
                             validation_data=(X_test, y_test)
                             )

In [None]:
def visualization(seq):
    # Parse a comma-separated string such as '10,20,30,40'
    terms = seq.split(',')
    description = '+'.join(terms)
    X = np.array(terms, dtype=float)
    y = X.sum()
    # Pad to the length used in training; keep floats instead of casting to int32
    X = pad_sequences([X], maxlen=10, padding='post', dtype='float32')
    X = X.reshape(X.shape+(1,))
    prediction = model.predict(X)[0][0]
    print("The predicted sum of %s is %f" %(description, prediction))
    print("Correct answer is %f" %y)

In [None]:
interact(visualization, seq=Text(value='10,20,30,40'));


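For another way to implement an adder, see the example in the official Keras documentation:

https://keras.io/examples/addition_rnn/

The adder in that link takes a string such as "5+234" as input and outputs the resulting number, 239.

An adder can therefore be built from different problem formulations, and the model and the data preparation can differ substantially between them.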
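To illustrate how different the data preparation becomes for a string-based adder, here is a minimal sketch of character-level one-hot encoding for an input like "5+234". The vocabulary, the `encode`/`decode` helpers, and the padding length are assumptions chosen for illustration, not code taken from the linked example.

```python
import numpy as np

chars = "0123456789+ "   # assumed vocabulary: digits, plus sign, padding space
char_to_index = {c: i for i, c in enumerate(chars)}

def encode(text, maxlen):
    """One-hot encode text, right-padded with spaces to maxlen characters."""
    text = text.ljust(maxlen)
    onehot = np.zeros((maxlen, len(chars)), dtype="float32")
    for t, c in enumerate(text):
        onehot[t, char_to_index[c]] = 1.0
    return onehot

def decode(onehot):
    """Map each one-hot row back to its character."""
    return "".join(chars[i] for i in onehot.argmax(axis=1))

x = encode("5+234", maxlen=7)
print(x.shape)             # (7, 12)
print(decode(x).strip())   # 5+234
```

Each character becomes one time step, so the model sees a sequence of symbols instead of a sequence of numeric values as in the adder above.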