# **RNN Principles and Algorithm**


<br>
  <a href="https://colab.research.google.com/drive/15JgbvjFS1YT2jiFk_UoHgzE51MToF31_#scrollTo=e_J8uom_h2M0"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
<br>






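## 1. A brief overview of how RNNs work

* Recurrent neural networks (RNNs) are a class of neural networks suited to learning representations of sequential data, such as text in natural language processing (NLP).

* The idea behind an RNN is to make use of sequential information. RNNs are called "recurrent" because they perform the same task for every element of a sequence, with each output depending on the previous computations.

* Another way to think about RNNs is that they have a "memory" that captures what has been computed so far. In theory an RNN can use information from arbitrarily long sequences, but in practice a classic RNN can only look back a few steps, largely because gradients vanish or explode when training on long sequences.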
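
The recurrence described in the bullets above can be written as `h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h)`: the same weights are applied at every step, and the hidden state `h` plays the role of the memory. Below is a minimal NumPy sketch of that forward pass. It is an illustration only, not the Keras implementation; the function name `rnn_forward`, the weight names, and all sizes are hypothetical.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over one sequence and return every hidden state."""
    hidden_size = W_hh.shape[0]
    h = np.zeros(hidden_size)      # h_0: the "memory" starts empty
    states = []
    for x_t in inputs:             # the same weights are reused at every step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

# Hypothetical sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
seq_len, input_size, hidden_size = 5, 3, 4
W_xh = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)
inputs = rng.normal(size=(seq_len, input_size))

states = rnn_forward(inputs, W_xh, W_hh, b_h)
print(states.shape)  # (5, 4): one hidden state per time step
```

Each row of `states` depends on all earlier inputs through `h`, which is what "the output depends on the previous computations" means in practice.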

### 1.1. An RNN implementation in TensorFlow

In [1]:
import numpy as np # linear algebra routines
import pandas as pd # data processing utilities
import requests
url = 'https://github.com/markl-a/ML-demos/raw/main/3.RNNs/wonderland.txt'  # note: this must be the 'raw' link
response = requests.get(url)

# Make sure the request succeeded before writing the file
if response.status_code == 200:
    with open('wonderland.txt', 'wb') as f:
        f.write(response.content)
else:
    print('Failed to download the file.')

In [2]:
# Load the Keras classes used below (all from tensorflow.keras for consistency)
from tensorflow.keras.layers import SimpleRNN, Dense, Activation
from tensorflow.keras.models import Sequential

In [3]:
# Data preparation
RawData = "wonderland.txt"
# Read the file as bytes, clean each non-empty line, and join the lines into one string
print("Reading the input file and converting it to a character stream...")
with open(RawData, 'rb') as StreamData:
    SplitDataset = [
        line.strip().lower().decode("ascii", "ignore")
        for line in StreamData
        if len(line.strip()) > 0
    ]
text = " ".join(SplitDataset)

Reading the input file and converting it to a character stream...


In [4]:
# Build the character-to-index and index-to-character mappings
# (sorted so the indices are the same on every run)
charSet = sorted(set(text))
charToIndex = {c: i for i, c in enumerate(charSet)}
indexToChar = {i: c for i, c in enumerate(charSet)}

# Window length and stride
print("Building input vectors and text labels")
seqLen, step = 10, 1

# Each input is a window of seqLen characters; its label is the character that follows
inputChars = [text[i:i + seqLen] for i in range(0, len(text) - seqLen, step)]
labelChars = [text[i + seqLen] for i in range(0, len(text) - seqLen, step)]

# Allocate X and y and fill them with one-hot encodings
numChars = len(charSet)
X = np.zeros((len(inputChars), seqLen, numChars), dtype=bool)
y = np.zeros((len(inputChars), numChars), dtype=bool)

for i, inputChar in enumerate(inputChars):
    for j, ch in enumerate(inputChar):
        X[i, j, charToIndex[ch]] = 1
    y[i, charToIndex[labelChars[i]]] = 1

Building input vectors and text labels
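
To make the windowing and one-hot encoding in the cell above concrete, here is a small sketch that applies the same steps to a toy string. The string `toy_text` and the window length of 4 are hypothetical stand-ins, chosen only so the result is easy to inspect.

```python
import numpy as np

toy_text = "hello world"   # hypothetical stand-in for the full corpus
seq_len, step = 4, 1       # shorter window than the notebook's seqLen of 10

chars = sorted(set(toy_text))
char_to_index = {c: i for i, c in enumerate(chars)}

# Each window of seq_len characters is an input; the next character is its label.
windows = [toy_text[i:i + seq_len] for i in range(0, len(toy_text) - seq_len, step)]
labels = [toy_text[i + seq_len] for i in range(0, len(toy_text) - seq_len, step)]

# One-hot encode: exactly one True per character position.
X = np.zeros((len(windows), seq_len, len(chars)), dtype=bool)
y = np.zeros((len(windows), len(chars)), dtype=bool)
for i, window in enumerate(windows):
    for j, ch in enumerate(window):
        X[i, j, char_to_index[ch]] = True
    y[i, char_to_index[labels[i]]] = True

print(windows[0], "->", labels[0])  # hell -> o
print(X.shape, y.shape)             # (7, 4, 8) (7, 8)
```

The three axes of `X` are (number of windows, window length, vocabulary size), which matches the `input_shape=(seqLen, numChars)` that the model expects for each sample.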


In [5]:
# Hyperparameters
hiddenSize, batchSize = 128, 128
numIterations, numEpochsPerIteration, numPredsPerEpoch = 25, 1, 100

# Build the model: one SimpleRNN layer followed by a softmax over the characters
model = Sequential([
    SimpleRNN(hiddenSize, return_sequences=False, input_shape=(seqLen, numChars), unroll=True),
    Dense(numChars),
    Activation("softmax")
])

# Compile the model
model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
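
As a rough sanity check on the size of this model, the number of trainable weights can be worked out by hand. The sketch below assumes the standard layout of a simple recurrent layer (input-to-hidden weights, hidden-to-hidden weights, and a bias) and of a dense layer. The helper names are hypothetical, and `vocab_size = 50` is an assumed value, since the real `numChars` depends on the corpus.

```python
def simple_rnn_params(input_dim, units):
    # input-to-hidden weights + hidden-to-hidden weights + bias
    return input_dim * units + units * units + units

def dense_params(input_dim, units):
    # weight matrix + bias
    return input_dim * units + units

hidden_size = 128   # same value as hiddenSize in the cell above
vocab_size = 50     # assumption: the real numChars depends on the text

rnn = simple_rnn_params(vocab_size, hidden_size)
dense = dense_params(hidden_size, vocab_size)
print(rnn, dense, rnn + dense)  # 22912 6450 29362
```

Note that the sequence length does not appear in either formula: the recurrent layer reuses the same weights at every time step, so a longer window adds computation but no parameters.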

In [6]:
# Train the model in rounds and generate sample text after each round
for iteration in range(numIterations):
    print("=" * 50)  # separator line
    print("Iteration #: %d" % (iteration))  # current round
    # Train for numEpochsPerIteration epochs with mini-batches of size batchSize
    model.fit(X, y, batch_size=batchSize, epochs=numEpochsPerIteration)

    # Test the model:
    # pick a random seed sequence from inputChars, then generate the next 100 characters
    testIdx = np.random.randint(len(inputChars))  # random index
    testChars = inputChars[testIdx]  # seed sequence at that index
    print("Generating from seed: %s" % (testChars))
    print(testChars, end="")  # print the seed without a newline
    # Generate one character per prediction step
    for i in range(numPredsPerEpoch):
        # One-hot encode the current window as a batch of size 1: (1, seqLen, numChars)
        Xtest = np.zeros((1, seqLen, numChars))
        for j, ch in enumerate(testChars):
            Xtest[0, j, charToIndex[ch]] = 1
        # Predict the distribution over the next character
        pred = model.predict(Xtest, verbose=0)[0]
        # Greedy decoding: take the most likely character
        ypred = indexToChar[np.argmax(pred)]
        print(ypred, end="")
        # Slide the window forward by one character for the next prediction
        testChars = testChars[1:] + ypred
    print()  # newline before the next round

Iteration #: 0
Generating from seed: y took the
y took the wast the sher the said the said the said the said the said the said the said the said the said the 
Iteration #: 1
Generating from seed: any rate a
any rate and the sald the tore the routhe ther she her the sald the tore the routhe ther she her the sald the 
Iteration #: 2
Generating from seed: n, and she
n, and she could to the dore to the dore to the dore to the dore to the dore to the dore to the dore to the do
Iteration #: 3
Generating from seed: e alice wa
e alice was a lang the could the doon the labbet in a moute the forme the forme the forme the forme the forme 
Iteration #: 4
Generating from seed: ake us up 
ake us up and she had alice and alice and alice and alice and alice and alice and alice and alice and alice an
Iteration #: 5
Generating from seed: you myself
you myself the grown the grown the grown the grown the grown the grown the grown the grown the grown the grown
Iteration #: 6
Generating from seed: n and how

## 2. References

1. [Comprehensive Guide to RNN with Keras](https://www.kaggle.com/code/prashant111/comprehensive-guide-to-rnn-with-keras)

2. [A guide on Recurrent Neural Networks: Character-level Text Generator](https://edumunozsala.github.io/BlogEms/fastpages/jupyter/rnn/lstm/pytorch/2020/09/03/char-level-text-generator-pytorch.html)



