<a href="https://colab.research.google.com/github/twyeh/AI-in-education/blob/main/%E7%B0%A1%E6%98%93%E8%81%8A%E5%A4%A9%E6%A9%9F%E5%99%A8%E4%BA%BA20250331.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building a Simple Chatbot That Chats About a Web Article, Using an LSTM-Based RNN


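An LSTM (long short-term memory) network can be used to build a chatbot that draws on a web article for its replies. Doing so combines natural language processing with deep learning. The key steps and their implementation are described below: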

---

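### **1. Data Preparation and Preprocessing**

**Text acquisition and cleaning**
First extract the text from the target article, then process it as follows:

- Remove special symbols, HTML tags, and other meaningless characters
- Split into sentences and tokens (NLTK for English; Chinese requires jieba word segmentation)
- Convert to lowercase and remove stop words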
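
The cleaning steps listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: `clean_text` and the tiny `STOP_WORDS` set are hypothetical names introduced here, and a real stop-word list would be much larger.

```python
import html
import re

# Assumption: a tiny illustrative stop-word list; real lists are much larger.
STOP_WORDS = {"the", "a", "an", "of", "on", "in", "and", "is", "to"}

def clean_text(raw):
    """Strip HTML tags and symbols, lowercase, and drop stop words."""
    text = html.unescape(raw)                    # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)         # remove HTML tags
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # remove punctuation and symbols
    tokens = text.lower().split()                # lowercase, split on whitespace
    return [tok for tok in tokens if tok not in STOP_WORDS]

print(clean_text("<p>The US &amp; China: a trade war?</p>"))
# ['us', 'china', 'trade', 'war']
```

The regular expression keeps only ASCII letters and digits, so it suits English text; Chinese text would need a different character filter.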

In [70]:
import jieba  # Chinese word segmentation; on English text it also returns spaces as tokens
# Triple quotes are needed because the article text spans several lines
text = """Hong Kong　CNN　 — As Donald Trump’s April 2 “Liberation Day” for announcing “reciprocal” tariffs on America’s trading partners approaches, the question in Beijing is whether this will be the moment when its nascent trade war with the US really escalates.　Mixed messages have kept Chinese officials guessing. The US president has long railed against the gaping trade deficit between the world’s two largest economies – and on the campaign trail threatened upwards of 60% duties on all Chinese goods coming into the US. That’s potentially putting Beijing prominently in line for Wednesday’s expected measures.　And yet, in recent days, Trump has suggested he could reduce tariffs on Chinese imports as part of a wider bargain on the sale of Chinese-social social media platform TikTok.　Speaking on Air Force One on Sunday, he vowed again to complete a deal on the platform ahead of a Saturday deadline. Trump has also touted a “a great relationship” with Chinese leader Xi Jinping even as his government slapped tough controls on China’s access to US tech and called for tighter investment controls.　Beijing is not alone in facing vacillations from Trump in his second term. The US leader appears to use uncertainty as a tactical weapon as he faces off against trade partners near and far.　On Monday, Asian shares followed US futures in falling because of the riskiness over what the upcoming tariffs could be. Stock markets in Japan and South Korea, whose carmakers are likely to be disproportionately affected by measures announced by Trump last week, were hit especially hard..But for China, America’s closest peer in economic size and its key geopolitical rival, the shape and tenor of their relationship can affect the whole world. One possible outcome, a de facto decoupling, would upend global supply chains and roil parts of each economy. Another could see the two reframe how they co-exist amicably in a global economy. 
“We’re at a real fork in the road,” Scott Kennedy, a senior adviser at the Center for Strategic and International Studies think tank in Washington, told reporters on the sidelines of a global business forum in Beijing last week. “We might see these negotiations and pressure result in a pulling back of these threats and a resumption of a more stable relationship, but things could get a lot worse. We could see tariffs go sky high and investment fall. That would lead to some sort of at least incremental decoupling between the two economies, and there’d be a lot of suffering,” he said. In the face of the uncertainty, Beijing’s message has been clear: The US should “return to the right track of dialogue and cooperation at an early date,” a Foreign Ministry official said earlier this month. But “if war is what the US wants, be it a tariff war, a trade war or any other type of war, we’re ready to fight till the end.” During what the US described as an “introductory meeting” between America’s top trade representative Jamieson Greer and China’s Vice Premier He Lifeng last week, He “expressed serious concerns” about existing US tariffs and the potential introduction of further duties on April 2, according to a state media readout. A separate summary of the call from a social media account linked to China’s state broadcaster was more explicit on what He said: “If the US insists on damaging China’s interests, China will resolutely counterattack.” Any US changes to China’s trading status “will not affect China’s win-win cooperation with the world,” He also said. China has already hit back swiftly – though modestly – on the two tranches of additional 10% tariffs Trump has imposed on Chinese imports to the US so far, while sharpening a toolbox of other countermeasures. 
Chinese Premier Li Qiang last week signed an order strengthening China’s “anti-sanctions law,” which allows Beijing to take action against foreign countries that “contain or suppress” China or discriminate against its entities and individuals. Earlier this month, Beijing used anti-discrimination measures under its Foreign Trade Law for the first time to raise tariffs on Canadian imports. Late last year, the government also overhauled its export controls for “dual use” items and then swiftly tightened export of gallium, germanium, antimony – key materials with military applications. “China knows the way it handled the US for Trump’s first term might not work. Beijing underestimated the US determination to wage a war, it did not have enough bullets to have a tit-for-tat war,” said Shanghai-based foreign affairs analyst Shen Dingli. Then, China’s retaliatory measures largely consisted of slapping tariffs across a wide swath of US imports before the two sides reached a “phase one” trade deal that analysts say China never fully implemented. This time, Beijing is moving more strategically and using other tools as it aims for leader-level talks with Trump to stop the escalation, Shen said. Beijing has also been using global uncertainty around Trump’s “America First” and trade policy to pitch itself as a champion of the global economy – and win economic allies among US trade partners and companies from Asia to Europe. “Decoupling and breaking supply chains harm everyone and lead nowhere,” Xi told an audience of global executives in Beijing last week, which included CEOs of US firms FedEx and Qualcomm. “Blocking others’ paths will ultimately only obstruct your own,” the Chinese leader said. Data show China has already diversified imports and exports away from the US since the first Trump trade war. But access to European and Asian markets will become increasingly critical for Beijing if it faces mounting barriers to enter the American market. 
Those could be numerous as Trump seeks to close trade deficits and also punish behaviors he sees as harming the US. The president last week also threatened hefty tariffs on imports from countries that import Venezuelan oil, a list that China tops. Over the weekend, he also threatened to put similar, secondary sanctions on Russian oil, of which China is also a major purchaser. “It feels like it’s gravitating towards a situation where the overall tariff rate on Chinese goods shipped the United States by summertime could actually reach the President’s proposed 60%,” said Kurt Tong, managing partner at Washington DC-based advisory firm Asia Group, a former ambassador who served as consul general in Hong Kong. That’s a level economists say “will lead to a dramatic decline in shipments,” he added."""
# Note: this splits on ASCII commas, so each item is a comma-delimited fragment rather than a full sentence
sentences0 = [jieba.lcut(sent) for sent in text.split(',')]

**Building the vocabulary mapping**
Convert words to integer indices that can be fed to the model:

In [71]:
from tensorflow.keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences0)
word_index = tokenizer.word_index
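
To make the mapping concrete, here is a small pure-Python sketch of a frequency-ordered word index. It only illustrates the idea and is not Keras's implementation; `build_word_index` and `toy_sentences` are hypothetical names. Indices start at 1, leaving 0 free for padding, which is why a vocabulary of this kind is usually given `len(word_index) + 1` embedding rows.

```python
from collections import Counter

def build_word_index(token_lists):
    """Map each token to an integer index, most frequent first, starting at 1."""
    counts = Counter(tok.lower() for tokens in token_lists for tok in tokens)
    # most_common() sorts by descending count; ties keep first-seen order
    return {tok: i for i, (tok, _) in enumerate(counts.most_common(), start=1)}

toy_sentences = [["trade", "war", "trade"], ["tariff", "war", "trade"]]
toy_index = build_word_index(toy_sentences)
print(toy_index)  # {'trade': 1, 'war': 2, 'tariff': 3}
```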

**Sequence padding**
Pad the sequences to a uniform length so they match the model's input format:

In [83]:
from tensorflow.keras.preprocessing.sequence import pad_sequences
max_length = 50  # adjust to suit the text
sequences = pad_sequences(tokenizer.texts_to_sequences(sentences0),
                          maxlen=max_length, padding='post')

In [84]:
sequences

array([[ 81,   1,  82, ..., 150,   0,   0],
       [  1,  88,   1, ...,   5,   1, 180],
       [  1,  14,   1, ...,   0,   0,   0],
       ...,
       [  1, 517,   1, ...,   0,   0,   0],
       [  1, 523,   1, ...,  14,   1, 532],
       [ 11,   1,  21, ...,   0,   0,   0]], dtype=int32)

In [85]:
len(sequences)

49
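
The effect of post-padding can be sketched in plain Python. The helper below is a hypothetical stand-in, not the Keras function: it mimics `pad_sequences(..., padding='post')` with the default `truncating='pre'`, where short sequences get zeros appended and long ones keep only their last `maxlen` items.

```python
def pad_post(seqs, maxlen, value=0):
    """Pad at the end; sequences longer than maxlen keep their last maxlen items."""
    out = []
    for seq in seqs:
        seq = list(seq)[-maxlen:]                        # truncate from the front
        out.append(seq + [value] * (maxlen - len(seq)))  # pad at the back
    return out

padded = pad_post([[5, 3], [7, 1, 4, 9, 2]], maxlen=4)
print(padded)  # [[5, 3, 0, 0], [1, 4, 9, 2]]
```

Rows that end in zeros in the array above are fragments shorter than `max_length`; rows without trailing zeros were at least that long.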

---

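### **2. Model Architecture**

The model uses an **encoder-decoder (Seq2Seq)** structure, with a bidirectional LSTM in the encoder to improve context understanding: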

In [86]:
from tensorflow.keras.layers import LSTM, Dense, Embedding, Bidirectional, Input, concatenate
from tensorflow.keras.models import Model

# Encoder
encoder_input = Input(shape=(max_length,))
encoder_embed = Embedding(len(word_index)+1, 256)(encoder_input)
# Get outputs from Bidirectional LSTM, handling both directions
encoder_output, forward_h, forward_c, backward_h, backward_c = Bidirectional(LSTM(256, return_state=True))(encoder_embed)
# Concatenate the hidden and cell states from both directions (256 + 256 = 512)
state_h = concatenate([forward_h, backward_h])
state_c = concatenate([forward_c, backward_c])
encoder_states = [state_h, state_c]

# Decoder
# Variable-length input: training feeds max_length - 1 tokens, inference feeds one token at a time
decoder_input = Input(shape=(None,))
decoder_embed = Embedding(len(word_index)+1, 256)(decoder_input)
# 512 units so the decoder state size matches the concatenated encoder states
decoder_lstm = LSTM(512, return_sequences=True, return_state=True)
decoder_output, _, _ = decoder_lstm(decoder_embed, initial_state=encoder_states)
decoder_dense = Dense(len(word_index)+1, activation='softmax')(decoder_output)

model = Model([encoder_input, decoder_input], decoder_dense)
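
To show what the encoder states are, the following NumPy sketch runs one LSTM forward and one backward over a toy input and concatenates their final hidden states. It uses random, untrained weights, and every name in it (`lstm_step`, `run_lstm`, `toy_state_h`) is hypothetical. The point is only that concatenating two directions doubles the state size, which any decoder initialized from those states has to match.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (input_dim, 4*units), U: (units, 4*units), b: (4*units,)."""
    z = x @ W + h @ U + b
    i, f, g, o = np.split(z, 4, axis=-1)  # input, forget, candidate, output
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def run_lstm(xs, units, rng):
    """Run an LSTM with random weights over xs of shape (timesteps, input_dim)."""
    input_dim = xs.shape[-1]
    W = rng.normal(size=(input_dim, 4 * units)) * 0.1
    U = rng.normal(size=(units, 4 * units)) * 0.1
    b = np.zeros(4 * units)
    h = np.zeros((1, units))
    c = np.zeros((1, units))
    for x in xs:
        h, c = lstm_step(x[None, :], h, c, W, U, b)
    return h, c

rng = np.random.default_rng(0)
toy_inputs = rng.normal(size=(5, 8))   # 5 timesteps, 8 features
toy_units = 4
fwd_h, fwd_c = run_lstm(toy_inputs, toy_units, rng)        # left to right
bwd_h, bwd_c = run_lstm(toy_inputs[::-1], toy_units, rng)  # right to left
toy_state_h = np.concatenate([fwd_h, bwd_h], axis=-1)
print(toy_state_h.shape)  # (1, 8), twice the per-direction size
```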



---

### **3. Training Configuration**

**Loss function and optimizer**
Use cross-entropy loss with the Adam optimizer:

In [87]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
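
As a worked example of the loss, sparse categorical cross-entropy takes integer class labels and averages the negative log of the probability the model assigned to each true class. The NumPy function below is an illustrative re-implementation under that definition, not the Keras code, and the toy probabilities are made up.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, targets):
    """Mean of -log(probability assigned to the true class)."""
    probs = np.asarray(probs, dtype=float)
    targets = np.asarray(targets, dtype=int)
    picked = probs[np.arange(len(targets)), targets]  # p(true class) per row
    return float(np.mean(-np.log(picked)))

toy_probs = [[0.7, 0.2, 0.1],
             [0.1, 0.8, 0.1]]
toy_targets = [0, 1]  # integer labels, no one-hot encoding needed
toy_loss = sparse_categorical_crossentropy(toy_probs, toy_targets)
print(round(toy_loss, 4))  # (-ln 0.7 - ln 0.8) / 2, about 0.2899
```

The "sparse" variant suits this model because the targets are the integer token indices themselves.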

**Training parameters**

In [90]:
# Assuming 'sequences' is your preprocessed data from previous cells
import numpy as np

# Check if sequences has enough data for training
if len(sequences) < 2:
    raise ValueError("Not enough data in 'sequences' for training. Need at least 2 sequences.")

# Create encoder input data (previous sentence)
encoder_input_data = sequences[:-1]

# Create decoder input data (current sentence) and target data (next word)
decoder_input_data = []
decoder_target_data = []
for i in range(len(sequences) - 1):
    decoder_input_data.append(sequences[i+1][:-1]) # Exclude last word for input
    decoder_target_data.append(sequences[i+1][1:]) # Exclude first word for target

# Convert to NumPy arrays
decoder_input_data = np.array(decoder_input_data)
decoder_target_data = np.array(decoder_target_data)

# Now you can call model.fit()
# If the length of sequences is small, set validation_split to 0.0
# to avoid the ValueError
validation_split = 0.1 if len(sequences) > 5 else 0.0  # Adjust 5 if needed

encoder_input_data

decoder_input_data

array([[  1,  88,   1, ...,  29,   5,   1],
       [  1,  14,   1, ...,   0,   0,   0],
       [ 24,   1,  36, ...,   1,  12,   1],
       ...,
       [  1, 517,   1, ...,   0,   0,   0],
       [  1, 523,   1, ...,   1,  14,   1],
       [ 11,   1,  21, ...,   0,   0,   0]], dtype=int32)

In [93]:
len(encoder_input_data)

48

In [94]:
len(decoder_input_data)

48
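
The slicing in the data-preparation cell is easier to follow on a toy array. This sketch repeats the same shifts with NumPy slices (the names `toy`, `enc_in`, `dec_in`, `dec_target` are hypothetical): each fragment is the encoder input for the fragment that follows it, and the decoder target is the decoder input moved one step to the left.

```python
import numpy as np

toy = np.array([[4, 9, 2, 0, 0],
                [7, 3, 5, 8, 0],
                [6, 1, 0, 0, 0]])

enc_in = toy[:-1]         # fragments 0..n-2
dec_in = toy[1:, :-1]     # following fragment, last token dropped
dec_target = toy[1:, 1:]  # following fragment, first token dropped

print(enc_in.shape, dec_in.shape, dec_target.shape)  # (2, 5) (2, 4) (2, 4)
print(dec_in[0], dec_target[0])                      # [7 3 5 8] [3 5 8 0]
```

Note that the decoder arrays are one step shorter than the encoder input, so with `max_length = 50` the decoder receives sequences of length 49.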

In [88]:
history = model.fit([encoder_input_data, decoder_input_data],
                    decoder_target_data,
                    batch_size=32,
                    epochs=100,
                    validation_split=validation_split)

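Epoch 1/100

ValueError: Input 0 of layer "functional_6" is incompatible with the layer: expected shape=(None, 50), found shape=(None, 20)

Note: judging by the cell numbers, this cell (In [88]) ran before the data-preparation cell (In [90]), so the error most likely reflects arrays of length 20 left over from an earlier run rather than the length-50 sequences shown above.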

---

### **4. Response Generation**

**State passing and step-by-step prediction**

In [None]:
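# Encoder inference model
encoder_model = Model(encoder_input, encoder_states)

# Decoder inference model
# State size follows the decoder LSTM so the inputs always match its units
decoder_state_input_h = Input(shape=(decoder_lstm.units,))
decoder_state_input_c = Input(shape=(decoder_lstm.units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_outputs, state_h, state_c = decoder_lstm(decoder_embed, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
# Reuse the trained softmax layer (the last layer of `model`) to get vocabulary probabilities
decoder_outputs = model.layers[-1](decoder_outputs)
decoder_model = Model([decoder_input] + decoder_states_inputs,
                      [decoder_outputs] + decoder_states)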

**Response generation function**

In [None]:
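def generate_response(input_seq):
    # Encode the input; the final encoder states seed the decoder
    states_value = list(encoder_model.predict(input_seq))
    target_seq = np.zeros((1, 1))
    response = []
    for _ in range(max_length):
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)
        sampled_token_index = int(np.argmax(output_tokens[0, -1, :]))
        # Index 0 is padding and has no entry in the tokenizer vocabulary
        sampled_word = tokenizer.index_word.get(sampled_token_index, '')
        # Stop on padding or on an end marker (the training text above contains no '<END>' token)
        if sampled_token_index == 0 or sampled_word == '<END>':
            break
        response.append(sampled_word)
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index
        states_value = [h, c]
    return ' '.join(response)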
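
The loop in `generate_response` is a greedy decoder: at each step it takes the most probable token and feeds it back in. The sketch below isolates that control flow with a toy step function in place of the trained model. `greedy_decode` and `toy_step` are hypothetical names, and `toy_step` is only a stand-in for a decoder prediction call.

```python
import numpy as np

def greedy_decode(step_fn, start_token, end_token, max_steps):
    """Repeatedly pick the most probable next token until end_token or max_steps."""
    token, state, output = start_token, None, []
    for _ in range(max_steps):
        probs, state = step_fn(token, state)
        token = int(np.argmax(probs))
        if token == end_token:
            break
        output.append(token)
    return output

def toy_step(token, state):
    # Hypothetical stand-in for a decoder: favors token + 1, then token 0 after 3
    nxt = token + 1 if token < 3 else 0
    probs = np.full(5, 0.05)
    probs[nxt] = 0.8
    return probs, state

print(greedy_decode(toy_step, start_token=1, end_token=0, max_steps=10))  # [2, 3]
```

Greedy decoding is simple but can get stuck repeating itself; sampling or beam search are common alternatives.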

---

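### **5. Practical Tips and Optimizations**

- **Attention mechanism**: add a Bahdanau attention layer to improve handling of long texts[^2][^3]
- **Transfer learning**: use pretrained word vectors (such as Word2Vec) to speed up convergence
- **Data augmentation**: expand the training set through synonym replacement
- **Hybrid architecture**: combine rule-based responses (such as AIML templates[^1]) to make basic Q&A more stable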
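
For the attention tip, a Bahdanau-style (additive) attention step can be sketched in NumPy. The weights here are random and untrained, and all names are hypothetical; the sketch only shows the computation: score each encoder timestep against the current decoder state, softmax the scores, and take the weighted sum of encoder outputs as the context vector.

```python
import numpy as np

def additive_attention(enc_outputs, dec_state, W1, W2, v):
    """Additive attention: score_t = v . tanh(W1 h_t + W2 s)."""
    scores = np.tanh(enc_outputs @ W1 + dec_state @ W2) @ v  # (timesteps,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                 # softmax over time
    context = weights @ enc_outputs                          # (enc_dim,)
    return context, weights

rng = np.random.default_rng(0)
timesteps, enc_dim, dec_dim, attn_dim = 6, 8, 4, 5
enc_outputs = rng.normal(size=(timesteps, enc_dim))
dec_state = rng.normal(size=(dec_dim,))
W1 = rng.normal(size=(enc_dim, attn_dim))
W2 = rng.normal(size=(dec_dim, attn_dim))
v = rng.normal(size=(attn_dim,))

context, weights = additive_attention(enc_outputs, dec_state, W1, W2, v)
print(weights.shape, context.shape)  # (6,) (8,)
```

In a Keras model this requires the encoder to return its full output sequence, not just the final states.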

---

### **6. Deployment and Testing**

**Serving as an API**
Use the Flask framework to expose an HTTP endpoint:

In [None]:
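from flask import Flask, request
app = Flask(__name__)

@app.route('/chat', methods=['POST'])
def chat():
    user_input = request.json['msg']
    # preprocess() is not defined in this notebook; it needs to turn the message
    # into a padded index array of shape (1, max_length) for the encoder
    processed_input = preprocess(user_input)
    response = generate_response(processed_input)
    return {'response': response}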
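
The endpoint calls a `preprocess` function that the notebook never defines. One possible shape for it is sketched below, under stated assumptions: it tokenizes on whitespace (unlike the jieba-based tokenization used for training), silently skips unknown words, and pads with zeros at the end. `make_preprocess` and `toy_preprocess` are hypothetical names.

```python
import numpy as np

def make_preprocess(word_index, max_length):
    """Build a preprocess(text) function returning a (1, max_length) int array."""
    def preprocess(text):
        tokens = text.lower().split()              # assumption: whitespace tokens
        ids = [word_index[t] for t in tokens if t in word_index]  # skip unknowns
        ids = ids[-max_length:]                    # keep the last max_length ids
        ids = ids + [0] * (max_length - len(ids))  # post-pad with 0
        return np.array([ids], dtype="int32")
    return preprocess

toy_preprocess = make_preprocess({"trade": 1, "war": 2, "tariff": 3}, max_length=5)
print(toy_preprocess("Trade war escalates"))  # [[1 2 0 0 0]]
```

In the notebook's setting, the vocabulary would come from `tokenizer.word_index` and the length from `max_length`, and the tokenization should match whatever was used at training time.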

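**GUI integration**
An interactive interface can be built with PyQt or Tkinter[^4][^7], including:

- A text input box
- A conversation history area
- A send button with immediate responses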

---

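This architecture can combine domain-specific knowledge (the content of the reference article) with contextual conversation. In practice, adjust the vocabulary size and the number of LSTM layers to suit the text, and keep refining generation quality with metrics such as the BLEU score[^5][^6].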
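
As a rough illustration of the BLEU idea, the sketch below computes an unsmoothed sentence-level score against a single reference, using clipped n-gram precision for unigrams and bigrams plus a brevity penalty. It is a simplified, hypothetical implementation for intuition only; established toolkits add smoothing and multi-reference support.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=2):
    """Unsmoothed BLEU with uniform weights over 1..max_n grams, one reference."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)

toy_reference = "china will resolutely counterattack".split()
toy_candidate = "china will counterattack".split()
toy_bleu = sentence_bleu(toy_candidate, toy_reference)
print(round(toy_bleu, 4))  # about 0.5067
```

Here the unigram precision is 3/3, the bigram precision is 1/2, and the brevity penalty is exp(1 - 4/3), which together give roughly 0.51.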

<div>⁂</div>

[^1]: https://hub.packtpub.com/build-generative-chatbot-using-recurrent-neural-networks-lstm-rnns/

[^2]: https://ijiet.com/wp-content/uploads/2019/10/32.pdf

[^3]: https://github.com/ShrishtiHore/Conversational_Chatbot_using_LSTM

[^4]: https://github.com/tridibsamanta/Chatbot-using-Python

[^5]: https://www.semanticscholar.org/paper/Automated-Thai-FAQ-Chatbot-using-RNN-LSTM-Muangkammuen-Intiruk/c278680868ad4d03431ab92bf404431f4fb9478e

[^6]: https://hub.packtpub.com/build-and-train-rnn-chatbot-using-tensorflow/

[^7]: https://github.com/AnMol12499/LSTM_based_chatbot

[^8]: https://github.com/AdroitAnandAI/LSTM-Attention-based-Generative-Chat-bot/blob/master/README.md

[^9]: https://www.semanticscholar.org/paper/Chatterbot-implementation-using-Transfer-Learning-Prakash/b41c4c1f4b3a33fd2dbe7e9be29fee5a6ce1c81c

[^10]: https://www.packtpub.com/en-jp/learning/how-to-tutorials/build-generative-chatbot-using-recurrent-neural-networks-lstm-rnns?fallbackPlaceholder=en-gb%2Flearning%2Fhow-to-tutorials%2Fbuild-generative-chatbot-using-recurrent-neural-networks-lstm-rnns

[^11]: https://www.taylorfrancis.com/chapters/edit/10.1201/9781003357346-75/development-chatbot-using-lstm-architecture-seq2seq-model-adil-tanmaya-garg-rajesh-kumar

[^12]: https://github.com/AdroitAnandAI/LSTM-Attention-based-Generative-Chat-bot

[^13]: https://dl.acm.org/doi/10.1145/3529649

[^14]: https://ijisae.org/index.php/IJISAE/article/view/6100

[^15]: https://www.datasciencecentral.com/under-the-hood-with-chatbots/

[^16]: http://www.warse.org/IJETER/static/pdf/file/ijeter35852020.pdf

[^17]: https://pdfs.semanticscholar.org/be96/353bdbf2803d8cf2d1f6556d7f92f0f4adcb.pdf

[^18]: https://medium.com/@Scofield_Idehen/building-nlp-chatbots-with-pytorch-c4447604c974

[^19]: https://github.com/rshinde03/AI-Self-learning-Chatbot