<font size=6> **Lab 16: Modern Recurrent Neural Networks**</font>

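* Understand the problem GRU addresses and the idea behind the algorithm, and be able to apply it fluently;
* Understand the problem LSTM addresses and the idea behind the algorithm, and be able to apply it fluently;
* Understand the basic structure of deep recurrent neural networks and their concise implementation;
* Understand the basic idea and scope of application of bidirectional recurrent neural networks (BiLSTM);
* Become familiar with the basic workflow of machine translation and its dataset.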

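**(Lab report submission task)**  The spider.ipynb program in the lab_16_4 directory is adapted from the code at https://zhuanlan.zhihu.com/p/597957245 (the original URL is invalid). It scrapes Li Bai's poems and saves them to the file libai.txt. Building on that file, use an LSTM to implement Li Bai poem generation.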

# Import libraries

In [1]:
import random

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Data loading and processing

## Load data

In [2]:
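# Read the corpus and drop newlines so the poems form one continuous string
with open('libai.txt', 'r', encoding='utf8') as file:
    content = file.read().replace('\n', '')
# print(content.replace('\n', ''))
# Vocabulary: every distinct character in the corpus
chars = list(set(content))
print(chars)
n_chars = len(chars)
print(n_chars)
# Map each character to its integer index
char_indices = dict((c, i) for i, c in enumerate(chars))
print(char_indices)
# Map each integer index back to its character
indices_char = dict((i, c) for i, c in enumerate(chars))
print(indices_char)

# Slide a window of maxlen characters over the text in strides of step;
# each window is one input and the character right after it is its target
maxlen = 20
step = 3
sentences = []
next_chars = []
for i in range(0, len(content) - maxlen, step):
    sentences.append(content[i: i+maxlen])
    next_chars.append(content[i+maxlen])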

['脱', '浩', '五', '感', '许', '是', '离', '载', '罢', '固', '笔', '归', '旃', '危', '予', '招', '造', '童', '迫', '羲', '散', '虎', '忘', '俱', '保', '略', '轮', '促', '奴', '蜍', '蔓', '馔', '及', '烛', '僧', '帝', '客', '何', '辰', '肆', '亥', '却', '阶', '乎', '声', '啼', '御', '没', '徒', '荡', '礼', '宾', '雅', '膺', '携', '代', '冠', '骨', '湾', '翠', '烹', '居', '帏', '个', '响', '城', '妒', '卢', '生', '身', '俊', '水', '至', '歇', '掷', '游', '缺', '霸', '鸾', '簸', '所', '漱', '敢', '殊', '管', '鸟', '两', '田', '仁', '苔', '翁', '宣', '避', '迁', '烜', '解', '建', '杖', '溪', '杯', '谣', '隗', '容', '徂', '户', '试', '隘', '殿', '道', '洲', '炎', '泪', '猱', '块', '赵', '半', '夏', '压', '吐', '句', '忧', '午', '入', '虏', '栈', '寞', '凉', '怀', '驻', '始', '湿', '焚', '桀', '常', '施', '求', '岭', '堆', '彩', '窈', '黄', '涂', '羞', '肯', '罗', '今', '盛', '丝', '骠', '赐', '夙', '其', '发', '寻', '尔', '泉', '络', '久', '左', '珑', '征', '尺', '雉', '亦', '松', '望', '恨', '菱', '适', '西', '玉', '残', '幽', '双', '娱', '识', '当', '喜', '兹', '胁', '桥', '徘', '清', '匹', '悲', '逸', '题', '市', '听', '萏', '京', '坐', '年', '旧', '弃', '布', '县', '妇', '石', '甚',

In [5]:
len(sentences)

2955

In [12]:
n_chars

1472

In [11]:
sentences[:10]

['将进酒君不见黄河之水天上来，奔流到海不复',
 '君不见黄河之水天上来，奔流到海不复回。君',
 '黄河之水天上来，奔流到海不复回。君不见高',
 '水天上来，奔流到海不复回。君不见高堂明镜',
 '来，奔流到海不复回。君不见高堂明镜悲白发',
 '流到海不复回。君不见高堂明镜悲白发，朝如',
 '不复回。君不见高堂明镜悲白发，朝如青丝暮',
 '。君不见高堂明镜悲白发，朝如青丝暮成雪。',
 '见高堂明镜悲白发，朝如青丝暮成雪。人生得',
 '明镜悲白发，朝如青丝暮成雪。人生得意须尽']
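
The windowing used above can be checked on a toy string. The following is a minimal sketch, not part of the original notebook; the helper name `make_windows` and the toy input are assumptions made for illustration. It applies the same loop bounds as the cell that builds `sentences` and `next_chars`.

```python
def make_windows(text, maxlen, step):
    """Return (windows, targets): each window of maxlen characters is paired
    with the single character that follows it."""
    windows, targets = [], []
    # Same bounds as the notebook loop: stop while a target character still exists
    for i in range(0, len(text) - maxlen, step):
        windows.append(text[i:i + maxlen])
        targets.append(text[i + maxlen])
    return windows, targets

windows, targets = make_windows("abcdefghij", maxlen=4, step=3)
print(windows)  # ['abcd', 'defg']
print(targets)  # ['e', 'h']
```

With `step = 3` and `maxlen = 20`, consecutive windows overlap by 17 characters, which is why each entry in the output above begins three characters after the previous one.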

## One-hot encoding

In [6]:
# Convert every character in every sentence to its one-hot encoding
import numpy as np
import torch
X_train = torch.zeros((len(sentences), maxlen, len(chars)), dtype=torch.float32)
y_train = torch.zeros((len(sentences), len(chars)), dtype=torch.float32)

for i, sentence in enumerate(sentences):
    for j, char in enumerate(sentence):
        X_train[i, j, char_indices[char]] = 1
    y_train[i, char_indices[next_chars[i]]] = 1

In [9]:
X_train[0][0]

tensor([0., 0., 0.,  ..., 0., 0., 0.])

In [10]:
len(X_train[0][0])

1472
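
The nested loops above fill a dense tensor of shape (2955, 20, 1472) one element at a time. As a hedged alternative, the same encoding can be produced from an integer index tensor with `torch.nn.functional.one_hot`. The sketch below uses a toy vocabulary; the names `toy_chars`, `toy_index`, and `toy_sentences` are hypothetical and stand in for `chars`, `char_indices`, and `sentences`.

```python
import torch
import torch.nn.functional as F

# Toy stand-ins for the notebook's vocabulary and sentences (assumed data)
toy_chars = sorted(set("abcab"))                      # ['a', 'b', 'c']
toy_index = {c: i for i, c in enumerate(toy_chars)}
toy_sentences = ["abc", "bca"]

# Integer indices, shape (num_sentences, seq_len)
idx = torch.tensor([[toy_index[c] for c in s] for s in toy_sentences])

# One call expands the indices to one-hot vectors, shape (2, 3, 3)
one_hot = F.one_hot(idx, num_classes=len(toy_chars)).float()
print(one_hot.shape)
print(one_hot[0])
```

Keeping the index tensor around is also convenient because `nn.CrossEntropyLoss` takes class indices as targets, so the targets never need to be one-hot encoded at all.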

# LSTM modeling

In [13]:
class RNNModel(nn.Module):
    """Vanilla RNN baseline: predicts the next character from the last hidden state."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)   # out: (batch, seq_len, hidden_size)
        out = out[:, -1, :]    # keep only the last time step
        return self.fc(out)    # logits over the vocabulary

class LSTMModel(nn.Module):
    """Same architecture with an LSTM layer in place of the vanilla RNN."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)   # out: (batch, seq_len, hidden_size)
        out = out[:, -1, :]    # keep only the last time step
        return self.fc(out)    # logits over the vocabulary
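
Since the lab objectives also cover GRU, the same pattern carries over by swapping the recurrent layer. The class below is a sketch that is not part of the original notebook; the name `GRUModel` and the dummy sizes are assumptions. It also runs a dummy batch through the model to confirm the output shape.

```python
import torch
import torch.nn as nn

class GRUModel(nn.Module):
    """Hypothetical GRU counterpart of the RNN/LSTM models in the notebook."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.GRU(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)            # out: (batch, seq_len, hidden_size)
        return self.fc(out[:, -1, :])   # logits from the last time step

# Dummy batch: 4 sequences of 20 steps over a 50-symbol vocabulary (assumed sizes)
dummy = torch.zeros(4, 20, 50)
logits = GRUModel(50, 16, 50)(dummy)
print(logits.shape)  # torch.Size([4, 50])
```

Each row of `logits` holds one unnormalised score per vocabulary entry, which is the form `nn.CrossEntropyLoss` expects.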

In [24]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cuda


In [25]:
model = LSTMModel(len(chars), 128, len(chars)).to(device)
X_train = X_train.to(device)
y_train = y_train.to(device)
# CrossEntropyLoss expects class indices, so convert the one-hot targets once
targets = y_train.argmax(dim=1)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Full-batch training: each epoch is a single gradient step on the whole dataset
num_epochs = 100
for epoch in range(num_epochs):
    optimizer.zero_grad()
    output = model(X_train)
    loss = criterion(output, targets)
    loss.backward()
    optimizer.step()
    print(f'Epoch: [{epoch+1}], Loss: {loss.item():.4f}')

Epoch: [1], Loss: 7.2856
Epoch: [2], Loss: 7.1894
Epoch: [3], Loss: 6.6581
Epoch: [4], Loss: 6.2853
Epoch: [5], Loss: 6.1330
Epoch: [6], Loss: 6.0480
Epoch: [7], Loss: 6.0314
Epoch: [8], Loss: 6.0198
Epoch: [9], Loss: 6.0217
Epoch: [10], Loss: 6.0116
Epoch: [11], Loss: 5.9864
Epoch: [12], Loss: 5.9647
Epoch: [13], Loss: 5.9564
Epoch: [14], Loss: 5.9560
Epoch: [15], Loss: 5.9539
Epoch: [16], Loss: 5.9449
Epoch: [17], Loss: 5.9305
Epoch: [18], Loss: 5.9158
Epoch: [19], Loss: 5.9031
Epoch: [20], Loss: 5.8910
Epoch: [21], Loss: 5.8770
Epoch: [22], Loss: 5.8587
Epoch: [23], Loss: 5.8363
Epoch: [24], Loss: 5.8107
Epoch: [25], Loss: 5.7818
Epoch: [26], Loss: 5.7491
Epoch: [27], Loss: 5.7140
Epoch: [28], Loss: 5.6745
Epoch: [29], Loss: 5.6337
Epoch: [30], Loss: 5.5988
Epoch: [31], Loss: 5.5415
Epoch: [32], Loss: 5.4848
Epoch: [33], Loss: 5.4320
Epoch: [34], Loss: 5.3580
Epoch: [35], Loss: 5.2921
Epoch: [36], Loss: 5.2180
Epoch: [37], Loss: 5.1322
Epoch: [38], Loss: 5.1483
Epoch: [39], Loss: 5.
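
The loop above takes one optimizer step per epoch on the entire dataset. A common alternative is mini-batch training with `DataLoader`, which takes many steps per epoch and uses less memory per step. The sketch below is not the notebook's method: it runs on random toy data, and `TinyLSTM`, the tensor sizes, and the batch size are all assumptions chosen so that the example runs quickly on CPU.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Toy stand-in data (assumed sizes): 64 windows of 5 characters, 30-symbol vocabulary
vocab_size, seq_len, n_samples = 30, 5, 64
X = nn.functional.one_hot(
    torch.randint(0, vocab_size, (n_samples, seq_len)), vocab_size
).float()
y = torch.randint(0, vocab_size, (n_samples,))   # class indices, not one-hot

class TinyLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])

model = TinyLSTM(vocab_size, 16, vocab_size)
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(3):
    total = 0.0
    for xb, yb in loader:                 # 4 mini-batches of 16 samples each
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        total += loss.item() * xb.size(0)  # weight by batch size for the epoch mean
    epoch_losses.append(total / n_samples)
    print(f'Epoch: [{epoch+1}], Loss: {epoch_losses[-1]:.4f}')
```

On the real corpus, `X` and `y` would be replaced by `X_train` and the index form of `y_train`, and each batch would be moved to `device` inside the loop.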

In [30]:
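def sample(preds, temperature=1.0):
    # Temperature-scaled softmax over the logits; casting to float64 and
    # renormalising avoids rounding problems when np.random.choice checks p
    preds = torch.softmax(preds / temperature, dim=-1)
    preds = preds.detach().cpu().numpy().astype(np.float64)
    preds /= preds.sum()
    return np.random.choice(len(preds), p=preds)

def generate_text(length, diversity):
    # Seed the generator with a random maxlen-character window from the corpus
    start_index = random.randint(0, len(content) - maxlen - 1)
    sentence = content[start_index : start_index + maxlen]
    generate = ""
    with torch.no_grad():  # inference only, no gradients needed
        for i in range(length):
            # One-hot encode the current window
            x_pred = torch.zeros((1, maxlen, len(chars)), dtype=torch.float32, device=device)
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1
            preds = model(x_pred)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            # Slide the window: drop the oldest character, append the new one
            sentence = sentence[1:] + next_char
            generate += next_char
    return generate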

In [31]:
generate_text(24, 0.9)

'抱抱以，窈明飞羽枝水东流流天。坐白·长相天门前。'
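
The `diversity` argument passed to `generate_text` is the softmax temperature used in `sample`. The NumPy sketch below illustrates its effect on a toy logit vector; the function name `softmax_with_temperature` and the logits are assumptions made for illustration. Temperatures below 1 concentrate probability on the highest-scoring character, while temperatures above 1 flatten the distribution.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed in float64."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled = scaled - scaled.max()   # subtract the max for numerical stability
    p = np.exp(scaled)
    return p / p.sum()

logits = [2.0, 1.0, 0.0]             # toy scores for three characters
for t in (0.5, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))
```

The probability of the top-scoring entry shrinks as the temperature grows, so a low `diversity` yields more repetitive but more conservative text and a high one yields more varied but less coherent text.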