<a href="https://colab.research.google.com/github/day253/labs/blob/master/labs/text_cnn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
import torch
import torch.nn as nn
import torch.nn.functional as F

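2. Define the model
We define a CNN with two input channels (static and non-static). The architecture is as follows:

Input layer: a sentence is represented as an $n \times k$ matrix, where $n$ is the sentence length and $k$ is the word-vector dimension.

Convolutional layer: uses several filter widths, each producing multiple feature maps.

Pooling layer: max-over-time pooling, i.e. the maximum of each feature map along the sequence dimension.

Fully connected layer: dropout followed by a linear layer with softmax output. The model itself returns raw logits; during training, nn.CrossEntropyLoss applies log-softmax to them internally.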
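
To make the shapes concrete, the sketch below traces one sentence through a single filter width and max-over-time pooling. The sizes (n=7, k=5, filter width h=3, 4 feature maps) are illustrative assumptions, and the input is a random single-channel tensor rather than real word vectors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

n, k = 7, 5          # sentence length and word-vector dimension (illustrative)
h, num_maps = 3, 4   # filter width and number of feature maps (illustrative)

# One single-channel "sentence": (batch, channels, n, k)
sentence = torch.randn(1, 1, n, k)

# The kernel spans the full embedding dimension, so it only slides along the sentence
conv = nn.Conv2d(in_channels=1, out_channels=num_maps, kernel_size=(h, k))

features = F.relu(conv(sentence)).squeeze(3)  # (1, num_maps, n - h + 1)
pooled = features.max(dim=2).values           # max-over-time: (1, num_maps)

print(features.shape)  # torch.Size([1, 4, 5])
print(pooled.shape)    # torch.Size([1, 4])
```

Each feature map is reduced to one number regardless of sentence length, which is why sentences of different lengths give a feature vector of the same size.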

In [5]:
class TextCNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, num_filters, filter_sizes, num_classes, dropout_rate):
        super(TextCNN, self).__init__()

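        # Non-static channel: embeddings are updated during training
        self.embedding = nn.Embedding(vocab_size, embedding_dim)

        # Static channel: embeddings are frozen (no gradient updates)
        self.static_embedding = nn.Embedding(vocab_size, embedding_dim)
        self.static_embedding.weight.requires_grad = False

        # One convolution per filter width; in_channels=2 because forward()
        # stacks the non-static and static embeddings as two input channels
        self.convs = nn.ModuleList([
            nn.Conv2d(in_channels=2, out_channels=num_filters, kernel_size=(fs, embedding_dim))
            for fs in filter_sizes
        ])

        # Fully connected layer
        self.fc = nn.Linear(len(filter_sizes) * num_filters, num_classes)

        # Dropout
        self.dropout = nn.Dropout(dropout_rate)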

    def forward(self, x):
        # Look up the word embeddings
        embedded = self.embedding(x).unsqueeze(1)  # (batch_size, 1, seq_len, embedding_dim)
        static_embedded = self.static_embedding(x).unsqueeze(1)  # (batch_size, 1, seq_len, embedding_dim)

        # Stack the non-static and static channels
        combined_embedded = torch.cat([embedded, static_embedded], dim=1)  # (batch_size, 2, seq_len, embedding_dim)

        # Convolution + ReLU
        conv_outputs = [F.relu(conv(combined_embedded)).squeeze(3) for conv in self.convs]  # [(batch_size, num_filters, seq_len - filter_size + 1)]

        # Max-over-time pooling
        pooled_outputs = [F.max_pool1d(conv_output, conv_output.size(2)).squeeze(2) for conv_output in conv_outputs]  # [(batch_size, num_filters)]

        # Concatenate the pooled features from all filter widths
        pooled = torch.cat(pooled_outputs, 1)  # (batch_size, num_filters * len(filter_sizes))

        # Dropout
        pooled = self.dropout(pooled)

        # Fully connected layer (returns raw logits, no softmax)
        logits = self.fc(pooled)

        return logits
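
In the class above both embedding tables are randomly initialized, so the frozen channel simply stays at its random values. If pretrained word vectors are available, one option is to load the same vectors into both channels with `nn.Embedding.from_pretrained`. The sketch below is a minimal illustration under that assumption: the `pretrained` tensor is a random stand-in, not real word vectors, and the sizes are deliberately small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, embedding_dim = 50, 8  # small illustrative sizes

# Stand-in for pretrained vectors; in practice these would be loaded from a file
pretrained = torch.randn(vocab_size, embedding_dim)

# freeze=False: fine-tuned during training (non-static channel)
non_static = nn.Embedding.from_pretrained(pretrained.clone(), freeze=False)
# freeze=True: excluded from gradient updates (static channel)
static = nn.Embedding.from_pretrained(pretrained.clone(), freeze=True)

tokens = torch.randint(0, vocab_size, (2, 6))  # (batch_size, seq_len)
two_channel = torch.stack([non_static(tokens), static(tokens)], dim=1)

print(two_channel.shape)                # torch.Size([2, 2, 6, 8])
print(non_static.weight.requires_grad)  # True
print(static.weight.requires_grad)      # False
```

Both channels start out identical; only the non-static one drifts away from the pretrained values as training proceeds.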

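3. Initialize the model
Assume the following parameters:

vocab_size: vocabulary size

embedding_dim: word-vector dimension

num_filters: number of feature maps per filter width

filter_sizes: list of filter widths

num_classes: number of target classes

dropout_rate: dropout probability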

In [6]:
vocab_size = 10000
embedding_dim = 300
num_filters = 100
filter_sizes = [3, 4, 5]
num_classes = 2
dropout_rate = 0.5

model = TextCNN(vocab_size, embedding_dim, num_filters, filter_sizes, num_classes, dropout_rate)
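
As a rough sanity check on model size, the parameter counts for these settings can be worked out by hand. The arithmetic below assumes the convolutions read both embedding channels (`in_channels = 2`) and that the static embedding table is frozen; it is a back-of-the-envelope sketch, not output from the model itself.

```python
vocab_size, embedding_dim = 10000, 300
num_filters, filter_sizes, num_classes = 100, [3, 4, 5], 2
in_channels = 2  # assumption: convolutions take both embedding channels as input

# One embedding table: vocab_size x embedding_dim
embedding_params = vocab_size * embedding_dim

# Each convolution: num_filters kernels of shape (in_channels, fs, embedding_dim) plus biases
conv_params = sum(
    num_filters * in_channels * fs * embedding_dim + num_filters
    for fs in filter_sizes
)

# Linear layer: weights plus biases
fc_params = len(filter_sizes) * num_filters * num_classes + num_classes

# The static embedding table is frozen, so it is not counted as trainable
trainable = embedding_params + conv_params + fc_params
total = trainable + embedding_params

print(f"conv: {conv_params:,}  fc: {fc_params:,}")        # conv: 720,300  fc: 602
print(f"trainable: {trainable:,}  total: {total:,}")      # trainable: 3,720,902  total: 6,720,902
```

The embedding tables dominate the count, so vocabulary size matters far more for memory than the number of filters.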

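4. Train the model
Training consists of the following steps:

Define the loss function and optimizer

Iterate over the training data

Compute the loss and backpropagate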

In [7]:
num_epochs = 10

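# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Placeholder training data (random token ids and labels) so the cell runs;
# replace with a real dataset.
# train_data: (batch_size, seq_len), train_labels: (batch_size,)
train_data = torch.randint(0, vocab_size, (32, 50))
train_labels = torch.randint(0, num_classes, (32,))

for epoch in range(num_epochs):
    model.train()
    optimizer.zero_grad()

    # Forward pass
    outputs = model(train_data)

    # Compute the loss
    loss = criterion(outputs, train_labels)

    # Backward pass and parameter update
    loss.backward()
    optimizer.step()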

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')
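
The loop above feeds the whole training set through the model once per epoch. With a larger dataset, the same steps are usually run per mini-batch. The sketch below shows one way to do that with `DataLoader`; the data is synthetic and `toy_model` is a small stand-in classifier (both are assumptions made so the example runs on its own), and any module mapping `(batch_size, seq_len)` token ids to `(batch_size, num_classes)` logits could take its place.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 64 "sentences" of 12 token ids each, with binary labels
data = torch.randint(0, 100, (64, 12))
labels = torch.randint(0, 2, (64,))
loader = DataLoader(TensorDataset(data, labels), batch_size=16, shuffle=True)

# Small stand-in classifier (mean of embeddings + linear layer), hypothetical
toy_model = nn.Sequential(nn.EmbeddingBag(100, 16), nn.Linear(16, 2))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)

epoch_losses = []
for epoch in range(3):
    toy_model.train()
    running = 0.0
    for batch_x, batch_y in loader:
        optimizer.zero_grad()                      # reset gradients for this batch
        loss = criterion(toy_model(batch_x), batch_y)
        loss.backward()                            # backpropagate
        optimizer.step()                           # update parameters
        running += loss.item() * batch_x.size(0)   # weight by batch size
    epoch_losses.append(running / len(data))
    print(f"Epoch {epoch + 1}: mean loss {epoch_losses[-1]:.4f}")
```

Calling `optimizer.zero_grad()` inside the inner loop matters here: gradients accumulate across `backward()` calls unless they are reset for each batch.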

5. Test the model
Evaluate the model's performance on the test data:

In [None]:
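# Placeholder test data (random token ids and labels) so the cell runs;
# replace with a real test set, since accuracy on random labels is meaningless.
test_data = torch.randint(0, vocab_size, (32, 50))
test_labels = torch.randint(0, num_classes, (32,))

model.eval()  # disable dropout
with torch.no_grad():
    outputs = model(test_data)
    predicted = outputs.argmax(dim=1)  # index of the highest logit per example
    accuracy = (predicted == test_labels).sum().item() / test_labels.size(0)
    print(f'Test Accuracy: {accuracy:.4f}')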
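
Because the model returns logits rather than probabilities, class probabilities have to be computed explicitly when they are needed. The sketch below uses hand-written logits (illustrative values, not model output) to show the softmax step and the accuracy calculation side by side.

```python
import torch
import torch.nn.functional as F

# Hand-written logits for three examples and two classes (illustrative values)
logits = torch.tensor([[2.0, 0.5],
                       [0.1, 1.2],
                       [0.3, 0.2]])
labels = torch.tensor([0, 1, 1])

probs = F.softmax(logits, dim=1)   # each row sums to 1
predicted = logits.argmax(dim=1)   # softmax is monotonic, so argmax of probs is the same
accuracy = (predicted == labels).float().mean().item()

print(predicted.tolist())  # [0, 1, 0]
print(f"{accuracy:.4f}")   # 0.6667
```

The third example is misclassified (predicted class 0, true class 1), giving an accuracy of 2/3.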