<a href="https://colab.research.google.com/github/L-W121/generals-bots/blob/master/learn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!nvidia-smi
import torch
print("CUDA:", torch.cuda.is_available())


Thu Sep 25 12:38:21 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   49C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [2]:
!git clone https://github.com/L-W121/generals-bots.git
%cd generals-bots

fatal: destination path 'generals-bots' already exists and is not an empty directory.
/content/generals-bots


In [9]:
%cd data_process/
!pip install -r requirements.txt


/content/generals-bots/data_process


In [13]:
!python scripts/preprocess_data.py

2025-09-25 12:47:50,233 - __main__ - INFO - Starting training data generation...
2025-09-25 12:47:50,233 - data_process.preprocessing.replay_parser - INFO - Loading dataset: strakammm/generals_io_replays
README.md: 5.93kB [00:00, 10.5MB/s]
train-00000-of-00001.parquet: 100% 61.4M/61.4M [00:00<00:00, 64.7MB/s]
Generating train split: 100% 18803/18803 [00:07<00:00, 2672.68 examples/s]
2025-09-25 12:48:00,555 - data_process.preprocessing.replay_parser - INFO - Dataset loaded, containing 18803 games
2025-09-25 12:48:00,556 - __main__ - INFO - Preparing to process 18803 games
Processing game batches:   0% 0/189 [00:00<?, ?it/s]
2025-09-25 12:48:52,037 - __main__ - INFO - Batch 0-100 done, generated 57816 samples
Processing game batches:   1% 1/189 [00:51<2:41:18, 51.48s/it]
2025-09-25 12:49:38,635 - __main__ - INFO - Batch 100-200 done, generated 49474 samples
Processing game batches:   1% 2/189 [01:38<2:31:30, 48.61s/it]
2025-09-25 12:51:06,775 - __main__ - INFO - Batch 200-300 done, generated 58571 samples
Processing game batches:   2% 3/189 [04:02<4:10:57, 80.95s/it]
Traceback (most recent call last):
  File "/content/generals-bots/data_process/scripts/preprocess_data.py", l
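The progress bar reports 189 batches for 18,803 games, which is consistent with batches of 100 games each, as the `Batch 0-100` log lines suggest. A minimal sketch of that arithmetic follows. The helper name `batch_ranges`, the six-digit zero padding in the file name, and the choice to clamp the last batch to the dataset size are assumptions inferred from the log and the cached file names, not taken from `preprocess_data.py`.

```python
import math

def batch_ranges(num_games, batch_size=100):
    """Yield (start, end) game-index ranges, with end exclusive."""
    for start in range(0, num_games, batch_size):
        # Assumption: the final, shorter batch is clamped to num_games.
        yield start, min(start + batch_size, num_games)

ranges = list(batch_ranges(18803))
num_batches = len(ranges)

# Same count as ceil(18803 / 100)
assert num_batches == math.ceil(18803 / 100)

# Assumed naming pattern, modelled on files such as batch_000000_000100.pt
first_name = "batch_{:06d}_{:06d}.pt".format(*ranges[0])
print(num_batches, first_name, ranges[-1])
```

Since the run above stopped during the fourth batch, only the first few of these ranges would have complete files on disk.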

In [14]:
# Inspect the generated data files
!ls -la data/cache/preprocessed/
!du -sh data/cache/preprocessed/

# Validate the data format
import torch
batch_file = "data/cache/preprocessed/batch_000000_000100.pt"
samples = torch.load(batch_file)
state, action, metadata = samples[0]
print("Data format check:")
print(f"  State shape: {state.shape}")  # expected: [15, 22, 22]
print(f"  Action type: {type(action)}")
print(f"  Sample count: {len(samples)}")


total 5215656
drwxr-xr-x 2 root root       4096 Sep 25 12:51 .
drwxr-xr-x 3 root root       4096 Sep 25 12:48 ..
-rw-r--r-- 1 root root 1696595759 Sep 25 12:48 batch_000000_000100.pt
-rw-r--r-- 1 root root 1451797799 Sep 25 12:49 batch_000100_000200.pt
-rw-r--r-- 1 root root 1718752083 Sep 25 12:51 batch_000200_000300.pt
-rw-r--r-- 1 root root  473657799 Sep 25 12:52 batch_000300_000400.pt
5.0G	data/cache/preprocessed/
Data format check:
  State shape: torch.Size([15, 22, 22])
  Action type: <class 'int'>
  Sample count: 57816
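
Each action is stored as a single `int`. With a 22x22 map and five choices per cell, a flat index in the range 0 to 2,419 can be mapped back to a cell and a direction. The sketch below assumes a row-major layout with the direction as the fastest-varying component, that is `action = (row * width + col) * 5 + direction`. The real encoding used by the preprocessing code is not shown in this notebook and may differ, so `decode_action` and `encode_action` are hypothetical helpers.

```python
MAP_H, MAP_W, NUM_DIRS = 22, 22, 5  # 22x22 map, 5 choices per cell

def decode_action(action):
    """Split a flat action index into (row, col, direction).

    Assumed layout: action = (row * MAP_W + col) * NUM_DIRS + direction.
    """
    cell, direction = divmod(action, NUM_DIRS)
    row, col = divmod(cell, MAP_W)
    return row, col, direction

def encode_action(row, col, direction):
    """Inverse of decode_action under the same assumed layout."""
    return (row * MAP_W + col) * NUM_DIRS + direction

# 447 is the action value of the first sample printed later in this notebook.
decoded = decode_action(447)
print(decoded)
```

A round trip through `encode_action(*decode_action(a))` returning `a` for every index is a cheap sanity check once the true layout is known.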


In [16]:
# 1. Import base dependencies
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader, random_split, Dataset
from pathlib import Path
from tqdm.notebook import tqdm
import numpy as np

# 2. Define the CNN model directly in the notebook
class GeneralsCNN(nn.Module):
    """CNN model for Generals.io."""

    def __init__(self, input_channels=15, map_size=(22, 22)):
        super().__init__()

        # Action space size (5 directions per cell)
        self.action_space_size = map_size[0] * map_size[1] * 5

        # CNN feature extractor
        self.features = nn.Sequential(
            # Block 1
            nn.Conv2d(input_channels, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            # 2x2 pool with stride 1 and padding 1: 22x22 -> 23x23
            nn.MaxPool2d(2, stride=1, padding=1),

            # Block 2
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            # Same pooling again: 23x23 -> 24x24
            nn.MaxPool2d(2, stride=1, padding=1),

            # Block 3
            nn.Conv2d(128, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
        )

        # Adaptive pooling to a fixed 8x8 grid
        self.adaptive_pool = nn.AdaptiveAvgPool2d((8, 8))

        # Fully connected layers
        self.classifier = nn.Sequential(
            nn.Dropout(0.3),
            nn.Linear(256 * 8 * 8, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.3),
            nn.Linear(512, self.action_space_size)
        )

    def forward(self, x):
        # Feature extraction
        x = self.features(x)

        # Adaptive pooling
        x = self.adaptive_pool(x)

        # Flatten to (batch, 256 * 8 * 8)
        x = x.view(x.size(0), -1)

        # Classification over the flat action space
        action_logits = self.classifier(x)

        return {'action_logits': action_logits}

# 3. Dataset class
class GeneralsDataset(Dataset):
    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        state, action, _ = self.samples[idx]
        return state.float(), torch.tensor(action, dtype=torch.long)

# 4. Load the preprocessed data
print("📦 Loading preprocessed data...")
# Sort so the load order (and therefore all_samples[0]) is deterministic
batch_files = sorted(Path('data/cache/preprocessed').glob('batch_*.pt'))
all_samples = []

for batch_file in batch_files:
    try:
        batch_data = torch.load(batch_file, map_location='cpu')
        all_samples.extend(batch_data)
    except Exception as e:
        # A file left incomplete by an interrupted preprocessing run fails here
        print(f"⚠️ Skipping file {batch_file}: {e}")

print(f"✅ Total samples: {len(all_samples):,}")

# Check the sample format
if all_samples:
    state, action, meta = all_samples[0]
    print(f"Sample format: state={state.shape}, action={action}")

# 5. Create the data loaders
dataset = GeneralsDataset(all_samples)
train_size = int(0.8 * len(dataset))
val_size = len(dataset) - train_size

train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)

print("📊 Data split:")
print(f"   Training set: {len(train_dataset):,} samples")
print(f"   Validation set: {len(val_dataset):,} samples")

# 6. Initialize the model and training components
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"🔧 Using device: {device}")

model = GeneralsCNN(input_channels=15, map_size=(22, 22))
model.to(device)

optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

print(f"🧠 Model parameters: {sum(p.numel() for p in model.parameters()):,}")
print(f"🎯 Action space size: {model.action_space_size:,}")

# 7. Training loop
num_epochs = 20
best_val_acc = 0.0

print(f"\n🚀 Starting training ({num_epochs} epochs)...")

for epoch in range(1, num_epochs + 1):
    # Training phase
    model.train()
    train_loss = 0.0
    train_correct = 0
    train_total = 0

    pbar = tqdm(train_loader, desc=f"Epoch {epoch:2d}/{num_epochs} [Train]")
    # Count steps explicitly; pbar.n is only refreshed periodically by tqdm
    for step, (states, actions) in enumerate(pbar, start=1):
        states = states.to(device, non_blocking=True)
        actions = actions.to(device, non_blocking=True)

        optimizer.zero_grad()
        outputs = model(states)
        loss = criterion(outputs['action_logits'], actions)
        loss.backward()
        optimizer.step()

        train_loss += loss.item()
        _, predicted = outputs['action_logits'].max(1)
        train_total += actions.size(0)
        train_correct += predicted.eq(actions).sum().item()

        # Update the progress bar with the running mean loss and accuracy
        acc = 100. * train_correct / train_total
        pbar.set_postfix({'Loss': f'{train_loss/step:.4f}', 'Acc': f'{acc:.2f}%'})

    train_acc = 100. * train_correct / train_total

    # Validation phase
    model.eval()
    val_loss = 0.0
    val_correct = 0
    val_total = 0

    with torch.no_grad():
        for states, actions in tqdm(val_loader, desc=f"Epoch {epoch:2d}/{num_epochs} [Val]  "):
            states = states.to(device, non_blocking=True)
            actions = actions.to(device, non_blocking=True)

            outputs = model(states)
            loss = criterion(outputs['action_logits'], actions)

            val_loss += loss.item()
            _, predicted = outputs['action_logits'].max(1)
            val_total += actions.size(0)
            val_correct += predicted.eq(actions).sum().item()

    val_acc = 100. * val_correct / val_total

    # Print the epoch summary
    print(f"Epoch {epoch:2d}: Train Loss {train_loss/len(train_loader):.4f} | Train Acc {train_acc:.2f}% | Val Acc {val_acc:.2f}%")

    # Save the best model so far
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'val_accuracy': val_acc,
        }, 'best_generals_model.pth')
        print(f"✅ Saved new best model (validation accuracy: {val_acc:.2f}%)")

print(f"\n🎉 Training complete! Best validation accuracy: {best_val_acc:.2f}%")
print("📁 Model saved to: 'best_generals_model.pth'")


📦 Loading preprocessed data...
⚠️ Skipping file data/cache/preprocessed/batch_000300_000400.pt: PytorchStreamReader failed locating file data/15962: file not found
✅ Total samples: 165,861
Sample format: state=torch.Size([15, 22, 22]), action=447
📊 Data split:
   Training set: 132,688 samples
   Validation set: 33,173 samples
🔧 Using device: cuda
🧠 Model parameters: 10,009,204
🎯 Action space size: 2,420

🚀 Starting training (20 epochs)...


Epoch  1/20 [Train]:   0%|          | 0/4147 [00:00<?, ?it/s]

Epoch  1/20 [Val]  :   0%|          | 0/519 [00:00<?, ?it/s]

Epoch  1: Train Loss 7.3359 | Train Acc 0.17% | Val Acc 0.25%
✅ Saved new best model (validation accuracy: 0.25%)
Epoch  2: Train Loss 7.1395 | Train Acc 0.32% | Val Acc 0.47%
✅ Saved new best model (validation accuracy: 0.47%)
Epoch  3: Train Loss 7.0200 | Train Acc 0.46% | Val Acc 0.65%
✅ Saved new best model (validation accuracy: 0.65%)
Epoch  4: Train Loss 6.9050 | Train Acc 0.65% | Val Acc 0.91%
✅ Saved new best model (validation accuracy: 0.91%)
Epoch  5: Train Loss 6.8097 | Train Acc 0.80% | Val Acc 1.09%
✅ Saved new best model (validation accuracy: 1.09%)
Epoch  6: Train Loss 6.7290 | Train Acc 0.92% | Val Acc 1.15%
✅ Saved new best model (validation accuracy: 1.15%)
Epoch  7: Train Loss 6.6655 | Train Acc 0.99% | Val Acc 1.15%
Epoch  8: Train Loss 6.6268 | Train Acc 1.05% | Val Acc 1.32%
✅ Saved new best model (validation accuracy: 1.32%)
Epoch  9: Train Loss 6.5951 | Train Acc 1.02% | Val Acc 1.24%
Epoch 10: Train Loss 6.5746 | Train Acc 1.14% | Val Acc 1.27%
Epoch 11: Train Loss 6.5531 | Train Acc 1.10% | Val Acc 1.36%
✅ Saved new best model (validation accuracy: 1.36%)
Epoch 12: Train Loss 6.5416 | Train Acc 1.08% | Val Acc 1.28%
Epoch 13: Train Loss 6.5261 | Train Acc 1.17% | Val Acc 1.21%
Epoch 14: Train Loss 6.5196 | Train Acc 1.16% | Val Acc 1.42%
✅ Saved new best model (validation accuracy: 1.42%)
Epoch 15: Train Loss 6.5106 | Train Acc 1.20% | Val Acc 1.34%
Epoch 16: Train Loss 6.5051 | Train Acc 1.19% | Val Acc 1.37%
Epoch 17: Train Loss 6.4928 | Train Acc 1.17% | Val Acc 1.43%
✅ Saved new best model (validation accuracy: 1.43%)
Epoch 18: Train Loss 6.4864 | Train Acc 1.21% | Val Acc 1.47%
✅ Saved new best model (validation accuracy: 1.47%)
Epoch 19: Train Loss 6.4813 | Train Acc 1.28% | Val Acc 1.39%
Epoch 20: Train Loss 6.4756 | Train Acc 1.24% | Val Acc 1.37%

🎉 Training complete! Best validation accuracy: 1.47%
📁 Model saved to: 'best_generals_model.pth'
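
To put these numbers in context, it may help to compare them with a uniform random guess over the 2,420 actions, which would score an accuracy of 1/2420 and a cross-entropy of ln(2420). The sketch below only does that arithmetic on the values reported in the log. The variable names are illustrative, and a uniform guess is just one possible baseline: if the labels are concentrated on a few actions, a majority-class baseline could be considerably stronger.

```python
import math

num_actions = 22 * 22 * 5               # 2,420, the reported action space size
chance_acc = 100.0 / num_actions        # accuracy of a uniform guess, in percent
uniform_loss = math.log(num_actions)    # cross-entropy of a uniform prediction, in nats

best_val_acc = 1.47                     # best validation accuracy from the log, in percent
final_train_loss = 6.4756               # training loss at epoch 20 from the log

ratio = best_val_acc / chance_acc
loss_gap = uniform_loss - final_train_loss

print(f"chance accuracy: {chance_acc:.3f}%")
print(f"uniform loss:    {uniform_loss:.4f}")
print(f"best val acc is {ratio:.1f}x chance, loss is {loss_gap:.2f} nats below uniform")
```

Under these assumptions the best validation accuracy is roughly 36 times the uniform chance level, and the final training loss sits about 1.3 nats below the uniform value.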

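The reported 10,009,204 parameters can be reproduced by hand from the layer shapes in `GeneralsCNN`, which also shows where most of the weights sit. This is a rough sketch in plain Python; the helper names `conv_params`, `bn_params` and `linear_params` are made up for illustration. BatchNorm running statistics are buffers rather than parameters, so they are not counted.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus bias of a k x k Conv2d layer."""
    return c_in * c_out * k * k + c_out

def bn_params(channels):
    """Learnable scale and shift of a BatchNorm2d layer."""
    return 2 * channels

def linear_params(n_in, n_out):
    """Weights plus bias of a Linear layer."""
    return n_in * n_out + n_out

# Three conv blocks: 15 -> 64 -> 128 -> 256 channels
features = (conv_params(15, 64) + bn_params(64)
            + conv_params(64, 128) + bn_params(128)
            + conv_params(128, 256) + bn_params(256))

fc1 = linear_params(256 * 8 * 8, 512)    # flattened 8x8x256 features -> 512
fc2 = linear_params(512, 22 * 22 * 5)    # 512 -> 2,420 action logits

total = features + fc1 + fc2
print(f"{total:,} parameters, first Linear layer share: {fc1 / total:.1%}")
```

By this count the first fully connected layer accounts for well over 80% of the parameters, while the convolutional stack contributes under 4%.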