# HW3-2: MNIST 手写数字识别 （80分）

- 本次作业的主题是使用深度学习技术对`MNIST`手写数字数据集进行分类。
- 要求使用两种不同的深度学习模型，分别是多层感知机--`MLP`和卷积神经网络--`CNN`。
- 本次作业的目的是让同学们了解深度学习模型的构建和训练过程，以及对经典视觉任务处理效果的对比。

## 评分标准（更细致的评分标准见对应部分）

- 数据读取和预处理：5分
- 基于`MLP`的手写数字分类：30分 
- 基于`CNN`的手写数字分类：30分
- 结果讨论与对比分析：15分


---

## 第一部分：数据读取和预处理

你可以在Canvas上下载如下两个文件：
- 训练数据：`mnist_train.csv`
- 测试数据：`mnist_test.csv`

这两个文件包含了Mnist数据集的训练集和测试集。每一行的第一个数字表示标签，后面的784数字是28x28的图片像素值。
- 请你正确读取并分别展示训练集和测试集的第一个样本（图像+label）。

### 评分细则
- 数据读取：2分
- 数据展示：3分

In [1]:
import pandas as pd
import numpy as np

train_df = pd.read_csv('mnist_train.csv')
test_df = pd.read_csv('mnist_test.csv')

train_df.head()
test_df.head()
# Code Here

Unnamed: 0,7,0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,...,0.658,0.659,0.660,0.661,0.662,0.663,0.664,0.665,0.666,0.667
0,2,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,4,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


---
## 第二部分：基于`MLP`的手写数字分类

- 请你构建一个多层感知机模型，在Mnist数据集上进行训练和测试。
- 可以使用`Pytorch`或者`Tensorflow`等深度学习框架。
- 请你展示模型的训练过程（Loss曲线）和测试结果（分类精度）。

### 评分细则
- 模型构建：10分
- 模型训练：10分 - 要求Loss曲线收敛
- 模型测试：10分 - 要求分类精度达到90%以上

In [None]:
# Code here
import torch
import torch.nn as nn
import tqdm as tqdm
from sklearn.model_selection import train_test_split


class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)  # 28*28=784
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)  # 10 classes for digits 0-9

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x
    
# Convert DataFrame to PyTorch tensors
def df_to_tensor(df):
    labels   = torch.as_tensor(df.iloc[:, 0].values, dtype=torch.long)
    features = torch.as_tensor(df.iloc[:, 1:].values, dtype=torch.float32)
    return features, labels

model=SimpleNN()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

learning_rate = 0.001
batch_size = 64

optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

train_features, train_labels = df_to_tensor(train_df)

train_features = train_features.to(device)
train_labels   = train_labels.to(device)
train_dataset = torch.utils.data.TensorDataset(train_features, train_labels)
train_dataset, val_dataset = train_test_split(train_dataset, test_size=0.2, random_state=42)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

def train_model(model, train_loader,val_loader, criterion, optimizer, epochs=10):
    model.train()
    for epoch in range(epochs):
        running_loss = 0.0
        correct     = 0
        total       = 0
        
        for features, labels in tqdm.tqdm(train_loader, desc=f'Epoch {epoch+1}/{epochs}'):
            optimizer.zero_grad()
            
            outputs = model(features)                   
            loss    = criterion(outputs, labels)        
            loss.backward()                            
            optimizer.step()                            

            batch_size     = labels.size(0)
            running_loss  += loss.item() * batch_size
            
            preds          = outputs.argmax(dim=1)
            correct      += (preds == labels).sum().item()
            total        += batch_size
        
        # 计算 epoch 平均损失和准确率
        epoch_loss = running_loss / total
        accuracy   = 100 * correct / total
        
        print(
            f'Epoch {epoch+1}/{epochs} — '
            f'Loss: {epoch_loss:.4f} — '
            f'Accuracy: {accuracy:.2f}%'
        )
        
        for features, labels in val_loader:
            model.eval()
            outputs = model(features)
            loss = criterion(outputs, labels)
            preds = outputs.argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
        print(f'Validation Loss: {loss.item():.4f} — Validation Accuracy: {100 * correct / total:.2f}%')
            

test_features, test_labels = df_to_tensor(test_df)
test_features = test_features.to(device)
test_labels   = test_labels.to(device)        
test_dataset = torch.utils.data.TensorDataset(test_features, test_labels)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

def evaluate_model(model, test_loader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for features, labels in test_loader:
            outputs = model(features)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    accuracy = 100 * correct / total
    print(f'test Accuracy: {accuracy:.2f}%')

train_model(model, train_loader,val_loader, criterion, optimizer, epochs=10)
evaluate_model(model, test_loader)


Epoch 1/10: 100%|██████████| 750/750 [00:01<00:00, 692.59it/s]


Epoch 1/10 — Loss: 0.3506 — Accuracy: 90.99%
Validation Loss: 0.2932 — Validation Accuracy: 91.47%


Epoch 2/10: 100%|██████████| 750/750 [00:00<00:00, 835.63it/s]


Epoch 2/10 — Loss: 0.1576 — Accuracy: 95.36%
Validation Loss: 0.3595 — Validation Accuracy: 95.40%


Epoch 3/10: 100%|██████████| 750/750 [00:00<00:00, 799.34it/s]


Epoch 3/10 — Loss: 0.1169 — Accuracy: 96.45%
Validation Loss: 0.0695 — Validation Accuracy: 96.25%


Epoch 4/10: 100%|██████████| 750/750 [00:00<00:00, 777.68it/s]


Epoch 4/10 — Loss: 0.0956 — Accuracy: 97.08%
Validation Loss: 0.2530 — Validation Accuracy: 96.90%


Epoch 5/10: 100%|██████████| 750/750 [00:00<00:00, 785.86it/s]


Epoch 5/10 — Loss: 0.0847 — Accuracy: 97.40%
Validation Loss: 0.2907 — Validation Accuracy: 97.18%


Epoch 6/10: 100%|██████████| 750/750 [00:00<00:00, 786.65it/s]


Epoch 6/10 — Loss: 0.0843 — Accuracy: 97.46%
Validation Loss: 0.5814 — Validation Accuracy: 97.26%


Epoch 7/10: 100%|██████████| 750/750 [00:00<00:00, 752.64it/s]


Epoch 7/10 — Loss: 0.0762 — Accuracy: 97.74%
Validation Loss: 0.2528 — Validation Accuracy: 97.45%


Epoch 8/10: 100%|██████████| 750/750 [00:00<00:00, 754.27it/s]


Epoch 8/10 — Loss: 0.0707 — Accuracy: 97.89%
Validation Loss: 0.3973 — Validation Accuracy: 97.51%


Epoch 9/10: 100%|██████████| 750/750 [00:00<00:00, 769.97it/s]


Epoch 9/10 — Loss: 0.0621 — Accuracy: 98.13%
Validation Loss: 0.8199 — Validation Accuracy: 97.83%


Epoch 10/10: 100%|██████████| 750/750 [00:00<00:00, 764.53it/s]


Epoch 10/10 — Loss: 0.0674 — Accuracy: 98.07%
Validation Loss: 1.0631 — Validation Accuracy: 97.75%
Accuracy: 96.71%


In [11]:
import torch
import torch.nn as nn
import tqdm
from sklearn.model_selection import StratifiedKFold
from torch.utils.data import TensorDataset, DataLoader, Subset

# 假设 train_features/train_labels 都是 CPU 上的 Tensor
train_features, train_labels = df_to_tensor(train_df)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# 1. 整体 Dataset
full_dataset = TensorDataset(train_features, train_labels)

# 2. 定义五折拆分器
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

fold_metrics = []

for fold, (train_idx, val_idx) in enumerate(
        skf.split(train_features.numpy(), train_labels.numpy()), 1):
    
    print(f"\n>>> Fold {fold}")
    # 3. 构造每折的 DataLoader
    train_subset = Subset(full_dataset, train_idx)
    val_subset   = Subset(full_dataset, val_idx)
    train_loader = DataLoader(train_subset, batch_size=64, shuffle=True)
    val_loader   = DataLoader(val_subset,   batch_size=64, shuffle=False)
    
    # 4. 每折都要重新初始化模型和优化器
    model = SimpleNN().to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    
    # 5. 训练若干 epoch
    for epoch in range(10):
        model.train()
        total_loss = total_correct = total_samples = 0
        for x, y in tqdm.tqdm(train_loader, desc=f"Fold{fold} Epoch{epoch+1}"):
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            logits = model(x)
            loss = criterion(logits, y)
            loss.backward()
            optimizer.step()
            
            preds = logits.argmax(dim=1)
            total_correct += (preds == y).sum().item()
            total_loss   += loss.item() * y.size(0)
            total_samples+= y.size(0)
        
        print(f"  Train — loss: {total_loss/total_samples:.4f}, "
              f"acc: {100*total_correct/total_samples:.2f}%")
        
        # 验证
        model.eval()
        val_correct = val_samples = 0
        with torch.no_grad():
            for x, y in val_loader:
                x, y = x.to(device), y.to(device)
                logits = model(x)
                preds = logits.argmax(dim=1)
                val_correct += (preds == y).sum().item()
                val_samples += y.size(0)
        val_acc = 100 * val_correct / val_samples
        print(f"  Val   — acc: {val_acc:.2f}%")
    
    fold_metrics.append(val_acc)

# 6. 平均五折准确率
print(f"\nAverage 5-fold val accuracy: {sum(fold_metrics)/len(fold_metrics):.2f}%")



>>> Fold 1


Fold1 Epoch1: 100%|██████████| 750/750 [00:01<00:00, 515.80it/s]


  Train — loss: 0.3531, acc: 90.56%
  Val   — acc: 94.18%


Fold1 Epoch2: 100%|██████████| 750/750 [00:01<00:00, 475.26it/s]


  Train — loss: 0.1555, acc: 95.42%
  Val   — acc: 95.13%


Fold1 Epoch3: 100%|██████████| 750/750 [00:01<00:00, 462.40it/s]


  Train — loss: 0.1228, acc: 96.34%
  Val   — acc: 95.88%


Fold1 Epoch4: 100%|██████████| 750/750 [00:01<00:00, 447.11it/s]


  Train — loss: 0.1068, acc: 96.85%
  Val   — acc: 95.93%


Fold1 Epoch5: 100%|██████████| 750/750 [00:01<00:00, 439.29it/s]


  Train — loss: 0.0918, acc: 97.20%
  Val   — acc: 95.68%


Fold1 Epoch6: 100%|██████████| 750/750 [00:01<00:00, 429.64it/s]


  Train — loss: 0.0884, acc: 97.33%
  Val   — acc: 95.90%


Fold1 Epoch7: 100%|██████████| 750/750 [00:01<00:00, 416.89it/s]


  Train — loss: 0.0760, acc: 97.66%
  Val   — acc: 96.63%


Fold1 Epoch8: 100%|██████████| 750/750 [00:01<00:00, 418.66it/s]


  Train — loss: 0.0719, acc: 97.82%
  Val   — acc: 96.47%


Fold1 Epoch9: 100%|██████████| 750/750 [00:01<00:00, 420.20it/s]


  Train — loss: 0.0762, acc: 97.74%
  Val   — acc: 96.63%


Fold1 Epoch10: 100%|██████████| 750/750 [00:01<00:00, 422.41it/s]


  Train — loss: 0.0604, acc: 98.12%
  Val   — acc: 96.71%

>>> Fold 2


Fold2 Epoch1: 100%|██████████| 750/750 [00:01<00:00, 457.61it/s]


  Train — loss: 0.3854, acc: 90.44%
  Val   — acc: 94.09%


Fold2 Epoch2: 100%|██████████| 750/750 [00:01<00:00, 426.13it/s]


  Train — loss: 0.1509, acc: 95.51%
  Val   — acc: 95.62%


Fold2 Epoch3: 100%|██████████| 750/750 [00:01<00:00, 415.89it/s]


  Train — loss: 0.1158, acc: 96.55%
  Val   — acc: 95.21%


Fold2 Epoch4: 100%|██████████| 750/750 [00:01<00:00, 423.14it/s]


  Train — loss: 0.1059, acc: 96.80%
  Val   — acc: 95.04%


Fold2 Epoch5: 100%|██████████| 750/750 [00:01<00:00, 447.30it/s]


  Train — loss: 0.0903, acc: 97.22%
  Val   — acc: 96.24%


Fold2 Epoch6: 100%|██████████| 750/750 [00:01<00:00, 469.89it/s]


  Train — loss: 0.0811, acc: 97.53%
  Val   — acc: 96.19%


Fold2 Epoch7: 100%|██████████| 750/750 [00:01<00:00, 482.52it/s]


  Train — loss: 0.0744, acc: 97.71%
  Val   — acc: 96.30%


Fold2 Epoch8: 100%|██████████| 750/750 [00:01<00:00, 473.34it/s]


  Train — loss: 0.0707, acc: 97.89%
  Val   — acc: 96.55%


Fold2 Epoch9: 100%|██████████| 750/750 [00:01<00:00, 475.71it/s]


  Train — loss: 0.0714, acc: 97.83%
  Val   — acc: 96.53%


Fold2 Epoch10: 100%|██████████| 750/750 [00:01<00:00, 425.37it/s]


  Train — loss: 0.0620, acc: 98.13%
  Val   — acc: 96.39%

>>> Fold 3


Fold3 Epoch1: 100%|██████████| 750/750 [00:01<00:00, 531.81it/s]


  Train — loss: 0.3744, acc: 90.25%
  Val   — acc: 94.25%


Fold3 Epoch2: 100%|██████████| 750/750 [00:01<00:00, 526.14it/s]


  Train — loss: 0.1561, acc: 95.42%
  Val   — acc: 95.61%


Fold3 Epoch3: 100%|██████████| 750/750 [00:01<00:00, 503.52it/s]


  Train — loss: 0.1224, acc: 96.27%
  Val   — acc: 94.95%


Fold3 Epoch4: 100%|██████████| 750/750 [00:01<00:00, 572.33it/s]


  Train — loss: 0.1040, acc: 96.91%
  Val   — acc: 95.98%


Fold3 Epoch5: 100%|██████████| 750/750 [00:01<00:00, 535.68it/s]


  Train — loss: 0.0925, acc: 97.21%
  Val   — acc: 95.92%


Fold3 Epoch6: 100%|██████████| 750/750 [00:01<00:00, 498.32it/s]


  Train — loss: 0.0822, acc: 97.61%
  Val   — acc: 95.47%


Fold3 Epoch7: 100%|██████████| 750/750 [00:01<00:00, 494.89it/s]


  Train — loss: 0.0847, acc: 97.43%
  Val   — acc: 96.38%


Fold3 Epoch8: 100%|██████████| 750/750 [00:01<00:00, 471.54it/s]


  Train — loss: 0.0721, acc: 97.88%
  Val   — acc: 96.70%


Fold3 Epoch9: 100%|██████████| 750/750 [00:01<00:00, 469.74it/s]


  Train — loss: 0.0628, acc: 98.05%
  Val   — acc: 95.85%


Fold3 Epoch10: 100%|██████████| 750/750 [00:01<00:00, 473.64it/s]


  Train — loss: 0.0586, acc: 98.24%
  Val   — acc: 96.56%

>>> Fold 4


Fold4 Epoch1: 100%|██████████| 750/750 [00:01<00:00, 496.35it/s]


  Train — loss: 0.3461, acc: 90.80%
  Val   — acc: 94.38%


Fold4 Epoch2: 100%|██████████| 750/750 [00:01<00:00, 473.12it/s]


  Train — loss: 0.1453, acc: 95.63%
  Val   — acc: 95.43%


Fold4 Epoch3: 100%|██████████| 750/750 [00:01<00:00, 428.97it/s]


  Train — loss: 0.1155, acc: 96.46%
  Val   — acc: 96.08%


Fold4 Epoch4: 100%|██████████| 750/750 [00:01<00:00, 434.29it/s]


  Train — loss: 0.0967, acc: 97.01%
  Val   — acc: 95.94%


Fold4 Epoch5: 100%|██████████| 750/750 [00:01<00:00, 436.62it/s]


  Train — loss: 0.0928, acc: 97.15%
  Val   — acc: 95.90%


Fold4 Epoch6: 100%|██████████| 750/750 [00:01<00:00, 435.68it/s]


  Train — loss: 0.0794, acc: 97.62%
  Val   — acc: 96.13%


Fold4 Epoch7: 100%|██████████| 750/750 [00:01<00:00, 428.49it/s]


  Train — loss: 0.0742, acc: 97.80%
  Val   — acc: 96.53%


Fold4 Epoch8: 100%|██████████| 750/750 [00:01<00:00, 437.08it/s]


  Train — loss: 0.0668, acc: 98.00%
  Val   — acc: 96.69%


Fold4 Epoch9: 100%|██████████| 750/750 [00:01<00:00, 428.96it/s]


  Train — loss: 0.0632, acc: 98.08%
  Val   — acc: 96.46%


Fold4 Epoch10: 100%|██████████| 750/750 [00:01<00:00, 436.91it/s]


  Train — loss: 0.0564, acc: 98.32%
  Val   — acc: 96.92%

>>> Fold 5


Fold5 Epoch1: 100%|██████████| 750/750 [00:01<00:00, 442.95it/s]


  Train — loss: 0.3848, acc: 90.37%
  Val   — acc: 94.16%


Fold5 Epoch2: 100%|██████████| 750/750 [00:01<00:00, 434.45it/s]


  Train — loss: 0.1537, acc: 95.35%
  Val   — acc: 95.53%


Fold5 Epoch3: 100%|██████████| 750/750 [00:01<00:00, 429.37it/s]


  Train — loss: 0.1152, acc: 96.48%
  Val   — acc: 96.30%


Fold5 Epoch4: 100%|██████████| 750/750 [00:01<00:00, 430.04it/s]


  Train — loss: 0.1015, acc: 96.83%
  Val   — acc: 96.17%


Fold5 Epoch5: 100%|██████████| 750/750 [00:01<00:00, 425.96it/s]


  Train — loss: 0.0895, acc: 97.24%
  Val   — acc: 95.14%


Fold5 Epoch6: 100%|██████████| 750/750 [00:01<00:00, 420.60it/s]


  Train — loss: 0.0897, acc: 97.29%
  Val   — acc: 96.22%


Fold5 Epoch7: 100%|██████████| 750/750 [00:01<00:00, 433.41it/s]


  Train — loss: 0.0756, acc: 97.71%
  Val   — acc: 96.18%


Fold5 Epoch8: 100%|██████████| 750/750 [00:01<00:00, 443.01it/s]


  Train — loss: 0.0771, acc: 97.69%
  Val   — acc: 96.18%


Fold5 Epoch9: 100%|██████████| 750/750 [00:01<00:00, 463.89it/s]


  Train — loss: 0.0627, acc: 98.07%
  Val   — acc: 96.62%


Fold5 Epoch10: 100%|██████████| 750/750 [00:01<00:00, 469.33it/s]


  Train — loss: 0.0653, acc: 98.07%
  Val   — acc: 96.60%

Average 5-fold val accuracy: 96.63%


---
## 第三部分：基于`CNN`的手写数字分类

- 请你构建一个卷积神经网络模型，在Mnist数据集上进行训练和测试。
- 可以使用`Pytorch`或者`Tensorflow`等深度学习框架。
- 请你展示模型的训练过程（Loss曲线）和测试结果（分类精度）。

### 评分细则
- 模型构建：10分
- 模型训练：10分 - 要求Loss曲线收敛
- 模型测试：10分 - 要求分类精度达到95%以上

In [14]:
# Code here
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)  # 28x28 -> 28x28
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)  # 28x28 -> 28x28
        self.fc1 = nn.Linear(64 * 7 * 7, 128)  # 7x7 after pooling
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.max_pool2d(x, kernel_size=2)  # 28x28 -> 14x14
        x = torch.relu(self.conv2(x))
        x = torch.max_pool2d(x, kernel_size=2)  # 14x14 -> 7x7
        x = x.view(x.size(0), -1)  # Flatten
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = SimpleCNN()
model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

train_features, train_labels = df_to_tensor(train_df)
train_features = train_features.view(-1, 1, 28, 28).to(device)  # Reshape for CNN
train_labels = train_labels.to(device)
train_dataset = torch.utils.data.TensorDataset(train_features, train_labels)
train_dataset, val_dataset = train_test_split(train_dataset, test_size=0.2, random_state=42)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

test_features, test_labels = df_to_tensor(test_df)
test_features = test_features.view(-1, 1, 28, 28).to(device)  # Reshape for CNN
test_labels = test_labels.to(device)
test_dataset = torch.utils.data.TensorDataset(test_features, test_labels)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

train_model(model, train_loader, criterion, optimizer, epochs=10)
evaluate_model(model, test_loader)




Epoch 1/10: 100%|██████████| 750/750 [00:01<00:00, 680.26it/s]


Epoch 1/10 — Loss: 0.2624 — Accuracy: 94.88%
Validation Loss: 0.3528 — Validation Accuracy: 95.31%


Epoch 2/10: 100%|██████████| 750/750 [00:01<00:00, 735.44it/s]


Epoch 2/10 — Loss: 0.0559 — Accuracy: 98.26%
Validation Loss: 0.2526 — Validation Accuracy: 98.20%


Epoch 3/10: 100%|██████████| 750/750 [00:00<00:00, 753.43it/s]


Epoch 3/10 — Loss: 0.0383 — Accuracy: 98.78%
Validation Loss: 0.2985 — Validation Accuracy: 98.69%


Epoch 4/10: 100%|██████████| 750/750 [00:01<00:00, 650.64it/s]


Epoch 4/10 — Loss: 0.0325 — Accuracy: 98.98%
Validation Loss: 0.0232 — Validation Accuracy: 98.86%


Epoch 5/10: 100%|██████████| 750/750 [00:01<00:00, 643.35it/s]


Epoch 5/10 — Loss: 0.0258 — Accuracy: 99.12%
Validation Loss: 0.3805 — Validation Accuracy: 98.91%


Epoch 6/10: 100%|██████████| 750/750 [00:01<00:00, 616.25it/s]


Epoch 6/10 — Loss: 0.0257 — Accuracy: 99.18%
Validation Loss: 0.2155 — Validation Accuracy: 99.00%


Epoch 7/10: 100%|██████████| 750/750 [00:01<00:00, 615.77it/s]


Epoch 7/10 — Loss: 0.0222 — Accuracy: 99.26%
Validation Loss: 0.2321 — Validation Accuracy: 99.02%


Epoch 8/10: 100%|██████████| 750/750 [00:01<00:00, 604.91it/s]


Epoch 8/10 — Loss: 0.0208 — Accuracy: 99.35%
Validation Loss: 0.4847 — Validation Accuracy: 99.17%


Epoch 9/10: 100%|██████████| 750/750 [00:01<00:00, 648.02it/s]


Epoch 9/10 — Loss: 0.0177 — Accuracy: 99.47%
Validation Loss: 0.7306 — Validation Accuracy: 99.22%


Epoch 10/10: 100%|██████████| 750/750 [00:01<00:00, 622.62it/s]


Epoch 10/10 — Loss: 0.0197 — Accuracy: 99.42%
Validation Loss: 1.0428 — Validation Accuracy: 99.23%
Accuracy: 98.55%



---
## 第四部分：结果讨论与对比分析

- 请你对`MLP`和`CNN`两种模型的训练和测试结果进行对比分析。
- 你可以从模型的训练速度、模型的性能、模型的泛化能力等方面进行分析。
    - 训练速度：模型训练所需的时间
    - 模型性能：模型在测试集上的分类精度
    - 模型泛化能力：模型在未知数据上的表现，未知数据可以从Canvas上下载，在文件夹`unseen`中有5张手写数字图片。
- 请在下面的markdown cell中写下你的分析结果。

### 评分细则
- 结果讨论与对比分析：
    - 训练速度：4分
    - 模型性能：4分
    - 模型泛化能力：5分
    - 其他：2分