# 搭建一个完成深度学习训练框架

## 损失函数

- 功能：衡量模型预测值与真实值之间的差异，为优化算法提供优化方向和依据。
- 回归损失函数：
    - 均方误差（MSE）：对预测值与真实值的误差平方求均值，强调较大误差的惩罚。
    - 均绝误差（MAE）：对预测值与真实值的绝对误差求均值，更鲁棒但对大误差不敏感。
    - Huber 损失：结合 MSE 和 MAE，适用于对异常值更鲁棒的场景。
- 分类损失函数：
    - 交叉熵损失（Cross-Entropy Loss）：衡量预测分布与真实分布之间的差异，适用于二分类或多分类任务。
    - 二元交叉熵（Binary Cross-Entropy）：专用于二分类任务的交叉熵损失函数。
    - 稀疏交叉熵（Sparse Categorical Cross-Entropy）：用于多分类任务，支持整数标签形式。

In [1]:
import torch
import torch.nn as nn
loss = nn.MSELoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
output = loss(input, target)
output.backward()

## 优化器

- 功能：优化器用于根据损失函数的梯度信息更新模型参数，从而使损失函数的值逐步减小。优化器是深度学习模型训练的关键组件，它通过迭代调整权重参数，帮助模型更快更准确地拟合数据。
- SGD：适用于简单模型或需要较强正则化的任务。
- Adam：适合大多数深度学习任务，尤其是 NLP、CV 和复杂模型。
- RMSProp：多用于 RNN、LSTM 等处理序列数据的任务。
- AdamW：适用于需要更好正则化的任务，如深度网络训练。

In [None]:
from torch import optim
model = nn.Linear(5, 3)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

## 深度学习完整流程

### 1.导入相关库

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

### 2.数据处理

In [2]:
# 数据预处理
transform = transforms.Compose([
    transforms.Resize(64),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# 加载数据
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                      download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64,
                                        shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                     download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64,
                                       shuffle=False, num_workers=2)

100%|██████████| 170M/170M [00:23<00:00, 7.23MB/s] 


### 3.模型搭建

In [4]:
class AlexNet(nn.Module):
    def __init__(self, num_classes=10):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(64, 192, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 4 * 4, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

### 4.模型训练

In [5]:
# 设置设备
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# 初始化模型
net = AlexNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)

In [6]:
num_epochs = 2
for epoch in range(num_epochs):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)

        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 100 == 99:
            print(f'[{epoch + 1},{i + 1:5d}] loss: {running_loss / 100:.3f}')
            running_loss = 0.0

print('Finished Training')

[1,   100] loss: 2.303
[1,   200] loss: 2.303
[1,   300] loss: 2.301
[1,   400] loss: 2.269
[1,   500] loss: 2.071
[1,   600] loss: 1.917
[1,   700] loss: 1.803
[2,   100] loss: 1.683
[2,   200] loss: 1.587
[2,   300] loss: 1.548
[2,   400] loss: 1.483
[2,   500] loss: 1.497
[2,   600] loss: 1.433
[2,   700] loss: 1.394
Finished Training


### 5模型验证

In [7]:
# 测试模型
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy on test set: {100 * correct / total}%')

Accuracy on test set: 51.21%
