torch.optim is PyTorch's optimizer module. It updates a neural network's learnable parameters (tensors that require gradients).   

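# 1. What is torch.optim?   
torch.optim provides a set of optimizer classes. Each optimizer takes the gradients computed during the backward pass and uses them to update the model's parameters.   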
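
As a minimal sketch of what "using gradients to update parameters" means, the snippet below applies one plain SGD step to a single tensor and compares it with the hand-computed update `w - lr * grad`. The tensor `w`, the toy loss, and the learning rate are illustrative assumptions, not part of the original notebook.

```python
import torch

# A single learnable parameter; requires_grad=True makes autograd track it.
w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# Toy loss (assumed for illustration): sum of squares, so d(loss)/dw = 2 * w.
loss = (w ** 2).sum()
loss.backward()

# Hand-computed plain SGD update: w - lr * grad.
expected = w.detach() - 0.1 * w.grad

# The optimizer applies the same update in place.
optimizer.step()

print(w.detach())
print(torch.allclose(w.detach(), expected))
```

With momentum, weight decay, or a different optimizer class, the update rule changes, but the pattern of `backward()` followed by `step()` stays the same.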

# 2. Where it fits in the PyTorch training loop

```python
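# 1. Forward pass
output = model(input)

# 2. Compute the loss
loss = loss_fn(output, target)

# 3. Backward pass: compute gradients
loss.backward()

# 4. Update the weights with the optimizer
optimizer.step()

# 5. Clear the gradients; otherwise they accumulate across iterations
optimizer.zero_grad()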
```
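
The accumulation mentioned in step 5 can be observed directly. The sketch below (with an assumed one-element parameter `w` and a toy loss) calls `backward()` twice without clearing, then calls `zero_grad()`. Depending on the PyTorch version, `zero_grad()` either sets `.grad` to `None` or fills it with zeros, so the check accepts both.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# First backward pass: d(3 * w)/dw = 3.
(3 * w).sum().backward()
first = w.grad.clone()

# Second backward pass without zero_grad(): the new gradient is added to the old one.
(3 * w).sum().backward()
second = w.grad.clone()

# Clearing the gradients resets the accumulated value.
optimizer.zero_grad()
grad_is_cleared = w.grad is None or bool((w.grad == 0).all())

print(first, second, grad_is_cleared)
```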

# 3. Commonly used optimizers (classes in torch.optim)     
![Commonly used optimizers](./images/Optim.png)

# 4. Common optimizer parameters
![Common optimizer parameters](./images/optism2.png)

In [1]:
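# optim.SGD
#     params,                        parameters to optimize, usually model.parameters()
#     lr,                            learning rate; scales the size of each parameter update
#     momentum=0,                    momentum factor; damps oscillation and speeds up convergence, commonly set to 0.9
#     dampening=0,                   dampening applied to the gradient's contribution to the momentum buffer (rarely changed)
#     weight_decay=0,                weight decay (L2 penalty) added to the gradient; helps limit overfitting
#     nesterov=False,                whether to use Nesterov momentum (requires momentum > 0 and dampening = 0)
#     maximize=False,                if True, maximize the objective (gradient ascent), e.g. for some reinforcement learning objectives
#     foreach=None,                  whether to use the multi-tensor (foreach) implementation, which is usually faster
#     differentiable=False           whether autograd should track the optimizer step itself (rarely needed)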
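
To make `momentum` and `weight_decay` concrete, here is a hedged sketch that runs three SGD steps and compares the result with a hand-written version of the update rule described in the PyTorch documentation (with `dampening=0` and `nesterov=False`). The toy loss and the hyperparameter values are assumptions chosen for illustration.

```python
import torch

lr, momentum, weight_decay = 0.1, 0.9, 0.01

w = torch.tensor([1.0, 2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=lr, momentum=momentum, weight_decay=weight_decay)

# Manual reference copy of the parameter and its momentum buffer.
w_ref = w.detach().clone()
buf = None

for _ in range(3):
    optimizer.zero_grad()
    loss = (w ** 2).sum()          # toy loss, so the gradient is 2 * w
    loss.backward()
    optimizer.step()

    # Same step by hand: add the L2 term, update the buffer, then move the weights.
    g = 2 * w_ref + weight_decay * w_ref
    buf = g.clone() if buf is None else momentum * buf + g  # dampening = 0
    w_ref = w_ref - lr * buf

matches = torch.allclose(w.detach(), w_ref, atol=1e-6)
print(matches)
```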

In [1]:
import torch
import torchvision
from torch import nn
from collections import OrderedDict

In [2]:
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = nn.Sequential(OrderedDict([
            ('conv1', nn.Conv2d(in_channels=3, out_channels=32, kernel_size=5, padding=2)),
            ('pool1', nn.MaxPool2d(kernel_size=2)),
            ('conv2', nn.Conv2d(in_channels=32, out_channels=32, kernel_size=5, padding=2)),
            ('pool2', nn.MaxPool2d(kernel_size=2)),
            ('conv3', nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, padding=2)),
            ('pool3', nn.MaxPool2d(kernel_size=2)),
            ('flatten', nn.Flatten()),
            ('fc1', nn.Linear(1024, 64)),  # Note: 1024 = 64 channels × 4 × 4 (for 32x32 inputs)
            ('fc2', nn.Linear(64, 10))
        ]))

    def forward(self, x):
        return self.model1(x)
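
The 1024 input features of `fc1` can be checked by pushing a dummy batch through just the convolution and pooling layers. This is a small sanity-check sketch; the `features` and `dummy` names are introduced here for illustration only.

```python
import torch
from torch import nn

# Same conv/pool stack as Tudui, stopping after Flatten.
features = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, padding=2),
    nn.MaxPool2d(2),   # 32x32 -> 16x16
    nn.Conv2d(32, 32, kernel_size=5, padding=2),
    nn.MaxPool2d(2),   # 16x16 -> 8x8
    nn.Conv2d(32, 64, kernel_size=5, padding=2),
    nn.MaxPool2d(2),   # 8x8 -> 4x4
    nn.Flatten(),
)

# One CIFAR10-sized image: 3 channels, 32x32 pixels.
dummy = torch.zeros(1, 3, 32, 32)
flat = features(dummy)
print(flat.shape)  # torch.Size([1, 1024])
```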

In [3]:
dataset = torchvision.datasets.CIFAR10("../datasets/CIFAR10/", train = False, transform = torchvision.transforms.ToTensor(), download = True)
dataloader = torch.utils.data.DataLoader(dataset, batch_size = 64)

Files already downloaded and verified


In [4]:
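tudui = Tudui()
loss = nn.CrossEntropyLoss()
optim = torch.optim.SGD(tudui.parameters(), lr = 0.01)
for epoch in range(20):
    running_loss = 0.0                       # total loss over this epoch
    for data in dataloader:
        imgs, targets = data
        outputs = tudui(imgs)                # forward pass
        result_loss = loss(outputs, targets)
        optim.zero_grad()                    # clear gradients left over from the previous batch
        result_loss.backward()               # compute gradients for this batch
        optim.step()                         # update the parameters
        # Note: this sums tensors that still carry grad_fn (see the output below);
        # result_loss.item() would accumulate a plain Python float instead.
        running_loss = running_loss + result_loss
    print(running_loss)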

tensor(360.2856, grad_fn=<AddBackward0>)
tensor(355.2269, grad_fn=<AddBackward0>)
tensor(338.3457, grad_fn=<AddBackward0>)
tensor(320.0082, grad_fn=<AddBackward0>)


KeyboardInterrupt: 
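
The printed totals above are tensors with `grad_fn=<AddBackward0>` because the loss tensors were summed directly. A common alternative is to accumulate `result_loss.item()`, which yields a plain float. The sketch below shows that pattern on a tiny synthetic problem; the linear model, the random data, and the step count are assumptions for illustration, not the CIFAR10 setup used above.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny stand-in model and data (hypothetical, not the notebook's network).
model = nn.Linear(4, 3)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(8, 4)
targets = torch.randint(0, 3, (8,))

running_loss = 0.0
for _ in range(5):
    outputs = model(inputs)
    batch_loss = loss_fn(outputs, targets)
    optimizer.zero_grad()
    batch_loss.backward()
    optimizer.step()
    # .item() extracts a Python float, so no autograd history is carried along.
    running_loss += batch_loss.item()

print(type(running_loss).__name__, running_loss)
```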