Dataset augmentation:

① Image augmentation: flipping (horizontal, vertical); cropping; color changes (brightness, contrast, saturation, hue).
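
As a minimal sketch of the operations listed above, the following pure-Python helpers apply a horizontal flip, a vertical flip, a crop, and a brightness change to a tiny grayscale image stored as a nested list. The function names and the clamping to the 0-255 range are illustrative assumptions and are not part of torchvision, which the notebook itself uses for augmentation.

```python
# Illustrative helpers (hypothetical names); an image is a list of rows of 0-255 ints.

def flip_horizontal(img):
    """Mirror left-right: reverse every row."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Mirror top-bottom: reverse the order of the rows."""
    return img[::-1]

def crop(img, top, left, height, width):
    """Keep the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def adjust_brightness(img, factor):
    """Scale every pixel by factor and clamp the result to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
flipped_lr = flip_horizontal(image)
flipped_ud = flip_vertical(image)
cropped = crop(image, 1, 1, 2, 2)
brighter = adjust_brightness(image, 2.0)
print(flipped_lr, flipped_ud, cropped, brighter, sep="\n")
```

In training, each operation is normally applied with random parameters so that the model sees a different variant of the same image in every epoch.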

In [22]:
# Linux
'''
Ubuntu 18.04.4 LTS
'''

# GPU
'''
CUDA Version: 11.6
Driver Version: 510.47.03
'''

# Python and PyTorch environment
'''
python3.10
torch=1.12.1+cu116
torchaudio=0.12.1+cu116
torchinfo=1.8.0
torchvision=0.13.1+cu116
'''


In [1]:
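# Import libraries
import math # used later for math.ceil(), which rounds up to the nearest integer

import torch
'''
torch.utils.data.DataLoader() class that loads data in batches
torch.flatten() flattens a multi-dimensional tensor (into one dimension by default)
torch.device() specifies whether to use the GPU or the CPU
.to() moves a tensor or a model to the specified device; on a GPU this mainly speeds up training and inference
torch.cuda.is_available() checks whether a GPU can be used
torch.optim.Adam() the Adam optimization algorithm, which combines the Momentum and RMSProp algorithms
torch.no_grad() disables gradient tracking, so no computation graph is built for backpropagation
torch.max() returns the maximum of the input tensor, either over the whole tensor or along a given dimension
torch.save() saves an object, such as a model's parameters, to disk
'''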

import torch.nn as nn
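'''
torch.nn.Conv2d 2-D convolution layer class
torch.nn.Module base class that every custom neural network model must inherit from
torch.nn.BatchNorm2d 2-D batch normalization class
torch.nn.ReLU ReLU activation function
torch.nn.Sequential() container that chains modules in sequence
torch.nn.MaxPool2d() max pooling
torch.nn.AdaptiveAvgPool2d() adaptive average pooling
torch.nn.Linear() fully connected layer
torch.nn.CrossEntropyLoss() loss function commonly used for multi-class classification
'''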
# from torchinfo import summary

import torchvision
'''
torchvision.datasets.CIFAR10() loads the CIFAR-10 dataset
'''

import torchvision.transforms as transforms
'''
torchvision.transforms.Compose() chains image preprocessing and augmentation operations
torchvision.transforms.Pad() pads the image
torchvision.transforms.RandomHorizontalFlip() randomly flips the image horizontally
torchvision.transforms.RandomCrop() randomly crops the image
torchvision.transforms.ToTensor() converts a PIL image or NumPy array to a PyTorch Tensor
'''


In [2]:
# 32 x 32 images
img_height = 32
img_width = 32

In [3]:
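# Image preprocessing
transform = transforms.Compose([
    transforms.Pad(4), # pad 4 pixels on every side of the image
    transforms.RandomHorizontalFlip(), # flip the image horizontally with probability p=0.5
    transforms.RandomCrop(32), # randomly crop the image to the target size of 32x32
    transforms.ToTensor() # convert the PIL image to a Tensor
])
'''
Applies the four preprocessing and augmentation operations .Pad(), .RandomHorizontalFlip(), .RandomCrop() and .ToTensor() in order
'''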
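
The combination of Pad(4) and RandomCrop(32) amounts to a random translation: padding turns the 32x32 image into 40x40, and the 32x32 crop window can then start at any of 9 offsets per axis, which shifts the content by up to 4 pixels in either direction. The sketch below only reproduces that arithmetic; the helper names are hypothetical and no torchvision call is involved.

```python
import random

def padded_size(size, pad):
    """Side length after adding `pad` pixels on both sides."""
    return size + 2 * pad

def crop_offsets(padded, crop):
    """All valid top-left coordinates along one axis for a crop window."""
    return list(range(padded - crop + 1))

side = padded_size(32, 4)              # 32 + 2 * 4
offsets = crop_offsets(side, 32)       # valid start positions along one axis
num_positions = len(offsets) ** 2      # independent choice for rows and columns
shifts = [o - 4 for o in offsets]      # offset 4 reproduces the original image

random.seed(0)                         # seed fixed only to make the sketch repeatable
top, left = random.choice(offsets), random.choice(offsets)
print(side, num_positions, shifts, (top, left))
```

Together with the horizontal flip, this gives each training image 2 x 81 possible variants.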


In [4]:
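# Download the data

# Training data
train_dataset = torchvision.datasets.CIFAR10(
    root = "/root/data/project/Kaggle/CODE/ResNet50/Dataset/",
    train = True,
    transform = transform,
    download = True
)
'''
root: the CIFAR-10 dataset is stored under /root/data/project/Kaggle/CODE/ResNet50/Dataset/
train=True: selects the training set
transform: applies the transform pipeline for data augmentation (padding, random horizontal flip, random crop, conversion to Tensor)
download=True: downloads the dataset if it is not already present locally
'''

# Test data
test_dataset = torchvision.datasets.CIFAR10(
    root = "/root/data/project/Kaggle/CODE/ResNet50/Dataset/",
    train = False,
    transform = transforms.ToTensor()
)
'''
root: the CIFAR-10 dataset is read from /root/data/project/Kaggle/CODE/ResNet50/Dataset/
train=False: selects the test set
transform: transforms.ToTensor() converts each PIL image to a Tensor; no augmentation is applied to test images
'''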

Files already downloaded and verified



In [5]:
# Hyperparameters
num_epochs = 20
batch_size = 50
learning_rate = 0.01

In [6]:
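# Load the data

# Training data
train_loader = torch.utils.data.DataLoader(
    dataset = train_dataset,
    batch_size = batch_size,
    shuffle = True
)
# Test data
test_loader = torch.utils.data.DataLoader(
    dataset = test_dataset,
    batch_size = batch_size,
    shuffle = False # accuracy does not depend on sample order, so the test set is not shuffled
)
'''
dataset: the dataset to load from
batch_size: number of samples processed in one iteration (one forward pass, backward pass and parameter update)
shuffle: if True, all samples are reshuffled before every epoch
'''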
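
The relation between dataset size, batch size and iterations per epoch can be checked with a few lines of arithmetic. This is a sketch, assuming the standard CIFAR-10 split of 50,000 training and 10,000 test images; `steps_per_epoch` is a hypothetical helper that mirrors how the length of a DataLoader is computed with and without `drop_last`.

```python
import math

def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a loader yields in one pass over the data."""
    if drop_last:
        return num_samples // batch_size        # incomplete final batch is discarded
    return math.ceil(num_samples / batch_size)  # incomplete final batch is kept

train_steps = steps_per_epoch(50_000, 50)
test_steps = steps_per_epoch(10_000, 50)
ragged = steps_per_epoch(50_000, 64)                       # last batch holds 16 samples
ragged_dropped = steps_per_epoch(50_000, 64, drop_last=True)
print(train_steps, test_steps, ragged, ragged_dropped)
```

With the batch size of 50 used here, every batch is full, so the two conventions agree.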


In [7]:
# Build a convolution layer instance (3x3 by default; kernel_size, stride and padding can be overridden)
def conv3x3(in_channels, out_channels, kernel_size = 3, stride = 1, padding = 1):
    return nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding, bias=False)
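
To see why the default arguments (3x3 kernel, stride 1, padding 1) preserve the spatial size while stride 2 halves it, the usual output-size formula for a convolution with dilation 1 can be evaluated directly. `conv_output_size` is a hypothetical helper written for this sketch.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output length along one axis: floor((size + 2*padding - kernel_size) / stride) + 1."""
    return (size + 2 * padding - kernel_size) // stride + 1

same_3x3 = conv_output_size(32, kernel_size=3, stride=1, padding=1)  # size preserved
same_1x1 = conv_output_size(32, kernel_size=1, stride=1, padding=0)  # size preserved
half_3x3 = conv_output_size(32, kernel_size=3, stride=2, padding=1)  # size halved
half_1x1 = conv_output_size(32, kernel_size=1, stride=2, padding=0)  # size halved
print(same_3x3, same_1x1, half_3x3, half_1x1)
```

The 1x1 and 3x3 variants give matching sizes for the same stride, which is what allows a strided 1x1 shortcut to be added to the output of a strided 3x3 path.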

In [8]:
# residual block
class ResidualBlock(nn.Module): # custom module; the ResidualBlock class must inherit from torch.nn.Module
    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super(ResidualBlock, self).__init__()
        self.mid_channels = out_channels//4
        self.conv1 = conv3x3(in_channels, self.mid_channels, kernel_size=1, padding=0)
        self.bn1 = nn.BatchNorm2d(self.mid_channels) # batch normalization over the self.mid_channels output channels of conv1
        self.relu = nn.ReLU(inplace=True) # ReLU instance; inplace=True overwrites the input tensor instead of allocating a new one, which saves memory
        self.conv2 = conv3x3(self.mid_channels, self.mid_channels, kernel_size=3, stride=stride, padding=1)
        self.bn2 = nn.BatchNorm2d(self.mid_channels) # batch normalization over the self.mid_channels output channels of conv2
        self.conv3 = conv3x3(self.mid_channels, out_channels, kernel_size=1, padding=0)
        self.bn3 = nn.BatchNorm2d(out_channels) # batch normalization over the out_channels output channels of conv3
        self.downsample_0 = nn.Sequential() # empty container, which acts as the identity mapping
        self.downsample = downsample


    def forward(self, x):
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out) # apply the ReLU activation
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out) # apply the ReLU activation
        out = self.conv3(out)
        out = self.bn3(out)
        if self.downsample is not None:
            residual = self.downsample(x)
        else:
            residual = self.downsample_0(x)

        out += residual
        out = self.relu(out) # apply the ReLU activation
        return out
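
The block above narrows the channel count to out_channels // 4 before the 3x3 convolution and widens it again afterwards. A rough weight count shows what this saves. The sketch counts only the bias-free convolution weights of the main path (BatchNorm parameters and the shortcut are left out), and the comparison with two full-width 3x3 convolutions is an assumption made for illustration, not something the notebook implements.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Number of weights in a bias-free 2-D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size

def bottleneck_params(in_channels, out_channels):
    """Weights of the 1x1 -> 3x3 -> 1x1 path used by ResidualBlock."""
    mid = out_channels // 4
    return (conv_params(in_channels, mid, 1)
            + conv_params(mid, mid, 3)
            + conv_params(mid, out_channels, 1))

def plain_params(channels):
    """Two 3x3 convolutions at full width, for comparison only."""
    return 2 * conv_params(channels, channels, 3)

bottleneck = bottleneck_params(256, 256)
plain = plain_params(256)
print(bottleneck, plain, round(plain / bottleneck, 1))
```

For 256 channels the bottleneck path needs roughly one seventeenth of the weights of the full-width alternative.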

In [9]:
# Define the model

# ResNet50
class ResNet(nn.Module): # custom model; the ResNet class must inherit from the base class torch.nn.Module
    def __init__(self, block, layers, num_classes = 10):
        super(ResNet, self).__init__()
        self.conv = conv3x3(3, 64, kernel_size = 3, stride = 1, padding = 1)
        self.bn = nn.BatchNorm2d(64) # batch normalization over the 64 output channels of the convolution
        self.max_pool = nn.MaxPool2d(3, 2, padding = 1) # max pooling with kernel_size=3, stride=2, padding=1
        self.relu = nn.ReLU(inplace = True) # ReLU instance; inplace=True overwrites the input tensor instead of allocating a new one, which saves memory
        self.layer1 = self.make_layer(block, 64, 256, layers[0], 1)
        self.layer2 = self.make_layer(block, 256, 512, layers[1], 2)
        self.layer3 = self.make_layer(block, 512, 1024, layers[2], 2)
        self.layer4 = self.make_layer(block, 1024, 2048, layers[3], 2)
        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1)) # adaptive average pooling; the output height x width is 1x1
        self.fc = nn.Linear(math.ceil(img_height / 32) * math.ceil(img_width / 32) * 2048, num_classes)  # fully connected layer at the end of the network; maps the flattened features to the output classes (2048 -> 10)

    def make_layer(self, block, in_channels, out_channels, blocks, stride = 1):
        downsample = None
        if (stride != 1) or (in_channels != out_channels):
            downsample = nn.Sequential( # container applied in sequence: conv3x3 (used here as a 1x1 convolution), then BatchNorm2d
                conv3x3(in_channels, out_channels, kernel_size = 1, stride = stride, padding = 0),
                nn.BatchNorm2d(out_channels) # batch normalization over the out_channels output channels of the convolution
            )
        layers = [block(in_channels, out_channels, stride, downsample)]
        for i in range(1, blocks):
            layers.append(block(out_channels, out_channels))
        return nn.Sequential(*layers) # chain the blocks in sequence

    def forward(self, x):
        out = self.conv(x)
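        out = self.bn(out) # note: self.relu is defined in __init__ but not applied here, unlike the conv-bn-relu stem of the standard ResNet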
        out = self.max_pool(out)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.avg_pool(out)
        out = torch.flatten(out, 1) # keep the batch dimension (dim 0) and flatten all remaining dimensions: (N, 2048, 1, 1) -> (N, 2048)
        out = self.fc(out)
        return out
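
The feature-map sizes produced by this forward pass for a 32x32 input can be traced by hand with the convolution and pooling size formula. The sketch below does that arithmetic without PyTorch; `out_size` and `trace_shapes` are hypothetical helpers, and each stage is represented only by the stride of its first block, since the remaining blocks keep the size unchanged.

```python
def out_size(size, kernel_size, stride, padding):
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel_size) // stride + 1

def trace_shapes(height=32, width=32):
    """Return (name, channels, height, width) after each stage of the network."""
    shapes = []
    h, w = out_size(height, 3, 1, 1), out_size(width, 3, 1, 1)
    shapes.append(("conv", 64, h, w))
    h, w = out_size(h, 3, 2, 1), out_size(w, 3, 2, 1)
    shapes.append(("max_pool", 64, h, w))
    stages = [("layer1", 256, 1), ("layer2", 512, 2),
              ("layer3", 1024, 2), ("layer4", 2048, 2)]
    for name, channels, stride in stages:
        # Only the 3x3 convolution of the first block in a stage is strided.
        h, w = out_size(h, 3, stride, 1), out_size(w, 3, stride, 1)
        shapes.append((name, channels, h, w))
    shapes.append(("avg_pool", 2048, 1, 1))   # adaptive pooling always yields 1x1
    shapes.append(("flatten", 2048))          # 2048 * 1 * 1 features per sample
    return shapes

shapes = trace_shapes()
for entry in shapes:
    print(entry)
```

Because the adaptive pooling output is always 1x1, the fully connected layer receives 2048 features for this input size.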

In [10]:
# If a GPU is available, set device to "cuda" (the default GPU); otherwise use the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [11]:
# Instantiate the model and move it to the specified device
model = ResNet(ResidualBlock, [3, 4, 6, 3]).to(device)
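
The list [3, 4, 6, 3] gives the number of residual blocks in each of the four stages, and it is where the "50" in ResNet50 comes from. A quick count, following the usual convention of counting the stem convolution, the three convolutions of every block and the final fully connected layer, and leaving out the shortcut convolutions:

```python
blocks_per_stage = [3, 4, 6, 3]
convs_per_block = 3   # 1x1, 3x3 and 1x1 convolutions in each ResidualBlock
stem_convs = 1        # the first convolution of the network
fc_layers = 1         # the final fully connected layer

total_blocks = sum(blocks_per_stage)
depth = stem_convs + convs_per_block * total_blocks + fc_layers
print(total_blocks, depth)
```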

In [12]:
# Define the loss function: cross-entropy loss
criterion = nn.CrossEntropyLoss()
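
For a single sample, cross-entropy on raw logits is the negative log-probability that softmax assigns to the correct class. The pure-Python sketch below computes it for one sample (`cross_entropy` is a hypothetical helper; nn.CrossEntropyLoss additionally averages over the batch by default). With 10 classes and no preference among them, the loss equals ln(10), about 2.30, which is the level an untrained 10-class classifier is expected to start near.

```python
import math

def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

uniform_loss = cross_entropy([0.0] * 10, 3)               # no class preferred
confident_right = cross_entropy([10.0] + [0.0] * 9, 0)    # strong, correct prediction
confident_wrong = cross_entropy([10.0] + [0.0] * 9, 1)    # strong, wrong prediction
print(round(uniform_loss, 4), confident_right, confident_wrong)
```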

In [13]:
# Optimizer: Adam; the first argument is the model's parameters, the second is the learning rate, a hyperparameter that has to be chosen by hand
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)
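
To make the "Momentum plus RMSProp" description of Adam concrete, here is a sketch of the update rule for a single scalar parameter, minimizing f(x) = x^2. The function `adam_step` is a hypothetical helper; the defaults for beta1, beta2 and eps are the commonly used ones, and lr matches the learning_rate of this notebook. It illustrates the algorithm and is not PyTorch's implementation.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad           # first moment: momentum-style average
    v = beta2 * v + (1 - beta2) * grad * grad    # second moment: RMSProp-style average
    m_hat = m / (1 - beta1 ** t)                 # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

x, m, v = 1.0, 0.0, 0.0
x, m, v = adam_step(x, 2 * x, m, v, t=1)   # gradient of x^2 is 2x
x_after_first = x
for t in range(2, 201):
    x, m, v = adam_step(x, 2 * x, m, v, t)
print(x_after_first, x)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by about lr regardless of the gradient's magnitude.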

In [14]:
# Update the learning rate
def update_lr(optimizer, lr):
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

In [18]:
# Training loop
total_step = len(train_loader) # number of batches per epoch = 50,000 training samples / batch size 50 = 1,000
curr_lr = learning_rate
for epoch in range(num_epochs): # num_epochs is the number of epochs
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device) # move the images tensor to the specified device
        labels = labels.to(device) # move the labels tensor to the specified device

        outputs = model(images)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if (i + 1) % 100 == 0:
            print("Epoch[{}/{}], Step [{}/{}] Loss:{:.4f}".format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))
    if (epoch + 1) % 20 == 0: # with num_epochs = 20 this is true only after the final epoch, so the decay does not affect training in this run
        curr_lr /= 3
        update_lr(optimizer, curr_lr)

Epoch[1/20], Step [100/1000] Loss:2.2465
Epoch[1/20], Step [200/1000] Loss:2.3048
Epoch[1/20], Step [300/1000] Loss:2.0614
Epoch[1/20], Step [400/1000] Loss:1.8806
Epoch[1/20], Step [500/1000] Loss:1.8562
Epoch[1/20], Step [600/1000] Loss:1.8562
Epoch[1/20], Step [700/1000] Loss:1.9196
Epoch[1/20], Step [800/1000] Loss:1.7620
Epoch[1/20], Step [900/1000] Loss:1.5524
Epoch[1/20], Step [1000/1000] Loss:1.5996
Epoch[2/20], Step [100/1000] Loss:1.7130
Epoch[2/20], Step [200/1000] Loss:1.7179
Epoch[2/20], Step [300/1000] Loss:1.5812
Epoch[2/20], Step [400/1000] Loss:1.5514
Epoch[2/20], Step [500/1000] Loss:1.4996
Epoch[2/20], Step [600/1000] Loss:1.7425
Epoch[2/20], Step [700/1000] Loss:1.4656
Epoch[2/20], Step [800/1000] Loss:1.6440
Epoch[2/20], Step [900/1000] Loss:1.6708
Epoch[2/20], Step [1000/1000] Loss:1.4362
Epoch[3/20], Step [100/1000] Loss:1.7464
Epoch[3/20], Step [200/1000] Loss:1.2145
Epoch[3/20], Step [300/1000] Loss:1.5714
Epoch[3/20], Step [400/1000] Loss:1.4206
Epoch[3/20], S
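
The learning-rate rule in the training loop divides the rate by 3 at the end of every 20th epoch. The sketch below lists the rate in effect during each epoch for that rule, and for a decay interval of 5 epochs, which is a hypothetical alternative chosen only for comparison. `lr_per_epoch` is an illustrative helper, not part of the notebook.

```python
def lr_per_epoch(base_lr, num_epochs, decay_every, factor=3):
    """Learning rate in effect during each epoch when the rate is divided by
    `factor` at the end of every `decay_every`-th epoch."""
    lrs = []
    lr = base_lr
    for epoch in range(num_epochs):
        lrs.append(lr)                      # rate used while training this epoch
        if (epoch + 1) % decay_every == 0:
            lr /= factor                    # takes effect from the next epoch
    return lrs

as_written = lr_per_epoch(0.01, 20, decay_every=20)
every_five = lr_per_epoch(0.01, 20, decay_every=5)   # hypothetical alternative
print(sorted(set(as_written)))
print(sorted(set(every_five), reverse=True))
```

With 20 epochs and an interval of 20, every epoch trains at the initial rate; a shorter interval is needed for the decay to influence training.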

In [20]:
# Evaluation
model.eval() # switch to evaluation mode: Dropout is disabled and Batch Normalization uses its running statistics instead of batch statistics
with torch.no_grad(): # gradients are not tracked inside this block
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device) # move the images tensor to the specified device
        labels = labels.to(device) # move the labels tensor to the specified device
        outputs = model(images)
        _, predicted = torch.max(outputs, 1) # returns the maximum value and its index along dim 1; the index is the predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print("Accuracy of the model on the test image:{}".format(100 * correct / total))

Accuracy of the model on the test image:81.95
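
The accuracy computation reduces to taking the index of the largest logit in each row and comparing it with the label. A pure-Python sketch on a made-up batch of four samples and three classes (the numbers are invented for illustration; `argmax` and `accuracy` are hypothetical helpers):

```python
def argmax(values):
    """Index of the largest element (first one in case of ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batch_logits, labels):
    """Percentage of rows whose largest logit sits at the labelled class."""
    predicted = [argmax(row) for row in batch_logits]
    correct = sum(p == y for p, y in zip(predicted, labels))
    return 100 * correct / len(labels)

logits = [
    [0.1, 2.0, -1.0],   # predicted class 1
    [1.5, 0.2, 0.3],    # predicted class 0
    [0.0, 0.1, 3.0],    # predicted class 2
    [0.9, 0.8, 0.7],    # predicted class 0, but the label is 1
]
labels = [1, 0, 2, 1]
acc = accuracy(logits, labels)
print(acc)
```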


In [21]:
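# Save the model parameters as resnet50_cifar10.pth in the Dataset/ directory relative to the current working directory (/root/data/project/Kaggle/CODE/ResNet50/Dataset/ when the notebook runs from /root/data/project/Kaggle/CODE/ResNet50/)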
torch.save(model.state_dict(), "Dataset/resnet50_cifar10.pth")