# Classifying CIFAR10 with VGG16

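VGG is a convolutional neural network model proposed by Simonyan and Zisserman in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". Its name is an abbreviation of the Visual Geometry Group at the University of Oxford, where the authors worked.

The model was entered in the 2014 ImageNet image classification and localization challenge and performed very well: it ranked second in the classification task and first in the localization task.

The network structure of VGG16 is shown in the figure below: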

![VGG16 diagram](http://q6dz4bbgt.bkt.clouddn.com/20200229111521.jpg)

The 16 layers of the network are as follows:
- 01: Convolution using 64 filters
- 02: Convolution using 64 filters + Max pooling
- 03: Convolution using 128 filters
- 04: Convolution using 128 filters + Max pooling
- 05: Convolution using 256 filters
- 06: Convolution using 256 filters
- 07: Convolution using 256 filters + Max pooling
- 08: Convolution using 512 filters
- 09: Convolution using 512 filters
- 10: Convolution using 512 filters + Max pooling
- 11: Convolution using 512 filters
- 12: Convolution using 512 filters
- 13: Convolution using 512 filters + Max pooling
- 14: Fully connected with 4096 nodes
- 15: Fully connected with 4096 nodes
- 16: Fully connected with 1000 nodes + Softmax
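
The layer list above can be written compactly as a configuration list, in the same style as the `cfg` used in the network definition later in this section. The sketch below is illustrative only: the names `VGG16_CFG`, `NUM_FC_LAYERS`, and `count_layers` are assumptions, and the final classifier layer (whose output feeds the softmax) is counted as the third fully connected layer, following the original ImageNet setup.

```python
# Hypothetical configuration list for the 16-layer network described above:
# integers are the number of 3x3 convolution filters, 'M' marks a max pooling layer.
VGG16_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']

# Assumption: three fully connected layers, the last one feeding the softmax.
NUM_FC_LAYERS = 3


def count_layers(cfg, num_fc=NUM_FC_LAYERS):
    """Return (conv layers, pooling layers, weight layers) for a cfg list."""
    num_conv = sum(1 for v in cfg if v != 'M')
    num_pool = sum(1 for v in cfg if v == 'M')
    return num_conv, num_pool, num_conv + num_fc


num_conv, num_pool, num_weight_layers = count_layers(VGG16_CFG)
print(num_conv, num_pool, num_weight_layers)
```

Pooling layers have no weights, which is why they are not counted among the 16 layers.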

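## 1. Define the dataloader

Note that the transforms and dataloaders here differ from the ones defined previously: the training transform adds data augmentation (random crop with padding and random horizontal flip), both transforms normalize each channel with a fixed mean and standard deviation, and the batch size is 128.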

In [5]:
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# Train on the GPU; this can be enabled in the menu "Runtime" -> "Change runtime type"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,  download=True, transform=transform_train)
testset  = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Files already downloaded and verified
Files already downloaded and verified
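
To make the `transforms.Normalize` step concrete: after `ToTensor()` scales pixel values to [0, 1], each channel is transformed as `(value - mean) / std`. The following is a minimal pure-Python sketch of that arithmetic using the mean and standard deviation values from the cell above; the helper `normalize_pixel` is a hypothetical name used for illustration, not part of torchvision.

```python
# Per-channel statistics copied from the transforms above (R, G, B).
MEAN = (0.4914, 0.4822, 0.4465)
STD = (0.2023, 0.1994, 0.2010)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))


# A mid-gray pixel maps to small positive values in every channel,
# while a pure black pixel maps to roughly -2.2 to -2.4.
print(normalize_pixel((0.5, 0.5, 0.5)))
print(normalize_pixel((0.0, 0.0, 0.0)))
```

After this step the inputs are roughly zero-centered with unit scale per channel, which is the usual motivation for normalizing before training.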


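## 2. Define the VGG network

Next we define the VGG network. The full model has too many parameters, so I simplified it by hand.

The structure is now roughly: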

64 conv, maxpooling,

128 conv, maxpooling,

256 conv, 256 conv, maxpooling,

512 conv, 512 conv, maxpooling,

512 conv, 512 conv, maxpooling,

softmax 


The model is implemented as follows:


In [9]:
class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()
        self.cfg = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']
        self.features = self._make_layers(self.cfg)
        self.classifier = nn.Linear(512, 10)

    def forward(self, x):
        out = self.features(x)
        out = out.view(out.size(0), -1)
        out = self.classifier(out)
        return out

    def _make_layers(self, cfg):
        layers = []
        in_channels = 3
        for x in cfg:
            if x == 'M':
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            else:
                layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=1),
                           nn.BatchNorm2d(x),
                           nn.ReLU(inplace=True)]
                in_channels = x
        layers += [nn.AvgPool2d(kernel_size=1, stride=1)]
        return nn.Sequential(*layers)
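
The classifier takes 512 input features because each of the five max pooling layers halves the 32x32 input, leaving a 512-channel 1x1 feature map. The sketch below traces this and also counts the learnable parameters in plain Python, without building the model. `trace_features` is a hypothetical helper, and the counts assume the layer types used in `_make_layers` (3x3 convolutions with bias, `BatchNorm2d` with one weight and one bias per channel).

```python
CFG = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']


def trace_features(cfg, in_channels=3, size=32):
    """Return (channels, spatial size, parameter count) after the feature layers."""
    params = 0
    for x in cfg:
        if x == 'M':
            size //= 2  # MaxPool2d(kernel_size=2, stride=2) halves height and width
        else:
            # Conv2d weights + bias (padding=1 keeps the spatial size)
            params += in_channels * x * 3 * 3 + x
            # BatchNorm2d weight + bias
            params += 2 * x
            in_channels = x
    return in_channels, size, params


channels, size, feature_params = trace_features(CFG)
flat_features = channels * size * size        # input size of the classifier
classifier_params = flat_features * 10 + 10   # nn.Linear(512, 10) weights + bias
print(channels, size, flat_features)
print(feature_params + classifier_params)
```

The trailing `AvgPool2d(kernel_size=1, stride=1)` leaves the shape unchanged and has no parameters, so it is omitted from the trace.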

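Initialize the network. The classification layer is sized to match the task: CIFAR10 has 10 image classes, so the output dimension is 10 (`nn.Linear(512, 10)`).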


In [10]:
# Move the network to the selected device (the GPU if available)
net = VGG().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)
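
Note that the network outputs raw scores (logits) and contains no explicit softmax layer: `nn.CrossEntropyLoss` applies log-softmax internally before computing the negative log-likelihood. A minimal pure-Python sketch of that computation for a single sample follows; `cross_entropy` is a hypothetical helper written for illustration, not the PyTorch implementation.

```python
import math


def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# With 10 equal logits every class gets probability 1/10,
# so the loss equals ln(10) regardless of the target.
uniform_loss = cross_entropy([0.0] * 10, target=3)
print(round(uniform_loss, 3))
```

A loss near ln(10) is therefore what a 10-class model that guesses uniformly would score, which is a useful baseline when reading training logs.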

## 3. Train the network

The training code is exactly the same as before:

In [11]:
for epoch in range(10):  # train for multiple epochs
    for i, (inputs, labels) in enumerate(trainloader):
        inputs = inputs.to(device)
        labels = labels.to(device)
        # Zero the optimizer's gradients
        optimizer.zero_grad()
        # Forward pass + backward pass + optimization step
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # Print statistics every 100 minibatches
        if i % 100 == 0:
            print('Epoch: %d Minibatch: %5d loss: %.3f' %(epoch + 1, i + 1, loss.item()))

print('Finished Training')

Epoch: 1 Minibatch:     1 loss: 2.740
Epoch: 1 Minibatch:   101 loss: 1.579
Epoch: 1 Minibatch:   201 loss: 1.326
Epoch: 1 Minibatch:   301 loss: 1.110
Epoch: 2 Minibatch:     1 loss: 1.012
Epoch: 2 Minibatch:   101 loss: 1.055
Epoch: 2 Minibatch:   201 loss: 0.963
Epoch: 2 Minibatch:   301 loss: 0.976
Epoch: 3 Minibatch:     1 loss: 0.856
Epoch: 3 Minibatch:   101 loss: 0.795
Epoch: 3 Minibatch:   201 loss: 0.757
Epoch: 3 Minibatch:   301 loss: 0.678
Epoch: 4 Minibatch:     1 loss: 0.878
Epoch: 4 Minibatch:   101 loss: 0.565
Epoch: 4 Minibatch:   201 loss: 0.684
Epoch: 4 Minibatch:   301 loss: 0.784
Epoch: 5 Minibatch:     1 loss: 0.489
Epoch: 5 Minibatch:   101 loss: 0.683
Epoch: 5 Minibatch:   201 loss: 0.678
Epoch: 5 Minibatch:   301 loss: 0.688
Epoch: 6 Minibatch:     1 loss: 0.480
Epoch: 6 Minibatch:   101 loss: 0.471
Epoch: 6 Minibatch:   201 loss: 0.494
Epoch: 6 Minibatch:   301 loss: 0.711
Epoch: 7 Minibatch:     1 loss: 0.492
Epoch: 7 Minibatch:   101 loss: 0.420
Epoch: 7 Min
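
The minibatch numbers in the log follow from the dataset and batch sizes. CIFAR10 has 50,000 training images, so with `batch_size=128` each epoch has 391 minibatches (the last one is smaller), and the `i % 100 == 0` check prints at minibatches 1, 101, 201, and 301. A small sketch of that arithmetic, where the helper names are illustrative assumptions:

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of minibatches when the last, smaller batch is kept."""
    return math.ceil(num_samples / batch_size)


def logged_minibatches(num_batches, every=100):
    """1-based minibatch numbers printed by the `i % every == 0` check."""
    return [i + 1 for i in range(num_batches) if i % every == 0]


num_batches = batches_per_epoch(50000, 128)
print(num_batches)                      # 391
print(logged_minibatches(num_batches))  # [1, 101, 201, 301]
print(50000 - (num_batches - 1) * 128)  # size of the last minibatch: 80
```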

## 4. Evaluate accuracy on the test set

The test code is essentially the same as before.

In [12]:
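correct = 0
total = 0

# Switch BatchNorm layers to inference mode (use running statistics)
net.eval()
# Gradients are not needed for evaluation
with torch.no_grad():
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        outputs = net(images)
        # The predicted class is the index of the largest output score
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %.2f %%' % (
    100 * correct / total))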

Accuracy of the network on the 10000 test images: 83.61 %
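
Overall accuracy can hide large differences between classes. As a possible extension, the sketch below computes per-class accuracy from lists of true and predicted label indices. `per_class_accuracy` is a hypothetical helper, and the sample labels are made up for illustration; they are not results from the network above.

```python
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


def per_class_accuracy(true_labels, predicted_labels, class_names=classes):
    """Return {class name: accuracy} for every class that appears in true_labels."""
    correct = [0] * len(class_names)
    total = [0] * len(class_names)
    for t, p in zip(true_labels, predicted_labels):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {name: correct[i] / total[i]
            for i, name in enumerate(class_names) if total[i] > 0}


# Made-up labels for illustration only (not results from the network above).
true_labels = [0, 0, 3, 3, 3, 5]
predicted_labels = [0, 1, 3, 5, 3, 5]
print(per_class_accuracy(true_labels, predicted_labels))
```

In the evaluation loop, the two label lists could be collected batch by batch with `labels.tolist()` and `predicted.tolist()`.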


As we can see, even a simplified VGG network raises the accuracy substantially, from the 64% obtained previously to 83.61%.
