<a href="https://colab.research.google.com/github/OUCTheoryGroup/colab_demo/blob/master/05_03_VGG_CIFAR10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

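## Classifying CIFAR10 with VGG16

VGG is a convolutional neural network proposed by Simonyan and Zisserman in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". Its name is the abbreviation of the Visual Geometry Group at the University of Oxford, where the authors worked.

The model took part in the 2014 ImageNet image classification and localization challenge and performed very well: it placed second in the classification task and first in the localization task.

The network structure of VGG16 is shown below: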

![VGG16 diagram](https://gaopursuit.oss-cn-beijing.aliyuncs.com/202003/20200229111521.jpg)

The 16 weight layers of the network are structured as follows:
- 01: Convolution using 64 filters
- 02: Convolution using 64 filters + Max pooling
- 03: Convolution using 128 filters
- 04: Convolution using 128 filters + Max pooling
- 05: Convolution using 256 filters
- 06: Convolution using 256 filters
- 07: Convolution using 256 filters + Max pooling
- 08: Convolution using 512 filters
- 09: Convolution using 512 filters
- 10: Convolution using 512 filters + Max pooling
- 11: Convolution using 512 filters
- 12: Convolution using 512 filters
- 13: Convolution using 512 filters + Max pooling
- 14: Fully connected with 4096 nodes
- 15: Fully connected with 4096 nodes
- 16: Fully connected with 1000 nodes + Softmax
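
To make the layer list concrete, here is a minimal sketch that counts the weight layers and parameters of the original VGG16. It assumes the setup from the paper (224x224 RGB input, 3x3 convolutions with biases, 1000 ImageNet classes); the names `VGG16_CFG`, `FC_SIZES`, and `count_vgg16` are made up for this illustration.

```python
# Channel widths of the 13 convolutional layers; 'M' marks a 2x2 max pooling.
VGG16_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']
# Output sizes of the three fully connected layers (1000 ImageNet classes, assumed).
FC_SIZES = [4096, 4096, 1000]


def count_vgg16(input_size=224, in_channels=3):
    """Return (number of weight layers, total parameter count)."""
    layers, params, size = 0, 0, input_size
    for v in VGG16_CFG:
        if v == 'M':
            size //= 2  # pooling halves the spatial size and has no weights
        else:
            params += in_channels * v * 3 * 3 + v  # 3x3 kernels plus biases
            in_channels = v
            layers += 1
    features = in_channels * size * size  # flattened input of the first FC layer
    for out_features in FC_SIZES:
        params += features * out_features + out_features
        features = out_features
        layers += 1
    return layers, params


n_layers, n_params = count_vgg16()
print(n_layers, n_params)
```

Most of the parameters sit in the first fully connected layer, which is one reason a smaller variant is attractive for 32x32 CIFAR10 images.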

### 1. Define the dataloader

**Note that the transform and dataloader here differ from the ones defined earlier; compare them and work out the differences yourself.**

In [0]:
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# Train on the GPU; in Colab this can be set via the menu "Runtime" -> "Change runtime type"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,  download=True, transform=transform_train)
testset  = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
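
As a rough illustration of what the cell above sets up, the sketch below reproduces two pieces in plain Python: the per-channel arithmetic performed by `transforms.Normalize`, and the number of mini-batches per epoch for a batch size of 128. The helper names `normalize_pixel` and `batches_per_epoch` are hypothetical, and the batch count assumes the `DataLoader` default of keeping the last partial batch.

```python
# Per-channel statistics copied from the transforms above.
MEAN = (0.4914, 0.4822, 0.4465)
STD = (0.2023, 0.1994, 0.2010)


def normalize_pixel(rgb):
    """Apply (value - mean) / std to one RGB pixel with values already in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))


def batches_per_epoch(num_samples, batch_size):
    """Number of mini-batches when the last partial batch is kept."""
    return -(-num_samples // batch_size)  # ceiling division


print(normalize_pixel(MEAN))          # a pixel equal to the mean maps to zeros
print(batches_per_epoch(50000, 128))  # CIFAR10 training set
print(batches_per_epoch(10000, 128))  # CIFAR10 test set
```

With 391 training batches per epoch, a loop that logs every 100 batches prints four lines per epoch.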

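### 2. Define the VGG network

Next we define the VGG network. The full model has too many parameters, so I simplified it by hand.

The structure is now roughly:

64 conv, maxpooling,

128 conv, maxpooling,

256 conv, 256 conv, maxpooling,

512 conv, 512 conv, maxpooling,

512 conv, 512 conv, maxpooling,

softmax

You may ask why it is set up this way.

There is no deep reason: it simply looked symmetric, and I changed it somewhat arbitrarily.

Here is the implementation of the model: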

In [0]:
class VGG(nn.Module):
    def __init__(self):
        super().__init__()
        # Numbers are conv output channels; 'M' is a 2x2 max pooling layer
        cfg = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']
        self.features = self._make_layers(cfg)
        # Five poolings shrink 32x32 to 1x1, leaving 512 features per image
        self.classifier = nn.Linear(512, 10)

    def forward(self, x):
        out = self.features(x)
        # Flatten the feature maps (not the raw input x) for the classifier
        out = out.view(out.size(0), -1)
        out = self.classifier(out)
        return out

    def _make_layers(self, cfg):
        layers = []
        in_channels = 3
        for x in cfg:
            if x == 'M':
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            else:
                layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=1),
                           nn.BatchNorm2d(x),
                           nn.ReLU(inplace=True)]
                in_channels = x
        # Pooling with kernel 1 and stride 1 leaves the feature maps unchanged
        layers += [nn.AvgPool2d(kernel_size=1, stride=1)]
        return nn.Sequential(*layers)
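
A quick way to check the input size the classifier needs is to trace the tensor shape through `cfg`. The sketch below does this in plain Python, assuming 3x3 convolutions with `padding=1` (spatial size unchanged) and 2x2 max pooling with stride 2 (spatial size halved); `trace_feature_shape` is a hypothetical helper, not part of the model.

```python
def trace_feature_shape(cfg, in_channels=3, size=32):
    """Follow (channels, height/width) through a VGG-style cfg list."""
    channels = in_channels
    for v in cfg:
        if v == 'M':
            size //= 2    # 2x2 max pooling halves height and width
        else:
            channels = v  # a padded 3x3 conv only changes the channel count
    return channels, size


cfg = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']
channels, size = trace_feature_shape(cfg)
flat_features = channels * size * size
print(channels, size, flat_features)
```

For a 32x32 input the convolutional stack ends with 512 channels of size 1x1, so the flattened feature vector has 512 entries. The raw input, by contrast, has 3 x 32 x 32 = 3072 values, so the two are easy to tell apart by size.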

Initialize the network and adjust the classification layer as needed. Because CIFAR10 has 10 image classes, the output size is set to 10.

In [0]:
# Move the network to the GPU (or CPU if no GPU is available)
net = VGG().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)
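
The model itself contains no softmax layer because `nn.CrossEntropyLoss` takes raw logits and applies log-softmax internally. The following is a minimal plain-Python sketch of that computation for a single sample; `cross_entropy` here is an illustrative helper, not the PyTorch implementation.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# With 10 classes and no preference between them, the loss equals ln(10).
uniform_loss = cross_entropy([0.0] * 10, target=3)
print(round(uniform_loss, 3))
```

An untrained 10-class network should therefore start with a loss in the neighborhood of ln(10), about 2.3, which is a handy sanity check on the first logged value.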

### 3. Train the network

The training code is exactly the same as before:

In [0]:
for epoch in range(10):  # train for several epochs
    for i, (inputs, labels) in enumerate(trainloader):
        inputs = inputs.to(device)
        labels = labels.to(device)
        # Reset the gradients held by the optimizer
        optimizer.zero_grad()
        # Forward pass + backward pass + optimization step
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # Print statistics every 100 mini-batches
        if i % 100 == 0:
            print('Epoch: %d Minibatch: %5d loss: %.3f' %(epoch + 1, i + 1, loss.item()))

print('Finished Training')

Epoch: 1 Minibatch:     1 loss: 2.453
Epoch: 1 Minibatch:   101 loss: 1.819
Epoch: 1 Minibatch:   201 loss: 1.383
Epoch: 1 Minibatch:   301 loss: 1.208
Epoch: 2 Minibatch:     1 loss: 1.025
Epoch: 2 Minibatch:   101 loss: 0.965
Epoch: 2 Minibatch:   201 loss: 0.808
Epoch: 2 Minibatch:   301 loss: 0.728
Epoch: 3 Minibatch:     1 loss: 0.737
Epoch: 3 Minibatch:   101 loss: 0.820
Epoch: 3 Minibatch:   201 loss: 0.909
Epoch: 3 Minibatch:   301 loss: 0.711
Epoch: 4 Minibatch:     1 loss: 0.604
Epoch: 4 Minibatch:   101 loss: 0.603
Epoch: 4 Minibatch:   201 loss: 0.640
Epoch: 4 Minibatch:   301 loss: 0.740
Epoch: 5 Minibatch:     1 loss: 0.526
Epoch: 5 Minibatch:   101 loss: 0.620
Epoch: 5 Minibatch:   201 loss: 0.335
Epoch: 5 Minibatch:   301 loss: 0.620
Epoch: 6 Minibatch:     1 loss: 0.589
Epoch: 6 Minibatch:   101 loss: 0.631
Epoch: 6 Minibatch:   201 loss: 0.375
Epoch: 6 Minibatch:   301 loss: 0.489
Epoch: 7 Minibatch:     1 loss: 0.463
Epoch: 7 Minibatch:   101 loss: 0.352
Epoch: 7 Min

### 4. Test and verify the accuracy

The test code is essentially the same as before.

In [0]:
correct = 0
total = 0

# Evaluation mode: BatchNorm uses its running statistics
net.eval()
# No gradients are needed for testing
with torch.no_grad():
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        outputs = net(images)
        # The index of the largest logit is the predicted class
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %.2f %%' % (
    100 * correct / total))

Accuracy of the network on the 10000 test images: 84.92 %
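
A single overall number hides which classes are hard. As a possible next step, the sketch below shows one way to break accuracy down per class in plain Python. The `per_class_accuracy` helper and the toy labels and predictions are hypothetical and are not results from the trained network.

```python
from collections import Counter

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


def per_class_accuracy(predicted, labels, num_classes):
    """Return the accuracy of each class (None if the class never occurs)."""
    correct, total = Counter(), Counter()
    for p, y in zip(predicted, labels):
        total[y] += 1
        if p == y:
            correct[y] += 1
    return [correct[c] / total[c] if total[c] else None
            for c in range(num_classes)]


# Hypothetical toy data, for illustration only.
toy_labels = [0, 0, 3, 3, 3, 5]
toy_predicted = [0, 1, 3, 3, 5, 5]
acc = per_class_accuracy(toy_predicted, toy_labels, len(classes))
for name, a in zip(classes, acc):
    if a is not None:
        print('%5s: %.1f %%' % (name, 100 * a))
```

In the notebook, the two lists would be filled by collecting `predicted` and `labels` from each test batch.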


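As you can see, even a simplified VGG network raises the accuracy substantially, from 64% to 84.92%.

Which techniques can improve performance further? We will study them in later tutorials.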