# VGG: Networks Using Blocks

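How to make a network deeper and larger more effectively:

1. More fully connected layers (too expensive)
2. More convolutional layers
3. Group convolutional layers into blocks (this is the VGG block), like stacking building blocks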
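To see why option 1 is expensive, a rough parameter count helps. The sketch below compares one fully connected layer with one 3×3 convolutional layer, using layer sizes that appear in the VGG code later in this note (25088 → 4096 features, and 512 → 512 channels). The helper names `linear_params` and `conv_params` are made up for illustration.

```python
def linear_params(in_features, out_features):
    # weight matrix plus one bias per output unit
    return in_features * out_features + out_features

def conv_params(in_channels, out_channels, kernel_size):
    # one kernel_size x kernel_size filter per (in, out) channel pair, plus biases
    return in_channels * out_channels * kernel_size * kernel_size + out_channels

fc = linear_params(512 * 7 * 7, 4096)   # first fully connected layer of the VGG below
conv = conv_params(512, 512, 3)         # one 3x3 conv in the last VGG block
print(f"fully connected: {fc:,}")
print(f"3x3 convolution: {conv:,}")
print(f"ratio: {fc / conv:.1f}x")
```

A single fully connected layer here holds roughly 103 million parameters, while a wide 3×3 convolution holds roughly 2.4 million, which is why adding depth through convolutions is the cheaper route.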

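What VGG found:

Deep but narrow networks work better: stacking more layers with small kernels beats using fewer layers with large kernels, so every convolution uses a 3×3 kernel.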
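One way to see the appeal of 3×3 kernels: two stacked 3×3 convolutions cover the same 5×5 input window as a single 5×5 convolution, but with fewer parameters and an extra nonlinearity in between. The sketch below checks this in PyTorch. The channel count of 64, the 32×32 input, and the helper `count_params` are arbitrary choices for illustration, not part of the original notes.

```python
import torch
from torch import nn

def count_params(module):
    # total number of learnable scalars in the module
    return sum(p.numel() for p in module.parameters())

channels = 64  # arbitrary width chosen for the comparison

# two 3x3 convs: together they see a 5x5 input window
stacked_3x3 = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
)
# one 5x5 conv with the same receptive field
single_5x5 = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=5, padding=2), nn.ReLU(),
)

X = torch.rand(1, channels, 32, 32)
shape_3x3 = stacked_3x3(X).shape
shape_5x5 = single_5x5(X).shape
params_3x3 = count_params(stacked_3x3)
params_5x5 = count_params(single_5x5)
print(shape_3x3, shape_5x5)    # both keep the 32x32 spatial size
print(params_3x3, params_5x5)  # the stacked version is smaller
```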

# Code

## VGG block

In [2]:
import torch
from torch import nn


In [3]:
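# num_convs: number of convolutional layers in the block
# in_channels: number of channels of the block's input
# out_channels: number of output channels of each convolutional layer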
def vgg_block(num_convs, in_channels, out_channels):
    layers = []
    
    for _ in range(num_convs):
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU())
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    
    return nn.Sequential(*layers)
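As a quick sanity check of `vgg_block`, the sketch below repeats the definition so that it runs on its own, then pushes a random tensor through a single block. The input size (3 channels, 32×32) is an arbitrary assumption made only for this check.

```python
import torch
from torch import nn

def vgg_block(num_convs, in_channels, out_channels):
    layers = []
    for _ in range(num_convs):
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU())
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

block = vgg_block(num_convs=2, in_channels=3, out_channels=64)
X = torch.rand(1, 3, 32, 32)
Y = block(X)
print(len(block))  # 2 x (Conv2d + ReLU) + 1 MaxPool2d
print(Y.shape)     # channels become out_channels, height and width are halved
```

The convolutions use `padding=1` with a 3×3 kernel, so they leave height and width unchanged; only the final max-pooling layer reduces the resolution.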

In [8]:
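# Classic architecture: usually 5 blocks. Each max-pooling layer halves the height and width,
# so for a 224x224 ImageNet input, 224 / 32 = 7, which cannot be halved evenly again.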

conv_arch = ((1,64), (1, 128), (2, 256), (2,512), (2, 512))
def vgg(conv_arch):
    conv_blks = []
    in_channels = 1
    
    # Add the five VGG blocks
    for (num_convs, out_channels) in conv_arch:
        conv_blks.append(vgg_block(num_convs, in_channels, out_channels))
        in_channels = out_channels
        
    # Fully connected layers at the end (the 7 * 7 assumes a 224x224 input)
    
    return nn.Sequential(*conv_blks, nn.Flatten(), 
                        nn.Linear(out_channels * 7 * 7, 4096), nn.ReLU(),nn.Dropout(p=0.5),
                         nn.Linear(4096, 4096),  nn.ReLU(),nn.Dropout(p=0.5),
                         nn.Linear(4096, 10)
                        )

net = vgg(conv_arch)
        

In [9]:
X = torch.rand(size=(1,1,224,224), dtype=torch.float32)

for blk in net:
    X = blk(X)
    print(blk.__class__.__name__, 'output shape: \t\t', X.shape)

Sequential output shape: 		 torch.Size([1, 64, 112, 112])
Sequential output shape: 		 torch.Size([1, 128, 56, 56])
Sequential output shape: 		 torch.Size([1, 256, 28, 28])
Sequential output shape: 		 torch.Size([1, 512, 14, 14])
Sequential output shape: 		 torch.Size([1, 512, 7, 7])
Flatten output shape: 		 torch.Size([1, 25088])
Linear output shape: 		 torch.Size([1, 4096])
ReLU output shape: 		 torch.Size([1, 4096])
Dropout output shape: 		 torch.Size([1, 4096])
Linear output shape: 		 torch.Size([1, 4096])
ReLU output shape: 		 torch.Size([1, 4096])
Dropout output shape: 		 torch.Size([1, 4096])
Linear output shape: 		 torch.Size([1, 10])
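The shapes printed above follow from simple arithmetic: each 3×3 convolution with padding 1 preserves height and width, and each 2×2 max-pooling with stride 2 halves them. A small sketch that replays this without building the network (the variable names `size` and `flat_features` are illustrative only):

```python
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))

size = 224  # input height and width
for i, (num_convs, out_channels) in enumerate(conv_arch, start=1):
    size //= 2  # the convs keep the size; the block's max-pooling halves it
    print(f"block {i}: {out_channels} x {size} x {size}")

# number of inputs to the first fully connected layer
flat_features = out_channels * size * size
print("flattened features:", flat_features)
```

After five halvings, 224 becomes 7, and 512 × 7 × 7 gives the 25088 features seen in the `Flatten` output.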


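Training is skipped here; Li Mu's run reached an accuracy of 0.91.

It took about 5 min to run, versus about 3 min for AlexNet. That is roughly 1.7 times as long, but the results are considerably better.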