# VGG
vggNet 是第一个真正意义上的深层网络结构，其是 ImageNet2014年的冠军
vgg 几乎全部使用 3 x 3 的卷积核以及 2 x 2 的池化层，使用小的卷积核进行多层的堆叠和一个大的卷积核的感受野是相同的，同时小的卷积核还能减少参数，同时可以有更深的结构。

vgg 的一个关键就是使用很多层 3 x 3 的卷积然后再使用一个最大池化层，这个模块被使用了很多次，下面我们照着这个结构来写一写

In [1]:
import tensorflow as tf
from utils import cifar10_input

In [2]:
batch_size = 64
train_imgs, train_labels, val_imgs, val_labels = cifar10_input.load_data(batch_size=batch_size)

Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensor_slices(string_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(...)`.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensor_slices(input_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(...)`.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensors(tensor).repeat(num_epochs)`.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.FixedLengthRecordDataset`.
Instructions for updating:
Q

In [3]:
from utils.layers import conv, max_pool, fc
from utils.learning import train

In [4]:
def vgg_block(inputs, num_convs, out_depth, scope='vgg_block', reuse=None):
    """构建vgg_block.
    
    一个 vgg_block 由`num_convs`个卷积层和一个最大值池化层构成.
    
    Args:
        inputs: 输入
        num_convs: 这一个block里卷积层的个数
        out_depth: 每一个卷积层的卷积核个数
        scope: 变量域名
        reuse: 是否复用
    """
    int_depth = inputs.get_shape().as_list()[-1]
    
    with tf.variable_scope(scope, reuse=reuse) as sc:
        net = inputs
        for i in range(num_convs):
            net = conv(net, ksize=[3,3], out_depth=out_depth, strides=[1,1],
                       padding='SAME', scope='conv%d'%i, reuse=reuse)
            net = max_pool(net, [2,2], [2,2], name='pool')
        return net

- 然后我们把很多个不同的`vgg_block`堆叠在一起

In [5]:
def vgg_stack(inputs, num_convs, out_depths, scope='vgg_stack', reuse=None):
    """构建vgg_stack.
    
    一个 vgg_stack 将若干个不同的`vgg_block`进行`stack`(堆叠)
    
    Args:
        inputs: 输入
        num_convs: 每一个block里卷积层的个数, 列表. 如`[1, 2, 3]`
        out_depths: 每一个block的卷积核个数, 列表, 如`[64, 128, 256]`
        scope: 变量域名
        reuse: 是否复用
    """
    with tf.variable_scope(scope, reuse=reuse) as sc:
        net = inputs
        for i, (n, d) in enumerate(zip(num_convs, out_depths)):
            net = vgg_block(inputs=net, num_convs=n, out_depth=d, scope='vgg_block%d'%i)
        return net

In [6]:
def vgg(inputs, num_convs, out_depths, num_outputs, scope='vgg', reuse=None):
    """构建vgg.
    
    一个 vgg 先经过`vgg_stack`后再连接两个全连接层.
    
    Args:
        inputs: 输入
        num_convs: 每一个 vgg_block 的卷积层的个数
        out_depths: 每一个 vgg_block 卷积核个数
        num_outputs: 最后输出向量的维数
        scope: 变量域名
        reuse: 是否复用
    """
    with tf.variable_scope(scope, reuse=reuse) as sc:
        net = vgg_stack(inputs, num_convs, out_depths)
        with tf.variable_scope('classification'):
            net = tf.layers.flatten(net)
            net = fc(net, 100, scope='fc1')
            net = fc(net, num_outputs, act=tf.identity, scope='classification')
        return net

In [7]:
train_out = vgg(inputs=train_imgs, num_convs=(1, 1, 2, 2, 2), 
                out_depths=(64, 128, 256, 512, 512), 
                num_outputs=10)
# 复用上面的参数
val_out = vgg(inputs=val_imgs, num_convs=(1, 1, 2, 2, 2), 
                out_depths=(64, 128, 256, 512, 512), 
                num_outputs=10, reuse=True)

In [8]:
with tf.variable_scope('loss'):
    train_loss = tf.losses.sparse_softmax_cross_entropy(labels=train_labels, logits=train_out, scope='train')
    val_loss = tf.losses.sparse_softmax_cross_entropy(labels=val_labels, logits=val_out, scope='val')

In [9]:
with tf.name_scope('accuracy'):
    with tf.name_scope('train'):
        train_acc = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(train_out, axis=-1, output_type=tf.int32), train_labels), dtype=tf.float32))
    with tf.name_scope('val'):
        val_acc = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(val_out, axis=-1, output_type=tf.int32), val_labels), dtype=tf.float32))

In [10]:
lr = 0.01
optimizer = tf.train.MomentumOptimizer(lr, momentum=0.9)
train_op = optimizer.minimize(train_loss)

In [11]:

train(train_op, train_loss, train_acc, val_loss, val_acc, 20000, batch_size)

Instructions for updating:
To construct input pipelines, use the `tf.data` module.
[train]: step 0 loss = 2.3167 acc = 0.0781 (0.0088 / batch)
[val]: step 0 loss = 2.3165 acc = 0.1562
[train]: step 1000 loss = 1.1907 acc = 0.5156 (0.1211 / batch)
[train]: step 2000 loss = 0.5664 acc = 0.8125 (0.1264 / batch)
[train]: step 3000 loss = 0.7340 acc = 0.7500 (0.1302 / batch)
[train]: step 4000 loss = 0.8368 acc = 0.6875 (0.1321 / batch)
[val]: step 4000 loss = 0.7107 acc = 0.7188
[train]: step 5000 loss = 0.3022 acc = 0.8906 (0.1327 / batch)
[train]: step 6000 loss = 0.2720 acc = 0.9219 (0.1299 / batch)
[train]: step 7000 loss = 0.1600 acc = 0.9531 (0.1274 / batch)
[train]: step 8000 loss = 0.2381 acc = 0.9062 (0.1321 / batch)
[val]: step 8000 loss = 0.6860 acc = 0.7969
[train]: step 9000 loss = 0.1761 acc = 0.9375 (0.1312 / batch)
[train]: step 10000 loss = 0.1726 acc = 0.9219 (0.1305 / batch)
[train]: step 11000 loss = 0.0781 acc = 0.9844 (0.1289 / batch)
[train]: step 12000 loss = 0.0894