# Hand-building a network backbone

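The backbone of a complex deep neural network determines how the features in the data are extracted, layer by layer, as they pass through the network. In image recognition, different tasks and requirements call for different algorithms, which means there are different backbones to choose from.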

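For a given input, how the network processes it, and how data is propagated forward and backward to update the weights and biases, has a large impact on the network's efficiency and final accuracy.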

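A good next challenge in deep learning is to take someone else's algorithm and implement it as a network model with the tensorflow API. There is nothing magical about that; the hard part is designing the algorithm in the first place. I'm no algorithm guru. I just want to be able to read other people's papers and turn them into code, so implementing ResNet by hand is good practice.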

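ResNet [https://arxiv.org/pdf/1512.03385.pdf], the residual neural network, is an advanced, highly optimized variant of the convolutional neural network. It adds a bypass (shortcut connection) between convolutional layers, which speeds up convergence during training and improves accuracy. This is what makes deeper networks worthwhile (in theory, classic convolutional networks with more than 22 layers show no significant further improvement; they only add to the training burden and need more resources when deployed).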

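A set of fully connected layers can be attached to the tail of a convolutional network: the features found by the convolutional layers are fed into the fully connected layers, which output the final classification. This set of fully connected layers is usually called the network's header (or head). Headers can also be much more complex, with convolutions, up- and downsampling, and other elaborate preprocessing before the final classification, but this simple picture is enough for now.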

## Step 1 - Implementing the basic residual block

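ResNet is built from basic residual blocks. Each block consists of two or more convolutional layers with ReLU between them, plus a bypass that carries the block's input directly to its output as an identity mapping.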

![Residual block](pic/resnet_single_block.png)
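
The block in the figure can be written compactly as a formula. This is a sketch for the two-convolution block implemented below; the symbols $W_1$, $W_2$, $W_s$ and $\mathcal{F}$ are notation assumed here, not taken from the text above.

```latex
\begin{aligned}
\mathcal{F}(x) &= \mathrm{BN}\bigl(W_2 * \mathrm{ReLU}(\mathrm{BN}(W_1 * x))\bigr) \\
y &= \mathrm{ReLU}\bigl(\mathcal{F}(x) + x\bigr)
    && \text{(identity shortcut, same shape)} \\
y &= \mathrm{ReLU}\bigl(\mathcal{F}(x) + W_s * x\bigr)
    && \text{(projection shortcut, shape changes)}
\end{aligned}
```

Here $*$ denotes convolution, $W_1$ and $W_2$ are the two 3x3 kernels, and $W_s$ is a 1x1 kernel used only when the block changes the spatial size or the number of channels.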



In [1]:
import tensorflow as tf
from tensorflow.keras import layers, models, losses, optimizers, regularizers, Sequential

initializer = tf.random_normal_initializer(stddev=0.01)
regularizer = tf.keras.regularizers.l2(4e-5)

class ResBlock(layers.Layer):

    def __init__(self, filters, strides=1):
        super(ResBlock, self).__init__()
        # the first conv layer could downsample the input via strides
        self.conv1 = layers.Conv2D(filters, 
                                   (3, 3), 
                                   strides=strides, 
                                   padding='same',
                                   kernel_initializer=initializer, 
                                   kernel_regularizer=regularizer)
        self.bn1   = layers.BatchNormalization()
        self.relu  = layers.Activation('relu')
        
        # the second conv layer must preserve the shape produced by the first one,
        # therefore its strides are fixed at 1
        self.conv2 = layers.Conv2D(filters, 
                                   (3, 3), 
                                   strides=1, 
                                   padding='same', 
                                   kernel_initializer=initializer, 
                                   kernel_regularizer=regularizer)
        self.bn2   = layers.BatchNormalization()
        
        # if strides equal 1, the shortcut returns the block input unchanged
        if strides == 1:
            self.identity = lambda x:x
        # if strides are greater than 1, the shortcut must convert the input to the
        # same shape as the output of the conv path, using a strided 1x1 conv
        # e.g.: strides = 2, padding = 'same', input [b, 28, 28, c]
        # output conv1: [b, 14, 14, filters] -> conv2: [b, 14, 14, filters]
        else:
            self.identity = Sequential([layers.Conv2D(filters, (1, 1), strides)])

    def call(self, inputs, training=None):
        conv1_out = self.relu(self.bn1(self.conv1(inputs), training=training))
        conv2_out = self.bn2(self.conv2(conv1_out), training=training)

        identity = self.identity(inputs)
        output = tf.nn.relu(layers.add([conv2_out, identity]))

        return output
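
To check the shape reasoning behind the shortcut, here is a minimal sketch in plain Python (no tensorflow needed). It assumes the usual Keras conventions for one spatial dimension: `padding='same'` gives `ceil(size / stride)`, and `padding='valid'` (the default used by the 1x1 shortcut conv) gives `(size - kernel) // stride + 1`. The helper names `conv_output_size` and `resblock_shapes` are made up for this illustration.

```python
import math


def conv_output_size(size, kernel, stride, padding):
    """Output size of a Keras-style Conv2D along one spatial dimension."""
    if padding == "same":
        # 'same' padding: the output depends only on the stride
        return math.ceil(size / stride)
    if padding == "valid":
        # no padding: the kernel must fit entirely inside the input
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")


def resblock_shapes(size, stride):
    """Return (main path size, shortcut size) for one ResBlock."""
    # main path: 3x3 conv with the given stride, then 3x3 conv with stride 1
    main = conv_output_size(size, 3, stride, "same")
    main = conv_output_size(main, 3, 1, "same")
    # shortcut: identity for stride 1, otherwise a strided 1x1 'valid' conv
    if stride == 1:
        shortcut = size
    else:
        shortcut = conv_output_size(size, 1, stride, "valid")
    return main, shortcut


for size, stride in [(28, 1), (28, 2), (27, 2)]:
    main, shortcut = resblock_shapes(size, stride)
    print(f"input {size}, stride {stride}: main {main}, shortcut {shortcut}")
```

Both paths end up with the same spatial size in every case, which is what allows `layers.add` to sum them.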

## The full residual network model

In [None]:
class ResNet(models.Model):

    def __init__(self, dims, class_numbers):
        super(ResNet, self).__init__()

        self.stem = Sequential([
                                layers.Conv2D(64, 
                                              (7, 7), 
                                              strides=2, 
                                              padding='same', 
                                              kernel_initializer=initializer),
                                layers.BatchNormalization(),
                                layers.Activation('relu'),
                                layers.MaxPool2D(pool_size=(2,2), strides=1, padding='same')
                                ])

        self.layer1 = self.build_resblocks(64, dims[0])
        self.layer2 = self.build_resblocks(128, dims[1], strides=2)
        self.layer3 = self.build_resblocks(256, dims[2], strides=2)
        self.layer4 = self.build_resblocks(512, dims[3], strides=2)

        self.avgpool = layers.GlobalAveragePooling2D()
        self.fc = layers.Dense(class_numbers)

    def call(self, inputs, training=None):
        x = self.stem(inputs)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = self.fc(x)

        return x

    def build_resblocks(self, filters, numbers, strides=1):
        res_blocks = Sequential([ResBlock(filters, strides)])

        for _ in range(1, numbers):
            res_blocks.add(ResBlock(filters, strides=1))

        return res_blocks

## ResNet with 18 and 34 layers

In [None]:
def resnet18(class_numbers):
    return ResNet(dims=[2, 2, 2, 2], class_numbers=class_numbers)


def resnet34(class_numbers):
    return ResNet(dims=[3, 4, 6, 3], class_numbers=class_numbers)
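
The names "18" and "34" can be checked against the `dims` lists with a little arithmetic. The sketch below assumes the usual counting convention: one stem convolution, two 3x3 convolutions per residual block, and the final fully connected layer, with batch normalization and the 1x1 shortcut convolutions left out. `count_weighted_layers` is a hypothetical helper written for this illustration.

```python
def count_weighted_layers(dims):
    """Count layers the way the ResNet-18/34 names do (assumed convention)."""
    stem_conv = 1                # the 7x7 convolution in the stem
    block_convs = 2 * sum(dims)  # two 3x3 convolutions per ResBlock
    fc = 1                       # the final Dense layer
    return stem_conv + block_convs + fc


print(count_weighted_layers([2, 2, 2, 2]))  # 18
print(count_weighted_layers([3, 4, 6, 3]))  # 34
```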
