# ResNet-InceptionV4-InceptionResNet

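Why build deep networks?

Answer: Each layer of a neural network is thought to extract features at a different level of abstraction (low, mid, and high). The deeper the network, the more levels of features it can extract, and the more ways those levels can be combined.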

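Why can ResNets be built so deep?

Answer: The main obstacles to depth in deep learning are vanishing and exploding gradients. The traditional remedies are normalized initialization and batch normalization. These largely resolve the gradient problem and allow deeper networks, but they expose a second problem, degradation: as depth increases, the error rate goes up instead of down. Residual connections were designed to solve the degradation problem; they also help with the gradient problem and improve the network's performance.

In a plain network (one without residual connections, similar to VGG), experience shows that as depth increases the training error first decreases and then increases. The increase has been shown not to come from overfitting but from the deeper network being harder to train. In theory, the deeper the network the better. In practice, without residual connections, greater depth makes a plain network harder for the optimization algorithm to train, and the training error grows. This is what is called network degradation.

ResNets address this problem: even very deep networks still train well. Residual connections help with vanishing and exploding gradients, so we can train deeper networks while maintaining good performance.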

## Residual Block (Identity Block)

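The input x is carried across 2–3 layers by a skip connection and added to their output F(x), so the block's output is y = F(x) + x. The benefit is that during backpropagation the derivative ∂y/∂x = ∂F/∂x + 1 always contains a 1, which reduces the likelihood of vanishing gradients.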
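To make the "there is always a 1" argument concrete, the sketch below compares the end-to-end derivative of a stack of plain layers with that of a stack of residual layers. It assumes a toy scalar model in which every layer is F(x) = w·x with a small w; the function names `plain_gradient` and `residual_gradient` are illustrative and do not come from any library.

```python
def plain_gradient(w, depth):
    """d(output)/d(input) for `depth` stacked layers y = w * x."""
    grad = 1.0
    for _ in range(depth):
        grad *= w            # each layer contributes a factor dF/dx = w
    return grad


def residual_gradient(w, depth):
    """d(output)/d(input) for `depth` stacked blocks y = w * x + x."""
    grad = 1.0
    for _ in range(depth):
        grad *= (w + 1.0)    # the shortcut adds 1 to every factor
    return grad


w, depth = 0.1, 20           # assumed toy values
print(plain_gradient(w, depth))     # on the order of 1e-20: the gradient vanishes
print(residual_gradient(w, depth))  # roughly 6.7: the gradient survives
```

Real layers are neither scalar nor linear, so this only illustrates the mechanism: the shortcut keeps each factor away from zero even when ∂F/∂x is small.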

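By stacking layers that compute y = x (called identity mappings) on top of a shallower network, the network can grow deeper without degrading. The fact that plain deep networks degrade anyway suggests that stacks of nonlinear layers have difficulty approximating an identity mapping.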

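If the network has already reached a near-saturated accuracy (or when the error of the later layers starts to grow), the learning goal for the remaining layers becomes an identity mapping, that is, making the output H(x) approximate the input x, so that the later layers do not reduce accuracy. For example, in a network with 5 layers, if the first four layers already realize an optimal function, the fifth layer is unnecessary. With the skip connection, the optimization target for that layer changes from fitting an identity mapping H(x) = x to driving the residual F(x) = H(x) − x toward 0; fitting any other function would degrade the network. This is how residual connections address the difficulty of training very deep networks.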
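The following sketch illustrates why "drive the residual toward 0" is an easy target. It assumes a toy fully connected layer on plain Python lists (the helpers `plain_layer` and `residual_layer` are hypothetical, written only for this example). With all weights at zero, the plain layer destroys its input, while the residual layer reproduces it exactly.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def linear(v, weights):
    """Multiply vector v by a weight matrix given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def plain_layer(v, weights):
    # H(x) = relu(W x)
    return relu(linear(v, weights))


def residual_layer(v, weights):
    # H(x) = F(x) + x, with F(x) = relu(W x)
    f = relu(linear(v, weights))
    return [fi + vi for fi, vi in zip(f, v)]


x = [1.0, -2.0, 3.0]
zero_weights = [[0.0] * 3 for _ in range(3)]

print(plain_layer(x, zero_weights))     # [0.0, 0.0, 0.0]
print(residual_layer(x, zero_weights))  # [1.0, -2.0, 3.0]
```

For the plain layer to act as an identity it would need very specific weights (and could not pass negative values through the ReLU at all); for the residual layer, weights near zero are enough.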

## Part1

In [1]:
import numpy as np
from keras.models import Model
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Input
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import GlobalMaxPooling2D
from keras.layers import GlobalAveragePooling2D
from keras import backend as K
from keras import layers
from keras.layers import BatchNormalization
from keras.layers import Activation
from keras.layers import Concatenate
from keras.layers import Lambda

![ResNet structure](ResNet_Structure.png)

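Removing the BN after the identity branch

A BN layer on the identity branch changes the distribution of the signal that branch carries, which slows convergence. The paper also uses the 1\*1 kernel trick from the Inception block: a 1\*1 kernel first compresses the channel depth, and a final 1\*1 kernel restores it, which reduces computation.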
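The saving from the compress-then-restore trick can be checked with simple arithmetic. The sketch below counts convolution weights (biases ignored) for a single 3\*3 convolution versus a 1\*1, 3\*3, 1\*1 bottleneck. The channel sizes 256 and 64 are assumptions chosen for illustration, and `conv_params` is a hypothetical helper.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weight count of a kernel x kernel convolution, ignoring biases."""
    return kernel * kernel * in_channels * out_channels


channels, reduced = 256, 64   # assumed sizes, for illustration only

# One 3x3 convolution applied at full depth.
direct = conv_params(3, channels, channels)

# 1x1 compress -> 3x3 at reduced depth -> 1x1 restore.
bottleneck = (conv_params(1, channels, reduced)
              + conv_params(3, reduced, reduced)
              + conv_params(1, reduced, channels))

print(direct)      # 589824
print(bottleneck)  # 69632
```

Under these assumed sizes the bottleneck needs roughly an eighth of the weights, because the expensive 3\*3 convolution runs on 64 channels instead of 256.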

## ResNetV1

In [4]:
def Residual_block(input_tensor, kernel_size, filters, stage, block):
    # filters3 is unpacked but not used in this two-convolution version.
    filters1, filters2, filters3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=3, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size,
               padding='same', name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=3, name=bn_name_base + '2b')(x)

    # Identity shortcut: filters2 must equal the channel count of input_tensor.
    x = layers.add([x, input_tensor])
    x = Activation('relu')(x)
    return x

## Build a ResNetV2 Residual Block based on ResNetV1

In [5]:
def Residual_block_v2(input_tensor, kernel_size, filters, stage, block):
    filters1, filters2, filters3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Build the V2 block: BN -> ReLU -> Conv (pre-activation ordering).
    x = BatchNormalization(axis=3, name=bn_name_base + '2a')(input_tensor)
    x = Activation('relu')(x)
    x = Conv2D(filters1, kernel_size,padding='same', name=conv_name_base + '2a')(x)
    
    x = BatchNormalization(axis=3, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)
    x = Conv2D(filters2, (1, 1), name=conv_name_base + '2b')(x)
    
    x = layers.add([x, input_tensor])
    return x

## Try designing your own V2 block that compresses and then restores depth

In [6]:
def Residual_block_v2(input_tensor, kernel_size, stage, block,reduce=96,ouput_size=128):
    
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(reduce, (1, 1), name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=3, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(reduce, kernel_size,
               padding='same', name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=3, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(ouput_size, (1, 1), name=conv_name_base + '2c')(x)

    # Identity shortcut: ouput_size must equal the channel count of input_tensor.
    x = layers.add([x, input_tensor])
    return x

---

## Part2

## Inception Block-A

![Inception-ResNet-A](Inception-ResNet-A.png)

## Inception Block-B

![Inception-ResNet-B](Inception-ResNet-B.png)

## Inception Block-C

![Inception-ResNet-C](Inception-ResNet-C.png)

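The versions that include residual connections converge substantially faster. The paper also notes that, at the same parameter count, adding residual connections does not necessarily improve accuracy, but it does speed up convergence and makes it possible to build deeper networks.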

In [7]:
def Conv2d_bn(x,filters,kernel_size,padding='same',strides=(1, 1),normalizer=True,activation='relu',name=None):
    if name is not None:
        conv_name = name + '_conv'
        bn_name = name + '_bn'
        act_name = name + '_act'
    else:
        conv_name = None
        bn_name = None
        act_name = None
    if K.image_data_format() == 'channels_first':
        bn_axis = 1
    else:
        bn_axis = 3
    x = Conv2D(
            filters, kernel_size,
            strides=strides, padding=padding,
            use_bias=False, name=conv_name)(x)
    if normalizer:
        x = BatchNormalization(axis=bn_axis, scale=False, name=bn_name)(x)
    if activation:
        x = Activation(activation, name=act_name)(x)
    return x

In [8]:
def Residual_block(input_tensor, kernel_size, filters, stage, block):
    filters1, filters2, filters3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=3, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size,
               padding='same', name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=3, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
    x = BatchNormalization(axis=3, name=bn_name_base + '2c')(x)

    x = layers.add([x, input_tensor])
    x = Activation('relu')(x)
    return x

## Build the Inception block of Inception-ResNet, using the Residual_block above as a reference

In [14]:
def inception_resnet_block(x, scale, block_type, activation='relu'):
    '''scale: scaling factor to scale the residuals (i.e., the output of
            passing `x` through an inception module) before adding them
            to the shortcut branch. Let `r` be the output from the residual branch,
            the output of this block will be `x + scale * r`. (In short, `scale` controls how much the residual branch contributes.)'''
    if block_type == 'Incpetion_Block-A':
        branch_0 = Conv2d_bn(x, 32, 1)
        branch_1 = Conv2d_bn(x, 32, 1)
        branch_1 = Conv2d_bn(branch_1, 32, 3)
        branch_2 = Conv2d_bn(x, 32, 1)
        branch_2 = Conv2d_bn(branch_2, 48, 3)
        branch_2 = Conv2d_bn(branch_2, 64, 3)
        branches = [branch_0, branch_1, branch_2]
    elif block_type == 'Incpetion_Block-B':
        branch_0 = Conv2d_bn(x, 192, 1)
        branch_1 = Conv2d_bn(x, 128, 1)
        branch_1 = Conv2d_bn(branch_1, 160, [1, 7])
        branch_1 = Conv2d_bn(branch_1, 192, [7, 1])
        branches = [branch_0, branch_1]
    elif block_type == 'Incpetion_Block-C':
        branch_0 = Conv2d_bn(x, 192, 1)
        branch_1 = Conv2d_bn(x, 192, 1)
        branch_1 = Conv2d_bn(branch_1, 192, [1, 3])
        branch_1 = Conv2d_bn(branch_1, 192, [3, 1])
        branches = [branch_0, branch_1]
    else:
        raise ValueError('Unknown Inception-ResNet block type. '
                         'Expects "Incpetion_Block-A", "Incpetion_Block-B" or "Incpetion_Block-C", '
                         'but got: ' + str(block_type))
    mixed = Concatenate(axis=3)(branches)
    
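    # Make sure the output depth matches the input depth.
    up = Conv2d_bn(mixed,K.int_shape(x)[3],1,activation=None)
    
    # Add the residual connection, weighting the residual branch by `scale`.
    # A Lambda layer is the natural choice when you only need to transform the data
    # flowing through a layer and the transform has no parameters to learn.
    x = Lambda(lambda inputs, scale: inputs[0] + inputs[1] * scale,  # inputs[0] is x, inputs[1] is up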
               output_shape=K.int_shape(x)[1:],
               arguments={'scale': scale},)([x,up])
    
    if activation is not None:
        x = Activation(activation)(x)
    return x
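
The effect of `scale` in the `Lambda` layer above can be seen without Keras. The sketch below applies the same element-wise formula, `x + scale * r`, to plain Python lists. The helper `scaled_residual` and the input values are made up for illustration.

```python
def scaled_residual(shortcut, residual, scale):
    """Element-wise shortcut + scale * residual, as in the Lambda layer."""
    return [s + scale * r for s, r in zip(shortcut, residual)]


shortcut = [1.0, 2.0, 3.0]      # stands in for x
residual = [10.0, -10.0, 0.0]   # stands in for the Inception branch output

print(scaled_residual(shortcut, residual, 1.0))  # [11.0, -8.0, 3.0]
print(scaled_residual(shortcut, residual, 0.1))  # [2.0, 1.0, 3.0]
```

With `scale=1.0` the residual branch dominates the output; with `scale=0.1`, the value used in the test cells of this notebook, the shortcut signal stays in control and the branch only nudges it.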


In [15]:
img_input = Input(shape=(224,224,32))
x=inception_resnet_block(img_input, 0.1, 'Incpetion_Block-A', activation='relu')
print(x)

KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 32), dtype=tf.float32, name=None), name='activation_26/Relu:0', description="created by layer 'activation_26'")


## Test

In [16]:
img_input = Input(shape=(224,224,32))
x=inception_resnet_block(img_input, 0.1, 'Incpetion_Block-A', activation='relu')
print(x)

KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 32), dtype=tf.float32, name=None), name='activation_33/Relu:0', description="created by layer 'activation_33'")


In [17]:
img_input = Input(shape=(224,224,32))
x=inception_resnet_block(img_input, 0.1, 'Incpetion_Block-B', activation='relu')
print(x)

KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 32), dtype=tf.float32, name=None), name='activation_38/Relu:0', description="created by layer 'activation_38'")


In [18]:
img_input = Input(shape=(224,224,32))
x=inception_resnet_block(img_input, 0.1, 'Incpetion_Block-C', activation='relu')
print(x)

KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 32), dtype=tf.float32, name=None), name='activation_43/Relu:0', description="created by layer 'activation_43'")


## Try importing Inception-ResNet blocks into blocks 3 and 5 of Vgg_Inception

In [19]:
def VGG16_ResNet_Inception(include_top=True,input_tensor=None, input_shape=(224,224,1),
          pooling='max',classes=1000):
 
    img_input = Input(shape=input_shape)
    
    x = Conv2d_bn(img_input,64, (3, 3), activation='relu', padding='same', name='block1_conv1')
    x = Conv2d_bn(x,64, (3, 3), activation='relu', padding='same', name='block1_conv2')
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Conv2d_bn(x,128, (3, 3), activation='relu', padding='same', name='block2_conv1')
    x = Conv2d_bn(x,128, (3, 3), activation='relu', padding='same', name='block2_conv2')
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = inception_resnet_block(x, 0.1, 'Incpetion_Block-A', activation='relu')
    x = inception_resnet_block(x, 0.1, 'Incpetion_Block-A', activation='relu')
    x = inception_resnet_block(x, 0.1, 'Incpetion_Block-A', activation='relu')
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Conv2d_bn(x,512, (3, 3), activation='relu', padding='same', name='block4_conv1')
    x = Conv2d_bn(x,512, (3, 3), activation='relu', padding='same', name='block4_conv2')
    x = Conv2d_bn(x,512, (3, 3), activation='relu', padding='same', name='block4_conv3')
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5 
    x =inception_resnet_block(x, 0.1, 'Incpetion_Block-B', activation='relu')
    x =inception_resnet_block(x, 0.1, 'Incpetion_Block-B', activation='relu')
    x =inception_resnet_block(x, 0.1, 'Incpetion_Block-B', activation='relu')
    x =MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(classes, activation='softmax', name='predictions')(x)
    else:
        # A good place to remind students why avg or max pooling is added here
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

    inputs = img_input
    # Create model.
    model = Model(inputs, x, name='vgg16')

   
    return model
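
On the question raised in the `include_top=False` branch: without the dense layers, the output would still be a feature map of shape (height, width, channels). Global pooling collapses each channel to a single number, giving a fixed-length vector regardless of the spatial size. The sketch below shows this on nested lists; `global_pool` and `mean` are hypothetical helpers, not Keras functions.

```python
def global_pool(feature_map, reduce_fn):
    """Collapse an H x W x C feature map (nested lists) to C values."""
    channels = len(feature_map[0][0])
    pooled = []
    for c in range(channels):
        # Gather every spatial position of channel c, then reduce it.
        values = [pixel[c] for row in feature_map for pixel in row]
        pooled.append(reduce_fn(values))
    return pooled


def mean(values):
    return sum(values) / len(values)


# A 2 x 2 feature map with 3 channels (toy values).
fmap = [[[1.0, 0.0, 4.0], [3.0, 0.0, 4.0]],
        [[5.0, 8.0, 4.0], [7.0, 0.0, 4.0]]]

print(global_pool(fmap, max))   # [7.0, 8.0, 4.0]
print(global_pool(fmap, mean))  # [4.0, 2.0, 4.0]
```

Max pooling keeps the strongest response of each channel, while average pooling summarizes the whole map. Compare channel 1, where a single strong activation gives 8.0 under max but only 2.0 under average.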



In [20]:
model = VGG16_ResNet_Inception(include_top=False)

In [21]:
model.summary()

Model: "vgg16"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_10 (InputLayer)           [(None, 224, 224, 1) 0                                            
__________________________________________________________________________________________________
block1_conv1_conv (Conv2D)      (None, 224, 224, 64) 576         input_10[0][0]                   
__________________________________________________________________________________________________
block1_conv1_bn (BatchNormaliza (None, 224, 224, 64) 192         block1_conv1_conv[0][0]          
__________________________________________________________________________________________________
block1_conv1_act (Activation)   (None, 224, 224, 64) 0           block1_conv1_bn[0][0]            
______________________________________________________________________________________________