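## Algorithm Principles

Inception-ResNet-v1 and Inception-ResNet-v2 combine the Inception architecture with residual connections. Inception-ResNet-v1 has roughly the same computational cost as Inception-v3, while Inception-ResNet-v2 is a costlier variant with better recognition performance. The figure below shows the overall structure of Inception-ResNet-v1 and Inception-ResNet-v2:  
![pic0](pic0.png)  
The next figure shows the internal modules of Inception-ResNet-v1. From left to right they operate on the 35x35, 17x17, and 8x8 grids, and correspond to the Inception-ResNet-A, Inception-ResNet-B, and Inception-ResNet-C blocks in the structure diagram above:  
![pic1](pic1.png)  
The next figure shows the internal modules of Inception-ResNet-v2. From left to right they operate on the 35x35, 17x17, and 8x8 grids, and correspond to the Inception-ResNet-A, Inception-ResNet-B, and Inception-ResNet-C blocks in the structure diagram above:  
![pic2](pic2.png)  
The material above summarizes the paper. The rest of this document walks through Inception-ResNet-v2 using source code.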

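## Implementation

The structure of Inception-ResNet-v2 as implemented in the source code is shown below:  
![pic3](pic3.png)  
The left diagram is the overall network architecture and the right diagram is the detailed structure of the stem. The stem is implemented as follows:  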

In [2]:
import keras

def conv_bn(x,
            filters,
            kernel_size,
            strides=1,
            padding='same',
            activation=None,
            name=None):
    '''
    Convolution followed by batch normalization and an optional activation
    :param x: input tensor
    :param filters: integer, the number of kernels
    :param kernel_size: integer or tuple, the size of the kernel
    :param strides: integer or tuple, default 1
    :param padding: string, 'same' or 'valid', 'same' default
    :param activation: string, if None, linear activation
    :param name: string, prefix for the layer names (must be set, None fails on concatenation)
    :return: output tensor
    '''
    x = keras.layers.Conv2D(filters=filters, kernel_size=kernel_size, strides=strides, padding=padding, use_bias=False, name=name+'_conv2d')(x)
    x = keras.layers.BatchNormalization(axis=3, name=name + '_bn')(x)
    if activation is not None:
        x = keras.layers.Activation(activation, name=name+'_'+activation)(x)
    return x

def stem(x, name=None):
    '''
    stem structure
    :param x: input tensor
    :param name: operation name
    :return: output tensor
    '''
    x = conv_bn(x, filters=32, kernel_size=(3,3), strides=2, padding='valid', activation='relu', name=name+'_conv1')
    x = conv_bn(x, filters=32, kernel_size=(3,3), padding='valid', activation='relu', name=name+'_conv2')
    x = conv_bn(x, filters=64, kernel_size=(3,3), activation='relu', name=name+'_conv3')
    x = keras.layers.MaxPooling2D(pool_size=(3,3), strides=2, padding='valid', name=name+'_pool1')(x)
    x = conv_bn(x, filters=80, kernel_size=(1,1), padding='valid', activation='relu', name=name+'_conv4')
    x = conv_bn(x, filters=192, kernel_size=(3,3), padding='valid', activation='relu', name=name+'_conv5')
    x = keras.layers.MaxPooling2D(pool_size=(3,3), strides=2, padding='valid', name=name+'_pool')(x)
    return x

Using TensorFlow backend.
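
To see how the stem turns a 299x299 input into the 35x35 grid used by the A blocks, the spatial size can be traced layer by layer. The sketch below is a minimal illustration that only applies the standard `'same'` and `'valid'` output-size rules to the kernel sizes, strides, and paddings listed in `stem`; the names `conv_output_size`, `STEM_LAYERS`, and `trace_stem` are hypothetical helpers introduced here and are not part of the source code.

```python
import math

def conv_output_size(size, kernel, stride=1, padding="same"):
    """Spatial output size of a convolution or pooling layer."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

# (layer, kernel, stride, padding, output channels), copied from stem() in order
STEM_LAYERS = [
    ("conv1", 3, 2, "valid", 32),
    ("conv2", 3, 1, "valid", 32),
    ("conv3", 3, 1, "same", 64),
    ("pool1", 3, 2, "valid", 64),
    ("conv4", 1, 1, "valid", 80),
    ("conv5", 3, 1, "valid", 192),
    ("pool", 3, 2, "valid", 192),
]

def trace_stem(size=299):
    """Return [(layer, spatial size, channels)] for a square input."""
    trace = []
    for layer, kernel, stride, padding, channels in STEM_LAYERS:
        size = conv_output_size(size, kernel, stride, padding)
        trace.append((layer, size, channels))
    return trace

for layer, size, channels in trace_stem():
    print(f"{layer}: {size}x{size}x{channels}")
```

Under these assumptions the last line printed is `pool: 35x35x192`, which is the tensor that `inception_A` receives.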


The figure below shows the structure of the Inception-A block, followed by its implementation:  
![pic4](pic4.png)  

In [3]:
def inception_A(x, name=None):
    '''
    inception A block structure
    :param x: input tensor
    :param name: string , operation name 
    :return: output tensor
    '''
    branch_0 = conv_bn(x, filters=96, kernel_size=(1,1), activation='relu', name=name+'_branch0_convbn')
    branch_1 = conv_bn(x, filters=48, kernel_size=(1,1), activation='relu', name=name+'_branch1_convbn0')
    branch_1 = conv_bn(branch_1, filters=64, kernel_size=(5,5), activation='relu', name=name+'_branch1_convbn1')
    branch_2 = conv_bn(x, filters=64, kernel_size=(1,1), activation='relu', name=name+'_branch2_convbn0')
    branch_2 = conv_bn(branch_2, filters=96, kernel_size=(3,3), activation='relu', name=name+'_branch2_convbn1')
    branch_2 = conv_bn(branch_2, filters=96, kernel_size=(3,3), activation='relu', name=name+'_branch2_convbn2')
    branch_3 = keras.layers.AveragePooling2D(pool_size=(3,3), strides=1, padding='same', name=name+'_branch3_pool')(x)
    branch_3 = conv_bn(branch_3, filters=64, kernel_size=(1,1), activation='relu', name=name+'_branch3_convbn')
    x = keras.layers.Concatenate(axis=3, name=name + '_concat')([branch_0, branch_1, branch_2, branch_3])
    return x

From left to right, the figure below shows the Inception-ResNet-A, Inception-ResNet-B, and Inception-ResNet-C blocks, followed by their implementations:  
![pic5](pic5.png)  

In [4]:
def inception_resnet_A(x, scale=0.1, name=None):
    '''
    inception resnet A block structure
    :param x: input tensor
    :param scale: float, around 0.1
    :param name: operation name
    :return: output tensor
    '''
    branch_0 = conv_bn(x, filters=32, kernel_size=(1,1), activation='relu', name=name + '_branch0_convbn')
    
    branch_1 = conv_bn(x, filters=32, kernel_size=(1,1), activation='relu', name=name + '_branch1_convbn0')
    branch_1 = conv_bn(branch_1, filters=32, kernel_size=(3,3), activation='relu', name=name + '_branch1_convbn1')
    
    branch_2 = conv_bn(x, filters=32, kernel_size=(1,1), activation='relu', name=name + '_branch2_convbn0')
    branch_2 = conv_bn(branch_2, filters=48, kernel_size=(3,3), activation='relu', name=name + '_branch2_convbn1')
    branch_2 = conv_bn(branch_2, filters=64, kernel_size=(3,3), activation='relu', name=name + '_branch2_convbn2')
    
    x2 = keras.layers.Concatenate(axis=3, name=name+'_concat')([branch_0, branch_1,branch_2])
    
    x2 = conv_bn(x2, filters=320, kernel_size=(1,1), name=name+'_conv')
    x2 = keras.layers.Lambda(lambda x: x * scale, name=name+'_scale')(x2)
    
    x = keras.layers.Add(name=name+'_add')([x, x2])
    x = keras.layers.Activation('relu', name=name+'_relu')(x)
    return x
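
The last three steps of the block (scale, add, ReLU) compute `relu(x + scale * F(x))`, where `F(x)` is the output of the 1x1 projection. The NumPy sketch below illustrates only that arithmetic on a toy vector; `scaled_residual` is a hypothetical helper name, and the input values are made up for the example.

```python
import numpy as np

def scaled_residual(x, branch_output, scale):
    """Return relu(x + scale * branch_output)."""
    return np.maximum(x + scale * branch_output, 0.0)

x = np.array([1.0, -2.0, 0.5])               # shortcut input
branch_output = np.array([10.0, 10.0, -10.0])  # stand-in for the projected branches
y = scaled_residual(x, branch_output, scale=0.1)
print(y)
```

With a small `scale`, the residual branch only nudges the shortcut signal, which is why the docstring suggests values around 0.1.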

In [5]:
def inception_resnet_B(x, scale=0.1, name=None):
    '''
    inception resnet B structure
    :param x: input tensor
    :param scale: float, around 0.1
    :param name: string, operation name
    :return: output tensor
    '''
    branch_0 = conv_bn(x, filters=192, kernel_size=(1,1), activation='relu', name=name + '_branch0_convbn')
    
    branch_1 = conv_bn(x, filters=128, kernel_size=(1,1), activation='relu', name=name + '_branch1_convbn0')
    branch_1 = conv_bn(branch_1, filters=160, kernel_size=(1,7), activation='relu', name=name+'_branch1_convbn1')
    branch_1 = conv_bn(branch_1, filters=192, kernel_size=(7,1), activation='relu', name=name+'_branch1_convbn2')
    
    x2 = keras.layers.Concatenate(axis=3, name=name+'_concat')([branch_0, branch_1])
    x2 = conv_bn(x2, filters=1088, kernel_size=(1,1), name=name+'_conv')
    x2 = keras.layers.Lambda(lambda x: x * scale, name=name+'_scale')(x2)
    
    x = keras.layers.Add(name=name+'_add')([x, x2])
    x = keras.layers.Activation('relu', name=name+'_relu')(x)
    return x
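
Branch 1 of this block stacks a 1x7 and a 7x1 convolution, which together cover a 7x7 receptive field. A rough way to see the saving is to count weights. The sketch below compares the two factorized layers with a single 7x7 convolution; the 7x7 layer is a hypothetical alternative used only for comparison and does not appear in the network, and `conv_weights` is a helper name introduced here.

```python
def conv_weights(in_channels, out_channels, kernel_h, kernel_w):
    """Number of weights in a bias-free 2D convolution."""
    return in_channels * out_channels * kernel_h * kernel_w

# Branch 1 after its 1x1 convolution: 128 -> 160 (1x7) -> 192 (7x1)
factorized = conv_weights(128, 160, 1, 7) + conv_weights(160, 192, 7, 1)

# Hypothetical single 7x7 convolution from 128 to 192 channels
full = conv_weights(128, 192, 7, 7)

print(factorized, full, round(full / factorized, 2))
```

Under this count the factorized pair needs 358,400 weights against 1,204,224 for the single 7x7 layer.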

In [6]:
def inception_resnet_C(x, scale=0.1, name=None):
    '''
    inception resnet C structure
    :param x: input tensor
    :param scale: float, around 0.1
    :param name: string, operation name
    :return: output tensor
    '''
    branch_0 = conv_bn(x, filters=192, kernel_size=(1,1), activation='relu', name=name+'_branch0_convbn')
    
    branch_1 = conv_bn(x, filters=192, kernel_size=(1,1), activation='relu', name=name+'_branch1_convbn0')
    branch_1 = conv_bn(branch_1, filters=224, kernel_size=(1,3), activation='relu', name=name+'_branch1_convbn1')
    branch_1 = conv_bn(branch_1, filters=256, kernel_size=(3,1), activation='relu', name=name+'_branch1_convbn2')
    
    x2 = keras.layers.Concatenate(axis=3, name=name+'_concat')([branch_0, branch_1])
    x2 = conv_bn(x2, filters=2080, kernel_size=(1,1), name=name+'_conv')
    x2 = keras.layers.Lambda(lambda x: x * scale, name=name+'_scale')(x2)
    
    x = keras.layers.Add(name=name+'_add')([x, x2])
    x = keras.layers.Activation('relu', name=name+'_relu')(x)
    return x

From left to right, the figure below shows the Reduction-A and Reduction-B blocks, followed by their implementations:  
![pic6](pic6.png)  

In [7]:
def reduction_A(x, name=None):
    '''
    reduction A structure
    :param x: input tensor
    :param name: string, operation name
    :return: output tensor
    '''
    branch_0 = conv_bn(x, filters=384, kernel_size=(3,3), strides=2, padding='valid', activation='relu', name=name+'_branch0_convbn')
    
    branch_1 = conv_bn(x, filters=256, kernel_size=(1,1), activation='relu', name=name+'_branch1_convbn0')
    branch_1 = conv_bn(branch_1, filters=256, kernel_size=(3,3), activation='relu', name=name+'_branch1_convbn1')
    branch_1 = conv_bn(branch_1, filters=384, kernel_size=(3,3), strides=2, padding='valid', activation='relu', name=name+'_branch1_convbn2')
    
    branch_2 = keras.layers.MaxPooling2D(pool_size=(3,3), strides=2, padding='valid')(x)
    x = keras.layers.Concatenate(axis=3, name=name+'_concat')([branch_0, branch_1, branch_2])
    return x

In [8]:
def reduction_B(x, name=None):
    '''
    reduction B structure
    :param x: input tensor
    :param name: string, operation name
    :return: output tensor
    '''
    branch_0 = conv_bn(x, filters=256, kernel_size=(1,1), activation='relu', name=name+'_branch0_convbn0')
    branch_0 = conv_bn(branch_0, filters=384, kernel_size=(3,3), strides=2, padding='valid', activation='relu', name=name+'_branch0_convbn1')
    
    branch_1 = conv_bn(x, filters=256, kernel_size=(1,1), activation='relu', name=name+'_branch1_convbn0')
    branch_1 = conv_bn(branch_1, filters=288, kernel_size=(3,3), strides=2, padding='valid', activation='relu', name=name+'_branch1_convbn1')
    
    branch_2 = conv_bn(x, filters=256, kernel_size=(1,1), activation='relu', name=name+'_branch2_convbn0')
    branch_2 = conv_bn(branch_2, filters=288, kernel_size=(3,3), activation='relu', name=name+'_branch2_convbn1')
    branch_2 = conv_bn(branch_2, filters=320, kernel_size=(3,3), strides=2, padding='valid', activation='relu', name=name+'_branch2_convbn2')
    
    branch_3 = keras.layers.MaxPooling2D(pool_size=(3,3), strides=2, padding='valid')(x)
    
    x = keras.layers.Concatenate(axis=3, name=name+'_concat')([branch_0, branch_1, branch_2, branch_3])
    return x
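
The `filters` values 320, 1088, and 2080 used by the 1x1 projections in the residual blocks are not arbitrary: the `Add` layer needs the projection to match the channel count of the block input. The sketch below recomputes those counts from the branch widths in `inception_A`, `reduction_A`, and `reduction_B`, assuming the 35x35 grid produced by the stem for a 299x299 input. `valid_stride2` is a hypothetical helper name.

```python
def valid_stride2(size, kernel=3):
    """Spatial size after a stride-2 'valid' convolution or pooling."""
    return (size - kernel) // 2 + 1

# Inception-A concatenates four branches on the 35x35 grid
inception_a_channels = 96 + 64 + 96 + 64

# Reduction-A: two convolution branches plus max pooling (keeps input channels)
reduction_a_channels = 384 + 384 + inception_a_channels
reduction_a_size = valid_stride2(35)

# Reduction-B: three convolution branches plus max pooling
reduction_b_channels = 384 + 288 + 320 + reduction_a_channels
reduction_b_size = valid_stride2(reduction_a_size)

print(inception_a_channels)
print(reduction_a_size, reduction_a_channels)
print(reduction_b_size, reduction_b_channels)
```

The results are 320 channels at 35x35, 1088 channels at 17x17, and 2080 channels at 8x8, matching the projection widths in the A, B, and C residual blocks.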

Building on the blocks implemented above, the overall network is assembled as follows:

In [9]:
def Inception_Resnet_v2(input_shape=(299,299,3),
                        keep_rate=0.8,
                        classes=1000):
    '''
    inception resnet v2 structure
    :param input_shape: tuple, shape of the input image, (299,299,3) default
    :param keep_rate: float, fraction of units kept by dropout (dropout rate is 1 - keep_rate)
    :param classes: integer, number of classes, 1000 default
    :return: keras model
    '''
    x_input = keras.layers.Input(shape=input_shape, name='input')
    
    x = stem(x_input, name='stem')
    x = inception_A(x, name='inception_A')
    for i in range(10):
        x = inception_resnet_A(x, scale=0.17, name='inception_resnet_A_'+str(i))
    x = reduction_A(x, name='reduction_A')
    for i in range(20):
        x = inception_resnet_B(x, scale=0.1, name='inception_resnet_B_'+str(i))
    x = reduction_B(x, name='reduction_B')
    for i in range(9):
        x = inception_resnet_C(x, scale=0.2, name='inception_resnet_C_'+str(i))
    x = inception_resnet_C(x, scale=1, name='inception_resnet_C_9')
    
    x = conv_bn(x, filters=1536, kernel_size=(1,1), activation='relu', name='conv')
    x = keras.layers.GlobalAveragePooling2D(name='globalaveragepooling')(x)
    x = keras.layers.Dropout(rate=1-keep_rate, name='dropout')(x)
    x = keras.layers.Dense(units=classes, activation='softmax', name='classification')(x)
    
    model = keras.models.Model(inputs=x_input, outputs = x, name='inception_resnet_v2')
    return model

In [10]:
# Inspect the overall network structure
model = Inception_Resnet_v2()
model.summary()  # summary() prints the table itself and returns None

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input (InputLayer)              (None, 299, 299, 3)  0                                            
__________________________________________________________________________________________________
stem_conv1_conv2d (Conv2D)      (None, 149, 149, 32) 864         input[0][0]                      
__________________________________________________________________________________________________
stem_conv1_bn (BatchNormalizati (None, 149, 149, 32) 128         stem_conv1_conv2d[0][0]          
__________________________________________________________________________________________________
stem_conv1_relu (Activation)    (None, 149, 149, 32) 0           stem_conv1_bn[0][0]              
__________________________________________________________________________________________________
stem_conv2
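
The parameter counts in the summary can be checked by hand. The sketch below assumes the usual counting rules: a bias-free `Conv2D` has `kernel_h * kernel_w * in_channels * filters` weights, and `BatchNormalization` stores four values per channel (gamma, beta, moving mean, moving variance). The helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=False):
    """Parameter count of a Conv2D layer."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)

def batchnorm_params(channels):
    """Four values per channel: gamma, beta, moving mean, moving variance."""
    return 4 * channels

print(conv2d_params(3, 3, 3, 32))  # stem_conv1_conv2d
print(batchnorm_params(32))        # stem_conv1_bn
```

These give 864 and 128, the values listed for `stem_conv1_conv2d` and `stem_conv1_bn`.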

The keras.applications module ships an InceptionResNetV2 model. Its interface is:  

keras.applications.inception_resnet_v2.InceptionResNetV2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)  
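- include_top: whether to include the fully connected layer at the top of the network.
- weights: None for random initialization, or 'imagenet' to load weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor to use as the model input (for example, the output of layers.Input()).
- input_shape: optional shape tuple, only valid when include_top=False (otherwise the input shape must be (299, 299, 3) in channels_last format or (3, 299, 299) in channels_first format, because the pre-trained model was trained at this size). It must have exactly three entries, and the width and height must be no smaller than 139; for example, (150, 150, 3) is a valid value.
- pooling: optional, selects the pooling mode for feature extraction when include_top is False.  
  None means no pooling: the output is the 4D tensor produced by the last convolutional layer.  
  'avg' means global average pooling (GlobalAveragePooling2D), equivalent to appending a global average pooling layer after the last convolutional layer, so the output is a 2D tensor.  
  'max' means global max pooling.
- classes: optional number of image classes, only usable when include_top is True and no pre-trained weights are loaded.  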
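
The effect of the three `pooling` options on the output shape can be illustrated with NumPy alone. This is a sketch of the arithmetic, not Keras's implementation: the random array stands in for the last convolutional output, and its shape `(2, 8, 8, 1536)` assumes channels_last data and a 299x299 input.

```python
import numpy as np

# Stand-in for the 4D output of the last convolutional layer:
# (batch, height, width, channels)
features = np.random.default_rng(0).normal(size=(2, 8, 8, 1536))

no_pooling = features                    # pooling=None keeps the 4D tensor
avg_pooled = features.mean(axis=(1, 2))  # pooling='avg' averages over height and width
max_pooled = features.max(axis=(1, 2))   # pooling='max' takes the maximum instead

print(no_pooling.shape, avg_pooled.shape, max_pooled.shape)
```

Both pooled variants reduce each image to one 1536-dimensional feature vector, which is convenient as input to a separate classifier.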

The following image is used to demonstrate the function:  
![dog](dog.jpg)

In [13]:
from keras.applications.inception_resnet_v2 import InceptionResNetV2, preprocess_input, decode_predictions
from keras.preprocessing import image
import numpy as np

model = InceptionResNetV2(weights='imagenet')
img_path = 'dog.jpg'
img = image.load_img(img_path, target_size=(299,299))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.7/inception_resnet_v2_weights_tf_dim_ordering_tf_kernels.h5
Predicted: [('n02093256', 'Staffordshire_bullterrier', 0.5607028), ('n02093428', 'American_Staffordshire_terrier', 0.33625633), ('n02096294', 'Australian_terrier', 0.0012601839)]
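
The `preprocess_input` call matters because the pre-trained weights expect pixel values scaled from [0, 255] to [-1, 1]. The sketch below reproduces that scaling in NumPy for illustration; `scale_to_unit_range` is a hypothetical name, and the real function should still be used in practice.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return np.asarray(pixels, dtype="float32") / 127.5 - 1.0

scaled = scale_to_unit_range([0, 127.5, 255])
print(scaled)
```

Skipping this step feeds the network values far outside the range it was trained on, which typically degrades the predictions.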


## References  
(1) Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[C]//AAAI. 2017, 4: 12.  
(2) Keras Chinese documentation  https://keras.io/zh/  