╔══<i><b>Alai-DeepLearning</b></i>════════════════════════════╗
###  &nbsp;&nbsp; **✎&nbsp;&nbsp;Week 11. CNN Architectures**
# Section 6. Building ResNet

### _Objective_
1. We will read the Residual Network paper and implement it in order to understand the ResNet model. <br>
  
╚═════════════════════════════════════════╝

In [4]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cv2

import tensorflow as tf

## Graph Visualization

In [5]:
from IPython.display import clear_output, Image, display, HTML
import numpy as np    

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                # tensor_content is a bytes field, so the placeholder text must be encoded
                tensor.tensor_content = tf.compat.as_bytes("<stripped %d bytes>" % size)
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))

    display(HTML(iframe))

<br><br>

# \[ Paper Implementation \]
---
---


> *Like the VGG Network, ResNet has a very simple structure and is much easier to implement than the Inception Network.*<br>

Reference : [Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf)

<br>

## 1. Building the Placeholders
---

![Imgur](https://i.imgur.com/1MyBu1b.png)

* As with the VGG Network, the input image size is (224,224,3), and preprocessing consists of subtracting the mean value of the images (the code below subtracts a per-channel RGB mean).

In [19]:
input_shape = (224,224,3)
num_classes = 1000

graph = tf.Graph()
with graph.as_default():
    images = tf.placeholder(tf.float32, (None,224,224,3), name='images')
    labels = tf.placeholder(tf.int32, (None,), name='labels')
    
    with tf.variable_scope("preprocess"):
        image_mean = tf.constant([123.68, 116.779, 103.939], tf.float32)
        x = images - image_mean
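
The subtraction above relies on broadcasting: a length-3 mean vector is applied along the last (channel) axis of a `(N, 224, 224, 3)` batch. A minimal NumPy sketch of the same arithmetic follows; the constant-valued batch is hypothetical and only serves to make the result easy to inspect.

```python
import numpy as np

# Per-channel RGB mean, same values as `image_mean` in the cell above
image_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32)

# Hypothetical batch of 2 images, each channel filled with one constant value
images = np.empty((2, 224, 224, 3), dtype=np.float32)
images[..., 0] = 130.0  # R
images[..., 1] = 120.0  # G
images[..., 2] = 100.0  # B

# A (3,) vector broadcasts against the last axis of (N, 224, 224, 3)
x = images - image_mean

print(x.shape)     # (2, 224, 224, 3)
print(x[0, 0, 0])  # the three mean-subtracted channel values of one pixel
```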

<br>

## 2. Building the Inference Network
---

![Imgur](https://i.imgur.com/ajlup9L.png)
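* The core of ResNet is the residual block, drawn in the figure with solid and dotted shortcut lines.<br>
* A solid line marks a block whose input and output have the same shape, so the two can be added directly.<br>
A dotted line marks a block whose input and output shapes differ, so they cannot be added directly; the shortcut must first be brought to the same shape with a stride and a projection.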
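
The difference between the two shortcut types can be sketched in NumPy. This is an illustrative sketch, not the notebook's TensorFlow code: the random arrays stand in for the output of the two stacked 3x3 convolutions, and `w_proj` is a hypothetical 1x1 projection kernel. A 1x1 convolution with stride 2 is written here as spatial subsampling followed by a matrix multiply over channels, which is what it computes.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(t):
    return np.maximum(t, 0.0)

def identity_shortcut(x, fx):
    # Solid line: F(x) has the same shape as x, so they are added directly
    return relu(x + fx)

def projection_shortcut(x, fx, w, stride=2):
    # Dotted line: subsample the spatial grid (stride), then map the
    # channels with a 1x1 kernel so the shortcut matches the shape of F(x)
    projected = x[:, ::stride, ::stride, :] @ w
    return relu(projected + fx)

x = rng.standard_normal((1, 56, 56, 64))

# Stand-ins for F(x), the output of the two stacked 3x3 convolutions
fx_same = rng.standard_normal((1, 56, 56, 64))
fx_down = rng.standard_normal((1, 28, 28, 128))
w_proj = rng.standard_normal((64, 128)) * 0.1  # hypothetical 1x1 projection kernel

out_same = identity_shortcut(x, fx_same)
out_down = projection_shortcut(x, fx_down, w_proj)

print(out_same.shape)  # (1, 56, 56, 64)
print(out_down.shape)  # (1, 28, 28, 128)
```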

![Imgur](https://i.imgur.com/KRR62oi.png)

Let's build ResNet-34.

### (1) Building up to conv2_x

Although it is not shown in the figure, every convolution layer in ResNet is followed by a batch normalization layer.

![Imgur](https://i.imgur.com/g3zZGs7.png)

In [20]:
with graph.as_default():
    is_train = tf.placeholder_with_default(False,(),name='is_train')
    
    he_init = tf.initializers.he_normal()
    
    with tf.variable_scope('conv1'):
        conv = tf.layers.Conv2D(64, (7,7), strides=(2,2),padding='SAME',
                                kernel_initializer=he_init)(x)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        out = tf.nn.relu(bn)
        
    with tf.variable_scope('conv2_1'):
        pool = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME')(out)        
        
        input_conv = tf.identity(pool)        
        conv = tf.layers.Conv2D(64, (3,3), padding='SAME',
                                kernel_initializer=he_init)(pool)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(64, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(input_conv, bn)
        out = tf.nn.relu(added)

    with tf.variable_scope('conv2_2'):
        input_conv = tf.identity(out)        
        conv = tf.layers.Conv2D(64, (3,3), padding='SAME',
                                kernel_initializer=he_init)(out)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(64, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(input_conv, bn)
        out = tf.nn.relu(added)       
        
    with tf.variable_scope('conv2_3'):
        input_conv = tf.identity(out)        
        conv = tf.layers.Conv2D(64, (3,3), padding='SAME',
                                kernel_initializer=he_init)(out)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(64, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(input_conv, bn)
        out = tf.nn.relu(added)       
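
Every convolution above is followed by batch normalization. As a rough illustration of what that layer computes in training mode, here is a minimal NumPy sketch; the function name and the `eps` value are assumptions made for the example, and the moving averages used at inference time are deliberately left out.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    # Statistics are computed per channel, over the batch, height and width axes
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Learnable per-channel scale (gamma) and shift (beta)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)

# Hypothetical convolution output with a non-zero mean and a large variance
conv_out = rng.standard_normal((8, 56, 56, 64)) * 3.0 + 5.0
gamma = np.ones(64)
beta = np.zeros(64)

bn_out = batch_norm_train(conv_out, gamma, beta)

# After normalization each channel has roughly zero mean and unit variance
print(bn_out.mean(axis=(0, 1, 2))[:3])
print(bn_out.std(axis=(0, 1, 2))[:3])
```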

In [21]:
show_graph(graph)

### (2) Building conv3_x

![Imgur](https://i.imgur.com/ctUSU6L.png)

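When moving from the conv2 block to the conv3 block, the input and output shapes differ.<br>
1. The number of feature maps increases: a 1x1 convolution expands the channel dimension.
2. The spatial size of the feature maps decreases: a stride of 2 reduces it.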
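
The two changes listed above can be traced with simple arithmetic. The sketch below assumes `padding='SAME'`, where the output size is `ceil(input_size / stride)`, and uses the ResNet-34 channel counts (64, 128, 256, 512) for the four stages. The helper name and the `stages` list are illustrative.

```python
import math

def same_padding_output_size(size, stride):
    # With padding='SAME' the output size depends only on input size and stride
    return math.ceil(size / stride)

# (name, stride of the first layer in the stage, output channels)
stages = [
    ("conv1", 2, 64),
    ("maxpool", 2, 64),
    ("conv2_x", 1, 64),
    ("conv3_x", 2, 128),
    ("conv4_x", 2, 256),
    ("conv5_x", 2, 512),
]

size = 224
trace = [("input", size, 3)]
for name, stride, channels in stages:
    size = same_padding_output_size(size, stride)
    trace.append((name, size, channels))

for name, size, channels in trace:
    print(f"{name:8s} {size}x{size}x{channels}")
```

Each stride-2 stage halves the spatial size while the channel count doubles from conv3_x onward, which is why the first block of those stages needs the projection shortcut.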

In [22]:
with graph.as_default():
    with tf.variable_scope('conv3_1'):        
        input_conv = tf.identity(out)
        projection = tf.layers.Conv2D(128,(1,1),strides=(2,2))(input_conv)
        
        conv = tf.layers.Conv2D(128, (3,3), strides=(2,2), padding='SAME',
                                kernel_initializer=he_init)(out)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(128, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(projection, bn)
        out = tf.nn.relu(added)       

    with tf.variable_scope('conv3_2'):
        input_conv = tf.identity(out)
        
        conv = tf.layers.Conv2D(128, (3,3), padding='SAME',
                                kernel_initializer=he_init)(out)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(128, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(input_conv, bn)
        out = tf.nn.relu(added)       

        
    with tf.variable_scope('conv3_3'):
        input_conv = tf.identity(out)

        conv = tf.layers.Conv2D(128, (3,3), padding='SAME',
                                kernel_initializer=he_init)(out)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(128, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(input_conv, bn)
        out = tf.nn.relu(added)       
        
    with tf.variable_scope('conv3_4'):
        input_conv = tf.identity(out)

        conv = tf.layers.Conv2D(128, (3,3), padding='SAME',
                                kernel_initializer=he_init)(out)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(128, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(input_conv, bn)
        out = tf.nn.relu(added)       


In [23]:
show_graph(graph)

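We can see that there are two main types of residual block:<br>
residual blocks whose input and output shapes are the same (shown with solid lines), and<br>
residual blocks whose input and output shapes differ (shown with dotted lines).

When the same pattern repeats like this, writing it as a function makes the code<br>
more concise and less error-prone.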

In [25]:
def residual_block(input_layer, is_train, filters, strides=(1,1), block_name="residual_block"):
    """Two 3x3 conv + batch norm layers with a shortcut connection."""
    he_init = tf.initializers.he_normal()
    with tf.variable_scope(block_name):
        in_filters = input_layer.shape.as_list()[-1]
        if in_filters != filters or tuple(strides) != (1,1):
            # The shortcut shape differs from the block output (channel count or
            # spatial size), so match it with a strided 1x1 projection layer
            projection = tf.layers.Conv2D(filters, (1,1), strides=strides)(input_layer)
        else:
            # Shapes already match, so the input is used directly as the shortcut
            projection = input_layer
        
        conv = tf.layers.Conv2D(filters, (3,3), strides=strides, padding='SAME',
                                kernel_initializer=he_init)(input_layer)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        act = tf.nn.relu(bn)
        conv = tf.layers.Conv2D(filters, (3,3), padding='SAME',
                                kernel_initializer=he_init)(act)
        bn = tf.layers.BatchNormalization()(conv, training=is_train)
        added = tf.add(projection, bn)
        out = tf.nn.relu(added)

    return out

### (3) Building All the Residual Blocks

Using the function defined above, we will build the remaining residual blocks up to conv5_x.

In [26]:
input_shape = (224,224,3)
num_classes = 1000

graph = tf.Graph()
with graph.as_default():
    images = tf.placeholder(tf.float32, (None,224,224,3), name='images')
    labels = tf.placeholder(tf.int32, (None,), name='labels')
    is_train = tf.placeholder_with_default(False, (), name='is_train')
    
    with tf.variable_scope("preprocess"):
        image_mean = tf.constant([123.68, 116.779, 103.939], tf.float32)
        x = images - image_mean
        
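    conv = tf.layers.Conv2D(64,(7,7),strides=(2,2),padding='SAME',
                            kernel_initializer=tf.initializers.he_normal(),
                            name='conv1')(x)
    # conv1 is also followed by batch normalization and ReLU, as in section (1)
    bn = tf.layers.BatchNormalization(name='bn1')(conv, training=is_train)
    act = tf.nn.relu(bn)
    
    pool = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME'
                                  ,name='maxpool1')(act)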
    res2_1 = residual_block(pool, is_train, 64, block_name='conv2_1')
    res2_2 = residual_block(res2_1, is_train, 64, block_name='conv2_2')    
    res2_3 = residual_block(res2_2, is_train, 64, block_name='conv2_3') 
        
    res3_1 = residual_block(res2_3, is_train, 128, 
                            strides=(2,2), block_name='conv3_1')
    res3_2 = residual_block(res3_1, is_train, 128, block_name='conv3_2')    
    res3_3 = residual_block(res3_2, is_train, 128, block_name='conv3_3')    
    res3_4 = residual_block(res3_3, is_train, 128, block_name='conv3_4')    
    
    res4_1 = residual_block(res3_4, is_train, 256, 
                            strides=(2,2), block_name='conv4_1')
    res4_2 = residual_block(res4_1, is_train, 256, block_name='conv4_2')    
    res4_3 = residual_block(res4_2, is_train, 256, block_name='conv4_3')    
    res4_4 = residual_block(res4_3, is_train, 256, block_name='conv4_4')    
    res4_5 = residual_block(res4_4, is_train, 256, block_name='conv4_5')    
    res4_6 = residual_block(res4_5, is_train, 256, block_name='conv4_6')    

    res5_1 = residual_block(res4_6, is_train, 512, 
                            strides=(2,2), block_name='conv5_1')
    res5_2 = residual_block(res5_1, is_train, 512, block_name='conv5_2')    
    res5_3 = residual_block(res5_2, is_train, 512, block_name='conv5_3')        
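
The sixteen calls above follow a regular pattern: 3, 4, 6 and 3 blocks with 64, 128, 256 and 512 filters, where only the first block of conv3_x, conv4_x and conv5_x uses a stride of 2. The sketch below generates that plan in plain Python and checks where the "34" in ResNet-34 comes from. `stage_config` and `plan` are illustrative names, and the count follows the usual convention of ignoring the 1x1 projection convolutions.

```python
# (filters, number of residual blocks) for conv2_x .. conv5_x, as built above
stage_config = [(64, 3), (128, 4), (256, 6), (512, 3)]

plan = []
for stage_idx, (filters, num_blocks) in enumerate(stage_config, start=2):
    for block_idx in range(1, num_blocks + 1):
        # Only the first block of conv3_x, conv4_x and conv5_x downsamples
        strides = (2, 2) if (block_idx == 1 and stage_idx > 2) else (1, 1)
        plan.append((f"conv{stage_idx}_{block_idx}", filters, strides))

# Each residual block holds two 3x3 convolutions;
# add conv1 at the front and the final dense layer at the end
num_weighted_layers = 1 + 2 * len(plan) + 1

print(len(plan), num_weighted_layers)  # 16 34
```

Such a plan could drive a loop that calls `residual_block` once per entry instead of writing each call by hand.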
    

In [27]:
show_graph(graph)

### (4) Building the Global Average Pooling Layer

![Imgur](https://i.imgur.com/kbVvJCU.png)

A Global Average Pooling layer is attached at the end of ResNet.<br>
Global Average Pooling computes the average value of each feature map, producing one value per channel.<br>

With a Global Average Pooling layer, the features can be fed to the classifier regardless of the input image size.<br>
Many models since ResNet place a Global Average Pooling layer at the end of the classification model, which makes them independent of the image size.

In [28]:
n_classes = 1000
with graph.as_default():
    xavier_init = tf.initializers.glorot_normal()
    
    gap = tf.reduce_mean(res5_3,axis=(1,2),
                         name='global_average_pooling')    
    logits = tf.layers.Dense(n_classes,
                             kernel_initializer=xavier_init,
                             name='logits')(gap)
    y_pred = tf.nn.softmax(logits,name='prediction')
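
The size independence of Global Average Pooling can be checked with a small NumPy sketch. The two feature maps are hypothetical stand-ins for the conv5_x output at two input resolutions (a 224x224 input gives 7x7, and a 320x320 input would give 10x10 after the total stride of 32).

```python
import numpy as np

def global_average_pooling(feature_map):
    # Average over the height and width axes, leaving one value per channel
    return feature_map.mean(axis=(1, 2))

rng = np.random.default_rng(0)
small = rng.standard_normal((2, 7, 7, 512))    # e.g. from a 224x224 input
large = rng.standard_normal((2, 10, 10, 512))  # e.g. from a 320x320 input

gap_small = global_average_pooling(small)
gap_large = global_average_pooling(large)

print(gap_small.shape)  # (2, 512)
print(gap_large.shape)  # (2, 512)
```

Both outputs have the same shape, so the same dense layer can follow in either case.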

In [29]:
show_graph(graph)

#  

---

    Copyright(c) 2019 by Public AI. All rights reserved.<br>
    Written by PAI, SangJae Kang ( rocketgrowthsj@publicai.co.kr )  last updated on 2019/05/14

---