╔══<i><b>Alai-DeepLearning</b></i>════════════════════════════╗
###  &nbsp;&nbsp; **✎&nbsp;&nbsp;Week 11. CNN Architectures**
# Section 2. Building GoogLeNet

### _Objective_
1. Read the GoogLeNet paper and implement it in order to understand the GoogLeNet model. <br>
  
╚═════════════════════════════════════════╝

In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cv2

import tensorflow as tf


## Graph Visualization

In [2]:
from IPython.display import clear_output, Image, display, HTML
import numpy as np    

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
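                # tensor_content is a bytes field, so the placeholder must be bytes in Python 3
                tensor.tensor_content = tf.compat.as_bytes("<stripped %d bytes>"%size)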
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))

    display(HTML(iframe))

<br><br>

# \[ Paper Implementation \]
---
---


> *GoogLeNet was introduced in the same year as the VGG Network and beat it to take first place in the ILSVRC 2014 competition.*<br>
> *Its defining feature is the Inception module, which made it possible to build a deeper network than the VGG Network.*<br>

![Imgur](https://i.imgur.com/cxNp3RD.png)

Reference : [Going Deeper with Convolutions](https://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf)

<br>

## 1. Building the Placeholders
---

![Imgur](https://i.imgur.com/WvSC8Sl.png)

* As in the VGG Network, the input image size is (224,224,3), and the only preprocessing is subtracting the mean value of the images.

In [3]:
input_shape = (224,224,3)
num_classes = 1000

graph = tf.Graph()
with graph.as_default():
    images = tf.placeholder(tf.float32, (None,224,224,3), name='images')
    labels = tf.placeholder(tf.int32, (None,), name='labels')
    
    with tf.variable_scope("preprocess"):
        image_mean = tf.constant([123.68, 116.779, 103.939], tf.float32)
        x = images - image_mean

<br>

## 2. Building the Inference Network
---

* GoogLeNet is built by repeating the Inception module shown below. <br>
* The number of filters in each convolution layer of the Inception module changes from stage to stage. According to the paper, varying these values does not change performance much.<br>
![Imgur](https://i.imgur.com/6rawkQJ.png)

### (1) Implementing the Inception Module

Because the same Inception module appears again and again in GoogLeNet,<br>
we implement it once as a function and reuse it for every block.<br>

In [4]:
def inception_module(prev_layer, 
                     one_conv_filters, 
                     reduced_three_filters,
                     three_conv_filters,
                     reduced_five_filters,
                     five_conv_filters,
                     after_pool_filters,
                     module_name='inception'):
    """
    Build one Inception module.
    
    Arguments
    prev_layer : input tensor
    one_conv_filters : number of filters in the 1x1 convolution
    reduced_three_filters : number of filters in the 1x1 convolution before the 3x3 convolution
    three_conv_filters : number of filters in the 3x3 convolution
    reduced_five_filters : number of filters in the 1x1 convolution before the 5x5 convolution
    five_conv_filters : number of filters in the 5x5 convolution
    after_pool_filters : number of filters in the 1x1 convolution after the 3x3 max pool
    """
    he_init = tf.initializers.he_uniform()
    with tf.variable_scope(module_name):
        out1 = tf.layers.Conv2D(one_conv_filters,(1,1),padding='SAME',
                                kernel_initializer=he_init,
                                activation=tf.nn.relu,name='1x1_conv')(prev_layer)
        
        out2 = tf.layers.Conv2D(reduced_three_filters, (1,1), padding='SAME',
                                kernel_initializer=he_init,
                                activation=tf.nn.relu, name='reduced_3x3')(prev_layer)
        out2 = tf.layers.Conv2D(three_conv_filters, (3,3), padding='SAME',
                                kernel_initializer=he_init,
                                activation=tf.nn.relu, name='3x3_conv')(out2)

        out3 = tf.layers.Conv2D(reduced_five_filters, (1,1), padding='SAME',
                                kernel_initializer=he_init,
                                activation=tf.nn.relu, name='reduced_5x5')(prev_layer)
        out3 = tf.layers.Conv2D(five_conv_filters, (5,5), padding='SAME',
                                kernel_initializer=he_init,
                                activation=tf.nn.relu, name='5x5_conv')(out3)
        
        out4 = tf.layers.MaxPooling2D((3,3),(1,1),padding='SAME',name='3x3_pool')(prev_layer)
        out4 = tf.layers.Conv2D(after_pool_filters, (1,1), padding='SAME',
                                kernel_initializer=he_init,
                                activation=tf.nn.relu, name='after_pool')(out4)
        
        out = tf.concat([out1,out2,out3,out4],axis=-1,name='filter_concatenation')
    return out
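
The last step of the module, the filter concatenation, can be illustrated without TensorFlow. The sketch below is only an illustration: the zero-filled arrays are hypothetical stand-ins for the four branch outputs, and the channel counts (64, 128, 32, 32) are the Inception (3a) values used later in this notebook.

```python
import numpy as np

# Hypothetical branch outputs for one 28x28 feature map, using the
# Inception (3a) output filter counts (64, 128, 32, 32).
batch, height, width = 1, 28, 28
branch_1x1 = np.zeros((batch, height, width, 64), dtype=np.float32)
branch_3x3 = np.zeros((batch, height, width, 128), dtype=np.float32)
branch_5x5 = np.zeros((batch, height, width, 32), dtype=np.float32)
branch_pool = np.zeros((batch, height, width, 32), dtype=np.float32)

# Same role as tf.concat(..., axis=-1) in inception_module:
# spatial size is unchanged, channel counts add up.
merged = np.concatenate([branch_1x1, branch_3x3, branch_5x5, branch_pool], axis=-1)
print(merged.shape)  # (1, 28, 28, 256)
```

Every branch uses `padding='SAME'` with stride 1, which is what keeps the spatial sizes equal and makes the concatenation along the channel axis possible.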

### (2) Building the Layers Before Inception (3a)

![Imgur](https://i.imgur.com/CiCfDnS.png)

* For memory reasons, the first few layers do not follow the Inception module form; they use a 7x7 convolution and a 3x3 convolution instead.

In [5]:
with graph.as_default():
    he_init = tf.initializers.he_uniform()    
    
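    # The paper applies ReLU to every convolution.
    conv1 = tf.layers.Conv2D(64,(7,7),strides=(2,2),padding='SAME',
                             kernel_initializer=he_init,
                             activation=tf.nn.relu,name='7x7_conv')(x)
    # padding='SAME' keeps the 112 -> 56 -> 28 output sizes listed in the table.
    pool1 = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME',name='MaxPool_1')(conv1)
    conv2 = tf.layers.Conv2D(192,(3,3),padding='SAME',
                             kernel_initializer=he_init,
                             activation=tf.nn.relu,name='3x3_conv')(pool1)
    pool2 = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME',name='MaxPool_2')(conv2)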


In [6]:
show_graph(graph)

![Imgur](https://i.imgur.com/Kmyt4Dw.png)
* The numbers in each column (#1x1, #3x3 reduce, #3x3, etc.) are the filter counts inside the Inception module shown below.<br>
* #3x3 reduce is the number of 1x1 filters placed before the 3x3 filters, and pool proj is the number of 1x1 filters applied after max pooling.<br>
![Imgur](https://i.imgur.com/zop0MO6.png)

#### Optional

* The values in the `params` column of the table appear to have been miscalculated.<br>
* Discussion of this issue: [stackoverflow](https://stackoverflow.com/questions/30585122/how-to-calculate-the-number-of-parameters-for-google-net/47880856)
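
One way to check the `params` column by hand is to count the weights of the six convolutions in a module. The sketch below does this for Inception (3a), assuming 192 input channels (the output of the preceding 3x3 convolution) and one bias per filter. The helper names `conv_params` and `inception_params` are made up for this example. The result comes out somewhat above the value of roughly 159K that the table lists for this module, if the table is read correctly.

```python
def conv_params(kernel, in_channels, out_channels, use_bias=True):
    """Weights (plus optional biases) of one kernel x kernel convolution."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)

def inception_params(in_channels, n1x1, n3x3_reduce, n3x3,
                     n5x5_reduce, n5x5, pool_proj):
    """Sum the parameters of the six convolutions in one Inception module."""
    return (conv_params(1, in_channels, n1x1)
            + conv_params(1, in_channels, n3x3_reduce)
            + conv_params(3, n3x3_reduce, n3x3)
            + conv_params(1, in_channels, n5x5_reduce)
            + conv_params(5, n5x5_reduce, n5x5)
            + conv_params(1, in_channels, pool_proj))

# Inception (3a): 192 input channels, filter counts taken from the table.
params_3a = inception_params(192, 64, 96, 128, 16, 32, 32)
print(params_3a)  # 163696
```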

### (3) Building Inception (3a)
![Imgur](https://i.imgur.com/GqXsI8W.png)

In [7]:
with graph.as_default():    
    block_3a = inception_module(pool2, one_conv_filters=64, 
                                reduced_three_filters=96,three_conv_filters=128,
                                reduced_five_filters=16,five_conv_filters=32,
                                after_pool_filters=32,module_name='inception_3a')

In [8]:
show_graph(graph)

### (4) Building up to Inception (5b)

In [9]:
graph = tf.Graph()
with graph.as_default():
    images = tf.placeholder(tf.float32, (None,224,224,3), name='images')
    
    with tf.variable_scope("preprocess"):
        image_mean = tf.constant([123.68, 116.779, 103.939], tf.float32)
        x = images - image_mean

    he_init = tf.initializers.he_uniform()    
    
    # The paper applies ReLU to every convolution.
    conv1 = tf.layers.Conv2D(64,(7,7),strides=(2,2),padding='SAME',
                             kernel_initializer=he_init,
                             activation=tf.nn.relu,name='7x7_conv')(x)
    pool1 = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME',name='MaxPool_1')(conv1)

    conv2 = tf.layers.Conv2D(192,(3,3),padding='SAME',
                             kernel_initializer=he_init,
                             activation=tf.nn.relu,name='3x3_conv')(pool1)
    pool2 = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME',name='MaxPool_2')(conv2)
    
    block_3a = inception_module(pool2,64,96,128,16,32,32,module_name='inception_3a')
    block_3b = inception_module(block_3a,128,128,192,32,96,64,module_name='inception_3b')
    pool3 = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME',name='MaxPool_3')(block_3b)    
    
    block_4a = inception_module(pool3,192,96,208,16,48,64,module_name='inception_4a')
    block_4b = inception_module(block_4a,160,112,224,24,64,64,module_name='inception_4b')
    block_4c = inception_module(block_4b,128,128,256,24,64,64,module_name='inception_4c')
    block_4d = inception_module(block_4c,112,144,288,32,64,64,module_name='inception_4d')    
    block_4e = inception_module(block_4d,256,160,320,32,128,128,module_name='inception_4e')        
    pool4 = tf.layers.MaxPooling2D((3,3),(2,2),padding='SAME',name='MaxPool_4')(block_4e)    
    
    block_5a = inception_module(pool4,256,160,320,32,128,128,module_name='inception_5a')
    block_5b = inception_module(block_5a,384,192,384,48,128,128,module_name='inception_5b')  
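
As a sanity check on the cell above, the output shape of each stage can be traced with plain arithmetic. This is a rough sketch, not part of the original notebook: `same_out` is a hypothetical helper that applies the `padding='SAME'` rule (output = ceil(input / stride)), and the filter tuples are copied from the `inception_module` calls.

```python
import math

def same_out(size, stride):
    """Spatial output size of a conv/pool layer with padding='SAME'."""
    return math.ceil(size / stride)

# (1x1, 3x3, 5x5, pool-proj) output filters per block, copied from the calls above.
blocks = {
    "3a": (64, 128, 32, 32), "3b": (128, 192, 96, 64),
    "4a": (192, 208, 48, 64), "4b": (160, 224, 64, 64),
    "4c": (128, 256, 64, 64), "4d": (112, 288, 64, 64),
    "4e": (256, 320, 128, 128),
    "5a": (256, 320, 128, 128), "5b": (384, 384, 128, 128),
}
# Output channels of a block = sum of the four concatenated branches.
channels = {name: sum(filters) for name, filters in blocks.items()}

size = 224
for _ in range(5):  # 7x7_conv, MaxPool_1, MaxPool_2, MaxPool_3, MaxPool_4
    size = same_out(size, 2)

print(size, channels["5b"])  # 7 1024
```

The final 7x7x1024 feature map is what the 7x7 average pooling in the next step operates on.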
    

In [10]:
show_graph(graph)

### (5) Building the Rest of the Inference Network

![Imgur](https://i.imgur.com/uDCgfil.png)

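* GoogLeNet does not attach a fully connected layer directly to the last feature map. Instead, it first condenses the information of each filter with average pooling,<br>
and only then connects the fully connected layer.<br>
* This technique is known as Global Average Pooling (GAP), which the paper adopts from Network in Network, and it is widely used to summarize the output of convolution layers.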
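
The effect of the pooling step can be shown in a few lines of NumPy. This is only a sketch: the random array is a hypothetical stand-in for the `inception_5b` output, whose shape is assumed to be (batch, 7, 7, 1024).

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the output of inception_5b: (batch, 7, 7, 1024).
features = rng.standard_normal((2, 7, 7, 1024))

# Averaging over height and width keeps exactly one number per channel.
pooled = features.mean(axis=(1, 2))
print(pooled.shape)  # (2, 1024)
```

A 7x7 average pool over a 7x7 feature map produces the same values, which is why the layer that follows sees only 1024 inputs instead of 7 * 7 * 1024.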

In [None]:
with graph.as_default():
    is_train = tf.placeholder_with_default(False, (),name='is_train')
    
    xavier_init = tf.initializers.glorot_normal()
    avg_pool = tf.layers.AveragePooling2D((7,7),(1,1),name='avg_pool')(block_5b)
    avg_pool = tf.layers.Flatten()(avg_pool)
    dropout = tf.layers.Dropout(rate=0.4)(avg_pool, training=is_train)
    fc = tf.layers.Dense(1024, kernel_initializer=he_init,
                         activation=tf.nn.relu)(dropout)
    logits = tf.layers.Dense(num_classes,kernel_initializer=xavier_init)(fc)
    y_pred = tf.nn.softmax(logits)

In [None]:
show_graph(graph)

<br>

## 3. Building the Training Network
---
![Imgur](https://i.imgur.com/pgAC6FM.png)

* Like the VGG Network, GoogLeNet also had to answer the question "how do we train a deep network?"<br>
* Unlike the VGG Network, it addresses the problem by adding auxiliary networks.<br>

### (1) Building the Auxiliary Networks

![Imgur](https://i.imgur.com/fPqF4pv.png)

* The parts marked in red above are the auxiliary networks.<br> They do not serve as the actual classifier; they exist so that the loss signal is propagated strongly down to the lower layers.

![Imgur](https://i.imgur.com/sI3HLST.png)

* The detailed specification of the auxiliary network is shown above.

In [None]:
with graph.as_default():
    with tf.variable_scope('auxiliary_network_4a'):
        avg_pool = tf.layers.AveragePooling2D((5,5),(3,3))(block_4a)
        conv = tf.layers.Conv2D(128,(1,1),kernel_initializer=he_init,
                                activation=tf.nn.relu,name='1x1')(avg_pool)
        fc = tf.layers.Flatten()(conv)
        fc = tf.layers.Dense(1024,kernel_initializer=he_init,
                             activation=tf.nn.relu)(fc)
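        # Without training=is_train the layer defaults to training=False and never drops units.
        fc = tf.layers.Dropout(0.7)(fc, training=is_train)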
        aux_logit_4a = tf.layers.Dense(1000,
                                       kernel_initializer=xavier_init)(fc)
    
    with tf.variable_scope('auxiliary_network_4d'):
        avg_pool = tf.layers.AveragePooling2D((5,5),(3,3))(block_4d)
        conv = tf.layers.Conv2D(128,(1,1),kernel_initializer=he_init,
                                activation=tf.nn.relu,name='1x1')(avg_pool)
        fc = tf.layers.Flatten()(conv)
        fc = tf.layers.Dense(1024,kernel_initializer=he_init,
                             activation=tf.nn.relu)(fc)
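        # Without training=is_train the layer defaults to training=False and never drops units.
        fc = tf.layers.Dropout(0.7)(fc, training=is_train)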
        aux_logit_4d = tf.layers.Dense(1000,
                                       kernel_initializer=xavier_init)(fc)
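
The size of the tensor that reaches the first dense layer of each auxiliary network can be worked out by hand. The sketch below assumes a 14x14 input (the spatial size of `inception_4a` and `inception_4d` in this notebook) and the default `padding='valid'` of `tf.layers.AveragePooling2D`; `valid_out` is a hypothetical helper name.

```python
def valid_out(size, window, stride):
    """Spatial output size of a pooling layer with padding='VALID'."""
    return (size - window) // stride + 1

side = valid_out(14, window=5, stride=3)   # 5x5 average pool, stride 3
flat_units = side * side * 128             # after the 128-filter 1x1 convolution
print(side, flat_units)  # 4 2048
```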
        
        

### (2) Building the Loss Function

* GoogLeNet trains the model using the loss of the main classifier together with the losses of the auxiliary networks.<br>

![Imgur](https://i.imgur.com/4FGtLta.png)

In [None]:
with graph.as_default():
    labels = tf.placeholder(tf.int64,shape=(None,),name='labels')
        
    with tf.variable_scope('losses'):
        main_loss = tf.losses.sparse_softmax_cross_entropy(labels,logits)
        aux_4a_loss = tf.losses.sparse_softmax_cross_entropy(labels, aux_logit_4a)
        aux_4d_loss = tf.losses.sparse_softmax_cross_entropy(labels, aux_logit_4d)
        loss = main_loss + 0.3 * aux_4a_loss + 0.3 * aux_4d_loss
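
The weighted sum can be reproduced in NumPy to see how much each head contributes. This is a minimal sketch under stated assumptions: the all-zero logits are hypothetical (every head then predicts a uniform distribution over 1000 classes), and the NumPy function below is a stand-in for the TensorFlow loss, averaging over the batch.

```python
import numpy as np

def sparse_softmax_cross_entropy(labels, logits):
    """Mean cross-entropy for integer labels, computed from raw logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

labels = np.array([3, 7])
# Hypothetical logits: all zeros, so each head outputs a uniform distribution.
main_logits = np.zeros((2, 1000))
aux_4a_logits = np.zeros((2, 1000))
aux_4d_logits = np.zeros((2, 1000))

total = (sparse_softmax_cross_entropy(labels, main_logits)
         + 0.3 * sparse_softmax_cross_entropy(labels, aux_4a_logits)
         + 0.3 * sparse_softmax_cross_entropy(labels, aux_4d_logits))
print(np.isclose(total, 1.6 * np.log(1000)))  # True
```

With identical predictions each head has loss log(1000), so the total is 1 + 0.3 + 0.3 = 1.6 times that value.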

### (3) Building the Optimizer

* The GoogLeNet paper describes its training methodology only briefly.<br>
* How the model was trained was not the central issue of the paper, and the actual training was done<br>
in a distributed environment rather than in a typical setup.


![Imgur](https://i.imgur.com/4ntTbiz.png)

In [None]:
with graph.as_default():
    lr = tf.placeholder_with_default(1e-2,(),name='learning_rate')
    
    train_op = (tf.train
                .MomentumOptimizer(learning_rate=lr,momentum=0.9)
                .minimize(loss))
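
The paper mentions a fixed schedule that lowers the learning rate by 4% every 8 epochs. A possible way to compute such a value for the `lr` placeholder is sketched below. The function name `learning_rate_at` is hypothetical, and the base rate of 1e-2 is this notebook's placeholder default, not a value stated in the paper.

```python
def learning_rate_at(epoch, base_lr=1e-2, decay=0.96, step=8):
    """Cut the learning rate by 4% once every `step` epochs."""
    return base_lr * decay ** (epoch // step)

# Epochs 0-7 share the base rate; the first cut happens at epoch 8.
for epoch in (0, 7, 8, 16):
    print(epoch, learning_rate_at(epoch))
```

The returned value could then be passed through `feed_dict` for the `lr` placeholder at the start of each epoch.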

In [None]:
show_graph(graph)

#  

---

    Copyright(c) 2019 by Public AI. All rights reserved.<br>
    Written by PAI, SangJae Kang ( rocketgrowthsj@publicai.co.kr )  last updated on 2019/05/13

---