## Implementation of the Inception V1 Network (GoogLeNet)
- author: jiho Ahn
- date: 2021.07.29
- topic: GoogLeNet

- reference link
  - https://oi.readthedocs.io/en/latest/computer_vision/cnn/googlenet.html
  - https://static.googleusercontent.com/media/research.google.com/ko//pubs/archive/43022.pdf
  - https://medium.com/@rockyxu399/paper-review-and-model-architecture-for-cnn-classification-94972e40d96a
  - https://www.tensorflow.org/guide?hl=ko
  - https://keras.io/guides/
  - https://www.youtube.com/watch?v=C86ZXvgpejM

### Structure of GoogLeNet (Inception V1 Network)

<img alt="GoogLeNet" src='https://drive.google.com/uc?export=view&id=1GbLhTFWDCC1bmmvVoXeg6GuNoqIrY3ve'>

### Overall Structure of GoogLeNet (Table)

<img alt="GoogLeNet(Table)" src='https://drive.google.com/uc?export=view&id=1Wf2ZZ5-3c4j_m03uyEV27tuEG_eal9SH'>

##### Paper source: https://static.googleusercontent.com/media/research.google.com/ko//pubs/archive/43022.pdf

### Inception layer(3a)

<img alt="Detail Information for Inception Layer" src='https://drive.google.com/uc?export=view&id=1EWDUy9CzFeq98kAnjI0Q2Q4WQ5qJVgWM'>

#### Source: https://medium.com/@rockyxu399/paper-review-and-model-architecture-for-cnn-classification-94972e40d96a
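
As a rough illustration of the figure above: the four branches of an Inception module are concatenated along the channel axis, so the output depth is simply the sum of the branch widths. The sketch below plugs in the (3a) and (3b) filter counts from the paper's table. The helper name `inception_output_channels` is hypothetical and not part of the notebook.

```python
def inception_output_channels(filters_1x1, filters_3x3, filters_5x5, filters_pool_proj):
    """Output depth of an Inception module: branch outputs are concatenated channel-wise."""
    return filters_1x1 + filters_3x3 + filters_5x5 + filters_pool_proj


# Inception (3a): 64 (1x1) + 128 (3x3) + 32 (5x5) + 32 (pool projection)
channels_3a = inception_output_channels(64, 128, 32, 32)
# Inception (3b): 128 + 192 + 96 + 64
channels_3b = inception_output_channels(128, 192, 96, 64)

print(channels_3a)  # 256
print(channels_3b)  # 480
```

The reduce layers (the 1x1 convolutions in front of the 3x3 and 5x5 branches) do not appear in the sum because only the final output of each branch is concatenated.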

### Implementation

#### Library Import


In [None]:
import os

import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Dense, Input, concatenate, GlobalAveragePooling2D, AveragePooling2D, Flatten
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import LearningRateScheduler

import tensorflow_datasets as tfds

import cv2
import numpy as np
import math

import matplotlib

matplotlib.use('Agg')
# Loading dataset and Performing some preprocessing steps.

In [None]:
# number of classes for multi-class classifier
num_class = 10

#### Inception V1 Architecture

In [None]:
class Inception_v1(tf.keras.layers.Layer):
    def __init__(self, filters_1x1, filters_3x3_reduce, filters_3x3, filters_5x5_reduce, filters_5x5,
                 filters_pool_proj, name=None, kernel_init='glorot_uniform', bias_init='zeros'):
        # pass `name` to the base Layer so names such as 'inception_v1_3a' are actually applied
        super(Inception_v1, self).__init__(name=name)

        self.conv_1x1 = Conv2D(filters_1x1, 1, 1, 'same', activation='relu', kernel_initializer=kernel_init, bias_initializer=bias_init)
        self.conv_3x3_reduce = Conv2D(filters_3x3_reduce, 1, 1, 'same', activation='relu', kernel_initializer=kernel_init, bias_initializer=bias_init)
        self.conv_5x5_reduce = Conv2D(filters_5x5_reduce, 1, 1, 'same', activation='relu', kernel_initializer=kernel_init, bias_initializer=bias_init)
        self.maxpool_3x3 = MaxPool2D(3, strides=1, padding='same')
        self.conv_3x3 = Conv2D(filters_3x3, 3, 1, 'same', activation='relu', kernel_initializer=kernel_init, bias_initializer=bias_init)
        self.conv_5x5 = Conv2D(filters_5x5, 5, 1, 'same', activation='relu', kernel_initializer=kernel_init, bias_initializer=bias_init)
        self.pool_proj = Conv2D(filters_pool_proj, 1, 1, 'same', activation='relu', kernel_initializer=kernel_init, bias_initializer=bias_init)


    def call(self, x):
        result_conv_1x1 = self.conv_1x1(x)

        result_conv_3x3_reduce = self.conv_3x3_reduce(x)
        result_conv_5x5_reduce = self.conv_5x5_reduce(x)
        result_maxpool_3x3 = self.maxpool_3x3(x)

        result_conv_3x3 = self.conv_3x3(result_conv_3x3_reduce)
        result_conv_5x5 = self.conv_5x5(result_conv_5x5_reduce)
        result_pool_proj = self.pool_proj(result_maxpool_3x3)

        output = concatenate([result_conv_1x1, result_conv_3x3, result_conv_5x5, result_pool_proj], axis=-1)

        return output


    def get_config(self):
        config = super().get_config().copy()
        config.update({
            'conv_1x1': self.conv_1x1.get_config(),
            'conv_3x3_reduce': self.conv_3x3_reduce.get_config(),
            'conv_3x3': self.conv_3x3.get_config(),
            'conv_5x5_reduce': self.conv_5x5_reduce.get_config(),
            'conv_5x5': self.conv_5x5.get_config(),
            'conv_maxpool_3x3': self.maxpool_3x3.get_config(),
            'conv_pool_proj': self.pool_proj.get_config()
        })

        return config
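
To make the role of the `*_reduce` layers more concrete, the sketch below compares the parameter count of the 3x3 and 5x5 branches with and without the preceding 1x1 reduction. It assumes an input depth of 192 channels, which is what the paper's table lists for Inception (3a); the helper `conv_params` is hypothetical and only counts weights and biases.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch


in_ch = 192  # assumed depth entering Inception (3a), per the paper's table

# 3x3 branch: direct 3x3 (128 filters) vs. 1x1 reduce (96) followed by 3x3 (128)
direct_3x3 = conv_params(3, in_ch, 128)
reduced_3x3 = conv_params(1, in_ch, 96) + conv_params(3, 96, 128)

# 5x5 branch: direct 5x5 (32 filters) vs. 1x1 reduce (16) followed by 5x5 (32)
direct_5x5 = conv_params(5, in_ch, 32)
reduced_5x5 = conv_params(1, in_ch, 16) + conv_params(5, 16, 32)

print(direct_3x3, reduced_3x3)  # 221312 129248
print(direct_5x5, reduced_5x5)  # 153632 15920
```

Under these assumptions the reduction cuts the 5x5 branch to roughly a tenth of its direct cost, which is the main motivation for the 1x1 layers.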
       

#### Auxiliary classifier
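- As the network gets deeper, the vanishing gradient problem can arise during backpropagation.
- As a result, the weights of the layers closest to the input image may not be trained properly.
- To prevent this, additional classifier branches are inserted partway through the network; each computes a loss that is backpropagated from that point.

- Reference paper: https://arxiv.org/abs/1505.02496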
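
The idea can be sketched numerically: the training objective becomes a weighted sum of the main loss and the two auxiliary losses. The weights `[1, 0.3, 0.3]` are the ones passed to `model.compile` in the main section of this notebook; the three-class probabilities below are made-up toy values, and `categorical_crossentropy` here is a hand-written stand-in for the Keras loss, not the library function.

```python
import math


def categorical_crossentropy(one_hot_target, predicted_probs):
    """Cross-entropy for a single sample: -sum(t * log(p))."""
    eps = 1e-7  # guard against log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, predicted_probs))


target = [0.0, 1.0, 0.0]      # toy 3-class one-hot label
main_probs = [0.1, 0.8, 0.1]  # hypothetical softmax outputs
aux1_probs = [0.3, 0.4, 0.3]
aux2_probs = [0.2, 0.6, 0.2]

losses = [categorical_crossentropy(target, p) for p in (main_probs, aux1_probs, aux2_probs)]
loss_weights = [1.0, 0.3, 0.3]

# total loss = main + 0.3 * aux1 + 0.3 * aux2
total_loss = sum(w * l for w, l in zip(loss_weights, losses))
print(f"total loss: {total_loss:.4f}")
```

Because the auxiliary losses are attached to intermediate layers, their gradients reach the early layers through a shorter path than the gradient of the main loss.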

In [None]:
class AuxiliaryClassifier(tf.keras.layers.Layer):
    def __init__(self):
        super(AuxiliaryClassifier, self).__init__()

        self.model = Sequential()
        self.model.add(AveragePooling2D((5, 5), strides=3, name='avg_pool_aux'))
        self.model.add(Conv2D(128, 1, padding='same', activation='relu', name='conv_aux'))
        self.model.add(Flatten())
        self.model.add(Dense(1024, activation='relu', name='dense_aux'))
        self.model.add(Dropout(rate=0.7))
        self.model.add(Dense(10, activation='softmax', name='output_aux'))


    def call(self, x):
        x = self.model(x)
        return x

    
    def get_config(self):
        config = super().get_config().copy()
        for layer in self.model.layers:
            config.update({layer.name: layer.get_config()})

        return config
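
For orientation, the shapes inside the auxiliary branch can be traced by hand. The sketch below assumes the branch receives a 14x14x512 feature map, which is what Inception (4a) should produce for a 224x224 input, and that `AveragePooling2D` uses its default `'valid'` padding. The helper name `valid_pool_output` is hypothetical.

```python
def valid_pool_output(size, pool, stride):
    """Spatial size after pooling with 'valid' padding (no padding added)."""
    return (size - pool) // stride + 1


spatial_in, channels_in = 14, 512  # assumed output of Inception (4a)

# AveragePooling2D((5, 5), strides=3): 14x14 -> 4x4
spatial = valid_pool_output(spatial_in, pool=5, stride=3)

# Conv2D(128, 1) keeps the spatial size and sets the depth to 128; Flatten then yields 4*4*128
flattened = spatial * spatial * 128

# Dense(1024): weights plus biases
dense_params = flattened * 1024 + 1024

print(spatial)       # 4
print(flattened)     # 2048
print(dense_params)  # 2098176
```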

#### GoogLeNet Architecture

In [None]:
class GoogLeNet(Model):
    def __init__(self):      
        super(GoogLeNet, self).__init__()

        kernel_init = tf.keras.initializers.glorot_uniform()
        bias_init = tf.keras.initializers.Constant(value=0.2)

        self.sub_googleNet1 = Sequential()
        self.sub_googleNet1.add(Conv2D(64, 7, 2, 'same', activation='relu', name='conv1_7x7/2', kernel_initializer=kernel_init, bias_initializer=bias_init))
        self.sub_googleNet1.add(MaxPool2D(3, 2, 'same', name='maxpool1_3x3/2'))
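        # the paper's table lists a 1x1 reduction (64 filters) followed by a 3x3 convolution with 192 filters
        self.sub_googleNet1.add(Conv2D(64, 1, 1, 'same', activation='relu', name='conv2_3x3_reduce/1', kernel_initializer=kernel_init, bias_initializer=bias_init))
        self.sub_googleNet1.add(Conv2D(192, 3, 1, 'same', activation='relu', name='conv2_3x3/1', kernel_initializer=kernel_init, bias_initializer=bias_init))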
        self.sub_googleNet1.add(MaxPool2D(3, 2, 'same', name='maxpool2_3x3/2'))
        self.sub_googleNet1.add(Inception_v1(64, 96, 128, 16, 32, 32, 'inception_v1_3a', kernel_init, bias_init))
        self.sub_googleNet1.add(Inception_v1(128, 128, 192, 32, 96, 64, 'inception_v1_3b', kernel_init, bias_init))
        self.sub_googleNet1.add(MaxPool2D(3, 2, 'same', name='maxpool3_3x3/2'))
        self.sub_googleNet1.add(Inception_v1(192, 96, 208, 16, 48, 64, 'inception_v1_4a', kernel_init, bias_init))

        self.sub_googleNet2 = Sequential()
        self.sub_googleNet2.add(Inception_v1(160, 112, 224, 24, 64, 64, 'inception_v1_4b', kernel_init, bias_init))
        self.sub_googleNet2.add(Inception_v1(128, 128, 256, 24, 64, 64, 'inception_v1_4c', kernel_init, bias_init))
        self.sub_googleNet2.add(Inception_v1(112, 144, 288, 32, 64, 64, 'inception_v1_4d', kernel_init, bias_init))

        self.sub_googleNet3 = Sequential()
        self.sub_googleNet3.add(Inception_v1(256, 160, 320, 32, 128, 128, 'inception_v1_4e', kernel_init, bias_init))
        self.sub_googleNet3.add(MaxPool2D(3, 2, 'same', name='maxpool4_3x3/2'))
        self.sub_googleNet3.add(Inception_v1(256, 160, 320, 32, 128, 128, 'inception_v1_5a', kernel_init, bias_init))
        self.sub_googleNet3.add(Inception_v1(384, 192, 384, 48, 128, 128, 'inception_v1_5b', kernel_init, bias_init))
        self.sub_googleNet3.add(GlobalAveragePooling2D(name='global_avg_pool_7x7/1'))
        self.sub_googleNet3.add(Dropout(rate=0.4))
        self.sub_googleNet3.add(Dense(units=10, activation='softmax', name='output', kernel_initializer=kernel_init, bias_initializer=bias_init))
        
        self.aux1 = AuxiliaryClassifier()
        self.aux2 = AuxiliaryClassifier()

    
    def call(self, x):
        x1 = self.sub_googleNet1(x)

        aux1 = self.aux1(x1)
        x2 = self.sub_googleNet2(x1)

        aux2 = self.aux2(x2)
        x3 = self.sub_googleNet3(x2)

        return x3, aux1, aux2
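
A quick way to sanity-check the architecture above is to trace the spatial size of the feature map through the network. With `'same'` padding, each layer maps a size `n` to `ceil(n / stride)`. The sketch below applies that rule to the strides used in the class, with the stride-1 Inception modules grouped per stage; the helper name and the stage labels are only illustrative.

```python
import math


def same_padding_output(size, stride):
    """Spatial size after a conv/pool layer with 'same' padding."""
    return math.ceil(size / stride)


# (stage label, stride) in the order used by GoogLeNet above
stages = [
    ("conv1_7x7/2", 2),
    ("maxpool1_3x3/2", 2),
    ("conv2_3x3/1", 1),
    ("maxpool2_3x3/2", 2),
    ("inception 3a-3b", 1),
    ("maxpool3_3x3/2", 2),
    ("inception 4a-4e", 1),
    ("maxpool4_3x3/2", 2),
    ("inception 5a-5b", 1),
]

size = 224  # input resolution used in this notebook
trace = []
for name, stride in stages:
    size = same_padding_output(size, stride)
    trace.append((name, size))
    print(f"{name:18s} -> {size}x{size}")
```

The final 7x7 map matches the `global_avg_pool_7x7/1` layer name, and the 14x14 size in stage 4 is what the two auxiliary classifiers receive.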

#### Learning Rate Decay 

In [None]:
lr_decay_for_epoch = LearningRateScheduler(lambda epoch: 0.01 * math.pow(0.98, math.floor((1+epoch) / 8)), verbose=1)
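
The schedule is easier to read when written out as a plain function. The sketch below reproduces the lambda's formula without Keras; the name `step_decay` and its keyword arguments are hypothetical. Note that Keras passes a 0-based epoch index, so the first drop happens at index 7, which the training log reports as epoch 8. The paper reports a 4% decrease every 8 epochs, whereas the factor of 0.98 used here gives a 2% decrease per step.

```python
import math


def step_decay(epoch, initial_lr=0.01, decay=0.98, step=8):
    """Same formula as the lambda above; `epoch` is the 0-based index Keras passes in."""
    return initial_lr * math.pow(decay, math.floor((1 + epoch) / step))


for epoch in (0, 6, 7, 14, 15, 24):
    print(f"epoch index {epoch:2d} -> lr {step_decay(epoch):.6f}")
```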

#### Load Cifar10 dataset

In [None]:
def load_cifar10_dataset():
    # train : image(50000, 32, 32, 3) / label(50000, 1) , test : image(10000, 32, 32, 3) / label(10000, 1)
    train_image, train_label = tfds.as_numpy(tfds.load(name='cifar10', split='train', 
                                                           as_supervised=True, shuffle_files=True, batch_size=-1))
    test_image, test_label = tfds.as_numpy(tfds.load(name='cifar10', split='test', 
                                                           as_supervised=True, shuffle_files=True, batch_size=-1))
    
    # keep only the first 2500 training samples and 500 test samples to speed up training
    train_image, train_label = train_image[0:2500, :, :, :], train_label[:2500]
    test_image, test_label = test_image[0:500, :, :, :], test_label[:500]

    # process images(resize, normalize)
    train_image = processing(train_image)
    test_image = processing(test_image)

    # transform targets to keras compatible format(One-hot encoding)
    train_label = to_categorical(train_label, num_classes=10)
    test_label = to_categorical(test_label, num_classes=10)
    
    return (train_image, train_label), (test_image, test_label)
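
For readers unfamiliar with one-hot encoding, the sketch below shows in plain Python what the `to_categorical` step produces for integer class labels. The function `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Turn integer class ids into rows with a single 1.0 at the class index."""
    return [[1.0 if cls == label else 0.0 for cls in range(num_classes)]
            for label in labels]


encoded = one_hot([3, 0, 9], num_classes=10)
for row in encoded:
    print(row)
```

Each row has length 10 and sums to 1, which is the target format expected by the `categorical_crossentropy` loss.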

#### Image processing


In [None]:
def processing(images):  
    # resize images
    images = np.array([cv2.resize(img, (224, 224)) for img in images[:, :, :, :]])  
    # type casting
    images = images.astype('float32') / 255.
    
    return images
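
Resizing to 224x224 and casting to `float32` makes the arrays much larger than the original 32x32 `uint8` images. The notebook gives training speed as the reason for using a subset; the back-of-the-envelope estimate below suggests memory may be a further constraint, though that is an inference rather than something the notebook states. The helper name `float32_bytes` is hypothetical.

```python
def float32_bytes(num_images, height, width, channels):
    """Size in bytes of a float32 image array (4 bytes per value)."""
    return num_images * height * width * channels * 4


subset = float32_bytes(2500, 224, 224, 3)   # the subset used in this notebook
full = float32_bytes(50000, 224, 224, 3)    # the full CIFAR-10 training split

print(f"subset: {subset / 2**30:.2f} GiB")  # subset: 1.40 GiB
print(f"full:   {full / 2**30:.2f} GiB")    # full:   28.04 GiB
```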

### Main Section


In [None]:
def main():
    # load images and labels
    (train_image, train_label), (test_image, test_label) = load_cifar10_dataset()

    # input layer
    input_layer = Input(shape=(224, 224, 3))

    # define model
    googlenet = GoogLeNet()
    final_output, aux1_output, aux2_output = googlenet(input_layer)
    model = Model(input_layer, [final_output, aux1_output, aux2_output], name='googLeNet')

    # set params
    epoch = 25
    initial_lrate = 0.01

    # define optimizer and learning rate scheduler
    sgd = SGD(learning_rate=initial_lrate, momentum=0.9, nesterov=False)
    lr_sc = lr_decay_for_epoch

    # compile GoogLeNet model
    model.compile(loss=['categorical_crossentropy', 'categorical_crossentropy', 'categorical_crossentropy'],
                  loss_weights=[1, 0.3, 0.3], optimizer=sgd, metrics=['accuracy'])

    # fit and validate model
    history = model.fit(train_image, [train_label, train_label, train_label], validation_data=(test_image, [test_label, test_label, test_label]),
                        epochs=epoch, batch_size=32, callbacks=[lr_sc])
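
As a small worked example of the training settings above, the number of gradient updates follows directly from the subset size, batch size, and epoch count. The variable names below are illustrative only.

```python
import math

num_train = 2500   # training samples kept by load_cifar10_dataset
batch_size = 32
epochs = 25

# the last batch of each epoch is smaller (2500 is not a multiple of 32)
steps_per_epoch = math.ceil(num_train / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)  # 79
print(total_steps)      # 1975
```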

In [None]:
if __name__ == '__main__':
    main()

Downloading and preparing dataset cifar10/3.0.2 (download: 162.17 MiB, generated: 132.40 MiB, total: 294.58 MiB) to /root/tensorflow_datasets/cifar10/3.0.2...


Shuffling and writing examples to /root/tensorflow_datasets/cifar10/3.0.2.incomplete2RGGBP/cifar10-train.tfrecord

Shuffling and writing examples to /root/tensorflow_datasets/cifar10/3.0.2.incomplete2RGGBP/cifar10-test.tfrecord

Dataset cifar10 downloaded and prepared to /root/tensorflow_datasets/cifar10/3.0.2. Subsequent calls will reuse this data.


  "The `lr` argument is deprecated, use `learning_rate` instead.")


Epoch 1/25

Epoch 00001: LearningRateScheduler reducing learning rate to 0.01.
Epoch 2/25

Epoch 00002: LearningRateScheduler reducing learning rate to 0.01.
Epoch 3/25

Epoch 00003: LearningRateScheduler reducing learning rate to 0.01.
Epoch 4/25

Epoch 00004: LearningRateScheduler reducing learning rate to 0.01.
Epoch 5/25

Epoch 00005: LearningRateScheduler reducing learning rate to 0.01.
Epoch 6/25

Epoch 00006: LearningRateScheduler reducing learning rate to 0.01.
Epoch 7/25

Epoch 00007: LearningRateScheduler reducing learning rate to 0.01.
Epoch 8/25

Epoch 00008: LearningRateScheduler reducing learning rate to 0.0098.
Epoch 9/25

Epoch 00009: LearningRateScheduler reducing learning rate to 0.0098.
Epoch 10/25

Epoch 00010: LearningRateScheduler reducing learning rate to 0.0098.
Epoch 11/25

Epoch 00011: LearningRateScheduler reducing learning rate to 0.0098.
Epoch 12/25

Epoch 00012: LearningRateScheduler reducing learning rate to 0.0098.
Epoch 13/25

Epoch 00013: LearningRateS