## Implementing GoogLeNet's Inception Module

This notebook covers the Inception module from Going Deeper with Convolutions (2015). The paper introduced a new neural network architecture built around the Inception module. As the title suggests, the goal was to go deeper: Inception modules with dimension reduction made a deeper network feasible, and they rely heavily on 1x1 convolutions to do so.

Let's look at what role the 1x1 convolution plays.

- Paper: [**(link)**](https://arxiv.org/pdf/1409.4842v1.pdf)
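
To make that role concrete: a 1x1 convolution is a linear map over the channel axis, applied independently at every pixel, so it can shrink the channel count without changing the spatial size. The sketch below checks this with PyTorch, assuming a 192-channel 28x28 input reduced to 16 channels (sizes that also appear later in this notebook). The variable names are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes: 192 input channels reduced to 16 on a 28x28 feature map.
x = torch.randn(1, 192, 28, 28)
reduce_conv = nn.Conv2d(in_channels=192, out_channels=16, kernel_size=1)

with torch.no_grad():
    y_conv = reduce_conv(x)

    # The same computation written as a per-pixel matrix multiply over channels.
    w = reduce_conv.weight.view(16, 192)  # (out_channels, in_channels)
    y_linear = torch.einsum('oc,nchw->nohw', w, x) + reduce_conv.bias.view(1, 16, 1, 1)

print(y_conv.shape)                                 # channels shrink, H and W are unchanged
print(torch.allclose(y_conv, y_linear, atol=1e-4))  # both computations agree
```

Because the map only mixes channels, placing it in front of a 3x3 or 5x5 convolution reduces the number of input channels that the larger kernel has to read.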

In [1]:
import torch
import torch.nn as nn
import torchsummary

> GoogLeNet Inception Module naive version (Version 1)

![](https://miro.medium.com/max/720/1*wverTCLSTVNDpyRhlGVwyw.webp)

In [2]:
class BaseConv2D(nn.Module):
    """Conv2d followed by ReLU; extra kwargs are passed through to nn.Conv2d."""

    def __init__(self, in_channels, out_channels, **kwargs):
        super(BaseConv2D, self).__init__()
        self.conv = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, **kwargs)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.conv(x))


class InceptionModuleV1(nn.Module):
    """Naive Inception module: every branch reads the full input directly."""

    def __init__(self, in_channels, out_1x1, out_3x3, out_5x5, pool):
        super(InceptionModuleV1, self).__init__()
        self.conv1x1 = BaseConv2D(in_channels, out_1x1, kernel_size=1)
        # padding='same' keeps the spatial size so the branches can be concatenated.
        self.conv3x3 = BaseConv2D(in_channels, out_3x3, kernel_size=3, padding='same')
        self.conv5x5 = BaseConv2D(in_channels, out_5x5, kernel_size=5, padding='same')
        # Note: the naive module in the paper concatenates the pooled output directly;
        # here a 1x1 projection follows the pooling, as in Version 2.
        self.pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            BaseConv2D(in_channels, pool, kernel_size=1, padding='same')
        )

    def forward(self, x):
        x1 = self.conv1x1(x)
        x2 = self.conv3x3(x)
        x3 = self.conv5x5(x)
        x4 = self.pool(x)
        # Concatenate the four branches along the channel dimension.
        return torch.cat([x1, x2, x3, x4], 1)

In [3]:
inception_module_V1 = InceptionModuleV1(192, 64, 128, 32, 32)

In [6]:
torchsummary.summary(inception_module_V1, input_size=(192, 28, 28), device='cpu')

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 28, 28]          12,352
              ReLU-2           [-1, 64, 28, 28]               0
        BaseConv2D-3           [-1, 64, 28, 28]               0
            Conv2d-4          [-1, 128, 28, 28]         221,312
              ReLU-5          [-1, 128, 28, 28]               0
        BaseConv2D-6          [-1, 128, 28, 28]               0
            Conv2d-7           [-1, 32, 28, 28]         153,632
              ReLU-8           [-1, 32, 28, 28]               0
        BaseConv2D-9           [-1, 32, 28, 28]               0
        MaxPool2d-10          [-1, 192, 28, 28]               0
           Conv2d-11           [-1, 32, 28, 28]           6,176
             ReLU-12           [-1, 32, 28, 28]               0
       BaseConv2D-13           [-1, 32, 28, 28]               0
Total params: 393,472
Trainable params:
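
The `Param #` column can be reproduced by hand: a convolution with a k x k kernel has (k * k * C_in + 1) * C_out parameters, where the +1 is the bias of each output channel. A small sketch in plain Python, with `conv_params` being a helper name introduced here for illustration:

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel of a square-kernel Conv2d."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

# Branches of InceptionModuleV1(192, 64, 128, 32, 32)
v1_branches = {
    "1x1": conv_params(192, 64, 1),
    "3x3": conv_params(192, 128, 3),
    "5x5": conv_params(192, 32, 5),
    "pool projection": conv_params(192, 32, 1),
}
for name, count in v1_branches.items():
    print(f"{name:>16}: {count:,}")

v1_total = sum(v1_branches.values())
print(f"{'total':>16}: {v1_total:,}")
```

The 3x3 and 5x5 branches account for roughly 95% of the total, because each of them reads all 192 input channels with a large kernel.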

> Inception Module with dimension reductions (Version 2)

![](https://miro.medium.com/max/720/1*SdbkFi2JB-Tjri7LVOMkWA.webp)

Source: https://valentinaalto.medium.com/understanding-the-inception-module-in-googlenet-2e1b7c406106

In [4]:
class InceptionModuleV2(nn.Module):
    """Inception module with 1x1 dimension reduction before the 3x3 and 5x5 convolutions."""

    def __init__(self, in_channels, out_1x1, out_3x3_reduce, out_3x3, out_5x5_reduce, out_5x5, pool):
        super(InceptionModuleV2, self).__init__()
        self.conv1x1 = BaseConv2D(in_channels, out_1x1, kernel_size=1)

        # 1x1 conv reduces the channels first, then the 3x3 conv reads the reduced tensor.
        self.conv3x3 = nn.Sequential(
            BaseConv2D(in_channels, out_3x3_reduce, kernel_size=1),
            BaseConv2D(out_3x3_reduce, out_3x3, kernel_size=3, padding='same')
        )
        # Same pattern for the 5x5 branch.
        self.conv5x5 = nn.Sequential(
            BaseConv2D(in_channels, out_5x5_reduce, kernel_size=1),
            BaseConv2D(out_5x5_reduce, out_5x5, kernel_size=5, padding='same')
        )

        # Max pooling keeps all input channels, so a 1x1 conv projects them down afterwards.
        self.pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            BaseConv2D(in_channels, pool, kernel_size=1, padding='same')
        )

    def forward(self, x):
        x1 = self.conv1x1(x)
        x2 = self.conv3x3(x)
        x3 = self.conv5x5(x)
        x4 = self.pool(x)
        # Concatenate the four branches along the channel dimension.
        return torch.cat([x1, x2, x3, x4], 1)

In [5]:
inception_module_V2 = InceptionModuleV2(192, 64, 96, 128, 16, 32, 32)

In [7]:
torchsummary.summary(inception_module_V2, input_size=(192, 28, 28), device='cpu')

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 28, 28]          12,352
              ReLU-2           [-1, 64, 28, 28]               0
        BaseConv2D-3           [-1, 64, 28, 28]               0
            Conv2d-4           [-1, 96, 28, 28]          18,528
              ReLU-5           [-1, 96, 28, 28]               0
        BaseConv2D-6           [-1, 96, 28, 28]               0
            Conv2d-7          [-1, 128, 28, 28]         110,720
              ReLU-8          [-1, 128, 28, 28]               0
        BaseConv2D-9          [-1, 128, 28, 28]               0
           Conv2d-10           [-1, 16, 28, 28]           3,088
             ReLU-11           [-1, 16, 28, 28]               0
       BaseConv2D-12           [-1, 16, 28, 28]               0
           Conv2d-13           [-1, 32, 28, 28]          12,832
             ReLU-14           [-1, 32,
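
The saving from dimension reduction can be checked with simple arithmetic. In the sketch below (plain Python; `conv_params` is an illustrative helper, and the totals are computed from the layer shapes rather than copied from the summary output), the 3x3 and 5x5 convolutions read 96 and 16 channels instead of 192:

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel of a square-kernel Conv2d."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

# Layers of InceptionModuleV2(192, 64, 96, 128, 16, 32, 32)
v2_layers = [
    ("1x1", conv_params(192, 64, 1)),
    ("3x3 reduce (1x1)", conv_params(192, 96, 1)),
    ("3x3", conv_params(96, 128, 3)),
    ("5x5 reduce (1x1)", conv_params(192, 16, 1)),
    ("5x5", conv_params(16, 32, 5)),
    ("pool projection", conv_params(192, 32, 1)),
]
v2_total = sum(count for _, count in v2_layers)
v1_total = 393_472  # "Total params" reported for InceptionModuleV1 above

print(f"V2 total: {v2_total:,}")
print(f"V2 / V1 : {v2_total / v1_total:.2f}")
```

Under these assumptions the module with dimension reduction needs well under half the parameters of the first version, even though it contains two extra 1x1 convolutions.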

In [8]:
dummy_input = torch.randn(size=(1, 192, 28, 28))

In [9]:
y1 = inception_module_V1(dummy_input)
y2 = inception_module_V2(dummy_input)

In [10]:
y1.shape, y2.shape

(torch.Size([1, 256, 28, 28]), torch.Size([1, 256, 28, 28]))
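
Both modules output 256 channels because concatenating along dimension 1 adds the branch widths (64 + 128 + 32 + 32), while the 28x28 spatial size is preserved. The compute cost differs, though. As a rough sketch in plain Python, the code below counts only the multiply-accumulates (MACs) of the convolutions on a 28x28 map and ignores bias, ReLU, and pooling; `conv_macs` is an illustrative helper, not part of the notebook:

```python
def conv_macs(in_channels, out_channels, kernel_size, height=28, width=28):
    """Multiply-accumulates of a stride-1 convolution that preserves the spatial size."""
    return kernel_size * kernel_size * in_channels * out_channels * height * width

# InceptionModuleV1(192, 64, 128, 32, 32)
v1_macs = (
    conv_macs(192, 64, 1)
    + conv_macs(192, 128, 3)
    + conv_macs(192, 32, 5)
    + conv_macs(192, 32, 1)
)

# InceptionModuleV2(192, 64, 96, 128, 16, 32, 32)
v2_macs = (
    conv_macs(192, 64, 1)
    + conv_macs(192, 96, 1) + conv_macs(96, 128, 3)
    + conv_macs(192, 16, 1) + conv_macs(16, 32, 5)
    + conv_macs(192, 32, 1)
)

out_channels = 64 + 128 + 32 + 32  # channel count after torch.cat along dim 1

print(out_channels)
print(f"V1: {v1_macs / 1e6:.1f}M MACs, V2: {v2_macs / 1e6:.1f}M MACs")
```

So the two versions have the same output shape, but by this count the version with dimension reduction does considerably less work per forward pass.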