This post is based on the edwith Boostcourse course "Deep Learning Basics with PyTorch" (파이토치로 시작하는 딥러닝 기초).  
https://www.boostcourse.org/ai214

# VGG Net Implementation  
- Implements the same VGG Net as `vgg.py` in torchvision's models (see the PyTorch docs)
- Applied to CIFAR10


In [1]:
import torch.nn as nn
import torch.utils.model_zoo as model_zoo

In [2]:

__all__ = [
    'VGG', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn',
    'vgg19_bn', 'vgg19',
]


model_urls = {
    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth',
    'vgg11_bn': 'https://download.pytorch.org/models/vgg11_bn-6002323d.pth',
    'vgg13_bn': 'https://download.pytorch.org/models/vgg13_bn-abd245e5.pth',
    'vgg16_bn': 'https://download.pytorch.org/models/vgg16_bn-6c64b313.pth',
    'vgg19_bn': 'https://download.pytorch.org/models/vgg19_bn-c79401a0.pth',
}

In [4]:
class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weight=True):
        super(VGG, self).__init__()
        self.features = features  # Convolution
        
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        
        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, num_classes)
        )  # FC layer
        
        if init_weight:
            self._initialize_weights()
            
    def forward(self, x):
        x = self.features(x)  # Convolution
        x = self.avgpool(x)  # avgpool
        x = x.view(x.size(0), -1)
        x = self.classifier(x) # FC layer
        return x
    
    def _initialize_weights(self):
        for m in self.modules():  # iterate over every submodule (layer) of the model
            if isinstance(m, nn.Conv2d):  # Conv2d layer
                # Initialize the Conv2d weight with He (Kaiming) normal initialization
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)  # VGG starts conv biases at 0
            elif isinstance(m, nn.BatchNorm2d):  # BatchNorm2d layer
                nn.init.constant_(m.weight, 1)  # initialize weight to 1
                nn.init.constant_(m.bias, 0)  # initialize bias to 0
            elif isinstance(m, nn.Linear):  # Linear layer
                nn.init.normal_(m.weight, 0, 0.01)  # weight ~ N(mean 0, std 0.01)
                # torchvision sets the bias to the constant 0; normal_(m.bias, 0) would draw from N(0, 1)
                nn.init.constant_(m.bias, 0)
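
The `forward` pass above relies on `nn.AdaptiveAvgPool2d((7, 7))` to give the classifier a fixed `512 * 7 * 7` input regardless of image size, which is what lets the same network accept CIFAR10-sized images. The following is a minimal sketch of that shape flow. The single-convolution `features` stack and the 32x32 / 64x64 input sizes are assumptions chosen for illustration, not the real VGG configuration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the `features` argument: one conv block ending in 512 channels.
features = nn.Sequential(
    nn.Conv2d(3, 512, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
)
avgpool = nn.AdaptiveAvgPool2d((7, 7))  # same pooling layer as in the VGG class

shapes = {}
for size in (32, 64):  # 32x32 matches CIFAR10 images; 64x64 is an arbitrary second size
    x = torch.randn(1, 3, size, size)
    out = avgpool(features(x))          # spatial size is forced to 7x7
    flat = out.view(out.size(0), -1)    # flatten exactly as in VGG.forward
    shapes[size] = tuple(flat.shape)
    print(size, tuple(out.shape), tuple(flat.shape))
```

Both input sizes end up with the same flattened width, so the first `nn.Linear(512 * 7 * 7, 4096)` layer works unchanged.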


### kaiming_normal_  
The best way to initialize weights depends on which activation function is used.  
An explanation of the He initialization paper (“Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification”): https://blog.airlab.re.kr/2019/11/He-initialization

[PyTorch doc](https://pytorch.org/cppdocs/api/function_namespacetorch_1_1nn_1_1init_1ac8a913c051976a3f41f20df7d6126e57.html)  
Fills the input `Tensor` with values according to the method described in “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification” - He, K. et al. (2015), using a normal distribution. Also known as He initialization. No gradient will be recorded for this operation.
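
To make the description above concrete: with `mode='fan_out'` and `nonlinearity='relu'`, the weights are drawn from a zero-mean normal distribution with standard deviation `sqrt(2 / fan_out)`, where `fan_out = out_channels * kernel_height * kernel_width` for a `Conv2d` layer. The sketch below checks this numerically. The layer dimensions (64 input channels, 128 output channels, 3x3 kernel) and the seed are arbitrary assumptions for illustration.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Hypothetical layer sizes chosen for illustration.
conv = nn.Conv2d(64, 128, kernel_size=3)
nn.init.kaiming_normal_(conv.weight, mode='fan_out', nonlinearity='relu')

# fan_out for a conv layer: out_channels * kernel_height * kernel_width
fan_out = conv.out_channels * conv.kernel_size[0] * conv.kernel_size[1]
expected_std = math.sqrt(2.0 / fan_out)   # the ReLU gain is sqrt(2)
empirical_std = conv.weight.std().item()  # measured over all 128 * 64 * 3 * 3 weights

print(f"expected std:  {expected_std:.5f}")
print(f"empirical std: {empirical_std:.5f}")
```

The two values should agree closely, since the sample standard deviation is estimated from tens of thousands of weights.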