# **Study Journal**

#### Today's Note  

> **Applying convolution to train a model on images**

## 1. Lecture Summary

### 1.1 Convolution

#### 1.1.1 Definition of convolution

- Continuous convolution  
![image.png](img/con_convolution.png)  

In signal processing, convolution can serve as a measure of how strongly a signal is related to the signal we actually want.  

- Discrete convolution  
![image.png](img/dis_convolution.png) 

Over a small local region, it measures how well the kernel matches the image, which lets it pick out features.
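
The idea of matching a kernel against small regions of an image can be sketched in plain Python. This is a minimal illustration, not the lecture's code: the function name `conv2d_valid`, the 4x4 image, and the 2x2 kernel are all made up for the example, and the kernel is not flipped, which is how deep learning libraries usually define "convolution" (strictly, cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the
    elementwise products at each position. The kernel is not flipped,
    so strictly speaking this is cross-correlation."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 4x4 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds where brightness rises from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The response is nonzero only in the middle column, where the window straddles the dark-to-bright edge, so the output acts as a map of where that feature occurs.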

### 1.2 Modern CNN model

#### 1.2.1 History of the winning models and their improvements

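- AlexNet (8 layers, 60M parameters)  
GPU training, ReLU, dropout, data augmentation  
11×11 convolution kernel
- VGGNet (19 layers, 110M parameters)  
3×3 convolution kernels: stacking small kernels covers the same receptive field with fewer parameters, so the network can be made deeper.
- GoogLeNet (22 layers, 4M parameters)  
Network-in-Network structure, 1×1 convolution (reduces the channel dimension), inception block
- ResNet  
Adds residual connections. A 1×1 convolution can be placed on the shortcut, but in general the simple (identity) shortcut is used. Uses batch normalization.  
Open question: which order of batch norm and ReLU works better?  
Bottleneck architecture: a 1×1 convolution is added before and after the 3×3 convolution to match the channel counts.  
![image.png](img/resnet.png) 

- DenseNet  
ResNet adds feature maps, whereas DenseNet concatenates them. The growing channel count is handled with 1×1 convolutions.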
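
The parameter savings behind these design choices can be checked with simple arithmetic. The sketch below is illustrative only: the helper `conv_params`, the channel widths (128, 256, 64), and the DenseNet growth rate of 32 are assumptions picked for the example, not values taken from the lecture.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution from c_in to c_out channels (bias ignored)."""
    return c_in * c_out * k * k


# (a) VGG idea: two stacked 3x3 layers see the same 5x5 receptive field
#     as one 5x5 layer, but need fewer weights. Channel width 128 is hypothetical.
one_5x5 = conv_params(128, 128, 5)
two_3x3 = 2 * conv_params(128, 128, 3)

# (b) Bottleneck idea: shrink channels with 1x1, apply 3x3, expand again with 1x1.
#     The 256 -> 64 -> 64 -> 256 widths are hypothetical.
plain_3x3 = conv_params(256, 256, 3)
bottleneck = (conv_params(256, 64, 1)
              + conv_params(64, 64, 3)
              + conv_params(64, 256, 1))

# (c) ResNet adds feature maps (channel count unchanged), DenseNet concatenates
#     them (channel count grows by the assumed growth rate at every layer).
in_channels, growth_rate, n_layers = 64, 32, 4
resnet_channels = in_channels
densenet_channels = in_channels + growth_rate * n_layers

print(one_5x5, two_3x3)                    # 409600 294912
print(plain_3x3, bottleneck)               # 589824 69632
print(resnet_channels, densenet_channels)  # 64 192
```

Case (c) shows why DenseNet needs 1×1 convolutions: concatenation keeps increasing the channel count, so something has to bring it back down.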


### 1.3 Semantic Segmentation

#### 1.3.1 History of semantic segmentation and its improvements

### 1.4 CNN model

#### 1.4.1 CNN model code review

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
%matplotlib inline
%config InlineBackend.figure_format='retina'
print ("PyTorch version:[%s]."%(torch.__version__))
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print ("device:[%s]."%(device))

PyTorch version:[1.7.1].
device:[cuda:0].


In [2]:
# Load the dataset, then split it by batch size to build an iterator
from torchvision import datasets,transforms
mnist_train = datasets.MNIST(root='./data/',train=True,transform=transforms.ToTensor(),download=True)
mnist_test = datasets.MNIST(root='./data/',train=False,transform=transforms.ToTensor(),download=True)
print ("mnist_train:\n",mnist_train,"\n")
print ("mnist_test:\n",mnist_test,"\n")
print ("Done.")

mnist_train:
 Dataset MNIST
    Number of datapoints: 60000
    Root location: ./data/
    Split: Train
    StandardTransform
Transform: ToTensor() 

mnist_test:
 Dataset MNIST
    Number of datapoints: 10000
    Root location: ./data/
    Split: Test
    StandardTransform
Transform: ToTensor() 

Done.


In [3]:
BATCH_SIZE = 256
train_iter = torch.utils.data.DataLoader(mnist_train,batch_size=BATCH_SIZE,shuffle=True,num_workers=1)
test_iter = torch.utils.data.DataLoader(mnist_test,batch_size=BATCH_SIZE,shuffle=True,num_workers=1)
print ("Done.")

Done.


In [7]:
class ConvolutionalNeuralNetworkClass(nn.Module):
    """
        Convolutional Neural Network (CNN) Class
    """
    # Define the model using OOP concepts
    def __init__(self,name='cnn',xdim=[1,28,28],
                 ksize=3,cdims=[32,64],hdims=[1024,128],ydim=10,
                 USE_BATCHNORM=False):
        super(ConvolutionalNeuralNetworkClass,self).__init__()
        self.name = name
        self.xdim = xdim
        self.ksize = ksize
        self.cdims = cdims
        self.hdims = hdims
        self.ydim = ydim
        self.USE_BATCHNORM = USE_BATCHNORM

        # Convolutional layers
        self.layers = []
        prev_cdim = self.xdim[0]
        for cdim in self.cdims: # for each hidden layer
            self.layers.append(
                nn.Conv2d(prev_cdim,cdim,kernel_size=self.ksize,stride=(1,1),
                          padding=(self.ksize//2)
                  )) # convolution
            if self.USE_BATCHNORM :
                self.layers.append(nn.BatchNorm2d(cdim)) # batch-norm
            self.layers.append(nn.ReLU(True))  # activation
            self.layers.append(nn.MaxPool2d(kernel_size=(2,2), stride=(2,2))) # max-pooling 
            self.layers.append(nn.Dropout2d(p=0.5))  # dropout
            prev_cdim = cdim

        # Dense layers
        self.layers.append(nn.Flatten())
        prev_hdim = prev_cdim*(self.xdim[1]//(2**len(self.cdims)))*(self.xdim[2]//(2**len(self.cdims)))
        for hdim in self.hdims:
            self.layers.append(nn.Linear(prev_hdim,hdim,bias=True))
            self.layers.append(nn.ReLU(True))  # activation
            prev_hdim = hdim
        # Final layer (without activation)
        self.layers.append(nn.Linear(prev_hdim,self.ydim,bias=True))

        # Concatenate all layers 
        self.net = nn.Sequential()
        for l_idx,layer in enumerate(self.layers):
            layer_name = "%s_%02d"%(type(layer).__name__.lower(),l_idx)
            self.net.add_module(layer_name,layer)
        self.init_param() # initialize parameters
        
    def init_param(self):
        for m in self.modules():
            if isinstance(m,nn.Conv2d): # init conv
                nn.init.kaiming_normal_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m,nn.BatchNorm2d): # init BN
                nn.init.constant_(m.weight,1)
                nn.init.constant_(m.bias,0)
            elif isinstance(m,nn.Linear): # init dense
                nn.init.kaiming_normal_(m.weight)
                nn.init.zeros_(m.bias)
            
    def forward(self,x):
        return self.net(x)

C = ConvolutionalNeuralNetworkClass(
    name='cnn',xdim=[1,28,28],ksize=3,cdims=[32,64],
    hdims=[32],ydim=10).to(device)
loss = nn.CrossEntropyLoss()
optm = optim.Adam(C.parameters(),lr=1e-3)
print ("Done.")

Done.


In [8]:
np.set_printoptions(precision=3)
n_param = 0
for p_idx,(param_name,param) in enumerate(C.named_parameters()):
    if param.requires_grad:
        param_numpy = param.detach().cpu().numpy() # to numpy array 
        n_param += len(param_numpy.reshape(-1))
        print ("[%d] name:[%s] shape:[%s]."%(p_idx,param_name,param_numpy.shape))
        print ("    val:%s"%(param_numpy.reshape(-1)[:5]))
print ("Total number of parameters:[%s]."%(format(n_param,',d')))

[0] name:[net.conv2d_00.weight] shape:[(32, 1, 3, 3)].
    val:[-0.192 -0.124  0.626 -0.057  0.378]
[1] name:[net.conv2d_00.bias] shape:[(32,)].
    val:[0. 0. 0. 0. 0.]
[2] name:[net.conv2d_04.weight] shape:[(64, 32, 3, 3)].
    val:[ 0.091  0.074  0.144 -0.056  0.062]
[3] name:[net.conv2d_04.bias] shape:[(64,)].
    val:[0. 0. 0. 0. 0.]
[4] name:[net.linear_09.weight] shape:[(32, 3136)].
    val:[-0.018  0.015 -0.033  0.058 -0.064]
[5] name:[net.linear_09.bias] shape:[(32,)].
    val:[0. 0. 0. 0. 0.]
[6] name:[net.linear_11.weight] shape:[(10, 32)].
    val:[ 0.36  -0.283  0.122  0.275 -0.303]
[7] name:[net.linear_11.bias] shape:[(10,)].
    val:[0. 0. 0. 0. 0.]
Total number of parameters:[119,530].
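
The total of 119,530 can be reproduced by hand from the constructor arguments used above (`xdim=[1,28,28]`, `ksize=3`, `cdims=[32,64]`, `hdims=[32]`, `ydim=10`). This is a rough check that assumes `USE_BATCHNORM=False`, as in the cell above, so only the convolution and linear layers carry parameters.

```python
# Same hyperparameters as the model instance C above.
xdim, ksize, cdims, hdims, ydim = [1, 28, 28], 3, [32, 64], [32], 10

n_param = 0

# Convolution layers: (in_channels * out_channels * k * k) weights + out_channels biases.
prev_c = xdim[0]
for c in cdims:
    n_param += prev_c * c * ksize * ksize + c
    prev_c = c

# Padding of ksize//2 keeps the spatial size, and each 2x2 max-pool halves it,
# so the 28x28 input becomes 7x7 after two conv blocks.
side_h = xdim[1] // (2 ** len(cdims))
side_w = xdim[2] // (2 ** len(cdims))
flat_dim = prev_c * side_h * side_w

# Dense layers (hidden layers plus the final output layer): in * out weights + out biases.
prev_h = flat_dim
for h in hdims + [ydim]:
    n_param += prev_h * h + h
    prev_h = h

print(flat_dim)  # 3136
print(n_param)   # 119530
```

The flattened size 3136 = 64 × 7 × 7 matches the `(32, 3136)` shape of `net.linear_09.weight`, and that single layer accounts for most of the parameters.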


In [9]:
np.set_printoptions(precision=3)
torch.set_printoptions(precision=3)
x_numpy = np.random.rand(2,1,28,28)
x_torch = torch.from_numpy(x_numpy).float().to(device)
y_torch = C.forward(x_torch) # forward path
y_numpy = y_torch.detach().cpu().numpy() # torch tensor to numpy array
print ("x_torch:\n",x_torch)
print ("y_torch:\n",y_torch)
print ("\nx_numpy %s:\n"%(x_numpy.shape,),x_numpy)
print ("y_numpy %s:\n"%(y_numpy.shape,),y_numpy)

x_torch:
 tensor([[[[0.383, 0.025, 0.250,  ..., 0.134, 0.644, 0.542],
          [0.115, 0.283, 0.608,  ..., 0.169, 0.246, 0.836],
          [0.007, 0.488, 0.788,  ..., 0.784, 0.744, 0.688],
          ...,
          [0.440, 0.420, 0.908,  ..., 0.231, 0.566, 0.682],
          [0.436, 0.177, 0.798,  ..., 0.251, 0.807, 0.990],
          [0.373, 0.446, 0.004,  ..., 0.403, 0.380, 0.007]]],


        [[[0.687, 0.169, 0.155,  ..., 0.124, 0.156, 0.531],
          [0.783, 0.847, 0.354,  ..., 0.457, 0.475, 0.990],
          [0.655, 0.147, 0.573,  ..., 0.376, 0.823, 0.817],
          ...,
          [0.364, 0.262, 0.130,  ..., 0.103, 0.679, 0.591],
          [0.025, 0.484, 0.892,  ..., 0.987, 0.951, 0.148],
          [0.581, 0.100, 0.255,  ..., 0.383, 0.184, 0.907]]]], device='cuda:0')
y_torch:
 tensor([[-1.153, -4.308, -1.840, -2.702,  0.961,  0.573, -7.581, -1.540, -4.667,
          0.091],
        [ 0.765, -3.300, -1.796, -3.842, -1.426, -1.519, -2.906,  0.429, -0.654,
          1.724]], device=

In [10]:
def func_eval(model,data_iter,device):
    with torch.no_grad():
        n_total,n_correct = 0,0
        model.eval() # evaluate (affects DropOut and BN)
        for batch_in,batch_out in data_iter:
            y_trgt = batch_out.to(device)
            model_pred = model(batch_in.view(-1,1,28,28).to(device))
            _,y_pred = torch.max(model_pred.data,1)
            n_correct += (y_pred==y_trgt).sum().item()
            n_total += batch_in.size(0)
        val_accr = (n_correct/n_total)
        model.train() # back to train mode 
    return val_accr
print ("Done")

Done
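
The accuracy computation in `func_eval` amounts to taking the index of the largest logit per sample and comparing it with the label. Below is a minimal plain-Python sketch of the same idea. The helper `accuracy` and the toy logits are hypothetical, and it stands in for the `torch.max(..., 1)` call rather than reproducing it.

```python
def accuracy(logits, targets):
    """Fraction of samples whose largest logit sits at the target index."""
    n_correct = 0
    for row, target in zip(logits, targets):
        pred = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        n_correct += int(pred == target)
    return n_correct / len(targets)


# Hypothetical logits for 4 samples and 3 classes.
logits = [
    [-1.2, 0.3, 2.0],   # predicts class 2
    [0.9, 0.1, -0.5],   # predicts class 0
    [0.0, 1.5, 1.4],    # predicts class 1
    [2.2, -0.3, 0.1],   # predicts class 0
]
targets = [2, 0, 2, 0]

acc = accuracy(logits, targets)
print(acc)  # 0.75
```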


In [11]:
C.init_param() # initialize parameters
train_accr = func_eval(C,train_iter,device)
test_accr = func_eval(C,test_iter,device)
print ("train_accr:[%.3f] test_accr:[%.3f]."%(train_accr,test_accr))

train_accr:[0.102] test_accr:[0.104].


In [12]:
print ("Start training.")
C.init_param() # initialize parameters
C.train() # to train mode 
EPOCHS,print_every = 10,1
for epoch in range(EPOCHS):
    loss_val_sum = 0
    for batch_in,batch_out in train_iter:
        # Forward path
        y_pred = C.forward(batch_in.view(-1,1,28,28).to(device))
        loss_out = loss(y_pred,batch_out.to(device))
        # Update
        optm.zero_grad()     # reset gradient 

        loss_out.backward()    # backpropagate
        optm.step()      # optimizer update
        loss_val_sum += loss_out.item() # accumulate a Python float, not a graph-attached tensor
    loss_val_avg = loss_val_sum/len(train_iter)
    # Print
    if ((epoch%print_every)==0) or (epoch==(EPOCHS-1)):
        train_accr = func_eval(C,train_iter,device)
        test_accr = func_eval(C,test_iter,device)
        print ("epoch:[%d] loss:[%.3f] train_accr:[%.3f] test_accr:[%.3f]."%
               (epoch,loss_val_avg,train_accr,test_accr))
print ("Done")

Start training.
epoch:[0] loss:[0.648] train_accr:[0.953] test_accr:[0.955].
epoch:[1] loss:[0.195] train_accr:[0.971] test_accr:[0.969].
epoch:[2] loss:[0.136] train_accr:[0.980] test_accr:[0.979].
epoch:[3] loss:[0.107] train_accr:[0.983] test_accr:[0.983].
epoch:[4] loss:[0.091] train_accr:[0.984] test_accr:[0.985].
epoch:[5] loss:[0.085] train_accr:[0.987] test_accr:[0.986].
epoch:[6] loss:[0.077] train_accr:[0.989] test_accr:[0.987].
epoch:[7] loss:[0.069] train_accr:[0.990] test_accr:[0.988].
epoch:[8] loss:[0.066] train_accr:[0.991] test_accr:[0.987].
epoch:[9] loss:[0.062] train_accr:[0.992] test_accr:[0.988].
Done


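## 2. Peer Session Summary

#### 2.1 Study

- 2.1.1 We went over the concept of bootstrapping and discussed k-fold validation. (Presenter: 변재경)

- 2.1.2 We split up the concepts that go into a CNN, analyzed them, and organized them so that our understanding was consistent. (Presenter: 김상현)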
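
For reference, the two sampling schemes from 2.1.1 can be sketched with index lists. This is only an illustration under assumed names (`k_fold_indices`, `bootstrap_sample`) and sizes (10 samples, 5 folds), not what was presented in the session: k-fold validation partitions the data so every sample is used for validation exactly once, while bootstrapping draws samples with replacement.

```python
import random


def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement (duplicates are expected)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


folds = list(k_fold_indices(10, 5))
for train, val in folds:
    print("train:", train, "val:", val)

rng = random.Random(0)  # fixed seed so the draw is repeatable
sample = bootstrap_sample(10, rng)
print("bootstrap:", sample)
```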


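#### 2.2 Choosing a career direction and studying papers

- Deep Learning : Yann LeCun, Yoshua Bengio & Geoffrey Hinton
Needed to build an overall understanding of deep learning and to remove the language barrier when reading AI papers. A presentation is planned.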

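## 3. Ongoing and New Study Items

- Ongoing  

    - Read and summarize the Deep Learning paper
    - Basic math for AI: study chapter 3 of Mathematics for Machine Learning (Marc Peter Deisenroth)
    - Get comfortable with web crawling and data processing through practice
    - Do one project with NumPy (planning to extract sheet music from a wav file)
    - Practice pandas operations repeatedly
    - Study "Pytorch로 시작하는 딥러닝" and the PyTorch tutorial  
    - Set up PyTorch datasets

- New study items  


- Completed  

## 4. Gratitude

- Thanks to 새봄, who gladly taught me over Zoom when I asked for help with what I did not understand about datasets.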