Convolutional Neural Network - Introduction
---

### Stanford CS231n - the best textbook

[Main Page](http://cs231n.stanford.edu/)

[Course Note](http://cs231n.github.io/)

[Youtube](https://www.youtube.com/playlist?list=PL3FW7Lu3i5JvHM8ljYj-zLfQRF3EO8sYv)

[Reference: CS231n Ch9 Convolutional Neural Network Lecture Note](https://www.slideshare.net/raistlinkong/cs231n-2017-lecture9-cnn-architecture)

http://taewan.kim/post/cnn/

Let's start by revisiting the structure of the ordinary DNN we covered in the previous session.



![image.png](attachment:image.png)




The network had the structure shown above. Because every layer is fully connected, a larger input means many more weights to update. In other words, the number of parameters is very large, so as the data grows the network becomes harder to train and more prone to overfitting.

To address this problem, researchers devised a way to reduce the number of parameters for image data by exploiting the characteristics of images. The approach imitates how human vision processes visual information, and it is the well-known CNN (Convolutional Neural Network).
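
To make the parameter argument concrete, here is a rough sketch that counts the weights of one fully connected layer and one convolution layer applied to the same image. The image sizes, the 100 output units or filters, and the 5x5 filter are illustrative assumptions, and `fc_params` and `conv_params` are hypothetical helper names.

```python
def fc_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


def conv_params(in_channels, n_filters, filter_size):
    """Weights plus biases of a convolution layer with square filters.

    Each filter is shared across every image position, so the count
    does not depend on the image width or height.
    """
    return n_filters * (in_channels * filter_size * filter_size + 1)


# Illustrative assumption: RGB images, 100 hidden units vs. 100 filters of 5x5.
for side in (28, 224):
    n_inputs = side * side * 3
    print(side, fc_params(n_inputs, 100), conv_params(3, 100, 5))
```

Under these assumptions the fully connected count grows from 235,300 to 15,052,900 as the image side grows from 28 to 224, while the convolution layer stays at 7,600.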




<br/>
<img src="https://t1.daumcdn.net/cfile/tistory/276FC94357AB43B00D" alt="Drawing" style="width: 700px;"/>
<br/><br/>
<img src="https://ars.els-cdn.com/content/image/1-s2.0-S016516841500290X-gr1.jpg" alt="Drawing" style="width: 500px;"/>


### V1 cortex (Primary Visual Cortex)

- V1 is arranged in a spatial map $\rightarrow$ features defined in terms of two-dimensional maps.

- V1 contains many simple cells $\rightarrow$ characterized by a linear function.

- V1 also contains many complex cells (invariant to small shifts) $\rightarrow$ pooling strategies such as maxout units
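
As a loose illustration of the last two points (an analogy, not a model of real neurons), the sketch below treats a simple cell as a linear filter and a complex cell as a max over neighbouring filter responses. The 1-D signals, the `[-1, 1]` edge filter, and the helper names are assumptions made for this example. The pooled outputs agree only because the one-step shift keeps the edge inside the same pooling window.

```python
def linear_filter(signal, kernel):
    """Slide a kernel over a 1-D signal (a 'simple cell': a linear function)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def max_pool(values, size):
    """Max over non-overlapping windows (a 'complex cell': tolerant to small shifts)."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


edge_filter = [-1, 1]                   # responds to a rising edge
signal = [0, 0, 0, 1, 1, 1, 1, 1]       # edge between index 2 and 3
shifted = [0, 0, 0, 0, 1, 1, 1, 1]      # same edge, moved one step right

resp_a = linear_filter(signal, edge_filter)
resp_b = linear_filter(shifted, edge_filter)
print(resp_a, resp_b)                            # the raw responses differ
print(max_pool(resp_a, 2), max_pool(resp_b, 2))  # the pooled responses match
```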


Convolutional Neural Network - Structure
---

![image.png](attachment:image.png)


<br/>
<img src="http://cs231n.github.io/assets/cnn/convnet.jpeg" alt="Drawing" style="width: 700px;"/>


### Convolution Layer

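Hyperparameters: filter size (receptive field), depth, stride, zero-padding

- filter size (F): the spatial extent of each filter (its receptive field); the same filter weights are reused at every position (parameter sharing)
- depth (D): the number of filters, which sets the depth of the output volume
- stride (S): the step with which we slide the filter
- zero-padding (P): the number of zeros padded around the border of the input volume

Output Size = D × {(W−F+2P)/S+1} × {(W−F+2P)/S+1}, where W is the input width (a square input is assumed)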
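
The spatial part of this formula, (W−F+2P)/S+1 per side, can be checked with a small helper. This is a minimal sketch: `conv_output_size` is a hypothetical name, the example settings are assumptions chosen for illustration, and the helper rejects settings where the filter does not fit evenly across the input.

```python
def conv_output_size(w, f, p, s):
    """Spatial output size (W - F + 2P) / S + 1 for one dimension."""
    span = w - f + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("filter does not tile the input with these settings")
    return span // s + 1


# Illustrative settings (assumptions, not taken from the text above).
print(conv_output_size(w=32, f=5, p=2, s=1))   # 5x5 filter with padding 2 keeps the size
print(conv_output_size(w=7, f=3, p=0, s=2))    # stride 2 shrinks 7 -> 3
side = conv_output_size(w=28, f=5, p=0, s=1)
print((10, side, side))                        # output volume for depth D = 10
```

Libraries such as PyTorch round the division down instead of raising an error, so the strict check here is a design choice of this sketch.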

<h3><center>
    <a href="http://cs231n.github.io/assets/conv-demo/index.html">CS231n Convolution Layer Demo</a>
</center></h3>



### Pooling Layer

<img src="http://cs231n.github.io/assets/cnn/pool.jpeg" alt="Drawing" style="width: 400px;"/>

<img src="http://cs231n.github.io/assets/cnn/maxpool.jpeg" alt="Drawing" style="width: 400px;"/>


Because pooling layers usually shrink the representation drastically (an effect that helps only on small datasets, for example by curbing overfitting), the recent trend is moving toward using pooling layers less and less.
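
A minimal pure-Python sketch of 2x2 max pooling with stride 2 shows how much of the representation is discarded. The `max_pool2d` helper below is a hypothetical stand-in written for this example (not the PyTorch function of the same name), and the 4x4 feature map is an assumed input.

```python
def max_pool2d(image, size=2, stride=2):
    """Max pooling over a 2-D list of numbers (no padding)."""
    rows = range(0, len(image) - size + 1, stride)
    cols = range(0, len(image[0]) - size + 1, stride)
    return [[max(image[r + i][c + j] for i in range(size) for j in range(size))
             for c in cols]
            for r in rows]


feature_map = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 8], [3, 4]]
```

Only 4 of the 16 activations survive, which is the aggressive size reduction described above.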



### Fully Connected Layer


![image.png](attachment:image.png)


<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTtveLa9aU0XOoeUeEn0HXCl_aGRJMBqwimt_9sg29j23Ch9gpZ" alt="Drawing" style="width: 400px;"/>




## MNIST tutorial

Original Source code: https://github.com/pytorch/examples

In [1]:
from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

def train(model, device, train_loader, optimizer, epoch, log_interval):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() # sum up batch loss (size_average is deprecated)
            pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(test_loss, correct,
                len(test_loader.dataset), 100. * correct / len(test_loader.dataset)))
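
The `320` in `nn.Linear(320, 50)` and `x.view(-1, 320)` follows from the layer sizes in `Net`. The trace below is a rough sketch that assumes 28x28 MNIST inputs, no padding, stride 1 for the convolutions, and a pooling stride equal to the pooling window; `conv_out` and `pool_out` are hypothetical helpers.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial size after a convolution: (W - F + 2P) / S + 1, rounded down."""
    return (size - kernel + 2 * padding) // stride + 1


def pool_out(size, window):
    """Spatial size after max pooling with stride equal to the window."""
    return size // window


side = 28                      # MNIST images are 28x28 with 1 channel
side = conv_out(side, 5)       # conv1: kernel_size=5 -> 24
side = pool_out(side, 2)       # max_pool2d(..., 2)  -> 12
side = conv_out(side, 5)       # conv2: kernel_size=5 -> 8
side = pool_out(side, 2)       # max_pool2d(..., 2)  -> 4
flat = 20 * side * side        # conv2 has 20 output channels
print(side, flat)              # 4 320
```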


In [2]:
use_cuda = torch.cuda.is_available()

batch_size = 64
test_batch_size = 1000
lr = 0.01
momentum = 0.0
log_interval = 10
epochs = 10

device = torch.device("cuda:0" if use_cuda else "cpu")
kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
        batch_size=batch_size, shuffle=True, **kwargs)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
        batch_size=test_batch_size, shuffle=False, **kwargs)


model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=lr, momentum=momentum)

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    test(model, device, test_loader)







Test set: Average loss: 0.3247, Accuracy: 9151/10000 (92%)

Test set: Average loss: 0.1764, Accuracy: 9476/10000 (95%)

Test set: Average loss: 0.1322, Accuracy: 9600/10000 (96%)

Test set: Average loss: 0.1073, Accuracy: 9670/10000 (97%)

Test set: Average loss: 0.0968, Accuracy: 9707/10000 (97%)

Test set: Average loss: 0.0870, Accuracy: 9722/10000 (97%)

Test set: Average loss: 0.0803, Accuracy: 9736/10000 (97%)

Test set: Average loss: 0.0748, Accuracy: 9775/10000 (98%)

Test set: Average loss: 0.0685, Accuracy: 9784/10000 (98%)

Test set: Average loss: 0.0668, Accuracy: 9794/10000 (98%)



In [3]:
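# Save only the learned parameters (state_dict), then rebuild the model and load them.
# Pickling the whole model triggers a serialization warning and may fail to load in newer PyTorch versions.
torch.save(model.state_dict(), 'test_mnist.pth')
model2 = Net().to(device)
model2.load_state_dict(torch.load('test_mnist.pth', map_location=device))
test(model2, device, test_loader)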



Test set: Average loss: 0.0668, Accuracy: 9794/10000 (98%)



Convolutional Neural Network - Case Study
---

### LeNet-5 [LeCun et al. (1998)]

<img src="https://t1.daumcdn.net/cfile/tistory/2777003557AB5C0634" alt="Drawing" style="width: 700px;"/>

<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQtox5IW0GC0_AEXhVeLDJ4f-5ePyxd8AzTQUyBclJMTRTtvilm" alt="Drawing" style="width: 700px;"/>

<img src="http://yann.lecun.com/exdb/lenet/gifs/f333.gif" alt="Drawing" style="width: 500px;"/>


<center>
    <a href="http://yann.lecun.com/exdb/lenet/index.html">Yann LeCun LeNet-5 Demo</a>
</center>


<img src="https://image.slidesharecdn.com/random-170910154045/95/-64-638.jpg?cb=1505089848" alt="Drawing" style="width: 700px;"/>

### AlexNet
<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-18-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-19-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>


### ZFNet
<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-24-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>


### VGGNet
<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-35-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-30-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>


### GoogLeNet
<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-63-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-49-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-55-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>


### ResNet
<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-65-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-79-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-77-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>


### Network Comparison
<img src="https://pbs.twimg.com/media/C6u_ugFWsAI3XgZ.jpg" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-90-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>


### Next Version...?
<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-100-1024.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-98-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture9-171204024938/95/cs231n-2017-lecture9-cnn-architecture-99-638.jpg?cb=1512355830" alt="Drawing" style="width: 700px;"/>



Convolutional Neural Network - Transfer Learning
---
<img src="https://image.slidesharecdn.com/dlcvd2l5transfer-160802094347/95/deep-learning-for-computer-vision-transfer-learning-and-domain-adaptation-upc-2016-4-638.jpg?cb=1470247790" alt="Drawing" style="width: 700px;"/>


### Training Dataset
<img src="https://ai2-s2-public.s3.amazonaws.com/figures/2017-08-08/2d79d338c114ece1d97cde1aa06ab4cf17d38254/2-Table1-1.png" alt="Drawing" style="width: 700px;"/>

<h3><center>
    <a href="https://skymind.ai/wiki/open-datasets">Open Datasets</a>
</center></h3>


### Models
PyTorch - [TorchVision models](https://pytorch.org/docs/stable/torchvision/models.html)

Caffe - [Model zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo)

Tensorflow - [tf hub](https://www.tensorflow.org/hub/)


### ONNX - Open Neural Network Exchange Format
<img src="https://image.slidesharecdn.com/deeplearningsystems-modelserving-datainnovationsummitmarch2018-180322170608/95/model-serving-for-deep-learning-28-638.jpg?cb=1521738792" alt="Drawing" style="width: 700px;"/>

[ONNX website](http://onnx.ai/)

[ONNX Model Zoo](https://github.com/onnx/models)


Recurrent Neural Network - Introduction
---
<img src="http://www.wildml.com/wp-content/uploads/2015/09/rnn.jpg" alt="Drawing" style="width: 700px;"/>


Recurrent Neural Network - Structure
---
<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-20-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-22-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-29-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-30-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-31-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>


### LSTM
<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-41-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-95-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-99-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-102-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>


Recurrent Neural Network - RNN Applications
---
### Translation
<img src="http://www.wildml.com/wp-content/uploads/2015/09/Screen-Shot-2015-09-17-at-10.39.06-AM-1024x557.png" alt="Drawing" style="width: 700px;"/>


### Image Captioning
<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-63-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-73-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>


### Image Captioning with attention
<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-80-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-84-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

<img src="https://image.slidesharecdn.com/cs231n2017lecture10-171031000846/95/cs231n-2017-lecture10-recurrent-neural-networks-86-638.jpg?cb=1509408599" alt="Drawing" style="width: 700px;"/>

