# Lab 10-1 Convolution

**Convolution**
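- The operation that slides a filter (kernel) across the image in steps of the stride, multiplies each pair of overlapping elements, and sums the products to give one output value per position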


- stride : how far the filter moves at each step
- padding : extra layers of values (zeros by default in `nn.Conv2d`) wrapped around the border of the input image
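
To make the definition concrete, here is a minimal pure-Python sketch of the sliding-window computation for a single-channel image. The function name `conv2d_single` and the sample `image` and `kernel` values are made up for illustration. It assumes a square kernel and zero padding, and it leaves out the bias, batches, and multiple channels that `nn.Conv2d` handles.

```python
def conv2d_single(image, kernel, stride=1, padding=0):
    """Slide one square kernel over one 2-D image (both lists of lists)."""
    h, w = len(image), len(image[0])
    k = len(kernel)  # assumption: the kernel is k x k

    # Wrap the image in `padding` layers of zeros on every side.
    padded = [[0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for i in range(h):
        for j in range(w):
            padded[i + padding][j + padding] = image[i][j]

    out_h = (h + 2 * padding - k) // stride + 1
    out_w = (w + 2 * padding - k) // stride + 1

    out = []
    for oi in range(out_h):
        row = []
        for oj in range(out_w):
            top, left = oi * stride, oj * stride
            # Multiply the overlapping elements and sum the products.
            row.append(sum(
                padded[top + a][left + b] * kernel[a][b]
                for a in range(k)
                for b in range(k)
            ))
        out.append(row)
    return out


image = [[1, 2, 3, 0],
         [0, 1, 2, 3],
         [3, 0, 1, 2],
         [2, 3, 0, 1]]
kernel = [[2, 0, 1],
          [0, 1, 2],
          [1, 0, 2]]

result = conv2d_single(image, kernel, stride=1, padding=0)
print(result)  # [[15, 16], [6, 15]]
```

Calling the same function with `padding=1` returns a 4x4 output, the same size as the input, which is the effect Example 5 relies on.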

conv input type
- input type : torch.Tensor
- input shape : (N x C x H x W) -> (batch_size, channel, height, width)

output size of convolution
$$ \text{Output size} = \left\lfloor \frac{\text{input size} - \text{filter size} + 2 \times \text{padding}}{\text{stride}} \right\rfloor + 1 $$
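
To check the formula against the examples in this notebook, here is a small sketch. The helper name `conv_output_size` is made up for illustration. Integer division (`//`) rounds down when the stride does not divide evenly, which matches the rounding PyTorch uses for `nn.Conv2d`.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# Same settings as Example 1: 227x227 input, 11x11 filter, stride 4, no padding
print(conv_output_size(227, 11, stride=4, padding=0))  # 55

# Same settings as Example 5: 64x32 input, 3x3 filter, stride 1, padding 1
print(conv_output_size(64, 3, stride=1, padding=1))    # 64 (height)
print(conv_output_size(32, 3, stride=1, padding=1))    # 32 (width)
```

Height and width are computed independently, so a non-square input such as 64x32 just needs one call per dimension.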

In [1]:
import torch
import torch.nn as nn

In [3]:
# Example 1
conv = nn.Conv2d(1, 1, 11, stride = 4, padding = 0)
print(conv)

inputs = torch.Tensor(1, 1, 227, 227)
print(inputs.shape)

out = conv(inputs)
print(out.shape)

Conv2d(1, 1, kernel_size=(11, 11), stride=(4, 4))
torch.Size([1, 1, 227, 227])
torch.Size([1, 1, 55, 55])


In [4]:
# Example 5
conv = nn.Conv2d(1, 1, 3, stride = 1, padding = 1)
print(conv)

inputs = torch.Tensor(1, 1, 64, 32)
print(inputs.shape)

out = conv(inputs)
print(out.shape)

Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
torch.Size([1, 1, 64, 32])
torch.Size([1, 1, 64, 32])


**Pooling**
- Max pooling : outputs the maximum value within each window of the chosen size
- Avg pooling : outputs the average value within each window of the chosen size
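
A minimal pure-Python sketch of both pooling modes on a 4x4 input. The function name `pool2d` and the sample values are made up for illustration. It assumes non-overlapping windows (stride equal to the window size), which is the default behaviour of `nn.MaxPool2d(2)`.

```python
def pool2d(image, size, mode="max"):
    """Non-overlapping pooling: the window and the stride are both `size`."""
    out_h, out_w = len(image) // size, len(image[0]) // size
    out = []
    for oi in range(out_h):
        row = []
        for oj in range(out_w):
            # Collect the values inside the current size x size window.
            window = [
                image[oi * size + a][oj * size + b]
                for a in range(size)
                for b in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        out.append(row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

print(pool2d(image, 2, "max"))  # [[6, 8], [14, 16]]
print(pool2d(image, 2, "avg"))  # [[3.5, 5.5], [11.5, 13.5]]
```

Either way, a 2x2 window halves the height and width while keeping the number of channels unchanged.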

In [7]:
## Simple CNN implementation

inputs = torch.Tensor(1, 1, 28, 28) # one 1-channel 28x28 image (values are uninitialized; only the shape matters here)
conv1 = nn.Conv2d(1, 5, 5) # 1 input channel, 5 output channels, 5x5 kernel
pool = nn.MaxPool2d(2) # 2x2 pooling window
out1 = conv1(inputs) # apply the convolution
out2 = pool(out1)    # apply max pooling to the convolution output
print(out1.size())
print(out2.size())


torch.Size([1, 5, 24, 24])
torch.Size([1, 5, 12, 12])


# Lab 10-2 MNIST CNN


**Deep learning training steps**

1. Import libraries (torch, torchvision, matplotlib, pandas, numpy, ...)
2. Configure GPU usage and set a seed for random values
3. Set the training hyperparameters (learning rate, training_epochs, batch_size, etc.)
4. Load the dataset and build a loader (set batch size, shuffle, drop last)
5. Build the model (note to self, Seunghee: please define the model as a class...)
6. Choose a loss function (criterion) and an optimizer
7. Train the model and check the loss (the output of the criterion)
8. Evaluate the trained model's performance


In [8]:
inputs = torch.Tensor(1, 1, 28, 28) # batch size, channel, height, width
print(inputs.shape)

torch.Size([1, 1, 28, 28])


In [10]:
conv1 = nn.Conv2d(1, 32, kernel_size = (3, 3), stride = 1, padding = (1, 1))
pool1 = nn.MaxPool2d(2)

conv2 = nn.Conv2d(32, 64, kernel_size = (3, 3), stride = 1, padding = (1, 1))
pool2 = nn.MaxPool2d(2)

In [11]:
out = conv1(inputs)
print(out.shape)
out = pool1(out)
print(out.shape)
out = conv2(out)
print(out.shape)
out = pool2(out)
print(out.shape)

torch.Size([1, 32, 28, 28])
torch.Size([1, 32, 14, 14])
torch.Size([1, 64, 14, 14])
torch.Size([1, 64, 7, 7])


In [12]:
out = out.view(out.size(0), -1) # reshape
print(out.shape)

torch.Size([1, 3136])


In [13]:
fc = nn.Linear(3136, 10)
out = fc(out)
print(out.shape)

torch.Size([1, 10])


## implementation zero to all

In [21]:
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms

import torch.nn.init

In [22]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(777)
if device == 'cuda':
  torch.cuda.manual_seed_all(777)

In [23]:
# parameters
learning_rate = 0.001
training_epochs = 15
batch_size = 100

In [24]:
# MNIST dataset

mnist_train = dsets.MNIST(root = 'MNIST_data/', train = True, transform = transforms.ToTensor(), download = True)
mnist_test = dsets.MNIST(root = 'MNIST_data/', train = False, transform = transforms.ToTensor(), download = True)


In [25]:
data_loader = torch.utils.data.DataLoader(dataset=mnist_train, batch_size = batch_size, shuffle = True, drop_last = True)

In [29]:
class CNN(nn.Module):

  def __init__(self):
    super(CNN, self).__init__()
    self.layer1 = nn.Sequential(
        nn.Conv2d(1, 32, kernel_size = 3, stride = 1, padding = 1),
        nn.ReLU(),
        nn.MaxPool2d(2)
    )

    self.layer2 = nn.Sequential(
        nn.Conv2d(32, 64, kernel_size = 3, stride = 1, padding = 1),
        nn.ReLU(),
        nn.MaxPool2d(2)
    )

    self.fc = nn.Linear(7*7*64, 10, bias = True)
    torch.nn.init.xavier_uniform_(self.fc.weight) # xavier uniform initialization

  def forward(self, x):
    out = self.layer1(x)
    out = self.layer2(out)

    out = out.view(out.size(0), -1)
    out = self.fc(out)
    
    return out


In [30]:
model = CNN().to(device)

In [31]:
model

CNN(
  (layer1): Sequential(
    (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc): Linear(in_features=3136, out_features=10, bias=True)
)

In [33]:
criterion = nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)

In [34]:
# training
total_batch = len(data_loader)

for epoch in range(training_epochs):
  avg_cost = 0

  for X, Y in data_loader:
    X = X.to(device)
    Y = Y.to(device)

    optimizer.zero_grad() # never skip this: otherwise gradients accumulate across batches and training goes wrong
    hypothesis = model(X)

    cost = criterion(hypothesis, Y)
    cost.backward()
    optimizer.step()

    avg_cost += cost / total_batch

  print('[Epoch : {}] cost = {}'.format(epoch + 1, avg_cost))
print('Learning Finished! ')

[Epoch : 1] cost = 0.21580401062965393
[Epoch : 2] cost = 0.06259971112012863
[Epoch : 3] cost = 0.045254964381456375
[Epoch : 4] cost = 0.03711254522204399
[Epoch : 5] cost = 0.02831818163394928
[Epoch : 6] cost = 0.02485988475382328
[Epoch : 7] cost = 0.020700037479400635
[Epoch : 8] cost = 0.018485048785805702
[Epoch : 9] cost = 0.014841755852103233
[Epoch : 10] cost = 0.011564149521291256
[Epoch : 11] cost = 0.011367826722562313
[Epoch : 12] cost = 0.0099481251090765
[Epoch : 13] cost = 0.00822856742888689
[Epoch : 14] cost = 0.006425795145332813
[Epoch : 15] cost = 0.005629360675811768
Learning Finished! 


In [35]:
# test model

with torch.no_grad():
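  # .data / .targets replace the deprecated test_data / test_labels attributes.
  # Note: .data holds the raw uint8 pixels (0-255); the ToTensor() transform is not applied here.
  X_test = mnist_test.data.view(len(mnist_test), 1, 28, 28).float().to(device)
  Y_test = mnist_test.targets.to(device)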

  prediction = model(X_test)
  correct_prediction = torch.argmax(prediction, 1) == Y_test
  accuracy = correct_prediction.float().mean()
  print('Accuracy : ', accuracy.item())



Accuracy :  0.9836999773979187
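
The accuracy computation in the cell above (argmax over the class scores, compare with the labels, take the mean) can be sketched without PyTorch as follows. The function name `accuracy` and the sample `logits` and `labels` are made up for illustration.

```python
def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    # argmax over the class dimension, like torch.argmax(prediction, 1)
    predicted = [max(range(len(row)), key=row.__getitem__) for row in logits]
    # 1 for a correct prediction, 0 otherwise, like (pred == Y_test).float()
    correct = [int(p == y) for p, y in zip(predicted, labels)]
    return sum(correct) / len(correct)


logits = [[0.1, 2.0, -1.0],   # predicts class 1
          [1.5, 0.2, 0.3],    # predicts class 0
          [0.0, 0.1, 3.0],    # predicts class 2
          [2.2, 2.5, 0.1]]    # predicts class 1 (label is 0, so this one is wrong)
labels = [1, 0, 2, 0]

print(accuracy(logits, labels))  # 0.75
```

No softmax is needed for this step, because the largest logit is also the class with the largest probability.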


## MNIST Deep CNN

In [48]:
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms

import torch.nn.init

In [49]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(777)
if device == 'cuda':
  torch.cuda.manual_seed_all(777)

In [50]:
# parameters
learning_rate = 0.001
training_epochs = 15
batch_size = 100

In [51]:
# MNIST dataset

mnist_train = dsets.MNIST(root = 'MNIST_data/', train = True, transform = transforms.ToTensor(), download = True)
mnist_test = dsets.MNIST(root = 'MNIST_data/', train = False, transform = transforms.ToTensor(), download = True)

In [52]:
data_loader = torch.utils.data.DataLoader(dataset=mnist_train, batch_size = batch_size, shuffle = True, drop_last = True)

In [102]:
class CNN5(nn.Module):

  def __init__(self):
    super(CNN5, self).__init__()
    self.layer1 = nn.Sequential(
        nn.Conv2d(1, 32, kernel_size = 3, stride = 1, padding = 1),
        nn.ReLU(),
        nn.MaxPool2d(2)
    )

    self.layer2 = nn.Sequential(
        nn.Conv2d(32, 64, kernel_size = 3, stride = 1, padding = 1),
        nn.ReLU(),
        nn.MaxPool2d(2)
    )
    self.layer3 = nn.Sequential(
        nn.Conv2d(64, 128, kernel_size = 3, stride = 1, padding = 1),
        nn.ReLU(),
        nn.MaxPool2d(2)
    )

    self.fc1 = nn.Linear(3*3*128, 625, bias = True)
    torch.nn.init.xavier_uniform_(self.fc1.weight)

    self.layer4 = nn.Sequential(
        self.fc1, 
        nn.ReLU()
    )
    self.fc2 = nn.Linear(625, 10, bias = True)
    torch.nn.init.xavier_uniform_(self.fc2.weight) # xavier uniform initialization

  def forward(self, x):
    out = self.layer1(x)
    out = self.layer2(out)
    out = self.layer3(out)

    out = out.view(out.size(0), -1) # flatten to (batch_size, 3*3*128)
    out = self.layer4(out)
    out = self.fc2(out)
    
    return out
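
The `3*3*128` input size of `fc1` comes from tracing the spatial size through the three conv/pool layers. The sketch below does this with plain integer arithmetic. The helper name `conv_out` and the `layers` tuple layout are made up for illustration. It assumes that pooling uses a stride equal to its window size and rounds down, which is the default for `nn.MaxPool2d`.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension (rounds down)."""
    return (size - kernel + 2 * padding) // stride + 1


size, channels = 28, 1
# (out_channels, conv kernel, conv padding, pool window) for layer1..layer3
layers = [(32, 3, 1, 2), (64, 3, 1, 2), (128, 3, 1, 2)]

shapes = []
for out_channels, kernel, padding, pool in layers:
    size = conv_out(size, kernel, stride=1, padding=padding)  # 3x3 conv with padding 1 keeps the size
    size = conv_out(size, pool, stride=pool)                  # 2x2 pooling halves it, rounding down
    channels = out_channels
    shapes.append((channels, size, size))

print(shapes)                   # [(32, 14, 14), (64, 7, 7), (128, 3, 3)]
print(channels * size * size)   # 1152, i.e. 3*3*128
```

The third pooling step takes a 7x7 map to 3x3 rather than 3.5x3.5, because the last row and column do not fill a complete window and are dropped.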


In [103]:
# instantiate CNN model
model = CNN5().to(device)

value = (torch.Tensor(1, 1, 28, 28)).to(device)
print((model(value)).shape)

torch.Size([1, 10])


In [104]:
# define cost/loss & optimizer
criterion = torch.nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)

In [105]:
# train model

total_batch = len(data_loader)
model.train()

for epoch in range(training_epochs):
  avg_cost = 0

  for X, Y in data_loader:
    # image is already size of (28, 28), no reshape
    # label is not one-hot encoded
    X = X.to(device)
    Y = Y.to(device)

    optimizer.zero_grad()
    hypothesis = model(X)
    cost = criterion(hypothesis, Y)
    cost.backward()
    optimizer.step()

    avg_cost += cost / total_batch

  print('[Epoch: {:>4}] cost = {:>.9}'.format(epoch + 1, avg_cost))

print('Learning finish')

[Epoch:    1] cost = 0.174777344
[Epoch:    2] cost = 0.0428322665
[Epoch:    3] cost = 0.0322122388
[Epoch:    4] cost = 0.0214078855
[Epoch:    5] cost = 0.0178606678
[Epoch:    6] cost = 0.0135261631
[Epoch:    7] cost = 0.0115925241
[Epoch:    8] cost = 0.00971460808
[Epoch:    9] cost = 0.0108784307
[Epoch:   10] cost = 0.00969578419
[Epoch:   11] cost = 0.00588979712
[Epoch:   12] cost = 0.00759508554
[Epoch:   13] cost = 0.0065462105
[Epoch:   14] cost = 0.0055210921
[Epoch:   15] cost = 0.004390107
Learning finish


In [106]:
# test model

with torch.no_grad():
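  # .data / .targets replace the deprecated test_data / test_labels attributes.
  # Note: .data holds the raw uint8 pixels (0-255); the ToTensor() transform is not applied here.
  X_test = mnist_test.data.view(len(mnist_test), 1, 28, 28).float().to(device)
  Y_test = mnist_test.targets.to(device)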

  prediction = model(X_test)
  correct_prediction = torch.argmax(prediction, 1) == Y_test
  accuracy = correct_prediction.float().mean()
  print('Accuracy : ', accuracy.item())



Accuracy :  0.9919000267982483
