Copyright (C) 2020 Software Platform Lab, Seoul National University

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

### Preparing MNIST Dataset

PyTorch provides an MNIST Dataset class, so we can simply import it from `torchvision.datasets` instead of writing one by hand. Before instantiating the Dataset objects, let's define the transforms for the MNIST data. The `torchvision.transforms` module contains many predefined transforms. Here, we apply `ToTensor()`, which converts each image into a PyTorch tensor, and `Normalize(mean, std)`, which subtracts `mean` from each value and divides the result by `std`.

In [1]:
import torchvision.transforms as transforms

# Normalize data with mean=0.5, std=1.0
mnist_transform = transforms.Compose([
    transforms.ToTensor(), # the values loaded from the dataset may not be tensors yet
    transforms.Normalize((0.5,), (1.0,))
])
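
To make the effect of these two transforms concrete, here is a minimal plain-Python sketch of the arithmetic on a single pixel. It assumes the raw MNIST pixels are 8-bit integers in [0, 255], which `ToTensor()` scales to [0.0, 1.0]. The helper names `to_unit_range` and `normalize` are hypothetical and are not part of torchvision.

```python
def to_unit_range(pixel: int) -> float:
    """Mimic the scaling ToTensor() applies to 8-bit images: [0, 255] -> [0.0, 1.0]."""
    return pixel / 255.0


def normalize(value: float, mean: float = 0.5, std: float = 1.0) -> float:
    """Mimic Normalize(mean, std) on one value: subtract the mean, divide by std."""
    return (value - mean) / std


# Black, mid-gray and white pixels after both transforms
for raw in (0, 128, 255):
    print(raw, round(normalize(to_unit_range(raw)), 4))
```

With `mean=0.5` and `std=1.0`, the pixel values end up in [-0.5, 0.5]: the transform shifts the data but does not rescale it.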

We use the `torchvision.datasets.MNIST` API to instantiate the MNIST Dataset objects. With the `download=True` flag, the MNIST dataset is automatically downloaded to `download_root` unless it already exists there.

In [2]:
from torchvision.datasets import MNIST

# download path
download_root = './MNIST_DATASET'

train_dataset = MNIST(download_root, transform=mnist_transform, train=True, download=True)
test_dataset = MNIST(download_root, transform=mnist_transform, train=False, download=True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./MNIST_DATASET/MNIST/raw/train-images-idx3-ubyte.gz



Extracting ./MNIST_DATASET/MNIST/raw/train-images-idx3-ubyte.gz to ./MNIST_DATASET/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./MNIST_DATASET/MNIST/raw/train-labels-idx1-ubyte.gz



Extracting ./MNIST_DATASET/MNIST/raw/train-labels-idx1-ubyte.gz to ./MNIST_DATASET/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./MNIST_DATASET/MNIST/raw/t10k-images-idx3-ubyte.gz




Extracting ./MNIST_DATASET/MNIST/raw/t10k-images-idx3-ubyte.gz to ./MNIST_DATASET/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./MNIST_DATASET/MNIST/raw/t10k-labels-idx1-ubyte.gz



Extracting ./MNIST_DATASET/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./MNIST_DATASET/MNIST/raw
Processing...
Done!




Finally, we instantiate DataLoaders that batch the MNIST Datasets; the training loader also reshuffles the data every epoch.

In [3]:
from torch.utils.data import DataLoader

BATCH_SIZE = 64 # use a power of two so batches are easy to process on a GPU

train_loader = DataLoader(dataset=train_dataset, 
                         batch_size=BATCH_SIZE,
                         shuffle=True)

# The test set is only used for evaluation, so it does not need shuffling
test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE,
                         shuffle=False)
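
As a rough illustration of what the loaders do with the data, the sketch below splits sample indices into batches in plain Python. It assumes the standard MNIST split sizes (60,000 training and 10,000 test images) and that the final, smaller batch is kept, which is the `DataLoader` default (`drop_last=False`). The helper `make_batches` is hypothetical and only imitates the index handling, not the tensor collation.

```python
import random


def make_batches(num_samples: int, batch_size: int, shuffle: bool) -> list:
    """Split sample indices into consecutive batches, optionally shuffling first."""
    indices = list(range(num_samples))
    if shuffle:
        random.shuffle(indices)
    return [indices[i:i + batch_size] for i in range(0, num_samples, batch_size)]


# Standard MNIST split sizes with the batch size used above
train_batches = make_batches(60_000, 64, shuffle=True)
test_batches = make_batches(10_000, 64, shuffle=False)

# Number of batches and size of the last (partial) batch
print(len(train_batches), len(train_batches[-1]))
print(len(test_batches), len(test_batches[-1]))
```

Under these assumptions, one epoch consists of 938 training batches (the last holding 32 samples) and 157 test batches (the last holding 16).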

## Custom Neural Network Models 

PyTorch's `torch.nn.Module` makes it easy to define your own custom neural network model. All you need to do is 1) create a class that inherits from `torch.nn.Module` and 2) define its `forward` method. Let's build a simple CNN model using the following `torch.nn` APIs.

* `torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')` 
* `torch.nn.Linear(in_features, out_features, bias=True)`
* `torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)`
* `torch.nn.functional.relu(input, inplace=False)`
* `torch.nn.functional.softmax(input, dim=None)`

You can refer to the following links for more detailed descriptions of `torch.nn` APIs.
* https://pytorch.org/docs/stable/nn.html
* https://pytorch.org/docs/stable/nn.functional.html

In [4]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
  
    def __init__(self):
        super(Net, self).__init__()
        
        self.conv1 = nn.Conv2d(1, 6, 5, 1)
        self.pool1 = nn.MaxPool2d(2)

        self.conv2 = nn.Conv2d(6, 16, 5, 1)
        self.pool2 = nn.MaxPool2d(2)
        
        self.fc1 = nn.Linear(256, 64)
        self.fc2 = nn.Linear(64, 10)
    
    def forward(self, x):
    
        # First convolution layer
        x = self.conv1(x)
        x = F.relu(x)
        x = self.pool1(x)
        
        
        # Second convolution layer
        x = self.conv2(x)
        x = F.relu(x)
        x = self.pool2(x)
        
        # Flatten the (N, 16, 4, 4) feature maps to (N, 256)
        x = x.view(-1, 256)
        
        # First fully-connected layer
        x = F.relu(self.fc1(x))
        
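        # Second fully-connected layer
        x = self.fc2(x)
    
        # Note: nn.CrossEntropyLoss (used for training later) applies log-softmax
        # itself and is normally given raw scores, so the softmax here ends up
        # being applied on top of the loss's own log-softmax.
        return F.softmax(x, dim=1)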
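
The input size of `fc1` (256) follows from how each layer shrinks the image. The sketch below reproduces that arithmetic in plain Python, assuming 28x28 MNIST inputs, no padding, and a pooling stride equal to the pooling kernel size (the `MaxPool2d` default). The helper `conv_out` is a hypothetical name, not a PyTorch function.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width/height of a convolution or pooling window (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                   # MNIST images are 28x28
size = conv_out(size, kernel=5)             # conv1: 28 -> 24
size = conv_out(size, kernel=2, stride=2)   # pool1: 24 -> 12
size = conv_out(size, kernel=5)             # conv2: 12 -> 8
size = conv_out(size, kernel=2, stride=2)   # pool2: 8 -> 4

flat_features = 16 * size * size            # 16 output channels from conv2
print(size, flat_features)
```

The final feature maps are 4x4 with 16 channels, which gives the 256 values per sample that `x.view(-1, 256)` and `nn.Linear(256, 64)` expect.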




Instantiate the custom neural network model.

In [5]:
net = Net()
print(net)

Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=256, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=10, bias=True)
)
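
The printed layer shapes are enough to work out how many trainable parameters the model has. The following plain-Python sketch counts weights plus biases per layer; the helper names are hypothetical. On the actual model, `sum(p.numel() for p in net.parameters())` should report the same total.

```python
def conv2d_params(in_channels: int, out_channels: int, kernel: int) -> int:
    """Weights (in * out * k * k) plus one bias per output channel."""
    return in_channels * out_channels * kernel * kernel + out_channels


def linear_params(in_features: int, out_features: int) -> int:
    """Weights (in * out) plus one bias per output feature."""
    return in_features * out_features + out_features


layer_params = {
    "conv1": conv2d_params(1, 6, 5),
    "conv2": conv2d_params(6, 16, 5),
    "fc1": linear_params(256, 64),
    "fc2": linear_params(64, 10),
}
total_params = sum(layer_params.values())

print(layer_params)
print(total_params)
```

The pooling layers contribute nothing, since max pooling has no trainable parameters; most of the parameters sit in `fc1`.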


Define the loss function (`criterion`) and the optimization method (`optimizer`). In this example, cross-entropy loss is used as the criterion and SGD as the optimizer. Passing `net.parameters()` to the optimizer applies SGD to all the trainable parameters that make up our custom neural network model `net`.

In [6]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)
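
For intuition about what the optimizer does on each step, here is a minimal sketch of the plain SGD update rule (no momentum or weight decay, matching the arguments above) on a toy one-parameter problem. The function `sgd_step` and the quadratic objective are illustrative assumptions, not PyTorch code.

```python
def sgd_step(weights: list, grads: list, lr: float = 0.01) -> list:
    """Plain SGD: move each weight against its gradient, scaled by the learning rate."""
    return [w - lr * g for w, g in zip(weights, grads)]


# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
weights = [0.0]
for _ in range(2000):
    grads = [2.0 * (weights[0] - 3.0)]
    weights = sgd_step(weights, grads, lr=0.01)

print(weights[0])
```

The weight converges toward the minimizer at 3. In the real training loop, `loss.backward()` fills in the gradients and `optimizer.step()` applies this update to every parameter in `net.parameters()`.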

Finally, putting it all together, we can train the model.

In [7]:
num_epochs = 10 # iterate over the dataset 10 times

for epoch in range(num_epochs):
    train_loss = 0.0
    # Iterate over the training dataset
    for x, label in train_loader:
        # 1. Initialize gradient values
        optimizer.zero_grad() # reset the gradients to zero
        # 2. Forward propagation
        model_output = net(x)
        # 3. Calculate loss using the criterion
        loss = criterion(model_output, label)
        # 4. Back propagation
        loss.backward() # compute the gradients
        # 5. Weight update
        optimizer.step()
        
        train_loss += loss.item()
        
    # Print train loss and test accuracy at the end of every epoch
    with torch.no_grad(): # gradients are not needed for evaluation
        corr_num = 0
        total_num = 0
        # Iterate over the test dataset to evaluate the test accuracy
        for test_x, test_label in test_loader:
            test_output = net(test_x)
            pred_label = test_output.argmax(dim=1)
            # Count the predictions that match the labels
            corr_num += (pred_label == test_label).sum().item()
            total_num += test_label.size(0)
    print("[Epoch: %d] train loss: %.4f, test acc: %.2f" \
        % (epoch + 1, train_loss / len(train_loader), corr_num / total_num * 100))
            

[Epoch: 1] train loss: 2.3024, test acc: 9.82
[Epoch: 2] train loss: 2.3018, test acc: 9.82
[Epoch: 3] train loss: 2.3011, test acc: 9.82
[Epoch: 4] train loss: 2.3000, test acc: 15.27
[Epoch: 5] train loss: 2.2980, test acc: 22.84
[Epoch: 6] train loss: 2.2926, test acc: 32.40
[Epoch: 7] train loss: 2.2348, test acc: 31.03
[Epoch: 8] train loss: 1.9718, test acc: 73.39
[Epoch: 9] train loss: 1.7314, test acc: 80.50
[Epoch: 10] train loss: 1.6687, test acc: 82.78
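
The loss in this log starts near 2.30 and falls slowly. One likely contributor is that `Net.forward` returns softmax probabilities while `nn.CrossEntropyLoss` applies log-softmax to its input again. The plain-Python sketch below works through that arithmetic for a single 10-class sample; `softmax` and `cross_entropy` are hypothetical helpers written for illustration, not the PyTorch implementations.

```python
import math


def softmax(scores: list) -> list:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores: list, target: int) -> float:
    """Per-sample cross entropy on raw scores: -log(softmax(scores)[target])."""
    return -math.log(softmax(scores)[target])


num_classes = 10
uniform = [1.0 / num_classes] * num_classes    # softmax output of an undecided model
one_hot = [1.0] + [0.0] * (num_classes - 1)    # softmax output that is fully confident and correct

# Feed the softmax outputs into the loss, as the notebook's model does
loss_at_start = cross_entropy(uniform, target=0)
loss_floor = cross_entropy(one_hot, target=0)

print(loss_at_start, loss_floor)
```

Under this reading, the initial loss equals ln(10), about 2.30, and even a perfect prediction cannot push the loss below roughly 1.46, because the values passed to the loss are confined to [0, 1]. Returning `self.fc2(x)` directly from `forward` is the usual pairing with `nn.CrossEntropyLoss`; that variant was not run here, so no results are claimed for it.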
