In [1]:
import numpy as np
import torch
from torch import nn
from torchvision import transforms, datasets

### What is a Tensor?
A tensor is much like a NumPy array (including operations such as broadcasting, indexing and slicing), except that it can also run on faster hardware such as a GPU. Let's look at some tensor definitions.

In [2]:
arr = torch.zeros((256, 256), dtype=torch.int32)

# tensors are allocated on the CPU by default
print(arr.device)

# keep 'size', 'dtype' and 'device' same as arr, but fill with 1
arr2 = torch.ones_like(arr)

# keep 'dtype' and 'device' same as arr, but fill data arbitrarily
arr3 = arr.new_tensor([[1, 2], [3, 4]])

cpu
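
To make the NumPy comparison concrete, here is a minimal sketch of broadcasting, slicing and NumPy conversion. The variable names (`a`, `row`, `b` and so on) are made up for illustration and are not used elsewhere in this notebook.

```python
import numpy as np
import torch

a = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
row = torch.tensor([10.0, 20.0, 30.0])

# broadcasting: the (3,) row is added to every row of the (2, 3) tensor
b = a + row

# slicing and indexing follow the NumPy rules
first_col = a[:, 0]   # first column
last_row = a[-1]      # last row

# conversion to and from NumPy (CPU tensors only)
n = a.numpy()
t = torch.from_numpy(np.ones((2, 3), dtype=np.float32))

print(b)
print(first_col, last_row)
```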


To feed tensors to deep-learning models, they should follow a conventional shape: `B C H W` for 4D tensors, where `B` is the batch size, `C` is the channel dimension and `H W` are the spatial dimensions.
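
As a small sketch of this layout, a single grayscale image can be brought to `B C H W` by adding the missing dimensions. The 28x28 size and the random pixel values are assumptions chosen only for illustration.

```python
import torch

img = torch.rand(28, 28)       # one grayscale image, shape H W
img = img.unsqueeze(0)         # add the channel dimension: C H W = (1, 28, 28)
batch = img.unsqueeze(0)       # add the batch dimension: B C H W = (1, 1, 28, 28)

# stacking several C H W images also yields a B C H W batch
images = [torch.rand(1, 28, 28) for _ in range(4)]
stacked = torch.stack(images)  # (4, 1, 28, 28)

print(batch.shape, stacked.shape)
```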

#### Device determination
First we need to decide on which device all torch tensors (including the inputs, the learnable weights, etc.) will be allocated. Basically, the GPU takes priority whenever one is available.

In [3]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
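
Once a device is chosen, tensors can be created on it directly through the `device` argument. The sketch below assumes nothing beyond the `device` selection shown here; `x`, `y` and `z` are illustrative names.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# allocate directly on the chosen device instead of creating on CPU and copying
x = torch.zeros((2, 3), device=device)
y = torch.ones((2, 3), device=device)

# operands of an operation must live on the same device
z = x + y
print(z.device)
```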

#### Pseudo-random generation
It is often recommended to fix the seed of the **pseudo**-random number generators, so that runs are repeatable and different configurations of a deep-learning model can be compared fairly. torch provides this through `torch.manual_seed`.

In [4]:
np.random.seed(12345)

# seeds the generators on all devices, both CPU and CUDA
torch.manual_seed(12345)

<torch._C.Generator at 0x7f4e95f0c4d0>
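
The effect of seeding can be checked with a short sketch: reseeding with the same value replays the same sequence of random numbers. The names `a`, `b` and `c` are illustrative only.

```python
import torch

torch.manual_seed(12345)
a = torch.rand(3)

# reseeding with the same value restarts the sequence
torch.manual_seed(12345)
b = torch.rand(3)

# without reseeding, the generator simply continues the sequence
c = torch.rand(3)

same = torch.equal(a, b)
print(same)
```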

## Build model
PyTorch models are defined as Python classes that inherit from `torch.nn.Module`. Two methods are essential for model creation:
1. The learnable weights and network layers used by the model are defined in `__init__()`.
2. The forward pass of the model is implemented in `forward()`.

Let's create a multi-class classification model (five labels) with this schema: `Conv` -> `ReLU` -> `Batchnorm` -> `Conv` -> `ReLU` -> `Batchnorm` -> `Adaptive average pooling` -> `dropout` -> `fully connected`. Suppose the input has one channel and `forward()` returns only the output of the model.

In [5]:
class Model(nn.Module):
    
    def __init__(self):
        super().__init__()
        # your code here
        
        self.conv = nn.Sequential(
                        nn.Conv2d(1, 32, 3, 1),
                        nn.ReLU(),
                        nn.BatchNorm2d(32),
                        nn.Conv2d(32, 64, 3, 1),
                        nn.ReLU(),
                        nn.BatchNorm2d(64),
                    )
        self.glob_avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Sequential(
                    nn.Dropout(0.5),
                    nn.Linear(64, 5),
                )
        
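    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # your code here    
        
        x = self.conv(x)           # (B, 64, H', W')
        x = self.glob_avg_pool(x)  # (B, 64, 1, 1)
        x = torch.flatten(x, 1)    # (B, 64), the shape nn.Linear(64, 5) expects
        x = self.fc(x)             # (B, 5)
        
        return x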
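
A quick way to sanity-check such an architecture is to trace tensor shapes on a dummy batch. The sketch below rebuilds the same layers as standalone modules; the batch size of 8 and the 28x28 input are assumptions for illustration. Note that the pooled output has shape `(B, 64, 1, 1)`, so it is flattened to `(B, 64)` before it reaches the linear layer.

```python
import torch
from torch import nn

conv = nn.Sequential(
    nn.Conv2d(1, 32, 3, 1), nn.ReLU(), nn.BatchNorm2d(32),
    nn.Conv2d(32, 64, 3, 1), nn.ReLU(), nn.BatchNorm2d(64),
)
pool = nn.AdaptiveAvgPool2d((1, 1))
fc = nn.Sequential(nn.Dropout(0.5), nn.Linear(64, 5))

x = torch.rand(8, 1, 28, 28)      # dummy batch, B C H W
feat = conv(x)                    # each 3x3 conv without padding shrinks H and W by 2
pooled = pool(feat)               # spatial dimensions averaged down to 1x1
flat = torch.flatten(pooled, 1)   # keep the batch dimension, merge the rest
logits = fc(flat)                 # one score per label

for name, t in [('feat', feat), ('pooled', pooled), ('flat', flat), ('logits', logits)]:
    print(name, tuple(t.shape))
```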

#### Set model device
To explicitly transfer a model (or tensors) to another device, PyTorch provides `.to(device)`.

In [6]:
model = Model()
model.to(device)

Model(
  (conv): Sequential(
    (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1))
    (1): ReLU()
    (2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
    (4): ReLU()
    (5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (glob_avg_pool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Sequential(
    (0): Dropout(p=0.5, inplace=False)
    (1): Linear(in_features=64, out_features=5, bias=True)
  )
)
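
One detail worth keeping in mind: `.to(device)` on a module moves its parameters in place and returns the module itself, whereas `.to(device)` on a tensor returns a tensor on the target device and leaves the original variable untouched, so the result has to be assigned. A minimal sketch, using a hypothetical `nn.Linear(4, 2)` layer instead of the full model:

```python
import torch
from torch import nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

layer = nn.Linear(4, 2)
returned = layer.to(device)   # parameters are moved in place
print(returned is layer)      # the same module object is returned

x = torch.rand(3, 4)          # created on the CPU
x_dev = x.to(device)          # keep the returned tensor; x itself is unchanged

out = layer(x_dev)            # input and weights are now on the same device
print(out.shape)
```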

## Data operation

#### Data transformation
PIL images must first be converted to torch tensors. `torchvision.transforms.Compose` chains several transforms into one pipeline. Here, only the conversion to tensors is applied.

In [7]:
transform = transforms.Compose([
    transforms.ToTensor()
])

#### Download data
Evaluation is not the purpose of this notebook, so you only need to load the **train** split of the MNIST dataset using `torchvision.datasets.MNIST`.

In [8]:
# your code here
train = None

train = datasets.MNIST(
               root='data',
               train=True, 
               transform=transform,
               download=True)

#### Data loader
Define the train loader using `torch.utils.data.DataLoader`.

In [9]:
batch_size = 32

# your code here
train_loader = None


train_loader = torch.utils.data.DataLoader(
                    dataset=train,
                    batch_size=batch_size,
                    shuffle=True)
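
To illustrate what a loader yields without downloading anything, the sketch below wraps random tensors in a `TensorDataset`. The 100 fake samples and their labels are assumptions for illustration and do not come from MNIST.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 100 fake grayscale images with integer labels
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=32, shuffle=True)

# each iteration yields one batch of inputs and the matching labels
x, y = next(iter(loader))
num_batches = len(loader)   # the last batch is smaller: 100 = 3 * 32 + 4

print(x.shape, y.shape, num_batches)
```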

## Training