## Setup Codes

### Google Colab Setup

We need to run a few commands to set up our environment on Google Colab. If you are running this notebook on a local machine, you can skip this section. Run the following cell to mount your Google Drive.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Now recall the path in your Google Drive where you uploaded this notebook and fill it in below.

In [2]:
import os
import sys

# TODO: Fill in the Google Drive path where you uploaded the assignment
# Example: if you created a 'Test' folder and put all the files in an 'example' subfolder, use 'Test/example'
# GOOGLE_DRIVE_PATH_AFTER_MYDRIVE = 'Test/example'
GOOGLE_DRIVE_PATH_AFTER_MYDRIVE = 'GIT/tutorials/utils/custom_model_utils'
GOOGLE_DRIVE_PATH = os.path.join('drive', 'My Drive', GOOGLE_DRIVE_PATH_AFTER_MYDRIVE)
sys.path.append(GOOGLE_DRIVE_PATH)

print(os.listdir(GOOGLE_DRIVE_PATH))

['__pycache__', 'optimizers.py', 'losses.py', 'modules.py', 'activations.py', 'models.py', 'data.py', 'utils.py']


### Setup Code

Run some setup code for this notebook: import the packages we need, fix the random seeds, and set the input dimension and the number of classes.

In [3]:
import torch
import torch.nn as nn
import torch.optim as optim

# custom utils
import utils
import data
import modules

torch.manual_seed(42)
torch.cuda.manual_seed_all(42)

input_dim = 3 * 32 * 32
num_classes = 10

### Load the CIFAR-10 dataset
The utility function `data.load_cifar10()` loads and preprocesses the CIFAR-10 RGB images and returns three datasets (training, validation, and test). The `val_ratio=0.2` argument sets the validation split ratio. Each dataset is then wrapped in a `DataLoader`:

In [4]:
dset_train, dset_val, dset_test = data.load_cifar10(val_ratio=0.2)

train_loader = torch.utils.data.DataLoader(dset_train, batch_size=64, shuffle=True)
val_loader = torch.utils.data.DataLoader(dset_val, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dset_test, batch_size=64, shuffle=False)


data_loaders = {}
data_loaders['train'] = train_loader
data_loaders['val'] = val_loader
data_loaders['test'] = test_loader

## Fully-connected neural networks

We want to build networks using a modular design so that we can implement different layer types in isolation and then snap them together into models with different architectures.

In this exercise we will implement fully-connected networks using a more modular approach. For each layer we will implement a `forward` and a `backward` function. The `forward` function will receive inputs, weights, and other parameters and will return an output. The `backward` function will receive upstream derivatives, and will return gradients with respect to the inputs and weights.

After implementing a bunch of layers this way, we will be able to easily combine them to build classifiers with different architectures.
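The forward/backward contract described above can be sketched without any framework. The following is a minimal illustration on scalars, not the notebook's implementation: the `Scale` and `ReLU` classes and the `forward_all` helper are hypothetical names introduced only for this example. Each layer caches what it needs in `forward`, and `backward` turns an upstream derivative into gradients for its input and weight.

```python
class Scale:
    """Toy layer y = w * x on scalars, with a hand-written backward pass."""

    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x  # cache the input; backward needs it
        return self.w * x

    def backward(self, dout):
        dx = dout * self.w  # gradient with respect to the input
        dw = dout * self.x  # gradient with respect to the weight
        return dx, dw


class ReLU:
    """Toy layer y = max(0, x) on scalars; it has no weights."""

    def forward(self, x):
        self.x = x
        return max(0.0, x)

    def backward(self, dout):
        return dout if self.x > 0 else 0.0


def forward_all(layers, x):
    """Run the layers in order, feeding each output into the next layer."""
    for layer in layers:
        x = layer.forward(x)
    return x


scale1, relu, scale2 = Scale(2.0), ReLU(), Scale(-3.0)
layers = [scale1, relu, scale2]
x0 = 1.5
out = forward_all(layers, x0)

# Backward pass: visit the layers in reverse order, passing gradients upstream.
dout, dw2 = scale2.backward(1.0)
dout = relu.backward(dout)
dx, dw1 = scale1.backward(dout)

# Central finite difference as an independent check of dx.
h = 1e-6
numeric_dx = (forward_all(layers, x0 + h) - forward_all(layers, x0 - h)) / (2 * h)
print(out, dx, dw1, dw2, numeric_dx)
```

Because every layer exposes the same two methods, changing the architecture only means changing the list of layers; the backward pass is always the same reverse traversal.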

## torch.autograd.Function

If you have studied machine learning, you have probably heard of forward() and backward() at least once. When designing models in PyTorch, however, you have most likely never implemented backward() yourself, because PyTorch provides `autograd` so that users do not have to compute gradients by hand.

In this tutorial, to take a closer look at how a model works, we implement `backward` ourselves using `torch.autograd.Function`. The model is designed this way to give at least some insight into how PyTorch works internally.

>  For more details on `torch.autograd.Function`, see [here](https://pytorch.org/docs/stable/notes/extending.html#extending-autograd).
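
As a warm-up before the linear layer, here is a minimal sketch of a custom `torch.autograd.Function`. The `MyReLU` class is a hypothetical example that is not part of this notebook's utilities. It implements ReLU with a hand-written `backward` and compares the resulting gradient with the one autograd computes for the built-in `torch.relu`.

```python
import torch


class MyReLU(torch.autograd.Function):
    """ReLU with a hand-written backward pass (illustrative only)."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # keep the input for the backward pass
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # The gradient flows through only where the input was positive.
        return grad_output * (x > 0).to(grad_output.dtype)


x = torch.tensor([-1.0, 2.0, 3.0], requires_grad=True)
MyReLU.apply(x).sum().backward()
custom_grad = x.grad.clone()

# Reference: the same computation with the built-in op and plain autograd.
x_ref = x.detach().clone().requires_grad_(True)
torch.relu(x_ref).sum().backward()

print(custom_grad, x_ref.grad)
```

Note that a custom function is invoked through `MyReLU.apply(...)`, not by calling `forward` directly, and that `backward` returns one gradient per argument of `forward`.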

### Linear layer

In [5]:
class Linear(nn.Module):
  def __init__(self, input_dim, output_dim):
    super().__init__()

    self.W = nn.Parameter(modules.xavier_init(input_dim, output_dim))
    self.b = nn.Parameter(torch.zeros(output_dim))

  def forward(self, x):
    return LinearFunction.apply(x, self.W.to(x.device), self.b.to(x.device))

In [6]:
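class LinearFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input, weight, bias):
        # input: (N, in_dim), weight: (in_dim, out_dim), bias: (out_dim,)
        ctx.save_for_backward(input, weight, bias)
        return torch.matmul(input, weight) + bias

    @staticmethod
    def backward(ctx, grad_output):
        # grad_output: upstream gradient with shape (N, out_dim); assumes a 2-D input
        input, weight, bias = ctx.saved_tensors

        dx = torch.matmul(grad_output, weight.T)  # (N, in_dim)
        dw = torch.matmul(input.T, grad_output)   # (in_dim, out_dim)
        db = grad_output.sum(0)                   # (out_dim,)

        # One gradient per argument of forward, in the same order
        return dx, dw, db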
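
The three gradients returned by `backward` follow from the chain rule. The derivation below is a sketch under the assumption of a 2-D input. The symbols are introduced here for illustration: $X \in \mathbb{R}^{N \times D}$ is `input`, $W \in \mathbb{R}^{D \times M}$ is `weight`, $b \in \mathbb{R}^{M}$ is `bias`, $Y = XW + b$ is the output, and $G = \partial L / \partial Y$ is `grad_output`.

$$
\begin{aligned}
Y_{ij} &= \sum_{k} X_{ik} W_{kj} + b_j \\
\frac{\partial L}{\partial X_{ik}} &= \sum_{j} G_{ij} W_{kj}
  \quad\Longrightarrow\quad \frac{\partial L}{\partial X} = G\,W^{\top} \\
\frac{\partial L}{\partial W_{kj}} &= \sum_{i} X_{ik} G_{ij}
  \quad\Longrightarrow\quad \frac{\partial L}{\partial W} = X^{\top} G \\
\frac{\partial L}{\partial b_{j}} &= \sum_{i} G_{ij}
\end{aligned}
$$

These correspond term by term to `dx`, `dw`, and `db` in the code: the bias is broadcast over the batch in the forward pass, so its gradient sums over the batch dimension.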

### Multilayer network
Next, we implement a fully-connected network with an arbitrary number of hidden layers.

In [7]:
class FullyConnectedNet(nn.Module):
  def __init__(self, input_dim, hidden_dims, num_classes):
    super().__init__()

    self.layers = nn.ModuleList()

    if isinstance(hidden_dims, int):
      hidden_dims = [hidden_dims]

    in_dim = input_dim
    for out_dim in hidden_dims:
      self.layers.append(Linear(in_dim, out_dim))
      self.layers.append(nn.ReLU())
      in_dim = out_dim

    self.output_layer = Linear(in_dim, num_classes)

  def forward(self, x):

    for layer in self.layers:
      x = layer(x)

    output = self.output_layer(x)

    return output

## Train

In [8]:
model = FullyConnectedNet(input_dim, [64, 128], num_classes)
utils.summary(model, (1,input_dim))

Layer (type)                Output Shape         Param #
Linear()                    (1, 64)                  196,672
ReLU()                      (1, 64)                        0
Linear()                    (1, 128)                   8,320
ReLU()                      (1, 128)                       0
output_layer                (1, 10)                    1,290
Total params: 206,282
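
The parameter counts in the summary can be reproduced by hand: a linear layer with a weight matrix of shape `(in_dim, out_dim)` and a bias of length `out_dim` has `in_dim * out_dim + out_dim` parameters. The short sketch below redoes that arithmetic for the layer sizes used above; `linear_params` is a hypothetical helper written only for this check.

```python
def linear_params(in_dim, out_dim):
    """Number of parameters in a linear layer: weight matrix plus bias vector."""
    return in_dim * out_dim + out_dim


# Input dimension, the two hidden sizes, and the number of classes.
dims = [3 * 32 * 32, 64, 128, 10]
counts = [linear_params(i, o) for i, o in zip(dims[:-1], dims[1:])]
total = sum(counts)

print(counts, total)  # [196672, 8320, 1290] 206282
```

Almost all of the parameters sit in the first layer, because the flattened image has 3,072 input features.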


In [9]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)

history = utils.runner(model, criterion, optimizer, data_loaders, num_epochs=10, msg=True)

Train using cpu
Epoch [1/10]          train Loss: 2.2202          train ACC: 18.67%          val Loss: 2.1218          val ACC: 25.65%
Epoch [2/10]          train Loss: 2.0633          train ACC: 27.99%          val Loss: 2.0142          val ACC: 29.73%
Epoch [3/10]          train Loss: 1.9749          train ACC: 31.06%          val Loss: 1.9417          val ACC: 32.22%
Epoch [4/10]          train Loss: 1.9128          train ACC: 33.04%          val Loss: 1.8891          val ACC: 33.87%
Epoch [5/10]          train Loss: 1.8668          train ACC: 34.66%          val Loss: 1.8494          val ACC: 34.99%
Epoch [6/10]          train Loss: 1.8310          train ACC: 35.93%          val Loss: 1.8187          val ACC: 36.17%
Epoch [7/10]          train Loss: 1.8016          train ACC: 37.13%          val Loss: 1.7940          val ACC: 37.09%
Epoch [8/10]          train Loss: 1.7766          train ACC: 37.88%          val Loss: 1.7711          val ACC: 37.95%
Epoch [9/10]          train Loss