# Problem Description
Tiny ImageNet is a subset of ImageNet with 200 classes, downsized to 64×64 color images. Each class has 500 training images, 50 validation images and 50 test images, which gives 100,000 training images in total.

# Metrics
For the evaluation of the model, we will use accuracy as our metric. It is straightforward and defined as follows:
$$ \text{Accuracy} = \frac{\text{correct classifications}}{\text{all classifications}} $$

However, accuracy has a drawback: it does not account for class imbalance. If the model is biased towards one class and that class is also the most frequent one, accuracy can stay high and fail to reflect the bias. In our case the dataset is balanced (every class has the same number of images), so accuracy should be sufficient for our evaluation.
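
The point about class imbalance can be illustrated with a small sketch. The `accuracy` helper and the toy label lists below are assumptions made for illustration only; they are not part of the notebook's `utils` module.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy of an empty sequence")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical model that always predicts class 0
always_zero = [0] * 100

# Imbalanced toy data: 90 samples of class 0, 10 samples of class 1
imbalanced_true = [0] * 90 + [1] * 10
print(accuracy(imbalanced_true, always_zero))  # 0.9

# Balanced toy data: the same biased model now scores only 0.5
balanced_true = [0] * 50 + [1] * 50
print(accuracy(balanced_true, always_zero))  # 0.5
```

On the imbalanced data the biased model looks good, while on the balanced data the same bias is clearly visible in the score.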

As a cross-check on the chosen metric, we could also report an alternative metric such as the F1 score, which combines precision and recall and therefore penalizes false predictions rather than only counting the correct ones.
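
As a rough sketch of such an alternative metric, the macro-averaged F1 score can be computed per class and then averaged. The function name `macro_f1`, the choice of macro averaging, and the convention of scoring an empty class as 0 are assumptions for this example; the notebook does not specify which F1 variant would be used.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # Assumed convention: a class with no true and no predicted samples scores 0
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy data: 90 samples of class 0, 10 of class 1, model always predicts class 0
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
print(round(macro_f1(y_true, y_pred, num_classes=2), 3))  # 0.474
```

The accuracy on this toy data would be 0.9, whereas the macro F1 score is much lower because class 1 is never predicted correctly.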


# Base Architecture
- The base model consists of two convolutional layers for feature extraction and two pooling layers to reduce the spatial dimensions of the image. Two fully connected layers ensure enough parameters. The goal is first to train on a single sample or a single batch to show that the model can learn at all, and then, in the next step, to find a suitable learning rate and batch size.

In [1]:
import torch
import torch.nn as nn
from torchsummary import summary
import utils
from typing import List, Tuple, Dict


class CNN(nn.Module):
    def __init__(
        self,
        dim: int,
        num_classes: int,
        confs: List[Tuple[str, Dict]],
        in_channels: int,
        weight_init=None,
    ):
        super(CNN, self).__init__()

        self.net = nn.ModuleList()

        linear_idxs = [idx for idx, (layer, _) in enumerate(confs) if layer == "L"]
        linear_start = linear_idxs[0]
        convolution_conf = confs[:linear_start]
        linear_conf = confs[linear_start:]
        for layer, conf in convolution_conf:
            if layer == "C":
                self.net.append(
                    nn.Conv2d(
                        in_channels,
                        out_channels=conf["channels"],
                        kernel_size=conf["kernel"],
                        stride=conf.get("stride", 1),
                        padding=conf.get("padding", 0),
                    )
                )
                self.net.append(nn.ReLU())
                if conf.get("batch_norm", False):
                    self.net.append(nn.BatchNorm2d(conf["channels"]))
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
                in_channels = conf["channels"]
            elif layer == "P":
                self.net.append(nn.MaxPool2d(kernel_size=conf["kernel"]))
            else:
                raise NotImplementedError(f"Layer {layer} not implemented")

        self.dim = utils.get_dim_after_conv_and_pool(dim_init=dim, confs=confs)
        for idx, (layer, conf) in enumerate(linear_conf):
            if idx == 0:
                self.net.append(nn.Flatten())
                self.net.append(
                    nn.Linear(self.dim * self.dim * in_channels, conf["units"])
                )
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
            elif idx == len(linear_conf) - 1:
                self.net.append(nn.Linear(conf["units"], num_classes))
            else:
                self.net.append(nn.Linear(conf["units"], conf["units"]))
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))

    def forward(self, x):
        N, H, W, C = x.shape
        x = x.permute(
            0, 3, 1, 2
        )  # Adjust (batch_size, H, W, C) to (batch_size, C, H, W)
        assert x.shape == (N, C, H, W)

        for layer in self.net:
            x = layer(x)

        return x



In [2]:
confs = [
    ("C", {"kernel": 3, "channels": 16}),
    ("P", {"kernel": 2}),
    ("C", {"kernel": 3, "channels": 32}),
    ("P", {"kernel": 2}),
    ("L", {"units": 500}),
    ("L", {"units": 500}),
]

In [3]:
x = torch.rand(10, 64, 64, 3)
model = CNN(dim=64, num_classes=200, confs=confs, in_channels=3)
model(x)
summary(model, (64, 64, 3))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 16, 62, 62]             448
              ReLU-2           [-1, 16, 62, 62]               0
         MaxPool2d-3           [-1, 16, 31, 31]               0
            Conv2d-4           [-1, 32, 29, 29]           4,640
              ReLU-5           [-1, 32, 29, 29]               0
         MaxPool2d-6           [-1, 32, 14, 14]               0
           Flatten-7                 [-1, 6272]               0
            Linear-8                  [-1, 500]       3,136,500
              ReLU-9                  [-1, 500]               0
           Linear-10                  [-1, 200]         100,200
Total params: 3,241,788
Trainable params: 3,241,788
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.05
Forward/backward pass size (MB): 1.57
Params size (MB): 12.37
Estima
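
The output shapes in the summary above (62, 31, 29, 14) follow from the standard size formula for convolution and pooling layers. The helper below is a hypothetical stand-in for `utils.get_dim_after_conv_and_pool`, whose implementation is not shown in this notebook; it assumes square inputs and that pooling uses a stride equal to its kernel size, which is the default of `nn.MaxPool2d`.

```python
from typing import Dict, List, Tuple


def output_dim(dim: int, confs: List[Tuple[str, Dict]]) -> int:
    """Spatial size after the conv ("C") and pool ("P") entries of confs."""
    for layer, conf in confs:
        if layer == "C":
            kernel = conf["kernel"]
            stride = conf.get("stride", 1)
            padding = conf.get("padding", 0)
            dim = (dim + 2 * padding - kernel) // stride + 1
        elif layer == "P":
            # Pooling with stride == kernel size, result rounded down
            dim = (dim - conf["kernel"]) // conf["kernel"] + 1
        else:
            break  # stop at the first fully connected ("L") entry
    return dim


confs = [
    ("C", {"kernel": 3, "channels": 16}),
    ("P", {"kernel": 2}),
    ("C", {"kernel": 3, "channels": 32}),
    ("P", {"kernel": 2}),
    ("L", {"units": 500}),
    ("L", {"units": 500}),
]

dim = output_dim(64, confs)
print(dim)             # 14
print(dim * dim * 32)  # 6272, the input size of the first linear layer
```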

In [4]:
train_loader, valid_loader = utils.get_data(batch_size=None)
print(train_loader, valid_loader)

<torch.utils.data.dataloader.DataLoader object at 0x10605e470> <torch.utils.data.dataloader.DataLoader object at 0x10605c760>
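
A minimal sketch of the single-batch sanity check described for the base architecture might look as follows. Everything here is an assumption for illustration: the batch is random noise rather than Tiny ImageNet data, the small `nn.Sequential` network is a stand-in for the `CNN` class, and the learning rate and step count are arbitrary. The only thing it checks is that the loss on one fixed batch goes down; on random noise it may plateau well above zero.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Random stand-in for one batch in the notebook's (N, H, W, C) layout
x = torch.rand(8, 64, 64, 3)
y = torch.randint(0, 200, (8,))

# Small stand-in network (not the notebook's CNN class)
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(4),
    nn.Flatten(),
    nn.Linear(8 * 4 * 4, 200),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

x_nchw = x.permute(0, 3, 1, 2)  # (N, H, W, C) -> (N, C, H, W)
losses = []
for step in range(200):
    optimizer.zero_grad()
    loss = criterion(model(x_nchw), y)  # same batch in every step
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

If the loss did not decrease even on a single fixed batch, that would point to a problem in the model or the training loop rather than in the hyperparameters.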


## Discussion

# SGD, Tuning of Learning Rate and Batch Size
- Explain SGD
- Explain Learning Rate
- Explain Batch Size

In [5]:
def my_code():
    pass

## Discussion

# SGD, Weight Initialization, Model Complexity, Convolution Settings
- Explain what you will do here

## Weight Initialization
- Explain the different weight initialization methods

In [6]:
def my_code():
    pass

### Discussion

## Model Complexity
- Explain what you will do

### Model Variant 1
- Explain

In [7]:
import torch
import torch.nn as nn
import utils
from typing import List, Tuple, Dict


class CNN(nn.Module):
    def __init__(
        self,
        dim: int,
        num_classes: int,
        confs: List[Tuple[str, Dict]],
        in_channels: int,
        weight_init=None,
    ):
        super(CNN, self).__init__()

        self.net = nn.ModuleList()

        linear_idxs = [idx for idx, (layer, _) in enumerate(confs) if layer == "L"]
        linear_start = linear_idxs[0]
        convolution_conf = confs[:linear_start]
        linear_conf = confs[linear_start:]
        for layer, conf in convolution_conf:
            if layer == "C":
                self.net.append(
                    nn.Conv2d(
                        in_channels,
                        out_channels=conf["channels"],
                        kernel_size=conf["kernel"],
                        stride=conf.get("stride", 1),
                        padding=conf.get("padding", 0),
                    )
                )
                self.net.append(nn.ReLU())
                if conf.get("batch_norm", False):
                    self.net.append(nn.BatchNorm2d(conf["channels"]))
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
                in_channels = conf["channels"]
            elif layer == "P":
                self.net.append(nn.MaxPool2d(kernel_size=conf["kernel"]))
            else:
                raise NotImplementedError(f"Layer {layer} not implemented")

        self.dim = utils.get_dim_after_conv_and_pool(dim_init=dim, confs=confs)
        print(f"self.dim: {self.dim},\nin_channels: {in_channels}")
        for idx, (layer, conf) in enumerate(linear_conf):
            if idx == 0:
                self.net.append(nn.Flatten())
                self.net.append(
                    nn.Linear(self.dim * self.dim * in_channels, conf["units"])
                )
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
            elif idx == len(linear_conf) - 1:
                self.net.append(nn.Linear(conf["units"], num_classes))
            else:
                self.net.append(nn.Linear(conf["units"], conf["units"]))
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))

    def forward(self, x):
        N, H, W, C = x.shape
        x = x.permute(
            0, 3, 1, 2
        )  # Adjust (batch_size, H, W, C) to (batch_size, C, H, W)
        assert x.shape == (N, C, H, W)

        for layer in self.net:
            x = layer(x)

        return x

### Model Variant 2
- Explain

In [8]:
import torch
import torch.nn as nn
import utils
from typing import List, Tuple, Dict


class CNN(nn.Module):
    def __init__(
        self,
        dim: int,
        num_classes: int,
        confs: List[Tuple[str, Dict]],
        in_channels: int,
        weight_init=None,
    ):
        super(CNN, self).__init__()

        self.net = nn.ModuleList()

        linear_idxs = [idx for idx, (layer, _) in enumerate(confs) if layer == "L"]
        linear_start = linear_idxs[0]
        convolution_conf = confs[:linear_start]
        linear_conf = confs[linear_start:]
        for layer, conf in convolution_conf:
            if layer == "C":
                self.net.append(
                    nn.Conv2d(
                        in_channels,
                        out_channels=conf["channels"],
                        kernel_size=conf["kernel"],
                        stride=conf.get("stride", 1),
                        padding=conf.get("padding", 0),
                    )
                )
                self.net.append(nn.ReLU())
                if conf.get("batch_norm", False):
                    self.net.append(nn.BatchNorm2d(conf["channels"]))
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
                in_channels = conf["channels"]
            elif layer == "P":
                self.net.append(nn.MaxPool2d(kernel_size=conf["kernel"]))
            else:
                raise NotImplementedError(f"Layer {layer} not implemented")

        self.dim = utils.get_dim_after_conv_and_pool(dim_init=dim, confs=confs)
        print(f"self.dim: {self.dim},\nin_channels: {in_channels}")
        for idx, (layer, conf) in enumerate(linear_conf):
            if idx == 0:
                self.net.append(nn.Flatten())
                self.net.append(
                    nn.Linear(self.dim * self.dim * in_channels, conf["units"])
                )
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
            elif idx == len(linear_conf) - 1:
                self.net.append(nn.Linear(conf["units"], num_classes))
            else:
                self.net.append(nn.Linear(conf["units"], conf["units"]))
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))

    def forward(self, x):
        N, H, W, C = x.shape
        x = x.permute(
            0, 3, 1, 2
        )  # Adjust (batch_size, H, W, C) to (batch_size, C, H, W)
        assert x.shape == (N, C, H, W)

        for layer in self.net:
            x = layer(x)

        return x

### Model Variant 3
- Explain

In [9]:
import torch
import torch.nn as nn
import utils
from typing import List, Tuple, Dict


class CNN(nn.Module):
    def __init__(
        self,
        dim: int,
        num_classes: int,
        confs: List[Tuple[str, Dict]],
        in_channels: int,
        weight_init=None,
    ):
        super(CNN, self).__init__()

        self.net = nn.ModuleList()

        linear_idxs = [idx for idx, (layer, _) in enumerate(confs) if layer == "L"]
        linear_start = linear_idxs[0]
        convolution_conf = confs[:linear_start]
        linear_conf = confs[linear_start:]
        for layer, conf in convolution_conf:
            if layer == "C":
                self.net.append(
                    nn.Conv2d(
                        in_channels,
                        out_channels=conf["channels"],
                        kernel_size=conf["kernel"],
                        stride=conf.get("stride", 1),
                        padding=conf.get("padding", 0),
                    )
                )
                self.net.append(nn.ReLU())
                if conf.get("batch_norm", False):
                    self.net.append(nn.BatchNorm2d(conf["channels"]))
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
                in_channels = conf["channels"]
            elif layer == "P":
                self.net.append(nn.MaxPool2d(kernel_size=conf["kernel"]))
            else:
                raise NotImplementedError(f"Layer {layer} not implemented")

        self.dim = utils.get_dim_after_conv_and_pool(dim_init=dim, confs=confs)
        print(f"self.dim: {self.dim},\nin_channels: {in_channels}")
        for idx, (layer, conf) in enumerate(linear_conf):
            if idx == 0:
                self.net.append(nn.Flatten())
                self.net.append(
                    nn.Linear(self.dim * self.dim * in_channels, conf["units"])
                )
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
            elif idx == len(linear_conf) - 1:
                self.net.append(nn.Linear(conf["units"], num_classes))
            else:
                self.net.append(nn.Linear(conf["units"], conf["units"]))
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))

    def forward(self, x):
        N, H, W, C = x.shape
        x = x.permute(
            0, 3, 1, 2
        )  # Adjust (batch_size, H, W, C) to (batch_size, C, H, W)
        assert x.shape == (N, C, H, W)

        for layer in self.net:
            x = layer(x)

        return x

### Model Variant 4
- Explain

In [10]:
import torch
import torch.nn as nn
import utils
from typing import List, Tuple, Dict


class CNN(nn.Module):
    def __init__(
        self,
        dim: int,
        num_classes: int,
        confs: List[Tuple[str, Dict]],
        in_channels: int,
        weight_init=None,
    ):
        super(CNN, self).__init__()

        self.net = nn.ModuleList()

        linear_idxs = [idx for idx, (layer, _) in enumerate(confs) if layer == "L"]
        linear_start = linear_idxs[0]
        convolution_conf = confs[:linear_start]
        linear_conf = confs[linear_start:]
        for layer, conf in convolution_conf:
            if layer == "C":
                self.net.append(
                    nn.Conv2d(
                        in_channels,
                        out_channels=conf["channels"],
                        kernel_size=conf["kernel"],
                        stride=conf.get("stride", 1),
                        padding=conf.get("padding", 0),
                    )
                )
                self.net.append(nn.ReLU())
                if conf.get("batch_norm", False):
                    self.net.append(nn.BatchNorm2d(conf["channels"]))
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
                in_channels = conf["channels"]
            elif layer == "P":
                self.net.append(nn.MaxPool2d(kernel_size=conf["kernel"]))
            else:
                raise NotImplementedError(f"Layer {layer} not implemented")

        self.dim = utils.get_dim_after_conv_and_pool(dim_init=dim, confs=confs)
        print(f"self.dim: {self.dim},\nin_channels: {in_channels}")
        for idx, (layer, conf) in enumerate(linear_conf):
            if idx == 0:
                self.net.append(nn.Flatten())
                self.net.append(
                    nn.Linear(self.dim * self.dim * in_channels, conf["units"])
                )
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))
            elif idx == len(linear_conf) - 1:
                self.net.append(nn.Linear(conf["units"], num_classes))
            else:
                self.net.append(nn.Linear(conf["units"], conf["units"]))
                self.net.append(nn.ReLU())
                if conf.get("dropout", 0):
                    self.net.append(nn.Dropout(conf["dropout"]))

    def forward(self, x):
        N, H, W, C = x.shape
        x = x.permute(
            0, 3, 1, 2
        )  # Adjust (batch_size, H, W, C) to (batch_size, C, H, W)
        assert x.shape == (N, C, H, W)

        for layer in self.net:
            x = layer(x)

        return x

### Discussion
- Variant 1
- Variant 2
- Variant 3
- Variant 4

# Regularization
- Briefly describe the general goal of regularization methods

## L1/L2
- Explain

In [11]:
def my_code():
    pass

## Dropout
- Explain

In [12]:
def my_code():
    pass

## Discussion
- To what extent is this goal achieved in the given case?

# Batchnorm (without REG, with SGD)
- Evaluate whether Batchnorm is useful. Describe the idea behind BN and what it is supposed to help with.

In [13]:
def my_code():
    pass

## Discussion

# Adam
- Explain

## Without BN, without REG
- Explain

In [14]:
def my_code():
    pass

## Without BN, with REG
- Explain

In [15]:
def my_code():
    pass

## Discussion

# Transfer Learning
- Explain

In [16]:
def my_code():
    pass

## Discussion

# Conclusion