## Problem 1: Image Classification by CNN

The CIFAR-10 dataset has 10 classes: ‘airplane’, ‘automobile’, ‘bird’, ‘cat’, ‘deer’, ‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’. Each image has size $3 \times 32 \times 32$, i.e. a 3-channel color image of $32 \times 32$ pixels.

CIFAR-10 is included in **torchvision**, so we don't have to upload the dataset to Colab.



In [None]:
import torch
import torch.nn as nn
from torchvision import transforms, datasets
from torch.utils.data import DataLoader

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse',
           'ship', 'truck')
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (1.0, 1.0, 1.0))])

# Randomly split the training set into 45000 training and 5000 validation
generator1 = torch.Generator().manual_seed(42)
cifar10_trainset, cifar10_valset = torch.utils.data.random_split(datasets.CIFAR10(root='./data/', train=True, download=True, transform=transform), [45000, 5000], generator1)
cifar10_testset = datasets.CIFAR10(root='./data/', train=False, download=True, transform=transform)

cifar_val_loader = DataLoader(cifar10_valset, batch_size=128, shuffle=False)
cifar_test_loader = DataLoader(cifar10_testset, batch_size=128, shuffle=False)

Visualize some samples in the CIFAR-10 dataset.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm

def cifar_imshow(img):
  img = img + 0.5     # unnormalize
  npimg = img.numpy()
  return np.transpose(npimg, (1, 2, 0))

# TODO: visualize some samples in the CIFAR-10 dataset
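
The `cifar_imshow` helper above undoes the normalization and moves the channel axis last so that `plt.imshow` can display the image. A minimal sketch of that round trip, using a random array as a stand-in for a real CIFAR-10 image (the array and the variable names are assumptions for illustration):

```python
import numpy as np

# Synthetic stand-in for one image in [0, 1), channel-first like ToTensor() output.
rng = np.random.default_rng(0)
raw = rng.random((3, 32, 32)).astype(np.float32)

# Normalize((0.5, 0.5, 0.5), (1.0, 1.0, 1.0)) computes (x - mean) / std per channel.
mean = np.array([0.5, 0.5, 0.5], dtype=np.float32).reshape(3, 1, 1)
std = np.array([1.0, 1.0, 1.0], dtype=np.float32).reshape(3, 1, 1)
normalized = (raw - mean) / std

# Undo the normalization, then reorder (C, H, W) -> (H, W, C) for plt.imshow.
restored = normalized * std + mean
hwc = np.transpose(restored, (1, 2, 0))
print(hwc.shape)
```

Because the standard deviation is 1.0, unnormalizing reduces to adding 0.5, which is what `cifar_imshow` does.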

In [None]:
cifar_train_loader = DataLoader(cifar10_trainset, batch_size=128, shuffle=True)

Given the following network parameters, implement CNN1 using PyTorch.


In [None]:
'''
CNN1(
  (convs): Sequential(
    (0): Conv2d(3, 8, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Conv2d(8, 8, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): Conv2d(8, 16, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
    (7): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (8): ReLU(inplace=True)
    (9): Conv2d(16, 16, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (10): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (11): ReLU(inplace=True)
    (12): Conv2d(16, 32, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
    (13): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (14): ReLU(inplace=True)
    (15): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (16): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (17): ReLU(inplace=True)
    (18): MaxPool2d(kernel_size=4, stride=4, padding=0, dilation=1, ceil_mode=False)
  )
  (fcs): Sequential(
    (0): Linear(in_features=32, out_features=10, bias=True)
  )
)
'''
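
The `in_features=32` of the final linear layer follows from the spatial sizes produced by the layers listed above. A small sketch that traces those sizes with the standard output-size formula, `(size + 2 * padding - kernel) // stride + 1` (the helper name `conv_out` is hypothetical; BatchNorm and ReLU leave the shape unchanged and are omitted):

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) for the six convolutions, then the max pool.
layers = [(5, 2, 2), (5, 1, 2), (5, 2, 2), (5, 1, 2), (5, 2, 2), (5, 1, 2), (4, 4, 0)]

size = 32
sizes = [size]
for kernel, stride, padding in layers:
    size = conv_out(size, kernel, stride, padding)
    sizes.append(size)

flattened = 32 * size * size  # 32 channels after the last convolution
print(sizes, flattened)
```

The feature map shrinks to $1 \times 1$ after pooling, so flattening the 32 channels gives exactly the 32 inputs the linear layer expects.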

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class CNN1(nn.Module):
  def __init__(self):
    super().__init__()
    # TODO: define your CNN

  def forward(self, x):
    # TODO: define your forward function
    return outs

Let's do classification on the CIFAR-10 dataset.
**Note**: remember to keep the logs of training.

In [None]:
# use GPU to train if possible
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
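
The training cells refer to `train()` and `test()` helpers that this notebook does not define. One possible shape for them is sketched below, under the assumption that each helper processes one pass over a loader and returns the mean loss and accuracy. The names `train_epoch` and `evaluate`, and the toy model and random data used for the smoke run, are hypothetical stand-ins, not part of the assignment.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def train_epoch(model, loader, criterion, optimizer, device):
    """Run one training pass over loader; return (mean loss, accuracy)."""
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, correct / seen


@torch.no_grad()
def evaluate(model, loader, criterion, device):
    """Run one evaluation pass without gradients; return (mean loss, accuracy)."""
    model.eval()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)
        total_loss += criterion(logits, labels).item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, correct / seen


# Smoke run on random tensors (not CIFAR-10) with a toy linear classifier.
torch.manual_seed(0)
toy_device = torch.device("cpu")
toy_data = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
toy_loader = DataLoader(toy_data, batch_size=16)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(toy_device)
toy_criterion = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=1e-3)
train_loss, train_acc = train_epoch(toy_model, toy_loader, toy_criterion, toy_optimizer, toy_device)
val_loss, val_acc = evaluate(toy_model, toy_loader, toy_criterion, toy_device)
print(f"train loss {train_loss:.3f} acc {train_acc:.3f} | val loss {val_loss:.3f} acc {val_acc:.3f}")
```

Switching between `model.train()` and `model.eval()` matters here because the networks in this problem use BatchNorm, which behaves differently in the two modes.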

In [None]:
import torch.optim as optim  # optim is first used in this cell

n_epoch = 10
cnn1 = CNN1().to(device)  # operate on GPU
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn1.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
for i in range(n_epoch):
  # TODO: use your train() to train your cnn1 and test() to evaluate on your validation set
  pass  # placeholder so the cell parses until the calls are added


Evaluate the classification performance on the testing set.

In [None]:
# todo: use your test() to test your cnn1

#### Data Augmentation

To mitigate overfitting and simulate real-world data variability, we can enlarge the training set with data augmentation. Here, we will implement some common augmentation techniques.

In [None]:
from torch.utils.data import ConcatDataset
import copy

transform1 = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (1.0, 1.0, 1.0))])

# TODO: implement the transforms

# Shifting: randomly shift the images up/down and left/right by up to 10%.
transform2 =

# Rotating: randomly rotate the images by an angle in the range (-30 degrees, 30 degrees).
transform3 =

# Flipping: horizontally flip the images.
transform4 =

# Adding Noise: randomly add some small Gaussian noise to the images.
transform5 =

transform_list = [transform1, transform2, transform3, transform4, transform5]
augmented_dataset = []
for t in transform_list:
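  dataset = copy.deepcopy(cifar10_trainset)
  # cifar10_trainset is a Subset from random_split; the transform is applied by
  # the wrapped CIFAR10 dataset, so it must be set there to take effect.
  dataset.dataset.transform = t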
  augmented_dataset.append(dataset)

cifar_train_dataset = ConcatDataset(augmented_dataset)
cifar_train_loader = DataLoader(cifar_train_dataset, batch_size=128, shuffle=True)
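
Any callable can be used as a step in `transforms.Compose`, so a noise augmentation can be written as a small class. A minimal sketch, assuming the noise is added after `ToTensor()` so the input is a float tensor; the class name `AddGaussianNoise` and the default `std=0.02` are hypothetical choices, not values prescribed by the assignment:

```python
import torch


class AddGaussianNoise:
    """Add zero-mean Gaussian noise to a tensor image (hypothetical helper)."""

    def __init__(self, std=0.02):
        self.std = std  # assumed noise level; tune as needed

    def __call__(self, img):
        # randn_like draws noise with the same shape, dtype and device as img.
        return img + torch.randn_like(img) * self.std


# Apply it to a constant synthetic image to see the effect.
torch.manual_seed(0)
img = torch.full((3, 32, 32), 0.25)
noisy = AddGaussianNoise(std=0.02)(img)
print((noisy - img).std())
```

Since the noise is drawn on every call, each epoch sees a different noisy version of the same image.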

Use the same CNN architecture to train on the augmented training dataset.

**Note**: remember to keep the logs of training.
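
One simple way to keep the logs is to collect one record per epoch and write the records to a CSV file. The sketch below assumes metrics named `train_loss`, `val_loss` and `val_acc`; the helper `save_history`, the file name, and the zero values are placeholders that only show the record layout, not measured results.

```python
import csv
import os
import tempfile


def save_history(history, path):
    """Write a list of per-epoch metric dicts to a CSV file."""
    fieldnames = list(history[0].keys())
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(history)


# Placeholder values only; replace them with the numbers from each epoch.
history = [
    {"epoch": 1, "train_loss": 0.0, "val_loss": 0.0, "val_acc": 0.0},
    {"epoch": 2, "train_loss": 0.0, "val_loss": 0.0, "val_acc": 0.0},
]
log_path = os.path.join(tempfile.gettempdir(), "cnn1_training_log.csv")
save_history(history, log_path)
print(log_path)
```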

In [None]:
# use GPU to train if possible
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

Define a loss function and optimizer.

In [None]:
import torch.optim as optim

# TODO: you can change loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn1.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)

Train the network. Evaluate your model at the end of each epoch.

In [None]:
n_epoch = 20

In [None]:
for i in range(n_epoch):
  # TODO: use your train() to train your cnn1 and test() to evaluate on your validation set
  pass  # placeholder so the cell parses until the calls are added

Evaluate the classification performance on the testing set.

In [None]:
# todo: use your test() to test your cnn1

Define CNN2 by modifying the CNN1 model: double the number of output channels in each convolutional layer, for example, changing 8 to 16.
Train it on the augmented dataset and report the accuracy on the testing set.
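
To get a feel for how much larger CNN2 is, the trainable parameters can be counted by hand. The sketch below assumes that "doubling" applies to every convolution's output channels (with BatchNorm and the linear layer's input adjusted to match) and that the feature map is $1 \times 1$ before the linear layer, as in CNN1. The helper `count_params` is hypothetical.

```python
def count_params(channels, kernel, in_channels=3, num_classes=10):
    """Trainable parameters of a conv+BatchNorm stack followed by one linear layer."""
    total = 0
    prev = in_channels
    for out in channels:
        total += kernel * kernel * prev * out + out  # conv weights + biases
        total += 2 * out                             # BatchNorm scale + shift
        prev = out
    total += prev * num_classes + num_classes        # linear layer on a 1x1 map
    return total


cnn1_channels = [8, 8, 16, 16, 32, 32]
cnn2_channels = [2 * c for c in cnn1_channels]  # assumed reading of "doubling"

cnn1_params = count_params(cnn1_channels, kernel=5)
cnn2_params = count_params(cnn2_channels, kernel=5)
print(cnn1_params, cnn2_params, cnn2_params / cnn1_params)
```

Doubling the channels roughly quadruples the parameter count, because most convolutions have both their input and output channels doubled.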

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class CNN2(nn.Module):
  def __init__(self):
    super().__init__()
    # TODO: define your CNN


  def forward(self, x):
    # TODO: define your forward function

    return out

In [None]:
n_epoch = 20
cnn2 = CNN2().to(device)  # operate on GPU
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn2.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
for i in range(n_epoch):
  # TODO: use your train() to train your cnn2 and test() to evaluate on your validation set
  pass  # placeholder so the cell parses until the calls are added

Define CNN3 by modifying the above CNN model to use a kernel size of 3 in every convolutional layer.
Train it on the augmented dataset and report the accuracy on the testing set.
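
The text does not say what padding to use with the smaller kernel, and the choice changes the size of the feature map that reaches the linear layer. The sketch below compares two assumed options, `padding=1` and keeping `padding=2`, using the standard output-size formula (the helpers `conv_out` and `trace` are hypothetical):

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace(kernel, padding, size=32):
    """Spatial size after six convolutions (strides 2,1,2,1,2,1) and MaxPool2d(4)."""
    for stride in (2, 1, 2, 1, 2, 1):
        size = conv_out(size, kernel, stride, padding)
    return conv_out(size, 4, 4, 0)


pooled_same = trace(kernel=3, padding=1)  # padding reduced along with the kernel
pooled_kept = trace(kernel=3, padding=2)  # padding left at the 5x5 value
print(pooled_same, pooled_kept)
```

With `padding=1` the pooled map stays $1 \times 1$, so the linear layer's `in_features` equals the final channel count; with `padding=2` the map is larger and `in_features` would have to grow accordingly.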

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class CNN3(nn.Module):
  def __init__(self):
    super().__init__()
    # TODO: define your CNN

  def forward(self, x):
    # TODO: define your forward function

    return out

In [None]:
n_epoch = 20
cnn3 = CNN3().to(device)  # operate on GPU
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn3.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
for i in range(n_epoch):
  # TODO: use your train() to train your cnn3 and test() to evaluate on your validation set
  pass  # placeholder so the cell parses until the calls are added

Try different optimizers or initial learning rates. Train the model on the augmented dataset and report the accuracy on the testing set.
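
One way to organize such a comparison is a small factory that builds an optimizer from a name and a learning rate. This is a sketch only: the helper `make_optimizer`, the three optimizers chosen, and the learning rates listed are assumptions for illustration, and a linear layer stands in for the CNN.

```python
import torch.nn as nn
import torch.optim as optim


def make_optimizer(name, params, lr):
    """Build an optimizer by name (hypothetical helper)."""
    if name == "sgd":
        return optim.SGD(params, lr=lr, momentum=0.9)
    if name == "adam":
        return optim.Adam(params, lr=lr)
    if name == "rmsprop":
        return optim.RMSprop(params, lr=lr)
    raise ValueError(f"unknown optimizer: {name}")


toy_model = nn.Linear(4, 2)  # stand-in for a CNN
configs = [("sgd", 0.01), ("adam", 0.001), ("rmsprop", 0.001)]
optimizers = {
    name: make_optimizer(name, toy_model.parameters(), lr) for name, lr in configs
}
for name, opt in optimizers.items():
    print(name, opt.param_groups[0]["lr"])
```

For a fair comparison, each configuration should start from a freshly initialized model rather than continuing from weights trained under a previous setting.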

####  Discussion

Based on your experiments in Problem 1, which factors appear to affect the classification performance the most?

