# 1. Receptive field and parameter count (1 point)

Recall that the *receptive field* refers to the size of the region in the input that is visible to a given activation (or neuron) in a convolutional neural network. "Visible" here means that the values of those inputs affect the value of the activation. In all of the following questions, assume that the input image is arbitrarily large, so you don't need to worry about boundary effects or padding.

1. Consider a convolutional network which consists of three convolutional layers, each with a filter size of 3x3, and a stride of 1x1. What is the receptive field size of one of the activations at the final output?
1. What is the receptive field if the stride is 2x3 at each layer?
1. What is the receptive field if the stride is 2x2 at each layer, and there is a 2x2 max-pooling layer with stride 2x2 after each convolutional layer?
1. Assume that the input image has 3 channels, the three convolutional layers have 16, 32, and 64 channels respectively, and that there are no biases on any of the layers. How many parameters does the network have?
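
One way to sanity-check answers to questions like these is the standard receptive-field recurrence: each layer with kernel size `k` grows the receptive field by `(k - 1)` times the current "jump" (the product of the strides of all earlier layers), and then multiplies the jump by its own stride. The sketch below applies that recurrence along one spatial axis (call it once per axis when the stride differs between height and width) and also counts the weights of a bias-free convolution. The helper names `receptive_field` and `conv_params` are made up for illustration, and the example layer stacks are deliberately different from the ones in the questions.

```python
def receptive_field(layers):
    """Receptive field along one axis for a stack of (kernel, stride) layers.

    Pooling layers are handled the same way as convolutions.
    """
    rf, jump = 1, 1  # one input pixel sees itself; adjacent outputs are 1 pixel apart
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def conv_params(in_channels, out_channels, kernel_h, kernel_w):
    """Number of weights in a bias-free 2D convolution."""
    return in_channels * out_channels * kernel_h * kernel_w


# Two 5x5 convolutions with stride 1: 1 + 4 + 4 = 9
print(receptive_field([(5, 1), (5, 1)]))
# A 3x3 convolution with stride 2, then 2x2 pooling with stride 2: 1 + 2*1 + 1*2 = 5
print(receptive_field([(3, 2), (2, 2)]))
# A single 5x5 convolution from 1 to 8 channels: 1 * 8 * 5 * 5 = 200
print(conv_params(1, 8, 5, 5))
```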

# 2. CIFAR-10 classification (4 points)

CIFAR-10 is a standard dataset where the goal is to classify 32 x 32 images into one of 10 classes. The goal of this problem is simple: build and train a convolutional neural network to perform classification on CIFAR-10. The problem is intentionally extremely open-ended! There are dozens (hundreds?) of tutorials online describing how to train a convnet on CIFAR-10 - please seek them out and make use of them. I recommend getting started with the [CIFAR-10 tutorial from PyTorch](https://colab.research.google.com/github/pytorch/tutorials/blob/gh-pages/_downloads/cifar10_tutorial.ipynb) which includes code for loading the dataset and evaluating performance on it. You are welcome to use any other resource that you want (but please cite it!) - as I mentioned there are many, many tutorials online, and googling for help is an utterly crucial skill for a researcher! You will be graded on the final test accuracy achieved by your model:

- 60% accuracy or higher: 2/4 points
- 75% accuracy or higher: 3/4 points
- 90% accuracy or higher: 4/4 points
- Highest accuracy in the class: 4/3 points!

Note that in order for us to know the final performance of your model, you will need to implement a function that computes the accuracy of your model on the test set (which appears in both of the linked tutorials above). The only rules are: you can only train your model on the CIFAR-10 training set (i.e. you can't use pre-trained models or other datasets for additional training, and you certainly can't train on the CIFAR-10 test set!), and you must train the model on the free Colab GPU or TPU. This means you can only train the model for an hour or so! This is *much* less compute than is typically used for training CIFAR-10 models. As such, this is as much an exercise in building an accurate model as it is in building an efficient one. This is a popular game to play, and to the best of my knowledge the state-of-the-art is [this approach](https://myrtle.ai/learn/how-to-train-your-resnet/) which attains 96% accuracy in only *26 seconds* on a single GPU! (Note that the final link on that page is broken; it should be [this](https://myrtle.ai/learn/how-to-train-your-resnet-8-bag-of-tricks/).)

There are lots of things you can try to make your model more accurate and/or more efficient:

1. Deeper models
1. Residual connections
1. [Data augmentation and normalization](https://d2l.ai/chapter_computer-vision/kaggle-cifar10.html#image-augmentation)
1. Regularization like dropout or weight decay
1. [Learning rate schedules](https://d2l.ai/chapter_optimization/lr-scheduler.html)
1. [Different forms of normalization](https://d2l.ai/chapter_convolutional-modern/batch-norm.html)

Note that we haven't covered all these topics in class yet, but you should be able to get to at least 60% accuracy without applying all of these ideas - and probably 75% by tweaking around a little bit. Specifically, you should be able to get about 60% accuracy by taking the basic AlexNet architecture we discussed in class and applying it directly to CIFAR-10. And, if you're feeling adventurous, feel free to go for 96% using the aforementioned blog series! Good luck!

In [15]:
# Import Statements
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
from torch import Tensor
from typing import Type
import matplotlib.pyplot as plt

In [16]:
epochs = 24
warmup = 5
batch_size = 400
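momentum = 0.9
learning_rate = 0.01
weight_decay = 0.000125
# Prefer CUDA (e.g. the Colab GPU), then Apple's MPS backend, then the CPU
device = torch.device("cuda" if torch.cuda.is_available()
                      else "mps" if torch.backends.mps.is_available()
                      else "cpu")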

In [17]:
# Transformation to Dataset

# Augmenting the dataset
transform_train = transforms.Compose([
    # Scale the image up to a square of 40 pixels in both height and width
    transforms.Resize(40),
    # Randomly crop a square image of 40 pixels in both height and width to
    # produce a small square of 0.64 to 1 times the area of the original
    # image, and then scale it to a square of 32 pixels in both height and
    # width
    transforms.RandomResizedCrop(32, scale=(0.64, 1.0), ratio=(1.0, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    # Standardize each channel of the image
    transforms.Normalize([0.4914, 0.4822, 0.4465],  # Is this the best normalization? Or [0.5, 0.5, 0.5]?
                         [0.2023, 0.1994, 0.2010])])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.4914, 0.4822, 0.4465],  # Is this the best normalization?
                         [0.2023, 0.1994, 0.2010])])

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
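
The per-channel constants passed to `transforms.Normalize` are a mean and a standard deviation, and each pixel is mapped to `(x - mean) / std`. As a rough illustration of where such constants can come from, the sketch below computes per-channel statistics over a tiny made-up dataset in plain Python. `channel_stats` and `normalize` are hypothetical helpers; in practice the same computation would be run over the CIFAR-10 training images as tensors.

```python
import math


def channel_stats(images):
    """Per-channel mean and (population) std over a list of images.

    Each image is a list of channels; each channel is a flat list of
    pixel values already scaled to [0, 1].
    """
    num_channels = len(images[0])
    means, stds = [], []
    for c in range(num_channels):
        pixels = [p for img in images for p in img[c]]
        mean = sum(pixels) / len(pixels)
        var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
        means.append(mean)
        stds.append(math.sqrt(var))
    return means, stds


def normalize(pixel, mean, std):
    """The per-pixel mapping applied by a mean/std normalization."""
    return (pixel - mean) / std


# Toy data: two 1-channel "images" with two pixels each (made-up values).
toy = [[[0.0, 0.5]], [[0.5, 1.0]]]
means, stds = channel_stats(toy)
print(means, stds)
print(normalize(1.0, means[0], stds[0]))
```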


In [18]:
# Loading Dataset

# Load Training Dataset
trainset = torchvision.datasets.CIFAR10("../data/train_data", train=True,
                                        download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

# Load Testing Dataset
testset = torchvision.datasets.CIFAR10("../data/test_data", train=False,
                                       download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)

Files already downloaded and verified
Files already downloaded and verified


In [19]:
# ResNet-9-style network (8 conv layers + 1 linear layer), not a full ResNet-18

# Basic Block
def conv_block(in_channels, out_channels, pool=False):
    layers = [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
              nn.BatchNorm2d(out_channels),
              nn.ReLU(inplace=True)]
    if pool: layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

'''
    # Basic Block
class BasicBlock(nn.Module): # expands nn.module class
  def __init__(self, in_channels: int, out_channels: int, pool=False, downsample: nn.Module = None) -> None: # Initialize Basic Block
    super(BasicBlock, self).__init__() # Initialize nn.module

    # Set params
    self.expansion = 1
    self.downsample = downsample

    # Set layers
    self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False)
    self.bn1 = nn.BatchNorm2d(out_channels)
    self.celu = nn.CELU(inplace=True) # Maybe change to ReLU?

    if pool: self.pool = nn.MaxPool2d(2)

  def forward(self, x: Tensor) -> Tensor: # Order of going through layers
    identity = x

    out = self.conv1(x)
    out = self.bn1(out)
    out = self.celu(out)

    if self.pool:
      out = self.pool(out)

    return  out
'''


In [20]:
# ResNet Block

class ResNet(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()

        self.conv1 = conv_block(in_channels, 64)
        self.conv2 = conv_block(64, 128, pool=True)
        self.res1 = nn.Sequential(conv_block(128, 128), conv_block(128, 128))

        self.conv3 = conv_block(128, 256, pool=True)
        self.conv4 = conv_block(256, 512, pool=True)
        self.res2 = nn.Sequential(conv_block(512, 512), conv_block(512, 512))

        self.classifier = nn.Sequential(nn.MaxPool2d(4),
                                        nn.Flatten(),
                                        nn.Linear(512, num_classes))

    def forward(self, xb):
        out = self.conv1(xb)
        out = self.conv2(out)
        out = self.res1(out) + out
        out = self.conv3(out)
        out = self.conv4(out)
        out = self.res2(out) + out
        out = self.classifier(out)
        return out
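
To see why the classifier's `nn.Linear(512, num_classes)` lines up with the feature map, it can help to trace the spatial size of a 32 x 32 input through the network. The sketch below does this with plain arithmetic rather than PyTorch, using the usual output-size formula for a convolution and assuming that `nn.MaxPool2d(k)` uses a stride equal to its kernel size (its default). The helper names are illustrative only.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Output size of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output size of max pooling with stride equal to the kernel size."""
    return size // kernel


size = 32                              # CIFAR-10 images are 32 x 32
size = conv_out(size)                  # conv1: 32
size = pool_out(conv_out(size), 2)     # conv2 + pool: 16
size = conv_out(conv_out(size))        # res1 (two conv blocks): 16
size = pool_out(conv_out(size), 2)     # conv3 + pool: 8
size = pool_out(conv_out(size), 2)     # conv4 + pool: 4
size = conv_out(conv_out(size))        # res2 (two conv blocks): 4
size = pool_out(size, 4)               # classifier MaxPool2d(4): 1

flattened = 512 * size * size          # 512 channels after conv4/res2
print(size, flattened)
```

Because the 3x3 convolutions use `padding=1`, the residual branches keep both the spatial size and the channel count unchanged, which is what allows `self.res1(out) + out` and `self.res2(out) + out` to add tensors of the same shape.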

In [21]:
def save_plots(train_acc, valid_acc, train_loss, valid_loss, name=None):
    """
    Plot the accuracy and loss curves side by side and, if `name` is given,
    save the figure to disk.
    """
    plt.figure(figsize=(10, 7))

    # Accuracy plots (left panel).
    plt.subplot(1, 2, 1)
    plt.plot(
        train_acc, color='tab:blue', linestyle='-',
        label='train accuracy'
    )
    plt.plot(
        valid_acc, color='tab:red', linestyle='-',
        label='validation accuracy'
    )
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()

    # Loss plots (right panel of the same figure).
    plt.subplot(1, 2, 2)
    plt.plot(
        train_loss, color='tab:blue', linestyle='-',
        label='train loss'
    )
    plt.plot(
        valid_loss, color='tab:red', linestyle='-',
        label='validation loss'
    )
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()

    # Save only when a name is provided; the ".png" file name is an assumption.
    if name is not None:
        plt.savefig(f"{name}.png")

In [32]:
# Training function.
import time

def train(model, trainloader, optimizer, criterion, device, sched):
    # `sched` is accepted but not stepped here; the caller steps the scheduler.
    model.train()
    print('Training')
    train_running_loss = 0.0
    train_running_correct = 0

    for image, labels in trainloader:
        image = image.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        # Forward pass.
        outputs = model(image)
        # Calculate the loss.
        loss = criterion(outputs, labels)
        train_running_loss += loss.item()
        # Calculate the accuracy.
        preds = outputs.argmax(dim=1)
        train_running_correct += (preds == labels).sum().item()
        # Backpropagation.
        loss.backward()
        # Update the weights.
        optimizer.step()

    # Loss and accuracy for the complete epoch.
    # Note: each loss.item() is a per-batch mean, so this sum divided by the
    # dataset size is the mean loss scaled down by roughly the batch size.
    epoch_loss = train_running_loss / len(trainloader.dataset)
    epoch_acc = 100. * (train_running_correct / len(trainloader.dataset))
    return epoch_loss, epoch_acc

In [33]:
# Validation function.
def test(model, testloader, criterion, device):
    model.eval()

    print('Testing')
    test_running_loss = 0.0
    test_running_correct = 0
    with torch.no_grad():
        for image, labels in testloader:
            image = image.to(device)
            labels = labels.to(device)
            # Forward pass.
            outputs = model(image)
            # Calculate the loss.
            loss = criterion(outputs, labels)
            test_running_loss += loss.item()
            # Calculate the accuracy.
            preds = outputs.argmax(dim=1)
            test_running_correct += (preds == labels).sum().item()

    # Loss and accuracy for the complete epoch.
    # The loss is the sum of per-batch mean losses divided by the dataset size.
    epoch_loss = test_running_loss / len(testloader.dataset)
    epoch_acc = 100. * (test_running_correct / len(testloader.dataset))
    return epoch_loss, epoch_acc
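
A note on the loss values these functions report: when `criterion` is `nn.CrossEntropyLoss()` with its default `reduction='mean'`, as in this notebook, `loss.item()` is already a per-batch mean. Summing those means and dividing by the number of samples therefore gives a value roughly `batch_size` times smaller than the mean per-sample loss. The plain-Python sketch below compares that quantity with two common alternatives on made-up numbers; the function names are hypothetical.

```python
def loss_over_dataset_size(batch_means, batch_sizes):
    """Sum of per-batch mean losses divided by the number of samples."""
    return sum(batch_means) / sum(batch_sizes)


def loss_over_num_batches(batch_means):
    """Average of the per-batch mean losses."""
    return sum(batch_means) / len(batch_means)


def loss_per_sample(batch_means, batch_sizes):
    """Mean per-sample loss, weighting each batch by its size."""
    total = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)


# Made-up example: two batches, the last one smaller than the first.
batch_means = [2.0, 1.0]
batch_sizes = [4, 2]

print(loss_over_dataset_size(batch_means, batch_sizes))  # 3.0 / 6 = 0.5
print(loss_over_num_batches(batch_means))                # 3.0 / 2 = 1.5
print(loss_per_sample(batch_means, batch_sizes))         # 10.0 / 6, about 1.667
```

With a batch size of 400, this scaling would be consistent with epoch losses that print as very small values even while the per-sample cross-entropy is of order one.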

In [24]:
import torch.optim as optim
import argparse
import numpy as np
import random

# Set seed.
#seed = 42
#torch.manual_seed(seed)
#torch.cuda.manual_seed(seed)
#torch.backends.cudnn.deterministic = True
#torch.backends.cudnn.benchmark = True
#np.random.seed(seed)
#random.seed(seed)

In [27]:
# Learning and training parameters.
#device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Define model
model = ResNet(in_channels=3, num_classes=10).to(device)
# plot_name = 'resnet_scratch'

# # Total parameters and trainable parameters.
# total_params = sum(p.numel() for p in model.parameters())
# print(f"{total_params:,} total parameters.")
# total_trainable_params = sum(
#     p.numel() for p in model.parameters() if p.requires_grad)
# print(f"{total_trainable_params:,} training parameters.")
# Optimizer.
# optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)
optimizer = optim.Adam(model.parameters(), learning_rate, weight_decay=weight_decay)
# Loss function.
criterion = nn.CrossEntropyLoss()
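# Schedule. sched.step() is called once per epoch in the training loop, so the
# one-cycle schedule is defined over `epochs` steps (one step per epoch).
sched = optim.lr_scheduler.OneCycleLR(optimizer, learning_rate, epochs=epochs,
                                      steps_per_epoch=1)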

In [34]:
# Lists to keep track of losses and accuracies.
train_loss, test_loss = [], []
train_acc, test_acc = [], []
lr_tracker = []
# Start the training.

for epoch in range(epochs):
    start_time = time.time()
    train_epoch_loss, train_epoch_acc = train(
        model,
        trainloader,
        optimizer,
        criterion,
        device,
        sched
    )
    test_epoch_loss, test_epoch_acc = test(
        model,
        testloader,
        criterion,
        device
    )

    sched.step()
    train_loss.append(train_epoch_loss)
    test_loss.append(test_epoch_loss)
    train_acc.append(train_epoch_acc)
    test_acc.append(test_epoch_acc)
    elapsed_time = time.time() - start_time
    mins, secs = divmod(elapsed_time, 60)
    print(f"[INFO]: Epoch {epoch+1} of {epochs}, Elapsed time: {int(mins)}m {int(secs)}s")
    print(f"Training loss: {train_epoch_loss:.3f}, training acc: {train_epoch_acc:.3f}")
    print(f"Validation loss: {test_epoch_loss:.3f}, validation acc: {test_epoch_acc:.3f}")
    print('-'*50)

# Save the loss and accuracy plots.
# save_plots(
#     train_acc,
#     test_acc,
#     train_loss,
#     test_loss,
#     name=plot_name
#     )
print('TRAINING COMPLETE')

Training
Testing
[INFO]: Epoch 1 of 24, Elapsed time: 3m 55s
Training loss: 0.002, training acc: 71.722
Validation loss: 0.002, validation acc: 68.920
--------------------------------------------------
Training
Testing
[INFO]: Epoch 2 of 24, Elapsed time: 4m 20s
Training loss: 0.002, training acc: 77.764
Validation loss: 0.002, validation acc: 73.650
--------------------------------------------------
Training


Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x1119604a0>
Traceback (most recent call last):
  File "/Users/chryron/.local/share/virtualenvs/CSC2516_NN-DL-UFXW_cxZ/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1477, in __del__
    def __del__(self):

  File "/Users/chryron/.local/share/virtualenvs/CSC2516_NN-DL-UFXW_cxZ/lib/python3.11/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 44839) is killed by signal: Interrupt: 2. 


KeyboardInterrupt: 

In [None]:
# Image augmentation - randomly flip stuff (https://d2l.ai/chapter_computer-vision/kaggle-cifar10.html#image-augmentation)
# Deeper model and ResNet
# Regularization like dropout or weight decay
# Learning rate schedulers
# Batch normalization (https://d2l.ai/chapter_convolutional-modern/batch-norm.html)

https://medium.com/@nischitasadananda/convolutional-neural-network-data-augmentation-and-batch-normalization-fd9d6237e9e

https://d2l.ai/chapter_computer-vision/kaggle-cifar10.html#image-augmentation
