# Computer Vision Homework 3: Big vs Small Models

## Brief

Due date: Nov 16, 2022

Required files: `homework-3.ipynb`, `report.pdf`

To download the Jupyter notebook from Colab, refer to the Colab tutorial we provided.


## Code for Problem 1 and Problem 2

### Import Packages

In [48]:
import glob
import os
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import torch.optim as optim

from PIL import Image
from torch.utils.data import DataLoader, Dataset, RandomSampler
from torchvision import transforms, models, datasets
from tqdm import tqdm

%matplotlib inline

### Check GPU Environment

In [49]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using {device} device')

Using cuda device


In [50]:
! nvidia-smi -L

GPU 0: NVIDIA TITAN X (Pascal) (UUID: GPU-08bb949b-ba00-8984-874f-5599aaa811fd)


### Set the Seed to Reproduce the Result

In [51]:
def set_all_seed(seed):
    np.random.seed(seed)
    random.seed(seed)
    torch.manual_seed(seed)
set_all_seed(123)

### Create Dataset and Dataloader

In [52]:
batch_size = 256

train_transform = transforms.Compose([
    transforms.Pad(4, padding_mode='reflect'),
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32),
    transforms.ToTensor(),
])
test_transform = transforms.Compose([
    transforms.ToTensor(),
])

train_dataset = datasets.CIFAR10(root='data', train=True, download=True, transform=train_transform)
valid_dataset = datasets.CIFAR10(root='data', train=False, download=True, transform=test_transform)

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, pin_memory=True)
valid_dataloader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False, pin_memory=True)

sixteenth_train_sampler = RandomSampler(train_dataset, num_samples=len(train_dataset)//16)
half_train_sampler = RandomSampler(train_dataset, num_samples=len(train_dataset)//2)

sixteenth_train_dataloader = DataLoader(train_dataset, batch_size=batch_size, sampler=sixteenth_train_sampler)
half_train_dataloader = DataLoader(train_dataset, batch_size=batch_size, sampler=half_train_sampler)

Files already downloaded and verified
Files already downloaded and verified
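As a sanity check on the dataloaders above, the number of batches per epoch follows from the dataset size and `batch_size`. The sketch below is plain arithmetic and assumes the standard CIFAR-10 split (50,000 training and 10,000 test images) and the default `drop_last=False`; the helper name `batches_per_epoch` is illustrative and not part of the notebook.

```python
import math

BATCH_SIZE = 256
TRAIN_SIZE = 50_000  # CIFAR-10 training images (assumed standard split)
TEST_SIZE = 10_000   # CIFAR-10 test images (assumed standard split)


def batches_per_epoch(num_samples, batch_size=BATCH_SIZE):
    # With drop_last=False the final, partial batch is kept, hence the ceiling.
    return math.ceil(num_samples / batch_size)


counts = {
    "train_dataloader": batches_per_epoch(TRAIN_SIZE),
    "half_train_dataloader": batches_per_epoch(TRAIN_SIZE // 2),
    "sixteenth_train_dataloader": batches_per_epoch(TRAIN_SIZE // 16),
    "valid_dataloader": batches_per_epoch(TEST_SIZE),
}

for name, n in counts.items():
    print(f"{name}: {n} batches per epoch")
```

The 196 and 40 batches computed here match the progress-bar totals in the training log. Note also that `RandomSampler` draws a fresh set of indices on every pass over the dataloader, so the sixteenth and half subsets are not the same images in every epoch.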


### Load Models

In [53]:
# HINT: When needed, change the model to 'resnet50' and pass weights="IMAGENET1K_V1" instead of weights=None.
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', weights=None)

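# Background: The original resnet18 is designed for the ImageNet dataset and predicts 1000 classes.
# TODO: Change the output of the model to 10 classes.
# Read in_features from the existing layer (512 for resnet18, 2048 for resnet50) so the same line works for both models.
model.fc = nn.Linear(in_features=model.fc.in_features, out_features=10, bias=True)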
model = model.to(device)

Using cache found in /home/leo/.cache/torch/hub/pytorch_vision_v0.10.0


### Training and Testing Models

In [54]:
# TODO: Fill in the code cell according to the pytorch tutorial we gave.
def train(dataloader, model, loss_fn, optimizer):
    num_batches = len(dataloader)
    # Count the samples actually seen: with a sampler, this can be
    # smaller than len(dataloader.dataset).
    num_samples = 0
    epoch_loss = 0
    correct = 0

    model.train()

    for X, y in tqdm(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()
        num_samples += y.size(0)
        pred = pred.argmax(dim=1, keepdim=True)
        correct += pred.eq(y.view_as(pred)).sum().item()

    avg_epoch_loss = epoch_loss / num_batches
    avg_acc = correct / num_samples
    return avg_epoch_loss, avg_acc
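
One detail worth checking when `train` is used with the sampled dataloaders: `len(dataloader.dataset)` is the size of the underlying dataset (50,000 for the CIFAR-10 training set), not the number of samples the sampler yields. Accuracy should therefore be normalized by the number of samples actually iterated. A minimal sketch of the arithmetic, using illustrative names that are not part of the notebook:

```python
def accuracy(num_correct, num_samples):
    # Fraction of correctly classified samples.
    return num_correct / num_samples


dataset_size = 50_000            # size of the full training set (assumed)
samples_seen = dataset_size // 16  # what the sixteenth sampler yields per epoch

# Suppose every sample seen in the epoch is classified correctly.
num_correct = samples_seen

wrong_denominator = accuracy(num_correct, dataset_size)
right_denominator = accuracy(num_correct, samples_seen)

print(f"divided by dataset size: {wrong_denominator:.4f}")
print(f"divided by samples seen: {right_denominator:.4f}")
```

With the full `train_dataloader` both denominators coincide, so the training log for the full dataset is unaffected.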

In [55]:
def test(dataloader, model, loss_fn):
    num_batches = len(dataloader)
    size = len(dataloader.dataset)
    epoch_loss = 0
    correct = 0

    model.eval()

    with torch.no_grad():
        for X, y in tqdm(dataloader):
            X, y = X.to(device), y.to(device)

            pred = model(X)

            epoch_loss += loss_fn(pred, y).item()
            pred = pred.argmax(dim=1, keepdim=True)
            correct += pred.eq(y.view_as(pred)).sum().item()
    
    avg_epoch_loss = epoch_loss / num_batches
    avg_acc = correct / size

    return avg_epoch_loss, avg_acc

In [56]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

epochs = 50

for epoch in range(epochs):
    train_loss, train_acc = train(train_dataloader, model, loss_fn, optimizer)
    test_loss, test_acc = test(valid_dataloader, model, loss_fn)
    print(f"Epoch {epoch + 1:2d}: Loss = {train_loss:.4f} Acc = {train_acc:.2f} Test_Loss = {test_loss:.4f} Test_Acc = {test_acc:.2f}")
print('Done!')

100%|██████████| 196/196 [00:22<00:00,  8.73it/s]
100%|██████████| 40/40 [00:01<00:00, 27.88it/s]

Epoch  1: Loss = 1.5166 Acc = 0.45 Test_Loss = 1.6336 Test_Acc = 0.46
Epoch  2: Loss = 1.1780 Acc = 0.58 Test_Loss = 1.4084 Test_Acc = 0.50
Epoch  3: Loss = 1.0274 Acc = 0.63 Test_Loss = 1.2835 Test_Acc = 0.57
Epoch  4: Loss = 0.9382 Acc = 0.67 Test_Loss = 1.5416 Test_Acc = 0.53
Epoch  5: Loss = 0.8655 Acc = 0.69 Test_Loss = 0.8840 Test_Acc = 0.68
Epoch  6: Loss = 0.8050 Acc = 0.72 Test_Loss = 1.4274 Test_Acc = 0.58
Epoch  7: Loss = 0.7542 Acc = 0.74 Test_Loss = 0.8666 Test_Acc = 0.70
Epoch  8: Loss = 0.7172 Acc = 0.75 Test_Loss = 1.1271 Test_Acc = 0.64
Epoch  9: Loss = 0.6838 Acc = 0.76 Test_Loss = 0.7966 Test_Acc = 0.73
Epoch 10: Loss = 0.6451 Acc = 0.77 Test_Loss = 0.8470 Test_Acc = 0.72
Epoch 11: Loss = 0.6208 Acc = 0.78 Test_Loss = 0.7178 Test_Acc = 0.75
Epoch 12: Loss = 0.6016 Acc = 0.79 Test_Loss = 0.8327 Test_Acc = 0.72
Epoch 13: Loss = 0.5773 Acc = 0.80 Test_Loss = 0.6693 Test_Acc = 0.77
Epoch 14: Loss = 0.5530 Acc = 0.81 Test_Loss = 0.6963 Test_Acc = 0.76
Epoch 15: Loss = 0.5336 Acc = 0.81 Test_Loss = 0.9549 Test_Acc = 0.70
Epoch 16: Loss = 0.5259 Acc = 0.82 Test_Loss = 0.6777 Test_Acc = 0.78
Epoch 17: Loss = 0.5015 Acc = 0.82 Test_Loss = 0.7539 Test_Acc = 0.76
Epoch 18: Loss = 0.4823 Acc = 0.83 Test_Loss = 0.6877 Test_Acc = 0.77
Epoch 19: Loss = 0.4731 Acc = 0.83 Test_Loss = 0.5753 Test_Acc = 0.81
Epoch 20: Loss = 0.4582 Acc = 0.84 Test_Loss = 0.8219 Test_Acc = 0.75
Epoch 21: Loss = 0.4378 Acc = 0.85 Test_Loss = 0.6107 Test_Acc = 0.80
Epoch 22: Loss = 0.4369 Acc = 0.85 Test_Loss = 0.7147 Test_Acc = 0.77
Epoch 23: Loss = 0.4163 Acc = 0.85 Test_Loss = 0.6994 Test_Acc = 0.77
Epoch 24: Loss = 0.4096 Acc = 0.85 Test_Loss = 0.8661 Test_Acc = 0.73
Epoch 25: Loss = 0.3988 Acc = 0.86 Test_Loss = 0.5939 Test_Acc = 0.80
Epoch 26: Loss = 0.3857 Acc = 0.86 Test_Loss = 0.6087 Test_Acc = 0.80
Epoch 27: Loss = 0.3815 Acc = 0.86 Test_Loss = 0.8475 Test_Acc = 0.75
Epoch 28: Loss = 0.3652 Acc = 0.87 Test_Loss = 0.8235 Test_Acc = 0.75
Epoch 29: Loss = 0.3561 Acc = 0.87 Test_Loss = 0.6304 Test_Acc = 0.80
Epoch 30: Loss = 0.3515 Acc = 0.87 Test_Loss = 0.6897 Test_Acc = 0.78
Epoch 31: Loss = 0.3364 Acc = 0.88 Test_Loss = 0.6366 Test_Acc = 0.79
Epoch 32: Loss = 0.3298 Acc = 0.88 Test_Loss = 0.7969 Test_Acc = 0.76
Epoch 33: Loss = 0.3227 Acc = 0.89 Test_Loss = 0.6899 Test_Acc = 0.79
Epoch 34: Loss = 0.3178 Acc = 0.89 Test_Loss = 0.6822 Test_Acc = 0.78
Epoch 35: Loss = 0.3083 Acc = 0.89 Test_Loss = 0.5647 Test_Acc = 0.82
Epoch 36: Loss = 0.3009 Acc = 0.89 Test_Loss = 0.5484 Test_Acc = 0.82
Epoch 37: Loss = 0.2883 Acc = 0.90 Test_Loss = 0.5874 Test_Acc = 0.82
Epoch 38: Loss = 0.2864 Acc = 0.90 Test_Loss = 0.5717 Test_Acc = 0.82
Epoch 39: Loss = 0.2774 Acc = 0.90 Test_Loss = 0.6477 Test_Acc = 0.81
Epoch 40: Loss = 0.2716 Acc = 0.90 Test_Loss = 0.5979 Test_Acc = 0.82
Epoch 41: Loss = 0.2671 Acc = 0.91 Test_Loss = 0.6193 Test_Acc = 0.81
Epoch 42: Loss = 0.2526 Acc = 0.91 Test_Loss = 0.7899 Test_Acc = 0.77
Epoch 43: Loss = 0.2567 Acc = 0.91 Test_Loss = 0.5727 Test_Acc = 0.83
Epoch 44: Loss = 0.2474 Acc = 0.91 Test_Loss = 0.6093 Test_Acc = 0.82
Epoch 45: Loss = 0.2387 Acc = 0.92 Test_Loss = 0.6361 Test_Acc = 0.81
Epoch 46: Loss = 0.2448 Acc = 0.91 Test_Loss = 0.6329 Test_Acc = 0.81
Epoch 47: Loss = 0.2324 Acc = 0.92 Test_Loss = 0.5496 Test_Acc = 0.83
Epoch 48: Loss = 0.2216 Acc = 0.92 Test_Loss = 0.6562 Test_Acc = 0.82
Epoch 49: Loss = 0.2229 Acc = 0.92 Test_Loss = 0.5633 Test_Acc = 0.83
Epoch 50: Loss = 0.2177 Acc = 0.92 Test_Loss = 0.5901 Test_Acc = 0.82
Done!

## Code for Problem 3

In [57]:
# TODO: Try to achieve the best performance given all training data using whatever model and training strategy.

## Problems

1. (30%) Finish the rest of the code for Problem 1 and Problem 2 according to the hints. (2 code cells in total.)
2. Train a small model (resnet18) and a big model (resnet50) from scratch on `sixteenth_train_dataloader`, `half_train_dataloader`, and `train_dataloader`, respectively.
3. (30%) Achieve the best performance given all training data, using any model and training strategy.  
  (You may not use a model that was pretrained on CIFAR10.)
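
Problem 2 amounts to a grid of two models by three training sets. One possible way to keep the six runs organized is sketched below. It only enumerates the configurations: the model and dataloader names come from the notebook above, while `experiment_grid` and the dictionary keys are hypothetical helpers introduced here for illustration.

```python
from itertools import product

# Names taken from the notebook above.
MODEL_NAMES = ["resnet18", "resnet50"]
LOADER_NAMES = [
    "sixteenth_train_dataloader",
    "half_train_dataloader",
    "train_dataloader",
]


def experiment_grid(weights=None):
    """Return one config dict per (model, training set) pair."""
    return [
        {"model": model_name, "loader": loader_name, "weights": weights}
        for model_name, loader_name in product(MODEL_NAMES, LOADER_NAMES)
    ]


grid = experiment_grid()
for config in grid:
    # In the notebook, each config would rebuild the model and optimizer
    # from scratch before running the train/test loop.
    print(config)
```

Calling `experiment_grid(weights="IMAGENET1K_V1")` would list the same six runs for the ImageNet-initialized comparison in the discussion. Rebuilding the model and optimizer for each configuration matters, since reusing a trained model would carry weights over from the previous run.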



## Discussion


- (30%) Discuss the relationship between accuracy, model size, and training dataset size.  
    (6 models in total: the small model trained on one sixteenth, half, and all of the data, and the big model trained on the same three splits.)
- (10%) If we train the ResNet starting from ImageNet-initialized weights (`weights="IMAGENET1K_V1"`), how does this relationship change?

## Credits

1. [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html)