# Computer Vision Homework 3: Big vs Small Models

## Brief

Due date: Nov 13, 2023

Required files: `homework-3.ipynb`, `report.pdf`

To download the jupyter notebook from colab, you can refer to the colab tutorial we gave.


## Codes for Problem 1 and Problem 2

### Import Packages

In [1]:
import glob
import os
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

from PIL import Image
from torch.utils.data import DataLoader, Dataset, RandomSampler
from torchvision import transforms, models, datasets
from tqdm import tqdm

%matplotlib inline

### Check GPU Environment

In [2]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using {device} device')

Using cuda device


In [3]:
! nvidia-smi -L

GPU 0: Tesla T4 (UUID: GPU-6713d831-72d0-9681-f218-72d92a81c85a)


### Set the Seed to Reproduce the Result

In [4]:
def set_all_seed(seed):
    np.random.seed(seed)
    random.seed(seed)
    torch.manual_seed(seed)
set_all_seed(123)

### Create Dataset and Dataloader

In [5]:
batch_size = 256

mean = (0.4914, 0.4822, 0.4465)
std = (0.2471, 0.2435, 0.2616)
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean, std),
])
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean, std),
])

train_dataset = datasets.CIFAR10(root='data', train=True, download=True, transform=train_transform)
valid_dataset = datasets.CIFAR10(root='data', train=False, download=True, transform=test_transform)

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, pin_memory=True)
valid_dataloader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False, pin_memory=True)

sixteenth_train_sampler = RandomSampler(train_dataset, num_samples=len(train_dataset)//16)
half_train_sampler = RandomSampler(train_dataset, num_samples=len(train_dataset)//2)

sixteenth_train_dataloader = DataLoader(train_dataset, batch_size=batch_size, sampler=sixteenth_train_sampler)
half_train_dataloader = DataLoader(train_dataset, batch_size=batch_size, sampler=half_train_sampler)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/cifar-10-python.tar.gz


100%|██████████| 170498071/170498071 [00:04<00:00, 34320728.74it/s]


Extracting data/cifar-10-python.tar.gz to data
Files already downloaded and verified


### Load Models

In [None]:
# HINT: Remember to change the model to 'resnet50' and the weights to weights="IMAGENET1K_V1" when needed.
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50', weights="IMAGENET1K_V1")
#model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', weights=None)
num_fc = model.fc.in_features
model.fc = nn.Linear(num_fc, 10)
# Background: The original resnet18 is designed for ImageNet dataset to predict 1000 classes.
# TODO: Change the output of the model to 10 class.

Using cache found in /root/.cache/torch/hub/pytorch_vision_v0.10.0
Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
100%|██████████| 97.8M/97.8M [00:00<00:00, 146MB/s]


### Training and Testing Models

In [None]:
# TODO: Fill in the code cell according to the pytorch tutorial we gave.
LR=0.001
optimizer = torch.optim.Adam(model.parameters(), lr=LR)
loss_func = nn.CrossEntropyLoss()
input_shape = (-1,3,32,32)
for epoch in range(5):
  model.train()
  correct_train = 0
  total_train = 0
  for step, (x, y) in enumerate(train_dataloader):
    b_x = Variable(x, requires_grad=False)
    b_y = Variable(y, requires_grad=False)
    out = model(b_x)
    loss = loss_func(out, b_y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    predicted = torch.max(out.data, 1)[1]
    total_train += len(b_y)
    correct_train += (predicted == b_y).float().sum()
  train_accuracy = 100 * correct_train / float(total_train)
  print('Epoch: {} | accuracy: {}% | Loss: {}'.format(epoch + 1, train_accuracy, loss))

correct_test = 0
total_test = 0
for step, (x, y) in enumerate(valid_dataloader):
    b_x = Variable(x, requires_grad=False)
    b_y = Variable(y, requires_grad=False)
    out = model(b_x)
    loss = loss_func(out, b_y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    predicted = torch.max(out.data, 1)[1]
    total_test += len(b_y)
    correct_test += (predicted == b_y).float().sum()
test_accuracy = 100 * correct_test / float(total_test)
print('train_dataloader | accuracy: {}% '.format(test_accuracy))


Epoch: 1 | accuracy: 65.06600189208984% | Loss: 0.797803521156311
Epoch: 2 | accuracy: 76.8219985961914% | Loss: 0.6793445348739624
Epoch: 3 | accuracy: 80.697998046875% | Loss: 0.6568388938903809
Epoch: 4 | accuracy: 82.4520034790039% | Loss: 0.6170578002929688
Epoch: 5 | accuracy: 82.53199768066406% | Loss: 0.7501604557037354
train_dataloader | accuracy: 82.6500015258789% 


## Codes for Problem 3

In [6]:
# TODO: Try to achieve the best performance given all training data using whatever model and training strategy.
# (New) (You cannot use the model that was pretrained on CIFAR10)
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', weights=None)
num_fc = model.fc.in_features
model.fc = nn.Linear(num_fc, 10)
# TODO: Fill in the code cell according to the pytorch tutorial we gave.
LR=0.001
optimizer = torch.optim.Adam(model.parameters(), lr=LR)
loss_func = nn.CrossEntropyLoss()
input_shape = (-1,3,32,32)
for epoch in range(10):
  model.train()
  correct_train = 0
  total_train = 0
  for step, (x, y) in enumerate(train_dataloader):
    b_x = Variable(x, requires_grad=False)
    b_y = Variable(y, requires_grad=False)
    out = model(b_x)
    loss = loss_func(out, b_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    predicted = torch.max(out.data, 1)[1]
    total_train += len(b_y)
    correct_train += (predicted == b_y).float().sum()
  train_accuracy = 100 * correct_train / float(total_train)
  print('Epoch: {} | accuracy: {}% | Loss: {}'.format(epoch + 1, train_accuracy, loss))

correct_test = 0
total_test = 0
for step, (x, y) in enumerate(valid_dataloader):
    b_x = Variable(x, requires_grad=False)
    b_y = Variable(y, requires_grad=False)
    out = model(b_x)
    loss = loss_func(out, b_y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    predicted = torch.max(out.data, 1)[1]
    total_test += len(b_y)
    correct_test += (predicted == b_y).float().sum()
test_accuracy = 100 * correct_test / float(total_test)
print('train_dataloader | accuracy: {}% '.format(test_accuracy))

Downloading: "https://github.com/pytorch/vision/zipball/v0.10.0" to /root/.cache/torch/hub/v0.10.0.zip


Epoch: 1 | accuracy: 43.27799987792969% | Loss: 1.1996657848358154
Epoch: 2 | accuracy: 57.70000076293945% | Loss: 1.036866307258606
Epoch: 3 | accuracy: 63.104000091552734% | Loss: 1.058481216430664
Epoch: 4 | accuracy: 66.60600280761719% | Loss: 0.9383122324943542
Epoch: 5 | accuracy: 69.93800354003906% | Loss: 1.035589575767517
Epoch: 6 | accuracy: 71.94000244140625% | Loss: 0.7491894960403442
Epoch: 7 | accuracy: 73.5479965209961% | Loss: 0.8618677854537964
Epoch: 8 | accuracy: 75.20999908447266% | Loss: 0.7155677080154419
Epoch: 9 | accuracy: 76.2040023803711% | Loss: 0.44183674454689026
Epoch: 10 | accuracy: 77.4260025024414% | Loss: 0.435247004032135
train_dataloader | accuracy: 77.27999877929688% 


## Problems

1. (30%) Finish the rest of the codes for Problem 1 and Problem 2 according to the hint. (2 code cells in total.)
2. Train small model (resnet18) and big model (resnet50) from scratch on `sixteenth_train_dataloader`, `half_train_dataloader`, and `train_dataloader` respectively.
3. (30%) Achieve the best performance given all training data using whatever model and training strategy.  
  (You cannot use the model that was pretrained on CIFAR10)



## Discussion

Write down your insights in the report. The file name should be report.pdf.
For the following discussion, please present the results graphically as shown in Fig. 1 and discuss them.

- (30%) The relationship between the accuracy, model size, and the training dataset size.  
    (Total 6 models. Small model trains on the sixteenth, half, and all data. Big model trains on the sixteenth, half, and all data. If the result is different from Fig.1, please explain the possible reasons.)
- (10%) What if we train the ResNet with ImageNet initialized weights (`weights="IMAGENET1K_V1"`).
Please explain why the relationship changed this way?

Hint: You can try different hyperparameters combinations when training the models.

## Credits

1. [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html)