# Google Colab setup with Google Drive folder

This notebook provides the code you need to set up Google Colab to run and import files from within a Google Drive folder.

This will allow you to upload assignment code to your Google Drive and then run the code on Google Colab machines (with free GPUs if needed). 

You will need to create a folder in your Google Drive to hold your assignments, and you will need to open Colaboratory from within this folder before running the setup code (check the link above to see how).

# Mount Google Drive

This will allow the Colab machine to access Google Drive folders by mounting the drive on the machine. You will be asked to copy and paste an authentication code.

In [None]:
# Skip this step because I'm not using Colab.
# from google.colab import drive
# drive.mount('/content/gdrive/')

In [None]:
# ls

# Change directory to allow imports


As noted above, you should create a Google Drive folder to hold all your assignment files. Add this code to the top of any Python notebook you run so that it can import Python files from your Drive assignment folder (change the file path below to your own assignment folder). Following the handout, you should have a directory "SFU_CMPT_CV_lab2" on Google Drive containing a "data" directory, which in turn contains three tar.gz files.

In [1]:
# Skip this step because I'm not using Colab.
# import os
# os.chdir("/content/gdrive/My Drive/SFU_CMPT_CV_lab2")

# `!cd` runs in a subshell and does not persist; `%cd` changes the notebook's working directory.
%cd /scratch/MJ/762-Assignment-2-Code
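
The idea of changing into the assignment folder so that its modules become importable can be sketched in plain Python. This is a minimal sketch, not the notebook's own code: `use_project_dir` is a hypothetical helper, and the demonstration uses a throwaway temporary directory in place of a real Drive or scratch folder.

```python
import os
import sys
import tempfile


def use_project_dir(path: str) -> bool:
    """Make `path` the working directory and importable; return False if it is missing."""
    if not os.path.isdir(path):
        return False
    os.chdir(path)
    if path not in sys.path:
        # Putting the folder first lets `import dataset` etc. resolve to files inside it.
        sys.path.insert(0, path)
    return True


# Demonstration with a temporary directory (an assumption; substitute your own folder).
demo_dir = os.path.realpath(tempfile.mkdtemp())
changed = use_project_dir(demo_dir)
missing = use_project_dir(os.path.join(demo_dir, "does-not-exist"))
print(changed, missing)
```

Returning a flag instead of raising makes it easy to print a clear message when the folder path has not been updated yet.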

In [5]:
!ls # Check if this is your folder

[0m[01;34m__pycache__[0m/      dataset_test.py       epoch_visualizer.py  model_test.py
[01;34mcheckpoints.old[0m/  densenet.py           infer.py             requirements.txt
constant.py       densenet_test.py      lab2.ipynb           train.py
[01;34mdata[0m/             epoch_logger.py       [01;34mlogs[0m/
dataset.py        epoch_logger_test.py  model.py


# Copy data to local dir

In [None]:
# !mkdir /data
# !cp data/cifar100.tar.gz /data/
# !tar -xf /data/cifar100.tar.gz -C /data/
# !cp data/test.tar.gz /data
# !tar -xf /data/test.tar.gz -C /data
# !cp data/train.tar.gz /data
# !tar -xf /data/train.tar.gz -C /data/

In [7]:
!ls data/

[0m[01;34mcifar100[0m/  [01;31mcifar100.tar.gz[0m  [01;34mtest[0m/  [01;31mtest.tar.gz[0m  [01;34mtrain[0m/  [01;31mtrain.tar.gz[0m


# Activate virtual environment

The codebase requires Python 3.8 or later because it uses `typing.Final` throughout.

In [2]:
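# Note: `!source` runs in a temporary subshell, so it does not change the environment of the running kernel.
!source /scratch/MJ/venv/762-Assignment-2-Code/bin/activate.fish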
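
To confirm that the running kernel actually meets the version requirement, a small check along the following lines may help. It is only a sketch: `MIN_PYTHON` and `python_is_supported` are hypothetical names, not part of the assignment codebase.

```python
import sys
from typing import Final  # available from Python 3.8 onward

# Minimum interpreter version assumed from the requirement stated above.
MIN_PYTHON: Final = (3, 8)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


print(sys.version_info[:2], python_is_supported())
```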

# Set up GPU and PyTorch

First, ensure that your notebook on Colaboratory is set up to use GPU. After opening the notebook on Colaboratory, go to Edit>Notebook settings, select Python 3 under "Runtime type," select GPU under "Hardware accelerator," and save.

Next, install PyTorch:

In [8]:
# !pip3 install torch torchvision
!pip3 install -r requirements.txt



Make sure that PyTorch is installed and works with the GPU:

In [9]:
import torch
a = torch.Tensor([1]).cuda()
print(a)


tensor([1.], device='cuda:0')


In [10]:
torch.cuda.is_available()

True

# Part 1

In [None]:
"""Headers"""

from __future__ import print_function
from PIL import Image
import os
import os.path
import numpy as np
import sys
if sys.version_info[0] == 2:
    import cPickle as pickle
else:
    import pickle

import torch.utils.data as data
from torchvision.datasets.utils import download_url, check_integrity

import csv
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import torch
import torch.utils.data
import torchvision
import torchvision.transforms as transforms

from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F

np.random.seed(111)
torch.cuda.manual_seed_all(111)
torch.manual_seed(111)



## **Just execute the cell below. This is the dataloader. DO NOT CHANGE ANYTHING IN HERE!**


In [11]:
""""""

# I've moved the dataset script into an external module.
from dataset import CIFAR100_SFU_CV

This file has been adapted from the easy-to-use tutorial released by PyTorch:
http://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

Training an image classifier
----------------------------

We will do the following steps in order:

1. Load the CIFAR100_SFU_CV training, validation and test datasets using
   torchvision. Use torchvision.transforms to apply transforms on the
   dataset.
2. Define a Convolutional Neural Network - BaseNet
3. Define a loss function and optimizer
4. Train the network on training data and check performance on val set.
   Plot train loss and validation accuracies.
5. Try the network on test data and create a .csv file for submission to Kaggle

In [None]:
# These settings are unused.
# Refer to `train.py` and `constant.py` for those settings.

# <<TODO#5>> Based on the val set performance, decide how many
# epochs are apt for your model.
# ---------
# EPOCHS = 15
# ---------

# IS_GPU = True
# TEST_BS = 256
# TOTAL_CLASSES = 100
# TRAIN_BS = 32
# PATH_TO_CIFAR100_SFU_CV = "/data/"

In [13]:
!ls data/cifar100/

test_cs543  train_cs543


In [None]:
# Unused in this notebook because it's inefficient (wastes 30 more seconds per epoch; not even using `torch.no_grad()`; should not mix numpy and torch functions).
# Refer to the `MARK: Validation` part of `train.py` for validation accuracy calculation.

# from train import calculate_val_accuracy

**1. Loading CIFAR100_SFU_CV**

We modify the dataset to create the CIFAR100_SFU_CV dataset, which consists of 45,000 training images (450 per class), 5,000 validation images (50 per class), and 10,000 test images (100 per class). The train and val datasets have labels, while all labels in the test set are set to 0.
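
The split sizes above are consistent with holding out 50 of the 500 training images that standard CIFAR-100 provides per class. The actual split logic lives in the dataset module and is not shown here, so the following is only an illustrative sketch of one way to produce such a per-class split. `split_per_class` and the synthetic `labels` list are assumptions for the example.

```python
import random
from collections import Counter, defaultdict

NUM_CLASSES = 100
IMAGES_PER_CLASS = 500  # standard CIFAR-100 training set size per class
VAL_PER_CLASS = 50


def split_per_class(labels, val_per_class, seed=0):
    """Return (train_indices, val_indices) with `val_per_class` held out from every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        val_idx.extend(indices[:val_per_class])
        train_idx.extend(indices[val_per_class:])
    return train_idx, val_idx


# Synthetic labels standing in for the real training labels.
labels = [c for c in range(NUM_CLASSES) for _ in range(IMAGES_PER_CLASS)]
train_idx, val_idx = split_per_class(labels, VAL_PER_CLASS)
val_counts = Counter(labels[i] for i in val_idx)
print(len(train_idx), len(val_idx))
```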


In [None]:
# torchvision datasets return PIL images; transforms.ToTensor() converts them to
# Tensors in the range [0, 1], and a subsequent transforms.Normalize() can map
# them to a normalized range such as [-1, 1].

# Refer to `train.py` and `constant.py` for transformations and datasets.
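
The range arithmetic behind that comment can be checked by hand. The sketch below assumes a per-channel mean and standard deviation of 0.5, as in the PyTorch CIFAR-10 tutorial. The values actually used in `train.py` and `constant.py` may differ, and the helper names are hypothetical.

```python
def to_unit_range(pixel_value: int) -> float:
    """Mimic the scaling step of ToTensor(): 8-bit intensity -> [0, 1]."""
    return pixel_value / 255.0


def normalize(value: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Mimic Normalize(mean, std) for a single channel value."""
    return (value - mean) / std


darkest = normalize(to_unit_range(0))
brightest = normalize(to_unit_range(255))
middle = normalize(0.5)
print(darkest, middle, brightest)
```

With mean 0.5 and standard deviation 0.5, the endpoints 0 and 1 map to -1 and 1, which is where the [-1, 1] range comes from.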

2. Define a Convolutional Neural Network

In [None]:
# Refer to `densenet.py` for my network.

3. Define a Loss function and optimizer

In [None]:
# Refer to the `main` function in `train.py`.
# I used Cross-Entropy loss and SGD with weight decay and a high momentum.
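
For readers unfamiliar with these two ingredients, here is a minimal pure-Python sketch of a single-sample cross-entropy loss and one SGD update with momentum and L2 weight decay (the update form `torch.optim.SGD` uses with zero dampening). The hyperparameter values are placeholders chosen for the example, not the ones used in `train.py`.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class for one sample."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target]


def sgd_step(params, grads, velocity, lr, momentum, weight_decay):
    """Update `params` and `velocity` in place for one SGD step."""
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + weight_decay * p                    # L2 penalty folded into the gradient
        velocity[i] = momentum * velocity[i] + g    # momentum buffer
        params[i] = p - lr * velocity[i]


# A model that outputs identical logits for 100 classes has loss ln(100).
uniform_loss = cross_entropy([0.0] * 100, target=3)

params, velocity = [1.0, -2.0], [0.0, 0.0]
sgd_step(params, [0.5, 0.5], velocity, lr=0.1, momentum=0.9, weight_decay=0.0)
after_first = list(params)
sgd_step(params, [0.5, 0.5], velocity, lr=0.1, momentum=0.9, weight_decay=0.0)

decayed = [1.0]
sgd_step(decayed, [0.0], [0.0], lr=1.0, momentum=0.0, weight_decay=0.1)
print(uniform_loss, after_first, params, decayed)
```

The second step moves further than the first even though the gradient is unchanged, which is the effect of a high momentum value.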

4. Train the network and save the training log to a file

The DenseNet model is very deep and can take hours to train.

Don't save the training log in this notebook because it is very long (I trained for 400 epochs).
In fact, it's recommended NOT to use a Jupyter notebook for training, for stability reasons.
Better alternatives include Docker containers, `screen`, and `tmux`.

In [None]:
!mkdir -p checkpoints
!python train.py --train_batch_size=300 > checkpoints/log.txt
# Refer to train.py for training loop.
# All training logs and weights can be obtained at: https://github.com/MacJim/CMPT-762-Assignment-2-Checkpoints

5. Visualize training and validation losses

In [14]:
!python epoch_visualizer.py

6. Try the network on test data, and create .csv file

In [15]:
!python infer.py --checkpoint_filename=checkpoints/400.pth --csv_filename=0.4-300batch-predictions.csv
# Refer to `infer.py` for the inference process.

Working directory: /scratch/MJ/762-Assignment-2-Code


# Part 2

In [7]:
"""Headers"""
import os
import os.path as osp
import time

%matplotlib inline
import matplotlib.pyplot as plt

import torch
import torch.nn as nn
import torchvision.models as models
from torchvision import transforms
import torch.optim as optim

from torchvision import datasets

from PIL import Image

# Pre-Trained Model

TODO1: Load pretrained ResNet model. Experiment with different models.

TODO2: Replace last fc layer

TODO3: Forward pass

In [4]:
class PreTrainedResNet(nn.Module):
  def __init__(self, num_classes: int, feature_extracting: bool):
    super(PreTrainedResNet, self).__init__()
    
    # MARK: 1. Load pre-trained ResNet Model
    self.resnet18 = models.resnet18(pretrained=True)

    # Set gradients to false (freeze)
    if feature_extracting:
      for param in self.resnet18.parameters():
          param.requires_grad = False
    
    # Replace last fc layer
    num_feats = self.resnet18.fc.in_features

    # MARK: 2: Replace fc layer in resnet to a linear layer of size (num_feats, num_classes)
    self.resnet18.fc = nn.Linear(num_feats, num_classes)

  def forward(self, x):
    # MARK: 3. Forward pass x through the model
    x = self.resnet18(x)
    return x

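  def set_train_last_only(self, new_val: bool):
    # Freeze (new_val=True) or unfreeze (new_val=False) the backbone
    for param in self.resnet18.parameters():
      param.requires_grad = not new_val
    # The replaced fc layer must stay trainable in both modes
    for param in self.resnet18.fc.parameters():
      param.requires_grad = True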


# Train

In [5]:
def train(model, optimizer, scheduler, criterion, epoch, num_epochs):
  model.train()
  epoch_loss = 0.0
  epoch_acc = 0.0
  
  for batch_idx, (images, labels) in enumerate(dataloaders['train']):
    #zero the parameter gradients
    optimizer.zero_grad()
    
    #move to GPU
    images, labels = images.cuda(), labels.cuda()
    
    #forward
    outputs = model.forward(images)
    
    loss = criterion(outputs, labels)
    
    _, preds = torch.max(outputs.data, 1)
    
    loss.backward()
    optimizer.step()
    
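    # Note: `loss` is the batch mean, so dividing the summed value by the dataset
    # size below reports roughly (per-sample loss / batch size).
    epoch_loss += loss.item()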
    epoch_acc += torch.sum(preds == labels).item()

  scheduler.step()

  epoch_loss /= dataset_sizes['train']
  epoch_acc /= dataset_sizes['train']
  
  print('TRAINING Epoch %d/%d Loss %.4f Accuracy %.4f' % (epoch, num_epochs, epoch_loss, epoch_acc))
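
When reading the logged loss values, keep in mind that `nn.CrossEntropyLoss` averages over the batch by default, while `epoch_loss` is divided by the number of samples in the dataset. The pure-Python sketch below, with made-up loss values, illustrates how that combination scales the reported number by roughly one over the batch size. The function names are hypothetical.

```python
def summed_batch_means_over_n(per_sample_losses, batch_size):
    """Accumulate one mean per batch, then divide by the dataset size (as in `train`)."""
    total = 0.0
    for start in range(0, len(per_sample_losses), batch_size):
        batch = per_sample_losses[start:start + batch_size]
        total += sum(batch) / len(batch)  # value a mean-reduced criterion would return
    return total / len(per_sample_losses)


def per_sample_mean(per_sample_losses):
    """True average loss per sample."""
    return sum(per_sample_losses) / len(per_sample_losses)


losses = [2.0] * 64  # made-up per-sample losses
reported = summed_batch_means_over_n(losses, batch_size=32)
actual = per_sample_mean(losses)
print(reported, actual, actual / reported)
```

Multiplying each batch loss by the batch size before accumulating would recover the per-sample average. Accuracy is unaffected because it is accumulated as a count.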

# Main

1. Vary hyperparams
2. Data augmentation

In [18]:
#TODO: Vary Hyperparams

# All layers hyperparameters
NUM_EPOCHS = 70
LEARNING_RATE = 0.0005
BATCH_SIZE = 32
RESNET_LAST_ONLY = False  # True fine-tunes only the last layer; False fine-tunes the entire network

# Final layer hyperparameters
# NUM_EPOCHS = 70
# LEARNING_RATE = 0.005
# BATCH_SIZE = 256
# RESNET_LAST_ONLY = True  # True fine-tunes only the last layer; False fine-tunes the entire network

root_path = 'data/'  # If your data is in a different folder, set the path accordingly

data_transforms = {
    'train': transforms.Compose([
#         transforms.RandomRotation(10, resample=Image.BILINEAR),
        # transforms.RandomAffine(10, scale=(0.95, 1.05), resample=Image.BILINEAR),
        transforms.Resize(256),
#         transforms.RandomCrop(224),
        transforms.RandomResizedCrop(224, scale=(0.8, 1.0), ratio=(4/5, 6/5)),
#         transforms.CenterCrop(224),
        #TODO: Use transforms.RandomResizedCrop() instead of CenterCrop(), plus RandomRotation() and RandomHorizontalFlip()
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    ]),
    'test': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    ]),
}

# loading datasets with PyTorch ImageFolder
image_datasets = {x: datasets.ImageFolder(os.path.join(root_path, x),
                                          data_transforms[x])
                  for x in ['train', 'test']}

# defining data loaders to load data using image_datasets and transforms, here we also specify batch size for the mini batch
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=BATCH_SIZE,
                                             shuffle=True, num_workers=6)
              for x in ['train', 'test']}

dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'test']}
class_names = image_datasets['train'].classes

#Initialize the model
model = PreTrainedResNet(len(class_names), RESNET_LAST_ONLY)
model = model.cuda()

#Setting the optimizer and loss criterion
optimizer = optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=0.9)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=NUM_EPOCHS)
criterion = nn.CrossEntropyLoss()

#Begin Train
for epoch in range(NUM_EPOCHS):
    train(model, optimizer, scheduler, criterion, epoch+1, NUM_EPOCHS)
  
print("Finished Training")
print("-"*10)

TRAINING Epoch 1/70 Loss 0.1687 Accuracy 0.0087
TRAINING Epoch 2/70 Loss 0.1575 Accuracy 0.0393
TRAINING Epoch 3/70 Loss 0.1477 Accuracy 0.1160
TRAINING Epoch 4/70 Loss 0.1381 Accuracy 0.2123
TRAINING Epoch 5/70 Loss 0.1284 Accuracy 0.3260
TRAINING Epoch 6/70 Loss 0.1193 Accuracy 0.3993
TRAINING Epoch 7/70 Loss 0.1110 Accuracy 0.4717
TRAINING Epoch 8/70 Loss 0.1034 Accuracy 0.5310
TRAINING Epoch 9/70 Loss 0.0963 Accuracy 0.5867
TRAINING Epoch 10/70 Loss 0.0899 Accuracy 0.6397
TRAINING Epoch 11/70 Loss 0.0839 Accuracy 0.6577
TRAINING Epoch 12/70 Loss 0.0789 Accuracy 0.6943
TRAINING Epoch 13/70 Loss 0.0736 Accuracy 0.7330
TRAINING Epoch 14/70 Loss 0.0695 Accuracy 0.7567
TRAINING Epoch 15/70 Loss 0.0652 Accuracy 0.7687
TRAINING Epoch 16/70 Loss 0.0613 Accuracy 0.7937
TRAINING Epoch 17/70 Loss 0.0577 Accuracy 0.8200
TRAINING Epoch 18/70 Loss 0.0545 Accuracy 0.8330
TRAINING Epoch 19/70 Loss 0.0512 Accuracy 0.8497
TRAINING Epoch 20/70 Loss 0.0483 Accuracy 0.8617
TRAINING Epoch 21/70 Loss 0.0
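
The learning rate during this run follows the cosine annealing schedule configured with `T_max=NUM_EPOCHS`. As a rough illustration, the closed-form expression given in the PyTorch documentation can be evaluated directly. `cosine_annealing_lr` is a hypothetical helper, and `eta_min=0` is the scheduler's default.

```python
import math


def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Closed-form learning rate after `epoch` scheduler steps."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2


LEARNING_RATE = 0.0005
NUM_EPOCHS = 70

schedule = [cosine_annealing_lr(LEARNING_RATE, e, NUM_EPOCHS) for e in range(NUM_EPOCHS + 1)]
for epoch in (0, 35, 70):
    print(epoch, f"{schedule[epoch]:.6f}")
```

The rate starts at `LEARNING_RATE`, passes through half that value at the midpoint, and decays to zero by the final epoch.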

# Test

In [10]:
def test(model, criterion, repeats=2):
  model.eval()
  
  test_loss = 0.0
  test_acc = 0.0
  
  with torch.no_grad():
    for itr in range(repeats):
      for batch_idx, (images, labels) in enumerate(dataloaders['test']):
        #move to GPU
        images, labels = images.cuda(), labels.cuda()

        #forward
        outputs = model.forward(images)

        loss = criterion(outputs, labels)

        _, preds = torch.max(outputs.data, 1)

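        # Note: `loss` is the batch mean, so the value reported below is roughly
        # (per-sample loss / batch size).
        test_loss += loss.item()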
        test_acc += torch.sum(preds == labels).item()

    test_loss /= (dataset_sizes['test']*repeats)
    test_acc /= (dataset_sizes['test']*repeats)

    print('Test Loss: %.4f Test Accuracy %.4f' % (test_loss, test_acc))


In [19]:
test(model, criterion)

Test Loss: 0.0599 Test Accuracy 0.5641


# Visualizing the model predictions

Only for visualizing. Nothing to be done here.

In [None]:
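import numpy as np  # not imported in the Part 2 headers above

def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    # Undo transforms.Normalize() so that colors display correctly
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = np.clip(std * inp + mean, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(1)  # pause a bit so that plots are updated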
    
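@torch.no_grad()  # no gradients are needed for visualization
def visualize_model(model, num_images=8):
    model.eval()  # use inference-mode batch norm statistics
    images_so_far = 0
    fig = plt.figure()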

    for batch_idx, (images, labels) in enumerate(dataloaders['test']):
        #move to GPU
        images, labels = images.cuda(), labels.cuda()
        
        outputs = model(images)
        
        _, preds = torch.max(outputs.data, 1)
       

        for j in range(images.size()[0]):
            images_so_far += 1
            ax = plt.subplot(num_images//2, 2, images_so_far)
            ax.axis('off')
            ax.set_title('class: {} predicted: {}'.format(class_names[labels.data[j]], class_names[preds[j]]))

            imshow(images.cpu().data[j])

            if images_so_far == num_images:
                return

In [None]:
visualize_model(model)