# CS 7643 Assignment 2 Part 2: Implement and train a network on CIFAR-10 using PyTorch

Convolutional Neural Networks (CNNs) are one of the major advancements in
computer vision over the past decade. In this assignment, you will complete
a simple CNN architecture from scratch and learn how to implement CNNs
with PyTorch, one of the most commonly used deep learning frameworks.
You will also run experiments on imbalanced datasets to evaluate
your model and techniques for handling imbalanced data.

# Setup Code

Before getting started we need to run some standard code to set up our environment. You'll need to execute this code again each time you start the notebook.

First, run this cell to load the [autoreload](https://ipython.readthedocs.io/en/stable/config/extensions/autoreload.html?highlight=autoreload) extension. This enables us to modify `.py` source files and reintegrate them into the notebook, ensuring a smooth editing and debugging experience.


In [1]:
%load_ext autoreload
%autoreload 2

### Google Colab Setup
Next we need to run a few commands to set up our environment on Google Colab. If you are running this notebook on a local machine you can skip this section.

Run the following cell to mount your Google Drive. Follow the link and sign in to your Google account (the same account you used to store this notebook).

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Now recall the path in your Google Drive where you uploaded this notebook and fill it in below. If everything works, running the next cell should print the filenames from the assignment:

```
['CS7643-Assignment2-2.ipynb', 'cs7643', 'checkpoints', 'losses', 'configs', 'models', 'tests']
```

In [2]:
import os

# TODO: Fill in the Google Drive path where you uploaded this assignment
# Example: If you create a Fall2023 folder and put all the files under A1 folder, then 'Fall2023/A1'
GOOGLE_DRIVE_PATH_POST_MYDRIVE = None
#GOOGLE_DRIVE_PATH_POST_MYDRIVE = r"CSE 7643 - DL/DL PS 2/DL_HW_2_collab/part2-pytorch"
GOOGLE_DRIVE_PATH = os.path.join('/content', 'drive', 'MyDrive', GOOGLE_DRIVE_PATH_POST_MYDRIVE)
print(os.listdir(GOOGLE_DRIVE_PATH))

TypeError: join() argument must be str, bytes, or os.PathLike object, not 'NoneType'

### Local Setup or Google Colab
Run the cell below regardless of your setup to set the path.

In [3]:
# if running locally set GOOGLE PATH
import sys
if 'google.colab' in sys.modules:
  print(f'Running in google colab. Our path is `{GOOGLE_DRIVE_PATH}`')
else:
  GOOGLE_DRIVE_PATH = '.'
  print('Running locally.')

Running locally.


After successfully mounting your Google Drive and identifying the path to this assignment, execute the following cell to enable us to import from the `.py` files of this assignment. If it works correctly, it should print the message (note, you may need to retry this twice if it fails):

```
Roger that from cnn.py!
Roger that from my_model.py!
Roger that from resnet.py!
Roger that from twolayer.py!

Roger that from focal_loss.py!
```

as well as the last edit time for the files `cnn.py`, `my_model.py`, `resnet.py`, `twolayer.py`, and `focal_loss.py`.

In [4]:
import sys
import numpy as np
import math
sys.path.append(GOOGLE_DRIVE_PATH)

from cs7643.env_prob import say_hello_do_you_copy

say_hello_do_you_copy(GOOGLE_DRIVE_PATH)


---------- Models ------------------
Roger that from cnn.py!
Roger that from my_model.py!
Roger that from resnet.py!
Roger that from twolayer.py!
cnn.py last edited on Mon Sep  8 20:15:29 2025
my_model.py last edited on Mon Sep  8 20:15:29 2025
resnet.py last edited on Mon Sep  8 20:15:29 2025
twolayer.py last edited on Mon Sep  8 20:15:29 2025

---------- Losses ------------------
Roger that from focal_loss.py!
focal_loss.py last edited on Mon Sep  8 20:15:29 2025


# Load the CIFAR10 dataset
Data loading is the first step of any machine learning pipeline. Run the following cell to download the CIFAR10 dataset.

In [5]:
from cs7643.cifar10 import CIFAR10

cifar10_ds = CIFAR10(GOOGLE_DRIVE_PATH + '/data/cifar10', download=True, train=True)

Files already downloaded and verified


We will use GPUs to accelerate our computation in this notebook. Run the following to make sure GPUs are enabled:

In [6]:
import torch

device = 'mps' if torch.backends.mps.is_available() else ('cuda' if torch.cuda.is_available() else 'cpu')
print("Using device = " + device)
if device == 'cpu':
    print("WARNING: Using CPU will cause slower train times")


Using device = cpu


# Training
The first step in working with PyTorch is to familiarize yourself with
its basic training step. Read through the [PyTorch Tutorial](https://pytorch.org/tutorials/beginner/basics/intro.html) and complete the `_compute_loss_update_params` function in `./solver.py`.
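
The training step described in the tutorial follows a fixed pattern: forward pass, loss computation, gradient reset, backward pass, and parameter update. The sketch below illustrates that pattern on a toy linear classifier. The helper name `train_step`, the toy data, and the learning rate are illustrative assumptions; this is not the `solver.py` implementation, whose signature and return values may differ.

```python
import torch
import torch.nn as nn


def train_step(model, criterion, optimizer, data, target):
    """Run one optimization step and return (output, loss). Hypothetical helper."""
    model.train()
    output = model(data)               # forward pass
    loss = criterion(output, target)   # scalar loss for the batch
    optimizer.zero_grad()              # clear gradients left over from the previous step
    loss.backward()                    # backpropagate through the graph
    optimizer.step()                   # update the parameters
    return output, loss


torch.manual_seed(0)
model = nn.Linear(4, 3)                               # toy classifier: 4 features, 3 classes
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = torch.randn(8, 4)
target = torch.randint(0, 3, (8,))

# Repeating the step on the same batch should drive the loss down.
losses = [train_step(model, criterion, optimizer, data, target)[1].item() for _ in range(20)]
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

Calling `optimizer.zero_grad()` before `loss.backward()` matters because PyTorch accumulates gradients across backward passes by default.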

## PyTorch Model
You will now implement some actual networks with PyTorch. We provide
some starter files for you in `./models`. The models for you to implement are
as follows:

* **Two-Layer Network**. This is the same network you have implemented from scratch in assignment 1. You will build the model with two fully connected layers and a sigmoid activation function in between the two layers. Please implement the model as instructed in `./models/twolayer.py`.

* **Vanilla Convolutional Neural Network**. You will build the model with a
convolution layer, a ReLU activation, a max-pooling layer, followed by a fully connected layer for classification. Your convolution layer should use **32 output channels**, a **kernel size of 7** with **stride 1** and **zero padding**. Your max-pooling layer should use a **kernel size of 2** and **stride of 2**. The fully connected layer should have **10 output features**. Please implement the model as instructed in `./models/cnn.py`.

* **Your Own Network**. You are now free to build your own model. You may borrow ideas from well-known networks, but using those networks as-is is **NOT** allowed. In other words, you must build your model from scratch, which also means that using pre-trained weights of any kind is **NOT** allowed. Please implement your model in `./models/my_model.py`.
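
When wiring a convolutional model to a fully connected layer, the number of input features of the linear layer has to match the flattened output of the preceding layers. A small sketch of that arithmetic for the Vanilla CNN, using the standard output-size formula `floor((size + 2 * padding - kernel) / stride) + 1`. The helper name `conv_output_size` is hypothetical, and reading "zero padding" as `padding=0` is an assumption.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


# CIFAR-10 images are 3 x 32 x 32.
after_conv = conv_output_size(32, kernel=7, stride=1, padding=0)  # assumes padding=0
after_pool = conv_output_size(after_conv, kernel=2, stride=2)
fc_in_features = 32 * after_pool * after_pool                     # 32 output channels

print(after_conv, after_pool, fc_in_features)
```

The same helper can be applied layer by layer to size the first linear layer of a custom model.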

We provide a configuration file for each of these three models. For the
Two-Layer Network and the Vanilla CNN, you need to train the model without modifying the configuration file. The script automatically saves the weights of the best model at the end of training. We will evaluate your implementation by loading your model weights and evaluating the model on CIFAR-10 test data. You should expect the accuracy of the Two-Layer Network and the Vanilla CNN to be around 0.3 and 0.4, respectively.

For your own network, you are free to tune any hyper-parameters to
obtain better accuracy. Your final accuracy must be above 0.5 to receive
at least partial credit. Please refer to the GradeScope auto-test results
for the requirements for full credit. Try to keep your submission **under
100 MB or GradeScope may not accept it**. Finally, please make sure
the checkpoints of each model are saved into `./checkpoints`.
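
The sketch below shows one way to check that a saved checkpoint round-trips correctly and to see how large it is on disk. It assumes checkpoints are stored as a `state_dict`; the stand-in model and the file name `example.pth` are hypothetical, and the provided training script may save its files differently.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(3072, 10)  # stand-in for one of the assignment models

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.pth")  # hypothetical file name
    torch.save(model.state_dict(), path)        # save only the weights

    size_mb = os.path.getsize(path) / 1e6       # size on disk in megabytes

    restored = nn.Linear(3072, 10)
    restored.load_state_dict(torch.load(path, weights_only=True))

# Every tensor in the restored model should match the original exactly.
same_weights = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(f"checkpoint size: {size_mb:.2f} MB, weights restored: {same_weights}")
```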

Select a configuration file from the list, then run the cell to train your model. To select a custom config, select "Show code" and specify the path to your config file. **Note that you may have to restart the Jupyter kernel after updating the files above before running the snippet below.**

In [34]:
import yaml
from solver import Solver

#CHANGE HERE FOR EACH MODEL!!
config_file = "config_mymodel" # feel free to change to  ["config_mymodel", "config_twolayer", "config_vanilla_cnn", or other]

config_file = GOOGLE_DRIVE_PATH + "/configs/" + config_file + ".yaml"

print("Training a model using configuration file " + config_file)

with open(config_file, "r") as read_file:
  config = yaml.safe_load(read_file)

kwargs = {}
for key in config:
  for k, v in config[key].items():
    if k != 'description':
      kwargs[k] = v

kwargs['device'] = device
kwargs['path_prefix'] = GOOGLE_DRIVE_PATH

print(kwargs)

solver = Solver(**kwargs)
solver.train()

Training a model using configuration file ./configs/config_mymodel.yaml
{'batch_size': 128, 'learning_rate': 0.0001, 'reg': 0.0005, 'epochs': 10, 'steps': [6, 8], 'warmup': 0, 'momentum': 0.9, 'gamma': 1, 'model': 'MyModel', 'imbalance': 'regular', 'save_best': True, 'loss_type': 'CE', 'device': 'cpu', 'path_prefix': '.'}
MyModel(
  (conv1): Conv2d(3, 32, kernel_size=(6, 6), stride=(2, 2), padding=(2, 2))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(4, 4), stride=(1, 1))
  (fc1): Linear(in_features=256, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=60, bias=True)
  (fc3): Linear(in_features=60, out_features=10, bias=True)
)
Epoch: [0][0/391]	Time 0.016 (0.016)	Loss 2.3087 (2.3087)	Prec @1 0.0938 (0.0938)	
Epoch: [0][10/391]	Time 0.013 (0.014)	Loss 2.3047 (2.3033)	Prec @1 0.0938 (0.0966)	
Epoch: [0][20/391]	Time 0.013 (0.013)	Loss 2.3067 (2.3055)	Prec @1 0.1172 (0.0945)	
Epoch: [0

KeyboardInterrupt: 
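
The nested loop in the training cell flattens a two-level config into keyword arguments for `Solver`, dropping any `description` entries. A minimal sketch of that flattening, with a plain dictionary standing in for the parsed YAML. The section names `Train` and `network` and the description strings are hypothetical; the other values are taken from the printed output above.

```python
# Stand-in for the result of yaml.safe_load(); section names are hypothetical.
config = {
    "Train": {
        "batch_size": 128,
        "learning_rate": 0.0001,
        "epochs": 10,
        "description": "training hyperparameters",
    },
    "network": {
        "model": "MyModel",
        "description": "which model to build",
    },
}

# Merge every section into one flat dict, skipping the free-text descriptions.
kwargs = {
    k: v
    for section in config.values()
    for k, v in section.items()
    if k != "description"
}
print(kwargs)
```

Because the sections are merged into one flat dictionary, a key that appears in two sections would be overwritten by the later one.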

Let's test your model implementations. **Make sure to train your models and create checkpoints before running the following tests**.

In [15]:
#Let's test your implementation of two layer
!pytest -s {GOOGLE_DRIVE_PATH + '/tests/test_twolayer.py'}

platform win32 -- Python 3.12.0, pytest-8.4.1, pluggy-1.5.0
rootdir: C:\Users\julie\CSE_7643_DL\DL_PS2\part2-pytorch
plugins: anyio-4.7.0
collected 1 item

tests\test_twolayer.py .



In [35]:
#Let's test your implementation of my_model
!pytest -s { GOOGLE_DRIVE_PATH + '/tests/test_mymodel.py'}

platform win32 -- Python 3.12.0, pytest-8.4.1, pluggy-1.5.0
rootdir: C:\Users\julie\CSE_7643_DL\DL_PS2\part2-pytorch
plugins: anyio-4.7.0
collected 3 items

tests\test_mymodel.py EEE

______________ ERROR at setup of TestMyModel.test_accuracy_easy _______________

cls = <class 'tests.test_mymodel.TestMyModel'>

    @classmethod
    def setUpClass(cls):
        """Define the functions to be tested here."""
        basedir = pathlib.Path(__file__).parent.parent.resolve()
        model = MyModel()
        model.eval()
        model.load_state_dict(
>           torch.load(str(basedir) + "/checkpoints/mymodel.pth", weights_only=True)

In [21]:
#Let's test your implementation of Vanilla CNN
!pytest -s {GOOGLE_DRIVE_PATH + '/tests/test_vanilla_cnn.py'}

platform win32 -- Python 3.12.0, pytest-8.4.1, pluggy-1.5.0
rootdir: C:\Users\julie\CSE_7643_DL\DL_PS2\part2-pytorch
plugins: anyio-4.7.0
collected 1 item

tests\test_vanilla_cnn.py .



# Data Wrangling
So far we have worked with well-balanced datasets (each class has roughly
the same number of samples). In practice, however, datasets are often imbalanced.
In this section, you will explore the limitations of the standard training strategy
on this type of dataset. Because this is an exploration, it is up to you to design
experiments or tests that validate whether the methods you try are correct and effective.
You will work with an imbalanced version of CIFAR-10 in this section, and you should use the ResNet-32 model in `./models/resnet.py`.
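
For intuition about what an imbalanced split can look like, the Class-Balanced Loss paper builds long-tailed CIFAR variants by shrinking class sizes exponentially with the class index. The sketch below computes per-class counts under that scheme. The function name and default values are illustrative assumptions, and the dataset produced by this assignment's `imbalance` option may be constructed differently.

```python
def long_tailed_counts(n_max=5000, num_classes=10, imbalance_factor=100):
    """Per-class counts decaying exponentially from n_max to n_max / imbalance_factor."""
    ratio = 1.0 / imbalance_factor
    return [
        int(n_max * ratio ** (i / (num_classes - 1)))
        for i in range(num_classes)
    ]


counts = long_tailed_counts()
print(counts)
print(f"largest / smallest class: {counts[0] / counts[-1]:.0f}")
```

With a split like this, overall accuracy is dominated by the largest classes, so per-class accuracy is usually the more informative metric to report.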

## Class-Balanced Focal Loss
You will implement one possible solution to the imbalance problem: Class-Balanced Focal Loss, following the CVPR-19 paper [Class-Balanced Loss Based on Effective Number of Samples](https://arxiv.org/pdf/1901.05555.pdf). You may also refer to the original [Focal Loss](https://arxiv.org/abs/1708.02002) paper for more details if you are interested. Please implement CB Focal Loss in `./losses/focal_loss.py`.
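
The class-balanced term in that paper weights each class by the inverse of its effective number of samples, `E_n = (1 - beta^n) / (1 - beta)`, and rescales the weights so they sum to the number of classes. A minimal sketch in plain Python for illustration; the function name is hypothetical, and the starter file in `./losses/focal_loss.py` may expect a different interface.

```python
def class_balanced_weights(samples_per_class, beta=0.9999):
    """Weights proportional to 1 / effective number, rescaled to sum to the class count."""
    effective_num = [(1.0 - beta ** n) / (1.0 - beta) for n in samples_per_class]
    raw = [1.0 / e for e in effective_num]
    scale = len(samples_per_class) / sum(raw)
    return [w * scale for w in raw]


weights = class_balanced_weights([5000, 500, 50])
print([round(w, 3) for w in weights])  # rarer classes receive larger weights

# beta = 0 gives every class an effective number of 1, i.e. no re-weighting.
uniform = class_balanced_weights([5000, 500, 50], beta=0.0)
print(uniform)
```

As `beta` approaches 1, the effective number approaches the raw sample count and the weights approach inverse-frequency weighting.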

**Note**: The CVPR-19 paper uses sigmoid CB focal loss (Section 4). Softmax CB focal loss is not described in the paper, but it is easy to derive from the papers mentioned above. You must implement the softmax version to pass the tests.
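
One natural reading of a softmax variant is to take `p_t` as the softmax probability of the true class and apply the focal term `-(1 - p_t)^gamma * log(p_t)`, optionally scaled by a per-class weight. The sketch below follows that reading. It is an assumption rather than the form the assignment tests require; the function name, the example weights, and the mean reduction are illustrative choices.

```python
import torch
import torch.nn.functional as F


def softmax_focal_loss(logits, targets, gamma=2.0, class_weights=None):
    """Focal loss on softmax probabilities; hypothetical sketch with mean reduction."""
    log_probs = F.log_softmax(logits, dim=1)                        # (N, C)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)   # log-prob of the true class
    pt = log_pt.exp()
    loss = -((1.0 - pt) ** gamma) * log_pt                          # down-weight easy examples
    if class_weights is not None:
        loss = loss * class_weights[targets]                        # per-class re-weighting
    return loss.mean()


torch.manual_seed(0)
logits = torch.randn(6, 3, requires_grad=True)
targets = torch.tensor([0, 1, 2, 0, 1, 2])
class_weights = torch.tensor([0.2, 0.8, 2.0])  # example values, not from the paper

ce = F.cross_entropy(logits, targets)
fl_gamma0 = softmax_focal_loss(logits, targets, gamma=0.0)
fl_gamma2 = softmax_focal_loss(logits, targets, gamma=2.0)
fl_weighted = softmax_focal_loss(logits, targets, gamma=2.0, class_weights=class_weights)
print(ce.item(), fl_gamma0.item(), fl_gamma2.item(), fl_weighted.item())
```

With `gamma=0` and no class weights, this reduces to ordinary cross-entropy, which is a convenient sanity check for any implementation.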

Hint: Make sure you are using torch operations throughout your focal loss implementation; otherwise the torch computation graph will not be built properly.
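
A small illustration of why this matters: a value computed with torch operations stays attached to the computation graph, while a value that leaves torch (here via `.item()`) comes back as a constant with no gradient history. The tensor shapes are arbitrary.

```python
import torch

logits = torch.randn(4, 3, requires_grad=True)

# Staying inside torch keeps the result attached to the computation graph.
loss_torch = torch.log_softmax(logits, dim=1).mean()

# Converting to a Python float and back produces a constant with no history.
loss_detached = torch.tensor(logits.mean().item())

print(loss_torch.requires_grad, loss_detached.requires_grad)  # True False

loss_torch.backward()      # gradients flow back to logits
print(logits.grad.shape)
```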

In [None]:
# Test your Focal Loss implementation
!pytest -s {GOOGLE_DRIVE_PATH + '/tests/test_focalloss.py'}

Now, follow the instructions in the report template to obtain the best results possible for ResNet with regular CE loss (you may need to perform extra hyperparameter tuning). Then experiment with the beta parameter. Finally, obtain the best possible results for ResNet using focal loss. You are welcome to change other hyperparameters in the config file as necessary. Make sure to set the `imbalance` config parameter as appropriate.

# Submit Your Work
After completing the notebook for this assignment (`assignment2_2.ipynb`), run the following cell to create a `.zip` file for you to download and then upload to Gradescope.

**Please MANUALLY SAVE `*.py` files before executing the following cell:**

In [None]:
from cs7643.submit import make_a2_2_submission

make_a2_2_submission(GOOGLE_DRIVE_PATH)