## General Colab Tips
- Modify files by opening/editing them in the UI (double-click to open).
- `Right click > Refresh` in the Colab file explorer to update the directory.
- All files are lost when the Colab session disconnects, so make sure back up your work.
- Do **not** use `drive.mount` for your datasets! Reading from GDrive is super slow.
- Instead, place datasets into the `/content/` folder and modify your data accordingly.

**Make a copy of this notebook and modify this to whatever workflow you prefer!**

If you have some additional colab tips, please share them on the discussion forum.

## Setup

First, enable a GPU runtime via `Runtime > Change runtime type > T4 GPU`

Next, upload the your project files to the Colab. You can do this by either
- using Github (**recommended**)
- uploading files manually using the UI

## Github Setup

You can use git from within Google Colab!

For this section, we assume you know how to use git and have already pushed the starter code to a private repo.

It's a good idea to structure your repo something like
```
online_deep_learning/
    homework1/
    homework2/
    ...
```

We highly recommend using this workflow as you'll be able to easily pull/commit your changes after modifying your model on Colab.

To do this, you'll need a personal access token from [https://github.com/settings/tokens](https://github.com/settings/tokens)

The easiest thing to do is select "classic" token and make sure you have the `repo` scope selected to allow access to your private repos.
There's also fine-grained tokens where you can select access to specific repos.

Once you have your token, fill in your information and then run the following cell to clone your git repo to the Colab instance.

## Code Setup

Next let's move into `homework2/` so we can continue setting up the data / code for training.

This will be the main working directory and the training/grading must be run from this directory.


In [7]:
import os
import shutil

# Ensure we are in /content
os.chdir("/content")
print("Current directory:", os.getcwd())

# If a folder named homework2 exists, remove it completely
if os.path.exists("online_deep_learning"):
    shutil.rmtree("online_deep_learning")
    print("Old homework2 folder removed.")
else:
    print("No existing homework2 folder found.")

!pwd

!mkdir /content/online_deep_learning

# @title Clone your GitHub repo to Colab
from getpass import getpass

# Ask for GitHub credentials
username = input("Enter your GitHub username: ")
token = getpass("Enter your GitHub personal access token: ")
repo_url = input("Enter your GitHub repo URL (e.g. https://github.com/username/repo.git): ")

# Format the authenticated URL
https_url = repo_url.replace("https://", f"https://{username}:{token}@")

# Clone the repo
!git clone {https_url}


%ls

# if you don't have a copy of homework2 yet in your git repo
# you can uncomment the lines below to get a copy
#!curl -O https://www.cs.utexas.edu/~bzhou/dl_class/homework2.zip
#!unzip -o homework2.zip
#!rm homework2.zip

Current directory: /content
Old homework2 folder removed.
/content
Enter your GitHub username: thisisanu
Enter your GitHub personal access token: ··········
Enter your GitHub repo URL (e.g. https://github.com/username/repo.git): https://github.com/thisisanu/online_deep_learning
Cloning into 'online_deep_learning'...
remote: Enumerating objects: 343, done.[K
remote: Counting objects: 100% (48/48), done.[K
remote: Compressing objects: 100% (48/48), done.[K
remote: Total 343 (delta 32), reused 0 (delta 0), pack-reused 295 (from 2)[K
Receiving objects: 100% (343/343), 2.99 MiB | 13.67 MiB/s, done.
Resolving deltas: 100% (219/219), done.
[0m[01;34mlogs[0m/  [01;34monline_deep_learning[0m/  [01;34msample_data[0m/


## Dataset Setup

Now that your code is all ready, the next step is to download the datasets.

Note: it's good practice to add data directories like `*/classification_data` to your `.gitignore` so you don't accidently commit them to your repo.

Since the datasets used in this class are relatively small, we can simply re-download them if the compute instance crashes/restarts.

In [8]:
# Move into the repo
%cd /content/online_deep_learning/homework2

# List current directory contents
!ls -l

# Download the dataset
!curl -O https://www.cs.utexas.edu/~bzhou/dl_class/classification_data.zip

# Unzip if needed
!unzip -o classification_data.zip -d .

# List contents of the classification_data folder
!ls -l


[1;30;43mStreaming output truncated to the last 5000 lines.[0m
  inflating: ./classification_data/train/19065.jpg  
  inflating: ./classification_data/train/06947.jpg  
  inflating: ./classification_data/train/16991.jpg  
  inflating: ./classification_data/train/14851.jpg  
  inflating: ./classification_data/train/17387.jpg  
  inflating: ./classification_data/train/00817.jpg  
  inflating: ./classification_data/train/17929.jpg  
  inflating: ./classification_data/train/19054.jpg  
  inflating: ./classification_data/train/03028.jpg  
  inflating: ./classification_data/train/17259.jpg  
  inflating: ./classification_data/train/03016.jpg  
  inflating: ./classification_data/train/12653.jpg  
  inflating: ./classification_data/train/18832.jpg  
  inflating: ./classification_data/train/16127.jpg  
  inflating: ./classification_data/train/02329.jpg  
  inflating: ./classification_data/train/16453.jpg  
  inflating: ./classification_data/train/18507.jpg  
  inflating: ./classification_data

## Setup Verification

Now you should be all set up, check out the `README.md` for additional instructions.

Run this cell to verify your working directory is setup correctly.

Your workspace should be organized as follows:

```
online_deep_learning/
├── homework1/
└── homework2/              <- you should be here
    ├── bundle.py
    ├── classification_data/
    ├── grader/
    ├── homework/
    ├── README.md
    └── requirements.txt
```

In [11]:
!dir classification_data/

classification_data  train  val


## Additional Helper Cells

Now you're on your own! The rest of the provided cells are small helper routines.

If you have any additional helpful colab tips/tricks please share them on the discussion forum.

## Tensorboard (Optional)

You can monitor training using the following command.

Make sure that your training code writes to the corresponding directory.

In [2]:

!mkdir -p logs
!tensorboard --logdir logs

  "[`\000-\040\177-\240\s]+",
  style = re.compile('url\s*\(\s*[^\s)]+?\s*\)\s*').sub(' ', style)
  if not re.match("^\s*([-\w]+\s*:[^:;]*(;\s*|$))*$", style):
  for prop, value in re.findall('([-\w]+)\s*:\s*([^:;]*)', style):
2025-10-25 07:40:36.328590: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1761378036.358028   21153 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1761378036.372104   21153 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1761378036.408585   21153 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1761378036.408649   211

## Training

After you implement your model, modify and run this cell to start training.

Be sure to pass in the appropriate parameters.

In [9]:
from homework.train import train

# List of all models to train
num_epochs = 20
learning_rate =  3e-3
#
model_types = ["linear", "mlp", "mlp_deep", "mlp_deep_residual"]

for model_name in model_types:
    print(f"\nTraining {model_name} model...\n")
    train(
        model_name=model_name,
        num_epoch=num_epochs,
        lr=learning_rate
    )


Training linear model...

CUDA not available, using CPU
Epoch  1 / 20: train_acc=0.6506 val_acc=0.7359
Epoch 10 / 20: train_acc=0.8106 val_acc=0.7519
Epoch 20 / 20: train_acc=0.8394 val_acc=0.7275
Model saved to logs/linear_1025_080339/linear.th

Training mlp model...

CUDA not available, using CPU
Epoch  1 / 20: train_acc=0.6671 val_acc=0.7296
Epoch 10 / 20: train_acc=0.8962 val_acc=0.7786
Epoch 20 / 20: train_acc=0.9308 val_acc=0.7925
Model saved to logs/mlp_1025_080423/mlp.th

Training mlp_deep model...

CUDA not available, using CPU
Epoch  1 / 20: train_acc=0.6541 val_acc=0.7339
Epoch 10 / 20: train_acc=0.8683 val_acc=0.7801
Epoch 20 / 20: train_acc=0.8960 val_acc=0.8156
Model saved to logs/mlp_deep_1025_080734/mlp_deep.th

Training mlp_deep_residual model...

CUDA not available, using CPU
Epoch  1 / 20: train_acc=0.5989 val_acc=0.7183
Epoch 10 / 20: train_acc=0.8387 val_acc=0.7805
Epoch 20 / 20: train_acc=0.8609 val_acc=0.7758
Model saved to logs/mlp_deep_residual_1025_081052/mlp

## Grader

Run the following cell to grade your homework locally.

The Canvas grader uses a different data split for testing,  
so there may be a *small* difference between your local grade and your final grade.

In [10]:
!python3 -m grader homework -vv --disable_color

2025-10-25 08:14:56.437468: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1761380096.481336   29434 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1761380096.496391   29434 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1761380096.534638   29434 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1761380096.534714   29434 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1761380096.534722   29434 computation_placer.cc:177] computation placer alr

## Updating Your Changes

After you've made progress, modify this cell and commit your changes to git.

In [None]:
%ls
!git status

# Be careful not to "git add *" since there are datasets and logs
!git add homework/*.py
!git config --global user.email "GITHUB_EMAIL"
!git config --global user.name "GITHUB_USER"
!git commit -m "update"
!git push origin main

## Tuning

Rather than changing one parameter and re-running the cell above over and over again,  
it is good practice to set up the model/training code so you can "tune" your model in a semi-automatic way.

This cell tunes over the `num_epochs` your training runs for,  
but you could easily modify this to tune over the number of layers, learning rate, etc.

After you find a good set of model hyperparameters, be sure to hard-code them into the constructor,  
since the grader will use the default constructor to load your model!

In [4]:
from homework.train import train


jobs = [
    # Run on short schedule (10 epochs)
    {
        "model_name": "mlp",
        "num_epoch": 10,
        "lr": 1e-3,
        "hidden_dim": 64,
    },
    # Train for longer (20 epochs)
    {
        "model_name": "mlp",
        "num_epoch": 20,
        "lr": 1e-3,
        "hidden_dim": 128,
    },
]

for params in jobs:
    train(**params)

IndentationError: unexpected indent (train.py, line 124)

## Submission

Run the following cell to bundle your submission (modify UTID accordingly).

After the bundler and grader run, right click and download your bundled `.zip` file from the Colab UI.


In [None]:
!python3 bundle.py homework UTID

# optional: run the grader with your bundled homework
!python3 -m grader UTID.zip -vv --disable_color