# DS 542 Fall 2025 Project 1

Your task for this project is to train a better model for the Tree-or-Not data set.
Your model will be constrained to use the architecture in this template.
To train a better model, you will instead practice picking appropriate initializations and regularizations.



## Background

This notebook builds a model detecting trees in images.
The data set is available on GitHub at https://github.com/DL4DS/tree-or-not or on the Shared Compute Cluster at `/projectnb/ds542/materials/tree-or-not`.
The initial data set consists of roughly 500 pictures.
Most of them are from the Boston area, but some are from around the globe.
Most of them were taken outside, but some were taken inside or in more exotic locations.
Many other factors such as lighting, weather, and confounding bushes will make this a challenging problem.

## Outline

1. Run the provided notebook and confirm basic functionality.
2. Pick, describe and demonstrate a training improvement in each of the following categories - early stopping, initialization, learning rate, and other regularization. (40%)
3. Train models with all 16 combinations of the previous 4 choices and plot their training progress. (10%)
4. In one week (10/13), a new validation set will be posted. Show the accuracy of your 16 models on this new validation data.
5. Explain as best you can why each improvement was included or not included in the best performing model. (20%)
6. Save the best performing model for further evaluation by the auto-grader. (30%)


## Modules

In [None]:
# automatically add location of class packages if running on the SCC

import os
import sys

scc_site_packages = "/projectnb/ds542/materials/lib/python3.12/site-packages"
if os.path.isdir(scc_site_packages) and scc_site_packages not in sys.path:
    sys.path.append(scc_site_packages)


In [None]:
import imageio.v2 as imageio
import livelossplot
import matplotlib.pyplot as plt
import pandas as pd
import torch
import torcheval.metrics

## GPU Access

In [None]:
def to_gpu(t):
    if torch.cuda.is_available():
        return t.cuda()
    return t

def to_numpy(t):
    return t.detach().cpu().numpy()

device = to_gpu(torch.ones(1,1)).device
device

## Load Data

If you are not running on the SCC, fetch the data from https://github.com/dl4ds/tree-or-not as needed.
For example, you could use `git clone` to make a local copy and update `data_dir` below.

In [None]:
image_width = 256 # DO NOT CHANGE
data_dir = "/projectnb/ds542/materials/tree-or-not"

def load_data_set(data_set_name):
    labels = pd.read_csv(f"{data_dir}/{data_set_name}.tsv", sep="\t")

    file_names = []
    images = []
    targets = []
    for i in range(labels.shape[0]):
        row = labels.iloc[i]
        try:
            image = imageio.imread(f"{data_dir}/images{image_width}/{row['filename']}")[...]
        except Exception:
            print("SKIPPING ", row['filename'], "MISSING")
            continue

        if image.shape[0] != image.shape[1] * 3 // 4:
            print("SKIPPING ", row['filename'], image.shape)
            continue

        # convert from 0-255 to 0.0-1.0
        image = image / 255
        # prepend axis with length one
        # image = image.reshape(1, *image.shape)
        image = torch.tensor(image, device=device, dtype=torch.float32)
        # permute image dimensions to put color channel first
        image = torch.permute(image, [2, 0, 1])

        file_names.append(row['filename'])
        images.append(image)
        targets.append(row["target"])

    images = torch.stack(images)

    targets = torch.tensor(targets, device=device, dtype=torch.float32)
    targets = targets.long()

    return (file_names, images, targets)

train_data_set = load_data_set("train")
for t in train_data_set[1:]:
    print("TRAIN", t.shape, t.dtype, t.device)
(train_file_names, train_X, train_Y) = train_data_set

validation_data_set = load_data_set("validation")
for t in validation_data_set[1:]:
    print("VALIDATION", t.shape, t.dtype, t.device)
(validation_file_names, validation_X, validation_Y) = validation_data_set

In [None]:
plt.imshow(to_numpy(torch.permute(train_X[0,:,:,:], (1, 2, 0))));

## Model Building

Do not modify any of this code.

In [None]:
class TreeNetwork(torch.nn.Module):
    def __init__(self):
        super().__init__()

        self.conv_0 = torch.nn.Conv2d(in_channels=3, out_channels=5, kernel_size=5, stride=2, device=device)
        self.conv_1 = torch.nn.Conv2d(in_channels=5, out_channels=5, kernel_size=5, stride=2, device=device)
        self.conv_2 = torch.nn.Conv2d(in_channels=5, out_channels=5, kernel_size=5, stride=2, device=device)
        self.conv_3 = torch.nn.Conv2d(in_channels=5, out_channels=5, kernel_size=5, stride=2, device=device)
        self.fc_3 = torch.nn.Linear(585, 2)

        self.relu = torch.nn.ReLU()

    def forward(self, X):

        X = self.conv_0(X)
        X = self.relu(X)

        X = self.conv_1(X)
        X = self.relu(X)

        X = self.conv_2(X)
        X = self.relu(X)

        X = self.conv_3(X)
        X = self.relu(X)

        # flatten channels and image dimensions
        X = X.reshape(X.shape[:-3] + (-1,))

        X = self.fc_3(X)

        return X

test_model = TreeNetwork().to(device)
test_output = test_model(train_X[:5,:,:,:])
assert test_output.shape == (5, 2)
del test_output
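
The `585` input features of `fc_3` follow from the image size and the four convolutions. The arithmetic below is a quick sanity check, assuming 256x192 images (the 4:3 shape enforced in `load_data_set`) and convolutions without padding; `conv_out` is a helper name introduced here only for illustration.

```python
def conv_out(size, kernel_size=5, stride=2):
    """Output length of an unpadded convolution along one axis."""
    return (size - kernel_size) // stride + 1


# ASSUMPTION: images are image_width * 3 // 4 = 192 pixels high and 256 wide.
height, width = 192, 256
for _ in range(4):  # conv_0 through conv_3 share kernel_size=5, stride=2
    height, width = conv_out(height), conv_out(width)

out_channels = 5
flattened_features = out_channels * height * width
print(height, width, flattened_features)  # prints: 9 13 585
```

Each axis shrinks as 192 → 94 → 45 → 21 → 9 and 256 → 126 → 61 → 29 → 13, so the flattened tensor has 5 × 9 × 13 = 585 values per image.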

In [None]:
loss_function = torch.nn.CrossEntropyLoss()

In [None]:
DEFAULT_EPOCHS = 1000 if torch.cuda.is_available() else 100

def train_model(model_class, epochs=DEFAULT_EPOCHS, learning_rate=1e-4, **kwargs):
    model = model_class(**kwargs)
    if torch.cuda.is_available():
        model = model.cuda()

    model = torch.nn.DataParallel(model)
    model.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    liveloss = livelossplot.PlotLosses()
    for i in range(epochs):
        model.train()

        optimizer.zero_grad(set_to_none=True)
        prediction = model(train_X)
        loss = loss_function(prediction, train_Y)
        loss.backward()
        optimizer.step()

        if (i + 1) % 50 == 0:
            liveloss_updates = {}
            with torch.no_grad():
                model.eval()

                def get_metrics(metrics_prefix, metrics_X, metrics_Y):
                    metrics_prediction = model(metrics_X)

                    return {
                        f"{metrics_prefix}loss": loss_function(metrics_prediction, metrics_Y),
                        f"{metrics_prefix}accuracy": torcheval.metrics.functional.multiclass_accuracy(torch.argmax(metrics_prediction, dim=-1), metrics_Y)
                    }
                
                liveloss_updates.update(get_metrics("", train_X, train_Y))
                liveloss_updates.update(get_metrics("val_", validation_X, validation_Y))

            liveloss_updates = {k: to_numpy(v) for k, v in liveloss_updates.items()}
            liveloss.update(liveloss_updates,
                            current_step=i+1)
            liveloss.send()

    return model

test_model = train_model(TreeNetwork, epochs=1)
del test_model

In [None]:
base_model = train_model(TreeNetwork, epochs=2000)

## Training Improvements

Pick and describe training improvements in each of the following categories.
**Your description must be specific to this data set and baseline training process.**
You do not need to describe how these methods work in general, and generalities may cost points for making your answer less concise.

Warning:
Your training improvements will be sanity checked when models are built with them below.
If your training improvement does not improve the validation accuracy, then it will be deemed inappropriate for this specific data set and architecture, and you will lose points here.

### Early Stopping

The baseline model overfits a lot.
Make a chart illustrating the overfitting problem in the base model and indicate roughly where the model should have stopped training.

In [None]:
# YOUR ANSWER HERE

...

Describe how you will detect overfitting to trigger early stopping.
Specific details might include how your detection code avoids triggering on noise.
(This might require a finer-grained validation loss chart.)
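
One common way to avoid triggering on noise is a patience counter: stop only after the validation loss has failed to improve by some margin for several consecutive checks. The sketch below is one possible shape for such a detector, not a required design. The class name `EarlyStopper`, its parameters, and the loss values are all made up for illustration.

```python
class EarlyStopper:
    """Signal a stop once validation loss has not improved for `patience` checks."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # checks to wait after the last improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.best_step = None
        self.bad_checks = 0

    def update(self, step, val_loss):
        """Record one validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_step = step
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


# Toy usage with made-up loss values (NOT results from this data set).
stopper = EarlyStopper(patience=2, min_delta=0.01)
toy_losses = [0.70, 0.60, 0.61, 0.55, 0.56, 0.57, 0.40]
stopped_at = None
for step, val_loss in enumerate(toy_losses):
    if stopper.update(step, val_loss):
        stopped_at = step
        break
print(stopped_at, stopper.best_step, stopper.best_loss)
```

In a real training loop you would probably also keep a copy of the model weights from `best_step`, since the model at the stopping step has already trained past its best validation loss.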

YOUR ANSWER HERE

...

### Initialization

Pick a better initialization method for this neural network.
What does this initialization method do better than the baseline code?

YOUR ANSWER HERE

...

Give an illustration of how your initialization method does better.
This might be a chart or some numerical representation of the difference.
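
As one example of a numerical representation, the sketch below compares weight scales per layer. It assumes that the default `Conv2d` initialization draws weights uniformly from ±1/sqrt(fan_in); verify this against the PyTorch version you are running. He (Kaiming) normal initialization is used purely as an example candidate, not as the expected answer, and the function names are introduced here for illustration.

```python
import math

# fan_in = in_channels * kernel_height * kernel_width for each conv layer
fan_ins = {
    "conv_0": 3 * 5 * 5,
    "conv_1": 5 * 5 * 5,
    "conv_2": 5 * 5 * 5,
    "conv_3": 5 * 5 * 5,
}


def default_uniform_std(fan_in):
    # ASSUMPTION: weights ~ Uniform(-b, b) with b = 1 / sqrt(fan_in).
    # A uniform distribution on (-b, b) has standard deviation b / sqrt(3).
    return (1 / math.sqrt(fan_in)) / math.sqrt(3)


def he_normal_std(fan_in):
    # He initialization for ReLU layers: std = sqrt(2 / fan_in).
    return math.sqrt(2 / fan_in)


for name, fan_in in fan_ins.items():
    print(f"{name}: fan_in={fan_in:3d} "
          f"default_std={default_uniform_std(fan_in):.4f} "
          f"he_std={he_normal_std(fan_in):.4f}")
```

Under that assumption the He weights are larger by a factor of sqrt(6), about 2.45, in every layer. Whether that difference actually helps this model is something to confirm with activation statistics or validation accuracy.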

In [None]:
# YOUR ANSWER HERE

...

### Learning Rate

The Adam optimizer is often touted as needing no hyperparameter tuning except for the learning rate.
What is the best improvement that you achieved just changing the learning rate?
What was your best learning rate, and was it a material change?

YOUR ANSWER HERE

...

Make a chart showing the improvements (or lack thereof) from optimizing just the learning rate.
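
Learning rates are usually compared on a logarithmic scale. The sketch below builds a small log-spaced grid around the baseline `learning_rate=1e-4`. The range and spacing are arbitrary assumptions, and `log_spaced_rates` is a helper name introduced here for illustration.

```python
def log_spaced_rates(low_exponent=-5, high_exponent=-2, per_decade=2):
    """Learning rates evenly spaced in log10 from 10**low to 10**high inclusive."""
    steps = (high_exponent - low_exponent) * per_decade
    return [10 ** (low_exponent + i / per_decade) for i in range(steps + 1)]


candidate_rates = log_spaced_rates()
for rate in candidate_rates:
    print(f"{rate:.1e}")
```

Each candidate could then be passed as the `learning_rate` argument of `train_model`, recording the validation accuracy of each run for the chart.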

In [None]:
# YOUR ANSWER HERE

...

### Regularization

Pick an appropriate regularization method that we covered in class besides early stopping and learning rate.
Explain why you picked this particular method for this problem and model.

YOUR ANSWER HERE

...

Give an illustration of how your regularization method does better.
This might be a chart or some numerical representation of the difference.

In [None]:
# YOUR ANSWER HERE

...

## Training All Combinations

Train all 16 combinations of the 4 training improvements.
For each combination, plot the loss and accuracy for the training and validation sets during the training process.

Hint: You are strongly encouraged to make a new training function that takes in 4 Boolean parameters controlling the activation of your improvements.
Have this function generate the requested charts too.
This way, you can write the improvement code once each and just call the training function repeatedly with different parameters.
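
The 16 combinations can be enumerated in the same row order as the results tables with `itertools.product`. The sketch below only generates the flags and formats table rows. The flag names and the helpers `all_combinations` and `table_row` are assumptions for illustration, and the training call itself is left as a comment.

```python
import itertools

IMPROVEMENTS = ("early_stopping", "initialization", "learning_rate", "regularization")


def all_combinations():
    """Yield one dict of Boolean flags per combination, in the table's row order."""
    for flags in itertools.product([False, True], repeat=len(IMPROVEMENTS)):
        yield dict(zip(IMPROVEMENTS, flags))


def table_row(flags, accuracies=(), n_metrics=3):
    """Format one Markdown table row, padding missing accuracies with empty cells."""
    cells = [str(flags[name]).lower() for name in IMPROVEMENTS]
    cells += [f"{accuracy:.3f}" for accuracy in accuracies]
    cells += [""] * (n_metrics - len(accuracies))
    return "| " + " | ".join(cells) + " |"


combinations = list(all_combinations())
for flags in combinations:
    # A hypothetical training function taking these four flags would be called here.
    print(table_row(flags))
```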

In [None]:
# YOUR CHANGES HERE

...

Hint: you may want to store your models with `torch.save` so that you do not have to retrain them after receiving the new validation set in a week.

### Make a Table Comparing the Combinations

Fill in the table below with the final results of each training run.

YOUR CHANGES HERE

| Early Stopping | Initialization | Learning Rate | Regularization | Training Accuracy | Validation Accuracy | Test Accuracy |
|---|---|---|---|---:|---:|---:|
| false | false | false | false | | | |
| false | false | false | true | | | |
| false | false | true | false | | | |
| false | false | true | true | | | |
| false | true | false | false | | | |
| false | true | false | true | | | |
| false | true | true | false | | | |
| false | true | true | true | | | |
| true | false | false | false | | | |
| true | false | false | true | | | |
| true | false | true | false | | | |
| true | false | true | true | | | |
| true | true | false | false | | | |
| true | true | false | true | | | |
| true | true | true | false | | | |
| true | true | true | true | | | |


## Validate Again

Test your models again with the new validation2 data (to be posted 10/13).
Use your existing trained models; do not retrain them using the validation2 data.
Fill in the table below with your validation2 accuracies.

YOUR CHANGES HERE

| Early Stopping | Initialization | Learning Rate | Regularization | Validation2 Accuracy |
|---|---|---|---|---:|
| false | false | false | false | |
| false | false | false | true | |
| false | false | true | false | |
| false | false | true | true | |
| false | true | false | false | |
| false | true | false | true | |
| false | true | true | false | |
| false | true | true | true | |
| true | false | false | false | |
| true | false | false | true | |
| true | false | true | false | |
| true | false | true | true | |
| true | true | false | false | |
| true | true | false | true | |
| true | true | true | false | |
| true | true | true | true | |


If your new validation results are poor, you may go back and refine your training improvements, but avoid using validation2 when you do so.

## Check Your Improvements

For each of your training improvements, check if they were used in the model with the best validation2 performance.
As best you can, explain why that was the case.
It may help to refer to other rows of the validation2 results treating it as an ablation study.
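
One simple way to read the table as an ablation study is to compare, for each improvement, the mean accuracy of the 8 rows where it is on with the 8 rows where it is off. The sketch below assumes the results are stored in a dict keyed by the four Boolean flags. `marginal_effects` is a hypothetical helper, and the accuracies are synthetic placeholders rather than real results.

```python
import itertools
from statistics import mean

IMPROVEMENTS = ("early_stopping", "initialization", "learning_rate", "regularization")


def marginal_effects(results):
    """Mean accuracy with each improvement on, minus mean accuracy with it off.

    `results` maps a 4-tuple of Booleans (in IMPROVEMENTS order) to an accuracy.
    """
    effects = {}
    for index, name in enumerate(IMPROVEMENTS):
        on = [acc for flags, acc in results.items() if flags[index]]
        off = [acc for flags, acc in results.items() if not flags[index]]
        effects[name] = mean(on) - mean(off)
    return effects


# Synthetic placeholder accuracies (NOT real results): only early stopping helps.
toy_results = {
    flags: 0.5 + (0.1 if flags[0] else 0.0)
    for flags in itertools.product([False, True], repeat=4)
}
effects = marginal_effects(toy_results)
best_flags = max(toy_results, key=toy_results.get)
print(effects)
print(best_flags)
```

This average ignores interactions between improvements, so it is worth also comparing individual pairs of rows that differ in a single flag.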

### Early Stopping

YOUR ANSWER HERE

...

### Initialization

YOUR ANSWER HERE

...

### Learning Rate

YOUR ANSWER HERE

...

### Regularization

YOUR ANSWER HERE

...

## Save Best Model for Auto-Grader Evaluation

Use `torch.save` to save your best performing model as "best.pt".
Check the auto-grader results as soon as possible to confirm that it can load your model.

Hint: The auto-grader will check this model's accuracy on a withheld test set.
Review your results above and tweak your improvements as you feel appropriate.
But beware, overfitting on the visible data sets will likely lead to poor performance on the withheld test set.

In [None]:
# YOUR ANSWER HERE

...