In [None]:
# imports
import os
import sys
import time
import copy
import random

import cv2
import numpy as np
import pandas as pd
from PIL import Image

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
from torch.optim import lr_scheduler
from torch.utils.data import Dataset, DataLoader
from torchvision import models, transforms

import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint
from pytorch_lightning.loggers import CSVLogger



# Milestone 3: Model Training and Evaluation with PyTorch Lightning

Welcome to Milestone 3 of LIS 640 – Introduction to Applied Deep Learning. In this milestone, you'll build upon your work from Milestones 1 and 2 by upgrading your neural network baseline to a more robust training framework using PyTorch Lightning and TensorBoard logging. You will also be exploring the advantages of different neural architectures (recurrent and convolutional neural networks) and different optimizers.

## Purpose

The goals of Milestone 3 are to:
- **Explore advanced architectures:** Strengthen your knowledge of, and hands-on experience with, popular neural architectures, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs). This is the main goal of the milestone.
- **Streamline your model development:** Make sure you are working with easy-to-maintain Lightning modules.
- **Enhance experiment tracking:** Integrate TensorBoard to log and visualize training metrics, making it easier to monitor performance and debug issues.
- **Investigate optimizer effects:** Experiment with different optimizers (such as Adam, SGD, and RMSprop) to understand their impact on model training and performance.


## Part 1: Benchmarking Feedforward NN vs. RNN on Sequence Data

In this step, you'll compare the performance of a Recurrent Neural Network (RNN) against a Feedforward Neural Network (FFNN) on a dataset that contains sequential data. **For this exercise, you must use PyTorch Lightning to build your models and manage the training loop, as well as TensorBoard for logging and visualizing your training metrics.**

### A. Choose Your Dataset

- **Option 1:**  
  Use one of the datasets from Milestone 1 **if it contains sequence data**.  
  *For example, if your dataset involves time series, text, or any other ordered data, it qualifies for this comparison.* In that case you have already completed Part B and can skip ahead to Part C.
  

- **Option 2:**  
  If your Milestone 1 dataset does not include sequence data, search online for and download a dataset that features sequential information (e.g., time series forecasting, text classification, sensor data, etc.). Take inspiration from previous milestones on how to do part B (Data Preparation) for your new dataset.



### B. Data Preparation

1. **Create a Custom Dataset Class:**  
   - Implement a PyTorch `Dataset` class that loads your sequence data.
   - Include any necessary preprocessing steps (e.g., normalization, tokenization, padding for sequences).
   - Ensure that your `__getitem__` method returns the data in a format suitable for your models.

2. **Build DataLoaders:**  
   - Use `torch.utils.data.DataLoader` to create train, validation, and test loaders.
   - Choose appropriate batch sizes and shuffling to ensure effective training.
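
As one possible shape for the two steps above, the sketch below wraps a univariate time series in a sliding-window `Dataset` and builds the three loaders. The class name `WindowedSeries`, the synthetic sine-wave data, the window length of 20, and the 700/150/150 chronological split are illustrative assumptions, not requirements of the assignment.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class WindowedSeries(Dataset):
    """Hypothetical dataset: each item is (window of `seq_len` values, next value)."""

    def __init__(self, series, seq_len, mean, std):
        # Normalize with statistics supplied by the caller (taken from the training split).
        self.series = (series - mean) / std
        self.seq_len = seq_len

    def __len__(self):
        return len(self.series) - self.seq_len

    def __getitem__(self, idx):
        x = self.series[idx: idx + self.seq_len].unsqueeze(-1)  # (seq_len, 1)
        y = self.series[idx + self.seq_len]                      # scalar target
        return x, y


# Synthetic stand-in for real sequence data (assumption).
series = torch.sin(torch.linspace(0, 50, 1000))

# Chronological split: validation and test data come after the training data.
n_train, n_val, seq_len = 700, 150, 20
train_part = series[:n_train]
mean, std = train_part.mean(), train_part.std()

train_ds = WindowedSeries(train_part, seq_len, mean, std)
val_ds = WindowedSeries(series[n_train:n_train + n_val], seq_len, mean, std)
test_ds = WindowedSeries(series[n_train + n_val:], seq_len, mean, std)

# Only the training loader shuffles; evaluation order stays fixed.
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=32, shuffle=False)
test_loader = DataLoader(test_ds, batch_size=32, shuffle=False)

x_batch, y_batch = next(iter(train_loader))
print(x_batch.shape, y_batch.shape)
```

Returning each window as `(seq_len, n_features)` lets the same loader feed both a flattening FFNN and a recurrent model created with `batch_first=True`.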

### C. Model Implementation with PyTorch Lightning

*Reuse implementations from Milestone 2 if that makes sense. The key difference now is that you should implement your models as PyTorch Lightning modules to take advantage of the built-in training loop and logging features.*

1. **Feedforward Neural Network (FFNN):**  
   - Implement a baseline feedforward network that treats the sequence data as independent features (e.g., by flattening the sequence).
   - Keep the architecture simple to establish a baseline for comparison.

2. **Recurrent Neural Network (RNN):**  
   - Implement an RNN model (using LSTM or GRU) to handle the sequential nature of the data.
   - Ensure that your model processes the sequence appropriately (e.g., using the final hidden state or an attention mechanism for prediction).

*Remember to use the PyTorch Lightning `Trainer` for model training, and configure the module to log metrics to TensorBoard.*
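
To make the contrast between the two model families concrete, here is a minimal sketch of both as plain `nn.Module`s. The class names `SeqFFNN` and `SeqGRU`, the hidden size, and the single regression output are assumptions chosen for illustration; the GRU variant uses the final hidden state for prediction, which is one of the options mentioned above.

```python
import torch
import torch.nn as nn


class SeqFFNN(nn.Module):
    """Baseline: flattens (batch, seq_len, n_features) into one feature vector."""

    def __init__(self, seq_len, n_features, hidden=64, n_out=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(seq_len * n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_out),
        )

    def forward(self, x):
        return self.net(x)


class SeqGRU(nn.Module):
    """Recurrent model: predicts from the final hidden state of a GRU."""

    def __init__(self, n_features, hidden=64, n_out=1):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_out)

    def forward(self, x):
        _, h_n = self.gru(x)        # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])   # final hidden state of the last layer


x = torch.randn(8, 20, 1)  # (batch, seq_len, n_features)
ffnn_out = SeqFFNN(seq_len=20, n_features=1)(x)
gru_out = SeqGRU(n_features=1)(x)
print(ffnn_out.shape, gru_out.shape)
```

Either module can then be wrapped in a `pl.LightningModule` whose `training_step` computes the loss and calls `self.log`, so that both models share the same training loop.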

### D. Benchmarking and Evaluation

1. **Training Both Models:**  
   - Train both the FFNN and the RNN on your chosen dataset using similar training settings (e.g., number of epochs, learning rate, optimizer) to ensure a fair comparison.
   - Use PyTorch Lightning’s `Trainer` to manage the training process.

2. **Logging and Evaluation Metrics:**  
   - Leverage TensorBoard logging to visualize training and validation metrics in real-time.
   - Compare the performance of both models using metrics such as loss, accuracy, or any task-specific metric.
   - Optionally, record additional statistics like training time or convergence behavior.

3. **Document Your Findings:**  
   - Summarize the dataset and preprocessing steps.
   - Describe the architectures used for the FFNN and RNN.
   - Provide a comparative analysis discussing which model performed better and why that might be the case.
   - Include TensorBoard screenshots or logged results to support your analysis.

Part A - Option 1 - As our lane line dataset from milestone 1 contains images, we are going to use the same dataset for this section.

Part B

In [None]:
resize_height, resize_width = 256, 512
class Rescale():
    def __init__(self, output_size):
        self.output_size = output_size

    def __call__(self, sample):
        return cv2.resize(sample, dsize=self.output_size, interpolation=cv2.INTER_NEAREST)

class TusimpleData(Dataset):
    def __init__(self, dataset_file, n_labels=3, transform=None, target_transform=None, training=True, optuna=False):
        self._gt_img_list = []
        self._gt_label_binary_list = []
        self.transform = transform
        self.target_transform = target_transform
        self.n_labels = n_labels

        with open(dataset_file, 'r') as file:
            for _info in file:
                info_tmp = _info.strip(' ').split()
                self._gt_img_list.append(info_tmp[0])
                self._gt_label_binary_list.append(info_tmp[1])

        self._shuffle()

        purger = 1
        if purger < 1.0 and training:
            subset_size = int(len(self._gt_img_list) * purger)
            self._gt_img_list = self._gt_img_list[:subset_size]
            self._gt_label_binary_list = self._gt_label_binary_list[:subset_size]

    def _shuffle(self):
        zipped = list(zip(self._gt_img_list, self._gt_label_binary_list))
        random.shuffle(zipped)
        self._gt_img_list, self._gt_label_binary_list = zip(*zipped)

    def __len__(self):
        return len(self._gt_img_list)

    def __getitem__(self, idx):
        img = Image.open(self._gt_img_list[idx])
        label_img = cv2.imread(self._gt_label_binary_list[idx], cv2.IMREAD_COLOR)

        if self.transform:
            img = self.transform(img)
        if self.target_transform:
            label_img = self.target_transform(label_img)

        label_binary = np.zeros((label_img.shape[0], label_img.shape[1]), dtype=np.uint8)
        mask = (label_img != 0).any(axis=2)  # lane pixel if any channel is non-zero
        label_binary[mask] = 1
        label_binary = torch.from_numpy(label_binary).long()
        return img, label_binary

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize((resize_height, resize_width)),
        transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize((resize_height, resize_width)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
target_transforms = transforms.Compose([
    Rescale((resize_width, resize_height))
])

Part C - FNN

In [None]:
input_dim = 3 * resize_height * resize_width
output_dim = 2 * resize_height * resize_width

class LaneLinesFNN(nn.Module):
    def __init__(self, hidden1=1024, hidden2=256):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(input_dim, hidden1)
        self.fc2 = nn.Linear(hidden1, hidden2)
        self.fc3 = nn.Linear(hidden2, output_dim)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        logits = self.fc3(x)
        logits = logits.view(-1, 2, resize_height, resize_width)
        pred = torch.argmax(logits, dim=1, keepdim=True)
        return {"binary_seg_logits": logits, "binary_seg_pred": pred}

In [None]:
class LaneSegLightningFNN(pl.LightningModule):
    def __init__(self, lr=0.001):
        super().__init__()
        self.model = LaneLinesFNN()
        self.loss_fn = nn.CrossEntropyLoss()
        self.lr = lr
        self.save_hyperparameters()

    def forward(self, x):
        return self.model(x)

    def compute_loss(self, out, target):
        logits = out["binary_seg_logits"]
        loss = self.loss_fn(logits, target) * 10
        return loss

    def training_step(self, batch, batch_idx):
        x, y = batch
        out = self(x)
        loss = self.compute_loss(out, y)
        self.log("train_loss", loss, prog_bar=True)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        out = self(x)
        loss = self.compute_loss(out, y)
        self.log("val_loss", loss, prog_bar=True)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.lr)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
        return [optimizer], [scheduler]

In [None]:
def test_fnn(model_ckpt_path):
    if not os.path.exists('test_output'):
        os.makedirs('test_output')

    img_path = '0001.png'
    DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = LaneSegLightningFNN.load_from_checkpoint(model_ckpt_path)
    model.eval()
    model.freeze()
    model = model.to(DEVICE)

    transform = transforms.Compose([
        transforms.Resize((resize_height, resize_width)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

    inp = Image.open(img_path)
    input_tensor = transform(inp).unsqueeze(0).to(DEVICE)

    with torch.no_grad():
        output = model(input_tensor)

    binary_logits = output['binary_seg_logits']
    binary_pred = output['binary_seg_pred']
    binary_logits_np = binary_logits.detach().cpu().numpy()
    binary_pred_np = binary_pred.detach().cpu().numpy()

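    # PIL yields RGB; convert to BGR because cv2.imwrite expects BGR channel order.
    input_img = cv2.cvtColor(np.array(inp.convert('RGB').resize((resize_width, resize_height))), cv2.COLOR_RGB2BGR)
    overlay = input_img.copy()
    overlay[binary_pred_np[0, 0, :, :] > 0] = [0, 0, 255]  # red in BGR

    cv2.imwrite("test_output/input.jpg", input_img)
    # argmax returns int64, which cv2.imwrite cannot encode; cast to uint8 first.
    cv2.imwrite("test_output/binary_prediction.jpg", (binary_pred_np[0, 0] * 255).astype(np.uint8))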
    cv2.imwrite("test_output/input_with_prediction_overlay.jpg", overlay)

    for i in range(binary_logits_np.shape[1]):
        logits = binary_logits_np[0, i, :, :]
        logits_norm = cv2.normalize(logits, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)
        cv2.imwrite(f"test_output/binary_logits_channel_{i}.jpg", logits_norm)

    print("✅ Prediction visualization complete — see test_output/")

Part C - RNN

Part D - FNN

In [None]:
train_file = 'archive/TUSimple/train_set/training/train.txt'
val_file = 'archive/TUSimple/train_set/training/val.txt'

train_ds = TusimpleData(train_file, transform=data_transforms['train'], target_transform=target_transforms, training=True)
val_ds = TusimpleData(val_file, transform=data_transforms['val'], target_transform=target_transforms, training=False)
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True, num_workers=4)
val_loader = DataLoader(val_ds, batch_size=32, shuffle=False, num_workers=4)

print(f"[INFO] Loaded {len(train_ds)} samples for training.")

model = LaneSegLightningFNN()
logger = CSVLogger("logs", name="laneseg")
checkpoint = ModelCheckpoint(monitor="val_loss", mode="min", save_top_k=1, filename="best_model")

trainer = pl.Trainer(max_epochs=10, logger=logger, callbacks=[checkpoint], accelerator="auto", devices=1)
trainer.fit(model, train_loader, val_loader)

test_fnn(checkpoint.best_model_path)


Part D - RNN

## Part 2: Benchmarking Feedforward NN vs. CNN on Image Data

In this step, you'll compare the performance of a Convolutional Neural Network (CNN) against a Feedforward Neural Network (FFNN) on an image-based dataset. **For this exercise, you must use PyTorch Lightning to implement your models and manage training, and use TensorBoard for logging and visualizing your training metrics.**

### A. Choose Your Dataset

- **Option 1:**  
  Use one of the datasets from Milestone 1 **if it contains image data**.  
  *For example, if your dataset involves images for classification, segmentation, or any visual task, it qualifies for this comparison.*

- **Option 2:**  
  If your Milestone 1 dataset does not include image data, search online for and download an image dataset (e.g., Fashion MNIST, CIFAR-10, or any domain-specific image dataset).

### B. Data Preparation

1. **Create a Custom Dataset Class:**  
   - Implement a PyTorch `Dataset` class that loads your image data.
   - Include any necessary preprocessing steps (e.g., normalization, resizing, data augmentation).
   - Ensure that your `__getitem__` method returns the data in a format suitable for your models.

2. **Build DataLoaders:**  
   - Use `torch.utils.data.DataLoader` to create train, validation, and test loaders.
   - Choose appropriate batch sizes and apply shuffling to ensure effective training.
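
When a dataset ships without a separate test split, one option is to carve train, validation, and test subsets out of a single `Dataset` with `random_split`. The sketch below uses a `TensorDataset` of random tensors as a stand-in for real images; the 70/15/15 proportions, the batch size, and the seed are assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in for an image dataset: 100 fake RGB images (32x32) with binary labels.
full_ds = TensorDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 2, (100,)))

# A seeded generator makes the split reproducible across runs.
generator = torch.Generator().manual_seed(42)
train_ds, val_ds, test_ds = random_split(full_ds, [70, 15, 15], generator=generator)

# Shuffle only the training data; keep evaluation order deterministic.
train_loader = DataLoader(train_ds, batch_size=16, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=16, shuffle=False)
test_loader = DataLoader(test_ds, batch_size=16, shuffle=False)

print(len(train_ds), len(val_ds), len(test_ds))
```

Because the subsets share no indices, the test loader stays untouched until the final evaluation.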

### C. Model Implementation with PyTorch Lightning

*Reuse or adapt implementations from Milestone 2 as needed. The key requirement is to implement your models as PyTorch Lightning modules to take advantage of the built-in training loop and logging features.*

1. **Feedforward Neural Network (FFNN):**  
   - Implement a baseline FFNN that treats image data as a flat vector (i.e., by flattening the image).
   - Keep the architecture simple to serve as a baseline for comparison.

2. **Convolutional Neural Network (CNN):**  
   - Implement a CNN architecture that leverages convolutional layers to capture spatial hierarchies in the image data.
   - Typical layers might include convolution, activation (ReLU), pooling, and fully connected layers.
   - Ensure that your model architecture is designed to process image data effectively.

*Remember to use the PyTorch Lightning `Trainer` for training and to configure your Lightning module to log metrics to TensorBoard.*
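
One way to see why convolutions suit images is to count parameters. The sketch below does the arithmetic for the layer sizes used by `LaneLinesFNN` and `LaneLinesCNN` in this notebook (256x512 RGB input, two output classes per pixel); the helper names `linear_params` and `conv_params` are hypothetical.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k (transposed) convolution."""
    return c_in * c_out * k * k + c_out


h, w = 256, 512

# Flatten -> 1024 -> 256 -> 2 * h * w, as in the feedforward baseline.
ffnn_total = (
    linear_params(3 * h * w, 1024)
    + linear_params(1024, 256)
    + linear_params(256, 2 * h * w)
)

# Three 3x3 convolutions followed by three 3x3 transposed convolutions.
channels = [(3, 32), (32, 64), (64, 128), (128, 64), (64, 32), (32, 2)]
cnn_total = sum(conv_params(c_in, c_out, 3) for c_in, c_out in channels)

print(f"FFNN parameters: {ffnn_total:,}")
print(f"CNN parameters:  {cnn_total:,}")
```

The feedforward baseline needs roughly 470 million parameters because every input pixel connects to every hidden unit, while the convolutional model stays under 200 thousand by sharing small kernels across the whole image.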

### D. Benchmarking and Evaluation

1. **Training Both Models:**  
   - Train both the FFNN and the CNN on your chosen dataset using similar training settings (e.g., number of epochs, learning rate, optimizer) to ensure a fair comparison.
   - Use PyTorch Lightning’s `Trainer` to manage the training process.

2. **Logging and Evaluation Metrics:**  
   - Leverage TensorBoard to log and visualize training and validation metrics in real-time.
   - Compare the performance of both models using metrics such as loss, accuracy, or any task-specific evaluation metric.
   - Optionally, record additional details like training time and convergence behavior.

3. **Document Your Findings:**  
   - Summarize the dataset and preprocessing steps.
   - Describe the architectures used for both the FFNN and the CNN.
   - Provide a comparative analysis discussing which model performed better and why, supported by TensorBoard screenshots or logged results.
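
For segmentation tasks, loss alone can be hard to interpret, and pixel accuracy can look high when the foreground class covers only a small part of the image. A minimal sketch of two task-specific metrics follows; the function names `pixel_accuracy` and `binary_iou` and the toy tensors are illustrative assumptions.

```python
import torch


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the target."""
    return (pred == target).float().mean().item()


def binary_iou(pred, target, eps=1e-8):
    """Intersection over union for the foreground class (label 1)."""
    pred_fg = pred == 1
    target_fg = target == 1
    intersection = (pred_fg & target_fg).sum().item()
    union = (pred_fg | target_fg).sum().item()
    return intersection / (union + eps)


# Toy 2x4 masks: two pixels disagree.
target = torch.tensor([[0, 0, 1, 1],
                       [0, 0, 1, 1]])
pred = torch.tensor([[0, 1, 1, 1],
                     [0, 0, 0, 1]])

acc = pixel_accuracy(pred, target)
iou = binary_iou(pred, target)
print(f"pixel accuracy = {acc:.2f}, IoU = {iou:.2f}")
```

Both values could be logged from `validation_step` with `self.log` so that they appear next to the loss curves for each model.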

Part A - Choose Your Dataset

Option 1 - As our lane line dataset from milestone 1 contains images, we are going to use the same dataset for this section.

Part B - Dataset Class

In [None]:
resize_height, resize_width = 256, 512
class Rescale():
    def __init__(self, output_size):
        self.output_size = output_size

    def __call__(self, sample):
        return cv2.resize(sample, dsize=self.output_size, interpolation=cv2.INTER_NEAREST)

class TusimpleData(Dataset):
    def __init__(self, dataset_file, n_labels=3, transform=None, target_transform=None, training=True, optuna=False):
        self._gt_img_list = []
        self._gt_label_binary_list = []
        self.transform = transform
        self.target_transform = target_transform
        self.n_labels = n_labels

        with open(dataset_file, 'r') as file:
            for _info in file:
                info_tmp = _info.strip(' ').split()
                self._gt_img_list.append(info_tmp[0])
                self._gt_label_binary_list.append(info_tmp[1])

        self._shuffle()

        purger = 1
        if purger < 1.0 and training:
            subset_size = int(len(self._gt_img_list) * purger)
            self._gt_img_list = self._gt_img_list[:subset_size]
            self._gt_label_binary_list = self._gt_label_binary_list[:subset_size]

    def _shuffle(self):
        zipped = list(zip(self._gt_img_list, self._gt_label_binary_list))
        random.shuffle(zipped)
        self._gt_img_list, self._gt_label_binary_list = zip(*zipped)

    def __len__(self):
        return len(self._gt_img_list)

    def __getitem__(self, idx):
        img = Image.open(self._gt_img_list[idx])
        label_img = cv2.imread(self._gt_label_binary_list[idx], cv2.IMREAD_COLOR)

        if self.transform:
            img = self.transform(img)
        if self.target_transform:
            label_img = self.target_transform(label_img)

        label_binary = np.zeros((label_img.shape[0], label_img.shape[1]), dtype=np.uint8)
        mask = (label_img != 0).any(axis=2)  # lane pixel if any channel is non-zero
        label_binary[mask] = 1
        label_binary = torch.from_numpy(label_binary).long()
        return img, label_binary

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize((resize_height, resize_width)),
        transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize((resize_height, resize_width)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
target_transforms = transforms.Compose([
    Rescale((resize_width, resize_height))
])

Part C - FNN

In [None]:
input_dim = 3 * resize_height * resize_width
output_dim = 2 * resize_height * resize_width

class LaneLinesFNN(nn.Module):
    def __init__(self, hidden1=1024, hidden2=256):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(input_dim, hidden1)
        self.fc2 = nn.Linear(hidden1, hidden2)
        self.fc3 = nn.Linear(hidden2, output_dim)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        logits = self.fc3(x)
        logits = logits.view(-1, 2, resize_height, resize_width)
        pred = torch.argmax(logits, dim=1, keepdim=True)
        return {"binary_seg_logits": logits, "binary_seg_pred": pred}

In [None]:
class LaneSegLightningFNN(pl.LightningModule):
    def __init__(self, lr=0.001):
        super().__init__()
        self.model = LaneLinesFNN()
        self.loss_fn = nn.CrossEntropyLoss()
        self.lr = lr
        self.save_hyperparameters()

    def forward(self, x):
        return self.model(x)

    def compute_loss(self, out, target):
        logits = out["binary_seg_logits"]
        loss = self.loss_fn(logits, target) * 10
        return loss

    def training_step(self, batch, batch_idx):
        x, y = batch
        out = self(x)
        loss = self.compute_loss(out, y)
        self.log("train_loss", loss, prog_bar=True)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        out = self(x)
        loss = self.compute_loss(out, y)
        self.log("val_loss", loss, prog_bar=True)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.lr)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
        return [optimizer], [scheduler]

In [None]:
def test_fnn(model_ckpt_path):
    if not os.path.exists('test_output'):
        os.makedirs('test_output')

    img_path = '0001.png'
    DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = LaneSegLightningFNN.load_from_checkpoint(model_ckpt_path)
    model.eval()
    model.freeze()
    model = model.to(DEVICE)

    transform = transforms.Compose([
        transforms.Resize((resize_height, resize_width)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

    inp = Image.open(img_path)
    input_tensor = transform(inp).unsqueeze(0).to(DEVICE)

    with torch.no_grad():
        output = model(input_tensor)

    binary_logits = output['binary_seg_logits']
    binary_pred = output['binary_seg_pred']
    binary_logits_np = binary_logits.detach().cpu().numpy()
    binary_pred_np = binary_pred.detach().cpu().numpy()

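    # PIL yields RGB; convert to BGR because cv2.imwrite expects BGR channel order.
    input_img = cv2.cvtColor(np.array(inp.convert('RGB').resize((resize_width, resize_height))), cv2.COLOR_RGB2BGR)
    overlay = input_img.copy()
    overlay[binary_pred_np[0, 0, :, :] > 0] = [0, 0, 255]  # red in BGR

    cv2.imwrite("test_output/input.jpg", input_img)
    # argmax returns int64, which cv2.imwrite cannot encode; cast to uint8 first.
    cv2.imwrite("test_output/binary_prediction.jpg", (binary_pred_np[0, 0] * 255).astype(np.uint8))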
    cv2.imwrite("test_output/input_with_prediction_overlay.jpg", overlay)

    for i in range(binary_logits_np.shape[1]):
        logits = binary_logits_np[0, i, :, :]
        logits_norm = cv2.normalize(logits, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)
        cv2.imwrite(f"test_output/binary_logits_channel_{i}.jpg", logits_norm)

    print("✅ Prediction visualization complete — see test_output/")

Part C - CNN

In [None]:
class LaneLinesCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, stride=2, padding=1)
        self.relu = nn.ReLU()
        self.deconv1 = nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1)
        self.deconv2 = nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1)
        self.deconv3 = nn.ConvTranspose2d(32, 2, 3, stride=2, padding=1, output_padding=1)

    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))
        x = self.relu(self.conv3(x))
        x = self.relu(self.deconv1(x))
        x = self.relu(self.deconv2(x))
        logits = self.deconv3(x)
        pred = torch.argmax(logits, dim=1, keepdim=True)
        return {"binary_seg_logits": logits, "binary_seg_pred": pred}


In [None]:
class LaneSegLightningCNN(pl.LightningModule):
    def __init__(self, lr=0.001):
        super().__init__()
        self.model = LaneLinesCNN()
        self.loss_fn = nn.CrossEntropyLoss()
        self.lr = lr
        self.save_hyperparameters()

    def forward(self, x):
        return self.model(x)

    def compute_loss(self, out, target):
        logits = out["binary_seg_logits"]
        loss = self.loss_fn(logits, target) * 10
        return loss

    def training_step(self, batch, batch_idx):
        x, y = batch
        out = self(x)
        loss = self.compute_loss(out, y)
        self.log("train_loss", loss, prog_bar=True)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        out = self(x)
        loss = self.compute_loss(out, y)
        self.log("val_loss", loss, prog_bar=True)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.lr)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
        return [optimizer], [scheduler]

In [None]:
def test_cnn(model_ckpt_path):
    if not os.path.exists('test_output'):
        os.makedirs('test_output')

    img_path = '0001.png'
    DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = LaneSegLightningCNN.load_from_checkpoint(model_ckpt_path)
    model.eval()
    model.freeze()
    model = model.to(DEVICE)

    transform = transforms.Compose([
        transforms.Resize((resize_height, resize_width)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

    inp = Image.open(img_path)
    input_tensor = transform(inp).unsqueeze(0).to(DEVICE)

    with torch.no_grad():
        output = model(input_tensor)

    binary_logits = output['binary_seg_logits']
    binary_pred = output['binary_seg_pred']
    binary_logits_np = binary_logits.detach().cpu().numpy()
    binary_pred_np = binary_pred.detach().cpu().numpy()

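    # PIL yields RGB; convert to BGR because cv2.imwrite expects BGR channel order.
    input_img = cv2.cvtColor(np.array(inp.convert('RGB').resize((resize_width, resize_height))), cv2.COLOR_RGB2BGR)
    overlay = input_img.copy()
    overlay[binary_pred_np[0, 0, :, :] > 0] = [0, 0, 255]  # red in BGR

    cv2.imwrite("test_output/input.jpg", input_img)
    # argmax returns int64, which cv2.imwrite cannot encode; cast to uint8 first.
    cv2.imwrite("test_output/binary_prediction.jpg", (binary_pred_np[0, 0] * 255).astype(np.uint8))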
    cv2.imwrite("test_output/input_with_prediction_overlay.jpg", overlay)

    for i in range(binary_logits_np.shape[1]):
        logits = binary_logits_np[0, i, :, :]
        logits_norm = cv2.normalize(logits, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)
        cv2.imwrite(f"test_output/binary_logits_channel_{i}.jpg", logits_norm)

    print("✅ Prediction visualization complete — see test_output/")

Part D - FNN

In [None]:
train_file = 'archive/TUSimple/train_set/training/train.txt'
val_file = 'archive/TUSimple/train_set/training/val.txt'

train_ds = TusimpleData(train_file, transform=data_transforms['train'], target_transform=target_transforms, training=True)
val_ds = TusimpleData(val_file, transform=data_transforms['val'], target_transform=target_transforms, training=False)
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True, num_workers=4)
val_loader = DataLoader(val_ds, batch_size=32, shuffle=False, num_workers=4)

print(f"[INFO] Loaded {len(train_ds)} samples for training.")

model = LaneSegLightningFNN()
logger = CSVLogger("logs", name="laneseg")
checkpoint = ModelCheckpoint(monitor="val_loss", mode="min", save_top_k=1, filename="best_model")

trainer = pl.Trainer(max_epochs=10, logger=logger, callbacks=[checkpoint], accelerator="auto", devices=1)
trainer.fit(model, train_loader, val_loader)

test_fnn(checkpoint.best_model_path)


Part D - CNN

In [None]:
train_file = 'archive/TUSimple/train_set/training/train.txt'
val_file = 'archive/TUSimple/train_set/training/val.txt'

train_ds = TusimpleData(train_file, transform=data_transforms['train'], target_transform=target_transforms, training=True)
val_ds = TusimpleData(val_file, transform=data_transforms['val'], target_transform=target_transforms, training=False)
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True, num_workers=4)
val_loader = DataLoader(val_ds, batch_size=32, shuffle=False, num_workers=4)

print(f"[INFO] Loaded {len(train_ds)} samples for training.")

model = LaneSegLightningCNN()
logger = CSVLogger("logs", name="laneseg")
checkpoint = ModelCheckpoint(monitor="val_loss", mode="min", save_top_k=1, filename="best_model")

trainer = pl.Trainer(max_epochs=10, logger=logger, callbacks=[checkpoint], accelerator="auto", devices=1)
trainer.fit(model, train_loader, val_loader)

test_cnn(checkpoint.best_model_path)

## Part 3: Comparing Optimizers and Analyzing Training Curves

In this step, you'll experiment with three optimizers (SGD, Adam, and RMSprop) to understand how they affect model performance. You will compare them using evaluation metrics on held-out test data and analyze the training and validation curves logged in TensorBoard.

### A. Experiment Setup

1. **Maintain Consistent Training Settings:**  
   - Use the same model architecture (whether FFNN, CNN, or RNN from Parts 1 and 2) and dataset for all experiments.
   - Ensure that the number of epochs, batch size, learning rate, and other hyperparameters are kept constant across different optimizer runs, aside from the optimizer itself.

2. **Implement Optimizer Switching:**  
   - Modify the `configure_optimizers` method in your PyTorch Lightning module to easily switch between optimizers:
     ```python
     def configure_optimizers(self):
         # Keep exactly one return active; swap in the optimizer you want to test.
         return torch.optim.SGD(self.parameters(), lr=0.01)
         # return torch.optim.Adam(self.parameters(), lr=1e-3)
         # return torch.optim.RMSprop(self.parameters(), lr=1e-3)
     ```
   - Train your model separately with each optimizer.
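
An alternative to commenting lines in and out is to select the optimizer by name, which makes each run reproducible from its hyperparameters. This is a sketch, not part of the assignment: the `OPTIMIZERS` registry and the `build_optimizer` helper are hypothetical names.

```python
import torch
import torch.nn as nn

# Registry mapping a short name to the corresponding torch.optim class.
OPTIMIZERS = {
    "sgd": torch.optim.SGD,
    "adam": torch.optim.Adam,
    "rmsprop": torch.optim.RMSprop,
}


def build_optimizer(name, params, lr):
    """Return the optimizer registered under `name` (hypothetical helper)."""
    try:
        optimizer_cls = OPTIMIZERS[name.lower()]
    except KeyError:
        raise ValueError(f"Unknown optimizer '{name}'; choose from {sorted(OPTIMIZERS)}")
    return optimizer_cls(params, lr=lr)


model = nn.Linear(4, 2)
optimizer = build_optimizer("rmsprop", model.parameters(), lr=1e-3)
print(type(optimizer).__name__)
```

Inside a Lightning module, `configure_optimizers` could return `build_optimizer(...)` with the name and learning rate taken from constructor arguments, so that only one argument changes between runs.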

### B. Evaluation Metrics and Analysis

1. **Held-Out Test Evaluation:**  
   - After training, evaluate each model on a held-out test set.
   - Record quantitative metrics such as loss, accuracy, or any other relevant task-specific metric for each optimizer.

2. **TensorBoard Analysis:**  
   - Use TensorBoard to review the training and validation curves during training.
   - Focus on:
     - **Convergence Behavior:** How quickly does each optimizer reduce the loss?
     - **Stability:** Are there noticeable fluctuations or instability in the curves?
     - **Overfitting/Underfitting:** Do you observe signs of overfitting or underfitting, and how do these behaviors differ across optimizers?
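
The comparison protocol can be rehearsed on a toy problem before spending compute on the real model. The sketch below trains a small linear regressor on synthetic data with each optimizer, using the same seed, learning rate, and epoch count, and reports the held-out loss. The helper `run_experiment`, the data, and all hyperparameter values are assumptions for illustration only.

```python
import torch
import torch.nn as nn


def run_experiment(optimizer_cls, lr, epochs=50, seed=0):
    """Train a small regressor with one optimizer and return the held-out MSE."""
    torch.manual_seed(seed)  # same data and initial weights for every optimizer
    x = torch.randn(200, 3)
    y = x @ torch.tensor([[1.0], [-2.0], [0.5]]) + 0.1 * torch.randn(200, 1)
    x_train, y_train, x_test, y_test = x[:160], y[:160], x[160:], y[160:]

    model = nn.Linear(3, 1)
    optimizer = optimizer_cls(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()

    # Full-batch training keeps the example short.
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x_train), y_train)
        loss.backward()
        optimizer.step()

    with torch.no_grad():
        return loss_fn(model(x_test), y_test).item()


results = {
    name: run_experiment(cls, lr=0.01)
    for name, cls in [("SGD", torch.optim.SGD),
                      ("Adam", torch.optim.Adam),
                      ("RMSprop", torch.optim.RMSprop)]
}
for name, test_loss in results.items():
    print(f"{name:8s} test MSE: {test_loss:.4f}")
```

Fixing the seed inside `run_experiment` means any difference in the reported loss comes from the optimizer rather than from the data or the initialization.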

### C. Document Your Findings

- **Summarize Performance:**  
  - Create a table or a brief report comparing the evaluation metrics for SGD, Adam, and RMSprop.
- **Include Visual Evidence:**  
  - Attach TensorBoard screenshots or summaries of the logged training/validation curves.
- **Provide a Comparative Analysis:**  
  - Discuss which optimizer provided the best performance on the test set.
  - Reflect on the convergence rates and stability differences you observed.
  - Explain potential reasons for these differences based on your results.

By the end of this exercise, you will have a deeper understanding of how different optimizers affect model training dynamics and performance. This insight is essential for making informed decisions when tuning models in future projects.

## Submission Instructions

**What to Submit:**

1. Your complete iPython notebook for Milestone 3 (including all code, outputs, and markdown explanations).
2. A single PDF file that contains your entire report for the milestone, covering:
   - Part 1: Benchmarking FFNN vs. RNN on sequence data.
   - Part 2: Benchmarking FFNN vs. CNN on image data.
   - Part 3: Comparing optimizers and analyzing training curves.

**How to Submit:**

- Upload both your iPython notebook and the PDF report to Canvas.
- Name your files clearly, for example:
  - `YourName_Milestone3.ipynb`
  - `YourName_Milestone3_Report.pdf`

**Deadline:**

- All submissions are due **4/18/21**.

Happy Deep Learning!