<a href="https://colab.research.google.com/github/mindgarage/very-deep-learning-wise2324/blob/main/exercises/Exercise_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Programming Exercise 1: Introduction to Deep Learning with PyTorch

## Very Deep Learning (VDL) - Winter Semester 2023/24

---

### Group Details:

- **Group Name:** \[VDL_Group_11\]

### Members:

- \[Chirag Singh\], \[428971\]
- \[Maithilee Vaidya], \[428837\]
- \[Prachi Tejwani\], \[429016\]
- \[Prathamesh Pawar\], \[428966\]
- \[Sankalp Dhupar\], \[429026\]
- \[Vibhor Chauhan\], \[428986\]

---

**Instructions**: The tasks in this notebook are a part of Sheet 1. Look for `TODO` tags throughout the notebook and complete the sections with missing code. Once done, ensure all outputs are visible and correctly displayed. Save your notebook and submit the `.ipynb` file together with the exercise sheet PDF in a single ZIP file.

## Introduction:

Welcome to the first programming exercise of the Very Deep Learning course. In this exercise, you will be introduced to PyTorch, one of the most widely used deep learning frameworks in academia and industry. With its dynamic computation graph and vast ecosystem, PyTorch provides an intuitive and versatile platform for building various deep learning models.

The aim of this task is to familiarize you with the basics of [PyTorch](https://pytorch.org/) and [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/starter/introduction.html), guiding you in building, training, and evaluating a simple neural network model. You'll be working with the [Flowers102 dataset](https://pytorch.org/vision/stable/generated/torchvision.datasets.Flowers102.html), a collection of images representing 102 flower categories. The chosen flowers commonly occur in the United Kingdom. Each class consists of between 40 and 258 images. By the end of this exercise, you should have a foundational understanding of neural networks, how they are trained, and how to evaluate their performance.

---
#### Remember not to modify the number of epochs. All teams must train only for 10 epochs.
---

In [None]:
# Install dependencies.
# Note: You can execute bash commands inside Google Colab!
!pip install pytorch-lightning # reduces boilerplate in vanilla PyTorch
!pip install torchmetrics      # simplifies metric computation
!pip install pandas            # for reading training logs from CSV
# We also need `torch` and `torchvision`, but they come pre-installed inside Colab. If not, uncomment the lines below:
# !pip install torch
# !pip install torchvision

## Step 1: Data Preparation

The first step in any deep learning pipeline is data preparation. Here, we will download the Flowers102 dataset and prepare it for training, validation, and testing.

In [None]:
import pytorch_lightning as pl
pl.seed_everything(42) # seed to make randomness deterministic

import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision.datasets import Flowers102


# Define data transformations
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])

# Download and load the Flowers102 dataset
train_dataset = Flowers102(root='./data', split='train', download=True, transform=transform)
test_dataset = Flowers102(root='./data', split='test', download=True, transform=transform)

# TODO: Split train dataset into train and val sets using an 80/20 ratio
# HINT: Use torch.utils.data.random_split
train_dataset, val_dataset = None, None

# TODO: Create train, val and test DataLoaders
# HINT: Set `shuffle=True` for the train set.
batch_size = 64
num_workers = 2
train_loader = None
val_loader = None
test_loader = None
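
One way to approach the two TODOs above is sketched below. It is a minimal illustration, not the reference solution: a small synthetic `TensorDataset` stands in for Flowers102 so that the snippet runs without a download, and the names `full_dataset`, `train_subset`, and `val_subset` as well as the batch size of 16 are assumptions made only for this example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for the Flowers102 training split (assumption: 100 samples).
full_dataset = TensorDataset(torch.randn(100, 3, 8, 8), torch.randint(0, 102, (100,)))

# 80/20 split; a seeded generator keeps the split reproducible.
train_size = int(0.8 * len(full_dataset))
val_size = len(full_dataset) - train_size
train_subset, val_subset = random_split(
    full_dataset, [train_size, val_size], generator=torch.Generator().manual_seed(42)
)

# Shuffle only the training data; validation order does not matter.
train_loader = DataLoader(train_subset, batch_size=16, shuffle=True)
val_loader = DataLoader(val_subset, batch_size=16, shuffle=False)

images, labels = next(iter(train_loader))
print(len(train_subset), len(val_subset), tuple(images.shape))
```

In the notebook itself, the same pattern would be applied to the real `train_dataset`, with the `batch_size` and `num_workers` values defined in the cell.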

## Step 2: Model Definition

For this tutorial, we'll use a pre-trained ResNet-18 model, which is a popular model for image classification tasks.

In [None]:
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights


class Flowers102Classifier(nn.Module):
    def __init__(self, num_classes=102):
        super(Flowers102Classifier, self).__init__()

        # Create a pre-trained ResNet-18 model from `torchvision` instead of writing
        # our own model. This model was trained on the ImageNet dataset, which has
        # RGB images with 1000 different classes.
        self.model = resnet18(weights=ResNet18_Weights.DEFAULT)
        # HINT: Use print(self.model) to see the architecture

        # TODO: Modify the architecture as you see fit :)

        # The Flowers102 dataset has 102 classes.
        # TODO: Modify the last layer to fit the number of classes in Flowers102.


    def forward(self, x):
        return self.model(x)

## Step 3: Training with PyTorch Lightning

PyTorch Lightning simplifies the training loop. Let's create a Lightning module for our classification task.

In [None]:
import pytorch_lightning as pl
import torch.optim as optim
from torchmetrics import Accuracy


class ClassificationModule(pl.LightningModule):
    """ A PyTorch Lightning module contains both the network and the
    training logic, unlike the plain PyTorch code we saw in the first tutorial. """
    def __init__(self, learning_rate=0.001, num_classes=102):
        super(ClassificationModule, self).__init__()
        self.save_hyperparameters() # allows access to constructor args with self.hparams.*

        self.model = Flowers102Classifier(num_classes=num_classes)

        # TODO: Define a loss function
        # HINT: This is a classification task with multiple classes.
        self.loss_fn = None

        # TODO: Create an appropriate metric from torchmetrics
        self.metric = None

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        images, labels = batch
        outputs = self(images) # Forward pass
        loss = self.loss_fn(outputs, labels)
        self.log('train_loss', loss)

        # Note: We do not need to manually call loss.backward() or optimizer.step()
        # when using PyTorch Lightning
        return loss

    def on_validation_epoch_start(self):
        self.metric.reset()

    def validation_step(self, batch, batch_idx):
        images, labels = batch
        outputs = self(images)
        loss = self.loss_fn(outputs, labels)
        self.log('val_loss', loss, prog_bar=True)

        # Update accuracy for current batch
        _, preds = torch.max(outputs, 1)
        self.metric.update(preds, labels)
        return loss

    def on_validation_epoch_end(self):
        avg_accuracy = self.metric.compute()
        self.log('val_accuracy', avg_accuracy, prog_bar=True)

    def on_test_epoch_start(self):
        self.metric.reset()

    def test_step(self, batch, batch_idx):
        images, labels = batch
        outputs = self(images)

        # Note: We do not need to calculate loss when evaluating
        # on the test dataset, only the performance metric!

        # Update accuracy for current batch
        _, preds = torch.max(outputs, 1)
        self.metric.update(preds, labels)
        return {"test_accuracy": self.metric}

    def on_test_epoch_end(self):
        avg_accuracy = self.metric.compute()
        self.log('test_accuracy', avg_accuracy, prog_bar=True)

    def configure_optimizers(self):
        optimizer = optim.Adam(self.parameters(), lr=self.hparams.learning_rate)
        return optimizer
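
The loss and metric TODOs in the module can be explored in isolation first. The sketch below assumes cross-entropy as the multi-class loss and computes accuracy by hand from the argmax of the logits; the tensors are invented for illustration. In the module itself, a torchmetrics object such as `Accuracy(task="multiclass", num_classes=102)` (the signature in recent torchmetrics releases) would play the role of the manual accuracy computation.

```python
import math
import torch
import torch.nn as nn

num_classes = 102
loss_fn = nn.CrossEntropyLoss()  # expects raw logits and integer class labels

# All-zero logits mean a uniform prediction, so the loss equals ln(num_classes).
logits = torch.zeros(4, num_classes)
labels = torch.tensor([0, 5, 17, 101])
loss = loss_fn(logits, labels)
print(f"{loss.item():.4f} vs {math.log(num_classes):.4f}")

# Accuracy by hand: fraction of argmax predictions that match the labels.
logits[0, 0] = 1.0    # correct for sample 0
logits[1, 5] = 1.0    # correct for sample 1
logits[2, 3] = 1.0    # wrong for sample 2 (its label is 17)
logits[3, 101] = 1.0  # correct for sample 3
preds = logits.argmax(dim=1)
accuracy = (preds == labels).float().mean().item()
print(accuracy)
```

A loss near ln(102) at the start of training is therefore a sign that the network is still guessing uniformly across the classes.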

## Step 4: Training the Model

Now that our Lightning module is defined, we can easily train our model.

In [None]:
# Initialize the classifier
classifier = ClassificationModule()

# Create a logger
# HINT: Lightning has many different kinds of loggers, such
# as Tensorboard, WandB, Comet, etc.
# https://lightning.ai/docs/pytorch/stable/api_references.html#loggers
logger = pl.loggers.CSVLogger('./logs')

# Initialize a trainer
trainer = pl.Trainer(
    deterministic=True,
    accelerator='gpu' if torch.cuda.is_available() else 'cpu',
    logger=logger,
    max_epochs=10, # NOTE: DO NOT MODIFY THE NUMBER OF EPOCHS. YOU MUST ONLY TRAIN FOR 10.
)

# Train the model
trainer.fit(classifier, train_loader, val_loader)

## Step 5: Testing the Model

Let's evaluate the trained model on the test set.

In [None]:
# TODO: Test the network performance with test_loader
acc = None
print(f"Accuracy: {(acc[0]['test_accuracy'] * 100):.2f}%")

In [None]:
import pandas as pd

log_file = './logs/lightning_logs/version_0/metrics.csv'
logs = pd.read_csv(log_file)
print(logs.head())

In [None]:
import matplotlib.pyplot as plt


def plot_metrics(df):
    df_train = df[['epoch', 'step', 'train_loss']].dropna()
    df_train = df_train.groupby('epoch').apply(lambda x: x.loc[x['step'].idxmax()])[['epoch', 'step', 'train_loss']]
    df_val = df[['epoch', 'step', 'val_loss', 'val_accuracy']].dropna()
    df_test = df[['epoch', 'step', 'test_accuracy']].dropna()

    # Set up the figure and axes
    fig, axs = plt.subplots(1, 2, figsize=(14, 5))

    # Plot train_loss and val_loss on the first subplot
    axs[0].plot(df_train['epoch'], df_train['train_loss'], label='Train Loss', color='blue')
    axs[0].plot(df_val['epoch'], df_val['val_loss'], label='Validation Loss', color='red', linestyle='dashed')
    axs[0].set_title('Train Loss vs Validation Loss')
    axs[0].set_xlabel('Epoch')
    axs[0].set_ylabel('Loss')
    axs[0].legend()

    # Plot val_accuracy and test_accuracy on the second subplot
    axs[1].plot(df_val['epoch'], df_val['val_accuracy'], label='Validation Accuracy', color='green')
    axs[1].plot(df_test['epoch'], df_test['test_accuracy'], label='Test Accuracy', color='orange', linestyle='dashed')
    axs[1].set_title('Validation Accuracy vs Test Accuracy')
    axs[1].set_xlabel('Epoch')
    axs[1].set_ylabel('Accuracy')
    axs[1].legend()

    plt.tight_layout()
    plt.show()

plot_metrics(logs)
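
The `dropna` and `groupby` calls in `plot_metrics` are easier to follow on a small table. The sketch below uses a hypothetical excerpt of a `metrics.csv` file with invented values; it assumes that training and validation metrics land on separate rows, so each row leaves the other metrics empty.

```python
import io
import pandas as pd

# Hypothetical excerpt of a metrics.csv file (values invented for illustration).
csv_text = """epoch,step,train_loss,val_loss,val_accuracy
0,49,4.10,,
0,99,3.20,,
0,99,,3.00,0.25
1,149,2.50,,
1,199,2.10,,
1,199,,2.40,0.40
"""
df = pd.read_csv(io.StringIO(csv_text))

# Keep only rows where every selected column is present.
df_train = df[["epoch", "step", "train_loss"]].dropna()
df_val = df[["epoch", "val_loss", "val_accuracy"]].dropna()

# One training point per epoch: the row logged at the highest step.
last_train = df_train.loc[df_train.groupby("epoch")["step"].idxmax()]
print(last_train["train_loss"].tolist(), df_val["val_accuracy"].tolist())
```

Selecting the columns before calling `dropna` matters: dropping rows with any missing value on the full table would discard every row in this layout.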


In [None]:
# !rm -rf ./logs

## Homework/Exercise

1. Complete missing code and run the notebook.
2. Experiment with different network architectures and hyperparameters to try and improve the classification accuracy.

You are allowed to change:
  - network architecture, including using other torchvision models
  - optimizer
  - learning rate
  - batch size
  - loss function

You can also experiment with [learning rate scheduling](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate). In [Lightning](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule), you add the scheduler in the `configure_optimizers` method by returning something like `return {"optimizer": optimizer, "lr_scheduler": scheduler}`.

However, changing the number of training epochs is **not allowed**. Train your network for **10 epochs**!
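
The learning rate scheduling option mentioned above could be sketched as follows. This is only an illustration under stated assumptions: `StepLR` with `step_size=5` and `gamma=0.1` is an arbitrary choice, the tiny `nn.Linear` model stands in for the real network, and `configure_optimizers_sketch` is a hypothetical free function that mirrors what the module's `configure_optimizers` method could return.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Tiny stand-in model so the sketch runs on its own (not the course model).
model = nn.Linear(4, 2)

def configure_optimizers_sketch(parameters, learning_rate=0.001):
    """Mirror of what `configure_optimizers` could return inside the module."""
    optimizer = optim.Adam(parameters, lr=learning_rate)
    # StepLR multiplies the learning rate by `gamma` every `step_size` epochs.
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)
    return {"optimizer": optimizer, "lr_scheduler": scheduler}

config = configure_optimizers_sketch(model.parameters())
optimizer, scheduler = config["optimizer"], config["lr_scheduler"]

# Simulate 10 epochs; Lightning steps the scheduler once per epoch by default.
lrs = []
for epoch in range(10):
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()   # no gradients exist here, so no parameters change
    scheduler.step()
print(lrs)
```

With these settings the learning rate stays at its initial value for the first five epochs and drops by a factor of ten for the remaining five, which fits within the fixed 10-epoch budget.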

The group with the best results gets a small prize :)

Good luck!