<a href="https://colab.research.google.com/github/ShahistaAfreen/DA6401_Assig2/blob/main/NA21B050_DLA2_PartB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Q - 1**

##  Handling Image Dimension and Class Count Mismatches in Pre-trained Models

###  Image Dimension Mismatch

Images in the iNaturalist dataset can differ in size from the ImageNet images on which most pre-trained models were trained, so the inputs must be brought to the size the model expects.

####  Solution:
Use image resizing and rescaling transformations to match the input size expected by the pre-trained model.

Most ImageNet models expect inputs of **224×224 pixels**.
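
To make the size arithmetic concrete, here is a small dependency-free sketch of a common recipe: scale the shorter side to 256 pixels while keeping the aspect ratio, then cut a centered 224×224 window. The helper names are hypothetical, and the integer truncation is an assumption; a real image library may round by a pixel differently.

```python
def resize_shorter_side(width, height, target=256):
    """Scale (width, height) so the shorter side equals `target`, keeping aspect ratio."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop-by-crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


def preprocess_size(width, height, resize=256, crop=224):
    """Final (width, height) after resizing the shorter side and center-cropping."""
    w, h = resize_shorter_side(width, height, resize)
    left, top, right, bottom = center_crop_box(w, h, crop)
    return right - left, bottom - top


# Landscape, portrait, and square inputs all end up at the same final size.
for size in [(800, 600), (600, 800), (1024, 1024)]:
    print(size, "->", resize_shorter_side(*size), "->", preprocess_size(*size))
```

Because the crop is taken after an aspect-preserving resize, objects are not stretched; forcing both sides to 224 directly would instead distort non-square images.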

####  PyTorch Transformations:

```python
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(256),                     # Resize shorter side to 256
    transforms.CenterCrop(224),                # Crop to 224x224
    transforms.ToTensor(),                     
    transforms.Normalize(                       # Normalize using ImageNet's mean and std
        mean=[0.485, 0.456, 0.406],             
        std=[0.229, 0.224, 0.225]
    )
])
```


###  Class Count Mismatch

Pre-trained ImageNet models have **1000 output classes**, but the iNaturalist dataset used here has only **10 classes**.

####  Solution:

Replace the final classification layer of the pre-trained model with a new one that outputs **10 classes**.
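
As a rough illustration of what this replacement changes, the sketch below counts the weights and biases of a fully connected layer before and after the swap. It assumes the final-layer input widths of 512 for ResNet18 and 2048 for ResNet50; `linear_params` is a hypothetical helper, not a PyTorch function.

```python
def linear_params(in_features, out_features, bias=True):
    """Number of parameters in a fully connected layer (weights plus optional bias)."""
    return in_features * out_features + (out_features if bias else 0)


# Input width of the final layer for two common backbones (assumed values).
FC_IN_FEATURES = {"resnet18": 512, "resnet50": 2048}

for name, width in FC_IN_FEATURES.items():
    old_head = linear_params(width, 1000)  # original 1000-class ImageNet head
    new_head = linear_params(width, 10)    # replacement 10-class head
    print(f"{name}: {old_head:,} -> {new_head:,} parameters in the final layer")
```

The new head is about two orders of magnitude smaller than the one it replaces, which is why training only the classifier is cheap.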

####  PyTorch Implementation:

For models like **ResNet**:
```python
import torch.nn as nn
from torchvision import models

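# `pretrained=True` is deprecated since torchvision 0.13; use the weights enum instead
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)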

# Optionally freeze earlier layers if you're only training the classifier
# for param in model.parameters():
#     param.requires_grad = False

# Replace the final fully connected layer
model.fc = nn.Linear(model.fc.in_features, 10)
```


**Q - 2**

###  Question 2 – Strategies for Fine-Tuning Large Pretrained Models

To keep the training computationally feasible while using large pre-trained models like ResNet50, VGG, EfficientNetV2, etc., the following fine-tuning strategies were used:

-  **Freezing all layers except the final fully connected (classification) layer**  
  - This significantly reduces the number of trainable parameters.
  - The model only learns to classify based on pre-extracted features.

-  **Freezing all but the last `k` layers and fine-tuning only those `k` layers**  
  - Example: Freeze all layers up to `layer3` in ResNet and fine-tune `layer4` and `fc`.
  - Allows the model to adapt higher-level features to the new dataset.

-  **Gradual unfreezing**  
  - Start with only the classifier trainable, then progressively unfreeze earlier blocks (from the output side toward the input) over successive epochs.
  - Helps avoid catastrophic forgetting and stabilizes training.

These strategies balance computational efficiency and model adaptability when transferring from ImageNet to the iNaturalist dataset.
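
The gradual-unfreezing idea can be sketched as a simple schedule over block names. This is a minimal, framework-free illustration, not the code used in the experiments: the block list mirrors the top-level parameter prefixes of a torchvision ResNet, while `trainable_blocks`, `is_trainable`, and `epochs_per_stage` are hypothetical names introduced here.

```python
# Top-level blocks of a ResNet, ordered from input to output.
RESNET_BLOCKS = ["conv1", "bn1", "layer1", "layer2", "layer3", "layer4", "fc"]


def trainable_blocks(epoch, blocks=RESNET_BLOCKS, epochs_per_stage=1):
    """Blocks that are trainable at a given 0-based epoch.

    Only the last block (the classifier) is trainable at first; one more
    block, counted from the output side, is unfrozen every `epochs_per_stage`.
    """
    n_unfrozen = min(len(blocks), 1 + epoch // epochs_per_stage)
    return blocks[-n_unfrozen:]


def is_trainable(param_name, epoch, **kwargs):
    """True if a parameter such as 'layer4.0.conv1.weight' is in an unfrozen block."""
    return param_name.split(".")[0] in trainable_blocks(epoch, **kwargs)


for epoch in range(4):
    print(epoch, trainable_blocks(epoch))
```

With a real model, the same predicate could be applied at the start of each epoch by setting `param.requires_grad = is_trainable(name, epoch)` for every `(name, param)` pair from `model.named_parameters()`. Epoch 0 of this schedule corresponds to the head-only strategy and epoch 1 to fine-tuning `layer4` and `fc`.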


In [None]:
import os
import random
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

from torch.utils.data import DataLoader, Subset
from torchvision.datasets import ImageFolder

import pytorch_lightning as pl
from pytorch_lightning.callbacks.early_stopping import EarlyStopping
from pytorch_lightning.loggers.wandb import WandbLogger

import wandb
from google.colab import drive

In [None]:
drive.mount('/content/drive', force_remount=True)

# Set global seed for reproducibility
pl.seed_everything(42)
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

# Dataset location on Google Drive
dataset_path = '/content/drive/MyDrive/DL_A2_Dataset'
train_dir = os.path.join(dataset_path, 'train')
val_dir = os.path.join(dataset_path, 'validation')
test_dir = os.path.join(dataset_path, 'val')  # the 'val' folder is used as the test set

In [None]:
# Utility to build a class-balanced subset of an ImageFolder dataset
def create_balanced_subset(dataset, max_samples_per_class=100):
    """Return a Subset with at most `max_samples_per_class` images per class."""
    classwise_indices = {i: [] for i in range(len(dataset.classes))}
    for idx, (_, label) in enumerate(dataset.samples):
        classwise_indices[label].append(idx)

    selected_indices = []
    for label_indices in classwise_indices.values():
        count = min(max_samples_per_class, len(label_indices))
        selected_indices.extend(random.sample(label_indices, count))

    return Subset(dataset, selected_indices)


# LightningModule using ResNet50 with various finetuning options
class FinetuneResNet(pl.LightningModule):
    def __init__(self, mode="head_only", unfreeze_from=5):
        super().__init__()
        self.mode = mode
        self.learning_rate = 1e-4
        self.loss_fn = nn.CrossEntropyLoss()

        base_model = torchvision.models.resnet50(weights="ResNet50_Weights.DEFAULT")
        orig_fc = base_model.fc

        base_model.fc = nn.Sequential(
            orig_fc,
            nn.ReLU(),
            nn.Linear(1000, 10)
        )

        self.model = base_model
        self._set_trainable_layers(unfreeze_from)

    def _set_trainable_layers(self, from_layer):
        if self.mode == "head_only":
            for param in self.model.parameters():
                param.requires_grad = False
            for param in self.model.fc.parameters():
                param.requires_grad = True

        elif self.mode == "partial":
            for idx, child in enumerate(self.model.children()):
                for param in child.parameters():
                    param.requires_grad = (idx >= from_layer)

        elif self.mode == "last_block":
            for name, param in self.model.named_parameters():
                param.requires_grad = "layer4" in name or "fc" in name

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = self.loss_fn(logits, y)
        acc = (logits.argmax(dim=1) == y).float().mean()
        self.log_dict({"train_loss": loss, "train_accuracy": acc}, prog_bar=True)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        val_loss = self.loss_fn(logits, y)
        val_acc = (logits.argmax(dim=1) == y).float().mean()
        self.log_dict({"val_loss": val_loss, "val_accuracy": val_acc}, prog_bar=True)

    def configure_optimizers(self):
        trainable = filter(lambda p: p.requires_grad, self.parameters())
        return torch.optim.Adam(trainable, lr=self.learning_rate)


# Custom DataModule for managing train/val/test splits
class ResNetDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=64):
        super().__init__()
        self.train_dir = train_dir
        self.val_dir = val_dir
        self.test_dir = test_dir
        self.batch_size = batch_size

        self.common_transform = T.Compose([
            T.Resize((224, 224)),
            T.ToTensor(),
            T.Normalize(mean=[0.4712, 0.4600, 0.3896], std=[0.2406, 0.2301, 0.2406])
        ])

    def setup(self, stage=None):
        if stage in [None, 'fit']:
            self.train_data = ImageFolder(self.train_dir, transform=self.common_transform)
            self.val_data = ImageFolder(self.val_dir, transform=self.common_transform)
        if stage in [None, 'test']:
            self.test_data = ImageFolder(self.test_dir, transform=self.common_transform)

    def train_dataloader(self):
        return DataLoader(self.train_data, batch_size=self.batch_size, shuffle=True, num_workers=2)

    def val_dataloader(self):
        return DataLoader(self.val_data, batch_size=self.batch_size, shuffle=False, num_workers=2)

    def test_dataloader(self):
        return DataLoader(self.test_data, batch_size=self.batch_size, shuffle=False, num_workers=2)


# Single experiment run with one strategy
def run_experiment(strategy):
    logger = WandbLogger(project="dl-resnet-experiments", name=strategy)
    model = FinetuneResNet(mode=strategy)
    datamodule = ResNetDataModule()

    trainer = pl.Trainer(
        max_epochs=5,
        logger=logger,
        accelerator="auto",
        callbacks=[EarlyStopping(monitor="val_accuracy", mode="max", patience=2)]
    )

    trainer.fit(model, datamodule=datamodule)
    wandb.finish()


# Execute for all selected strategies
if __name__ == "__main__":
    strategies = ["head_only", "partial", "last_block"]
    for strategy in strategies:
        run_experiment(strategy)

INFO:lightning_fabric.utilities.seed:Seed set to 42


Mounted at /content/drive


Downloading: "https://download.pytorch.org/models/resnet50-11ad3fa6.pth" to /root/.cache/torch/hub/checkpoints/resnet50-11ad3fa6.pth
100%|██████████| 97.8M/97.8M [00:01<00:00, 91.1MB/s]
INFO:pytorch_lightning.utilities.rank_zero:You are using the plain ModelCheckpoint callback. Consider using LitModelCheckpoint which with seamless uploading to Model registry.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: False, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.



[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
[34m[1mwandb[0m: No netrc file found, creating one.
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Currently logged in as: [33mna21b050[0m ([33mna21b050-iit-madras[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type             | Params | Mode 
-------------------------------------------------------
0 | criterion | CrossEntropyLoss | 0      | train
1 | backbone  | ResNet           | 25.6 M | train
-------------------------------------------------------
2.1 M     Trainable params
23.5 M    Non-trainable params
25.6 M    Total params
102.268   Total estimated model params size (MB)
155       Modules in train mode
0         Modules in eval mode
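
The parameter counts in the summary above can be cross-checked by hand for the head-only setup. The sketch below assumes the commonly reported total of 25,557,032 parameters for the stock torchvision ResNet50 (an assumption, not a number printed in this notebook) and a head made of the original 2048→1000 layer followed by a new 1000→10 layer, as in `FinetuneResNet`. The ReLU in between has no parameters.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


RESNET50_TOTAL = 25_557_032  # assumed: stock ResNet50 including its 1000-class head

orig_fc = linear_params(2048, 1000)   # original ImageNet classifier, kept in the head
new_fc = linear_params(1000, 10)      # extra layer mapping 1000 logits to 10 classes

backbone = RESNET50_TOTAL - orig_fc   # frozen in head-only mode
head = orig_fc + new_fc               # trainable in head-only mode
total = backbone + head
size_mb = total * 4 / 1e6             # 4 bytes per float32 parameter

print(f"trainable:     {head:,}")
print(f"non-trainable: {backbone:,}")
print(f"total:         {total:,} ({size_mb:.3f} MB)")
```

Rounded to one decimal place in millions, these values agree with the 2.1 M trainable, 23.5 M non-trainable, and 25.6 M total parameters reported in the log, and with the estimated size of 102.268 MB.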



In [None]:
!pip install pytorch-lightning

**Q - 3**

In [None]:
# Verify dataset directories exist
print(f"Training path found: {os.path.exists(train_dir)}")
print(f"Validation path found: {os.path.exists(val_dir)}")
print(f"Testing path found: {os.path.exists(test_dir)}")

# LightningModule for transfer learning
class ResNetFineTuner(pl.LightningModule):
    def __init__(self, finetune_mode="layer4_fc"):
        super().__init__()
        self.learning_rate = 1e-4
        self.finetune_mode = finetune_mode
        self.loss_fn = nn.CrossEntropyLoss()

        # Load pretrained ResNet50 model
        model = torchvision.models.resnet50(weights="ResNet50_Weights.DEFAULT")
        original_output = model.fc

        # Modify classifier head
        model.fc = nn.Sequential(
            original_output,
            nn.ReLU(),
            nn.Linear(1000, 10)
        )

        self.model = model
        self._set_trainable_layers()

    def _set_trainable_layers(self):
        """Freeze every parameter except those in layer4 and the classifier head (fc)."""
        for name, param in self.model.named_parameters():
            param.requires_grad = ("layer4" in name or "fc" in name)

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, _):
        inputs, labels = batch
        outputs = self(inputs)
        loss = self.loss_fn(outputs, labels)
        acc = (outputs.argmax(1) == labels).float().mean()
        self.log_dict({"train_loss": loss, "train_accuracy": acc}, prog_bar=True)
        return loss

    def validation_step(self, batch, _):
        inputs, labels = batch
        outputs = self(inputs)
        loss = self.loss_fn(outputs, labels)
        acc = (outputs.argmax(1) == labels).float().mean()
        self.log_dict({"val_loss": loss, "val_accuracy": acc}, prog_bar=True)

    def test_step(self, batch, _):
        inputs, labels = batch
        outputs = self(inputs)
        loss = self.loss_fn(outputs, labels)
        acc = (outputs.argmax(1) == labels).float().mean()
        self.log_dict({"test_loss": loss, "test_accuracy": acc}, prog_bar=True)

    def configure_optimizers(self):
        params = filter(lambda p: p.requires_grad, self.parameters())
        return torch.optim.Adam(params, lr=self.learning_rate)

# LightningDataModule for loading datasets
class ImageClassificationData(pl.LightningDataModule):
    def __init__(self, batch_size=64):
        super().__init__()
        self.train_path = train_dir
        self.val_path = val_dir
        self.test_path = test_dir
        self.batch_size = batch_size

    def setup(self, stage=None):
        transform = T.Compose([
            T.Resize((224, 224)),
            T.ToTensor(),
            T.Normalize(mean=[0.4712, 0.4600, 0.3896], std=[0.2406, 0.2301, 0.2406])
        ])

        if stage in ("fit", None):
            self.train_data = ImageFolder(self.train_path, transform=transform)
            self.val_data = ImageFolder(self.val_path, transform=transform)

        if stage in ("test", None):
            self.test_data = ImageFolder(self.test_path, transform=transform)

    def train_dataloader(self):
        return DataLoader(self.train_data, batch_size=self.batch_size, shuffle=True, num_workers=2)

    def val_dataloader(self):
        return DataLoader(self.val_data, batch_size=self.batch_size, shuffle=False, num_workers=2)

    def test_dataloader(self):
        return DataLoader(self.test_data, batch_size=self.batch_size, shuffle=False, num_workers=2)

# Run the experiment
def run_experiment():
    model_name = "resnet_last_block"
    wandb_logger = WandbLogger(project="image-classification", name=model_name)

    model = ResNetFineTuner(finetune_mode="layer4_fc")
    datamodule = ImageClassificationData()

    trainer = pl.Trainer(
        max_epochs=5,
        accelerator="auto",
        logger=wandb_logger
    )

    trainer.fit(model, datamodule=datamodule)
    trainer.test(model, datamodule=datamodule)
    wandb.finish()

# Trigger run
run_experiment()

### Inferences: Fine-Tuning vs Training from Scratch

- Fine-tuning a large pretrained model (**ResNet50**) significantly outperformed training a custom CNN from scratch, even with fewer training epochs.
- Training from scratch reached a test accuracy of **41.95%**, whereas fine-tuning the pretrained model with the **last_block** strategy roughly **doubled the accuracy** to **84.05%**.
- The **last_block** strategy was selected after evaluating all three strategies (head-only, partial, last_block) on **200 training samples per class**, and choosing the one that yielded the **highest validation accuracy**.
- Fine-tuning leverages the **transfer of general visual features** learned from large-scale datasets (e.g., ImageNet), resulting in **faster convergence** and **better generalization**.
- The **last_block** strategy offers a good trade-off between **computational efficiency** and **performance**, avoiding the risks of overfitting or instability from full-network fine-tuning while still allowing adaptation of the deeper, more task-specific layers.
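
The per-class subsampling used for the strategy comparison can be illustrated without PyTorch. The sketch below works on a plain list of labels; the toy label list, the `balanced_indices` name, and the seed are all hypothetical and only serve to show the capping behaviour with 200 samples per class.

```python
import random
from collections import Counter, defaultdict


def balanced_indices(labels, max_per_class, seed=42):
    """Pick at most `max_per_class` random indices for every class label."""
    rng = random.Random(seed)  # local generator so the draw is reproducible
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        chosen.extend(rng.sample(pool, min(max_per_class, len(pool))))
    return chosen


# Toy data: nine classes with 1000 images each and one class with only 150.
labels = [c for c in range(9) for _ in range(1000)] + [9] * 150
subset = balanced_indices(labels, max_per_class=200)
per_class = Counter(labels[i] for i in subset)
print(len(subset), dict(per_class))
```

Classes with fewer images than the cap contribute everything they have, so the subset stays as balanced as the data allows.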
