**Question 01**

### (a) Adjusting Input Image Dimensions for Pretrained Models

- **ImageNet pretrained models** (e.g., ResNet, VGG, EfficientNet) are typically trained on **224×224-pixel** inputs, although some variants (for example, the larger EfficientNet models) use higher resolutions.
- **iNaturalist dataset images** may have different sizes and aspect ratios.
- To ensure compatibility with the pretrained model:
  - **Resize** all images to match the input size expected by the model.
  - Use the `transforms.Resize()` transform from torchvision to resize images.
- Example of resizing:

  ```python
  from torchvision import transforms
  resize = transforms.Resize((224, 224))  # Resize to the input size expected by most pretrained models
  ```
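
Stretching every image to 224×224 changes its aspect ratio. A common alternative is to scale the shorter side first and then take a center crop. The arithmetic is sketched below in plain Python. `shorter_side_resize` and `center_crop_box` are hypothetical helper names, and the 256 → 224 sizes are the customary ImageNet evaluation choice, assumed here rather than taken from this notebook.

```python
def shorter_side_resize(width, height, target=256):
    """Scale so the shorter side equals `target`, keeping the aspect ratio."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop x crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


# An 800x600 landscape photo: the shorter side (height) becomes 256.
resized = shorter_side_resize(800, 600)
box = center_crop_box(*resized)
print(resized, box)
```

In torchvision this corresponds roughly to `transforms.Resize(256)` followed by `transforms.CenterCrop(224)`. It avoids distortion at the cost of discarding the image borders.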


### (b) Extending the Output Layer for a 10-Class Classification Task


- Instead of replacing the last fully connected (FC) layer, we **append** a new output layer after the existing 1000-class layer.
- This allows the model to preserve the learned 1000-class representation and apply an additional transformation for the iNaturalist task.
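
To get a feel for how many parameters this appended head contains, the count can be worked out by hand. The sketch below assumes ResNet50, whose final FC layer takes 2048 pooled features. `linear_params` is a hypothetical helper, not part of any library.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# ResNet50's final FC layer maps 2048 pooled features to 1000 ImageNet logits.
original_fc = linear_params(2048, 1000)
# The appended layer maps those 1000 logits to the 10 target classes.
new_fc = linear_params(1000, 10)

head_total = original_fc + new_fc
print(original_fc, new_fc, head_total)
```

The total of roughly 2.06 million parameters is consistent with the 2.1 M trainable parameters reported for the head-only run in the training logs later in this notebook. For comparison, replacing the FC layer outright with a 2048 → 10 layer would need only `linear_params(2048, 10)` = 20,490 parameters.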

Example modification for ResNet50:

```python
import torch.nn as nn
from torchvision import models

# Load ResNet50 with ImageNet-pretrained weights
model = models.resnet50(weights="ResNet50_Weights.DEFAULT")

# Keep the original 1000-class fully connected layer
original_fc = model.fc

# Build a new head that stacks a 10-class layer on top of it
model.fc = nn.Sequential(
    original_fc,            # Existing 1000-class output
    nn.ReLU(),              # Optional non-linearity
    nn.Linear(1000, 10)     # New output layer for 10-class classification
)
```



**Question 02**

### Strategies to Make Fine-Tuning Tractable

1. **Freezing All Layers Except the Last Layer**:
  - In this approach, all the layers of the model are frozen, meaning their weights remain unchanged during training. Only the final fully connected (FC) layer is trainable, allowing the model to adapt to the new task without requiring full retraining.
  - This approach leverages the pretrained features in the earlier layers, which are generally useful for a wide range of tasks, while only training the final layer to match the specific 10-class task.

  Example:
  ```python
  for param in model.parameters():
      param.requires_grad = False  # Freeze all layers

  # Unfreeze only the parameters of the final FC head
  for param in model.fc.parameters():
      param.requires_grad = True
  ```

2. **Freezing Layers Up to the k-th Layer**:
  - In this strategy, the first few layers are frozen, and only the layers after a certain point (e.g., the k-th layer) are allowed to be updated. The idea here is to freeze the early layers which learn basic features (like edges and textures) that are likely transferable across different tasks, while fine-tuning the later layers to capture task-specific details.
  - This approach is a middle ground between freezing all layers and training the entire model.

  Example:
  ```python
  k = 5  # Number of leading top-level blocks to freeze (example value)

  for i, child in enumerate(model.children()):
      trainable = i >= k  # Freeze the first k blocks, train the rest
      for param in child.parameters():
          param.requires_grad = trainable
  ```

3. **Fine-Tuning Only the Last Few Layers**:
  - Another approach is to **fine-tune only the last few layers**, such as the final few convolutional or dense layers. The idea behind this strategy is to allow the model to adapt to the specific characteristics of the new dataset while retaining the powerful features learned by the earlier layers.
  - In this case, the number of layers to fine-tune can be adjusted based on computational constraints and the task's complexity.

  Example:
  ```python
  # Freeze everything, then unfreeze the last residual stage and the head
  for param in model.parameters():
      param.requires_grad = False

  for module in (model.layer4, model.fc):
      for param in module.parameters():
          param.requires_grad = True
  ```
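
The three strategies above differ only in which top-level blocks stay trainable. The sketch below spells that out in plain Python for ResNet50, without loading the model. `freeze_plan` is a hypothetical helper, the strategy names mirror those used in the training code later in this notebook, and `k=5` is an example value.

```python
# Top-level children of torchvision's ResNet50, in definition order.
RESNET50_CHILDREN = [
    "conv1", "bn1", "relu", "maxpool",
    "layer1", "layer2", "layer3", "layer4",
    "avgpool", "fc",
]


def freeze_plan(strategy, k=5):
    """Map each child module name to True (trainable) or False (frozen)."""
    if strategy == "head_only":
        return {name: name == "fc" for name in RESNET50_CHILDREN}
    if strategy == "partial":
        # Freeze the first k children, train everything after them.
        return {name: i >= k for i, name in enumerate(RESNET50_CHILDREN)}
    if strategy == "last_block":
        return {name: name in ("layer4", "fc") for name in RESNET50_CHILDREN}
    raise ValueError(f"unknown strategy: {strategy}")


for strategy in ("head_only", "partial", "last_block"):
    plan = freeze_plan(strategy)
    trainable = [name for name, flag in plan.items() if flag]
    print(f"{strategy:>10}: {trainable}")
```

Note that `relu`, `maxpool`, and `avgpool` hold no parameters, so marking them trainable or frozen has no practical effect. With `k=5`, the `partial` plan freezes only the stem and `layer1`, which is why it leaves most of the network trainable.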


In [1]:
import os
import torch
import wandb
import random
import torchvision
import torch.nn as nn
import pytorch_lightning as pl
from torch.utils.data import DataLoader, Subset
from torchvision.datasets import ImageFolder
import torchvision.transforms as T
from pytorch_lightning.callbacks.early_stopping import EarlyStopping
from pytorch_lightning.loggers.wandb import WandbLogger
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# Set global seed for reproducibility
pl.seed_everything(42)
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

# Extract dataset if needed
zip_file = "/content/drive/MyDrive/nature_12K.zip"
extracted_dir = "/content/inaturalist_12K/train"

if not os.path.exists(extracted_dir):
    !cp "{zip_file}" .
    !unzip -qq nature_12K.zip
    !rm nature_12K.zip

# Helper to create a class-balanced subset
def sample_balanced_data(dataset, per_class_limit = 100):
    class_to_indices = {label: [] for label in range(len(dataset.classes))}
    for idx, (_, cls) in enumerate(dataset.samples):
        class_to_indices[cls].append(idx)
    selected = []
    for indices in class_to_indices.values():
        selected.extend(random.sample(indices, min(per_class_limit, len(indices))))
    return Subset(dataset, selected)

# Fine-tuning class using PyTorch Lightning
class TransferLearner(pl.LightningModule):
    def __init__(self, mode="head_only", start_unfreeze_layer=5):
        super().__init__()
        self.strategy = mode
        self.lr = 1e-4
        self.criterion = nn.CrossEntropyLoss()

        # Use ResNet50 pretrained model
        net = torchvision.models.resnet50(weights="ResNet50_Weights.DEFAULT")
        original_fc = net.fc  # Retain the original 1000-class FC layer

        # Replace fc with extended 10-class output layer
        net.fc = nn.Sequential(
            original_fc,         # 1000-class output
            nn.ReLU(),           # Optional non-linearity
            nn.Linear(1000, 10)  # Final output layer for iNaturalist
        )

        self.backbone = net
        self.configure_finetune(start_unfreeze_layer)

    def configure_finetune(self, unfreeze_from):
        """Freeze layers based on the selected strategy."""
        if self.strategy == "head_only":
            for param in self.backbone.parameters():
                param.requires_grad = False
            for param in self.backbone.fc.parameters():
                param.requires_grad = True

        elif self.strategy == "partial":
            child_count = 0
            for child in self.backbone.children():
                child_count += 1
                requires_grad = child_count > unfreeze_from
                for param in child.parameters():
                    param.requires_grad = requires_grad

        elif self.strategy == "last_block":
            for name, param in self.backbone.named_parameters():
                if "layer4" in name or "fc" in name:
                    param.requires_grad = True
                else:
                    param.requires_grad = False

    def forward(self, x):
        return self.backbone(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        preds = self(x)
        loss = self.criterion(preds, y)
        acc = (preds.argmax(1) == y).float().mean()
        self.log_dict({"train_loss": loss, "train_acc": acc}, prog_bar=True)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        preds = self(x)
        loss = self.criterion(preds, y)
        acc = (preds.argmax(1) == y).float().mean()
        self.log_dict({"val_loss": loss, "val_acc": acc}, prog_bar=True)

    def configure_optimizers(self):
        trainable_params = filter(lambda p: p.requires_grad, self.parameters())
        return torch.optim.Adam(trainable_params, lr=self.lr)

# Custom DataModule: resizes and normalizes images, then builds a class-balanced train/val split
class INaturalistDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=64):
        super().__init__()
        self.path = "/content/inaturalist_12K/train"
        self.batch_size = batch_size

    def setup(self, stage=None):
        preprocess = T.Compose([
            T.Resize((224, 224)),  # Ensures ImageNet compatibility
            T.ToTensor(),
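            # Note: these are not the ImageNet statistics
            # (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) used to pretrain the backbone
            T.Normalize(mean=[0.4712, 0.4600, 0.3896], std=[0.2406, 0.2301, 0.2406])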
        ])
        base_data = ImageFolder(self.path, transform=preprocess)
        subset = sample_balanced_data(base_data)
        val_split = int(0.2 * len(subset))
        train_split = len(subset) - val_split
        self.train_set, self.val_set = torch.utils.data.random_split(subset, [train_split, val_split])

    def train_dataloader(self):
        return DataLoader(self.train_set, batch_size=self.batch_size, shuffle=True, num_workers=2)

    def val_dataloader(self):
        return DataLoader(self.val_set, batch_size=self.batch_size, shuffle=False, num_workers=2)

# Launch a single experiment with W&B logging
def launch(mode_name):
    wandb_logger = WandbLogger(project="inat-resnet-ft", name=mode_name)
    learner = TransferLearner(mode=mode_name)
    data = INaturalistDataModule()
    trainer = pl.Trainer(
        max_epochs=5,
        logger=wandb_logger,
        accelerator="auto",
        callbacks=[EarlyStopping(monitor="val_acc", mode="max", patience=2)],
    )
    trainer.fit(learner, datamodule=data)
    wandb.finish()

# Try different strategies
for strategy in ["head_only", "partial", "last_block"]:
    launch(strategy)

INFO:lightning_fabric.utilities.seed:Seed set to 42


Mounted at /content/drive


INFO:pytorch_lightning.utilities.rank_zero:You are using the plain ModelCheckpoint callback. Consider using LitModelCheckpoint which with seamless uploading to Model registry.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.
[34m[1mwandb[0m: Currently logged in as: [33mtejaswiniksssn[0m ([33mtejaswiniksssn-indian-institute-of-technology-madras[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type             | Params | Mode 
-------------------------------------------------------
0 | criterion | CrossEntropyLoss | 0      | train
1 | backbone  | ResNet           | 25.6 M | train
-------------------------------------------------------
2.1 M     Trainable params
23.5 M    Non-trainable params
25.6 M    Total params
102.268   Total estimated model params size (MB)
155       Modules in train mode
0         Modules in eval mode


/usr/local/lib/python3.11/dist-packages/pytorch_lightning/loops/fit_loop.py:310: The number of training batches (13) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=5` reached.


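Run history (`head_only`):

| Metric | Trend |
|---|---|
| epoch | ▁▃▅▆▆█ |
| train_acc | ▁ |
| train_loss | ▁ |
| trainer/global_step | ▁▃▅▆▆█ |
| val_acc | ▁▅▆▇█ |
| val_loss | █▆▄▂▁ |

Run summary (`head_only`):

| Metric | Value |
|---|---|
| epoch | 4.0 |
| train_acc | 0.48438 |
| train_loss | 1.76896 |
| trainer/global_step | 64.0 |
| val_acc | 0.57 |
| val_loss | 1.62048 |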


INFO:pytorch_lightning.utilities.rank_zero:You are using the plain ModelCheckpoint callback. Consider using LitModelCheckpoint which with seamless uploading to Model registry.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type             | Params | Mode 
-------------------------------------------------------
0 | criterion | CrossEntropyLoss | 0      | train
1 | backbone  | ResNet           | 25.6 M | train
-------------------------------------------------------
25.3 M    Trainable params
225 K     Non-trainable params
25.6 M    Total params
102.268   Total estimated model params size (MB)
155       Modules in train mode
0         Modules in eval mode


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=5` reached.


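Run history (`partial`):

| Metric | Trend |
|---|---|
| epoch | ▁▃▅▆▆█ |
| train_acc | ▁ |
| train_loss | ▁ |
| trainer/global_step | ▁▃▅▆▆█ |
| val_acc | ▁▆▇██ |
| val_loss | █▅▂▁▁ |

Run summary (`partial`):

| Metric | Value |
|---|---|
| epoch | 4.0 |
| train_acc | 1.0 |
| train_loss | 0.16762 |
| trainer/global_step | 64.0 |
| val_acc | 0.67 |
| val_loss | 0.95374 |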


INFO:pytorch_lightning.utilities.rank_zero:You are using the plain ModelCheckpoint callback. Consider using LitModelCheckpoint which with seamless uploading to Model registry.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type             | Params | Mode 
-------------------------------------------------------
0 | criterion | CrossEntropyLoss | 0      | train
1 | backbone  | ResNet           | 25.6 M | train
-------------------------------------------------------
17.0 M    Trainable params
8.5 M     Non-trainable params
25.6 M    Total params
102.268   Total estimated model params size (MB)
155       Modules in train mode
0         Modules in eval mode


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=5` reached.


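Run history (`last_block`):

| Metric | Trend |
|---|---|
| epoch | ▁▃▅▆▆█ |
| train_acc | ▁ |
| train_loss | ▁ |
| trainer/global_step | ▁▃▅▆▆█ |
| val_acc | ▁▅▆██ |
| val_loss | █▅▃▁▁ |

Run summary (`last_block`):

| Metric | Value |
|---|---|
| epoch | 4.0 |
| train_acc | 0.96875 |
| train_loss | 0.35129 |
| trainer/global_step | 64.0 |
| val_acc | 0.71 |
| val_loss | 0.85683 |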
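
The three runs above can be put side by side with a few lines of Python. This is a minimal sketch: the figures are copied from the logs, and it assumes the runs appear in the order of the strategy loop (`head_only`, `partial`, `last_block`), which the trainable-parameter counts support.

```python
# Figures copied from the three run logs above (parameters in millions).
TOTAL_PARAMS_M = 25.6
runs = {
    "head_only":  {"trainable_m": 2.1,  "val_acc": 0.57},
    "partial":    {"trainable_m": 25.3, "val_acc": 0.67},
    "last_block": {"trainable_m": 17.0, "val_acc": 0.71},
}

for name, run in runs.items():
    share = 100 * run["trainable_m"] / TOTAL_PARAMS_M
    print(f"{name:>10}: {share:5.1f}% trainable, val_acc = {run['val_acc']:.2f}")

best = max(runs, key=lambda name: runs[name]["val_acc"])
print("highest validation accuracy:", best)
```

These numbers come from a single seed, five epochs, and a small validation split, so the gap between `partial` and `last_block` should be read as indicative rather than conclusive.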


**Question 03**

In [2]:
import os
import torch
import wandb
import torchvision
import torch.nn as nn
import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
import torchvision.transforms as T
from pytorch_lightning.loggers.wandb import WandbLogger
from google.colab import drive

drive.mount('/content/drive', force_remount=True)

# Set global seed for reproducibility
pl.seed_everything(42)
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

# Extract dataset if needed
zip_file = "/content/drive/MyDrive/nature_12K.zip"
extracted_dir = "/content/inaturalist_12K/train"

if not os.path.exists(extracted_dir):
    !cp "{zip_file}" .
    !unzip -qq nature_12K.zip
    !rm nature_12K.zip

# Fine-tuning class using PyTorch Lightning
class TransferLearner(pl.LightningModule):
    def __init__(self, mode="last_block"):
        super().__init__()
        self.strategy = mode
        self.lr = 1e-4
        self.criterion = nn.CrossEntropyLoss()

        # Use ResNet50 pretrained model
        net = torchvision.models.resnet50(weights="ResNet50_Weights.DEFAULT")
        original_fc = net.fc  # Retain the original 1000-class FC layer

        # Replace fc with extended 10-class output layer
        net.fc = nn.Sequential(
            original_fc,         # 1000-class output
            nn.ReLU(),           # Optional non-linearity
            nn.Linear(1000, 10)  # Final output layer for iNaturalist
        )

        self.backbone = net
        self.configure_finetune()

    def configure_finetune(self):
        """Freeze layers based on the selected strategy."""
        if self.strategy == "last_block":
            for name, param in self.backbone.named_parameters():
                if "layer4" in name or "fc" in name:
                    param.requires_grad = True
                else:
                    param.requires_grad = False

    def forward(self, x):
        return self.backbone(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        preds = self(x)
        loss = self.criterion(preds, y)
        acc = (preds.argmax(1) == y).float().mean()
        self.log_dict({"train_loss": loss, "train_acc": acc}, prog_bar=True)
        return loss

    def test_step(self, batch, batch_idx):
        x, y = batch
        preds = self(x)
        loss = self.criterion(preds, y)
        acc = (preds.argmax(1) == y).float().mean()
        self.log_dict({"test_loss": loss, "test_acc": acc}, prog_bar=True)

    def configure_optimizers(self):
        trainable_params = filter(lambda p: p.requires_grad, self.parameters())
        return torch.optim.Adam(trainable_params, lr=self.lr)

# Custom DataModule for iNaturalist with resizing and normalization
class INaturalistDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=64):
        super().__init__()
        self.path = "/content/inaturalist_12K"
        self.batch_size = batch_size

    def setup(self, stage=None):
        preprocess = T.Compose([
            T.Resize((224, 224)),  # Ensures ImageNet compatibility
            T.ToTensor(),
            T.Normalize(mean=[0.4712, 0.4600, 0.3896], std=[0.2406, 0.2301, 0.2406])
        ])
        # Load the entire dataset
        train_data = ImageFolder(os.path.join(self.path, "train"), transform=preprocess)
        test_data = ImageFolder(os.path.join(self.path, "val"), transform=preprocess)

        # Set up datasets
        self.train_set = train_data
        self.test_set = test_data

    def train_dataloader(self):
        return DataLoader(self.train_set, batch_size=self.batch_size, shuffle=True, num_workers=2)

    def test_dataloader(self):
        return DataLoader(self.test_set, batch_size=self.batch_size, shuffle=False, num_workers=2)

# Launch a single experiment with W&B logging for 'last_block' fine-tuning strategy
def launch():
    mode_name = "last_block"
    wandb_logger = WandbLogger(project="inat-resnet-ft", name=mode_name)
    learner = TransferLearner(mode=mode_name)
    data = INaturalistDataModule()
    trainer = pl.Trainer(
        max_epochs=5,
        logger=wandb_logger,
        accelerator="auto",
    )
    trainer.fit(learner, datamodule=data)
    trainer.test(learner, datamodule=data)  # Run the test set after training
    wandb.finish()

# Run the experiment with last block fine-tuning
launch()

INFO:lightning_fabric.utilities.seed:Seed set to 42


Mounted at /content/drive


INFO:pytorch_lightning.utilities.rank_zero:You are using the plain ModelCheckpoint callback. Consider using LitModelCheckpoint which with seamless uploading to Model registry.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type             | Params | Mode 
-------------------------------------------------------
0 | criterion | CrossEntropyLoss | 0      | train
1 | backbone  | ResNet           | 25.6 M | train
-------------------------------------------------------
17.0 M    Trainable params
8.5 M     Non-trainable params
25.6 M    Total params
102.268   Total estimated model params size (MB)
155       Modules in train mode
0         Modules in eval mode


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=5` reached.
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


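Run history (`last_block`, full training set):

| Metric | Trend |
|---|---|
| epoch | ▁▁▁▂▂▂▄▄▄▅▅▅▇▇▇█ |
| test_acc | ▁ |
| test_loss | ▁ |
| train_acc | ▁▂▇▅▇▇▇█▇██████ |
| train_loss | █▆▃▃▂▂▂▁▂▂▁▁▂▁▁ |
| trainer/global_step | ▁▁▂▂▃▃▄▄▅▅▆▆▇▇██ |

Run summary (`last_block`, full training set):

| Metric | Value |
|---|---|
| epoch | 5.0 |
| test_acc | 0.8405 |
| test_loss | 0.64107 |
| train_acc | 1.0 |
| train_loss | 0.01008 |
| trainer/global_step | 785.0 |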
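
The step counts in both experiments can be sanity-checked with simple arithmetic. The sketch below assumes every class has at least 100 images (so the balanced subset in Question 02 holds 1,000 samples), a batch size of 64, and no gradient accumulation. `steps_per_epoch` is a hypothetical helper.

```python
import math


def steps_per_epoch(num_samples, batch_size=64):
    """Number of optimizer steps in one epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


# Question 02: 10 classes x 100 images, 80% used for training.
subset_size = 10 * 100
train_size = subset_size - int(0.2 * subset_size)
print(train_size, steps_per_epoch(train_size))   # 800 samples, 13 batches

# Question 03: 785 total steps over 5 epochs on the full training folder.
batches = 785 // 5
smallest = (batches - 1) * 64 + 1
largest = batches * 64
print(batches, smallest, largest)
```

The 13 batches per epoch match the Lightning warning in the Question 02 logs. The 157 batches per epoch in Question 03 imply a training folder of roughly 10,000 images (between 9,985 and 10,048).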


In [None]:
import nbformat
from google.colab import drive
import os

drive.mount('/content/drive')
# Path to current notebook
notebook_path = '/content/drive/MyDrive/PartB.ipynb'  # Change this!

# Load the notebook
with open(notebook_path, 'r', encoding='utf-8') as f:
    nb = nbformat.read(f, as_version=4)

# Remove metadata.widgets
if 'widgets' in nb['metadata']:
    del nb['metadata']['widgets']
    print("Removed 'metadata.widgets'")
else:
    print("No 'metadata.widgets' found")

# Save back to the same file (or change filename)
with open(notebook_path, 'w', encoding='utf-8') as f:
    nbformat.write(nb, f)
    print("Saved cleaned notebook")