### <font style="color:blue">Project 2: Kaggle Competition - Classification</font>

#### Maximum Points: 100

<div>
    <table>
        <tr><td><h3>Sr. no.</h3></td> <td><h3>Section</h3></td> <td><h3>Points</h3></td> </tr>
        <tr><td><h3>1</h3></td> <td><h3>Data Loader</h3></td> <td><h3>10</h3></td> </tr>
        <tr><td><h3>2</h3></td> <td><h3>Configuration</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>3</h3></td> <td><h3>Evaluation Metric</h3></td> <td><h3>10</h3></td> </tr>
        <tr><td><h3>4</h3></td> <td><h3>Train and Validation</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>5</h3></td> <td><h3>Model</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>6</h3></td> <td><h3>Utils</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>7</h3></td> <td><h3>Experiment</h3></td><td><h3>5</h3></td> </tr>
        <tr><td><h3>8</h3></td> <td><h3>TensorBoard Dev Scalars Log Link</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>9</h3></td> <td><h3>Kaggle Profile Link</h3></td> <td><h3>50</h3></td> </tr>
    </table>
</div>


## <font style="color:green">1. Data Loader [10 Points]</font>

In this section, you have to write a class or methods that will be used to get the training and validation data loaders.

You need to write a custom dataset class to load data.

**Note: There is no separate validation data, so you will have to create your own validation set by dividing the train data into train and validation data. Usually, we use an 80:20 ratio for train and validation, respectively.**


For example:

```python
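from torch.utils.data import Dataset


class KenyanFood13Dataset(Dataset):
    """Skeleton of a custom dataset; fill in the method bodies."""

    def __init__(self, *args):
        ...

    def __len__(self):
        ...

    def __getitem__(self, idx):
        ...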
```


```python
def get_data(args1, *args):
    ...  # build the datasets and wrap them in DataLoader objects here
    return train_loader, val_loader
```
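
The 80:20 split mentioned in the note above can be done per class, so that both subsets keep roughly the same class proportions. A plain "first 80% of rows" cut does not guarantee this if the CSV happens to be ordered by class. The following is a minimal sketch using only the standard library; `stratified_split` and the toy labels are hypothetical names for illustration and are not part of the assignment.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=42):
    """Return (train_indices, val_indices) with per-class proportions preserved."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # local RNG so the split is reproducible
    train_indices, val_indices = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = int(train_fraction * len(indices))
        train_indices.extend(indices[:cut])
        val_indices.extend(indices[cut:])
    return sorted(train_indices), sorted(val_indices)


# Toy example: 10 samples of class "a" followed by 5 samples of class "b".
toy_labels = ["a"] * 10 + ["b"] * 5
train_idx, val_idx = stratified_split(toy_labels)
print(len(train_idx), len(val_idx))
```

The returned index lists could then be used to select rows of the annotations table (for example with `DataFrame.iloc`) before building the train and validation datasets.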

In [None]:
#!/usr/bin/env python3
#global flag to indicate if the script is running in a local environment
g_local_run: bool = True
g_measure_mean_std = True

In [None]:
%matplotlib inline

import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from torchvision.transforms import functional as F

import lightning as L
from lightning.pytorch.callbacks import EarlyStopping, ModelCheckpoint
from lightning.pytorch.loggers import TensorBoardLogger
from torchmetrics.classification import  MulticlassAccuracy, MulticlassF1Score, MulticlassPrecision, MulticlassRecall
from torchmetrics import MeanMetric



import matplotlib.pyplot as plt

import os
import numpy as np
import pandas as pd

from PIL import Image


In [None]:
class KenyanFood13Dataset(Dataset):
    """Custom dataset for the Kenyan Food 13 classification task.

    Accepts a CSV file with image ID and label columns and splits the rows
    80:20: the first 80% are used for training, the remaining 20% for validation.

    Args:
        annotations_file (string): Path to the CSV file with annotations.
        img_dir (string): Directory with all the images.
        train (bool, optional): Selects the split. True (default) gives the
            training set, False the validation set.
        transform (callable, optional): Optional transform to be applied
            to an image.
        target_transform (callable, optional): Optional transform to be applied
            to a label.
    """
    def __init__(self, annotations_file, img_dir, train=True, transform=None, target_transform=None):

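        if g_local_run:
            print("Running in local mode - loading data from local paths")
            # Append '_local' to the annotations file name, unless a caller
            # (e.g. the DataModule) has already done so
            base, ext = os.path.splitext(annotations_file)
            if not base.endswith("_local"):
                annotations_file = f"{base}_local{ext}"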

        # few error checks
        if not os.path.exists(annotations_file):
            raise FileNotFoundError(f"Annotations file not found: {annotations_file}")
        if not os.path.exists(img_dir):
            raise FileNotFoundError(f"Image directory not found: {img_dir}")

        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform
        self.local_run = g_local_run

        num_classes = len(self.img_labels['label'].unique())
        self.num_classes = num_classes
        print(f"Dataset initialized with {len(self.img_labels)} samples belonging to {num_classes} classes.")
        #split into train and validation sets
        split_index = int(0.8 * len(self.img_labels))
        if train:
            self.img_labels = self.img_labels.iloc[:split_index].reset_index(drop=True)
            print(f"Using {len(self.img_labels)} samples for training.")
        else:
            self.img_labels = self.img_labels.iloc[split_index:].reset_index(drop=True)
            print(f"Using {len(self.img_labels)} samples for validation.")

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()

        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = Image.open(img_path).convert("RGB")
        label = self.img_labels.iloc[idx, 1]

        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)

        return image, label

    def __len__(self):
        return len(self.img_labels)


In [None]:
# DataModule for Kenyan Food 13 Dataset
class KenyanFood13DataModule(L.LightningDataModule):
    def __init__(self, data_config, mean=None, std=None):
        super().__init__()
        self.data_config = data_config
        # In local mode, point the config at the '_local' annotations file
        if g_local_run:
            print("Running in local mode - loading data from local paths")
            # Append '_local' only once so re-creating the DataModule is safe
            base, ext = os.path.splitext(self.data_config.annotations_file)
            if not base.endswith("_local"):
                self.data_config.annotations_file = f"{base}_local{ext}"
        if not os.path.exists(self.data_config.annotations_file):
            raise FileNotFoundError(f"Annotations file not found: {self.data_config.annotations_file}")

        self.train_dataset = None
        self.val_dataset = None

        # Mean and Std for normalization - use provided values or defaults (ImageNet stats)
        self.mean = mean if mean is not None else [0.485, 0.456, 0.406]
        self.std = std if std is not None else [0.229, 0.224, 0.225]

    def setup(self, stage=None):
        # Define transforms, will have common transforms for train and val, and augmentation for train only
        #get height and width from data_config
        if isinstance(self.data_config.input_size, int):
            img_height = self.data_config.input_size
            img_width = self.data_config.input_size
        elif isinstance(self.data_config.input_size, tuple) and len(self.data_config.input_size) == 2:
            img_height = self.data_config.input_size[0]
            img_width = self.data_config.input_size[1]
        else:
            raise ValueError("input_size must be an int or a tuple of two ints (height, width)")

        common_transforms = transforms.Compose([
            transforms.Resize((img_height, img_width)),
            transforms.ToTensor(),
            transforms.Normalize(mean=self.mean, std=self.std)
        ])
        aug_transforms = transforms.Compose([
            transforms.RandomResizedCrop((img_height, img_width)),
            transforms.RandomHorizontalFlip(),
            transforms.RandomVerticalFlip(),
            transforms.RandomRotation(15),
            transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
            transforms.ToTensor(),
            transforms.Normalize(mean=self.mean, std=self.std),
            transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3))
        ])
        # Create datasets
        self.train_dataset = KenyanFood13Dataset(
            annotations_file=self.data_config.annotations_file,
            img_dir=self.data_config.img_dir,
            train=True,
            transform=aug_transforms)
        self.val_dataset = KenyanFood13Dataset(
            annotations_file=self.data_config.annotations_file,
            img_dir=self.data_config.img_dir,
            train=False,
            transform=common_transforms)
        self.num_classes = self.train_dataset.num_classes

    def train_dataloader(self):
        if self.train_dataset is None:
            raise RuntimeError("train_dataset is not initialized. Call setup() before requesting train_dataloader.")
        return DataLoader(self.train_dataset, batch_size=self.data_config.batch_size, shuffle=True, num_workers=self.data_config.num_workers)

    def val_dataloader(self):
        if self.val_dataset is None:
            raise RuntimeError("val_dataset is not initialized. Call setup() before requesting val_dataloader.")
        return DataLoader(self.val_dataset, batch_size=self.data_config.batch_size, shuffle=False, num_workers=self.data_config.num_workers)


## <font style="color:green">2. Configuration [5 Points]</font>

**Define your configuration here.**

For example:


```python
@dataclass
class TrainingConfiguration:
    '''
    Describes configuration of the training process
    '''
    batch_size: int = 10 
    epochs_count: int = 50  
    init_learning_rate: float = 0.1  # initial learning rate for lr scheduler
    log_interval: int = 5  
    test_interval: int = 1  
    data_root: str = "/kaggle/input/opencv-pytorch-project-2-classification-round-3" 
    num_workers: int = 2  
    device: str = 'cuda'  
    
```
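
Because a configuration is a plain dataclass, experiment variants can be derived from a base configuration without mutating its defaults. The sketch below uses the standard library's `dataclasses.replace`; `DemoConfig` and its fields are hypothetical stand-ins for illustration, not the required configuration.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical, trimmed-down training configuration."""
    batch_size: int = 32
    learning_rate: float = 0.001
    optimizer: str = "sgd"


base = DemoConfig()
# replace() returns a new instance; `base` is left untouched.
adam_run = replace(base, optimizer="adam", learning_rate=0.0003)

print(asdict(base))
print(asdict(adam_run))
```

Marking the dataclass as `frozen=True` is a design choice in this sketch: it turns accidental in-place edits of a shared configuration into errors.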

In [None]:
# configurations
from dataclasses import dataclass


@dataclass
class TrainingConfiguration:
    batch_size: int = 32
    learning_rate: float = 0.001
    num_epochs: int = 10
    momentum: float = 0.9
    log_interval: int = 10
    random_seed: int = 42

    # Optimizer configuration
    optimizer: str = "sgd"  # Options: 'sgd', 'adam', 'adamw'
    weight_decay: float = 0.0001  # L2 regularization

    # Learning rate scheduler configuration
    use_scheduler: bool = True
    scheduler: str = "step"  # Options: 'step', 'cosine', 'reduce_on_plateau'
    lr_step_size: int = 5  # For StepLR: step size for learning rate decay
    lr_gamma: float = 0.1  # For StepLR: multiplicative factor of learning rate decay

    model_name: str = "googlenet" # base model we will use for transfer learning and fine-tuning
    pretrained: bool = True # use pretrained weights for the base model
    precision: str = "float32" # precision for training: float32, float16, bfloat16
    fine_tune_start: int = 5 # layer from which to start fine-tuning (1 means all layers, higher means fewer layers)


@dataclass
class DataConfiguration:
    annotations_file: str = "../data/kenyan-food-13/train.csv" if g_local_run else "/kaggle/input/kenyan-food-13/train.csv"
    img_dir: str = "../data/kenyan-food-13/images/images" if g_local_run else "/kaggle/input/kenyan-food-13/images/images"
    input_size: int = 224 # input image size for the model
    num_workers: int = 4 # number of workers for data loading
    batch_size: int = 32 # batch size for training and validation


@dataclass
class SystemConfiguration:
    device: str = "cuda" if torch.cuda.is_available() else "cpu"
    output_dir: str = "./output" if g_local_run else "/kaggle/working/output"

In [None]:
data_config = DataConfiguration()
train_config = TrainingConfiguration()
system_config = SystemConfiguration()

## <font style="color:green">3. Evaluation Metric [10 Points]</font>

**Define methods or classes that will be used in model evaluation. For example, accuracy, f1-score etc.**
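
To make macro averaging concrete: precision, recall and F1 are computed separately for each class and then averaged with equal weight, so a rare class counts as much as a common one. Plain accuracy, by contrast, is simply the fraction of correct predictions. The following pure-Python sketch works on toy labels; `macro_scores` is a hypothetical helper written for illustration, not a library function.

```python
def macro_scores(targets, predictions, num_classes):
    """Return (accuracy, macro_precision, macro_recall, macro_f1)."""
    pairs = list(zip(targets, predictions))
    precisions, recalls, f1_scores = [], [], []
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Macro average: every class contributes equally, regardless of its size.
    return (accuracy,
            sum(precisions) / num_classes,
            sum(recalls) / num_classes,
            sum(f1_scores) / num_classes)


# Toy example with three classes of unequal size.
toy_targets = [0, 0, 0, 0, 1, 1, 2, 2]
toy_predictions = [0, 0, 0, 1, 1, 1, 2, 0]
accuracy, precision, recall, f1 = macro_scores(toy_targets, toy_predictions, 3)
print(accuracy, precision, recall, f1)
```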

In [None]:
# Accuracy, F1-score, precision and recall are computed with torchmetrics
# inside the LightningModule defined in the next cell.


In [None]:
# LightningModule, we will use GoogleNet as base model for transfer learning and fine-tuning.
import torchvision


class KenyanFood13Classifier(L.LightningModule):
    def __init__(self, training_config: TrainingConfiguration, num_classes: int):
        super(KenyanFood13Classifier, self).__init__()
        self.save_hyperparameters()

        # Store training configuration
        self.training_config = training_config
        self.num_classes = num_classes

        # Load base model
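        if training_config.model_name == "googlenet":
            if training_config.pretrained:
                # `weights=` replaces the deprecated `pretrained=` argument (torchvision >= 0.13)
                self.model = torchvision.models.googlenet(
                    weights=torchvision.models.GoogLeNet_Weights.DEFAULT)
            else:
                # Without pretrained weights, disable the auxiliary heads so that
                # forward() returns a plain tensor in training mode
                self.model = torchvision.models.googlenet(weights=None, aux_logits=False)
            # Replace the final layer
            self.model.fc = torch.nn.Linear(self.model.fc.in_features, num_classes)
        else:
            raise ValueError(f"Model {training_config.model_name} not supported.")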

        self.criterion = torch.nn.CrossEntropyLoss()
        self.train_mean_loss = MeanMetric()
        self.val_mean_loss = MeanMetric()

        self.train_accuracy = MulticlassAccuracy(num_classes=num_classes, average='macro')
        self.val_accuracy = MulticlassAccuracy(num_classes=num_classes, average='macro')
        self.train_f1 = MulticlassF1Score(num_classes=num_classes, average='macro')
        self.val_f1 = MulticlassF1Score(num_classes=num_classes, average='macro')
        self.train_precision = MulticlassPrecision(num_classes=num_classes, average='macro')
        self.val_precision = MulticlassPrecision(num_classes=num_classes, average='macro')
        self.train_recall = MulticlassRecall(num_classes=num_classes, average='macro')
        self.val_recall = MulticlassRecall(num_classes=num_classes, average='macro')



    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        # get data from batch images, labels
        images, labels = batch
        # predictions
        outputs = self(images)
        # calculate loss, uses cross-entropy loss
        loss = self.criterion(outputs, labels)
        self.train_mean_loss.update(loss)

        preds = torch.argmax(outputs, dim=1)
        self.train_accuracy.update(preds, labels)
        self.train_precision.update(preds, labels)
        self.train_recall.update(preds, labels)
        self.train_f1.update(preds, labels)
        self.log('train/loss', self.train_mean_loss, on_step=True, on_epoch=True, prog_bar=True)
        self.log('train/acc', self.train_accuracy, on_step=True, on_epoch=True, prog_bar=True)

        return loss

    def on_train_epoch_end(self) -> None:
        # Log epoch-level metrics, then reset them so the next epoch starts clean
        self.log('train/precision', self.train_precision.compute(), on_epoch=True, prog_bar=True)
        self.log('train/recall', self.train_recall.compute(), on_epoch=True, prog_bar=True)
        self.log('train/f1', self.train_f1.compute(), on_epoch=True, prog_bar=True)
        self.log('step', float(self.current_epoch), on_epoch=True, prog_bar=True)

        self.train_precision.reset()
        self.train_recall.reset()
        self.train_f1.reset()

        return super().on_train_epoch_end()

    def validation_step(self, batch, batch_idx):
        # get data from batch images, labels
        images, labels = batch
        # predictions
        outputs = self(images)
        # calculate loss, uses cross-entropy loss
        loss = self.criterion(outputs, labels)
        self.val_mean_loss.update(loss)

        preds = torch.argmax(outputs, dim=1)
        self.val_accuracy.update(preds, labels)
        self.val_precision.update(preds, labels)
        self.val_recall.update(preds, labels)
        self.val_f1.update(preds, labels)

        # Log metrics
        self.log('valid/loss', self.val_mean_loss, on_step=False, on_epoch=True, prog_bar=True)
        self.log('valid/acc', self.val_accuracy, on_step=False, on_epoch=True, prog_bar=True)
        self.log('valid/precision', self.val_precision, on_step=False, on_epoch=True, prog_bar=False)
        self.log('valid/recall', self.val_recall, on_step=False, on_epoch=True, prog_bar=False)
        self.log('valid/f1', self.val_f1, on_step=False, on_epoch=True, prog_bar=False)

    def on_validation_epoch_end(self) -> None:
        # The metric objects logged in validation_step with on_epoch=True are
        # computed and reset by Lightning at epoch end, so only the epoch
        # index is logged here.
        self.log('step', float(self.current_epoch), on_epoch=True, prog_bar=True)

        return super().on_validation_epoch_end()


    def configure_optimizers(self):
        # Create optimizer based on configuration
        if self.training_config.optimizer.lower() == "sgd":
            optimizer = torch.optim.SGD(
                self.parameters(),
                lr=self.training_config.learning_rate,
                momentum=self.training_config.momentum,
                weight_decay=self.training_config.weight_decay
            )
        elif self.training_config.optimizer.lower() == "adam":
            optimizer = torch.optim.Adam(
                self.parameters(),
                lr=self.training_config.learning_rate,
                weight_decay=self.training_config.weight_decay
            )
        elif self.training_config.optimizer.lower() == "adamw":
            optimizer = torch.optim.AdamW(
                self.parameters(),
                lr=self.training_config.learning_rate,
                weight_decay=self.training_config.weight_decay
            )
        else:
            raise ValueError(f"Optimizer {self.training_config.optimizer} not supported.")

        # Configure learning rate scheduler if enabled
        if self.training_config.use_scheduler:
            if self.training_config.scheduler.lower() == "step":
                scheduler = torch.optim.lr_scheduler.StepLR(
                    optimizer,
                    step_size=self.training_config.lr_step_size,
                    gamma=self.training_config.lr_gamma
                )
                return {
                    "optimizer": optimizer,
                    "lr_scheduler": {
                        "scheduler": scheduler,
                        "interval": "epoch",
                        "frequency": 1
                    }
                }
            elif self.training_config.scheduler.lower() == "cosine":
                scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
                    optimizer,
                    T_max=self.training_config.num_epochs
                )
                return {
                    "optimizer": optimizer,
                    "lr_scheduler": {
                        "scheduler": scheduler,
                        "interval": "epoch",
                        "frequency": 1
                    }
                }
            elif self.training_config.scheduler.lower() == "reduce_on_plateau":
                scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
                    optimizer,
                    mode='max',
                    factor=self.training_config.lr_gamma,
                    patience=3
                )
                return {
                    "optimizer": optimizer,
                    "lr_scheduler": {
                        "scheduler": scheduler,
                        "monitor": "valid/acc",
                        "interval": "epoch",
                        "frequency": 1
                    }
                }
            else:
                raise ValueError(f"Scheduler {self.training_config.scheduler} not supported.")

        return optimizer


## <font style="color:green">4. Train and Validation [5 Points]</font>


**Write the methods or classes to be used for training and validation.**

In [None]:
def training_validation(training_config: TrainingConfiguration,
                        data_config: DataConfiguration,
                        system_config: SystemConfiguration,
                        model, data_module):

    #random seed for reproducibility
    L.seed_everything(training_config.random_seed)

    if model is None:
        raise ValueError("Model must be provided for training. Please initialize the model before calling this function.")
    if data_module is None:
        raise ValueError("A data module is required to run the model.")

    checkpoint_callback = ModelCheckpoint(
        dirpath=system_config.output_dir,
        filename="epoch={epoch}-valid_acc={valid/acc:.2f}",
        save_top_k=3,
        monitor="valid/acc",
        mode="max",
        auto_insert_metric_name=False,
        save_weights_only=True)

    early_stopping_callback = EarlyStopping(
        monitor="valid/acc",
        patience=3,
        mode="max")

    # TensorBoard logger
    tensorboard_logger = TensorBoardLogger(
        save_dir=system_config.output_dir,
        name="kenyan_food_logs",
        version=None,  # Auto-incrementing version
        default_hp_metric=False
    )

    # Map precision string to PyTorch Lightning expected value
    precision_map = {
        "float32": 32,
        "float16": 16,
        "bfloat16": "bf16"
    }
    trainer_precision = precision_map.get(training_config.precision, 32)

    # Map device to accelerator type
    accelerator = "gpu" if system_config.device == "cuda" else "cpu"

    trainer = L.Trainer(
        max_epochs=training_config.num_epochs,
        accelerator=accelerator,
        devices="auto",
        precision=trainer_precision,
        callbacks=[checkpoint_callback, early_stopping_callback],
        logger=tensorboard_logger,
        default_root_dir=system_config.output_dir,
        log_every_n_steps=training_config.log_interval
    )

    trainer.fit(model, datamodule=data_module)
    trainer.validate(model, datamodule=data_module)

    return model, data_module, checkpoint_callback

## <font style="color:green">5. Model [5 Points]</font>

**Define your model in this section.**

**You are allowed to use any pre-trained model.**

## <font style="color:green">6. Utils [5 Points]</font>

**Define any methods or classes that have not been covered in the above sections.**
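
A typical utility for this kind of project is computing per-channel normalization statistics. Averaging the standard deviations of individual images only approximates the dataset-wide value; accumulating the sum and the sum of squares over all pixels gives the exact population statistic. The sketch below shows the idea in pure Python on toy pixel lists. `running_mean_std` is a hypothetical helper, and a real implementation would more likely operate on arrays or tensors.

```python
import math


def running_mean_std(images):
    """Per-channel mean and std over all pixels of all images.

    `images` is an iterable of images, each a list of (r, g, b) pixels
    with values in [0, 1].
    """
    count = 0
    sums = [0.0, 0.0, 0.0]
    square_sums = [0.0, 0.0, 0.0]
    for pixels in images:
        for pixel in pixels:
            count += 1
            for channel in range(3):
                sums[channel] += pixel[channel]
                square_sums[channel] += pixel[channel] ** 2

    mean = [s / count for s in sums]
    # Population variance: Var[x] = E[x^2] - E[x]^2, clamped at 0 to absorb rounding error.
    std = [math.sqrt(max(sq / count - m * m, 0.0))
           for sq, m in zip(square_sums, mean)]
    return mean, std


# Two constant-colour toy images: every per-image std is 0,
# yet the red and blue channels vary across the dataset.
toy_images = [
    [(0.0, 0.5, 1.0), (0.0, 0.5, 1.0)],
    [(1.0, 0.5, 0.0), (1.0, 0.5, 0.0)],
]
toy_mean, toy_std = running_mean_std(toy_images)
print(toy_mean, toy_std)
```

In this toy case the mean of per-image standard deviations would be zero for every channel, while the dataset-wide standard deviation of the red and blue channels is not.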

In [None]:
def calculate_dataset_mean_std(annotations_file, img_dir, img_size=(224, 224), sample_size=None):
    """
    Calculate mean and std for the dataset.

    Args:
        annotations_file: Path to CSV with image filenames
        img_dir: Directory containing images
        img_size: Tuple of (height, width) to resize images to
        sample_size: If provided, only use this many images for calculation (for speed)

    Returns:
        tuple: (mean, std) as lists of 3 values each for RGB channels
    """
    if g_local_run:
        # Append '_local' only if the path does not already carry it
        base, ext = os.path.splitext(annotations_file)
        if not base.endswith("_local"):
            annotations_file = f"{base}_local{ext}"

    img_labels = pd.read_csv(annotations_file)

    # Use subset for faster computation if specified
    if sample_size and sample_size < len(img_labels):
        img_labels = img_labels.sample(n=sample_size, random_state=42)

    print(f"Calculating mean and std from {len(img_labels)} images...")

    means = []
    stds = []

    for idx, row in img_labels.iterrows():
        img_path = os.path.join(img_dir, row.iloc[0])
        try:
            img = Image.open(img_path).convert("RGB")
            img = img.resize((img_size[1], img_size[0]))  # PIL expects (width, height)
            img_array = np.array(img) / 255.0  # Normalize to [0, 1]

            # Calculate mean and std per channel
            img_tensor = torch.tensor(img_array).permute(2, 0, 1)  # C, H, W
            means.append(img_tensor.mean(dim=(1, 2)))
            stds.append(img_tensor.std(dim=(1, 2)))
        except Exception as e:
            print(f"Error processing {img_path}: {e}")
            continue

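    if not means:
        raise RuntimeError("No images could be processed; check img_dir and the annotations file.")

    # Average the per-image statistics. Note: the mean of per-image stds only
    # approximates the dataset-wide std.
    mean = torch.stack(means).mean(dim=0).tolist()
    std = torch.stack(stds).mean(dim=0).tolist()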

    print(f"Calculated mean: {mean}")
    print(f"Calculated std: {std}")

    return mean, std


In [None]:

# mean, std = calculate_dataset_mean_std(
#     annotations_file=data_config.annotations_file,
#     img_dir=data_config.img_dir,
#     img_size=(data_config.input_size, data_config.input_size),
#     sample_size=1000  # Use subset for faster calculation, or None for all images
# )

# Then create DataModule with calculated values:
# data_module = KenyanFood13DataModule(data_config, mean=mean, std=std)

# Or use default ImageNet stats:
# data_module = KenyanFood13DataModule(data_config)

if g_measure_mean_std:
    mean, std = calculate_dataset_mean_std(
        annotations_file=data_config.annotations_file,
        img_dir=data_config.img_dir,
        img_size=(data_config.input_size, data_config.input_size),
        sample_size=1000)  # Use subset for faster calculation, or None for all images
    data_module = KenyanFood13DataModule(data_config=data_config, mean=mean, std=std)
else:
    data_module = KenyanFood13DataModule(data_config=data_config)

# Setup data module to get num_classes
data_module.setup()

model = KenyanFood13Classifier(train_config, data_module.num_classes)

## <font style="color:green">7. Experiment [5 Points]</font>

**Choose your optimizer and LR-scheduler and use the above methods and classes to train your model.**
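
As a worked example of what a step scheduler does: with an initial learning rate `lr0`, a decay factor `gamma` and a step size `s` (in epochs), the learning rate at epoch `e` is `lr0 * gamma ** (e // s)`. The sketch below reproduces that arithmetic in plain Python; `step_lr` is a hypothetical helper, and the values are illustrative rather than a recommendation.

```python
def step_lr(initial_lr, gamma, step_size, epoch):
    """Learning rate of a step schedule at a given (0-based) epoch."""
    return initial_lr * gamma ** (epoch // step_size)


# Illustrative values: lr0 = 0.001, decay by 10x every 5 epochs, 10 epochs total.
schedule = [step_lr(0.001, 0.1, 5, epoch) for epoch in range(10)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

With these values the first five epochs run at 0.001 and the last five at 0.0001, which is worth keeping in mind when choosing the number of epochs relative to the step size.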

In [None]:
model, data_module, model_ckpt = training_validation(
    training_config=train_config,
    data_config=data_config,
    system_config=system_config,
    model=model,
    data_module=data_module
)

## <font style="color:green">8. TensorBoard Log Link [5 Points]</font>

**Share your TensorBoard scalars logs link here. You can also share (not mandatory) your GitHub link if you have pushed this project to GitHub.**


Note: In light of the recent shutdown of tensorboard.dev, we have updated the submission requirements for your project. Instead of sharing a tensorboard.dev link, you are now required to upload your generated TensorBoard event files directly onto the lab. As an alternative, you may also include a screenshot of your TensorBoard output within your Jupyter notebook. This adjustment ensures that your data visualization and model training efforts are thoroughly documented and accessible for evaluation.

You are also welcome (and encouraged) to utilize alternative logging services like Weights & Biases (wandb) or Comet. In such instances, you can easily make your project logs publicly accessible and share the link with others.

## <font style="color:green">9. Kaggle Profile Link [50 Points]</font>

**Share your Kaggle profile link with us here to score points in the competition.**

**For full points, you need a minimum accuracy of `75%` on the test data. If accuracy is less than `70%`, you gain no points for this section.**


**Submit `submission.csv` (predictions for images in `test.csv`) in the `Submit Predictions` tab in Kaggle to get evaluated for this section.**
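
Writing the predictions file can be sketched with the standard library alone. The column names `id` and `class`, the sample IDs and the label strings below are assumptions for illustration only; the header should be taken from the competition's sample submission. `write_submission` is a hypothetical helper.

```python
import csv
import io


def write_submission(rows, file_obj, header=("id", "class")):
    """Write (image_id, predicted_label) pairs as CSV to an open text file."""
    writer = csv.writer(file_obj, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Placeholder predictions; real ones would come from the trained model.
toy_predictions = [("1001", "label_a"), ("1002", "label_b")]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_submission(toy_predictions, buffer)
print(buffer.getvalue())
```

In the notebook, the same function could be called with a real file, for example one opened via `open("submission.csv", "w", newline="")`.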