# <font style="color:blue">Project 2: Kaggle Competition - Classification</font>

#### Maximum Points: 100

<div>
    <table>
        <tr><td><h3>Sr. no.</h3></td> <td><h3>Section</h3></td> <td><h3>Points</h3></td> </tr>
        <tr><td><h3>1</h3></td> <td><h3>Data Loader</h3></td> <td><h3>10</h3></td> </tr>
        <tr><td><h3>2</h3></td> <td><h3>Configuration</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>3</h3></td> <td><h3>Evaluation Metric</h3></td> <td><h3>10</h3></td> </tr>
        <tr><td><h3>4</h3></td> <td><h3>Train and Validation</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>5</h3></td> <td><h3>Model</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>6</h3></td> <td><h3>Utils</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>7</h3></td> <td><h3>Experiment</h3></td><td><h3>5</h3></td> </tr>
        <tr><td><h3>8</h3></td> <td><h3>TensorBoard Dev Scalars Log Link</h3></td> <td><h3>5</h3></td> </tr>
        <tr><td><h3>9</h3></td> <td><h3>Kaggle Profile Link</h3></td> <td><h3>50</h3></td> </tr>
    </table>
</div>


# <font style="color:green">Project Approach</font>

The <a href="https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/"><b>Transfer learning and the art of using Pre-trained Models in Deep Learning</b></a> blog post outlines four ways to fine-tune a model that has been trained on a different data set. The following is a short excerpt from that post, which I have lightly edited.

The following diagram helps one decide how to use a pretrained model on a new data set.

<img src="https://cdn.analyticsvidhya.com/wp-content/uploads/2017/05/31112715/finetune1.jpg" />

<b>Scenario 1: Size of the data set is small and the data similarity is high.</b> In this case, since the data similarity is very high, we do not need to retrain the model. All we need to do is to customize and modify the output layers according to our problem statement. We use the pretrained model as a feature extractor and retrain the classification block/layer.

<b>Scenario 2: Size of the data set is small and the data similarity is low.</b> In this case we can freeze the initial (let’s say k) layers of the pretrained model and train just the remaining (n-k) layers again. The top layers would then be customized to the new data set. Since the new data set has low similarity, it is important to retrain and customize the higher layers according to the new data set. The small size of the data set is compensated for by the fact that the initial layers, which were previously trained on a large data set, keep their pretrained weights frozen.

<b>Scenario 3: Size of the data set is large and the data similarity is low.</b> In this case, since we have a large data set, our neural network training would be effective. However, since the data we have is very different from the data used to train our pretrained models, the predictions made using pretrained models would not be effective. Hence, it's best to train the neural network from scratch according to your data.

<b>Scenario 4: Size of the data set is large and the data similarity is high.</b> This is the ideal situation. In this case the pretrained model should be most effective. The best way to use the model is to retain the architecture of the model and the initial weights of the model. Then we can retrain this model using the weights as initialized in the pre-trained model.
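
Scenarios 1 and 2 differ mainly in how many layers stay frozen. The following is a minimal sketch of that idea, assuming a toy <code>nn.Sequential</code> network as a stand-in for a pretrained model; <code>freeze_first_k</code> is a hypothetical helper and not part of PyTorch or TorchVision.

```python
import torch.nn as nn


def freeze_first_k(model: nn.Sequential, k: int) -> nn.Sequential:
    """Freeze the parameters of the first k child modules; keep the rest trainable."""
    for index, child in enumerate(model.children()):
        trainable = index >= k
        for param in child.parameters():
            param.requires_grad = trainable
    return model


# Toy stand-in for a pretrained network: two "feature" layers plus a classifier head.
toy_model = nn.Sequential(
    nn.Linear(8, 8),
    nn.ReLU(),
    nn.Linear(8, 8),
    nn.ReLU(),
    nn.Linear(8, 13),  # 13 outputs, one per KenyanFood13 class
)

# Scenario 1: freeze everything except the final classification layer.
freeze_first_k(toy_model, k=4)
trainable_names = [name for name, p in toy_model.named_parameters() if p.requires_grad]
```

With a smaller <code>k</code> the same helper covers Scenario 2, where only the initial layers remain frozen. In either case, only the parameters whose <code>requires_grad</code> flag is still set receive gradient updates.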

<hr>

Since I did not know how to program in Python before this class, I am using this project to improve my Python proficiency. Consequently, I will not only explore using pretrained models on a new data set, but I will also spend significant time developing class hierarchies that will allow me to easily conduct experiments on the following pretrained TorchVision models using any of the scenarios described above.
<ul>
    <li>ResNet-18</li>
    <li>ResNet-34</li>
    <li>ResNet-50</li>
    <li>ResNet-101</li>
    <li>ResNet-152</li>
    <li>VGG-11 with batch normalization</li>
    <li>VGG-13 with batch normalization</li>
    <li>VGG-16 with batch normalization</li>
    <li>VGG-19 with batch normalization</li>
    <li>DenseNet-121</li>
    <li>DenseNet-169</li>
    <li>DenseNet-201</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-50-32x4d</li>
    <li>ResNeXt-101-32x8d</li>
    <li>Wide ResNet-50-2</li>
    <li>Wide ResNet-101-2</li>
</ul>

Because I want to improve my Python proficiency, I decided to use and modify the trainer module rather than use PyTorch Lightning. Since I plan to conduct a lot of experiments, I will implement mechanisms to stop training when either the loss or the accuracy does not significantly improve over a certain number of epochs. I will use TensorBoard to visualize data.

Experiments will be identified with the prefix "Exp" followed by two digits and a letter (regular expression = Exp[0-9][0-9][A-Z]). The digits will designate the experiment group and set, while the letter will designate the experiment itself. Hence, all experiments prefixed with Exp0?? belong to Group 0, while all experiments prefixed with Exp01? belong to Group 0, Set 1.
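
The naming scheme can be checked mechanically. The following is a minimal sketch that splits an identifier into its group, set, and letter; <code>ExperimentId</code> and <code>parse_experiment_id</code> are hypothetical names introduced here for illustration.

```python
import re
from typing import NamedTuple


class ExperimentId(NamedTuple):
    group: int     # first digit
    exp_set: int   # second digit
    letter: str    # experiment within the set


# Same pattern as in the text, with capture groups added around each part.
_EXPERIMENT_RE = re.compile(r"Exp([0-9])([0-9])([A-Z])")


def parse_experiment_id(name: str) -> ExperimentId:
    """Split an identifier such as 'Exp01A' into group, set, and letter."""
    match = _EXPERIMENT_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"Not a valid experiment identifier: {name!r}")
    group, exp_set, letter = match.groups()
    return ExperimentId(int(group), int(exp_set), letter)


example = parse_experiment_id("Exp01A")
```

Here <code>example</code> describes Group 0, Set 1, experiment A. Using <code>fullmatch</code> rejects names with extra leading or trailing characters.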

In [None]:
# This cell initializes the notebook for execution on different hosts.

import os
import sys

def get_host() -> str:
    """
    The get_ipython() function returns the following from different hosts.

    colab:  <google.colab._shell.Shell object at 0x7f23c5e386d8>
    brule:  <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7f1990f22a50>
    kaggle: <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7f9d093aebd0>
    """
    
    if 'google.colab' in str(get_ipython()):
        return "colab"
    else:
        # ToDo: Determine whether running on kaggle.
        return "brule"

def init_host(host:str):
    if host == "brule":
        # set data and project directories
        if os.path.isdir("./trainer"):
            data_dir = "./data"
            proj_dir = "./"
        elif os.path.isdir("./project2/trainer"):
            data_dir = "./project2/data"
            proj_dir = "./project2"
        else:
            raise SystemExit("Cannot locate trainer module.")

    elif host == "colab":
        # mount Google Drive
        from google.colab import drive
        drive.mount("/content/gdrive")

        # set data and project directories
        data_dir = "/content/data"
        proj_dir = "/content/gdrive/MyDrive/Colab Notebooks/project2"

        # fetching data from Google Drive is very, very slow ...
        # hence, we will unzip the dataset to /content/data if it is not there
        dataset = os.path.join(proj_dir, "data", "pytorch-opencv-course-classification.zip")
        if not os.path.isdir(data_dir):
            os.makedirs(data_dir)
            import zipfile
            with zipfile.ZipFile(dataset, 'r') as zip_ref:
                zip_ref.extractall(data_dir)

    else:
        raise SystemExit("Unknown host! Cannot continue.")

    sys.path.append(proj_dir)
    return data_dir, proj_dir

data_dir, proj_dir = init_host(get_host())

print(f"data_dir: {data_dir}")
!ls -lh {data_dir.replace(" ", "\\ ")}

print(f"proj_dir: {proj_dir}")
!ls -lh {proj_dir.replace(" ", "\\ ")}
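
The ToDo in <code>get_host</code> could be resolved by inspecting the environment. The sketch below assumes that Kaggle kernels export a <code>KAGGLE_KERNEL_RUN_TYPE</code> environment variable, which should be verified on the platform before relying on it; <code>detect_host</code> is a hypothetical replacement that takes the shell description and the environment as arguments so it can be exercised outside a notebook.

```python
import os
from typing import Mapping, Optional


def detect_host(shell_repr: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return 'colab', 'kaggle', or 'brule' from the shell description and environment."""
    environ = os.environ if environ is None else environ
    if "google.colab" in shell_repr:
        return "colab"
    # Assumption: Kaggle kernels set this variable; verify before relying on it.
    if "KAGGLE_KERNEL_RUN_TYPE" in environ:
        return "kaggle"
    return "brule"


kaggle_host = detect_host(
    "<ipykernel.zmqshell.ZMQInteractiveShell object>",
    {"KAGGLE_KERNEL_RUN_TYPE": "Interactive"},
)
```

In the notebook, the first argument would be <code>str(get_ipython())</code>, and <code>init_host</code> would need a matching "kaggle" branch that sets the data and project directories.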

In [None]:
# import organizer @ https://pypi.org/project/importanize/

from abc import ABC, abstractmethod, abstractproperty
from collections import namedtuple
from dataclasses import dataclass, replace
from enum import Enum, auto
from operator import itemgetter
from typing import Callable, Iterable, List, Tuple

import numpy as np
import pandas as pd
import PIL
from PIL import Image

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision

from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, Dataset
from torchvision import models, transforms

from trainer import Trainer, configuration, hooks
from trainer.configuration import SystemConfig, DatasetConfig, DataLoaderConfig, OptimizerConfig, SchedulerConfig, TrainerConfig
from trainer.metrics import AccuracyEstimator
from trainer.tensorboard_visualizer import TensorBoardVisualizer
from trainer.utils import patch_configs, setup_system

## <font style="color:green">1. Data Loader [10 Points]</font>

In this section, you have to write a class or methods that will be used to get training and validation data
loader.

You will have to write a custom dataset class to load data.

**Note that there is no separate validation data, so you will have to create your validation set by dividing the training data into training and validation sets. In practice, an `80:20` ratio for training and validation, respectively, is common.** 

For example,

```
class KenyanFood13Dataset(Dataset):
    """
    
    """
    
    def __init__(self, *args):
    ....
    ...
    
    def __getitem__(self, idx):
    ...
    ...
    
    
```

```
def get_data(args1, *args):
    ....
    ....
    return train_loader, test_loader
```
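
To illustrate what a stratified 80:20 split does, here is a minimal pure-Python sketch. <code>stratified_split</code> is a hypothetical helper written for this illustration; the actual split in this notebook relies on scikit-learn's <code>train_test_split</code> with the <code>stratify</code> argument.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int], valid_size: float = 0.2, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Split sample indices so each class keeps roughly the same train/valid ratio."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # local generator keeps the split reproducible
    train_idx: List[int] = []
    valid_idx: List[int] = []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_valid = round(len(indices) * valid_size)
        valid_idx.extend(indices[:n_valid])
        train_idx.extend(indices[n_valid:])
    return sorted(train_idx), sorted(valid_idx)


# Toy labels: ten samples of class 0 and five samples of class 1.
labels = [0] * 10 + [1] * 5
train_idx, valid_idx = stratified_split(labels)
```

Each class contributes 20% of its samples to the validation set, so a rare class is not accidentally left out of validation.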

In [None]:
class KenyanFood13Data:
    """
    This class parses the KenyanFood13's test.csv and train.csv files and divides the training data
    into training and validation sets.
    """
    
    def __init__(self, data_root, valid_size = 0.2, random_seed = 42):
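        """
        Parses the CSV files under data_root and holds out the fraction valid_size of the
        training data for validation; random_seed makes the stratified split reproducible.
        """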
        
        # the root path of the images
        self.__image_root = os.path.join(data_root, 'images', 'images')
        
        # parse the test CSV file to obtain filenames (labels are not given)
        test_data_frame = self.__parse_data_file(data_root, 'test.csv')
        self.__test_fnames = test_data_frame.values[:,0]
        
        # parse the train CSV file to obtain filenames and labels       
        train_data_frame = self.__parse_data_file(data_root, 'train.csv')
        fnames = train_data_frame.values[:,0]
        labels = train_data_frame.values[:,1]
        
        # get the classes and class counts
        self.__classes, self.__class_counts = np.unique(labels, return_counts=True)
        num_classes = len(self.__classes)
        
        # create a dictionary of text labels to integer labels
        label_dict = {}
        for key, value in zip(self.__classes, np.arange(num_classes)):
            label_dict[key] = value
                
        # convert the text labels to their numeric equivalents
        labels = [label_dict[label] for label in labels]

        # retain the complete unsplit training dataset for visualization purposes
        self.__unsplit_fnames = fnames
        self.__unsplit_labels = labels

        # create a dictionary library that stores list of images of the same label
        self.__library = {key : [fname for fname, label in zip(fnames, labels) if label == key] 
                          for key in range(num_classes)}

        # split the training data into training and validation sets
        self.__train_fnames, self.__valid_fnames, self.__train_labels, self.__valid_labels = train_test_split(
            fnames,                      # image file names w/o path or extension
            labels,                      # image labels
            test_size = valid_size,      # test size
            random_state = random_seed,  # random seed for reproducibility
            shuffle = True,              # shuffle data before splitting into training and validation sets
            stratify = labels            # maintain equal class representation in training and validation sets
        )

        # create subsets of the training and validation sets for pipeline check
        subset_size = 256.0 / len(self.__train_fnames)

        _, self.__train_fnames_subset, _, self.__train_labels_subset = train_test_split(
            self.__train_fnames,
            self.__train_labels,
            test_size = subset_size,
            random_state = random_seed,
            shuffle = True,
            stratify = self.__train_labels
        )

        _, self.__valid_fnames_subset, _, self.__valid_labels_subset = train_test_split(
            self.__valid_fnames,
            self.__valid_labels,
            test_size = subset_size,
            random_state = random_seed,
            shuffle = True,
            stratify = self.__valid_labels
        )

        
    def __parse_data_file(self, data_root, file):
        path = os.path.join(data_root, file)
        return pd.read_csv(path, delimiter=',', dtype={'id': 'str'}, engine='python')
    
    @property
    def image_root(self):
        return self.__image_root

    @property
    def classes(self):
        return self.__classes
    
    @property
    def class_counts(self):
        return self.__class_counts
    
    @property
    def test_fnames(self):
        return self.__test_fnames
    
    @property
    def train_fnames(self):
        return self.__train_fnames
    
    @property
    def train_labels(self):
        return self.__train_labels
    
    @property
    def valid_fnames(self):
        return self.__valid_fnames
    
    @property
    def valid_labels(self):
        return self.__valid_labels

    @property
    def train_fnames_subset(self):
        return self.__train_fnames_subset
    
    @property
    def train_labels_subset(self):
        return self.__train_labels_subset
    
    @property
    def valid_fnames_subset(self):
        return self.__valid_fnames_subset
    
    @property
    def valid_labels_subset(self):
        return self.__valid_labels_subset

    @property
    def unsplit_fnames(self):
        return self.__unsplit_fnames
    
    @property
    def unsplit_labels(self):
        return self.__unsplit_labels

    @property
    def library(self):
        return self.__library

In [None]:
class KenyanFood13Dataset(Dataset):
    """
    This custom PyTorch dataset contains images and classification labels from
    Kaggle's KenyanFood13 dataset.
    """
    
    def __init__(self, image_root, fnames, labels=None, transform=None):
        super().__init__()
        self.__fnames = fnames
        self.__labels = labels
        self.__transform = transform
        self.__image_root = image_root

    def __len__(self):
        """
        Returns the dataset's length, i.e., the number of image/label pairs.
        """

        return len(self.__fnames)
    
    def __getitem__(self, idx):
        """
        Returns the (optionally resized & preprocessed) image that corresponds to the specified index.
        """

        # conversion needed to remove alpha channel, if present
        path = os.path.join(self.__image_root, self.__fnames[idx] + ".jpg")
        image = Image.open(path).convert("RGB")
        
        if self.__transform is not None:
            image = self.__transform(image)

        if self.__labels is not None:
            extra = self.__labels[idx]  # return target with image
        else:
            extra = self.__fnames[idx]  # return filename with image

        return image, extra

In [None]:
def get_datasets(
    data: KenyanFood13Data,
    test_transforms,
    train_transforms,
    subset = False
):
    """
    Creates the training, validation, and test datasets.
    """

    if not subset:

        train_dataset = KenyanFood13Dataset(
            image_root = data.image_root, 
            fnames = data.train_fnames, 
            labels = data.train_labels, 
            transform = train_transforms)

        valid_dataset = KenyanFood13Dataset(
            image_root = data.image_root, 
            fnames = data.valid_fnames, 
            labels = data.valid_labels, 
            transform = test_transforms)


    else:
        
        train_dataset = KenyanFood13Dataset(
            image_root = data.image_root, 
            fnames = data.train_fnames_subset, 
            labels = data.train_labels_subset, 
            transform = train_transforms)

        valid_dataset = KenyanFood13Dataset(
            image_root = data.image_root, 
            fnames = data.valid_fnames_subset, 
            labels = data.valid_labels_subset, 
            transform = test_transforms)

    test_dataset = KenyanFood13Dataset(
        image_root = data.image_root, 
        fnames = data.test_fnames, 
        transform = test_transforms)

    return train_dataset, valid_dataset, test_dataset

In [None]:
def get_data_loaders(
    train_dataset: Dataset,
    valid_dataset: Dataset,
    test_dataset: Dataset,
    batch_size = 16, 
    num_workers = 2
):
    """
    This function creates and returns the training, validation, and test data loaders.
    """
    
    train_data_loader = torch.utils.data.DataLoader(
        train_dataset, 
        batch_size=batch_size, 
        num_workers=num_workers, 
        shuffle=True)
    
    valid_data_loader = torch.utils.data.DataLoader(
        valid_dataset, 
        batch_size=batch_size, 
        num_workers=num_workers, 
        shuffle=False)

    test_data_loader = torch.utils.data.DataLoader(
        test_dataset, 
        batch_size=batch_size, 
        num_workers=num_workers, 
        shuffle=False)
    

    return train_data_loader, valid_data_loader, test_data_loader

In [None]:
def get_mean_std(data_loader=None):
    """
    Computes the mean and standard deviation. Since this method takes a long
    time to run and the data for this workbook is fixed, this method was run
    once and its result was copied to the normalization transform.
    """
    
    if data_loader is None:
        """
        Returns the mean and standard deviation used by the pretrained
        classification models.
        """

        mean = [0.485, 0.456, 0.406] 
        std = [0.229, 0.224, 0.225]
    
    else:
        """
        Computes the mean and standard deviation of the images returned
        by the specified data loader. 
        
        For comparison, the mean and standard deviation of the KenyanFood13
        images using the train_dataset and preprocess transforms are as follows.
        
            mean = [0.5778, 0.4631, 0.3471]
            std = [0.2380, 0.2461, 0.2464]
        """
        
        std = 0.
        mean = 0.
        for images, _ in data_loader:
            batch_samples = images.size(0)
            images = images.view(batch_samples, images.size(1), -1)
            std += images.std(2).sum(0)
            mean += images.mean(2).sum(0)
        std /= len(data_loader.dataset)
        mean /= len(data_loader.dataset)

    return mean, std
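
Note that <code>get_mean_std</code> averages the per-image standard deviations, which is generally not the same as the standard deviation taken over all pixels of the data set. The following is a small sketch of the difference, using two made-up single-channel "images" flattened to pixel lists.

```python
from statistics import fmean, pstdev

# Two tiny single-channel "images" (made-up values): one all black, one all white.
images = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]

# Approach used in get_mean_std: average the per-image standard deviations.
mean_of_stds = fmean(pstdev(pixels) for pixels in images)

# Pooled alternative: a single standard deviation over every pixel of every image.
all_pixels = [value for pixels in images for value in pixels]
pooled_std = pstdev(all_pixels)
```

Each image is constant, so the averaged per-image value is zero, while the pooled value reflects the spread between the two images. For natural images the gap is smaller, but the two quantities should not be treated as interchangeable.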

In [None]:
class ImageTransforms:
    """
    This utility class has methods to create transforms used to train and evaluate a model as
    well as visualize images.
    """
    
    def __init__(self, resize=256, crop_size=224, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]):
        self.__resize = resize
        self.__crop_size = crop_size
        self.__mean = mean
        self.__std = std

    def preprocess(self):
        """
        These transformations convert PIL images to uniformly sized tensors whose dimensions
        are crop_size x crop_size pixels.
        """
        return transforms.Compose([
            transforms.Resize(self.__resize),
            transforms.CenterCrop(self.__crop_size),
            transforms.ToTensor()
        ])
    
    def common(self):
        """
        These transformations convert PIL images to uniformly sized tensors whose dimensions
        are crop_size x crop_size pixels and values are normalized by the mean and standard
        deviation.
        """
        return transforms.Compose([
            self.preprocess(),
            transforms.Normalize(self.__mean, self.__std)
        ])
    
    def augment(self):
        """
        These transformations convert PIL images to uniformly sized tensors whose dimensions
        are crop_size x crop_size pixels and values are normalized by the mean and standard
        deviation with the following random data augmentations: color jitter, horizontal flip,
        vertical flip, rotation, translation, scaling, and erasing.
        """
        # rotation will occur before resizing and cropping to avoid "corner voids"
        return transforms.Compose([
            transforms.ColorJitter(brightness=0.25, contrast=0.25, saturation=0.25, hue=0.25),
            transforms.RandomVerticalFlip(),
            transforms.RandomHorizontalFlip(),
            transforms.RandomAffine(degrees=45, translate=(0.2, 0.2), scale=(0.8, 1.2), resample=PIL.Image.BILINEAR),
            self.common(),
            transforms.RandomErasing()
        ])

## <font style="color:green">2. Configuration [5 Points]</font>

Define your configuration in this section.

For example,

```
@dataclass
class TrainingConfiguration:
    '''
    Describes configuration of the training process
    '''
    batch_size: int = 10 
    epochs_count: int = 50  
    init_learning_rate: float = 0.1  # initial learning rate for lr scheduler
    log_interval: int = 5  
    test_interval: int = 1  
    data_root: str = "/kaggle/input/pytorch-opencv-course-classification/" 
    num_workers: int = 2  
    device: str = 'cuda'  
    
```

## <font style="color:blue">Assignment Response</font>

Since I am using the <b>trainer</b> module, I made minor modifications to the <u>configuration.py</u> file. In addition, I created a <i>MasterConfig</i> data class that encapsulates the individual configuration data classes. I then created helper functions to instantiate the <i>MasterConfig</i> class with experiment-specific overrides.

The following is the output of the <code>create_master_config</code> function without any parameter overrides.

<u>Note</u>: For brevity, the nested compose blocks are shown as a flat iterable and the transform parameters are omitted.

<pre>
MasterConfig(
    system=SystemConfig(
        proj_dir='./project2', 
        seed=42, 
        cudnn_deterministic=True, 
        cudnn_benchmark_enabled=False
    ), 
    dataset=DatasetConfig(
        data_dir='./project2/data', 
        valid_size=0.2, 
        train_transforms=Iterable[Callable] = (
            ColorJitter( ... ),
            RandomVerticalFlip( ... ),
            RandomHorizontalFlip( ... ),
            RandomAffine( ... ),
            Resize( ... ),
            CenterCrop( ... ),
            ToTensor(),
            Normalize( ... ),
            RandomErasing()
        ), 
        test_transforms=Iterable[Callable] = (
            Resize( ... ),
            CenterCrop( ... ),
            ToTensor(),
            Normalize( ... ),
        ), 
        visual_transforms=Iterable[Callable] = (
            Resize( ... ),
            CenterCrop( ... ),
            ToTensor(),
        )
    ), 
    data_loader=DataLoaderConfig(
        batch_size=32, 
        num_workers=4
    ), 
    optimizer=OptimizerConfig(
        learning_rate=0.001, 
        momentum=0.9, 
        weight_decay=0.0001, 
        betas=(0.9, 0.999)
    ), 
    scheduler=SchedulerConfig(
        gamma=0.1, 
        step_size=10, 
        milestones=(20, 30, 40), 
        patience=10, 
        threshold=0.0001
    ), 
    trainer=TrainerConfig(
        device='cuda', 
        training_epochs=50, 
        progress_bar=True, 
        model_dir='models', 
        model_saving_period=0, 
        visualizer_dir='runs', 
        stop_loss_epochs=0, 
        stop_acc_epochs=0, 
        stop_acc_ema_alpha=0.3, 
        stop_acc_threshold=2.0
    )
)
</pre>

In [None]:
def create_system_config():
    return SystemConfig(
        proj_dir = proj_dir
    )

def create_dataset_config(
    resize: int = 256, 
    crop_size: int = 224
):
    mean, std = get_mean_std()
    transforms = ImageTransforms(
        resize = resize, 
        crop_size = crop_size, 
        mean = mean, 
        std = std
    )
    return DatasetConfig(
        data_dir = data_dir,
        test_transforms = transforms.common(),
        train_transforms = transforms.augment(),
        visual_transforms = transforms.preprocess()
    )

def create_data_loader_config(
    batch_size: int = None, 
    num_workers: int = None
):
    config = DataLoaderConfig()
    if batch_size is None:
        batch_size = config.batch_size
    if num_workers is None:
        num_workers = config.num_workers
    return DataLoaderConfig(
        batch_size = batch_size,
        num_workers = num_workers
    )

def create_optimizer_config(
    learning_rate: float = None, 
    momentum: float = None, 
    weight_decay: float = None,
    betas: Tuple[float, float] = None
):
    config = OptimizerConfig()
    if learning_rate is None:
        learning_rate = config.learning_rate
    if momentum is None:
        momentum = config.momentum
    if weight_decay is None:
        weight_decay = config.weight_decay
    if betas is None:
        betas = config.betas
    return OptimizerConfig(
        learning_rate = learning_rate,
        momentum = momentum,
        weight_decay = weight_decay,
        betas = betas
    )

def create_scheduler_config(
    gamma: float = None,
    step_size: int = None,
    milestones: Iterable = None,
    patience: int = None,
    threshold: float = None
):
    config = SchedulerConfig()
    if gamma is None:
        gamma = config.gamma
    if step_size is None:
        step_size = config.step_size
    if milestones is None:
        milestones = config.milestones
    if patience is None:
        patience = config.patience
    if threshold is None:
        threshold = config.threshold
    return SchedulerConfig(
        gamma = gamma,
        step_size = step_size,
        milestones = milestones,
        patience = patience,
        threshold = threshold
    )

def create_trainer_config(
    training_epochs: int = None,
    stop_loss_epochs: int = None,
    stop_acc_epochs: int = None, 
    stop_acc_ema_alpha: float = None,
    stop_acc_threshold: float = None
):
    config = TrainerConfig()
    if training_epochs is None:
        training_epochs = config.training_epochs
    if stop_loss_epochs is None:
        stop_loss_epochs = config.stop_loss_epochs
    if stop_acc_epochs is None:
        stop_acc_epochs = config.stop_acc_epochs
    if stop_acc_ema_alpha is None:
        stop_acc_ema_alpha = config.stop_acc_ema_alpha
    if stop_acc_threshold is None:
        stop_acc_threshold = config.stop_acc_threshold
    return TrainerConfig(
        training_epochs = training_epochs,
        stop_loss_epochs = stop_loss_epochs,
        stop_acc_epochs = stop_acc_epochs,
        stop_acc_ema_alpha = stop_acc_ema_alpha,
        stop_acc_threshold = stop_acc_threshold
    )

In [None]:
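from dataclasses import field

@dataclass
class MasterConfig:
    # default_factory builds fresh defaults for every instance and avoids the ValueError
    # that Python 3.11+ raises for unhashable (mutable) dataclass default values
    system: SystemConfig = field(default_factory=create_system_config)
    dataset: DatasetConfig = field(default_factory=create_dataset_config)
    data_loader: DataLoaderConfig = field(default_factory=create_data_loader_config)
    optimizer: OptimizerConfig = field(default_factory=create_optimizer_config)
    scheduler: SchedulerConfig = field(default_factory=create_scheduler_config)
    trainer: TrainerConfig = field(default_factory=create_trainer_config)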

In [None]:
def create_master_config(
    transform_resize: int = 256,
    transform_crop_size: int = 224,
    data_loader_batch_size: int = None,
    data_loader_num_workers: int = None,
    optimzer_learning_rate: float = None,
    optimzer_momentum: float = None,
    optimzer_weight_decay: float = None,
    optimzer_betas: Tuple[float, float] = None,
    lr_scheduler_gamma: float = None,
    lr_scheduler_step_size: int = None,
    lr_scheduler_milestones: Iterable = None,
    lr_scheduler_patience: int = None,
    lr_scheduler_threshold: float = None,
    trainer_training_epochs: int = None,
    trainer_stop_loss_epochs: int = None,
    trainer_stop_acc_epochs: int = None,
    trainer_stop_acc_ema_alpha: float = None,
    trainer_stop_acc_threshold: float = None       
) -> MasterConfig:
    return MasterConfig(
        system = create_system_config(),
        dataset = create_dataset_config(
            transform_resize,
            transform_crop_size
        ),
        data_loader = create_data_loader_config(
            data_loader_batch_size,
            data_loader_num_workers
        ),
        optimizer = create_optimizer_config(
            optimzer_learning_rate,
            optimzer_momentum,
            optimzer_weight_decay,
            optimzer_betas
        ),
        scheduler = create_scheduler_config(
            lr_scheduler_gamma,
            lr_scheduler_step_size,
            lr_scheduler_milestones,
            lr_scheduler_patience,
            lr_scheduler_threshold   
        ),
        trainer = create_trainer_config(
            trainer_training_epochs,
            trainer_stop_loss_epochs,
            trainer_stop_acc_epochs,
            trainer_stop_acc_ema_alpha,
            trainer_stop_acc_threshold       
        )
    )    
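
The helper functions above repeat the same "use the default when the argument is None" pattern for every field. As a possible simplification, the sketch below applies only the supplied overrides with <code>dataclasses.replace</code>. <code>DemoLoaderConfig</code> and <code>with_overrides</code> are hypothetical names; the demo class only mirrors the default values shown for <i>DataLoaderConfig</i> and is not the trainer module's class.

```python
from dataclasses import dataclass, replace


@dataclass
class DemoLoaderConfig:
    # Hypothetical stand-in mirroring the DataLoaderConfig defaults printed earlier.
    batch_size: int = 32
    num_workers: int = 4


def with_overrides(config, **overrides):
    """Return a copy of config in which every non-None override replaces the default."""
    supplied = {name: value for name, value in overrides.items() if value is not None}
    return replace(config, **supplied)


config = with_overrides(DemoLoaderConfig(), batch_size=64, num_workers=None)
```

Only <code>batch_size</code> changes here; <code>num_workers</code> keeps its default because its override is None. The trade-off is that None can no longer be passed as a real value for a field.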

## <font style="color:green">3. Evaluation Metric [10 Points]</font>

Define methods or classes that will be used in model evaluation, for example, accuracy, f1-score, etc.

### <font style="color:blue">Loss Function</font>

The classes contain unequal numbers of images; thus, a weighted cross-entropy loss function should be used.

The per-class image counts were obtained via the following code.<code>

    data = KenyanFood13Data(...)
    images_per_class = np.column_stack((data.classes, data.class_counts))
    print(images_per_class)

    [['bhaji' 632]
     ['chapati' 862]
     ['githeri' 479]
     ['kachumbari' 494]
     ['kukuchoma' 173]
     ['mandazi' 620]
     ['masalachips' 438]
     ['matoke' 483]
     ['mukimo' 212]
     ['nyamachoma' 784]
     ['pilau' 329]
     ['sukumawiki' 402]
     ['ugali' 628]]
</code>

The rescaling weights given to each class were obtained via the following code.<code>

    rescaling_weights = data.class_counts / np.sum(data.class_counts)
    print(rescaling_weights)
    
    [0.09669523 0.13188494 0.07328641 0.0755814  0.02646879 0.09485924
     0.06701346 0.07389841 0.03243574 0.11995104 0.0503366  0.06150551
     0.09608323]
</code>

In [None]:
weighted_cross_entropy_loss = nn.CrossEntropyLoss(
    weight=torch.tensor([
        0.09669523, 0.13188494, 0.07328641, 0.0755814,  0.02646879, 
        0.09485924, 0.06701346, 0.07389841, 0.03243574, 0.11995104, 
        0.0503366, 0.06150551, 0.09608323
    ])
)
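
Note that the weights above are proportional to class frequency, so the most common class (chapati) receives the largest weight. If the goal is to compensate for the imbalance, a common alternative is to weight each class inversely to its count. The following sketch uses the counts listed above with the "balanced" heuristic <code>total / (num_classes * count)</code>; this is an assumption offered for comparison, not the weighting used in these experiments.

```python
# Per-class image counts as listed above.
class_counts = {
    "bhaji": 632, "chapati": 862, "githeri": 479, "kachumbari": 494,
    "kukuchoma": 173, "mandazi": 620, "masalachips": 438, "matoke": 483,
    "mukimo": 212, "nyamachoma": 784, "pilau": 329, "sukumawiki": 402,
    "ugali": 628,
}

total = sum(class_counts.values())
num_classes = len(class_counts)

# "Balanced" heuristic: rare classes get weights above 1, common classes below 1.
inverse_weights = {
    name: total / (num_classes * count) for name, count in class_counts.items()
}

rarest = max(inverse_weights, key=inverse_weights.get)
most_common = min(inverse_weights, key=inverse_weights.get)
```

To use these values with <code>nn.CrossEntropyLoss</code>, they would be converted to a float tensor in the same class order as the integer labels.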

### <font style="color:blue">Metric Function</font>

We are going to use the <b>trainer</b> module's <i>AccuracyEstimator</i> class from the <u>metrics.py</u> file.
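
For reference, top-1 accuracy can be sketched in a few lines of plain Python. <code>top1_accuracy</code> is a hypothetical function written for illustration and makes no claim about how <i>AccuracyEstimator</i> is implemented internally.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Fraction of samples whose highest-scoring class equals the target class."""
    if len(scores) != len(targets) or not targets:
        raise ValueError("scores and targets must be non-empty and of equal length")
    correct = sum(
        1
        for row, target in zip(scores, targets)
        if max(range(len(row)), key=row.__getitem__) == target
    )
    return correct / len(targets)


# Three samples with three class scores each (made-up values).
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
targets = [1, 0, 0]
accuracy = top1_accuracy(scores, targets)
```

The first two samples are classified correctly and the third is not, so two of the three predictions count as correct.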

## <font style="color:green">4. Train and Validation [5 Points]</font>

Write the methods or classes that will be used for training and validation.

## <font style="color:blue">Assignment Response</font>

Since I am using the <b>trainer</b> module, I made the following modifications to the <u>trainer.py</u> file.
<ul>
    <li>Added the ability to save the model only when the test loss reaches a new minimum.</li>
    <li>Added the ability to terminate training after a specified number of epochs where the test loss is not further reduced.</li>
    <li>Added the ability to terminate training after a specified number of epochs where the exponential moving average of the test accuracy does not significantly increase.</li>
</ul>
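
The two stopping rules can be sketched as small classes: one that counts epochs without a new minimum of the test loss, and one that tracks an exponential moving average of the accuracy, in line with the <code>stop_acc_ema_alpha</code> and <code>stop_acc_threshold</code> configuration fields. The class names are hypothetical, and treating the threshold as a minimum gain in percentage points is an assumption; the modified <u>trainer.py</u> may differ in detail.

```python
from typing import Optional


class LossPatienceStopper:
    """Stops once the test loss has not reached a new minimum for `patience` epochs."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best_loss = float("inf")
        self.stale_epochs = 0

    def step(self, loss: float) -> bool:
        if loss < self.best_loss:
            self.best_loss = loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


class EmaAccuracyStopper:
    """Stops once the smoothed accuracy has not gained `threshold` for `patience` epochs."""

    def __init__(self, patience: int, alpha: float = 0.3, threshold: float = 2.0) -> None:
        self.patience = patience
        self.alpha = alpha
        self.threshold = threshold  # assumed to be in percentage points
        self.ema: Optional[float] = None
        self.best_ema: Optional[float] = None
        self.stale_epochs = 0

    def step(self, accuracy: float) -> bool:
        # Exponential moving average: the newest value is weighted by alpha.
        if self.ema is None:
            self.ema = accuracy
        else:
            self.ema = self.alpha * accuracy + (1.0 - self.alpha) * self.ema
        if self.best_ema is None or self.ema > self.best_ema + self.threshold:
            self.best_ema = self.ema
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


loss_stopper = LossPatienceStopper(patience=2)
loss_decisions = [loss_stopper.step(loss) for loss in (1.0, 0.9, 0.95, 0.97)]

acc_stopper = EmaAccuracyStopper(patience=2)
acc_decisions = [acc_stopper.step(acc) for acc in (50.0, 50.5, 50.6)]
```

In both examples the stopper signals termination on the last value, after two consecutive epochs without sufficient improvement. Unlike the configuration defaults shown earlier, this sketch does not treat a patience of 0 as "disabled".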

I made the following modifications to the <u>visualizer.py</u> and <u>tensorboard_visualizer.py</u> files.
<ul>
    <li>Added an <code>add_image(self, tag, image)</code> method to visualize the dataset.</li>
    <li>Added an <code>add_graph(self, model, images)</code> method to document the model.</li>
    <li>Added an <code>add_pr_curves(self, classes, pred_probs, targets)</code> method to document the precision-recall curves of the fully trained model for each class type.</li>
</ul>

In [None]:
class Optimizer(Enum):
    SGD = auto()
    ADAM = auto()
    
def get_optimizer(
    model: nn.Module,
    optimizer: Optimizer = Optimizer.SGD,
    config: OptimizerConfig = OptimizerConfig()
):
    """
    Gets the specified optimizer.
    """
    
    if optimizer == Optimizer.SGD:
        return optim.SGD(
            model.parameters(),
            lr = config.learning_rate,
            weight_decay = config.weight_decay,
            momentum = config.momentum
        )
    
    elif optimizer == Optimizer.ADAM:
        return optim.Adam(
            model.parameters(),
            lr = config.learning_rate,
            betas = config.betas
        )
    
    else:
        raise SystemExit("Invalid optimizer value.")

In [None]:
class LrScheduler(Enum):
    STEP = auto()
    MULTI_STEP = auto()
    EXPONENTIAL = auto()
    REDUCE_ON_PLATEAU = auto()
    
def get_lr_scheduler(
    optimizer: optim.Optimizer,
    lr_scheduler: LrScheduler = LrScheduler.STEP,
    config: SchedulerConfig = SchedulerConfig()
):
    """
    Gets the specified LR scheduler.
    """

    if lr_scheduler == LrScheduler.STEP:
        return optim.lr_scheduler.StepLR(
            optimizer,
            step_size = config.step_size,
            gamma = config.gamma
        )
    
    elif lr_scheduler == LrScheduler.MULTI_STEP:
        return optim.lr_scheduler.MultiStepLR(
            optimizer, 
            milestones = config.milestones, 
            gamma = config.gamma
        )
    
    elif lr_scheduler == LrScheduler.EXPONENTIAL:
        return optim.lr_scheduler.ExponentialLR(
            optimizer, 
            gamma = config.gamma
        )
    
    
    elif lr_scheduler == LrScheduler.REDUCE_ON_PLATEAU:
        return optim.lr_scheduler.ReduceLROnPlateau(
            optimizer, 
            factor = config.gamma,
            patience = config.patience,
            threshold = config.threshold
        )
    
    else:
        raise SystemExit("Invalid lr_scheduler value.")
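
To make the effect of <code>gamma</code>, <code>step_size</code>, and <code>milestones</code> concrete, the closed-form learning rates of the first three schedulers can be computed by hand. The sketch below is plain Python and does not call PyTorch; the helper names are hypothetical. <code>ReduceLROnPlateau</code> is left out because its schedule depends on the monitored metric rather than on the epoch number alone.

```python
from bisect import bisect_right


def step_lr(base_lr, gamma, step_size, epoch):
    """StepLR: multiply the rate by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def multi_step_lr(base_lr, gamma, milestones, epoch):
    """MultiStepLR: multiply the rate by gamma at each milestone reached."""
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


def exponential_lr(base_lr, gamma, epoch):
    """ExponentialLR: multiply the rate by gamma after every epoch."""
    return base_lr * gamma ** epoch


# Print the first few epochs of each schedule for comparison.
for epoch in range(0, 12, 2):
    print(
        epoch,
        step_lr(0.1, 0.5, 4, epoch),
        multi_step_lr(0.1, 0.5, [3, 9], epoch),
        exponential_lr(0.1, 0.9, epoch),
    )
```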

In [None]:
def predict_batch(model, data, max_prob=True):
    """
    Get prediction for a batch of data. This function assumes the model and data
    have been sent to the appropriate device and the model is in evaluation mode.
    """

    output = model(data)

    # get probability score using softmax
    prob = F.softmax(output, dim=1)
    
    if max_prob:
        # get the max probability
        pred_prob = prob.data.max(dim=1)[0]
    else:
        # return all probabilities
        pred_prob = prob.data
    
    # get the index of the max probability
    pred_index = prob.data.max(dim=1)[1]
    
    return pred_index.cpu().numpy(), pred_prob.cpu().numpy()
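
As a worked illustration of what <code>predict_batch</code> returns for a single sample, the softmax and arg-max steps can be reproduced in plain Python. This is a sketch under simplifying assumptions: it handles one list of logits instead of a batched tensor, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def predict_one(logits, max_prob=True):
    """Return (predicted index, its probability or all probabilities)."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, (probs[index] if max_prob else probs)


index, prob = predict_one([1.0, 3.0, 2.0])
print(index, round(prob, 4))
```

With <code>max_prob=False</code> the second value is the full probability vector, which is the form the precision-recall curves need.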

In [None]:
def get_targets_and_pred_probs(model, dataloader, device):
    """
    Get targets and prediction probabilities.
    """
    
    model.to(device)  # send model to cpu or cuda
    model.eval()      # set model to evaluation mode

    targets = []
    pred_probs = []

    for _, (data, target) in enumerate(dataloader):
        _, probs = predict_batch(model, data.to(device), max_prob=False)       
        pred_probs.append(probs)
        targets.append(target.numpy())
        
    targets = np.concatenate(targets).astype(int)
    pred_probs = np.concatenate(pred_probs, axis=0)
    
    return targets, pred_probs

In [None]:
def predict_test_data(model, dataloader, device):
    """
    Predict the class of the test data.
    """

    model.to(device)  # send model to cpu or cuda
    model.eval()      # set model to evaluation mode

    fnames = []
    labels = []

    for _, (data, fname) in enumerate(dataloader):
        label, _ = predict_batch(model, data.to(device), max_prob=True)       
        fnames.append(fname)
        labels.append(label)
        
    fnames = np.concatenate(fnames)
    labels = np.concatenate(labels).astype(int)
    
    return fnames, labels

## <font style="color:green">5. Model [5 Points]</font>

Define your model in this section.

**You are allowed to use any pre-trained model.**

## <font style="color:blue">Assignment Response</font>

My primary objective is to explore fine tuning numerous pretrained models. Hence, I created classes
to easily set the "tuning level" of the ResNet, VGG, and DenseNet family of TorchVision models. I
also want to see how the model I developed for Project 1 performs, so I created a class for it.

In [None]:
TuningParam = namedtuple("TuningParam", ["level", "block", "layers"])

In [None]:
class TorchVisionModel(nn.Module):
    """
    Base class for TorchVision models, which provides a method to freeze network
    layers allowing fine tuning. This class does not change the network's output layer.
    Derived classes must do this!
    """
    
    def __init__(self, network: nn.Module):
        super().__init__()
        self._network = network
        
    def forward(self, x):
        return self._network(x)
    
    def _freeze_layers(
        self, 
        tuning_params: List[TuningParam], 
        pretrained:bool, 
        tuning_level:int
    ):
        # freeze network if using a pretrained model
        if pretrained:
            self._set_requires_grad(self._network, False)
        
        # unfreeze blocks/layers based on tuning_level
        for param in tuning_params:
            if param.level <= tuning_level:
                block = getattr(self._network, param.block)
                if param.layers is None:
                    self._set_requires_grad(block, True)
                else:
                    for layer in param.layers:
                        if isinstance(layer, int):
                            self._set_requires_grad(block[layer], True)
                        else:
                            self._set_requires_grad(getattr(block, layer), True)
            
    def _set_requires_grad(self, block, value):
        for param in block.parameters():
            param.requires_grad = value
            
    def _inclusive_range(self, start:int, stop:int) -> List[int]:
        return list(range(start, stop + 1))

In [None]:
class ResNetBase(TorchVisionModel):
    """
    Base class for ResNet models that may be pretrained and fine tuned. The
    tuning_level parameter controls the degree of fine tuning as depicted in
    the table below.
        
        ResNet     tuning_level
        -------    ------------
        conv1          >= 5        
        bn1            >= 5
        relu           >= 5
        maxpool        >= 5
        layer1         >= 4
        layer2         >= 3
        layer3         >= 2
        layer4         >= 1
        avgpool        >= 1
        fc             >= 0
        
    If tuning_level = 0, then only the classifier layer is trained.
    If tuning_level = 5, then the entire network is trained.
    """
    
    def __init__(self, model_fn: Callable, pretrained=True, tuning_level=0):
        super().__init__(model_fn(pretrained=pretrained))

        # change the output layer
        last_layer_in = self._network.fc.in_features
        self._network.fc = nn.Linear(last_layer_in, 13)

        # ToDo: Omit layer types that do not have trainable parameters
        tuning_params = [
            TuningParam(0, "fc", None),
            TuningParam(1, "avgpool", None),
            TuningParam(1, "layer4", None),
            TuningParam(2, "layer3", None),
            TuningParam(3, "layer2", None),
            TuningParam(4, "layer1", None),
            TuningParam(5, "maxpool", None),
            TuningParam(5, "relu", None),
            TuningParam(5, "bn1", None),
            TuningParam(5, "conv1", None)
        ]

        self._freeze_layers(tuning_params, pretrained, tuning_level)
    
    def forward(self, x):
        return self._network(x)
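
The mapping from <code>tuning_level</code> to trainable blocks can be checked without loading any weights. The following sketch reuses the <code>TuningParam</code> tuple and the ResNet table from the docstring; the <code>trainable_blocks</code> helper is hypothetical and only mirrors the <code>param.level <= tuning_level</code> test in <code>_freeze_layers</code>.

```python
from collections import namedtuple

TuningParam = namedtuple("TuningParam", ["level", "block", "layers"])

# Same entries as the ResNet tuning table, from the output layer inward.
RESNET_TUNING_PARAMS = [
    TuningParam(0, "fc", None),
    TuningParam(1, "avgpool", None),
    TuningParam(1, "layer4", None),
    TuningParam(2, "layer3", None),
    TuningParam(3, "layer2", None),
    TuningParam(4, "layer1", None),
    TuningParam(5, "maxpool", None),
    TuningParam(5, "relu", None),
    TuningParam(5, "bn1", None),
    TuningParam(5, "conv1", None),
]


def trainable_blocks(tuning_params, tuning_level):
    """Names of the blocks that are unfrozen at the given tuning level."""
    return [p.block for p in tuning_params if p.level <= tuning_level]


for level in range(6):
    print(level, trainable_blocks(RESNET_TUNING_PARAMS, level))
```

Blocks such as <code>avgpool</code>, <code>relu</code>, and <code>maxpool</code> have no parameters, so listing them is harmless; they appear only to keep the table complete.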

In [None]:
class ResNet18(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.resnet18, pretrained, tuning_level)

In [None]:
class ResNet34(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.resnet34, pretrained, tuning_level)

In [None]:
class ResNet50(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.resnet50, pretrained, tuning_level)

In [None]:
class ResNet101(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.resnet101, pretrained, tuning_level)

In [None]:
class ResNet152(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.resnet152, pretrained, tuning_level)

In [None]:
class ResNeXt50(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.resnext50_32x4d, pretrained, tuning_level)

In [None]:
class ResNeXt101(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.resnext101_32x8d, pretrained, tuning_level)

In [None]:
class WideResNet50(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.wide_resnet50_2, pretrained, tuning_level)

In [None]:
class WideResNet101(ResNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.wide_resnet101_2, pretrained, tuning_level)

In [None]:
class VGGBase(TorchVisionModel):
    """
    Base class for VGG models that may be pretrained and fine tuned.
    """
    
    def __init__(self, model_fn: Callable, pretrained=True):
        super().__init__(model_fn(pretrained=pretrained))

        last_layer_in = self._network.classifier[6].in_features
        self._network.classifier[6] = nn.Linear(last_layer_in, 13)
    
    def forward(self, x):
        return self._network(x)

In [None]:
class VGG11BN(VGGBase):
    """
    VGG11BN model that may be pretrained and fine tuned. The tuning_level
    parameter controls the degree of fine tuning as depicted in the table
    below.
    
        VGG11_BN            tuning_level
        ----------------    ------------
        features
          [00-02] CNR           >= 5
          [03] MaxPool2d        >= 5
          [04-06] CNR           >= 4
          [07] MaxPool2d        >= 4
          [08-10] CNR           >= 3
          [11-13] CNR           >= 3
          [14] MaxPool2d        >= 3
          [15-17] CNR           >= 2
          [18-20] CNR           >= 2
          [21] MaxPool2d        >= 2
          [22-24] CNR           >= 1
          [25-27] CNR           >= 1
          [28] MaxPool2d        >= 1
        avgpool                 >= 1
        classifier              
          [00-02] LRD           >= 0
          [03-05] LRD           >= 0
          [06] Linear           >= 0
    """

    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.vgg11_bn, pretrained)
            
        # ToDo: Omit layer types that do not have trainable parameters
        tuning_params = [
            TuningParam(0, "classifier", None),
            TuningParam(1, "avgpool", None),
            TuningParam(1, "features", self._inclusive_range(22, 28)),
            TuningParam(2, "features", self._inclusive_range(15, 21)),
            TuningParam(3, "features", self._inclusive_range(8, 14)),
            TuningParam(4, "features", self._inclusive_range(4, 7)),
            TuningParam(5, "features", self._inclusive_range(0, 3))
        ]

        self._freeze_layers(tuning_params, pretrained, tuning_level)

In [None]:
class VGG13BN(VGGBase):
    """
    VGG13BN model that may be pretrained and fine tuned. The tuning_level
    parameter controls the degree of fine tuning as depicted in the table
    below.
    
        VGG13_BN            tuning_level
        ----------------    ------------
        features
          [00-02] CNR           >= 5
          [03-05] CNR           >= 5
          [06] MaxPool2d        >= 5
          [07-09] CNR           >= 4
          [10-12] CNR           >= 4
          [13] MaxPool2d        >= 4
          [14-16] CNR           >= 3
          [17-19] CNR           >= 3
          [20] MaxPool2d        >= 3
          [21-23] CNR           >= 2
          [24-26] CNR           >= 2
          [27] MaxPool2d        >= 2
          [28-30] CNR           >= 1
          [31-33] CNR           >= 1
          [34] MaxPool2d        >= 1
        avgpool                 >= 1
        classifier
          [00-02] LRD           >= 0
          [03-05] LRD           >= 0
          [06] Linear           >= 0
    """

    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.vgg13_bn, pretrained)
            
        # ToDo: Omit layer types that do not have trainable parameters
        tuning_params = [
            TuningParam(0, "classifier", None),
            TuningParam(1, "avgpool", None),
            TuningParam(1, "features", self._inclusive_range(28, 34)),
            TuningParam(2, "features", self._inclusive_range(21, 27)),
            TuningParam(3, "features", self._inclusive_range(14, 20)),
            TuningParam(4, "features", self._inclusive_range(7, 13)),
            TuningParam(5, "features", self._inclusive_range(0, 6))
        ]

        self._freeze_layers(tuning_params, pretrained, tuning_level)

In [None]:
class VGG16BN(VGGBase):
    """
    VGG16BN model that may be pretrained and fine tuned. The tuning_level
    parameter controls the degree of fine tuning as depicted in the table
    below.
    
        VGG16_BN            tuning_level
        ----------------    ------------
        features
          [00-02] CNR           >= 5
          [03-05] CNR           >= 5
          [06] MaxPool2d        >= 5
          [07-09] CNR           >= 4
          [10-12] CNR           >= 4
          [13] MaxPool2d        >= 4
          [14-16] CNR           >= 3
          [17-19] CNR           >= 3
          [20-22] CNR           >= 3
          [23] MaxPool2d        >= 3
          [24-26] CNR           >= 2
          [27-29] CNR           >= 2
          [30-32] CNR           >= 2
          [33] MaxPool2d        >= 2
          [34-36] CNR           >= 1
          [37-39] CNR           >= 1
          [40-42] CNR           >= 1
          [43] MaxPool2d        >= 1
        avgpool                 >= 1
        classifier              
          [00-02] LRD           >= 0
          [03-05] LRD           >= 0
          [06] Linear           >= 0
    """

    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.vgg16_bn, pretrained)
            
        # ToDo: Omit layer types that do not have trainable parameters
        tuning_params = [
            TuningParam(0, "classifier", None),
            TuningParam(1, "avgpool", None),
            TuningParam(1, "features", self._inclusive_range(34, 43)),
            TuningParam(2, "features", self._inclusive_range(24, 33)),
            TuningParam(3, "features", self._inclusive_range(14, 23)),
            TuningParam(4, "features", self._inclusive_range(7, 13)),
            TuningParam(5, "features", self._inclusive_range(0, 6))
        ]

        self._freeze_layers(tuning_params, pretrained, tuning_level)

In [None]:
class VGG19BN(VGGBase):
    """
    VGG19BN model that may be pretrained and fine tuned. The tuning_level
    parameter controls the degree of fine tuning as depicted in the table
    below.

        VGG19_BN            tuning_level
        ----------------    ------------
        features
          [00-02] CNR           >= 5
          [03-05] CNR           >= 5
          [06] MaxPool2d        >= 5
          [07-09] CNR           >= 4
          [10-12] CNR           >= 4
          [13] MaxPool2d        >= 4
          [14-16] CNR           >= 3
          [17-19] CNR           >= 3
          [20-22] CNR           >= 3
          [23-25] CNR           >= 3
          [26] MaxPool2d        >= 3
          [27-29] CNR           >= 2
          [30-32] CNR           >= 2
          [33-35] CNR           >= 2
          [36-38] CNR           >= 2
          [39] MaxPool2d        >= 2
          [40-42] CNR           >= 1
          [43-45] CNR           >= 1
          [46-48] CNR           >= 1
          [49-51] CNR           >= 1
          [52] MaxPool2d        >= 1
        avgpool                 >= 1
        classifier              
          [00-02] LRD           >= 0
          [03-05] LRD           >= 0
          [06] Linear           >= 0
    """

    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.vgg19_bn, pretrained)
            
        # ToDo: Omit layer types that do not have trainable parameters
        tuning_params = [
            TuningParam(0, "classifier", None),
            TuningParam(1, "avgpool", None),
            TuningParam(1, "features", self._inclusive_range(40, 52)),
            TuningParam(2, "features", self._inclusive_range(27, 39)),
            TuningParam(3, "features", self._inclusive_range(14, 26)),
            TuningParam(4, "features", self._inclusive_range(7, 13)),
            TuningParam(5, "features", self._inclusive_range(0, 6))
        ]

        self._freeze_layers(tuning_params, pretrained, tuning_level)

In [None]:
class DenseNetBase(TorchVisionModel):
    """
    Base class for DenseNet models that may be pretrained and fine tuned. The
    tuning_level parameter controls the degree of fine tuning as depicted in
    the table below.
        
        DenseNet          tuning_level
        -------------     ------------
        features
          conv0               >= 5
          norm0               >= 5
          relu0               >= 5
          pool0               >= 5
          denseblock1         >= 4
          transition1         >= 4
          denseblock2         >= 3
          transition2         >= 3
          denseblock3         >= 2
          transition3         >= 2
          denseblock4         >= 1
          norm5               >= 1
        classifier            >= 0

    """

    def __init__(self, model_fn: Callable, pretrained=True, tuning_level=0):
        super().__init__(model_fn(pretrained=pretrained))

        # change the output layer
        last_layer_in = self._network.classifier.in_features
        self._network.classifier = nn.Linear(last_layer_in, 13)

        # ToDo: Omit layer types that do not have trainable parameters
        tuning_params = [
            TuningParam(0, "classifier", None),
            TuningParam(1, "features", ["denseblock4", "norm5"]),
            TuningParam(2, "features", ["denseblock3", "transition3"]),
            TuningParam(3, "features", ["denseblock2", "transition2"]),
            TuningParam(4, "features", ["denseblock1", "transition1"]),
            TuningParam(5, "features", ["conv0", "norm0", "relu0", "pool0"])
        ]

        self._freeze_layers(tuning_params, pretrained, tuning_level)
    
    def forward(self, x):
        return self._network(x)

In [None]:
class DenseNet121(DenseNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.densenet121, pretrained, tuning_level)

In [None]:
class DenseNet169(DenseNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.densenet169, pretrained, tuning_level)

In [None]:
class DenseNet201(DenseNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.densenet201, pretrained, tuning_level)

In [None]:
class DenseNet161(DenseNetBase):
    def __init__(self, pretrained=True, tuning_level=0):
        super().__init__(models.densenet161, pretrained, tuning_level)

In [None]:
class Project1Model(nn.Module):
    """
    Modified the last layer to output 13, rather than 3, features.
    """
    def __init__(self):
        super().__init__()

        # Convolution layers
        self._body = nn.Sequential(
            # input 3 x 224 x 224
            nn.Conv2d(in_channels=3, out_channels=16, kernel_size=7, padding=3),
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),

            # input 16 x 112 x 112
            nn.Conv2d(in_channels=16, out_channels=24, kernel_size=5, padding=2),
            nn.BatchNorm2d(24),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),

            # input 24 x 56 x 56
            nn.Conv2d(in_channels=24, out_channels=36, kernel_size=5, padding=2),
            nn.BatchNorm2d(36),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),

            # input 36 x 28 x 28
            nn.Conv2d(in_channels=36, out_channels=54, kernel_size=5, padding=2),
            nn.BatchNorm2d(54),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),

            # input 54 x 14 x 14
            nn.Conv2d(in_channels=54, out_channels=81, kernel_size=5, padding=2),
            nn.BatchNorm2d(81),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
        )

        # Fully connected layers
        self._head = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(in_features=81*7*7, out_features=1024), 
            nn.ReLU(inplace=True),

            nn.Dropout(0.5),
            nn.Linear(in_features=1024, out_features=256), 
            nn.ReLU(inplace=True),

            nn.Linear(in_features=256, out_features=13)            
        )
        
    def forward(self, x):
        x = self._body(x)
        x = x.view(x.size()[0], -1)
        x = self._head(x)
        return x
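
The <code>81*7*7</code> input size of the first fully connected layer follows from the convolution and pooling arithmetic. The short sketch below traces the feature-map shape through the five stages of <code>_body</code> for a 224 x 224 input; the helper names are hypothetical, and the formulas assume stride-1 convolutions and non-overlapping 2 x 2 max pooling, as in the model above.

```python
def conv_out(size, kernel, padding, stride=1):
    """Spatial size after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Spatial size after max pooling with stride equal to the kernel size."""
    return size // kernel


# (out_channels, kernel_size, padding) for each Conv2d in _body.
STAGES = [(16, 7, 3), (24, 5, 2), (36, 5, 2), (54, 5, 2), (81, 5, 2)]


def trace_shapes(size=224):
    """Return the (channels, height, width) shape after each stage."""
    shapes = []
    for channels, kernel, padding in STAGES:
        size = pool_out(conv_out(size, kernel, padding), 2)
        shapes.append((channels, size, size))
    return shapes


for shape in trace_shapes():
    print(shape)
```

Each convolution preserves the spatial size because its padding is half the kernel size rounded down, so only the pooling layers halve it: 224, 112, 56, 28, 14, 7.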

## <font style="color:green">6. Utils [5 Points]</font>

Define your methods or classes which are not covered in the above sections.

In [None]:
def creeate_submission_csv(path, exp):
    """
    ToDo: Need to test and execute on the best model.
    """

    # create a dictionary of numeric labels to text labels
    label_dict = {}
    for key, value in zip(np.arange(len(exp.classes)), exp.classes):
        label_dict[key] = value

    # get predictions for the test data using the trained model            
    fnames, labels = predict_test_data(exp.trained_model, exp.test_loader, exp.device)

    # convert the numeric labels to their text equivalents
    labels = [label_dict[label] for label in labels]

    # create a pandas data frame and write it to a CSV file
    data_frame = pd.DataFrame(
        np.stack((fnames, labels), axis=-1), 
        columns=["id", "class"]
    )

    data_frame.to_csv(path)

In [None]:
def get_requires_grad_status(block) -> str:
    params = list(block.parameters())
    if not params:
        return "N/A"
    
    or_of_params = False
    and_of_params = True
    for param in params:
        or_of_params = or_of_params or param.requires_grad
        and_of_params = and_of_params and param.requires_grad
    if or_of_params and and_of_params:
        return "True"
    elif not or_of_params and not and_of_params:
        return "False"
    else:
        return "Mixed"

In [None]:
def print_top_level_model_blocks(
    model:nn.Module, 
    include_grandchildren:bool = False, 
    display_requires_grad = False
):
    status = ""
    if display_requires_grad:
        status = f", requires_grad={get_requires_grad_status(model)}"
    print(f"{type(model).__name__}{status}")
    for child in model.named_children():
        if display_requires_grad:
            status = f", requires_grad={get_requires_grad_status(child[1])}"
        print(f"  {child[0]}{status}")
        if include_grandchildren:
            for grandchild in child[1].named_children():
                if display_requires_grad:
                    status = f", requires_grad={get_requires_grad_status(grandchild[1])}"
                if not grandchild[0].isnumeric():
                    print(f"    {grandchild[0]}{status}")
                else:
                    print(f"    [{grandchild[0]}] {type(grandchild[1]).__name__}{status}")

In [None]:
#  Output the architecture of several pretrained PyTorch models.
#  
#  The following models all have the same high level ResNet architecture.
#  
#      - ResNet-18
#      - ResNet-34
#      - ResNet-50
#      - ResNet-101
#      - ResNet-152
#      - ResNeXt-50-32x4d
#      - Wide ResNet-50-2
#      - Wide ResNet-101-2
#  
#  The following models all have the same high level DenseNet architecture.
#  
#      - Densenet-121
#      - Densenet-169
#      - Densenet-201
#      - Densenet-161


#  print_top_level_model_blocks(models.resnet18(), False)
#  print_top_level_model_blocks(models.densenet121(), True)
#  print_top_level_model_blocks(models.vgg11_bn(), True)
#  print_top_level_model_blocks(models.vgg13_bn(), True)
#  print_top_level_model_blocks(models.vgg16_bn(), True)
#  print_top_level_model_blocks(models.vgg19_bn(), True)


#  The (formatted) output of the commented-out statements above is as follows.
#  
#  Note: Groups of Conv2d, BatchNorm, and ReLU layers have been condensed to CNR
#        Groups of Linear, ReLU, and Dropout layers have been condensed to LRD
#  
#  ResNet           | DenseNet         | VGG11_BN         | VGG13_BN         | VGG16_BN         | VGG19_BN
#    conv1          |   features       |   features       |   features       |   features       |   features
#    bn1            |     conv0        |     [00-02] CNR  |     [00-02] CNR  |     [00-02] CNR  |     [00-02] CNR
#    relu           |     norm0        |                  |     [03-05] CNR  |     [03-05] CNR  |     [03-05] CNR
#    maxpool        |     relu0        |     [03] MaxPool |     [06] MaxPool |     [06] MaxPool |     [06] MaxPool2d
#    layer1         |     pool0        |     [04-06] CNR  |     [07-09] CNR  |     [07-09] CNR  |     [07-09] CNR
#    layer2         |     denseblock1  |                  |     [10-12] CNR  |     [10-12] CNR  |     [10-12] CNR
#    layer3         |     transition1  |     [07] MaxPool |     [13] MaxPool |     [13] MaxPool |     [13] MaxPool2d
#    layer4         |     denseblock2  |     [08-10] CNR  |     [14-16] CNR  |     [14-16] CNR  |     [14-16] CNR
#    avgpool        |     transition2  |     [11-13] CNR  |     [17-19] CNR  |     [17-19] CNR  |     [17-19] CNR
#    fc             |     denseblock3  |                  |                  |     [20-22] CNR  |     [20-22] CNR
#                   |     transition3  |                  |                  |                  |     [23-25] CNR
#                   |     denseblock4  |     [14] MaxPool |     [20] MaxPool |     [23] MaxPool |     [26] MaxPool2d
#                   |     norm5        |     [15-17] CNR  |     [21-23] CNR  |     [24-26] CNR  |     [27-29] CNR
#                   |   classifier     |     [18-20] CNR  |     [24-26] CNR  |     [27-29] CNR  |     [30-32] CNR
#                   |                  |                  |                  |     [30-32] CNR  |     [33-35] CNR
#                   |                  |                  |                  |                  |     [36-38] CNR
#                   |                  |     [21] MaxPool |     [27] MaxPool |     [33] MaxPool |     [39] MaxPool2d
#                   |                  |     [22-24] CNR  |     [28-30] CNR  |     [34-36] CNR  |     [40-42] CNR
#                   |                  |     [25-27] CNR  |     [31-33] CNR  |     [37-39] CNR  |     [43-45] CNR
#                   |                  |                  |                  |     [40-42] CNR  |     [46-48] CNR
#                   |                  |                  |                  |                  |     [49-51] CNR
#                   |                  |     [28] MaxPool |     [34] MaxPool |     [43] MaxPool |     [52] MaxPool2d
#                   |                  |   avgpool        |   avgpool        |   avgpool        |   avgpool
#                   |                  |   classifier     |   classifier     |   classifier     |   classifier
#                   |                  |     [00-02] LRD  |     [00-02] LRD  |     [00-02] LRD  |     [00-02] LRD
#                   |                  |     [03-05] LRD  |     [03-05] LRD  |     [03-05] LRD  |     [03-05] LRD
#                   |                  |     [06] Linear  |     [06] Linear  |     [06] Linear  |     [06] Linear


#  This function was also used to test whether I properly implemented the fine tuning code. For example,
#
#  model = ResNet18(pretrained=True, tuning_level=0)
#  print_top_level_model_blocks(model._network, include_grandchildren=False, display_requires_grad=True)
#
#      ResNet, requires_grad=Mixed
#        conv1, requires_grad=False
#        bn1, requires_grad=False
#        relu, requires_grad=N/A
#        maxpool, requires_grad=N/A
#        layer1, requires_grad=False
#        layer2, requires_grad=False
#        layer3, requires_grad=False
#        layer4, requires_grad=False
#        avgpool, requires_grad=N/A
#        fc, requires_grad=True
#
#  model = ResNet18(pretrained=True, tuning_level=1)
#  print_top_level_model_blocks(model._network, include_grandchildren=False, display_requires_grad=True)
#
#      ResNet, requires_grad=Mixed
#        conv1, requires_grad=False
#        bn1, requires_grad=False
#        relu, requires_grad=N/A
#        maxpool, requires_grad=N/A
#        layer1, requires_grad=False
#        layer2, requires_grad=False
#        layer3, requires_grad=False
#        layer4, requires_grad=True
#        avgpool, requires_grad=N/A
#        fc, requires_grad=True
#
#  ...
#
#  model = ResNet18(pretrained=True, tuning_level=5)
#  print_top_level_model_blocks(model._network, include_grandchildren=False, display_requires_grad=True)
#
#      ResNet, requires_grad=True
#        conv1, requires_grad=True
#        bn1, requires_grad=True
#        relu, requires_grad=N/A
#        maxpool, requires_grad=N/A
#        layer1, requires_grad=True
#        layer2, requires_grad=True
#        layer3, requires_grad=True
#        layer4, requires_grad=True
#        avgpool, requires_grad=N/A
#        fc, requires_grad=True

## <font style="color:green">7. Experiment [5 Points]</font>

Choose your optimizer and LR-scheduler and use the above methods and classes to train your model.

### <font style="color:blue">Base Experiment Classes</font>

The following base classes facilitate experiment creation.
<ul>
    <li>Experiment - Base class for the following classes.</li>
    <li>VisualExperiment - Conduct data visualization experiments.</li>
    <li>ModelExperiment - Conduct model training experiments.</li>
</ul>

In [None]:
class Experiment(ABC):
    def __init__(
        self,
        abbr: str = None,
        transform_resize: int = 256,
        transform_crop_size: int = 224,
        data_loader_batch_size: int = None,
        data_loader_num_workers: int = None,
        optimzer_learning_rate: float = None,
        optimzer_momentum: float = None,
        optimzer_weight_decay: float = None,
        optimzer_betas: Tuple[float, float] = None,
        lr_scheduler_gamma: float = None,
        lr_scheduler_step_size: int = None,
        lr_scheduler_milestones: Iterable = None,
        lr_scheduler_patience: int = None,
        lr_scheduler_threshold: float = None,
        trainer_training_epochs: int = None,
        trainer_stop_loss_epochs: int = None,
        trainer_stop_acc_epochs: int = None,
        trainer_stop_acc_ema_alpha: float = None,
        trainer_stop_acc_threshold: float = None
    ):

        """
        This base class for data visualization and model training experiments does the following.
        
            - Creates the master configuration instance, accommodating constructor overrides
            - Sets up the system, e.g., ensures reproducibility, enables CUDA acceleration, etc.
            - Initializes the KenyanFood13 dataset
            - Configures experiment visualization 

        """

        if abbr is None:
            self._abbr = type(self).__name__
        else:
            self._abbr = abbr

        
        # ToDo: Apply patch if CUDA is not available.
        self._resize = transform_resize
        self._crop_size = transform_crop_size
        self._config = create_master_config(
            transform_resize,
            transform_crop_size,
            data_loader_batch_size,
            data_loader_num_workers,
            optimzer_learning_rate,
            optimzer_momentum,
            optimzer_weight_decay,
            optimzer_betas,
            lr_scheduler_gamma,
            lr_scheduler_step_size,
            lr_scheduler_milestones,
            lr_scheduler_patience,
            lr_scheduler_threshold,  
            trainer_training_epochs,
            trainer_stop_loss_epochs,
            trainer_stop_acc_epochs,
            trainer_stop_acc_ema_alpha,
            trainer_stop_acc_threshold       
        )
        

        setup_system(self._config.system)
        
        self._data = KenyanFood13Data(
            data_root = self._config.dataset.data_dir,
            valid_size = self._config.dataset.valid_size,
            random_seed = self._config.system.seed
        )

        self._classes = self._data.classes
        self._library = self._data.library
        self.__visualizer = None
        
    @property
    def classes(self):
        return self._classes

    @property
    def library(self):
        return self._library

    """
    Protected methods that may or must be overridden by derived classes.
    """
    
    @property
    @abstractmethod
    def _visualizer_name(self) -> str:
        pass

    def _open_visualizer(self):
        if self.__visualizer is None:
            self.__visualizer = TensorBoardVisualizer(os.path.join(
                self._config.system.proj_dir,
                self._config.trainer.visualizer_dir, 
                self._visualizer_name
            ))
        return self.__visualizer

    def _close_visualizer(self):
        if self.__visualizer is not None:
            self.__visualizer.close_tensorboard()
            self.__visualizer = None
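
The `Experiment` constructor forwards every argument, including those left at `None`, to `create_master_config`, whose body is not shown in this section. The following is a minimal sketch of how such an override could work, assuming `None` means "keep the default". `DataLoaderConfig`, `apply_overrides`, and the default batch size are hypothetical names and values used only for illustration.

```python
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class DataLoaderConfig:
    # Hypothetical defaults, for illustration only.
    batch_size: int = 32
    num_workers: int = 4


def apply_overrides(config: Any, **overrides: Any) -> Any:
    """Return a copy of `config` with every non-None override applied."""
    given = {key: value for key, value in overrides.items() if value is not None}
    return replace(config, **given)


# Only num_workers is overridden; batch_size keeps its default.
config = apply_overrides(DataLoaderConfig(), batch_size=None, num_workers=8)
```

With this pattern, a derived experiment only names the settings it wants to change, which is how the experiment classes in this notebook call `super().__init__`.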

In [None]:
class VisualExperiment(Experiment):
    def __init__(
        self,
        abbr: str = None,
        transform_resize: int = 256,
        transform_crop_size: int = 224
    ):
        """
        This is the base class for data visualization experiments.
        """

        super().__init__(
            abbr,
            transform_resize,
            transform_crop_size
        )

    def log_sample_images(self):
        """
        Create a 6 x 6 grid of images for each type of food in the data and
        log these images to the visualizer.
        """

        visualizer = self._open_visualizer()

        for food, fnames in self._library.items():
            # create food specific dataset
            dataset = KenyanFood13Dataset(
                image_root=self._data.image_root,
                fnames=fnames,
                transform=self._config.dataset.visual_transforms
            )

            # randomly load 36 images
            dataloader = DataLoader(dataset, batch_size=36, shuffle=True)
            images, _ = next(iter(dataloader))

            # save image to project directory
            # path = os.path.join(proj_dir, self._classes[food] + ".jpg")
            # torchvision.utils.save_image(images, fp=path, nrow=6)

            # add image grid to visualizer
            visualizer.add_image(
                tag=self._classes[food], 
                image=torchvision.utils.make_grid(images, nrow=6)
            )
        
        self._close_visualizer()
        
    @property
    def _visualizer_name(self) -> str:
        return self._abbr + f"--DV-RS_{self._resize}-CS_{self._crop_size}"

In [None]:
class ModelExperiment(Experiment):
    def __init__(
        self,
        abbr: str = None,
        optimizer: Optimizer = Optimizer.SGD,
        lr_scheduler: LrScheduler = LrScheduler.STEP,
        transform_resize: int = 256,
        transform_crop_size: int = 224,
        data_loader_batch_size: int = None,
        data_loader_num_workers: int = None,
        optimzer_learning_rate: float = None,
        optimzer_momentum: float = None,
        optimzer_weight_decay: float = None,
        optimzer_betas: Tuple[float, float] = None,
        lr_scheduler_gamma: float = None,
        lr_scheduler_step_size: int = None,
        lr_scheduler_milestones: Iterable = None,
        lr_scheduler_patience: int = None,
        lr_scheduler_threshold: float = None,
        trainer_training_epochs: int = None,
        trainer_stop_loss_epochs: int = None,
        trainer_stop_acc_epochs: int = None,
        trainer_stop_acc_ema_alpha: float = None,
        trainer_stop_acc_threshold: float = None,
        use_data_subsets: bool = False
    ):
        """
        This is the base class for model training experiments.
        """
        
        super().__init__(
            abbr,
            transform_resize,
            transform_crop_size,
            data_loader_batch_size,
            data_loader_num_workers,
            optimzer_learning_rate,
            optimzer_momentum,
            optimzer_weight_decay,
            optimzer_betas,
            lr_scheduler_gamma,
            lr_scheduler_step_size,
            lr_scheduler_milestones,
            lr_scheduler_patience,
            lr_scheduler_threshold,
            trainer_training_epochs,
            trainer_stop_loss_epochs,
            trainer_stop_acc_epochs,
            trainer_stop_acc_ema_alpha,
            trainer_stop_acc_threshold
        )

        train_dataset, valid_dataset, test_dataset = get_datasets(
            data = self._data,
            test_transforms = self._config.dataset.test_transforms,
            train_transforms = self._config.dataset.train_transforms,
            subset = use_data_subsets
        )

        self.__train_loader, self.__valid_loader, self.__test_loader = get_data_loaders(
            train_dataset = train_dataset,
            valid_dataset = valid_dataset,
            test_dataset = test_dataset,
            batch_size = self._config.data_loader.batch_size,
            num_workers = self._config.data_loader.num_workers
        )                
    
        self.__model, model_id = self._get_model()
        self.__model_name = self._abbr + "--" + model_id
        self.__model_dir = os.path.join(self._config.system.proj_dir, self._config.trainer.model_dir)
        self.__loss_fn = weighted_cross_entropy_loss
        self.__metric_fn = AccuracyEstimator(topk=(1, )) # ToDo: Fix! (trainer.py expects a dictionary w/ 'top1' key)
        self.__optimizer = get_optimizer(self.__model, optimizer, self._config.optimizer)
        self.__lr_scheduler = get_lr_scheduler(self.__optimizer, lr_scheduler, self._config.scheduler)

    @property
    def test_loader(self) -> DataLoader:
        return self.__test_loader

    @property
    def train_loader(self):
        return self.__train_loader
    
    @property
    def valid_loader(self):
        return self.__valid_loader

    @property
    def device(self) -> torch.device:
        return torch.device(self._config.trainer.device)

    @property
    def trained_model_path(self):
        return os.path.join(self.__model_dir, self.__model_name + ".pt")
    
    @property
    def trained_model(self) -> nn.Module:
        self.__load_model()
        return self.__model

    def train(self):
        device = self.device
        self.__model = self.__model.to(device)
        self.__loss_fn = self.__loss_fn.to(device)

        visualizer = self._open_visualizer()
        model_trainer = Trainer(
            model=self.__model,
            loader_train=self.__train_loader,
            loader_test=self.__valid_loader,
            loss_fn=self.__loss_fn,
            metric_fn=self.__metric_fn,
            optimizer=self.__optimizer,
            lr_scheduler=self.__lr_scheduler,
            model_save_dir=self.__model_dir,
            model_name=self.__model_name,
            model_saving_period=0,
            stop_loss_epochs=self._config.trainer.stop_loss_epochs,
            stop_acc_ema_alpha=self._config.trainer.stop_acc_ema_alpha,
            stop_acc_epochs=self._config.trainer.stop_acc_epochs,
            stop_acc_threshold=self._config.trainer.stop_acc_threshold,
            device=device,
            data_getter=itemgetter(0),
            target_getter=itemgetter(1),
            stage_progress=self._config.trainer.progress_bar,
            visualizer=visualizer,
            get_key_metric=itemgetter("top1")
        )
        model_trainer.register_hook("end_epoch", hooks.end_epoch_hook_classification)
        metrics = model_trainer.fit(self._config.trainer.training_epochs)
        self._close_visualizer()

        return metrics
    
    def log_graph(self):
        model = self.trained_model
        images, _ = next(iter(self.valid_loader))
        device = self.device

        visualizer = self._open_visualizer()
        visualizer.add_graph(model.to(device), images.to(device))
        self._close_visualizer()
        
    
    def log_pr_curves(self):
        targets, pred_probs = get_targets_and_pred_probs(
            self.trained_model, 
            self.valid_loader,
            self.device
        )

        visualizer = self._open_visualizer()
        visualizer.add_pr_curves(self._classes, targets, pred_probs)
        self._close_visualizer()
    
    
    """
    Protected methods that may or must be overridden by derived classes.
    """
    
    @property
    def _visualizer_name(self) -> str:
        return self.__model_name
            
    @abstractmethod
    def _get_model(self) -> Tuple[nn.Module, str]:
        pass
    
    """
    Private methods that should only be called by this base class.
    """
    
    def __load_model(self):
        path = self.trained_model_path
        if os.path.exists(path):
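            # map_location lets a checkpoint saved on a GPU load on a CPU-only machine
            self.__model.load_state_dict(torch.load(path, map_location=self.device))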

### <font style="color:blue">Experiment #01: Training Pipeline Check and Data Visualization Experiments</font>

This set of experiments logs images of each food to the visualizer and retrains the fc classification layer of the pretrained ResNet18 model on a subset of the data to validate the training pipeline. Normally, one would disable data augmentation for such a check, but I am going to test that too.

In [None]:
class Exp01A(VisualExperiment):
    def __init__(self):
        super().__init__()

In [None]:
class Exp01B(ModelExperiment):
    def __init__(self):
        super().__init__(
            use_data_subsets = True,
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNet18(pretrained=True, tuning_level=0), "ResNet18-PT_T-FTL_0"

### <font style="color:blue">Experiment #02 - Training the Classifier of Pretrained Models</font>

This set of experiments retrains the classifier of the following pretrained models.
<ul>
    <li>ResNet-152</li>
    <li>VGG-19 with batch normalization.</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-101-32x8d</li>
    <li>Wide ResNet-101-2</li>
</ul>

For efficiency, training will stop after 100 epochs or when the smoothed accuracy does not increase by 2% over 10 epochs. Accuracy is smoothed via an exponential moving average with an alpha of 0.3.
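
The stopping rule described above can be sketched as follows. This is a minimal illustration, not the project's `Trainer` code, which is not shown here. It assumes that `alpha` weights the newest accuracy value, that accuracy is measured in percent, and that the gain is compared against the smoothed value from 10 epochs earlier. The function name `ema_early_stop_epoch` is hypothetical.

```python
from typing import Iterable, List, Optional


def ema_early_stop_epoch(
    accuracies: Iterable[float],
    alpha: float = 0.3,
    patience: int = 10,
    threshold: float = 2.0,
) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None."""
    smoothed: Optional[float] = None
    history: List[float] = []
    for epoch, acc in enumerate(accuracies):
        # Exponential moving average: the newest value gets weight alpha.
        smoothed = acc if smoothed is None else alpha * acc + (1 - alpha) * smoothed
        history.append(smoothed)
        # Stop when the smoothed gain over the last `patience` epochs is too small.
        if epoch >= patience and history[epoch] - history[epoch - patience] < threshold:
            return epoch
    return None
```

Under these assumptions, a run whose validation accuracy stays flat stops as soon as 10 epochs of history are available, while a run that keeps gaining about one point per epoch is never stopped early.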

<u>Results</u>: TBD.

In [None]:
class Exp02(ModelExperiment):
    def __init__(self):
        super().__init__(
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )

In [None]:
class Exp02A(Exp02):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNet152(pretrained=True, tuning_level=0), "ResNet152-PT_T-FTL_0"

In [None]:
class Exp02B(Exp02):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return VGG19BN(pretrained=True, tuning_level=0), "VGG19BN-PT_T-FTL_0"

In [None]:
class Exp02C(Exp02):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return DenseNet161(pretrained=True, tuning_level=0), "DenseNet161-PT_T-FTL_0"

In [None]:
class Exp02D(Exp02):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNeXt101(pretrained=True, tuning_level=0), "ResNeXt101-PT_T-FTL_0"

In [None]:
class Exp02E(Exp02):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return WideResNet101(pretrained=True, tuning_level=0), "WideResNet101-PT_T-FTL_0"

### <font style="color:blue">Experiment #03 - Training the Classifier and Last Convolution Block of Pretrained Models</font>

This set of experiments retrains the classifier and last convolution block of the following pretrained models.
<ul>
    <li>VGG-19 with batch normalization.</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-101-32x8d</li>
</ul>

For efficiency, training will stop after 100 epochs or when the smoothed accuracy does not increase by 2% over 10 epochs. Accuracy is smoothed via an exponential moving average with an alpha of 0.3. The number of data loader workers is increased from 4 to 8.
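
The experiment classes select how much of a network to retrain through the `tuning_level` argument of the model wrappers, which are defined elsewhere. The sketch below shows one way a tuning level could map to trainable blocks, assuming level 0 unfreezes only the classifier and each further level unfreezes one more block counting back from it. The function `trainable_blocks` and the block names are hypothetical.

```python
from typing import List, Sequence


def trainable_blocks(blocks: Sequence[str], tuning_level: int) -> List[str]:
    """Return the names of the blocks to unfreeze for a given tuning level.

    `blocks` is ordered from input to output and ends with the classifier.
    """
    if tuning_level < 0:
        raise ValueError("tuning_level must be non-negative")
    # Level 0 is the classifier alone; never select more blocks than exist.
    count = min(tuning_level + 1, len(blocks))
    return list(blocks[-count:])


# Hypothetical block names for a ResNet-style network.
RESNET_BLOCKS = ["stem", "layer1", "layer2", "layer3", "layer4", "fc"]

level_1 = trainable_blocks(RESNET_BLOCKS, 1)  # classifier and last block
```

In a PyTorch model, the parameters of every block outside the returned list would have `requires_grad` set to `False` so that the optimizer leaves them unchanged.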

<u>Results</u>: TBD.

In [None]:
class Exp03(ModelExperiment):
    def __init__(self):
        super().__init__(
            data_loader_num_workers = 8,
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )

In [None]:
class Exp03B(Exp03):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return VGG19BN(pretrained=True, tuning_level=1), "VGG19BN-PT_T-FTL_1"

In [None]:
class Exp03C(Exp03):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return DenseNet161(pretrained=True, tuning_level=1), "DenseNet161-PT_T-FTL_1"

In [None]:
class Exp03D(Exp03):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNeXt101(pretrained=True, tuning_level=1), "ResNeXt101-PT_T-FTL_1"

### <font style="color:blue">Experiment #04 - Training the Classifier and Last 2 Convolution Blocks of Pretrained Models</font>

This set of experiments retrains the classifier and last two convolution blocks of the following pretrained models.
<ul>
    <li>VGG-19 with batch normalization.</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-101-32x8d</li>
</ul>

For efficiency, training will stop after 100 epochs or when the smoothed accuracy does not increase by 2% over 10 epochs. Accuracy is smoothed via an exponential moving average with an alpha of 0.3. The number of data loader workers is increased from 4 to 8.

<u>Results</u>: TBD.

In [None]:
class Exp04(ModelExperiment):
    def __init__(self):
        super().__init__(
            data_loader_num_workers = 8,
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )

In [None]:
class Exp04B(Exp04):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return VGG19BN(pretrained=True, tuning_level=2), "VGG19BN-PT_T-FTL_2"

In [None]:
class Exp04C(Exp04):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return DenseNet161(pretrained=True, tuning_level=2), "DenseNet161-PT_T-FTL_2"

In [None]:
class Exp04D(Exp04):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNeXt101(pretrained=True, tuning_level=2), "ResNeXt101-PT_T-FTL_2"

### <font style="color:blue">Experiment #05 - Training the Classifier and Last 3 Convolution Blocks of Pretrained Models</font>

This set of experiments retrains the classifier and last three convolution blocks of the following pretrained models.
<ul>
    <li>VGG-19 with batch normalization.</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-101-32x8d</li>
</ul>

For efficiency, training will stop after 100 epochs or when the smoothed accuracy does not increase by 2% over 10 epochs. Accuracy is smoothed via an exponential moving average with an alpha of 0.3. The number of data loader workers is increased from 4 to 8.

<u>Results</u>: TBD.

In [None]:
class Exp05(ModelExperiment):
    def __init__(self):
        super().__init__(
            data_loader_num_workers = 8,
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )

In [None]:
class Exp05B(Exp05):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return VGG19BN(pretrained=True, tuning_level=3), "VGG19BN-PT_T-FTL_3"

In [None]:
class Exp05C(Exp05):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return DenseNet161(pretrained=True, tuning_level=3), "DenseNet161-PT_T-FTL_3"

In [None]:
class Exp05D(Exp05):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNeXt101(pretrained=True, tuning_level=3), "ResNeXt101-PT_T-FTL_3"

### <font style="color:blue">Experiment #06 - Training the Classifier and Last 4 Convolution Blocks of Pretrained Models</font>

This set of experiments retrains the classifier and last four convolution blocks of the following pretrained models.
<ul>
    <li>VGG-19 with batch normalization.</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-101-32x8d</li>
</ul>

For efficiency, training will stop after 100 epochs or when the smoothed accuracy does not increase by 2% over 10 epochs. Accuracy is smoothed via an exponential moving average with an alpha of 0.3. The number of data loader workers is increased from 4 to 8.

<u>Results</u>: TBD.

In [None]:
class Exp06(ModelExperiment):
    def __init__(self):
        super().__init__(
            data_loader_num_workers = 8,
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )

In [None]:
class Exp06B(Exp06):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return VGG19BN(pretrained=True, tuning_level=4), "VGG19BN-PT_T-FTL_4"

In [None]:
class Exp06C(Exp06):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return DenseNet161(pretrained=True, tuning_level=4), "DenseNet161-PT_T-FTL_4"

In [None]:
class Exp06D(Exp06):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNeXt101(pretrained=True, tuning_level=4), "ResNeXt101-PT_T-FTL_4"

### <font style="color:blue">Experiment #07 - Training Pretrained Models</font>

This set of experiments retrains the following pretrained models.
<ul>
    <li>VGG-19 with batch normalization.</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-101-32x8d</li>
</ul>

For efficiency, training will stop after 100 epochs or when the smoothed accuracy does not increase by 2% over 10 epochs. Accuracy is smoothed via an exponential moving average with an alpha of 0.3. The number of data loader workers is increased from 4 to 8.

<u>Results</u>: TBD.

In [None]:
class Exp07(ModelExperiment):
    def __init__(self):
        super().__init__(
            data_loader_num_workers = 8,
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )

In [None]:
class Exp07B(Exp07):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return VGG19BN(pretrained=True, tuning_level=5), "VGG19BN-PT_T-FTL_5"

In [None]:
class Exp07C(Exp07):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return DenseNet161(pretrained=True, tuning_level=5), "DenseNet161-PT_T-FTL_5"

In [None]:
class Exp07D(Exp07):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNeXt101(pretrained=True, tuning_level=5), "ResNeXt101-PT_T-FTL_5"

### <font style="color:blue">Experiment #08 - Training Untrained Models From Scratch</font>

This set of experiments trains the following untrained models.
<ul>
    <li>VGG-19 with batch normalization.</li>
    <li>DenseNet-161</li>
    <li>ResNeXt-101-32x8d</li>
</ul>

For efficiency, training will stop after 100 epochs or when the smoothed accuracy does not increase by 2% over 10 epochs. Accuracy is smoothed via an exponential moving average with an alpha of 0.3. The number of data loader workers is increased from 4 to 8.

<u>Results</u>: TBD.

In [None]:
class Exp08(ModelExperiment):
    def __init__(self):
        super().__init__(
            data_loader_num_workers = 8,
            trainer_training_epochs = 100, 
            trainer_stop_acc_epochs = 10,
            trainer_stop_acc_ema_alpha = 0.3,
            trainer_stop_acc_threshold = 2.0
        )

In [None]:
class Exp08B(Exp08):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return VGG19BN(pretrained=False, tuning_level=0), "VGG19BN-PT_F"

In [None]:
class Exp08C(Exp08):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return DenseNet161(pretrained=False, tuning_level=0), "DenseNet161-PT_F"

In [None]:
class Exp08D(Exp08):
    def _get_model(self) -> Tuple[nn.Module, str]:
        return ResNeXt101(pretrained=False, tuning_level=0), "ResNeXt101-PT_F"

### <font style="color:blue">Main Function</font>

A simple function that creates an experiment, trains its model, and logs the model's resulting PR curves and graph.

In [None]:
def vizdata(exp: VisualExperiment):
    """
    This function visualizes the dataset by logging sample images of each food type to the experiment's visualizer.
    """

    exp.log_sample_images()

In [None]:
def conduct(exp: ModelExperiment):
    """
    This function conducts an experiment by performing the following steps and returns its training metrics.
    1. Logs the model's graph.
    2. Trains the model.
    3. Logs the model's precision-recall curve for each food class.
    
    The graph is logged before training. The precision-recall curves are logged from the model state with
    the lowest average loss on the validation set.
    """
    
    exp.log_graph()
    metrics = exp.train()
    exp.log_pr_curves()
    return metrics
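
The precision-recall curves logged by `conduct` come from `add_pr_curves`, whose implementation is not shown in this section. As a rough illustration of what one point on such a curve means, the sketch below computes one-vs-rest precision and recall for a single class at a single probability threshold. The function name `precision_recall` and the convention of reporting a precision of 1.0 when nothing is predicted positive are assumptions.

```python
from typing import Sequence, Tuple


def precision_recall(
    targets: Sequence[int],
    pred_probs: Sequence[Sequence[float]],
    class_index: int,
    threshold: float,
) -> Tuple[float, float]:
    """One-vs-rest precision and recall for one class at one threshold."""
    tp = fp = fn = 0
    for target, probs in zip(targets, pred_probs):
        predicted = probs[class_index] >= threshold
        actual = target == class_index
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
    # Assumed conventions for empty denominators.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Sweeping `threshold` from 0 to 1 and plotting the resulting pairs gives the curve for one food class. Repeating this for each of the 13 classes gives one curve per class.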

In [None]:
def experiment_group_0():
    """
    Visualize the 13 food types of the KenyanFood13 dataset. Images are
    resized to 256 x 256 pixels preserving their aspect ratios and then
    center-cropped to 224 x 224 pixels. Check the training pipeline with
    a simple model on a subset of the data.
    """
    vizdata(Exp01A())
    conduct(Exp01B())

    """
    Retrain the classifier layer of the following pretrained models.
        - ResNet-152
        - VGG-19 with batch normalization.
        - DenseNet-161
        - ResNeXt-101-32x8d
        - Wide ResNet-101-2
    """
    conduct(Exp02A())
    conduct(Exp02B())
    conduct(Exp02C())
    conduct(Exp02D())
    conduct(Exp02E())


    """
    Retrain the classifier and last convolution block of the following
    pretrained models.
        - VGG-19 with batch normalization.
        - DenseNet-161
        - ResNeXt-101-32x8d
    """
    conduct(Exp03B())
    conduct(Exp03C())
    conduct(Exp03D())
    
    """
    Retrain the classifier and last two convolution blocks of the following
    pretrained models.
        - VGG-19 with batch normalization.
        - DenseNet-161
        - ResNeXt-101-32x8d
    """
    conduct(Exp04B())
    conduct(Exp04C())
    conduct(Exp04D())

    """
    Retrain the classifier and last three convolution blocks of the following
    pretrained models.
        - VGG-19 with batch normalization.
        - DenseNet-161
        - ResNeXt-101-32x8d
    """
    conduct(Exp05B())
    conduct(Exp05C())
    conduct(Exp05D())
    
    """
    Retrain the classifier and last four convolution blocks of the following
    pretrained models.
        - VGG-19 with batch normalization.
        - DenseNet-161
        - ResNeXt-101-32x8d
    """
    conduct(Exp06B())
    conduct(Exp06C())
    conduct(Exp06D())
    
    """
    Retrain the following pretrained models.
        - VGG-19 with batch normalization.
        - DenseNet-161
        - ResNeXt-101-32x8d
    """
    conduct(Exp07B())
    conduct(Exp07C())
    conduct(Exp07D())

    """
    Train the following untrained models from scratch.
        - VGG-19 with batch normalization.
        - DenseNet-161
        - ResNeXt-101-32x8d
    """
    conduct(Exp08B())
    conduct(Exp08C())
    conduct(Exp08D())

In [None]:
def experiment_group_1():
    return

In [None]:
def main():
    
    for group in [1]:
        
        if group == 0:
            experiment_group_0()
        elif group == 1:
            experiment_group_1()
    
    return

In [None]:
if __name__ == '__main__':
    main()

## <font style="color:green">8. TensorBoard Dev Scalars Log Link [5 Points]</font>

Share your TensorBoard scalars log link in this section. You can also share (not mandatory) your GitHub link if you have pushed this project to GitHub. 

For example, [Find Project2 logs here](https://tensorboard.dev/experiment/kMJ4YU0wSNG0IkjrluQ5Dg/#scalars).

## <font style="color:green">9. Kaggle Profile Link [50 Points]</font>

Share your Kaggle profile link here with us so that we can give points for the competition score. 

You should have a minimum accuracy of `75%` on the test data to get all points. If accuracy is less than `70%`, you will not get any points for the section. 

**You must submit `submission.csv` (predictions for the images in `test.csv`) in the `Submit Predictions` tab on Kaggle to receive any evaluation in this section.**