# Model Training using Ablator

* This chapter covers training a model with Ablator on the popular **Fashion-MNIST** dataset.

#### Running Experiments using Ablator

An experiment is a complete pipeline: loading configurations, training the model, and producing metrics. To run one with Ablator, we define the configurations and a model wrapper, then pass both to a trainer class, which starts the experiment.

#### Setting up Ablator

Install ablator using the command: ````pip install ablator````

Import the **Configs**, **ModelWrapper** and **ProtoTrainer** from ablator.

In [1]:
%%capture

from ablator import ModelConfig, OptimizerConfig, TrainConfig, RunConfig
from ablator import ModelWrapper, ProtoTrainer

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset

from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

import os
import shutil
from typing import Callable, Dict

#### Configurations

Each configuration has its own role and serves a specific purpose in the overall process.


Defining Configs:

- **Optimizer Config**: adam (lr = 0.001).
- **Train Config**: batch_size = 32, epochs = 10, random weight initialization enabled.
- **Run Config**: device details, directory path and a random seed for experiment.


In [2]:
optimizer_config = OptimizerConfig(
    name="adam", 
    arguments={"lr": 0.001}
)

train_config = TrainConfig(
    dataset="Fashion-mnist",
    batch_size=32,
    epochs=10,
    optimizer_config=optimizer_config,
    scheduler_config=None,
    rand_weights_init = True
)

model_config = ModelConfig()

# A fixed random seed makes the sequence of random numbers, and so the run, repeatable.
run_config = RunConfig(
    train_config=train_config,
    model_config=model_config,
    metrics_n_batches = 800,
    experiment_dir = "/tmp/dir",
    device="cpu",
    amp=False,
    random_seed = 42
)
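
To illustrate what a fixed random seed provides, here is a minimal sketch that uses only Python's standard `random` module. It does not show how Ablator applies `random_seed` internally, and the helper name `sample_sequence` is hypothetical.

```python
import random


def sample_sequence(seed: int, n: int = 5) -> list:
    """Return n pseudo-random integers from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, leaves global state untouched
    return [rng.randint(0, 99) for _ in range(n)]


# The same seed yields the same sequence on every call.
first = sample_sequence(42)
second = sample_sequence(42)
print(first == second)
```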

#### About the dataset

**Fashion-MNIST** is a dataset of 70,000 grayscale images of fashion items: 60,000 for training and 10,000 for testing. Each image belongs to one of ten classes covering clothing, footwear, and bags.

Image dimensions: 28 pixels x 28 pixels (grayscale)
Shape of the training data tensor: [60000, 1, 28, 28]


#### Creating custom dataloaders
- In PyTorch, a DataLoader is a utility class that provides an iterable over a dataset.
- It is commonly used to handle data loading and batching in machine learning and deep learning tasks.
- Later, we will pass the dataloaders to the model wrapper, which uses them internally for training and validation.

In [3]:
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.ToTensor()

train_dataset = torchvision.datasets.FashionMNIST(
    root='./data',
    train=True,
    download=True,
    transform=transform
)

test_dataset = torchvision.datasets.FashionMNIST(
    root='./data',
    train=False,
    download=True,
    transform=transform
)


train_dataloader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=32,
    shuffle=True
)

test_dataloader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=32,
    shuffle=False
)
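
To see what a DataLoader yields without downloading anything, the following sketch batches a synthetic stand-in for the dataset. The random tensors and the sample count of 100 are assumptions chosen only for illustration; the image shape and batch size match the ones used above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for Fashion-MNIST: 100 random "images" with labels 0-9.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=32, shuffle=False)

# Collect the shape of the image tensor in every batch.
batch_shapes = [tuple(x.shape) for x, _ in loader]
for shape in batch_shapes:
    print(shape)
```

With 100 samples and a batch size of 32, the loader yields three full batches and a final, smaller batch of 4, because `drop_last` defaults to `False`.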


#### Creating PyTorch Model

Model Architecture (Simple Neural Network with Linear Layers):

Linear_1_(28*28, 256) -> ReLU -> Linear_2_(256, 256) -> ReLU -> Linear_3_(256, 10). (where, ReLU is an Activation function) 
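
As a quick sanity check on the size of this architecture, the sketch below counts its trainable parameters by hand. It assumes each linear layer has a bias term, which is the default for `nn.Linear`; the helper `linear_params` is hypothetical.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# (inputs, outputs) of the three linear layers described above.
layer_sizes = [(28 * 28, 256), (256, 256), (256, 10)]

per_layer = [linear_params(i, o) for i, o in layer_sizes]
total = sum(per_layer)
print(per_layer, total)  # [200960, 65792, 2570] 269322
```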

````MyModel```` defines a model class that wraps the existing ````FashionMNISTModel````, adds a loss function, performs the forward computation, and returns the predicted labels together with the loss during training and evaluation.

In [4]:
# Define the model
class FashionMNISTModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(FashionMNISTModel, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(hidden_size, num_classes)
    
    def forward(self, x):
        x = x.view(x.size(0), -1)  
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.fc2(x)
        x = self.relu2(x)
        x = self.fc3(x)
        return x

input_size = 28 * 28  
hidden_size = 256
num_classes = 10

model = FashionMNISTModel(input_size, hidden_size, num_classes)

# Adding loss to the model.
class MyModel(nn.Module):
    def __init__(self, config: ModelConfig) -> None:
        super().__init__()
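        # Build a fresh network per instance so that runs do not share weights.
        self.model = FashionMNISTModel(input_size, hidden_size, num_classes)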
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x, labels=None):
        out = self.model(x)
        loss = None

        if labels is not None:
            loss = self.loss(out, labels)

        out = out.argmax(dim=-1)

        return {"y_pred": out, "y_true": labels}, loss
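
The forward contract used by `MyModel` (a dictionary of predictions and labels, plus an optional loss) can be exercised on dummy data. The sketch below uses a hypothetical single-layer stand-in, `TinyClassifier`, so that it runs without Ablator; it follows the same return convention but is not the model defined above.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in that follows the same forward contract as MyModel."""

    def __init__(self, num_features: int = 28 * 28, num_classes: int = 10) -> None:
        super().__init__()
        self.net = nn.Linear(num_features, num_classes)
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x, labels=None):
        logits = self.net(x.view(x.size(0), -1))  # flatten [N, 1, 28, 28] to [N, 784]
        loss = self.loss(logits, labels) if labels is not None else None
        return {"y_pred": logits.argmax(dim=-1), "y_true": labels}, loss


x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 10, (4,))
clf = TinyClassifier()

outputs, loss = clf(x, y)       # labels given: loss is a scalar tensor
outputs_only, no_loss = clf(x)  # no labels: loss is None
print(outputs["y_pred"].shape, loss.dim(), no_loss)
```

Returning class indices rather than raw logits in `y_pred` is what lets the evaluation functions compare predictions with labels directly.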

#### Defining Custom Evaluation Metrics

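We define evaluation functions for the classification problem. Because the task is multiclass, precision, recall, and F1 use ````average='weighted'````, which averages the per-class scores weighted by the number of true samples in each class.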

In [5]:
def my_accuracy(y_true, y_pred):
    return accuracy_score(y_true.flatten(), y_pred.flatten())

def my_precision(y_true, y_pred):
    return precision_score(y_true.flatten(), y_pred.flatten(), average='weighted')

def my_recall(y_true, y_pred):
    return recall_score(y_true.flatten(), y_pred.flatten(), average='weighted')

def my_f1_score(y_true, y_pred):
    return f1_score(y_true.flatten(), y_pred.flatten(), average='weighted')
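
To make the "weighted" average concrete, here is a pure-Python sketch of weighted precision and recall on a toy example. The helper names and the toy labels are hypothetical, and the functions illustrate the averaging rule rather than replace scikit-learn.

```python
from collections import Counter


def weighted_precision(y_true, y_pred):
    """Per-class precision, averaged with weights equal to each class's support."""
    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = 0.0
    for cls, n_true in support.items():
        # Assumption of this sketch: a class that is never predicted scores 0.
        precision = correct[cls] / predicted[cls] if predicted[cls] else 0.0
        total += n_true * precision
    return total / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(n * (correct[cls] / n) for cls, n in support.items()) / len(y_true)


y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(weighted_precision(y_true, y_pred))  # 8/9
print(weighted_recall(y_true, y_pred), accuracy)  # both 5/6
```

In weighted recall the support weights cancel the per-class denominators, so the result always equals accuracy. This is why the recall and accuracy values reported later in this chapter coincide.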

#### Model Wrapper

- This class serves as a comprehensive wrapper for PyTorch models, providing a high-level interface for the tasks involved in model training.

- It takes care of importing parameters from configuration files into the model, setting up optimizers, schedulers, and checkpoints, logging metrics, handling interruptions, creating and using data loaders, evaluating the model, and more.

- By encapsulating this functionality, it reduces development effort and the amount of boilerplate code we need to write.

Creating ModelWrapper and passing dataloaders and evaluation functions.

In [6]:

class MyModelWrapper(ModelWrapper):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def make_dataloader_train(self, run_config: RunConfig):
        return train_dataloader

    def make_dataloader_val(self, run_config: RunConfig):
        return test_dataloader

    def evaluation_functions(self) -> Dict[str, Callable]:
        return {
            "accuracy": my_accuracy,
            "precision": my_precision,
            "recall": my_recall,
            "f1_score": my_f1_score
            }

#### ProtoTrainer

- This class starts training the model held by the model wrapper and prepares resources for it, to avoid stalling during training and conflicts with other trainers.

- It provides logging and syncing facilities to the provided directory or to external remote servers such as Google Cloud. It also runs evaluation and syncs metrics to those directories.

- To do this, it requires a model wrapper and a run config as inputs.

First, we wrap the model (**MyModel**) in a ModelWrapper (**MyModelWrapper**).
Then, we create an instance of ProtoTrainer, passing the **run_config** and **wrapper** as arguments, and call ````launch()```` to start the experiment.
````launch()```` returns a metrics object of class ````TrainMetrics````, which is used to calculate metrics with the custom evaluation functions.

In [7]:
%%capture

# Start from a clean experiment directory: remove any leftovers from a previous run.
if os.path.exists(run_config.experiment_dir):
    shutil.rmtree(run_config.experiment_dir)

wrapper = MyModelWrapper(
    model_class=MyModel,
)

ablator = ProtoTrainer(
    wrapper=wrapper,
    run_config=run_config,
)
metrics = ablator.launch()

#### Interpreting Results

TrainMetrics stores and manages predictions and calculates metrics using the given evaluation functions. <br>
We can get all the metrics from TrainMetrics using the ````to_dict()```` method.

These include the metrics from the custom evaluation functions that we passed to the model wrapper.

In [8]:
metrics_dict = metrics.to_dict()
max_key_length = max(len(str(k)) for k in metrics_dict.keys())

for k, v in metrics_dict.items():
    print(f"{k:{max_key_length}} : {v}")

train_loss        : 5.5833635228650795
val_loss          : 13.444317814478225
train_accuracy    : 0.827375
train_f1_score    : 0.827292216086392
train_precision   : 0.8272152550200749
train_recall      : 0.827375
val_accuracy      : 0.80342
val_f1_score      : 0.8019798713403301
val_precision     : 0.8130216190983056
val_recall        : 0.80342
best_iteration    : 16875
best_loss         : 14.802360911396715
current_epoch     : 10
current_iteration : 18750
epochs            : 10
learning_rate     : 0.001
total_steps       : 18750
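
The iteration counts in this output can be cross-checked against the configuration. The sketch below assumes that one iteration corresponds to one training batch, which is consistent with the reported `total_steps`.

```python
train_samples = 60_000  # Fashion-MNIST training split
batch_size = 32         # from TrainConfig
epochs = 10             # from TrainConfig

steps_per_epoch = train_samples // batch_size  # 60000 divides evenly by 32
total_steps = steps_per_epoch * epochs

best_iteration = 16_875  # value reported above
best_epoch = best_iteration / steps_per_epoch

print(steps_per_epoch, total_steps, best_epoch)  # 1875 18750 9.0
```

Under that assumption, the best iteration falls exactly at the end of the ninth epoch.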


#### Additional Info

Why train with ProtoTrainer?

- It provides a robust way to handle errors during training.
- Ideal for prototyping experiments in a local environment.
- Easily adaptable for hyperparameter optimization with larger configurations and horizontal scaling.
- Quick transition to "ParallelConfig" and "ParallelTrainer" for parallel execution of trials using Ray.

How to visualize metrics

- We can also visualize the metrics for every epoch in TensorBoard.
- Install ````tensorboard````, then load it with ````%load_ext tensorboard```` if using a notebook.
- Run the command ````%tensorboard --logdir /tmp/dir/<Experiment_dir_name>/dashboard/tensorboard --port [port]````