# Metrics in EdnaML and EdnaDeploy

Here, we examine EdnaML's metrics infrastructure. Most metrics are essentially wrappers around their Torchmetrics counterparts.

# Setup Steps (If EdnaML is not already installed)
 
We can install EdnaML either from source or from PyPI. Select the appropriate option in the first cell below.

**Very Important**. Due to the way Colab installs certain packages, you will need to restart the runtime after installing EdnaML. Then you can proceed with future steps.

In [None]:
!nvidia-smi

In [None]:
install_from = "source" # source | pypi
branch = "suprem-devel-mongo-metrics"           # DO NOT CHANGE THIS unless you know what you are doing
version = "0.1.5"           # DO NOT CHANGE THIS unless you know what you are doing

## Installation steps

In [None]:
if install_from == "source":
  ! rm -rf -- EdnaML ||:
  ! git clone -b $branch https://github.com/asuprem/EdnaML
  ! pip install -e EdnaML/
else:
  ! python -V
  ! pip3 install --pre ednaml==$version

## Restart Runtime

In [None]:
try:
  import ednaml
except (ImportError, KeyError, ModuleNotFoundError):
  print('Stopping RUNTIME. Colaboratory will restart automatically.')
  exit()

# Metrics Basics

We will run a few experiments using the basic [MNIST configuration](./mnist.yml). See [cnn.ipynb](../0-basics/cnn/cnn.ipynb) for more details on MNIST and EdnaML.

We have added a few things here for Metrics on top of the Storage and experimentation components. Specifically, we have added the `METRICS` section with a single metric:

```
METRICS:                      # Defining metrics to be tracked. This needs to be made top-level
  MODEL_METRICS:
    - METRIC_NAME: avgacc
      METRIC_CLASS: BaseTorchMetric
      METRIC_ARGS:
        metric_name: Accuracy
        aggregate: 100
        metric_kwargs:
          task: 'multiclass'
          num_classes: 10
      METRIC_PARAMS: 
        preds: logits
        targets: labels  # key is the argument name the metric expects; value is the name the module publishes (e.g. HFTrainer could publish labels, not targets, hence targets: labels)
      METRIC_TRIGGER: step
      METRIC_STORAGE: null
```

Here, we have set up a metric for the model (more details about specific metrics can be found at []). You can add metrics for each component to track different KPIs, e.g. metrics for the Logger, Configuration manager, Deployment, and Code. For example, you may want to record the code quality of provided custom code through some backend service as a metric. You may also want to record the number of error, warning, or debug logs per X training steps.

In this case, we have added an accuracy metric. Since we want to use the excellent Torchmetrics package, we use our internal wrapper `BaseTorchMetric` and provide its arguments: `metric_name` is the specific Torchmetrics class we wish to use, and `metric_kwargs` are the arguments passed to that class.
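
For intuition, the sketch below shows in plain Python what a multiclass accuracy computes from `logits` and `labels`: take the argmax of each row of logits and compare it with the label. The function name `multiclass_accuracy` and the toy inputs are illustrative assumptions. This is not the Torchmetrics or `BaseTorchMetric` implementation, which operates on tensors and supports further options.

```python
def multiclass_accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    # argmax over each row of class scores
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == t) for p, t in zip(preds, labels))
    return correct / len(labels)

# Toy batch: 4 samples, 3 classes (hypothetical values)
logits = [
    [2.0, 0.1, -1.0],   # argmax 0
    [0.2, 1.5, 0.3],    # argmax 1
    [0.0, 0.4, 3.0],    # argmax 2
    [1.0, 0.9, 0.8],    # argmax 0
]
labels = [0, 1, 2, 1]

accuracy = multiclass_accuracy(logits, labels)
print(accuracy)  # 0.75 (3 of 4 predictions match)
```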

Then, we have `METRIC_PARAMS`. Each module publishes a set of parameters internally that metrics have access to. For example, `BaseTrainer` publishes `loss` at each step. `ClassificationTrainer`, which we use for MNIST, publishes `loss`, as well as `logits`, `labels`, `features` and the current `epoch` and `step`, among others. For accuracy, we require `logits` and `labels`. However, `torchmetrics.Accuracy` takes in `preds` and `targets` as input. The mapping in `METRIC_PARAMS` bridges the two: each key is an argument the metric expects, and each value is the published parameter that supplies it.
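
The mapping can be pictured as a dictionary lookup. The following is a minimal sketch, not EdnaML's actual code: `resolve_metric_params` and the contents of `published` are hypothetical names and values chosen for illustration.

```python
def resolve_metric_params(metric_params, published):
    """Build a metric's keyword arguments from values a module has published.

    metric_params: {argument name the metric expects: published parameter name}
    published:     {published parameter name: current value}
    """
    missing = [src for src in metric_params.values() if src not in published]
    if missing:
        raise KeyError(f"module does not publish: {missing}")
    return {arg: published[src] for arg, src in metric_params.items()}

# Hypothetical snapshot of what a trainer might publish at one step
published = {
    "loss": 0.31,
    "logits": [[0.1, 0.9], [0.8, 0.2]],
    "labels": [1, 0],
    "epoch": 0,
    "step": 12,
}

# Same mapping as METRIC_PARAMS in the configuration above
metric_params = {"preds": "logits", "targets": "labels"}

metric_inputs = resolve_metric_params(metric_params, published)
print(sorted(metric_inputs))  # ['preds', 'targets']
```

Only the parameters named in the mapping reach the metric; everything else the module publishes (here `loss`, `epoch`, `step`) is ignored.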

Finally, we have `METRIC_TRIGGER`, which can be one of `[once | always | step | batch]`. These mean, respectively, that the metric is triggered just once at the beginning, whenever some parameter changes inside a module, at the end of each step, or at the end of a batch of steps (the batch size is determined by `LOGGING.STEP_VERBOSE`). We want to compute accuracy at each step. However, we also want to aggregate the accuracy across 100 steps before saving, so we set the `aggregate` parameter in `METRIC_ARGS` to 100.
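
One plausible reading of the `step` trigger combined with `aggregate` is sketched below: the metric is updated at every step, and a single averaged value is saved once `aggregate` steps have accumulated. The class `StepAggregator` and the choice of a plain mean are assumptions made for illustration; EdnaML's own aggregation logic may differ.

```python
class StepAggregator:
    """Collect one value per step; save the mean every `aggregate` steps."""

    def __init__(self, aggregate):
        self.aggregate = aggregate
        self.buffer = []   # values since the last save
        self.saved = []    # (step, aggregated value) pairs

    def update(self, step, value):
        self.buffer.append(value)
        if len(self.buffer) == self.aggregate:
            mean = sum(self.buffer) / len(self.buffer)
            self.saved.append((step, mean))
            self.buffer = []  # start a fresh window

# Small window so the behavior is easy to follow (the config above uses 100)
agg = StepAggregator(aggregate=4)
per_step_accuracy = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
for step, value in enumerate(per_step_accuracy, start=1):
    agg.update(step, value)

print(agg.saved)  # [(4, 0.75), (8, 0.25)]
```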



## 0. Setting up the MNIST model

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
# Here we define our custom model class
from ednaml.models import ModelAbstract
from torch import nn
import ednaml.core.decorators as edna

class MNISTModel(ModelAbstract):
  def model_attributes_setup(self, **kwargs):
    pass
  def model_setup(self, **kwargs):
    self.conv1 = nn.Sequential(         
        nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=2), 
        nn.ReLU(), 
        nn.MaxPool2d(kernel_size=2),    
    )
    self.conv2 = nn.Sequential(         
        nn.Conv2d(16, 32, 5, 1, 2), nn.ReLU(), nn.MaxPool2d(2),                
    )
    # fully connected layer, output 10 classes
    self.out = nn.Linear(32 * 7 * 7, 10)
    
  def forward_impl(self, x):
    x = self.conv1(x)
    x = self.conv2(x)
    # flatten the output of conv2 to (batch_size, 32 * 7 * 7)
    x = x.view(x.size(0), -1)       
    output = self.out(x)
    # A ModelAbstract returns prediction, features, and secondary output (empty list)
    return output, x, []    

In [None]:
import torch, ednaml
from ednaml.core import EdnaML
torch.__version__

## 1. Basic MNIST Experiment

We first run our basic MNIST experiment, with the default options.

In [None]:
EdnaML.clear_registrations()
cfg = "./EdnaML/usage-docs/sample-configs/3-metrics/mnist.yml"
eml = EdnaML(config=cfg, config_inject = [
    ("MODEL.MODEL_BASE", "simple"),  
    ("SAVE.MODEL_BACKBONE", "simple"),  
    ("EXECUTION.SKIPEVAL", True),  
    ("TRANSFORMATION.BATCH_SIZE", 64),   # We will also increase the batch size
    ("LOGGING.INPUT_SIZE", [64,1,28,28]),   # We will also fix the input size
])
eml.cfg.MODEL.MODEL_KWARGS = {}       # We delete the old MODEL_KWARGS, because our new model needs no arguments
eml.addModelClass(MNISTModel)

In [None]:
# These are the default options.
eml.apply(  storage_manager_mode = "strict",
            storage_mode = "local",
            backup_mode = "hybrid",
            tracking_run = 0,
            new_run = False,
            skip_storage = False
          )

In [None]:
eml.train()

In [None]:
eml.eval()

## What to note

In the file view (if you are on Colab), you should see a directory called `mnist_resnet-v1-simple-mnist` that contains a directory `0`. Inside this, there should be several PyTorch files and log files.

Here, the name of the experiment is `mnist_resnet-v1-simple-mnist`, inherited from the `MODEL_CORE_NAME`, `MODEL_VERSION`, `MODEL_BACKBONE`, and `MODEL_QUALIFIER` in the `SAVE` section of the [configuration, linked here](mnist.yml#L20).
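
The naming pattern can be sketched as follows. The helper `experiment_name` is hypothetical, and the field values are inferred from the directory name above rather than read from the configuration file, so treat them as assumptions.

```python
def experiment_name(core_name, version, backbone, qualifier):
    """Join the SAVE fields in the order they appear in the directory name."""
    return f"{core_name}-v{version}-{backbone}-{qualifier}"

# Values inferred from `mnist_resnet-v1-simple-mnist` (assumed, not read from mnist.yml)
save_cfg = {
    "MODEL_CORE_NAME": "mnist_resnet",
    "MODEL_VERSION": 1,
    "MODEL_BACKBONE": "simple",
    "MODEL_QUALIFIER": "mnist",
}

name = experiment_name(
    save_cfg["MODEL_CORE_NAME"],
    save_cfg["MODEL_VERSION"],
    save_cfg["MODEL_BACKBONE"],
    save_cfg["MODEL_QUALIFIER"],
)
print(name)  # mnist_resnet-v1-simple-mnist
```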

`0` is the run for this experiment. There should be a `metrics.json` with the computed metrics.

## 2. Backing up metrics (in canonical mode)

We will let `storage_mode="empty"` instead of `local`. Here, no local files will be created. Note: we have incremented the version to `2` in `config_inject`, so if `storage_mode` was local, the expected directory would be `mnist_resnet-v2-simple-mnist`. However, it will not be created.

In [None]:
EdnaML.clear_registrations()
cfg = "./EdnaML/usage-docs/sample-configs/3-metrics/mnist.yml"
storage = "./EdnaML/usage-docs/sample-configs/3-metrics/mnist_simple_storage.yml"
eml = EdnaML(config=[cfg,storage], config_inject = [
    ("MODEL.MODEL_BASE", "simple"),  
    ("SAVE.MODEL_BACKBONE", "simple"),  
    ("EXECUTION.SKIPEVAL", True),  
    ("TRANSFORMATION.BATCH_SIZE", 64),   # We will also increase the batch size
    ("LOGGING.INPUT_SIZE", [32,1,28,28]),   # We will also fix the input size
])
eml.cfg.MODEL.MODEL_KWARGS = {}       # We delete the old MODEL_KWARGS, because our new model needs no arguments
eml.addModelClass(MNISTModel)

In [None]:
# These are the default options.
eml.apply(  storage_manager_mode = "strict",
            storage_mode = "local",
            backup_mode = "hybrid",
            tracking_run =1,
            new_run = False,
            skip_storage = False
          )

In [None]:
eml.train()

In [None]:
eml.eval()

## 3. Logging ad-hoc metrics

Occasionally, we wish to log metrics independent of the built-in metrics classes. Here, we show such an example: we will create a custom trainer that logs the L2 norm of the model parameters. While there is already a metric for this [Link??](), this is simply an example.

In the custom trainer below, we calculate the average of the L2 norm over the past 25 steps and save it into the metrics.

In [None]:
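from ednaml.trainer import ClassificationTrainer
import torch

class CustomTrainer(ClassificationTrainer):
    def beginning_of_training_hook(self):
        # Per-step L2 norms collected since the last logged value
        self.l2check = []

    def end_of_step_metrics(self):
        # Sum of per-parameter L2 norms. Each tensor is flattened by .norm(2);
        # torch.linalg.norm(p, 2) rejects the 4-D conv weights and would give
        # the spectral norm for 2-D weights. detach() and item() keep plain
        # floats in the list instead of tensors attached to the autograd graph.
        l2_norm = sum(p.detach().norm(2).item() for p in self.model.parameters())
        self.l2check.append(l2_norm)
        if len(self.l2check) >= 25:
            final_l2 = sum(self.l2check) / len(self.l2check)
            self.log_metric(metric_name = "modell2", metric_val = final_l2)
            # Reset so the next value averages only the following 25 steps
            self.l2check = []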

In [None]:
EdnaML.clear_registrations()
cfg = "./EdnaML/usage-docs/sample-configs/3-metrics/mnist.yml"
storage = "./EdnaML/usage-docs/sample-configs/3-metrics/mnist_simple_storage.yml"
eml = EdnaML(config=[cfg,storage], config_inject = [
    ("MODEL.MODEL_BASE", "simple"),  
    ("SAVE.MODEL_BACKBONE", "simple"),  
    ("EXECUTION.SKIPEVAL", True),  
    ("TRANSFORMATION.BATCH_SIZE", 64),   # We will also increase the batch size
    ("LOGGING.INPUT_SIZE", [32,1,28,28]),   # We will also fix the input size
])
eml.cfg.MODEL.MODEL_KWARGS = {}       # We delete the old MODEL_KWARGS, because our new model needs no arguments
eml.addModelClass(MNISTModel)
eml.addTrainerClass(CustomTrainer)

In [None]:
# These are the default options.
eml.apply(  storage_manager_mode = "strict",
            storage_mode = "local",
            backup_mode = "hybrid",
            tracking_run =2,
            new_run = False,
            skip_storage = False
          )

In [None]:
eml.train()

In [None]:
eml.eval()