## ENOT baseline optimizer

This notebook describes how to use the ENOT baseline optimizer.

### Main chapters of this notebook:
1. Set up the environment
1. Prepare dataset and create dataloaders
1. Train model using ENOT baseline optimizer
1. Check the benefit of the ENOT optimizer

The ENOT optimizer wraps any PyTorch optimizer and can improve the metrics of the trained model.
It has been tested on many computer vision classification tasks; other kinds of tasks have not been tested.
Before trying our optimizer, we recommend tuning your training hyperparameters (optimizer, scheduler, learning rate, etc.). Keep in mind that training with our optimizer is twice as slow as with the original PyTorch optimizer.

## Set up the environment
First, let's set up the environment and make some common imports.

In [None]:
import os

os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
# You may need to uncomment this line and change the value to the index of a free GPU
# os.environ['CUDA_VISIBLE_DEVICES'] = '0'

In [None]:
import sys

sys.path.append('../')

from pathlib import Path

import torch
import torch.nn as nn

from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch_optimizer import RAdam
from torchvision.models.mobilenet import mobilenet_v2

from enot.optimize import GTBaselineOptimizer

from tutorial_utils.train import accuracy
from tutorial_utils.train import WarmupScheduler

from tutorial_utils.dataset import create_imagenette_dataloaders
from tutorial_utils.phases import tutorial_train_loop

### In the following cell we set up all necessary directories

* `HOME_DIR` - experiments home directory
* `DATASETS_DIR` - root directory for datasets (imagenette2, ...)
* `PROJECT_DIR` - project directory to save training logs, checkpoints, ...

In [None]:
HOME_DIR = Path.home() / '.optimization_experiments'
DATASETS_DIR = HOME_DIR / 'datasets'
PROJECT_DIR = HOME_DIR / 'search_space_autogeneration'

HOME_DIR.mkdir(exist_ok=True)
DATASETS_DIR.mkdir(exist_ok=True)
PROJECT_DIR.mkdir(exist_ok=True)
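
Note that `mkdir(exist_ok=True)` creates only the last path component, so the cell above depends on `HOME_DIR` being created first. A minimal sketch of an order-independent alternative using `parents=True` follows; the helper name `ensure_dirs` and the temporary root directory are illustrative assumptions and are not part of the tutorial utilities.

```python
import tempfile
from pathlib import Path


def ensure_dirs(*dirs: Path) -> None:
    """Create every directory, including missing parents; no-op if it exists."""
    for directory in dirs:
        directory.mkdir(parents=True, exist_ok=True)


# A temporary root stands in for Path.home() so the sketch has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    home_dir = Path(tmp) / '.optimization_experiments'
    datasets_dir = home_dir / 'datasets'
    project_dir = home_dir / 'search_space_autogeneration'

    # The order does not matter here, and a repeated call is safe.
    ensure_dirs(project_dir, datasets_dir, home_dir)
    ensure_dirs(project_dir, datasets_dir, home_dir)

    created = sorted(p.name for p in home_dir.iterdir())
    print(created)
```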

## Prepare dataset and create dataloaders

In [None]:
dataloaders = create_imagenette_dataloaders(
    dataset_root_dir=DATASETS_DIR,
    project_dir=PROJECT_DIR,
    input_size=(224, 224),
    batch_size=32,
)

## Train model with ENOT Baseline optimizer

**IMPORTANT:**<br>
We set `N_EPOCHS = 3` in this example to make the tutorial run faster. This is not enough to train a good model; set `N_EPOCHS >= 100` if you want to achieve good results.

To use our optimizer, wrap the model's training step in a closure (the closure should clear the gradients, compute the loss, run the backward pass, and return the loss) and pass the closure to the `enot_optimizer.step(...)` method as an argument. This is necessary because `GTBaselineOptimizer` does more than one step per batch.
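
To illustrate the general reason a closure should clear gradients itself, here is a framework-free sketch. `ToyParam` and `TwoPassOptimizer` are hypothetical stand-ins that only mimic two PyTorch conventions: gradients accumulate across backward passes, and `step(closure)` may call the closure several times. This is not ENOT's actual algorithm.

```python
class ToyParam:
    """Scalar parameter whose gradient accumulates, as in PyTorch."""

    def __init__(self, value: float) -> None:
        self.value = value
        self.grad = 0.0


class TwoPassOptimizer:
    """Hypothetical optimizer that evaluates the closure twice per step."""

    def __init__(self, param: ToyParam, lr: float) -> None:
        self.param = param
        self.lr = lr

    def zero_grad(self) -> None:
        self.param.grad = 0.0

    def step(self, closure) -> float:
        closure()         # first evaluation
        loss = closure()  # second evaluation; its gradient drives the update
        self.param.value -= self.lr * self.param.grad
        return loss


def run_one_step(clear_in_closure: bool) -> float:
    """Run one step on loss = (w - 3)^2 from w = 0; return the gradient used."""
    w = ToyParam(0.0)
    opt = TwoPassOptimizer(w, lr=0.1)
    opt.zero_grad()  # clearing once per batch, outside the closure

    def closure() -> float:
        if clear_in_closure:
            opt.zero_grad()
        loss = (w.value - 3.0) ** 2
        w.grad += 2.0 * (w.value - 3.0)  # "backward": accumulate d(loss)/dw
        return loss

    opt.step(closure)
    return w.grad


grad_cleared = run_one_step(clear_in_closure=True)
grad_stale = run_one_step(clear_in_closure=False)
print(grad_cleared, grad_stale)
```

When the closure clears the gradient, the update uses the gradient of a single pass (`-6.0`). When it does not, the two passes add up (`-12.0`) and the step is twice as large as intended.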

In [None]:
model = mobilenet_v2(weights=None)

classifier = model.classifier[1]
model.classifier = nn.Linear(
    in_features=classifier.in_features,
    out_features=10,
    bias=True,
)

In [None]:
N_EPOCHS = 3
N_WARMUP_EPOCHS = 1

len_train = len(dataloaders['tune_train_dataloader'])

optimizer = SGD(params=model.parameters(), lr=0.06, momentum=0.9, weight_decay=1e-4)
enot_optimizer = GTBaselineOptimizer(model=model, optimizer=optimizer)
# The scheduler is attached to the wrapped PyTorch optimizer
# (you can also pass enot_optimizer._optimizer here).
scheduler = CosineAnnealingLR(optimizer, T_max=len_train * N_EPOCHS, eta_min=1e-8)
scheduler = WarmupScheduler(scheduler, warmup_steps=len_train * N_WARMUP_EPOCHS)
loss_function = nn.CrossEntropyLoss().cuda()

train_loader = dataloaders['tune_train_dataloader']
validation_loader = dataloaders['tune_validation_dataloader']

model.cuda()
for epoch in range(N_EPOCHS):
    print(f'EPOCH #{epoch}')

    model.train()
    train_metrics_accumulator = {
        'loss': 0.0,
        'accuracy': 0.0,
        'n': 0,
    }
    for inputs, labels in train_loader:
        def closure():
            # Clear gradients inside the closure: the optimizer may call it
            # more than once per batch.
            enot_optimizer.zero_grad()
            pred_labels = model(inputs)
            batch_loss = loss_function(pred_labels, labels)
            batch_loss.backward()
            batch_metric = accuracy(pred_labels, labels)

            train_metrics_accumulator['loss'] += batch_loss.item()
            train_metrics_accumulator['accuracy'] += batch_metric.item()
            train_metrics_accumulator['n'] += 1
            return batch_loss

        enot_optimizer.step(closure)
        if scheduler is not None:
            scheduler.step()

    train_loss = train_metrics_accumulator['loss'] / train_metrics_accumulator['n']
    train_accuracy = train_metrics_accumulator['accuracy'] / train_metrics_accumulator['n']

    print('train metrics:')
    print('  loss:', train_loss)
    print('  accuracy:', train_accuracy)

    model.eval()
    validation_loss = 0
    validation_accuracy = 0
    with torch.no_grad():
        for inputs, labels in validation_loader:
            pred_labels = model(inputs)
            batch_loss = loss_function(pred_labels, labels)
            batch_metric = accuracy(pred_labels, labels)

            validation_loss += batch_loss.item()
            validation_accuracy += batch_metric.item()

    n = len(validation_loader)
    validation_loss /= n
    validation_accuracy /= n

    print('validation metrics:')
    print('  loss:', validation_loss)
    print('  accuracy:', validation_accuracy)

    print()
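
The learning-rate schedule configured in the cell above combines cosine annealing with a warmup. The sketch below computes such a schedule in plain Python to show its shape. The cosine part follows the closed form documented for `CosineAnnealingLR`. The linear warmup factor is an assumption, because the behavior of `WarmupScheduler` from `tutorial_utils` is not shown in this notebook, and `steps_per_epoch = 100` is a made-up value.

```python
import math


def cosine_lr(step: int, base_lr: float, t_max: int, eta_min: float) -> float:
    """Closed-form cosine annealing from base_lr (step 0) to eta_min (step t_max)."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max)) / 2


def warmup_cosine_lr(step: int, base_lr: float, t_max: int,
                     eta_min: float, warmup_steps: int) -> float:
    """Cosine schedule scaled by a linear warmup factor (assumed warmup shape)."""
    scale = min(1.0, (step + 1) / warmup_steps) if warmup_steps > 0 else 1.0
    return scale * cosine_lr(step, base_lr, t_max, eta_min)


steps_per_epoch = 100  # made-up value; the notebook uses len(train_loader)
n_epochs, n_warmup_epochs = 3, 1
t_max = steps_per_epoch * n_epochs
warmup_steps = steps_per_epoch * n_warmup_epochs

# Print the learning rate at a few points of the schedule.
for step in (0, 50, 99, 150, 299):
    lr = warmup_cosine_lr(step, base_lr=0.06, t_max=t_max,
                          eta_min=1e-8, warmup_steps=warmup_steps)
    print(f'step {step:3d}: lr = {lr:.6f}')
```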