
# Using Hyper-Runners

Hyper-runners sit on top of normal runners and control them. While runners produce experiments, hyper-runners produce hyper-experiments. An example of a hyper-experiment is *hyperparameter tuning*. If you have a training algorithm with a tunable hyperparameter $\lambda$, you might assign it multiple values $\lambda_1, \lambda_2, \ldots, \lambda_k$ and run the experiment once for each value. The value that gives the best results on the validation set is the output of the hyperparameter tuning. In this example, a single training run is the experiment, and its result is the accuracy on the validation set. The hyperparameter tuning is the hyper-experiment, which searches over these experiments.
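As a minimal sketch of this idea, independent of the library, the loop below treats one training run as a black-box function that returns a validation score and keeps the best value of $\lambda$. The function `validation_score`, the helper `tune`, and the candidate values are hypothetical placeholders and are not part of ML-Mnemonist.

```python
# Hypothetical stand-in for one experiment: "train" with a given lambda and
# return a validation score (higher is better). A real experiment would train
# a model here; this toy score simply peaks at lambda = 0.01.
def validation_score(lam: float) -> float:
    return -abs(lam - 0.01)


def tune(candidates):
    """The hyper-experiment: run the experiment once per candidate value."""
    results = {lam: validation_score(lam) for lam in candidates}
    best = max(results, key=results.get)
    return best, results


best_lambda, all_scores = tune([0.1, 0.01, 0.001])
print(best_lambda)
```

The hyper-experiment never looks inside the experiment; it only compares the scores that come back.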

In this tutorial, we will use the same runner from [Tutorial 1](./Tutorial-1.ipynb) and tune some of the properties of our MLP solver. Normal runners are configured with YAML files. Hyper-experiments are also configured with a YAML file, but it is referred to as a configuration palette. A configuration palette can contain additional configuration nodes whose keys start with `MLM_BRANCH`. Each of these branches holds one alternative, so a single palette expands into multiple normal configuration trees. To illustrate, consider the following configuration palette:

```
DATASET:
  TRAIN_NAME: 'california_housing_train.csv'
  TEST_NAME: 'california_housing_test.csv'

SOLVER:
  LR:
    MLM_BRANCH_1: 0.001
    MLM_BRANCH_2: 0.0001
  OPTIMIZER_TYPE: 'adam'
  DEVICE: 'cpu'
  METHOD: 'mlp-method'

MODEL:
  HYPER_PARAMETERS:
    IN_FEATURES: 8
    H1:
      MLM_BRANCH_1: 100
      MLM_BRANCH_2: 200
    H2: 10
```
As you can see, some nodes contain `MLM_BRANCH_i` children. This means, for example, that `LR` can take either the value under `MLM_BRANCH_1`, which is $0.001$, or the value under `MLM_BRANCH_2`, which is $0.0001$. Independently, `H1` can take either 100 (`MLM_BRANCH_1`) or 200 (`MLM_BRANCH_2`). All in all, this palette produces four normal configurations: `LR = 0.001, H1 = 100`; `LR = 0.001, H1 = 200`; `LR = 0.0001, H1 = 100`; and `LR = 0.0001, H1 = 200`.
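The count of four follows from pairing every `LR` option with every `H1` option. A short sketch with `itertools.product` makes this explicit; the variable names are illustrative only, and the values are copied from the palette above.

```python
from itertools import product

# Branch values copied from the palette above
lr_options = [0.001, 0.0001]
h1_options = [100, 200]

# Every choice of LR is paired with every choice of H1
combinations = list(product(lr_options, h1_options))
for lr, h1 in combinations:
    print(f"LR = {lr}, H1 = {h1}")
```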

A more complex example, with nested branches, is given below:
```
A:
  MLM_BRANCH_1:
    B:
      MLM_BRANCH_1: 10
      MLM_BRANCH_2: 100
  MLM_BRANCH_2:
    C:
      MLM_BRANCH_1:
        D: 5
        E: 10
      MLM_BRANCH_2:
        F: 10
```
The end result will be the following 4 configurations:
```
A:
  B: 10
```
```
A:
  B: 100
```
```
A:
  C:
    D: 5
    E: 10
```
```
A:
  C:
    F: 10
```
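The expansion rule can be sketched in plain Python as a recursion over nested dictionaries. This is only an illustration of the semantics described above, not the implementation used by the library; the helper name `expand` and the key-prefix check are assumptions made for the example.

```python
from itertools import product

BRANCH_PREFIX = "MLM_BRANCH"


def expand(node):
    """Return a list with every branch-free version of `node`."""
    if not isinstance(node, dict):
        return [node]  # a leaf value has exactly one version
    keys = list(node.keys())
    if keys and all(str(k).startswith(BRANCH_PREFIX) for k in keys):
        # Branch node: each child is an alternative, so collect all of them
        alternatives = []
        for k in keys:
            alternatives.extend(expand(node[k]))
        return alternatives
    # Regular mapping: combine every version of each child with the others
    child_versions = [expand(node[k]) for k in keys]
    return [dict(zip(keys, combo)) for combo in product(*child_versions)]


palette = {
    "A": {
        "MLM_BRANCH_1": {"B": {"MLM_BRANCH_1": 10, "MLM_BRANCH_2": 100}},
        "MLM_BRANCH_2": {
            "C": {
                "MLM_BRANCH_1": {"D": 5, "E": 10},
                "MLM_BRANCH_2": {"F": 10},
            }
        },
    }
}

configs = expand(palette)
for cfg in configs:
    print(cfg)
```

Sibling branch nodes multiply the number of configurations, while nested alternatives add to it, which is why this palette yields four configurations rather than a full Cartesian product of all branch nodes.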

In [1]:
%load_ext autoreload
import sys
sys.path.append('..')

## Expanding configuration palettes

The configuration palette is available at `conf-test-branches.yaml`. We can use the `expand_cfg` function to write all the resulting configurations to a new directory. It takes a default configuration to start from, the path to the configuration palette (`cfg_dir`), and a save directory (`save_directory`):


In [None]:
%autoreload 2
import os
from mlmnemonist.validation_tools import expand_cfg
from testing.config.config import get_cfg_defaults
expand_cfg(get_cfg_defaults(),
           cfg_dir='conf-test-branches.yaml',
           save_directory=os.path.join('config', 'all-branches'))


After running the above cell, you can check out the YAML files produced in the corresponding directory; these files represent all the possible configurations for that palette.
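One quick way to inspect the output is to list the files in the save directory. The sketch below assumes the generated files carry a `.yaml` or `.yml` extension, which this tutorial does not state explicitly; `list_generated_configs` is a hypothetical helper, not a library function.

```python
import glob
import os


def list_generated_configs(directory):
    """Return the sorted paths of all .yaml/.yml files directly inside `directory`."""
    paths = []
    for pattern in ("*.yaml", "*.yml"):
        paths.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(paths)


# Same save directory as in the cell above
for path in list_generated_configs(os.path.join("config", "all-branches")):
    print(path)
```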

## Working with a hyper-runner

Next, we load the same runner as in the previous tutorial. It does not need to use the configuration it used there.

In [3]:
%autoreload 2
from mlp_functions import *
import mlmnemonist as mlm
from mlmnemonist import FACTORY
from testing.config.config import get_cfg_defaults


In [None]:

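try:
  # Reuse the runner cached under this token if it already exists
  runner = FACTORY.retrieve_experiment_runner('tut2')
except FileNotFoundError:
  # No cached runner yet: create one and register its pipelines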
  runner = FACTORY.create_experiment_runner(
    cache_token='tut2',
    description='A sample lightweight runner',
    cfg_base=get_cfg_defaults(),
    cfg_dir='conf-test.yaml',
    experiment_name='fullsearch-runner',
    verbose=2,
  )
  runner.preprocessing_pipeline.update_function(load_raw_data)
  runner.preprocessing_pipeline.update_function(process_data)
  runner.recurring_pipeline.update_function(setup_device)
  runner.recurring_pipeline.update_function(setup_model)
  runner.recurring_pipeline.update_function(setup_training)
  runner.implement_run(my_custom_run)

The runner's functions are implemented in [mlp_functions](./mlp_functions.py). Now we need to plug this runner into a hyper-runner. A hyper-runner can handle a set of runners; it treats each runner as a black box and uses its output to conduct the search. Like runners, hyper-runners are created through the provided factory object, using `FACTORY.create_hyper_experiment_runner`.

Note that hyper-runners are also cached and saved in the caching directory. If a session crashes, the search therefore does not need to start from the beginning and can continue from where it left off.
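To make the resume behaviour concrete, here is a minimal sketch of a search that stores each result on disk and skips finished work on the next start. It illustrates the general caching idea only and makes no claim about how ML-Mnemonist stores its cache; `resumable_search`, `run_experiment`, and the JSON cache file are all assumptions of the example.

```python
import json
import os


def resumable_search(configs, run_experiment, cache_path):
    """Run `run_experiment` on each config, skipping configs already cached."""
    results = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            results = json.load(f)

    for index, cfg in enumerate(configs):
        key = str(index)  # JSON object keys must be strings
        if key in results:
            continue  # finished before the previous session stopped
        results[key] = run_experiment(cfg)
        # Persist after every run so completed experiments survive an interruption
        with open(cache_path, "w") as f:
            json.dump(results, f)
    return results
```

Calling the function a second time with the same `cache_path` re-runs only the configurations that have no stored result yet.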

In [None]:
hyper_runner = FACTORY.create_hyper_experiment_runner(
  experiment_name='cross-run',
  cfg_base=get_cfg_defaults(),
  cfg_palette_path='conf-test-branches.yaml',
  experiment_runners=[runner],
  verbose=4,
  cache_token='tut2'
)
# Run the preprocessing pipeline
hyper_runner.preprocess()

In [None]:
# Run the full search for a while. Try interrupting and re-running this cell
hyper_runner.full_search()

## Retrieve Hyper-experiment

Next, we check whether the framework can revive the hyper-experiment after the session crashes. To do so, restart the kernel and run the following cell.

In [4]:
%load_ext autoreload
import sys
sys.path.append('..')
%autoreload 2
import os
from testing.config.config import get_cfg_defaults
from mlp_functions import *
import mlmnemonist as mlm
from mlmnemonist import FACTORY

hyper_runner = FACTORY.retrieve_hyper_experiment_runner('tut2')
hyper_runner.preprocess()

Running the following cell now continues the search from where it left off.

In [None]:
hyper_runner.full_search()