# Ray.io optimization framework

Ray.io is a framework developed to scale compute-intensive Python workloads. It relies on several dedicated components, most notably:

* Ray Core to scale general-purpose Python workflows
* Ray Train for scaling DL model training
* Ray Serve for scaling model inference (serving)
* Ray Datasets for scaling data loading and simple preprocessing
* Ray Tune for scaling hyperparameter tuning

Tune is mature, compatible with both DL frameworks PyTorch + Lightning and TensorFlow + Keras, as well as Scikit-Learn and XGBoost. It also integrates 

Building a deep learning estimator requires to gradually converge to a

Ray produces a lot of logs...

## Note on LR Scheduler

LR schedulers are outside Tune's scope; they are provided by vanilla PyTorch. Learning-rate scheduling modulates the LR during training, allowing DL developers to apply a cyclic LR, LR warmup, or LR decay to help escape local minima and hopefully converge to a global minimum.

These objects lie under the `torch.optim.lr_scheduler` package.
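
To make the idea concrete, here is a minimal sketch of a warmup followed by a step decay, written in plain Python. It does not use PyTorch's scheduler classes; the function name `lr_at` and all default values are assumptions chosen for illustration only.

```python
def lr_at(step, base_lr=0.1, warmup_steps=5, decay_every=10, gamma=0.5):
    """Learning rate at a given step: linear warmup, then step decay."""
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr
        return base_lr * (step + 1) / warmup_steps
    # After warmup, multiply the LR by gamma every `decay_every` steps
    n_decays = (step - warmup_steps) // decay_every
    return base_lr * gamma ** n_decays


schedule = [lr_at(step) for step in range(30)]

# Print a few points of the schedule: ramp-up, plateau, then two decays
for step in (0, 4, 14, 15, 25):
    print(f"step {step:2d}: lr = {schedule[step]:.4f}")
```

In a real training loop, the scheduler object would be stepped alongside the optimizer instead of calling a function like this one.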

## Understanding the logic behind an HPO framework

In [43]:
import numpy as np
import ray.tune as tune
import time
import torch
import torch.nn as nn

The overall logic is a simple 3-step process:

1. Sample parameters to make up an HP set, following a specific search algorithm
2. Build an execution stack with desired number of runs, then start at the top
3. Monitor the training on relevant metrics, stop unpromising trainings early, and move down the stack as resources are freed up
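
The three steps above can be sketched in a few lines of plain Python, without Ray. Everything here is a toy assumption: the helper names (`sample_config`, `train_step`, `toy_hpo`), the one-parameter search space, and the early-stopping rule (stop a run if, at iteration `check_at`, it scores worse than the best run seen so far at that same iteration). Lower scores are better.

```python
import random


def sample_config(rng):
    # Step 1: random search over a hypothetical one-parameter space
    return {"p": rng.randint(2, 6)}


def train_step(config, iteration):
    # Toy objective: the score shrinks as training progresses
    return 1 / (config["p"] + iteration - 1)


def toy_hpo(num_samples=5, max_iters=10, check_at=3, seed=0):
    rng = random.Random(seed)
    # Step 2: build the stack of pending runs
    stack = [sample_config(rng) for _ in range(num_samples)]
    best_at_check = float("inf")
    results = []
    while stack:
        config = stack.pop()
        stopped_early = False
        for iteration in range(1, max_iters + 1):
            score = train_step(config, iteration)
            # Step 3: monitor the metric and stop unpromising runs early
            if iteration == check_at:
                if score > best_at_check:
                    stopped_early = True
                    break
                best_at_check = score
        results.append({
            "config": config,
            "score": score,
            "iterations": iteration,
            "stopped_early": stopped_early,
        })
    return results


results = toy_hpo()
for r in results:
    print(r)
```

The first run can never be stopped, since there is nothing to compare it against yet. Real schedulers use more robust rules, but the control flow is the same.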

Let's first load all the necessary params

We will implement all of these components 

## Exploring components in detail

Ray.Tune relies on several components to achieve this:
* A search space: the HParams you wish to optimize, each with its range of values and a sampling method
* A callback to monitor and automatically report metrics progress during training
* A trial scheduler to kill unpromising HP sets
* A search algorithm used to explore the HP space
* A logger to push values to a possibly remote monitoring solution
* A runner to execute experiments with each set of HPs

## First fully functional example with no HPO

### **Search space**

Each HP has its own space. Ray comes standard with a range of parameter types:
* `tune.uniform`, `tune.quniform`, and `tune.qloguniform` to sample a float from a range of values, uniformly or log-uniformly (`log`), optionally quantized (`q`)
* `tune.randn` and `tune.qrandn` to sample a float from a normal distribution
* `tune.randint`, `tune.qrandint`, `tune.lograndint`, and `tune.qlograndint` to sample an integer from a range of values
* `tune.choice` to sample from a discrete list of values
* `tune.sample_from` for a custom-made sampling method
* `tune.grid_search` to go through every value of a list in turn
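
To get a feel for what the `log` and `q` variants mean, here is a plain-Python sketch built on the standard `random` module. These helpers (`sample_uniform`, `sample_loguniform`, `quantize`, `sample_randint`, `sample_choice`) are hypothetical stand-ins written for this note, not Ray's implementation, and they treat the upper bound of the integer range as exclusive.

```python
import math
import random

rng = random.Random(0)


def sample_uniform(lower, upper):
    # Every value in [lower, upper] is equally likely
    return rng.uniform(lower, upper)


def sample_loguniform(lower, upper):
    # Uniform in log space: every order of magnitude is equally likely
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


def quantize(value, q):
    # Snap a sampled value to the nearest multiple of q
    return round(value / q) * q


def sample_randint(lower, upper):
    # Integer in [lower, upper), upper bound excluded
    return rng.randrange(lower, upper)


def sample_choice(options):
    # Pick one item from a discrete list
    return rng.choice(options)


for _ in range(3):
    print({
        "dropout": quantize(sample_uniform(0.0, 0.5), 0.05),
        "lr": sample_loguniform(1e-4, 1e-1),
        "p": sample_randint(2, 7),
        "cumulate": sample_choice([False, True]),
    })
```

A log-uniform space is a common choice for learning rates, where `1e-4` and `1e-3` should be explored as often as `1e-2` and `1e-1`.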

Create a config `dict` for data and model HPs

In [23]:
config = {
    "model": {
        "cumulate": tune.choice([False, True]),
        "p": tune.randint(2, 7)
    }
}

In [25]:
cumulate = config["model"]["cumulate"]
p = config["model"]["p"]

for _ in range(10):
    print(f"sampled: cumulate = {cumulate.sample()}, p = {p.sample()}")

sampled: cumulate = True, p = 4
sampled: cumulate = True, p = 2
sampled: cumulate = False, p = 3
sampled: cumulate = False, p = 4
sampled: cumulate = False, p = 3
sampled: cumulate = False, p = 2
sampled: cumulate = False, p = 6
sampled: cumulate = True, p = 4
sampled: cumulate = True, p = 5
sampled: cumulate = True, p = 5


### **Runner**

The runner executes trials of either a functional `trainable` or a `tune.Trainable` subclass, one after another or concurrently, depending on the available resources.

In [44]:
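class Trainable(tune.Trainable):

    def setup(self, config):
        # Hyperparameters sampled by Tune for this trial
        self.cumulate = config["cumulate"]
        self.p = config["p"]
        # Running sum of the per-step scores
        self.cumulative = 0

    def step(self):
        # One training iteration: the per-step score shrinks as p grows
        score = 1 / self.p

        self.p += 1
        self.cumulative += score

        # Simulate some work
        time.sleep(.2)
        if self.cumulate:
            return {"score": self.cumulative}

        return {"score": score}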

In [45]:
params = config["model"]
params

{'cumulate': <ray.tune.sample.Categorical at 0x7f3c25b4aa50>,
 'p': <ray.tune.sample.Integer at 0x7f3c25b4a250>}

### **Callbacks**

A callback reports values to the runner, so the scheduler can make decisions.

In [46]:
class PrintCallback(tune.Callback):
    
    def on_trial_result(self, iteration, trials, trial, result, **info):
        print(f"Current score: {result['score']}")

The next run will execute forever, because it has no stopping condition, so you'll need to manually stop it. We'll add a reason to stop later.

In [47]:
tune.run(
    Trainable, 
    config=params, 
    verbose=0,
    mode="min",
    metric="score",
    callbacks=[
        PrintCallback()
    ])

Current score: 0.16666666666666666
Current score: 0.30952380952380953
Current score: 0.43452380952380953
Current score: 0.5456349206349207
Current score: 0.6456349206349207
Current score: 0.7365440115440116
Current score: 0.819877344877345
Current score: 0.896800421800422
Current score: 0.9682289932289934
Current score: 1.03489565989566
Current score: 1.09739565989566
Current score: 1.1562191893074247
Current score: 1.2117747448629803
Current score: 1.2644063238103487
Current score: 1.3144063238103487
Current score: 1.3620253714293964
Current score: 1.4074799168839418

2022-02-24 21:23:55,899	ERROR tune.py:632 -- Trials did not complete: [Trainable_10172_00000]


<ray.tune.analysis.experiment_analysis.ExperimentAnalysis at 0x7f3c25b29610>
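
The printed scores can be reproduced without Ray. The sketch below replays the arithmetic of `Trainable.step` in plain Python. The helper name `simulate` is hypothetical, and the values `p=6` and `cumulate=True` are inferred from the first printed scores (1/6 ≈ 0.1667, then 1/6 + 1/7 ≈ 0.3095), not read from the run itself.

```python
def simulate(p, cumulate, n_steps):
    """Replay the score sequence of the toy trainable, without Ray."""
    cumulative = 0.0
    scores = []
    for _ in range(n_steps):
        score = 1 / p
        p += 1
        cumulative += score
        # Report either the running sum or the per-step score
        scores.append(cumulative if cumulate else score)
    return scores


scores = simulate(p=6, cumulate=True, n_steps=3)
print(scores)
```

With `cumulate=True` the score keeps growing at every step, while with `cumulate=False` it keeps shrinking, which matters when the run is asked to minimize it.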

### **Scheduler**

Finally, the scheduler stops the execution of unpromising trials early, freeing up resources for the remaining ones.

In [53]:
asha_scheduler = tune.schedulers.ASHAScheduler(
    time_attr='training_iteration',
    max_t=100,
    grace_period=3,
    reduction_factor=3,
    brackets=1)
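
To see what these parameters imply, here is a plain-Python sketch of successive halving with a single bracket. It is a simplified reading of the technique, not Ray's exact implementation: the helper names `asha_rungs` and `keep_going` are hypothetical, and the promotion rule (continue only if the score ranks in the best `1 / reduction_factor` of the scores seen so far at that rung) is an assumption made for illustration.

```python
import math


def asha_rungs(grace_period, max_t, reduction_factor):
    """Iterations at which trials get compared against each other."""
    n_rungs = int(math.log(max_t / grace_period) / math.log(reduction_factor)) + 1
    return [grace_period * reduction_factor ** k for k in range(n_rungs)]


def keep_going(score, scores_at_rung, reduction_factor, mode="min"):
    """Simplified rule: continue only if `score` is among the best
    1 / reduction_factor of all scores recorded so far at this rung."""
    ranked = sorted(scores_at_rung + [score], reverse=(mode == "max"))
    n_keep = max(1, len(ranked) // reduction_factor)
    cutoff = ranked[n_keep - 1]
    return score <= cutoff if mode == "min" else score >= cutoff


rungs = asha_rungs(grace_period=3, max_t=100, reduction_factor=3)
print(rungs)

# Three trials already reported at the first rung; two new ones arrive
seen = [0.2, 0.1, 0.3]
print(keep_going(0.05, seen, reduction_factor=3))
print(keep_going(0.25, seen, reduction_factor=3))
```

Under this reading, no trial is stopped before `grace_period` iterations, and only a fraction of the trials reaching each rung is allowed to continue to the next one.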

In [54]:
analysis = tune.run(
    Trainable, 
    mode="min",
    metric="score",
    config=params, 
    num_samples=10, 
    verbose=1,
    scheduler=asha_scheduler)

2022-02-24 21:30:42,049	INFO tune.py:636 -- Total run time: 20.80 seconds (20.69 seconds for the tuning loop).


Ray.Tune saves all results, so you can access them later on.

In [55]:
print(f"best config: {analysis.best_config}")
print(f"best result: {analysis.best_result}")

best config: {'cumulate': False, 'p': 5}
best result: {'score': 0.009615384615384616, 'done': True, 'timesteps_total': None, 'episodes_total': None, 'training_iteration': 100, 'trial_id': 'f8366_00001', 'experiment_id': '7e374ab7198f43e9a874eb4c6a194709', 'date': '2022-02-24_21-30-41', 'timestamp': 1645738241, 'time_this_iter_s': 0.2002270221710205, 'time_total_s': 20.025014877319336, 'pid': 29881, 'hostname': 'alx', 'node_ip': '10.164.0.2', 'config': {'cumulate': False, 'p': 5}, 'time_since_restore': 20.025014877319336, 'timesteps_since_restore': 0, 'iterations_since_restore': 100, 'experiment_tag': '1_cumulate=False,p=5'}
