Copyright (c) Microsoft Corporation. All rights reserved. 

Licensed under the MIT License.

# Perform a tuning task with schedulers
## 0. Introduction
In this notebook, we show how to perform tuning tasks in `flaml.tune` with two different types of schedulers.
- Scheduler type 1: the authentic scheduler included in FLAML. It assumes that the performance of a given configuration improves as more resources are used.
- Scheduler type 2: schedulers that assume the maximum resource is known in advance and that the best performance is achieved at the maximum resource. Most existing schedulers are of this type, including ASHA, Hyperband, and BOHB.
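
To make the role of the maximum resource concrete, the following is a minimal sketch of the geometric ladder of resource levels at which successive-halving-style schedulers such as ASHA typically compare trials. The helper `rung_levels` is hypothetical and only illustrates the idea; it is not the implementation used by Ray Tune or FLAML. The values 100, 500000, and 2 match the `min_resource`, `max_resource`, and `reduction_factor` settings passed to `tune.run` in this notebook.

```python
def rung_levels(min_resource, max_resource, reduction_factor):
    """Return the levels min_resource * reduction_factor**k that do not
    exceed max_resource (illustrative helper, not a library function)."""
    levels = []
    level = min_resource
    while level <= max_resource:
        levels.append(level)
        level *= reduction_factor
    return levels


# Resource ladder implied by the settings used in this notebook.
print(rung_levels(100, 500000, 2))
```

With a ladder like this, a scheduler of the second type promotes only the better-performing trials from one level to the next, so most of the budget goes to configurations that still look promising.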



In [None]:
# requirements to run the examples in this notebook
!pip install flaml[ray];


## 1. Configure the objective functions according to the types of schedulers
We start by defining a vanilla objective, `simple_obj`, and then show how this objective function should be adapted to each type of scheduler.

Background about the tuning task:
The tuning task is an optimization problem in which we want to maximize the projection of a half 2-sphere onto a vector chosen from a particular 3-dimensional box $S$, whose three coordinates are named "x", "y", and "z".

Suppose you do not know how to solve the optimization problem directly, but know how to sample data points from the half 2-sphere uniformly. Then you can get an approximation of the projection value on any vector in the search space. A good approximation of the ground-truth optimization objective makes it possible for you to solve the original optimization problem. The evaluation in `simple_obj` performs such an approximation, which has the following two properties:
1. The quality of the approximation is resource-dependent: the more resources are used, the better the approximation.
2. The computational cost of the approximation increases monotonically with the amount of resource used.
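
The two properties above can be seen in a small experiment. The sketch below is a vectorized variant of the same Monte Carlo estimate that does not call `tune.report`. The helper name `estimate_projection`, the fixed seed, and the example configuration are assumptions made for illustration.

```python
import numpy as np


def estimate_projection(config, resource, rng):
    """Estimate the mean projection of `config` onto points sampled
    uniformly from the upper half of the unit 2-sphere.

    Returns the sample mean and the half-width of a 95% confidence interval.
    """
    # Normalized Gaussian vectors are uniform on the sphere.
    vec = rng.normal(0.0, 1.0, size=(resource, 3))
    vec /= np.linalg.norm(vec, axis=1, keepdims=True)
    # Fold the lower hemisphere onto the upper one.
    vec[:, 2] = np.abs(vec[:, 2])
    scores = vec @ np.array([config["x"], config["y"], config["z"]])
    half_width = 1.96 * scores.std() / np.sqrt(resource)
    return scores.mean(), half_width


rng = np.random.default_rng(0)
config = {"x": 5.0, "y": 0.0, "z": 10.0}  # example point in the search space
for resource in (100, 10_000, 1_000_000):
    mean, half_width = estimate_projection(config, resource, rng)
    print(f"resource={resource:>9}: estimate={mean:.3f} +/- {half_width:.3f}")
```

The half-width shrinks roughly in proportion to the inverse square root of the resource, while the number of sampled points, and hence the cost, grows linearly with it.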

In [17]:
import numpy as np
from flaml import tune
import time

def rand_vector_unit_sphere(dim):
    """Return a point drawn uniformly at random
    from the unit (dim-1)-sphere.
    """
    vec = np.random.normal(0, 1, dim)
    mag = np.linalg.norm(vec)
    # time.sleep(0.1)
    return vec / mag

def simple_obj(config, resource=500000):
    config_value_vector = np.array([config["x"], config["y"], config["z"]])
    score_sequence = []
    for i in range(resource):
        a = rand_vector_unit_sphere(3)
        a[2] = abs(a[2])
        point_projection = np.dot(config_value_vector, a)
        score_sequence.append(point_projection)
    score_avg = np.mean(np.array(score_sequence))
    score_std = np.std(np.array(score_sequence))
    score_lb = score_avg - 1.96 * score_std/np.sqrt(resource)
    tune.report(samplesize = resource, sphere_projection = score_lb)

In [18]:
def obj_w_intermediate_report(resource, config):
    """Evaluation function that reports an intermediate result every 5000 samples."""
    config_value_vector = np.array([config["x"], config["y"], config["z"]])
    score_sequence = []
    for i in range(resource):
        a = rand_vector_unit_sphere(3)
        a[2] = abs(a[2])
        point_projection = np.dot(config_value_vector, a)
        score_sequence.append(point_projection)
        if (i + 1) % 5000 == 0:
            score_avg = np.mean(np.array(score_sequence))
            score_std = np.std(np.array(score_sequence))
            score_lb = score_avg - 1.96 * score_std/np.sqrt(i+1)
            tune.report(samplesize=i + 1, sphere_projection=score_lb)

def obj_w_suggested_resource(resource_attr, config):
    """Evaluation function that uses the amount of resource
    suggested in config[resource_attr]."""
    resource = config[resource_attr]
    simple_obj(config, resource)
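
Both functions above take their extra argument first and `config` last. One reading of this design is that it lets `functools.partial` bind the leading argument and leave a callable that accepts only `config`, which is the shape an evaluation function passed to `tune.run` needs. The sketch below illustrates the binding with a hypothetical stand-in, `toy_objective`, which returns a dictionary instead of calling `tune.report`.

```python
from functools import partial


def toy_objective(resource, config):
    """Hypothetical stand-in with the same argument order as the functions above."""
    return {"samplesize": resource, "score": config["z"] / 2}


# Bind the first positional argument; the result takes only `config`.
bound = partial(toy_objective, 1000)
result = bound({"x": 5.0, "y": 0.0, "z": 10.0})
print(result)
```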

## 2. Specify the tuning task

In [19]:
#  define the search space
search_space = {
    "x": tune.uniform(5, 20),
    "y": tune.uniform(-10, 10),
    "z": tune.uniform(0, 10),
}

In [20]:
# specify the tuning task with different types of schedulers
from ray.tune.schedulers import TrialScheduler
def test_scheduler(scheduler=None):
    from functools import partial
    resource_attr = "samplesize"
    max_resource = 500000

    # specify the objective functions
    if scheduler is None:
        evaluation_obj = simple_obj
    elif scheduler=='flaml':
        evaluation_obj = partial(obj_w_suggested_resource, resource_attr)
    elif scheduler=='asha' or isinstance(scheduler, TrialScheduler):
        evaluation_obj = partial(obj_w_intermediate_report, max_resource)
    else:
        raise ValueError(f"Unsupported scheduler: {scheduler!r}")

    analysis = tune.run(
        evaluation_obj,
        config=search_space,
        metric="sphere_projection",
        mode="max",
        verbose=1,
        resource_attr=resource_attr,
        scheduler=scheduler,
        max_resource=max_resource,
        min_resource=100,
        reduction_factor=2,
        time_budget_s=10,
        num_samples=300,
    )

    print("Best hyperparameters found were: ", analysis.best_config)
    # print(analysis.get_best_trial)
    return analysis.best_config

## 3. Result evaluation
A short calculation shows that the ground-truth optimization objective is `config["z"]/2`, so every configuration with the largest possible value of the third coordinate, "z", is optimal. In the search space, the largest value `config["z"]` can take is 10, so the optimal objective value is 5. We now compare, for each scheduler, the absolute difference between 5 and the expected objective value `config["z"]/2` of the best configuration found.
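
The value `config["z"]/2` can be derived as follows. This is a sketch, writing $c = (x, y, z)$ for the configuration and $a = (a_1, a_2, a_3)$ for a point drawn uniformly from the upper half of the unit 2-sphere; this notation is introduced here for illustration. By symmetry, $\mathbb{E}[a_1] = \mathbb{E}[a_2] = 0$, and because the height of a uniform point on the sphere is uniformly distributed, $a_3$ is uniform on $[0, 1]$. Therefore

$$
\mathbb{E}[c \cdot a] = x\,\mathbb{E}[a_1] + y\,\mathbb{E}[a_2] + z\,\mathbb{E}[a_3] = 0 + 0 + z \int_0^1 t \, dt = \frac{z}{2}.
$$

The metric reported in the code is the lower end of a 95% confidence interval around the sample mean, so it lies slightly below this value and approaches it as the sample size grows.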

In [21]:
best_config = test_scheduler()
print('No scheduler, test error:', abs(10 / 2 - best_config['z'] / 2))

You passed a `space` parameter to OptunaSearch that contained unresolved search space definitions. OptunaSearch should however be instantiated with fully configured search spaces only. To use Ray Tune's automatic search space conversion, pass the space definition as part of the `config` argument to `tune.run()` instead.
[I 2021-12-02 20:32:49,660] A new study created in memory with name: optuna
[flaml.tune.tune: 12-02 20:32:49] {426} INFO - trial 1 config: {'x': 19.7297660391066, 'y': -3.614243256310104, 'z': 3.805185647291036}


In [14]:
# use auto ASHA scheduler
best_config = test_scheduler('asha')
print('Auto ASHA scheduler, test error:', abs(10 / 2 - best_config['z'] / 2))

You passed a `space` parameter to OptunaSearch that contained unresolved search space definitions. OptunaSearch should however be instantiated with fully configured search spaces only. To use Ray Tune's automatic search space conversion, pass the space definition as part of the `config` argument to `tune.run()` instead.
[I 2021-12-02 20:31:45,698] A new study created in memory with name: optuna
[flaml.tune.tune: 12-02 20:31:45] {426} INFO - trial 1 config: {'x': 10.183956095979546, 'y': 7.527165629309338, 'z': 2.7288223992202076}


Best hyperparameters found were:  {'x': 10.183956095979546, 'y': 7.527165629309338, 'z': 2.7288223992202076}
Auto ASHA scheduler, test error: 3.635588800389896


In [15]:
# use a custom ASHAScheduler with a smaller max resource
from ray.tune.schedulers import ASHAScheduler
my_scheduler = ASHAScheduler(
    time_attr="samplesize", max_t=1000, grace_period=50, reduction_factor=2
)
best_config = test_scheduler(scheduler=my_scheduler)
print('Custom ASHA scheduler, test error:', abs(10 / 2 - best_config['z'] / 2))

You passed a `space` parameter to OptunaSearch that contained unresolved search space definitions. OptunaSearch should however be instantiated with fully configured search spaces only. To use Ray Tune's automatic search space conversion, pass the space definition as part of the `config` argument to `tune.run()` instead.
[I 2021-12-02 20:32:16,402] A new study created in memory with name: optuna
[flaml.tune.tune: 12-02 20:32:16] {426} INFO - trial 1 config: {'x': 14.149445870299534, 'y': 8.894575254188114, 'z': 6.058817600386888}


Best hyperparameters found were:  {'x': 14.149445870299534, 'y': 8.894575254188114, 'z': 6.058817600386888}
Custom ASHA scheduler, test error: 1.970591199806556


In [16]:
# use FLAML scheduler
best_config = test_scheduler(scheduler='flaml')
print('FLAML scheduler, test error', abs(10 / 2 - best_config['z'] / 2))

You passed a `space` parameter to OptunaSearch that contained unresolved search space definitions. OptunaSearch should however be instantiated with fully configured search spaces only. To use Ray Tune's automatic search space conversion, pass the space definition as part of the `config` argument to `tune.run()` instead.
[I 2021-12-02 20:32:38,915] A new study created in memory with name: optuna
[flaml.tune.tune: 12-02 20:32:38] {426} INFO - trial 1 config: {'x': 16.77689527987048, 'y': 5.6172360812754185, 'z': 2.560399711357406, 'samplesize': 100}
[flaml.tune.tune: 12-02 20:32:38] {426} INFO - trial 2 config: {'x': 16.56980964900119, 'y': 0.207519493594015, 'z': 6.336482349262754, 'samplesize': 100}
[flaml.tune.tune: 12-02 20:32:38] {426} INFO - trial 3 config: {'x': 16.232058238079176, 'y': 4.9850701230259045, 'z': 2.2479664553084766, 'samplesize': 100}
[flaml.tune.tune: 12-02 20:32:38] {426} INFO - trial 4 config: {'x': 16.43870256912134, 'y': 0, 'z': 7.364799708001436, 

Best hyperparameters found were:  {'x': 5, 'y': 4.7913428305149255, 'z': 10, 'samplesize': 100}
FLAML scheduler, test error 0.0
