# Ray Tune Tutorial - 03: Search Algorithms and Schedulers - Exercise Solution

© 2019-2020, Anyscale. All Rights Reserved

![Anyscale Academy](../../images/AnyscaleAcademy_Logo_clearbanner_141x100.png)

Unlike the previous tutorials, the Tune tutorial uses separate notebooks for its solutions, because most of the exercises take a long time to run.

First, we set up everything we need from the lesson.

In [None]:
import ray
from ray import tune

In [None]:
!../../tools/start-ray.sh --check --verbose

In [None]:
ray.init(address='auto', ignore_reinit_error=True)

## Exercise - PopulationBasedTraining

In [None]:
from ray.tune.schedulers import PopulationBasedTraining

In [None]:
import sys
sys.path.append("..")
from mnist import ConvNet, TrainMNIST, EPOCH_SIZE, TEST_SIZE, DATA_ROOT

In [None]:
experiment_metrics = dict(metric="mean_accuracy", mode="max")

# No search algorithm is passed to tune.run() in this exercise, so this line stays commented out.
#search_algorithm = TuneBOHB(config_space, max_concurrent=4, **experiment_metrics)

In [None]:
pbt_scheduler = PopulationBasedTraining(
    time_attr='training_iteration',
    perturbation_interval=10,  # Every N time_attr units, "perturb" the parameters.
    hyperparam_mutations={
        "lr": [0.001, 0.01, 0.1],
        "momentum": [0.001, 0.01, 0.1, 0.9]
    },
    **experiment_metrics)
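
To make the "perturb" step less abstract, here is a minimal pure-Python sketch of one exploit-and-explore round over a population of trials. It is a simplified illustration under stated assumptions, not Ray's implementation: the helper names `explore` and `perturb_population` are hypothetical, the `0.25` defaults are assumed to mirror the scheduler's `quantile_fraction` and `resample_probability` defaults, and copying a score stands in for cloning a checkpoint.

```python
import random


def explore(config, mutations, rng, resample_probability=0.25):
    """Return a perturbed copy of `config` (simplified sketch, not Ray's code)."""
    new_config = dict(config)
    for key, choices in mutations.items():
        if rng.random() < resample_probability or config[key] not in choices:
            # Resample: pick any allowed value from the list.
            new_config[key] = rng.choice(choices)
        else:
            # Perturb: step to a neighboring value in the list, clamped at the ends.
            index = choices.index(config[key]) + rng.choice([-1, 1])
            index = min(max(index, 0), len(choices) - 1)
            new_config[key] = choices[index]
    return new_config


def perturb_population(population, mutations, rng, quantile_fraction=0.25):
    """One exploit-and-explore round; each trial is a dict with 'config' and 'score'."""
    ranked = sorted(population, key=lambda trial: trial["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * quantile_fraction))
    top = ranked[:cutoff]
    bottom = ranked[max(cutoff, len(ranked) - cutoff):]  # never overlaps `top`
    for trial in bottom:
        donor = rng.choice(top)
        # Exploit: adopt a top trial's state (the score stands in for its checkpoint).
        trial["score"] = donor["score"]
        # Explore: perturb the donor's hyperparameters.
        trial["config"] = explore(donor["config"], mutations, rng)
    return population


mutations = {"lr": [0.001, 0.01, 0.1], "momentum": [0.001, 0.01, 0.1, 0.9]}
rng = random.Random(0)  # seeded so the sketch is repeatable
population = [
    {"config": {"lr": 0.001, "momentum": 0.001}, "score": 0.90 + 0.01 * i}
    for i in range(8)
]
perturb_population(population, mutations, rng)
for trial in population:
    print(trial)
```

With eight trials and a quantile fraction of `0.25`, only the two lowest-scoring trials are replaced in each round, which is why a short run explores few combinations.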

In [None]:
# This config bootstraps the trials. These values won't be changed, so analysis.dataframe()
# lists the same values for every trial. Look at the `experiment_tag` column instead.
config = {
    "lr": 0.001,            # Use the lowest values from the previous cell
    "momentum": 0.001
}

Now modify the `tune.run()` call we used in the lesson and run it.

> **WARNING:** This will run for a few minutes.

In [None]:
analysis = tune.run(TrainMNIST,
    scheduler=pbt_scheduler,
    config=config,
    # A trial stops as soon as either criterion is met.
    stop={"mean_accuracy": 0.97, "training_iteration": 600},
    num_samples=8,
    verbose=1
)

In [None]:
print("Best config: ", analysis.get_best_config(metric="mean_accuracy", mode="max"))

In [None]:
analysis.dataframe().sort_values('mean_accuracy', ascending=False).head()

It's easy to get above `0.97` accuracy. In fact, it's a poor choice for a stopping criterion, because trials stop before we explore as well as we should. Let's sort by `training_iteration` to see which combinations were fast.

In [None]:
analysis.dataframe()[['mean_accuracy', 'experiment_tag', 'training_iteration']].sort_values('training_iteration', ascending=True)

As expected, higher values for the learning rate and momentum generally provide quicker convergence. All but one of the trials shown had a learning rate of `0.1`. The momentum value was much less significant.

In [None]:
stats = analysis.stats()
secs = stats["timestamp"] - stats["start_time"]
print(f'{secs:7.2f} seconds, {secs/60.0:7.2f} minutes')

Try changing the experiment to ensure we explore more combinations.
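
One possible direction, sketched under assumptions: replace the `stop` dictionary with a function, so that a trial cannot stop on accuracy alone before it has run through several perturbation intervals. The helper `make_stopper` and the `min_iterations=50` value are hypothetical choices for illustration. The sketch assumes your Ray version accepts a callable of the form `(trial_id, result) -> bool` for `stop` in `tune.run()`, so check the documentation for the version you have installed.

```python
def make_stopper(min_iterations=50, target_accuracy=0.97, max_iterations=600):
    """Build a stopping function (hypothetical helper, not part of Ray)."""
    def should_stop(trial_id, result):
        iteration = result["training_iteration"]
        if iteration >= max_iterations:
            # Hard cap, the same as the original dictionary criterion.
            return True
        # Only honor the accuracy target after enough iterations have passed
        # for the scheduler to perturb the trial a few times.
        return (iteration >= min_iterations
                and result["mean_accuracy"] >= target_accuracy)
    return should_stop


stopper = make_stopper()

# Quick check with hand-written results (no Ray needed):
print(stopper("trial_0", {"training_iteration": 5, "mean_accuracy": 0.99}))
print(stopper("trial_0", {"training_iteration": 50, "mean_accuracy": 0.98}))
```

Under that assumption, the function would be passed as `stop=stopper` in place of the dictionary. Increasing `num_samples` or lowering `perturbation_interval` are other knobs worth trying.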