# Tune Tutorial

<img src="tune.png" alt="Tune Logo" width="400"/>


Tune is a scalable framework for model training and hyperparameter search with a focus on deep learning and deep reinforcement learning.

**Code**: https://github.com/ray-project/ray/tree/master/python/ray/tune

**Examples**: https://github.com/ray-project/ray/tree/master/python/ray/tune/examples

**Documentation**: http://ray.readthedocs.io/en/latest/tune.html

**Mailing List**: https://groups.google.com/forum/#!forum/ray-dev

# Overview

Tuning hyperparameters is often the most expensive part of the machine learning workflow. Tune is built to address this pain point with an efficient and scalable solution.

This tutorial will walk you through the following process:

1. Creating and training a model on a toy dataset (MNIST)
2. Integrating Tune into your workflow by creating a trainable and running an experiment
3. Trying out advanced features by plugging in an efficient scheduler
4. (Optional) Trying out a search algorithm

## Part 1: Creating and training an un-Tune-d model

In [None]:
from IPython.display import HTML

from model import load_data, make_model, evaluate
from helper import prepare_data

import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)

Let's create and train a model to classify [MNIST](https://www.wikiwand.com/en/MNIST_database) digits **without tuning**, as a baseline. We will build a convolutional neural network (using [Keras](https://keras.io/)) to classify the digits.

*Note: If you would like to see the specifics of `load_data`, `make_model`, and `evaluate`, feel free to check out `model.py`!*

In [None]:
def train_mnist():
    data_generator = load_data()
    model = make_model()
    for batch_of_data, batch_of_labels in data_generator:
        model.fit(batch_of_data, batch_of_labels, verbose=0)
    return model

Let's create our model. (This should take ~30 seconds)

In [None]:
first_model = train_mnist()

Let's evaluate the un-Tune-d model.

In [None]:
evaluate(first_model)

Let's now quickly try out this model to see if it works as expected. We'll load the model with our trained weights. Try to write a digit into the box below. This will automatically save your input in a variable `data` behind the scenes.

In [None]:
data = None
HTML(open("input.html").read())

(tip: don't expect it to work)

In [None]:
prepared_data = prepare_data(data)
print("This model predicted your input as", first_model.predict(prepared_data).argmax())

## Part 2: Setting up Tune

In [None]:
import numpy as np

import ray
from ray import tune

from helper import test_reporter, get_best_result
from model import load_data, make_model

One thing we might want to do now is find better hyperparameters so that our model trains more quickly and possibly to a higher accuracy. Let's make some minor modifications to utilize Tune. 

Tune uses Ray as a backend, so we will first import and initialize Ray. You can ignore the output at this point.

In [None]:
ray.init(ignore_reinit_error=True)

### Step 1: Defining a Trainable to run

Tune will automate and distribute your hyperparameter search by scheduling a number of **trials** on a machine. Each trial runs a user-defined Python function, called a **trainable**, with a sampled set of hyperparameters.

We define a new training function `train_mnist_tune` as our trainable. A trainable must accept a `reporter` object, train the model, and report one or more metrics to Tune. This allows Tune to keep track of performance as the model is training.

**TODO: After fitting the Keras model, get the `mean_accuracy` from Keras, and call the ``reporter`` to report the `mean_accuracy` for every batch**. 

You can get model accuracy from the Keras model with the following code:

```python
mean_accuracy = model.evaluate(x_batch, y_batch)[1]
```


Example of using the reporter:

```python
reporter(mean_accuracy=mean_accuracy, metric2=1, metric3=0.3, ...)
```
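
To make the reporter contract concrete without Keras or Ray, here is a minimal sketch. `RecordingReporter` and `toy_trainable` are hypothetical stand-ins invented for illustration (they are not part of Tune), and the "accuracy" is a made-up quantity that climbs toward 1.0. The point is only the shape of the interaction: the trainable receives `config` and `reporter`, and calls the reporter with keyword metrics after each step.

```python
class RecordingReporter:
    """Hypothetical stand-in for Tune's reporter: it just stores what is reported."""

    def __init__(self):
        self.results = []

    def __call__(self, **metrics):
        self.results.append(dict(metrics))


def toy_trainable(config, reporter):
    """Hypothetical trainable whose fake accuracy rises at a rate set by config['lr']."""
    accuracy = 0.0
    for step in range(5):
        # Placeholder for model.fit(...) followed by model.evaluate(...)[1].
        accuracy += config["lr"] * (1.0 - accuracy)
        # Report the metrics for this step, as a real trainable would.
        reporter(mean_accuracy=accuracy, timesteps_total=step)


reporter = RecordingReporter()
toy_trainable({"lr": 0.5}, reporter)
print(len(reporter.results))  # 5
print(reporter.results[-1]["mean_accuracy"])
```

In the exercise below, the real `reporter` supplied by Tune plays the role of `RecordingReporter`, so you only need to write the call itself.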


In [None]:
def train_mnist_tune(config, reporter):
    data_generator = load_data()
    model = make_model()
    for i, (x_batch, y_batch) in enumerate(data_generator):
        model.fit(x_batch, y_batch, verbose=0)
        # Save a checkpoint every third batch (including the first, i == 0).
        if i % 3 == 0:
            last_checkpoint = "weights_tune_{}.h5".format(i)
            model.save_weights(last_checkpoint)

        # Report progress to Tune after every batch.
        reporter(mean_accuracy=None,  # TODO
                 timesteps_total=i,
                 checkpoint=last_checkpoint)

In [None]:
# This may take 30 seconds or so to run if incorrectly written
assert test_reporter(train_mnist_tune)

*Note: Call ``help(tune.trainable)`` if you are interested in learning more about what qualifies as trainable in Tune.*

### Step 2: Configure the search and run Tune

Now that we have a working trainable, we can configure the search. We will use some basic Tune features for training, namely a stopping criterion and a search space.

1) First, set the stopping criterion so that a trial stops once its `mean_accuracy` reaches `0.95`. For example, to stop trials whenever they report a `mean_accuracy` that is `>= 0.95`, use:

```python
stop={"mean_accuracy": 0.95}
```


2) We also want to designate a search space. We'll search over *learning rate*, which sets the step size of our model update, and *momentum*, which helps accelerate gradient vectors in the right direction, leading to faster convergence.

For `learning rate`, Tune supports sampling parameters from user-specified lambda functions, which can be used independently or in combination with grid search. For `momentum`, you can use `tune.grid_search` to specify an axis of a grid search. For example:

```python
space={
    "lr": tune.sample_from(lambda spec: np.random.uniform(0.001, 0.1)),
    "momentum": tune.grid_search([0.2, 0.4, 0.6]),
    ...
}
```
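
To see how a grid axis and a sampled axis combine, here is a rough sketch in plain Python. The `expand` helper is hypothetical and is not how Tune is implemented; it only illustrates the idea that each grid value becomes its own configuration, and the sampling function is called again for each one.

```python
import itertools
import random


def expand(grid_axes, sampled_axes, rng):
    """Build one config per grid point; sampled values are redrawn for each config."""
    names = list(grid_axes)
    configs = []
    for values in itertools.product(*(grid_axes[name] for name in names)):
        config = dict(zip(names, values))
        for name, sampler in sampled_axes.items():
            config[name] = sampler(rng)
        configs.append(config)
    return configs


rng = random.Random(0)
configs = expand(
    grid_axes={"momentum": [0.2, 0.4, 0.6]},
    sampled_axes={"lr": lambda r: r.uniform(0.001, 0.1)},
    rng=rng,
)
for config in configs:
    print(config)
```

With the search space above, this yields three configurations: one per momentum value, each with its own randomly drawn learning rate.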

**TODO: Configure `tune.run` with a stopping criterion using `stop` and a search space using `config`**

As a reminder, we want to stop when `mean_accuracy` reaches `0.95`, randomly sample the learning rate `"lr"` between 0.001 and 0.1, and grid search over `"momentum"` for `[0.2, 0.4, 0.6]`.
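
One way to read the stopping rule is sketched below. The `should_stop` function is a hypothetical illustration rather than Tune's internal code: it treats each key in `stop` as a threshold and ends the trial as soon as a reported result meets or exceeds any of them.

```python
def should_stop(result, stop):
    """Return True once any metric named in `stop` reaches its threshold."""
    return any(
        key in result and result[key] >= threshold
        for key, threshold in stop.items()
    )


stop = {"mean_accuracy": 0.95}
reported = [0.40, 0.81, 0.95, 0.97]  # made-up accuracies, one per report

# Index of the first report that triggers the stopping rule.
stopped_at = next(
    i for i, accuracy in enumerate(reported)
    if should_stop({"mean_accuracy": accuracy}, stop)
)
print(stopped_at)  # 2
```

Note that the comparison is `>=`, so a trial reporting exactly `0.95` is stopped.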

When you're ready, run the experiment! (this should take ~1 minute)

In [None]:
space = {
    "lr": None, # TODO
    "momentum": None # TODO
}
trials = tune.run(
    train_mnist_tune,
    stop=None, # TODO
    config=space,
    resources_per_trial={"cpu": 4},
    verbose=False
)

You can expect the result below to be about `0.6`, although your mileage may vary (and it's OK).

In [None]:
print("The best result is", get_best_result(trials, metric="mean_accuracy"))

As you can see, the accuracy is still low, similar to the accuracy of our first un-Tune-d model! In the next section, we will scale up the search and accelerate the training using a state-of-the-art algorithm.

*Note: Call ``help(tune.run)`` if you are interested in learning more about executing experiments.*

## Part 3: Scale up the search with more samples, hyperparameters, and a custom scheduler 

In [None]:
from ray.tune.schedulers import AsyncHyperBandScheduler

By default, Tune schedules trials in serial order with the `FIFOScheduler` class. Instead, we can specify a custom scheduling algorithm, such as `HyperBand`, to scale up and accelerate our training. `HyperBand` is a state-of-the-art algorithm that speeds up random search through adaptive resource allocation and early stopping.

We will take a few steps to scale up our search and add `HyperBand`:

1) Sample the search space 5 times using the parameter `num_samples`. For example,

```python
num_samples=5
```


2) In addition to `learning rate` and `momentum`, search over another hyperparameter `"hidden"` from 16 to 512 which specifies the size of the last neural network layer.

Here, use `np.random.randint`. For example,

```python
config={
    ...
    "hidden": tune.sample_from(lambda spec: np.random.randint(16, 512))
}
```

3) Create an Asynchronous HyperBand Scheduler. Be sure to set the `time_attr` to `timesteps_total` and `reward_attr` to `mean_accuracy`. For example,

```python
custom_scheduler = AsyncHyperBandScheduler(time_attr='timesteps_total', reward_attr='mean_accuracy')
```

*Note: Read the documentation on this step at https://ray.readthedocs.io/en/latest/tune-schedulers.html#asynchronous-hyperband or call ``help(tune.schedulers.AsyncHyperBandScheduler)`` to learn more about the Asynchronous Hyperband Scheduler*
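
To build intuition for "adaptive resource allocation and early stopping", here is a deliberately simplified sketch of synchronous successive halving, the idea that HyperBand-style schedulers build on. It is not Tune's implementation: `AsyncHyperBandScheduler` makes its decisions asynchronously as results arrive, and the `qualities` values below are made-up per-round accuracy gains for four hypothetical trials.

```python
def successive_halving(qualities, rounds=3, keep_fraction=0.5):
    """Toy synchronous successive halving.

    `qualities` maps a trial name to a hypothetical per-round accuracy gain.
    Returns the surviving trial names and how many rounds each trial trained.
    """
    accuracy = {name: 0.0 for name in qualities}
    rounds_trained = {name: 0 for name in qualities}
    survivors = list(qualities)
    for _ in range(rounds):
        # Give every surviving trial one more round of training.
        for name in survivors:
            accuracy[name] += qualities[name]
            rounds_trained[name] += 1
        # Keep only the best-performing fraction (at least one trial).
        keep = max(1, int(len(survivors) * keep_fraction))
        survivors = sorted(survivors, key=accuracy.get, reverse=True)[:keep]
    return survivors, rounds_trained


qualities = {"a": 0.05, "b": 0.30, "c": 0.10, "d": 0.20}
survivors, rounds_trained = successive_halving(qualities)
print(survivors)       # ['b']
print(rounds_trained)  # {'a': 1, 'b': 3, 'c': 1, 'd': 2}
```

In this toy run, only 7 training rounds are spent instead of the 12 needed to train all four trials to completion, because weak trials are stopped early.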

**TODO: Create and specify the HyperBand scheduler using `scheduler`, search over another hyperparameter `hidden`, and specify the number of samples with `num_samples`.**
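
It helps to estimate how many trials this will launch. As we understand the Tune documentation, when a `grid_search` is present the whole grid is repeated `num_samples` times, so the count below is an estimate under that assumption rather than a guarantee for every Tune version.

```python
momentum_grid = [0.2, 0.4, 0.6]  # the grid_search axis from the search space
num_samples = 5

# One trial per (sample, grid value) pair; "lr" and "hidden" are redrawn each time.
planned_trials = [
    (sample_index, momentum)
    for sample_index in range(num_samples)
    for momentum in momentum_grid
]
print(len(planned_trials))  # 15
```

Without early stopping, all of these trials would train to completion, which is why the scheduler matters here.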

When you're ready, run the experiment! (this should take ~1 minute)

In [None]:
custom_scheduler = None # TODO
space = {
    "lr": tune.sample_from(lambda spec: np.random.uniform(0.001, 0.1)),
    "momentum": tune.grid_search([0.2, 0.4, 0.6]),
    "hidden": None # TODO
}
better_trials = tune.run(
    train_mnist_tune,
    num_samples=None, # TODO
    scheduler=custom_scheduler,
    stop={"mean_accuracy": 0.95},
    config=space,
    resources_per_trial={"cpu": 4},
    verbose=False
)

You can expect the result to be about `0.95`, although your mileage may vary.

In [None]:
print("The best result is", get_best_result(better_trials, metric="mean_accuracy"))

# 🎉 Congratulations, you're now a Tune expert! 🎉

# Please fill out this form to provide feedback on this tutorial!

https://goo.gl/forms/NVTFjUKFz4TH8kgK2

# (Optional) Try using a search algorithm

Tune is an execution layer, so we can combine powerful optimizers such as HyperOpt (https://github.com/hyperopt/hyperopt) with state-of-the-art algorithms such as HyperBand without modifying any model training code.

The documentation for doing this is here: https://ray.readthedocs.io/en/latest/tune-searchalg.html#hyperopt-search-tree-structured-parzen-estimators

In [None]:
from hyperopt import hp
from ray.tune.suggest.hyperopt import HyperOptSearch

space = {
    "lr": hp.uniform("lr", 0.001, 0.1),
    "momentum": hp.uniform("momentum", 0.1, 0.9),
    "hidden": hp.choice("hidden", np.arange(16, 256, dtype=int)),
}

hyperband = AsyncHyperBandScheduler(time_attr='timesteps_total', reward_attr='mean_accuracy')

hyperopt_search = HyperOptSearch(space, reward_attr="mean_accuracy")

good_results = tune.run(
    train_mnist_tune,
    num_samples=5,
    search_alg=hyperopt_search,
    scheduler=hyperband,
    stop={"mean_accuracy": 0.95},
    config=space,
    verbose=False
)

In [None]:
print("The best result is", get_best_result(good_results, metric="mean_accuracy"))