# Tune Tutorial

<img src="tune.png" alt="Tune Logo" width="400"/>


Tune is a scalable framework for model training and hyperparameter search with a focus on deep learning and deep reinforcement learning.

**Code**: https://github.com/ray-project/ray/tree/master/python/ray/tune

**Examples**: https://github.com/ray-project/ray/tree/master/python/ray/tune/examples

**Documentation**: http://ray.readthedocs.io/en/latest/tune.html

**Mailing List**: https://groups.google.com/forum/#!forum/ray-dev

# Overview

Tuning hyperparameters is often the most expensive part of the machine learning workflow. Tune is built to address this pain point with an efficient and scalable solution.


## Outline
This tutorial will walk you through the following process:

1. Creating and training a model on a toy dataset (MNIST)
2. Integrating Tune into your workflow
3. Trying out advanced features - plugging in an efficient scheduler and search algorithm
4. Validating your trained model


In [None]:
from helper import *
import numpy as np
from IPython.display import HTML
import matplotlib.pyplot as plt

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
%matplotlib inline
%load_ext autoreload
%autoreload 2

## Part 1: Creating a model to be trained

Let's create a Convolutional Neural Network model that will classify digits (MNIST).

<img src="mnist.png" alt="MNIST Visualization" width="400"/>

This is a fairly simple dataset, but it enables us to explore Tune's functionality in depth.
We will use 60,000 images to train the network. The images are 28x28 NumPy arrays, with pixel values ranging between 0 and 255. The labels are an array of integers, ranging from 0 to 9. These correspond to the digit the image represents.

Training the neural network model requires the following steps:

1. Feed the training data to the model. In this example, the data comes from `data_generator`, which yields batches of images and labels.
2. The model learns to associate images and labels.

Hints:
1. `data_generator` yields (`data_batch`, `label_batch`).
2. You can use `model.fit(data, labels)` to train the model.

In [None]:
import argparse
parser = argparse.ArgumentParser(description='Keras MNIST Example')
parser.add_argument('--lr', type=float, default=0.1, help='learning rate')
parser.add_argument('--momentum', type=float, default=0.0, help='SGD momentum')
parser.add_argument('--kernel1', type=int, default=3, help='Size of first kernel')
parser.add_argument('--kernel2', type=int, default=3, help='Size of second kernel')
parser.add_argument('--poolsize', type=int, default=2, help='Size of pooling window')
parser.add_argument('--dropout1', type=float, default=0.25, help='Dropout rate after pooling')
parser.add_argument('--hidden', type=int, default=16, help='Size of Hidden Layer')
parser.add_argument('--dropout2', type=float, default=0.5, help='Dropout rate after hidden layer')

DEFAULT_ARGS = vars(parser.parse_known_args()[0])

In [None]:
def make_model(parameters):
    config = DEFAULT_ARGS.copy()  # This is obtained via the global scope
    config.update(parameters)
    num_classes = 10
    
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(config["kernel1"], config["kernel1"]),
                     activation='relu', input_shape=(28, 28, 1)))
    model.add(Conv2D(64, (config["kernel2"], config["kernel2"]), activation='relu'))
    model.add(MaxPooling2D(pool_size=(config["poolsize"], config["poolsize"])))
    model.add(Dropout(config["dropout1"]))
    model.add(Flatten())
    model.add(Dense(config["hidden"], activation='relu'))
    model.add(Dropout(config["dropout2"]))
    model.add(Dense(num_classes, activation='softmax'))

    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=keras.optimizers.SGD(
                      lr=config["lr"], momentum=config["momentum"]),
                  metrics=['accuracy'])
    return model

def train_mnist(args):
    """Loads data, does one pass over the data, and saves the weights."""
    data_generator = load_data()
    model = make_model(args)
    for x_batch, y_batch in data_generator:
        model.fit(x_batch, y_batch)
    model.save_weights("./weights.h5")
    return model

The cells above specify some arguments with reasonable defaults and define the model.

*Next*, we want to train this model.
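
The defaults-plus-overrides pattern that `make_model` relies on can be sketched in isolation. This is a minimal illustration using only two of the arguments above; `resolve_config` is a hypothetical helper name, not part of the notebook or Tune. `parse_known_args` is used instead of `parse_args` because the Jupyter kernel is started with its own command-line flags, which a strict parser would reject.

```python
import argparse

parser = argparse.ArgumentParser(description="Defaults sketch")
parser.add_argument("--lr", type=float, default=0.1, help="learning rate")
parser.add_argument("--momentum", type=float, default=0.0, help="SGD momentum")

# Simulate the extra flags a Jupyter kernel passes on its command line.
# parse_known_args returns (namespace, unrecognized_args) instead of exiting.
namespace, unknown = parser.parse_known_args(["-f", "kernel.json"])
defaults = vars(namespace)  # Namespace -> plain dict


def resolve_config(overrides):
    """Return the defaults with `overrides` applied, leaving defaults untouched."""
    config = defaults.copy()
    config.update(overrides)
    return config


config = resolve_config({"lr": 0.01})
print(config)    # lr comes from the override, momentum from the default
print(defaults)  # the shared defaults are unchanged
```

Copying before updating matters here: every call starts from the same defaults, so one trial's overrides cannot leak into the next.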

In [None]:
train_mnist(DEFAULT_ARGS)

Let's now quickly try out this model to see if it works as expected (tip: don't expect it to).

In [None]:
first_model = make_model(DEFAULT_ARGS)
first_model.load_weights("./weights.h5")

In [None]:
HTML(open("input.html").read())

In [None]:
try:
    prepared_data = prepare_data(data)
    first_model.predict(prepared_data).argmax()
except Exception: # run through only
    pass

## Part 2: Setting up Tune

One thing we might want to do now is find better hyperparameters so that our model trains more quickly. Let's make some minor modifications to utilize Tune. 

Tune uses Ray as a backend, so we will first import and initialize Ray.

In [None]:
import ray
from ray import tune

ray.init(ignore_reinit_error=True)

Tune will automate and distribute your hyperparameter search by scheduling a number of trials in a cluster. Each trial runs a user-defined Python function with a given set of hyperparameters. 
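
Conceptually, a hyperparameter search is a set of independent calls to the same function with different configurations. The sketch below shows that idea sequentially in plain Python. It is not how Tune is implemented (Tune distributes trials through Ray), and `toy_objective` and `run_trials` are hypothetical names used only for illustration.

```python
def toy_objective(config):
    """Stand-in for a training function: returns a score for one configuration."""
    # The score peaks at lr == 0.05; a real trainable would train a model here.
    return 1.0 - abs(config["lr"] - 0.05)


def run_trials(trainable, configs):
    """Run `trainable` once per configuration and collect (config, score) pairs."""
    return [(config, trainable(config)) for config in configs]


configs = [{"lr": 0.001}, {"lr": 0.05}, {"lr": 0.1}]
results = run_trials(toy_objective, configs)
best_config, best_score = max(results, key=lambda pair: pair[1])
print(best_config, best_score)
```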

### Two steps to use Tune:

*1*. The function you wish to tune must have a specific signature, as shown below. Specifically: add a **``reporter``** argument to the `train_mnist_tune` function below.

```python
def trainable(config, reporter):
    """
    Args:
        config (dict): Parameters provided from the search algorithm
            or variant generation.
        reporter (Reporter): Handle to report intermediate metrics to Tune.
    """
```

*2*. We want to keep track of performance as the model is training. Specifically: get the `mean_accuracy` from Keras, and call the **``reporter``** to report the `mean_accuracy` for every batch. You can get model accuracy from Keras with the following code:

```python
result = model.fit(x_batch, y_batch, verbose=0)
mean_accuracy = result.history["acc"][0]
```


Example of using the reporter:

```python
def train_func(config, reporter):  # add a reporter arg
    ...
    for data, target in dataset:
        accuracy = model.fit(data, target)
        reporter(mean_accuracy=accuracy)  # report metrics
```
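
Before handing a function to Tune, it can help to exercise it locally with a stand-in for the reporter. The sketch below assumes only what the example above shows, namely that the reporter is called with keyword arguments. `RecordingReporter` and `toy_train_func` are hypothetical names, and the accuracy values are made up for illustration.

```python
class RecordingReporter:
    """Hypothetical stand-in that records every set of reported metrics."""

    def __init__(self):
        self.results = []

    def __call__(self, **metrics):
        self.results.append(metrics)


def toy_train_func(config, reporter):
    """Follows the (config, reporter) signature; no real model is trained."""
    accuracy = 0.0
    for _ in range(config["num_batches"]):
        # Pretend each batch closes half of the remaining gap to 1.0.
        accuracy += (1.0 - accuracy) * 0.5
        reporter(mean_accuracy=accuracy)


reporter = RecordingReporter()
toy_train_func({"num_batches": 3}, reporter)
print(reporter.results)
```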


In [None]:
### TODO: Change this signature #####
def train_mnist_tune(config, reporter):
    data_generator = load_data()
    model = make_model(config)
    for x_batch, y_batch in data_generator:
        result = model.fit(x_batch, y_batch, verbose=0)
        reporter(mean_accuracy=result.history["acc"][0])
        # TODO: Use the reporter here to fill out intermediate metrics
        ##########
    model.save_weights("./weights_tune.h5")

### Let's now try to search over the learning rate. 

*NOTE: You can find the documentation for this section here: https://ray.readthedocs.io/en/latest/tune-usage.html#specifying-experiments*


Let's **first create a Tune Experiment specification**. The relevant documentation for the Experiment class is here:

```python
class ray.tune.Experiment(name, run, stop=None, config=None, ... ):
    """Tracks experiment specifications.

    Parameters:
        name (str): Name of experiment.
        run (function|class|str): The algorithm or model to train.
            This may refer to the name of a built-in algorithm
            (e.g. RLlib's DQN or PPO), a user-defined trainable
            function or class, or the string identifier of a
            trainable function or class registered in the tune registry.
        stop (dict): The stopping criteria. The keys may be any field in
            the return result of 'train()', whichever is reached first.
            Defaults to empty dict.
        config (dict): Algorithm-specific configuration for Tune variant
            generation (e.g. env, hyperparams). Defaults to empty dict.
            Custom search algorithms may ignore this.
```

1. Set the stopping criteria to stop when `mean_accuracy` passes `0.95`.
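
As a rough mental model, a stopping criterion is a threshold checked against each reported result. The sketch below assumes a trial stops as soon as any listed metric reaches its threshold. `should_stop` is a hypothetical helper written for this illustration, not Tune's API, and how it treats missing metrics is an assumption.

```python
def should_stop(result, stop):
    """Return True if any metric in `stop` has reached its threshold in `result`."""
    return any(
        key in result and result[key] >= threshold
        for key, threshold in stop.items()
    )


stop = {"mean_accuracy": 0.95}
below = should_stop({"mean_accuracy": 0.90}, stop)
above = should_stop({"mean_accuracy": 0.96}, stop)
print(below, above)
```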


We also want to designate a search space. **Randomly search for a learning rate between 0.001 and 0.1, and do a grid search over `momentum` for `[0.2, 0.4, 0.6]`** (https://ray.readthedocs.io/en/latest/tune-usage.html#tune-search-space-default).

You can use `tune.grid_search` to specify an axis of a grid search. By default, Tune also supports sampling parameters from user-specified lambda functions, which can be used independently or in combination with grid search.

The following example shows grid search over two nested parameters combined with random sampling from a lambda function, generating 9 different trials.

```python
config={
    "alpha": lambda spec: np.random.uniform(100),
    "nn_layers": [
         tune.grid_search([16, 64, 256]),
         tune.grid_search([16, 64, 256]),
    ],
}
```
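
To see where the 9 trials come from, the expansion can be reproduced by hand: each grid axis contributes 3 values, and the sampled parameter is drawn once per grid point. The sketch below uses `itertools.product` and the standard `random` module as stand-ins. It illustrates the counting only and is not Tune's variant generator; the 0 to 100 range for `alpha` is an assumption made for this example.

```python
import itertools
import random

rng = random.Random(0)  # seeded so the sketch is repeatable
layer_sizes = [16, 64, 256]

variants = []
for first, second in itertools.product(layer_sizes, layer_sizes):
    variants.append({
        "alpha": rng.uniform(0, 100),  # sampled independently per variant
        "nn_layers": [first, second],
    })

print(len(variants))
print(variants[0]["nn_layers"], variants[-1]["nn_layers"])
```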



In [None]:
configuration = tune.Experiment(
    "experiment_name",
    run=train_mnist_tune,
    trial_resources={"cpu": 4},
    stop={"mean_accuracy": 0.95},
    config={"lr": lambda spec: np.random.uniform(0.001, 0.1),
            "momentum": tune.grid_search([0.2, 0.4, 0.6])}
)

assert configuration.spec.get("stop", {}).get("mean_accuracy") == 0.95
assert "grid_search" in configuration.spec.get("config", {}).get("momentum", {})

In [None]:
trials = tune.run_experiments(configuration)

In [None]:
print("The best result is", get_best_result(trials, metric="mean_accuracy"))

## Try using a scheduler



Now, let's multiplex many trials on this single machine to find the best parameters.

1. Run 10 samples (https://ray.readthedocs.io/en/latest/tune-usage.html#sampling-multiple-times)
2. Create an Asynchronous HyperBand Scheduler (https://ray.readthedocs.io/en/latest/tune-schedulers.html#asynchronous-hyperband). The documentation is shown below. 

Be sure to set the `time_attr` to `training_iteration` and `reward_attr` to `mean_accuracy`.

```python
class AsyncHyperBandScheduler(FIFOScheduler):
    """Implements the Async Successive Halving.

    See https://openreview.net/forum?id=S1Y7OOlRZ

    Args:
        time_attr (str): A training result attr to use for comparing time.
            Note that you can pass in something non-temporal such as
            `training_iteration` as a measure of progress, the only requirement
            is that the attribute should increase monotonically.
        reward_attr (str): The training result objective value attribute. As
            with `time_attr`, this may refer to any objective value. Stopping
            procedures will use this attribute.
        ...
        
    Examples:
        >>> hyperband = AsyncHyperBandScheduler(
        >>>     time_attr='training_iteration',
        >>>     reward_attr='mean_accuracy')
    """

```
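
The idea behind asynchronous successive halving can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Tune's `AsyncHyperBandScheduler`: the rung positions, the `reduction_factor` default, and the `ToyAsyncHalving` class are all hypothetical. Each time a trial reaches a rung, its reward is compared against the rewards recorded at that rung so far, so a decision is made immediately without waiting for the other trials.

```python
class ToyAsyncHalving:
    """Simplified asynchronous successive halving (not Tune's implementation)."""

    def __init__(self, rungs=(1, 3, 9), reduction_factor=3):
        self.reduction_factor = reduction_factor
        # Rewards recorded so far at each rung (milestone iteration).
        self.recorded = {rung: [] for rung in rungs}

    def on_result(self, iteration, reward):
        """Return "STOP" or "CONTINUE" for a trial reporting at `iteration`."""
        if iteration not in self.recorded:
            return "CONTINUE"  # only decide at rung milestones
        rewards = self.recorded[iteration]
        rewards.append(reward)
        # Keep roughly the top 1/reduction_factor of trials seen at this rung.
        keep = max(1, len(rewards) // self.reduction_factor)
        cutoff = sorted(rewards, reverse=True)[keep - 1]
        return "CONTINUE" if reward >= cutoff else "STOP"


scheduler = ToyAsyncHalving()
decisions = [scheduler.on_result(1, reward) for reward in (0.2, 0.5, 0.1, 0.6)]
print(decisions)
```

Early trials always continue in this sketch because there is little to compare against; the cutoff only becomes selective as more rewards accumulate at a rung.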

In [None]:
from ray.tune.schedulers import AsyncHyperBandScheduler

## TODO: Create an Asynchronous HyperBand Scheduler
hyperband = AsyncHyperBandScheduler(
    time_attr="training_iteration",
    reward_attr="mean_accuracy")
configuration.spec["num_samples"] = 10  # set this to 10
# configuration.spec["config"] = {
    
# }

Given the previous configuration, pass in the HyperBand scheduler to `run_experiments`.

Recall that the `run_experiments` API is:
```python
def run_experiments(experiments=None,
                    search_alg=None,
                    scheduler=None,
                    ...):
    """Runs and blocks until all trials finish.

    Args:
        experiments (Experiment | list | dict): Experiments to run. Will be
            passed to `search_alg` via `add_configurations`.
        search_alg (SearchAlgorithm): Search Algorithm. Defaults to
            BasicVariantGenerator.
        scheduler (TrialScheduler): Scheduler for executing
            the experiment. Choose among FIFO (default), MedianStopping,
            AsyncHyperBand, and HyperBand.
        ...
    
    Returns:
        List of Trial objects, holding data for each executed trial.
```


In [None]:
# TODO: Call `run_experiments`
trials = tune.run_experiments(configuration, scheduler=hyperband)

In [None]:
best_model = get_best_model(make_model, trials, metric="mean_accuracy")

In [None]:
validation_data, validation_labels = load_validation()
best_model.evaluate(validation_data, validation_labels)

first_model.evaluate(validation_data, validation_labels)

## Try out your model on some manual inputs!

In [None]:
HTML(open("input_final.html").read())

In [None]:
try:
    manual_input = prepare_data(final_data)
    best = best_model.predict(manual_input).argmax()
    first = first_model.predict(manual_input).argmax()

    print("Best model got {}, first model got {}".format(best, first))
except Exception:
    print("skipping input")

# (Optional) Try using a search algorithm

Tune is an execution layer, so we can combine powerful optimizers such as HyperOpt (https://github.com/hyperopt/hyperopt) with state-of-the-art algorithms such as HyperBand without modifying any model training code.

TODO:

1. Create a HyperOptSearch object and run an experiment combining both the previously created `hyperband` scheduler and this Search algorithm. Use the given search space.
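
The separation between a search algorithm and the training code can be sketched with a toy suggest-and-report loop. The sketch below uses plain random sampling as a stand-in, not HyperOpt's optimizer, and `ToyRandomSearch`, `suggest`, `on_complete`, and `toy_trainable` are hypothetical names. The point is that the trainable only ever sees a `config` dict, so the searcher can be swapped without touching it.

```python
import random


class ToyRandomSearch:
    """Hypothetical searcher: suggests configs and remembers the best result."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.best = None  # (reward, config) of the best trial so far

    def suggest(self):
        return {
            "lr": self.rng.uniform(0.001, 0.1),
            "momentum": self.rng.uniform(0.1, 0.9),
        }

    def on_complete(self, config, reward):
        if self.best is None or reward > self.best[0]:
            self.best = (reward, config)


def toy_trainable(config):
    """Stand-in objective; it never needs to know which searcher made `config`."""
    return 1.0 - abs(config["lr"] - 0.05) - 0.1 * abs(config["momentum"] - 0.5)


search = ToyRandomSearch()
history = []
for _ in range(20):
    config = search.suggest()
    reward = toy_trainable(config)
    search.on_complete(config, reward)
    history.append(reward)

print(search.best)
```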

In [None]:
from hyperopt import hp
from ray.tune.suggest import HyperOptSearch


space = {
    "lr": hp.uniform("lr", 0.001, 0.1),
    "momentum": hp.uniform("momentum", 0.1, 0.9),
    "hidden": hp.choice("hidden", np.arange(16, 256, dtype=int)),
    "dropout1": hp.uniform("dropout1", 0.2, 0.8),
}

## TODO: CREATE A HyperOptObject
hyperopt_search = HyperOptSearch(space, reward_attr="mean_accuracy")

## TODO: Pass in the object to Tune.
tune.run_experiments(
    configuration, search_alg=hyperopt_search, scheduler=hyperband)