# Tune Tutorial


Tune is a scalable framework for hyperparameter search with a focus on deep learning and deep reinforcement learning.

**Code**: https://github.com/ray-project/ray/tree/master/python/ray/tune

**Documentation**: http://ray.readthedocs.io/en/latest/tune.html

In [2]:
from helper import load_data
import numpy as np

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator

%load_ext autoreload
%autoreload 2

Using TensorFlow backend.


# Overview

Tuning hyperparameters is often the most expensive part of the machine learning workflow. This tutorial walks through Tune, which aims to make that search efficient and scalable.


## Outline
This tutorial will walk you through the following process:

1. Creating and training a model on a toy dataset (MNIST)
2. Integrating Tune into your workflow
3. Trying out advanced features - plugging in an efficient scheduler and search algorithm
4. Validating your trained model

## Part 1: Creating a model to be trained

We want to start off by creating a simple model to classify digits. 


## TODO(rliaw): Make input to `make_model` a dict.

Hints:
1. `data_generator` yields (`x_batch`, `y_batch`).
2. You can use `model.fit(data, targets)` to train the model.

In [3]:
def make_model(args):
    num_classes = 10
    
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(args.kernel1, args.kernel1),
                     activation='relu', input_shape=(28, 28, 1)))
    model.add(Conv2D(64, (args.kernel2, args.kernel2), activation='relu'))
    model.add(MaxPooling2D(pool_size=(args.poolsize, args.poolsize)))
    model.add(Dropout(args.dropout1))
    model.add(Flatten())
    model.add(Dense(args.hidden, activation='relu'))
    model.add(Dropout(args.dropout2))
    model.add(Dense(num_classes, activation='softmax'))

    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=keras.optimizers.SGD(
                      lr=args.lr, momentum=args.momentum),
                  metrics=['accuracy'])
    return model

def train_mnist(args):
    data_generator = load_data()
    model = make_model(args)
    # Train the model on every batch the generator yields.
    for x_batch, y_batch in data_generator:
        model.fit(x_batch, y_batch)
    # Persist the trained weights so they can be validated later.
    model.save_weights("./weights.h5")

Let's use some default hyperparameters.

In [None]:
import argparse
parser = argparse.ArgumentParser(description='Keras MNIST Example')
parser.add_argument('--lr', type=float, default=0.01, help='learning rate')
parser.add_argument('--momentum', type=float, default=0.5, help='SGD momentum')
parser.add_argument('--kernel1', type=int, default=3, help='Size of first kernel')
parser.add_argument('--kernel2', type=int, default=3, help='Size of second kernel')
parser.add_argument('--poolsize', type=int, default=2, help='Size of pooling window')
parser.add_argument('--dropout1', type=float, default=0.25, help='Dropout rate after pooling')
parser.add_argument('--hidden', type=int, default=128, help='Size of hidden layer')
parser.add_argument('--dropout2', type=float, default=0.5, help='Dropout rate after hidden layer')

args = parser.parse_known_args()[0]

*Then*, we want to train this model. 

In [None]:
train_mnist(args)

## Part 2: Setting up Tune

Let's make some minor modifications to utilize Tune. 

Tune uses Ray as a backend, so we will first import and initialize Ray.

In [None]:
import ray
from ray import tune

ray.init(ignore_reinit_error=True)

Tune will automate and distribute your hyperparameter search by scheduling a number of trials in a cluster. Each trial runs a user-defined Python function with a given set of hyperparameters. 
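
As a rough mental model of the paragraph above, a hyperparameter search is a loop that draws a configuration, calls the training function with it, and records the outcome. The sketch below is plain Python and runs serially; it is not how Tune is implemented, and `toy_objective` and `run_trials` are hypothetical names used only for illustration.

```python
import random


def toy_objective(config):
    """Stand-in for a training run: the score peaks when lr is near 0.03."""
    return 1.0 - abs(config["lr"] - 0.03)


def run_trials(objective, num_trials, seed=0):
    """Run `num_trials` trials serially, each with a randomly drawn config."""
    rng = random.Random(seed)
    results = []
    for trial_id in range(num_trials):
        config = {"lr": rng.uniform(0.001, 0.1)}  # one set of hyperparameters
        results.append({"trial_id": trial_id,
                        "config": config,
                        "score": objective(config)})
    return results


results = run_trials(toy_objective, num_trials=5)
best = max(results, key=lambda r: r["score"])
print(best["config"], best["score"])
```

Tune's contribution, per the text above, is to schedule and distribute such trials across a cluster instead of running them one after another.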

There are two steps you need to take to set up Tune to search using the `train_mnist` function above.

### Two steps to use Tune:

*1*. Change the signature of the function you wish to tune to the format shown below. Specifically: pass a **``reporter``** object to the `train_mnist_tune` function below.

```
def trainable(config, reporter):
    """
    Args:
        config (dict): Parameters provided from the search algorithm
            or variant generation.
        reporter (Reporter): Handle to report intermediate metrics to Tune.
    """
```

*2*. We want to keep track of performance as the model is training. Specifically: get the `mean_accuracy` from Keras, and call the **``reporter``** to report the `mean_accuracy` for every batch. You can get model accuracy from Keras with the following code:

```
result = model.fit(x_batch, y_batch, verbose=0)
mean_accuracy = result.history["acc"][0]
```
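
One caveat, stated as an assumption about your environment and not about this tutorial: some newer Keras releases store this metric under the key `"accuracy"` instead of `"acc"`. A small helper (the name `first_accuracy` is hypothetical) can accept either key. It only needs the `history` dictionary, so plain dicts stand in for `result.history` below.

```python
def first_accuracy(history):
    """Return the first recorded accuracy from a Keras-style history dict."""
    for key in ("acc", "accuracy"):
        if key in history:
            return history[key][0]
    raise KeyError("no accuracy metric found in history: %r" % sorted(history))


# Plain dicts standing in for `result.history`.
old_style = {"loss": [0.41], "acc": [0.88]}
new_style = {"loss": [0.41], "accuracy": [0.88]}
print(first_accuracy(old_style), first_accuracy(new_style))
```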


Example of using the reporter:

```
def train_func(config, reporter):  # add a reporter arg
    ...
    for data, target in dataset:
        result = model.fit(data, target, verbose=0)
        # report the accuracy Keras recorded for this batch
        reporter(mean_accuracy=result.history["acc"][0])
```
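
To check a function with this signature before handing it to Tune, one option is to call it directly with a stand-in reporter. The sketch below assumes only what the example above shows, namely that the reporter is called with keyword arguments. `RecordingReporter` and the fake training loop are hypothetical and use neither Ray nor Keras.

```python
class RecordingReporter:
    """Stand-in for Tune's reporter that stores every reported metric dict."""

    def __init__(self):
        self.history = []

    def __call__(self, **metrics):
        self.history.append(metrics)


def train_func(config, reporter):
    """Fake training loop: accuracy rises by `config["step"]` per batch."""
    accuracy = 0.0
    for _ in range(config["num_batches"]):
        accuracy = min(1.0, accuracy + config["step"])
        reporter(mean_accuracy=accuracy)  # report metrics after each batch


reporter = RecordingReporter()
train_func({"num_batches": 4, "step": 0.25}, reporter)
print(reporter.history[-1])
```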


In [1]:
### TODO: Change this signature #####
def train_mnist_tune(config, reporter):
###################################
    global args
    vars(args).update(config)
    data_generator = load_data()
    model = make_model(args)
    for x_batch, y_batch in data_generator:
        result = model.fit(x_batch, y_batch, verbose=0)
        # TODO: Use the reporter here to fill out intermediate metrics
        ##########
        
    model.save_weights("./weights_tune.h5")
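
The `vars(args).update(config)` line overlays the values passed in `config` on top of the argparse defaults, so `make_model` can keep reading attributes such as `args.lr`. A minimal standalone illustration, using a hand-built `argparse.Namespace` with hypothetical values:

```python
import argparse

# Defaults, as argparse would produce them.
args = argparse.Namespace(lr=0.01, momentum=0.5, hidden=128)

# A config dict such as a search might supply for one trial.
config = {"lr": 0.05, "hidden": 256}

# vars() returns the namespace's underlying __dict__, so update() edits it in place.
vars(args).update(config)

print(args.lr, args.momentum, args.hidden)
```

Keys present in `config` replace the defaults; everything else is left untouched. Note that this mutates `args` in place.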

### Let's now try to search over the learning rate. 

NOTE: You can find the documentation for this section here: https://ray.readthedocs.io/en/latest/tune-usage.html#specifying-experiments


Let's first create a Tune Experiment specification. The relevant documentation for the Experiment class is here:

```python
class ray.tune.Experiment(name, run, stop=None, config=None, ... ):
    """Tracks experiment specifications.

    Parameters:
        name (str): Name of experiment.
        run (function|class|str): The algorithm or model to train.
            This may refer to the name of a built-in algorithm
            (e.g. RLlib's DQN or PPO), a user-defined trainable
            function or class, or the string identifier of a
            trainable function or class registered in the tune registry.
        stop (dict): The stopping criteria. The keys may be any field in
            the return result of 'train()', whichever is reached first.
            Defaults to empty dict.
        config (dict): Algorithm-specific configuration for Tune variant
            generation (e.g. env, hyperparams). Defaults to empty dict.
            Custom search algorithms may ignore this.
    """
```

1. Set the stopping criteria to stop when `mean_accuracy` passes `0.95`.
2. Designate a search space: randomly search for a learning rate between 0.001 and 0.1. (https://ray.readthedocs.io/en/latest/tune-usage.html#tune-search-space-default)
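
A possible shape for the two dictionaries, written as a sketch and not as the official solution: the stop criterion maps the reported metric name to a threshold, and the search space uses a function-valued entry in the same `lambda spec: ...` form that appears in the final cell of this tutorial. The standard-library `random` module is used so the snippet runs without NumPy or Ray, and calling the lambda by hand with `None` is for demonstration only.

```python
import random

# Stop a trial once the reported mean_accuracy reaches 0.95.
stop = {"mean_accuracy": 0.95}

# Draw the learning rate uniformly from [0.001, 0.1] for each trial.
config = {"lr": lambda spec: random.uniform(0.001, 0.1)}

# Demonstration only: sample the search space by hand a few times.
samples = [config["lr"](None) for _ in range(5)]
print(samples)
```

These would then be passed as `stop=stop` and `config=config` when constructing `tune.Experiment`.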

In [None]:
configuration = tune.Experiment(
    "experiment_name",
    run=train_mnist_tune,
    stop={},
    config={}
)
tune.run_experiments(configuration)

## Try using a scheduler



Now, let's use the CPUs on this machine to run several trials in parallel and find the best parameters on a single machine.

1. Run 10 samples (https://ray.readthedocs.io/en/latest/tune-usage.html#sampling-multiple-times)
2. Create an Asynchronous HyperBand Scheduler (https://ray.readthedocs.io/en/latest/tune-schedulers.html#asynchronous-hyperband) and set the following fields for the scheduler: 
```
time_attr="training_iteration",
reward_attr="mean_accuracy"
```
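
For intuition about what a HyperBand-style scheduler does with these two fields, here is a toy version of synchronous successive halving, the idea HyperBand builds on: train every configuration a little, keep the best fraction, and repeat. It is a simplified illustration with hypothetical names (`successive_halving`, `toy_accuracy`), not Tune's `AsyncHyperBandScheduler`, which makes its stopping decisions asynchronously instead of waiting for every trial.

```python
def toy_accuracy(config, iterations):
    """Fake metric: improves with training time, best when lr is near 0.03."""
    return iterations / (iterations + 1.0) - abs(config["lr"] - 0.03)


def successive_halving(configs, evaluate, rounds=3, keep_fraction=0.5):
    """Repeatedly evaluate the survivors and keep the top `keep_fraction`."""
    survivors = list(configs)
    for round_index in range(rounds):
        iterations = round_index + 1  # plays the role of the time attribute
        scored = [(evaluate(c, iterations), c) for c in survivors]
        # Sort by the reward only; higher is better.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        keep = max(1, int(len(scored) * keep_fraction))
        survivors = [c for _, c in scored[:keep]]
    return survivors


configs = [{"lr": lr} for lr in (0.001, 0.005, 0.01, 0.02, 0.03, 0.05, 0.08, 0.1)]
print(successive_halving(configs, toy_accuracy))
```

Here `iterations` corresponds to `time_attr` and the value returned by `toy_accuracy` corresponds to `reward_attr`: weak configurations are dropped early so that compute goes to the promising ones.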

In [None]:
from ray.tune.schedulers import AsyncHyperBandScheduler

## TODO: Follow the above instructions
configuration.spec["num_samples"] = 1

In [None]:
tune.run_experiments(configuration, scheduler=hyperband)

In [1]:
from IPython.display import HTML
HTML(open("input.html").read())

## (Optional) Try using a search algorithm

Tune is an execution layer, so we can combine powerful optimizers such as HyperOpt (https://github.com/hyperopt/hyperopt) with state-of-the-art algorithms such as HyperBand without modifying any model training code.

TODO:

1. Create a HyperOptSearch object and run an experiment combining both the previously created `hyperband` scheduler and this Search algorithm. Use the given search space.

In [None]:
from hyperopt import hp
from ray.tune.suggest import HyperOptSearch
space = {
    "lr": hp.uniform("lr", 0.001, 0.1),
    "momentum": hp.uniform("momentum", 0.1, 0.9),
    "hidden": hp.quniform("hidden", 32, 512, 1),
    "dropout1": hp.uniform("dropout1", 0.2, 0.8),
}

## TODO: Create a HyperOptSearch object

## TODO: Pass in the object to Tune.
tune.run_experiments(
    configuration, scheduler=hyperband)

## (Optional) Fault Tolerance

In [None]:
import os

class Model(tune.Trainable):
    def _setup(self):
        vars(args).update(self.config)  # overlay this trial's hyperparameters
        self.model = make_model(args)
        self.data_generator = load_data()
    
    def _train(self):
        # Train on one batch per call and report its accuracy.
        x_batch, y_batch = next(self.data_generator)
        result = self.model.fit(x_batch, y_batch, verbose=0)
        return {"mean_accuracy": result.history["acc"][0]}
    
    def _save(self, checkpoint_dir):
        checkpoint_path = os.path.join(checkpoint_dir, "weights.h5")
        self.model.save_weights(checkpoint_path)
        return checkpoint_path  # returned so it can later be given to _restore
    
    def _restore(self, checkpoint_path):
        self.model.load_weights(checkpoint_path)
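
The `_save` and `_restore` pair is what allows training state to be written to disk and loaded back. The standalone sketch below mirrors those method names on a plain Python class to show a save-then-restore round trip, with a JSON file standing in for model weights. It does not subclass `tune.Trainable`, and `CounterModel` is hypothetical.

```python
import json
import os
import tempfile


class CounterModel:
    """Plain-Python stand-in whose 'weights' are a single step counter."""

    def __init__(self):
        self.steps = 0

    def _train(self):
        self.steps += 1
        return {"training_steps": self.steps}

    def _save(self, checkpoint_dir):
        checkpoint_path = os.path.join(checkpoint_dir, "state.json")
        with open(checkpoint_path, "w") as f:
            json.dump({"steps": self.steps}, f)
        return checkpoint_path  # the path is what gets handed to _restore

    def _restore(self, checkpoint_path):
        with open(checkpoint_path) as f:
            self.steps = json.load(f)["steps"]


original = CounterModel()
for _ in range(3):
    original._train()

with tempfile.TemporaryDirectory() as checkpoint_dir:
    path = original._save(checkpoint_dir)
    resumed = CounterModel()
    resumed._restore(path)

print(resumed.steps)
```

A fresh instance that restores from the checkpoint continues from the saved step count instead of starting over, which is the property fault tolerance relies on.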

In [None]:
ray.init(ignore_reinit_error=True)
configuration = tune.Experiment(
    "experiment_name",
    stop={"mean_accuracy": 0.99},
    run=Model,
    config={
        "lr": lambda spec: np.random.uniform(0.001, 0.1),
        "momentum": lambda spec: np.random.uniform(0.1, 0.9),
        "hidden": lambda spec: np.random.randint(32, 512),
        "dropout1": lambda spec: np.random.uniform(0.2, 0.8),
    },
    checkpoint_at_end=True
)
tune.run_experiments(configuration)