
# Optimizing Hyperparams

seba-1511 edited this page Mar 19, 2018 · 7 revisions


**Warning:** This tutorial is not yet complete, and there is an open issue about it.

Examples covering the functionality discussed here are available in the examples folder. (here and here)

# Overview

In this brief tutorial, we'll see how to use the hyperparameter tuning utilities. We'll proceed in two steps.

1. We describe randopt's programmatic API for random, grid, and evolutionary search.

2. We introduce `ropt.py`, a utility that enables the automatic optimization of hyperparameters via the command line.

# Programmatic API

Following the simple example presented in the package overview, we'll keep working on the 2-dimensional quadratic.

`f = lambda x, y: x**2 + y**2`

However, let us now add parameters to our experiment.

```python
import randopt as ro

exp = ro.Experiment('simple_example', params={
    'x': ro.Gaussian(0.0, 0.6),
    'y': ro.Uniform(-0.3, 0.3),
})
```

Here we've defined two parameters, `x` and `y`, and specified for each what we believe is a good search distribution. We've done this by instantiating the Gaussian and Uniform samplers. Samplers are a key functionality of randopt, and the next sub-section explains them in detail.

## Samplers

Each sampler implements a specific probability distribution, determined by its class and parameters. Currently, randopt supports all of the basic probability distributions defined in Python's `random` module:

• `BetaVariate(alpha=1, beta=1)`
• `ExpoVariate(lam)`
• `Gaussian(mean=0.0, std=1.0)`
• `LognormVariate(mean=0.0, std=1.0)`
• `Normal(mean=0.0, std=1.0)`
• `ParetoVariate(alpha=1.0)`
• `Uniform(low, high)`
• `WeibullVariate(alpha, beta)`

as well as some special samplers:

• `Choice(items, sampler)`: given a list `items`, samples the index of the returned element via the provided `sampler`. (Default: Uniform)
• `Constant(value)`: always returns `value` upon sampling.
• `Truncated(sampler, low, high)`: samples a value according to `sampler`. If the value is less than `low` or greater than `high`, returns `low` or `high`, respectively.

In addition to `.sample()`, each sampler can be seeded via `.seed(seed)`, and the state of the underlying random number generator can be saved and restored via `get_state()` and `set_state(state)`, respectively.

**Note:** Upon import, randopt forks Python's random number generator and seeds it with a cryptographically random number obtained via `urandom()`. This means that as long as you seed your random number generator after importing randopt, the samplers will generate different numbers at every run. Conversely, if you want the samplers to behave deterministically, you should seed them before using them.

## Random Search

Now that we understand how samplers work, let us put them to use. Once samplers are passed to an experiment, you have two options.

1. Call `exp.sample_all_params()` to sample all provided parameters.
2. Call `exp.sample('x')` and `exp.sample('y')` to sample `x` and `y` individually.

For brevity, we'll use the former. At this point, we can easily access the sampled values via `exp.x` and `exp.y` to compute and record our result.

```python
exp.sample_all_params()
result = f(exp.x, exp.y)
exp.add_result(result)
```

Notice that when calling `add_result(result)`, randopt automatically fetches the last sampled values for `x` and `y` and includes them in the JSON summary.

If we now wanted to perform random search, we could simply re-run this program several times, or wrap the last snippet in a `for` loop.

## Grid Search

While random search is a powerful tool, sometimes it makes more sense to start by performing an exhaustive grid search. To do so, randopt provides the `GridSearch` wrapper.

```python
exp = ro.Experiment('simple_example', params={
    'x': ro.Choice([1, 2, 3]),
    'y': ro.Choice([-1, 1]),
})
grid = ro.GridSearch(exp)
```

Notice that since we now want to go through a grid of parameters, all samplers need to be instances of `Choice`.

`grid` behaves exactly like an experiment object, except for the added `refresh_index()` method. This method recomputes, from the existing JSON summaries, the number of times each possible configuration has been run. It is called automatically by `grid.sample_all_params()`, but not by `grid.sample('x')`.

Once the grid index is computed, `GridSearch` returns the configuration that has been executed the fewest times. To see this, run the following snippet 6 times consecutively.

```python
grid.sample_all_params()
grid.add_result(f(grid.x, grid.y))
```

You'll notice by inspecting your JSON summaries that indeed each possible combination of parameters from the samplers has been executed exactly once. If you were to run these lines again, the search would restart from the first configuration of the grid.

## Evolutionary Search

To further fine-tune parameters, randopt also implements a special case of evolutionary search, best described as follows.

1. Select the best `n` configurations from existing JSON summaries according to a `fitness` function.
2. Of these `n` configurations, choose one uniformly and designate it as the parent.
3. Sample small perturbations and add them to the parent's configuration.
4. Run and record this configuration.

To achieve the above, randopt exposes the `Evolutionary` wrapper.

```python
exp = ro.Experiment('simple_example', params={
    'x': ro.Gaussian(0.0, 0.4),
    'y': ro.Gaussian(0.0, 0.001),
})
fitness = lambda res1, res2: res1.result <= res2.result
evo = ro.Evolutionary(exp, elite_size=3, fitness=fitness)
```

In this case, the samplers passed to `Experiment` are used to generate the perturbations of each parameter. We also define a custom `fitness` function, which is used to select, from the previous results, the parent population with the smallest results (i.e. a minimization problem).
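To make the comparator concrete, here is a self-contained sketch of the same idea (the `Record` type below is a stand-in for randopt's JSON-summary objects, which expose a `.result` attribute):

```python
from collections import namedtuple

# Stand-in for a JSON-summary record with a .result attribute.
Record = namedtuple('Record', ['result'])

# Minimization: res1 is at least as fit as res2 when its result is smaller.
fitness_min = lambda res1, res2: res1.result <= res2.result
# Maximization simply flips the comparison.
fitness_max = lambda res1, res2: res1.result >= res2.result

assert fitness_min(Record(0.1), Record(0.5))   # smaller result wins
assert fitness_max(Record(0.5), Record(0.1))   # larger result wins
```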

Again, this wrapper behaves like an experiment, with the additional `sample_parent()` method. Just as `refresh_index()` is for `GridSearch`, `sample_parent()` is automatically invoked when calling `sample_all_params()`, but not with `sample('x')`. Moreover, the `Evolutionary` wrapper requires some JSON summaries to exist, so we'll have to create one first.

```python
exp.x = 1
exp.y = 1
result = f(exp.x, exp.y)
exp.add_result(result)

evo.sample_all_params()
result = f(evo.x, evo.y)
evo.add_result(result)
```

# ropt.py


If our experiment can take its parameters as command-line arguments (e.g. by using commandr), then we can use `ropt.py`.

The command has the following form:

`ROPT_ARG1=value1 ROPT_ARG2=value2 ropt.py my_experiment_command --arg1='Sampler1()' --arg2='Sampler2()'`

where the `ROPT_ARG` environment variables can be:

• `ROPT_NSEARCH`: the number of searches to run. (Default is infinity)
• `ROPT_NAME`: name of the experiment. (No default)
• `ROPT_DIR`: directory of the experiment. (Default is `randopt_results`)
• `ROPT_TYPE`: the type of search to perform. (Default is `Experiment`, i.e. random search)

**Note:** The sampler specifications must contain no spaces, must be surrounded by quotes (single or double), and must be passed via `=`.

The following sub-sections reproduce the searches performed above with the programmatic API.

## Random Search

`ROPT_NSEARCH=15 ropt.py simple_example.py --x='Gaussian(0.0,0.6)' --y='Uniform(-0.3,0.3)'`

## Grid Search

`ROPT_NSEARCH=6 ROPT_TYPE=GridSearch ROPT_NAME=simple_example ropt.py simple_example.py --x='Choice([1,2,3])' --y='Choice([-1,1])'`

## Evolutionary Search

`ROPT_NSEARCH=6 ROPT_TYPE=Evolutionary ROPT_NAME=simple_example ropt.py simple_example.py --x='Gaussian(0.0,0.4)' --y='Gaussian(0.0,0.001)'`