# Tuning

## Defining a Tuning Problem

A tuning problem is the search for the configuration of arguments, or hyperparameters, that optimizes the score produced by evaluating a function.
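
As a minimal sketch of this definition, the snippet below scores a handful of hand-picked configurations with a toy function and keeps the best one. The `score` function and the candidate values are hypothetical and only illustrate the idea; a tuner automates the choice of which configurations to try.

```python
# Toy scoring function (hypothetical): higher is better, peak at x=3, y=-1.
def score(x, y):
    return -((x - 3) ** 2) - (y + 1) ** 2


# A few candidate configurations chosen by hand for illustration.
candidates = [
    {'x': 0, 'y': 0},
    {'x': 3, 'y': -1},
    {'x': 5, 'y': 2},
]

# The tuning problem: find the configuration with the best score.
best = max(candidates, key=lambda params: score(**params))
print(best, score(**best))
```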


## What is a Hyperparameter?

A hyperparameter is any of the arguments that can be optimized in a tuning problem.
Hyperparameters can be of different types, and each one can be defined with a set of
constraints on the values that it can take.

In BTB, hyperparameters are represented using a family of classes called HyperParams.
This is the list of the HyperParams that are currently implemented in BTB:

- `BooleanHyperParam`: boolean parameters, e.g. `True` or `False`.
- `CategoricalHyperParam`: categorical parameters, e.g. `"foo"` or `"bar"`.
- `FloatHyperParam`: `float` parameters, e.g. values in the range `0.0 - 1.0`.
- `IntHyperParam`: `int` parameters, e.g. values in the range `0 - 1`.

## Creating a HyperParam

### BooleanHyperParam

The `BooleanHyperParam` is used for parameters that represent boolean values.
This HyperParam accepts the following argument:

- `default`: default value for the hyperparameter. Defaults to `False`.

In [1]:
import warnings

warnings.filterwarnings('ignore')

In [2]:
from btb.tuning.hyperparams import BooleanHyperParam

bool_hp = BooleanHyperParam(default=True)

### CategoricalHyperParam

The `CategoricalHyperParam` is used for parameters that use categorical values.
This HyperParam accepts the following arguments:

- `choices`: list of values that the hyperparameter can take.
- `default`: default value for the hyperparameter. Defaults to the first item in ``choices``.

In [3]:
from btb.tuning.hyperparams import CategoricalHyperParam

values = ['a', 'b', 'c']
categorical_hp = CategoricalHyperParam(choices=values, default='b')

### FloatHyperParam

The `FloatHyperParam` is used for parameters that use `float` values.
This HyperParam accepts the following arguments:

- `min` (float): minimum value that this hyperparameter can take. Defaults to ``None``, in which case the system's minimum float value is used.
- `max` (float): maximum value that this hyperparameter can take. Defaults to ``None``, in which case the system's maximum float value is used.
- `default` (float): default value for the hyperparameter. Defaults to ``self.min``.
- `include_min` (bool): whether or not to include the minimum value. Defaults to ``True``.
- `include_max` (bool): whether or not to include the maximum value. Defaults to ``True``.

In [4]:
from btb.tuning.hyperparams import FloatHyperParam

float_hp = FloatHyperParam(min=0, max=1, default=0.5)

### IntHyperParam

The `IntHyperParam` is used for parameters that use `int` values.
This HyperParam accepts the following arguments:

- `min` (int): minimum value that this hyperparameter can take. Defaults to ``None``, in which case the system's minimum int value is used.
- `max` (int): maximum value that this hyperparameter can take. Defaults to ``None``, in which case the system's maximum int value is used.
- `default` (int): default value for the hyperparameter. Defaults to ``self.min``.
- `step` (int): increment between consecutive values that can be sampled. Defaults to 1.
- `include_min` (bool): whether or not to include the minimum value. Defaults to ``True``.
- `include_max` (bool): whether or not to include the maximum value. Defaults to ``True``.

In [5]:
from btb.tuning.hyperparams import IntHyperParam

int_hp = IntHyperParam(min=1, max=10, default=5, include_min=False, include_max=True)
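
To make the effect of `step`, `include_min` and `include_max` more concrete, here is a plain-Python sketch of one plausible reading of these arguments. It is not BTB's implementation: the helper `valid_int_values` is hypothetical, and it assumes that an excluded bound simply shifts the range inward by one.

```python
def valid_int_values(minimum, maximum, step=1, include_min=True, include_max=True):
    """List the integers an int hyperparameter could take (illustrative only)."""
    # Assumption: an excluded bound moves the range inward by one.
    low = minimum if include_min else minimum + 1
    high = maximum if include_max else maximum - 1
    return list(range(low, high + 1, step))


# Same bounds as the IntHyperParam created above: 1 is excluded, 10 is included.
print(valid_int_values(1, 10, include_min=False, include_max=True))
```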

## What is Tunable?

In BTB, a tuning problem is represented using the class Tunable, which consists of a collection of HyperParams that are all tuned at once to find the optimal solution to our tuning problem.

### Creating a Tunable

Tunable instances can be created in two ways:

#### Using HyperParam instances

One way of using the Tunable is to create HyperParam instances for
each one of the hyperparameters that we want to tune and pass them as a dict to the Tunable:

In [6]:
from btb.tuning.tunable import Tunable
from btb.tuning.hyperparams import (
    BooleanHyperParam, CategoricalHyperParam, IntHyperParam, FloatHyperParam)

hyperparams = {
    'bhp': BooleanHyperParam(default=False),
    'chp': CategoricalHyperParam(choices=['foo', 'bar'], default='foo'),
    'fhp': FloatHyperParam(min=0, max=1, default=0.5),
    'ihp': IntHyperParam(min=1, max=10, default=2),
}

tunable = Tunable(hyperparams)

#### Using a dict representation

Alternatively, the Tunable can be represented as a dictionary with all the details of each hyperparameter specified, which can then be stored as a JSON file or in another non-Python format.

In this Python dictionary, each key is the name given to a hyperparameter and each value is a dictionary containing the following keys:

- `type` (str): ``bool`` for ``BooleanHyperParam``, ``int`` for ``IntHyperParam``, ``float`` for ``FloatHyperParam``, ``str`` for ``CategoricalHyperParam``.
- `range` or `values` (list): range or values that this hyperparameter can take. For a ``CategoricalHyperParam`` these are used as the ``choices``; for numerical HyperParams (``IntHyperParam`` and ``FloatHyperParam``) the smallest value is used as the ``min`` and the largest value as the ``max``.
- `default` (str, bool, int, float or None): the default value for the hyperparameter.

Once this dict is written, it can be passed to the `Tunable.from_dict` method.

The Tunable created previously can be recreated using the following dictionary:

In [7]:
hyperparams = {
    'bhp': {
        'type': 'bool',
        'default': False
    },
    'chp': {
        'type': 'str',
        'values': ['foo', 'bar'],
        'default': 'foo'
    },
    'fhp': {
        'type': 'float',
        'values': [0, 1],
        'default': 0.5
    },
    'ihp': {
        'type': 'int',
        'values': [1, 10],
        'default': 2
    }
}

tunable = Tunable.from_dict(hyperparams)
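
Because the dict representation only contains plain Python types, it can be stored as JSON, as mentioned above. The following sketch uses only the standard `json` module to round-trip a specification like the one above; the variable name `spec` is just for illustration, and the final call to `Tunable.from_dict` is left as a comment.

```python
import json

spec = {
    'chp': {'type': 'str', 'values': ['foo', 'bar'], 'default': 'foo'},
    'ihp': {'type': 'int', 'values': [1, 10], 'default': 2},
}

# Serialize the specification to a JSON string (it could equally be written to a file).
serialized = json.dumps(spec, indent=4)

# Load it back; the result is a plain dict with the same contents.
restored = json.loads(serialized)
print(restored == spec)

# The restored dict could then be passed to Tunable.from_dict(restored).
```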

## What is a Tuner?

Tuners are classes with a fit/predict/propose interface for
suggesting sets of hyperparameters. They are specifically designed
to speed up the process of selecting the optimal hyperparameter values
for a specific tuning problem.

## Using a Tuner

The **BTB** Tuners are used by following a Bayesian Optimization approach and iteratively:

* letting the tuner propose a new set of hyperparameters
* fitting and scoring the model with the proposed hyperparameters
* passing the score obtained back to the tuner

At each iteration the tuner uses the information already obtained to propose
the set of hyperparameters that it considers most likely to obtain
the best results.
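
The propose / score / record cycle can be sketched without BTB at all. In the snippet below, `ToyRandomTuner` and `evaluate` are hypothetical stand-ins: the toy tuner proposes values uniformly at random and does not learn from the recorded scores, so it only illustrates the shape of the loop, not Bayesian Optimization itself.

```python
import random


class ToyRandomTuner:
    """Hypothetical stand-in for a tuner: proposes uniformly at random."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.trials = []  # list of (params, score) pairs

    def propose(self):
        return {'x': self._rng.uniform(-5, 5)}

    def record(self, params, score):
        self.trials.append((params, score))


def evaluate(params):
    # Toy objective (hypothetical): the best possible score is 0, at x = 2.
    return -(params['x'] - 2) ** 2


toy_tuner = ToyRandomTuner()
for _ in range(20):
    params = toy_tuner.propose()       # let the tuner propose
    score = evaluate(params)           # score the proposal
    toy_tuner.record(params, score)    # pass the score back to the tuner

best_params, best_score = max(toy_tuner.trials, key=lambda trial: trial[1])
print(best_params, best_score)
```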

### Creating a Tuner

We will be using a `GPTuner` that accepts the following arguments:

- `tunable` (btb.tuning.tunable.Tunable): Instance of a tunable class containing hyperparameters to be tuned.
- `num_candidates` (int): Number of candidates to generate for each proposal, from which the best ones are selected. Defaults to 1000.
- `maximize` (bool): If ``True``, the tuner treats higher scores as better; if ``False``, lower scores are better. Defaults to ``True``.
- `min_trials` (int): Number of recorded ``trials`` needed before the model is fitted. Defaults to 2.

*Bear in mind* that `tunable` is a required argument in order to create a `Tuner`.

In [8]:
from btb.tuning import FloatHyperParam, IntHyperParam, Tunable
from btb.tuning.tuners import GPTuner

tunable = Tunable({
    'fhp': FloatHyperParam(min=0, max=1),
    'ihp': IntHyperParam(min=1, max=10)
})

tuner = GPTuner(tunable)

### Propose

This method will propose one or more new hyperparameter configuration(s)
by using the following approach:

1. Create `num_candidates` amount of candidates.
2. Use acquisition function to select the best candidates.
3. Return the best selected candidate(s) to be evaluated.

This method accepts the following arguments:

- `n` (int): Number of proposals to return. Defaults to 1.
- `allow_duplicates` (bool): If ``False``, the tuner will only propose trials that have not been recorded yet; otherwise, the proposed trials may repeat previous ones. Defaults to ``False``.


In [9]:
proposal = tuner.propose()
proposal

{'fhp': 0.7486822859551803, 'ihp': 7}
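
The three steps listed above can be sketched in plain Python as follows. This is not BTB's code: the function `propose_sketch`, the one-dimensional search space `[0, 1]` and the nearest-neighbour stand-in for the acquisition function are all assumptions made for illustration, where a real tuner would query its meta-model instead.

```python
import random


def propose_sketch(trials, num_candidates=1000, n=1, seed=0):
    """Illustrative candidate-based proposal; not BTB's actual code."""
    rng = random.Random(seed)

    # 1. Create `num_candidates` random candidates in the search space [0, 1].
    candidates = [rng.uniform(0, 1) for _ in range(num_candidates)]

    # 2. Rank them with a stand-in acquisition function: the score of the
    #    nearest recorded trial.
    def acquisition(x):
        nearest = min(trials, key=lambda trial: abs(trial[0] - x))
        return nearest[1]

    ranked = sorted(candidates, key=acquisition, reverse=True)

    # 3. Return the best `n` candidates to be evaluated.
    return ranked[:n]


# Hypothetical recorded trials as (value, score) pairs.
recorded = [(0.1, 0.2), (0.5, 0.9), (0.9, 0.4)]
proposals = propose_sketch(recorded, n=2)
print(proposals)
```

With these made-up trials, the proposals land near `0.5`, the recorded value with the highest score.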

### Record

This method will record the result of one or more trials. Then it will
re-fit the meta-model (if `min_trials` is reached) in order to generate *posterior* proposals:

1. Append trial to internal results store.
2. Re-fit meta-model if the `min_trials` is reached.

*Bear in mind* that the proposals that we want to record must have the same parameter names as the tunable.

In [10]:
score = 0.5
tuner.record(proposal, score)
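
The role of `min_trials` in the record step can be illustrated with a small stand-alone class. `ToyRecorder` is hypothetical and its `_fit` method is only a placeholder for training a meta-model; the sketch just shows that fitting starts once enough trials have been recorded.

```python
class ToyRecorder:
    """Hypothetical sketch of the record step; not BTB's actual code."""

    def __init__(self, min_trials=2):
        self.min_trials = min_trials
        self.trials = []
        self.fitted = False

    def record(self, params, score):
        # 1. Append the trial to the internal results store.
        self.trials.append((params, score))

        # 2. Re-fit the meta-model once enough trials have been recorded.
        if len(self.trials) >= self.min_trials:
            self._fit()

    def _fit(self):
        # Placeholder: a real tuner would train its meta-model here.
        self.fitted = True


recorder = ToyRecorder(min_trials=2)
recorder.record({'fhp': 0.7, 'ihp': 7}, 0.5)
fitted_after_one = recorder.fitted   # only one trial so far
recorder.record({'fhp': 0.2, 'ihp': 3}, 0.6)
fitted_after_two = recorder.fitted   # min_trials reached
print(fitted_after_one, fitted_after_two)
```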

### Tuning loop example

The tuners are meant to be used in a loop that performs the following three steps over and over:

1. Propose.
2. Score the proposal.
3. Record the proposal.

In this example we will use the [wine dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_wine.html#sklearn.datasets.load_wine)
and tune the [SGDClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html) that will attempt to solve it.

First, we import the dataset, the estimator and the `train_test_split` function to generate the train / test samples from the dataset.

In [11]:
from sklearn.datasets import load_wine
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split


dataset = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.3, random_state=0)

Now that we have our dataset ready, we will create the tunable object for our tuner that will
contain hyperparameters for the estimator.

In [12]:
tunable = Tunable({
    "alpha": FloatHyperParam(min=0.0001, max=1, default=0.0001),
    "max_iter": IntHyperParam(min=1, max=5000, default=1000),
    "tol": FloatHyperParam(min=1e-3, max=1, default=1e-3),
    "shuffle": BooleanHyperParam(default=True)
})

tuner = GPTuner(tunable)

Finally, we run the tuning loop. At each iteration the tuner proposes a
configuration for our model, we fit the model with the `X_train` and `y_train`
generated previously, score it with `X_test` and `y_test`, and record the score
that those parameters obtained. If the score is better than the best one seen
so far, we also update `best_score` and `best_params` so that the result can
be reproduced.

In [13]:
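best_score = 0
best_params = None

for _ in range(100):
    # 1. Propose a new configuration.
    proposal = tuner.propose()

    # 2. Fit and score the model with the proposed hyperparameters.
    model = SGDClassifier(random_state=0, **proposal)
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)

    # Keep track of the best configuration seen so far.
    if score > best_score:
        best_params = proposal
        best_score = score

    # 3. Record the proposal and its score.
    tuner.record(proposal, score)

print(best_score, best_params)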

0.7407407407407407 {'alpha': 0.3323834138794335, 'max_iter': 3873, 'tol': 0.9460329122584166, 'shuffle': True}


Finally, we can reproduce the score by using the `best_params`:

In [14]:
model = SGDClassifier(random_state=0, **best_params)
model.fit(X_train, y_train)
model.score(X_test, y_test)

0.7407407407407407

## Implemented tuners

**BTB** has the following three tuners available:

- [UniformTuner](https://github.com/HDI-Project/BTB/blob/master/btb/tuning/tuners/uniform.py): A Tuner that samples proposals randomly using a uniform distribution.
- [GPTuner](https://github.com/HDI-Project/BTB/blob/master/btb/tuning/tuners/gaussian_process.py): A Bayesian Tuner that optimizes proposals using a GaussianProcess metamodel.
- [GPEiTuner](https://github.com/HDI-Project/BTB/blob/master/btb/tuning/tuners/gaussian_process.py): A Bayesian Tuner that optimizes proposals using a GaussianProcess metamodel and an Expected Improvement acquisition function.


### Leaderboard

Currently we have a [Benchmarking](https://github.com/HDI-Project/BTB/tree/master/benchmark)
process that evaluates the performance of the `tuners` against each other.
These are the latest results that we obtained for the `BTB` tuners:


| tuner                   | with ties | without ties |
|-------------------------|-----------|--------------|
| `BTB.GPEiTuner`         |    **35** |            7 |
| `BTB.GPTuner`           |    33     |        **8** |
| `BTB.UniformTuner`      |    29     |            2 |
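
The table does not spell out how the two columns are computed. One plausible reading, assumed here, is that "with ties" counts every problem where a tuner reached the best score, shared or not, while "without ties" counts only the problems it won outright. The sketch below applies that reading to made-up scores; the dataset names and numbers are hypothetical and are not benchmark results.

```python
# Hypothetical best scores per dataset for each tuner (made-up numbers).
scores = {
    'dataset_a': {'GPEiTuner': 0.90, 'GPTuner': 0.90, 'UniformTuner': 0.85},
    'dataset_b': {'GPEiTuner': 0.70, 'GPTuner': 0.75, 'UniformTuner': 0.72},
    'dataset_c': {'GPEiTuner': 0.80, 'GPTuner': 0.78, 'UniformTuner': 0.79},
}

with_ties = {}
without_ties = {}
for results in scores.values():
    best = max(results.values())
    winners = [name for name, value in results.items() if value == best]
    for name in winners:
        # Assumption: a shared first place counts towards "with ties".
        with_ties[name] = with_ties.get(name, 0) + 1
    if len(winners) == 1:
        # Assumption: only outright wins count towards "without ties".
        without_ties[winners[0]] = without_ties.get(winners[0], 0) + 1

print(with_ties)
print(without_ties)
```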