# Tuners 

## Introduction

Tuners are specifically designed to speed up the process of selecting the
optimal hyperparameter values for a specific machine learning problem.

``btb.tuning.tuners`` defines Tuners: classes with a fit/predict/propose interface for
suggesting sets of hyperparameters.

This is done by following a Bayesian Optimization approach and iteratively:

* letting the tuner propose a new set of hyperparameters
* fitting and scoring the model with the proposed hyperparameters
* passing the score obtained back to the tuner

At each iteration, the tuner uses the information already obtained to propose
the set of hyperparameters that it considers most likely to obtain the best
results.
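
The three steps above can be sketched without BTB at all. The `RandomTuner` class below is a hypothetical stand-in, not a BTB class: it proposes values uniformly at random and only stores what it is told, whereas a real tuner would use the recorded scores to guide its next proposal. The `toy_score` function is likewise an assumption that stands in for fitting and scoring a model.

```python
import random


class RandomTuner:
    """Hypothetical stand-in for a tuner; it is not a BTB class."""

    def __init__(self, ranges, seed=0):
        self.ranges = ranges          # {name: (min, max)}, inclusive integer ranges
        self.trials = []              # (parameters, score) pairs seen so far
        self._rng = random.Random(seed)

    def propose(self):
        # A real tuner would consult self.trials here; this one samples blindly.
        return {name: self._rng.randint(low, high)
                for name, (low, high) in self.ranges.items()}

    def record(self, parameters, score):
        self.trials.append((parameters, score))


def toy_score(parameters):
    # Stand-in for "fit and score the model": the score peaks at x == 30.
    return -abs(parameters['x'] - 30)


tuner = RandomTuner({'x': (0, 100)})

for _ in range(20):
    parameters = tuner.propose()        # 1. propose
    score = toy_score(parameters)       # 2. fit and score
    tuner.record(parameters, score)     # 3. pass the score back

best_parameters, best_score = max(tuner.trials, key=lambda trial: trial[1])
```

Only the `propose` and `record` method names mirror the interface used later in this document; everything else in the sketch is illustrative.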

## Usage example 

In this example we will tune a
[RandomForestRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html) against the [Boston dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_boston.html). We will create a `Tunable` object with two of its
hyperparameters and a `Tuner` that proposes new
sets of hyperparameters in order to improve the score.

First we import `Tunable`, `GPTuner` and `IntHyperParam`, which we
use to create the `tuner` that will improve our estimator's score. Then we instantiate them.

In [1]:
from btb.tuning import Tunable
from btb.tuning.tuners import GPTuner
from btb.tuning.hyperparams import IntHyperParam

hyperparams = {
    'n_estimators': IntHyperParam(min=10, max=500),
    'max_depth': IntHyperParam(min=10, max=500),
}

tunable = Tunable(hyperparams)
tuner = GPTuner(tunable)

Now we import our dataset and our estimator.
Then we load the dataset and split it into train and test sets.
Note that `load_boston` was removed in scikit-learn 1.2, so this cell requires an earlier release.

In [2]:
from sklearn.datasets import load_boston
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

dataset = load_boston()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.3, random_state=0)


Now we perform the following three steps in a loop.

1. Let the Tuner propose a new set of parameters:


In [3]:
parameters = tuner.propose()
parameters

{'n_estimators': 83, 'max_depth': 393}

2. Fit and score a new model using these parameters:
 

In [4]:
model = RandomForestRegressor(**parameters)
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
score

0.8218116410433008

3. Pass the used parameters and the score obtained back to the tuner:


In [5]:
tuner.record(parameters, score)


At each iteration, the ``Tuner`` uses the information from the previous trials
to evaluate candidates and propose the set of parameter values that it estimates
is most likely to obtain the highest score.
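
One common way for a Bayesian optimizer to turn what it has learned so far into a ranking of candidates is an acquisition function such as expected improvement. The sketch below is a generic illustration and is not taken from BTB: it assumes that a surrogate model has already produced a predicted mean and standard deviation for each candidate, and the function name `expected_improvement`, its arguments and the example numbers are all hypothetical.

```python
import math


def expected_improvement(mean, std, best_score):
    """Expected amount by which a candidate beats best_score (maximization).

    Assumes the surrogate's prediction for the candidate is normally
    distributed with the given mean and standard deviation.
    """
    if std <= 0:
        # No uncertainty: the improvement is known exactly.
        return max(mean - best_score, 0.0)

    z = (mean - best_score) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best_score) * cdf + std * pdf


# Hypothetical surrogate predictions for two candidates: (mean, std).
candidates = {'a': (0.80, 0.01), 'b': (0.78, 0.10)}
ranked = sorted(candidates,
                key=lambda name: expected_improvement(*candidates[name], 0.82),
                reverse=True)
```

With these made-up numbers, candidate `b` ranks first even though its predicted mean is lower, because its larger uncertainty leaves more room to beat the current best score of 0.82. This is how such a function trades off exploring uncertain regions against exploiting promising ones.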

The loop below runs 100 tuning iterations and keeps track of the best score obtained and the parameters that produced it:

In [6]:
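best_score = 0
best_parameters = None

for i in range(100):
    parameters = tuner.propose()
    model = RandomForestRegressor(**parameters)
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)

    # Pass the result back so the tuner can use it for the next proposal
    tuner.record(parameters, score)

    if score > best_score:
        best_score = score
        best_parameters = parameters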

Once the loop finishes, we can print the `best_score` found over the 100 iterations,
along with the `best_parameters` that produced it, so that the result can be reproduced.

In [7]:
best_score

0.8445775832444316

In [8]:
best_parameters

{'n_estimators': 24, 'max_depth': 406}
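
The tuning loop above can also be wrapped in a small reusable helper. The sketch below is one possible way to do this, not part of BTB: the `tune` function and the `CyclingTuner` stub are hypothetical names, and the helper only assumes that the tuner exposes `propose()` and `record(parameters, score)` as used in this document. It starts from negative infinity rather than 0, so it still returns parameters when every score is negative.

```python
def tune(tuner, evaluate, iterations):
    """Run a propose/evaluate/record loop and return the best trial.

    `tuner` is any object exposing propose() and record(parameters, score);
    `evaluate` maps a parameters dict to a score (higher is better).
    """
    best_score = float('-inf')   # works even if every score is negative
    best_parameters = None

    for _ in range(iterations):
        parameters = tuner.propose()
        score = evaluate(parameters)
        tuner.record(parameters, score)

        if score > best_score:
            best_score = score
            best_parameters = parameters

    return best_score, best_parameters


class CyclingTuner:
    """Hypothetical stub used only to exercise tune(); not a BTB class."""

    def __init__(self, candidates):
        self.candidates = candidates
        self.recorded = []

    def propose(self):
        # Walk through the fixed candidates in order, wrapping around.
        return self.candidates[len(self.recorded) % len(self.candidates)]

    def record(self, parameters, score):
        self.recorded.append((parameters, score))


stub = CyclingTuner([{'x': 1}, {'x': 5}, {'x': 3}])
best_score, best_parameters = tune(stub, lambda p: -(p['x'] - 4) ** 2, 6)
```

In the example from this document, `evaluate` would be a function that fits a `RandomForestRegressor` with the proposed parameters and returns `model.score(X_test, y_test)`.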