## The Basics
The `hypertune` package is designed for easy hyperparameter tuning of Machine Learning algorithms. It allows for a user to specify an algorithm's parameters as `hypertune` `Parameter`s. `hypertune` has six types of `Parameter`s: `ConstantParameter`, `ContinuousParameter`, `DiscreteParameter`, `TupleParameter`, `CategoricalParameter`, and `ObjectParameter`. Each of these specify unique behavior and types. For a more detailed explaination, please see the notebook on Understanding `hypertune` `Parameter`s.

The purpose of this notebook is to demonstrate how one can use `hypertune` in their existing implementations. `hypertune`'s `HyperTune` object is responsible for tuning your algorithm's hyperparameters by passing it: 
* `algorithm` - the non-initialized algorithm, 
* `parameters` - a list of `Parameter`s,
* `train_func`, `objective_func` - the training and objective functions or methods, 
* `train_func_args=None` - the arguments that will be passed to `train_func`. In most use cases this will be set to `(X_train, y_train)`,
* `objective_func_args=None` - the arguments that will be passed to `objective_func`. In most use cases this will be set to `(X_test, y_test)`,
* `max_evals=100` - the number of iterations,
* `optimizer=optimizers.PSO()` - the `HyperTune.Optimizer`,
* `maximize=True` - if the problem is maximization or minimization, and
* `num_replications=1` - the number of replications.

As an example, lets consider this hypothetical Machine Learning algorithm:

In [17]:
import hypertune as ht
import numpy as np


class AI:
    def __init__(self, eta, max_iter=100):
        self.eta = eta
        self.max_iter = max_iter

    def train(self, *args, **kwargs):
        pass

    def accuracy(self, *args, **kwargs):
        return np.random.rand()

Here, AI is the target machine learning algorithm which has the parameters `eta` and `max_iter`. The methods `train()` and `accuracy()` are the desired training and objective functions, respectivley. `accuracy()` returns a random number as `AI` is a fabricated example.

Defining `AI`'s hyperparameters is simply done by creating surrogate definitions for them that `hypertune` can decode:

In [18]:
eta = ht.ContinuousParameter(name='eta', lower_bound=1e-5, upper_bound=1e-1)
max_iter = ht.DiscreteParameter(name='max_iter', lower_bound=10, upper_bound=1e3)

We defined `eta` as a `ContinuousParameter` so `HyperTune` can decode it to a continuous value. Similarly, we define `max_iter` as a `DiscreteParameter` so it is decoded to a discrete value. Tuning `eta` and `max_iter` is done simply by defining a `HyperTune` object and calling its `tune()` method:

In [19]:
hypers = [eta, max_iter]

tuner = ht.HyperTune(algorithm=AI, parameters=hypers, train_func=AI.train, objective_func=AI.accuracy)

results = tuner.tune()
print(results)

[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613 0.94066613 0.94066613 0.94066613 0.94066613 0.94066613
 0.94066613 0.94066613 0.94066613 0.94066613]
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613 0.94066613 0.94066613 0.94066613 0.94066613 0.94066613
 0.94066613 0.94066613 0.94066613 0.94066613]
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.94066613] 0.9406661298000178
[0.940

We see that we were able to maximize our `accuracy()` function to `0.999` using `eta = 0.0634` and `max_iter = 821`.
Additionally as a note, defining hypers as `[eta, max_iter]` is equivilant to defining it as `[max_iter, eta]`:

In addition to using `AI`'s class methods for training and determning performance, `HyperTune` can take functions external to the class as so:

In [20]:
def mse(algo, *args, **kwargs):
    return np.random.rand()

tuner = ht.HyperTune(algorithm=AI, parameters=hypers, train_func=AI.train, objective_func=mse, maximize=False)

results = tuner.tune()
print(results)

[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603 0.52880603 0.52880603 0.52880603 0.52880603 0.52880603
 0.52880603 0.52880603 0.52880603 0.52880603]
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603 0.52880603 0.52880603 0.52880603 0.52880603 0.52880603
 0.52880603 0.52880603 0.52880603 0.52880603]
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.52880603] 0.5288060316654928
[0.528

We see that we were able to minimize `mse()` to `0.002` using `eta = 0.009` and `max_iter = 413` by setting `maximize = False`.

Furthermore, `HyperTune` can handle algorithms with default parameters. Meaning, as `AI.max_iter` has a default value of `100` you need not to specify it as a `Parameter`:

In [21]:
hypers = [eta]
tuner = ht.HyperTune(AI, hypers, AI.train, AI.accuracy)

results = tuner.tune()
print(results)

[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601 0.2500601 0.2500601 0.2500601 0.2500601 0.2500601 0.2500601
 0.2500601 0.2500601 0.2500601]
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601 0.2500601 0.2500601 0.2500601 0.2500601 0.2500601 0.2500601
 0.2500601 0.2500601 0.2500601]
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.2500600977926756
[0.2500601] 0.25006

Here we were able to maximize `accuracy()` to `0.988` by setting `eta = 0.061`. 

For an explicit example, lets take a look at Scikit-learn's `MLPClassifier`. We first need to define a dummy dataset:

In [27]:
# create 500 data instances each with 3 features and a discrete label either `0` or `1`.
X = np.random.rand(500, 3)
y = np.random.rand(500)

y[y > 0.5], y[y <= 0.5] = 1, 0

# (+/-) to make it more linearly seperable
X[y > 0.5] += 0.25
X[y <= 0.5] -= 0.25

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]
train = X_train, y_train
test = X_test, y_test

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

(400, 3) (100, 3) (400,) (100,)


In [28]:
from sklearn.neural_network import MLPClassifier

learning_rate_init = ht.ContinuousParameter('learning_rate_init', lower_bound=1e-5, upper_bound=0.1)
max_iter = ht.DiscreteParameter('max_iter', lower_bound=5e2, upper_bound=10e4)

hypers = [learning_rate_init, max_iter]
tuner = ht.HyperTune(MLPClassifier, hypers, MLPClassifier.fit, MLPClassifier.score, test, train)

results = tuner.tune()
print(results)


[0.9275] 0.9275
[0.93] 0.93
[0.9275] 0.9275
[0.9275] 0.9275
[0.9275] 0.9275
[0.93] 0.93
[0.9275] 0.9275
[0.935] 0.935
[0.935] 0.935
[0.93] 0.93
[0.9275 0.93   0.935  0.9275 0.9275 0.93   0.9275 0.93   0.9275 0.935 ]
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.935] 0.935
[0.9275] 0.9275
[0.93] 0.93
[0.93   0.93   0.935  0.93   0.93   0.9275 0.93   0.93   0.93   0.93  ]
[0.9275] 0.9275
[0.93] 0.93
[0.935] 0.935
[0.9275] 0.9275
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.9325] 0.9325
[0.93   0.9275 0.935  0.93   0.93   0.93   0.9325 0.93   0.93   0.9275]
[0.9275] 0.9275
[0.93] 0.93
[0.935] 0.935
[0.9275] 0.9275
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.93] 0.93
[0.9325] 0.9325
[0.935] 0.935
[0.93   0.9275 0.935  0.93   0.93   0.93   0.935  0.9325 0.93   0.9275]
[0.93] 0.93
[0.93] 0.93
[0.9275] 0.9275
[0.9275] 0.9275
[0.9275] 0.9275
[0.935] 0.935
[0.93] 0.93
[0.925] 0.925
[0.9275] 0.9275
[0.93] 0.93
[0.93   0.93   0.935  0.9275 0.9

Here we were able to maximize `MLPClassifier.fit` to `0.98` by setting `learning_rate_init = 0.0316` and `'max_iter = 47116`.