# Automation

In [1]:
import matchzoo as mz

Using TensorFlow backend.


## Tuner

### basic usage

The tuner needs a few things:
 - a model with its parameters filled in
 - preprocessed training data
 - preprocessed testing data
 
Since MatchZoo models have pre-defined hyper-spaces, the tuner can start tuning right away once you have the data ready.

#### prepare the data

In [2]:
train = mz.datasets.toy.load_data('train')
dev = mz.datasets.toy.load_data('dev')
test = mz.datasets.toy.load_data('test')
prpr = mz.models.DenseBaseline.get_default_preprocessor()
train = prpr.fit_transform(train, verbose=0)
dev = prpr.transform(dev, verbose=0)
test = prpr.transform(test, verbose=0)

#### prepare the model

In [3]:
model = mz.models.DenseBaseline()
model.params['input_shapes'] = prpr.context['input_shapes']
model.params['task'] = mz.tasks.Ranking()

#### start tuning

In [4]:
tuner = mz.auto.Tuner(params=model.params, train_data=train, test_data=dev, num_runs=5)
results = tuner.tune()

#### view the best hyper-parameter set

In [11]:
print(results['best']['params'], '\n')
print(str(tuner.metric), results['best']['score'])

model_class                   <class 'matchzoo.models.dense_baseline.DenseBaseline'>
input_shapes                  [(30,), (30,)]
task                          Ranking Task
optimizer                     adam
with_multi_layer_perceptron   True
mlp_num_units                 231
mlp_num_layers                4
mlp_num_fan_out               108
mlp_activation_func           relu 

mean_average_precision(0) 0.5


### understanding hyper-space

`model.params.hyper_space` represents the model's hyper-parameter search space, which is the cross-product of the individual hyper-parameters' hyper-spaces. When a `Tuner` builds a model, it goes through each hyper-parameter in `model.params`: if the hyper-parameter has a hyper-space, a sample is drawn from that space; otherwise, the hyper-parameter's default value is used.
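The rule described above can be sketched in plain Python. This is only an illustration and not MatchZoo's implementation: the `params` dictionary, its default values, and the `sample_params` helper are hypothetical stand-ins, and the samplers use the standard `random` module rather than hyperopt.

```python
import random

# Hypothetical stand-ins for illustration; these are not MatchZoo classes.
# Each entry maps a hyper-parameter name to (default value, sampler or None).
# The default values below are made up for this sketch.
params = {
    "optimizer": ("adam", None),
    "mlp_activation_func": ("relu", None),
    "mlp_num_units": (64, lambda rng: rng.randrange(16, 512)),
    "mlp_num_layers": (3, lambda rng: rng.randrange(1, 5)),
}


def sample_params(params, rng):
    """Draw from the hyper-space where one exists, else keep the default."""
    chosen = {}
    for name, (default, sampler) in params.items():
        chosen[name] = sampler(rng) if sampler is not None else default
    return chosen


rng = random.Random(0)
for _ in range(3):
    print(sample_params(params, rng))
```

In every draw, `optimizer` and `mlp_activation_func` keep their defaults, while the two entries that carry a sampler change from draw to draw.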

In [12]:
model.params.hyper_space

{'mlp_num_units': <hyperopt.pyll.base.Apply at 0x13c210e48>,
 'mlp_num_layers': <hyperopt.pyll.base.Apply at 0x13c210588>,
 'mlp_num_fan_out': <hyperopt.pyll.base.Apply at 0x13c15eba8>}

In a `DenseBaseline` model, only `mlp_num_units`, `mlp_num_layers`, and `mlp_num_fan_out` have pre-defined hyper-spaces. In other words, only these hyper-parameters change values during tuning. Other hyper-parameters, like `mlp_activation_func`, are fixed and will not change.

In [7]:
for _ in range(3):
    sample = mz.hyper_spaces.sample(model.params.hyper_space)
    print('if a tuner sample:', sample, '\n')
    for key, value in sample.items():
        model.params[key] = value
    print('then the tuner will build a model with:\n')
    print(model.params, '\n\n\n')

if a tuner sample: {'mlp_num_fan_out': 60.0, 'mlp_num_layers': 2.0, 'mlp_num_units': 489.0} 

then the tuner will build a model with:

model_class                   <class 'matchzoo.models.dense_baseline.DenseBaseline'>
input_shapes                  [(30,), (30,)]
task                          Ranking Task
optimizer                     adam
with_multi_layer_perceptron   True
mlp_num_units                 489
mlp_num_layers                2
mlp_num_fan_out               60
mlp_activation_func           relu 



if a tuner sample: {'mlp_num_fan_out': 8.0, 'mlp_num_layers': 5.0, 'mlp_num_units': 402.0} 

then the tuner will build a model with:

model_class                   <class 'matchzoo.models.dense_baseline.DenseBaseline'>
input_shapes                  [(30,), (30,)]
task                          Ranking Task
optimizer                     adam
with_multi_layer_perceptron   True
mlp_num_units                 402
mlp_num_layers                5
mlp_num_fan_out               8
mlp_activation_func           relu

This is similar to how a tuner samples model hyper-parameters, with one key difference: a tuner's hyper-space is **suggestive**. The sampling process in a tuner is not truly random but skewed. Scores of past samples affect future choices: a tuner with more runs knows its hyper-space better and takes samples in a way that is likely to yield better scores.

For more details, consult the tuner's backend, [hyperopt](http://hyperopt.github.io/hyperopt/), and the search algorithm the tuner uses, [Tree of Parzen Estimators (TPE)](https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf).
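To make "suggestive" concrete, here is a deliberately simplified sketch of score-guided sampling. It is not TPE and not what hyperopt does internally: it swaps in a basic explore/exploit rule that either samples uniformly or perturbs the best value seen so far. The `score` objective, the `suggest` helper, and the `explore` and `width` parameters are all hypothetical.

```python
import random


def score(num_units):
    # Hypothetical objective: pretend 300 units is ideal (higher is better).
    return -abs(num_units - 300)


def suggest(history, rng, low=16, high=512, explore=0.3, width=40):
    """Pick the next value, skewed towards the best value seen so far."""
    if not history or rng.random() < explore:
        return rng.randrange(low, high)  # explore: plain uniform draw
    best_value, _ = max(history, key=lambda pair: pair[1])
    candidate = best_value + rng.randint(-width, width)
    return min(max(candidate, low), high - 1)  # clamp into [low, high)


rng = random.Random(0)
history = []  # list of (value, score) pairs from past runs
for _ in range(30):
    value = suggest(history, rng)
    history.append((value, score(value)))

best_value, best_score = max(history, key=lambda pair: pair[1])
print(best_value, best_score)
```

The point of the sketch is only that later draws depend on earlier scores, so more runs tend to concentrate samples in the better-scoring region of the space.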

The hyper-space of a single hyper-parameter can be printed in a human-readable format.

In [23]:
print(model.params.get('mlp_num_units').hyper_space)

quantitative uniform distribution in  [16, 512), with a step size of 1
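
A quantized uniform draw of this kind can be sketched with the standard library. hyperopt documents `hp.quniform(label, low, high, q)` as returning a value like `round(uniform(low, high) / q) * q`; the `quniform` helper below is a hypothetical stand-in that mirrors that formula and is not hyperopt's or MatchZoo's code.

```python
import random


def quniform(rng, low, high, q):
    """Quantized uniform draw: round(uniform(low, high) / q) * q, as a float."""
    return float(round(rng.uniform(low, high) / q) * q)


rng = random.Random(0)
samples = [quniform(rng, 16, 512, 1) for _ in range(5)]
print(samples)

# The draws are floats with whole-number values, so cast before using one
# as a layer size.
num_units = int(samples[0])
print(num_units)
```

This matches the float values such as `489.0` in the earlier sample output. Because of the rounding step, a draw can land exactly on the upper bound, which may be why `mlp_num_layers` shows `5.0` in one of the earlier samples.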


In [24]:
model.params.to_frame()[['Name', 'Hyper-Space']]

| | Name | Hyper-Space |
|---|---|---|
| 0 | model_class | |
| 1 | input_shapes | |
| 2 | task | |
| 3 | optimizer | |
| 4 | with_multi_layer_perceptron | |
| 5 | mlp_num_units | quantitative uniform distribution in [16, 512... |
| 6 | mlp_num_layers | quantitative uniform distribution in [1, 5), ... |
| 7 | mlp_num_fan_out | quantitative uniform distribution in [4, 128)... |
| 8 | mlp_activation_func | |


### advanced usage