## Random tuner

This model uses preprocessed data from the preceding notebooks.

This notebook implements a random tuner.
Multiple hyper-parameters are selected at random, and a model is trained for each selection.

This notebook starts differently from the previous ones, because the data has already been preprocessed (see `6.5.1-manual_data_augmentation`).
It therefore starts by decoding the record files.
For details on the decoding helper, refer to `src/utils.py`.

The resulting datasets can be used just like the previous generator functions.

In [1]:
from os import path

import tensorflow as tf

from src.utils import decode_image_record

processed = path.join('data', 'processed')

features = {
    'image': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([2], tf.int64)
}
shape = (32, 32, 1)

def decoder(example):
    feature = tf.io.parse_single_example(example, features)
    image = tf.io.parse_tensor(feature['image'], tf.float32)
    image.set_shape(shape)
    # We only want the 'label_idx'. Not the 'angle'.
    label = feature['label'][0]

    return [image, label]


train_dataset = decode_image_record(path.join(processed, 'train.tfrecord'), decoder)
validation_dataset = decode_image_record(path.join(processed, 'validate.tfrecord'), decoder)
test_dataset = decode_image_record(path.join(processed, 'test.tfrecord'), decoder)

Model creation is now wrapped in a function.

This function receives a `HyperParameters` object, through which the tuner sets the variables randomly.

For this model the tested hyper-parameters include:

- the number of filters in the input layer (16, 32 or 64)
- the number of additional convolutional layers (0 to 3)
- the number of filters for each of them (16, 32, 64 or 128)
- whether to use pooling after each of these layers
- the number of dense layers after the convolutional layers (1 to 3)
- the number of nodes in each dense layer (16, 32, 64 or 128)
- whether to use adam, adadelta, rmsprop or sgd as optimizer
- for sgd: the learning rate, the momentum and whether to use Nesterov momentum
- for rmsprop: the learning rate


In [2]:
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import Adadelta, Adam, RMSprop, SGD

def create_model(hp):
    model = models.Sequential()
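    # Distinct name: '2**num_filter_0' is also requested by the first extra layer
    # below, and a repeated name would reuse the value registered here.
    model.add(layers.Conv2D(2**hp.Int('2**num_filter_input', 4, 6),
        (4, 4), activation='relu', input_shape=(32, 32, 1)))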

    for i in range(hp.Int('num_cnn_layers', 0, 3)):
        # Named 'filters' to avoid shadowing the built-in filter().
        filters = 2**hp.Int('2**num_filter_' + str(i), 4, 7)
        model.add(layers.Conv2D(filters, (4, 4), activation='relu', padding='same'))
        if hp.Boolean('pooling_' + str(i)):
            model.add(layers.MaxPooling2D(2, 2))

    model.add(layers.Flatten())
    for i in range(hp.Int('num_dense_layers', 1, 3)):
        nodes = 2**hp.Int('2**num_nodes_' + str(i), 4, 7)
        model.add(layers.Dense(nodes, activation='relu'))
    
    model.add(layers.Dense(3, activation='softmax'))

    optimizers = {
        'adam': Adam(),
        'adadelta': Adadelta(),
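        # 'learning_rate' replaces the deprecated 'lr' argument.
        # Both optimizers share the single 'learning_rate' hyper-parameter.
        'sgd': SGD(learning_rate=hp.Choice('learning_rate', [0.001, 0.003, 0.007, 0.01, 0.03]),
            momentum=hp.Float('momentum', 0.6, 1, 0.1),
            nesterov=hp.Boolean('nesterov')),
        'rms': RMSprop(learning_rate=hp.Choice('learning_rate', [0.001, 0.003, 0.007, 0.01, 0.03]))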
    }

    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=optimizers[hp.Choice('optimizer', list(optimizers.keys()))],
        metrics=['acc'])

    return model
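
To make the size of this search space more tangible, the following is a minimal sketch that draws configurations with the same ranges using only the standard library. It does not use keras-tuner; the function name `sample_config` and the dictionary keys are made up for illustration, and the optimizer-specific values (learning rate, momentum, Nesterov) are left out for brevity.

```python
import random


def sample_config(rng):
    """Draw one random configuration from the ranges listed above.

    Illustration only: the keys are hypothetical and not keras-tuner names.
    """
    config = {
        # Exponents 4..6 give 16, 32 or 64 filters for the input layer.
        "input_filters": 2 ** rng.randint(4, 6),
        "conv_layers": [],
        "dense_nodes": [],
    }
    # 0 to 3 additional convolutional layers, each with its own settings.
    for _ in range(rng.randint(0, 3)):
        config["conv_layers"].append({
            "filters": 2 ** rng.randint(4, 7),  # 16, 32, 64 or 128
            "pooling": rng.choice([True, False]),
        })
    # 1 to 3 dense layers with 16, 32, 64 or 128 nodes each.
    for _ in range(rng.randint(1, 3)):
        config["dense_nodes"].append(2 ** rng.randint(4, 7))
    config["optimizer"] = rng.choice(["adam", "adadelta", "sgd", "rms"])
    return config


rng = random.Random(0)  # fixed seed so the draws are repeatable
for _ in range(3):
    print(sample_config(rng))
```

Sampling the exponent instead of the value, as `create_model` does with `2**hp.Int(...)`, restricts the layer sizes to powers of two.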

Keras-tuner is not yet fully integrated with Tensorflow.
A `customTuner` is therefore created to "embed" the desired functionality, since comparing the performance of the trials is the main goal of the tuning process.

For this, the `hparams.api` plugin is loaded from Tensorboard; it allows saving the hyper-parameters stored in `trial.hyperparameters.values`.

Previously, Tensorboard was used as a callback passed to the `fit` function.
Because `fit` is not called directly when using a tuner, the Tensorboard callback is appended inside the `run_trial` function used by keras-tuner.

In [3]:
from datetime import datetime
from tensorboard.plugins.hparams import api
from kerastuner import RandomSearch
from tensorflow import summary
from tensorflow.keras.callbacks import TensorBoard

class customTuner(RandomSearch):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def run_trial(self, trial, *args, **kwargs):
        callbacks = kwargs.pop('callbacks', [])
        callbacks = self._deepcopy_callbacks(callbacks)
        trial_dir = self.get_trial_dir(trial.trial_id)
        callbacks.append(TensorBoard(trial_dir))
        kwargs['callbacks'] = callbacks
        super().run_trial(trial, *args, **kwargs)

    def on_trial_end(self, trial):
        trial_dir = self.get_trial_dir(trial.trial_id)
        # put the hparams where the metrics of tensorboard are
        hparam_dir = path.join(trial_dir, trial.trial_id, 'execution0', 'train')
        hparams = trial.hyperparameters.values
        with summary.create_file_writer(hparam_dir).as_default():
            api.hparams(hparams, trial_id=trial.trial_id)

        print(datetime.now().strftime("%Y-%m-%dT%H-%M-%S"))
        print('Remaining Trials: ' + str(self.remaining_trials))
        
        super().on_trial_end(trial)

    def on_epoch_end(self, trial, model, epoch, logs):
        trial_dir = self.get_trial_dir(trial.trial_id)
        # put the data where the metrics of tensorboard are
        hist_dir = path.join(trial_dir, trial.trial_id, 'execution0', 'train')
        with summary.create_file_writer(hist_dir).as_default():
            # model.weights yields weight tensors, not layers
            for weight in model.weights:
                summary.histogram(weight.name, data=weight, step=epoch)
        super().on_epoch_end(trial, model, epoch, logs)

The tuner is created by instantiating `customTuner`.
The tuner classes provided by keras-tuner accept a model-building function (here `create_model`).
`max_trials` limits the tuner to at most 1000 different models.
`executions_per_trial` allows training the same set of hyper-parameters more than once; the metrics of these runs are averaged, which reduces the variance of the result but increases the training time.

In [4]:
from kerastuner import HyperParameters

hp = HyperParameters()
log_dir = path.join('logs', 'srp652')
timestamp = datetime.now().strftime("%Y-%m-%dT%H-%M-%S")

tuner = customTuner(
    create_model,
    hyperparameters=hp,
    objective='acc',
    max_trials=1000,
    executions_per_trial=1,
    directory=log_dir,
    project_name=timestamp)

tuner.search_space_summary()

`search` is very similar to the Keras `fit` function.
As mentioned, passing the Tensorboard callback directly does not seem to work here, so it is appended inside `run_trial` instead.
For the tuner another callback is introduced: `EarlyStopping`, which aborts the training of a model if its training loss (the monitored metric) has not improved for 3 consecutive epochs.
This reduces the training time by dismissing unpromising models sooner.
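
The effect of `patience=3` can be illustrated with a small pure-Python sketch. It mimics the stopping rule in simplified form (no `min_delta`, no weight restoring) and is not the Keras implementation; the helper `stopped_epoch` and both loss curves are made up for illustration.

```python
def stopped_epoch(losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified rule: stop once the loss has failed to drop below the
    best value seen so far for `patience` epochs in a row.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best = loss
            wait = 0  # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical loss curves, not taken from the actual runs.
improving = [1.0, 0.8, 0.6, 0.5, 0.4]
stalling = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.5]

print(stopped_epoch(improving))  # None
print(stopped_epoch(stalling))   # 5
```

In the second curve the run is aborted at epoch 5, so the later improvement to 0.5 is never reached. This is the trade-off of a short patience: less training time, at the risk of dismissing a slow starter.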

In [5]:
from tensorflow.keras.callbacks import EarlyStopping

callbacks = [ EarlyStopping(monitor='loss', patience=3) ]

tuner.search(
    train_dataset,
    validation_data=validation_dataset,
    epochs=30,
    steps_per_epoch=100,
    validation_steps=100,
    verbose=0,
    callbacks=callbacks)

2019-11-20T04-13-36
Remaining Trials: 999
2019-11-20T04-14-07
Remaining Trials: 998
2019-11-20T04-14-37
Remaining Trials: 997
...
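
The timestamps in the output above allow a rough estimate of the total search time. The sketch below is only an extrapolation from the three printed lines; it assumes later trials take about as long as the first ones, which early stopping and larger models may well invalidate. The helper `estimated_hours` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the output above.
stamps = ["2019-11-20T04-13-36", "2019-11-20T04-14-07", "2019-11-20T04-14-37"]
times = [datetime.strptime(s, "%Y-%m-%dT%H-%M-%S") for s in stamps]

# Seconds between the ends of consecutive trials.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
mean_gap = sum(gaps) / len(gaps)


def estimated_hours(num_trials, seconds_per_trial=mean_gap):
    """Extrapolate the wall-clock time for a number of trials."""
    return num_trials * seconds_per_trial / 3600


print(gaps)                             # [31.0, 30.0]
print(round(estimated_hours(250), 1))   # 2.1
print(round(estimated_hours(1000), 1))  # 8.5
```

At roughly 30 seconds per trial, the full budget of `max_trials=1000` would take on the order of eight to nine hours.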


In this test only about 250 of the 1000 trials were run. (Only the first three entries of the output are kept above to keep the file a bit cleaner; the older ones can be found in older commits.)
This is still enough to compare the different results in the logs.
These logs can be examined to find the most promising region of the hyper-parameter space to zoom in on.