# Model Tuning

## Overview

The Tuning tutorial demonstrates how to use PsiZ models with [Keras Tuner](https://keras.io/keras_tuner/) in order to select the best hyperparameters (e.g., embedding dimensionality) for a model.

This tutorial uses the `birds-16` dataset introduced in the [Beginner Tutorial](https://psiz.readthedocs.io/en/latest/src/beginner_tutorial/beginner_tutorial.html) and is divided into two parts:

1. Model Construction
2. Hypertuning

If you would like to run this notebook on your local machine, the file is available [here on PsiZ's GitHub](https://github.com/psiz-org/psiz/blob/main/docs/src/tutorials/tuning.ipynb).

## Model Construction

### Preliminaries

Let's start by importing the necessary packages, defining some tutorial settings, and preparing the `birds-16` dataset. Note that we set `max_dim = 7`, which means we will search models ranging from `n_dim=2` to `n_dim=7`.

```{note}
Keras Tuner is not automatically installed with PsiZ because there are many hyperparameter tuning packages available and user preferences vary. The next code block includes a line that can be uncommented if you would like to install `keras-tuner` in your active environment.
```

In [1]:
# %pip install keras-tuner -q  # Uncomment if you need to install keras-tuner.

import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
from pathlib import Path
import shutil

import keras_tuner as kt
import tensorflow as tf
import tensorflow_probability as tfp

import psiz

# Optional CUDA settings.
# os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# tf.config.run_functions_eagerly(True)

# Define where we want our results to be stored.
fp_project = Path.home() / Path('psiz_examples', 'tutorial', 'tuning')
fp_board = fp_project / Path('logs', 'fit')

# Some hardcoded settings for the tutorial.
max_epochs = 1000
batch_size = 128
max_dim = 7
executions_per_trial = 1

# Directory preparation.
fp_project.mkdir(parents=True, exist_ok=True)
# Remove existing TensorBoard logs.
if fp_board.exists():
    shutil.rmtree(fp_board)

# Load hosted birds-16 dataset.
(obs, catalog) = psiz.datasets.load_dataset('birds-16', verbose=1)

# Partition observations.
obs_train, obs_val, obs_test = psiz.utils.standard_split(obs)
print(
    '\nData Split\n  obs_train:'
    ' {0}\n  obs_val: {1}\n  obs_test: {2}'.format(
        obs_train.n_trial, obs_val.n_trial, obs_test.n_trial
    )
)

# Convert observations to TF dataset.
ds_obs_train = obs_train.as_dataset().shuffle(
    buffer_size=obs_train.n_trial, seed=252, reshuffle_each_iteration=True
).batch(batch_size, drop_remainder=False)
ds_obs_val = obs_val.as_dataset().batch(
    batch_size, drop_remainder=False
)
ds_obs_test = obs_test.as_dataset().batch(
    batch_size, drop_remainder=False
)

Dataset Summary
  n_stimuli: 208
  n_trial: 16292

Data Split
  obs_train: 13033
  obs_val: 1629
  obs_test: 1630
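
As a quick sanity check on the batching settings, the arithmetic below works out how many batches each split produces when `batch_size = 128` and `drop_remainder=False`. It is a minimal sketch that uses only the trial counts printed above and the standard library; the `summary` dictionary is introduced here purely for illustration.

```python
import math

batch_size = 128  # Same value as the tutorial setting.

# Trial counts copied from the "Data Split" output above.
n_trial = {"obs_train": 13033, "obs_val": 1629, "obs_test": 1630}

summary = {}
for name, n in n_trial.items():
    # With drop_remainder=False, the final (smaller) batch is kept.
    n_batch = math.ceil(n / batch_size)
    last_batch = n - (n_batch - 1) * batch_size
    summary[name] = (n_batch, last_batch)
    print(f"{name}: {n_batch} batches, last batch has {last_batch} trials")

# The three splits together account for every trial in the dataset.
assert sum(n_trial.values()) == 16292
```

The training split therefore yields 102 batches per epoch, and the split sizes correspond to roughly 80%, 10%, and 10% of the 16292 trials.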


### Hyperparameter Model

With the preliminaries out of the way, we can dive in and define our model in a way that is compatible with Keras Tuner. One of the convenient features of Keras Tuner is that it is easy to wrap existing code. So let's start by defining a `build_model` function in the usual manner.

In [2]:
def build_model(n_stimuli, n_obs_train, n_dim):
    """Build model.

    Args:
        n_stimuli: Integer indicating the number of stimuli in the
            embedding.
        n_obs_train: Integer indicating the number of training
            observations. Used to determine KL weight for variational
            inference.
        n_dim: Integer specifying the dimensionality of the
            embedding.

    Returns:
        model: A TensorFlow Keras model.

    """
    prior_scale = .2  # An educated guess.
    kl_weight = 1. / n_obs_train

    embedding_posterior = psiz.keras.layers.EmbeddingNormalDiag(
        n_stimuli + 1, n_dim, mask_zero=True,
        scale_initializer=tf.keras.initializers.Constant(
            tfp.math.softplus_inverse(prior_scale).numpy()
        )
    )
    embedding_prior = psiz.keras.layers.EmbeddingShared(
        n_stimuli + 1, n_dim, mask_zero=True,
        embedding=psiz.keras.layers.EmbeddingNormalDiag(
            1, 1,
            loc_initializer=tf.keras.initializers.Constant(0.),
            scale_initializer=tf.keras.initializers.Constant(
                tfp.math.softplus_inverse(prior_scale).numpy()
            ),
            loc_trainable=False,
        )
    )
    stimuli = psiz.keras.layers.EmbeddingVariational(
        posterior=embedding_posterior, prior=embedding_prior,
        kl_weight=kl_weight, kl_n_sample=30
    )

    kernel = psiz.keras.layers.DistanceBased(
        distance=psiz.keras.layers.Minkowski(
            rho_initializer=tf.keras.initializers.Constant(2.),
            w_initializer=tf.keras.initializers.Constant(1.),
            trainable=False
        ),
        similarity=psiz.keras.layers.ExponentialSimilarity(
            trainable=False,
            beta_initializer=tf.keras.initializers.Constant(10.),
            tau_initializer=tf.keras.initializers.Constant(1.),
            gamma_initializer=tf.keras.initializers.Constant(0.),
        )
    )
    model = psiz.keras.models.Rank(
        stimuli=stimuli, kernel=kernel, n_sample=1
    )

    # Compile settings.
    compile_kwargs = {
        'loss': tf.keras.losses.CategoricalCrossentropy(),
        'optimizer': tf.keras.optimizers.Adam(learning_rate=.001),
        'weighted_metrics': [
            tf.keras.metrics.CategoricalCrossentropy(name='cce')
        ]
    }

    model.compile(**compile_kwargs)
    return model
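
Two constants in `build_model` can be checked by hand. The sketch below assumes, as the call to `tfp.math.softplus_inverse` suggests, that the embedding layer stores an unconstrained value and passes it through a softplus to obtain a positive scale. It uses only the standard library instead of TensorFlow Probability, and the training-set size is copied from the data split printed earlier.

```python
import math


def softplus(x):
    """Map an unconstrained value to a positive one."""
    return math.log1p(math.exp(x))


def softplus_inverse(y):
    """Inverse of softplus, defined for y > 0."""
    return math.log(math.expm1(y))


prior_scale = 0.2
# Value handed to the Constant initializer (roughly -1.508).
raw_scale = softplus_inverse(prior_scale)
# Applying softplus recovers the intended scale of 0.2.
recovered_scale = softplus(raw_scale)

n_obs_train = 13033  # From the "Data Split" output.
# Weight on the KL term; about 7.7e-05 for this training set.
kl_weight = 1.0 / n_obs_train

print(raw_scale, recovered_scale, kl_weight)
```

Initializing with the inverse-transformed value means the posterior and prior both start with a standard deviation of `prior_scale`, while `kl_weight` shrinks the KL term as the number of training observations grows.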

With a `build_model` function defined, we can now introduce an additional layer of abstraction that will manage the model's hyperparameters. For this tutorial, we will only consider a setup with one hyperparameter: `n_dim`, the variable that specifies the dimensionality of the embedding.

One way to use Keras Tuner is to define a `build_hypmodel` function that *wraps* our `build_model` function. Note that this function behaves somewhat like a closure: three of the variables it uses (`max_dim`, `catalog`, and `obs_train`) are supplied by the surrounding execution context rather than passed as arguments. We use this pattern so that `build_hypmodel` has the single-argument function signature expected by Keras Tuner.

Lastly, we instantiate a `Tuner` object that manages the tuning process and takes `build_hypmodel` as an argument. In this tutorial we use a `RandomSearch` tuner with `max_trials` set to the number of candidate values (`max_dim - 1`), because the goal is to try every dimensionality in the specified range. In other scenarios, it may make more sense to use a more sophisticated tuner, such as `kt.BayesianOptimization` or `kt.Hyperband`.

In [3]:
def build_hypmodel(hp):
    """Build hyperparameter model.

    Args:
        hp: A kt.HyperParameters object governing the hyperparameters.

    Returns:
        model: A model compatible with Keras Tuner.

    """
    n_dim = hp.Int("n_dim", min_value=2, max_value=max_dim, step=1)
    return build_model(catalog.n_stimuli, obs_train.n_trial, n_dim)

# Build the hypertuner that we will use to perform the search and any restarts.
tuner = kt.RandomSearch(
    hypermodel=build_hypmodel,
    objective=kt.Objective("val_loss", direction="min"),
    max_trials=(max_dim - 1),
    executions_per_trial=executions_per_trial,
    overwrite=True,
    directory=fp_project,
    project_name="logs/tuner",
)
# Print out summary of search space.
tuner.search_space_summary()
print("'executions_per_trial': {0}".format(tuner.executions_per_trial))

Search space summary
Default search space size: 1
n_dim (Int)
{'default': None, 'conditions': [], 'min_value': 2, 'max_value': 7, 'step': 1, 'sampling': None}
'executions_per_trial': 1
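
To see why `max_trials` is set to `max_dim - 1`, the short sketch below enumerates the values that an integer hyperparameter with `min_value=2`, `max_value=max_dim`, and `step=1` can take. It uses plain Python rather than Keras Tuner, and the names `candidates` and `max_trials` are local to this illustration.

```python
max_dim = 7  # Same value as the tutorial setting.

# Every dimensionality in the search space, inclusive of both bounds.
candidates = list(range(2, max_dim + 1, 1))
max_trials = max_dim - 1

print(candidates)  # [2, 3, 4, 5, 6, 7]

# The trial budget matches the number of distinct candidates.
assert len(candidates) == max_trials
```

Because the budget equals the number of candidates, the search has room to try each dimensionality once. A random search draws values rather than stepping through them in order, so the trials will not necessarily appear sorted.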


## Hypertuning

With our hypermodel defined, we create some convenient callbacks (for logging and early stopping) and start the hyperparameter search. A nice thing about Keras Tuner is that the `search` signature matches the `fit` signature.

```{note}
The `TensorBoard` callback saves log files in a format that allows us to better understand how different hyperparameters impact model performance. All you need to do is launch TensorBoard in the standard way and you will see an additional "HPARAMS" tab that allows you to explore the hyperparameter search results. For more information, check out the [TensorBoard Documentation](https://www.tensorflow.org/tensorboard) and the tutorial on using [TensorBoard with Keras Tuner](https://keras.io/guides/keras_tuner/visualize_tuning/).
```

In [4]:
# Define callbacks.
cb_board = tf.keras.callbacks.TensorBoard(
    log_dir=fp_board, write_graph=False
)
cb_early = tf.keras.callbacks.EarlyStopping(
    'loss', patience=100, mode='min', restore_best_weights=False,
    verbose=1
)
callbacks = [cb_board, cb_early]

# Start the hyperparameter search. The default settings will take a while.
tuner.search(
    x=ds_obs_train, validation_data=ds_obs_val, epochs=max_epochs,
    callbacks=callbacks, verbose=0
)
tuner.results_summary()

Epoch 369: early stopping
Epoch 470: early stopping
Epoch 449: early stopping
Epoch 469: early stopping
INFO:tensorflow:Oracle triggered exit
Results summary
Results in /home/brett/psiz_examples/tutorial/tuning/logs/tuner
Showing 10 best trials
<keras_tuner.engine.objective.Objective object at 0x7ff7a4516820>
Trial summary
Hyperparameters:
n_dim: 3
Score: 1.9470715522766113
Trial summary
Hyperparameters:
n_dim: 4
Score: 1.9478240013122559
Trial summary
Hyperparameters:
n_dim: 5
Score: 1.9496349096298218
Trial summary
Hyperparameters:
n_dim: 6
Score: 1.9694207906723022
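
The scores in the summary are close together, so it can help to look at the gaps as well as the ranking. The sketch below copies the four reported validation losses into a dictionary by hand (it does not query the `tuner` object) and compares each one with the best.

```python
# Validation losses copied from the results summary above, keyed by n_dim.
scores = {
    3: 1.9470715522766113,
    4: 1.9478240013122559,
    5: 1.9496349096298218,
    6: 1.9694207906723022,
}

# Lower is better because the objective direction is "min".
best_n_dim = min(scores, key=scores.get)

# Difference between each score and the best score.
gaps = {n: s - scores[best_n_dim] for n, s in sorted(scores.items())}

# Candidates in the range 2..7 that have no reported trial.
missing = [n for n in range(2, 8) if n not in scores]

print("best n_dim:", best_n_dim)  # best n_dim: 3
for n, gap in gaps.items():
    print(f"n_dim={n}: +{gap:.4f}")
print("not reported:", missing)  # not reported: [2, 7]
```

The gap between `n_dim=3` and `n_dim=4` is under 0.001, which is small for a single execution per trial, and the summary contains no trials for `n_dim=2` or `n_dim=7`.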


After running our hyperparameter search, we can load the best performing model and run it on the test set.

In [5]:
# Retrieve best hyperparameters from search.
best_hps = tuner.get_best_hyperparameters()
print('best hps: {}'.format(best_hps[0].values))
# Retrieve best model from search.
best_model = tuner.get_best_models()[0]

# Evaluate on test set
metrics_test = best_model.evaluate(ds_obs_test, verbose=0, return_dict=True)
print('Test Set Performance: {}'.format(metrics_test))

best hps: {'n_dim': 3}
Test Set Performance: {'loss': 1.7893924713134766, 'cce': 1.6967400312423706}


The results of the search indicate that, among the dimensionalities tried, the best dimensionality for the `birds-16` dataset is `3`. If you would like to increase your confidence in this result, you could increase the variable `executions_per_trial` or set up a cross-validation procedure that uses different training splits.
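
One way to generate the different training splits mentioned above is a k-fold partition of the trial indices. The helper below, `kfold_indices`, is a hypothetical illustration written in plain Python; it is not part of PsiZ or Keras Tuner, and applying the resulting indices to the observations object is left out because it depends on the PsiZ version in use.

```python
import random


def kfold_indices(n_trial, n_fold, seed=252):
    """Yield (train_idx, val_idx) pairs for a shuffled k-fold split."""
    idx = list(range(n_trial))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into n_fold roughly equal folds.
    folds = [idx[i::n_fold] for i in range(n_fold)]
    for k in range(n_fold):
        val_idx = folds[k]
        train_idx = [
            i for j, fold in enumerate(folds) if j != k for i in fold
        ]
        yield train_idx, val_idx


# Small demonstration with 20 trials and 5 folds.
splits = list(kfold_indices(n_trial=20, n_fold=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Each fold serves as the validation set exactly once, so running the search once per fold and averaging the validation losses for each `n_dim` gives a less noisy comparison than a single split.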

## Summary

This tutorial demonstrated how to use `keras-tuner` in order to conduct a hyperparameter search and identify the optimal dimensionality for the `birds-16` dataset. This search can easily be expanded to search over other hyperparameters, such as the optimizer settings.
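
As a rough sketch of such an expansion, the function below declares a second hyperparameter for the optimizer's learning rate next to `n_dim`. In real use, `hp` would be the `kt.HyperParameters` object that Keras Tuner passes to the build function, and the sampled values would be forwarded to a model builder. The `LowerBoundHP` class is a hypothetical stand-in that exists only so the example runs without Keras Tuner installed, and the learning-rate range is an arbitrary choice for illustration.

```python
def sample_settings(hp, max_dim=7):
    """Declare two hyperparameters and return their sampled values."""
    n_dim = hp.Int("n_dim", min_value=2, max_value=max_dim, step=1)
    # Learning rates are usually searched on a log scale.
    learning_rate = hp.Float(
        "learning_rate", min_value=1e-4, max_value=1e-2, sampling="log"
    )
    return {"n_dim": n_dim, "learning_rate": learning_rate}


class LowerBoundHP:
    """Hypothetical stand-in for kt.HyperParameters (returns min_value)."""

    def Int(self, name, min_value, max_value, step=1):
        return min_value

    def Float(self, name, min_value, max_value, sampling="linear"):
        return min_value


settings = sample_settings(LowerBoundHP())
print(settings)  # {'n_dim': 2, 'learning_rate': 0.0001}
```

With more than one hyperparameter the search space grows quickly, so `max_trials` would need to be chosen as a budget and no longer as a count of all candidates.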