[View the runnable example on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/hpo/use_hpo_tune_hyperparameters_tensorflow.ipynb)

# Use Nano HPO to Tune the Hyper-Parameters in TensorFlow Training

With Nano HPO (Hyper-Parameter Optimization), you can search over the model architecture (layers, activations, etc.) and the training procedure (learning rate, batch size, etc.) simply by specifying their search spaces. A search space is a specification of the value range from which the search engine samples a hyper-parameter. Use `model.search()` to launch search trials and `model.search_summary()` to review the search results.
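
To make the idea of a search space concrete, here is a minimal, library-free sketch of sampling one hyper-parameter configuration per trial. The helper names (`sample_categorical`, `sample_int`, `sample_real`) and the `units` and `dropout` entries are hypothetical and are not part of Nano; they only mimic what a search engine does with categorical, integer and real-valued ranges.

```python
import random


def sample_categorical(rng, *choices):
    """Pick one of a fixed set of choices."""
    return rng.choice(choices)


def sample_int(rng, lower, upper):
    """Pick an integer in [lower, upper], both ends inclusive."""
    return rng.randint(lower, upper)


def sample_real(rng, lower, upper):
    """Pick a float uniformly in [lower, upper]."""
    return rng.uniform(lower, upper)


def sample_config(rng):
    """Draw one concrete configuration from the toy search spaces."""
    return {
        "filters": sample_categorical(rng, 32, 64),
        "activation": sample_categorical(rng, "relu", "linear"),
        "units": sample_int(rng, 32, 256),        # hypothetical Dense width
        "dropout": sample_real(rng, 0.1, 0.5),    # hypothetical dropout rate
    }


rng = random.Random(0)  # fixed seed so the draws are reproducible
configs = [sample_config(rng) for _ in range(3)]
for config in configs:
    print(config)
```

Each printed dictionary corresponds to one trial: the search engine draws concrete values from the declared ranges, trains a model with them, and records the resulting metric.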

## Prepare the environment and datasets

To apply Nano HPO, you should install BigDL-Nano for TensorFlow and its dependencies first:

In [None]:
!pip install --pre --upgrade bigdl-nano[tensorflow] # install the nightly-built version
!source bigdl-nano-init

> 📝 **Note**
>
> We recommend running the commands above, especially `source bigdl-nano-init`, before the Jupyter kernel is started; otherwise some of the optimizations may not take effect.

In [None]:
# install dependencies
!pip install pandas
!pip install ConfigSpace
!pip install optuna

We need to enable Nano HPO before TensorFlow training.

In [None]:
import bigdl.nano.automl as nano_automl
nano_automl.hpo_config.enable_hpo_tf()

> 📝 **Note**
>
> To disable HPO, you can call `nano_automl.hpo_config.disable_hpo_tf()` in the same way. This removes the searchable objects from the `bigdl.nano.tf` module.

Take hyper-parameter tuning of a simple CNN on the MNIST dataset as an example. We first prepare the datasets for training and testing.

In [None]:
from tensorflow import keras

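# Load MNIST as NumPy arrays: 28x28 grayscale images with integer labels 0-9
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

CLASSES = 10
img_x, img_y = x_train.shape[1], x_train.shape[2]
input_shape = (img_x, img_y, 1)
# Add a channel dimension and scale pixel values from [0, 255] to [0, 1]
x_train = x_train.reshape(-1, img_x, img_y, 1).astype("float32") / 255
x_test = x_test.reshape(-1, img_x, img_y, 1).astype("float32") / 255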

## Build a searchable model

Nano HPO supports three ways to build a searchable model, whether you adapt an existing model or create a new one: the TensorFlow `Sequential` API, the `Functional` API, or subclassing `tensorflow.keras.Model`. Choose whichever approach fits your (preferred) code structure.

### Option 1. Define a searchable model using `Sequential` API

Nano HPO provides the same `Sequential` model creation API as native TensorFlow does, so you can define a `Sequential` model as usual while specifying the search space for each component. To do so, first change the imports from `tensorflow.keras` to `bigdl.nano` as below.

In [None]:
from bigdl.nano.automl.tf.keras import Sequential
from bigdl.nano.tf.keras.layers import Dense, Flatten, Conv2D
import bigdl.nano.automl.hpo.space as space

In this example, we assign search spaces to the `filters`, `kernel_size`, `strides` and `activation` arguments of a `Conv2D` layer (each is categorical with two choices).

In [None]:
model = Sequential()
model.add(Conv2D(
    filters=space.Categorical(32, 64),
    kernel_size=space.Categorical(3, 5),
    strides=space.Categorical(1, 2),
    activation=space.Categorical("relu", "linear"),
    input_shape=input_shape))
model.add(Flatten())
model.add(Dense(CLASSES, activation="softmax"))

> 📝 **Note**
>
> In general, Nano supports four kinds of parameter types in `bigdl.nano.automl.hpo.space`: `Categorical(*data, prefix=None)`, `Real(lower, upper, default=None, log=False, prefix=None)`, `Int(lower, upper, default=None, prefix=None)` and `Bool(default=None, prefix=None)`. For their detailed usage, refer to the [corresponding API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/hpo_api.html#search-space).
>
> Note that search spaces can only be specified as keyword arguments, which means `Dense(space.Int(...))` should be changed to `Dense(units=space.Int(...))`. If a layer is used more than once in the model, we strongly suggest assigning a `prefix` to each search space in such layers to distinguish them; otherwise they will share the same search space (the last definition overrides all previous ones).
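
The effect of a missing `prefix` can be pictured with a toy registry keyed by argument name. This is only an illustration of the overriding behaviour described in the note, not Nano's internal data structure; the `register` helper and the `prefix_argname` key format are assumptions made for this sketch.

```python
def register(registry, arg_name, choices, prefix=None):
    """Store a search space under a name; a prefix makes the name unique."""
    key = f"{prefix}_{arg_name}" if prefix else arg_name  # hypothetical key format
    registry[key] = choices
    return key


# Two Dense layers, both declaring a space for `units`, without prefixes.
no_prefix = {}
register(no_prefix, "units", (64, 128))    # first layer
register(no_prefix, "units", (256, 512))   # second layer replaces the first entry

# The same two layers, each with its own prefix.
with_prefix = {}
register(with_prefix, "units", (64, 128), prefix="dense1")
register(with_prefix, "units", (256, 512), prefix="dense2")

print(no_prefix)     # {'units': (256, 512)}
print(with_prefix)   # {'dense1_units': (64, 128), 'dense2_units': (256, 512)}
```

Without prefixes only one `units` entry survives, so both layers would be driven by the same space. With prefixes each layer keeps its own entry.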

### Option 2. Define a searchable model using `Functional` API

Nano HPO provides the same `Functional` model creation API as native TensorFlow does, so you can define a `Functional` model as usual while specifying the search space for each component. To do so, first change the imports from `tensorflow.keras` to `bigdl.nano` as below.

In [None]:
from bigdl.nano.tf.keras.layers import Dense, Flatten, Conv2D
from bigdl.nano.tf.keras import Input
from bigdl.nano.automl.tf.keras import Model
import bigdl.nano.automl.hpo.space as space

In this example, we assign search spaces to the `filters`, `kernel_size`, `strides` and `activation` arguments of a `Conv2D` layer (each is categorical with two choices).

In [None]:
inputs = Input(shape=(28,28,1))
x = Conv2D(
    filters=space.Categorical(32, 64),
    kernel_size=space.Categorical(3, 5),
    strides=space.Categorical(1, 2),
    activation=space.Categorical("relu", "linear"),
    input_shape=input_shape)(inputs)
x = Flatten()(x)
outputs = Dense(CLASSES, activation="softmax")(x)
model = Model(inputs=inputs, outputs=outputs, name="mnist_model")


### Option 3. Define a searchable model by subclassing `tensorflow.keras.Model`

You can flexibly turn a model that subclasses `tf.keras.Model` into a searchable object with the `@hpo.tfmodel` decorator. You will then be able to pass either search spaces or normal values as the model's init arguments.

In this example, we assign search spaces to the `filters`, `kernel_size`, `strides` and `activation` arguments of a `Conv2D` layer (each is categorical with two choices).

In [None]:
import bigdl.nano.automl.hpo as hpo
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Dropout, MaxPooling2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten

@hpo.tfmodel()
class MyModel(tf.keras.Model):

    def __init__(self, filters, kernel_size, strides, activation):
        super().__init__()
        self.conv1 = Conv2D(
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            activation=activation)
        self.pool1 = MaxPooling2D(pool_size=2)
        self.drop1 = Dropout(0.3)
        self.flat = Flatten()
        self.dense1 = Dense(256, activation='relu')
        self.drop3 = Dropout(0.5)
        self.dense2 = Dense(CLASSES, activation="softmax")

    def call(self, inputs):
        x = self.conv1(inputs)
        x = self.pool1(x)
        x = self.drop1(x)
        x = self.flat(x)
        x = self.dense1(x)
        x = self.drop3(x)
        x = self.dense2(x)
        return x

model = MyModel(
    filters=hpo.space.Categorical(32, 64),
    kernel_size=hpo.space.Categorical(2, 4),
    strides=hpo.space.Categorical(1, 2),
    activation=hpo.space.Categorical("relu", "linear")
)
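
One way to picture how a decorated class can accept both search spaces and normal values is a wrapper that postpones construction until concrete values have been sampled. The sketch below is a toy model of that idea under stated assumptions: `Choice`, `searchable`, `Lazy` and `TinyModel` are hypothetical names, and Nano's actual `@hpo.tfmodel` mechanism may work differently.

```python
import random


class Choice:
    """Toy categorical space (hypothetical, not Nano's Categorical)."""

    def __init__(self, *options):
        self.options = options


def searchable(cls):
    """Toy decorator: keep the init arguments and build `cls` only when sampled."""

    class Lazy:
        def __init__(self, **kwargs):
            self.kwargs = kwargs  # may mix Choice objects and plain values

        def sample(self, rng):
            # Replace every Choice by one sampled option; keep plain values as-is.
            concrete = {
                name: rng.choice(value.options) if isinstance(value, Choice) else value
                for name, value in self.kwargs.items()
            }
            return cls(**concrete)

    return Lazy


@searchable
class TinyModel:
    def __init__(self, filters, activation):
        self.filters = filters
        self.activation = activation


lazy_model = TinyModel(filters=Choice(32, 64), activation="relu")
concrete_model = lazy_model.sample(random.Random(0))
print(concrete_model.filters, concrete_model.activation)
```

Here `filters` is sampled per trial while `activation` stays fixed at `"relu"`, mirroring how a searchable model can mix search spaces with ordinary argument values.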


### Compile the model

We now compile our model with loss function, optimizer and metrics.

In [None]:
from tensorflow.keras.optimizers import RMSprop
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=RMSprop(learning_rate=0.001),
    metrics=["accuracy"]
)

> 📝 **Note**
>
> `learning_rate` is also a searchable hyper-parameter. To optimize it, two changes are needed when setting up `model.compile()`:
> 
> 1) import the optimizer from `bigdl.nano.tf.optimizers` instead of `tf.keras.optimizers`, i.e., `from bigdl.nano.tf.optimizers import RMSprop`
> 2) specify the search space for `learning_rate` in the optimizer argument in `model.compile()`, e.g., `optimizer=RMSprop(learning_rate=space.Real(0.0001, 0.01, log=True))`
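
The `log=True` flag matters for ranges that span orders of magnitude, such as a learning rate between 0.0001 and 0.01. The following library-free sketch compares uniform sampling with log-uniform sampling over that range. It assumes that `log=True` means sampling uniformly on a logarithmic scale, as is common in HPO libraries; the `share_below` helper is introduced only for this illustration.

```python
import math
import random

rng = random.Random(0)
low, high, n_samples = 1e-4, 1e-2, 10_000

# Uniform on the raw scale: most samples land near the upper end of the range.
uniform = [rng.uniform(low, high) for _ in range(n_samples)]

# Uniform on the log scale: each order of magnitude gets equal probability.
log_uniform = [
    math.exp(rng.uniform(math.log(low), math.log(high))) for _ in range(n_samples)
]


def share_below(samples, threshold):
    """Fraction of samples that are smaller than the threshold."""
    return sum(s < threshold for s in samples) / len(samples)


print(f"uniform     share below 1e-3: {share_below(uniform, 1e-3):.2f}")
print(f"log-uniform share below 1e-3: {share_below(log_uniform, 1e-3):.2f}")
```

With uniform sampling only about 9% of the draws fall below 0.001, whereas log-uniform sampling puts about half of them there, so small learning rates are explored as thoroughly as large ones.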

## Find the best hyper-parameters

We can now call `model.search()` (see the [API reference](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/hpo_api.html#hpo-for-tensorflow)) to start searching hyper-parameters. Nano HPO tries `n_trials` hyper-parameter combinations drawn from the search spaces and optimizes `target_metric` in the specified `direction`. The arguments that `model.fit()` needs, such as `x`, `y`, `batch_size` and `epochs`, are passed to `model.search()` as well; see its [API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/tensorflow.html#bigdl.nano.tf.keras.Model.fit) for details. Additionally, a `pruner` can be specified to stop unpromising trials early.

In [None]:
from bigdl.nano.automl.hpo.backend import PrunerType
model.search(
    n_trials=8,
    target_metric='val_accuracy',
    direction="maximize",
    pruner=PrunerType.HyperBand,
    pruner_kwargs={'min_resource':1, 'max_resource':100, 'reduction_factor':3},
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=5,
    validation_split=0.2,
    verbose=False
)

> 📝 **Note**
>
> `batch_size` is also a searchable hyper-parameter. If you want to optimize it, you can specify the search space for the `batch_size` argument in `model.search()`, e.g., `batch_size=space.Categorical(128, 64)`.
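
The roles of `n_trials`, `direction` and the pruner can be sketched with a toy search loop. This is not how Nano or its backend implements the search. The objective `toy_accuracy` is a made-up stand-in for real training, and the pruning rule used here is a simple median rule rather than the HyperBand pruner configured above.

```python
import random
import statistics


def toy_accuracy(config, epoch):
    """Hypothetical validation accuracy; a real trial would train the model."""
    base = 0.90 if config["activation"] == "relu" else 0.80
    return base + 0.01 * epoch


def toy_search(n_trials, epochs, direction="maximize", seed=0):
    rng = random.Random(seed)
    sign = 1 if direction == "maximize" else -1
    seen_at_epoch = [[] for _ in range(epochs)]  # intermediate values of earlier trials
    trials = []
    for number in range(n_trials):
        config = {
            "activation": rng.choice(["relu", "linear"]),
            "filters": rng.choice([32, 64]),
        }
        state, value = "COMPLETE", None
        for epoch in range(epochs):
            value = toy_accuracy(config, epoch)
            earlier = seen_at_epoch[epoch]
            # Stand-in pruning rule: stop when worse than the median of earlier trials.
            is_worse = len(earlier) > 0 and sign * value < sign * statistics.median(earlier)
            earlier.append(value)
            if is_worse:
                state = "PRUNED"
                break
        trials.append({"number": number, "config": config, "value": value, "state": state})
    completed = [t for t in trials if t["state"] == "COMPLETE"]
    best = max(completed, key=lambda t: sign * t["value"])
    return trials, best


trials, best = toy_search(n_trials=8, epochs=5)
print("best:", best)
print("pruned:", sum(t["state"] == "PRUNED" for t in trials))
```

A pruned trial stops before its last epoch, which saves training time for configurations that already lag behind. The best trial is chosen among the completed ones only.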

When the search completes, you can use `model.search_summary()` to retrieve the search results for analysis. The returned object can be used to collect trial statistics as a pandas DataFrame, pick the best trial, or create visualizations.

In [None]:
study = model.search_summary()

Number of finished trials: 8
Best trial:
  Value: 0.9805833101272583
  Params: 
    activation▁choice: 0
    filters▁choice: 1
    kernel_size▁choice: 1
    strides▁choice: 1
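
The reported parameters are integers rather than the values themselves. Assuming each `…▁choice` value is a zero-based index into the corresponding `Categorical` in declaration order (an assumption of this sketch, not something the output states), the best trial can be mapped back to concrete values as follows. The spaces are those of Options 1 and 2; Option 3 declares `kernel_size` as `Categorical(2, 4)`.

```python
SEP = "\u2581"  # the separator character that appears in the reported names

# Search spaces as declared for the Conv2D layer in Options 1 and 2.
spaces = {
    "filters": (32, 64),
    "kernel_size": (3, 5),
    "strides": (1, 2),
    "activation": ("relu", "linear"),
}

# Best-trial parameters copied from the summary output above.
best_params = {
    f"activation{SEP}choice": 0,
    f"filters{SEP}choice": 1,
    f"kernel_size{SEP}choice": 1,
    f"strides{SEP}choice": 1,
}

# Assumption: each value is an index into the declared choices.
decoded = {}
for name, index in best_params.items():
    arg_name = name.split(SEP)[0]
    decoded[arg_name] = spaces[arg_name][index]

print(decoded)
# {'activation': 'relu', 'filters': 64, 'kernel_size': 5, 'strides': 2}
```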


After the search, `model.fit()` will automatically apply the best hyper-parameters found to fit the model. We can then evaluate it on the testing dataset.

In [None]:
history = model.fit(x_train, y_train,
                    batch_size=128, epochs=5, validation_split=0.2)

test_scores = model.evaluate(x_test, y_test, verbose=2)
print("Test loss:", test_scores[0])
print("Test accuracy:", test_scores[1])

The detailed information for each trial can be reviewed through `trials_dataframe()`.

In [None]:
study.trials_dataframe(attrs=("number", "value", "params", "state"))

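| | number | value | params_activation▁choice | params_filters▁choice | params_kernel_size▁choice | params_strides▁choice | state |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.920917 | 1 | 1 | 1 | 0 | COMPLETE |
| 1 | 1 | 0.92325 | 1 | 1 | 1 | 1 | COMPLETE |
| 2 | 2 | 0.920083 | 1 | 0 | 1 | 1 | PRUNED |
| 3 | 3 | 0.92 | 1 | 1 | 1 | 1 | COMPLETE |
| 4 | 4 | 0.980583 | 0 | 1 | 1 | 1 | COMPLETE |
| 5 | 5 | 0.926583 | 1 | 0 | 0 | 0 | COMPLETE |
| 6 | 6 | 0.916 | 1 | 1 | 1 | 1 | PRUNED |
| 7 | 7 | 0.922417 | 1 | 1 | 1 | 1 | COMPLETE |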
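
As a small worked example of analysing such records, the sketch below re-reads the `number`, `value` and `state` columns shown above with the standard library and reports how many trials completed, how many were pruned, and which completed trial scored best. With the real DataFrame the same questions can be answered by filtering on its `state` column.

```python
import csv
import io

# Trial records copied from the output above (number, value, state columns only).
TRIALS_CSV = """number,value,state
0,0.920917,COMPLETE
1,0.92325,COMPLETE
2,0.920083,PRUNED
3,0.92,COMPLETE
4,0.980583,COMPLETE
5,0.926583,COMPLETE
6,0.916,PRUNED
7,0.922417,COMPLETE
"""

rows = list(csv.DictReader(io.StringIO(TRIALS_CSV)))
completed = [r for r in rows if r["state"] == "COMPLETE"]
pruned = [r for r in rows if r["state"] == "PRUNED"]

# The search maximized val_accuracy, so the best trial has the largest value.
best = max(completed, key=lambda r: float(r["value"]))

print(f"completed={len(completed)} pruned={len(pruned)}")  # completed=6 pruned=2
print(f"best trial #{best['number']} value={best['value']}")  # best trial #4 value=0.980583
```

The best trial found this way matches the one reported by `model.search_summary()`.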


> 📚 **Related Readings**
>
> - [How to install BigDL-Nano](https://bigdl.readthedocs.io/en/latest/doc/Nano/Overview/install.html)