In [1]:
# Copyright 2022 NVIDIA Corporation. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

<img src="https://developer.download.nvidia.com/notebooks/dlsw-notebooks/merlin_models_entertainment-with-pretrained-embeddings/nvidia_logo.png" style="width: 90px; float: right;">

# Retrieval with hyperparameter optimization

## Overview

In this use case we will perform hyperparameter optimization using [optuna](https://optuna.org/). Optuna is an open source hyperparameter optimization framework which automates hyperparameter search and can be used across a wide set of scenarios.

We will look at optimizing candidate retrieval on a dataset from a Kaggle competition, the [H&M Personalized Fashion Recommendations challenge](https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations).

Hyperparameter optimization can be arbitrarily complex -- in this use case, we will look at optimizing the learning rate, the number of epochs, and the embedding dimensionality to achieve the best results on candidate generation. We will train a Matrix Factorization model on user-item pairs and maximize the hit rate at 100, measured here as the number of purchased items that appear among the top 100 candidates retrieved for each customer.

### Learning objectives

- How to run a hyperparameter optimization experiment
- Candidate generation using Merlin Models

## Downloading and preparing the dataset

Let's begin by downloading the dataset. Please find the data on Kaggle [here](https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/data). We will only use `transactions_train.csv` which lists items that were purchased and maps the transactions to customers.

Please download the `transactions_train.csv` file and store it alongside the current notebook.

Let us read in the file and look at the data.

In [2]:
import cudf
import numpy as np

transactions = cudf.read_csv('transactions_train.csv', parse_dates=['t_dat'])
transactions.head()



| | t_dat | customer_id | article_id | price | sales_channel_id |
|---|---|---|---|---|---|
| 0 | 2018-09-20 | 000058a12d5b43e67d225668fa1f8d618c13dc232df0ca... | 663713001 | 0.050831 | 2 |
| 1 | 2018-09-20 | 000058a12d5b43e67d225668fa1f8d618c13dc232df0ca... | 541518023 | 0.030492 | 2 |
| 2 | 2018-09-20 | 00007d2de826758b65a93dd24ce629ed66842531df6699... | 505221004 | 0.015237 | 2 |
| 3 | 2018-09-20 | 00007d2de826758b65a93dd24ce629ed66842531df6699... | 685687003 | 0.016932 | 2 |
| 4 | 2018-09-20 | 00007d2de826758b65a93dd24ce629ed66842531df6699... | 685687004 | 0.016932 | 2 |


Let's assign the last week to our validation set and treat the rest of the data as our train set. Additionally, let us remove purchases from our train set that were made by customers who do not appear in our validation set.

In [3]:
seven_days_ago = transactions.t_dat.max() - np.timedelta64(7, 'D')

train_set = transactions[transactions.t_dat < seven_days_ago]
validation_set = transactions[transactions.t_dat >= seven_days_ago]

validation_set_customers = validation_set.customer_id.unique()
train_set = train_set[train_set.customer_id.isin(validation_set_customers)]
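To make the split logic concrete, here is a minimal sketch on a handful of made-up rows, using only the Python standard library instead of cuDF. The customer names, dates, and the `rows` variable are illustrative assumptions and are not part of the dataset.

```python
from datetime import date, timedelta

# Toy (date, customer_id, article_id) rows; all values are made up.
rows = [
    (date(2020, 9, 1), "alice", 101),
    (date(2020, 9, 10), "bob", 102),
    (date(2020, 9, 14), "carol", 103),
    (date(2020, 9, 18), "alice", 104),
    (date(2020, 9, 22), "bob", 105),
]

# The cutoff is seven days before the most recent transaction.
cutoff = max(r[0] for r in rows) - timedelta(days=7)

train = [r for r in rows if r[0] < cutoff]
validation = [r for r in rows if r[0] >= cutoff]

# Keep only training rows for customers who also appear in validation.
validation_customers = {r[1] for r in validation}
train = [r for r in train if r[1] in validation_customers]

print(len(train), len(validation))  # 2 2
```

In this toy data, "carol" only purchased before the cutoff, so her row is dropped from the train set.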

Before we can proceed with training we need to preprocess our data.

We will only use `customer_id` and `article_id` pairs. Still, a neural network expects them to be represented as contiguous integers. We need to go from how they are represented in our dataset to that desired representation.

In order to do so, we will leverage `nvtabular` and the `Categorify` operator.
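As a rough illustration of the idea behind categorical encoding, the sketch below maps raw IDs to contiguous integer codes in plain Python. It is a simplified stand-in, not NVTabular's implementation: the helper names `fit_categories` and `encode` are hypothetical, reserving code 0 for unseen values is an assumption of this sketch, and the real operator may order categories differently.

```python
def fit_categories(values):
    """Map each distinct raw value to an integer code starting at 1.

    Code 0 is kept free for missing or unseen values. This is an
    assumption of the sketch, not a statement about NVTabular internals.
    """
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def encode(values, mapping):
    # Unknown values fall back to the reserved code 0.
    return [mapping.get(value, 0) for value in values]


raw_article_ids = [663713001, 541518023, 663713001, 505221004]
mapping = fit_categories(raw_article_ids)
codes = encode(raw_article_ids + [999999999], mapping)
print(codes)  # [1, 2, 1, 3, 0]
```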

In [4]:
import nvtabular as nvt
from merlin.schema.tags import Tags
from merlin.models.tf.dataset import BatchedDataset


We represent our data as `Merlin` `Datasets`.

In [5]:
train_set = nvt.Dataset(train_set)
validation_set = nvt.Dataset(validation_set)

We now define the operations we want to apply to our data.

In [6]:
customer_id = ['customer_id'] >> nvt.ops.Categorify() >> nvt.ops.AddMetadata(tags=[Tags.USER_ID])
article_id = ['article_id'] >> nvt.ops.Categorify() >> nvt.ops.AddMetadata(tags=[Tags.ITEM_ID])

And we proceed with fitting the workflow and transforming both the train and validation datasets.

In [7]:
workflow = nvt.Workflow(customer_id + article_id)
train_set_transformed = workflow.fit_transform(train_set)

validation_set_transformed = workflow.transform(validation_set)



We are now ready to train our model and perform hyperparameter optimization.

## Hyperparameter optimization using optuna

We will train a retrieval model. Customers have performed a number of purchases in the last week of data that we are using as our validation set.

Our objective will be to train a retrieval model, Matrix Factorization, and to generate 100 candidates for each customer.
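Conceptually, a matrix factorization model scores a user-item pair with the dot product of their embedding vectors, and retrieval keeps the highest-scoring items. The sketch below shows this with made-up 2-dimensional embeddings in plain Python. The helper names `dot` and `top_k_items` are hypothetical, and this is not how Merlin Models implements retrieval internally.

```python
import heapq


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_k_items(user_vector, item_vectors, k):
    """Return the ids of the k items with the highest dot-product score."""
    scores = {item_id: dot(user_vector, vec) for item_id, vec in item_vectors.items()}
    return heapq.nlargest(k, scores, key=scores.get)


# Made-up embeddings; in the notebook these are learned during training.
user_vector = [1.0, 0.5]
item_vectors = {
    "a": [0.9, 0.1],   # score 0.95
    "b": [-0.5, 0.2],  # score -0.4
    "c": [0.2, 1.0],   # score 0.7
    "d": [1.0, 1.0],   # score 1.5
}

print(top_k_items(user_vector, item_vectors, k=2))  # ['d', 'a']
```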

The goal is to maximize the count of purchases that appear among our candidates.
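The metric itself can be sketched in a few lines of plain Python: for each customer, count how many purchased items appear among the retrieved candidates, then sum over customers. The function name `count_hits` and the toy data are assumptions made for illustration only.

```python
def count_hits(candidates_by_customer, purchases_by_customer):
    """Total number of purchased items that appear among each customer's candidates."""
    hits = 0
    for customer, candidates in candidates_by_customer.items():
        # Use sets so that repeated purchases of the same item count once.
        purchased = set(purchases_by_customer.get(customer, []))
        hits += len(purchased & set(candidates))
    return hits


candidates_by_customer = {"alice": [101, 102, 103], "bob": [104, 105, 106]}
purchases_by_customer = {"alice": [102, 999], "bob": [104, 106, 106]}

print(count_hits(candidates_by_customer, purchases_by_customer))  # 3
```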

### The 3 components of hyperparameter optimization with optuna

With optuna, you create a `study`. A `study` is an optimization session with a number of trials. A `trial` is a single experiment, a call of the objective function.

A `parameter` is a variable whose value we will optimize.

This is a brief primer but should provide you with all the information necessary to get started with optuna. You can find further information in optuna documentation [here](https://optuna.readthedocs.io/en/stable/tutorial/index.html).

Below we will run an optuna study. The hyperparameters we will optimize are:

* embedding dimensionality
* learning rate
* number of epochs

Embedding dimensionality is how we control the capacity of the model. We would like to provide it with just the right amount of expressive power. If we give our model too much capacity, it will overfit to our training data and its ability to generalize to unseen data will suffer.

On the other hand, if we train too simple a model, its performance will be limited as it will not be able to capture the signal in our train data.

The learning rate and number of epochs deal with the technicalities of training our model. As the model might train differently depending on its capacity, we will perform a grid search across all these parameters in order to find the combination that performs best.

### The setup of our study

Below we define an `objective` function. Running it is a single `trial` in our optimization `study`.

With `search_space` we define the values we would like our `study` to run `trials` with.

In [8]:
import merlin.models.tf as mm
import tensorflow as tf
import optuna

In [9]:
def calculate_hit_rate_at_100(model):
    # Collect the unique items and build a recommender that returns the top 100 items per user.
    item_features = train_set_transformed.schema.select_by_tag(Tags.ITEM_ID).column_names
    item_dataset = train_set_transformed.to_ddf()[item_features].drop_duplicates().compute()

    item_dataset = nvt.Dataset(item_dataset)
    top_k_rec = model.to_top_k_recommender(item_dataset, 100)

    # Predict candidates for every unique customer in the train set.
    users_schema = train_set_transformed.schema.select_by_tag(Tags.USER_ID)
    user_features = users_schema.column_names

    unique_users = train_set_transformed.to_ddf()[user_features].drop_duplicates().compute()
    users_dataset = nvt.Dataset(unique_users, schema=users_schema)
    _, cls_idxs = top_k_rec.predict(BatchedDataset(users_dataset, 100, shuffle=False, schema=users_schema))

    # Load the mappings written by Categorify to translate encoded ids back to raw ids.
    # The first row of the customer mapping is skipped (this code assumes it is not a real customer).
    customer_id_mapping = cudf.read_parquet('.//categories/unique.customer_id.parquet')
    customer_ids_preds = customer_id_mapping.to_pandas().customer_id.iloc[1:].values
    article_id_mapping = cudf.read_parquet('.//categories/unique.article_id.parquet')

    # Group the deduplicated validation purchases by customer.
    validation_set_df = validation_set.compute()
    validation_set_df = validation_set_df.drop_duplicates(['customer_id', 'article_id'])
    val_cust2purchases = validation_set_df.to_pandas().groupby('customer_id')['article_id'].apply(list)

    id2a = article_id_mapping.to_pandas().article_id.to_dict()

    # Despite its name, `hit_rate` accumulates a total count of hits rather than a ratio.
    hit_rate = 0
    for i in range(cls_idxs.shape[0]):
        current_cust = customer_ids_preds[i]
        purchases = set(val_cust2purchases[current_cust])
        candidates = set(int(id2a[c]) for c in cls_idxs[i])
        hit_rate += len(purchases.intersection(candidates))

    return hit_rate

In [10]:
def objective(trial):
    # With the GridSampler used below, the values come from `search_space`;
    # the ranges here only declare the parameter types and bounds.
    learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-1)
    num_epochs = trial.suggest_int('num_epochs', 1, 100)
    embedding_dim = trial.suggest_int('embedding_dim', 4, 128)

    model = mm.MatrixFactorizationModel(train_set_transformed.schema, embedding_dim)

    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    loss = tf.keras.losses.CategoricalCrossentropy(
        from_logits=True, label_smoothing=0,
    )

    model.compile(optimizer, loss=loss)

    model.fit(train_set_transformed, validation_data=validation_set_transformed, batch_size=1024, epochs=num_epochs)

    # The returned value is what the study maximizes.
    return calculate_hit_rate_at_100(model)

In [11]:
search_space = {
    'learning_rate': [5e-3, 1e-3, 1e-4],
    'num_epochs': [1, 3, 6, 10],
    'embedding_dim': [8, 16, 24, 32]
}

Our `study` will execute 48 trials (4 * 4 * 3) for us. I am capturing the output below as otherwise the notebook would be very hard to read.

We will use a `GridSampler` that will iterate over all the possible combinations of the hyperparameters we chose for our study. This is just one of the many ways supported by `Optuna` for exploring the hyperparameter space.

We set the `n_trials` parameter to 100. Still, `Optuna` will only run 48 trials and will terminate when it has tested all the possible hyperparameter combinations.
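To see where the number 48 comes from, the sketch below enumerates every combination of the same search space with `itertools.product`. This only illustrates what a grid search has to cover; it does not use Optuna, and the order in which `GridSampler` actually visits the combinations may differ.

```python
from itertools import product

search_space = {
    "learning_rate": [5e-3, 1e-3, 1e-4],
    "num_epochs": [1, 3, 6, 10],
    "embedding_dim": [8, 16, 24, 32],
}

# Build one dict of hyperparameters per point on the grid.
names = list(search_space)
grid = [dict(zip(names, combo)) for combo in product(*search_space.values())]

print(len(grid))  # 48
print(grid[0])    # {'learning_rate': 0.005, 'num_epochs': 1, 'embedding_dim': 8}
```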

In [12]:
%%capture

study = optuna.create_study(sampler=optuna.samplers.GridSampler(search_space), direction='maximize')
study.optimize(objective, n_trials=100)

[I 2022-10-04 07:47:01,350] A new study created in memory with name: no-name-5bbf326d-1160-4509-9f31-e151b817967e
[I 2022-10-04 07:47:36,883] Trial 0 finished with value: 2710.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 1, 'embedding_dim': 8}. Best is trial 0 with value: 2710.0.
[I 2022-10-04 07:48:08,032] Trial 1 finished with value: 4459.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 1, 'embedding_dim': 8}. Best is trial 1 with value: 4459.0.
[I 2022-10-04 07:48:39,746] Trial 2 finished with value: 259.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 1, 'embedding_dim': 8}. Best is trial 1 with value: 4459.0.
[I 2022-10-04 07:49:55,027] Trial 3 finished with value: 1148.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 3, 'embedding_dim': 16}. Best is trial 1 with value: 4459.0.
[I 2022-10-04 07:51:12,957] Trial 4 finished with value: 5378.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 3, 'embedding_dim': 16}. Best is trial 4 with value: 5378.0.
[I 2022-10-04 07:53:39,874] Trial 5 finished with value: 4047.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 6, 'embedding_dim': 32}. Best is trial 4 with value: 5378.0.
[I 2022-10-04 07:55:56,903] Trial 6 finished with value: 3706.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 6, 'embedding_dim': 24}. Best is trial 4 with value: 5378.0.
[I 2022-10-04 07:58:14,180] Trial 7 finished with value: 3446.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 6, 'embedding_dim': 16}. Best is trial 4 with value: 5378.0.
[I 2022-10-04 08:00:32,115] Trial 8 finished with value: 3955.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 6, 'embedding_dim': 8}. Best is trial 4 with value: 5378.0.
[I 2022-10-04 08:01:43,008] Trial 9 finished with value: 961.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 3, 'embedding_dim': 8}. Best is trial 4 with value: 5378.0.
[I 2022-10-04 08:02:13,743] Trial 10 finished with value: 5554.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 1, 'embedding_dim': 24}. Best is trial 10 with value: 5554.0.
[I 2022-10-04 08:04:39,469] Trial 11 finished with value: 3241.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 6, 'embedding_dim': 16}. Best is trial 10 with value: 5554.0.
[I 2022-10-04 08:06:03,631] Trial 12 finished with value: 6710.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 3, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:06:35,959] Trial 13 finished with value: 5965.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 1, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:07:57,638] Trial 14 finished with value: 4747.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 3, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:09:13,682] Trial 15 finished with value: 4338.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 3, 'embedding_dim': 16}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:10:29,144] Trial 16 finished with value: 4434.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 3, 'embedding_dim': 8}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:14:23,749] Trial 17 finished with value: 3087.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 10, 'embedding_dim': 16}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:18:16,616] Trial 18 finished with value: 4595.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 10, 'embedding_dim': 8}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:22:13,229] Trial 19 finished with value: 5598.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 10, 'embedding_dim': 16}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:23:35,761] Trial 20 finished with value: 4628.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 3, 'embedding_dim': 24}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:27:48,907] Trial 21 finished with value: 4071.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 10, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:28:22,732] Trial 22 finished with value: 4944.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 1, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:28:56,495] Trial 23 finished with value: 281.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 1, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:31:32,709] Trial 24 finished with value: 4153.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 6, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:32:50,880] Trial 25 finished with value: 6447.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 3, 'embedding_dim': 24}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:33:24,464] Trial 26 finished with value: 293.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 1, 'embedding_dim': 24}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:34:36,198] Trial 27 finished with value: 3377.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 3, 'embedding_dim': 8}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:38:20,949] Trial 28 finished with value: 2020.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 10, 'embedding_dim': 8}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:42:27,707] Trial 29 finished with value: 6249.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 10, 'embedding_dim': 32}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:46:16,290] Trial 30 finished with value: 4009.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 10, 'embedding_dim': 8}. Best is trial 12 with value: 6710.0.
[I 2022-10-04 08:48:51,938] Trial 31 finished with value: 6812.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 6, 'embedding_dim': 32}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 08:49:23,913] Trial 32 finished with value: 3925.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 1, 'embedding_dim': 16}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 08:53:36,826] Trial 33 finished with value: 6195.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 10, 'embedding_dim': 32}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 08:57:30,586] Trial 34 finished with value: 6165.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 10, 'embedding_dim': 24}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 08:59:58,350] Trial 35 finished with value: 6124.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 6, 'embedding_dim': 16}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:00:30,253] Trial 36 finished with value: 4490.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 1, 'embedding_dim': 24}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:01:02,002] Trial 37 finished with value: 5113.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 1, 'embedding_dim': 16}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:02:24,853] Trial 38 finished with value: 1236.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 3, 'embedding_dim': 32}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:03:43,660] Trial 39 finished with value: 1201.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 3, 'embedding_dim': 24}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:06:07,186] Trial 40 finished with value: 3866.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 6, 'embedding_dim': 24}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:10:23,140] Trial 41 finished with value: 3716.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 10, 'embedding_dim': 24}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:12:41,196] Trial 42 finished with value: 2521.0 and parameters: {'learning_rate': 0.005, 'num_epochs': 6, 'embedding_dim': 8}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:16:38,086] Trial 43 finished with value: 5591.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 10, 'embedding_dim': 24}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:20:32,254] Trial 44 finished with value: 4899.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 10, 'embedding_dim': 16}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:22:56,566] Trial 45 finished with value: 3480.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 6, 'embedding_dim': 8}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:23:28,992] Trial 46 finished with value: 237.0 and parameters: {'learning_rate': 0.0001, 'num_epochs': 1, 'embedding_dim': 16}. Best is trial 31 with value: 6812.0.
[I 2022-10-04 09:26:01,694] Trial 47 finished with value: 6448.0 and parameters: {'learning_rate': 0.001, 'num_epochs': 6, 'embedding_dim': 24}. Best is trial 31 with value: 6812.0.


Now that we have performed hyperparameter optimization (a `study` in `Optuna`'s parlance), let's explore the results.

We can query the `study` for the best parameters along with the best attained result.

In [13]:
study.best_params

{'learning_rate': 0.001, 'num_epochs': 6, 'embedding_dim': 32}

In [14]:
study.best_value

6812.0

We can also query the `study` for the number of `trials` that have been run.

In [18]:
len(study.trials)

48

And if we would like, we can also take a look at each individual run as follows.

In [19]:
study.trials[0]

FrozenTrial(number=0, values=[2710.0], datetime_start=datetime.datetime(2022, 10, 4, 7, 47, 1, 350748), datetime_complete=datetime.datetime(2022, 10, 4, 7, 47, 36, 883536), params={'learning_rate': 0.005, 'num_epochs': 1, 'embedding_dim': 8}, distributions={'learning_rate': FloatDistribution(high=0.1, log=False, low=1e-05, step=None), 'num_epochs': IntDistribution(high=100, log=False, low=1, step=1), 'embedding_dim': IntDistribution(high=128, log=False, low=4, step=1)}, user_attrs={}, system_attrs={'search_space': OrderedDict([('embedding_dim', [8, 16, 24, 32]), ('learning_rate', [0.005, 0.001, 0.0001]), ('num_epochs', [1, 3, 6, 10])]), 'grid_id': 0}, intermediate_values={}, trial_id=0, state=TrialState.COMPLETE, value=None)

In [20]:
study.trials[0].params

{'learning_rate': 0.005, 'num_epochs': 1, 'embedding_dim': 8}

## Summary