In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI TensorBoard hyperparameter tuning with the HParams Dashboard

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/tensorboard/tensorboard_hyperparameter_tuning_with_hparams.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Ftensorboard%2Ftensorboard_hyperparameter_tuning_with_hparams.ipynb">
      <img width="32px" src="https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>    
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/tensorboard/tensorboard_hyperparameter_tuning_with_hparams.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/tensorboard/tensorboard_hyperparameter_tuning_with_hparams.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

## Overview

In this tutorial, you learn how to log hyperparameter experiment results in TensorFlow and visualize the results in TensorBoard's HParams dashboard.

**_NOTE_**: This notebook is tested in the following environments:

* Python version = 3.9

### Objective

In this notebook, you train a model and perform hyperparameter tuning using TensorFlow. You also log the hyperparameters and metrics in Vertex AI TensorBoard.

This tutorial uses the following Vertex AI services and resources:

- Vertex AI TensorBoard
- Vertex AI Experiments

The steps performed include:

* Adapt TensorFlow runs to log hyperparameters and metrics.
* Start runs and log them all under one parent directory.
* Visualize the results in TensorBoard's HParams dashboard.

### Dataset

This tutorial uses the [FashionMNIST](https://github.com/zalandoresearch/fashion-mnist) dataset.


### Costs

This tutorial uses the following billable components of Google Cloud:

* Vertex AI

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing),
and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

## Get started

### Install Vertex AI SDK for Python and other required packages


In [None]:
! pip3 install --upgrade --quiet google-cloud-aiplatform[tensorboard] \
                                 tensorflow

### Restart runtime (Colab only)

To use the newly installed packages, you must restart the runtime on Google Colab.

In [None]:
import sys

if "google.colab" in sys.modules:

    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

Authenticate your environment on Google Colab.


In [None]:
import sys

if "google.colab" in sys.modules:

    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information and initialize Vertex AI SDK for Python

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}


from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=LOCATION)

## What is Vertex AI TensorBoard?

Vertex AI TensorBoard is an enterprise-ready managed
version of [open source TensorBoard](https://www.tensorflow.org/tensorboard/get_started)
(TB), a Google open source project for machine learning experiment
visualization.

Vertex AI TensorBoard provides various detailed visualizations, including the following:

*   tracking and visualizing metrics, such as loss and accuracy over time,
*   visualizing model computational graphs (ops and layers),
*   viewing histograms of weights, biases, or other tensors as they change over time,
*   projecting embeddings to a lower dimensional space,
*   displaying image, text, and audio samples.

In addition to the powerful visualizations from
TensorBoard, Vertex AI TensorBoard provides the following benefits:

*  a persistent, shareable link to your experiment's dashboard,

*  a searchable list of all experiments in a project,

*  integrations with Vertex AI services for model training,

*  enterprise-grade security, privacy, and compliance.

With Vertex AI TensorBoard, you can track, visualize, and compare
ML experiments and share them with your team.

Learn more about [Vertex AI TensorBoard](https://cloud.google.com/vertex-ai/docs/experiments/tensorboard-introduction).

## Load TensorBoard and TensorFlow components

Load the TensorBoard notebook extension and import TensorFlow and the TensorBoard HParams plugin.


In [None]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

# Clear any logs from previous runs
!rm -rf ./logs/

# Import TensorFlow and the TensorBoard HParams plugin
import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

## Download dataset

Download the [FashionMNIST](https://github.com/zalandoresearch/fashion-mnist) dataset and scale it.

In [None]:
fashion_mnist = tf.keras.datasets.fashion_mnist

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

## Set up the experiment

Run an experiment by specifying values for the following hyperparameters:

* number of units in the first dense layer
* dropout rate in the dropout layer
* optimizer

Specify the hyperparameter values for the experiment in TensorBoard.

*Optional*: For more fine-grained filtering of hyperparameters in the Google Cloud console, provide domain information and specify which metrics to display.

In [None]:
HP_NUM_UNITS = hp.HParam("num_units", hp.Discrete([16, 32]))
HP_DROPOUT = hp.HParam("dropout", hp.RealInterval(0.1, 0.2))
HP_OPTIMIZER = hp.HParam("optimizer", hp.Discrete(["adam", "sgd"]))

METRIC_ACCURACY = "accuracy"

with tf.summary.create_file_writer("logs/hparam_tuning").as_default():
    hp.hparams_config(
        hparams=[HP_NUM_UNITS, HP_DROPOUT, HP_OPTIMIZER],
        metrics=[hp.Metric(METRIC_ACCURACY, display_name="Accuracy")],
    )

## Adapt TensorFlow runs to log hyperparameters and metrics

The model you define is quite simple: two dense layers with a dropout layer between them. Your hyperparameters are provided in an `hparams` dictionary and used throughout the training function.

In [None]:
def train_test_model(hparams):
    model = tf.keras.models.Sequential(
        [
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(hparams[HP_NUM_UNITS], activation=tf.nn.relu),
            tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
            tf.keras.layers.Dense(10, activation=tf.nn.softmax),
        ]
    )
    model.compile(
        optimizer=hparams[HP_OPTIMIZER],
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # Train for a single epoch to keep the demo fast.
    model.fit(x_train, y_train, epochs=1)
    _, accuracy = model.evaluate(x_test, y_test)
    return accuracy

For each run, log the summary with hyperparameters and final accuracy.

In [None]:
def run(run_dir, hparams):
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)  # record the values used in this trial
        accuracy = train_test_model(hparams)
        tf.summary.scalar(METRIC_ACCURACY, accuracy, step=1)

## Start runs and log them all under one parent directory

You can now try multiple experiments, training each one with a different set of hyperparameters.

For simplicity, use grid search: try all combinations of the discrete parameters and only the lower and upper bounds of the real-valued parameter. For more complex scenarios, it might be more effective to choose each hyperparameter value randomly (this is called random search). More advanced methods, such as Bayesian optimization, also exist.
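
To make the difference between the two search strategies concrete, here is a minimal standard-library sketch. It uses plain lists as stand-ins for the tutorial's three hyperparameter domains; the names `grid_trials` and `random_trials` are hypothetical helpers, not part of TensorFlow or TensorBoard.

```python
import itertools
import random

# Plain-Python stand-ins for the three hyperparameter domains in this tutorial.
NUM_UNITS = [16, 32]
DROPOUT_BOUNDS = (0.1, 0.2)
OPTIMIZERS = ["adam", "sgd"]


def grid_trials():
    """Return every combination of the discrete values and the dropout bounds."""
    return [
        {"num_units": units, "dropout": dropout, "optimizer": optimizer}
        for units, dropout, optimizer in itertools.product(
            NUM_UNITS, DROPOUT_BOUNDS, OPTIMIZERS
        )
    ]


def random_trials(count, seed=0):
    """Return `count` trials, each sampled independently from the domains."""
    rng = random.Random(seed)
    low, high = DROPOUT_BOUNDS
    return [
        {
            "num_units": rng.choice(NUM_UNITS),
            "dropout": rng.uniform(low, high),
            "optimizer": rng.choice(OPTIMIZERS),
        }
        for _ in range(count)
    ]


print(len(grid_trials()))  # 2 * 2 * 2 = 8 trials
print(random_trials(3))
```

The grid grows multiplicatively with every added hyperparameter, whereas random search lets you fix the trial budget up front and samples dropout from the whole interval rather than only its bounds.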

In the following cell, run a few experiments. This takes a few minutes to complete.

In [None]:
session_num = 0

for num_units in HP_NUM_UNITS.domain.values:
    for dropout_rate in (HP_DROPOUT.domain.min_value, HP_DROPOUT.domain.max_value):
        for optimizer in HP_OPTIMIZER.domain.values:
            hparams = {
                HP_NUM_UNITS: num_units,
                HP_DROPOUT: dropout_rate,
                HP_OPTIMIZER: optimizer,
            }
            run_name = "run-%d" % session_num
            print("--- Starting trial: %s" % run_name)
            print({h.name: hparams[h] for h in hparams})
            run("logs/hparam_tuning/" + run_name, hparams)
            session_num += 1
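
Before uploading, it can help to confirm that every trial wrote its logs under the parent directory. The following is a small sketch using only the standard library; `find_run_dirs` is a hypothetical helper, and it assumes that TensorFlow summary writers name their files with the `events.out.tfevents.` prefix.

```python
from pathlib import Path


def find_run_dirs(parent):
    """Return sorted names of subdirectories of `parent` that contain event files."""
    parent = Path(parent)
    if not parent.is_dir():
        return []
    return sorted(
        child.name
        for child in parent.iterdir()
        if child.is_dir() and any(child.glob("events.out.tfevents.*"))
    )


# After the loop above, this should list one directory per trial.
print(find_run_dirs("logs/hparam_tuning"))
```

If the list is shorter than the number of trials you started, a run probably failed before its summary writer produced any output.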

## Create Vertex AI TensorBoard

Before you can visualize the experiments, you must create a Vertex AI TensorBoard instance, which is a regionalized resource that stores your Vertex AI TensorBoard experiments. You can create multiple instances in a project.

To learn more, see [Create a Vertex AI TensorBoard instance](https://cloud.google.com/vertex-ai/docs/experiments/tensorboard-setup#create-tensorboard-instance).

Create a TensorBoard instance to be used by the training job.

In [None]:
# Set the display name for your tensorboard instance
TENSORBOARD_NAME = f"tb-name-{PROJECT_ID}-unique"  # @param {type:"string"}

tensorboard = aiplatform.Tensorboard.create(
    display_name=TENSORBOARD_NAME, project=PROJECT_ID, location=LOCATION
)
TENSORBOARD_RESOURCE_NAME = tensorboard.gca_resource.name
print("TensorBoard resource name:", TENSORBOARD_RESOURCE_NAME)

Set your TensorBoard Experiment name.

In [None]:
EXPERIMENT_NAME = f"experiment-name-{PROJECT_ID}-unique"  # @param {type:"string"}
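
If you rerun the notebook, reusing the same name can collide with an earlier experiment. One option is to append a short random suffix, as in this sketch. `unique_name` is a hypothetical helper, and the naming rule in `NAME_PATTERN` (lowercase letters, digits, and hyphens, at most 128 characters) is an assumption that you should check against the Vertex AI documentation.

```python
import re
import uuid

# Assumed naming rule; verify it against the Vertex AI documentation.
NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9-]{0,127}")


def unique_name(prefix):
    """Append a short random suffix to `prefix` and normalize the result."""
    suffix = uuid.uuid4().hex[:8]
    # Lowercase the name and replace any run of disallowed characters with a hyphen.
    name = re.sub(r"[^a-z0-9-]+", "-", f"{prefix}-{suffix}".lower()).strip("-")
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"Cannot build a valid name from prefix {prefix!r}")
    return name


print(unique_name("experiment-name"))
```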

Upload the logs to your Vertex AI TensorBoard.

In [None]:
!tb-gcp-uploader --one_shot=True --tensorboard_resource_name=$TENSORBOARD_RESOURCE_NAME --logdir="logs/hparam_tuning/" --experiment_name=$EXPERIMENT_NAME
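
Outside a notebook, the `!` shell syntax isn't available. The sketch below assembles the same command as an argument list that you could pass to `subprocess.run`. It uses only the flags shown in the cell above; `build_upload_command` is a hypothetical helper, and the resource and experiment names are placeholders.

```python
import shlex


def build_upload_command(tensorboard_resource_name, experiment_name, logdir):
    """Return the argument list for a one-shot `tb-gcp-uploader` invocation."""
    return [
        "tb-gcp-uploader",
        "--one_shot=True",
        f"--tensorboard_resource_name={tensorboard_resource_name}",
        f"--logdir={logdir}",
        f"--experiment_name={experiment_name}",
    ]


command = build_upload_command(
    "projects/123/locations/us-central1/tensorboards/456",  # placeholder resource name
    "my-experiment",  # placeholder experiment name
    "logs/hparam_tuning/",
)
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a name or path contains spaces.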

## Visualize the results in Vertex AI TensorBoard's HParams tab

Click the generated TensorBoard link, and then click "HParams" at the top.

The left pane of the dashboard provides filtering capabilities that are active across all the views in the HParams dashboard. In this pane, you can:

- Filter which hyperparameters/metrics are shown in the dashboard.
- Filter which hyperparameter/metrics values are shown in the dashboard.
- Filter on run status (running, success, etc.).
- Sort by hyperparameter/metric in the table view.
- Select number of session groups to show (useful for performance when there are many experiments).

The HParams dashboard has three different views, with various useful information:

* The **Table View** lists the runs, their hyperparameters, and their metrics.
* The **Parallel Coordinates View** shows each run as a line going through an axis for each hyperparameter and metric. Click and drag the mouse on any axis to mark a region, which highlights only the runs that pass through it. This can be useful for identifying which groups of hyperparameters are most important. You can reorder the axes by dragging them.
* The **Scatter Plot Matrix View** shows plots comparing each hyperparameter/metric with each metric. This can help identify correlations. Click and drag to select a region in a specific plot and highlight those sessions across the other plots.

In each view, you can click a run (a table row, a parallel coordinates line, or a scatter plot marker) to see plots of the metrics as a function of training steps for that session (although in this tutorial only one step is logged for each run).
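
The sorting and range filtering that the dashboard offers can also be expressed in a few lines of code, which is handy if you collect results programmatically. This is a sketch only: `rank_runs` and `filter_runs` are hypothetical helpers, and the accuracy values are made-up placeholders, not results from this tutorial.

```python
# Placeholder records: the accuracy values are invented for illustration only.
runs = [
    {"run": "run-0", "optimizer": "adam", "accuracy": 0.70},
    {"run": "run-1", "optimizer": "sgd", "accuracy": 0.50},
    {"run": "run-2", "optimizer": "adam", "accuracy": 0.90},
]


def rank_runs(records, metric="accuracy"):
    """Sort runs from best to worst on `metric`, like sorting the table view."""
    return sorted(records, key=lambda record: record[metric], reverse=True)


def filter_runs(records, metric="accuracy", low=0.0, high=1.0):
    """Keep runs whose `metric` lies in [low, high], like selecting a region on an axis."""
    return [record for record in records if low <= record[metric] <= high]


print([record["run"] for record in rank_runs(runs)])
print([record["run"] for record in filter_runs(runs, low=0.6)])
```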

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [None]:
# Delete the Vertex AI Experiment
experiment = aiplatform.Experiment(EXPERIMENT_NAME)
experiment.delete()

# Delete the tensorboard instance
tensorboard.delete()

# Delete the locally generated logs folder
! rm -rf logs/