In this notebook, we'll learn how to: 1) adapt locally trained model code to use the hyper-parameter tuning service on Vertex AI, 2) build a custom container and push its image to GCP Artifact Registry, and 3) specify the hardware for the hp-tuning job on Vertex AI.

## Initial setup

In [None]:
import os

# The Vertex AI Workbench Notebook product has specific requirements
IS_WORKBENCH_NOTEBOOK = os.getenv("DL_ANACONDA_HOME")
IS_USER_MANAGED_WORKBENCH_NOTEBOOK = os.path.exists(
    "/opt/deeplearning/metadata/env_version"
)

# Vertex AI Notebook requires dependencies to be installed with '--user'
!pip3 install --upgrade google-cloud-aiplatform --user -q

Restart the kernel so that the newly installed package is picked up.

In [None]:
# Fill in the appropriate values.
PROJECT_ID = ""
REGION = "us-central1"
BUCKET_NAME = ""
BUCKET_URI = f"gs://{BUCKET_NAME}"

In [None]:
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

In [None]:
aiplatform.init(project=PROJECT_ID,
                location=REGION,
                staging_bucket=BUCKET_URI)

## Code refactoring

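Adapt your model training code so that it reads each hyper-parameter from a command-line argument and reports the evaluation metric back to the tuning service, then put it in trainer/task.py.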

References: https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/training/hyperparameter_tuning_xgboost.ipynb and/or https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/training/hyperparameter_tuning_tensorflow.ipynb
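
As a rough guide, trainer/task.py could be organized along the lines of the sketch below. It assumes the `cloudml-hypertune` package is installed in the container image. The hyper-parameter names (`learning_rate`, `max_depth`), the metric name (`accuracy`) and the `train_and_evaluate` helper are hypothetical placeholders, not part of the original code.

```python
import argparse


def parse_args(argv=None):
    """Parse the hyper-parameters passed as command-line flags for a trial."""
    parser = argparse.ArgumentParser()
    # Hypothetical names: use the same keys as in your parameter_spec.
    parser.add_argument("--learning_rate", type=float, default=0.01)
    parser.add_argument("--max_depth", type=int, default=5)
    return parser.parse_args(argv)


def train_and_evaluate(learning_rate, max_depth):
    """Placeholder: train your model and return its validation metric."""
    raise NotImplementedError("Replace with your own training code.")


def report_metric(tag, value, step=1):
    """Send the metric to the tuning service (needs cloudml-hypertune)."""
    import hypertune  # imported lazily so the rest runs without the package

    hypertune.HyperTune().report_hyperparameter_tuning_metric(
        hyperparameter_metric_tag=tag,  # must match the key of metric_spec
        metric_value=value,
        global_step=step,
    )


def main(argv=None):
    """Run one trial: parse flags, train, then report the metric."""
    args = parse_args(argv)
    score = train_and_evaluate(args.learning_rate, args.max_depth)
    report_metric("accuracy", score)  # "accuracy" is a hypothetical metric name
```

In the real file, `main()` would be called at the bottom, under an `if __name__ == "__main__":` guard, so that the container's entry point runs one trial per invocation.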

## Create custom container image

Create a repository in GCP Artifact Registry.

In [None]:
REPO_NAME='census-income'

!gcloud artifacts repositories create $REPO_NAME \
--repository-format=docker \
--location=$REGION \
--description="Docker repository"

In [None]:
!gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet

In [None]:
IMAGE_URI = (
             f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPO_NAME}/<image-name>:<image-tag>"
            )

Customize the given Dockerfile for your custom container

## Build image and push to Artifact Registry

In [None]:
!docker build ./ -t $IMAGE_URI

In [None]:
!docker push $IMAGE_URI

In [None]:
IMAGE_URI

Check that the image appears in the GCP console.

## Define specs

Choose your hardware

In [None]:
# Provide values for all of the arguments below.
worker_pool_specs = [
    {
        "machine_spec": {
            "machine_type": "",
            "accelerator_type": "",
            "accelerator_count": 1,
        },
        "replica_count": 1,
        "container_spec": {"image_uri": IMAGE_URI},
    }
]
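
One way to fill in this structure is sketched below. The helper `make_worker_pool_specs` is hypothetical, and the machine type, accelerator type and image URI are example values only; check which machine and accelerator types are available in your region. The sketch leaves out the accelerator fields entirely for a CPU-only job instead of passing empty strings.

```python
def make_worker_pool_specs(image_uri, machine_type="n1-standard-4",
                           accelerator_type=None, accelerator_count=0):
    """Build a single-replica worker pool spec for a custom container."""
    machine_spec = {"machine_type": machine_type}
    # Only add the accelerator fields when an accelerator is requested.
    if accelerator_type and accelerator_count > 0:
        machine_spec["accelerator_type"] = accelerator_type
        machine_spec["accelerator_count"] = accelerator_count
    return [
        {
            "machine_spec": machine_spec,
            "replica_count": 1,
            "container_spec": {"image_uri": image_uri},
        }
    ]


# Hypothetical image URI, used here for illustration only.
example_uri = "us-central1-docker.pkg.dev/my-project/census-income/trainer:v1"

cpu_specs = make_worker_pool_specs(example_uri)
gpu_specs = make_worker_pool_specs(
    example_uri, accelerator_type="NVIDIA_TESLA_T4", accelerator_count=1
)
```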

Set the search space for your chosen hyper-parameters, for example:

In [None]:
parameter_spec = {
    "<hyper_parameter1>": hpt.DoubleParameterSpec(  # float
        min=0.001, max=0.1, scale="log"
    ),
    "<hyper_parameter2>": hpt.DiscreteParameterSpec(  # int
        values=[5, 10, 15, 20], scale=None
    ),
}
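
To get a feel for what `scale="log"` implies for a range such as 0.001 to 0.1, the sketch below lists points that are evenly spaced on a logarithmic axis. It only illustrates the scaling. The values actually tried are chosen by the tuning service's search algorithm, and `log_scale_grid` is a hypothetical helper, not part of the SDK.

```python
import math


def log_scale_grid(low, high, num):
    """Return `num` points evenly spaced in log10 between low and high."""
    if num < 2:
        raise ValueError("num must be at least 2")
    log_low, log_high = math.log10(low), math.log10(high)
    step = (log_high - log_low) / (num - 1)
    return [10 ** (log_low + i * step) for i in range(num)]


# Each order of magnitude gets the same share of the grid.
grid = log_scale_grid(0.001, 0.1, 3)
print(grid)
```

On a linear scale, most of the same range would lie between 0.01 and 0.1, so small values such as 0.001 to 0.01 would rarely be explored.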

Set your model performance metric and the objective

In [None]:
# The key is the metric name reported by the training code; the value is the goal.
metric_spec = {"<metric>": "<minimize or maximize>"}
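
A filled-in metric spec might look like the sketch below. The metric name `accuracy` is a hypothetical example, and `check_metric_spec` is a small helper written for this notebook, not part of the SDK, that catches a mistyped goal before the job is submitted.

```python
VALID_GOALS = {"minimize", "maximize"}


def check_metric_spec(metric_spec):
    """Raise ValueError if any goal is not 'minimize' or 'maximize'."""
    for metric_id, goal in metric_spec.items():
        if goal not in VALID_GOALS:
            raise ValueError(
                f"Goal for {metric_id!r} must be one of {sorted(VALID_GOALS)}, "
                f"got {goal!r}"
            )
    return metric_spec


# "accuracy" must be the same tag that the training code reports.
example_metric_spec = check_metric_spec({"accuracy": "maximize"})
```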

## Hyper-parameter Tuning job

Provide appropriate values for the following arguments.

In [None]:
my_custom_job = aiplatform.CustomJob(
    display_name="<custom-job-display-name>",
    worker_pool_specs=worker_pool_specs,
    staging_bucket=BUCKET_URI,
)

In [None]:
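hp_job = aiplatform.HyperparameterTuningJob(
    display_name="<hp-tuning-job-display-name>",
    custom_job=my_custom_job,
    metric_spec=metric_spec,
    parameter_spec=parameter_spec,
    max_trial_count=4,  # example value: total number of trials to run
    parallel_trial_count=2,  # example value: trials that run at the same time
)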

hp_job.run()

Check the job's output in the GCP console.
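
The results can also be inspected from the notebook through `hp_job.trials`. The sketch below shows one way to pick the best trial. It assumes each trial exposes `id`, `parameters` and `final_measurement.metrics` as the SDK's trial objects do, and it uses hand-made stand-in objects so that it runs without a finished job. `best_trial` is a hypothetical helper, and the parameter and metric names are examples.

```python
from types import SimpleNamespace


def best_trial(trials, metric_id, goal="maximize"):
    """Return the trial with the best final value of `metric_id`."""
    scored = []
    for trial in trials:
        # Trials that failed may not have reported the metric; skip them.
        for metric in trial.final_measurement.metrics:
            if metric.metric_id == metric_id:
                scored.append((metric.value, trial))
    if not scored:
        raise ValueError(f"No trial reported metric {metric_id!r}")
    pick = max if goal == "maximize" else min
    return pick(scored, key=lambda pair: pair[0])[1]


def fake_trial(trial_id, learning_rate, accuracy):
    """Build a stand-in object shaped like a finished trial (demo only)."""
    return SimpleNamespace(
        id=trial_id,
        parameters=[
            SimpleNamespace(parameter_id="learning_rate", value=learning_rate)
        ],
        final_measurement=SimpleNamespace(
            metrics=[SimpleNamespace(metric_id="accuracy", value=accuracy)]
        ),
    )


demo_trials = [fake_trial("1", 0.1, 0.81), fake_trial("2", 0.01, 0.86)]
best = best_trial(demo_trials, "accuracy", goal="maximize")
print(best.id, {p.parameter_id: p.value for p in best.parameters})
```

With a completed job, the call would be `best_trial(hp_job.trials, "<metric>", "<goal>")`, using the same metric name and goal as in `metric_spec`.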