In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Tabular Workflows: TabNet Pipeline

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/tabular_workflows/tabnet_on_vertex_pipelines.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/tabular_workflows/tabnet_on_vertex_pipelines.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/tabular_workflows/tabnet_on_vertex_pipelines.ipynb">
        <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>
<br/><br/><br/>

## Overview

This notebook showcases how to run the TabNet algorithm using Vertex AI Tabular Workflows.

Learn more about [Tabular Workflow for TabNet](https://cloud.google.com/vertex-ai/docs/tabular-data/tabular-workflows/tabnet).

### Objective

In this tutorial, you learn how to create two classification models using Vertex AI TabNet Tabular Workflows. Each workflow is a managed instance of [Vertex AI Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines/introduction).

This tutorial uses the following Google Cloud ML services and resources:

- Vertex AI Training
- Vertex AI Pipelines
- Cloud Storage

The steps performed include:

- Create a TabNet CustomJob. This is the best option if you know which hyperparameters to use for training.
- Create a TabNet HyperparameterTuningJob. This allows you to get the best set of hyperparameters for your dataset.

After training, each pipeline returns a link to the Vertex Model UI. You can use the UI to deploy the model, get online predictions, or run batch prediction.

### Dataset

The dataset you will be using is [Bank Marketing](https://archive.ics.uci.edu/ml/datasets/bank+marketing).
The data is for direct marketing campaigns (phone calls) of a Portuguese banking institution. The binary classification goal is to predict whether a client will subscribe to a term deposit. For this notebook, 90% of the rows in the original dataset were randomly selected and saved in a train.csv file hosted on Cloud Storage. To download the file, click [here](https://storage.googleapis.com/cloud-samples-data-us-central1/vertex-ai/tabular-workflows/datasets/bank-marketing/train.csv).
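
As a rough illustration of how a random 90% subset of a CSV file can be produced, here is a minimal sketch using only the standard library. The function name `sample_rows`, the fixed seed, and the tiny inline sample are assumptions made for this example; this is not the script that produced the hosted train.csv.

```python
import csv
import io
import random


def sample_rows(csv_text: str, fraction: float, seed: int = 0) -> str:
    """Return a CSV string with the header plus a random subset of the data rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    rng = random.Random(seed)  # a fixed seed keeps the selection reproducible
    keep = sorted(rng.sample(range(len(data)), k=round(len(data) * fraction)))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(data[i] for i in keep)
    return out.getvalue()


# Tiny made-up sample; the real file has the Bank Marketing columns.
example = "age,job,deposit\n" + "\n".join(
    f"{20 + i},admin,{i % 2}" for i in range(10)
)
subset = sample_rows(example, fraction=0.9)
print(len(subset.splitlines()) - 1, "of 10 data rows kept")
```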

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI
* Cloud Storage

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Set up your local development environment

**If you are using Colab or Vertex AI Workbench Notebooks**, your environment already meets
all the requirements to run this notebook. You can skip this step.

**Otherwise**, make sure your environment meets this notebook's requirements.
You need the following:

* The Google Cloud SDK
* Git
* Python 3
* virtualenv
* Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [Setting up a Python development
environment](https://cloud.google.com/python/setup) and the [Jupyter
installation guide](https://jupyter.org/install) provide detailed instructions
for meeting these requirements. The following steps provide a condensed set of
instructions:

1. [Install and initialize the Cloud SDK.](https://cloud.google.com/sdk/docs/)

1. [Install Python 3.](https://cloud.google.com/python/setup#installing_python)

1. [Install
   virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv)
   and create a virtual environment that uses Python 3. Activate the virtual environment.

1. To install Jupyter, run `pip3 install jupyter` on the
command-line in a terminal shell.

1. To launch Jupyter, run `jupyter notebook` on the command-line in a terminal shell.

1. Open this notebook in the Jupyter Notebook Dashboard.

## Installation

Install the following packages required to execute this notebook.

In [None]:
import os

# The Vertex AI Workbench Notebook product has specific requirements
IS_WORKBENCH_NOTEBOOK = os.getenv("DL_ANACONDA_HOME")
IS_USER_MANAGED_WORKBENCH_NOTEBOOK = os.path.exists(
    "/opt/deeplearning/metadata/env_version"
)

# Vertex AI Notebook requires dependencies to be installed with '--user'
USER_FLAG = ""
if IS_WORKBENCH_NOTEBOOK:
    USER_FLAG = "--user"

! pip3 install --upgrade google-cloud-aiplatform google-cloud-pipeline-components {USER_FLAG} -q


### Restart the kernel
Once you've installed the additional packages, you need to restart the notebook kernel so it can find the packages.


**Note: Once this cell has finished running, continue on. You do not need to re-run any of the cells above.**


In [None]:
import os

if not os.getenv("IS_TESTING"):
    # Automatically restart kernel after installs
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

## Before you begin

### GPU runtime

This tutorial does not require a GPU runtime.

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

2. [Make sure that billing is enabled for your project.](https://cloud.google.com/billing/docs/how-to/modify-project)

3. [Enable the following APIs: Vertex AI APIs, Dataflow APIs, Compute Engine APIs, and Cloud Storage.](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,dataflow.googleapis.com,compute_component,storage-component.googleapis.com)

4. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

5. Enter your project ID in the cell below. Then run the  cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$`.

## Notes about service account and permission

**By default, no configuration is required.** If you run into any permission-related issues, make sure the service accounts below have the required roles:

|Service account email|Description|Roles|
|---|---|---|
|PROJECT_NUMBER-compute@developer.gserviceaccount.com|Compute Engine default service account|Dataflow Admin, Dataflow Worker, Storage Admin, BigQuery Admin, Vertex AI User|
|service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com|AI Platform Service Agent|Vertex AI Service Agent|


1. Go to https://console.cloud.google.com/iam-admin/iam.
2. Check the "Include Google-provided role grants" checkbox.
3. Find the above emails.
4. Grant the corresponding roles.
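
To make it easier to find the two service accounts in the IAM page, the following sketch builds their email addresses from a project number, following the patterns in the table above. The helper name `default_service_accounts` and the project number `123456789012` are placeholders chosen for this example.

```python
def default_service_accounts(project_number: str) -> dict:
    """Build the two service account emails from the table for a project number."""
    return {
        "compute_engine_default": (
            f"{project_number}-compute@developer.gserviceaccount.com"
        ),
        "ai_platform_service_agent": (
            f"service-{project_number}@gcp-sa-aiplatform.iam.gserviceaccount.com"
        ),
    }


# Placeholder project number; replace it with your own.
accounts = default_service_accounts("123456789012")
for description, email in accounts.items():
    print(f"{description}: {email}")
```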

### Using data source from a different project
- For the BQ data source, grant both service accounts the "BigQuery Data Viewer" role.
- For the CSV data source, grant both service accounts the "Storage Object Viewer" role.


### Set your project ID

Set your project ID below. If you don't know your project ID, leave the field blank and the following cells may be able to find it. Optionally, you may also set a service account in the cell below.

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

In [None]:
if PROJECT_ID == "" or PROJECT_ID is None or PROJECT_ID == "[your-project-id]":
    # Get your GCP project id from gcloud
    shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID:", PROJECT_ID)

In [None]:
! gcloud config set project $PROJECT_ID

#### Region

You can also change the `REGION` variable, which is used for operations
throughout the rest of this notebook.  Below are regions supported for Vertex AI. It is recommended that you choose the region closest to you.

- Americas: `us-central1`
- Europe: `europe-west4`
- Asia Pacific: `asia-east1`

You may not use a multi-regional bucket for training with Vertex AI. Not all regions provide support for all Vertex AI services.

Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "[your-region]"  # @param {type: "string"}

if REGION == "[your-region]":
    REGION = "us-central1"

### Authenticate your Google Cloud account

**If you are using Vertex AI Workbench Notebooks**, your environment is already
authenticated.

**If you are using Colab**, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.

**Otherwise**, follow these steps:

- In the Cloud Console, go to the [Create service account key](https://console.cloud.google.com/apis/credentials/serviceaccountkey) page.

- **Click Create service account**.

- In the **Service account name** field, enter a name, and click **Create**.

- In the **Grant this service account access to project** section, click the Role drop-down list. Type "Vertex" into the filter box, and select **Vertex AI Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

- Click Create. A JSON file that contains your key downloads to your local environment.

- Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.

In [None]:
# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

import os
import sys

# If on Vertex AI Workbench, then don't execute this code
IS_COLAB = "google.colab" in sys.modules
if not os.path.exists("/opt/deeplearning/metadata/env_version") and not os.getenv(
    "DL_ANACONDA_HOME"
):
    if "google.colab" in sys.modules:
        from google.colab import auth as google_auth

        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    elif not os.getenv("IS_TESTING"):
        %env GOOGLE_APPLICATION_CREDENTIALS '[your-service-account-key-path]'

### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

All training-related files (TF model checkpoint, TensorBoard file, etc.) are saved to the GCS bucket. The pipeline does not clean up these files, since some of them might be useful to you, so **please make sure to clean up the files**. For easy cleanup, you can set a [GCS bucket level TTL](https://cloud.google.com/storage/docs/lifecycle).

Set the name of your Cloud Storage bucket below. Bucket names must be globally unique across all Google Cloud projects, including those outside of your organization.
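
A bucket-level TTL is expressed as a lifecycle configuration in JSON. The sketch below builds a configuration that deletes objects older than a given number of days. The helper name `lifecycle_config` is an assumption for this example; the resulting JSON can be written to a file and passed to `gsutil lifecycle set`.

```python
import json


def lifecycle_config(ttl_days: int) -> str:
    """Return a lifecycle configuration that deletes objects older than ttl_days."""
    if ttl_days < 1:
        raise ValueError("ttl_days must be at least 1")
    rule = {"action": {"type": "Delete"}, "condition": {"age": ttl_days}}
    return json.dumps({"rule": [rule]})


config = lifecycle_config(7)
print(config)
```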


In [None]:
BUCKET_URI = "gs://[your-bucket-name]"  # @param {type:"string"}
GENERATE_BUCKET_URI = True  # @param {type:"boolean"}

Create the bucket if it doesn't already exist.

In [None]:
import uuid

if GENERATE_BUCKET_URI:
    bucket_name = "gs://test-{}".format(uuid.uuid4())
    !gsutil mb -p {PROJECT_ID} -l {REGION} {bucket_name}

    # set GCS bucket object TTL to 7 days
    !echo '{"rule":[{"action": {"type": "Delete"},"condition": {"age": 7}}]}' > gcs_lifecycle.tmp
    !gsutil lifecycle set gcs_lifecycle.tmp {bucket_name}
    !rm gcs_lifecycle.tmp

    BUCKET_URI = bucket_name
    print(f"Changed BUCKET_URI to {BUCKET_URI} because GENERATE_BUCKET_URI is True")

if BUCKET_URI == "" or BUCKET_URI is None or BUCKET_URI == "gs://[your-bucket-name]":
    # uuid4() returns a UUID object, so convert it to a string before concatenating
    BUCKET_URI = "gs://" + PROJECT_ID + "aip-" + str(uuid.uuid4())

! gsutil ls -b $BUCKET_URI || gsutil mb -l $REGION $BUCKET_URI

Finally, validate access to your Cloud Storage bucket by examining its contents:

In [None]:
! gsutil ls -al $BUCKET_URI

#### Service Account

You use a service account to create Vertex AI Pipeline jobs. If you do not want to use your project's Compute Engine service account, set `SERVICE_ACCOUNT` to another service account ID.

In [None]:
SERVICE_ACCOUNT = "[your-service-account]"  # @param {type:"string"}

In [None]:
if (
    SERVICE_ACCOUNT == ""
    or SERVICE_ACCOUNT is None
    or SERVICE_ACCOUNT == "[your-service-account]"
):
    # Get your service account from gcloud
    if not IS_COLAB:
        shell_output = !gcloud auth list 2>/dev/null
        SERVICE_ACCOUNT = shell_output[2].replace("*", "").strip()

    else:  # IS_COLAB:
        shell_output = ! gcloud projects describe  $PROJECT_ID
        project_number = shell_output[-1].split(":")[1].strip().replace("'", "")
        SERVICE_ACCOUNT = f"{project_number}-compute@developer.gserviceaccount.com"

    print("Service Account:", SERVICE_ACCOUNT)

#### Set service account access for Vertex AI Pipelines
Run the following commands to grant your service account access to read and write pipeline artifacts in the bucket that you created in the previous step. You only need to run this step once per service account.

In [None]:
! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectCreator $BUCKET_URI

! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectViewer $BUCKET_URI

## Import libraries and define constants

In [None]:
# Import required modules
from typing import Any, Dict, List

from google.cloud import aiplatform, storage
from google_cloud_pipeline_components.experimental.automl.tabular import \
    utils as automl_tabular_utils

## Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION)

### Define helper functions
Define the following helper functions:

- `get_model_artifacts_path`: Get the model artifacts path from task details.
- `get_model_uri`: Get the model URI from the task details.
- `get_bucket_name_and_path`: Get the bucket name and path.
- `download_from_gcs`: Download the content from the bucket.
- `write_to_gcs`: Upload content into the bucket.
- `get_task_detail`: Get the task details by using task name.
- `get_model_name`: Get the model name from pipeline job ID.
- `get_evaluation_metrics`: Get the evaluation metrics from pipeline task details.
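
Several of these helpers rely on splitting a `gs://` URI into a bucket name and an object path. As a standalone illustration of that step, here is a minimal sketch; the name `split_gcs_uri` and the example URI are hypothetical and separate from the notebook's own helper.

```python
def split_gcs_uri(uri: str) -> tuple:
    """Split a gs:// URI into (bucket_name, object_path)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Expected a URI starting with {prefix!r}, got {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the object path.
    bucket_name, _, object_path = uri[len(prefix):].partition("/")
    return bucket_name, object_path


bucket_name, object_path = split_gcs_uri("gs://my-bucket/tabnet/metrics.json")
print(bucket_name, object_path)
```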


In [None]:
# Get the model artifacts path from task details.
def get_model_artifacts_path(task_details: List[Dict[str, Any]], task_name: str) -> str:
    task = get_task_detail(task_details, task_name)
    return task.outputs["unmanaged_container_model"].artifacts[0].uri


# Get the model uri from the task details.
def get_model_uri(task_details: List[Dict[str, Any]]) -> str:
    task = get_task_detail(task_details, "model-upload")
    # in format https://<location>-aiplatform.googleapis.com/v1/projects/<project_number>/locations/<location>/models/<model_id>
    model_id = task.outputs["model"].artifacts[0].uri.split("/")[-1]
    return f"https://console.cloud.google.com/vertex-ai/locations/{REGION}/models/{model_id}?project={PROJECT_ID}"


# Get the bucket name and path.
def get_bucket_name_and_path(uri: str) -> tuple:
    no_prefix_uri = uri[len("gs://") :]
    splits = no_prefix_uri.split("/")
    return splits[0], "/".join(splits[1:])


# Get the content from the bucket.
def download_from_gcs(uri: str) -> bytes:
    bucket_name, path = get_bucket_name_and_path(uri)
    storage_client = storage.Client(project=PROJECT_ID)
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(path)
    return blob.download_as_string()


# Upload content into the bucket.
def write_to_gcs(uri: str, content: str):
    bucket_name, path = get_bucket_name_and_path(uri)
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(path)
    blob.upload_from_string(content)


# Get the task details by using task name.
def get_task_detail(
    task_details: List[Dict[str, Any]], task_name: str
) -> Any:
    for task_detail in task_details:
        if task_detail.task_name == task_name:
            return task_detail


# Get the model name from pipeline job ID.
def get_model_name(job_id: str) -> str:
    pipeline_task_details = aiplatform.PipelineJob.get(
        job_id
    ).gca_resource.job_detail.task_details
    upload_task_details = get_task_detail(pipeline_task_details, "model-upload")
    return upload_task_details.outputs["model"].artifacts[0].metadata["resourceName"]


# Get the evaluation metrics.
def get_evaluation_metrics(
    task_details: List[Dict[str, Any]],
) -> bytes:
    evaluation_task = get_task_detail(task_details, "model-evaluation")
    return download_from_gcs(
        evaluation_task.outputs["evaluation_metrics"].artifacts[0].uri
    )

## Define the training specification

### Configure the dataset

You define either of the following parameters:

- `data_source_csv_filenames`: The CSV data source.
- `data_source_bigquery_table_path`: The BigQuery data source.

**Note**: The dataset's location must be the same as the service location (i.e., `REGION`) set for launching the training pipeline.
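
Before building the pipeline, it can help to check that the data source settings are consistent. The sketch below assumes that exactly one of the two parameters should be set, that CSV sources are `gs://` URIs, and that BigQuery sources use the `bq://` form. The helper name `pick_data_source` and the BigQuery path are hypothetical.

```python
from typing import Optional


def pick_data_source(
    csv_filenames: Optional[str], bigquery_table_path: Optional[str]
) -> str:
    """Return "csv" or "bigquery", requiring exactly one source to be set."""
    has_csv = bool(csv_filenames)
    has_bq = bool(bigquery_table_path)
    if has_csv == has_bq:
        raise ValueError("Set exactly one of the CSV or BigQuery data sources.")
    if has_csv and not csv_filenames.startswith("gs://"):
        raise ValueError("CSV data sources are expected to be gs:// URIs.")
    if has_bq and not bigquery_table_path.startswith("bq://"):
        raise ValueError("BigQuery data sources are expected to be bq:// paths.")
    return "csv" if has_csv else "bigquery"


source_kind = pick_data_source(
    "gs://cloud-samples-data-us-central1/vertex-ai/tabular-workflows/datasets/bank-marketing/train.csv",
    None,
)
print("Using data source type:", source_kind)
```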


In [None]:
data_source_csv_filenames = "gs://cloud-samples-data-us-central1/vertex-ai/tabular-workflows/datasets/bank-marketing/train.csv"
data_source_bigquery_table_path = (
    None  # @param {type:"string"}, format: bq://bq_project.bq_dataset.bq_table
)

### Configure feature transformation

Transformations can be specified using Feature Transform Engine (FTE) specific configurations. FTE supports both TensorFlow-based row-level and BigQuery-based dataset-level transformations.

* TensorFlow-based row-level transformations:
  * Full automatic transformations: FTE automatically configures a set of built-in transformations for each input column based on its data statistics. This can be set via `tf_auto_transform_features` in the training pipeline.
  * Fully specified transformations: All transformations on input columns are explicitly specified with FTE's built-in transformations. Chaining of multiple transformations on a single column is also supported. These transformations can be saved to a JSON configuration file and specified via the `tf_transformations_path` argument of the training pipeline.
  * Custom transformations: Bring-your-own transform functions, where you define and import your own transform function and use it alongside FTE's built-in transformations. You can specify custom transformations as an array of JSON objects and pass them through the `tf_custom_transformation_definitions` argument of the training pipeline.

* BigQuery-based dataset-level transformations:
  * Fully specified transformations: All transformations on input columns are explicitly specified with FTE's built-in transformations. These transformations can be specified as an array of JSON objects via the `dataset_level_transformations` argument of the training pipeline.
  * Custom transformations: Bring-your-own transform functions, where you define and import your own transform function and use it alongside FTE's built-in transformations. You can specify custom transformations as an array of JSON objects and pass them through the `dataset_level_custom_transformation_definitions` argument of the training pipeline.

Below, you configure full automatic transformations by specifying a list of input features to pass to the `tf_auto_transform_features` argument of the training pipeline.

For a complete list of supported feature transformation configurations and examples, please go [here](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.31/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.FeatureTransformEngineOp).
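
For full automatic transformations, the feature list is typically every input column except the target. The sketch below derives that list from a CSV header line. The helper name `feature_columns` is hypothetical, and the header string is an assumption: it lists the feature names used in this notebook followed by the `deposit` target, which may not match the column order in the hosted file.

```python
import csv
import io


def feature_columns(header_line: str, target: str) -> list:
    """Return all column names from a CSV header except the target column."""
    columns = next(csv.reader(io.StringIO(header_line)))
    if target not in columns:
        raise ValueError(f"Target column {target!r} not found in header")
    return [name for name in columns if name != target]


# Assumed header for illustration only.
header = (
    "age,job,marital,education,default,balance,housing,loan,"
    "contact,day,month,duration,campaign,pdays,previous,poutcome,deposit"
)
features = feature_columns(header, target="deposit")
print(len(features), "features:", features)
```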

In [None]:
auto_transform_features = [
    "age",
    "job",
    "marital",
    "education",
    "default",
    "balance",
    "housing",
    "loan",
    "contact",
    "day",
    "month",
    "duration",
    "campaign",
    "pdays",
    "previous",
    "poutcome",
]

### Configure feature selection

In addition to transformations, you can also apply feature selection via Feature Transform Engine to use only highly ranked features, evaluated by supported algorithms. If enabled, it will be applied right after dataset level transformations, and exclude any feature that's not selected.

To enable it, you need to set `run_feature_selection` to True.

To configure the algorithm to use and the number of features to select, set both the `feature_selection_algorithm` and `max_selected_features` parameters.

For a complete list of supported feature selection algorithms and configurations, please go [here](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.31/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.FeatureTransformEngineOp).
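
To give an intuition for ranking-based feature selection, the sketch below scores categorical features by plain mutual information with the target and keeps the top `k`. This is a simplified illustration only: it assumes `AMI` refers to adjusted mutual information, it does not apply the adjustment, and it is not the Feature Transform Engine's implementation. The function names and the toy data are made up for this example.

```python
import math
from collections import Counter


def mutual_information(xs: list, ys: list) -> float:
    """Mutual information (in nats) between two equally long categorical sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts.
    return sum(
        (c / n) * math.log((c * n) / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def top_k_features(columns: dict, target: list, k: int) -> list:
    """Return the names of the k features with the highest mutual information."""
    scores = {name: mutual_information(values, target) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: "informative" mirrors the target, "noise" is independent of it.
target = ["yes", "no", "yes", "no", "yes", "no", "yes", "no"]
columns = {
    "noise": ["x", "x", "y", "y", "x", "x", "y", "y"],
    "partial": ["a", "b", "a", "b", "a", "a", "b", "b"],
    "informative": ["a", "b", "a", "b", "a", "b", "a", "b"],
}
selected = top_k_features(columns, target, k=2)
print(selected)
```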

In [None]:
RUN_FEATURE_SELECTION = True  # @param {type:"boolean"}

FEATURE_SELECTION_ALGORITHM = "AMI"  # @param {type:"string"}

MAX_SELECTED_FEATURES = 10  # @param {type:"integer"}

### Setup training configuration

You define the following:

- `target_column`: The target column name.
- `prediction_type`: The type of prediction the model is to produce.
  'classification' or 'regression'.
- `predefined_split_key`: The predefined_split column name.
- `timestamp_split_key`: The timestamp_split column name.
- `stratified_split_key`: The stratified_split column name.
- `training_fraction`: The training fraction.
- `validation_fraction`: The validation fraction.
- `test_fraction`: The test fraction.
- `weight_column`: The weight column name.
- `run_evaluation`: Whether to run evaluation steps during training.
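
When using a fraction split, a quick sanity check on the three fractions can catch configuration mistakes early. The sketch below assumes the fractions are expected to sum to 1 and are either all set or all `None` (as when a predefined split column is used). The helper name `check_split_fractions` is hypothetical.

```python
import math


def check_split_fractions(training, validation, test) -> None:
    """Raise ValueError unless the fractions are all None or valid and sum to 1."""
    fractions = (training, validation, test)
    if all(f is None for f in fractions):
        return  # no fraction split configured
    if any(f is None for f in fractions):
        raise ValueError("Set all three fractions or none of them.")
    if any(not 0.0 <= f <= 1.0 for f in fractions):
        raise ValueError("Each fraction must be between 0 and 1.")
    if not math.isclose(sum(fractions), 1.0):
        raise ValueError(f"Fractions sum to {sum(fractions)}, expected 1.0.")


check_split_fractions(0.8, 0.1, 0.1)  # passes silently
check_split_fractions(None, None, None)  # passes silently
print("Split fractions look consistent.")
```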

In [None]:
run_evaluation = True  # @param {type:"boolean"}
prediction_type = "classification"
target_column = "deposit"

# Fraction split
training_fraction = 0.8
validation_fraction = 0.1
test_fraction = 0.1

timestamp_split_key = None  # timestamp column name when using timestamp split
stratified_split_key = None  # target column name when using stratified split

predefined_split_key = None
if predefined_split_key:
    training_fraction = None
    validation_fraction = None
    test_fraction = None

weight_column = None

## VPC related config

You define the following:

- `dataflow_subnetwork`: Dataflow's fully qualified subnetwork name, when empty the default subnetwork will be used. Example:
https://cloud.google.com/dataflow/docs/guides/specifying-networks#example_network_and_subnetwork_specifications
- `dataflow_use_public_ips`: Specifies whether Dataflow workers use public IP
  addresses.

If you need to use a custom Dataflow subnetwork, you can set it through the `dataflow_subnetwork` parameter. The requirements are:
1. `dataflow_subnetwork` must be a fully qualified subnetwork name.
   [[reference](https://cloud.google.com/dataflow/docs/guides/specifying-networks#example_network_and_subnetwork_specifications)]
1. The following service accounts must have [Compute Network User role](https://cloud.google.com/compute/docs/access/iam#compute.networkUser) assigned on the specified dataflow subnetwork [[reference](https://cloud.google.com/dataflow/docs/guides/specifying-networks#shared)]:
    1. Compute Engine default service account: PROJECT_NUMBER-compute@developer.gserviceaccount.com
    1. Dataflow service account: service-PROJECT_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com

If your project has VPC-SC enabled, please make sure:

1. The dataflow subnetwork used in VPC-SC is configured properly for Dataflow.
   [[reference](https://cloud.google.com/dataflow/docs/guides/routes-firewall)]
1. `dataflow_use_public_ips` is set to False.
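
A fully qualified subnetwork name follows the URL form described in the Dataflow guide linked above. The sketch below assembles one from its parts. The helper name `dataflow_subnetwork_url` and the project, region, and subnetwork values are placeholders.

```python
def dataflow_subnetwork_url(host_project: str, region: str, subnetwork: str) -> str:
    """Build a fully qualified subnetwork URL from its components."""
    for label, value in (
        ("host_project", host_project),
        ("region", region),
        ("subnetwork", subnetwork),
    ):
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty name without slashes")
    return (
        "https://www.googleapis.com/compute/v1/"
        f"projects/{host_project}/regions/{region}/subnetworks/{subnetwork}"
    )


# Placeholder values; replace them with your own network settings.
example_subnetwork = dataflow_subnetwork_url("my-host-project", "us-central1", "my-subnet")
print(example_subnetwork)
```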


In [None]:
dataflow_subnetwork = ""  # @param {type:"string"}
dataflow_use_public_ips = True  # @param {type:"boolean"}

## Customize TabNet CustomJob configuration and create pipeline

This is the best choice if you know exactly which hyperparameter values to use for model training. It uses fewer training resources than a HyperparameterTuningJob.

In the example below, you configure the following:

- `root_dir`: The root GCS directory for the pipeline components.
- `worker_pool_specs_override`: The dictionary for overriding training and evaluation worker pool specs. The dictionary should be of [this format]( https://github.com/googleapis/googleapis/blob/4e836c7c257e3e20b1de14d470993a2b1f4736a8/google/cloud/aiplatform/v1beta1/custom_job.proto#L172). TabNet supports both CPU and GPU training.
- `learning_rate`: The learning rate used by the linear optimizer.
- `max_steps`: Number of steps to run the trainer for.
- `max_train_secs`: Amount of time in seconds to run the trainer for.

A complete list of pipeline inputs and model hyperparameters is available [here](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.utils.get_tabnet_trainer_pipeline_and_parameters).
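
The `worker_pool_specs_override` value is a list of dictionaries, one per worker pool. The sketch below builds a single-entry override for either CPU or GPU training, using the same keys as the examples in this notebook. The helper name `make_worker_pool_specs_override` is hypothetical, and whether a given machine and accelerator combination is available in your region is not checked.

```python
from typing import Optional


def make_worker_pool_specs_override(
    machine_type: str,
    accelerator_type: Optional[str] = None,
    accelerator_count: int = 0,
) -> list:
    """Build a one-entry worker pool override for the chief node."""
    machine_spec = {"machine_type": machine_type}
    if accelerator_type:
        if accelerator_count < 1:
            raise ValueError("accelerator_count must be >= 1 when using accelerators")
        machine_spec["accelerator_type"] = accelerator_type
        machine_spec["accelerator_count"] = accelerator_count
    return [{"machine_spec": machine_spec}]


cpu_override = make_worker_pool_specs_override("c2-standard-16")
gpu_override = make_worker_pool_specs_override(
    "n1-highmem-32", accelerator_type="NVIDIA_TESLA_V100", accelerator_count=2
)
print(cpu_override)
print(gpu_override)
```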

In [None]:
pipeline_job_root_dir = os.path.join(BUCKET_URI, "tabnet_custom_job")

# max_steps and/or max_train_secs must be set. If both are
# specified, training stops after either condition is met.
# By default, max_train_secs is set to -1.

max_steps = 1000
max_train_secs = -1

learning_rate = 0.01

worker_pool_specs_override = [
    {"machine_spec": {"machine_type": "c2-standard-16"}}  # Override for TF chief node
]

# To test GPU training, the worker_pool_specs_override can be specified like this.
# worker_pool_specs_override =  [
#     {"machine_spec": {
#       'machine_type': "n1-highmem-32",
#       "accelerator_type": "NVIDIA_TESLA_V100",
#       "accelerator_count": 2
#       }
#     }
#   ]

# If your system does not use Python, you can save the JSON file (`template_path`),
# and use another programming language to submit the pipeline.
(
    template_path,
    parameter_values,
) = automl_tabular_utils.get_tabnet_trainer_pipeline_and_parameters(
    project=PROJECT_ID,
    location=REGION,
    root_dir=pipeline_job_root_dir,
    max_steps=max_steps,
    max_train_secs=max_train_secs,
    learning_rate=learning_rate,
    target_column=target_column,
    prediction_type=prediction_type,
    tf_auto_transform_features=auto_transform_features,
    run_feature_selection=RUN_FEATURE_SELECTION,
    feature_selection_algorithm=FEATURE_SELECTION_ALGORITHM,
    max_selected_features=MAX_SELECTED_FEATURES,
    training_fraction=training_fraction,
    validation_fraction=validation_fraction,
    test_fraction=test_fraction,
    data_source_csv_filenames=data_source_csv_filenames,
    data_source_bigquery_table_path=data_source_bigquery_table_path,
    worker_pool_specs_override=worker_pool_specs_override,
    dataflow_use_public_ips=dataflow_use_public_ips,
    dataflow_subnetwork=dataflow_subnetwork,
    run_evaluation=run_evaluation,
)

pipeline_job_id = f"tabnet-{uuid.uuid4()}"
# More info on parameters PipelineJob accepts:
# https://cloud.google.com/vertex-ai/docs/pipelines/run-pipeline#create_a_pipeline_run
pipeline_job = aiplatform.PipelineJob(
    display_name=pipeline_job_id,
    template_path=template_path,
    job_id=pipeline_job_id,
    pipeline_root=pipeline_job_root_dir,
    parameter_values=parameter_values,
    enable_caching=False,
)

pipeline_job.run(service_account=SERVICE_ACCOUNT)

### Go to the Vertex Model UI
From the link below, you can deploy the model and test online prediction or run batch prediction.

In [None]:
tabnet_trainer_pipeline_task_details = aiplatform.PipelineJob.get(
    pipeline_job_id
).gca_resource.job_detail.task_details
CUSTOM_JOB_MODEL = get_model_name(pipeline_job_id)
print("model uri:", get_model_uri(tabnet_trainer_pipeline_task_details))
print(
    "model artifacts:",
    get_model_artifacts_path(tabnet_trainer_pipeline_task_details, "tabnet-trainer"),
)

## Customize TabNet HyperparameterTuningJob configuration and create pipeline

To get the best set of hyperparameters for your dataset, it is recommended to run a HyperparameterTuningJob.

Hyperparameters that can be tuned are set in the optional `study_spec_parameters_override` parameter. A helper function called `get_tabnet_study_spec_parameters_override` is provided to get these hyperparameters. It takes `dataset_size_bucket` (one of 'small' (< 1M rows), 'medium' (1M - 100M rows), or 'large' (> 100M rows)), `training_budget_bucket` (one of 'small' (< \\$600), 'medium' (\\$600 - \\$2400), or 'large' (> \\$2400)), and `prediction_type`, and returns a list of hyperparameters and ranges. `study_spec_parameters_override` can be empty, or it can specify one or more of these hyperparameters. For hyperparameters not specified in `study_spec_parameters_override`, ranges are set in the pipeline. For a full list of hyperparameters available for tuning, see [here](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.utils.get_tabnet_trainer_pipeline_and_parameters).

In addition to hyperparameters, HyperparameterTuningJob takes the following values in the example below:

- `root_dir`: The root GCS directory for the pipeline components.
- `worker_pool_specs_override`: The dictionary for overriding training and evaluation worker pool specs. The dictionary should be of [this format]( https://github.com/googleapis/googleapis/blob/4e836c7c257e3e20b1de14d470993a2b1f4736a8/google/cloud/aiplatform/v1beta1/custom_job.proto#L172). TabNet supports both CPU and GPU training.
- `study_spec_metric_id`: Metric to optimize, possible values: ['loss', 'average_loss', 'rmse', 'mae', 'mql', 'accuracy', 'auc', 'precision', 'recall'].
- `study_spec_metric_goal`: Optimization goal of the metric, possible values: "MAXIMIZE", "MINIMIZE".
- `max_trial_count`: The desired total number of trials.
- `parallel_trial_count`: The desired number of trials to run in parallel.
- `max_failed_trial_count`: The number of failed trials that need to be seen before failing the HyperparameterTuningJob. If set to 0, Vertex AI decides how many trials must fail before the whole job fails.
- `study_spec_algorithm`: The search algorithm specified for the study. One of
'ALGORITHM_UNSPECIFIED', 'GRID_SEARCH', or 'RANDOM_SEARCH'.

For a full list of HyperparameterTuningJob parameters, see [here](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.utils.get_tabnet_hyperparameter_tuning_job_pipeline_and_parameters).

Multiple trials can be configured. The pipeline returns the best trial based on the metric configured in `study_spec_metric_id` and the goal configured in `study_spec_metric_goal`. In this notebook's tuning run, the pipeline returns the trial with the lowest loss value.
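
To make the selection rule concrete, here is a standalone sketch of picking the best trial from a set of finished trials. The trial records, their metric values, and the `select_best_trial` function are all hypothetical; the pipeline performs this step internally, and its actual implementation may differ.

```python
def select_best_trial(trials, metric_id, goal):
    """Pick the trial with the best value of metric_id under the given goal."""
    if goal not in ("MAXIMIZE", "MINIMIZE"):
        raise ValueError(f"unsupported goal: {goal!r}")
    # Ignore trials that did not report the metric (for example, failed trials).
    scored = [t for t in trials if metric_id in t["metrics"]]
    if not scored:
        raise ValueError(f"no trial reported metric {metric_id!r}")
    pick = min if goal == "MINIMIZE" else max
    return pick(scored, key=lambda t: t["metrics"][metric_id])


# Made-up trial results for illustration.
trials = [
    {"trial_id": "1", "metrics": {"loss": 0.42, "auc": 0.81}},
    {"trial_id": "2", "metrics": {"loss": 0.35, "auc": 0.84}},
    {"trial_id": "3", "metrics": {"loss": 0.39, "auc": 0.86}},
]

best = select_best_trial(trials, metric_id="loss", goal="MINIMIZE")
print(best["trial_id"])  # prints 2, the trial with the lowest loss
```

With `metric_id="auc"` and `goal="MAXIMIZE"`, the same data would select trial 3 instead, which is why the metric and goal need to be chosen together.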

In [None]:
pipeline_job_root_dir = os.path.join(BUCKET_URI, "tabnet_hyperparameter_tuning_job")

worker_pool_specs_override = [
    {"machine_spec": {"machine_type": "c2-standard-16"}}  # Override for TF chief node
]

# To test GPU training, the worker_pool_specs_override can be specified like this.
# worker_pool_specs_override =  [
#    {
#       "machine_spec":{
#          "machine_type":"n1-highmem-32",
#          "accelerator_type":"NVIDIA_TESLA_V100",
#          "accelerator_count":2
#       }
#    }
# ]

study_spec_metric_id = "loss"
study_spec_metric_goal = "MINIMIZE"

# max_steps and/or max_train_secs must be set. If both are
# specified, training stops after either condition is met.
# By default, max_train_secs is set to -1 and max_steps is set to
# an appropriate range given dataset_size and training budget.
study_spec_parameters_override = (
    automl_tabular_utils.get_tabnet_study_spec_parameters_override(
        dataset_size_bucket="small",
        prediction_type=prediction_type,
        training_budget_bucket="small",
    )
)

# If your system does not use Python, you can save the JSON file (`template_path`),
# and use another programming language to submit the pipeline.
(
    template_path,
    parameter_values,
) = automl_tabular_utils.get_tabnet_hyperparameter_tuning_job_pipeline_and_parameters(
    project=PROJECT_ID,
    location=REGION,
    root_dir=pipeline_job_root_dir,
    target_column=target_column,
    prediction_type=prediction_type,
    tf_auto_transform_features=auto_transform_features,
    run_feature_selection=RUN_FEATURE_SELECTION,
    feature_selection_algorithm=FEATURE_SELECTION_ALGORITHM,
    max_selected_features=MAX_SELECTED_FEATURES,
    training_fraction=training_fraction,
    validation_fraction=validation_fraction,
    test_fraction=test_fraction,
    data_source_csv_filenames=data_source_csv_filenames,
    data_source_bigquery_table_path=data_source_bigquery_table_path,
    study_spec_metric_id=study_spec_metric_id,
    study_spec_metric_goal=study_spec_metric_goal,
    study_spec_parameters_override=study_spec_parameters_override,
    max_trial_count=1,
    parallel_trial_count=1,
    max_failed_trial_count=0,
    worker_pool_specs_override=worker_pool_specs_override,
    dataflow_use_public_ips=dataflow_use_public_ips,
    dataflow_subnetwork=dataflow_subnetwork,
    run_evaluation=True,
)

pipeline_job_id = f"tabnet-hpt-{uuid.uuid4()}"
# More info on parameters PipelineJob accepts:
# https://cloud.google.com/vertex-ai/docs/pipelines/run-pipeline#create_a_pipeline_run
pipeline_job = aiplatform.PipelineJob(
    display_name=pipeline_job_id,
    template_path=template_path,
    job_id=pipeline_job_id,
    pipeline_root=pipeline_job_root_dir,
    parameter_values=parameter_values,
    enable_caching=False,
)

pipeline_job.run(service_account=SERVICE_ACCOUNT)

### Go to the Vertex Model UI
From the link below, you can deploy the model and test online prediction or run batch prediction.

In [None]:
tabnet_hpt_pipeline_task_details = aiplatform.PipelineJob.get(
    pipeline_job_id
).gca_resource.job_detail.task_details
HPT_JOB_MODEL = get_model_name(pipeline_job_id)

print("model uri:", get_model_uri(tabnet_hpt_pipeline_task_details))
print(
    "model artifacts:",
    get_model_artifacts_path(
        tabnet_hpt_pipeline_task_details, "get-best-hyperparameter-tuning-job-trial"
    ),
)

## Clean up Vertex and BigQuery resources

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

- Cloud Storage Bucket
- Model from CustomJob pipeline
- Model from HyperparameterTuningJob pipeline

In [None]:
# Delete model resources
custom_job_model = aiplatform.Model(CUSTOM_JOB_MODEL)
hpt_job_model = aiplatform.Model(HPT_JOB_MODEL)
custom_job_model.delete()
hpt_job_model.delete()

# Delete bucket
delete_bucket = False
if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil -m rm -r $BUCKET_URI