In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

In [1]:
!pwd

/home/jupyter/vertex-forecas-repo


# Forecasting on Vertex pipelines for private preview

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/automl_tabular_on_vertex_pipelines.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/automl_tabular_on_vertex_pipelines.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/automl/automl_tabular_on_vertex_pipelines.ipynb">
        <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>
<br/><br/><br/>

## Overview

In this tutorial, you use several Vertex AI Tabular Workflows pipelines to train AutoML models with different configurations. You will see:
- how `get_learn_to_learn_forecasting_pipeline_and_parameters` lets you customize the default AutoML Tabular pipeline
- how `get_learn_to_learn_forecasting_pipeline_and_parameters` lets you reduce the training time and cost of an AutoML model by reusing the tuning results from a previous pipeline run
- how `get_time_series_dense_encoder_forecasting_pipeline_and_parameters` lets you train a FastNN model
- how to enable probabilistic inference for forecasting training pipelines
- how to run batch prediction with a forecasting model trained with Tabular Workflows

Learn more about [Tabular Workflow for E2E AutoML](https://cloud.google.com/vertex-ai/docs/tabular-data/tabular-workflows/e2e-automl).

### Objective

In this tutorial, you learn how to create AutoML Forecasting models using [Vertex AI Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines/introduction) downloaded from [Google Cloud Pipeline Components](https://cloud.google.com/vertex-ai/docs/pipelines/components-introduction) (GCPC). These pipelines will be Vertex AI Tabular Workflow pipelines which are maintained by Google. These pipelines will showcase different ways to customize the Vertex Tabular training process.

This tutorial uses the following Google Cloud ML services:

- `AutoML Training`
- `Vertex AI Pipelines`

The steps performed are:

- Create a training pipeline with the learn-to-learn (L2L) algorithm, using a specified machine type for training.
- Create a training pipeline that reuses the architecture search results from the previous pipeline to save time.
- Create a training pipeline with the TiDE (Time series Dense Encoder) algorithm.
- Create a training pipeline with probabilistic inference enabled.
- Perform batch prediction using the model trained in the steps above.

### Dataset

The dataset you will be using is [Liquor](https://www.kaggle.com/datasets/residentmario/iowa-liquor-sales).

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI
* Cloud Storage
* BigQuery

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and [BigQuery](https://cloud.google.com/bigquery), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Set up your local development environment

**If you are using Colab or Vertex AI Workbench Notebooks**, your environment already meets all the requirements to run this notebook. You can skip this step.

**Otherwise**, make sure your environment meets this notebook's requirements. You need the following:

- The Google Cloud SDK
- Python 3
- virtualenv
- Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [Setting up a Python development environment](https://cloud.google.com/python/setup) and the [Jupyter installation guide](https://jupyter.org/install) provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:

1. [Install and initialize the SDK](https://cloud.google.com/sdk/docs/).

2. [Install Python 3](https://cloud.google.com/python/setup#installing_python).

3. [Install virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv) and create a virtual environment that uses Python 3.  Activate the virtual environment.

4. To install Jupyter, run `pip3 install jupyter` on the command-line in a terminal shell.

5. To launch Jupyter, run `jupyter notebook` on the command-line in a terminal shell.

6. Open this notebook in the Jupyter Notebook Dashboard.

## Install additional packages

Install the latest version of the Google Cloud Pipeline Components (GCPC) SDK.

In [1]:
# !rm -rf ./google_cloud_pipeline_components*.whl
# !gsutil cp gs://automl-tables-build-oss-dependencies/gcpc/2023_03_27_05_17_32/google_cloud_pipeline_components-2.0.0b1.dev0-py2.py3-none-any.whl .
# !pip install --upgrade --force-reinstall --user ./google_cloud_pipeline_components*.whl -q

Copying gs://automl-tables-build-oss-dependencies/gcpc/2023_03_27_05_17_32/google_cloud_pipeline_components-2.0.0b1.dev0-py2.py3-none-any.whl...
/ [1 files][  1.2 MiB/  1.2 MiB]                                                
Operation completed over 1 objects/1.2 MiB.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-api-python-client 1.8.0 requires google-api-core<2dev,>=1.13.0, but you have google-api-core 2.11.0 which is incompatible.
ray 2.4.0 requires grpcio<=1.51.3,>=1.42.0; python_version >= "3.10" and sys_platform != "darwin", but you have grpcio 1.54.2 which is incompatible.
ydata-profiling 4.1.2 requires requests<2.29,>=2.24.0, but you have requests 2.31.0 which is incompatible.

### Restart the kernel
Once you've installed the additional packages, you need to restart the notebook kernel so it can find the packages.


**Note: Once this cell has finished running, continue on. You do not need to re-run any of the cells above.**


In [2]:
import os

# if not os.getenv("IS_TESTING"):
#     # Automatically restart kernel after installs
#     import IPython

#     app = IPython.Application.instance()
#     app.kernel.do_shutdown(True)

## Before you begin

### GPU runtime

This tutorial does not require a GPU runtime.

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

2. [Make sure that billing is enabled for your project.](https://cloud.google.com/billing/docs/how-to/modify-project)

3. [Enable the following APIs: Vertex AI APIs, Dataflow APIs, Compute Engine APIs, and Cloud Storage.](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,dataflow.googleapis.com,compute_component,storage-component.googleapis.com)

4. If you are running this notebook locally, you will need to install the [Cloud SDK](https://cloud.google.com/sdk).

5. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$`.

## Notes about service account and permission

For full details of the permission setup, please refer to https://cloud.google.com/vertex-ai/docs/tabular-data/tabular-workflows/service-accounts

**By default, no configuration is required.** If you run into any permission-related issue, make sure the service accounts below have the required roles:

|Service account email|Description|Roles|
|---|---|---|
|PROJECT_NUMBER-compute@developer.gserviceaccount.com|Compute Engine default service account|Dataflow Developer, Dataflow Worker, Storage Admin, BigQuery Data Editor, Vertex AI User, Service Account User|
|service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com|AI Platform Service Agent|Vertex AI Service Agent|


1. Go to https://console.cloud.google.com/iam-admin/iam.
2. Check the "Include Google-provided role grants" checkbox.
3. Find the above emails.
4. Grant the corresponding roles.
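
Both emails in the table are derived from the project number, so they can be assembled before searching for them in the IAM page. The sketch below is only an illustration: the function name and the project number `123456789012` are made up, and the email patterns are copied from the table above.

```python
def tabular_workflow_service_accounts(project_number: str) -> dict:
    """Builds the two service account emails listed in the table above."""
    return {
        "compute_default": f"{project_number}-compute@developer.gserviceaccount.com",
        "aiplatform_service_agent": (
            f"service-{project_number}@gcp-sa-aiplatform.iam.gserviceaccount.com"
        ),
    }


# "123456789012" is a made-up project number used only for illustration.
for label, email in tabular_workflow_service_accounts("123456789012").items():
    print(f"{label}: {email}")
```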

### Using data source from a different project
- For the BQ data source, grant both service accounts the "BigQuery Data Viewer" role.
- For the CSV data source, grant both service accounts the "Storage Object Viewer" role.


### Set your project ID

Set your project ID below. If you don't know your project ID, leave the field blank and the following cells may be able to find it. Optionally, you may also set a service account in the cell below.

In [1]:
PROJECT_ID = "hybrid-vertex"  # @param {type:"string"}

In [2]:
if PROJECT_ID == "" or PROJECT_ID is None or PROJECT_ID == "[your-project-id]":
    # Get your GCP project id from gcloud
    shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID:", PROJECT_ID)

In [3]:
! gcloud config set project $PROJECT_ID

Updated property [core/project].


### Region
You may change the `REGION` variable, which is used for Vertex Forecasting operations
throughout the rest of this notebook.  Below are regions supported for Vertex AI. We recommend that you choose the region closest to you.

- Americas: `us-central1`
- Europe: `europe-west4`
- Asia Pacific: `asia-east1`

Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations)

In [4]:
REGION = "us-central1"  # @param {type: "string"}

if REGION == "[your-region]":
    REGION = "us-central1"

### Authenticate your Google Cloud account

**If you are using Google Cloud Notebooks**, your environment is already authenticated. Skip this step.

**If you are using Colab**, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.

**Otherwise**, follow these steps:

- In the Cloud Console, go to the [Create service account key](https://console.cloud.google.com/apis/credentials/serviceaccountkey) page.

- Click **Create service account**.

- In the **Service account name** field, enter a name, and click **Create**.

- In the **Grant this service account access to project** section, click the Role drop-down list. Type "Vertex" into the filter box, and select **Vertex Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

- Click Create. A JSON file that contains your key downloads to your local environment.

- Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.

In [5]:
# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

import os
import sys

# If on Vertex AI Workbench, then don't execute this code
IS_COLAB = "google.colab" in sys.modules
if not os.path.exists("/opt/deeplearning/metadata/env_version") and not os.getenv(
    "DL_ANACONDA_HOME"
):
    if "google.colab" in sys.modules:
        from google.colab import auth as google_auth

        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    elif not os.getenv("IS_TESTING"):
        %env GOOGLE_APPLICATION_CREDENTIALS '[your-service-account-key-path]'

### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

All training-related files (TF model checkpoints, TensorBoard files, etc.) are saved to the GCS bucket. The pipeline does not clean up these files, since some of them might be useful to you, so **please make sure to clean them up yourself**. For easy cleanup, you can set a [GCS bucket-level TTL](https://cloud.google.com/storage/docs/lifecycle).

Set the name of your Cloud Storage bucket below. Bucket names must be globally unique across all Google Cloud projects, including those outside of your organization.
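
As a rough illustration of the bucket-level TTL mentioned above, the following sketch builds a lifecycle configuration that deletes objects after a given number of days. The helper name `build_ttl_lifecycle_config` is hypothetical; the JSON shape follows the rule used in this notebook's own bucket-creation cell, which writes it to a file and applies it with `gsutil lifecycle set`.

```python
import json


def build_ttl_lifecycle_config(age_days: int) -> str:
    """Returns a lifecycle config (as JSON) that deletes objects older than age_days."""
    if age_days <= 0:
        raise ValueError("age_days must be positive")
    config = {
        "rule": [
            {"action": {"type": "Delete"}, "condition": {"age": age_days}},
        ]
    }
    return json.dumps(config)


# 7 days matches the TTL used in this notebook's bucket-creation cell.
print(build_ttl_lifecycle_config(7))
```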


In [6]:
BUCKET_URI = "gs://forecast-v1"  # @param {type:"string"}
GENERATE_BUCKET_URI = False  # @param {type:"boolean"}

Create the bucket if it doesn't already exist.

In [16]:
import uuid

# if GENERATE_BUCKET_URI:
#     bucket_name = "gs://test-{}".format(uuid.uuid4())
#     !gsutil mb -p {PROJECT_ID} -l {REGION} {bucket_name}

#     # set GCS bucket object TTL to 7 days
#     !echo '{"rule":[{"action": {"type": "Delete"},"condition": {"age": 7}}]}' > gcs_lifecycle.tmp
#     !gsutil lifecycle set gcs_lifecycle.tmp {bucket_name}
#     !rm gcs_lifecycle.tmp

#     BUCKET_URI = bucket_name
#     print(f"changed BUCKET_URI to {BUCKET_URI} due to GENERATE_BUCKET_URI is True")

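# if BUCKET_URI == "" or BUCKET_URI is None or BUCKET_URI == "gs://[your-bucket-name]":
#     # uuid4() returns a UUID object, so convert it to str before concatenating
#     BUCKET_URI = "gs://" + PROJECT_ID + "aip-" + str(uuid.uuid4())

# # REGION is the region variable defined earlier in this notebook
# ! gsutil ls -b $BUCKET_URI || gsutil mb -l $REGION $BUCKET_URI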

Finally, validate access to your Cloud Storage bucket by examining its contents:

In [7]:
! gsutil ls -al $BUCKET_URI

                                 gs://forecast-v1/automl_forecasting_pipeline/


#### Service Account

You use a service account to create Vertex AI Pipeline jobs. If you do not want to use your project's Compute Engine service account, set `SERVICE_ACCOUNT` to another service account ID.

In [8]:
SERVICE_ACCOUNT = "[your-service-account]"

In [9]:
if (
    SERVICE_ACCOUNT == ""
    or SERVICE_ACCOUNT is None
    or SERVICE_ACCOUNT == "[your-service-account]"
):
    # Get your service account from gcloud
    if not IS_COLAB:
        shell_output = !gcloud auth list 2>/dev/null
        SERVICE_ACCOUNT = shell_output[2].replace("*", "").strip()

    else:  # IS_COLAB:
        shell_output = ! gcloud projects describe  $PROJECT_ID
        project_number = shell_output[-1].split(":")[1].strip().replace("'", "")
        SERVICE_ACCOUNT = f"{project_number}-compute@developer.gserviceaccount.com"

    print("Service Account:", SERVICE_ACCOUNT)

Service Account: 934903580331-compute@developer.gserviceaccount.com


#### Set service account access for Vertex AI Pipelines
Run the following commands to grant your service account access to read and write pipeline artifacts in the bucket that you created in the previous step. You only need to run this step once per service account.

In [10]:
! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectCreator $BUCKET_URI

! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectViewer $BUCKET_URI

No changes made to gs://forecast-v1/
No changes made to gs://forecast-v1/


## Import libraries and define constants

In [11]:
# Import required modules
import json
from typing import Any, Dict, List

from google.cloud import aiplatform, storage
from google_cloud_pipeline_components.experimental.automl.forecasting import \
    utils as automl_forecasting_utils

## Initialize Vertex SDK for Python

Initialize the Vertex SDK for Python for your project.

In [12]:
aiplatform.init(project=PROJECT_ID, location=REGION)

## VPC related config

If you need to use a custom Dataflow subnetwork, you can set it through the `dataflow_subnetwork` parameter. The requirements are:
1. `dataflow_subnetwork` must be a fully qualified subnetwork name.
   [[reference](https://cloud.google.com/dataflow/docs/guides/specifying-networks#example_network_and_subnetwork_specifications)]
1. The following service accounts must have [Compute Network User role](https://cloud.google.com/compute/docs/access/iam#compute.networkUser) assigned on the specified dataflow subnetwork [[reference](https://cloud.google.com/dataflow/docs/guides/specifying-networks#shared)]:
    1. Compute Engine default service account: PROJECT_NUMBER-compute@developer.gserviceaccount.com
    1. Dataflow service account: service-PROJECT_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com

If your project has VPC-SC enabled, please make sure:

1. The dataflow subnetwork used in VPC-SC is configured properly for Dataflow.
   [[reference](https://cloud.google.com/dataflow/docs/guides/routes-firewall)]
1. `dataflow_use_public_ips` is set to False.
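
To make the "fully qualified" requirement concrete, here is a minimal sketch that assembles the subnetwork URL in the form documented in the configuration cell that follows. The helper name and the example project, region, and subnetwork values are placeholders, not values from this notebook.

```python
def fully_qualified_subnetwork(host_project_id: str, region: str, subnetwork: str) -> str:
    """Builds a subnetwork URL in the fully qualified form Dataflow expects."""
    return (
        "https://www.googleapis.com/compute/v1/"
        f"projects/{host_project_id}/regions/{region}/subnetworks/{subnetwork}"
    )


# Placeholder values; substitute your own host project, region, and subnetwork.
example_subnetwork = fully_qualified_subnetwork(
    "my-host-project", "us-central1", "my-subnetwork"
)
print(example_subnetwork)
```

The resulting string is the kind of value you would assign to `dataflow_subnetwork`.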


In [13]:
# Dataflow's fully qualified subnetwork name, when empty the default subnetwork will be used.
# Fully qualified subnetwork name is in the form of
# https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/REGION_NAME/subnetworks/SUBNETWORK_NAME
# reference: https://cloud.google.com/dataflow/docs/guides/specifying-networks#example_network_and_subnetwork_specifications
dataflow_subnetwork = None  # @param {type:"string"}
# Specifies whether Dataflow workers use public IP addresses.
dataflow_use_public_ips = True  # @param {type:"boolean"}

## Prepare for training

### Define helper functions

In [14]:
def get_bucket_name_and_path(uri):
    no_prefix_uri = uri[len("gs://") :]
    splits = no_prefix_uri.split("/")
    return splits[0], "/".join(splits[1:])


def download_from_gcs(uri):
    bucket_name, path = get_bucket_name_and_path(uri)
    storage_client = storage.Client(project=PROJECT_ID)
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(path)
    return blob.download_as_string()


def write_to_gcs(uri: str, content: str):
    bucket_name, path = get_bucket_name_and_path(uri)
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(path)
    blob.upload_from_string(content)


def generate_auto_transformation(column_names: List[str]) -> List[Dict[str, Any]]:
    transformations = []
    for column_name in column_names:
        transformations.append({"auto": {"column_name": column_name}})
    return transformations


def write_auto_transformations(uri: str, column_names: List[str]):
    transformations = generate_auto_transformation(column_names)
    write_to_gcs(uri, json.dumps(transformations))


def get_task_detail(task_details: List[Any], task_name: str) -> Any:
    """Returns the task detail whose task_name matches, or None if absent."""
    for task_detail in task_details:
        if task_detail.task_name == task_name:
            return task_detail
    return None


def get_deployed_model_uri(
    task_details,
):
    ensemble_task = get_task_detail(task_details, "model-upload")
    return ensemble_task.outputs["model"].artifacts[0].uri


def get_no_custom_ops_model_uri(task_details):
    ensemble_task = get_task_detail(task_details, "automl-tabular-ensemble")
    return download_from_gcs(
        ensemble_task.outputs["model_without_custom_ops"].artifacts[0].uri
    )


def get_feature_attributions(
    task_details,
):
    ensemble_task = get_task_detail(task_details, "model-evaluation-2")
    return download_from_gcs(
        ensemble_task.outputs["evaluation_metrics"]
        .artifacts[0]
        .metadata["explanation_gcs_path"]
    )


def get_evaluation_metrics(
    task_details,
):
    ensemble_task = get_task_detail(task_details, "model-evaluation")
    return download_from_gcs(
        ensemble_task.outputs["evaluation_metrics"].artifacts[0].uri
    )


def load_and_print_json(s):
    parsed = json.loads(s)
    print(json.dumps(parsed, indent=2, sort_keys=True))
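
The GCS helpers above all rely on splitting a `gs://` URI into a bucket name and an object path. The sketch below shows that split in isolation. `split_gcs_uri` is a hypothetical, slightly stricter variant of `get_bucket_name_and_path` that rejects URIs without the `gs://` prefix, and the example paths are made up.

```python
from typing import Tuple


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Splits gs://bucket/path into (bucket, path), rejecting other schemes."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Expected a gs:// URI, got: {uri!r}")
    # partition() splits on the first "/" only, so the object path keeps its slashes.
    bucket, _, path = uri[len(prefix):].partition("/")
    return bucket, path


print(split_gcs_uri("gs://forecast-v1/automl_forecasting_pipeline/config.json"))
print(split_gcs_uri("gs://forecast-v1"))
```

A URI with no object path yields an empty path string, which is also what `get_bucket_name_and_path` returns in that case.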

### Define training specification

#### liquor dataset

In [17]:
root_dir = os.path.join(BUCKET_URI, f"automl_forecasting_pipeline/run-{uuid.uuid4()}")
optimization_objective = "minimize-mae"
time_column = "date"
time_series_identifier_column = "store_name"
target_column = "sale_dollars"
data_source_csv_filenames = None
data_source_bigquery_table_path = (
    "bq://bigquery-public-data.iowa_liquor_sales_forecasting.2020_sales_train"
)

training_fraction = 0.8
validation_fraction = 0.1
test_fraction = 0.1

predefined_split_key = None
if predefined_split_key:
    training_fraction = None
    validation_fraction = None
    test_fraction = None

weight_column = None

features = [
    time_column,
    target_column,
    "city",
    "zip_code",
    "county",
]

available_at_forecast_columns = ",".join([time_column])
unavailable_at_forecast_columns = ",".join([target_column])
time_series_attribute_columns = ",".join(["city", "zip_code", "county"])
forecast_horizon = 30
context_window = 30

transformations = generate_auto_transformation(features)
transform_config_path = os.path.join(root_dir, f"transform_config_{uuid.uuid4()}.json")
write_to_gcs(transform_config_path, json.dumps(transformations))
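
To see what the transform config written above contains, the following sketch rebuilds the same structure locally and prints it instead of uploading it. It repeats `generate_auto_transformation` in a compact form so that the snippet runs on its own; the column names are the ones from the `features` list in this cell.

```python
import json
from typing import Any, Dict, List


def generate_auto_transformation(column_names: List[str]) -> List[Dict[str, Any]]:
    """Creates one "auto" transformation entry per column."""
    return [{"auto": {"column_name": name}} for name in column_names]


# Same columns as the `features` list defined for the liquor dataset.
features = ["date", "sale_dollars", "city", "zip_code", "county"]
transformations = generate_auto_transformation(features)

# This JSON string is what gets written to transform_config_path.
print(json.dumps(transformations, indent=2))
```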

## L2L training & customize search space and change training configuration

We will create an AutoML Forecasting pipeline that skips evaluation, with the following customizations:
- Limit the hyperparameter search space
- Change machine type and tuning / training parallelism
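
The training cell that follows sets `train_budget_milli_node_hours = 250` and annotates it as 15 minutes. The arithmetic behind that comment can be sketched as below; the helper names are hypothetical, and the conversion assumes the budget is consumed by a single node, so wall-clock time may differ when work runs in parallel.

```python
def milli_node_hours_to_minutes(milli_node_hours: float) -> float:
    """Converts a budget in milli node hours to minutes on a single node."""
    return milli_node_hours / 1000 * 60


def minutes_to_milli_node_hours(minutes: float) -> int:
    """Converts single-node minutes to a budget in milli node hours."""
    return round(minutes / 60 * 1000)


print(milli_node_hours_to_minutes(250))  # 15.0
print(minutes_to_milli_node_hours(60))   # 1000
```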

In [None]:
# worker_pool_specs_override = [
#     {"machine_spec": {"machine_type": "n1-standard-8"}},  # override for TF chief node
#     {},  # override for TF worker node, since it's not used, leave it empty
#     {},  # override for TF ps node, since it's not used, leave it empty
#     {
#         "machine_spec": {
#             "machine_type": "n1-standard-4"  # override for TF evaluator node
#         }
#     },
# ]

# # Number of weak models in the final ensemble model.
# num_selected_trials = 5

# train_budget_milli_node_hours = 250  # 15 minutes

# (
#     template_path,
#     parameter_values,
# ) = automl_forecasting_utils.get_learn_to_learn_forecasting_pipeline_and_parameters(
#     project=PROJECT_ID,
#     location=REGION,
#     root_dir=root_dir,
#     target_column=target_column,
#     optimization_objective=optimization_objective,
#     transformations=transform_config_path,
#     train_budget_milli_node_hours=train_budget_milli_node_hours,
#     data_source_csv_filenames=data_source_csv_filenames,
#     data_source_bigquery_table_path=data_source_bigquery_table_path,
#     weight_column=weight_column,
#     predefined_split_key=predefined_split_key,
#     training_fraction=training_fraction,
#     validation_fraction=validation_fraction,
#     test_fraction=test_fraction,
#     num_selected_trials=num_selected_trials,
#     time_column=time_column,
#     time_series_identifier_column=time_series_identifier_column,
#     time_series_attribute_columns=time_series_attribute_columns,
#     available_at_forecast_columns=available_at_forecast_columns,
#     unavailable_at_forecast_columns=unavailable_at_forecast_columns,
#     forecast_horizon=forecast_horizon,
#     context_window=context_window,
#     stage_1_tuner_worker_pool_specs_override=worker_pool_specs_override,
#     feature_transform_engine_dataflow_subnetwork=dataflow_subnetwork,
#     feature_transform_engine_dataflow_use_public_ips=dataflow_use_public_ips,
#     # quantile forecast, L2L without probabilistic inference requires `minimize-quantile-loss`
#     # quantiles=",".join(map(lambda x: str(x), [0.25, 0.5, 0.9])),
# )

# job_id = "l2l-forecasting-{}".format(uuid.uuid4())
# job = aiplatform.PipelineJob(
#     display_name=job_id,
#     location=REGION,  # launches the pipeline job in the specified region
#     template_path=template_path,
#     job_id=job_id,
#     pipeline_root=root_dir,
#     parameter_values=parameter_values,
#     enable_caching=False,
# )

# job.run()


# pipeline_task_details = job.gca_resource.job_detail.task_details


Creating PipelineJob
PipelineJob created. Resource name: projects/934903580331/locations/us-central1/pipelineJobs/l2l-forecasting-5273673f-1626-4b96-9bd8-54abac2500b9
To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/934903580331/locations/us-central1/pipelineJobs/l2l-forecasting-5273673f-1626-4b96-9bd8-54abac2500b9')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/l2l-forecasting-5273673f-1626-4b96-9bd8-54abac2500b9?project=934903580331
PipelineJob projects/934903580331/locations/us-central1/pipelineJobs/l2l-forecasting-5273673f-1626-4b96-9bd8-54abac2500b9 current state:
PipelineState.PIPELINE_STATE_RUNNING
PipelineJob projects/934903580331/locations/us-central1/pipelineJobs/l2l-forecasting-5273673f-1626-4b96-9bd8-54abac2500b9 current state:
PipelineState.PIPELINE_STATE_RUNNING
PipelineJob projects/934903580331/locations/us-central1/pipelineJobs/l2l-forecasting-5273673f-1626-4b96-9bd8-54ab

## Skip architecture search
Instead of running architecture search every time, we can reuse an existing architecture search result. This can help:
1. reduce the variation of the output model
2. reduce training cost

The existing architecture search result is stored in the `tuning_result_output` output of the `automl-forecasting-stage-1-tuner` component. We can manually input it or get it programmatically.
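
The programmatic lookup can be tried without a finished pipeline run by using stand-in objects. In the sketch below, the `SimpleNamespace` objects and the `gs://example-bucket/...` URI are invented for illustration and imitate only the attributes the lookup touches. In a real run, the list would come from `job.gca_resource.job_detail.task_details`.

```python
from types import SimpleNamespace


def get_task_detail(task_details, task_name):
    """Returns the first task whose task_name matches, or None if absent."""
    for task_detail in task_details:
        if task_detail.task_name == task_name:
            return task_detail
    return None


# Stand-in for the task details of a completed pipeline run (illustrative only).
fake_task_details = [
    SimpleNamespace(
        task_name="automl-forecasting-stage-1-tuner",
        outputs={
            "tuning_result_output": SimpleNamespace(
                artifacts=[SimpleNamespace(uri="gs://example-bucket/tuning_result")]
            )
        },
    ),
]

tuner_task = get_task_detail(fake_task_details, "automl-forecasting-stage-1-tuner")
stage_1_tuning_result_artifact_uri = (
    tuner_task.outputs["tuning_result_output"].artifacts[0].uri
)
print(stage_1_tuning_result_artifact_uri)
```

The resulting URI is the value passed as `stage_1_tuning_result_artifact_uri` when building the skip architecture search pipeline.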

In [None]:
# stage_1_tuner_task = get_task_detail(
#     pipeline_task_details, "automl-forecasting-stage-1-tuner"
# )

# stage_1_tuning_result_artifact_uri = (
#     stage_1_tuner_task.outputs["tuning_result_output"].artifacts[0].uri
# )

# upload_model_task = get_task_detail(
#     pipeline_task_details, "model-upload-2"
# )

# forecasting_mp_model_artifact = (
#     upload_model_task.outputs["model"].artifacts[0]
# )

# forecasting_mp_model = aiplatform.Model(forecasting_mp_model_artifact.metadata['resourceName'])

automl-forecasting-model-upload-1706139130854899712--4654900398811774976


### Run the skip architecture search pipeline


In [None]:
# # Number of weak models in the final ensemble model.
# num_selected_trials = 5

# train_budget_milli_node_hours = 250  # 15 minutes

# (
#     template_path,
#     parameter_values,
# ) = automl_forecasting_utils.get_learn_to_learn_forecasting_pipeline_and_parameters(
#     project=PROJECT_ID,
#     location=REGION,
#     root_dir=root_dir,
#     target_column=target_column,
#     optimization_objective=optimization_objective,
#     transformations=transform_config_path,
#     train_budget_milli_node_hours=train_budget_milli_node_hours,
#     data_source_csv_filenames=data_source_csv_filenames,
#     data_source_bigquery_table_path=data_source_bigquery_table_path,
#     weight_column=weight_column,
#     predefined_split_key=predefined_split_key,
#     training_fraction=training_fraction,
#     validation_fraction=validation_fraction,
#     test_fraction=test_fraction,
#     num_selected_trials=num_selected_trials,
#     time_column=time_column,
#     time_series_identifier_column=time_series_identifier_column,
#     time_series_attribute_columns=time_series_attribute_columns,
#     available_at_forecast_columns=available_at_forecast_columns,
#     unavailable_at_forecast_columns=unavailable_at_forecast_columns,
#     forecast_horizon=forecast_horizon,
#     context_window=context_window,
#     feature_transform_engine_dataflow_subnetwork=dataflow_subnetwork,
#     feature_transform_engine_dataflow_use_public_ips=dataflow_use_public_ips,
#     stage_1_tuning_result_artifact_uri=stage_1_tuning_result_artifact_uri,
# )

# job_id = "l2l-forecasting-skip-architecture-search-{}".format(uuid.uuid4())
# job = aiplatform.PipelineJob(
#     display_name=job_id,
#     location=REGION,  # launches the pipeline job in the specified region
#     template_path=template_path,
#     job_id=job_id,
#     pipeline_root=root_dir,
#     parameter_values=parameter_values,
#     enable_caching=False,
# )

# job.run()

# # Get model URI
# skip_architecture_search_pipeline_task_details = (
#     job.gca_resource.job_detail.task_details
# )


Creating PipelineJob


INFO:google.cloud.aiplatform.pipeline_jobs:Creating PipelineJob


PipelineJob created. Resource name: projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0


INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob created. Resource name: projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0


To use this PipelineJob in another session:


INFO:google.cloud.aiplatform.pipeline_jobs:To use this PipelineJob in another session:


pipeline_job = aiplatform.PipelineJob.get('projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0')


INFO:google.cloud.aiplatform.pipeline_jobs:pipeline_job = aiplatform.PipelineJob.get('projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0')


View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0?project=294348452381


INFO:google.cloud.aiplatform.pipeline_jobs:View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0?project=294348452381


PipelineJob projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0 current state:
PipelineState.PIPELINE_STATE_RUNNING


PipelineJob run completed. Resource name: projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-skip-architecture-search-d4344cf6-d78a-40e2-962f-b8b0cc825ed0



## TiDE (a.k.a. FastNN) training
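
The training budget in the cell below is expressed in milli node hours: 1,000 milli node hours is one node hour, so 250 corresponds to 15 minutes. As a quick sanity check, the conversion can be sketched as follows. `milli_node_hours_to_minutes` is a hypothetical helper written for illustration and is not part of the Vertex AI SDK.

```python
def milli_node_hours_to_minutes(milli_node_hours: int) -> float:
    """Convert a budget in milli node hours to minutes (hypothetical helper)."""
    # 1,000 milli node hours = 1 node hour = 60 minutes.
    return milli_node_hours / 1000 * 60


tide_budget_minutes = milli_node_hours_to_minutes(250)  # TiDE budget in this notebook
l2l_budget_minutes = milli_node_hours_to_minutes(500)  # probabilistic budget used later
print(tide_budget_minutes, l2l_budget_minutes)
```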


In [18]:
# Number of weak models in the final ensemble model.
num_selected_trials = 5

train_budget_milli_node_hours = 250  # 15 minutes

(
    template_path,
    parameter_values,
) = automl_forecasting_utils.get_time_series_dense_encoder_forecasting_pipeline_and_parameters(
    project=PROJECT_ID,
    location=REGION,
    root_dir=root_dir,
    target_column=target_column,
    optimization_objective=optimization_objective,
    transformations=transform_config_path,
    train_budget_milli_node_hours=train_budget_milli_node_hours,
    data_source_csv_filenames=data_source_csv_filenames,
    data_source_bigquery_table_path=data_source_bigquery_table_path,
    weight_column=weight_column,
    predefined_split_key=predefined_split_key,
    training_fraction=training_fraction,
    validation_fraction=validation_fraction,
    test_fraction=test_fraction,
    num_selected_trials=num_selected_trials,
    time_column=time_column,
    time_series_identifier_column=time_series_identifier_column,
    time_series_attribute_columns=time_series_attribute_columns,
    available_at_forecast_columns=available_at_forecast_columns,
    unavailable_at_forecast_columns=unavailable_at_forecast_columns,
    forecast_horizon=forecast_horizon,
    context_window=context_window,
    feature_transform_engine_dataflow_subnetwork=dataflow_subnetwork,
    feature_transform_engine_dataflow_use_public_ips=dataflow_use_public_ips,
    # enable_probabilistic_inference=True,
    # Quantile forecast: TiDE without probabilistic inference requires the
    # `minimize-quantile-loss` optimization objective.
    # quantiles=",".join(map(lambda x: str(x), [0.25, 0.5, 0.9])),
)

job_id = "tide-forecasting-{}".format(uuid.uuid4())
job = aiplatform.PipelineJob(
    display_name=job_id,
    location=REGION,  # launches the pipeline job in the specified region
    template_path=template_path,
    job_id=job_id,
    pipeline_root=root_dir,
    parameter_values=parameter_values,
    enable_caching=False,
)

job.run()


pipeline_task_details = job.gca_resource.job_detail.task_details

Creating PipelineJob
PipelineJob created. Resource name: projects/934903580331/locations/us-central1/pipelineJobs/tide-forecasting-0e512ce4-1566-4be1-9993-b28660bd66c7
To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/934903580331/locations/us-central1/pipelineJobs/tide-forecasting-0e512ce4-1566-4be1-9993-b28660bd66c7')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/tide-forecasting-0e512ce4-1566-4be1-9993-b28660bd66c7?project=934903580331
PipelineJob projects/934903580331/locations/us-central1/pipelineJobs/tide-forecasting-0e512ce4-1566-4be1-9993-b28660bd66c7 current state:
PipelineState.PIPELINE_STATE_RUNNING

## Probabilistic training
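
The pipeline call below passes `quantiles` as a comma-separated string built from a list of floats. Here is a minimal sketch of that formatting step. `format_quantiles` is a hypothetical helper, and its range check (each value strictly between 0 and 1) is an assumption added for illustration, not the pipeline's own validation.

```python
from typing import Sequence


def format_quantiles(quantiles: Sequence[float]) -> str:
    """Join quantiles into the comma-separated string used in this notebook."""
    for q in quantiles:
        # Assumed sanity check: a quantile is a probability strictly inside (0, 1).
        if not 0.0 < q < 1.0:
            raise ValueError(f"quantile out of range: {q}")
    return ",".join(str(q) for q in quantiles)


quantiles_arg = format_quantiles([0.25, 0.5, 0.9])
print(quantiles_arg)
```

Building the string through one helper keeps the list of quantiles in a single place if the same values are later needed to interpret the prediction output.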

In [None]:
# Number of weak models in the final ensemble model.
num_selected_trials = 5

train_budget_milli_node_hours = 500  # 30 minutes

(
    template_path,
    parameter_values,
) = automl_forecasting_utils.get_learn_to_learn_forecasting_pipeline_and_parameters(
    project=PROJECT_ID,
    location=REGION,
    root_dir=root_dir,
    target_column=target_column,
    optimization_objective=optimization_objective,
    transformations=transform_config_path,
    train_budget_milli_node_hours=train_budget_milli_node_hours,
    data_source_csv_filenames=data_source_csv_filenames,
    data_source_bigquery_table_path=data_source_bigquery_table_path,
    weight_column=weight_column,
    predefined_split_key=predefined_split_key,
    training_fraction=training_fraction,
    validation_fraction=validation_fraction,
    test_fraction=test_fraction,
    num_selected_trials=num_selected_trials,
    time_column=time_column,
    time_series_identifier_column=time_series_identifier_column,
    time_series_attribute_columns=time_series_attribute_columns,
    available_at_forecast_columns=available_at_forecast_columns,
    unavailable_at_forecast_columns=unavailable_at_forecast_columns,
    forecast_horizon=forecast_horizon,
    context_window=context_window,
    feature_transform_engine_dataflow_subnetwork=dataflow_subnetwork,
    feature_transform_engine_dataflow_use_public_ips=dataflow_use_public_ips,
    enable_probabilistic_inference=True,
    # Quantile forecast.
    quantiles=",".join(map(str, [0.25, 0.5, 0.9])),
)

job_id = "l2l-forecasting-probabilistic-inference-{}".format(uuid.uuid4())
job = aiplatform.PipelineJob(
    display_name=job_id,
    location=REGION,  # launches the pipeline job in the specified region
    template_path=template_path,
    job_id=job_id,
    pipeline_root=root_dir,
    parameter_values=parameter_values,
    enable_caching=False,
)

job.run()


pipeline_task_details = job.gca_resource.job_detail.task_details

Creating PipelineJob


PipelineJob created. Resource name: projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-probabilistic-inference-bb3f26ad-af03-48e2-b320-d2c89ac8b620


To use this PipelineJob in another session:


pipeline_job = aiplatform.PipelineJob.get('projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-probabilistic-inference-bb3f26ad-af03-48e2-b320-d2c89ac8b620')


View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/l2l-forecasting-probabilistic-inference-bb3f26ad-af03-48e2-b320-d2c89ac8b620?project=294348452381



PipelineJob projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-probabilistic-inference-bb3f26ad-af03-48e2-b320-d2c89ac8b620 current state:
PipelineState.PIPELINE_STATE_RUNNING


PipelineJob run completed. Resource name: projects/294348452381/locations/us-central1/pipelineJobs/l2l-forecasting-probabilistic-inference-bb3f26ad-af03-48e2-b320-d2c89ac8b620



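## Experiment logging

The two helpers below follow the Vertex AI documentation on [experiments with pipelines](https://cloud.google.com/vertex-ai/docs/experiments/add-pipelinerun-experiment#associate-pipeline-run-with-experiment-run): the first submits a new pipeline job under an experiment, and the second logs an existing pipeline job to an experiment run.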

In [None]:
from typing import Any, Dict, Optional


def log_pipeline_job_to_experiment_sample(
    experiment_name: str,
    pipeline_job_display_name: str,
    template_path: str,
    pipeline_root: str,
    parameter_values: Optional[Dict[str, Any]],
    project: str,
    location: str,
) -> None:
    """Create a pipeline job and submit it as part of an experiment."""
    aiplatform.init(project=project, location=location)

    pipeline_job = aiplatform.PipelineJob(
        display_name=pipeline_job_display_name,
        template_path=template_path,
        pipeline_root=pipeline_root,
        parameter_values=parameter_values,
    )

    # submit() returns without waiting for the pipeline run to finish.
    pipeline_job.submit(experiment=experiment_name)

In [None]:
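def log_pipeline_job_sample(
    experiment_name: str,
    run_name: str,
    pipeline_job: aiplatform.PipelineJob,
    project: str,
    location: str,
) -> None:
    """Attach an existing pipeline job to an experiment run."""
    aiplatform.init(experiment=experiment_name, project=project, location=location)

    # resume=True continues the existing run named `run_name` instead of creating a new one.
    aiplatform.start_run(run=run_name, resume=True)

    aiplatform.log(pipeline_job=pipeline_job)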

## Batch prediction

### For liquor dataset
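
The prediction cell below reads from a `bq://project.dataset.table` path in which `[your-project-id]` is a placeholder. Here is a small sketch of building that path from variables, so the placeholder cannot be left in by accident. `build_bq_uri` is a hypothetical helper, and `my-project` is an example value rather than a real project.

```python
def build_bq_uri(project: str, dataset: str, table: str) -> str:
    """Build a BigQuery URI of the form bq://project.dataset.table."""
    for name, value in (("project", project), ("dataset", dataset), ("table", table)):
        # Reject empty values and unfilled placeholders such as "[your-project-id]".
        if not value or "[" in value or "]" in value:
            raise ValueError(f"{name} is not set: {value!r}")
    return f"bq://{project}.{dataset}.{table}"


example_uri = build_bq_uri(
    "my-project",  # example value; replace with your own project ID
    "iowa_liquor_sales_forecasting_us_central1",
    "2021_sales_predict",
)
print(example_uri)
```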

In [None]:
print(f"Running batch prediction for model: {forecasting_mp_model.display_name}")

batch_predict_bq_output_uri_prefix = f"bq://{PROJECT_ID}"

# Do not use the public table below directly: FTE (Feature Transform Engine)
# does not support a US dataset when running in us-central1. Copy this
# BigQuery table to your own BigQuery dataset that is located in us-central1.
# PREDICTION_DATASET_BQ_PATH = (
#     "bq://bigquery-public-data:iowa_liquor_sales_forecasting.2021_sales_predict"
# )

# Replace [your-project-id] with the project that holds the copied table.
PREDICTION_DATASET_BQ_PATH = (
    "bq://[your-project-id].iowa_liquor_sales_forecasting_us_central1.2021_sales_predict"
)

batch_prediction_job = forecasting_mp_model.batch_predict(
    job_display_name="forecasting_iowa_liquor_sales_forecasting_predictions",
    bigquery_source=PREDICTION_DATASET_BQ_PATH,
    instances_format="bigquery",
    bigquery_destination_prefix=batch_predict_bq_output_uri_prefix,
    predictions_format="bigquery",
    generate_explanation=False,
    sync=True,  # block until the batch prediction job finishes
)

print(batch_prediction_job)

Creating BatchPredictionJob


BatchPredictionJob created. Resource name: projects/294348452381/locations/us-central1/batchPredictionJobs/4296530389217837056


To use this BatchPredictionJob in another session:


bpj = aiplatform.BatchPredictionJob('projects/294348452381/locations/us-central1/batchPredictionJobs/4296530389217837056')


View Batch Prediction Job:
https://console.cloud.google.com/ai/platform/locations/us-central1/batch-predictions/4296530389217837056?project=294348452381



BatchPredictionJob projects/294348452381/locations/us-central1/batchPredictionJobs/4296530389217837056 current state:
JobState.JOB_STATE_RUNNING


BatchPredictionJob projects/294348452381/locations/us-central1/batchPredictionJobs/4296530389217837056 current state:
JobState.JOB_STATE_SUCCEEDED


BatchPredictionJob run completed. Resource name: projects/294348452381/locations/us-central1/batchPredictionJobs/4296530389217837056



<google.cloud.aiplatform.jobs.BatchPredictionJob object at 0x7f8878d0cb80> 
resource name: projects/294348452381/locations/us-central1/batchPredictionJobs/4296530389217837056
