# Continuous Training with AutoML Vertex Pipelines

**Learning Objectives:**
1. Learn how to use Vertex AutoML pre-built components
1. Learn how to build a Vertex AutoML pipeline with these components using BigQuery as a data source
1. Learn how to compile, upload, and run the Vertex AutoML pipeline


In this lab, you will build, deploy, and run a Vertex AutoML pipeline that orchestrates the **Vertex AI AutoML** services to train, tune, and deploy a model.

## Setup

In [221]:
from google.cloud import aiplatform

In [222]:
REGION = "us-central1"
PROJECT = !(gcloud config get-value project)
PROJECT = PROJECT[0]

In [223]:
# Set `PATH` to include the directory containing KFP CLI
PATH = %env PATH
%env PATH=/home/jupyter/.local/bin:{PATH}

env: PATH=/home/jupyter/.local/bin:/home/jupyter/.local/bin:/home/jupyter/.local/bin:/home/jupyter/.local/bin:/usr/local/cuda/bin:/opt/conda/bin:/opt/conda/condabin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/home/jupyter/.local/bin:.:/home/jupyter/.local/bin:.:/home/jupyter/.local/bin:.:/home/jupyter/.local/bin:.:/home/jupyter/.local/bin:.


## Understanding the pipeline design


The workflow implemented by the pipeline is defined using a Python-based Domain Specific Language (DSL). The pipeline's DSL is in the `pipeline_vertex/pipeline_vertex_automl.py` file that we will generate below.

The pipeline's DSL has been designed to avoid hardcoding any environment specific settings like file paths or connection strings. These settings are provided to the pipeline code through a set of environment variables.
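
As a minimal sketch of this pattern, the snippet below reads settings from a mapping of environment variables, with optional defaults. The helper name `read_setting` and the example values are hypothetical and not part of the lab; the generated pipeline file calls `os.getenv` directly.

```python
import os
from typing import Mapping, Optional


def read_setting(
    name: str,
    default: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the setting `name` from `env`, falling back to `default`."""
    value = env.get(name, default)
    if value is None:
        # Fail early instead of silently building names such as "bq://None...".
        raise KeyError(f"Required environment variable {name!r} is not set")
    return value


# Illustrative values only (assumption); in the lab they are set with `%env`.
example_env = {"PROJECT": "my-project", "PIPELINE_ROOT": "gs://my-bucket/pipeline"}

project = read_setting("PROJECT", env=example_env)
region = read_setting("REGION", default="us-central1", env=example_env)
```

Raising on a missing required variable is a design choice of this sketch: plain `os.getenv` returns `None` instead, which only surfaces later when the value is used.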


## Building and deploying the pipeline

Let us import the required libraries, define the pipeline variables, and then write the pipeline to disk:

In [224]:
# General
import os
import sys
import random
from datetime import datetime, timedelta
import json

# Vertex Pipelines
from typing import NamedTuple
import kfp

In [225]:
GCP_PROJECTS = !gcloud config get-value project
PROJECT_ID = GCP_PROJECTS[0]
BUCKET_NAME = f"{PROJECT_ID}-fraudfinder"
config = !gsutil cat gs://{BUCKET_NAME}/config/notebook_env.py
print(config.n)
exec(config.n)


BUCKET_NAME          = "qwiklabs-asl-02-111b5e486eb8-fraudfinder"
PROJECT              = "qwiklabs-asl-02-111b5e486eb8"
REGION               = "us-central1"
ID                   = "9m3ic"
FEATURESTORE_ID      = "fraudfinder_9m3ic"
MODEL_NAME           = "ff_model"
ENDPOINT_NAME        = "ff_model_endpoint"
TRAINING_DS_SIZE     = "1000"



In [226]:
#print("kfp version:", kfp.__version__)
ID='o90cp'

In [227]:
# Components variables
BASE_IMAGE = "python:3.7"
COMPONENTS_DIR = os.path.join(os.curdir, "pipelines", "components")
INGEST_FEATURE_STORE = f"{COMPONENTS_DIR}/ingest_feature_store_{ID}.yaml"
EVALUATE = f"{COMPONENTS_DIR}/evaluate_{ID}.yaml"

# Pipeline variables
PIPELINE_NAME = f"fraud-finder-xgb-pipeline-{ID}"
PIPELINE_DIR = os.path.join(os.curdir, "pipelines")
PIPELINE_ROOT = f"gs://{BUCKET_NAME}/pipelines"
PIPELINE_PACKAGE_PATH = f"{PIPELINE_DIR}/pipeline_{ID}.json"

# Feature Store component variables
BQ_DATASET = "tx"
READ_INSTANCES_TABLE = f"ground_truth_{ID}"
READ_INSTANCES_URI = f"bq://{PROJECT_ID}.{BQ_DATASET}.{READ_INSTANCES_TABLE}"

# Dataset component variables
DATASET_NAME = f"fraud_finder_dataset_{ID}"

# Training component variables
JOB_NAME = f"fraudfinder-train-xgb-{ID}"
MODEL_NAME = f"{MODEL_NAME}_xgb_pipeline_{ID}"
CONTAINER_URI = "us-docker.pkg.dev/vertex-ai/training/xgboost-cpu.1-1:latest"
MODEL_SERVING_IMAGE_URI = (
    "us-docker.pkg.dev/vertex-ai/prediction/xgboost-cpu.1-1:latest"
)
ARGS = json.dumps(["--bucket", f"gs://{BUCKET_NAME}"])
IMAGE_REPOSITORY = f"fraudfinder-{ID}"
IMAGE_NAME = "dask-xgb-classificator"
IMAGE_TAG = "v1"
IMAGE_URI = f"us-central1-docker.pkg.dev/{PROJECT_ID}/{IMAGE_REPOSITORY}/{IMAGE_NAME}:{IMAGE_TAG}"  # TODO: get it from config

# Evaluation component variables
METRICS_URI = f"gs://{BUCKET_NAME}/deliverables/metrics.json"
AVG_PR_THRESHOLD = 0.2
AVG_PR_CONDITION = "avg_pr_condition"

# Endpoint variables
ENDPOINT_NAME = f"{ENDPOINT_NAME}_xgb_pipeline_{ID}"

In [228]:
#!pip install --force-reinstall --no-deps kfp==1.8.22
#!pip install --upgrade 'google-cloud-pipeline-components==2.8.0'

In [229]:
%%writefile -a ./pipeline_vertex/fs_import_component.py
"""Lightweight component that ingests features from Vertex AI Feature Store."""
from typing import Dict, List, NamedTuple

from kfp.dsl import Metrics, Output, component

@component(
    base_image="python:3.7",
    packages_to_install=["google-cloud-aiplatform==1.21.0"],
)
def ingest_features_gcs(
    project_id: str,
    region: str,
    bucket_name: str,
    feature_store_id: str,
    read_instances_uri: str,
) -> NamedTuple("Outputs", [("snapshot_uri_paths", str),],):
    # Libraries --------------------------------------------------------------------------------------------------------------------------
    from datetime import datetime
    import glob
    import urllib.parse  # `import urllib` alone does not guarantee that `urllib.parse` is loaded
    import json
    #import logger TODO:
    from typing import NamedTuple

    # Feature Store
    from google.cloud.aiplatform import Featurestore, EntityType, Feature

    # Variables --------------------------------------------------------------------------------------------------------------------------
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    api_endpoint = region + "-aiplatform.googleapis.com"
    bucket = urllib.parse.urlsplit(bucket_name).netloc
    export_uri = (
        f"{bucket_name}/data/snapshots/{timestamp}"  # format as new gsfuse requires
    )
    export_uri_path = f"/gcs/{bucket}/data/snapshots/{timestamp}"
    customer_entity = "customer"
    terminal_entity = "terminal"
    serving_feature_ids = {customer_entity: ["*"], terminal_entity: ["*"]}

    print(timestamp)
    print(bucket)
    print(export_uri)
    print(export_uri_path)
    print(customer_entity)
    print(terminal_entity)
    print(serving_feature_ids)

    # Main -------------------------------------------------------------------------------------------------------------------------------

    ## Define the feature store resource path
    feature_store_resource_path = (
        f"projects/{project_id}/locations/{region}/featurestores/{feature_store_id}"
    )
    print("Feature Store: \t", feature_store_resource_path)

    ## Run batch job request
    ff_feature_store = Featurestore(feature_store_resource_path)
    ff_feature_store.batch_serve_to_gcs(
        gcs_destination_output_uri_prefix=export_uri,
        gcs_destination_type="csv",
        serving_feature_ids=serving_feature_ids,
        read_instances_uri=read_instances_uri,
        pass_through_fields=["tx_fraud", "tx_amount"],
    )

    # Store metadata
    snapshot_pattern = f"{export_uri_path}/*.csv"
    snapshot_files = glob.glob(snapshot_pattern)
    snapshot_files_fmt = [p.replace("/gcs/", "gs://") for p in snapshot_files]
    snapshot_files_string = json.dumps(snapshot_files_fmt)

    component_outputs = NamedTuple(
        "Outputs",
        [
            ("snapshot_uri_paths", str),
        ],
    )

    print(snapshot_pattern)
    print(snapshot_files)
    print(snapshot_files_fmt)
    print(snapshot_files_string)

    return component_outputs(snapshot_files_string)

Appending to ./pipeline_vertex/fs_import_component.py
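
The component above lists the exported CSV files through the `/gcs/<bucket>/...` mount path and returns them as a JSON-encoded list of `gs://` URIs. The sketch below illustrates that mapping in isolation. The helper name `fuse_paths_to_gcs_uris`, the bucket name, and the file names are hypothetical, and the sketch assumes the bucket is mounted under `/gcs/`, as the component code expects.

```python
import json
from typing import List


def fuse_paths_to_gcs_uris(paths: List[str]) -> List[str]:
    """Map mounted paths (/gcs/<bucket>/...) to gs://<bucket>/... URIs."""
    prefix = "/gcs/"
    return ["gs://" + p[len(prefix):] if p.startswith(prefix) else p for p in paths]


# Hypothetical snapshot files, standing in for what glob.glob() would return.
local_files = [
    "/gcs/my-bucket/data/snapshots/20250101000000/000000000000.csv",
    "/gcs/my-bucket/data/snapshots/20250101000000/000000000001.csv",
]

uris = fuse_paths_to_gcs_uris(local_files)
serialized = json.dumps(uris)      # the component returns the list as one string
restored = json.loads(serialized)  # a consumer can recover the list this way
```

Unlike the `str.replace` call in the component, this sketch only rewrites the leading `/gcs/` prefix, so a path that contains `/gcs/` further along is left unchanged.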


In [274]:
%%writefile ./pipeline_vertex/pipeline_vertex_automl.py
# Copyright 2021 Google LLC

# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at

# https://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS"
# BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""Kubeflow FraudFinder AutoML Pipeline."""

import os

from google_cloud_pipeline_components.v1.automl.training_job import (
    AutoMLTabularTrainingJobRunOp,
)
from google_cloud_pipeline_components.v1.dataset import TabularDatasetCreateOp
from google_cloud_pipeline_components.v1.endpoint import (
    EndpointCreateOp,
    ModelDeployOp,
)

from kfp import dsl
from fs_import_component import ingest_features_gcs

PIPELINE_ROOT = os.getenv("PIPELINE_ROOT")
PROJECT = os.getenv("PROJECT")
REGION = os.getenv("REGION", "us-central1")
DATASET_SOURCE = os.getenv("DATASET_SOURCE")
PIPELINE_NAME = os.getenv("PIPELINE_NAME", "fraudfinder")
DISPLAY_NAME = os.getenv("MODEL_DISPLAY_NAME", PIPELINE_NAME)
TARGET_COLUMN = os.getenv("TARGET_COLUMN", "tx_fraud")
SERVING_MACHINE_TYPE = os.getenv("SERVING_MACHINE_TYPE", "n1-standard-16")
ID = os.getenv("ID")

column_specs = {
    'tx_amount': "numeric",
    'customer_id_avg_amount_14day_window': "numeric",
    'customer_id_avg_amount_15min_window': "numeric",
    'customer_id_avg_amount_1day_window': "numeric",
    'customer_id_avg_amount_30min_window': "numeric",
    'customer_id_avg_amount_60min_window': "numeric",
    'customer_id_avg_amount_7day_window': "numeric",
    'customer_id_nb_tx_14day_window': "numeric",
    'customer_id_nb_tx_15min_window': "numeric",
    'customer_id_nb_tx_1day_window': "numeric",
    'customer_id_nb_tx_30min_window': "numeric",
    'customer_id_nb_tx_60min_window': "numeric",
    'customer_id_nb_tx_7day_window': "numeric",
    'terminal_id_avg_amount_15min_window': "numeric",
    'terminal_id_avg_amount_30min_window': "numeric",
    'terminal_id_avg_amount_60min_window': "numeric",
    'terminal_id_nb_tx_14day_window': "numeric",
    'terminal_id_nb_tx_15min_window': "numeric",
    'terminal_id_nb_tx_1day_window': "numeric",
    'terminal_id_nb_tx_30min_window': "numeric",
    'terminal_id_nb_tx_60min_window': "numeric",
    'terminal_id_nb_tx_7day_window': "numeric",
    'terminal_id_risk_14day_window': "numeric",
    'terminal_id_risk_1day_window': "numeric",
    'terminal_id_risk_7day_window': "numeric"
}

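FEATURESTORE_ID = f"fraudfinder_{ID}"
# Read the bucket from the environment; the default follows the lab's
# "<project>-fraudfinder" naming convention.
BUCKET_NAME = os.getenv("BUCKET_NAME", f"{PROJECT}-fraudfinder")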
# Feature Store component variables
BQ_DATASET = "tx"
READ_INSTANCES_TABLE = f"ground_truth_{ID}"
READ_INSTANCES_URI = f"bq://{PROJECT}.{BQ_DATASET}.{READ_INSTANCES_TABLE}"
bucket_name = f"gs://{BUCKET_NAME}"

@dsl.pipeline(
    name=f"{PIPELINE_NAME}-vertex-automl-pipeline",
    description=f"AutoML Vertex Pipeline for {PIPELINE_NAME}",
    pipeline_root=PIPELINE_ROOT
)
def create_pipeline():

    #Ingest data from featurestore
    ingest_features_op = ingest_features_gcs(
        project_id=PROJECT,
        region=REGION,
        bucket_name=bucket_name,
        feature_store_id=FEATURESTORE_ID,
        read_instances_uri=READ_INSTANCES_URI,
    )
    
    # Create dataset
    dataset_create_task = TabularDatasetCreateOp(
        project=PROJECT,
        display_name=DISPLAY_NAME,
        gcs_source=ingest_features_op.outputs["snapshot_uri_paths"],
    ).after(ingest_features_op)

    # dataset_create_task = TabularDatasetCreateOp(
    #     display_name=DISPLAY_NAME,
    #     bq_source=DATASET_SOURCE,
    #     project=PROJECT,
    # ).after(ingest_features_op)
    
    # Run the AutoML Tabular training job.
    # This is the core component that trains the model.
    automl_training_task = AutoMLTabularTrainingJobRunOp(
        project=PROJECT,
        display_name=DISPLAY_NAME,
        optimization_prediction_type="classification",
        dataset=dataset_create_task.outputs["dataset"],
        target_column=TARGET_COLUMN,
        timestamp_split_column_name='timestamp',
        training_fraction_split=0.8,
        validation_fraction_split=0.1,
        test_fraction_split=0.1,
        # Feature list configuration
        column_specs=column_specs,
        # New parameters for budget and early stopping
        budget_milli_node_hours=1000,  # 1000 milli-node hours = 1 node hour
        disable_early_stopping=False   # Explicitly set to False to enable early stopping
    )

    endpoint_create_task = EndpointCreateOp(
        project=PROJECT,
        display_name=DISPLAY_NAME,
    ).after(automl_training_task)

    model_deploy_task = ModelDeployOp(  # pylint: disable=unused-variable
        model=automl_training_task.outputs["model"],
        endpoint=endpoint_create_task.outputs["endpoint"],
        deployed_model_display_name=DISPLAY_NAME,
        dedicated_resources_machine_type=SERVING_MACHINE_TYPE,
        dedicated_resources_min_replica_count=1,
        dedicated_resources_max_replica_count=1,
    )


Overwriting ./pipeline_vertex/pipeline_vertex_automl.py
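
The training step in the pipeline above sets `budget_milli_node_hours=1000` and split fractions of 0.8, 0.1, and 0.1. As a small sanity check, the sketch below converts the budget to node hours and verifies that the fractions sum to 1. Both helper functions are hypothetical and are not part of the pipeline code.

```python
import math


def milli_node_hours_to_node_hours(milli_node_hours: int) -> float:
    """Convert a training budget from milli node hours to node hours."""
    return milli_node_hours / 1000


def check_split(training: float, validation: float, test: float) -> bool:
    """Return True if the three split fractions sum to 1 (within float tolerance)."""
    return math.isclose(training + validation + test, 1.0)


budget_hours = milli_node_hours_to_node_hours(1000)  # budget used in the pipeline
split_ok = check_split(0.8, 0.1, 0.1)                # fractions used in the pipeline
```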


### Compile the pipeline

Let's start by defining the environment variables that will be passed to the pipeline compiler:

In [275]:
ARTIFACT_STORE = f"gs://{PROJECT}-kfp-artifact-store"
PIPELINE_ROOT = f"{ARTIFACT_STORE}/pipeline"
DATASET_SOURCE = f"bq://{PROJECT}.tx.train_table_automl_20250908"
# PIPELINE_NAME="ff-demo"
# DISPLAY_NAME="dd-name"

%env PIPELINE_ROOT={PIPELINE_ROOT}
%env PROJECT={PROJECT}
%env REGION={REGION}
%env DATASET_SOURCE={DATASET_SOURCE}
%env ID={ID}

env: PIPELINE_ROOT=gs://qwiklabs-asl-02-111b5e486eb8-kfp-artifact-store/pipeline
env: PROJECT=qwiklabs-asl-02-111b5e486eb8
env: REGION=us-central1
env: DATASET_SOURCE=bq://qwiklabs-asl-02-111b5e486eb8.tx.train_table_automl_20250908
env: ID=o90cp


Let us make sure that the `ARTIFACT_STORE` bucket exists, and create it if it does not:

In [276]:
!gsutil ls | grep ^{ARTIFACT_STORE}/$ || gsutil mb -l {REGION} {ARTIFACT_STORE}

gs://qwiklabs-asl-02-111b5e486eb8-kfp-artifact-store/


#### Use the CLI compiler to compile the pipeline

We compile the pipeline from the Python file we generated into a YAML description using the following command:

In [277]:
PIPELINE_YAML = "fraudfinder_automl_vertex_pipeline.yaml"

In [278]:
#!pip install --upgrade kfp --user

In [279]:
!kfp dsl compile --py pipeline_vertex/pipeline_vertex_automl.py --output $PIPELINE_YAML

/home/jupyter/fraudfinder/asl-ml-immersion/notebooks/fraudfinder/vertex_ai/fraudfinder_automl_vertex_pipeline.yaml


**Note:** You can also use the Python SDK to compile the pipeline:

```python
from kfp import compiler

compiler.Compiler().compile(
    pipeline_func=create_pipeline,
    package_path=PIPELINE_YAML,
)
```

The result is the compiled pipeline file. Its first lines are shown below:

In [280]:
!head {PIPELINE_YAML}

# PIPELINE DEFINITION
# Name: fraudfinder-vertex-automl-pipeline
# Description: AutoML Vertex Pipeline for fraudfinder
components:
  comp-automl-tabular-training-job:
    executorLabel: exec-automl-tabular-training-job
    inputDefinitions:
      artifacts:
        dataset:
          artifactType:


### Deploy the pipeline package

In [None]:
from google.cloud import aiplatform
aiplatform.init(project=PROJECT, location=REGION)

pipeline = aiplatform.PipelineJob(
    display_name="automl_fraudfinder_kfp_pipeline",
    template_path=PIPELINE_YAML,
    enable_caching=True,
)

pipeline.run()

Creating PipelineJob
PipelineJob created. Resource name: projects/25570882233/locations/us-central1/pipelineJobs/fraudfinder-vertex-automl-pipeline-20250910160225
To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/25570882233/locations/us-central1/pipelineJobs/fraudfinder-vertex-automl-pipeline-20250910160225')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/fraudfinder-vertex-automl-pipeline-20250910160225?project=25570882233
PipelineJob projects/25570882233/locations/us-central1/pipelineJobs/fraudfinder-vertex-automl-pipeline-20250910160225 current state:
PipelineState.PIPELINE_STATE_RUNNING
PipelineJob projects/25570882233/locations/us-central1/pipelineJobs/fraudfinder-vertex-automl-pipeline-20250910160225 current state:
PipelineState.PIPELINE_STATE_RUNNING
PipelineJob projects/25570882233/locations/us-central1/pipelineJobs/fraudfinder-vertex-automl-pipeline-20250910160225 current state:


Copyright 2021 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.