# Build a pipeline for continuous model training

## Objectives

The process consists of the following steps:

1. Acquire and prepare the dataset in BigQuery.
2. Create and upload a custom training package. When executed, it reads data from the dataset and trains the model.
3. Build a Vertex AI Pipeline. This pipeline executes the custom training package, uploads the model to the Vertex AI Model Registry, runs batch prediction and the evaluation job, deploys the model to an endpoint, and sends an email notification.
4. Manually run the pipeline.
5. Create a Cloud Function with an Eventarc trigger that runs the pipeline whenever new data is inserted into the BigQuery dataset.
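
Step 5 could look roughly like the sketch below. It is not the function used in this tutorial. The payload field names (`protoPayload.resourceName`, `protoPayload.metadata.tableDataChange.insertedRowsCount`) are assumptions based on the BigQuery audit-log format and should be verified against a real event. `WATCHED_TABLE`, `inserted_rows`, `should_trigger`, `submit_pipeline`, and `handle_payload` are hypothetical names.

```python
from typing import Any, Callable, Mapping

# Hypothetical: audit-log resource name of the table to watch.
WATCHED_TABLE = "projects/hybrid-vertex/datasets/mlops/tables/chicago"


def inserted_rows(payload: Mapping[str, Any]) -> int:
    """Return the inserted-row count from an (assumed) audit-log payload."""
    change = (
        payload.get("protoPayload", {})
        .get("metadata", {})
        .get("tableDataChange", {})
    )
    # Assumption: int64 fields arrive as strings in JSON logs, so convert.
    return int(change.get("insertedRowsCount") or 0)


def should_trigger(payload: Mapping[str, Any], table: str = WATCHED_TABLE) -> bool:
    """True only when rows were inserted into the watched table."""
    resource = payload.get("protoPayload", {}).get("resourceName", "")
    return resource == table and inserted_rows(payload) > 0


def submit_pipeline(template_uri: str, pipeline_root: str, parameters: dict) -> None:
    """Submit a run of the compiled pipeline template."""
    # Imported lazily so the filtering logic runs without the SDK installed.
    from google.cloud import aiplatform

    job = aiplatform.PipelineJob(
        display_name="ct-training-triggered",  # hypothetical display name
        template_path=template_uri,
        parameter_values=parameters,
        pipeline_root=pipeline_root,
        enable_caching=False,
    )
    job.submit()


def handle_payload(payload: Mapping[str, Any], submit: Callable[[], None]) -> bool:
    """Call `submit` if the event qualifies; return whether it was called."""
    if not should_trigger(payload):
        return False
    submit()
    return True
```

In a deployed function, the entry point would pass the event data and a closure over the pipeline settings, for example `handle_payload(cloud_event.data, lambda: submit_pipeline(TEMPLATE_URI, PIPELINE_ROOT, parameters))`. Keeping the filter separate from the submission makes it possible to test the trigger condition without touching Vertex AI.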

In [1]:
!pwd

/home/jupyter/mlops-w-vertex-ai/01-ct_training


In [2]:
! python3 -c "from google.cloud import aiplatform; print('aiplatform SDK version: {}'.format(aiplatform.__version__))"
! python3 -c "from google.cloud import bigquery; print('bigquery SDK version: {}'.format(bigquery.__version__))"
! python3 -c "import kfp; print('KFP SDK version: {}'.format(kfp.__version__))"
! python3 -c "import google_cloud_pipeline_components; print('google_cloud_pipeline_components version: {}'.format(google_cloud_pipeline_components.__version__))"

aiplatform SDK version: 1.71.0
bigquery SDK version: 3.25.0
KFP SDK version: 2.7.0
google_cloud_pipeline_components version: 2.17.0


In [22]:
VERSION = "v9"

In [23]:
import os
from google.cloud import aiplatform
from google.cloud import bigquery
from google.cloud import storage

from pprint import pprint
import time

print(f'aiplatform SDK version: {aiplatform.__version__}')
print(f'bigquery SDK version: {bigquery.__version__}')

PROJECT_ID = "hybrid-vertex"
REGION = "us-central1"
BUCKET_NAME = f"ct-pipeline-{VERSION}"
BUCKET_URI = f"gs://{BUCKET_NAME}"

bq_client = bigquery.Client(
    project=PROJECT_ID,
    location=REGION
)

# Set the project id
! gcloud config set project {PROJECT_ID}

aiplatform SDK version: 1.71.0
bigquery SDK version: 3.25.0
Updated property [core/project].


In [24]:
! gcloud storage buckets create $BUCKET_URI --location=$REGION --project=$PROJECT_ID

Creating gs://ct-pipeline-v9/...


# Custom training package

In [65]:
# !tree training_package

### Training package directory

```
training_package
├── __init__.py
├── setup.py
└── trainer
    ├── __init__.py
    └── task.py
```
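
The contents of `trainer/task.py` are not shown in this notebook. As a rough illustration only, the sketch below shows how such an entry point might parse the command-line flags that the worker pool spec passes later in this notebook (`--project-id`, `--data-dir`, `--training-dir`, `--bq-source`, `--experiment-run`, `--experiment-name`). The function name `parse_args`, the help strings, and which flags are required are assumptions, not the author's implementation.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the flags passed to the training module (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Training entry point (sketch)")
    parser.add_argument("--project-id", required=True, help="Google Cloud project ID")
    parser.add_argument("--data-dir", required=True, help="Directory for train/test data")
    parser.add_argument("--training-dir", required=True, help="Directory for model artifacts")
    parser.add_argument("--bq-source", required=True, help="BigQuery table: project.dataset.table")
    # Assumption: experiment tracking is optional, so these flags may be omitted.
    parser.add_argument("--experiment-run", default=None, help="Vertex Experiments run name")
    parser.add_argument("--experiment-name", default=None, help="Vertex Experiments name")
    return parser.parse_args(argv)


# Example with explicit arguments instead of sys.argv.
args = parse_args([
    "--project-id", "hybrid-vertex",
    "--data-dir", "/gcs/ct-pipeline-v9/data",
    "--training-dir", "/gcs/ct-pipeline-v9/artifacts",
    "--bq-source", "hybrid-vertex.mlops.chicago",
])
print(args.bq_source)
```

`argparse` converts the dashes in flag names to underscores, so `--bq-source` is read as `args.bq_source`.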

Run `setup.py` to create a source distribution for the training application.

In [25]:
! cd training_package && python setup.py sdist --formats=gztar && cd ..

running sdist
running egg_info
writing trainer.egg-info/PKG-INFO
writing dependency_links to trainer.egg-info/dependency_links.txt
writing requirements to trainer.egg-info/requires.txt
writing top-level names to trainer.egg-info/top_level.txt
reading manifest file 'trainer.egg-info/SOURCES.txt'
writing manifest file 'trainer.egg-info/SOURCES.txt'
running check
creating trainer-0.1
creating trainer-0.1/trainer
creating trainer-0.1/trainer.egg-info
creating trainer-0.1/trainer.egg-info/.ipynb_checkpoints
copying files to trainer-0.1...
copying README.md -> trainer-0.1
copying setup.py -> trainer-0.1
copying trainer/__init__.py -> trainer-0.1/trainer
copying trainer/task.py -> trainer-0.1/trainer
copying trainer.egg-info/PKG-INFO -> trainer-0.1/trainer.egg-info
copying trainer.egg-info/SOURCES.txt -> trainer-0.1/trainer.egg-info
copying trainer.egg-info/dependency_links.txt -> trainer-0.1/trainer.egg-info
copying trainer.egg-info/requires.txt -> trainer-0.1/trainer.egg-info
copying traine

In [26]:
DISTRIBUTION_NAME = "trainer-0.1.tar.gz"

In [27]:
# ! gcloud storage cp training_package/dist/trainer-0.2.tar.gz $BUCKET_URI/
! gcloud storage cp training_package/dist/$DISTRIBUTION_NAME $BUCKET_URI/

Copying file://training_package/dist/trainer-0.1.tar.gz to gs://ct-pipeline-v9/trainer-0.1.tar.gz
  Completed files 1/1 | 3.8kiB/3.8kiB                                          


# Vertex Experiments

In [28]:
EXPERIMENT_NAME   = f'training-{VERSION}'

# new experiment
invoke_time       = time.strftime("%Y%m%d-%H%M%S")
RUN_NAME          = f'run-{invoke_time}'

CHECKPT_DIR       = f"{BUCKET_URI}/{EXPERIMENT_NAME}/chkpoint"
BASE_OUTPUT_DIR   = f"{BUCKET_URI}/{EXPERIMENT_NAME}/{RUN_NAME}"
LOG_DIR           = f"{BASE_OUTPUT_DIR}/logs"
DATA_DIR          = f"{BASE_OUTPUT_DIR}/data"
ARTIFACTS_DIR     = f"{BASE_OUTPUT_DIR}/artifacts"

aiplatform.init(
    project=PROJECT_ID,
    staging_bucket=BUCKET_URI,
    location=REGION,
    # experiment=EXPERIMENT_NAME
)

# aiplatform.autolog()

print(f"EXPERIMENT_NAME   : {EXPERIMENT_NAME}")
print(f"RUN_NAME          : {RUN_NAME}\n")
print(f"CHECKPT_DIR       : {CHECKPT_DIR}")
print(f"BASE_OUTPUT_DIR   : {BASE_OUTPUT_DIR}")
print(f"LOG_DIR           : {LOG_DIR}")
print(f"DATA_DIR          : {DATA_DIR}")
print(f"ARTIFACTS_DIR     : {ARTIFACTS_DIR}")

EXPERIMENT_NAME   : training-v9
RUN_NAME          : run-20250114-041128

CHECKPT_DIR       : gs://ct-pipeline-v9/training-v9/chkpoint
BASE_OUTPUT_DIR   : gs://ct-pipeline-v9/training-v9/run-20250114-041128
LOG_DIR           : gs://ct-pipeline-v9/training-v9/run-20250114-041128/logs
DATA_DIR          : gs://ct-pipeline-v9/training-v9/run-20250114-041128/data
ARTIFACTS_DIR     : gs://ct-pipeline-v9/training-v9/run-20250114-041128/artifacts


In [29]:
! gsutil -q cp requirements.txt $ARTIFACTS_DIR/requirements.txt
! gsutil -q cp requirements.txt $DATA_DIR/requirements.txt

! gsutil ls $BASE_OUTPUT_DIR

gs://ct-pipeline-v9/training-v9/run-20250114-041128/artifacts/
gs://ct-pipeline-v9/training-v9/run-20250114-041128/data/


# Vertex Pipeline

In [30]:
EMAIL_RECIPIENTS = [ "jordantotten@google.com" ]

PIPELINE_ROOT = f"{BUCKET_URI}/pipeline_root/chicago-taxi-pipe"
WORKING_DIR = f"{PIPELINE_ROOT}/mlops-trigger-tutorial"
os.environ['AIP_MODEL_DIR'] = ARTIFACTS_DIR

PIPELINE_NAME = f"ct-training-{VERSION}"
PIPELINE_FILE = f"{PIPELINE_NAME}.yaml"

print(f"PIPELINE_ROOT : {PIPELINE_ROOT}")
print(f"WORKING_DIR   : {WORKING_DIR}")
print(f"PIPELINE_NAME : {PIPELINE_NAME}")
print(f"PIPELINE_FILE : {PIPELINE_FILE}")

PIPELINE_ROOT : gs://ct-pipeline-v9/pipeline_root/chicago-taxi-pipe
WORKING_DIR   : gs://ct-pipeline-v9/pipeline_root/chicago-taxi-pipe/mlops-trigger-tutorial
PIPELINE_NAME : ct-training-v9
PIPELINE_FILE : ct-training-v9.yaml


## Build pipeline

In [31]:
from kfp import dsl
from kfp.dsl import importer
from kfp.dsl import OneOf
from google.cloud import aiplatform

from google_cloud_pipeline_components.preview.model_evaluation.model_evaluation_import_component import model_evaluation_import as ModelImportEvaluationOp
from google_cloud_pipeline_components.v1.custom_job import CustomTrainingJobOp
from google_cloud_pipeline_components.types import artifact_types
from google_cloud_pipeline_components.v1.model import ModelUploadOp
from google_cloud_pipeline_components.v1.batch_predict_job import ModelBatchPredictOp
from google_cloud_pipeline_components.v1.model_evaluation import ModelEvaluationRegressionOp
from google_cloud_pipeline_components.v1.vertex_notification_email import VertexNotificationEmailOp
from google_cloud_pipeline_components.v1.endpoint import ModelDeployOp
from google_cloud_pipeline_components.v1.endpoint import EndpointCreateOp

# from src.upload_version_component import create_next_model_version

# define the train-deploy pipeline
@dsl.pipeline(name=PIPELINE_NAME)
def custom_model_training_evaluation_pipeline(
    project: str,
    location: str,
    version: str,
    training_job_display_name: str,
    worker_pool_specs: list,
    base_output_dir: str,
    artifacts_dir: str,
    prediction_container_uri: str,
    model_display_name: str,
    batch_prediction_job_display_name: str,
    target_field_name: str,
    test_data_gcs_uri: list,
    ground_truth_gcs_source: list,
    batch_predictions_gcs_prefix: str,
    eval_display_name: str,
    batch_predictions_input_format: str="csv",
    batch_predictions_output_format: str="jsonl",
    ground_truth_format: str="csv",
    parent_model_resource_name: str=None,
    parent_model_artifact_uri: str=None,
    endpoint_resource_name: str=None,
    endpoint_resource_uri: str=None,
    existing_model: bool=False,
    existing_endpoint: bool=False,
    model_version_alias: str="new-version",
):
    # Notification task
    notify_task = VertexNotificationEmailOp(
        recipients=EMAIL_RECIPIENTS
    )
    
    with dsl.ExitHandler(notify_task, name='MLOps Continuous Training Pipeline'):
        # Train the model
        custom_job_task = (
            CustomTrainingJobOp(
                project=project,
                display_name=training_job_display_name,
                worker_pool_specs=worker_pool_specs,
                base_output_directory=base_output_dir,
                location=location
            )
        ).set_display_name("custom-train")

        # Import the unmanaged model
        import_unmanaged_model_task = (
            importer(
                artifact_uri=artifacts_dir,
                artifact_class=artifact_types.UnmanagedContainerModel,
                metadata={
                    "containerSpec": {
                        "imageUri": prediction_container_uri,
                    },
                },
            )
            .set_display_name("import-trained-model")
            .after(custom_job_task)
        )

        with dsl.If(existing_model == True):
            # Import the parent model to upload as a version
            import_registry_model_task = (
                importer(
                    artifact_uri=parent_model_artifact_uri,
                    artifact_class=artifact_types.VertexModel,
                    metadata={
                        "resourceName": parent_model_resource_name
                    },
                )
                .set_display_name("import-existing-model")
                .after(import_unmanaged_model_task)
            )
            
            # Upload the model as a version
            model_version_upload_op = ModelUploadOp(
                project=project,
                location=location,
                display_name=model_display_name,
                parent_model=import_registry_model_task.outputs["artifact"],
                unmanaged_container_model=import_unmanaged_model_task.outputs["artifact"],
            )

        with dsl.Else():
            # Upload the model
            model_upload_op = (
                ModelUploadOp(
                    project=project,
                    location=location,
                    display_name=model_display_name,
                    unmanaged_container_model=import_unmanaged_model_task.outputs["artifact"],
                )
                .set_display_name("upload-new-model")
            )
        
        # Get the model (or model version)
        model_resource = OneOf(
            model_version_upload_op.outputs["model"],
            model_upload_op.outputs["model"],
        )

        # Batch prediction
        batch_predict_task = (
            ModelBatchPredictOp(
                project=project,
                job_display_name=batch_prediction_job_display_name,
                model=model_resource,
                location=location,
                instances_format=batch_predictions_input_format,
                predictions_format=batch_predictions_output_format,
                gcs_source_uris=test_data_gcs_uri,
                gcs_destination_output_uri_prefix=batch_predictions_gcs_prefix,
                machine_type='n1-standard-4'
            )
            .set_display_name("batch-prediction")
        )
        
        # Evaluation task
        evaluation_task = (
            ModelEvaluationRegressionOp(
                project=project,
                target_field_name=target_field_name,
                location=location,
                model=model_resource,
                predictions_format=batch_predictions_output_format,
                predictions_gcs_source=batch_predict_task.outputs["gcs_output_directory"],
                ground_truth_format=ground_truth_format,
                ground_truth_gcs_source=ground_truth_gcs_source
            )
            .set_display_name("model-eval-job")
        )
        
        # Import the evaluation result to Vertex AI.
        import_evaluation_task = (
            ModelImportEvaluationOp(
                regression_metrics=evaluation_task.outputs['evaluation_metrics'],
                model=model_resource,
                dataset_type=batch_predictions_input_format,
                dataset_path="", # test_data_gcs_uri
                dataset_paths=ground_truth_gcs_source,
                display_name=eval_display_name,
            )
            .set_display_name("import-model-eval")
        )
        
        
        with dsl.If(existing_endpoint == True):
            # Import the existing endpoint as a pipeline artifact
            endpoint = importer(
                artifact_uri=endpoint_resource_uri,
                artifact_class=artifact_types.VertexEndpoint,
                metadata={"resourceName": endpoint_resource_name},
            )
            
            # deploy model to endpoint
            _ = ModelDeployOp(
                model=model_resource,
                endpoint=endpoint.output,  # .outputs["endpoint"],
                dedicated_resources_min_replica_count=1,
                dedicated_resources_max_replica_count=1,
                dedicated_resources_machine_type="n1-standard-4",
                traffic_split={"0": 100},
            )
            
        with dsl.Else():
            # Create an endpoint
            endpoint_create_op = EndpointCreateOp(
                project=project,
                display_name="taxifare-endpoint",
            )
            # deploy model to endpoint
            _ = ModelDeployOp(
                model=model_resource,
                endpoint=endpoint_create_op.outputs["endpoint"],
                dedicated_resources_min_replica_count=1,
                dedicated_resources_max_replica_count=1,
                dedicated_resources_machine_type="n1-standard-4",
                traffic_split={"0": 100},
            )
    return

## Compile pipeline

In [32]:
from kfp import dsl
from kfp import compiler
from kfp.registry import RegistryClient

LOCAL_PIPELINE_YAML = PIPELINE_FILE

compiler.Compiler().compile(
    pipeline_func=custom_model_training_evaluation_pipeline,
    package_path=LOCAL_PIPELINE_YAML,
    # package_path="{}.yaml".format(PIPELINE_NAME),
)

print(f"LOCAL_PIPELINE_YAML: {LOCAL_PIPELINE_YAML}")

LOCAL_PIPELINE_YAML: ct-training-v9.yaml


In [33]:
! gsutil -q cp $LOCAL_PIPELINE_YAML $BASE_OUTPUT_DIR/$LOCAL_PIPELINE_YAML

! gsutil ls $BASE_OUTPUT_DIR

gs://ct-pipeline-v9/training-v9/run-20250114-041128/ct-training-v9.yaml
gs://ct-pipeline-v9/training-v9/run-20250114-041128/artifacts/
gs://ct-pipeline-v9/training-v9/run-20250114-041128/data/


## Upload as pipeline template

In [34]:
REPO_NAME = "mlops"

# Create a repo in the artifact registry
# ! gcloud artifacts repositories create $REPO_NAME --location=$REGION --repository-format=KFP

host = f"https://{REGION}-kfp.pkg.dev/{PROJECT_ID}/{REPO_NAME}"
client = RegistryClient(host=host)

TEMPLATE_NAME, VERSION_NAME = client.upload_pipeline(
    file_name=PIPELINE_FILE,
    tags=[VERSION, "latest"],
    extra_headers={"description":"This is an example pipeline template."}
)

TEMPLATE_URI = f"https://{REGION}-kfp.pkg.dev/{PROJECT_ID}/{REPO_NAME}/{TEMPLATE_NAME}/latest"

print(f"TEMPLATE_NAME : {TEMPLATE_NAME}")
print(f"TEMPLATE_URI  : {TEMPLATE_URI}")

TEMPLATE_NAME : ct-training-v9
TEMPLATE_URI  : https://us-central1-kfp.pkg.dev/hybrid-vertex/mlops/ct-training-v9/latest


# Manually Run Pipeline

### Set parameters

In [35]:
DATASET_NAME = "mlops"
TABLE_NAME = "chicago"

worker_pool_specs = [
    {
        "machine_spec": {"machine_type": "e2-highmem-16"},
        "replica_count": 1,
        "python_package_spec": {
            "executor_image_uri": "us-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest",
            "package_uris": [f"{BUCKET_URI}/{DISTRIBUTION_NAME}"],
            "python_module": "trainer.task",
            "args":[
                "--project-id", PROJECT_ID,
                "--data-dir", f"/gcs/{BUCKET_NAME}/{EXPERIMENT_NAME}/{RUN_NAME}/data",
                "--training-dir", f"/gcs/{BUCKET_NAME}/{EXPERIMENT_NAME}/{RUN_NAME}/artifacts",
                "--bq-source", f"{PROJECT_ID}.{DATASET_NAME}.{TABLE_NAME}",
                "--experiment-run", RUN_NAME,
                "--experiment-name", EXPERIMENT_NAME,
            ]
        },
    }
]
pprint(worker_pool_specs)

[{'machine_spec': {'machine_type': 'e2-highmem-16'},
  'python_package_spec': {'args': ['--project-id',
                                   'hybrid-vertex',
                                   '--data-dir',
                                   '/gcs/ct-pipeline-v9/training-v9/run-20250114-041128/data',
                                   '--training-dir',
                                   '/gcs/ct-pipeline-v9/training-v9/run-20250114-041128/artifacts',
                                   '--bq-source',
                                   'hybrid-vertex.mlops.chicago',
                                   '--experiment-run',
                                   'run-20250114-041128',
                                   '--experiment-name',
                                   'training-v9'],
                          'executor_image_uri': 'us-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest',
                          'package_uris': ['gs://ct-pipeline-v9/trainer-0.1.tar.gz'],
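
The training arguments above use `/gcs/...` paths while the rest of the notebook uses `gs://...` URIs. Vertex AI custom training mounts Cloud Storage buckets through Cloud Storage FUSE under `/gcs/`, so the two forms refer to the same objects. A small sketch of the conversion follows; the helper names `gcs_uri_to_fuse_path` and `fuse_path_to_gcs_uri` are hypothetical.

```python
GCS_PREFIX = "gs://"
FUSE_PREFIX = "/gcs/"


def gcs_uri_to_fuse_path(uri: str) -> str:
    """Convert gs://bucket/path to the /gcs/bucket/path mount path."""
    if not uri.startswith(GCS_PREFIX):
        raise ValueError(f"Not a Cloud Storage URI: {uri!r}")
    return FUSE_PREFIX + uri[len(GCS_PREFIX):]


def fuse_path_to_gcs_uri(path: str) -> str:
    """Convert /gcs/bucket/path back to gs://bucket/path."""
    if not path.startswith(FUSE_PREFIX):
        raise ValueError(f"Not a /gcs/ mount path: {path!r}")
    return GCS_PREFIX + path[len(FUSE_PREFIX):]


data_dir = "gs://ct-pipeline-v9/training-v9/run-20250114-041128/data"
print(gcs_uri_to_fuse_path(data_dir))
```

With a helper like this, the `--data-dir` and `--training-dir` arguments could be derived from `DATA_DIR` and `ARTIFACTS_DIR` rather than rebuilt from their components.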
              

In [36]:
# existing model
EXISTING_MODEL_BOOL=True # True | False
PARENT_MODEL_ID="3204070341827624960"
PARENT_MODEL_URI="gs://ct-pipeline-v6/training-v6/run-20241105-160557/artifacts"
MODEL_VERSION_ALIAS="v4"

# endpoint
EXISTING_ENDPOINT_BOOL=True
ENDPOINT_ID="3326806625813004288"
ENDPOINT_RESOURCE_NAME=f"projects/934903580331/locations/{REGION}/endpoints/{ENDPOINT_ID}"

In [37]:
parameters = {
    "project": PROJECT_ID,
    "location": REGION,
    "version": VERSION,
    "training_job_display_name": "taxifare-prediction-training-job",
    "worker_pool_specs": worker_pool_specs,
    "base_output_dir": BASE_OUTPUT_DIR,
    "artifacts_dir": ARTIFACTS_DIR,
    "prediction_container_uri": "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest",
    "model_display_name": "taxifare-prediction-model",
    "batch_prediction_job_display_name": "taxifare-prediction-batch-job",
    "target_field_name": "fare",
    "test_data_gcs_uri": [f"{DATA_DIR}/test_no_target.csv"],
    "ground_truth_gcs_source": [f"{DATA_DIR}/test.csv"],
    "batch_predictions_gcs_prefix": f"{BUCKET_URI}/batch_predict_output",
    "eval_display_name": f'eval-{invoke_time}',
    "existing_model": EXISTING_MODEL_BOOL,
    "existing_endpoint":EXISTING_ENDPOINT_BOOL,
}

if EXISTING_MODEL_BOOL:
    parameters["parent_model_resource_name"] = f"projects/{PROJECT_ID}/locations/{REGION}/models/{PARENT_MODEL_ID}"
    parameters["parent_model_artifact_uri"] = PARENT_MODEL_URI
    parameters["model_version_alias"] = MODEL_VERSION_ALIAS
    
    
if EXISTING_ENDPOINT_BOOL:
    parameters["endpoint_resource_uri"] = f"https://{REGION}-aiplatform.googleapis.com/v1/{ENDPOINT_RESOURCE_NAME}"
    parameters["endpoint_resource_name"] = ENDPOINT_RESOURCE_NAME


pprint(parameters)

{'artifacts_dir': 'gs://ct-pipeline-v9/training-v9/run-20250114-041128/artifacts',
 'base_output_dir': 'gs://ct-pipeline-v9/training-v9/run-20250114-041128',
 'batch_prediction_job_display_name': 'taxifare-prediction-batch-job',
 'batch_predictions_gcs_prefix': 'gs://ct-pipeline-v9/batch_predict_output',
 'endpoint_resource_name': 'projects/934903580331/locations/us-central1/endpoints/3326806625813004288',
 'endpoint_resource_uri': 'https://us-central1-aiplatform.googleapis.com/v1/projects/934903580331/locations/us-central1/endpoints/3326806625813004288',
 'eval_display_name': 'eval-20250114-041128',
 'existing_endpoint': True,
 'existing_model': True,
 'ground_truth_gcs_source': ['gs://ct-pipeline-v9/training-v9/run-20250114-041128/data/test.csv'],
 'location': 'us-central1',
 'model_display_name': 'taxifare-prediction-model',
 'model_version_alias': 'v4',
 'parent_model_artifact_uri': 'gs://ct-pipeline-v6/training-v6/run-20241105-160557/artifacts',
 'parent_model_resource_name': 'pro

## Create and Run pipeline job

In [38]:
# Create a pipeline job
job = aiplatform.PipelineJob(
    display_name=f"{PIPELINE_NAME}-manual",
    template_path=TEMPLATE_URI,
    parameter_values=parameters,
    pipeline_root=PIPELINE_ROOT,
    enable_caching=False,
    failure_policy='fast',
)
# Run the pipeline job
# job.run(sync=False)

job.submit(
    experiment=EXPERIMENT_NAME,
    # service_account="934903580331-compute@developer.gserviceaccount.com"
)

Creating PipelineJob
PipelineJob created. Resource name: projects/934903580331/locations/us-central1/pipelineJobs/ct-training-v9-20250114041203
To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/934903580331/locations/us-central1/pipelineJobs/ct-training-v9-20250114041203')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/ct-training-v9-20250114041203?project=934903580331
Associating projects/934903580331/locations/us-central1/pipelineJobs/ct-training-v9-20250114041203 to Experiment: training-v9


# Inspect deployed models 

### Initialize Model resource

* `model_name`: Required. Fully qualified model resource name or model ID.
* `version`: Optional. Version ID or version alias.
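
The resource names printed in this notebook follow the pattern `projects/<project>/locations/<location>/models/<model_id>`, optionally followed by `@<version>` (as in the evaluation resource name shown further down). The sketch below splits such a name into its parts; it is an illustrative helper with hypothetical names (`ModelRef`, `parse_model_name`), not part of the Vertex AI SDK.

```python
import re
from typing import NamedTuple, Optional

_MODEL_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/models/(?P<model_id>[^/@]+)"
    r"(?:@(?P<version>[^/]+))?$"
)


class ModelRef(NamedTuple):
    project: str
    location: str
    model_id: str
    version: Optional[str]  # None when the name carries no @version suffix


def parse_model_name(name: str) -> ModelRef:
    """Split a fully qualified model resource name into its components."""
    match = _MODEL_RE.match(name)
    if match is None:
        raise ValueError(f"Not a fully qualified model resource name: {name!r}")
    return ModelRef(**match.groupdict())


ref = parse_model_name(
    "projects/934903580331/locations/us-central1/models/3204070341827624960@1"
)
print(ref.model_id, ref.version)
```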

In [59]:
my_model = aiplatform.Model(
    model_name="projects/934903580331/locations/us-central1/models/3204070341827624960", #@2
    version="2",
)
my_model

<google.cloud.aiplatform.models.Model object at 0x7f2edc707880> 
resource name: projects/934903580331/locations/us-central1/models/3204070341827624960

In [60]:
my_model.to_dict()

{'name': 'projects/934903580331/locations/us-central1/models/3204070341827624960',
 'displayName': 'taxifare-prediction-model',
 'predictSchemata': {},
 'metadata': None,
 'containerSpec': {'imageUri': 'us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest'},
 'supportedDeploymentResourcesTypes': ['DEDICATED_RESOURCES'],
 'supportedInputStorageFormats': ['jsonl',
  'bigquery',
  'csv',
  'tf-record',
  'tf-record-gzip',
  'file-list'],
 'supportedOutputStorageFormats': ['jsonl', 'bigquery'],
 'createTime': '2024-11-05T16:09:47.035941Z',
 'updateTime': '2024-11-07T16:59:21.266241Z',
 'etag': 'AMEw9yObdsIU2rg2_jZeADdcn4DPEA2pydpeB3j4glon7em9gRCVpacPOa1pxId6nog=',
 'labels': {'vertex-ai-pipelines-run-billing-id': '9056019495159595008'},
 'supportedExportFormats': [{'id': 'custom-trained',
   'exportableContents': ['ARTIFACT']}],
 'artifactUri': 'gs://ct-pipeline-v6/training-v6/run-20241105-160557/artifacts',
 'versionId': '1',
 'versionAliases': ['default'],
 'versionCreateTime': 

In [61]:
my_evaluations = my_model.list_model_evaluations()
my_evaluations

[<google.cloud.aiplatform.model_evaluation.model_evaluation.ModelEvaluation object at 0x7f2edc705fc0> 
 resource name: projects/934903580331/locations/us-central1/models/3204070341827624960@1/evaluations/3761886780687585792]

In [71]:
# my_model.

'projects/934903580331/locations/us-central1/models/7404802894257455104'

In [None]:
ENDPOINT_ID = "3326806625813004288"

### Create model evaluation job (SDK)

> See [code snippets](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_evaluation/custom_tabular_regression_model_evaluation.ipynb) in the Vertex AI samples repository.

In [None]:
%%writefile instance_schema.yaml
title: TabularRegression
description: 'Regression Instances.'

type: object
properties:
  dense_input:
    type: array
    items:
      type: float
      minimum: 0.0
      maximum: 1.0
    description: 'Input values to model'

In [None]:
%%writefile prediction_schema.yaml
title: TabularRegression
description: 'Regression results.'

type: array

In [None]:
!gsutil cp instance_schema.yaml {BUCKET_URI}/instance_schema.yaml
!gsutil cp prediction_schema.yaml {BUCKET_URI}/prediction_schema.yaml

In [None]:
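# Upload to the Model Registry.
# NOTE: MODEL_DIR, DEPLOY_IMAGE, and metadata are not defined above; set them before running.
# `parameters` above holds the pipeline parameters, not explanation parameters.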
model = aiplatform.Model.upload(
    display_name="boston_new_model",
    artifact_uri=MODEL_DIR,
    serving_container_image_uri=DEPLOY_IMAGE,
    instance_schema_uri=f"{BUCKET_URI}/instance_schema.yaml",
    prediction_schema_uri=f"{BUCKET_URI}/prediction_schema.yaml",
    explanation_parameters=parameters,
    explanation_metadata=metadata,
    sync=False,
)

In [None]:
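import json

import tensorflow as tf

# NOTE: serving_input, x_test, and y_test are not defined above; set them before running.
gcs_input_uri = BUCKET_URI + "/" + "test_file_with_ground_truth.jsonl"
with tf.io.gfile.GFile(gcs_input_uri, "w") as f:
    for i in range(10):
        data = {serving_input: x_test[i].tolist(), "MEDV": y_test[i]}
        f.write(json.dumps(data) + "\n")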

In [None]:
job = model.evaluate(
    prediction_type="regression",
    target_field_name="MEDV",
    gcs_source_uris=[BUCKET_URI + "/" + "test_file_with_ground_truth.jsonl"],
    generate_feature_attributions=True,
)

print("Waiting for the model evaluation to complete")
job.wait()

### Associate the evaluation with the model (manually)

In [72]:
from google.cloud.aiplatform import gapic

# metrics = {"logLoss": 1.4, "auPrc": 0.85}
# print(metrics)

# model_eval = gapic.ModelEvaluation(
#     display_name="eval",
#     metrics_schema_uri="gs://google-cloud-aiplatform/schema/modelevaluation/classification_metrics_1.0.0.yaml",
#     metrics=metrics,
# )


metrics = {
    "rootMeanSquaredError": 2.019573,
    "rSquared": 0.9765769,
    "meanAbsoluteError": 0.7664642,
    "meanAbsolutePercentageError": 5.1649256,
    "rootMeanSquaredLogError": 0.093754485,
}
print(metrics)

model_eval = gapic.ModelEvaluation(
    display_name="eval-sdk",
    metrics_schema_uri="gs://google-cloud-aiplatform/schema/modelevaluation/regression_metrics_1.0.0.yaml",
    metrics=metrics,
)

model_eval

{'rootMeanSquaredError': 2.019573, 'rSquared': 0.9765769, 'meanAbsoluteError': 0.7664642, 'meanAbsolutePercentageError': 5.1649256, 'rootMeanSquaredLogError': 0.093754485}


display_name: "eval-sdk"
metrics_schema_uri: "gs://google-cloud-aiplatform/schema/modelevaluation/regression_metrics_1.0.0.yaml"
metrics {
  struct_value {
    fields {
      key: "rootMeanSquaredLogError"
      value {
        number_value: 0.093754485
      }
    }
    fields {
      key: "rootMeanSquaredError"
      value {
        number_value: 2.019573
      }
    }
    fields {
      key: "rSquared"
      value {
        number_value: 0.9765769
      }
    }
    fields {
      key: "meanAbsolutePercentageError"
      value {
        number_value: 5.1649256
      }
    }
    fields {
      key: "meanAbsoluteError"
      value {
        number_value: 0.7664642
      }
    }
  }
}
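
The metric values above are entered by hand. For reference, the sketch below computes the same five keys from ground-truth and predicted values using their standard definitions. It assumes that `meanAbsolutePercentageError` is expressed as a percentage and that the log error uses `log(1 + x)`; whether the Vertex AI evaluation job uses exactly these conventions is not confirmed here. The function name `regression_metrics` is hypothetical.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute common regression metrics (assumes nonzero, non-constant targets)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    # Squared log error; requires values greater than -1.
    sq_log_err = sum(
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    )

    return {
        "rootMeanSquaredError": math.sqrt(ss_res / n),
        "rSquared": 1.0 - ss_res / ss_tot,
        "meanAbsoluteError": sum(abs(e) for e in errors) / n,
        # Assumption: reported as a percentage.
        "meanAbsolutePercentageError": 100.0
        * sum(abs(e) / abs(t) for e, t in zip(errors, y_true))
        / n,
        "rootMeanSquaredLogError": math.sqrt(sq_log_err / n),
    }


example = regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
print(example)
```

A dictionary produced this way has the same keys as the hand-written `metrics` above, so it could be passed to `gapic.ModelEvaluation` in the same manner.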

### Upload the evaluation metrics to the Model Registry

In [73]:
API_ENDPOINT = f"{REGION}-aiplatform.googleapis.com"
client = gapic.ModelServiceClient(client_options={"api_endpoint": API_ENDPOINT})
client

<google.cloud.aiplatform_v1.services.model_service.client.ModelServiceClient at 0x7fce75ddad10>

In [74]:
client.import_model_evaluation(parent=my_model.resource_name, model_evaluation=model_eval)

name: "projects/934903580331/locations/us-central1/models/7404802894257455104/evaluations/3687972264888737838"
display_name: "eval-sdk"
metrics_schema_uri: "gs://google-cloud-aiplatform/schema/modelevaluation/regression_metrics_1.0.0.yaml"
metrics {
  struct_value {
    fields {
      key: "rootMeanSquaredLogError"
      value {
        number_value: 0.093754485
      }
    }
    fields {
      key: "rootMeanSquaredError"
      value {
        number_value: 2.019573
      }
    }
    fields {
      key: "rSquared"
      value {
        number_value: 0.9765769
      }
    }
    fields {
      key: "meanAbsolutePercentageError"
      value {
        number_value: 5.1649256
      }
    }
    fields {
      key: "meanAbsoluteError"
      value {
        number_value: 0.7664642
      }
    }
  }
}

In [75]:
my_evaluations = my_model.list_model_evaluations()
my_evaluations

[]

In [67]:
model_registry = aiplatform.models.ModelRegistry(model="projects/934903580331/locations/us-central1/models/7404802894257455104")
model_version_info = model_registry.get_version_info(version="1")
model_version_info

Getting version 1 info for projects/934903580331/locations/us-central1/models/7404802894257455104


VersionInfo(version_id='1', version_create_time=DatetimeWithNanoseconds(2024, 11, 5, 3, 53, 2, 910944, tzinfo=datetime.timezone.utc), version_update_time=DatetimeWithNanoseconds(2024, 11, 5, 3, 53, 4, 865125, tzinfo=datetime.timezone.utc), model_display_name='taxifare-prediction-model', model_resource_name='projects/934903580331/locations/us-central1/models/7404802894257455104', version_aliases=['default'], version_description='')