In [None]:
# @title Copyright & License (click to expand)
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI Model Monitoring for XGBoost models

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_xgboost.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_xgboost.ipynb">
        <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td> 
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_xgboost.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>


## Overview

This tutorial demonstrates how to use `Vertex AI Model Monitoring` for XGBoost models.

Learn more about [Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring).

### Objective

In this notebook, you learn to use the `Vertex AI Model Monitoring` service to detect feature skew and drift in the input predict requests for XGBoost models.

This tutorial uses the following Google Cloud ML services:

- `Vertex AI Model Monitoring`
- `Vertex AI Prediction`
- `Vertex AI Model` resource
- `Vertex AI Endpoint` resource

The steps performed include:

- Download a pre-trained XGBoost model.
- Upload the pre-trained model as a `Model` resource.
- Deploy the `Model` resource to the `Endpoint` resource.
- Configure the `Endpoint` resource for model monitoring:
  - drift detection only -- no access to training data.
  - predefine the input schema to map feature alias names to the unnamed array input to the model.
- Generate synthetic prediction requests for drift.


Learn more about [Introduction to Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview).

### Model

The model used for this tutorial is a pre-trained XGBoost model that was trained on the [Iris dataset](https://www.tensorflow.org/datasets/catalog/iris) from [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/overview). The trained model predicts the species of an Iris flower from three possible species: setosa, virginica, or versicolor.

### Costs 

This tutorial uses billable components of Google Cloud:

* Vertex AI
* BigQuery
* Cloud Storage

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

## Installation

Install the packages required for executing this notebook.

In [None]:
# Install required packages.
! pip3 install --quiet --upgrade google-cloud-aiplatform \
                                 google-cloud-bigquery

### Colab only: Uncomment the following cell to restart the kernel

In [None]:
# Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

## Before you begin

### Set your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

# Set the project id
! gcloud config set project {PROJECT_ID}

#### Region

You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "us-central1"

#### User Email

Set your user email address to receive monitoring alerts.

In [None]:
import os

USER_EMAIL = "[your-email-addr]"  # @param {type:"string"}

if os.getenv("IS_TESTING"):
    USER_EMAIL = "noreply@google.com"

### Authenticate your Google Cloud account

Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below.

**1. Vertex AI Workbench**
* Do nothing as you are already authenticated.

**2. Local JupyterLab instance, uncomment and run:**

In [None]:
# ! gcloud auth login

**3. Colab, uncomment and run:**

In [None]:
# from google.colab import auth
# auth.authenticate_user()

**4. Service account or other**
* See how to grant Cloud Storage permissions to your service account at https://cloud.google.com/storage/docs/gsutil/commands/iam#ch-examples.

### Notes about service account and permission

**By default, no configuration is required.** If you run into any permission-related issues, make sure the service accounts below have the required roles:

|Service account email|Description|Roles|
|---|---|---|
|PROJECT_NUMBER-compute@developer.gserviceaccount.com|Compute Engine default service account|Dataflow Admin, Dataflow Worker, Storage Admin, BigQuery Admin, Vertex AI User|
|service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com|AI Platform Service Agent|Vertex AI Service Agent|


1. Go to https://console.cloud.google.com/iam-admin/iam.
2. Check the "Include Google-provided role grants" checkbox.
3. Find the above emails.
4. Grant the corresponding roles.

### Using data source from a different project
- For the BQ data source, grant both service accounts the "BigQuery Data Viewer" role.
- For the CSV data source, grant both service accounts the "Storage Object Viewer" role.

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [None]:
BUCKET_URI = f"gs://your-bucket-name-{PROJECT_ID}-unique"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI

### Import libraries

In [None]:
import os
import random

from google.cloud import aiplatform, bigquery
from google.cloud.aiplatform import model_monitoring

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project and region.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION)

### Create BigQuery client

In this tutorial, you use data from the same public BigQuery table that was used to train the pre-trained model. You create a client interface, which you subsequently use to access the data.

In [None]:
bqclient = bigquery.Client(project=PROJECT_ID)

#### Set pre-built containers

Set the pre-built Docker container image for prediction.

For the latest list, see [Pre-built containers for prediction](https://cloud.google.com/ai-platform-unified/docs/predictions/pre-built-containers).

In [None]:
DEPLOY_VERSION = "xgboost-cpu.1-1"

LOCATION = REGION.split("-")[0]

DEPLOY_IMAGE = f"{LOCATION}-docker.pkg.dev/vertex-ai/prediction/{DEPLOY_VERSION}:latest"

print("Deployment:", DEPLOY_IMAGE)
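
As a rough illustration of how the image URI above is assembled, the sketch below applies the same string logic to a few regions. The helper name `prediction_image_uri` is hypothetical, and the sketch only builds strings: it does not check that an image is actually published for a given multi-region prefix.

```python
def prediction_image_uri(region: str, version: str) -> str:
    """Build a pre-built prediction container URI with the same logic as the cell above."""
    # The multi-region prefix is the part of the region name before the first hyphen.
    location = region.split("-")[0]
    return f"{location}-docker.pkg.dev/vertex-ai/prediction/{version}:latest"


# Show which registry prefix each region resolves to.
for region in ("us-central1", "europe-west4", "asia-east1"):
    print(region, "->", prediction_image_uri(region, "xgboost-cpu.1-1"))
```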

#### Set machine types

Next, set the machine types to use for training and prediction.

- Set the variable `DEPLOY_COMPUTE` to configure your compute resources for prediction.
 - `machine type`
     - `n1-standard`: 3.75GB of memory per vCPU
     - `n1-highmem`: 6.5GB of memory per vCPU
     - `n1-highcpu`: 0.9 GB of memory per vCPU
 - `vCPUs`: number of \[2, 4, 8, 16, 32, 64, 96 \]

*Note: You may also use n2 and e2 machine types for training and deployment, but they do not support GPUs*.

In [None]:
MACHINE_TYPE = "n1-standard"

VCPU = "4"
TRAIN_COMPUTE = MACHINE_TYPE + "-" + VCPU
print("Train machine type", TRAIN_COMPUTE)

MACHINE_TYPE = "n1-standard"

VCPU = "4"
DEPLOY_COMPUTE = MACHINE_TYPE + "-" + VCPU
print("Deploy machine type", DEPLOY_COMPUTE)

## Introduction to Vertex AI Model Monitoring

Vertex AI Model Monitoring is supported for AutoML tabular models and custom tabular models. You can monitor for skew and drift detection of the features in the inbound prediction requests or skew and drift detection of the feature attributions (Explainable AI) in the outbound prediction response -- that is, the distribution of the attributions on how they contributed to the output (predictions).

The following are the basic steps to enable model monitoring:

1. Deploy a `Vertex AI` AutoML or custom tabular model to a `Vertex AI Endpoint`.
2. Configure a model monitoring specification.
3. Upload the model monitoring specification to the `Vertex AI Endpoint`.
4. Upload or automatically generate the `input schema` for parsing.
5. For feature skew detection, upload the training data for automatic generation of the feature distribution.
6. For feature attributions, upload corresponding `Vertex AI Explainability` specification.

Once configured, you can enable/disable monitoring, change alerts and update the model monitoring configuration. 

When model monitoring is enabled, the sampled incoming prediction requests are logged into a BigQuery table. The input feature values contained in the logged requests are then analyzed for skew or drift at a specified interval. You set both the sampling rate, which determines the subset of production inputs that is monitored, and the monitoring interval.

The model monitoring service needs to know how to parse the feature values, which is referred to as the input schema. For AutoML tabular models, the input schema is automatically generated. For custom tabular models, the service attempts to automatically derive the input schema from the first 1000 prediction requests. Alternatively, one can upload the input schema.

For skew detection, the monitoring service requires a baseline for the statistical distribution of values in the training data. For AutoML tabular models this is automatically derived. For custom tabular models, you upload the training data to the service, and have the service automatically derive the distribution.

Feature attribution skew and drift detection requires that your deployed model is enabled for `Vertex AI Explainability`. For custom tabular models, you enable it yourself. For AutoML models, `Vertex AI Explainability` is automatically enabled.

Learn more about [Introduction to Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview).

### Upload the model artifacts as a `Vertex AI Model` resource

First, you upload the pre-trained XGBoost tabular model artifacts as a `Vertex AI Model` resource using the `upload()` method, with the following parameters:

- `display_name`: The human readable name for the `Model` resource.
- `artifact_uri`: The Cloud Storage location of the model artifacts.
- `serving_container_image_uri`: The serving container image to use when the model is deployed to a `Vertex AI Endpoint` resource.
- `sync`: Whether to wait for the process to complete, or return immediately (async).

In [None]:
MODEL_ARTIFACT_URI = (
    "gs://cloud-samples-data/vertex-ai/model-deployment/models/xgboost_iris"
)

model = aiplatform.Model.upload(
    display_name="xgboost_iris",
    artifact_uri=MODEL_ARTIFACT_URI,
    serving_container_image_uri=DEPLOY_IMAGE,
    sync=True,
)

print(model)

### Deploy the `Vertex AI Model` resource to a `Vertex AI Endpoint` resource

Next, you deploy your `Vertex AI Model` resource to a `Vertex AI Endpoint` resource using the `deploy()` method, with the following parameters:

- `deployed_model_display_name`: The human readable name for the deployed model.
- `machine_type`: The machine type for each VM node instance.
- `min_replica_count`: The minimum number of nodes to provision for auto-scaling.
- `max_replica_count`: The maximum number of nodes to provision for auto-scaling.

In [None]:
MIN_NODES = 1
MAX_NODES = 1


endpoint = model.deploy(
    deployed_model_display_name="xgboost_iris",
    machine_type=DEPLOY_COMPUTE,
    min_replica_count=MIN_NODES,
    max_replica_count=MAX_NODES,
)

## Configure a monitoring job

Configuring the monitoring job consists of the following specifications:

- `alert_config`: The email address(es) to send monitoring alerts to.
- `schedule_config`: The time window to analyze predictions.
- `logging_sampling_strategy`: The rate for sampling prediction requests. 
- `drift_config`: The features and drift thresholds to monitor.
- `skew_config`: The features and skew thresholds to monitor.

### Configure the alerting specification

First, you configure the `alerting_config` specification with the following settings:

- `user_emails`: A list of one or more email addresses to send alerts to.
- `enable_logging`: Streams detected anomalies to Cloud Logging. Default is False.

In [None]:
# Create alerting configuration.
alerting_config = model_monitoring.EmailAlertConfig(
    user_emails=[USER_EMAIL], enable_logging=True
)

### Configure the monitoring interval specification

Next, you configure the `schedule_config` specification with the following settings:

- `monitor_interval`:  Sets the model monitoring job scheduling interval in hours. Minimum time interval is 1 hour. For example, at a one hour interval, the monitoring job will run once an hour.

In [None]:
# Monitoring Interval
MONITOR_INTERVAL = 1  # @param {type:"number"}

# Create schedule configuration
schedule_config = model_monitoring.ScheduleConfig(monitor_interval=MONITOR_INTERVAL)

### Configure the sampling specification

Next, you configure the `logging_sampling_strategy` specification with the following settings:

- `sample_rate`: The rate, as a fraction between 0 and 1, at which prediction requests are randomly sampled for monitoring. Selected samples are logged to a BigQuery table.
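
To get a feel for what a given sample rate means in practice, the following sketch estimates how many of 1000 requests would be logged at a rate of 0.5. It assumes each request is sampled independently with probability `sample_rate`, which is an assumption made here for illustration rather than a documented detail of the service. The function names are hypothetical.

```python
import math
import random


def expected_logged(num_requests, sample_rate):
    """Return the mean and standard deviation of the number of logged requests."""
    mean = num_requests * sample_rate
    std = math.sqrt(num_requests * sample_rate * (1.0 - sample_rate))
    return mean, std


def simulate_logged(num_requests, sample_rate, seed=0):
    """Simulate independent per-request sampling and count the logged requests."""
    rng = random.Random(seed)
    return sum(1 for _ in range(num_requests) if rng.random() < sample_rate)


mean, std = expected_logged(1000, 0.5)
print(f"expected logged requests: {mean:.0f} +/- {std:.1f}")
print("one simulated run:", simulate_logged(1000, 0.5))
```

Under this assumption, the number of logged rows will rarely be exactly half of the requests sent, so any check on the row count should allow for some spread.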

In [None]:
# Sampling rate (optional, default=.8)
SAMPLE_RATE = 0.5  # @param {type:"number"}

# Create sampling configuration
logging_sampling_strategy = model_monitoring.RandomSampleConfig(sample_rate=SAMPLE_RATE)

### Configure the drift detection specification

Next, you configure the `drift_config` specification with the following settings:

- `drift_thresholds`: A dictionary of key/value pairs where the keys are the input features to monitor for drift. Each value is the detection threshold for that feature. When not specified, the default drift threshold for a feature is 0.3 (30%).

*Note:* Enabling drift detection is optional.
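
The Vertex AI documentation describes Jensen-Shannon divergence as the distance score for numerical features. The sketch below shows how such a score could be compared against a threshold of 0.05, using uniform ranges like the ones this tutorial uses for `sepal_length`. The histogram binning (10 equal-width bins over a fixed range) and the helper names are assumptions for illustration only; the service's actual binning is not specified here.

```python
import math
import random


def histogram(values, lo, hi, bins):
    """Return a normalized equal-width histogram of values over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp the index so values at the edges fall into the first or last bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


DRIFT_THRESHOLD = 0.05
rng = random.Random(0)
baseline = [rng.uniform(0.5, 3.5) for _ in range(1000)]
same = [rng.uniform(0.5, 3.5) for _ in range(1000)]
drifted = [rng.uniform(0.5, 3.5) * 4.0 for _ in range(1000)]

lo, hi, bins = 0.5, 14.0, 10
base_hist = histogram(baseline, lo, hi, bins)
no_drift_score = js_divergence(base_hist, histogram(same, lo, hi, bins))
drift_score = js_divergence(base_hist, histogram(drifted, lo, hi, bins))

print("same distribution exceeds threshold:", no_drift_score > DRIFT_THRESHOLD)
print("4x sepal_length exceeds threshold:", drift_score > DRIFT_THRESHOLD)
```

With base-2 logarithms the score lies between 0 and 1, which makes a threshold such as 0.05 easy to interpret as a small fraction of the maximum possible distance.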

In [None]:
DRIFT_THRESHOLD_VALUE = 0.05

DRIFT_THRESHOLDS = {
    "sepal_length": DRIFT_THRESHOLD_VALUE,
    "petal_length": DRIFT_THRESHOLD_VALUE,
}

drift_config = model_monitoring.DriftDetectionConfig(drift_thresholds=DRIFT_THRESHOLDS)

### Assemble the objective specification

Finally, you assemble the objective specification `objective_config` with the following settings:

- `skew_detection_config`: (Optional) The specification for the skew detection configuration.
- `drift_detection_config`: (Optional) The specification for the drift detection configuration.
- `explanation_config`: (Optional) The specification for explanations when enabling monitoring for feature attributions.

*Note:* You don't configure skew detection, since the assumption is you don't have access to the training data.

In [None]:
objective_config = model_monitoring.ObjectiveConfig(
    skew_detection_config=None,
    drift_detection_config=drift_config,
    explanation_config=None,
)

### Create the input schema

The monitoring service needs to know the features and data types of the feature inputs to the model, which is referred to as the `input schema`. The `input schema` can either be:
 - Preloaded to the monitoring service.
 - Automatically generated by the monitoring service after receiving the first 1000 prediction instances.
 
In this tutorial, you preload the `input schema`.

#### Create the predefined input schema

The predefined `input schema` is specified as a YAML file. In this example, you generate the YAML specification according to the model's input layer. In this case, the input layer is an array of four floating point numeric values. In the schema, this is represented by:

- `type: array`: Indicates that the input is an array (list).
- `properties`: An ordered list of the inputs in the array
- `properties -> name`: The alias (e.g., sepal_length) for the corresponding value in the array.
- `properties -> type: number`: The value for the array element is floating point.
- `required`: The order of the values in the array specified by alias. 

The input schema then informs the model monitoring service how to map the unnamed input values to the corresponding feature alias names, which can then be specified in your model monitoring configuration.
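
Conceptually, the mapping from unnamed array positions to feature aliases can be pictured as follows. This is a minimal sketch of the idea, not the monitoring service's implementation; the helper `name_features` is hypothetical, and the alias order is the one used for the Iris features in this tutorial.

```python
# Alias order, matching the position of each value in the input array.
FEATURE_ORDER = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def name_features(instance, feature_order=FEATURE_ORDER):
    """Pair each positional value in an instance with its feature alias."""
    if len(instance) != len(feature_order):
        raise ValueError(
            f"expected {len(feature_order)} values, got {len(instance)}"
        )
    return dict(zip(feature_order, instance))


named = name_features([5.1, 3.5, 1.4, 0.2])
print(named)
```

Because the aliases are assigned purely by position, the order in the schema has to match the order of values in each prediction request.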

The predefined `input schema` must be loaded to a Cloud Storage location.

Learn more about [Custom instance schemas for parsing input](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview#custom-input-schemas).

In [None]:
yaml = """type: array
properties:
  sepal_length:
    type: number
  sepal_width:
    type: number
  petal_length:
    type: number
  petal_width:
    type: number
required:
  - sepal_length
  - sepal_width
  - petal_length
  - petal_width
"""

print(yaml)

with open("schema.yaml", "w") as f:
    f.write(yaml)

! gsutil cp schema.yaml {BUCKET_URI}/schema.yaml

### Create the monitoring job

You create a monitoring job, with your monitoring specifications, using the `aiplatform.ModelDeploymentMonitoringJob.create()` method, with the following parameters:

- `display_name`: The human readable name for the monitoring job.
- `project`: The project ID.
- `location`: The region.
- `endpoint`: The fully qualified resource name of the `Vertex AI Endpoint` to enable monitoring.
- `logging_sampling_strategy`: The specification for the sampling configuration.
- `schedule_config`: The specification for the scheduling configuration.
- `alert_config`: The specification for the alerting configuration.
- `objective_configs`: The specification for the objectives configuration.
- `analysis_instance_schema_uri`: The location of the YAML file containing the `input schema`.

In [None]:
monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="xgboost_iris",
    project=PROJECT_ID,
    location=REGION,
    endpoint=endpoint,
    logging_sampling_strategy=logging_sampling_strategy,
    schedule_config=schedule_config,
    alert_config=alerting_config,
    objective_configs=objective_config,
    analysis_instance_schema_uri=f"{BUCKET_URI}/schema.yaml",
)

print(monitoring_job)

#### Email notification of the monitoring job.

An email notification is sent to the email address in the alerting configuration, notifying that the model monitoring job is now enabled.

The contents will appear like:

<blockquote>
Hello Vertex AI Customer,

You are receiving this mail because you are using the Vertex AI Model Monitoring service.
This mail is to inform you that we received your request to set up drift or skew detection for the Prediction Endpoint listed below. Starting from now, incoming prediction requests will be sampled and logged for analysis.
Raw requests and responses will be collected from prediction service and saved in bq://[your-project-id].model_deployment_monitoring_[endpoint-id].serving_predict .
</blockquote>
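
The table name in the notification follows a fixed pattern based on the project ID and the endpoint ID. As a small sketch, the helper below (a hypothetical name) derives that table ID from an endpoint resource name. The project ID and resource name in the example are made-up placeholders.

```python
def logging_table_id(project_id: str, endpoint_resource_name: str) -> str:
    """Build the BigQuery table ID used for logged prediction requests."""
    # The endpoint ID is the last path segment of the resource name.
    endpoint_id = endpoint_resource_name.split("/")[-1]
    return f"{project_id}.model_deployment_monitoring_{endpoint_id}.serving_predict"


# Placeholder values for illustration only.
example = logging_table_id(
    "my-project", "projects/123456/locations/us-central1/endpoints/987654321"
)
print(example)
```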

#### Monitoring Job State

After you start the `Vertex AI Model Monitoring` job, there are three transition states the job may be in:

- `PENDING`: The job is configured for skew detection and the `skew distribution baseline` is being calculated. The monitoring service will initiate a batch job to generate the distribution baseline from the training data. 

- `OFFLINE`: The monitoring job is between monitoring intervals.

- `RUNNING`: The monitoring job on a per interval basis is analyzing the sampled data.

In [None]:
jobs = monitoring_job.list(filter="display_name=xgboost_iris")
job = jobs[0]
print(job.state)

Pause for the monitoring job to be enabled.

In [None]:
import time

time.sleep(180)

### Generate synthetic prediction requests for first baseline

Next, you create 1000 synthetic data items to use for prediction requests.

In [None]:
instances = []
for _ in range(1000):
    sepal_length = random.uniform(0.5, 3.5)
    sepal_width = random.uniform(0.2, 2.0)
    petal_length = random.uniform(0.5, 2.0)
    petal_width = random.uniform(0.2, 1.5)
    instances.append([sepal_length, sepal_width, petal_length, petal_width])

### Make the prediction requests

Next, you send the 1000 prediction requests to your `Vertex AI Endpoint` resource using the `predict()` method.

Note, the model outputs the class as a floating point value. For example, `0.0` is the label `0`.

In [None]:
for instance in instances:
    response = endpoint.predict(instances=[instance])

prediction = response[0]

# Print the first prediction from the last response received
print(prediction[0])

### Logging sampled requests

On the next monitoring interval, the sampled predictions are then copied over to the BigQuery logging table. Once the entries are in the BigQuery table, the monitoring service will analyze the sampled data.

Next, you wait for the first logged entries to appear in the BigQuery table used for logging prediction samples. Since you sent 1000 prediction requests, with 50% sampling, you should see around 500 entries.

*Note*: This may take up to the length of the monitoring interval (e.g., one hour).

In [None]:
import time

while True:

    ENDPOINT_ID = endpoint.resource_name.split("/")[-1]

    table = bigquery.TableReference.from_string(
        f"{PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}.serving_predict"
    )
    rows = bqclient.list_rows(table)
    print(rows.total_rows)
    if rows.total_rows > 0:
        break
    time.sleep(180)
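
The loop above runs until rows appear, and the row lookup may raise an error if the logging table has not been created yet. One possible variation, sketched below, bounds the number of polls and treats a failed lookup as "no rows yet". The helper `wait_for_rows` and its parameters are hypothetical, and catching every `Exception` is a simplification; a real notebook would likely catch only the specific not-found error.

```python
import time


def wait_for_rows(count_rows, min_rows, poll_seconds=180, max_polls=40, sleep=time.sleep):
    """Poll count_rows() until it returns at least min_rows or max_polls is reached."""
    for attempt in range(max_polls):
        try:
            total = count_rows()
        except Exception as err:  # for example, the table does not exist yet
            print(f"poll {attempt}: not ready ({err})")
            total = 0
        if total >= min_rows:
            return total
        sleep(poll_seconds)
    raise TimeoutError(f"fewer than {min_rows} rows after {max_polls} polls")


# Demo with a fake row counter and a no-op sleep, so it finishes immediately.
fake_counts = iter([0, 0, 480])
print(wait_for_rows(lambda: next(fake_counts), min_rows=1, sleep=lambda s: None))
```

In this notebook, `count_rows` could be something like `lambda: bqclient.list_rows(table).total_rows`, with `min_rows` chosen to match the number of entries you expect.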

### Generate synthetic prediction requests for drift detection

You modify the synthetic data so that the current distribution of the prediction requests differs from the previous baseline distribution, which triggers drift detection, as follows:

- `sepal_length`: increase the value 4x.

In [None]:
instances = []
for _ in range(1000):
    sepal_length = random.uniform(0.5, 3.5) * 4.0
    sepal_width = random.uniform(0.2, 2.0)
    petal_length = random.uniform(0.5, 2.0)
    petal_width = random.uniform(0.2, 1.5)
    instances.append([sepal_length, sepal_width, petal_length, petal_width])

### Make the prediction requests

Next, you send the 1000 prediction requests to your `Vertex AI Endpoint` resource using the `predict()` method.

In [None]:
for instance in instances:
    response = endpoint.predict(instances=[instance])

prediction = response[0]

# Print the first prediction from the last response received
print(prediction[0])

### Drift detection during monitoring

The feature input drift detection will occur at the next monitoring interval. In this tutorial, you set the monitoring interval to one hour. So, in about an hour your monitoring job will go from `OFFLINE` to `RUNNING`. While running, it will analyze the sampled requests logged during this interval and compare their distribution to that of the previous monitoring interval.

Once the analysis is completed, the monitoring job will send email notifications on the detected drift, in this case `sepal_length`, and the monitoring job will go into `OFFLINE` state until the next interval.

#### Wait for monitoring interval

It can take upwards of 40 minutes from when the analysis occurs at the monitoring interval to when you receive an email alert.

In [None]:
if os.getenv("IS_TESTING"):
    time.sleep(60 * 45)

### Logging sampled requests

On the next monitoring interval, the sampled predictions are then copied over to the BigQuery logging table. Once the entries are in the BigQuery table, the monitoring service will analyze the sampled data.

Next, you wait for the second batch of logged entries to appear in the BigQuery table used for logging prediction samples. Since you have now sent 2000 prediction requests in total, with 50% sampling, you should see around 1000 entries.

*Note*: This may take up to the length of the monitoring interval (e.g., one hour).

In [None]:
import time

while True:

    ENDPOINT_ID = endpoint.resource_name.split("/")[-1]

    table = bigquery.TableReference.from_string(
        f"{PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}.serving_predict"
    )
    rows = bqclient.list_rows(table)
    print(rows.total_rows)
    if rows.total_rows > 950:
        break
    time.sleep(180)

### Delete the monitoring job

You can delete the monitoring job using the `delete()` method. The cell below first pauses the job with `pause()` and then deletes it.

In [None]:
monitoring_job.pause()
monitoring_job.delete()

#### Undeploy and delete the `Vertex AI Endpoint` resource

Your `Vertex AI Endpoint` resource can be deleted using the `delete()` method. Before deleting it, you must first undeploy any model deployed to the `Vertex AI Endpoint` resource.

In [None]:
endpoint.undeploy_all()
endpoint.delete()

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial.

In [None]:
delete_bucket = False

if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil rm -rf {BUCKET_URI}

! rm -f schema.yaml

! bq rm -f {PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}