In [None]:
# @title Copyright & License (click to expand)
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI Model Monitoring for custom tabular models

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_custom.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fmodel_monitoring%2Fget_started_with_model_monitoring_custom.ipynb">
      <img width="32px" src="https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>    
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_custom.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_custom.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

## Overview

This tutorial demonstrates how to use Vertex AI Model Monitoring for custom tabular models.

Learn more about [Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring).

### Objective

In this notebook, you learn to use the Vertex AI Model Monitoring service to detect feature skewness and drift in incoming prediction requests for custom tabular models.

This tutorial uses the following Vertex AI services:

- Vertex AI Model Monitoring
- Vertex AI Prediction
- Vertex AI Model resource
- Vertex AI Endpoint resource

The steps performed include:

- Download a pre-trained custom tabular model.
- Upload the pre-trained model to Vertex AI Model Registry.
- Deploy the model resource to a Vertex AI endpoint resource.
- Configure the endpoint resource for model monitoring.
- Generate synthetic prediction requests to simulate skewness.
- Wait for email alert notifications.
- Generate synthetic prediction requests to simulate drift.
- Wait for email alert notifications.

Learn more about [Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview).

### Model

This tutorial uses a pre-trained model, where the model artifacts are stored in a public Cloud Storage bucket. 

The model is based on a [blog post about Churn prediction model](https://cloud.google.com/blog/topics/developers-practitioners/churn-prediction-game-developers-using-google-analytics-4-ga4-and-bigquery-ml). This model involves extensive log data describing how game users have interacted with a site. The raw data contains the following categories of information:

- identity - unique player identity numbers.
- demographic features - information about the player, such as the geographic region in which a player is located.
- behavioral features - counts of the number of times a player has triggered certain game events, such as reaching a new level.
- churn propensity - this is the label or target feature. It provides an estimated probability that this player may churn, i.e., stop being an active player.

### Costs 

This tutorial uses billable components of Google Cloud:

* Vertex AI
* BigQuery
* Cloud Storage

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

## Get Started

### Install Vertex AI SDK for Python and other required packages


In [None]:
# Install required packages.
! pip3 install --quiet --upgrade google-cloud-aiplatform \
                                 google-cloud-bigquery

### Restart runtime (Colab only)

To use the newly installed packages, you must restart the runtime on Google Colab.

In [None]:
import sys

if "google.colab" in sys.modules:

    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

Authenticate your environment on Google Colab.


In [None]:
import sys

if "google.colab" in sys.modules:

    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information 

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}

### User email

Set your user email address to receive monitoring alerts.

In [None]:
import os

USER_EMAIL = "[your-email-address]"  # @param {type:"string"}

if os.getenv("IS_TESTING"):
    USER_EMAIL = "noreply@google.com"

### Notes about service account and permission

**By default, no configuration is required.** If you run into a permission-related issue, make sure the service accounts below have the required roles:

|Service account email|Description|Roles|
|---|---|---|
|PROJECT_NUMBER-compute@developer.gserviceaccount.com|Compute Engine default service account|Dataflow Admin, Dataflow Worker, Storage Admin, BigQuery Admin, Vertex AI User|
|service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com|AI Platform Service Agent|Vertex AI Service Agent|


1. Go to [IAM console](https://console.cloud.google.com/iam-admin/iam).
2. Check the **Include Google-provided role grants** checkbox.
3. Find the above emails.
4. Grant the corresponding roles.

### Using data source from a different project

If you're using data sources from a different project:

- For BigQuery data source, grant the "BigQuery Data Viewer" role to both the service accounts.
- For CSV data source, grant the "Storage Object Viewer" role to both the service accounts.

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [None]:
BUCKET_URI = f"gs://your-bucket-name-{PROJECT_ID}-unique"  # @param {type:"string"}

**If your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $LOCATION -p $PROJECT_ID $BUCKET_URI

### Import libraries

In [None]:
import google.cloud.aiplatform as aiplatform
from google.cloud import bigquery
from google.cloud.aiplatform import model_monitoring

### Initialize Vertex AI SDK for Python

To get started using Vertex AI, you must [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

In [None]:
aiplatform.init(project=PROJECT_ID, location=LOCATION, staging_bucket=BUCKET_URI)

### Create BigQuery client

In this tutorial, you use data from the same public BigQuery table that was used to train the pre-trained model. You create a client interface, which you subsequently use to access the data.

In [None]:
bqclient = bigquery.Client(project=PROJECT_ID)

#### Set hardware accelerators

You can set hardware accelerators for prediction (e.g., GPUs) or use only CPUs. Hardware accelerators lower the response latency of a prediction request. When choosing a hardware accelerator, weigh the additional cost against the latency improvement.

Set the variables `DEPLOY_GPU/DEPLOY_NGPU` to use a container image supporting a GPU and the number of GPUs allocated to the virtual machine (VM) instance. For example, to use a GPU container image with 4 Nvidia Tesla T4 GPUs allocated to each VM, you would specify:

    (aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_T4, 4)

See the [locations where accelerators are available](https://cloud.google.com/vertex-ai/docs/general/locations#accelerators).

Otherwise specify `(None, None)` to use a container image to run on a CPU.

In [None]:
GPU = False
if GPU:
    DEPLOY_GPU, DEPLOY_NGPU = (aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_T4, 1)
else:
    DEPLOY_GPU, DEPLOY_NGPU = (None, None)

#### Set pre-built containers

Set the pre-built Docker container image for prediction.

For the latest list, see [Pre-built containers for prediction](https://cloud.google.com/ai-platform-unified/docs/predictions/pre-built-containers).

In [None]:
if GPU:
    DEPLOY_VERSION = "tf2-gpu.2-5"
else:
    DEPLOY_VERSION = "tf2-cpu.2-5"

DEPLOY_IMAGE = "{}-docker.pkg.dev/vertex-ai/prediction/{}:latest".format(
    LOCATION.split("-")[0], DEPLOY_VERSION
)

print("Deployment:", DEPLOY_IMAGE, DEPLOY_GPU, DEPLOY_NGPU)
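
The image URI above is assembled from the multi-region prefix of `LOCATION` and the framework version tag. As a minimal sketch, the same logic can be wrapped in a function so that it's easy to check for other regions. `build_deploy_image` is a hypothetical helper name for illustration, not part of the Vertex AI SDK.

```python
def build_deploy_image(location: str, use_gpu: bool, tf_version: str = "2-5") -> str:
    """Return a pre-built TensorFlow prediction image URI for a region.

    Hypothetical helper that mirrors the string formatting used in this tutorial.
    """
    # The registry host uses the first token of the region name,
    # e.g. "us-central1" -> "us".
    multi_region = location.split("-")[0]
    device = "gpu" if use_gpu else "cpu"
    return (
        f"{multi_region}-docker.pkg.dev/vertex-ai/prediction/"
        f"tf2-{device}.{tf_version}:latest"
    )


print(build_deploy_image("us-central1", use_gpu=False))
```

This prefix rule only makes sense for regions whose names start with a multi-region that hosts the images. Check the pre-built containers page before relying on it for other regions.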

#### Set machine types

Next, set the machine types to use for training and prediction.

- Set the variable `DEPLOY_COMPUTE` to configure your compute resources for prediction.
 - Set a `machine type`:
     - `n1-standard`: 3.75GB of memory per vCPU
     - `n1-highmem`: 6.5GB of memory per vCPU
     - `n1-highcpu`: 0.9 GB of memory per vCPU
 - `vCPUs`: number of \[2, 4, 8, 16, 32, 64, 96 \]

**Note**: You may also use n2 and e2 machine types for training and deployment, but they don't support GPUs.

In [None]:
MACHINE_TYPE = "n1-standard"

VCPU = "4"
TRAIN_COMPUTE = MACHINE_TYPE + "-" + VCPU
print("Train machine type", TRAIN_COMPUTE)

MACHINE_TYPE = "n1-standard"

VCPU = "4"
DEPLOY_COMPUTE = MACHINE_TYPE + "-" + VCPU
print("Deploy machine type", DEPLOY_COMPUTE)
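
To make the memory figures listed above concrete, the following sketch computes the total memory for a machine type name. The per-vCPU values and the allowed vCPU counts are copied from the list in this section; `describe_machine` is a hypothetical helper for illustration only.

```python
from typing import Tuple

# Memory per vCPU in GB, as listed in this section.
MEMORY_PER_VCPU_GB = {"n1-standard": 3.75, "n1-highmem": 6.5, "n1-highcpu": 0.9}
ALLOWED_VCPUS = (2, 4, 8, 16, 32, 64, 96)


def describe_machine(family: str, vcpus: int) -> Tuple[str, float]:
    """Return the machine type name and its total memory in GB."""
    if family not in MEMORY_PER_VCPU_GB:
        raise ValueError(f"Unknown machine family: {family}")
    if vcpus not in ALLOWED_VCPUS:
        raise ValueError(f"Unsupported vCPU count: {vcpus}")
    return f"{family}-{vcpus}", MEMORY_PER_VCPU_GB[family] * vcpus


name, memory_gb = describe_machine("n1-standard", 4)
print(name, memory_gb)  # n1-standard-4 15.0
```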

## Introduction to Vertex AI Model Monitoring

Vertex AI Model Monitoring supports AutoML tabular models and custom tabular models. You can monitor for skewness and drift in the feature values of inbound prediction requests, or in the feature attributions (Vertex Explainable AI) of outbound prediction responses. In the latter case, you monitor the distribution of the attributions that quantify each feature's contribution to the prediction.

The following are the basic steps to enable model monitoring:

1. Deploy a Vertex AI AutoML or custom tabular model to a Vertex AI endpoint.
2. Configure a model monitoring specification.
3. Upload the model monitoring specification to the Vertex AI endpoint.
4. Upload schema or use automatic generation of the *input schema* for parsing.
5. For feature skewness detection, upload the training data. This enables automatic generation of the feature distributions.
6. For feature attributions, upload the corresponding Vertex Explainable AI specification.

Once configured, you can enable/disable monitoring, change alerts and update the model monitoring configuration. 

When model monitoring is enabled, the sampled incoming prediction requests are logged into a BigQuery table. The input feature values contained in the logged requests are then analyzed for skewness or drift on a specified interval basis. You set a sampling rate to monitor a subset of the production inputs to the model, and the monitoring interval.

The model monitoring service needs to know how to parse the feature values, which is referred to as the input schema. For AutoML tabular models, the input schema is automatically generated. For custom tabular models, the service attempts to automatically derive the input schema from the first 1000 prediction requests. Alternatively, you can upload the input schema yourself.

For skewness detection, the monitoring service requires a baseline for the statistical distribution of values in the training data. For AutoML tabular models this is automatically derived. For custom tabular models, you upload the training data to the service, and have the service automatically derive the distribution.

For skewness and drift detection in feature attributions, you're required to enable Vertex Explainable AI feature for your deployed custom tabular models. For AutoML models, Vertex Explainable AI is automatically enabled.

Learn more about [Introduction to Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview).

### Upload the model artifacts to Vertex AI Model Registry

First, upload the pre-trained custom tabular model artifacts as a Vertex AI model resource using the `upload()` method, with the following parameters:

- `display_name`: The human readable name for the model resource.
- `artifact_uri`: The Cloud Storage location of the model artifacts.
- `serving_container_image_uri`: The serving container image to use when the model is deployed to a Vertex AI endpoint resource.
- `sync`: Whether to wait for the process to complete, or return immediately (async).

In [None]:
MODEL_ARTIFACT_URI = "gs://mco-mm/churn"

model = aiplatform.Model.upload(
    display_name="churn",
    artifact_uri=MODEL_ARTIFACT_URI,
    serving_container_image_uri=DEPLOY_IMAGE,
    sync=True,
)

print(model)

### Deploy the model to an endpoint

Next, deploy your Vertex AI model resource to a Vertex AI endpoint resource using the `deploy()` method, with the following parameters:

- `deployed_model_display_name`: The human readable name for the deployed model.
- `machine_type`: The machine type for each VM node instance.
- `min_replica_count`: The minimum number of nodes to provision for auto-scaling.
- `max_replica_count`: The maximum number of nodes to provision for auto-scaling.
- `accelerator_type`: The type, if any, of GPU accelerators per provisioned node.
- `accelerator_count`: The number, if any, of GPU accelerators per provisioned node.

In [None]:
MIN_NODES = 1
MAX_NODES = 1

if GPU:
    endpoint = model.deploy(
        deployed_model_display_name="churn",
        machine_type=DEPLOY_COMPUTE,
        min_replica_count=MIN_NODES,
        max_replica_count=MAX_NODES,
        accelerator_type=DEPLOY_GPU.name,
        accelerator_count=DEPLOY_NGPU,
    )
else:
    endpoint = model.deploy(
        deployed_model_display_name="churn",
        machine_type=DEPLOY_COMPUTE,
        min_replica_count=MIN_NODES,
        max_replica_count=MAX_NODES,
    )

## Configure a monitoring job

Configuring the monitoring job consists of the following specifications:

- `alert_config`: The email address(es) that are supposed to receive the monitoring alerts.
- `schedule_config`: The time window to analyze predictions.
- `logging_sampling_strategy`: The rate for sampling prediction requests. 
- `drift_config`: The features and drift thresholds to monitor.
- `skew_config`: The features and skewness thresholds to monitor.

### Configure the alerting specification

Configure the `alerting_config` specification with the following settings:

- `user_emails`: A list of one or more emails that should receive the alerts.
- `enable_logging`: Streams detected anomalies to Cloud Logging. Default is False.

In [None]:
# Create alerting configuration.
alerting_config = model_monitoring.EmailAlertConfig(
    user_emails=[USER_EMAIL], enable_logging=True
)

### Configure the monitoring interval specification

Next, you configure the `schedule_config` specification with the following settings:

- `monitor_interval`:  Sets the model monitoring job scheduling interval in hours. Minimum time interval is 1 hour.

In [None]:
# Monitoring Interval
MONITOR_INTERVAL = 1  # @param {type:"number"}

# Create schedule configuration
schedule_config = model_monitoring.ScheduleConfig(monitor_interval=MONITOR_INTERVAL)

### Configure the sampling specification

Now, you configure the `logging_sampling_strategy` specification with the following settings:

- `sample_rate`: The fraction (between 0 and 1) of prediction requests to randomly sample for monitoring. Sampled requests are logged to a BigQuery table.

In [None]:
# Sampling rate (optional, default=.8)
SAMPLE_RATE = 0.5  # @param {type:"number"}

# Create sampling configuration
logging_sampling_strategy = model_monitoring.RandomSampleConfig(sample_rate=SAMPLE_RATE)
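
To build intuition for what a sampling rate of 0.5 means for 1000 requests, here is a small simulation. It assumes each request is kept independently with probability `sample_rate`. This is a simplified model for illustration, not a description of how the service samples internally. `simulate_sampling` is a hypothetical helper.

```python
import random


def simulate_sampling(num_requests: int, sample_rate: float, seed: int = 0) -> int:
    """Count how many requests are kept under independent random sampling."""
    rng = random.Random(seed)  # Seeded so that repeated runs give the same count.
    return sum(1 for _ in range(num_requests) if rng.random() < sample_rate)


logged = simulate_sampling(num_requests=1000, sample_rate=0.5)
print(f"{logged} of 1000 requests would be logged")
```

Under this model the count is close to, but rarely exactly, half of the requests, which is why later checks on the logging table should tolerate some variation.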

### Configure the drift detection specification

Then, you configure the `drift_config` specification with the following settings:

- `drift_thresholds`: A dictionary of key/value pairs where the keys are the input features for monitoring the drift. The value represents the detection threshold. When not specified, the default drift threshold for a feature is 0.3 (30%).

**Note:** Enabling drift detection is optional.

In [None]:
DRIFT_THRESHOLD_VALUE = 0.05

DRIFT_THRESHOLDS = {
    "country": DRIFT_THRESHOLD_VALUE,
    "cnt_user_engagement": DRIFT_THRESHOLD_VALUE,
}

drift_config = model_monitoring.DriftDetectionConfig(drift_thresholds=DRIFT_THRESHOLDS)
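
When you monitor many features, writing the threshold dictionary by hand becomes repetitive. The sketch below builds it from a list of feature names with optional per-feature overrides. `build_thresholds` is a hypothetical helper, and the override value is illustrative only.

```python
from typing import Dict, Iterable, Optional


def build_thresholds(
    features: Iterable[str],
    default: float,
    overrides: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Map each feature name to a threshold, applying overrides last."""
    thresholds = {name: default for name in features}
    thresholds.update(overrides or {})
    return thresholds


example_thresholds = build_thresholds(
    ["country", "cnt_user_engagement"],
    default=0.05,
    overrides={"country": 0.1},  # Illustrative value, not a recommendation.
)
print(example_thresholds)
```

The resulting dictionary has the same shape as `DRIFT_THRESHOLDS` above, so it could be passed to `drift_thresholds` in the same way.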

### Configure the skew detection specification

Next, you configure the `skew_config` specification with the following settings:

- `data_source`: The location of the original training dataset. By default, the source is treated as a BigQuery table. If the training data is instead stored as files in Cloud Storage, set `data_format` to one of the following values:
  - `csv`
  - `jsonl`
  - `tf-record`
- `skew_thresholds`: A dictionary of key/value pairs where the keys are the input features for monitoring the skewness. The value represents the detection threshold. When not specified, the default skew threshold for a feature is 0.3 (30%).
- `target_field`: The target label for the training dataset

**Note:** Enabling skewness detection is optional.

In [None]:
# URI to training dataset.
DATASET_BQ_URI = "bq://mco-mm.bqmlga4.train"  # @param {type:"string"}
# Prediction target column name in training dataset.
TARGET = "churned"

SKEW_THRESHOLD_VALUE = 0.5

SKEW_THRESHOLDS = {
    "country": SKEW_THRESHOLD_VALUE,
    "cnt_user_engagement": SKEW_THRESHOLD_VALUE,
}

skew_config = model_monitoring.SkewDetectionConfig(
    data_source=DATASET_BQ_URI, skew_thresholds=SKEW_THRESHOLDS, target_field=TARGET
)

### Assemble the objective specification

Finally, you assemble the objective specification (`objective_config`) with the following settings:

- `skew_detection_config`: (Optional) The specification for the skewness detection configuration.
- `drift_detection_config`: (Optional) The specification for the drift detection configuration.
- `explanation_config`: (Optional) The specification for explanations when enabling monitoring for feature attributions.

In [None]:
objective_config = model_monitoring.ObjectiveConfig(
    skew_detection_config=skew_config,
    drift_detection_config=drift_config,
    explanation_config=None,
)

### Create the input schema

The monitoring service needs to know the features and data types of the feature inputs to the model, which is referred to as the *input schema*. The *input schema* can be either:
 - Preloaded to the monitoring service.
 - Automatically generated by the monitoring service after receiving the first 1000 prediction instances.
 
In this tutorial, you preload the *input schema*.

#### Create the predefined input schema

The predefined *input schema* is specified as a YAML file. In this example, you retrieve the BigQuery schema for the training data, which includes the feature names and data types, to generate the YAML specification. The predefined *input schema* must be loaded to a Cloud Storage location.

Learn more about [Custom instance schemas for parsing input](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview#custom-input-schemas).

In [None]:
# Get the BQ table

table = bigquery.TableReference.from_string(DATASET_BQ_URI[5:])
bq_table = bqclient.get_table(table)

yaml = """type: object
properties:
"""

schema = bq_table.schema
for feature in schema:
    if feature.name == TARGET:
        continue
    if feature.field_type == "STRING":
        f_type = "string"
    else:
        f_type = "integer"
    yaml += f"""  {feature.name}:
    type: {f_type}
"""

yaml += """required:
"""
for feature in schema:
    if feature.name == TARGET:
        continue
    yaml += f"""- {feature.name}
"""

print(yaml)

with open("schema.yaml", "w") as f:
    f.write(yaml)

! gsutil cp schema.yaml {BUCKET_URI}/schema.yaml
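
The schema generation above can also be expressed as a pure function, which makes it possible to inspect the YAML without a BigQuery client. The sketch below is one way to do that under stated assumptions: the type mapping is broader than the string/integer split used in the cell above, and the example column types are assumed rather than read from the training table. `build_input_schema` and `BQ_TO_SCHEMA_TYPE` are hypothetical names.

```python
from typing import Iterable, Tuple

# Assumed mapping from BigQuery column types to schema types.
BQ_TO_SCHEMA_TYPE = {
    "STRING": "string",
    "INTEGER": "integer",
    "INT64": "integer",
    "FLOAT": "number",
    "FLOAT64": "number",
    "BOOLEAN": "boolean",
    "BOOL": "boolean",
}


def build_input_schema(columns: Iterable[Tuple[str, str]], target: str) -> str:
    """Build the YAML input schema text from (name, BigQuery type) pairs."""
    # The target column is a label, not a model input, so it is skipped.
    features = [(name, bq_type) for name, bq_type in columns if name != target]
    lines = ["type: object", "properties:"]
    for name, bq_type in features:
        lines.append(f"  {name}:")
        # A KeyError here signals a column type the mapping doesn't cover.
        lines.append(f"    type: {BQ_TO_SCHEMA_TYPE[bq_type]}")
    lines.append("required:")
    lines.extend(f"- {name}" for name, _ in features)
    return "\n".join(lines) + "\n"


# Column types below are assumed for illustration.
example_columns = [
    ("country", "STRING"),
    ("cnt_user_engagement", "INTEGER"),
    ("churned", "INTEGER"),
]
print(build_input_schema(example_columns, target="churned"))
```

With a real table, the pairs could come from `[(f.name, f.field_type) for f in bq_table.schema]`.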

### Create the monitoring job

Create a monitoring job, with your monitoring specifications, using Vertex AI's [`ModelDeploymentMonitoringJob.create()`](https://cloud.google.com/python/docs/reference/aiplatform/1.48.0/summary_method#google_cloud_aiplatform_ModelDeploymentMonitoringJob_create_summary) method, with the following parameters:

- `display_name`: The human readable name for the monitoring job.
- `project`: The project ID.
- `location`: The region.
- `endpoint`: The fully qualified resource name of the Vertex AI endpoint to enable monitoring.
- `logging_sampling_strategy`: The specification for the sampling configuration.
- `schedule_config`: The specification for the scheduling configuration.
- `alert_config`: The specification for the alerting configuration.
- `objective_configs`: The specification for the objectives configuration.
- `analysis_instance_schema_uri`: The location of the YAML file containing the *input schema*.

In [None]:
monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="churn",
    project=PROJECT_ID,
    location=LOCATION,
    endpoint=endpoint,
    logging_sampling_strategy=logging_sampling_strategy,
    schedule_config=schedule_config,
    alert_config=alerting_config,
    objective_configs=objective_config,
    analysis_instance_schema_uri=f"{BUCKET_URI}/schema.yaml",
)

print(monitoring_job)

#### Email notification of the monitoring job

An email notification is sent to the email address in the alerting configuration, notifying that the model monitoring job is now enabled.

The contents of the email appear as below:

<blockquote>
Hello Vertex AI Customer,

You are receiving this mail because you are using the Vertex AI Model Monitoring service.
This mail is to inform you that we received your request to set up drift or skew detection for the Prediction Endpoint listed below. Starting from now, incoming prediction requests will be sampled and logged for analysis.
Raw requests and responses will be collected from prediction service and saved in bq://[your-project-id].model_deployment_monitoring_[endpoint-id].serving_predict .
</blockquote>

#### Monitoring Job State

After you start the Vertex AI Model Monitoring job, it stays in a **PENDING** state until the skew distribution baseline is calculated. The monitoring service initiates a batch job to generate the distribution baseline from the training data. 

Once the baseline distribution is generated, then the monitoring job changes to **OFFLINE** state. On a per interval basis, for example, once an hour, the monitoring job enters **RUNNING** state while analyzing the sampled data. Once completed, it returns to the **OFFLINE** state while awaiting the next scheduled analysis.

In [None]:
jobs = monitoring_job.list(filter="display_name=churn")
job = jobs[0]
print(job.state)

### Automatic generation of the baseline distribution

Next, the monitoring service creates a batch job to analyze the training data to generate the baseline distribution. Once completed, the monitoring service starts monitoring on the specified intervals.

In [None]:
import time

# Pause a bit for the baseline distribution to be calculated
if os.getenv("IS_TESTING"):
    time.sleep(180)
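
A fixed pause works for testing, but polling the job state with a timeout is usually more robust. The sketch below shows a generic polling loop. `wait_until` is a hypothetical helper, and the state strings are the ones used in the prose above (PENDING, OFFLINE), simulated here rather than read from a real job.

```python
import time
from typing import Callable


def wait_until(
    get_state: Callable[[], str],
    done: Callable[[str], bool],
    poll_seconds: float = 60.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until done(state) is true or the timeout elapses."""
    waited = 0.0
    while True:
        state = get_state()
        if done(state):
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"Still in state {state!r} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated state sequence standing in for a real monitoring job.
states = iter(["PENDING", "PENDING", "OFFLINE"])
final_state = wait_until(
    get_state=lambda: next(states),
    done=lambda state: state != "PENDING",
    poll_seconds=1,
    sleep=lambda _: None,  # Skip real sleeping in this demonstration.
)
print(final_state)  # OFFLINE
```

With a real job, `get_state` could wrap the `list()` lookup shown earlier. Check how the SDK names its states before writing the `done` condition, because the enum names may differ from the short names used in this text.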

### Generate synthetic prediction requests for skew detection

Now, you extract the first 1000 instances from the BigQuery training table for creating prediction requests. Then, modify the data (synthetic) to trigger the skewness detection in the prediction requests by making the following updates:

- `country`: Set all values to Canada.

In [None]:
# Download the table.
table = bigquery.TableReference.from_string(DATASET_BQ_URI[5:])

rows = bqclient.list_rows(table, max_results=1000)

instances = []
for row in rows:
    instance = {}
    for key, value in row.items():
        if key == TARGET:
            continue
        if value is None:
            value = ""
        if key == "country":
            value = "Canada"
        instance[key] = value
    instances.append(instance)

print(len(instances))

### Make the prediction requests

Next, you send the 1000 prediction requests to your Vertex AI endpoint resource using the `predict()` method.

In [None]:
for instance in instances:
    response = endpoint.predict(instances=[instance])

prediction = response[0]

# Print the first prediction from the last response
print(prediction[0])

### Logging sampled requests

Once the monitoring service has started, the sampled prediction requests are logged to Cloud Storage. On the next monitoring interval, the sampled predictions are copied to the BigQuery logging table. Once the entries are logged, the monitoring service analyzes the sampled data.

Next, you wait for the first logged entries to appear in the BigQuery logging table for prediction samples. Since you sent 1000 prediction requests, with 50% sampling, you should see around 500 entries.

In [None]:
while True:
    time.sleep(180)

    ENDPOINT_ID = endpoint.resource_name.split("/")[-1]

    table = bigquery.TableReference.from_string(
        f"{PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}.serving_predict"
    )
    rows = bqclient.list_rows(table)
    print(rows.total_rows)
    if rows.total_rows > 0:
        break

### Skewness detection during monitoring

The skewness detection for feature inputs occurs at the next monitoring interval. In this tutorial, you set the monitoring interval to one hour. So, in about an hour your monitoring job goes from **OFFLINE** to **RUNNING**. While running, it analyzes the logged sampled tables from the predictions during this interval and compares them to the baseline distribution.

Once the analysis is completed, the monitoring job sends email notifications on the detected skewness, in this case `country`. Further, the monitoring job goes into **OFFLINE** state until the next interval.
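
The alert email for this tutorial reports an L-infinity distance between the training and serving distributions of `country`. To show why setting every value to Canada exceeds a 0.5 threshold, the sketch below computes that distance for two small categorical samples. The country counts are invented for illustration, and the code is a plain reading of the metric, not the service's implementation.

```python
from collections import Counter
from typing import Dict, Iterable


def category_distribution(values: Iterable[str]) -> Dict[str, float]:
    """Return the relative frequency of each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def linf_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Largest absolute difference in probability over all categories."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Invented training sample: a mix of countries.
training = ["United States"] * 50 + ["India"] * 30 + ["Canada"] * 20
# Serving sample after the synthetic change: every request says Canada.
serving = ["Canada"] * 100

distance = linf_distance(
    category_distribution(training), category_distribution(serving)
)
print(round(distance, 2))  # 0.8
```

In this toy data the largest gap is on Canada (0.2 in training versus 1.0 in serving), which mirrors the "feature value with maximum difference" line in the alert.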

#### Wait for monitoring interval

It can take 40 minutes or more from the moment the analysis on the monitoring interval is done until you receive an email alert.

The contents of the email appear as below:

<blockquote>
   Hello Vertex AI Customer,

You are receiving this mail because you are subscribing to the Vertex AI Model Monitoring service.
This mail is just to inform you that there are some anomalies detected in your deployed models and may need your attention.


Basic Information:

Endpoint Name: projects/[your-project-id]/locations/us-central1/endpoints/3315907167046860800
Monitoring Job: projects/[your-project-id]/locations/us-central1/modelDeploymentMonitoringJobs/8672170640054157312
Statistics and Anomalies Root Path(Google Cloud Storage): gs://cloud-ai-platform-773884b1-2a32-48d6-8b83-c03cde416b68/model_monitoring/job-8672170640054157312
BigQuery Command: SELECT * FROM `bq://[your-project-id].model_deployment_monitoring_3315907167046860800.serving_predict`


Training Prediction Skew Anomalies (Raw Feature):

Anomalies Report Path(Google Cloud Storage): gs://cloud-ai-platform-773884b1-2a32-48d6-8b83-c03cde416b68/model_monitoring/job-8672170640054157312/serving/2022-08-25T00:00/stats_and_anomalies/<deployed-model-id>/anomalies/training_prediction_skew_anomalies

For more information about the alert, please visit the model monitoring alert page.

Deployed model id: <deployed-model-id>

Feature name	Anomaly short description	Anomaly long description
country	High Linfty distance between training and serving	The Linfty distance between training and serving is 0.947563 (up to six significant digits), above the threshold 0.5. The feature value with maximum difference is: Canada
</blockquote>

In [None]:
if os.getenv("IS_TESTING"):
    time.sleep(60 * 45)

### Generate synthetic prediction requests for drift detection

Next, you extract the same first 1000 instances from the BigQuery training table to use for prediction requests. Then, modify the data (synthetic) to trigger the drift detection in the prediction requests by making the following updates:

- `cnt_user_engagement`: increase the value 4x.

In [None]:
# Download the table.
table = bigquery.TableReference.from_string(DATASET_BQ_URI[5:])

rows = bqclient.list_rows(table, max_results=1000)

instances = []
for row in rows:
    instance = {}
    for key, value in row.items():
        if key == TARGET:
            continue
        if value is None:
            value = ""
        elif key == "cnt_user_engagement":
            value = int(value * 4)
        instance[key] = value
    instances.append(instance)

print(len(instances))
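
The skew and drift cells build instances in almost the same way and differ only in which feature they alter. As a sketch, that shared logic can be factored into one function that takes per-feature transforms. `build_instance` is a hypothetical helper, and the example row is invented. Missing values are replaced with an empty string and are not transformed, which matches the drift cell above.

```python
from typing import Any, Callable, Dict, Mapping, Optional


def build_instance(
    row: Mapping[str, Any],
    target: str,
    transforms: Optional[Dict[str, Callable[[Any], Any]]] = None,
) -> Dict[str, Any]:
    """Convert a table row into a prediction instance, altering chosen features."""
    transforms = transforms or {}
    instance = {}
    for key, value in row.items():
        if key == target:
            continue  # The label is not a model input.
        if value is None:
            value = ""  # Missing values are sent as empty strings.
        elif key in transforms:
            value = transforms[key](value)
        instance[key] = value
    return instance


# Invented row for illustration.
row = {"country": "India", "cnt_user_engagement": 12, "churned": 0}

skewed = build_instance(row, "churned", {"country": lambda _: "Canada"})
drifted = build_instance(
    row, "churned", {"cnt_user_engagement": lambda v: int(v * 4)}
)
print(skewed)
print(drifted)
```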

### Make the prediction requests

Next, you send the 1000 prediction requests to your Vertex AI endpoint resource using the `predict()` method.

In [None]:
for instance in instances:
    response = endpoint.predict(instances=[instance])

prediction = response[0]

# Print the first prediction from the last response
print(prediction[0])

### Logging sampled requests

On the next monitoring interval, the sampled predictions are copied to the BigQuery logging table. Once the entries are logged, the monitoring service analyzes the sampled data.

Next, you wait for the new entries to appear in the BigQuery logging table for prediction samples. Since you sent another 1000 prediction requests with 50% sampling, the table should now hold around 1000 entries in total (about 500 from each batch).

In [None]:
while True:
    time.sleep(180)

    ENDPOINT_ID = endpoint.resource_name.split("/")[-1]

    table = bigquery.TableReference.from_string(
        f"{PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}.serving_predict"
    )
    rows = bqclient.list_rows(table)
    print(rows.total_rows)
    if rows.total_rows > 550:
        break

### Drift detection during monitoring

The drift detection for feature inputs occurs at the next monitoring interval. In this tutorial, you set the monitoring interval to one hour. So, in about an hour your monitoring job goes from **OFFLINE** to **RUNNING**. While running, it analyzes the logged sampled tables from the predictions during this interval and compares them to the previous monitoring interval distribution.

Once the analysis is completed, the monitoring job sends email notifications on the detected drift, in this case `cnt_user_engagement`. Then, the monitoring job goes into **OFFLINE** state until the next interval.
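
For numerical features such as `cnt_user_engagement`, the Model Monitoring documentation describes Jensen-Shannon divergence as the distance score. The sketch below computes it for two histograms over the same bins. The bin counts are invented, and the choice of bins and of log base 2 are assumptions made for illustration. The service's own binning isn't shown in this tutorial.

```python
import math
from typing import List, Sequence


def normalize(counts: Sequence[float]) -> List[float]:
    """Turn histogram counts into probabilities."""
    total = sum(counts)
    return [count / total for count in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence: symmetric, and between 0 and 1 with log base 2."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Invented histograms over the same four bins.
previous_interval = normalize([40, 30, 20, 10])
current_interval = normalize([10, 20, 30, 40])  # Mass shifted to higher values.

print(js_divergence(previous_interval, current_interval))
```

Identical histograms give 0 and histograms with no overlap give 1, so a low threshold such as the 0.05 used in this tutorial is sensitive to fairly small shifts.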

#### Wait for monitoring interval

It can take 40 minutes or more from the moment the analysis on the monitoring interval is done until you receive an email alert.

In [None]:
if os.getenv("IS_TESTING"):
    time.sleep(60 * 45)

### Delete the monitoring job

Once you've received the email alerts and verified the content, you can:
- pause the monitoring job using the `pause()` method.
- delete the monitoring job using the `delete()` method. 

In [None]:
# Pause the job
monitoring_job.pause()
# Delete the job
monitoring_job.delete()

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial.

In [None]:
# Undeploy the model from endpoint before deletion
endpoint.undeploy_all()

# Delete the endpoint
endpoint.delete()

# Delete the model
model.delete()

# Delete the Cloud Storage bucket
delete_bucket = False  # Set True for deletion
if delete_bucket:
    ! gsutil rm -rf {BUCKET_URI}

# Delete the locally generated files
! rm -f schema.yaml

# Delete the BigQuery dataset that holds the logged prediction requests
! bq rm -r -f -d {PROJECT_ID}:model_deployment_monitoring_{ENDPOINT_ID}