In [None]:
# @title Copyright & License (click to expand)
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI Model Monitoring setup for tabular models

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_setup.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_setup.ipynb">
        <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/model_monitoring/get_started_with_model_monitoring_setup.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>

## Overview

This tutorial demonstrates how to set up Vertex AI Model Monitoring for tabular models.

Learn more about [Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring).

### Objective

In this notebook, you learn how to set up the `Vertex AI Model Monitoring` service to detect feature skew and drift in the input predict requests.

This tutorial uses the following Google Cloud ML services:

- `Vertex AI Model Monitoring`
- `Vertex AI Prediction`
- `Vertex AI Model` resource
- `Vertex AI Endpoint` resource

The steps performed include:

- Download a pre-trained custom tabular model.
- Upload the pre-trained model as a `Model` resource.
- Deploy the `Model` resource to the `Endpoint` resource.
- Configure the `Endpoint` resource for model monitoring.
    - Skew and drift detection for feature inputs.
    - Skew and drift detection for feature attributions.
- Automatically generate the `input schema` by sending 1000 prediction requests.
- List, pause, resume and delete monitoring jobs.
- Restart the monitoring job with a predefined `input schema`.
- View logged monitored data.

### Model

This tutorial uses a pre-trained model, where the model artifacts are stored in a public Cloud Storage bucket. 

The model is based on [the blog post](https://cloud.google.com/blog/topics/developers-practitioners/churn-prediction-game-developers-using-google-analytics-4-ga4-and-bigquery-ml). The idea behind this model is that your company has extensive log data describing how your game users have interacted with the site. The raw data contains the following categories of information:

- identity - unique player identity numbers
- demographic features - information about the player, such as the geographic region in which a player is located
- behavioral features - counts of the number of times a player has triggered certain game events, such as reaching a new level
- churn propensity - the label or target feature, which provides an estimated probability that the player may churn, that is, stop being an active player

### Costs 

This tutorial uses billable components of Google Cloud:

* Vertex AI
* BigQuery
* Cloud Storage

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

## Installation

Install the packages required for executing this notebook.

In [None]:
# Install required packages.
! pip3 install --quiet --upgrade google-cloud-aiplatform \
                                 google-cloud-bigquery \
                                 tensorflow==2.7 \
                                 protobuf==3.20.3

### Colab only: Uncomment the following cell to restart the kernel

In [None]:
# Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

## Before you begin

### Set your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

# Set the project id
! gcloud config set project {PROJECT_ID}

#### Region

You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "us-central1"

#### User Email

Set your user email address to receive monitoring alerts.

In [None]:
import os

USER_EMAIL = "[your-email-address]"  # @param {type:"string"}

if os.getenv("IS_TESTING"):
    USER_EMAIL = "noreply@google.com"

### Authenticate your Google Cloud account

Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below.

**1. Vertex AI Workbench**
* Do nothing as you are already authenticated.

**2. Local JupyterLab instance, uncomment and run:**

In [None]:
# ! gcloud auth login

**3. Colab, uncomment and run:**

In [None]:
# from google.colab import auth
# auth.authenticate_user()

**4. Service account or other**
* See how to grant Cloud Storage permissions to your service account at https://cloud.google.com/storage/docs/gsutil/commands/iam#ch-examples.

### Notes about service accounts and permissions

**By default, no configuration is required.** If you run into a permission-related issue, make sure the service accounts below have the required roles:

|Service account email|Description|Roles|
|---|---|---|
|PROJECT_NUMBER-compute@developer.gserviceaccount.com|Compute Engine default service account|Dataflow Admin, Dataflow Worker, Storage Admin, BigQuery Admin, Vertex AI User|
|service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com|AI Platform Service Agent|Vertex AI Service Agent|


1. Go to https://console.cloud.google.com/iam-admin/iam.
2. Check the "Include Google-provided role grants" checkbox.
3. Grant the corresponding roles.
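
Both service account emails in the table follow a fixed pattern based on the project number, so they can be built with a small helper when you look them up in the IAM page. This is only a sketch: the function name `monitoring_service_accounts` and the project number used below are illustrative and are not part of any SDK.

```python
def monitoring_service_accounts(project_number: str) -> dict:
    """Return the two service account emails listed in the table above."""
    return {
        # Compute Engine default service account
        "compute_default": f"{project_number}-compute@developer.gserviceaccount.com",
        # AI Platform Service Agent
        "vertex_ai_service_agent": (
            f"service-{project_number}@gcp-sa-aiplatform.iam.gserviceaccount.com"
        ),
    }


# Hypothetical project number, for illustration only.
accounts = monitoring_service_accounts("123456789012")
for label, email in accounts.items():
    print(label, email)
```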

### Using data source from a different project
- For the BQ data source, grant both service accounts the "BigQuery Data Viewer" role.
- For the CSV data source, grant both service accounts the "Storage Object Viewer" role.

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [None]:
BUCKET_URI = f"gs://your-bucket-name-{PROJECT_ID}-unique"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI

### Import libraries

In [None]:
import google.cloud.aiplatform as aiplatform
from google.cloud import bigquery
from google.cloud.aiplatform import model_monitoring
from google.cloud.aiplatform.explain.metadata.tf.v2 import \
    saved_model_metadata_builder

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project and region.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION)

### Create BigQuery client

In this tutorial, you use data from the same public BigQuery table that was used to train the pre-trained model. You create a client interface, which you subsequently use to access the data.

In [None]:
bqclient = bigquery.Client(project=PROJECT_ID)

#### Set hardware accelerators

You can set hardware accelerators for prediction (e.g., GPUs) or choose not to use any (CPU). Hardware accelerators lower the response latency of a prediction request. When choosing a hardware accelerator, weigh the additional cost against the latency improvement.

Set the variables `DEPLOY_GPU/DEPLOY_NGPU` to use a container image supporting a GPU and the number of GPUs allocated to the virtual machine (VM) instance. For example, to use a GPU container image with 4 Nvidia Tesla K80 GPUs allocated to each VM, you would specify:

    (aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_K80, 4)

See the [locations where accelerators are available](https://cloud.google.com/vertex-ai/docs/general/locations#accelerators).

Otherwise specify `(None, None)` to use a container image to run on a CPU.

In [None]:
GPU = False
if GPU:
    DEPLOY_GPU, DEPLOY_NGPU = (aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_K80, 1)
else:
    DEPLOY_GPU, DEPLOY_NGPU = (None, None)

#### Set pre-built containers

Set the pre-built Docker container image for prediction.

For the latest list, see [Pre-built containers for prediction](https://cloud.google.com/ai-platform-unified/docs/predictions/pre-built-containers).

In [None]:
if GPU:
    DEPLOY_VERSION = "tf2-gpu.2-5"
else:
    DEPLOY_VERSION = "tf2-cpu.2-5"

DEPLOY_IMAGE = "{}-docker.pkg.dev/vertex-ai/prediction/{}:latest".format(
    REGION.split("-")[0], DEPLOY_VERSION
)

print("Deployment:", DEPLOY_IMAGE, DEPLOY_GPU, DEPLOY_NGPU)
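
The image URI above is assembled from the first segment of the region and the container version. The sketch below wraps the same string pattern in a function so that it is easier to see how a region maps to a registry host. The helper name `prebuilt_prediction_image` is hypothetical, and the sketch assumes the region prefix (such as `us`, `europe` or `asia`) is a valid registry host. Check the pre-built containers list for the hosts that actually exist.

```python
def prebuilt_prediction_image(region: str, version: str) -> str:
    """Build a pre-built prediction image URI using the pattern from the cell above."""
    # "us-central1" -> "us", "europe-west4" -> "europe"
    registry_prefix = region.split("-")[0]
    return f"{registry_prefix}-docker.pkg.dev/vertex-ai/prediction/{version}:latest"


example_image = prebuilt_prediction_image("us-central1", "tf2-cpu.2-5")
print(example_image)
```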

#### Set machine types

Next, set the machine types to use for training and prediction.

- Set the variable `DEPLOY_COMPUTE` to configure your compute resources for prediction.
 - `machine type`
     - `n1-standard`: 3.75GB of memory per vCPU
     - `n1-highmem`: 6.5GB of memory per vCPU
     - `n1-highcpu`: 0.9 GB of memory per vCPU
 - `vCPUs`: number of \[2, 4, 8, 16, 32, 64, 96 \]

*Note: You may also use n2 and e2 machine types for training and deployment, but they do not support GPUs*.

In [None]:
MACHINE_TYPE = "n1-standard"

VCPU = "4"
TRAIN_COMPUTE = MACHINE_TYPE + "-" + VCPU
print("Train machine type", TRAIN_COMPUTE)

MACHINE_TYPE = "n1-standard"

VCPU = "4"
DEPLOY_COMPUTE = MACHINE_TYPE + "-" + VCPU
print("Deploy machine type", DEPLOY_COMPUTE)
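
To get a feel for what a machine type string implies, the following sketch estimates the total memory from the per-vCPU figures listed above. The lookup table only contains the three `n1` families from that list, and the helper name `machine_memory_gb` is illustrative.

```python
# Memory per vCPU in GB, taken from the list above.
MEMORY_PER_VCPU_GB = {
    "n1-standard": 3.75,
    "n1-highmem": 6.5,
    "n1-highcpu": 0.9,
}


def machine_memory_gb(machine: str) -> float:
    """Estimate total memory for a machine string such as 'n1-standard-4'."""
    # Split "n1-standard-4" into the family ("n1-standard") and the vCPU count ("4").
    family, _, vcpus = machine.rpartition("-")
    return MEMORY_PER_VCPU_GB[family] * int(vcpus)


print(machine_memory_gb("n1-standard-4"), "GB")
```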

## Introduction to Vertex AI Model Monitoring

Vertex AI Model Monitoring is supported for AutoML tabular models and custom tabular models. You can monitor for skew and drift in the features of the inbound prediction requests, or for skew and drift in the feature attributions (Explainable AI) of the outbound prediction responses, that is, in the distribution of how much each feature contributed to the predictions.

The following are the basic steps to enable model monitoring:

1. Deploy a `Vertex AI` AutoML or custom tabular model to a `Vertex AI Endpoint`.
2. Configure a model monitoring specification.
3. Upload the model monitoring specification to the `Vertex AI Endpoint`.
4. Upload or automatically generate the `input schema` for parsing.
5. For feature skew detection, upload the training data for automatic generation of the feature distribution.
6. For feature attributions, upload the corresponding `Vertex AI Explainability` specification.

Once configured, you can enable/disable monitoring, change alerts and update the model monitoring configuration. 

When model monitoring is enabled, the sampled incoming prediction requests are logged into a BigQuery table. The input feature values contained in the logged requests are then analyzed for skew or drift at a specified interval. You set a sampling rate to monitor a subset of the production inputs to a model, and the monitoring interval.

The model monitoring service needs to know how to parse the feature values, which is referred to as the input schema. For AutoML tabular models, the input schema is automatically generated. For custom tabular models, the service attempts to automatically derive the input schema from the first 1000 prediction requests. Alternatively, you can upload the input schema.

For skew detection, the monitoring service requires a baseline for the statistical distribution of values in the training data. For AutoML tabular models this is automatically derived. For custom tabular models, you upload the training data to the service, and have the service automatically derive the distribution.

Feature attribution skew and drift detection requires enabling `Vertex AI Explainability` for your deployed model.

Learn more about [Introduction to Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview).

### Generate explainable metadata for `Vertex Explainable AI`

If you want to do skew and drift detection on feature attributions of the output predictions (response), perform the following additional steps:

- Specify the explainability specification for the model.
- When subsequently uploading the model as a `Vertex AI Model` resource, include the explainability specification.
- When subsequently uploading the model monitoring configuration specification to the corresponding `Vertex AI Endpoint` resource, include the explainability objective configuration.

As the first step, you create the explainable AI specification for your model using the helper method `SavedModelMetadataBuilder()`.

Learn more about [Introduction to Explainable AI](https://cloud.google.com/vertex-ai/docs/explainable-ai/overview).

In [None]:
MODEL_ARTIFACT_URI = "gs://mco-mm/churn"

params = {"sampled_shapley_attribution": {"path_count": 10}}
explanation_parameters = aiplatform.explain.ExplanationParameters(params)

builder = saved_model_metadata_builder.SavedModelMetadataBuilder(
    model_path=MODEL_ARTIFACT_URI, outputs_to_explain=["churned_probs"]
)
explanation_metadata = builder.get_metadata_protobuf()

### Upload the model artifacts as a `Vertex AI Model` resource

Next, you upload the pre-trained custom tabular model artifacts as a `Vertex AI Model` resource using the `upload()` method, with the following parameters:

- `display_name`: The human readable name for the `Model` resource.
- `artifact_uri`: The Cloud Storage location of the model artifacts.
- `serving_container_image_uri`: The serving container image to use when the model is deployed to a `Vertex AI Endpoint` resource.
- `explanation_parameters`: The parameters to configure explaining for the model's predictions.
- `explanation_metadata`: The metadata describing the model's input and output for explanation.
- `sync`: Whether to wait for the process to complete, or return immediately (async).

Learn more about [Import models to Vertex AI](https://cloud.google.com/vertex-ai/docs/model-registry/import-model).

In [None]:
model = aiplatform.Model.upload(
    display_name="churn",
    artifact_uri=MODEL_ARTIFACT_URI,
    serving_container_image_uri=DEPLOY_IMAGE,
    explanation_parameters=explanation_parameters,
    explanation_metadata=explanation_metadata,
    sync=True,
)

print(model)

### Deploy the `Vertex AI Model` resource to a `Vertex AI Endpoint` resource

Next, you deploy your `Vertex AI Model` resource to a `Vertex AI Endpoint` resource using the `deploy()` method, with the following parameters:

- `deployed_model_display_name`: The human readable name for the deployed model.
- `machine_type`: The machine type for each VM node instance.
- `min_replica_count`: The minimum number of nodes to provision for auto-scaling.
- `max_replica_count`: The maximum number of nodes to provision for auto-scaling.
- `accelerator_type`: The type, if any, of GPU accelerators per provisioned node.
- `accelerator_count`: The number, if any, of GPU accelerators per provisioned node.

Learn more about [Deploy a model using Vertex AI](https://cloud.google.com/vertex-ai/docs/predictions/deploy-model-api).

In [None]:
MIN_NODES = 1
MAX_NODES = 1

if GPU:
    endpoint = model.deploy(
        deployed_model_display_name="churn",
        machine_type=DEPLOY_COMPUTE,
        min_replica_count=MIN_NODES,
        max_replica_count=MAX_NODES,
        accelerator_type=DEPLOY_GPU.name,
        accelerator_count=DEPLOY_NGPU,
    )
else:
    endpoint = model.deploy(
        deployed_model_display_name="churn",
        machine_type=DEPLOY_COMPUTE,
        min_replica_count=MIN_NODES,
        max_replica_count=MAX_NODES,
    )

## Configure a monitoring job

Configuring the monitoring job consists of the following specifications:

- `alert_config`: The email address(es) to send monitoring alerts to.
- `schedule_config`: The time window to analyze predictions.
- `logging_sampling_strategy`: The rate for sampling prediction requests. 
- `drift_config`: The features and drift thresholds to monitor.
- `skew_config`: The features and skew thresholds to monitor.

Learn more about [Monitor feature skew and drift](https://cloud.google.com/vertex-ai/docs/model-monitoring/using-model-monitoring).

### Configure the alerting specification

First, you configure the `alerting_config` specification with the following settings:

- `user_emails`: A list of one or more email addresses to send alerts to.
- `enable_logging`: Stream detected anomalies to Cloud Logging. Default is False.

Learn more about [Configure alerts for model monitoring jobs](https://cloud.google.com/vertex-ai/docs/model-monitoring/using-model-monitoring#monitor-job).

In [None]:
# Create alerting configuration.
alerting_config = model_monitoring.EmailAlertConfig(
    user_emails=[USER_EMAIL], enable_logging=True
)

### Configure the monitoring interval specification

Next, you configure the `schedule_config` specification with the following settings:

- `monitor_interval`: Sets the model monitoring job scheduling interval in hours. The minimum interval is 1 hour.

*Note:* The REST API specifies the unit in seconds.

In [None]:
# Monitoring Interval
MONITOR_INTERVAL = 1  # @param {type:"number"}

# Create schedule configuration
schedule_config = model_monitoring.ScheduleConfig(monitor_interval=MONITOR_INTERVAL)
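
Because the SDK takes the interval in hours while the REST API specifies it in seconds, a small conversion helper can make the unit explicit if you switch between the two. This is a sketch only; the function name `monitor_interval_seconds` is hypothetical, and it enforces the 1 hour minimum stated above.

```python
def monitor_interval_seconds(hours: int) -> int:
    """Convert a monitoring interval in hours to the equivalent number of seconds."""
    if hours < 1:
        raise ValueError("The minimum monitoring interval is 1 hour.")
    return hours * 3600


# The 1 hour interval used in this tutorial, expressed in seconds.
interval_seconds = monitor_interval_seconds(1)
print(interval_seconds)
```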

### Configure the sampling specification

Next, you configure the `logging_sampling_strategy` specification with the following settings:

- `sample_rate`: The rate, as a fraction between 0 and 1, at which to randomly sample predictions for monitoring. Selected samples are logged to a BigQuery table.


In [None]:
# Sampling rate (optional, default=.8)
SAMPLE_RATE = 0.5  # @param {type:"number"}

# Create sampling configuration
logging_sampling_strategy = model_monitoring.RandomSampleConfig(sample_rate=SAMPLE_RATE)
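
To build intuition for what a sample rate of 0.5 means, the sketch below simulates random sampling locally. It assumes each request is kept independently with probability `sample_rate`. How the service implements sampling internally is not described in this tutorial, so treat this as an illustration rather than a description of the service. The function name `sample_requests` is hypothetical.

```python
import random


def sample_requests(requests, sample_rate, seed=None):
    """Keep each request independently with probability `sample_rate`."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1.")
    rng = random.Random(seed)  # Seeded generator so the simulation is repeatable.
    return [request for request in requests if rng.random() < sample_rate]


# Simulate 10,000 prediction requests sampled at the rate used in this tutorial.
sampled = sample_requests(range(10_000), sample_rate=0.5, seed=0)
print(f"Logged {len(sampled)} of 10000 simulated requests")
```

With a rate of 0.5, roughly half of the simulated requests are kept. A lower rate reduces the amount of logged data at the price of a noisier estimate of the serving distribution.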

### Configure the drift detection specification

Next, you configure the `drift_config` specification with the following settings:

- `drift_thresholds`: A dictionary of key/value pairs where the keys are the input features to monitor for feature input drift. The value is the detection threshold. When not specified, the default drift threshold for a feature is 0.3 (30%).
- `attribute_drift_thresholds`: A dictionary of key/value pairs where the keys are the input features to monitor for feature attribution drift. The value is the detection threshold. When not specified, the default drift threshold for a feature is 0.3 (30%).

*Note:* Enabling drift detection for either feature inputs or feature attributions is optional.

In [None]:
DRIFT_THRESHOLD_VALUE = 0.05
ATTRIBUTION_DRIFT_THRESHOLD_VALUE = 0.05

DRIFT_THRESHOLDS = {
    "country": DRIFT_THRESHOLD_VALUE,
    "cnt_user_engagement": DRIFT_THRESHOLD_VALUE,
}

ATTRIBUTION_DRIFT_THRESHOLDS = {
    "country": ATTRIBUTION_DRIFT_THRESHOLD_VALUE,
    "cnt_user_engagement": ATTRIBUTION_DRIFT_THRESHOLD_VALUE,
}

drift_config = model_monitoring.DriftDetectionConfig(
    drift_thresholds=DRIFT_THRESHOLDS,
    attribute_drift_thresholds=ATTRIBUTION_DRIFT_THRESHOLDS,
)
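
To make the meaning of a drift threshold more concrete, the sketch below compares the distribution of a categorical feature across two time windows and flags drift when the distance exceeds the threshold. It assumes the L-infinity distance, which the Model Monitoring documentation describes for categorical features. The window contents and country codes are made up for illustration, and the helper names are hypothetical.

```python
from collections import Counter


def categorical_distribution(values):
    """Return the relative frequency of each distinct value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def l_infinity_distance(p, q):
    """Largest absolute difference in frequency over all categories."""
    categories = set(p) | set(q)
    return max(
        (abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories), default=0.0
    )


# Made-up values of the `country` feature in two consecutive monitoring windows.
previous_window = ["US"] * 60 + ["IN"] * 30 + ["BR"] * 10
current_window = ["US"] * 50 + ["IN"] * 30 + ["BR"] * 20

distance = l_infinity_distance(
    categorical_distribution(previous_window),
    categorical_distribution(current_window),
)
drift_detected = distance > 0.05  # Same value as DRIFT_THRESHOLD_VALUE above.
print(distance, drift_detected)
```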

### Configure the skew detection specification

Next, you configure the `skew_config` specification with the following settings:

- `data_source`: The source of the original training dataset. The format of the source defaults to a BigQuery table. For any other format, the data must be in a Cloud Storage location and the setting `data_format` must be set to one of the following values:
  - `csv`
  - `jsonl`
  - `tf-record`
- `skew_thresholds`: A dictionary of key/value pairs where the keys are the input features to monitor for feature input skew. The value is the detection threshold. When not specified, the default skew threshold for a feature is 0.3 (30%).
- `attribute_skew_thresholds`: A dictionary of key/value pairs where the keys are the input features to monitor for feature attribution skew. The value is the detection threshold. When not specified, the default skew threshold for a feature is 0.3 (30%).
- `target_field`: The target label for the training dataset.

*Note:* Enabling skew detection is optional.

In [None]:
# URI to training dataset.
DATASET_BQ_URI = "bq://mco-mm.bqmlga4.train"  # @param {type:"string"}
# Prediction target column name in training dataset.
TARGET = "churned"

SKEW_THRESHOLD_VALUE = 0.5

SKEW_THRESHOLDS = {
    "country": SKEW_THRESHOLD_VALUE,
    "cnt_user_engagement": SKEW_THRESHOLD_VALUE,
}

ATTRIBUTE_SKEW_THRESHOLDS = {
    "country": SKEW_THRESHOLD_VALUE,
    "cnt_user_engagement": SKEW_THRESHOLD_VALUE,
}

skew_config = model_monitoring.SkewDetectionConfig(
    data_source=DATASET_BQ_URI,
    skew_thresholds=SKEW_THRESHOLDS,
    attribute_skew_thresholds=ATTRIBUTE_SKEW_THRESHOLDS,
    target_field=TARGET,
)
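
For a numerical feature, skew is a distance between the training baseline and the serving distribution. The sketch below bins both sets of values and computes the Jensen-Shannon divergence, which the Model Monitoring documentation describes for numerical features. The bin edges, the sample values and the helper names are assumptions made for illustration. The service chooses its own binning.

```python
import math


def histogram(values, bin_edges):
    """Normalized histogram; values outside the edges go to the first or last bin."""
    if not values:
        raise ValueError("values must not be empty.")
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        index = 0
        while index < len(counts) - 1 and value >= bin_edges[index + 1]:
            index += 1
        counts[index] += 1
    return [count / len(values) for count in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two histograms, between 0 and 1 in base 2."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Made-up values of `cnt_user_engagement` in training and in serving.
bin_edges = [0, 10, 20, 30]
training_values = [1, 2, 3, 12, 15, 25]
serving_values = [11, 14, 22, 27, 28, 29]

skew = jensen_shannon_divergence(
    histogram(training_values, bin_edges), histogram(serving_values, bin_edges)
)
skew_detected = skew > 0.5  # Same value as SKEW_THRESHOLD_VALUE above.
print(skew, skew_detected)
```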

### Assemble the objective specification

Finally, you assemble the objective specification `objective_config` with the following settings:

- `skew_detection_config`: (Optional) The specification for the skew detection configuration.
- `drift_detection_config`: (Optional) The specification for the drift detection configuration.
- `explanation_config`: (Optional) The specification for explanations when enabling monitoring for feature attributions.

In [None]:
explanation_config = model_monitoring.ExplanationConfig()

objective_config = model_monitoring.ObjectiveConfig(
    skew_detection_config=skew_config,
    drift_detection_config=drift_config,
    explanation_config=explanation_config,
)

### Create the monitoring job

You create a monitoring job, with your monitoring specifications, using the `aiplatform.ModelDeploymentMonitoringJob.create()` method, with the following parameters:

- `display_name`: The human readable name for the monitoring job.
- `project`: The project ID.
- `location`: The region.
- `endpoint`: The fully qualified resource name of the `Vertex AI Endpoint` to enable monitoring.
- `logging_sampling_strategy`: The specification for the sampling configuration.
- `schedule_config`: The specification for the scheduling configuration.
- `alert_config`: The specification for the alerting configuration.
- `objective_configs`: The specification for the objectives configuration.

In [None]:
monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="churn",
    project=PROJECT_ID,
    location=REGION,
    endpoint=endpoint,
    logging_sampling_strategy=logging_sampling_strategy,
    schedule_config=schedule_config,
    alert_config=alerting_config,
    objective_configs=objective_config,
)

print(monitoring_job.gca_resource)

#### Email notification of the monitoring job.

An email notification is sent to the email address in the alerting configuration, notifying that the model monitoring job is now enabled.

The email reads similar to the following:

<blockquote>
Hello Vertex AI Customer,

You are receiving this mail because you are using the Vertex AI Model Monitoring service.
This mail is to inform you that we received your request to set up drift or skew detection for the Prediction Endpoint listed below. Starting from now, incoming prediction requests will be sampled and logged for analysis.
Raw requests and responses will be collected from prediction service and saved in bq://[your-project-id].model_deployment_monitoring_[endpoint-id].serving_predict .
</blockquote>

#### Monitoring Job State

After you start the `Vertex AI Model Monitoring` job, it stays in a `PENDING` state until the `input schema` and the skew distribution baselines are calculated. These steps happen sequentially. In this example, where the `input schema` is generated automatically, the job stays in a `PENDING` state until 1000 prediction requests (discussed below) have been sent.

Once the `input schema` has been generated, a batch job is initiated to generate the distribution baseline from the training data. Again, the job stays in a `PENDING` state until the baseline distribution is calculated.

Once the baseline distribution is generated, the monitoring job enters the `OFFLINE` state. At each monitoring interval (for example, once an hour), the job enters the `RUNNING` state while it analyzes the sampled data. When the analysis completes, the job returns to the `OFFLINE` state and awaits the next scheduled analysis.

In [None]:
jobs = monitoring_job.list(filter="display_name=churn")
job = jobs[0]
print(job.state)
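
Since the job moves through `PENDING`, `OFFLINE` and `RUNNING`, it can be handy to wait for a particular state instead of checking by hand. The following is a generic polling sketch, not an SDK feature: `wait_for_state` is a hypothetical helper that takes any callable returning the current state, and the states here are plain strings fed from a fake sequence. With a real job you would pass a callable that re-reads the job's state. How you fetch it is left as an assumption.

```python
import time


def wait_for_state(get_state, target_states, timeout_s=3600, poll_s=60, sleep=time.sleep):
    """Poll `get_state()` until it returns one of `target_states` or the timeout elapses."""
    waited = 0
    while True:
        state = get_state()
        if state in target_states:
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"Still in state {state!r} after {waited} seconds.")
        sleep(poll_s)
        waited += poll_s


# Simulated state sequence standing in for a real monitoring job.
fake_states = iter(["PENDING", "PENDING", "OFFLINE"])
final_state = wait_for_state(
    lambda: next(fake_states),
    target_states={"OFFLINE", "RUNNING"},
    poll_s=0,
    sleep=lambda _seconds: None,  # Skip real sleeping in this simulation.
)
print(final_state)
```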

## Initialize the parsing for automatically generating the input schema

After your `Endpoint` receives 1000 prediction requests, the monitoring service automatically parses them and creates the `input schema`.

### Create the 1000 instance data

In this example, the first 1000 entries in the BigQuery training data are used as the first 1000 prediction requests. 

*Note:* In this context, each instance is a prediction request. In other words, sending 1000 prediction requests of a single instance is the same as sending a single prediction request with 1000 instances.

In [None]:
# Reference the training table; [5:] strips the "bq://" prefix from the URI.
table = bigquery.TableReference.from_string(DATASET_BQ_URI[5:])

# Read the first 1000 rows of the table.
rows = bqclient.list_rows(table, max_results=1000)

instances = []
for row in rows:
    instance = {}
    for key, value in row.items():
        if key == TARGET:
            continue
        if value is None:
            value = ""
        instance[key] = value
    instances.append(instance)

print(len(instances))
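
The loop above turns each BigQuery row into a prediction instance by dropping the target column and replacing missing values with empty strings. The same transformation is sketched below as a standalone function that works on plain dictionaries, which makes it easy to try without a BigQuery connection. The function name `row_to_instance` and the sample row values are illustrative.

```python
def row_to_instance(row: dict, target: str) -> dict:
    """Drop the target column and replace missing values with empty strings."""
    return {
        key: ("" if value is None else value)
        for key, value in row.items()
        if key != target
    }


# Made-up row using feature names that appear in this tutorial.
sample_row = {"country": None, "cnt_user_engagement": 12, "churned": 1}
sample_instance = row_to_instance(sample_row, target="churned")
print(sample_instance)
```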

### Make the initial prediction request

Next, you send the 1000 instances in a single prediction request to your `Vertex AI Endpoint` resource using the `predict()` method.

In [None]:
response = endpoint.predict(instances=instances)

prediction = response[0]

# print the prediction for the first instance
print(prediction[0])
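
Because each instance counts as a prediction request for schema generation, the 1000 instances could also be sent in several smaller requests, for example if a single payload becomes too large. The sketch below shows one way to split a list into batches. The helper names are hypothetical and the `send` callable is a stand-in. With the endpoint from this tutorial it could be something like `lambda batch: endpoint.predict(instances=batch)[0]`, which is an assumption that is not exercised here.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1.")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


def send_in_batches(instances, send, batch_size=100):
    """Call `send` once per batch and collect all returned predictions."""
    predictions = []
    for batch in chunked(instances, batch_size):
        predictions.extend(send(batch))
    return predictions


# Fake sender that returns one placeholder prediction per instance.
calls = []


def fake_send(batch):
    calls.append(len(batch))
    return [0.5] * len(batch)


fake_instances = [{"cnt_user_engagement": i} for i in range(250)]
all_predictions = send_in_batches(fake_instances, fake_send, batch_size=100)
print(len(all_predictions), calls)
```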

### Automatic generation of the input schema

After the model monitoring service receives 1000 prediction request instances, it starts analyzing them to automatically generate an `input schema` for the feature inputs.

### Automatic generation of the baseline distribution

After the `input schema` is generated, the monitoring service creates a batch job to analyze the training data to determine the baseline distribution. 

In [None]:
# Pause a bit for the baseline distribution to be calculated
if os.getenv("IS_TESTING"):
    import time

    time.sleep(120)

#### The location of the BigQuery table for monitoring

The BigQuery table for logging the sampled requests is located at:

    <PROJECT_ID>.model_deployment_monitoring_<ENDPOINT_ID>.serving_predict

where `<ENDPOINT_ID>` is the numerical identifier for the `Vertex AI Endpoint` resource.

In [None]:
ENDPOINT_ID = endpoint.resource_name.split("/")[-1]

BQ_MON_TABLE = f"{PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}.serving_predict"

table = bigquery.TableReference.from_string(BQ_MON_TABLE)
bq_table = bqclient.get_table(table)

print(bq_table)
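
The table name is derived purely from the project ID and the last segment of the endpoint resource name. The sketch below isolates that string manipulation so you can check the result before querying BigQuery. The function name `monitoring_table_id` and the example resource name are illustrative.

```python
def monitoring_table_id(project_id: str, endpoint_resource_name: str) -> str:
    """Build the monitoring table ID from an endpoint resource name."""
    # The endpoint ID is the last path segment of the resource name.
    endpoint_id = endpoint_resource_name.split("/")[-1]
    return f"{project_id}.model_deployment_monitoring_{endpoint_id}.serving_predict"


# Hypothetical project and endpoint identifiers.
example_table_id = monitoring_table_id(
    "my-project",
    "projects/123456789012/locations/us-central1/endpoints/9876543210",
)
print(example_table_id)
```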

### Pause and resume the monitoring job

You can pause and resume a monitoring job with the methods `pause()` and `resume()`, respectively.

In [None]:
monitoring_job.pause()
monitoring_job.resume()

### List monitoring jobs

Next, you can get a list of all monitoring jobs using the `list()` method.

In [None]:
monitoring_jobs = aiplatform.ModelDeploymentMonitoringJob.list()
print(monitoring_jobs)

### List monitoring jobs by a filter

Alternatively, you can use a `filter` parameter to list a subset of jobs. In this example, you filter the list by the monitoring job's display name.

In [None]:
monitoring_jobs = aiplatform.ModelDeploymentMonitoringJob.list(
    filter="display_name=churn"
)

print(monitoring_jobs[0].gca_resource)

### Delete the monitoring job

You can delete the monitoring job using the `delete()` method. 

*Note:* You cannot delete a monitoring job while it is in the `RUNNING` state. You must pause the job first.

In [None]:
monitoring_job.pause()
monitoring_job.delete()

### Delete the logged sampled data

In [None]:
# Delete the monitoring logged data BigQuery dataset

! bq rm -r -f {PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}

### Create a monitoring job with a predefined input schema

Next, you create another monitoring job. This time you will load a predefined `input schema`. Once loaded, the monitoring service will use this `input schema` instead of automatically generating one from the first 1000 prediction instances.

#### Create the predefined input schema

The predefined `input schema` is specified as a YAML file. In this example, you retrieve the BigQuery schema for the training data, which includes the feature names and data types, to generate the YAML specification. The predefined `input schema` must be uploaded to a Cloud Storage location.

Learn more about [Custom instance schemas for parsing input](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview#custom-input-schemas).

In [None]:
# Get the BQ table

table = bigquery.TableReference.from_string(DATASET_BQ_URI[5:])
bq_table = bqclient.get_table(table)

yaml = """type: object
properties:
"""

schema = bq_table.schema
for feature in schema:
    if feature.name == TARGET:
        continue
    if feature.field_type == "STRING":
        f_type = "string"
    else:
        f_type = "integer"
    yaml += f"""  {feature.name}:
    type: {f_type}
"""

yaml += """required:
"""
for feature in schema:
    if feature.name == TARGET:
        continue
    yaml += f"""- {feature.name}
"""

print(yaml)

with open("schema.yaml", "w") as f:
    f.write(yaml)

! gsutil cp schema.yaml {BUCKET_URI}/schema.yaml
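
The cell above treats every non-string column as an integer, which fits this dataset. If your own training table also contains floating point or boolean columns, a type lookup is one way to extend the idea. The sketch below is an assumption-laden variant: the mapping `BQ_TO_SCHEMA_TYPE`, the fallback to `string` for unknown types, and the function name `build_schema_yaml` are all illustrative, and the function takes plain `(name, type)` pairs instead of BigQuery schema objects.

```python
# Assumed mapping from BigQuery column types to schema types.
BQ_TO_SCHEMA_TYPE = {
    "STRING": "string",
    "INTEGER": "integer",
    "INT64": "integer",
    "FLOAT": "number",
    "FLOAT64": "number",
    "BOOLEAN": "boolean",
    "BOOL": "boolean",
}


def build_schema_yaml(fields, target):
    """Render (name, bigquery_type) pairs as a YAML input schema, skipping the target."""
    lines = ["type: object", "properties:"]
    names = []
    for name, bq_type in fields:
        if name == target:
            continue
        schema_type = BQ_TO_SCHEMA_TYPE.get(bq_type.upper(), "string")
        lines.append(f"  {name}:")
        lines.append(f"    type: {schema_type}")
        names.append(name)
    lines.append("required:")
    lines.extend(f"- {name}" for name in names)
    return "\n".join(lines) + "\n"


example_fields = [
    ("country", "STRING"),
    ("cnt_user_engagement", "INTEGER"),
    ("churned", "INTEGER"),
]
example_yaml = build_schema_yaml(example_fields, target="churned")
print(example_yaml)
```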

### Create the monitoring job

Finally, you create the monitoring job using the `create()` method, with the following additional parameter:

- `analysis_instance_schema_uri`: The location of the YAML file containing the `input schema`.

In [None]:
monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="churn",
    project=PROJECT_ID,
    location=REGION,
    endpoint=endpoint,
    logging_sampling_strategy=logging_sampling_strategy,
    schedule_config=schedule_config,
    alert_config=alerting_config,
    objective_configs=objective_config,
    analysis_instance_schema_uri=f"{BUCKET_URI}/schema.yaml",
)

print(monitoring_job)

### Delete the monitoring job

You can delete the monitoring job using the `delete()` method. 

In [None]:
monitoring_job.pause()
monitoring_job.delete()

#### Undeploy and delete the `Vertex AI Endpoint` resource

Your `Vertex AI Endpoint` resource can be deleted using the `delete()` method. Before deleting the endpoint, you must first undeploy any model that is deployed to it.

In [None]:
endpoint.undeploy_all()
endpoint.delete()

#### Delete the `Vertex AI Model` resource

Your `Vertex AI Model` resource can be deleted using the `delete()` method.

In [None]:
model.delete()

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial.

In [None]:
delete_bucket = False

if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil rm -rf {BUCKET_URI}

! rm -f schema.yaml

! bq rm -r -f {PROJECT_ID}.model_deployment_monitoring_{ENDPOINT_ID}