In [None]:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# AutoMLOps - Customer Churn Model Monitoring Example

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/automlops/blob/main/examples/inference/01_customer_churn_model_monitoring.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/automlops/blob/main/examples/inference/01_customer_churn_model_monitoring.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/automlops/main/examples/inference/01_customer_churn_model_monitoring.ipynb">
        <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>
<br/><br/><br/>

# Overview

This tutorial explores using AutoMLOps for model monitoring on a pre-trained customer churn model. The tutorial will walk you through how to use AutoMLOps to define, create and run inference pipelines, and create a model monitoring job around a Vertex AI Endpoint.

### What is Model Monitoring?

Modern applications rely on a well-established set of capabilities to monitor the health of their services. Examples include:

* software versioning
* rigorous deployment processes
* event logging
* alerting/notification of situations requiring intervention
* on-demand and automated diagnostic tracing
* automated performance and functional testing

You should be able to manage your ML services with the same degree of power and flexibility with which you can manage your applications. That's what MLOps is all about - managing ML services with the best practices Google and the broader computing industry have learned from generations of experience deploying well engineered, reliable, and scalable services.

Model monitoring is only one piece of the MLOps puzzle - it helps answer the following questions:

* How well do recent service requests match the training data used to build your model? This is called **training-serving skew**.
* How significantly are service requests evolving over time? This is called **drift detection**.
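
As a rough illustration of both questions, the sketch below compares two samples of a categorical feature using the L-infinity distance (the largest absolute difference in category frequency), which is the statistic the Vertex AI Model Monitoring documentation describes for categorical features. The helper names and the sample data are made up for this example; the managed service computes its own statistics, so treat this as a mental model rather than the actual implementation.

```python
from collections import Counter


def category_frequencies(values):
    """Return the share of each category in a list of values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def l_infinity_distance(baseline, current):
    """Largest absolute difference in frequency across all categories."""
    categories = set(baseline) | set(current)
    return max(
        abs(baseline.get(c, 0.0) - current.get(c, 0.0)) for c in categories
    )


# Hypothetical samples of the `country` feature.
training_countries = ['Denmark'] * 50 + ['Germany'] * 50
serving_countries = ['Denmark'] * 80 + ['Germany'] * 20

skew_score = l_infinity_distance(
    category_frequencies(training_countries),
    category_frequencies(serving_countries),
)
print(f'skew score for country: {skew_score:.2f}')
```

Comparing serving data against training data gives a skew score; comparing one window of serving data against an earlier window gives a drift score.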

[Vertex Explainable AI](https://cloud.google.com/vertex-ai/docs/explainable-ai/overview) adds another facet to model monitoring, which we call feature attribution monitoring. Explainable AI enables you to understand the relative contribution of each feature to a resulting prediction. In essence, it assesses the magnitude of each feature's influence.

If production traffic differs from training data, or varies substantially over time, **either in terms of model predictions or feature attributions**, that's likely to impact the quality of the answers your model produces. When that happens, you'd like to be alerted automatically and promptly, so that **you can anticipate problems before they affect your customer experiences or your revenue streams**.

Learn more about [Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring).

### Objective

In this tutorial, you learn to use the `Vertex AI Model Monitoring` service to detect drift and anomalies in prediction requests from a deployed `Vertex AI Model` resource. You will then learn how to create and run MLOps pipelines integrated with CI/CD. The pipeline goes through the following steps:

1. `deploy_and_test_model`: Uploads a pretrained model and deploys it to an Endpoint, then runs prediction and explainability tests.
2. `create_monitoring_job`: Creates a model monitoring job that sends alerts to the specified email addresses.
3. `test_monitoring_job`: Sends test prediction requests to the Endpoint so that the monitoring job has traffic to analyze.

# Prerequisites

In order to use AutoMLOps, the following are required:

- Python 3.7 - 3.10
- [Google Cloud SDK 407.0.0](https://cloud.google.com/sdk/gcloud/reference)
- [beta 2022.10.21](https://cloud.google.com/sdk/gcloud/reference/beta)
- `git` installed
- `git` logged-in:
```
  git config --global user.email "you@example.com"
  git config --global user.name "Your Name"
```
- [Application Default Credentials (ADC)](https://cloud.google.com/docs/authentication/provide-credentials-adc) are set up. This can be done through the following commands:
```
gcloud auth application-default login
gcloud config set account <account@example.com>
```
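
If you want to check part of this list from Python before going further, a minimal sketch follows. The function names are hypothetical, and the check only covers the Python version range and whether `git` and `gcloud` are on your `PATH`; it does not verify SDK versions or credentials.

```python
import shutil
import sys


def python_version_supported(version_info=None):
    """Return True when the (major, minor) version is within 3.7 - 3.10."""
    major, minor = (version_info or sys.version_info)[:2]
    return (3, 7) <= (major, minor) <= (3, 10)


def missing_tools(tools=('git', 'gcloud')):
    """Return the command-line tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print('Python version supported:', python_version_supported())
print('Missing tools:', missing_tools())
```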

# APIs & IAM
Based on the user options selection, AutoMLOps will enable up to the following APIs during the provision step:
- [aiplatform.googleapis.com](https://cloud.google.com/vertex-ai/docs/reference/rest)
- [artifactregistry.googleapis.com](https://cloud.google.com/artifact-registry/docs/reference/rest)
- [cloudbuild.googleapis.com](https://cloud.google.com/build/docs/api/reference/rest)
- [cloudfunctions.googleapis.com](https://cloud.google.com/functions/docs/reference/rest)
- [cloudresourcemanager.googleapis.com](https://cloud.google.com/resource-manager/reference/rest)
- [cloudscheduler.googleapis.com](https://cloud.google.com/scheduler/docs/reference/rest)
- [compute.googleapis.com](https://cloud.google.com/compute/docs/reference/rest/v1)
- [iam.googleapis.com](https://cloud.google.com/iam/docs/reference/rest)
- [iamcredentials.googleapis.com](https://cloud.google.com/iam/docs/reference/credentials/rest)
- [pubsub.googleapis.com](https://cloud.google.com/pubsub/docs/reference/rest)
- [run.googleapis.com](https://cloud.google.com/run/docs/reference/rest)
- [storage.googleapis.com](https://cloud.google.com/storage/docs/apis)
- [sourcerepo.googleapis.com](https://cloud.google.com/source-repositories/docs/reference/rest)


AutoMLOps will create the following service account and update [IAM permissions](https://cloud.google.com/iam/docs/understanding-roles) during the provision step:
1. Pipeline Runner Service Account (defaults to: vertex-pipelines@PROJECT_ID.iam.gserviceaccount.com). Roles added:
- roles/aiplatform.user
- roles/artifactregistry.reader
- roles/bigquery.user
- roles/bigquery.dataEditor
- roles/iam.serviceAccountUser
- roles/storage.admin
- roles/cloudfunctions.admin

# User Guide

For a user guide, please view these [slides](../../AutoMLOps_User_Guide.pdf).

# Costs

This tutorial uses billable components of Google Cloud:
- Vertex AI
- Artifact Registry
- Cloud Storage
- Cloud Source Repository
- Cloud Build
- Cloud Run
- Cloud Scheduler
- Cloud Pub/Sub

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.

# Ground-rules for using AutoMLOps
1. Do not use variables, functions, code, etc. not defined within the scope of a custom component. These custom components will become containers and will have no reference to the out of scope code.
2. Import statements and helper functions must be added inside the function. Provide parameter type hints.
3. Test each of your components for accuracy and correctness before running them using AutoMLOps. We cannot fix bugs automatically; bugs are much more difficult to fix once they are made into pipelines.
4. If you are using Kubeflow, be sure to define all the requirements needed to run the custom component - it can be easy to leave out packages which will cause the container to fail when running within a pipeline.  
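
To make the first three rules concrete, here is a minimal sketch of a function written in the self-contained style they describe. The function and its logic are hypothetical and unrelated to the churn model; in a notebook it would additionally carry the `@AutoMLOps.component` decorator, which is left out here so the example runs without AutoMLOps installed.

```python
def mean_exceeds_threshold(values_csv: str, threshold: float) -> bool:
    """Return True when the mean of comma-separated numbers exceeds threshold.

    Parameters use primitive types and carry type hints (rule 2).
    """
    # Imports live inside the function body, not at module level (rule 2).
    import statistics

    # Helpers are nested so nothing outside the function is referenced (rule 1).
    def parse_values(text: str) -> list:
        return [float(item) for item in text.split(',') if item.strip()]

    return statistics.mean(parse_values(values_csv)) > threshold


# Test the function locally before turning it into a component (rule 3).
print(mean_exceeds_threshold('6, 5, 3, 2', threshold=3.5))
```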


# Model

This tutorial uses a pre-trained model whose artifacts are stored in a public Cloud Storage bucket. The model predicts, for an online gaming site, the probability that a player will churn, i.e. stop being an active player.

The model you use in this notebook is based on [this blog post](https://cloud.google.com/blog/topics/developers-practitioners/churn-prediction-game-developers-using-google-analytics-4-ga4-and-bigquery-ml). The idea behind this model is that your company has extensive log data describing how your game users have interacted with the site. The raw data contains the following categories of information:

- identity - unique player identity numbers
- demographic features - information about the player, such as the geographic region in which a player is located
- behavioral features - counts of the number of times a player has triggered certain game events, such as reaching a new level
- churn propensity - this is the label or target feature; it provides an estimated probability that this player will churn, i.e. stop being an active player.

The blog article referenced above explains how to use BigQuery to store the raw data, pre-process the data for machine learning, and train the corresponding model. Because this notebook focuses on model monitoring, rather than training models, you're going to reuse a pre-trained version of this model, which has been exported to Cloud Storage.

# Dataset
For training data, this tutorial uses the [Google Analytics 4 (GA4)](https://cloud.google.com/blog/topics/developers-practitioners/churn-prediction-game-developers-using-google-analytics-4-ga4-and-bigquery-ml) BQML train dataset, a publicly available dataset that contains a sample of obfuscated BigQuery event export data from Google Analytics 4's standard web ecommerce implementation on the [Google Merchandise Store](https://shop.googlemerchandisestore.com/). This is the same public BigQuery table that was used to train the pre-trained model.

# Setup Git
Set up your git configuration below.

In [None]:
!git config --global user.email 'you@example.com'
!git config --global user.name 'Your Name'

# Install AutoMLOps

Install AutoMLOps from [PyPI](https://pypi.org/project/google-cloud-automlops/), or locally by cloning the repo and running `pip install .`

In [None]:
!pip3 install google-cloud-automlops --user

# Restart the kernel
Once you've installed the AutoMLOps package, you need to restart the notebook kernel so it can find the package.

**Note: Once this cell has finished running, continue on. You do not need to re-run any of the cells above.**

In [None]:
import os

if not os.getenv('IS_TESTING'):
    # Automatically restart kernel after installs
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

# Set your project ID
Set your project ID below. If you don't know your project ID, leave the field blank and the following cells may be able to find it.

In [1]:
PROJECT_ID = '[your-project-id]'  # @param {type:"string"}

In [2]:
if PROJECT_ID == '' or PROJECT_ID is None or PROJECT_ID == '[your-project-id]':
    # Get your GCP project id from gcloud
    shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print('Project ID:', PROJECT_ID)

Project ID: automlops-sandbox


In [3]:
! gcloud config set project $PROJECT_ID

Updated property [core/project].


Set your `MODEL_ID` below:

In [None]:
MODEL_ID = 'GA4-BQML-Monitoring'

# Monitoring Emails

Set your user email address to receive monitoring alerts.

In [4]:
ALERT_EMAILS = ['noreply@google.com']  # Update with your emails

# 1. AutoMLOps Pipeline
This workflow will define and generate a pipeline using AutoMLOps. `generate()` will create all the necessary files but not run them. `go()` will create all the necessary files and resources, push the code to the source repo to trigger the build, and then submit a pipeline job to Vertex AI. Please see the [readme](https://github.com/GoogleCloudPlatform/automlops/blob/main/README.md) for more information.

## Import AutoMLOps

In [5]:
from google_cloud_automlops import AutoMLOps

## Clear the cache
`AutoMLOps.clear_cache` will remove previous instantiations of AutoMLOps components and pipelines. Use this function if you have previously defined a component that you no longer need.

In [6]:
AutoMLOps.clear_cache()

Cache cleared.


## Deploy and Test your model


The churn propensity model you use in this notebook has been trained in BigQuery ML and exported to a Cloud Storage bucket. Define a custom component for deploying this pretrained model and testing it for predictions and explainability. Import statements and helper functions must be added inside the function. Provide parameter type hints.

**Note: we currently only support python primitive types for component parameters. If you would like to use something more advanced, please use the Kubeflow spec instead.**
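
The explanation test in the component below ranks features by their attribution score before printing them. The same ranking step can be sketched on its own with plain Python; the attribution values here are invented for illustration and are not output from the churn model.

```python
# Hypothetical feature attributions, keyed by feature name.
feature_attributions = {
    'cnt_user_engagement': 0.12,
    'country': -0.03,
    'cnt_post_score': 0.05,
}

# Sort (feature, score) pairs from the lowest to the highest attribution.
ranked = sorted(feature_attributions.items(), key=lambda item: item[1])
for feature, score in ranked:
    print(score, feature)
```

Sorting the `(feature, score)` pairs directly keeps each score attached to its feature, which avoids maintaining two parallel lists.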

In [7]:
@AutoMLOps.component(
    packages_to_install=[
        'explainable_ai_sdk',
        'google-cloud-aiplatform'
    ]
)
def deploy_and_test_model(
    model_directory: str,
    project_id: str,
    region: str
):
    """Custom component that uploads a saved model from GCS to Vertex Model Registry
       and deploys the model to an endpoint for online prediction. Runs a prediction
       and explanation test as well.

    Args:
        model_directory: GS location of saved model.
        project_id: Project_id.
        region: Region.
    """
    from google.cloud import aiplatform
    from google.cloud.aiplatform.explain.metadata.tf.v2 import \
    saved_model_metadata_builder
    import pprint as pp

    aiplatform.init(project=project_id, location=region)

    MODEL_NAME = 'churn'
    IMAGE = 'us-docker.pkg.dev/cloud-aiplatform/prediction/tf2-cpu.2-5:latest'
    params = {'sampled_shapley_attribution': {'path_count': 10}}
    EXPLAIN_PARAMS = aiplatform.explain.ExplanationParameters(params)
    builder = saved_model_metadata_builder.SavedModelMetadataBuilder(
        model_path=model_directory, outputs_to_explain=['churned_probs']
    )
    EXPLAIN_META = builder.get_metadata_protobuf()
    DEFAULT_INPUT = {
        'cnt_ad_reward': 0,
        'cnt_challenge_a_friend': 0,
        'cnt_completed_5_levels': 1,
        'cnt_level_complete_quickplay': 3,
        'cnt_level_end_quickplay': 5,
        'cnt_level_reset_quickplay': 2,
        'cnt_level_start_quickplay': 6,
        'cnt_post_score': 34,
        'cnt_spend_virtual_currency': 0,
        'cnt_use_extra_steps': 0,
        'cnt_user_engagement': 120,
        'country': 'Denmark',
        'dayofweek': 3,
        'julianday': 254,
        'language': 'da-dk',
        'month': 9,
        'operating_system': 'IOS',
        'user_pseudo_id': '104B0770BAE16E8B53DF330C95881893',
    }

    model = aiplatform.Model.upload(
        display_name=MODEL_NAME,
        artifact_uri=model_directory,
        serving_container_image_uri=IMAGE,
        explanation_parameters=EXPLAIN_PARAMS,
        explanation_metadata=EXPLAIN_META,
        sync=True
    )

    endpoint = model.deploy(
        machine_type='n1-standard-4',
        deployed_model_display_name='deployed-churn-model')

    # Test predictions
    print('running prediction test...')
    try:
        resp = endpoint.predict([DEFAULT_INPUT])
        for prediction in resp.predictions:
            vals = prediction['churned_values']
            probs = prediction['churned_probs']
            # Print each predicted class alongside its probability
            for val, prob in zip(vals, probs):
                print(val, prob)
        pp.pprint(resp)
    except Exception as ex:
        print('prediction request failed', ex)

    # Test explanations
    print('\nrunning explanation test...')
    try:
        features = []
        scores = []
        resp = endpoint.explain([DEFAULT_INPUT])
        for i in resp.explanations:
            for j in i.attributions:
                for k in j.feature_attributions:
                    features.append(k)
                    scores.append(j.feature_attributions[k])
        features = [x for _, x in sorted(zip(scores, features))]
        scores = sorted(scores)
        for i in range(len(scores)):
            print(scores[i], features[i])
        pp.pprint(resp)
    except Exception as ex:
        print('explanation request failed', ex)

## Create Monitoring Job

Define a custom component for creating a model monitoring job around the previously deployed model. Import statements and helper functions must be added inside the function. Provide parameter type hints.

The following code uses the Google Python client library to translate your configuration settings into a programmatic request to start a model monitoring job. Instantiating a monitoring job can take some time. If everything looks good with your request, you'll get a successful API response. Then, you'll need to check your email to receive a notification that the job is running.

After a minute or two, you should receive an email at the addresses you configured above in `ALERT_EMAILS`. This email confirms successful deployment of your monitoring job. Here's a sample of what this email might look like:
<br>
<br>
<img src="https://storage.googleapis.com/mco-general/img/mm6.png" />
<br>
As your monitoring job collects data, measurements are stored in Cloud Storage and you are free to examine your data at any time. The "Statistics and Anomalies Root Path" in your job creation email specifies the location of your measurements in Cloud Storage. Browse that Cloud Storage URL to view the structure and content of the data files for your own monitoring job.

In [8]:
@AutoMLOps.component(
    packages_to_install=[
        'google-cloud-aiplatform'
    ]
)
def create_monitoring_job(
    alert_emails: list,
    cnt_user_engagement_threshold_value: float,
    country_threshold_value: float,
    data_source: str,
    log_sampling_rate: float,
    monitor_interval: int,
    project_id: str,
    region: str,
    target: str
):
    """Custom component that creates a model monitoring job on the given model.

    Args:
        alert_emails: List of emails to send monitoring alerts.
        cnt_user_engagement_threshold_value: Threshold value for the cnt_user_engagement feature.
        country_threshold_value: Threshold value for the country feature.
        data_source: BQ training data table.        
        log_sampling_rate: Sampling rate.
        monitor_interval: Monitoring interval in hours.
        project_id: Project_id.
        region: Region.
        target: Prediction target column name in training dataset.
    """
    from google.cloud import aiplatform
    from google.cloud.aiplatform import model_monitoring

    aiplatform.init(project=project_id, location=region)

    JOB_NAME = 'churn'
    SKEW_THRESHOLDS = {
        'country': country_threshold_value,
        'cnt_user_engagement': cnt_user_engagement_threshold_value,
    }
    DRIFT_THRESHOLDS = {
        'country': country_threshold_value,
        'cnt_user_engagement': cnt_user_engagement_threshold_value,
    }
    ATTRIB_SKEW_THRESHOLDS = {
        'country': country_threshold_value,
        'cnt_user_engagement': cnt_user_engagement_threshold_value,
    }
    ATTRIB_DRIFT_THRESHOLDS = {
        'country': country_threshold_value,
        'cnt_user_engagement': cnt_user_engagement_threshold_value,
    }

    skew_config = model_monitoring.SkewDetectionConfig(
        data_source=data_source,
        skew_thresholds=SKEW_THRESHOLDS,
        attribute_skew_thresholds=ATTRIB_SKEW_THRESHOLDS,
        target_field=target,
    )

    drift_config = model_monitoring.DriftDetectionConfig(
        drift_thresholds=DRIFT_THRESHOLDS,
        attribute_drift_thresholds=ATTRIB_DRIFT_THRESHOLDS,
    )

    explanation_config = model_monitoring.ExplanationConfig()
    objective_config = model_monitoring.ObjectiveConfig(
        skew_config, drift_config, explanation_config
    )

    # Create sampling configuration
    random_sampling = model_monitoring.RandomSampleConfig(sample_rate=log_sampling_rate)

    # Create schedule configuration
    schedule_config = model_monitoring.ScheduleConfig(monitor_interval=monitor_interval)

    # Create alerting configuration.
    alerting_config = model_monitoring.EmailAlertConfig(
        user_emails=alert_emails, enable_logging=True
    )

    endpoint = aiplatform.Endpoint.list(filter='display_name="churn_endpoint"')[0]
    # Create the monitoring job.
    job = aiplatform.ModelDeploymentMonitoringJob.create(
        display_name=JOB_NAME,
        logging_sampling_strategy=random_sampling,
        schedule_config=schedule_config,
        alert_config=alerting_config,
        objective_configs=objective_config,
        project=project_id,
        location=region,
        endpoint=endpoint,
    )

## Test your Monitoring Job

Send a first batch of test prediction requests. The model monitoring service will analyze the distribution of features and automatically create a baseline against which deviations are monitored. After your `Endpoint` receives 1000 prediction requests, the monitoring service will automatically parse them and create the `input schema`. In this example, the first 1000 entries in the BigQuery training data are used as the first 1000 prediction requests.

Define a custom component for testing the monitoring job. Import statements and helper functions must be added inside the function. Provide parameter type hints.
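
Before a training row can be sent as a prediction request, the label column has to be dropped and missing values replaced. The sketch below shows that conversion on plain dictionaries; the function name and the sample rows are hypothetical stand-ins for the BigQuery rows the component reads.

```python
def rows_to_instances(rows, target):
    """Convert row dictionaries into prediction instances.

    The label column is dropped and missing values become empty strings.
    """
    instances = []
    for row in rows:
        instance = {
            key: ('' if value is None else value)
            for key, value in row.items()
            if key != target
        }
        instances.append(instance)
    return instances


# Hypothetical rows standing in for BigQuery results.
sample_rows = [
    {'country': 'Denmark', 'cnt_user_engagement': 120, 'churned': 0},
    {'country': None, 'cnt_user_engagement': 15, 'churned': 1},
]
print(rows_to_instances(sample_rows, target='churned'))
```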

In [9]:
@AutoMLOps.component(
    packages_to_install=[
        'google-cloud-bigquery',
        'google-cloud-aiplatform'
    ]
)
def test_monitoring_job(
    data_source: str,
    project_id: str,
    region: str,
    target: str
):
    """Custom component that sends the first 1000 rows of the training data
       to the deployed endpoint as prediction requests, so that the monitoring
       job can build its baseline.

    Args:
        data_source: BQ training data table.
        project_id: Project_id.
        region: Region.
        target: Prediction target column name in training dataset.
    """
    import time

    from google.cloud import aiplatform
    from google.cloud import bigquery

    bq_client = bigquery.Client(project=project_id)
    # Strip the 'bq://' prefix and build a reference to the training table.
    table = bigquery.TableReference.from_string(data_source[5:])

    rows = bq_client.list_rows(table, max_results=1000)

    instances = []
    for row in rows:
        instance = {}
        for key, value in row.items():
            if key == target:
                continue
            if value is None:
                value = ""
            instance[key] = value
        instances.append(instance)

    print(len(instances))

    endpoint = aiplatform.Endpoint.list(filter='display_name="churn_endpoint"')[0]
    response = endpoint.predict(instances=instances)
    prediction = response[0]
    # print the predictions
    print(prediction)

    # Pause a bit for the baseline distribution to be calculated
    time.sleep(120)

## Define the Pipeline
Define your pipeline. You can optionally give the pipeline a name and description. Define the structure by listing the components to be called in your pipeline; use `.after` to specify the order of execution.
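
One way to picture what `.after` expresses is as a dependency graph that is resolved into an execution order. The sketch below uses the standard library's `graphlib` (Python 3.9+) on the three component names from this tutorial. It is only an illustration of the ordering idea and makes no claim about how Kubeflow Pipelines or AutoMLOps resolve dependencies internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key runs after the components listed in its value.
runs_after = {
    'deploy_and_test_model': set(),
    'create_monitoring_job': {'deploy_and_test_model'},
    'test_monitoring_job': {'create_monitoring_job'},
}

execution_order = list(TopologicalSorter(runs_after).static_order())
print(execution_order)
```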

In [10]:
@AutoMLOps.pipeline(
    name='automlops-monitoring-pipeline',
    description='This is an example model monitoring pipeline')
def pipeline(alert_emails: list,
             cnt_user_engagement_threshold_value: float,
             country_threshold_value: float,
             data_source: str,
             log_sampling_rate: float,
             model_directory: str,
             monitor_interval: int,
             project_id: str,
             region: str,
             target: str):

    deploy_and_test_model_task = deploy_and_test_model(
        model_directory=model_directory,
        project_id=project_id,
        region=region)
    
    create_monitoring_job_task = create_monitoring_job(
        alert_emails=alert_emails,
        cnt_user_engagement_threshold_value=cnt_user_engagement_threshold_value,
        country_threshold_value=country_threshold_value,
        data_source=data_source,
        log_sampling_rate=log_sampling_rate,
        monitor_interval=monitor_interval,
        project_id=project_id,
        region=region,
        target=target).after(deploy_and_test_model_task)
    
    test_monitoring_job_task = test_monitoring_job(
        data_source=data_source,
        project_id=project_id,
        region=region,
        target=target).after(create_monitoring_job_task)

## Define the Pipeline Arguments
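
Several of the pipeline arguments have an expected shape: `data_source` is a `bq://` URI, `model_directory` is a `gs://` URI, and the sampling rate is a fraction. A small sanity check before generating the pipeline can catch typos early. The helper names below are hypothetical, and the accepted ranges are assumptions made for this sketch rather than limits taken from AutoMLOps.

```python
def bq_table_id(data_source):
    """Strip the 'bq://' scheme and return 'project.dataset.table'."""
    prefix = 'bq://'
    if not data_source.startswith(prefix):
        raise ValueError(f'expected a {prefix} URI, got: {data_source}')
    return data_source[len(prefix):]


def check_pipeline_params(params):
    """Return a list of problems found; empty when all checks pass."""
    problems = []
    if not params['model_directory'].startswith('gs://'):
        problems.append('model_directory should be a gs:// URI')
    # Assumed range for this sketch: a fraction of requests to log.
    if not 0 < params['log_sampling_rate'] <= 1:
        problems.append('log_sampling_rate should be in the range (0, 1]')
    # Assumed lower bound for this sketch: the interval is given in hours.
    if params['monitor_interval'] < 1:
        problems.append('monitor_interval should be at least 1 (hour)')
    return problems


example_params = {
    'data_source': 'bq://mco-mm.bqmlga4.train',
    'log_sampling_rate': 0.8,
    'model_directory': 'gs://mco-mm/churn',
    'monitor_interval': 1,
}
print(bq_table_id(example_params['data_source']))
print(check_pipeline_params(example_params))
```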

In [11]:
pipeline_params = {
    'alert_emails': ALERT_EMAILS,
    'cnt_user_engagement_threshold_value': 0.001,
    'country_threshold_value': 0.001,
    'data_source': 'bq://mco-mm.bqmlga4.train',
    'log_sampling_rate': 0.8,
    'model_directory': 'gs://mco-mm/churn',
    'monitor_interval': 1,
    'project_id': PROJECT_ID,
    'region': 'us-central1',
    'target': 'churned'
}

## Generate and Run the pipeline
`AutoMLOps.generate(...)` generates the MLOps codebase. Users can specify the tooling and technologies they would like to use in their MLOps pipeline.
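
The `schedule_pattern` argument used below takes a five-field cron expression. As a reading aid, the sketch here splits such an expression into named fields; the function is hypothetical and does no validation beyond counting fields. Read this way, `'59 11 * * 0'` means minute 59 of hour 11 on day-of-week 0 (Sunday), that is, 11:59 every Sunday in whatever time zone the scheduler is configured to use.

```python
def describe_cron(pattern):
    """Map the five fields of a cron expression to their names."""
    fields = pattern.split()
    if len(fields) != 5:
        raise ValueError('expected 5 space-separated fields')
    names = ('minute', 'hour', 'day_of_month', 'month', 'day_of_week')
    return dict(zip(names, fields))


print(describe_cron('59 11 * * 0'))
```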

In [None]:
AutoMLOps.generate(project_id=PROJECT_ID,
                   pipeline_params=pipeline_params,
                   use_ci=True,
                   naming_prefix=MODEL_ID,
                   schedule_pattern='59 11 * * 0' # run the pipeline every Sunday at 11:59
)

Writing directories under AutoMLOps/
Writing configurations to AutoMLOps/configs/defaults.yaml
Writing Kubeflow Pipelines code to AutoMLOps/pipelines, AutoMLOps/components, AutoMLOps/services
Writing README.md to AutoMLOps/README.md
Writing scripts to AutoMLOps/scripts
Writing CloudBuild config to AutoMLOps/cloudbuild.yaml
Code Generation Complete.


`AutoMLOps.provision(...)` runs provisioning scripts to create and maintain necessary infra for MLOps.

In [None]:
AutoMLOps.provision(hide_warnings=False)            # hide_warnings is optional, defaults to True

-cloudfunctions.functions.get
-serviceusage.services.use
-serviceusage.services.enable
-cloudfunctions.functions.create
-pubsub.subscriptions.list
-cloudscheduler.jobs.list
-pubsub.topics.create
-source.repos.list
-artifactregistry.repositories.create
-resourcemanager.projects.setIamPolicy
-iam.serviceAccounts.listiam.serviceAccounts.create
-pubsub.subscriptions.create
-cloudscheduler.jobs.create
-storage.buckets.create
-source.repos.create
-artifactregistry.repositories.list
-cloudbuild.builds.create
-cloudbuild.builds.list
-pubsub.topics.list
-storage.buckets.get

You are currently using: srastatter@google.com. Please check your account permissions.
The following are the recommended roles for provisioning:
-roles/resourcemanager.projectIamAdmin
-roles/cloudfunctions.admin
-roles/artifactregistry.admin
-roles/iam.serviceAccountAdmin
-roles/serviceusage.serviceUsageAdmin
-roles/aiplatform.serviceAgent
-roles/cloudscheduler.admin
-roles/pubsub.editor
-roles/source.admin
-roles/cloudbuil

`AutoMLOps.deploy(...)` builds and pushes the component container, then triggers the pipeline job.

In [None]:
AutoMLOps.deploy(precheck=True,                     # precheck is optional, defaults to True
                 hide_warnings=False)               # hide_warnings is optional, defaults to True

-artifactregistry.repositories.get
-cloudbuild.builds.get
-resourcemanager.projects.getIamPolicy
-storage.buckets.update
-serviceusage.services.get
-cloudfunctions.functions.get
-pubsub.topics.get
-iam.serviceAccounts.get
-source.repos.update
-pubsub.subscriptions.get

You are currently using: srastatter@google.com. Please check your account permissions.
The following are the recommended roles for deploying with precheck:
-roles/serviceusage.serviceUsageViewer
-roles/iam.roleViewer
-roles/pubsub.viewer
-roles/storage.admin
-roles/cloudbuild.builds.editor
-roles/source.writer
-roles/iam.serviceAccountUser
-roles/cloudfunctions.viewer
-roles/artifactregistry.reader

Checking for required API services in project automlops-sandbox...
Checking for Artifact Registry in project automlops-sandbox...
Checking for Storage Bucket in project automlops-sandbox...
Checking for Pipeline Runner Service Account in project automlops-sandbox...
Checking for IAM roles on Pipeline Runner Service Account in

## Interpret your results

Vertex AI Model Monitoring detects an anomaly when the threshold set for a feature is exceeded. The following cells give you a sense of the alerting and reporting experience after model monitoring anomalies have been detected.
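
The threshold rule can be sketched in a few lines: a feature is flagged when its distance score exceeds the threshold configured for it. The function name and the distance scores below are hypothetical, and the sketch only considers features that have an explicit threshold; the thresholds mirror the values passed to this tutorial's pipeline.

```python
def find_anomalies(distances, thresholds):
    """Return the features whose distance score exceeds their threshold."""
    return sorted(
        feature
        for feature, threshold in thresholds.items()
        if distances.get(feature, 0.0) > threshold
    )


thresholds = {'country': 0.001, 'cnt_user_engagement': 0.001}
# Hypothetical distance scores for one monitoring window.
observed = {'country': 0.25, 'cnt_user_engagement': 0.0004}
print(find_anomalies(observed, thresholds))
```

With thresholds as low as `0.001`, even small shifts in a feature's distribution are reported, which is convenient for a demonstration but would likely be noisy in production.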

Vertex AI Model Monitoring automatically notifies you of detected anomalies through email, but you can also [set up alerts through Cloud Logging](https://cloud.google.com/vertex-ai/docs/model-monitoring/using-model-monitoring#monitor-job).

### Here's what a sample email alert looks like...

<img src="https://storage.googleapis.com/mco-general/img/mm7.png" />


This email is warning you that the *cnt_level_start_quickplay*, *cnt_user_engagement*, and *country* feature values seen in production are skewed relative to your training data by more than your threshold. It's also telling you that the *cnt_user_engagement* and *country* feature attribution values are skewed relative to your training data, again, as per your threshold specification.

### Monitoring results in the Cloud Console

You can examine your model monitoring data from the Cloud Console. Below are screenshots of those capabilities.

#### Monitoring Status

You can verify that a given endpoint has an active model monitoring job via the Endpoint summary page:

<img src="https://storage.googleapis.com/mco-general/img/mm1.png" />

#### Monitoring Alerts

You can examine the alert details by clicking into the endpoint of interest, and selecting the alerts panel:

<img src="https://storage.googleapis.com/mco-general/img/mm2.png" />

#### Feature Value Distributions

You can also examine the recorded training and production feature distributions by drilling down into a given feature, like this:

<img src="https://storage.googleapis.com/mco-general/img/mm9.png" />

which yields graphical representations of the feature distribution during both training and production, like this:

<img src="https://storage.googleapis.com/mco-general/img/mm8.png" />

## Learn more about model monitoring

**Congratulations!** You've now learned what model monitoring is, how to configure and enable it, and how to find and interpret the results. Check out the following resources to learn more about model monitoring and MLOps.

- [TensorFlow Data Validation](https://www.tensorflow.org/tfx/guide/tfdv)
- [Data Understanding, Validation, and Monitoring At Scale](https://blog.tensorflow.org/2018/09/introducing-tensorflow-data-validation.html)
- [Vertex Product Documentation](https://cloud.google.com/vertex-ai)
- [Vertex AI Model Monitoring Reference Docs](https://cloud.google.com/vertex-ai/docs/reference)
- [Vertex AI Model Monitoring blog article](https://cloud.google.com/blog/topics/developers-practitioners/monitor-models-training-serving-skew-vertex-ai)
- [Explainable AI Whitepaper](https://storage.googleapis.com/cloud-ai-whitepapers/AI%20Explainability%20Whitepaper.pdf)