<a href="https://colab.research.google.com/github/bongjoonsiong/GCP-MLOPS-Model-Monitoring/blob/main/MLOPS_model_monitoring_20231109.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Monitoring Vertex AI Model

### What is Model Monitoring?

Modern applications rely on a well-established set of capabilities to monitor the health of their services. Examples include:

* software versioning
* rigorous deployment processes
* event logging
* alerting/notification of situations requiring intervention
* on-demand and automated diagnostic tracing
* automated performance and functional testing

You should be able to manage your ML services with the same degree of power and flexibility with which you manage your applications. That is what MLOps is about: managing ML services with the best practices that Google and the broader computing industry have learned from generations of experience deploying well-engineered, reliable, and scalable services.

Model monitoring is only one piece of the MLOps puzzle. It helps answer the following questions:

* How well do recent service requests match the training data used to build your model? This is called **training-serving skew**.
* How significantly are service requests evolving over time? This is called **drift detection**.

If production traffic differs from the training data, or varies substantially over time, the quality of the answers your model produces is likely to suffer. When that happens, you want to be alerted automatically and promptly, so that **you can anticipate problems before they affect your customers' experience or your revenue streams**.
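To make the idea of skew and drift concrete, the following is a minimal sketch of how the difference between two feature distributions can be scored. The Vertex AI documentation describes L-infinity distance for categorical features and Jensen-Shannon divergence for numerical features; this sketch only illustrates those two measures and is not the service's implementation. The function names, the sample `country`-style values, and the `ALERT_THRESHOLD` value are all hypothetical.

```python
import math
from collections import Counter


def categorical_distribution(values, categories):
    """Return the fraction of values falling in each category, in order."""
    counts = Counter(values)
    total = len(values)
    return [counts.get(category, 0) / total for category in categories]


def l_infinity_distance(p, q):
    """Largest absolute difference between matching probabilities."""
    return max(abs(a - b) for a, b in zip(p, q))


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# Hypothetical sample data: a categorical feature at training and serving time.
categories = ["US", "EU", "ASIA"]
training_values = ["US"] * 6 + ["EU"] * 3 + ["ASIA"] * 1
serving_values = ["US"] * 2 + ["EU"] * 3 + ["ASIA"] * 5

training_dist = categorical_distribution(training_values, categories)
serving_dist = categorical_distribution(serving_values, categories)

skew = l_infinity_distance(training_dist, serving_dist)
ALERT_THRESHOLD = 0.3  # hypothetical value chosen for illustration only

print(f"L-infinity skew: {skew:.2f}")
print(f"Alert: {skew > ALERT_THRESHOLD}")
```

Comparing serving data against training data gives a skew score; comparing one serving window against an earlier serving window with the same functions gives a drift score.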

### Learning Objectives

* Deploy a pre-trained model.
* Configure model monitoring.
* Generate some artificial traffic.
* Interpret the data reported by the model monitoring feature.

### Introduction

In this notebook, you will deploy a pre-trained model to an endpoint and send some prediction requests to the model. You will also create a monitoring job to keep an eye on model quality, and generate test data to trigger alerting.

### The example model

The model you'll use in this notebook is based on this blog post. The idea behind this model is that your company has extensive log data describing how your game users have interacted with the site. The raw data contains the following categories of information:

* **identity**: unique player identity numbers
* **demographic features**: information about the player, such as the geographic region in which the player is located
* **behavioral features**: counts of the number of times a player has triggered certain game events, such as reaching a new level
* **churn propensity**: the label or target feature, which provides an estimated probability that the player will churn, i.e. stop being an active player

The blog article referenced above explains how to use BigQuery to store the raw data, pre-process it for use in machine learning, and train a model. Because this notebook focuses on model monitoring rather than on training models, you're going to reuse a pre-trained version of this model, which has been exported to Google Cloud Storage. In the next section, you will set up your environment and import this model into your own project.

Each learning objective corresponds to a `#TODO` in this student lab notebook. Try to complete this notebook first, then review the Solution Notebook for reference.

# Before you begin

### Set up your dependencies

In [None]:
import os

# The Google Cloud Notebook product has specific requirements
IS_GOOGLE_CLOUD_NOTEBOOK = os.path.exists("/opt/deeplearning/metadata/env_version")

# Google Cloud Notebook requires dependencies to be installed with '--user'
USER_FLAG = ""
if IS_GOOGLE_CLOUD_NOTEBOOK:
    USER_FLAG = "--user"

In [None]:
# Import necessary libraries
import os
import sys

import IPython

assert sys.version_info.major == 3, "This notebook requires Python 3."

# Install Python package dependencies.
# Upgrade the specified packages to the latest available versions.
print("Installing TensorFlow Data Validation (TFDV)")
# Quote the requirement so the shell does not treat the brackets as a glob pattern.
! pip3 install {USER_FLAG} --quiet --upgrade "tensorflow_data_validation[visualization]"
! pip3 install {USER_FLAG} --quiet --upgrade google-api-python-client google-auth-oauthlib google-auth-httplib2 oauth2client requests
! pip3 install {USER_FLAG} --quiet --upgrade google-cloud-aiplatform
! pip3 install {USER_FLAG} --quiet --upgrade google-cloud-storage

# Automatically restart kernel after installing new packages.
if not os.getenv("IS_TESTING"):
    print("Restarting kernel...")
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
    print("Done.")
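
Once the kernel has restarted, it can be useful to confirm that the packages installed above are actually visible to the new kernel. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution names listed are assumed to match the packages installed by the `pip3` commands above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names assumed from the pip commands in the previous cell.
for name in (
    "google-cloud-aiplatform",
    "google-cloud-storage",
    "tensorflow-data-validation",
):
    print(f"{name}: {installed_version(name) or 'NOT INSTALLED'}")
```

If any package is reported as missing, rerunning the installation cell and restarting the kernel again is a reasonable first step.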

In [None]:
import os
import random
import sys
import time

# Import required packages.
import numpy as np