# Model Monitoring (beta)

```{note}
Model Monitoring is based on Iguazio's streaming technology. Contact Iguazio to enable this feature.
```

## Introduction
MLRun provides a model monitoring service that tracks the performance of models in production to help identify
potential issues with concept drift and prediction accuracy before they impact business goals.
Typically, model monitoring is used by DevOps teams to track model performance, and by data scientists to track model drift.
Two monitoring types are supported:
1. Model operational performance (stream processing)
2. Concept drift detection and accuracy monitoring (features/labels/predictions)&mdash;see [Drift Analysis](#drift-analysis) for more details.

Model Monitoring provides warning alerts that can be sent to stakeholders for processing.

The Model Monitoring data can be viewed using Iguazio's user interface portal or through Grafana Dashboards.

## Architecture
The Model Monitoring process flow starts by collecting operational data and converting it to vectors, which are then posted to the Model Server.
The Model Server wraps a machine learning model that uses a function to calculate predictions based on the available vectors.
Next, the Model Server logs the input and output of each vector, and the entries are written to the production data stream.
While the Model Server is processing the vectors, a Nuclio function monitors the data stream and is triggered when a new log entry is detected.
The Nuclio function examines the log entry and processes it into statistics, which are then written to the statistics databases.
In parallel, a scheduled MLRun job runs and reads the parquet files to perform the drift analysis. The results are stored so that
the user can retrieve the analysis data in the Iguazio UI or in a Grafana dashboard.

![Architecture](./model-monitoring-data-flow.svg)
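
The stream-processing half of this flow can be pictured with a small, self-contained sketch. It is not MLRun or Nuclio code: the `LogEntry` fields, the `EndpointStats` class, and the `process_entry` handler are hypothetical names chosen for illustration, and an in-memory dictionary stands in for the statistics databases.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    """One hypothetical record written by the model server to the data stream."""
    endpoint_id: str
    inputs: list
    outputs: list
    latency_ms: float
    error: bool = False


@dataclass
class EndpointStats:
    """Running statistics for one endpoint (stand-in for the statistics databases)."""
    predictions: int = 0
    errors: int = 0
    total_latency_ms: float = 0.0

    @property
    def average_latency_ms(self) -> float:
        return self.total_latency_ms / self.predictions if self.predictions else 0.0


def process_entry(stats: dict, entry: LogEntry) -> None:
    """Handler called once per new log entry, as a stream-triggered function would be."""
    endpoint = stats.setdefault(entry.endpoint_id, EndpointStats())
    if entry.error:
        endpoint.errors += 1
        return
    endpoint.predictions += 1
    endpoint.total_latency_ms += entry.latency_ms


# Simulated stream content: two successful requests and one failed request.
stats = {}
stream = [
    LogEntry("ep-1", [[5.1, 3.5, 1.4, 0.2]], [0], latency_ms=12.0),
    LogEntry("ep-1", [[6.2, 2.9, 4.3, 1.3]], [1], latency_ms=8.0),
    LogEntry("ep-1", [], [], latency_ms=0.0, error=True),
]
for entry in stream:
    process_entry(stats, entry)

print(stats["ep-1"])
```

In the real service the handler is triggered by the stream rather than by a loop, and the drift analysis runs separately on a schedule.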

### Drift Analysis
The Model Monitoring feature provides both Concept Drift analysis and accuracy monitoring.
Concept Drift in machine learning is a situation where the statistical properties of the target variable (what the model is trying to predict) change over time.
In other words, the production data has changed significantly over the course of time and no longer matches the input data used to train the model.
As a result, the accuracy of the model's predictions on this new data is low.
For more information see <a href="https://www.iguazio.com/glossary/concept-drift/" target="_blank">Concept Drift</a>.

### Common Terminology
The following are terms you will see in all the model monitoring screens:

* **Total Variation Distance** (TVD)&mdash;the statistical difference between the actual predictions and the model's trained predictions.
* **Hellinger Distance**&mdash;a type of f-divergence that quantifies the similarity between the actual predictions and the model's trained predictions.
* **Kullback–Leibler Divergence** (KLD)&mdash;a measure of how the probability distribution of the actual predictions differs from the model's trained reference probability distribution.
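
To make these three measures concrete, here is a minimal sketch that computes them for two histograms over the same bins. It uses the textbook definitions and is not taken from the MLRun implementation, which may differ in details such as binning or the handling of empty bins. The function names, the `eps` floor, and the sample counts are assumptions made for illustration.

```python
import math


def normalize(counts):
    """Turn raw histogram counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    return [count / total for count in counts]


def total_variation_distance(p, q):
    """Half of the summed absolute difference between the two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hellinger_distance(p, q):
    """Distance between the square roots of the two distributions, in [0, 1]."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def kl_divergence(p, q, eps=1e-10):
    """KL divergence of p from q; eps (an assumption) avoids division by zero."""
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)


# Hypothetical histograms: training-time (expected) and production (actual).
expected = normalize([10, 40, 40, 10])
actual = normalize([5, 25, 45, 25])

print("TVD:", total_variation_distance(actual, expected))
print("Hellinger:", hellinger_distance(actual, expected))
print("KLD:", kl_divergence(actual, expected))
```

All three values are zero when the two distributions are identical and grow as they diverge. TVD and the Hellinger distance are bounded by 1, while KLD is unbounded and not symmetric.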

## Model Monitoring Using the Iguazio Platform Interface
Iguazio's Model Monitoring data is available for viewing through the regular platform interface.
The platform provides four screens in which information about the model monitoring is displayed.
* [Model Endpoint Summary List](#model-endpoint-summary-list)
* [Model Endpoint Overview](#model-endpoint-overview)
* [Model Drift Analysis](#model-drift-analysis)
* [Model Features Analysis](#model-features-analysis)

From the application drawer, press **Projects** and then press the project dashlet that has the Model Monitoring feature enabled.

From the project dashboard, press the **Models** dashlet to view the models that are currently set up.
If the Model Monitoring feature is enabled, you will see a list of endpoints in the **Model Endpoints** screen.

### Model Endpoint Summary List
The Model Endpoints summary list provides a quick view of the Model Monitoring data.

![Model Monitoring Summary List](./model_endpoints_main.png)

The summary page contains the following fields:
* **Name**&mdash;the name of the model endpoint
* **Version**&mdash;
* **Class**&mdash;the implementation class that is used by the endpoint
* **Model**&mdash;user defined name for the model
* **Labels**&mdash;
* **Uptime**&mdash;first request for production data
* **Last Prediction**&mdash;most recent request for production data
* **Error Count**&mdash;total error count from processing production data
* **Drift**&mdash;indication of drift status (no drift (green), possible drift (yellow), drift detected (red))
* **Accuracy**&mdash;a numeric value representing the accuracy of model predictions
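
The three-state drift indicator can be thought of as a simple thresholding of an overall drift measure. The sketch below illustrates that idea only: the function name and both threshold values are placeholders chosen for the example, not values documented here, so check your deployment for the actual rule.

```python
# Colors used by the drift indicator, as listed in the field description above.
STATUS_COLORS = {
    "no drift": "green",
    "possible drift": "yellow",
    "drift detected": "red",
}


def drift_status(drift_measure, possible_threshold=0.5, detected_threshold=0.7):
    """Map a drift measure in [0, 1] to a status (thresholds are illustrative)."""
    if drift_measure >= detected_threshold:
        return "drift detected"
    if drift_measure >= possible_threshold:
        return "possible drift"
    return "no drift"


for measure in (0.1, 0.55, 0.9):
    status = drift_status(measure)
    print(measure, status, STATUS_COLORS[status])
```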

### Model Endpoint Overview
The Model Endpoints Overview screen displays general information about the selected model.

![Model Endpoints Overview](./IG_model_endpoints_overview.png)

The Overview page contains the following fields:
* **UUID**&mdash;the ID of the deployed model
* **Model Class**&mdash;the implementation class that is used by the endpoint
* **Model Artifact**&mdash;
* **Function URI**&mdash;the MLRun function to access the model
* **Last Prediction**&mdash;most recent request for production data
* **Error Count**&mdash;total error count from processing production data
* **Accuracy**&mdash;a numeric value representing the accuracy of model predictions
* **Stream path**&mdash;

Use the ellipsis to view the YAML resource file for details about the resource that is being monitored.

### Model Drift Analysis
The Model Endpoints Drift Analysis screen provides performance statistics for the currently selected model.

![Model Endpoints Drift Analysis](./IG_model_endpoints_drift_analysis.png)

Each of the following fields has both sum and mean values displayed. For definitions of the terms, see [Common Terminology](#common-terminology).
* **TVD**
* **Hellinger**
* **KLD**

Use the ellipsis to view the YAML resource file for details about the resource that is being monitored.

### Model Features Analysis
The Features Analysis pane provides details of the drift analysis in a table format with each feature in the selected model on its own line.

![Model Endpoints Features Analysis](./IG_model_endpoints_features_analysis.png)

The table is broken down into columns with both the expected and the actual performance numbers. The expected column displays the number from the model training phase and the actual column
displays the number from live production data. The following fields are available:
* **Mean**&mdash;
* **STD** (Standard deviation)&mdash;
* **Min**&mdash;
* **Max**&mdash;
* **TVD**
* **Hellinger**
* **KLD**
* **Histograms**&mdash;hover over the bars in the graph for details

Use the ellipsis to view the YAML resource file for details about the resource that is being monitored.
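
The expected and actual columns of the features table amount to the same summary statistics computed twice, once on training data and once on production data. The following sketch shows that comparison for a single feature. The helper names and sample values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics


def describe(values):
    """Summary statistics for one feature (sample standard deviation assumed)."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def compare_feature(expected_values, actual_values):
    """Pair each statistic from training data with its production counterpart."""
    expected = describe(expected_values)
    actual = describe(actual_values)
    return {name: {"expected": expected[name], "actual": actual[name]} for name in expected}


# Hypothetical values of one feature in training and in production.
training_values = [4.9, 5.1, 5.8, 6.3, 7.0]
production_values = [5.5, 6.1, 6.4, 7.2, 7.7]

for name, pair in compare_feature(training_values, production_values).items():
    print(name, pair)
```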

## Model Monitoring Using a Grafana Service
You can deploy a Grafana service in your Iguazio instance to view the Model Monitoring data.

### Model Endpoints Overview Dashboard
The Overview dashboard displays the model endpoint IDs of a specific project. Only deployed models with the Model Monitoring option enabled are displayed.
Endpoint IDs are URIs used to provide access to performance data, drift detection statistics, and accuracy monitoring of a deployed model.

![overview](./overview.png)

The Overview screen provides details about the performance of all the deployed and monitored models within a project. You can change projects by choosing a new project from the
**Project** dropdown. The Overview dashboard displays the number of endpoints in the project, the average predictions per second (using a 5-minute rolling average),
the average latency (using a 1-hour rolling average), and the total error count in the project.

Additional details include:
* **Endpoint ID**&mdash;the ID of the deployed model. Use this link to drill down to the model performance and details screens.
* **Function**&mdash;the MLRun function to access the model
* **Model**&mdash;user defined name for the model
* **Model Class**&mdash;the implementation class that is used by the endpoint
* **First Request**&mdash;first request for production data
* **Last Request**&mdash;most recent request for production data
* **Error count**&mdash;total error count from processing production data
* **Accuracy**&mdash;a numeric value representing the accuracy of model predictions
* **Drift Status**&mdash;no drift (green), possible drift (yellow), drift detected (red)

At the bottom of the dashboard are heat maps for the Predictions per second, Average Latency, and Errors. The heat maps display data based on 15-minute intervals.
See [How to Read a Heat Map](#how-to-read-a-heat-map) for more details.

Click an endpoint ID to drill down to the performance details of that model.

#### How to Read a Heat Map
Heat maps are used to analyze trends and to visualize data so that areas of interest can be identified quickly,
letting users explore the data and pinpoint potential issues. A heat map uses a matrix layout with color and shading to show the relationship between
two categories of values (the x and y axes): the darker the cell, the higher the value. The values presented along each axis correspond to a cell, which is color-coded to represent the relationship between
the two categories. The Predictions per second heat map shows the relationship between time and the predictions per second, and the Average Latency per hour heat map shows the relationship between
time and the latency.

To read the heat maps, follow the hierarchy of shades from the darkest (the highest values) to the lightest (the lowest values).

```{note}
The exact quantitative values represented by the colors may be difficult to determine. Use the [Performance Dashboard](#model-endpoint-performance-dashboard) to see detailed results.
```
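
As a rough illustration of how heat map cells are formed, the sketch below groups samples into 15-minute time buckets on one axis and value buckets on the other, then counts the samples in each cell. It assumes that the shade of a cell reflects that count, which is how time-bucketed heat maps commonly work. The function name, the `value_step` parameter, and the sample data are hypothetical.

```python
from collections import Counter

BUCKET_SECONDS = 15 * 60  # the dashboards use 15-minute intervals


def heat_map_cells(samples, value_step):
    """Count samples per (time bucket, value bucket) cell.

    samples is an iterable of (timestamp_in_seconds, value) pairs.
    """
    cells = Counter()
    for timestamp, value in samples:
        time_bucket = int(timestamp // BUCKET_SECONDS) * BUCKET_SECONDS
        value_bucket = int(value // value_step) * value_step
        cells[(time_bucket, value_bucket)] += 1
    return cells


# Hypothetical latency samples in milliseconds, bucketed in steps of 10 ms.
latency_samples = [(0, 12.0), (60, 14.0), (899, 31.0), (900, 12.0)]
for cell, count in sorted(heat_map_cells(latency_samples, value_step=10).items()):
    print(cell, count)
```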

### Model Endpoint Details Dashboard
The model endpoint details dashboard displays the real-time performance data of the selected model in detail.
The performance data can be used to fine-tune the model or to diagnose potential performance issues that may affect business goals.
The data in this dashboard changes based on the selection of the project and model.

This dashboard is broken down into three panes:

1. [Project and model summary](#project-and-model-summary)
2. [Analysis panes](#analysis-panes)
   1. Overall drift analysis
   2. Features analysis
3. [Incoming features graph](#incoming-features-graph)

![details](./details.png)

#### Project and Model Summary
Use the dropdown to change the project and model. The dashboard presents the following information about the project:
* **Endpoint ID**&mdash;the ID of the deployed model
* **Model**&mdash;user defined name for the model
* **Function URI**&mdash;the MLRun function to access the model
* **Model Class**&mdash;the implementation class that is used by the endpoint
* **Prediction/s**&mdash;the average number of predictions per second over a rolling 5-minute period
* **Average Latency**&mdash;the average latency over a rolling 1-hour period
* **First Request**&mdash;first request for production data
* **Last Request**&mdash;most recent request for production data


Use the [Performance](#model-endpoint-performance-dashboard) and [Overview](#model-endpoints-overview-dashboard) buttons to view those dashboards.

#### Analysis Panes
This pane is broken down into sections: Overall Drift Analysis and Features Analysis.
The Overall Drift Analysis pane provides performance statistics for the currently selected model.
* **TVD** (sum and mean)
* **Hellinger** (sum and mean)
* **KLD** (sum and mean)


The Features Analysis pane provides details of the drift analysis for each feature in the selected model.
This pane includes five types of statistics:
* **Actual** (min, mean and max)&mdash;results based on actual live data stream
* **Expected** (min, mean and max)&mdash;results based on training data
* **TVD**
* **Hellinger**
* **KLD**

#### Incoming Features Graph
This graph displays the performance of the features that are in the selected model based on sampled data points from actual feature production data.
The graph displays the values of the features in the model over time.

### Model Endpoint Performance Dashboard
The model endpoint performance dashboard displays performance details in graphical format.

![performance](./performance.png)

This dashboard is broken down into the following graphs:
* **Drift Measures**&mdash;the overall drift over time for each of the endpoints in the selected model
* **Average Latency**&mdash;the average latency of the model in 5-minute intervals, for 5-minute and 1-hour rolling windows
* **Predictions/s**&mdash;the model predictions per second displayed in 5-second intervals for 5 minutes (rolling)
* **Predictions Count**&mdash;the number of predictions the model makes for 5-minute and 1-hour rolling windows
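
The rolling-window figures shown in these graphs can be illustrated with a short sketch. It keeps only the events that fall inside the window and derives predictions per second and average latency from them. The `RollingWindow` class and its method names are hypothetical and are not part of MLRun.

```python
from collections import deque


class RollingWindow:
    """Keeps (timestamp, latency) events from the last window_seconds seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()

    def add(self, timestamp, latency_ms):
        self._events.append((timestamp, latency_ms))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that are older than the window.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            self._events.popleft()

    def predictions_per_second(self, now):
        self._evict(now)
        return len(self._events) / self.window_seconds

    def average_latency_ms(self, now):
        self._evict(now)
        if not self._events:
            return 0.0
        return sum(latency for _, latency in self._events) / len(self._events)


# One prediction per second for ten minutes, observed through a 5-minute window.
window = RollingWindow(window_seconds=5 * 60)
for second in range(600):
    window.add(timestamp=second, latency_ms=10.0)

print(window.predictions_per_second(now=599), window.average_latency_ms(now=599))
```

A 1-hour window works the same way with `window_seconds=60 * 60`.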


## Initial Set Up
You will need to make sure you have a Grafana service running in your Iguazio instance.
If you do not have a Grafana service running,
see <a href="https://www.iguazio.com/docs/latest-release/services/fundamentals/#create-new-service" target="_blank">Creating a New Service</a> to create and configure it.

1. Make sure `mlrun-api` is configured as a data source in your Grafana service. If it is not,
add it as follows:
   1. Open your Grafana service.
   2. Navigate to `Configuration -> Data Sources`.
   3. Press `Add data source`.
   4. Select the `SimpleJson` datasource and configure the following parameters.
      ```
      Name: mlrun-api
      URL: http://mlrun-api:8080/api/grafana-proxy/model-endpoints
      Access: Server (default)

      ## Add a custom header of:
      X-V3io-Session-Key: <YOUR ACCESS KEY>
      ```
   5. Press `Save & Test` for verification. You will receive a confirmation with either a success, or a failure message.
&nbsp;

2. Download the following monitoring dashboards:

   * {download}`Overview <./dashboards/overview.json>`
   * {download}`Details <./dashboards/details.json>`
   * {download}`Performance <./dashboards/performance.json>`

3. Import the downloaded dashboards into your Grafana service:
   1. Navigate to your Grafana service in the Services list and press it.
   2. Press the dashboards icon in the left menu.
   3. In the dashboard management screen, press the IMPORT button and select one file to import. Repeat this step for each dashboard.
&nbsp;
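
If you prefer to script the data source configuration from step 1 instead of using the Grafana UI, the same settings can be expressed as a JSON payload for Grafana's data source HTTP API. The sketch below only builds and prints that payload. The plugin type ID, the `access` value, and the `httpHeaderName1`/`httpHeaderValue1` fields follow Grafana's usual conventions for the SimpleJson plugin and custom headers, but treat them as assumptions and verify them against your Grafana version. The helper name is hypothetical.

```python
import json


def build_datasource_payload(access_key):
    """Build a Grafana data source definition matching the settings in step 1."""
    return {
        "name": "mlrun-api",
        "type": "grafana-simple-json-datasource",  # assumed SimpleJson plugin ID
        "url": "http://mlrun-api:8080/api/grafana-proxy/model-endpoints",
        "access": "proxy",  # assumed equivalent of "Server (default)" in the UI
        "jsonData": {"httpHeaderName1": "X-V3io-Session-Key"},
        "secureJsonData": {"httpHeaderValue1": access_key},
    }


payload = build_datasource_payload("<YOUR ACCESS KEY>")
print(json.dumps(payload, indent=2))
```

The payload would then be sent in a `POST` request to the `/api/datasources` endpoint of your Grafana service, using whatever host and credentials apply to your deployment.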

```{note}
If you have a model already trained and running with production data, you should see results in the dashboards.
If you do not have a model already trained and running, follow the steps to train and deploy your model. After deployment,
you should see results in the dashboards.
```
&nbsp;

```{important}
* To allow the system to utilize drift measurement, make sure you supply the train set when logging the model in the
   training step.

* When serving the model, make sure that the Nuclio function is deployed with tracking enabled by applying
   `fn.set_tracking()` on the serving function.
```

## Demo Deployment
The following is a demo that you can use to test the model monitoring feature.
Use the code blocks as is and change only the project name. Once you have completed this demo, you should have a better understanding of
how the Model Monitoring feature will work on your project.

```{note}
Each section below should be in its own cell in a Jupyter notebook.
```

### Start of Jupyter Notebook
Copy this code to the top of the Jupyter notebook and give your demo project a name.

```python
# Set project name
project = ""
```

### Deploy Model Servers

Copy this code to its own cell in the Jupyter notebook.

```python
import pandas as pd
from sklearn.datasets import load_iris

from mlrun import import_function, get_dataitem
from mlrun import projects
from mlrun.platforms import auto_mount

# Create the demo project
proj = projects.new_project(project)

# Download a model file to the local working directory
get_dataitem("https://s3.wasabisys.com/iguazio/models/iris/model.pkl").download(
    "model.pkl")

# Build the training set that is logged with each model for drift measurement
iris = load_iris()
train_set = pd.DataFrame(iris['data'],
                         columns=['sepal_length_cm', 'sepal_width_cm',
                                  'petal_length_cm', 'petal_width_cm'])

model_names = [
    "sklearn_ensemble_RandomForestClassifier",
    "sklearn_linear_model_LogisticRegression",
    "sklearn_ensemble_AdaBoostClassifier"
]

serving_fn = import_function('hub://v2_model_server').apply(auto_mount())

# Log each model with its training set and add it to the serving function
for name in model_names:
    proj.log_model(name, model_file="model.pkl", training_set=train_set)
    serving_fn.add_model(name,
                         model_path=f"store://models/{project}/{name}:latest")

serving_fn.metadata.project = project
# Enable tracking so that the deployed function is monitored
serving_fn.set_tracking()
serving_fn.deploy()
```

### Deploy Stream Processing

Use the following code to deploy the stream processing.

```python
import json
import os

from mlrun import import_function
from mlrun.platforms import mount_v3io
from mlrun.runtimes import RemoteRuntime

fn: RemoteRuntime = import_function("hub://model_monitoring_stream")

# Trigger the function on new entries in the project's monitoring stream
fn.add_v3io_stream_trigger(
    stream_path=f"projects/{project}/model-endpoints/stream",
    name="monitoring_stream_trigger",
)

# Pass the project name and the frames service address to the function
fn.set_env("MODEL_MONITORING_PARAMETERS", json.dumps(
    {"project": project, "v3io_framesd": os.environ.get("V3IO_FRAMESD")}))

fn.metadata.project = project
fn.apply(mount_v3io())
fn.deploy()
```

### Deploy Batch Processing

Use the following code to deploy batch processing.

```python
from mlrun import import_function
from mlrun.platforms import mount_v3io
from mlrun.runtimes import KubejobRuntime

fn: KubejobRuntime = import_function("hub://model_monitoring_batch")
fn.metadata.project = project
fn.apply(mount_v3io())
# The cron schedule runs the batch job at minute 0 of every hour
fn.run(name='model-monitoring-batch', schedule="0 */1 * * *",
       params={"project": project})
```

### Simulating Requests

Use the following code to simulate requests and view data in the model monitoring feature.

```python
import json
from time import sleep
from random import choice, uniform
from sklearn.datasets import load_iris

iris = load_iris()
iris_data = iris['data'].tolist()

# Uses model_names and serving_fn from the Deploy Model Servers cell.
# The loop runs until the cell is interrupted.
while True:
    for name in model_names:
        # Send one randomly chosen data point to each model
        data_point = choice(iris_data)
        serving_fn.invoke(f'v2/models/{name}/infer',
                          json.dumps({'inputs': [data_point]}))
        sleep(uniform(0.1, 0.4))
    sleep(uniform(0.2, 1.7))
```