# Monitoring Drift + Automated Re-Training

## Set Up Project

Create a project to keep this tutorial's resources separate from other work.

In [1]:
import os

import mlrun
import pandas as pd

In [2]:
project = mlrun.get_or_create_project(name="berkeley-mlops", context=".")
project.set_model_monitoring_credentials(os.environ.get("V3IO_ACCESS_KEY"))

> 2022-09-06 17:24:51,243 [info] loaded project berkeley-mlops from MLRun DB


## Log Model
The model is logged together with its training set so that training-set statistics are stored with it. These statistics are the baseline that drift is calculated against.

In [3]:
model_name = "RandomForestClassifier"

In [4]:
model_artifact = project.log_model(model_name, model_file="model.pkl", training_set=pd.read_csv("train.csv"))

In [5]:
model_artifact.uri

'store://models/berkeley-mlops/RandomForestClassifier#0:latest'
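
To give a sense of what "training set statistics" can look like, here is a minimal sketch that summarizes each numeric feature with a few descriptive statistics and a histogram. This is an illustration only, not MLRun's implementation: the helper `feature_histograms`, the bin count, and the toy column names are all hypothetical, and the statistics MLRun actually stores may differ.

```python
import numpy as np
import pandas as pd


def feature_histograms(df: pd.DataFrame, bins: int = 10) -> dict:
    """Summarize each numeric column with basic statistics and a histogram.

    Hypothetical helper for illustration; not part of MLRun.
    """
    stats = {}
    for column in df.select_dtypes(include="number").columns:
        values = df[column].dropna().to_numpy()
        counts, edges = np.histogram(values, bins=bins)
        stats[column] = {
            "mean": float(values.mean()),
            "std": float(values.std()),  # population standard deviation
            "min": float(values.min()),
            "max": float(values.max()),
            "hist": (counts.tolist(), edges.tolist()),
        }
    return stats


# Toy stand-in for train.csv; the column names are assumptions.
toy_train = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 6.2, 5.9],
        "petal_length": [1.4, 1.3, 4.5, 5.1],
    }
)
toy_stats = feature_histograms(toy_train, bins=2)
```

A baseline of this kind lets later production data be binned with the same edges and compared feature by feature.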

## Import Serving Function

Import the serving function from the function hub, mount the filesystem, add the model logged above, and enable model monitoring for drift detection.

In [6]:
# Import the serving function from the function hub and mount filesystem
serving_fn = mlrun.import_function('hub://v2_model_server').apply(mlrun.auto_mount())

# Add the model to the serving function's routing spec
serving_fn.add_model(model_name, model_path=model_artifact.uri)

# Enable model monitoring
serving_fn.set_tracking()

## Deploy Serving Function with Drift Detection

Deploying the model server also sets up the behind-the-scenes infrastructure that model monitoring relies on. See the [docs](https://docs.mlrun.org/en/latest/model_monitoring/model-monitoring-deployment.html) for more information.

In [7]:
# Deploy the function
serving_fn.deploy()

> 2022-09-06 17:24:54,025 [info] Starting remote function deploy
2022-09-06 17:24:55  (info) Deploying function
2022-09-06 17:24:55  (info) Building
2022-09-06 17:24:56  (info) Staging files and preparing base images
2022-09-06 17:24:56  (info) Building processor image
2022-09-06 17:25:51  (info) Build complete
2022-09-06 17:26:01  (info) Function deploy complete
> 2022-09-06 17:26:02,126 [info] successfully deployed function: {'internal_invocation_urls': ['nuclio-berkeley-mlops-v2-model-server.default-tenant.svc.cluster.local:8080'], 'external_invocation_urls': ['berkeley-mlops-v2-model-server-berkeley-mlops.default-tenant.app.us-sales-350.iguazio-cd1.com/']}


'http://berkeley-mlops-v2-model-server-berkeley-mlops.default-tenant.app.us-sales-350.iguazio-cd1.com/'
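
Conceptually, drift detection compares the distribution of incoming feature values against the training-set baseline. The sketch below shows one generic way to quantify that difference for a single feature, using total variation distance and Hellinger distance over histograms that share the same bins. It is an illustration under stated assumptions: the function names and the counts are hypothetical, and the metrics, binning, and thresholds used by the deployed monitoring infrastructure are described in the linked docs and may differ.

```python
import math


def normalize(counts):
    """Turn raw histogram counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [c / total for c in counts]


def total_variation_distance(p, q):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hellinger_distance(p, q):
    """Hellinger distance between two distributions (0 = identical, 1 = disjoint)."""
    return math.sqrt(
        0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    )


# Hypothetical counts for one feature, binned with the same four bin edges.
train_counts = [10, 30, 40, 20]  # baseline from the training set
live_counts = [40, 30, 20, 10]   # recent production requests

p = normalize(train_counts)
q = normalize(live_counts)
tvd = total_variation_distance(p, q)
hellinger = hellinger_distance(p, q)
```

Both metrics are bounded between 0 and 1, which makes it straightforward to compare them against a fixed drift threshold.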

## Simulate Production Requests

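Use the following code to simulate production traffic. It repeatedly sends a randomly chosen row of the training data to the model endpoint and runs until the cell is interrupted.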

In [None]:
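import json
from time import sleep
from random import choice, uniform

# Load the training data as a list of rows to sample requests from
iris_data = pd.read_csv("train.csv").to_dict(orient="split")["data"]

# Send requests indefinitely; interrupt the kernel to stop
while True:
    # Pick a random row and send it to the model's inference endpoint
    data_point = choice(iris_data)
    serving_fn.invoke(f'v2/models/{model_name}/infer', json.dumps({'inputs': [data_point]}))
    # Wait a random interval to mimic irregular traffic
    sleep(uniform(0.2, 1.7))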
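
If an endless loop is inconvenient, for example when running the notebook non-interactively, a bounded variant can be sketched as follows. The helpers `build_infer_request` and `simulate_requests` are hypothetical names introduced here, and `fake_invoke` with its response shape is a stand-in so the sketch runs without a deployed function. The request path and body mirror the loop above, and the pause between requests is omitted for brevity.

```python
import json
from random import Random


def build_infer_request(model_name, data_point):
    """Return the (path, body) pair for a single-row inference request."""
    path = f"v2/models/{model_name}/infer"
    body = json.dumps({"inputs": [data_point]})
    return path, body


def simulate_requests(invoke, rows, model_name, n_requests, seed=None):
    """Send n_requests randomly chosen rows through the invoke callable."""
    rng = Random(seed)
    responses = []
    for _ in range(n_requests):
        path, body = build_infer_request(model_name, rng.choice(rows))
        responses.append(invoke(path, body))
    return responses


# Stand-in for serving_fn.invoke; records requests instead of sending them.
sent = []


def fake_invoke(path, body):
    sent.append((path, body))
    return {"outputs": [0]}  # hypothetical response shape


# Hypothetical feature rows standing in for the contents of train.csv.
rows = [[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]]
responses = simulate_requests(
    fake_invoke, rows, "RandomForestClassifier", n_requests=3, seed=0
)
```

Against the deployed function, `serving_fn.invoke` would be passed in place of `fake_invoke` and `iris_data` in place of `rows`.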