# Azure ML Model Monitoring Demo - Production Data Simulation

This repo contains a series of sample notebooks that showcase [AML's continuous model monitoring capabilities](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-monitor-model-performance?view=azureml-api-2&tabs=azure-cli). The notebooks cover the core operations: model training, deployment, simulated production data scoring, and inference data collection. They are designed to be run in order and include the following steps:

- 00. Data Upload - Load time-series weather data from a local CSV into an AML datastore, and register it as training & evaluation datasets.
- 01. Model Training - Train a custom temperature prediction regression model using MLflow & scikit-learn and register it in your AML workspace.
- 02. Model Deployment - Deploy your newly trained model to a Managed Online Endpoint with production data collection configured.
- <b>03. Production Data Simulation - Send time-series data to your endpoint at a slow rate to simulate production inferencing. All submitted data will be collected automatically.</b>
- 04. Monitoring Configuration - Configure a production model data monitor that looks for drift in inference data and scored results, either of which can indicate that retraining should be performed.
- 05. Offline Monitoring - Sample notebook showcasing how to identify drift in data from datasets scored outside of Azure ML.

<b>This notebook uses the previously registered `Temperature_Prediction_Model` deployed to the endpoint `temp-pred-endpoint` and submits all weather data points to it for scoring. Scoring is spread over an extended timeline (~6 days) to simulate ongoing production inferencing. Over time, the data will begin to drift as records from later months are submitted.</b>

### Import required packages

In [None]:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment, Environment, CodeConfiguration, DataCollector, DeploymentCollection
from azure.identity import DefaultAzureCredential
from mlflow import set_tracking_uri
import mlflow
import mltable
import requests
import json
import os  # required by the os.environ calls in later cells
import time
import pandas as pd

### Establish connection to Azure ML workspace using the v2 SDK

In [None]:
subscription_id = "<your_subscription_id>"
resource_group = "<your_resource_group>"
workspace_name = "<your_workspace_name>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace_name)
workspace = ml_client.workspaces.get(workspace_name)
tracking_uri = workspace.mlflow_tracking_uri

set_tracking_uri(tracking_uri)


### Set environment variables

Set environment variables for your endpoint URI and key, both of which can be retrieved from your AML workspace.

In [None]:
os.environ['ENDPOINT_URI'] = '<YOUR-ENDPOINT-URI>'
os.environ['ENDPOINT_KEY'] = '<YOUR-ENDPOINT-KEY>'
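As an alternative to pasting the values by hand, the URI and key can likely be looked up through the `MLClient` created earlier. The sketch below assumes the endpoint uses key-based authentication and that `online_endpoints.get` and `online_endpoints.get_keys` behave as in the v2 SDK; the helper name `export_endpoint_credentials` is hypothetical and not part of the original notebook.

```python
import os


def export_endpoint_credentials(ml_client, endpoint_name):
    """Store the endpoint's scoring URI and primary key in os.environ.

    Assumes key-based auth; token-based endpoints return a different
    credential object and would need separate handling.
    """
    endpoint = ml_client.online_endpoints.get(name=endpoint_name)
    keys = ml_client.online_endpoints.get_keys(name=endpoint_name)
    os.environ["ENDPOINT_URI"] = endpoint.scoring_uri
    os.environ["ENDPOINT_KEY"] = keys.primary_key
    return endpoint.scoring_uri
```

With the workspace connection above, this would be called as `export_endpoint_credentials(ml_client, "temp-pred-endpoint")`.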

### Load complete registered weather dataset

In [None]:
import mltable

dataset_name = 'weather-full-data'

data = ml_client.data.get(dataset_name, version='5')
dataset = mltable.from_delimited_files(paths=[{'pattern': data._referenced_uris[0]}])
df = dataset.to_pandas_dataframe()
df = df.drop(columns=['temperature'])
df

### Score all data over an extended period

Iterate over all data points and send each one to the AML endpoint with a delay between submissions. With the 15-second delay used here, all data is scored over a period of ~6 days; adjust the delay up or down as necessary.
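To pick a different delay, it helps to make the arithmetic explicit. The sketch below ignores request latency, and the row count of 34,560 is only what a 15-second delay over 6 days would imply, not a figure stated for the dataset; substitute `len(df)` in practice. Both function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def total_duration_days(n_rows, delay_seconds):
    """Approximate wall-clock days needed to submit n_rows with a fixed delay."""
    return n_rows * delay_seconds / SECONDS_PER_DAY


def delay_for_target(n_rows, target_days):
    """Delay in seconds that spreads n_rows evenly over target_days."""
    return target_days * SECONDS_PER_DAY / n_rows


# Illustrative row count only (6 days at one request every 15 seconds)
print(total_duration_days(34_560, 15))  # 6.0
print(delay_for_target(34_560, 3))      # 7.5
```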

In [None]:
import requests
import json
import time
import pandas as pd

def submit_request(row):
    """Send a single row to the endpoint for scoring and return the response."""
    ind_df = pd.DataFrame([row])
    url = os.environ['ENDPOINT_URI']
    # Replace this with the primary/secondary key or AMLToken for the endpoint
    api_key = os.environ['ENDPOINT_KEY']
    if not api_key:
        raise Exception("A key should be provided to invoke the endpoint")

    # The azureml-model-deployment header will force the request to go to a specific deployment.
    # Remove this header to have the request observe the endpoint traffic rules
    headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key), 'azureml-model-deployment': 'blue' }

    payload = json.dumps({'data': ind_df.to_dict(orient='records')})
    resp = requests.post(url, headers=headers, data=payload, timeout=30)
    # Raise on 4xx/5xx responses so failed requests are reported by the calling loop
    resp.raise_for_status()
    return resp

total_requests = 0

for _, row in df.iterrows():
    try:
        submit_request(row)
        total_requests+=1
        if total_requests%100==0:
            print(total_requests)
    except Exception as e:
        # Report the failure and continue with the next row
        print(e)
    time.sleep(15)
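Because the loop runs for days, a transient network error would otherwise drop a data point. One option, not part of the original notebook, is to retry a failed submission a few times with exponential backoff before giving up. The helper below is a minimal sketch; `submit_with_retry` and its parameters are hypothetical names, and `send` stands in for any callable such as `submit_request`.

```python
import time


def submit_with_retry(send, payload, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(payload), retrying on any exception with exponential backoff.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    The last failure is re-raised so the caller can log it.
    """
    for attempt in range(max_attempts):
        try:
            return send(payload)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Inside the loop, `submit_request(row)` could then be replaced with `submit_with_retry(submit_request, row)`, leaving the surrounding `try`/`except` in place to report rows that still fail.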
   
    
