## AzureML Model Monitoring through Operationalization

In this sample notebook, you will observe the end-to-end lifecycle of the Machine Learning (ML) operationalization process. You will follow the following steps to train your ML model, deploy it to production, and monitor it to ensure its continuous performance:

1) Setup environment 
2) Register data assets
3) Train the model
4) Deploy the model
5) Simulate inference requests
6) Monitor the model

Let's begin. 

## Setup your environment

To start, connect to your project workspace.

In [2]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to the project workspace
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

Found the config file in: /config.json
Class DeploymentTemplateOperations: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


Set up a compute cluster to use to train your model.

In [2]:
from azure.ai.ml.entities import AmlCompute

cluster_basic = AmlCompute(
    name="cpu-cluster",
    type="amlcompute",
    size="STANDARD_F2S_V2",  # you can replace it with other supported VM SKUs
    location=ml_client.workspaces.get(ml_client.workspace_name).location,
    min_instances=0,
    max_instances=1,
    idle_time_before_scale_down=360,
)

ml_client.begin_create_or_update(cluster_basic).result()

  mlflow.mismatch._check_version_mismatch()


AmlCompute({'type': 'amlcompute', 'created_on': None, 'provisioning_state': 'Succeeded', 'provisioning_errors': None, 'name': 'cpu-cluster', 'description': None, 'tags': None, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee/resourceGroups/datacollecrg/providers/Microsoft.MachineLearningServices/workspaces/amlforuaicaroljan/computes/cpu-cluster', 'Resource__source_path': '', 'base_path': '/mnt/batch/tasks/shared/LS_root/mounts/clusters/carsilva3/code/Users/carsilva/amle2emonitoring/notebooks', 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x78f28ffed5d0>, 'resource_id': None, 'location': 'westeurope', 'size': 'STANDARD_F2S_V2', 'min_instances': 0, 'max_instances': 1, 'idle_time_before_scale_down': 360.0, 'identity': None, 'ssh_public_access_enabled': True, 'ssh_settings': None, 'network_settings': <azure.ai.ml.entities._compute.compute.NetworkSettings object at 0x78f28ffedb70>, 'tier': 'dedicated', 

## Register data assets

Next, let's use some sample data to train our model. We will randomly split the dataset into reference and production sets. We add a timestamp column to simulate "production-like" data, since production data typically comes with timestamps. The dataset we are using in this example notebook has several columns related to credit card borrowers and contains a column on whether or not they defaulted on their credit card debt. We will train a model to predict `DEFAULT_NEXT_MONTH`, which is whether or not a borrower will default on their debt next month.

In [3]:
import pandas as pd
import datetime

# Read the default_of_credit_card_clients dataset into a pandas data frame
data_path = "https://azuremlexamples.blob.core.windows.net/datasets/credit_card/default_of_credit_card_clients.csv"
df = pd.read_csv(data_path, header=1, index_col=0).rename(
    columns={"default payment next month": "DEFAULT_NEXT_MONTH"}
)

# Split the data into production_data_df and reference_data_df
# Use the iloc method to select the first 80% and the last 20% of the rows
reference_data_df = df.iloc[: int(0.8 * len(df))].copy()
production_data_df = df.iloc[int(0.8 * len(df)) :].copy()

# Add a timestamp column in ISO8601 format
timestamp = datetime.datetime.now() - datetime.timedelta(days=45)
reference_data_df["TIMESTAMP"] = timestamp.strftime("%Y-%m-%dT%H:%M:%S")
production_data_df["TIMESTAMP"] = [
    timestamp + datetime.timedelta(minutes=i * 10)
    for i in range(len(production_data_df))
]
production_data_df["TIMESTAMP"] = production_data_df["TIMESTAMP"].apply(
    lambda x: x.strftime("%Y-%m-%dT%H:%M:%S")
)

In [4]:
import os


def write_df(df, local_path, file_name):
    # Create directory if it does not exist
    os.makedirs(local_path, exist_ok=True)

    # Write data
    df.to_csv(f"{local_path}/{file_name}", index=False)


# Write data to local directory
reference_data_dir_local_path = "../data/reference"
production_data_dir_local_path = "../data/production"

write_df(reference_data_df, reference_data_dir_local_path, "01.csv"),
write_df(production_data_df, production_data_dir_local_path, "01.csv")

In [6]:
import mltable
from mltable import MLTableHeaders, MLTableFileEncoding

from azureml.fsspec import AzureMachineLearningFileSystem
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes


def upload_data_and_create_data_asset(
    local_path, remote_path, datastore_uri, data_name, data_version
):
    # Write MLTable file
    tbl = mltable.from_delimited_files(
        paths=[{"pattern": f"{datastore_uri}{remote_path}*.csv"}],
        delimiter=",",
        header="all_files_same_headers",
        infer_column_types=True,
        include_path_column=False,
        encoding="utf8",
    )

    tbl.save(local_path)

    # Instantiate file system
    fs = AzureMachineLearningFileSystem(datastore_uri)

    # Upload data
    fs.upload(
        lpath=local_path,
        rpath=remote_path,
        recursive=False,
        **{"overwrite": "MERGE_WITH_OVERWRITE"},
    )

    # Define the Data asset object
    data = Data(
        path=f"{datastore_uri}{remote_path}",
        type=AssetTypes.MLTABLE,
        name=data_name,
        version=data_version,
    )

    # Create the data asset in the workspace
    ml_client.data.create_or_update(data)

    return data


# Datastore uri for data
datastore_uri = "azureml://subscriptions/{}/resourcegroups/{}/workspaces/{}/datastores/workspaceblobstore/paths/".format(
    ml_client.subscription_id, ml_client.resource_group_name, ml_client.workspace_name
)

# Define paths
reference_data_dir_remote_path = "data/credit-default/reference/"
production_data_dir_remote_path = "data/credit-default/production/"

# Define data asset names
reference_data_asset_name = "credit-default-reference1"
production_data_asset_name = "credit-default-production1"

# Write data to remote directory and create data asset
reference_data = upload_data_and_create_data_asset(
    reference_data_dir_local_path,
    reference_data_dir_remote_path,
    datastore_uri,
    reference_data_asset_name,
    "1",
)
production_data = upload_data_and_create_data_asset(
    production_data_dir_local_path,
    production_data_dir_remote_path,
    datastore_uri,
    production_data_asset_name,
    "1",
)

## Train the model

Train the model.

In [4]:
from azure.ai.ml import load_job

# Define training pipeline directory
training_pipeline_path = "../configurations/training_pipeline.yaml"

# Trigger training
training_pipeline_definition = load_job(source=training_pipeline_path)
training_pipeline_job = ml_client.jobs.create_or_update(training_pipeline_definition)

ml_client.jobs.stream(training_pipeline_job.name)

pathOnCompute is not a known attribute of class <class 'azure.ai.ml._restclient.v2023_04_01_preview.models._models_py3.MLFlowModelJobOutput'> and will be ignored


RunId: placid_eye_r1vlmf4c4h
Web View: https://ml.azure.com/runs/placid_eye_r1vlmf4c4h?wsid=/subscriptions/2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee/resourcegroups/datacollecrg/workspaces/amlforuaicaroljan

Streaming logs/azureml/executionlogs.txt

[2026-01-29 14:53:54Z] Submitting 1 runs, first five are: 994d1139:6b098789-bde3-425b-bd98-44b7a2bacb01
[2026-01-29 14:59:53Z] Completing processing run id 6b098789-bde3-425b-bd98-44b7a2bacb01.

Execution Summary
RunId: placid_eye_r1vlmf4c4h
Web View: https://ml.azure.com/runs/placid_eye_r1vlmf4c4h?wsid=/subscriptions/2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee/resourcegroups/datacollecrg/workspaces/amlforuaicaroljan



## Deploy the model

Deploy the model with AzureML managed online endpoints.

### Create Endpoint

In [6]:
from azure.ai.ml import load_online_endpoint

# Define endpoint directory
endpoint_path = "../endpoints/endpoint.yaml"

# Trigger endpoint creation
endpoint_definition = load_online_endpoint(source=endpoint_path)
endpoint = ml_client.online_endpoints.begin_create_or_update(endpoint_definition)

In [8]:
# Check endpoint status
endpoint = ml_client.online_endpoints.get(name=endpoint_definition.name)
print(
    f'Endpoint "{endpoint.name}" with provisioning state "{endpoint.provisioning_state}" is retrieved'
)

Endpoint "credit-default-carol" with provisioning state "Succeeded" is retrieved


### Create Deployment

As part of the deployment configuration, the Model Data Collector (MDC) is enabled, so that inference data is collected for model monitoring. 

In [11]:
from azure.ai.ml import load_online_deployment

# Define deployment directory
deployment_path = "../endpoints/deployment.yaml"

# Trigger deployment creation
deployment_definition = load_online_deployment(source=deployment_path)
deployment = ml_client.online_deployments.begin_create_or_update(deployment_definition)

Check: endpoint credit-default-carol exists


..

In [12]:
# Check deployment status
deployment = ml_client.online_deployments.get(
    name=deployment_definition.name, endpoint_name=endpoint_definition.name
)
print(
    f'Deployment "{deployment.name}" with provisioning state "{deployment.provisioning_state}" is retrieved'
)

Deployment "main" with provisioning state "Creating" is retrieved


In [13]:
import time

terminal_states = {"Succeeded", "Failed", "Canceled"}
success_state = "Succeeded"

while True:
    deployment = ml_client.online_deployments.get(
        name=deployment_definition.name,
        endpoint_name=endpoint_definition.name,
    )

    state = deployment.provisioning_state
    print(f'Deployment "{deployment.name}" provisioning state: "{state}"')

    if state == success_state:
        print("✅ Deployment is ready (Succeeded).")
        break

    if state in terminal_states and state != success_state:
        raise RuntimeError(f"❌ Deployment ended in terminal state: {state}")
  

    # Not done yet (e.g., Creating/Updating)
    time.sleep(15)



Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
..Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "main" provisioning state: "Creating"
...Deployment "m

In [14]:
endpoint = ml_client.online_endpoints.get(endpoint_definition.name)

endpoint.traffic = {
    "main": 100
}

ml_client.online_endpoints.begin_create_or_update(endpoint).result()

print("✅ 100% traffic assigned to deployment 'main'")

Readonly attribute principal_id will be ignored in class <class 'azure.ai.ml._restclient.v2022_05_01.models._models_py3.ManagedServiceIdentity'>
Readonly attribute tenant_id will be ignored in class <class 'azure.ai.ml._restclient.v2022_05_01.models._models_py3.ManagedServiceIdentity'>


✅ 100% traffic assigned to deployment 'main'


## Simulate production inference data

### Generate Sample Data

We generate sample inference data by taking the distribution for each input feature and adding a small amount of random noise. 

In [15]:
import numpy as np

# Define numeric and categotical feature columns
NUMERIC_FEATURES = [
    "LIMIT_BAL",
    "AGE",
    "BILL_AMT1",
    "BILL_AMT2",
    "BILL_AMT3",
    "BILL_AMT4",
    "BILL_AMT5",
    "BILL_AMT6",
    "PAY_AMT1",
    "PAY_AMT2",
    "PAY_AMT3",
    "PAY_AMT4",
    "PAY_AMT5",
    "PAY_AMT6",
]
CATEGORICAL_FEATURES = [
    "SEX",
    "EDUCATION",
    "MARRIAGE",
    "PAY_0",
    "PAY_2",
    "PAY_3",
    "PAY_4",
    "PAY_5",
    "PAY_6",
]


def generate_sample_inference_data(df_production, number_of_records=20):
    # Sample records
    df_sample = df_production.sample(n=number_of_records, replace=True)

    # Generate numeric features with random noise
    df_numeric_generated = pd.DataFrame(
        {
            feature: np.random.normal(
                0, df_production[feature].std(), number_of_records
            ).astype(np.int64)
            for feature in NUMERIC_FEATURES
        }
    ) + df_sample[NUMERIC_FEATURES].reset_index(drop=True)

    # Take categorical columns
    df_categorical = df_sample[CATEGORICAL_FEATURES].reset_index(drop=True)

    # Combine numerical and categorical columns
    df_combined = pd.concat([df_numeric_generated, df_categorical], axis=1)

    return df_combined

In [16]:
import mltable
import pandas as pd
from azure.ai.ml import MLClient

# Load production / inference data
data_asset = ml_client.data.get("credit-default-production", version="1")
tbl = mltable.load(data_asset.path)
df_production = tbl.to_pandas_dataframe()

# Generate sample data for inference
number_of_records = 20
df_generated = generate_sample_inference_data(df_production, number_of_records)

Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented


Resolving access token for scope "https://storage.azure.com/.default" using identity of type "MANAGED".
Getting data access token with Assigned Identity (client_id=clientid) and endpoint type based on configuration


In [17]:
display(df_generated)

Unnamed: 0,LIMIT_BAL,AGE,BILL_AMT1,BILL_AMT2,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,...,PAY_AMT6,SEX,EDUCATION,MARRIAGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,PAY_6
0,203493,54,131565,-13817,65222,158942,-9961,86276,-37007,-25423,...,6245,2,3,2,0,0,0,2,0,0
1,142592,32,41289,-178963,57650,-186662,19470,-38633,6185,-1179,...,-38357,2,1,2,1,-2,-2,-2,-1,2
2,-104267,31,-28690,64921,68486,-11122,-24126,52045,5154,17808,...,-8172,1,1,2,1,2,0,0,-2,-2
3,91286,35,150656,93909,39057,131161,99607,32120,25268,46858,...,2988,2,2,1,0,0,0,0,0,0
4,372679,32,45727,277585,151244,199390,-125193,-14051,17269,21504,...,8982,2,2,1,0,0,0,0,0,0
5,174515,24,-103854,-8155,-61038,-28022,-35993,-31037,-41914,47313,...,-13062,2,2,2,-2,-2,-2,-2,-2,-2
6,-223198,43,-42713,51426,-37248,-125736,81092,79448,5976,13112,...,7124,1,2,1,0,0,0,0,0,2
7,231280,12,335322,286797,215197,358969,244640,248936,65647,-21467,...,11081,1,1,1,0,0,0,0,0,0
8,224819,32,14906,-123993,-85802,47576,55653,-29147,-11242,-19837,...,26350,2,3,1,-2,-2,-2,-2,-2,-2
9,13920,38,-18873,245384,-57942,-89002,10082,114926,17564,25501,...,21440,2,2,2,1,2,0,0,0,0


### Call Online Managed Endpoint

Call the endpoint with the sample data. Since your deployment was created with the Model Data Collector (MDC) enabled, the inference inputs and outputs will be collected in your workspace blob storage. 

In [18]:
import json
import os

request_file_name = "request.json"

# Request sample data
data = {"data": df_generated.to_dict(orient="records")}

# Write sample data
with open(request_file_name, "w") as f:
    json.dump(data, f)

# Call online endpoint
result = ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_definition.name,
    deployment_name=deployment_definition.name,
    request_file=request_file_name,
)

# Delete sample data
os.remove(request_file_name)

In [19]:
print(result)

"{\"DEFAULT_NEXT_MONTH\": [false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, false, false, false, false, false]}"


In [21]:
# pip install -U azureml-fsspec==1.3.1 pandas

import pandas as pd
from azureml.fsspec import AzureMachineLearningFileSystem

def browse_and_preview(uri, max_list=100):
    fs = AzureMachineLearningFileSystem(uri)

    items = fs.find("/")  # recursive
    print(f"\nURI: {uri}")
    print(f"Found {len(items)} items")
    for p in items[:max_list]:
        print("  ", p)

    # preview first parquet/csv/jsonl found
    candidates = [p for p in items if p.lower().endswith((".parquet",".csv",".jsonl"))]
    if not candidates:
        print("No parquet/csv/jsonl files found to preview.")
        return

    sample = candidates[0]
    print("\nPreviewing:", sample)

    if sample.lower().endswith(".parquet"):
        df = pd.read_parquet(fs.open(sample))
        print(df.head(10))
    elif sample.lower().endswith(".csv"):
        df = pd.read_csv(fs.open(sample))
        print(df.head(10))
    else:  # jsonl
        import json
        rows = []
        with fs.open(sample) as f:
            for _ in range(20):
                rows.append(json.loads(next(f)))
        df = pd.DataFrame(rows)
        print(df.head(10))

# ---- Example URIs (replace endpoint/deployment with yours) ----
endpoint = endpoint_definition.name
deployment = deployment_definition.name

# base = f"azureml://datastores/workspaceblobstore/paths/modelDataCollector/{endpoint}/{deployment}"
#azureml://subscriptions/80ef7369-572a-4abd-b09a-033367f44858/resourcegroups/amltest1/workspaces/amltest1/datastores/workspaceblobstore/paths/modelDataCollector/credit-default-bb26/main/model_outputs/
#base = f"azureml://subscriptions/80ef7369-572a-4abd-b09a-033367f44858/resourcegroups/amltest1/workspaces/amltest1/datastores/workspaceblobstore/paths/modelDataCollector/credit-default-bb26/main"
base = f"azureml://subscriptions/2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee/resourcegroups/datacollecRG/workspaces/amlforuaicaroljan/datastores/workspaceblobstore/paths/modelDataCollector/{endpoint}/{deployment}"
print(base)

browse_and_preview(f"{base}/model_inputs/")

#browse_and_preview(f"{base}/inputs_outputs")

azureml://subscriptions/2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee/resourcegroups/datacollecRG/workspaces/amlforuaicaroljan/datastores/workspaceblobstore/paths/modelDataCollector/credit-default-carol/main

URI: azureml://subscriptions/2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee/resourcegroups/datacollecRG/workspaces/amlforuaicaroljan/datastores/workspaceblobstore/paths/modelDataCollector/credit-default-carol/main/model_inputs/
Found 78 items
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/MLmodel
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/conda.yaml
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/model.pkl
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/requirements.txt
   LocalUpload/8e7865a95e00bc0123a7fbcdf675009ff120748629ee957b5bb71ee967f99d7a/model/MLmodel
   LocalUpload/8e7865a95e00bc0123a7fbcdf675009ff120748629ee957b5bb71ee967f99d7a/mod

Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented


In [22]:
browse_and_preview(f"{base}/model_outputs/")


URI: azureml://subscriptions/2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee/resourcegroups/datacollecRG/workspaces/amlforuaicaroljan/datastores/workspaceblobstore/paths/modelDataCollector/credit-default-carol/main/model_outputs/
Found 78 items
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/MLmodel
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/conda.yaml
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/model.pkl
   LocalUpload/5314df52c81210d5668eca591f1f6dbe5da1a319b66eabc5fb1b3d45eb5672bf/model/requirements.txt
   LocalUpload/8e7865a95e00bc0123a7fbcdf675009ff120748629ee957b5bb71ee967f99d7a/model/MLmodel
   LocalUpload/8e7865a95e00bc0123a7fbcdf675009ff120748629ee957b5bb71ee967f99d7a/model/conda.yaml
   LocalUpload/8e7865a95e00bc0123a7fbcdf675009ff120748629ee957b5bb71ee967f99d7a/model/model.pkl
   LocalUpload/8e7865a95e00bc0123a7fbcdf675009ff120748629ee957b5bb71ee967f99d7a/model/re

## Create model monitor

In [23]:
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    AlertNotification,
    MonitoringTarget,
    MonitorDefinition,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute,
)

# get a handle to the workspace
# ml_client = MLClient(
#     DefaultAzureCredential(),
#     subscription_id="80ef7369-572a-4abd-b09a-033367f44858",
#     resource_group_name="amltest1",
#     workspace_name="amltest1",
# )
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

Found the config file in: /config.json
Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented


Here is a basic model monitor. Please feel free to augment it to meet the needs of your scenario. 

In [25]:
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    AlertNotification,
    MonitoringTarget,
    MonitorDefinition,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute,
)

# get a handle to the workspace
# ml_client = MLClient(
#     DefaultAzureCredential(),
#     subscription_id="80ef7369-572a-4abd-b09a-033367f44858",
#     resource_group_name="amltest1",
#     workspace_name="amltest1",
# )
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# create the compute
spark_compute = ServerlessSparkCompute(
    instance_type="standard_e4s_v3", runtime_version="3.4"
)

# specify your online endpoint deployment
monitoring_target = MonitoringTarget(
    ml_task="classification", endpoint_deployment_id="azureml:credit-default-carol:main"
)


# create alert notification object
alert_notification = AlertNotification(emails=["carsilva@microsoft.com", "carsilva@microsoft.com"])

# create the monitor definition
monitor_definition = MonitorDefinition(
    compute=spark_compute,
    monitoring_target=monitoring_target,
    alert_notification=alert_notification,
)

# specify the schedule frequency
recurrence_trigger = RecurrenceTrigger(
    frequency="day", interval=1, schedule=RecurrencePattern(hours=3, minutes=15)
)

# create the monitor
model_monitor = MonitorSchedule(
    name="credit_default_monitor_basic",
    trigger=recurrence_trigger,
    create_monitor=monitor_definition,
)

poller = ml_client.schedules.begin_create_or_update(model_monitor)
created_monitor = poller.result()

Found the config file in: /config.json
Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented


......

Here is an advanced model monitoring configuration. Feel free to augment it to meet the needs of your scenario. 

In [28]:
from azure.identity import DefaultAzureCredential
from azure.ai.ml import Input, MLClient
from azure.ai.ml.constants import (
    MonitorDatasetContext,
)
from azure.ai.ml.entities import (
    AlertNotification,
    DataDriftSignal,
    DataQualitySignal,
    PredictionDriftSignal,
    DataDriftMetricThreshold,
    DataQualityMetricThreshold,
    PredictionDriftMetricThreshold,
    NumericalDriftMetrics,
    CategoricalDriftMetrics,
    DataQualityMetricsNumerical,
    DataQualityMetricsCategorical,
    MonitorFeatureFilter,
    MonitoringTarget,
    MonitorDefinition,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute,
    ReferenceData,
)

# get a handle to the workspace
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="2e5fc10d-039e-4cb2-975d-4ed7bc31a0ee",
    resource_group_name="datacollecRG",
    workspace_name="amlforuaicaroljan",
)

# create your compute
spark_compute = ServerlessSparkCompute(
    instance_type="standard_e4s_v3", runtime_version="3.4"
)

# specify the online deployment (if you have one)
monitoring_target = MonitoringTarget(
    ml_task="classification", endpoint_deployment_id="azureml:credit-default:main"
)

# training data to be used as baseline dataset
reference_data_training = ReferenceData(
    input_data=Input(type="mltable", path="azureml:credit-default-reference:1"),
    #target_column_name="DEFAULT_NEXT_MONTH",
    data_context=MonitorDatasetContext.TRAINING,
)

# create an advanced data drift signal
features = MonitorFeatureFilter(top_n_feature_importance=10)

metric_thresholds = DataDriftMetricThreshold(
    numerical=NumericalDriftMetrics(jensen_shannon_distance=0.01),
    categorical=CategoricalDriftMetrics(pearsons_chi_squared_test=0.02),
)

advanced_data_drift = DataDriftSignal(
    reference_data=reference_data_training,
    features=features,
    metric_thresholds=metric_thresholds,
)

# create an advanced prediction drift signal
metric_thresholds = PredictionDriftMetricThreshold(
    categorical=CategoricalDriftMetrics(jensen_shannon_distance=0.01)
)

advanced_prediction_drift = PredictionDriftSignal(
    reference_data=reference_data_training, metric_thresholds=metric_thresholds
)

# create an advanced data quality signal
features = ["SEX", "EDUCATION", "AGE"]

metric_thresholds = DataQualityMetricThreshold(
    numerical=DataQualityMetricsNumerical(null_value_rate=0.01),
    categorical=DataQualityMetricsCategorical(out_of_bounds_rate=0.02),
)

advanced_data_quality = DataQualitySignal(
    reference_data=reference_data_training,
    features=features,
    metric_thresholds=metric_thresholds,
    alert_enabled=False,
)

# put all monitoring signals in a dictionary
monitoring_signals = {
    "data_drift_advanced": advanced_data_drift,
    "data_quality_advanced": advanced_data_quality,
}

# create alert notification object
alert_notification = AlertNotification(emails=["abc@example.com", "def@example.com"])

# create the monitor definition
monitor_definition = MonitorDefinition(
    compute=spark_compute,
    monitoring_target=monitoring_target,
    monitoring_signals=monitoring_signals,
    alert_notification=alert_notification,
)

# specify the frequency on which to run your monitor
recurrence_trigger = RecurrenceTrigger(
    frequency="day", interval=1, schedule=RecurrencePattern(hours=3, minutes=15)
)

# create your monitor
model_monitor = MonitorSchedule(
    name="credit_default_monitor_advanced",
    trigger=recurrence_trigger,
    create_monitor=monitor_definition,
)

poller = ml_client.schedules.begin_create_or_update(model_monitor)
created_monitor = poller.result()

Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented
Attempting to instrument while already instrumented


ResourceNotFoundError: (ResourceNotFound) The Resource 'Microsoft.MachineLearningServices/workspaces/amlforuaicaroljan/onlineEndpoints/credit-default/deployments/main' under resource group 'datacollecRG' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix
Code: ResourceNotFound
Message: The Resource 'Microsoft.MachineLearningServices/workspaces/amlforuaicaroljan/onlineEndpoints/credit-default/deployments/main' under resource group 'datacollecRG' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix

In [None]:
#delete the monitoring
#delete the drift


In [None]:

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

endpoint_name = "credit-default-bb26"   # <-- your endpoint name

endpoint = ml_client.online_endpoints.get(endpoint_name)

# Set traffic to 0 for the deployment(s) you currently route to
# If you know the deployment name:
endpoint.traffic = {"main": 0}  # replace "blue"

ml_client.begin_create_or_update(endpoint).result()


In [None]:
#endpoint_name = "credit-default-bb26"
deployment_name = "main"   # <-- deployment to delete

ml_client.online_deployments.begin_delete(
    name=deployment_name,
    endpoint_name=endpoint_name
).result()

In [None]:
# endpoint_name = "credit-default-bb26"

deployments = ml_client.online_deployments.list(endpoint_name=endpoint_name)
for d in deployments:
    ml_client.online_deployments.begin_delete(
        name=d.name,
        endpoint_name=endpoint_name
    ).result()

In [None]:
ml_client.online_endpoints.begin_delete(name=endpoint_name).result()

In [None]:
# finally done