Azure ML Model Monitoring

This project provides an end-to-end workflow for training, registering, and monitoring machine learning models using Azure Machine Learning (AML). It demonstrates how to set up data drift monitoring using both the Azure ML Python SDK and REST API approaches.

What This Project Accomplishes

  1. Trains a classification model (RandomForest) on synthetic transaction data
  2. Registers the model and datasets in Azure ML workspace
  3. Uploads inference data to simulate production model inputs
  4. Creates MLTable assets required for Azure ML monitoring
  5. Sets up a Data Drift Monitor that compares production data distributions against training data baseline
  6. Verifies the monitoring setup using both SDK and REST API methods

Project Structure

amlmodelmonitoring/
├── .env                          # Environment variables (create from template)
├── set_env.ps1                   # Loads .env variables into PowerShell session
├── requirements.txt              # Python dependencies
│
├── train_register.py             # Train model & register dataset/model in Azure ML
├── upload_inference.py           # Upload inference batch data to Azure ML
├── register_mltable.py           # Convert CSV data to MLTable format (required for monitoring)
│
├── create_monitor_sdk.py         # Create data drift monitor using Azure ML SDK
├── create_monitor.py             # Alternative monitor creation (custom helper approach)
├── monitoring_setup.py           # Helper classes for monitor configuration
│
├── verify_monitor.py             # Verify monitors using Azure ML SDK
├── verify_monitor_rest.ps1       # Verify monitors using ARM REST API (PowerShell)
│
├── train.csv                     # Training data (generated by train_register.py)
├── inference_batch.csv           # Inference data (generated by upload_inference.py)
├── rf_model.joblib               # Trained model artifact
│
├── scripts/
│   └── check_monitor_api_versions.py  # Utility to check API version compatibility
│
├── tests/
│   └── test_monitoring_setup.py  # Unit tests for monitoring setup
│
└── *.json                        # Various debug/output files

File Descriptions

Core Scripts

  • train_register.py: Generates synthetic classification data (2000 samples, 5 features), trains a RandomForest model, and registers both the dataset (tx_training_dataset) and the model (tx_rf_model) in Azure ML. A sketch of this flow follows the list.
  • upload_inference.py: Creates sample inference data and uploads it to Azure ML as a data asset (inference_batch_csv), simulating production model inputs.
  • register_mltable.py: Converts CSV files to MLTable format and registers them as Azure ML data assets. Required for monitoring, since Azure ML monitoring only supports the MLTable format. Creates tx_training_mltable and tx_inference_mltable.
  • create_monitor_sdk.py: Main monitoring script. Creates a data drift monitor using the Azure ML Python SDK, configuring serverless Spark compute, data drift signals, metric thresholds, and a daily schedule.
  • verify_monitor.py: Lists all monitor schedules in the workspace using the Azure ML SDK and shows monitor names, types, signals, and triggers.
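
For orientation, here is a minimal sketch of the train-and-register flow, assuming the asset names listed above; the actual train_register.py may differ in details such as hyperparameters.

# Sketch of the train-and-register flow (illustrative, not the project's exact script)
import os

import joblib
import pandas as pd
from azure.ai.ml import MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data, Model
from azure.identity import DefaultAzureCredential
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Generate synthetic transaction-like data (2000 samples, 5 features)
X, y = make_classification(n_samples=2000, n_features=5, random_state=42)
df = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(5)])
df["label"] = y
df.to_csv("train.csv", index=False)

# Train and serialize the model
model = RandomForestClassifier(random_state=42).fit(X, y)
joblib.dump(model, "rf_model.joblib")

# Register dataset and model in the Azure ML workspace
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
    workspace_name=os.environ["AZURE_ML_WORKSPACE"],
)
ml_client.data.create_or_update(
    Data(name="tx_training_dataset", path="train.csv", type=AssetTypes.URI_FILE)
)
ml_client.models.create_or_update(
    Model(name="tx_rf_model", path="rf_model.joblib", type=AssetTypes.CUSTOM_MODEL)
)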

Alternative/Helper Scripts

  • create_monitor.py: Alternative monitor creation using custom helper functions from monitoring_setup.py.
  • monitoring_setup.py: Dataclasses and helper functions (DataDriftSignal, MonitorSchedule, create_drift_signal, create_monitor_schedule) for building monitor configurations; a hypothetical sketch of their shape follows the list.
  • verify_monitor_rest.ps1: PowerShell script that verifies monitors using the ARM REST API via az rest. Uses the /schedules endpoint (monitor schedules are a type of schedule in Azure ML) and iterates through supported API versions (2024-04-01, 2023-10-01, etc.) until it finds one that works.
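
The helper module itself is not reproduced here; a plausible shape for it, using the names listed above but otherwise hypothetical, looks roughly like:

# Hypothetical sketch of monitoring_setup.py-style helpers; the real module may differ
from dataclasses import dataclass, field


@dataclass
class DataDriftSignal:
    reference_data: str                      # e.g. "azureml:tx_training_mltable:1"
    production_data: str                     # e.g. "azureml:tx_inference_mltable:1"
    features: list[str] = field(default_factory=list)
    numerical_threshold: float = 0.02        # illustrative default


@dataclass
class MonitorSchedule:
    name: str
    signal: DataDriftSignal
    frequency: str = "day"
    hour: int = 6


def create_drift_signal(reference_data: str, production_data: str,
                        features: list[str]) -> DataDriftSignal:
    # Bundle the baseline/production references and the monitored features
    return DataDriftSignal(reference_data, production_data, features)


def create_monitor_schedule(name: str, signal: DataDriftSignal) -> MonitorSchedule:
    # Attach the drift signal to a daily 6 AM schedule by default
    return MonitorSchedule(name=name, signal=signal)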

Configuration Files

  • set_env.ps1: PowerShell script that loads environment variables from the .env file into the current session.
  • requirements.txt: Python dependencies: azure-ai-ml, azure-identity, pandas, scikit-learn, joblib, python-dotenv.

Data Files (Generated)

  • train.csv: Training dataset with columns feature_0 to feature_4 and label. Generated by train_register.py.
  • inference_batch.csv: Inference batch data with columns feature_0 to feature_4. Generated by upload_inference.py.
  • rf_model.joblib: Serialized RandomForest classifier model.

Prerequisites

  • Python 3.10+
  • Azure CLI installed and logged in (az login)
  • Azure ML Workspace with appropriate permissions
  • Service Principal or user credentials with Contributor access to the workspace

Environment Variables

Create a .env file in the project root with the following variables:

AZURE_SUBSCRIPTION_ID=your-subscription-id
AZURE_RESOURCE_GROUP=your-resource-group-name
AZURE_ML_WORKSPACE=your-workspace-name
DEFAULT_DATASTORE=your-datastore-name          # Optional, has default
ALERT_EMAIL=your-email@example.com             # Optional, for monitor alerts
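
The Python scripts pick these values up at startup (via python-dotenv) and use them to build an MLClient. A minimal sketch, assuming the variable names above:

# Sketch: load .env values and connect to the workspace
import os

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root into os.environ

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
    workspace_name=os.environ["AZURE_ML_WORKSPACE"],
)
print(ml_client.workspace_name)  # quick sanity check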

Step-by-Step Setup Guide

Step 1: Clone and Set Up Environment

# Navigate to project directory
cd C:\Projects\GithubLocal\amlmodelmonitoring

# Create virtual environment
python -m venv .venv

# Activate virtual environment
.\.venv\Scripts\Activate.ps1

# Install dependencies
pip install -r requirements.txt

Step 2: Configure Environment Variables

# Create .env file with your Azure details (edit with your values)
@"
AZURE_SUBSCRIPTION_ID=your-subscription-id
AZURE_RESOURCE_GROUP=your-resource-group
AZURE_ML_WORKSPACE=your-workspace-name
ALERT_EMAIL=your-email@example.com
"@ | Out-File -FilePath .env -Encoding utf8

# Load environment variables into PowerShell session
.\set_env.ps1

Step 3: Login to Azure

# Login to Azure (opens browser)
az login

# Set the subscription (if you have multiple)
az account set --subscription $env:AZURE_SUBSCRIPTION_ID

Step 4: Train Model and Register Assets

# Train model and register dataset + model in Azure ML
python train_register.py

Output: Creates train.csv, rf_model.joblib, and registers:

  • Dataset: tx_training_dataset
  • Model: tx_rf_model

Step 5: Upload Inference Data

# Create and upload inference batch data
python upload_inference.py

Output: Creates inference_batch.csv and registers data asset inference_batch_csv.
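
Internally this amounts to registering the CSV as a uri_file data asset; a minimal sketch, assuming the asset name above and the ml_client from the earlier sketch:

# Sketch: register inference_batch.csv as an Azure ML data asset
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data

inference_asset = Data(
    name="inference_batch_csv",
    path="inference_batch.csv",
    type=AssetTypes.URI_FILE,
    description="Simulated production inputs for drift monitoring",
)
ml_client.data.create_or_update(inference_asset)  # ml_client as built earlier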

Step 6: Register MLTable Assets (Required for Monitoring)

# Convert CSV data to MLTable format
python register_mltable.py

Output: Creates and registers:

  • tx_training_mltable (baseline/reference data)
  • tx_inference_mltable (production data)
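
An MLTable asset is essentially a folder containing the data file plus an MLTable YAML describing how to read it. A rough sketch of what register_mltable.py does for the training data (the inference data follows the same pattern); paths and folder names are illustrative:

# Sketch: wrap train.csv in an MLTable folder and register it
import pathlib
import shutil

from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data

mltable_dir = pathlib.Path("train_mltable")
mltable_dir.mkdir(exist_ok=True)
shutil.copy("train.csv", mltable_dir / "train.csv")

# MLTable definition: read the CSV with headers
(mltable_dir / "MLTable").write_text(
    "paths:\n"
    "  - file: ./train.csv\n"
    "transformations:\n"
    "  - read_delimited:\n"
    "      delimiter: ','\n"
    "      header: all_files_same_headers\n"
)

ml_client.data.create_or_update(
    Data(name="tx_training_mltable", path=str(mltable_dir), type=AssetTypes.MLTABLE)
)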

Step 7: Create the Data Drift Monitor

# Create the monitoring schedule
python create_monitor_sdk.py

Output: Creates tx_data_drift_monitor schedule that:

  • Runs daily at 6 AM
  • Compares inference data distribution against training data baseline
  • Monitors features: feature_0 through feature_4
  • Sends alerts if drift is detected

Step 8: Verify the Monitor

# Verify using Python SDK
python verify_monitor.py

# Or verify using REST API (PowerShell)
.\verify_monitor_rest.ps1
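
verify_monitor.py boils down to listing schedules through the SDK; a minimal equivalent, assuming ml_client is built as in the earlier sketches:

# Sketch: list monitor schedules in the workspace
for schedule in ml_client.schedules.list():
    # Monitors show up as schedules whose action creates a monitor
    print(schedule.name, type(schedule).__name__, schedule.trigger)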

Quick Start (All Steps)

After setting up .env, run all steps in sequence:

# Load environment
.\set_env.ps1
.\.venv\Scripts\Activate.ps1

# Full pipeline
python train_register.py
python upload_inference.py
python register_mltable.py
python create_monitor_sdk.py
python verify_monitor.py

Viewing Monitors in Azure ML Studio

After creating a monitor, view it in Azure ML Studio:

  1. Go to Azure ML Studio
  2. Select your workspace
  3. Navigate to Monitoring in the left menu
  4. Find tx_data_drift_monitor in the list

Direct link (replace with your values):

https://ml.azure.com/monitoring?wsid=/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.MachineLearningServices/workspaces/{workspace-name}

SDK vs REST API Approaches

This project demonstrates two approaches to interact with Azure ML monitoring:

  • Python SDK (create_monitor_sdk.py, verify_monitor.py): type-safe, easier to use, and gives better error messages, but requires the azure-ai-ml package.
  • REST API (verify_monitor_rest.ps1): no Python dependencies and works with az rest, but requires manual URL construction and is less intuitive.

REST API Details

The verify_monitor_rest.ps1 script uses the Azure Resource Manager (ARM) REST API to list schedules:

GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroup}/providers/Microsoft.MachineLearningServices/workspaces/{workspace}/schedules?api-version=2024-04-01

Key points:

  • Endpoint: /schedules — Monitor schedules are a subtype of the general schedules resource in Azure ML
  • No dedicated /monitors endpoint — Monitors are represented as schedules with actionType: CreateMonitor
  • API versions tested: 2024-04-01, 2024-01-01-preview, 2023-10-01, 2023-06-01-preview, 2023-04-01-preview
  • Authentication: Uses az rest which leverages your Azure CLI login session
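
The same listing can also be done from Python without the SDK by calling ARM directly; a sketch using azure-identity and requests, with the endpoint and api-version shown above:

# Sketch: list schedules via the ARM REST API instead of the SDK
import os

import requests
from azure.identity import AzureCliCredential

# Reuse the Azure CLI login session to get an ARM token
token = AzureCliCredential().get_token("https://management.azure.com/.default").token
url = (
    "https://management.azure.com"
    f"/subscriptions/{os.environ['AZURE_SUBSCRIPTION_ID']}"
    f"/resourceGroups/{os.environ['AZURE_RESOURCE_GROUP']}"
    "/providers/Microsoft.MachineLearningServices"
    f"/workspaces/{os.environ['AZURE_ML_WORKSPACE']}"
    "/schedules?api-version=2024-04-01"
)
resp = requests.get(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
for item in resp.json().get("value", []):
    action = item.get("properties", {}).get("action", {})
    # Monitor schedules carry actionType "CreateMonitor"
    print(item["name"], action.get("actionType"))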

Troubleshooting

Common Errors

  • "Compute runtime version must be 3.4": update to ServerlessSparkCompute(runtime_version="3.4").
  • "UriFile is not supported": use MLTable format; run register_mltable.py.
  • ".emails is missing": the alert notification requires at least one email address.
  • "Please load environment variables first": run .\set_env.ps1 before running the scripts.

Verify Environment Variables

# Check that variables are loaded
echo $env:AZURE_SUBSCRIPTION_ID
echo $env:AZURE_RESOURCE_GROUP
echo $env:AZURE_ML_WORKSPACE

Running Tests

# Run all tests
pytest

# Run with verbose output
pytest -v

# Run specific test file
pytest tests/test_monitoring_setup.py

Notes

  • MLTable format is required for Azure ML model monitoring — CSV/URI files are not supported
  • Serverless Spark runtime 3.4 is the minimum required version as of late 2024
  • Adjust metric thresholds in create_monitor_sdk.py based on your drift tolerance
  • Monitor schedules can be paused/resumed from Azure ML Studio
