# 01 ‚Äî Azure ML + MLflow Quickstart

This notebook connects to an Azure ML workspace, sets MLflow tracking to the workspace, trains a simple model, logs metrics + artifacts, registers the model, and (optionally) deploys a managed online endpoint.

## Prereqs
- Azure subscription + access to the workshop resource group
- Azure ML workspace deployed (see `infra/main.bicep`)
- Auth: `DefaultAzureCredential` (recommended) or interactive browser fallback

In [1]:
# If you're running this notebook in a fresh environment, run:
# %pip install -r ../requirements.txt

import os
import json
import time
import uuid
import sys
import site

# Avoid mixing user-site packages with the repo venv (prevents weird import conflicts).
try:
    user_site = site.getusersitepackages()
    sys.path = [p for p in sys.path if os.path.normcase(p) != os.path.normcase(user_site)]
except Exception:
    pass

# Note: Do NOT delete mlflow/protobuf modules from sys.modules here.
# Re-importing MLflow protos in the same kernel can trigger protobuf descriptor errors.
# If you changed MLflow/protobuf versions, use 'Restart Kernel' in VS Code.

import mlflow
import numpy as np
import pandas as pd

from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

from azure.ai.ml import MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Model, ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

print('Imports OK')
print('Python:', sys.executable)
print('mlflow:', mlflow.__version__, 'from', mlflow.__file__)

Imports OK
Python: c:\Users\ritwickdutta\OneDrive - Microsoft\Documents\DND MLOps Demo\.venv311\Scripts\python.exe
mlflow: 2.16.2 from c:\Users\ritwickdutta\OneDrive - Microsoft\Documents\DND MLOps Demo\.venv311\Lib\site-packages\mlflow\__init__.py


In [2]:
# Workshop configuration
# Prefer environment variables so attendees don't have to edit the notebook.
# If you prefer, you can also paste values directly into this cell.
import subprocess
import shutil
import platform

def _get_windows_persisted_env(var_name: str) -> str:
    """Read a persisted env var from Windows registry (HKCU/HKLM).
    This helps when the kernel started before `setx` was run (no restart yet).
    """
    try:
        import winreg  # type: ignore
    except Exception:
        return ""

    for root, path in (
        (winreg.HKEY_CURRENT_USER, r"Environment"),
        (winreg.HKEY_LOCAL_MACHINE, r"SYSTEM\CurrentControlSet\Control\Session Manager\Environment"),
    ):
        try:
            with winreg.OpenKey(root, path) as key:
                value, _ = winreg.QueryValueEx(key, var_name)
                if isinstance(value, str) and value.strip():
                    return value.strip()
        except FileNotFoundError:
            continue
        except OSError:
            continue
    return ""

SUBSCRIPTION_ID = os.getenv('AZURE_SUBSCRIPTION_ID', '').strip()
RESOURCE_GROUP = os.getenv('AZURE_RESOURCE_GROUP', 'rg-dnd-mlops-demo').strip()
WORKSPACE_NAME = os.getenv('AZURE_ML_WORKSPACE', 'mlw-dndmlops2-dev').strip()

# If the kernel started before `setx`, os.getenv won't see the new value.
# On Windows, try loading the persisted user env var from the registry.
if not SUBSCRIPTION_ID and platform.system() == 'Windows':
    persisted = _get_windows_persisted_env('AZURE_SUBSCRIPTION_ID')
    if persisted:
        SUBSCRIPTION_ID = persisted
        os.environ['AZURE_SUBSCRIPTION_ID'] = SUBSCRIPTION_ID
        print('Loaded AZURE_SUBSCRIPTION_ID from Windows user environment (no kernel restart needed).')

# Optional manual override (uncomment and paste):
# SUBSCRIPTION_ID = "<your-subscription-id>"

if not SUBSCRIPTION_ID and shutil.which('az'):
    try:
        SUBSCRIPTION_ID = subprocess.check_output(
            ['az', 'account', 'show', '--query', 'id', '-o', 'tsv'],
            text=True,
            stderr=subprocess.STDOUT,
        ).strip()
        if SUBSCRIPTION_ID:
            os.environ['AZURE_SUBSCRIPTION_ID'] = SUBSCRIPTION_ID
            print('Using subscription from Azure CLI context (az account show).')
    except Exception as e:
        print('Azure CLI fallback failed:', repr(e))

if not SUBSCRIPTION_ID:
    raise ValueError(
        'Missing AZURE_SUBSCRIPTION_ID. Set it as an env var (or restart kernel after setx), '
        'or uncomment the manual override in this cell.'
    )

print('Subscription:', SUBSCRIPTION_ID)
print('Resource group:', RESOURCE_GROUP)
print('Workspace:', WORKSPACE_NAME)

Loaded AZURE_SUBSCRIPTION_ID from Windows user environment (no kernel restart needed).
Subscription: 1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a
Resource group: rg-dnd-mlops-demo
Workspace: mlw-dndmlops2-dev


In [3]:
# Authenticate and connect to Azure ML
try:
    credential = DefaultAzureCredential(exclude_interactive_browser_credential=True)
    credential.get_token('https://management.azure.com/.default')
except Exception:
    credential = InteractiveBrowserCredential()

ml_client = MLClient(
    credential=credential,
    subscription_id=SUBSCRIPTION_ID,
    resource_group_name=RESOURCE_GROUP,
    workspace_name=WORKSPACE_NAME,
)

print('Connected to workspace:', ml_client.workspace_name)

Class DeploymentTemplateOperations: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


Connected to workspace: mlw-dndmlops2-dev


In [4]:
# Configure MLflow to use Azure ML workspace tracking
tracking_uri = ml_client.workspaces.get(WORKSPACE_NAME).mlflow_tracking_uri
mlflow.set_tracking_uri(tracking_uri)

experiment_name = os.getenv('MLFLOW_EXPERIMENT_NAME', 'mlops-hackathon-demo')
mlflow.set_experiment(experiment_name)

print('MLflow tracking URI:', tracking_uri)
print('Experiment:', experiment_name)

MLflow tracking URI: azureml://eastus.api.azureml.ms/mlflow/v2.0/subscriptions/1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a/resourceGroups/rg-dnd-mlops-demo/providers/Microsoft.MachineLearningServices/workspaces/mlw-dndmlops2-dev
Experiment: mlops-hackathon-demo


In [5]:
# Optional: avoid Azure CLI credential timeouts when logging artifacts
import os
os.environ.setdefault("AZURE_IDENTITY_DISABLE_AZURECLI", "true")

'true'

In [6]:
# Load dataset (OpenML Spambase)
print('Loading Spambase dataset from OpenML...')
spambase = fetch_openml(data_id=44, as_frame=True, parser='auto')
data = spambase.frame.rename(columns={'class': 'is_spam'})
data['is_spam'] = data['is_spam'].astype(int)

X = data.drop('is_spam', axis=1)
y = data['is_spam']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print('Train:', X_train.shape, 'Test:', X_test.shape)
X_train.head()

Loading Spambase dataset from OpenML...
Train: (3680, 57) Test: (921, 57)


Unnamed: 0,word_freq_make,word_freq_address,word_freq_all,word_freq_3d,word_freq_our,word_freq_over,word_freq_remove,word_freq_internet,word_freq_order,word_freq_mail,...,word_freq_conference,char_freq_%3B,char_freq_%28,char_freq_%5B,char_freq_%21,char_freq_%24,char_freq_%23,capital_run_length_average,capital_run_length_longest,capital_run_length_total
2940,0.05,0.0,0.45,0.0,0.15,0.1,0.0,0.0,0.55,0.0,...,0.0,0.203,0.195,0.05,0.0,0.014,0.0,2.88,45,1080
1303,0.17,0.26,1.21,0.0,0.43,0.6,0.43,0.26,0.69,0.52,...,0.0,0.0,0.108,0.0,0.271,0.243,0.013,6.395,583,1375
3468,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.153,0.0,0.0,1.933,7,58
3181,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.333,20,26
794,0.0,0.56,0.0,0.0,0.56,0.0,0.0,0.0,1.01,0.56,...,0.0,0.0,0.186,0.0,0.056,0.056,0.0,2.153,53,532


In [7]:
# Train + log to MLflow
from mlflow.models import infer_signature

params = {
    'n_estimators': 100,
    'max_depth': 10,
    'min_samples_split': 5,
    'min_samples_leaf': 2,
    'random_state': 42,
}

with mlflow.start_run(run_name='spam-classifier-rf') as run:
    mlflow.log_params(params)
    mlflow.log_param('training_samples', len(X_train))
    mlflow.log_param('dataset', 'UCI Spambase')
    mlflow.log_param('num_features', X_train.shape[1])

    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    y_pred_proba = model.predict_proba(X_test)[:, 1]

    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1_score': f1_score(y_test, y_pred),
        'roc_auc': roc_auc_score(y_test, y_pred_proba),
    }
    mlflow.log_metrics(metrics)

    feature_importance = (
        pd.DataFrame({'feature': X.columns, 'importance': model.feature_importances_})
        .sort_values('importance', ascending=False)
    )
    feature_importance.to_csv('feature_importance.csv', index=False)
    mlflow.log_artifact('feature_importance.csv')

    signature = infer_signature(X_train, model.predict(X_train))

    mlflow.sklearn.log_model(
        model,
        artifact_path='model',
        signature=signature,
        registered_model_name='spam-classifier',
    )

print('Run ID:', run.info.run_id)
print('Metrics:', metrics)
feature_importance.head(10)

Registered model 'spam-classifier' already exists. Creating a new version of this model...
2026/02/04 14:42:15 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: spam-classifier, version 5
Created version '5' of model 'spam-classifier'.
2026/02/04 14:42:22 INFO mlflow.tracking._tracking_service.client: üèÉ View run spam-classifier-rf at: https://eastus.api.azureml.ms/mlflow/v2.0/subscriptions/1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a/resourceGroups/rg-dnd-mlops-demo/providers/Microsoft.MachineLearningServices/workspaces/mlw-dndmlops2-dev/#/experiments/5a65b315-6ff3-4683-9f81-002232b041c1/runs/6cab43f8-6e4b-4d2b-9aaa-a3be6c084287.
2026/02/04 14:42:22 INFO mlflow.tracking._tracking_service.client: üß™ View experiment at: https://eastus.api.azureml.ms/mlflow/v2.0/subscriptions/1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a/resourceGroups/rg-dnd-mlops-demo/providers/Microsoft.MachineLearningServices/workspaces/mlw-dndmlops2-dev/#

Run ID: 6cab43f8-6e4b-4d2b-9aaa-a3be6c084287
Metrics: {'accuracy': 0.9348534201954397, 'precision': 0.9495548961424333, 'recall': 0.8815426997245179, 'f1_score': 0.9142857142857143, 'roc_auc': 0.9802867383512545}


Unnamed: 0,feature,importance
51,char_freq_%21,0.130966
52,char_freq_%24,0.117364
6,word_freq_remove,0.095036
15,word_freq_free,0.074471
54,capital_run_length_average,0.056725
56,capital_run_length_total,0.05468
55,capital_run_length_longest,0.049536
24,word_freq_hp,0.046051
20,word_freq_your,0.044948
23,word_freq_money,0.034451


In [8]:
# (Optional) Register model in Azure ML with governance metadata
# Note: MLflow registration above already creates a registered model in many setups.
# This step adds explicit tags/properties via the Azure ML SDK.

model_uri = f'runs:/{run.info.run_id}/model'

registered_model = ml_client.models.create_or_update(
    Model(
        path=model_uri,
        name='spam-classifier',
        type=AssetTypes.MLFLOW_MODEL,
        description='Email spam classifier trained on UCI Spambase dataset',
        tags={
            'author': os.getenv('MODEL_AUTHOR', 'workshop-attendee'),
            'use_case': 'spam_detection',
            'dataset': 'UCI Spambase',
            'framework': 'sklearn',
            'algorithm': 'RandomForest',
        },
        properties={k: str(round(v, 4)) for k, v in metrics.items()},
    )
)

print('Model registered:', registered_model.name, registered_model.version)

Model registered: spam-classifier 6


## Fix: Re-log Model with NumPy 1.x Constraint

MLflow embeds conda/pip dependencies when logging a model. If the model was logged with NumPy 2.x, Azure ML's curated environments (which use NumPy 1.x) will fail to load it.

**Solution**: Re-log the model with explicit `pip_requirements` specifying `numpy<2.0`.

In [21]:
# Re-log the model with explicit numpy<2.0 constraint
# This ensures the model's embedded environment is compatible with Azure ML

from mlflow.models import infer_signature

# Define explicit pip requirements that work with Azure ML
pip_requirements = [
    "numpy<2.0",
    "scikit-learn>=1.0,<2.0",
    "pandas",
    "mlflow",
]

with mlflow.start_run(run_name='spam-classifier-rf-numpy1x') as run:
    mlflow.log_params(params)
    mlflow.log_param('training_samples', len(X_train))
    mlflow.log_param('dataset', 'UCI Spambase')
    mlflow.log_param('numpy_constraint', '<2.0')
    
    # Use the already-trained model from memory
    y_pred = model.predict(X_test)
    y_pred_proba = model.predict_proba(X_test)[:, 1]
    
    mlflow.log_metrics(metrics)
    
    signature = infer_signature(X_train, model.predict(X_train))
    
    # Log with explicit pip_requirements to override auto-detected deps
    mlflow.sklearn.log_model(
        model,
        artifact_path='model',
        signature=signature,
        pip_requirements=pip_requirements,
        registered_model_name='spam-classifier',
    )

print(f'Run ID: {run.info.run_id}')
print(f'Model logged with pip_requirements: {pip_requirements}')

Registered model 'spam-classifier' already exists. Creating a new version of this model...
2026/02/04 21:49:45 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: spam-classifier, version 7
Created version '7' of model 'spam-classifier'.
2026/02/04 21:49:48 INFO mlflow.tracking._tracking_service.client: üèÉ View run spam-classifier-rf-numpy1x at: https://eastus.api.azureml.ms/mlflow/v2.0/subscriptions/1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a/resourceGroups/rg-dnd-mlops-demo/providers/Microsoft.MachineLearningServices/workspaces/mlw-dndmlops2-dev/#/experiments/5a65b315-6ff3-4683-9f81-002232b041c1/runs/fa046904-ce27-4847-a34a-03e82f662cd4.
2026/02/04 21:49:48 INFO mlflow.tracking._tracking_service.client: üß™ View experiment at: https://eastus.api.azureml.ms/mlflow/v2.0/subscriptions/1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a/resourceGroups/rg-dnd-mlops-demo/providers/Microsoft.MachineLearningServices/workspaces/mlw-dndmlop

Run ID: fa046904-ce27-4847-a34a-03e82f662cd4
Model logged with pip_requirements: ['numpy<2.0', 'scikit-learn>=1.0,<2.0', 'pandas', 'mlflow']


In [22]:
# Register the fixed model in Azure ML
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model_uri = f'runs:/{run.info.run_id}/model'

registered_model = ml_client.models.create_or_update(
    Model(
        path=model_uri,
        name='spam-classifier',
        type=AssetTypes.MLFLOW_MODEL,
        description='Email spam classifier (numpy<2.0 compatible)',
        tags={
            'author': os.getenv('MODEL_AUTHOR', 'workshop-attendee'),
            'use_case': 'spam_detection',
            'dataset': 'UCI Spambase',
            'framework': 'sklearn',
            'numpy_constraint': '<2.0',
        },
        properties={k: str(round(v, 4)) for k, v in metrics.items()},
    )
)

print(f'Model registered: {registered_model.name}:{registered_model.version}')
print('This model version has numpy<2.0 embedded - endpoints should work now!')

Model registered: spam-classifier:8
This model version has numpy<2.0 embedded - endpoints should work now!


## Batch scoring demo (recommended)
This section runs **asynchronous batch scoring** as an Azure ML job (offline inference).
- No online endpoint required
- Produces a CSV output you can download and inspect
- Uses the registered MLflow model in the Azure ML model registry

After this, the notebook includes an **optional** real-time serving section (managed online endpoint).

In [9]:
# Batch scoring (1/3): resolve the model reference (no local uploads)
from typing import Optional

from azure.ai.ml import Input, Output, command

# Resolve a model to use for scoring.
# Prefer the explicit Azure ML registration cell output if you ran it; otherwise, pick the latest version by name.
model_name = os.getenv('BATCH_MODEL_NAME', 'spam-classifier').strip()
model_version: Optional[str] = os.getenv('BATCH_MODEL_VERSION', '').strip() or None

if 'registered_model' in globals() and getattr(registered_model, 'name', None):
    model_name = registered_model.name
    model_version = str(registered_model.version)
    print('Using model from model registration:', model_name, model_version)
else:
    if model_version is None:
        versions = list(ml_client.models.list(name=model_name))
        if not versions:
            raise RuntimeError(
                f'No Azure ML model named {model_name!r} found. '
                'Run the model registration cell, or set BATCH_MODEL_NAME / BATCH_MODEL_VERSION.'
            )
        def _version_key(m):
            try:
                return int(str(m.version))
            except Exception:
                return -1
        latest = sorted(versions, key=_version_key, reverse=True)[0]
        model_version = str(latest.version)
        print('Resolved latest model version:', model_name, model_version)

model_ref = f'azureml:{model_name}:{model_version}'
print('Model reference:', model_ref)

# How many rows to score (downloaded inside the job).
batch_n_rows = int(os.getenv('BATCH_N_ROWS', '100'))
print('Batch rows:', batch_n_rows)

Using model from model registration: spam-classifier 6
Model reference: azureml:spam-classifier:6
Batch rows: 100


In [None]:
# RBAC preflight: identify which identities need Storage Blob Data roles
import json
import subprocess
import shutil

ws = ml_client.workspaces.get(WORKSPACE_NAME)
ds = ml_client.datastores.get('workspaceartifactstore')
compute_name = batch_compute if 'batch_compute' in globals() else os.getenv('AML_BATCH_COMPUTE', '').strip()

print('Workspace:', ws.name)
print('Workspace identity type:', getattr(getattr(ws, 'identity', None), 'type', None))
print('Workspace principal_id:', getattr(getattr(ws, 'identity', None), 'principal_id', None))
print('Workspace tenant_id:', getattr(getattr(ws, 'identity', None), 'tenant_id', None))

print('workspaceartifactstore account:', getattr(ds, 'account_name', None))
print('workspaceartifactstore container:', getattr(ds, 'container_name', None))

if compute_name:
    try:
        c = ml_client.compute.get(compute_name)
        ident = getattr(c, 'identity', None)
        print('Compute:', c.name, 'type:', c.type)
        print('Compute identity type:', getattr(ident, 'type', None))
        print('Compute principal_id:', getattr(ident, 'principal_id', None))
        print('Compute tenant_id:', getattr(ident, 'tenant_id', None))
    except Exception as e:
        print('Could not load compute identity:', repr(e))
else:
    print('Compute not set yet; run Batch scoring (2/3) once to auto-select compute, then re-run this cell.')

# Best-effort: resolve storage account ARM id (scope) so you can paste it into RBAC commands.
storage_account_id = getattr(ws, 'storage_account', None)
if storage_account_id:
    print('Workspace storage_account resource id:', storage_account_id)
elif shutil.which('az') and getattr(ds, 'account_name', None):
    try:
        storage_account_id = subprocess.check_output(
            ['az', 'storage', 'account', 'show', '-n', ds.account_name, '--query', 'id', '-o', 'tsv'],
            text=True,
            stderr=subprocess.STDOUT,
        ).strip()
        print('Resolved storage account resource id (via az):', storage_account_id)
    except Exception as e:
        print('Could not resolve storage account id via az:', repr(e))
else:
    print('Could not resolve storage account ARM id automatically (no ws.storage_account and/or az not available).')

print('\nRBAC target roles (apply on the storage account):')
print(' - Storage Blob Data Reader (minimum)')
print(' - Storage Blob Data Contributor (often required)')
print('Grant to the principal_id for: workspace identity, compute identity, and (later) online endpoint identity.')

In [10]:
# Batch scoring (2/3): submit an offline scoring job (no local uploads)
# Why this looks a bit different: some secured workspaces disable key-based auth on the default storage account.
# In that case, Azure ML can't upload local files (code/input) using account-key SAS tokens.
# So this job downloads data inside the container and only uses the registered model asset as an input.

from azure.ai.ml.entities import UserIdentityConfiguration
from azure.core.exceptions import HttpResponseError

# Compute for the batch job.
batch_compute = os.getenv('AML_BATCH_COMPUTE', '').strip()

if not batch_compute:
    computes = list(ml_client.compute.list())
    print('Available compute targets (name -> type):')
    for c in computes:
        name = getattr(c, 'name', None)
        ctype = getattr(c, 'type', None)
        if name:
            print(' -', name, '->', ctype)
    aml_compute_names = [
        getattr(c, 'name', None) for c in computes
        if getattr(c, 'name', None)
        and ('amlcompute' in str(getattr(c, 'type', '')).lower() or 'amlcompute' in c.__class__.__name__.lower())
    ]
    if not aml_compute_names:
        raise RuntimeError('No AmlCompute cluster found. Create one in Azure ML Studio.')
    batch_compute = aml_compute_names[0]
    print('Auto-selected compute:', batch_compute)
else:
    print('Using compute from AML_BATCH_COMPUTE:', batch_compute)

# Use a recent curated environment with sklearn 1.5+ from public registry (no ACR pull needed).
# The sklearn-1.5 environment has Python 3.10 and scikit-learn >= 1.5 which matches the model.
batch_env = 'azureml://registries/azureml/environments/sklearn-1.5/labels/latest'

# Heredoc script avoids quoting issues.
inline_command = """python - <<'PYSCRIPT'
import os
import pandas as pd
import mlflow
from sklearn.datasets import fetch_openml

model_dir = r"${{inputs.model}}"
out_dir = r"${{outputs.predictions}}"
n_rows = __N_ROWS__

spambase = fetch_openml(data_id=44, as_frame=True, parser='auto')
data = spambase.frame.rename(columns={'class': 'is_spam'})
X = data.drop('is_spam', axis=1).head(n_rows)

model = mlflow.pyfunc.load_model(model_dir)
preds = model.predict(X)

out = pd.DataFrame({'prediction': preds})
os.makedirs(out_dir, exist_ok=True)
out_path = os.path.join(out_dir, 'predictions.csv')
out.to_csv(out_path, index=False)
print('Wrote:', out_path)
PYSCRIPT
"""
inline_command = inline_command.replace('__N_ROWS__', str(batch_n_rows))

batch_job = command(
    command=inline_command,
    inputs={
        'model': Input(type='mlflow_model', path=model_ref, mode='download'),
    },
    outputs={
        'predictions': Output(type='uri_folder'),
    },
    environment=batch_env,
    compute=batch_compute,
    experiment_name=experiment_name,
    display_name='batch-score-spam-classifier',
    identity=UserIdentityConfiguration(),
)

submitted = ml_client.jobs.create_or_update(batch_job)
print('Submitted batch scoring job:', submitted.name)
print('Compute:', batch_compute)
print('Watch logs with: ml_client.jobs.stream(submitted.name)')

Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


Available compute targets (name -> type):
 - cpu-cluster -> amlcompute
Auto-selected compute: cpu-cluster


Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
pathOnCompute is not a known attribute of class <class 'azure.ai.ml._restclient.v2023_04_01_preview.models._models_py3.UriFolderJobOutput'> and will be ignored


Submitted batch scoring job: mango_tongue_rn0rx6my87
Compute: cpu-cluster
Watch logs with: ml_client.jobs.stream(submitted.name)


In [11]:
# Batch scoring (3/3): download output + inspect predictions (and show logs on failure)
from pathlib import Path
import time

if 'submitted' not in globals():
    raise RuntimeError('Run the Batch scoring (2/3) cell first to submit the job.')

# Wait for job completion (up to 15 minutes by default) so this cell is "one-click" in workshops.
wait_seconds = int(os.getenv('BATCH_WAIT_SECONDS', '900'))
poll_seconds = int(os.getenv('BATCH_POLL_SECONDS', '15'))

start = time.time()
status = None
while True:
    status = ml_client.jobs.get(submitted.name).status
    print('Job status:', status)
    if status in {'Completed', 'Failed', 'Canceled'}:
        break
    if time.time() - start > wait_seconds:
        print('Job still running; re-run this cell in a bit to download outputs.')
        break
    time.sleep(poll_seconds)

if status == 'Completed':
    download_dir = Path('batch_outputs') / submitted.name
    download_dir.mkdir(parents=True, exist_ok=True)

    ml_client.jobs.download(
        name=submitted.name,
        download_path=str(download_dir),
        output_name='predictions',
    )
    print('Downloaded job output to:', str(download_dir.resolve()))

    pred_path = download_dir / 'named-outputs' / 'predictions' / 'predictions.csv'
    if pred_path.exists():
        preds = pd.read_csv(pred_path)
        display(preds.head(10))
    else:
        print('Predictions file not found in downloaded output. Check job outputs in Studio for details.')
else:
    # If the job failed, surface logs and still complete the demo via a local fallback.
    if status in {'Failed', 'Canceled'}:
        print('Job did not complete successfully. Streaming logs (if available)...')
        try:
            ml_client.jobs.stream(submitted.name)
        except Exception as e:
            print('Could not stream logs from this client/session:', repr(e))
            print('Open the job in Azure ML Studio for full details:', getattr(submitted, 'studio_url', None) or '(see the Web View link above if printed)')

        print('')
        print('Fallback: running batch scoring locally in this notebook (offline inference)')
        # Use the in-kernel trained model if present; otherwise try to load from MLflow run artifact.
        if 'model' in globals():
            scorer = model
            predict_fn = scorer.predict
        else:
            loaded = mlflow.pyfunc.load_model(model_uri)
            predict_fn = loaded.predict

        X_batch = (X_test if 'X_test' in globals() else X).head(int(batch_n_rows) if 'batch_n_rows' in globals() else 100)
        local_preds = predict_fn(X_batch)
        local_out = pd.DataFrame({'prediction': local_preds})
        local_dir = Path('batch_outputs') / 'local_fallback'
        local_dir.mkdir(parents=True, exist_ok=True)
        local_path = local_dir / 'predictions.csv'
        local_out.to_csv(local_path, index=False)
        print('Wrote local predictions to:', str(local_path.resolve()))
        display(local_out.head(10))
    else:
        print('Job not completed yet. Re-run this cell later to download outputs.')

Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Queued
Job status: Running
Job status: Running
Job status: Running
Job status: Running
Job status: Running
Job status: Running
Job status: Running
Job status: Running
Job status: Running
Job status: Completed


Downloading artifact azureml://subscriptions/1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a/resourcegroups/rg-dnd-mlops-demo/workspaces/mlw-dndmlops2-dev/datastores/workspaceblobstore/paths/azureml/mango_tongue_rn0rx6my87/predictions/ to batch_outputs\mango_tongue_rn0rx6my87\named-outputs\predictions


Downloaded job output to: C:\Users\ritwickdutta\OneDrive - Microsoft\Documents\DND MLOps Demo\notebooks\batch_outputs\mango_tongue_rn0rx6my87


Unnamed: 0,prediction
0,1
1,1
2,1
3,1
4,1
5,0
6,1
7,0
8,1
9,1


## Batch Endpoint (Recommended for Production)

A **Batch Endpoint** provides a durable REST endpoint for batch inference:
- No always-on compute (cost-efficient)
- Process large datasets asynchronously
- Versioned deployments with traffic routing
- Built-in job management and monitoring

This is different from the "batch scoring job" above - a batch endpoint is a **permanent, reusable endpoint** that can be invoked via REST API or SDK.

In [12]:
# Batch Endpoint (1/3): Create the batch endpoint
from azure.ai.ml.entities import BatchEndpoint, BatchDeployment, BatchRetrySettings
from azure.ai.ml.constants import BatchDeploymentOutputAction
import uuid

# Create a unique batch endpoint name
batch_endpoint_name = os.getenv('AML_BATCH_ENDPOINT_NAME', f'spam-batch-{uuid.uuid4().hex[:8]}')

batch_endpoint = BatchEndpoint(
    name=batch_endpoint_name,
    description='Spam classifier batch endpoint for async inference',
    tags={'environment': 'workshop', 'use_case': 'spam_detection'},
)

print(f'Creating batch endpoint: {batch_endpoint_name}')
ml_client.batch_endpoints.begin_create_or_update(batch_endpoint).result()
print(f'Batch endpoint created: {batch_endpoint_name}')

Creating batch endpoint: spam-batch-241391f6
Batch endpoint created: spam-batch-241391f6


In [24]:
# Batch Endpoint (2/3): Create/update deployment with the fixed model (numpy<2.0)
from azure.ai.ml.entities import ModelBatchDeployment, ModelBatchDeploymentSettings, Environment
from azure.ai.ml.constants import BatchDeploymentOutputAction

# Use the registered model (version 8 has numpy<2.0)
model_for_batch = f'azureml:{registered_model.name}:{registered_model.version}'
print(f'Using model: {model_for_batch}')

# Use the sklearn-1.5 curated environment (also has numpy<2.0)
batch_endpoint_env = 'azureml://registries/azureml/environments/sklearn-1.5/labels/latest'

# Create a new deployment name (must be 3+ chars)
deployment_name = 'numpy1x'

batch_deployment = ModelBatchDeployment(
    name=deployment_name,
    endpoint_name=batch_endpoint_name,
    model=model_for_batch,
    environment=batch_endpoint_env,
    compute=batch_compute,
    settings=ModelBatchDeploymentSettings(
        instance_count=1,
        max_concurrency_per_instance=2,
        mini_batch_size=10,
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name='predictions.csv',
        retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
        logging_level='info',
    ),
)

print(f'Creating batch deployment: {deployment_name}')
ml_client.batch_deployments.begin_create_or_update(batch_deployment).result()

# Set as default deployment
batch_endpoint = ml_client.batch_endpoints.get(batch_endpoint_name)
batch_endpoint.defaults.deployment_name = deployment_name
ml_client.batch_endpoints.begin_create_or_update(batch_endpoint).result()

print(f'Batch deployment "{deployment_name}" created and set as default')
print(f'Model: {model_for_batch}')
print(f'Endpoint: {batch_endpoint_name}')

Using model: azureml:spam-classifier:8
Creating batch deployment: numpy1x
Batch deployment "numpy1x" created and set as default
Model: azureml:spam-classifier:8
Endpoint: spam-batch-241391f6


In [25]:
# Batch Endpoint (3/3): Invoke the batch endpoint with test data
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

# Create a small CSV file with test data to score
batch_input_path = Path('batch_endpoint_input.csv')
X_test.head(50).to_csv(batch_input_path, index=False)
print(f'Created input file: {batch_input_path} ({len(X_test.head(50))} rows)')

# Invoke the batch endpoint
# Note: For production, you'd typically use a registered data asset or datastore path
job = ml_client.batch_endpoints.invoke(
    endpoint_name=batch_endpoint_name,
    inputs={
        'input': Input(path=str(batch_input_path.resolve()), type=AssetTypes.URI_FILE)
    },
)

print(f'Batch job submitted: {job.name}')
print(f'Monitor in Studio: https://ml.azure.com/runs/{job.name}?wsid=/subscriptions/{SUBSCRIPTION_ID}/resourcegroups/{RESOURCE_GROUP}/workspaces/{WORKSPACE_NAME}')

Created input file: batch_endpoint_input.csv (50 rows)
Batch job submitted: batchjob-14b1e65b-c9e3-4807-9297-fc1ae3035ff9
Monitor in Studio: https://ml.azure.com/runs/batchjob-14b1e65b-c9e3-4807-9297-fc1ae3035ff9?wsid=/subscriptions/1d53bfb3-a84c-4eb4-8c79-f29dc8424b6a/resourcegroups/rg-dnd-mlops-demo/workspaces/mlw-dndmlops2-dev


In [26]:
# Wait for batch job and download results
import time

job_name = job.name
wait_seconds = int(os.getenv('BATCH_WAIT_SECONDS', '600'))
poll_seconds = int(os.getenv('BATCH_POLL_SECONDS', '15'))

start = time.time()
while True:
    job_status = ml_client.jobs.get(job_name)
    status = job_status.status
    print(f'Batch job status: {status}')
    
    if status in {'Completed', 'Failed', 'Canceled'}:
        break
    if time.time() - start > wait_seconds:
        print('Job still running. Re-run this cell later to check status.')
        break
    time.sleep(poll_seconds)

if status == 'Completed':
    # Download the output
    output_dir = Path('batch_endpoint_outputs') / job_name
    output_dir.mkdir(parents=True, exist_ok=True)
    
    ml_client.jobs.download(name=job_name, download_path=str(output_dir), output_name='score')
    print(f'Downloaded results to: {output_dir}')
    
    # Find and display the predictions file
    for pred_file in output_dir.rglob('predictions.csv'):
        preds_df = pd.read_csv(pred_file)
        print(f'\\nPredictions from {pred_file}:')
        display(preds_df.head(10))
        break
else:
    print(f'Job ended with status: {status}')

AzureCliCredential.get_token failed: Failed to invoke the Azure CLI
Proceeding with no tenant id appended to studio URL



Batch job status: Completed


Downloading artifact azureml://datastores/workspaceblobstore/paths/azureml/23d8758a-1dc1-4213-abe9-956d87565cfb/score/ to batch_endpoint_outputs\batchjob-14b1e65b-c9e3-4807-9297-fc1ae3035ff9


Downloaded results to: batch_endpoint_outputs\batchjob-14b1e65b-c9e3-4807-9297-fc1ae3035ff9
\nPredictions from batch_endpoint_outputs\batchjob-14b1e65b-c9e3-4807-9297-fc1ae3035ff9\predictions.csv:


Unnamed: 0,0,1,batch_endpoint_input.csv
0,1,1,batch_endpoint_input.csv
1,2,0,batch_endpoint_input.csv
2,3,1,batch_endpoint_input.csv
3,4,0,batch_endpoint_input.csv
4,5,0,batch_endpoint_input.csv
5,6,0,batch_endpoint_input.csv
6,7,1,batch_endpoint_input.csv
7,8,0,batch_endpoint_input.csv
8,9,1,batch_endpoint_input.csv
9,10,1,batch_endpoint_input.csv


In [None]:
# Get batch endpoint details and scoring URI
endpoint_info = ml_client.batch_endpoints.get(batch_endpoint_name)

print('=== Batch Endpoint Details ===')
print(f'Name: {endpoint_info.name}')
print(f'Scoring URI: {endpoint_info.scoring_uri}')
print(f'Swagger URI: {endpoint_info.openapi_uri}')
print(f'Default deployment: {endpoint_info.defaults.deployment_name}')
print(f'\\nStudio URL: https://ml.azure.com/batchEndpoints/{batch_endpoint_name}?wsid=/subscriptions/{SUBSCRIPTION_ID}/resourcegroups/{RESOURCE_GROUP}/workspaces/{WORKSPACE_NAME}')

## Optional: Real-time serving (Managed Online Endpoint)
This section deploys a **managed online endpoint** (real-time serving). It can take several minutes and requires quota for the chosen VM size.

If you want the workshop goal of **MLflow tracking + model registry + batch scoring**, you can skip this section and use the batch scoring section above instead.

In [19]:
# Create an endpoint name that won't collide across attendees
endpoint_name = os.getenv('AML_ENDPOINT_NAME', f'spam-clf-{uuid.uuid4().hex[:8]}')

# In locked-down environments, prefer AAD token auth over key auth.
# - 'aml_token' uses Azure AD auth (recommended)
# - 'key' uses endpoint keys (works in many labs, but may be restricted by policy)
endpoint_auth_mode = os.getenv('AML_ENDPOINT_AUTH_MODE', 'aml_token').strip()

endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description='Spam classification endpoint (workshop)',
    auth_mode=endpoint_auth_mode,
    tags={'environment': 'workshop', 'use_case': 'spam_detection'},
)

ml_client.online_endpoints.begin_create_or_update(endpoint).result()
print('Endpoint created:', endpoint_name)
print('Auth mode:', endpoint_auth_mode)

deployment = ManagedOnlineDeployment(
    name='blue',
    endpoint_name=endpoint_name,
    model=f'azureml:{registered_model.name}:{registered_model.version}',
    instance_type=os.getenv('AML_INSTANCE_TYPE', 'Standard_DS3_v2'),
    instance_count=int(os.getenv('AML_INSTANCE_COUNT', '1')),
)

ml_client.online_deployments.begin_create_or_update(deployment).result()

endpoint = ml_client.online_endpoints.get(endpoint_name)
endpoint.traffic = {'blue': 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

print('Deployment complete; traffic set to 100%')

Check: endpoint spam-clf-de533f51 exists


Endpoint created: spam-clf-de533f51
Auth mode: aml_token
..........................................................................................................................................................................................

HttpResponseError: (BadArgument) User container has crashed or terminated: Liveness probe failed: HTTP probe failed with statuscode: 502. Please see troubleshooting guide, available here: https://aka.ms/oe-tsg#error-resourcenotready
Code: BadArgument
Message: User container has crashed or terminated: Liveness probe failed: HTTP probe failed with statuscode: 502. Please see troubleshooting guide, available here: https://aka.ms/oe-tsg#error-resourcenotready

In [None]:
# Deploy Managed Online Endpoint (uses the fixed model with numpy<2.0)
import uuid

# Create a unique endpoint name
endpoint_name = os.getenv('AML_ENDPOINT_NAME', f'spam-clf-{uuid.uuid4().hex[:8]}')

endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description='Spam classification endpoint (workshop)',
    auth_mode='key',  # 'key' or 'aml_token'
    tags={'environment': 'workshop', 'use_case': 'spam_detection'},
)

print(f'Creating endpoint: {endpoint_name}')
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
print(f'Endpoint created!')

# Deploy the model (MLflow model with numpy<2.0 will auto-generate compatible environment)
deployment = ManagedOnlineDeployment(
    name='blue',
    endpoint_name=endpoint_name,
    model=f'azureml:{registered_model.name}:{registered_model.version}',
    instance_type='Standard_DS3_v2',
    instance_count=1,
)

print(f'Creating deployment (this may take 5-10 minutes)...')
ml_client.online_deployments.begin_create_or_update(deployment).result()

# Set traffic
endpoint = ml_client.online_endpoints.get(endpoint_name)
endpoint.traffic = {'blue': 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

print(f'Deployment complete!')
print(f'Endpoint: {endpoint_name}')
print(f'Scoring URI: {endpoint.scoring_uri}')

Check: endpoint spam-clf-de533f51 exists


Creating deployment "green" with numpy<2.0 environment...
.........................................................................................................................................................................................

HttpResponseError: (BadArgument) User container has crashed or terminated. Please see troubleshooting guide, available here: https://aka.ms/oe-tsg#error-resourcenotready
Code: BadArgument
Message: User container has crashed or terminated. Please see troubleshooting guide, available here: https://aka.ms/oe-tsg#error-resourcenotready

In [None]:
# Invoke the endpoint with a few rows
test_samples = X_test.head(5).to_dict(orient='split')
request_json = json.dumps({
    'input_data': {
        'columns': test_samples['columns'],
        'data': test_samples['data'],
    }
})

response = ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    deployment_name='blue',
    request_file=None,
    request_json=request_json,
)

print('Raw response:')
print(response)

In [None]:
# Studio link
studio_url = (
    f'https://ml.azure.com/experiments/{experiment_name}'
    f'?wsid=/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RESOURCE_GROUP}'
    f'/providers/Microsoft.MachineLearningServices/workspaces/{WORKSPACE_NAME}'
)
print('Open in Azure ML Studio:')
print(studio_url)

## Cleanup (recommended)
If you deployed an endpoint, delete it to avoid ongoing cost.

In [None]:
# Uncomment to delete the endpoint
# ml_client.online_endpoints.begin_delete(name=endpoint_name).result()
# print('Deleted endpoint:', endpoint_name)

# MLOps Hands-On Lab: Azure ML + MLflow Integration

This notebook demonstrates the core MLOps concepts using Azure ML with MLflow tracking.

## Prerequisites
- Azure subscription with ML workspace
- Python environment with required packages

## Topics Covered
1. Setting up MLflow tracking with Azure ML
2. Training a model with experiment tracking
3. Registering models
4. Deploying to managed endpoints

## 1. Setup and Configuration

In [None]:
# Install required packages (uncomment if needed)
# !pip install azure-ai-ml mlflow azureml-mlflow scikit-learn pandas

In [None]:
import os
import mlflow
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

print("Packages imported successfully!")

In [None]:
# Azure ML Configuration
# Update these values for your environment

SUBSCRIPTION_ID = "<your-subscription-id>"  # TODO: Update
RESOURCE_GROUP = "rg-dnd-mlops-demo"
WORKSPACE_NAME = "mlw-dnd-mlops-demo"

# Authenticate
try:
    credential = DefaultAzureCredential()
    credential.get_token("https://management.azure.com/.default")
except Exception:
    credential = InteractiveBrowserCredential()

# Initialize ML Client
ml_client = MLClient(
    credential=credential,
    subscription_id=SUBSCRIPTION_ID,
    resource_group_name=RESOURCE_GROUP,
    workspace_name=WORKSPACE_NAME,
)

print(f"Connected to workspace: {ml_client.workspace_name}")

## 2. Configure MLflow Tracking

In [None]:
# Get Azure ML tracking URI
tracking_uri = ml_client.workspaces.get(WORKSPACE_NAME).mlflow_tracking_uri

# Set MLflow tracking URI
mlflow.set_tracking_uri(tracking_uri)

print(f"MLflow tracking URI: {tracking_uri}")

# Set experiment
experiment_name = "mlops-hackathon-demo"
mlflow.set_experiment(experiment_name)

print(f"Experiment: {experiment_name}")

## 3. Load Real-World Dataset

We'll use the **UCI Spambase Dataset** - a classic email spam classification dataset.

- **Source**: UCI Machine Learning Repository / OpenML
- **Task**: Classify emails as spam (1) or not spam (0)
- **Features**: 57 attributes including word frequencies, character frequencies, and capital letter statistics
- **Samples**: 4,601 emails

This is a realistic dataset used for spam/fraud detection demonstrations.

In [None]:
# Load UCI Spambase Dataset (Email Spam Classification)
# Source: https://archive.ics.uci.edu/ml/datasets/spambase

from sklearn.datasets import fetch_openml

print("Loading Spambase dataset from OpenML...")

# Fetch the spambase dataset (ID: 44)
spambase = fetch_openml(data_id=44, as_frame=True, parser='auto')

data = spambase.frame

# Rename target column for clarity
data = data.rename(columns={'class': 'is_spam'})
data['is_spam'] = data['is_spam'].astype(int)

# Display dataset info
print(f"Dataset shape: {data.shape}")
print(f"\nClass distribution:")
print(f"  Not Spam (0): {(data['is_spam'] == 0).sum()} ({(data['is_spam'] == 0).mean()*100:.1f}%)")
print(f"  Spam (1): {(data['is_spam'] == 1).sum()} ({(data['is_spam'] == 1).mean()*100:.1f}%)")

# Show some feature names (word frequencies)
feature_names = spambase.feature_names[:10]
print(f"\nSample features: {feature_names}")
print("(Features represent word/character frequencies in emails)")

data.head()

In [None]:
# Split data
X = data.drop('is_spam', axis=1)
y = data['is_spam']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")
print(f"Number of features: {X_train.shape[1]}")

## 4. Train Model with MLflow Tracking

This demonstrates proper experiment tracking with:
- Parameter logging
- Metric logging
- Model artifact logging
- Input signature

In [None]:
from mlflow.models import infer_signature
from sklearn.metrics import roc_auc_score

# Hyperparameters
params = {
    "n_estimators": 100,
    "max_depth": 10,
    "min_samples_split": 5,
    "min_samples_leaf": 2,
    "random_state": 42,
}

# Start MLflow run
with mlflow.start_run(run_name="spam-classifier-rf") as run:
    
    # Log parameters
    mlflow.log_params(params)
    mlflow.log_param("training_samples", len(X_train))
    mlflow.log_param("dataset", "UCI Spambase")
    mlflow.log_param("num_features", X_train.shape[1])
    
    # Train model
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)
    
    # Make predictions
    y_pred = model.predict(X_test)
    y_pred_proba = model.predict_proba(X_test)[:, 1]
    
    # Calculate metrics
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1_score": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_pred_proba),
    }
    
    # Log metrics
    mlflow.log_metrics(metrics)
    
    # Log feature importance (top 20)
    feature_importance = pd.DataFrame({
        'feature': X.columns,
        'importance': model.feature_importances_
    }).sort_values('importance', ascending=False)
    
    feature_importance.to_csv("feature_importance.csv", index=False)
    mlflow.log_artifact("feature_importance.csv")
    
    # Infer model signature
    signature = infer_signature(X_train, model.predict(X_train))
    
    # Log model
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        signature=signature,
        registered_model_name="spam-classifier",
    )
    
    print(f"Run ID: {run.info.run_id}")
    print(f"\nMetrics:")
    for k, v in metrics.items():
        print(f"  {k}: {v:.4f}")
    print(f"\nTop 10 Important Features:")
    print(feature_importance.head(10).to_string(index=False))

## 5. Register Model in Azure ML Registry

In [None]:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

# Get the model URI from MLflow
model_uri = f"runs:/{run.info.run_id}/model"

# Register with governance metadata
registered_model = ml_client.models.create_or_update(
    Model(
        path=model_uri,
        name="spam-classifier",
        type=AssetTypes.MLFLOW_MODEL,
        description="Email spam classifier trained on UCI Spambase dataset",
        tags={
            "author": "mlops-team",
            "use_case": "spam_detection",
            "dataset": "UCI Spambase",
            "framework": "sklearn",
            "algorithm": "RandomForest",
        },
        properties={
            "accuracy": str(round(metrics['accuracy'], 4)),
            "precision": str(round(metrics['precision'], 4)),
            "recall": str(round(metrics['recall'], 4)),
            "f1_score": str(round(metrics['f1_score'], 4)),
            "roc_auc": str(round(metrics['roc_auc'], 4)),
        },
    )
)

print(f"Model registered: {registered_model.name}:{registered_model.version}")

## 6. Deploy to Managed Online Endpoint

In [None]:
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
)

# Create endpoint
endpoint_name = "spam-classifier-endpoint"

endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description="Spam classification endpoint",
    auth_mode="key",
    tags={"environment": "demo", "use_case": "spam_detection"},
)

# Create or update endpoint
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
print(f"Endpoint created: {endpoint_name}")

In [None]:
# Create deployment
deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=f"azureml:{registered_model.name}:{registered_model.version}",
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

# Create deployment (this may take several minutes)
ml_client.online_deployments.begin_create_or_update(deployment).result()

# Set traffic to 100%
endpoint.traffic = {"blue": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

print(f"Deployment created and traffic set to 100%")

## 7. Test the Endpoint

In [None]:
import json

# Prepare test data
test_samples = X_test.head(5).to_dict(orient='split')

request_json = json.dumps({
    "input_data": {
        "columns": test_samples['columns'],
        "data": test_samples['data']
    }
})

# Invoke endpoint
response = ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    request_file=None,
    deployment_name="blue",
    request_json=request_json,
)

print("Predictions:")
print(json.loads(response))

## 8. View Experiment in Azure ML Studio

Navigate to Azure ML Studio to see:
- Experiment runs
- Metrics comparison
- Model registry
- Endpoint monitoring

In [None]:
# Get workspace URL
workspace = ml_client.workspaces.get(WORKSPACE_NAME)
studio_url = f"https://ml.azure.com/experiments/{experiment_name}?wsid=/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.MachineLearningServices/workspaces/{WORKSPACE_NAME}"

print(f"View experiment in Azure ML Studio:")
print(studio_url)

## 9. Cleanup (Optional)

In [None]:
# Uncomment to delete resources
# ml_client.online_endpoints.begin_delete(name=endpoint_name).result()
# print(f"Endpoint {endpoint_name} deleted")

---

## Key Takeaways

1. **MLflow Integration**: Azure ML provides native MLflow tracking URI
2. **Experiment Tracking**: All parameters, metrics, and artifacts are versioned
3. **Model Registry**: Centralized registry with governance metadata
4. **Managed Endpoints**: Easy deployment with built-in scaling and monitoring

## Next Steps
- Add automated retraining pipeline
- Enable model monitoring for data drift
- Set up CI/CD with GitHub Actions