In [1]:
try:
    # prefer parent path so notebook can be run from this folder
    %pip install -r ../../../requirements.txt
except Exception:
    # fallback to local path
    %pip install -r requirements.txt

print("Installation command executed. Restart kernel if necessary.")

Note: you may need to restart the kernel to use updated packages.
Installation command executed. Restart kernel if necessary.


# Deploy a TensorFlow model served with TF Serving using a custom container in an online endpoint
Learn how to deploy a custom container as an online endpoint in Azure Machine Learning.

Custom container deployments can use web servers other than the default Python Flask server used by Azure Machine Learning. Users of these deployments can still take advantage of Azure Machine Learning's built-in monitoring, scaling, alerting, and authentication.

## Prerequisites

* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* Install and configure the [Python SDK v2](sdk/setup.sh).

* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.

* You must have an Azure Machine Learning workspace. 

* To deploy locally, you must install [Docker Engine](https://docs.docker.com/engine/install/) on your local computer. We highly recommend this option because it makes issues easier to debug.

# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the endpoint will be deployed.

## 1.1. Import the required libraries

In [12]:
# import required libraries
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
    DataCollector,
    DeploymentCollection,
)
from azure.identity import DefaultAzureCredential

## 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need three identifiers: a subscription ID, a resource group, and a workspace name. We pass these to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. This tutorial uses [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python). Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.
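
Before constructing the `MLClient`, it can help to fail fast when one of the identifiers is missing. The following is a minimal sketch, assuming the three environment variable names used in this notebook; `missing_settings` is a hypothetical helper and not part of the SDK.

```python
# Hypothetical helper (not part of azure-ai-ml): report which workspace
# identifiers are absent or blank before an MLClient is created.
REQUIRED_KEYS = ("SUBSCRIPTION_ID", "RESOURCE_GROUP", "WORKSPACE_NAME")


def missing_settings(env):
    """Return the required keys whose values are missing or only whitespace."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


# Example with a partially filled mapping; pass os.environ in the notebook.
example = {"SUBSCRIPTION_ID": "0000", "RESOURCE_GROUP": " "}
print(missing_settings(example))
```

If the returned list is not empty, set the listed variables before running the next cells.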

In [7]:
# Try to load workspace details from a local .env file (at notebooks/.env)
import os
from pathlib import Path

# Relative path from this notebook to the notebooks/.env file
env_path = Path("../../../.env")

if env_path.exists():
    with env_path.open() as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            if "=" in line:
                k, v = line.split("=", 1)
                os.environ[k.strip()] = v.strip()

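# Read expected variables from the environment (populated from .env when present).
# Reading them unconditionally keeps later cells from failing with a NameError.
subscription_id = os.environ.get("SUBSCRIPTION_ID", "")
resource_group = os.environ.get("RESOURCE_GROUP", "")
workspace = os.environ.get("WORKSPACE_NAME", "")
appinsights_conn = os.environ.get("APPINSIGHTS_CONNECTION_STRING", "")

if not env_path.exists():
    print(f".env file not found at {env_path}. Set SUBSCRIPTION_ID, RESOURCE_GROUP, and WORKSPACE_NAME in the environment, or assign the variables manually.")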

print("Loaded workspace configuration:")
print("  SUBSCRIPTION_ID=", "")
print("  RESOURCE_GROUP=", "")
print("  WORKSPACE_NAME=", "")


Loaded workspace configuration:
  SUBSCRIPTION_ID= 
  RESOURCE_GROUP= 
  WORKSPACE_NAME= 


In [8]:
# get a handle to the workspace
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

Overriding of current TracerProvider is not allowed
Overriding of current LoggerProvider is not allowed
Overriding of current MeterProvider is not allowed
Attempting to instrument while already instrumented


# 4. Deploy your online endpoint to Azure
Next, deploy your online endpoint to Azure.

## 4.1 Configure online endpoint
`endpoint_name`: The name of the endpoint. It must be unique in the Azure region. Naming rules are defined under [managed online endpoint limits](https://docs.microsoft.com/azure/machine-learning/how-to-manage-quotas#azure-machine-learning-managed-online-endpoints-preview).

`auth_mode`: Use `key` for key-based authentication. Use `aml_token` for Azure Machine Learning token-based authentication. A `key` does not expire, but an `aml_token` does.

Optionally, you can add a description and tags to your endpoint.
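
Because the name must be unique within the region, a timestamp suffix is a simple way to avoid collisions. The sketch below builds such a name and checks it against a pattern. The pattern (3 to 32 characters, starting with a letter, ending with a letter or digit, hyphens allowed in between) is an assumption based on the linked limits page and should be verified there. Both helper names are hypothetical.

```python
import datetime
import re

# Assumed naming rule; verify against the managed online endpoint limits page.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{1,30}[A-Za-z0-9]")


def make_endpoint_name(prefix="endpoint-cpu", now=None):
    """Append a month/day/hour/minute/microsecond stamp to the prefix."""
    now = now or datetime.datetime.now()
    return prefix + now.strftime("%m%d%H%M%f")


def is_valid_endpoint_name(name):
    """Check the name against the assumed pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


# A fixed timestamp makes the example reproducible.
name = make_endpoint_name(now=datetime.datetime(2025, 10, 8, 22, 51, 0, 537248))
print(name, is_valid_endpoint_name(name))
```

With the default prefix the generated name is 26 characters long, which leaves little room for a longer prefix under the assumed 32-character limit.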

In [9]:
# Creating a unique endpoint name with current datetime to avoid conflicts
import datetime

online_endpoint_name = "endpoint-cpu" + datetime.datetime.now().strftime("%m%d%H%M%f")

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="this is a sample online endpoint",
    auth_mode="key",
    tags={"foo": "bar"},
)

## 4.2 Create the endpoint
Using the `MLClient` created earlier, we will now create the endpoint in the workspace. `begin_create_or_update` starts the endpoint creation and returns a poller; calling `.result()` on it blocks until provisioning finishes and then returns the created endpoint.

In [10]:
ml_client.begin_create_or_update(endpoint).result()

ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://endpoint-cpu10082251537248.canadacentral.inference.ml.azure.com/score', 'openapi_uri': 'https://endpoint-cpu10082251537248.canadacentral.inference.ml.azure.com/swagger.json', 'name': 'endpoint-cpu10082251537248', 'description': 'this is a sample online endpoint', 'tags': {'foo': 'bar'}, 'properties': {'createdBy': 'System Administrator', 'createdAt': '2025-10-08T22:51:42.309600+0000', 'lastModifiedAt': '2025-10-08T22:51:42.309600+0000', 'azureml.onlineendpointid': '/subscriptions/5784b6a5-de3f-4fa4-8b8f-e5bb70ff6b25/resourcegroups/rg-aml-ws-prod-cc-01/providers/microsoft.machinelearningservices/workspaces/mlwprodcc01/onlineendpoints/endpoint-cpu10082251537248', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/5784b6a5-de3f-4fa4-8b8f-e5bb70ff6b25/providers/Microsoft.MachineLearningServices/locations/canadacentral/mfeOperationsStatus/oeidp:fb55c029-0b

## 4.3 Configure online deployment
A deployment is a set of resources required for hosting the model that does the actual inferencing. We will create a deployment for our endpoint using the `ManagedOnlineDeployment` class.

### Key aspects of deployment 
- `name` - Name of the deployment.
- `endpoint_name` - Name of the endpoint to create the deployment under.
- `model` - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.
- `environment` - The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.
- `code_configuration` - The configuration for the source code and scoring script.
    - `code` - Path to the source code directory for scoring the model.
    - `scoring_script` - Relative path to the scoring file in the source code directory.
- `instance_type` - The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](https://docs.microsoft.com/en-us/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list).
- `instance_count` - The number of instances to use for the deployment

### Important: Environment requirements for data collection

For data collection to work, your deployment environment **must** include the `azureml-ai-monitoring` package. 

The scoring script (`score.py`) uses:
```python
from azureml.ai.monitoring import Collector
```

**Required dependencies** (already added to `model-1/environment/conda.yaml`):
- `pandas>=1.3.0` - Required because the Collector only logs pandas DataFrames
- `azureml-ai-monitoring~=0.1.0b1` - The data collection SDK

If you're using a custom environment, ensure your conda.yaml includes:
```yaml
dependencies:
  - pip:
      - azureml-defaults==1.38.0  # or later
      - azureml-ai-monitoring~=0.1.0b1
      - pandas
```

**Note**: The base image `mcr.microsoft.com/azureml/curated/minimal-py311-inference:latest` may include some of these packages, but it's safer to explicitly declare them in your environment specification.

In [None]:
# Verify the conda.yaml includes required packages for data collection
import yaml

conda_file = "model-1/environment/conda.yaml"
with open(conda_file, 'r') as f:
    conda_config = yaml.safe_load(f)

print(f"Environment name: {conda_config.get('name')}")
print(f"\nPython version: {[dep for dep in conda_config['dependencies'] if isinstance(dep, str) and dep.startswith('python')]}")
print("\nPip packages:")

for dep in conda_config['dependencies']:
    if isinstance(dep, dict) and 'pip' in dep:
        for pkg in dep['pip']:
            print(f"  - {pkg}")
            
# Check for required packages among both conda and pip dependencies
declared_packages = []
for dep in conda_config['dependencies']:
    if isinstance(dep, str):
        declared_packages.append(dep)  # conda dependency, e.g. "pandas>=1.3.0"
    elif isinstance(dep, dict) and 'pip' in dep:
        declared_packages.extend(dep['pip'])

required_packages = ['azureml-ai-monitoring', 'pandas']
missing_packages = []

for req in required_packages:
    found = any(req in pkg for pkg in declared_packages)
    status = "✅" if found else "❌"
    print(f"\n{status} {req}: {'Found' if found else 'MISSING'}")
    if not found:
        missing_packages.append(req)

if missing_packages:
    print(f"\n⚠️  WARNING: Missing required packages: {', '.join(missing_packages)}")
    print("Data collection will NOT work without these packages!")
else:
    print("\n✅ All required packages for data collection are present!")

In [None]:
# Prepare environment variables for App Insights (optional)
env_vars = {}
if appinsights_conn:
    # Provide the connection string to the deployment's environment variables
    env_vars["APPLICATIONINSIGHTS_CONNECTION_STRING"] = appinsights_conn

# Create a blue deployment with data collection enabled
model = Model(path="model-1/model/sklearn_regression_model.pkl")

# Create environment with data collection dependencies
# Option 1: Use conda.yaml with azureml-ai-monitoring package (recommended)
env = Environment(
    image="mcr.microsoft.com/azureml/curated/minimal-py311-inference:latest",
    conda_file="model-1/environment/conda.yaml",
)

# Option 2: Use curated minimal inference image (if it includes monitoring package)
# env = Environment(
#     image="mcr.microsoft.com/azureml/curated/minimal-py311-inference:latest",
# )

# Configure data collector for production inference logging
# Using standard names 'model_inputs' and 'model_outputs' for seamless model monitoring
collections = {
    "model_inputs": DeploymentCollection(enabled=True),
    "model_outputs": DeploymentCollection(enabled=True),
}

data_collector = DataCollector(
    collections=collections,
    sampling_rate=1.0  # Collect 100% of requests
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    model=model,
    environment=env,
    environment_variables=env_vars,
    app_insights_enabled=True,
    data_collector=data_collector,
    code_configuration=CodeConfiguration(
        code="model-1/onlinescoring", scoring_script="score.py"
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

Class DeploymentCollection: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class DataCollector: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


## 4.4 Create the deployment
Using the `MLClient` created earlier, we will now create the deployment in the workspace. `begin_create_or_update` starts the deployment creation and returns a poller; calling `.result()` on it blocks until the deployment finishes, which can take several minutes.

In [None]:
ml_client.begin_create_or_update(blue_deployment).result()

Check: endpoint endpoint-cpu10082251537248 exists
Uploading onlinescoring (0.01 MBs): 100%|##########| 7093/7093 [00:00<00:00, 1060688.76it/s]
Uploading sklearn_regression_model.pkl (< 1 MB): 100%|##########| 756/756 [00:00<00:00, 54.1kB/s]

In [8]:
# route 100% of the traffic to the blue deployment
endpoint.traffic = {"blue": 100}
ml_client.begin_create_or_update(endpoint).result()

#endpoint.traffic = {"blue": 0, "green": 0, "orange": 100}
#ml_client.begin_create_or_update(endpoint).result()

ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://endpoint-cpu03060001216956.canadacentral.inference.ml.azure.com/score', 'openapi_uri': 'https://endpoint-cpu03060001216956.canadacentral.inference.ml.azure.com/swagger.json', 'name': 'endpoint-cpu03060001216956', 'description': 'this is a sample online endpoint', 'tags': {'foo': 'bar'}, 'properties': {'createdBy': 'Jose Medina Gomez', 'createdAt': '2025-03-06T05:01:08.229458+0000', 'lastModifiedAt': '2025-03-06T05:12:22.209731+0000', 'azureml.onlineendpointid': '/subscriptions/14585b9f-5c83-4a76-8055-42149123f99f/resourcegroups/umiazmlaks/providers/microsoft.machinelearningservices/workspaces/umiazmlws/onlineendpoints/endpoint-cpu03060001216956', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/14585b9f-5c83-4a76-8055-42149123f99f/providers/Microsoft.MachineLearningServices/locations/canadacentral/mfeOperationsStatus/oeidp:5158600d-81bf-4fd7-b9b8-a0

## 4.6 Verify data collection is enabled

After the deployment is created, you can verify that data collection is properly configured.

In [None]:
# Get deployment details to verify data collection configuration
deployment = ml_client.online_deployments.get(
    name="blue",
    endpoint_name=online_endpoint_name
)

print("Deployment configuration:")
print(f"  Name: {deployment.name}")
print(f"  App Insights enabled: {deployment.app_insights_enabled}")
print(f"  Data collector enabled: {deployment.data_collector is not None}")

if deployment.data_collector:
    print(f"  Sampling rate: {deployment.data_collector.sampling_rate}")
    print("  Collections:")
    for collection_name, collection in deployment.data_collector.collections.items():
        print(f"    - {collection_name}: enabled={collection.enabled}")
    print("\n✅ Data collection is properly configured!")
else:
    print("\n❌ No data collector is configured on this deployment.")
print("\nCollected data will be stored at:")
print(f"  azureml://datastores/workspaceblobstore/paths/modelDataCollector/{online_endpoint_name}/blue/model_inputs/")
print(f"  azureml://datastores/workspaceblobstore/paths/modelDataCollector/{online_endpoint_name}/blue/model_outputs/")

### How to view collected production data

After you invoke the endpoint and production data is collected, you can view it in several ways:

#### Option 1: View in Azure ML Studio UI
1. Go to the **Data** tab in your Azure Machine Learning workspace
2. Navigate to **Datastores** and select **workspaceblobstore (Default)**
3. Use the **Browse** menu and navigate to: `modelDataCollector/{endpoint_name}/blue/model_inputs/` or `model_outputs/`
4. Data is stored in JSONL format with the path structure: `{yyyy}/{MM}/{dd}/{HH}/{instance_id}.jsonl`
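
The folder layout above can be turned into a prefix for one hour of data. This is a minimal sketch: `collected_data_prefix` is a hypothetical helper, the endpoint name and timestamp are placeholders, and treating the hour folders as UTC is an assumption that the text above does not confirm.

```python
from datetime import datetime, timezone


def collected_data_prefix(endpoint_name, deployment_name, collection_name, when):
    """Build the folder prefix that holds one hour of collected data."""
    return (
        f"modelDataCollector/{endpoint_name}/{deployment_name}/"
        f"{collection_name}/{when:%Y/%m/%d/%H}/"
    )


# Hypothetical endpoint name and timestamp, for illustration only.
prefix = collected_data_prefix(
    "my-endpoint", "blue", "model_inputs", datetime(2025, 10, 8, 22, tzinfo=timezone.utc)
)
print(prefix)
```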

#### Option 2: List data assets programmatically
```python
# List data assets created by data collector
data_assets = ml_client.data.list()
for asset in data_assets:
    if online_endpoint_name in asset.name:
        print(f"Data asset: {asset.name} (version {asset.version})")
```

#### Option 3: Query blob storage directly
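```python
from azure.storage.blob import ContainerClient

# Get workspace default datastore (assumed here to be an Azure Blob datastore)
datastore = ml_client.datastores.get_default()

# Construct blob path
blob_path = f"modelDataCollector/{online_endpoint_name}/blue/model_inputs/"
print(f"Data location: {blob_path}")

# List the collected files; assumes your identity is allowed to read blobs in this account
account_url = f"https://{datastore.account_name}.blob.{datastore.endpoint}"
container = ContainerClient(account_url, datastore.container_name, credential=DefaultAzureCredential())
for blob in container.list_blobs(name_starts_with=blob_path):
    print(blob.name)
```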

#### Data Schema
Each collected data file is in JSONL format where each line is a JSON object containing:
- `data`: The actual input/output data (pandas DataFrame converted to JSON)
- `correlationid`: Unique ID linking inputs to outputs
- `time`: Timestamp of the inference request
- `specversion`, `id`, `source`, `type`: CloudEvents metadata

**Note**: The scoring script must use `Collector.collect()` with pandas DataFrames for data to be logged. Your updated `score.py` already includes this instrumentation.
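
To illustrate how `correlationid` links the two collections, the sketch below parses JSONL text and pairs each input with its output. The records are made up to match the fields listed above and omit the CloudEvents metadata; `read_jsonl` and `pair_by_correlation` are hypothetical helpers.

```python
import json

# Hypothetical records shaped like the schema above; real files carry more fields.
inputs_jsonl = "\n".join([
    json.dumps({"correlationid": "a1", "time": "2025-10-08T22:51:00Z", "data": [{"x": 1}]}),
    json.dumps({"correlationid": "b2", "time": "2025-10-08T22:52:00Z", "data": [{"x": 2}]}),
])
outputs_jsonl = json.dumps(
    {"correlationid": "a1", "time": "2025-10-08T22:51:00Z", "data": [{"y": 10}]}
)


def read_jsonl(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def pair_by_correlation(inputs, outputs):
    """Match each input record with the output sharing its correlationid."""
    outputs_by_id = {rec["correlationid"]: rec for rec in outputs}
    return [
        (rec["data"], outputs_by_id[rec["correlationid"]]["data"])
        for rec in inputs
        if rec["correlationid"] in outputs_by_id
    ]


pairs = pair_by_correlation(read_jsonl(inputs_jsonl), read_jsonl(outputs_jsonl))
print(pairs)
```

Inputs without a matching output (here the record with ID `b2`) are skipped rather than raising an error.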

# 5. Test the endpoint with sample data
Using the `MLClient` created earlier, we will get a handle to the endpoint. The endpoint can be invoked using the `invoke` command with the following parameters:
- `endpoint_name` - Name of the endpoint
- `request_file` - File with request data
- `deployment_name` - Name of the specific deployment to test in an endpoint

We will send a sample request using a [json](./model-1/sample-request.json) file. 
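
Callers that cannot use the SDK can send the same request over HTTPS to the endpoint's scoring URI. The sketch below only prepares the request, assuming key-based authentication with the key passed as a bearer token and the `azureml-model-deployment` header used to target one deployment. The URI, key, and payload are placeholders, and `build_scoring_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, key, payload, deployment_name=None):
    """Prepare (but do not send) a POST request for an online endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }
    if deployment_name:
        # Routes the call to one deployment instead of using the traffic split.
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Placeholder URI, key, and payload; take the real ones from your endpoint.
request = build_scoring_request(
    "https://example.invalid/score", "<endpoint-key>", {"data": [[1, 2, 3]]}, "blue"
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send the request.
```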

In [10]:
# test the blue deployment with some sample data
ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="blue",
    request_file="model-1/sample-request.json",
)

'["sklearn_regression_model.pkl"]'

# 6. Managing endpoints and deployments

## 6.1 Get details of the endpoint

In [None]:
# Get the details for online endpoint
endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)

# existing traffic details
print(endpoint.traffic)

# Get the scoring URI
print(endpoint.scoring_uri)

## 6.2 Get the logs for the new deployment
Get the logs for the blue deployment and verify as needed.
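
When the log is long, filtering for likely problem lines can speed up the check. This is a small sketch with made-up log text; `find_problem_lines` and its default markers are hypothetical choices, not something the service defines.

```python
def find_problem_lines(log_text, markers=("ERROR", "Traceback", "Exception")):
    """Return the log lines that contain any of the given markers."""
    return [
        line for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]


# Made-up log text for illustration; in the notebook, pass the string
# returned by ml_client.online_deployments.get_logs(...).
sample_logs = "INFO starting\nERROR model file not found\nINFO done"
print(find_problem_lines(sample_logs))
```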

In [14]:
ml_client.online_deployments.get_logs(
    name="blue", endpoint_name=online_endpoint_name, lines=50
)



# 7. Delete the endpoint

In [None]:
ml_client.online_endpoints.begin_delete(name=online_endpoint_name)