# Model Registration and Deployment with Azure Machine Learning

This notebook demonstrates the end-to-end process of selecting, registering, and deploying a pneumonia detection model using Azure Machine Learning. After completing hyperparameter tuning, this notebook guides you through analyzing model performance, selecting the best model, registering it in the Azure ML model registry, deploying it as a real-time inference endpoint, and validating the deployment with actual X-ray images.

## Prerequisites

Before running this notebook, you should have:

- An Azure Machine Learning workspace with appropriate permissions.
- Completed the model training and hyperparameter tuning process (sweep job).
- Access to the dataset for testing the deployed model.
- The necessary Python packages installed.
- Proper authentication configured for your Azure environment.
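
A quick way to confirm the package prerequisite is to check that each module can be found before running any cells. The sketch below is a minimal check, assuming the module names imported later in this notebook (`azure.ai.ml`, `azure.identity`, `mlflow`, `pandas`, `requests`); the helper name `find_missing_modules` is hypothetical.

```python
import importlib.util

# Assumption: these are the third-party modules this notebook imports.
REQUIRED_MODULES = ["azure.ai.ml", "azure.identity", "mlflow", "pandas", "requests"]


def find_missing_modules(module_names):
    """Return the subset of module_names that cannot be located in this environment."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. "azure") is itself missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


missing = find_missing_modules(REQUIRED_MODULES)
print("Missing modules:", missing or "none")
```

An empty result only shows that the modules are installed; it does not verify versions or authentication.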

## What You'll Do

- **Analyze hyperparameter sweep results** - Review metrics from multiple training runs to identify the best-performing model configuration
- **Register the optimal model** - Create a versioned model asset in the Azure ML model registry with proper metadata
- **Deploy to a managed online endpoint** - Configure and deploy the model as a scalable REST API service on Azure
- **Test model inference** - Validate the deployed endpoint by sending a real X-ray image and receiving predictions

In [1]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from workshop_helpers.utils import get_unique_name

credential = DefaultAzureCredential()
ml_client = MLClient.from_config(credential)

unique_name = get_unique_name(credential)


Found the config file in: /config.json


Validicating credential....
Credential Validated
Determining unique name....
Unique name: jmerkow


## Analyze Hyperparameter Sweep Results

This cell analyzes the results from our completed hyperparameter sweep, allowing us to identify the best-performing model configuration based on validation metrics. Hyperparameter tuning is a critical step in the machine learning workflow that helps optimize model performance by systematically searching through different parameter combinations. By collecting and analyzing the metrics from all completed trial runs, we can make an informed decision about which model configuration to register and deploy.

The process involves:

- **Retrieving sweep information**: Accessing a specific sweep job by name and getting its current status
- **Collecting trial runs**: Fetching all individual experiment runs that were part of the hyperparameter sweep
- **Setting up MLflow tracking**: Connecting to the Azure ML workspace's MLflow tracking server to access logged metrics
- **Extracting performance metrics**: Gathering validation metrics from only the completed runs to ensure valid comparisons
- **Ranking configurations**: Creating a sorted DataFrame that ranks model configurations based on validation AUC, with higher values indicating better performance

Once this analysis is complete, we can identify the top-performing model configuration and use its parameters for the final model registration and deployment, ensuring we proceed with the most effective version of our pneumonia detection model.

In [3]:
import mlflow
import pandas as pd

sweep_job = "teal_rose_2zh0jt4q39"

# Get the latest status of the sweep job
returned_sweep_job = ml_client.jobs.get(name=sweep_job)

# Get all the trial runs for this sweep job
runs = ml_client.jobs.list(parent_job_name=sweep_job)

# Set up MLflow tracking to access run metrics
mlflow_tracking_uri = ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
mlflow.set_tracking_uri(mlflow_tracking_uri)

# Collect metrics from completed runs
sweep_results = []
for run in runs:
    if run.status != "Completed":
        continue
    # Copy the logged metrics and tag them with the run name for later lookup
    metrics = dict(mlflow.get_run(run_id=run.name).data.metrics)
    metrics["name"] = run.name
    sweep_results.append(metrics)

# Create a DataFrame with results and sort by validation AUC (higher is better)
df = pd.DataFrame(sweep_results).sort_values(by="val_auc", ascending=False)
df




| | val_acc | val_f1 | val_precision | val_auc | val_recall | val_loss | val_goal_metric_val | train_acc | train_loss | name |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.11978 | 0.202163 | 0.112656 | 0.760838 | 0.983806 | 0.693158 | 0.760838 | 0.102043 | 0.693159 | teal_rose_2zh0jt4q39_0 |
| 2 | 0.12804 | 0.201681 | 0.112518 | 0.719818 | 0.97166 | 0.693148 | 0.719818 | 0.102043 | 0.693148 | teal_rose_2zh0jt4q39_2 |
| 3 | 0.113355 | 0.203627 | 0.113355 | 0.620293 | 1.0 | 0.693147 | 0.620293 | 0.102043 | 0.693147 | teal_rose_2zh0jt4q39_3 |
| 4 | 0.11519 | 0.202647 | 0.112851 | 0.573866 | 0.991903 | 0.693147 | 0.573866 | 0.102043 | 0.693147 | teal_rose_2zh0jt4q39_4 |
| 1 | 0.886645 | 0.0 | 0.0 | 0.5 | 0.0 | 0.693147 | 0.5 | 0.897957 | 0.693147 | teal_rose_2zh0jt4q39_1 |
| 5 | 0.886645 | 0.0 | 0.0 | 0.5 | 0.0 | 0.693147 | 0.5 | 0.897957 | 0.693147 | teal_rose_2zh0jt4q39_5 |
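
These results show why the ranking uses AUC rather than accuracy. Runs 1 and 5 have the highest `val_acc` (0.886645) but zero recall and an AUC of 0.5, which is what a model that never predicts pneumonia would score on an imbalanced validation set. The sketch below is a rough illustration, assuming a positive-class prevalence of 0.113355 (inferred from run 3, where recall is 1.0 and precision equals accuracy); the helper `constant_classifier_metrics` is hypothetical and not part of the training code.

```python
import math


def constant_classifier_metrics(prevalence, predict_positive):
    """Accuracy, precision, and recall of a classifier that always predicts one class.

    All confusion-matrix cells are expressed as fractions of the dataset.
    """
    if predict_positive:
        tp, fp, fn, tn = prevalence, 1.0 - prevalence, 0.0, 0.0
    else:
        tp, fp, fn, tn = 0.0, 0.0, prevalence, 1.0 - prevalence
    accuracy = tp + tn
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


PREVALENCE = 0.113355  # assumption: inferred from run 3 in the results above

always_negative = constant_classifier_metrics(PREVALENCE, predict_positive=False)
always_positive = constant_classifier_metrics(PREVALENCE, predict_positive=True)
print(always_negative)
print(always_positive)

# Binary cross-entropy of a constant 0.5 prediction is ln 2.
print(round(math.log(2), 6))
```

Under this assumption, the always-negative case reproduces the accuracy of runs 1 and 5, and the always-positive case reproduces run 3. Every run also reports a loss close to ln 2, the binary cross-entropy of a constant 0.5 prediction, so it may be worth checking whether these trials moved far from their initial outputs before relying on the selected model.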


## Register Best Model in Azure Machine Learning

This step transitions our best-performing model from an experiment artifact to an officially registered model in the Azure ML registry. Model registration is a pivotal moment in the ML lifecycle that bridges the gap between experimentation and production deployment. By registering the model, we create a centralized, versioned asset that becomes the foundation for deployment, enabling traceability, governance, and reproducibility throughout the model's lifecycle.

The registration process involves:

- **Best Model Selection**: Identifying the top-performing run from the hyperparameter sweep, which is the first row of the DataFrame sorted by validation AUC
- **Artifact Location Reference**: Creating a reference to the MLflow model artifact produced during the training run
- **Model Entity Creation**: Defining a `Model` object with:
  - A unique name that includes a custom identifier
  - The path to the MLflow model artifacts from the best training run
  - A descriptive summary of the model's purpose and training data source
  - The type `AssetTypes.MLFLOW_MODEL` to maintain compatibility with the MLflow format
- **Version Management**: Because no version is specified, `create_or_update` registers the model as a new version if models with this name already exist

Once registered, the model is a managed asset in the Azure ML registry that can be deployed to various targets, shared with other team members, and tracked for compliance and governance purposes.
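
The artifact reference is a plain `azureml://` URI that combines the job name with the path of the model directory inside the job's artifacts. The sketch below shows that string format in isolation, using the same pattern as this notebook's registration cell; the helper name `job_artifact_uri` and the example values are hypothetical.

```python
def job_artifact_uri(job_name, artifact_path):
    """Build an azureml:// URI pointing at a path inside a job's output artifacts."""
    if not job_name:
        raise ValueError("job_name must be a non-empty string")
    # Drop leading slashes so the path joins cleanly onto the URI prefix.
    relative_path = artifact_path.lstrip("/")
    return f"azureml://jobs/{job_name}/outputs/artifacts/paths/{relative_path}"


# Hypothetical values for illustration only.
uri = job_artifact_uri("example_job_name", "/model")
print(uri)
```

Keeping the format in one helper makes it easier to spot a wrong job name or artifact path before the registration call is made.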

In [None]:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

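# The DataFrame is sorted by validation AUC, so the first row is the best run
best_run_name = df.iloc[0]["name"]

best_run = ml_client.jobs.get(name=best_run_name)
# Read the model directory from the job's "mlflow_model_dir" input.
# Note: _to_job_input() is a private SDK method and may change between releases.
model_dir = best_run.inputs["mlflow_model_dir"]._to_job_input()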

# Create a model object referencing the output from your training job
model = Model(
    name=f"pneumonia_detection_model-{unique_name}",  # Choose a name for your model
    path=f"azureml://jobs/{best_run.name}/outputs/artifacts/paths/{model_dir}",
    description="Pneumonia detection model trained on RSNA dataset",
    type=AssetTypes.MLFLOW_MODEL  # This matches the output type from your job
)

# Register the model in your workspace
run_model = ml_client.models.create_or_update(model)

run_model


Model({'job_name': 'honest_onion_54rs3rjg81', 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'pneumonia_detection_model-jmerkow', 'description': 'Pneumonia detection model trained on RSNA dataset', 'tags': {}, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/6c180dd2-1ec4-4fad-8ba8-1f2d8d67c129/resourceGroups/fmmg-mars-collab/providers/Microsoft.MachineLearningServices/workspaces/fmmg-mars-collab/models/pneumonia_detection_model-jmerkow/versions/4', 'Resource__source_path': '', 'base_path': '/mnt/batch/tasks/shared/LS_root/mounts/clusters/jmerkow-cpu-uw/code/Users/jmerkow/UW Workshop/tutorial/04_ModelRegistrationAndDeployment', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x7f836b9f79a0>, 'serialize': <msrest.serialization.Serializer object at 0x7f836b9f6d40>, 'version': '4', 'latest_version': None, 'path': 'azureml://subscriptions/6c180dd2-1ec4-4fad-8ba8-1f2d8d67c
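
The `id` in this output ends with the model name and version, and it is the value the deployment later uses to refer to the model. The sketch below pulls those two fields out of an ID string of that shape; `parse_model_id` is a hypothetical helper and the example ID uses placeholder values.

```python
import re

# Matches the trailing ".../models/<name>/versions/<version>" part of a model asset ID.
_MODEL_ID_PATTERN = re.compile(r"/models/(?P<name>[^/]+)/versions/(?P<version>[^/]+)$")


def parse_model_id(model_id):
    """Return (name, version) extracted from a workspace model asset ID."""
    match = _MODEL_ID_PATTERN.search(model_id)
    if match is None:
        raise ValueError(f"Not a model asset ID: {model_id!r}")
    return match.group("name"), match.group("version")


# Placeholder ID for illustration; real IDs contain your subscription and workspace.
example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg"
    "/providers/Microsoft.MachineLearningServices/workspaces/example-ws"
    "/models/pneumonia_detection_model-example/versions/4"
)
print(parse_model_id(example_id))
```

In the notebook itself, `run_model.name` and `run_model.version` expose the same values directly, so parsing is only needed when all you have is the ID string.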

## Deploy Model to a Managed Online Endpoint

This critical step transforms our registered model into an operational web service, making it accessible for real-time inference through a secure REST API. Deployment is the bridge between a trained model and its real-world application, enabling other systems and users to interact with our pneumonia detection solution. By using Azure ML's managed online endpoints, we benefit from infrastructure management, scaling capabilities, and monitoring features that are essential for production-grade AI systems.

The deployment process involves:

- **Model Registration Confirmation**: Finalizing the model registration to ensure it's properly stored in the Azure ML model registry
- **Endpoint Creation**: Establishing a unique, persistent URL endpoint that will host our model:
  - Using a workshop-specific naming convention to ensure uniqueness
  - Creating the infrastructure that will handle incoming requests
- **Deployment Configuration**: Setting up the runtime environment for our model:
  - Specifying a GPU-enabled VM size (`Standard_NC6s_v3`) for optimal inference performance
  - Configuring a single instance for handling the expected inference load
  - Linking the deployment to our registered model through its unique identifier
- **Traffic Allocation**: Configuring 100% of incoming traffic to route to our deployment, enabling immediate availability

Once deployed, our pneumonia detection model becomes a live service that can accept and process X-ray images, returning predictions through a standardized API. This enables integration with various applications, from clinical decision support systems to research tools, fulfilling the ultimate goal of making our AI solution useful and accessible.
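
The traffic allocation is a mapping from deployment name to a percentage. The sketch below is a small, hypothetical pre-flight check (`validate_traffic` is not an SDK function), assuming the rule that the percentages across an endpoint's deployments must sum to either 0 or 100.

```python
def validate_traffic(traffic):
    """Check a {deployment_name: percent} mapping before assigning it to an endpoint."""
    for name, percent in traffic.items():
        if not isinstance(percent, int) or not 0 <= percent <= 100:
            raise ValueError(f"Traffic for {name!r} must be an integer from 0 to 100")
    total = sum(traffic.values())
    # Assumption: the service accepts only totals of 0 (no traffic) or 100.
    if total not in (0, 100):
        raise ValueError(f"Traffic percentages sum to {total}, expected 0 or 100")
    return traffic


# Hypothetical deployment name for illustration.
print(validate_traffic({"workshop-example": 100}))
```

With a single deployment, as in this notebook, the only useful non-zero allocation is 100.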

In [None]:
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment

# Re-submit the registered model so run_model holds the registry's record, including its id
run_model = ml_client.models.create_or_update(run_model)

endpoint_name = f"workshop-{unique_name}"
print(endpoint_name)

# Create the endpoint (or update it if it already exists) and wait for the operation to finish
endpoint = ManagedOnlineEndpoint(name=endpoint_name)
endpoint = ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Deploy the registered model to the endpoint on a single GPU instance
deployment = ManagedOnlineDeployment(
    name=endpoint_name,
    endpoint_name=endpoint.name,
    model=run_model.id,
    instance_type="Standard_NC6s_v3",
    instance_count=1,
)
deployment = ml_client.online_deployments.begin_create_or_update(deployment).result()

# Route all incoming traffic to the new deployment
endpoint.traffic = {deployment.name: 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()


workshop-jmerkow


Check: endpoint workshop-jmerkow exists



## Test Model Endpoint with Real Data

This validation step completes the end-to-end workflow by sending an actual patient X-ray image to the deployed endpoint. Testing a deployed endpoint is an essential quality assurance measure: it confirms that the endpoint is reachable, accepts the expected input format, and returns a valid prediction. A successful request shows that the model has made the transition from training artifact to running service, which is the precondition for integrating it with clinical systems or other applications.

The testing process involves:

- **Authentication**: Retrieving the endpoint's primary access key to establish secure communication
- **Data Preparation Functions**:
  - Creating a base64 encoding function to convert DICOM images into a format suitable for API transmission
  - Implementing a payload formatter that structures the encoded image according to the expected API schema
  - Building a request handler that manages the HTTP POST operation with proper authentication headers
- **Test Execution**: Sending a DICOM file to our endpoint
- **Result Verification**: Receiving and displaying the model's prediction response, confirming the deployed model can successfully process inputs and generate meaningful outputs

This successful test provides confidence that our pneumonia detection model is properly deployed and functioning as expected in a production environment.
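
The payload format can be sanity-checked locally before calling the endpoint. The sketch below builds the `input_data` structure from in-memory bytes and decodes it again to confirm that the round trip is lossless; `build_payload` and the placeholder bytes are hypothetical stand-ins for a real DICOM file.

```python
import base64
import json


def build_payload(raw_bytes):
    """Wrap raw file bytes, base64-encoded, in a single-row tabular payload."""
    image_b64 = base64.b64encode(raw_bytes).decode("utf-8")
    return {"input_data": {"columns": ["image"], "index": [0], "data": [[image_b64]]}}


# Placeholder content: a 128-byte preamble and the "DICM" marker, followed by dummy data.
fake_dicom = b"\x00" * 128 + b"DICM" + b"placeholder pixel data"

payload = build_payload(fake_dicom)
serialized = json.dumps(payload)  # the payload must be JSON-serializable

# Decode the single cell of the table and compare it with the original bytes.
decoded = base64.b64decode(payload["input_data"]["data"][0][0])
print(decoded == fake_dicom)
```

Base64 is used because JSON cannot carry raw binary data; the encoded string is roughly a third larger than the original file.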

In [None]:
import base64

import requests

key = ml_client.online_endpoints.get_keys(endpoint.name).primary_key

def encode_file_to_base64(filepath):
    """
    Read a file and convert it to base64 string.
    
    Args:
        filepath (str): Path to the file
        
    Returns:
        str: Base64 encoded string of the file
    """
    with open(filepath, "rb") as file:
        encoded_string = base64.b64encode(file.read()).decode("utf-8")
    return encoded_string

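def make_payload(image_path):
    """Wrap a base64-encoded image in the tabular `input_data` schema sent to the endpoint."""
    image_b64 = encode_file_to_base64(image_path)
    return {
        "input_data": {"columns": ["image"], "index": [0], "data": [[image_b64]]},
    }

def make_request(image_path):
    """POST one image to the scoring URI and return the decoded JSON response."""
    response = requests.post(
        endpoint.scoring_uri,
        json=make_payload(image_path),
        headers={"Authorization": f"Bearer {key}"},
        timeout=60,  # avoid waiting indefinitely if the endpoint does not respond
    )
    # Raise on HTTP errors instead of trying to parse an error body as a prediction
    response.raise_for_status()
    return response.json()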

filepath = "/home/azureuser/data/rsna-pneumonia-detection-challenge/stage_2_train_images/fffec09e-8a4a-48b1-b33e-ab4890ccd136.dcm"

data = make_request(filepath)

data


[{'predictions': [0.00012637874169740826]}]
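
The response is a list with one entry per request row, each holding a `predictions` list. The sketch below extracts the scores and applies a decision threshold. Treating the score as the probability of pneumonia and using 0.5 as the threshold are both assumptions, and `interpret_response` is a hypothetical helper.

```python
def interpret_response(response, threshold=0.5):
    """Flatten the endpoint response into one {score, pneumonia} record per prediction."""
    scores = [score for row in response for score in row["predictions"]]
    # Assumption: a score at or above the threshold is read as a positive finding.
    return [{"score": score, "pneumonia": score >= threshold} for score in scores]


example_response = [{"predictions": [0.00012637874169740826]}]
results = interpret_response(example_response)
print(results)
```

The right threshold depends on the intended trade-off between missed cases and false alarms, so it should be chosen from validation data rather than fixed at 0.5.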