# Deploy a locally trained XGBoost model to a batch endpoint

In [5]:
!az login

A web browser has been opened at https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize. Please continue the login in the web browser. If no web browser is available or if the web browser fails to open, use device code flow with `az login --use-device-code`.
[
  {
    "cloudName": "AzureCloud",
    "homeTenantId": "7b658efc-8cee-4580-b989-85db158d4e3c",
    "id": "3979595d-b401-4ff4-b3ef-661b8481c742",
    "isDefault": true,
    "managedByTenants": [
      {
        "tenantId": "961897a7-5808-4b80-b503-dbb7be13e7a7"
      }
    ],
    "name": "MV-AZU-001",
    "state": "Enabled",
    "tenantId": "7b658efc-8cee-4580-b989-85db158d4e3c",
    "user": {
      "name": "ben.dixon@multiverse.io",
      "type": "user"
    }
  }
]


In [6]:
from azure.ai.ml import MLClient, Input
from azure.ai.ml.entities import (
    BatchEndpoint,
    ModelBatchDeployment,
    ModelBatchDeploymentSettings,
    Model,
    Environment,
    AmlCompute,
    BatchRetrySettings,
    CodeConfiguration,
)
from azure.identity import DefaultAzureCredential
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction

In [7]:
credential = DefaultAzureCredential()
ml_client = MLClient.from_config(credential)

Found the config file in: /Users/ben.dixon/Repos/azure-experiments/config.json


## 2. Create Batch Endpoint
Batch endpoints are used for batch inferencing on large volumes of data over a period of time. They receive pointers to data and run jobs asynchronously to process the data in parallel on compute clusters, then store the outputs in a data store for further analysis.

In [8]:
endpoint_name = "xgboost-batch"

In [9]:
import random
import string

# Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = f"{endpoint_name}-{endpoint_suffix}"

print(f"Endpoint name: {endpoint_name}")

Endpoint name: xgboost-batch-pnp9z


Let's configure the endpoint:

In [10]:
endpoint = BatchEndpoint(
    name=endpoint_name,
    description="A batch endpoint for returning xgboost inferences.",
    tags={"type": "deep-learning"},
)

In [11]:
ml_client.begin_create_or_update(endpoint).result()

BatchEndpoint({'scoring_uri': 'https://xgboost-batch-pnp9z.ukwest.inference.ml.azure.com/jobs', 'openapi_uri': None, 'provisioning_state': 'Succeeded', 'name': 'xgboost-batch-pnp9z', 'description': 'A batch endpoint for returning xgboost inferences.', 'tags': {'type': 'deep-learning'}, 'properties': {'BatchEndpointCreationApiVersion': '2023-10-01', 'azureml.onlineendpointid': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/batchEndpoints/xgboost-batch-pnp9z'}, 'print_as_yaml': False, 'id': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/batchEndpoints/xgboost-batch-pnp9z', 'Resource__source_path': '', 'base_path': '/Users/ben.dixon/Repos/azure-experiments/deploy_batch/xgboost', 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x105427f10>, 'auth_mode': 

## 3. Registering the model

To deploy a locally trained model, we first have to register it in the workspace.

In [12]:
model_name = "xgboost_model"
model_local_path = "assets/xgboost_model.json"

xgboost_model = Model(
    path=model_local_path,
    type=AssetTypes.CUSTOM_MODEL,
    name=model_name,
    description="Model created from local files.",
)
ml_client.models.create_or_update(xgboost_model)

Model({'job_name': None, 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'xgboost_model', 'description': 'Model created from local files.', 'tags': {}, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/models/xgboost_model/versions/22', 'Resource__source_path': '', 'base_path': '/Users/ben.dixon/Repos/azure-experiments/deploy_batch/xgboost', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x10f9d2eb0>, 'serialize': <msrest.serialization.Serializer object at 0x10f9d2e20>, 'version': '22', 'latest_version': None, 'path': 'azureml://subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/workspaces/ai-mlw-dev-01/datastores/workspaceblobstore/paths/LocalUpload/d5fb0d93cd2f9ee52e95cb4874b7a63c/xgboost_model.json', 'datas

Let's get a reference to the model:

In [13]:
model = ml_client.models.get(name=model_name, label="latest")

## 4. Create a deployment
A deployment is a set of resources required for hosting the model that does the actual inferencing.

### 4.2 Creating the compute

Batch deployments can run on any Azure ML compute that already exists in the workspace, so multiple batch deployments can share the same compute infrastructure. In this example, we use an existing Azure ML compute cluster called `ben-small-test`. The commented-out code below shows how to create the cluster if it does not already exist in the workspace.

In [15]:
compute_name = "ben-small-test"
# if not any(filter(lambda m: m.name == compute_name, ml_client.compute.list())):
#     compute_cluster = AmlCompute(
#         name=compute_name,
#         description="CPU cluster compute",
#         min_instances=0,
#         max_instances=2,
#     )
#     ml_client.compute.begin_create_or_update(compute_cluster).result()

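Compute may take a few minutes to be created. Calling `.result()` on the poller returned by `begin_create_or_update`, as in the commented-out code above, blocks until provisioning finishes.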

### 4.3 Creating the environment

Let's create the environment. Our model runs on XGBoost, so we define a custom environment from a base Azure ML image and a conda file (`assets/conda.yaml`) that lists the packages the scoring script needs.

In [63]:
env = Environment(
    name="xgboost-batch-inference-env",
    conda_file="assets/conda.yaml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

### 4.3 Configure the deployment

In [72]:
deployment = ModelBatchDeployment(
    name="iris-xgboost-depl",
    description="A deployment using xgboost to classify Iris data.",
    endpoint_name=endpoint_name,
    model=model,
    code_configuration=CodeConfiguration(
        code="assets/", scoring_script="batch_driver.py"
    ),
    environment=env,
    compute=compute_name,
    settings=ModelBatchDeploymentSettings(
        max_concurrency_per_instance=1,
        mini_batch_size=4,
        instance_count=1,
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name="predictions.csv",
        retry_settings=BatchRetrySettings(max_retries=3, timeout=30),
        logging_level="info",
    ),
)

### 4.4 Create the deployment
Using the `MLClient` created earlier, we will now create the deployment in the workspace. `begin_create_or_update` starts the deployment creation and returns a poller; calling `.result()` on it blocks until the creation finishes.

In [73]:
ml_client.begin_create_or_update(deployment).result()

Uploading assets (0.01 MBs): 100%|██████████| 6588/6588 [00:00<00:00, 19116.93it/s]

BatchDeployment({'provisioning_state': 'Succeeded', 'endpoint_name': 'xgboost-batch-pnp9z', 'type': None, 'name': 'iris-xgboost-depl', 'description': 'A deployment using xgboost to classify Iris data.', 'tags': {}, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/batchEndpoints/xgboost-batch-pnp9z/deployments/iris-xgboost-depl', 'Resource__source_path': '', 'base_path': '/Users/ben.dixon/Repos/azure-experiments/deploy_batch/xgboost', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x117b4adc0>, 'serialize': <msrest.serialization.Serializer object at 0x117b4aaf0>, 'model': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/models/xgboost_model/versions/22', 'code_configuration': {'code': '/subscriptions/3979595d-b40

Let's update the default deployment name in the endpoint:

In [74]:
endpoint = ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

BatchEndpoint({'scoring_uri': 'https://xgboost-batch-pnp9z.ukwest.inference.ml.azure.com/jobs', 'openapi_uri': None, 'provisioning_state': 'Succeeded', 'name': 'xgboost-batch-pnp9z', 'description': 'A batch endpoint for returning xgboost inferences.', 'tags': {'type': 'deep-learning'}, 'properties': {'BatchEndpointCreationApiVersion': '2023-10-01', 'azureml.onlineendpointid': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/batchEndpoints/xgboost-batch-pnp9z'}, 'print_as_yaml': False, 'id': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/batchEndpoints/xgboost-batch-pnp9z', 'Resource__source_path': '', 'base_path': '/Users/ben.dixon/Repos/azure-experiments/deploy_batch/xgboost', 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x117b379d0>, 'auth_mode': 

We can see the details of the deployment as follows:

In [75]:
ml_client.batch_deployments.get(name=deployment.name, endpoint_name=endpoint.name)

BatchDeployment({'provisioning_state': 'Succeeded', 'endpoint_name': 'xgboost-batch-pnp9z', 'type': None, 'name': 'iris-xgboost-depl', 'description': 'A deployment using xgboost to classify Iris data.', 'tags': {}, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/batchEndpoints/xgboost-batch-pnp9z/deployments/iris-xgboost-depl', 'Resource__source_path': '', 'base_path': '/Users/ben.dixon/Repos/azure-experiments/deploy_batch/xgboost', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x117b37ee0>, 'serialize': <msrest.serialization.Serializer object at 0x117f46640>, 'model': '/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourceGroups/aiplatform-1/providers/Microsoft.MachineLearningServices/workspaces/ai-mlw-dev-01/models/xgboost_model/versions/22', 'code_configuration': {'code': '/subscriptions/3979595d-b40

### 4.5 Test the endpoint with sample data
Using the `MLClient` created earlier, we can invoke the endpoint with `batch_endpoints.invoke`, which takes the following parameters:
- `endpoint_name` - Name of the endpoint
- `input` - An `Input` pointing to the data to score
- `deployment_name` - Name of the specific deployment to test in the endpoint

#### 4.5.1 Invoke the endpoint

Let's now invoke the endpoint to start a batch scoring job:

In [76]:
print(deployment.name)
print(endpoint_name)

iris-xgboost-depl
xgboost-batch-pnp9z


In [77]:
# job = ml_client.batch_endpoints.invoke(
#     endpoint_name=endpoint_name,
#     deployment_name=deployment.name,
#     input=Input(
#         path="https://azuremlexampledata.blob.core.windows.net/data/mnist/sample/",
#         type=AssetTypes.URI_FOLDER,
#     ),
# )

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint_name,
    deployment_name=deployment.name,
    input=Input(
        path="assets/batch-requests",
        type=AssetTypes.URI_FOLDER
    )
)

#### 4.5.2 Get the details of the invoked job
Let's get the details of the invoked job:

In [78]:
ml_client.jobs.get(job.name)

Experiment,Name,Type,Status,Details Page
xgboost-batch-pnp9z,batchjob-ecf96072-82b5-46d4-ab3f-91ca59027dd2,pipeline,Preparing,Link to Azure Machine Learning studio


We can wait for the job to finish using the following code:

In [79]:
ml_client.jobs.stream(job.name)

RunId: batchjob-ecf96072-82b5-46d4-ab3f-91ca59027dd2
Web View: https://ml.azure.com/runs/batchjob-ecf96072-82b5-46d4-ab3f-91ca59027dd2?wsid=/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourcegroups/aiplatform-1/workspaces/ai-mlw-dev-01

Streaming logs/azureml/executionlogs.txt

[2024-05-28 17:04:47Z] Submitting 1 runs, first five are: e8dbd442:6d60f7ab-8c9a-4c11-b682-c84295da2a8a
[2024-05-28 17:06:36Z] Execution of experiment failed, update experiment status and cancel running nodes.

Execution Summary
RunId: batchjob-ecf96072-82b5-46d4-ab3f-91ca59027dd2
Web View: https://ml.azure.com/runs/batchjob-ecf96072-82b5-46d4-ab3f-91ca59027dd2?wsid=/subscriptions/3979595d-b401-4ff4-b3ef-661b8481c742/resourcegroups/aiplatform-1/workspaces/ai-mlw-dev-01


JobException: Exception : 
 {
    "error": {
        "code": "UserError",
        "message": "Pipeline has failed child jobs. For more details and logs, please go to the job detail page and check the child jobs.",
        "message_format": "Pipeline has failed child jobs. {0}",
        "message_parameters": {},
        "reference_code": "PipelineHasStepJobFailed",
        "details": []
    },
    "environment": "ukwest",
    "location": "ukwest",
    "time": "2024-05-28T17:06:36.154881Z",
    "component_name": ""
} 
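
The streamed error only reports that a child job failed. One way to dig further is to list the children of the pipeline job and look at the ones that failed. The helper below is a small sketch of that pattern. `failed_child_jobs` is a name introduced here, and it relies only on `ml_client.jobs.list(parent_job_name=...)` and the `status` attribute that this notebook already uses.

```python
from typing import Any, List


def failed_child_jobs(ml_client: Any, parent_job_name: str) -> List[Any]:
    """Return the child jobs of a batch (pipeline) job whose status is 'Failed'."""
    children = ml_client.jobs.list(parent_job_name=parent_job_name)
    return [child for child in children if child.status == "Failed"]


# Usage in the notebook (needs the live workspace, so it is left commented out):
# for child in failed_child_jobs(ml_client, job.name):
#     print(child.name, child.status)
#     ml_client.jobs.download(name=child.name, download_path="./logs", all=True)
```

Downloading with `all=True` fetches the logs along with any named outputs, which is usually where the scoring script's traceback ends up.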

#### 4.5.3 Download the results

The deployment creates a child job that executes the scoring. We can get the details of it using the following code:

In [28]:
scoring_job = list(ml_client.jobs.list(parent_job_name=job.name))[0]

In [30]:
print("Job name:", scoring_job.name)
print("Job status:", scoring_job.status)
print(
    "Job duration:",
    scoring_job.creation_context.last_modified_at
    - scoring_job.creation_context.created_at,
)

Job name: 00650371-b85f-40c3-a7af-9d4b17dd4645
Job status: Failed
Job duration: 0:00:07.876290


The outputs generated by the deployment job will be placed in an output named `score`:

In [31]:
ml_client.jobs.download(name=scoring_job.name, download_path=".", output_name="score")

Downloading artifact azureml://datastores/workspaceblobstore/paths/azureml/00650371-b85f-40c3-a7af-9d4b17dd4645/score/ to named-outputs/score


We can read this data using the pandas library:

In [32]:
import pandas as pd

score = pd.read_csv(
    "named-outputs/score/predictions.csv",
    header=None,
    names=["file", "class"],
    sep=" ",
)
score

Unnamed: 0,file,class


### 4.6 Override deployment configuration at invocation time

#### 4.6.1 Output location

You can indicate the output path where you want the job to place the results. Outputs can only be placed in data stores, so let's first get a reference to a data store registered in Azure ML (here, the workspace default):

In [49]:
batch_ds = ml_client.datastores.get_default()

Run the job:

In [50]:
filename = f"predictions-{random.randint(0,99999)}.csv"

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint_name,
    input=Input(
        path="https://azuremlexampledata.blob.core.windows.net/data/mnist/sample/",
        type=AssetTypes.URI_FOLDER,
    ),
    params_override=[
        {"output_dataset.datastore_id": f"azureml:{batch_ds.id}"},
        {"output_dataset.path": f"/{endpoint_name}/"},
        {"output_file_name": filename},
    ],
)

To find the output file in Azure Machine Learning studio, go to Data > Datastores > `workspaceblobstore (default)` and browse to `{endpoint_name}/{filename}`. The cell below prints the datastore name followed by that path:

In [63]:
print(batch_ds.id.split('/')[-1] + '/' + endpoint_name + '/' + filename)

workspaceblobstore/mnist-batch-vfp7m/predictions-54.csv


#### 4.6.2 Override deployment configuration settings

Some other parameters can be overridden at invocation time, including `mini_batch_size` and `instance_count`:

In [None]:
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint_name,
    input=Input(
        path="https://azuremlexampledata.blob.core.windows.net/data/mnist/sample/"
    ),
    params_override=[{"mini_batch_size": "20"}, {"compute.instance_count": "5"}],
)
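
To see what changing `mini_batch_size` means for file inputs, it helps to think of the input folder as a list of files that is cut into groups, with each group passed to one call of the scoring script's `run()` function. The sketch below is only a conceptual illustration of that grouping, not Azure ML's actual scheduler. `split_into_mini_batches` and the file names are made up for the example.

```python
from typing import List


def split_into_mini_batches(files: List[str], mini_batch_size: int) -> List[List[str]]:
    """Cut a list of files into consecutive groups of at most mini_batch_size."""
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be at least 1")
    return [files[i:i + mini_batch_size] for i in range(0, len(files), mini_batch_size)]


# Ten hypothetical input files.
files = [f"request-{i:02d}.csv" for i in range(10)]
print(len(split_into_mini_batches(files, 4)))   # 3 groups: 4 + 4 + 2 files
print(len(split_into_mini_batches(files, 20)))  # 1 group holding all 10 files
```

A larger `mini_batch_size` means fewer, bigger calls to `run()`, while `instance_count` controls how many nodes are available to process those calls in parallel.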

## 5. Clean up Resources

In [None]:
ml_client.batch_endpoints.begin_delete(name=endpoint_name)