---
title: Deploy models for scoring in batch endpoints
titleSuffix: Azure Machine Learning
description: In this article, learn how to create a batch endpoint to continuously batch score large data.
services: machine-learning
ms.service: machine-learning
ms.subservice: inferencing
ms.topic: how-to
author: msakande
ms.author: mopeakande
ms.reviewer: cacrest
ms.date: 04/02/2024
ms.custom: how-to, devplatv2, update-code
---

# Deploy models for scoring in batch endpoints

[!INCLUDE cli v2]

Batch endpoints provide a convenient way to deploy models that run inference over large volumes of data. These endpoints simplify the process of hosting your models for batch scoring, so that your focus is on machine learning, rather than the infrastructure.

Use batch endpoints for model deployment when:

- You have expensive models that require a longer time to run inference.
- You need to perform inference over large amounts of data that is distributed in multiple files.
- You don't have low latency requirements.
- You can take advantage of parallelization.

In this article, you use a batch endpoint to deploy a machine learning model that solves the classic MNIST (Modified National Institute of Standards and Technology) digit recognition problem. Your deployed model then performs batch inferencing over large amounts of data—in this case, image files. You begin by creating a batch deployment of a model that was created using Torch. This deployment becomes the default one in the endpoint. Later, you create a second deployment of a model that was created with TensorFlow (Keras), test the second deployment, and then set it as the endpoint's default deployment.

To follow along with the code samples and files needed to run the commands in this article locally, see the Clone the examples repository section. The code samples and files are contained in the azureml-examples repository.

## Prerequisites

[!INCLUDE machine-learning-batch-prereqs-studio]

## Clone the examples repository

[!INCLUDE machine-learning-batch-clone]

## Prepare your system

### Connect to your workspace

First, connect to the Azure Machine Learning workspace where you'll work.

If you haven't already set the defaults for the Azure CLI, save your default settings. To avoid passing in the values for your subscription, workspace, resource group, and location multiple times, run this code:

```azurecli
az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>
```

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, you connect to the workspace in which you'll perform deployment tasks.

  1. Import the required libraries:

    ```python
    from azure.ai.ml import MLClient, Input, load_component
    from azure.ai.ml.entities import BatchEndpoint, ModelBatchDeployment, ModelBatchDeploymentSettings, PipelineComponentBatchDeployment, Model, AmlCompute, Data, BatchRetrySettings, CodeConfiguration, Environment
    from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
    from azure.ai.ml.dsl import pipeline
    from azure.identity import DefaultAzureCredential
    ```

    > [!NOTE]
    > Classes `ModelBatchDeployment` and `PipelineComponentBatchDeployment` were introduced in version 1.7.0 of the SDK.

  2. Configure workspace details and get a handle to the workspace:

    ```python
    subscription_id = "<subscription>"
    resource_group = "<resource-group>"
    workspace = "<workspace>"

    ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
    ```

Open the Azure Machine Learning studio portal and sign in using your credentials.


### Create compute

Batch endpoints run on compute clusters and support both Azure Machine Learning compute clusters (AmlCompute) and Kubernetes clusters. Clusters are a shared resource, so a single cluster can host one or many batch deployments (along with other workloads, if desired).

Create a compute named `batch-cluster`, as shown in the following code. You can adjust as needed and reference your compute using `azureml:<your-compute-name>`.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="create_compute" :::

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=create_compute)]
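The following Python sketch shows roughly what that notebook cell does, assuming the `ml_client` handle created earlier; the cluster name and VM size are illustrative choices, not the notebook's verified values:

```python
from azure.ai.ml.entities import AmlCompute

compute_name = "batch-cluster"
if compute_name not in [c.name for c in ml_client.compute.list()]:
    compute_cluster = AmlCompute(
        name=compute_name,
        description="Compute cluster for batch deployments",
        min_instances=0,  # scales to zero nodes, so idle time isn't billed
        max_instances=5,
        size="STANDARD_DS3_v2",  # assumption: pick any size available in your region
    )
    ml_client.compute.begin_create_or_update(compute_cluster).result()
```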

Follow the steps in the tutorial Create an Azure Machine Learning compute cluster to create a compute cluster.


> [!NOTE]
> You're not charged for the compute at this point, as the cluster remains at 0 nodes until a batch endpoint is invoked and a batch scoring job is submitted. For more information about compute costs, see Manage and optimize cost for AmlCompute.

## Create a batch endpoint

A batch endpoint is an HTTPS endpoint that clients can call to trigger a batch scoring job. A batch scoring job is a job that scores multiple inputs. A batch deployment is a set of compute resources hosting the model that does the actual batch scoring (or batch inferencing). One batch endpoint can have multiple batch deployments. For more information on batch endpoints, see What are batch endpoints?.

> [!TIP]
> One of the batch deployments serves as the default deployment for the endpoint. When the endpoint is invoked, the default deployment does the actual batch scoring. For more information on batch endpoints and deployments, see batch endpoints and batch deployment.

  1. Name the endpoint. The endpoint's name must be unique within an Azure region, since the name is included in the endpoint's URI. For example, there can be only one batch endpoint with the name `mybatchendpoint` in `westus2`.

    Place the endpoint's name in a variable so you can easily reference it later.

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="name_endpoint" :::


    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=name_endpoint)]

    You provide the endpoint's name later, at the point when you create the deployment.

  2. Configure the batch endpoint

    The following YAML file defines a batch endpoint. You can use this file with the CLI command for batch endpoint creation.

    endpoint.yml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/endpoint.yml":::

    The following table describes the key properties of the endpoint. For the full batch endpoint YAML schema, see CLI (v2) batch endpoint YAML schema.

    | Key | Description |
    | --- | --- |
    | `name` | The name of the batch endpoint. Needs to be unique at the Azure region level. |
    | `description` | The description of the batch endpoint. This property is optional. |
    | `tags` | The tags to include in the endpoint. This property is optional. |

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=configure_endpoint)]

    The following table describes the key properties of the endpoint. For more information on batch endpoint definition, see BatchEndpoint Class.

    | Key | Description |
    | --- | --- |
    | `name` | The name of the batch endpoint. Needs to be unique at the Azure region level. |
    | `description` | The description of the batch endpoint. This property is optional. |
    | `tags` | The tags to include in the endpoint. This property is optional. |
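    As a sketch, the referenced notebook cell builds a `BatchEndpoint` object along these lines; the description and tag values are illustrative, and `endpoint_name` comes from the earlier naming step:

    ```python
    # Defining the object doesn't create the endpoint yet;
    # that happens in the next step with begin_create_or_update.
    endpoint = BatchEndpoint(
        name=endpoint_name,
        description="A batch endpoint for scoring images from the MNIST dataset.",
        tags={"type": "deep-learning"},
    )
    ```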

    You create the endpoint later, at the point when you create the deployment.

  3. Create the endpoint:

    Run the following code to create a batch endpoint.

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="create_endpoint" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=create_endpoint)]

    You create the endpoint later, at the point when you create the deployment.

## Create a batch deployment

A model deployment is a set of resources required for hosting the model that does the actual inferencing. To create a batch model deployment, you need the following items:

- A registered model in the workspace
- The code to score the model
- An environment with the model's dependencies installed
- The pre-created compute and resource settings
  1. Begin by registering the model to be deployed—a Torch model for the popular digit recognition problem (MNIST). Batch deployments can only deploy models that are registered in the workspace. You can skip this step if the model you want to deploy is already registered.

    > [!TIP]
    > Models are associated with the deployment, rather than with the endpoint. This means that a single endpoint can serve different models (or model versions), provided that they're deployed in different deployments.

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="register_model" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=register_model)]

    1. Navigate to the Models tab on the side menu.

    2. Select Register > From local files.

    3. In the wizard, leave the option Model type as Unspecified type.

    4. Select Browse > Browse folder > Select the folder deployment-torch/model > Next.

    5. Configure the name of the model: mnist-classifier-torch. You can leave the rest of the fields as they are.

    6. Select Register.

  2. Now it's time to create a scoring script. Batch deployments require a scoring script that indicates how a given model should be executed and how input data must be processed. Batch endpoints support scripts created in Python. In this case, you deploy a model that reads image files representing digits and outputs the corresponding digit. The scoring script is as follows:

    > [!NOTE]
    > For MLflow models, Azure Machine Learning automatically generates the scoring script, so you're not required to provide one. If your model is an MLflow model, you can skip this step. For more information about how batch endpoints work with MLflow models, see the article Using MLflow models in batch deployments.

    > [!WARNING]
    > If you're deploying an Automated machine learning (AutoML) model under a batch endpoint, note that the scoring script that AutoML provides only works for online endpoints and is not designed for batch execution. For information on how to create a scoring script for your batch deployment, see Author scoring scripts for batch deployments.

    deployment-torch/code/batch_driver.py

    :::code language="python" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-torch/code/batch_driver.py" :::

  3. Create an environment where your batch deployment will run. The environment should include the packages azureml-core and azureml-dataset-runtime[fuse], which are required by batch endpoints, plus any dependency your code requires for running. In this case, the dependencies have been captured in a conda.yaml file:

    deployment-torch/environment/conda.yaml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-torch/environment/conda.yaml":::

    > [!IMPORTANT]
    > The packages `azureml-core` and `azureml-dataset-runtime[fuse]` are required by batch deployments and should be included in the environment dependencies.

    Specify the environment as follows:

    The environment definition is included in the deployment definition itself as an anonymous environment. You'll see it in the following lines of the deployment:

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-torch/deployment.yml" range="12-15":::

    Get a reference to the environment:

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=configure_environment)]
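    A sketch of what getting that environment reference likely looks like, assuming the conda file path from this repository and the base image mentioned in the studio steps below (the environment name is illustrative):

    ```python
    # An anonymous environment built from a base image plus a conda file.
    env = Environment(
        name="batch-torch-env",  # illustrative name
        conda_file="deployment-torch/environment/conda.yaml",
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    )
    ```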

    In the Azure Machine Learning studio, follow these steps:

    1. Navigate to the Environments tab on the side menu.

    2. Select the tab Custom environments > Create.

    3. Enter the name of the environment, in this case torch-batch-env.

    4. For Select environment source, select Use existing docker image with optional conda file.

    5. For Container registry image path, enter mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04.

    6. Select Next to go to the "Customize" section.

    7. Copy the content of the file deployment-torch/environment/conda.yaml from the GitHub repo into the portal.

    8. Select Next until you get to the "Review page".

    9. Select Create and wait until the environment is ready for use.


    > [!WARNING]
    > Curated environments are not supported in batch deployments. You need to specify your own environment. You can always use the base image of a curated environment as yours to simplify the process.

  4. Create a deployment definition

    deployment-torch/deployment.yml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-torch/deployment.yml":::

    The following table describes the key properties of the batch deployment. For the full batch deployment YAML schema, see CLI (v2) batch deployment YAML schema.

    | Key | Description |
    | --- | --- |
    | `name` | The name of the deployment. |
    | `endpoint_name` | The name of the endpoint to create the deployment under. |
    | `model` | The model to be used for batch scoring. The example defines a model inline using `path`. This definition allows model files to be automatically uploaded and registered with an autogenerated name and version. See the Model schema for more options. As a best practice for production scenarios, you should create the model separately and reference it here. To reference an existing model, use the `azureml:<model-name>:<model-version>` syntax. |
    | `code_configuration.code` | The local directory that contains all the Python source code to score the model. |
    | `code_configuration.scoring_script` | The Python file in the `code_configuration.code` directory. This file must have an `init()` function and a `run()` function. Use the `init()` function for any costly or common preparation (for example, to load the model in memory). `init()` will be called only once at the start of the process. Use `run(mini_batch)` to score each entry; the value of `mini_batch` is a list of file paths. The `run()` function should return a pandas DataFrame or an array. Each returned element indicates one successful run of an input element in the `mini_batch`. For more information on how to author a scoring script, see Understanding the scoring script. |
    | `environment` | The environment to score the model. The example defines an environment inline using `conda_file` and `image`. The `conda_file` dependencies will be installed on top of the `image`. The environment will be automatically registered with an autogenerated name and version. See the Environment schema for more options. As a best practice for production scenarios, you should create the environment separately and reference it here. To reference an existing environment, use the `azureml:<environment-name>:<environment-version>` syntax. |
    | `compute` | The compute to run batch scoring. The example uses the `batch-cluster` created at the beginning and references it using the `azureml:<compute-name>` syntax. |
    | `resources.instance_count` | The number of instances to be used for each batch scoring job. |
    | `settings.max_concurrency_per_instance` | [Optional] The maximum number of parallel `scoring_script` runs per instance. |
    | `settings.mini_batch_size` | [Optional] The number of files the `scoring_script` can process in one `run()` call. |
    | `settings.output_action` | [Optional] How the output should be organized in the output file. `append_row` will merge all `run()` returned output results into one single file named `output_file_name`. `summary_only` won't merge the output results and will only calculate `error_threshold`. |
    | `settings.output_file_name` | [Optional] The name of the batch scoring output file for the `append_row` `output_action`. |
    | `settings.retry_settings.max_retries` | [Optional] The maximum number of retries for a failed `scoring_script` `run()`. |
    | `settings.retry_settings.timeout` | [Optional] The timeout in seconds for a `scoring_script` `run()` call to score a mini batch. |
    | `settings.error_threshold` | [Optional] The number of input file scoring failures that should be ignored. If the error count for the entire input goes above this value, the batch scoring job will be terminated. The example uses `-1`, which indicates that any number of failures is allowed without terminating the batch scoring job. |
    | `settings.logging_level` | [Optional] Log verbosity. Values in increasing verbosity are: WARNING, INFO, and DEBUG. |
    | `settings.environment_variables` | [Optional] Dictionary of environment variable name-value pairs to set for each batch scoring job. |

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=configure_deployment)]

    The BatchDeployment Class allows you to configure the following key properties of a batch deployment:

    | Key | Description |
    | --- | --- |
    | `name` | Name of the deployment. |
    | `endpoint_name` | Name of the endpoint to create the deployment under. |
    | `model` | The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. |
    | `environment` | The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification (optional for MLflow models). |
    | `code_configuration` | The configuration for how to run inference for the model (optional for MLflow models). |
    | `code_configuration.code` | Path to the source code directory for scoring the model. |
    | `code_configuration.scoring_script` | Relative path to the scoring file in the source code directory. |
    | `compute` | Name of the compute target on which to execute the batch scoring jobs. |
    | `instance_count` | The number of nodes to use for each batch scoring job. |
    | `settings` | The model deployment inference configuration. |
    | `settings.max_concurrency_per_instance` | The maximum number of parallel `scoring_script` runs per instance. |
    | `settings.mini_batch_size` | The number of files the `code_configuration.scoring_script` can process in one `run()` call. |
    | `settings.retry_settings` | Retry settings for scoring each mini batch. |
    | `settings.retry_settings.max_retries` | The maximum number of retries for a failed or timed-out mini batch (default is 3). |
    | `settings.retry_settings.timeout` | The timeout in seconds for scoring a mini batch (default is 30). |
    | `settings.output_action` | How the output should be organized in the output file. Allowed values are `append_row` or `summary_only`. Default is `append_row`. |
    | `settings.logging_level` | The log verbosity level. Allowed values are `warning`, `info`, and `debug`. Default is `info`. |
    | `settings.environment_variables` | Dictionary of environment variable name-value pairs to set for each batch scoring job. |
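    Putting those properties together, a hedged sketch of the deployment configuration might look like the following; the names and settings values are illustrative, and `model` and `env` come from the earlier steps:

    ```python
    from azure.ai.ml.entities import ResourceConfiguration  # assumption: available in azure-ai-ml

    deployment = ModelBatchDeployment(
        name="mnist-torch-dpl",  # illustrative deployment name
        description="A deployment that uses a Torch model to score the MNIST dataset.",
        endpoint_name=endpoint_name,
        model=model,
        environment=env,
        code_configuration=CodeConfiguration(
            code="deployment-torch/code", scoring_script="batch_driver.py"
        ),
        compute="batch-cluster",
        resources=ResourceConfiguration(instance_count=2),
        settings=ModelBatchDeploymentSettings(
            max_concurrency_per_instance=2,
            mini_batch_size=10,
            output_action=BatchDeploymentOutputAction.APPEND_ROW,
            output_file_name="predictions.csv",
            retry_settings=BatchRetrySettings(max_retries=3, timeout=30),
            logging_level="info",
        ),
    )
    ```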

    In the studio, follow these steps:

    1. Navigate to the Endpoints tab on the side menu.

    2. Select the tab Batch endpoints > Create.

    3. Give the endpoint a name, in this case mnist-batch. You can configure the rest of the fields or leave them blank.

    4. Select Next to go to the "Model" section.

    5. Select the model mnist-classifier-torch.

    6. Select Next to go to the "Deployment" page.

    7. Give the deployment a name.

    8. For Output action, ensure Append row is selected.

    9. For Output file name, ensure the batch scoring output file is the one you need. Default is predictions.csv.

    10. For Mini batch size, adjust the number of files that will be included in each mini-batch. This value controls the amount of data your scoring script receives per batch.

    11. For Scoring timeout (seconds), ensure you're giving enough time for your deployment to score a given batch of files. If you increase the number of files, you usually have to increase the timeout value too. More expensive models (like those based on deep learning) may require high values in this field.

    12. For Max concurrency per instance, configure the number of executors you want to have for each compute instance you get in the deployment. A higher number here guarantees a higher degree of parallelization, but it also increases the memory pressure on the compute instance. Tune this value together with Mini batch size.

    13. Once done, select Next to go to the "Code + environment" page.

    14. For "Select a scoring script for inferencing", browse to find and select the scoring script file deployment-torch/code/batch_driver.py.

    15. In the "Select environment" section, select the environment you created previously torch-batch-env.

    16. Select Next to go to the "Compute" page.

    17. Select the compute cluster you created in a previous step.

      > [!WARNING]
      > Azure Kubernetes clusters are supported in batch deployments, but only when created using the Azure Machine Learning CLI or Python SDK.

    18. For Instance count, enter the number of compute instances you want for the deployment. In this case, use 2.

    19. Select Next.

  5. Create the deployment:

    Run the following code to create a batch deployment under the batch endpoint, and set it as the default deployment.

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="create_deployment" :::

    > [!TIP]
    > The `--set-default` parameter sets the newly created deployment as the default deployment of the endpoint. It's a convenient way to create a new default deployment of the endpoint, especially for the first deployment creation. As a best practice for production scenarios, you might want to create a new deployment without setting it as default. Verify that the deployment works as you expect, and then update the default deployment later. For more information on implementing this process, see the Deploy a new model section.

    Using the MLClient created earlier, create the deployment in the workspace. This command starts the deployment creation and returns a confirmation response while the deployment creation continues.

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=create_deployment)]

    Once the deployment is completed, set the new deployment as the default deployment in the endpoint:

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=set_default_deployment)]
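    In outline, those two cells do something like the following sketch, reusing the `endpoint` and `deployment` objects from the earlier steps:

    ```python
    # Create the deployment (a long-running operation).
    ml_client.batch_deployments.begin_create_or_update(deployment).result()

    # Then point the endpoint's default at the new deployment.
    endpoint = ml_client.batch_endpoints.get(endpoint.name)
    endpoint.defaults.deployment_name = deployment.name
    ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
    ```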

    In the wizard, select Create to start the deployment process.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/review-batch-wizard.png" alt-text="Screenshot of batch endpoints deployment review screen." lightbox="media/how-to-use-batch-model-deployments/review-batch-wizard.png":::


  6. Check batch endpoint and deployment details.

    Use `show` to check the endpoint and deployment details. To check a batch deployment, run the following code:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="query_deployment" :::

    To check a batch deployment, run the following code:

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=query_deployment)]

    After creating the batch endpoint, the endpoint's details page opens up. You can also find this page by following these steps:

    1. Navigate to the Endpoints tab on the side menu.

    2. Select the tab Batch endpoints.

    3. Select the batch endpoint you want to view.

    4. The endpoint's Details page shows the details of the endpoint along with all the deployments available in the endpoint.

      :::image type="content" source="./media/how-to-use-batch-model-deployments/batch-endpoint-details.png" alt-text="Screenshot of the check batch endpoints and deployment details.":::

## Run batch endpoints and access results

Invoking a batch endpoint triggers a batch scoring job. The job name is returned from the invoke response and can be used to track the batch scoring progress. When running models for scoring in batch endpoints, you need to specify the path to the input data so that the endpoints can find the data you want to score. The following example shows how to start a new job over sample data from the MNIST dataset that's stored in an Azure Storage account.

You can run and invoke a batch endpoint using Azure CLI, Azure Machine Learning SDK, or REST endpoints. For more details about these options, see Create jobs and input data for batch endpoints.

> [!NOTE]
> **How does parallelization work?**
>
> Batch deployments distribute work at the file level, which means that a folder containing 100 files with mini-batches of 10 files generates 10 batches of 10 files each. Notice that this happens regardless of the size of the files involved. If your files are too big to be processed in large mini-batches, we suggest that you either split the files into smaller files to achieve a higher level of parallelism or decrease the number of files per mini-batch. Currently, batch deployments can't account for skews in a file's size distribution.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="start_batch_scoring_job" :::

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=start_batch_scoring_job)]
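As a sketch, the SDK call typically passes the input folder as a `uri_folder` input; the public MNIST sample path below is the same one used in the studio steps that follow:

```python
# Start a batch scoring job against the endpoint's default deployment.
input = Input(
    type=AssetTypes.URI_FOLDER,
    path="https://azuremlexampledata.blob.core.windows.net/data/mnist/sample",
)
job = ml_client.batch_endpoints.invoke(endpoint_name=endpoint.name, input=input)
print(job.name)  # use the job name to track batch scoring progress
```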

  1. Navigate to the Endpoints tab on the side menu.

  2. Select the tab Batch endpoints.

  3. Select the batch endpoint you just created.

  4. Select Create job.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/create-batch-job.png" alt-text="Screenshot of the create job option to start batch scoring." lightbox="media/how-to-use-batch-model-deployments/create-batch-job.png":::

  5. For Deployment, select the deployment to execute.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/job-setting-batch-scoring.png" alt-text="Screenshot of using the deployment to submit a batch job." lightbox="media/how-to-use-batch-model-deployments/job-setting-batch-scoring.png":::

  6. Select Next to go to the "Select data source" page.

  7. For the "Data source type", select Datastore.

  8. For the "Datastore", select workspaceblobstore from the dropdown menu.

  9. For "Path", enter the full URL https://azuremlexampledata.blob.core.windows.net/data/mnist/sample.

    > [!TIP]
    > This path works only because the given path has public access enabled. In general, you need to register the data source as a Datastore. See Accessing data from batch endpoints jobs for details.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/select-datastore-job.png" alt-text="Screenshot of selecting datastore as an input option." lightbox="media/how-to-use-batch-model-deployments/select-datastore-job.png":::

  10. Select Next.

  11. Select Create to start the job.


Batch endpoints support reading files or folders that are located in different locations. To learn more about the supported types and how to specify them, see Accessing data from batch endpoints jobs.

### Monitor batch job execution progress

Batch scoring jobs usually take some time to process the entire set of inputs.

The following code checks the job status and outputs a link to the Azure Machine Learning studio for further details.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="show_job_in_studio" :::

The following code checks the job status and outputs a link to the Azure Machine Learning studio for further details.

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=get_job)]
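A sketch of polling the job from the SDK, where `job` is the object returned by `invoke`:

```python
import time

# Poll until the batch scoring job reaches a terminal state.
while True:
    job = ml_client.jobs.get(job.name)
    print(f"Job status: {job.status}")
    if job.status in ("Completed", "Failed", "Canceled"):
        break
    time.sleep(60)

# Alternatively, stream the job logs until it finishes:
# ml_client.jobs.stream(job.name)
```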

  1. Navigate to the Endpoints tab on the side menu.

  2. Select the tab Batch endpoints.

  3. Select the batch endpoint you want to monitor.

  4. Select the Jobs tab.

    :::image type="content" source="media/how-to-use-batch-model-deployments/summary-jobs.png" alt-text="Screenshot of summary of jobs submitted to a batch endpoint." lightbox="media/how-to-use-batch-model-deployments/summary-jobs.png":::

  5. From the displayed list of the jobs created for the selected endpoint, select the last job that is running.

  6. You're now redirected to the job monitoring page.


### Check batch scoring results

The job outputs are stored in cloud storage, either in the workspace's default blob storage, or the storage you specified. To learn how to change the defaults, see Configure the output location. The following steps allow you to view the scoring results in Azure Storage Explorer when the job is completed:

  1. Run the following code to open the batch scoring job in Azure Machine Learning studio. The job studio link is also included in the response of invoke, as the value of interactionEndpoints.Studio.endpoint.

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="show_job_in_studio" :::

  2. In the graph of the job, select the batchscoring step.

  3. Select the Outputs + logs tab and then select Show data outputs.

  4. From Data outputs, select the icon to open Storage Explorer.

    :::image type="content" source="media/how-to-use-batch-endpoint/view-data-outputs.png" alt-text="Studio screenshot showing view data outputs location." lightbox="media/how-to-use-batch-endpoint/view-data-outputs.png":::

    The scoring results in Storage Explorer are similar to the following sample page:

    :::image type="content" source="media/how-to-use-batch-endpoint/scoring-view.png" alt-text="Screenshot of the scoring output." lightbox="media/how-to-use-batch-endpoint/scoring-view.png":::

### Configure the output location

By default, the batch scoring results are stored in the workspace's default blob store, in a folder named after the job name (a system-generated GUID). You can configure where to store the scoring outputs when you invoke the batch endpoint.

Use `--output-path` to configure any folder in an Azure Machine Learning registered datastore. The syntax for `--output-path` is the same as for `--input` when you're specifying a folder, that is, `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/`. Use `--set output_file_name=<your-file-name>` to configure a new output file name.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="start_batch_scoring_job_set_output" :::

Use `params_override` to configure any folder in an Azure Machine Learning registered datastore. Only registered datastores are supported as output paths. In this example, you use the default datastore:

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=get_data_store)]

Once you've identified the datastore you want to use, configure the output as follows:

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=start_batch_scoring_job_set_output)]
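A hedged sketch of how the two cells above usually fit together; the exact `params_override` keys may vary between SDK versions, so treat them as an assumption rather than a verified signature:

```python
# Assumption: route outputs to a folder on the workspace default datastore.
default_ds = ml_client.datastores.get_default()

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    input=input,
    params_override=[
        {"output_dataset.datastore_id": f"azureml:{default_ds.id}"},
        {"output_dataset.path": "/batch-scoring-results"},  # illustrative folder
        {"output_file_name": "predictions.csv"},
    ],
)
```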

  1. Navigate to the Endpoints tab on the side menu.

  2. Select the tab Batch endpoints.

  3. Select the batch endpoint you just created.

  4. Select Create job.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/create-batch-job.png" alt-text="Screenshot of the create job option to start batch scoring." lightbox="media/how-to-use-batch-model-deployments/create-batch-job.png":::

  5. For Deployment, select the deployment you want to execute.

  6. Select the option Override deployment settings.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/overwrite-setting.png" alt-text="Screenshot of the overwrite setting when starting a batch job.":::

  7. You can now configure Output file name and some extra properties of the deployment execution. Only this execution will be affected.

  8. Select Next.

  9. On the "Select data source" page, select the data input you want to use.

  10. Select Next.

  11. On the "Configure output location" page, select the option Enable output configuration.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/configure-output-location.png" alt-text="Screenshot of optionally configuring output location." lightbox="media/how-to-use-batch-model-deployments/configure-output-location.png":::

  12. Configure the Blob datastore where the outputs should be placed.


> [!WARNING]
> You must use a unique output location. If the output file exists, the batch scoring job will fail.

> [!IMPORTANT]
> Unlike inputs, outputs can be stored only in Azure Machine Learning data stores that run on blob storage accounts.

### Overwrite deployment configuration for each job

When you invoke a batch endpoint, some settings can be overwritten to make best use of the compute resources and to improve performance. The following settings can be configured on a per-job basis:

- Instance count: use this setting to overwrite the number of instances to request from the compute cluster. For example, for a larger volume of data inputs, you might want to use more instances to speed up the end-to-end batch scoring.
- Mini-batch size: use this setting to overwrite the number of files to include in each mini-batch. The number of mini-batches is determined by the total number of input files and the mini-batch size. A smaller mini-batch size generates more mini-batches. Mini-batches can run in parallel, but there might be extra scheduling and invocation overhead.
- Other settings, such as max retries, timeout, and error threshold, can be overwritten. These settings might impact the end-to-end batch scoring time for different workloads.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="start_batch_scoring_job_overwrite" :::

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=start_batch_scoring_job_overwrite)]
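From the SDK, per-job overrides are typically passed through `params_override` as well; the keys below mirror the deployment YAML paths and are an assumption, not a verified signature:

```python
# Request a larger mini-batch and more nodes for this job only.
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    input=input,
    params_override=[
        {"mini_batch_size": "20"},        # more files per run() call
        {"compute.instance_count": "4"},  # more nodes for this job only
    ],
)
```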

  1. Navigate to the Endpoints tab on the side menu.

  2. Select the tab Batch endpoints.

  3. Select the batch endpoint you just created.

  4. Select Create job.

  5. For Deployment, select the deployment you want to execute.

  6. Select the option Override deployment settings.

  7. Configure the job parameters. Only the current job execution will be affected by this configuration.

  8. Select Next.

  9. On the "Select data source" page, select the data input you want to use.

  10. Select Next.

  11. On the "Configure output location" page, select the option Enable output configuration.

  12. Configure the Blob datastore where the outputs should be placed.


## Add deployments to an endpoint

Once you have a batch endpoint with a deployment, you can continue to refine your model and add new deployments. Batch endpoints will continue serving the default deployment while you develop and deploy new models under the same endpoint. Deployments don't affect one another.

In this example, you add a second deployment that uses a model built with Keras and TensorFlow to solve the same MNIST problem.

### Add a second deployment

  1. Create an environment where your batch deployment will run. Include in the environment any dependency your code requires for running. You also need to add the library azureml-core, as it's required for batch deployments to work. The following environment definition has the required libraries to run a model with TensorFlow.

    The environment definition is included in the deployment definition itself as an anonymous environment.

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-keras/deployment.yml" range="12-15":::

    Get a reference to the environment:

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=configure_environment_non_default)]

    1. Navigate to the Environments tab on the side menu.

    2. Select the tab Custom environments > Create.

    3. Enter the name of the environment, in this case keras-batch-env.

    4. For Select environment source, select Use existing docker image with optional conda file.

    5. For Container registry image path, enter mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04.

    6. Select Next to go to the "Customize" section.

    7. Copy the content of the file deployment-keras/environment/conda.yaml from the GitHub repo into the portal.

    8. Select Next until you get to the "Review page".

    9. Select Create and wait until the environment is ready for use.


    The conda file used looks as follows:

    deployment-keras/environment/conda.yaml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-keras/environment/conda.yaml":::

  2. Create a scoring script for the model:

    deployment-keras/code/batch_driver.py

    :::code language="python" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-keras/code/batch_driver.py" :::

  3. Create a deployment definition

    deployment-keras/deployment.yml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-keras/deployment.yml":::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=configure_deployment_non_default)]

    1. Navigate to the Endpoints tab on the side menu.

    2. Select the tab Batch endpoints.

    3. Select the existing batch endpoint where you want to add the deployment.

    4. Select Add deployment.

      :::image type="content" source="./media/how-to-use-batch-model-deployments/add-deployment-option.png" alt-text="Screenshot of add new deployment option.":::

    5. Select Next to go to the "Model" page.

    6. From the model list, select the model mnist and select Next.

    7. On the deployment configuration page, give the deployment a name.

    8. Clear the selection for the option Make this new deployment the default for batch jobs.

    9. For Output action, ensure Append row is selected.

    10. For Output file name, ensure the batch scoring output file is the one you need. Default is predictions.csv.

    11. For Mini batch size, adjust the number of files that will be included in each mini-batch. This value controls the amount of data your scoring script receives for each batch.

    12. For Scoring timeout (seconds), ensure you're giving enough time for your deployment to score a given batch of files. If you increase the number of files, you usually have to increase the timeout value too. More expensive models (like those based on deep learning) may require high values in this field.

    13. For Max concurrency per instance, configure the number of executors you want to have for each compute instance you get in the deployment. A higher number here guarantees a higher degree of parallelization, but it also increases the memory pressure on the compute instance. Tune this value together with Mini batch size.

    14. Select Next to go to the "Code + environment" page.

    15. For Select a scoring script for inferencing, browse to select the scoring script file deployment-keras/code/batch_driver.py.

    16. For Select environment, select the environment you created in a previous step.

    17. Select Next.

    18. On the Compute page, select the compute cluster you created in a previous step.

    19. For Instance count, enter the number of compute instances you want for the deployment. In this case, use 2.

    20. Select Next.

  4. Create the deployment:

    Run the following code to create a batch deployment under the batch endpoint. This time, don't set it as the default deployment.

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="create_deployment_non_default" :::

    > [!TIP]
    > The `--set-default` parameter is missing in this case. As a best practice for production scenarios, create a new deployment without setting it as default. Then verify it, and update the default deployment later.

    Using the MLClient created earlier, create the deployment in the workspace. This command starts the deployment creation and returns a confirmation response while the deployment creation continues.

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=create_deployment_non_default)]

    In the wizard, select Create to start the deployment process.

### Test a non-default batch deployment

To test the new non-default deployment, you need to know the name of the deployment you want to run.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="test_deployment_non_default" :::

Notice that `--deployment-name` is used to specify the deployment to execute. This parameter allows you to invoke a non-default deployment without updating the default deployment of the batch endpoint.

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=test_deployment_non_default)]

Notice that `deployment_name` is used to specify the deployment to execute. This parameter allows you to invoke a non-default deployment without updating the default deployment of the batch endpoint.
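A sketch of that call, assuming the second deployment is named `mnist-keras-dpl`:

```python
# Invoke a specific (non-default) deployment by name.
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    deployment_name="mnist-keras-dpl",  # hypothetical deployment name
    input=input,
)
```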

  1. Navigate to the Endpoints tab on the side menu.

  2. Select the tab Batch endpoints.

  3. Select the batch endpoint you just created.

  4. Select Create job.

  5. For Deployment, select the deployment you want to execute. In this case, mnist-keras.

  6. Complete the job creation wizard to get the job started.


### Update the default batch deployment

Although you can invoke a specific deployment inside an endpoint, you'll typically want to invoke the endpoint itself and let the endpoint decide which deployment to use—the default deployment. You can change the default deployment (and consequently, change the model being served) without changing your contract with the user invoking the endpoint. Use the following code to update the default deployment:

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="update_default_deployment" :::

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=update_default_deployment)]
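A sketch of the SDK version of this update, again assuming the deployment name `mnist-keras-dpl`:

```python
# Make the Keras deployment the endpoint's new default.
endpoint = ml_client.batch_endpoints.get(endpoint.name)
endpoint.defaults.deployment_name = "mnist-keras-dpl"  # hypothetical name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
```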

  1. Navigate to the Endpoints tab on the side menu.

  2. Select the tab Batch endpoints.

  3. Select the batch endpoint you want to configure.

  4. Select Update default deployment.

    :::image type="content" source="./media/how-to-use-batch-model-deployments/update-default-deployment.png" alt-text="Screenshot of updating default deployment.":::

  5. For Select default deployment, select the name of the deployment you want to set as the default.

  6. Select Update.

  7. The selected deployment is now the default one.


## Delete the batch endpoint and the deployment

If you won't be using the old batch deployment, delete it by running the following code. `--yes` is used to confirm the deletion.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="delete_deployment" :::

Run the following code to delete the batch endpoint and all its underlying deployments. Batch scoring jobs won't be deleted.

:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deploy-and-run.sh" ID="delete_endpoint" :::

If you won't be using the old batch deployment, delete it by running the following code.

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=delete_deployment)]

Run the following code to delete the batch endpoint and all its underlying deployments. Batch scoring jobs won't be deleted.

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/mnist-classifier/mnist-batch.ipynb?name=delete_endpoint)]
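Sketches of the corresponding SDK calls; the deployment name is illustrative:

```python
# Delete a single deployment you no longer need.
ml_client.batch_deployments.begin_delete(
    name="mnist-torch-dpl",  # hypothetical deployment name
    endpoint_name=endpoint.name,
).result()

# Delete the endpoint and all its deployments; scoring jobs are kept.
ml_client.batch_endpoints.begin_delete(name=endpoint.name).result()
```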

  1. Navigate to the Endpoints tab on the side menu.

  2. Select the tab Batch endpoints.

  3. Select the batch endpoint you want to delete.

  4. Select Delete.

  5. The endpoint, along with all its deployments, will be deleted.

  6. Notice that this won't affect the compute cluster where the deployment(s) run.


## Related content