# Deploy a scikit-learn model as an online endpoint

Next, deploy your machine learning model as a web service in the Azure cloud. This web service is called an online endpoint.

To deploy a machine learning service, you usually need:

- The model assets (file, metadata) that you want to deploy. You've already registered these assets in your training job.
- Some code to run as a service. The code executes the model on a given input request. This entry script receives data submitted to the deployed web service, passes it to the model, and returns the model's response to the client. The script is specific to your model: it must understand the data that the model expects and returns. With an MLflow model, as in this tutorial, this script is automatically created for you. Samples of scoring scripts can be found here.
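For a model that is not in MLflow format, you write the entry script yourself. The following is a minimal sketch of the usual shape of such a script, with an `init()` function that loads the model once and a `run()` function that handles each request. The `_DummyModel` class and its labelling rule are hypothetical stand-ins so the example runs on its own; a real script would load the registered model file instead, and the request layout shown here is an assumption borrowed from the sample request used later in this tutorial.

```python
import json


# Hypothetical stand-in for a trained classifier. A real entry script would
# load the registered model file here instead of defining a class.
class _DummyModel:
    def predict(self, rows):
        # Trivial rule on the third feature, for illustration only.
        return ["Iris-setosa" if row[2] < 2.5 else "Iris-other" for row in rows]


model = None


def init():
    """Runs once when the service starts: load the model into memory."""
    global model
    model = _DummyModel()


def run(raw_data):
    """Runs for every request: parse the JSON body, score it, return labels."""
    rows = json.loads(raw_data)["input_data"]["data"]
    return list(model.predict(rows))


# Local smoke test of the two functions.
init()
print(run('{"input_data": {"data": [[5, 3.3, 1.4, 0.2], [6, 3, 4.8, 1.8]]}}'))
```

Keeping model loading in `init()` means the cost is paid once per instance rather than on every request.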



# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the model will be deployed.

## 1.1. Import the required libraries

In [8]:
# import required libraries
from azure.ai.ml import MLClient
from azure.ai.ml import command, Input
from azure.identity import DefaultAzureCredential

## 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need three identifier parameters: a subscription ID, a resource group, and a workspace name. We will pass these details to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../../configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

In [9]:
# Enter details of your AML workspace
subscription_id = '6baab092-6683-4cc2-b4dd-7df2bb56af59'
resource_group = 'rg-treshenv-ws-a127'
workspace_name = 'ml-treshenv-ws-a127-svc-513c'

In [10]:
# get a handle to the workspace
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace_name
)
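Hardcoding workspace identifiers in a notebook makes it harder to share. As one possible alternative, the sketch below reads the three identifiers from environment variables. The variable names (`AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, `AZURE_ML_WORKSPACE`) and the helper `workspace_settings` are assumptions made for this example, not names the SDK requires.

```python
import os


def workspace_settings(env=os.environ):
    """Collect the workspace identifiers from environment variables.

    The environment variable names are hypothetical; pick any you like.
    """
    names = {
        "subscription_id": "AZURE_SUBSCRIPTION_ID",
        "resource_group_name": "AZURE_RESOURCE_GROUP",
        "workspace_name": "AZURE_ML_WORKSPACE",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in names.items()}


# Example with an explicit mapping instead of the real environment.
settings = workspace_settings(
    {
        "AZURE_SUBSCRIPTION_ID": "<subscription-id>",
        "AZURE_RESOURCE_GROUP": "<resource-group>",
        "AZURE_ML_WORKSPACE": "<workspace-name>",
    }
)
print(settings)
```

The returned keys match the keyword arguments of `MLClient`, so the result can be passed as `MLClient(DefaultAzureCredential(), **settings)`.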

# Create a new online endpoint

Online endpoints are endpoints that are used for online (real-time) inferencing. Online endpoints contain deployments that are ready to receive data from clients and can send responses back in real time.

To create an online endpoint we will use the `ManagedOnlineEndpoint` class. This class allows you to configure the following key aspects:

- `name` - Name of the endpoint. It must be unique at the Azure region level.
- `auth_mode` - The authentication method for the endpoint. Key-based authentication and Azure ML token-based authentication are supported. Key-based authentication doesn't expire, but Azure ML token-based authentication does. Possible values are `key` or `aml_token`.
- `identity` - The managed identity configuration for accessing Azure resources for endpoint provisioning and inference.
    - `type` - The type of managed identity. Azure Machine Learning supports `system_assigned` or `user_assigned` identity.
    - `user_assigned_identities` - List (array) of fully qualified resource IDs of the user-assigned identities. This property is required if `identity.type` is `user_assigned`.
- `description` - Description of the endpoint.

In [11]:
import uuid

# Creating a unique name for the endpoint
online_endpoint_name = "iris-svc" + str(uuid.uuid4())[:8]

In [12]:
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
)

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="this is an online endpoint",
    auth_mode="key",
    tags={
        "training_dataset": "iris-data",
        "model_type": "sklearn.SVC",
    },
)

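# begin_create_or_update starts a long-running operation;
# .result() waits for it to finish and returns the created endpoint
endpoint = ml_client.begin_create_or_update(endpoint).result()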

print(f"Endpoint {endpoint.name} provisioning state: {endpoint.provisioning_state}")

Endpoint iris-svc625ae8fd provisioning state: Succeeded


In [13]:
# Once you've created an endpoint, you can retrieve it as below:

endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)

print(
    f'Endpoint "{endpoint.name}" with provisioning state "{endpoint.provisioning_state}" is retrieved'
)

Endpoint "iris-svc625ae8fd" with provisioning state "Succeeded" is retrieved


# Deploy the model to the endpoint
Once the endpoint is created, deploy the model with the entry script. Each endpoint can have multiple deployments, and you control how incoming traffic is split between them with traffic rules. Here you'll create a single deployment that handles 100% of the incoming traffic. Deployment names are arbitrary; we use color names such as blue, green, and red.
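Traffic rules are expressed as a mapping from deployment name to a percentage, as with the `{"blue": 100}` mapping used later in this tutorial. The sketch below checks such a mapping before it is applied. The helper `validate_traffic` is hypothetical, and the rule that percentages must total either 100 or 0 is an assumption about the service's behavior that you should confirm against the documentation.

```python
def validate_traffic(traffic, deployments):
    """Check a {deployment_name: percent} mapping before applying it.

    Assumes percentages are non-negative integers that total 100,
    or 0 when no deployment should receive traffic.
    """
    unknown = sorted(set(traffic) - set(deployments))
    if unknown:
        raise ValueError(f"Unknown deployments: {unknown}")
    if any(percent < 0 for percent in traffic.values()):
        raise ValueError("Traffic percentages must be non-negative")
    total = sum(traffic.values())
    if total not in (0, 100):
        raise ValueError(f"Traffic must total 0 or 100, got {total}")
    return traffic


# A single deployment taking all traffic, as in this tutorial.
print(validate_traffic({"blue": 100}, deployments=["blue"]))

# A gradual rollout that sends 10% of requests to a second deployment.
print(validate_traffic({"blue": 90, "green": 10}, deployments=["blue", "green"]))
```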

You can check the Models page in Azure ML studio to identify the latest version of your registered model. Alternatively, the code below retrieves the latest version number for you.

In [14]:
# Let's pick the latest version of the model
latest_model_version = max(
    [int(m.version) for m in ml_client.models.list(name="iris_svc_model")]
)
print(f'Current model version "{latest_model_version}"')

Current model version "2"
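The `int(m.version)` conversion in the cell above matters because version numbers are returned as text. A small illustration with made-up version strings shows how comparing them as text picks the wrong one once versions reach double digits:

```python
# Made-up version strings, as they might come back from a model listing.
versions = ["1", "2", "9", "10"]

# Text comparison goes character by character, so "9" sorts after "10".
latest_as_text = max(versions)

# Converting to integers first gives the numerically largest version.
latest_as_number = max(int(v) for v in versions)

print(latest_as_text, latest_as_number)  # prints: 9 10
```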


# Create a blue deployment
A deployment is a set of resources required for hosting the model that does the actual inferencing. We will create a deployment for our endpoint using the `ManagedOnlineDeployment` class. This class allows you to configure the following key aspects:

- `name` - Name of the deployment.
- `endpoint_name` - Name of the endpoint to create the deployment under.
- `model` - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.
- `instance_type` - The VM size to use for the deployment. For the list of supported sizes, see Managed online endpoints SKU list.
- `instance_count` - The number of instances to use for the deployment.

In [15]:
# Deploy the latest version of the model.


# picking the model to deploy. Here we use the latest version of our registered model
model = ml_client.models.get(name="iris_svc_model", version=latest_model_version)


# create an online deployment.
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    model=model,
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

# .result() waits for the long-running deployment operation to finish
blue_deployment = ml_client.begin_create_or_update(blue_deployment).result()

Check: endpoint iris-svc625ae8fd exists
Creating/updating online deployment blue Done (6m 50s)



In [18]:
# the blue deployment takes 100% of the traffic
endpoint.traffic = {"blue": 100}
endpoint = ml_client.begin_create_or_update(endpoint).result()
print(endpoint)

ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://iris-svc625ae8fd.canadacentral.inference.ml.azure.com/score', 'swagger_uri': 'https://iris-svc625ae8fd.canadacentral.inference.ml.azure.com/swagger.json', 'name': 'iris-svc625ae8fd', 'description': 'this is an online endpoint', 'tags': {'training_dataset': 'iris-data', 'model_type': 'sklearn.SVC'}, 'properties': {'azureml.onlineendpointid': '/subscriptions/6baab092-6683-4cc2-b4dd-7df2bb56af59/resourcegroups/rg-treshenv-ws-a127/providers/microsoft.machinelearningservices/workspaces/ml-treshenv-ws-a127-svc-513c/onlineendpoints/iris-svc625ae8fd', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/6baab092-6683-4cc2-b4dd-7df2bb56af59/providers/Microsoft.MachineLearningServices/locations/canadacentral/mfeOperationsStatus/oe:e9aec20e-4d27-4063-89ba-a6f45406c563:45e5519f-7c02-4c41-841d-9501d3fbf923?api-version=2022-02-01-preview'}, 'id': '/subscriptions/6baa

# Test the deployment
Using the `MLClient` created earlier, we will get a handle to the endpoint. The endpoint can be invoked using the `invoke` method with the following parameters:

- `endpoint_name` - Name of the endpoint
- `request_file` - File with request data
- `deployment_name` - Name of the specific deployment to test in an endpoint

We will send a sample request using a sample-request.json file.

In [21]:
%%writefile sample-request.json
{
  "input_data": {
    "columns": [0,1,2,3],
    "index": [0, 1,2],
    "data": [
            [5,3.3,1.4,0.2],
            [6.1,2.9,4.7,1.4],
            [6,3,4.8,1.8]
        ]
  }
}

Overwriting sample-request.json
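The request file above has three parts under `input_data`: column labels, row labels, and the rows themselves. If your rows come from code rather than a hand-written file, a small helper can build the same structure. The function `build_request` below is a hypothetical sketch that defaults to positional column and row labels, matching the sample file.

```python
import json


def build_request(rows, columns=None):
    """Build an 'input_data' payload with columns, index, and data keys."""
    if not rows:
        raise ValueError("At least one row is required")
    columns = list(range(len(rows[0]))) if columns is None else list(columns)
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("Every row must have one value per column")
    return {
        "input_data": {
            "columns": columns,
            "index": list(range(len(rows))),
            "data": [list(row) for row in rows],
        }
    }


payload = build_request(
    [[5, 3.3, 1.4, 0.2], [6.1, 2.9, 4.7, 1.4], [6, 3, 4.8, 1.8]]
)
print(json.dumps(payload, indent=2))
```

Writing the printed text to a file gives a request file equivalent to sample-request.json.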


In [22]:
# test the blue deployment with some sample data
ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    request_file="./sample-request.json",
    deployment_name="blue",
)

'["Iris-setosa", "Iris-versicolor", "Iris-virginica"]'
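The value returned by `invoke` is a JSON document held in a string, not a Python list. A short sketch of turning it into a list and pairing each prediction with its input row, using the response text shown above:

```python
import json

# The response text from the cell above, copied verbatim.
raw_response = '["Iris-setosa", "Iris-versicolor", "Iris-virginica"]'

# Decode the JSON text into a Python list of labels.
predictions = json.loads(raw_response)

# Predictions come back in the same order as the rows in the request.
for row_index, label in enumerate(predictions):
    print(f"row {row_index}: {label}")
```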

# Get endpoint details

In [23]:
# Get the details for online endpoint
endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)

# existing traffic details
print(endpoint.traffic)

# Get the scoring URI
print(endpoint.scoring_uri)

{'blue': 100}
https://iris-svc625ae8fd.canadacentral.inference.ml.azure.com/score
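Clients that do not use the SDK can call the scoring URI directly over HTTPS. The sketch below only builds the request, so it runs without network access. The URI and key are placeholders, `build_scoring_request` is a hypothetical helper, and the sketch assumes key-based authentication (`auth_mode="key"`) with the key sent as a bearer token. In practice the key would likely come from `ml_client.online_endpoints.get_keys(name=online_endpoint_name).primary_key`.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, payload, deployment_name=None):
    """Build (but do not send) a POST request for an online endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        # Optional header that routes the request to one specific deployment
        # instead of following the endpoint's traffic rules.
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


request = build_scoring_request(
    "https://example.invalid/score",  # placeholder for endpoint.scoring_uri
    "PLACEHOLDER_KEY",                # placeholder for the endpoint key
    {"input_data": {"columns": [0, 1, 2, 3], "index": [0], "data": [[5, 3.3, 1.4, 0.2]]}},
    deployment_name="blue",
)
print(request.get_method(), request.full_url)

# To send it against a real endpoint:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```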


# Next Steps
You can see further examples of running a job [here](../../../single-step/)