# Deploy multiple machine learning models from registry to online endpoint  

Learn how to use an online endpoint to deploy your model, so you don't have to create and manage the underlying infrastructure. You'll begin by deploying a model on your local machine to debug any errors, and then you'll deploy and test it in Azure.

Managed online endpoints help to deploy your ML models in a turnkey manner. Managed online endpoints work with powerful CPU and GPU machines in Azure in a scalable, fully managed way. Managed online endpoints take care of serving, scaling, securing, and monitoring your models, freeing you from the overhead of setting up and managing the underlying infrastructure. 

For more information, see [What are Azure Machine Learning endpoints?](https://docs.microsoft.com/azure/machine-learning/concept-endpoints).

## Prerequisites

* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* Install and configure the [Python SDK v2](sdk/setup.sh).

* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.

* You must have an Azure Machine Learning workspace. 

# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace that will host the endpoint.

## 1.1. Import the required libraries

In [1]:
# import required libraries
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)
from azure.identity import DefaultAzureCredential

## 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need three identifiers: a subscription ID, a resource group name, and a workspace name. We will pass these details to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

In [2]:
# enter details of your AML workspace
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace_name = "<WORKSPACE>"

In [3]:
# get a handle to the workspace
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace_name
)
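Hard-coding identifiers in a notebook makes it easy to commit them by accident. As one possible alternative, the sketch below reads the three values from environment variables. The variable names (`AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, `AZUREML_WORKSPACE_NAME`) and the helper `load_workspace_settings` are illustrative assumptions, not something the SDK defines.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
_SETTING_TO_ENV_VAR = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "AZURE_RESOURCE_GROUP",
    "workspace_name": "AZUREML_WORKSPACE_NAME",
}


def load_workspace_settings(env=None):
    """Return the workspace identifiers, failing early if any is missing."""
    env = os.environ if env is None else env
    settings = {}
    missing = []
    for setting, var in _SETTING_TO_ENV_VAR.items():
        value = env.get(var, "").strip()
        # Treat empty values and unfilled "<PLACEHOLDER>" values as missing.
        if not value or value.startswith("<"):
            missing.append(var)
        settings[setting] = value
    if missing:
        raise ValueError("Missing workspace settings: " + ", ".join(missing))
    return settings
```

The keys match the keyword arguments of `MLClient`, so the result can be unpacked directly, for example `MLClient(DefaultAzureCredential(), **load_workspace_settings())`.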

# 4. Deploy your online endpoint to Azure
Next, deploy your online endpoint to Azure.

## 4.1 Configure online endpoint
`endpoint_name`: The name of the endpoint. It must be unique in the Azure region. Naming rules are defined under [managed online endpoint limits](https://docs.microsoft.com/azure/machine-learning/how-to-manage-quotas#azure-machine-learning-managed-online-endpoints-preview).

`auth_mode` : Use `key` for key-based authentication. Use `aml_token` for Azure Machine Learning token-based authentication. A `key` does not expire, but `aml_token` does expire. 

Optionally, you can add a description and tags to your endpoint.

In [6]:
# Creating a unique endpoint name with current datetime to avoid conflicts
import datetime

online_endpoint_name = "multimodel-" + datetime.datetime.now().strftime("%m%d%H%M%f")

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="this is a multimodel online endpoint",
    auth_mode="key",
    tags={"foo": "bar"},
)
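Because a rejected name only surfaces once the create call reaches Azure, it can help to check the generated name locally first. The following is a minimal sketch. The pattern encodes our understanding of the naming rules (3 to 32 characters, starting with a letter, then letters, digits, or hyphens); treat it as an assumption and confirm it against the limits page linked above. `make_endpoint_name` is a hypothetical helper.

```python
import datetime
import re

# Assumed naming rule; verify against the managed online endpoint limits.
_ENDPOINT_NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9-]{2,31}$")


def make_endpoint_name(prefix, now=None):
    """Append a timestamp to the prefix and check the result locally."""
    now = now or datetime.datetime.now()
    # Same timestamp format as the cell above: month, day, hour, minute, microseconds.
    name = prefix + now.strftime("%m%d%H%M%f")
    if not _ENDPOINT_NAME_RE.match(name):
        raise ValueError(f"'{name}' does not satisfy the assumed naming rule")
    return name
```

The timestamp adds 14 characters, so under the assumed 32-character limit the prefix can be at most 18 characters long.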

## 4.2 Create the endpoint

Using the `MLClient` created earlier, we will now create the endpoint in the workspace. `begin_create_or_update` starts the endpoint creation and returns a poller; calling `.result()` on the poller blocks until the creation finishes and returns the created endpoint.

In [7]:
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://multimodel-03070249904013.eastus2.inference.ml.azure.com/score', 'openapi_uri': 'https://multimodel-03070249904013.eastus2.inference.ml.azure.com/swagger.json', 'name': 'multimodel-03070249904013', 'description': 'this is a multimodel online endpoint', 'tags': {'foo': 'bar'}, 'properties': {'azureml.onlineendpointid': '/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/resourcegroups/openaml/providers/microsoft.machinelearningservices/workspaces/nlp-workspace/onlineendpoints/multimodel-03070249904013', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/providers/Microsoft.MachineLearningServices/locations/eastus2/mfeOperationsStatus/oe:baa4dabf-18ba-45e2-8649-6d72d7082169:2c243899-2e06-42a5-8e8a-cb5da916e705?api-version=2022-02-01-preview'}, 'id': '/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/resourceGroups
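Calling `.result()` with no arguments waits for as long as the operation takes. If you would rather fail after a bounded wait, one option is to check the poller periodically. The sketch below assumes only that the object returned by `begin_create_or_update` exposes `done()` and `result()`, as Azure SDK pollers do; `wait_for_operation` and its default values are hypothetical.

```python
import time


def wait_for_operation(poller, timeout_s=1800, poll_interval_s=15, sleep=time.sleep):
    """Wait for a long-running operation, raising TimeoutError if it takes too long."""
    waited = 0
    while not poller.done():
        if waited >= timeout_s:
            raise TimeoutError(f"operation still running after {waited} seconds")
        sleep(poll_interval_s)
        waited += poll_interval_s
    # The operation has finished; result() returns its value or raises its error.
    return poller.result()
```

With this helper, the cell above could be written as `wait_for_operation(ml_client.online_endpoints.begin_create_or_update(endpoint))`.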

In [16]:
endpoint = ml_client.online_endpoints.get(online_endpoint_name)
print("Endpoint Identity {0} ".format(endpoint.identity))

Endpoint Identity <azure.ai.ml.entities._credentials.IdentityConfiguration object at 0x7f913a0bc490> 


## 4.2.1 Add Role assignment to Managed Endpoint Identity
To load models from the workspace model registry, the managed online endpoint (MOE) system-assigned identity must be granted access to the workspace.
Assign the `AzureML Data Scientist` role to the MOE identity at workspace scope.

In [19]:
# add permissions for Workspace
import uuid
from azure.mgmt.authorization import AuthorizationManagementClient

authorization_client = AuthorizationManagementClient(
    credential=ml_client._credential, subscription_id=subscription_id
)
workspace = ml_client.workspaces.get(name=workspace_name)

# Get "AzureML Data Scientist" built-in role as a RoleDefinition object
role_name = "AzureML Data Scientist"
roles = list(
    authorization_client.role_definitions.list(
        workspace.id, filter="roleName eq '{}'".format(role_name)
    )
)
assert len(roles) == 1
ml_role = roles[0]

print("Role {0}  Found".format(ml_role))

# Grant the endpoint's managed identity the role at workspace scope
role_assignment = authorization_client.role_assignments.create(
    workspace.id,
    str(uuid.uuid4()),  # role assignment name: a random GUID string
    {
        "role_definition_id": ml_role.id,
        "principal_id": endpoint.identity.principal_id,
    },
)
print("RoleAssignment {0}  Created".format(role_assignment))

Role {'additional_properties': {}, 'id': '/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/providers/Microsoft.Authorization/roleDefinitions/f6c7c914-8db3-469d-8ca1-694a8f32e121', 'name': 'f6c7c914-8db3-469d-8ca1-694a8f32e121', 'type': 'Microsoft.Authorization/roleDefinitions', 'role_name': 'AzureML Data Scientist', 'description': 'Can perform all actions within an Azure Machine Learning workspace, except for creating or deleting compute resources and modifying the workspace itself.', 'role_type': 'BuiltInRole', 'permissions': [<azure.mgmt.authorization.v2022_04_01.models._models_py3.Permission object at 0x7f912e010c70>], 'assignable_scopes': ['/']}  Found
RoleAssignment {'additional_properties': {}, 'id': '/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/resourceGroups/openaml/providers/Microsoft.MachineLearningServices/workspaces/nlp-workspace/providers/Microsoft.Authorization/roleAssignments/434ca4bf-5e06-4227-9e73-a18bfc6431fd', 'name': '434ca4bf-5e06-4227-9e73-a18bfc6431fd',
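The cell above names the role assignment with a random GUID, so running it a second time attempts to create a second assignment for the same identity, role, and scope. A common way to make such a step repeatable is to derive the GUID from those three values, so a re-run targets the same assignment name. The sketch below shows that idea with the standard library only; `role_assignment_name` is a hypothetical helper, and how the service responds to a repeated request is something to verify in your own environment.

```python
import uuid


def role_assignment_name(scope, role_definition_id, principal_id):
    """Derive a stable GUID string from the fields that identify an assignment."""
    # Azure resource IDs are case-insensitive, so normalize before hashing.
    key = "|".join(part.lower() for part in (scope, role_definition_id, principal_id))
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))
```

It would replace the random name in the call above, for example `role_assignment_name(workspace.id, ml_role.id, endpoint.identity.principal_id)`.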

## 4.3 Configure online deployment

A deployment is a set of resources required for hosting the model that does the actual inferencing. We will create a deployment for our endpoint using the `ManagedOnlineDeployment` class.

In [29]:
env = Environment(
    conda_file="./environment/conda.yml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

blue_deployment = ManagedOnlineDeployment(
    name="blue" + datetime.datetime.now().strftime("%m%d%H%M%f"),
    endpoint_name=online_endpoint_name,
    #model=model,
    environment=env,
    code_configuration=CodeConfiguration(
        code="./onlinescoring", scoring_script="score_registry.py"
    ),
    environment_variables={
        "TRACKING_URI": workspace.mlflow_tracking_uri
    },
    instance_type="Standard_F4s_v2",
    instance_count=1,
)

print("Deployment {0}  defined".format(blue_deployment.name))


Deployment blue03070412735571  defined


## 4.4 Create the deployment

Using the `MLClient` created earlier, we will now create the deployment in the workspace. `begin_create_or_update` starts the deployment creation and returns a poller; calling `.result()` on the poller blocks until the deployment finishes and returns the created deployment.

In [30]:
ml_client.online_deployments.begin_create_or_update(blue_deployment).result()

Check: endpoint multimodel-03070249904013 exists
Uploading onlinescoring (0.0 MBs): 100%|██████████|

ManagedOnlineDeployment({'private_network_connection': False, 'data_collector': None, 'provisioning_state': 'Succeeded', 'endpoint_name': 'multimodel-03070249904013', 'type': 'Managed', 'name': 'blue03070412735571', 'description': None, 'tags': {}, 'properties': {'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/providers/Microsoft.MachineLearningServices/locations/eastus2/mfeOperationsStatus/od:baa4dabf-18ba-45e2-8649-6d72d7082169:25b0ee19-0268-4952-8f69-4c04b2e9866a?api-version=2022-02-01-preview'}, 'id': '/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/resourceGroups/openaml/providers/Microsoft.MachineLearningServices/workspaces/nlp-workspace/onlineEndpoints/multimodel-03070249904013/deployments/blue03070412735571', 'Resource__source_path': None, 'base_path': '/mnt/batch/tasks/shared/LS_root/mounts/clusters/eneros3/code/Users/eneros/azureml-poc/multimodel-registry', 'creation_context': None, 'serialize': <msrest.serializat

In [31]:
# route 100% of the endpoint's traffic to the blue deployment
endpoint.traffic = {blue_deployment.name: 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

Readonly attribute principal_id will be ignored in class <class 'azure.ai.ml._restclient.v2022_05_01.models._models_py3.ManagedServiceIdentity'>
Readonly attribute tenant_id will be ignored in class <class 'azure.ai.ml._restclient.v2022_05_01.models._models_py3.ManagedServiceIdentity'>


ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://multimodel-03070249904013.eastus2.inference.ml.azure.com/score', 'openapi_uri': 'https://multimodel-03070249904013.eastus2.inference.ml.azure.com/swagger.json', 'name': 'multimodel-03070249904013', 'description': 'this is a multimodel online endpoint', 'tags': {'foo': 'bar'}, 'properties': {'azureml.onlineendpointid': '/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/resourcegroups/openaml/providers/microsoft.machinelearningservices/workspaces/nlp-workspace/onlineendpoints/multimodel-03070249904013', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/providers/Microsoft.MachineLearningServices/locations/eastus2/mfeOperationsStatus/oe:baa4dabf-18ba-45e2-8649-6d72d7082169:975c6951-5a3c-41ea-b4e5-b2ac288e3944?api-version=2022-02-01-preview'}, 'id': '/subscriptions/f9b97038-ed78-4a26-a1a7-51e81e75d867/resourceGroups
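The traffic dictionary maps each deployment name to an integer percentage. To our understanding the percentages must add up to either 0 or 100, so once an endpoint holds several deployments a quick local check can catch a bad split before the update is sent. `validate_traffic` below is a hypothetical helper sketching that check.

```python
def validate_traffic(traffic):
    """Check a {deployment_name: percent} mapping before assigning it to an endpoint."""
    for name, percent in traffic.items():
        if not isinstance(percent, int) or not 0 <= percent <= 100:
            raise ValueError(f"traffic for '{name}' must be an integer from 0 to 100")
    total = sum(traffic.values())
    # Assumed service rule: all zero (no routing) or a complete 100% split.
    if total not in (0, 100):
        raise ValueError(f"traffic must sum to 0 or 100, got {total}")
    return traffic
```

For example, `endpoint.traffic = validate_traffic({blue_deployment.name: 100})` behaves like the cell above but rejects a split such as 90/20 locally.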

# 5. Test the endpoint with sample data
Using the `MLClient` created earlier, we will get a handle to the endpoint. The endpoint can be invoked using the `invoke` command with the following parameters:
- `endpoint_name` - Name of the endpoint
- `request_file` - File with request data
- `deployment_name` - Name of the specific deployment to test in an endpoint

We will send a sample request using a [json](./onlinescoring/sample-request.json) file.

In [32]:
# test the blue deployment with some sample data
ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name=blue_deployment.name,
    request_file="./onlinescoring/sample-request.json",
)

'["Iris-setosa", "Iris-versicolor", "Iris-virginica"]'
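Client applications usually call the scoring URI over HTTPS rather than through `invoke`. The sketch below builds such a request with the standard library. It assumes key-based authentication (as configured with `auth_mode="key"`), sends the key as a bearer token, and uses the `azureml-model-deployment` header to target one deployment; `build_scoring_request` is a hypothetical helper, and the payload shape depends on your scoring script.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, key, payload, deployment_name=None):
    """Build a POST request for an online endpoint that uses key authentication."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }
    if deployment_name:
        # Route to one deployment instead of following the endpoint's traffic split.
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Usage against a live endpoint (not executed here):
# keys = ml_client.online_endpoints.get_keys(name=online_endpoint_name)
# request = build_scoring_request(
#     endpoint.scoring_uri, keys.primary_key, payload, blue_deployment.name
# )
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Omitting `deployment_name` lets the endpoint route the request according to its traffic settings.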

# 6. Managing endpoints and deployments

## 6.1 Get details of the endpoint

In [33]:
# Get the details for online endpoint
endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)

# existing traffic details
print(endpoint.traffic)

# Get the scoring URI
print(endpoint.scoring_uri)

{'blue03070412735571': 100, 'blue03070404167155': 0, 'blue03070330299288': 0}
https://multimodel-03070249904013.eastus2.inference.ml.azure.com/score


## 6.2 Get the logs for the new deployment
Get the logs for the blue deployment and verify as needed.

In [34]:
ml_client.online_deployments.get_logs(
    name=blue_deployment.name, endpoint_name=online_endpoint_name, lines=50
)



# 7. Delete the endpoint


In [None]:
#ml_client.online_endpoints.begin_delete(name=online_endpoint_name)