# Deploy and score a machine learning model by using an online endpoint 

Learn how to use an online endpoint to deploy your model, so you don't have to create and manage the underlying infrastructure. You'll begin by deploying a model on your local machine to debug any errors, and then you'll deploy and test it in Azure.

Managed online endpoints help to deploy your ML models in a turnkey manner. Managed online endpoints work with powerful CPU and GPU machines in Azure in a scalable, fully managed way. Managed online endpoints take care of serving, scaling, securing, and monitoring your models, freeing you from the overhead of setting up and managing the underlying infrastructure. 

For more information, see [What are Azure Machine Learning endpoints?](https://docs.microsoft.com/azure/machine-learning/concept-endpoints).

## Prerequisites

* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* Install and configure the [Python SDK v2](sdk/setup.sh).

* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.

* You must have an Azure Machine Learning workspace. 

* To deploy locally, you must install Docker Engine on your local computer. We highly recommend this option, so it's easier to debug issues.

# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, we connect to the workspace that the rest of this notebook uses.

## 1.1. Import the required libraries

In [1]:
# import required libraries
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)
from azure.identity import DefaultAzureCredential

## 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need identifier parameters: a subscription ID, a resource group, and a workspace name. We pass these details to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. This tutorial uses [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python). Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

In [2]:
# enter details of your AML workspace
subscription_id = "e9a02075-e822-4b24-95f3-ce725888b904"
resource_group = "pricing-e-machine-learning-eastus-dev-rg"
workspace = "e-machine-learning-eastus-dev-ws"

In [3]:
# get a handle to the workspace
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)
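
Hardcoding workspace identifiers in a notebook makes them easy to leak when the notebook is shared. One possible alternative, sketched below, reads them from environment variables. The variable names `AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, and `AZUREML_WORKSPACE_NAME` and the helper `load_workspace_details` are hypothetical, chosen only for illustration.

```python
import os


def load_workspace_details(env=None):
    """Return (subscription_id, resource_group, workspace) from environment variables.

    The variable names below are illustrative assumptions, not names the SDK reads itself.
    """
    env = os.environ if env is None else env
    names = ("AZURE_SUBSCRIPTION_ID", "AZURE_RESOURCE_GROUP", "AZUREML_WORKSPACE_NAME")
    # Collect every missing or empty variable so the error message lists them all.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

The returned tuple can be unpacked into `subscription_id`, `resource_group`, and `workspace` and passed to `MLClient` exactly as in the cell above.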

## Deploy and debug locally by using local endpoints

### Note
* To deploy locally, [Docker Engine](https://docs.docker.com/engine/install/) must be installed.
* Docker Engine must be running. Docker Engine typically starts when the computer starts. If it doesn't, you can [troubleshoot Docker Engine](https://docs.docker.com/config/daemon/#start-the-daemon-manually).

# 2. Create local endpoint and deployment

## 2.1 Create local endpoint

The goal of a local endpoint deployment is to validate and debug your code and configuration before you deploy to Azure. Local deployment has the following limitations:
* Local endpoints *do not support* traffic rules, authentication, or probe settings.
* Local endpoints support only one deployment per endpoint.

In [4]:
# Creating a local endpoint
import datetime

local_endpoint_name = "local-" + datetime.datetime.now().strftime("%m%d%H%M%f")

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=local_endpoint_name, description="this is a sample local endpoint"
)
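
To see what the name suffix encodes, the sketch below applies the same `strftime` pattern to a fixed timestamp. The timestamp is chosen to reproduce the endpoint name that appears in this notebook's output; the seconds value is an assumption, since the pattern does not include seconds.

```python
import datetime

# Fixed timestamp chosen to match the endpoint name shown in this notebook's output.
# The seconds value (0) is arbitrary because the pattern below omits seconds.
stamp = datetime.datetime(2022, 12, 19, 18, 4, 0, 239098)

# %m month, %d day, %H hour, %M minute, %f microseconds (six digits)
suffix = stamp.strftime("%m%d%H%M%f")
name = "local-" + suffix

print(name)       # local-12191804239098
print(len(name))  # 20
```

Because the suffix ends in microseconds, two runs of the cell are unlikely to produce the same name.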

In [5]:
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

Creating local endpoint (local-12191804239098) .Done (0m 5s)
Field 'mirror_traffic': This is an experimental field, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


ManagedOnlineEndpoint({'public_network_access': None, 'provisioning_state': None, 'scoring_uri': None, 'swagger_uri': None, 'name': 'local-12191804239098', 'description': 'this is a sample local endpoint', 'tags': {}, 'properties': {}, 'id': None, 'base_path': './', 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x7f8778038d90>, 'auth_mode': 'key', 'location': None, 'identity': None, 'traffic': {}, 'mirror_traffic': {}, 'kind': None})

## 2.2 Create local deployment

The example contains all the files needed to deploy a model on an online endpoint. To deploy a model, you must have:

* Model files (or the name and version of a model that's already registered in your workspace). In the example, we have a scikit-learn model that does regression.
* The code that's required to score the model. In this case, we have a score.py file.
* An environment in which your model runs. As you'll see, the environment might be a Docker image with Conda dependencies, or it might be a Dockerfile.
* Settings to specify the instance type and scaling capacity.
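
A scoring script for an online endpoint defines two functions: `init()`, which runs once when the deployment starts, and `run(raw_data)`, which runs for each request. The `score.py` used by this example is not reproduced here; the following is only a minimal sketch of that shape. The stand-in model and the `"data"` request key are assumptions made so the sketch runs without any model file.

```python
import json

model = None


class _StandInModel:
    """Placeholder for the real scikit-learn model; it just sums each input row."""

    def predict(self, rows):
        return [float(sum(row)) for row in rows]


def init():
    """Called once when the deployment starts; load the model here."""
    global model
    # A real script would typically deserialize the model file from disk instead.
    model = _StandInModel()


def run(raw_data):
    """Called for every scoring request with the request body as a string."""
    rows = json.loads(raw_data)["data"]  # the "data" key is an assumed request shape
    return model.predict(rows)
```

Keeping model loading in `init()` means the cost is paid once at startup rather than on every request.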

### Key aspects of deployment 

- `name` - Name of the deployment.
- `endpoint_name` - Name of the endpoint to create the deployment under.
- `model` - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.
- `environment` - The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.
- `code_configuration` - The configuration for the source code and scoring script.
    - `code` - Path to the source code directory for scoring the model.
    - `scoring_script` - Relative path to the scoring file in the source code directory.
- `instance_type` - The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](https://docs.microsoft.com/en-us/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list).
- `instance_count` - The number of instances to use for the deployment.

In [11]:
model = Model(path="./model-1/model/sklearn_regression_model.pkl")
env = Environment(
    conda_file="./model-1/environment/conda.yml",
    image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210727.v1",
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=local_endpoint_name,
    model=model,
    environment=env,
    code_configuration=CodeConfiguration(
        code="./model-1/onlinescoring", scoring_script="score.py"
    ),
    instance_type="Standard_F2s_v2",
    instance_count=1,
)

In [12]:
ml_client.online_deployments.begin_create_or_update(
    deployment=blue_deployment, local=True
)

Updating local deployment (local-12191804239098 / blue) .
Building Docker image from Dockerfile
Step 1/6 : FROM mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210727.v1
 ---> 9fab65be7722
Step 2/6 : RUN mkdir -p /var/azureml-app/
 ---> Using cache
 ---> f6d8cf722139
Step 3/6 : WORKDIR /var/azureml-app/
 ---> Using cache
 ---> 5f7e172624c5
Step 4/6 : COPY conda.yml /var/azureml-app/
 ---> Using cache
 ---> 9413a786d623
Step 5/6 : RUN conda env create -n inf-conda-env --file conda.yml
 ---> Using cache
 ---> a2a34c3ca9b4
Step 6/6 : CMD ["conda", "run", "--no-capture-output", "-n", "inf-conda-env", "runsvdir", "/var/runit"]
 ---> Using cache
 ---> 7899ab1e5779
Successfully built 7899ab1e5779
Successfully tagged local-12191804239098:blue

Starting up endpoint.....Done (0m 30s)


ManagedOnlineDeployment({'private_network_connection': None, 'egress_public_network_access': None, 'provisioning_state': 'Succeeded', 'endpoint_name': 'local-12191804239098', 'type': 'Managed', 'name': 'blue', 'description': None, 'tags': {}, 'properties': {}, 'id': None, 'base_path': PosixPath('/mnt/batch/tasks/shared/LS_root/mounts/clusters/zvealey-ci/code/Users/vea02153z/LocalEndpointTest'), 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x7f8773112a60>, 'model': Model({'job_name': None, 'is_anonymous': False, 'auto_increment_version': False, 'name': '7713d7a5680d37a33a7ac52530aec294', 'description': None, 'tags': {}, 'properties': {}, 'id': None, 'base_path': PosixPath('/mnt/batch/tasks/shared/LS_root/mounts/clusters/zvealey-ci/code/Users/vea02153z/LocalEndpointTest'), 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x7f87731124c0>, 'version': '1', 'latest_version': None, 'path': '/mnt/batch/tasks/shared/LS_root/m

# 3. Verify the local deployment succeeded

## 3.1 Check the status to see whether the model was deployed without error

In [13]:
ml_client.online_endpoints.get(name=local_endpoint_name, local=True)

ManagedOnlineEndpoint({'public_network_access': None, 'provisioning_state': 'Succeeded', 'scoring_uri': 'http://localhost:49158/score', 'swagger_uri': None, 'name': 'local-12191804239098', 'description': 'this is a sample local endpoint', 'tags': {}, 'properties': {}, 'id': None, 'base_path': './', 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x7f87731301c0>, 'auth_mode': 'key', 'location': 'local', 'identity': None, 'traffic': {}, 'mirror_traffic': {}, 'kind': None})
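
In the output above, `provisioning_state` is `'Succeeded'` and `scoring_uri` is populated, whereas both were `None` right after the endpoint was created. A small check along these lines can make that comparison explicit. The helper `is_ready` is hypothetical, and `SimpleNamespace` objects stand in for the returned endpoint so the sketch runs without a workspace.

```python
from types import SimpleNamespace


def is_ready(endpoint):
    """Return True when the endpoint reports success and exposes a scoring URI."""
    return endpoint.provisioning_state == "Succeeded" and bool(endpoint.scoring_uri)


# Stand-ins mirroring the two outputs shown in this notebook.
just_created = SimpleNamespace(provisioning_state=None, scoring_uri=None)
deployed = SimpleNamespace(
    provisioning_state="Succeeded", scoring_uri="http://localhost:49158/score"
)

print(is_ready(just_created))  # False
print(is_ready(deployed))      # True
```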

## 3.2 Get logs

In [14]:
ml_client.online_deployments.get_logs(
    name="blue", endpoint_name=local_endpoint_name, local=True, lines=50
)

"2022-12-19T18:09:13,450957600+00:00 - iot-server/run \r\n2022-12-19T18:09:13,455273700+00:00 - rsyslog/run \r\n2022-12-19T18:09:13,456448100+00:00 - nginx/run \r\n2022-12-19T18:09:13,465295800+00:00 - gunicorn/run \r\nEdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...\r\n2022-12-19T18:09:13,538807200+00:00 - iot-server/finish 1 0\r\n2022-12-19T18:09:13,541496700+00:00 - Exit code 1 is normal. Not restarting iot-server.\r\nDynamic Python package installation is disabled.\r\nStarting HTTP server\r\nStarting gunicorn 20.1.0\r\nListening at: http://127.0.0.1:31311 (29)\r\nUsing worker: sync\r\nworker timeout is set to 300\r\nBooting worker with pid: 68\r\nSPARK_HOME not set. Skipping PySpark Initialization.\r\nInitializing logger\r\n2022-12-19 18:09:14,578 | root | INFO | Starting up app insights client\r\nlogging socket was found. logging is available.\r\nlogging socket was found. logging is available.\r\n2022-12-19 18:09:14,582 | root | INFO | Starting up request

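The logs come back as a single string with `\r\n` line endings, which is hard to scan. As a rough sketch, the string can be split into lines and filtered by keyword. The helper `find_lines` is hypothetical, and `sample_logs` is a short excerpt of the output above.

```python
def find_lines(logs, keyword):
    """Return the log lines that contain `keyword`, ignoring case."""
    return [line for line in logs.splitlines() if keyword.lower() in line.lower()]


# A short excerpt of the log string shown above.
sample_logs = (
    "2022-12-19T18:09:13,465295800+00:00 - gunicorn/run \r\n"
    "EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...\r\n"
    "Starting gunicorn 20.1.0\r\n"
    "Booting worker with pid: 68\r\n"
)

for line in find_lines(sample_logs, "gunicorn"):
    print(line)
```

In the notebook, the string returned by `get_logs` would be passed in place of `sample_logs`.
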
## 3.3 Invoke the local endpoint
Invoke the endpoint to score the model by using the convenience method `invoke`, passing the request payload that is stored in a JSON file.

In [15]:
ml_client.online_endpoints.invoke(
    endpoint_name=local_endpoint_name,
    request_file="./model-1/sample-request.json",
    local=True,
)

'{"auction_id": 1, "bid_cents": 11055.977245525679}'
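
The response is returned as a JSON string rather than a Python object. A minimal sketch of turning it into a dictionary, using the response shown above:

```python
import json

# The response string returned by the invoke call above.
response = '{"auction_id": 1, "bid_cents": 11055.977245525679}'

result = json.loads(response)
print(result["auction_id"])           # 1
print(round(result["bid_cents"], 2))  # 11055.98
```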

# 4. Delete the endpoint


In [None]:
# local=True targets the local endpoint created above rather than an endpoint in Azure
ml_client.online_endpoints.begin_delete(name=local_endpoint_name, local=True)