# Deploy MLflow models to Online Endpoints

Import the required packages:

In [None]:
from mlflow.tracking import MlflowClient
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

import json
import mlflow
import pandas as pd

## 1. Connect to Azure Machine Learning Workspace

### If you are working in a Compute Instance in Azure Machine Learning

If you are working in an Azure Machine Learning Compute Instance, your MLflow installation is automatically connected to Azure Machine Learning, and you don't need to do anything.

### If you are working on your local machine, or in a cloud outside Azure Machine Learning

You will need to connect MLflow to the Azure Machine Learning workspace you want to work on. MLflow uses the tracking URI to indicate the MLflow server you want to connect to. There are multiple ways to get the Azure Machine Learning MLflow Tracking URI. In this tutorial we will use the Azure ML SDK for Python, but you can check [Set up tracking environment - Azure Machine Learning Docs](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-mlflow-cli-runs#set-up-tracking-environment) for more alternatives.

In [None]:
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"

In [None]:
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

You can use the workspace object to get the tracking URI:

In [None]:
azureml_tracking_uri = ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
mlflow.set_tracking_uri(azureml_tracking_uri)
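
As one of the alternatives mentioned above, the tracking URI can also be assembled by hand when the SDK is not available. The sketch below assumes the `azureml://<region>.api.azureml.ms/mlflow/v1.0/...` layout described in the Azure Machine Learning documentation; the helper name `build_tracking_uri` and the `region` argument are illustrative and not part of this notebook, so verify the result against the URI returned by the workspace object before relying on it.

```python
def build_tracking_uri(region, subscription_id, resource_group, workspace):
    """Assemble an Azure ML MLflow tracking URI (assumed layout, see lead-in)."""
    return (
        f"azureml://{region}.api.azureml.ms/mlflow/v1.0"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace}"
    )


# Placeholder values, for illustration only
example_uri = build_tracking_uri(
    "eastus", "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<AML_WORKSPACE_NAME>"
)
print(example_uri)
```

The resulting string could then be passed to `mlflow.set_tracking_uri`, or exported through the `MLFLOW_TRACKING_URI` environment variable, which MLflow reads at startup.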

## 2. Registering the model in the registry

This example uses an MLflow model based on the [UCI Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). The dataset contains 76 attributes, but this example uses a subset of 14 of them. The model predicts the presence of heart disease in a patient. The target is an integer, either 0 (no presence) or 1 (presence).

The model has been trained using an XGBoost classifier, and all the required preprocessing has been packaged as a scikit-learn pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.

Let's ensure the model is registered in the workspace. First, set the model name and the local path of the model files:

In [None]:
model_name = 'heart-classifier'
model_local_path = "model"

Let's check if the model is registered:

In [None]:
mlflow_client = MlflowClient()
model_versions = mlflow_client.search_model_versions(filter_string=f"name = '{model_name}'")

If not, let's create one:

In [None]:
if any(model_versions):
    version = model_versions[0].version
else:
    registered_model = mlflow_client.create_model_version(name=model_name, source=f"file://{model_local_path}")
    version = registered_model.version
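
The cell above takes the first search result. Whether that entry is the newest version depends on the order in which the registry returns results, which this tutorial does not specify. A minimal sketch of picking the highest version explicitly follows; `latest_version` and `FakeModelVersion` are hypothetical names used only for illustration, with the namedtuple standing in for the objects returned by `search_model_versions`.

```python
from collections import namedtuple

# Stand-in for the model version objects returned by MLflow (illustration only)
FakeModelVersion = namedtuple("FakeModelVersion", ["name", "version"])


def latest_version(model_versions):
    """Return the highest version found, or None if the list is empty."""
    versions = list(model_versions)
    if not versions:
        return None
    # Compare numerically, since versions may be exposed as strings
    return max(versions, key=lambda mv: int(mv.version)).version


candidates = [
    FakeModelVersion("heart-classifier", "2"),
    FakeModelVersion("heart-classifier", "10"),
    FakeModelVersion("heart-classifier", "9"),
]
print(latest_version(candidates))
```

With the real client, the same helper could be called as `latest_version(model_versions)` in place of `model_versions[0].version`.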

In [None]:
print(f"We are going to deploy model {model_name} with version {version}")

## 3. Create an Online Endpoint

Online endpoints are endpoints that are used for online (real-time) inferencing. Online endpoints contain deployments that are ready to receive data from clients and can send responses back in real time.

### 3.1 Configure the endpoint

Online Endpoints have the concepts of __Endpoint__ and __Deployment__. An endpoint represents the API that customers use to consume the model, while a deployment indicates the specific implementation of that API. This distinction allows users to decouple the API from the implementation and to change the underlying implementation without affecting the consumer.

In [None]:
import random
import string

# Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "heart-classifier-" + endpoint_suffix

print(f"Endpoint name: {endpoint_name}")
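
Before calling the service, it can help to check a generated name locally. The sketch below encodes naming rules that are an assumption and are not stated in this tutorial: 3 to 32 characters, starting with a letter, ending with a letter or digit, and otherwise containing only letters, digits, and hyphens. The helper `is_plausible_endpoint_name` is hypothetical; check the current Azure Machine Learning documentation for the authoritative rules.

```python
import re

# Assumed naming rules (see lead-in): 3 to 32 characters, letter first,
# letter or digit last, and only letters, digits, or hyphens in between.
_ENDPOINT_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{1,30}[A-Za-z0-9]")


def is_plausible_endpoint_name(name):
    """Return True if the name satisfies the assumed rules above."""
    return _ENDPOINT_NAME_PATTERN.fullmatch(name) is not None


print(is_plausible_endpoint_name("heart-classifier-a1b2c"))
```

A name that fails this check is rejected locally, which is cheaper than waiting for the service to refuse it.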

### 3.2 Create an Online Endpoint


First, let's create an MLflow deployment client for Azure Machine Learning:

In [None]:
from mlflow.deployments import get_deploy_client

deployment_client = get_deploy_client(mlflow.get_tracking_uri())

Let's create the endpoint with basic configuration:

In [None]:
endpoint = deployment_client.create_endpoint(endpoint_name)

### 3.3 Create a deployment

To configure the hardware requirements of your deployment, you need to create a JSON file with the desired configuration:

In [None]:
deploy_config = {
   "instance_type": "Standard_DS2_v2",
   "instance_count": 1,
}

Write the configuration to a file:

In [None]:
deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as outfile:
   outfile.write(json.dumps(deploy_config))

The method `create_deployment` allows you to create a simple deployment using the configuration indicated in the configuration file. We are going to name this deployment "default".

In [None]:
deployment = deployment_client.create_deployment(
   name="default",
   endpoint=endpoint_name,
   model_uri=f"models:/{model_name}/{version}",
   config={ "deploy-config-file": deployment_config_path },
)

> The parameter `endpoint` in `create_deployment` is optional. If it is not indicated, an endpoint is automatically created for you with the same name as the deployment you are creating. Hosting multiple deployments under a single endpoint is a feature of Azure Machine Learning that may not be present in all cloud providers, which is why the parameter is optional. However, we highly advise using it.
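
Because an endpoint can host several deployments, it also decides how requests are split among them. Depending on how the endpoint was created, a new deployment may not receive any traffic until it is assigned some. The following is a minimal sketch of a traffic configuration that routes all requests to the "default" deployment. The file name `traffic_config.json` is arbitrary, and the commented-out `update_endpoint` call with the `endpoint-config-file` key is an assumption about the Azure Machine Learning MLflow plugin that should be checked against its documentation.

```python
import json

deployment_name = "default"  # the deployment created above

# Traffic is expressed as a percentage per deployment name
traffic_config = {"traffic": {deployment_name: 100}}

traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    json.dump(traffic_config, outfile)

# Assumed usage with the deployment client created earlier:
# deployment_client.update_endpoint(
#     endpoint=endpoint_name,
#     config={"endpoint-config-file": traffic_config_path},
# )
```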

## 4. Test the deployment

### 4.1 Create a sample request file

Azure Machine Learning requires the input examples that you send to the service to be wrapped in the key `input_data`. Note that this is not required when serving the model with the command `mlflow models serve`.

The following code samples 5 observations from the training dataset, removes the `target` column (as the model will predict it), and creates a request in the file `sample.json` that can be used with the model deployment.

In [None]:
samples = pd.read_csv("data/heart.csv").sample(n=5).drop(columns=["target"]).reset_index(drop=True)

with open("sample.json", "w") as f:
    f.write(
        json.dumps({ "input_data": json.loads(samples.to_json(orient='split', index=False)) })
    )
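
To make the request shape concrete, the sketch below builds an equivalent payload by hand using only the standard library. With `orient='split'` and `index=False`, pandas emits a `columns` list and a `data` list of rows, which are then nested under `input_data`. The column names and values here are a made-up subset for illustration, not the real contents of `data/heart.csv`.

```python
import json

# Hypothetical columns and rows, for illustration only
columns = ["age", "sex", "chol"]
rows = [
    [63, 1, 233],
    [41, 0, 204],
]

# Same structure as the cell above: split-oriented data under "input_data"
payload = {"input_data": {"columns": columns, "data": rows}}
request_body = json.dumps(payload)
print(request_body)
```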

### 4.2 Get the scoring URI from the endpoint

In [None]:
scoring_uri = deployment_client.get_endpoint(endpoint=endpoint_name)["properties"]["scoringUri"]

### 4.3 Invoke the endpoint

#### 4.3.1 Authentication against the endpoint

Online Endpoints support both key-based authentication and Azure Active Directory authentication. In this case we are going to use key-based authentication, which relies on a secret that the caller needs to include in the headers of the request. You can get this key using:

- Azure ML SDK for Python
- Azure ML CLI
- [Azure ML studio](https://ml.azure.com)

In our case, we are going to use the Azure ML SDK for Python. If you didn't create an `MLClient` before, create a client for the Azure Machine Learning workspace:

In [None]:
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

Let's get the secrets of the endpoint:

In [None]:
# With key-based authentication, the endpoint exposes a primary and a secondary key
endpoint_secret_key = ml_client.online_endpoints.get_keys(name=endpoint_name).primary_key

#### 4.3.2 Run the endpoint

Let's create the authentication header:

In [None]:
authentication_header = f"'Authorization: Bearer {endpoint_secret_key}'"

In [None]:
!curl $scoring_uri \
      --request POST \
      --header 'Content-Type: application/json' \
      --header $authentication_header \
      --data-binary @sample.json
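
The same call can be made from Python without shelling out. The sketch below only builds the request with the standard library so it can be inspected. `build_scoring_request` is a hypothetical helper, and the lines that actually send the request are left commented out because they need a live endpoint.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, key, payload):
    """Build a POST request carrying a JSON body and a bearer key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )


# Assumed usage with the variables defined earlier in this notebook:
# with open("sample.json") as f:
#     payload = json.load(f)
# request = build_scoring_request(scoring_uri, endpoint_secret_key, payload)
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```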

## 5. Delete the resources

Once you are done, delete the created resources:

In [None]:
# Deleting the endpoint also removes the deployments hosted under it
deployment_client.delete_endpoint(endpoint_name)