# Deploy MLflow models to legacy web services

The MLflow plugin `azureml-mlflow` can deploy models to Azure Machine Learning for real-time serving, targeting Azure Kubernetes Service (AKS), Azure Container Instances (ACI), or Managed Online Endpoints. We recommend Online Endpoints whenever possible, but ACI and AKS (v1) are also possible deployment targets.

In [None]:
%pip install -r mlflow_sdk_web_service.txt

Import the namespaces:

In [None]:
from mlflow.tracking import MlflowClient

import json
import requests
import mlflow
import pandas as pd

## 1. Connect to Azure Machine Learning Workspace

### If you are working in a Compute Instance in Azure Machine Learning

If you are working in an Azure Machine Learning Compute Instance, your MLflow installation is automatically connected to Azure Machine Learning, and you don't need to do anything.

### If you are working in your local machine, or in a cloud outside Azure Machine Learning

You will need to connect MLflow to the Azure Machine Learning workspace you want to work on. MLflow uses the tracking URI to indicate the MLflow server you want to connect to. There are multiple ways to get the Azure Machine Learning MLflow Tracking URI. In this tutorial we will use the Azure ML SDK for Python, but you can check [Set up tracking environment - Azure Machine Learning Docs](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-mlflow-cli-runs#set-up-tracking-environment) for more alternatives.

In [None]:
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"

In [None]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

You can use the workspace object to get the tracking URI:

In [None]:
azureml_tracking_uri = ml_client.workspaces.get(
    ml_client.workspace_name
).mlflow_tracking_uri
mlflow.set_tracking_uri(azureml_tracking_uri)
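
If the Azure ML SDK isn't available, the tracking URI can also be assembled by hand. The sketch below assumes the `azureml://` URI layout described in the Azure Machine Learning documentation. The helper name `build_tracking_uri`, the `region` argument, and the `v1.0` path segment are illustrative assumptions, so compare the result with the value reported by your workspace before relying on it.

```python
def build_tracking_uri(region, subscription_id, resource_group, workspace):
    """Assemble an Azure ML MLflow tracking URI (hypothetical helper).

    The URI layout is an assumption; verify it against your workspace.
    """
    return (
        f"azureml://{region}.api.azureml.ms/mlflow/v1.0"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace}"
    )


# Placeholder values; replace them with your own identifiers.
example_uri = build_tracking_uri(
    "eastus2", "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<AML_WORKSPACE_NAME>"
)
print(example_uri)
```

The resulting string can be passed to `mlflow.set_tracking_uri` in the same way as the value obtained from the workspace object.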

## 2. Registering the model in the registry

This example uses an MLflow model based on the [UCI Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). The database contains 76 attributes, but we are using a subset of 14 of them. The model tries to predict the presence of heart disease in a patient. The target is integer-valued: 0 (no presence) or 1 (presence).

The model has been trained using an XGBoost classifier, and all the required preprocessing has been packaged as a scikit-learn pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.

Let's define the model's name and local path so we can ensure it is registered in the workspace:

In [None]:
model_name = "heart-classifier"
model_local_path = "model"

Let's check if the model is registered:

In [None]:
mlflow_client = MlflowClient()
model_versions = mlflow_client.search_model_versions(
    filter_string=f"name = '{model_name}'"
)

If it is, reuse an existing version; if not, let's create one:

In [None]:
if model_versions:
    version = model_versions[0].version
else:
    registered_model = mlflow_client.create_model_version(
        name=model_name, source=f"file://{model_local_path}"
    )
    version = registered_model.version
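
The cell above reuses the first entry returned by `search_model_versions`, which may not be the newest one. Here is a minimal sketch of picking the highest version instead, assuming each entry exposes a `version` attribute that converts to an integer. `SimpleNamespace` objects stand in for the real results, and `latest_version` is a hypothetical helper, not part of MLflow.

```python
from types import SimpleNamespace


def latest_version(model_versions):
    """Return the highest version identifier, or None if the list is empty."""
    if not model_versions:
        return None
    # Compare numerically so that "10" ranks above "2".
    newest = max(model_versions, key=lambda mv: int(mv.version))
    return newest.version


# Stand-ins for the objects returned by search_model_versions.
fake_versions = [
    SimpleNamespace(version="2"),
    SimpleNamespace(version="10"),
    SimpleNamespace(version="1"),
]
print(latest_version(fake_versions))
```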

In [None]:
print(f"We are going to deploy model {model_name} with version {version}")

## 3. Create a web service

Deployments can be created with either the MLflow Python API or the MLflow CLI. In both cases, you can pass a JSON configuration file with the details of the deployment you want. If you don't provide one, a default deployment is created on Azure Container Instances (ACI) with a minimal configuration. The full specification of this configuration file for ACI and AKS is available at [Deployment configuration schema](https://docs.microsoft.com/en-us/azure/machine-learning/reference-azure-machine-learning-cli#deployment-configuration-schema).

#### Configuration example for ACI deployment

```json
{
  "computeType": "aci",
  "containerResourceRequirements":
  {
    "cpu": 1,
    "memoryInGB": 1
  },
  "location": "eastus2"
}
```

Remarks:
- If `containerResourceRequirements` is not indicated, a deployment with minimal compute configuration is applied (`cpu`: 0.1 and `memoryInGB`: 0.5).
- If `location` is not indicated, it defaults to the location of the workspace.

#### Configuration example for an AKS deployment

```json
{
  "computeType": "aks",
  "computeTargetName": "aks-mlflow"
}
```

Remarks:
- In the example above, `aks-mlflow` is the name of an Azure Kubernetes Service cluster registered or created in Azure Machine Learning.
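
Both configurations can also be produced programmatically. The following is a minimal sketch, not part of the plugin: `build_deploy_config` is a hypothetical helper that only emits the keys shown in the examples above and leaves out optional ones so that the service defaults apply.

```python
import json


def build_deploy_config(compute_type, cpu=None, memory_gb=None,
                        location=None, compute_target_name=None):
    """Build a deployment configuration dictionary (hypothetical helper)."""
    compute_type = compute_type.lower()
    if compute_type not in ("aci", "aks"):
        raise ValueError(f"Unsupported compute type: {compute_type}")

    config = {"computeType": compute_type}

    if compute_type == "aks":
        # AKS deployments point to a cluster attached to the workspace.
        if not compute_target_name:
            raise ValueError("AKS deployments need a compute target name")
        config["computeTargetName"] = compute_target_name

    # Only include resource requirements that were explicitly requested.
    resources = {}
    if cpu is not None:
        resources["cpu"] = cpu
    if memory_gb is not None:
        resources["memoryInGB"] = memory_gb
    if resources:
        config["containerResourceRequirements"] = resources

    # `location` appears only in the ACI example above.
    if location is not None:
        config["location"] = location

    return config


aci_config = build_deploy_config("aci", cpu=1, memory_gb=1, location="eastus2")
aks_config = build_deploy_config("aks", compute_target_name="aks-mlflow")
print(json.dumps(aci_config, indent=2))
print(json.dumps(aks_config, indent=2))
```

Serializing with `json.dumps` guarantees syntactically valid JSON, which avoids issues such as trailing commas in hand-written files.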

### 3.1 Configure the web service for ACI

In [None]:
webservice_name = "heart-classifier-aci"

print(f"Web service name: {webservice_name}")

To configure the hardware requirements of your deployment, you need to create a JSON file with the desired configuration:

In [None]:
deploy_config = {"computeType": "aci"}

Write the configuration to a file:

In [None]:
deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as outfile:
    json.dump(deploy_config, outfile)

### 3.2 Create the web service


First, let's create an MLflow deployment client for Azure Machine Learning:

In [None]:
from mlflow.deployments import get_deploy_client

deployment_client = get_deploy_client(mlflow.get_tracking_uri())

The method `create_deployment` allows you to create a simple deployment using the configuration indicated in the configuration file.

In [None]:
deployment = deployment_client.create_deployment(
    name=webservice_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)

Get the scoring URI from the web service:

In [None]:
scoring_uri = deployment_client.get_deployment(webservice_name)["scoringUri"]

## 4. Test the deployment

### 4.1 Create a sample request

The following code samples 5 observations from the training dataset, removes the `target` column (as the model will predict it), and keeps the result in the DataFrame `samples`, which the next steps use to build requests for the model deployment.

In [None]:
samples = (
    pd.read_csv("data/heart.csv")
    .sample(n=5)
    .drop(columns=["target"])
    .reset_index(drop=True)
)

### 4.2 Invoke it with the deployment client

In [None]:
deployment_client.predict(deployment_name=webservice_name, df=samples)

### 4.3 Invoke the web service with REST

Your inputs should be submitted inside a JSON payload containing a dictionary with the key `input_data`. The following is a valid example for the heart classifier model we have been working with, expressed as a JSON-serialized pandas DataFrame in the split orientation:

```json
{
    "input_data": {
        "columns": [
            "age", "sex", "trestbps", "chol", "fbs", "restecg", "thalach", "exang", "oldpeak", "slope", "ca", "thal"
        ],
        "index": [1],
        "data": [
            [1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 2]
        ]
    }
}
```

> Azure Machine Learning requires the key `input_data` to be added to the input examples that you want to provide to the service. Notice that this is not the case for the command `mlflow models serve`.
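
To make the payload layout concrete without pandas, the sketch below builds the same split-oriented structure from plain dictionaries. `to_split_payload` is a hypothetical helper, and it assumes every record contains the same keys as the first one.

```python
import json


def to_split_payload(records):
    """Wrap a list of dict records in the `input_data` split layout."""
    if not records:
        raise ValueError("At least one record is required")
    # Column order is taken from the first record.
    columns = list(records[0].keys())
    data = [[record[column] for column in columns] for record in records]
    return {
        "input_data": {
            "columns": columns,
            "index": list(range(len(records))),
            "data": data,
        }
    }


record = {
    "age": 1, "sex": 1, "trestbps": 145, "chol": 233, "fbs": 1, "restecg": 2,
    "thalach": 150, "exang": 0, "oldpeak": 2.3, "slope": 3, "ca": 0, "thal": 2,
}
payload = to_split_payload([record])
print(json.dumps(payload, indent=2))
```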

In [None]:
sample_request = {
    "input_data": json.loads(samples.to_json(orient="split", index=False))
}

Send a POST request to the endpoint:

In [None]:
headers = {
    "Content-Type": "application/json",
}

In [None]:
response = requests.post(scoring_uri, json=sample_request, headers=headers)
response.json()
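
If key-based authentication is enabled on the web service, the request also needs an `Authorization` header. The sketch below uses only the standard library and builds the request without sending it. `build_scoring_request`, the placeholder URI, and the placeholder key are hypothetical, and the `Bearer` scheme is an assumption you should confirm against your service's settings.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, payload, api_key=None):
    """Prepare a POST request for a scoring endpoint (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumed scheme: the service key is sent as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


# Placeholder URI and key; nothing is sent over the network here.
request = build_scoring_request(
    "http://localhost:5001/score",
    {"input_data": {"columns": ["age"], "data": [[63]]}},
    api_key="<SERVICE_KEY>",
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))

# To actually send it, something like the following could be used:
# with urllib.request.urlopen(request, timeout=30) as reply:
#     print(json.loads(reply.read()))
```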

## 5. Delete the resources

Once you are done, delete the created resources:

In [None]:
deployment_client.delete_deployment(webservice_name)