## Multimodal Classification Inference using Online Endpoints

This sample shows how to deploy `multimodal-classification` models to an online endpoint for inference.

### Task
`multimodal-classification` tasks assign label(s) or class(es) to an input that combines an image with other features, such as the tabular columns in the dataset used here. There are two common types of `multimodal-classification` tasks:

* MultiClass: Input features can be categorised into exactly one of `n` classes.
* MultiLabel: Input features can be categorised into more than one class.
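
The difference between the two task types shows up in how per-class scores are turned into predictions. The sketch below is illustrative only: the class names, the scores, and the 0.5 threshold are assumptions, not output from the `mmeft` model.

```python
def predict_multiclass(scores):
    """Return the single highest-scoring class."""
    return max(scores, key=scores.get)


def predict_multilabel(scores, threshold=0.5):
    """Return every class whose score reaches the threshold."""
    return sorted(label for label, score in scores.items() if score >= threshold)


# Hypothetical per-class scores for one input row.
multiclass_scores = {"class_a": 0.7, "class_b": 0.2, "class_c": 0.1}
multilabel_scores = {"class_a": 0.8, "class_b": 0.6, "class_c": 0.1}

print(predict_multiclass(multiclass_scores))
print(predict_multilabel(multilabel_scores))
```

A multi-class prediction is always a single label, while a multi-label prediction can be empty or contain several labels depending on the threshold.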
 
### Model
Models that can perform the `multimodal-classification` task are tagged with `multimodal-classification`. We will use the `mmeft` model in this notebook. If you opened this notebook from a specific model card, remember to replace the specific model name.

### Inference data
We will use the [AirBnb](https://automlresources-prod.azureedge.net/datasets/AirBnb.zip) dataset.


### Outline
1. Setup pre-requisites
2. Pick a model to deploy
3. Prepare data for inference
4. Deploy the model to an online endpoint for real time inference
5. Test the endpoint
6. Clean up resources - delete the online endpoint

### 1. Setup pre-requisites
* Install dependencies
* Connect to the AzureML workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace `<AML_WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` below.
* Connect to `azureml` system registry

In [None]:
from azure.ai.ml import MLClient
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
)
import time

try:
    credential = DefaultAzureCredential()
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    credential = InteractiveBrowserCredential()

try:
    workspace_ml_client = MLClient.from_config(credential)
    subscription_id = workspace_ml_client.subscription_id
    resource_group = workspace_ml_client.resource_group_name
    workspace_name = workspace_ml_client.workspace_name
except Exception as ex:
    print(ex)
    # Enter details of your AML workspace
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace_name = "<AML_WORKSPACE_NAME>"
workspace_ml_client = MLClient(
    credential, subscription_id, resource_group, workspace_name
)

# The models, fine tuning pipelines and environments are available in the AzureML system registry, "azureml"
registry_ml_client = MLClient(
    credential,
    subscription_id,
    resource_group,
    registry_name="azureml",
)
# Generating a unique timestamp that can be used for names and versions that need to be unique
timestamp = str(int(time.time()))

### 2. Pick a model to deploy

Browse models in the Model Catalog in the AzureML Studio, filtering by the `multimodal-classification` task. In this example, we use the `mmeft` model. If you have opened this notebook for a different model, replace the model name accordingly. This is a pre-trained model and may not give correct predictions for your dataset. We strongly recommend fine-tuning this model on a downstream task before using it for predictions and inference. Please refer to the [multi-class classification finetuning notebook](../../finetune/multimodal-classification/multiclass-classification/mmeft-airbnb-multiclass-classification.ipynb).

In [None]:
# Replace this with name of finetuned model registered in your workspace.
model_name = "mmeft"
foundation_models = registry_ml_client.models.list(name=model_name)
# Versions are strings, so compare them numerically (assumes integer versions such as "10")
foundation_model = max(foundation_models, key=lambda x: int(x.version))
print(
    f"\n\nUsing model name: {foundation_model.name}, version: {foundation_model.version}, id: {foundation_model.id} for inferencing"
)
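
Model versions are returned as strings, and string comparison ranks `"9"` above `"10"`. The sketch below shows one way to pick the latest version that also tolerates non-numeric version strings. The helper names `version_sort_key` and `latest_model` are hypothetical, and the `SimpleNamespace` objects merely stand in for the entries returned by `registry_ml_client.models.list(...)`.

```python
from types import SimpleNamespace


def version_sort_key(model):
    """Order numeric versions numerically; non-numeric versions sort first."""
    version = str(model.version)
    if version.isdigit():
        return (1, int(version), "")
    return (0, 0, version)


def latest_model(models):
    """Return the model with the highest version from an iterable of models."""
    return max(models, key=version_sort_key)


# Stand-in objects for illustration only.
candidates = [SimpleNamespace(name="mmeft", version=v) for v in ("2", "10", "9")]
print(latest_model(candidates).version)
```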

### 3. Prepare data for inference

We will use the [AirBnb](https://cvbp-secondary.z19.web.core.windows.net/datasets/multimodal_classification/AirBnb.zip) dataset for a multi-class classification task. It contains a `.csv` file with the features and the label, and the images are stored separately in the `room_images` folder. The label is stored in the `room_type` column.

Each row of the `.csv` file describes one sample, and its `picture_url` column holds the path of the corresponding image, relative to the dataset folder.

In [None]:
import os
import urllib.request
from zipfile import ZipFile

# Change to a different location if you prefer
dataset_parent_dir = "./data"

# Create the data folder if it doesn't exist.
os.makedirs(dataset_parent_dir, exist_ok=True)

# Download data
download_url = "https://automlresources-prod.azureedge.net/datasets/AirBnb.zip"

# Extract current dataset name from dataset url
dataset_name = os.path.split(download_url)[-1].split(".")[0]

# Get the data zip file path
data_file = os.path.join(dataset_parent_dir, f"{dataset_name}.zip")

# Download the dataset
urllib.request.urlretrieve(download_url, filename=data_file)

# Extract files
with ZipFile(data_file, "r") as zip_file:
    print("extracting files...")
    zip_file.extractall(path=dataset_parent_dir)
    print("done")
# Delete zip file
os.remove(data_file)

In [None]:
# Initialize dataset specific fields

dataset_dir = os.path.join(dataset_parent_dir, dataset_name)
input_csv_file_path = os.path.join(dataset_dir, "airbnb_multiclass_dataset.csv")

image_column_name = "picture_url"

In [None]:
import pandas as pd

# Read two sample rows from the dataset
df = pd.read_csv(input_csv_file_path, nrows=2)
print("Sample rows\n")
print(df.head())
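
Because the images live outside the `.csv` file, it can help to confirm that every referenced image exists before building requests. The following is a minimal sketch, assuming the image column holds paths relative to the dataset folder and that the file is UTF-8 encoded; `find_missing_images` is a hypothetical helper, not part of the AzureML SDK.

```python
import csv
import os


def find_missing_images(csv_path, dataset_dir, image_column="picture_url"):
    """Return image paths listed in the CSV that do not exist under dataset_dir."""
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        for row in csv.DictReader(csv_file):
            image_path = row[image_column]
            if not os.path.isfile(os.path.join(dataset_dir, image_path)):
                missing.append(image_path)
    return missing


# Example call with the variables defined in this notebook:
# print(find_missing_images(input_csv_file_path, dataset_dir, image_column_name))
```

An empty list means every row points at an image that is present on disk.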

### 4. Deploy the model to an online endpoint for real time inference
Online endpoints give a durable REST API that can be used to integrate with applications that need to use the model.

In [None]:
import time
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment

# Endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name
timestamp = int(time.time())
online_endpoint_name = "multimodal-classif-" + str(timestamp)
# Create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="Online endpoint for "
    + foundation_model.name
    + ", for multimodal-classification task",
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()

In [None]:
from azure.ai.ml.entities import OnlineRequestSettings, ProbeSettings

# Deployment names must be lowercase
deployment_name = "multimodal-classif-mlflow-deploy"

print(foundation_model.id)
print(online_endpoint_name)
print(deployment_name)

# Create a deployment
demo_deployment = ManagedOnlineDeployment(
    name=deployment_name,
    endpoint_name=online_endpoint_name,
    model=foundation_model.id,
    instance_type="Standard_DS3_V2",  # Use GPU instance type like Standard_NC6s_v3 for faster inference
    instance_count=1,
    request_settings=OnlineRequestSettings(
        max_concurrent_requests_per_instance=1,
        request_timeout_ms=90000,
        max_queue_wait_ms=500,
    ),
    liveness_probe=ProbeSettings(
        failure_threshold=49,
        success_threshold=1,
        timeout=299,
        period=180,
        initial_delay=180,
    ),
    readiness_probe=ProbeSettings(
        failure_threshold=10,
        success_threshold=1,
        timeout=10,
        period=10,
        initial_delay=10,
    ),
)
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
endpoint.traffic = {deployment_name: 100}
workspace_ml_client.begin_create_or_update(endpoint).result()
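
If the deployment fails or stays unhealthy, the container logs are usually the first place to look. The sketch below wraps the SDK's `online_deployments.get_logs` call in a hypothetical helper, `tail_deployment_logs`; the default of 50 lines is an arbitrary choice.

```python
def tail_deployment_logs(ml_client, endpoint_name, deployment_name, lines=50):
    """Fetch the last `lines` lines of container logs for a deployment."""
    return ml_client.online_deployments.get_logs(
        name=deployment_name,
        endpoint_name=endpoint_name,
        lines=lines,
    )


# Example call in this notebook (requires a live workspace connection):
# print(tail_deployment_logs(workspace_ml_client, online_endpoint_name, deployment_name))
```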

### 5. Test the endpoint

We will fetch some sample data from the dataset and submit it to the online endpoint for inference.

In [None]:
demo_deployment = workspace_ml_client.online_deployments.get(
    name=deployment_name,
    endpoint_name=online_endpoint_name,
)

# Get the details for online endpoint
endpoint = workspace_ml_client.online_endpoints.get(name=online_endpoint_name)

# Existing traffic details
print(endpoint.traffic)

# Get the scoring URI
print(endpoint.scoring_uri)
print(demo_deployment)

In [None]:
import base64
import json


def image_to_str(img_path) -> str:
    with open(os.path.join(dataset_dir, img_path), "rb") as f:
        encoded_image = base64.encodebytes(f.read()).decode("utf-8")
        return encoded_image


df_sample = pd.read_csv(input_csv_file_path, nrows=2)

# The image can be passed either as an AzureML URL pointing to a data asset or as a base64-encoded string.
# Here, we pass it as a base64-encoded string.
df_sample[image_column_name] = df_sample.apply(
    lambda x: image_to_str(x[image_column_name]), axis=1
)

request_json = {
    "input_data": {
        "columns": df_sample.columns.values.tolist(),
        "data": df_sample.values.tolist(),
    }
}

# Write the request payload to a JSON file
request_file_name = "sample_request_data.json"
with open(request_file_name, "w") as request_file:
    json.dump(request_json, request_file)

In [None]:
# Score the sample_request_data.json file using the online endpoint's invoke method
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name=demo_deployment.name,
    request_file=request_file_name,
)
print(f"raw response: {response}\n")

In [None]:
response
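
Applications that do not use the SDK can call the same endpoint over REST. The sketch below builds such a request with the standard library, assuming key-based authentication as configured above (`auth_mode="key"`); `build_scoring_request` is a hypothetical helper. The optional `azureml-model-deployment` header routes the call to a specific deployment instead of following the endpoint's traffic split.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, payload, deployment_name=None):
    """Build a POST request for a key-authenticated online endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        # Target one deployment rather than the endpoint's traffic split.
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


# To send it (requires a live endpoint and its key):
# keys = workspace_ml_client.online_endpoints.get_keys(name=online_endpoint_name)
# request = build_scoring_request(endpoint.scoring_uri, keys.primary_key, request_json)
# with urllib.request.urlopen(request) as http_response:
#     print(http_response.read().decode("utf-8"))
```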

### 6. Clean up resources - delete the online endpoint
Don't forget to delete the online endpoint, else you will leave the billing meter running for the compute used by the endpoint.

In [None]:
workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()