## Multimodal Classification Inference using Batch Endpoints

This sample shows how to deploy `multimodal-classification` type models to a batch endpoint for inference.

### Task
`multimodal-classification` tasks assign label(s) or class(es) to an input sample, which here combines an image with tabular features. There are two common types of `multimodal-classification` tasks:

* MultiClass: Input features can be categorised into exactly one of `n` classes.
* MultiLabel: Input features can be categorised into more than one class.
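
To make the distinction concrete, the following minimal sketch encodes a target for each task type. The class names and the helper `encode_multilabel` are illustrative assumptions; they are not part of the dataset or of any model API.

```python
# Hypothetical class names, for illustration only.
classes = ["class_a", "class_b", "class_c"]

# MultiClass: each sample maps to exactly one class index.
multiclass_target = classes.index("class_b")


def encode_multilabel(labels, classes):
    """Return a multi-hot vector with 1 for every class present in `labels`."""
    present = set(labels)
    return [1 if c in present else 0 for c in classes]


# MultiLabel: each sample may map to several classes at once.
multilabel_target = encode_multilabel(["class_a", "class_c"], classes)

print(multiclass_target)
print(multilabel_target)
```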
 
### Model
Models that can perform the `multimodal-classification` task are tagged with `multimodal-classification`. We will use the `mmeft` model in this notebook. If you opened this notebook from a specific model card, remember to replace the specific model name.

### Inference data
We will use the [AirBnb](https://automlresources-prod.azureedge.net/datasets/AirBnb.zip) dataset.


### Outline
1. Setup pre-requisites
2. Pick a model to deploy
3. Prepare data for inference
4. Deploy the model to a batch endpoint for batch inference
5. Test the endpoint
6. Clean up resources - delete the batch endpoint

### 1. Setup pre-requisites
* Install dependencies
* Connect to AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace `<AML_WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` below.
* Connect to `azureml` system registry

In [None]:
from azure.ai.ml import MLClient, Input
from azure.ai.ml.constants import AssetTypes
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
)
from azure.ai.ml.entities import AmlCompute
import time

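try:
    # Prefer the default credential chain and check that it can obtain a token
    credential = DefaultAzureCredential()
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to an interactive browser login if the default chain fails
    credential = InteractiveBrowserCredential()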

try:
    workspace_ml_client = MLClient.from_config(credential)
    subscription_id = workspace_ml_client.subscription_id
    resource_group = workspace_ml_client.resource_group_name
    workspace_name = workspace_ml_client.workspace_name
except Exception as ex:
    print(ex)
    # Enter details of your AML workspace
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace_name = "<AML_WORKSPACE_NAME>"
workspace_ml_client = MLClient(
    credential, subscription_id, resource_group, workspace_name
)

# The models, fine tuning pipelines and environments are available in the AzureML system registry, "azureml"
registry_ml_client = MLClient(
    credential,
    subscription_id,
    resource_group,
    registry_name="azureml",
)
# Generating a unique timestamp that can be used for names and versions that need to be unique
timestamp = str(int(time.time()))
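
`MLClient.from_config` reads the workspace details from a `config.json` file. As a minimal sketch, assuming the three keys below with placeholder values that you must replace, such a file could be written like this:

```python
import json
import os
import tempfile

# Placeholder values; replace them with your own workspace details.
workspace_config = {
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group": "<RESOURCE_GROUP>",
    "workspace_name": "<AML_WORKSPACE_NAME>",
}

# Written to a temporary folder for this sketch; in practice the file would
# sit in the notebook's working directory or one of its parent directories.
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "config.json")
with open(config_path, "w") as f:
    json.dump(workspace_config, f, indent=2)

with open(config_path) as f:
    print(f.read())
```

If no such file is found, the cell above falls back to the values you enter by hand.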

#### Create a compute cluster
Use the model card from the AzureML system registry to check the minimum required inferencing SKU, referenced as `size` below. If you already have a sufficient compute cluster, you can simply set its name in `compute_name` in the following code block.

In [None]:
from azure.ai.ml.entities import AmlCompute
from azure.core.exceptions import ResourceNotFoundError

compute_name = "cpu-cluster"

try:
    _ = workspace_ml_client.compute.get(compute_name)
    print("Found existing compute target.")
except ResourceNotFoundError:
    print("Creating a new compute target...")
    compute_config = AmlCompute(
        name=compute_name,
        description="An AML compute cluster",
        size="Standard_DS3_V2",
        min_instances=0,
        max_instances=3,
        idle_time_before_scale_down=120,
    )
    workspace_ml_client.begin_create_or_update(compute_config).result()

### 2. Pick a model to deploy

Browse models in the Model Catalog in AzureML Studio, filtering by the `multimodal-classification` task. In this example, we use the `mmeft` model. If you have opened this notebook for a different model, replace the model name accordingly. This is a pre-trained model and may not give correct predictions for your dataset. We strongly recommend fine-tuning this model on a downstream task before using it for predictions and inference. Please refer to the [multi-class classification finetuning notebook](../../finetune/multimodal-classification/multiclass-classification/mmeft-airbnb-multiclass-classification.ipynb).

In [None]:
# Replace this with name of finetuned model registered in your workspace.
model_name = "mmeft"
foundation_models = registry_ml_client.models.list(name=model_name)
# Versions are strings; compare them numerically (assumes purely numeric versions)
foundation_model = max(foundation_models, key=lambda x: int(x.version))
print(
    f"\n\nUsing model name: {foundation_model.name}, version: {foundation_model.version}, id: {foundation_model.id} for inferencing"
)
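
Model versions are strings, so comparing them directly orders them lexicographically rather than numerically. The sketch below uses made-up version strings (not real registry output) to show why a numeric key is safer when versions are purely numeric.

```python
# Stand-in version strings; a real registry returns model objects with a
# string `version` attribute.
versions = ["2", "9", "10"]

latest_as_text = max(versions)  # lexicographic comparison
latest_as_number = max(versions, key=int)  # numeric comparison

print(latest_as_text)
print(latest_as_number)
```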

### 3. Prepare data for inference

We will use the [AirBnb](https://cvbp-secondary.z19.web.core.windows.net/datasets/multimodal_classification/AirBnb.zip) dataset for a multi-class (single-label) classification task. It has a `.csv` file with the features and the label. The images are stored separately in the `room_images` folder. The column that stores the label is `room_type`.

Each row of the `.csv` file holds the features and label for one sample, and the `picture_url` column stores the path of the corresponding image relative to the dataset folder.
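
As a rough illustration of the `.csv` layout, the sketch below builds a two-row stand-in for the file. Only `room_type` and `picture_url` are column names used in this notebook; the `price` column, the file names and all values are made up.

```python
import csv
import io

# Stand-in CSV content; the real file has more columns and rows.
sample_csv = io.StringIO(
    "price,picture_url,room_type\n"
    "120,room_images/0001.jpg,Entire home/apt\n"
    "45,room_images/0002.jpg,Private room\n"
)

rows = list(csv.DictReader(sample_csv))

# Features are every column except the label column.
label_column = "room_type"
features = [{k: v for k, v in row.items() if k != label_column} for row in rows]
labels = [row[label_column] for row in rows]

print(features[0])
print(labels)
```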

In [None]:
import os
import urllib.request
from zipfile import ZipFile

# Change to a different location if you prefer
dataset_parent_dir = "./data"

# Create data folder if it doesn't exist.
os.makedirs(dataset_parent_dir, exist_ok=True)

# Download data
download_url = "https://automlresources-prod.azureedge.net/datasets/AirBnb.zip"

# Extract current dataset name from dataset url
dataset_name = os.path.split(download_url)[-1].split(".")[0]

# Get the data zip file path
data_file = os.path.join(dataset_parent_dir, f"{dataset_name}.zip")

# Download the dataset
urllib.request.urlretrieve(download_url, filename=data_file)

# Extract files
with ZipFile(data_file, "r") as zip_file:
    print("extracting files...")
    zip_file.extractall(path=dataset_parent_dir)
    print("done")
# Delete zip file
os.remove(data_file)

In [None]:
# Initialize dataset specific fields

dataset_dir = os.path.join(dataset_parent_dir, dataset_name)
input_csv_file_path = os.path.join(dataset_dir, "airbnb_multiclass_dataset.csv")

image_column_name = "picture_url"

In [None]:
import pandas as pd

# Read a sample row from dataset
df = pd.read_csv(input_csv_file_path, nrows=2)
print("Sample row\n")
print(df.head())

### 4. Deploy the model to a batch endpoint
Batch endpoints are endpoints that are used to do batch inferencing on large volumes of data over a period of time. The endpoints receive pointers to data and run jobs asynchronously to process the data in parallel on compute clusters. Batch endpoints store outputs to a data store for further analysis. For more information on batch endpoints and deployments see [What are batch endpoints?](https://learn.microsoft.com/en-us/azure/machine-learning/concept-endpoints?view=azureml-api-2#what-are-batch-endpoints)

* Create a batch endpoint.
* Create a batch deployment.
* Set the deployment as default; doing so allows invoking the endpoint without specifying the deployment's name.

#### Create a batch endpoint

In [None]:
from azure.ai.ml.entities import (
    BatchEndpoint,
    BatchDeployment,
    BatchRetrySettings,
    AmlCompute,
)

# Endpoint names need to be unique in a region, hence using the timestamp to create a unique endpoint name
endpoint_name = "multimodal-multiclass-classif-" + timestamp
# Create a batch endpoint
endpoint = BatchEndpoint(
    name=endpoint_name,
    description="Batch endpoint for "
    + foundation_model.name
    + ", for multimodal-multiclass-classification task",
)
workspace_ml_client.begin_create_or_update(endpoint).result()

#### Create a batch deployment

In [None]:
deployment_name = "demo"

deployment = BatchDeployment(
    name=deployment_name,
    endpoint_name=endpoint_name,
    model=foundation_model.id,
    compute=compute_name,
    error_threshold=0,
    instance_count=1,
    logging_level="info",
    max_concurrency_per_instance=1,
    mini_batch_size=2,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=600),
)
workspace_ml_client.begin_create_or_update(deployment).result()

#### Set the deployment as default

In [None]:
endpoint = workspace_ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment_name
workspace_ml_client.begin_create_or_update(endpoint).result()

endpoint = workspace_ml_client.batch_endpoints.get(endpoint_name)
print(f"The default deployment is {endpoint.defaults.deployment_name}")

### 5. Test the endpoint - Using CSV input with base64 images from step 3

Invoke the batch endpoint with the input parameter pointing to the csv file containing the batch inference input. This creates a pipeline job using the default deployment in the endpoint. Wait for the job to complete.

First, convert each image referenced in the input CSV to a base64-encoded string and write a new CSV for the batch endpoint.

In [None]:
# convert the image to base64 encoded string for batch inferencing
import base64


def image_to_str(img_path) -> str:
    with open(os.path.join(dataset_dir, img_path), "rb") as f:
        encoded_image = base64.encodebytes(f.read()).decode("utf-8")
        return encoded_image


df_sample = pd.read_csv(input_csv_file_path, nrows=2)

# We can pass the image either as an AzureML URL to a data asset or as a base64 encoded string.
# Here, we will be passing base64 encoded string.
df_sample[image_column_name] = df_sample.apply(
    lambda x: image_to_str(x[image_column_name]), axis=1
)

batch_input_csv_file_path = os.path.join(dataset_dir, "batch_input.csv")
# dump the dataframe to csv
df_sample.to_csv(batch_input_csv_file_path, index=False)
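
`base64.encodebytes` inserts a newline after every 76 characters of output, whereas `base64.b64encode` returns a single line. The following sketch, which uses made-up bytes in place of an image, shows that both forms decode to the same data.

```python
import base64

# Stand-in for image bytes read from disk.
raw_bytes = bytes(range(256))

# encodebytes wraps its output in lines; b64encode does not.
wrapped = base64.encodebytes(raw_bytes).decode("utf-8")
single_line = base64.b64encode(raw_bytes).decode("utf-8")

# Both forms decode back to the original bytes.
decoded_wrapped = base64.b64decode(wrapped)
decoded_single_line = base64.b64decode(single_line)

print("\n" in wrapped, "\n" in single_line)
```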

Invoke the batch endpoint with the prepared input CSV.

In [None]:
job = None
# Named batch_input to avoid shadowing the built-in input()
batch_input = Input(path=batch_input_csv_file_path, type=AssetTypes.URI_FILE)
num_retries = 3
for i in range(num_retries):
    try:
        job = workspace_ml_client.batch_endpoints.invoke(
            endpoint_name=endpoint.name, input=batch_input
        )
        break
    except Exception as e:
        if i == num_retries - 1:
            raise e
        else:
            print("Endpoint invocation failed. Retrying after 5 seconds...")
            time.sleep(5)
if job is not None:
    workspace_ml_client.jobs.stream(job.name)
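
The retry loop above can be factored into a small reusable helper. This is a minimal sketch rather than part of the SDK: `call_with_retries`, its parameters and the `flaky` stand-in function are all hypothetical names introduced for illustration.

```python
import time


def call_with_retries(func, num_retries=3, delay_seconds=5, sleep=time.sleep):
    """Call `func` until it succeeds or `num_retries` attempts have failed."""
    for attempt in range(num_retries):
        try:
            return func()
        except Exception:
            if attempt == num_retries - 1:
                raise  # Out of attempts: surface the last error.
            print(f"Attempt {attempt + 1} failed. Retrying after {delay_seconds} seconds...")
            sleep(delay_seconds)


# Usage with a stand-in function that fails twice before succeeding.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = call_with_retries(flaky, delay_seconds=0)
print(result)
```

In this notebook, `func` could be a lambda wrapping the `batch_endpoints.invoke` call.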

Get the scoring results from the batch endpoint.

In [None]:
scoring_job = list(workspace_ml_client.jobs.list(parent_job_name=job.name))[0]

workspace_ml_client.jobs.download(
    name=scoring_job.name,
    download_path=os.path.join(dataset_parent_dir, "csv-output"),
    output_name="score",
)

predictions_file = os.path.join(
    dataset_parent_dir, "csv-output", "named-outputs", "score", "predictions.csv"
)

# Load the batch predictions file with no headers into a dataframe and set your column names
score_df = pd.read_csv(
    predictions_file,
    header=None,
    names=["row_number_per_file", "preds", "labels", "file_name"],
)
score_df.head()

### 6. Clean up resources - delete the endpoint
Batch endpoints use compute resources only when jobs are submitted. You can keep the batch endpoint for your reference without worrying about compute bills, or choose to delete the endpoint. If you created your compute cluster with zero minimum instances and a short idle time before scale-down, you won't be charged for unused compute.

In [None]:
workspace_ml_client.batch_endpoints.begin_delete(name=endpoint_name).result()