# Using batch deployments for image file processing

Importing the required libraries. This notebook requires:

- `azure-ai-ml`
- `mlflow`
- `azureml-mlflow`
- `numpy`
- `pandas`
- `pillow`
- `tensorflow`
- `tensorflow-hub`

In [None]:
from azure.ai.ml import MLClient, Input
from azure.ai.ml.entities import (
    BatchEndpoint,
    BatchDeployment,
    Model,
    AmlCompute,
    Data,
    BatchRetrySettings,
    CodeConfiguration,
    Environment,
)
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.identity import DefaultAzureCredential

## Accessing the Azure Machine Learning workspace

In [None]:
subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

## About the model

Let's review how the model is built. The model was built using TensorFlow along with the ResNet architecture ([Identity Mappings in Deep Residual Networks](https://arxiv.org/abs/1603.05027)). This model has the following constraints that are important to keep in mind for deployment:

* It works with images of size 244x244 (tensors of shape `(244, 244, 3)`).
* It requires inputs to be scaled to the range `[0,1]`.
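
The two constraints above can be sketched with NumPy alone. This is a minimal illustration rather than the notebook's actual pipeline: `to_model_input` is a hypothetical helper, and its nearest-neighbor resize is only a dependency-free stand-in for a proper image resize such as `PIL.Image.resize`.

```python
import numpy as np

TARGET_SIZE = (244, 244)  # (height, width) used throughout this notebook


def to_model_input(image: np.ndarray, size=TARGET_SIZE) -> np.ndarray:
    """Resize an RGB uint8 image, scale it to [0, 1], and add a batch axis.

    Hypothetical helper: the nearest-neighbor resize below is an assumption
    made to keep the sketch free of imaging libraries.
    """
    height, width = image.shape[:2]
    rows = np.arange(size[0]) * height // size[0]  # source row per target row
    cols = np.arange(size[1]) * width // size[1]  # source column per target column
    resized = image[rows][:, cols]
    scaled = resized.astype(np.float32) / 255.0
    return scaled[np.newaxis, ...]


# A synthetic 300x400 RGB image stands in for a real photo.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
batch = to_model_input(fake_image)
print(batch.shape, batch.dtype)
```

The result has the batch shape the model expects, with every value between 0 and 1.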

In [None]:
import tensorflow_hub as hub
import tensorflow as tf

model = tf.keras.Sequential(
    [
        hub.KerasLayer(
            "https://tfhub.dev/google/imagenet/resnet_v2_101/classification/5"
        ),
    ]
)
model.build([None, None, None, 3])

Testing if the model works:

In [None]:
import PIL.Image as Image
import numpy as np

image_file = tf.keras.utils.get_file(
    "image.jpeg",
    "https://azuremlexampledata.blob.core.windows.net/data/imagenet/goldfish.JPEG",
)
img = Image.open(image_file).resize((244, 244))
img = np.array(img) / 255.0
batch_img = tf.expand_dims(img, axis=0)

Run the model:

In [None]:
pred = model.predict(batch_img)
pred_class = pred.argmax(axis=-1)

Getting the labels for ImageNet:

In [None]:
labels_path = tf.keras.utils.get_file(
    "ImageNetLabels.txt",
    "https://azuremlexampledata.blob.core.windows.net/data/imagenet/ImageNetLabels.txt",
)
imagenet_labels = np.array(open(labels_path).read().splitlines())

In [None]:
predicted_class_name = [
    imagenet_labels[predicted_class] for predicted_class in pred_class
]
predicted_class_name

Let's save this model locally:

In [None]:
model_local_path = "imagenet-classifier/model"
model.save(model_local_path)

## Deploying the model in a batch endpoint

### Registering the model

We need to register the model in order to use it with Azure Machine Learning:

In [None]:
model_name = "imagenet-classifier"

In [None]:
if not any(filter(lambda m: m.name == model_name, ml_client.models.list())):
    print(f"Model {model_name} is not registered. Creating...")
    model = ml_client.models.create_or_update(
        Model(name=model_name, path=model_local_path, type=AssetTypes.CUSTOM_MODEL)
    )

Let's get a reference to the model:

In [None]:
model = ml_client.models.get(name=model_name, label="latest")

### Creating a scoring script to work with the model

In [None]:
%%writefile imagenet-classifier/code/imagenet_scorer.py

import os
import numpy as np
import pandas as pd
import tensorflow as tf
from os.path import basename
from PIL import Image
from tensorflow.keras.models import load_model


def init():
    global model
    global input_width
    global input_height

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # load the model
    model = load_model(model_path)
    input_width = 244
    input_height = 244

def run(mini_batch):
    results = []

    for image in mini_batch:
        data = Image.open(image).resize((input_width, input_height)) # Read and resize the image
        data = np.array(data)/255.0 # Normalize
        data_batch = tf.expand_dims(data, axis=0) # create a batch of size (1, 244, 244, 3)

        # perform inference
        pred = model.predict(data_batch)

        # Compute probabilities, classes and labels
        pred_prob = tf.math.reduce_max(tf.math.softmax(pred, axis=-1)).numpy()
        pred_class = tf.math.argmax(pred, axis=-1).numpy()

        results.append([basename(image), pred_class[0], pred_prob])

    return pd.DataFrame(results)

### Creating the endpoint

First, let's create the endpoint that is going to host the batch deployments. Each endpoint can host multiple deployments at any time, but only one of them is the default:

In [None]:
endpoint_name = "imagenet-classifier-batch"
endpoint = BatchEndpoint(
    name=endpoint_name,
    description="A batch service to perform imagenet image classification",
)

In [None]:
ml_client.batch_endpoints.begin_create_or_update(endpoint)

### Creating the compute

Batch endpoints can run on any Azure Machine Learning compute that already exists in the workspace. That means that multiple batch deployments can share the same compute infrastructure. In this example, we are going to work on an Azure Machine Learning compute cluster called `cpu-cluster`. Let's verify that the compute exists in the workspace, and create it otherwise.

In [None]:
compute_name = "cpu-cluster"
if not any(filter(lambda m: m.name == compute_name, ml_client.compute.list())):
    print(f"Compute {compute_name} is not created. Creating...")
    compute_cluster = AmlCompute(
        name=compute_name, description="amlcompute", min_instances=0, max_instances=5
    )
    ml_client.begin_create_or_update(compute_cluster)

Compute may take time to be created. Let's wait for it:

In [None]:
from time import sleep

print("Waiting for compute", end="")
while ml_client.compute.get(name=compute_name).provisioning_state == "Creating":
    sleep(1)
    print(".", end="")

print(" [DONE]")
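
The loop above polls until the state leaves `Creating`, but it never gives up. Below is a sketch of the same idea with a timeout. `wait_for_state` and `fake_state` are hypothetical names, and the `Succeeded` state string is used for illustration only.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    pending_state: str = "Creating",
    poll_seconds: float = 1.0,
    timeout_seconds: float = 600.0,
) -> str:
    """Poll get_state() until it stops returning pending_state or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    state = get_state()
    while state == pending_state:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {pending_state!r} after {timeout_seconds}s")
        time.sleep(poll_seconds)
        state = get_state()
    return state


# Stand-in for ml_client.compute.get(name=compute_name).provisioning_state
_states = iter(["Creating", "Creating", "Succeeded"])


def fake_state() -> str:
    return next(_states)


final_state = wait_for_state(fake_state, poll_seconds=0.01)
print(final_state)
```

Against a real workspace, the callable could be `lambda: ml_client.compute.get(name=compute_name).provisioning_state`.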

### Creating the environment

Let's create the environment. Our model runs on TensorFlow, and Azure Machine Learning already provides a base inference image with the required software installed, so we can reuse that image and add any extra packages through a conda file.

In [None]:
environment = Environment(
    conda_file="./imagenet-classifier/environment/conda.yml",
    image="mcr.microsoft.com/azureml/tensorflow-2.4-ubuntu18.04-py37-cpu-inference:latest",
)

### Creating the deployment

Let's create a deployment under the given endpoint.

In [None]:
deployment = BatchDeployment(
    name="imagenet-classifier-resnetv2",
    description="A ResNetV2 model architecture for performing ImageNet classification in batch",
    endpoint_name=endpoint.name,
    model=model,
    environment=environment,
    code_configuration=CodeConfiguration(
        code="./imagenet-classifier/code/",
        scoring_script="imagenet_scorer.py",
    ),
    compute=compute_name,
    instance_count=2,
    max_concurrency_per_instance=1,
    mini_batch_size=10,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
    logging_level="info",
)

In [None]:
ml_client.batch_deployments.begin_create_or_update(deployment)
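
To get a feel for `mini_batch_size` and `instance_count`, the sketch below splits a list of file names into mini-batches of the configured size. It is an illustration only: `make_mini_batches` is a hypothetical helper, the file names are made up, and the service assigns mini-batches to workers itself rather than through code like this.

```python
from typing import List


def make_mini_batches(files: List[str], mini_batch_size: int) -> List[List[str]]:
    """Split files into consecutive chunks of at most mini_batch_size items."""
    return [
        files[start : start + mini_batch_size]
        for start in range(0, len(files), mini_batch_size)
    ]


files = [f"image_{i:04d}.JPEG" for i in range(1000)]  # made-up file names
mini_batches = make_mini_batches(files, mini_batch_size=10)

instance_count = 2
max_concurrency_per_instance = 1
workers = instance_count * max_concurrency_per_instance

print(len(mini_batches))  # number of run() calls needed
print(workers)  # run() calls that can execute at the same time
```

With 1000 files and the settings above, `run` would be called 100 times, at most two calls at a time.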

### Setting the default deployment

Let's update the default deployment name in the endpoint:

In [None]:
endpoint = ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint)

We can see the endpoint URL as follows:

In [None]:
endpoint.scoring_uri

## Testing the endpoint

Once the deployment is created, it is ready to receive jobs. Let's first register a data asset so we can run the job against it. This data asset is a folder containing 1000 images from the original ImageNet dataset. We are going to download it first and then create the data asset:

In [None]:
!wget https://azuremlexampledata.blob.core.windows.net/data/imagenet-1000.zip
!unzip imagenet-1000.zip -d /tmp/imagenet-1000

In [None]:
data_path = "/tmp/imagenet-1000"
dataset_name = "imagenet-sample-unlabeled"

imagenet_sample = Data(
    path=data_path,
    type=AssetTypes.URI_FOLDER,
    description="A sample of 1000 images from the original ImageNet dataset",
    name=dataset_name,
)

In [None]:
ml_client.data.create_or_update(imagenet_sample)

Let's get a reference to the new data asset:

In [None]:
imagenet_sample = ml_client.data.get(name=dataset_name, label="latest")

Let's use this data as an input for the job:

In [None]:
input = Input(type=AssetTypes.URI_FOLDER, path=imagenet_sample.id)

In [None]:
job = ml_client.batch_endpoints.invoke(endpoint_name=endpoint.name, input=input)

You can use the returned job object to check the status of the job:

In [None]:
ml_client.jobs.get(job.name)

## High throughput deployments

In [None]:
%%writefile imagenet-classifier/code/imagenet_scorer_batch.py

import os
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model


def init():
    global model
    global input_width
    global input_height

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # load the model
    model = load_model(model_path)
    input_width = 244
    input_height = 244

def decode_img(file_path):
    file = tf.io.read_file(file_path)
    img = tf.io.decode_jpeg(file, channels=3)
    img = tf.image.resize(img, [input_width, input_height])
    return img/255.

def run(mini_batch):
    images_ds = tf.data.Dataset.from_tensor_slices(mini_batch)
    images_ds = images_ds.map(decode_img).batch(64)

    # perform inference
    pred = model.predict(images_ds)

    # Compute probabilities and classes, one value per image in the mini-batch
    pred_prob = tf.math.reduce_max(tf.math.softmax(pred, axis=-1), axis=-1).numpy()
    pred_class = tf.math.argmax(pred, axis=-1).numpy()

    # Column order matches the first scoring script: file, class, probability
    return pd.DataFrame(
        {"file": mini_batch, "class": pred_class, "probability": pred_prob}
    )
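
When `run` scores a whole mini-batch at once, the prediction is a 2-D array with one row of logits per image, so the maximum probability has to be taken per row. Here is a NumPy sketch with made-up logits. The `softmax` helper is written for illustration and is not part of the scoring script.

```python
import numpy as np


def softmax(logits: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Made-up logits for 3 images and 4 classes.
logits = np.array(
    [
        [2.0, 0.5, 0.1, 0.0],
        [0.0, 0.0, 3.0, 0.0],
        [1.0, 1.0, 1.0, 1.0],
    ]
)

probs = softmax(logits, axis=-1)
pred_prob = probs.max(axis=-1)  # one probability per image
pred_class = probs.argmax(axis=-1)  # one class index per image

print(pred_class)
print(pred_prob.shape)
```

Reducing without an axis would instead collapse the whole batch into a single number.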

Let's use this new scoring script in a second deployment:

In [None]:
ht_deployment = BatchDeployment(
    name="imagenet-classifier-resnetv2-ht",
    description="A ResNetV2 model architecture for performing ImageNet classification in batch (High throughput)",
    endpoint_name=endpoint.name,
    model=model,
    environment="AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference@latest",
    code_configuration=CodeConfiguration(
        code="./imagenet-classifier/code/",
        scoring_script="imagenet_scorer_batch.py",
    ),
    compute=compute_name,
    instance_count=2,
    max_concurrency_per_instance=1,
    mini_batch_size=10,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
    logging_level="info",
)

In [None]:
ml_client.batch_deployments.begin_create_or_update(ht_deployment)

Let's execute this specific deployment now:

In [None]:
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name, deployment_name=ht_deployment.name, input=input
)

## Exploring the results

In [None]:
ml_client.jobs.download(name=job.name, download_path=".", output_name="score")

In [None]:
import pandas as pd

score = pd.read_csv(
    "named-outputs/score/predictions.csv",
    header=None,
    names=["file", "class", "probabilities"],
    sep=" ",
)
score["label"] = score["class"].apply(lambda pred: imagenet_labels[pred])
score
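
The parsing step can be tried without running a job. The sketch below reads a few space-separated rows from an in-memory string, in the `file class probability` layout that the cell above expects. The rows and the short label list are made up for illustration.

```python
import io

import pandas as pd

# Made-up sample rows: file name, class index, probability.
sample_predictions = """n01443537_1.JPEG 2 0.97
n01530575_7.JPEG 11 0.88
n01443537_9.JPEG 2 0.64
"""

# Made-up stand-in for the ImageNet label list, indexed by class id.
labels = [f"label_{i}" for i in range(20)]

score = pd.read_csv(
    io.StringIO(sample_predictions),
    header=None,
    names=["file", "class", "probabilities"],
    sep=" ",
)
score["label"] = score["class"].apply(lambda pred: labels[pred])
print(score)
```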

## Using MLflow with images

When working with MLflow models that process images, it is important to account for all the preprocessing your model requires. Here, resizing and rescaling are built into the model itself as Keras layers:

In [None]:
import tensorflow_hub as hub
import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.layers.Resizing(
            244, 244, interpolation="bilinear", crop_to_aspect_ratio=False
        ),
        tf.keras.layers.Rescaling(1 / 255.0),
        hub.KerasLayer(
            "https://tfhub.dev/google/imagenet/resnet_v2_101/classification/5"
        ),
        tf.keras.layers.Softmax(axis=-1),
    ]
)
model.build([None, None, None, 3])

Let's save this model in a local folder:

In [None]:
model_local_path = "imagenet-classifier/model"
model.save(model_local_path)

We are going to include the labels for the predicted class in the directory so we can use them for inference:

In [None]:
!wget https://azuremlexampledata.blob.core.windows.net/data/imagenet/ImageNetLabels.txt -P imagenet-classifier/model

Let's create a custom loader for the MLflow model:

In [None]:
%%writefile imagenet-classifier-mlflow/code/module_loader.py

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model


class TfClassifier:
    def __init__(self, model_path: str, labels_path: str):
        self.model = load_model(model_path)
        with open(labels_path) as labels_file:
            self.imagenet_labels = np.array(labels_file.read().splitlines())

    def predict(self, data):
        preds = self.model.predict(data)

        # The model ends with a softmax layer, so preds already holds probabilities
        pred_prob = tf.reduce_max(preds, axis=-1).numpy()
        pred_class = tf.argmax(preds, axis=-1).numpy()
        pred_label = self.imagenet_labels[pred_class]

        return pd.DataFrame(
            {"class": pred_class, "probability": pred_prob, "label": pred_label}
        )

def _load_pyfunc(data_path: str):
    import os

    model_path = os.path.abspath(data_path)
    labels_path = os.path.join(model_path, 'ImageNetLabels.txt')

    return TfClassifier(model_path, labels_path)

Indicating a signature for your model:

In [None]:
import numpy as np
import mlflow
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import Schema, TensorSpec

input_schema = Schema(
    [
        TensorSpec(np.dtype(np.uint8), (-1, -1, -1, 3)),
    ]
)
signature = ModelSignature(inputs=input_schema)
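
The signature declares a batch of `uint8` images with any height and width and three channels. Here is a rough NumPy check of what that means for callers. `matches_signature` is a hypothetical helper and not part of MLflow, which performs its own validation.

```python
import numpy as np


def matches_signature(batch: np.ndarray) -> bool:
    """Check a batch against the shape (-1, -1, -1, 3) and dtype uint8."""
    return batch.dtype == np.uint8 and batch.ndim == 4 and batch.shape[-1] == 3


rgb_batch = np.zeros((2, 300, 400, 3), dtype=np.uint8)
float_batch = rgb_batch.astype(np.float32) / 255.0  # already rescaled by the caller
gray_batch = np.zeros((2, 300, 400, 1), dtype=np.uint8)  # single channel

print(matches_signature(rgb_batch))
print(matches_signature(float_batch))
print(matches_signature(gray_batch))
```

Only the raw RGB batch passes, which fits a model that does its own resizing and rescaling.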

Saving the model:

In [None]:
mlflow.pyfunc.save_model(
    "model",
    data_path="imagenet-classifier/model",
    code_path=["imagenet-classifier-mlflow/code/module_loader.py"],
    loader_module="module_loader",
    signature=signature,
)

This new model can be used for batch scoring using batch deployments.