# Batch Predictions for an Image Classification model trained using AutoML.

**Requirements** - In order to benefit from this tutorial, you will need:
- A basic understanding of Machine Learning
- An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
- An Azure ML workspace. [Check this notebook for creating a workspace](../../../resources/workspace/workspace.ipynb) 
- A Compute Cluster. [Check this notebook to create a compute cluster](../../../resources/compute/compute.ipynb)
- A Python environment
- Installed Azure Machine Learning Python SDK v2 - [install instructions](../../../README.md) - check the getting started section

**Learning Objectives** - By the end of this tutorial, you should be able to:
- Connect to your AML workspace from the Python SDK
- Create an `AutoML Image Classification Multiclass Training Job` with the `image_classification()` factory function.
- Train the model using AmlCompute by submitting/running the AutoML training job
- Obtain the best model, register it and deploy it to a batch endpoint
- Generate batch predictions using the batch endpoint

**Please note**: For this notebook you can use an existing image classification model trained using AutoML for Images or use the simple model training we included below for convenience. For detailed instructions on how to train an image classification model with AutoML, please refer to the image classification multiclass [notebook](../automl-image-classification-multiclass-task-fridge-items/automl-image-classification-multiclass-task-fridge-items.ipynb).


# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.

## 1.1. Import the required libraries

In [None]:
# Import required libraries
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

from azure.ai.ml.constants import AssetTypes
from azure.ai.ml import Input
from azure.ai.ml.automl import ImageClassificationSearchSpace
from azure.ai.ml.sweep import (
    Choice,
    Uniform,
    BanditPolicy,
)

from azure.ai.ml import automl

## 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need identifier parameters: a subscription, a resource group and a workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../configuration.ipynb) for more details on how to configure credentials and connect to a workspace.
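As a minimal sketch of what `MLClient.from_config` expects to find, the snippet below writes a `config.json` holding the three identifiers. The key names are the ones the SDK is understood to read; the values are placeholders, and the temporary folder is only used so the sketch has no side effects.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values; replace them with your own workspace details.
workspace_config = {
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group": "<RESOURCE_GROUP>",
    "workspace_name": "<AML_WORKSPACE_NAME>",
}

# Written to a temporary folder here so the sketch has no side effects.
# In practice the file would sit in (or above) the notebook's working directory.
config_dir = Path(tempfile.mkdtemp())
config_path = config_dir / "config.json"
config_path.write_text(json.dumps(workspace_config, indent=2))

print(config_path.read_text())
```

If no such file is found, the cell below falls back to the explicit `MLClient(credential, subscription_id, resource_group, workspace)` constructor.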

In [None]:
credential = DefaultAzureCredential()
ml_client = None
try:
    ml_client = MLClient.from_config(credential)
except Exception as ex:
    print(ex)
    # Enter details of your AML workspace
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace = "<AML_WORKSPACE_NAME>"
    ml_client = MLClient(credential, subscription_id, resource_group, workspace)

# 2. Data

Load the 'fridge items' dataset from a JSON file and MLTable definition.

In order to generate models for computer vision, you will need to bring in labeled image data as input for model training in the form of an Azure Machine Learning MLTable. 

In this notebook, we use a toy dataset called Fridge Objects, which consists of 134 photos of 4 classes of beverage container (can, carton, milk bottle, water bottle) taken on different backgrounds.

All images in this notebook are hosted in [this repository](https://github.com/microsoft/computervision-recipes) and are made available under the [MIT license](https://github.com/microsoft/computervision-recipes/blob/master/LICENSE).

**NOTE:** In this PRIVATE PREVIEW we're defining the MLTable in a separate folder and .YAML file.
In later versions, you'll be able to do it all in Python APIs.

In [None]:
import os
import urllib.request
from zipfile import ZipFile

# download data
download_url = "https://cvbp-secondary.z19.web.core.windows.net/datasets/image_classification/fridgeObjects.zip"
data_file = "fridgeObjects.zip"
urllib.request.urlretrieve(download_url, filename=data_file)

# extract files
with ZipFile(data_file, "r") as zip_file:
    print("extracting files...")
    zip_file.extractall(path="./data")
    print("done")

# delete zip file
os.remove(data_file)

### Upload the images to Datastore through an AML Data asset (URI Folder)

In order to use the data for training in Azure ML, we upload it to the default Azure Blob Storage of our Azure ML workspace.

For further details, see the URI folder data asset example: https://github.com/Azure/azureml-examples/blob/samuel100/data-samples/sdk/assets/data/data.ipynb

In [None]:
# Uploading image files by creating a 'data asset URI FOLDER':

from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

my_data = Data(
    path="./data/fridgeObjects",
    type=AssetTypes.URI_FOLDER,
    description="Fridge-items images",
    name="fridge-items-images",
)

uri_folder_data_asset = ml_client.data.create_or_update(my_data)

print(uri_folder_data_asset)
print("")
print("Path to folder in Blob Storage:")
print(uri_folder_data_asset.path)

### Convert the downloaded data to JSON metadata

In this example, the fridge object dataset is stored in a directory. There are four different folders inside:

/water_bottle
/milk_bottle
/carton
/can

This is the most common data format for multiclass image classification. Each folder title corresponds to the image label for the images contained inside.

In order to use this data to create an AzureML MLTable, we first need to convert it to the required JSONL format. 
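To make the target format concrete, here is a small sketch of a single JSONL record: one JSON object per line, holding the image location and its label. The base URI and the file name `1.jpg` are hypothetical stand-ins; in this notebook the base comes from `uri_folder_data_asset.path`.

```python
import json

# Hypothetical base URI; in the notebook this comes from uri_folder_data_asset.path.
base_uri = "azureml://datastores/workspaceblobstore/paths/fridgeObjects/"


def make_annotation(base_uri, class_name, image_name):
    """Build one JSONL record: the image location plus its class label."""
    return {
        "image_url": f"{base_uri}{class_name}/{image_name}",
        "label": class_name,
    }


# "1.jpg" is an illustrative file name, not necessarily one from the dataset.
record = make_annotation(base_uri, "can", "1.jpg")
jsonl_line = json.dumps(record)
print(jsonl_line)
```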

The following script creates two .jsonl files (one for training and one for validation), each in its own MLTable folder. With `train_validation_ratio = 5`, every fifth image goes into the validation file, so roughly 20% of the data is used for validation.
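The split rule can be sketched in isolation as follows. The file names are stand-ins, and the helper `split_by_index` is a hypothetical name used only for this illustration.

```python
train_validation_ratio = 5  # every 5th item goes to validation


def split_by_index(items, ratio):
    """Send items whose running index is a multiple of `ratio` to validation."""
    train, validation = [], []
    for index, item in enumerate(items):
        if index % ratio == 0:
            validation.append(item)
        else:
            train.append(item)
    return train, validation


# 134 stand-in file names, matching the dataset size mentioned earlier.
items = [f"image_{i}.jpg" for i in range(134)]
train_items, validation_items = split_by_index(items, train_validation_ratio)
print(len(train_items), len(validation_items))
```

For 134 items this gives 107 training and 27 validation entries. The items are not shuffled, so the assignment follows the order in which the files are listed.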

In [None]:
import json
import os

src_images = "./data/fridgeObjects/"

# We'll copy each JSONL file within its related MLTable folder
training_mltable_path = "./data/training-mltable-folder/"
validation_mltable_path = "./data/validation-mltable-folder/"

train_validation_ratio = 5

# Path to the training and validation files
train_annotations_file = os.path.join(training_mltable_path, "train_annotations.jsonl")
validation_annotations_file = os.path.join(
    validation_mltable_path, "validation_annotations.jsonl"
)

# Baseline of json line dictionary
json_line_sample = {
    "image_url": uri_folder_data_asset.path,
    "label": "",
}

index = 0
# Scan each subdirectory and generate one JSONL line per image, distributed across the train and validation JSONL files
with open(train_annotations_file, "w") as train_f:
    with open(validation_annotations_file, "w") as validation_f:
        for className in os.listdir(src_images):
            subDir = src_images + className
            if not os.path.isdir(subDir):
                continue
            # Scan each subdirectory
            print("Parsing " + subDir)
            for image in os.listdir(subDir):
                json_line = dict(json_line_sample)
                json_line["image_url"] += f"{className}/{image}"
                json_line["label"] = className

                if index % train_validation_ratio == 0:
                    # validation annotation
                    validation_f.write(json.dumps(json_line) + "\n")
                else:
                    # train annotation
                    train_f.write(json.dumps(json_line) + "\n")
                index += 1

In [None]:
# Training MLTable defined locally, with local data to be uploaded
my_training_data_input = Input(type=AssetTypes.MLTABLE, path=training_mltable_path)

# Validation MLTable defined locally, with local data to be uploaded
my_validation_data_input = Input(type=AssetTypes.MLTABLE, path=validation_mltable_path)

# WITH REMOTE PATH: If available already in the cloud/workspace-blob-store
# my_training_data_input = Input(type=AssetTypes.MLTABLE, path="azureml://datastores/workspaceblobstore/paths/vision-classification/train")
# my_validation_data_input = Input(type=AssetTypes.MLTABLE, path="azureml://datastores/workspaceblobstore/paths/vision-classification/valid")

# 3. Configure and run the AutoML for Images Classification-Multiclass training job
In this section we will configure and run the AutoML job, for training the model.

## 3.1 Configure the job through the image_classification() factory function

### image_classification() function parameters:

The `image_classification()` factory function allows the user to configure the training job.

- `target_column_name` - The name of the column to target for predictions. It must always be specified. This parameter is applicable to 'training_data', 'validation_data' and 'test_data'.
- `primary_metric` - The metric that AutoML will optimize for model selection.
- `training_data` - The data to be used for training. It should contain both training feature columns and a target column. Optionally, this data can be split for segregating a validation or test dataset. 
You can use a registered MLTable in the workspace using the format '<mltable_name>:<version>', or you can use a local file or folder as an MLTable. For example, this notebook passes a local folder as `Input(type=AssetTypes.MLTABLE, path=training_mltable_path)`.
The parameter 'training_data' must always be provided.
- `compute` - The compute on which the AutoML job will run. In this example we are using a compute called 'gpu-cluster' present in the workspace. You can replace it with any other compute in the workspace.
- `name` - The name of the Job/Run. This is an optional property. If not specified, a random name will be generated.
- `experiment_name` - The name of the Experiment. An Experiment is like a folder with multiple runs in Azure ML Workspace that should be related to the same logical machine learning experiment.

### set_limits() parameters:
This optional method configures limit parameters such as timeouts.

- `timeout_minutes` - Maximum amount of time in minutes that the whole AutoML job can take before the job terminates. This timeout includes setup, featurization and training runs but does not include the ensembling and model explainability runs at the end of the process, since those actions need to happen once all the trials (child jobs) are done. If not specified, the default total timeout for the job is 6 days (8,640 minutes). To specify a timeout less than or equal to 1 hour (60 minutes), make sure your dataset's size is not greater than 10,000,000 (rows times columns) or an error results.
    
### set_sweep() parameters:
- `max_trials` - Required parameter for the maximum number of configurations to sweep. Must be an integer between 1 and 1000. When exploring just the default hyperparameters for a given model algorithm, set this parameter to 1.

- `max_concurrent_trials` - Maximum number of runs that can run concurrently. If not specified, all runs launch in parallel. If specified, must be an integer between 1 and 100. NOTE: The number of concurrent runs is gated on the resources available in the specified compute target. Ensure that the compute target has the available resources for the desired concurrency.

- `sampling_algorithm` - Sampling method to use for sweeping over the defined parameter space. Please refer to this documentation for the list of supported sampling methods.

- `early_termination` - Early termination policy to end poorly performing runs. If no termination policy is specified, all configurations are run to completion. Please refer to this documentation for supported early termination policies.

You can find more details about these configurations in [automl-image-classification-multiclass-task-fridge-items.ipynb](../automl-image-classification-multiclass-task-fridge-items/automl-image-classification-multiclass-task-fridge-items.ipynb).
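To give some intuition for the early termination setting used in this notebook (`BanditPolicy` with `slack_factor=0.2`), here is a rough sketch of the slack rule as it is commonly described for a maximized metric: a trial is stopped when its metric falls below the best metric divided by `1 + slack_factor`. The function name and the accuracy values are hypothetical, and the real policy also takes `evaluation_interval` and `delay_evaluation` into account, which this sketch ignores.

```python
def bandit_should_terminate(metric, best_metric, slack_factor):
    """Return True if a trial falls outside the allowed slack of the best trial.

    Assumes the primary metric is being maximized (for example, accuracy).
    """
    threshold = best_metric / (1 + slack_factor)
    return metric < threshold


best_accuracy = 0.90  # illustrative best accuracy seen so far
slack_factor = 0.2

# Illustrative trial accuracies, not results from an actual run.
for accuracy in (0.70, 0.80):
    stop = bandit_should_terminate(accuracy, best_accuracy, slack_factor)
    print(f"accuracy={accuracy:.2f} terminate={stop}")
```

With these illustrative numbers the cut-off is about 0.75, so the 0.70 trial would be stopped and the 0.80 trial would continue.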
    

In [None]:
# general job parameters
compute_name = "gpu-cluster"
exp_name = "dpv2-image-classification-experiment"

In [None]:
# Create the AutoML job with the related factory-function.

image_classification_job = automl.image_classification(
    compute=compute_name,
    # name="dpv2-image-classification-job-02",
    experiment_name=exp_name,
    training_data=my_training_data_input,
    validation_data=my_validation_data_input,
    target_column_name="label",
    primary_metric="accuracy",
    tags={"my_custom_tag": "My custom value"},
)

image_classification_job.set_limits(timeout_minutes=60)

image_classification_job.extend_search_space(
    [
        ImageClassificationSearchSpace(
            model_name=Choice(["vitb16r224", "vits16r224"]),
            learning_rate=Uniform(0.001, 0.01),
            number_of_epochs=Choice([15, 30]),
        ),
        ImageClassificationSearchSpace(
            model_name=Choice(["seresnext", "resnet50"]),
            learning_rate=Uniform(0.001, 0.01),
            layers_to_freeze=Choice([0, 2]),
        ),
    ]
)

image_classification_job.set_sweep(
    max_trials=10,
    max_concurrent_trials=2,
    sampling_algorithm="Random",
    early_termination=BanditPolicy(
        evaluation_interval=2, slack_factor=0.2, delay_evaluation=6
    ),
)

## 3.2 Run the job
Using the `MLClient` created earlier, we will now run this job in the workspace.

In [None]:
# Submit the AutoML job
returned_job = ml_client.jobs.create_or_update(
    image_classification_job
)  # submit the job to the backend

print(f"Created job: {returned_job}")

In [None]:
ml_client.jobs.stream(returned_job.name)

## 4. Retrieve the Best Trial (Best Model's trial/run)

Use the `MlflowClient` to access the results (such as models, artifacts, and metrics) of a previously completed AutoML trial.

#### Initialize MLflow Client
The models and artifacts produced by AutoML can be accessed through the MLflow interface. Initialize the MLflow client here and set Azure ML as its backend.

IMPORTANT: you need to have the latest MLflow packages installed:

pip install azureml-mlflow

pip install mlflow

### 4.1 Obtain the tracking URI for MLflow

In [None]:
import mlflow

# Obtain the tracking URL from MLClient
MLFLOW_TRACKING_URI = ml_client.workspaces.get(
    name=ml_client.workspace_name
).mlflow_tracking_uri

print(MLFLOW_TRACKING_URI)

In [None]:
# Set the MLFLOW TRACKING URI
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
print("\nCurrent tracking uri: {}".format(mlflow.get_tracking_uri()))

In [None]:
from mlflow.tracking.client import MlflowClient

# Initialize MLFlow client
mlflow_client = MlflowClient()

### 4.2 Get the AutoML parent Job and find the best run

In [None]:
job_name = returned_job.name

# Get the parent run
mlflow_parent_run = mlflow_client.get_run(job_name)

# Print parent run tags. 'automl_best_child_run_id' tag should be there.
print(mlflow_parent_run.data.tags)

In [None]:
# Get the best model's child run
best_child_run_id = mlflow_parent_run.data.tags["automl_best_child_run_id"]
print("Found best child run id: ", best_child_run_id)

best_run = mlflow_client.get_run(best_child_run_id)

### 4.3 Download the best model locally

Access the results (such as Models, Artifacts, Metrics) of a previously completed AutoML Run and download them locally. We will need these artifacts when deploying the model.

In [None]:
import os

# Create local folder
local_dir = "./artifact_downloads"
if not os.path.exists(local_dir):
    os.mkdir(local_dir)

In [None]:
# Download run's artifacts/outputs
local_path = mlflow_client.download_artifacts(
    best_run.info.run_id, "outputs", local_dir
)
print("Artifacts downloaded in: {}".format(local_path))
print("Artifacts: {}".format(os.listdir(local_path)))

## 5. Deploy the non-MLflow model with batch endpoints and run batch scoring

We will now deploy the non-MLflow model to a batch endpoint. A batch endpoint is a REST endpoint that is capable of handling inputs in batch format.

To create a batch deployment, you need all the following items:

- **Model files**, or a registered model in your workspace referenced using `azureml:<model-name>:<model-version>`. In this notebook we will take the model file from the best_child_run.

- **The code to score the model**.
    
- **The environment** in which the model runs. It can be a Docker image with Conda dependencies, or an environment already registered in your workspace referenced using `azureml:<environment-name>:<environment-version>`. In this notebook we will create an environment object using the environment definition downloaded from the best_child_run artifacts.

- **The pre-created compute** referenced using `azureml:<compute-name>`.

### 5.1 Register the model in the workspace
To deploy the model, we first need to register it with the existing workspace so that it can be discovered at runtime.

In [None]:
from azure.ai.ml.entities import Model

model_name = "fridge-items-model"
model = Model(
    path=f"azureml://jobs/{best_run.info.run_id}/outputs/artifacts/outputs/model.pt",
    name=model_name,
    description="my sample image classification model",
)

# for downloaded file
# model = Model(path="artifact_downloads/outputs/model.pt", name=model_name)

registered_model = ml_client.models.create_or_update(model)
registered_model.id

### 5.2 Configure environment
It is recommended to use the same environment for model deployment as for model training. We therefore create the Environment object from the conda environment file downloaded with the artifacts. We also need to specify the base image, which for vision tasks is `mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.1-cudnn8-ubuntu18.04`.

To read more about environments, please follow this [notebook](./../../../assets/environment/environment.ipynb)

In [None]:
from azure.ai.ml.entities import Environment

env = Environment(
    name="automl-images-env",
    description="environment for automl images inference",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.1-cudnn8-ubuntu18.04",
    conda_file="artifact_downloads/outputs/conda_env_v_1_0_0.yml",
)

### 5.3 Get A Scoring Script

To do the scoring, you need to create a batch scoring script, batch_scoring.py, and write it to the scripts folder in the current directory. The script takes a minibatch of input images, applies the classification model, and outputs the predictions to a results file.

While creating the batch scoring script, refer to the scoring scripts generated under the outputs folder of the AutoML training runs. This will help to identify the right model settings to use in the batch scoring script's init method when loading the model. Note: The batch scoring script we generate in the subsequent step is different from the scoring script generated by the training runs in the screenshot below. We refer to it only to identify the right model settings to use in the batch scoring script.

![scoring-script.png](./ui_outputs.png)

### Understanding the scoring script

The scoring script must contain two functions:

- init(): Use this function to load the model into a global object. This function will be called once at the beginning of the process.

- run(mini_batch): This function will be called for each mini_batch and do the actual scoring.

    + `mini_batch`: The mini_batch value is a list of file paths.

    + `result`: The run() method should return a pandas DataFrame or an array. Each returned output element indicates one successful run of an input element in the input mini_batch.
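The contract described above can be sketched as a bare skeleton. This is not the notebook's `batch_scoring.py`: the "model" is a hypothetical stand-in that always returns the same label, and the file paths are made up, so only the shape of `init()` and `run(mini_batch)` should be taken from it.

```python
import os

model = None  # global handle, populated once by init()


def init():
    """Called once at the start of the process: load the model into a global."""
    global model
    # Stand-in "model": a hypothetical callable that always predicts "can".
    # A real script would load the registered model file here instead.
    model = lambda image_path: "can"


def run(mini_batch):
    """Called once per mini-batch; `mini_batch` is a list of file paths."""
    results = []
    for image_path in mini_batch:
        label = model(image_path)
        # One output element per successfully processed input element.
        results.append(f"{os.path.basename(image_path)},{label}")
    return results


# Local smoke test with made-up paths.
init()
predictions = run(["/data/can/1.jpg", "/data/carton/2.jpg"])
print(predictions)
```

Returning a plain list keeps the sketch dependency-free; a pandas DataFrame with one row per input would satisfy the same contract.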
    
**Note:** The scoring script used in this notebook is shown below.

In [None]:
# View the batch scoring script. Use the model settings as appropriate for your model.
with open("./scripts/batch_scoring.py", "r") as f:
    print(f.read())

### 5.4 Deploy the model to batch endpoint
**Now, let's deploy the model with batch endpoints and run batch scoring.** 

This takes three steps:
- Create a batch endpoint
- Configure the deployment
- Create the deployment using MLClient

#### 5.4.1 Create A Batch Endpoint

**Please note** that the name of the endpoint must be unique in the Azure region. For more information on the naming rules, see [managed endpoint limits](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-quotas#azure-machine-learning-managed-online-endpoints).

Optionally, you can add a description and tags to your endpoint.

In [None]:
from azure.ai.ml.entities import (
    BatchEndpoint,
    BatchDeployment,
    BatchRetrySettings,
)
from azure.ai.ml.constants import BatchDeploymentOutputAction

import datetime

batch_endpoint_name = "my-batch-endpoint-" + datetime.datetime.now().strftime(
    "%Y%m%d%H%M"
)

# create a batch endpoint
endpoint = BatchEndpoint(
    name=batch_endpoint_name,
    description="this is a sample batch endpoint",
    tags={"foo": "bar"},
)

Using the MLClient created earlier, we'll now create the Endpoint in the workspace. This command will start the endpoint creation and return a confirmation response while the endpoint creation continues.

In [None]:
ml_client.begin_create_or_update(endpoint)

#### 5.4.2 Configure the deployment 

A deployment is a set of resources required for hosting the model that does the actual inferencing. We'll create a deployment for our endpoint using the BatchDeployment class.

A few important parameters are:

- **model**: Link to the registered model

- **code_path**: Folder containing the scoring script

- **scoring_script**: Path of the scoring script relative to the `code path` parameter.

- **environment**: Environment object used to create the virtual environment for inferencing. It should be of type `azure.ai.ml.entities.Environment`.
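The deployment in this notebook also sets `mini_batch_size=10`, which for file inputs is understood to be the number of files handed to each `run()` call. The sketch below shows how 134 files would be partitioned under that assumption; the file names and the helper `make_mini_batches` are hypothetical, and the actual partitioning is done by the service.

```python
def make_mini_batches(file_paths, mini_batch_size):
    """Split a list of file paths into consecutive chunks of at most mini_batch_size."""
    return [
        file_paths[start : start + mini_batch_size]
        for start in range(0, len(file_paths), mini_batch_size)
    ]


# 134 stand-in file names, matching the size of the Fridge Objects dataset.
files = [f"image_{i}.jpg" for i in range(134)]
mini_batches = make_mini_batches(files, mini_batch_size=10)

print(len(mini_batches), len(mini_batches[-1]))
```

This yields 14 mini-batches, the last holding 4 files. With `instance_count=2` and `max_concurrency_per_instance=2`, up to four of them could presumably be scored at the same time.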

In [None]:
# create a batch deployment
deployment = BatchDeployment(
    name="non-mlflow-deployment",
    description="this is a sample non-mlflow deployment",
    endpoint_name=batch_endpoint_name,
    model=registered_model.id,
    code_path="./scripts",
    scoring_script="batch_scoring.py",
    environment=env,
    compute="cpu-cluster",
    instance_count=2,
    max_concurrency_per_instance=2,
    mini_batch_size=10,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=30),
    logging_level="info",
)

#### 5.4.3 Create the deployment

Using the `MLClient` created earlier, we'll now create the deployment in the workspace. This command will start the deployment creation and return a confirmation response while the deployment creation continues.

In [None]:
ml_client.begin_create_or_update(deployment)

### 5.5 Test the endpoint with sample data

Using the MLClient created earlier, we'll get a handle to the endpoint. The endpoint can be invoked using the invoke command with the following parameters:

- endpoint_name - Name of the endpoint

- input - Dataset object holding the test dataset

- deployment_name - Name of the specific deployment to test in an endpoint

In [None]:
registered_data_asset = ml_client.data.get(name="fridge-items-images", label="latest")

# The data asset was registered as a URI folder, so the input type must match
test_data = Input(type=AssetTypes.URI_FOLDER, path=registered_data_asset.id)

# invoke the endpoint for batch scoring job
job = ml_client.batch_endpoints.invoke(
    endpoint_name=batch_endpoint_name,
    input=test_data,
    deployment_name="non-mlflow-deployment",  # name is required as default deployment is not set
)

In [None]:
# get the details of the job
job_name = job.name
batch_job = ml_client.jobs.get(name=job_name)
print(batch_job.status)
# stream the job logs
ml_client.jobs.stream(name=job_name)

## Download the output files

1. Once the job is over, click on the web link to open it in Azure ML studio.
2. Double-click the "Batch Scoring" step to open a dialog box on the left.
3. In the dialog box, go to Outputs + logs and click on the data outputs. This takes you to the datastore path where the output file is stored.
4. Download the file from that path.

### 6. Delete the endpoint

In [None]:
ml_client.batch_endpoints.begin_delete(name=batch_endpoint_name)