# MNIST batch prediction example \[Parallel job\] \[SDK example\]
## Key notes for this example
- How to use a **parallel job** for a **batch inferencing** scenario.
- How to use the parallel job **run_function** task with a predefined **entry_script**.
- How to use a **uri_folder** containing **file data** as the **input of a parallel job**.
- How to use **mini_batch_size** in a parallel job to split the input data into mini-batches of a fixed size.
- How to use **append_row_to** to aggregate returns into a **uri_file** output.

For the same example using the CLI + YAML experience, see: [link](../../../../../cli/jobs/parallel/3a_mnist_batch_identification/README.md)
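
The **entry_script** mentioned in the notes above is expected to expose an `init()` function, called once per worker process, and a `run(mini_batch)` function, called once per mini-batch. The following is a minimal sketch of that shape only, not the actual `digit_identification.py`: the `_fake_predict` helper and the `"<file name>: <digit>"` row format are assumptions made for illustration, and no real model is loaded.

```python
import os
from typing import Callable, List, Optional

# Placeholder for the model object; a real script would load it in init()
# from the folder passed through the --model program argument.
_predict: Optional[Callable[[str], int]] = None


def _fake_predict(file_path: str) -> int:
    """Hypothetical stand-in for model inference: derives a 'digit' from the file name length."""
    return len(os.path.basename(file_path)) % 10


def init() -> None:
    """Runs once per worker process, before any mini-batch is handled."""
    global _predict
    _predict = _fake_predict


def run(mini_batch: List[str]) -> List[str]:
    """Runs once per mini-batch; for file input, mini_batch is a list of file paths."""
    results = []
    for file_path in mini_batch:
        digit = _predict(file_path)
        # Return one row per successfully processed file.
        results.append(f"{os.path.basename(file_path)}: {digit}")
    return results
```

Returning one row per processed file lets the job compare returned rows with input items and lets `append_row_to` collect the rows into a single output file.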

# 1. Connect to Azure Machine Learning Workspace
## 1.1 Import the required libraries

In [None]:
# import required libraries
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml import MLClient, Input, Output, load_component
from azure.ai.ml.dsl import pipeline
from azure.ai.ml.entities import Environment, ResourceConfiguration
from azure.ai.ml.constants import AssetTypes, InputOutputModes
from azure.ai.ml.parallel import parallel_run_function, RunFunction

## 1.2 Configure credential
`DefaultAzureCredential` should be capable of handling most Azure SDK authentication scenarios. 

If it does not work for you, see these references for other available credentials: [configure credential example](../../configuration.ipynb), [azure-identity reference doc](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).

In [None]:
try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()

## 1.3 Get a handle to the workspace

We use a config file to connect to a workspace. The Azure ML workspace should be configured with a compute cluster. [See this notebook to configure a workspace](../../configuration.ipynb)

In [None]:
# Get a handle to workspace
ml_client = MLClient.from_config(credential=credential)

# Retrieve an already attached Azure Machine Learning Compute.
cpu_compute_target = "cpu-cluster"
print(ml_client.compute.get(cpu_compute_target))

# 2. Define components and jobs in pipeline
## 2.1 Load existing component

In [None]:
prepare_data_component = load_component(source="./script/prepare_data.yml")

## 2.2 Declare parallel job

In [None]:
# Declare parallel job with run_function task
batch_inferencing_with_mini_batch_size = parallel_run_function(
    name="batch_inferencing_with_mini_batch_size",
    display_name="Batch Inferencing with mini_batch_size",
    description="parallel job to do batch inferencing with mini_batch_size on uri folder with files input",
    tags={
        "azureml_parallel_example": "3a_sdk",
    },
    inputs=dict(
        job_data_path=Input(
            type=AssetTypes.URI_FOLDER,
            description="Input folder containing the MNIST PNG files.",
            mode=InputOutputModes.RO_MOUNT,
        ),
        score_model=Input(
            type=AssetTypes.URI_FOLDER,
            description="Folder contains the model file.",
            mode=InputOutputModes.DOWNLOAD,
        ),
    ),
    outputs=dict(
        job_output_file=Output(
            type=AssetTypes.URI_FILE,
            mode=InputOutputModes.RW_MOUNT,
        ),
    ),
    input_data="${{inputs.job_data_path}}",  # Define which input data will be split into mini-batches.
    mini_batch_size="5",  # Use 'mini_batch_size' as the data division method. For file input data, this number defines the file count for each mini-batch.
    instance_count=2,  # Use 2 nodes from the compute cluster to run this parallel job.
    max_concurrency_per_instance=2,  # Create 2 worker processes on each compute node to execute mini-batches.
    error_threshold=5,  # Monitor item-level failures by the gap between the mini-batch input count and the returned count. A 'batch inferencing' scenario should return a list, dataframe, or tuple of the successfully processed items to stay within this threshold.
    mini_batch_error_threshold=5,  # Monitor failed mini-batches (exception, timeout, or null return). When the failed mini-batch count is higher than this setting, the parallel job is marked as 'failed'.
    retry_settings=dict(
        max_retries=2,  # Define how many retries are allowed when a mini-batch execution fails by exception, timeout, or null return.
        timeout=60,  # Define the timeout in seconds for each mini-batch execution.
    ),
    logging_level="DEBUG",
    environment_variables={
        "AZUREML_PARALLEL_EXAMPLE": "3a_sdk",
    },
    task=RunFunction(
        code="./script",
        entry_script="digit_identification.py",
        environment=Environment(
            image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
            conda_file="./script/environment_parallel.yml",
        ),
        program_arguments="--model ${{inputs.score_model}} ",
        append_row_to="${{outputs.job_output_file}}",  # Define where to output the aggregated returns from each mini-batch.
    ),
)
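
To make the `mini_batch_size`, `instance_count`, and `max_concurrency_per_instance` settings above concrete, here is a rough sketch of the arithmetic. The `split_into_mini_batches` helper and the 23-file input are hypothetical, and the "waves" figure is only an estimate: the service may hand mini-batches to workers dynamically rather than in lockstep.

```python
import math
from typing import List


def split_into_mini_batches(files: List[str], mini_batch_size: int) -> List[List[str]]:
    """Group files into consecutive chunks of at most mini_batch_size items."""
    return [files[i:i + mini_batch_size] for i in range(0, len(files), mini_batch_size)]


# Hypothetical input: 23 image files.
files = [f"{i}.png" for i in range(23)]
mini_batches = split_into_mini_batches(files, mini_batch_size=5)

# Values taken from the declaration above.
instance_count = 2
max_concurrency_per_instance = 2
total_workers = instance_count * max_concurrency_per_instance

# Estimated number of rounds needed if every worker takes one mini-batch per round.
waves = math.ceil(len(mini_batches) / total_workers)

print(len(mini_batches), len(mini_batches[-1]), total_workers, waves)  # 5 3 4 2
```

With these numbers the last mini-batch holds only 3 files, so the entry script should not assume every mini-batch is full.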

# 3. Build pipeline

In [None]:
# Declare the inputs of the job.
input_model_folder = Input(
    path="./mnist_models", type=AssetTypes.URI_FOLDER, mode=InputOutputModes.DOWNLOAD
)

# Declare pipeline structure.
@pipeline(
    display_name="parallel job for mnist batch inferencing",
)
def parallel_job_in_pipeline():
    # Declare command job to prepare mnist data
    prepare_data = prepare_data_component()

    # Declare parallel inferencing job.
    predict_digits_mnist = batch_inferencing_with_mini_batch_size(
        job_data_path=prepare_data.outputs.mnist_png,
        score_model=input_model_folder,
    )

    # Users can override run-level properties of a parallel job when invoking it as a component in a pipeline.
    predict_digits_mnist.resources.instance_count = 2
    predict_digits_mnist.max_concurrency_per_instance = 2
    predict_digits_mnist.mini_batch_error_threshold = 10
    predict_digits_mnist.outputs.job_output_file.path = "azureml://datastores/${{default_datastore}}/paths/${{name}}/aggregated_returns.csv"


# Create pipeline instance
my_job = parallel_job_in_pipeline()

# Set pipeline-level default compute
my_job.settings.default_compute = "cpu-cluster"
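
The pipeline overrides `mini_batch_error_threshold`, while `error_threshold` keeps the value from the job declaration. The sketch below illustrates how the two counts described in the declaration comments could be computed. It is a simplification under stated assumptions, not the service's implementation: the helper names are hypothetical, a `None` return stands in for any failed mini-batch, and items of a failed mini-batch are also counted as failed items.

```python
from typing import Optional, Sequence


def count_failed_items(
    inputs: Sequence[Sequence[str]],
    returns: Sequence[Optional[Sequence[str]]],
) -> int:
    """Sum, over all mini-batches, the gap between items sent and items returned."""
    failed = 0
    for sent, returned in zip(inputs, returns):
        returned_count = 0 if returned is None else len(returned)
        failed += max(len(sent) - returned_count, 0)
    return failed


def count_failed_mini_batches(returns: Sequence[Optional[Sequence[str]]]) -> int:
    """Count mini-batches that produced no return value (None in this sketch)."""
    return sum(1 for returned in returns if returned is None)


def exceeds(count: int, threshold: int) -> bool:
    """The job is marked as failed when the count is higher than the threshold."""
    return count > threshold


# Hypothetical run: three mini-batches, one partial return, one failure.
inputs = [["a", "b", "c", "d", "e"], ["f", "g", "h", "i", "j"], ["k", "l", "m"]]
returns = [["a", "b", "c", "d", "e"], ["f", "g", "h"], None]

failed_items = count_failed_items(inputs, returns)
failed_mini_batches = count_failed_mini_batches(returns)

print(failed_items, exceeds(failed_items, threshold=5))
print(failed_mini_batches, exceeds(failed_mini_batches, threshold=10))
```

In this hypothetical run, 5 items are missing from the returns and 1 mini-batch failed, so neither count is higher than its threshold (5 and 10).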

In [None]:
print(my_job)

# 4. Submit pipeline job

In [None]:
pipeline_job = ml_client.jobs.create_or_update(
    my_job,
    experiment_name="hello-world-parallel-job",
)
pipeline_job

In [None]:
# wait until the job completes
ml_client.jobs.stream(pipeline_job.name)