Copyright (c) Microsoft Corporation. All rights reserved.  
Licensed under the MIT License.

# Using Azure Machine Learning Pipelines for Batch Inference

In this notebook, we demonstrate how to make predictions on large quantities of data asynchronously using ML pipelines with Azure Machine Learning. Batch inference (or batch scoring) provides cost-effective inference with high throughput for asynchronous applications. Batch prediction pipelines can scale to perform inference on terabytes of production data. Batch prediction is optimized for high-throughput, fire-and-forget predictions over a large collection of data.

> **Tip**
If your system requires low-latency processing (to process a single document or small set of documents quickly), use [real-time scoring](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-consume-web-service) instead of batch prediction.

In this example, we take a digit identification model already trained on the MNIST dataset using the [AzureML training with deep learning example notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/train-hyperparameter-tune-deploy-with-keras.ipynb), and run that trained model on some of the MNIST test images in batch.

The input dataset used for this notebook differs from a standard MNIST dataset in that it has been converted to PNG images to demonstrate the use of files as inputs to Batch Inference. A sample of PNG-converted images of the MNIST dataset was taken from [this repository](https://github.com/myleott/mnist_png).

The outline of this notebook is as follows:

- Create a DataStore referencing MNIST images stored in a blob container.
- Register the pretrained MNIST model into the model registry. 
- Use the registered model to do batch inference on the images in the data blob container.


## Prerequisites
If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the configuration Notebook located at https://github.com/Azure/MachineLearningNotebooks first. This sets you up with a working config file that has information on your workspace, subscription id, etc. 

### Connect to workspace
Create a workspace object from the existing workspace. `Workspace.from_config()` reads the file `config.json` and loads the details into an object named `ws`.

In [1]:
from azureml.core import Workspace

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

Performing interactive authentication. Please follow the instructions on the terminal.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code FJMB5K3E9 to authenticate.
Interactive authentication successfully completed.
Workspace name: 107327-aml-ws
Azure region: southcentralus
Subscription id: 07a3b836-0813-4c05-afd4-3a7ab00358d9
Resource group: aml-rg-107327
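
`Workspace.from_config()` relies on a `config.json` file that holds the subscription ID, resource group, and workspace name. As a minimal sketch, the stdlib-only helper below checks that such a file contains those three keys before the SDK call is made. The helper name `load_workspace_config` and the placeholder values are illustrative assumptions, not part of the Azure ML SDK.

```python
import json
import os
import tempfile

# Keys that a workspace config.json is expected to contain.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_workspace_config(path):
    """Load a workspace config file and fail early if a key is missing."""
    with open(path) as f:
        config = json.load(f)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError("config.json is missing: " + ", ".join(missing))
    return config


# Demonstration with placeholder values written to a temporary file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "config.json")
with open(demo_path, "w") as f:
    json.dump({"subscription_id": "<subscription-id>",
               "resource_group": "<resource-group>",
               "workspace_name": "<workspace-name>"}, f)

demo_config = load_workspace_config(demo_path)
print(sorted(demo_config))
```

Running a check like this first turns a missing or incomplete config file into a clear error message instead of a failure inside the SDK call.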


### Create or Attach existing compute resource
By using Azure Machine Learning Compute, a managed service, data scientists can run machine learning workloads on clusters of Azure virtual machines. Examples include VMs with GPU support. In this tutorial, you create Azure Machine Learning Compute as the environment that runs batch inference. The code below creates the compute cluster for you if it doesn't already exist in your workspace.

**Creation of compute takes approximately 5 minutes. If the AmlCompute with that name is already in your workspace the code will skip the creation process.**

In [2]:
import os
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

# choose a name for your cluster
compute_name = os.environ.get("AML_COMPUTE_CLUSTER_NAME", "cpu-cluster")
# environment variables are strings, so convert the node counts to int
compute_min_nodes = int(os.environ.get("AML_COMPUTE_CLUSTER_MIN_NODES", 0))
compute_max_nodes = int(os.environ.get("AML_COMPUTE_CLUSTER_MAX_NODES", 4))

# This example uses CPU VM. For using GPU VM, set SKU to STANDARD_NC6
vm_size = os.environ.get("AML_COMPUTE_CLUSTER_SKU", "STANDARD_D2_V2")


if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and type(compute_target) is AmlCompute:
        print('found compute target. just use it. ' + compute_name)
else:
    print('creating a new compute target...')
    provisioning_config = AmlCompute.provisioning_configuration(vm_size = vm_size,
                                                                min_nodes = compute_min_nodes, 
                                                                max_nodes = compute_max_nodes)

    # create the cluster
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)
    
    # can poll for a minimum number of nodes and for a specific timeout. 
    # if no min node count is provided it will use the scale settings for the cluster
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
    
    # For a more detailed view of current AmlCompute status, use get_status()
    print(compute_target.get_status().serialize())

creating a new compute target...
Creating
Succeeded
AmlCompute wait for completion finished
Minimum number of nodes requested have been provisioned
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Steady', 'allocationStateTransitionTime': '2019-10-26T23:19:05.011000+00:00', 'errors': None, 'creationTime': '2019-10-26T23:19:02.358295+00:00', 'modifiedTime': '2019-10-26T23:19:17.983268+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT120S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_D2_V2'}
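
Values read from environment variables are always strings, while cluster node counts are numeric. The sketch below shows one way to read an optional integer setting with a default. The helper name `get_int_setting` is a hypothetical convenience function, and the `environ` parameter exists only so that the example can run against a plain dictionary instead of the real process environment.

```python
import os


def get_int_setting(name, default, environ=os.environ):
    """Return the setting `name` as an int, or `default` if it is unset or empty."""
    raw = environ.get(name)
    if raw is None or str(raw).strip() == "":
        return default
    return int(raw)


# Simulated environment: only the maximum node count is overridden.
fake_environ = {"AML_COMPUTE_CLUSTER_MAX_NODES": "8"}

min_nodes = get_int_setting("AML_COMPUTE_CLUSTER_MIN_NODES", 0, environ=fake_environ)
max_nodes = get_int_setting("AML_COMPUTE_CLUSTER_MAX_NODES", 4, environ=fake_environ)
print(min_nodes, max_nodes)
```

A value that is not a valid integer raises `ValueError` at the point of reading, which is usually easier to diagnose than a type error raised later during cluster provisioning.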


### Create a datastore containing sample images
The input dataset used for this notebook differs from a standard MNIST dataset in that it has been converted to PNG images to demonstrate the use of files as inputs to Batch Inference. A sample of PNG-converted images of the MNIST dataset was taken from [this repository](https://github.com/myleott/mnist_png).

We have created a public blob container `sampledata` on an account named `pipelinedata`, containing these images from the MNIST dataset. In the next step, we create a datastore with the name `mnist_datastore`, which points to this blob container. In the call to `register_azure_blob_container` below, setting the `overwrite` flag to `True` overwrites any datastore that was created previously with that name.

This step can be changed to point to your blob container by providing your own `datastore_name`, `container_name`, and `account_name`.

In [3]:
from azureml.core.datastore import Datastore

account_name = "pipelinedata"
datastore_name = "mnist_datastore"
container_name = "sampledata"

mnist_data = Datastore.register_azure_blob_container(ws, 
                      datastore_name=datastore_name, 
                      container_name= container_name, 
                      account_name=account_name,
                      overwrite=True)

Next, let's specify the default datastore for the outputs.

In [4]:
def_data_store = ws.get_default_datastore()

### Create a FileDataset
A [FileDataset](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.filedataset?view=azure-ml-py) references single or multiple files in your datastores or public urls. The files can be of any format. FileDataset provides you with the ability to download or mount the files to your compute. By creating a dataset, you create a reference to the data source location. If you applied any subsetting transformations to the dataset, they will be stored in the dataset as well. The data remains in its existing location, so no extra storage cost is incurred.

In [5]:
from azureml.core.dataset import Dataset

mnist_ds_name = 'mnist_sample_data'

path_on_datastore = mnist_data.path('mnist')
input_mnist_ds = Dataset.File.from_files(path=path_on_datastore, validate=False)
registered_mnist_ds = input_mnist_ds.register(ws, mnist_ds_name, create_new_version=True)
named_mnist_ds = registered_mnist_ds.as_named_input(mnist_ds_name)

### Intermediate/Output Data
Intermediate data (or the output of a step) is represented by a [PipelineData](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelinedata?view=azure-ml-py) object. PipelineData can be produced by one step and consumed by another: pass the PipelineData object as an output of one step and as an input of one or more later steps.

**Constructing PipelineData**
- name: [Required] Name of the data item within the pipeline graph
- datastore: The Datastore to write this output to
- output_name: Name of the output
- output_mode: Specifies "upload" or "mount" modes for producing output (default: mount)
- output_path_on_compute: For "upload" mode, the path to which the module writes this output during execution
- output_overwrite: Flag to overwrite pre-existing data

In [7]:
from azureml.pipeline.core import Pipeline, PipelineData

output_dir = PipelineData(name="inferences", 
                          datastore=def_data_store, 
                          output_path_on_compute="mnist/results")

### Download the Model

Download and extract the model from https://pipelinedata.blob.core.windows.net/mnist-model/mnist-tf.tar.gz to "models" directory

In [8]:
import tarfile
import urllib.request

# create directory for model
model_dir = 'models'
if not os.path.isdir(model_dir):
    os.mkdir(model_dir)

url = "https://pipelinedata.blob.core.windows.net/mnist-model/mnist-tf.tar.gz"
urllib.request.urlretrieve(url, "model.tar.gz")

# extract the archive; the context manager closes the file afterwards
with tarfile.open("model.tar.gz", "r:gz") as tar:
    tar.extractall(model_dir)

os.listdir(model_dir)

['mnist-tf.model.data-00000-of-00001',
 'mnist-tf.model.index',
 'mnist-tf.model.meta',
 'saved_model.pb']
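
When extracting an archive fetched from a URL, it can be worth confirming that every member stays inside the target directory. The following is a minimal sketch of that idea using only the standard library. The helper name `extract_archive` is hypothetical, and a tiny stand-in archive is built locally so that the example runs without downloading the real model.

```python
import io
import os
import tarfile
import tempfile


def extract_archive(archive_path, target_dir):
    """Extract a .tar.gz archive into target_dir and return the extracted names."""
    os.makedirs(target_dir, exist_ok=True)
    target_root = os.path.realpath(target_dir)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Refuse members that would be written outside target_dir.
            destination = os.path.realpath(os.path.join(target_root, member.name))
            if os.path.commonpath([target_root, destination]) != target_root:
                raise ValueError("unsafe path in archive: " + member.name)
        tar.extractall(target_dir)
    return sorted(os.listdir(target_dir))


# Build a tiny stand-in archive so the sketch runs without a download.
work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, "model.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    payload = b"placeholder"
    info = tarfile.TarInfo(name="saved_model.pb")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

extracted = extract_archive(archive_path, os.path.join(work_dir, "models"))
print(extracted)
```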

### Register the model with Workspace
A registered model is a logical container for one or more files that make up your model. For example, if you have a model that's stored in multiple files, you can register them as a single model in the workspace. After you register the files, you can then download or deploy the registered model and receive all the files that you registered.

Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric. Learn more about registering models [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#registermodel) 

In [9]:
from azureml.core.model import Model

# register downloaded model 
model = Model.register(model_path = "models/",
                       model_name = "mnist", # this is the name the model is registered as
                       tags = {'pretrained': "mnist"},
                       description = "Mnist trained tensorflow model",
                       workspace = ws)

Registering model mnist


### Using your model to make batch predictions
To use the model to make batch predictions, you need an **entry script** and a list of **dependencies**:

#### An entry script
This script accepts requests, scores the requests by using the model, and returns the results.
- __init()__ - Typically this function loads the model into a global object. This function is run only once at the start of batch processing per worker node/process. The init method can use the following environment variables (ParallelRunStep input):
    1. AZUREML_BI_OUTPUT_PATH – output folder path
- __run(mini_batch)__ - The method to be parallelized. Each invocation receives one mini-batch.<BR>
__mini_batch__: Batch inference invokes the run method and passes either a list or a Pandas DataFrame as the argument. Each entry in mini_batch is a file path if the input is a FileDataset, or a Pandas DataFrame if the input is a TabularDataset.<BR>
__run__ method response: The run() method should return a Pandas DataFrame or an array. For the append_row output_action, the returned elements are appended to the common output file. For summary_only, the contents of the elements are ignored. For all output actions, each returned element indicates one successful inference of an input element in the input mini-batch.
    Make sure the inference result includes enough data to map each output back to its input. Inference output is written to the output file in no guaranteed order, so include a key in the output that identifies the corresponding input.
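
The `init()`/`run(mini_batch)` contract described above can be sketched without any Azure or TensorFlow dependency. In this illustrative skeleton the "model" is a stand-in function (an assumption made purely so the example runs anywhere), and each returned row carries the input file name so that results can be matched to inputs even when the output order is not guaranteed.

```python
import os

# Populated once per worker process by init().
g_model = None


def _stand_in_model(file_path):
    """Hypothetical model: 'predicts' the file-name length modulo 10."""
    return len(os.path.basename(file_path)) % 10


def init():
    """Load the model once; a real script would load it from the model path."""
    global g_model
    g_model = _stand_in_model


def run(mini_batch):
    """Score every file path in the mini-batch and return one row per input."""
    results = []
    for file_path in mini_batch:
        prediction = g_model(file_path)
        # Include the file name so each output row can be traced to its input.
        results.append("{}: {}".format(os.path.basename(file_path), prediction))
    return results


# Local dry run that mimics how the entry script would be driven.
init()
rows = run(["/data/mnist/810.png", "/data/mnist/16.png"])
print(rows)
```

Exercising the two functions locally like this is a cheap way to catch errors in an entry script before submitting a pipeline run.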
    

#### Dependencies
Helper scripts or Python/Conda packages required to run the entry script or model.

The deployment configuration for the compute target that hosts the deployed model. This configuration describes things like memory and CPU requirements needed to run the model.

These items are encapsulated into an inference configuration and a deployment configuration. The inference configuration references the entry script and other dependencies. You define these configurations programmatically when you use the SDK to perform the deployment. You define them in JSON files when you use the CLI.

In [10]:
import os

scripts_folder = "Code"
script_file = "digit_identification.py"

# peek at contents
with open(os.path.join(scripts_folder, script_file)) as inference_file:
    print(inference_file.read())


# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import os
import numpy as np
import tensorflow as tf
from PIL import Image
from azureml.core import Model


def init():
    global g_tf_sess

    # pull down model from workspace
    model_path = Model.get_model_path("mnist")

    # contruct graph to execute
    tf.reset_default_graph()
    saver = tf.train.import_meta_graph(os.path.join(model_path, 'mnist-tf.model.meta'))
    g_tf_sess = tf.Session(config=tf.ConfigProto(device_count={'GPU': 0}))
    saver.restore(g_tf_sess, os.path.join(model_path, 'mnist-tf.model'))


def run(mini_batch):
    print(f'run method start: {__file__}, run({mini_batch})')
    resultList = []
    in_tensor = g_tf_sess.graph.get_tensor_by_name("network/X:0")
    output = g_tf_sess.graph.get_tensor_by_name("network/output/MatMul:0")

    for image in mini_batch:
        # prepare each image
        data = Image.open(image)
        np_im = np.array(data).reshape((1, 784))
     

## Build and run the batch inference pipeline
The data, models, and compute resource are now available. Let's put all these together in a pipeline.

###  Specify the environment to run the script
Specify the conda dependencies for your script. This will allow us to install pip packages as well as configure the inference environment.

In [11]:
from azureml.core import Environment
from azureml.core.runconfig import CondaDependencies, DEFAULT_CPU_IMAGE

batch_conda_deps = CondaDependencies.create(pip_packages=["tensorflow==1.13.1", "pillow"])

batch_env = Environment(name="batch_environment")
batch_env.python.conda_dependencies = batch_conda_deps
batch_env.docker.enabled = True
batch_env.docker.base_image = DEFAULT_CPU_IMAGE

###  Create the configuration to wrap the inference script

Before we can run this step, we need to install the preview SDK for the Parallel Run Step.

In [15]:
%pip install --extra-index-url https://pypi.python.org/simple --index-url https://azuremlsdktestpypi.azureedge.net/sdk-release/Candidate/604C89A437BA41BD942B4F46D9A3591D/ azureml-contrib-pipeline-steps azureml-widgets

Looking in indexes: https://azuremlsdktestpypi.azureedge.net/sdk-release/Candidate/604C89A437BA41BD942B4F46D9A3591D/, https://pypi.python.org/simple
Collecting azureml-contrib-pipeline-steps
  Downloading https://azuremlsdktestpypi.blob.core.windows.net/repo/sdk-release/Candidate/604C89A437BA41BD942B4F46D9A3591D/azureml_contrib_pipeline_steps-1.0.72-py3-none-any.whl?sv=2017-07-29&sr=b&sig=ilyL8TKpy3QB%2BkXgHLz67GwQdJZkJ02pAA21Y5e3elQ%3D&st=2019-10-25T18%3A13%3A09Z&se=2020-10-25T18%3A13%3A09Z&sp=rl
Collecting azureml-core==1.0.72.* (from azureml-contrib-pipeline-steps)
  Downloading https://azuremlsdktestpypi.blob.core.windows.net/repo/sdk-release/Candidate/604C89A437BA41BD942B4F46D9A3591D/azureml_core-1.0.72-py2.py3-none-any.whl?sv=2017-07-29&sr=b&sig=Rnxn%2BsBQYnJc5D3H6HdzhIK4i6deTwpIG05mHL2g6Hc%3D&st=2019-10-25T18%3A13%3A09Z&se=2020-10-25T18%3A13%3A09Z&sp=rl (1.1MB)
Collecting azureml-pipeline-core==1.0.

  Downloading https://files.pythonhosted.org/packages/36/94/23135312f97b20d6457294606fb70fad43ef93b7bffe567088ebe3623703/pyarrow-0.11.1-cp36-cp36m-manylinux1_x86_64.whl (11.6MB)
Collecting azureml-train-core==1.0.72.* (from azureml-pipeline-steps==1.0.72.*->azureml-contrib-pipeline-steps)
  Downloading https://azuremlsdktestpypi.blob.core.windows.net/repo/sdk-release/Candidate/604C89A437BA41BD942B4F46D9A3591D/azureml_train_core-1.0.72-py3-none-any.whl?sv=2017-07-29&sr=b&sig=GZaUl47ZSf5MXiAN547Jpp8x%2FSZBXUisML%2FulcKbE4o%3D&st=2019-10-25T18%3A13%3A10Z&se=2020-10-25T18%3A13%3A10Z&sp=rl (81kB)
Collecting azureml-train-restclients-hyperdrive==1.0.72.* (from azureml-train-core==1.0.72.*->azureml-pipeline-steps==1.0.72.*->azureml-contrib-pipeline-steps)
  Downloading https://azuremlsdktestpypi.blob.core.windows.net/repo/sdk-release/Candidate/

ERROR: azureml-widgets 1.0.69.1 has requirement azureml-core==1.0.69.*, but you'll have azureml-core 1.0.72 which is incompatible.
ERROR: azureml-train 1.0.69 has requirement azureml-train-core==1.0.69.*, but you'll have azureml-train-core 1.0.72 which is incompatible.
ERROR: azureml-train-core 1.0.72 has requirement azureml-telemetry==1.0.72.*, but you'll have azureml-telemetry 1.0.69 which is incompatible.
ERROR: azureml-train-automl 1.0.69 has requirement azureml-core==1.0.69.*, but you'll have azureml-core 1.0.72 which is incompatible.
ERROR: azureml-train-automl 1.0.69 has requirement azureml-pipeline-core==1.0.69.*, but you'll have azureml-pipeline-core 1.0.72 which is incompatible.
ERROR: azureml-train-automl 1.0.69 has requirement wheel==0.30.0, but you'll have wheel 0.33.6 which is incompatible.
ERROR: azureml-tensorboard 1.0.69 has requirement azureml-core==1.0.69.*, but you'll have azureml-core 1.0.72 which is incomp
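
The installer messages above report `azureml-*` packages at mixed versions (1.0.69 next to 1.0.72). As a rough sketch, a check like the following can list which packages do not match an expected release. The function name `find_version_mismatches` is hypothetical, and the package versions in the demo dictionary are copied from the messages above rather than read from a live environment.

```python
def find_version_mismatches(installed, expected_release):
    """Return the azureml-* packages whose version is not in expected_release."""
    mismatched = {}
    for name, version in sorted(installed.items()):
        if not name.startswith("azureml-"):
            continue  # only the SDK packages are compared
        same_release = (version == expected_release
                        or version.startswith(expected_release + "."))
        if not same_release:
            mismatched[name] = version
    return mismatched


# Versions as reported in the installer output above.
reported = {
    "azureml-core": "1.0.72",
    "azureml-pipeline-core": "1.0.72",
    "azureml-train-core": "1.0.72",
    "azureml-telemetry": "1.0.69",
    "azureml-train": "1.0.69",
    "azureml-widgets": "1.0.69.1",
    "wheel": "0.33.6",
}

for name, version in find_version_mismatches(reported, "1.0.72").items():
    print(name, version)
```

To run the same check against a real environment, the dictionary could be built with `importlib.metadata.distributions()` (Python 3.8 or later), mapping each distribution's name to its `version`.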

Define pipeline configurations.

In [16]:
from azureml.contrib.pipeline.steps import ParallelRunStep, ParallelRunConfig

parallel_run_config = ParallelRunConfig(
    source_directory=scripts_folder,
    input_format = 'file',
    entry_script=script_file,
    mini_batch_size="5",
    error_threshold=10,
    output_action="append_row",
    environment=batch_env,
    compute_target=compute_target,
    node_count=2)
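
With file input and a mini-batch size of 5, each call to `run()` receives a list of up to five file paths. The sketch below only illustrates that partitioning idea on a plain Python list. It is not how ParallelRunStep is implemented, and the function name `split_into_mini_batches` and the sample file names are assumptions made for the example.

```python
def split_into_mini_batches(items, mini_batch_size):
    """Split a list into consecutive chunks of at most mini_batch_size items."""
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be at least 1")
    return [items[start:start + mini_batch_size]
            for start in range(0, len(items), mini_batch_size)]


# Twelve hypothetical image files split into mini-batches of five.
files = ["{}.png".format(i) for i in range(12)]
batches = split_into_mini_batches(files, 5)
print([len(batch) for batch in batches])  # prints [5, 5, 2]
```

A smaller mini-batch size spreads work more evenly across nodes at the cost of more `run()` invocations, while a larger one reduces per-call overhead.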

### Create the pipeline step
Create the pipeline step using the script, environment configuration, and parameters. Specify the compute target you already attached to your workspace as the target of execution of the script. We will use ParallelRunStep to create the pipeline step.

In [17]:
parallelrun_step = ParallelRunStep(
    name="predict-digits-mnist",
    parallel_run_config=parallel_run_config,
    inputs=[ named_mnist_ds ],
    output=output_dir,
    models=[ model ],
    arguments=[ ],
    allow_reuse=True
)



### Run the pipeline
At this point you can run the pipeline and examine the output it produces. The Experiment object is used to track the pipeline run.

In [18]:
from azureml.core import Experiment

pipeline = Pipeline(workspace=ws, steps=[parallelrun_step])
experiment = Experiment(ws, 'digit_identification')
pipeline_run = experiment.submit(pipeline)



Created step predict-digits-mnist [bbd38510][26d65fe7-a9ff-470b-a19f-6e58663109b7], (This step will run and generate new outputs)
Created data reference mnist_sample_data_0 for StepId [c22a57fc][4079e59e-61c8-4104-9684-c0d5853908a1], (Consumers of this data will generate new runs.)
Submitted PipelineRun dd3626d7-1e09-425e-a129-311d83c599f4
Link to Azure Portal: https://mlworkspace.azure.ai/portal/subscriptions/07a3b836-0813-4c05-afd4-3a7ab00358d9/resourceGroups/aml-rg-107327/providers/Microsoft.MachineLearningServices/workspaces/107327-aml-ws/experiments/digit_identification/runs/dd3626d7-1e09-425e-a129-311d83c599f4


### Monitor the run

In [19]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

_PipelineWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', …

### Optional: View detailed logs (streaming) 

In [20]:
pipeline_run.wait_for_completion(show_output=True)

PipelineRunId: dd3626d7-1e09-425e-a129-311d83c599f4
Link to Portal: https://mlworkspace.azure.ai/portal/subscriptions/07a3b836-0813-4c05-afd4-3a7ab00358d9/resourceGroups/aml-rg-107327/providers/Microsoft.MachineLearningServices/workspaces/107327-aml-ws/experiments/digit_identification/runs/dd3626d7-1e09-425e-a129-311d83c599f4
PipelineRun Status: Running


StepRunId: 4b654a9d-beb0-4e31-ad9a-390403b60911
Link to Portal: https://mlworkspace.azure.ai/portal/subscriptions/07a3b836-0813-4c05-afd4-3a7ab00358d9/resourceGroups/aml-rg-107327/providers/Microsoft.MachineLearningServices/workspaces/107327-aml-ws/experiments/digit_identification/runs/4b654a9d-beb0-4e31-ad9a-390403b60911
StepRun( predict-digits-mnist ) Status: NotStarted
StepRun( predict-digits-mnist ) Status: Running

Streaming azureml-logs/20_image_build_log.txt
2019/10/26 23:23:08 Downloading source code...
2019/10/26 23:23:09 Finished downloading source code
2019/10/26 23:23:09 Creating Docker network: acb_default_network, driver

Executing transaction: ...working... done
Collecting tensorflow==1.13.1
  Downloading https://files.pythonhosted.org/packages/77/63/a9fa76de8dffe7455304c4ed635be4aa9c0bacef6e0633d87d5f54530c5c/tensorflow-1.13.1-cp36-cp36m-manylinux1_x86_64.whl (92.5MB)
Collecting pillow
  Downloading https://files.pythonhosted.org/packages/10/5c/0e94e689de2476c4c5e644a3bd223a1c1b9e2bdb7c510191750be74fa786/Pillow-6.2.1-cp36-cp36m-manylinux1_x86_64.whl (2.1MB)
Collecting azure-storage-queue~=2.1
  Downloading https://files.pythonhosted.org/packages/69/08/a0dcde53f9203fa83111e28fc368404c818a8c54c077ae66559b237c9dd4/azure_storage_queue-2.1.0-py2.py3-none-any.whl
Collecting azure-storage-common~=2.1
  Downloading https://files.pythonhosted.org/packages/6b/a0/6794b318ce0118d1a4053bdf0149a60807407db9b710354f2b203c2f5975/azure_storage_common-2.1.0-py2.py3-none-any.whl (47kB)
Collecting azureml-core~=1.0
  Downloading https://files.pythonhosted.org/packages/9a/4f/b5c71c45f9aa82aa2636dd5ec7e19c6c11138c8ef127faa5

  Downloading https://files.pythonhosted.org/packages/7c/0d/80815326fa04f2a73ea94b0f57c29669c89df5aa5f5e285952f6445a91c4/azure_mgmt_resource-5.1.0-py2.py3-none-any.whl (681kB)
Collecting PyJWT
  Downloading https://files.pythonhosted.org/packages/87/8b/6a9f14b5f781697e51259d81657e6048fd31a113229cf346880bb7545565/PyJWT-1.7.1-py2.py3-none-any.whl
Collecting urllib3>=1.23
  Downloading https://files.pythonhosted.org/packages/e0/da/55f51ea951e1b7c63a579c09dd7db825bb730ec1fe9c0180fc77bfb31448/urllib3-1.25.6-py2.py3-none-any.whl (125kB)
Collecting applicationinsights
  Downloading https://files.pythonhosted.org/packages/a1/53/234c53004f71f0717d8acd37876e0b65c121181167057b9ce1b1795f96a0/applicationinsights-0.11.9-py2.py3-none-any.whl (58kB)
Collecting cloudpickle>=1.1.0
  Downloading https://files.pythonhosted.org/packages/c1/49/334e279caa3231255725c8e860fa93e72083567625573421db8875846c14/cloudpickle-1.2.2-py2.py3-none-any.whl
Collecting dotnetcore2>=2.1.9
  Downloading https://files.pythonho

#
# To activate this environment, use:
# > source activate /azureml-envs/azureml_46bc96c33b52e0bb335f778cfc3b1793
#
# To deactivate an active environment, use:
# > source deactivate
#

Removing intermediate container 7f1ddd18cc52
 ---> 2ede5184358b
Step 9/15 : ENV PATH /azureml-envs/azureml_46bc96c33b52e0bb335f778cfc3b1793/bin:$PATH
 ---> Running in 7e048dee6739
Removing intermediate container 7e048dee6739
 ---> 89a2da204496
Step 10/15 : ENV AZUREML_CONDA_ENVIRONMENT_PATH /azureml-envs/azureml_46bc96c33b52e0bb335f778cfc3b1793
 ---> Running in e34c3b46b0ab
Removing intermediate container e34c3b46b0ab
 ---> 3982fbe6bdad
Step 11/15 : ENV LD_LIBRARY_PATH /azureml-envs/azureml_46bc96c33b52e0bb335f778cfc3b1793/lib:$LD_LIBRARY_PATH
 ---> Running in 5605427c7a33
Removing intermediate container 5605427c7a33
 ---> 5a26a2f17b38
Step 12/15 : COPY azureml-environment-setup/spark_cache.py azureml-environment-setup/log4j.properties /azureml-environment-setup/
 ---> 02bb731db448
Step 13/15 :

bash: /azureml-envs/azureml_46bc96c33b52e0bb335f778cfc3b1793/lib/libtinfo.so.5: no version information available (required by bash)
Starting the daemon thread to refresh tokens in background for process with pid = 176
Entering Run History Context Manager.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
2019-10-26 23:34:24.235854: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-10-26 23:34:24.241284: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2294685000 Hz
2019-10-26 23:34:24.241522: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x7fc7d06d9fa0 executing computations on platform Host. 

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/810.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/642.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/974.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/302.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/377.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/790.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/912.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/514.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/658.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/768.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/174.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/589.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/141.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/865.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/16.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mn

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/630.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/832.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/513.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/850.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/163.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/407.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/951.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/562.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/700.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m

run method start: /mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/digit_identification.py, run(['/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/515.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/339.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/mnist/193.png', '/mnt/batch/tasks/shared/LS_root/jobs/107327-aml-ws/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mounts/workspaceblobstore/azureml/4b654a9d-beb0-4e31-ad9a-390403b60911/mnist_datastore/m


StepRun(predict-digits-mnist) Execution Summary
StepRun( predict-digits-mnist ) Status: Finished
{'runId': '4b654a9d-beb0-4e31-ad9a-390403b60911', 'target': 'cpu-cluster', 'status': 'Completed', 'startTimeUtc': '2019-10-26T23:30:59.861764Z', 'endTimeUtc': '2019-10-26T23:35:28.466571Z', 'properties': {'azureml.runsource': 'azureml.StepRun', 'ContentSnapshotId': '9d91ca79-dd31-4b14-9ce1-6d8102a56e25', 'StepType': 'PythonScriptStep', 'ComputeTargetType': 'AmlCompute', 'azureml.pipelinerunid': 'dd3626d7-1e09-425e-a129-311d83c599f4', '_azureml.ComputeTargetType': 'batchai', 'AzureML.DerivedImageName': 'azureml/azureml_5bfb56a3bc693485eb8783aa1349dcf4', 'ProcessInfoFile': 'azureml-logs/process_info.json', 'ProcessStatusFile': 'azureml-logs/process_status.json'}, 'inputDatasets': [], 'runDefinition': {'script': 'driver/amlbi_main.py', 'arguments': ['--scoring_module_name', 'digit_identification.py', '--process_count_per_node', '$AML_PARAMETER_aml_process_count_per_node', '--output', '$AZUREM



PipelineRun Execution Summary
PipelineRun Status: Finished
{'runId': 'dd3626d7-1e09-425e-a129-311d83c599f4', 'status': 'Completed', 'startTimeUtc': '2019-10-26T23:22:55.132596Z', 'endTimeUtc': '2019-10-26T23:35:32.84928Z', 'properties': {'azureml.runsource': 'azureml.PipelineRun', 'runSource': None, 'runType': 'HTTP', 'azureml.parameters': '{"aml_process_count_per_node":"1","aml_node_count":"2"}'}, 'inputDatasets': [], 'logFiles': {'logs/azureml/executionlogs.txt': 'https://mlstrgoqbt37tuc4fbq.blob.core.windows.net/azureml/ExperimentRun/dcid.dd3626d7-1e09-425e-a129-311d83c599f4/logs/azureml/executionlogs.txt?sv=2018-11-09&sr=b&sig=MfeLk834ev6%2FbhcA1w0QWnz5Tw4THz2JkqAb55wsKj4%3D&st=2019-10-26T23%3A25%3A34Z&se=2019-10-27T07%3A35%3A34Z&sp=r', 'logs/azureml/stderrlogs.txt': 'https://mlstrgoqbt37tuc4fbq.blob.core.windows.net/azureml/ExperimentRun/dcid.dd3626d7-1e09-425e-a129-311d83c599f4/logs/azureml/stderrlogs.txt?sv=2018-11-09&sr=b&sig=YWo0UmvuWS5gC4PpXv21P9u%2Bn7oVPSFR1RL1TMr0EQw%3D&s

'Finished'

### View the prediction results per input image
In the scoring script above (digit_identification.py), you can see that the ResultList with the filename and the prediction result gets returned. These results are written to the DataStore specified in the PipelineData object as the output data, which in this case is called *inferences*. This output contains the results from all of the worker nodes used in the compute cluster. You can download this data to view the results; the cell below shows only the first 10 rows.

In [None]:
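import os
import shutil

import pandas as pd

# remove previous run results, if present
shutil.rmtree("mnist_results", ignore_errors=True)

# take the first child run of the pipeline (the predict-digits-mnist step)
batch_run = next(pipeline_run.get_children())
batch_output = batch_run.get_output_data("inferences")
batch_output.download(local_path="mnist_results")

# locate the results file in the downloaded output
result_file = None
for root, dirs, files in os.walk("mnist_results"):
    for file in files:
        if file.endswith('parallel_run_step.txt'):
            result_file = os.path.join(root, file)

if result_file is None:
    raise FileNotFoundError("parallel_run_step.txt not found under mnist_results")

# split each line on ":" into a filename and a prediction
df = pd.read_csv(result_file, delimiter=":", header=None)
df.columns = ["Filename", "Prediction"]
print("Prediction has ", df.shape[0], " rows")
df.head(10)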
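
Beyond viewing the first rows, it can be useful to check how the predictions are distributed across digits. The following is a minimal sketch using only the standard library. It assumes each line of the results file has the form `<filename>: <prediction>`, as the `delimiter=":"` read above implies. The helper name `count_predictions` and the sample lines are hypothetical and are not actual output from this pipeline run.

```python
from collections import Counter
import io


def count_predictions(lines):
    """Tally how many files were assigned to each predicted label.

    Assumes each non-empty line looks like "<filename>: <prediction>".
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so a filename containing ":" still parses.
        filename, _, prediction = line.rpartition(":")
        counts[int(prediction)] += 1
    return counts


# Hypothetical sample lines, not actual output from this pipeline run.
sample = io.StringIO("0.png: 5\n1.png: 0\n2.png: 5\n")
digit_counts = count_predictions(sample)
for digit, n in sorted(digit_counts.items()):
    print(f"digit {digit}: {n} image(s)")
```

To apply the same tally to the downloaded results, you could pass an open file handle for `result_file` instead of the in-memory sample.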

## Cleanup Compute resources

For recurring jobs, it may be wise to keep the compute resources and allow the compute nodes to scale down to 0. However, since this is just a single-run job, we are free to release the allocated compute resources.

In [None]:
# uncomment below and run if compute resources are no longer needed 
# compute_target.delete() 