# Deploy batch inference pipelines with Azure Machine Learning
In many production scenarios, long-running tasks that operate on large volumes of data are performed as *batch* operations. In machine learning, *batch inferencing* is used to apply a predictive model to multiple cases asynchronously - usually writing the results to a file or database. Typically this is done by creating a pipeline.

## Learning objectives
- Publish a batch inference pipeline for a trained model.
- Use a batch inference pipeline to generate predictions.

# Creating a batch inference pipeline
1. Register a model - same as in real-time inferencing
2. Create a scoring script (also known as an entry script) - the difference here is that the `run` function receives a `mini_batch` argument and processes its items as a list. `run` is called once for each batch of data to be processed.
3. Create a pipeline with a `ParallelRunStep` - we don't need to define a pipeline in real-time inferencing.
4. Run the pipeline and retrieve the step output
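
The `init`/`run` contract from step 2 can be sketched as follows. This is a minimal illustration, not the actual entry script: `MeanThresholdModel` is a hypothetical stand-in for a registered model, and the comma-separated input format is an assumption. A real script would load the registered model file in `init`.

```python
import os

model = None  # populated once per worker process by init()


class MeanThresholdModel:
    """Hypothetical stand-in for a registered model; not an Azure ML class."""

    def predict(self, values):
        # Label a case 1 when the mean of its feature values exceeds 0.5.
        return 1 if sum(values) / len(values) > 0.5 else 0


def init():
    # Called once when the worker starts: load the model into a global.
    # A real script would load the registered model file here instead.
    global model
    model = MeanThresholdModel()


def run(mini_batch):
    # Called once per mini-batch. For a File dataset, mini_batch is a list
    # of file paths; return one result per processed file.
    results = []
    for file_path in mini_batch:
        with open(file_path) as f:
            values = [float(v) for v in f.read().split(",")]
        prediction = model.predict(values)
        results.append("{}: {}".format(os.path.basename(file_path), prediction))
    return results
```

Formatting each result as `file: prediction` is a design choice that lets the collated output be read back later with `:` as the delimiter.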

## 3. Create a pipeline with a ParallelRunStep
Azure Machine Learning provides a type of pipeline step specifically for performing parallel batch inferencing. Using the `ParallelRunStep` class, you can read batches of files from a `File` dataset and write the processing output to a `PipelineData` reference. Additionally, you can set the `output_action` setting for the step to "append_row", which ensures that all instances of the step running in parallel collate their results into a single output file named *parallel_run_step.txt*. The following code snippet shows an example of creating a pipeline with a `ParallelRunStep`:

In [None]:
from azureml.pipeline.steps import ParallelRunConfig, ParallelRunStep
from azureml.pipeline.core import PipelineData
from azureml.pipeline.core import Pipeline

# Get the batch dataset for input (ws is an existing Workspace object)
batch_data_set = ws.datasets['batch-data']

# Set the output location
default_ds = ws.get_default_datastore()
output_dir = PipelineData(name='inferences',
                          datastore=default_ds,
                          output_path_on_compute='results')

# Define the parallel run step configuration
# (batch_env and aml_cluster are assumed to be an existing environment and compute cluster)
parallel_run_config = ParallelRunConfig(
    source_directory='batch_scripts',
    entry_script="batch_scoring_script.py",
    mini_batch_size="5",
    error_threshold=10,
    output_action="append_row",
    environment=batch_env,
    compute_target=aml_cluster,
    node_count=4)

# Create the parallel run step
parallelrun_step = ParallelRunStep(
    name='batch-score',
    parallel_run_config=parallel_run_config,
    inputs=[batch_data_set.as_named_input('batch_data')],
    output=output_dir,
    arguments=[],
    allow_reuse=True
)
# Create the pipeline
pipeline = Pipeline(workspace=ws, steps=[parallelrun_step])
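
To make the effect of `mini_batch_size` and `output_action="append_row"` concrete, here is a local simulation in plain Python. It is only a sketch of the contract: the helper names are hypothetical, the placeholder `run` stands in for the entry script, and the real service distributes mini-batches across nodes in parallel, so the order of rows in the collated file may not match the input order.

```python
def make_mini_batches(items, mini_batch_size):
    # Split the inputs into consecutive groups of at most mini_batch_size.
    return [items[i:i + mini_batch_size]
            for i in range(0, len(items), mini_batch_size)]


def run(mini_batch):
    # Placeholder for the entry script's run(): one result row per input file.
    return ["{}: scored".format(name) for name in mini_batch]


def simulate_append_row(items, mini_batch_size):
    # Call run() once per mini-batch and collate every returned row into
    # one list, mirroring how "append_row" produces a single output file.
    rows = []
    for mini_batch in make_mini_batches(items, mini_batch_size):
        rows.extend(run(mini_batch))
    return rows


files = ["file_{}.csv".format(i) for i in range(12)]
batches = make_mini_batches(files, 5)
print([len(b) for b in batches])           # [5, 5, 2]
print(len(simulate_append_row(files, 5)))  # 12
```

With a File dataset, a mini-batch size of 5 means `run` receives up to five file paths per call, so twelve files result in three calls.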

## 4. Run the pipeline and retrieve the step output
After your pipeline has been defined, you can run it and wait for it to complete. Then you can retrieve the `parallel_run_step.txt` file from the output of the step to view the results, as shown in the following code example:

In [1]:
import os
import pandas as pd
from azureml.core import Experiment

# Run the pipeline as an experiment
pipeline_run = Experiment(ws, 'batch_prediction_pipeline').submit(pipeline)
pipeline_run.wait_for_completion(show_output=True)

# Get the outputs from the first (and only) step
prediction_run = next(pipeline_run.get_children())
prediction_output = prediction_run.get_output_data('inferences')
prediction_output.download(local_path='results')

# Find the parallel_run_step.txt file
for root, dirs, files in os.walk('results'):
    for file in files:
        if file.endswith('parallel_run_step.txt'):
            result_file = os.path.join(root,file)

# Load and display the results
df = pd.read_csv(result_file, delimiter=":", header=None)
df.columns = ["File", "Prediction"]
print(df)

# Publishing a batch inference pipeline
You can publish a batch inferencing pipeline as a REST service, as shown in the following example code:

In [None]:
published_pipeline = pipeline_run.publish_pipeline(name='Batch_Prediction_Pipeline',
                                                   description='Batch pipeline',
                                                   version='1.0')
rest_endpoint = published_pipeline.endpoint

Once published, you can use the service endpoint to initiate a batch inferencing job, as shown in the following example code:

In [None]:
import requests

# auth_header is assumed to be a dict containing a valid Authorization header
response = requests.post(rest_endpoint,
                         headers=auth_header,
                         json={"ExperimentName": "Batch_Prediction"})
run_id = response.json()["Id"]
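
The example above leaves `auth_header` undefined. The sketch below shows the shape of the request using only the standard library, so it can be inspected without sending anything. It assumes `auth_header` is a dict of the form `{"Authorization": "Bearer <token>"}`; the `build_pipeline_request` helper, the endpoint URL, and the token are hypothetical placeholders, not values from a real workspace.

```python
import json
import urllib.request


def build_pipeline_request(rest_endpoint, auth_header, experiment_name):
    # Combine the caller-supplied auth header with a JSON content type.
    headers = dict(auth_header)
    headers["Content-Type"] = "application/json"
    body = json.dumps({"ExperimentName": experiment_name}).encode("utf-8")
    return urllib.request.Request(rest_endpoint, data=body,
                                  headers=headers, method="POST")


# Placeholder values: substitute a real endpoint and token before sending.
request = build_pipeline_request(
    "https://example.invalid/pipelines/placeholder-id",
    {"Authorization": "Bearer <access-token>"},
    "Batch_Prediction")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen(request)`) should return a JSON document from which the run `Id` can be read, as in the `requests` version.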

You can also schedule the published pipeline to run automatically, as shown in the following example code:

In [None]:
from azureml.pipeline.core import ScheduleRecurrence, Schedule

weekly = ScheduleRecurrence(frequency='Week', interval=1)
pipeline_schedule = Schedule.create(ws, name='Weekly Predictions',
                                    description='batch inferencing',
                                    pipeline_id=published_pipeline.id,
                                    experiment_name='Batch_Prediction',
                                    recurrence=weekly)