# Batch Predictions


Amazon SageMaker Batch Transform lets you make predictions on batches of data in S3 without setting up a REST endpoint. Batch predictions are also called “offline” predictions because they do not require an online REST endpoint. Batch prediction is typically meant for higher-throughput workloads that can tolerate higher latency and lower freshness, so batch prediction servers usually do not run 24 hours per day like real-time prediction servers. They run for a few hours on a batch of data, then shut down, hence the term “batch.” Batch Transform manages all of the resources needed to perform the inferences, including launching the cluster and terminating it after the job completes.

![](img/batch_transform_tensorflow.gif)

In [None]:
import boto3
import sagemaker
import pandas as pd

sess   = sagemaker.Session()
bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
region = boto3.Session().region_name

sm = boto3.Session().client(service_name='sagemaker', region_name=region)

# Set Up Batch Transform Model

In [None]:
%store -r training_job_name

In [None]:
print(training_job_name)

In [None]:
!aws s3 cp s3://$bucket/$training_job_name/output/model.tar.gz ./model.tar.gz

In [None]:
!tar -xvzf ./model.tar.gz

In [None]:
!saved_model_cli show --all --dir ./tensorflow/saved_model/0/

In [None]:
!pygmentize ./src_batch_tsv/inference.py

# Configure TensorFlow Serving for Batch Inference

In [None]:
from sagemaker.tensorflow.serving import Model

batch_env = {
  # Configures whether to enable record batching.
  'SAGEMAKER_TFS_ENABLE_BATCHING': 'true',

  # Name of the model - this is important in multi-model deployments
  'SAGEMAKER_TFS_DEFAULT_MODEL_NAME': 'saved_model',

  # Configures how long to wait for a full batch, in microseconds.
  'SAGEMAKER_TFS_BATCH_TIMEOUT_MICROS': '50000', # microseconds

  # Corresponds to "max_batch_size" in TensorFlow Serving.
  'SAGEMAKER_TFS_MAX_BATCH_SIZE': '10000',

  # Number of seconds for the SageMaker web server timeout
  'SAGEMAKER_MODEL_SERVER_TIMEOUT': '7200', # Seconds

  # Configures number of batches that can be enqueued.
  'SAGEMAKER_TFS_MAX_ENQUEUED_BATCHES': '10000'
}

# Configure the Parallelism and Payload Size
To increase performance, you can raise the `max_concurrent_transforms` parameter. Tune this on a single instance before scaling out the number of instances: if you have only a small number of input files, additional instances can sit idle and waste money. Note that `max_concurrent_transforms * max_payload <= 100`, where `max_payload` is in megabytes.

In [None]:
max_concurrent_transforms=1
max_payload=1      # Megabytes (not number of records)
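
The constraint above is easy to check before launching a job. The following is a minimal sketch, not part of the SageMaker SDK: `check_transform_limits` and its `limit_mb` parameter are hypothetical names introduced here for illustration, and the limit of 100 is taken from the note above.

```python
def check_transform_limits(max_concurrent_transforms, max_payload_mb, limit_mb=100):
    """Return True if max_concurrent_transforms * max_payload_mb stays within limit_mb.

    Hypothetical helper (not a SageMaker API); limit_mb=100 mirrors the
    constraint stated in this notebook.
    """
    if max_concurrent_transforms < 1 or max_payload_mb < 0:
        raise ValueError("expected max_concurrent_transforms >= 1 and max_payload_mb >= 0")
    return max_concurrent_transforms * max_payload_mb <= limit_mb


# Example values only; the first pair matches the settings in the cell above.
print(check_transform_limits(1, 1))    # 1 * 1 = 1
print(check_transform_limits(16, 6))   # 16 * 6 = 96
print(check_transform_limits(20, 6))   # 20 * 6 = 120
```

Running a check like this while tuning on a single instance catches invalid combinations before the transform job is submitted.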

# Set Up Instance Type and Instance Count for Our Cluster

In [None]:
instance_type='ml.c5.9xlarge'
instance_count=1

# Set Up Input Data and Configuration
This includes the batch strategy (SingleRecord vs. MultiRecord), compression_type, accept_type, content_type, split types, etc.

In [None]:
strategy='MultiRecord'
compression_type='Gzip'
accept_type='text/csv'
content_type='text/csv'
assemble_with='Line'
split_type='Line'
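
To build intuition for how `strategy='MultiRecord'`, `split_type='Line'`, and `max_payload` interact, the sketch below greedily packs newline-delimited records into payloads that stay under a byte limit. This is a conceptual illustration only, not SageMaker's actual implementation; `pack_lines` and `max_payload_bytes` are hypothetical names introduced here.

```python
def pack_lines(lines, max_payload_bytes):
    """Greedily group newline-terminated records into payloads of at most max_payload_bytes.

    Conceptual illustration of multi-record batching; the real service
    logic may differ.
    """
    payloads = []
    current = []
    current_size = 0
    for line in lines:
        record = line.encode("utf-8") + b"\n"
        if len(record) > max_payload_bytes:
            raise ValueError("a single record exceeds the payload limit")
        # Flush the current payload if adding this record would exceed the limit.
        if current and current_size + len(record) > max_payload_bytes:
            payloads.append(b"".join(current))
            current, current_size = [], 0
        current.append(record)
        current_size += len(record)
    if current:
        payloads.append(b"".join(current))
    return payloads


# Tiny limit (10 bytes) so the grouping is visible; real limits are in megabytes.
batches = pack_lines(["aaaa", "bbbb", "cccc", "dd"], max_payload_bytes=10)
for payload in batches:
    print(len(payload), payload)
```

With a single-record strategy, each record would instead be sent in its own request, which is simpler but usually means many more invocations.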

In [None]:
input_csv_s3_uri = 's3://{}/amazon-reviews-pds/tsv/'.format(bucket)
print(input_csv_s3_uri)

In [None]:
!aws s3 ls --recursive $input_csv_s3_uri

# Set Up Batch Transformer
We are using a previously trained model specified at `model_s3_uri`.

In [None]:
model_s3_uri = 's3://{}/{}/output/model.tar.gz'.format(bucket, training_job_name)

batch_model = Model(entry_point='inference.py',
                    source_dir='src_batch_tsv',       
                    model_data=model_s3_uri,
                    role=role,
                    framework_version='2.1.0',
                    env=batch_env)

In [None]:
batch_predictor = batch_model.transformer(strategy=strategy, 
                                          instance_type=instance_type,
                                          instance_count=instance_count,
                                          accept=accept_type,
                                          assemble_with=assemble_with,
                                          max_concurrent_transforms=max_concurrent_transforms,
                                          max_payload=max_payload, # This is in Megabytes (not number of records)
                                          env=batch_env)

# Start Batch Predictions

In [None]:
batch_predictor.transform(data=input_csv_s3_uri,
                          split_type=split_type,
                          compression_type=compression_type,
                          content_type=content_type,
#                          join_source='Input', # Mismatched line count between input and output
                          experiment_config=None,
                          wait=False)

In [None]:
from IPython.display import display, HTML

display(HTML('<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/transform-jobs/{}?region={}&tab=Monitor">Batch Prediction Job</a></b>'.format(region, batch_predictor.latest_transform_job.job_name, region)))


In [None]:
from IPython.display import display, HTML

display(HTML('<b>Review <a target="_blank" href="https://console.aws.amazon.com/cloudwatch/home?region={}#logStream:group=/aws/sagemaker/TransformJobs;prefix={};streamFilter=typeLogStreamPrefix">CloudWatch Logs</a></b>'.format(region, batch_predictor.latest_transform_job.job_name)))


In [None]:
from IPython.display import display, HTML

display(HTML('<b>Review <a target="_blank" href="https://console.aws.amazon.com/s3/buckets/{}/{}/?region={}">Batch Prediction S3 Output</a></b>'.format(bucket, batch_predictor.latest_transform_job.job_name, region)))


In [None]:
print('Waiting for batch prediction job: ' + batch_predictor.latest_transform_job.job_name)

batch_predictor.wait(logs=False)
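
As an alternative to blocking on `wait()`, the job status can be polled through the low-level client. The following is a minimal sketch that assumes a boto3 SageMaker client such as the `sm` object created earlier; `wait_for_transform_job`, `poll_seconds`, and the injectable `sleep` argument are hypothetical names introduced here for illustration.

```python
import time

# Statuses after which a transform job will not change state again.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_transform_job(sm_client, job_name, poll_seconds=30, sleep=time.sleep):
    """Poll describe_transform_job until the job reaches a terminal status.

    sm_client is assumed to be a boto3 SageMaker client; sleep is injectable
    so the loop can be exercised without real delays.
    """
    while True:
        response = sm_client.describe_transform_job(TransformJobName=job_name)
        status = response["TransformJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
```

In this notebook it could be called as `wait_for_transform_job(sm, batch_predictor.latest_transform_job.job_name)`, which returns the final status string so a failed job can be handled explicitly.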

# _Wait Until the ^^ Batch Transform Job ^^ Completes_

# Check Output Data

After the transform job has completed, download the output data from S3.

For each input file, Batch Transform writes a corresponding output file with a `.out` extension. This `.out` file contains the predicted labels for each input row.

In [None]:
# S3 location of the batch prediction output (downloaded in the next cell)
batch_prediction_output_s3_uri = batch_predictor.output_path

In [None]:
!aws s3 cp --recursive $batch_prediction_output_s3_uri/ batch_prediction_output/

In [None]:
!ls batch_prediction_output/
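
Because each input file should have a matching `.out` file, a quick completeness check is possible once the output is downloaded. The sketch below assumes the output name is the input file name with `.out` appended, as described above; `expected_output_name`, `missing_outputs`, and the sample file names are hypothetical and used only for illustration.

```python
import posixpath


def expected_output_name(input_key):
    """Return the output file name assumed for an input S3 key (file name + '.out')."""
    return posixpath.basename(input_key) + ".out"


def missing_outputs(input_keys, output_names):
    """Return the input keys that have no matching '.out' file in output_names."""
    present = set(output_names)
    return [key for key in input_keys if expected_output_name(key) not in present]


# Hypothetical file names for illustration only.
inputs = ["amazon-reviews-pds/tsv/part-0.tsv.gz", "amazon-reviews-pds/tsv/part-1.tsv.gz"]
outputs = ["part-0.tsv.gz.out"]
print(missing_outputs(inputs, outputs))
```

In practice, `output_names` could come from `os.listdir('batch_prediction_output/')` and `input_keys` from the S3 listing of the input prefix.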

In [None]:
%%javascript
Jupyter.notebook.save_checkpoint();
Jupyter.notebook.session.delete();