# Making batch predictions using a PyTorch model with Amazon SageMaker
This notebook shows how to make **batch predictions with PyTorch on SageMaker**. Many customers have machine learning workloads that require a large number of predictions to be made reliably on a repeatable schedule. Unlike SageMaker's managed hosting service, which keeps instances running, batch transform spins up compute capacity on demand and takes it down when the batch completes. For large batch workloads, this can yield significant cost savings over an always-on endpoint.
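
To make the cost argument concrete, here is a back-of-the-envelope sketch comparing an always-on endpoint with an on-demand batch job. The hourly rate, run duration, and schedule are hypothetical placeholders, not actual SageMaker prices; substitute the figures for your own instance type and region.

```python
# Hypothetical figures for illustration only; not real SageMaker pricing.
HOURLY_RATE_USD = 4.00      # assumed price of one instance per hour
HOURS_PER_MONTH = 730       # approximate number of hours in a month


def endpoint_monthly_cost(instance_count, hourly_rate=HOURLY_RATE_USD):
    """Cost of instances that stay up all month behind a hosted endpoint."""
    return instance_count * hourly_rate * HOURS_PER_MONTH


def batch_monthly_cost(instance_count, hours_per_run, runs_per_month,
                       hourly_rate=HOURLY_RATE_USD):
    """Cost of instances that only run for the duration of each batch job."""
    return instance_count * hourly_rate * hours_per_run * runs_per_month


always_on_cost = endpoint_monthly_cost(instance_count=2)
nightly_batch_cost = batch_monthly_cost(instance_count=2, hours_per_run=0.5,
                                        runs_per_month=30)
print(f'Always-on endpoint: ${always_on_cost:,.2f} per month')
print(f'Nightly batch job:  ${nightly_batch_cost:,.2f} per month')
```

The gap narrows as batch runs become longer or more frequent, so the comparison is worth redoing with your own numbers.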

Another benefit of SageMaker batch is that it lets data scientists stay focused on creating the best models.
[SageMaker batch](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html) uses the same trained model for both hosted endpoints and batch jobs, with no need for expensive rewrites or additional infrastructure. You provide the input via S3, and SageMaker returns the predictions via S3 as well. Note that the same input and output handlers you used for your SageMaker endpoint in the [previous lab](./2_sm_image_classification_birds.ipynb) are used for batch predictions as well.

## Setup
This notebook assumes you have already trained your model in the prior lab, which results in model artifacts being available in S3. Update the `training_job_name` variable below to refer to your specific training job, so that the notebook can build the full S3 URI to the model artifacts.

These same model artifacts were used for deployment in a SageMaker hosted endpoint in the previous lab. In this lab, we demonstrate batch predictions with the same trained model.
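
As a rough sketch of how the artifact location relates to the training job name, the helpers below show two ways to obtain it. The first mirrors the path pattern this notebook builds later; the second reads the location from a `describe_training_job` response, which is useful when the job wrote to a custom output path. The function names and the example bucket are hypothetical.

```python
def default_model_artifacts_uri(bucket, training_job_name):
    """Path pattern this notebook assumes: <bucket>/<job name>/output/model.tar.gz."""
    return f's3://{bucket}/{training_job_name}/output/model.tar.gz'


def model_artifacts_from_description(description):
    """Read the artifact URI from a describe_training_job response dict."""
    return description['ModelArtifacts']['S3ModelArtifacts']


# 'example-bucket' is a placeholder bucket name.
print(default_model_artifacts_uri('example-bucket', 'pyt-ic-2020-02-19-02-39-01-936'))

# With AWS credentials available, the second helper could be used like this:
#   import boto3
#   description = boto3.client('sagemaker').describe_training_job(
#       TrainingJobName='pyt-ic-2020-02-19-02-39-01-936')
#   print(model_artifacts_from_description(description))
```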

In [None]:
import boto3
import sagemaker
from time import gmtime, strftime

training_job_name = 'pyt-ic-2020-02-19-02-39-01-936'  ### Replace this with your job name from the previous lab

USE_GPU_INSTANCE  = True
FRAMEWORK_VERSION = '1.3.1'
ACCOUNT_NUM = '763104351884' # for most regions
CONTAINER_NAME = 'pytorch-inference'
print(f'Using account: {ACCOUNT_NUM}, container: {CONTAINER_NAME}')
## Here is the documentation for identifying PyTorch SageMaker container images
##   https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html

In [None]:
sess = sagemaker.Session()
bucket = sess.default_bucket()
s3_prefix = 'DEMO-PYT-image-classification-birds'
region_name = boto3.Session().region_name

To help with evaluating the batch prediction results, enter the list of class labels that your classifier was trained on in notebook 2.

In [None]:
class_name_list = ['013.Bobolink', '017.Cardinal', '035.Purple_Finch', '036.Northern_Flicker']
#,
#                   '047.American_Goldfinch', '068.Ruby_throated_Hummingbird', '073.Blue_Jay', 
#                   '075.Green_Jay', '087.Mallard', '095.Baltimore_Oriole', 
#                   '120.Fox_Sparrow', '179.Tennessee_Warbler', '192.Downy_Woodpecker']

SageMaker batch transformations require input to be specified in S3, and you need to provide an S3 output path where SageMaker will save the resulting predictions. For this sample batch job, we will make a prediction for each of the images from our training dataset.

In [None]:
input_data_path  = f's3://{bucket}/{s3_prefix}/train/'
output_data_path = f's3://{bucket}/{s3_prefix}/batch_predictions/'

print(f'Batch input from: {input_data_path}')
print(f'Batch output to:  {output_data_path}')

Before we run the batch transformation, we first remove prior batch prediction results. In production, you would likely instead tag the folder with a timestamp and retain the results from each run of the batch.

In [None]:
!aws s3 ls $input_data_path

In [None]:
if input('Are you sure you want to remove the old batch predictions {}? (yes/no) '.format(output_data_path)) == 'yes':
    !aws s3 rm --quiet --recursive $output_data_path
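
For the production approach mentioned above, where each run keeps its own results, one option is to add a timestamp to the output prefix. This is a minimal sketch; the function name and the `batch_predictions/<timestamp>/` layout are assumptions rather than anything this lab prescribes.

```python
from time import gmtime, strftime


def timestamped_output_path(bucket, prefix, when=None):
    """Build an S3 output prefix that is unique per batch run (UTC timestamp)."""
    run_time = when if when is not None else gmtime()
    run_stamp = strftime('%Y-%m-%d-%H-%M-%S', run_time)
    return f's3://{bucket}/{prefix}/batch_predictions/{run_stamp}/'


# 'example-bucket' is a placeholder; in this notebook you would pass `bucket` and `s3_prefix`.
print(timestamped_output_path('example-bucket', 'DEMO-PYT-image-classification-birds'))
```

Because every run writes to a new prefix, nothing has to be deleted before the job starts.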

Later, to interpret the results, we will copy them down to a local folder. If we have done this before, we first remove the old local results.

In [None]:
if input('Are you sure you want to remove the prior local batch predictions from ./batch_predictions? (yes/no) ') == 'yes':
    !rm -rf ./batch_predictions/*

## Create a Model for performing batch predictions

When we deployed the model in the previous lab to an Amazon SageMaker real-time endpoint, we deployed to a CPU-based instance type. Under the hood, a CPU-based Amazon SageMaker Model object was created to wrap a CPU-based PyTorch inference container. However, for Batch Transform on a large dataset, we would prefer to use GPU instances. To do this, we need to create another Model object that uses a GPU-based PyTorch inference container.

First we give a unique name for the model and identify the proper PyTorch framework image.

In [None]:
if USE_GPU_INSTANCE:
    device = 'gpu'
    batch_instance_type = 'ml.p3.8xlarge'
else:
    device = 'cpu'
    batch_instance_type = 'ml.c5.4xlarge'

model_artifacts = f's3://{bucket}/{training_job_name}/output/model.tar.gz'
model_prefix = f'pyt-ic-{device}'
model_name = '{}-{}'.format(model_prefix, strftime("%m-%d-%H-%M-%S", gmtime()))
framework_image = \
      f'{ACCOUNT_NUM}.dkr.ecr.{region_name}.amazonaws.com/{CONTAINER_NAME}:{FRAMEWORK_VERSION}-{device}-py3'

print(f'Model will be named: {model_name}')
print('Using image: {}'.format(framework_image))
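
The image URI above follows a fixed pattern in which only the device part of the tag changes between the CPU and GPU containers. The sketch below isolates that pattern in a hypothetical helper; the `us-west-2` region is an assumed example value, and the account, repository, and version are the ones set earlier in this notebook.

```python
def inference_image_uri(account, region, repo, version, device_type, py_version='py3'):
    """Compose an ECR image URI following the pattern used in this notebook."""
    if device_type not in ('cpu', 'gpu'):
        raise ValueError(f"device_type must be 'cpu' or 'gpu', got {device_type!r}")
    return (f'{account}.dkr.ecr.{region}.amazonaws.com/'
            f'{repo}:{version}-{device_type}-{py_version}')


# Print the CPU and GPU variants side by side for an assumed region.
for example_device in ('cpu', 'gpu'):
    print(inference_image_uri('763104351884', 'us-west-2',
                              'pytorch-inference', '1.3.1', example_device))
```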

Here we instantiate a Model object pointing to the trained model artifacts and referring to the PyTorch inference image that will be used to serve predictions from that model.

In [None]:
USE_BOTO3 = True
from sagemaker.pytorch import PyTorchModel

print(f'Creating model for {model_artifacts}')
serving_model = PyTorchModel(model_data=model_artifacts,
                             entry_point='train-resnet.py',
                             source_dir='code',
                             role=sagemaker.get_execution_role(),
                             image=framework_image,
                             framework_version=FRAMEWORK_VERSION,
                             sagemaker_session=sess)
if USE_BOTO3:
    # Register the model with SageMaker directly through the boto3 client
    client = boto3.client('sagemaker')
    # Container definition (image, model data, environment) for the chosen instance type
    pytorch_container = serving_model.prepare_container_def(batch_instance_type)
    model_params = {
        'ModelName': model_name,
        'Containers': [
            pytorch_container
        ],
        'ExecutionRoleArn': sagemaker.get_execution_role()
    }
    client.create_model(**model_params)

## Launch the batch transformation job
Here we kick off the batch prediction job using the SageMaker Transformer object.

In [None]:
%%time

batch_instance_count = 2
concurrency = 64

if USE_BOTO3:
    transformer = sagemaker.transformer.Transformer(
        model_name=model_name,
        instance_count = batch_instance_count,
        instance_type  = batch_instance_type,
        max_concurrent_transforms = concurrency,
        output_path    = output_data_path,
        base_transform_job_name='pyt-birds-image-transform')
else:
    transformer = serving_model.transformer(
        instance_count = batch_instance_count,
        instance_type  = batch_instance_type,
        max_concurrent_transforms = concurrency,
        output_path    = output_data_path)

transformer.transform(data = input_data_path, content_type = 'application/x-image')
# Block until the job finishes so that the results exist before the evaluation steps below
transformer.wait()
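
`transformer.wait()` blocks the notebook until the job finishes. As an alternative, the job status can be polled. The sketch below takes the describe call as a parameter so that it can be tried without AWS access; the function name, polling interval, and timeout are assumptions, while the status values are those reported by the SageMaker `DescribeTransformJob` API.

```python
import time

# Statuses after which a transform job will not change any further.
TERMINAL_STATUSES = {'Completed', 'Failed', 'Stopped'}


def wait_for_transform_job(describe_job, job_name, poll_seconds=30, max_polls=240):
    """Poll a transform job until it reaches a terminal status.

    `describe_job` is any callable that accepts TransformJobName=... and returns
    a dict with a 'TransformJobStatus' key, for example the
    describe_transform_job method of a boto3 SageMaker client.
    """
    for _ in range(max_polls):
        status = describe_job(TransformJobName=job_name)['TransformJobStatus']
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f'{job_name} did not finish after {max_polls} polls')


# Possible usage in this notebook (requires AWS credentials):
#   client = boto3.client('sagemaker')
#   final_status = wait_for_transform_job(client.describe_transform_job,
#                                         transformer.latest_transform_job.name)
```

Only continue to the evaluation steps if the returned status is `Completed`.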

## Evaluate prediction results
To facilitate evaluation of the output, we download the results to our local folder.

In [None]:
!aws s3 cp  --quiet --recursive $output_data_path ./batch_predictions

Here we take a look at a sample output file. For each JPG file we passed to the batch transformation job, we get a corresponding `.jpg.out` file containing the JSON-formatted output from the prediction.

In [None]:
import glob
filepaths = glob.glob('./batch_predictions/*/*')
print('Total number of predictions: {}'.format(len(filepaths)))
print('\nSample prediction output file:')
sample_file = filepaths[0]
!cat $sample_file
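
The layout of the downloaded results can be sketched as follows: each output file sits in a folder named after the class of its source image, and its name is the image name plus `.out`. The helper and the example file name are hypothetical; the layout itself is inferred from how this notebook reads the results.

```python
from pathlib import PurePosixPath


def describe_output_file(output_path):
    """Split a local prediction path into (class folder, source image name)."""
    path = PurePosixPath(output_path)
    if path.suffix != '.out':
        raise ValueError(f'not a batch transform output file: {output_path}')
    # path.stem drops only the final '.out', leaving the original image name
    return path.parent.name, path.stem


# Hypothetical file name, used only to illustrate the layout.
example_path = './batch_predictions/013.Bobolink/Bobolink_0001.jpg.out'
print(describe_output_file(example_path))
```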

For larger-scale batch predictions on images that are not yet labelled, we simply parse the prediction output files to see the distribution of predicted classes.

In [None]:
import json
import re
import os
import glob
import numpy as np

In [None]:
total = 0
predicted = []

for entry in glob.glob('batch_predictions/*/*'):
    try:
        with open(entry, 'r') as f:
            results = json.load(f)
            class_index = np.argmax(np.array(results))
            predicted_label = class_name_list[class_index]
            predicted.append(class_index)
            total += 1
    except Exception as e:
        print(e)
        continue
        
print(f'Found {total} prediction files.')
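
To illustrate what the loop above does for a single file, the sketch below picks the highest-scoring class from one prediction using only the standard library. The scores are made up, and the flat list of one score per class is an assumption about the output format; check it against the sample file printed earlier.

```python
import json

example_labels = ['013.Bobolink', '017.Cardinal', '035.Purple_Finch', '036.Northern_Flicker']
# Hypothetical contents of one .out file: one score per class for a single image.
example_output = '[0.05, 0.80, 0.10, 0.05]'

example_scores = json.loads(example_output)
# Index of the largest score, equivalent to np.argmax for a flat list
best_index = max(range(len(example_scores)), key=example_scores.__getitem__)
print(f'Predicted class: {example_labels[best_index]} '
      f'(score {example_scores[best_index]:.2f})')
```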

In [None]:
def how_many(in_list, which_value):
    # Count how many entries of in_list are equal to which_value
    return in_list.count(which_value)

In [None]:
prediction_totals_by_class = []
for i in range(len(class_name_list)):
    prediction_totals_by_class.append(how_many(predicted, i))
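
The same per-class totals can also be computed in one pass with `collections.Counter`. This is an alternative sketch on made-up predictions, not a change to the cells above; the `example_` names are hypothetical.

```python
from collections import Counter

example_predicted = [0, 2, 2, 1, 2, 0]   # hypothetical predicted class indices
example_class_count = 4                  # assumed number of classes

# Counter returns 0 for classes that were never predicted
example_counts = Counter(example_predicted)
example_totals = [example_counts[i] for i in range(example_class_count)]
print(example_totals)
```

Indexing the counter by class position keeps the totals aligned with the order of `class_name_list`, including classes with no predictions.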

In [None]:
import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline

x = np.arange(len(class_name_list))
width = 0.7

fig, ax = plt.subplots()
ax.set_ylabel('Prediction Counts')
ax.set_title('Prediction Counts By Class')
ax.bar(x, prediction_totals_by_class, width)
ax.set_xticks(x)
ax.set_xticklabels(class_name_list, rotation='vertical')
plt.show()

Here we take the highest probability class prediction for each image and compare that to the actual class of the image (represented by its class subfolder).

In [None]:
total = 0
correct = 0

predicted = []
actual = []

for entry in glob.glob('batch_predictions/*/*'):
    try:
        actual_label = entry.split('/')[1]
        actual_index = class_name_list.index(actual_label)
        with open(entry, 'r') as f:
            results = json.load(f)
            class_index = np.argmax(np.array(results))
            predicted_label = class_name_list[class_index]
            predicted.append(class_index)
            actual.append(actual_index)
            is_correct = predicted_label == actual_label
            if is_correct:
                correct += 1
            total += 1
    except Exception as e:
        print(e)
        continue

In [None]:
print('Out of {} total images, accurate predictions were returned for {}'.format(total, correct))
accuracy = correct / total
print('Accuracy is {:.1%}'.format(accuracy))

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import itertools

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.GnBu):
    plt.figure(figsize=(7,7))
    plt.grid(False)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), 
                                  range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.gca().set_xticklabels(class_name_list)
    plt.gca().set_yticklabels(class_name_list)
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

In [None]:
from sklearn.metrics import confusion_matrix
def create_and_plot_confusion_matrix(actual, predicted):
    cnf_matrix = confusion_matrix(actual, np.asarray(predicted),labels=range(len(class_name_list)))
    plot_confusion_matrix(cnf_matrix, classes=range(len(class_name_list)))

In [None]:
create_and_plot_confusion_matrix(actual, predicted)
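
To help read the plot, here is a minimal sketch of how a confusion matrix is tallied, using plain Python and made-up labels. Rows are true classes and columns are predicted classes, which is the same convention `sklearn.metrics.confusion_matrix` uses; the function and variable names are hypothetical.

```python
def tiny_confusion_matrix(actual_indices, predicted_indices, num_classes):
    """Tally predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_index, predicted_index in zip(actual_indices, predicted_indices):
        matrix[true_index][predicted_index] += 1
    return matrix


# Hypothetical labels for six images across three classes.
example_actual = [0, 0, 1, 1, 2, 2]
example_guesses = [0, 1, 1, 1, 2, 0]

example_matrix = tiny_confusion_matrix(example_actual, example_guesses, 3)
for row in example_matrix:
    print(row)

# Correct predictions lie on the diagonal
example_correct = sum(example_matrix[i][i] for i in range(3))
print(f'{example_correct} of {len(example_actual)} predictions are correct')
```

Off-diagonal cells show which classes are mistaken for one another, which overall accuracy alone does not reveal.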