# Using TensorFlow with Amazon SageMaker's training and hosting services

1. [Introduction](#Introduction)
2. [Setup](#Setup)
3. [Train the model](#Train-the-model)
4. [Host the model](#Host-the-model)
5. [Test the model](#Test-the-model)
6. [Clean up](#Clean-up)

## Introduction

The previous lab performed training and prediction directly in the Jupyter notebook environment. With this lab, we demonstrate the use of TensorFlow with Amazon SageMaker's managed training and hosting services. SageMaker provides cost, productivity, and scale advantages for training and hosting, with the flexibility of using your own custom models with the most popular machine learning frameworks.

In this lab, we use [Amazon SageMaker's TensorFlow container](https://sagemaker.readthedocs.io/en/stable/using_tf.html), which lets you provide your training code as a Python script. The container also provides a flexible way for you to customize how inference inputs and outputs are handled over a REST interface. Here is a [blog post](https://aws.amazon.com/blogs/machine-learning/using-tensorflow-eager-execution-with-amazon-sagemaker-script-mode/) describing how TensorFlow eager execution is supported by the container.

You can use this notebook in multiple modes:

- Local mode. Training runs in the background on the same instance that hosts this notebook. Likewise, in local mode the SageMaker inference endpoint runs on that same instance. This is useful for iterating during development of new models, because it saves the time required to launch new instances.
- Managed training job. In this mode, SageMaker spins up a separate training cluster on your behalf to perform your training job. This provides a cost-effective way to perform large-scale model training without having to manage your own clusters. The machines are terminated immediately after training completes, and you pay only for the instance time used.
- Managed training job with spot instances. Spot instances provide another cost-saving opportunity, with savings of up to 90%. SageMaker takes care of resuming your training job from saved checkpoints if it is interrupted.

## Setup

Some initial setup is required before preparing the data. To train the image classification model on Amazon SageMaker, we need to set up and authenticate the use of AWS services. First, we need an IAM role with SageMaker access. Here we use the execution role that was assigned to the current notebook instance when it was created. This role has the necessary permissions, including access to your data in S3.

In [None]:
import sagemaker
from sagemaker import get_execution_role

role = get_execution_role()
print(role)
sess = sagemaker.Session()

We also need to identify the S3 bucket that you want to use for providing training and validation datasets. It will be used to store the trained model artifacts as well. In this notebook, we use the default SageMaker bucket in your account. Alternatively, you could use any bucket you like. We use an object prefix to help organize the bucket content.

In [None]:
bucket = sess.default_bucket() # or use your own custom bucket name
prefix = 'DEMO-TF2-image-classification-birds'

# Data Preparation

This notebook assumes you have already downloaded and unpacked the dataset into your notebook instance as part of the first lab.

## Set some parameters for the rest of the notebook to use

Here we define a few parameters that help drive the rest of the notebook. For example, `SAMPLE_ONLY` defaults to `True`. This forces the notebook to train on only a handful of species. Setting `SAMPLE_ONLY` to `False` makes the notebook work with the entire dataset of 200 bird species. This makes training a more difficult challenge, and you will need to tune parameters and run more epochs.

An `EXCLUDE_IMAGE_LIST` is defined as a mechanism to address any corrupt images from the dataset and ensure they do not disrupt the process.

In [None]:
import pandas as pd
import boto3
import json

runtime = boto3.client(service_name='runtime.sagemaker')

import matplotlib.pyplot as plt
%matplotlib inline

# To speed up training and experimenting, you can use a small handful of species.
# To see the full list of the classes available, look at the content of CLASSES_FILE.
SAMPLE_ONLY = True

if SAMPLE_ONLY:
    CLASSES = [13, 17] #, 35, 36, 47, 68, 73, 75, 87, 95, 120, 179, 192]
else:
    # Otherwise, you can use the full set of species
    CLASSES = []
    for c in range(200):
        CLASSES += [c + 1]
    prefix = prefix + '-full'

BASE_DIR     = 'CUB_MINI/' # 'CUB_200_2011/' if using the full download
IMAGES_DIR   = BASE_DIR + 'images/'
CLASSES_FILE = BASE_DIR + 'classes.txt'

EXCLUDE_IMAGE_LIST = ['087.Mallard/Mallard_0130_76836.jpg']

SPLIT_RATIOS = (0.6, 0.2, 0.2)

CLASS_COLS      = ['class_number','class_id']

JOB_PREFIX     = 'tf2'

Show the list of bird species or dataset classes the model will be trained to predict.

In [None]:
classes_df = pd.read_csv(CLASSES_FILE, sep=' ', names=CLASS_COLS, header=None)
criteria = classes_df['class_number'].isin(CLASSES)
classes_df = classes_df[criteria]

class_name_list = sorted(classes_df['class_id'].unique().tolist())
print(class_name_list)

# Create train/val/test dataframes from our dataset

In this lab, we've moved the train/test split logic to a utilities script to keep the focus of the notebook on the actual training and hosting steps.

In [None]:
from utils import split
train_df, val_df, test_df = split.get_train_val_dataframes(BASE_DIR, CLASSES, EXCLUDE_IMAGE_LIST, SPLIT_RATIOS)
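
The `utils.split` module is not reproduced in this notebook. As a rough illustration of what a per-class split driven by `SPLIT_RATIOS` could look like, here is a minimal sketch in plain Python. The function name `split_by_class`, its `(file_name, class_id)` input format, and the fixed seed are assumptions made for this example and are not taken from the lab's utility script.

```python
import random
from collections import defaultdict

def split_by_class(items, ratios=(0.6, 0.2, 0.2), seed=42):
    """Split (file_name, class_id) pairs into train/val/test lists per class.

    Hypothetical helper for illustration; it is not the lab's utils.split code.
    """
    by_class = defaultdict(list)
    for file_name, class_id in items:
        by_class[class_id].append(file_name)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, val, test = [], [], []
    for class_id in sorted(by_class):
        files = sorted(by_class[class_id])
        rng.shuffle(files)
        n_train = int(len(files) * ratios[0])
        n_val = int(len(files) * ratios[1])
        train += [(f, class_id) for f in files[:n_train]]
        val += [(f, class_id) for f in files[n_train:n_train + n_val]]
        # Whatever remains after train and validation goes to test.
        test += [(f, class_id) for f in files[n_train + n_val:]]
    return train, val, test

# Made-up file names for two classes, ten images each.
demo_items = [('img_{}_{}.jpg'.format(c, i), c) for c in (13, 17) for i in range(10)]
demo_train, demo_val, demo_test = split_by_class(demo_items)
print(len(demo_train), len(demo_val), len(demo_test))
```

Splitting within each class keeps every species represented in all three sets, which matters when only a few images per class are available.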

# Prepare the data channels for Amazon SageMaker
When using Amazon SageMaker's managed training service, you need to provide the datasets to the training algorithm. This is primarily handled via populating S3 buckets, and by indicating the location of data channels such as train, test, and validation. You also need to consider the data format. In our case, to keep things simple, we will populate the data channels with folders containing the original JPG images organized by class folders.

In [None]:
CHANNEL_FOLDER = 'data'

### Populate local data channels
Here we populate the local channel folders, and we handle re-populating in case we have already run through this notebook with a different subset of bird species. The local content will be synchronized to our S3 bucket for access by the SageMaker training service. Note that if you are running this notebook with a large number of bird species, this step could take several minutes. You can skip these cells if you are re-running this notebook after already having populated these folders.

In [None]:
import os
cwd = os.getcwd()

In [None]:
import shutil
def copy_files_for_channel(df, channel_name, verbose=False):
    print('\nCopying files for {} images in channel: {}...'.format(df.shape[0], channel_name))
    for i in range(df.shape[0]):
        target_fname = df.iloc[i]['image_file_name']
        if verbose:
            print(target_fname)
        src = f'{cwd}/{IMAGES_DIR}/{target_fname}'
        dst = f'{cwd}/{CHANNEL_FOLDER}/{channel_name}/{target_fname}'
        shutil.copyfile(src, dst)

In [None]:
if os.path.exists(CHANNEL_FOLDER):
    print('About to remove {}/{}'.format(cwd, CHANNEL_FOLDER))
    perform_delete = input('Are you sure you want to remove it and start fresh (yes/no)?')
    if perform_delete == 'yes':
        for ch in ['train', 'test', 'validation']:
            !rm -i -rf $CHANNEL_FOLDER/$ch/*
            !rm -i -rf $CHANNEL_FOLDER/$ch
            !mkdir  $CHANNEL_FOLDER/$ch

        for c in class_name_list:
            !mkdir $CHANNEL_FOLDER/train/$c
            !mkdir $CHANNEL_FOLDER/validation/$c
            !mkdir $CHANNEL_FOLDER/test/$c
            
        copy_files_for_channel(val_df,   'validation')
        copy_files_for_channel(test_df,  'test')
        copy_files_for_channel(train_df, 'train')
else:
    print('Channel folder does not yet exist. Creating it and subfolders for each channel.')
    os.mkdir(CHANNEL_FOLDER)
    os.mkdir(CHANNEL_FOLDER + '/validation')
    os.mkdir(CHANNEL_FOLDER + '/test')
    os.mkdir(CHANNEL_FOLDER + '/train')
    for c in class_name_list:
        os.mkdir('{}/{}/{}'.format(CHANNEL_FOLDER, 'validation', c))
        os.mkdir('{}/{}/{}'.format(CHANNEL_FOLDER, 'test', c))
        os.mkdir('{}/{}/{}'.format(CHANNEL_FOLDER, 'train', c))
    
    copy_files_for_channel(val_df,   'validation')
    copy_files_for_channel(test_df,  'test')
    copy_files_for_channel(train_df, 'train')

### Upload content of each data channel to S3
Note that for local mode training, this copy to S3 is not necessary. SageMaker local mode is helpful for early iterations of the development of a new model. Once you are happy with your new model on smaller datasets, you typically then use SageMaker training jobs on larger sets of data and with additional epochs.

In [None]:
print('About to upload your image data to s3://{}/{}'.format(bucket, prefix))
print('This will ensure you have the latest data for training, but it takes a while if using the full 200 classes.')
if input('Are you sure you want to replace your s3 images?') == 'yes':
    print('Clearing out s3://{}/{}/'.format(bucket, prefix))
    for ch in ['train', 'test', 'validation']:
        !aws s3 rm --quiet --recursive s3://$bucket/$prefix/$ch/
    print('\nSynchronizing local data channels with s3...')
    !aws s3 sync $CHANNEL_FOLDER s3://$bucket/$prefix/

print('contents of s3://{}/{}/'.format(bucket, prefix))
!aws s3 ls s3://$bucket/$prefix/

# Train the model 
When using [SageMaker's TensorFlow container](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html), the custom TensorFlow training code is provided via a Python script in a separate file that gets passed to SageMaker. For our example, that script is shown below for completeness in the notebook. Study that code before proceeding to the actual training. Pay attention to any differences from the code you used in the first lab when training directly in the notebook:

- Use of script parameters to make the code more flexible. We will take advantage of these parameters later on when doing automatic model tuning.
- Copying of `inference.py` and `requirements.txt` to the `code` directory for use with TensorFlow Serving at inference time.
- Saving the trained model in a format that is compatible with TensorFlow Serving.

In [None]:
!pygmentize 'code/train-mobilenet.py'
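
To illustrate the first point above, here is a minimal sketch of how a script-mode training script might read its script parameters. It is not a copy of `train-mobilenet.py`. The argument names mirror the hyperparameters used later in this notebook, while the `--sm_model_dir` argument and the default values are assumptions made for the example. SageMaker passes hyperparameters to the script as command-line arguments and exposes locations such as the model directory through environment variables like `SM_MODEL_DIR`.

```python
import argparse
import os

def parse_args(argv=None):
    """Parse script parameters the way a script-mode training script might."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--initial_epochs', type=int, default=5)
    parser.add_argument('--fine_tuning_epochs', type=int, default=30)
    parser.add_argument('--batch_size', type=int, default=8)
    parser.add_argument('--dropout', type=float, default=0.4)
    parser.add_argument('--data_dir', type=str, default='/opt/ml/input/data')
    # Hypothetical argument name; SM_MODEL_DIR is set inside the training container.
    parser.add_argument('--sm_model_dir', type=str,
                        default=os.environ.get('SM_MODEL_DIR', '/opt/ml/model'))
    # parse_known_args tolerates extra arguments that the platform may add.
    args, _unknown = parser.parse_known_args(argv)
    return args

demo_args = parse_args(['--initial_epochs', '3', '--dropout', '0.25',
                        '--model_dir', 's3://example-bucket/model'])
print(demo_args.initial_epochs, demo_args.dropout, demo_args.batch_size)
```

Because every parameter has a default, the same script can be run by hand during development and later driven by a tuning job that overrides individual values.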

### A note about customizing your SageMaker hosted TensorFlow endpoint
For our example, we need some preprocessing code to take the raw image bytes and prepare the image for being passed to the image classification model. This lets us efficiently pass images to our endpoint for predictions, versus passing them as JSON for example. Also, the various pretrained models expect different transformations of the image. Performing these final transformations in the endpoint avoids forcing your client application to know the implementation details of your endpoint. This lets you change from MobileNetV2 to ResNet, for example, without having to change any client-side code.

To do this preprocessing, we provide an `inference.py` script with an `input_handler()` function.
See the [documentation about custom pre- and post-processing](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html#create-python-scripts-for-custom-input-and-output-formats) for more detail.

In [None]:
!pygmentize code/inference.py
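
For orientation, the general shape of a handler pair for the SageMaker TensorFlow Serving container can be sketched as follows. This is not the lab's `inference.py`: instead of resizing and normalizing the image, the sketch simply base64-encodes the raw bytes into a TensorFlow Serving REST request, which assumes a model that accepts encoded image strings. The error handling is likewise a simplification chosen for the example.

```python
import base64
import json

def input_handler(data, context):
    """Turn the incoming request body into a TensorFlow Serving JSON request."""
    if context.request_content_type == 'application/x-image':
        # Assumption: the served model accepts base64-encoded image bytes.
        encoded = base64.b64encode(data.read()).decode('utf-8')
        return json.dumps({'instances': [{'b64': encoded}]})
    raise ValueError('Unsupported content type: {}'.format(
        context.request_content_type))

def output_handler(response, context):
    """Pass the TensorFlow Serving response back to the caller unchanged."""
    if response.status_code != 200:
        raise ValueError(response.content.decode('utf-8'))
    return response.content, context.accept_header
```

Keeping the content-type check inside `input_handler` is what allows the client to send plain image bytes while the model-specific preparation stays on the endpoint side.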

To train in local mode, it is necessary to have Docker Compose or NVIDIA-Docker-Compose (for GPU) installed in the notebook instance.  This simple setup script is required.

In [None]:
!/bin/bash ./setup.sh

## Create the SageMaker training job using the TensorFlow container
Here we establish the TensorFlow estimator object. Note that this notebook code is designed to support SageMaker remote training jobs as well as local mode training. Set `local = False` to use SageMaker's training service on separate managed training instances. Metric definitions are provided so that you can visualize metrics from the SageMaker console as well as from CloudWatch. These same metrics can be used when optimizing your model with automatic model tuning.

For this lab, run the whole notebook with `local = True` first. Once you have understood all that is going on, and have seen it work successfully, change to `local = False` and re-run the rest of the notebook.

### A note about instance types and account limits
You may find yourself running into an error like this:

````
ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateTrainingJob operation: The account-level service limit 'ml.p3.8xlarge for training job usage' is 0 Instances, with current utilization of 0 Instances and a request delta of 1 Instances. Please contact AWS support to request an increase for this limit.
````
To avoid customers getting unexpected bills for more powerful and more expensive instance usage, accounts are established by default with limited access to certain instance types. These are soft limits that can be raised by contacting AWS support. This lab defaults to a powerful GPU instance type, but you can run it on a lower-powered instance type. In such a case, you will pay less, but your training jobs will take longer.

### A note about TensorFlow versions
Amazon SageMaker supports multiple versions of TensorFlow natively. In addition, you can bring your own container if you have specific requirements that are not met by [SageMaker's TensorFlow container](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html). When instantiating a TensorFlow estimator object, you specify a TensorFlow framework version. This directs SageMaker to find the corresponding container image.

In [None]:
import sagemaker
from sagemaker.tensorflow import TensorFlow

TF_FRAMEWORK_VERSION = '2.1'

The `requirements.txt` file lists the set of Python packages that will be pip installed in your endpoint before calling your `input_handler` from your `inference.py` script. If you are using TensorFlow specific code, be sure to include the `tensorflow` package. These packages will also be installed when using SageMaker batch transformation.

In [None]:
!cat ./code/requirements.txt

### A note about spot training
When using SageMaker's Managed Spot Training, the SageMaker TensorFlow container handles checkpointing for you as long as you pass in an S3 location for the checkpoints. If your spot instance gets taken away, the training will automatically resume from the last saved checkpoint without any manual intervention. Here we specify a unique location for those checkpoints.

In [None]:
import uuid

checkpoint_s3_uri = f's3://{bucket}/{prefix}/checkpoints/{uuid.uuid4()}'
print(checkpoint_s3_uri)
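
Resuming only works if the training script looks for existing checkpoints when it starts. The sketch below shows one way a script might locate the most recent checkpoint in a local checkpoint directory before training begins. The `checkpoint-<epoch>.h5` naming scheme and the helper name `find_latest_checkpoint` are assumptions made for this example; the lab's training script may organize its checkpoints differently.

```python
import os
import re

# Hypothetical naming scheme: checkpoint-<epoch>.h5
CHECKPOINT_PATTERN = re.compile(r'^checkpoint-(\d+)\.h5$')

def find_latest_checkpoint(checkpoint_dir):
    """Return (path, epoch) of the newest checkpoint, or (None, 0) if none exist."""
    if not os.path.isdir(checkpoint_dir):
        return None, 0
    best_path, best_epoch = None, 0
    for name in os.listdir(checkpoint_dir):
        match = CHECKPOINT_PATTERN.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best_path = os.path.join(checkpoint_dir, name)
    return best_path, best_epoch
```

A script using a helper like this would load the returned file, if any, and continue from `best_epoch` instead of starting again from epoch zero.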

Here we create a SageMaker TensorFlow estimator. You can use it in local mode or use it to run managed training jobs on separate training instances. Local mode is helpful for iterating on training script edits during development. Managed training jobs should be used for full training runs where you can scale both the data used for training and the compute instance types and counts required.

In [None]:
local = True
if (local):
    train_instance_type  = 'local'
    train_instance_count = 1
    serve_instance_type  = 'local'
    initial_epochs       = 3
    fine_tuning_epochs   = 5
    use_spot             = False
else:
    train_instance_type  = 'ml.c5.4xlarge' #'ml.p3.2xlarge' 
    train_instance_count = 1
    serve_instance_type  = 'ml.c5.4xlarge'
    use_spot             = False
    if SAMPLE_ONLY:
        initial_epochs     = 5
        fine_tuning_epochs = 20
    else:
        initial_epochs     = 10
        fine_tuning_epochs = 300

hyperparameters = {'initial_epochs':     initial_epochs,
                   'batch_size':         8,
                   'fine_tuning_epochs': fine_tuning_epochs, 
                   'dropout':            0.4,
                   'data_dir':           '/opt/ml/input/data'}
if use_spot:
    hyperparameters['s3_checkpoint_path'] = checkpoint_s3_uri

# TF changed the logging of accuracy metrics in TF 2.x (from 'acc' to 'accuracy').
if TF_FRAMEWORK_VERSION[0] == '2':
    metric_definitions = [{'Name': 'loss',      'Regex': 'loss: ([0-9\\.]+)'},
                          {'Name': 'acc',       'Regex': 'accuracy: ([0-9\\.]+)'},
                          {'Name': 'val_loss',  'Regex': 'val_loss: ([0-9\\.]+)'},
                          {'Name': 'val_acc',   'Regex': 'val_accuracy: ([0-9\\.]+)'}]
else:
    metric_definitions = [{'Name': 'loss',      'Regex': 'loss: ([0-9\\.]+)'},
                          {'Name': 'acc',       'Regex': 'acc: ([0-9\\.]+)'},
                          {'Name': 'val_loss',  'Regex': 'val_loss: ([0-9\\.]+)'},
                          {'Name': 'val_acc',   'Regex': 'val_acc: ([0-9\\.]+)'}]

if train_instance_count > 1:
    distributions = {'parameter_server': {'enabled': True}}
else:
    distributions = {'parameter_server': {'enabled': False}}

if use_spot:
    estimator = TensorFlow(entry_point='train-mobilenet.py',
                           source_dir='code',
                           train_instance_type=train_instance_type,
                           train_instance_count=train_instance_count,
                           distributions=distributions,
                           hyperparameters=hyperparameters,
                           metric_definitions=metric_definitions,
                           role=sagemaker.get_execution_role(),
                           train_use_spot_instances=use_spot,
                           train_max_run=60*60*10,
                           train_max_wait=60*60*12, # Seconds to wait for spot instances to become available
                           checkpoint_s3_uri=checkpoint_s3_uri,
                           framework_version=TF_FRAMEWORK_VERSION, 
                           py_version='py3',
                           base_job_name=JOB_PREFIX,
                           script_mode=True)
else:
    estimator = TensorFlow(entry_point='train-mobilenet.py',
                           source_dir='code',
                           train_instance_type=train_instance_type,
                           train_instance_count=train_instance_count,
                           distributions=distributions,
                           hyperparameters=hyperparameters,
                           metric_definitions=metric_definitions,
                           role=sagemaker.get_execution_role(),
                           framework_version=TF_FRAMEWORK_VERSION, 
                           py_version='py3',
                           base_job_name=JOB_PREFIX,
                           script_mode=True)

### To experiment with the use of Spot instances for SageMaker training, add this set of parameters to your
## call above when creating the TensorFlow estimator object:
##
######    train_use_spot_instances=True, train_max_run=2*60*60, train_max_wait=3*60*60,
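
The metric definitions above are regular expressions that are matched against the training job's log output. To see what they capture, the following sketch applies the TF 2.x patterns to a single log line using Python's `re` module. The sample line is made up in the style of a Keras epoch summary, and the helper name `extract_metrics` is an assumption for this example.

```python
import re

# Same patterns as the TF 2.x metric definitions in this notebook.
METRIC_PATTERNS = {
    'loss': r'loss: ([0-9\.]+)',
    'acc': r'accuracy: ([0-9\.]+)',
    'val_loss': r'val_loss: ([0-9\.]+)',
    'val_acc': r'val_accuracy: ([0-9\.]+)',
}

def extract_metrics(log_line):
    """Return the first value each pattern captures from one log line."""
    metrics = {}
    for name, pattern in METRIC_PATTERNS.items():
        match = re.search(pattern, log_line)
        if match:
            metrics[name] = float(match.group(1))
    return metrics

# Made-up line in the style of a Keras epoch summary.
sample_line = ('25/25 - 4s - loss: 0.5123 - accuracy: 0.8000 '
               '- val_loss: 0.6000 - val_accuracy: 0.7500')
sample_metrics = extract_metrics(sample_line)
print(sample_metrics)
```

Note that the text `loss: ` also appears inside `val_loss: `, so the unanchored `loss` pattern can match both. This sketch keeps only the first match on the line; if the training and validation curves look identical in your metric charts, a stricter pattern may be worth considering.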

Here we establish pointers to where each data channel is located. For local mode training, these will be in the local file system. For remote training, the S3 data channel copies are provided.

In [None]:
data_dir = os.path.join(os.getcwd(), 'data')
print('Local data dir: {}'.format(data_dir))

if (local):
    train_in = f'file://{data_dir}/train'
    test_in  = f'file://{data_dir}/test'
    val_in   = f'file://{data_dir}/validation'
else:
    s3_base  = 's3://{}/{}'.format(bucket, prefix)

    if train_instance_count > 1:
        DISTRIBUTION_MODE = 'ShardedByS3Key'
    else:
        DISTRIBUTION_MODE = 'FullyReplicated'
        
    train_in = sagemaker.s3_input(s3_data=f'{s3_base}/train', distribution=DISTRIBUTION_MODE)
    test_in  = sagemaker.s3_input(s3_data=f'{s3_base}/test',  distribution=DISTRIBUTION_MODE)
    val_in   = sagemaker.s3_input(s3_data=f'{s3_base}/validation', distribution=DISTRIBUTION_MODE)

inputs = {'train':train_in, 'test': test_in, 'validation': val_in}

print(inputs)

Here we tell the estimator to fit the model. For local mode, this training is performed in the current notebook instance. For remote training, new instances are launched and training is performed on those separate instances. No matter which technique you use, the training script can count on the same interface (script parameters, specific SageMaker environment variables, consistent location of data channels in `/opt/ml/input/data`, and saving to `/opt/ml/model`).

In [None]:
estimator.fit(inputs) 

The batch lab will ask you to provide the name of the training job that produced the resulting model artifacts. The model will be used for batch inference in that lab. Here is the training job name you will provide:

In [None]:
print('Completed training job: {}'.format(estimator.latest_training_job.name))

### A note about spot
If you want to try simulating managed spot recovery from prior checkpoints, uncomment this block of code.
We'll bump up the number of epochs by 10, but pass in the checkpoint_s3_uri from the prior job. The job
should find the checkpoints from the prior training job, resume training from the final checkpoint of the prior job, and run for an additional 10 epochs.

In [None]:
# hyperparameters = {'initial_epochs': initial_epochs,
#                    'batch_size': 8,
#                    'fine_tuning_epochs': fine_tuning_epochs + 10, 
#                    'dropout': 0.4,
#                    'data_dir': '/opt/ml/input/data',
#                    's3_checkpoint_path': checkpoint_s3_uri}
# estimator = TensorFlow(entry_point='train-mobilenet.py',
#                        source_dir='code',
#                        train_instance_type=train_instance_type,
#                        train_instance_count=train_instance_count,
#                        distributions=distributions,
#                        hyperparameters=hyperparameters,
#                        metric_definitions=metric_definitions,
#                        role=sagemaker.get_execution_role(),
#                        train_use_spot_instances=use_spot,
#                        train_max_run=60*60*10,
#                        train_max_wait=60*60*12, # Seconds to wait for spot instances to become available
#                        checkpoint_s3_uri=checkpoint_s3_uri,  # add when using spot
#                        framework_version=TF_FRAMEWORK_VERSION, 
#                        py_version='py3',
#                        base_job_name=JOB_PREFIX,
#                        script_mode=True)
# estimator.fit(inputs)

# Host the model

Once the fit method is complete, we can simply call deploy (`estimator.deploy(...)`) to publish the trained model. 

### A note about iterating on your inference script
There are some scenarios in which you want to train once, but deploy multiple times with different iterations of your `inference.py` script. You can avoid having to retrain the model by simply replacing the `inference.py` file in the compressed model artifacts archive (`model.tar.gz`).

This next cell runs a script that downloads the model artifacts, replaces the inference files, and uploads the new model artifacts archive for deployment.

In [None]:
if not local:
    # re-deploy model with new inference.py without having to re-do training job
    training_job_name = estimator.latest_training_job.name
    model_artifacts = 's3://{}/{}/output/model.tar.gz'.format(bucket, training_job_name)
    print(model_artifacts)

    !bash ./replace-inference.sh $model_artifacts
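
The `replace-inference.sh` script is not shown in this notebook. As an illustration of the repacking step alone, here is a minimal Python sketch that swaps a new script into a local copy of `model.tar.gz`. The function name, the `code/inference.py` member path, and the assumption that the archive has already been downloaded from S3 are all choices made for this example.

```python
import os
import shutil
import tarfile
import tempfile

def replace_inference_script(src_archive, dst_archive, new_script,
                             member_name='code/inference.py'):
    """Write a copy of src_archive in which member_name is replaced by new_script."""
    with tempfile.TemporaryDirectory() as workdir:
        # Unpack the existing model artifacts.
        with tarfile.open(src_archive, 'r:gz') as tar:
            tar.extractall(workdir)
        # Overwrite (or add) the inference script.
        target = os.path.join(workdir, member_name)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(new_script, target)
        # Repack everything with paths relative to the archive root.
        with tarfile.open(dst_archive, 'w:gz') as tar:
            for entry in sorted(os.listdir(workdir)):
                tar.add(os.path.join(workdir, entry), arcname=entry)
```

After repacking, the new archive would still need to be uploaded to S3 and referenced as `model_data` when creating the model object.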

Here we deploy the model to a SageMaker endpoint.

If iterating on changes to your `inference.py` script, we re-create the SageMaker model object with the latest version of your script and deploy the endpoint using this latest model. This avoids having to run a new training job (`estimator.fit(inputs)`) just to grab your latest script copy.

Otherwise, simply deploy the model directly from the original estimator. You can use this approach (`estimator.deploy()`) once your `inference.py` code is stable. The deploy method will automatically create a SageMaker Model object on your behalf before creating your endpoint.

In [None]:
from sagemaker.tensorflow.serving import Model
if not local:
    model = Model(model_data=model_artifacts, 
                  role=sagemaker.get_execution_role(),
                  framework_version=TF_FRAMEWORK_VERSION,
                  sagemaker_session=sess)
    predictor = model.deploy(initial_instance_count=1, 
                            instance_type=serve_instance_type)
else:
    predictor = estimator.deploy(initial_instance_count=1, 
                               instance_type=serve_instance_type,
                               endpoint_type='tensorflow-serving') # endpoint_type parameter required for TFS local

# Test the model 

### A note about attaching a predictor to an already running endpoint
If you are using an endpoint that was already in service, without an estimator object, you can establish a predictor object using the following code.

In [None]:
#from sagemaker import RealTimePredictor
#predictor = RealTimePredictor(endpoint='<put-endpoint-name-here>')

## Predicting using the SageMaker TensorFlow endpoint
When calling your image classifier hosted in a SageMaker endpoint, the call to the predictor is slightly different than what you used in the first lab:

1. In lab 1, we preprocessed the image bytes, which is specific to our base MobileNet model. Instead, here we pass the image bytes directly. This hides the implementation detail of a model-specific preprocessing of the image as part of the endpoint. This gives us flexibility to switch to a different implementation down the road (e.g., ResNet50, InceptionV3, ...).
2. We indicate the content type for the REST call, in this case `application/x-image`.
3. The results come back as a JSON document, with the actual predictions in a `predictions` object.

The rest of the code remains the same.
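
To make the three points above concrete, here is a minimal sketch of the same request made through the low-level boto3 SageMaker runtime client rather than the SageMaker `predictor` object. The helper name `classify_image_bytes` is an assumption, as is the response layout of one row of class scores per image under `predictions`. A client such as the `runtime` object created earlier in this notebook could be passed in as `runtime_client`.

```python
import json

def classify_image_bytes(runtime_client, endpoint_name, image_bytes, class_names):
    """Send raw image bytes to an endpoint and return (class_name, score)."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/x-image',  # tells the endpoint the body is an image
        Body=image_bytes)
    result = json.loads(response['Body'].read())
    # Assumption: one row of class scores per image under 'predictions'.
    scores = result['predictions'][0]
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best], scores[best]
```

The client sends nothing but the image bytes and a content type, so the model behind the endpoint can change without any change to this function.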

In [None]:
HEIGHT=224; WIDTH=224

In [None]:
from sagemaker.predictor import json_deserializer
from IPython.display import Image, display
from numpy import argmax

def predict_bird_from_file(fn, verbose=True):
    with open(fn, 'rb') as img:
        f = img.read()
    x = bytearray(f)
    
    predictor.content_type = 'application/x-image'
    predictor.deserializer = json_deserializer
    predictor.serializer   = None
    
    results = predictor.predict(x)['predictions']

    predicted_class_idx = argmax(results)
    predicted_class = class_name_list[predicted_class_idx]
    confidence = results[0][predicted_class_idx]
    if verbose:
        display(Image(fn, height=HEIGHT, width=WIDTH))
        print('Class: {}, confidence: {:.2f}'.format(predicted_class, confidence))
    del img, x
    return predicted_class_idx, confidence

In [None]:
predict_bird_from_file('northern-cardinal-2.jpg')

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import itertools

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.GnBu):
    plt.figure(figsize=(7,7))
    plt.grid(False)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), 
                                  range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.gca().set_xticklabels(class_name_list)
    plt.gca().set_yticklabels(class_name_list)
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

In [None]:
from sklearn.metrics import confusion_matrix
def create_and_plot_confusion_matrix(actual, predicted):
    cnf_matrix = confusion_matrix(actual, np.asarray(predicted),labels=range(len(class_name_list)))
    plot_confusion_matrix(cnf_matrix, classes=range(len(class_name_list)))

### Assess prediction performance against validation and test datasets

In [None]:
from IPython.display import Image, display

# Iterate through entire dataframe, tracking predictions and accuracy. For mistakes, show the image, and the predicted and actual classes to help understand
# where the model may need additional tuning.

def test_image_df(df):
    print('Testing {} images'.format(df.shape[0]))
    num_errors = 0
    preds = []
    acts  = []
    for i in range(df.shape[0]):
        fname = df.iloc[i]['image_file_name']
        act   = int(df.iloc[i]['class_id']) - 1
        acts.append(act)
        pred, conf = predict_bird_from_file(IMAGES_DIR + '/' + fname, verbose=False)
        preds.append(pred)
        if (pred != act):
            num_errors += 1
            print('ERROR on image index {} -- Pred: {} {:.2f}, Actual: {}'.format(i, 
                                                                   class_name_list[pred], conf, 
                                                                   class_name_list[act]))
            img = Image(filename=f'{IMAGES_DIR}/{fname}', width=WIDTH, height=HEIGHT)
            display(img)

    return num_errors, preds, acts

In [None]:
num_images = val_df.shape[0]
num_errors, preds, acts = test_image_df(val_df)
print('\nAccuracy: {:.2f}, {}/{}'.format(1 - (num_errors/num_images), num_images - num_errors, num_images))

In [None]:
create_and_plot_confusion_matrix(acts, preds)

In [None]:
num_images = test_df.shape[0]
num_errors, preds, acts = test_image_df(test_df)
print('\nAccuracy: {:.2f}, {}/{}'.format(1 - (num_errors/num_images), num_images - num_errors, num_images))

In [None]:
create_and_plot_confusion_matrix(acts, preds)

### Test model against previously unseen images
Here we run predictions on images that the model has not yet seen.

In [None]:
predict_bird_from_file('northern-cardinal-2.jpg')
predict_bird_from_file('bobolink.jpg')

# Clean up
Finally, and importantly, to avoid being billed for an idle endpoint, here we delete the SageMaker endpoint.

In [None]:
predictor.delete_endpoint()