# Boots 'n' Cats 2b: Modelling with SageMaker Built-In Algorithm

In this notebook we'll try another approach to build our boots 'n' cats detector: the [SageMaker built-in Object Detection algorithm](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html).

Like Rekognition, this method doesn't need us to implement an algorithm, but this time we'll need to *know* more about deep neural networks to tune the model for good performance.

**You'll need to** have gone through the first notebook in this series (*Intro and Data Preparation*) to complete this example.

## About the Algorithm: SSD

Like most of the built-in algorithms, the Object Detection docs include a [How It Works](https://docs.aws.amazon.com/sagemaker/latest/dg/algo-object-detection-tech-notes.html) page with an overview and links to relevant resources.

SageMaker Object Detection uses the Single Shot MultiBox Detector (SSD) algorithm described in [Liu et al, 2016](https://arxiv.org/pdf/1512.02325.pdf).

The object detection / bounding box problem is by no means easy, and our toy data-set is both small and diverse. We therefore don't anticipate amazing performance in this example, and should expect the pre-training of the built-in model to strongly influence the results.

## Step 0: Dependencies and configuration

As usual we'll start by loading libraries, defining configuration, and connecting to the AWS SDKs:

In [None]:
%load_ext autoreload
%autoreload 1

# Built-Ins:
import csv
import os
from collections import defaultdict
import json

# External Dependencies:
import boto3
import imageio
import numpy as np
import sagemaker
from IPython.display import display, HTML

# Local Dependencies:
%aimport util

Next we re-load configuration from the intro & data processing notebook:

In [None]:
%store -r BUCKET_NAME
assert BUCKET_NAME, "BUCKET_NAME missing from IPython store"
%store -r CHECKPOINTS_PREFIX
assert CHECKPOINTS_PREFIX, "CHECKPOINTS_PREFIX missing from IPython store"
%store -r DATA_PREFIX
assert DATA_PREFIX, "DATA_PREFIX missing from IPython store"
%store -r MODELS_PREFIX
assert MODELS_PREFIX, "MODELS_PREFIX missing from IPython store"
%store -r CLASS_NAMES
assert CLASS_NAMES, "CLASS_NAMES missing from IPython store"
%store -r test_image_folder
assert test_image_folder, "test_image_folder missing from IPython store"

%store -r attribute_names
assert attribute_names, "attribute_names missing from IPython store"
%store -r n_samples_training
assert n_samples_training, "n_samples_training missing from IPython store"
%store -r n_samples_validation
assert n_samples_validation, "n_samples_validation missing from IPython store"

Here we just connect to the AWS SDKs we'll use, and validate the choice of S3 bucket:

In [None]:
role = sagemaker.get_execution_role()
session = boto3.session.Session()
region = session.region_name
s3 = session.resource("s3")
bucket = s3.Bucket(BUCKET_NAME)
smclient = session.client("sagemaker")

bucket_region = \
    session.client("s3").head_bucket(Bucket=BUCKET_NAME)["ResponseMetadata"]["HTTPHeaders"]["x-amz-bucket-region"]
assert (
    bucket_region == region
), f"Your S3 bucket {BUCKET_NAME} and this notebook need to be in the same region."

if (region != "us-east-1"):
    print("WARNING: Rekognition Custom Labels functionality is only available in us-east-1 at launch")
    
# Initialise some empty variables we need to exist:
predictor_std = None
predictor_hpo = None

## Step 1: Review our algorithm details

The first step in deciding to use a SageMaker built-in algorithm is to review its [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html) and [common parameters](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html) to understand its input/output interface, tunable parameters, use cases, etc.

In particular we'll need the URL for the Docker image in order to use the algorithm. While this is listed [in the docs](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html), it's also nice and easy to fetch programmatically.

(Note that some built-in algorithms have native classes in the SageMaker SDK, e.g. `sagemaker.KMeans`. We only need this `training_image` URL for custom algorithms, or for built-ins like this one that the SDK treats as generic.)

In [None]:
training_image = sagemaker.amazon.amazon_estimator.get_image_uri(
    region,
    "object-detection",
    repo_version="latest"
)
print(training_image)

## Step 2: Set up input data channels

SageMaker describes data connections in terms of **channels**, rather than "folders" or "sources", to try and avoid any inaccurate assumptions about how algorithms see the connection and what API is presented.

In this case we have in S3 for each of training and validation:

* A *JSONLines manifest file* listing what images are in the data-set (by their S3 URI) and what annotations have been collected for those images (bounding boxes from SageMaker Ground Truth)
* The image files themselves
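To make the manifest format concrete, here is a minimal sketch of reading one JSON Lines record. The bucket name, the label attribute name `boots-n-cats`, the class map and the box values are all made up for illustration; the field layout assumes the bounding-box output format of SageMaker Ground Truth, so check it against your own manifest.

```python
import json

# One hypothetical manifest line (the real file holds one JSON object per line).
# "boots-n-cats" is an assumed label attribute name, not taken from this notebook.
EXAMPLE_LINE = json.dumps({
    "source-ref": "s3://example-bucket/images/cat-001.jpg",
    "boots-n-cats": {
        "annotations": [
            {"class_id": 1, "left": 40, "top": 30, "width": 200, "height": 150},
        ],
        "image_size": [{"width": 640, "height": 480, "depth": 3}],
    },
    "boots-n-cats-metadata": {"class-map": {"0": "boot", "1": "cat"}},
})


def parse_manifest_line(line, label_attribute):
    """Return (image URI, list of box dicts) for one manifest record."""
    record = json.loads(line)
    boxes = record[label_attribute]["annotations"]
    return record["source-ref"], boxes


uri, boxes = parse_manifest_line(EXAMPLE_LINE, "boots-n-cats")
print(uri, len(boxes), "box(es)")
```

In a layout like this, the attribute names selected for training would presumably be the image reference plus the label attribute, which is how extra fields such as the metadata get ignored.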

We'd like SageMaker to provide the algorithm with a **stream of image records** comprising both the image data and the annotations. This avoids waiting for the full data-set to download to the container before training starts, and saves the algorithm from having to retrieve the image bytes for each annotation itself.

The [algorithm docs](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html#object-detection-inputoutput) explain how to set this up. SageMaker can already create RecordIO records for us from manifests:

In [None]:
train_channel = sagemaker.session.s3_input(
    f"s3://{BUCKET_NAME}/{DATA_PREFIX}/train.manifest",
    distribution="FullyReplicated",  # Every instance gets the full data-set, should we try distributed training
    content_type="application/x-recordio",
    s3_data_type="AugmentedManifestFile",
    record_wrapping="RecordIO",
    attribute_names=attribute_names  # In case the manifest contains other junk to ignore (it does!)
)
                                        
validation_channel = sagemaker.session.s3_input(
    f"s3://{BUCKET_NAME}/{DATA_PREFIX}/validation.manifest",
    distribution="FullyReplicated",
    content_type="application/x-recordio",
    record_wrapping="RecordIO",
    s3_data_type="AugmentedManifestFile",
    attribute_names=attribute_names
)

## Step 3: Configure the algorithm

The remainder of the pre-training setup concerns:

* Output data connection parameters (where to store final model artifacts and intermediate checkpoints)
* Compute resource specification
* Algorithm (hyper-) parameters

We do this through the SageMaker SDK's `Estimator` API, similarly to estimators in other common frameworks.

Note:

* "Pipe mode" streams input data to the algorithm rather than (the default) downloading the data up-front. This can accelerate training start-up for algorithms that support it.
* As detailed in the [common parameters](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html) docs, `object-detection` supports GPU-accelerated and distributed training. We use a GPU-accelerated `ml.p3.2xlarge` instance type, but don't bother to create more than one instance because the data-set is so small.
* Always prefer [spot instance](https://docs.aws.amazon.com/sagemaker/latest/dg/model-managed-spot-training.html) training where practical: It's an easy way to save ~70-90% on training costs!
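As a rough illustration of where the spot saving comes from: a finished training job reports both how long it ran and how long was billed, and the saving is the gap between the two. The sketch below is plain arithmetic on made-up numbers; it assumes you have read the two durations from the job details yourself (for example the `TrainingTimeInSeconds` and `BillableTimeInSeconds` fields of a training job description).

```python
def spot_savings_percent(training_seconds, billable_seconds):
    """Percentage saved versus on-demand, given actual and billed run time."""
    if training_seconds <= 0:
        raise ValueError("training_seconds must be positive")
    return 100.0 * (1.0 - billable_seconds / training_seconds)


# Hypothetical job: ran for 600 seconds but was billed for 180 seconds
print(f"{spot_savings_percent(600, 180):.0f}% saved")
```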

In [None]:
estimator = sagemaker.estimator.Estimator(
    training_image,  # URL to container image implementing the algorithm 
    role,  # IAM access to perform the API actions
    input_mode="Pipe",
    train_instance_count=1,
    train_instance_type="ml.p3.2xlarge",
    train_volume_size=50,  # Make sure we don't run out of space
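    train_max_run=5 * 60 * 60,  # Stop training after 5 hours at most
    train_use_spot_instances=True,
    train_max_wait=5 * 60 * 60,  # Overall limit, including time waiting for spot capacity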
    base_job_name="bootsncats-ssd",
    output_path=f"s3://{BUCKET_NAME}/{MODELS_PREFIX}",
    checkpoint_s3_uri=f"s3://{BUCKET_NAME}/{CHECKPOINTS_PREFIX}",
)

In [None]:
estimator.set_hyperparameters(
    # Pre-training is particularly important for tiny data-sets like this!:
    base_network="resnet-50",
    early_stopping=True,
    early_stopping_min_epochs=100,
    early_stopping_patience=20,
    epochs=400,
    image_shape=300,
    label_width=350,
    learning_rate=0.0002,
    lr_scheduler_factor=0.5,
    mini_batch_size=5,
    momentum=0.9,
    nms_threshold=0.45,
    num_classes=len(CLASS_NAMES),
    num_training_samples=n_samples_training,
    optimizer="sgd",
    overlap_threshold=0.5,
    use_pretrained_model=1,
    weight_decay=0.005,
)
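Two of the hyperparameters above, `overlap_threshold` and `nms_threshold`, are easier to reason about with the underlying quantity in hand. The sketch below is a generic illustration of intersection-over-union (IoU) and greedy non-maximum suppression, assuming both thresholds are expressed as IoU values. It is not the built-in algorithm's actual implementation, and the per-class suppression and box layout are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [xmin, ymin, xmax, ymax] boxes."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, nms_threshold=0.45):
    """Keep the highest-scoring boxes, dropping same-class boxes that overlap a kept one.

    Each detection is (class_ix, score, [xmin, ymin, xmax, ymax]).
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(
            det[0] != k[0] or iou(det[2], k[2]) <= nms_threshold
            for k in kept
        ):
            kept.append(det)
    return kept


# Made-up detections: the second box is a near-duplicate of the first
dets = [
    (0, 0.9, [0.10, 0.10, 0.50, 0.50]),
    (0, 0.6, [0.12, 0.12, 0.52, 0.52]),
    (1, 0.7, [0.60, 0.60, 0.90, 0.90]),
]
print(greedy_nms(dets))
```

A lower `nms_threshold` suppresses more aggressively (fewer overlapping boxes survive), while a higher value lets near-duplicates through.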

## Step 4: Train the model

The hyperparameters above represent our best up-front guess; and it's easy enough to call `estimator.fit()` to train a model as shown below.

Instead though, we can improve model performance and reduce some of the guesswork in setting these hyperparameters by letting the SageMaker `HyperparameterTuner` optimize them. SageMaker HPO uses a [Bayesian optimization](https://arxiv.org/abs/1807.02811) strategy (unless you [tell it otherwise](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html)), which is specifically formulated for this kind of expensive-to-evaluate optimization scenario and is much more cost-efficient than naive options like grid search.
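To get a feel for why grid search is costly here, the sketch below counts how many training jobs a full grid would need when tuning five hyperparameters. The number of grid points per parameter and the job budget are assumptions chosen for illustration, not recommendations.

```python
from math import prod

# Hypothetical grid: how many values we might try per hyperparameter
grid_points = {
    "learning_rate": 5,
    "momentum": 5,
    "weight_decay": 5,
    "mini_batch_size": 5,
    "optimizer": 4,  # e.g. sgd, adam, rmsprop, adadelta
}


def grid_search_jobs(points_per_parameter):
    """Number of training jobs needed to cover every combination once."""
    return prod(points_per_parameter.values())


hpo_budget = 20  # assumed cap on the number of tuning jobs
n_grid = grid_search_jobs(grid_points)
print(f"Grid search: {n_grid} jobs; HPO budget: {hpo_budget} jobs")
```

With each job taking minutes on a GPU instance, even a coarse grid quickly becomes impractical, which is the motivation for a strategy that chooses each next trial based on earlier results.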

Because HPO typically takes much longer than standard model fitting, `tuner.fit()` is an **asynchronous** method by default whereas `estimator.fit()` is **synchronous** (blocking).

If you'd like to compare both, we suggest you run `WITH_HPO=True` first, **then** try `False`.

In [None]:
WITH_HPO = True  # TODO: True first, then False?

In [None]:
%%time
if (not WITH_HPO):
    estimator.fit({ "train": train_channel, "validation": validation_channel }, logs=True)
else:
    hyperparameter_ranges = {
        "learning_rate": sagemaker.tuner.ContinuousParameter(0.0001, 0.1),
        "momentum": sagemaker.tuner.ContinuousParameter(0.0, 0.99),
        "weight_decay": sagemaker.tuner.ContinuousParameter(0.0, 0.99),
        "mini_batch_size": sagemaker.tuner.IntegerParameter(1, n_samples_validation),
        "optimizer": sagemaker.tuner.CategoricalParameter(['sgd', 'adam', 'rmsprop', 'adadelta'])
    }

    tuner = sagemaker.tuner.HyperparameterTuner(
        estimator,
        "validation:mAP",  # Name of the objective metric to optimize
        objective_type="Maximize",  # "Mean Average Precision" high = good
        hyperparameter_ranges=hyperparameter_ranges,
        base_tuning_job_name="bootsncats-ssd-hpo",
        # Defining the maximum number and parallelism of HPO training jobs:
        # Note that accounts have protective limits on number of GPU instances by default.
        # For Event Engine accounts, default max ml.p3.2xlarge = 2
        # Set max_parallel_jobs = (limit / train_instance_count) - 1
        # (minus one lets you run HPO and non-HPO in parallel)
        max_jobs=20,  # TODO: Ideally 20 or more
        max_parallel_jobs=1,  # TODO: Maybe only 1 for Event Engine, more if possible
    )
    
    tuner.fit(
        { "train": train_channel, "validation": validation_channel },
        include_cls_metadata=False
    )

## Step 5: While the model(s) are training

Individual training jobs typically take around 10 minutes for this configuration, so the HPO job may take hours, depending on your configured `max_jobs`.

Take some time to familiarize yourself with the metrics reported in the *Training > Training jobs* and *Training > Hyperparameter tuning jobs* sections of the console. Both provide useful tracking of the inputs and parameters of training jobs, as well as the result metrics.
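The same status information can also be polled from code. The helper below is a minimal sketch, assuming a boto3 SageMaker client and the tuning job's name as shown in the console; `wait_for_tuning_job` is a hypothetical name introduced here, not part of the SageMaker SDK.

```python
import time


def wait_for_tuning_job(sm_client, job_name, poll_seconds=60, max_polls=600):
    """Poll a tuning job until it leaves the in-progress states; return the final status."""
    for _ in range(max_polls):
        desc = sm_client.describe_hyper_parameter_tuning_job(
            HyperParameterTuningJobName=job_name
        )
        status = desc["HyperParameterTuningJobStatus"]
        # Per-status counts of the child training jobs, if present in the response
        counters = desc.get("TrainingJobStatusCounters", {})
        print(f"{job_name}: {status} {counters}")
        if status not in ("InProgress", "Stopping"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} still running after {max_polls} polls")
```

It could be called as `wait_for_tuning_job(smclient, job_name)`, where `job_name` is the tuning job name you see in the console.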

If you're running through other notebooks at the same time, now is a good time to go and check on those!

You can proceed to the next step as soon as the first model (non-HPO) is finished fitting.

## Step 6: Deploy the model

Once a model is trained, SageMaker supports using it for either:

* Deploying the model to an *endpoint* for real-time inference
* Running a *batch transform* job on an input dataset

In this example we'll deploy a real-time endpoint, and use the same `WITH_HPO` parameter from earlier to select which model to deploy.

Since our endpoints won't be handling any significant traffic volumes, we provision a single non-accelerated instance.

In [None]:
%%time
if (WITH_HPO):
    if (predictor_hpo):
        predictor_hpo.delete_endpoint()
    print("Deploying HPO model...")
    predictor_hpo = tuner.deploy(
        initial_instance_count=1,
        instance_type="ml.m4.xlarge"
    )
else:
    if (predictor_std):
        predictor_std.delete_endpoint()
    print("Deploying standard (non-HPO) model...")
    predictor_std = estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m4.xlarge"
    )

## Step 7: Run inference on test images

Now that we have one or more models deployed, we can send our same test images to them and see how they perform!

The `visualize_detection()` function used here is provided in the `util` folder: it just uses matplotlib to plot the provided detection boxes over the image.

Unlike Rekognition Custom Labels, the built-in Object Detection algorithm doesn't estimate an optimal confidence threshold for us. What number do you find gives best results?
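To explore the effect of the threshold without re-plotting every image, it can help to filter the raw detections directly. The sketch below assumes each detection follows the response format documented for the built-in algorithm, `[class_index, confidence, xmin, ymin, xmax, ymax]`, with coordinates given as fractions of the image size. The example detections, class names and image size are made up, and `filter_detections` is a hypothetical helper rather than part of `util`.

```python
def filter_detections(detections, class_names, threshold, width, height):
    """Keep detections at or above `threshold`, returning pixel-space boxes."""
    results = []
    for class_ix, score, xmin, ymin, xmax, ymax in detections:
        if score < threshold:
            continue
        results.append({
            "label": class_names[int(class_ix)],
            "score": score,
            "box": (
                round(xmin * width), round(ymin * height),
                round(xmax * width), round(ymax * height),
            ),
        })
    return results


# Hypothetical response contents and class names, for illustration only
example = [
    [0.0, 0.82, 0.1, 0.2, 0.5, 0.9],
    [1.0, 0.15, 0.3, 0.3, 0.6, 0.7],
]
for threshold in (0.1, 0.2, 0.5):
    kept = filter_detections(example, ["boot", "cat"], threshold, 640, 480)
    print(threshold, [d["label"] for d in kept])
```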

In [None]:
# Change this if you want something different:
predictor = predictor_hpo if WITH_HPO else predictor_std

# This time confidence is 0-1, not 0-100:
confidence_threshold = 0.2  # TODO: 0.2 is a good starting point, but explore options!

# Create the runtime client once, rather than once per image:
client = boto3.client("sagemaker-runtime")

for test_image in os.listdir(test_image_folder):
    test_image_path = f"{test_image_folder}/{test_image}"
    with open(test_image_path, "rb") as f:
        payload = bytearray(f.read())

    response = client.invoke_endpoint(
        EndpointName=predictor.endpoint,
        ContentType='application/x-image',
        Body=payload
    )

    result = response['Body'].read()
    result = json.loads(result)["prediction"]
    # result is a list of [class_ix, confidence, xmin, ymin, xmax, ymax] detections,
    # with box coordinates given as fractions (0-1) of the image width and height.
    display(HTML(f"<h4>{test_image}</h4>"))
    util.visualize_detection(
        test_image_path,
        result,
        CLASS_NAMES,
        thresh=confidence_threshold
    )

## Clean up

Although training instances are ephemeral, the resources we allocated for real-time endpoints need to be cleaned up to avoid ongoing charges.

The code below will delete the *most recently deployed* endpoint for the HPO and non-HPO configurations, but note that if you deployed either more than once, you might end up with extra endpoints.

To be safe, it's best to still check through the SageMaker console for any left-over resources when cleaning up.
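One way to look for left-over endpoints from code is to list them by name. The helper below is a minimal sketch, assuming a boto3 SageMaker client; `find_endpoints` is a hypothetical name, and whether your endpoint names actually contain a substring like `bootsncats` depends on how they were created, so treat the result as a prompt for manual review rather than a definitive inventory.

```python
def find_endpoints(sm_client, name_contains):
    """Return the names of all endpoints whose name contains the given substring."""
    names = []
    kwargs = {"NameContains": name_contains}
    while True:
        page = sm_client.list_endpoints(**kwargs)
        names.extend(ep["EndpointName"] for ep in page.get("Endpoints", []))
        token = page.get("NextToken")
        if not token:
            return names
        # More results remain: request the next page
        kwargs["NextToken"] = token
```

For example, `find_endpoints(smclient, "bootsncats")` would return candidate names, each of which could then be removed with `smclient.delete_endpoint(EndpointName=name)` once you are sure it is no longer needed.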

In [None]:
if (predictor_hpo):
    print("Deleting HPO-optimized predictor endpoint")
    predictor_hpo.delete_endpoint()
if (predictor_std):
    print("Deleting standard (non-HPO) predictor endpoint")
    predictor_std.delete_endpoint()

## Review

In this notebook we used our SageMaker Ground Truth annotated dataset to train the built-in Object Detection algorithm for our use case.

You probably found that, with this small dataset and these starting hyperparameters, it was hard to reach the same level of performance as the automatic learning in Rekognition Custom Labels. In exchange, we have much more control over the model (and our costs), which is useful where our team has knowledge of the problem and how to solve it well.

The next step on the control/complexity continuum would be to use a custom algorithm in place of the built-in one. This is a good fit for teams interested in exploring [other object detection procedures](https://arxiv.org/pdf/1908.03673.pdf) like YOLO, Fast(er)-RCNN, or any more recent advances, besides the SageMaker SSD implementation.

Thanks for taking the time to explore this notebook and the others in the series: We'd love to hear your feedback!