# This notebook includes code to build, tune, and train models using SageMaker's object detection algorithm.

Using a notebook as opposed to SageMaker's UI gives us the advantage of having access to all of the model artifacts in one place. It also allows us to specify all input data and output locations in a single notebook.

#### Code and documentation in this notebook were heavily inspired by the following Object Detection sample notebook created by Amazon SageMaker:
https://sagemaker-examples.readthedocs.io/en/latest/introduction_to_amazon_algorithms/object_detection_birds/object_detection_birds.html

# Setup

#### First, specify if you want to train a model off of a REC file (i.e., a RecordIO File) or an Augmented Manifest File.

In [None]:
# Specify either "REC" or "AugmentedManifest"
data_file = "REC"

The next cell initializes several settings whose values depend on which data file is being used.

In [None]:
# Name of the professor's S3 Bucket dedicated to machine learning
# (part of the name has been redacted for privacy purposes)
bucket_name = "sagemaker-us-west-2-************"

# If training off of a REC File:
if data_file == "REC":
    print("Training off of RecordIO File")
    train_path = f"s3://{bucket_name}/train/train_full.rec"
    val_path = f"s3://{bucket_name}/validation/val.rec"
    input_mode = "File"
    content_type="application/x-recordio"
    s3_data_type = "S3Prefix"
    attribute_names = None

# If training off of an Augmented Manifest File:
elif data_file == "AugmentedManifest":
    print("Training off of Augmented Manifest File")
    train_path = f"s3://{bucket_name}/train_full_manifest/train_full_AugmentedManifestFile.jsonl"
    val_path = f"s3://{bucket_name}/val_manifest/val_AugmentedManifestFile.jsonl"
    input_mode = "Pipe"
    content_type="application/x-image"
    s3_data_type = "ManifestFile"
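    attribute_names = ["spectrogram", "boxes"]

# Fail early on any other value so later cells don't hit an undefined variable
else:
    raise ValueError(f'data_file must be "REC" or "AugmentedManifest", got {data_file!r}')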
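For reference, each line of an Augmented Manifest File is a standalone JSON object whose keys match `attribute_names`. The sketch below builds one such line. It assumes the first attribute (`spectrogram`) holds the image's S3 URI and the second (`boxes`) holds annotations in the layout that SageMaker's object detection documentation describes for augmented manifests. The bucket name, image key, image size, and box coordinates are made-up values for illustration only.

```python
import json

# Hypothetical example values; replace with your own image URI and annotations
example_record = {
    "spectrogram": "s3://example-bucket/images/clip_0001.png",
    "boxes": {
        "image_size": [{"width": 512, "height": 512, "depth": 3}],
        "annotations": [
            {"class_id": 0, "left": 40, "top": 100, "width": 60, "height": 120},
        ],
    },
}

# A manifest is JSON Lines: one compact JSON object per line
manifest_line = json.dumps(example_record)
print(manifest_line)
```

The key order matters here in the sense that `attribute_names` lists the image attribute first and the annotation attribute second.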

In [None]:
print(train_path)

#### Now we create a connection to the professor's S3 Bucket, a connection to the data files, and a connection to SageMaker's object detection algorithm.

We start by connecting to the S3 Bucket where we have all of our training data and validation data. We also need to specify their precise file paths.

In [None]:
# Imports for the S3 Bucket
import sagemaker
import boto3

# Creating a connection to the S3 Bucket that has our training and validation data
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)

# Indicating the exact location of training and validation data
s3_train_data = train_path
s3_validation_data = val_path

We need to provide proper authentication to allow for the use of Amazon's SageMaker services, so we must specify our execution role from an account with SageMaker access. This also allows for access to the data in the S3 bucket.

In [None]:
# Gets execution role to authenticate usage of SageMaker services and access to S3 bucket
from sagemaker import get_execution_role

role = get_execution_role()
print(role)
sess = sagemaker.Session()

We need to get the URI to the Amazon SageMaker Object Detection docker image. This ensures the estimator uses the correct algorithm from the correct region, which is specified based on the session.

In [None]:
from sagemaker import image_uris

# Retrieves the URI to the object detection docker image
training_image = image_uris.retrieve(
    region=sess.boto_region_name, framework="object-detection", version="latest"
)
print(training_image)

We must also specify our desired output location (folder path within the S3 bucket) for model artifacts once the model is trained.

In [None]:
# S3 Bucket output location for model artifacts
s3_output_location = f"s3://{bucket_name}/output"
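As a rough guide to where the artifacts end up, the sketch below computes the expected location of a finished job's model archive. It assumes SageMaker's default layout, in which each training job writes `model.tar.gz` under `<output_path>/<job name>/output/`. The helper name, bucket name, and job name are hypothetical.

```python
def expected_artifact_uri(output_location: str, job_name: str) -> str:
    """Return the S3 URI where a finished training job's model archive is expected."""
    # Strip any trailing slash so the joined path has no empty segment
    return f"{output_location.rstrip('/')}/{job_name}/output/model.tar.gz"


# Hypothetical bucket and job names, for illustration only
print(expected_artifact_uri("s3://example-bucket/output", "example-training-job"))
```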

#### Now we can start building the model framework and specifying the parameter values for the model type. Some examples of these are the algorithm, execution role, instance type, and output location.

Here is a link documenting the Estimator object and its attributes: 
https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html

In [None]:
# Building the model framework using a SageMaker estimator object
od_model = sagemaker.estimator.Estimator(
    training_image,
    role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    volume_size=5,
    max_run=360000,
    input_mode=input_mode,
    output_path=s3_output_location,
    sagemaker_session=sess,
)

# Set Hyperparameters

#### Now we define the hyperparameters for our object detection model. At the time of creating this notebook, SageMaker's object detection algorithm supports two base networks, VGG-16 and ResNet-50. The "base_network" hyperparameter selects the architecture. Pretrained weights for that network are loaded only if "use_pretrained_model" is equal to 1; otherwise the network is trained from scratch.

Here is a link to the hyperparameter documentation for object detection models in SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection-api-config.html

In [None]:
# Function to set hyperparameters for the model
def set_hyperparameters(num_classes, num_training_samples, mini_batch_size, num_epochs, learning_rate, lr_steps,
                        lr_scheduler_factor, base_network="vgg-16", use_pretrained_model = 0, optimizer="sgd", 
                        momentum=0.9, weight_decay=0.0005, overlap_threshold=0.5, nms_threshold=0.45,
                        image_shape=512, label_width=350):
    # Echo the chosen values so they are recorded in the notebook output
    print(
        f"num classes: {num_classes}, num training images: {num_training_samples}, mini batch size: {mini_batch_size}, \n"
        f" num_epochs: {num_epochs}, learning rate: {learning_rate}, lr steps: {lr_steps}, lr scheduler factor: {lr_scheduler_factor}, \n"
        f" base network: {base_network}, optimizer: {optimizer}, momentum: {momentum}, weight decay: {weight_decay}, \n"
        f" overlap threshold: {overlap_threshold}, nms threshold: {nms_threshold}"
    )

    od_model.set_hyperparameters(
        base_network=base_network,
        use_pretrained_model=use_pretrained_model,
        num_classes=num_classes,
        mini_batch_size=mini_batch_size,
        epochs=num_epochs,
        learning_rate=learning_rate,
        lr_scheduler_step=lr_steps,
        lr_scheduler_factor=lr_scheduler_factor,
        optimizer=optimizer,
        momentum=momentum,
        weight_decay=weight_decay,
        early_stopping = True,
        #early_stopping = False,
        early_stopping_min_epochs = 70,
        overlap_threshold=overlap_threshold,
        nms_threshold=nms_threshold,
        image_shape=image_shape,
        label_width=label_width,
        num_training_samples=num_training_samples,
    )

In [None]:
# Set hyperparameters
set_hyperparameters(num_classes = 5, 
                    num_training_samples = 10400, 
                    mini_batch_size = 4, 
                    num_epochs = 100, 
                    learning_rate = 0.001, 
                    lr_steps = "33,67", 
                    lr_scheduler_factor = 0.1,
                    use_pretrained_model = 1,
                    optimizer = "rmsprop", 
                    momentum = 0.22092291710943074, 
                    weight_decay = 0.000024952493030102602)
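To make the effect of `lr_steps` and `lr_scheduler_factor` concrete, the sketch below computes the learning rate in effect at a few epochs for the values used above. It assumes a plain step schedule in which the learning rate is multiplied by the factor once for each listed epoch that has been reached. The exact epoch at which SageMaker applies each drop may differ by one, and the helper name is hypothetical.

```python
def learning_rate_at(epoch, base_lr, lr_steps, factor):
    """Learning rate in effect at `epoch` under an assumed step schedule."""
    # lr_steps uses the same comma-separated string format as the hyperparameter
    steps = [int(s) for s in lr_steps.split(",") if s.strip()]
    # Count how many scheduled drops have already happened by this epoch
    drops = sum(1 for s in steps if epoch >= s)
    return base_lr * factor ** drops


for epoch in (0, 33, 67, 99):
    print(epoch, learning_rate_at(epoch, 0.001, "33,67", 0.1))
```

With these settings the learning rate falls by a factor of ten roughly one third and two thirds of the way through the 100 epochs.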

### Now, different code chunks need to be run depending on whether your goal is to tune a model or train a model.

# Model Tuning

Regardless of what kind of tuning job you create, run the following library import statements now.

In [None]:
# Import Statements
from sagemaker import tuner
from sagemaker import parameter
import time

#### Now, different code chunks need to be run depending on whether your goal is to tune a model from scratch or tune a model based on a past tuning job.

## Tuning a Model From Scratch

Template for Hyperparameter Tuning comes from the following sample notebook created by Amazon SageMaker: https://github.com/aws/amazon-sagemaker-examples/blob/main/hyperparameter_tuning/xgboost_random_log/hpo_xgboost_random_log.ipynb

In [None]:
# Creates a "HyperparameterTuner" object and specifies its parameters (see SageMaker documentation for more information)

tuner_log = tuner.HyperparameterTuner(
    estimator = od_model,
    objective_metric_name = "validation:mAP",
    hyperparameter_ranges = {"learning_rate": parameter.ContinuousParameter(min_value = 0.00155, max_value = 0.00156),
                            #"mini_batch_size": parameter.IntegerParameter(min_value = 1, max_value = 5),
                            "momentum": parameter.ContinuousParameter(min_value = 0.19494, max_value = 0.19495),
                            #"optimizer": parameter.CategoricalParameter(values = ['rmsprop', 'adam', 'sgd', 'adadelta']),
                            "weight_decay": parameter.ContinuousParameter(min_value = 0.0000173708, max_value = 0.0000173709)},
    max_jobs=1,
    max_parallel_jobs=1,
    strategy="Bayesian",
    early_stopping_type = "Auto"
)

Now we can submit the tuning job using the fit method. Once it is done, we can access model artifacts in the S3 bucket where the output directory was specified previously. Feel free to close this notebook and stop the notebook instance once the tuning job has begun.

In [None]:
# Submitting the tuning job
"""WARNING: Make sure you specify a unique, informative, and concise name for the tuning job."""
tuner_log.fit(
    {"train": s3_train_data, "validation": s3_validation_data},
    logs=True,
    include_cls_metadata=False,
    # Example of including the tuning job's start time within its name:
    # job_name="cpbio-1stTJ" + time.strftime("%Y%m%d-%H-%M-%S", time.gmtime()),
    job_name = ""
)
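Because an invalid name only fails once the job is submitted, it can help to build and check the name beforehand. The sketch below is one way to do that. It assumes SageMaker's limits for tuning job names as we understand them: at most 32 characters, made up of letters, digits, and hyphens, with no leading or trailing hyphen. The helper name and the prefix are hypothetical.

```python
import re
import time

# Assumed SageMaker constraints for hyperparameter tuning job names
MAX_TUNING_JOB_NAME_LENGTH = 32
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")


def make_tuning_job_name(prefix, now=None):
    """Append a UTC timestamp to `prefix` and validate the resulting name."""
    # time.gmtime(None) uses the current time; pass `now` (epoch seconds) to fix it
    stamp = time.strftime("%Y%m%d-%H-%M-%S", time.gmtime(now))
    name = f"{prefix}-{stamp}"
    if len(name) > MAX_TUNING_JOB_NAME_LENGTH:
        raise ValueError(f"{name!r} is {len(name)} characters; the limit is {MAX_TUNING_JOB_NAME_LENGTH}")
    if not NAME_PATTERN.match(name):
        raise ValueError(f"{name!r} may only contain letters, digits, and hyphens")
    return name


print(make_tuning_job_name("od-tune"))
```

The timestamp takes 17 characters plus a separating hyphen, which leaves 14 characters for the prefix.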

## Tuning a Model Based on Past Tuning Job(s)

The rest of the hyperparameter tuning code comes from the following sample notebook created by SageMaker: https://github.com/aws/amazon-sagemaker-examples/blob/main/hyperparameter_tuning/image_classification_warmstart/hpo_image_classification_warmstart.ipynb

In [None]:
# Implements "Warm Start" (the means by which a new tuning job can obtain information from past tuning jobs)
from sagemaker.tuner import WarmStartConfig, WarmStartTypes

"""Specify the name(s) of the past tuning job(s) you want the new tuning job to inherit from. You can specify up to five."""
parent_tuning_jobs = {"", ""}
warm_start_config = WarmStartConfig(
    WarmStartTypes.IDENTICAL_DATA_AND_ALGORITHM, parents=parent_tuning_jobs
)

parent_tuning_jobs

In [None]:
# Specifies hyperparameter ranges for the new tuning job, and passes in information for accessing past tuning jobs
from sagemaker import tuner
from sagemaker import parameter
import time

tuner_warm_start = tuner.HyperparameterTuner(
    estimator = od_model,
    objective_metric_name = "validation:mAP",
    hyperparameter_ranges = {"learning_rate": parameter.ContinuousParameter(min_value = 0.0005, max_value = 0.0015),
                            "momentum": parameter.ContinuousParameter(min_value = 0.21, max_value = 0.23),
                            "weight_decay": parameter.ContinuousParameter(min_value = 0.000015, max_value = 0.000035)},
    max_jobs=10,
    max_parallel_jobs=1,
    strategy="Bayesian",
    early_stopping_type = "Auto",
    warm_start_config=warm_start_config,
)

Now we can submit the tuning job using the fit method. Once it is done, we can access model artifacts in the S3 bucket where the output directory was specified previously. Feel free to close this notebook and stop the notebook instance once the tuning job has begun.

In [None]:
# Submitting the Tuning Job
"""WARNING: Make sure you specify a unique, informative, and concise name for the tuning job."""
tuner_warm_start.fit(
    {"train": s3_train_data, "validation": s3_validation_data},
    logs=True,
    include_cls_metadata=False,
    job_name=""
)

## Visualizing Results from a Tuning Job (Wait Until Tuning Job is Complete)

#### Regardless of whether you tuned from scratch or used "warm start", the following code chunks help visualize the tuning job's results. Note that you can close this notebook and stop the notebook instance, and you will still be able to run this code to visualize results.

This code chunk displays a table summarizing the tuning job's information and performance.

In [None]:
# Can bring up a table of metrics once the tuning job completes

"""Specify the name of the tuning job you want to visualize results for."""
tuning_job_name = ""

from sagemaker import analytics

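tuner_parent_metrics = analytics.HyperparameterTuningJobAnalytics(tuning_job_name)

# Fetch the results once; the table stays empty until training jobs report metrics
df_parent = tuner_parent_metrics.dataframe()
if not df_parent.empty:
    # Sort so the best-scoring training job appears first
    df_parent = df_parent.sort_values(["FinalObjectiveValue"], ascending=False)

df_parent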

This code chunk plots the mAP scores for all training jobs within the tuning job. This assumes you have run the previous code chunk.

In [None]:
# Plots how "mean average precision" changes as tuning progresses
import bokeh
import bokeh.io

bokeh.io.output_notebook()
from bokeh.plotting import figure, show
from bokeh.models import HoverTool

import pandas as pd

df_parent_objective_value = df_parent[df_parent["FinalObjectiveValue"] > -float("inf")]

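# Note: Bokeh 3.0 renamed plot_width/plot_height to width/height;
# on older Bokeh releases, use the plot_* names instead.
p = figure(
    width=900,
    height=400,
    x_axis_type="datetime",
    x_axis_label="datetime",
    y_axis_label="validation:mAP",
)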
p.circle(
    source=df_parent_objective_value, x="TrainingStartTime", y="FinalObjectiveValue", color="black"
)

show(p)

# Model Training

#### Now we can submit the training job using the fit method. Once it is done, we can access model artifacts in the S3 bucket where the output directory was specified previously. Feel free to close this notebook and stop the notebook instance once the training job has begun.

Before we submit a training job, we must specify our data types and the locations for the data channels.

In [None]:
# Specifying training and validation inputs
train_data = sagemaker.inputs.TrainingInput(
    s3_train_data,
    distribution="FullyReplicated",
    content_type=content_type,
    s3_data_type=s3_data_type,
    attribute_names = attribute_names
)
validation_data = sagemaker.inputs.TrainingInput(
    s3_validation_data,
    distribution="FullyReplicated",
    content_type=content_type,
    s3_data_type=s3_data_type,
    attribute_names = attribute_names
)
data_channels = {"train": train_data, "validation": validation_data}

In [None]:
%%time
# Submitting the training job
"""WARNING: Make sure you specify a unique, informative, and concise name for the training job."""
od_model.fit(inputs=data_channels, logs=True, job_name = "")

# Visualizing Results from a Training Job (Wait Until Training Job is Complete)

### Regardless of whether you tuned a model or trained a model, training jobs will have been created. The following code chunks help visualize a training job's results. Note that you can close this notebook and stop the notebook instance, and you will still be able to run this code to visualize results.

Now that we have trained a model, we can look at the Mean Average Precision (mAP) score to assess how the training job progressed on the validation data. The code below reads the training job's CloudWatch log, collects every validation mAP score it finds, and plots the scores in the order they were logged, which should correspond to successive epochs.

In [None]:
import boto3
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

%matplotlib inline

# Specifying the training log channel
client = boto3.client("logs")
BASE_LOG_NAME = "/aws/sagemaker/TrainingJobs"

# Function to plot the mAP score over time against the epochs
def plot_object_detection_log(model, title):
    logs = client.describe_log_streams(
        logGroupName=BASE_LOG_NAME, logStreamNamePrefix=model
    )
    cw_log = client.get_log_events(
        logGroupName=BASE_LOG_NAME, logStreamName=logs["logStreams"][0]["logStreamName"]
    )

    mAP_accs = []
    for e in cw_log["events"]:
        msg = e["message"]
        if "validation mAP <score>=" in msg:
            num_start = msg.find("(")
            num_end = msg.find(")")
            mAP = msg[num_start + 1 : num_end]
            mAP_accs.append(float(mAP))

    print(title)
    print("Maximum mAP: %f " % max(mAP_accs))

    fig, ax = plt.subplots()
    plt.xlabel("Epochs")
    plt.ylabel("Mean Avg Precision (mAP)")
    (val_plot,) = ax.plot(range(len(mAP_accs)), mAP_accs, label="mAP")
    plt.legend(handles=[val_plot])
    ax.yaxis.set_ticks(np.arange(0.0, 1.05, 0.1))
    ax.yaxis.set_major_formatter(ticker.FormatStrFormatter("%0.2f"))
    plt.show()

In [None]:
# Call function to plot mAP score against epochs

"""Specify the name of the training job you want to produce a plot for."""
training_job_name = ""
plot_object_detection_log(training_job_name, "mAP tracking for job: " + training_job_name)
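The plotting function pulls each score out of a log message by looking for the first pair of parentheses. A slightly stricter alternative, sketched below, ties the match to the `validation mAP <score>=` marker so that parentheses elsewhere in a message cannot be picked up by mistake. The helper name and the sample messages are hypothetical; only the marker text is taken from the function above.

```python
import re

# Capture whatever sits between the parentheses that follow the marker
MAP_PATTERN = re.compile(r"validation mAP <score>=\(([^)]+)\)")


def extract_map(message):
    """Return the validation mAP in a log message, or None if it has none."""
    match = MAP_PATTERN.search(message)
    return float(match.group(1)) if match else None


# Made-up log lines for illustration; real messages may carry other fields
sample_messages = [
    "#quality_metric: host=algo-1, epoch=0, validation mAP <score>=(0.125)",
    "epoch=0, train cross_entropy <loss>=(1.5)",
    "#quality_metric: host=algo-1 (primary), epoch=1, validation mAP <score>=(0.25)",
]

scores = [s for s in map(extract_map, sample_messages) if s is not None]
print(scores)
```

Checking that `scores` is non-empty before calling `max` on it also avoids an error when a job has not logged any validation scores yet.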