# Bird Object Detection - Training

Upload the training and validation data to the S3 bucket. We do this in multiple channels. Channels are simply directories in the bucket that differentiate the types of data provided to the algorithm. For the object detection algorithm, we call these directories train and validation.

Next we define an output location in S3, where the model artifacts will be placed on completion of the training. These artifacts are the output of the algorithm’s training job. We also get the URI to the Amazon SageMaker Object Detection Docker image. This ensures the estimator uses the correct algorithm from the current region.

In [None]:
import sagemaker
from sagemaker import get_execution_role

sess = sagemaker.Session()
# this will create a 'default' sagemaker bucket if it doesn't exist (sagemaker-region-accountid)
bucket = sess.default_bucket()

role = get_execution_role()

prefix = "DEMO-ObjectDetection-birds"
TRAIN_REC_FILE = "birds_ssd_sample_train.rec"
VAL_REC_FILE = "birds_ssd_sample_val.rec"
TRAIN_LST_FILE = "birds_ssd_sample_train.lst"
VAL_LST_FILE = "birds_ssd_sample_val.lst"

# Upload the RecordIO files to train and validation channels
train_channel = prefix + "/train"
validation_channel = prefix + "/validation"

sess.upload_data(path=TRAIN_REC_FILE, bucket=bucket, key_prefix=train_channel)
sess.upload_data(path=VAL_REC_FILE, bucket=bucket, key_prefix=validation_channel)

s3_train_data = "s3://{}/{}".format(bucket, train_channel)
s3_validation_data = "s3://{}/{}".format(bucket, validation_channel)

print(s3_train_data)
print(s3_validation_data)

Now that we have our data on S3, let's start a training job on SageMaker to train the model using the SageMaker Object Detection algorithm. At its core, the object detection algorithm is the Single-Shot Multi-Box detection algorithm (SSD). This algorithm uses a base_network, which is typically a VGG or a ResNet. The Amazon SageMaker object detection algorithm supports VGG-16 and ResNet-50. It also has a number of hyperparameters that help configure the training job. The next step in our training is to set up these hyperparameters and data channels for training the model. See the SageMaker Object Detection documentation for more details on its specific hyperparameters.

We get the URI to the Amazon SageMaker Object Detection Docker image. This ensures the estimator uses the correct algorithm from the current region.

In [None]:
from sagemaker import image_uris

training_image = image_uris.retrieve("object-detection", sess.boto_region_name, version="1")
print(training_image)

We also need to define an output location in S3, where the model artifacts will be placed on completion of the training. These artifacts are the output of the algorithm’s training job.

In [None]:
s3_output_location = "s3://{}/{}/output".format(bucket, prefix)

Now we create our estimator object, which will be responsible for running the training job. We pass the container image to use, the type and count of instances used for training, the attached storage volume size and a maximum runtime. We also set the input mode to 'File' and the output path for the resulting model. Note that 'File' mode copies all the training data onto the training instance(s) before training starts. Here we are using a GPU instance, ml.p3.2xlarge.

In [None]:
od_model = sagemaker.estimator.Estimator(
    training_image,
    role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    volume_size=50,
    max_run=360000,
    input_mode="File",
    output_path=s3_output_location,
    sagemaker_session=sess,
)

Now let's set the hyperparameters for our training job. Here we are using a pretrained 'resnet-50' base network, i.e. a ResNet-50 network (a convolutional neural network that is 50 layers deep) trained on the ImageNet data set. The number of classes and the number of training samples are read from the training .lst file.

In [None]:
import pandas as pd

IM2REC_SSD_COLS = [
    "header_cols",
    "label_width",
    "zero_based_id",
    "xmin",
    "ymin",
    "xmax",
    "ymax",
    "image_file_name",
]

train_lst = pd.read_csv(TRAIN_LST_FILE, sep="\t", names=IM2REC_SSD_COLS, header=None)

num_classes = len(train_lst['zero_based_id'].unique())
num_training_samples = train_lst.shape[0]
num_epochs, lr_steps = (100, "33,67")

od_model.set_hyperparameters(
    base_network="resnet-50",
    use_pretrained_model=1,
    num_classes=num_classes,
    mini_batch_size=16,
    epochs=num_epochs,
    learning_rate=0.001,
    lr_scheduler_step=lr_steps,
    lr_scheduler_factor=0.1,
    optimizer="sgd",
    momentum=0.9,
    weight_decay=0.0005,
    overlap_threshold=0.5,
    nms_threshold=0.45,
    image_shape=512,
    label_width=350,
    num_training_samples=num_training_samples,
)
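The step schedule configured above can be sketched in plain Python. This is only an illustration, not the algorithm's own code: the helper name `learning_rate_at` is hypothetical, and it assumes that the learning rate is multiplied by `lr_scheduler_factor` once for every epoch listed in `lr_scheduler_step` that has been reached, with zero-based epochs and an inclusive boundary.

```python
def learning_rate_at(epoch, base_lr=0.001, steps="33,67", factor=0.1):
    """Return the learning rate in effect at a given epoch (hypothetical helper).

    Assumption: the rate is multiplied by `factor` once for each step epoch
    that has been reached; the exact boundary handling inside the SageMaker
    algorithm may differ.
    """
    step_epochs = [int(s) for s in steps.split(",")]
    reductions = sum(1 for s in step_epochs if epoch >= s)
    return base_lr * factor ** reductions


# Print the rate at a few epochs of the 100-epoch run configured above
for epoch in (0, 32, 33, 66, 67, 99):
    print(epoch, learning_rate_at(epoch))
```

Under these assumptions, training would run at 0.001 for roughly the first third of the epochs, then at one tenth of that, then at one hundredth for the final third.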

Next we prepare the training and validation data (previously uploaded to S3) as inputs to the training job. Note that we create SageMaker TrainingInput objects that specify the location of the datasets, their format (RecordIO) and the way the datasets are distributed to the training job instances (FullyReplicated).

In [None]:
from sagemaker.inputs import TrainingInput

train_data = TrainingInput(
    s3_train_data,
    distribution="FullyReplicated",
    content_type="application/x-recordio",
    s3_data_type="S3Prefix",
)
validation_data = TrainingInput(
    s3_validation_data,
    distribution="FullyReplicated",
    content_type="application/x-recordio",
    s3_data_type="S3Prefix",
)
data_channels = {"train": train_data, "validation": validation_data}

Finally, we call fit() on our Estimator to launch the training job.

In [None]:
%%time

od_model.fit(inputs=data_channels, logs=True)

Let's evaluate our model's performance. In computer vision, mAP (mean Average Precision) is a popular evaluation metric for object detection, which combines localization and classification. Localization determines where an instance is (e.g. bounding box coordinates) and classification tells you what it is (e.g. the species of a bird). Precision measures how accurate your predictions are, i.e. the percentage of your predictions that are correct, and is defined in terms of true and false positives. Object detection systems make predictions in terms of a bounding box and a class label. For each predicted bounding box, we measure its overlap with the ground truth bounding box. This is measured by **IoU (intersection over union)**, the area of the intersection of the two boxes divided by the area of their union.

![IoU](./images/IoU.png)
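The overlap measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code the SageMaker algorithm uses internally; the function name `iou` and the `(xmin, ymin, xmax, ymax)` box layout are assumptions chosen to match the column names in the .lst file.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the intersection rectangle
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap
    intersection = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Made-up example boxes: a prediction shifted diagonally from the ground truth
predicted = (50, 50, 150, 150)
ground_truth = (100, 100, 200, 200)
print(iou(predicted, ground_truth))
```

Identical boxes give an IoU of 1, and boxes that do not overlap at all give 0.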


Whether a prediction counts as a true or a false positive depends on the IoU threshold. If the IoU threshold is 0.5 and the IoU value for a prediction is 0.7, we classify the prediction as a True Positive (TP). On the other hand, if the IoU is 0.3, we classify it as a False Positive (FP). The mean Average Precision, or mAP, score is calculated by taking the mean AP over all classes and/or over all IoU thresholds. Using our validation data, we measured mAP for each epoch of our training job. We can plot it over time to see that it increases rapidly in the first epochs, then stabilizes and fluctuates around a certain mAP value. AP always lies between 0 and 1, with higher values meaning the model performs better.
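To make the TP/FP thresholding and the averaging concrete, here is a rough sketch. It uses the simple non-interpolated form of AP (the mean of the precision values at each true positive, divided by the number of ground truth boxes), which is not necessarily the variant the SageMaker algorithm reports. The function name `average_precision`, the input layout, and the example numbers are all hypothetical, and the sketch assumes each prediction has already been matched to a distinct ground truth box.

```python
def average_precision(predictions, num_ground_truth, iou_threshold=0.5):
    """Non-interpolated AP for one class.

    predictions: list of (confidence, iou_with_matched_ground_truth) tuples.
    Simplifying assumption: no two predictions match the same ground truth box.
    """
    # Rank predictions from most to least confident
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    true_positives = 0
    false_positives = 0
    precision_sum = 0.0
    for _confidence, overlap in ranked:
        if overlap >= iou_threshold:
            true_positives += 1
            # Precision measured at the rank where recall increases
            precision_sum += true_positives / (true_positives + false_positives)
        else:
            false_positives += 1
    return precision_sum / num_ground_truth if num_ground_truth else 0.0


# Made-up predictions for two classes: (confidence, IoU)
per_class = {
    "class_a": ([(0.9, 0.7), (0.8, 0.3), (0.6, 0.55)], 2),
    "class_b": ([(0.95, 0.8)], 1),
}
ap_values = [average_precision(preds, n_gt) for preds, n_gt in per_class.values()]
mean_ap = sum(ap_values) / len(ap_values)
print(ap_values, mean_ap)
```

Raising `iou_threshold` turns borderline detections into false positives, which is why the same set of predictions can score differently at different thresholds.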

In [None]:
%matplotlib inline
from sagemaker.analytics import TrainingJobAnalytics

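# Name of the completed training job, e.g. od_model.latest_training_job.name
# if the estimator from this notebook session is still available
training_job_name = '<insert_training_job_name>'
metric_name = 'validation:mAP'

metrics_dataframe = TrainingJobAnalytics(
    training_job_name=training_job_name, metric_names=[metric_name]
).dataframe()
# DataFrame.plot returns a matplotlib Axes, so name it `ax` rather than `plt`
ax = metrics_dataframe.plot(kind='line', figsize=(12, 5), x='timestamp', y='value', style='b.', legend=False)
ax.set_ylabel(metric_name);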

The previous chart can also be visualized in SageMaker 'Trial Components'.