# GluonCV SSD Mobilenet training and optimizing using SageMaker Neo

1. [Introduction](#Introduction)
2. [Setup](#Setup)
3. [Data Preparation](#Data-Preparation)
  1. [Download data](#Download-data)
  2. [Convert data into RecordIO](#Convert-data-into-RecordIO)
  3. [Upload data to S3](#Upload-data-to-S3)
4. [Train the model](#Train-the-model)
5. [Compile the trained model using SageMaker Neo](#Compile-the-trained-model-using-SageMaker-Neo)
6. [Deploy the compiled model and request Inferences](#Deploy-the-compiled-model-and-request-Inferences)
7. [Delete the Endpoint](#Delete-the-Endpoint)

## Introduction

This is an end-to-end example of training a GluonCV SSD model inside a SageMaker notebook and then compiling the trained model with SageMaker Neo. We show how to train a MobileNet-based model on the [Pascal VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) using the Single Shot MultiBox Detector ([SSD](https://arxiv.org/abs/1512.02325)) algorithm, and then how to optimize the trained model with SageMaker Neo and host it.

***This notebook is for demonstration purposes only. Please fine-tune the training parameters for your own dataset.***

## Setup

Before getting started, select the `Python 3 (Data Science)` kernel. Make sure the `Apache MXNet` and `OpenCV` packages are installed in the kernel; they are required to pre-process the dataset.

Next, we need to define a few variables and obtain certain permissions that will be needed later in the example. These are:
* A SageMaker session
* An IAM role to give learning, storage & hosting access to your data
* An S3 bucket, a folder & sub folders that will be used to store data and artifacts

In [None]:
%cd /root/amazon-sagemaker-examples/aws_sagemaker_studio/sagemaker_neo_compilation_jobs/gluoncv_ssd_mobilenet

In [None]:
!pip install mxnet
!apt-get update
!apt-get install -y python3-opencv

Then we need an AWS account role with SageMaker access. This role is used to give SageMaker access to your data in S3. We also create a session.

In [None]:
import sagemaker

sess = sagemaker.Session()
role = sagemaker.get_execution_role()

We then need an S3 bucket to store the training data, the custom code, and the model artifacts generated by training and compilation.

In [None]:
# S3 bucket and folders for saving code and model artifacts.
# Feel free to specify different bucket/folders here if you wish.
bucket = sess.default_bucket() 
folder = 'StudioDemo-ObjectDetection-SSD-MobileNet'
custom_code_sub_folder = folder + '/custom-code'
training_data_sub_folder = folder + '/training-data'
training_output_sub_folder = folder + '/training-output'
compilation_output_sub_folder = folder + '/compilation-output'

To visualize the detection outputs easily, we also define the following function. It draws bounding boxes for the high-confidence predictions and filters out low-confidence detections.

In [None]:
%matplotlib inline
def visualize_detection(img_file, dets, classes=(), thresh=0.6):
        """
        visualize detections in one image
        Parameters:
        ----------
        img_file : str
            path to the image file
        dets : sequence of arrays
            ssd detections as [class_ids, scores, bboxes], each with a leading
            batch dimension; bboxes are (x0, y0, x1, y1) in 512x512 coordinates
        classes : tuple or list of str
            class names
        thresh : float
            score threshold
        """
        import random
        import matplotlib.pyplot as plt
        import matplotlib.image as mpimg
        from matplotlib.patches import Rectangle

        img=mpimg.imread(img_file)
        plt.imshow(img)
        height = img.shape[0]
        width = img.shape[1]
        colors = dict()
        klasses = dets[0][0]
        scores = dets[1][0]
        bbox = dets[2][0]
        # Iterate over every detection, not over the list of class names
        for i in range(len(klasses)):
            klass = klasses[i][0]
            score = scores[i][0]
            x0, y0, x1, y1 = bbox[i]
            if score < thresh:
                continue
            cls_id = int(klass)
            if cls_id not in colors:
                colors[cls_id] = (random.random(), random.random(), random.random())
            xmin = int(x0 * width / 512)
            ymin = int(y0 * height / 512)
            xmax = int(x1 * width / 512)
            ymax = int(y1 * height / 512)
            rect = Rectangle((xmin, ymin), xmax - xmin,
                                 ymax - ymin, fill=False,
                                 edgecolor=colors[cls_id],
                                 linewidth=3.5)
            plt.gca().add_patch(rect)
            class_name = str(cls_id)
            if classes and len(classes) > cls_id:
                class_name = classes[cls_id]
            plt.gca().text(xmin, ymin-2,
                            '{:s} {:.3f}'.format(class_name, score),
                            bbox=dict(facecolor=colors[cls_id], alpha=0.5),
                                    fontsize=12, color='white')
        plt.tight_layout(rect=[0, 0, 2, 2])
        plt.show()
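
The filtering and rescaling steps inside `visualize_detection` can also be isolated into a small helper, which is handy for inspecting detections without plotting. This is a minimal sketch; the helper name `filter_detections` is hypothetical, and it assumes the same response layout that `visualize_detection` indexes into (`[class_ids, scores, boxes]`, each with a leading batch dimension, boxes in 512x512 model coordinates). The sample data is hand-made, not real model output.

```python
def filter_detections(dets, thresh=0.6, model_size=512, width=512, height=512):
    """Return (class_id, score, (xmin, ymin, xmax, ymax)) for confident detections.

    Assumes `dets` is [class_ids, scores, boxes], each with a leading batch
    dimension, matching what visualize_detection reads.
    """
    class_ids, scores, boxes = dets[0][0], dets[1][0], dets[2][0]
    kept = []
    for klass, score, box in zip(class_ids, scores, boxes):
        score = float(score[0])
        if score < thresh:
            continue  # drop low-confidence detections
        x0, y0, x1, y1 = (float(v) for v in box)
        # Rescale from model coordinates to the original image size
        kept.append((
            int(klass[0]),
            score,
            (int(x0 * width / model_size), int(y0 * height / model_size),
             int(x1 * width / model_size), int(y1 * height / model_size)),
        ))
    return kept


# Hand-made example data (not real model output): one confident, one weak detection
example = [
    [[[14.0], [6.0]]],                # class ids
    [[[0.91], [0.12]]],               # scores
    [[[128.0, 64.0, 256.0, 384.0],    # boxes in 512x512 model coordinates
      [0.0, 0.0, 10.0, 10.0]]],
]
result = filter_detections(example, thresh=0.2, width=1024, height=768)
print(result)
```

Only the first detection survives the 0.2 threshold, and its box is scaled to the 1024x768 image.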

In [None]:
# Initializing object categories
object_categories = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 
                     'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 
                     'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

# Setting a threshold 0.20 will only plot detection results that have a confidence score greater than 0.20
threshold = 0.20

Finally, we load the test image into memory. The test image used in this notebook is from [PEXELS](https://www.pexels.com/) and remains unseen by the model until the time of prediction.

In [None]:
import PIL.Image
import numpy as np

test_file = 'test.jpg'
test_image = PIL.Image.open(test_file)
test_image = np.asarray(test_image.resize((512, 512)))

## Data Preparation
[Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) was a popular computer vision challenge that released annual datasets for object detection from 2005 to 2012. In this notebook, we use the datasets from 2007 and 2012, referred to as VOC07 and VOC12 respectively. Together they contain more than 20,000 images with about 50,000 annotated objects, grouped into 20 categories.

***Notes:***
1. While using the Pascal VOC dataset, please be aware of the database usage rights. The VOC data includes images obtained from Flickr's website. Use of these images must respect the corresponding terms of use: https://www.flickr.com/help/terms
2. If you are running this notebook inside a SageMaker Notebook instance, you might run out of storage during this step, because the default EBS volume size for SageMaker Notebook instances is 5GB. One way to increase the EBS volume size of your Notebook instance is by using the AWS CLI as documented [here](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/update-notebook-instance.html).
3. Amazon SageMaker Studio uses Amazon Elastic File System (EFS). To manage the EFS volume size, follow the documentation [here](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-tasks-manage-storage.html).
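
Given the storage limits noted above, it can help to check the free space before downloading. The following is a minimal sketch using only the Python standard library. The helper name `has_free_space` and the 10 GB threshold are illustrative assumptions, not measured requirements, and on an EFS-backed volume the reported figure may not reflect any practical limit.

```python
import shutil


def has_free_space(path=".", required_gb=10.0):
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    return free_gb >= required_gb


free_gb = shutil.disk_usage(".").free / 1024 ** 3
print("Free space: {:.1f} GB".format(free_gb))

# 10 GB is an assumed safety margin for the archives plus extracted files
if not has_free_space(".", required_gb=10.0):
    print("Consider enlarging the volume before downloading the dataset.")
```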

### Download data
Download and extract the Pascal VOC datasets for 2007 and 2012 from Oxford University's website.

_Note: This step may take around 30 minutes to complete._

***If you have connection problems, the dataset can also be downloaded from this alternative link: https://course.fast.ai/datasets#image-localization***

In [None]:
%%time

# Download the dataset
!wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
!wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
!wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

In [None]:
%%time

# Extract the data
!tar -xf VOCtrainval_11-May-2012.tar
!tar -xf VOCtrainval_06-Nov-2007.tar
!tar -xf VOCtest_06-Nov-2007.tar
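
After extraction, the archives should have produced a `VOCdevkit` folder with one sub-folder per year. The sketch below checks for the folders the conversion step is likely to need. The helper name `missing_voc_dirs` is hypothetical, and the choice of which sub-folders to require is an assumption.

```python
import os


def missing_voc_dirs(root="VOCdevkit", years=("2007", "2012")):
    """Return the expected per-year sub-folders that are missing under `root`."""
    required = ("Annotations", "ImageSets", "JPEGImages")
    missing = []
    for year in years:
        for sub in required:
            path = os.path.join(root, "VOC" + year, sub)
            if not os.path.isdir(path):
                missing.append(path)
    return missing


missing = missing_voc_dirs()
print("Missing folders:", missing if missing else "none")
```

An empty list suggests the download and extraction completed; any listed path points to an archive worth re-downloading.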

### Convert data into RecordIO
[RecordIO](https://mxnet.incubator.apache.org/architecture/note_data_loading.html) is a highly efficient binary data format from [MXNet](https://mxnet.incubator.apache.org/). With this format, the dataset is simple to prepare and to transfer to the instance that will run the training job. Please refer to [object_detection_recordio_format](https://github.com/awslabs/amazon-sagemaker-examples/blob/80333fd4632cf6d924d0b91c33bf80da3bdcf926/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb) for more information on preparing a RecordIO dataset.

In [None]:
!python tools/prepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target VOCdevkit/train.lst
!python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target VOCdevkit/val.lst --no-shuffle

### Upload data to S3
Upload the converted data to the S3 bucket. 

In [None]:
# Upload the RecordIO file and its index for the train channel
sess.upload_data(path='VOCdevkit/train.rec', bucket=bucket, key_prefix=training_data_sub_folder)
sess.upload_data(path='VOCdevkit/train.idx', bucket=bucket, key_prefix=training_data_sub_folder)

Next, we need to set up the training and compilation output locations in S3, where the respective model artifacts will be stored. We also set up the S3 locations for the training data and the custom code.

In [None]:
# S3 Location where the training data is stored in the previous step
s3_training_data_location = 's3://{}/{}'.format(bucket, training_data_sub_folder)

# S3 Location to save the model artifact after training
s3_training_output_location = 's3://{}/{}'.format(bucket, training_output_sub_folder)

# S3 Location to save the model artifact after compilation
s3_compilation_output_location = 's3://{}/{}'.format(bucket, compilation_output_sub_folder)

# S3 Location to save your custom code in tar.gz format
s3_custom_code_upload_location = 's3://{}/{}'.format(bucket, custom_code_sub_folder)
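
The four locations above all follow the same `s3://bucket/prefix` pattern. If you customize the bucket or folder names, a small helper can guard against doubled or missing slashes. This is a sketch only: `s3_uri` is a hypothetical helper, and the bucket name below is a placeholder.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key parts into an s3:// URI without doubled slashes."""
    key = "/".join(p.strip("/") for p in parts if p and p.strip("/"))
    return "s3://{}/{}".format(bucket, key) if key else "s3://{}".format(bucket)


# Placeholder names for illustration only
example_bucket = "my-example-bucket"
example_folder = "StudioDemo-ObjectDetection-SSD-MobileNet"
example_uri = s3_uri(example_bucket, example_folder, "training-data")
print(example_uri)
```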

## Train the model
Now that we are done with all the setup that is needed, we are ready to train our object detector. To begin, we will create a [SageMaker MXNet estimator](https://sagemaker.readthedocs.io/en/stable/frameworks/mxnet/sagemaker.mxnet.html#mxnet-estimator) object which allows us to run single machine or distributed training in SageMaker, using CPU or GPU-based instances. After creating the estimator, training is started by calling ``fit()`` on this estimator. When we create the estimator, we pass:
- ``entry_point``: filename of the python script which defines training and hosting methods. Here we use `ssd_entry_point.py`
- ``role``: name of our IAM execution role.
- ``output_path``: S3 path where the training artifacts will be stored. We defined this in the previous step.
- ``code_location``: S3 path where the custom code including the entry_point script will be stored. We defined this in the previous step.
- ``instance_count`` & ``instance_type``: allows us to specify the number & type of SageMaker instances that will be used for the training job. For this example, we will choose one ``ml.p3.2xlarge`` instance.
- ``framework_version`` & ``py_version``: the MXNet version and the Python version used to run the training script.
- ``distribution``: dict with information on how to run distributed training. Here we will use distributed training with parameter_server.
- ``hyperparameters``: dict of values that will be passed to the entry_point script.
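
In script mode, SageMaker passes each entry in ``hyperparameters`` to the entry-point script as a command-line flag and exposes the `train` channel through the `SM_CHANNEL_TRAIN` environment variable. The sketch below shows how a script might read them. It is a hypothetical illustration, not the contents of `ssd_entry_point.py`, and the local fallback paths are assumptions for running outside a training container.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Hyperparameters from the estimator arrive as command-line flags
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--data-shape", type=int, default=512)
    # SageMaker sets these variables inside the training container;
    # the fallbacks are assumed local paths
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAIN", "data/train"))
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "model"))
    # Ignore any extra flags the platform may add
    args, _unknown = parser.parse_known_args(argv)
    return args


args = parse_args(["--epochs", "1", "--data-shape", "512"])
print(args.epochs, args.data_shape, args.train, args.model_dir)
```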

In [None]:
from sagemaker.mxnet import MXNet

ssd_estimator = MXNet(entry_point='ssd_entry_point.py',
                      role=role,
                      output_path=s3_training_output_location,
                      code_location=s3_custom_code_upload_location,
                      instance_count=1,
                      instance_type='ml.p3.2xlarge',
                      framework_version='1.8.0',
                      py_version='py37',
                      distribution={'parameter_server': {'enabled': True}},
                      hyperparameters={'epochs': 1,
                                       'data-shape': 512,
                                      }
                     )

In [None]:
ssd_estimator.fit({'train': s3_training_data_location})

## Compile the trained model using SageMaker Neo

After training, we can use SageMaker Neo's ``compile_model()`` API to compile the trained model. The caller must provide the correct input shapes required by the model for compilation to succeed. We also specify the target instance family, our IAM execution role, the S3 location where the compiled model will be stored, and set the ``MMS_DEFAULT_RESPONSE_TIMEOUT`` environment variable to 500.

For this example, we will choose `ml_p3` as the target instance family while compiling the trained model. 

In [None]:
%%time
compiled_model = ssd_estimator.compile_model(target_instance_family='ml_p3', 
                                             input_shape={'data':[1, 3, 512, 512]},
                                             role=role,
                                             output_path=s3_compilation_output_location,
                                             framework='mxnet', 
                                             framework_version='1.8',
                                             env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'})
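
The spatial dimensions in `input_shape` appear to correspond to the `data-shape` hyperparameter used for training (512) and to the 512x512 resize applied to the test image. One way to keep them in sync is to derive the dict from a single value. This is a sketch under that assumption; `neo_input_shape` is a hypothetical helper, not part of the SageMaker SDK.

```python
def neo_input_shape(data_shape, batch_size=1, channels=3, input_name="data"):
    """Build an NCHW input-shape dict for a square image input."""
    if data_shape <= 0:
        raise ValueError("data_shape must be positive")
    return {input_name: [batch_size, channels, data_shape, data_shape]}


# Same values as the estimator's hyperparameters in this notebook
hyperparameters = {"epochs": 1, "data-shape": 512}
input_shape = neo_input_shape(hyperparameters["data-shape"])
print(input_shape)
```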

## Deploy the compiled model and request Inferences

We have to deploy the compiled model on an instance from the family it was compiled for. Since we compiled for `ml_p3`, we can deploy to any `ml.p3` instance type. For this example we will choose `ml.p3.2xlarge`.

In [None]:
%%time
neo_object_detector = compiled_model.deploy(initial_instance_count = 1, instance_type = 'ml.p3.2xlarge')
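
Because the deployment instance must belong to the family the model was compiled for, a quick check before calling `deploy()` can catch mismatches early. The sketch below assumes the naming pattern seen in this notebook (`ml.p3.2xlarge` belongs to `ml_p3`). Both helper names are hypothetical, and not every instance family is necessarily a supported Neo target.

```python
def target_family(instance_type):
    """Map an instance type such as 'ml.p3.2xlarge' to a family string such as 'ml_p3'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError("unexpected instance type: {!r}".format(instance_type))
    return "{}_{}".format(parts[0], parts[1])


def matches_target(instance_type, target_instance_family):
    """Return True if the instance type belongs to the compiled target family."""
    return target_family(instance_type) == target_instance_family


print(target_family("ml.p3.2xlarge"))
print(matches_target("ml.c5.xlarge", "ml_p3"))
```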

In [None]:
%%time
response = neo_object_detector.predict(test_image)

In [None]:
# Visualize the detections.
visualize_detection(test_file, response, object_categories, threshold)

## Delete the Endpoint
A running endpoint incurs costs, so as an optional clean-up step you can delete it.

In [None]:
print("Endpoint name: " + neo_object_detector.endpoint_name)
neo_object_detector.delete_endpoint()
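
If you want the endpoint to be removed even when an inference cell fails, the clean-up can be wrapped in a context manager. This is a sketch of the pattern only: `endpoint_cleanup` is a hypothetical helper, and `FakePredictor` is a stand-in used for demonstration, not a SageMaker class. The sketch assumes the real predictor exposes `delete_endpoint()` as used above.

```python
from contextlib import contextmanager


@contextmanager
def endpoint_cleanup(predictor):
    """Yield the predictor and always call delete_endpoint() afterwards."""
    try:
        yield predictor
    finally:
        predictor.delete_endpoint()


class FakePredictor:
    """Stand-in used only to demonstrate the pattern; not a SageMaker class."""

    def __init__(self):
        self.deleted = False

    def predict(self, data):
        return data

    def delete_endpoint(self):
        self.deleted = True


fake = FakePredictor()
with endpoint_cleanup(fake) as p:
    echoed = p.predict([1, 2, 3])
print(fake.deleted)
```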