  ![alt text](https://gluon-cv.mxnet.io/_static/gluon-logo.svg "Gluon Logo")
  
  #  GluonCV: a Deep Learning Toolkit for Computer Vision

GluonCV provides implementations of state-of-the-art (SOTA) deep learning algorithms in computer vision. It aims to help engineers, researchers, and students quickly prototype products, validate new ideas and learn computer vision.

GluonCV features:

   * training scripts that reproduce SOTA results reported in latest papers,

   * a large set of pre-trained models,

   * carefully designed APIs and easy to understand implementations,

   * community support.

   
![alt text](images/gluoncv.png "Gluoncv Applications")

This notebook focuses on training a <b>custom object detection model</b> using the <b>SSD</b> network
(custom means we are not using a pre-trained model trained on a dataset such as ImageNet).

To use the GluonCV library we must first install it, which involves updating the installed version of mxnet.
We will also update some paths to the CUDA libraries, a dependency of GluonCV.

In [None]:
import boto3                            # AWS Python framework                            
import os                               # OS library to access file paths
from gluoncv import data, utils         # gluoncv data and utils modules to create datasets
from gluoncv.data import VOCDetection   # VOCDetection allows gluoncv to recognize our bounding boxes and classes
from gluoncv.utils import viz           # gluoncv specific visualization capabilities
from matplotlib import pyplot as plt    # visualization capabilities (to view dataset samples)


   We are going to use a PASCAL VOC formatted dataset for this model.  We will briefly cover the formatting of a VOC Dataset in this notebook, but for more information about PASCAL VOC, visit: https://gluon-cv.mxnet.io/build/examples_datasets/pascal_voc.html#sphx-glr-build-examples-datasets-pascal-voc-py

Our dataset has <b>52 classes</b> - corresponding to the 52 different cards in a deck of playing cards (minus the Jokers)
![alt text](images/playingcards.png "Deck of cards")

The class names are abbreviated as the rank (first letter or number) followed by the first letter of the suit:

    2C = Two of Clubs
    AS = Ace of Spades
    ...
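
The 52 class names follow a regular pattern, so they can be generated rather than typed by hand. The sketch below assumes the lowercase spelling and the clubs, diamonds, hearts, spades ordering used in the class list later in this notebook; the variable names are illustrative only.

```python
# Ranks in the order ace, 2-10, jack, queen, king
ranks = ["a"] + [str(n) for n in range(2, 11)] + ["j", "q", "k"]
# Suits: clubs, diamonds, hearts, spades
suits = ["c", "d", "h", "s"]

# Group by suit first so the order is ac, 2c, ..., kc, ad, ..., ks
card_classes = [rank + suit for suit in suits for rank in ranks]

print(len(card_classes))
print(card_classes[:3], card_classes[-3:])
```

Generating the list this way avoids typos and keeps the class order (and therefore the class IDs) consistent wherever the list is needed.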

Later in this notebook we will create a class derived from the VOCDetection class for our custom dataset.

### Move sample data from S3 to our notebook instance
We have a small sample dataset that we will evaluate in this notebook.

In [None]:
!aws s3 cp --recursive s3://gluoncv-notebook .

In [None]:
# Let us take a look at the directory structure of a typical PASCAL VOC dataset:
!tree VOC

For this example, we will use the images and annotations in "VOCvalidate" as our validation dataset and those in "VOCtrain" as our training dataset. The folders could be given any meaningful name (subject to the note below); since we have only one training and one validation dataset, we keep the names simple. As we will see shortly, however, these folders are not synonymous with datasets, so we will refer to them as VOC ImageSet folders.

**Note:** The name of each VOC ImageSet folder must begin with the letters VOC, or the VOCDetection class that will be introduced later will not recognize the folder.

Within each VOC ImageSet folder there are 3 child folders:

    Annotations
    ImageSets
    JPEGImages
    
The <b>Annotations</b> folder holds .xml files. Each XML file contains the bounding box information for one image file in the dataset. An image may contain multiple objects, but there is a 1:1 relationship between annotation files and image files. Let's look at one of the annotation files:

In [None]:
!cat VOC/VOCvalidate/Annotations/aug3_046386182.xml

In the above annotation file, there are 4 object nodes. The image (which we will see soon) contains two playing cards. Each playing card shows its rank and suit in two locations, and in this image all 4 locations are visible.

Each object node contains the class name (QC, 8H) as well as the bounding box for the rank and suit. It is important to note that this model ONLY detects the rank and suit of playing cards, not the entire card or the number of suit icons on the card.
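
To make the structure of an object node concrete, here is a minimal sketch that reads class names and bounding boxes with Python's standard XML parser. The sample XML is hypothetical (made-up file name and coordinates, trimmed to the fields used here) and `read_objects` is an illustrative helper, not part of GluonCV.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation with made-up values, trimmed to the fields used here
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>720</width><height>720</height><depth>3</depth></size>
  <object>
    <name>qc</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>140</xmax><ymax>200</ymax></bndbox>
  </object>
  <object>
    <name>8h</name>
    <bndbox><xmin>300</xmin><ymin>310</ymin><xmax>345</xmax><ymax>395</ymax></bndbox>
  </object>
</annotation>
"""


def read_objects(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip().lower()
        box = obj.find("bndbox")
        coords = [int(float(box.find(tag).text))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        objects.append((name, *coords))
    return objects


voc_objects = read_objects(SAMPLE_XML)
for item in voc_objects:
    print(item)
```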

The second child folder is ImageSets. This folder contains text files that list the images you wish to include in a particular dataset. Let's take a look:

In [None]:
!cat VOC/VOCvalidate/ImageSets/Main/val.txt

When working with a VOC dataset for object detection, we use a child folder called "Main" within the ImageSets folder. If we were using gluoncv for other tasks such as Action/Event, Pose Detection or Segmentation, we would create additional folders alongside Main, named for the type of model.

Within the Main folder is a file called val.txt. It defines which images belong to the dataset by listing each image name, one per line. In this example, we include only 5 files. Note that the file extension is absent: for each entry in val.txt, the PASCAL VOC format expects an annotation file of the same name in the Annotations directory ending in .xml, and an image file of the same name in the JPEGImages directory ending in .jpg.

This structure allows you to store n files in the annotations and JPEGImages directories, and then customize an ImageSet listing file to only train/validate on selected annotation and JPEGImage files.  This gives you the capability to store as many images as you wish in a single directory, but create separate datasets by creating multiple ImageSet files.
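
The relationship between an ImageSet listing and the files it selects can be sketched with the standard library alone. The helpers `resolve_imageset` and `missing_files` below are hypothetical names introduced for illustration, and the demo runs on a throwaway folder with made-up file names rather than on the notebook's dataset.

```python
import tempfile
from pathlib import Path


def resolve_imageset(voc_folder, split):
    """Map each entry in ImageSets/Main/<split>.txt to its annotation and image paths."""
    voc_folder = Path(voc_folder)
    listing = voc_folder / "ImageSets" / "Main" / (split + ".txt")
    pairs = []
    for line in listing.read_text().splitlines():
        stem = line.strip()
        if not stem:
            continue  # skip blank lines
        xml_path = voc_folder / "Annotations" / (stem + ".xml")
        jpg_path = voc_folder / "JPEGImages" / (stem + ".jpg")
        pairs.append((stem, xml_path, jpg_path))
    return pairs


def missing_files(pairs):
    """Return the entries whose annotation or image file does not exist."""
    return [stem for stem, xml_path, jpg_path in pairs
            if not (xml_path.exists() and jpg_path.exists())]


# Demo on a throwaway folder with hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "VOCdemo"
    for sub in ("Annotations", "JPEGImages", "ImageSets/Main"):
        (demo / sub).mkdir(parents=True)
    for stem in ("img_a", "img_b", "img_c"):
        (demo / "Annotations" / (stem + ".xml")).write_text("<annotation/>")
        (demo / "JPEGImages" / (stem + ".jpg")).write_bytes(b"")
    # Only two of the three stored images belong to this dataset
    (demo / "ImageSets" / "Main" / "val.txt").write_text("img_a\nimg_c\n")

    demo_pairs = resolve_imageset(demo, "val")
    demo_stems = [stem for stem, _, _ in demo_pairs]
    demo_missing = missing_files(demo_pairs)
    print(demo_stems, demo_missing)
```

A check like this is a cheap way to catch a listing entry that has no matching annotation or image before a long training job starts.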

The final directory is the JPEGImages folder which, as stated above, contains the image corresponding to each annotation file. We will use the GluonCV API to explore these files further; first, however, we need to introduce gluoncv to our class structure.

In [None]:
class VOCLike(VOCDetection):
    CLASSES = ["ac", "2c", "3c", "4c", "5c", "6c", "7c", "8c", "9c", "10c", "jc", "qc", "kc", "ad", "2d", "3d", "4d", "5d", "6d", "7d", "8d", "9d", "10d", "jd", "qd", "kd", "ah", "2h", "3h", "4h", "5h", "6h", "7h", "8h", "9h", "10h", "jh", "qh", "kh", "as", "2s", "3s", "4s", "5s", "6s", "7s", "8s", "9s", "10s", "js", "qs", "ks"]
    def __init__(self, root, splits, transform=None, index_map=None, preload_label=True):
        super(VOCLike, self).__init__(root, splits, transform, index_map, preload_label)

We will also create a list containing our class names for later use.

In [None]:
my_classes = ["ac", "2c", "3c", "4c", "5c", "6c", "7c", "8c", "9c", "10c", "jc", "qc", "kc", "ad", "2d", "3d", "4d", "5d",
           "6d", "7d", "8d", "9d", "10d", "jd", "qd", "kd", "ah", "2h", "3h", "4h", "5h", "6h", "7h", "8h", "9h", "10h",
           "jh", "qh", "kh", "as", "2s", "3s", "4s", "5s", "6s", "7s", "8s", "9s", "10s", "js", "qs", "ks"]

In [None]:
from gluoncv.utils.metrics.voc_detection import VOC07MApMetric

# Use our newly created class to generate a reference to the training data
train_dataset = VOCLike(root='VOCTemplate', splits=(('VOCTrain', 'train'),))
    
# Use our newly created class to generate a reference to the validation data
val_dataset = VOCLike(root='VOCTemplate', splits=(('VOCValid', 'valid'),))

# This metric will be introduced later prior to training
val_metric = VOC07MApMetric(iou_thresh=0.5, class_names=val_dataset.classes)

print('Training images:', len(train_dataset))
print('Validation images:', len(val_dataset))

Even though our label data (annotation + class name) is currently stored in a separate file from our image, we can use the GluonCV library to read an image-label pair:

In [None]:
# Get a training image and corresponding label
train_image, train_label = train_dataset[24000]


# The train_image is an mxnet.ndarray that should be a 720 x 720 RGB image
print("train_image shape:{}".format(train_image.shape))


In [None]:
# Here we will take a moment to pay special attention to the shape of the train_label

# the label is a numpy array 
print("train_label type: {}".format(type(train_label)))

# the array has n elements - 1 element for each object in the train image
print("train_label shape: {}".format(train_label.shape))


In [None]:
# Let's look at the entire array:
print(train_label)

Each element (row) of the array describes one object:

   * the first 4 positions are the bounding box (xmin, ymin, xmax, ymax),

   * the 5th position is the class ID,

   * the 6th position is the "difficult" flag read from the annotation file.

![alt text](images/label_shape.png "Label shape")

In [None]:
# Let's view this image with bounding boxes and classes
bboxes = train_label[:, :4]   # All rows (:), first four columns (:4): the bounding boxes
cids = train_label[:, 4:5]    # All rows, column index 4 (the 5th position): the class ID
print('image:', train_image.shape)
print('bboxes:', bboxes.shape, 'class ids:', cids.shape)
ax = viz.plot_bbox(train_image.asnumpy(), bboxes, labels=cids, class_names=train_dataset.classes)
plt.show()

## Training

In [None]:
import sagemaker
from sagemaker.mxnet import MXNet
from sagemaker.mxnet import MXNetModel
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()
role = get_execution_role()

bucket = sagemaker_session.default_bucket()
model_artifacts_location = 's3://{}'.format(bucket)


In [None]:
# Prepare for training

git_config = {'repo': 'https://github.com/Christopheraburns/cv-at-edge.git',
              'branch': 'master'}


estimator = MXNet(entry_point="train_ssd-playing-cards.py",
          role=role,
          git_config=git_config,
          output_path=model_artifacts_location,
          checkpoint_s3_uri=model_artifacts_location, 
          train_instance_count=1,
          train_instance_type="ml.p3.16xlarge",
          framework_version="1.6.0",
          py_version="py3",
          train_max_run=172800)


In [None]:
# Train
estimator.fit("s3://gluoncv-training/VOC-PlayingCards")

#  Model minimization with SageMaker Neo

Neo enables machine learning models to train once and run anywhere in the cloud and at the edge.

Neo consists of a <b>compiler</b> and a <b>runtime</b>.

The runtime is known as the DLR (Deep Learning Runtime). It can be found here: https://github.com/neo-ai/neo-ai-dlr but is already installed on SageMaker hosting instances.

The compiler can be accessed through the CLI, from the SageMaker console, or from the SageMaker SDK via a notebook.
We will use the last of these methods.

Before we get to the model compilation, however, let's look at the components of a trained model in MXNet:

model.params
model-symbol.json

The params file, as its extension implies, stores the <i>parameters</i> of a trained model. However, it does not contain the model architecture.

The symbol.json file contains the hybridized model architecture.

Gluon makes it possible to save a trained model's parameters without its architecture. This is necessary for dynamic (non-hybridized) models, whose architecture cannot be saved because it can change during execution.

If we refer to the Neo documentation we see that the Neo compiler requires files specific to the framework you are using:

<b>TensorFlow</b>

     Neo supports saved models and frozen models.
     For saved models, Neo expects one .pb or one .pbtxt file and a variables directory that contains variables.
     For frozen models, Neo expects only one .pb or .pbtxt file.
     
<b>Keras</b>

    Neo expects one .h5 file containing the model definition.

<b>PyTorch</b>

    Neo expects one .pth file containing the model definition.
    
<b> MXNet </b>

    Neo expects one symbol file (.json) and one parameter file (.params).
    
In the next cell we will walk through, step-by-step, the process to export the <i>model architecture</i> from our freshly trained MXNet parameters file.

In [None]:
# Let's copy our .params file to local disk to work with it
s3 = boto3.client('s3')
s3.download_file(bucket, 'ssd_512_mobilenet1.0_custom_best.params', 'ssd_512_mobilenet1.0_custom_best.params')

#### NOTE ####

# 'ssd_512_mobilenet1.0_custom_best.params' comes from our train_ssd-playing-cards.py script.

##############

# Because we obtained this model's network from the GluonCV model zoo, loading this .params file back into the
# MXNet framework is a two-step process:
from gluoncv.model_zoo import get_model
import mxnet as mx

# First we get an instance of the network from the model zoo with the model_zoo get_model function
my_model = get_model('ssd_512_mobilenet1.0_custom', pretrained=False, classes=my_classes, ctx=mx.gpu(0))


Let us examine the parameters passed to the get_model function in the above cell:

<b> pretrained=False </b>

    Since we specified the _custom network, setting this to True would have unintended consequences.
    If we had chosen ..._voc or ..._coco as the network and set this to True, we would get a pretrained model.

<b> classes=my_classes </b>

    This is a custom model trained on our 52 classes. So here we provide our classes to the network
    
<b> ctx=mx.gpu(0) </b>

    ctx represents the ConTeXt (device) that the model runs in. This can be cpu or gpu. This context must match the
    context we used during training.

In the next cell we will pass our <b>trained parameter file</b> into the my_model object to apply our parameters to the model zoo network.

In [None]:
my_model.load_parameters('ssd_512_mobilenet1.0_custom_best.params', ctx=mx.gpu(0))

While the my_model object is now suitable for running inference within the MXNet framework, it is not yet ready for compilation in SageMaker Neo. Let's export the model architecture to a ...-symbol.json file.

In [None]:
# Convert the model to symbolic format
my_model.hybridize()

# Run one forward pass with a dummy tensor (of ones) of the correct shape so the hybridized graph is built before export
my_model(mx.nd.ones((1, 3, 512, 512)).as_in_context(mx.gpu(0)))

# Export the model architecture
my_model.export('my_model')

The final line of the above cell creates the two files we need for Neo.

The string value we passed to the export function is just the filename prefix.

The files will be named:

    my_model-0000.params
    my_model-symbol.json
    
We can now compress these two files into the familiar model.tar.gz file (familiar if you have some experience with SageMaker)

In [None]:
import tarfile

params = 'my_model-0000.params'
symbols = 'my_model-symbol.json'
tfile = "model.tar.gz"


# The context manager closes the archive even if adding a file fails
with tarfile.open(tfile, "w:gz") as tar:
    tar.add(params)
    tar.add(symbols)

We are now ready to send our model to Neo to be recompiled.  
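
Before uploading, it can be worth confirming that the archive holds exactly one symbol file and one parameter file, as Neo expects for MXNet. The following is a minimal sketch using only the standard library; `check_mxnet_archive` is a hypothetical helper, and the demo builds its own archive from empty placeholder files rather than reading the real export.

```python
import tarfile
import tempfile
from pathlib import Path


def check_mxnet_archive(archive_path):
    """Return (symbol_files, params_files) found in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
    symbol_files = [n for n in names if n.endswith(".json")]
    params_files = [n for n in names if n.endswith(".params")]
    return symbol_files, params_files


# Demo with empty placeholder files standing in for a real export
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    member_names = ("my_model-symbol.json", "my_model-0000.params")
    for name in member_names:
        (tmp / name).write_bytes(b"")
    archive = tmp / "model.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for name in member_names:
            # arcname keeps the files at the top level of the archive
            tar.add(tmp / name, arcname=name)

    symbol_files, params_files = check_mxnet_archive(archive)
    archive_ok = len(symbol_files) == 1 and len(params_files) == 1
    print(symbol_files, params_files, archive_ok)
```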

Let's take a look at the data Neo requires to compile a model. We will use a screenshot from the SageMaker console.

![alt text](images/NEO-compile.png "Neo")


Fairly straightforward, but for the sake of completeness we will go through each item

<b>job name </b>

    A unique name to give the compilation job.  This will be visible in the SageMaker console under Compilation jobs.

    
<b> IAM Role </b>
    
    A role with sufficient permission to access SageMaker Neo.  
  

<b> Location of Model artifacts </b>

    We outlined the Neo compiler required files earlier.  Here we will enter the location of the model.tar.gz file
    we just created.  
    
<b> Data input Configuration </b>

    This is the shape of an observation. It must be in NCHW format and wrapped.
    N = Number of observations
    C = Number of Channels
    H = Height of the observation
    W = Width of the observation
    
    Thus, our input configuration will be [1, 3, 512, 512]
    and then wrapped it will become {"data": [1, 3, 512, 512]}
    
<b> Machine learning Framework </b>

    The framework our model was trained in.  In this example, it is MXNET
    
<b> S3 Output Location </b>

    The location where the compiled model will be placed upon completion
    
<b> Target Device </b>

    You must tell Neo the type of device you will be deploying the model to.  For this example, we will use Jetson Xavier.
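
The wrapping described under Data input Configuration can be produced programmatically instead of being written out as a string literal. This is a small sketch under the assumption that the model has a single input named "data", as in this notebook; `data_input_config` is a hypothetical helper name.

```python
import json


def data_input_config(shape, input_name="data"):
    """Wrap an NCHW shape as the JSON string used for the data input configuration."""
    if len(shape) != 4:
        raise ValueError("expected 4 dimensions (N, C, H, W), got {}".format(len(shape)))
    return json.dumps({input_name: [int(d) for d in shape]})


# One observation, 3 channels, 512 x 512 pixels
config = data_input_config((1, 3, 512, 512))
print(config)
```

Building the string with json.dumps guarantees valid JSON, which is easy to get wrong when the value is typed by hand with mixed quote styles.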

In [None]:
import time
# Since we only preserved the params object from our SageMaker training session, our estimator object does not have
# knowledge of our newly exported model.
model_key = 'model.tar.gz'

# Let's move our model to persistent storage (S3) so that the compilation job can read it.
s3 = boto3.client('s3')
response = s3.upload_file(model_key, bucket, model_key)

model_path = 's3://{}/{}'.format(bucket, model_key)
compilation_job_name = 'jetson-mxnet-16-gluoncv-070'
sm_client = boto3.client('sagemaker')
data_shape = '{"data":[1,3,512,512]}'
target_device = 'jetson_xavier'
framework = 'MXNET'
framework_version = '1.6.0'
compiled_model_path = 's3://{}/neo-output'.format(bucket)

response = sm_client.create_compilation_job(
    CompilationJobName=compilation_job_name,
    RoleArn=role,
    InputConfig={
        'S3Uri': model_path,
        'DataInputConfig': data_shape,
        'Framework': framework
    },
    OutputConfig={
        'S3OutputLocation': compiled_model_path,
        'TargetDevice': target_device
    },
    StoppingCondition={
        'MaxRuntimeInSeconds': 300
    }
)
print(response)

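# Poll every 30 sec until the job reaches a terminal state
while True:
    response = sm_client.describe_compilation_job(CompilationJobName=compilation_job_name)
    status = response['CompilationJobStatus']
    if status == 'COMPLETED':
        break
    elif status in ('FAILED', 'STOPPED'):
        # Surface the reason reported by the service, if any
        raise RuntimeError('Compilation {}: {}'.format(status.lower(), response.get('FailureReason', 'no reason given')))
    print('Compiling ...')
    time.sleep(30)
print('Done!')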
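
The polling pattern above can also be written as a reusable helper with an upper bound on the total wait. The sketch below is an assumption-laden illustration, not part of boto3 or the SageMaker SDK: `wait_for_status` is a hypothetical function, and the demo feeds it a fake sequence of statuses so that no AWS calls are made.

```python
import time


def wait_for_status(get_status, done=("COMPLETED",), failed=("FAILED", "STOPPED"),
                    interval=30, timeout=900, sleep=time.sleep):
    """Call get_status() until it returns a terminal state or the timeout elapses."""
    waited = 0
    while True:
        status = get_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError("job ended with status {}".format(status))
        if waited >= timeout:
            raise TimeoutError("gave up after {} seconds".format(waited))
        sleep(interval)
        waited += interval


# Demo with a fake status source; the sleep function is replaced so the demo returns immediately
fake_statuses = iter(["STARTING", "INPROGRESS", "INPROGRESS", "COMPLETED"])
final_status = wait_for_status(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final_status)
```

In this notebook, the status source would be a function that calls sm_client.describe_compilation_job with the job name and returns the 'CompilationJobStatus' field of the response.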