#  How to create a Custom Object Detector using GluonCV and deploy it to Panorama

**About this Notebook** :

* Fine-tuning is a commonly used approach for transferring a previously trained model to a new dataset.
* It is especially useful if the new target dataset is relatively small.
* Fine-tuning from pre-trained models can help reduce the risk of overfitting.
* A fine-tuned model may also generalize better if the previously used dataset is in a domain similar to that of the new dataset.
* This tutorial shows an approach to fine-tuning the object detection models provided by GluonCV.
* This notebook also shows fine-tuning with both local training and Amazon SageMaker managed training.

**Goal of this Notebook** :

* Aid a Panorama developer in creating a custom Object Detector using GluonCV
* Once the model is created, export the model parameters
* Use the exported parameters to then deploy the model
* Create a sample Lambda function that uses the model

**What this Notebook accomplishes** :

* Shows how to use a customized Pikachu dataset and illustrates the fine-tuning fundamentals step by step.
* Walks through the steps to modify a model to fit your own object detection projects.
* This is an adaptation of the following GluonCV example: [Link](https://gluon-cv.mxnet.io/build/examples_detection/finetune_detection.html)

**Useful Resources to aid your development**:
* [AWS Panorama Documentation](https://docs.aws.amazon.com/panorama/)
* [Create Your Own COCO Dataset](https://gluon-cv.mxnet.io/build/examples_datasets/mscoco.html#sphx-glr-build-examples-datasets-mscoco-py)
* [Create Your Own VOC Dataset](https://gluon-cv.mxnet.io/build/examples_datasets/pascal_voc.html#sphx-glr-build-examples-datasets-pascal-voc-py)
* [Prepare Custom Datasets for Object Detection](https://gluon-cv.mxnet.io/build/examples_datasets/detection_custom.html#sphx-glr-build-examples-datasets-detection-custom-py)


### Imports & config
----------------
This notebook was tested with ***GluonCV 0.8.0***. If you are running this notebook on an Amazon SageMaker Notebook Instance, use the `conda_mxnet_p36` kernel (or a similar, more recent version), which comes with MXNet pre-installed (you may still need to install GluonCV).

*If you are not running on a SageMaker Notebook Instance, also install MXNet with `pip install mxnet` and restart the kernel (`Kernel->Restart` in the menu). This notebook was last tested with* ***MXNet 1.6.0***.

In [None]:
!pip install gluoncv==0.8.0

In [None]:
%matplotlib inline

In [None]:
import os
import time
import shutil
from matplotlib import pyplot as plt
import tarfile
import json
import inspect
import numpy as np

from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

import mxnet as mx
from mxnet.io import DataBatch
import gluoncv as gcv
from gluoncv.utils import download, viz
from IPython.display import display, HTML

import boto3
import sagemaker
from sagemaker.mxnet import MXNet

import training.train

print(f'Using MXNet {mx.__version__}')
print(f'Using GluonCV {gcv.__version__}')

In [None]:
# Set this variable/constant value to be the full name of your bucket, for example "aws-panorama-example-xyz"
BUCKET = 'aws-panorama-<your-bucket-name-suffix>'  # Bucket name must contain "aws-panorama"

MODELS_S3_PREFIX = 'models'
models_s3_url = f's3://{BUCKET}/{MODELS_S3_PREFIX}'

DATA_S3_PREFIX = 'data/pikachu'

LAMBDA = 'PikachuDetection'
LAMBDA_DIR = '../Lambda'

LAMBDA_EXECUTION_ROLE_NAME = 'PanoramaPikachuLambdaExecutionRole'
lambda_file = f'{LAMBDA}.py'
lambda_archive = lambda_file.replace('.py', '.zip')

THRESHOLD = 0.5

classes = training.train.CLASSES
input_size = training.train.INPUT_SIZE

DATA_DIR = 'data'
DATA_REC_FILES = ['train.rec']
DATA_BASE_URL = 'https://apache-mxnet.s3-accelerate.amazonaws.com/gluon/dataset/pikachu'

TRAIN_DIR = 'training'
TRAIN_FILE = 'train.py'
train_file_path = os.path.join(TRAIN_DIR, TRAIN_FILE)

ctx = training.train.get_ctx()

In [None]:
ss = sagemaker.session.Session()
region = ss.boto_region_name
sm = boto3.client('sagemaker', region_name=region)
s3 = boto3.client('s3')
iam = boto3.client("iam")
lm = boto3.client("lambda")

In [None]:
def show_code(func):
    code = inspect.getsource(func)
    formatter = HtmlFormatter(cssclass='pygments')
    html_code = highlight(code, PythonLexer(), formatter)  # highlight the function passed in, not a fixed one
    css = formatter.get_style_defs('.pygments')
    template = '<style>{}</style>{}'
    html = template.format(css, html_code)
    display(HTML(html))

### Pikachu Dataset
----------------
We start with a Pikachu dataset generated by rendering 3D models onto random real-world scenes.

Refer to the [custom detection dataset tutorial](https://gluon-cv.mxnet.io/build/examples_datasets/detection_custom.html#sphx-glr-build-examples-datasets-detection-custom-py) to learn how to create your own datasets.

In [None]:
if not os.path.exists(DATA_DIR):
    os.mkdir(DATA_DIR)

for f in [rf.replace('.rec', ext) for rf in DATA_REC_FILES for ext in ('.rec', '.idx')]:
    download(os.path.join(DATA_BASE_URL, f), path=os.path.join(DATA_DIR, f), overwrite=False)

##### Load the dataset and show some samples

In [None]:
dataset = gcv.data.RecordFileDetection(os.path.join(DATA_DIR, DATA_REC_FILES[0]))
samples = [0, 1]
_, ax = plt.subplots(len(samples), figsize=(15, 15))
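# Iterate over the chosen sample indices; each sample gets its own subplot
for ax_idx, sample_idx in enumerate(samples):
    image, label = dataset[sample_idx]
    # Columns 0-3 of the label hold the bounding box, column 4 holds the class id
    viz.plot_bbox(image, bboxes=label[:, :4], labels=label[:, 4:5], class_names=classes, ax=ax[ax_idx])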
plt.show()


### Training code
-------------------

***This notebook shows two ways of training the model: locally and on Amazon SageMaker. Both use the same training code, which lives in `training/train.py` and is not replicated in this notebook.***

Here is the full training code:

In [None]:
!pygmentize $train_file_path


### Pre-trained models
-------------------

Instead of building and training a model from scratch, let's build one by finetuning a pre-trained model. There are multiple choices in the [GluonCV Model Zoo](https://cv.gluon.ai/model_zoo/detection.html). A fast SSD network with MobileNet1.0 backbone was selected for this sample.

This is the part of the training code responsible for loading and configuring the pre-trained model to work with classes we are interested in.

In [None]:
show_code(training.train.get_net)

##### Let's load our chosen model and inspect it

In [None]:
local_net = training.train.get_net(ctx=ctx)
local_net

*There is also an alternative way for creating custom network with pre-trained weights, shown here for informational purposes only*

If you want to try this way, convert the next cell to Code type and run it to obtain the same model.
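
As a rough illustration of that alternative, the GluonCV model zoo offers `_custom` model variants that can be built directly with your own class list while transferring weights from a pre-trained network. The sketch below assumes the `ssd_512_mobilenet1.0` family and a transfer from the VOC-trained weights; the network actually configured in `training/train.py` may differ, and both helper names are hypothetical.

```python
def custom_model_name(base_name):
    """Return the model-zoo name of the `_custom` variant of a base network."""
    return f"{base_name}_custom"


def build_custom_ssd(classes, base_name="ssd_512_mobilenet1.0", transfer="voc"):
    """Create an SSD network for `classes`, reusing weights trained on `transfer`.

    Assumption: the base network and the transfer dataset are illustrative
    choices, not necessarily the ones used by training/train.py.
    """
    import gluoncv as gcv  # imported lazily so the helpers load without GluonCV

    return gcv.model_zoo.get_model(
        custom_model_name(base_name),
        classes=classes,
        pretrained_base=False,
        transfer=transfer,
    )
```

With GluonCV installed, `net = build_custom_ssd(classes)` should give a network whose output layers already match the number of custom classes.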

### Local training
------------------

The code in the next cell trains the model for a small number of epochs, which should take only a few minutes (for example, training for 2 epochs on a `p2.xlarge` instance can take about 5 minutes). You may try increasing the number of epochs to see if you obtain a better model (or drop it to as low as 1 if you just want to see how it works).

In [None]:
epochs = 2
local_model_dir = 'model'
saved_params_file = 'saved-model.params'
if os.path.exists(local_model_dir):
    shutil.rmtree(local_model_dir)

# Data and model paths should be relative to script running the training
!cd $TRAIN_DIR && python3 $TRAIN_FILE --epochs $epochs --train ../$DATA_DIR \
    --model-dir ../$local_model_dir --save-params --saved-params-file $saved_params_file
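
The command above passes its settings as command-line flags. A minimal sketch of how a training script could accept those same flags is shown below. It is an assumption about the shape of the argument parsing, not a copy of `training/train.py`. The defaults read the `SM_CHANNEL_TRAINING` and `SM_MODEL_DIR` environment variables, which SageMaker sets inside its training containers, so one script can serve both local and managed training.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the flags used by the local training command (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Fine-tune an object detector")
    parser.add_argument("--epochs", type=int, default=1,
                        help="number of passes over the training data")
    # Fall back to SageMaker's environment variables when the flags are omitted
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAINING", "data"),
                        help="directory holding train.rec and train.idx")
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "model"),
                        help="directory where model artifacts are written")
    parser.add_argument("--save-params", action="store_true",
                        help="also save raw parameters for load_parameters")
    parser.add_argument("--saved-params-file", default="saved-model.params",
                        help="file name used when --save-params is set")
    return parser.parse_args(argv)
```

For example, `parse_args(["--epochs", "2", "--save-params"])` yields `epochs == 2` and `save_params == True`, with the remaining values taken from the defaults.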

### Test the fine-tuned model
----------------------------
Use an image not seen by the training process for testing

In [None]:
test_image_source_url = 'https://raw.githubusercontent.com/zackchase/mxnet-the-straight-dope/master/img/pikachu.jpg'
def prepare_test_image(test_image_url=test_image_source_url, input_size=input_size, ctx=None):
    test_image_name = os.path.split(test_image_url)[1]
    if not os.path.exists(test_image_name):
        print(f'Downloading {test_image_url}')
        download(test_image_url, test_image_name)
    img, image = gcv.data.transforms.presets.ssd.load_test(test_image_name, input_size)
    if ctx is not None:
        img = img.as_in_context(ctx)
    return img, image

In [None]:
img, image = prepare_test_image(input_size=input_size, ctx=ctx)

In [None]:
def show(cids, scores, bboxes, image, classes, thresh=THRESHOLD):
    _, ax = plt.subplots(2, figsize=(15, 15))
    ax[0].imshow(image)
    viz.plot_bbox(image, bboxes[0], scores[0], cids[0], class_names=classes, thresh=thresh, ax=ax[1])
    plt.show()

In [None]:
local_net.load_parameters(os.path.join(local_model_dir, saved_params_file), ctx=ctx)
show(*local_net(img), image, classes)
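
The `show` helper relies on `viz.plot_bbox` to hide detections below `THRESHOLD`. If you need the surviving detections as data rather than as a plot, the filtering can be sketched in plain Python as below. The function name and the returned dictionary layout are hypothetical; the sketch assumes per-image sequences of class ids, scores and boxes, where a negative class id marks a padded or suppressed detection.

```python
def filter_detections(class_ids, scores, bboxes, class_names, thresh=0.5):
    """Keep detections with a valid class id and a score of at least `thresh`."""
    results = []
    for cid, score, box in zip(class_ids, scores, bboxes):
        # Negative ids are placeholders, low scores are unreliable detections
        if cid < 0 or score < thresh:
            continue
        results.append({
            "label": class_names[int(cid)],
            "score": float(score),
            "bbox": [float(v) for v in box],  # xmin, ymin, xmax, ymax
        })
    return results
```

Assuming the network outputs have shapes `(1, N, 1)`, `(1, N, 1)` and `(1, N, 4)`, a call could look like `filter_detections(cids[0, :, 0].asnumpy(), scores[0, :, 0].asnumpy(), bboxes[0].asnumpy(), classes, THRESHOLD)`.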

### SageMaker training
------------------

You may skip this section if all you want is to see how the model works on a Panorama device, as you already have a trained model to deploy. However, following this section will give you a good overview of training the same model with SageMaker managed training.

SageMaker managed training takes place outside the instance running this notebook and needs access to the training data. Storing the training data in an S3 bucket is a convenient way of achieving this because Amazon S3 integrates easily with other AWS services.

##### Upload training data to S3 bucket

In [None]:
s3_inputs = ss.upload_data(bucket=BUCKET, path=DATA_DIR, key_prefix=DATA_S3_PREFIX)
print(f'Uploaded training data to {s3_inputs}')

##### Launch a training job

The SageMaker SDK provides resources to facilitate training (and inference) with different machine learning frameworks, including the ability to use your own custom Docker containers for training and inference. The model used here is based on MXNet, and the SageMaker SDK has a dedicated `MXNet` class for training MXNet models with custom training code, which is exactly what we need.

Running the next cell may take a bit longer than the local training because Amazon SageMaker needs to launch additional compute instances to perform the training (*you only pay for the actual training time, not the time it takes to launch those instances*). In return, you get extra benefits, such as the flexibility to choose different framework versions and instance types for training (e.g. you could run this notebook on a small CPU-based instance to minimise development costs and use powerful GPU instances only for training) and the ability to run multiple training jobs in the background while you keep developing.

Change the value of the `wait` variable to `False` if you want to carry on with other tasks while the training job runs in the background. Note that the training job uses a specific version of MXNet, which differs from the version used for local training.

*If you run the training job in non-waiting mode, you will not see the logs here, but you can still find them in Amazon CloudWatch.*

In [None]:
wait = True
estimator = MXNet(TRAIN_FILE,
                  source_dir=TRAIN_DIR,
                  role=sagemaker.get_execution_role(),
                  instance_count=1,
                  instance_type="ml.p2.xlarge",
                  framework_version="1.7.0",
                  py_version="py3",
                  output_path=models_s3_url,
                  hyperparameters={'epochs': 2})
estimator.fit(s3_inputs, wait=wait)

If you started the training job in non-waiting mode (`wait = False`), you can monitor the job's progress in the Amazon SageMaker Console and see the logs in Amazon CloudWatch. Alternatively, you can start a waiter, which blocks the notebook execution until the training job completes or stops.

In [None]:
waiter = sm.get_waiter('training_job_completed_or_stopped')
waiter.wait(TrainingJobName=estimator._current_job_name)

##### Download and unpack the model artifacts 

In [None]:
model_dir = 'sm_model'
job_info = sm.describe_training_job(TrainingJobName=estimator._current_job_name)
job_status = job_info['TrainingJobStatus']
print(f"Job status: {job_status}")
sm_training_ok = job_status.upper() == 'COMPLETED'
if sm_training_ok:
    sm_model_s3_url = job_info['ModelArtifacts']['S3ModelArtifacts']
    print(f'Model S3 location: {sm_model_s3_url}')
    if os.path.exists(model_dir):
        shutil.rmtree(model_dir)
    os.mkdir(model_dir)
    model_key = sm_model_s3_url[len(f's3://{BUCKET}/'):]
    model_archive_file = os.path.split(sm_model_s3_url)[1]
    print(f'Downloading model to {model_dir}')
    s3.download_file(Bucket=BUCKET, Key=model_key, Filename=os.path.join(model_dir, model_archive_file))
    !ls $model_dir
    print('Extracting model artifacts')
    !cd $model_dir && tar xvf $model_archive_file
else:
    # A bare expression inside a branch is not displayed, so print the job details explicitly
    print(job_info)
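
The cell above unpacks the archive with the `tar` shell command. If you would rather stay in Python, for instance to run the same step outside a notebook, the standard `tarfile` module can do the extraction. This is a minimal sketch with a hypothetical helper name; it also rejects archive members that would be written outside the target directory.

```python
import tarfile
from pathlib import Path


def extract_model_archive(archive_path, target_dir):
    """Extract a model archive into `target_dir` and return the member names."""
    target = Path(target_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:*") as archive:
        # Refuse members whose resolved path escapes the target directory
        for member in archive.getmembers():
            destination = (target / member.name).resolve()
            if destination != target and target not in destination.parents:
                raise ValueError(f"Unsafe path in archive: {member.name}")
        archive.extractall(target)
        return sorted(archive.getnames())
```

A call such as `extract_model_archive(os.path.join(model_dir, model_archive_file), model_dir)` would mirror the shell command used in the cell.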

##### Run inference

This shows how to load an MXNet model without direct access to the model's definition class as we did earlier when we loaded saved parameters into the existing instance of the model class (`local_net`).

In [None]:
def get_net_from_checkpoint(checkpoint_prefix, ctx):
    """Load the model weights from checkpoint (e.g. created by `net.export`) as opposed to loading parameters
    from file created by `net.save_parameters`
    https://github.com/awslabs/multi-model-server/tree/master/examples/ssd
    """
    print(f'Loading model from {checkpoint_prefix}')
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix=checkpoint_prefix, epoch=0)

    # Create an inference-only module (no labels), then bind it to a single-image input batch.
    mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
    print(f'Model input name: {mod.data_names}')
    mod.bind(data_shapes=[(mod.data_names[0], (1, 3, input_size, input_size))])
    mod.set_params(arg_params, aux_params)
    print(f'Model input shapes: {mod.data_shapes}')
    print(f'Model output shapes: {mod.output_shapes}')
    return mod
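
`mx.model.load_checkpoint` looks for two files derived from the prefix and the epoch number: `<prefix>-symbol.json` and `<prefix>-<epoch as 4 digits>.params`. When loading fails, it helps to check which of the two is missing. The helper names in this sketch are hypothetical.

```python
import os


def checkpoint_files(prefix, epoch=0):
    """Return the symbol and parameter file names expected for a checkpoint."""
    return f"{prefix}-symbol.json", f"{prefix}-{epoch:04d}.params"


def missing_checkpoint_files(prefix, epoch=0):
    """Return the expected checkpoint files that do not exist on disk."""
    return [path for path in checkpoint_files(prefix, epoch) if not os.path.isfile(path)]
```

For the prefix used in this notebook, `checkpoint_files('sm_model/exported-model')` returns `('sm_model/exported-model-symbol.json', 'sm_model/exported-model-0000.params')`.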

In [None]:
sm_net = get_net_from_checkpoint(os.path.join(model_dir, 'exported-model'), training.train.get_ctx())
img, image = prepare_test_image(ctx=ctx)
sm_net.forward(DataBatch([img]))
sm_output = sm_net.get_outputs()
show(*sm_output, image, classes)

### Prepare the model
-----------------------

There is nothing to do here if you want to use the model trained in Amazon SageMaker: the model is already in S3, and its exact location is stored in the `sm_model_s3_url` variable.

Take note of the model's S3 location; you will need it when deploying the application to the Panorama appliance.

*Here is a quick way to check the file on S3 if you have AWS CLI tools installed*

In [None]:
print(f'Model trained in SageMaker stored in {sm_model_s3_url}')
!aws s3 ls $sm_model_s3_url --human-readable
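
Earlier, the notebook derived the object key by slicing the bucket prefix off the model URL. A slightly more general way to split any `s3://` URL into its bucket and key uses the standard library's URL parser. This is a small sketch with a hypothetical function name; the job name in the example is made up.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split an s3://bucket/key URL into a (bucket, key) tuple."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URL: {url}")
    # The path starts with "/", which is not part of the object key
    return parsed.netloc, parsed.path.lstrip("/")


# Example with a hypothetical training job name
bucket, key = split_s3_url("s3://aws-panorama-example-xyz/models/my-job/output/model.tar.gz")
```

The resulting `bucket` and `key` can be passed directly to `s3.download_file(Bucket=bucket, Key=key, Filename=...)`.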

If you want to use a locally trained model, convert the following cells to Code type and run them to pack the model and upload it to the S3 bucket.

----------------------------
### Upload, Create and Publish Lambda Function
----------------------------

This Python snippet uses boto3 to create an IAM role, named by `LAMBDA_EXECUTION_ROLE_NAME` (`PanoramaPikachuLambdaExecutionRole`), with a trust policy that lets AWS Lambda assume it.

In [None]:
role_policy_document = {
    "Version": "2012-10-17",
    "Statement":[
        {
            "Effect": "Allow",
            "Principal": {"Service": ["lambda.amazonaws.com", "events.amazonaws.com"]},
            "Action": "sts:AssumeRole",
        }
    ]
}
iam.create_role(
    RoleName=LAMBDA_EXECUTION_ROLE_NAME,
    AssumeRolePolicyDocument=json.dumps(role_policy_document),
)
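
Note that `create_role` only sets the trust policy; it does not grant any permissions by itself. If you want the function to be able to write its logs to Amazon CloudWatch, one option is to attach the AWS managed policy `AWSLambdaBasicExecutionRole`. Whether your Panorama deployment needs this is an assumption to verify for your own setup, and the helper name below is hypothetical.

```python
# AWS managed policy that allows a Lambda function to write CloudWatch logs
BASIC_EXECUTION_POLICY_ARN = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"


def attach_basic_execution_policy(iam_client, role_name):
    """Attach the basic Lambda execution policy to `role_name` and return its ARN."""
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=BASIC_EXECUTION_POLICY_ARN)
    return BASIC_EXECUTION_POLICY_ARN
```

In this notebook the call would be `attach_basic_execution_policy(iam, LAMBDA_EXECUTION_ROLE_NAME)`.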

The following Python snippet uses the resources above to create a new AWS Lambda function called `PikachuDetection` (the value of `LAMBDA`). If you already have a Lambda function with that name and want to re-create it, run the following cell after converting it to Code type.

In [None]:
!rm -f $LAMBDA_DIR/$lambda_archive  # -f: do not fail when no previous archive exists
!cd $LAMBDA_DIR && zip -o $lambda_archive $lambda_file
!cp $LAMBDA_DIR/$lambda_archive .

with open(os.path.join(LAMBDA_DIR, lambda_archive), "rb") as f:
    zipped_code = f.read()

lambda_execution_role = iam.get_role(RoleName=LAMBDA_EXECUTION_ROLE_NAME)
response = lm.create_function(
    FunctionName=LAMBDA,
    Runtime="python3.7",
    Role=lambda_execution_role["Role"]["Arn"],
    Handler=lambda_file.replace('.py', '.main()'),
    Code=dict(ZipFile=zipped_code),
    Timeout=120,  
    MemorySize=2048,
    Publish=True)
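
The cell above builds the deployment package with the `zip` command and reads it back from disk. As an alternative sketch, the archive bytes expected by `Code=dict(ZipFile=...)` can be produced in memory with the standard `zipfile` module. The function name is hypothetical, and the sketch assumes the package consists of a single source file.

```python
import io
import os
import zipfile


def build_lambda_zip(source_path):
    """Return the bytes of a zip archive containing a single source file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file at the archive root, without its directory part
        archive.write(source_path, arcname=os.path.basename(source_path))
    return buffer.getvalue()
```

For example, `zipped_code = build_lambda_zip(os.path.join(LAMBDA_DIR, lambda_file))` would stand in for the shell commands and the `open(...).read()` step.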

Print the console URL of the Lambda function that was just published.

In [None]:
function_arn = response["FunctionArn"]
function_arn_version = list(response["FunctionArn"].split(":"))[-1]
lambda_url = (
    # Use the region of the current session instead of a hard-coded one
    f"https://console.aws.amazon.com/lambda/home?region={region}#/functions/"
    + response["FunctionName"]
    + "/versions/"
    + response["Version"]
    + "?tab=configuration"
)
print(lambda_url)

----------------------------
### Next steps
----------------------------

The Lambda function is now created and published. You are ready to deploy your model and the published Lambda function to the Panorama device.

The deployment instructions are linked below.

[Creating Application Instructions Here](https://docs.aws.amazon.com/panorama/)