# Workshop - Human in the Loop for SageMaker Models - Module 3

In the previous modules, we used the model to generate predictions on some random out-of-sample images and got unsatisfactory predictions (low probability). We also demonstrated how to use Amazon Augmented AI to review and label the images based on custom criteria. The next step in a typical machine learning life cycle is to include the cases that the model has trouble with in the next batch of training data, so that the model can learn from the new data and improve. In machine learning we call this [incremental training](https://docs.aws.amazon.com/sagemaker/latest/dg/incremental-training.html).

There are [three ways](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html#object-detection-inputoutput) to supply image data and annotations to the SageMaker built-in object detection algorithm. We trained our original model with the RecordIO format, after converting the PASCAL VOC images and annotations into RecordIO. If you want to create custom RecordIO data, you can follow the steps outlined [here](https://gluon-cv.mxnet.io/build/examples_datasets/detection_custom.html). Alternatively, the SageMaker built-in object detection algorithm also accepts JSON files as annotations alongside your JPEG/PNG images. You can create one JSON file per image, as described in **Train with the Image Format** in the [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html#object-detection-inputoutput), or take advantage of [pipe mode](https://aws.amazon.com/blogs/machine-learning/accelerate-model-training-using-faster-pipe-mode-on-amazon-sagemaker/), which is enabled by using an [Augmented Manifest](https://docs.aws.amazon.com/sagemaker/latest/dg/augmented-manifest.html) as the input format. Pipe mode can accelerate overall model training time by up to 35% by streaming the data into the training algorithm while it is running, instead of copying the data to the EBS volume attached to the training instance. We can construct an augmented manifest file from the A2I output with the following function:

In [None]:
import re
import json
import boto3
import sagemaker
from sagemaker import get_execution_role

In [None]:
role = get_execution_role()
sess = sagemaker.Session()
BUCKET = sess.default_bucket()
OUTPUT_PATH = f's3://{BUCKET}/a2i-results'
MODEL_PATH = f's3://{BUCKET}/model'
object_categories = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 
                     'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 
                     'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

In [None]:
sagemaker_client = boto3.client('sagemaker')
a2i = boto3.client('sagemaker-a2i-runtime')
s3 = boto3.client('s3')

In [None]:
object_categories_dict = {str(i): j for i, j in enumerate(object_categories)}

def convert_a2i_to_augmented_manifest(a2i_output):
    annotations = []
    confidence = []
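    # each bounding box drawn by the human reviewer becomes one annotation
    for bbox in a2i_output['humanAnswers'][0]['answerContent']['annotatedResult']['boundingBoxes']:
        # map the label name back to its class index (raises IndexError if the label is not in object_categories)
        object_class_key = [key for (key, value) in object_categories_dict.items() if value == bbox['label']][0]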
        obj = {'class_id': int(object_class_key), 
               'width': bbox['width'],
               'top': bbox['top'],
               'height': bbox['height'],
               'left': bbox['left']}
        annotations.append(obj)
        confidence.append({'confidence': 1})

    # We set "a2i-retraining" as the attribute name for this dataset. This will later be used in setting the training data
    augmented_manifest={'source-ref': a2i_output['inputContent']['taskObject'],
                        'a2i-retraining': {'annotations': annotations,
                                           'image_size': [{'width': a2i_output['humanAnswers'][0]['answerContent']['annotatedResult']['inputImageProperties']['width'],
                                                           'depth':3,
                                                           'height': a2i_output['humanAnswers'][0]['answerContent']['annotatedResult']['inputImageProperties']['height']}]},
                        'a2i-retraining-metadata': {'job-name': 'a2i/%s' % a2i_output['humanLoopName'],
                                                    'class-map': object_categories_dict,
                                                    'human-annotated':'yes',
                                                    'objects': confidence,
                                                    'creation-date': a2i_output['humanAnswers'][0]['submissionTime'],
                                                    'type':'groundtruth/object-detection'}}
    return augmented_manifest

This function takes an A2I output JSON and returns a JSON object that is compatible with both the output format of Amazon SageMaker Ground Truth and the input format that the SageMaker built-in object detection algorithm expects. To create a cohort of training images from all the images re-labeled by human reviewers in the A2I console, you can loop through all the A2I outputs, convert each JSON file, and concatenate the results into a JSON Lines file, in which each line represents the result for one image.
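
Before uploading a manifest, it can help to sanity-check each line. The following is a minimal sketch, not part of the SageMaker SDK: `check_manifest_line` is a hypothetical helper, and the bucket name and values in `sample_line` are made up for illustration. It only checks the fields that the conversion function in this notebook writes.

```python
import json

# Fields that every bounding-box annotation in this notebook's manifest carries.
REQUIRED_BOX_FIELDS = {'class_id', 'left', 'top', 'width', 'height'}

def check_manifest_line(line, label_attribute='a2i-retraining'):
    """Return a list of problems found in one augmented manifest line (empty if none)."""
    record = json.loads(line)
    problems = []
    # 'source-ref' must point at the image in S3
    if not str(record.get('source-ref', '')).startswith('s3://'):
        problems.append('source-ref must be an S3 URI')
    label = record.get(label_attribute)
    if not isinstance(label, dict):
        problems.append(f'missing label attribute {label_attribute!r}')
        return problems
    # every annotation needs a class id and the four box coordinates
    for i, box in enumerate(label.get('annotations', [])):
        missing = REQUIRED_BOX_FIELDS - set(box)
        if missing:
            problems.append(f'annotation {i} is missing {sorted(missing)}')
    if not label.get('image_size'):
        problems.append('image_size is empty or missing')
    return problems

# Hypothetical example line (bucket, key, and coordinates are illustrative only)
sample_line = json.dumps({
    'source-ref': 's3://example-bucket/images/sample.jpg',
    'a2i-retraining': {
        'annotations': [{'class_id': 14, 'left': 10, 'top': 20, 'width': 100, 'height': 200}],
        'image_size': [{'width': 500, 'depth': 3, 'height': 375}],
    },
})
print(check_manifest_line(sample_line))
```

In the notebook you could run the same check over every line of `augmented.manifest` before copying it to S3.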

In [None]:
flowDefinitionName = 'fd-sagemaker-object-detection-demo'
flowDefinitionArn = sagemaker_client.describe_flow_definition(
    FlowDefinitionName=flowDefinitionName
)['FlowDefinitionArn']

In [None]:
a2i.list_human_loops(FlowDefinitionArn=flowDefinitionArn)

In [None]:
human_loops = a2i.list_human_loops(FlowDefinitionArn=flowDefinitionArn)
completed_human_loops = []

for loop in human_loops['HumanLoopSummaries']:
    resp = a2i.describe_human_loop(HumanLoopName=loop['HumanLoopName'])
    print(f'HumanLoop Status: {resp["HumanLoopStatus"]}')
    print(f'HumanLoop Output Destination: {resp["HumanLoopOutput"]}')
    print('\n')

    if resp["HumanLoopStatus"] == "Completed":
        completed_human_loops.append(resp)

In [None]:
output=[]
with open('augmented.manifest', 'w') as outfile:
    # convert the a2i json to augmented manifest for each human loop output
    for resp in completed_human_loops:
        # strip the "s3://<bucket>/" prefix to get the object key of the A2I output
        split_uri = re.split('s3://' + re.escape(BUCKET) + '/', resp['HumanLoopOutput']['OutputS3Uri'])
        output_bucket_key = split_uri[1]

        response = s3.get_object(Bucket=BUCKET, Key=output_bucket_key)
        content = response["Body"].read()
        json_output = json.loads(content)
        
        # convert using the function
        augmented_manifest = convert_a2i_to_augmented_manifest(json_output)
        print(json.dumps(augmented_manifest))
        json.dump(augmented_manifest, outfile)
        outfile.write('\n')
        output.append(augmented_manifest)
        print('\n')

In [None]:
# take a look at what the JSON Lines file looks like
!head -n2 augmented.manifest

In [None]:
# upload the manifest file to S3
!aws s3 cp augmented.manifest {OUTPUT_PATH}/augmented.manifest

Similar to training with a Ground Truth output augmented manifest file, as outlined in this [blog](https://aws.amazon.com/blogs/machine-learning/easily-train-models-using-datasets-labeled-by-amazon-sagemaker-ground-truth/), once we have collected enough data points we can construct a new `Estimator` for incremental training.

For incremental training, the choice of hyperparameters becomes critical. Since we are continuing the learning and optimization from the last model, an appropriate starting `learning_rate`, for example, needs to be determined again. As a rule of thumb, even with the introduction of new, unseen data, we should start the incremental training with a smaller `learning_rate` and a different learning rate schedule (`lr_scheduler_factor` and `lr_scheduler_step`) than those of the previous training job, because the previous optimization had already reached a more stable state with a reduced learning rate. We should see similar mAP performance on the original validation dataset in the first epoch of the incremental training.
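
To make the interaction between `learning_rate`, `lr_scheduler_step`, and `lr_scheduler_factor` concrete, here is a small sketch that prints the learning rate used in each epoch. `learning_rate_by_epoch` is a hypothetical helper written for this illustration, and it assumes the reduction takes effect after each listed epoch completes. The exact epoch boundary used inside the algorithm may differ.

```python
def learning_rate_by_epoch(learning_rate, lr_scheduler_step, lr_scheduler_factor, epochs):
    """Return the learning rate used in each epoch (index 0 is epoch 1).

    Assumption: the rate is multiplied by lr_scheduler_factor once each
    epoch listed in lr_scheduler_step has finished.
    """
    steps = {int(s) for s in lr_scheduler_step.split(',') if s.strip()}
    schedule = []
    current = learning_rate
    for epoch in range(1, epochs + 1):
        schedule.append(current)
        if epoch in steps:
            current *= lr_scheduler_factor
    return schedule

# Values used later in this notebook, but with 8 epochs instead of 1
for epoch, lr in enumerate(learning_rate_by_epoch(0.0001, '3,6', 0.1, 8), start=1):
    print(f'epoch {epoch}: learning_rate = {lr:.0e}')
```

Note that with `epochs=1`, as in the demo below, the steps at epochs 3 and 6 are never reached, so the learning rate stays at its starting value for the whole job.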

Here we use exactly the same hyperparameters that were used to train the first model, with the following exceptions:

- a smaller learning rate (`learning_rate` was 0.001, now 0.0001)
- the weights from the trained model instead of the pre-trained weights that come with the algorithm (`use_pretrained_model=0`).

Note that the following working code snippet is meant to demonstrate how to set up the A2I output for training in SageMaker with the object detection algorithm. Incremental training with merely 1 or 2 new samples and untuned hyperparameters will not yield a meaningful model, and may even cause [catastrophic forgetting](https://en.wikipedia.org/wiki/Catastrophic_interference).

In [None]:
# path definition
s3_train_data = f'{OUTPUT_PATH}/augmented.manifest'
# Reusing the training data for validation here for demonstration purposes
# but in practice you should provide a set of data that you want to validate the training against
s3_validation_data = s3_train_data 
s3_output_location = f'{OUTPUT_PATH}/incremental-training'

num_training_samples = len(output)

In [None]:
source_model_data_s3_uri = 's3://aws-sagemaker-augmented-ai-example/model/model.tar.gz'

!aws s3 cp {source_model_data_s3_uri} {MODEL_PATH}/model.tar.gz

model_data_s3_uri = f'{MODEL_PATH}/model.tar.gz'

In [None]:
# setting the input data
train_data = sagemaker.inputs.TrainingInput(s3_train_data, 
                                            distribution='FullyReplicated', 
                                            content_type='application/x-recordio',
                                            record_wrapping='RecordIO',
                                            s3_data_type='AugmentedManifestFile', 
                                            attribute_names=['source-ref', 'a2i-retraining'])

validation_data = sagemaker.inputs.TrainingInput(s3_validation_data, 
                                                 distribution='FullyReplicated', 
                                                 content_type='application/x-recordio',
                                                 record_wrapping='RecordIO',
                                                 s3_data_type='AugmentedManifestFile', 
                                                 attribute_names=['source-ref', 'a2i-retraining'])

# Use the output model from the original training job.  
model_data = sagemaker.inputs.TrainingInput(model_data_s3_uri, 
                                            distribution='FullyReplicated',
                                            content_type='application/x-sagemaker-model', 
                                            s3_data_type='S3Prefix',
                                            input_mode = 'File')

data_channels = {'train': train_data, 
                 'validation': validation_data,
                 'model': model_data}

In [None]:
# retrieve the algorithm image for the region of the current session
image = sagemaker.image_uris.retrieve('object-detection', sess.boto_region_name, version='1')

# Create a model object set to use "Pipe" mode because we are inputting augmented manifest files.
new_od_model = sagemaker.estimator.Estimator(image, # same object detection image that we used for model hosting  
                                             role, 
                                             instance_count=1, 
                                             instance_type='ml.p3.2xlarge', 
                                             volume_size = 50, 
                                             max_run = 360000, 
                                             input_mode = 'Pipe',
                                             output_path=s3_output_location, 
                                             sagemaker_session=sess) 

In [None]:
# same hyperparameters as the original training job, except learning_rate and use_pretrained_model
new_od_model.set_hyperparameters(base_network='resnet-50',
                                 use_pretrained_model=0, # we are going to use our own model
                                 num_classes=20,
                                 learning_rate=0.0001,   # smaller learning rate for a more stable search
                                 mini_batch_size=1,
                                 epochs=1,               # 1 for demo purposes
                                 lr_scheduler_step='3,6',
                                 lr_scheduler_factor=0.1,
                                 optimizer='sgd',
                                 momentum=0.9,
                                 weight_decay=0.0005,
                                 overlap_threshold=0.5,
                                 nms_threshold=0.45,
                                 image_shape=300,
                                 label_width=350,
                                 num_training_samples=num_training_samples)

In [None]:
new_od_model.fit(inputs=data_channels, logs=True)

After training, you get a new model in `s3_output_location`. You can deploy it to a new endpoint, or modify an existing endpoint without taking models that are already deployed into production out of service. For example, you can add new model variants, update the ML compute instance configurations of existing model variants, or change the distribution of traffic among model variants. To modify an endpoint, you provide a new endpoint configuration. Amazon SageMaker implements the changes without any downtime. For more information, see [UpdateEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html) and [UpdateEndpointWeightsAndCapacities](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpointWeightsAndCapacities.html).
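
As a rough sketch of adding the retrained model as a second variant, the helpers below build a two-variant endpoint configuration and apply it with `create_endpoint_config` and `update_endpoint`. The function names, variant names, instance type, and the 10% traffic share are assumptions made for illustration. Both SageMaker models must already exist, and nothing is called until you invoke `apply_endpoint_config` yourself.

```python
def build_production_variants(current_model_name, new_model_name,
                              new_model_weight=0.1, instance_type='ml.m5.xlarge'):
    """Build a ProductionVariants list that splits traffic between two models."""
    if not 0.0 <= new_model_weight <= 1.0:
        raise ValueError('new_model_weight must be between 0 and 1')
    return [
        {'VariantName': 'current',
         'ModelName': current_model_name,
         'InitialInstanceCount': 1,
         'InstanceType': instance_type,
         'InitialVariantWeight': 1.0 - new_model_weight},
        {'VariantName': 'incremental',
         'ModelName': new_model_name,
         'InitialInstanceCount': 1,
         'InstanceType': instance_type,
         'InitialVariantWeight': new_model_weight},
    ]

def apply_endpoint_config(sagemaker_client, endpoint_name, config_name, variants):
    """Create a new endpoint configuration and switch the endpoint to it."""
    sagemaker_client.create_endpoint_config(EndpointConfigName=config_name,
                                            ProductionVariants=variants)
    sagemaker_client.update_endpoint(EndpointName=endpoint_name,
                                     EndpointConfigName=config_name)

# Hypothetical model names; send 10% of traffic to the retrained model
variants = build_production_variants('od-model-original', 'od-model-incremental')
print([(v['VariantName'], v['InitialVariantWeight']) for v in variants])
```

In this notebook you would pass the existing `sagemaker_client` as the first argument to `apply_endpoint_config`. Once both variants are in service, `update_endpoint_weights_and_capacities` can shift traffic between them without creating another configuration.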

## More on incremental training
It is recommended to perform a search over the hyperparameter space for your incremental training with [hyperparameter tuning](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html) for an optimal set of hyperparameters, especially the ones related to learning rate: `learning_rate`, `lr_scheduler_factor` and `lr_scheduler_step` from the SageMaker object detection algorithm.
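
A minimal sketch of such a search, using `HyperparameterTuner` from the SageMaker Python SDK, is shown below. Only `learning_rate` is searched here. The helper names, the two-decade search range below the previous learning rate, the job counts, and the use of `validation:mAP` as the objective metric are assumptions for illustration, not tuned recommendations.

```python
def learning_rate_search_bounds(previous_learning_rate, decades=2):
    """Return (low, high) bounds spanning `decades` orders of magnitude below the previous rate."""
    high = previous_learning_rate
    low = previous_learning_rate / (10 ** decades)
    return low, high

def build_incremental_tuner(estimator, previous_learning_rate, max_jobs=6, max_parallel_jobs=2):
    """Wrap an already-configured Estimator in a tuner that searches learning_rate."""
    # Imported here so the helpers can be defined even where the SDK is not installed.
    from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

    low, high = learning_rate_search_bounds(previous_learning_rate)
    ranges = {'learning_rate': ContinuousParameter(low, high, scaling_type='Logarithmic')}
    return HyperparameterTuner(estimator=estimator,
                               objective_metric_name='validation:mAP',
                               hyperparameter_ranges=ranges,
                               objective_type='Maximize',
                               max_jobs=max_jobs,
                               max_parallel_jobs=max_parallel_jobs)

# Search range when the original model was trained with learning_rate=0.001
print(learning_rate_search_bounds(0.001))
```

With the estimator and channels defined earlier, the tuner could be started with `build_incremental_tuner(new_od_model, 0.001).fit(inputs=data_channels)`. Keep in mind that a meaningful search needs a real validation set rather than the reused training manifest.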

## Clean Up

Remember to execute the last cells in module 1 and module 2.