# Object Detection: AML Package for Computer Vision

### Overview

Object detection is one of the main problems in computer vision. Traditionally, it required expert knowledge to identify and implement so-called “features” that highlight the position of objects in the image. Since the famous [AlexNet paper](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf) in 2012, deep neural networks have been used to find these features automatically.

This notebook shows how the Azure Machine Learning Package for Computer Vision can be used to train, evaluate, and deploy a [Faster R-CNN](https://arxiv.org/abs/1506.01497) object detection model. The Computer Vision Toolkit (CVTK) makes it easy to perform all these steps, and internally uses [TensorFlow's implementation](https://arxiv.org/abs/1611.10012) of Faster R-CNN. Faster R-CNN was shown to produce state-of-the-art results for Pascal VOC, one of the main object detection challenges in the field. For more information, see the [TensorFlow object detection website](https://github.com/tensorflow/models/tree/master/research/object_detection).

### Image Annotation

Manually annotated object locations are required to train and evaluate an object detector. One excellent UI for drawing bounding boxes is [LabelImg](https://tzutalin.github.io/labelImg), which can be installed using [windows_v1.6.0.zip](https://www.dropbox.com/s/tq7zfrcwl44vxan/windows_v1.6.0.zip?dl=1). LabelImg writes one XML file per image in Pascal VOC format, which can be read into CVTK (see below).

<img src="https://user-images.githubusercontent.com/9322661/38701920-2c85c962-3e6d-11e8-8cad-b4578336b231.JPG" width=800>
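
To make the annotation format more concrete, the following sketch reads bounding boxes from a Pascal VOC style XML string using only the Python standard library. The sample annotation and the helper name `read_voc_boxes` are made up for illustration; CVTK has its own loader, and this is not a description of how it works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, shaped like the Pascal VOC files that LabelImg writes.
SAMPLE_XML = """<annotation>
  <filename>1.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>orange</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>water</name>
    <bndbox><xmin>300</xmin><ymin>100</ymin><xmax>380</xmax><ymax>330</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) tuples from a Pascal VOC annotation string."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are stored as text; go through float() in case a tool wrote "48.0".
        coords = [int(float(bndbox.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return boxes


boxes = read_voc_boxes(SAMPLE_XML)
for box in boxes:
    print(box)
```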

### Example Dataset
For this demo, a dataset of grocery items inside refrigerators is provided, consisting of 30 images and 8 classes (e.g. water, orange, mushroom). These images can be seen in the folder "../sample_data/foods". For each JPEG image in the folder "../sample_data/foods/train/JPEGImages", an annotation XML file with a matching name exists in "../sample_data/foods/train/Annotations". These images and bounding box annotations will be loaded in the next section.

The figure below shows the recommended folder structure.

<img src="https://user-images.githubusercontent.com/9322661/38628125-a363aaa8-3d7e-11e8-884d-e53a53a8da90.JPG" width=600>
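
Before creating a dataset, it can help to confirm that every image has an annotation file. The sketch below is one way to check this, assuming that an image and its annotation share the same base name (for example `1.jpg` and `1.xml`). The helper `find_unannotated_images` is hypothetical and not part of CVTK, and the demo runs on a temporary folder rather than the real dataset.

```python
import tempfile
from pathlib import Path


def find_unannotated_images(data_dir, image_subdir="JPEGImages", annotation_subdir="Annotations"):
    """Return sorted names of .jpg files that have no .xml file with the same base name."""
    data_dir = Path(data_dir)
    annotated = {p.stem for p in (data_dir / annotation_subdir).glob("*.xml")}
    return sorted(
        p.name for p in (data_dir / image_subdir).glob("*.jpg") if p.stem not in annotated
    )


# Demonstrate on a throwaway folder that mimics the recommended layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "JPEGImages").mkdir()
    (root / "Annotations").mkdir()
    for name in ("1.jpg", "2.jpg"):
        (root / "JPEGImages" / name).touch()
    (root / "Annotations" / "1.xml").touch()
    missing = find_unannotated_images(root)

print(missing)
```

For the sample data, the same function could be pointed at "../sample_data/foods/train".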

## Training and Evaluation

### Step 1: Dataset creation

Create a CVTK dataset object which consists of a set of images with their respective bounding box annotations. In the example below, we will read in the refrigerator images which are provided in the "../sample_data/foods/train" folder. Note that only JPEG images are supported.

In [None]:
import warnings
warnings.filterwarnings("ignore")
import os, time
from cvtk.core import Context, ObjectDetectionDataset, TFFasterRCNN
from cvtk.utils import detection_utils

# Disable printing of logging messages
from azuremltkbase.logging import ToolkitLogger
ToolkitLogger.getInstance().setEnabled(False)

# Initialize the context object
out_root_path = "../../../cvtk_output"
Context.create(outputs_path=out_root_path, persistent_path=out_root_path, temp_path=out_root_path)

from matplotlib import pyplot as plt
# Display the images
%matplotlib inline

In [None]:
image_folder = "../sample_data/foods/train"
data_train = ObjectDetectionDataset.create_from_dir(dataset_name='training_dataset', data_dir=image_folder,
                                                    annotations_dir="Annotations", image_subdirectory='JPEGImages')

# Show some statistics of the training images, and give one example of the ground truth rectangle annotations
data_train.print_info()
_ = data_train.images[2].visualize_bounding_boxes(image_size = (10,10))

### Step 2: Define a model

Various parameters can be provided when defining a model. The meaning of these parameters, as well as of the parameters used for training (see next section), can be found either in CVTK's API docs or on the [TensorFlow object detection website](https://github.com/tensorflow/models/tree/master/research/object_detection).

In [None]:
score_threshold = 0.0       # Threshold on the detection score, used to discard lower-confidence detections.
max_total_detections = 300  # Maximum number of detections. A high value will slow down training but might increase accuracy.
my_detector = TFFasterRCNN(labels=data_train.labels, 
                           score_threshold=score_threshold, 
                           max_total_detections=max_total_detections)

### Step 3: Train the model

We can now train our object detector. This requires a GPU. The number of training steps in the code is set to 350 so that training on the refrigerator dataset runs quickly (about 5 minutes). In practice, one should set it to at least 10 times the number of images in the training set.

Two key parameters for training are the number of steps and the learning rate(s). The argument num_steps specifies the number of minibatches used to train the model. Since the minibatch size is set to 1 in this release, it equals the number of images considered during training.
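
The relationship between num_steps and the dataset size can be made concrete with a little arithmetic. The sketch below is illustrative only: both helper names are hypothetical and not part of CVTK, and the training-set size of 25 images is an assumed value, not the actual count.

```python
def min_recommended_steps(num_train_images, factor=10):
    """Lower bound on num_steps from the rule of thumb: at least `factor` times the image count."""
    return factor * num_train_images


def passes_over_training_set(num_steps, num_train_images, minibatch_size=1):
    """Average number of times each training image is seen during training."""
    return num_steps * minibatch_size / num_train_images


num_train_images = 25  # assumed value for illustration
print(min_recommended_steps(num_train_images))
print(passes_over_training_set(350, num_train_images))
```

With a minibatch size of 1, the second value is simply num_steps divided by the number of training images.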

In [None]:
print("tensorboard --logdir={}".format(my_detector.train_dir))

# to get good results, use a larger value for num_steps, e.g., 5000.
num_steps = 350
learning_rate = 0.001 # learning rate

start_train = time.time()
my_detector.train(dataset=data_train, num_steps=num_steps, 
                  initial_learning_rate=learning_rate)
end_train = time.time()
print(end_train-start_train)

TensorBoard can be used to visualize the training progress. TensorBoard events are located in the folder specified by the model object's train_dir attribute. Copy the printout that starts with 'tensorboard --logdir' to a command line to start TensorBoard, then copy the URL from the command window to a web browser to view it. Once TensorBoard is open, you should see a window like the following screenshot. If Firefox does not render TensorBoard correctly, please try another browser.

<img src="https://user-images.githubusercontent.com/9322661/39355941-a3785452-49dc-11e8-973f-080044feaa32.JPG" width=600>

### Step 4: Evaluate the model

Create an evaluation dataset, and use this to compute the accuracy of the model.

In [None]:
image_folder = "../sample_data/foods/test"
data_val = ObjectDetectionDataset.create_from_dir(dataset_name='val_dataset', data_dir=image_folder)
eval_result = my_detector.evaluate(dataset=data_val)

# print out the performance metric values
for label_obj in data_train.labels:
    label = label_obj.name
    key = 'PASCAL/PerformanceByCategory/AP@0.5IOU/' + label
    print('{0: <15}: {1: <3}'.format(label, round(eval_result[key], 2)))
    
print('{0: <15}: {1: <3}'.format("overall:", round(eval_result['PASCAL/Precision/mAP@0.5IOU'], 2))) 

Evaluation results can also be viewed using TensorBoard. For instance, copy the following printout to a command line to view TensorBoard.

In [None]:
print("tensorboard --logdir={} --port=8008".format(my_detector.eval_dir))

Similarly, you can compute the accuracy of the model on the training set. This is a sanity check to make sure training converged to a good solution. Note that the accuracy on the training set after successful training is often close to 100%.
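
The metric names used in this step contain "0.5IOU", which refers to the intersection-over-union (IoU) overlap between a detected box and a ground truth box. Roughly speaking, a detection only counts as correct if this overlap reaches 0.5; the exact matching rules are left to the evaluation code. The sketch below computes IoU for two boxes given as (top, left, bottom, right). The function name and the example boxes are made up for illustration.

```python
def intersection_over_union(box_a, box_b):
    """IoU of two boxes given as (top, left, bottom, right)."""
    top = max(box_a[0], box_b[0])
    left = max(box_a[1], box_b[1])
    bottom = min(box_a[2], box_b[2])
    right = min(box_a[3], box_b[3])

    # Width and height of the overlap are clamped at zero for disjoint boxes.
    intersection = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
close_detection = (0, 2, 10, 12)  # shifted by 2 pixels
loose_detection = (0, 5, 10, 15)  # shifted by 5 pixels

print(round(intersection_over_union(ground_truth, close_detection), 3))
print(round(intersection_over_union(ground_truth, loose_detection), 3))
```

The first detection overlaps the ground truth well enough to pass a 0.5 threshold, while the second does not.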

## Scoring

### Step 5: Score an image

In [None]:
image_path = data_val.images[0].storage_path
scores = my_detector.score(image_path)
path_save = out_root_path + "/scored_images/scored_image_preloaded.jpg"
ax = detection_utils.visualize(image_path, scores, image_size=(8, 12))
path_save_dir = os.path.dirname(os.path.abspath(path_save))
os.makedirs(path_save_dir, exist_ok=True)
ax.get_figure().savefig(path_save)

### Step 6: Save the model

In [None]:
# Save the model outside of your AML Workbench project folder because of the AML project size limit.
save_model_path = out_root_path + "/frozen_model/faster_rcnn.model"
my_detector.save(save_model_path)

### Step 7: Score an image with the saved model

The model needs to be loaded once for scoring.

In [None]:
my_detector_loaded = TFFasterRCNN.load(save_model_path)

After the model is loaded, it can be used to score an image or a list of images.

In [None]:
detections_dict = my_detector_loaded.score(image_path)

Below we print out the detected objects with scores above 0.5, including labels, scores, and coordinates.

In [None]:
look_up = dict((v,k) for k,v in my_detector.class_map.items())
n_obj = 0
for i in range(detections_dict['num_detections']):
    if detections_dict['detection_scores'][i] > 0.5:
        n_obj += 1
        print("Object {}: label={:11}, score={:.2f}, location=(top: {:.2f}, left: {:.2f}, bottom: {:.2f}, right: {:.2f})".format(
            i, look_up[detections_dict['detection_classes'][i]], 
            detections_dict['detection_scores'][i], 
            detections_dict['detection_boxes'][i][0],
            detections_dict['detection_boxes'][i][1], 
            detections_dict['detection_boxes'][i][2],
            detections_dict['detection_boxes'][i][3]))    
        
print("\nFound {} objects in image {}.".format(n_obj, image_path))           
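
To experiment with this kind of filtering without a trained model, the sketch below applies the same score threshold to a hand-written dictionary. The dictionary only imitates the keys used in the cell above; its values, the class names, and the helper `detections_above` are made up for illustration.

```python
def detections_above(detections, class_names, min_score=0.5):
    """Collect detections with a score above min_score as a list of records."""
    records = []
    for i in range(detections["num_detections"]):
        score = detections["detection_scores"][i]
        if score > min_score:
            records.append({
                "label": class_names[detections["detection_classes"][i]],
                "score": score,
                "box": tuple(detections["detection_boxes"][i]),  # (top, left, bottom, right)
            })
    return records


# Mock data for illustration only.
mock_detections = {
    "num_detections": 3,
    "detection_classes": [1, 2, 1],
    "detection_scores": [0.92, 0.40, 0.71],
    "detection_boxes": [
        [0.10, 0.20, 0.45, 0.60],
        [0.50, 0.55, 0.90, 0.80],
        [0.05, 0.65, 0.30, 0.95],
    ],
}
mock_class_names = {1: "orange", 2: "water"}

kept = detections_above(mock_detections, mock_class_names)
for record in kept:
    print(record)
```

Returning records instead of printing makes the result easier to reuse, for example to count objects per label.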

### Step 8: Visualize the scored image

In [None]:
path_save = out_root_path + "/scored_images/scored_image_frozen_graph.jpg"
ax = detection_utils.visualize(image_path, detections_dict, path_save=path_save, image_size=(8, 12))
# ax.get_figure() # use this code to extract the returned image

### Step 9: Webservice Deployment


<b>Prerequisites:</b> 
Please check the **Prerequisites** section of our deployment notebook to set up your deployment CLI. You only need to set it up once for all your deployments. More deployment-related topics, including IoT Edge deployment, can be found in the deployment notebook.
       
<b>Deployment API:</b>

> **Examples:**
- ```deploy_obj = AMLDeployment(deployment_name=deployment_name, associated_DNNModel=dnn_model, aml_env="cluster")``` # create deployment object
- ```deploy_obj.deploy()``` # deploy web service
- ```deploy_obj.status()``` # get status of deployment
- ```deploy_obj.score_image(local_image_path_or_image_url)``` # score an image
- ```deploy_obj.delete()``` # delete the web service
- ```deploy_obj.build_docker_image()``` # build docker image without creating webservice
- ```AMLDeployment.list_deployment()``` # list existing deployment
- ```AMLDeployment.delete_if_service_exist(deployment_name)``` # delete if the service exists with the deployment name

<b>Deployment management with portal:</b>

You can go to [Azure portal](https://ms.portal.azure.com/) to track and manage your deployments. From Azure portal, find your Machine Learning Model Management account page (You can search for your model management account name). Then go to: the model management account page->Model Management->Services.

In [None]:
# ##### OPTIONAL - Interactive CLI setup helper ###### 
# # Interactive CLI setup helper, including model management account and deployment environment.
# # If you haven't set up your CLI before or if you want to change your CLI settings, you can use this block to help you interactively.
# # UNCOMMENT THE FOLLOWING LINES IF YOU HAVE NOT CREATED OR SET THE MODEL MANAGEMENT ACCOUNT AND DEPLOYMENT ENVIRONMENT

# from azuremltkbase.deployment import CliSetup
# CliSetup().run()

In [None]:
from cvtk.operationalization import AMLDeployment

# set deployment name
deployment_name = "wsdeployment"

# Create deployment object
# It will use the current deployment environment (you can check it with CLI command "az ml env show").
deploy_obj = AMLDeployment(deployment_name=deployment_name, aml_env="cluster", associated_DNNModel=my_detector, replicas=1)

# Alternatively, you can provide azure machine learning deployment cluster name (environment name) and resource group name
# to deploy your model. It will use the provided cluster to deploy. To do that, please uncomment the following lines to create 
# the deployment object.

# azureml_rscgroup = "<resource group>"
# cluster_name = "<cluster name>"
# deploy_obj = AMLDeployment(deployment_name=deployment_name, associated_DNNModel=my_detector,
#                            aml_env="cluster", cluster_name=cluster_name, resource_group=azureml_rscgroup, replicas=1)

# Check if the deployment name exists, if yes remove it first.
if deploy_obj.is_existing_service():
    AMLDeployment.delete_if_service_exist(deployment_name)
    
# create the webservice
print("Deploying to Azure cluster...")
deploy_obj.deploy()
print("Deployment DONE")

### Step 10: Webservice consumption

Once you have created the webservice, you can score images with it. You have several options:

   - You can directly score the webservice with the deployment object with: deploy_obj.score_image(image_path_or_url) 
   - Or, you can use the service endpoint URL and service key (None for local deployment) with: AMLDeployment.score_existing_service_with_image(image_path_or_url, service_endpoint_url, service_key=None)
   - Form your HTTP requests directly to score the webservice endpoint (for advanced users).

#### Score with existing deployment object
```
deploy_obj.score_image(image_path_or_url)
```

In [None]:
# Score with existing deployment object

# Score local image with file path
print("Score local image with file path")
image_path_or_url = data_train.images[0].storage_path
print("Image source:",image_path_or_url)
serialized_result_in_json = deploy_obj.score_image(image_path_or_url, image_resize_dims=[224,224])
print("serialized_result_in_json:", serialized_result_in_json[:50])

# Score image url and remove image resizing
print("Score image url")
image_path_or_url = "https://cvtkdata.blob.core.windows.net/publicimages/microsoft_logo.jpg"
print("Image source:",image_path_or_url)
serialized_result_in_json = deploy_obj.score_image(image_path_or_url)
print("serialized_result_in_json:", serialized_result_in_json[:50])


In [None]:
# Time image scoring
import timeit

num_images = 3
for img_index, img_obj in enumerate(data_train.images[:num_images]):
    print("Calling API for image {} of {}: {}...".format(img_index + 1, num_images, img_obj.name))
    tic = timeit.default_timer()
    return_json = deploy_obj.score_image(img_obj.storage_path, image_resize_dims=[224,224])
    print("   Time for API call: {:.2f} seconds".format(timeit.default_timer() - tic))

#### Score with service endpoint url and service key
```
    AMLDeployment.score_existing_service_with_image(image_path_or_url, service_endpoint_url, service_key=None)
```

In [None]:
# Import related classes and functions
from cvtk.operationalization import AMLDeployment

service_endpoint_url = "http://xxx" # please replace with your own service url
service_key = "xxx" # please replace with your own service key

# score image url
image_path_or_url = "https://cvtkdata.blob.core.windows.net/publicimages/microsoft_logo.jpg"
print("Image source:",image_path_or_url)
serialized_result_in_json = AMLDeployment.score_existing_service_with_image(image_path_or_url,service_endpoint_url, service_key = service_key, image_resize_dims=[224,224])
print("serialized_result_in_json:", serialized_result_in_json[:50])

#### Score endpoint with http request directly
The following example code forms the HTTP request directly in Python. The same can be done in other programming languages.

In [None]:
def score_image_with_http(image, service_endpoint_url, service_key=None, parameters={}):
    """Score local image with http request

    Args:
        image (str): Image file path
        service_endpoint_url(str): web service endpoint url
        service_key(str): Service key. None for local deployment.
        parameters (dict): Additional request parameters as a dictionary. Default is {}.


    Returns:
        str: serialized result 
    """
    import requests
    from io import BytesIO
    import base64
    import json

    if service_key is None:
        headers = {'Content-Type': 'application/json'}
    else:
        headers = {'Content-Type': 'application/json',
                   "Authorization": ('Bearer ' + service_key)}
    payload = []
    encoded = None
    
    # Read image
    with open(image,'rb') as f:
        image_buffer = BytesIO(f.read()) ## Getting an image file represented as a BytesIO object
        
    # Convert your image to base64 string
    # image_in_base64 : "b'{base64}'"
    encoded = base64.b64encode(image_buffer.getvalue())
    image_request = {"image_in_base64": "{0}".format(encoded), "parameters": parameters}
    payload.append(image_request)
    body = json.dumps(payload)
    r = requests.post(service_endpoint_url, data=body, headers=headers)
    try:
        result = json.loads(r.text)
        json.loads(result[0])
    except Exception:
        raise ValueError("Incorrect output format. Result cannot be parsed: " + r.text)
    return result[0]
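
To inspect the request body without calling a live service, the sketch below builds the same JSON payload as the function above from raw bytes. The helper name `build_scoring_body` and the dummy input are made up for illustration. Note that formatting the base64 bytes with "{0}".format(...) in Python 3 yields a string of the form "b'...'", which matches the comment in the function above.

```python
import base64
import json


def build_scoring_body(image_bytes, parameters=None):
    """Build the JSON request body for one image, mirroring score_image_with_http."""
    encoded = base64.b64encode(image_bytes)
    image_request = {
        # In Python 3 this produces a string like "b'<base64 text>'".
        "image_in_base64": "{0}".format(encoded),
        "parameters": parameters if parameters is not None else {},
    }
    return json.dumps([image_request])


# Dummy bytes stand in for the contents of a real image file.
body = build_scoring_body(b"not a real image")
print(body)
```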


### Parse serialized result from webservice
The result from the webservice is a JSON string. You can parse it with the corresponding DNN model class.

In [None]:
image_path_or_url = image_path
print("Image source:",image_path_or_url)
serialized_result_in_json = deploy_obj.score_image(image_path_or_url)
print("serialized_result_in_json:", serialized_result_in_json[:50])

In [None]:
# Parse result from json string
import numpy as np
parsed_result = TFFasterRCNN.parse_serialized_result(serialized_result_in_json)
print("Parsed result:", parsed_result)

In [None]:
ax = detection_utils.visualize(image_path, parsed_result)
path_save = "../../../cvtk_output/scored_images/scored_image_web.jpg"
path_save_dir = os.path.dirname(os.path.abspath(path_save))
os.makedirs(path_save_dir, exist_ok=True)
ax.get_figure().savefig(path_save)

# APPENDIX 

# (A) Using pretrained model

### Initialize the model

In [None]:
my_detector_pt = TFFasterRCNN(labels=None, name="pretrained")
frozen_model_path, label_map_path = my_detector_pt.init_pretrained(use_frozen=True)
print("Frozen model written to path: " + frozen_model_path)
print("Labels written to path: " + label_map_path)

### Score using the preloaded model

In [None]:
scores = my_detector_pt.score(image_path)
path_save = "../../../cvtk_output/scored_images/scored_image_pretrained.jpg"
image_size = (8, 12)
ax = detection_utils.visualize(image_path, scores, label_map_path, path_save=path_save,
                              image_size=image_size)
# ax.get_figure()

### Score using the frozen graph

Load detection graph once.

In [None]:
detection_graph = detection_utils.load_graph(frozen_model_path)

In [None]:
detections_dict = detection_utils.score(detection_graph, image_path)
path_save = "../../../cvtk_output/scored_images/scored_image_pretrained_frozen.jpg"
image_size = (8, 12)
ax = detection_utils.visualize(image_path, detections_dict, label_map_path, path_save=path_save,
                              image_size=image_size)
# ax.get_figure()

# (B) Webcam scoring

The code below shows how to read in frames from a webcam (or optionally from disk) and run object detection on them. A pre-trained COCO model is used as the detector, but any trained detector can be used instead.

In [None]:
import cv2
from cvtk.core import Context, TFFasterRCNN
from cvtk.utils.detection_utils import FilepathImageProvider, VideoImageProvider
%matplotlib inline

out_root_path = "../../../cvtk_output"
Context.create(outputs_path=out_root_path, persistent_path=out_root_path, temp_path=out_root_path)

# Initialize detector with pre-trained model
my_detector = TFFasterRCNN(labels=None, name="pretrained")
my_detector.init_pretrained()

# Choose image provider
# image_provider = VideoImageProvider() # read images from webcam
image_provider = FilepathImageProvider([image.storage_path for image in data_val.images])  #read images from disk
#image_provider = VideoImageProvider(cv2_video_capture = cv2.VideoCapture("movie.mp4")) #read images from video file

# Optionally save visualization to video
# video_size = (640, 480)
# cv2_video_writer = cv2.VideoWriter('out_video.avi', cv2.VideoWriter_fourcc(*'XVID'), 5.0, video_size)

# Run object detection
_ = my_detector.score_multiple(image_provider, visualize=True) #, cv2_video_writer = cv2_video_writer, cv2_video_writer_img_size = video_size)

© 2018 Microsoft. All rights reserved. 