# Object Detection

*Object detection* is a form of computer vision in which a machine learning model is trained to classify individual instances of objects in an image, and to indicate a *bounding box* that marks the location of each one. You can think of this as a progression from *image classification* (in which the model answers the question "what is this an image of?") to building solutions where we can ask the model "what objects are in this image, and where are they?".

<p style='text-align:center'><img src='./images/object-detection.jpg' alt='A robot identifying fruit'/></p>

For example, a grocery store might use an object detection model to implement an automated checkout system that scans a conveyor belt with a camera and identifies the items on it, without the need to scan each item individually.
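
To make the difference between the two tasks concrete, here is a minimal sketch of what each kind of result might look like for the checkout scenario. The labels and box values are invented for illustration, and the `(left, top, width, height)` tuple layout is an assumption rather than the format of any particular service.

```python
from collections import Counter

# Image classification answers "what is this an image of?" with one label.
classification_result = "apple"

# Object detection answers "what objects are here, and where?" with one entry
# per object: a label plus a (left, top, width, height) box.
# All values below are hypothetical.
detection_result = [
    ("apple", (0.10, 0.20, 0.25, 0.30)),
    ("apple", (0.55, 0.15, 0.22, 0.28)),
    ("banana", (0.30, 0.60, 0.40, 0.20)),
]

# A checkout system could tally the detected items by label.
item_counts = Counter(label for label, _box in detection_result)
print(dict(item_counts))
```

Because each detection is a separate entry, two apples in the same image are counted twice, which a single classification label cannot express.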

The **Custom Vision** cognitive service in Microsoft Azure provides a cloud-based solution for creating and publishing custom object detection models.

## Create a Custom Vision resource

To use the Custom Vision service, you need an Azure resource with which to train a model, and a resource with which to publish it for applications to use. You can use the same resource for both tasks, or you can use a separate resource for each in order to allocate costs separately, provided both resources are created in the same region. The resource for either (or both) tasks can be a general **Cognitive Services** resource, or a specific **Custom Vision** resource. Use the following instructions to create a new **Custom Vision** resource (or use an existing resource if you have one).

1. In a new browser tab, open the Azure portal at [https://portal.azure.com](https://portal.azure.com), and sign in using the Microsoft account associated with your Azure subscription.
2. Select the **&#65291;Create a resource** button, search for *custom vision*, and create a **Custom Vision** resource with the following settings:
    - **Create options**: Both
    - **Subscription**: *Your Azure subscription*
    - **Resource group**: *Create a new resource group with a unique name*
    - **Name**: *Enter a unique name*
    - **Training location**: *Choose any available region*
    - **Training pricing tier**: F0
    - **Prediction location**: *The same as the training location*
    - **Prediction pricing tier**: F0

    > **Note**: If you already have an F0 Custom Vision resource in your subscription, select **S0** for this one.

3. Wait for the resource to be created.

## Create a Custom Vision project

To train an object detection model, you need to create a Custom Vision project based on your training resource. To do this, you'll use the Custom Vision portal.

1. In a new browser tab, open the Custom Vision portal at [https://customvision.ai](https://customvision.ai), and sign in using the Microsoft account associated with your Azure subscription.
2. Create a new project with the following settings:
    - **Name**: Grocery Detection
    - **Description**: Object detection for groceries.
    - **Resource**: *The Custom Vision resource you created previously*
    - **Project Types**: Object Detection
    - **Classification Types**: Multiclass (single tag per image)
    - **Domains**: General
3. Wait for the project to be created and opened in the browser.

## Add and tag images

To train an object detection model, you need to upload images that contain the classes you want the model to identify, and tag them to indicate bounding boxes for each object instance.

1. Download and extract the training images from https://github.com/GraemeMalcolm/ai-fundamentals/raw/master/data/vision/object_training.zip. The extracted folder contains a collection of images of fruit.
2. In the Custom Vision portal, in your object detection project, select **Add images** and upload all of the images in the extracted folder.
3. After the images have been uploaded, select the first one to open it.
4. Hold the mouse over any object in the image until an automatically detected region is displayed, as in the image below. Then select the object, and if necessary resize the region to surround it.

    <p style='text-align:center'><img src='./images/object-region.jpg' alt='The default region for an object'/></p>

    Alternatively, you can simply drag around the object to create a region.

5. When the region surrounds the object, add a new tag with the appropriate object type (*apple*, *banana*, or *orange*) as shown here:

    <p style='text-align:center'><img src='./images/object-tag.jpg' alt='A tagged object in an image'/></p>

6. Select and tag each of the other objects in the image, resizing the regions and adding new tags as required.

    <p style='text-align:center'><img src='./images/object-tags.jpg' alt='Two tagged objects in an image'/></p>

7. Use the **>** link on the right to go to the next image, and tag its objects. Then just keep working through the entire image collection, tagging each apple, banana, and orange.

8. When you have finished tagging the last image, close the **Image Detail** editor and on the **Training Images** page, under **Tags**, select **Tagged** to see all of your tagged images:

    <p style='text-align:center'><img src='./images/tagged-images.jpg' alt='Tagged images in a project'/></p>
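
The prediction code later in this notebook multiplies each bounding box value by the image width or height, which implies that boxes are expressed as fractions of the image size rather than as pixels. As a minimal sketch of that idea, the hypothetical helper below converts a region measured in pixels into the same fractional form. The function name and the example numbers are assumptions made for illustration.

```python
def to_normalized_region(left_px, top_px, width_px, height_px, img_w, img_h):
    """Convert a pixel-space box into fractions of the image size (0.0 to 1.0)."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    return (left_px / img_w, top_px / img_h, width_px / img_w, height_px / img_h)

# Hypothetical example: a 200 x 150 pixel box whose top-left corner is at
# (100, 50) in an 800 x 600 pixel image.
region = to_normalized_region(100, 50, 200, 150, img_w=800, img_h=600)
print(region)
```

Expressing regions this way keeps them valid if the image is resized, because the fractions do not depend on the pixel dimensions.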

## Train and test a model

Now that you've tagged the images in your project, you're ready to train a model.

1. In the Custom Vision project, click **Train** to train an object detection model using the tagged images. Select the **Quick Training** option.
2. Wait for training to complete (it might take ten minutes or so), and then review the *Precision*, *Recall*, and *mAP* performance metrics. These measure the prediction accuracy of the object detection model, and should all be high.
3. At the top right of the page, click **Quick Test**, and then in the **Image URL** box, enter *https://github.com/GraemeMalcolm/ai-fundamentals/raw/master/data/vision/test/IMG_TEST_5.jpg* and view the prediction that is generated. Then close the **Quick Test** window.
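
As a rough illustration of what the metrics in step 2 describe, the sketch below computes precision and recall from counts of correct and incorrect detections, along with the overlap measure (intersection over union) that detection metrics commonly use to decide whether a predicted box matches a tagged one. This is not how the Custom Vision service computes its figures, which the source does not describe. The function names, box layout, and numbers are all assumptions, and *mAP* (which summarizes precision across recall levels and tags) is not computed here.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, width, height) boxes."""
    a_left, a_top, a_w, a_h = box_a
    b_left, b_top, b_w, b_h = box_b
    # Width and height of the overlapping rectangle (zero if the boxes are apart)
    inter_w = max(0.0, min(a_left + a_w, b_left + b_w) - max(a_left, b_left))
    inter_h = max(0.0, min(a_top + a_h, b_top + b_h) - max(a_top, b_top))
    intersection = inter_w * inter_h
    union = a_w * a_h + b_w * b_h - intersection
    return intersection / union if union > 0 else 0.0

def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: share of predictions that are correct.
    Recall: share of actual objects that were found."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical predicted box compared with a hypothetical tagged box
overlap = iou((0.10, 0.10, 0.40, 0.40), (0.20, 0.20, 0.40, 0.40))

# Hypothetical counts: 8 correct detections, 2 spurious ones, 1 missed object
precision, recall = precision_recall(true_positives=8, false_positives=2, false_negatives=1)
print(f"IoU: {overlap:.2f}, precision: {precision:.2f}, recall: {recall:.2f}")
```

With these made-up numbers the overlap is about 0.39, precision is 0.80, and recall is about 0.89. A model that finds every object but also reports many spurious ones would score high recall and low precision, which is why the metrics are reviewed together.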

## Publish and consume the model

1. At the top left of the **Performance** page, click **&#128504; Publish** to publish the trained model with the following settings:
    - **Model name**: detect-produce
    - **Prediction Resource**: *The prediction resource you created previously*
2. After publishing, click the **&#9881;** icon at the top right to view the project settings. Under **General** (on the left), note the **Project Id**, and under **Resources** (on the right), note the **Key** and **Endpoint** values. Copy these values and paste them in the code cell below, replacing **YOUR_PROJECT_ID**, **YOUR_KEY** and **YOUR_ENDPOINT**.
3. Run the code cell below by clicking its green <span style="color:green">&#9655;</span> button (at the top left of the cell).

> **Note**: Don't worry too much about the details of the code. It uses the Python SDK for the Custom Vision service to submit an image to your model and retrieve predictions for detected objects. Each prediction consists of a class name (*apple*, *banana*, or *orange*) and *bounding box* coordinates that indicate where in the image the predicted object has been detected. The code then uses this information to draw a labelled box around each object on the image.

```python
project_id = 'YOUR_PROJECT_ID' # Replace with your project ID
cv_key = 'YOUR_KEY' # Replace with your primary key
cv_endpoint = 'YOUR_ENDPOINT' # Replace with your endpoint
model_name = 'detect-produce' # Must exactly match (including case) the model name you set when publishing

from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from matplotlib import pyplot as plt
from PIL import Image, ImageDraw
import os
%matplotlib inline

# Load a test image and get its dimensions (PIL reports size as (width, height))
test_img_file = os.path.join('data', 'vision', 'produce.jpg')
test_img = Image.open(test_img_file)
test_img_w, test_img_h = test_img.size

# Get a prediction client for the object detection model
predictor = CustomVisionPredictionClient(cv_key, endpoint=cv_endpoint)

print('Detecting objects in {} using model {} in project {}...'.format(test_img_file, model_name, project_id))

# Detect objects in the test image
with open(test_img_file, mode="rb") as test_data:
    results = predictor.detect_image(project_id, model_name, test_data)

# Create a figure to display the results
fig = plt.figure(figsize=(8, 8))
plt.axis('off')

# Display the image with boxes around each detected object
draw = ImageDraw.Draw(test_img)
line_width = max(1, test_img_w // 100)  # scale the box outline with the image width
object_colors = {
    "apple": "lightgreen",
    "banana": "yellow",
    "orange": "orange"
}
for prediction in results.predictions:
    # Only show objects detected with a probability above 50%
    if prediction.probability > 0.5:
        # Use white for any tag that has no assigned color
        color = object_colors.get(prediction.tag_name, 'white')
        # Bounding box values are fractions of the image size, so scale them to pixels
        left = prediction.bounding_box.left * test_img_w
        top = prediction.bounding_box.top * test_img_h
        height = prediction.bounding_box.height * test_img_h
        width = prediction.bounding_box.width * test_img_w
        # Trace the four corners of the box and return to the start to close it
        points = ((left, top), (left + width, top), (left + width, top + height), (left, top + height), (left, top))
        draw.line(points, fill=color, width=line_width)
        plt.annotate(prediction.tag_name + ": {0:.2f}%".format(prediction.probability * 100), (left, top), backgroundcolor=color)
plt.imshow(test_img)
```


View the resulting predictions, which show the objects detected and the probability for each prediction.
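
The cell above only draws predictions whose probability is greater than 50%. The sketch below shows how that cutoff affects which predictions are kept, using stand-in objects that carry the same `tag_name` and `probability` attributes the cell reads. The helper name and the probability values are invented for illustration, and no call to the service is made.

```python
from types import SimpleNamespace

# Stand-in predictions with hypothetical values; a real run would use
# the predictions returned by the service instead.
predictions = [
    SimpleNamespace(tag_name="apple", probability=0.98),
    SimpleNamespace(tag_name="banana", probability=0.61),
    SimpleNamespace(tag_name="orange", probability=0.35),
]

def confident_predictions(predictions, threshold=0.5):
    """Return predictions above the threshold, most confident first."""
    kept = [p for p in predictions if p.probability > threshold]
    return sorted(kept, key=lambda p: p.probability, reverse=True)

# Compare a permissive cutoff with a strict one
for threshold in (0.5, 0.9):
    tags = [p.tag_name for p in confident_predictions(predictions, threshold)]
    print(f"threshold {threshold:.0%}: {tags}")
```

Raising the threshold removes uncertain detections at the cost of possibly missing real objects, so the right value depends on which kind of error matters more for the application.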