# Object Detection with AWS: A Demo with Tires and Licence Plates

This series of notebooks demonstrates tackling a sample computer vision problem on AWS - building a two-class object detector.

**This notebook** walks through using the [SageMaker Ground Truth](https://aws.amazon.com/sagemaker/groundtruth/) tool to annotate training and validation data sets.

**Follow-on** notebooks show how to train a range of models from the created dataset, including:

* [Amazon Rekognition](https://aws.amazon.com/rekognition/)'s [custom labels](https://aws.amazon.com/rekognition/custom-labels-features/) functionality, announced at re:Invent 2019
* SageMaker's [built-in object detection algorithm](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html)

# Tires and Plates 1: Introduction and Data Preparation

## Acknowledgements

We use the [**Open Images Dataset v4**](https://storage.googleapis.com/openimages/web/download_v4.html) as a convenient source of pre-curated images. The Open Images Dataset V4 is created by Google Inc. We have not modified the images or the accompanying annotations. You can obtain the images and the annotations [here](https://storage.googleapis.com/openimages/web/download_v4.html). The annotations are licensed by Google Inc. under CC BY 4.0 license. The images are listed as having a CC BY 2.0 license. The following paper describes Open Images V4 in depth: from the data collection and annotation to detailed statistics about the data and evaluation of models trained on it.

A. Kuznetsova, H. Rom, N. Alldrin, J. Uijlings, I. Krasin, J. Pont-Tuset, S. Kamali, S. Popov, M. Malloci, T. Duerig, and V. Ferrari. The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale. arXiv:1811.00982, 2018. ([link to PDF](https://arxiv.org/abs/1811.00982))

## Pre-requisites

This notebook is designed to be run in Amazon SageMaker. To run it (and understand what's going on), you'll need:

* Basic familiarity with Python, [Amazon S3](https://docs.aws.amazon.com/s3/index.html), [Amazon SageMaker](https://aws.amazon.com/sagemaker/), and the [AWS Command Line Interface (CLI)](https://aws.amazon.com/cli/).
* To run in **a region where Rekognition Custom Labels is available** (only N. Virginia at launch), if you plan to explore this feature.
* To create an **S3 bucket** in the same region, and ensure the SageMaker notebook's role has access to this bucket.
* Sufficient [SageMaker quota limits](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html#limits_sagemaker) set on your account to run GPU-accelerated training jobs.

## Cost and runtime

Depending on your configuration, this demo may consume resources outside of the free tier but should not generally be expensive because we'll be training on a small number of images. You might wish to review the following for your region:

* [Amazon SageMaker pricing](https://aws.amazon.com/sagemaker/pricing/)
* [SageMaker Ground Truth pricing](https://aws.amazon.com/sagemaker/groundtruth/pricing/)
* [Amazon Rekognition pricing](https://aws.amazon.com/rekognition/pricing/)

The standard `ml.t2.medium` instance should be sufficient to run the notebooks.

We will use GPU-accelerated instance types for training and hyperparameter optimization, and use spot instances where appropriate to optimize these costs.

As noted in the step-by-step guidance, you should take particular care to delete any created SageMaker real-time prediction endpoints when finishing the demo.

## Step 0: Dependencies and configuration

As usual we'll start by loading libraries, defining configuration, and connecting to the AWS SDKs:

In [1]:
%load_ext autoreload
%autoreload 1

# Built-Ins:
import csv
import os
from collections import defaultdict
import json

# External Dependencies:
import boto3
import imageio
import numpy as np
import sagemaker
from IPython.display import display, HTML

# Local Dependencies:
%aimport util

Next we configure the name and layout of your bucket, and the annotation job to set up.

**If you're following this demo in a group:** you can pool your annotations for better accuracy without spending hours annotating:

* Have each group member set a different `BATCH_OFFSET` integer from 0 upwards and you'll be allocated different images to annotate
* Later, you can *import* the other members' output manifest files to your own S3 data set.

**If not:** don't worry - we already provide a 100-image set in this repository to augment your annotations!

In [2]:
## Overall S3 bucket layout:
# Note you could instead set for auto-setup: BUCKET_NAME = sagemaker.Session().default_bucket()
BUCKET_NAME = 'vtg-20200124-gt-demo'
%store BUCKET_NAME
DATA_PREFIX = "data" # The folder in the bucket (and locally) where we will store data
%store DATA_PREFIX
MODELS_PREFIX = "models" # The folder in the bucket where we will store models
%store MODELS_PREFIX
CHECKPOINTS_PREFIX = "models/checkpoints" # Model checkpoints can go in a subfolder of models
%store CHECKPOINTS_PREFIX

## Annotation job:
CLASS_NAMES = ["Tire", "Vehicle registration plate"]
%store CLASS_NAMES
N_EXAMPLES_PER_CLASS = 20
BATCH_OFFSET = 0
BATCH_NAME = "my-annotations"

# Note that some paths are reserved, restricting your choice of BATCH_NAME:
data_raw_prefix = DATA_PREFIX + "/raw"
data_augment_prefix = DATA_PREFIX + "/augmentation"
data_batch_prefix = f"{DATA_PREFIX}/{BATCH_NAME}"
test_image_folder = DATA_PREFIX + "/test"
%store test_image_folder

Stored 'BUCKET_NAME' (str)
Stored 'DATA_PREFIX' (str)
Stored 'MODELS_PREFIX' (str)
Stored 'CHECKPOINTS_PREFIX' (str)
Stored 'CLASS_NAMES' (list)
Stored 'test_image_folder' (str)


Here we just connect to the AWS SDKs we'll use, and validate the choice of S3 bucket:

In [3]:
role = sagemaker.get_execution_role()
session = boto3.session.Session()
region = session.region_name
s3 = session.resource("s3")
bucket = s3.Bucket(BUCKET_NAME)
smclient = session.client("sagemaker")

bucket_region = \
    session.client("s3").head_bucket(Bucket=BUCKET_NAME)["ResponseMetadata"]["HTTPHeaders"]["x-amz-bucket-region"]
assert (
    bucket_region == region
), f"Your S3 bucket {BUCKET_NAME} and this notebook need to be in the same region."

if (region != "us-east-1"):
    print("WARNING: Rekognition Custom Labels functionality is only available in us-east-1 at launch")

## Step 1: Set the goalposts with some unlabelled target data

Let's start out by collecting a handful of images from around the web to illustrate what we'd like to detect.

These images are not licensed, and the links may break in some regions or over time. Feel free to add your own, or replace them with any other images of tires and licence plates!

Model evaluations in the following notebooks will loop through each image in the `test_image_folder`.

## Step 2: Map our class names to OpenImages class IDs

OpenImages defines a hierarchy of object types (e.g. "swan" is a subtype of "bird"), and references each with a class ID instead of the human-readable name.

Since we want to find images containing tires and licence plates, our first job is to figure out which OpenImages class IDs they correspond to.

We start by downloading the OpenImages metadata, below.

(Note we're only referencing the `test` subset of OpenImages as an easy way to keep data volumes small)

In [6]:
# Download and process the Open Images annotations.
os.makedirs(data_raw_prefix, exist_ok=True)
!wget -O $data_raw_prefix/annotations-bbox.csv https://storage.googleapis.com/openimages/2018_04/test/test-annotations-bbox.csv 
!wget -O $data_raw_prefix/class-descriptions.csv https://storage.googleapis.com/openimages/2018_04/class-descriptions.csv 
!wget -O $data_raw_prefix/labels-hierarchy.json https://storage.googleapis.com/openimages/2018_04/bbox_labels_600_hierarchy.json

--2020-01-23 10:10:58--  https://storage.googleapis.com/openimages/2018_04/test/test-annotations-bbox.csv
Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.12.240, 2607:f8b0:4004:807::2010
Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.12.240|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 52174204 (50M) [text/csv]
Saving to: ‘data/raw/annotations-bbox.csv’


2020-01-23 10:10:58 (124 MB/s) - ‘data/raw/annotations-bbox.csv’ saved [52174204/52174204]

--2020-01-23 10:10:58--  https://storage.googleapis.com/openimages/2018_04/class-descriptions.csv
Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.12.240, 2607:f8b0:4004:807::2010
Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.12.240|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 475854 (465K) [text/csv]
Saving to: ‘data/raw/class-descriptions.csv’


2020-01-23 10:10:58 (62.7 MB/s) - ‘data/raw/class-descr

Now a case-insensitive lookup for our class names in the classes CSV:

In [7]:
# The class list is really long, so let's stream it instead of loading dataframe:
class_root_ids = { s: None for s in CLASS_NAMES }
classes_lower_notfound = { s.lower(): s for s in CLASS_NAMES }
with open(f"{data_raw_prefix}/class-descriptions.csv", "r") as f:
    for row in csv.reader(f):
        row_class_lower = row[1].lower()
        match = classes_lower_notfound.get(row_class_lower)
        if (match is not None):
            class_root_ids[match] = row[0]
            del classes_lower_notfound[row_class_lower]
            if (len(classes_lower_notfound) == 0):
                print("Class name -> root ID mapping done")
                break

print(class_root_ids)
if len(classes_lower_notfound):
    raise ValueError(
        f"IDs not found for these class names: {[v for (k,v) in classes_lower_notfound.items()]}"
    )

Class name -> root ID mapping done
{'Tire': '/m/0h9mv', 'Vehicle registration plate': '/m/01jfm_'}


Next, we recurse down the ontology from these root classes to capture any child classes.

(Note that "Tire" and "Vehicle registration plate" are actually leaf nodes in OpenImages v4, as the output below confirms, but other common demo classes like "bird" are not.)

In [8]:
with open(f"{data_raw_prefix}/labels-hierarchy.json", "r") as f:
    hierarchy = json.load(f)

def get_all_subclasses(class_id, tree):
    """Get the set of `class_id` and all matching subclasses from hierarchy `tree`"""
    def all_subtree_class_ids(subtree):
        if ("Subcategory" in subtree):
            return set([subtree["LabelName"]]).union(
                *[all_subtree_class_ids(s) for s in subtree["Subcategory"]]
            )
        else:
            return set([subtree["LabelName"]])
    if (tree["LabelName"] == class_id):
        return all_subtree_class_ids(tree)
    elif "Subcategory" in tree:
        return set().union(*[get_all_subclasses(class_id, s) for s in tree["Subcategory"]])
    else:
        return set()

class_id_sets = {
    name: get_all_subclasses(class_root_ids[name], hierarchy) for name in class_root_ids
}
print("Final OpenImages class ID sets:")
print(class_id_sets)

Final OpenImages class ID sets:
{'Tire': {'/m/0h9mv'}, 'Vehicle registration plate': {'/m/01jfm_'}}
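
Because both of our classes are leaf nodes, the output above does not show the recursion doing any real work. As a rough illustration, here is a minimal sketch of the same idea on a toy tree that reuses the `LabelName` / `Subcategory` keys from the OpenImages JSON. The class IDs and the `collect_subclasses` helper are hypothetical and written only for this example; the traversal is iterative rather than recursive, but should return the same kind of set as `get_all_subclasses`.

```python
# Toy hierarchy using the same "LabelName" / "Subcategory" keys as the
# OpenImages JSON. The class IDs below are hypothetical placeholders.
toy_hierarchy = {
    "LabelName": "/m/root",
    "Subcategory": [
        {
            "LabelName": "/m/bird",
            "Subcategory": [
                {"LabelName": "/m/swan"},
                {"LabelName": "/m/duck"},
            ],
        },
        {"LabelName": "/m/tire"},
    ],
}


def collect_subclasses(class_id, tree):
    """Return `class_id` plus every descendant ID found under it in `tree`."""
    found = set()
    # Each stack entry is (node, inside_target): inside_target becomes True
    # once we are at or below a node whose LabelName equals class_id.
    stack = [(tree, False)]
    while stack:
        node, inside_target = stack.pop()
        inside_target = inside_target or node["LabelName"] == class_id
        if inside_target:
            found.add(node["LabelName"])
        for child in node.get("Subcategory", []):
            stack.append((child, inside_target))
    return found


bird_ids = collect_subclasses("/m/bird", toy_hierarchy)
tire_ids = collect_subclasses("/m/tire", toy_hierarchy)
print(sorted(bird_ids))  # A non-leaf class expands to itself plus its children
print(sorted(tire_ids))  # A leaf class maps to a single-element set
```

A non-leaf class such as the toy "bird" expands to several IDs, which is why the later annotation search checks membership in a set rather than comparing against a single ID.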


## Step 3: Find suitable example images

Now that we've looked up the full range of applicable label IDs, we can use the OpenImages annotations to find which image IDs will be useful to train on (i.e. they contain tires and/or licence plates).

We deliberately scan the data set in deterministic order and collect `N_EXAMPLES_PER_CLASS` images for each label. If `BATCH_OFFSET` is non-zero, we skip ahead by that many batches so that each group member is allocated different images.

In [9]:
# Skip these images with known bad quality content:
SKIP_IMAGES = {"251d4c429f6f9c39", "065ad49f98157c8d"}

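# Dict[class_name][img_id] -> list of [class_name, xmin, xmax, ymin, ymax] boxes
class_bbs = { name: defaultdict(list) for name in class_id_sets }

# BATCH_OFFSET skips earlier batches: collect enough images to cover every batch
# up to and including ours, then keep only the last N_EXAMPLES_PER_CLASS below.
n_images_needed = N_EXAMPLES_PER_CLASS * (BATCH_OFFSET + 1)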

unfilled_class_names = set(CLASS_NAMES)
with open(f"{data_raw_prefix}/annotations-bbox.csv", "r") as f:
    for row in csv.reader(f):
        img_id, _, cls_id, conf, xmin, xmax, ymin, ymax, *_ = row
        if (img_id in SKIP_IMAGES):
            continue
        curr_unfilled_class_names = unfilled_class_names.copy()
        for name in curr_unfilled_class_names:
            if (cls_id in class_id_sets[name]):
                class_bbs[name][img_id].append([name, xmin, xmax, ymin, ymax])
                if (len(class_bbs[name]) >= n_images_needed):
                    unfilled_class_names.remove(name)
                    
if (len(unfilled_class_names)):
    print(
        "WARNING: Found fewer than ("
        + f"{N_EXAMPLES_PER_CLASS}x{BATCH_OFFSET+1}={n_images_needed}"
        + ") requested images for the following classes:\n"
        + "\n".join([f"{name} ({len(class_bbs[name])} images)" for name in unfilled_class_names])
    )

bbs = defaultdict(list)
for class_name in class_bbs:
    # Take last N_EXAMPLES_PER_CLASS images from each class (for BATCH_OFFSET)
    class_bbs_all_unfiltered = list(class_bbs[class_name].items())
    class_bbs_batch = class_bbs_all_unfiltered[-N_EXAMPLES_PER_CLASS:]
    class_bbs[class_name] = defaultdict(list, class_bbs_batch)
    # Concatenate each class together into the overall `bbs` set
    for (img_id, boxes) in class_bbs_batch:
        bbs[img_id] = bbs[img_id] + boxes

image_ids = bbs.keys()
n_images = len(image_ids)
print(f"Selected {n_images} images")

Selected 38 images
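
To make the `BATCH_OFFSET` windowing concrete, here is a minimal sketch of the selection logic applied to a plain list. The `select_batch` helper and the toy image IDs are hypothetical and exist only for illustration; the cell above applies the same "collect, then keep the tail" idea per class.

```python
def select_batch(items, n_per_batch, batch_offset):
    """Pick the window of `items` that belongs to batch number `batch_offset`."""
    n_needed = n_per_batch * (batch_offset + 1)
    collected = items[:n_needed]  # What the scan gathers before it stops
    return collected[-n_per_batch:]  # Keep only the final batch-sized window


# Hypothetical image IDs, in the deterministic order of the scan
toy_ids = [f"img{i:02d}" for i in range(10)]

batch_0 = select_batch(toy_ids, n_per_batch=3, batch_offset=0)
batch_1 = select_batch(toy_ids, n_per_batch=3, batch_offset=1)
print(batch_0)
print(batch_1)
```

Note that the windows only stay disjoint while enough images exist: in this toy list of 10 items, offsets 2 and 3 would overlap, which is the situation the warning in the cell above is meant to flag.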


## Step 4: Upload images and manifest file to S3

We need our training image data in an accessible S3 bucket, along with a [manifest](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-input.html) file that tells SageMaker Ground Truth (and later our model) which images are in the data set and where to find them.

In the following cell, we:

* Copy each identified image directly from the OpenImages repository to our bucket
* Build up a local manifest file listing all the images
* Upload the manifest file to the bucket

This process should only take a few seconds with small data sets like we're dealing with here.

In [10]:
os.makedirs(f"{data_batch_prefix}/manifests", exist_ok=True)
input_manifest_loc = f"{data_batch_prefix}/manifests/input.manifest"

with open(input_manifest_loc, "w") as f:
    print("Copying images", end="")
    # TODO: Delete existing folder contents?
    for image_id in image_ids:
        print(".", end="")
        dest_key = f"{data_batch_prefix}/images/{image_id}.jpg"
        bucket.copy(
            {
                "Bucket": "open-images-dataset",
                "Key": f"test/{image_id}.jpg"
            },
            dest_key
        )
        f.write(json.dumps({ "source-ref": f"s3://{BUCKET_NAME}/{dest_key}" }) + "\n")
    print("")
    print(f"Images copied to s3://{BUCKET_NAME}/{data_batch_prefix}/images/")

bucket.upload_file(input_manifest_loc, input_manifest_loc)
print(f"Manifest uploaded to s3://{BUCKET_NAME}/{input_manifest_loc}")

Copying images......................................
Images copied to s3://vtg-20200124-gt-demo/data/my-annotations/images/
Manifest uploaded to s3://vtg-20200124-gt-demo/data/my-annotations/manifests/input.manifest
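
The manifest written above is a JSON Lines file: each line is a standalone JSON object with a `source-ref` key pointing at one image. The sketch below builds and re-parses such a manifest in memory so the format is easy to inspect. The bucket name, keys, and the `parse_manifest` helper are hypothetical placeholders, not part of this demo's setup.

```python
import json

# Hypothetical bucket and object keys, for illustration only
example_bucket = "example-bucket"
example_keys = [
    "data/my-annotations/images/aaa.jpg",
    "data/my-annotations/images/bbb.jpg",
]

# Build the manifest text: one standalone JSON object per line
manifest_text = "".join(
    json.dumps({"source-ref": f"s3://{example_bucket}/{key}"}) + "\n"
    for key in example_keys
)


def parse_manifest(text):
    """Parse JSON Lines manifest text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = parse_manifest(manifest_text)
print(manifest_text, end="")
print([record["source-ref"] for record in records])
```

Reading a manifest back like this can be a quick sanity check that every entry points at the bucket and prefix you expect before creating the labelling job.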


## Step 5: Set up the SageMaker Ground Truth labelling job

Now that our images and a manifest file listing them are ready in S3, we'll set up the Ground Truth labelling job **in the [AWS console](https://console.aws.amazon.com)**.

Under *Services* go to *Amazon SageMaker*, and select *Ground Truth > Labeling Jobs* from the side-bar menu on the left.

**Note:** These steps assume you've either never used SageMaker Ground Truth before, or have already set up a Private Workforce that will be suitable for this task. If you have one or more private workforces configured already, but none of them are appropriate for this task, you'll need to go to *Ground Truth > Labeling workforces* **first** to create a new one.

### Job Details

Click the **Create labeling job** button, and you'll be asked to specify job details as follows:

* **Job name:** Choose a name to identify this labelling job, e.g. `tires-and-plates-batch-0`
* **Label name (The override checkbox):** Consider overriding this to `labels`
* **Input data location:** The path to the input manifest file in S3 (see output above)
* **Output data location:** Set this just to the parent folder of the input manifests (e.g. *s3://gt-object-detect-thewsey-us-east-1/data/my-annotations*)
* **IAM role:** If you're not sure whether your existing roles have the sufficient permissions for Ground Truth, select the options to create a new role
* **Task type:** Image > Bounding box

<img src="BlogImages/JobDetailsIntro.png"/>

All other settings can be left as default. Record your choices for the label name and output data location below, because we'll need these later:

In [11]:
my_groundtruth_job_name = 'vtg-demo-2'
my_groundtruth_output = f"s3://{BUCKET_NAME}/data/my-annotations" # TODO: **No trailing slash!**
my_groundtruth_labels = "labels" # TODO: Check this matches yours

### Workers

On the next screen, we'll configure **who** will annotate our data: Ground Truth allows you to define your own in-house *Private Workforces*; use *Vendor Managed Workforces* for specialist tasks; or use the public workforce provided by *Amazon Mechanical Turk*.

Select **Private** worker type, and you'll be prompted either to select from your existing private workforces, or create a new one if none exist.

If you need to create a new private workforce, simply follow the UI workflow with default settings. It doesn't matter what you call the workforce, and you can create a new Cognito User Group to define it. **Add yourself** to the user pool by adding your email address: you should shortly receive a confirmation email with a temporary password and a link to access the annotation portal.

Automatic data labeling is applicable only for data sets over 1000 samples, so leave this turned **off** for now.

<img src="BlogImages/SelectPrivateWorkforce.png"/>

### Labeling Tool

Since you'll be labelling the data yourself, a brief description of the task should be fine in this case. When using real workforces, it's important to be really clear in this section about the task requirements and best practices - to ensure consistency of annotations between human workers.

For example: in the common case where we see a vehicle from the side and one tire is almost entirely obscured, how should the image be annotated? Should the plates on *model* cars count, or only real ones?

The most important configuration here is to set the *options* to be the same as our `CLASS_NAMES` and in the same order: **Tire, Vehicle registration plate**

<img src="BlogImages/LabellingToolSetup.png"/>

Take some time to explore the other options for configuring the annotation tool; and when you're ready click "Create" to launch the labeling job.

## Step 6: Label those images!

Follow the link you received in your workforce invitation email to the workforce's **labelling portal**, and log in with the default password given in the email (which you'll be asked to change).

If you lose the portal link, you can always retrieve it through the *Ground Truth > Labeling Workforces* menu in the SageMaker console, near the top of the private workforces summary.

New jobs can sometimes take a minute or two to appear for workers, but you should soon see a screen like the below. Select the job and click "Start working" to enter the labelling tool.

<img src="BlogImages/LabellingJobsReady.png"/>

Note that you can check on the progress of labelling jobs through the APIs as well as in the AWS console:

In [12]:
smclient.describe_labeling_job(LabelingJobName=my_groundtruth_job_name)['LabelingJobStatus']

'Completed'
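
If you would rather wait on the job from the notebook than re-run the cell above by hand, a simple polling loop is one option. The following is a minimal sketch, not part of the original demo: the `wait_for_labeling_job` helper, its default intervals, and the timeout behaviour are all assumptions, while `describe_labeling_job`, `LabelingJobStatus`, and `LabelCounters` come from the SageMaker API.

```python
import time

# Terminal values of LabelingJobStatus in the SageMaker API
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_labeling_job(client, job_name, poll_seconds=30, max_polls=120):
    """Poll describe_labeling_job until the job reaches a terminal status.

    `client` is expected to be a boto3 SageMaker client. Returns the final
    status string, or raises TimeoutError after `max_polls` attempts.
    """
    for _ in range(max_polls):
        desc = client.describe_labeling_job(LabelingJobName=job_name)
        status = desc["LabelingJobStatus"]
        counters = desc.get("LabelCounters", {})
        print(
            f"{status}: {counters.get('TotalLabeled', 0)} labeled, "
            f"{counters.get('Unlabeled', 0)} unlabeled"
        )
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Labeling job {job_name} not finished after {max_polls} polls")


# Example usage in this notebook (requires AWS credentials, so left commented out):
# final_status = wait_for_labeling_job(smclient, my_groundtruth_job_name)
```

Since you are the annotator here, the loop will only finish once you have labelled every image in the portal, so it is probably most useful when a group or an external workforce is doing the labelling.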

Label all the images in the tool by selecting the class and drawing boxes around the objects, and when done you will be brought back to the (now empty) jobs list screen above.

It may take a few seconds after completing for the job status to update in the AWS console.