Training computer vision models for autonomous driving to achieve high-end performance requires large labeled datasets, which can be prohibitively expensive to produce. This project shows an end-to-end pipeline that streamlines the labeling of driving scene datasets for a given task, using SageMaker Ground Truth automated labeling and active learning. The task we focus on is pedestrian detection in camera images.
The project focuses on an object detection task where the goal of the final trained model is to predict 2D bounding boxes around pedestrians in an image.
The active learning pipeline ensures that, starting with just a handful of labels, we can train a model, predict bounding boxes, compare the model's predictions to real labels, and add more labels only if we need to improve the model's performance.
Using Step Functions, we can create a workflow that automates this process and iteratively performs the active learning loop.
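The loop described above can be sketched in plain Python. This is a minimal illustration and not the code deployed by this project: `train` and `predict` are hypothetical stand-ins for a SageMaker training job and inference, the threshold and batch sizes are arbitrary, and the next batch is taken in order rather than chosen by a selection strategy. Predictions are compared to real labels with intersection over union (IoU), a common score for bounding boxes; the source does not state which metric the solution uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def train(labeled):
    """Hypothetical stand-in for training: the 'model' only records the label count."""
    return {"num_labels": len(labeled)}


def predict(model, true_box):
    """Hypothetical stand-in for inference: the error shrinks as labels are added."""
    err = 40.0 / (1 + model["num_labels"])
    x0, y0, x1, y1 = true_box
    return (x0 + err, y0 + err, x1 + err, y1 + err)


def active_learning_loop(pool, validation, seed_size=5, batch_size=5,
                         target_iou=0.8, max_rounds=20):
    """Train, evaluate, and request more labels only while the score is too low."""
    labeled = list(pool[:seed_size])      # the initial handful of labels
    unlabeled = list(pool[seed_size:])
    history = []
    for _ in range(max_rounds):
        model = train(labeled)
        scores = [iou(predict(model, box), box) for box in validation]
        mean_iou = sum(scores) / len(scores)
        history.append((len(labeled), mean_iou))
        if mean_iou >= target_iou or not unlabeled:
            break                          # good enough, or nothing left to label
        # In the real pipeline this batch would go to a labeling job.
        labeled.extend(unlabeled[:batch_size])
        unlabeled = unlabeled[batch_size:]
    return history


if __name__ == "__main__":
    pool = [(0.0, 0.0, 100.0, 100.0)] * 30
    validation = [(0.0, 0.0, 100.0, 100.0), (50.0, 20.0, 150.0, 120.0)]
    for num_labels, score in active_learning_loop(pool, validation):
        print(f"labels={num_labels} mean IoU={score:.3f}")
```

In the deployed solution each iteration of this loop corresponds to a pass through the Step Functions workflow rather than a Python `for` loop.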
To get started quickly, use the following quick-launch link to create a CloudFormation stack and deploy the resources in this project.
| Region | Stack |
|---|---|
| US West (Oregon) | ![]() |
| US East (N. Virginia) | ![]() |
| US East (Ohio) | ![]() |
On the stack creation page, check the boxes to acknowledge creation of IAM resources and auto-expand, and click Create Stack.
Once the stack is created, go to the Outputs tab and click the SageMakerNotebook link. This opens the Jupyter notebook in a SageMaker notebook instance, where you can run the code.
The project architecture deployed by the CloudFormation template is shown here.
- `deployment/`
  - `template.yaml`: Creates the AWS CloudFormation stack for the solution
  - `permissions.yaml`: Creates the AWS CloudFormation stack for an IAM role with permissions for all actions needed by the solution
- `source/lambda/active-learning-1p/`: Lambda package for active learning using SageMaker built-in algorithms
- `sagemaker/artifacts/`
  - `annotations_metadata.json`: Metadata for the SageMaker Ground Truth labeling job
  - `class_labels`: Class label information for the SageMaker Ground Truth labeling job
  - `instructions.template`: SageMaker Ground Truth labeling job template for 2D bounding boxes
- `data/`: Stores image data locally
- `manifests/`: Stores manifest files
- `package/active_learning/`
  - `prepare.py`: Functions for preparing manifests and configurations for the active learning loop
  - `request.py`: Creates input requests to initiate the active learning loop
  - `step_functions.py`: Front end for the Step Functions state machine that implements the active learning loop
- `config.py`: Solution configuration
- `manifest.py`: Utilities for manipulating manifests
- `requirements.txt`: Python dependencies for the notebook
- `setup.py`: Builds the solution code as a package
- `workteam.py`: Utilities for manipulating the SageMaker Ground Truth private workteam
- `active-learning-visual-perception.ipynb`: Orchestrates the solution and triggers the active learning loop
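For readers unfamiliar with the manifests mentioned above: a Ground Truth input manifest is a JSON Lines file in which each line points at one object through a `source-ref` key, and a labeling job writes its result under the job's label attribute name. The sketch below shows how such lines might be built and split into labeled and unlabeled sets. It is not the code in `manifest.py`; the function names, the bucket, and the label attribute name `pedestrian-labels` are hypothetical.

```python
import json


def build_manifest_lines(image_uris):
    """Return one JSON string per image, keyed by `source-ref`."""
    return [json.dumps({"source-ref": uri}) for uri in image_uris]


def to_manifest_text(lines):
    """Join manifest lines into JSON Lines text, one record per line."""
    return "\n".join(lines) + "\n"


def split_by_label(manifest_text, label_attribute):
    """Split records into (labeled, unlabeled) by presence of the label attribute."""
    labeled, unlabeled = [], []
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        (labeled if label_attribute in record else unlabeled).append(record)
    return labeled, unlabeled


if __name__ == "__main__":
    # Hypothetical bucket and keys, for illustration only.
    uris = [f"s3://example-bucket/images/{i:04d}.jpg" for i in range(3)]
    text = to_manifest_text(build_manifest_lines(uris))
    labeled, unlabeled = split_by_label(text, "pedestrian-labels")
    print(len(labeled), "labeled,", len(unlabeled), "unlabeled")
```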
This project is licensed under the Apache-2.0 License.

