
Amazon SageMaker custom object detection workflows with active learning

Active learning is the process of having humans label a fraction of an unlabeled dataset, which is then used to train a model. The model then runs inference on the remaining data, assigning a label and a confidence score to each record. Records labeled with low confidence are sent to humans for labeling, and those human labels are used to retrain the model so that it can relabel the low-confidence data more accurately. This loop continues until the entire dataset is labeled with high confidence.

In short, in active learning difficult data objects are sent to human workers to be annotated and easy data objects are automatically labeled with machine learning (automated labeling or auto-labeling).
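The split described above can be sketched in a few lines of Python. The confidence threshold and the prediction tuples below are illustrative assumptions, not values taken from this workflow.

```python
CONFIDENCE_THRESHOLD = 0.5  # hypothetical auto-labeling cutoff


def split_by_confidence(predictions, threshold=CONFIDENCE_THRESHOLD):
    """Partition predictions into auto-labeled records and records
    that need human review.

    predictions: list of (record_id, label, confidence) tuples.
    """
    auto_labeled, needs_human = [], []
    for record_id, label, confidence in predictions:
        if confidence >= threshold:
            auto_labeled.append((record_id, label))  # accept machine label
        else:
            needs_human.append(record_id)  # route to human workers
    return auto_labeled, needs_human


# Two confident predictions are auto-labeled; one goes to humans.
preds = [("img1", "dog", 0.92), ("img2", "cat", 0.31), ("img3", "dog", 0.88)]
auto, human = split_by_confidence(preds)
# auto  -> [("img1", "dog"), ("img3", "dog")]
# human -> ["img2"]
```

In the real workflow, the human-review subset becomes the input of the next Ground Truth labeling job, and the resulting labels feed the next training round.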

Amazon SageMaker Ground Truth makes it easy and inexpensive to build accurately labeled machine learning (ML) datasets, and active learning can be used with Ground Truth to reduce labeling cost. This post shows how to create an active learning workflow for an object detection (bounding box) task, using a SageMaker built-in algorithm for the training and inference steps in that workflow. You can use this example as a starting point for active learning and auto-annotation with a custom labeling job.

Prerequisites and necessary setup

To create a custom object detection active learning workflow using this post, complete the following prerequisites:

  • Create an AWS account.
  • Create an Amazon SageMaker Jupyter notebook instance. We created and tested the workflow on an ml.m4.xlarge instance.
  • Create an IAM role with the permissions required to complete this workflow and attach it to the notebook instance you are using. Ensure that the IAM role has the following AWS managed policies attached:
    • IAMFullAccess
    • CloudWatchFullAccess
    • AWSLambda_FullAccess
    • AWSStepFunctionsFullAccess
    • AmazonSageMakerFullAccess
    • AWSCloudFormationFullAccess
    • AmazonS3FullAccess
  • Ensure that CORS is enabled on the S3 bucket you are going to use; otherwise, private workers will not be able to perform the labeling jobs assigned to them. You can enable CORS on the S3 bucket by following the link Enable CORS.
  • Install AWS SAM CLI in the notebook instance.
  • Familiarity with Amazon SageMaker labeling, training and batch transform; AWS CloudFormation; and Step Functions.
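The CORS prerequisite above can also be applied programmatically with boto3. This is a sketch under assumptions: the bucket name is a placeholder, and the rule set (open origins, GET only) is a minimal example rather than the exact configuration this repository uses.

```python
# Minimal CORS rules allowing the Ground Truth worker UI (a browser
# application) to issue cross-origin GET requests against the bucket.
CORS_CONFIGURATION = {
    "CORSRules": [
        {
            "AllowedMethods": ["GET"],
            "AllowedOrigins": ["*"],
            "AllowedHeaders": ["*"],
        }
    ]
}


def enable_cors(bucket_name):
    """Apply the CORS configuration above to the given S3 bucket."""
    import boto3  # imported here so the rules can be inspected without boto3

    boto3.client("s3").put_bucket_cors(
        Bucket=bucket_name, CORSConfiguration=CORS_CONFIGURATION
    )


# enable_cors("my-active-learning-bucket")  # hypothetical bucket name
```

Running `enable_cors` requires AWS credentials with `s3:PutBucketCORS` permission on the bucket.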

Use the active-learning-object-detection.ipynb notebook in an Amazon SageMaker Jupyter notebook instance to create the resources needed for your active learning workflow.

Launching the workflow in a region other than us-east-1

The active-learning-object-detection.ipynb notebook demonstrates launching the CloudFormation stack in the us-east-1 region. To use any other region, launch the stack in your preferred region as described in the notebook, and then make the following edits:

  1. On the AWS Lambda console, locate the Lambda function named *-PrepareForTraining-<###>, where * is the name you used when you launched your CloudFormation stack and <###> is a string of letters and numbers.
  2. Identify the existing algorithm specification in the function code.
  3. Change the value of the TrainingImage key to the training image URI of the SageMaker object detection algorithm for your region. Check the following link to get the training image URI: Sagemaker-algo-docker-registry. After editing, save the Lambda function and choose Deploy.
  4. On the AWS Step Functions console, choose State Machines.
  5. Select the radio button next to ActiveLearning-*.
  6. Choose Edit.
  7. Look for the Inference post-processing step state. This step is responsible for invoking the SageMaker processing job that performs the necessary post-processing on the output generated by the object detection model.
  8. In this step, look for the AppSpecification block.
  9. Change the value of the ImageUri key to the pre-built scikit-learn container image URI for your region. Check the following link to get the container image URI: Sagemaker Pre-built Sklearn container.
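For reference, the two fields edited in steps 2-3 and 8-9 follow the shapes of the SageMaker CreateTrainingJob AlgorithmSpecification and CreateProcessingJob AppSpecification structures. The dicts below are illustrative only: the image URIs shown are the usual us-east-1 registry paths, and the account IDs, tags, and input mode must be replaced with the values for your region (see the links in the steps above).

```python
# Step 2-3: the algorithm specification inside the *-PrepareForTraining
# Lambda function. TrainingImage is the region-specific ECR path of the
# SageMaker built-in object detection algorithm.
algorithm_specification = {
    "TrainingImage": "811284229777.dkr.ecr.us-east-1.amazonaws.com/object-detection:1",
    "TrainingInputMode": "Pipe",  # example value; File mode is also possible
}

# Step 8-9: the AppSpecification inside the "Inference post-processing"
# state of the Step Functions state machine. ImageUri is the region-specific
# ECR path of the pre-built scikit-learn container.
app_specification = {
    "ImageUri": "683313688378.dkr.ecr.us-east-1.amazonaws.com/"
    "sagemaker-scikit-learn:0.20.0-cpu-py3",
}
```

In both cases only the ECR registry portion (account ID and region) changes between regions; the repository name stays the same.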
