# Introduction

This sample notebook takes you through an end-to-end workflow that demonstrates the functionality of SageMaker Ground Truth and Amazon Rekognition Custom Labels.

In [2]:
import datetime
import tarfile
import boto3
import os
from sagemaker import get_execution_role
import sagemaker

## Setup buckets with images
#### A dataset source bucket and working bucket
- Set up the source and destination bucket
- Create your unique working bucket name
- Copy sample images to your working bucket
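
The cells below refer to a `my_bucket` variable (Jupyter expands `$my_bucket` inside `!` shell commands), but the notebook does not show where it is defined. A minimal sketch of one way to build such a name, assuming a date-project-initials scheme like the `2022-1-6-smartstream-pvt2` name in the sample output; the helper `make_bucket_name` and its parameters are hypothetical and not part of the original notebook:

```python
import datetime
import re


def make_bucket_name(initials, project="smartstream", today=None):
    """Build a name such as '2022-1-6-smartstream-pvt2' (assumed naming scheme)."""
    today = today or datetime.date.today()
    name = f"{today.year}-{today.month}-{today.day}-{project}-{initials}".lower()
    # S3 bucket names: 3-63 characters, lowercase letters, digits, hyphens and
    # dots, starting and ending with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", name):
        raise ValueError(f"not a valid S3 bucket name: {name!r}")
    return name


# Fixed date used here only so the example is reproducible.
my_bucket = make_bucket_name("pvt2", today=datetime.date(2022, 1, 6))
print(my_bucket)
```

Validating the name locally only catches formatting problems; whether the name is globally unique is still decided by S3 when the bucket is created.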

In [4]:
# Your bucket name needs to be globally unique, so use the following name and add your initials
!aws s3 mb 's3://'$my_bucket --region us-east-1

make_bucket: 2022-1-6-smartstream-pvt2


In [None]:
# Copy the raw images from the dataset source bucket to your S3 working bucket
!aws s3 cp $dataset_bucket 's3://'$my_bucket \
        --recursive --source-region us-east-1 --region us-east-1

## Detect Objects Using Amazon Rekognition
#### Attach IAM Managed Policy 
- Click on the generated URL
- Click on **Attach policies** button
- Search for **Rekog** in **Filter policies** bar
- Select **AmazonRekognitionFullAccess** and click on **Attach Policy**

In [6]:
role_name = get_execution_role().split('/')[-1]  # the role name is the last segment of the role ARN
job_url = "https://console.aws.amazon.com/iam/home?#/roles/"+role_name+"?section=permissions"
print (job_url) 

https://console.aws.amazon.com/iam/home?#/roles/AmazonSageMaker-ExecutionRole-20220105T223834?section=permissions


<img src="../lab-images/15.png">
<img src="../lab-images/16.png">
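
The console steps above can also be expressed as an API call. The following is a sketch only, assuming the notebook's role is itself allowed to call `iam:AttachRolePolicy` (many execution roles are not, in which case the console route is the one to use). The helper `attach_policy_request` is hypothetical; `attach_role_policy` is the boto3 IAM client method.

```python
REKOGNITION_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonRekognitionFullAccess"


def attach_policy_request(role_name):
    """Build the keyword arguments for iam.attach_role_policy (hypothetical helper)."""
    return {"RoleName": role_name, "PolicyArn": REKOGNITION_POLICY_ARN}


request = attach_policy_request("AmazonSageMaker-ExecutionRole-20220105T223834")
print(request)

# In the notebook, with the role_name computed in the cell above:
# import boto3
# boto3.client("iam").attach_role_policy(**attach_policy_request(role_name))
```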

#### Assume Role
- Click on the generated URL
- Click on **Edit trust relationship** button
- Edit the policy so that **Service** is **["sagemaker.amazonaws.com","rekognition.amazonaws.com"]**
- Click on **Update Trust Policy** button

In [7]:
job_url = "https://console.aws.amazon.com/iam/home?#/roles/"+role_name+"?section=trust"
print (job_url) 

https://console.aws.amazon.com/iam/home?#/roles/AmazonSageMaker-ExecutionRole-20220105T223834?section=trust


<img src="../lab-images/17.png" width="600">
<img src="../lab-images/18.png" width="600">
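
For reference, here is a sketch of what the edited trust policy document could look like when both services are allowed to assume the role. The helper `build_trust_policy` is hypothetical, and the statement layout is the standard IAM trust policy shape rather than a copy of the screenshots above.

```python
import json


def build_trust_policy(services=("sagemaker.amazonaws.com", "rekognition.amazonaws.com")):
    """Return a trust policy that lets the given AWS services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": list(services)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust_policy_json = json.dumps(build_trust_policy(), indent=2)
print(trust_policy_json)

# Applying it through the API instead of the console (requires IAM permissions):
# boto3.client("iam").update_assume_role_policy(
#     RoleName=role_name, PolicyDocument=trust_policy_json)
```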

---
# Ground Truth labeling job



Part or all of your images will be annotated by human annotators. It is essential to provide good instructions that help the annotators give you the annotations you want. Good instructions are:

- **Concise.** We recommend limiting verbal/textual instruction to two sentences, and focusing on clear visuals.
- **Visual.** In the case of image classification, we recommend providing one labeled image in each of the classes as part of the instruction.

When used through the AWS Console, Ground Truth helps you create the instructions using a visual wizard.

### Create Labeling Workforce
- Select **'Labeling workforces'** then click the **'Private'** tab.
- On the **'Private'** tab click **'Create private team'**

![labeling workforce](../lab-images/1a.png)

On the **'Create private team'** page
- Enter the **'Team name'** as **Labeling-experts**
- Click **Create private team**

![Create private team](../lab-images/1b.png)

Select **Invite new workers**
![Create private team](../lab-images/1c.png)

On the **Add workers by email address** page 
- Add your **email address** to invite private annotators to access the job. For the purpose of this exercise, you can use your own email address. Typically, this will be the list of email addresses of workers in your organization.
- Click **Invite new workers**

![Create private team](../lab-images/1d.png)

### Ground Truth Label Job
In the left hand menu select **Labeling Job**
![Create Labeling Job](../lab-images/1.png)

Click **Create labeling job**
![Label Job](../lab-images/2.png)

### Specify Job Parameters
- Specify Job Name - **'smartstream-evm-labeling-job'**
- Check the box next to "I want to specify a label attribute name different from the labeling job name."
- Specify a value of **'labels'** in the "Label attribute name" field
- Under "Input data setup" select "**Automated data setup**"
- For "S3 location for input datasets" specify the S3 location of images - **'s3://{my-bucket}/'**
- Next select "Specify a new location" under "S3 location for output datasets" and specify the output location for annotated data - **'s3://{my-bucket}/annotated-data/'**
- For "Data type" select "images"

**Note:** When you see **{my-bucket}**, replace it with the name of the bucket that you created earlier.

<img src="../lab-images/labelingJob.png" >

### Create IAM Role
- Select the option to **create a new role**
- Specify S3 Bucket Name - **'{my-bucket}'** 
- Click on **Create** button

<img src="../lab-images/4.png" width="600">
<img src="../lab-images/5.png" width="600">

### Complete Data setup
- Click on "**Complete Data Setup**". This creates the image manifest file and updates the S3 input location path. Wait for "**Input data connection successful**"

<img src="../lab-images/completedatasetup.png">
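
The manifest that the data setup step generates is a JSON Lines file in which each line points to one image through a `source-ref` key. As an illustration only, this sketch builds the same kind of file by hand; the helper `build_input_manifest` and the sample bucket and key names are hypothetical.

```python
import json


def build_input_manifest(bucket, image_keys):
    """Return JSON Lines text with one {"source-ref": ...} object per image."""
    return "".join(
        json.dumps({"source-ref": f"s3://{bucket}/{key}"}) + "\n"
        for key in image_keys
    )


manifest = build_input_manifest("my-bucket", ["img_001.jpg", "img_002.jpg"])
print(manifest, end="")
```

Because every line is an independent JSON object, the file can be inspected or filtered line by line without loading the whole dataset.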


### Additional Configuration
- Expand **Additional Configuration** 
- Validate that **Full dataset** is selected. (This setting controls whether the labeling job receives all of the images or only a subset chosen by filters or random sampling.)

<img src="../lab-images/6.png" width="600">

### Labeling Task 
- From the **Task type** drop-down, select **Image**, because you will be annotating images

<img src="../lab-images/7.png" width="600">

### Task Selection
- This is an object detection use case, so select the **Bounding box** option
- Leave other options as default and click on **Next** button

<img src="../lab-images/8.png" width="600">

### Select workers and configure tool
- Select **Private** in **Worker types**. For this lab, you will select an internal workforce to annotate the images. Depending on your use case, you can instead select a public contractual workforce (**Amazon Mechanical Turk**) or a partner workforce (**Vendor managed**).
- In **Private teams** select the team name - **'Labeling-experts'**

<img src="../lab-images/9.png" width="600"> 

### Labeling Instructions Template
- Leave other configurations default and scroll down to **Bounding box labeling tool**
- Add two labels as shown below - **'hole'** and **'no_hole'**
- Add detailed instructions for the workers in the **Description tab**. For example, you can specify: **Label each woodpecker hole in the provided image. Select the label 'hole' and draw the box so that it just fits the woodpecker hole, which improves the quality of the label data. Also label areas that look similar to woodpecker holes but are not woodpecker holes with the label 'no_hole'.**
- You can also *optionally* provide examples of good and bad labeling images. Make sure that these images are publicly accessible.
- Click on **Create** button


### Start Labeling Job
- Once you have successfully created the job, its **status** is **"InProgress"**. This means that the job has been created and the private workforce has been notified via **email** about the task assigned to them. Because you assigned the task to yourself in this case, you should have received an email with instructions to log in to the Ground Truth labeling project
- **Open the email** and click on the **link** provided
- Enter the **username** and **password** provided in the email. You may have to replace the one-time password provided in the email with a new password after you log in
- After you log in, you will see the screen below
- Click on **Start working** button

<img src="../lab-images/11.png" width="700">

### Labeling Task
- You can use the provided tools to **Zoom in**, **Zoom out**, **Move** and **Box** sections in the images.
- First select a **label**, either **evm_ok** or **evm_alert**, and then draw a box in the image to annotate it.
- Once you have annotated the required objects, click on the **Submit** button

### Complete Labeling Task
- Make sure that the bounding box is just large enough to enclose the object of interest
- Every time you draw a bounding box, first select the label in the right panel and then draw the box around the object

<img src="../lab-images/evm_ok.png" >
<img src="../lab-images/evm_alert.png" >
<img src="../lab-images/evm_2_ok.png" >

<img src="../lab-images/13.png">

###  Check Labeling Job Status
A Ground Truth job can take a few hours to complete (if your dataset is larger than 10,000 images, it can take much longer than that!). One way to monitor the job's progress is through the AWS Console. In this notebook, we will query the job status through the SageMaker API instead.

You can re-evaluate the next cell repeatedly. It sends a `describe_labeling_job` request which should tell you whether the job is completed or not. If it is, then 'LabelingJobStatus' will be 'Completed'.

In [None]:
job_name = 'smartstream-evm-labeling-job' 
sagemaker_client = boto3.client('sagemaker')
sagemaker_client.describe_labeling_job(LabelingJobName=job_name)['LabelingJobStatus']
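
Beyond the status string, the `describe_labeling_job` response also carries a `LabelCounters` field that shows how far the annotators have progressed. A small sketch of reading both, assuming the response shape documented for boto3; the helper `summarize_labeling_job` and the hand-written sample response are hypothetical.

```python
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def summarize_labeling_job(response):
    """Return (status, finished, progress_text) from a describe_labeling_job response."""
    status = response["LabelingJobStatus"]
    counters = response.get("LabelCounters", {})
    labeled = counters.get("TotalLabeled", 0)
    unlabeled = counters.get("Unlabeled", 0)
    progress = f"{labeled} labeled, {unlabeled} unlabeled"
    return status, status in TERMINAL_STATUSES, progress


# Hand-written sample; in the notebook, pass
# sagemaker_client.describe_labeling_job(LabelingJobName=job_name) instead.
sample = {
    "LabelingJobStatus": "InProgress",
    "LabelCounters": {"TotalLabeled": 3, "Unlabeled": 7},
}
print(summarize_labeling_job(sample))
```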

### Inspect Labeled Data Sets

In [None]:
region = boto3.session.Session().region_name  # region of the current session
job_url = "https://"+region+".console.aws.amazon.com/sagemaker/groundtruth?region="+region+"#/labeling-jobs/details/"+job_name
print (job_url)

### Labeled Data Sets
- Once you have labeled all the images, you will be taken to the SageMaker labeling project home page. This page shows you the **Labeled dataset** as shown below
- You can see how the different labels are applied. Now, training data for Amazon Rekognition Custom Labels is ready.
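
The labeled dataset is also written to the output manifest as JSON Lines, so it can be inspected in code. This sketch counts bounding boxes per label, assuming the bounding-box output format of Ground Truth and the label attribute name `labels` chosen earlier. The helper `count_boxes_per_label` and the sample record are hypothetical.

```python
import json
from collections import Counter


def count_boxes_per_label(manifest_lines, label_attribute="labels"):
    """Count bounding boxes per class name across output-manifest lines."""
    counts = Counter()
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        annotations = record.get(label_attribute, {}).get("annotations", [])
        # The metadata entry maps numeric class ids to label names.
        class_map = record.get(f"{label_attribute}-metadata", {}).get("class-map", {})
        for box in annotations:
            class_id = str(box["class_id"])
            counts[class_map.get(class_id, class_id)] += 1
    return counts


sample_line = json.dumps({
    "source-ref": "s3://my-bucket/img_001.jpg",
    "labels": {
        "image_size": [{"width": 640, "height": 480, "depth": 3}],
        "annotations": [
            {"class_id": 0, "top": 10, "left": 20, "width": 50, "height": 40},
            {"class_id": 0, "top": 100, "left": 200, "width": 50, "height": 40},
            {"class_id": 1, "top": 300, "left": 20, "width": 60, "height": 30},
        ],
    },
    "labels-metadata": {"class-map": {"0": "evm_ok", "1": "evm_alert"}},
})
print(count_boxes_per_label([sample_line]))
```

A heavily skewed count is worth noticing before training, since the model sees few examples of the rarer label.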

# Review

We covered a lot of ground in this notebook! Let's recap what we accomplished. We uploaded images to an S3 bucket and used a SageMaker Ground Truth labeling job to label all of the images in our dataset.

---
## Now that the images have been labeled, we can use Rekognition Custom Labels to train a model

- Go to the console and type "Rekog" in the services search bar. **Rekognition** appears in the results.

<img src="../lab-images/rekog-console.png" width="600">

### Select Custom Labels

<img src="../lab-images/rekog-customlabel1.png" width="800">

### Create project
#### Enter the project name: smartstream-evm-custom-label-date
<img src="../lab-images/rekog-project-name-create.png" >

### Select create dataset
<img src="../lab-images/rekog-dataset-create.png" >

### You will use the labeled images
- Select: "Import images labeled by SageMaker Ground Truth"
- For the manifest file location, enter `<bucket-name>/annotated-data/labeling-experts/manifest/output/output.manifest`
<img src="../lab-images/rekog-dataset-create2.png" >

### Review of the labeled dataset 

<img src="../lab-images/rekog-labeled-dataset.png" >

### Train a model
- Select: "Train model" button

<img src="../lab-images/rekog-train-model.png" >

### Training and test dataset split
- Select: "Train model"

<img src="../lab-images/rekog-training-dataset-split.png" >

### Training in progress (this can take over an hour)

<img src="../lab-images/rekog-training-in-progress.png" >
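
Training progress can also be checked from the notebook instead of refreshing the console. A sketch that reads the status from a `describe_project_versions` response, assuming the boto3 Rekognition response shape; the helper `training_status`, the sample response, and the `project_arn` and `version_name` variables in the comment are hypothetical.

```python
def training_status(response):
    """Return (status, message) for the first model version in the response."""
    descriptions = response.get("ProjectVersionDescriptions", [])
    if not descriptions:
        raise LookupError("no matching model version found")
    first = descriptions[0]
    return first["Status"], first.get("StatusMessage", "")


# Hand-written sample; in the notebook, pass the result of
# boto3.client("rekognition").describe_project_versions(
#     ProjectArn=project_arn, VersionNames=[version_name]) instead.
sample = {
    "ProjectVersionDescriptions": [
        {"Status": "TRAINING_IN_PROGRESS", "StatusMessage": "The model is being trained."}
    ]
}
print(training_status(sample))
```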

### Training Result

<img src="../lab-images/rekog-model-evaluation.png" >

### How to use the model for inference

<img src="../lab-images/rekog-how2-use-model.png" >
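
As a rough preview of what the console snippet does, the sketch below builds a `detect_custom_labels` request for an image stored in S3 and extracts label names and confidences from a response. It assumes the model version has already been started with `start_project_version`. The helpers, the sample ARN, and the sample response are hypothetical.

```python
def build_detect_request(model_arn, bucket, key, min_confidence=50.0):
    """Build keyword arguments for rekognition.detect_custom_labels."""
    return {
        "ProjectVersionArn": model_arn,
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MinConfidence": min_confidence,
    }


def labels_from_response(response):
    """Return (label name, confidence) pairs from a detect_custom_labels response."""
    return [
        (label["Name"], round(label["Confidence"], 1))
        for label in response.get("CustomLabels", [])
    ]


request = build_detect_request("arn:aws:rekognition:example", "my-bucket", "img_001.jpg")
sample_response = {"CustomLabels": [{"Name": "evm_ok", "Confidence": 93.256}]}
print(request)
print(labels_from_response(sample_response))

# In the notebook, once the model is running:
# response = boto3.client("rekognition").detect_custom_labels(**request)
```

A running model is billed for the time it stays started, so stop it with `stop_project_version` when you finish testing.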

---
## Our next workshop covers how to use the model for inference