<a href="https://colab.research.google.com/github//pylabel-project/samples/blob/main/label_new_dataset.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>&nbsp;
<a href='https://pylabel.readthedocs.io/en/latest/?badge=latest'>
    <img src='https://readthedocs.org/projects/pylabel/badge/?version=latest' alt='Documentation Status' />
</a>

# Autolabel images with PyLabel, YOLOv5, and jupyter-bbox-widget
This notebook is a labeling tool that can be used to annotate image datasets with bounding boxes, automatically suggest bounding boxes using an object detection model, and save the annotations in YOLO, COCO, or VOC format.

The annotation interface uses the [jupyter-bbox-widget](https://github.com/gereleth/jupyter-bbox-widget). The bounding box detection uses PyTorch and a [YOLOv5](https://github.com/ultralytics/yolov5) model.
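The three export formats mentioned above store the same box in different coordinate conventions. The following is a minimal sketch of those conventions, not PyLabel's own conversion code; the helper names `voc_to_coco` and `voc_to_yolo` and the example box are made up for illustration.

```python
# Illustrative helpers (hypothetical names, not part of PyLabel).
# VOC stores pixel corners, COCO stores top-left corner plus size,
# YOLO stores the box center plus size, normalized by the image size.

def voc_to_coco(xmin, ymin, xmax, ymax):
    """VOC corners -> COCO [x, y, width, height] in pixels."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def voc_to_yolo(xmin, ymin, xmax, ymax, img_width, img_height):
    """VOC corners -> YOLO [cx, cy, w, h], each normalized to the 0..1 range."""
    cx = (xmin + xmax) / 2 / img_width
    cy = (ymin + ymax) / 2 / img_height
    w = (xmax - xmin) / img_width
    h = (ymax - ymin) / img_height
    return [cx, cy, w, h]


# A made-up box on a 640x480 image
box = (160, 120, 480, 360)
coco_box = voc_to_coco(*box)
yolo_box = voc_to_yolo(*box, img_width=640, img_height=480)
print(coco_box)
print(yolo_box)
```

Because YOLO coordinates are normalized, the image width and height are needed whenever boxes are converted to or from that format.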

In [1]:
import logging
logging.getLogger().setLevel(logging.CRITICAL)
%pip install pylabel > /dev/null

Note: you may need to restart the kernel to use updated packages.


In [2]:
from pylabel import importer

## Import Images to Create a New Dataset
In this example, no annotations have been created yet. The path should point to a directory containing the images that you want to annotate. For this demonstration, we will download a subset of the COCO dataset.

In [3]:
import os, zipfile

#Download sample yolo dataset 
os.makedirs("data", exist_ok=True)
!wget "https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip" -O data/coco128.zip
with zipfile.ZipFile("data/coco128.zip", 'r') as zip_ref:
   zip_ref.extractall("data")

--2022-01-10 20:59:01--  https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
Resolving github.com (github.com)... 192.30.255.112
Connecting to github.com (github.com)|192.30.255.112|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/264818686/7a208a00-e19d-11eb-94cf-5222600cc665?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220111%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220111T045901Z&X-Amz-Expires=300&X-Amz-Signature=be5a6a8e9e904069734e102690defe3169e6bd5fabf6f590baad342825534153&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=264818686&response-content-disposition=attachment%3B%20filename%3Dcoco128.zip&response-content-type=application%2Foctet-stream [following]
--2022-01-10 20:59:01--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/264818686/7a208a00-e19d-11eb-94cf-5222600cc665?X-Amz-Algori

In [4]:
path_to_images = "data/coco128/images/train2017"
dataset = importer.ImportImagesOnly(path=path_to_images, name="coco128")
dataset.df.head(3)

| id | img_folder | img_filename | img_path | img_id | img_width | img_height | img_depth | ann_segmented | ann_bbox_xmin | ann_bbox_ymin | ... | ann_segmentation | ann_iscrowd | ann_pose | ann_truncated | ann_difficult | cat_id | cat_name | cat_supercategory | split | annotated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 |  | 000000000612.jpg |  | 0 | 640 | 480 | 3 |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 1 |  | 000000000404.jpg |  | 1 | 426 | 640 | 3 |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 2 |  | 000000000438.jpg |  | 2 | 640 | 480 | 3 |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
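Before or after importing, it can be useful to check which image files a directory actually contains. The sketch below uses only the standard library; the helper name `list_images` and the set of file extensions are assumptions for illustration and do not reflect which file types PyLabel itself accepts.

```python
from pathlib import Path

# Assumed set of extensions; adjust to match your own dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder):
    """Return the sorted names of image files found directly inside folder."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


# Only runs the check if the sample dataset has already been downloaded.
folder = Path("data/coco128/images/train2017")
if folder.is_dir():
    print(len(list_images(folder)), "image files found")
```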


## Predict and Edit Annotations
Use the jupyter_bbox_widget to inspect, edit, and save annotations without leaving the Jupyter notebook. Press predict to autolabel images using a pretrained model. For usage instructions and keyboard shortcuts, see https://github.com/gereleth/jupyter-bbox-widget#Usage.

In [5]:
classes = ['person','boat', 'bear', "car"]
dataset.labeler.StartPyLaber(new_classes=classes)

VBox(children=(HBox(children=(Label(value='000000000612.jpg (not annotated)'),)), HBox(children=(Button(icon='…

# Instructions 
- The first image (000000000612.jpg) should show some bears. Select the bear class, draw some boxes around the bears, and then save.
- The next image (000000000404.jpg) should show a boat. Select the boat class, draw boxes around the boats, and save.
- When you see an image with an object that is not in the current list of classes, add it as a new class, draw boxes on the image using that class, and save.

At any time, run the cell below to see how many classes you have labeled in the dataset.

In [6]:
dataset.analyze.class_counts

    128
Name: cat_name, dtype: int64
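The output above shows a single blank class name with a count of 128, which suggests that images without annotations carry an empty `cat_name`. As a rough sketch of the same kind of tally, the example below counts class names with the standard library. The `rows` list is hypothetical stand-in data, not the real `dataset.df`, and the helper `class_counts` is not the PyLabel implementation.

```python
from collections import Counter

# Hypothetical stand-in for annotation rows: one dict per bounding box,
# with an empty cat_name for an image that has not been labeled yet.
rows = [
    {"img_filename": "000000000612.jpg", "cat_name": "bear"},
    {"img_filename": "000000000612.jpg", "cat_name": "bear"},
    {"img_filename": "000000000404.jpg", "cat_name": "boat"},
    {"img_filename": "000000000438.jpg", "cat_name": ""},
]


def class_counts(rows):
    """Count how many rows carry each class name (including the empty name)."""
    return Counter(row["cat_name"] for row in rows)


counts = class_counts(rows)
# Drop the empty name to keep only classes that were actually labeled.
labeled = {name: n for name, n in counts.items() if name}
print(labeled)
```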

In [7]:
dataset.df.loc[dataset.df["annotated"] == 1]

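| id | img_folder | img_filename | img_path | img_id | img_width | img_height | img_depth | ann_segmented | ann_bbox_xmin | ann_bbox_ymin | ... | ann_segmentation | ann_iscrowd | ann_pose | ann_truncated | ann_difficult | cat_id | cat_name | cat_supercategory | split | annotated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 126 |  | 000000000612.jpg |  | 0 | 640 | 480 | 3 |  |  |  | ... |  |  |  |  |  |  |  |  |  | 1 |
| 127 |  | 000000000404.jpg |  | 1 | 426 | 640 | 3 |  |  |  | ... |  |  |  |  |  |  |  |  |  | 1 |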


In [8]:
# Export the annotations in YOLO format
dataset.path_to_annotations = 'data/coco128/labels/newlabels/'
os.makedirs(dataset.path_to_annotations, exist_ok=True)
dataset.export.ExportToYoloV5()

['training/dataset.yaml',
 'training/labels/000000000612.txt',
 'training/labels/000000000404.txt']
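Each exported `.txt` file in YOLO format holds one line per box: a class index followed by the normalized center x, center y, width, and height. To inspect an exported file, a line can be converted back to pixel corners as sketched below. The helper name `yolo_line_to_pixels` and the sample line are made up for illustration; the values actually written depend on the boxes you drew.

```python
def yolo_line_to_pixels(line, img_width, img_height):
    """Parse one YOLO label line into (class_id, (xmin, ymin, xmax, ymax))."""
    class_id, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    # Undo the normalization and move from center/size to corner coordinates.
    xmin = (cx - w / 2) * img_width
    ymin = (cy - h / 2) * img_height
    xmax = (cx + w / 2) * img_width
    ymax = (cy + h / 2) * img_height
    return int(class_id), (xmin, ymin, xmax, ymax)


# Made-up label line for a 640x480 image
class_id, corners = yolo_line_to_pixels("2 0.5 0.5 0.25 0.5", 640, 480)
print(class_id, corners)
```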