---
description: How to create bounding boxes and masks on video frames in Python
---
In this tutorial, you will learn how to programmatically create classes, objects, and figures for video frames and upload them to the Supervisely platform.
Supervisely supports different types of shapes / geometries for video annotation:
- bounding box (rectangle)
- mask (also known as bitmap)
- polygon - will be covered in other tutorials
- polyline - will be covered in other tutorials
- point - will be covered in other tutorials
- keypoints (also known as graph, skeleton, landmarks) - will be covered in other tutorials
Learn more about Supervisely Annotation JSON format here.
{% hint style="info" %} Everything you need to reproduce this tutorial is on GitHub: source code, Visual Studio Code configuration, and a shell script for creating virtual env. {% endhint %}
Step 1. Prepare the `~/supervisely.env` file with credentials. Learn more here.
Step 2. Clone the repository with source code and demo data, and create a virtual environment.

```bash
git clone https://github.com/supervisely-ecosystem/video-figures
cd video-figures
./create_venv.sh
```
Step 3. Open the repository directory in Visual Studio Code.

```bash
code -r .
```
Step 4. Change the workspace ID in the `local.env` file by copying the ID from the context menu of the workspace. A new project with annotated videos will be created in the workspace you define:

```bash
WORKSPACE_ID=507 # ⬅️ change value
```
Step 5. Start debugging `src/main.py`.
```python
import os
from os.path import join

from dotenv import load_dotenv
import supervisely as sly
```
Initialize the API client for communicating with the Supervisely instance. First, we load the environment variables with credentials and the workspace ID:
```python
load_dotenv("local.env")
load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api()
```
The next lines check that you did everything right: the API client is initialized with correct credentials, and the correct workspace ID is defined in `local.env`.
```python
workspace_id = sly.env.workspace_id()
workspace = api.workspace.get_info_by_id(workspace_id)
if workspace is None:
    print("you should put correct workspaceId value to local.env")
    raise ValueError(f"Workspace with id={workspace_id} not found")
```
Create an empty project named "Demo" with one dataset "orange & kiwi" in your workspace on the server. If a project with the same name already exists in your workspace, it will be renamed automatically (Demo_001, Demo_002, etc.) to avoid name collisions.
```python
project = api.project.create(
    workspace.id,
    name="Demo",
    type=sly.ProjectType.VIDEOS,
    change_name_if_conflict=True,
)
dataset = api.dataset.create(project.id, name="orange & kiwi")
print(f"Project has been successfully created, id={project.id}")
```
```python
video_path = "data/orange_kiwi.mp4"
video_name = sly.fs.get_file_name_with_ext(video_path)
video_info = api.video.upload_path(dataset.id, video_name, video_path)
print(f"Video has been successfully uploaded, id={video_info.id}")
```
A color will be generated automatically if the class is created without the `color` argument.
```python
kiwi_obj_cls = sly.ObjClass("kiwi", sly.Rectangle, color=[0, 0, 255])
orange_obj_cls = sly.ObjClass("orange", sly.Bitmap, color=[255, 255, 0])
```
The next step is to create a `ProjectMeta` - a collection of annotation classes and tags that will be available for labeling in the project.
```python
project_meta = sly.ProjectMeta(obj_classes=[kiwi_obj_cls, orange_obj_cls])
```
And finally, we need to set up classes in our project on server:
```python
api.project.update_meta(project.id, project_meta.to_json())
```
```python
masks_dir = "data/masks"

# prepare rectangle points for 10 demo frames
points = [
    [136, 632, 350, 817],
    [139, 655, 355, 842],
    [145, 672, 361, 864],
    [158, 700, 366, 885],
    [153, 700, 367, 885],
    [156, 724, 375, 914],
    [164, 745, 385, 926],
    [177, 770, 396, 944],
    [189, 793, 410, 966],
    [199, 806, 417, 980],
]
```
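Each row above can be read as `(top, left, bottom, right)` pixel coordinates, the order in which the script later unpacks them into `sly.Rectangle(*points[fr_index])`. As a quick standalone sanity check (plain Python, no SDK required, and assuming that coordinate order), we can verify that every demo box is well-formed:

```python
# A few of the demo bounding boxes from the tutorial; each row is assumed
# to be (top, left, bottom, right) in pixels.
points = [
    [136, 632, 350, 817],
    [139, 655, 355, 842],
    [199, 806, 417, 980],
]

for top, left, bottom, right in points:
    # a well-formed box must have bottom > top and right > left
    assert bottom > top and right > left
    print(f"{right - left}x{bottom - top} px box at (row={top}, col={left})")
```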
```python
orange = sly.VideoObject(orange_obj_cls)
kiwi = sly.VideoObject(kiwi_obj_cls)
```
We are going to create ten masks from the following black and white images:
{% hint style="info" %} The mask has to be the same size as the video frame. {% endhint %}
Supervisely SDK allows creating masks from NumPy arrays with the following values:

- `0` - nothing, `1` - pixels of the target mask
- `0` - nothing, `255` - pixels of the target mask
- `False` - nothing, `True` - pixels of the target mask
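All three conventions encode the same information and can be reduced to a boolean mask. A minimal, SDK-free sketch (the helper name `to_bool_mask` is ours, not part of the Supervisely SDK; nested lists stand in for NumPy arrays):

```python
def to_bool_mask(mask):
    """Normalize a 2D mask given in any of the three supported value
    conventions -- (0, 1), (0, 255), or (False, True) -- to booleans,
    where True marks pixels of the target object."""
    return [[bool(v) for v in row] for row in mask]

# All three conventions yield the same boolean mask:
m01 = [[0, 1], [1, 0]]
m255 = [[0, 255], [255, 0]]
mbool = [[False, True], [True, False]]
assert to_bool_mask(m01) == to_bool_mask(m255) == to_bool_mask(mbool)
```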
```python
frames = []
for mask in os.listdir(masks_dir):
    fr_index = int(sly.fs.get_file_name(mask).split("_")[-1])
    mask_path = join(masks_dir, mask)

    # orange will be labeled with a mask.
    # supports masks with values (0, 1) or (0, 255) or (False, True)
    bitmap = sly.Bitmap.from_path(mask_path)

    # kiwi will be labeled with a bounding box.
    bbox = sly.Rectangle(*points[fr_index])

    mask_figure = sly.VideoFigure(orange, bitmap, fr_index)
    bbox_figure = sly.VideoFigure(kiwi, bbox, fr_index)

    frame = sly.Frame(fr_index, figures=[mask_figure, bbox_figure])
    frames.append(frame)
```
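The loop above assumes each mask filename ends with the frame index it belongs to, so that `sly.fs.get_file_name(mask).split("_")[-1]` yields the index. The same parsing can be sketched with the standard library alone (the filename `frame_3.png` is a hypothetical example, not necessarily the name used in the repo):

```python
import os

def frame_index(filename):
    # mirrors sly.fs.get_file_name(...).split("_")[-1]:
    # strip the directory and extension, then take the part after
    # the last underscore as the frame index
    stem = os.path.splitext(os.path.basename(filename))[0]
    return int(stem.split("_")[-1])

print(frame_index("data/masks/frame_3.png"))  # -> 3
```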
```python
objects = sly.VideoObjectCollection([kiwi, orange])
frames = sly.FrameCollection(frames)
frame_size, vlength = sly.video.get_image_size_and_frames_count(video_path)
```
Learn more about VideoAnnotation JSON format.
```python
video_ann = sly.VideoAnnotation(
    img_size=frame_size,
    frames_count=vlength,
    objects=objects,
    frames=frames,
)
api.video.annotation.append(video_info.id, video_ann)
print(f"Annotation has been successfully uploaded to the video {video_name}")
```
In the GitHub repository for this tutorial, you will find the full Python script.
In this tutorial, we learned how to:
- quickly configure Python development for Supervisely
- create a project and dataset with classes of different shapes
- initialize rectangles and masks for video frames
- construct a Supervisely annotation and upload it together with a video to the server