# Tagging objects by area

This notebook shows how to tag bitmap and polygon objects with their area in pixels and save the result to a new project.

**Input**:
- Source project.

**Output**:
- New project in which every bitmap and polygon object carries an `area` tag holding its area in pixels.
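
For intuition about what "area in pixels" means for the two geometry types, here is a minimal pure-Python sketch. The helpers `bitmap_area` and `polygon_area` are hypothetical names introduced for illustration; they are not part of the SDK, and the library's own `label.area` may be computed differently (for example by rasterizing the polygon), so its values can differ slightly from the geometric area shown here.

```python
def bitmap_area(mask):
    """Count foreground pixels in a 2D mask given as rows of 0/1 values."""
    return sum(sum(1 for px in row if px) for row in mask)


def polygon_area(points):
    """Shoelace formula for a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]

print(bitmap_area(mask))        # 5
print(polygon_area(rectangle))  # 12.0
```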

## Configuration

Edit the following settings to match your own setup.

In [1]:
import supervisely_lib as sly
from tqdm import tqdm
import os

In [2]:
team_name = "jupyter_tutorials"
workspace_name = "cookbook"
project_name = "tutorial_project"

dst_project_name = "tutorial_project_tagged"

area_tag_meta = sly.TagMeta('area', sly.TagValueType.ANY_NUMBER)

# Read the server address and your API token from environment variables.
# Set these values directly if you run this notebook on your own machine.
address = os.environ['SERVER_ADDRESS']
token = os.environ['API_TOKEN']
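
Indexing `os.environ` directly raises a bare `KeyError` when a variable is missing. As a sketch of one possible alternative, the hypothetical helper `read_setting` below (not part of the SDK) fails with a more readable message and also rejects empty values. The example address is a placeholder.

```python
import os


def read_setting(name, env=os.environ):
    """Return a required setting, failing with a readable message if it is unset or empty."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError("Environment variable {!r} is not set".format(name))
    return value


# Demonstration with an explicit mapping instead of the real environment
example_env = {"SERVER_ADDRESS": "https://example.invalid", "API_TOKEN": ""}
address_example = read_setting("SERVER_ADDRESS", example_env)
print(address_example)
```

In the notebook this would be used as `address = read_setting('SERVER_ADDRESS')` and `token = read_setting('API_TOKEN')`.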

In [3]:
# Initialize API object
api = sly.Api(address, token)

## Verify input values

Check that the context (team, workspace, project) exists.

In [4]:
# Get IDs of team, workspace and project by names

team = api.team.get_info_by_name(team_name)
if team is None:
    raise RuntimeError("Team {!r} not found".format(team_name))

workspace = api.workspace.get_info_by_name(team.id, workspace_name)
if workspace is None:
    raise RuntimeError("Workspace {!r} not found".format(workspace_name))
    
project = api.project.get_info_by_name(workspace.id, project_name)
if project is None:
    raise RuntimeError("Project {!r} not found".format(project_name))
    
print("Team: id={}, name={}".format(team.id, team.name))
print("Workspace: id={}, name={}".format(workspace.id, workspace.name))
print("Project: id={}, name={}".format(project.id, project.name))

Team: id=5, name=jupyter_tutorials
Workspace: id=8, name=cookbook
Project: id=711, name=tutorial_project


## Get Source ProjectMeta

In [5]:
meta_json = api.project.get_meta(project.id)
meta = sly.ProjectMeta.from_json(meta_json)
print("Source ProjectMeta: \n", meta)

Source ProjectMeta: 
 ProjectMeta:
Object Classes
+--------+-----------+----------------+
|  Name  |   Shape   |     Color      |
+--------+-----------+----------------+
|  bike  | Rectangle | [246, 255, 0]  |
|  car   |  Polygon  | [190, 85, 206] |
|  dog   |  Polygon  |  [253, 0, 0]   |
| person |   Bitmap  |  [0, 255, 18]  |
+--------+-----------+----------------+
Tags
+---------------+--------------+-----------------------+
|      Name     |  Value type  |    Possible values    |
+---------------+--------------+-----------------------+
|   car_color   |  any_string  |          None         |
|  cars_number  |  any_number  |          None         |
|      like     |     none     |          None         |
| person_gender | oneof_string |   ['male', 'female']  |
|    situated   | oneof_string | ['inside', 'outside'] |
|  vehicle_age  | oneof_string | ['modern', 'vintage'] |
+---------------+--------------+-----------------------+



## Construct Destination ProjectMeta

In [6]:
dst_meta = meta.add_tag_meta(area_tag_meta)
print("Destination ProjectMeta:\n", dst_meta)

Destination ProjectMeta:
 ProjectMeta:
Object Classes
+--------+-----------+----------------+
|  Name  |   Shape   |     Color      |
+--------+-----------+----------------+
|  bike  | Rectangle | [246, 255, 0]  |
|  car   |  Polygon  | [190, 85, 206] |
|  dog   |  Polygon  |  [253, 0, 0]   |
| person |   Bitmap  |  [0, 255, 18]  |
+--------+-----------+----------------+
Tags
+---------------+--------------+-----------------------+
|      Name     |  Value type  |    Possible values    |
+---------------+--------------+-----------------------+
|   car_color   |  any_string  |          None         |
|  cars_number  |  any_number  |          None         |
|      like     |     none     |          None         |
| person_gender | oneof_string |   ['male', 'female']  |
|    situated   | oneof_string | ['inside', 'outside'] |
|  vehicle_age  | oneof_string | ['modern', 'vintage'] |
|      area     |  any_number  |          None         |
+---------------+--------------+-----------------------+
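
The cell above assigns the result of `meta.add_tag_meta(...)` to a new variable, which suggests that the call returns an extended copy rather than modifying `meta` in place. The sketch below illustrates that pattern with a frozen dataclass. `MetaSketch` and `add_tag_name` are hypothetical stand-ins, not SDK classes, and the duplicate-name check is this sketch's own choice.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class MetaSketch:
    """Hypothetical stand-in for a project meta that only tracks tag names."""
    tag_names: Tuple[str, ...] = ()

    def add_tag_name(self, name):
        """Return a new object with `name` appended; the original is left untouched."""
        if name in self.tag_names:
            raise ValueError("Tag {!r} already exists".format(name))
        return replace(self, tag_names=self.tag_names + (name,))


src_meta = MetaSketch(("car_color", "like"))
dst_meta_sketch = src_meta.add_tag_name("area")

print(src_meta.tag_names)         # ('car_color', 'like')
print(dst_meta_sketch.tag_names)  # ('car_color', 'like', 'area')
```

Keeping the source object unchanged is what allows the notebook to parse annotations with the source meta later while the destination project uses the extended one.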

## Create Destination project

In [7]:
# Check whether the destination project already exists; if so, generate a free name
if api.project.exists(workspace.id, dst_project_name):
    dst_project_name = api.project.get_free_name(workspace.id, dst_project_name)
print("Destination project name: ", dst_project_name)

Destination project name:  tutorial_project_tagged
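
To illustrate the idea of picking a free name, here is a minimal sketch of one way such a lookup could work. The helper `free_name` and the `_001`-style numeric suffix are assumptions made for illustration; the server may use a different naming scheme.

```python
def free_name(name, taken):
    """Return `name` if unused, otherwise the first `name_NNN` not present in `taken`."""
    if name not in taken:
        return name
    suffix = 1
    while True:
        candidate = "{}_{:03d}".format(name, suffix)
        if candidate not in taken:
            return candidate
        suffix += 1


existing = {"tutorial_project", "tutorial_project_tagged", "tutorial_project_tagged_001"}
print(free_name("tutorial_project_tagged", existing))  # tutorial_project_tagged_002
print(free_name("brand_new_project", existing))        # brand_new_project
```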


In [8]:
dst_project = api.project.create(workspace.id, dst_project_name)
api.project.update_meta(dst_project.id, dst_meta.to_json())
print("Destination project has been created: id={}, name={!r}".format(dst_project.id, dst_project.name))

Destination project has been created: id=713, name='tutorial_project_tagged'


## Iterate over all images, tag their objects by area and add them to the destination project

In [9]:
for dataset in api.dataset.get_list(project.id):
    print('Dataset: {}'.format(dataset.name), flush=True)
    dst_dataset = api.dataset.create(dst_project.id, dataset.name)
    
    images = api.image.get_list(dataset.id)
    with tqdm(total=len(images), desc="Process annotations") as progress_bar:
        for batch in sly.batched(images):
            image_ids = [image_info.id for image_info in batch]
            image_names = [image_info.name for image_info in batch]
            
            ann_infos = api.annotation.download_batch(dataset.id, image_ids)

            anns_to_upload = []
            for ann_info in ann_infos:
                ann = sly.Annotation.from_json(ann_info.annotation, meta)

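                # Collect the labels, tagging bitmap and polygon objects with their area
                tagged_labels = []
                for label in ann.labels:
                    if label.obj_class.geometry_type in (sly.Bitmap, sly.Polygon):
                        area_tag = sly.Tag(area_tag_meta, value=label.area)
                        label = label.add_tag(area_tag)
                    # Labels of other geometry types are kept unchanged
                    tagged_labels.append(label)
                # Build a copy of the annotation that holds the updated labels
                ann = ann.clone(labels=tagged_labels)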
                anns_to_upload.append(ann)
            
            dst_image_infos = api.image.upload_ids(dst_dataset.id, image_names, image_ids)
            dst_image_ids = [image_info.id for image_info in dst_image_infos]
            api.annotation.upload_anns(dst_image_ids, anns_to_upload)
            progress_bar.update(len(batch))

Dataset: dataset_01


Process annotations: 100%|██████████| 3/3 [00:00<00:00, 14.31it/s]

Dataset: dataset_02



Process annotations: 100%|██████████| 2/2 [00:00<00:00, 13.23it/s]
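
The loop above processes images in batches so that each API call handles a bounded number of items. As a rough sketch of what a batching helper does, the hypothetical `batched_sketch` generator below splits a list into consecutive chunks. The default batch size of 50 is an assumption and may not match `sly.batched`.

```python
def batched_sketch(items, batch_size=50):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


image_ids_example = [101, 102, 103, 104, 105]
chunks = list(batched_sketch(image_ids_example, batch_size=2))
print(chunks)  # [[101, 102], [103, 104], [105]]
```

Because every chunk preserves the original order, IDs, names and annotations built from the same batch stay aligned by position, which the upload calls in the loop rely on.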


In [10]:
print("Project {!r} has been successfully uploaded".format(dst_project.name))
print("Number of images: ", api.project.get_images_count(dst_project.id))

Project 'tutorial_project_tagged' has been successfully uploaded
Number of images:  5
