VOC2012 Segmentation

This topic describes how to manage the "VOC2012 Segmentation" dataset, which is a dataset with reference/label_format/SemanticMask:SemanticMask and reference/label_format/InstanceMask:InstanceMask labels (see the example-semantic-mask and example-instance-mask figures below).

The preview of a semantic mask from "VOC2012 Segmentation".
The preview of an instance mask from "VOC2012 Segmentation".

Authorize a Client Instance

An reference/glossary:accesskey is needed to authenticate identity when using TensorBay.

../../../../docs/code/VOC2012Segmentation.py

Create Dataset

../../../../docs/code/VOC2012Segmentation.py

Organize Dataset

Normally, dataloader.py and catalog.json are required to organize the "VOC2012 Segmentation" dataset into a ~tensorbay.dataset.dataset.Dataset instance. In this example, they are stored in the same directory, as follows:

VOC2012 Segmentation/
    catalog.json
    dataloader.py

Organizing the "VOC2012 Segmentation" dataset into a ~tensorbay.dataset.dataset.Dataset instance takes the following steps.

Step 1: Write the Catalog

A Catalog <reference/dataset_structure:catalog> contains all label information of one dataset and is typically stored in a JSON file such as catalog.json.

../../../../tensorbay/opendataset/VOC2012Segmentation/catalog.json

The annotation types for "VOC2012 Segmentation" are reference/label_format/SemanticMask:SemanticMask and reference/label_format/InstanceMask:InstanceMask. There are 22 reference/label_format/CommonLabelProperties:Category types for reference/label_format/SemanticMask:SemanticMask and 2 reference/label_format/CommonLabelProperties:Category types for reference/label_format/InstanceMask:InstanceMask: category 0 represents the background, and category 255 represents the border of instances.
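To make the category counts above concrete, the following is a minimal sketch that builds a catalog-shaped dictionary in plain Python. The 20 object class names are the standard PASCAL VOC classes. The key names (`SEMANTIC_MASK`, `INSTANCE_MASK`, `categories`, `name`, `categoryId`) and the names given to values 0 and 255 are assumptions made for illustration, and the real catalog.json shipped with the dataloader should be treated as the source of truth.

```python
import json

# The 20 PASCAL VOC object classes, in their standard index order (1..20).
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def build_catalog():
    """Return a catalog-like dict; key and category names are illustrative assumptions."""
    # Semantic mask: background + 20 classes + one reserved value = 22 categories.
    semantic = [{"name": "background", "categoryId": 0}]
    semantic += [
        {"name": name, "categoryId": index}
        for index, name in enumerate(VOC_CLASSES, start=1)
    ]
    semantic.append({"name": "void", "categoryId": 255})  # assumed name

    # Instance mask: only the two pixel values that are not instance ids.
    instance = [
        {"name": "background", "categoryId": 0},
        {"name": "border", "categoryId": 255},  # assumed name
    ]
    return {
        "SEMANTIC_MASK": {"categories": semantic},
        "INSTANCE_MASK": {"categories": instance},
    }


catalog = build_catalog()
catalog_json = json.dumps(catalog, indent=4)  # text that could be saved as catalog.json
```

Generating the list from `VOC_CLASSES` keeps the category ids aligned with the class order instead of relying on 22 hand-typed entries.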

Note

  • Given the path of the catalog.json, ~tensorbay.dataset.dataset.DatasetBase.load_catalog loads the catalog into the dataset.
  • The categories in reference/label_format/InstanceMask:InstanceMaskSubcatalog are for pixel values which are not instance ids.
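The second point of the note can be illustrated with a small, dependency-free sketch. It assumes an instance mask has already been decoded into a 2D list of integer pixel values, and that 0 and 255 are the only reserved values, as described above. The function name `split_instance_mask` and the toy mask are hypothetical.

```python
from collections import Counter

# Pixel values reserved by the catalog categories; every other value is an instance id.
RESERVED = {0: "background", 255: "border"}


def split_instance_mask(mask):
    """Separate instance ids from reserved category values in a 2D list of ints."""
    counts = Counter(value for row in mask for value in row)
    instance_ids = sorted(value for value in counts if value not in RESERVED)
    reserved_counts = {
        RESERVED[value]: count for value, count in counts.items() if value in RESERVED
    }
    return instance_ids, reserved_counts


# Toy 3x4 mask: two instances (ids 1 and 2) separated by border pixels.
toy_mask = [
    [0, 0, 255, 1],
    [0, 255, 1, 1],
    [2, 2, 255, 1],
]
instance_ids, reserved_counts = split_instance_mask(toy_mask)
```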

Important

See catalog table <reference/dataset_structure:catalog> for more catalogs with different label types.

Step 2: Write the Dataloader

A reference/glossary:dataloader is needed to organize the dataset into a ~tensorbay.dataset.dataset.Dataset instance.

../../../../tensorbay/opendataset/VOC2012Segmentation/loader.py
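As a rough illustration of the file pairing a dataloader has to perform, the sketch below walks the standard VOC2012 devkit layout and collects, for each segment, the image and its two mask files. It deliberately does not use the TensorBay SDK. The function name `list_voc_segmentation_files` and the returned dictionary shape are hypothetical, and the folder names assume the unmodified devkit structure (`ImageSets/Segmentation`, `JPEGImages`, `SegmentationClass`, `SegmentationObject`).

```python
from pathlib import Path


def list_voc_segmentation_files(root, segment_names=("train", "val")):
    """Map each segment name to a list of image / mask path triples."""
    root = Path(root)
    segments = {}
    for segment_name in segment_names:
        # Each split file lists one image stem per line, e.g. "2007_000032".
        split_file = root / "ImageSets" / "Segmentation" / f"{segment_name}.txt"
        stems = [
            line.strip()
            for line in split_file.read_text().splitlines()
            if line.strip()
        ]
        segments[segment_name] = [
            {
                "image": root / "JPEGImages" / f"{stem}.jpg",
                "semantic_mask": root / "SegmentationClass" / f"{stem}.png",
                "instance_mask": root / "SegmentationObject" / f"{stem}.png",
            }
            for stem in stems
        ]
    return segments
```

A real dataloader would wrap each triple in the SDK's data and label objects. This sketch covers only how the split files determine which image belongs to which segment and which masks accompany it.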

See SemanticMask annotation <reference/label_format/SemanticMask:SemanticMask> and InstanceMask annotation <reference/label_format/InstanceMask:InstanceMask> for more details.

There are already a number of dataloaders in TensorBay SDK provided by the community. Thus, instead of writing one, you can import an available dataloader.

../../../../docs/code/VOC2012Segmentation.py

Note

Note that catalogs are automatically loaded in available dataloaders, so users do not have to write them again.

Important

See dataloader table <reference/glossary:dataloader> for dataloaders with different label types.

Upload Dataset

The organized "VOC2012 Segmentation" dataset can be uploaded to TensorBay for sharing, reuse, etc.

../../../../docs/code/VOC2012Segmentation.py

Similar to Git, the commit step after uploading records changes to the dataset as a version. If needed, make modifications and commit again. Please see features/version_control/index:Version Control for more details.

See the visualization on TensorBay website.

Read Dataset

Now the "VOC2012 Segmentation" dataset can be read from TensorBay.

../../../../docs/code/VOC2012Segmentation.py

In reference/dataset_structure:Dataset "VOC2012 Segmentation", there are two segments <reference/dataset_structure:Segment>: train and val. Get a segment by passing the required segment name or the index.

../../../../docs/code/VOC2012Segmentation.py

In the train reference/dataset_structure:Segment, there is a sequence of reference/dataset_structure:Data, which can be obtained by index.

../../../../docs/code/VOC2012Segmentation.py

In each reference/dataset_structure:Data, there is one reference/label_format/SemanticMask:SemanticMask annotation and one reference/label_format/InstanceMask:InstanceMask annotation.

../../../../docs/code/VOC2012Segmentation.py

There are two label types in the "VOC2012 Segmentation" dataset: semantic_mask and instance_mask. The mask itself can be obtained with Image.open(), and its URL with get_url(). reference/label_format/SemanticMask:SemanticMask.all_attributes stores the attributes of every category in the categories list of SEMANTIC_MASK, while reference/label_format/InstanceMask:InstanceMask.all_attributes stores the attributes of every instance. See the reference/label_format/SemanticMask:SemanticMask and reference/label_format/InstanceMask:InstanceMask label formats for more details.
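Once a semantic mask has been opened, its pixel values can be interpreted against the categories list of the catalog. The sketch below shows one way to do that in plain Python, assuming the mask has already been decoded into a 2D list of integers and that each category is a dictionary with `name` and `categoryId` keys. The function name `summarize_semantic_mask`, the key names, and the toy data are assumptions for illustration only.

```python
from collections import Counter


def summarize_semantic_mask(mask, categories):
    """Count the pixels of each category in a 2D list of category ids."""
    id_to_name = {category["categoryId"]: category["name"] for category in categories}
    counts = Counter(value for row in mask for value in row)
    # Values missing from the catalog are reported explicitly instead of dropped.
    return {
        id_to_name.get(value, f"unknown({value})"): count
        for value, count in sorted(counts.items())
    }


# A three-entry excerpt of a categories list (names are assumed).
categories = [
    {"name": "background", "categoryId": 0},
    {"name": "person", "categoryId": 15},
    {"name": "void", "categoryId": 255},
]
toy_mask = [
    [0, 0, 255, 15],
    [0, 255, 15, 15],
]
summary = summarize_semantic_mask(toy_mask, categories)
```

With a real mask, the nested list would be replaced by the pixel data of the opened image. The lookup from pixel value to category name stays the same.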

Delete Dataset

../../../../docs/code/VOC2012Segmentation.py