This topic describes how to manage the VOC2012 Segmentation dataset, which is a dataset with reference/label_format/SemanticMask:SemanticMask
and reference/label_format/InstanceMask:InstanceMask
labels (Fig. %s <example-semantic-mask>
and Fig. %s <example-instance-mask>
).
An AccessKey <reference/glossary:accesskey>
is needed to authenticate identity when using TensorBay.
../../../../docs/code/VOC2012Segmentation.py
../../../../docs/code/VOC2012Segmentation.py
Normally, dataloader.py
and catalog.json
are required to organize the "VOC2012 Segmentation" dataset into a ~tensorbay.dataset.dataset.Dataset
instance. In this example, they are stored in the same directory, as shown below:
VOC2012 Segmentation/
catalog.json
dataloader.py
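Because the two files sit side by side, a dataloader can locate the catalog relative to its own file. The following is a minimal sketch of that idea using only the standard library; the helper name `catalog_path_for` is hypothetical and is not part of the TensorBay SDK.

```python
from pathlib import Path


def catalog_path_for(loader_file: str) -> Path:
    """Return the path of catalog.json in the same directory as the dataloader file."""
    return Path(loader_file).parent / "catalog.json"


# Hypothetical location, used only for illustration.
example = catalog_path_for("VOC2012 Segmentation/dataloader.py")
print(example.name)         # catalog.json
print(example.parent.name)  # VOC2012 Segmentation
```

Inside a real dataloader.py, the same helper could be called with `__file__`, so that the catalog is found regardless of the current working directory.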
Organizing the "VOC2012 Segmentation" dataset into a ~tensorbay.dataset.dataset.Dataset
instance takes the following steps.
A Catalog <reference/dataset_structure:catalog>
contains all label information of one dataset, and is typically stored in a JSON file such as catalog.json
.
../../../../tensorbay/opendataset/VOC2012Segmentation/catalog.json
The annotation types for "VOC2012 Segmentation" are reference/label_format/SemanticMask:SemanticMask
and reference/label_format/InstanceMask:InstanceMask
, and there are 22 reference/label_format/CommonLabelProperties:Category
types for reference/label_format/SemanticMask:SemanticMask
. There are 2 reference/label_format/CommonLabelProperties:Category
types for reference/label_format/InstanceMask:InstanceMask
: category 0 represents the background, and category 255 represents the border of instances.
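To make the category layout concrete, here is a small sketch that parses an abbreviated catalog with the standard `json` module and builds an id-to-name map. The JSON below is illustrative only: it lists just a few categories, and the `INSTANCE_MASK`, `name` and `categoryId` keys and the category names are assumptions rather than a copy of the real catalog.json.

```python
import json

# Abbreviated, illustrative catalog: only a few categories are listed, and the
# key names and category names are assumptions, not the real catalog.json.
CATALOG_TEXT = """
{
    "SEMANTIC_MASK": {
        "categories": [
            {"name": "background", "categoryId": 0},
            {"name": "aeroplane", "categoryId": 1},
            {"name": "bicycle", "categoryId": 2}
        ]
    },
    "INSTANCE_MASK": {
        "categories": [
            {"name": "background", "categoryId": 0},
            {"name": "border", "categoryId": 255}
        ]
    }
}
"""


def category_names(catalog: dict, label_type: str) -> dict:
    """Map each categoryId to its name for one label type."""
    return {c["categoryId"]: c["name"] for c in catalog[label_type]["categories"]}


catalog = json.loads(CATALOG_TEXT)
instance_categories = category_names(catalog, "INSTANCE_MASK")
print(instance_categories)  # {0: 'background', 255: 'border'}
```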
Note
- Passing the path of catalog.json to ~tensorbay.dataset.dataset.DatasetBase.load_catalog loads the catalog into the dataset.
- The categories in reference/label_format/InstanceMask:InstanceMaskSubcatalog describe the pixel values that are not instance IDs.
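The distinction between reserved pixel values and instance IDs can be sketched as follows. This is a standard-library illustration, not SDK code: a hand-written nested list stands in for a decoded instance mask, and the helper name `instance_ids` is hypothetical.

```python
# Pixel values reserved by the InstanceMask categories described above:
# 0 is the background and 255 is the border of instances.
RESERVED_VALUES = {0, 255}


def instance_ids(mask_rows):
    """Return the sorted pixel values that identify instances."""
    values = {pixel for row in mask_rows for pixel in row}
    return sorted(values - RESERVED_VALUES)


# A tiny hand-written mask standing in for a decoded instance mask image.
toy_mask = [
    [0, 0, 255, 1, 1],
    [0, 255, 1, 1, 255],
    [2, 2, 255, 255, 0],
]
print(instance_ids(toy_mask))  # [1, 2]
```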
Important
See catalog table <reference/dataset_structure:catalog>
for more catalogs with different label types.
A reference/glossary:dataloader
is needed to organize the dataset into a ~tensorbay.dataset.dataset.Dataset
instance.
../../../../tensorbay/opendataset/VOC2012Segmentation/loader.py
See SemanticMask annotation <reference/label_format/SemanticMask:SemanticMask>
and InstanceMask annotation <reference/label_format/InstanceMask:InstanceMask>
for more details.
The TensorBay SDK already includes a number of dataloaders provided by the community, so importing an available dataloader is an alternative to writing one.
../../../../docs/code/VOC2012Segmentation.py
Note
Catalogs are loaded automatically by the available dataloaders, so users do not have to write them again.
Important
See dataloader table <reference/glossary:dataloader>
for dataloaders with different label types.
The organized "VOC2012 Segmentation" dataset can be uploaded to TensorBay for sharing, reuse, etc.
../../../../docs/code/VOC2012Segmentation.py
Similar to Git, the commit step after uploading records the changes to the dataset as a version. If needed, modify the dataset and commit again. Please see features/version_control/index:Version Control
for more details.
See the visualization on the TensorBay website.
Now the "VOC2012 Segmentation" dataset can be read from TensorBay.
../../../../docs/code/VOC2012Segmentation.py
In reference/dataset_structure:Dataset
"VOC2012 Segmentation", there are two segments <reference/dataset_structure:Segment>
: train
and val
. Get a segment by passing the required segment name or the index.
../../../../docs/code/VOC2012Segmentation.py
In the train
reference/dataset_structure:Segment
, there is a sequence of reference/dataset_structure:Data
, which can be obtained by index.
../../../../docs/code/VOC2012Segmentation.py
In each reference/dataset_structure:Data
, there is one reference/label_format/SemanticMask:SemanticMask
annotation and one reference/label_format/InstanceMask:InstanceMask
annotation.
../../../../docs/code/VOC2012Segmentation.py
There are two label types in the "VOC2012 Segmentation" dataset: semantic_mask
and instance_mask
. The mask can be read with Image.open()
, and the mask URL can be obtained with get_url()
. The information stored in reference/label_format/SemanticMask:SemanticMask.all_attributes
holds the attributes of every category in the categories
list of SEMANTIC_MASK
. The information stored in reference/label_format/InstanceMask:InstanceMask.all_attributes
holds the attributes of every instance. See reference/label_format/SemanticMask:SemanticMask
and reference/label_format/InstanceMask:InstanceMask
label formats for more details.
../../../../docs/code/VOC2012Segmentation.py
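Once a semantic mask has been read, its pixel values can be related back to the categories list of SEMANTIC_MASK. The sketch below counts pixels per category with the standard library only. A nested list stands in for the pixel values of an opened mask, and the id-to-name mapping and the helper name `pixels_per_category` are assumptions made for illustration.

```python
from collections import Counter

# Illustrative id-to-name mapping; in practice it would be built from the
# "categories" list of SEMANTIC_MASK in the catalog.
CATEGORY_NAMES = {0: "background", 1: "aeroplane", 2: "bicycle"}


def pixels_per_category(mask_rows, names):
    """Count how many pixels of the mask belong to each category."""
    counts = Counter(pixel for row in mask_rows for pixel in row)
    # Fall back to a generic label for ids missing from the mapping.
    return {names.get(value, f"id {value}"): n for value, n in counts.items()}


# A tiny hand-written mask standing in for a decoded semantic mask image.
toy_semantic_mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
]
print(pixels_per_category(toy_semantic_mask, CATEGORY_NAMES))
# {'background': 3, 'aeroplane': 3, 'bicycle': 2}
```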