A dataset can be used by accessing DatasetCatalog for its data, or MetadataCatalog for its metadata (class names, etc.).
This document explains how to set up the builtin datasets so they can be used by the above APIs. Use Custom Datasets gives a deeper dive into how to use DatasetCatalog and MetadataCatalog, and how to add new datasets to them.
CutLER has builtin support for a few datasets. The datasets are assumed to exist in a directory specified by the environment variable DETECTRON2_DATASETS. Under this directory, detectron2 will look for datasets in the structure described below, if needed.
$DETECTRON2_DATASETS/
  imagenet/
  coco/
  voc/
  kitti/
  ...
You can set the location for builtin datasets by running export DETECTRON2_DATASETS=/path/to/datasets. If left unset, the default is ./datasets relative to your current working directory.
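The lookup rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from CutLER or detectron2: the helper names get_datasets_root and missing_datasets are hypothetical, and the only behavior taken from the text is that DETECTRON2_DATASETS is used when set and ./datasets otherwise.

```python
import os
from pathlib import Path


def get_datasets_root() -> Path:
    """Return $DETECTRON2_DATASETS if it is set, else ./datasets (relative to the cwd)."""
    return Path(os.environ.get("DETECTRON2_DATASETS", "datasets"))


def missing_datasets(names, root=None):
    """Return the dataset folder names that do not exist under the dataset root."""
    root = Path(root) if root is not None else get_datasets_root()
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Folder names taken from the layout described in this document.
    print("dataset root:", get_datasets_root())
    print("missing:", missing_datasets(["imagenet", "coco", "voc", "kitti"]))
```

Running the script before training is a quick way to confirm that the environment variable points where you expect.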
All dataset annotation files should be converted to the COCO format.
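As a rough sanity check for a converted file, the sketch below verifies the parts of the COCO detection format that are standard: the top-level images, annotations and categories lists, the image_id and category_id references, and the four-element bbox ([x, y, width, height]). The function names are hypothetical, and the check is deliberately shallow; it does not validate segmentation masks or any CutLER-specific fields.

```python
import json

REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco_dict(data):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = [f"missing top-level key: {key}" for key in REQUIRED_KEYS if key not in data]
    if problems:
        return problems

    image_ids = {img["id"] for img in data["images"]}
    category_ids = {cat["id"] for cat in data["categories"]}
    for ann in data["annotations"]:
        ann_id = ann.get("id")
        if ann.get("image_id") not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id {ann.get('image_id')}")
        if ann.get("category_id") not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id {ann.get('category_id')}")
        if len(ann.get("bbox", [])) != 4:
            problems.append(f"annotation {ann_id}: bbox must be [x, y, width, height]")
    return problems


def check_coco_file(path):
    """Load a JSON annotation file and run the same checks on it."""
    with open(path) as f:
        return check_coco_dict(json.load(f))
```

An empty list means none of these basic problems were found.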
Expected dataset structure for ImageNet:
imagenet/
  train/
    n01440764/*.JPEG
    n01443537/*.JPEG
    ...
  annotations/
    imagenet_train_fixsize480_tau0.15_N3.json # generated by MaskCut
The ImageNet-1K samples should be downloaded from here, and the pre-generated json file can be downloaded directly from here. If you want to generate your own pseudo-masks instead, follow the instructions on MaskCut.
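After downloading, a short script can confirm that the folder matches the ImageNet layout shown above. This is a hypothetical helper (summarize_imagenet is not part of CutLER); it only counts class folders under train/, *.JPEG files inside them, and checks for at least one json file under annotations/.

```python
from pathlib import Path


def summarize_imagenet(root):
    """Count class folders and JPEG images under <root>/train and look for annotation files."""
    root = Path(root)
    train_dir = root / "train"
    ann_dir = root / "annotations"

    class_dirs = sorted(p for p in train_dir.iterdir() if p.is_dir()) if train_dir.is_dir() else []
    num_images = sum(len(list(d.glob("*.JPEG"))) for d in class_dirs)
    has_annotations = ann_dir.is_dir() and any(ann_dir.glob("*.json"))

    return {
        "num_classes": len(class_dirs),
        "num_images": num_images,
        "has_annotations": has_annotations,
    }
```

For example, summarize_imagenet("datasets/imagenet") returns a small dict you can compare against the class and image counts you expect.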
Expected dataset structure for COCO, COCO20K, LVIS:
coco/
  annotations/
    instances_{train,val}2017.json
    coco20k_trainval_gt.json
    coco_cls_agnostic_instances_val2017.json
    {1,2,5,10,20,30,40,50,60,80}perc_instances_train2017.json
    lvis1.0_cocofied_val_cls_agnostic.json
  {train,val}2017/
    000000000139.jpg
    000000000285.jpg
    ...
  train2014/
    COCO_train2014_000000581921.jpg
    COCO_train2014_000000581909.jpg
    ...
Following previous work, we prepare annotations for semi-supervised learning by randomly sampling a subset of labeled images. You can download the annotation files needed for the COCO semi-supervised learning experiments and for evaluating models on COCO, COCO20K and LVIS: coco-semi, coco, lvis and coco20k.
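To make the sampling idea concrete, here is a minimal sketch of how a percentage split could be drawn from a COCO-style annotation dict. It is not the script used to produce the provided {1,2,...}perc files: the function name, the seed, and the rounding rule are all assumptions, so splits generated this way will not match the downloadable ones.

```python
import random


def sample_labeled_subset(coco, percent, seed=0):
    """Keep a random `percent` of the images and only the annotations that belong to them."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    images = sorted(coco["images"], key=lambda img: img["id"])
    num_keep = max(1, round(len(images) * percent / 100))
    chosen = rng.sample(images, num_keep)
    chosen_ids = {img["id"] for img in chosen}
    return {
        **coco,  # categories and any other top-level fields are carried over unchanged
        "images": chosen,
        "annotations": [ann for ann in coco["annotations"] if ann["image_id"] in chosen_ids],
    }
```

Sorting the images by id before sampling keeps the result independent of the order in which the images appear in the source file.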
Expected dataset structure for VOC2007:
voc/
  annotations/
    trainvaltest_2007_cls_agnostic.json
  VOC2007/
    JPEGImages/
      000001.jpg
      ...
We provide pre-converted COCO-style annotation files here.
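Many of the annotation files in this document carry a cls_agnostic suffix. Assuming this means that every object is assigned to a single shared category, a conversion from a regular COCO-style dict might look like the sketch below. The function name, the category id 1, and the category name "object" are illustrative assumptions; the provided pre-converted files may use different values, so prefer them for reproducing results.

```python
import copy


def to_class_agnostic(coco, name="object"):
    """Return a copy of a COCO-style dict with all annotations mapped to one category."""
    out = copy.deepcopy(coco)  # leave the caller's dict untouched
    out["categories"] = [{"id": 1, "name": name}]
    for ann in out["annotations"]:
        ann["category_id"] = 1
    return out
```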
Expected dataset structure for Objects365-V2:
objects365/
  annotations/
    zhiyuan_objv2_val_cls_agnostic.json
  val/
    000000000139.jpg
    000000000285.jpg
    ...
You only need to download the validation set of Objects365 for evaluation. We provide pre-converted COCO-style annotation files here.
Expected dataset structure for OpenImages-V6:
openImages/
  annotations/
    openimages_val_cls_agnostic.json
  validation/
    47947b97662dc962.jpg
    ...
You only need to download the validation set of OpenImages for evaluation. We provide pre-converted COCO-style annotation files here.
Expected dataset structure for UVO:
uvo/
  annotations/
    val_sparse_cleaned_cls_agnostic.json
  all_UVO_frames/
    000000000139.jpg
    000000000285.jpg
    ...
You only need to download the validation set of UVO for evaluation. We provide pre-converted COCO-style annotation files here.
Expected dataset structure for KITTI:
kitti/
  annotations/
    trainval_cls_agnostic.json
  JPEGImages/
    001717.jpg
    ...
We provide pre-converted COCO-style annotation files here.
Expected dataset structure for Watercolor, Comic, Clipart:
watercolor/
  annotations/
    traintest_cls_agnostic.json
  JPEGImages/
    163330523.jpg
    ...
comic/
  annotations/
    traintest_cls_agnostic.json
  JPEGImages/
    161067391.jpg
    ...
clipart/
  annotations/
    traintest_cls_agnostic.json
  JPEGImages/
    375390294.jpg
    ...
We provide pre-converted COCO-style annotation files here: watercolor, comic and clipart.
NOTE: ALL DATASETS FOLLOW THEIR ORIGINAL LICENSES.