Commit: update docs
Summary: Pull Request resolved: fairinternal/detectron2#310

Differential Revision: D17975608

Pulled By: ppwwyyxx

fbshipit-source-id: 5397020062fb7b6c2041b9e0e312d02c1721bb47
ppwwyyxx authored and facebook-github-bot committed Oct 17, 2019
1 parent 951417e commit ed6105b
Showing 4 changed files with 47 additions and 39 deletions.
2 changes: 1 addition & 1 deletion MODEL_ZOO.md
@@ -549,7 +549,7 @@ These baselines are described in Table 3(c) of the [LVIS paper](https://arxiv.or

NOTE: the 1x schedule here has the same amount of __iterations__ as the COCO 1x baselines.
They are roughly 24 epochs of LVISv0.5 data.
- The final results of these configs has large variance across different runs.
+ The final results of these configs have large variance across different runs.

<!--
./gen_html_table.py --config 'LVIS-InstanceSegmentation/mask*50*' 'LVIS-InstanceSegmentation/mask*101*' --name R50-FPN R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP
21 changes: 13 additions & 8 deletions datasets/README.md
Expand Up @@ -2,7 +2,7 @@
For a few datasets that detectron2 natively supports,
the datasets are assumed to exist in a directory called
"datasets/", under the directory where you launch the program.
- with the following directory structure:
+ They need to have the following directory structure:

## Expected dataset structure for COCO instance/keypoint detection:

@@ -17,7 +17,7 @@ coco/

You can use the 2014 version of the dataset as well.

- Some of the builtin tests (`run_*_tests.sh`) uses a tiny version of the COCO dataset,
+ Some of the builtin tests (`dev/run_*_tests.sh`) use a tiny version of the COCO dataset,
which you can download with `./prepare_for_tests.sh`.

## Expected dataset structure for PanopticFPN:
@@ -28,6 +28,7 @@ coco/
panoptic_{train,val}2017.json
panoptic_{train,val}2017/
# png annotations
+   panoptic_stuff_{train,val}2017/  # generated by the script mentioned below
```

Install panopticapi by:
@@ -36,13 +37,13 @@ pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `./prepare_panoptic_fpn.py` to extract semantic annotations from panoptic annotations.

- ## Expected dataset structure for LVIS instance detection/segmentation:
+ ## Expected dataset structure for LVIS instance segmentation:
```
coco/
{train,val,test}2017/
lvis/
lvis_v0.5_{train,val}.json
lvis_v0.5_image_info_test.json
```

Install lvis-api by:
@@ -56,8 +57,8 @@ cityscapes/
gtFine/
train/
aachen/
-       color.png, instanceIds.png, labelIds.png, polygons.json
-       labelTrainIds.png (created by cityscapesscripts/preparation/createTrainIdLabelImgs.py)
+       color.png, instanceIds.png, labelIds.png, polygons.json,
+       labelTrainIds.png
...
val/
test/
@@ -71,10 +72,14 @@ Install cityscapes scripts by:
pip install git+https://github.com/mcordts/cityscapesScripts.git
```

+ Note:
+ labelTrainIds.png are created by `cityscapesscripts/preparation/createTrainIdLabelImgs.py`.
+ They are not needed for instance segmentation.

## Expected dataset structure for Pascal VOC:
```
VOC20{07,12}/
Annotations/
ImageSets/
JPEGImages/
```
1 change: 1 addition & 0 deletions detectron2/data/catalog.py
@@ -34,6 +34,7 @@ def register(name, func):
name (str): the name that identifies a dataset, e.g. "coco_2014_train".
func (callable): a callable which takes no arguments and returns a list of dicts.
"""
+ assert callable(func), "You must register a function with `DatasetCatalog.register`!"
DatasetCatalog._REGISTERED[name] = func

@staticmethod
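The `assert callable(func)` added above rejects a common mistake: registering the *result* of a loader (a list of dicts) instead of the zero-argument loader itself. A minimal stand-in sketch of the same contract (this is not detectron2's actual class, just an illustration of the registration pattern; `load_my_dataset` and its contents are made up):

```python
class DatasetCatalog:
    """Toy stand-in mirroring the DatasetCatalog contract for illustration."""
    _REGISTERED = {}

    @staticmethod
    def register(name, func):
        # Fail fast if a non-callable (e.g. the dataset itself) is passed.
        assert callable(func), "You must register a function with `DatasetCatalog.register`!"
        DatasetCatalog._REGISTERED[name] = func

    @staticmethod
    def get(name):
        # Loaders are invoked lazily, only when the data is actually requested.
        return DatasetCatalog._REGISTERED[name]()


def load_my_dataset():
    # A loader takes no arguments and returns a list of dicts
    # in detectron2's dataset format.
    return [{"file_name": "img_0001.jpg", "image_id": "0001", "annotations": []}]


DatasetCatalog.register("my_dataset_train", load_my_dataset)
dicts = DatasetCatalog.get("my_dataset_train")
```

Registering the list itself (`DatasetCatalog.register("bad", load_my_dataset())`) now fails immediately with the assertion message instead of erroring later at load time.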
62 changes: 32 additions & 30 deletions docs/tutorials/datasets.md
@@ -46,42 +46,42 @@ can load an image from "file_name" if the "image" field is not available.
+ `sem_seg_file_name`: the full path to the ground truth semantic segmentation file.
+ `image`: the image as a numpy array.
+ `sem_seg`: semantic segmentation ground truth in a 2D numpy array. Values in the array represent
category labels.
+ `height`, `width`: integer. The shape of the image.
+ `image_id` (str): a string to identify this image. Mainly used during evaluation to identify the
image. Each dataset may use it for different purposes.
+ `annotations` (list[dict]): the per-instance annotations of every
instance in this image. Each annotation dict may contain:
+ `bbox` (list[float]): list of 4 numbers representing the bounding box of the instance.
+ `bbox_mode` (int): the format of bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Currently supports: `BoxMode.XYXY_ABS`, `BoxMode.XYWH_ABS`.
+ `category_id` (int): an integer in the range [0, num_categories) representing the category label.
The value num_categories is reserved to represent the "background" category, if applicable.
+ `segmentation` (list[list[float]] or dict):
+ If `list[list[float]]`, it represents a list of polygons, one for each connected component
of the object. Each `list[float]` is one simple polygon in the format of `[x1, y1, ..., xn, yn]`.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.
+ If `dict`, it represents the per-pixel segmentation mask in COCO's RLE format.
+ `keypoints` (list[float]): in the format of [x1, y1, v1, ..., xn, yn, vn].
v[i] means the visibility of this keypoint.
`n` must be equal to the number of keypoint categories.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.

Note that the coordinate annotations in COCO format are integers in range [0, H-1 or W-1].
By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete
pixel indices to floating point coordinates.
+ `iscrowd`: 0 or 1. Whether this instance is labeled as COCO's "crowd region".
+ `proposal_boxes` (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.
+ `proposal_objectness_logits` (array): numpy array with shape (K, ), which corresponds to the objectness
logits of proposals in 'proposal_boxes'.
+ `proposal_bbox_mode` (int): the format of the precomputed proposal bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Default format is `BoxMode.XYXY_ABS`.
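Put together, a minimal per-image dict following the fields listed above might look like the sketch below. The file path, box values, and polygon are made up for illustration, and the integer stand-in for `BoxMode.XYWH_ABS` is an assumption (the real value comes from `detectron2.structures.BoxMode`):

```python
XYWH_ABS = 1  # assumption: stand-in for structures.BoxMode.XYWH_ABS

record = {
    "file_name": "datasets/coco/train2017/0001.jpg",  # hypothetical path
    "image_id": "0001",
    "height": 480,
    "width": 640,
    "annotations": [
        {
            # x, y, width, height in absolute pixel coordinates
            "bbox": [100.0, 120.0, 50.0, 80.0],
            "bbox_mode": XYWH_ABS,
            "category_id": 0,  # integer in [0, num_categories)
            "iscrowd": 0,
            # one simple polygon per connected component: [x1, y1, ..., xn, yn]
            "segmentation": [[100.0, 120.0, 150.0, 120.0,
                              150.0, 200.0, 100.0, 200.0]],
        }
    ],
}

# Sanity checks mirroring the field descriptions above.
for ann in record["annotations"]:
    assert len(ann["bbox"]) == 4
    assert ann["category_id"] >= 0
    assert all(len(poly) % 2 == 0 and len(poly) >= 6
               for poly in ann["segmentation"])
```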


If your dataset is already in the COCO format, you can simply register it by
@@ -146,12 +146,14 @@ Some additional metadata that are specific to the evaluation of certain datasets
* `stuff_dataset_id_to_contiguous_id` (dict[int->int]): Used when generating prediction json files for
semantic/panoptic segmentation.
A mapping from semantic segmentation class ids in the dataset
to contiguous ids in [0, num_categories). It is useful for evaluation only.

* `json_file`: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
* `panoptic_root`, `panoptic_json`: Used by panoptic evaluation.
* `evaluator_type`: Used by the builtin main training script to select
evaluator. No need to use it if you write your own main script.
You can just provide the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
for your dataset directly in your main script.
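As a concrete illustration of `stuff_dataset_id_to_contiguous_id`: dataset class ids are often sparse, while training and evaluation expect dense labels in [0, num_categories). A sketch with made-up ids (the specific values are assumptions, not real category ids from any dataset):

```python
# Hypothetical sparse semantic-segmentation class ids as stored in a dataset.
dataset_stuff_ids = [92, 93, 95, 100]

# Dense mapping the metadata field expects:
# dataset id -> contiguous id in [0, num_categories).
stuff_dataset_id_to_contiguous_id = {
    dataset_id: contiguous_id
    for contiguous_id, dataset_id in enumerate(dataset_stuff_ids)
}

# Inverting the mapping recovers the original dataset ids,
# e.g. when writing prediction json files for evaluation.
contiguous_id_to_dataset_id = {
    v: k for k, v in stuff_dataset_id_to_contiguous_id.items()
}
```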

NOTE: For background on the difference between "thing" and "stuff" categories, see
[On Seeing Stuff: The Perception of Materials by Humans and Machines](http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf).
