Commit: update docs
Summary: Pull Request resolved: fairinternal/detectron2#310

Differential Revision: D17975608

Pulled By: ppwwyyxx

fbshipit-source-id: 5397020062fb7b6c2041b9e0e312d02c1721bb47
ppwwyyxx authored and facebook-github-bot committed Oct 17, 2019
1 parent 951417e commit ed6105b
Showing 4 changed files with 47 additions and 39 deletions.
2 changes: 1 addition & 1 deletion MODEL_ZOO.md
@@ -549,7 +549,7 @@ These baselines are described in Table 3(c) of the [LVIS paper](https://arxiv.or

NOTE: the 1x schedule here has the same amount of __iterations__ as the COCO 1x baselines.
They are roughly 24 epochs of LVISv0.5 data.
- The final results of these configs has large variance across different runs.
+ The final results of these configs have large variance across different runs.

<!--
./gen_html_table.py --config 'LVIS-InstanceSegmentation/mask*50*' 'LVIS-InstanceSegmentation/mask*101*' --name R50-FPN R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP
21 changes: 13 additions & 8 deletions datasets/README.md
Expand Up @@ -2,7 +2,7 @@
For a few datasets that detectron2 natively supports,
the datasets are assumed to exist in a directory called
"datasets/", under the directory where you launch the program.
- with the following directory structure:
+ They need to have the following directory structure:

## Expected dataset structure for COCO instance/keypoint detection:

@@ -17,7 +17,7 @@ coco/

You can use the 2014 version of the dataset as well.

- Some of the builtin tests (`run_*_tests.sh`) uses a tiny version of the COCO dataset,
+ Some of the builtin tests (`dev/run_*_tests.sh`) use a tiny version of the COCO dataset,
which you can download with `./prepare_for_tests.sh`.

## Expected dataset structure for PanopticFPN:
@@ -28,6 +28,7 @@ coco/
panoptic_{train,val}2017.json
panoptic_{train,val}2017/
# png annotations
+   panoptic_stuff_{train,val}2017/  # generated by the script mentioned below
```

Install panopticapi by:
@@ -36,13 +37,13 @@ pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `./prepare_panoptic_fpn.py` to extract semantic annotations from panoptic annotations.

- ## Expected dataset structure for LVIS instance detection/segmentation:
+ ## Expected dataset structure for LVIS instance segmentation:
```
coco/
{train,val,test}2017/
lvis/
lvis_v0.5_{train,val}.json
lvis_v0.5_image_info_test.json
```

Install lvis-api by:
@@ -56,8 +57,8 @@ cityscapes/
gtFine/
train/
aachen/
-       color.png, instanceIds.png, labelIds.png, polygons.json
-       labelTrainIds.png (created by cityscapesscripts/preparation/createTrainIdLabelImgs.py)
+       color.png, instanceIds.png, labelIds.png, polygons.json,
+       labelTrainIds.png
...
val/
test/
@@ -71,10 +72,14 @@ Install cityscapes scripts by:
pip install git+https://github.com/mcordts/cityscapesScripts.git
```

+ Note:
+ labelTrainIds.png are created by `cityscapesscripts/preparation/createTrainIdLabelImgs.py`.
+ They are not needed for instance segmentation.

## Expected dataset structure for Pascal VOC:
```
VOC20{07,12}/
Annotations/
ImageSets/
JPEGImages/
```
1 change: 1 addition & 0 deletions detectron2/data/catalog.py
@@ -34,6 +34,7 @@ def register(name, func):
name (str): the name that identifies a dataset, e.g. "coco_2014_train".
func (callable): a callable which takes no arguments and returns a list of dicts.
"""
+ assert callable(func), "You must register a function with `DatasetCatalog.register`!"
DatasetCatalog._REGISTERED[name] = func

@staticmethod
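The `assert callable(func)` added above rejects a common mistake: registering the *result* of a loader (a list of dicts) instead of the zero-argument loader itself. A minimal stand-in sketch of the same contract (this is not detectron2's actual class, just an illustration of the registration pattern; `load_my_dataset` and its contents are made up):

```python
class DatasetCatalog:
    """Toy stand-in mirroring the DatasetCatalog contract for illustration."""
    _REGISTERED = {}

    @staticmethod
    def register(name, func):
        # Fail fast if a non-callable (e.g. the dataset itself) is passed.
        assert callable(func), "You must register a function with `DatasetCatalog.register`!"
        DatasetCatalog._REGISTERED[name] = func

    @staticmethod
    def get(name):
        # Loaders are invoked lazily, only when the data is actually requested.
        return DatasetCatalog._REGISTERED[name]()


def load_my_dataset():
    # A loader takes no arguments and returns a list of dicts
    # in detectron2's dataset format.
    return [{"file_name": "img_0001.jpg", "image_id": "0001", "annotations": []}]


DatasetCatalog.register("my_dataset_train", load_my_dataset)
dicts = DatasetCatalog.get("my_dataset_train")
```

Registering the list itself (`DatasetCatalog.register("bad", load_my_dataset())`) now fails immediately with the assertion message instead of erroring later at load time.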
62 changes: 32 additions & 30 deletions docs/tutorials/datasets.md
@@ -46,42 +46,42 @@ can load an image from "file_name" if the "image" field is not available.
+ `sem_seg_file_name`: the full path to the ground truth semantic segmentation file.
+ `image`: the image as a numpy array.
+ `sem_seg`: semantic segmentation ground truth in a 2D numpy array. Values in the array represent
category labels.
+ `height`, `width`: integer. The shape of the image.
+ `image_id` (str): a string to identify this image. Mainly used during evaluation to identify the
image. Each dataset may use it for different purposes.
+ `annotations` (list[dict]): the per-instance annotations of every
instance in this image. Each annotation dict may contain:
+ `bbox` (list[float]): list of 4 numbers representing the bounding box of the instance.
+ `bbox_mode` (int): the format of bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Currently supports: `BoxMode.XYXY_ABS`, `BoxMode.XYWH_ABS`.
+ `category_id` (int): an integer in the range [0, num_categories) representing the category label.
The value num_categories is reserved to represent the "background" category, if applicable.
+ `segmentation` (list[list[float]] or dict):
+ If `list[list[float]]`, it represents a list of polygons, one for each connected component
of the object. Each `list[float]` is one simple polygon in the format of `[x1, y1, ..., xn, yn]`.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.
+ If `dict`, it represents the per-pixel segmentation mask in COCO's RLE format.
+ `keypoints` (list[float]): in the format of [x1, y1, v1, ..., xn, yn, vn].
v[i] means the visibility of this keypoint.
`n` must be equal to the number of keypoint categories.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.

Note that the coordinate annotations in COCO format are integers in range [0, H-1 or W-1].
By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete
pixel indices to floating point coordinates.
+ `iscrowd`: 0 or 1. Whether this instance is labeled as COCO's "crowd region".
+ `proposal_boxes` (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.
+ `proposal_objectness_logits` (array): numpy array with shape (K, ), which corresponds to the objectness
logits of proposals in 'proposal_boxes'.
+ `proposal_bbox_mode` (int): the format of the precomputed proposal bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Default format is `BoxMode.XYXY_ABS`.
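Put together, a minimal per-image dict following the fields listed above might look like the sketch below. The file path, box values, and polygon are made up for illustration, and the integer stand-in for `BoxMode.XYWH_ABS` is an assumption (the real value comes from `detectron2.structures.BoxMode`):

```python
XYWH_ABS = 1  # assumption: stand-in for structures.BoxMode.XYWH_ABS

record = {
    "file_name": "datasets/coco/train2017/0001.jpg",  # hypothetical path
    "image_id": "0001",
    "height": 480,
    "width": 640,
    "annotations": [
        {
            # x, y, width, height in absolute pixel coordinates
            "bbox": [100.0, 120.0, 50.0, 80.0],
            "bbox_mode": XYWH_ABS,
            "category_id": 0,  # integer in [0, num_categories)
            "iscrowd": 0,
            # one simple polygon per connected component: [x1, y1, ..., xn, yn]
            "segmentation": [[100.0, 120.0, 150.0, 120.0,
                              150.0, 200.0, 100.0, 200.0]],
        }
    ],
}

# Sanity checks mirroring the field descriptions above.
for ann in record["annotations"]:
    assert len(ann["bbox"]) == 4
    assert ann["category_id"] >= 0
    assert all(len(poly) % 2 == 0 and len(poly) >= 6
               for poly in ann["segmentation"])
```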


If your dataset is already in the COCO format, you can simply register it by
@@ -146,12 +146,14 @@ Some additional metadata that are specific to the evaluation of certain datasets
* `stuff_dataset_id_to_contiguous_id` (dict[int->int]): Used when generating prediction json files for
semantic/panoptic segmentation.
A mapping from semantic segmentation class ids in the dataset
to contiguous ids in [0, num_categories). It is useful for evaluation only.

* `json_file`: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
* `panoptic_root`, `panoptic_json`: Used by panoptic evaluation.
* `evaluator_type`: Used by the builtin main training script to select
evaluator. No need to use it if you write your own main script.
You can just provide the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
for your dataset directly in your main script.
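As a concrete illustration of `stuff_dataset_id_to_contiguous_id`: dataset class ids are often sparse, while training and evaluation expect dense labels in [0, num_categories). A sketch with made-up ids (the specific values are assumptions, not real category ids from any dataset):

```python
# Hypothetical sparse semantic-segmentation class ids as stored in a dataset.
dataset_stuff_ids = [92, 93, 95, 100]

# Dense mapping the metadata field expects:
# dataset id -> contiguous id in [0, num_categories).
stuff_dataset_id_to_contiguous_id = {
    dataset_id: contiguous_id
    for contiguous_id, dataset_id in enumerate(dataset_stuff_ids)
}

# Inverting the mapping recovers the original dataset ids,
# e.g. when writing prediction json files for evaluation.
contiguous_id_to_dataset_id = {
    v: k for k, v in stuff_dataset_id_to_contiguous_id.items()
}
```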

NOTE: For background on the difference between "thing" and "stuff" categories, see
[On Seeing Stuff: The Perception of Materials by Humans and Machines](http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf).
