Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements in COCO reader API #2406

Merged
merged 9 commits into from
Oct 30, 2020

Conversation

jantonguirao
Copy link
Contributor

@jantonguirao jantonguirao commented Oct 27, 2020

Why we need this PR?

Pick one, remove the rest

  • Refactoring to improve usability of COCOReader, especially in use cases involving segmentation masks.

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    Changed the format of mask polygon descriptor, to use indices of vertices rather than indices of coordinates
    Renamed and deprecated some ambiguously named arguments
    Removed trailing dimension from labels output
    Added handling of mask polygons in COCO reader example
    Rework the way that COCOLoader and COCOReader share data
  • Affected modules and functionalities:
    COCOReader
  • Key points relevant for the review:
    Changes in the COCO reader
  • Validation and testing:
    Existing tests and jupyter example
  • Documentation (including examples):
    COCO reader example enhanced

JIRA TASK: [DALI-1686]

Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1737859]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1737859]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao jantonguirao changed the title [WIP] COCO reader rework Improvements in COCO reader API Oct 28, 2020
@awolant awolant self-requested a review October 28, 2020 09:26
@jantonguirao jantonguirao requested review from mzient and a team October 28, 2020 09:45
@@ -200,7 +200,7 @@
"labels = labels_cpu.at(img_index)\n",
"categories_set = set()\n",
"for label in labels:\n",
" categories_set.add(label[0])\n",
" categories_set.add(label)\n",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only code where labels were expected to have an extra dim. In the detection pipelines the labels go to the box encoder, and the extra dimension is flattened there.

@review-notebook-app
Copy link

review-notebook-app bot commented Oct 28, 2020

View / edit / reply to this conversation on ReviewNB

JanuszL commented on 2020-10-28T11:37:02Z
----------------------------------------------------------------

>Each entry in the vertices contains two coordinates (x, y)

I would say that `Each entry in the vertices contains coordinates (x, y respectively for 2D polygons).


@jantonguirao
Copy link
Contributor Author

View / edit / reply to this conversation on ReviewNB

JanuszL commented on 2020-10-28T11:37:02Z

Each entry in the vertices contains two coordinates (x, y)

I would say that `Each entry in the vertices contains coordinates (x, y respectively for 2D polygons).

for 2D polygons
This gives the impression that we support other kind of polygons, which we don't. There's no 3D COCO dataset

@@ -94,7 +95,7 @@ void dump_filenames(const ImageIdPairs &image_id_pairs, const std::string path)
}

template <typename T>
void load_meta_file(std::vector<T> &output, const std::string path) {
void LoadFromFile(std::vector<T> &output, const std::string path) {
std::ifstream file(path);
DALI_ENFORCE(file.good(), make_string("Error writing to path: ", path));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DALI_ENFORCE(file.good(), make_string("Error writing to path: ", path));
DALI_ENFORCE(file.good(), make_string("CocoReader meta file error while loading for path: ", path));

@@ -119,7 +120,7 @@ void load_meta_file(std::vector<std::vector<T> > &output, const std::string path
}
}

void load_filenames(ImageIdPairs &image_id_pairs, const std::string path) {
void LoadFilenamesFromFile(ImageIdPairs &image_id_pairs, const std::string path) {
std::ifstream file(path);
DALI_ENFORCE(file, "CocoReader meta file error while loading for path: " + path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DALI_ENFORCE(file, "CocoReader meta file error while loading for path: " + path);
DALI_ENFORCE(file.good(), make_string("CocoReader meta file error while loading for path: ", path));

sample_mask_meta.push_back(objects_in_sample);
sample_mask_meta.push_back(obj_coords_offset + annotation.poly_.segm_meta_[i]);
sample_mask_meta.push_back(obj_coords_offset + annotation.poly_.segm_meta_[i + 1]);
auto segm_meta = annotation.poly_.segm_meta_.data();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto segm_meta = annotation.poly_.segm_meta_.data();
auto &segm_meta = annotation.poly_.segm_meta_;

Signed-off-by: Joaquin Anton <janton@nvidia.com>
Each mask can be one or more polygons, and for a given sample, the polygons are represented by the
following tensors:
.DeprecateArg("masks", false, // deprecated since 0.28dev
"``masks`` argument is now deprecated. Please use ``polygon_masks`` instead "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should keep an info how the deprecated format looks like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 29 to 71
images and annotation JSON files.

This readers produces the following outputs::

images, bounding_boxes, labels, ((polygons, vertices) | (pixelwise_masks)), (image_ids)

**images**

Each sample contains image data with layout ``HWC`` (height, width, channels).

**bounding_boxes**

Each sample can have an arbitrary ``M`` number of bounding boxes, each described by 4 coordinates::

[[x_0, y_0, w_0, h_0],
[x_1, y_1, w_1, h_1]
...
[x_M, y_M, w_M, h_M]]

or in ``[l, t, r, b]`` format if requested (see ``ltrb`` argument).

**labels**

Each bounding box is associated with an integer label representing a category identifier::

[label_0, label_1, ..., label_M]

**polygons** and **vertices** (Optional, present if ``polygon_masks`` is set to True)

If ``polygon_masks`` is enabled, two extra outputs describing masks by a set of polygons.

Each mask contains an arbitrary number of polygons ``P``, each associated with a mask index in the range [0, M) and
composed by a group of ``V`` vertices. The output ``polygons`` describes the polygons as follows::

[[mask_idx_0, start_vertex_idx_0, end_vertex_idx_0],
[mask_idx_1, start_vertex_idx_1, end_vertex_idx_1],
...
[mask_idx_P, start_vertex_idx_P, end_vertex_idx_P]]

where ``mask_idx`` is the index of the mask the polygon, in the range ``[0, M)``, and ``start_vertex_idx`` and ``end_verted_idx``
define the range of indices of vertices, as they appear in the output ``vertices``, belonging to this polygon.

Each sample in ``vertices`` contains a list of vertices that composed the different polygons in the sample, as 2D coordinates::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making this a list:

* **images**
  Each sample contains image data with layout ``HWC`` (height, width, channels).
* **bounding_boxes**
  Each sample can have an arbitrary ``M`` number of bounding boxes, each described by 4 coordinates::

    [[x_0, y_0, w_0, h_0],
     [x_1, y_1, w_1, h_1]
     ...
     [x_M, y_M, w_M, h_M]]

  or in ``[l, t, r, b]`` format if requested (see ``ltrb`` argument).
* **labels**
  Each bounding box is associated with an integer label representing a category identifier::
        
    [label_0, label_1, ..., label_M]

* **polygons** and **vertices** (Optional, present if ``polygon_masks`` is set to True)
  If ``polygon_masks`` is enabled, two extra outputs describing masks by a set of polygons.
  Each mask contains an arbitrary number of polygons ``P``, each associated with a mask index in the range [0, M) and 
  composed by a group of ``V`` vertices. The output ``polygons`` describes the polygons as follows::
  
    [[mask_idx_0, start_vertex_idx_0, end_vertex_idx_0],
     [mask_idx_1, start_vertex_idx_1, end_vertex_idx_1],
     ...
     [mask_idx_P, start_vertex_idx_P, end_vertex_idx_P]]

  where ``mask_idx`` is the index of the mask the polygon, in the range ``[0, M)``, and ``start_vertex_idx`` and  ``end_verted_idx``
  define the range of indices of vertices, as they appear in the output ``vertices``, belonging to this polygon.
  Each sample in ``vertices`` contains a list of vertices that composed the different polygons in the sample, as 2D coordinates::
  
    [[x_0, y_0],
     [x_1, y_1],
     ...
     [x_V, y_V]]

* **pixelwise_masks** (Optional, present if argument ``pixelwise_masks`` is set to True)
  Contains image-like data, same shape and layout as ``images``, representing a pixelwise segmentation mask.
* **image_ids** (Optional, present if argument ``image_ids`` is set to True)
  One element per sample, representing an image identifier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1742158]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1742158]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1742317]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1742317]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1744533]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1744533]: BUILD PASSED

.AddOptionalArg("dtype",
R"code(Output data type.)code",
DALI_FLOAT)
DALI_DATA_TYPE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bugfix

@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1745296]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1745296]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1745364]: BUILD STARTED

@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1745735]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1745735]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1746226]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1746226]: BUILD PASSED

@jantonguirao jantonguirao merged commit 523d56d into NVIDIA:master Oct 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants