Improvements in COCO reader API #2406
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
!build
CI MESSAGE: [1737859]: BUILD STARTED
CI MESSAGE: [1737859]: BUILD FAILED
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@@ -200,7 +200,7 @@
 "labels = labels_cpu.at(img_index)\n",
 "categories_set = set()\n",
 "for label in labels:\n",
-" categories_set.add(label[0])\n",
+" categories_set.add(label)\n",
This was the only code where labels were expected to have an extra dimension. In the detection pipelines the labels go to the box encoder, and the extra dimension is flattened there.
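As an illustration of the change discussed in this comment, here is a minimal sketch of collecting the category set with and without the trailing dimension. The label values are made up and plain Python lists stand in for the reader's `labels` output; none of this is taken from the notebook itself.

```python
# Hypothetical labels for one sample with three boxes (illustrative values only).
labels_old = [[17], [3], [17]]                # old layout: shape (M, 1)
labels_new = [row[0] for row in labels_old]   # new layout: shape (M,)

# Old layout: every entry is a length-1 sequence, so it had to be indexed.
categories_old = set()
for label in labels_old:
    categories_old.add(label[0])

# New layout: every entry is already a scalar.
categories_new = set()
for label in labels_new:
    categories_new.add(label)
```

Both loops produce the same set of categories; only the indexing differs.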
JanuszL commented on 2020-10-28T11:37:02Z (via ReviewNB):
> Each entry in the vertices contains two coordinates (x, y)
I would say that `Each entry in the vertices contains coordinates (x, y respectively, for 2D polygons)`.
879db31 to 5710dc6
@@ -94,7 +95,7 @@ void dump_filenames(const ImageIdPairs &image_id_pairs, const std::string path)
 }

 template <typename T>
-void load_meta_file(std::vector<T> &output, const std::string path) {
+void LoadFromFile(std::vector<T> &output, const std::string path) {
 std::ifstream file(path);
 DALI_ENFORCE(file.good(), make_string("Error writing to path: ", path));
Suggested change:
-DALI_ENFORCE(file.good(), make_string("Error writing to path: ", path));
+DALI_ENFORCE(file.good(), make_string("CocoReader meta file error while loading for path: ", path));
@@ -119,7 +120,7 @@ void load_meta_file(std::vector<std::vector<T> > &output, const std::string path
 }
 }

-void load_filenames(ImageIdPairs &image_id_pairs, const std::string path) {
+void LoadFilenamesFromFile(ImageIdPairs &image_id_pairs, const std::string path) {
 std::ifstream file(path);
 DALI_ENFORCE(file, "CocoReader meta file error while loading for path: " + path);
Suggested change:
-DALI_ENFORCE(file, "CocoReader meta file error while loading for path: " + path);
+DALI_ENFORCE(file.good(), make_string("CocoReader meta file error while loading for path: ", path));
sample_mask_meta.push_back(objects_in_sample);
sample_mask_meta.push_back(obj_coords_offset + annotation.poly_.segm_meta_[i]);
sample_mask_meta.push_back(obj_coords_offset + annotation.poly_.segm_meta_[i + 1]);
auto segm_meta = annotation.poly_.segm_meta_.data();
Suggested change:
-auto segm_meta = annotation.poly_.segm_meta_.data();
+auto &segm_meta = annotation.poly_.segm_meta_;
Signed-off-by: Joaquin Anton <janton@nvidia.com>
5710dc6 to 0d21763
Each mask can be one or more polygons, and for a given sample, the polygons are represented by the
following tensors:
.DeprecateArg("masks", false,  // deprecated since 0.28dev
"``masks`` argument is now deprecated. Please use ``polygon_masks`` instead "
Maybe we should keep a note describing what the deprecated format looks like?
done
images and annotation JSON files.

This reader produces the following outputs::

    images, bounding_boxes, labels, ((polygons, vertices) | (pixelwise_masks)), (image_ids)

**images**

Each sample contains image data with layout ``HWC`` (height, width, channels).

**bounding_boxes**

Each sample can have an arbitrary ``M`` number of bounding boxes, each described by 4 coordinates::

    [[x_0, y_0, w_0, h_0],
     [x_1, y_1, w_1, h_1]
     ...
     [x_M, y_M, w_M, h_M]]

or in ``[l, t, r, b]`` format if requested (see ``ltrb`` argument).

**labels**

Each bounding box is associated with an integer label representing a category identifier::

    [label_0, label_1, ..., label_M]

**polygons** and **vertices** (Optional, present if ``polygon_masks`` is set to True)

If ``polygon_masks`` is enabled, two extra outputs describe the masks as a set of polygons.

Each mask contains an arbitrary number of polygons ``P``, each associated with a mask index in the range ``[0, M)`` and
composed of a group of ``V`` vertices. The output ``polygons`` describes the polygons as follows::

    [[mask_idx_0, start_vertex_idx_0, end_vertex_idx_0],
     [mask_idx_1, start_vertex_idx_1, end_vertex_idx_1],
     ...
     [mask_idx_P, start_vertex_idx_P, end_vertex_idx_P]]

where ``mask_idx`` is the index of the mask the polygon belongs to, in the range ``[0, M)``, and ``start_vertex_idx`` and ``end_vertex_idx``
define the range of indices of vertices, as they appear in the output ``vertices``, belonging to this polygon.

Each sample in ``vertices`` contains a list of vertices that compose the different polygons in the sample, as 2D coordinates::
How about making this a list:

* **images**

  Each sample contains image data with layout ``HWC`` (height, width, channels).

* **bounding_boxes**

  Each sample can have an arbitrary ``M`` number of bounding boxes, each described by 4 coordinates::

    [[x_0, y_0, w_0, h_0],
     [x_1, y_1, w_1, h_1]
     ...
     [x_M, y_M, w_M, h_M]]

  or in ``[l, t, r, b]`` format if requested (see ``ltrb`` argument).

* **labels**

  Each bounding box is associated with an integer label representing a category identifier::

    [label_0, label_1, ..., label_M]

* **polygons** and **vertices** (Optional, present if ``polygon_masks`` is set to True)

  If ``polygon_masks`` is enabled, two extra outputs describe the masks as a set of polygons.
  Each mask contains an arbitrary number of polygons ``P``, each associated with a mask index in the range ``[0, M)`` and
  composed of a group of ``V`` vertices. The output ``polygons`` describes the polygons as follows::

    [[mask_idx_0, start_vertex_idx_0, end_vertex_idx_0],
     [mask_idx_1, start_vertex_idx_1, end_vertex_idx_1],
     ...
     [mask_idx_P, start_vertex_idx_P, end_vertex_idx_P]]

  where ``mask_idx`` is the index of the mask the polygon belongs to, in the range ``[0, M)``, and ``start_vertex_idx`` and ``end_vertex_idx``
  define the range of indices of vertices, as they appear in the output ``vertices``, belonging to this polygon.
  Each sample in ``vertices`` contains a list of vertices that compose the different polygons in the sample, as 2D coordinates::

    [[x_0, y_0],
     [x_1, y_1],
     ...
     [x_V, y_V]]

* **pixelwise_masks** (Optional, present if argument ``pixelwise_masks`` is set to True)

  Contains image-like data, same shape and layout as ``images``, representing a pixelwise segmentation mask.

* **image_ids** (Optional, present if argument ``image_ids`` is set to True)

  One element per sample, representing an image identifier.
done
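To make the `polygons` / `vertices` layout described in this thread concrete, here is a minimal sketch that groups polygon vertex lists by mask index. The helper name `polygons_by_mask` and the sample data are hypothetical, and the sketch assumes `end_vertex_idx` is exclusive; the docstring only says the two indices define the range, so check that against the reader's actual output.

```python
from collections import defaultdict


def polygons_by_mask(polygons, vertices):
    """Group polygon vertex lists by mask index.

    polygons: rows of [mask_idx, start_vertex_idx, end_vertex_idx]
    vertices: rows of [x, y]
    Assumption: end_vertex_idx is exclusive (not confirmed by the docstring).
    """
    masks = defaultdict(list)
    for mask_idx, start, end in polygons:
        # Slice out the vertices that belong to this polygon.
        masks[mask_idx].append(vertices[start:end])
    return dict(masks)


# Made-up sample: mask 0 has two polygons, mask 1 has one.
vertices = [
    [0.0, 0.0], [4.0, 0.0], [4.0, 3.0],              # polygon 0 (mask 0)
    [1.0, 1.0], [2.0, 1.0], [2.0, 2.0], [1.0, 2.0],  # polygon 1 (mask 0)
    [5.0, 5.0], [6.0, 5.0], [6.0, 6.0],              # polygon 2 (mask 1)
]
polygons = [[0, 0, 3], [0, 3, 7], [1, 7, 10]]

grouped = polygons_by_mask(polygons, vertices)
```

Here `grouped[0]` holds the two polygons of the first mask and `grouped[1]` the single polygon of the second.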
Signed-off-by: Joaquin Anton <janton@nvidia.com>
df4a273 to d073c79
!build
CI MESSAGE: [1742158]: BUILD STARTED
CI MESSAGE: [1742158]: BUILD FAILED
Signed-off-by: Joaquin Anton <janton@nvidia.com>
12f5b9f to aea6a9b
!build
CI MESSAGE: [1742317]: BUILD STARTED
CI MESSAGE: [1742317]: BUILD FAILED
!build
CI MESSAGE: [1744533]: BUILD STARTED
CI MESSAGE: [1744533]: BUILD PASSED
5205f2e to 0fac87e
 .AddOptionalArg("dtype",
 R"code(Output data type.)code",
-DALI_FLOAT)
+DALI_DATA_TYPE)
bugfix
!build
CI MESSAGE: [1745296]: BUILD STARTED
CI MESSAGE: [1745296]: BUILD FAILED
Signed-off-by: Joaquin Anton <janton@nvidia.com>
0fac87e to aeb459f
!build
CI MESSAGE: [1745364]: BUILD STARTED
!build
CI MESSAGE: [1745735]: BUILD STARTED
CI MESSAGE: [1745735]: BUILD FAILED
82cf81a to 2917013
!build
CI MESSAGE: [1746226]: BUILD STARTED
CI MESSAGE: [1746226]: BUILD PASSED
Why we need this PR?
Pick one, remove the rest
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
Changed the format of the mask polygon descriptor to use indices of vertices rather than indices of coordinates
Renamed and deprecated some ambiguously named arguments
Removed trailing dimension from `labels` output
Added handling of mask polygons in COCO reader example
Reworked the way that COCOLoader and COCOReader share data
COCOReader
Changes in the COCO reader
Existing tests and jupyter example
COCO reader example enhanced
JIRA TASK: [DALI-1686]
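The first point in the summary above (indices of vertices rather than indices of coordinates) can be sketched as a conversion between the two descriptor styles. This is only an illustration under stated assumptions: it supposes the earlier descriptor stored `[mask_idx, start_coord_idx, end_coord_idx]` pointing into a flat `[x_0, y_0, x_1, y_1, ...]` list, which this thread does not spell out. The function name `to_vertex_indices` and the data are hypothetical.

```python
def to_vertex_indices(polygons_by_coord, coords_flat, ndim=2):
    """Convert a coordinate-indexed polygon descriptor to a vertex-indexed one.

    Assumption (not confirmed by the PR text): the old descriptor rows are
    [mask_idx, start_coord_idx, end_coord_idx] into a flat coordinate list.
    """
    if len(coords_flat) % ndim != 0:
        raise ValueError("coordinate count is not a multiple of ndim")

    # Regroup the flat coordinates into one row per vertex.
    vertices = [coords_flat[i:i + ndim] for i in range(0, len(coords_flat), ndim)]

    polygons = []
    for mask_idx, start, end in polygons_by_coord:
        if start % ndim or end % ndim:
            raise ValueError("polygon range does not fall on a vertex boundary")
        # A vertex index is the coordinate index divided by the dimensionality.
        polygons.append([mask_idx, start // ndim, end // ndim])
    return polygons, vertices


# Made-up data: two triangles, one per mask.
coords_flat = [0, 0, 4, 0, 4, 3, 1, 1, 2, 1, 2, 2]
old_polygons = [[0, 0, 6], [1, 6, 12]]

new_polygons, vertices = to_vertex_indices(old_polygons, coords_flat)
```

With vertex indices, consumers can slice `vertices` directly without knowing the coordinate dimensionality.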