-
Notifications
You must be signed in to change notification settings - Fork 610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SelectMasks operator #2381
Add SelectMasks operator #2381
Conversation
618cd67
to
73e0970
Compare
DALI_SCHEMA(segmentation__SelectMasks) | ||
.DocStr(R"(Selects a subset of mask by their mask ids. | ||
|
||
The operator expects ``mask_meta`` and ``mask_coords``, containing segmentation mask. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add an info about mask_ids as well.
|
||
Example 2: Selecting two out of the three mask, reindexing the mask ids to follow the order of the input mask ids:: | ||
|
||
``mask_ids`` = [2, 0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add leading tabs in this example.
|
||
Example 2: Selecting two out of the three mask, reindexing the mask ids to follow the order of the input mask ids:: | ||
|
||
``mask_ids`` = [2, 0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to allow such inversion, what is the use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is that the mask_ids are actually indices of the bounding boxes in the batch.
Let's say we had three bounding boxes [A, B, C] with corresponding masks ids [0, 1, 2].
Now random bbox crop filtered out B and we are left with [A, B] and an output describing the indices of the bounding boxes that stayed: [0, 2].
What we want now is select the masks that refer to those bounding boxes that stayed, so we filter mask_ids [0, 2]. However, the bounding boxes now are two, so we have indices [0, 1]. To be able to map a mask to a bounding box the mask id should have the proper index, so we do this mapping: mask_id remains 0, but mask_id is now 1 because that's the index of the bounding box in the cropped image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the reason for reindexing, but I was asking about the [0, 2] case. Do we expect operators to change the order of bboxes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the order doesn't matter much. I mean, there is no special reason to reorder them, it is just that we are processing them as they appear in the mask_ids input.
|
||
const auto &in_mask_coords = ws.template InputRef<CPUBackend>(2); | ||
auto in_mask_coords_shape = in_mask_coords.shape(); | ||
DALI_ENFORCE(in_mask_coords.type().id() == DALI_FLOAT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not support ints as well here?
in_mask_meta_shape[i][1], " elements")); | ||
} | ||
|
||
int coord_ndim = in_mask_coords_shape[0][1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if in_mask_coords_shape[0] is zero lenght?
|
||
const auto &in_mask_ids_view = view<const int32_t, 1>(in_mask_ids); | ||
const auto &in_mask_meta_view = view<const int32_t, 2>(in_mask_meta); | ||
const auto &in_mask_coords_view = view<const float, 2>(in_mask_coords); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not used.
def get_data_source(*args, **kwargs): | ||
from test_segmentation_utils import make_batch_select_masks | ||
return lambda: make_batch_select_masks(*args, **kwargs) | ||
pipe = Pipeline(batch_size=batch_size, num_threads=4, device_id=0, seed=1234) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pipe = Pipeline(batch_size=batch_size, num_threads=4, device_id=0, seed=1234) | |
pipe = Pipeline(batch_size=batch_size, num_threads=4, device_id=None, seed=1234) |
@@ -659,4 +659,23 @@ def test_combine_transforms_cpu(): | |||
for _ in range(3): | |||
pipe.run() | |||
|
|||
def test_segmentation_select_masks(): | |||
def get_data_source(*args, **kwargs): | |||
from test_segmentation_utils import make_batch_select_masks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be a global import.
int mask_id = meta.selected_masks[k]; | ||
auto it = meta.masks_meta.find(mask_id); | ||
if (it == meta.masks_meta.end()) { | ||
DALI_FAIL(make_string("Selected mask_id ", mask_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't that FAIL be moved to L136?
I don't think that is possible here as:
for (auto mask_id : meta.selected_masks) {
if (meta.masks_meta.find(mask_id) != meta.masks_meta.end()) {
DALI_WARN(make_string("mask_id ", mask_id, " is duplicated. Ignoring..."));
continue;
}
meta.masks_meta[mask_id].mask_id = mask_id;
meta.masks_meta[mask_id].new_mask_id = reindex_masks_ ? idx++ : mask_id;
}
So instead of range for, you have 0 to nselected for, but sill mask_ids are the same.
out_coords = out_masks_coords[out_coord_start:out_coord_end+1] | ||
assert (expected_out_coords == out_coords).all() | ||
|
||
def test_select_masks(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a negative case to triger any enforce you have in your setup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
void RunImpl(workspace_t<CPUBackend> &ws) override; | ||
|
||
struct MaskMeta { | ||
int mask_id = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that field is usefull.
DALI_WARN(make_string("mask_id ", mask_id, " is duplicated. Ignoring...")); | ||
continue; | ||
} | ||
meta.masks_meta[mask_id].mask_id = mask_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need mask_id field.
156589b
to
322c3e9
Compare
R"code(Each row describes a single mask with three values, ``mask_idx, start_idx, end_idx``, | ||
where ``start_idx`` and ``end_idx`` refer to indices in ``mask_coords``.)code") | ||
.InputDox(2, "mask_coords", "2D TensorList of float or int", | ||
R"code(Each row contains a single ``x, y`` coordinate.)code") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
R"code(Each row contains a single ``x, y`` coordinate.)code") | |
R"code(Each row contains a single set of coordinates, kike ``x, y`` or ``x, y, z``.code") |
...and, of course, we should support more than 2D points.
R"code(If set to True, new mask ids are reassigned to each of the selected mask, so that they | ||
follow zero-based index of the mask ids in the input.)code", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
R"code(If set to True, new mask ids are reassigned to each of the selected mask, so that they | |
follow zero-based index of the mask ids in the input.)code", | |
R"(If set to True, the output mask ids are replaced with the indices at which they appeared in ``mask_ids`` input.)", |
for (int i = 0; i < nsamples; i++) { | ||
auto sh = in_mask_meta_shape.tensor_shape_span(i); | ||
DALI_ENFORCE(3 == sh[1], | ||
make_string("``mask_meta`` is expected to contain 3 element rows, containing " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make_string("``mask_meta`` is expected to contain 3 element rows, containing " | |
make_string("``mask_meta`` is expected to contain 2D tensors with 3 columns: " |
DALI_ENFORCE( | ||
mask.start_coord >= 0 && mask.end_coord < input_total_ncoords, | ||
make_string( | ||
"Coordinate index range for mask id ", mask_id, " [", mask.start_coord, ", ", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Coordinate index range for mask id ", mask_id, " [", mask.start_coord, ", ", | |
"Vertex index range for mask id ", mask_id, " [", mask.start_coord, ", ", |
0660071
to
9cece8a
Compare
R"code(Vertex data, of arbitrary dimensionality | ||
(the number of dimensions should be the same for every vertex).)code") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- "...data of arbitrary dimensionality..." sounds a bit ambiguous - the vertices or the tensor itself?
- There's no way to supply vertices of different dimensionality within one sample, because it would not be a tensor
R"code(Vertex data, of arbitrary dimensionality | |
(the number of dimensions should be the same for every vertex).)code") | |
R"code(Vertex data stored in interleaved format:: | |
[[x0, y0], [x1, y1], ... , [xn, yn]] | |
The operator accepts vertices with arbitrary number of coordinates.)code") |
for (int i = 1; i < nsamples; i++) { | ||
auto sh = in_vertices_shape.tensor_shape_span(i); | ||
DALI_ENFORCE(vertex_ndim == sh[1], | ||
make_string("All vertices are expected to have the same dimensionality. Got ", | ||
vertex_ndim, "D and ", sh[1], "D in the same batch")); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we care? It seems to us that it doesn't make sense, but it is an artificial constraint and someone might want to abuse this argument for some purposes and we shouldn't disallow it.
507aa3f
to
25ad5fb
Compare
.NumOutput(2) | ||
.InputDox(0, "mask_ids", "1D TensorList of int", | ||
R"code(List of identifiers of the masks to be selected.)code") | ||
.InputDox(1, "polygons", "2D TensorList of int", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started wondering if we should extend this to 3D as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3D is supported. Here 2D refers to the shape of the data:
[[x0, y0, (z0, ...)], ..., [xN, yN, (zN, ...)]
25ad5fb
to
6b841ad
Compare
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
6b841ad
to
93dbf3c
Compare
!build |
CI MESSAGE: [1748057]: BUILD STARTED |
CI MESSAGE: [1748057]: BUILD PASSED |
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@@ -0,0 +1,52 @@ | |||
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick:
The filename sounds like it's testing segmentation utils. If these are utilities for testing segmentation, then the correct name would be segmentation_test_utils.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if (polygons.find(mask_id) != polygons.end()) { | ||
DALI_WARN(make_string("mask_id ", mask_id, " is duplicated. Ignoring...")); | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? I bet this will bite us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disallowed duplicates completely now
poly.end_vertex = poly_data[2]; | ||
|
||
DALI_ENFORCE( | ||
poly.start_vertex >= 0 && poly.end_vertex <= in_nvertices, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing check for poly.start_vertex <= poly.end_vertex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
bbox_x = fn.slice(in_bboxes, 0, 1, axes=[1]) | ||
bbox_y = fn.slice(in_bboxes, 1, 1, axes=[1]) | ||
bbox_X = fn.slice(in_bboxes, 2, 1, axes=[1]) | ||
bbox_Y = fn.slice(in_bboxes, 3, 1, axes=[1]) | ||
in_bboxes_abs = fn.cat(bbox_x * w, bbox_y * h, bbox_X * w, bbox_Y * h, axis=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bbox_x = fn.slice(in_bboxes, 0, 1, axes=[1]) | |
bbox_y = fn.slice(in_bboxes, 1, 1, axes=[1]) | |
bbox_X = fn.slice(in_bboxes, 2, 1, axes=[1]) | |
bbox_Y = fn.slice(in_bboxes, 3, 1, axes=[1]) | |
in_bboxes_abs = fn.cat(bbox_x * w, bbox_y * h, bbox_X * w, bbox_Y * h, axis=1) | |
to_abs = fn.transforms.scale(scale=fn.cat(w, h, w, h, axis=0)) | |
in_bboxes_abs = fn.coord_transform(in_bboxes, MT=to_abs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the matrix-vector multiplication it should be cheaper just because there are much fewer operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we agreed offline, it's more straight forward the way it is now
Signed-off-by: Joaquin Anton <janton@nvidia.com>
a5d78be
to
9e6d1ad
Compare
!build |
CI MESSAGE: [1749311]: BUILD STARTED |
CI MESSAGE: [1749311]: BUILD FAILED |
CI MESSAGE: [1749311]: BUILD PASSED |
Signed-off-by: Joaquin Anton janton@nvidia.com
Why we need this PR?
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
Added a segmentation.SelectMasks operator that selects a subset of masks based on a list of indices
New operator
All
Tests added
Docstr
JIRA TASK: [DALI-1680]