Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RandomBBoxCrop errors while using crop_shape argument #2605

Merged
merged 6 commits into from
Jan 21, 2021

Conversation

wonrzrzeczny
Copy link
Contributor

Why we need this PR?

Pick one, remove the rest

  • It fixes anchor/shape data type returned by calling RandomBBoxCrop with supplied crop_shape/input_shape argument and prevents segfaults in case crop_shape > input_shape.

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    Anchor and shape outputs of RandomBBoxCrop call with provided crop_shape/input_shape are cast to int before being returned, which prevents pixel dimensions from being interpreted as relative by other operators (eg. Slice). Additionally, anchor selection is handled differently when crop_shape > input_shape, which prevents segfaults previously caused by using std::uniform_int_distribution with incorrect constructor params.
  • Affected modules and functionalities:
    Changes to dali/operators/image/crop/bbox_crop.cc
  • Key points relevant for the review:
    NA
  • Validation and testing:
    NA
  • Documentation (including examples):
    NA

JIRA TASK: [Use DALI-XXXX or NA]

Signed-off-by: Kacper Kluk <kk406184@students.mimuw.edu.pl>
Signed-off-by: Kacper Kluk <kk406184@students.mimuw.edu.pl>
Signed-off-by: Kacper Kluk <kk406184@students.mimuw.edu.pl>
@JanuszL
Copy link
Contributor

JanuszL commented Jan 13, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1976940]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1976940]: BUILD PASSED

auto &anchor_out = ws.template OutputRef<CPUBackend>(0);
anchor_out.Resize(uniform_list_shape(num_samples, {ndim}));
auto anchor_out_view = view<float>(anchor_out);
DALIDataType output_dtype = has_crop_shape_ ? DALI_INT32 : DALI_FLOAT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is a good idea to change the output type based on presence of one of the arguments.
Maybe I always should be DALI_FLOAT? Or just DALI_INT32 and use cast operator after that operator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if user provided crop_shape argument (which represents the exact size in pixels of desired crop region) then output anchors and shapes should also be returned as integers. Returning them as relative float values could potentially result in output shape being off by 1 pixel due to fp arithmetic errors and this behaviour could break the pipeline in some use cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about returning always FLOAT in that case (but with switch that is by default on that changes it to int - so we won't break the existing code while introducing new functionality).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requirement for different output type depending on whether crop_shape argument was provided or not is imposed because of the way Slice operator interprets its input arguments (int -> pixel coords, float -> normalized coords). I believe that current fix won't break any existing code - any previous usage of RandomBBoxCrop operator with crop_shape provided required additional cast to int to be declared explicitly in pipeline - current fix will provide this cast unnecessary, although the results will be the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other solution I see is to always return int values (in pixel coordinates), even when using scaling instead of crop_shape (possibly with switch to return the output in normalized coords instead). However, this will probably require much more existing code to be rewritten.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's another problem. We cannot return pixel coordinates with scaling provided as we lack image shape data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that returning different types should be OK, given that Slice operator interprets those differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've discussed internally with the team and came to the conclusion that changing the output type of the operator silently, based on the presence of another argument, is not a good user experience, since it could break some existing code. So, for the sake of backward compatibility we should keep DALI_FLOAT by default.

I can suggest some solutions:

  1. Use normalized_anchor=False, normalized_shape=False in Slice, so that you don't need to cast to int32
  2. Cast to int32 (it adds extra overhead, not recommended)
  3. Add a output_dtype argument to this operator, and only use int32 when it was explicitly requested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I've found pixel coordinates returned as a DALI_FLOAT confusing, because of the way Slice works by default, although I understand your point. Currently I've reverted my changes (leaving only the sigsegv fix). Maybe additional clarification should be added to RandomBBoxCrop docs?

for d in range(ndim):
assert(anchor[d] >= 0.0 and anchor[d] <= input_shape[d])
assert shape[d] == expected_crop_shape[d], "{} != {}".format(shape, expected_crop_shape)
assert(anchor[d] + shape[d] > 0.0 and anchor[d] + shape[d] <= input_shape[d])
assert(anchor[d] + shape[d] > 0.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why removing the second check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bottom right corner can be outside original image boundaries if crop_shape > input_shape.

anchor_out.Resize(uniform_list_shape(num_samples, {ndim}));
auto anchor_out_view = view<float>(anchor_out);
DALIDataType output_dtype = has_crop_shape_ ? DALI_INT32 : DALI_FLOAT;
TYPE_SWITCH(output_dtype, type2id, type, (int, float), (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TYPE_SWITCH(output_dtype, type2id, type, (int, float), (
TYPE_SWITCH(output_dtype, type2id, T, (int, float), (

would be more in line with other parts of the codebase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

auto &anchor_out = ws.template OutputRef<CPUBackend>(0);
anchor_out.Resize(uniform_list_shape(num_samples, {ndim}));
auto anchor_out_view = view<float>(anchor_out);
DALIDataType output_dtype = has_crop_shape_ ? DALI_INT32 : DALI_FLOAT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that returning different types should be OK, given that Slice operator interprets those differently.

Signed-off-by: Kacper Kluk <kk406184@students.mimuw.edu.pl>
Signed-off-by: Kacper Kluk <kk406184@students.mimuw.edu.pl>
@JanuszL
Copy link
Contributor

JanuszL commented Jan 19, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1993040]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1993040]: BUILD PASSED

Signed-off-by: Kacper Kluk <kk406184@students.mimuw.edu.pl>
@JanuszL
Copy link
Contributor

JanuszL commented Jan 20, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1997445]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1997445]: BUILD PASSED

@awolant awolant merged commit 548cc7e into NVIDIA:master Jan 21, 2021
@wonrzrzeczny wonrzrzeczny deleted the bboxcrop_fixes branch January 21, 2021 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants