Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CoordFlip CPU operator #1894

Merged
merged 6 commits into from
Apr 28, 2020
Merged

Conversation

jantonguirao
Copy link
Contributor

Signed-off-by: Joaquin Anton janton@nvidia.com

Why we need this PR?

  • It adds new feature, Coordinate flip, needed to complete MaskRCNN pipeline

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    Added Coordinate Flip CPU operator
  • Affected modules and functionalities:
    New operator
  • Key points relevant for the review:
    The operator implementation
  • Validation and testing:
    Python operator tests added
  • Documentation (including examples):
    NA

JIRA TASK: [DALI-1392]  

int64_t i = 0;
for (; i < in_size; i++, d++) {
if (d == ndim_) d = 0;
assert(in[i] >= 0.0f && in[i] <= 1.0f);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe enforce, we can get this data from the ExternalSource, so it is a user error.

Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Comment on lines 35 to 37
.AddOptionalArg("horizontal", R"code(Perform flip along horizontal axis.)code", 1, true)
.AddOptionalArg("vertical", R"code(Perform flip along vertical axis.)code", 0, true)
.AddOptionalArg("depthwise", R"code(Perform flip along depthwise axis.)code", 0, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, that image flip is (unfortunately) defined the opposite. horizontal denotes flip along vertical axis. Maybe we should unify it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at least per docs: horizontal (int, optional, default = 1) – Perform a horizontal flip.

On the other hand BbFlip does the same as this operator, we're already not consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dayum :( Let's discuss what to do with it

bool horizontal_flip = spec_.GetArgument<int>("horizontal", &ws, sample_id);
bool vertical_flip = spec_.GetArgument<int>("vertical", &ws, sample_id);
bool depthwise_flip = spec_.GetArgument<int>("depthwise", &ws, sample_id);
std::array<bool, 3> flip_dim = {false, false, false};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::array has no specialization for bool. Maybe we'd be better with std::vector<bool> here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vector of bool should be killed with fire and purged from existence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want 3 bools, no need for dynamic allocation


DALI_SCHEMA(CoordFlip)
.DocStr(
R"code(Transforms normalized coordinates (range [0.0, 1.0]) so that they map to the same place after
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding also a version for not normalized coordinates? We have tensor shape, so simple switch in API makes do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done with the new patch


DALI_ENFORCE(in_shape[0].size() == 2);
ndim_ = in_shape[0][1];
DALI_ENFORCE(ndim_ >= 1 && ndim_ <= 3, make_string("Unexpected number of dimensions ", ndim_));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could work for 0-dim, right? Just return 1-input

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the meaning of a 0D coordinate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will never be 0. Hovever, it's going to be possible to have in.sample_dim() to be 0 - scalar 1D coordinate (not a vector) - in which case I think we should just assume it's X.

Comment on lines +99 to +101
int d = 0;
int64_t i = 0;
for (; i < in_size; i++, d++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nitpick, but I'd appreciate a little bit more descriptive names ;) Like dim_idx and coor_idx

Comment on lines 28 to 32
Possible values are:

``x`` (horizontal position), ``y`` (vertical position), ``z`` (depthwise position),

Note: If left empty, ``"xy"`` or ``"xyz"`` will be assumed, depending on the number of dimensions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that this won't look very well in the docs (but we will need to check it.
How about using bullets here?


void CoordFlipCPU::RunImpl(workspace_t<CPUBackend> &ws) {
const auto &input = ws.InputRef<CPUBackend>(0);
DALI_ENFORCE(input.type().id() == DALI_FLOAT, "Input is expected to be float");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you maybe move this enforce to Setup? If we decide to have some type/shape inference, if the input data is wrong (for example u8), we will report that we produce u8 and later report error during Run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

auto &thread_pool = ws.GetThreadPool();

if (layout_.empty()) {
layout_ = ndim_ == 2 ? "xy" : "xyz";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the ndim_ is 1 you end up with wrong layout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 104 to 105
DALI_ENFORCE(in_val >= 0.0f && in_val <= 1.0f,
"Input expected to be within the range [0.0, 1.0]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we already had this discussion, but isn't this a bit over-defensive? You're probably not reporting such errors from the GPU Op. As you are flipping along x=0.5 here (or whatever the axis), it will work regardless of the range.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Comment on lines 79 to 83
def test_operator_coord_flip():
for device in ['cpu']:
for batch_size in [1, 3]:
for layout, shape in [("xy", (10, 2)), ("xyz", (10, 3))]:
yield check_operator_coord_flip, device, batch_size, layout, shape
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also check 1 dim as it's supported.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe 0-dim also should be supported

Copy link
Contributor

@klecki klecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please verify scalar arguments and 1-dim case.

I also think that the implementation is a bit overprotective - I didn't spot Janusz's comment. I still think last time we decided on Garbage in - garbage out approach. Especially that it won't be exactly garbage.

@JanuszL
Copy link
Contributor

JanuszL commented Apr 24, 2020

I also think that the implementation is a bit overprotective - I didn't spot Janusz's comment.

I focused on how to do it, not if it is justified. You are right. Maybe we should not put such restrictions. We had some discussion with @mzient that bboxes or polygons may span beyond the image and some networks may still produce a valid result - if you see half a car in the image you can still tell where it ends.

@klecki
Copy link
Contributor

klecki commented Apr 27, 2020

I also think that the implementation is a bit overprotective - I didn't spot Janusz's comment.

I focused on how to do it, not if it is justified. You are right. Maybe we should not put such restrictions. We had some discussion with @mzient that bboxes or polygons may span beyond the image and some networks may still produce a valid result - if you see half a car in the image you can still tell where it ends.

Especially that it still would flip them correctly.

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao jantonguirao force-pushed the coord_flip branch 2 times, most recently from a03bac2 to 9403852 Compare April 27, 2020 15:34
Note: If left empty, ``"x"``, ``"xy"`` or ``"xyz"`` will be assumed, depending on the number of dimensions.
)code",
TensorLayout{""})
.AddOptionalArg("horizontal", R"code(Flip horizontal dimension.)code", 1, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe they should be called flip_x, flip_y, flip_z ? Just a thought/question. Current ones are compatible with other operators, but since the order of coordinates is different here (XYZ vs (Z)YX as in (D)HWC layout), this compatibility may cause more problems than it solves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Joaquin Anton <janton@nvidia.com>
expected_out_coords[:, d] = 1.0 - in_coords[:, d]
np.testing.assert_allclose(out_coords[:, d], expected_out_coords[:, d])

def test_operator_coord_flip():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about testing center_* argument as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@JanuszL JanuszL self-requested a review April 28, 2020 09:04
Comment on lines +78 to +79
flip_dim[y_dim] = spec_.GetArgument<int>("flip_y", &ws, sample_id);
flip_dim[z_dim] = spec_.GetArgument<int>("flip_z", &ws, sample_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

y_dim and z_dim can be -1 if they are not in layout (1D and 2D case). Same below.

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1285277]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1285277]: BUILD PASSED

@jantonguirao jantonguirao merged commit 4c1b3b3 into NVIDIA:master Apr 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants