Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add segmentation.RandomMaskPixel operator #2445

Merged
merged 8 commits into from
Nov 13, 2020

Conversation

jantonguirao
Copy link
Contributor

@jantonguirao jantonguirao commented Nov 9, 2020

Signed-off-by: Joaquin Anton janton@nvidia.com

Why we need this PR?

Pick one, remove the rest

  • It adds new feature needed to be able to perform a random crop operation where foreground is always represented up to a minimum ratio

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    Added a new operator segmentation.RandomMaskPixel that selects a pixel either randomly or to point to any of the available foreground pixels, specified by an input mask and threhold/value arguments
  • Affected modules and functionalities:
    No existing functionality is affected
  • Key points relevant for the review:
    All
  • Validation and testing:
    Python tests
  • Documentation (including examples):
    Operator documentation in schema

JIRA TASK: [DALI-1723]

@jantonguirao jantonguirao force-pushed the biased_crop_center branch 2 times, most recently from cf97ab9 to 150f6ad Compare November 9, 2020 16:21
@jantonguirao jantonguirao requested a review from a team November 9, 2020 16:23
Signed-off-by: Joaquin Anton <janton@nvidia.com>

DALI_SCHEMA(segmentation__BiasedCropCenter)
.DocStr(R"(Selects a cropping window center which can be selected randomly from either
any position in the input or any position of a foreground pixel in the input, based on an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
any position in the input or any position of a foreground pixel in the input, based on an
at any position in the input or any position of a foreground pixel in the input, based on an


DALI_SCHEMA(segmentation__BiasedCropCenter)
.DocStr(R"(Selects a cropping window center which can be selected randomly from either
any position in the input or any position of a foreground pixel in the input, based on an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we should use foreground or just non zero value, because the meaning of the foreground if not defined.


When foreground != 0, the cropping center is first picked to match any of the foreground
pixels in the input. If the selected crop center results in an out of bounds cropping window,
the center is shifted as necessary so that the window remains within bounds.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes any sense not to shift?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both things can make sense. If cropping out of bounds is OK, the user can let shape=None and any pixel would be considered a valid center. If the shape is specified, we'll assure that the cropping window is within bounds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've already had github requests for out-of-bounds cropping. Perhaps we should make this configurable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have out-of-bounds cropping, and this is already configured (providing a shape or not).


has_crop_shape_ = spec_.ArgumentDefined("shape");
if (has_crop_shape_) {
GetShapeArgument(crop_shape_, spec_, "shape", ws, ndim, nsamples);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you do anything with the return value as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really, I don't need to infer the batch size or the number of dimensions.

[&, sample_idx](int thread_id) {
auto mask = masks_view[sample_idx];
auto center = center_view[sample_idx];
SearchableRLEMask rle_mask(mask);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tell how fast/slow it is for the typical inputs?
Do we assume that masks are uniform regarding the size across the batch or they can vary?
If can vary maybe we can differently split the work across the threads. Ask each thread to create a mask for only one plan in each volume.
Does it make any sense to have it on the GPU?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And how that works with 3D masks?

Copy link
Contributor Author

@jantonguirao jantonguirao Nov 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can benchmark it. I wouldn't assume uniform shape.
SearchableRLEMask works on flat indices. We are not doing anything with the dimensions.
Regarding the GPU, encoding the mask is a rather serial task. Having a mask per plane would complicate things, so I wouldn't go there unless we see there is a performance bottleneck here. On top of that, Slice operator doesn't take GPU anchor/shape.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let us benchmark first and see if there is anything to fight for.

// Adjust center if necessary
if (has_crop_shape_) {
for (int d = 0; d < ndim; d++) {
int64_t w = crop_sh[d] >> 1;
Copy link
Contributor

@mzient mzient Nov 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what this w was supposed to mean, but it read as width (which apprently it isn't).

Suggested change
int64_t w = crop_sh[d] >> 1;
int64_t half = crop_sh[d] >> 1;

for (int d = 0; d < ndim; d++) {
int64_t w = crop_sh[d] >> 1;
center.data[d] =
boundary::idx_clamp(center.data[d], w, mask_sh[d] - (crop_sh[d] - w));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
boundary::idx_clamp(center.data[d], w, mask_sh[d] - (crop_sh[d] - w));
clamp(center.data[d], w, mask_sh[d] - (crop_sh[d] - w));

If you're just clamping, you don't need any fancy boundary handling (ordinary clamp will do).
Also, boundary clamp is an exclusive clamp (to size-1) and we want to have inclusive clamp.
Example:
mask_sh[d] = 100
crop_sh[d] = 10
w = 5
maximum valid value is 95 (the window will start at 90 and end at 99)

@jantonguirao jantonguirao force-pushed the biased_crop_center branch 2 times, most recently from eb2f26f to b8b3824 Compare November 10, 2020 13:24
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao jantonguirao force-pushed the biased_crop_center branch 2 times, most recently from bbfa995 to f76f187 Compare November 12, 2020 16:01
@jantonguirao jantonguirao changed the title Add segmentation.BiasedCropCenter operator Add segmentation.RandomMaskPixel operator Nov 12, 2020
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao jantonguirao force-pushed the biased_crop_center branch 2 times, most recently from 8fbc0be to 266366b Compare November 12, 2020 17:47
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Comment on lines 102 to 104
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) {
foreground_[sample_idx] = spec_.template GetArgument<int>("foreground", &ws, sample_idx);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sample-indexed GetArgument is a relic from the obsolete SampleWorkspace. It's inefficient and should not be used.

Suggested change
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) {
foreground_[sample_idx] = spec_.template GetArgument<int>("foreground", &ws, sample_idx);
}
GetPerSampleArgument(foreground_, "foreground", spec_, ws, nsamples);

Comment on lines 107 to 110
value_.resize(nsamples);
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) {
value_[sample_idx] = spec_.template GetArgument<int>("value", &ws, sample_idx);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
value_.resize(nsamples);
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) {
value_[sample_idx] = spec_.template GetArgument<int>("value", &ws, sample_idx);
}
GetPerSampleArgument(value_, "value", spec_, ws, nsamples);

Comment on lines 114 to 116
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) {
threshold_[sample_idx] = spec_.template GetArgument<float>("threshold", &ws, sample_idx);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) {
threshold_[sample_idx] = spec_.template GetArgument<float>("threshold", &ws, sample_idx);
}
GetPerSampleArgument(threshold_, "threshold", spec_, ws, nsamples);

Comment on lines 140 to 143
if (rle_.empty()) {
rle_.resize(thread_pool.size());
}
assert(rle_.size() == static_cast<size_t>(thread_pool.size()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: resize won't do much more than a similar check, so:

Suggested change
if (rle_.empty()) {
rle_.resize(thread_pool.size());
}
assert(rle_.size() == static_cast<size_t>(thread_pool.size()));
rle_.resize(thread_pool.size());

flat_idx = rle_mask.find(dist(rng));
}
} else {
T threshold = static_cast<T>(threshold_[sample_idx]);
Copy link
Contributor

@mzient mzient Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do that. This will cause wrong results for signed integer input and fractional threshold (and also when the value wraps after casting).

Suggested change
T threshold = static_cast<T>(threshold_[sample_idx]);
float threshold = threshold_[sample_idx];

int64_t flat_idx = -1;
auto &rle_mask = rle_[thread_id];
if (has_value_) {
T value = static_cast<T>(value_[sample_idx]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about that one - consider that:
input is uint8
value is 256
magically, you select 0 as foreground.
Actually:

if (static_cast<int>(value) != values_[sample_idx])
   // behave as if there was no foregound at all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Joaquin Anton <janton@nvidia.com>
it's equal to a specific ``value``.
)")
.AddOptionalArg<int>("value",
R"code(All pixels equal to this value are interpreted as foreground.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
R"code(All pixels equal to this value are interpreted as foreground.
R"code(All pixels equal to this value are interpreted as foreground.

Comment on lines 48 to 49
R"code(If different than 0, the pixel position is sampled uniformly from all foreground pixels.
If 0, the pixel position is sampled uniformly from all available pixels.)code",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
R"code(If different than 0, the pixel position is sampled uniformly from all foreground pixels.
If 0, the pixel position is sampled uniformly from all available pixels.)code",
R"code(If different than 0, the pixel position is sampled uniformly from all foreground pixels.
If 0, the pixel position is sampled uniformly from all available pixels.)code",

Comment on lines 139 to 160
auto &rle_mask = rle_[thread_id];
if (has_value_) {
T value = static_cast<T>(value_[sample_idx]);
// checking if the value is representable by T, otherwise we
// just fall back to pick a random pixel.
if (static_cast<int>(value) == value_[sample_idx]) {
rle_mask.Init(
mask, [value](const T &x) { return x == value; });
if (rle_mask.count() > 0) {
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1);
flat_idx = rle_mask.find(dist(rng));
}
}
} else {
float threshold = threshold_[sample_idx];
rle_mask.Init(
mask, [threshold](const T &x) { return x > threshold; });
if (rle_mask.count() > 0) {
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1);
flat_idx = rle_mask.find(dist(rng));
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto &rle_mask = rle_[thread_id];
if (has_value_) {
T value = static_cast<T>(value_[sample_idx]);
// checking if the value is representable by T, otherwise we
// just fall back to pick a random pixel.
if (static_cast<int>(value) == value_[sample_idx]) {
rle_mask.Init(
mask, [value](const T &x) { return x == value; });
if (rle_mask.count() > 0) {
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1);
flat_idx = rle_mask.find(dist(rng));
}
}
} else {
float threshold = threshold_[sample_idx];
rle_mask.Init(
mask, [threshold](const T &x) { return x > threshold; });
if (rle_mask.count() > 0) {
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1);
flat_idx = rle_mask.find(dist(rng));
}
}
auto &rle_mask = rle_[thread_id];
rle_mask.clear();
if (has_value_) {
T value = static_cast<T>(value_[sample_idx]);
// checking if the value is representable by T, otherwise we
// just fall back to pick a random pixel.
if (static_cast<int>(value) == value_[sample_idx]) {
rle_mask.Init(
mask, [value](const T &x) { return x == value; });
}
} else {
float threshold = threshold_[sample_idx];
rle_mask.Init(
mask, [threshold](const T &x) { return x > threshold; });
}
if (rle_mask.count() > 0) {
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1);
flat_idx = rle_mask.find(dist(rng));
}

@@ -45,7 +45,9 @@ class SearchableRLEMask {
* determine the mask values that are considered foreground
*/
template <typename T, typename Predicate = is_positive>
explicit SearchableRLEMask(span<const T> mask_view, Predicate &&is_foreground = {}) {
void Init(span<const T> mask_view, Predicate &&is_foreground = {}) {
groups_.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would export this as a clear() method.

Comment on lines +30 to +32
crop_shape = in_shape - 2 # We want to force the center adjustment, therefore the large crop shape
anchor = fg_pixel1 - crop_shape // 2
anchor = min(max(0, anchor), in_shape - crop_shape)
Copy link
Contributor

@JanuszL JanuszL Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it is up to the user to make sure that window is inside the image. Maybe it could be mentioned in the documentation - that it is possible and how to deal with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this operator is not advertised to be used for cropping. It's a general purpose pixel selector. Do you think we should add this suggestion of usage in the operator documentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not if we are not advertising it that way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could add an example using this operator to calculate a crop center, maybe in one of the existing notebooks


assert in_mask[tuple(fg_pixel1)] > 0
assert in_mask[tuple(fg_pixel2)] > 0.99
print(in_mask[tuple(fg_pixel3)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that print required?

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1796485]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1796485]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1796659]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1796659]: BUILD PASSED

@jantonguirao jantonguirao merged commit 9e99e50 into NVIDIA:master Nov 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants