-
Notifications
You must be signed in to change notification settings - Fork 611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add segmentation.RandomMaskPixel operator #2445
Conversation
cf97ab9
to
150f6ad
Compare
Signed-off-by: Joaquin Anton <janton@nvidia.com>
150f6ad
to
2e557ac
Compare
|
||
DALI_SCHEMA(segmentation__BiasedCropCenter) | ||
.DocStr(R"(Selects a cropping window center which can be selected randomly from either | ||
any position in the input or any position of a foreground pixel in the input, based on an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any position in the input or any position of a foreground pixel in the input, based on an | |
at any position in the input or any position of a foreground pixel in the input, based on an |
|
||
DALI_SCHEMA(segmentation__BiasedCropCenter) | ||
.DocStr(R"(Selects a cropping window center which can be selected randomly from either | ||
any position in the input or any position of a foreground pixel in the input, based on an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we should use foreground
or just non zero value, because the meaning of the foreground
if not defined.
|
||
When foreground != 0, the cropping center is first picked to match any of the foreground | ||
pixels in the input. If the selected crop center results in an out of bounds cropping window, | ||
the center is shifted as necessary so that the window remains within bounds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it makes any sense not to shift?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both things can make sense. If cropping out of bounds is OK, the user can let shape=None and any pixel would be considered a valid center. If the shape is specified, we'll assure that the cropping window is within bounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've already had github requests for out-of-bounds cropping. Perhaps we should make this configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have out-of-bounds cropping, and this is already configured (providing a shape or not).
|
||
has_crop_shape_ = spec_.ArgumentDefined("shape"); | ||
if (has_crop_shape_) { | ||
GetShapeArgument(crop_shape_, spec_, "shape", ws, ndim, nsamples); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you do anything with the return value as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really, I don't need to infer the batch size or the number of dimensions.
[&, sample_idx](int thread_id) { | ||
auto mask = masks_view[sample_idx]; | ||
auto center = center_view[sample_idx]; | ||
SearchableRLEMask rle_mask(mask); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you tell how fast/slow it is for the typical inputs?
Do we assume that masks are uniform regarding the size across the batch or they can vary?
If can vary maybe we can differently split the work across the threads. Ask each thread to create a mask for only one plan in each volume.
Does it make any sense to have it on the GPU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And how that works with 3D masks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can benchmark it. I wouldn't assume uniform shape.
SearchableRLEMask works on flat indices. We are not doing anything with the dimensions.
Regarding the GPU, encoding the mask is a rather serial task. Having a mask per plane would complicate things, so I wouldn't go there unless we see there is a performance bottleneck here. On top of that, Slice operator doesn't take GPU anchor/shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let us benchmark first and see if there is anything to fight for.
// Adjust center if necessary | ||
if (has_crop_shape_) { | ||
for (int d = 0; d < ndim; d++) { | ||
int64_t w = crop_sh[d] >> 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what this w
was supposed to mean, but it read as width (which apprently it isn't).
int64_t w = crop_sh[d] >> 1; | |
int64_t half = crop_sh[d] >> 1; |
for (int d = 0; d < ndim; d++) { | ||
int64_t w = crop_sh[d] >> 1; | ||
center.data[d] = | ||
boundary::idx_clamp(center.data[d], w, mask_sh[d] - (crop_sh[d] - w)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
boundary::idx_clamp(center.data[d], w, mask_sh[d] - (crop_sh[d] - w)); | |
clamp(center.data[d], w, mask_sh[d] - (crop_sh[d] - w)); |
If you're just clamping, you don't need any fancy boundary handling (ordinary clamp will do).
Also, boundary clamp is an exclusive clamp (to size-1) and we want to have inclusive clamp.
Example:
mask_sh[d] = 100
crop_sh[d] = 10
w = 5
maximum valid value is 95 (the window will start at 90 and end at 99)
eb2f26f
to
b8b3824
Compare
b8b3824
to
4ca693f
Compare
Signed-off-by: Joaquin Anton <janton@nvidia.com>
bbfa995
to
f76f187
Compare
Signed-off-by: Joaquin Anton <janton@nvidia.com>
8fbc0be
to
266366b
Compare
Signed-off-by: Joaquin Anton <janton@nvidia.com>
266366b
to
8f62e10
Compare
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) { | ||
foreground_[sample_idx] = spec_.template GetArgument<int>("foreground", &ws, sample_idx); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sample-indexed GetArgument
is a relic from the obsolete SampleWorkspace. It's inefficient and should not be used.
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) { | |
foreground_[sample_idx] = spec_.template GetArgument<int>("foreground", &ws, sample_idx); | |
} | |
GetPerSampleArgument(foreground_, "foreground", spec_, ws, nsamples); |
value_.resize(nsamples); | ||
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) { | ||
value_[sample_idx] = spec_.template GetArgument<int>("value", &ws, sample_idx); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value_.resize(nsamples); | |
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) { | |
value_[sample_idx] = spec_.template GetArgument<int>("value", &ws, sample_idx); | |
} | |
GetPerSampleArgument(value_, "value", spec_, ws, nsamples); |
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) { | ||
threshold_[sample_idx] = spec_.template GetArgument<float>("threshold", &ws, sample_idx); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) { | |
threshold_[sample_idx] = spec_.template GetArgument<float>("threshold", &ws, sample_idx); | |
} | |
GetPerSampleArgument(threshold_, "threshold", spec_, ws, nsamples); |
if (rle_.empty()) { | ||
rle_.resize(thread_pool.size()); | ||
} | ||
assert(rle_.size() == static_cast<size_t>(thread_pool.size())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: resize won't do much more than a similar check, so:
if (rle_.empty()) { | |
rle_.resize(thread_pool.size()); | |
} | |
assert(rle_.size() == static_cast<size_t>(thread_pool.size())); | |
rle_.resize(thread_pool.size()); |
flat_idx = rle_mask.find(dist(rng)); | ||
} | ||
} else { | ||
T threshold = static_cast<T>(threshold_[sample_idx]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't do that. This will cause wrong results for signed integer input and fractional threshold (and also when the value wraps after casting).
T threshold = static_cast<T>(threshold_[sample_idx]); | |
float threshold = threshold_[sample_idx]; |
int64_t flat_idx = -1; | ||
auto &rle_mask = rle_[thread_id]; | ||
if (has_value_) { | ||
T value = static_cast<T>(value_[sample_idx]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mixed feelings about that one - consider that:
input is uint8
value is 256
magically, you select 0 as foreground.
Actually:
if (static_cast<int>(value) != values_[sample_idx])
// behave as if there was no foregound at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Joaquin Anton <janton@nvidia.com>
685e2b7
to
bfafcb4
Compare
it's equal to a specific ``value``. | ||
)") | ||
.AddOptionalArg<int>("value", | ||
R"code(All pixels equal to this value are interpreted as foreground. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
R"code(All pixels equal to this value are interpreted as foreground. | |
R"code(All pixels equal to this value are interpreted as foreground. |
R"code(If different than 0, the pixel position is sampled uniformly from all foreground pixels. | ||
If 0, the pixel position is sampled uniformly from all available pixels.)code", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
R"code(If different than 0, the pixel position is sampled uniformly from all foreground pixels. | |
If 0, the pixel position is sampled uniformly from all available pixels.)code", | |
R"code(If different than 0, the pixel position is sampled uniformly from all foreground pixels. | |
If 0, the pixel position is sampled uniformly from all available pixels.)code", |
auto &rle_mask = rle_[thread_id]; | ||
if (has_value_) { | ||
T value = static_cast<T>(value_[sample_idx]); | ||
// checking if the value is representable by T, otherwise we | ||
// just fall back to pick a random pixel. | ||
if (static_cast<int>(value) == value_[sample_idx]) { | ||
rle_mask.Init( | ||
mask, [value](const T &x) { return x == value; }); | ||
if (rle_mask.count() > 0) { | ||
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1); | ||
flat_idx = rle_mask.find(dist(rng)); | ||
} | ||
} | ||
} else { | ||
float threshold = threshold_[sample_idx]; | ||
rle_mask.Init( | ||
mask, [threshold](const T &x) { return x > threshold; }); | ||
if (rle_mask.count() > 0) { | ||
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1); | ||
flat_idx = rle_mask.find(dist(rng)); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto &rle_mask = rle_[thread_id]; | |
if (has_value_) { | |
T value = static_cast<T>(value_[sample_idx]); | |
// checking if the value is representable by T, otherwise we | |
// just fall back to pick a random pixel. | |
if (static_cast<int>(value) == value_[sample_idx]) { | |
rle_mask.Init( | |
mask, [value](const T &x) { return x == value; }); | |
if (rle_mask.count() > 0) { | |
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1); | |
flat_idx = rle_mask.find(dist(rng)); | |
} | |
} | |
} else { | |
float threshold = threshold_[sample_idx]; | |
rle_mask.Init( | |
mask, [threshold](const T &x) { return x > threshold; }); | |
if (rle_mask.count() > 0) { | |
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1); | |
flat_idx = rle_mask.find(dist(rng)); | |
} | |
} | |
auto &rle_mask = rle_[thread_id]; | |
rle_mask.clear(); | |
if (has_value_) { | |
T value = static_cast<T>(value_[sample_idx]); | |
// checking if the value is representable by T, otherwise we | |
// just fall back to pick a random pixel. | |
if (static_cast<int>(value) == value_[sample_idx]) { | |
rle_mask.Init( | |
mask, [value](const T &x) { return x == value; }); | |
} | |
} else { | |
float threshold = threshold_[sample_idx]; | |
rle_mask.Init( | |
mask, [threshold](const T &x) { return x > threshold; }); | |
} | |
if (rle_mask.count() > 0) { | |
auto dist = std::uniform_int_distribution<int64_t>(0, rle_mask.count() - 1); | |
flat_idx = rle_mask.find(dist(rng)); | |
} |
@@ -45,7 +45,9 @@ class SearchableRLEMask { | |||
* determine the mask values that are considered foreground | |||
*/ | |||
template <typename T, typename Predicate = is_positive> | |||
explicit SearchableRLEMask(span<const T> mask_view, Predicate &&is_foreground = {}) { | |||
void Init(span<const T> mask_view, Predicate &&is_foreground = {}) { | |||
groups_.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would export this as a clear() method.
crop_shape = in_shape - 2 # We want to force the center adjustment, therefore the large crop shape | ||
anchor = fg_pixel1 - crop_shape // 2 | ||
anchor = min(max(0, anchor), in_shape - crop_shape) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it is up to the user to make sure that window is inside the image. Maybe it could be mentioned in the documentation - that it is possible and how to deal with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this operator is not advertised to be used for cropping. It's a general purpose pixel selector. Do you think we should add this suggestion of usage in the operator documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not if we are not advertising it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could add an example using this operator to calculate a crop center, maybe in one of the existing notebooks
|
||
assert in_mask[tuple(fg_pixel1)] > 0 | ||
assert in_mask[tuple(fg_pixel2)] > 0.99 | ||
print(in_mask[tuple(fg_pixel3)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that print required?
Signed-off-by: Joaquin Anton <janton@nvidia.com>
!build |
CI MESSAGE: [1796485]: BUILD STARTED |
CI MESSAGE: [1796485]: BUILD FAILED |
Signed-off-by: Joaquin Anton <janton@nvidia.com>
!build |
CI MESSAGE: [1796659]: BUILD STARTED |
CI MESSAGE: [1796659]: BUILD PASSED |
Signed-off-by: Joaquin Anton janton@nvidia.com
Why we need this PR?
Pick one, remove the rest
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
Added a new operator segmentation.RandomMaskPixel that selects a pixel either randomly or to point to any of the available foreground pixels, specified by an input mask and threhold/value arguments
No existing functionality is affected
All
Python tests
Operator documentation in schema
JIRA TASK: [DALI-1723]