-
Notifications
You must be signed in to change notification settings - Fork 621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ROIRandomCrop operator #2638
Conversation
b198ac0
to
9749666
Compare
Note: Using ``roi_end`` is mutually exclusive with ``roi_shape``.)code", nullptr, true) | ||
.AddOptionalArg<std::vector<int>>("roi_shape", | ||
R"code(ROI shape. | ||
|
||
Note: Using ``roi_shape`` is mutually exclusive with ``roi_end``.)code", nullptr, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Using ``roi_end`` is mutually exclusive with ``roi_shape``.)code", nullptr, true) | |
.AddOptionalArg<std::vector<int>>("roi_shape", | |
R"code(ROI shape. | |
Note: Using ``roi_shape`` is mutually exclusive with ``roi_end``.)code", nullptr, true) | |
.. Note:: | |
Using ``roi_end`` is mutually exclusive with ``roi_shape``.)code", nullptr, true) | |
.AddOptionalArg<std::vector<int>>("roi_shape", | |
R"code(ROI shape. | |
.. Note:: | |
Using ``roi_shape`` is mutually exclusive with ``roi_end``.)code", nullptr, true) |
If provided, the cropping window start will be selected so that the cropping window is within the | ||
bounds of the input. | ||
|
||
Note: Providing ``in_shape`` is incompatible with feeding the input data directly as a positional input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Providing ``in_shape`` is incompatible with feeding the input data directly as a positional input. | |
.. Note:: | |
Providing ``in_shape`` is incompatible with feeding the input data directly as a positional input. |
if ((roi_end_.IsDefined() + roi_shape_.IsDefined()) != 1) | ||
DALI_FAIL("Either ROI end or ROI shape should be defined, but not both"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ((roi_end_.IsDefined() + roi_shape_.IsDefined()) != 1) | |
DALI_FAIL("Either ROI end or ROI shape should be defined, but not both"); | |
DALI_ENFORCE(roi_end_.IsDefined() + roi_shape_.IsDefined()) == 1, | |
"Either ROI end or ROI shape should be defined, but not both"); |
if (roi_end_.IsDefined()) { | ||
roi_end_.Acquire(spec_, ws, nsamples, sh); | ||
} else { | ||
assert(roi_shape_.IsDefined()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need that assert as you already have one check in the constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it one of the reasons to have asserts - to check for inconsistent internal state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put it there mostly for documentation, but also to double-check that the logic is sane
if (in_shape_arg_.IsDefined() && (ws.NumInput() == 1)) { | ||
DALI_FAIL("``in_shape`` argument is incompatible with providing an input.") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (in_shape_arg_.IsDefined() && (ws.NumInput() == 1)) { | |
DALI_FAIL("``in_shape`` argument is incompatible with providing an input.") | |
} | |
DALI_ENFORCE(in_shape_arg_.IsDefined() && (ws.NumInput() == 1) == false, "``in_shape`` argument is incompatible with providing an input.") |
for (int d = 0; d < ndim; d++) { | ||
DALI_ENFORCE(roi_start_[s].data[d] >= 0 && | ||
sample_sh[d] >= (roi_start_[s].data[d] + roi_shape_[s].data[d]), | ||
"ROI can't be out of bounds."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you print the offending shape and roi bounds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
shape=(ndim,), dtype=types.INT32, device='cpu') | ||
roi_start = fn.random.uniform(range=(roi_min_start, roi_max_start + 1), | ||
shape=(ndim,), dtype=types.INT32, device='cpu') | ||
crop_start = fn.roi_random_crop(*inputs, crop_shape=crop_shape, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should test roi_end
argument as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
roi_start = fn.random.uniform(range=(roi_min_start, roi_max_start + 1), | ||
shape=(ndim,), dtype=types.INT32, device='cpu') | ||
crop_start = fn.roi_random_crop(*inputs, crop_shape=crop_shape, | ||
roi_start=roi_start, roi_shape=roi_shape, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a negative cases when both use_in_shape_arg
and use_shape_like_in
are provided, as well as roi_end
with roi_shape
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
outputs += [in_shape_out] | ||
pipe.set_outputs(*outputs) | ||
pipe.build() | ||
for iter in range(niter): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for iter in range(niter): | |
for _ in range(niter): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
for iter in range(niter): | ||
outputs = pipe.run() | ||
for idx in range(batch_size): | ||
roi_start = outputs[0].at(idx).tolist() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you not use at
as it is deprecated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
|
||
for d in range(ndim): | ||
if in_shape is not None: | ||
assert crop_start[d] >= 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't always be crop_start[d] >= 0
no matter if in_shape
is provided?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. Out of bounds slicing is allowed. We are OK with a negative anchor as long as the cropping window contains the ROI (could happen for a big cropping window and a small ROI near the edge)
if (in_shape_arg_.IsDefined()) { | ||
in_shape_.resize(nsamples, ndim); | ||
in_shape_arg_.Acquire(spec_, ws, nsamples, sh); | ||
for (int s = 0; s < nsamples; s++) { | ||
auto sample_sh = in_shape_.tensor_shape_span(s); | ||
for (int d = 0; d < ndim; d++) { | ||
sample_sh[d] = in_shape_arg_[s].data[d]; | ||
} | ||
} | ||
} else { | ||
auto &in = ws.template InputRef<CPUBackend>(0); | ||
in_shape_ = in.shape(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if no input nor in_shape_arg_
are provided?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing special, we just don't bound the cropping window to be within the image
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I see. I missed that L121 condition covers the L138 loop as well.
} | ||
} else { | ||
for (int d = 0; d < ndim; d++) { | ||
DALI_ENFORCE(roi_start_[s].data[d] >= 0 && sample_sh[d] >= roi_end_[s].data[d], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to bother with roi_end_
here. Maybe it should be just roi_start_
and roi_shape_
, and values from roi_end
should be used to infer roi_shape_
if not provided directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd involve extra storage, and the code accessing this storage would look different than the arguments directly provided (because those are tensor views). This is why I preferred to do it on the fly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's easier to check end
than start+shape
?
ws.ArgumentInput("crop_shape").size() : | ||
ws.GetRequestedBatchSize(0); | ||
crop_shape_.Acquire(spec_, ws, nsamples, true); | ||
int ndim = crop_shape_[0].shape[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you should check if crop_shape_
is uniform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what the true
in Acquire does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't have to - each sample can be different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the shape of a shape we are talking about. Uniformity in crop_shape means same number of dimensions
int64_t roi_extent = -1; | ||
int64_t roi_start = roi_start_[sample_idx].data[d]; | ||
int64_t crop_extent = crop_shape_[sample_idx].data[d]; | ||
if (roi_end_.IsDefined()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can do this in setup (roi_extent
calculation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could but it'd involve extra storage to keep those.
if (roi_extent == crop_extent) { | ||
crop_start[sample_idx].data[d] = roi_start; | ||
} else if (roi_extent > crop_extent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about:
if (roi_extent == crop_extent) { | |
crop_start[sample_idx].data[d] = roi_start; | |
} else if (roi_extent > crop_extent) { | |
if (roi_extent == crop_extent) { | |
crop_start[sample_idx].data[d] = roi_start; | |
} else { | |
int64_t range = roi_extent - crop_extent; | |
if (sample_sh) { | |
range = std::max<int64_t>(range, -roi_start); // make sure that if roi_extent < crop_extent we won't be < 0 | |
} | |
int64_t range_start = std::min<int64_t>(roi_start, roi_start + range); | |
int64_t range_end = std::max<int64_t>(roi_start, roi_start + range) | |
auto dist = std::uniform_int_distribution<int64_t>(range_start, range_end); | |
crop_start[sample_idx].data[d] = dist(rngs_[sample_idx]); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can get rid of if at all.
} else if (roi_extent > crop_extent) { | ||
int64_t range_end = roi_start + roi_extent - crop_extent; | ||
if (sample_sh) | ||
range_end = std::min<int64_t>(sample_sh[d] - crop_extent, range_end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you check if ROI is inside shape already in setup. So this is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't depend on the ROI but on the cropping window. Imagine a large cropping window, it would go easily out of bounds if we want to cover all the different anchors that cover the whole ROI. In the case that the cropping window is bounded the range of possible anchors is limited. Otherwise it isn't
provided region of interest (ROI) is contained in it. | ||
|
||
If the ROI is bigger than the cropping window, the cropping window will be a subwindow of the ROI. | ||
If the ROI is smaller than the cropping window, the whole ROI should be contained in the cropping window. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the ROI is smaller than the cropping window, the whole ROI should be contained in the cropping window. | |
If the ROI is smaller than the cropping window, the whole ROI shall be contained in the cropping window. |
If the ROI is smaller than the cropping window, the whole ROI should be contained in the cropping window. | ||
|
||
If an input shape (``in_shape``) is given, the resulting cropping window is selected to be within the | ||
bounds of that input shape. Alternatively, the input data subject to cropping can be fed directly to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bounds of that input shape. Alternatively, the input data subject to cropping can be fed directly to | |
bounds of that input shape. Alternatively, the input data subject to cropping can be passed to the operator, in | |
which case the cropping window will be constrained to the shape of the data. |
fed
suggests that the data itself is consumed - and I think we should make it very clear that the operator doesn't touch the data.
DALI_ENFORCE(sample_sh[d] >= crop_shape_[s].data[d], | ||
make_string("Cropping shape can't be bigger than the input shape. " | ||
"Got: crop_shape[", crop_shape_[s].data[d], "] and sample_shape[", | ||
sample_sh[d], "] for d=", d)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what we should do in this case...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should allow (with an extra parameter) to clip the cropping window if it's larger than input? Or, alternatively, keep out-of-bounds window (pad)? Just thinking aloud.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should complicate this even more unless we have a specific use in mind.
auto& thread_pool = ws.GetThreadPool(); | ||
|
||
for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) { | ||
thread_pool.AddWork( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if asynchronous execution is justified here - after all this operator performs ndim
loops per sample - not much compared to the overhead of scheduling work to thread pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can get rid of it
788bd74
to
f483f2c
Compare
for idx in range(6, 10): | ||
check_crop_start(np.array(outputs[idx][s]).tolist(), roi_start, roi_shape, crop_shape, in_shape) | ||
|
||
def test_random_mask_pixel(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_random_mask_pixel(): | |
def test_roi_random_cropl(): |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, copy-paste :)
crop_shape = fn.random.uniform(range=(crop_min_extent, crop_max_extent + 1), | ||
shape=(ndim,), dtype=types.INT32, device='cpu') | ||
roi_shape = fn.random.uniform(range=(roi_min_extent, roi_max_extent + 1), | ||
shape=(ndim,), dtype=types.INT32, device='cpu') | ||
roi_start = fn.random.uniform(range=(roi_min_start, roi_max_start + 1), | ||
shape=(ndim,), dtype=types.INT32, device='cpu') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that scalar arguments are not tested. Perhaps you could do sth like:
def make_arg(lo, hi):
return [lo] * ndim if lo == hi else fn.random.uniform(range=(lo, hi+1), shape=[ndim], dtype=types.INT32)
shape_like_in = dali.fn.external_source(lambda: np.zeros(shape_gen_f()), | ||
device='cpu', batch=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try variable batch size?
int nsamples = crop_start.shape.size(); | ||
int ndim = crop_start[0].shape[0]; | ||
|
||
auto& thread_pool = ws.GetThreadPool(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
f483f2c
to
6512b9d
Compare
roi_start = np.array([(roi_min_start + roi_max_start) // 2] * ndim, dtype=np.int32) if random.choice([True, False]) \ | ||
else fn.random.uniform(range=(roi_min_start, roi_max_start + 1), | ||
shape=(ndim,), dtype=types.INT32, device='cpu') | ||
roi_end = roi_start + roi_shape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this works, then it's by chance - if roi_start
is a numpy.ndarray
and roi_shape
is a DataNode
, this wouldn't work.
shape_like_in = dali.fn.external_source(data_gen_f, device='cpu') | ||
in_shape = dali.fn.shapes(shape_like_in, dtype=types.INT32) | ||
|
||
crop_shape = [(crop_min_extent + crop_max_extent) // 2] * ndim if random.choice([True, False]) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Numpy arrays are passed as argument inputs, so this still does not test non-tensor arguments. It must be a plain list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I fixed that now
Signed-off-by: Joaquin Anton <janton@nvidia.com>
a66d70e
to
fbb4c23
Compare
Signed-off-by: Joaquin Anton janton@nvidia.com
Why we need this PR?
Pick one, remove the rest
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
Added a new operator ROIRandomCrop that takes a fixed cropping window shape and a region of interest and produces a cropping window anchor (start coordinates) so that the resulting cropping window contains as much of the ROI as possible.
New op
Calculation of crop anchor.
Test coverage? corner cases?
Python tests
Operator documentation
JIRA TASK: [DALI-1816]