
[Cutmix] Make fn.multi_paste more flexible, fix validation #5331

Merged · 6 commits · Mar 1, 2024

Conversation

@stiepan (Member) commented Feb 19, 2024

Category:

New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)

Description:

This PR reworks the arguments of fn.multi_paste and their parsing to make the operator more flexible and easier to use, with cutmix augmentation in mind: namely, the ability to mix different batches and no need to pass image shapes explicitly when the inputs are uniformly shaped. It also fixes a couple of bugs.

Fixes:

  • There was no validation of in_ids, leading to out-of-bounds accesses for incorrect input.
  • The number of channels was handled incorrectly:
    • The GPU implementation assumed 3 channels regardless of the actual number of channels, leading to incorrect CUDA memory accesses.
    • Neither backend verified that all pasted regions have the same number of channels (or that the output channel count matches). The number of output channels was simply copied from the corresponding input sample. The PR changes that: the number of output channels is inferred from the actually pasted regions, and the old behaviour is used only when there are no regions (to remain compatible with the only case that could have worked previously).
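The channel-inference rule from the last fix can be sketched in plain Python. This is an illustrative model of the described behaviour, not DALI source code; the function name and signature are hypothetical.

```python
# Hypothetical sketch of the channel-inference rule described above:
# the output channel count is taken from the pasted regions, which must
# all agree; if a sample has no pasted regions, fall back to the channel
# count of the corresponding input sample (the old behaviour).
def infer_out_channels(region_channels, fallback_channels):
    """region_channels: channel counts of all regions pasted into one sample."""
    if not region_channels:
        # No regions pasted: keep the old behaviour (copy from the input sample).
        return fallback_channels
    first = region_channels[0]
    if any(c != first for c in region_channels):
        raise ValueError("all pasted regions must have the same number of channels")
    return first
```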

New features:

  • The in_anchors, region shapes, and out_anchors now have relative counterparts: they can be specified as floats in [0, 1], relative to the input shape, input shape, and output shape, respectively.
  • If all the input shapes are uniform and no output size is provided, the output shape is assumed to be the same as the input shape.
  • To allow mixing images from different batches, the operator can accept multiple inputs; in that case, in_ids must not be specified and the regions are pasted elementwise.

The first two points aim to make fn.multi_paste easier to use without explicitly handling the actual shapes of the samples.
The multi-input mode should make the operator easier to use in simple applications where the number of paste regions is uniform: with DALI's implicit batch, you can think in terms of samples rather than indices into the implicit batch. It also enables cases where you need to mix images from different sources.
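The elementwise, relative-anchor pasting described above can be modelled in NumPy. This is a minimal illustrative sketch of the semantics (cutmix-style mixing of two batches), not the DALI implementation; the function name and the exact rounding of relative coordinates are assumptions.

```python
import numpy as np

# Illustrative model of multi-input, relative-anchor pasting: a region from a
# second batch is pasted elementwise into the first. Anchors and shapes are
# floats in [0, 1], relative to the input shape (in_anchor_rel, shape_rel) and
# the output shape (out_anchor_rel).
def paste_relative(dst, src, in_anchor_rel, shape_rel, out_anchor_rel):
    h, w = src.shape[:2]
    H, W = dst.shape[:2]
    ph, pw = int(shape_rel[0] * h), int(shape_rel[1] * w)        # region extent
    iy, ix = int(in_anchor_rel[0] * h), int(in_anchor_rel[1] * w)  # source anchor
    oy, ox = int(out_anchor_rel[0] * H), int(out_anchor_rel[1] * W)  # dest anchor
    out = dst.copy()
    out[oy:oy + ph, ox:ox + pw] = src[iy:iy + ph, ix:ix + pw]
    return out

# Elementwise over two "batches", as in the multi-input mode (no in_ids):
batch_a = [np.zeros((8, 8, 3), np.uint8) for _ in range(2)]
batch_b = [np.full((8, 8, 3), 255, np.uint8) for _ in range(2)]
mixed = [paste_relative(a, b, (0.5, 0.5), (0.5, 0.5), (0.5, 0.5))
         for a, b in zip(batch_a, batch_b)]
```

Here the bottom-right quadrant of each sample in batch_b replaces the corresponding quadrant in batch_a, sample by sample, without any explicit pixel shapes.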

Other changes:

  • The arguments are parsed once and cached, so they are not recomputed along the way.
  • The no-intersection check is moved to the CPU implementation; its outputs were ignored by the GPU implementation anyway.
  • Added video support.
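The no-intersection check mentioned above can be sketched as a pairwise overlap test over the output regions: if no two pasted regions overlap, each output pixel is written at most once. This is an assumed reconstruction of the check's semantics, not DALI code; the function name and argument layout are hypothetical.

```python
# Hypothetical sketch of a pairwise "no intersection" test over pasted regions.
# anchors: per-region (y, x) anchors in the output; shapes: (height, width) extents.
def regions_intersect(anchors, shapes):
    boxes = [(ay, ax, ay + sy, ax + sx)
             for (ay, ax), (sy, sx) in zip(anchors, shapes)]
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            y0, x0, y1, x1 = boxes[i]
            Y0, X0, Y1, X1 = boxes[j]
            # Half-open boxes overlap iff they overlap on both axes.
            if y0 < Y1 and Y0 < y1 and x0 < X1 and X0 < x1:
                return True
    return False
```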

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

cutmix

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-3496

@stiepan (Member Author) commented Feb 19, 2024

!build

@dali-automaton (Collaborator)

CI MESSAGE: [12935254]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [12935254]: BUILD FAILED

@stiepan stiepan changed the title [Cutmix] Make fn.multipaste more flexible, fix validation [Cutmix] Make fn.multi_paste more flexible, fix validation Feb 20, 2024
… output shape, allow multiple inputs

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan stiepan marked this pull request as ready for review February 26, 2024 11:09
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan (Member Author) commented Feb 26, 2024

!build

@dali-automaton (Collaborator)

CI MESSAGE: [13087567]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [13087567]: BUILD PASSED

@klecki (Contributor) left a comment:

Before looking at tests, posting the comments

Resolved review threads: dali/operators/image/paste/multipaste.cc, dali/operators/image/paste/multipaste.h (two threads)
Comment on lines 279 to 280
sample_idx, ". It should be a 2D tensor of shape [number of pasted regions]",
"x2 (i.e. ", paste_count, "x", spatial_ndim,
@klecki (Contributor):

Nitpick, but I would format the shape as {number of pasted regions, 2} or something like this, the [number]x2 is a bit weird.

Whatever brace/parenthesis we use for shapes...

@stiepan (Member Author) replied Feb 27, 2024:

We print the shape without parenthesis, with x as delimiter. :)

I used an inline TensorShape{} to print the actual shapes consistently and was a bit surprised.

@stiepan (Member Author): (screenshot attached)

Resolved review threads: dali/operators/image/paste/multipaste.h (two threads)
Comment on lines +312 to +314
in_anchors_data_[i].resize(n_paste);
region_shapes_data_[i].resize(n_paste);
out_anchors_data_[i].resize(n_paste);
@klecki (Contributor):

It probably doesn't matter much (and we surely do similarly bad things elsewhere), but a vector of vectors sounds like it could mean a lot of allocations.

@stiepan (Member Author):

I tried replacing it with a TensorListView to handle the variable sample sizes, but if anything, it performed slightly worse end-to-end.

Resolved review threads: dali/operators/image/paste/multipaste.h, dali/kernels/imgproc/paste/paste_gpu.h (two threads)
…arer messages, remove potential oob access, adjusted docs

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Resolved review threads: dali/operators/image/paste/multipaste.cc (four threads), dali/operators/image/paste/multipaste.cu, dali/operators/image/paste/multipaste.h (three threads)
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan (Member Author) commented Mar 1, 2024

!build

@dali-automaton (Collaborator)

CI MESSAGE: [13201260]: BUILD STARTED

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan (Member Author) commented Mar 1, 2024

!build

@dali-automaton (Collaborator)

CI MESSAGE: [13202219]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [13202219]: BUILD PASSED

@stiepan stiepan merged commit 8889d96 into NVIDIA:main Mar 1, 2024
7 checks passed