Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix JPEG2K fused decoding (with ROI), add native tests for JP2k decoding #2692

Merged
merged 2 commits into from
Feb 20, 2021

Conversation

JanuszL
Copy link
Contributor

@JanuszL JanuszL commented Feb 18, 2021

  • adds a CPU fallback for fused (slice/crop/randomCrop) JPEG2000 images decoding
  • enables JPEG2000 tests for fused CPU/Mixed decoder
  • when nvJPEG2000 is not available for given HW fallback to CPU implementation

Signed-off-by: Janusz Lisiecki jlisiecki@nvidia.com

Why we need this PR?

Pick one, remove the rest

  • It fixes JPEG2K fused decoding (with ROI)
  • It adds native tests for JP2k decoding

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    extended tests to use JPEG2K images
    adds a CPU fallback for fused (slice/crop/randomCrop) JPEG2000 images decoding
    adds check if nvJPEG2000 was initialized successfully and can be used
  • Affected modules and functionalities:
    ImageDecoder for mixed backend
    tests for fused ImageDecoder (CPU, Mixed)
  • Key points relevant for the review:
    NA
  • Validation and testing:
    new tests are added
  • Documentation (including examples):
    NA

JIRA TASK: []

@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 18, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2087123]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2087123]: BUILD FAILED

@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 19, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2088671]: BUILD STARTED

@JanuszL JanuszL changed the title [WIP] Fix JPEG2K fused decoding Fix JPEG2K fused decoding Feb 19, 2021
@JanuszL JanuszL changed the title Fix JPEG2K fused decoding Fix JPEG2K fused decoding (with ROI), add native tests for JP2k decoding Feb 19, 2021
@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 19, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2088861]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2088671]: BUILD PASSED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2088861]: BUILD PASSED

@jantonguirao jantonguirao self-assigned this Feb 19, 2021
@@ -526,7 +526,7 @@ class nvJPEGDecoder : public Operator<MixedBackend>, CachedDecoderImpl {
data.method = DecodeMethod::NvjpegCuda;
samples_single_.push_back(&data);
}
} else if (!ParseNvjpeg2k(data, span<const uint8_t>(input_data, in_size))) {
} else if (crop_generator || !ParseNvjpeg2k(data, span<const uint8_t>(input_data, in_size))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good for now. Just an idea to be discussed with the team:
In case that ROI decoding is not an option with nvjpeg2k, should we decode the whole image with nvjpeg2k and launch the GPU slice kernel to produce the result instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment to the schema docs,

@banasraf banasraf self-assigned this Feb 19, 2021
@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 19, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2090052]: BUILD STARTED

- adds a CPU fallback for fused (slice/crop/randomCrop) JPEG2000 images
  decoding
- enables JPEG2000 tests for fused CPU/Mixed decoder
- when nvJPEG2000 is not available for given HW fallback to CPU implementation

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Please note that GPU acceleration for JPEG 2000 decoding is only available for CUDA 11.

.. note::
JPEG 2000 decoding and extraction of regions-of-interest (ROI) in not accelerated on the GPU when
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

JPEG 2000 region-of-interest (ROI) decoding is not accelerated on the GPU, and will use a CPU implementation regardless of the selected backend.
For a GPU accelerated implementation, consider using separate ``image_decoder`` and ``crop`` operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -220,7 +223,10 @@ image format, it will decode the entire image and crop the selected ROI.
The output of the decoder is in *HWC* layout.

Supported formats: JPG, BMP, PNG, TIFF, PNM, PPM, PGM, PBM, JPEG 2000.
Please note that GPU acceleration for JPEG 2000 decoding is only available for CUDA 11.

.. note::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JPEG 2000 region-of-interest (ROI) decoding is not accelerated on the GPU, and will use a CPU implementation regardless of the selected backend.
For a GPU accelerated implementation, consider using separate ``image_decoder`` and ``random_crop`` operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -273,7 +279,10 @@ image format, it will decode the entire image and crop the selected ROI.
The output of the decoder is in the *HWC* layout.

Supported formats: JPG, BMP, PNG, TIFF, PNM, PPM, PGM, PBM, JPEG 2000.
Please note that GPU acceleration for JPEG 2000 decoding is only available for CUDA 11.

.. note::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JPEG 2000 region-of-interest (ROI) decoding is not accelerated on the GPU, and will use a CPU implementation regardless of the selected backend.
For a GPU accelerated implementation, consider using separate ``image_decoder`` and ``slice`` operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// nvJPEG2k doesn't support deprecated sm while DALI does. In this case, NvJPEG2KHandle may
// be empty and we need to fall back to the CPU implementation
if (nvjpeg2k_handle_) {
auto nvjpeg2k_thread_id = nvjpeg2k_thread_.GetThreadIds()[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it'd be better to get the thread id before posting the work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

"component 0 has a shape of {", comp.component_height, ", ",
comp.component_width, "} and component ", c, " has a shape of {",
height, ", ", width, "}"));
if (nvjpeg2k_handle_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (nvjpeg2k_handle_) {
if (!nvjpeg2k_handle_)
return false;

just a suggestion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

for (uint32_t c = 1; c < image_info.num_components; ++c) {
NVJPEG2K_CALL(nvjpeg2kStreamGetImageComponentInfo(jpeg2k_stream, &comp, c));
DALI_ENFORCE(height == comp.component_height &&
width == comp.component_width,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

CacheStore(sample->file_name, output_data, shape, nvjpeg2k_cu_stream_);
}
});
if (nvjpeg2k_handle_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I'd rather add an early return. But that might be just a personal preference so feel free to ignore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 19, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2090329]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2090052]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2090329]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2090052]: BUILD PASSED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2090329]: BUILD PASSED

@JanuszL JanuszL merged commit 7b9681a into NVIDIA:master Feb 20, 2021
@JanuszL JanuszL deleted the fix_jp2k_fused_decoding branch February 20, 2021 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants