-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix JPEG2K fused decoding (with ROI), add native tests for JP2k decoding #2692
Conversation
6139ed4
to
9a3b0d2
Compare
!build |
CI MESSAGE: [2087123]: BUILD STARTED |
CI MESSAGE: [2087123]: BUILD FAILED |
9a3b0d2
to
45030ba
Compare
!build |
CI MESSAGE: [2088671]: BUILD STARTED |
!build |
CI MESSAGE: [2088861]: BUILD STARTED |
CI MESSAGE: [2088671]: BUILD PASSED |
CI MESSAGE: [2088861]: BUILD PASSED |
@@ -526,7 +526,7 @@ class nvJPEGDecoder : public Operator<MixedBackend>, CachedDecoderImpl { | |||
data.method = DecodeMethod::NvjpegCuda; | |||
samples_single_.push_back(&data); | |||
} | |||
} else if (!ParseNvjpeg2k(data, span<const uint8_t>(input_data, in_size))) { | |||
} else if (crop_generator || !ParseNvjpeg2k(data, span<const uint8_t>(input_data, in_size))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good for now. Just an idea to be discussed with the team:
In case that ROI decoding is not an option with nvjpeg2k, should we decode the whole image with nvjpeg2k and launch the GPU slice kernel to produce the result instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment to the schema docs,
45030ba
to
d422eb6
Compare
!build |
CI MESSAGE: [2090052]: BUILD STARTED |
- adds a CPU fallback for fused (slice/crop/randomCrop) JPEG2000 images decoding - enables JPEG2000 tests for fused CPU/Mixed decoder - when nvJPEG2000 is not available for given HW fallback to CPU implementation Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
d422eb6
to
5b9c282
Compare
Please note that GPU acceleration for JPEG 2000 decoding is only available for CUDA 11. | ||
|
||
.. note:: | ||
JPEG 2000 decoding and extraction of regions-of-interest (ROI) in not accelerated on the GPU when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
JPEG 2000 region-of-interest (ROI) decoding is not accelerated on the GPU, and will use a CPU implementation regardless of the selected backend.
For a GPU accelerated implementation, consider using separate ``image_decoder`` and ``crop`` operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -220,7 +223,10 @@ image format, it will decode the entire image and crop the selected ROI. | |||
The output of the decoder is in *HWC* layout. | |||
|
|||
Supported formats: JPG, BMP, PNG, TIFF, PNM, PPM, PGM, PBM, JPEG 2000. | |||
Please note that GPU acceleration for JPEG 2000 decoding is only available for CUDA 11. | |||
|
|||
.. note:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JPEG 2000 region-of-interest (ROI) decoding is not accelerated on the GPU, and will use a CPU implementation regardless of the selected backend.
For a GPU accelerated implementation, consider using separate ``image_decoder`` and ``random_crop`` operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -273,7 +279,10 @@ image format, it will decode the entire image and crop the selected ROI. | |||
The output of the decoder is in the *HWC* layout. | |||
|
|||
Supported formats: JPG, BMP, PNG, TIFF, PNM, PPM, PGM, PBM, JPEG 2000. | |||
Please note that GPU acceleration for JPEG 2000 decoding is only available for CUDA 11. | |||
|
|||
.. note:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JPEG 2000 region-of-interest (ROI) decoding is not accelerated on the GPU, and will use a CPU implementation regardless of the selected backend.
For a GPU accelerated implementation, consider using separate ``image_decoder`` and ``slice`` operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
// nvJPEG2k doesn't support deprecated sm while DALI does. In this case, NvJPEG2KHandle may | ||
// be empty and we need to fall back to the CPU implementation | ||
if (nvjpeg2k_handle_) { | ||
auto nvjpeg2k_thread_id = nvjpeg2k_thread_.GetThreadIds()[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel it'd be better to get the thread id before posting the work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"component 0 has a shape of {", comp.component_height, ", ", | ||
comp.component_width, "} and component ", c, " has a shape of {", | ||
height, ", ", width, "}")); | ||
if (nvjpeg2k_handle_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (nvjpeg2k_handle_) { | |
if (!nvjpeg2k_handle_) | |
return false; |
just a suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
for (uint32_t c = 1; c < image_info.num_components; ++c) { | ||
NVJPEG2K_CALL(nvjpeg2kStreamGetImageComponentInfo(jpeg2k_stream, &comp, c)); | ||
DALI_ENFORCE(height == comp.component_height && | ||
width == comp.component_width, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
CacheStore(sample->file_name, output_data, shape, nvjpeg2k_cu_stream_); | ||
} | ||
}); | ||
if (nvjpeg2k_handle_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I'd rather add an early return. But that might be just a personal preference so feel free to ignore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
0f3363b
to
90de0de
Compare
!build |
CI MESSAGE: [2090329]: BUILD FAILED |
CI MESSAGE: [2090052]: BUILD FAILED |
CI MESSAGE: [2090329]: BUILD STARTED |
CI MESSAGE: [2090052]: BUILD PASSED |
CI MESSAGE: [2090329]: BUILD PASSED |
Signed-off-by: Janusz Lisiecki jlisiecki@nvidia.com
Why we need this PR?
Pick one, remove the rest
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
extended tests to use JPEG2K images
adds a CPU fallback for fused (slice/crop/randomCrop) JPEG2000 images decoding
adds check if nvJPEG2000 was initialized successfully and can be used
ImageDecoder for mixed backend
tests for fused ImageDecoder (CPU, Mixed)
NA
new tests are added
NA
JIRA TASK: []