Image Decoder: Unified behavior across backends, Alpha channel support in PNG and JP2, YCbCr support in JP2 #2867
Conversation
}
data.shape = {height, width, image_info.num_components};
data.req_nchannels = NumberOfChannels(output_image_type_, data.shape[2]);
data.bpp = std::max<int>(data.bpp, comp.precision);
Repeats L460.
It's a bug; it should be inside the for loop.
@@ -527,7 +535,7 @@ class nvJPEGDecoder : public Operator<MixedBackend>, CachedDecoderImpl {
#endif
data.shape = {heights[0], widths[0], c};
data.subsampling = subsampling;

data.req_nchannels = NumberOfChannels(output_image_type_, c);
Maybe this can be extracted after L556, as the same thing is repeated in L551?
Output g = ConvertNorm<Output>(input[tid + npixels]);
Output b = ConvertNorm<Output>(input[tid + 2 * npixels]);
Output *out = output + 3 * tid;
out[0] = kernels::rgb_to_y<Output>({r, g, b});
Do we need to convert before feeding to rgb_to_y, or should it accept any type and convert only the final result?
I think we're looking at a different design problem here: the componentwise functions (rgb_to_y, rgb_to_cb) work on unnormalized float values, but this function normalizes values. ConvertNorm<smaller_type> applied to the original components will result in a loss of precision. The correct approach would be to convert to float, convert to CbCr, add the half-range to Cb/Cr (if needed), normalize the range, and then ConvertSat<output>. Note that the CbCr component conversion has the nasty hardcoded 128 half-range, which will break things sooner rather than later.
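To make the precision argument concrete, here is a small standalone Python sketch (illustration only, not DALI code; the weights are the standard BT.601 luma coefficients) comparing a float-first luma computation against one that quantizes the components to a narrower type first:

```python
# Illustration only: standard BT.601 luma weights, plain Python (not DALI code).
RGB_TO_Y = (0.299, 0.587, 0.114)

def y_float_first(r, g, b):
    """Compute the weighted sum in float, then round/saturate once at the end."""
    y = RGB_TO_Y[0] * r + RGB_TO_Y[1] * g + RGB_TO_Y[2] * b
    return max(0, min(255, round(y)))

def y_narrow_first(r, g, b, levels=16):
    """Quantize each component to `levels` steps first (mimicking an early
    ConvertNorm to a smaller type), then compute the weighted sum."""
    step = 256 // levels
    r, g, b = (step * (c // step) for c in (r, g, b))
    y = RGB_TO_Y[0] * r + RGB_TO_Y[1] * g + RGB_TO_Y[2] * b
    return max(0, min(255, round(y)))
```

For dark pixels the early quantization visibly shifts the result: (10, 20, 30) yields 18 with the float-first path but only 11 when the components are quantized first.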
auto r = ConvertNorm<Output>(input[tid]);
auto g = ConvertNorm<Output>(input[tid + npixels]);
auto b = ConvertNorm<Output>(input[tid + 2 * npixels]);
output[tid] = kernels::rgb_to_y<Output>({r, g, b});
Similar as above.
compare_pipelines(img_decoder_pipe("cpu", out_type=img_out_type, files=files),
                  img_decoder_pipe("mixed", out_type=img_out_type, files=files),
                  batch_size=batch_size_test, N_iterations=3,
                  eps = eps)
Suggested change:
-                  eps = eps)
+                  eps=eps)
("jpeg2k", "db/single/multichannel/with_alpha", 'jp2'),
("jpeg2k", "db/single/16bit", 'jp2'),
("png", "db/single/multichannel/with_alpha", 'png')]:
subdir = ''  # In those paths the images are not organized in subdirs
How about subdir = None, and adding a way to handle it?
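A sketch of the subdir = None idea (hypothetical helper, with "/" joins used for brevity instead of os.path.join): None would mean "no subdirectory level at all", avoiding the magic empty string.

```python
def make_path(root, subdir, fname):
    """Build an image path; subdir=None means files sit directly under root."""
    parts = [root, fname] if subdir is None else [root, subdir, fname]
    return "/".join(parts)
```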
template <typename T>
__inline__ __device__ T rgb_to_cb(vec<3, T> rgb) {
  return ConvertSat<T>(-0.16873589f * rgb.x - 0.33126411f * rgb.y + 0.50000000f * rgb.z + 128.0f);
I don't like this + 128.0f here. It works well for uint8 outputs, but makes no sense for signed T, and its applicability is dubious with float output, too. Perhaps we should add the half-range for unsigned T and not add anything otherwise? It would affect JPEG distortion. We need to discuss this.
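One possible rule, sketched as a hypothetical plain-Python helper (the bit-width/signedness flags stand in for the C++ type traits that would actually be used): add half of the unsigned range for unsigned integer outputs, and no offset for signed or floating-point ones.

```python
def chroma_offset(bits, signed, is_float):
    """Offset to add to Cb/Cr: half the unsigned range for unsigned integer
    outputs (128 for 8-bit, 32768 for 16-bit), nothing for signed or float."""
    if is_float or signed:
        return 0.0
    return float(1 << (bits - 1))
```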
Adds a check inside the decoder that the requested image format matches the underlying image. The check is added only for JPEG 2000, as in the case of JPEG only RGB images are supported; for the host fallback, DALI always converts to the expected format if requested. Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
…g. RGBA) to RGB. Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
!build
CI MESSAGE: [2282818]: BUILD STARTED
CI MESSAGE: [2282818]: BUILD PASSED
"component 0 has a shape of {", comp.component_height, ", ",
comp.component_width, "} and component ", c, " has a shape of {",
height, ", ", width, "}"));
data.bpp = std::max<int>(data.bpp, comp.precision);
Hmm... precision is defined per component, whereas here you take the max precision. I wonder what would happen in the case of an image with a different precision for different components... is that even possible?
I would put an enforce for that.
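Such an enforce could look like this hypothetical standalone check (plain Python for illustration; the real code would likely use DALI_ENFORCE on the parsed component info):

```python
def check_uniform_precision(precisions):
    """Raise if the components declare different bit depths; otherwise
    return the common precision."""
    if len(set(precisions)) != 1:
        raise ValueError(f"Components have mixed precisions: {precisions}")
    return precisions[0]
```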
void clear() {
  encoded_length = 0;
  bpp = 8;
  selected_decoder = nullptr;
  is_progressive = false;
  req_nchannels = -1;
  method = DecodeMethod::Host;
  subsampling = NVJPEG_CSS_UNKNOWN;
}
It's beyond the scope of this PR, but this method (and the fact that you can't just assign {} to reset it) is a manifestation of an underlying problem: this object serves two purposes. It is both SampleData and SampleDecoderContext, or whatever the name should be. I don't think they should be aggregated this way. A valid solution would be to rename it to SampleContext and create a subaggregate (SampleData) containing the fields that are reset here (plus filename, which should probably (?) be reset here, too).
I agree with this. I'll add a TODO here for now. It deserves a separate PR.
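The proposed split could look roughly like this (a Python dataclass sketch of the idea only; the actual code is a C++ struct, and the names follow the suggestion above):

```python
from dataclasses import dataclass, field

@dataclass
class SampleData:
    """Per-sample fields that must be reset between iterations."""
    encoded_length: int = 0
    bpp: int = 8
    is_progressive: bool = False
    req_nchannels: int = -1

@dataclass
class SampleContext:
    """Longer-lived decoder state; only the `data` subaggregate is reset."""
    data: SampleData = field(default_factory=SampleData)

    def clear(self):
        # Plain reassignment replaces the field-by-field reset.
        self.data = SampleData()
```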
@@ -182,7 +184,7 @@ def __init__(self, device, batch_size, num_threads=1, device_id=0):
super(MultichannelPipeline, self).__init__(batch_size, num_threads, device_id)
self.device = device

-        self.reader = ops.readers.File(file_root=multichannel_tiff_root)
+        self.reader = ops.readers.File(files = multichannel_tiff_files)
Nitpick: Shouldn't named parameters go without spaces around assignment?
Only nitpicks/comments, nothing really stopping it.
namespace dali {
namespace kernels {
namespace jpeg {

// TODO(janton): Purposely not using the color space conversion utils here.
// We need better color space conversion utilities, and we risk performance
// using the conversion functions in color_space_conversion_impl.h
What are they for in that case?
They are used for the image decoder color conversion. Those utils can convert between types, so to allow going from a narrower to a wider type and vice versa we have to normalize to float first. This introduces extra multiplications that don't really make sense for the JPEG distortion kernel, since we are working with uint8. Making generic, highly optimized color conversion utils is beyond the scope of this PR and should probably be tackled separately.
Can you write this in the comment too?
I went ahead and unified the color space conversion utils (we had 3 different ones).
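The normalize-to-float chain described above can be illustrated with a tiny hypothetical helper (not the actual ConvertNorm implementation): the value is mapped to [0, 1] in float and then rescaled to the output range, which makes narrower-to-wider (and reverse) conversions uniform but adds the extra multiplications a uint8-only kernel doesn't need.

```python
def convert_norm(value, in_max, out_max):
    """Normalize to [0, 1] in float, then scale to the output range."""
    return round(value / in_max * out_max)

# e.g. uint8 -> uint16: full range maps to full range
# convert_norm(255, 255, 65535) == 65535
```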
DALIDataType pixel_type = sample->bpp == 16 ? DALI_UINT16 : DALI_UINT8;
output_image.pixel_type = sample->bpp == 16 ? NVJPEG2K_UINT16 : NVJPEG2K_UINT8;
Suggested change:
-DALIDataType pixel_type = sample->bpp == 16 ? DALI_UINT16 : DALI_UINT8;
-output_image.pixel_type = sample->bpp == 16 ? NVJPEG2K_UINT16 : NVJPEG2K_UINT8;
+DALIDataType pixel_type;
+if (sample->bpp == 16) {
+  pixel_type = DALI_UINT16;
+  output_image.pixel_type = NVJPEG2K_UINT16;
+} else {
+  pixel_type = DALI_UINT8;
+  output_image.pixel_type = NVJPEG2K_UINT8;
+}
just a suggestion
PermuteToInterleaved(output_data, i_out, comp_size, sample->shape[2], nvjpeg2k_cu_stream_);
if (need_processing) {
  if (output_image_type_ == DALI_GRAY) {
    // Converting to Gray, dropping extra channels if needed
Suggested change:
-    // Converting to Gray, dropping extra channels if needed
+    // Converting to Gray, dropping alpha channels if needed
?
pipe0 = pipe(device, out_type=out_type, files=files)
pipe0.build()
out0, shape0 = pipe0.run()
if device == 'mixed':
    out0 = out0.as_cpu()
out0 = np.array(out0[0])
shape0 = np.array(shape0[0])
expected_channels = 4 if out_type == types.ANY_DATA else \
    1 if out_type == types.GRAY else \
    3
assert out0.shape[2] == expected_channels, \
    f"Expected {expected_channels} but got {out0.shape[2]}"
Suggested change (rename pipe0/out0/shape0 to pipe/out/shape):
pipe = pipe(device, out_type=out_type, files=files)
pipe.build()
out, shape = pipe.run()
if device == 'mixed':
    out = out.as_cpu()
out = np.array(out[0])
shape = np.array(shape[0])
expected_channels = 4 if out_type == types.ANY_DATA else \
    1 if out_type == types.GRAY else \
    3
assert out.shape[2] == expected_channels, \
    f"Expected {expected_channels} but got {out.shape[2]}"
Signed-off-by: Joaquin Anton <janton@nvidia.com>
!build
CI MESSAGE: [2286089]: BUILD STARTED
CI MESSAGE: [2286089]: BUILD FAILED
Signed-off-by: Joaquin Anton <janton@nvidia.com>
!build
CI MESSAGE: [2286307]: BUILD STARTED
CI MESSAGE: [2286307]: BUILD PASSED
Why do we need this PR?
What happened in this PR?
- Unified behavior in `mixed` and `cpu` versions of the ImageDecoder operator
- Added support for alpha channel in PNG and JP2 files (TIFF already supported)
- Added support for YCbCr and BGR in JP2
- Added consistency tests
Image Decoder
Changes in the decoders
New tests added
NA
JIRA TASK: [DALI-1948]