Add JPEG distortion kernel #2801

jantonguirao · 2021-03-16T19:15:35Z

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Why we need this PR?

It adds a new feature needed to generate JPEG distortion as an augmentation.

What happened in this PR?

What solution was applied:
Implemented a JPEG distortion CUDA kernel
Affected modules and functionalities:
New functionality
Key points relevant for the review:
The kernel implementation
Validation and testing:
C++ tests added
Documentation (including examples):
NA

JIRA TASK: [DALI-1919]

jantonguirao · 2021-03-17T10:59:24Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu.cuh

+      }
+
+      if (chroma_x == 0) {  // once per row
+        dct_fwd_8x8_1d<col_stride>(&cb[chroma_y][0]);


TODO: modify so that the 4 rows (up to two rows for luma and one each for Cb and Cr) are processed in parallel by 4 different threads.
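For context on the `dct_fwd_8x8_1d<col_stride>` call quoted above, here is a plain C++ reference sketch of a strided 8-point DCT-II and the row-then-column pass that builds a 2D transform from it. It assumes the template argument is the element stride between consecutive samples. The names `ReferenceDct8` and `ReferenceDct8x8` are hypothetical, and this is the textbook O(N²) formulation, not the factorized kernel from this PR.

```cpp
#include <cmath>

// Hypothetical reference: orthonormal 8-point DCT-II, computed in place over
// data[0], data[stride], ..., data[7 * stride].
template <int stride>
void ReferenceDct8(float *data) {
  const float kPi = 3.14159265358979323846f;
  float out[8];
  for (int k = 0; k < 8; k++) {
    float sum = 0.0f;
    for (int n = 0; n < 8; n++)
      sum += data[n * stride] * std::cos((2 * n + 1) * k * kPi / 16.0f);
    // Orthonormal scaling: sqrt(1/8) for the DC term, sqrt(2/8) otherwise.
    float scale = (k == 0) ? std::sqrt(1.0f / 8.0f) : std::sqrt(2.0f / 8.0f);
    out[k] = scale * sum;
  }
  for (int k = 0; k < 8; k++)
    data[k * stride] = out[k];
}

// 2D transform of a row-major 8x8 block: all rows first (stride 1),
// then all columns (stride 8).
void ReferenceDct8x8(float *block) {
  for (int y = 0; y < 8; y++) ReferenceDct8<1>(block + 8 * y);
  for (int x = 0; x < 8; x++) ReferenceDct8<8>(block + x);
}
```

Each of the 8 row transforms (and, afterwards, each of the 8 column transforms) is independent of the others, which is what makes a one-line-per-thread layout possible.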

JanuszL · 2021-03-24T11:41:49Z

dali/kernels/imgproc/jpeg/dct_8x8_gpu.cuh

+  float x1c7dm3f5apm = c * x1 - d * x7 - f * x3 - a * x5;
+  float x1d7cp3a5fmm = d * x1 + c * x7 - a * x3 + f * x5;


I don't think the variable names are meaningful anymore. They are just hard to read now.

JanuszL · 2021-03-24T11:42:57Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu.cuh

+float GetQualityFactorScale(int quality) {
+  quality = clamp<int>(quality, 1, 99);
+  float q_scale = 1.0f;
+  if (1 <= quality && quality < 50) {


Suggested change

if (1 <= quality && quality < 50) {

if (quality < 50) {

It was clamped to 1-99 already.

JanuszL · 2021-03-24T11:43:34Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu.cuh

+  float q_scale = 1.0f;
+  if (1 <= quality && quality < 50) {
+    q_scale = 50.0f / quality;
+  } else if (50 <= quality && quality < 100) {


Suggested change

} else if (50 <= quality && quality < 100) {

} else {

It was clamped to 1-99 already.

JanuszL · 2021-03-24T12:01:22Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu.cuh

+      __syncthreads();
+
+      if (quantization) {
+        auto q_chroma_coeff = ConvertSat<float>(chroma_Q_table(chroma_y, chroma_x));


Why do we need ConvertSat<float> here? Can we just pass the value directly to quantize?
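For readers unfamiliar with this step, the sketch below shows what quantizing a DCT coefficient against a table entry typically involves. The PR's own `quantize` function is not shown in this thread, so `ScaleTableEntry` and `QuantizeCoefficient` are hypothetical names, and the rounding and clamping choices follow common JPEG practice rather than this PR's code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale a base table entry by the quality scale and keep
// it in the valid 8-bit range [1, 255] (a zero entry would divide by zero).
uint8_t ScaleTableEntry(uint8_t base, float q_scale) {
  float scaled = std::round(base * q_scale);
  return static_cast<uint8_t>(std::min(255.0f, std::max(1.0f, scaled)));
}

// Hypothetical helper: quantize and immediately dequantize one coefficient,
// which is the lossy step that produces the JPEG-like distortion.
float QuantizeCoefficient(float coeff, float q) {
  return q * std::round(coeff / q);
}
```

Because the table entry is promoted to `float` in the division anyway, an explicit conversion beforehand mainly documents intent.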

JanuszL · 2021-03-24T12:05:44Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu_test.cu

+      std::vector<std::string> paths{
+        dali_extra_path() + "/db/single/bmp/0/cat-111793_640_palette_8bit.bmp",
+        dali_extra_path() + "/db/single/bmp/0/cat-1046544_640.bmp",
+        dali_extra_path() + "/db/single/bmp/0/cat-3591348_640.bmp",
+      };


Suggested change

std::vector<std::string> paths{
  dali_extra_path() + "/db/single/bmp/0/cat-111793_640_palette_8bit.bmp",
  dali_extra_path() + "/db/single/bmp/0/cat-1046544_640.bmp",
  dali_extra_path() + "/db/single/bmp/0/cat-3591348_640.bmp",
};

std::vector<std::string> path = ImageList(testing::dali_extra_path() + "/db/single/bmp", {".bmp"}, 3);

I think it is better to abstract away particular file names. Unless they have properties that are a must for this test.

JanuszL · 2021-03-24T12:12:44Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu_test.cu

-    TensorView<StorageCPU, uint8_t> out_tv(output_host_.data(), flat_sh);
+    // In the kernel we average the RGB values, then converto to YCbCr
+    // while here we are first converting and then averaging
+    // Check(out_view_cpu, out_ref_cpu, EqualEps(40));


JanuszL · 2021-03-24T12:15:41Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu_test.cu

+
+      cv::cvtColor(in_mat, in_mat, cv::COLOR_RGB2BGR);
+      cv::imencode(".jpg", in_mat, encoded, {cv::IMWRITE_JPEG_QUALITY, ConvertSat<int>(quality_factor)});
+      cv::cvtColor(in_mat, in_mat, cv::COLOR_BGR2RGB);


Do we need this 2nd conversion?

I just want to leave the input matrix as it was before (RGB).

It is not used anymore so I think it is just redundant.

JanuszL · 2021-03-24T12:17:11Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu_test.cu


  using BlkSetup = BlockSetup<2, -1>;
  BlkSetup block_setup_;
  using BlockDesc = BlkSetup::BlockDesc;
+
+  float quality_factor = 20.0f;
+  mat<8, 8, uint8_t> luma_table;


Suggested change

mat<8, 8, uint8_t> luma_table;

mat<8, 8, uint8_t> luma_table_;

JanuszL · 2021-03-24T12:17:20Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu_test.cu

+
+  float quality_factor = 20.0f;
+  mat<8, 8, uint8_t> luma_table;
+  mat<8, 8, uint8_t> chroma_table;


Suggested change

mat<8, 8, uint8_t> chroma_table;

mat<8, 8, uint8_t> chroma_table_;

JanuszL · 2021-03-24T12:22:54Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu_test.cu

+      cv::cvtColor(out_ref, out_ref, cv::COLOR_BGR2RGB);
+      std::memcpy(out_ref_view[i].data, out_ref.data,
+                  in_shapes_.tensor_size(i) * sizeof(uint8_t));
+    }
  }

  void TestJpegCompressionDistortion() {


Can you add a test with JpegCompressionDistortion<quantization=False>?

JanuszL · 2021-03-24T12:27:29Z

dali/kernels/imgproc/jpeg/jpeg_distortion_gpu.cuh

+  const int vert_chroma_blocks = 2;
+  const int vert_luma_blocks = vert_chroma_blocks << vert_subsample;
+  const int horz_chroma_blocks = 4;
+  const int horz_luma_blocks = horz_chroma_blocks << horz_subsample;


So what we can do is:

444

422

420

440

But not 411, right?

We can do 2x2, 1x2, 2x1 and 1x1 subsampling.
I don't know how these translate to those names (or whether the names can even describe the difference between 1x2 and 2x1).

https://en.wikipedia.org/wiki/Chroma_subsampling#Types_of_sampling_and_subsampling
They cannot express 1xN.

444 : horz=false, vert=false
422 : horz=true, vert=false
420 : horz=true, vert=true
440 : horz=false, vert=true

411 : not supported, that is subsampled with factor 4 horizontally

jantonguirao · 2021-03-26T12:52:09Z

!build

dali-automaton · 2021-03-26T12:56:19Z

CI MESSAGE: [2207817]: BUILD STARTED

dali-automaton · 2021-03-26T13:24:10Z

CI MESSAGE: [2207817]: BUILD FAILED

jantonguirao · 2021-03-26T15:35:24Z

!build

dali-automaton · 2021-03-26T15:41:32Z

CI MESSAGE: [2208161]: BUILD STARTED

dali-automaton · 2021-03-26T18:21:31Z

CI MESSAGE: [2208161]: BUILD PASSED

jantonguirao force-pushed the jpeg_distortion_dct_2 branch 2 times, most recently from 8a66b61 to 8048737 Compare March 16, 2021 19:17

jantonguirao assigned mzient Mar 17, 2021

jantonguirao changed the title ~~[WIP] Implement DCT + IDCT in JPEG distortion kernel~~ Add JPEG distortion kernel Mar 23, 2021

jantonguirao force-pushed the jpeg_distortion_dct_2 branch from d52a7e8 to c0c40f1 Compare March 23, 2021 17:52

jantonguirao assigned awolant and JanuszL Mar 24, 2021

JanuszL reviewed Mar 24, 2021

mzient reviewed Mar 24, 2021

mzient removed their assignment Mar 24, 2021

JanuszL approved these changes Mar 24, 2021

awolant approved these changes Mar 26, 2021

Implement DCT + IDCT

6e0aaf2

Signed-off-by: Joaquin Anton <janton@nvidia.com>

jantonguirao and others added 13 commits March 26, 2021 14:50

Add quantization, test with real images, pad to get full 8x8 or 16x16 blocks

7dd5c7d

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Flatten blocks. Process 1 8-element DCT per thread.

945a797

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Fix edge artifacts.

396c6b7

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Convert BGR to RGB

78df933

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Fix block handling.

73d19db

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Use DeviceArray for matrices

4c8ac29

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Fix boundary conditions.

120ff12

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Enable all tests + fixes

cca6fed

Signed-off-by: Joaquin Anton <janton@nvidia.com>

remove unnecessary __syncthreads

3ea336a

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Use mat 8x8 for quantization tables, fix tests, cleanup

aa177e3

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Process more pixels per block.

aefdacb

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Address code review issues

f1a129b

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Fix clang build

0814e57

Signed-off-by: Joaquin Anton <janton@nvidia.com>

jantonguirao force-pushed the jpeg_distortion_dct_2 branch from 88718f1 to 0814e57 Compare March 26, 2021 15:34

jantonguirao merged commit 65f749d into NVIDIA:master Mar 26, 2021
