Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JPEG distortion kernel #2801

Merged
merged 14 commits into from
Mar 26, 2021

Conversation

jantonguirao
Copy link
Contributor

@jantonguirao jantonguirao commented Mar 16, 2021

Signed-off-by: Joaquin Anton janton@nvidia.com

Why we need this PR?

Pick one, remove the rest

  • It adds a new feature needed to general JPEG distortion as an augmentation

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    Implemented a JPEG distortion CUDA kernel
  • Affected modules and functionalities:
    New functionality
  • Key points relevant for the review:
    The kernel implementation
  • Validation and testing:
    C++ tests added
  • Documentation (including examples):
    NA

JIRA TASK: [DALI-1919]

@jantonguirao jantonguirao force-pushed the jpeg_distortion_dct_2 branch 2 times, most recently from 8a66b61 to 8048737 Compare March 16, 2021 19:17
}

if (chroma_x == 0) { // once per row
dct_fwd_8x8_1d<col_stride>(&cb[chroma_y][0]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: modify so that the 4 rows (up two rows for luma and one for cb and cr each) are processed in parallel for 4 different threads.

@jantonguirao jantonguirao changed the title [WIP] Implement DCT + IDCT in JPEG distortion kernel Add JPEG distortion kernel Mar 23, 2021
Comment on lines 108 to 109
float x1c7dm3f5apm = c * x1 - d * x7 - f * x3 - a * x5;
float x1d7cp3a5fmm = d * x1 + c * x7 - a * x3 + f * x5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think variables names are meaningful anymore. They are just hard to read now.

float GetQualityFactorScale(int quality) {
quality = clamp<int>(quality, 1, 99);
float q_scale = 1.0f;
if (1 <= quality && quality < 50) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (1 <= quality && quality < 50) {
if (quality < 50) {

It was clamped to 1-99 already.

float q_scale = 1.0f;
if (1 <= quality && quality < 50) {
q_scale = 50.0f / quality;
} else if (50 <= quality && quality < 100) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else if (50 <= quality && quality < 100) {
} else {

It was clamped to 1-99 already.

__syncthreads();

if (quantization) {
auto q_chroma_coeff = ConvertSat<float>(chroma_Q_table(chroma_y, chroma_x));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need ConvertSat<float> here? Can we just pass the value directly to quantize?

Comment on lines 50 to 54
std::vector<std::string> paths{
dali_extra_path() + "/db/single/bmp/0/cat-111793_640_palette_8bit.bmp",
dali_extra_path() + "/db/single/bmp/0/cat-1046544_640.bmp",
dali_extra_path() + "/db/single/bmp/0/cat-3591348_640.bmp",
};
Copy link
Contributor

@JanuszL JanuszL Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::vector<std::string> paths{
dali_extra_path() + "/db/single/bmp/0/cat-111793_640_palette_8bit.bmp",
dali_extra_path() + "/db/single/bmp/0/cat-1046544_640.bmp",
dali_extra_path() + "/db/single/bmp/0/cat-3591348_640.bmp",
};
std::vector<std::string> path = ImageList(testing::dali_extra_path() + "/db/single/bmp", {".bmp"}, 3);

I think it is better to abstract away particular file names. Unless they have properties that are a must for this test.

TensorView<StorageCPU, uint8_t> out_tv(output_host_.data(), flat_sh);
// In the kernel we average the RGB values, then converto to YCbCr
// while here we are first converting and then averaging
// Check(out_view_cpu, out_ref_cpu, EqualEps(40));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?


cv::cvtColor(in_mat, in_mat, cv::COLOR_RGB2BGR);
cv::imencode(".jpg", in_mat, encoded, {cv::IMWRITE_JPEG_QUALITY, ConvertSat<int>(quality_factor)});
cv::cvtColor(in_mat, in_mat, cv::COLOR_BGR2RGB);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this 2nd conversion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just want to leave the input matrix as it was before (RGB).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not used anymore so I think it is just redundant.


using BlkSetup = BlockSetup<2, -1>;
BlkSetup block_setup_;
using BlockDesc = BlkSetup::BlockDesc;

float quality_factor = 20.0f;
mat<8, 8, uint8_t> luma_table;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mat<8, 8, uint8_t> luma_table;
mat<8, 8, uint8_t> luma_table_;


float quality_factor = 20.0f;
mat<8, 8, uint8_t> luma_table;
mat<8, 8, uint8_t> chroma_table;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mat<8, 8, uint8_t> chroma_table;
mat<8, 8, uint8_t> chroma_table_;

cv::cvtColor(out_ref, out_ref, cv::COLOR_BGR2RGB);
std::memcpy(out_ref_view[i].data, out_ref.data,
in_shapes_.tensor_size(i) * sizeof(uint8_t));
}
}

void TestJpegCompressionDistortion() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test with JpegCompressionDistortion<quantization=False>?

Comment on lines 270 to 272
const int vert_chroma_blocks = 2;
const int vert_luma_blocks = vert_chroma_blocks << vert_subsample;
const int horz_chroma_blocks = 4;
const int horz_luma_blocks = horz_chroma_blocks << horz_subsample;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what we can do is:

  • 444
  • 422
  • 420
  • 440

But not 411, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do 2x2, 1x2, 2x1 and 1x1 subsampling.
I don't know how these translate it to these names (and if they even can describe the difference between 1x2 and 2x1).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

444 : horz=false, vert=false
422 : horz=true, vert=false
420 : horz=true, vert=true
440 : horz=false, vert=true

411 : not supported, that is subsampled with factor 4 horizontally

@mzient mzient removed their assignment Mar 24, 2021
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2207817]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2207817]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>
jantonguirao and others added 13 commits March 26, 2021 14:50
… blocks

Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2208161]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2208161]: BUILD PASSED

@jantonguirao jantonguirao merged commit 65f749d into NVIDIA:master Mar 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants