Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use GPU Convert in nvJPEG decoder #4247

Merged
merged 9 commits into from
Sep 30, 2022
Merged

Conversation

staniewzki
Copy link
Contributor

@staniewzki staniewzki commented Sep 9, 2022

Category:

New feature (non-breaking change which adds functionality)

Description:

This PR adds usage of GPU Convert in nvJPEG decoder. This enables the decoder to output data types other than uint8_t, and support orientation adjustment.

During my work on this PR me and @szkarpinski found out that a lot of the orientation-related code in imgcodec is bugged, and this PR fixes all the previous mistakes.

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@staniewzki staniewzki changed the title Support other output data types in nvJPEG decoder Support other output data types and orientation adjustment in nvJPEG decoder Sep 22, 2022
@staniewzki staniewzki changed the title Support other output data types and orientation adjustment in nvJPEG decoder Use GPU Convert in nvJPEG decoder Sep 22, 2022
@staniewzki staniewzki force-pushed the nvjpeg-convert branch 3 times, most recently from 8080811 to 043d0e4 Compare September 27, 2022 11:58
@staniewzki staniewzki marked this pull request as ready for review September 27, 2022 11:59
if (roi.use_roi()) {
ctx.shape[0] = roi.shape()[0];
ctx.shape[1] = roi.shape()[1];
}
DecodeJpegSample(*in, out.mutable_data<uint8_t>(), opts, ctx);

// Synchronizing on the acces to immediate_buffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Synchronizing on the acces to immediate_buffer
// Synchronizing on the access to immediate_buffer

Comment on lines +197 to +199
auto& intermediate_buffer = ctx.resources.intermediate_buffer;
CUDA_CALL(cudaEventSynchronize(ctx.resources.decode_event));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open question: Do we need to synchronize always or only when we actually use intermediate_buffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only when the buffer is actually used. So I guess here It could be conditional, and later on when some other resources are used we can sync again? The second sync would be a no-op, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The synchronization would have to happen almost in the same place anyways (#4292), I don't think it makes any difference.

if (roi.use_roi()) {
ctx.shape[0] = roi.shape()[0];
ctx.shape[1] = roi.shape()[1];
}
DecodeJpegSample(*in, out.mutable_data<uint8_t>(), opts, ctx);

// Synchronizing on the acces to immediate_buffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Synchronizing on the acces to immediate_buffer
// Synchronizing on the access to itermediate_buffer

?

Comment on lines 192 to 194
ctx.shape[0] = roi.shape()[0];
ctx.shape[1] = roi.shape()[1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use un-rotated size here - (as we do in ParseJpegSample). This is most certainly a bug and needs a follow-up work (this PR or otherwise).

Copy link
Contributor

@mzient mzient left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OKayish - but I'd rather not merge it before #4224
Also, the orientation support with ROI looks broken.

Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
mzient and others added 3 commits September 28, 2022 16:10
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [6053230]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [6053230]: BUILD FAILED

Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
@staniewzki
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [6054356]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [6054356]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [6054356]: BUILD PASSED

Comment on lines +28 to +29
roi.begin[idx] = info.shape[idx ^ swap_xy] - roi.begin[idx];
roi.end[idx] = info.shape[idx ^ swap_xy] - roi.end[idx];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well.... this will only work for exactly 2 dimensions. If we had, say, MJPEG sequence, we'd have an extra outermost dimension and the index xoring hack would break.
Having said that - let's merge it as is and improve in a follow-up.

extent = roi.end[d];
if (d < roi.begin.size())
extent -= roi.begin[d];
if (roi) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check seems redundant...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will double check it, I think I had some error without it. But yes, seems redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think I know why I did this, but that was wrong.

@@ -21,7 +21,7 @@ namespace dali {
namespace imgcodec {

template <typename OutShape>
void OutputShape(OutShape &&out_shape,
void DLL_PUBLIC OutputShape(OutShape &&out_shape,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand it. A template shouldn't require external linkage - unless you're doing something very wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I don't fully understand as well, I though that you told me to do that 😛 Reverting this then.

@staniewzki staniewzki merged commit 256c021 into NVIDIA:main Sep 30, 2022
@staniewzki staniewzki mentioned this pull request Sep 30, 2022
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants