Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix buffer exhaustion in the frames_decoder_gpu #4723

Merged
merged 1 commit into from
Mar 22, 2023

Conversation

JanuszL
Copy link
Contributor

@JanuszL JanuszL commented Mar 17, 2023

  • frames_decoder_gpu keeps the internal buffer of the num_decode_surfaces_
    length as it is the maximal number of frames that needs to be decoded
    to maintain the display order. The decoder accepts the raw packets as
    long as there is a free place in the buffer. In some cases, after receiving
    the packet two frames could be produced (as the N-1 frame may require N-the packet).
    In this case, we receive more frames and there is spare space. This PR
    adds functionality to enlarge the buffer if there is an additional frame
    to keep. Also, it makes sure we don't try to have more than num_decode_surfaces_
    frames decoded at the time.

Relates to #4498

Category:

Bug fix (non-breaking change which fixes an issue)

Description:

  • frames_decoder_gpu keeps the internal buffer of the num_decode_surfaces_
    length as it is the maximal number of frames that needs to be decoded
    to maintain the display order. The decoder accepts the raw packets as
    long as there is a free place in the buffer. In some cases, after receiving
    the packet two frames could be produced (as the N-1 frame may require N-the packet).
    In this case, we receive more frames and there is spare space. This PR
    adds functionality to enlarge the buffer if there is an additional frame
    to keep. Also, it makes sure we don't try to have more than num_decode_surfaces_
    frames decoded at the time.

Additional information:

Affected modules and functionalities:

  • frames_decoder_gpu

Key points relevant for the review:

  • NA

Tests:

  • Existing tests apply
    • test_video.test_multi_gpu_video has been extended
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

- frames_decoder_gpu keeps the internal buffer of the num_decode_surfaces_
  length as it is the maximal number of frames that needs to be decoded
  to maintain the display order. The decoder accepts the raw packets as
  long as there is a free place in the buffer. In some cases, after receiving
  the packet two frames could be produced (as the N-1 frame may require N-the packet).
  In this case, we receive more frames and there is spare space. This PR
  adds functionality to enlarge the buffer if there is an additional frame
  to keep. Also, it makes sure we don't try to have more than num_decode_surfaces_
  frames decoded at the time.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Mar 17, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7623615]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7623615]: BUILD PASSED

@awolant awolant self-assigned this Mar 18, 2023
Comment on lines +736 to +740
std::vector<BufferedFrame> new_frame_buffer(frame_buffer_.size() + 1);
for (size_t i = 0; i < frame_buffer_.size(); ++i) {
new_frame_buffer[i] = std::move(frame_buffer_[i]);
}
frame_buffer_ = std::move(new_frame_buffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why you decided to create new buffer and move it in place of the old one instead of push_back new frame to the old buffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because DeviceBuffer doesn't have assignment operator:


In file included from /Dali/dali/dali/operators/reader/loader/video/frames_decoder_gpu.h:35,
                 from /Dali/dali/dali/operators/reader/loader/video/frames_decoder_gpu.cc:14:
/Dali/dali/include/dali/core/dev_buffer.h: In instantiation of ‘dali::DeviceBuffer<T>::DeviceBuffer(dali::DeviceBuffer<T>&&) [with T = unsigned char]’:
/Dali/dali/dali/operators/reader/loader/video/frames_decoder_gpu.h:123:8:   required from ‘void __gnu_cxx::new_allocator<_Tp>::construct(_Up*, _Args&& ...) [with _Up = dali::BufferedFrame; _Args = {dali::BufferedFrame}; _Tp = dali::BufferedFrame]’
/usr/include/c++/11/bits/alloc_traits.h:516:17:   required from ‘static void std::allocator_traits<std::allocator<_Tp1> >::construct(std::allocator_traits<std::allocator<_Tp1> >::allocator_type&, _Up*, _Args&& ...) [with _Up = dali::BufferedFrame; _Args = {dali::BufferedFrame}; _Tp = dali::BufferedFrame; std::allocator_traits<std::allocator<_Tp1> >::allocator_type = std::allocator<dali::BufferedFrame>]’
/usr/include/c++/11/bits/vector.tcc:115:30:   required from ‘std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::emplace_back(_Args&& ...) [with _Args = {dali::BufferedFrame}; _Tp = dali::BufferedFrame; _Alloc = std::allocator<dali::BufferedFrame>; std::vector<_Tp, _Alloc>::reference = dali::BufferedFrame&]’
/usr/include/c++/11/bits/stl_vector.h:1204:21:   required from ‘void std::vector<_Tp, _Alloc>::push_back(std::vector<_Tp, _Alloc>::value_type&&) [with _Tp = dali::BufferedFrame; _Alloc = std::allocator<dali::BufferedFrame>; std::vector<_Tp, _Alloc>::value_type = dali::BufferedFrame]’
/Dali/dali/dali/operators/reader/loader/video/frames_decoder_gpu.cc:736:26:   required from here
/Dali/dali/include/dali/core/dev_buffer.h:73:11: error: use of deleted function ‘dali::DeviceBuffer<unsigned char>& dali::DeviceBuffer<unsigned char>::operator=(const dali::DeviceBuffer<unsigned char>&)’
   73 |     *this = other;
      |     ~~~~~~^~~~~~~
/Dali/dali/include/dali/core/dev_buffer.h:70:8: note: ‘dali::DeviceBuffer<unsigned char>& dali::DeviceBuffer<unsigned char>::operator=(const dali::DeviceBuffer<unsigned char>&)’ is implicitly declared as deleted because ‘dali::DeviceBuffer<unsigned char>’ declares a move constructor or move assignment operator
   70 | struct DeviceBuffer {
      |        ^~~~~~~~~~~~

No idea why vector quires it and cannot use move operator insteada.

@JanuszL JanuszL merged commit 86d3d8e into NVIDIA:main Mar 22, 2023
@JanuszL JanuszL deleted the enlarge_buffer_frames_dec branch March 22, 2023 11:27
@stiepan stiepan added this to the Release_1.25.0 milestone Apr 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants