Skip to content

Make async_pool immune to handle reuse.#5348

Merged
mzient merged 3 commits intoNVIDIA:mainfrom
mzient:fix_stream_handle_aliasing
Mar 5, 2024
Merged

Make async_pool immune to handle reuse.#5348
mzient merged 3 commits intoNVIDIA:mainfrom
mzient:fix_stream_handle_aliasing

Conversation

@mzient
Copy link
Contributor

@mzient mzient commented Mar 1, 2024

Category:

Bug fix .... actually, previous behavior was accepted and documented, but nonetheless limiting and error prone.

Description:

Prior to this change, stream-ordered allocations had to use stream handles with care - specifically, it was illegal to delete a stream which still had a pending deallocation. This might have caused problems with streams over which we have no control.
This PR removes this limitation. Also, handling of per-thread default stream was broken.

Instead of using a stream handle, this PR tries to obtain a unique stream ID. If possible, we proceed as before, but with extra guarantees - we don't have to care about the stream being destroyed, because the ID is unique.
When a stream ID is not available (old drivers), this code uses a handle-derived non-unique ID and scans the per-stream free blocks linearly when returning to upstream. When getting a per-stream free block, we're not sure it really comes from the same stream. Therefore, if the block isn't ready yet, we record an event on the requesting stream, so that in the worst case, the stream would wait for the allocation to really happen.

Additional information:

The functionality borrows heavily on the implementation of stream id hints in cv-cuda.

Affected modules and functionalities:

Memory resources.

Key points relevant for the review:

N/A

Tests:

NOTE: It's impossible to trigger the fallback behavior manually (well... perhaps we could have a separate test target with some env vars which are only used for testing - I don't think it's worth the extra overhead). It triggers automatically on old (pre-cuda12) drivers.

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is almost copied from cv-cuda.

detail::pooled_map<char *, padded_block, true> padded_;

std::unordered_map<cudaStream_t, PerStreamFreeBlocks> stream_free_;
std::unordered_map<uint64_t, PerStreamFreeBlocks> stream_free_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The per-stream resources are keyed not with a handle, but with an ID hint.

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [13203053]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [13203053]: BUILD PASSED

Comment on lines +10 to +11
"cuGetProcAddress": {},
"cuGetProcAddress_v2": {},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depending on CUDA header versions, we need either the former or the latter (starting with CUDA 12.0).

return fn;
}

bool _hasPreciseHint() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why the _ at the beginning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I don't know :)

@jantonguirao jantonguirao self-assigned this Mar 5, 2024
mzient added 2 commits March 5, 2024 09:58
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@mzient mzient force-pushed the fix_stream_handle_aliasing branch from 9c08600 to 8a4fab1 Compare March 5, 2024 09:04
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [13281756]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [13281883]: BUILD STARTED

Remove underscore from a function name.

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@mzient mzient force-pushed the fix_stream_handle_aliasing branch from c9828d8 to ddaf0a7 Compare March 5, 2024 09:09
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [13281980]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [13281980]: BUILD PASSED

@mzient mzient merged commit 3c263df into NVIDIA:main Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants