Make async_pool immune to handle reuse. #5348
Merged
mzient merged 3 commits into NVIDIA:main on Mar 5, 2024
mzient
commented
Mar 1, 2024
Contributor
Author
This file is almost copied from cv-cuda.
mzient
commented
Mar 1, 2024
  detail::pooled_map<char *, padded_block, true> padded_;

- std::unordered_map<cudaStream_t, PerStreamFreeBlocks> stream_free_;
+ std::unordered_map<uint64_t, PerStreamFreeBlocks> stream_free_;
Contributor
Author
The per-stream resources are keyed not with a handle, but with an ID hint.
Collaborator
CI MESSAGE: [13203053]: BUILD STARTED
Collaborator
CI MESSAGE: [13203053]: BUILD PASSED
mzient
commented
Mar 5, 2024
Comment on lines +10 to +11
+ "cuGetProcAddress": {},
+ "cuGetProcAddress_v2": {},
Contributor
Author
Depending on the CUDA header version, we need either the former or the latter (the latter starting with CUDA 12.0).
jantonguirao
reviewed
Mar 5, 2024
dali/core/mm/stream_id_hint.cc
Outdated
return fn;
}

bool _hasPreciseHint() {
Contributor
nit: why the leading underscore in `_hasPreciseHint`?
Contributor
Author
Good question. I don't know :)
jantonguirao
approved these changes
Mar 5, 2024
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
9c08600 to
8a4fab1
Compare
Collaborator
CI MESSAGE: [13281756]: BUILD STARTED
Collaborator
CI MESSAGE: [13281883]: BUILD STARTED
Remove underscore from a function name. Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
c9828d8 to
ddaf0a7
Compare
Collaborator
CI MESSAGE: [13281980]: BUILD STARTED
awolant
approved these changes
Mar 5, 2024
szkarpinski
approved these changes
Mar 5, 2024
Collaborator
CI MESSAGE: [13281980]: BUILD PASSED
Category:
Bug fix (strictly speaking, the previous behavior was accepted and documented, but it was nonetheless limiting and error-prone).
Description:
Prior to this change, stream-ordered allocations had to use stream handles with care: specifically, it was illegal to destroy a stream that still had a pending deallocation. This could cause problems with streams over which we have no control.
This PR removes this limitation. In addition, handling of the per-thread default stream was broken before this change.
Instead of using a stream handle, this PR tries to obtain a unique stream ID. If possible, we proceed as before, but with extra guarantees - we don't have to care about the stream being destroyed, because the ID is unique.
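To illustrate why a unique ID helps, here is a minimal, CUDA-free sketch. `FakeDriver`, `FakeStreamHandle`, `handle_key_aliases` and `id_key_aliases` are hypothetical names made up for this example; they are not part of DALI or the CUDA API. The toy driver recycles handle values but never reuses IDs, which is the assumption the description above relies on.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for illustration; not a CUDA or DALI type.
using FakeStreamHandle = std::uintptr_t;

// A toy "driver" that recycles handle values, the way a real driver may give
// a newly created stream the same handle value as a destroyed one.
class FakeDriver {
 public:
  FakeStreamHandle create() {
    FakeStreamHandle h;
    if (!recycled_.empty()) {
      h = recycled_.back();
      recycled_.pop_back();
    } else {
      h = next_handle_;
      next_handle_ += 0x100;
    }
    ids_[h] = next_id_++;  // assumption: IDs are never reused
    return h;
  }

  void destroy(FakeStreamHandle h) {
    ids_.erase(h);
    recycled_.push_back(h);
  }

  std::uint64_t id_of(FakeStreamHandle h) const { return ids_.at(h); }

 private:
  std::vector<FakeStreamHandle> recycled_;
  std::unordered_map<FakeStreamHandle, std::uint64_t> ids_;
  FakeStreamHandle next_handle_ = 0x1000;
  std::uint64_t next_id_ = 1;
};

// Free lists keyed by handle: returns true if a new stream finds blocks
// that were left behind by a destroyed stream with the same handle value.
bool handle_key_aliases() {
  FakeDriver drv;
  std::unordered_map<FakeStreamHandle, int> free_blocks;
  FakeStreamHandle a = drv.create();
  free_blocks[a] = 1;                 // pending deallocation on stream a
  drv.destroy(a);
  FakeStreamHandle b = drv.create();  // receives the recycled handle value
  return free_blocks.count(b) != 0;
}

// Free lists keyed by unique ID: the same scenario, keyed by id_of().
bool id_key_aliases() {
  FakeDriver drv;
  std::unordered_map<std::uint64_t, int> free_blocks;
  FakeStreamHandle a = drv.create();
  free_blocks[drv.id_of(a)] = 1;
  drv.destroy(a);
  FakeStreamHandle b = drv.create();
  return free_blocks.count(drv.id_of(b)) != 0;
}
```

In this toy model, the handle-keyed map mistakes the new stream for the destroyed one, while the ID-keyed map does not.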
When a stream ID is not available (old drivers), this code uses a handle-derived non-unique ID and scans the per-stream free blocks linearly when returning to upstream. When getting a per-stream free block, we're not sure it really comes from the same stream. Therefore, if the block isn't ready yet, we record an event on the requesting stream, so that in the worst case, the stream would wait for the allocation to really happen.
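The fallback can be pictured with the following simplified sketch. It is not the DALI implementation: `ToyPool`, `FreeBlock`, `AcquireResult` and `collect_ready` are hypothetical names, the `ready` flag stands in for "the work preceding the deallocation has completed", and the event mentioned above is reduced to a `must_wait` boolean. How the real code decides which blocks go back upstream is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration only.
struct FreeBlock {
  int id;
  bool ready;  // stand-in for "pending work on this block has completed"
};

struct AcquireResult {
  int block_id;    // -1 if no block was available
  bool must_wait;  // true: the requesting stream needs extra synchronization
};

class ToyPool {
 public:
  explicit ToyPool(bool precise_ids) : precise_(precise_ids) {}

  void release(std::uint64_t stream_key, FreeBlock blk) {
    free_[stream_key].push_back(blk);
  }

  AcquireResult acquire(std::uint64_t stream_key) {
    auto it = free_.find(stream_key);
    if (it == free_.end() || it->second.empty())
      return {-1, false};
    FreeBlock blk = it->second.back();
    it->second.pop_back();
    // With a unique ID, an equal key means the same stream, so stream order
    // is enough. With a handle-derived key, a block that is not ready yet may
    // belong to a different stream, so synchronization is requested.
    bool must_wait = !precise_ && !blk.ready;
    return {blk.id, must_wait};
  }

  // Linear scan over every per-stream list, removing blocks that are ready
  // (assumed here to be the ones that can be returned upstream).
  std::vector<int> collect_ready() {
    std::vector<int> out;
    for (auto &kv : free_) {
      auto &list = kv.second;
      for (std::size_t i = 0; i < list.size();) {
        if (list[i].ready) {
          out.push_back(list[i].id);
          list[i] = list.back();
          list.pop_back();
        } else {
          ++i;
        }
      }
    }
    return out;
  }

 private:
  bool precise_;
  std::unordered_map<std::uint64_t, std::vector<FreeBlock>> free_;
};
```

With `precise_ids == true`, `acquire` never asks for extra synchronization; with `false`, it does so only for blocks that are not ready, which matches the "worst case" wording above.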
Additional information:
The functionality borrows heavily from the implementation of stream ID hints in cv-cuda.
Affected modules and functionalities:
Memory resources.
Key points relevant for the review:
N/A
Tests:
NOTE: It's impossible to trigger the fallback behavior manually (perhaps we could have a separate test target with some env vars that are used only for testing, but I don't think it's worth the extra overhead). It triggers automatically on old (pre-CUDA 12) drivers.
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A