-
Notifications
You must be signed in to change notification settings - Fork 212
Cythonize cuda.core
more
#1070
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cythonize cuda.core
more
#1070
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
7641f18
to
fa88b28
Compare
/ok to test b713c5c |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a comment asking about pointer data type changes and pointer value conversions.
cdef int _get_device_and_context(self) except?-1: | ||
cdef cydriver.CUcontext curr_ctx | ||
if self._device_id == cydriver.CU_DEVICE_INVALID: | ||
# TODO: It is likely faster/safer to call cuCtxGetCurrent? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is universally true, where a context can be non-current without being destroyed. Retrieving the device and context from that stream object would be valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not bother doing a profiling here, just thought that it might be good to reduce a few CUDA calls... 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this TODO before merge? It's not logically correct to call cuCtxGetCurrent
here as the Stream object could be from a different context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see why this TODO is confusing. We actually need the current context in order to do a safe cuCtxGetDevice
dance (set the stream ctx to current -> get device -> revert to the current ctx). Let me clarify in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just use cuCtxGetDevice_v2
which allows passing a ctx explicitly and doesn't require messing with the current ctx?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See commit 22b0e2e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just use
cuCtxGetDevice_v2
which allows passing a ctx explicitly and doesn't require messing with the current ctx?
Unfortunately it seems to be CUDA 13 only... 😢
/ok to test a5e6bcf |
/ok to test 3809a33 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving as the last change I'd like is trivial
/ok to test 22b0e2e |
|
Description
Close #1065
Close #1063
Checklist