[C++] Add device-specific synchronization API to Buffer #36103
Comments
@pitrou Is the idea that the |
Also cc @felipecrv, if you know a bit about GPU APIs.
We should probably put a |
AFAICT, we don't expose |
In any case, the right lifetime semantics will have to be decided, which may entail putting a |
I'm not sure an event is the right thing for Buffers. In addition to needing synchronization, device buffers are often managed in a stream-ordered fashion, i.e. allocated and, more importantly, freed asynchronously from the CPU's perspective, with the ordering handled via CUDA streams. For example, RMM (libcudf's memory manager) has a couple of stream APIs associated to its |
Ok, so instead of exposing an event-like API, we could expose a stream-like API.
### Rationale for this change

Building on the `ArrowDeviceArray`, we need to expand the abstractions for handling events and stream synchronization for devices.

### What changes are included in this PR?

Initial abstract implementations for the new DeviceSync API and a CPU implementation. This will be followed up by a CUDA implementation in a subsequent PR.

### Are these changes tested?

Yes, tests are added for Import/Export DeviceArrays using the DeviceSync handling.

* Closes: #36103

Lead-authored-by: Matt Topol <zotthewizard@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
Describe the enhancement requested
For the C device data interface, and other applications, we'll need to add optional synchronization information to buffers.
Here is, for example, a possible API. It tries to avoid or minimize additional footprint, especially in the CPU case:
Or, as an alternative, define a `DeviceSyncStream` instead of a `DeviceSyncEvent`.

TODO: define lifetime semantics.
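
To make the event-based option concrete, here is a minimal sketch of how an optional sync event could be attached to a buffer without growing the CPU case. It is not the Arrow C++ API: `SketchBuffer`, `SketchDeviceBuffer`, `CountingSyncEvent`, `sync_event()` and `data_after_sync()` are all hypothetical names, and only `DeviceSyncEvent` is taken from the discussion. The assumption is that the base buffer exposes a virtual accessor returning a null event, so only device buffers pay for storing one.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical event interface: something a consumer can wait on
// before touching the buffer's memory.
class DeviceSyncEvent {
 public:
  virtual ~DeviceSyncEvent() = default;
  // Block until the producer's pending work on the buffer has completed.
  virtual void Wait() = 0;
};

// Stand-in for a real device event (illustration only): it just counts
// how many times Wait() was called instead of talking to a device.
class CountingSyncEvent : public DeviceSyncEvent {
 public:
  void Wait() override { ++wait_count_; }
  int wait_count() const { return wait_count_; }

 private:
  int wait_count_ = 0;
};

// Hypothetical base buffer. It stores no sync member; the virtual accessor
// returns null, so CPU buffers carry no additional footprint.
class SketchBuffer {
 public:
  explicit SketchBuffer(std::vector<std::uint8_t> data) : data_(std::move(data)) {}
  virtual ~SketchBuffer() = default;

  // Null means "no synchronization needed" (the CPU case).
  virtual std::shared_ptr<DeviceSyncEvent> sync_event() const { return nullptr; }

  // Wait on the event, if any, then hand out the data pointer.
  const std::uint8_t* data_after_sync() const {
    if (auto event = sync_event()) event->Wait();
    return data_.data();
  }

 private:
  std::vector<std::uint8_t> data_;
};

// Hypothetical device buffer: only this subclass stores an event.
class SketchDeviceBuffer : public SketchBuffer {
 public:
  SketchDeviceBuffer(std::vector<std::uint8_t> data,
                     std::shared_ptr<DeviceSyncEvent> event)
      : SketchBuffer(std::move(data)), event_(std::move(event)) {}

  std::shared_ptr<DeviceSyncEvent> sync_event() const override { return event_; }

 private:
  std::shared_ptr<DeviceSyncEvent> event_;
};
```

In this sketch the event is shared between the buffer and whoever else holds it, which is one possible answer to the lifetime question, not a settled one. A stream-like variant would presumably swap the `Wait()`-only interface for one that can also order later work, but that shape is left open here.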
Component(s)
C++