Add support for unified memory #296
Comments
Surely, memory access and allocation need better care in alpaka.
I was at a CUDA introductory lesson for summer students at CERN last week, and unified memory was taught as a recommended practice now. I asked the lecturer about it, and they said they base their judgement on measurements from collaborating HPC institutes. According to those measurements, unified memory with async prefetching is at least as fast as ordinary host/device memory plus memcpy, and for complex kernels and access patterns potentially faster. And it's easier to program. This kept me thinking about the lack of support for this in alpaka :)
Eh... we have direct observation of the opposite: unified memory with explicit prefetching was still slower than explicit memory copies last time we tried it. Still, it's something we've been considering, and we have had some internal discussion of what it could look like in alpaka to make it easy to adopt for CMS. I will try to dig up my notes...
cudaMalloc -> cudaMallocManaged
cudaMemcpy -> cudaMemPrefetchAsync
cudaMemAdvise for pinning?
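For illustration, the substitution above could look roughly like this in plain CUDA runtime code (a sketch, not alpaka code; error checking omitted, and `n`, `stream` are assumed to exist in the surrounding program):

```cuda
float* data = nullptr;
size_t bytes = n * sizeof(float);

// cudaMalloc -> cudaMallocManaged: one pointer valid on both host and device
cudaMallocManaged(&data, bytes);

// cudaMemcpy -> cudaMemPrefetchAsync: migrate the pages to the device
// ahead of the kernel launch instead of copying explicitly
int device = 0;
cudaGetDevice(&device);
cudaMemPrefetchAsync(data, bytes, device, stream);

// cudaMemAdvise: hint the driver, e.g. keep read-mostly data resident
cudaMemAdvise(data, bytes, cudaMemAdviseSetReadMostly, device);
```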
The alloc function currently takes a device. If this is DevGpuCuda, the CUDA versions are called. We would need something like a MemSpace to allow selecting the memory space on a per-allocation basis (GPU-only vs. unified).