Request: deep_copy within parallel regions #689
Comments
This functionality could be supported, but I'm not sure deep_copy() is the right function for it. What you want is something that copies between views in the same memory space only, whereas deep_copy() implies copying across memory spaces. @crtrott We've talked a couple of times about adding some of the STL algorithm functionality to Kokkos algorithms. One way to approach this problem is a Kokkos algorithms analogue of std::copy(), which would avoid confusing the intent of the name deep_copy(). It does raise a question, though: should our copy() work both inside and outside parallel regions? Should we have separate functions for each case (copy() and serial_copy()), or should a single function detect whether it is in a parallel region and do the right thing?
deep_copy is the "easiest Kokkos kernel" ;-) Would it make sense to have team-level, thread-level, etc. versions of it?
Restricting it to the same memory space and just the single-thread level is fine for my needs. Basically I have a bunch of code like: #ifndef KOKKOS_HAVE_CUDA that I would like to replace with just the subview construction and a call to deep_copy (or an alternatively named function) that does the fill.
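To make the request above concrete, here is a minimal sketch of "construct a subview, then call a single-thread fill". It is plain C++ and does not use Kokkos: StridedView, local_fill, row_of and column_of are hypothetical names invented for illustration, and the struct only stands in for a rank-1 subview (pointer, extent, stride). On a device, an equivalent function would presumably also need to be marked KOKKOS_INLINE_FUNCTION.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a rank-1 subview: a non-owning pointer,
// an extent, and a stride. This is NOT the Kokkos::View API.
template <class T>
struct StridedView {
  T* data;
  std::size_t extent;
  std::size_t stride;

  T& operator()(std::size_t i) const {
    assert(i < extent);  // bounds check in debug builds only
    return data[i * stride];
  }
};

// Hypothetical single-thread fill: the calling thread writes every entry
// itself, so it can be called from inside a loop body without launching
// another kernel.
template <class T>
void local_fill(const StridedView<T>& v, const T& value) {
  for (std::size_t i = 0; i < v.extent; ++i) v(i) = value;
}

// Row i of a row-major (rows x cols) matrix: contiguous, stride 1.
template <class T>
StridedView<T> row_of(std::vector<T>& m, std::size_t cols, std::size_t i) {
  return {m.data() + i * cols, cols, 1};
}

// Column j of the same matrix: strided by the number of columns.
template <class T>
StridedView<T> column_of(std::vector<T>& m, std::size_t rows,
                         std::size_t cols, std::size_t j) {
  return {m.data() + j, rows, cols};
}
```

With something like this, the hand-written loop at each call site shrinks to one line, for example local_fill(row_of(m, cols, i), 0.0) inside the loop over i.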
Is this still needed, and if so is it okay if it's called something different (like |
Yes, this would still be useful. I'm fine with a different name; maybe fill, to be consistent with std::fill?
I have a similar request for this feature. Here's a usage example, which only works on CPU:
This example is a bit abusive because it basically copies the |
@vbrunini I was thinking about how to do this. The main issue is that |
I don't think a solution involving |
@vbrunini Why couldn't we just call |
I hadn't thought about that option, no objections to that.
@mhoemmen I think we've seen issues before when trying to run a |
@ibaned The |
I'd like to have versions of deep_copy (both the copy from another view and the fill with a constant value) that can be called from within parallel regions and are as performant as the equivalent using memcpy/memset on the underlying data.
@crtrott
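As a rough illustration of the "as performant as memcpy" part of the request, the sketch below shows one way a same-memory-space, single-thread copy could take a memcpy fast path when both sides are contiguous and fall back to an element-wise loop otherwise. It is plain C++, not Kokkos code: StridedView and local_copy are hypothetical names, and the sketch assumes the source and destination do not overlap. Whether memcpy is the right primitive inside a device kernel is a separate question that this does not answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a rank-1 view: non-owning pointer, extent,
// stride. This is NOT the Kokkos::View API.
template <class T>
struct StridedView {
  T* data;
  std::size_t extent;
  std::size_t stride;
};

// Hypothetical single-thread copy between two views in the same memory
// space. Assumes dst and src do not overlap.
template <class T>
void local_copy(const StridedView<T>& dst, const StridedView<T>& src) {
  assert(dst.extent == src.extent);

  // Fast path: both sides contiguous and T is safe to copy bytewise.
  if (std::is_trivially_copyable<T>::value && dst.extent != 0 &&
      dst.stride == 1 && src.stride == 1) {
    std::memcpy(static_cast<void*>(dst.data),
                static_cast<const void*>(src.data),
                dst.extent * sizeof(T));
    return;
  }

  // General path: element-wise copy that honors each side's stride.
  for (std::size_t i = 0; i < dst.extent; ++i) {
    dst.data[i * dst.stride] = src.data[i * src.stride];
  }
}
```

A fill with a constant value could follow the same pattern, with the contiguous case reducing to a simple loop that a compiler can usually turn into a memset-like sequence.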