emulate hip/cuda-Memcpy3D with a kernel #1014

Merged

psychocoderHPC
Member

  • add kernel to emulate hip/cuda-Memcpy3D
  • add CMake option to enable/disable the emulated memory copy (enabled
    by default only for HIP)
  • enable ALPAKA_EMU_MEMCPY3D for one HIP and one CUDA CI test

This optimization is based on my HIP issue and increases memory copy performance for device-to-device copies on the same device.
I enabled the emulated copy by default only for HIP. For CUDA it can optionally be enabled, but it does not show any improvement. I assume the CUDA driver already uses a kernel instead of looping over the rows and calling 1D memory copies.

/// It is required to start `height * depth` HIP/CUDA blocks.
/// The kernel loops over the memory rows.
template<typename T>
__global__ void hipMemcpy3DEmulatedKernelD2D(
Member Author

note: a native kernel is used to avoid cyclic dependencies within alpaka.
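
The launch shape described in the doc comment above (`height * depth` blocks, each responsible for a memory row) can be illustrated with a small host-side C++ sketch. This is not the kernel from this PR: `PitchedView`, `copyRow` and `emulatedMemcpy3D` are hypothetical names, the one-row-per-block mapping is an assumption, and a plain loop with `std::memcpy` stands in for the GPU blocks and their threads. It only shows how a linear block index could map to a row of a pitched 3D buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical description of a pitched 3D buffer; the names are illustrative, not alpaka's.
template<typename Byte>
struct PitchedView
{
    Byte* ptr;              // address of element (0, 0, 0)
    std::size_t rowPitch;   // bytes between the starts of consecutive rows
    std::size_t slicePitch; // bytes between the starts of consecutive slices
};

// Work of one emulated block: copy the row selected by the linear block index.
inline void copyRow(
    PitchedView<unsigned char> dst,
    PitchedView<unsigned char const> src,
    std::size_t widthBytes,
    std::size_t height,
    std::size_t blockIdx)
{
    std::size_t const z = blockIdx / height; // slice index
    std::size_t const y = blockIdx % height; // row index within the slice
    // On a GPU the threads of the block would stride over this row; memcpy stands in for them.
    std::memcpy(
        dst.ptr + z * dst.slicePitch + y * dst.rowPitch,
        src.ptr + z * src.slicePitch + y * src.rowPitch,
        widthBytes);
}

// Host loop standing in for a launch of `height * depth` blocks.
inline void emulatedMemcpy3D(
    PitchedView<unsigned char> dst,
    PitchedView<unsigned char const> src,
    std::size_t widthBytes,
    std::size_t height,
    std::size_t depth)
{
    // A row must fit inside the pitch of both buffers.
    assert(widthBytes <= dst.rowPitch && widthBytes <= src.rowPitch);
    for(std::size_t blockIdx = 0; blockIdx < height * depth; ++blockIdx)
        copyRow(dst, src, widthBytes, height, blockIdx);
}
```

Because every row is addressed through its own pitch, source and destination may use different paddings, and the padding bytes of the destination are never written. A single launch covering all rows replaces the `height * depth` separate 1D copies that a row-by-row fallback would issue.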

@@ -18,7 +19,10 @@ set(ALPAKA_ACC_CPU_B_SEQ_T_FIBERS_ENABLE_DEFAULT ON)
set(ALPAKA_ACC_CPU_B_TBB_T_SEQ_ENABLE_DEFAULT ON)
set(ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE_DEFAULT ON)
set(ALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE_DEFAULT ON)
set(ALPAKA_ACC_CPU_BT_OMP4_ENABLE_DEFAULT ON)
set(ALPAKA_ACC_ANY_BT_OMP5_ENABLE_DEFAULT ON)
set(ALPAKA_ACC_ANY_BT_OACC_ENABLE_DEFAULT OFF)
Member

I think this should not be part of this PR.

Member Author

fixed

@psychocoderHPC psychocoderHPC force-pushed the topic-emulate3DMemcopy branch 2 times, most recently from e5889d1 to d7a4c63 Compare May 29, 2020 13:02
@psychocoderHPC
Member Author

@BenjaminW3 Do you know what we can do if a CI test fails? GitHub Actions does not allow restarting single tests.

@BenjaminW3
Member

GitHub is working on it. I do not know of a solution other than rebuilding everything or merging it anyway.

@psychocoderHPC
Member Author

Feel free to merge it even if the CI is not passing. The CI passed before, but I fixed an indentation issue in CMake, and today not all tests are passing.

@BenjaminW3 BenjaminW3 merged commit 1e1a1d9 into alpaka-group:develop Jun 2, 2020
@psychocoderHPC psychocoderHPC deleted the topic-emulate3DMemcopy branch June 2, 2020 12:48