Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenMP][libomptarget] Use two SDMA engines #73633

Merged
merged 1 commit into from
Nov 29, 2023
Merged

Conversation

jplehr
Copy link
Contributor

@jplehr jplehr commented Nov 28, 2023

Limit the use to two SDMA engines to resolve issues with certain cards.

Limit the use to two SDMA engines to resolve issues with certain cards.
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 28, 2023

@llvm/pr-subscribers-backend-amdgpu

Author: Jan Patrick Lehr (jplehr)

Changes

Limit the use to two SDMA engines to resolve issues with certain cards.


Full diff: https://github.com/llvm/llvm-project/pull/73633.diff

1 Files Affected:

  • (modified) openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp (+2-2)
diff --git a/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp b/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
index 208cce90d8e9277..b3eed60d2879d48 100644
--- a/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -161,8 +161,8 @@ Error asyncMemCopy(bool UseMultipleSdmaEngines, void *Dst, hsa_agent_t DstAgent,
       Dst, DstAgent, Src, SrcAgent, Size, NumDepSignals, DepSignals,
       CompletionSignal, (hsa_amd_sdma_engine_id_t)LocalSdmaEngine,
       /*force_copy_on_sdma=*/true);
-  // Increment to use one of three SDMA engines: 0x1, 0x2, 0x4
-  LocalSdmaEngine = (LocalSdmaEngine << 1) % 7;
+  // Increment to use one of two SDMA engines: 0x1, 0x2
+  LocalSdmaEngine = (LocalSdmaEngine << 1) % 3;
   SdmaEngine.store(LocalSdmaEngine, std::memory_order_relaxed);
 
   return Plugin::check(S, "Error in hsa_amd_memory_async_copy_on_engine: %s");

jplehr added a commit to jplehr/aomp that referenced this pull request Nov 28, 2023
This updates the test to work with the changes in
llvm/llvm-project#73633
@@ -161,8 +161,8 @@ Error asyncMemCopy(bool UseMultipleSdmaEngines, void *Dst, hsa_agent_t DstAgent,
Dst, DstAgent, Src, SrcAgent, Size, NumDepSignals, DepSignals,
CompletionSignal, (hsa_amd_sdma_engine_id_t)LocalSdmaEngine,
/*force_copy_on_sdma=*/true);
// Increment to use one of three SDMA engines: 0x1, 0x2, 0x4
LocalSdmaEngine = (LocalSdmaEngine << 1) % 7;
// Increment to use one of two SDMA engines: 0x1, 0x2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the comment that lists all 3? Restrict the limit to only the relevant devices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intention is to first unblock testing and then do better.

@jplehr
Copy link
Contributor Author

jplehr commented Nov 28, 2023

After some more digging, I think limiting this to 2 SDMAs being used is actually the correct way to do it.

@jplehr
Copy link
Contributor Author

jplehr commented Nov 28, 2023

The reason why I think this is right approach is that according to documentation, 2 SDMA engines are best suited for d2h/h2d transfers and other available engines are more suited for d2d transfers.
Since some devices only have those for host/device transfers, this is the safe thing to do, and using the engines best suited for those transfers.

@jplehr jplehr merged commit 3930a0b into llvm:main Nov 29, 2023
5 checks passed
jplehr added a commit to ROCm/aomp that referenced this pull request Nov 29, 2023
This updates the test to work with the changes in
llvm/llvm-project#73633
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants