Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libomptarget][nextgen-plugin] Always use a signal to trigger completion of H2D data transfers #83475

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

doru1004
Copy link
Contributor

@doru1004 doru1004 commented Feb 29, 2024

Always use a signal to trigger completion of H2D data transfers.

I refactored this to avoid relying solely on the built-in synchronous behavior of memcpy and reuse the signal-based approach used everywhere else in the asynchronous implementation. From a consistency perspective this code now performs the same steps as the asyncActionCallback which also does a memcpy() + Slot.Signal->signal().

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 29, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Gheorghe-Teodor Bercea (doru1004)

Changes

Always use a signal to trigger completion of H2D data transfers.


Full diff: https://github.com/llvm/llvm-project/pull/83475.diff

1 Files Affected:

  • (modified) openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp (+8-15)
diff --git a/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp b/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
index 81634ae1edc490..55794d8bbf2264 100644
--- a/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -1354,12 +1354,10 @@ struct AMDGPUStreamTy {
       Signal->increaseUseCount();
     }
 
-    AMDGPUSignalTy *OutputSignal = OutputSignals[0];
-
     std::lock_guard<std::mutex> Lock(Mutex);
 
     // Consume stream slot and compute dependencies.
-    auto [Curr, InputSignal] = consume(OutputSignal);
+    auto [Curr, InputSignal] = consume(OutputSignals[0]);
 
     // Avoid defining the input dependency if already satisfied.
     if (InputSignal && !InputSignal->load())
@@ -1383,22 +1381,17 @@ struct AMDGPUStreamTy {
       if (auto Err = Plugin::check(Status,
                                    "Error in hsa_amd_signal_async_handler: %s"))
         return Err;
-
-      // Let's use now the second output signal.
-      OutputSignal = OutputSignals[1];
-
-      // Consume another stream slot and compute dependencies.
-      std::tie(Curr, InputSignal) = consume(OutputSignal);
     } else {
       // All preceding operations completed, copy the memory synchronously.
       std::memcpy(Inter, Src, CopySize);
 
-      // Return the second signal because it will not be used.
-      OutputSignals[1]->decreaseUseCount();
-      if (auto Err = SignalManager.returnResource(OutputSignals[1]))
-        return Err;
+      // Signal the end of the operation.
+      Slots[Curr].Signal->signal();
     }
 
+    // Consume another stream slot and compute dependencies.
+    std::tie(Curr, InputSignal) = consume(OutputSignals[1]);
+
     // Setup the post action to release the intermediate pinned buffer.
     if (auto Err = Slots[Curr].schedReleaseBuffer(Inter, MemoryManager))
       return Err;
@@ -1409,10 +1402,10 @@ struct AMDGPUStreamTy {
       hsa_signal_t InputSignalRaw = InputSignal->get();
       return utils::asyncMemCopy(UseMultipleSdmaEngines, Dst, Agent, Inter,
                                  Agent, CopySize, 1, &InputSignalRaw,
-                                 OutputSignal->get());
+                                 OutputSignals[1]->get());
     }
     return utils::asyncMemCopy(UseMultipleSdmaEngines, Dst, Agent, Inter, Agent,
-                               CopySize, 0, nullptr, OutputSignal->get());
+                               CopySize, 0, nullptr, OutputSignals[1]->get());
   }
 
   // AMDGPUDeviceTy is incomplete here, passing the underlying agent instead

@jhuber6
Copy link
Contributor

jhuber6 commented Feb 29, 2024

Could you add some more context to your commit messages? Just from looking at it I can't tell why this is a necessary change.

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work, let's hope it doesn't break anything.

}

// Consume another stream slot and compute dependencies.
std::tie(Curr, InputSignal) = consume(OutputSignals[1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for std::tie, we're on C++17.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants