-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libc] Implement 'atexit' on the GPU correctly #83037
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-libc Author: Joseph Huber (jhuber6) ChangesSummary: Depends on #83026 Full diff: https://github.com/llvm/llvm-project/pull/83037.diff 2 Files Affected:
diff --git a/libc/docs/gpu/support.rst b/libc/docs/gpu/support.rst
index 250f0a7794de3c..b2231eb69a590a 100644
--- a/libc/docs/gpu/support.rst
+++ b/libc/docs/gpu/support.rst
@@ -102,6 +102,7 @@ atol |check|
atoll |check|
exit |check| |check|
abort |check| |check|
+atexit |check|
labs |check|
llabs |check|
div |check|
diff --git a/libc/src/stdlib/atexit.cpp b/libc/src/stdlib/atexit.cpp
index 1513b7969f0dbc..5951c5c77e8d1f 100644
--- a/libc/src/stdlib/atexit.cpp
+++ b/libc/src/stdlib/atexit.cpp
@@ -9,6 +9,7 @@
#include "src/stdlib/atexit.h"
#include "src/__support/blockstore.h"
#include "src/__support/common.h"
+#include "src/__support/fixedstack.h"
#include "src/__support/fixedvector.h"
#include "src/__support/threads/mutex.h"
@@ -16,7 +17,7 @@ namespace LIBC_NAMESPACE {
namespace {
-Mutex handler_list_mtx(false, false, false);
+[[maybe_unused]] Mutex handler_list_mtx(false, false, false);
using AtExitCallback = void(void *);
using StdCAtExitCallback = void(void);
@@ -29,12 +30,10 @@ struct AtExitUnit {
};
#if defined(LIBC_TARGET_ARCH_IS_GPU)
-// The GPU build cannot handle the potentially recursive definitions required by
-// the BlockStore class. Additionally, the liklihood that someone exceeds this
-// while executing on the GPU is extremely small.
-// FIXME: It is not generally safe to use 'atexit' on the GPU because the
-// mutexes simply passthrough. We will need a lock free stack.
-using ExitCallbackList = FixedVector<AtExitUnit, 64>;
+// The GPU interface cannot use the standard implementation because it does not
+// support the Mutex type. Instead we use a lock free stack with a sufficiently
+// large size.
+using ExitCallbackList = FixedStack<AtExitUnit, CALLBACK_LIST_SIZE_FOR_TESTS>;
#elif defined(LIBC_COPT_PUBLIC_PACKAGING)
using ExitCallbackList = cpp::ReverseOrderBlockStore<AtExitUnit, 32>;
#else
@@ -60,6 +59,11 @@ void stdc_at_exit_func(void *payload) {
namespace internal {
void call_exit_callbacks() {
+#if defined(LIBC_TARGET_ARCH_IS_GPU)
+ AtExitUnit unit;
+ while (exit_callbacks.pop(unit))
+ unit.callback(unit.payload);
+#else
handler_list_mtx.lock();
while (!exit_callbacks.empty()) {
auto unit = exit_callbacks.back();
@@ -69,14 +73,20 @@ void call_exit_callbacks() {
handler_list_mtx.lock();
}
ExitCallbackList::destroy(&exit_callbacks);
+#endif
}
} // namespace internal
static int add_atexit_unit(const AtExitUnit &unit) {
+#if defined(LIBC_TARGET_ARCH_IS_GPU)
+ if (!exit_callbacks.push(unit))
+ return -1;
+#else
MutexLock lock(&handler_list_mtx);
if (!exit_callbacks.push_back(unit))
return -1;
+#endif
return 0;
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you split the GPU's atexit
implementation to src/stdlib/gpu/atexit.cpp
similar to malloc
and free
there? So that they won't have maybe-unused non-overlapping dependency.
Somewhat of a pain because it required adding some "generic/" directory only for a single file. |
You don't need the |
libc/src/stdlib/atexit.cpp
Outdated
@@ -9,6 +9,7 @@ | |||
#include "src/stdlib/atexit.h" | |||
#include "src/__support/blockstore.h" | |||
#include "src/__support/common.h" | |||
#include "src/__support/fixedstack.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixedstack
is not needed here anymore.
Summary: This function was never marked at supported because it was fundamentally broken when called with multiple threads. The patch in llvm#83026 introduces a lock-free stack that can be used to correctly handle enqueuing callbacks from multiple threads. Although the previous interface tried to provide a consistent API, this was not feasible with the needs for a lock-free stack so I have elected to just use ifdefs. The size is fixed to whatever we use for testing, which currently amounts to about 8KiB dedicated for this thing, which isn't enough to be concenred about. Depends on llvm#83026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for splitting this out! The implementation is cleaner and easier to follow now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary:
This function was never marked at supported because it was fundamentally
broken when called with multiple threads. The patch in
#83026 introduces a lock-free
stack that can be used to correctly handle enqueuing callbacks from
multiple threads. Although the previous interface tried to provide a
consistent API, this was not feasible with the needs for a lock-free
stack so I have elected to just use ifdefs. The size is fixed to
whatever we use for testing, which currently amounts to about 8KiB
dedicated for this thing, which isn't enough to be concenred about.
Depends on #83026