[libc] Move from alias(X) to asm(X) for aliasing #89333
The previous method of aliasing a function internally was `[[gnu::alias(#name)]]`, but that ran into issues with gcc under some circumstances. By using `asm(#name);` instead, those issues are avoided.
@llvm/pr-subscribers-libc
Author: Michael Jones (michaelrj-google)
Full diff: https://github.com/llvm/llvm-project/pull/89333.diff
1 file affected:
diff --git a/libc/src/__support/common.h b/libc/src/__support/common.h
index 53951dc131c28b..69268567bb2e2a 100644
--- a/libc/src/__support/common.h
+++ b/libc/src/__support/common.h
@@ -25,7 +25,7 @@
#define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) \
LLVM_LIBC_FUNCTION_ATTR decltype(LIBC_NAMESPACE::name) \
__##name##_impl__ __asm__(#name); \
- decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]]; \
+ decltype(LIBC_NAMESPACE::name) name asm(#name); \
type __##name##_impl__ arglist
#else
#define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) type name arglist
Consider adding `Fixes: #60481` to the commit message so that GitHub will auto-close #60481 once this PR is merged.
Were you able to verify that this fixes the observed issue with GCC+overlay mode?
Are you sure this works? Looking into the IR we no longer have any aliases. Internal references are therefore not resolved in things like the startup code.
$ clang puts.c --target=amdgcn-amd-amdhsa -mcpu=native -flto ../../clang/lib/amdgcn-amd-amdhsa/crt1.o -lc
ld.lld: error: undefined symbol: __llvm_libc_19_0_0_git::atexit(void (*)())
>>> referenced by a.out.lto.o:(_begin)
>>> referenced by a.out.lto.o:(_begin)
ld.lld: error: undefined symbol: __llvm_libc_19_0_0_git::exit(int)
>>> referenced by a.out.lto.o:(_end)
>>> referenced by a.out.lto.o:(_end)
clang: error: ld.lld command failed with exit code 1 (use -v to see invocation)
Before:
After:
Huh, so the gnu alias attribute must mangle the identifier if the aliasee is within the same namespace. I guess it's back to the drawing board for the GCC+overlay issue.
Working on figuring out whether this is the right way forward. Here's the original patch where this was added: https://reviews.llvm.org/D94195
Will investigate more tomorrow.
@@ -25,7 +25,7 @@
#define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) \
LLVM_LIBC_FUNCTION_ATTR decltype(LIBC_NAMESPACE::name) \
__##name##_impl__ __asm__(#name); \
decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]]; \
The semantics of these two are different. In the old one, `decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]];` gives you a public C++ function `LIBC_NAMESPACE::name` (because we use this macro inside `namespace LIBC_NAMESPACE { ... }`), which is internally aliased to the public C symbol `#name`.
On the other hand, the second one, using `asm`, simply makes the public symbol of `LIBC_NAMESPACE::name` the unmangled `#name`.
So this change will make `LIBC_NAMESPACE::name` unavailable outside of the translation unit.
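To make the distinction concrete, here is a minimal sketch of the alias form, assuming GCC or Clang on an ELF target. The namespace `demo`, the names `twice` and `twice_impl`, the symbol `demo_twice`, and the helper `check_alias_demo` are all made up for illustration and are not part of libc.

```cpp
#include <cassert>

namespace demo {

// Internal C++ declaration; its symbol name is mangled with the namespace.
int twice(int x);

// Implementation function. The asm label makes the compiler emit it under
// the unmangled symbol "demo_twice" instead of a mangled name.
decltype(demo::twice) twice_impl __asm__("demo_twice");

// Alias form: demo::twice becomes a second, mangled symbol that resolves to
// the same code as "demo_twice".
decltype(demo::twice) twice [[gnu::alias("demo_twice")]];

int twice_impl(int x) { return 2 * x; }

}  // namespace demo

// The unmangled symbol, as a C caller would declare it.
extern "C" int demo_twice(int x);

// Hypothetical helper: both names reach the same implementation.
void check_alias_demo() {
  assert(demo::twice(21) == 42);
  assert(demo_twice(21) == 42);
}
```

Per the explanation above, giving `demo::twice` an asm label instead of the alias attribute would bind it to the unmangled symbol rather than emit a second, mangled one.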
How does this macro result in two symbols in the emitted binary, one non-mangled, and one mangled wrt. the namespace? If I extract out this macro, I only see the non-mangled name.
https://godbolt.org/z/oedsKxroP
But for a full build, if you run `llvm-readelf -s <build dir>/libc/lib/libc.a`, you see the pairs of symbol names.
There's something else I'm missing here...
So currently, the unmangled symbol `#name` is declared as the public symbol of `LIBC_NAMESPACE::name_impl`, which is declared in https://github.com/llvm/llvm-project/blob/main/libc/src/__support/common.h#L27 and defined in https://github.com/llvm/llvm-project/blob/main/libc/src/__support/common.h#L29.
https://github.com/llvm/llvm-project/blob/main/libc/src/__support/common.h#L28 then defines the other C++ public symbol `LIBC_NAMESPACE::name` and aliases it with the unmangled C symbol.
I'm not sure yet that this would fix the specific issue, but we can kind of reverse which declaration is the alias and which is the aliasee, which ends up being fewer lines of code in the macro: https://godbolt.org/z/GxacEG7Yh
Though that would mean replacing the declaration in
I think that would look like:
diff --git a/libc/src/__support/common.h b/libc/src/__support/common.h
index 53951dc131c2..fbc7081bfa2f 100644
--- a/libc/src/__support/common.h
+++ b/libc/src/__support/common.h
@@ -23,10 +23,8 @@
// MacOS needs to be excluded because it does not support aliasing.
#if defined(LIBC_COPT_PUBLIC_PACKAGING) && (!defined(__APPLE__))
#define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) \
- LLVM_LIBC_FUNCTION_ATTR decltype(LIBC_NAMESPACE::name) \
- __##name##_impl__ __asm__(#name); \
- decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]]; \
- type __##name##_impl__ arglist
+ namespace LIBC_NAMESPACE { decltype(::name) name [[gnu::alias(#name)]]; } \
+ extern "C" type name arglist
#else
#define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) type name arglist
#endif
diff --git a/libc/src/ctype/isalnum.cpp b/libc/src/ctype/isalnum.cpp
index 42ed8ea475f1..0691c19b1394 100644
--- a/libc/src/ctype/isalnum.cpp
+++ b/libc/src/ctype/isalnum.cpp
@@ -11,12 +11,10 @@
#include "src/__support/common.h"
-namespace LIBC_NAMESPACE {
+using namespace LIBC_NAMESPACE;
// TODO: Currently restricted to default locale.
// These should be extended using locale information.
LLVM_LIBC_FUNCTION(int, isalnum, (int c)) {
return static_cast<int>(internal::isalnum(static_cast<unsigned>(c)));
}
-
-} // namespace LIBC_NAMESPACE
diff --git a/libc/src/ctype/isalnum.h b/libc/src/ctype/isalnum.h
index 71830c95cb2f..3ed5b3ac2a62 100644
--- a/libc/src/ctype/isalnum.h
+++ b/libc/src/ctype/isalnum.h
@@ -9,10 +9,6 @@
#ifndef LLVM_LIBC_SRC_CTYPE_ISALNUM_H
#define LLVM_LIBC_SRC_CTYPE_ISALNUM_H
-namespace LIBC_NAMESPACE {
-
-int isalnum(int c);
-
-} // namespace LIBC_NAMESPACE
+extern "C" int isalnum(int c);
#endif // LLVM_LIBC_SRC_CTYPE_ISALNUM_H
but applied across the whole tree. This seems to build and produce both symbols aliased to one another.
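A standalone sketch of the reversed arrangement may help: the `extern "C"` function is the real definition, and the namespaced name is the alias. It assumes GCC or Clang on an ELF target; `demo`, `is_digit`, and `check_reversed_alias` are illustrative names, not libc code.

```cpp
#include <cassert>

// Public C declaration, as a header would provide it.
extern "C" int is_digit(int c);

namespace demo {
// Mangled C++ name that aliases the unmangled C symbol "is_digit".
decltype(::is_digit) is_digit [[gnu::alias("is_digit")]];
}  // namespace demo

// The definition carries the unmangled name directly, so no asm label and
// no separate implementation function are needed.
extern "C" int is_digit(int c) { return c >= '0' && c <= '9'; }

// Hypothetical helper: the C name and the namespaced name agree.
void check_reversed_alias() {
  assert(::is_digit('7') != 0);
  assert(demo::is_digit('7') != 0);
  assert(demo::is_digit('x') == 0);
}
```

Here the alias points at an ordinary `extern "C"` definition, which is why the macro in the diff shrinks to two lines.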
Yeah that seems to work on the small scale to fix #60481.
Apply the above diff plus:
diff --git a/libc/src/stdlib/bsearch.cpp b/libc/src/stdlib/bsearch.cpp
index 4292d6b6fe04..fe02f5fb8366 100644
--- a/libc/src/stdlib/bsearch.cpp
+++ b/libc/src/stdlib/bsearch.cpp
@@ -9,9 +9,8 @@
#include "src/stdlib/bsearch.h"
#include "src/__support/common.h"
-#include <stdint.h>
-
-namespace LIBC_NAMESPACE {
+#include <stdint.h> // uint8_t
+#include <stddef.h> // size_t
LLVM_LIBC_FUNCTION(void *, bsearch,
(const void *key, const void *array, size_t array_size,
@@ -43,5 +42,3 @@ LLVM_LIBC_FUNCTION(void *, bsearch,
return nullptr;
}
-
-} // namespace LIBC_NAMESPACE
diff --git a/libc/src/stdlib/bsearch.h b/libc/src/stdlib/bsearch.h
index 1de7e051ff6c..cf3188c3adc7 100644
--- a/libc/src/stdlib/bsearch.h
+++ b/libc/src/stdlib/bsearch.h
@@ -9,13 +9,9 @@
#ifndef LLVM_LIBC_SRC_STDLIB_BSEARCH_H
#define LLVM_LIBC_SRC_STDLIB_BSEARCH_H
-#include <stdlib.h>
+#include <stddef.h> // size_t
-namespace LIBC_NAMESPACE {
-
-void *bsearch(const void *key, const void *array, size_t array_size,
+extern "C" void *bsearch(const void *key, const void *array, size_t array_size,
size_t elem_size, int (*compare)(const void *, const void *));
-} // namespace LIBC_NAMESPACE
-
#endif //LLVM_LIBC_SRC_STDLIB_BSEARCH_H
diff --git a/libc/src/stdlib/bsearch.h b/libc/src/stdlib/bsearch.h
index 1de7e051ff6c..cf3188c3adc7 100644
--- a/libc/src/stdlib/bsearch.h
+++ b/libc/src/stdlib/bsearch.h
@@ -9,13 +9,9 @@
#ifndef LLVM_LIBC_SRC_STDLIB_BSEARCH_H
#define LLVM_LIBC_SRC_STDLIB_BSEARCH_H
-#include <stdlib.h>
+#include <stddef.h> // size_t
is perhaps the more interesting change. We should do that change regardless, because we only need
IME there's some very wacky behavior with respect to attributes on redeclarations: when the redeclarations don't match in terms of attributes, it is ambiguous whether the attributes are retained. IME, it's decided case by case whether a mismatch is an error and, when it is not, whether the attributes are retained. So though we redeclare bsearch without parameter attributes, the declaration in glibc must have them on the parameters, so we now get a diagnostic about the comparisons on function parameters (which the standard guarantees are not nullptr, else UB). So we perhaps should also declare our bsearch to have non-null pointer parameters and remove these checks (maybe, unless we have some hardening mode).
Also, I didn't test building the whole tree; even small individual function unit tests require full conversion of the codebase.
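As a rough illustration of the non-null point, the sketch below declares a made-up lookup function with the GNU `nonnull` attribute and leaves out the null checks. `find_int`, `kTable`, `kPresent`, `kAbsent`, and `check_find_int` are hypothetical names; this is not the libc `bsearch`.

```cpp
#include <cassert>
#include <cstddef>

// Declaration with GNU nonnull on both pointer parameters (positions 1, 2).
__attribute__((nonnull(1, 2)))
const int *find_int(const int *key, const int *array, std::size_t n);

const int *find_int(const int *key, const int *array, std::size_t n) {
  // A check such as `if (key == nullptr) return nullptr;` placed here is the
  // kind of comparison GCC reports under -Wnonnull-compare, because the
  // attribute already promises the pointer is not null.
  for (std::size_t i = 0; i < n; ++i) {
    if (array[i] == *key) return &array[i];
  }
  return nullptr;  // key not found
}

// Sample data for the hypothetical helper below.
const int kTable[] = {1, 3, 5};
const int kPresent = 3;
const int kAbsent = 4;

void check_find_int() {
  assert(find_int(&kPresent, kTable, 3) == &kTable[1]);
  assert(find_int(&kAbsent, kTable, 3) == nullptr);
}
```

Whether to keep such checks in a hardening mode is the open question raised above; the sketch only shows the shape of the declaration.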
Looks like the headers under
diff --git a/libc/src/ctype/isalnum.h b/libc/src/ctype/isalnum.h
index 71830c95cb2f..bd7926fd32f7 100644
--- a/libc/src/ctype/isalnum.h
+++ b/libc/src/ctype/isalnum.h
@@ -9,10 +9,7 @@
#ifndef LLVM_LIBC_SRC_CTYPE_ISALNUM_H
#define LLVM_LIBC_SRC_CTYPE_ISALNUM_H
-namespace LIBC_NAMESPACE {
-
-int isalnum(int c);
-
-} // namespace LIBC_NAMESPACE
+extern "C" int isalnum(int c);
+namespace LIBC_NAMESPACE { decltype(::isalnum) isalnum; }
#endif // LLVM_LIBC_SRC_CTYPE_ISALNUM_H
could probably hide those with another macro though. 😋
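One way such a macro could look is sketched below. `DEMO_DECLARE`, `DEMO_DEFINE`, the namespace `demo`, and `to_upper_ascii` are invented for this example and do not exist in libc; the sketch again assumes GCC or Clang on an ELF target.

```cpp
#include <cassert>

// Header side: declare the C function and redeclare it inside the namespace.
#define DEMO_DECLARE(type, name, arglist) \
  extern "C" type name arglist; \
  namespace demo { decltype(::name) name; }

// Source side: alias the namespaced name to the C symbol, then open the
// definition of the C function.
#define DEMO_DEFINE(type, name, arglist) \
  namespace demo { decltype(::name) name [[gnu::alias(#name)]]; } \
  extern "C" type name arglist

DEMO_DECLARE(int, to_upper_ascii, (int c))

DEMO_DEFINE(int, to_upper_ascii, (int c)) {
  return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

// Hypothetical helper: both spellings call the same function.
void check_macro_demo() {
  assert(::to_upper_ascii('a') == 'A');
  assert(demo::to_upper_ascii('z') == 'Z');
  assert(demo::to_upper_ascii('!') == '!');
}
```

With this shape, each header would need only one macro line per function instead of two hand-written declarations.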
Closing as no longer necessary; see #60481 for more information.