Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc] Move from alias(X) to asm(X) for aliasing #89333

Closed
wants to merge 1 commit into from

Conversation

michaelrj-google
Copy link
Contributor

The previous method of aliasing a function internally was
[[gnu::alias(#name)]], but that ran into issues with gcc under some
circumstances. By using asm(#name); instead, those issues are avoided.

The previous method of aliasing a function internally was
`[[gnu::alias(#name)]]`, but that ran into issues with gcc under some
circumstances. By using `asm(#name);` instead, those issues are avoided.
@llvmbot llvmbot added the libc label Apr 18, 2024
@llvmbot
Copy link

llvmbot commented Apr 18, 2024

@llvm/pr-subscribers-libc

Author: Michael Jones (michaelrj-google)

Changes

The previous method of aliasing a function internally was
[[gnu::alias(#name)]], but that ran into issues with gcc under some
circumstances. By using asm(#name); instead, those issues are avoided.


Full diff: https://github.com/llvm/llvm-project/pull/89333.diff

1 Files Affected:

  • (modified) libc/src/__support/common.h (+1-1)
diff --git a/libc/src/__support/common.h b/libc/src/__support/common.h
index 53951dc131c28b..69268567bb2e2a 100644
--- a/libc/src/__support/common.h
+++ b/libc/src/__support/common.h
@@ -25,7 +25,7 @@
 #define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist)                           \
   LLVM_LIBC_FUNCTION_ATTR decltype(LIBC_NAMESPACE::name)                       \
       __##name##_impl__ __asm__(#name);                                        \
-  decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]];                   \
+  decltype(LIBC_NAMESPACE::name) name asm(#name);                              \
   type __##name##_impl__ arglist
 #else
 #define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) type name arglist

Copy link
Member

@nickdesaulniers nickdesaulniers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding

Fixes: #60481

to the commit message such that github will autoclose #60481 once this PR is merged.

Were you able to verify that this fixes the observed issue with GCC+overlay mode?

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this works? Looking into the IR we no longer have any aliases. Internal references are therefore not resolved in things like the startup code.

$ clang puts.c --target=amdgcn-amd-amdhsa -mcpu=native -flto ../../clang/lib/amdgcn-amd-amdhsa/crt1.o -lc
ld.lld: error: undefined symbol: __llvm_libc_19_0_0_git::atexit(void (*)())
>>> referenced by a.out.lto.o:(_begin)
>>> referenced by a.out.lto.o:(_begin)

ld.lld: error: undefined symbol: __llvm_libc_19_0_0_git::exit(int)
>>> referenced by a.out.lto.o:(_end)
>>> referenced by a.out.lto.o:(_end)
clang: error: ld.lld command failed with exit code 1 (use -v to see invocation)

@nickdesaulniers
Copy link
Member

Before:

$ llvm-readelf -sW projects/libc/src/stdlib/CMakeFiles/libc.src.stdlib.atexit.dir/atexit.cpp.o | llvm-cxxfilt | grep ::atexit
    16: 0000000000000000    38 FUNC    GLOBAL DEFAULT     9 __llvm_libc_19_0_0_git::atexit(void (*)())

After:

$ llvm-readelf -sW projects/libc/src/stdlib/CMakeFiles/libc.src.stdlib.atexit.dir/atexit.cpp.o | llvm-cxxfilt | grep ::atexit
$

huh, so the gnu alias attribute must mangle the identifier if the aliasee is within the same namespace. I guess back to the drawing board for the GCC+overlay issue.

@nickdesaulniers nickdesaulniers self-requested a review April 18, 2024 23:46
@michaelrj-google
Copy link
Contributor Author

working on figuring out if this is the right way forward, here's the original patch this was added: https://reviews.llvm.org/D94195

will investigate more tomorrow.

@@ -25,7 +25,7 @@
#define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) \
LLVM_LIBC_FUNCTION_ATTR decltype(LIBC_NAMESPACE::name) \
__##name##_impl__ __asm__(#name); \
decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]]; \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantic of these 2 are different. In the old one,

decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]];

will give you a public C++ function LIBC_NAMESPACE::name (b/c we used this macro inside namespace LIBC_NAMESPACE { ... } which is internally aliased to the public C symbol #name.
On the other hand, the second one using asm will simply make the public symbol of LIBC_NAMESPACE::name as unmangled #name.

So this change will make LIBC_NAMESPACE::name not available outside of the translation unit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this macro result in two symbols in the emitted binary, one non-mangled, and one mangled wrt. the namespace? If I extract out this macro, I only see the non-mangled name.

https://godbolt.org/z/oedsKxroP

But for full build, if you run llvm-readelf -s <build dir>/libc/lib/libc.a you see the pairs of symbol names.

There's something else I'm missing here...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So currently, the unmangled symbol #name is declared as the public symbol of LIBC_NAMESPACE::name_impl, which is declared in https://github.com/llvm/llvm-project/blob/main/libc/src/__support/common.h#L27 and define in https://github.com/llvm/llvm-project/blob/main/libc/src/__support/common.h#L29.

https://github.com/llvm/llvm-project/blob/main/libc/src/__support/common.h#L28 then defines the other C++ public symbol LIBC_NAMESPACE::name and aliases it with the unmangled C symbol.

@nickdesaulniers
Copy link
Member

I'm not sure yet that this would fix the specific issue, but we can kind of reverse which declaration is the alias and which is the aliasee, which ends up being fewer lines of code in the macro:

https://godbolt.org/z/GxacEG7Yh

Though that would mean replacing the declaration in libc/src/<hname>/<fname>.h from being within LIBC_NAMESPACE to being extern "C", and same for the defintion in libc/src/<hname>/<fname>.h, en masse.

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Apr 19, 2024

I think that would look like:

diff --git a/libc/src/__support/common.h b/libc/src/__support/common.h
index 53951dc131c2..fbc7081bfa2f 100644
--- a/libc/src/__support/common.h
+++ b/libc/src/__support/common.h
@@ -23,10 +23,8 @@
 // MacOS needs to be excluded because it does not support aliasing.
 #if defined(LIBC_COPT_PUBLIC_PACKAGING) && (!defined(__APPLE__))
 #define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist)                           \
-  LLVM_LIBC_FUNCTION_ATTR decltype(LIBC_NAMESPACE::name)                       \
-      __##name##_impl__ __asm__(#name);                                        \
-  decltype(LIBC_NAMESPACE::name) name [[gnu::alias(#name)]];                   \
-  type __##name##_impl__ arglist
+  namespace LIBC_NAMESPACE { decltype(::name) name [[gnu::alias(#name)]]; }    \
+  extern "C" type name arglist
 #else
 #define LLVM_LIBC_FUNCTION_IMPL(type, name, arglist) type name arglist
 #endif
diff --git a/libc/src/ctype/isalnum.cpp b/libc/src/ctype/isalnum.cpp
index 42ed8ea475f1..0691c19b1394 100644
--- a/libc/src/ctype/isalnum.cpp
+++ b/libc/src/ctype/isalnum.cpp
@@ -11,12 +11,10 @@
 
 #include "src/__support/common.h"
 
-namespace LIBC_NAMESPACE {
+using namespace LIBC_NAMESPACE;
 
 // TODO: Currently restricted to default locale.
 // These should be extended using locale information.
 LLVM_LIBC_FUNCTION(int, isalnum, (int c)) {
   return static_cast<int>(internal::isalnum(static_cast<unsigned>(c)));
 }
-
-} // namespace LIBC_NAMESPACE
diff --git a/libc/src/ctype/isalnum.h b/libc/src/ctype/isalnum.h
index 71830c95cb2f..3ed5b3ac2a62 100644
--- a/libc/src/ctype/isalnum.h
+++ b/libc/src/ctype/isalnum.h
@@ -9,10 +9,6 @@
 #ifndef LLVM_LIBC_SRC_CTYPE_ISALNUM_H
 #define LLVM_LIBC_SRC_CTYPE_ISALNUM_H
 
-namespace LIBC_NAMESPACE {
-
-int isalnum(int c);
-
-} // namespace LIBC_NAMESPACE
+extern "C" int isalnum(int c);
 
 #endif //  LLVM_LIBC_SRC_CTYPE_ISALNUM_H

but applied across the whole tree. This seems to build and produce both symbols aliased to one another.

$ ninja libc.src.ctype.isalnum
$ llvm-readelf -sW projects/libc/src/ctype/CMakeFiles/libc.src.ctype.isalnum.dir/isalnum.cpp.o | llvm-cxxfilt

Symbol table '.symtab' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis       Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT   UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT   ABS isalnum.cpp
     2: 0000000000000000    34 FUNC    GLOBAL DEFAULT     3 isalnum
     3: 0000000000000000    34 FUNC    GLOBAL DEFAULT     3 __llvm_libc_19_0_0_git::isalnum(int)

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Apr 19, 2024

Yeah that seems to work on the small scale to fix #60481.

$ cd <new build dir>
$ cmake ../llvm -DLLVM_ENABLE_PROJECTS="libc" -DCMAKE_BUILD_TYPE=Release -G Ninja
$ ninja libc.src.stdlib.bsearch
<explosion from #60481>

apply the above diff plus:

diff --git a/libc/src/stdlib/bsearch.cpp b/libc/src/stdlib/bsearch.cpp
index 4292d6b6fe04..fe02f5fb8366 100644
--- a/libc/src/stdlib/bsearch.cpp
+++ b/libc/src/stdlib/bsearch.cpp
@@ -9,9 +9,8 @@
 #include "src/stdlib/bsearch.h"
 #include "src/__support/common.h"
 
-#include <stdint.h>
-
-namespace LIBC_NAMESPACE {
+#include <stdint.h> // uint8_t
+#include <stddef.h> // size_t
 
 LLVM_LIBC_FUNCTION(void *, bsearch,
                    (const void *key, const void *array, size_t array_size,
@@ -43,5 +42,3 @@ LLVM_LIBC_FUNCTION(void *, bsearch,
 
   return nullptr;
 }
-
-} // namespace LIBC_NAMESPACE
diff --git a/libc/src/stdlib/bsearch.h b/libc/src/stdlib/bsearch.h
index 1de7e051ff6c..cf3188c3adc7 100644
--- a/libc/src/stdlib/bsearch.h
+++ b/libc/src/stdlib/bsearch.h
@@ -9,13 +9,9 @@
 #ifndef LLVM_LIBC_SRC_STDLIB_BSEARCH_H
 #define LLVM_LIBC_SRC_STDLIB_BSEARCH_H
 
-#include <stdlib.h>
+#include <stddef.h> // size_t
 
-namespace LIBC_NAMESPACE {
-
-void *bsearch(const void *key, const void *array, size_t array_size,
+extern "C" void *bsearch(const void *key, const void *array, size_t array_size,
               size_t elem_size, int (*compare)(const void *, const void *));
 
-} // namespace LIBC_NAMESPACE
-
 #endif //LLVM_LIBC_SRC_STDLIB_BSEARCH_H
$ ninja libc.src.stdlib.bsearch
$ llvm-readelf -sW projects/libc/src/stdlib/CMakeFiles/libc.src.stdlib.bsearch.dir/bsearch.cpp.o | llvm-cxxfilt

Symbol table '.symtab' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis       Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT   UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT   ABS bsearch.cpp
     2: 0000000000000000   154 FUNC    GLOBAL DEFAULT     4 bsearch
     3: 0000000000000000   154 FUNC    GLOBAL DEFAULT     4 __llvm_libc_19_0_0_git::bsearch(void const*, void const*, unsigned long, unsigned long, int (*)(void const*, void const*))

@nickdesaulniers
Copy link
Member

diff --git a/libc/src/stdlib/bsearch.h b/libc/src/stdlib/bsearch.h
index 1de7e051ff6c..cf3188c3adc7 100644
--- a/libc/src/stdlib/bsearch.h
+++ b/libc/src/stdlib/bsearch.h
@@ -9,13 +9,9 @@
 #ifndef LLVM_LIBC_SRC_STDLIB_BSEARCH_H
 #define LLVM_LIBC_SRC_STDLIB_BSEARCH_H
 
-#include <stdlib.h>
+#include <stddef.h> // size_t

Is perhaps the more interesting change. We should do that change regardless, because we only need stddef.h for the declaration of size_t. BUT! if we retain it:

/android0/llvm-project/libc/src/stdlib/bsearch.cpp: In function ‘void* bsearch(const void*, const void*, size_t, size_t, int (*)(const void*, const void*))’:
/android0/llvm-project/libc/src/stdlib/bsearch.cpp:19:11: error: ‘nonnull’ argument ‘key’ compared to NULL [-Werror=nonnull-compare]
   19 |   if (key == nullptr || array == nullptr || array_size == 0 || elem_size == 0)
      |       ~~~~^~~~~~~~~~
/android0/llvm-project/libc/src/stdlib/bsearch.cpp:19:31: error: ‘nonnull’ argument ‘array’ compared to NULL [-Werror=nonnull-compare]
   19 |   if (key == nullptr || array == nullptr || array_size == 0 || elem_size == 0)
      |                         ~~~~~~^~~~~~~~~~

IME there's some very wacky behavior wrt. attributes on redeclarations; there's always this ambiguity between if the redeclarations don't match in terms of attributes, are the attributes retained or not. IME, it's a case by case basis whether that's an error or not, and when not, whether they're retained.

So though we redeclare bsearch without parameter attributes, the declaration in glibc MUST have them on the parameters, so we now get a diagnostic that the comparisons on function parameters (that are guaranteed by the standard to not be nullptr else UB).

So we perhaps should also declare our bsearch to have non-null pointer parameters and remove these checks (maybe, unless we have some hardening mode).


Also, I didn't test building the whole tree; even small individual function unit tests require full conversion of the codebase.

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Apr 19, 2024

Looks like the headers under libc/src/{hname}/{fname}.h would also need to declare the namespaced function in order for the tests to call them.

diff --git a/libc/src/ctype/isalnum.h b/libc/src/ctype/isalnum.h
index 71830c95cb2f..bd7926fd32f7 100644
--- a/libc/src/ctype/isalnum.h
+++ b/libc/src/ctype/isalnum.h
@@ -9,10 +9,7 @@
 #ifndef LLVM_LIBC_SRC_CTYPE_ISALNUM_H
 #define LLVM_LIBC_SRC_CTYPE_ISALNUM_H
 
-namespace LIBC_NAMESPACE {
-
-int isalnum(int c);
-
-} // namespace LIBC_NAMESPACE
+extern "C" int isalnum(int c);
+namespace LIBC_NAMESPACE { decltype(::isalnum) isalnum; }
 
 #endif //  LLVM_LIBC_SRC_CTYPE_ISALNUM_H

could probably hide those with another macro though. 😋

@michaelrj-google
Copy link
Contributor Author

Closing as no longer necessary, see #60481 for more information.

@michaelrj-google michaelrj-google deleted the libcAsmAlias branch September 19, 2024 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants