[libc++] Encode additional ODR-affecting properties in the ABI tag #69669
As explained in `__config`, we have an ABI tag that we use to ensure that we don't run into ODR issues when mixing different versions of libc++ in multiple TUs. However, the reasoning behind that extends not only to different versions of libc++, but also to different configurations of the same version of libc++. In fact, we've been aware of this for a while but never really bothered to make the change because ODR issues are often thought to be benign.

Well, it turns out that I just spent over an hour banging my head against an issue that boils down to our lack of encoding of some ODR properties in the ABI tag, so here's the patch we should have done a long time ago.

For now, the ODR properties we encode in the ABI tag are:
- library version
- exceptions vs no-exceptions
- hardening mode

Those are all things that we support different values for on a per-TU basis and they definitely affect ODR in a meaningful way. We can add more properties later as we see fit.
@llvm/pr-subscribers-libcxx
Author: Louis Dionne (ldionne)
Full diff: https://github.com/llvm/llvm-project/pull/69669.diff
3 Files Affected:
diff --git a/libcxx/include/__config b/libcxx/include/__config
index 65ce6d6a27f8326..2fa548132bba569 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -56,10 +56,6 @@
# define _LIBCPP_CONCAT_IMPL(_X, _Y) _X##_Y
# define _LIBCPP_CONCAT(_X, _Y) _LIBCPP_CONCAT_IMPL(_X, _Y)
-// Valid C++ identifier that revs with every libc++ version. This can be used to
-// generate identifiers that must be unique for every released libc++ version.
-# define _LIBCPP_VERSIONED_IDENTIFIER _LIBCPP_CONCAT(v, _LIBCPP_VERSION)
-
# if __STDC_HOSTED__ == 0
# define _LIBCPP_FREESTANDING
# endif
@@ -734,22 +730,54 @@ typedef __char32_t char32_t;
# define _LIBCPP_EXCLUDE_FROM_EXPLICIT_INSTANTIATION _LIBCPP_ALWAYS_INLINE
# endif
+# if _LIBCPP_ENABLE_HARDENED_MODE
+# define _LIBCPP_HARDENING_SIG h
+# elif _LIBCPP_ENABLE_SAFE_MODE
+# define _LIBCPP_HARDENING_SIG s
+# elif _LIBCPP_ENABLE_DEBUG_MODE
+# define _LIBCPP_HARDENING_SIG d
+# else
+# define _LIBCPP_HARDENING_SIG u // for unchecked
+# endif
+
+# ifdef _LIBCPP_HAS_NO_EXCEPTIONS
+# define _LIBCPP_EXCEPTIONS_SIG n
+# else
+# define _LIBCPP_EXCEPTIONS_SIG e
+# endif
+
+# define _LIBCPP_ODR_SIGNATURE \
+ _LIBCPP_CONCAT(_LIBCPP_CONCAT(_LIBCPP_CONCAT(v, _LIBCPP_VERSION), _LIBCPP_HARDENING_SIG), _LIBCPP_EXCEPTIONS_SIG)
+
// This macro marks a symbol as being hidden from libc++'s ABI. This is achieved
// on two levels:
// 1. The symbol is given hidden visibility, which ensures that users won't start exporting
// symbols from their dynamic library by means of using the libc++ headers. This ensures
// that those symbols stay private to the dynamic library in which it is defined.
//
-// 2. The symbol is given an ABI tag that changes with each version of libc++. This ensures
-// that no ODR violation can arise from mixing two TUs compiled with different versions
-// of libc++ where we would have changed the definition of a symbol. If the symbols shared
-// the same name, the ODR would require that their definitions be token-by-token equivalent,
-// which basically prevents us from being able to make any change to any function in our
-// headers. Using this ABI tag ensures that the symbol name is "bumped" artificially at
-// each release, which lets us change the definition of these symbols at our leisure.
-// Note that historically, this has been achieved in various ways, including force-inlining
-// all functions or giving internal linkage to all functions. Both these (previous) solutions
-// suffer from drawbacks that lead notably to code bloat.
+// 2. The symbol is given an ABI tag that encodes the ODR-relevant properties of the library.
+// This ensures that no ODR violation can arise from mixing two TUs compiled with different
+// versions or configurations of libc++ (such as exceptions vs no-exceptions). Indeed, if the
+// program contains two definitions of a function, the ODR requires them to be token-by-token
+// equivalent, and the linker is allowed to pick either definition and discard the other one.
+//
+// For example, if a program contains a copy of `vector::at()` compiled with exceptions enabled
+// *and* a copy of `vector::at()` compiled with exceptions disabled (by means of having two TUs
+// compiled with different settings), the two definitions are both visible by the linker and they
+// have the same name, but they have a meaningfully different implementation (one throws an exception
+// and the other aborts the program). This violates the ODR and makes the program ill-formed, and in
+// practice what will happen is that the linker will pick one of the definitions at random and will
+// discard the other one. This can quite clearly lead to incorrect program behavior.
+//
+// A similar reasoning holds for many other properties that are ODR-affecting. Essentially any
+// property that causes the code of a function to differ from the code in another configuration
+// can be considered ODR-affecting. In practice, we don't encode all such properties in the ABI
+// tag, but we encode the ones that we think are most important: library version, exceptions, and
+// hardening mode.
+//
+// Note that historically, solving this problem has been achieved in various ways, including
+// force-inlining all functions or giving internal linkage to all functions. Both these previous
+// solutions suffer from drawbacks that lead notably to code bloat.
//
// Note that we use _LIBCPP_EXCLUDE_FROM_EXPLICIT_INSTANTIATION to ensure that we don't depend
// on _LIBCPP_HIDE_FROM_ABI methods of classes explicitly instantiated in the dynamic library.
@@ -769,7 +797,7 @@ typedef __char32_t char32_t;
# ifndef _LIBCPP_NO_ABI_TAG
# define _LIBCPP_HIDE_FROM_ABI \
_LIBCPP_HIDDEN _LIBCPP_EXCLUDE_FROM_EXPLICIT_INSTANTIATION \
- __attribute__((__abi_tag__(_LIBCPP_TOSTRING(_LIBCPP_VERSIONED_IDENTIFIER))))
+ __attribute__((__abi_tag__(_LIBCPP_TOSTRING(_LIBCPP_ODR_SIGNATURE))))
# else
# define _LIBCPP_HIDE_FROM_ABI _LIBCPP_HIDDEN _LIBCPP_EXCLUDE_FROM_EXPLICIT_INSTANTIATION
# endif
diff --git a/libcxx/test/libcxx/odr_signature.exceptions.sh.cpp b/libcxx/test/libcxx/odr_signature.exceptions.sh.cpp
new file mode 100644
index 000000000000000..4796f4070f86300
--- /dev/null
+++ b/libcxx/test/libcxx/odr_signature.exceptions.sh.cpp
@@ -0,0 +1,43 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// Test that we encode whether exceptions are supported in an ABI tag to avoid
+// ODR violations when linking TUs that have different values for it.
+
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DTU1 -fno-exceptions -o %t.tu1.o
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DTU2 -fexceptions -o %t.tu2.o
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DMAIN -o %t.main.o
+// RUN: %{cxx} %t.tu1.o %t.tu2.o %t.main.o %{flags} %{link_flags} -o %t.exe
+// RUN: %{exec} %t.exe
+
+// -fno-exceptions
+#ifdef TU1
+# include <__config>
+_LIBCPP_HIDE_FROM_ABI inline int f() { return 1; }
+int tu1() { return f(); }
+#endif // TU1
+
+// -fexceptions
+#ifdef TU2
+# include <__config>
+_LIBCPP_HIDE_FROM_ABI inline int f() { return 2; }
+int tu2() { return f(); }
+#endif // TU2
+
+#ifdef MAIN
+# include <cassert>
+
+int tu1();
+int tu2();
+
+int main(int, char**) {
+ assert(tu1() == 1);
+ assert(tu2() == 2);
+ return 0;
+}
+#endif // MAIN
diff --git a/libcxx/test/libcxx/odr_signature.hardening.sh.cpp b/libcxx/test/libcxx/odr_signature.hardening.sh.cpp
new file mode 100644
index 000000000000000..b9965030966509d
--- /dev/null
+++ b/libcxx/test/libcxx/odr_signature.hardening.sh.cpp
@@ -0,0 +1,63 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// Test that we encode the hardening mode in an ABI tag to avoid ODR violations
+// when linking TUs that have different values for it.
+
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DTU1 -D_LIBCPP_ENABLE_HARDENED_MODE -o %t.tu1.o
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DTU2 -D_LIBCPP_ENABLE_SAFE_MODE -o %t.tu2.o
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DTU3 -D_LIBCPP_ENABLE_DEBUG_MODE -o %t.tu3.o
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DTU4 -o %t.tu4.o
+// RUN: %{cxx} %s %{flags} %{compile_flags} -c -DMAIN -o %t.main.o
+// RUN: %{cxx} %t.tu1.o %t.tu2.o %t.tu3.o %t.tu4.o %t.main.o %{flags} %{link_flags} -o %t.exe
+// RUN: %{exec} %t.exe
+
+// hardened mode
+#ifdef TU1
+# include <__config>
+_LIBCPP_HIDE_FROM_ABI inline int f() { return 1; }
+int tu1() { return f(); }
+#endif // TU1
+
+// safe mode
+#ifdef TU2
+# include <__config>
+_LIBCPP_HIDE_FROM_ABI inline int f() { return 2; }
+int tu2() { return f(); }
+#endif // TU2
+
+// debug mode
+#ifdef TU3
+# include <__config>
+_LIBCPP_HIDE_FROM_ABI inline int f() { return 3; }
+int tu3() { return f(); }
+#endif // TU3
+
+// unchecked mode
+#ifdef TU4
+# include <__config>
+_LIBCPP_HIDE_FROM_ABI inline int f() { return 4; }
+int tu4() { return f(); }
+#endif // TU4
+
+#ifdef MAIN
+# include <cassert>
+
+int tu1();
+int tu2();
+int tu3();
+int tu4();
+
+int main(int, char**) {
+ assert(tu1() == 1);
+ assert(tu2() == 2);
+ assert(tu3() == 3);
+ assert(tu4() == 4);
+ return 0;
+}
+#endif // MAIN
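For readers who want to see what the `_LIBCPP_ODR_SIGNATURE` concatenation in the `__config` hunk produces, here is a minimal standalone sketch. The `DEMO_` macros and the version number `180000` are hypothetical stand-ins chosen for illustration, not the real libc++ definitions, and the hardening and exceptions letters are fixed by hand instead of being derived from the build configuration.

```cpp
#include <cstring>

// Hypothetical stand-ins for the libc++ helper macros (the DEMO_ names are not part of libc++).
#define DEMO_CONCAT_IMPL(x, y) x##y
#define DEMO_CONCAT(x, y) DEMO_CONCAT_IMPL(x, y)
#define DEMO_TOSTRING_IMPL(x) #x
#define DEMO_TOSTRING(x) DEMO_TOSTRING_IMPL(x)

// Assumed values, picked by hand for this sketch.
#define DEMO_VERSION 180000   // made-up library version
#define DEMO_HARDENING_SIG u  // "unchecked", mirroring the letter used in the patch
#define DEMO_EXCEPTIONS_SIG e // "exceptions enabled", mirroring the letter used in the patch

// Same nesting as in the patch: version first, then hardening, then exceptions.
#define DEMO_ODR_SIGNATURE \
  DEMO_CONCAT(DEMO_CONCAT(DEMO_CONCAT(v, DEMO_VERSION), DEMO_HARDENING_SIG), DEMO_EXCEPTIONS_SIG)

// Returns the signature as a string, i.e. what would be passed to __abi_tag__.
inline const char* demo_odr_signature() { return DEMO_TOSTRING(DEMO_ODR_SIGNATURE); }
```

With these values the tag string is `v180000ue`. Changing `DEMO_EXCEPTIONS_SIG` to `n` would give `v180000un`, which is how a TU built without exceptions would end up with a differently named copy of the same function.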
Before merging this, could you do some analysis of what this does to binary size, both with and without debug info?
Do you mean the binary size of
I mean something else. Choose something that instantiates a lot of templates in our test suite, and then compile that to an object. We're increasing the size of the mangled names here, and I want to know what effect that has.
I'm fine with this change btw. It LGTM.
Ok, that makes sense. So here we go. I took First, with
There is a 0.1% size increase for that test (which should be pretty intensive in terms of instantiating libc++ internal symbols with the ABI tag). Now with
This seems to show no difference at all between before and after at
//===----------------------------------------------------------------------===//

// TODO: Investigate
// XFAIL: target={{.+}}-windows-{{.+}}
@mstorsjo Do you know why this might fail on Windows? ABI tags are supported and honored on Windows too, right?
It seems like these ABI tags aren't supported when using the MSVC C++ name mangling - but as you observed, it does work with MinGW which uses the regular Itanium name mangling. I don't know the details any further than that for the MSVC C++ ABI though - maybe @rnk might know?
That's correct, ABI tags are only supported for targets using the Itanium C++ ABI, so non-MSVC targets.
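To illustrate the point about Itanium-only support, here is a rough sketch of how a header could apply an ABI tag only where it is honored. `DEMO_ABI_TAG` and `answer` are hypothetical names, and using `_MSC_VER` as a proxy for "targets the MSVC C++ ABI" is an assumption of this sketch, not something taken from the patch.

```cpp
// Hypothetical macro, not part of libc++.
#if defined(_MSC_VER)
// Assumed to mean the MSVC C++ ABI (MSVC or clang-cl): expand to nothing,
// since ABI tags are not supported by that name mangling.
#  define DEMO_ABI_TAG(tag)
#else
// Assumed to mean GCC or Clang on an Itanium C++ ABI target (Linux, macOS, MinGW, ...).
#  define DEMO_ABI_TAG(tag) __attribute__((__abi_tag__(tag)))
#endif

// The function behaves the same either way; only its mangled name differs
// on targets where the tag is applied.
DEMO_ABI_TAG("demo_v1") inline int answer() { return 42; }
```

On MSVC-ABI targets the macro is a no-op, so two TUs with different configurations would still share one symbol name there, which is consistent with the test being marked XFAIL on Windows.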
If it's not too much trouble, I'm interested in the results for debug mode. But feel free to ignore this, and don't consider it a blocker. Again, this LGTM.
Here's the results for
There doesn't seem to be a super significant change here either. Also FWIW I have no idea how symbols are stored inside debug information and whether it's possible to change that without breaking the world, but it would be quite awesome if they were stored in some sort of compressed form. For example we could store symbols in some kind of prefix tree and then this change would have basically no effect on the overall size of symbols.
To make the size comparison easier, bloaty has a builtin comparison feature, see the docs here: Comparison commands seem to look like:
Yeah I realized that after reading @MaskRay 's blog post, thanks a bunch! I'll use that going forward :)
…69669) (cherry picked from commit bc792a2)
Hi @ldionne This change breaks OpenMP offloading on my machine. I traced it back to this commit with git bisect. If I compile simple test programs with OpenMP offloading enabled, I get runtime errors like
and I did not get any from the previous commit in the log. I am not trying to ask you to revert this commit, but could you please help me understand what side effects this commit could have?
If I had to guess, I would say there's some kind of mismatch between whether exceptions or hardening is enabled or not when you compile a function. I don't know how OpenMP offloading works, but if the compiler compiled the device code with e.g. exceptions disabled and then tried to find that function from a TU where exceptions are enabled, it could be that there's a mismatch between the mangled name the compiler expects to find and the actual mangled name that was generated for the device.
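As a small illustration of why a different tag means a different mangled name, the sketch below inspects `typeid(...).name()`, which on GCC and Clang targeting the Itanium C++ ABI returns the mangled type name. Function names cannot be inspected this way, so a tagged class is used as a stand-in. `Tagged`, `Untagged`, and the tag `demo_v1e` are hypothetical, and the exact string returned is implementation-defined.

```cpp
#include <string>
#include <typeinfo>

// Hypothetical types for illustration (assumes GCC or Clang, Itanium C++ ABI, RTTI enabled).
struct __attribute__((__abi_tag__("demo_v1e"))) Tagged {};
struct Untagged {};

// Returns the implementation-defined name of T; on Itanium targets this is the mangled name.
template <class T>
std::string mangled_name() {
  return typeid(T).name();
}

// Checks whether the hypothetical tag shows up inside a mangled name.
inline bool name_contains_tag(const std::string& name) {
  return name.find("demo_v1e") != std::string::npos;
}
```

On such targets the tag is expected to appear in the name of `Tagged` and not in the name of `Untagged`. A caller that looks for the untagged name, or for a name with a different tag, will not find the tagged symbol, which matches the kind of mismatch guessed at above.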
Thanks a lot for the quick reply! I will try to see if enabling or disabling exceptions will resolve the problem. :D
It worked to compile everything with exceptions disabled.