Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ItaniumDemangle] Unconditionally parse substitution template arguments #131970

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

boomanaiden154
Copy link
Contributor

With the current mangled name parser, when we try to parse a type name inside the template argument of an operator, we end up not parsing any template arguments of substitutions, which ends up causing things to fail in certain cases. This is compliant with the Itanium ABI documentation as far as I can tell.

I tried to look through the git history to find out why the limitation on parsing template arguments for substitutions inside of operators existed, but I only ended up at the import commit, so I suspect the original motivation is lost to time.

A regression test from LLVM has been added.

With the current mangled name parser, when we try to parse a type name
inside the template argument of an operator, we end up not parsing any
template arguments of substitutions, which ends up causing things to
fail in certain cases. This is compliant with the Itanium ABI
documentation as far as I can tell.

I tried to look through the git history to find out why the limitation
on parsing template arguments for substitutions inside of operators
existed, but I only ended up at the import commit, so I suspect the
original motivation is lost to time.

A regression test from LLVM has been added.
@boomanaiden154 boomanaiden154 requested a review from a team as a code owner March 19, 2025 04:02
@llvmbot llvmbot added the libc++abi libc++abi C++ Runtime Library. Not libc++. label Mar 19, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 19, 2025

@llvm/pr-subscribers-libcxxabi

Author: Aiden Grossman (boomanaiden154)

Changes

With the current mangled name parser, when we try to parse a type name inside the template argument of an operator, we end up not parsing any template arguments of substitutions, which ends up causing things to fail in certain cases. This is compliant with the Itanium ABI documentation as far as I can tell.

I tried to look through the git history to find out why the limitation on parsing template arguments for substitutions inside of operators existed, but I only ended up at the import commit, so I suspect the original motivation is lost to time.

A regression test from LLVM has been added.


Full diff: https://github.com/llvm/llvm-project/pull/131970.diff

3 Files Affected:

  • (modified) libcxxabi/src/demangle/ItaniumDemangle.h (+3-3)
  • (modified) libcxxabi/test/test_demangle.pass.cpp (+1)
  • (modified) llvm/include/llvm/Demangle/ItaniumDemangle.h (+3-3)
diff --git a/libcxxabi/src/demangle/ItaniumDemangle.h b/libcxxabi/src/demangle/ItaniumDemangle.h
index 3df41b5f4d7d0..45ec8d41fee41 100644
--- a/libcxxabi/src/demangle/ItaniumDemangle.h
+++ b/libcxxabi/src/demangle/ItaniumDemangle.h
@@ -4626,10 +4626,10 @@ Node *AbstractManglingParser<Derived, Alloc>::parseType() {
       //   <template-template-param> ::= <template-param>
       //                             ::= <substitution>
       //
-      // If this is followed by some <template-args>, and we're permitted to
-      // parse them, take the second production.
+      // If this is followed by some <template-args>, take the second
+      // production.
 
-      if (look() == 'I' && (!IsSubst || TryToParseTemplateArgs)) {
+      if (look() == 'I') {
         if (!IsSubst)
           Subs.push_back(Result);
         Node *TA = getDerived().parseTemplateArgs();
diff --git a/libcxxabi/test/test_demangle.pass.cpp b/libcxxabi/test/test_demangle.pass.cpp
index e9c74f70a094b..fad6c101f742d 100644
--- a/libcxxabi/test/test_demangle.pass.cpp
+++ b/libcxxabi/test/test_demangle.pass.cpp
@@ -30247,6 +30247,7 @@ const char* cases[][2] = {
     {"_Z1fDSDRm", "f(_Sat unsigned long _Fract)"},
 
     {"_Z11bfloat16addDF16bDF16b", "bfloat16add(std::bfloat16_t, std::bfloat16_t)"},
+    {"_ZNK4llvm11SmallStringILj16EEcvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEv", "llvm::SmallString<16u>::operator std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>() const"},
     // clang-format on
 };
 
diff --git a/llvm/include/llvm/Demangle/ItaniumDemangle.h b/llvm/include/llvm/Demangle/ItaniumDemangle.h
index b0363c1a7a786..15400694b3c70 100644
--- a/llvm/include/llvm/Demangle/ItaniumDemangle.h
+++ b/llvm/include/llvm/Demangle/ItaniumDemangle.h
@@ -4626,10 +4626,10 @@ Node *AbstractManglingParser<Derived, Alloc>::parseType() {
       //   <template-template-param> ::= <template-param>
       //                             ::= <substitution>
       //
-      // If this is followed by some <template-args>, and we're permitted to
-      // parse them, take the second production.
+      // If this is followed by some <template-args>, take the second
+      // production.
 
-      if (look() == 'I' && (!IsSubst || TryToParseTemplateArgs)) {
+      if (look() == 'I') {
         if (!IsSubst)
           Subs.push_back(Result);
         Node *TA = getDerived().parseTemplateArgs();

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we might be able to make additional simplifications to the code since I think all meaningful references to TryToParseTemplateArgs are being removed in this patch?

@boomanaiden154
Copy link
Contributor Author

It looks like we might be able to make additional simplifications to the code since I think all meaningful references to TryToParseTemplateArgs are being removed in this patch?

There are still a couple there, and those ones do impact demangling (I ran the tests with them disabled and got failures).

@zygoloid
Copy link
Collaborator

I think the point here is to handle a mangling collision between:

template<typename T> struct X {};
struct C {
  operator X<int>();
} c;
X<int> x = c;

and:

struct X {};
struct C {
  template<typename T = int>
  operator X() {}
} c;
X x = c;

where in both cases the conversion function mangles as _ZN1Ccv1XIiEEv. This isn't a problem for the ABI, because these can't both exist in the same program -- X is either a template or a non-template -- but it's a problem for the demangler because it needs to pick one of those two options.

I think that's why TryToParseTemplateArgs is set to false in parseOperatorName -- it's trying to pick the second interpretation above, by stopping parsing the type name at the I. I don't know why we set TryToParseTemplateArgs to false in parseConvertExpr or on parseExpr when parsing a conversion expression, because the same ambiguity doesn't exist there (as far as I'm aware); perhaps this was added too broadly and those ScopedOverrides should not be present?

Can you check that these all still demangle with this patch applied:

_ZN1Ccv1XIiEEv
_ZN1Ccv1XIiEIiEEv
_ZN1Ccv1XIT_EIiEEv

@boomanaiden154
Copy link
Contributor Author

perhaps this was added too broadly and those ScopedOverrides should not be present?

I tried removing it, but ended up getting test failures, specifically with _ZN8Blizzard6Memory12voidp_returncvPT_IcEEv, _ZNK3com9markzware2js11cJSArgumentcvRKT_I8cMyClassEEv, and _ZN5OuterI4MarpEcvT_I4MerpEEv.

Can you check that these all still demangle with this patch applied:

It is able to still demangle all those cases. Interestingly the c++filt on my system is not able to demangle the last one, but that's not particularly relevant to anything.

root@32b49e0c8216:/llvm-project/build# ./bin/llvm-cxxfilt _ZN1Ccv1XIiEEv
C::operator X<int>()
root@32b49e0c8216:/llvm-project/build# ./bin/llvm-cxxfilt _ZN1Ccv1XIiEIiEEv
C::operator X<int><int>()
root@32b49e0c8216:/llvm-project/build# ./bin/llvm-cxxfilt _ZN1Ccv1XIT_EIiEEv
C::operator X<int><int>()

@zygoloid
Copy link
Collaborator

It would be good for someone to figure out and ideally write down (perhaps in a comment) how the existing code is handling these problems when demangling conversion functions. I don't have a lot of confidence that what we're currently doing is correct (and it's clearly not correct for the example in the new test case), but it's going to be hard to confidently approve this change without that understanding (Chesterton's fence).

@boomanaiden154
Copy link
Contributor Author

Yeah, makes sense. I will go through it more thoroughly when I have time and document what I find.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++abi libc++abi C++ Runtime Library. Not libc++.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants