Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libcxxabi][ItaniumDemangle] Demangle explicitly named object parameters #72881

Merged

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Nov 20, 2023

The mangling for an explicitly named object was introduced in https://reviews.llvm.org/D140828

See following discussion for why a new mangling had to be introduced: itanium-cxx-abi/cxx-abi#148

Since clang started emitting names with the new mangling, this patch implements support for demangling such names.

The approach this patch takes is to add a new ExplicitObjectParameter node that will print the first parameter of a function declaration with a this prefix, to reflect what was spelled out in source.

Example:

void MyClass::func(this MyClass const& self); // _ZNH7MyClass4funcERKS_

With this patch, the above demangles to:

_ZNH7MyClass4funcERKS_ -> MyClass::func(this MyClass const&)

Note that func is not marked as const &, since the function-qualifiers are now encoded as part of the explicit this. C++ doesn't allow specifying the function-qualifiers in the presence of an explicit object parameter, so this demangling is consistent with the source spelling.

@Michael137 Michael137 requested a review from a team as a code owner November 20, 2023 14:58
@llvmbot llvmbot added the libc++abi libc++abi C++ Runtime Library. Not libc++. label Nov 20, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 20, 2023

@llvm/pr-subscribers-libcxxabi

Author: Michael Buch (Michael137)

Changes

The mangling for an explicitly named object was introduced in https://reviews.llvm.org/D140828

See following discussion for why a new mangling had to be introduced: itanium-cxx-abi/cxx-abi#148

Since clang started emitting names with the new mangling, this patch implements support for demangling such names.

The approach this patch takes is to simply skip the explicit this parameter when constructing a FunctionEncoding. Thus following two functions demangle to the same:

struct Foo {
  void func(this Foo self);
  void func();
};

Full diff: https://github.com/llvm/llvm-project/pull/72881.diff

2 Files Affected:

  • (modified) libcxxabi/src/demangle/ItaniumDemangle.h (+22-8)
  • (modified) libcxxabi/test/test_demangle.pass.cpp (+10)
diff --git a/libcxxabi/src/demangle/ItaniumDemangle.h b/libcxxabi/src/demangle/ItaniumDemangle.h
index 2336b84da293bcc..f9a0c059e86e8bd 100644
--- a/libcxxabi/src/demangle/ItaniumDemangle.h
+++ b/libcxxabi/src/demangle/ItaniumDemangle.h
@@ -2780,6 +2780,7 @@ template <typename Derived, typename Alloc> struct AbstractManglingParser {
     Qualifiers CVQualifiers = QualNone;
     FunctionRefQual ReferenceQualifier = FrefQualNone;
     size_t ForwardTemplateRefsBegin;
+    bool HasExplicitObjectParameter = false;
 
     NameState(AbstractManglingParser *Enclosing)
         : ForwardTemplateRefsBegin(Enclosing->ForwardTemplateRefs.size()) {}
@@ -3438,15 +3439,25 @@ AbstractManglingParser<Derived, Alloc>::parseNestedName(NameState *State) {
   if (!consumeIf('N'))
     return nullptr;
 
-  Qualifiers CVTmp = parseCVQualifiers();
-  if (State) State->CVQualifiers = CVTmp;
+  // 'H' specifies that the encoding that follows
+  // has an explicit object parameter.
+  if (!consumeIf('H')) {
+    Qualifiers CVTmp = parseCVQualifiers();
+    if (State)
+      State->CVQualifiers = CVTmp;
 
-  if (consumeIf('O')) {
-    if (State) State->ReferenceQualifier = FrefQualRValue;
-  } else if (consumeIf('R')) {
-    if (State) State->ReferenceQualifier = FrefQualLValue;
-  } else {
-    if (State) State->ReferenceQualifier = FrefQualNone;
+    if (consumeIf('O')) {
+      if (State)
+        State->ReferenceQualifier = FrefQualRValue;
+    } else if (consumeIf('R')) {
+      if (State)
+        State->ReferenceQualifier = FrefQualLValue;
+    } else {
+      if (State)
+        State->ReferenceQualifier = FrefQualNone;
+    }
+  } else if (State) {
+    State->HasExplicitObjectParameter = true;
   }
 
   Node *SoFar = nullptr;
@@ -5424,6 +5435,9 @@ Node *AbstractManglingParser<Derived, Alloc>::parseEncoding() {
         return nullptr;
       Names.push_back(Ty);
     } while (!IsEndOfEncoding() && look() != 'Q');
+    // Ignore the explicit 'this' parameter.
+    if (NameInfo.HasExplicitObjectParameter)
+      ++ParamsBegin;
     Params = popTrailingNodeArray(ParamsBegin);
   }
 
diff --git a/libcxxabi/test/test_demangle.pass.cpp b/libcxxabi/test/test_demangle.pass.cpp
index 77741a952850ab9..a6b6d45a3783bb1 100644
--- a/libcxxabi/test/test_demangle.pass.cpp
+++ b/libcxxabi/test/test_demangle.pass.cpp
@@ -30174,6 +30174,16 @@ const char* cases[][2] =
     {"_Z2f5IKiEvu14__remove_constIT_E", "void f5<int const>(__remove_const(int const))"},
     {"_Z2f5IPiEvu16__remove_pointerIT_E", "void f5<int*>(__remove_pointer(int*))"},
     {"_Z2f5IiEvu14__remove_cvrefIT_E", "void f5<int>(__remove_cvref(int))"},
+
+    // C++23 explicit object parameter
+    {"_ZNH3Foo3fooES_i", "Foo::foo(int)"},
+    {"_ZNH3Foo3fooERKS_i", "Foo::foo(int)"},
+    {"_ZNH1X3fooIRS_EEvOT_i", "void X::foo<X&>(int)"},
+    {"_ZZNH2ns3Foo3fooES0_iENH4Foo24foo2EOKS1_", "ns::Foo::foo(int)::Foo2::foo2()" },
+    {"_ZNH2ns3FooB7Foo_tag3fooB7foo_tagERKS0_i", "ns::Foo[abi:Foo_tag]::foo[abi:foo_tag](int)" },
+    {"_ZZN3Foo3fooEiENH4Foo24foo2EOKS0_", "Foo::foo(int)::Foo2::foo2()"},
+    {"_ZZNH3Foo3fooES_iENK4Foo24foo2Ev", "Foo::foo(int)::Foo2::foo2() const" },
+    {"_ZNH3FooclERKS_", "Foo::operator()()"},
 };
 
 const unsigned N = sizeof(cases) / sizeof(cases[0]);

@Michael137
Copy link
Member Author

Once people are happy with the approach I'll sync the changes to llvm in a separate patch

@Michael137
Copy link
Member Author

clang-format fails for the tests, which I don't think we want to actually format

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My review shouldn't be considered authoritative cause I don't know the demangler very well, but this seems reasonable to me.

@@ -5424,6 +5435,9 @@ Node *AbstractManglingParser<Derived, Alloc>::parseEncoding() {
return nullptr;
Names.push_back(Ty);
} while (!IsEndOfEncoding() && look() != 'Q');
// Ignore the explicit 'this' parameter.
if (NameInfo.HasExplicitObjectParameter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to avoid printing out the this parameter? I think it could be useful to include it, especially if a user is overloading using it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason apart from implementation simplicity. Thinking about it some more though, we would be losing the ref-qualifiers on the functions here because in the explicit-object-paramter case that information is encoded into the parameter itself, which we discard.

Shouldn't be too difficult to amend this patch to explicitly name the parameter. Probably with a this prefix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a stab at this in the latest commit

Copy link
Member

@epilk epilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you run the cp-to-llvm.sh script before merging this too? Thanks!

@@ -5422,6 +5456,14 @@ Node *AbstractManglingParser<Derived, Alloc>::parseEncoding() {
Node *Ty = getDerived().parseType();
if (Ty == nullptr)
return nullptr;

const bool IsFirstParam = (Names.size() - ParamsBegin) == 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ParamsBegin == Names.size() seems a bit more direct.


template <typename Fn> void match(Fn F) const { F(Base); }

void printLeft(OutputBuffer &OB) const override { OB += "this "; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you don't need to overload printRight, you should just be able to do: OB += "this"; Base->print(OB); here in printLeft (in which case you can remove the Cache::Yes just above).

public:
ExplicitObjectParameter(Node *Base_)
: Node(KExplicitObjectParameter, Cache::Yes), Base(Base_) {
assert(Base != nullptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bots are complaining about assert here, looks like we're meant to use DEMANGLE_ASSERT now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I'm a little confused about this one, since we're using plain asserts throughout the file. Any idea why specifically this one is failing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the bots are rebasing your changes on top of c4779ea ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah good catch!

@Michael137
Copy link
Member Author

The buildkite job for this PR is failing on Windows/Linux with:

.../llvm-project/github-pull-requests/llvm/include/llvm/Demangle/ItaniumDemangle.h:905:5: error: use of undeclared identifier 'DEMANGLE_ASSERT'
    DEMANGLE_ASSERT(
    ^

Pretty sure it isn't a problem with this patch. Is there some misconfiguration going on? (@ldionne)

@Michael137 Michael137 force-pushed the bugfix/demangle-explicit-object-parameter branch from 9197ebd to b2c4195 Compare November 27, 2023 11:26
@Michael137
Copy link
Member Author

This builds fine for me on my Linux machine. @epilk any idea what the buildkite bot could be doing differently?

@epilk
Copy link
Member

epilk commented Nov 28, 2023

This builds fine for me on my Linux machine. @epilk any idea what the buildkite bot could be doing differently?

Looks like an unrelated failure on windows. Maybe try rebasing to see if it's still failing, and if so ping @ldionne?

@Michael137
Copy link
Member Author

This builds fine for me on my Linux machine. @epilk any idea what the buildkite bot could be doing differently?

Looks like an unrelated failure on windows. Maybe try rebasing to see if it's still failing, and if so ping @ldionne?

Ah seems like after rebase the build failure was fixed. Going to merge since the test failures look unrelated

@Michael137 Michael137 force-pushed the bugfix/demangle-explicit-object-parameter branch from b2c4195 to 033fd85 Compare November 28, 2023 15:54
@Michael137 Michael137 merged commit 4d5079c into llvm:main Nov 28, 2023
35 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++abi libc++abi C++ Runtime Library. Not libc++.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants