Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-cxxfilt] Added the option --no-params #75348

Merged
merged 3 commits into from
Jan 2, 2024

Conversation

slydiman
Copy link
Contributor

@slydiman slydiman commented Dec 13, 2023

Added -p / --no-params flag to skip demangling function parameters similar to how it is supported by GNU c++filt tool.

There are cases when users want to demangle a large number of symbols in bulk, for example, at startup, and do not care about function parameters and overloads at that time. Skipping the demangling of parameter types led to a measurable improvement in performance. Our users reported about 15% speed up with GNU c++filt and we expect similar results with llvm-cxxfilt with this patch.

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 13, 2023

@llvm/pr-subscribers-libcxxabi

@llvm/pr-subscribers-llvm-binary-utilities

Author: Dmitry Vasilyev (slydiman)

Changes

Added -p / --no-params flag to skip demangling function parameters similar to how it is supported by GNU c++filt tool.


Full diff: https://github.com/llvm/llvm-project/pull/75348.diff

7 Files Affected:

  • (modified) llvm/include/llvm/Demangle/Demangle.h (+2-2)
  • (modified) llvm/include/llvm/Demangle/ItaniumDemangle.h (+16-6)
  • (modified) llvm/lib/Demangle/Demangle.cpp (+3-2)
  • (modified) llvm/lib/Demangle/ItaniumDemangle.cpp (+2-2)
  • (added) llvm/test/tools/llvm-cxxfilt/no-params.test (+14)
  • (modified) llvm/tools/llvm-cxxfilt/Opts.td (+2)
  • (modified) llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp (+7-3)
diff --git a/llvm/include/llvm/Demangle/Demangle.h b/llvm/include/llvm/Demangle/Demangle.h
index 70cfc1418f0c7b..4174a160171f9b 100644
--- a/llvm/include/llvm/Demangle/Demangle.h
+++ b/llvm/include/llvm/Demangle/Demangle.h
@@ -32,7 +32,7 @@ enum : int {
 /// Returns a non-NULL pointer to a NUL-terminated C style string
 /// that should be explicitly freed, if successful. Otherwise, may return
 /// nullptr if mangled_name is not a valid mangling or is nullptr.
-char *itaniumDemangle(std::string_view mangled_name);
+char *itaniumDemangle(std::string_view mangled_name, bool ParseParams = true);
 
 enum MSDemangleFlags {
   MSDF_None = 0,
@@ -68,7 +68,7 @@ char *dlangDemangle(std::string_view MangledName);
 std::string demangle(std::string_view MangledName);
 
 bool nonMicrosoftDemangle(std::string_view MangledName, std::string &Result,
-                          bool CanHaveLeadingDot = true);
+                          bool CanHaveLeadingDot = true, bool ParseParams = true);
 
 /// "Partial" demangler. This supports demangling a string into an AST
 /// (typically an intermediate stage in itaniumDemangle) and querying certain
diff --git a/llvm/include/llvm/Demangle/ItaniumDemangle.h b/llvm/include/llvm/Demangle/ItaniumDemangle.h
index e0ff035d47cfb4..619bb93ab11b51 100644
--- a/llvm/include/llvm/Demangle/ItaniumDemangle.h
+++ b/llvm/include/llvm/Demangle/ItaniumDemangle.h
@@ -2793,7 +2793,7 @@ template <typename Derived, typename Alloc> struct AbstractManglingParser {
   Node *parseClassEnumType();
   Node *parseQualifiedType();
 
-  Node *parseEncoding();
+  Node *parseEncoding(bool ParseParams = true);
   bool parseCallOffset();
   Node *parseSpecialName();
 
@@ -2910,7 +2910,7 @@ template <typename Derived, typename Alloc> struct AbstractManglingParser {
   Node *parseDestructorName();
 
   /// Top-level entry point into the parser.
-  Node *parse();
+  Node *parse(bool ParseParams = true);
 };
 
 const char* parse_discriminator(const char* first, const char* last);
@@ -5404,7 +5404,7 @@ Node *AbstractManglingParser<Derived, Alloc>::parseSpecialName() {
 //            ::= <data name>
 //            ::= <special-name>
 template <typename Derived, typename Alloc>
-Node *AbstractManglingParser<Derived, Alloc>::parseEncoding() {
+Node *AbstractManglingParser<Derived, Alloc>::parseEncoding(bool ParseParams) {
   // The template parameters of an encoding are unrelated to those of the
   // enclosing context.
   SaveTemplateParams SaveTemplateParamsScope(this);
@@ -5452,6 +5452,16 @@ Node *AbstractManglingParser<Derived, Alloc>::parseEncoding() {
   }
 
   NodeArray Params;
+  
+  // ParseParams may be false in top-level only, when called from parse().
+  if (!ParseParams) {
+    while (consume());
+    return make<FunctionEncoding>(ReturnType, Name, Params, Attrs,
+                                  nullptr /* Requires */,
+                                  NameInfo.CVQualifiers,
+                                  NameInfo.ReferenceQualifier);
+  }
+
   if (!consumeIf('v')) {
     size_t ParamsBegin = Names.size();
     do {
@@ -5894,9 +5904,9 @@ AbstractManglingParser<Derived, Alloc>::parseTemplateArgs(bool TagTemplates) {
 // extension      ::= ___Z <encoding> _block_invoke<decimal-digit>+
 // extension      ::= ___Z <encoding> _block_invoke_<decimal-digit>+
 template <typename Derived, typename Alloc>
-Node *AbstractManglingParser<Derived, Alloc>::parse() {
+Node *AbstractManglingParser<Derived, Alloc>::parse(bool ParseParams) {
   if (consumeIf("_Z") || consumeIf("__Z")) {
-    Node *Encoding = getDerived().parseEncoding();
+    Node *Encoding = getDerived().parseEncoding(ParseParams);
     if (Encoding == nullptr)
       return nullptr;
     if (look() == '.') {
@@ -5910,7 +5920,7 @@ Node *AbstractManglingParser<Derived, Alloc>::parse() {
   }
 
   if (consumeIf("___Z") || consumeIf("____Z")) {
-    Node *Encoding = getDerived().parseEncoding();
+    Node *Encoding = getDerived().parseEncoding(ParseParams);
     if (Encoding == nullptr || !consumeIf("_block_invoke"))
       return nullptr;
     bool RequireNumber = consumeIf('_');
diff --git a/llvm/lib/Demangle/Demangle.cpp b/llvm/lib/Demangle/Demangle.cpp
index 83f3cdc88c01ef..117b849d1c7848 100644
--- a/llvm/lib/Demangle/Demangle.cpp
+++ b/llvm/lib/Demangle/Demangle.cpp
@@ -47,7 +47,8 @@ static bool isRustEncoding(std::string_view S) { return starts_with(S, "_R"); }
 static bool isDLangEncoding(std::string_view S) { return starts_with(S, "_D"); }
 
 bool llvm::nonMicrosoftDemangle(std::string_view MangledName,
-                                std::string &Result, bool CanHaveLeadingDot) {
+                                std::string &Result, bool CanHaveLeadingDot,
+                                bool ParseParams) {
   char *Demangled = nullptr;
 
   // Do not consider the dot prefix as part of the demangled symbol name.
@@ -57,7 +58,7 @@ bool llvm::nonMicrosoftDemangle(std::string_view MangledName,
   }
 
   if (isItaniumEncoding(MangledName))
-    Demangled = itaniumDemangle(MangledName);
+    Demangled = itaniumDemangle(MangledName, ParseParams);
   else if (isRustEncoding(MangledName))
     Demangled = rustDemangle(MangledName);
   else if (isDLangEncoding(MangledName))
diff --git a/llvm/lib/Demangle/ItaniumDemangle.cpp b/llvm/lib/Demangle/ItaniumDemangle.cpp
index e3f208f0adf8dc..5c21b06a1d0955 100644
--- a/llvm/lib/Demangle/ItaniumDemangle.cpp
+++ b/llvm/lib/Demangle/ItaniumDemangle.cpp
@@ -366,13 +366,13 @@ class DefaultAllocator {
 
 using Demangler = itanium_demangle::ManglingParser<DefaultAllocator>;
 
-char *llvm::itaniumDemangle(std::string_view MangledName) {
+char *llvm::itaniumDemangle(std::string_view MangledName, bool ParseParams) {
   if (MangledName.empty())
     return nullptr;
 
   Demangler Parser(MangledName.data(),
                    MangledName.data() + MangledName.length());
-  Node *AST = Parser.parse();
+  Node *AST = Parser.parse(ParseParams);
   if (!AST)
     return nullptr;
 
diff --git a/llvm/test/tools/llvm-cxxfilt/no-params.test b/llvm/test/tools/llvm-cxxfilt/no-params.test
new file mode 100644
index 00000000000000..d021a4cfd3db21
--- /dev/null
+++ b/llvm/test/tools/llvm-cxxfilt/no-params.test
@@ -0,0 +1,14 @@
+RUN: llvm-cxxfilt             _ZN1f2baC2ERKNS_2baIT_EE _ZN2baC2EOT0_ | FileCheck %s --check-prefix=CHECK-PARAMS
+RUN: llvm-cxxfilt          -p _ZN1f2baC2ERKNS_2baIT_EE _ZN2baC2EOT0_ | FileCheck %s --check-prefix=CHECK-NO-PARAMS
+RUN: llvm-cxxfilt --no-params _ZN1f2baC2ERKNS_2baIT_EE _ZN2baC2EOT0_ | FileCheck %s --check-prefix=CHECK-NO-PARAMS
+
+# Use invalid mangled name broken in function parameters
+# to check how -p or --no-params flag works.
+# If given we should demangle function name just fine,
+# if not given demangle should fail because of the invalid params.
+
+CHECK-PARAMS: _ZN1f2baC2ERKNS_2baIT_EE
+CHECK-NO-PARAMS: f::ba::ba()
+
+CHECK-PARAMS: _ZN2baC2EOT0_
+CHECK-NO-PARAMS: ba::ba()
diff --git a/llvm/tools/llvm-cxxfilt/Opts.td b/llvm/tools/llvm-cxxfilt/Opts.td
index f652a1a7f88bbd..df5d284283f7d1 100644
--- a/llvm/tools/llvm-cxxfilt/Opts.td
+++ b/llvm/tools/llvm-cxxfilt/Opts.td
@@ -17,6 +17,7 @@ multiclass Eq<string name, string help> {
 def help : FF<"help", "Display this help">;
 defm strip_underscore : BB<"strip-underscore", "Strip the leading underscore", "Don't strip the leading underscore">;
 def types : FF<"types", "Attempt to demangle types as well as function names">;
+def no_params : FF<"no-params", "Skip function parameters">;
 def version : FF<"version", "Display the version">;
 
 defm : Eq<"format", "Specify mangling format. Currently ignored because only 'gnu' is supported">;
@@ -25,4 +26,5 @@ def : F<"s", "Alias for --format">;
 def : F<"_", "Alias for --strip-underscore">, Alias<strip_underscore>;
 def : F<"h", "Alias for --help">, Alias<help>;
 def : F<"n", "Alias for --no-strip-underscore">, Alias<no_strip_underscore>;
+def : F<"p", "Alias for --no-params">, Alias<no_params>;
 def : F<"t", "Alias for --types">, Alias<types>;
diff --git a/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp b/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
index 4b9d88a650666e..e393c9b11da3fb 100644
--- a/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
+++ b/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
@@ -55,6 +55,7 @@ class CxxfiltOptTable : public opt::GenericOptTable {
 } // namespace
 
 static bool StripUnderscore;
+static bool ParseParams;
 static bool Types;
 
 static StringRef ToolName;
@@ -74,18 +75,19 @@ static std::string demangle(const std::string &Mangled) {
   }
 
   std::string Result;
-  if (nonMicrosoftDemangle(DecoratedStr, Result, CanHaveLeadingDot))
+  if (nonMicrosoftDemangle(DecoratedStr, Result, CanHaveLeadingDot,
+                           ParseParams))
     return Result;
 
   std::string Prefix;
   char *Undecorated = nullptr;
 
   if (Types)
-    Undecorated = itaniumDemangle(DecoratedStr);
+    Undecorated = itaniumDemangle(DecoratedStr, ParseParams);
 
   if (!Undecorated && starts_with(DecoratedStr, "__imp_")) {
     Prefix = "import thunk for ";
-    Undecorated = itaniumDemangle(DecoratedStr.substr(6));
+    Undecorated = itaniumDemangle(DecoratedStr.substr(6), ParseParams);
   }
 
   Result = Undecorated ? Prefix + Undecorated : Mangled;
@@ -173,6 +175,8 @@ int llvm_cxxfilt_main(int argc, char **argv, const llvm::ToolContext &) {
   else
     StripUnderscore = Triple(sys::getProcessTriple()).isOSBinFormatMachO();
 
+  ParseParams = !Args.hasArg(OPT_no_params);
+
   Types = Args.hasArg(OPT_types);
 
   std::vector<std::string> Decorated = Args.getAllArgValues(OPT_INPUT);

Copy link

github-actions bot commented Dec 13, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@jh7370 jh7370 requested a review from MaskRay December 13, 2023 15:29
@slydiman slydiman force-pushed the llvm-cxxfilt-no-params branch 2 times, most recently from 2b53770 to 7ab9256 Compare December 13, 2023 15:37
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the new option to the Command Guide documentation.

The behaviour you've implemented doesn't match GNU c++filt's behaviour. The GNU tool produces the following output when used with the input strings you've provided in the test:

james@james-VirtualBox:~$ c++filt -p ZN1f2baC2ERKNS_2baIT_EE _ZN2baC2EOT0_
ZN1f2baC2ERKNS_2baIT_EE
ba::ba

This is with c++filt version 2.38.

llvm/include/llvm/Demangle/ItaniumDemangle.h Outdated Show resolved Hide resolved
llvm/include/llvm/Demangle/ItaniumDemangle.h Outdated Show resolved Hide resolved
@slydiman
Copy link
Contributor Author

Please add the new option to the Command Guide documentation.

Done. Thanks

@slydiman
Copy link
Contributor Author

The behaviour you've implemented doesn't match GNU c++filt's behaviour. The GNU tool produces the following output when used with the input strings you've provided in the test:

james@james-VirtualBox:~$ c++filt -p ZN1f2baC2ERKNS_2baIT_EE _ZN2baC2EOT0_
ZN1f2baC2ERKNS_2baIT_EE
ba::ba

This is with c++filt version 2.38.

Please pay attention to _ZN1f2baC2ERKNS_2baIT_EE.

c++filt -p _ZN1f2baC2ERKNS_2baIT_EE _ZN2baC2EOT0_
f::ba::ba
ba::ba

@slydiman slydiman requested a review from jh7370 December 13, 2023 16:32
@jh7370
Copy link
Collaborator

jh7370 commented Dec 14, 2023

c++filt -p _ZN1f2baC2ERKNS_2baIT_EE ZN2baC2EOT0

My apologies, you are quite right. However, my point still stands: your tested output doesn't match the actual output that the GNU tool produces - note the parentheses at the end in the thing you've implemented, whereas these are not present in the GNU output.

@slydiman
Copy link
Contributor Author

note the parentheses at the end in the thing you've implemented, whereas these are not present in the GNU output.

Note NodeArray Params is required parameter for the class FunctionEncoding. All other parts are const Node * and may be nullptr. We can update Params type to const NodeArray * and update the printing logic in FunctionEncoding::printRight() like

if (Params) {
  OB.printOpen();
  Params->printWithComma(OB);
  OB.printClose();
}

But I think it is redundant. It requires to review all FunctionEncoding::getParams() usage too.

@jh7370
Copy link
Collaborator

jh7370 commented Dec 15, 2023

I've not attempted to review your implementation yet, because the output doesn't match the expected output. The --no-params option doesn't just not demangle the parameters, it omits the enclosing parentheses entirely. I think this is important behaviour that we need to replicate. Otherwise, it could be easy to mistake a function demangled with this option with one with no parameters. In general, we aim to be compatible with the GNU binutils, and I believe there's no reason to deviate here.

@slydiman
Copy link
Contributor Author

slydiman commented Dec 15, 2023

The --no-params option doesn't just not demangle the parameters, it omits the enclosing parentheses entirely. I think this is important behaviour that we need to replicate.

Ok, agreed. Note the --no-params option omits the return type too. I have updated the patch to fully match GNU behavior.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the --no-params option have any impact on templates in GNU c++filt? In particular, I'm wondering how a function signature used in a template argument might behave when this option is specified. (I did a quick experiment locally and couldn't see it changing the behaviour, but I could easily have missed something).

It's probably worth getting another person to look at the changes in the demangler library, as it's been a while since I looked at that code.

llvm/include/llvm/Demangle/Demangle.h Show resolved Hide resolved
llvm/include/llvm/Demangle/ItaniumDemangle.h Show resolved Hide resolved
llvm/include/llvm/Demangle/ItaniumDemangle.h Outdated Show resolved Hide resolved
llvm/test/tools/llvm-cxxfilt/no-params.test Outdated Show resolved Hide resolved
llvm/test/tools/llvm-cxxfilt/no-params.test Outdated Show resolved Hide resolved
llvm/test/tools/llvm-cxxfilt/no-params.test Outdated Show resolved Hide resolved
llvm/test/tools/llvm-cxxfilt/no-params.test Outdated Show resolved Hide resolved
llvm/docs/CommandGuide/llvm-cxxfilt.rst Outdated Show resolved Hide resolved
llvm/tools/llvm-cxxfilt/Opts.td Outdated Show resolved Hide resolved
Added -p / --no-params flag to skip demangling function parameters similar to how it is supported by GNU c++filt tool.
@slydiman slydiman requested a review from a team as a code owner December 18, 2023 17:50
@llvmbot llvmbot added the libc++abi libc++abi C++ Runtime Library. Not libc++. label Dec 18, 2023
@MaskRay
Copy link
Member

MaskRay commented Dec 18, 2023

Can you edit the first comment (which will become the commit message) to include the motivation and some background about the feature? I am not very familiar with mangling but I know that there is quite some complexity involving dependent types, template arguments, etc. Giving some examples will help.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the LLVM guidelines, please don't force push your changes as it makes it harder to view what has actually changed since the previous review.

Also, please don't resolve conversations I've initiated. I'll do that once I've reviewed the change you've made and am happy with it. See https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178 for a more detailed expllanation and discussion.

llvm/test/tools/llvm-cxxfilt/no-params.test Outdated Show resolved Hide resolved
llvm/include/llvm/Demangle/ItaniumDemangle.h Outdated Show resolved Hide resolved
llvm/include/llvm/Demangle/ItaniumDemangle.h Show resolved Hide resolved
@slydiman
Copy link
Contributor Author

slydiman commented Dec 19, 2023

@MaskRay

Can you edit the first comment (which will become the commit message) to include the motivation and some background about the feature?

Sure. I have updated the first comment.

I know that there is quite some complexity involving dependent types, template arguments, etc. Giving some examples will help.

Please look at the added tests.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not care of function parameters

of -> about

The new --no-params flag would let save time on processing function parameters.

I think you can delete this sentence entirely - it's just repeating what has already been said in the comment.

LGTM, with remaining nits fixed, but might be worth giving @MaskRay a chance to make further comments.

@@ -5431,7 +5431,10 @@ Node *AbstractManglingParser<Derived, Alloc>::parseEncoding(bool ParseParams) {
if (IsEndOfEncoding())
return Name;

// ParseParams may be false in top-level only, when called from parse().
// ParseParams maybe false at the top level only, when called from parse().
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ParseParams maybe false at the top level only, when called from parse().
// ParseParams may be false at the top level only, when called from parse().

Apologies for the typo in my original suggestion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much. I have updated the comment and the description.

@slydiman
Copy link
Contributor Author

LGTM, with remaining nits fixed, but might be worth giving @MaskRay a chance to make further comments.

Ping.

@slydiman slydiman merged commit 71f8ea3 into llvm:main Jan 2, 2024
42 of 43 checks passed
@nico
Copy link
Contributor

nico commented Jan 2, 2024

This breaks tests on macOS, which uses two leading underscores: http://45.33.8.238/macm1/75841/step_11.txt

Please take a look and revert for now if it takes a while to fix.

aeubanks added a commit that referenced this pull request Jan 2, 2024
This reverts commit 71f8ea3.

Test doesn't pass on mac. See comments on
#75348.
@aeubanks
Copy link
Contributor

aeubanks commented Jan 2, 2024

I've reverted this for now, feel free to reland whenever the test is updated to pass on mac

@cachemeifyoucan
Copy link
Collaborator

Just need to explicit pass -n for c++filt tests because Mac platform uses a different default. Alternatively, we can probably add a new substitution %llvm-c++filt that just expand to the command with -n added.

slydiman added a commit that referenced this pull request Jan 4, 2024
Added -p / --no-params flag to skip demangling function parameters
similar to how it is supported by GNU c++filt tool.

There are cases when users want to demangle a large number of symbols in
bulk, for example, at startup, and do not care about function parameters
and overloads at that time. Skipping the demangling of parameter types
led to a measurable improvement in performance. Our users reported about
15% speed up with GNU c++filt and we expect similar results with
llvm-cxxfilt with this patch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++abi libc++abi C++ Runtime Library. Not libc++. llvm:binary-utilities
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants