Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pprof: symbolize mishandles the names of methods of parameterized Go types #705

Closed
adonovan opened this issue May 24, 2022 · 3 comments · Fixed by #717
Closed

pprof: symbolize mishandles the names of methods of parameterized Go types #705

adonovan opened this issue May 24, 2022 · 3 comments · Fixed by #717

Comments

@adonovan
Copy link

adonovan commented May 24, 2022

Using the version of pprof vendored in Go, the names of parameterized functions are mishandled so that a function named pkg.(*T[...]).f is rendered in SVG view as pkg \n \n f. The reason for the mishandling is that the C++ heuristic in Symbolize spuriously matches the square brackets, causing the parenthesized type name portion(*T[...]) to be removed.

This appears to be an effective workaround, but perhaps one can do better:

diff --git a/src/cmd/vendor/github.com/google/pprof/internal/symbolizer/symbolizer.go b/src/cmd/vendo
r/github.com/google/pprof/internal/symbolizer/symbolizer.go
index cbb0ed4d1b..90641a621e 100644
--- a/src/cmd/vendor/github.com/google/pprof/internal/symbolizer/symbolizer.go
+++ b/src/cmd/vendor/github.com/google/pprof/internal/symbolizer/symbolizer.go
@@ -231,7 +231,13 @@ func Demangle(prof *profile.Profile, force bool, demanglerMode string) {
                // Could not demangle. Apply heuristics in case the name is
                // already demangled.
                name := fn.SystemName
-               if looksLikeDemangledCPlusPlus(name) {
+
+               if strings.Contains(name, "[...]") {
+                       // Go parameterized function.
+                       // (Replace the periods so they don't get treated
+                       // as path separators later in dot view.)
+                       name = strings.ReplaceAll(name, "[...]", "[…]")
+               } else if looksLikeDemangledCPlusPlus(name) {
                        if demanglerMode == "" || demanglerMode == "templates" {
                                name = removeMatching(name, '(', ')')
                        }

Before: Screen Shot 2022-05-23 at 11 39 48 PM
With workaround:
Screen Shot 2022-05-23 at 11 44 07 PM

@ianlancetaylor

@prattmic
Copy link
Member

I think that detection of [...] could go directly in looksLikeDemangledCPlusPlus as a special case like the one for Java.

Dot will need to handle ... as well, but perhaps that should be handled directly in the dot generator code? (I'd also considered replacing ... with unicode ellipsis before I found this issue!)

There is another wrinkle here too, in that the actual symbol names look something like main.(*Foo[go.shape.string_0,go.shape.int_1]).Meth. In pprof profiling, Go simplifies the symbol name with [...], but a perf profile for example will still use the full symbol name, which ideally should not trip up pprof. Thus, I'd like to get looksLikeDemangledCPlusPlus to not match either main.(*Foo[...]).Meth or main.(*Foo[go.shape.string_0,go.shape.int_1]).Meth.

I took a look at the demangled symbols in a large C++ binary, and every instance of square brackets fell into one of two categories:

  1. Nothing inside the brackets (e.g., new[], delete[], operator[], char []).
  2. A single number inside the brackets (e.g., unsigned long const (&) [4]). I think these were all array types.

Neither of these patterns would appear in Go symbols, so we could match on either of these in looksLikeDemangledCPlusPlus. However, I am very familiar with C++ mangling, so I'm not sure if there are other possible uses of mangled square brackets in symbol names that I missed (cc @ianlancetaylor may know?).

@ianlancetaylor
Copy link
Contributor

Here are some other cases where square brackets will appear in demangled C++ names. I don't know whether they matter.

_Z3fooPfS_S_j.sse4_1.2
foo(float*, float*, float*, unsigned int) [clone .sse4_1.2]

_Z1f1AIXtl1Xdi1adi1bdXLi3ELi4ELi1EEEE
f(A<X{.a.b[3 ... 4]=(1)}>)

_ZDC1a1bE
[a, b]

_ZNStDC1aEE
std::[a]

_ZW3FooDC1a1bE
[a, b]@Foo

_ZN3NMSW3MOD3FooB3ABIEv
NMS::Foo@MOD[abi:ABI]()

prattmic added a commit to prattmic/pprof that referenced this issue Jul 28, 2022
For the curious: `bar(int (*) [5])` is a pointer to an array of
length 5:

	typedef int IntArray[5];
	int bar(IntArray* arr);

For google#705.
prattmic added a commit to prattmic/pprof that referenced this issue Jul 28, 2022
prattmic added a commit to prattmic/pprof that referenced this issue Jul 28, 2022
Go symbols with type parameters are reported in Go pprof profiles as
something like `main.Set[...]`.

Today, multilinePrintableName mishandles this by replacing each with a
newline, resulting in a very strange looking node.

Instead, replace ... with a unicode ellipsis so that it isn't replaced
with a newline just below.

Note that the full symbol name (which is reported by tools like perf) is
something like `[go.shape.string_0,go.shape.int_1]`. This case is still
not handled very well, but we likely want to strip this out anyways in
ShortenFunctionName or demangling as future work.

Fixes google#705.
@prattmic
Copy link
Member

Oof, wow. I, uh, love C++.

#717 handles only the two cases I described. The others will not be detected as C++, so function parameters and type parameters would not be removed when simplifying names (https://github.com/golang/go/blob/master/src/cmd/vendor/github.com/google/pprof/internal/symbolizer/symbolizer.go#L231).

I am also unsure how important these are either. This code is also only relevant if the profile contains symbol names that are already demangled, and it is unclear to me how common that is. I can also support a few more of these, but some like [a] are ambiguous with Go names depending on where in the name they can appear, so we can't do it perfectly.

prattmic added a commit to prattmic/pprof that referenced this issue Jul 29, 2022
prattmic added a commit to prattmic/pprof that referenced this issue Jul 29, 2022
Go symbols with type parameters are reported in Go pprof profiles as
something like `main.Set[...]`.

Today, multilinePrintableName mishandles this by replacing each with a
newline, resulting in a very strange looking node.

Instead, replace ... with a unicode ellipsis so that it isn't replaced
with a newline just below.

Note that the full symbol name (which is reported by tools like perf) is
something like `[go.shape.string_0,go.shape.int_1]`. This case is still
not handled very well, but we likely want to strip this out anyways in
ShortenFunctionName or demangling as future work.

Fixes google#705.
aalexand pushed a commit that referenced this issue Jul 29, 2022
* internal/symbolizer: add unit tests for Demangle

For the curious: `bar(int (*) [5])` is a pointer to an array of
length 5:

	typedef int IntArray[5];
	int bar(IntArray* arr);

For #705.

* internal/symbolizer: avoid matching generic Go symbols as C++

For #705.

* internal/graph: keep Go type parameter ellipsis in dot

Go symbols with type parameters are reported in Go pprof profiles as
something like `main.Set[...]`.

Today, multilinePrintableName mishandles this by replacing each with a
newline, resulting in a very strange looking node.

Instead, replace ... with a unicode ellipsis so that it isn't replaced
with a newline just below.

Note that the full symbol name (which is reported by tools like perf) is
something like `[go.shape.string_0,go.shape.int_1]`. This case is still
not handled very well, but we likely want to strip this out anyways in
ShortenFunctionName or demangling as future work.

Fixes #705.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants