Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

calling "very flat" callgraphs with bad function names #1

Closed
GitMensch opened this issue Nov 14, 2022 · 13 comments · Fixed by #8
Closed

calling "very flat" callgraphs with bad function names #1

GitMensch opened this issue Nov 14, 2022 · 13 comments · Fixed by #8

Comments

@GitMensch
Copy link
Contributor

I've pulled current master, then built (no need for site.mak as dependencies were installed via apt in Debian), then followed the docs further (pgcollect with manually increased frequency, then running pgconvert) - but end up with a profile that has all entries named "func_HASH" and a total result of 148%.

should this still work and produce a callgrind file that contains reasonable function names and percentage?

@ostash
Copy link
Owner

ostash commented Nov 14, 2022

Hello,

Thank you very much for the interest in my program :)

If it compiles properly it should work :) Best results are achieved if profiled binary is compiled with debug information enabled and with -fno-omit-frame-pointer compiler option. I didn't use it recently, but I don't see why program shouldn't work.

Is source code of your program available anywhere? I can take a look how to build and collect the profile.

@GitMensch
Copy link
Contributor Author

Thanks for the fast answer!

Is source code of your program available anywhere? I can take a look how to build and collect the profile.

Sure! Let's start with a simple one:

Source: https://github.com/KDAB/hotspot/tree/master/tests/test-clients/cpp-libs

Compilation:

cd test-clients/cpp-libs
gcc -ggdb3 -fPIC -shared sharedlib.cpp -o libshared.so
gcc -ggdb3 -fPIC -c staticlib.cpp -o staticlib.o
ar -rcs libstatic.a staticlib.o
g++ -ggdb3 -fPIC -o main main.cpp -L. -lshared -lstatic

Profiling:

LD_LIBRARY_PATH=. pgcollect -F 8196 outfile.pgdata ./main
pgconvert outfile.pgdata -d source > callgrind.out.pgdata

Note: Ideally I'd like to pgconvert profiles coming directly from
LD_LIBRARY_PATH=. perf record -o perf.data --call-graph dwarf,8192 -z --sample-cpu ./main.

@ostash
Copy link
Owner

ostash commented Nov 14, 2022

I believe I know where is the issue, there is even a TODO: determine how to handle pageOffset in mmap event. I will push dirty untested fix to branch

@ostash
Copy link
Owner

ostash commented Nov 14, 2022

Please, give a try for https://github.com/ostash/perfgrind/tree/pgoff, it worked for me (with -g -fno-omit-frame-pointer). I do no really remember if -fno-omit-frame-pointer is required, but this is a separate topic.

@GitMensch
Copy link
Contributor Author

That did work quite nice. I still have some func_HASH entries in, but those are from ld-2.31.so.
We may close this and create a follow-up issue "FR: filter specified libraries" :-)

@ostash
Copy link
Owner

ostash commented Nov 15, 2022

Glad to hear that. I'm still not satisfied completely as callgraph doesn't show call from main to sharedlib, want to take a closer look.

@GitMensch
Copy link
Contributor Author

Sounds good. The last "big piece" on shared libraries would be handling of dlopen'd - and much more important - afterwards dlclose'd shared libraries. Should this work already?

To get a useful profile with other tools you either need to LD_PRELOAD a dummy dlclose (for example gprof) or inert an explicit library call to the profiler for dumping the profile (gprof/callgrind) or (only possible with callgrind) use a command line option to specify an event where the profile should be dumped (a function call).

@ostash
Copy link
Owner

ostash commented Nov 15, 2022

Partially. There is no explicit support for dlclose()/munmap(), but with ASLR enabled there are very low chances that a library gets mapped to the same addresses which were occupied by some other unloaded library previously.

I agree that it will be nice to handle that.

@GitMensch
Copy link
Contributor Author

GitMensch commented Nov 16, 2022

Just to recheck: Do you consider the current result that's quite often reproducible with the current make check reasonable?

pgcollect'd

For comparison: that's the output generated by perf record:
perf-record'd
(note: it outputs "Module "libstdc++.so.6.0.28" is missing 1 of 1 debug symbols.")

which is quite similar (also not showing the actual executed pginfo_dbg, because that test runs too fast), but the function names are resolved, so "something" is different.

perf report seems to not do something with the function names, at least I did neither spot anything in its strace, nor in the recording strace, but you may want to have a look.

@ostash
Copy link
Owner

ostash commented Nov 17, 2022

  • looks like perf report includes samples in the kernel space, which is ignored by pgcollect;
  • I'm not sure how perf reports stack unwinding and symbol resolving, most probably it has support for external debug information, which is in TODO for pgreport.

@GitMensch
Copy link
Contributor Author

looks like perf report includes samples in the kernel space, which is ignored by pgcollect

I see - and that's actually a reasonable distinction.
Can you add this information to the README in the short description on what perfgrind is?

I'm not sure how perf reports stack unwinding and symbol resolving, most probably it has support for external debug information, which is in TODO for pgreport.

I agree. It seems reasonable to mention

- use debug link section in ELF for finding separate debug info.

in the README with a minimal explanation what this means. Could you do this at the end of the short description as "current limitation", please?

@ostash
Copy link
Owner

ostash commented Nov 17, 2022

I can't say that ignoring kernel space is a limitation, rather it was a design decision :) The application I profiled was highly CPU-bound, so I was completely not interested in the kernel.

But, you're right is probable worth mentioning that somewhere.

@GitMensch
Copy link
Contributor Author

"feauture/difference": ignoring kernel space
"current limitation": external debug symbol reading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants