Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve user symbol resolution #2386

Merged
merged 7 commits into from
Jun 13, 2023

Conversation

lenticularis39
Copy link
Contributor

@lenticularis39 lenticularis39 commented Oct 11, 2022

This PR contains three related improvements of usym resolution (edit: added purpose of the changes):

  • Caching BCC symcache per process if ASLR is enabled (edit: optional after changes, by default disabled)
    • Enables faster symbol resolution and symcache preloading (see below)
  • Resolving symbols directly from ELF (only if ASLR is disabled)
    • Enables resolving symbols that are not dynamically loaded even after process exit
  • Preloading BCC symcache for running instances of program
    • Enables resolving symbols after program exists even with ASLR enabled, given that the process exists at probe attachment time.

The latter two provide a partial fix of #2284.

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@viktormalik
Copy link
Contributor

CI failed, can you debug it?

@lenticularis39
Copy link
Contributor Author

lenticularis39 commented Oct 14, 2022

I looked into that, one issue is codegen tests, because there is now one more attribute of the usym event (that one is easily fixable), other issue is ustack runtime tests failing with segfault (that one will be a little bit more difficult, since it doesn't happen on my system).

Also there is another problem with the PR, that is that the necessary BCC fix was reverted in iovisor/bcc#4270 (it turned that /proc/.../exe is not a simple symlink, but it also reflects mount namespaces, meaning the patch broke symbol resolution for processes running in a different namespace). That has to be resolved in order for the last of this PR to work.

Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, these are very nice changes!

What I'm missing here, though, are better explanations of why the given approaches are necessary - this should appear in the PR description, in commit messages, and in code comments, too. Perhaps even a single-place summary of all the considered scenarios and corresponding solutions could be useful.

src/bpftrace.cpp Outdated Show resolved Hide resolved
man/adoc/bpftrace.adoc Outdated Show resolved Hide resolved
@lenticularis39 lenticularis39 marked this pull request as draft November 7, 2022 10:04
@lenticularis39
Copy link
Contributor Author

lenticularis39 commented Nov 7, 2022

I opened iovisor/bcc#4319 to implement necessary functionality for the symcache preloading to work correctly.

@lenticularis39 lenticularis39 force-pushed the even-better-symbol-resolution branch 2 times, most recently from 58756c1 to 618cd0e Compare December 20, 2022 09:56
@lenticularis39 lenticularis39 force-pushed the even-better-symbol-resolution branch 2 times, most recently from e5f628d to 5af39fa Compare January 31, 2023 09:26
@lenticularis39 lenticularis39 marked this pull request as ready for review January 31, 2023 09:49
@lenticularis39
Copy link
Contributor Author

Rebased and changed caching by PID, which is required for per-process symbol cache preloading, to optional (see the last commit for details).

@lenticularis39 lenticularis39 requested review from viktormalik and removed request for BurntBrunch, fbs, danobi and ajor January 31, 2023 09:56
tests/runtime/builtin Outdated Show resolved Hide resolved
Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but it's quite a complex change so I'd like to have another review before merging.

@ajor ajor self-requested a review April 6, 2023 13:08
@ajor ajor self-assigned this Apr 6, 2023
@ajor
Copy link
Member

ajor commented Apr 6, 2023

I love this PR, thanks so much for doing it - it solves so many problems I've had with user symbols over the years!

I'd suggest that we make "PER_PID" the default caching behaviour when ASLR is enabled, and leave it up to users to disable the cache if they feel it's using too much memory.

  1. Having ASLR enabled is not too uncommon, and often users will only be tracing a single process. I'd think the large amount of forking described in Collapse bcc symbol resolvers by process executable #816 is a less common scenario
  2. It's simpler for users to understand disabling a cache than it is to work out which type of caching to enable

Copy link
Member

@ajor ajor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also updated the ELF symbol resolution logic to work with addresses with non-zero offsets

Did you have an example of where this was causing symbolication problems? It'd be useful to check that in as a test-case.

I have a slight concern about the impact this pre-caching will have on startup time when tracing huge programs. I'll do a little bit of testing on this to satisfy myself.

src/ast/passes/codegen_llvm.cpp Outdated Show resolved Hide resolved
src/utils.cpp Outdated Show resolved Hide resolved
src/types.cpp Show resolved Hide resolved
src/bpftrace.h Outdated Show resolved Hide resolved
tests/runtime/builtin Outdated Show resolved Hide resolved
@lenticularis39
Copy link
Contributor Author

I also updated the ELF symbol resolution logic to work with addresses with non-zero offsets

Did you have an example of where this was causing symbolication problems? It'd be useful to check that in as a test-case.

The issue is in general ustack addresses that are not resolvable using bcc_symcache_resolve, but are resolvable using symbol_table_cache_. More specifically, it is ustack addresses in exited processes with ASLR disabled - I didn't write any tests for this because of the requirement of disabling ASLR (also for usym resolution), now when I'm thinking about it I could use personality(ADDR_NO_RANDOMIZE) to disable ASLR in a test program and test both.

I have a slight concern about the impact this pre-caching will have on startup time when tracing huge programs. I'll do a little bit of testing on this to satisfy myself.

OK. The bcc symcache pre-loading is not a issue, since it only loads the symbols when bcc_symcache_resolve is called (since e.g. dynamically loaded modules might not be present at the time of calling bcc_symcache_new), what could impact the startup time is symbol_table_cache_, it might be useful to have an option to disable it, like we do for the symcache.

@lenticularis39 lenticularis39 force-pushed the even-better-symbol-resolution branch 3 times, most recently from bcff44b to 90dddb6 Compare May 22, 2023 13:30
tests/testprogs/uprobe_symres_exited_process.c Dismissed Show dismissed Hide dismissed
tests/testprogs/disable_aslr.c Dismissed Show resolved Hide resolved
@lenticularis39
Copy link
Contributor Author

Oops, I forgot to change the main offset in the test to a more generic pattern, causing the test to fail with a different compiler.

Copy link
Member

@ajor ajor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you!

Extend caching usym tables by program (used when ASLR is disabled) with
caching by PID, which is used when ASLR is enabled.

Without ASLR all instances of one program share the same memory layout,
hence it can be stored once per-program.
In addition to resolving usyms from the running process, an alternative
is implemented that reads the program binary via libelf. This is useful
for cases when the usym resolution happens after the process exists.

This approach does not work for dynamically loaded symbols or processes
with ASLR enabled, in these cases usym resolution falls back to using
BCC symcache.

Implementation notes:
- To access the program binary, a "probe id" (distinct from the AST probe
id) is generated in resource analyzer, and passed via usym/ustack events
into the userspace when the associated perf event fires. If the probe
contains only one attach point, and therefore one program, this allows
the binary to be identified even for exited processes.
- ustack perf event was changed from pid/stack id packed integer to struct
to fit the probe id.
While attaching a uprobe using symbol resultions, look for running
instances of the program targeted by the uprobe and create a BCC symcache
for each of them.

This enables bpftrace to resolve usyms for processes running at the time of
attaching the probe even when ASLR is on and the process is gone at the time
when the print event fires.
Change BPFTRACE_CACHE_USER_SYMBOLS values from 0 and 1 to PER_PID,
PER_PROGRAM, and NONE. The values 0 and 1 are still supported, meaning
NONE and default (see below), for compatibility.

The new default is PER_PROGRAM when ASLR is disabled and PER_PID if
ASLR is enabled. NONE option can be used to save memory in case the
number of traced processes is high.
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
Better symbol resolution
@lenticularis39
Copy link
Contributor Author

Rebased and fixed failing test added by #2625.

@ajor ajor merged commit 87ee523 into bpftrace:master Jun 13, 2023
21 checks passed
tnovak added a commit to tnovak/ExtendedAndroidTools that referenced this pull request Nov 3, 2023
Pulls the following versions of BPF packages:

- bpftrace: 0.19.1
- libbpf: 1.2.2
- bcc: 0.28-gb8b943a1 (commit b8b943a1)

This version of bpftrace speeds up symbol resolution [1] and fixes
ustack symbolication on 32-bit systems [2].

[1] bpftrace/bpftrace#2386
[2] iovisor/bcc#4775
michalgr pushed a commit to facebookexperimental/ExtendedAndroidTools that referenced this pull request Nov 6, 2023
Pulls the following versions of BPF packages:

- bpftrace: 0.19.1
- libbpf: 1.2.2
- bcc: 0.28-gb8b943a1 (commit b8b943a1)

This version of bpftrace speeds up symbol resolution [1] and fixes
ustack symbolication on 32-bit systems [2].

[1] bpftrace/bpftrace#2386
[2] iovisor/bcc#4775
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants