New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move probe expansion into codegen #3155
base: master
Are you sure you want to change the base?
Conversation
222e56d
to
3be0722
Compare
3be0722
to
63e4c72
Compare
Update: I had to drop unit tests for single-wildcard targets for uprobe/USDT added in #2757. The reason is that tests in |
10f6f74
to
dde56bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are "full expansion" and "multi expansion" meant to be mutually exclusive? An enum instead of two bools might be clearer/safer if that's the case.
I haven't dug into the cause, but I can trigger a crash on this branch by setting both full and multi expansion to true on a single probe with this script:
kprobe:*:__x64_sys_rea* { print("a") }
// As the C++ language supports function overload, a given function name | ||
// (without parameters) could have multiple matches even when no | ||
// wildcards are used. | ||
if (has_wildcard(ap_->func) || has_wildcard(ap_->target) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need full expansion for wildcarded executable paths? i.e.:
if (has_wildcard(ap_->target)
ap_->need_full_expansion = true;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In such a case, we do multi-expansion for each expanded path. See BPFtrace::add_probe
for details.
Indeed they are, a great suggestion!
Nice catch! It should be fixed now. |
dde56bd
to
305b515
Compare
While rebasing onto master, I noticed that uprobe_multi expansion doesn't work with placing uprobes after the funtion prologue. I opened #3173 to track that issue and this PR respects the current behavior (i.e. when uprobe expansion is needed, use kprobe_multi and attach to function entries, otherwise attach after function prologue). |
305b515
to
d3da3ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a partial review so far - I'll try and get through bpftrace.cpp
tomorrow.
I'm not keen on using environment variables to mock behaviour - it runs the risk of something dodgy happening if a user sets the env var at run time. I've recently started thinking about how we can make bpftrace a "trusted compiler" to allow it to be used by a non-root user (for use in containers with BPF token), and for that to work the compiler's behaviour has to be pretty well defined.
How about this as a way of mocking static classes: https://godbolt.org/z/37czorj84
d3da3ae
to
5dfc728
Compare
It turned out that the only method that we need to mock doesn't need to be static so I made |
The previous probe expansion approach tried to minimize the amount of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks: - It is necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to having two very similar loops for iterating matches in BPFtrace::add_probe and in CodegenLLVM::visit(Probe) which is quite confusing and hard to maintain. - libbpf needs one BPF program (i.e. one LLVM function) per probe so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details. This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion: Full expansion - A separate LLVM function is generated for each match. This is used for most expansions now. Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts. This allows to drop a lot of duplicated code. The expansion for "full" is done in CodegenLLVM::visit(Probe), the expansion for "multi" is done in BPFtrace::add_probe. A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints) as we generate duplicate LLVM functions. This is something we can live with for now since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there's usually an alternative to use multi-kprobes. One particular area where this refactoring caused problems is unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe since that was where most of the expansion was done. Now that the expansion was moved to codegen, we need to do full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence. With this change, some tests had to be dropped, especially the tests with a single wildcard for uprobe/USDT target. The reason is that semantic analyser expands these wildcards by searching all paths on the system which is something that cannot be mocked and therefore should not be run in unit tests (e.g. it prevents running the unit tests as non-root). Also, the problem comes with USDT probes as USDTHelper was originally fully static which prevented its mocking. Fortunately, we only need to mock the find() method which doesn't have to be static, so this refactors USDTHelper and some of its users and introduces MockUSDTHelper which mocks the find method for unit tests. [1] bpftrace#3005
5dfc728
to
e18dac6
Compare
The previous probe expansion approach tried to minimize the amount of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks:
probe
builtin is used). This leads to having two very similar loops for iterating matches inBPFtrace::add_probe
and inCodegenLLVM::visit(Probe)
which is quite confusing and hard to maintain.This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion:
bpf_link_create_opts
.This allows to drop a lot of duplicated code. The expansion for "full" is done in
CodegenLLVM::visit(Probe)
, the expansion for "multi" is done inBPFtrace::add_probe
.A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints) as we generate duplicate LLVM functions. This is something we can live with for now since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there's usually an alternative to use multi-kprobes. In addition, we will probably add a third expansion type using LLVM symbol aliases but that requires libbpf changes.
One particular area where this refactoring caused problems is unit tests in
tests/bpftrace.cpp
. Previously, it was sufficient to generate a simpleast::Probe
and pass it toBPFtrace::add_probe
since that was where most of the expansion was done. Now that the expansion was moved to codegen, we need to do full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence.Also, the problem comes with USDT probes as it is not possible to easily mock
USDTHelper
which is a fully static class. Since we need to overrideAttachPoint::usdt::num_locations
from tests, we allow to do that via a new internal env variableBPFTRACE_TEST_USDT_NUM_LOCATIONS
.Checklist
man/adoc/bpftrace.adoc
CHANGELOG.md