[RFC] Probe expansion in codegen #3005
Comments
What do you mean by "perform the expansion" in step 3? I'm not familiar with this code - are we copying the bytecode currently? My concern with unnecessary probe expansion is the performance impact if we want to attach to 100k+ probes (e.g. |
We're not copying it directly but we create one
In reality, attaching to such a large number of fentry probes is already terribly slow (it's caused by the kernel, not bpftrace):
Also remember that there's a limit of 512 probes which we have (can be lifted by setting an env variable). All in all, attaching to a huge number of kfuncs is not practical and it's not their main use-case in the first place. The only other probe types which could use such a large number of attach points are kprobes and uprobes, and here we could use kprobe_multi and uprobe_multi link types and generate just a single LLVM function.
I agree, but we'd still rely on libbpf to do the program collection by iterating the symbol table. If that ever changes (IMHO it's very unlikely), we'd have to adapt. Also, we can always add this if we find that there are performance issues with the full expansion approach.
Trying to digest this a bit. It seems (as per @viktormalik 's point) that perhaps the only real concern here is around expansion of |
The only problem is that the "multi" variants are rather new and therefore won't be supported on older kernels. Still, the 512-probe limit would apply on those kernels, so we shouldn't get an ELF with thousands of copies of a BPF function.
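The trade-off described here can be sketched as a small counting function: with a "multi" link type, one program serves every match; without it, each match gets its own program and the probe limit caps the total. This is only an illustration. The name `programs_to_generate` and the choice to check the limit only on the full-expansion path are assumptions, not bpftrace code; the default of 512 is the limit mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not part of bpftrace): number of BPF programs that one
// wildcard probe would produce in the generated ELF.
std::size_t programs_to_generate(std::size_t matches, bool multi_supported,
                                 std::size_t max_probes = 512) {
  assert(max_probes > 0);
  if (matches == 0)
    return 0;
  if (multi_supported)
    return 1;  // one program, attached to all matches through a multi link
  // Assumption: the limit is only enforced for full expansion in this sketch.
  if (matches > max_probes)
    throw std::length_error("too many probes for full expansion");
  return matches;  // one program per match
}
```

For example, `programs_to_generate(100000, true)` yields a single program, while the same call with `multi_supported == false` is rejected by the limit.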
I agree, unless the compiler has a "standard" way to do that. I haven't found any, yet.
Attaching to a huge number of fentry probes isn't currently possible due to the kernel's performance, as you said. It is something that users want to do and should be able to do, though, so we need to keep it in mind for whenever a kernel fix comes along. This is something that @tyroguru is interested in. It's the probe detach which is slow rather than the attach, if that makes any difference (try with this script:

Duplicate Symbols
I can create duplicate symbols for functions with Clang, so there must be an interface for doing this in libLLVM:
Maybe MCContext::getOrCreateSymbol? https://llvm.org/doxygen/classllvm_1_1MCContext.html#ac11eef690074972378846024abbe8722

libbpf
It looks like retsnoop is doing something special for mass attaching to fentries, but I suppose it has the requirement of being compiled ahead of time: https://github.com/anakryiko/retsnoop/blob/2d730d468719ed35d0f3bc2dbc958bd90f31342e/src/mass_attacher.c#L510-L528
Pinging @anakryiko for any input on using libbpf.
There is nothing that retsnoop or libbpf can do to speed up attachment/detachment of fentry/fexit BPF programs, unfortunately. The kernel doesn't support single-shot multi-attachment for them (there were discussions, but it never got implemented). The piece you linked just prepares a few different copies of programs, depending on the number of arguments. This is done to let libbpf perform relocations and all other adjustments, so that retsnoop can just grab raw BPF instructions and clone them for each btf_id (see clone_prog(), https://github.com/anakryiko/retsnoop/blob/2d730d468719ed35d0f3bc2dbc958bd90f31342e/src/mass_attacher.c#L977). So fentry/fexit mode is supported by retsnoop, but it's slow, has its own limitations, and is definitely not the preferred mode. It does have advantages in some situations (fentry pollutes LBR entries much less compared to kprobes).
It's very different for kprobe/kretprobe. Retsnoop will use multi-kprobes by default and will be able to attach to thousands of programs almost instantaneously, with just one program for entry and one for exit.
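The cloning idea described above (take an already-relocated instruction buffer and copy it once per target) can be sketched roughly as follows. This is not retsnoop's `clone_prog()`; the types `Insn` and `ProgClone` and the function `clone_for_targets` are hypothetical stand-ins used only to show the shape of the approach.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for one 8-byte BPF instruction (hypothetical type).
struct Insn {
  std::uint64_t raw;
};

// One clone of the template program, bound to a single attach target.
struct ProgClone {
  std::uint32_t btf_id;
  std::vector<Insn> insns;
};

// Copy the relocated template instructions once per BTF id. Each clone would
// then be loaded and attached separately, which is why this mode is slow.
std::vector<ProgClone> clone_for_targets(
    const std::vector<Insn>& tmpl, const std::vector<std::uint32_t>& btf_ids) {
  assert(!tmpl.empty());
  std::vector<ProgClone> clones;
  clones.reserve(btf_ids.size());
  for (std::uint32_t id : btf_ids)
    clones.push_back(ProgClone{id, tmpl});
  return clones;
}
```

The memory and load cost grows linearly with the number of targets, in contrast to the multi-kprobe path, which needs one entry program and one exit program in total.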
Thanks for the insights @anakryiko. The
There's also symbol aliasing in LLVM which sounds like what we need. I'll have a look into it.
The previous probe expansion approach tried to minimize the number of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks:

- In some cases it is still necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to two very similar loops for iterating matches, one in BPFtrace::add_probe and one in CodegenLLVM::visit(Probe), which is confusing and hard to maintain.
- libbpf needs one BPF program (i.e. one LLVM function) per probe, so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details.

This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish three types of probe expansion:

- Full expansion - A separate LLVM function is generated for each match. This is used for probe types with a small number of matches (e.g. "hardware"), when the 'probe' builtin is used, or for USDT probes (because they may access args).
- Alias expansion - Generates one LLVM function for all matches and one alias pointing to that function for each match. This is an efficient way of creating an ELF with multiple BPF programs sharing the same code, which libbpf is able to parse. Used for expansion of most probe types.
- Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts.

This allows us to drop a lot of duplicated code. The expansion for "full" and "alias" is done in CodegenLLVM::visit(Probe); the expansion for "multi" is done in BPFtrace::add_probe.

One particular area where this refactoring caused problems is the unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe, since that was where most of the expansion was done. Now that the expansion has moved to codegen, we need to run the full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence.

USDT probes pose a further problem, as it is not easy to mock USDTHelper, which is a fully static class. Since we need to override AttachPoint::usdt::num_locations from tests, we allow that via a new internal env variable BPFTRACE_TEST_USDT_NUM_LOCATIONS.

[1] bpftrace#3005
I did some investigation and experiments here and found that using symbol aliases will indeed produce multiple symbols with the same address, and libbpf will correctly discover them as separate BPF programs (and make a copy of the instructions for each). The problem is that libbpf relocations will not work, because libbpf doesn't account for multiple programs sharing the same instructions in the ELF file. This should be possible to fix on the libbpf side, but it's a bigger change, so I'd suggest going with full expansion (i.e. one LLVM function per wildcard match) for the first version of #2334.
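To make "multiple symbols with the same address" concrete, here is a minimal sketch using the GCC/Clang `alias` attribute on an ELF target (e.g. Linux). It runs as ordinary user-space code, not as a BPF object, and the names `probe_body` and `probe_alias` are made up for illustration; bpftrace would create aliases through the LLVM API rather than through source attributes.

```cpp
#include <cassert>
#include <cstdint>

// The single function body that all "programs" would share.
extern "C" int probe_body(int x) { return x + 1; }

// GCC/Clang extension (ELF targets only): a second symbol name that resolves
// to the same code as probe_body. extern "C" keeps the names unmangled, which
// the alias attribute requires.
extern "C" int probe_alias(int x) __attribute__((alias("probe_body")));

// Returns true when both symbols resolve to the same address.
bool share_address() {
  assert(probe_body(1) == probe_alias(1));
  auto a = reinterpret_cast<std::uintptr_t>(&probe_body);
  auto b = reinterpret_cast<std::uintptr_t>(&probe_alias);
  return a == b;
}
```

Two names, one body: this is exactly the property that lets the ELF stay small, and also the one that trips up relocation handling that expects each program to own its instructions.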
The previous probe expansion approach tried to minimize the number of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks:

- In some cases it is still necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to two very similar loops for iterating matches, one in BPFtrace::add_probe and one in CodegenLLVM::visit(Probe), which is confusing and hard to maintain.
- libbpf needs one BPF program (i.e. one LLVM function) per probe, so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details.

This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion:

- Full expansion - A separate LLVM function is generated for each match. This is used for probe types with a small number of matches (e.g. "hardware"), when the 'probe' builtin is used, or for USDT probes (because they may access args).
- Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts.

This allows us to drop a lot of duplicated code. The expansion for "full" is done in CodegenLLVM::visit(Probe); the expansion for "multi" is done in BPFtrace::add_probe.

A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints), as we generate duplicate LLVM functions. This is something we can live with for now, since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there's usually the alternative of using multi-kprobes.

One particular area where this refactoring caused problems is the unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe, since that was where most of the expansion was done. Now that the expansion has moved to codegen, we need to run the full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence.

USDT probes pose a further problem, as it is not easy to mock USDTHelper, which is a fully static class. Since we need to override AttachPoint::usdt::num_locations from tests, we allow that via a new internal env variable BPFTRACE_TEST_USDT_NUM_LOCATIONS.

[1] bpftrace#3005
The previous probe expansion approach tried to minimize the number of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks:

- In some cases it is still necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to two very similar loops for iterating matches, one in BPFtrace::add_probe and one in CodegenLLVM::visit(Probe), which is confusing and hard to maintain.
- libbpf needs one BPF program (i.e. one LLVM function) per probe, so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details.

This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion:

- Full expansion - A separate LLVM function is generated for each match. This is used for most expansions now.
- Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts.

This allows us to drop a lot of duplicated code. The expansion for "full" is done in CodegenLLVM::visit(Probe); the expansion for "multi" is done in BPFtrace::add_probe.

A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints), as we generate duplicate LLVM functions. This is something we can live with for now, since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there's usually the alternative of using multi-kprobes.

One particular area where this refactoring caused problems is the unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe, since that was where most of the expansion was done. Now that the expansion has moved to codegen, we need to run the full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence.

USDT probes pose a further problem, as it is not easy to mock USDTHelper, which is a fully static class. Since we need to override AttachPoint::usdt::num_locations from tests, we allow that via a new internal env variable BPFTRACE_TEST_USDT_NUM_LOCATIONS.

[1] bpftrace#3005
The previous probe expansion approach tried to minimize the number of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks:

- In some cases it is still necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to two very similar loops for iterating matches, one in BPFtrace::add_probe and one in CodegenLLVM::visit(Probe), which is confusing and hard to maintain.
- libbpf needs one BPF program (i.e. one LLVM function) per probe, so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details.

This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion:

- Full expansion - A separate LLVM function is generated for each match. This is used for most expansions now.
- Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts.

This allows us to drop a lot of duplicated code. The expansion for "full" is done in CodegenLLVM::visit(Probe); the expansion for "multi" is done in BPFtrace::add_probe.

A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints), as we generate duplicate LLVM functions. This is something we can live with for now, since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there's usually the alternative of using multi-kprobes.

One particular area where this refactoring caused problems is the unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe, since that was where most of the expansion was done. Now that the expansion has moved to codegen, we need to run the full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence.

With this change, some tests had to be dropped, especially the tests with a single wildcard for the uprobe/USDT target. The reason is that the semantic analyser expands these wildcards by searching all paths on the system, which cannot be mocked and therefore should not be run in unit tests (e.g. it prevents running the unit tests as non-root).

USDT probes pose a further problem, as it is not easy to mock USDTHelper, which is a fully static class. Since we need to override AttachPoint::usdt::num_locations from tests, we allow that via a new internal env variable BPFTRACE_TEST_USDT_NUM_LOCATIONS.

[1] bpftrace#3005
The previous probe expansion approach tried to minimize the number of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks:

- In some cases it is still necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to two very similar loops for iterating matches, one in BPFtrace::add_probe and one in CodegenLLVM::visit(Probe), which is confusing and hard to maintain.
- libbpf needs one BPF program (i.e. one LLVM function) per probe, so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details.

This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion:

- Full expansion - A separate LLVM function is generated for each match. This is used for most expansions now.
- Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts.

This allows us to drop a lot of duplicated code. The expansion for "full" is done in CodegenLLVM::visit(Probe); the expansion for "multi" is done in BPFtrace::add_probe.

A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints), as we generate duplicate LLVM functions. This is something we can live with for now, since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there's usually the alternative of using multi-kprobes.

One particular area where this refactoring caused problems is the unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe, since that was where most of the expansion was done. Now that the expansion has moved to codegen, we need to run the full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence.

With this change, some tests had to be dropped, especially the tests with a single wildcard for the uprobe/USDT target. The reason is that the semantic analyser expands these wildcards by searching all paths on the system, which cannot be mocked and therefore should not be run in unit tests (e.g. it prevents running the unit tests as non-root).

USDT probes pose a further problem, as USDTHelper was originally fully static, which prevented mocking it. Fortunately, we only need to mock the find() method, which doesn't have to be static, so this refactors USDTHelper and some of its users and introduces MockUSDTHelper, which mocks the find() method for unit tests.

[1] bpftrace#3005
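The mocking approach described here (make find() a non-static virtual method and override it in tests) follows a common C++ pattern, sketched below. The signature of `find()`, the `UsdtProbeEntry` struct, and `count_locations` are assumptions made for illustration; only the class names USDTHelper and MockUSDTHelper come from the commit message.

```cpp
#include <optional>
#include <string>

// Hypothetical result type; the real entry in bpftrace may look different.
struct UsdtProbeEntry {
  std::string path;
  std::string provider;
  std::string name;
  int num_locations;
};

class USDTHelper {
public:
  virtual ~USDTHelper() = default;
  // Non-static and virtual, so that tests can replace it. This stand-in
  // finds nothing; a real helper would inspect the target binary.
  virtual std::optional<UsdtProbeEntry> find(const std::string& /*target*/,
                                             const std::string& /*provider*/,
                                             const std::string& /*name*/) {
    return std::nullopt;
  }
};

class MockUSDTHelper : public USDTHelper {
public:
  // Returns canned data so that tests never touch the file system.
  std::optional<UsdtProbeEntry> find(const std::string& target,
                                     const std::string& provider,
                                     const std::string& name) override {
    if (provider == "prov1")
      return UsdtProbeEntry{target, provider, name, 2};
    return std::nullopt;
  }
};

// Code under test takes the helper by reference, so either one can be used.
int count_locations(USDTHelper& helper, const std::string& target,
                    const std::string& provider, const std::string& name) {
  auto entry = helper.find(target, provider, name);
  return entry ? entry->num_locations : 0;
}
```

Passing the helper by reference (or storing it as a member) is what makes the substitution possible; a fully static class offers no such seam, which is why the earlier revisions fell back to an environment variable.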
The previous probe expansion approach tried to minimize the amount of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks: - It is necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to having two very similar loops for iterating matches in BPFtrace::add_probe and in CodegenLLVM::visit(Probe) which is quite confusing and hard to maintain. - libbpf needs one BPF program (i.e. one LLVM function) per probe so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details. This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion: Full expansion - A separate LLVM function is generated for each match. This is used for most expansions now. Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts. This allows to drop a lot of duplicated code. The expansion for "full" is done in CodegenLLVM::visit(Probe), the expansion for "multi" is done in BPFtrace::add_probe. A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints) as we generate duplicate LLVM functions. This is something we can live with for now since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there's usually an alternative to use multi-kprobes. One particular area where this refactoring caused problems is unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe since that was where most of the expansion was done. 
Now that the expansion has moved to codegen, we need to run the full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence. With this change, some tests had to be dropped, especially the tests with a single wildcard for the uprobe/USDT target. The reason is that the semantic analyser expands these wildcards by searching all paths on the system, which cannot be mocked and therefore should not be run in unit tests (e.g. it prevents running the unit tests as non-root). USDT probes posed another problem: USDTHelper was originally fully static, which prevented mocking it. Fortunately, we only need to mock the find() method, which doesn't have to be static, so this refactors USDTHelper and some of its users and introduces MockUSDTHelper, which mocks the find method for unit tests.

[1] bpftrace#3005
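To illustrate the USDTHelper change described above, here is a minimal sketch of how a static helper can be made mockable by turning find() into a virtual member. The struct UsdtProbeEntry, the find() signature, and the function usdt_probe_exists are assumptions made for this example and are not copied from the bpftrace sources; only the class names USDTHelper and MockUSDTHelper come from the commit message.

```cpp
#include <optional>
#include <string>

// Hypothetical, simplified stand-in for a USDT probe description.
struct UsdtProbeEntry {
  std::string path;
  std::string provider;
  std::string name;
};

class USDTHelper {
public:
  virtual ~USDTHelper() = default;

  // Non-static and virtual so that unit tests can override it.
  // (Assumed signature, for illustration only.)
  virtual std::optional<UsdtProbeEntry> find(const std::string &target,
                                             const std::string &provider,
                                             const std::string &name)
  {
    // A real implementation would inspect the target binary here;
    // this placeholder never finds anything.
    (void)target;
    (void)provider;
    (void)name;
    return std::nullopt;
  }
};

class MockUSDTHelper : public USDTHelper {
public:
  // Returns a canned entry without touching the filesystem.
  std::optional<UsdtProbeEntry> find(const std::string &target,
                                     const std::string &provider,
                                     const std::string &name) override
  {
    return UsdtProbeEntry{ target, provider, name };
  }
};

// Hypothetical caller: it receives the helper by reference instead of
// calling a static method, so a test can pass in the mock.
inline bool usdt_probe_exists(USDTHelper &helper,
                              const std::string &target,
                              const std::string &provider,
                              const std::string &name)
{
  return helper.find(target, provider, name).has_value();
}
```

The point of the design is that callers depend on an instance rather than on a static function, so a test can substitute MockUSDTHelper and run without root privileges or real binaries.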
While working on #2334, I realized that we'll need to significantly change the way we do probe expansion, so I'm opening an RFC to see what other people's opinions are before I start implementing it.
Let us have a simple wildcarded probe kfunc:vfs_* { ... }. At the moment, we generate one LLVM function, LLVM generates one BPF program from it, then we perform the expansion (74 probes), load the BPF program 74 times (each time with a different BTF id of the probe), and attach each instance to a different probe.
The problem is that if we delegate probe loading to libbpf, it will need to discover the probes from the ELF object and therefore we'll need to generate 74 copies of the same LLVM function (unless we somehow force LLVM to create multiple symbol table entries for the same BPF function). This will heavily enlarge the codegen output and the size of the ELF file.
My stance is that this is still worth it, as moving to libbpf will have several advantages:
The codegen/ELF size itself is a hidden technical detail and it'll probably only cause trouble for debugging. Also, for some probes (e.g. kprobe), we'll be ok with just a single LLVM function but for others (like k(ret)func), we'll always need to do the expansion.
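The trade-off above can be sketched as a small model of how many functions codegen would have to emit for a wildcarded probe. This is only an illustration: the enums, the Features struct, and the generated function names are hypothetical and do not reflect actual bpftrace identifiers or its naming scheme.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model; names are not taken from bpftrace.
enum class ProbeType { kprobe, uprobe, kfunc, tracepoint };
enum class ExpansionType { full, multi };

// Assumed feature flags describing what the running kernel supports.
struct Features {
  bool kprobe_multi = false;
  bool uprobe_multi = false;
};

// A single function is only enough for k(u)probes on kernels that
// support the corresponding multi link type; everything else needs
// one function per match.
inline ExpansionType choose_expansion(ProbeType type, const Features &f)
{
  if (type == ProbeType::kprobe && f.kprobe_multi)
    return ExpansionType::multi;
  if (type == ProbeType::uprobe && f.uprobe_multi)
    return ExpansionType::multi;
  return ExpansionType::full;
}

// Returns the names of the functions codegen would emit, so that a
// loader reading the ELF object sees one program per attach point
// (full) or one program for all attach points (multi).
inline std::vector<std::string> functions_to_emit(
    const std::string &prefix,
    const std::vector<std::string> &matches,
    ExpansionType expansion)
{
  if (expansion == ExpansionType::multi)
    return { prefix + "_multi" };

  std::vector<std::string> names;
  names.reserve(matches.size());
  for (const auto &match : matches)
    names.push_back(prefix + "_" + match);
  return names;
}
```

Under this model, the kfunc:vfs_* example with 74 matches yields 74 functions, while a wildcarded kprobe on a kernel with kprobe_multi yields one.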