Description
(Discussion extracted from #299)
This issue is to track "how could str() support kilobyte/megabyte/arbitrary string sizes?"
str() allocates strings on the BPF stack, so we reach the 512-byte limit very quickly (especially since we pay the toll again when printf() composes a perf event output buffer).
In fact, we're currently limited to tracing ~240-byte strings.
For an idea of "when would I want to trace a string larger than 240 bytes?":
My favourite DTrace command was one that frequently maxed out the default string buffer: tracing what SQL queries were submitted to MySQL. I'd usually grow the buffer to 1024 bytes for good measure.
=====
I explored the possibility of storing the strings off-stack. The main problem there is that probe_read_str() will only write to on-stack memory. I have ideas for workarounds (and some code progress towards those workarounds), but they're too big to be included in the scope of this pull request.
=====
Here's some code that I built to explore the possibility of storing strings off-stack:
bd038f9...Birch-san:c5088c9a5d731165e1005a6a800e611bb63f3dc3
It introduces (in codegen) a "StrCall" specialization of "Call".
StrCall has a PERCPU_ARRAY BPF map, to hold large data off-stack.
We cannot use probe_read_str() to output directly into the BPF map, but maybe we could allocate a small on-stack buffer, and ferry bytes (one buffer at a time) from BPF stack into BPF map.
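The ferrying idea above can be sketched in plain userspace C. This is only an illustration, not BPF code: read_str_chunk() is a hypothetical stand-in that mimics the always-NUL-terminate, return-length-including-NUL behaviour of probe_read_str(), `big` stands in for the PERCPU_ARRAY map value, and CHUNK_SIZE is an assumed staging-buffer size. A real BPF program would also need a loop bound the verifier can prove.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 64 /* assumed size of the small on-stack staging buffer */

/* Hypothetical stand-in for probe_read_str(): copies up to size-1 bytes,
 * always NUL-terminates, returns bytes written including the NUL. */
static size_t read_str_chunk(char *dst, size_t size, const char *src)
{
    size_t n = 0;
    assert(size > 0);
    while (n + 1 < size && src[n] != '\0') {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';
    return n + 1;
}

/* Ferries src into big (stand-in for a per-CPU map value) one chunk at a
 * time. Returns the stored string length, excluding the NUL. */
static size_t ferry_string(char *big, size_t big_size, const char *src)
{
    char chunk[CHUNK_SIZE]; /* the only buffer that lives on the stack */
    size_t total = 0;
    assert(big_size > 0);
    while (total + 1 < big_size) {
        size_t room = big_size - total; /* space left, incl. final NUL */
        size_t want = room < CHUNK_SIZE ? room : CHUNK_SIZE;
        size_t got = read_str_chunk(chunk, want, src + total);
        size_t payload = got - 1;       /* drop the chunk's own NUL */
        memcpy(big + total, chunk, payload);
        total += payload;
        if (payload + 1 < want)         /* short read: source string ended */
            break;
    }
    big[total] = '\0';
    return total;
}
```

Each chunk carries CHUNK_SIZE - 1 useful bytes because the helper terminates every read, so a 300-byte string takes five reads here. The string is truncated to fit if the destination is smaller than the source.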
Next, printf() poses a few problems:
- composes the perf event output struct on the BPF stack
- doesn't use runtime information to decide length of strings inside the perf event output struct
- composes a fixed-size structure
- determines size of structure during semantic analysis
I think PerfEventOutput can transfer data from BPF map to userspace (that's what "join" does), so we could compose the perf event output struct inside a BPF map.
====
We may find that we want a dynamically-sized perf event output structure. For example, if args->count turns out to be just 3 bytes, we wouldn't want to send MAX_STRLEN bytes to userspace when we know a smaller, exact length at runtime.
Here's some (horrifying) code I wrote to make printf() decide the size of its MEMCPY based on runtime information rather than semantic analysis:
master...Birch-san:6c1f19a3a4c3cf253aa23db79f5f637b6823e5b2
Notice how expr2_ is used to pass down a strlen at runtime, instead of using the strlen provided during semantic analysis.
And notice how the str call uses a parameterised AllocaInst, so instead of isArrayTy(), its expr_ type has isPointerTy() (it's a pointer to dynamically-allocated memory).
I reached a dead-end here because printf() allocates a buffer based on an LLVM StructType. It can store strings in ArrayType members, but you need to declare the size of the ArrayType up front. You cannot use an LLVM Value to dynamically size it.
The workaround here would be to stop using StructType and instead use a basic AllocaInst, which supports a dynamic size given as an LLVM Value. Or even better: we can use off-stack memory (a BPF map) instead of doing an allocation inside printf().
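To make the contrast concrete, here is a rough userspace C sketch of a fixed-size event struct next to a flat buffer whose length is decided at runtime. Everything in it is an assumption for illustration: the struct layouts, the printf_id and str_len fields, compose_event(), and the MAX_STRLEN value are hypothetical and not taken from the codegen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STRLEN 1024 /* assumed maximum string size */

/* Fixed layout: the array size must be known up front, so every event
 * costs sizeof(struct fixed_event) bytes regardless of the string. */
struct fixed_event {
    uint64_t printf_id;
    char str[MAX_STRLEN];
};

/* Hypothetical variable-length layout: a small header, then str_len bytes. */
struct event_header {
    uint64_t printf_id;
    uint32_t str_len;
};

/* Writes header + string bytes into buf (stand-in for off-stack memory)
 * and returns how many bytes would actually need to be sent. */
static size_t compose_event(uint8_t *buf, size_t buf_size,
                            uint64_t printf_id, const char *str)
{
    struct event_header hdr;
    size_t len = strlen(str);
    if (len > MAX_STRLEN)
        len = MAX_STRLEN;               /* truncate to the configured cap */
    assert(buf_size >= sizeof hdr + len);
    memset(&hdr, 0, sizeof hdr);        /* zero padding bytes too */
    hdr.printf_id = printf_id;
    hdr.str_len = (uint32_t)len;
    memcpy(buf, &hdr, sizeof hdr);
    memcpy(buf + sizeof hdr, str, len);
    return sizeof hdr + len;            /* size chosen at runtime */
}
```

With a 3-byte string, the returned size is the header plus 3 bytes, whereas the fixed struct always occupies more than MAX_STRLEN bytes. The consumer would read str_len from the header to know how many payload bytes follow.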
====
In conclusion: there may be a path to supporting arbitrarily-large strings. Bonus points if we can also make the perf event output buffer dynamically-sized, so we can transfer the smallest message possible (instead of our maximum buffer size). A big advantage of dynamic sizing is that the user wouldn't have to tune their maximum buffer size for the sake of performance.