-
Notifications
You must be signed in to change notification settings - Fork 659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] introduce periodic BPF programs #924
Conversation
link.Periodic attaches an eBPF program to a periodic perf event running on specified CPU at specified frequency. Similarly, link.PeriodicAllCpus does the same for each online CPU. Under the hood, this uses a PERF_COUNT_SW_CPU_CLOCK perf event. The design of this module is symmetric to the tracepoint module and re-uses most of the logic and based on an original example by @florianl https://gist.github.com/florianl/5d9cc9dbb3822e03f6f65a073ffbedbb. The main use-case for this module is to periodically sample kernel state to build higher level performance metrics like runqlen or a custom profiler using bpf_get_stackid(). Signed-off-by: Jean-Tiare Le Bigot <jt@yadutaf.fr>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As pointed out in #548 (comment) I still think adding more subsystems of the Linux kernel and implementing them only partly might introduce a maintenance burden. Whereas on the other hand it should be easier for users to discover and use other subsystems where eBPF can be used. But I don't know how this conflict can be resolved best - for the maintainers and the users.
// Losing the reference to the resulting Link (tp) will close the Periodic event | ||
// and prevent further execution of prog. The Link must be Closed during | ||
// program shutdown to avoid leaking system resources. | ||
func Periodic(frequency uint64, cpu int, prog *ebpf.Program, opts *PeriodicOptions) (Link, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a number of combinations of pid
and cpu
that the perf subsystem accepts for periodic events:
- pid == 0 and cpu == -1
- pid == 0 and cpu >= 0
- pid > 0 and cpu == -1
- pid > 0 and cpu >= 0
- pid == -1 and cpu >= 0
This function signature only allows to set the CPU . But I do think there is a valid use case where people want to attach an eBPF program via perf only to a single PID - e.g. for debugging purposes.
Hey @yadutaf, thanks for the patch! I'm sorry, but I think we've reached our carrying capacity for perf-related things. Going forward, we'd like to limit new features to and focus on the link abstractions provided by bpf, since it's mostly extending existing infrastructure. I'd like to encourage heavy perf users (you, @florianl, @mmat11, @brycekahle?) to create a project that the Go community can rely on for doing specific things with perf+ebpf, like what you've proposed in this PR. Feel free to take whatever code you need from the lib to get this on the road. We're discussing similar options for spinning off the perf reader implementation. For discoverability purposes, we would officially recommend this sort of library in documentation. Hope you understand. |
Thanks @ti-mo and @florianl for your feedback. As a software developer, I can only understand your position, especially given the previous discussions in this repository. Of course, I hoped that a specific / tailored use-case could find its way into this library, but, let's be honest, I would not be the one carrying the burden of the maintenance. |
This PR is a proposition for a new link type modeled around the "tracepoint" link type. This new link type would allow easily running periodic eBPF programs without the need of figuring out the correct "perf_event" invocation.
I realize this may have already been discussed in other places like #548 and that this project's intention is clearly not to wrap every use-case of the perf API.
However, periodic sampling is a use-case where going through the perf subsystem with a non-intuitive perf event definition looks much more like an implementation detail than using an API tailored for the need.
In practice, I intend to use this API in another project to port bpftrace runqlen in an environment where shipping LLVM into production is not an option. This could be used for other use-cases like building an eBPF based profiler, but I did not investigate this part.
As a secondary question, if this proposition looks interesting, I'm not 100% that the
Periodic
name is the most appropriate. bpftrace introduced theprofile
keyword for a similar feature, but this looks tailors to a specific use-case. Another option could be something likeSampler
.Remaining work:
This proposition is based on an example by @florianl.