CPU sampling (profiling) at a timed interval #330

Closed
brendangregg opened this Issue Jan 29, 2016 · 4 comments

Contributor

Support for CPU profiling, like "perf record -F 99 -a -g -- sleep 10". The advantage of bcc/eBPF is efficiency: stack traces can be frequency counted in kernel context (an eBPF map), and only the summary emitted to user level. The current operation of perf_events is to write a perf.data file with every stack sample, which involves extra CPU and file system overhead (context switches have been minimized thanks to perf's dynamic wakeups).
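
To illustrate the efficiency argument, here is a small user-space Python sketch (not BPF code) that contrasts keeping one record per sample with keeping a frequency count per unique stack. The stack names, weights, and the `take_samples` helper are made up for illustration; only the 99 Hz for 10 seconds figure comes from the `perf record` example above.

```python
import random
from collections import Counter

def take_samples(n, seed=0):
    """Return n fake stack samples (hypothetical stacks, illustration only)."""
    rng = random.Random(seed)
    stacks = [
        ("main", "parse", "read"),
        ("main", "compute", "hash"),
        ("main", "compute", "sort"),
        ("idle",),
    ]
    weights = [1, 5, 3, 1]
    return rng.choices(stacks, weights=weights, k=n)

# 99 Hz for 10 seconds on a single CPU gives 990 samples.
samples = take_samples(99 * 10)

# perf.data style: one record is written for every sample.
per_sample_records = len(samples)

# Map style: one counter per unique stack; only this summary is emitted.
summary = Counter(samples)

print("records if every sample is written:", per_sample_records)
print("records if stacks are frequency counted:", len(summary))
for stack, count in summary.most_common():
    print(count, ";".join(stack))
```

The summary grows with the number of distinct stacks rather than with the number of samples, which is the saving an in-kernel map would provide.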

While this will be an everyday feature, I've put the priority as medium and not high because Linux perf_events is an adequate workaround for now.

Owner
4ast commented Jan 30, 2016

A BPF program can kprobe into the HZ tick, but there are NO_HZ environments, so we need to be able to set up timer-fired BPF programs, like generic sampling. Since such programs will likely access hw counters, we need to make sure this plays nicely with NMI stuff.

4ast self-assigned this Jan 30, 2016
Contributor

Yes, my systems are NO_HZ.

I wanted to add the top use cases; imagine these are all sampled at 99 Hertz:

  1. Sampling of user/kernel stack traces.
  2. Sampling of current function or instruction pointer (ctx->ip).
  3. Sampling of stacks when another condition is true (check a map value, which is set elsewhere).
  4. Sampling of PID/process name (bpf_get_current_pid_tgid(), bpf_get_current_comm()).
  5. Sampling of thread priority (task_struct->prio).
  6. Sampling of thread stack size (task_struct->?).

The first 4 should be doable right away. Not sure about 5 & 6 without task_struct access.
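
A minimal sketch of use cases 1 and 4 combined, assuming a bcc version that provides `attach_perf_event()` and a kernel that allows BPF programs on perf events (neither was available when this issue was opened). The names `BPF_TEXT`, `key_t`, `counts`, `stack_traces`, `do_perf_event`, and `main` are chosen for illustration. It is not the implementation that closed this issue.

```python
import time

# BPF C program: count samples per (pid, comm, user stack, kernel stack).
BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>
#include <uapi/linux/bpf_perf_event.h>
#include <linux/sched.h>

struct key_t {
    u32 pid;
    int user_stack_id;
    int kernel_stack_id;
    char name[TASK_COMM_LEN];
};
BPF_HASH(counts, struct key_t);
BPF_STACK_TRACE(stack_traces, 16384);

int do_perf_event(struct bpf_perf_event_data *ctx) {
    struct key_t key = {};
    key.pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&key.name, sizeof(key.name));
    key.user_stack_id = stack_traces.get_stackid(&ctx->regs, BPF_F_USER_STACK);
    key.kernel_stack_id = stack_traces.get_stackid(&ctx->regs, 0);
    counts.increment(key);
    return 0;
}
"""

def main(duration_s=10, freq_hz=99):
    """Sample all CPUs at freq_hz for duration_s seconds. Needs root and bcc."""
    # Imported here so the module loads on machines without bcc installed.
    from bcc import BPF, PerfType, PerfSWConfig

    b = BPF(text=BPF_TEXT)
    # A software CPU clock event fires on a timer, so it does not depend on the HZ tick.
    b.attach_perf_event(ev_type=PerfType.SOFTWARE,
                        ev_config=PerfSWConfig.CPU_CLOCK,
                        fn_name="do_perf_event",
                        sample_freq=freq_hz)
    time.sleep(duration_s)

    counts = b["counts"]
    stack_traces = b["stack_traces"]
    top = sorted(counts.items(), key=lambda kv: kv[1].value, reverse=True)[:20]
    for k, v in top:
        print("%-16s pid=%d count=%d" % (
            k.name.decode("utf-8", "replace"), k.pid, v.value))
        # A negative stack id means the stack could not be captured.
        if k.kernel_stack_id >= 0:
            for addr in stack_traces.walk(k.kernel_stack_id):
                print("    " + b.ksym(addr).decode("utf-8", "replace"))
```

Calling `main()` as root prints the 20 most frequent keys with their kernel stacks. Use case 3 could be approximated by checking a second map at the top of `do_perf_event` and returning early when the condition is not set.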

Contributor

Closed; fixed by #620.

Contributor

For anyone looking up this ticket, while profile.py works (defaulting to a kprobe), it needs to be rewritten to use the upcoming perf_event_open() BPF support.
