refactor(internal/perfevent): extract per-CPU perf_event_open helper#13
Merged
Conversation
…elper
profile.NewProfiler and dwarfagent.NewProfilerWithMode each spelled out
the same per-CPU "PerfEventOpen → AttachRawLink → cleanup-on-error" loop
with a near-identical PerfEventAttr literal. The two implementations
also drifted in two small ways: dwarfagent opened events with
PerfBitDisabled and explicitly IOC_ENABLE'd them after attach (closes
the open→attach race window); profile opened events already-enabled.
dwarfagent skipped offline CPUs (ESRCH from perf_event_open); profile
treated ESRCH as a hard error.
Centralise both behaviours in internal/perfevent:
- perfevent.Set: bundle of per-CPU fds + links with a single Close().
- OpenAll(prog, cpus, rate, opts...): the loop, with
WithDeferredEnable() option for the disabled-then-enable pattern.
- ESRCH-on-open is silently skipped (offline / hot-unplugged CPU is
not an error condition either path should reject).
profile/ migrates without WithDeferredEnable (preserves immediate-enable
behaviour) but now also tolerates offline CPUs — a strict improvement.
dwarfagent/ migrates with WithDeferredEnable (preserves the race-free
attach behaviour). Net change: ~150 lines of boilerplate replaced by one
helper call per profiler; PerfEventAttr now defined in exactly one
place.
dpsoft
added a commit
that referenced
this pull request
May 3, 2026
The committed architecture.excalidraw was the original sketch from the FP-only era — it still showed only profile/, offcpu/, cpu/, perf.bpf.c, offcpu.bpf.c, and cpu.bpf.c. Since then the codebase has grown: - unwind/dwarfagent/ (DWARF hybrid CPU + off-CPU walker, PRs #1–#11) - unwind/{ehcompile,ehmaps,procmap} (CFI compile, per-PID lifecycle, /proc resolver — PRs #3–#9) - internal/perfevent (per-CPU perf_event_open + AttachRawLink shared helper extracted from profile/dwarfagent — PR #13) - inject/{python,ptraceop,elfsym} (Python perf-trampoline activator via ptrace — PR #12) - internal/{nspid,k8slabels} (namespace-aware --pid + k8s pprof labels — PR #14) - bpf/perf_dwarf.bpf.c, bpf/offcpu_dwarf.bpf.c (DWARF kernel-side) The new file groups them into pre-profile setup (optional, dashed — nspid, k8slabels, inject), profilers (FP, DWARF, PMU), helpers (perfevent, ehcompile, ehmaps, procmap, pprof), and the four BPF programs in the kernel band, with arrows showing data + control flow. Output lands in *-on-cpu.pb.gz / *-off-cpu.pb.gz / PMU stdout-or-file. Layout was generated programmatically; open the file in https://excalidraw.com or the VS Code extension to fine-tune positions and colours. The README's ASCII version of the same diagram is unchanged — both exist intentionally so readers can grok the architecture without opening Excalidraw, and the .excalidraw is the editable source for future iterations.
dpsoft
added a commit
that referenced
this pull request
May 3, 2026
The committed architecture.excalidraw was the original sketch from the FP-only era — it still showed only profile/, offcpu/, cpu/, perf.bpf.c, offcpu.bpf.c, and cpu.bpf.c. Since then the codebase has grown: - unwind/dwarfagent/ (DWARF hybrid CPU + off-CPU walker, PRs #1–#11) - unwind/{ehcompile,ehmaps,procmap} (CFI compile, per-PID lifecycle, /proc resolver — PRs #3–#9) - internal/perfevent (per-CPU perf_event_open + AttachRawLink shared helper extracted from profile/dwarfagent — PR #13) - inject/{python,ptraceop,elfsym} (Python perf-trampoline activator via ptrace — PR #12) - internal/{nspid,k8slabels} (namespace-aware --pid + k8s pprof labels — PR #14) - bpf/perf_dwarf.bpf.c, bpf/offcpu_dwarf.bpf.c (DWARF kernel-side) The new file groups them into pre-profile setup (optional, dashed — nspid, k8slabels, inject), profilers (FP, DWARF, PMU), helpers (perfevent, ehcompile, ehmaps, procmap, pprof), and the four BPF programs in the kernel band, with arrows showing data + control flow. Output lands in *-on-cpu.pb.gz / *-off-cpu.pb.gz / PMU stdout-or-file. Layout was generated programmatically; open the file in https://excalidraw.com or the VS Code extension to fine-tune positions and colours. The README's ASCII version of the same diagram is unchanged — both exist intentionally so readers can grok the architecture without opening Excalidraw, and the .excalidraw is the editable source for future iterations.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Both
profile.NewProfiler(FP CPU profiler) anddwarfagent.NewProfilerWithMode(DWARF CPU profiler) spelled out an identical per-CPU "PerfEventOpen → AttachRawLink → cleanup-on-error" loop with a near-identicalPerfEventAttrliteral. Two small drift points between them:PerfBitDisabledand explicitlyIOC_ENABLE'd after attach (closes the open→attach race window); profile opened events already-enabled.ESRCHfromperf_event_open); profile treatedESRCHas a hard error, which would fail any run that included a hot-unplugged CPU.What this PR does
Centralises both behaviours in
internal/perfevent:perfevent.Set— bundle of per-CPU fds + links, singleClose().OpenAll(prog, cpus, sampleRate, opts...)— the per-CPU loop.WithDeferredEnable()— opt-in for the disabled-then-IOC_ENABLEpattern.ESRCH-on-open is silently skipped (offline CPU is not an error either path should reject).profile/migrates withoutWithDeferredEnable(preserves immediate-enable behaviour) but now also tolerates offline CPUs — a strict improvement.dwarfagent/migrates withWithDeferredEnable(preserves the race-free attach behaviour).PerfEventAttris now defined in exactly one place; future tweaks (precise sampling, hardware counter switch, etc.) only need to touch one helper.Diff shape
```
internal/perfevent/perfevent.go | +133 (new)
profile/profiler.go | +14 / -74
unwind/dwarfagent/agent.go | +4 / -62
```
Test plan
Behaviour notes