Skip to content

refactor(internal/perfevent): extract per-CPU perf_event_open helper#13

Merged
dpsoft merged 1 commit into
mainfrom
refactor/perfevent-extract
Apr 30, 2026
Merged

refactor(internal/perfevent): extract per-CPU perf_event_open helper#13
dpsoft merged 1 commit into
mainfrom
refactor/perfevent-extract

Conversation

@dpsoft
Copy link
Copy Markdown
Owner

@dpsoft dpsoft commented Apr 30, 2026

Summary

Both profile.NewProfiler (FP CPU profiler) and dwarfagent.NewProfilerWithMode (DWARF CPU profiler) spelled out an identical per-CPU "PerfEventOpen → AttachRawLink → cleanup-on-error" loop with a near-identical PerfEventAttr literal. Two small drift points between them:

  • dwarfagent opened events with PerfBitDisabled and explicitly IOC_ENABLE'd after attach (closes the open→attach race window); profile opened events already-enabled.
  • dwarfagent silently skipped offline CPUs (ESRCH from perf_event_open); profile treated ESRCH as a hard error, which would fail any run that included a hot-unplugged CPU.

What this PR does

Centralises both behaviours in internal/perfevent:

  • perfevent.Set — bundle of per-CPU fds + links, single Close().
  • OpenAll(prog, cpus, sampleRate, opts...) — the per-CPU loop.
  • WithDeferredEnable() — opt-in for the disabled-then-IOC_ENABLE pattern.
  • ESRCH-on-open is silently skipped (offline CPU is not an error either path should reject).

profile/ migrates without WithDeferredEnable (preserves immediate-enable behaviour) but now also tolerates offline CPUs — a strict improvement. dwarfagent/ migrates with WithDeferredEnable (preserves the race-free attach behaviour).

PerfEventAttr is now defined in exactly one place; future tweaks (precise sampling, hardware counter switch, etc.) only need to touch one helper.

Diff shape

```
internal/perfevent/perfevent.go | +133 (new)
profile/profiler.go | +14 / -74
unwind/dwarfagent/agent.go | +4 / -62
```

Test plan

  • `go build ./...` clean
  • `go test ./...` all green
  • CI: lint, build amd64+arm64, unit tests amd64+arm64
  • CI: integration tests amd64+arm64 — exercises both `profile.NewProfiler` and `dwarfagent.NewProfilerWithMode` end-to-end through real perf_event_open

Behaviour notes

  • profile path: previously errored on offline CPUs in the `cpus` slice; now skips them silently. If callers were relying on the error, they will see it as a non-event. `OpenAll` still errors when every CPU is offline ("no perf events attached").
  • dwarfagent path: byte-for-byte equivalent (same disabled-then-IOC_ENABLE).

…elper

profile.NewProfiler and dwarfagent.NewProfilerWithMode each spelled out
the same per-CPU "PerfEventOpen → AttachRawLink → cleanup-on-error" loop
with a near-identical PerfEventAttr literal. The two implementations
also drifted in two small ways: dwarfagent opened events with
PerfBitDisabled and explicitly IOC_ENABLE'd them after attach (closes
the open→attach race window); profile opened events already-enabled.
dwarfagent skipped offline CPUs (ESRCH from perf_event_open); profile
treated ESRCH as a hard error.

Centralise both behaviours in internal/perfevent:

  - perfevent.Set: bundle of per-CPU fds + links with a single Close().
  - OpenAll(prog, cpus, rate, opts...): the loop, with
    WithDeferredEnable() option for the disabled-then-enable pattern.
  - ESRCH-on-open is silently skipped (offline / hot-unplugged CPU is
    not an error condition either path should reject).

profile/ migrates without WithDeferredEnable (preserves immediate-enable
behaviour) but now also tolerates offline CPUs — a strict improvement.
dwarfagent/ migrates with WithDeferredEnable (preserves the race-free
attach behaviour). Net change: ~150 lines of boilerplate replaced by one
helper call per profiler; PerfEventAttr now defined in exactly one
place.
@dpsoft dpsoft marked this pull request as ready for review April 30, 2026 15:28
@dpsoft dpsoft merged commit 0cbc689 into main Apr 30, 2026
10 checks passed
@dpsoft dpsoft deleted the refactor/perfevent-extract branch April 30, 2026 15:28
dpsoft added a commit that referenced this pull request May 3, 2026
The committed architecture.excalidraw was the original sketch from the
FP-only era — it still showed only profile/, offcpu/, cpu/, perf.bpf.c,
offcpu.bpf.c, and cpu.bpf.c. Since then the codebase has grown:

- unwind/dwarfagent/ (DWARF hybrid CPU + off-CPU walker, PRs #1#11)
- unwind/{ehcompile,ehmaps,procmap} (CFI compile, per-PID lifecycle,
  /proc resolver — PRs #3#9)
- internal/perfevent (per-CPU perf_event_open + AttachRawLink shared
  helper extracted from profile/dwarfagent — PR #13)
- inject/{python,ptraceop,elfsym} (Python perf-trampoline activator
  via ptrace — PR #12)
- internal/{nspid,k8slabels} (namespace-aware --pid + k8s pprof labels
  — PR #14)
- bpf/perf_dwarf.bpf.c, bpf/offcpu_dwarf.bpf.c (DWARF kernel-side)

The new file groups them into pre-profile setup (optional, dashed —
nspid, k8slabels, inject), profilers (FP, DWARF, PMU), helpers
(perfevent, ehcompile, ehmaps, procmap, pprof), and the four BPF
programs in the kernel band, with arrows showing data + control flow.
Output lands in *-on-cpu.pb.gz / *-off-cpu.pb.gz / PMU stdout-or-file.

Layout was generated programmatically; open the file in
https://excalidraw.com or the VS Code extension to fine-tune
positions and colours.

The README's ASCII version of the same diagram is unchanged — both
exist intentionally so readers can grok the architecture without
opening Excalidraw, and the .excalidraw is the editable source for
future iterations.
dpsoft added a commit that referenced this pull request May 3, 2026
The committed architecture.excalidraw was the original sketch from the
FP-only era — it still showed only profile/, offcpu/, cpu/, perf.bpf.c,
offcpu.bpf.c, and cpu.bpf.c. Since then the codebase has grown:

- unwind/dwarfagent/ (DWARF hybrid CPU + off-CPU walker, PRs #1#11)
- unwind/{ehcompile,ehmaps,procmap} (CFI compile, per-PID lifecycle,
  /proc resolver — PRs #3#9)
- internal/perfevent (per-CPU perf_event_open + AttachRawLink shared
  helper extracted from profile/dwarfagent — PR #13)
- inject/{python,ptraceop,elfsym} (Python perf-trampoline activator
  via ptrace — PR #12)
- internal/{nspid,k8slabels} (namespace-aware --pid + k8s pprof labels
  — PR #14)
- bpf/perf_dwarf.bpf.c, bpf/offcpu_dwarf.bpf.c (DWARF kernel-side)

The new file groups them into pre-profile setup (optional, dashed —
nspid, k8slabels, inject), profilers (FP, DWARF, PMU), helpers
(perfevent, ehcompile, ehmaps, procmap, pprof), and the four BPF
programs in the kernel band, with arrows showing data + control flow.
Output lands in *-on-cpu.pb.gz / *-off-cpu.pb.gz / PMU stdout-or-file.

Layout was generated programmatically; open the file in
https://excalidraw.com or the VS Code extension to fine-tune
positions and colours.

The README's ASCII version of the same diagram is unchanged — both
exist intentionally so readers can grok the architecture without
opening Excalidraw, and the .excalidraw is the editable source for
future iterations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant