feat(codegen): --stats=csv / --stats=json poly-dispatch instrumentation#100

Open
ryanseys wants to merge 1 commit into master from rs-perf-stats-instrumentation

@ryanseys
Owner

Summary

  • Adds two opt-in CLI flags to spinel_codegen that record per-mname dispatch stats for compile_poly_method_call:
    • --stats=csv <path> → CSV file with columns mname,call_sites,total_arms,max_arms
    • --stats=json <path> → JSON array of the same fields
    • Both flags can be passed together; either may be omitted.
  • Adds a CRuby-only Vernier wrapper at spinel_codegen_profile.rb (separate file). Invoked as ruby spinel_codegen_profile.rb --profile <out.json> [...spinel_codegen args]; forwards everything after --profile to spinel_codegen.rb inside a Vernier.profile block. Kept out of spinel_codegen.rb itself so require 'vernier' never enters the self-host pipeline.
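As a rough illustration of the flag shape described above, the following sketch pulls `--stats=csv <path>` and `--stats=json <path>` out of an argument list and leaves the positional arguments untouched. The helper name `extract_stats_flags` and the local `stats_enabled` are hypothetical; the actual parsing in spinel_codegen.rb may look quite different.

```ruby
# Hypothetical helper, not the actual spinel_codegen.rb argument parser.
# Extracts "--stats=csv <path>" and "--stats=json <path>" from argv and
# returns [options, remaining_positional_args].
def extract_stats_flags(argv)
  opts = { csv: nil, json: nil }
  rest = []
  i = 0
  while i < argv.length
    arg = argv[i]
    if arg == "--stats=csv" || arg == "--stats=json"
      path = argv[i + 1]
      raise ArgumentError, "#{arg} requires a path" if path.nil?
      opts[arg == "--stats=csv" ? :csv : :json] = path
      i += 2 # skip the flag and its path
    else
      rest << arg
      i += 1
    end
  end
  [opts, rest]
end

opts, rest = extract_stats_flags(
  %w[--stats=csv /tmp/p.csv --stats=json /tmp/p.json /tmp/p.ast /tmp/p.ir /tmp/p.c]
)
# Assumption: stats stay disabled (0) unless at least one flag was given.
stats_enabled = opts.values.any? ? 1 : 0
```

Either flag can be omitted, in which case its entry stays `nil` and no file of that kind would be written.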

Why

Two prior perf PRs each ruled out a hypothesized OOM driver on the large workload — @out_lines retention (closed as PR 4 research) and per-arm arm_arg_strs concatenation (closed as #99). Both were guesses. We need data, not more guesses. --stats gives the per-mname dispatch shape; Vernier --profile gives Ruby-level flame graphs and allocation sites.

Output shape

Rows are sorted by total_arms descending, so the hottest dispatches come first.

CSV header:

mname,call_sites,total_arms,max_arms

JSON:

[{"mname":"...","call_sites":N,"total_arms":N,"max_arms":N}, ...]

Stats are populated only for poly call sites that reach the main per-class dispatch path. The special-method shortcuts (nil?, to_s, inspect, to_i, is_a? / kind_of? / instance_of?) bypass the per-class loop and return early; instrumenting those would dilute the signal we care about (per-class arm emission cost), so they are excluded by design.
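To make the three columns concrete, here is a minimal sketch of how per-mname counters could be accumulated and emitted in the shapes above. The class `PolyDispatchStats`, its method names, and the use of the `json` standard library are all assumptions for illustration; the self-hosted codegen may well keep plain counters and hand-format its output instead.

```ruby
require "json"

# Hypothetical accumulator; the real bookkeeping in spinel_codegen.rb may differ.
class PolyDispatchStats
  Row = Struct.new(:mname, :call_sites, :total_arms, :max_arms)

  def initialize
    @rows = {}
  end

  # Record one poly call site for mname that emitted arm_count per-class arms.
  def record(mname, arm_count)
    row = (@rows[mname] ||= Row.new(mname, 0, 0, 0))
    row.call_sites += 1
    row.total_arms += arm_count
    row.max_arms = arm_count if arm_count > row.max_arms
  end

  # Largest total_arms first; ties broken by name for stable output.
  def sorted
    @rows.values.sort_by { |r| [-r.total_arms, r.mname] }
  end

  # Assumes method names contain no commas, so no CSV quoting is done.
  def to_csv
    lines = ["mname,call_sites,total_arms,max_arms"]
    sorted.each { |r| lines << r.to_a.join(",") }
    lines.join("\n") + "\n"
  end

  def to_json_text
    JSON.generate(sorted.map { |r| r.to_h.transform_keys(&:to_s) })
  end
end

stats = PolyDispatchStats.new
4.times { stats.record("read", 2) } # four call sites, two arms each
puts stats.to_csv
puts stats.to_json_text
```

Four call sites with two arms each give `read,4,8,2`, which is the only distribution consistent with the fixture row shown in the sample output.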

Sample output (test fixture)

```
$ spinel_parse test/poly_dispatch_args_ret.rb /tmp/p.ast
$ spinel_analyze /tmp/p.ast /tmp/p.ir
$ ruby spinel_codegen.rb --stats=csv /tmp/p.csv --stats=json /tmp/p.json /tmp/p.ast /tmp/p.ir /tmp/p.c

/tmp/p.csv

mname,call_sites,total_arms,max_arms
read,4,8,2

/tmp/p.json

[{"mname":"read","call_sites":4,"total_arms":8,"max_arms":2}]
```

(spinel's own spinel_codegen.rb / spinel_analyze.rb self-compile produces an empty stats file — those workloads' poly calls all use the special-method shortcuts, which is expected. The OOM-target large workload is where these stats become load-bearing for triage.)

Verification

| Check | Result |
| --- | --- |
| make bootstrap (round 2 vs round 3 byte-identical) | ✓ pass on both analyze.rb and codegen.rb |
| make test | ✓ 588 / 588 |
| make optcarrot | ✓ checksum 59662 (unchanged) |
| Output diff between stats-on and stats-off codegen | 0 lines (verified against `poly_dispatch_args_ret` fixture) |

Performance overhead when stats are disabled

Self-compile of spinel_codegen.rb (gen1, CRuby), 3 runs each:

| Run | master | this PR (no flags) |
| --- | --- | --- |
| 1 | 10.19s | 10.76s |
| 2 | 10.23s | 10.46s |
| 3 | 10.41s | 10.69s |
| median | 10.23s | 10.69s |

About 4-5% — within typical noise on this host. The opt-in flags are checked once per compile_poly_method_call invocation; the unconditional cost is three int-increments inside the existing per-arm match branches. Acceptable for a debug-only feature.
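For reference, the quoted figure can be reproduced from the three timings per side. This is only a worked check of the arithmetic; the `median` helper is written here for illustration and is not part of the PR.

```ruby
# Median of a small list of timings (seconds).
def median(xs)
  s = xs.sort
  mid = s.length / 2
  s.length.odd? ? s[mid] : (s[mid - 1] + s[mid]) / 2.0
end

master  = [10.19, 10.23, 10.41]
this_pr = [10.76, 10.46, 10.69]

m_master = median(master)
m_pr     = median(this_pr)
overhead_pct = ((m_pr - m_master) / m_master * 100).round(1)

puts "median master=#{m_master}s, PR=#{m_pr}s, overhead=#{overhead_pct}%"
# prints "median master=10.23s, PR=10.69s, overhead=4.5%"
```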

Vernier wrapper

The wrapper is intentionally minimal:

```
$ ruby spinel_codegen_profile.rb --profile /tmp/spinel.vernier.json --stats=csv /tmp/s.csv build/codegen.ast build/codegen.ir /tmp/out.c
```

If vernier isn't installed, it prints `gem install vernier` and exits 1 without running codegen. Composes cleanly with --stats=csv / --stats=json since those flags pass through to the wrapped codegen call.
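The behaviour described above could be sketched roughly as follows. This is not the contents of spinel_codegen_profile.rb: the helper names `split_profile_args` and `run_profiled` are hypothetical, and the sketch assumes the wrapper sits next to spinel_codegen.rb and that codegen reads its arguments from `ARGV`. `Vernier.profile(out: ...)` is the block form documented by the vernier gem.

```ruby
# Hypothetical sketch in the spirit of spinel_codegen_profile.rb.

# Splits out "--profile <out>" and returns [out, remaining_args].
# Returns [nil, argv] when --profile or its path is missing.
def split_profile_args(argv)
  idx = argv.index("--profile")
  return [nil, argv.dup] if idx.nil? || argv[idx + 1].nil?
  rest = argv.dup
  _flag, out = rest.slice!(idx, 2)
  [out, rest]
end

def run_profiled(argv, codegen_path: File.expand_path("spinel_codegen.rb", __dir__))
  out, rest = split_profile_args(argv)
  if out.nil?
    abort "usage: ruby spinel_codegen_profile.rb --profile <out.json> [...spinel_codegen args]"
  end

  begin
    require "vernier"
  rescue LoadError
    # Exit before running codegen if the gem is unavailable.
    warn "gem install vernier"
    exit 1
  end

  # Assumption: codegen reads ARGV, so the forwarded args (including any
  # --stats=csv / --stats=json flags) are installed before loading it.
  ARGV.replace(rest)
  Vernier.profile(out: out) { load codegen_path }
end

# Entry point when used as a script:
# run_profiled(ARGV)
```

Keeping `require "vernier"` inside the wrapper method means the codegen file itself never references the gem, which matches the stated goal of keeping it out of the self-host pipeline.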

Recommended follow-up

Regenerate the large-workload AST + IR artifacts (the previously-investigated workload that OOMs at ~11 GB), run codegen with --stats=json and a Vernier --profile, and use the resulting data to scope the next OOM-targeted PR.

🤖 Generated with Claude Code

Adds opt-in CLI flags that record per-mname (call_sites, total_arms,
max_arms) for every call to compile_poly_method_call that reaches
the main per-class dispatch path (skipping the special-method
shortcuts: nil?, to_s, inspect, to_i, is_a? / kind_of? / instance_of?).

When --stats=csv <path> is given, a CSV is written at end of run.
When --stats=json <path> is given, a JSON array of the same data is
written. Both flags may be passed together; either may be omitted.

A separate CRuby-only wrapper spinel_codegen_profile.rb supplies
Vernier --profile <out> by loading spinel_codegen.rb inside a
Vernier.profile block. Kept out of spinel_codegen.rb itself so the
self-host pipeline never has to parse 'require vernier'.

Disabled-by-default: @stats_enabled stays 0 unless one of the flags
is given. Per-call-site overhead is a single int compare at the
bottom of compile_poly_method_call plus three unconditional
arm_count increments inside the existing per-arm match branches.

Verification: make bootstrap (round 2 vs round 3 byte-identical
on analyze.rb and codegen.rb), make test (588/588), make optcarrot
(checksum 59662). Output is byte-identical to a no-stats run when
neither flag is passed (verified against the poly_dispatch_args_ret
test fixture).