feat(codegen): --stats=csv / --stats=json poly-dispatch instrumentation#100
Open
ryanseys wants to merge 1 commit into
Open
feat(codegen): --stats=csv / --stats=json poly-dispatch instrumentation#100ryanseys wants to merge 1 commit into
ryanseys wants to merge 1 commit into
Conversation
Adds opt-in CLI flags that record per-mname (call_sites, total_arms, max_arms) for every call to compile_poly_method_call that reaches the main per-class dispatch path (skipping the special-method shortcuts: nil?, to_s, inspect, to_i, is_a? / kind_of? / instance_of?). When --stats=csv <path> is given, a CSV is written at end of run. When --stats=json <path> is given, a JSON array of the same data is written. Both flags may be passed together; either may be omitted. A separate CRuby-only wrapper spinel_codegen_profile.rb supplies Vernier --profile <out> by loading spinel_codegen.rb inside a Vernier.profile block. Kept out of spinel_codegen.rb itself so the self-host pipeline never has to parse 'require vernier'. Disabled-by-default: @stats_enabled stays 0 unless one of the flags is given. Per-call-site overhead is a single int compare at the bottom of compile_poly_method_call plus three unconditional arm_count increments inside the existing per-arm match branches. Verification: make bootstrap (round 2 vs round 3 byte-identical on analyze.rb and codegen.rb), make test (588/588), make optcarrot (checksum 59662). Output is byte-identical to a no-stats run when neither flag is passed (verified against the poly_dispatch_args_ret test fixture).
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
spinel_codegenthat record per-mname dispatch stats forcompile_poly_method_call:--stats=csv <path>→ CSV file with columnsmname,call_sites,total_arms,max_arms--stats=json <path>→ JSON array of the same fieldsspinel_codegen_profile.rb(separate file). Invoked asruby spinel_codegen_profile.rb --profile <out.json> [...spinel_codegen args]; forwards everything after--profiletospinel_codegen.rbinside aVernier.profileblock. Kept out ofspinel_codegen.rbitself sorequire 'vernier'never enters the self-host pipeline.Why
Two prior perf PRs each ruled out a hypothesized OOM driver on the large workload —
@out_linesretention (closed as PR 4 research) and per-armarm_arg_strsconcatenation (closed as #99). Both were guesses. We need data, not more guesses.--statsgives the per-mname dispatch shape; Vernier--profilegives Ruby-level flame graphs and allocation sites.Output shape
Triaged by
total_armsdescending — the hottest dispatches come first.CSV header:
JSON:
[{\"mname\":\"...\",\"call_sites\":N,\"total_arms\":N,\"max_arms\":N}, ...]Stats are populated only for poly call sites that reach the main per-class dispatch path. The special-method shortcuts (
nil?,to_s,inspect,to_i,is_a?/kind_of?/instance_of?) bypass the per-class loop and return early; instrumenting those would dilute the signal we care about (per-class arm emission cost), so they are excluded by design.Sample output (test fixture)
```
$ spinel_parse test/poly_dispatch_args_ret.rb /tmp/p.ast
$ spinel_analyze /tmp/p.ast /tmp/p.ir
$ ruby spinel_codegen.rb --stats=csv /tmp/p.csv --stats=json /tmp/p.json /tmp/p.ast /tmp/p.ir /tmp/p.c
/tmp/p.csv
mname,call_sites,total_arms,max_arms
read,4,8,2
/tmp/p.json
[{"mname":"read","call_sites":4,"total_arms":8,"max_arms":2}]
```
(spinel's own
spinel_codegen.rb/spinel_analyze.rbself-compile produces an empty stats file — those workloads' poly calls all use the special-method shortcuts, which is expected. The OOM-target large workload is where these stats become load-bearing for triage.)Verification
make bootstrap(round 2 vs round 3 byte-identical)make testmake optcarrotPerformance overhead when stats are disabled
Self-compile of
spinel_codegen.rb(gen1, CRuby), 3 runs each:About 4-5% — within typical noise on this host. The opt-in flags are checked once per
compile_poly_method_callinvocation; the unconditional cost is three int-increments inside the existing per-arm match branches. Acceptable for a debug-only feature.Vernier wrapper
The wrapper is intentionally minimal:
```
$ ruby spinel_codegen_profile.rb --profile /tmp/spinel.vernier.json --stats=csv /tmp/s.csv build/codegen.ast build/codegen.ir /tmp/out.c
```
If
vernierisn't installed, it prints `gem install vernier` and exits 1 without running codegen. Composes cleanly with--stats=csv/--stats=jsonsince those flags pass through to the wrapped codegen call.Recommended follow-up
Regenerate the large-workload AST + IR artifacts (the previously-investigated workload that OOMs at ~11 GB), run codegen with
--stats=jsonand a Vernier--profile, and use the resulting data to scope the next OOM-targeted PR.🤖 Generated with Claude Code