Perf: drop First/Last wrappers in AppendingBlockImpl #27

Draft
johanrd wants to merge 3 commits into main from perf/element-builder-no-first-last-wrappers

johanrd commented Apr 20, 2026

Exploration PR — change from NVP's emberjs/ember.js#21221.

Drop First/Last wrapper classes in AppendingBlockImpl; store raw SimpleNode | Bounds + discriminator booleans instead.

Isolated microbench

bin/first-last-wrappers.bench.mjs:

| scenario | old | new | delta |
| --- | --- | --- | --- |
| didAppendNode, 5000 nodes (Krausest shallow) | 20.75 µs | 3.70 µs | −82 % |
| didAppendNode, 20000 nodes (deep template) | 68.03 µs | 19.69 µs | −71 % |

Allocation: old 12–850 kB/iter → new ~0 B (V8 fully elides the wrapper allocs in the new form; they were real escapes in the old).
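
For readers who want to reproduce the general shape of this measurement, here is a rough sketch of an isolated timing loop. It is not the contents of bin/first-last-wrappers.bench.mjs: `Appender`, `WrappedAppender`, `RawAppender`, and `medianAppendMicros` are hypothetical names, the node type is a stand-in for `SimpleNode`, and the loop measures wall-clock time only, not allocation.

```typescript
// Hypothetical node type; the real code uses Glimmer's SimpleNode.
interface SimpleNode {
  readonly nodeName: string;
}

interface Appender {
  didAppendNode(node: SimpleNode): void;
}

// Wrapper-allocating variant: one small object per append.
class Last {
  node: SimpleNode;
  constructor(node: SimpleNode) {
    this.node = node;
  }
}

class WrappedAppender implements Appender {
  last: Last | null = null;
  didAppendNode(node: SimpleNode): void {
    this.last = new Last(node);
  }
}

// Raw-reference variant: no allocation on append.
class RawAppender implements Appender {
  last: SimpleNode | null = null;
  didAppendNode(node: SimpleNode): void {
    this.last = node;
  }
}

// Median wall-clock time, in microseconds, to append every node once.
function medianAppendMicros(
  make: () => Appender,
  input: SimpleNode[],
  iterations: number
): number {
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const appender = make();
    const start = performance.now();
    for (let j = 0; j < input.length; j++) {
      appender.didAppendNode(input[j]);
    }
    samples.push((performance.now() - start) * 1000); // ms to µs
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

const nodes: SimpleNode[] = [];
for (let i = 0; i < 5000; i++) {
  nodes.push({ nodeName: 'n' + i });
}

console.log('wrapped µs:', medianAppendMicros(() => new WrappedAppender(), nodes, 200));
console.log('raw µs:', medianAppendMicros(() => new RawAppender(), nodes, 200));
```

Absolute numbers from a loop like this depend heavily on the engine, warm-up, and timer resolution, so they should not be expected to match the table above.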

Krausest (tracerbench)

| fidelity | duration CI | verdict |
| --- | --- | --- |
| 20-fid | [−13, +39] ms | no diff |
| 80-fid | [−10, +26] ms | no diff |

All 23 phases show no difference at both fidelities. Saved to .bench/b3-firstlast-wrappers.json and .bench/c-firstlast-80fid.json.

Why the gap between micro and Krausest

Isolated: −82 % on 5000 appends. Krausest: null.

Krausest's per-render wrapper cost is ~17 µs (the microbench delta, 20.75 µs − 3.70 µs, i.e. 5000 rows × ~3.4 ns/row), which is roughly 0.0008 % of the ~2200 ms bench total, far below the tracerbench noise floor. The mechanism is real; Krausest just doesn't exercise enough allocation pressure at the per-render level to see it. A long-running app or a deeper template might.

Verdict

Microbench-positive, Krausest null. Real isolated win, no end-to-end impact at Krausest scale. Whether it's worth merging is a judgment call about allocation pressure in real apps vs the small-but-nonzero risk of hidden-class shape change (new form has 4 fields + 2 booleans vs old's 2 wrapper fields). Closing as documented null at Krausest scale.

…+ discriminator

AppendingBlockImpl was wrapping every appended SimpleNode in a one-field
First/Last wrapper class before storing on the instance. Each wrapper
escapes (stored in this.first / this.last), so V8 can't elide the
allocation. For a 5000-row render, didAppendNode runs at least once per
row (often more for multi-node elements), allocating a wrapper each
time — meaningful GC pressure for a no-op indirection.

Store the raw SimpleNode (or Bounds) directly, with a boolean
discriminator per slot (_firstIsBounds / _lastIsBounds) that firstNode()
and lastNode() check before returning. Saves one allocation per append
without changing the interface exposed to callers.
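
The before/after shapes described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the actual diff: `SimpleNode` and `Bounds` are minimal stand-ins for the Glimmer types, `WrappedBlock` and `RawBlock` are hypothetical names for the old and new forms of AppendingBlockImpl, and details of the real class (such as nesting tracking) are omitted.

```typescript
// Hypothetical stand-ins for Glimmer's types; only the shapes matter here.
interface SimpleNode {
  readonly nodeName: string;
}
interface Bounds {
  firstNode(): SimpleNode;
  lastNode(): SimpleNode;
}

// Old form: one-field wrappers that make a single node answer firstNode()/lastNode().
class First {
  node: SimpleNode;
  constructor(node: SimpleNode) {
    this.node = node;
  }
  firstNode(): SimpleNode {
    return this.node;
  }
}
class Last {
  node: SimpleNode;
  constructor(node: SimpleNode) {
    this.node = node;
  }
  lastNode(): SimpleNode {
    return this.node;
  }
}

class WrappedBlock {
  first: First | Bounds | null = null;
  last: Last | Bounds | null = null;

  didAppendNode(node: SimpleNode): void {
    if (this.first === null) this.first = new First(node);
    this.last = new Last(node); // stored in a field, so it escapes: one allocation per append
  }
  didAppendBounds(bounds: Bounds): void {
    if (this.first === null) this.first = bounds;
    this.last = bounds;
  }
  firstNode(): SimpleNode {
    if (this.first === null) throw new Error('empty block');
    return this.first.firstNode();
  }
  lastNode(): SimpleNode {
    if (this.last === null) throw new Error('empty block');
    return this.last.lastNode();
  }
}

// New form: raw references plus one boolean discriminator per slot.
class RawBlock {
  first: SimpleNode | Bounds | null = null;
  last: SimpleNode | Bounds | null = null;
  _firstIsBounds = false;
  _lastIsBounds = false;

  didAppendNode(node: SimpleNode): void {
    if (this.first === null) {
      this.first = node;
      this._firstIsBounds = false;
    }
    this.last = node; // no wrapper, no allocation
    this._lastIsBounds = false;
  }
  didAppendBounds(bounds: Bounds): void {
    if (this.first === null) {
      this.first = bounds;
      this._firstIsBounds = true;
    }
    this.last = bounds;
    this._lastIsBounds = true;
  }
  firstNode(): SimpleNode {
    if (this.first === null) throw new Error('empty block');
    return this._firstIsBounds
      ? (this.first as Bounds).firstNode()
      : (this.first as SimpleNode);
  }
  lastNode(): SimpleNode {
    if (this.last === null) throw new Error('empty block');
    return this._lastIsBounds
      ? (this.last as Bounds).lastNode()
      : (this.last as SimpleNode);
  }
}
```

In this sketch both classes return the same nodes from `firstNode()` and `lastNode()` for the same sequence of appends; the only difference is that `RawBlock` trades the per-append wrapper allocation for a boolean check on read.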

github-actions bot commented Apr 20, 2026

📊 Package size report   0.01%↑

| File | Before (Size / Brotli) | After (Size / Brotli) |
| --- | --- | --- |
| dist/dev/packages/shared-chunks/element-builder-6Dadoqnp.js | 11.7 kB / 2.6 kB | 2%↑ 12 kB / 3%↑ 2.7 kB |
| dist/prod/packages/shared-chunks/element-builder-XNkKho_5.js | 22.8 kB / 5.1 kB | 1%↑ 23 kB / 1%↑ 5.2 kB |
| dist/prod/packages/shared-chunks/index-Cc8WmrB-.js | 59.5 kB / 12 kB | 189%↑ 172 kB / 220%↑ 38.5 kB |
| types/stable/@glimmer/runtime/lib/vm/element-builder.d.ts | 6.3 kB / 1.3 kB | 2%↑ 6.4 kB / 0.9%↑ 1.3 kB |
| Total (Includes all files) | 5.4 MB / 1.3 MB | 0.01%↑ 5.4 MB / 0.01%↑ 1.3 MB |
| Tarball size | 1.2 MB | 0.02%↑ 1.2 MB |

🤖 This report was automatically generated by pkg-size-action
