Perf: replace splice with swap-and-pop in destroyable remove() (runtime) #23

Merged
johanrd merged 2 commits into main from perf/destroyable-remove-swap-pop on Apr 20, 2026

@johanrd johanrd commented Apr 19, 2026

Summary

remove() in packages/@glimmer/destroyable/index.ts is invoked every time a destroyable is unassociated from its parent (during the destroy phase, via removeChildFromParent). The old form did an O(n) element shift on every removal:

let index = collection.indexOf(item);
collection.splice(index, 1);

For a parent with N children being cleared one-by-one, the total cost across the clear is O(N²) in the splice piece alone (the indexOf is also O(n) but the splice shift of the trailing elements is the dominant memory-copy work).
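To make the quadratic term concrete, here is a minimal cost-model sketch. It counts element moves only (not real timings), ignores the indexOf scan that both forms share, and assumes children are removed front to back, which is the worst case for splice. The function names are hypothetical, and the second function anticipates the overwrite-and-truncate form described next.

```typescript
// Hypothetical cost model: counts element moves, not wall-clock time.
function spliceMoves(n: number): number {
  const collection: number[] = Array.from({ length: n }, (_, i) => i);
  let moves = 0;
  // Assumption: items are removed front to back (worst case for splice).
  for (let item = 0; item < n; item++) {
    const index = collection.indexOf(item);
    // splice(index, 1) shifts every element after `index` down by one slot.
    moves += collection.length - 1 - index;
    collection.splice(index, 1);
  }
  return moves;
}

function swapPopMoves(n: number): number {
  const collection: number[] = Array.from({ length: n }, (_, i) => i);
  let moves = 0;
  for (let item = 0; item < n; item++) {
    const index = collection.indexOf(item);
    const lastIndex = collection.length - 1;
    if (index !== lastIndex) {
      collection[index] = collection[lastIndex] as number;
      moves += 1; // at most one element is relocated per removal
    }
    collection.length = lastIndex;
  }
  return moves;
}

console.log(spliceMoves(5000)); // 12497500, i.e. n * (n - 1) / 2
console.log(swapPopMoves(5000)); // never more than n
```

Under these assumptions the splice form moves n(n − 1)/2 elements in total, while the overwrite-and-truncate form moves at most one element per removal.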

New form swaps the removed item with the last element and truncates — O(1) for the splice piece; indexOf is unchanged:

let index = collection.indexOf(item);
let lastIndex = collection.length - 1;
if (index !== lastIndex) {
  collection[index] = collection[lastIndex];
}
collection.length = lastIndex;
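As a standalone illustration of the new form, the sketch below wraps it in a hypothetical `swapRemove` helper. The name, the boolean return value, and the `index === -1` guard are assumptions added for this example; they are not taken from packages/@glimmer/destroyable/index.ts.

```typescript
// Hypothetical standalone helper; not the actual @glimmer/destroyable code.
function swapRemove<T>(collection: T[], item: T): boolean {
  const index = collection.indexOf(item);
  // Assumption: guard against a missing item. Without it, index === -1 would
  // write to collection[-1] and the truncation would still drop the last element.
  if (index === -1) return false;

  const lastIndex = collection.length - 1;
  if (index !== lastIndex) {
    // Overwrite the removed slot with the last element.
    collection[index] = collection[lastIndex] as T;
  }
  // Truncate by one; the former last element now lives at `index`,
  // unless it was itself the item being removed.
  collection.length = lastIndex;
  return true;
}

const children = ["a", "b", "c", "d"];
swapRemove(children, "b");
console.log(children); // [ 'a', 'd', 'c' ]
```

Note that the surviving elements are the same as with splice, but "d" has moved into the slot that "b" occupied.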

Prior art

The same change already appears (alongside other runtime tweaks) in NVP's open upstream PR emberjs/ember.js#21221, "Improve render perf".
This PR is intentionally narrow: just the destroyable change, with a TracerBench 20-fidelity compare attributing the win. I have not yet reproduced concrete wins from the other fixes in emberjs#21221.

Why this is semantics-identical

Collection order is not observable to anything outside the destroyable module. The only consumer is iterate(), and removals never overlap with its iteration:

  • iterate(collection, fn) (lines 50–56) — uses Array.prototype.forEach, which traverses whatever the current in-memory order is. None of its callers (destroy() at line 163 for children / eagerDestructors / destructors / parents) assume a particular order among siblings. Destructors are peers; the destroyable contract says children are destroyed before their parents, but says nothing about sibling ordering.
  • Parent-side removal is batched via scheduleDestroyed (line 180), so a child is never spliced out of a parent collection while the parent is actively iterating it — no iteration-during-mutation corner cases.

The only observable behaviour change between splice and swap-and-pop is the order of the remaining elements after a removal, which — per the above — is not visible to any consumer.
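The batching point above matters because removing from an array while forEach is walking it can skip elements. The sketch below shows the hazard and a deferred variant. `swapRemove` and the `pending` queue are hypothetical stand-ins; how scheduleDestroyed actually batches removals is not shown here.

```typescript
// Hypothetical sketch: `swapRemove` and `pending` are illustrative names,
// not part of the destroyable module's API.
function swapRemove<T>(collection: T[], item: T): void {
  const index = collection.indexOf(item);
  if (index === -1) return;
  const lastIndex = collection.length - 1;
  if (index !== lastIndex) {
    collection[index] = collection[lastIndex] as T;
  }
  collection.length = lastIndex;
}

// Hazard: removing the current element while forEach is still running.
const eager = ["a", "b", "c", "d"];
const visitedEager: string[] = [];
eager.forEach((child) => {
  visitedEager.push(child);
  swapRemove(eager, child);
});

// Deferred: record removals during iteration and apply them afterwards.
const deferred = ["a", "b", "c", "d"];
const visitedDeferred: string[] = [];
const pending: string[] = [];
deferred.forEach((child) => {
  visitedDeferred.push(child);
  pending.push(child);
});
pending.forEach((child) => swapRemove(deferred, child));

console.log(visitedEager); // [ 'a', 'b' ]  ("c" and "d" were never visited)
console.log(visitedDeferred); // [ 'a', 'b', 'c', 'd' ]
```

With removal deferred until iteration has finished, every element is visited exactly once regardless of how the removal reorders the array.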

Before / after

TracerBench 20-fidelity compare (Krausest-style benchmark-app, origin/main vs this branch), M1 Max / Node 24.14 / Chrome Canary:

| phase | main p50 | branch p50 | Δ ms | Δ % | 90% CI |
|---|---:|---:|---:|---:|---|
| clearManyItems1End | 202.7 ms | 159.4 ms | −43 ms | −21.3 % | [−47, −40] ms |
| clearManyItems2End | 106.0 ms | 63.3 ms | −43 ms | −40.0 % | [−47, −35] ms |
| duration (total) | 2362.1 ms | 2253.9 ms | −108 ms | −4.6 % | [−144, −47] ms |
| render5000Items2End | 666.5 ms | 649.1 ms | −17 ms | −2.6 % | no diff (CI crosses 0) |
| renderEnd | 23.4 ms | 24.6 ms | +1 ms | | no diff |
| render5000Items1End | 651.9 ms | 652.7 ms | +1 ms | | no diff |
| render1000Items1End | 71.4 ms | 70.7 ms | −1 ms | | no diff |
| clearItems1End | 10.5 ms | 10.2 ms | 0 ms | | no diff |
| render1000Items2End | 64.4 ms | 65.2 ms | +1 ms | | no diff |
| clearItems2End | 109.0 ms | 108.2 ms | −1 ms | −0.6 % | (small) |
| render1000Items3End | 50.5 ms | 49.7 ms | −1 ms | | no diff |
| append1000Items1End | 70.1 ms | 70.7 ms | +1 ms | | no diff |
| append1000Items2End | 64.4 ms | 65.7 ms | +1 ms | | no diff |
| updateEvery10thItem1End | 53.0 ms | 52.6 ms | 0 ms | | no diff |
| updateEvery10thItem2End | 54.3 ms | 54.4 ms | 0 ms | | no diff |
| selectFirstRow1End | 19.0 ms | 18.7 ms | 0 ms | | no diff |
| selectSecondRow1End | 13.8 ms | 13.5 ms | 0 ms | | no diff |
| removeFirstRow1End | 35.9 ms | 36.3 ms | 0 ms | | no diff |
| removeSecondRow1End | 35.1 ms | 34.5 ms | −1 ms | | no diff |
| swapRows1End | 25.7 ms | 25.8 ms | 0 ms | | no diff |
| swapRows2End | 25.7 ms | 24.7 ms | −1 ms | | no diff |
| clearItems4End | 20.7 ms | 20.6 ms | 0 ms | | no diff |
| paint | 4.3 ms | 2.9 ms | −1 ms | | no diff |

The wins are concentrated in the two clear-5000-items phases (clearManyItems1End and clearManyItems2End), exactly where the old splice shift was quadratic. No regression in any of the 20 measured phases. Overall duration is −108 ms / −4.6 % median (CI [−144, −47] ms).

These two phases go from ~203 ms → ~159 ms and ~106 ms → ~63 ms respectively; together they account for roughly 86 ms of the 108 ms duration delta.

Reproducing the benchmark

Commands are meant to be run from the ember.js repo root. Requires pnpm and Chrome Canary installed (TracerBench launches Canary by default).

# `pnpm bench` clones origin/main into tracerbench-testing/ember-source-control,
# builds both sides to tarballs, installs them into tracerbench-testing/{control,experiment}
# app copies of smoke-tests/benchmark-app, and runs tracerbench compare with
# 20-fidelity (20 Chrome sessions per side, throttle=1, regression threshold=25%).
pnpm bench

# Pass --force to nuke the cached control tarball and rebuild both sides from scratch.
# Useful when comparing against a fresh main.
pnpm bench -- --force

Outputs land in tracerbench-testing/experiment/tracerbench-results/:

  • compare.json — raw per-sample timings (control/experiment groups, each with samples[*].phases[*].duration in microseconds)
  • artifact-1.pdf — rendered report with sparklines and CI tables
  • artifact-1.html — same, interactive

To extract the absolute phase medians in the format used above:

node -e '
const j = JSON.parse(require("fs").readFileSync("tracerbench-testing/experiment/tracerbench-results/compare.json","utf8"));
const [control, experiment] = ["control", "experiment"].map(g => j.find(x => x.group === g));
const median = a => { const s = [...a].sort((x,y)=>x-y); const m = Math.floor(s.length/2); return s.length%2 ? s[m] : (s[m-1]+s[m])/2; };
const phases = control.samples[0].phases.map(p => p.phase);
const get = (samples, name) => samples.map(s => s.phases.find(p => p.phase === name).duration / 1000);
console.log("phase".padEnd(26) + "main".padStart(10) + "branch".padStart(12) + "delta".padStart(10));
for (const n of phases) {
  const c = median(get(control.samples, n));
  const e = median(get(experiment.samples, n));
  console.log(n.padEnd(26) + c.toFixed(1).padStart(10) + e.toFixed(1).padStart(12) + (e-c).toFixed(1).padStart(10));
}'

Runtime: ~10–15 min including both tarball builds and 40 Chrome sessions.

Test plan

  • pnpm test green
  • pnpm lint green
  • No behaviour change — existing destroyable tests exercise remove through removeChildFromParent thoroughly (association / unassociation / destroy flows).

`remove()` in `@glimmer/destroyable/index.ts:58` is invoked every time
a destroyable is unassociated from its parent (happens during the
destroy phase, from `removeChildFromParent`). Its previous form:

    let index = collection.indexOf(item);
    collection.splice(index, 1);

does an O(n) shift of all elements after `index`. For a parent with
thousands of children being cleared, the total cost is O(n²).

Swap-with-last-then-pop is O(1) for the splice piece (the indexOf
lookup above it is unchanged):

    let index = collection.indexOf(item);
    let lastIndex = collection.length - 1;
    if (index !== lastIndex) {
      collection[index] = collection[lastIndex] as T;
    }
    collection.pop();

Order is not observable. The only consumer of the collection is
`iterate()` (uses `.forEach` for destructors/parents/children); none
of the callers — destructor firing, parent/child propagation —
assume sibling order. Parent-side removal is also batched via
`scheduleDestroyed`, so a child is never removed from the collection
while the parent is iterating it.

Measured with tracerbench (20-fidelity compare vs origin/main):

| phase                | Δ ms        | Δ %      |
|----------------------|------------:|---------:|
| clearManyItems1End   | **-43 ms**  | **-21.3 %** |
| clearManyItems2End   | **-40 ms**  | **-39.5 %** |
| render1000Items1End  |   -2 ms     |  -2.9 %  |
| (all other 17 phases)| no diff     |          |

90% CIs for the two clear-5000 phases: [-46, -40] ms and [-46, -36] ms
respectively — well outside the regression/noise threshold tracerbench
uses. No regressions on any measured phase.

github-actions bot commented Apr 19, 2026

📊 Package size report   0%↑

| File | Before (Size / Brotli) | After (Size / Brotli) |
|---|---|---|
| dist/dev/packages/@glimmer/destroyable/index.js | 6.3 kB / 1.4 kB | 2%↑ 6.4 kB / 3%↑ 1.5 kB |
| dist/prod/packages/@glimmer/destroyable/index.js | 3.9 kB / 852 B | 3%↑ 4 kB / 5%↑ 898 B |
| Total (Includes all files) | 5.4 MB / 1.3 MB | 0%↑ 5.4 MB / 0.01%↑ 1.3 MB |
| Tarball size | 1.2 MB | 0.01%↑ 1.2 MB |

🤖 This report was automatically generated by pkg-size-action

Replaces `collection.pop()` with `collection.length = lastIndex`, which
skips the unused return-value read from pop(). Semantically identical
(both truncate the array by one; the packed-array fast path is
preserved in either form). Also byte-identical to the same change in
NVP's upstream emberjs#21221, simplifying side-by-side review.
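
The equivalence claimed above can be checked with a small sketch. The array contents and variable names are illustrative only.

```typescript
// Truncating by one element, two ways.
const viaPop = [1, 2, 3];
viaPop.pop(); // returns 3, which a caller like remove() would ignore

const viaLength = [1, 2, 3];
viaLength.length = viaLength.length - 1; // truncates without producing a value

console.log(viaPop, viaLength); // [ 1, 2 ] [ 1, 2 ]
```

One caveat for standalone use: the two forms differ on an empty array. pop() is a no-op there, while assigning a length of −1 throws a RangeError. The equivalence therefore assumes the collection is non-empty when the truncation runs; whether remove() guarantees that is not shown in this PR description.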
@johanrd johanrd merged commit 2297ebc into main Apr 20, 2026
52 checks passed