chore(B7): defer plan; tuple cost crossover makes large tables a regression#231

Merged
davydog187 merged 1 commit into main from chore/defer-b7
May 22, 2026
@davydog187
Contributor

Defer B7: tuple cost crossover makes large tables a regression

Updates .agents/plans/B7-table-array-hash-split.md from ready to deferred, adding the measurement data that drove the decision to close PR #229 without merging.

Why we closed #229

#229 implemented the full plan correctly (all 1692 tests + 29 lua53 suite tests passed). But when we re-measured under the bench harness from #230 with multi-n inputs, the picture changed:

| workload @ n | main | B7 (#229) | delta |
|---|---|---|---|
| Build n=100 | 17.09 µs | 14.03 µs | -18% |
| Build n=1000 | 197.96 µs | 265.82 µs | +34% ⚠️ |
| Sort n=100 | 34.91 µs | 27.57 µs | -21% |
| Sort n=1000 | 490.49 µs | 655.72 µs | +34% ⚠️ |
| Iterate n=100 | 24.59 µs | 21.11 µs | -14% |
| Iterate n=1000 | 276.74 µs | 358.64 µs | +30% ⚠️ |
| Map+Red n=100 | 49.79 µs | 42.78 µs | -14% |
| Map+Red n=1000 | 603.93 µs | 843.57 µs | +40% ⚠️ |

Memory regressed 3-5× at n=1000 (Sort: 2.08 MB → 12.40 MB).

The crossover is structural: BEAM tuples are immutable, so every setelement/3 on a 1024-cell tuple copies the whole tuple. PUC-Lua avoids this with in-place C mutation; we can't. Small n wins because setelement/3's constant-factor advantage over Map.put matters; large n loses because copying dominates.
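
For readers who want to poke at the copying effect directly, here is a minimal sketch, not the bench harness from #230. It builds the same 1..n sequence once in a preallocated tuple and once in a map. The module and function names are hypothetical, and the timings are a rough single-run probe, so treat any numbers it prints as indicative only.

```elixir
defmodule CrossoverSketch do
  # Hypothetical module for illustration; not part of the library.

  # Build 1..n in a preallocated tuple, one put_elem/3 per write.
  # put_elem/3 is the zero-based Elixir wrapper around :erlang.setelement/3,
  # and each call here generally returns a fresh copy of the whole tuple.
  def build_tuple(n) do
    empty = Tuple.duplicate(nil, n)
    Enum.reduce(1..n, empty, fn i, t -> put_elem(t, i - 1, i) end)
  end

  # Build the same sequence with Map.put/3. Maps above 32 keys are stored
  # as a hash trie, so a write rebuilds only the path to the changed key.
  def build_map(n) do
    Enum.reduce(1..n, %{}, fn i, m -> Map.put(m, i, i) end)
  end

  # Wall-clock microseconds for a single run of fun.
  def time_us(fun) do
    {us, _result} = :timer.tc(fun)
    us
  end
end

for n <- [100, 1000] do
  tuple_us = CrossoverSketch.time_us(fn -> CrossoverSketch.build_tuple(n) end)
  map_us = CrossoverSketch.time_us(fn -> CrossoverSketch.build_map(n) end)
  IO.puts("n=#{n} tuple=#{tuple_us}µs map=#{map_us}µs")
end
```

The tuple build does n writes that each touch n cells, so its cost grows roughly quadratically with n, while the map build grows closer to n log n. That is the shape behind the crossover in the table above.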

The single n=500 measurement that originally motivated B7 was right at the crossover. Multi-n in #230 was what made the pattern visible.

Conditions for reconsidering

A future plan could revisit this with threshold-based promotion: keep contiguous integer keys in the hash map until array_len reaches some boundary (e.g. 256), then promote. That would preserve the small-table wins (-14% to -21%) without taking the large-table hit. The plan file documents what such a plan would need to address.
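
The promotion mechanics described above could look roughly like the following. This is a sketch under stated assumptions, not the design in the plan file: the struct, its field names other than array_len, and the function names are hypothetical, the 256 boundary is just the example value from the text, and nil assignment (deletion) and border re-scanning are left out entirely.

```elixir
defmodule ThresholdTableSketch do
  # Hypothetical table shape for illustration; not the project's struct.
  @promote_at 256

  defstruct data: %{}, array: nil, array_len: 0

  def new, do: %__MODULE__{}

  # Before promotion: appending at the contiguous border stays in the map
  # and bumps array_len, then checks whether the boundary was reached.
  def put(%__MODULE__{array: nil, array_len: len} = t, key, value)
      when is_integer(key) and key == len + 1 do
    maybe_promote(%{t | data: Map.put(t.data, key, value), array_len: len + 1})
  end

  # After promotion: overwrite a slot inside the array part.
  def put(%__MODULE__{array: arr, array_len: len} = t, key, value)
      when is_tuple(arr) and is_integer(key) and key >= 1 and key <= len do
    %{t | array: put_elem(arr, key - 1, value)}
  end

  # After promotion: appending at the border grows the tuple by one cell.
  def put(%__MODULE__{array: arr, array_len: len} = t, key, value)
      when is_tuple(arr) and is_integer(key) and key == len + 1 do
    %{t | array: Tuple.insert_at(arr, len, value), array_len: len + 1}
  end

  # Everything else (non-integer keys, sparse integers) lives in the map.
  def put(%__MODULE__{} = t, key, value) do
    %{t | data: Map.put(t.data, key, value)}
  end

  def get(%__MODULE__{array: arr, array_len: len}, key)
      when is_tuple(arr) and is_integer(key) and key >= 1 and key <= len do
    elem(arr, key - 1)
  end

  def get(%__MODULE__{data: data}, key), do: Map.get(data, key)

  # Move keys 1..array_len out of the map into a tuple once the boundary
  # is reached; below the boundary the table is left untouched.
  defp maybe_promote(%__MODULE__{array_len: len} = t) when len >= @promote_at do
    keys = Enum.to_list(1..len)
    array = keys |> Enum.map(&Map.fetch!(t.data, &1)) |> List.to_tuple()
    %{t | array: array, data: Map.drop(t.data, keys)}
  end

  defp maybe_promote(t), do: t
end
```

The sketch only shows the representation switch. Writes into the promoted tuple still copy it, so the boundary value, and which workloads end up on which side of it, would need to be settled by multi-n measurement.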

Until that plan exists and ships, table workloads have to look elsewhere for durable wins, most likely B5 (compile prototypes to Erlang), which attacks setelement/3 register-write cost directly.

Changes

 .agents/plans/B7-table-array-hash-split.md | 73 +++++++++++++++++++++++++++--
 1 file changed, 70 insertions(+), 3 deletions(-)

Only the plan file changes; no library code touched.

Verification

mix format
mix compile --warnings-as-errors
mix test

(All trivially pass; this PR only edits a plan file.)

…ession

PR #229 implemented the full plan and passed correctness, but multi-n
measurement on the bench harness from PR #230 revealed that wins at
n<=100 (-14% to -21%) are accompanied by losses at n=1000 (+30% to
+40%), plus 3-5x memory regression at the same scale.

The crossover is structural: setelement/3 on a 1024-cell tuple copies
the whole tuple every write, vs Map.put's amortized log-time tree
allocation. PUC-Lua avoids this with in-place C mutation; we can't.

A future plan could revisit this with threshold-based promotion (stay
in the data map until array_len reaches some boundary, then promote).
That would preserve the small-table wins without the large-table hit.
Until that plan exists, table workloads have to look elsewhere for
durable wins (likely B5: compile prototypes to Erlang).

Plan file updated with full measurement data and the conditions under
which the work could be reopened. PR #229 closed without merging.
@davydog187 davydog187 merged commit bc69a2e into main May 22, 2026
4 checks passed
@davydog187 davydog187 deleted the chore/defer-b7 branch May 22, 2026 01:19