From c81a64491d1b99d2016605af8d75ad4edea3fec6 Mon Sep 17 00:00:00 2001
From: Dave Lucia <davelucianyc@gmail.com>
Date: Thu, 21 May 2026 18:09:47 -0700
Subject: [PATCH] chore(B7): defer plan; tuple cost crossover makes large
 tables a regression

PR #229 implemented the full plan and passed correctness, but multi-n
measurement on the bench harness from PR #230 showed that the wins at
n=100 (-14% to -21%) come with losses at n=1000 (+30% to +40%), plus
a 3-5x memory regression at the same scale.

The crossover is structural: setelement/3 on a 1024-cell tuple copies
the whole tuple on every write, whereas Map.put allocates only a
log-depth path through the map's tree. PUC-Lua avoids this with
in-place C mutation; we can't.
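
A minimal sketch of that cost difference, for anyone who wants to
reproduce the shape of the curve outside the VM. The module and
function names are hypothetical and are not part of #229 or the bench
harness; absolute timings will vary by machine and OTP version.

```elixir
# Illustrative only: CopyCostSketch is not code from this repository.
defmodule CopyCostSketch do
  # Fill a preallocated n-cell tuple one write at a time (n >= 1).
  # put_elem/3 is setelement/3 underneath; each call may copy all
  # n cells, so the whole fill tends toward O(n^2) work.
  def fill_tuple(n) do
    Enum.reduce(0..(n - 1), :erlang.make_tuple(n, nil), fn i, acc ->
      put_elem(acc, i, i)
    end)
  end

  # Fill a map with the same integer keys (n >= 1).
  def fill_map(n) do
    Enum.reduce(0..(n - 1), %{}, fn i, acc -> Map.put(acc, i, i) end)
  end

  # Wall-clock time of a zero-arity fun, in microseconds.
  def time_us(fun) do
    {us, _result} = :timer.tc(fun)
    us
  end
end

for n <- [100, 1_000, 10_000] do
  t = CopyCostSketch.time_us(fn -> CopyCostSketch.fill_tuple(n) end)
  m = CopyCostSketch.time_us(fn -> CopyCostSketch.fill_map(n) end)
  IO.puts("n=#{n} tuple=#{t}us map=#{m}us")
end
```

This isolates only the write path; it does not model the VM's
exponential capacity growth or the read helpers.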

A future plan could revisit this with threshold-based promotion (stay
in the data map until array_len reaches some boundary, then promote).
That would preserve the small-table wins without the large-table hit.
Until that plan exists, durable wins on table workloads have to come
from elsewhere (likely B5: compile prototypes to Erlang).

Plan file updated with full measurement data and the conditions under
which the work could be reopened. PR #229 closed without merging.
---
 .agents/plans/B7-table-array-hash-split.md | 73 +++++++++++++++++++++-
 1 file changed, 70 insertions(+), 3 deletions(-)

diff --git a/.agents/plans/B7-table-array-hash-split.md b/.agents/plans/B7-table-array-hash-split.md
index 3a9c92d..9c9e0c2 100644
--- a/.agents/plans/B7-table-array-hash-split.md
+++ b/.agents/plans/B7-table-array-hash-split.md
@@ -2,10 +2,10 @@
 id: B7
 title: Split table storage into array + hash parts
 issue: null
-pr: null
+pr: 229
 branch: perf/table-array-hash-split
 base: main
-status: ready
+status: deferred
 direction: B
 unlocks:
   - O(1) `t[#t + 1] = x` (supersedes A10b)
@@ -250,4 +250,71 @@ IO.puts("delta=#{after_mem - before_mem}B")
 
 ## Discoveries
 
-(populated during implementation)
+Implemented in PR #229; closed unmerged after multi-n measurement
+(enabled by the bench harness in PR #230) revealed a hard crossover.
+
+### What landed in #229
+
+- `Lua.VM.Table` gained `array :: tuple()`, `array_len :: non_neg_integer()`,
+  and `array_has_holes :: boolean()` fields.
+- Reads go through new `Table.get/2`, `Table.has?/2`, `Table.length/1`,
+  `Table.to_map/1`, `Table.keys/1` helpers that consult both parts.
+- Integer-keyed writes route through `put_integer/3` with exponential
+  capacity growth (doubling, floor 4) so sequential `t[i] = ...` is
+  amortized O(1).
+- Every site that previously read `table.data` for an integer key was
+  migrated to the new helpers (executor, stdlib, lua.ex, display).
+
+### Why we closed it
+
+Full-mode benchmarks on the merged bench harness (#230) showed:
+
+| workload @ n  | main      | B7        | delta   |
+|---------------|-----------|-----------|---------|
+| Build n=100   | 17.09 µs  | 14.03 µs  | -18%    |
+| Build n=1000  | 197.96 µs | 265.82 µs | **+34%** |
+| Sort n=100    | 34.91 µs  | 27.57 µs  | -21%    |
+| Sort n=1000   | 490.49 µs | 655.72 µs | **+34%** |
+| Iterate n=100 | 24.59 µs  | 21.11 µs  | -14%    |
+| Iterate n=1000| 276.74 µs | 358.64 µs | **+30%** |
+| Map+Red n=100 | 49.79 µs  | 42.78 µs  | -14%    |
+| Map+Red n=1000| 603.93 µs | 843.57 µs | **+40%** |
+
+Memory regressed 3-5× at n=1000 (e.g. Sort 2.08 MB → 12.40 MB).
+
+The crossover is fundamental: exponential-growth tuples win over
+`Map.put` at small n (where `setelement/3`'s constant-factor advantage
+matters), but lose at large n (where every `setelement/3` copies the
+whole tuple). PUC-Lua avoids this with in-place mutation in C; the
+BEAM cannot.
+
+The single n=500 number that originally motivated B7 was right at
+the crossover, which explains the inconsistent run-to-run results
+before #230 enabled multi-n measurement.
+
+### Conditions for reconsidering
+
+A future plan could revisit this with **threshold-based promotion**:
+keep contiguous integer keys in the hash map until `array_len` reaches
+some threshold (e.g. 256), then promote. That preserves the small-table
+wins (-14% to -21%) without taking the large-table hit. If we open such
+a plan, it should:
+
+- Decide the promotion threshold empirically (the n where `setelement/3`
+  cost crosses `Map.put` cost on the target hardware).
+- Account for memory: even at threshold, the tuple still allocates
+  more than the equivalent map at the same size.
+- Keep the helpers (`get/2`, `has?/2`, `length/1`, `to_map/1`,
+  `keys/1`) for reuse; they are the call-site contract.
+
+Until that plan is written and shipped, the durable wins on table
+workloads have to come from elsewhere, e.g. by attacking the
+`setelement/3` register-write cost via Erlang codegen (B5) or by
+reducing the number of writes per opcode.
+
+### Suite/test impact
+
+All 1692 tests + 29 lua53 suite tests passed on the B7 branch, so the
+correctness work (helpers, nil-as-hole semantics, dead-key iteration)
+is sound. None of that ships with this deferral, but the patterns
+proved out and would be reusable in a future threshold-based attempt.