Skip to content

interpreter: RC-aware Value::List with self-rebind append peephole#273

Merged
danieljohnmorris merged 3 commits into
mainfrom
fix/phase2b-list
May 14, 2026
Merged

interpreter: RC-aware Value::List with self-rebind append peephole#273
danieljohnmorris merged 3 commits into
mainfrom
fix/phase2b-list

Conversation

@danieljohnmorris
Copy link
Copy Markdown
Collaborator

Summary

Phase 2b.2 of the RC-aware mutation rollout. Phase 2b.1 (#261) shipped the same shape for Value::Map; this PR does it for Value::List.

The tree-walker's list accumulator (xs=[];@i 0..n{xs=+=xs i}) was O(n²) because every += cloned the whole Vec<Value>. After this PR it's O(n) amortised: cloning a list is now a refcount bump on Arc<Vec<Value>>, and the new eval_stmt peephole takes the env binding out before evaluating the RHS so Arc::make_mut mutates the inner Vec in place.

Repro before/after

build n:n>n;xs=[];@i 0..n{xs=+=xs i};len xs

Pre-fix on tree: 5k iterations took ~10s (12.5M Value clones across 5000 Vec rebuilds). Post-fix: under a second. The Cranelift / VM backends already had the equivalent fast path landed in #249 / #250.

What's in the diff

Three commits, each one coherent change so review can be done independently:

  1. interpreter: switch Value::List to Arc<Vec<Value>> with RC=1 in-place append - mechanical representation change across the interpreter, VM HeapObj boundary, CLI parse_cli_arg, JSON deserialisation, and all unit-test constructors. Builtins that mutate their result on the way out switch to (**items).clone() to materialise an owned Vec before re-wrapping. Iteration sites that previously moved values out of the Vec switch to iter().cloned().

  2. interpreter: eval_stmt peephole for self-rebind list append and concat - the two new peepholes, plus an expr_refers_to helper that walks the rhs and bails out when it references the same binding (the trap PR fix: prevent + non-rebind aliasing on string concat (VM + Cranelift) #260 hit on string self-concat). Numeric + falls back through eval_binop so non-list paths are unaffected.

  3. test + example: tree-walker list accumulator regression coverage - 11 new tests covering self-rebind correctness, non-rebind aliasing, list-concat empty/non-empty rhs, self-aliased rhs (xs = +xs xs), the numeric-add fallback, and a 5k-element scale test with a 5s ceiling. Plus examples/list-accumulator-tree.ilo so the engine harness runs the same shapes cross-engine.

Test plan

  • cargo build --release --features cranelift clean
  • cargo test --release --features cranelift - 4702 passed, 0 failed (added 11 new tests)
  • cargo clippy --release --features cranelift --all-targets -- -D warnings clean
  • cargo fmt clean
  • Phase 2b.1's mset accumulator tests still pass (no regression to the Map peephole)
  • String self-concat aliasing tests pass (no regression to PR fix: prevent + non-rebind aliasing on string concat (VM + Cranelift) #260)
  • examples_engines harness runs the new example across tree, VM, and Cranelift

Follow-ups

  • Phase 2b.3 (Text): same pattern for Value::Text(String) -> Value::Text(Arc<String>) with s = +s c peephole. Will land as a separate PR after this one merges.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 86.57718% with 100 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/vm/mod.rs 73.81% 72 Missing ⚠️
src/interpreter/mod.rs 92.53% 28 Missing ⚠️

📢 Thoughts on this report? Let us know!

… append

Phase 2b.2 of the RC-aware mutation rollout. Phase 2b.1 (PR #261) did the
same for Value::Map; this commit does the equivalent for Value::List.

Wrap the inner Vec<Value> in Arc so cloning a Value::List is a refcount
bump rather than a full Vec copy. BinOp::Append and BinOp::Add for two
lists construct a fresh Arc, so the general (non-peephole) path stays
correct under aliasing: when the caller still holds a reference, the
clone preserves their list.

Arc rather than Rc because Value crosses async tool-call boundaries
(ToolProvider: Send + Sync), same reasoning as the Map switch. The
atomic-refcount cost is negligible vs the Vec allocations it replaces.

Construction sites across builtins (chars, rev, srt, lst, slc, flat,
map, flt, fld, grp, frq, partition, flatmap, transpose, etc.) and the
VM HeapObj-to-Value conversion boundary are mechanical Arc::new wraps.
Iteration sites that previously moved owned Value out of the inner Vec
switch to iter().cloned() or take the original by reference and clone
per-item. The CLI's parse_cli_arg, the Cranelift JIT test helpers, and
all unit-test constructors get the same Arc::new wrap.

Pattern matches on Value::List in apply_binop and pattern_matches now
auto-deref through the Arc, so call sites that compare lengths or
iterate elements need no further changes. The lst, slc, flat, partition,
grp, frq builtins that mutate their result on the way out switch to
(**items).clone() to materialise an owned Vec, then re-wrap with
Arc::new.

The eval_stmt self-rebind peephole that actually exercises the RC=1
fast path lands in the next commit; this commit is intentionally
representation-only so the diff is easy to review and stays bisectable.
Add two peepholes in eval_stmt mirroring the mset peephole from
Phase 2b.1 (PR #261):

  xs = +=xs v   -- BinOp::Append self-rebind
  xs = xs + ys  -- BinOp::Add self-rebind, both sides lists

Both take the env's binding out before evaluating the RHS, so the
Arc<Vec<Value>> in `prev` reaches the fast path with refcount=1.
`Arc::make_mut` then mutates the inner Vec in place (push for append,
extend for concat) instead of cloning the whole Vec on every iteration.
The classic accumulator `xs=[];@i 0..n{xs=+=xs i}` becomes O(n)
amortised rather than O(n²) clones-per-push.

Aliased RHS bails out. `expr_refers_to(name, rhs)` walks the rhs tree
and returns true if any Ref(name) appears: in that case env.take()
would replace the binding with Nil before the rhs reads it, so we skip
the peephole and let apply_binop do the cloning copy. This is the same
trap PR #260 hit on string self-concat (`s = s + s`). Match arms are
treated conservatively as 'might refer' since their bodies are
Vec<Spanned<Stmt>> and a full walk isn't worth the complexity for the
rare case.

The concat peephole gracefully falls back to eval_binop when the
values aren't both lists, so numeric `x = x + 1` (when the literal RHS
doesn't reference x) still produces the right answer via the general
arithmetic path.

Error semantics match the mset peephole: if rhs evaluation errors, env
is left holding Value::Nil. ilo has no catch/recover form so the error
propagates to the function boundary and user code never observes the
intermediate state.
Pin the Phase 2b.2 contract on the tree engine:

  * Self-rebind `xs = +=xs v` with numbers, text, and inside a foreach
    loop. Exercises the eval_stmt append peephole on every assignment.
  * Non-rebind shape (`ys = +=xs v`, different name) leaves the caller's
    list unchanged. Confirms the peephole only fires when name matches.
  * Self-rebind list concat with empty and non-empty rhs.
  * Aliased rhs (`xs = +xs xs` self-double) takes the fallback path
    and returns the correctly doubled list. Mirrors the PR #260 trap
    fixed on string self-concat.
  * Numeric `n = n + literal` falls back through eval_binop when the
    values aren't both lists, so non-list `+` paths are unaffected.
  * 5k-element accumulator finishes well under 5s. The old O(n²) tree
    path took 10+s on a recent MacBook; the ceiling here is lenient
    enough not to flake on slow CI while still catching a regression
    to the quadratic shape.

examples/list-accumulator-tree.ilo runs through the engine harness
(tests/examples_engines.rs) on every backend, doubling as cross-engine
coverage and an in-context teaching example for agents writing
accumulator loops.
@danieljohnmorris danieljohnmorris merged commit ee3e145 into main May 14, 2026
4 of 5 checks passed
@danieljohnmorris danieljohnmorris deleted the fix/phase2b-list branch May 14, 2026 16:45
danieljohnmorris added a commit that referenced this pull request May 14, 2026
Phase 2b.3 of the RC-aware mutation rollout. Mirrors the Map switch (PR
#261) and List switch (PR #273): cloning a Value::Text is now a refcount
bump rather than a full byte copy, and the existing self-rebind concat
peephole gains a Text branch in the next commit so the
string-accumulator pattern folds from O(n^2) to O(n) amortised.

This commit is the mechanical rewrite: every Value::Text(String)
construction becomes Value::Text(Arc::new(...)) across the interpreter,
VM, Cranelift JIT, json bridge, http/mcp tool providers, and the CLI
arg-parser. Every Value::Text(s) read where the caller needs an owned
String (MapKey::Text, serde_json::Value::String, parts: Vec<String>,
etc.) becomes (**s).clone().

No behavioural changes. The peephole and its tests come next.
danieljohnmorris added a commit that referenced this pull request May 14, 2026
Extends the existing eval_self_rebind_concat fast path (added for Lists
in PR #273) to also cover the Text/Text shape. When the prev binding
and the rhs are both Value::Text, the peephole takes prev out of env to
drop the Arc refcount to 1, then uses Arc::make_mut + push_str to mutate
the inner String in place. The classic string-accumulator loop

    s=""; @i 0..n { s=+s "x" }

drops from O(n^2) to O(n) amortised on the tree engine. Mirror of the
VM OP_ADD_SS rebind-shape guard (PR #260) and the Cranelift jit_concat
non-rebind split (PR #250).

The alias guard inherited from match_self_rebind_concat (which calls
expr_refers_to to reject `s = s + s` self-aliasing) carries over
unchanged; the new Text branch leans on the same invariant the List
branch relies on. Non-rebind shapes (`b = +a c`) still go through the
general apply_binop path so the caller's `a` is preserved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant