Phase 2b.3: switch Value::Text to Arc<String> with RC=1 in-place concat #276

Merged: danieljohnmorris merged 4 commits into main from fix/phase2b-text on May 14, 2026

@danieljohnmorris
Collaborator

Summary

Third (and last) step of the tree-walker RC-aware mutation rollout. Mirror of Phase 2b.1 (Map, PR #261) and Phase 2b.2 (List, PR #273):

  • Value::Text(String) becomes Value::Text(Arc<String>) so cloning a string is a refcount bump rather than a full byte copy.
  • The existing self-rebind concat peephole in eval_self_rebind_concat (added for Lists in PR #273) gains a Text branch. When prev and rhs are both Value::Text, Arc::make_mut runs on the refcount=1 Arc and push_str mutates the inner String in place.
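
The two behaviours described above can be seen with the standard library alone. This is a minimal sketch, not code from the diff: the `append` helper is hypothetical and only reports whether `Arc::make_mut` reused the existing allocation (refcount 1) or had to clone the inner String first (refcount above 1).

```rust
use std::sync::Arc;

/// Hypothetical helper: appends `suffix` through Arc::make_mut and reports
/// whether the Arc's allocation was reused (true) or replaced by a clone (false).
fn append(s: &mut Arc<String>, suffix: &str) -> bool {
    let before = Arc::as_ptr(s);
    Arc::make_mut(s).push_str(suffix);
    Arc::as_ptr(s) == before
}

fn main() {
    let mut s = Arc::new(String::from("ab"));
    // Cloning the Arc bumps the refcount; the bytes are not copied.
    let alias = Arc::clone(&s);
    assert_eq!(Arc::strong_count(&s), 2);

    // Shared: make_mut clones the inner String, so `alias` is left untouched.
    assert!(!append(&mut s, "c"));
    assert_eq!(alias.as_str(), "ab");

    // `s` now points at its own allocation (refcount 1): mutation is in place.
    assert!(append(&mut s, "d"));
    assert_eq!(s.as_str(), "abcd");
}
```

The peephole only pays off in the second case, which is why the refcount has to be 1 by the time `Arc::make_mut` runs.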

The classic string-accumulator loop folds from O(n^2) to O(n) amortised on the tree engine. 5k iterations of s=+s "x" go from 10+ seconds to milliseconds.

Repro

    build n:n>n;s="";@i 0..n{s=+s "x"};len s
    demo>n;build 5000

  • Before: 10+ seconds, every iteration clones the whole String.
  • After: a millisecond or two, every iteration uses Arc::make_mut + push_str.

The alias guard inherited from match_self_rebind_concat (which calls expr_refers_to to reject s = s + s) carries over unchanged. Non-rebind shapes (b = +a c) still go through apply_binop so the caller's a is preserved, same as the VM OP_ADD_SS split landed in #260 and the Cranelift jit_concat split landed in #250.
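
A rough sketch of how such an alias guard could look. The `Expr` type here is a hypothetical three-variant stand-in, and the signatures of `expr_refers_to` and `match_self_rebind_concat` are assumptions made for illustration; only the function names come from this PR.

```rust
// Hypothetical, simplified AST; the real interpreter's Expr is not shown in this PR.
enum Expr {
    Var(String),
    Text(String),
    Concat(Box<Expr>, Box<Expr>),
}

/// True if `e` mentions the variable `name` anywhere (assumed signature).
fn expr_refers_to(e: &Expr, name: &str) -> bool {
    match e {
        Expr::Var(v) => v == name,
        Expr::Text(_) => false,
        Expr::Concat(l, r) => expr_refers_to(l, name) || expr_refers_to(r, name),
    }
}

/// Returns the rhs when `value` has the shape `+target rhs` and rhs does not
/// mention `target`; otherwise None, so the caller uses the general path.
fn match_self_rebind_concat<'a>(target: &str, value: &'a Expr) -> Option<&'a Expr> {
    match value {
        Expr::Concat(lhs, rhs) => match &**lhs {
            Expr::Var(v) if v == target && !expr_refers_to(rhs, target) => Some(&**rhs),
            _ => None,
        },
        _ => None,
    }
}

fn main() {
    let var = |n: &str| Box::new(Expr::Var(n.to_owned()));
    // s = +s "x": eligible for the in-place path.
    let rebind = Expr::Concat(var("s"), Box::new(Expr::Text("x".to_owned())));
    assert!(match_self_rebind_concat("s", &rebind).is_some());
    // s = +s s: rhs aliases the target, so the guard rejects it.
    let alias = Expr::Concat(var("s"), var("s"));
    assert!(match_self_rebind_concat("s", &alias).is_none());
    // b = +s "x": not a self-rebind, so it falls through as well.
    assert!(match_self_rebind_concat("b", &rebind).is_none());
}
```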

What's in the diff

  • interpreter: switch Value::Text to Arc<String> with RC=1 in-place concat (8a05a30) - mechanical rewrite. Every Value::Text(String) construction becomes Value::Text(Arc::new(...)) across the interpreter, VM, Cranelift JIT, json bridge, http/mcp tool providers, and the CLI arg-parser. Every Value::Text(s) read that needs an owned String (MapKey::Text, serde_json::Value::String, parts: Vec, etc.) becomes (**s).clone(). No behavioural change on its own.
  • interpreter: eval_stmt peephole for self-rebind text concat (daab1c1) - adds the (Value::Text, Value::Text) branch to eval_self_rebind_concat and updates the match_self_rebind_concat docstring.
  • test + example: tree-walker string accumulator regression coverage (0974b0d) - 10 regression tests covering literal / variable / Call-returning-Text rhs / foreach / non-rebind / self-alias / numeric fallthrough / 5k scale, plus examples/string-accumulator-tree.ilo so the engine harness runs the same shapes across tree, VM, and Cranelift.
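
The two mechanical patterns from the first commit, construction via `Arc::new` and reads via `(**s).clone()`, can be sketched as follows. The two-variant `Value` enum and the `text` / `owned_text` helpers are hypothetical simplifications, not the interpreter's actual definitions.

```rust
use std::sync::Arc;

// Hypothetical cut-down Value; the real enum has more variants.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Number(f64),
    Text(Arc<String>),
}

/// Construction site: what used to be Value::Text(String) now wraps in Arc::new.
fn text(s: &str) -> Value {
    Value::Text(Arc::new(s.to_owned()))
}

/// Read site that needs an owned String (a map key or a JSON string, say).
fn owned_text(v: &Value) -> Option<String> {
    match v {
        // `s` is &Arc<String>: one deref reaches the Arc, a second reaches
        // the String. Plain `s.clone()` would only clone the Arc handle.
        Value::Text(s) => Some((**s).clone()),
        _ => None,
    }
}

fn main() {
    assert_eq!(owned_text(&text("hi")), Some(String::from("hi")));
    assert_eq!(owned_text(&Value::Number(1.0)), None);
}
```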

Test plan

  • cargo build --release --features cranelift clean
  • cargo test --release --features cranelift - all 112 suites green
  • cargo fmt --check clean
  • cargo clippy --release --features cranelift --all-targets -- -D warnings clean
  • New regression file: 10/10 pass
  • Engine harness picks up examples/string-accumulator-tree.ilo across tree, VM, Cranelift
  • 5k-iteration scale check completes well under the 5s ceiling

Follow-ups

None. This closes Phase 2b. Map (#261), List (#273), and Text (this PR) all use the same Arc + self-rebind peephole shape on the tree engine, matching the VM OP_* rebind splits and the Cranelift jit_* rebind splits that landed earlier.

No README / SPEC updates: this is a runtime-internal perf fix with no user-observable semantic change beyond the formerly O(n^2) accumulator going O(n).

@codecov

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 96.42325% with 24 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines     Patch %   Lines
src/interpreter/mod.rs       95.20%    23 Missing ⚠️
src/vm/jit_cranelift.rs      99.28%    1 Missing ⚠️


Phase 2b.3 of the RC-aware mutation rollout. Mirrors the Map switch (PR
#261) and List switch (PR #273): cloning a Value::Text is now a refcount
bump rather than a full byte copy, and the existing self-rebind concat
peephole gains a Text branch in the next commit so the
string-accumulator pattern folds from O(n^2) to O(n) amortised.

This commit is the mechanical rewrite: every Value::Text(String)
construction becomes Value::Text(Arc::new(...)) across the interpreter,
VM, Cranelift JIT, json bridge, http/mcp tool providers, and the CLI
arg-parser. Every Value::Text(s) read where the caller needs an owned
String (MapKey::Text, serde_json::Value::String, parts: Vec<String>,
etc.) becomes (**s).clone().

No behavioural changes. The peephole and its tests come next.

Extends the existing eval_self_rebind_concat fast path (added for Lists
in PR #273) to also cover the Text/Text shape. When the prev binding
and the rhs are both Value::Text, the peephole takes prev out of env to
drop the Arc refcount to 1, then uses Arc::make_mut + push_str to mutate
the inner String in place. The classic string-accumulator loop

    s=""; @i 0..n { s=+s "x" }

drops from O(n^2) to O(n) amortised on the tree engine. Mirror of the
VM OP_ADD_SS rebind-shape guard (PR #260) and the Cranelift jit_concat
non-rebind split (PR #250).
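
As an illustration of the take-then-mutate step, here is a minimal sketch that assumes the environment is a plain `HashMap<String, Value>`. The `self_rebind_concat` and `build` functions and the cut-down `Value` enum are hypothetical; the real eval_self_rebind_concat is not reproduced here.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical cut-down Value; the real enum has more variants.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Number(f64),
    Text(Arc<String>),
}

/// Fast path for `name = +name rhs`. Returns true if it handled the statement.
fn self_rebind_concat(env: &mut HashMap<String, Value>, name: &str, rhs: &Value) -> bool {
    // Take prev out of env so the env no longer holds a reference to it.
    let prev = match env.remove(name) {
        Some(v) => v,
        None => return false,
    };
    match (prev, rhs) {
        (Value::Text(mut acc), Value::Text(suffix)) => {
            // With refcount 1 this mutates in place; if something else still
            // shares the Arc, make_mut clones first and the alias stays intact.
            Arc::make_mut(&mut acc).push_str(suffix);
            env.insert(name.to_owned(), Value::Text(acc));
            true
        }
        (other, _) => {
            // Not Text/Text: restore the binding and let the general path run.
            env.insert(name.to_owned(), other);
            false
        }
    }
}

/// The accumulator loop from the repro, driven through the sketch above.
fn build(n: usize) -> usize {
    let mut env = HashMap::new();
    env.insert("s".to_owned(), Value::Text(Arc::new(String::new())));
    let x = Value::Text(Arc::new("x".to_owned()));
    for _ in 0..n {
        self_rebind_concat(&mut env, "s", &x);
    }
    match env.get("s") {
        Some(Value::Text(s)) => s.len(),
        _ => 0,
    }
}

fn main() {
    assert_eq!(build(5000), 5000);
}
```

Removing the binding before mutating is the design choice that matters: while the env still owned a clone of the Arc, the refcount would stay above 1 and every iteration would copy the whole String.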

The alias guard inherited from match_self_rebind_concat (which calls
expr_refers_to to reject `s = s + s` self-aliasing) carries over
unchanged; the new Text branch leans on the same invariant the List
branch relies on. Non-rebind shapes (`b = +a c`) still go through the
general apply_binop path so the caller's `a` is preserved.

Adds tests/regression_string_accumulator_tree.rs with 10 cases
exercising the Phase 2b.3 Text peephole on the tree engine:

  - literal / variable / Call-returning-Text rhs
  - foreach accumulator (the hot shape this PR targets)
  - non-rebind (`b = +a c`) preserves the caller's `a`
  - self-alias (`s = +s s`) doubles correctly because the peephole
    bails out via expr_refers_to
  - numeric `n = +n 1` still falls through apply_binop unchanged
  - 5k-iteration scale check under 5s, pinning the O(n) curve

Plus examples/string-accumulator-tree.ilo so the engine harness
runs the same shapes across tree, VM, and Cranelift, and so agents
encountering the pattern get an in-context learning example.

Rebase fallout. PR #275 (Ok-wrapper stdout fix) added a second
NanVal->Value conversion via to_value_with_program with the
HeapObj::Str(s) => Value::Text(s.clone()) shape that worked under the
old Value::Text(String) representation. Phase 2b.3 makes Value::Text
hold Arc<String>, so the bare s.clone() (which yields String) no
longer satisfies the variant. Wrap it in Arc::new the same way the
existing to_value site at line 3881 does.
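
The shape of that fix, sketched with hypothetical one-variant stand-ins for `HeapObj` and `Value`. The real to_value_with_program takes more context than this, and the assumption here is that `HeapObj::Str` holds a plain String.

```rust
use std::sync::Arc;

// Hypothetical stand-ins; the real enums have more variants.
enum HeapObj {
    Str(String),
}

#[derive(Debug, PartialEq)]
enum Value {
    Text(Arc<String>),
}

fn to_value(obj: &HeapObj) -> Value {
    match obj {
        // Before Phase 2b.3 this arm read Value::Text(s.clone()). With
        // Value::Text holding Arc<String>, the cloned String needs wrapping.
        HeapObj::Str(s) => Value::Text(Arc::new(s.clone())),
    }
}

fn main() {
    let v = to_value(&HeapObj::Str("ok".to_owned()));
    assert_eq!(v, Value::Text(Arc::new("ok".to_owned())));
}
```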
@danieljohnmorris danieljohnmorris merged commit 63e818a into main May 14, 2026
5 checks passed
@danieljohnmorris danieljohnmorris deleted the fix/phase2b-text branch May 14, 2026 20:35