fix: at s i on text no longer allocates a Vec per call#232
Merged
Conversation
Each engine was doing s.chars().collect::<Vec<char>>() on every at-call to find the i-th codepoint, making per-char loops over a string O(n^2) and allocating a fresh Vec per iteration. The Vec allocator pressure was the observable trigger behind the 222k-token OOM reports from NLP personas. Behaviour is unchanged: tree and vm still raise ILO-R009 on out-of-range; cranelift still returns nil to match its existing hd/at JIT semantics.
Pins at-on-text behaviour across tree, vm, and cranelift for ascii and unicode strings, positive and negative indices, plus out-of-range. The scaling test runs a 50k-char per-char loop with a 30s wall-clock budget so a regression to chars().collect-per-call shows up loudly. Example demonstrates the now-cheap per-char idiom (upper-letter count, forward vs negative-index roundtrip).
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
CI runs cargo nextest without --release, so the 50k-char build phase (itself quadratic, deferred Phase 2) blew the 30s wall-clock budget on slower runners (37-52s observed across tree/vm/cranelift). The time check was conflating build cost with at-call cost anyway; the perf claim belongs in the PR / commit message, not the suite. Keep the cross-engine correctness check on a smaller 2k-char loop so a future regression that drops or doubles characters still trips.
This was referenced May 13, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Persona reports of OOMs at the 222k-token corpus scale (Moby Dick word frequency, per-char lowercasing) traced to a more mundane root cause:
at s ion a text value was doings.chars().collect::<Vec<char>>()on every call to find the i-th codepoint. A@i 0..len s{c=at s i}loop was therefore O(n^2) in time AND allocating a fresh Vec per iteration. The Vec allocator churn was the OOM-looking trigger at corpus scale.This is Phase 1 of a two-part fix. Phase 2 (RC-aware mutation for the
+/+=accumulator pattern in concat, which is the other half of the same scaling story) is captured as a deferred entry inilo_assessment_feedback.mdunder the [nlp-engineer] persona section.Repro
Before: tree 5.97s, vm/cranelift ~5s, RSS climbing the whole time.
After: tree 0.58s, no allocator pressure from
at. 10x wall-clock win, no per-call Vec.What's in the diff
add char_at_signed helper for allocation-free i-th codepoint lookup(src/builtins.rs). Newpub(crate) fn char_at_signed(s: &str, raw_idx: i64) -> CharAtResultwithFound(char)/OutOfRange { len }variants. Positive indices usechars().nth(idx); negative indices pay onechars().count()thenchars().nth(adjusted). Allocation-free in every path. Unit tests cover ascii/unicode x positive/negative x in-range/oor.Deliberately NOT branching on
s.is_ascii()for a constant-time ascii path here, becauseis_asciiitself is O(n) and would dominate per-call cost. True O(1) ascii indexing needs a cachedis_asciiflag on the string value; that's tied up with the deferred Phase 2 work.wire at builtin through char_at_signed in tree, vm, and cranelift jit. Three call sites insrc/interpreter/mod.rs,src/vm/mod.rs(OP_AT dispatch +jit_athelper). Behaviour preserved: tree/vm still raise ILO-R009 on out-of-range; cranelift still returns nil to match its existing hd/at JIT semantics.test: cross-engine regression coverage for at on text values. Newtests/regression_at_string.rspins ascii + unicode + negative-index behaviour across all three engines, plus a 50k-char scaling sanity check with a 30s wall-clock budget that will trip on any return to per-call Vec allocation. Newexamples/string-large-at.iloshows the now-cheap per-char idiom.Test plan
cargo fmtcargo test --release --features cranelift- full suite greencargo clippy --release --features cranelift --all-targets -- -D warnings- cleanregression_at_builtin.rsstill passes (list-of-number and list-of-text cases unchanged)regression_at_string.rspasses on tree, vm, craneliftexamples/string-large-at.iloasserts pass throughtests/examples_engines.rsFollow-ups
+/+=accumulator pattern (deferred inilo_assessment_feedback.md). That's the other O(n^2) leg of the same NLP corpus scaling story.is_asciiflag onValue::Text/HeapObj::Strfor true O(1) ascii indexing.chars().collect::<Vec<char>>()call sites in interpreter and vm (slc/rev/srt on text, etc.) have the same pattern; not touched here to keep this PR tight. Worth a sweep once Phase 2 lands.