fix: multi-line bodies inside brackets, parens, and >> pipe chains#246

Merged: danieljohnmorris merged 2 commits into main from fix/multiline-fn-bodies on May 13, 2026
@danieljohnmorris
Collaborator

Summary

Three persona-flagged shapes ("Multi-line function bodies still fatal": L1310 quant-trader, L1325 re-flag, L938 db-analyst, plus the cross-persona summary at L1023) all came from one root cause: normalize_newlines injected a ; at newlines that weren't statement boundaries. #168 already fixed the header→body case; this PR covers the remaining three shapes: multi-line bracket lists, multi-line paren groups, and >> pipe chains.

The token-cost framing: every persona who reached for natural indented body shapes paid a 5-minute debug-and-collapse tax. On top of that, the ILO-P009 "expected expression, got Semi" error, reported at a column deep inside a continuation line, obscured the real cause. The manifesto cost was real and repeated.

Repro before/after

Pre-fix, all three failed with ILO-P009:

g>L n
  xs=[
    1,
    2,
    3
  ]
  xs

gp x:n>n
  y=(+x
    1)
  y

pipe x:n>n
  x
    >>str
    >>len

Post-fix, all three run on tree, vm, and cranelift.

What's in the diff

  • c2c28d4 lexer: suppress newline-to-semi inside ( ) and [ ] and after >>. normalize_newlines now tracks (/[ depth and turns each newline inside into a single space rather than dropping it (so (+x\n 1) doesn't normalise to (+x1)). Continuation lines that start with >> (the only operator that's never a valid statement start) suppress the ; too; +/-/* etc. are valid prefix-call heads and stay untouched. String walking already exists from #236 (fix: comments above bound-call lines no longer corrupt parsing); this PR just relies on it so brackets inside text don't bump depth.
  • 103d877 test: cross-engine coverage. Six new shapes in regression_multiline_fn_body.rs (multi-line list, leading-comma list, multi-line paren, multi-line pipe, nested ( inside [, pipe chain inside a {...} loop body) across tree/vm/cranelift. New examples/multiline-bodies.ilo so the examples_engines harness exercises the same shapes and an agent reading examples sees indented bodies as canonical.
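
The lexer rule described above can be sketched in a few lines of Rust. This is an illustrative reconstruction, not the code from c2c28d4: the function body, the double-quote string syntax, and the backslash escape handling are all assumptions. It also flattens every other newline to ; and so ignores the header→body handling from #168, as well as blank lines and comments.

```rust
/// Hypothetical sketch of a newline normaliser; not the actual ilo lexer code.
/// Newlines become `;` unless they fall inside `(`/`[`, or the next line
/// starts with `>>`. String literals are copied through untouched.
fn normalize_newlines(src: &str) -> String {
    let mut out = String::with_capacity(src.len());
    let mut chars = src.chars().peekable();
    let mut depth: usize = 0; // current ( / [ nesting depth

    while let Some(c) = chars.next() {
        match c {
            // Assumed string syntax: double quotes with backslash escapes.
            // Copy verbatim so brackets and `>>` inside text are ignored.
            '"' => {
                out.push(c);
                while let Some(s) = chars.next() {
                    out.push(s);
                    if s == '\\' {
                        if let Some(escaped) = chars.next() {
                            out.push(escaped);
                        }
                    } else if s == '"' {
                        break;
                    }
                }
            }
            '(' | '[' => {
                depth += 1;
                out.push(c);
            }
            ')' | ']' => {
                depth = depth.saturating_sub(1);
                out.push(c);
            }
            '\n' => {
                // Drop the indentation of the following line.
                while let Some(&next) = chars.peek() {
                    if next == ' ' || next == '\t' {
                        chars.next();
                    } else {
                        break;
                    }
                }
                // Peek two characters ahead on a cloned iterator.
                let mut ahead = chars.clone();
                let pipe_next = ahead.next() == Some('>') && ahead.next() == Some('>');
                if depth > 0 || pipe_next {
                    out.push(' '); // continuation: keep the tokens separated
                } else {
                    out.push(';'); // genuine statement boundary
                }
            }
            _ => out.push(c),
        }
    }
    out
}
```

Under these assumptions, `xs=[\n  1,\n  2\n]\nxs` normalises to `xs=[ 1, 2 ];xs`, and `x\n    >>str\n    >>len` normalises to `x >>str >>len`. A line starting with `+` still gets its `;`, since `+x 1` is a valid statement head.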

Test plan

  • cargo test --release --features cranelift --test regression_multiline_fn_body — 3 tests across tree/vm/cranelift, all green
  • cargo test --release --features cranelift --test examples_engines — examples harness green including new file
  • cargo test --release --features cranelift full suite — 2871 lib passed, 0 failed; all integration tests green
  • cargo fmt --check
  • cargo clippy --release --features cranelift --all-targets

Out of scope

Pattern 4 from the investigation (indented -- comments above paren-bound calls) was already in flight on fix/comment-parse-corrupt and shipped in #236 before this PR landed. The two changes coexist; this PR rebased onto the merged #236.

Follow-ups

  • Single-pass indent skip would shave one cheap iterator-clone per indented newline. Deferred as nice-to-have.
  • The misleading caret-column complaint is partially fixed (because the spurious ; no longer lands inside a continuation) but a full span-rewrite is a separate piece of work.

normalize_newlines was unaware of paren/bracket depth, so an indented
continuation inside a list literal or paren-group got a `;` injected
between tokens (`xs=[\n  1,\n  2]` became `xs=[;1,;2]`, ILO-P009 at a
bewildering column). Now `(` `[` open a depth counter and newlines
inside become a single space; `)` `]` close it.

Continuation lines starting with `>>` (pipe operator) suppress the `;`
too. `>>` is the only operator that's never a valid statement start in
ilo's prefix-call grammar, so this is unambiguous. `+`/`-`/`*` etc.
remain untouched so `+x 1` stays a valid statement head.

String literals are walked through verbatim (already true post #236) so
brackets and pipe tokens inside text don't affect depth.

Extends regression_multiline_fn_body across tree/vm/cranelift with the
shapes the personas kept hitting: multi-line `[...]` list literal,
leading-comma list, multi-line `(...)` paren-group, multi-line `>>`
pipe chain, nested `(` inside `[`, and pipe chain inside a `{...}`
loop body.

Also adds examples/multiline-bodies.ilo so the same shapes get exercised
by the examples_engines harness and an agent reading examples sees the
indented form as canonical.

run_file now whitespace-splits `entry` so cases like `gp 5` pass the
argument as its own argv slot, matching how examples_engines invokes
the CLI.
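
As a rough illustration of the run_file change, whitespace-splitting an entry string into separate argv slots could look like the sketch below. The helper name entry_args is hypothetical, and how run_file actually passes the result to the CLI is not shown in this PR.

```rust
/// Hypothetical helper: split an entry such as "gp 5" into argv slots,
/// so the function name and each argument are passed separately.
fn entry_args(entry: &str) -> Vec<String> {
    // split_whitespace ignores leading, trailing, and repeated whitespace.
    entry.split_whitespace().map(str::to_owned).collect()
}
```

With this, "gp 5" yields the two slots "gp" and "5" instead of the single slot "gp 5".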
@danieljohnmorris danieljohnmorris merged commit 40f390e into main May 13, 2026
4 checks passed
@danieljohnmorris danieljohnmorris deleted the fix/multiline-fn-bodies branch May 13, 2026 17:14
codecov Bot commented May 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.
