fix: comments above bound-call lines no longer corrupt parsing by danieljohnmorris · Pull Request #236 · ilo-lang/ilo

danieljohnmorris · 2026-05-13T16:28:25Z

Summary

An indented -- comment immediately above a paren-bound call silently corrupted parsing. The function body ended up empty and the diagnostic pointed many lines past the actual cause, typically inside a format-string {} placeholder. Manifesto-wise, comments are supposed to be free; this regression made them a 15-minute-per-occurrence tax.

Repro

Before:

$ cat repro.ilo
main>t
  -- build the thing
  m=(fmt "k={}" 1)
  m

$ ilo repro.ilo --run-tree main
{"code":"ILO-T008","message":"return type mismatch: expected t, got _", ...}

After:

$ ilo repro.ilo --run-tree main
k=1

Root cause

normalize_newlines in src/lexer/mod.rs rewrites \n+indent to ; BEFORE the logos lexer's --[^\n]* comment-skip runs. On an indented comment line, the trailing \n is also rewritten to ;, so the logos regex (which stops at \n) greedily eats the comment text plus every following statement up to the next non-indented newline. The body vanishes, the verifier reports a type mismatch with no useful position, and the ILO-P009 expected expression, got Semi cascade points at a { inside the format string several lines later.

What's in the diff

Commit 1 moves comment-handling into normalize_newlines itself. When it sees --, it advances past the comment content without emitting anything and leaves the trailing \n intact for the loop's existing newline handling. It also passes through "..." string literals verbatim so -- inside a string isn't mistaken for a comment, and suppresses duplicate ; emission when a comment-only line precedes an indented continuation.
Commit 2 adds tests/regression_comment_parse_corrupt.rs. 15 tests across tree / VM / cranelift covering: paren-bound fmt with leading comment, mid-body comment between bindings, stacked comments, comment containing {}/;/(), and dashes inside a string literal.
Commit 3 adds examples/comment-above-call.ilo with -- run: / -- out: assertions so tests/examples_engines.rs exercises the same cases at the higher integration layer across every engine.

Test plan

Original repro from ilo_assessment_feedback.md L1790 returns k=1 instead of ILO-T008
cargo test --release --features cranelift full suite passes (2866 lib tests, plus all integration tests, including the new 15-test regression_comment_parse_corrupt)
tests/examples_engines.rs passes with the new example
cargo fmt --check clean
cargo clippy --release --features cranelift --all-targets -- -D warnings clean
-- inside a string literal ("hello -- world") still round-trips correctly

Follow-ups

None. The fix is local to normalize_newlines and matches the logos --[^\n]* rule it's now running ahead of.

normalize_newlines rewrites indented \n to ; before the logos lexer runs, so on an indented comment line the trailing \n turned into ;. The logos --[^\n]* comment-skip then matched across that synthesised ; and greedily ate every following statement up to the next non-indented newline. An indented comment immediately above a paren-bound call wiped out the rest of the function body and the diagnostic landed many lines past the actual cause, typically inside a format-string {} placeholder. Detect -- directly in normalize_newlines, advance past comment content without emitting any output, and leave the trailing \n intact for the loop's existing newline handling. Also pass through "..." string literals verbatim so -- inside a string isn't mistaken for a comment. Suppress duplicate ; emission when a comment-only line precedes an indented continuation.

Anchors the invariant that an indented -- comment line above any binding (paren-bound call, plain binding, stacked comments, comment-with-punctuation) parses identically across tree / VM / cranelift. Adds a dashes-in-string case so the lexer pre-pass keeps its string-aware skip honest.

Picked up by tests/examples_engines.rs across every engine. Shows the four shapes the lexer pre-pass has to get right: paren-bound fmt call below a comment, stacked comments before a binding, mid-body comment between bindings, and -- inside a string literal.

codecov · 2026-05-13T16:31:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

danieljohnmorris added 3 commits May 13, 2026 17:27

danieljohnmorris merged commit 204a7c1 into main May 13, 2026
4 checks passed

danieljohnmorris deleted the fix/comment-parse-corrupt branch May 13, 2026 16:29

danieljohnmorris mentioned this pull request May 13, 2026

fix: multi-line bodies inside brackets, parens, and >> pipe chains #246

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: comments above bound-call lines no longer corrupt parsing#236

fix: comments above bound-call lines no longer corrupt parsing#236
danieljohnmorris merged 3 commits into
mainfrom
fix/comment-parse-corrupt

danieljohnmorris commented May 13, 2026

Uh oh!

Uh oh!

codecov Bot commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

danieljohnmorris commented May 13, 2026

Summary

Repro

Root cause

What's in the diff

Test plan

Follow-ups

Uh oh!

Uh oh!

codecov Bot commented May 13, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant