Skip to content

Scala: fix parsing fragility around case bodies and try/catch keywords#7665

Merged
knutwannheden merged 1 commit into
mainfrom
plush-tiger
May 12, 2026
Merged

Scala: fix parsing fragility around case bodies and try/catch keywords#7665
knutwannheden merged 1 commit into
mainfrom
plush-tiger

Conversation

@knutwannheden
Copy link
Copy Markdown
Contributor

Motivation

Several "find next token" lookups in the Scala parser used a fixed
+50 / +200 character substring window. When intervening comments
push the target past the window, the parser silently misses the token
and produces non-idempotent print output. This surfaced as parse
failures on real Scala files containing things like
case _: Throwable => // comment (where a long line comment pushes
the closing } past the lookahead window).

A ; between cases on the same line (case a => x; case b => y) was
also being dropped on print because there was no marker preserving the
explicit separator.

Summary

  • Empty catch case bodies with trailing line comments now round-trip
    (case _: Throwable => // comment\n}) — replaced the cursor + 50
    window with positionOfNext(\"}\") which skips over comments.
  • Explicit ; separators between same-line match cases are preserved
    via a Semicolon marker on the JRightPadded wrapping the case.
    Detected by checking whether the next non-whitespace token after the
    body's content end (computed from the AST's Block.expr/stats
    spans, not the cursor — which Dotty's case body span often overshoots)
    is ;.
  • catch and finally keyword searches no longer use a fixed lookahead
    window — works correctly when long comments sit between the try body
    and catch, or between catch and finally.
  • Tightens up the parser internals: uses Space.format(source, start, end)
    instead of Space.format(source.substring(start, end)) throughout
    (avoids 70+ intermediate String allocations on the hot path), and
    uses Tree.randomId() instead of UUID.randomUUID() at marker
    construction sites.

Test plan

  • New round-trip tests in MatchTest for ;-separated cases and
    various case body shapes (./gradlew :rewrite-scala:test --tests MatchTest)
  • New round-trip tests in TryTest for empty case bodies with line
    comments, long block comments before catch, and long block
    comments before finally
  • Full Scala test suite passes (./gradlew :rewrite-scala:test)

Several "find next token" lookups used a fixed +50/+200 character
substring window. When intervening comments push the target past the
window, the parser silently misses the token and produces non-idempotent
print output. Replaces those with `positionOfNext` (comment-aware) or
captures the explicit token via a marker.

- Empty catch case bodies with trailing line comments now round-trip
  (`case _: Throwable => // comment\n}`).
- Explicit `;` separators between same-line cases (`case a => x; case b => y`)
  are preserved with a Semicolon marker on the wrapping JRightPadded.
- `catch` and `finally` keyword searches no longer rely on a fixed lookahead
  window — works correctly with long comments between try/catch/finally.
- Tightens up the parser internals: uses Space.format(source, start, end)
  instead of Space.format(source.substring(start, end)) throughout, and
  uses Tree.randomId() instead of UUID.randomUUID() at marker construction
  sites.
@github-project-automation github-project-automation Bot moved this from In Progress to Ready to Review in OpenRewrite May 12, 2026
@knutwannheden knutwannheden merged commit bde893f into main May 12, 2026
1 check passed
@knutwannheden knutwannheden deleted the plush-tiger branch May 12, 2026 10:49
@github-project-automation github-project-automation Bot moved this from Ready to Review to Done in OpenRewrite May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants