Skip to content

Fix included positions being wrong for the first token in a new file#1615

Merged
WardBrian merged 5 commits intomasterfrom
issue/1614-include-positions
Apr 15, 2026
Merged

Fix included positions being wrong for the first token in a new file#1615
WardBrian merged 5 commits intomasterfrom
issue/1614-include-positions

Conversation

@WardBrian
Copy link
Copy Markdown
Member

Closes #1614.

After staring at the results of --debug-parse --debug-lex on @rok-cesnovar's example, I realized that this was the issue, the very first token was reporting the wrong location.

It turned out that us calling lexer_lexbuf_to_supplier was to blame, as this function interacted badly with the fact that our tokenizer can internally change lexbufs. The new input in Parse.ml is based on it, but accounts for this fact.

It turns out that almost all of our previous location-munging was unnecessary once this small fix was applied.

This PR does change the location of two existing comments in our test cases. As far as I can tell, one is a previously-correct placement becoming incorrect, and the other is a previously-incorrect placement becoming correct. shrug

Submission Checklist

  • Run unit tests
  • Documentation
    • If a user-facing facing change was made, the documentation PR is here:
    • OR, no user-facing changes were made

Release notes

Fixed a bug in the parser that could lead to crashes when auto-formatting files with many #includes.

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)

@WardBrian WardBrian requested a review from nhuurre April 15, 2026 16:41
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.49%. Comparing base (d9e4aa6) to head (8ec2362).
⚠️ Report is 6 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1615      +/-   ##
==========================================
+ Coverage   90.47%   90.49%   +0.02%     
==========================================
  Files          65       65              
  Lines       10110    10112       +2     
==========================================
+ Hits         9147     9151       +4     
+ Misses        963      961       -2     
Files with missing lines Coverage Δ
src/frontend/Parse.ml 89.74% <100.00%> (+0.26%) ⬆️
src/frontend/Preprocessor.ml 98.50% <100.00%> (-0.05%) ⬇️
src/frontend/Pretty_printing.ml 94.61% <100.00%> (+0.71%) ⬆️
src/middle/Location.ml 96.00% <100.00%> (+0.08%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@nhuurre
Copy link
Copy Markdown
Collaborator

nhuurre commented Apr 15, 2026

one is a previously-correct placement becoming incorrect, and the other is a previously-incorrect placement becoming correct.

Perfectly balanced, as all things should be.

The one becoming is incorrect is

data {
    #include data.stan
}

where data.stan is

real x; // comment

Interestingly, a blank line makes the problem go away

data {
    #include data.stan

}

Comment thread test/integration/cli-args/canonicalize/include/stanc.expected Outdated
@WardBrian
Copy link
Copy Markdown
Member Author

Thanks for the mwe @nhuurre. It is curious, the locations on the parsed file actually all look correct to me, so I think that comment being misplaced is due to the pretty printing code misbehaving, not a parser/lexer issue at this point.

The only difference adding a newline makes is updating the end_loc of prog.data_block.xloc, which is expected.

@nhuurre
Copy link
Copy Markdown
Collaborator

nhuurre commented Apr 15, 2026

Yes, I assume it's something like, pp_list_of_statements forgets to make a trailing pp_spacing call when end-of-block location is too close.

@WardBrian
Copy link
Copy Markdown
Member Author

I think it's actually as simple as deleting this other 'historical' workaround:

| {included_from= Some loc1; _} as loc2 ->
(* When pretty-printing comments it is possible to end up with multiple
locations at an identical point in the file when they originated in
an include. We artificially break this tie by pretending that
included locations were included from the one line down from where
they truly were. *)
loc2 :: unfold {loc1 with line_num= loc1.line_num + 1} in

@WardBrian
Copy link
Copy Markdown
Member Author

Perfectly balanced, as all things should be.

Bad(?) news, this PR is now strictly out-of-balance and only improves comment positions relative to master

@WardBrian WardBrian merged commit 543515f into master Apr 15, 2026
3 checks passed
@WardBrian WardBrian deleted the issue/1614-include-positions branch April 15, 2026 19:51
@rok-cesnovar
Copy link
Copy Markdown
Member

Thanks guys! Really appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] File with includes for entire functions and data blocks fails to auto-format

3 participants