Read source files in one go before lexing #12403
Couldn't spot any problems, the patch LGTM.
This feels like a non-trivial change. I am planning to wait a couple more days before merging so that people can have another look if they want.
I had completely forgotten about this PR (the perils of "waiting for a few days"). I am planning to fix conflicts and merge. Thanks @dra27 for the reminder!
Some of our error printing styles print the source code at the location of the error. This source is found either in the lexing buffer, when it is still available (it has not been discarded to make room for more source code), or else by trying to re-open the source file. Re-opening the source file is not reliable; in particular, it fails in the presence of preprocessor directives (the user-facing locations we have do not necessarily refer to real locations in the input file), see ocaml#12238. This commit fixes the issue by reading the whole source file and then using Lexing.from_string, which preserves all the input in the buffer.
This logic is not robust in the presence of lexer directives and will silently read the wrong file. We don't need it in the compiler anymore: now that we read the entire file at once, no fallback strategy is needed after lines_around_from_lexbuf. In theory, this might make a difference for compiler-libs users who set Location.input_name but not Location.input_lexbuf and rely on the read-from-file fallback logic. Those users can fix their code (in a backward-compatible way) by setting Location.input_lexbuf themselves.
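For compiler-libs users affected by the removal of the fallback, the suggested fix could look roughly like the sketch below. It assumes the program is linked against compiler-libs.common; the helper name parse_string is hypothetical and not part of this PR.

```ocaml
(* A minimal sketch, assuming linking against compiler-libs.common.
   [parse_string] is a hypothetical helper, not something from the PR. *)
let parse_string ~(filename : string) (source : string) : Parsetree.structure =
  (* Lexing.from_string keeps the whole input in the lexing buffer. *)
  let lexbuf = Lexing.from_string source in
  (* Make positions in [lexbuf] start at line 1 of [filename]. *)
  Location.init lexbuf filename;
  Location.input_name := filename;
  (* Registering the buffer lets error printers quote the source from it,
     without relying on re-opening the file. *)
  Location.input_lexbuf := Some lexbuf;
  Parse.implementation lexbuf
```

Setting Location.input_lexbuf this way is harmless on compilers that still have the read-from-file fallback, which is why the change is backward-compatible.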
AppVeyor failed with a flaky test (#12425).
This is unrelated to the present PR, so merging.
This PR introduces a regression when preprocessing code in the style of Dune + ppxlib (where all the transformations are bundled into a single executable, running standalone).
Note that there's no issue when running a PPX with the compiler via […]
I'm unsure whether the fix needs to happen here or in ppxlib, but the gist is that […]
Here's a specific example from a project of mine:
5.1
5.2
@anmonteiro I wrote a potential fix for the regression you report. Could you test it? It is based on top of trunk: https://github.com/gasche/ocaml/tree/dune-ppxlib-quoting-source-file
@gasche thanks. Your patch fixes the issue for me.
Thanks! I submitted the patch as a PR at #12991.
This is another attempt at fixing #12238 after #12396 was found to be too much of a hack.
Context: some of our error printing styles print the source code at the location of the error. This source is found either in the lexing buffer, when it is still available (it has not been discarded to make room for more source code), or else by trying to re-open the source file. Re-opening the source file is not reliable; in particular, it fails in the presence of preprocessor directives (the user-facing locations we have do not necessarily refer to real locations in the input file). See #12238 for an example.
The fix proposed in the present PR is to read source files entirely, and use Lexing.from_string to build the lexing buffer. This guarantees that the lexing buffer has the whole source.
We discussed whether this approach is reasonable in terms of memory consumption in #12396 (comment), and our conclusion was that it is indeed reasonable: the intermediate representations used by the compiler have an in-memory size that is (noticeably) larger than the source code itself, so the increase in total live memory usage should never be a big issue.
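The approach can be illustrated with the standard library alone. This is a minimal sketch, not the code from the PR: the names read_file, lexbuf_of_file and buffered_source are hypothetical, and In_channel requires OCaml 4.14 or later.

```ocaml
(* A minimal sketch using only the standard library (OCaml >= 4.14).
   All function names here are illustrative, not taken from the PR. *)

(* Read the whole file into memory in one go. *)
let read_file (path : string) : string =
  In_channel.with_open_bin path In_channel.input_all

(* Build a lexing buffer over the full contents of the file. *)
let lexbuf_of_file (path : string) : Lexing.lexbuf =
  let source = read_file path in
  let lexbuf = Lexing.from_string source in
  (* Record the file name in the positions produced by the lexer. *)
  Lexing.set_filename lexbuf path;
  lexbuf

(* With Lexing.from_string the buffer is never refilled, so the whole
   source stays available, e.g. to quote lines in an error message. *)
let buffered_source (lexbuf : Lexing.lexbuf) : string =
  Bytes.sub_string lexbuf.Lexing.lex_buffer 0 lexbuf.Lexing.lex_buffer_len
```

By contrast, a buffer created with Lexing.from_channel may discard input it has already consumed to make room for more, which is what made the re-open-the-file fallback necessary in the first place.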
The PR removes the code for the "re-open the source file" fallback, which is no longer useful.
The PR contains several commits and it is best reviewed commit-by-commit.