Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust message range on unexpected token error #2393

Merged
merged 7 commits into from
Sep 12, 2023

Conversation

Kha
Copy link
Member

@Kha Kha commented Aug 8, 2023

This PR proposes a quite simple and general solution to #1971: when the parser hits an unexpected token, it will include all preceding whitespace in the error message range as the expected token could be inserted at any of these places to fix the error. I'll let the test file speak for itself:
image


Implementation notes: In order to implement this strategy, I first had to teach the parser to report ranges at all - so far we only reported single positions and left it to the editor to including what it thinks is the next token, which is not ideal and completely breaks when we want to include preceding whitespace like this. For the purposes of this PR I only changed the token error messages to report ranges, which they do via a new helper function mkUnexpectedTokenError. True to its name, I also used this refactoring opportunity to actually report what the unexpected token is; see changed test outputs below. Because of this preparatory work, I recommend looking at the changes commit by commit.

@Kha
Copy link
Member Author

Kha commented Aug 8, 2023

!bench

@Kha
Copy link
Member Author

Kha commented Aug 8, 2023

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit 3def0db.
There were significant changes against commit bb73879:

  Benchmark   Metric             Change
  ===============================================
- parser      branches             6.6% (610.4 σ)
- parser      instructions         7.1% (885.0 σ)
+ stdlib      tactic execution    -1.6% (-16.8 σ)

@Kha
Copy link
Member Author

Kha commented Aug 9, 2023

Interesting, that's a little more overhead than I would have assumed

@leodemoura
Copy link
Member

Interesting, that's a little more overhead than I would have assumed

I agree. Could you please investigate?

@leodemoura leodemoura added the awaiting-author Waiting for PR author to address issues label Aug 9, 2023
@Kha Kha force-pushed the token-error-pos3 branch 2 times, most recently from 4cee657 to 4835ef3 Compare August 14, 2023 12:58
@Kha
Copy link
Member Author

Kha commented Aug 14, 2023

!bench

@Kha
Copy link
Member Author

Kha commented Aug 16, 2023

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit 379106f.
There were significant changes against commit d4be21b:

  Benchmark                  Metric                  Change
  ====================================================================
- parser                     branches                  4.5% (1946.6 σ)
- parser                     instructions              5.0% (1421.5 σ)
+ stdlib                     attribute application   -14.7%  (-20.9 σ)
+ stdlib                     tactic execution        -15.0% (-140.2 σ)
+ stdlib                     task-clock              -13.5% (-394.6 σ)
+ stdlib                     type checking           -14.7%  (-50.3 σ)
+ stdlib                     wall-clock              -14.1%  (-33.0 σ)
+ tests/bench/ interpreted   task-clock               -9.4%  (-18.7 σ)
+ tests/bench/ interpreted   wall-clock              -13.6%  (-18.3 σ)

@Kha
Copy link
Member Author

Kha commented Aug 16, 2023

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit 379106f.
There were significant changes against commit d4be21b:

  Benchmark   Metric         Change
  ============================================
- parser      branches         4.5% (1947.0 σ)
- parser      instructions     5.0% (1421.8 σ)

@Kha
Copy link
Member Author

Kha commented Aug 17, 2023

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit 6e44c12.
There were significant changes against commit d4be21b:

  Benchmark   Metric         Change
  ============================================
- parser      branches         3.2% (1452.2 σ)
- parser      instructions     3.8% (1157.7 σ)

@Kha
Copy link
Member Author

Kha commented Aug 17, 2023

I don't think I will get closer than that without addressing the underlying problem that we seem to backtrack entirely too often from token mismatches, apparently from code like "tk" ... <|> ...

@Kha Kha added awaiting-review Waiting for someone to review the PR and removed awaiting-author Waiting for PR author to address issues labels Aug 18, 2023
@Kha
Copy link
Member Author

Kha commented Sep 5, 2023

I believe this is ready to merge but as it is a subjective change, it would be good to hear about some feedback: can the original problem be considered solved with this? Is the space before def f4 consistent or "weird"? Please give it a try e.g. via the PR release when it is released.

@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Sep 5, 2023
leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Sep 5, 2023
@leanprover-community-mathlib4-bot leanprover-community-mathlib4-bot added the builds-mathlib CI has verified that Mathlib builds against this PR label Sep 5, 2023
@leanprover-community-mathlib4-bot
Copy link
Collaborator

  • ✅ Mathlib branch lean-pr-testing-2393 has successfully built against this PR. (2023-09-05 09:42:31)

@Kha Kha assigned leodemoura and Kha and unassigned leodemoura Sep 6, 2023
@kbuzzard
Copy link
Contributor

kbuzzard commented Sep 6, 2023

My instinct is that this change makes things less confusing for beginners.

@Kha
Copy link
Member Author

Kha commented Sep 12, 2023

Let's give it a try in practice

@Kha Kha merged commit e9d60e1 into leanprover:master Sep 12, 2023
15 checks passed
@Kha Kha deleted the token-error-pos3 branch September 12, 2023 09:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review Waiting for someone to review the PR builds-mathlib CI has verified that Mathlib builds against this PR toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants