Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error locations get skewed in the presence of unicode characters #193

Closed
Alizter opened this issue Jan 18, 2023 · 2 comments · Fixed by #200 or ocaml/opam-repository#23181
Closed
Milestone

Comments

@Alizter
Copy link
Collaborator

Alizter commented Jan 18, 2023

This is likely an upstream bug, but when in the presence of unicode characters, the reported error locations are skewed:
image

You can try this example for yourself:

Definition foo : forall Γ Γ' (Γ ⊆ Γ') : Type.

It isn't just parser errors, this is just a minimal example.

@ejgallego ejgallego added this to the 0.2.0 milestone Jan 18, 2023
@ejgallego
Copy link
Owner

ejgallego commented Jan 18, 2023

Yes, this is a known bug and I was indeed procrastinating on submitting it, sorry and thanks!

To fix it properly we still need a better separation of language vs server locations, I have a branch for that on my disk.

LSP is a mess w.r.t. to character sets, see for example the discussion here Deducteam/lambdapi#452

@ejgallego
Copy link
Owner

Main difficulty we face is that Fleche must support both line / column and offset based reporting, due to different editors and protocols requiring that, and in particular ProseMirror needs offset.

Offset is kinda tricky but maybe we could declare that our offsets are byte based, it seems Prosemirror follows that model?

@ejgallego ejgallego modified the milestones: 0.2.0, 0.1.4 Jan 19, 2023
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Should fix #193 and #197

TODO:

- more principled conversion on `Coq.Protect
- see what is going on with the loc-independent cache (seems already
  broken in terms of space shifting)?
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Should fix #193 and #197

TODO:

- more principled conversion on `Coq.Protect
- see what is going on with the loc-independent cache (seems already
  broken in terms of space shifting)?
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Should fix #193 and #197

TODO:

- more principled conversion on `Coq.Protect
- see what is going on with the loc-independent cache (seems already
  broken in terms of space shifting)?
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Should fix #193 and #197

TODO:

- more principled conversion on `Coq.Protect
- see what is going on with the loc-independent cache (seems already
  broken in terms of space shifting)?
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Should fix #193 and #197

TODO:

- more principled conversion on `Coq.Protect
- see what is going on with the loc-independent cache (seems already
  broken in terms of space shifting)?
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Should fix #193 and #197

TODO:

- see what is going on with the loc-independent cache (seems already
  broken in terms of space shifting)?
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 19, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit that referenced this issue Jan 20, 2023
We enforce a stricter separation between `Range.t` and `Loc.t`, with
Flèche not using `Loc.t` anymore, and `Range.t` being in unicode
character positions.

To achieve this we need to convert all the positions coming from Coq;
we do this in a principled way, but the whole business is often
delicate.

As we keep a position per sentence, this implies a `O(n)` number of
conversions in total, where `n is the number of sentences; we could
optimize that to be lazy, but it is unlikely to show in profiles as
that is a pretty reasonably `O(.)` factor.

Fixes #193 , fixes #197 .
ejgallego added a commit to ejgallego/opam-repository that referenced this issue Jan 29, 2023
CHANGES:

---------------------

 - Support for OCaml 4.11 (@ejgallego, ejgallego/coq-lsp#184)
 - The keybinding alt+enter in VSCode is now correctly scoped to be
   only active on Coq files (@artagnon, ejgallego/coq-lsp#188)
 - Support Unicode files (@ejgallego, ejgallego/coq-lsp#200, fixes ejgallego/coq-lsp#193, fixes ejgallego/coq-lsp#197)
 - The info view is now script enabled and does client-side
   rendering. It is also now bundled with esbuild as part of the build
   process (@artagnon, @ejgallego, ejgallego/coq-lsp#171)
 - The no-op `--std` argument to the `coq-lsp` binary has been
   removed, beware of your setup in the extension settings
   (@ejgallego, ejgallego/coq-lsp#208)
 - Settings for the VSCode extension are now categorized (@Alizter, ejgallego/coq-lsp#212)
 - `GoalAnswer`s now include the proof "stack" and better hypothesis
   information, changes are compatible with 0.1.3 `GoalAnswer` version
   (@ejgallego, ejgallego/coq-lsp#237)
 - Focus is now preserved when the info view pops up (@artagnon, ejgallego/coq-lsp#242,
   fixes ejgallego/coq-lsp#224)
 - In `_CoqProject`, `-impredicative-set` is now parsed correctly
   (@artagnon, ejgallego/coq-lsp#241)
 - InfoView is not written in React (@ejgallego, ejgallego/coq-lsp#223)
 - `debug` option in the client / protocol that will enable Coq's backtraces
   (@Alizter, @ejgallego, ejgallego/coq-lsp#217, ejgallego/coq-lsp#248)
 - Full document stats are now correctly computed on checking
   resumption, still cached sentences will display the cached timing
   tho (@ejgallego, ejgallego/coq-lsp#257)
 - Set Coq library name correctly (@ejgallego, ejgallego/coq-lsp#260)
 - `_CoqProject` file is now detected using LSP client `rootPath`
   (@ejgallego, ejgallego/coq-lsp#261)
 - Press `\` to trigger Unicode completion by the server. This
   behavior is configurable, with "off", "regular", and extended
   settings (@artagnon, @Alizter, ejgallego, ejgallego/coq-lsp#219).
ejgallego added a commit to ejgallego/opam-repository that referenced this issue Jan 29, 2023
CHANGES:

---------------------

 - Support for OCaml 4.11 (@ejgallego, ejgallego/coq-lsp#184)
 - The keybinding alt+enter in VSCode is now correctly scoped to be
   only active on Coq files (@artagnon, ejgallego/coq-lsp#188)
 - Support Unicode files (@ejgallego, ejgallego/coq-lsp#200, fixes ejgallego/coq-lsp#193, fixes ejgallego/coq-lsp#197)
 - The info view is now script enabled and does client-side
   rendering. It is also now bundled with esbuild as part of the build
   process (@artagnon, @ejgallego, ejgallego/coq-lsp#171)
 - The no-op `--std` argument to the `coq-lsp` binary has been
   removed, beware of your setup in the extension settings
   (@ejgallego, ejgallego/coq-lsp#208)
 - Settings for the VSCode extension are now categorized (@Alizter, ejgallego/coq-lsp#212)
 - `GoalAnswer`s now include the proof "stack" and better hypothesis
   information, changes are compatible with 0.1.3 `GoalAnswer` version
   (@ejgallego, ejgallego/coq-lsp#237)
 - Focus is now preserved when the info view pops up (@artagnon, ejgallego/coq-lsp#242,
   fixes ejgallego/coq-lsp#224)
 - In `_CoqProject`, `-impredicative-set` is now parsed correctly
   (@artagnon, ejgallego/coq-lsp#241)
 - InfoView is not written in React (@ejgallego, ejgallego/coq-lsp#223)
 - `debug` option in the client / protocol that will enable Coq's backtraces
   (@Alizter, @ejgallego, ejgallego/coq-lsp#217, ejgallego/coq-lsp#248)
 - Full document stats are now correctly computed on checking
   resumption, still cached sentences will display the cached timing
   tho (@ejgallego, ejgallego/coq-lsp#257)
 - Set Coq library name correctly (@ejgallego, ejgallego/coq-lsp#260)
 - `_CoqProject` file is now detected using LSP client `rootPath`
   (@ejgallego, ejgallego/coq-lsp#261)
 - You can press `\` to trigger Unicode completion by the server. This
   behavior is configurable, with "off", "regular", and "extended"
   settings (@artagnon, @Alizter, ejgallego, ejgallego/coq-lsp#219).
ejgallego added a commit to ejgallego/opam-repository that referenced this issue Jan 29, 2023
CHANGES:

---------------------

 - Support for OCaml 4.11 (@ejgallego, ejgallego/coq-lsp#184)
 - The keybinding alt+enter in VSCode is now correctly scoped to be
   only active on Coq files (@artagnon, ejgallego/coq-lsp#188)
 - Support Unicode files (@ejgallego, ejgallego/coq-lsp#200, fixes ejgallego/coq-lsp#193, fixes ejgallego/coq-lsp#197)
 - The info view is now script enabled and does client-side
   rendering. It is also now bundled with esbuild as part of the build
   process (@artagnon, @ejgallego, ejgallego/coq-lsp#171)
 - The no-op `--std` argument to the `coq-lsp` binary has been
   removed, beware of your setup in the extension settings
   (@ejgallego, ejgallego/coq-lsp#208)
 - Settings for the VSCode extension are now categorized (@Alizter, ejgallego/coq-lsp#212)
 - `GoalAnswer`s now include the proof "stack" and better hypothesis
   information, changes are compatible with 0.1.3 `GoalAnswer` version
   (@ejgallego, ejgallego/coq-lsp#237)
 - Focus is now preserved when the info view pops up (@artagnon, ejgallego/coq-lsp#242,
   fixes ejgallego/coq-lsp#224)
 - In `_CoqProject`, `-impredicative-set` is now parsed correctly
   (@artagnon, ejgallego/coq-lsp#241)
 - InfoView is not written in React (@ejgallego, ejgallego/coq-lsp#223)
 - `debug` option in the client / protocol that will enable Coq's backtraces
   (@Alizter, @ejgallego, ejgallego/coq-lsp#217, ejgallego/coq-lsp#248)
 - Full document stats are now correctly computed on checking
   resumption, still cached sentences will display the cached timing
   tho (@ejgallego, ejgallego/coq-lsp#257)
 - Set Coq library name correctly (@ejgallego, ejgallego/coq-lsp#260)
 - `_CoqProject` file is now detected using LSP client `rootPath`
   (@ejgallego, ejgallego/coq-lsp#261)
 - You can press `\` to trigger Unicode completion by the server. This
   behavior is configurable, with "off", "regular", and "extended"
   settings (@artagnon, @Alizter, ejgallego, ejgallego/coq-lsp#219).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants