-
Notifications
You must be signed in to change notification settings - Fork 205
Advance scanner position by byte length while searching for line #3612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Advance scanner position by byte length while searching for line #3612
Conversation
How to use the Graphite Merge QueueAdd the label graphite-merge to this PR to add it to the merge queue. You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
1a368c4
to
7b99851
Compare
I thought this functionality was largely taken care of inside of Prism::Location. Is there something missing? |
@kddnewton |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
Just one small change, otherwise good to 🚢
(feel free to ignore the nits)
7b99851
to
ab7bcf5
Compare
And refactor each encoding scanner into a subclass
ab7bcf5
to
f930333
Compare
Motivation
Fixes #3494 and #2446. Second time is the charm 😅
In #3583, we fixed bytesize counting for UTF-8, but didn't advance
@pos
based on the byte size when finding the correct line, which meant the document would still get corrupted.Implementation
I split scanning into 3 subclasses since I found the logic for each encoding to be sufficiently different. We also don't want to pay the price of checking the encoding inside the many loops.
For posterity, the spec explains that positions use code unit lengths. For each encoding, that means a different thing:
I implemented the logic for each in subclasses.
Automated Tests
Added a bunch of tests for each encoding and some edge cases, which should hopefully help us prevent further regressions.
Manual Tests
Tested on VS Code and NeoVim (UTF-16 and UTF-8).