[question] Line/column information #17
In rust-analyzer, we maintain a separate index to translate UTF-8 offsets into (invalid) UTF-16 line/column positions, as per LSP: https://github.com/rust-analyzer/rust-analyzer/blob/fcdb387f0d7e76f325a858e4463efd5d7ed3efc3/crates/ra_ide_api/src/line_index.rs
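For readers who want a feel for what such an index involves, here is a minimal sketch using only the standard library. It is not the rust-analyzer implementation linked above: the names `LineIndex`, `LineCol`, and `line_col` are hypothetical, and unlike a real index it re-reads the source text on every lookup instead of caching per-line data about wide characters.

```rust
/// Hypothetical line index: maps absolute UTF-8 byte offsets to
/// zero-based (line, UTF-16 column) positions.
pub struct LineIndex {
    /// Byte offset at which each line starts; the first entry is always 0.
    line_starts: Vec<usize>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LineCol {
    pub line: usize,
    /// Column counted in UTF-16 code units.
    pub col_utf16: usize,
}

impl LineIndex {
    /// Records the byte offset following every '\n' in `text`.
    pub fn new(text: &str) -> LineIndex {
        let mut line_starts = vec![0];
        for (i, b) in text.bytes().enumerate() {
            if b == b'\n' {
                line_starts.push(i + 1);
            }
        }
        LineIndex { line_starts }
    }

    /// Translates a byte offset into a line/column pair.
    /// Returns `None` if `offset` is past the end of `text` or falls
    /// inside a multi-byte character. `text` must be the same string
    /// the index was built from.
    pub fn line_col(&self, text: &str, offset: usize) -> Option<LineCol> {
        if !text.is_char_boundary(offset) {
            return None;
        }
        // Number of line starts <= offset, minus one, is the line number.
        let line = self.line_starts.partition_point(|&start| start <= offset) - 1;
        let line_start = self.line_starts[line];
        // Sum the UTF-16 length of every character before the offset.
        let col_utf16: usize = text[line_start..offset]
            .chars()
            .map(char::len_utf16)
            .sum();
        Some(LineCol { line, col_utf16 })
    }
}
```

A `TextRange` would be converted by calling `line_col` once for its start offset and once for its end offset.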
A separate index sounds somewhat inconvenient. By the way, why UTF-16?
LSP requires UTF-16. This is actually one of the main reasons why a separate index makes sense: there's no universal definition of
Determining the column number is not as simple as performing byte arithmetic, because certain characters have different widths. Even if we only accepted ASCII, control characters aren't visible to the user.

This uses the unicode-width crate as an alternative to POSIX wcwidth, to determine (hopefully) the number of fixed-width cells that a Unicode character will take up on a terminal. For example, control characters are zero-width, while an emoji is likely double-width. See test cases for more information on that.

There is also the unicode-segmentation crate, which can handle extended grapheme clusters and such, but (a) we'll be outputting the line to the terminal and (b) there's no guarantee that the user's editor displays grapheme clusters as a single column. LSP measures in UTF-16, apparently. I use both Emacs and Vim from a terminal, so unicode-width applies to me. There's too much variation to try to solve that right now.

The columns can be considered a visual span; this gives us enough information to draw line annotations, which will happen soon.

Here are some useful links:

- https://hsivonen.fi/string-length/
- https://unicode.org/reports/tr29/
- rust-analyzer/rowan#17
- https://www.reddit.com/r/rust/comments/gpw2ra/how_is_the_rust_compiler_able_to_tell_the_visible/

DEV-10935
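To illustrate why byte arithmetic alone does not give "the" column, here is a small sketch that measures the same line prefix in three different units using only the standard library. The names `ColumnMeasures` and `measure_prefix` are made up for this example. Terminal display width, which is what the message above is about, would be a fourth measure and needs a crate such as unicode-width; it is deliberately left out here.

```rust
/// Hypothetical record of three ways to count the "column" of a position.
#[derive(Debug, PartialEq, Eq)]
pub struct ColumnMeasures {
    /// UTF-8 bytes before the position.
    pub bytes: usize,
    /// Unicode scalar values (Rust `char`s) before the position.
    pub scalars: usize,
    /// UTF-16 code units before the position (the unit LSP measures in).
    pub utf16_units: usize,
}

/// Measures the part of `line` that precedes `byte_offset`.
/// Returns `None` if the offset is out of bounds or splits a character.
pub fn measure_prefix(line: &str, byte_offset: usize) -> Option<ColumnMeasures> {
    // `str::get` checks both the bounds and the char boundary for us.
    let prefix = line.get(..byte_offset)?;
    Some(ColumnMeasures {
        bytes: prefix.len(),
        scalars: prefix.chars().count(),
        utf16_units: prefix.encode_utf16().count(),
    })
}
```

For the line `é😀x`, the position just before `x` is 6 in bytes, 2 in scalar values, and 3 in UTF-16 code units, so all three measures disagree on a single short line.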
FYI: thanks to |
It seems like SyntaxNode only has TextRange, which contains only information about absolute offsets. But how do you handle line/column ranges (necessary for printing errors)?