Fix off-by-one error converting to LSP UTF8 offsets with multi-byte char #17003

Merged
merged 1 commit into from Apr 3, 2024

Commits on Apr 3, 2024

  1. Fix off-by-one error converting to LSP UTF8 offsets with multi-byte char

    On this file,
    
    ```rust
    fn main() {
        let 된장 = 1;
    }
    ```
    
    when using `"positionEncodings":["utf-16"]` I get an "unused variable" diagnostic on the variable
    name (codepoint offset range `8..10`). So far so good.
    
    When using `"positionEncodings":["utf-8"]`, I expect to get the equivalent range in bytes (LSP:
    "Character offsets count UTF-8 code units (e.g bytes)."), which is `8..14`, because both
    characters are 3 bytes in UTF-8. However, I actually get `10..14`.
    
    Looks like this is because we accidentally treat a 1-based index as an offset value: when
    converting from our internal char-indices to LSP byte offsets, we look at one character too many.
    This causes wrong results if the extra character is a multi-byte one, such as when computing
    the start coordinate of 된장.
    
    Fix that by actually passing an offset. While at it, fix the variable name of the line number,
    which is not an offset (yet).
    
    Originally reported at kakoune-lsp/kakoune-lsp#740
    krobelus committed Apr 3, 2024
    d24b0ba
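
The numbers in the commit message can be reproduced with a small standalone sketch. This is not the rust-analyzer code. The function names are hypothetical, and the assumptions are that the incoming columns are 1-based character columns and that the buggy path subtracted 1 only after measuring the prefix in bytes. That assumed shape is consistent with the reported `10..14`: the extra character at the start is `된` (3 bytes), so subtracting 1 leaves the start 2 bytes too far, while the extra character at the end is an ASCII space (1 byte), so the end comes out right by accident.

```rust
/// Byte length (UTF-8) of the first `n` chars of `line`.
fn prefix_len_utf8(line: &str, n: usize) -> usize {
    line.chars().take(n).map(char::len_utf8).sum()
}

/// Hypothetical buggy conversion: the 1-based column is used as a char
/// count (one character too many), and 1 is subtracted in bytes afterwards.
fn buggy_utf8_offset(line: &str, column_1based: usize) -> usize {
    prefix_len_utf8(line, column_1based).saturating_sub(1)
}

/// Hypothetical fixed conversion: turn the 1-based column into a 0-based
/// char offset first, then measure that prefix in bytes.
fn fixed_utf8_offset(line: &str, column_1based: usize) -> usize {
    prefix_len_utf8(line, column_1based.saturating_sub(1))
}

/// Same idea for UTF-16 code units, for comparison with the `utf-16` result.
fn utf16_offset(line: &str, column_1based: usize) -> usize {
    line.chars()
        .take(column_1based.saturating_sub(1))
        .map(char::len_utf16)
        .sum()
}

fn main() {
    let line = "    let 된장 = 1;";
    // 1-based char columns of the identifier: starts at 9, ends before 11.
    let (start, end) = (9, 11);

    println!("utf-16: {}..{}", utf16_offset(line, start), utf16_offset(line, end));
    println!(
        "utf-8 (buggy): {}..{}",
        buggy_utf8_offset(line, start),
        buggy_utf8_offset(line, end)
    );
    println!(
        "utf-8 (fixed): {}..{}",
        fixed_utf8_offset(line, start),
        fixed_utf8_offset(line, end)
    );
}
```

On ASCII-only lines both variants agree, which is presumably why the problem only shows up when a multi-byte character sits directly at the start of the range.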