Improve LSP semantic tokens #4270
Conversation
The existing semantic tokens implementation had a couple of issues. For example, `message MyMessage` was all considered a semantic token of type "struct", when only `MyMessage` should be considered a struct and `message` should be a keyword. Iterating on that, I've added some more test cases and handling for the various protobuf types. To handle things that aren't directly in the file's symbol's IR, I reworked the main loop to just collect the tokens up front, and then figure out sorting and encoding once we've grabbed all of the tokens.

I'm fairly certain this can be improved further, but I think it's already a decent improvement on what we have, and it lays the groundwork for future tweaks with additional testing.

For testing this, I highly recommend checking out this PR, running `make installbuf`, and then, in Neovim, using the `:Inspect` command on different tokens under the cursor, which should show a "Semantic Tokens" section containing the semantic token under the cursor.
The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).
The impl in `server.go` was getting unwieldy, and the constants only pertained to semantic tokens, so I figured a separate file was reasonable now.
```go
// Collect all comments and certain keywords that can't be fetched in the IR from the token stream.
for tok := range astFile.Stream().All() {
	if tok.Kind() == token.Comment {
		collectToken(tok.Span(), semanticTypeComment, 0, keyword.Unknown)
	}
	kw := tok.Keyword()
	switch kw {
	// These keywords seemingly are not easy to reach via the IR.
	case keyword.Option, keyword.Reserved, keyword.To, keyword.Returns:
		collectToken(tok.Span(), semanticTypeKeyword, 0, kw)
	}
}
```
Originally I had this grabbing all the keywords, but that's too greedy: syntax operators like `=`, `;`, `{`, and `}` are all considered keywords in the taxa, and that makes the highlighting too confusing (everything becomes bolded as a keyword).
Instead, I tried to be more specific and go through the individual elements below. Again, this could probably be improved; I'm doubtless missing elements we could add.
```go
// Invalid import should not have string token (only resolved imports get string tokens)
{6, 12, 13, semanticTypeString, "invalid import path should not have string token"},
```
There's a test case with an import of a file that doesn't exist, so the IR doesn't have it; that seems fine to me, and the alternative is uglier.
```go
{0, 7, 1, semanticTypeKeyword, "'=' should not be keyword"},
{0, 17, 1, semanticTypeKeyword, "';' should not be keyword"},
{10, 20, 1, semanticTypeKeyword, "'=' should not be keyword"},
{10, 26, 1, semanticTypeKeyword, "';' should not be keyword"},
{13, 14, 1, semanticTypeKeyword, "'=' should not be keyword"},
{13, 17, 1, semanticTypeKeyword, "';' should not be keyword"},
```
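A negative check like these test tuples could be sketched as the following helper. This is a hedged illustration, not the PR's actual test harness; `tokenInfo`, `hasTokenType`, and the legend index are hypothetical.

```go
package main

import "fmt"

// tokenInfo mirrors a collected token's position and type; the field names
// are illustrative, not the PR's actual test types.
type tokenInfo struct{ line, col, length, tokenType uint32 }

// hasTokenType reports whether any collected token exactly matches the given
// span and semantic type; negative tests assert this is false for spans
// like '=' and ';' paired with the keyword token type.
func hasTokenType(toks []tokenInfo, line, col, length, tokenType uint32) bool {
	for _, t := range toks {
		if t.line == line && t.col == col && t.length == length && t.tokenType == tokenType {
			return true
		}
	}
	return false
}

func main() {
	const semanticTypeKeyword = 1 // assumed legend index, for illustration only
	// Only "message" at (0, 0) was collected as a keyword.
	toks := []tokenInfo{{0, 0, 7, semanticTypeKeyword}}
	fmt.Println(hasTokenType(toks, 0, 7, 1, semanticTypeKeyword)) // → false
}
```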
I don't think any of these "operators"(?) ought to be considered keywords; I'm just leaving these negative tests around.
I think it would be too noisy to consider proto's operators for keyword tokens, since it's mostly `=` for assignment and `;` for terminating declaration statements. I like this as-is.
```go
// decorator (option)
{10, 9, 10, semanticTypeDecorator, "'deprecated' as decorator"},
// built-in type
{13, 2, 6, semanticTypeProperty, "'string' as property"},
```
Need to revisit this; I'm not sure that in `string name = 1;` the `string` is a "property" (`name` is the property). Maybe it's more just a type or keyword?
I like the idea of having built-in scalars be distinct from types declared in protos, but they aren't exactly keywords either (e.g. I think `bool` is a type, while `true` and `false` are keywords). Since there isn't really a way to differentiate the types, maybe the scalar types should just be `type`, based on the spec.
Yeah, I think `type` is reasonable; I wish there was a `scalarType` or something in the spec, oh well! 66a7a88
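The resolution above (built-in scalars map to the `type` token) could look something like this minimal sketch. The scalar list comes from the protobuf language spec; `semanticTypeFor` is a hypothetical helper, not the PR's actual code, and it oversimplifies by treating every non-scalar identifier as a message type.

```go
package main

import "fmt"

// scalarKeywords lists protobuf's built-in scalar value types per the
// language spec; under the approach discussed above, these get the LSP
// "type" semantic token.
var scalarKeywords = map[string]bool{
	"double": true, "float": true,
	"int32": true, "int64": true, "uint32": true, "uint64": true,
	"sint32": true, "sint64": true,
	"fixed32": true, "fixed64": true, "sfixed32": true, "sfixed64": true,
	"bool": true, "string": true, "bytes": true,
}

// semanticTypeFor is a hypothetical helper: built-in scalars become "type",
// while everything else stands in for user-declared message names ("struct").
func semanticTypeFor(ident string) string {
	if scalarKeywords[ident] {
		return "type"
	}
	return "struct"
}

func main() {
	fmt.Println(semanticTypeFor("string"))    // → type
	fmt.Println(semanticTypeFor("MyMessage")) // → struct
}
```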
Included the "syntax highlighting" bit because some users may not know what semantic tokens actually do from a feature perspective.