Consider supporting embedded language grammars when using semantic tokens #163292

DanTup · 2022-10-11T13:03:12Z

This has been discussed a little in some other issues:

The issue is that when a language uses semantic tokens, injected embedded grammars do not work. The suggestion given by @alexdima at #113640 (comment) is for the embedded language to coordinate with the language server to suppress semantic tokens where the embedded grammar needs to be used (which Rust has gone ahead with, adding an option to suppress semantic tokens on strings).

This does not seem like a very scalable solution. I had a request at Dart-Code/Dart-Code#4212 related to this, where another extension provides highlighting of some strings inside Dart. When semantic tokens are disabled, everything is fine, but with semantic tokens enabled the Dart server produces string tokens (because strings are a non-default colour), and those tokens break the embedded language's highlighting.

Having the Dart server suppress these tokens is not a good solution because:

It means strings that aren't in the embedded language's format would lose their colouring
It requires an LSP server (which is intended to be generic and editor-agnostic by design) to make changes for some specific functionality of another extension (of which there could be many, with varying needs)
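To make the first objection concrete, here is a minimal sketch of what server-side suppression amounts to. The `DecodedToken` shape and `suppressTokenTypes` helper are hypothetical names for illustration, not the Dart or Rust server's actual code. Because the server cannot know which strings an injected grammar cares about, a type-based filter drops the semantic token for every string, embedded or not.

```typescript
// Hypothetical decoded token shape for illustration; this is not the LSP wire format.
interface DecodedToken {
  line: number;
  startChar: number;
  length: number;
  type: string;
}

// Drops every token whose type is in `suppressed`, so that the editor falls
// back to grammar-based highlighting (including injected grammars) there.
function suppressTokenTypes(
  tokens: DecodedToken[],
  suppressed: ReadonlySet<string>,
): DecodedToken[] {
  return tokens.filter((token) => !suppressed.has(token.type));
}

const tokens: DecodedToken[] = [
  { line: 0, startChar: 0, length: 5, type: "keyword" },
  { line: 0, startChar: 6, length: 1, type: "variable" },
  { line: 0, startChar: 10, length: 14, type: "string" }, // a string holding embedded code
  { line: 1, startChar: 6, length: 7, type: "string" }, // an ordinary string
];

// Both strings lose their semantic token, not only the embedded one.
const remaining = suppressTokenTypes(tokens, new Set(["string"]));
```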

It would be much better if this could be done without changes to the server. I don't know what a solution to this would look like, but perhaps the injected language could be allowed to layer its scopes over the semantic tokens (while semantic tokens are more accurate, I don't believe that's a reason to prevent this), or the injected language could be allowed to apply specifically to some tokens from the server, such as strings (though VS Code's lack of support for multiline semantic tokens may complicate that).
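As a rough illustration of the layering idea, the sketch below merges two sets of spans on a single line: wherever an injected-grammar span overlaps a semantic token, the injected scope wins, and the rest of the semantic token keeps its own scope. Everything here (`Span`, `layerInjected`, the scope names) is hypothetical and is not how VS Code combines tokens today. It also assumes the injected spans do not overlap each other, and it ignores injected spans that fall outside any semantic token.

```typescript
// A half-open [start, end) character range on one line, with a scope name.
interface Span {
  start: number;
  end: number;
  scope: string;
}

// Layers injected-grammar spans over semantic tokens: inside a semantic token,
// any overlapping injected span replaces the semantic scope for its range.
function layerInjected(semantic: Span[], injected: Span[]): Span[] {
  const result: Span[] = [];
  const sortedInjected = [...injected].sort((a, b) => a.start - b.start);
  for (const token of semantic) {
    let cursor = token.start;
    for (const inj of sortedInjected) {
      const start = Math.max(inj.start, token.start);
      const end = Math.min(inj.end, token.end);
      // Skip spans that do not overlap this token (or overlap an earlier injected span).
      if (start >= end || start < cursor) continue;
      if (start > cursor) {
        result.push({ start: cursor, end: start, scope: token.scope });
      }
      result.push({ start, end, scope: inj.scope });
      cursor = end;
    }
    if (cursor < token.end) {
      result.push({ start: cursor, end: token.end, scope: token.scope });
    }
  }
  return result;
}

const layered = layerInjected(
  [
    { start: 0, end: 5, scope: "keyword" },
    { start: 10, end: 30, scope: "string" },
  ],
  [
    { start: 11, end: 17, scope: "keyword.sql" },
    { start: 18, end: 19, scope: "operator.sql" },
  ],
);
```

In this example the string token is split into pieces, with the two injected scopes taking over only their own ranges, so the string keeps its colouring everywhere the injected grammar has nothing to say.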

If there are caveats to switching to semantic tokens, it may cause languages to think twice about adopting them (or may lead to more users turning them off), which would be a shame.

VSCodeTriageBot · 2022-10-11T14:19:13Z

This feature request is now a candidate for our backlog. The community has 60 days to upvote the issue. If it receives 20 upvotes we will move it to our backlog. If not, we will close it. To learn more about how we handle feature requests, please see our documentation.

Happy Coding!

VSCodeTriageBot · 2022-12-01T02:54:39Z

This feature request has not yet received the 20 community upvotes it takes to make it to our backlog. 10 days to go. To learn more about how we handle feature requests, please see our documentation.

Happy Coding!

wakaztahir · 2024-03-26T13:57:59Z

To support semantic tokens in embedded languages

Solution 1

1 - VS Code sends my LSP server a request for semantic tokens
2 - I lex my language and reach a token that contains an embedded language
3 - I set a field on this semantic token giving the start and length of the embedded region and which embedded language it uses

Cons:
1 - VS Code has to go through my semantic tokens, find the embedded-language regions, and get tokens for them from its own set of extensions or LSP servers
2 - An LSP server might need to be started to provide semantic tokens for the embedded language
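A minimal sketch of what Solution 1 might look like on the client side, assuming a hypothetical `embeddedLanguage` field on each token. No such field exists in LSP or the VS Code API today; the type and function names are made up for illustration.

```typescript
// Hypothetical token shape: `embeddedLanguage` is NOT part of LSP.
interface TokenWithEmbedding {
  line: number;
  startChar: number;
  length: number;
  type: string;
  embeddedLanguage?: string; // language id of the embedded content, if any
}

// A region the client would have to tokenize with another extension or server.
interface EmbeddedRegion {
  languageId: string;
  line: number;
  startChar: number;
  length: number;
}

// Walks the server's tokens and collects the regions marked as embedded
// (this is the extra pass described in the first con above).
function collectEmbeddedRegions(tokens: TokenWithEmbedding[]): EmbeddedRegion[] {
  const regions: EmbeddedRegion[] = [];
  for (const token of tokens) {
    if (token.embeddedLanguage !== undefined) {
      regions.push({
        languageId: token.embeddedLanguage,
        line: token.line,
        startChar: token.startChar,
        length: token.length,
      });
    }
  }
  return regions;
}

const regions = collectEmbeddedRegions([
  { line: 0, startChar: 0, length: 3, type: "keyword" },
  { line: 0, startChar: 8, length: 20, type: "string", embeddedLanguage: "sql" },
]);
```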

Solution 2

1 - VS Code sends my LSP server a request for semantic tokens
2 - I lex my language until I reach a token that contains an embedded language
3 - I send a request back to VS Code to get tokens for the embedded language (a two-way semanticTokens/range)
4 - VS Code provides me with the semantic tokens; I might need to parse these because the format is different. I add them to my own tokens and return the result to VS Code

Cons:

1 - Harder to implement: tokens are compressed when they are sent, so VS Code would have to send them to the server uncompressed
2 - Still requires an LSP server to be started to provide semantic tokens for the embedded language
3 - This approach is worse than Solution 1
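For context on the first con: the "compressed" format is presumably the relative integer encoding LSP uses for semantic tokens, where each token is five integers (deltaLine, deltaStartChar, length, tokenType, tokenModifiers) and positions are relative to the previous token. A server merging tokens from elsewhere would need to decode them to absolute positions first. The sketch below shows that decoding; the `AbsoluteToken` and `decodeSemanticTokens` names are my own.

```typescript
// A token with absolute positions; tokenType and tokenModifiers remain the
// numeric indices/bit flags defined by the server's token legend.
interface AbsoluteToken {
  line: number;
  startChar: number;
  length: number;
  tokenType: number;
  tokenModifiers: number;
}

// Decodes LSP's relative encoding: five integers per token, with the line
// relative to the previous token and the start character relative to the
// previous token's start only when both are on the same line.
function decodeSemanticTokens(data: number[]): AbsoluteToken[] {
  if (data.length % 5 !== 0) {
    throw new Error("semantic token data length must be a multiple of 5");
  }
  const tokens: AbsoluteToken[] = [];
  let line = 0;
  let startChar = 0;
  for (let i = 0; i < data.length; i += 5) {
    const deltaLine = data[i];
    const deltaStartChar = data[i + 1];
    line += deltaLine;
    startChar = deltaLine === 0 ? startChar + deltaStartChar : deltaStartChar;
    tokens.push({
      line,
      startChar,
      length: data[i + 2],
      tokenType: data[i + 3],
      tokenModifiers: data[i + 4],
    });
  }
  return tokens;
}

// Three tokens: two on line 2 (at characters 5 and 10) and one on line 5.
const decoded = decodeSemanticTokens([2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]);
```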

The biggest problem

I don't just need semantic token support for the embedded language; I also need support for completions and all the other language features.

DanTup mentioned this issue Oct 11, 2022

Embedded languages seems to be overriden Dart-Code/Dart-Code#4212

Open

VSCodeTriageBot assigned hediet Oct 11, 2022

hediet assigned alexdima Oct 11, 2022

alexdima added the feature-request (Request for new features or functionality), tokenization (Text tokenization), and semantic-tokens (Semantic tokens issues) labels Oct 11, 2022

alexdima removed their assignment Oct 11, 2022

VSCodeTriageBot added this to the Backlog Candidates milestone Oct 11, 2022

hediet modified the milestones: Backlog Candidates, Backlog Dec 1, 2022

ggrossetie mentioned this issue Oct 18, 2023

Improve syntax highlighting using a Language Server Protocol implementation asciidoctor/asciidoctor-vscode#686

Open

11 tasks

DanTup mentioned this issue Jan 11, 2024

Syntax Highlighting for Markdown Documentation Comments Dart-Code/Dart-Code#4925

Open

This was referenced Feb 21, 2024

Add Markdown support to Documentation Comments Dart-Code/Dart-Code#4999

Closed

Markdown decorations for the simple // comments Dart-Code/Dart-Code#4993

Closed

DanTup mentioned this issue May 28, 2024

Highligh issue with another textmate grammar Dart-Code/Dart-Code#5121

Closed
