Consider supporting embedded language grammars when using semantic tokens #163292
Labels: feature-request (request for new features or functionality), semantic-tokens (semantic tokens issues), tokenization (text tokenization)
This has been discussed a little in some other issues:
The issue is that when a language uses semantic tokens, injected embedded grammars do not work. The suggestion given by @alexdima at #113640 (comment) is for the embedded language to coordinate with the language server to suppress semantic tokens where this grammar needs to be used (which Rust has gone ahead with, adding an option to suppress semantic tokens on strings).
This does not seem like a very scalable solution. I had a request at Dart-Code/Dart-Code#4212 related to this, where another extension provides highlighting of some strings inside Dart. When semantic tokens are disabled, everything is fine, but with semantic tokens enabled the Dart server produces string tokens (because strings are a non-default colour), which breaks the embedded language.
Having the Dart server suppress these tokens is not a good solution because:
It would be much better if this could be done without changes to the server. I don't know what a solution to this would look like, but perhaps the injected language could be allowed to layer its scopes over the semantic tokens (while semantic tokens are more accurate, I don't believe that's a reason to prevent this), or the injected language could be allowed to apply specifically to some tokens (like strings) from the server (though VS Code's lack of support for multiline semantic tokens may complicate that).
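To make the layering idea concrete, here is a rough toy model of the two behaviours. It is only a sketch: the `Token` type, the `scopeAt` function and the priority lists are hypothetical names made up for illustration, not VS Code APIs, and the real tokenization pipeline is certainly more involved than a simple priority lookup.

```typescript
// Toy model only: nothing here is a VS Code API. All names are hypothetical.
type Source = "grammar" | "injection" | "semantic";

interface Token {
  start: number; // inclusive character offset
  end: number;   // exclusive character offset
  scope: string;
  source: Source;
}

// Returns the scope that applies at `offset`, trying each source in the
// given priority order and using the first token that covers the offset.
function scopeAt(
  tokens: Token[],
  offset: number,
  priority: Source[]
): string | undefined {
  for (const source of priority) {
    const hit = tokens.find(
      t => t.source === source && t.start <= offset && offset < t.end
    );
    if (hit) {
      return hit.scope;
    }
  }
  return undefined;
}

// Tokens for the single line:  const q = "SELECT 1";
// The string literal (quotes included) covers offsets 10..20 (end exclusive),
// and the injected grammar marks SELECT at offsets 11..17.
const tokens: Token[] = [
  { start: 10, end: 20, scope: "string.quoted.double", source: "grammar" },
  { start: 11, end: 17, scope: "keyword.other.sql", source: "injection" },
  { start: 10, end: 20, scope: "string", source: "semantic" },
];

// Behaviour as described in this issue: the semantic token wins everywhere.
const currentOrder: Source[] = ["semantic", "injection", "grammar"];

// Suggested alternative: injected scopes are layered over semantic tokens.
const layeredOrder: Source[] = ["injection", "semantic", "grammar"];

const currentScope = scopeAt(tokens, 12, currentOrder); // inside SELECT
const layeredScope = scopeAt(tokens, 12, layeredOrder); // inside SELECT
const outsideScope = scopeAt(tokens, 18, layeredOrder); // the "1" after SELECT
```

In this model the only difference between the two behaviours is the priority order, so the server's string token still colours the parts of the string that the injected grammar does not claim.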
If there are caveats to switching to semantic tokens, languages may think twice about switching to them (or more users may turn them off), which would be a shame.