Use tree-sitter for syntax coloring #2555
This PR uses tree-sitter to replace VSCode's builtin syntax coloring for Go. Tree-sitter is an incremental parsing framework, developed within GitHub and used by the Atom editor as a replacement for TextMate grammars. It produces full parse trees, yet it is efficient and incremental enough that the parse tree can be updated on every keystroke.
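To make "updated on every keystroke" concrete, here is a minimal sketch of the bookkeeping an incremental reparse needs: a description of which span of the old text was replaced. The `TreeEdit` shape mirrors the edit object that tree-sitter's JavaScript bindings accept, but it is declared locally here, and `describeEdit` and `pointAt` are hypothetical helpers, not part of this PR or of the library. Offsets are plain JavaScript string indices; whether a particular binding counts columns the same way is something to verify.

```typescript
// Locally declared stand-ins for the edit descriptor used by tree-sitter's
// JavaScript bindings. This sketch has no dependencies.
interface TreePoint { row: number; column: number }
interface TreeEdit {
  startIndex: number;
  oldEndIndex: number;
  newEndIndex: number;
  startPosition: TreePoint;
  oldEndPosition: TreePoint;
  newEndPosition: TreePoint;
}

// Convert a string offset into a zero-based (row, column) point.
function pointAt(text: string, offset: number): TreePoint {
  let row = 0;
  let lineStart = 0;
  for (let i = 0; i < offset; i++) {
    if (text.charAt(i) === "\n") {
      row++;
      lineStart = i + 1;
    }
  }
  return { row: row, column: offset - lineStart };
}

// Hypothetical helper: describe replacing `removedLength` characters at
// `offset` with `inserted`, and return the resulting text alongside the edit.
function describeEdit(
  oldText: string,
  offset: number,
  removedLength: number,
  inserted: string
): { edit: TreeEdit; newText: string } {
  const oldEnd = offset + removedLength;
  const newEnd = offset + inserted.length;
  const newText = oldText.slice(0, offset) + inserted + oldText.slice(oldEnd);
  const edit: TreeEdit = {
    startIndex: offset,
    oldEndIndex: oldEnd,
    newEndIndex: newEnd,
    startPosition: pointAt(oldText, offset),
    oldEndPosition: pointAt(oldText, oldEnd),
    newEndPosition: pointAt(newText, newEnd),
  };
  return { edit: edit, newText: newText };
}
```

With the real library, the result would typically be applied as `oldTree.edit(edit)` followed by `parser.parse(newText, oldTree)`, which lets the parser reuse the unchanged parts of the old tree.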
I originally created an independent extension that does this for several languages, but I think it makes more sense for each language-specific extension to be responsible for its own syntax coloring, so I've repackaged it as an npm library.
The basic strategy is:
This strategy allows more accurate coloring of types, and it makes it possible to color based on scope. Notice how top-level vars are blue, but local vars are not:
This works correctly when locals shadow top-level vars:
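A rough sketch of how scope-based coloring with shadowing can be computed from a parse tree follows. The `SyntaxNode` type is a hypothetical, heavily simplified stand-in for a real tree-sitter node, and `colorIdentifiers` is not the code from this PR. It also assumes declarations appear before their uses, which is a simplification for package-level Go variables.

```typescript
// Hypothetical, simplified syntax node; a real tree-sitter node carries
// much more information (type, children, positions).
type SyntaxNode =
  | { kind: "block"; children: SyntaxNode[] }                // opens a new scope
  | { kind: "var_decl"; name: string }                       // declares a variable
  | { kind: "identifier"; name: string; offset: number };    // uses a variable

interface ColoredIdentifier {
  offset: number;
  name: string;
  color: "top-level" | "local";
}

function colorIdentifiers(root: SyntaxNode[]): ColoredIdentifier[] {
  const out: ColoredIdentifier[] = [];
  // scopes[0] is the file scope; each nested block pushes one more entry.
  const scopes: string[][] = [[]];

  function visit(node: SyntaxNode): void {
    if (node.kind === "var_decl") {
      const innermost = scopes[scopes.length - 1];
      if (innermost) innermost.push(node.name);
    } else if (node.kind === "identifier") {
      // Search from the innermost scope outward so a local declaration
      // shadows a top-level one with the same name.
      for (let depth = scopes.length - 1; depth >= 0; depth--) {
        const scope = scopes[depth];
        if (scope && scope.indexOf(node.name) !== -1) {
          out.push({
            offset: node.offset,
            name: node.name,
            color: depth === 0 ? "top-level" : "local",
          });
          break;
        }
      }
      // Identifiers that resolve to no known declaration are left uncolored.
    } else {
      scopes.push([]);
      for (const child of node.children) visit(child);
      scopes.pop();
    }
  }

  for (const node of root) visit(node);
  return out;
}
```

The same name can therefore receive different colors at different offsets, which a purely regex-based grammar cannot express because it has no notion of the enclosing scope.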
We can also do fancy things like underline mutable vars:
In the future, the tree produced by tree-sitter could be combined with semantic information from Go language server to provide even more accurate coloring, for example coloring constants differently even when they are in other packages.
If you want to check out the source code of the vscode-tree-sitter npm library, it is published from the npm branch of my repo.
The official C/C++ extension has replaced VSCode's builtin TextMate-based coloring with a custom syntax colorizer that uses the setDecorations API, similar to this PR. The approach is slightly different: vscode-cpptools uses the actual C++ parser, while this PR uses a tree-sitter parser for Go. The advantage of tree-sitter is that it's incremental, so it's easy to get as-you-type updates to the coloring.
I think it would make sense for vscode-go to also override the builtin syntax coloring. Perhaps @sean-mcmanus has some comment?
@georgewfraser The C/C++ extension has not replaced the built-in TextMate coloring -- we still rely on TextMate for fast syntactic/lexical colorization, but then use setDecorations to add "semantic" colorization on top of that. However, our approach has several inherent limitations/bugs that require additional APIs from VS Code to fix. Also, early versions did have lexical colorization that replaced the TextMate colors, but the performance using decorations was too poor and led to white space in comments being colored green, etc.
A clone of the Atom TextMate Go repo with a bug fix is at https://github.com/jeff-hykin/better-go-syntax if VS Code wants to switch to using that instead.
Thanks for clarifying, @sean-mcmanus. This is actually the same strategy used in this PR: basic tokens are colored using a simplified TextMate grammar, and only tricky things like types are colored using setDecorations.
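As an illustration of the decoration half of that split, the sketch below groups the "tricky" spans by scope so that each scope can be applied in a single setDecorations call. The `ColoredSpan` type, the scope names, and `groupByScope` are assumptions made for this example; they are not taken from the PR's source.

```typescript
// Hypothetical span produced by walking the parse tree: a scope name plus
// start/end offsets into the document.
interface ColoredSpan { scope: string; start: number; end: number }
type OffsetRange = [number, number];

// Group spans by scope. Every known scope gets an entry, even an empty one,
// so that stale decorations from a previous pass can be cleared.
function groupByScope(
  knownScopes: string[],
  spans: ColoredSpan[]
): Record<string, OffsetRange[]> {
  // Object.create(null) avoids collisions with inherited property names.
  const groups: Record<string, OffsetRange[]> = Object.create(null);
  for (const scope of knownScopes) {
    groups[scope] = [];
  }
  for (const span of spans) {
    const ranges = groups[span.scope];
    if (ranges) {
      ranges.push([span.start, span.end]);
    } else {
      groups[span.scope] = [[span.start, span.end]];
    }
  }
  return groups;
}
```

In an extension, each group would then be converted to editor ranges and passed to `editor.setDecorations(decorationType, ranges)`. Because each call replaces the ranges previously set for that decoration type, passing the empty groups matters as much as passing the populated ones.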