
Use tree-sitter for syntax coloring #2555

Open: wants to merge 3 commits into base: master


georgewfraser commented Jun 2, 2019

This PR uses tree-sitter to replace VSCode's builtin syntax coloring for Go. Tree-sitter is an incremental parsing framework, developed within GitHub and used by the Atom editor as a replacement for TextMate grammars. It produces full parse trees, yet it is efficient and incremental, so the parse tree can be updated on every keystroke.
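To make "incremental" concrete, here is a minimal sketch of how a single text change might be described to tree-sitter before re-parsing. The field names follow the edit object accepted by `tree.edit(...)` in the tree-sitter JavaScript bindings; `TsPoint`, `TsEdit`, `pointAt`, and `describeEdit` are hypothetical names made up for this illustration, and the sketch assumes offsets and columns are counted as JavaScript string indices, which may not match every binding.

```typescript
// Assumed shapes, mirroring the edit object taken by tree-sitter's tree.edit().
interface TsPoint { row: number; column: number }
interface TsEdit {
  startIndex: number;
  oldEndIndex: number;
  newEndIndex: number;
  startPosition: TsPoint;
  oldEndPosition: TsPoint;
  newEndPosition: TsPoint;
}

// Convert a string offset into a (row, column) position by counting newlines.
function pointAt(text: string, index: number): TsPoint {
  let row = 0;
  let lineStart = 0;
  for (let i = 0; i < index; i++) {
    if (text.charCodeAt(i) === 10) { // "\n"
      row++;
      lineStart = i + 1;
    }
  }
  return { row, column: index - lineStart };
}

// Hypothetical helper: apply one replacement to oldText and describe it as an edit.
function describeEdit(
  oldText: string,
  offset: number,
  removedLength: number,
  inserted: string
): { edit: TsEdit; newText: string } {
  const newText =
    oldText.slice(0, offset) + inserted + oldText.slice(offset + removedLength);
  const edit: TsEdit = {
    startIndex: offset,
    oldEndIndex: offset + removedLength,
    newEndIndex: offset + inserted.length,
    startPosition: pointAt(oldText, offset),
    oldEndPosition: pointAt(oldText, offset + removedLength),
    newEndPosition: pointAt(newText, offset + inserted.length),
  };
  return { edit, newText };
}

// Replace the literal "1" with "42" on the second line.
const result = describeEdit("package main\nvar x = 1\n", 21, 1, "42");
console.log(result.edit, JSON.stringify(result.newText));
```

With a real parser, the usual pattern would be to call `tree.edit(result.edit)` on the old tree and then pass that tree as the second argument to `parser.parse(result.newText, oldTree)`, so that unchanged subtrees can be reused.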

I originally created an independent extension that does this for several languages, but I think it makes more sense for each language-specific extension to be responsible for syntax coloring, so I've repackaged it as an npm library.

The basic strategy is:

This strategy allows more accurate coloring of types, and it makes it possible to color based on scope. Notice how top-level vars are blue, but local vars are not:

[Image: Screen Shot 2019-06-02 at 10 45 56 AM]

This works correctly when locals shadow top-level vars:

[Animation: shadow mov]
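As a rough illustration of scope-based coloring with shadowing, the sketch below classifies each variable reference in a toy tree as top-level or local. The `ToyNode` and `ToyFile` shapes and the `colorUses` function are hypothetical stand-ins invented for this example; they are not the types tree-sitter produces and not the code in this PR.

```typescript
// Toy syntax tree (hypothetical, not tree-sitter's node type).
type ToyNode =
  | { kind: "block"; children: ToyNode[] }      // opens a new local scope
  | { kind: "decl"; name: string }              // declares a variable
  | { kind: "use"; id: number; name: string };  // references a variable
interface ToyFile { children: ToyNode[] }

type ScopeColor = "topLevel" | "local" | "unknown";

function colorUses(file: ToyFile): Map<number, ScopeColor> {
  const colors = new Map<number, ScopeColor>();

  // Top-level declarations are collected up front, so they are
  // visible regardless of where in the file they appear.
  const topLevel = new Set<string>();
  for (const child of file.children) {
    if (child.kind === "decl") topLevel.add(child.name);
  }

  // `locals` holds the enclosing local scopes, innermost last.
  function visit(node: ToyNode, locals: Set<string>[]): void {
    switch (node.kind) {
      case "block": {
        const inner = [...locals, new Set<string>()];
        for (const child of node.children) visit(child, inner);
        break;
      }
      case "decl": {
        // At file level there is no local scope; those names are in topLevel already.
        const innermost = locals[locals.length - 1];
        if (innermost !== undefined) innermost.add(node.name);
        break;
      }
      case "use": {
        // A local declaration wins over a top-level one (shadowing).
        if (locals.some((scope) => scope.has(node.name))) {
          colors.set(node.id, "local");
        } else if (topLevel.has(node.name)) {
          colors.set(node.id, "topLevel");
        } else {
          colors.set(node.id, "unknown");
        }
        break;
      }
    }
  }

  for (const child of file.children) visit(child, []);
  return colors;
}

// Models: var x = 1; func f() { print(x); x := 2; print(x) }
const example: ToyFile = {
  children: [
    { kind: "decl", name: "x" },
    {
      kind: "block",
      children: [
        { kind: "use", id: 1, name: "x" }, // before the local declaration
        { kind: "decl", name: "x" },       // shadows the top-level x
        { kind: "use", id: 2, name: "x" }, // after the local declaration
      ],
    },
  ],
};
const colors = colorUses(example);
console.log(colors.get(1), colors.get(2));
```

The first reference resolves to the top-level `x` and the second to the shadowing local, which is the distinction a TextMate grammar cannot make because it has no notion of scope.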

We can also do fancy things like underline mutable vars:

[Animation: mutable mov]

In the future, the tree produced by tree-sitter could be combined with semantic information from Go language server to provide even more accurate coloring, for example coloring constants differently even when they are in other packages.

The performance-critical section is the JavaScript function that walks the visible part of the tree produced by tree-sitter and applies colors. Note that this doesn't need to update every frame, because the setDecorations API is designed to be used asynchronously and VSCode will "patch up" small edits even when the decorations are slightly out-of-date. In practice, though, coloring takes about 2ms on both small and large files:

[Image: Screen Shot 2019-06-02 at 12 47 11 PM]

Large file:

[Image: Screen Shot 2019-06-02 at 12 49 00 PM]
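One plausible reason the cost stays flat across file sizes is that a walk restricted to the visible rows can skip every subtree that lies entirely off screen. The sketch below shows that pruning on a hand-built tree. `SyntaxNodeLike` is an assumed minimal subset of a tree-sitter node (`type`, `startPosition`, `endPosition`, `children`), while `makeNode` and `collectVisible` are hypothetical helpers, not the functions in this PR.

```typescript
// Assumed minimal subset of a tree-sitter syntax node.
interface SyntaxNodeLike {
  type: string;
  startPosition: { row: number; column: number };
  endPosition: { row: number; column: number };
  children: SyntaxNodeLike[];
}

// Collect nodes whose type is in `wanted` and that overlap rows
// [firstRow, lastRow]; subtrees fully outside that range are never entered.
function collectVisible(
  node: SyntaxNodeLike,
  firstRow: number,
  lastRow: number,
  wanted: Set<string>,
  out: SyntaxNodeLike[] = []
): SyntaxNodeLike[] {
  if (node.endPosition.row < firstRow || node.startPosition.row > lastRow) {
    return out; // entirely off screen: prune this subtree
  }
  if (wanted.has(node.type)) out.push(node);
  for (const child of node.children) {
    collectVisible(child, firstRow, lastRow, wanted, out);
  }
  return out;
}

// Hypothetical helper for building a test tree by hand.
function makeNode(
  type: string,
  startRow: number,
  endRow: number,
  children: SyntaxNodeLike[] = []
): SyntaxNodeLike {
  return {
    type,
    startPosition: { row: startRow, column: 0 },
    endPosition: { row: endRow, column: 0 },
    children,
  };
}

// A 100-row file with two functions; only rows 45..70 are on screen.
const exampleTree = makeNode("source_file", 0, 99, [
  makeNode("function_declaration", 0, 9, [makeNode("type_identifier", 1, 1)]),
  makeNode("function_declaration", 50, 60, [makeNode("type_identifier", 52, 52)]),
]);
const visible = collectVisible(exampleTree, 45, 70, new Set(["type_identifier"]));
console.log(visible.map((n) => n.startPosition.row));
```

In an extension, the collected nodes would then be converted to ranges and handed to `setDecorations`; that step is omitted here because it depends on the VS Code API.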

If you want to check out the source code of the vscode-tree-sitter npm library, it is published from the npm branch of my repo.


msftclas commented Jun 2, 2019

CLA assistant check
All CLA requirements met.

Author

georgewfraser commented Jun 22, 2019

@ramya-rao-a Any thoughts about this? I've been using it for a while myself and it's a big improvement over the TextMate syntax coloring.

Contributor

oneslash commented Jun 26, 2019

@ramya-rao-a let's push this please, it is a great improvement for vscode-go

georgewfraser force-pushed the georgewfraser:master branch from ba3e132 to edb8846 Oct 3, 2019
Author

georgewfraser commented Oct 3, 2019

@ramya-rao-a You commented on Twitter "I have been sitting on that for a while now, still trying to make up my mind if the Go extension is the right place for it or a separate extension :("

The official CPP extension has replaced VSCode's builtin TextMate-based coloring with a custom syntax colorizer that uses the setDecorations API, similar to this PR. The approach is slightly different: vscode-cpptools uses the actual C++ parser, while this PR uses a tree-sitter parser for Go. The advantage of tree-sitter is that it's incremental, so it's easy to get as-you-type updates to the coloring.

I think it would make sense for vscode-go to also override the builtin syntax coloring. Perhaps @sean-mcmanus has some comment?


sean-mcmanus commented Oct 3, 2019

@georgewfraser The C/C++ extension has not replaced the built-in TextMate coloring -- we still rely on TextMate for fast syntactic/lexical colorization, but then use the setDecorations to add "semantic" colorization on top of that. However, our approach has several inherent limitations/bugs that require additional APIs from VS Code to fix. Also, early versions did have lexical colorization that replaced the TextMate colors, but the performance using decorations was too poor and led to white space in comments being colored green, etc.

A clone of the Atom TextMate Go repo with a bug fix is at https://github.com/jeff-hykin/better-go-syntax if VS Code wants to switch to using that instead.

Author

georgewfraser commented Oct 4, 2019

> we still rely on TextMate for fast syntactic/lexical colorization, but then use the setDecorations to add "semantic" colorization on top of that

Thanks for clarifying, @sean-mcmanus. This is actually the same strategy used in this PR: basic tokens are colored using a simplified TextMate grammar, and only tricky things like types are colored using setDecorations.
