How We Made Bracket Pair Colorization 10,000x Faster In VS Code #4794

Merged
gregvanl merged 21 commits into vnext from hediet/blog-post-bracket-pair-colorizer
Sep 29, 2021

hediet
Member

@hediet hediet commented Aug 31, 2021

  • Figure out how to support LaTeX in vscode blog posts.

@hediet hediet changed the base branch from main to vnext August 31, 2021 10:24
@hediet
Member Author

hediet commented Aug 31, 2021

  • Figure out how to support LaTeX in vscode blog posts.

@mjbvz do you have ideas how we can enable support for LaTeX?

} @3
```

Does the bracket at @1 close the bracket at @2 or at @3? This depends on the length of the template literal expression, which only a tokenizer with unbounded state (i.e. a non-regular tokenizer) can determine correctly!

Contributor

close "with" the? (I'm missing a word.)

Member Author

I think "match" is better anyway:

Does the bracket at @1 match the bracket at @2 or at @3?

Member Author

I asked @alexr00 and she said "close" is ok 😉

Contributor

It reads as if @1 might close one of the other brackets when it is one of the others closing @1.

Member

@alexdima alexdima left a comment

Very nice! To be honest, the theoretical analysis was a bit too much for my taste, and I really missed seeing some measurements of time / memory / comparisons. Just to show some kind of before/after in numbers on some chart.

blogs/2021/09/06/bracket-pair-colorization.md Outdated Show resolved Hide resolved

We have two data structures for this task: The *before edit position mapper* and the *node reader*.

The [position mapper](https://github.com/microsoft/vscode/blob/f8e9f87b6554b527c61ba963d0c96c7687cbaae9/src/vs/editor/common/model/bracketPairColorizer/beforeEditPositionMapper.ts#L17) maps a position in the new document (after applying the edit) to the old document (before applying the edit), if possible. It also tells us the length between the current position and the next edit (or 0, if we are in an edit). This is done in $\mathcal{O}(1)$.

Member

I suggest linking to the code at the end. I'm also not 100% sure if these two paragraphs bring a lot of value, as they seem to describe the implementation in very high detail.

The beauty of a recursive-descent parser is that we can use anchor sets to improve error recovery.

Consider the following example:
```

Member

maybe an image is better here where you can draw arrows or things


This needs to be considered when reusing nodes: the pair `( } )` cannot be reused when it is prepended with `{`! We use bit-sets to encode anchor sets and compute, for every node, the set of unopened brackets it contains. If the two sets intersect, we cannot reuse the node. Luckily, there are only a few bracket types, so this does not affect performance too much.
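
A minimal sketch of the reuse check described above, assuming one bit per bracket kind. All names here (`ROUND`, `CURLY`, `SQUARE`, `unopenedBrackets`, `canReuse`) are hypothetical and not taken from the VS Code source, and the bracket matching is deliberately simplified.

```typescript
// Hypothetical sketch; names are illustrative, not the VS Code implementation.
const ROUND = 1 << 0;  // ( )
const CURLY = 1 << 1;  // { }
const SQUARE = 1 << 2; // [ ]

const OPENERS = new Map<string, number>([["(", ROUND], ["{", CURLY], ["[", SQUARE]]);
const CLOSERS = new Map<string, number>([[")", ROUND], ["}", CURLY], ["]", SQUARE]]);

// Bit-set of closing-bracket kinds in `text` that have no matching opener in `text`.
// Simplification: a closer only matches the innermost open bracket.
function unopenedBrackets(text: string): number {
  const open: number[] = []; // stack of currently open bracket kinds
  let unopened = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const opener = OPENERS.get(ch);
    const closer = CLOSERS.get(ch);
    if (opener !== undefined) {
      open.push(opener);
    } else if (closer !== undefined) {
      if (open.length > 0 && open[open.length - 1] === closer) {
        open.pop();
      } else {
        unopened |= closer;
      }
    }
  }
  return unopened;
}

// A node may be reused only if none of its unopened closers is in the anchor set,
// i.e. the set of closing brackets that enclosing (ancestor) pairs are waiting for.
function canReuse(nodeUnopened: number, anchorSet: number): boolean {
  return (nodeUnopened & anchorSet) === 0;
}

const pair = unopenedBrackets("( } )");
console.log(canReuse(pair, 0));     // no enclosing bracket: reusable
console.log(canReuse(pair, CURLY)); // after prepending `{`: not reusable
```

Because the intersection test is a single bitwise AND, the check stays cheap as long as the number of bracket kinds fits into one machine word.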

## Outlook

Member

Wait, bar charts please! I suggest putting at least as much effort as went into the big O analysis into measuring the time it takes to colorize, or the memory used, in checker.ts, sqlite3.c, and maybe a couple more files like bootstrap.css. I would also compare with 2-3 extensions, like bracketpair1, bracketpair2, rainbow brackets. You can also compare it with the previous bracket matching implementation that we still have.

@hediet hediet self-assigned this Sep 7, 2021
@hediet hediet added this to the September 2021 milestone Sep 7, 2021
@hediet
Member Author

hediet commented Sep 20, 2021

@CoenraadS We would love to hear your feedback! You can view it by pressing `.` and opening the markdown file in github.dev. Is it okay for you that we mention you in this blog post?

@CoenraadS

CoenraadS commented Sep 20, 2021 via email

@CoenraadS

CoenraadS commented Sep 20, 2021 via email

@hediet
Member Author

hediet commented Sep 20, 2021

> Now it feels like whoever reads this blog first gets a story about how bad my extension is which makes me feel bit self conscious

Oh I'm very sorry, that was certainly not my intention when I wrote this article! I'll revisit that passage.

We really like your extension! And I don't think, as stated in the article, that your bracket pair colorizer extension could ever be as performant as the native implementation due to VS Code not offering an incremental decoration API (and of course missing access to tokens). I think your extension is as fast as it could be given the technical limitations it has.

We would really like to give you credit, though, for your initial implementation of the extension!

@hediet hediet changed the title How We Made Bracket Pair Colorization 10,000x Faster How We Made Bracket Pair Colorization 10,000x Faster In VS Code Sep 20, 2021
@hediet
Member Author

hediet commented Sep 20, 2021

@CoenraadS I improved the wording. I hope it is clearer now that we really appreciate your work and that these performance issues are mainly caused by missing incremental decoration API / token API on our side. The blog post should not at all indicate that you could have done better!

Only a native implementation could make use of these advanced data-structures as they don't need to go through the decoration API, but can directly be queried when rendering.

What do you think?

@CoenraadS

CoenraadS commented Sep 20, 2021 via email

@hediet
Member Author

hediet commented Sep 20, 2021

Awesome ;) Are you fine with linking to your github profile? We can also use your full name in the introduction if you like!

@CoenraadS

CoenraadS commented Sep 20, 2021 via email


![Native implementation needs less than a millisecond to process text changes in checker.ts](./checker_ts-native.gif)

Without being limited by public API design, we could use (2,3)-trees, recursion-free tree-traversal, bit-arithmetic, incremental parsing and other techniques to reduce the extension's worst-case update complexity (i.e. the time required to process user input when a document has already been opened) from $\mathcal{O}(N + E)$ to $\mathcal{O}(\mathrm{log}^3 N + E)$ with $N$ being the document size and $E$ the edit size, assuming the nesting level of bracket pairs is bounded by $\mathcal{O}(\mathrm{log} N)$.
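
To get a rough feel for the gap between the two bounds, here is an illustrative calculation. It assumes base-2 logarithms, a document of $N = 2^{20}$ characters and a single-character edit ($E = 1$), and it ignores the constant factors that big-O notation hides, so the numbers are not measurements.

$$
N + E = 2^{20} + 1 = 1\,048\,577, \qquad \log_2^3 N + E = 20^3 + 1 = 8\,001
$$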

Contributor

Is there a link we can add for people who are not familiar with the big-O notation and what that means?

Contributor

I see that this is explained below too, so maybe it can be removed from here? We could just say that we implemented more advanced algorithms (and their names) which greatly reduced the complexity of the code and improved the performance. And then say "read more below for the details"

Member Author

For me, this is the abstract of the blog post and the time-complexity reduction of the update operation is the main result. That is the thing that (hopefully) motivates interested readers (who know about big O notation) to learn how we did it.

I would not say we reduced the complexity of the algorithm though :D Thanks for pointing out that I should say "time-complexity" here.
I'll link to wikipedia.

@egamma egamma requested review from egamma and removed request for egamma September 27, 2021 10:04
@gregvanl gregvanl merged commit 951bd26 into vnext Sep 29, 2021
@ItalyPaleAle ItalyPaleAle deleted the hediet/blog-post-bracket-pair-colorizer branch October 7, 2021 17:18
7 participants