How We Made Bracket Pair Colorization 10,000x Faster In VS Code #4794
hediet commented Aug 31, 2021
- Figure out how to support LaTeX in vscode blog posts.
… Might need reviews.
@mjbvz do you have ideas how we can enable support for LaTeX?
```
…
} @3
```

Does the bracket at @1 close the bracket at @2 or at @3? This depends on the length of the template literal expression, which only a tokenizer with unbounded state (i.e. a non-regular tokenizer) can determine correctly!
close "with" the? (I'm missing a word.)
I asked @alexr00 and she said "close" is ok 😉
Very nice! To be honest, the theoretical analysis was a bit too much for my taste, and I really missed seeing some measurements of time / memory / comparisons. Just to show some kind of before/after in numbers on some chart.
We have two data structures for this task: The *before edit position mapper* and the *node reader*.

The [position mapper](https://github.com/microsoft/vscode/blob/f8e9f87b6554b527c61ba963d0c96c7687cbaae9/src/vs/editor/common/model/bracketPairColorizer/beforeEditPositionMapper.ts#L17) maps a position in the new document (after applying the edit) to the old document (before applying the edit), if possible. It also tells us the length between the current position and the next edit (or 0, if we are in an edit). This is done in $\mathcal{O}(1)$.
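To make the mapping idea concrete, here is a minimal sketch, not the linked implementation. It assumes plain character offsets instead of line/column lengths, edits that are sorted and non-overlapping in old-document coordinates, and queries that arrive in non-decreasing order; under those assumptions each query costs amortized constant time. The names `TextEdit`, `BeforeEditOffsetMapper` and `query` are hypothetical.

```typescript
// Hypothetical edit shape: replaces old-document range [start, endExclusive)
// with newLength characters.
interface TextEdit {
  start: number;
  endExclusive: number;
  newLength: number;
}

interface MappedPosition {
  oldOffset: number | null;  // null when the queried position lies inside an edit
  lengthToNextEdit: number;  // 0 when inside an edit
}

class BeforeEditOffsetMapper {
  private readonly edits: readonly TextEdit[];
  private nextEditIdx = 0;
  // newOffset - oldOffset for unchanged text located before edits[nextEditIdx].
  private delta = 0;

  constructor(edits: readonly TextEdit[]) {
    this.edits = edits;
  }

  // Assumption: newOffset never decreases between calls.
  query(newOffset: number): MappedPosition {
    let next: TextEdit | undefined = this.edits[this.nextEditIdx];
    // Skip every edit whose inserted text ends at or before the queried position.
    while (next !== undefined && next.start + this.delta + next.newLength <= newOffset) {
      this.delta += next.newLength - (next.endExclusive - next.start);
      this.nextEditIdx++;
      next = this.edits[this.nextEditIdx];
    }
    if (next === undefined) {
      // No edit ahead: the rest of the document is unchanged.
      return { oldOffset: newOffset - this.delta, lengthToNextEdit: Number.POSITIVE_INFINITY };
    }
    const nextStartInNewDoc = next.start + this.delta;
    if (newOffset >= nextStartInNewDoc) {
      return { oldOffset: null, lengthToNextEdit: 0 };
    }
    return { oldOffset: newOffset - this.delta, lengthToNextEdit: nextStartInNewDoc - newOffset };
  }
}
```

For example, replacing `cd` in `abcdef` with `XYZ` corresponds to the single edit `{ start: 2, endExclusive: 4, newLength: 3 }`: offsets inside `XYZ` have no old counterpart, while the `e` at new offset 5 maps back to old offset 4.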
I suggest linking to the code at the end. I'm also not 100% sure if these two paragraphs bring a lot of value, as they seem to describe the implementation in very high detail.
The beauty of a recursive-descent parser is that we can use anchor sets to improve error recovery.

Consider the following example:
```
…
```
maybe an image is better here where you can draw arrows or things
This needs to be considered when reusing nodes: The pair `( } )` cannot be reused when prepending it with `{`! We use bit-sets to encode anchor sets and compute the set of containing unopened brackets for every node. If they intersect, we cannot reuse the node. Luckily, there are only a few bracket types, so this does not affect performance too much.
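The reuse check can be sketched with plain integer bit-sets. This is an illustration only: the constants, `unopenedBrackets` and `canReuse` are hypothetical names, and the helper scans a string rather than walking the AST the way the real parser would.

```typescript
// One bit per bracket kind; a set of brackets is the bitwise OR of its members.
type BracketSet = number;
const PAREN: BracketSet = 1 << 0;
const SQUARE: BracketSet = 1 << 1;
const CURLY: BracketSet = 1 << 2;

const OPENERS = new Map<string, BracketSet>([["(", PAREN], ["[", SQUARE], ["{", CURLY]]);
const CLOSERS = new Map<string, BracketSet>([[")", PAREN], ["]", SQUARE], ["}", CURLY]]);

// Returns the kinds of closing brackets in `text` that no opener inside `text` is waiting for.
function unopenedBrackets(text: string): BracketSet {
  const openStack: BracketSet[] = [];
  let unopened: BracketSet = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const opener = OPENERS.get(ch);
    if (opener !== undefined) {
      openStack.push(opener);
      continue;
    }
    const closer = CLOSERS.get(ch);
    if (closer === undefined) {
      continue; // not a bracket
    }
    const matchIndex = openStack.lastIndexOf(closer);
    if (matchIndex >= 0) {
      // Some enclosing opener expects this closer: close it and abandon inner unclosed openers.
      openStack.length = matchIndex;
    } else {
      unopened |= closer;
    }
  }
  return unopened;
}

// A node may be reused only if none of its unopened brackets is expected by the new context.
function canReuse(nodeUnopened: BracketSet, anchorSet: BracketSet): boolean {
  return (nodeUnopened & anchorSet) === 0;
}
```

With these definitions, `unopenedBrackets("( } )")` yields `CURLY`. At the top level the anchor set is empty, so the node is reusable; after prepending `{` the anchor set contains `CURLY`, the two sets intersect, and `canReuse` returns `false`.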
## Outlook
Wait, bar charts please! I suggest to put at least as much effort as in the big O analysis into measuring the times it takes to colorize or the used up memory in `checker.ts`, `sqlite3.c`, maybe a couple more like `bootstrap.css`. I would also compare with 2-3 extensions, like bracketpair1, bracketpair2, rainbow brackets. You can also compare it with the previous bracket matching implementation that we still have.
@CoenraadS We would love to hear your feedback! You can view it by pressing `.` and opening the markdown file in github.dev. Is it okay for you that we mention you in this blog post?
Hi, actually I prefer not to have my 'CoenraadS' name present, since the post is about the poor performance of my extension haha
Prefer just to have it written in non personal style. E.g. Bracket Pair Colorizer 2 attempted to improve the performance by reusing the token and bracket parsing... Etc
Perhaps also the blog could start with the section about the demand for better brackets, then a small section about the performance problem of existing solutions, and then how vscode solves them.
Now it feels like whoever reads this blog first gets a story about how bad my extension is, which makes me feel a bit self-conscious.
Oh I'm very sorry, that was certainly not my intention when I wrote this article! I'll revisit that passage. We really like your extension! And I don't think, as stated in the article, that your bracket pair colorizer extension could ever be as performant as the native implementation due to VS Code not offering an incremental decoration API (and of course missing access to tokens). I think your extension is as fast as it could be given the technical limitations it has. We really would like to give you credit though for your initial implementation of the extension!
@CoenraadS I improved the wording. I hope it is clearer now that we really appreciate your work and that these performance issues are mainly caused by missing incremental decoration API / token API on our side. The blog post should not at all indicate that you could have done better! Only a native implementation could make use of these advanced data-structures as they don't need to go through the decoration API, but can directly be queried when rendering. What do you think?
It's better now thanks 😊
Awesome ;) Are you fine with linking to your github profile? We can also use your full name in the introduction if you like!
Hi, it's ok to leave it out. I don't have a need for attention to my profile and I don't really want my full name to be visible either. I prefer my internet presence to be low profile 🙂
![Native implementation needs less than a millisecond to process text changes in checker.ts](./checker_ts-native.gif)

Without being limited by public API design, we could use (2,3)-trees, recursion-free tree-traversal, bit-arithmetic, incremental parsing and other techniques to reduce the extension's worst-case update complexity (i.e. the time required to process user-input when a document already has been opened) from $\mathcal{O}(N + E)$ to $\mathcal{O}(\mathrm{log}^3 N + E)$ with $N$ being the document size and $E$ the edit size, assuming the nesting level of bracket pairs is bounded by $\mathcal{O}(\mathrm{log} N)$.
Is there a link we can add for people who are not familiar with the big-O notation and what that means?
I see that this is explained below too, so maybe it can be removed from here? We could just say that we implemented more advanced algorithms (and their names) which greatly reduced the complexity of the code and improved the performance. And then say "read more below for the details"
For me, this is the abstract of the blog post and the time-complexity reduction of the update operation is the main result. That is the thing that (hopefully) motivates interested readers (who know about big O notation) to learn how we did it.
I would not say we reduced the complexity of the algorithm though :D Thanks for pointing out that I should say "time-complexity" here.
I'll link to wikipedia.