Background tokenization is a lot slower in large files #138887

Closed
DanielRosenwasser opened this issue Dec 10, 2021 · 6 comments
Assignees
Labels
  • bug: Issue identified by VS Code Team member as probable bug
  • candidate: Issue identified as probable candidate for fixing in the next release
  • regression: Something that used to work is now broken
  • tokenization: Text tokenization
  • verified: Verification succeeded

Comments

@DanielRosenwasser
Member

Something @ahejlsberg and I noticed in a recent meeting: if you open up checker.ts in TypeScript today, it quickly colorizes the area of the file you're viewing; however, if you start adding a function declaration towards the middle or bottom of the file, colorization takes a noticeable bit of time unless you try scrolling. Scrolling seems to get a bunch of content "fixed", and new text only gets the right colorization on scroll.

Here's a video of some of this in action.

highlighting-initial-file-load.mp4

Eventually, the file starts to always get accurate colorization and smart-indentation, but from what we remember seeing, that took at least 30 seconds.

This occurs in VS Code's latest (1.63 I believe) along with Insiders, both on Windows.

Closest similar issue I could find was #138822

@mjbvz mjbvz assigned alexdima and unassigned mjbvz Dec 11, 2021
@alexdima alexdima added bug Issue identified by VS Code Team member as probable bug tokenization Text tokenization candidate Issue identified as probable candidate for fixing in the next release labels Dec 13, 2021
@alexdima
Member

When I saw the video, I immediately thought that this must be a recent regression. But in my attempt to track down the first bad commit, I have tried with our releases from the past 6 months, and then jumped all the way back to 1.40.2 from October 2019 and I always get the current behavior.

Basically, changing the text buffer never ran a viewport tokenization, which is quite an oversight, but not a recent regression.

@alexdima alexdima removed the candidate Issue identified as probable candidate for fixing in the next release label Dec 13, 2021
@ahejlsberg
Member

@alexdima You're right, if I start VS Code on a large file (like checker.ts) and edit towards the end of the file, colorization doesn't happen for the first few seconds or so. But with 1.63 it takes more like a minute before the white text gets colorized. I just verified by reverting to 1.61. There's definitely a big change in behavior.

@alexdima
Member

@ahejlsberg You're right, tokenization is now running for 2ms, and then yielding for 14ms, making it roughly 8 times slower:

image

I think this bad yielding behavior is caused by #137646, where we moved away from setImmediate and to requestIdleCallback.
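For readers following along, the "roughly 8 times slower" figure follows directly from the 2ms/14ms numbers above. The sketch below is only a back-of-the-envelope model of a time-sliced task, not VS Code's scheduler; the function names `dutyCycle`, `slowdownFactor`, and `wallClockMs` are hypothetical and exist only for illustration.

```typescript
// Hypothetical model of a time-sliced background task.
// This is NOT VS Code's tokenizer or scheduler; all names are illustrative.

/** Fraction of wall-clock time actually spent doing work. */
function dutyCycle(workMs: number, yieldMs: number): number {
  return workMs / (workMs + yieldMs);
}

/** How many times longer the task takes compared with never yielding. */
function slowdownFactor(workMs: number, yieldMs: number): number {
  return (workMs + yieldMs) / workMs;
}

/** Wall-clock time needed to complete totalWorkMs of work in slices. */
function wallClockMs(totalWorkMs: number, workMs: number, yieldMs: number): number {
  const slices = Math.ceil(totalWorkMs / workMs);
  // No yield is needed after the final slice.
  return totalWorkMs + (slices - 1) * yieldMs;
}

// Numbers from the comment above: 2ms of work, then a 14ms yield.
console.log(dutyCycle(2, 14));      // 0.125
console.log(slowdownFactor(2, 14)); // 8
```

Under this model, shrinking the yield gap towards zero brings the slowdown factor back towards 1, which is consistent with the observation that the change in yielding behavior, rather than the tokenizer itself, accounts for the regression.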

fyi @jrieken @bpasero

@alexdima alexdima added candidate Issue identified as probable candidate for fixing in the next release regression Something that used to work is now broken labels Dec 13, 2021
@alexdima
Member

I've created #139072 to track that typing does not trigger viewport tokenization, and I suggest using this issue to track that background tokenization takes a lot longer now.

@alexdima alexdima changed the title Syntax colorization/indentation takes a long time to apply for new code in large files Background tokenization is a lot slower in large files Dec 14, 2021
@alexdima alexdima added this to the November 2021 Recovery milestone Dec 14, 2021
alexdima added a commit that referenced this issue Dec 14, 2021
@alexdima
Member

Steps to verify:

  • open checker.ts
  • have minimap enabled so you can see when background tokenization reaches the bottom
  • scroll to the bottom and then do a window reload
  • tokenization should reach the bottom in ~10-15s
  • (you can optionally take a JS CPU trace and check that there aren't a lot of idle gaps in there)
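If you want to quantify the "idle gaps" from the optional CPU trace step rather than eyeball them, something like the following could help. It is a minimal sketch that assumes you have already extracted the busy intervals from the trace by hand or with your own tooling; the `Slice` shape and the `idleFraction` helper are hypothetical and do not correspond to any DevTools or VS Code API.

```typescript
// Hypothetical helper: given busy intervals pulled out of a CPU trace,
// compute what fraction of the observed time span was idle.
// The Slice shape is an assumption, not a real trace format.
interface Slice {
  startMs: number;
  endMs: number;
}

function idleFraction(slices: Slice[]): number {
  if (slices.length === 0) {
    return 0;
  }
  // Sort a copy by start time; slices are assumed not to overlap.
  const sorted = slices.slice().sort((a, b) => a.startMs - b.startMs);
  let busyMs = 0;
  for (const s of sorted) {
    busyMs += s.endMs - s.startMs;
  }
  const spanMs = sorted[sorted.length - 1].endMs - sorted[0].startMs;
  return spanMs === 0 ? 0 : 1 - busyMs / spanMs;
}

// Two 2ms slices separated by a 14ms gap, as in the regressed behavior.
const regressed: Slice[] = [
  { startMs: 0, endMs: 2 },
  { startMs: 16, endMs: 18 },
];
console.log(idleFraction(regressed));
```

A value near 0 means the slices run back to back; a value close to 1 means most of the span was spent waiting between slices.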

@dbaeumer
Member

Verified. Took ~11 seconds on my machine.

@dbaeumer dbaeumer added the verified Verification succeeded label Dec 15, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Jan 28, 2022

6 participants