Improve performance in large files #265

merged 2 commits into from Dec 22, 2016


nathansobo commented Dec 22, 2016 edited

This PR massively reduces this package's negative impact on large file editing by adding some limitations.

Remove support for identical open/close characters

For pairs like "", '', and ``, the package can't determine whether a given character is the start or end of a pair by inspecting the character alone. It assumes the next character it finds is the start of a pair and ends up scanning through the entire file. A proper implementation would require determining how many instances of the given character precede the first match. If an odd number do, we could consider it a close bracket, otherwise an open bracket. Adding a scan from the beginning of the file to cover this case didn't seem worth it to me in the short run, since this feature wasn't really working anyway. We could have confined the scan to a single line for many languages, but that doesn't generalize very well. I'm also worried about getting confused by constructs like """ in Python and CoffeeScript. Ultimately, I think we should revisit this whole package once we have better parse tree support so we can do a better job. For now I'm removing support for highlighting identical match pairs entirely. Support for inserting a match during typing remains.
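
For illustration only, the odd/even idea described above could look roughly like the sketch below. This is not code from the package: `classify_quote` is a hypothetical helper, and it deliberately ignores escapes, comments, and triple-quoted strings, which is part of why the approach was not pursued here.

```python
def classify_quote(text: str, index: int) -> str:
    """Classify the quote character at `index` as 'open' or 'close'.

    Hypothetical helper: counts how many times the same character occurs
    before `index`, scanning from the start of `text`. An odd count means
    a pair is already open, so this character would close it.
    Escapes, comments, and triple quotes are NOT handled.
    """
    quote = text[index]
    preceding = text.count(quote, 0, index)  # occurrences in text[0:index]
    return "close" if preceding % 2 == 1 else "open"
```

Note that the count starts at the beginning of the buffer, so the cost grows with the cursor's distance from the top of the file. That is the extra scan the description above decides against.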

Limit scan to 10k lines

We do a fair amount of work on each match we encounter, mostly related to determining whether the matched range is inside a comment or a string. This ended up really killing us on unmatched brackets at the beginning or end of really huge files, because we'd scan the entire file when the cursor was placed on the bracket. There may be some optimization to do in the scope matching, but nothing super easy. For now I'm limiting the scan length to 10k lines. Beyond that you won't get matching bracket behavior. That seems reasonable to me.
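
As a rough sketch of what a line-bounded forward scan might look like (an assumption-laden illustration, not the package's actual implementation): `find_matching_close` and `MAX_SCAN_LINES` are hypothetical names, the per-match comment/string scope check is omitted, and whether the starting line counts toward the limit is a guess.

```python
from typing import Optional, Sequence, Tuple

MAX_SCAN_LINES = 10_000  # the 10k limit from this PR; the constant name is hypothetical


def find_matching_close(
    lines: Sequence[str],
    row: int,
    col: int,
    open_char: str = "(",
    close_char: str = ")",
    max_lines: int = MAX_SCAN_LINES,
) -> Optional[Tuple[int, int]]:
    """Scan forward from the open bracket at (row, col) for its match.

    Assumes lines[row][col] == open_char. Gives up and returns None once
    `max_lines` lines (counting the starting line) have been examined.
    """
    depth = 0
    last_row = min(len(lines), row + max_lines)  # exclusive upper bound
    for r in range(row, last_row):
        line = lines[r]
        start = col if r == row else 0
        for c in range(start, len(line)):
            ch = line[c]
            if ch == open_char:
                depth += 1
            elif ch == close_char:
                depth -= 1
                if depth == 0:
                    return (r, c)
    return None  # unmatched, or the match lies beyond the scan limit
```

With a cap like this, an unmatched bracket at the top of a very large file costs at most `max_lines` lines of scanning instead of the whole buffer.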

Here's a profile of moving the cursor to an unmatched bracket at the beginning of a 250,000-line file...

Before this change, we block for 350ms:

[Screenshot: profile, 2016-12-21 at 5:28:40 pm]

After limiting the scan to 10k lines, we block for 17ms. Not perfect, but at least sane.

[Screenshot: profile, 2016-12-21 at 5:29:59 pm]

/cc @maxbrunsfeld

nathansobo added some commits Dec 21, 2016
@nathansobo nathansobo Limit search for matching brackets to 10k lines
This prevents performance problems with mismatched brackets in extremely
large files.
@nathansobo nathansobo Don't find matching pairs when start/end characters are equivalent
@nathansobo nathansobo merged commit 32aca98 into master Dec 22, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
@nathansobo nathansobo deleted the ns-large-file-perf branch Dec 22, 2016

👌 Nice.
