Optimize scan when using single-line regexps #187

maxbrunsfeld · 2016-12-16T23:50:01Z

Most of the time when searching a buffer, the search regex can only match within a single line. This is the case whenever the regex doesn't contain an explicit line ending character (\n or \r) or a negated character class (e.g. [^\w]). In this situation, we can avoid two inefficient aspects of the current search algorithm:

1. We can avoid joining the buffer's text into a single string to pass to Regex.exec and then translating the resulting match positions back into 2-dimensional buffer ranges with the BufferOffsetIndex. Instead we can search each line individually, keeping track of the row as we go, so that we can provide the ranges with no cost.
2. When searching backwards, there's no need for the back-off algorithm that we currently use to avoid searching from the beginning of the text. We can simply iterate through the lines in reverse.
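
To make the idea concrete, here is a minimal sketch of a per-line scan, not the code from this PR. All names (`canMatchAcrossLines`, `matchesInLine`, `scanLines`, `MatchRange`) are hypothetical, the buffer is assumed to be available as an array of line strings without line endings, and results are collected eagerly into an array for brevity where a real iterator would probably yield them lazily.

```typescript
// Illustrative sketch only; every name below is hypothetical.
interface Point { row: number; column: number }
interface MatchRange { start: Point; end: Point; text: string }

// Heuristic as stated in the description: a pattern may span lines only if its
// source contains \n, \r, or a negated character class. The check is
// conservative (it can report true for patterns that are in fact single-line).
// Shorthand classes such as \s, which can also match a newline, are outside
// the rule as stated and are not handled here.
function canMatchAcrossLines(source: string): boolean {
  return /\\[nr]|[\n\r]|\[\^/.test(source);
}

// Collect every match of a global regex within one line, tagged with its row.
function matchesInLine(line: string, regex: RegExp, row: number): MatchRange[] {
  const found: MatchRange[] = [];
  regex.lastIndex = 0;
  let m: RegExpExecArray | null;
  while ((m = regex.exec(line)) !== null) {
    found.push({
      start: { row: row, column: m.index },
      end: { row: row, column: m.index + m[0].length },
      text: m[0],
    });
    // Step past empty matches so the loop always terminates.
    if (m[0].length === 0) regex.lastIndex++;
  }
  return found;
}

// Scan line by line. The row is known at all times, so 2-D ranges come out
// directly, and a backwards scan is just the same loop over rows in reverse.
function scanLines(lines: string[], regex: RegExp, reverse: boolean = false): MatchRange[] {
  if (canMatchAcrossLines(regex.source)) {
    throw new Error("pattern may span lines; the per-line scan does not apply");
  }
  // Only the i and m flags are carried over in this sketch.
  const flags = "g" + (regex.ignoreCase ? "i" : "") + (regex.multiline ? "m" : "");
  const lineRegex = new RegExp(regex.source, flags);
  const results: MatchRange[] = [];
  for (let i = 0; i < lines.length; i++) {
    const row = reverse ? lines.length - 1 - i : i;
    const found = matchesInLine(lines[row], lineRegex, row);
    if (reverse) found.reverse();
    for (const match of found) results.push(match);
  }
  return results;
}
```

For example, `scanLines(["foo bar", "baz", "bar foo bar"], /bar/)` reports matches on rows 0 and 2 in document order, and passing `true` as the third argument reports the same ranges last to first without any back-off logic.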

/cc @nathansobo

nathansobo · 2016-12-21T02:31:48Z

This looks great.

maxbrunsfeld added 6 commits December 16, 2016 14:19

Load buffer synchronously in scan specs

187dcd2

Refactor iterator helper objects in scanInRange

7ad08bc

Add optimized single line match iterators

e225c9f

Compute lineText lazily in TextBuffer.scan, .backwardsScan

f4f3a92

Make multi-line regex test more strict

d4d5d3b

Regex patterns can only match across line boundaries if they contain a carriage return, a newline, or a negated character class.

Add randomized tests for scanInRange, fix bugs

416eae8

maxbrunsfeld merged commit a48918c into master Dec 20, 2016

maxbrunsfeld deleted the mb-optimize-scan branch December 20, 2016 21:32

nathansobo mentioned this pull request Dec 22, 2016

Don't yield empty matches at end of scanned range #189

Merged
