Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move has_visited check to the top of the loop in backtrack::Bounded::step #384

Merged
merged 1 commit into from
Jun 23, 2017

Conversation

adamcrume
Copy link
Contributor

…pressions

With certain repeated empty expressions similar to (x*)*?, the backtracker can
go into an infinite loop. This change adds the Progress instruction which
requires the engine to make progress to continue matching a repeated
subexpression.

Fixes #375

Note that this was inspired by https://swtch.com/~rsc/regexp/regexp2.html#real (mentioned in HACKING.md), which mentions that a progress instruction can be used to prevent backtracking loops.

@adamcrume
Copy link
Contributor Author

Whoops, was only running a subset of the tests. Fix on the way...

@BurntSushi
Copy link
Member

Sorry that I haven't had a chance to look at this yet, but I'm skeptical of the approach in this PR. The code that re-implements matching semantics in an ad hoc fashion on the AST seems particularly unfortunate.

I haven't given any thought to this, but both RE2 and Go's regexp library have a bitstate backtracker like the one in this crate. Are those regex engines susceptible to this problem? If not, how do they solve it?

@BurntSushi
Copy link
Member

BurntSushi commented Jun 19, 2017

In particular, I'd like to understand why the visited logic isn't preventing the infinite loop.

…step

This prevents us from executing instructions if they have been executed before, rather than returning after the instruction has already been executed.

Fixes rust-lang#375
@adamcrume
Copy link
Contributor Author

has_visited was returning true, so the loop was exiting, but the split instruction had already pushed a job at that point, so step kept getting called again. Moving the has_visited check to above the match statement fixes the problem without any Progress instruction.

@adamcrume adamcrume changed the title Add Progress instruction to avoid infinite loops in repeated empty ex… Move has_visited check to the top of the loop in backtrack::Bounded::step Jun 20, 2017
@adamcrume
Copy link
Contributor Author

Does this new change look better? It's certainly smaller in scope and size.

@BurntSushi
Copy link
Member

@adamcrume Yes, much nicer! Thank you for doing this work to fix this subtle bug. It is most appreciated! @bors r+

@bors
Copy link
Contributor

bors commented Jun 23, 2017

📌 Commit 6356d1a has been approved by BurntSushi

@bors
Copy link
Contributor

bors commented Jun 23, 2017

⌛ Testing commit 6356d1a with merge 30e35a3...

bors added a commit that referenced this pull request Jun 23, 2017
Move has_visited check to the top of the loop in backtrack::Bounded::step

…pressions

With certain repeated empty expressions similar to (x*)*?, the backtracker can
go into an infinite loop. This change adds the Progress instruction which
requires the engine to make progress to continue matching a repeated
subexpression.

Fixes #375

Note that this was inspired by https://swtch.com/~rsc/regexp/regexp2.html#real (mentioned in HACKING.md), which mentions that a progress instruction can be used to prevent backtracking loops.
@bors
Copy link
Contributor

bors commented Jun 23, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: BurntSushi
Pushing 30e35a3 to master...

@bors bors merged commit 6356d1a into rust-lang:master Jun 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants