fix(lib): correctly handle miss for loop in loop #393
Merged
fixes #392
This PR changes the construction of the graph for a regex `Mir::Loop` when there is a `miss` path present. Given an input that essentially represents this control flow:

we previously lowered it to a single loop and unconditionally merged the `then` and old `miss` arms into the new miss of the loop:

This lowering can lead to situations where we incorrectly match `a` followed by `miss`, even though `a` must be followed by `then`.

This PR solves the issue by generating two separate paths for the first and other iterations of the loop, and merging the `then` and old `miss` only for the new miss of the first iteration path:

Since the path for `a` may be long, this can result in a lot more generated code, especially for deeply nested loops. But I hope that such patterns are rare enough in the wild that we can accept a performance hit for the sake of a correctness fix. For what it's worth, the `cargo bench`es seem to perform similarly before and after this patch on my machine.
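
To make the difference between the two lowerings concrete, here is a toy model in plain Rust. It is only an illustration and not the code generated by this crate: the function names `single_loop` and `split_loop`, the helper `at`, and the choice of the bytes `a`, `x`, and `y` to stand in for `a`, `then`, and `miss` are all assumptions made for the example. The intended language is "zero or more `a` followed by `then`, or `miss` on its own".

```rust
/// Hypothetical helper: the byte at position `i`, if any.
fn at(input: &[u8], i: usize) -> Option<u8> {
    input.get(i).copied()
}

/// Toy version of the old lowering: a single loop whose miss arm is the
/// merge of `then` (b'x') and the old `miss` (b'y').
fn single_loop(input: &[u8]) -> bool {
    let mut i = 0;
    // Loop node: consume `a` and jump back to the same node.
    while at(input, i) == Some(b'a') {
        i += 1;
    }
    // Merged miss arm: reachable on every iteration, so `miss` is
    // accepted even after some `a` has already been consumed.
    matches!(at(input, i), Some(b'x') | Some(b'y')) && i + 1 == input.len()
}

/// Toy version of the new lowering: one path for the first iteration and
/// a separate path for all later iterations.
fn split_loop(input: &[u8]) -> bool {
    // First-iteration path: its miss arm is the merge of `then` and `miss`.
    if at(input, 0) != Some(b'a') {
        return matches!(at(input, 0), Some(b'x') | Some(b'y')) && input.len() == 1;
    }
    let mut i = 1;
    // Later-iteration path: keep consuming `a`.
    while at(input, i) == Some(b'a') {
        i += 1;
    }
    // Its miss arm is `then` only, because an `a` has been consumed.
    at(input, i) == Some(b'x') && i + 1 == input.len()
}
```

In this model both functions accept `x`, `y`, `ax`, and `aax`, but only `single_loop` accepts `ay`, which is the kind of incorrect match described above. The cost shows up in the model as well: the check for `a` is written twice in `split_loop`, which mirrors the duplicated path for `a` in the generated code.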