Skip to content

Commit

Permalink
Fixes for rewrite_repeat
Browse files Browse the repository at this point in the history
rewrite_repeat, used for rewriting nested repeats contained a comment
that said repeats can only be combined if the range of the result is not
more than the sum of the ranges of the inputs, but the assertion
actually tested the opposite: it tested that the range of the result is
more than the sum of the ranges of the inputs. This assertion does not
universally hold either way: a{2}{2} can be rewritten to a{4}, but the
range as calculated of the latter (0) is equal to the sum of the ranges
of the inputs (also 0).

Separate from the assert, the rewrite was not performed in some cases
where it is valid to do so. It is valid whenever the inner repeat has
a lower bound of 0 or 1, whenever the inner repeat has no upper bound
and the outer repeat has a positive lower bound, and whenever the outer
range has a lower bound equal to the upper bound. The last case was
missing. An example is a{2,3}{2}, which would previously be preserved
in that form, but is now rewritten as a{4,6}.
  • Loading branch information
hvdijk committed Sep 10, 2020
1 parent bd59eee commit 4bf9161
Show file tree
Hide file tree
Showing 3 changed files with 12 additions and 38 deletions.
42 changes: 4 additions & 38 deletions src/libre/ast_rewrite.c
Original file line number Diff line number Diff line change
Expand Up @@ -397,48 +397,14 @@ rewrite_repeat(struct ast_expr *n, enum re_flags flags)
return 1;
}

if (h == 0 || h == 1) {
dead = n->u.repeat.e;

n->u.repeat.min = v;
n->u.repeat.max = w;
n->u.repeat.e = n->u.repeat.e->u.repeat.e;

dead->type = AST_EXPR_EMPTY;
ast_expr_free(dead);

return 1;
}

/*
* a{h,i}{j,k} is equivalent to a{h*j,i*k} if it's possible to combine,
* and it's possible iff the range of the result is not more than
* the sum of the ranges of the two inputs.
*/
if (i != AST_COUNT_UNBOUNDED && k != AST_COUNT_UNBOUNDED) {
/* I don't know why this is true */
assert(w - v > i - h + k - j);
}

/*
* a{h,}{j,}
* a{h,i}{j,}
* a{h,}{j,k}
* and it's possible to combine unless h > 1, i is finite or j is 0,
* and j < k: in that case, the replacement would contain a{i*j+1}, but
* the original might not.
*/
if ((i == AST_COUNT_UNBOUNDED || k == AST_COUNT_UNBOUNDED) && h <= 1 && j <= 1) {
dead = n->u.repeat.e;

n->u.repeat.min = v;
n->u.repeat.max = w;
n->u.repeat.e = n->u.repeat.e->u.repeat.e;

dead->type = AST_EXPR_EMPTY;
ast_expr_free(dead);

return 1;
}

if (h > 1 && i == AST_COUNT_UNBOUNDED && j > 0) {
if (h <= 1 || (i == AST_COUNT_UNBOUNDED && j > 0) || j == k) {
dead = n->u.repeat.e;

n->u.repeat.min = v;
Expand Down
1 change: 1 addition & 0 deletions tests/pcre-repeat/in55.re
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
^a{2}{2}$
7 changes: 7 additions & 0 deletions tests/pcre-repeat/out55.fsm
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
0 -> 1 'a';
1 -> 2 'a';
2 -> 3 'a';
3 -> 4 'a';

start: 0;
end: 4;

0 comments on commit 4bf9161

Please sign in to comment.