Commit 407267b

regex engine: split EVAL_postponed_AB state
(This commit makes no practical changes in behaviour except for debugging output.)

Currently, one of the regex engine stack states is EVAL_postponed_AB. When executing something like

    /(??{'A'})B/

where A and B represent general subpatterns, the engine executes the eval code, which returns the string 'A', which is compiled into a subpattern. Then the engine pushes an EVAL_postponed_AB state and runs the subpattern until it reaches an END op. Then it pushes *another* EVAL_postponed_AB state and runs the B part of the pattern until the final END. Then before returning success, it pops EVAL_postponed_AB off the stack (twice), executing any cleanup required. Similarly during failure, the EVAL_postponed_AB_fail action will be executed once or twice (depending on whether it failed during A or B).

This commit splits that state into two,

    EVAL_postponed_A
    EVAL_postponed_B

The first is pushed before running A, the second before running B. The actions currently remain the same and share the same code; i.e. this commit just does the equivalent of:

    -    case EVAL_postponed_AB:
    +    case EVAL_postponed_A:
    +    case EVAL_postponed_B:
             ... cleanup code ....

But it makes the code easier to understand, makes debugging output clearer, and will allow in future for the cleanup behaviours to differ between A and B.

This commit also fixes up a few debugging messages and code comments which were still referring to 'EVAL_AB', which was renamed to EVAL_postponed_AB some years ago.
1 parent 676faf6 commit 407267b

3 files changed, 425 additions and 389 deletions

regcomp.sym

Lines changed: 1 addition & 1 deletion
@@ -332,7 +332,7 @@ REGEX_SET REGEX_SET, depth p S ; Regex set, temporary node used in pre-optimi
 #
 #
 TRIE next:FAIL
-EVAL B,postponed_AB:FAIL
+EVAL B,postponed_A,postponed_B:FAIL
 CURLYX end:FAIL
 WHILEM A_pre,A_min,A_max,B_min,B_max:FAIL
 BRANCH next:FAIL

regexec.c

Lines changed: 12 additions & 8 deletions
@@ -6671,7 +6671,7 @@ S_regmatch(pTHX_ regmatch_info *reginfo, char *startpos, regnode *prog)
 /* mark_state piggy backs on the yes_state logic so that when we unwind
 the stack on success we can update the mark_state as we go */
 regmatch_state *mark_state = NULL; /* last mark state we have seen */
-regmatch_state *cur_eval = NULL; /* most recent EVAL_AB state */
+regmatch_state *cur_eval = NULL; /* most recent EVAL_postponed_A state */
 struct regmatch_state *cur_curlyx = NULL; /* most recent curlyx */
 U32 state_num;
 bool no_final = 0; /* prevent failure from backtracking? */
@@ -8719,15 +8719,18 @@ S_regmatch(pTHX_ regmatch_info *reginfo, char *startpos, regnode *prog)
 ST.prev_eval = cur_eval;
 cur_eval = st;
 /* now continue from first node in postoned RE */
-PUSH_YES_STATE_GOTO(EVAL_postponed_AB, startpoint, locinput,
+PUSH_YES_STATE_GOTO(EVAL_postponed_A, startpoint, locinput,
 loceol, script_run_begin);
 NOT_REACHED; /* NOTREACHED */
 }

-case EVAL_postponed_AB: /* cleanup after a successful (??{A})B */
+case EVAL_postponed_A: /* cleanup the A part after a
+successful (??{A})B */
+case EVAL_postponed_B: /* cleanup the B part after a
+successful (??{A})B */
 /* note: this is called twice; first after popping B, then A */
 DEBUG_STACK_r({
-Perl_re_exec_indentf( aTHX_ "EVAL_AB cur_eval = %p prev_eval = %p\n",
+Perl_re_exec_indentf( aTHX_ "EVAL_postponed_A/B cur_eval = %p prev_eval = %p\n",
 depth, cur_eval, ST.prev_eval);
 });

@@ -8744,7 +8747,7 @@ S_regmatch(pTHX_ regmatch_info *reginfo, char *startpos, regnode *prog)
 rex->recurse_locinput[CUR_EVAL.close_paren - 1] = VAL; \
 }

-SET_RECURSE_LOCINPUT("EVAL_AB[before]", CUR_EVAL.prev_recurse_locinput);
+SET_RECURSE_LOCINPUT("EVAL_postponed_A/B[before]", CUR_EVAL.prev_recurse_locinput);

 rex_sv = ST.prev_rex;
 is_utf8_pat = reginfo->is_utf8_pat = cBOOL(RX_UTF8(rex_sv));
@@ -8771,7 +8774,7 @@ S_regmatch(pTHX_ regmatch_info *reginfo, char *startpos, regnode *prog)
 if ( nochange_depth )
 nochange_depth--;

-SET_RECURSE_LOCINPUT("EVAL_AB[after]", cur_eval->locinput);
+SET_RECURSE_LOCINPUT("EVAL_postponed_A/B[after]", cur_eval->locinput);
 sayYES;


@@ -8780,7 +8783,8 @@ S_regmatch(pTHX_ regmatch_info *reginfo, char *startpos, regnode *prog)
 regcppop(rex, &maxopenparen);
 sayNO;

-case EVAL_postponed_AB_fail: /* unsuccessfully ran A or B in (??{A})B */
+case EVAL_postponed_A_fail: /* unsuccessfully ran A in (??{A})B */
+case EVAL_postponed_B_fail: /* unsuccessfully ran B in (??{A})B */
 /* note: this is called twice; first after popping B, then A */
 DEBUG_STACK_r({
 Perl_re_exec_indentf( aTHX_ "EVAL_AB_fail cur_eval = %p prev_eval = %p\n",
@@ -9902,7 +9906,7 @@ NULL

 SET_RECURSE_LOCINPUT("FAKE-END[after]", cur_eval->locinput);

-PUSH_YES_STATE_GOTO(EVAL_postponed_AB, /* match B */
+PUSH_YES_STATE_GOTO(EVAL_postponed_B, /* match B */
 st->u.eval.prev_eval->u.eval.B,
 locinput, loceol, script_run_begin);
 }
