Merged
84 changes: 68 additions & 16 deletions doc/html/pcre2pattern.html
Expand Up @@ -41,14 +41,12 @@ <h1>pcre2pattern man page</h1>
<li><a name="TOC26" href="#SEC26">CONDITIONAL GROUPS</a>
<li><a name="TOC27" href="#SEC27">COMMENTS</a>
<li><a name="TOC28" href="#SEC28">RECURSIVE PATTERNS</a>
<li><a name="TOC29" href="#SEC29">GROUPS AS SUBROUTINES</a>
<li><a name="TOC30" href="#SEC30">ONIGURUMA SUBROUTINE SYNTAX</a>
<li><a name="TOC31" href="#SEC31">CALLOUTS</a>
<li><a name="TOC32" href="#SEC32">BACKTRACKING CONTROL</a>
<li><a name="TOC33" href="#SEC33">EBCDIC ENVIRONMENTS</a>
<li><a name="TOC34" href="#SEC34">SEE ALSO</a>
<li><a name="TOC35" href="#SEC35">AUTHOR</a>
<li><a name="TOC36" href="#SEC36">REVISION</a>
<li><a name="TOC29" href="#SEC29">CALLOUTS</a>
<li><a name="TOC30" href="#SEC30">BACKTRACKING CONTROL</a>
<li><a name="TOC31" href="#SEC31">EBCDIC ENVIRONMENTS</a>
<li><a name="TOC32" href="#SEC32">SEE ALSO</a>
<li><a name="TOC33" href="#SEC33">AUTHOR</a>
<li><a name="TOC34" href="#SEC34">REVISION</a>
</ul>
<h2><a name="SEC1" href="#TOC1">PCRE2 REGULAR EXPRESSION DETAILS</a></h2>
<p>
Expand Down Expand Up @@ -3399,7 +3397,9 @@ <h3>
"b" and so the whole match succeeds. This match used to fail in Perl, but in
later versions (I tried 5.024) it now works.
<a name="groupsassubroutines"></a></p>
<h2><a name="SEC29" href="#TOC1">GROUPS AS SUBROUTINES</a></h2>
<h3>
Groups as subroutines
</h3>
<p>
If the syntax for a recursive group call (either by number or by name) is used
outside the parentheses to which it refers, it operates a bit like a subroutine
Expand Down Expand Up @@ -3446,8 +3446,60 @@ <h2><a name="SEC29" href="#TOC1">GROUPS AS SUBROUTINES</a></h2>
in groups when called as subroutines is described in the section entitled
<a href="#btsub">"Backtracking verbs in subroutines"</a>
below.
</p>
<h3>
Recursion and subroutines with returned capture groups
</h3>
<p>
Since PCRE2 10.46, recursion and subroutine calls may also specify a list of
capture groups to return. This is a PCRE2 syntax extension not supported by
Perl. Matching recurses into the referenced expression as described above;
however, when the recursion returns to the calling expression, the subgroups
captured during the recursion can be retained when the calling expression's
context is restored.
</p>
<p>
When used as a subroutine, this allows the subroutine's capture groups to
be used as return values.
</p>
<p>
Only the specific capture groups listed by the caller will be retained, using
the following syntax:
<pre>
(?R(grouplist)) recurse whole pattern, returning capture groups
(?n(grouplist)) )
(?+n(grouplist)) )
(?-n(grouplist)) ) call subroutine, returning capture groups
(?&name(grouplist)) )
(?P&#62;name(grouplist)) )
</pre>
</p>
<p>
The list of capture groups "grouplist" is a comma-separated list of (absolute
or relative) group numbers, and group names enclosed in single quotes or angle
brackets.
</p>
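The grouplist forms described above can be sketched as a small parser. This is an illustrative Python sketch only, not PCRE2's own parsing code: the function name `parse_grouplist` and the tagged-tuple output are hypothetical, and it assumes names consist of word characters that do not start with a digit and that entries contain no surrounding whitespace.

```python
import re

# One grouplist entry: a signed relative number, an absolute number,
# or a name in angle brackets or single quotes.
# Assumption: names are word characters not starting with a digit.
_ENTRY = re.compile(
    r"""
      (?P<rel>[+-]\d+)            # relative reference
    | (?P<abs>\d+)                # absolute reference
    | <(?P<angle>[^\W\d]\w*)>     # name in angle brackets
    | '(?P<quote>[^\W\d]\w*)'     # name in single quotes
    """,
    re.VERBOSE,
)


def parse_grouplist(text):
    """Split a grouplist body such as "1,-2,<short>" into tagged entries."""
    entries = []
    for item in text.split(","):
        m = _ENTRY.fullmatch(item)
        if m is None:
            raise ValueError(f"not a valid grouplist entry: {item!r}")
        if m.group("rel") is not None:
            entries.append(("relative", int(m.group("rel"))))
        elif m.group("abs") is not None:
            entries.append(("absolute", int(m.group("abs"))))
        else:
            entries.append(("name", m.group("angle") or m.group("quote")))
    return entries
```

For instance, the body of `(?&weekendday(<short>))` would be passed in as `"<short>"` and yield a single name entry.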
<p>
Here is an example which first uses the DEFINE condition to create a re-usable
routine for matching a weekend day, then calls that subroutine and retains the
groups it captures for use later:
<pre>
(?x: # ignore whitespace for clarity
# Define the routine "weekendday" which matches Saturday or
# Sunday, and returns the Sat/Sun prefix as \k&#60;short&#62;.
(?(DEFINE) (?&#60;weekendday&#62;
(?|(?&#60;short&#62;Sat)urday|(?&#60;short&#62;Sun)day) ) )
# Call the routine. Matches "Saturday,Sat" or "Sunday,Sun".
(?&weekendday(&#60;short&#62;)),\k&#60;short&#62; )
</pre>
</p>
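To make the matching behaviour stated in the example's comments concrete, here is a rough emulation in Python. Python's `re` module has neither subroutine calls nor returned capture groups, so this sketch inlines the routine and uses a lookahead instead; it illustrates only which subject strings are accepted, not the PCRE2 mechanism. The names `WEEKEND` and `matches` are hypothetical.

```python
import re

# Emulation only: the "weekendday" routine is inlined because Python's re
# has no subroutine calls. The lookahead restricts the day name to exactly
# "Saturday" or "Sunday"; "short" captures the Sat/Sun prefix, which the
# backreference after the comma must repeat.
WEEKEND = re.compile(
    r"(?=Saturday|Sunday)(?P<short>Sat|Sun)(?:urday|day),(?P=short)"
)


def matches(text):
    """Return True if the whole of text is accepted by the emulation."""
    return WEEKEND.fullmatch(text) is not None
```

Under this emulation, "Saturday,Sat" and "Sunday,Sun" are accepted, while a mismatched pair such as "Saturday,Sun" is rejected because the backreference must repeat the captured prefix.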
<p>
This feature is not available using the Oniguruma syntax \g&#60;...&#62; or \g'...'
described below.
<a name="onigurumasubroutines"></a></p>
<h2><a name="SEC30" href="#TOC1">ONIGURUMA SUBROUTINE SYNTAX</a></h2>
<h3>
Oniguruma subroutine syntax
</h3>
<p>
For compatibility with Oniguruma, the non-Perl syntax \g followed by a name or
a number enclosed either in angle brackets or single quotes, is an alternative
Expand All @@ -3465,7 +3517,7 @@ <h2><a name="SEC30" href="#TOC1">ONIGURUMA SUBROUTINE SYNTAX</a></h2>
Note that \g{...} (Perl syntax) and \g&#60;...&#62; (Oniguruma syntax) are <i>not</i>
synonymous. The former is a backreference; the latter is a subroutine call.
</p>
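The difference between a backreference and a subroutine call can be illustrated with a small Python sketch. Python's `re` module has no subroutine calls, so the second pattern emulates one by repeating the group's subpattern; the variable names are hypothetical and the PCRE2 equivalents in the comments are given for orientation only.

```python
import re

# A backreference must match the same text that the group captured.
# PCRE2 equivalent: (\d+)-\g{1}
backreference = re.compile(r"(\d+)-\1")

# A subroutine call re-runs the group's pattern, so any digits will do.
# Emulated here by repeating the subpattern; PCRE2 equivalent: (\d+)-\g<1>
subroutine_like = re.compile(r"(\d+)-(?:\d+)")

same = "12-12"
different = "12-34"

backref_results = (
    backreference.fullmatch(same) is not None,
    backreference.fullmatch(different) is not None,
)
subroutine_results = (
    subroutine_like.fullmatch(same) is not None,
    subroutine_like.fullmatch(different) is not None,
)
```

The backreference accepts only the repeated text, whereas the subroutine-style pattern accepts both subjects.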
<h2><a name="SEC31" href="#TOC1">CALLOUTS</a></h2>
<h2><a name="SEC29" href="#TOC1">CALLOUTS</a></h2>
<p>
Perl has a feature whereby using the sequence (?{...}) causes arbitrary Perl
code to be obeyed in the middle of matching a regular expression. This makes it
Expand Down Expand Up @@ -3543,7 +3595,7 @@ <h3>
</pre>
The doubling is removed before the string is passed to the callout function.
<a name="backtrackcontrol"></a></p>
<h2><a name="SEC32" href="#TOC1">BACKTRACKING CONTROL</a></h2>
<h2><a name="SEC30" href="#TOC1">BACKTRACKING CONTROL</a></h2>
<p>
There are a number of special "Backtracking Control Verbs" (to use Perl's
terminology) that modify the behaviour of backtracking during matching. They
Expand Down Expand Up @@ -4071,7 +4123,7 @@ <h3>
is no such group within the subroutine's group, the subroutine match fails and
there is a backtrack at the outer level.
<a name="ebcdicenvironments"></a></p>
<h2><a name="SEC33" href="#TOC1">EBCDIC ENVIRONMENTS</a></h2>
<h2><a name="SEC31" href="#TOC1">EBCDIC ENVIRONMENTS</a></h2>
<p>
Differences in the way PCRE behaves when it is running in an EBCDIC environment
are covered in this section.
Expand Down Expand Up @@ -4115,12 +4167,12 @@ <h3>
points. However, if the range is specified numerically, for example,
[\x88-\x92] or [h-\x92], all code points are included.
</p>
<h2><a name="SEC34" href="#TOC1">SEE ALSO</a></h2>
<h2><a name="SEC32" href="#TOC1">SEE ALSO</a></h2>
<p>
<b>pcre2api</b>(3), <b>pcre2callout</b>(3), <b>pcre2matching</b>(3),
<b>pcre2syntax</b>(3), <b>pcre2</b>(3).
</p>
<h2><a name="SEC35" href="#TOC1">AUTHOR</a></h2>
<h2><a name="SEC33" href="#TOC1">AUTHOR</a></h2>
<p>
Philip Hazel
<br>
Expand All @@ -4129,7 +4181,7 @@ <h2><a name="SEC35" href="#TOC1">AUTHOR</a></h2>
Cambridge, England.
<br>
</p>
<h2><a name="SEC36" href="#TOC1">REVISION</a></h2>
<h2><a name="SEC34" href="#TOC1">REVISION</a></h2>
<p>
Last updated: 27 November 2024
<br>
Expand Down
26 changes: 24 additions & 2 deletions doc/html/pcre2syntax.html
Expand Up @@ -566,14 +566,14 @@ <h2><a name="SEC25" href="#TOC1">SUBSTRING SCAN ASSERTION</a></h2>
(*scan_substring:(grouplist)...) scan captured substring
(*scs:(grouplist)...) scan captured substring
</pre>
The comma-separated list may identify groups in any of the following ways:
The comma-separated list "grouplist" may identify groups in any of the
following ways:
<pre>
n absolute reference
+n relative reference
-n relative reference
&#60;name&#62; name
'name' name

</pre>
</p>
<h2><a name="SEC26" href="#TOC1">SCRIPT RUNS</a></h2>
Expand Down Expand Up @@ -621,6 +621,28 @@ <h2><a name="SEC28" href="#TOC1">SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)</a><
\g&#60;-n&#62; call subroutine by relative number (PCRE2 extension)
\g'-n' call subroutine by relative number (PCRE2 extension)
</pre>
The variants using parentheses (?...) may also specify a list of capture groups
to return, which are retained in the calling subexpression if set during
the recursion (this feature is not supported by Perl).
<pre>
(?R(grouplist)) recurse whole pattern, returning capture groups
(PCRE2 extension)
(?n(grouplist)) )
(?+n(grouplist)) ) call subroutine, returning capture groups
(?-n(grouplist)) ) (PCRE2 extension)
(?&name(grouplist)) )
(?P&#62;name(grouplist)) )
</pre>
The comma-separated list "grouplist" uses the same syntax as
(*scan_substring:(grouplist)...), and may identify groups in any of the
following ways:
<pre>
n absolute reference
+n relative reference
-n relative reference
&#60;name&#62; name
'name' name
</pre>
</p>
<h2><a name="SEC29" href="#TOC1">CONDITIONAL PATTERNS</a></h2>
<p>
Expand Down