Commit 12ee078

relationship of :sigspace with %
The rules follow from the way sigspace is enabled by the previous matcher. Fixes #22.
1 parent 838a4b9 commit 12ee078

1 file changed: +39 -13 lines changed

S05-regex.pod

Lines changed: 39 additions & 13 deletions
@@ -17,8 +17,8 @@ Synopsis 5: Regexes and Rules

     Created: 24 Jun 2002

-    Last Modified: 28 Jul 2012
-    Version: 157
+    Last Modified: 31 Jul 2012
+    Version: 158

 This document summarizes Apocalypse 5, which is about the new regex
 syntax. We now try to call them I<regex> rather than "regular
@@ -387,6 +387,14 @@ and these do not:
     :foo declarations, including :my and :sigspace itself
     {...}

+When we say sigspace can follow either an atom or a quantified atom, we
+mean that it can come between an atom and its quantifier:
+
+    ms/ <atom> * /     # means / [<atom><.ws>]* /
+
+(If each atom matches whitespace, then it doesn't need to match after the
+quantifier.)
+
 In general you don't need to use C<:sigspace> within grammars because
 the parser rules automatically handle whitespace policy for you.
 In this context, whitespace often includes comments, depending on
@@ -1116,27 +1124,45 @@ does not count as "progress" under C<:ratchet> semantics unless the
 next item succeeds.

 When significant space is used under C<:sigspace>,
-only the matching atoms pay attention to whether whitespace follows.
+each matching element enables the immediately following whitespace
+to be considered significant. Space after the C<%> does nothing. If you write:

-    ms/<element> + % ',' /
+    ms/ <element> + % ',' /
+      #1       #2 #3 #4 #5

-allows whitespace around the separator like this:
+it ignores whitespace #1 and #4, and rewrites the rest to:
+
+    / [ <element> <.ws> ]+ % [ ',' <.ws> ] <.ws> /
+                  #2               #5      #3

-    / <element>[<.ws>','<.ws><element>]*<.ws> /
+Since #3 is redundant with #2, it suffices to supply either #2 or #3:

-while
+    ms/ <element>+ % ',' /    # ws after comma and at end
+    ms/ <element> +% ',' /    # ws after comma and any element

-    ms/<element>+%','/
+So the first

-excludes all significant whitespace like this:
+    ms/ <element>+ % ',' /    # ws after comma and at end

-    / <element>[','<element>]* /
+is like
+
+    / <element>[','<.ws><element>]*<.ws> /

-And
+while the second

-    ms/<element>+ % ',' /
+    ms/ <element> +% ',' /    # ws after comma and any element

-allows whitespace after each comma but nowhere else.
+is like
+
+    / <element><.ws>[','<.ws><element><.ws>]* /
+
+and
+
+    ms/ <element>+% ','/
+
+excludes all significant whitespace like this:
+
+    / <element>[','<element>]* /

 =item *
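The three desugared forms in the last hunk can be compared side by side with a rough emulation. The sketch below is not Perl 6 and is not part of the commit: it uses Python's re module, assumes <element> is a run of word characters, and approximates <.ws> as optional whitespace (\s*). All names in it are hypothetical and chosen only for illustration.

```python
import re

# Assumptions (not from the commit): <element> is a run of word
# characters, and <.ws> is approximated as optional whitespace.
ELEMENT = r"\w+"
WS = r"\s*"

# ms/ <element>+ % ',' /   is like  / <element>[','<.ws><element>]*<.ws> /
WS_AFTER_COMMA_AND_END = re.compile(rf"{ELEMENT}(?:,{WS}{ELEMENT})*{WS}")

# ms/ <element> +% ',' /   is like  / <element><.ws>[','<.ws><element><.ws>]* /
WS_AFTER_COMMA_AND_ELEMENT = re.compile(rf"{ELEMENT}{WS}(?:,{WS}{ELEMENT}{WS})*")

# ms/ <element>+% ','/     is like  / <element>[','<element>]* /
NO_SIGNIFICANT_WS = re.compile(rf"{ELEMENT}(?:,{ELEMENT})*")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches the entire input string."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    forms = {
        "ws after comma and at end": WS_AFTER_COMMA_AND_END,
        "ws after comma and any element": WS_AFTER_COMMA_AND_ELEMENT,
        "no significant ws": NO_SIGNIFICANT_WS,
    }
    for sample in ["a,b", "a, b", "a ,b", "a,b "]:
        for label, pattern in forms.items():
            print(f"{sample!r:8} {label:32} {accepts(pattern, sample)}")
```

Under these assumptions, a string such as "a ,b" (space before the comma) is accepted only by the second form, while "a, b" is accepted by the first two and rejected by the third.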
