@@ -17,8 +17,8 @@ Synopsis 5: Regexes and Rules
Created: 24 Jun 2002

- Last Modified: 28 Jul 2012
- Version: 157
+ Last Modified: 31 Jul 2012
+ Version: 158

This document summarizes Apocalypse 5, which is about the new regex
syntax. We now try to call them I<regex> rather than "regular
@@ -387,6 +387,14 @@ and these do not:
:foo declarations, including :my and :sigspace itself
{...}

+ When we say sigspace can follow either an atom or a quantified atom, we
+ mean that it can come between an atom and its quantifier:
+
+ ms/ <atom> * / # means / [<atom><.ws>]* /
+
+ (If each atom matches whitespace, then it doesn't need to match after the
+ quantifier.)
+
In general you don't need to use C<:sigspace> within grammars because
the parser rules automatically handle whitespace policy for you.
In this context, whitespace often includes comments, depending on
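The atom/quantifier rule added in this hunk can be approximated outside Perl 6. The following is a minimal sketch in Python, not the spec's implementation: `WS` is an assumed stand-in for `<.ws>` (plain `\s*`, which is looser than the real rule), and `ATOM` and `sigspace_star` are hypothetical names chosen for illustration.

```python
import re

WS = r"\s*"    # assumption: simplified stand-in for <.ws>
ATOM = r"\d"   # hypothetical <atom>: a single digit


def sigspace_star(atom, ws=WS):
    """Emulate `ms/ <atom> * /` as `[<atom><.ws>]*`."""
    return f"(?:{atom}{ws})*"


def plain_star(atom):
    """Emulate `/ <atom>* /` with no significant whitespace."""
    return f"(?:{atom})*"


def matches(pattern, text):
    """True if the pattern consumes the whole text."""
    return re.fullmatch(pattern, text) is not None
```

Under these assumptions, `matches(sigspace_star(ATOM), "1 2 3")` succeeds because whitespace is consumed after every repetition of the atom, while `matches(plain_star(ATOM), "1 2 3")` fails at the first space.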
@@ -1116,27 +1124,45 @@ does not count as "progress" under C<:ratchet> semantics unless the
next item succeeds.

When significant space is used under C<:sigspace>,
- only the matching atoms pay attention to whether whitespace follows.
+ each matching element enables the immediately following whitespace
+ to be considered significant. Space after the C<%> does nothing. If you write:

- ms/<element> + % ',' /
+ ms/ <element> + % ',' /
+ #1 #2 #3 #4 #5

- allows whitespace around the separator like this:
+ it ignores whitespace #1 and #4, and rewrites the rest to:
+
+ / [ <element> <.ws> ]+ % [ ',' <.ws> ] <.ws> /
+ #2 #5 #3

- / <element>[<.ws>','<.ws><element>]*<.ws> /
+ Since #3 is redundant with #2, it suffices to supply either #2 or #3:

- while
+ ms/ <element>+ % ',' / # ws after comma and at end
+ ms/ <element> +% ',' / # ws after comma and any element

- ms/<element>+%','/
+ So the first

- excludes all significant whitespace like this:
+ ms/ <element>+ % ',' / # ws after comma and at end

- / <element>[','<element>]* /
+ is like
+
+ / <element>[','<.ws><element>]*<.ws> /

- And
+ while the second

- ms/<element>+ % ',' /
+ ms/ <element> +% ',' / # ws after comma and any element

- allows whitespace after each comma but nowhere else.
+ is like
+
+ / <element><.ws>[','<.ws><element><.ws>]* /
+
+ and
+
+ ms/ <element>+% ','/
+
+ excludes all significant whitespace like this:
+
+ / <element>[','<element>]* /

=item *
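The three expansions in the hunk above can be compared side by side. This is a rough sketch in Python rather than Perl 6, under stated assumptions: `WS` approximates `<.ws>` as `\s*`, `ELEMENT` is a hypothetical `<element>` that matches a run of word characters, and `PATTERNS` and `accepted_by` are illustrative names that do not appear in the synopsis.

```python
import re

WS = r"\s*"        # assumption: simplified stand-in for <.ws>
ELEMENT = r"\w+"   # hypothetical <element>

# Hand-expanded forms, keyed by the comments used in the hunk above.
PATTERNS = {
    # ms/ <element>+ % ',' /  ->  / <element>[','<.ws><element>]*<.ws> /
    "ws after comma and at end":
        rf"{ELEMENT}(?:,{WS}{ELEMENT})*{WS}",
    # ms/ <element> +% ',' /  ->  / <element><.ws>[','<.ws><element><.ws>]* /
    "ws after comma and any element":
        rf"{ELEMENT}{WS}(?:,{WS}{ELEMENT}{WS})*",
    # ms/ <element>+% ','/    ->  / <element>[','<element>]* /
    "no significant whitespace":
        rf"{ELEMENT}(?:,{ELEMENT})*",
}


def accepted_by(text):
    """Return the names of the expansions that match the whole text."""
    return [name for name, pattern in PATTERNS.items()
            if re.fullmatch(pattern, text)]
```

With these stand-ins, `"a,b,c"` is accepted by all three forms, `"a, b, c"` by the first two only, and `"a , b , c"` only by the form that allows whitespace after any element. The real `<.ws>` rule is stricter than `\s*`, so the sketch shows where whitespace is permitted rather than exact Perl 6 matching behavior.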