
relationship of :sigspace with %

The rules follow from the way sigspace is enabled by previous matcher.
fixes #22
commit 12ee0787e9c2a0aced82bf3ad4951e5808fd40b9 1 parent 838a4b9
Larry Wall authored July 31, 2012

Showing 1 changed file with 39 additions and 13 deletions.

52  S05-regex.pod
@@ -17,8 +17,8 @@ Synopsis 5: Regexes and Rules
 
     Created: 24 Jun 2002
 
-    Last Modified: 28 Jul 2012
-    Version: 157
+    Last Modified: 31 Jul 2012
+    Version: 158
 
 This document summarizes Apocalypse 5, which is about the new regex
 syntax.  We now try to call them I<regex> rather than "regular
@@ -387,6 +387,14 @@ and these do not:
     :foo declarations, including :my and :sigspace itself
     {...}
 
+When we say sigspace can follow either an atom or a quantified atom, we
+mean that it can come between an atom and its quantifier:
+
+    ms/ <atom> * /      # means / [<atom><.ws>]* /
+
+(If each atom matches whitespace, then it doesn't need to match after the
+quantifier.)
+
 In general you don't need to use C<:sigspace> within grammars because
 the parser rules automatically handle whitespace policy for you.
 In this context, whitespace often includes comments, depending on
@@ -1116,27 +1124,45 @@ does not count as "progress" under C<:ratchet> semantics unless the
 next item succeeds.
 
 When significant space is used under C<:sigspace>,
-only the matching atoms pay attention to whether whitespace follows.
+each matching element enables the immediately following whitespace
+to be considered significant.  Space after the C<%> does nothing.  If you write:
 
-    ms/<element> + % ',' /
+    ms/ <element> +  %  ',' /
+      #1        #2 #3 #4  #5
 
-allows whitespace around the separator like this:
+it ignores whitespace #1 and #4, and rewrites the rest to:
+
+    / [ <element> <.ws> ]+ % [ ',' <.ws> ] <.ws> /
+                    #2               #5      #3
 
-    / <element>[<.ws>','<.ws><element>]*<.ws> /
+Since #3 is redundant with #2, it suffices to supply either #2 or #3:
 
-while
+    ms/ <element>+ % ',' /    # ws after comma and at end
+    ms/ <element> +% ',' /    # ws after comma and any element
 
-    ms/<element>+%','/
+So the first
 
-excludes all significant whitespace like this:
+    ms/ <element>+ % ',' /    # ws after comma and at end
 
-    / <element>[','<element>]* /
+is like
+
+    / <element>[','<.ws><element>]*<.ws> /
 
-And
+while the second
 
-    ms/<element>+ % ',' /
+    ms/ <element> +% ',' /    # ws after comma and any element
 
-allows whitespace after each comma but nowhere else.
+is like
+
+    / <element><.ws>[','<.ws><element><.ws>]* /
+
+and
+
+    ms/ <element>+% ','/
+
+excludes all significant whitespace like this:
+
+    / <element>[','<element>]* /
 
 =item *
 

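The three expansions added to S05 in this commit can be compared side by side with a rough emulation. The sketch below is not Perl 6 and is not part of the commit: it transcribes the "is like" forms into Python regular expressions under two illustrative assumptions, namely that `<element>` is a run of word characters and that `<.ws>` can be approximated by optional whitespace. The names `ELEMENT`, `WS`, and `matches` are hypothetical helpers.

```python
import re

# Assumptions for illustration only (not from the commit):
#   <element> is modeled as one or more word characters,
#   <.ws>     is modeled as optional whitespace.
ELEMENT = r"\w+"
WS = r"\s*"

# ms/ <element>+ % ',' /  is like  / <element>[','<.ws><element>]*<.ws> /
WS_AFTER_COMMA_AND_END = re.compile(rf"{ELEMENT}(?:,{WS}{ELEMENT})*{WS}")

# ms/ <element> +% ',' /  is like  / <element><.ws>[','<.ws><element><.ws>]* /
WS_AFTER_COMMA_AND_ELEMENT = re.compile(rf"{ELEMENT}{WS}(?:,{WS}{ELEMENT}{WS})*")

# ms/ <element>+% ','/    is like  / <element>[','<element>]* /
NO_SIGSPACE = re.compile(rf"{ELEMENT}(?:,{ELEMENT})*")


def matches(pattern, text):
    """Return True if the whole of text is matched by pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    patterns = (WS_AFTER_COMMA_AND_END, WS_AFTER_COMMA_AND_ELEMENT, NO_SIGSPACE)
    for text in ("a,b,c", "a, b, c", "a , b , c", "a,b,c "):
        print(repr(text), [matches(p, text) for p in patterns])
```

Under these assumptions, whitespace before a comma is accepted only by the second form, and the third form rejects any whitespace at all, which mirrors the distinction the new text draws between whitespace positions #2 and #3.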