@@ -17,8 +17,8 @@ Synopsis 5: Regexes and Rules
Created: 24 Jun 2002

- Last Modified: 3 Apr 2013
- Version: 161
+ Last Modified: 6 May 2013
+ Version: 162

This document summarizes Apocalypse 5, which is about the new regex
syntax. We now try to call them I<regex> rather than "regular
@@ -1201,7 +1201,7 @@ how to handle them (more on that below).
=item *

The default way in which the engine handles a string scalar is to match it
- as a C<< ' ...' >> literal (i.e. it does not treat the interpolated string
+ as a C<< " ..." >> literal (i.e. it does not treat the interpolated string
as a subpattern). In other words, a Perl 6:

/ $var /
@@ -1210,13 +1210,16 @@ is like a Perl 5:
/ \Q$var\E /

- However, if C<$var> contains a C<Regex> object, instead of attempting to
- convert it to a string, it is called as a subrule, as if you said
- C<< <$var> >>. (See assertions below.) This form does not capture,
- and it fails if C<$var> is tainted.
+ To interpolate a C<Regex> object, use C<< <$var> >> instead.

If C<$var> is undefined, a warning is issued and the match fails.

+ When matching against a Stringy type that is not Str, the variable must
+ be interpretable as a value of that Stringy type (or a related type
+ that can be coerced to that type). For example, when regex matching a
+ Buf type, the variable will be matched under the Buf type's semantics,
+ not Str semantics.
+
[Conjecture: when we allow matching against non-string types, doing a
type match on the current node will require the syntax of an embedded
signature, not just a bare variable, so there is no need to account for
@@ -1271,7 +1274,7 @@ An interpolated array:
/ @cmds /

- is matched as if it were an alternation of its elements. Ordinarily it
+ is matched as if it were an alternation of its literal elements. Ordinarily it
matches using junctive semantics:

/ [ @cmds[0] | @cmds[1] | @cmds[2] | ... ] /
@@ -1293,16 +1296,18 @@ Or course, you can also
to be clear that you mean junctive semantics.

+ Since C<$x> is interpolated as if you'd said C<"$x">, if C<$x> contains
+ a list, it is stringified first. To get alternation you must use the
+ C<@$x> or C<@($x)> form to indicate that you're intending the scalar
+ variable to be treated as a list.
+
An interpolated array using junctive semantics is declarative
(participates in external longest token matching) only if it's
known to be constant at the time the regex is compiled.

- As with a scalar variable, each element is matched as a literal
- unless it happens to be a C<Regex> object, in which case it is matched
- as a subrule. As with scalar subrules, a tainted subrule always fails.
- All string values pay attention to the current C<:ignorecase>
- and C<:ignoremark> settings, while C<Regex> values use their own
- C<:ignorecase> and C<:ignoremark> settings.
+ As with a scalar variable, each element is matched as a literal.
+ All such values pay attention to the current C<:ignorecase>
+ and C<:ignoremark> settings.

When you get tired of writing: