@@ -1603,7 +1603,7 @@ pattern, which may be summarized as follows:
1603
1603
| stringified return value literally.
1604
1604
C « <$variable> » | Interpolates stringified contents of variable as a regex.
1605
1605
C « <{code}> » | Runs Perl6 code inside the regex, and interpolates the
1606
- | return value as a regex.
1606
+ | stringified return value as a regex.
1607
1607
1608
1608
= end table
1609
1609
@@ -1614,26 +1614,33 @@ value isn't a L<Regex object|/type/Regex>. If the value is a Regex, it will not
1614
1614
be stringified, but instead be interpolated as such. 'Literally' means
1615
1615
I < strictly literally > , that is: as if the respective stringified value is quoted
1616
1616
with a basic C < Q > string L < C < Q[...] > |/language/quoting#Literal_strings:_Q> .
1617
- Consequently, the stringified value will not itself undergo any (second-level)
1617
+ Consequently, the stringified value will not itself undergo any further
1618
1618
interpolation.
1619
1619
1620
- my $string = 'Is this a regex or a string: 123\w+False ?';
1620
+ my $string = 'Is this a regex or a string: 123\w+False$pattern1 ?';
1621
1621
my $pattern1 = 'string';
1622
1622
my $pattern2 = '\w+';
1623
1623
my $pattern3 = 'gnirts';
1624
+ my $pattern4 = '$pattern1';
1624
1625
my $number = 123;
1625
1626
my $bool = True;
1627
+ my $regex = /\w+/;
1628
+ my sub f1 { return Q[$pattern1] };
1626
1629
1627
- say $string.match: / 'string' /; # [1] OUTPUT: 「string」
1628
- say $string.match: / $pattern1 /; # [2] OUTPUT: 「string」
1629
- say $string.match: / $pattern2 /; # [3] OUTPUT: 「\w+」
1630
- say $string.match: / $number /; # [4] OUTPUT: 「123」
1630
+ say $string.match: / 'string' /; # [1] OUTPUT: 「string」
1631
+ say $string.match: / $pattern1 /; # [2] OUTPUT: 「string」
1632
+ say $string.match: / $pattern2 /; # [3] OUTPUT: 「\w+」
1633
+ say $string.match: / $regex /; # [4] OUTPUT: 「Is」
1634
+ say $string.match: / $number /; # [5] OUTPUT: 「123」
1631
1635
1632
- say $string.match: / $pattern3.flip /; # [5] OUTPUT: Nil
1633
- say $string.match: / "$pattern3.flip()" /; # [6] OUTPUT: 「string」
1634
- say $string.match: / $($pattern3.flip) /; # [7] OUTPUT: 「string」
1635
- say $string.match: / $([~] $pattern3.comb.reverse) /; # [8] OUTPUT: 「string」
1636
- say $string.match: / $(!$bool) /; # [9] OUTPUT: 「False」
1636
+ say $string.match: / $pattern3.flip /; # [6] OUTPUT: Nil
1637
+ say $string.match: / "$pattern3.flip()" /; # [7] OUTPUT: 「string」
1638
+ say $string.match: / $($pattern3.flip) /; # [8] OUTPUT: 「string」
1639
+ say $string.match: / $([~] $pattern3.comb.reverse) /; # [9] OUTPUT: 「string」
1640
+ say $string.match: / $(!$bool) /; # [10] OUTPUT: 「False」
1641
+
1642
+ say $string.match: / $pattern4 /; # [11] OUTPUT: 「$pattern1」
1643
+ say $string.match: / $(f1) /; # [12] OUTPUT: 「$pattern1」
1637
1644
1638
1645
In this example, the statements C < [1] > and C < [2] > are equivalent and meant to
1639
1646
illustrate a plain case of regex interpolation. Since unescaped/unquoted
@@ -1643,62 +1650,71 @@ to emphasize the correspondence between the first two statements. Statement
1643
1650
C < [3] > unambiguously shows that the string pattern held by C < $pattern2 > is
1644
1651
interpreted literally, and not as a regex. Had it been
1645
1652
interpreted as a regex, it would have matched the first word of C < $string > , i.e.
1646
- C < 「Is」 > . Statement C < [4 ] > shows how the stringified number is used as a match
1647
- pattern.
1653
+ C < 「Is」 > , as can be seen in statement C < [4] > . Statement C < [5] > shows how the
1654
+ stringified number is used as a match pattern.
1648
1655
1649
- Statement C < [5] > does not work as intended. To the human reader, the dot C < . >
1650
- may seem to represent the L < method call operator|/language/operators#methodop_. > ,
1651
- but given the regex context the compiler will parse it as the regex wildcard
1656
+ Statement C < [6] > probably does not work as intended. To the human reader, the
1657
+ dot C < . > may seem to represent the L < method call operator|/language/operators#methodop_. > ,
1658
+ but since a dot is not a valid character for an L < ordinary identifier|/language/syntax#Ordinary_identifiers > ,
1659
+ and given the regex context, the compiler will parse it as the regex wildcard
1652
1660
L < .|/language/regexes#Wildcards > that matches any character. The apparent
1653
1661
ambiguity may be resolved in various ways, for instance through the use of
1654
- straightforward L < string interpolation|/language/quoting#Interpolation:_qq > from
1655
- the regex as in statement C < [6] > (note that the inclusion of the call operator
1656
- C < () > is key here), or by using the second syntax form from the above table as
1657
- in statement C < [7] > , in which case the match pattern C < 'string' > first emerges
1658
- as the return value of the C < flip > method call. Since general Perl6 code may be
1659
- run from within the parentheses of C < $( ) > , the same effect can also be achieved
1660
- with a bit more effort, like in statement C < [8] > . Statement C < [9] > illustrates
1661
- how the stringified version of the code's return value (the boolean value
1662
- C < False > ) is matched literally.
1662
+ straightforward L < string interpolation|/language/quoting#Interpolation:_qq >
1663
+ from within the regex, as in statement C < [7] > (note that the inclusion of the call
1664
+ operator C < () > is key here), or by using the second syntax form from the above
1665
+ table as in statement C < [8] > , in which case the match pattern C < string > first
1666
+ emerges as the return value of the C < flip > method call. Since general Perl6
1667
+ code may be run from within the parentheses of C < $( ) > , the same effect can
1668
+ also be achieved with a bit more effort, as in statement C < [9] > . Statement
1669
+ C < [10] > illustrates how the stringified version of the code's return value (the
1670
+ boolean value C < False > ) is matched literally.
1671
+
1672
+ Finally, statements C < [11] > and C < [12] > show how the value of C < $pattern4 > and
1673
+ the return value of C < f1 > are I < not > subject to a further round of
1674
+ interpolation. Hence, in general, after possible stringification, C « $variable »
1675
+ and C « $(code) » provide for a strictly literal match of the variable or return
1676
+ value.
1663
1677
1664
1678
Now consider the last two syntactical forms from the table above:
1665
1679
C « <$variable> » and C « <{code}> » . These forms will stringify the value of the
1666
1680
variable or the return value of the code and interpolate it as a regex. If the
1667
- respective value is a Regex, it is interpolated as such. 'Interpolated as a
1668
- regex' means interpolated/inserted into the target Regex without protective
1669
- quoting. Consequently, the further evaluation of the target Regex may trigger
1670
- the (second-level) interpolation of any variables it contains.
1681
+ respective value is a Regex, it is interpolated as such:
1671
1682
1672
1683
my $string = 'Is this a regex or a string: 123\w+$x ?';
1673
1684
my $pattern1 = '\w+';
1674
1685
my $number = 123;
1675
1686
my sub f1 { return /s\w+/ };
1676
- my sub f2 (Str $x) { return /$x x/ };
1677
- my sub f3 { return Q[$x] };
1678
1687
1679
1688
say $string.match: / <$pattern1> /; # [1] OUTPUT: 「Is」
1680
1689
say $string.match: / <$number> /; # [2] OUTPUT: 「123」
1681
1690
say $string.match: / <{ f1 }> /; # [3] OUTPUT: 「string」
1682
1691
1683
- my $x = "rege";
1684
- say $string.match: / <{ f2($x) }> /; # [4] OUTPUT: 「regex」
1685
- say $string.match: / <{ f3 }> /; # [5] OUTPUT: 「rege」
1686
-
1687
- In statement C < [4] > use is made of the function C < f2 > , which acts as a (very
1688
- simple) "regex factory": you can pass it a string variable, and it will return a
1689
- Regex object into which the variable has been interpolated. In this case, C < f2 >
1690
- appends the letter 'x' to whatever string it is passed. The Regex that is
1691
- returned by C < f2 > is in turn inserted into target Regex by the C < {...} >
1692
- construct. Statement C < [5] > illustrates another case of two-fold regex
1693
- interpolation. When the target Regex is constructed, the strictly literal string
1694
- value C < $x > is interpolated into it by the C < {...} > construct. When the Regex is
1695
- evaluated further, the unprotected variable C < $x > is interpolated, i.e. replaced
1696
- by the string value C < rege > , which explains the match.
1692
+ Importantly, 'interpolated as a regex' means interpolated/inserted into the
1693
+ target Regex without protective quoting. Consequently, if the value of the
1694
+ variable C < $variable1 > is itself of the form C < $variable2 > , evaluation of
1695
+ C « <$variable1> » or C « <{ $variable1 }> » inside a target regex C < /.../ > will cause
1696
+ the target regex to assume the form C < /$variable2/ > . As described above, the
1697
+ evaluation of this regex will then trigger further interpolation of
1698
+ C < $variable2 > :
1699
+
1700
+ my $string = Q[Mindfuck \w+ $variable1 $variable2];
1701
+ my $variable1 = Q[\w+];
1702
+ my $variable2 = Q[$variable1];
1703
+ my sub f1 { return Q[$variable2] };
1704
+
1705
+ # /<{ f1 }>/ ==> /$variable2/ ==> / '$variable1' /
1706
+ say $string.match: / <{ f1 }> /; # OUTPUT: 「$variable1」
1707
+
1708
+ # /<$variable2>/ ==> /$variable1/ ==> / '\w+' /
1709
+ say $string.match: /<$variable2>/; # OUTPUT: 「\w+」
1710
+
1711
+ # /<$variable1>/ ==> /\w+/
1712
+ say $string.match: /<$variable1>/; # OUTPUT: 「Mindfuck」
1697
1713
1698
1714
Note: it may be desired to run arbitrary code from within the regex I < without >
1699
1715
making use of its return value inside the regex. This may, for instance, come in
1700
1716
handy when debugging a regex or figuring out just how it matches. In such a
1701
- case, rather than (ab)using either C « $($pattern) or C « <{$pattern}> » , you may
1717
+ case, rather than (ab)using either C « $($pattern) » or C « <{$pattern}> » , you may
1702
1718
simply use C < { } > to insert a code block:
1703
1719
1704
1720
my sub nplus1($n) {$n +1}