Commit f3b3fdd

Adds clarification for non-collating two-letter adverbs
Which closes #2406. Also some reflow and rephrasing.
1 parent 51f2357 commit f3b3fdd

1 file changed, +24 -17 lines changed

doc/Language/regexes.pod6

Lines changed: 24 additions & 17 deletions
@@ -1573,16 +1573,17 @@ say $/<ipv4-octet>; # OUTPUT: [「127」 「0」 「0」 「1」]

 =head1 Adverbs

-Adverbs modify how regexes work and provide convenient shortcuts for
-certain kinds of recurring tasks.
+Adverbs, which modify how regexes work and provide convenient shortcuts for
+certain kinds of recurring tasks, are combinations of one or more letters
+preceded by a colon C<:>.

-There are two kinds of adverbs: regex adverbs apply at the point where a
-regex is defined and matching adverbs apply at the point that a regex
+There are two kinds of adverbs: I<regex> adverbs apply at the point where a
+regex is defined, and I<matching> adverbs apply at the point that a regex
 matches against a string.

 This distinction often blurs, because matching and declaration are often
-textually close but using the method form of matching makes the distinction
-clear.
+textually close but using the method form of matching, that is, C<.match>, makes
+the distinction clear.

 C<'abc' ~~ /../> is roughly equivalent to C<'abc'.match(/../)>, or even more
 clearly written in separate lines:
@@ -1592,8 +1593,8 @@ clearly written in separate lines:
 say "'abc' has at least two characters";
 }

-Regex adverbs like C<:i> go into the definition line and matching adverbs
-like C<:overlap> are appended to the match call:
+Regex adverbs like C<:i> go into the definition line and matching adverbs like
+C<:overlap> (which can be abbreviated to C<:ov>) are appended to the match call:

 my $regex = /:i . a/;
 for 'baA'.match($regex, :overlap) -> $m {
@@ -1603,7 +1604,7 @@ like C<:overlap> are appended to the match call:

 =head2 X<Regex adverbs|regex adverb,:ignorecase;regex adverb,:i>

-Adverbs that appear at the time of a regex declaration are part of the
+The adverbs that appear at the time of a regex declaration are part of the
 actual regex and influence how the Perl 6 compiler translates the regex into
 binary code.

@@ -1612,15 +1613,14 @@ the distinction between upper case, lower case and title case letters.

 So C<'a' ~~ /A/> is false, but C<'a' ~~ /:i A/> is a successful match.

-Regex adverbs can come before or inside a regex declaration and only affect
-the part of the regex that comes afterwards, lexically.
-Note that regex adverbs appearing before the regex must appear after
-something that introduces the regex to the parser, like 'rx' or 'm' or a bare '/'.
-This is NOT valid:
+Regex adverbs can come before or inside a regex declaration and only affect the
+part of the regex that comes afterwards, lexically. Note that regex adverbs
+appearing before the regex must appear after something that introduces the regex
+to the parser, like 'rx' or 'm' or a bare '/'. This is NOT valid:

-=begin code :skip-test
-my $rx1 = :i/a/; # adverb is before the regex is recognized => exception
-=end code
+=begin code :skip-test
+my $rx1 = :i/a/; # adverb is before the regex is recognized => exception
+=end code

 but these are valid:

@@ -1643,6 +1643,13 @@ Square brackets and parentheses limit the scope of an adverb:
 / (:i a b) c /; # matches 'ABc' but not 'ABC'
 / [:i a b] c /; # matches 'ABc' but not 'ABC'

+When two adverbs are used together, they keep their colon at the front
+
+"þor is Þor" ~~ m:g:i/þ/;# OUTPUT: «(「þ」 「Þ」)␤»
+
+That implies that when there are two vowels together after a C<:>, they
+correspond to the same adverb, as in C<:ov> or C<:P5>.
+
 =head3 X<Ignoremark|regex adverb,:ignoremark;regex adverb,:m>

 The C<:ignoremark> or C<:m> adverb instructs the regex engine to only
