Commit e2effd7

Literals and metacharacters: layout correction, introduction of new examples, minor corrections (#2957)
1 parent 6986a51 commit e2effd7

1 file changed (+26, -21 lines changed)

doc/Language/regexes.pod6

Lines changed: 26 additions & 21 deletions
@@ -69,15 +69,13 @@ literals: these characters match themselves and nothing else. Other characters
 act as metacharacters and may, as such, have a special meaning, either by
 themselves (such as the dot C<.>, which serves as a wildcard) or together with
 other characters in larger metasyntactic constructs (such as C«<?before ...>»,
-which defines a lookahead assertion). But before looking at metacharacters and
-their particular uses, let's first explore the relation between literals and
-metacharacters in some more detail.
+which defines a lookahead assertion).
 
 In its simplest form a regex comprises only literals:
 
-if 'properly' ~~ / perl / {
-say "'properly' contains 'perl'"; # OUTPUT: «'properly' contains 'perl'␤»
-}
+/Cześć/; # "Hello" in Polish
+/こんばんは/; # "Good evening" in Japanese
+/Καλησπέρα/; # "Good evening" in Greek
 
 If you want a regex to literally match one or more characters that normally act
 as metacharacters, these characters must either be escaped using a backslash, or
@@ -93,43 +91,50 @@ literal, and vice versa:
 Even if a metacharacter does not (yet) have a special meaning in Perl 6,
 escaping (or quoting) it is required to ensure that the regex compiles and
 matches the character literally. This allows the clear distinction between
-literals and metacharacters to be maintained:
+literals and metacharacters to be maintained. So, for instance, to match a
+comma this will work:
 
 / \, /; # matches a literal comma ','
 
+while this will fail:
 =for code :skip-test<deliberate error>
-/ , /; # !! error: a yet meaningless/unrecognized metacharacter
-# does not automatically match literally
+/ , /; # !! error: a yet meaningless/unrecognized metacharacter
+# does not automatically match literally
 
 While an escaping backslash exerts its effect on the next individual character,
-single I<and multiple> metacharacters may be turned into literally matching
-strings by quoting them using single or double quotes:
+both a single metacharacter and a sequence of metacharacters may be turned into
+literally matching strings by quoting them in single or double quotes:
 
-/ "abc" /; # you may quote literals like this, but it has no effect
+/ "abc" /; # quoting literals does not make them more literal
 / "Hallelujah!" /; # yet, this form is generally preferred over /Hallelujah\!/
 
 / "two words" /; # quoting a space renders it significant, so this matches
 # the string 'two words' including the intermediate space
 
 / '#!:@' /; # this regex matches the string of metacharacters '#!:@'
 
-Quoting does not turn every metacharacter into a literal, however. This is due
-to the fact that quotes allow for backslash-escapes and interpolation.
-Specifically: in single quotes, the backslash may be used to escape single
-quotes and the backslash itself; double quotes additionally enable the
-interpolation of variables, and of code blocks of the form C<{...}>:
+Quoting does not simply turn every metacharacter into a literal, however. This
+is because quotes allow for backslash-escapes and interpolation. Specifically:
+in single quotes, the backslash may be used to escape single quotes and the
+backslash itself; double quotes additionally enable the interpolation of
+variables, and of code blocks of the form C<{...}>. Hence all of this works:
 
 / '\\\'' /; # matches a backslash followed by a single quote: \'
 
-=for code :skip-test<deliberate error>
-/ '\' /; # !! error: this is NOT the way to literally match a
-# backslash because now it escapes the second quote
-
 my $x = 'Hi';
 / "$x there!" /; # matches the string 'Hi there!'
 
 / "1 + 1 = {1+1}" /; # matches the string '1 + 1 = 2'
 
+while these examples illustrate mistakes that you will want to avoid:
+=begin code :skip-test<deliberate error>
+/ '\' /; # !! error: this is NOT the way to literally match a
+# backslash because now it escapes the second quote
+
+/"Price tag $0.50"/; # !! error: "$0" is interpreted as the first positional
+# capture (which is Nil), not as '$0'
+=end code
+
 Strings are searched left to right, so it is enough if only part of the string
 matches the regex:
 
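
Review note (not part of the commit): the rules described in the changed text can be exercised with a short Raku script. This is a minimal sketch, not text from the documentation. The test strings are taken from the examples in the diff; the single-quoted variant of the price-tag regex is an assumption about how one might avoid the `$0` interpolation mistake, and is not something the commit itself proposes.

```raku
# Literals match themselves; only part of the string needs to match.
say so 'properly' ~~ / perl /;                       # True

# A comma is a metacharacter: escape it or quote it to match it literally.
say so 'a,b' ~~ / \, /;                              # True
say so 'a,b' ~~ / ',' /;                             # True

# Quoting makes the space significant.
say so 'two words' ~~ / "two words" /;               # True
say so 'twowords'  ~~ / "two words" /;               # False

# Double quotes interpolate variables and code blocks.
my $x = 'Hi';
say so 'Hi there!' ~~ / "$x there!" /;               # True
say so '1 + 1 = 2' ~~ / "1 + 1 = {1+1}" /;           # True

# Assumption: single quotes do not interpolate, so '$0' stays literal here.
say so 'Price tag $0.50' ~~ / 'Price tag $0.50' /;   # True
```

The `so` prefix collapses each match result to a Boolean, which keeps the output easy to compare against the comments.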
