@@ -79,7 +79,7 @@ Otherwise it is L<Nil>.
 
 =head1 Wildcards and character classes
 
-=head2 X<Dot to match any character|regex syntax,.>
+=head2 X<Dot to match any character|regex,.>
 
 An unescaped dot C<.> in a regex matches any single character.
 
@@ -101,7 +101,7 @@ because there is no character to match before C<per> in the target string.
 There are predefined character classes of the form C<\w>. Its negation is
 written with an upper-case letter, C<\W>.
 
-=item X<\d and \D|regex syntax,\d;regex syntax,\D>
+=item X<\d and \D|regex,\d;regex,\D>
 
 C<\d> matches a single digit (Unicode property C<N>), and C<\D> matches a
 single character that is not a digit.
@@ -119,7 +119,7 @@ Examples for digits are
     U+0E53 ๓ THAI DIGIT THREE
     U+1B56 ᭖ BALINESE DIGIT SIX
 
-=item X<\h and \H|regex syntax,\h;regex syntax,\H>
+=item X<\h and \H|regex,\h;regex,\H>
 
 C<\h> matches a single horizontal whitespace character. C<\H> matches a
 single character that is not a horizontal whitespace character.
@@ -134,27 +134,27 @@ Examples for horizontal whitespace characters are
 Vertical whitespaces like newline characters are explicitly excluded; those
 can be matched with C<\v>, and C<\s> matches any kind of whitespace.
 
-=item X<\n and \N|regex syntax,\n;regex syntax,\N>
+=item X<\n and \N|regex,\n;regex,\N>
 
 C<\n> matches a single, logical newline character. C<\n> is supposed to also
 match a Windows CR LF codepoing pair; though it is unclear whether the magic
 happens at the time that external data is read, or at regex match time. C<\N>
 matches a single character that's not a logical newline.
 
-=item X<\s and \S|regex syntax,\s;regex syntax,\S>
+=item X<\s and \S|regex,\s;regex,\S>
 
 C<\s> matches a single whitespace character. C<\S> matches a single
 character that is not a whitspace.
 
 TODO: examples
 
-=item X<\t and \T|regex syntax,\t;regex syntax,\T>
+=item X<\t and \T|regex,\t;regex,\T>
 
 C<\t> matches a single tab/tabulation character, C<U+0009>. (Note that
 exotic tabs like the C<U+000B VERTICAL TABULATION> character are not included
 here). C<\T> matches a single character that is not a tab.
 
-=item X<\v and \V|regex syntax,\v;regex syntax,\V>
+=item X<\v and \V|regex,\v;regex,\V>
 
 C<\v> matches a single vertical whitespace character. C<\V> match a single
 character that is not a vertical whitspace.
@@ -169,7 +169,7 @@ Examples for vertical whitespace characters:
 
 Use C<\s> to match any kind of whitespace, not just vertical whitespace
 
-=item X<\w and \W|regex syntax,\w;regex syntax,\W>
+=item X<\w and \W|regex,\w;regex,\W>
 
 C<\w> matches a single word character, that is a letter (Unicode category L),
 a digit or an underscore. C<\W> matches a single character that isn't a word
@@ -183,7 +183,7 @@ Examples of word characters:
     03F3 ϳ GREEK LETTER YOT
     0409 Љ CYRILLIC CAPITAL LETTER LJE
 
-=head2 X«Unicode properties|regex syntax,<:property>»
+=head2 X«Unicode properties|regex,<:property>»
 
 The character classes so far are mostly for convenience; a more systematic
 approach is the use of Unicode properties. They are called in the form
@@ -265,7 +265,7 @@ C<< <:Ll+:N> >> or C<< <:Ll+:Number> >> or C<< <+ :Lowercase_Letter + :Number> >
 (Grouping of set operations with round parens inside character classes is
 supposed to work, but not supported by Rakudo at the time of writing).
 
-=head2 X«Enumerated character classes and ranges|regex syntax,<[ ]>;regex assertion,<-[ ]>»
+=head2 X«Enumerated character classes and ranges|regex,<[ ]>;regex,<-[ ]>»
 
 Sometimes the pre-existing wildcards and character classes are just not
 enough. Fortunately, defining your own is simple enough. Between C<< <[ ]> >>,
@@ -312,7 +312,7 @@ Quantifiers bind tighter than concatenation, so C<ab+> matches one C<a>
 followed by one or more C<b>s. This is different for quotes, so C<'ab'+>
 matches the strings C<ab>, C<abab>, C<ababab> etc.
 
-=head2 X<One or more: +|regex syntax,+>
+=head2 X<One or more: +|regex,+>
 
 The C<+> quantifier makes the preceding atom match one or more times, with
 no upper limit.
@@ -322,7 +322,7 @@ like this:
 
     / \w+ '=' \w+ /
 
-=head2 X<Zero or more: *|regex syntax,*>
+=head2 X<Zero or more: *|regex,*>
 
 The C<*> quantifier makes the preceding atom match zero or more times, with
 no upper limit.
@@ -331,7 +331,7 @@ For example to optional whitespace between C<a> and C<b> you can write
 
     / a \s* b /
 
-=head2 X<Zero or one match: ?|regex syntax,?>
+=head2 X<Zero or one match: ?|regex,?>
 
 The C<?> quantifier makes the preceding atom match zero or one time.
 
@@ -343,7 +343,7 @@ C<a ** 2..5> to match the character C<a> at least twice and at most 5 times
 If minimal and maximal number of matches are the same, a single integer
 is possible: C<a ** 5> to match C<a> exactly five times.
 
-=head1 X<Alternation|regex syntax,||>
+=head1 X<Alternation|regex,||>
 
 To match one of several possible alternatives, separate them by C<||>; the
 first matching alternative wins.
@@ -379,7 +379,7 @@ match.
 Anchors need to match successfully in order for the whole regex to match, but
 they do not use up characters while matching.
 
-=head2 X«C<^>, Start of String|regex syntax,^»
+=head2 X«C<^>, Start of String|regex,^»
 
 The C<^> assertion only matches at the start of the string.
 
@@ -388,7 +388,7 @@ The C<^> assertion only matches at the start of the string.
     say so 'perly' ~~ /^ perl/; # True
     say so 'perl' ~~ /^ perl/; # True
 
-=head2 X«C<^^>, Start of Line and C<$$>, End of Line|regex syntax,^^;regex syntax,$$»
+=head2 X«C<^^>, Start of Line and C<$$>, End of Line|regex,^^;regex,$$»
 
 The C<^^> assertion matches at the start of a logical line. That is, either at
 the start of the string, or after a newline character.
@@ -420,7 +420,7 @@ leading space, and the third and fourth lines have two leading spaces each).
     # and the end of line)
     say so $str ~~ / '."' $$/; # True (at the last line)
 
-=head2 X<<<<C<<< << >>> and C<<< >> >>>, left and right word boundary|regex syntax,<<;regex syntax,>>;regex syntax,«;regex syntax,»>>>>
+=head2 X<<<<C<<< << >>> and C<<< >> >>>, left and right word boundary|regex,<<;regex,>>;regex,«;regex,»>>>>
 
 C<<< << >>> matches a left word boundary, so positions where at the left there
 a non-word character (or the start of the string), and to the right there is a
@@ -438,7 +438,7 @@ the end of the string.
     say so $str ~~ /<< own/; # False
     say so $str ~~ /own >>/; # True
 
-=head1 X«Grouping and Capturing|regex syntax,( );regex syntax,[ ];regex syntax,$<capture> =»
+=head1 X«Grouping and Capturing|regex,( );regex,[ ];regex,$<capture> =»
 
 In regular (non-regex) Perl 6, you can use parenthesis to group things
 together, usually to override operator precedence: