@@ -249,14 +249,14 @@ another approach is to use Unicode character properties. These come in
 the form C«<:property>», where C<property> can be a short or long
 Unicode General Category name. These use pair syntax.
 
-To match against a Unicode Property:
+To match against a Unicode property you can use either smartmatch or L<C<uniprop>|/routine/uniprop>:
 
     "a".uniprop('Script'); # OUTPUT: «Latin»
     "a" ~~ / <:Script<Latin>> /; # OUTPUT: «「a」»
     "a".uniprop('Block'); # OUTPUT: «Basic Latin»
     "a" ~~ / <:Block('Basic Latin')> /; # OUTPUT: «「a」»
 
-Unicode General Categories:
+These are the unicode general categories used for matching:
 
 =begin table
 
@@ -314,11 +314,9 @@ Categories can be used together, with an infix operator:
 
     Operator  | Meaning
     ==========+=========
-     \+       | set union
-     \|       | set union
-     &        | set intersection
+     +        | set union
      -        | set difference
-     ^        | set symmetric difference
+
 
 =end table
 
@@ -332,10 +330,11 @@ parentheses; for example:
 
 =head2 X«Enumerated Character Classes and Ranges|regex,<[ ]>;regex,<-[ ]>»
 
-Sometimes the pre-existing wildcards and character classes are not enough.
-Fortunately, defining your own is fairly simple. Within C«<[ ]>», you
-can put any number of single characters and ranges of characters (expressed
-with two dots between the end points), with or without whitespace.
+Sometimes the pre-existing wildcards and character classes are not
+enough. Fortunately, defining your own is fairly simple. Within C«<[ ]>»,
+you can put any number of single characters and ranges of characters
+(expressed with two dots between the end points), with or without
+whitespace.
 
     "abacabadabacaba" ~~ / <[ a .. c 1 2 3 ]>* /;
     # Unicode hex codepoint range