Commit 6a10cf7

Regex: Add more info on matching Unicode Properties.
Show how to match not only the General Categories but also the values of Unicode Properties.
1 parent 7bce2d1 commit 6a10cf7

1 file changed: +10 -3 lines changed

doc/Language/regexes.pod6

Lines changed: 10 additions & 3 deletions
@@ -203,10 +203,17 @@ Predefined subrules:
 
 The character classes so far are mostly for convenience; a more systematic
 approach is the use of Unicode properties. They are called in the form C<<
-<:property> >>, where C<property> can be a short or long Unicode property
-name.
+<:property> >>, where C<property> can be a short or long Unicode General
+Category name. These use pair syntax.
 
-The following list is stolen from the Perl 5
+To match against a specific value for a Unicode Property:
+
+    "a".uniprop('Script') # Latin
+    "a" ~~ / <:Script<Latin>> /
+    "a".uniprop('Block') # Basic Latin
+    "a" ~~ / <:Block('Basic Latin')> /
+
+The following list of Unicode General Categories is stolen from the Perl 5
 L<perlunicode|http://perldoc.perl.org/perlunicode.html> documentation:
 
 =begin table
