@@ -93,12 +93,11 @@ With NFG, strings start by being run through the normal NFC process, compressing
any given character sequences into precomposed characters.

Any graphemes remaining without precomposed characters, such as ậ or नि, are
- given their own negative numbers to refer to them, at least 32 bits in
- length. This is done to avoid clashing with any potential future changes to
- Unicode.
-
- The mapping between negative numbers and graphemes in this form is not
- guaranteed constant, even between strings in the same process.
+ given their own internal designation to refer to them, at least 32 bits in
+ length, in such a way that they avoid clashing with any potential future
+ changes to Unicode. The mapping between these internal designations and
+ graphemes in this form is not guaranteed constant, even between strings in
+ the same process.

The Perl 6 C<Str> type, and more generally the C<Stringy> role, deals
exclusively in NFG form.
@@ -152,12 +151,9 @@ Unicode. C<Str> deals exclusively in the NFG form of Unicode strings.
ord(Str $string)
ords(Str $string)

- These give you the numeric values of the characters in a string. C<ord> only
- works on the first character, while C<ords> works on every character.
-
- Some the values returned may be negative numbers, and are useless outside that
- specific string. You must convert to one of the codepoint-based types for a
- to-Standard list of numbers.
+ These give you the numeric values of the B<base character> of graphemes in a
+ string. C<ord> only works on the first graphemes, while C<ords> works on every
+ grapheme.

=head2 Length Methods

@@ -371,12 +367,11 @@ An error may be issued if the given category name is not valid.
Stringy.ords() --> Array[Int]

The C<&ord> function (and corresponding C<Stringy.ord> method) return the
- codepoint number of the first codepoint of the string. The C<&ords> function and
- method returns an C<Array> of codepoint numbers for every codepoint in the
- string.
+ codepoint number of the base character of the first grapheme of the string.
+ The C<&ords> function and method returns an C<Array> of codepoint numbers
+ of the base character for every grapheme in the string.

- This works on any type that does the C<Stringy> role. Note that using this on
- type C<Str> may return invalid negative numbers as "codepoints".
+ This works on any type that does the C<Stringy> role.

=head2 Character Representation

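Note on the new C<ord>/C<ords> wording: the "base character of each grapheme" behaviour can be sketched outside Perl 6 as follows. This is a rough illustration in Python, not the spec's or any implementation's algorithm. The names `approx_graphemes` and `base_ords` are hypothetical, and grouping by Unicode mark categories is an assumption that only approximates real grapheme cluster boundaries (UAX #29).

```python
import unicodedata


def approx_graphemes(text):
    """Split NFC-normalized text into rough grapheme clusters.

    Assumption: a cluster is one non-mark codepoint followed by any
    codepoints whose general category is a mark (Mn, Mc, Me). Real
    grapheme segmentation has more rules than this.
    """
    clusters = []
    for ch in unicodedata.normalize("NFC", text):
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch   # attach the mark to the current cluster
        else:
            clusters.append(ch)  # start a new cluster with a base character
    return clusters


def base_ords(text):
    """Return the codepoint of the first (base) character of each cluster."""
    return [ord(cluster[0]) for cluster in approx_graphemes(text)]


if __name__ == "__main__":
    # "e" + COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
    print([hex(n) for n in base_ords("e\u0301")])
    # DEVANAGARI NA + VOWEL SIGN I has no precomposed form; the base is NA.
    print([hex(n) for n in base_ords("\u0928\u093f")])
```

Under this sketch, a grapheme with no precomposed form reports only its base codepoint, so no negative or otherwise synthetic numbers are ever returned, which matches the intent of the changed text.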