Proofreading: unicode.pod6
AlexDaniel committed Aug 3, 2019
1 parent 4547eb6 commit 752f194
Showing 1 changed file with 10 additions and 10 deletions.
20 changes: 10 additions & 10 deletions doc/Language/unicode.pod6
@@ -5,8 +5,8 @@
=SUBTITLE Unicode support in Perl 6
Perl 6 has a high level of support of Unicode. This document aims to be both an
-overview as well as describe Unicode features which don't belong in the documentation
-for routines and methods.
+overview as well as description of Unicode features which don't belong
+in the documentation for routines and methods.
For an overview on MoarVM's internal representation of strings, see the
L<MoarVM string documentation|https://github.com/MoarVM/MoarVM/blob/master/docs/strings.asciidoc>.
@@ -18,8 +18,8 @@ X<|Normalization>
Perl 6 applies normalization by default to all input and output except for file
names, which are read and written as L<C<UTF8-C8>|#UTF8-C8>; graphemes, which are
-user-visible forms of the characters, will use a normalized representation. What
-does this mean? For example, the grapheme C<á> can be represented in two ways,
+user-visible forms of the characters, will use a normalized representation.
+For example, the grapheme C<á> can be represented in two ways,
either using one codepoint:
=for code :lang<text>
@@ -36,7 +36,7 @@ that two inputs that are equivalent are both treated the same. Unicode has a con
of canonical equivalence which allows us to determine the canonical form of a string,
allowing us to properly compare strings and manipulate them, without having to worry
about the text losing these properties. By default, any text you process or output
-from Perl 6 will be in this "canonical" form, even when making modifications or
+from Perl 6 will be in this canonical form, even when making modifications or
concatenations to the string (see below for how to avoid this). For more detailed information
about Normalization Form C and canonical equivalence, see the Unicode Foundation's page on
L<Normalization and Canonical Equivalence|https://unicode.org/reports/tr15/#Canon_Compat_Equivalence>.
@@ -102,13 +102,13 @@ You can enter Unicode codepoints by number (decimal as well as hexadecimal). For
say "\x1E2"; # OUTPUT: «Ǣ␤»
You can also access Unicode codepoints by name:
-Rakudo supports all Unicode 9.0 names. X<|\c[] unicode name>
+Perl 6 supports all Unicode names. X<|\c[] unicode name>
say "\c[PENGUIN]"; # OUTPUT: «🐧␤»
say "\c[BELL]"; # OUTPUT: «🔔␤» (U+1F514 BELL)
All Unicode codepoint names/named seq/emoji sequences are now case-insensitive:
-[Starting in 2017.02]
+[Starting in Rakudo 2017.02]
say "\c[latin capital letter ae with macron]"; # OUTPUT: «Ǣ␤»
say "\c[latin capital letter E]"; # OUTPUT: «E␤» (U+0045)
@@ -126,7 +126,7 @@ the L<uniparse|/routine/uniparse>:
=head2 Name aliases
-By name alias. Name Aliases are used mainly for codepoints without an official
+Name Aliases are used mainly for codepoints without an official
name, for abbreviations, or for corrections (Unicode names never change).
For full list of them see L<here|https://www.unicode.org/Public/UCD/latest/ucd/NameAliases.txt>.
@@ -151,14 +151,14 @@ Abbreviations:
=head2 Named sequences
You can also use any of the L<Named Sequences|https://www.unicode.org/Public/UCD/latest/ucd/NamedSequences.txt>,
-these are not single codepoints, but sequences of them. [Starting in 2017.02]
+these are not single codepoints, but sequences of them. [Starting in Rakudo 2017.02]
say "\c[LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND ACUTE]"; # OUTPUT: «É̩␤»
say "\c[LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND ACUTE]".ords; # OUTPUT: «(201 809)␤»
=head3 Emoji sequences
-Rakudo has support for Emoji 4.0 (the latest non-draft release) sequences.
+Perl 6 supports Emoji sequences.
For all of them see:
L<Emoji ZWJ Sequences|https://www.unicode.org/Public/emoji/4.0/emoji-zwj-sequences.txt>
and L<Emoji Sequences|https://www.unicode.org/Public/emoji/4.0/emoji-sequences.txt>.