Commit
Addressed #162 (Arabic pair of combining marks) by changing the text to say that the order "often" has positional meaning.
aphillips committed Jul 14, 2018
1 parent ad0982c commit e99eb2e
Showing 1 changed file with 1 addition and 7 deletions.
8 changes: 1 addition & 7 deletions index.html
@@ -748,13 +748,7 @@ <h4>Canonical vs. Compatibility Equivalence</h4>
characters are sometimes also encoded as a distinct "precomposed"
character. In this example, the character <span class="codepoint"><span lang="en">&#x00C7;</span> [<span class="uname">U+00C7 LATIN CAPITAL LETTER C WITH CEDILLA</span>]</span> is canonically equivalent to the character sequence starting with the base character <span class="codepoint"><span lang="en">&#x0043;</span> [<span class="uname">U+0043 LATIN CAPITAL LETTER C</span>]</span> followed by <span class="codepoint"><span lang="en">&#x25CC;&#x0327;</span> [<span class="uname">U+0327 COMBINING CEDILLA​</span>]</span>. Such equivalence can extend to characters with multiple combining marks.</li>
<li class="dropExampleItem"><span class="dropExample">q&#x0307;&#x0323;<span style="font-size:75%">
vs.</span>q&#x0323;&#x0307;</span> <em>Order of combining marks.</em> When
a base character is modified by multiple combining marks, the
order of the combining marks might not represent a distinct
character. Here the sequence <span class="codepoint"><span lang="en">&#x0071;</span> [<span class="uname">U+0071 LATIN SMALL LETTER Q</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0307;</span> [<span class="uname">U+0307 COMBINING DOT ABOVE​</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0323;</span> [<span class="uname">U+0323 COMBINING DOT BELOW​</span>]</span> and <span class="codepoint"><span lang="en">&#x0071;</span> [<span class="uname">U+0071 LATIN SMALL LETTER Q</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0323;</span> [<span class="uname">U+0323 COMBINING DOT BELOW​</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0307;</span> [<span class="uname">U+0307 COMBINING DOT ABOVE​</span>]</span> are equivalent, even though the combining marks are in a different order. Note that this example is chosen
carefully: the dot-above character and dot-below character are on
opposite "sides" of the base character. The order of combining
diacritics on the same side have a positional meaning.</li>
vs.</span>q&#x0323;&#x0307;</span> <em>Order of combining marks.</em> When a base character is modified by multiple combining marks, the order of the combining marks might not represent a distinct character. Here the sequence <span class="codepoint"><span lang="en">&#x0071;</span> [<span class="uname">U+0071 LATIN SMALL LETTER Q</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0307;</span> [<span class="uname">U+0307 COMBINING DOT ABOVE​</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0323;</span> [<span class="uname">U+0323 COMBINING DOT BELOW​</span>]</span> and <span class="codepoint"><span lang="en">&#x0071;</span> [<span class="uname">U+0071 LATIN SMALL LETTER Q</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0323;</span> [<span class="uname">U+0323 COMBINING DOT BELOW​</span>]</span> <span class="codepoint"><span lang="en">&nbsp;&#x0307;</span> [<span class="uname">U+0307 COMBINING DOT ABOVE​</span>]</span> are equivalent, even though the combining marks are in a different order. Note that this example is chosen carefully: the dot-above character and dot-below character are on opposite "sides" of the base character. The order of combining diacritics on the same side often has a positional meaning (although there are cases where the order doesn't matter to the presentation).</li>
<li class="dropExampleItem"><span class="dropExample">&#x2126;<span style="font-size:75%">
vs.</span>&#x03a9;</span> <em>Singleton mappings.</em> These result from the need to separately encode otherwise equivalent characters to support legacy character encodings. In this example, the Ohm symbol <span class="codepoint"><span lang="en">&#x2126;</span> [<span class="uname">U+2126 OHM SYMBOL</span>]</span> is canonically equivalent (and identical in appearance) to the Greek letter Omega <span class="codepoint"><span lang="en">&#x03A9;</span> [<span class="uname">U+03A9 GREEK CAPITAL LETTER OMEGA</span>]</span>. (Another example of a singleton is <span class="codepoint"><span lang="en">&#x212B;</span> [<span class="uname">U+212B ANGSTROM SIGN</span>]</span> in the <a href="#aringExample">encoding variations example</a> above)</li>
<li class="dropExampleItem"><span class="dropExample">&#xac00;<span style="font-size:75%">
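
The equivalences described in the changed list items can be checked with a short script. The following is a minimal sketch, not part of the commit: it assumes Python's standard `unicodedata` module, and the helper name `canonically_equivalent` is hypothetical. It compares strings after NFD normalization, which reorders combining marks by canonical combining class but leaves marks of the same class in their original order.

```python
import unicodedata

Q = "\u0071"          # LATIN SMALL LETTER Q
DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE (combining class 230)
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW (combining class 220)
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT (combining class 230)


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: two strings are canonically equivalent
    if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Marks on opposite "sides" have different combining classes, so
# normalization reorders them and the two sequences compare equal.
opposite_sides = canonically_equivalent(
    Q + DOT_ABOVE + DOT_BELOW,
    Q + DOT_BELOW + DOT_ABOVE,
)

# Marks on the same side share a combining class, so their order is
# preserved and the two sequences remain distinct.
same_side = canonically_equivalent(
    Q + DOT_ABOVE + ACUTE,
    Q + ACUTE + DOT_ABOVE,
)

# Singleton mapping: OHM SYMBOL normalizes to GREEK CAPITAL LETTER OMEGA.
ohm_nfc = unicodedata.normalize("NFC", "\u2126")

# Precomposed character: C WITH CEDILLA decomposes to C + COMBINING CEDILLA.
c_cedilla_nfd = unicodedata.normalize("NFD", "\u00C7")

print(opposite_sides, same_side)
print(f"U+{ord(ohm_nfc):04X}", [f"U+{ord(c):04X}" for c in c_cedilla_nfd])
```

Note that this only tests canonical equivalence. Whether two same-side sequences that stay distinct also look different when rendered (the "often" in the revised wording) depends on the script and font, and is not something normalization can show.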
