qa-html-css-normalization: Clarifications for para on NFKC/NFKD

w3c · Aug 31, 2022 · 17cc050
1 parent e0e0458
commit 17cc050
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/questions/qa-html-css-normalization.en.html b/questions/qa-html-css-normalization.en.html
@@ -98,7 +98,7 @@ <h2>What is Unicode normalization?</h2>
 <p>Four <dfn>normalization forms</dfn> are specified by the Unicode Standard: NFC, NFD, NFKC and NFKD. The <span class="qchar">C</span> stands for (pre-)composed, and the <span class="qchar">D</span> for decomposed. The <span class="qchar">K</span> stands for compatibility. </p>
 <p><span class="leadin">NFD</span> uses Unicode rules to maximally decompose a code point into component parts. For example, the Vietnamese letter <span class="codepoint" translate="no"><bdi lang="vi">&#x1EC1;</bdi> [<span class="uname">U+1EC1 LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE</span>]</span> becomes the sequence <span class="codepoint" translate="no"><bdi lang="vi">&#x0065;&#x0302;&#x0300;</bdi> [<span class="uname">U+0065 LATIN SMALL LETTER E</span> + <span class="uname">U+0302 COMBINING CIRCUMFLEX ACCENT</span> + <span class="uname">U+0300 COMBINING GRAVE ACCENT</span>]</span>.</p>
 <p><span class="leadin">NFC</span> runs that process in reverse, and will also completely compose partially decomposed sequences. However, this composition process is only applied to a subset of the Unicode repertoire. For example, the sequence <span class="codepoint" translate="no"><bdi lang="en">&#x0067;&#x0300;</bdi> [<span class="uname">U+0067 LATIN SMALL LETTER G</span> + <span class="uname">U+0300 COMBINING GRAVE ACCENT</span>]</span> has no precomposed form, and is unaffected by normalization.</p>
-<p><span class="leadin">NFKC and NFKD</span> were intially introduced to provide round-trip compatibility with other character sets. This applies to code points that represent such things as glyph variants, shaped forms, alternative compositions, and so on, that can also be represented by other ‘canonical’ code points already in Unicode.  NFKD and NFKC normalization  replaces these code points with canonical characters or character sequences, and you cannot convert back to the original code points.  In principle, such compatibility variants should not be used.</p>
+<p><span class="leadin">NFKC and NFKD</span> were  introduced  to handle characters that were included in Unicode in order to provide compatibility with other character sets. This applies to code points that represent such things as glyph variants, shaped forms, alternative compositions, and so on.  NFKD and NFKC normalization  replaces these code points with canonical characters or character sequences, and you cannot convert back to the original code points.  In principle, such compatibility variants should not be used.</p>
 </section>
 
 
@@ -113,7 +113,7 @@ <h3>Choosing a normalization form</h3>
 
 <p>Natural language content aimed at human consumption does not need to all be in one normalized form – there may sometimes be good reasons to mix normalized forms. Applications that try to match one piece of text with another should, however, compare normalized versions of both.</p>
 
-<p style="">Unfortunately, normalization doesn't always take place before content is compared, and a particularly important case is when CSS selectors are compared with HTML class names or ids, as style is applied to a page. If the word <span class="qterm">világ</span> is used in precomposed form in the HTML (eg. <code>&lt;span class=&quot;világ&quot;&gt;</code>), but in decomposed form in the CSS (eg. <code>.vila&#x0301;g { font-style: italic; }</code>), then the selector won't match the class name.</p>
+<p style="">Unfortunately, normalization doesn't always take place before content is compared, and a particularly important case is when CSS selectors are compared with HTML class names or ids, as style is applied to a page. If the word <span class="qterm">világ</span> (meaning 'world' in Hungarian) is used in precomposed form in the HTML (eg. <code>&lt;span class=&quot;világ&quot;&gt;</code>), but in decomposed form in the CSS (eg. <code>.vila&#x0301;g { font-style: italic; }</code>), then the selector won't match the class name.</p>
 
 <p style="">The following example shows this. The CSS selector is decomposed, whereas one class name in the HTML is decomposed and the other precomposed. As you should be able to see, only the decomposed class name is matched to the style. But notice also that it is not possible to distinguish the two forms in the source text.</p>
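
The behaviour described in the paragraphs above can be reproduced outside the browser. The following is a minimal sketch, not part of the commit, using Python's standard `unicodedata` module. The helper name `same_text` and the choice of U+FB01 as a compatibility character are illustrative assumptions; the other strings are the examples quoted in the article text.

```python
import unicodedata

# The class name from the article, in its two forms.
precomposed = "vil\u00e1g"   # á as the single code point U+00E1
decomposed = "vila\u0301g"   # a followed by U+0301 COMBINING ACUTE ACCENT

# A plain code-point comparison treats the two strings as different,
# which is why the decomposed selector fails to match the precomposed class.
raw_match = precomposed == decomposed


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


normalized_match = same_text(precomposed, decomposed)

# NFD maximally decomposes U+1EC1 into e + circumflex + grave.
nfd_points = [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "\u1ec1")]

# g + combining grave has no precomposed form, so NFC leaves it unchanged.
g_grave = "g\u0300"
g_unchanged = unicodedata.normalize("NFC", g_grave) == g_grave

# Compatibility normalization is lossy: the ligature U+FB01 (an assumed
# example, not taken from the article) becomes the two letters "fi".
nfkc_ligature = unicodedata.normalize("NFKC", "\ufb01")

print(raw_match, normalized_match, nfd_points, g_unchanged, nfkc_ligature)
```

Normalizing both sides to the same form before comparing is the step that selector matching omits, so in practice the class names and the selectors need to be authored in the same form.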