[e] (0) Closer integration with encoding.spec.whatwg.org

Fixing https://www.w3.org/Bugs/Public/show_bug.cgi?id=22661
Affected topics: HTML, HTML Syntax and Parsing, Security

git-svn-id: http://svn.whatwg.org/webapps@8081 340c8d12-0b0e-0410-8428-c7bf67bfef74
Commit 0bbd1b098879ca872e795b4f6467d5d54085768d (1 parent: 0c8820d), committed by Hixie on Jul 23, 2013
Showing with 51 additions and 82 deletions.
  1. +17 −23 complete.html
  2. +17 −23 index
  3. +17 −36 source
40 complete.html
@@ -2473,7 +2473,7 @@ <h4 id=syntax-errors><span class=secno>1.12.2 </span>Syntax errors</h4>
<div class=example>
<p>For example, the restriction on using UTF-7 exists purely to avoid authors falling prey to a
- known cross-site-scripting attack using UTF-7.</p>
+ known cross-site-scripting attack using UTF-7. <a href=#refsUTF7>[UTF7]</a></p>
</div>
@@ -3065,7 +3065,7 @@ <h4 id=encoding-terminology><span class=secno>2.1.6 </span>Character encodings</
0x3F, 0x41 - 0x5A, and 0x61 - 0x7A<!-- is that list ok? do any character sets we want to support
do things outside that range? -->, ignoring bytes that are the second and later bytes of multibyte
sequences, all correspond to single-byte sequences that map to the same Unicode characters as
- those bytes in ANSI_X3.4-1968 (US-ASCII). <a href=#refsRFC1345>[RFC1345]</a></p>
+ those bytes in Windows-1252<!--ANSI_X3.4-1968 (US-ASCII)-->. <a href=#refsENCODING>[ENCODING]</a></p>
<p class=note>This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022,
even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences
@@ -3077,8 +3077,8 @@ <h4 id=encoding-terminology><span class=secno>2.1.6 </span>Character encodings</
different encodings at once, with different <meta charset> elements applying in each case.
-->
- <p>The term <dfn id=a-utf-16-encoding>a UTF-16 encoding</dfn> refers to any variant of UTF-16: self-describing UTF-16
- with a BOM, ambiguous UTF-16 without a BOM, raw UTF-16LE, and raw UTF-16BE. <a href=#refsRFC2781>[RFC2781]</a></p>
+ <p>The term <dfn id=a-utf-16-encoding>a UTF-16 encoding</dfn> refers to any variant of UTF-16: UTF-16LE or UTF-16BE,
+ regardless of the presence or absence of a BOM. <a href=#refsENCODING>[ENCODING]</a></p>
<p>The term <dfn id=code-unit>code unit</dfn> is used as defined in the Web IDL specification: a 16 bit
unsigned integer, the smallest atomic component of a <code>DOMString</code>. (This is a narrower
@@ -3431,6 +3431,10 @@ <h4 id=dependencies><span class=secno>2.2.2 </span>Dependencies</h4>
algorithm</i>. The latter first strips a Byte Order Mark (BOM), if any, and then invokes the
former.</p>
+ <p>For readability, character encodings are sometimes referenced in this specification with a
+ case that differs from the canonical case given in the encoding standard. (For example,
+ "UTF-16LE" instead of "utf16-le".)</p>
+
</dd>
@@ -86613,13 +86617,6 @@ <h5 id=character-encodings><span class=secno>12.2.2.3 </span>Character encodings
UTF-32 in its algorithms; support and use of these encodings can thus lead to unexpected behavior
in implementations of this specification.</p>
- <p>When a user agent is to use the self-describing UTF-16 encoding but no Byte Order Mark (BOM)
- has been found, user agents must default to little-endian UTF-16.</p>
-
- <p class=note>The requirement to default UTF-16 to little-endian rather than big-endian is a
- <a href=#willful-violation>willful violation</a> of RFC 2781, motivated by a desire for compatibility with legacy
- content. <a href=#refsRFC2781>[RFC2781]</a></p>
-
<h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Changing the encoding while parsing</h5>
@@ -103331,7 +103328,7 @@ <h2 class=no-num id=references>References</h2><!--REFS-->
<dd><cite><a href=http://fetch.spec.whatwg.org/>Cross-Origin Resource Sharing</a></cite>, A. van Kesteren. WHATWG.</dd>
<dt id=refsCP50220>[CP50220]</dt>
- <dd><cite><a href=http://www.iana.org/assignments/charset-reg/CP50220>CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
+ <dd>(Non-normative) <cite><a href=http://www.iana.org/assignments/charset-reg/CP50220>CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
<dt id=refsCSP>[CSP]</dt>
<dd>(Non-normative) <cite><a href=http://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html>Content Security Policy</a></cite>, B. Sterne, A. Barth. W3C.</dd>
@@ -103523,22 +103520,22 @@ <h2 class=no-num id=references>References</h2><!--REFS-->
<dd><cite><a href=http://tools.ietf.org/html/rfc1123>Requirements for Internet Hosts -- Application and Support</a></cite>, R. Braden. IETF, October 1989.</dd>
<dt id=refsRFC1345>[RFC1345]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1345>Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1345>Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
<dt id=refsRFC1468>[RFC1468]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1468>Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1468>Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
<dt id=refsRFC1554>[RFC1554]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1554>ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1554>ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
<dt id=refsRFC1557>[RFC1557]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1557>Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1557>Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
<dt id=refsRFC1842>[RFC1842]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1842>ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1842>ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
<dt id=refsRFC1922>[RFC1922]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1922>Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1922>Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
<dt id=refsRFC2046>[RFC2046]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc2046>Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types</a></cite>, N. Freed, N. Borenstein. IETF.</dd> <!-- for text/plain and "Internet Media type"; not for definition of "valid MIME type". -->
@@ -103547,7 +103544,7 @@ <h2 class=no-num id=references>References</h2><!--REFS-->
<dd><cite><a href=http://tools.ietf.org/html/rfc2119>Key words for use in RFCs to Indicate Requirement Levels</a></cite>, S. Bradner. IETF.</dd>
<dt id=refsRFC2237>[RFC2237]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc2237>Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc2237>Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
<dt id=refsRFC2313>[RFC2313]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc2313>PKCS #1: RSA Encryption</a></cite>, B. Kaliski. IETF.</dd>
@@ -103567,9 +103564,6 @@ <h2 class=no-num id=references>References</h2><!--REFS-->
<dt id=refsRFC2483>[RFC2483]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc2483>URI Resolution Services Necessary for URN Resolution</a></cite>, M. Mealling, R. Daniel. IETF.</dd>
- <dt id=refsRFC2781>[RFC2781]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc2781>UTF-16, an encoding of ISO 10646</a></cite>, P. Hoffman, F. Yergeau. IETF.</dd>
-
<dt id=refsRFC3676>[RFC3676]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc3676>The Text/Plain Format and DelSp Parameters</a></cite>, R. Gellens. IETF.</dd>
@@ -103652,7 +103646,7 @@ <h2 class=no-num id=references>References</h2><!--REFS-->
<dd><cite><a href=http://url.spec.whatwg.org/>URL</a></cite>, A. van Kesteren. WHATWG.</dd>
<dt id=refsUTF7>[UTF7]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc2152>UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc2152>UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
<dt id=refsUTF8DET>[UTF8DET]</dt>
<dd>(Non-normative) <cite><a href=http://www.w3.org/International/questions/qa-forms-utf-8>Multilingual form encoding</a></cite>, M. D&uuml;rst. W3C.</dd>
40 index
@@ -2473,7 +2473,7 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
<div class=example>
<p>For example, the restriction on using UTF-7 exists purely to avoid authors falling prey to a
- known cross-site-scripting attack using UTF-7.</p>
+ known cross-site-scripting attack using UTF-7. <a href=#refsUTF7>[UTF7]</a></p>
</div>
@@ -3065,7 +3065,7 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
0x3F, 0x41 - 0x5A, and 0x61 - 0x7A<!-- is that list ok? do any character sets we want to support
do things outside that range? -->, ignoring bytes that are the second and later bytes of multibyte
sequences, all correspond to single-byte sequences that map to the same Unicode characters as
- those bytes in ANSI_X3.4-1968 (US-ASCII). <a href=#refsRFC1345>[RFC1345]</a></p>
+ those bytes in Windows-1252<!--ANSI_X3.4-1968 (US-ASCII)-->. <a href=#refsENCODING>[ENCODING]</a></p>
<p class=note>This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022,
even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences
@@ -3077,8 +3077,8 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
different encodings at once, with different <meta charset> elements applying in each case.
-->
- <p>The term <dfn id=a-utf-16-encoding>a UTF-16 encoding</dfn> refers to any variant of UTF-16: self-describing UTF-16
- with a BOM, ambiguous UTF-16 without a BOM, raw UTF-16LE, and raw UTF-16BE. <a href=#refsRFC2781>[RFC2781]</a></p>
+ <p>The term <dfn id=a-utf-16-encoding>a UTF-16 encoding</dfn> refers to any variant of UTF-16: UTF-16LE or UTF-16BE,
+ regardless of the presence or absence of a BOM. <a href=#refsENCODING>[ENCODING]</a></p>
<p>The term <dfn id=code-unit>code unit</dfn> is used as defined in the Web IDL specification: a 16 bit
unsigned integer, the smallest atomic component of a <code>DOMString</code>. (This is a narrower
@@ -3431,6 +3431,10 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
algorithm</i>. The latter first strips a Byte Order Mark (BOM), if any, and then invokes the
former.</p>
+ <p>For readability, character encodings are sometimes referenced in this specification with a
+ case that differs from the canonical case given in the encoding standard. (For example,
+ "UTF-16LE" instead of "utf16-le".)</p>
+
</dd>
@@ -86613,13 +86617,6 @@ dictionary <dfn id=storageeventinit>StorageEventInit</dfn> : <a href=#eventinit>
UTF-32 in its algorithms; support and use of these encodings can thus lead to unexpected behavior
in implementations of this specification.</p>
- <p>When a user agent is to use the self-describing UTF-16 encoding but no Byte Order Mark (BOM)
- has been found, user agents must default to little-endian UTF-16.</p>
-
- <p class=note>The requirement to default UTF-16 to little-endian rather than big-endian is a
- <a href=#willful-violation>willful violation</a> of RFC 2781, motivated by a desire for compatibility with legacy
- content. <a href=#refsRFC2781>[RFC2781]</a></p>
-
<h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Changing the encoding while parsing</h5>
@@ -103331,7 +103328,7 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href=http://fetch.spec.whatwg.org/>Cross-Origin Resource Sharing</a></cite>, A. van Kesteren. WHATWG.</dd>
<dt id=refsCP50220>[CP50220]</dt>
- <dd><cite><a href=http://www.iana.org/assignments/charset-reg/CP50220>CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
+ <dd>(Non-normative) <cite><a href=http://www.iana.org/assignments/charset-reg/CP50220>CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
<dt id=refsCSP>[CSP]</dt>
<dd>(Non-normative) <cite><a href=http://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html>Content Security Policy</a></cite>, B. Sterne, A. Barth. W3C.</dd>
@@ -103523,22 +103520,22 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href=http://tools.ietf.org/html/rfc1123>Requirements for Internet Hosts -- Application and Support</a></cite>, R. Braden. IETF, October 1989.</dd>
<dt id=refsRFC1345>[RFC1345]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1345>Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1345>Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
<dt id=refsRFC1468>[RFC1468]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1468>Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1468>Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
<dt id=refsRFC1554>[RFC1554]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1554>ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1554>ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
<dt id=refsRFC1557>[RFC1557]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1557>Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1557>Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
<dt id=refsRFC1842>[RFC1842]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1842>ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1842>ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
<dt id=refsRFC1922>[RFC1922]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc1922>Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc1922>Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
<dt id=refsRFC2046>[RFC2046]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc2046>Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types</a></cite>, N. Freed, N. Borenstein. IETF.</dd> <!-- for text/plain and "Internet Media type"; not for definition of "valid MIME type". -->
@@ -103547,7 +103544,7 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href=http://tools.ietf.org/html/rfc2119>Key words for use in RFCs to Indicate Requirement Levels</a></cite>, S. Bradner. IETF.</dd>
<dt id=refsRFC2237>[RFC2237]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc2237>Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc2237>Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
<dt id=refsRFC2313>[RFC2313]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc2313>PKCS #1: RSA Encryption</a></cite>, B. Kaliski. IETF.</dd>
@@ -103567,9 +103564,6 @@ if (s = prompt('What is your name?')) {
<dt id=refsRFC2483>[RFC2483]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc2483>URI Resolution Services Necessary for URN Resolution</a></cite>, M. Mealling, R. Daniel. IETF.</dd>
- <dt id=refsRFC2781>[RFC2781]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc2781>UTF-16, an encoding of ISO 10646</a></cite>, P. Hoffman, F. Yergeau. IETF.</dd>
-
<dt id=refsRFC3676>[RFC3676]</dt>
<dd><cite><a href=http://tools.ietf.org/html/rfc3676>The Text/Plain Format and DelSp Parameters</a></cite>, R. Gellens. IETF.</dd>
@@ -103652,7 +103646,7 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href=http://url.spec.whatwg.org/>URL</a></cite>, A. van Kesteren. WHATWG.</dd>
<dt id=refsUTF7>[UTF7]</dt>
- <dd><cite><a href=http://tools.ietf.org/html/rfc2152>UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
+ <dd>(Non-normative) <cite><a href=http://tools.ietf.org/html/rfc2152>UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
<dt id=refsUTF8DET>[UTF8DET]</dt>
<dd>(Non-normative) <cite><a href=http://www.w3.org/International/questions/qa-forms-utf-8>Multilingual form encoding</a></cite>, M. D&uuml;rst. W3C.</dd>
53 source
@@ -1189,7 +1189,7 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
<div class="example">
<p>For example, the restriction on using UTF-7 exists purely to avoid authors falling prey to a
- known cross-site-scripting attack using UTF-7.</p>
+ known cross-site-scripting attack using UTF-7. <a href="#refsUTF7">[UTF7]</a></p>
</div>
@@ -1823,7 +1823,7 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
0x3F, 0x41 - 0x5A, and 0x61 - 0x7A<!-- is that list ok? do any character sets we want to support
do things outside that range? -->, ignoring bytes that are the second and later bytes of multibyte
sequences, all correspond to single-byte sequences that map to the same Unicode characters as
- those bytes in ANSI_X3.4-1968 (US-ASCII). <a href="#refsRFC1345">[RFC1345]</a></p>
+ those bytes in Windows-1252<!--ANSI_X3.4-1968 (US-ASCII)-->. <a href="#refsENCODING">[ENCODING]</a></p>
<p class="note">This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022,
even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences
@@ -1835,9 +1835,8 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
different encodings at once, with different <meta charset> elements applying in each case.
-->
- <p>The term <dfn>a UTF-16 encoding</dfn> refers to any variant of UTF-16: self-describing UTF-16
- with a BOM, ambiguous UTF-16 without a BOM, raw UTF-16LE, and raw UTF-16BE. <a
- href="#refsRFC2781">[RFC2781]</a></p>
+ <p>The term <dfn>a UTF-16 encoding</dfn> refers to any variant of UTF-16: UTF-16LE or UTF-16BE,
+ regardless of the presence or absence of a BOM. <a href="#refsENCODING">[ENCODING]</a></p>
<p>The term <dfn>code unit</dfn> is used as defined in the Web IDL specification: a 16 bit
unsigned integer, the smallest atomic component of a <code>DOMString</code>. (This is a narrower
@@ -2211,6 +2210,10 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
algorithm</i>. The latter first strips a Byte Order Mark (BOM), if any, and then invokes the
former.</p>
+ <p>For readability, character encodings are sometimes referenced in this specification with a
+ case that differs from the canonical case given in the encoding standard. (For example,
+ "UTF-16LE" instead of "utf16-le".)</p>
+
</dd>
@@ -96680,13 +96683,6 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {
UTF-32 in its algorithms; support and use of these encodings can thus lead to unexpected behavior
in implementations of this specification.</p>
- <p>When a user agent is to use the self-describing UTF-16 encoding but no Byte Order Mark (BOM)
- has been found, user agents must default to little-endian UTF-16.</p>
-
- <p class="note">The requirement to default UTF-16 to little-endian rather than big-endian is a
- <span>willful violation</span> of RFC 2781, motivated by a desire for compatibility with legacy
- content. <a href="#refsRFC2781">[RFC2781]</a></p>
-
<h5>Changing the encoding while parsing</h5>
@@ -115757,9 +115753,6 @@ if (s = prompt('What is your name?')) {
<dt id="refsBIDI">[BIDI]</dt>
<dd><cite><a href="http://www.unicode.org/reports/tr9/">UAX #9: Unicode Bidirectional Algorithm</a></cite>, M. Davis. Unicode Consortium.</dd>
- <dt id="refsBIG5">[BIG5]</dt>
- <dd>(Non-normative) <cite>Chinese Coded Character Set in Computer</cite>. Institute for Information Industry, March 1984.</dd>
-
<dt id="refsBOCU1">[BOCU1]</dt>
<dd>(Non-normative) <cite><a href="http://www.unicode.org/notes/tn6/">UTN #6: BOCU-1: MIME-Compatible Unicode Compression</a></cite>, M. Scherer, M. Davis. Unicode Consortium.</dd>
@@ -115785,10 +115778,7 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href="http://fetch.spec.whatwg.org/">Cross-Origin Resource Sharing</a></cite>, A. van Kesteren. WHATWG.</dd>
<dt id="refsCP50220">[CP50220]</dt>
- <dd><cite><a href="http://www.iana.org/assignments/charset-reg/CP50220">CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
-
- <dt id="refsCP51932">[CP51932]</dt>
- <dd><cite><a href="http://www.iana.org/assignments/charset-reg/CP51932">CP51932</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
+ <dd>(Non-normative) <cite><a href="http://www.iana.org/assignments/charset-reg/CP50220">CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
<dt id="refsCSP">[CSP]</dt>
<dd>(Non-normative) <cite><a href="http://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html">Content Security Policy</a></cite>, B. Sterne, A. Barth. W3C.</dd>
@@ -115930,9 +115920,6 @@ if (s = prompt('What is your name?')) {
<dt id="refsISO8601">[ISO8601]</dt>
<dd>(Non-normative) <cite><a href="http://isotc.iso.org/livelink/livelink/4021199/ISO_8601_2004_E.zip?func=doc.Fetch&amp;nodeid=4021199">ISO8601: Data elements and interchange formats &mdash; Information interchange &mdash; Representation of dates and times</a></cite>. ISO.</dd>
- <dt id="refsISO885911">[ISO885911]</dt>
- <dd><cite><a href="http://std.dkuug.dk/jtc1/sc2/open/02n3333.pdf">ISO-8859-11: Information technology &mdash; 8-bit single-byte coded character sets &mdash; Part 11: Latin/Thai alphabet</a></cite>. ISO.</dd>
-
<dt id="refsJLREQ">[JLREQ]</dt>
<dd><cite><a href="http://www.w3.org/TR/jlreq/">Requirements for Japanese Text Layout</a></cite>. W3C.</dd> <!-- too many editors to list -->
@@ -116026,25 +116013,25 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href="http://tools.ietf.org/html/rfc1321">The MD5 Message-Digest Algorithm</a></cite>, R. Rivest. IETF.</dd>
<dt id="refsRFC1345">[RFC1345]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc1345">Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1345">Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
<dt id="refsRFC1468">[RFC1468]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc1468">Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1468">Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
<dt id="refsRFC1494">[RFC1494]</dt>
<dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1494">Equivalences between 1988 X.400 and RFC-822 Message Bodies</a></cite>, H. Alvestrand, S. Thompson. IETF.</dd>
<dt id="refsRFC1554">[RFC1554]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc1554">ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1554">ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
<dt id="refsRFC1557">[RFC1557]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc1557">Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1557">Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
<dt id="refsRFC1842">[RFC1842]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc1842">ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1842">ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
<dt id="refsRFC1922">[RFC1922]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc1922">Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1922">Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
<dt id="refsRFC2045">[RFC2045]</dt>
<dd><cite><a href="http://tools.ietf.org/html/rfc2045">Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</a></cite>, N. Freed, N. Borenstein. IETF.</dd>
@@ -116056,7 +116043,7 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href="http://tools.ietf.org/html/rfc2119">Key words for use in RFCs to Indicate Requirement Levels</a></cite>, S. Bradner. IETF.</dd>
<dt id="refsRFC2237">[RFC2237]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc2237">Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc2237">Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
<dt id="refsRFC2246">[RFC2246]</dt>
<dd><cite><a href="http://tools.ietf.org/html/rfc2246">The TLS Protocol Version 1.0</a></cite>, T. Dierks, C. Allen. IETF.</dd>
@@ -116079,9 +116066,6 @@ if (s = prompt('What is your name?')) {
<dt id="refsRFC2483">[RFC2483]</dt>
<dd><cite><a href="http://tools.ietf.org/html/rfc2483">URI Resolution Services Necessary for URN Resolution</a></cite>, M. Mealling, R. Daniel. IETF.</dd>
- <dt id="refsRFC2781">[RFC2781]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc2781">UTF-16, an encoding of ISO 10646</a></cite>, P. Hoffman, F. Yergeau. IETF.</dd>
-
<dt id="refsRFC3676">[RFC3676]</dt>
<dd><cite><a href="http://tools.ietf.org/html/rfc3676">The Text/Plain Format and DelSp Parameters</a></cite>, R. Gellens. IETF.</dd>
@@ -116197,7 +116181,7 @@ if (s = prompt('What is your name?')) {
<dd><cite><a href="http://url.spec.whatwg.org/">URL</a></cite>, A. van Kesteren. WHATWG.</dd>
<dt id="refsUTF7">[UTF7]</dt>
- <dd><cite><a href="http://tools.ietf.org/html/rfc2152">UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
+ <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc2152">UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
<dt id="refsUTF8DET">[UTF8DET]</dt>
<dd>(Non-normative) <cite><a href="http://www.w3.org/International/questions/qa-forms-utf-8">Multilingual form encoding</a></cite>, M. D&uuml;rst. W3C.</dd>
@@ -116208,9 +116192,6 @@ if (s = prompt('What is your name?')) {
<dt id="refsWCAG">[WCAG]</dt>
<dd>(Non-normative) <cite><a href="http://www.w3.org/TR/WCAG20/">Web Content Accessibility Guidelines (WCAG) 2.0</a></cite>, B. Caldwell, M. Cooper, L. Reid, G. Vanderheiden. W3C.</dd>
- <dt id="refsWEBADDRESSES">[WEBADDRESSES]</dt>
- <dd><cite><a href="http://www.w3.org/html/wg/href/draft">Web addresses in HTML5</a></cite>, D. Connolly, C. Sperberg-McQueen.</dd>
-
<dt id="refsWEBGL">[WEBGL]</dt>
<dd><cite><a href="http://www.khronos.org/registry/webgl/specs/latest/">WebGL Specification</a></cite>, D. Jackson. Khronos Group.</dd>
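
The Dependencies hunk in this diff describes a decode step that "first strips a Byte Order Mark (BOM), if any, and then invokes" the decoder. The following is a minimal Python sketch of that shape, not the algorithm from the encoding standard. The function names `sniff_bom` and `decode_with_fallback` are hypothetical, and so are two behaviours: letting the BOM choose the encoding, and falling back to little-endian UTF-16 when no BOM is present (which mirrors the requirement this commit removes from the HTML spec text).

```python
import codecs

# BOM byte sequences mapped to the encoding they are assumed to signal.
# None of these three is a prefix of another, so order does not matter.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16le"),  # FF FE
)


def sniff_bom(data: bytes):
    """Return (label, remaining_bytes); label is None when no BOM is found."""
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label, data[len(bom):]
    return None, data


def decode_with_fallback(data: bytes, fallback: str = "utf-16le") -> str:
    """Strip a BOM if present, then decode.

    Assumption for illustration: a BOM, when present, selects the encoding;
    otherwise the caller-supplied fallback is used.
    """
    label, body = sniff_bom(data)
    return body.decode(label or fallback)
```

Here `decode_with_fallback(b"\xfe\xff\x00h\x00i")` strips the big-endian BOM and decodes the remaining bytes as UTF-16BE, while input without a BOM is decoded with the fallback label.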
