[c] (0) Reword how we require that XML documents that use <meta charset> must use UTF-8. Also require it in the first 512 bytes.

git-svn-id: http://svn.whatwg.org/webapps@2861 340c8d12-0b0e-0410-8428-c7bf67bfef74
1 parent bc76119 commit ed6aa96ff91e69a40add0c3ed7dc59c07163c944 @Hixie Hixie committed Feb 23, 2009
Showing with 46 additions and 31 deletions.
  1. +23 −15 index
  2. +23 −16 source
38 index
@@ -9145,13 +9145,17 @@ http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C%21DOCTYPE%20HTML%3E%0
also be specified. Otherwise, it must be omitted.</p>
<p>The <dfn id=attr-meta-charset title=attr-meta-charset><code>charset</code></dfn>
- attribute specifies the character encoding used by the document. In
- <a href=#html5 title=HTML5>HTML documents</a> this is a <a href=#character-encoding-declaration>character
- encoding declaration</a>. If the attribute is present in an <a href=#xhtml5 title=XHTML>XML document</a>, its value must be an <a href=#ascii-case-insensitive>ASCII
- case-insensitive</a> match for the string "<code title="">UTF-8</code>", and the resource must be encoded using the
- UTF-8 character encoding. (The element has no effect in XML
- documents, and is only allowed to facilitate migration to and from
- XHTML.)</p>
+ attribute specifies the character encoding used by the
+ document. This is a <a href=#character-encoding-declaration>character encoding declaration</a>. If
+ the attribute is present in an <a href=#xhtml5 title=XHTML>XML
+ document</a>, its value must be an <a href=#ascii-case-insensitive>ASCII
+ case-insensitive</a> match for the string "<code title="">UTF-8</code>" (and the document is therefore required to
+ use UTF-8 as its encoding).</p>
+
+ <p class=note>The <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code>
+ attribute on the <code><a href=#meta>meta</a></code> element has no effect in XML
+ documents, and is only allowed in order to facilitate migration to
+ and from XHTML.</p>
<p>There must not be more than one <code><a href=#meta>meta</a></code> element with a
<code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute per
@@ -9645,7 +9649,9 @@ people expect to have work and what is necessary.
<!-- XXX maybe the rest should move to "writing html" section,
though if we do then we have to duplicate the requirements in the
- parsing section for conformance checkers -->
+ parsing section for conformance checkers, and we have to make sure
+ that the requirements for charset="" apply even in XML, for the
+ polyglot hack -->
<p>A <dfn id=character-encoding-declaration>character encoding declaration</dfn> is a mechanism by
which the character encoding used to store or transmit a document is
@@ -9669,16 +9675,18 @@ people expect to have work and what is necessary.
declaration must be serialised completely within the first 512
bytes of the document.</li>
- </ul><p>If the document does not start with a BOM, and if its encoding is
- not explicitly given by <a href=#content-type-0 title=Content-Type>Content-Type
- metadata</a>, then the character encoding used must be an
- <a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>, and, in addition,
- if that encoding isn't US-ASCII itself, then the encoding must be
- specified using a <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
+ </ul><p>If an <a href=#html-documents title="HTML documents">HTML document</a> does not
+ start with a BOM, and if its encoding is not explicitly given by
+ <a href=#content-type-0 title=Content-Type>Content-Type metadata</a>, then the
+ character encoding used must be an <a href=#ascii-compatible-character-encoding>ASCII-compatible character
+ encoding</a>, and, in addition, if that encoding isn't US-ASCII
+ itself, then the encoding must be specified using a
+ <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
<code><a href=#meta>meta</a></code> element in the <a href=#attr-meta-http-equiv-content-type title=attr-meta-http-equiv-content-type>Encoding declaration
state</a>.</p>
- <p>If the document contains a <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
+ <p>If an <a href=#html-documents title="HTML documents">HTML document</a> contains
+ a <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
<code><a href=#meta>meta</a></code> element in the <a href=#attr-meta-http-equiv-content-type title=attr-meta-http-equiv-content-type>Encoding declaration
state</a>, then the character encoding used must be an
<a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.</p>
39 source
@@ -9488,15 +9488,18 @@ http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C%21DOCTYPE%20HTML%3E%0
also be specified. Otherwise, it must be omitted.</p>
<p>The <dfn title="attr-meta-charset"><code>charset</code></dfn>
- attribute specifies the character encoding used by the document. In
- <span title="HTML5">HTML documents</span> this is a <span>character
- encoding declaration</span>. If the attribute is present in an <span
- title="XHTML">XML document</span>, its value must be an <span>ASCII
+ attribute specifies the character encoding used by the
+ document. This is a <span>character encoding declaration</span>. If
+ the attribute is present in an <span title="XHTML">XML
+ document</span>, its value must be an <span>ASCII
case-insensitive</span> match for the string "<code
- title="">UTF-8</code>", and the resource must be encoded using the
- UTF-8 character encoding. (The element has no effect in XML
- documents, and is only allowed to facilitate migration to and from
- XHTML.)</p>
+ title="">UTF-8</code>" (and the document is therefore required to
+ use UTF-8 as its encoding).</p>
+
+ <p class="note">The <code title="attr-meta-charset">charset</code>
+ attribute on the <code>meta</code> element has no effect in XML
+ documents, and is only allowed in order to facilitate migration to
+ and from XHTML.</p>
<p>There must not be more than one <code>meta</code> element with a
<code title="attr-meta-charset">charset</code> attribute per
@@ -10081,7 +10084,9 @@ people expect to have work and what is necessary.
<!-- XXX maybe the rest should move to "writing html" section,
though if we do then we have to duplicate the requirements in the
- parsing section for conformance checkers -->
+ parsing section for conformance checkers, and we have to make sure
+ that the requirements for charset="" apply even in XML, for the
+ <meta charset=""> polyglot hack -->
<p>A <dfn>character encoding declaration</dfn> is a mechanism by
which the character encoding used to store or transmit a document is
@@ -10110,18 +10115,20 @@ people expect to have work and what is necessary.
</ul>
- <p>If the document does not start with a BOM, and if its encoding is
- not explicitly given by <span title="Content-Type">Content-Type
- metadata</span>, then the character encoding used must be an
- <span>ASCII-compatible character encoding</span>, and, in addition,
- if that encoding isn't US-ASCII itself, then the encoding must be
- specified using a <code>meta</code> element with a <code
+ <p>If an <span title="HTML documents">HTML document</span> does not
+ start with a BOM, and if its encoding is not explicitly given by
+ <span title="Content-Type">Content-Type metadata</span>, then the
+ character encoding used must be an <span>ASCII-compatible character
+ encoding</span>, and, in addition, if that encoding isn't US-ASCII
+ itself, then the encoding must be specified using a
+ <code>meta</code> element with a <code
title="attr-meta-charset">charset</code> attribute or a
<code>meta</code> element in the <span
title="attr-meta-http-equiv-content-type">Encoding declaration
state</span>.</p>
- <p>If the document contains a <code>meta</code> element with a <code
+ <p>If an <span title="HTML documents">HTML document</span> contains
+ a <code>meta</code> element with a <code
title="attr-meta-charset">charset</code> attribute or a
<code>meta</code> element in the <span
title="attr-meta-http-equiv-content-type">Encoding declaration
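
Note on the reworded requirements: a minimal sketch of how a checker might test them for an XML document, assuming Python 3. The names `check_xml_meta_charset` and `ascii_lower` and the regular expression are hypothetical and not part of the spec or this commit. The pattern only recognises a `meta` element whose sole attribute is `charset`, so it is a rough approximation and not the spec's parsing algorithm.

```python
import re
from typing import List

# Hypothetical, simplified pattern: a meta element with only a charset attribute.
META_CHARSET = re.compile(
    rb"<meta\s+charset\s*=\s*(?:\"([^\"]*)\"|'([^']*)'|([^\s\"'>/]+))\s*/?>",
    re.IGNORECASE,
)


def ascii_lower(s: str) -> str:
    """Lowercase only A-Z, which is what an ASCII case-insensitive match compares."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def check_xml_meta_charset(data: bytes) -> List[str]:
    """Return a list of problems found; an empty list means no problem was detected."""
    problems: List[str] = []
    m = META_CHARSET.search(data)
    if m is None:
        return problems  # no charset declaration, so nothing to check here

    # Exactly one of the three alternatives matched (quoted or unquoted value).
    raw = next(g for g in m.groups() if g is not None)
    label = raw.decode("ascii", "replace")

    # The value must be an ASCII case-insensitive match for "UTF-8".
    if ascii_lower(label) != "utf-8":
        problems.append("charset value is not UTF-8: %r" % label)

    # The declaration must be serialised completely within the first 512 bytes.
    if m.end() > 512:
        problems.append("declaration extends past the first 512 bytes")

    # The document is therefore required to use UTF-8 as its encoding.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("document bytes are not valid UTF-8")

    return problems
```

For example, `check_xml_meta_charset(b'<meta charset="utf-8"/>')` returns an empty list, while the same declaration preceded by 600 bytes of padding is reported as falling outside the first 512 bytes.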
