
[e] (0) Move the Content-Type encoding parsing hack of an algorithm back into HTML5 from MIMESNIFF.

git-svn-id: http://svn.whatwg.org/webapps@5042 340c8d12-0b0e-0410-8428-c7bf67bfef74
Hixie committed Apr 14, 2010
1 parent cd7b56c commit 11db5093934038ca9827c95787aee53e1f235855
Showing with 144 additions and 21 deletions.
  1. +45 −7 complete.html
  2. +45 −7 index
  3. +54 −7 source
@@ -186,7 +186,7 @@

<header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
<hgroup><h1>Web Applications 1.0</h1>
<h2 class="no-num no-toc">Draft Standard &mdash; 13 April 2010</h2>
<h2 class="no-num no-toc">Draft Standard &mdash; 14 April 2010</h2>
</hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
<p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
<!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -6368,12 +6368,6 @@ <h4 id=content-type-sniffing><span class=secno>2.6.3 </span>Determining the type
with the requirements of the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>

<p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
Content-Type</dfn>, given a string <var title="">s</var>, is given
in the Content-Type Processing Model specification. It either
returns an encoding or nothing. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
<p class=XXX>The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>

<p>The <dfn id=content-type-sniffing-0 title="Content-Type sniffing">sniffed type of a
resource</dfn> must be found in a manner consistent with the
requirements given in the Content-Type Processing Model
@@ -6394,6 +6388,50 @@ <h4 id=content-type-sniffing><span class=secno>2.6.3 </span>Determining the type
occur. For more details, see the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>

<p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
Content-Type</dfn>, given a string <var title="">s</var>, is as
follows. It either returns an encoding or nothing.</p>

<ol><li><p>Find the first seven characters in <var title="">s</var>
that are an <a href=#ascii-case-insensitive>ASCII case-insensitive</a> match for the word
"<code title="">charset</code>". If no such match is found, return
nothing.</li>

<li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>

<li><p>If the next character is not a U+003D EQUALS SIGN ('='),
return nothing and abort these steps.</li>

<li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
characters that immediately follow the equals sign (there might not
be any).</li>

<li>

<p>Process the next character as follows:</p>

<dl class=switch><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
<dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
<dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>

<dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
<dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
<dt>If there is no next character</dt>
<dd>Return nothing.</dd>

<dt>Otherwise</dt>
<dd>Return the encoding corresponding to the string from this
character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
U+003B character or the end of <var title="">s</var>, whichever
comes first.</dd>

</dl></li>

</ol><p class=note>This requirement is a <a href=#willful-violation>willful violation</a>
of the HTTP specification, motivated by the need for backwards
compatibility with legacy content. <a href=#refsHTTP>[HTTP]</a></p>

</div>


52 index
@@ -190,7 +190,7 @@

<header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
<hgroup><h1>HTML5 (including next generation additions still in development)</h1>
<h2 class="no-num no-toc">Draft Standard &mdash; 13 April 2010</h2>
<h2 class="no-num no-toc">Draft Standard &mdash; 14 April 2010</h2>
</hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
<p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
<!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -6266,12 +6266,6 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
with the requirements of the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>

<p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
Content-Type</dfn>, given a string <var title="">s</var>, is given
in the Content-Type Processing Model specification. It either
returns an encoding or nothing. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
<p class=XXX>The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>

<p>The <dfn id=content-type-sniffing-0 title="Content-Type sniffing">sniffed type of a
resource</dfn> must be found in a manner consistent with the
requirements given in the Content-Type Processing Model
@@ -6292,6 +6286,50 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
occur. For more details, see the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>

<p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
Content-Type</dfn>, given a string <var title="">s</var>, is as
follows. It either returns an encoding or nothing.</p>

<ol><li><p>Find the first seven characters in <var title="">s</var>
that are an <a href=#ascii-case-insensitive>ASCII case-insensitive</a> match for the word
"<code title="">charset</code>". If no such match is found, return
nothing.</li>

<li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>

<li><p>If the next character is not a U+003D EQUALS SIGN ('='),
return nothing and abort these steps.</li>

<li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
characters that immediately follow the equals sign (there might not
be any).</li>

<li>

<p>Process the next character as follows:</p>

<dl class=switch><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
<dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
<dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>

<dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
<dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
<dt>If there is no next character</dt>
<dd>Return nothing.</dd>

<dt>Otherwise</dt>
<dd>Return the encoding corresponding to the string from this
character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
U+003B character or the end of <var title="">s</var>, whichever
comes first.</dd>

</dl></li>

</ol><p class=note>This requirement is a <a href=#willful-violation>willful violation</a>
of the HTTP specification, motivated by the need for backwards
compatibility with legacy content. <a href=#refsHTTP>[HTTP]</a></p>

</div>


61 source
@@ -5954,13 +5954,6 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
with the requirements of the Content-Type Processing Model
specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>

<p>The <dfn>algorithm for extracting an encoding from a
Content-Type</dfn>, given a string <var title="">s</var>, is given
in the Content-Type Processing Model specification. It either
returns an encoding or nothing. <a
href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
<p class="XXX">The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>

<p>The <dfn title="Content-Type sniffing">sniffed type of a
resource</dfn> must be found in a manner consistent with the
requirements given in the Content-Type Processing Model
@@ -5981,6 +5974,60 @@ a.setAttribute('href', 'http://example.com/'); // change the content attribute d
occur. For more details, see the Content-Type Processing Model
specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>

<p>The <dfn>algorithm for extracting an encoding from a
Content-Type</dfn>, given a string <var title="">s</var>, is as
follows. It either returns an encoding or nothing.</p>

<ol>

<li><p>Find the first seven characters in <var title="">s</var>
that are an <span>ASCII case-insensitive</span> match for the word
"<code title="">charset</code>". If no such match is found, return
nothing.</p></li>

<li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
characters that immediately follow the word "<code
title="">charset</code>" (there might not be any).</p></li>

<li><p>If the next character is not a U+003D EQUALS SIGN ('='),
return nothing and abort these steps.</p></li>

<li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
characters that immediately follow the equals sign (there might not
be any).</p></li>

<li>

<p>Process the next character as follows:</p>

<dl class="switch">

<dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
<dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
<dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>

<dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
<dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
<dt>If there is no next character</dt>
<dd>Return nothing.</dd>

<dt>Otherwise</dt>
<dd>Return the encoding corresponding to the string from this
character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
U+003B character or the end of <var title="">s</var>, whichever
comes first.</dd>

</dl>

</li>

</ol>

<p class="note">This requirement is a <span>willful violation</span>
of the HTTP specification, motivated by the need for backwards
compatibility with legacy content. <a
href="#refsHTTP">[HTTP]</a></p>

</div>
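
For readers who want to check their reading of the five steps added by this commit, here is a minimal Python sketch of the extraction algorithm. It is not part of the commit or the spec. The function name `extract_charset_label` is hypothetical, and the sketch returns the raw label string instead of resolving it to an encoding, since the label-to-encoding lookup is defined elsewhere. It also assumes that reaching the end of the string where an equals sign is expected counts as "return nothing".

```python
import re
from typing import Optional

# The five characters the spec lists as skippable:
# U+0009, U+000A, U+000C, U+000D, U+0020.
SKIPPABLE = "\t\n\x0c\r "


def extract_charset_label(s: str) -> Optional[str]:
    """Return the raw charset label found in s, or None for "nothing".

    Hypothetical helper: mapping the label to an actual encoding is
    left to the caller.
    """
    # Step 1: first ASCII case-insensitive match for "charset".
    # re.ASCII keeps IGNORECASE from matching non-ASCII look-alikes.
    match = re.search("charset", s, re.IGNORECASE | re.ASCII)
    if match is None:
        return None
    pos = match.end()

    # Step 2: skip whitespace after the word "charset".
    while pos < len(s) and s[pos] in SKIPPABLE:
        pos += 1

    # Step 3: the next character must be "=" (assumption: end of
    # string here also means "return nothing").
    if pos >= len(s) or s[pos] != "=":
        return None
    pos += 1

    # Step 4: skip whitespace after the equals sign.
    while pos < len(s) and s[pos] in SKIPPABLE:
        pos += 1

    # Step 5: process the next character.
    if pos >= len(s):
        return None  # no next character
    quote = s[pos]
    if quote in "\"'":
        end = s.find(quote, pos + 1)
        if end == -1:
            return None  # unmatched quote
        return s[pos + 1:end]

    # Otherwise: read up to whitespace, U+003B (";"), or end of s.
    end = pos
    while end < len(s) and s[end] not in SKIPPABLE + ";":
        end += 1
    return s[pos:end]
```

Under these assumptions, `extract_charset_label('text/html; charset="ISO-8859-1"')` yields `ISO-8859-1`. Only the first occurrence of "charset" is ever examined, so a string such as `xcharset; charset=utf-8` yields nothing. This follows step 3 as written.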

