[giow] (2) Make <meta charset=x-user-defined> turn into windows-1252 for legacy reasons

Fixing https://www.w3.org/Bugs/Public/show_bug.cgi?id=23940
Affected topics: HTML Syntax and Parsing

git-svn-id: http://svn.whatwg.org/webapps@8618 340c8d12-0b0e-0410-8428-c7bf67bfef74
Hixie committed May 7, 2014
1 parent 2699e4a commit 202e170
Showing 3 changed files with 53 additions and 2 deletions.
19 changes: 18 additions & 1 deletion complete.html
@@ -87923,9 +87923,14 @@ <h5 id=determining-the-character-encoding><span class=secno>12.2.2.2 </span>Dete
<li><p>If <var title="">need pragma</var> is true but <var title="">got pragma</var> is
false, then jump to the step below labeled <i>next byte</i>.</li>

<!-- the next two steps are redundant with steps in the 'change the encoding' algorithm -->

<li><p>If <var title="">charset</var> is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change the value of
<var title="">charset</var> to UTF-8.</li>

<li><p>If <var title="">charset</var> is the x-user-defined encoding, change the value of
<var title="">charset</var> to Windows-1252. <a href=#refsENCODING>[ENCODING]</a></li>

<li><p>If <var title="">charset</var> is not a supported character encoding, then jump to the
step below labeled <i>next byte</i>.</li>

@@ -88133,13 +88138,20 @@ <h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Cha
failed to find a character encoding, or if it found a character encoding that was not the actual
encoding of the file.</p>

<!--CLEANUP--><!-- use <p>s -->
<ol><li>If the encoding that is already being used to interpret the input stream is <a href=#a-utf-16-encoding>a UTF-16
encoding</a>, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
<i>certain</i> and abort these steps. The new encoding is ignored; if it was anything but the
same encoding, then it would be clearly incorrect.</li>

<!-- the next two steps are redundant with similar logic in the sniffer -->
<!-- if you add anything else here, then factor it out into a common algorithm -->

<li>If the new encoding is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change it to UTF-8.</li>

<li>If the new encoding is the x-user-defined encoding, change it to Windows-1252. <a href=#refsENCODING>[ENCODING]</a></li> <!-- apparently this was a Chrome invention, later
picked up by Mozilla -->

<li>If the new encoding is identical or equivalent to the encoding that is already being used to
interpret the input stream, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to <i>certain</i> and abort these steps.
This happens when the encoding information found in the file matches what the <a href=#encoding-sniffing-algorithm>encoding
@@ -88166,7 +88178,12 @@ <h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Cha
encoding. The resource will be misinterpreted. User agents may notify the user of the situation,
to aid in application development.</li>

</ol><h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.5 </span>Preprocessing the input stream</h5>
</ol><p class=note>This algorithm is only invoked when a new encoding is found declared on a
<code><a href=#the-meta-element>meta</a></code> element.</p> <!-- this is important for the x-user-defined stuff in particular
-->


<h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.5 </span>Preprocessing the input stream</h5>

<p>The <dfn id=input-stream>input stream</dfn> consists of the characters pushed into it as the <a href=#the-input-byte-stream>input byte
stream</a> is decoded or from the various APIs that directly manipulate the input stream.</p>
19 changes: 18 additions & 1 deletion index
@@ -87923,9 +87923,14 @@ dictionary <dfn id=storageeventinit>StorageEventInit</dfn> : <a href=#eventinit>
<li><p>If <var title="">need pragma</var> is true but <var title="">got pragma</var> is
false, then jump to the step below labeled <i>next byte</i>.</li>

<!-- the next two steps are redundant with steps in the 'change the encoding' algorithm -->

<li><p>If <var title="">charset</var> is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change the value of
<var title="">charset</var> to UTF-8.</li>

<li><p>If <var title="">charset</var> is the x-user-defined encoding, change the value of
<var title="">charset</var> to Windows-1252. <a href=#refsENCODING>[ENCODING]</a></li>

<li><p>If <var title="">charset</var> is not a supported character encoding, then jump to the
step below labeled <i>next byte</i>.</li>

@@ -88133,13 +88138,20 @@ dictionary <dfn id=storageeventinit>StorageEventInit</dfn> : <a href=#eventinit>
failed to find a character encoding, or if it found a character encoding that was not the actual
encoding of the file.</p>

<!--CLEANUP--><!-- use <p>s -->
<ol><li>If the encoding that is already being used to interpret the input stream is <a href=#a-utf-16-encoding>a UTF-16
encoding</a>, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
<i>certain</i> and abort these steps. The new encoding is ignored; if it was anything but the
same encoding, then it would be clearly incorrect.</li>

<!-- the next two steps are redundant with similar logic in the sniffer -->
<!-- if you add anything else here, then factor it out into a common algorithm -->

<li>If the new encoding is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change it to UTF-8.</li>

<li>If the new encoding is the x-user-defined encoding, change it to Windows-1252. <a href=#refsENCODING>[ENCODING]</a></li> <!-- apparently this was a Chrome invention, later
picked up by Mozilla -->

<li>If the new encoding is identical or equivalent to the encoding that is already being used to
interpret the input stream, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to <i>certain</i> and abort these steps.
This happens when the encoding information found in the file matches what the <a href=#encoding-sniffing-algorithm>encoding
@@ -88166,7 +88178,12 @@ dictionary <dfn id=storageeventinit>StorageEventInit</dfn> : <a href=#eventinit>
encoding. The resource will be misinterpreted. User agents may notify the user of the situation,
to aid in application development.</li>

</ol><h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.5 </span>Preprocessing the input stream</h5>
</ol><p class=note>This algorithm is only invoked when a new encoding is found declared on a
<code><a href=#the-meta-element>meta</a></code> element.</p> <!-- this is important for the x-user-defined stuff in particular
-->


<h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.5 </span>Preprocessing the input stream</h5>

<p>The <dfn id=input-stream>input stream</dfn> consists of the characters pushed into it as the <a href=#the-input-byte-stream>input byte
stream</a> is decoded or from the various APIs that directly manipulate the input stream.</p>
17 changes: 17 additions & 0 deletions source
@@ -96732,9 +96732,14 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {
<li><p>If <var data-x="">need pragma</var> is true but <var data-x="">got pragma</var> is
false, then jump to the step below labeled <i>next byte</i>.</p></li>

<!-- the next two steps are redundant with steps in the 'change the encoding' algorithm -->

<li><p>If <var data-x="">charset</var> is <span>a UTF-16 encoding</span>, change the value of
<var data-x="">charset</var> to UTF-8.</p></li>

<li><p>If <var data-x="">charset</var> is the x-user-defined encoding, change the value of
<var data-x="">charset</var> to Windows-1252. <a href="#refsENCODING">[ENCODING]</a></p></li>

<li><p>If <var data-x="">charset</var> is not a supported character encoding, then jump to the
step below labeled <i>next byte</i>.</p></li>

@@ -96991,15 +96996,23 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {
failed to find a character encoding, or if it found a character encoding that was not the actual
encoding of the file.</p>

<!--CLEANUP--><!-- use <p>s -->
<ol>

<li>If the encoding that is already being used to interpret the input stream is <span>a UTF-16
encoding</span>, then set the <span data-x="concept-encoding-confidence">confidence</span> to
<i>certain</i> and abort these steps. The new encoding is ignored; if it was anything but the
same encoding, then it would be clearly incorrect.</li>

<!-- the next two steps are redundant with similar logic in the sniffer -->
<!-- if you add anything else here, then factor it out into a common algorithm -->

<li>If the new encoding is <span>a UTF-16 encoding</span>, change it to UTF-8.</li>

<li>If the new encoding is the x-user-defined encoding, change it to Windows-1252. <a
href="#refsENCODING">[ENCODING]</a></li> <!-- apparently this was a Chrome invention, later
picked up by Mozilla -->

<li>If the new encoding is identical or equivalent to the encoding that is already being used to
interpret the input stream, then set the <span
data-x="concept-encoding-confidence">confidence</span> to <i>certain</i> and abort these steps.
@@ -97031,6 +97044,10 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {

</ol>

<p class="note">This algorithm is only invoked when a new encoding is found declared on a
<code>meta</code> element.</p> <!-- this is important for the x-user-defined stuff in particular
-->


<h5>Preprocessing the input stream</h5>
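
The source comments in this commit note that the UTF-16 and x-user-defined overrides are duplicated between the sniffer and the "change the encoding" algorithm, and suggest factoring them out if anything more is added. A minimal Python sketch of that factoring follows. It is an illustration only, not the spec's or any browser's implementation: the names normalize_meta_encoding, ParserState and change_encoding are hypothetical, encoding names are assumed to be already-resolved lowercase strings, "identical or equivalent" is simplified to string equality, and the later steps of the algorithm that this diff does not show are left out.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: simplified stand-ins for "a UTF-16 encoding" and the
# x-user-defined encoding; real code would use the Encoding Standard's tables.
UTF16_ENCODINGS = {"utf-16le", "utf-16be"}
X_USER_DEFINED = "x-user-defined"


def normalize_meta_encoding(name: str) -> str:
    """Apply the two overrides shared by the prescan and 'change the encoding'."""
    if name in UTF16_ENCODINGS:
        return "utf-8"
    if name == X_USER_DEFINED:
        # The legacy override added by this commit.
        return "windows-1252"
    return name


@dataclass
class ParserState:
    """Hypothetical holder for the decoder's current encoding and confidence."""
    encoding: str
    confidence: str = "tentative"


def change_encoding(state: ParserState, new_encoding: str) -> Optional[str]:
    """Sketch of the first four steps of 'changing the encoding while parsing'.

    Returns None when the algorithm aborts (the current encoding is kept),
    otherwise the encoding that the remaining steps would switch to.
    """
    # Step 1: a stream already decoded as UTF-16 ignores the declaration.
    if state.encoding in UTF16_ENCODINGS:
        state.confidence = "certain"
        return None
    # Steps 2 and 3: the shared overrides.
    new_encoding = normalize_meta_encoding(new_encoding)
    # Step 4: the declaration agrees with what is already in use.
    if new_encoding == state.encoding:
        state.confidence = "certain"
        return None
    # Remaining steps (not shown in this diff) are out of scope for the sketch.
    return new_encoding
```

For example, a page currently decoded as windows-1252 that declares x-user-defined in a meta element aborts at step 4 with the confidence set to certain, because the declaration normalizes to the encoding already in use. Per the note added in this commit, the algorithm runs only for encodings declared on a meta element, so the override in this sketch is meant for that path alone.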
