
[] (0) Support BOMs in <script src=''> JS files. (credit: mp)

git-svn-id: http://svn.whatwg.org/webapps@2802 340c8d12-0b0e-0410-8428-c7bf67bfef74
1 parent 92e09d0 commit d46a38986bbe3f0594c37ea8e05e1e88c5caaef4 @Hixie Hixie committed Feb 12, 2009
Showing with 82 additions and 4 deletions.
  1. +36 −2 index
  2. +46 −2 source
38 index
@@ -5061,6 +5061,7 @@
<p>If <var title="">n</var> is 4 or more, and the first bytes of
the resource match one of the following byte sets:</p>
+ <!-- this table is present in several forms in this file; keep them in sync -->
<table><thead><tr><th>Bytes in Hexadecimal
<th>Description
<tbody><tr><td>FE FF
@@ -10288,8 +10289,40 @@ people expect to have work and what is necessary.
<p>The contents of that file, interpreted as string of
Unicode characters, are the script source.</p>
- <p>The file must be converted to Unicode using the character
- encoding given by <var><a href="#the-script-block's-character-encoding">the script block's character
+ <p>For each of the rows in the following table, starting with
+ the first one and going down, if the file has as many or more
+ bytes available than the number of bytes in the first column,
+ and the first bytes of the file match the bytes given in the
+ first column, then set <var><a href="#the-script-block's-character-encoding">the script block's character
+ encoding</a></var> to the encoding given in the cell in the second
+ column of that row, irrespective of any previous value:</p>
+
+ <!-- this table is present in several forms in this file; keep them in sync -->
+ <table><thead><tr><th>Bytes in Hexadecimal
+ <th>Encoding
+ <tbody><!-- nobody uses this
+ <tr>
+ <td>00 00 FE FF
+ <td>UTF-32BE
+ <tr>
+ <td>FF FE 00 00
+ <td>UTF-32LE
+--><tr><td>FE FF
+ <td>UTF-16BE
+ <tr><td>FF FE
+ <td>UTF-16LE
+ <tr><td>EF BB BF
+ <td>UTF-8
+<!-- nobody uses this
+ <tr>
+ <td>DD 73 66 73
+ <td>UTF-EBCDIC
+-->
+ </table><p class=note>This step looks for Unicode Byte Order Marks
+ (BOMs).</p>
+
+ <p>The file must then be converted to Unicode using the
+ character encoding given by <var><a href="#the-script-block's-character-encoding">the script block's character
encoding</a></var>.</p>
</dd>
@@ -47971,6 +48004,7 @@ interface <dfn id=messagechannel>MessageChannel</dfn> {
that row, with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
<i>certain</i>, and abort these steps:</p>
+ <!-- this table is present in several forms in this file; keep them in sync -->
<table><thead><tr><th>Bytes in Hexadecimal
<th>Encoding
<tbody><!-- nobody uses this
48 source
@@ -4749,6 +4749,7 @@
<p>If <var title="">n</var> is 4 or more, and the first bytes of
the resource match one of the following byte sets:</p>
+ <!-- this table is present in several forms in this file; keep them in sync -->
<table>
<thead>
<tr>
@@ -10831,8 +10832,50 @@ people expect to have work and what is necessary.
<p>The contents of that file, interpreted as string of
Unicode characters, are the script source.</p>
- <p>The file must be converted to Unicode using the character
- encoding given by <var>the script block's character
+ <p>For each of the rows in the following table, starting with
+ the first one and going down, if the file has as many or more
+ bytes available than the number of bytes in the first column,
+ and the first bytes of the file match the bytes given in the
+ first column, then set <var>the script block's character
+ encoding</var> to the encoding given in the cell in the second
+ column of that row, irrespective of any previous value:</p>
+
+ <!-- this table is present in several forms in this file; keep them in sync -->
+ <table>
+ <thead>
+ <tr>
+ <th>Bytes in Hexadecimal
+ <th>Encoding
+ <tbody>
+<!-- nobody uses this
+ <tr>
+ <td>00 00 FE FF
+ <td>UTF-32BE
+ <tr>
+ <td>FF FE 00 00
+ <td>UTF-32LE
+-->
+ <tr>
+ <td>FE FF
+ <td>UTF-16BE
+ <tr>
+ <td>FF FE
+ <td>UTF-16LE
+ <tr>
+ <td>EF BB BF
+ <td>UTF-8
+<!-- nobody uses this
+ <tr>
+ <td>DD 73 66 73
+ <td>UTF-EBCDIC
+-->
+ </table>
+
+ <p class="note">This step looks for Unicode Byte Order Marks
+ (BOMs).</p>
+
+ <p>The file must then be converted to Unicode using the
+ character encoding given by <var>the script block's character
encoding</var>.</p>
</dd>
@@ -54791,6 +54834,7 @@ interface <dfn>MessageChannel</dfn> {
title="concept-encoding-confidence">confidence</span>
<i>certain</i>, and abort these steps:</p>
+ <!-- this table is present in several forms in this file; keep them in sync -->
<table>
<thead>
<tr>
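
Illustration only, not part of the commit: the BOM check this change adds for external script files could be sketched in Python roughly as below. The names `BOM_TABLE` and `sniff_script_encoding` are hypothetical, the sketch assumes the whole file is already available as bytes, and it leaves out the rows the diff comments out (UTF-32BE, UTF-32LE, UTF-EBCDIC). It only picks the encoding; the diff does not say how the BOM bytes are treated during conversion, so no decoding is attempted here.

```python
# Hypothetical sketch of the BOM table from the diff, in the same row order.
# The commented-out UTF-32 and UTF-EBCDIC rows are intentionally omitted.
BOM_TABLE = [
    (b"\xFE\xFF", "UTF-16BE"),
    (b"\xFF\xFE", "UTF-16LE"),
    (b"\xEF\xBB\xBF", "UTF-8"),
]


def sniff_script_encoding(data: bytes, declared_encoding: str) -> str:
    """Return the encoding to use for a script file.

    Walks the rows top to bottom. A row applies when the file has at least
    as many bytes as the row's byte sequence and starts with that sequence;
    in that case the row's encoding replaces whatever was set before.
    """
    encoding = declared_encoding
    for bom, name in BOM_TABLE:
        if len(data) >= len(bom) and data[: len(bom)] == bom:
            encoding = name  # overrides any previous value
    return encoding
```

Under these assumptions, a file beginning with `EF BB BF` is treated as UTF-8 even if the script block declared another encoding, while a file with no BOM keeps the declared encoding.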
