[acgiowt] (2) Make invalid &#x...; character references not get converted to U+FFFD, for consistency with literal invalid characters.

git-svn-id: http://svn.whatwg.org/webapps@3374 340c8d12-0b0e-0410-8428-c7bf67bfef74
Hixie committed Jul 8, 2009
1 parent 53cf665 commit 81da9b0
Showing 2 changed files with 28 additions and 27 deletions.
29 changes: 15 additions & 14 deletions index
@@ -61677,9 +61677,10 @@ interface <dfn id=messageport>MessagePort</dfn> {
row.</p>

<table><thead><tr><th>Number <th colspan=2>Unicode character
-<tbody><tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
+<tbody><tr><td>0x00 <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
<tr><td>0x80 <td>U+20AC <td>EURO SIGN ('&euro;')
-<tr><td>0x81 <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x81 <td>U+0081 <td>&lt;control&gt;
<tr><td>0x82 <td>U+201A <td>SINGLE LOW-9 QUOTATION MARK ('&sbquo;')
<tr><td>0x83 <td>U+0192 <td>LATIN SMALL LETTER F WITH HOOK ('&fnof;')
<tr><td>0x84 <td>U+201E <td>DOUBLE LOW-9 QUOTATION MARK ('&bdquo;')
@@ -61691,10 +61692,10 @@ interface <dfn id=messageport>MessagePort</dfn> {
<tr><td>0x8A <td>U+0160 <td>LATIN CAPITAL LETTER S WITH CARON ('&Scaron;')
<tr><td>0x8B <td>U+2039 <td>SINGLE LEFT-POINTING ANGLE QUOTATION MARK ('&lsaquo;')
<tr><td>0x8C <td>U+0152 <td>LATIN CAPITAL LIGATURE OE ('&OElig;')
-<tr><td>0x8D <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x8D <td>U+008D <td>&lt;control&gt;
<tr><td>0x8E <td>U+017D <td>LATIN CAPITAL LETTER Z WITH CARON ('&#381;')
-<tr><td>0x8F <td>U+FFFD <td>REPLACEMENT CHARACTER
-<tr><td>0x90 <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x8F <td>U+008F <td>&lt;control&gt;
+<tr><td>0x90 <td>U+0090 <td>&lt;control&gt;
<tr><td>0x91 <td>U+2018 <td>LEFT SINGLE QUOTATION MARK ('&lsquo;')
<tr><td>0x92 <td>U+2019 <td>RIGHT SINGLE QUOTATION MARK ('&rsquo;')
<tr><td>0x93 <td>U+201C <td>LEFT DOUBLE QUOTATION MARK ('&ldquo;')
@@ -61707,12 +61708,16 @@ interface <dfn id=messageport>MessagePort</dfn> {
<tr><td>0x9A <td>U+0161 <td>LATIN SMALL LETTER S WITH CARON ('&scaron;')
<tr><td>0x9B <td>U+203A <td>SINGLE RIGHT-POINTING ANGLE QUOTATION MARK ('&rsaquo;')
<tr><td>0x9C <td>U+0153 <td>LATIN SMALL LIGATURE OE ('&oelig;')
-<tr><td>0x9D <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x9D <td>U+009D <td>&lt;control&gt;
<tr><td>0x9E <td>U+017E <td>LATIN SMALL LETTER Z WITH CARON ('&#382;')
<tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS ('&Yuml;')
-</table><!-- this is the same as the equivalent list in the input stream
-section, except it has 0x0000 included in the first range. --><p>Otherwise, if the number is in the range 0x0000 to 0x0008, <!--
-HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF, CR
+</table><p>Otherwise, return a character token for the Unicode character
+whose code point is that number.
+
+<!-- this is the same as the equivalent list in the input stream
+section -->
+If the number is in the range 0x0001 to 0x0008, <!-- HT, LF
+allowed --> <!-- U+000B is in the next list --> <!-- FF, CR
allowed --> 0x000E to 0x001F, <!-- ASCII allowed --> 0x007F <!--to
0x0084, (0x0085 NEL not allowed), 0x0086--> to 0x009F, 0xD800 to
0xDFFF<!-- surrogates not allowed -->, 0xFDD0 to 0xFDEF, or is one
@@ -61722,11 +61727,7 @@ interface <dfn id=messageport>MessagePort</dfn> {
0xAFFFE, 0xAFFFF, 0xBFFFE, 0xBFFFF, 0xCFFFE, 0xCFFFF, 0xDFFFE,
0xDFFFF, 0xEFFFE, 0xEFFFF, 0xFFFFE, 0xFFFFF, 0x10FFFE, or
0x10FFFF, or is higher than 0x10FFFF, then this is a <a href=#parse-error>parse
-error</a>; return a character token for the U+FFFD REPLACEMENT
-CHARACTER character instead.</p>
-
-<p>Otherwise, return a character token for the Unicode character
-whose code point is that number.</p>
+error</a>.</p>

</dd>

26 changes: 13 additions & 13 deletions source
@@ -75699,9 +75699,10 @@ interface <dfn>MessagePort</dfn> {
<thead>
<tr><th>Number <th colspan=2>Unicode character
<tbody>
+<tr><td>0x00 <td>U+FFFD <td>REPLACEMENT CHARACTER
<tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
<tr><td>0x80 <td>U+20AC <td>EURO SIGN ('&#x20AC;')
-<tr><td>0x81 <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x81 <td>U+0081 <td>&lt;control>
<tr><td>0x82 <td>U+201A <td>SINGLE LOW-9 QUOTATION MARK ('&#x201A;')
<tr><td>0x83 <td>U+0192 <td>LATIN SMALL LETTER F WITH HOOK ('&#x0192;')
<tr><td>0x84 <td>U+201E <td>DOUBLE LOW-9 QUOTATION MARK ('&#x201E;')
@@ -75713,10 +75714,10 @@ interface <dfn>MessagePort</dfn> {
<tr><td>0x8A <td>U+0160 <td>LATIN CAPITAL LETTER S WITH CARON ('&#x0160;')
<tr><td>0x8B <td>U+2039 <td>SINGLE LEFT-POINTING ANGLE QUOTATION MARK ('&#x2039;')
<tr><td>0x8C <td>U+0152 <td>LATIN CAPITAL LIGATURE OE ('&#x0152;')
-<tr><td>0x8D <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x8D <td>U+008D <td>&lt;control>
<tr><td>0x8E <td>U+017D <td>LATIN CAPITAL LETTER Z WITH CARON ('&#x017D;')
-<tr><td>0x8F <td>U+FFFD <td>REPLACEMENT CHARACTER
-<tr><td>0x90 <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x8F <td>U+008F <td>&lt;control>
+<tr><td>0x90 <td>U+0090 <td>&lt;control>
<tr><td>0x91 <td>U+2018 <td>LEFT SINGLE QUOTATION MARK ('&#x2018;')
<tr><td>0x92 <td>U+2019 <td>RIGHT SINGLE QUOTATION MARK ('&#x2019;')
<tr><td>0x93 <td>U+201C <td>LEFT DOUBLE QUOTATION MARK ('&#x201C;')
@@ -75729,15 +75730,18 @@ interface <dfn>MessagePort</dfn> {
<tr><td>0x9A <td>U+0161 <td>LATIN SMALL LETTER S WITH CARON ('&#x0161;')
<tr><td>0x9B <td>U+203A <td>SINGLE RIGHT-POINTING ANGLE QUOTATION MARK ('&#x203A;')
<tr><td>0x9C <td>U+0153 <td>LATIN SMALL LIGATURE OE ('&#x0153;')
-<tr><td>0x9D <td>U+FFFD <td>REPLACEMENT CHARACTER
+<tr><td>0x9D <td>U+009D <td>&lt;control>
<tr><td>0x9E <td>U+017E <td>LATIN SMALL LETTER Z WITH CARON ('&#x017E;')
<tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS ('&#x0178;')
</table>

+<p>Otherwise, return a character token for the Unicode character
+whose code point is that number.
+
<!-- this is the same as the equivalent list in the input stream
-section, except it has 0x0000 included in the first range. -->
-<p>Otherwise, if the number is in the range 0x0000 to 0x0008, <!--
-HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF, CR
+section -->
+If the number is in the range 0x0001 to 0x0008, <!-- HT, LF
+allowed --> <!-- U+000B is in the next list --> <!-- FF, CR
allowed --> 0x000E to 0x001F, <!-- ASCII allowed --> 0x007F <!--to
0x0084, (0x0085 NEL not allowed), 0x0086--> to 0x009F, 0xD800 to
0xDFFF<!-- surrogates not allowed -->, 0xFDD0 to 0xFDEF, or is one
@@ -75747,11 +75751,7 @@ interface <dfn>MessagePort</dfn> {
0xAFFFE, 0xAFFFF, 0xBFFFE, 0xBFFFF, 0xCFFFE, 0xCFFFF, 0xDFFFE,
0xDFFFF, 0xEFFFE, 0xEFFFF, 0xFFFFE, 0xFFFFF, 0x10FFFE, or
0x10FFFF, or is higher than 0x10FFFF, then this is a <span>parse
-error</span>; return a character token for the U+FFFD REPLACEMENT
-CHARACTER character instead.</p>
-
-<p>Otherwise, return a character token for the Unicode character
-whose code point is that number.</p>
+error</span>.</p>

</dd>
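
For readers who want to see the post-commit behaviour in executable form, here is a rough Python sketch, not the spec's algorithm verbatim. All names (`build_override_table`, `resolve_numeric_reference`, `in_flagged_ranges`) are hypothetical. Two assumptions are baked in: the table rows hidden in the collapsed hunks are taken to match Windows-1252 (as the visible rows do), and the part of the flagged list that is collapsed is taken to contain 0x000B plus the remaining code points ending in FFFE or FFFF. The diff does not say what to return for numbers above 0x10FFFF, so the sketch simply raises.

```python
def build_override_table():
    """Number -> character rows of the table as it reads after this commit."""
    table = {0x00: "\uFFFD", 0x0D: "\n"}
    for number in range(0x80, 0xA0):
        try:
            # Assumption: collapsed rows follow Windows-1252, like the visible ones.
            table[number] = bytes([number]).decode("cp1252")
        except UnicodeDecodeError:
            # 0x81, 0x8D, 0x8F, 0x90 and 0x9D have no Windows-1252 character.
            # The commit maps them to the same-numbered control character
            # instead of U+FFFD.
            table[number] = chr(number)
    return table


OVERRIDES = build_override_table()


def resolve_numeric_reference(number):
    """Return the character for a numeric character reference."""
    if number in OVERRIDES:
        return OVERRIDES[number]
    if number > 0x10FFFF:
        # Not covered by the visible diff; raising is this sketch's choice.
        raise ValueError("no Unicode character has this code point")
    # After the commit, flagged numbers are no longer replaced by U+FFFD.
    return chr(number)


def in_flagged_ranges(number):
    """True if the number falls in the list that the diff marks as a parse error."""
    if 0x0001 <= number <= 0x0008 or 0x000E <= number <= 0x001F:
        return True
    if number == 0x000B:  # assumption: listed in the collapsed part of the hunk
        return True
    if 0x007F <= number <= 0x009F:
        return True
    if 0xD800 <= number <= 0xDFFF or 0xFDD0 <= number <= 0xFDEF:
        return True
    if number > 0x10FFFF:
        return True
    # Code points ending in FFFE or FFFF in every plane (partly an assumption).
    return (number & 0xFFFF) in (0xFFFE, 0xFFFF)
```

With this sketch, `resolve_numeric_reference(0x81)` yields the control character U+0081 where the pre-commit text would have produced U+FFFD, while `in_flagged_ranges(0x81)` still reports the parse error.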

