[eciowt] (2) Be explicit about what an invalid Unicode character is.

git-svn-id: http://svn.whatwg.org/webapps@943 340c8d12-0b0e-0410-8428-c7bf67bfef74
Commit 4079b70371a064decda098d3ff05aa2262cd485b (1 parent: a38a682), committed by Hixie on Jun 22, 2007
Showing with 12 additions and 9 deletions.
  1. +7 −5 index
  2. +5 −4 source
12 index
@@ -22,7 +22,7 @@
<h1 id=html-5>HTML 5</h1>
- <h2 class="no-num no-toc" id=working>Working Draft &mdash; 21 June 2007</h2>
+ <h2 class="no-num no-toc" id=working>Working Draft &mdash; 22 June 2007</h2>
<p>You can take part in this work. <a
href="http://www.whatwg.org/mailing-list">Join the working group's
@@ -35026,12 +35026,14 @@ function receiver(e) {
<td>LATIN CAPITAL LETTER Y WITH DIAERESIS ('&#x0178')
</table>
- <p>Otherwise, if the number is not a valid Unicode character (e.g. if the
- number is higher than 1114111), or if the number is zero, then return a
- character token for the U+FFFD REPLACEMENT CHARACTER character instead.</p>
+ <p>Otherwise, if the number is zero, if the number is higher than
+ 0x10FFFF, or if it's one of the surrogate characters (characters in the
+ range 0xD800 to 0xDFFF), then this is a <a href="#parse">parse
+ error</a>; return a character token for the U+FFFD REPLACEMENT CHARACTER
+ character instead.</p>
<p>Otherwise, return a character token for the Unicode character whose
- code point is that number.
+ code point is that number.</p>
<dt>Anything else
9 source
@@ -32337,13 +32337,14 @@ function receiver(e) {
<tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS ('&#x0178')
</table>
- <p>Otherwise, if the number is not a valid Unicode character
- (e.g. if the number is higher than 1114111), or if the number is
- zero, then return a character token for the U+FFFD REPLACEMENT
+ <p>Otherwise, if the number is zero, if the number is higher than
+ 0x10FFFF, or if it's one of the surrogate characters (characters
+ in the range 0xD800 to 0xDFFF), then this is a <span>parse
+ error</span>; return a character token for the U+FFFD REPLACEMENT
CHARACTER character instead.</p>
<p>Otherwise, return a character token for the Unicode character
- whose code point is that number.
+ whose code point is that number.</p>
</dd>
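
The replacement rule introduced by the added lines above can be sketched in Python as follows. This is only an illustration of the two "Otherwise" paragraphs, not the spec's or any parser's actual implementation: the function name `resolve_numeric_reference` and the `(character, is_parse_error)` return shape are assumptions made for the example, and the preceding table lookup for the 0x80 to 0x9F range is deliberately left out.

```python
from typing import Tuple

REPLACEMENT_CHARACTER = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def resolve_numeric_reference(number: int) -> Tuple[str, bool]:
    """Map an already-parsed numeric character reference to a character.

    Hypothetical helper: returns (character, is_parse_error).
    The table of 0x80..0x9F overrides that precedes this step in the
    spec text is not handled here.
    """
    is_zero = number == 0
    is_out_of_range = number > 0x10FFFF
    is_surrogate = 0xD800 <= number <= 0xDFFF

    if is_zero or is_out_of_range or is_surrogate:
        # Parse error: emit U+FFFD instead of the requested code point.
        return REPLACEMENT_CHARACTER, True

    # Otherwise, the character whose code point is that number.
    return chr(number), False
```

Under these assumptions, `resolve_numeric_reference(0xD800)` falls into the parse-error branch, while `resolve_numeric_reference(0x41)` yields the character `A` with no error. The practical difference from the removed wording is that zero, values above 0x10FFFF, and surrogates are now listed explicitly and are flagged as parse errors rather than silently replaced.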
