-
Notifications
You must be signed in to change notification settings - Fork 11
Open
Description
Unicode code points in the range U+D800 to U+DFFF are used for encoding UTF-16 surrogate pairs, and can not occur on their own in a valid Unicode sequence. However, the \uHHHH
escape sequence can be used to construct such a an invalid Unicode sequence, e.g "\uDEAD"
.
The interpretation of invalid Unicode sequences like "\uDEAD"
is not specified in the JSON, JSON5 or ES5 (I think) specs. However, it is specified in the ES6 spec:
A code unit that is in the range 0xD800 to 0xDFFF, but is not part of a surrogate pair, is interpreted as a code point with the same value.
This also matches the behavior of the JSON5 reference implementation. I think it's best if this is also clarified in the JSON5 spec.
zamicol