Does not correctly parse surrogate pairs #42

johnezang · 2011-02-03T01:10:36Z

The following is not parsed correctly:

{ "MATHEMATICAL ITALIC CAPITAL ALPHA": "\uD835\uDEE2" }

Expected result:

{ "MATHEMATICAL ITALIC CAPITAL ALPHA": "𝛢" }

(Note: GitHub seems to have problems dealing with Unicode characters above U+FFFF, which is why this looks funky. I did my best with what I could.)

Using the following code:

NSString *json = [NSString stringWithUTF8String:"{ \"MATHEMATICAL ITALIC CAPITAL ALPHA\": \"\\uD835\\uDEE2\" }"];
id obj = [json JSONValue];
NSLog(@"stringWithObject: %@", [writer stringWithObject:obj]);

... produces the following:

stringWithObject: {"MATHEMATICAL ITALIC CAPITAL ALPHA":"훢"}

Also, the code in parseUnicodeEscape and decodeHexQuad may (to a zeroth-order approximation) have corner cases that read past the end of the array, in particular when dealing with surrogate pairs. The code that calls parseUnicodeEscape seems to have an explicit length variable, while the Unicode parsing code does not and instead relies on \0 termination. It's not clear to me whether that assumption is guaranteed to hold; it looks very suspicious to me.
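To make the concern concrete, here is a minimal sketch of a length-based decoder for a \uXXXX escape or a surrogate pair. It is not the library's code: the names hex_quad and parse_unicode_escape and their signatures are hypothetical, and the sketch assumes the caller knows how many bytes remain in the buffer. Every read is guarded by that length, so no \0 terminator is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode exactly four hex digits at p[0..3].
   Returns -1 on a non-hex character. The caller must guarantee
   that four bytes are readable. */
static int32_t hex_quad(const char *p) {
    int32_t v = 0;
    for (int i = 0; i < 4; i++) {
        char c = p[i];
        int d;
        if (c >= '0' && c <= '9') d = c - '0';
        else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
        else return -1;
        v = (v << 4) | d;
    }
    return v;
}

/* Hypothetical parser: decode "\uXXXX" or a surrogate pair
   "\uXXXX\uXXXX" at the start of buf[0..len). On success, stores the
   code point in *out and returns the number of bytes consumed (6 or 12).
   Returns 0 on malformed or truncated input. */
static size_t parse_unicode_escape(const char *buf, size_t len, uint32_t *out) {
    if (len < 6 || buf[0] != '\\' || buf[1] != 'u') return 0;
    int32_t hi = hex_quad(buf + 2);
    if (hi < 0) return 0;
    if (hi >= 0xDC00 && hi <= 0xDFFF) return 0;   /* lone low surrogate */
    if (hi < 0xD800 || hi > 0xDBFF) {             /* ordinary BMP code point */
        *out = (uint32_t)hi;
        return 6;
    }
    /* High surrogate: a low surrogate escape must follow, and the
       length check comes before any of its bytes are touched. */
    if (len < 12 || buf[6] != '\\' || buf[7] != 'u') return 0;
    int32_t lo = hex_quad(buf + 8);
    if (lo < 0xDC00 || lo > 0xDFFF) return 0;
    *out = 0x10000 + (((uint32_t)hi - 0xD800) << 10) + ((uint32_t)lo - 0xDC00);
    return 12;
}
```

For the input in this report, parse_unicode_escape("\\uD835\\uDEE2", 12, &cp) consumes 12 bytes and yields 0x1D6E2. With the same buffer truncated to 11 bytes it returns 0 without reading the missing byte.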


stig · 2011-02-03T13:36:48Z

There is a hack in -appendBytes: that appends a \0 to make sure that decodeHexQuad works. Let me stress again that it's a hack. One of these days I want to make the code completely length-based.

stig · 2011-02-13T13:06:00Z

Thanks. Having looked into this, the decoding of the code point seems to work, but my conversion from the code point to the string does not. I'll try to fix this.
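For reference, a minimal sketch of the code point to string step, assuming the target is a buffer of 16-bit UTF-16 units (which is what NSString's unichar holds). The function name encode_utf16 is hypothetical and this is not the fix that was applied. A code point above U+FFFF has to be split back into two units. Simply truncating 0x1D6E2 to 16 bits gives 0xD6E2, which looks consistent with the wrong character reported above, though that is an inference from the output, not something confirmed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
   Writes 1 or 2 units into out and returns how many were written.
   Returns 0 if cp is not a valid scalar value. */
static size_t encode_utf16(uint32_t cp, uint16_t out[2]) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    if (cp < 0x10000) {                          /* fits in a single unit */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

With cp = 0x1D6E2 this writes the two units 0xD835 and 0xDEE2, which could then be handed to something like +[NSString stringWithCharacters:length:] with a length of 2.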

This issue was closed.
