Support UTF-8 for JSON #7

nscavell · 2013-02-15T21:11:54Z

Went ahead and took a stab at supporting UTF-8 for both parsing and displaying JSON. I left the readFromString method alone (still using US-ASCII).
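For readers following along, here is a minimal sketch of why the charset used to decode the input matters. It uses only the JDK, not the DMR API; the class and method names are hypothetical and chosen purely for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the DMR code base.
final class CharsetDecodeSketch {

    // Decode raw bytes as UTF-8 text.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode the same bytes as US-ASCII. Bytes with the high bit set cannot
    // be mapped, so the JDK substitutes the replacement character U+FFFD.
    static String decodeAscii(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String original = "{\"name\":\"caf\u00e9\"}";
        byte[] json = original.getBytes(StandardCharsets.UTF_8);

        // The UTF-8 decode round-trips; the US-ASCII decode loses the accent.
        System.out.println(decodeUtf8(json).equals(original));
        System.out.println(decodeAscii(json).indexOf('\uFFFD') >= 0);
    }
}
```

The point is only that non-ASCII content survives a UTF-8 decode but not a US-ASCII one, which is the gap this PR closes for the stream-based paths.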

I looked at a couple of libraries (org.json, json-smart, Jackson, and Jettison), and they all behave slightly differently when it comes to escaping Unicode characters.

Jettison seemed to be the most accurate, and it meets the GateIn team's localization requirements. It closely follows RFC 4627 (http://www.ietf.org/rfc/rfc4627.txt) in that it only escapes the Unicode characters \u0000 through \u001F. It does not, however, escape the forward slash.
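To make the escaping behaviour concrete, here is a rough sketch of a string escaper in that style. It is not Jettison's or DMR's actual code; the class and method names are hypothetical. It assumes the RFC 4627 rule that the quotation mark, the backslash, and the control characters U+0000 through U+001F must be escaped, and it passes everything else through unchanged, including the forward slash and non-ASCII characters.

```java
// Hypothetical helper for illustration; not Jettison's or DMR's actual code.
final class JsonEscapeSketch {

    // Returns the input wrapped in double quotes, escaping only what
    // RFC 4627 requires: the quote, the backslash, and control characters.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 2);
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\b': sb.append("\\b"); break;
                case '\f': sb.append("\\f"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (ch < 0x20) {
                        // Remaining control characters get a four-digit hex escape.
                        sb.append(String.format("\\u%04x", (int) ch));
                    } else {
                        // Forward slash and non-ASCII text are emitted as-is.
                        sb.append(ch);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a/b"));
        System.out.println(escape("caf\u00e9"));
    }
}
```

Walking the string with charAt is safe here because surrogate pairs are copied through untouched; only characters below U+0020 are rewritten.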

nscavell · 2013-02-15T21:24:34Z

Sorry, I wanted it to be a separate PR since DMR-6 is an actual bug fix. I would at least like to see DMR-6 make it into the next release. Let me know if you want individual PRs for each issue.

dmlloyd · 2013-05-16T17:04:10Z

src/main/java/org/jboss/dmr/JSONParserImpl.java

-        for (int i = 1; i < length - 1; i = yyText.offsetByCodePoints(i, 1)) {
-            int ch = yyText.codePointAt(i);
+        for (int i = 1; i < length - 1; i++) {
+            char ch = yyText.charAt(i);


Why change these to not use code points?

nscavell · 2013-05-16

I don't think we need it. I believe that if you have characters outside the BMP (which is my understanding of code points, but I am a novice in this area), then they need to be represented by two Unicode characters in JSON.

nscavell · 2013-05-16

I guess it depends on the encoding in which the parser reads the content. I'm OK leaving it in if you think it solves a problem. A test to prove this code wrong would be ideal :)

dmlloyd · 2013-05-16

Since you changed it to expect input in UTF-8, I think we should assume that the text might contain any code point.
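As a small illustration of the difference between the two loops in the diff above, the sketch below walks the same string by char and by code point. It uses only JDK String methods; the class and method names are hypothetical and not taken from JSONParserImpl.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the DMR code base.
final class CodePointIterationSketch {

    // Walks the string one UTF-16 code unit at a time (the charAt style).
    static List<Integer> byChar(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            out.add((int) s.charAt(i));
        }
        return out;
    }

    // Walks the string one code point at a time (the codePointAt style).
    static List<Integer> byCodePoint(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i = s.offsetByCodePoints(i, 1)) {
            out.add(s.codePointAt(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so Java stores it as a surrogate pair.
        String s = new String(Character.toChars(0x1F600));
        System.out.println(byChar(s));      // two surrogate code units
        System.out.println(byCodePoint(s)); // one code point
    }
}
```

With UTF-8 input, a character outside the BMP can arrive unescaped, so a char-based loop sees two surrogate halves where a code-point loop sees a single value.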

dmlloyd · 2013-05-16T18:29:41Z

Looks good.

nscavell · 2013-06-03T14:44:56Z

Any update or status on this? It would be nice to include it in a release in the not-so-distant future :)

[DMR-6] JSON parser throws exception for valid JSON string

0833adb

dmlloyd reviewed May 16, 2013

[DMR-7] Support UTF-8 encoding for JSON

1693191

dmlloyd merged commit 1693191 into jbossas:master Jun 6, 2013
