Strict/Relaxed JSON syntax #36

miloyip · 2014-06-30T16:38:48Z

SlausB · 2014-07-13T20:16:07Z

Hello. Are these check-boxes to vote? Cannot click them.
Thank you.

miloyip · 2014-07-14T01:39:01Z

No, these are not for vote. This is only a GitHub markdown syntax for checklist.
I don't know if a vote is suitable on this. But it may be good to gather opinions first.

SlausB · 2014-07-14T08:26:21Z

I think it would be nice to bring into library as much functionality as possible till it reasonably affects speed/interface.
BTW, I still vote for my "parse real-valued numbers with dot but without fraction part" here: https://code.google.com/p/rapidjson/issues/detail?id=33

pah · 2014-07-15T08:45:37Z

From the parser point of view, the most tricky part seem to be the relaxed rules for Object keys (without quotes) and the NaN, inf support. Both affect the lookahead and dispatching logic and may require speculative code paths. I would suggest to postpone this feature to a later version and only add the basic infrastructure to integrate it later.

For the other enhancements, I would vote for their adoption, e.g. via a (set of) parse flags. It may lead to binary bloat to handle all individual flag combinations from within Reader::Parse directly.

kParseErrorDocumentEmpty
null sounds like a reasonable value for this.
kParseErrorDocumentRootNotObjectOrArray
See Optionally accept arbitrary root elements #21, although solution should be consistent with all relaxations.
Additional comma at the end
Should be possible to integrate fairly easily, should be added.
Comments / String delimiters / NaN,inf
May be a good idea, although an implementation should eventually delegate the parsing to an alternate parser base class to handle "invalid lookaheads".
Number formats
Same here, maybe add support to replace the number parser.

thebusytypist · 2014-07-18T10:04:44Z

I think these relaxed syntax can be categorized in two groups: a grammar one, and a lexical one.

For the grammar group, I think they can be implemented by overriding/augmenting the state transition table of strict syntax.
A typical example is "extra comma at the end of object or array"(needs overriding a transition).
Currently the non-recursive parser is using a transition table encoded in 2D array. All the overriding/augmenting transition rules may be applied just after the look up in that table.
For the recursive parser, extra rules for relaxed syntax may be added to ParseArray and ParseObject.
I suggest every rule set of relaxed syntax can be switched on/off by a macro.

For the lexical group, I think a more sophisticated tokenizer may be required.
In current implementation, the tokenizer guesses the token type by its first character.
This may not work in some cases. For example, NaN and Inf.

I am still not very clear about "Key without quotes". This may require extension on the parser as well as the tokenizer.

hdlopesrocha · 2014-07-23T23:58:01Z

It would also be very useful to represent fractions without its integer part, like .123 instead of 0.123, I have a 3D model with 2Mb, aprox 200Kb are zeros like that...

peeeeter · 2014-08-11T18:07:55Z

Along the lines of busytypist's comment, from a user perspective these features are in two categories (that are basically the same as what busytypist says), things that add to what RapidJson can do within itself - ie extra information that can be encoded such as comments and inf/nan. The other category consists of things that make RapidJson more compatible with other generators (and human generators), such as extra commas and key quoting and stuff but which don't really add to the information in the stream. As a user who uses only one Json library for all purposes, this suggests that the first category is a little bit higher priority for me.

questor · 2014-09-04T13:26:34Z

my 2cents: comments (skipping them during parsing) and extra commas are most valuable for my usecases.

adam4813 · 2014-09-10T19:03:06Z

Commands would be a neat feature!

E.g { file: @filename } Would notice the @ as a load file command value and load and parse the filename. (I currently am using a string value with that sort of syntax and check it on parse, if it finds it it loads the new file if it isn't loaded yet with a circular include check mechanism.)

pah · 2014-09-10T19:12:26Z

I would rather suggest to go for some kind of extension mechanism, to handle syntax extensions (and maybe also parser relaxations). This could possibly be done by some set of parser policy classes, which can be customized by the user, e.g. by some traits parameter.

andrusha97 · 2015-01-10T13:20:33Z

Hi. Is this issue still active? I'd like to have support of comments because I use RapidJson to parse config files.
May I help with the implementation to speed-up the process?

M1chae1 · 2015-01-15T08:52:33Z

If you just want to trim out the comments in JSON file before load into rapidjson, you can reference jsmin class used in the below open source project.

https://code.google.com/p/page-speed/source/browse/lib/trunk/src/third_party/jsmin/cpp/?r=339

Then using MemoryStream to load the minified string and parse the stream with rapidjson::Document.ParseStream to do so.

andrusha97 · 2015-01-15T11:09:04Z

@M1chae1, I wouldn't like to use this method because:

Double parsing is a kludge, and continuing using my fork of rapidjson with comments support is better than using a kludge.
The most important thing: currently, if a config is invalid, my parser reports line number and column of the error. With this minifier it will be impossible.

miloyip · 2015-01-16T15:36:36Z

@andrusha97 This issue is still active. I am now preparing for 1.0 release and these features will be added afterwards. If you would like to, you may implement your solution and I will adapt or take it as a reference.

erikfroseth · 2015-07-08T06:27:54Z

I would argue very much to keep Rapidjson as strict/standard compliant as possible by default. If a non-standard syntax should be supported, I think the user have to set some flags/options or whatever so that he/she actually have to make an effort to support the non-standard behaviour.

I have seen way too many examples of applications supporting non-standard things just for the case of simplicity. It might be nice and convenient at the time of implementation, but in the long run it will nearly always lead to problems in one way or another.

miloyip · 2015-07-09T01:41:00Z

@erikfr I agree.

kenrobbins · 2015-09-14T06:45:06Z

Excuse my ignorance, but why wouldn't NaN or Infinity (ignoring -Infinity) be parsed the same way as true or false or null? I'm asking after seeing comments above about more sophisticated tokenizers and alternate parsing. Thanks.

miloyip · 2015-09-14T07:47:23Z

@kenrobbins Those can be implemented. It is a matter of fact that the standard does not support those. If we want to support that, it may need to set a spec for this. Different users/languages may uses different strings for those values. Also, some may need to handle negative infinity, some may not, for example.

Nelvin · 2015-09-16T20:03:03Z

I'd like to add my vote for comments and trailing commas.
A solution where these would require to be activated by the developer would be great.

Many thanks for rapidjson - seems to be a really great lib (I just started to use it today).

miloyip · 2015-10-14T04:38:23Z

I merged #443 for parsing single/multiple line comments, contributed by @andrusha97 .
This does not generate SAX events, nor stored in DOM. So roundtrip is not supported now.

This adds kParseTrailingCommasFlag to allow a trailing comma at the end of maps and arrays. This is part of issue Tencent#36, adding optional support for relaxed JSON syntax.

ArcticLampyrid · 2017-06-28T11:09:41Z

I think we need to provide a flag to support JSON5
Becasue now more and more JSON are written by humans
@miloyip

miloyip added the enhancement label Jul 1, 2014

pah mentioned this issue Jul 15, 2014

Allow Reader to read only one/first JSON object #75

Closed

miloyip modified the milestone: v1.0 Beta Aug 20, 2014

pah mentioned this issue Sep 22, 2014

Improvement suggestion #153

Closed

miloyip mentioned this issue Jul 8, 2015

Will rapadijson support options to accept non-strict json? #381

Closed

sontek mentioned this issue Sep 14, 2015

Add support for kwarg allow_nan python-rapidjson/python-rapidjson#13

Closed

pah mentioned this issue Dec 16, 2015

parser error about comma #496

Closed

miloyip mentioned this issue Jan 20, 2016

NaN encoded as 2.696539702293474e308 #509

Closed

ludocode mentioned this issue Mar 20, 2016

Added optional support for trailing commas #584

Merged

miloyip mentioned this issue Mar 24, 2016

Can rapidjson parse a json which key is not a string but an int? #586

Open

miloyip mentioned this issue May 5, 2016

Parsing large Json File #624

Closed

m7thon mentioned this issue May 20, 2016

Allow options for writing and parsing NaN/Infinity #641

Merged

gergondet mentioned this issue May 17, 2017

Add kParsePartialFractionFlag #961

Open

bmuzzin mentioned this issue Jul 21, 2017

SJSON support #1024

Open

5 tasks

miloyip mentioned this issue Oct 9, 2017

Json不支持十六进制很头痛 #1076

Open

pah mentioned this issue Dec 9, 2017

string can't multiline #1132

Closed

miloyip mentioned this issue Oct 16, 2018

Support for HJSON #1377

Open

madeso mentioned this issue May 28, 2019

Add "relaxed" json support madeso/json#4

Closed

14 tasks

ranjithpp mentioned this issue Jan 14, 2020

duplicate member/key tag:question #1630

Open

miloyip mentioned this issue Feb 12, 2020

Allow escaped apostrophe in values #1639

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Strict/Relaxed JSON syntax #36

Strict/Relaxed JSON syntax #36

miloyip commented Jun 30, 2014 •

edited

Loading

SlausB commented Jul 13, 2014

miloyip commented Jul 14, 2014

SlausB commented Jul 14, 2014

pah commented Jul 15, 2014

thebusytypist commented Jul 18, 2014

hdlopesrocha commented Jul 23, 2014

peeeeter commented Aug 11, 2014

questor commented Sep 4, 2014

adam4813 commented Sep 10, 2014

pah commented Sep 10, 2014

andrusha97 commented Jan 10, 2015

M1chae1 commented Jan 15, 2015

andrusha97 commented Jan 15, 2015

miloyip commented Jan 16, 2015

erikfroseth commented Jul 8, 2015

miloyip commented Jul 9, 2015

kenrobbins commented Sep 14, 2015

miloyip commented Sep 14, 2015

Nelvin commented Sep 16, 2015

miloyip commented Oct 14, 2015

ArcticLampyrid commented Jun 28, 2017 •

edited

Loading

Strict/Relaxed JSON syntax #36

Strict/Relaxed JSON syntax #36

Comments

miloyip commented Jun 30, 2014 • edited Loading

SlausB commented Jul 13, 2014

miloyip commented Jul 14, 2014

SlausB commented Jul 14, 2014

pah commented Jul 15, 2014

thebusytypist commented Jul 18, 2014

hdlopesrocha commented Jul 23, 2014

peeeeter commented Aug 11, 2014

questor commented Sep 4, 2014

adam4813 commented Sep 10, 2014

pah commented Sep 10, 2014

andrusha97 commented Jan 10, 2015

M1chae1 commented Jan 15, 2015

andrusha97 commented Jan 15, 2015

miloyip commented Jan 16, 2015

erikfroseth commented Jul 8, 2015

miloyip commented Jul 9, 2015

kenrobbins commented Sep 14, 2015

miloyip commented Sep 14, 2015

Nelvin commented Sep 16, 2015

miloyip commented Oct 14, 2015

ArcticLampyrid commented Jun 28, 2017 • edited Loading

miloyip commented Jun 30, 2014 •

edited

Loading

ArcticLampyrid commented Jun 28, 2017 •

edited

Loading