Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support reporting errors/warnings when detecting duplicate fields #573

Open
luxas opened this issue Sep 2, 2021 · 1 comment
Open

Support reporting errors/warnings when detecting duplicate fields #573

luxas opened this issue Sep 2, 2021 · 1 comment

Comments

@luxas
Copy link

luxas commented Sep 2, 2021

Related to #570

As can be read in YAML 1.2 spec :

JSON's RFC4627 requires that mappings keys merely “SHOULD” be unique, while YAML insists they “MUST” be. Technically, YAML therefore complies with the JSON spec, choosing to treat duplicates as an error. In practice, since JSON is silent on the semantics of such duplicates, the only portable JSON files are those with unique keys, which are therefore valid YAML files.

As the recommendation of JSON specifies that fields SHOULD be unique; it would be nice to have an option to enforce this; especially for consistency between JSON and YAML parsers (e.g. both would return errors on duplicate fields consistently).

yaml.v3 errors out if there are duplicate fields. It can be very non-obvious how parsing is done if duplicate fields don't yield an error, e.g. encoding/json where duplicate fields with object values are merged, whereas all other types are replaced with the latest value (ref: golang/go#24415 (comment)):

The documentation doesn't specify what happens in this case; it says it matches object keys to fields but doesn't specify in what order it does so (it only matters with duplicate keys). Changing the current behaviour might break backwards compatibility too much, but I think at least it should be documented.

Also, the behaviour is inconsistent between a map and a slice field; the maps get merged, the slice field takes the last value seen, like the other types.

Here is that comment behavior in action: https://play.golang.org/p/CkCMdoGXc07
This was unexpected to me, and I believe most users, too.

I have an implementation of this fix, providing an opt-in method to enable errors on duplicate fields: https://github.com/luxas/json-iterator/tree/error_for_duplicates

But, I think it should be possible to write this as a json-iter extension as well; I don't know which one is better for the community in the long run (i.e. feature creep vs extensibility).

If json-iter supported "warnings", it'd also be possible to begin the transition to duplicate fields erroring by showing the user warnings (in the spirit of https://kubernetes.io/blog/2020/09/03/warnings/)

@zamicol
Copy link

zamicol commented Jul 25, 2022

Go also has an open issue for this: golang/go#48298. dsnet wrote a great summary of the problem and I'd suggest a read.

As far as implementation, this looks like it might require another bool on generalStructDecoder:

type generalStructDecoder struct {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants