New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: ambiguous fields are marshalled #17609

Open
nathanjsweet opened this Issue Oct 26, 2016 · 8 comments

Comments

Projects
None yet
6 participants
@nathanjsweet

nathanjsweet commented Oct 26, 2016

Embedded fields work, even when there are selectors that are the same:
ex: https://play.golang.org/p/fJTGL-HWEk
You get a compile time error though when you select ambiguously:
ex: https://play.golang.org/p/McjVYbnAhT
I understand that it would be hard to achieve, but I expect a compile time error from the following code, but I don't get one:
https://play.golang.org/p/2oOXH2uzWy

I understand that others might rely on this inconsistency now, but if a compile time error is achievable (I understand that this would be difficult), it will be the least offensive fix possible.

@cznic

This comment has been minimized.

Contributor

cznic commented Oct 26, 2016

I understand that it would be hard to achieve, but I expect a compile time error from the following code, ...

It's not hard, it's impossible without strong AI ;-)

@nathanjsweet

This comment has been minimized.

nathanjsweet commented Oct 26, 2016

It's not hard, it's impossible without strong AI ;-)

Fair enough. I think a runtime error would be good as well, or at least the option of enabling a runtime error (maybe with a json tag option like "nodups").

@quentinmit quentinmit changed the title from Embedded Field encoding ambiguity to encoding/json: ambiguous fields are marshalled Oct 28, 2016

@quentinmit quentinmit added this to the Go1.8Maybe milestone Oct 28, 2016

@quentinmit

This comment has been minimized.

Contributor

quentinmit commented Oct 28, 2016

I would naïvely expect json to print none of the fields, since there is effectively no field called "Foo" or "Bar" in C. But I'm not the expert here.

/cc @rsc @bradfitz

@nathanjsweet

This comment has been minimized.

nathanjsweet commented Oct 28, 2016

I would naïvely expect json to print none of the fields, since there is effectively no field called "Foo" or "Bar" in C

@quentinmit I think that ship has sailed. I think some, myself included, like that serialization/deserialization correlates with how embedded structs function in the language itself. Also, 1.0 did exactly what you just supposed, so this feature has been added in since. I highly doubt anyone is about to remove it.

The more I think about this issue the less convinced I am that it can be resolved. I suppose you could add a tag to one of the embedded structs like json:"-,nodups":

type C struct {
    *A `json:"-,nodups"`
    *B
}

However the current behavior of encoding/json supposes that a comma after an omit-dash, "-", actually includes the dash as the key name of the JSON object.

Supposing it didn't though, what would this tag even mean? That the preceding embedded struct will become named if a duplicate is detected in the parent? That's not deterministic and doesn't really solve anyone's problems. The more desirable situation is that it should error out if a duplicate is detected, but that doesn't make much sense either, because you're using the tag on an embedded struct to dictate the serialization behavior of the parent, which doesn't really make any sense (why not the other embedded struct?). You'd almost want something like this:

type C struct {
    *A
    *B
} `json:"nodups"`

Obviously the language doesn't support anything like this (nor should it).

I think the right solution is to have the linter check for this issue.

I'll keep this issue open until Monday in case anyone has anything else they would like to add.

@mpvl

This comment has been minimized.

Member

mpvl commented Oct 29, 2016

@quentinmit: Embedded fields are indeed supposed to be promoted in JSON, if there is no conflict. The use of embedding in this context is quite useful. Even more so, the JSON package allows using "-" to explicitly break ties to allow one of the conflicting fields be promoted:

The Go visibility rules for struct fields are amended for JSON when deciding which field to marshal or unmarshal. If there are multiple fields at the same level, and that level is the least nested (and would therefore be the nesting level selected by the usual Go rules), the following extra rules apply:

  1. Of those fields, if any are JSON-tagged, only tagged fields are considered, even if there are multiple untagged fields that would otherwise conflict. ...

Allowing a package to bypass the language rules may seem like a bug, but it is consistent with the way the reflect package works. In fact, the reflect package was altered slightly to support this (and for other packages) correctly and consistently across compilers. See #7363 and #7247. The reflect model is not perfect, but the best we could do without breaking the Go 1 compatibility promise. And this approach also has its merits. By now too much code depends on exactly this behavior and it could only ever be changed in Go 2, if desirable.

That said, I see the point of being able to detect inadvertent hiding of conflicting fields. The suggested nodups could work like:

type C struct {
    _ struct {
      *A
      *B
   } `json:"nodups"`
} 

But that is ugly. Another possibility I could imagine is to use a sentinel type:

type C struct {
      json.NoDups
      *A
      *B
} 

where NoDups is defined in the json package as type NoDups struct{} and signals conflicts should be errors.
Yet another possibility would be to support a NoDups bool field in Encoder and Decoder.

Having a vet addition seems more natural, except that this would then need to be extended to support xml and many other packages that suffer from the same issue. Might get messy. The only way I can see this work in a scalable way is if we factor out the very similar code for handling this in json and xml, expose the API, and then support this vetting for any package using it. That has the side benefit of standardizing serialization of structs.

@nathanjsweet

This comment has been minimized.

nathanjsweet commented Oct 31, 2016

Having a vet addition seems more natural, except that this would then need to be extended to support xml and many other packages that suffer from the same issue. Might get messy. The only way I can see this work in a scalable way is if we factor out the very similar code for handling this in json and xml, expose the API, and then support this vetting for any package using it. That has the side benefit of standardizing serialization of structs.

This sounds good. The more I think about this issue the more I think that a runtime error is inferior to the vet tool. I think the order of precedence for any language problem should be compiler, linter, runtime. Who knows what people are doing vis-a-vis embedded structs and json. I'm sure there's some crazy nonsens out there. Having a vet warning and leaving it up to the developer to be responsible with their reflection seems like the best approach to me.

I can take a crack at this if folks are interested.

@mpvl I've never contributed to go before, how would I get the ball rolling on an issue like this?

@rsc

This comment has been minimized.

Contributor

rsc commented Nov 2, 2016

Postponing decision to Go 1.9.

@rsc rsc modified the milestones: Go1.9, Go1.8Maybe Nov 2, 2016

@mpvl

This comment has been minimized.

Member

mpvl commented Nov 3, 2016

@nathanjsweet: I also agree vet would be nice. But to do it in a scalable way will be a lot of work. The opt-in, like a json.NoDups sentinel would be a lot easier.

That said, for the preferred vet scenario, the following would need to happen:

  1. define an API that json and xml (and other packages) can use to do their field promotion resolution.
  2. rewrite the json and xml packages to use this API
  3. write a vet extension that detects users of this API and if there are conflicts in structs.

Some criteria for the API is that it should allow its users to cache results. All of this would first need to be flushed out in a proposal (we're talking about extending the core API here).

If this cannot be done successfully, I don't think this should be done in vet. Authors of other packages will rightly want to have the same vet check for theirs as well.
It is a lot of work, but it is worth a shot as it will be the nicest solution. And as a nice side effect it is an opportunity to standardize on the behavior of handling embedded, anonymous structs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment