Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protoc-gen-go: generate MarshalJSON/UnmarshalJSON methods for messages #256

Open
bcmills opened this issue Nov 17, 2016 · 25 comments
Open
Labels
generator-proto-option involves generators respecting a proto option to control generated source output proposal
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Nov 17, 2016

Per #183:

There's several other parts of the proto3 JSON spec that make it impossible to use the standard encoding/json package, like WKTs.

I don't want to have generated protos grow any extra dependencies, so depending on the jsonpb package isn't appropriate.

I think json usage is pervasive enough that we should revisit this decision. Message types whose JSON encoding diverges from the standard encoding/json package should provide MarshalJSON and UnmarshalJSON methods. Then users could use the standard package without the need to explicitly interact with jsonpb.

Yes, it's possible that would add a few more dependencies for programs that don't strictly need them. To the extent that that's a problem, it's a problem that should be fixed in the compiler and linker: the marginal increase in build and link times due to the extra dependencies seems like it would be miniscule compared to the complexity of interacting with two separate and incompatible packages for JSON encoding (e.g. #248).

@awalterschulze
Copy link

Ok so what would the generated MarshalJSON method do.
Would it call jsonpb (which calls encoding/json again) or would it be actual generated code that marshals json faster than with reflect and gives packages like ffjson a run for their money?

@bcmills
Copy link
Contributor Author

bcmills commented Nov 17, 2016

@awalterschulze: This proposal is about the API, not the implementation.

Initially it would probably just call jsonpb. If we find it's worth the extra overhead to generate code (or use some other optimization technique to avoid reflection), it would be easy enough to change the implementation while preserving the same API.

@awalterschulze
Copy link

awalterschulze commented Nov 18, 2016

Ok so encoding/json is going to call the generated MarshalJSON method which is going to call jsonpb which sometimes also calls encoding/json. Some psuedo circular dependency. We hope it doesn't become a loop. Sounds like any easy bug to make.

Maybe we can list the differences between encoding/json en jsonpb? Even if its just for the record.

I know one reason was the fact that non string keys could not be encoded by encoding/json. But the go1.7 release notes now state the following:

In earlier versions of Go, this package only supported encoding and decoding maps using keys with string types. Go 1.7 adds support for maps using keys with integer types: the encoding uses a quoted decimal representation as the JSON key. Go 1.7 also adds support for encoding maps using non-string keys that implement the MarshalText (see encoding.TextMarshaler) method, as well as support for decoding maps using non-string keys that implement the UnmarshalText (see encoding.TextUnmarshaler) method. These methods are ignored for keys with string types in order to preserve the encoding and decoding used in earlier versions of Go.

Which means this problem might not be a problem anymore?

What are the other differences between encoding/json and jsonpb?

@bcmills
Copy link
Contributor Author

bcmills commented Nov 18, 2016

Maybe we can list the differences between encoding/json en jsonpb? Even if its just for the record.

Differences I'm aware of:

  • Field names. (Easy enough to fix with a proto compiler change, but it's a breaking change for current users of the json package with generated message types.)
  • The default behavior for json is to include default ("empty") values; the default behavior for jsonpb is to omit them.
  • jsonpb encodes enums by name rather than as integers.
  • jsonpb recognizes protobuf Well-Known Types and parses them accordingly.
  • Maybe some differences involving numbers encoded as strings vs. numbers encoded as JS number literals?
  • Maybe handling of NaNs and Infinity?

@awalterschulze
Copy link

Some of these things are solvable:

  • Field names: maybe the new json_name feature can be used to have something consistent here?
  • "empty": simply generate structures with omitempty?
  • enums: We can generate MarshalJSON and UnmarshalJSON methods for enums, like was done in the past?
  • WKTs: We can write MarshalJSON and UnmarshalJSON for those types.
  • numbers vs strings literals: This might be a problem?
  • NaNs, Infinity, I also don't know about this.

So what if we start by moving the jsonpb implementation to be closer to the encoding/json implementation.
Maybe we can start by generating omitempty for proto3 and then simply copying the behaviour of encoding/json for this case?
Then we could start handling well known types in a less specific way in the jsonpb package and rather use MarshalJSON and UnmarshalJSON implementations.
Then maybe we can get to the point where the proto json marshaler is simply the encoding/json Marshaler with a few custom options?
Then our generated types don't need to generate MarshalJSON and UnmarshalJSON methods?

@bcmills
Copy link
Contributor Author

bcmills commented Nov 22, 2016

Then maybe we can get to the point where the proto json marshaler is simply the encoding/json Marshaler with a few custom options?

I agree that that's a nice goal in principle, and I've retitled the proposal accordingly. I'm not sure whether it's feasible.

@bcmills bcmills changed the title proposal: generated message types should implement json.{Marshaler,Unmarshaler} proposal: generated message types should obey the proto JSON spec when serialized with encoding/json Nov 22, 2016
@awalterschulze
Copy link

Yes this title is something I can get behind.

Maybe we can start by generating omitempty for proto3 and then simply copying the behaviour of encoding/json for this case?

@mwitkow
Copy link

mwitkow commented Dec 7, 2016

Yes please, this would solve so many problems that are common: people deserializing with jsonpb and serializing with json

@mwitkow
Copy link

mwitkow commented Feb 27, 2017

Ok, so to sum up this discussion Step 1 would be:

  • general MarshalJSON and UnmarshalJSON for message structs that simply delegate to jsonpb
  • remove the json tag on the struct to not confuse people
  • add a flag to the protoc-gen-go compiler that allows you fallback to the previous behaviour (no MarshalJSON UnmarshalJSON generation json tag is there)
  • document it in a the release notes protoc-gen-go

I like this approach, because it gets us 90% there (you can use encoding/json like normal humans), there is a fallback for prior users of json, and the jsonpb deserializes the "whole tree" (without sub message caling) so we de-risk any edge cases.

Step 2 would be to consider code-generating the MarshalJSON/UnmarshalJSON functions that work in a tree-like fashion. I.e. if a message has a nested message, the relevant MarshalJSON/UnmarshalJSON of the nested message is called. This may be tricky with:

  • numbers vs strings literals: This might be a problem?
  • NaNs, Infinity, I also don't know about this.
    But I'm sure we can have helper functions in jsonpb library as a stepping stone that handles the field values inside the codegenned methods.

@zombiezen
Copy link
Contributor

Note that MarshalJSON and UnmarshalJSON would have to be opinionated on which options (fields in jsonpb.Marshaler and jsonpb.Unmarshaler) are used. It may be fine to have this as a default, but I don't think we can remove the jsonpb package.

@bcmills
Copy link
Contributor Author

bcmills commented May 19, 2017

@zombiezen True, but I think the protobuf JSON spec and the Go encoding/json package provide pretty clear defaults.

The Marshaler options:

  • EnumsAsInts is not allowed by the spec: it requires "[t]he name of the enum value as specified in proto".
  • EmitDefaults is explicitly allowed and must default to false.
  • Indent is allowed (the spec is silent about whitespace), so it should default to the same formatting produced by encoding/json.
  • OrigName is not allowed by the spec: it requires that "names are mapped to lowerCamelCase".

The Unmarshaler options:

  • AllowUnknownFields is not addressed in the spec. We should probably get some clarification on that, and/or check the published conformance-test suite.

@zombiezen
Copy link
Contributor

SGTM. Just wanted to make sure we thought about it.

@morphar
Copy link

morphar commented Aug 23, 2017

I just wanted to add a note: the json StructTag value of "-" isn't checked.
I found out about this in a new project, where I added json:"-" for a password field (just to be sure) - it didn't work.
So it would seem, there is currently no way to ignore a value with jsonpb.

@cybrcodr
Copy link
Contributor

@morphar Curious, how did you set the JSON struct tag value of "-" on a generated type field? If this is for a dynamic message type, you can implement JSONPBMarshaler/JSONPBUnmarshaler. See https://github.com/golang/protobuf/blob/master/jsonpb/jsonpb_test.go#L767.

@mkmik
Copy link

mkmik commented Sep 6, 2017

any progress on this? need help?

jgiles added a commit to jgiles/go-grpc-middleware that referenced this issue Oct 1, 2017
Previously, the call to .addReflected(j.Message) was leading to the
default Go json.Marshal output for the message struct (since pb messages
do not themselves implement json.Marshaler). This meant the various
jsonpb.Marshaler config controls were not respected - in particular,
enum values were always printed as numbers instead of strings.

This change alters the capitalization of ouput field names to the jsonbp
camel-case style. The old output was governed by the "legacy" generated
json struct tags.
- golang/protobuf#189
- golang/protobuf#256
mwitkow pushed a commit to grpc-ecosystem/go-grpc-middleware that referenced this issue Oct 20, 2017
Previously, the call to .addReflected(j.Message) was leading to the
default Go json.Marshal output for the message struct (since pb messages
do not themselves implement json.Marshaler). This meant the various
jsonpb.Marshaler config controls were not respected - in particular,
enum values were always printed as numbers instead of strings.

This change alters the capitalization of ouput field names to the jsonbp
camel-case style. The old output was governed by the "legacy" generated
json struct tags.
- golang/protobuf#189
- golang/protobuf#256
@tnarg
Copy link

tnarg commented Nov 21, 2017

In case anyone is interested, I implemented protoc-gen-go-jsonpb to generate MarshalJSON/UnmarshalJSON functions for messages.

@dsnet dsnet added the proposal label Feb 14, 2018
@dsnet dsnet changed the title proposal: generated message types should obey the proto JSON spec when serialized with encoding/json proposal: protoc-gen-go: generate MarshalJSON/UnmarshalJSON methods for messages Mar 9, 2018
@dsnet dsnet added the generator-proto-option involves generators respecting a proto option to control generated source output label Mar 10, 2018
@marcusljx
Copy link

marcusljx commented Mar 22, 2019

Would love MarshalJSON/UnmarshalJSON methods to be generated for Message types. Would make it a lot easier to deal with map/array types as well, such as the following:

message A {
  string name = 1;
  map<string, B> b_field = 2;
}

message B {
  float64 C = 1;
}

In this case, say we have a backend service that needs to serialise/deserialise only A.b_field:

func (h *Handler) storeBFields(a *mypackage.A) error {
  b := a.GetBField()
  
  jsonpb.Marshal(b)  // <--- Wrong: map[string]*mypackage.B does not implement proto.Message

}

By having MarshalJSON/UnmarshalJSON methods, it would allow users to use only stdlib encoding/json's Marshal/Unmarshal functions to unmarshal protobuf to it's exact JSON mapping, without using hacky workarounds.

@mvdan
Copy link
Member

mvdan commented Jun 7, 2019

Perhaps related, at least as far as decoding enums is concerned: #866

@mitchellh
Copy link

Whoops I missed this dup when I reported my issue (#926).

My stab is similar to @tnarg above, I didn't see this thread and googling didn't bring anything up. We've now started using this and so far has been working great: https://github.com/mitchellh/protoc-gen-go-json

It'd be great to have this sort of functionality built-in.

@mfridman
Copy link

mfridman commented Jan 8, 2020

Is this waiting on the refactor mentioned in #266 for v2 of this pkg?

Anything members of the community can help out with to move this particular set of issues through? Very much would like to see this land in the core pkg.

@dsnet dsnet changed the title proposal: protoc-gen-go: generate MarshalJSON/UnmarshalJSON methods for messages protoc-gen-go: generate MarshalJSON/UnmarshalJSON methods for messages Mar 4, 2020
@puellanivis
Copy link
Collaborator

Huh, this isn’t such a bad idea. Since we can override JSON marshaling/unmarshalling per message to call into jsonpb instead, we can avoid incidents of people confusing which one they are supposed to be using. And meanwhile, standard encoding/json does not have to support any of the unique issues of protobuf JSON encoding…

@hashamali
Copy link

Thanks @mitchellh, that package works perfectly! Would love to see this built in as well.

@jbgrunewald
Copy link

Is there a status on this? I recently came across this issue when trying to wrap a protobuf generated struct inside another struct that contained other non protobuf generate structs. Now I can't simply use json.Unmarshal/Marshal because the protobuf related struct requires special handling related to the enum type fields. I'd be super happy to have these protobuf structs implement the MarshalJSON and UnmarshalJSON functions with some sane defaults.

Please let me know if this is ready to be worked on.

@tamalsaha
Copy link

@jbgrunewald
Copy link

jbgrunewald commented Dec 5, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
generator-proto-option involves generators respecting a proto option to control generated source output proposal
Projects
None yet
Development

No branches or pull requests