Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/protojson: unmarshaling empty bytes produces non-nil #896

Open
ofpiyush opened this issue Jul 9, 2019 · 9 comments

Comments

Projects
None yet
4 participants
@ofpiyush
Copy link

commented Jul 9, 2019

What version of protobuf and what language are you using?
Version: v1.3.1
Language: Go
Libprotoc: tried in both 3.6.1 and 3.7.1

What did you do?

message BytesTest {
    bytes Data = 1;
}

Made an object with empty bytes, marshalled it, unmarshalled from the resulting bytes.

I've hosted a working example of this bug at https://github.com/ofpiyush/protobuf-bug

What did you expect to see?
Expected to see empty slice of bytes

What did you see instead?
Got nil instead

Anything else we should know about your project / environment?

This becomes especially problematic when coupled with google's well known type BytesValue.

We were hoping to use it to differentiate between nil and empty cases. Now we can't because this bug affects that too.

@neild

This comment has been minimized.

Copy link
Contributor

commented Jul 9, 2019

Proto3 doesn't have a concept of field presence, so there's no difference on the wire between &BytesTest{Data: nil} and &BytesTest{Data: []byte{}}. The two are the same.

Proto2 does have field presence, and the nil-ness of the byte slice is preserved for proto2 messages.

@neild neild closed this Jul 9, 2019

@ofpiyush

This comment has been minimized.

Copy link
Author

commented Jul 10, 2019

@neild this behaviour breaks well known wrapper type "BytesValue" where nilness and emptiness are supposed to be separate and meaningful.

Should I open a separate issue for that or can we use this one?

@dsnet

This comment has been minimized.

Copy link
Member

commented Jul 10, 2019

@neild is talking about presence on a bytes field which is what the issue was originally about. The BytesValue is a message type, for which there is a distinction between "empty" and "not populated" in both proto2 and proto3.

@ofpiyush

This comment has been minimized.

Copy link
Author

commented Jul 10, 2019

@dsnet, the last section of the orignal issue discusses how well-known type BytesValue is breaking because of this behaviour.

Moreover json Marshal of original and unmarshalled produce different values making conformance testing etc confusing.

Should I open two separate issues about json Marshal output and well-known type BytesValue or can we use this one?

@dsnet

This comment has been minimized.

Copy link
Member

commented Jul 10, 2019

the last section of the orignal issue discusses how well-known type BytesValue is breaking because of this behaviour.

Sorry I missed that.

Looking over your repro, the issue is that you assuming presence is preserved for a top-level message. Unfortunately it is not. Presence is the property of a field in a message. This is a quirk of how the protobuf type system works.

@ofpiyush

This comment has been minimized.

Copy link
Author

commented Jul 10, 2019

  • JSON Marshal and proto Marshal behaviour are different. That's made conformance testing difficult and confusing. JSON Marshal + Unmarshal reproduces []byte{}. That seemed like the correct behaviour to be honest 😅

  • Emptiness vs null information for google.protobuf.BytesValue is lost if you marshal proto -> unmarshal proto -> marshal JSON. Most likely because json Marshal gives control over to underlying json marshaller which has no concept of separating the two cases.

I've updated the repro cases to reflect that on both top level bytes type and google.protobuf.BytesValue. Pay attention to the Json output in all three cases.

@dsnet

This comment has been minimized.

Copy link
Member

commented Jul 10, 2019

JSON Marshal + Unmarshal reproduces []byte{}. That seemed like the correct behaviour to be honest

That's buggy behavior according to protobuf semantics per @neild's comment earlier, and one that exists on v2 as well.

Assigning to @cybrcodr. More clear repro:

syntax = "proto3";
import "google/protobuf/wrappers.proto";
message Foo {
    bytes f1 = 1;
    google.protobuf.BytesValue f2 = 2;
}
m1 := new(foopb.Foo)
protojson.Unmarshal([]byte(`{"f1":"", "f2":""}`), m1)
fmt.Print(m1.GetF1() == nil)            // prints false, should be true
fmt.Print(m1.GetF2() == nil)            // prints false
fmt.Print(m1.GetF2().GetValue() == nil) // prints false, should be true

m1 := new(foopb.Foo)
protojson.Unmarshal([]byte(`{"f1":null, "f2":null}`), m2)
fmt.Printf(m2.GetF1() == nil)            // prints true
fmt.Printf(m2.GetF2() == nil)            // prints true
fmt.Printf(m2.GetF2().GetValue() == nil) // prints true

@dsnet dsnet reopened this Jul 10, 2019

@dsnet dsnet changed the title Unmarshaling empty bytes produces nil encoding/protojson: unmarshaling empty bytes produces non-nil Jul 10, 2019

@dsnet dsnet added bug jsonpb labels Jul 10, 2019

@dsnet

This comment has been minimized.

Copy link
Member

commented Jul 10, 2019

I decided to fix this in the protobuf reflection implementation. Thus, it would benefit the JSON and text implementations.

@dsnet dsnet assigned dsnet and unassigned cybrcodr Jul 10, 2019

gopherbot pushed a commit to protocolbuffers/protobuf-go that referenced this issue Jul 10, 2019

internal/impl: avoid preserving presence on proto3 bytes field
When setting a proto3 bytes field to an empty, non-nil bytes slice,
just store a nil slice in the underderlying storage.
This is done to avoid presenting the illusion to the user that
presence is preserved for proto3.

Updates golang/protobuf#896

Change-Id: I1b97bedd547d336863c65d9418d8f07edf69ccd6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185577
Reviewed-by: Herbie Ong <herbie@google.com>

@dsnet dsnet added pending-v2 api-v2 and removed jsonpb labels Jul 10, 2019

@ofpiyush

This comment has been minimized.

Copy link
Author

commented Jul 11, 2019

@dsnet great!

As long as it is consistent and uniform across languages, any solution works for us :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.