Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proto3 JSON required field confusion #760

Closed
osa1 opened this issue Oct 7, 2022 · 1 comment · Fixed by #763
Closed

proto3 JSON required field confusion #760

osa1 opened this issue Oct 7, 2022 · 1 comment · Fixed by #763

Comments

@osa1
Copy link
Member

osa1 commented Oct 7, 2022

proto3 does not have required fields, so proto3 JSON format spec does not specify how required fields should be handled.

This is fine, but we allow serializing proto2 messages as proto3 JSON. So users do it and get confusing results. Example:

syntax = "proto2";

message MyMessage {
    optional MyOtherMessage my_message_field = 1;
}

message MyOtherMessage {
    required int32 field1 = 1 [default = 123];
    required int32 field2 = 2 [default = 456];
}
import '../test.pb.dart';

void main() {
  {
    final json = {'my_message_field': null};
    final myMessage = MyMessage()..mergeFromProto3Json(json);
    print(myMessage.myMessageField.field1);
    print(myMessage.toProto3Json());
  }

  {
    final json = {'my_message_field': {}}; // required fields missing
    final myMessage = MyMessage()..mergeFromProto3Json(json);
    print(myMessage.myMessageField.field1);
    print(myMessage.toProto3Json());
  }
}

Output:

123
{myMessageField: {}}
123
{myMessageField: {}}

Problems:

  1. Deser->ser changes message from {myMessageField: null} to {myMessageField: {}}
  2. Deserialization accepts missing required fields
  3. Serialization omits required fields

I think (2) and (3) are consistent with binary serialization and work as intended, but (1) is potentially a problem as depending on the implementation on the receiving end it changes a message with non-existent optional field into a message with an invalid field value.

@osa1
Copy link
Member Author

osa1 commented Oct 10, 2022

We have the same problem in other serializers as well. Repro for binary serialization:

import '../test.pb.dart';

void main() {
  final json = {'my_message_field': null};
  final myMessage = MyMessage()..mergeFromProto3Json(json);
  print(myMessage.writeToBuffer());
}

Prints [10, 0].

I think we need one of these:

  1. Initialize required fields in makeDefault
  2. Have another message initializer function that initializes required fields, use it when parsing null in proto3 JSON (makeDefaultWithRequiredFields)
  3. Check field presence when serializing a message, use default value if field is required but the message does not have the value explicitly set (this may not work in some cases as presence checks are not always correct: Incorrect field presence with null JSON values #751, Presence checks (hazzers) for optional bytes fields don't work #690)

My understanding is empty message (message with empty field set) represents the default message, not the empty message. If this is right, then we don't need to do (1) as getters already work as expected. For example, getters below work as expected:

final json = {'my_message_field': null};
final myMessage = MyMessage()..mergeFromProto3Json(json);
print(myMessage.myMessageField.field1);  // 123
print(myMessage.myMessageField.getField(1)); // 456

(Not sure if bug or not: hasField1 returns false)

I don't know if (1) will interact with some of the other features. We should be careful not to initialize required fields when parsing binary messages. I think the only case where we need to initialize required fields is when we see a null in proto3 JSON, so maybe (2) is better than (1) in that it won't cause any unintentional changes elsewhere.

@sigurdm any thoughts on this issue?

osa1 added a commit to osa1/protobuf.dart that referenced this issue Oct 11, 2022
[proto3 JSON spec][1] says this about `null` values:

> If a value is missing in the JSON-encoded data or if its value is
> null, it will be interpreted as the appropriate default value when
> parsed into a protocol buffer

Normally empty field set represents the default message, but for that
representation to work with proto2 required fields, serializers need to
check for required fields in the `BuilderInfo` and for fields that are
not in the field set serialize the default value for the field.

Currently we don't do this, which results in generating messages with
missing required fields. See google#760 for a repro.

Checking missing required fields in serializers should fix the issue,
but as a simpler fix, this PR deserializes proto3 JSON `null` values to
field sets with required fields set to defaults. This way serializers do
not need to be tweaked.

Fixes google#760

[1]: https://developers.google.com/protocol-buffers/docs/proto3#json
osa1 added a commit to osa1/protobuf.dart that referenced this issue Oct 13, 2022
[proto3 JSON spec][1] says this about `null` values:

> If a value is missing in the JSON-encoded data or if its value is
> null, it will be interpreted as the appropriate default value when
> parsed into a protocol buffer

proto3 spec does not talk about presence, but in an internal spec we
have this about proto3 JSON null values:

> When parsing, a “null” value is accepted for all field types. It will
> be treated as if the field has the default value (the field itself
> would not be set, but its getter will return the default value for
> proto3).

So we shouldn't be initializing the fields with `null` values. The
"interpreted as default value" part in the public spec is already
handled by the getters.

With this google#760 is no longer an issue as we don't generate empty message
for the field value when we have a message-field.

We will still accept proto3 JSON message encodings with missing
`required` fields, but that's by design. We don't check for `required`
fields until `GeneratedMessage.check` is called.

Closes google#760

[1]: https://developers.google.com/protocol-buffers/docs/proto3#json
@osa1 osa1 closed this as completed in #763 Oct 18, 2022
osa1 added a commit that referenced this issue Oct 18, 2022
[proto3 JSON spec][1] says this about `null` values:

> If a value is missing in the JSON-encoded data or if its value is
> null, it will be interpreted as the appropriate default value when
> parsed into a protocol buffer

proto3 spec does not talk about presence, but in an internal spec we
have this about proto3 JSON null values:

> When parsing, a “null” value is accepted for all field types. It will
> be treated as if the field has the default value (the field itself
> would not be set, but its getter will return the default value for
> proto3).

So we shouldn't be initializing the fields with `null` values. The
"interpreted as default value" part in the public spec is already
handled by the getters.

With this #760 is no longer an issue as we don't generate empty message
for the field value when we have a message-field.

We will still accept proto3 JSON message encodings with missing
`required` fields, but that's by design. We don't check for `required`
fields until `GeneratedMessage.check` is called.

Closes #760

[1]: https://developers.google.com/protocol-buffers/docs/proto3#json
osa1 added a commit that referenced this issue Oct 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant