Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two bug fixes in map field binary parsing #745

Merged
merged 4 commits into from
Sep 1, 2022
Merged

Conversation

osa1
Copy link
Member

@osa1 osa1 commented Aug 26, 2022

Map fields are encoded as repeated message fields where the message
field 1 is the key and 2 is the value.

In proto syntax, a map field like

map<int32, MyMessage> map_field = 1;

is encoded as if it was

repeated MapFieldEntry map_field = 1;

where

message MapFieldEntry {
    optional int32 key = 1;
    optional MyMessage value = 2;
}

Since map entries are ordinary messages, it's possible for a map entry
to have no key or value fields.

This PR updates map decoding to handle missing key and value fields.
Three tests added for (1) missing key field (2) missing value field (3)
missing key and value fields (entry message has 0 length prefix).

Fixes #719

This PR fixes two bugs when decoding map fields.

Map fields are encoded as repeated message fields where the message
field 1 is the key and 2 is the value.

In proto syntax, a map field like

    map<int32, MyMessage> map_field = 1;

is encoded as if it was

    repeated MapFieldEntry map_field = 1;

where

    message MapFieldEntry {
        optional int32 key = 1;
        optional MyMessage value = 2;
    }

This means we should handle these three cases:

1. A map field with length 0.

   In the example above, it's possible to see tag 1 (for `map_field`)
   with length prefix 0. Currently this case causes a null check error.

   (This is reported in google#719)

   A message encoding with length prefix 0 is the empty message. For a
   map field entry, this means default key mapped to default value. So
   this case now generates an entry `0 => MyMessage()`.

2. A map field with only key specified
3. A map field with only value specified.

   These two cases worked when the value is not a message or enum,
   becuase the library knows how to generate default values for fields
   other than messages and enums.

   (Reminder: map keys can only be integers or strings)

   To fix, `BuilderInfo.m` now takes a "default or maker" argument for
   the default value of the map values. Code generator passes the value
   when the map value is a message or enum.

Test cases added for all three cases.

Fixes google#719
@osa1 osa1 requested a review from sigurdm August 26, 2022 13:31
Copy link
Collaborator

@sigurdm sigurdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@osa1
Copy link
Member Author

osa1 commented Aug 29, 2022

I just realized that proto2 and proto3 specs give different encodings for map fields:

map<int32, MyMessage> my_messages = 1;

===>

repeated MyMessagesEntry my_messages = 1;

// According to proto2 spec
message MyMessagesEntry {
    optional int32 key = 1;
    optional MyMessage value = 2;
}

// According to proto3 spec
message MyMessagesEntry {
    int32 key = 1;
    MyMessage value = 2;
}

These two encodings are not the same, for example, if a value fields is missing, proto2 implementation might return "null" for the value, while a proto3 implementation needs to return the default value for the message (which may or may not be null).

So I'm not sure if this PR is correct. I'll try get proto team's opinion on this.

@osa1
Copy link
Member Author

osa1 commented Sep 1, 2022

I asked about this to protobuf team. The key point is {key: 0, value: {}} should be identical to {} after the parse step. So we should handle missing key and value fields as we handle missing fields in messages that are not map entries.

I will merge this PR after testing internally.

@osa1 osa1 merged commit bdd90b2 into google:master Sep 1, 2022
@osa1 osa1 deleted the fix_719 branch September 1, 2022 11:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Decoder throws exception when decoding an empty map
2 participants