Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What is the correct way in proto3 to use HasField? #1402

Closed
dhermes opened this issue Jan 20, 2016 · 15 comments
Closed

What is the correct way in proto3 to use HasField? #1402

dhermes opened this issue Jan 20, 2016 · 15 comments
Assignees
Labels
api: core api: datastore Issues related to the Datastore API. type: question Request for information or clarification. Not an issue.

Comments

@dhermes
Copy link
Contributor

dhermes commented Jan 20, 2016

Originally came up in #1329. HasField with proto3 fails on simple / singular / non-message fields.

@nathanielmanistaatgoogle Mentioned that @haberman might be able to shine some light. Josh, what should we be doing here? The current approach isn't great:

    # NOTE: As of proto3, HasField() only works for message fields, not for
    #       singular (non-message) fields. First try to use HasField and
    #       if it fails (with a ValueError) we manually consult the fields.
    try:
        return message_pb.HasField(property_name)
    except ValueError:
        all_fields = set([field.name for field in message_pb._fields])
        return property_name in all_fields

Should we bother with checking "has field", e.g.

if _has_field(key_pb.partition_id, 'dataset_id'):
    ...

or should we instead just check Truth-iness

if key_pb.partition_id.dataset_id:
    ...
# OR MORE STRICT
if key_pb.partition_id.dataset_id != '':
    ...
@dhermes dhermes added type: question Request for information or clarification. Not an issue. api: datastore Issues related to the Datastore API. api: core labels Jan 20, 2016
@haberman
Copy link

I think the question is: what are you trying to do? Why is it relevant to you whether a field is set or not and what do you intend to do with this information?

In proto3, field presence for scalar fields simply doesn't exist. Your mental model for proto3 should be that it's a C++ or Go struct. For integers and strings, there is no such thing as being set or not, it always has a value. For submessages, it's a pointer to the submessage instance which can be NULL, that's why you can test presence for it.

Please don't access msg._fields. It's a private variable with no defined semantics.

@dhermes
Copy link
Contributor Author

dhermes commented Feb 12, 2016

Thanks for the clarification (that mostly was what I assumed to be true, so that's comforting). The original reason for using msg._fields was to keep from rocking the boat on the proto2 to proto3 move.

Changing _has_field(key_pb.partition_id, 'dataset_id') to key_pb.partition_id.dataset_id != '' (and similar usage of _has_field) will be simple.

@haberman
Copy link

Yes, testing against 0 and the empty string is definitely one option. By the way, you can probably just use Python's truthiness testing for this (ie. if key_pb.partition_id.dataset_id:).

It still would help to know: what are you actually going to do with this information? If the dataset_id is non-empty, what will you do differently than when it's empty?

@dhermes
Copy link
Contributor Author

dhermes commented Feb 12, 2016

@haberman for example, when parsing a Value (stored in the datastore as a property on an Entity) a user could very well have stored boolean_value as False, double_value as 0.0, integer_value as 0, and so on.

I am trying to excise uses of _has_field where not necessary (e.g. checking if the dataset_id is set on a ParitionId) but it seems the above case still requires it (even for these simple fields).

@haberman
Copy link

Ah! See this is why I am glad I asked. The proper way of solving this problem is with a oneof. Oneofs will properly preserve presence even if the value being stored is 0, false, etc. See: https://developers.google.com/protocol-buffers/docs/proto#oneof

@dhermes
Copy link
Contributor Author

dhermes commented Feb 12, 2016

Yup. The datastore v1beta3 API uses a oneof for the case listed above, so I should be able to excise _has_field once that version is released. (The v1beta2 API is still on proto2)

@haberman
Copy link

proto2 supports oneof! In practice nearly all proto3 features (maps, oneofs, well-known types, etc) are all available in proto2 also. The main exception is JSON support.

@haberman
Copy link

Also, changing an existing protobuf to use oneof is a wire-compatible change. And in most ways it is an API-compatible change also.

@dhermes
Copy link
Contributor Author

dhermes commented Feb 12, 2016

Gotcha, v1beta2 was released some time ago so maybe it wasn't available to the authors at that time. Thanks for being so helpful here!

@dhermes
Copy link
Contributor Author

dhermes commented Feb 12, 2016

Address this in #1450.

@TheJKFever
Copy link

@haberman we're using protobuf to transition from some legacy code where an enum of values is mapped to integers including a valid enum with value of 0. In my protobuf the default value is 0 even if the field is "unset". Is there a way that I can check if this field is unset without using _fields? A related issue is the value 0 isn't even encoded to the wire even if I change the default. Any suggestions?

@haberman
Copy link

@TheJKFever Are you using proto2 or proto3 (ie. what does the "syntax=" declaration in your .proto file say?)

If you're using proto2, then you can use the HasField() method to check whether the field is set.

If you're using proto3, then there is no way to tell, sorry. This is by design; see extensive discussion on this here: protocolbuffers/protobuf#1606

@TheJKFever
Copy link

Thanks @haberman. We were using proto3, but decided to switch to proto2 for this reason. It seems like if you wanted to use proto3 and have the ability to see if someone explicitly set an int enum, you would need to make sure that your enum did not start at 0 or be okay with an implicit default of 0.

@marshallma21
Copy link

Suffering the same pain with @TheJKFever .
Can't tell whether it's a ENUM value with 0 or not receiving at all.

The funny thing is that, if you convert the proto to json string, you can actually tell whether the field exists or not. So the proto3 actually knows the existance of a Field, but it refuse to provide a way to tell that.

@bwanglzu
Copy link

bwanglzu commented May 26, 2021

maybe ListFields is better solution:

Returns a list of (FieldDescriptor, value) tuples for present fields.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: core api: datastore Issues related to the Datastore API. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

No branches or pull requests

5 participants