New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Meta: protobuf optional and required fields #379
Comments
Great write-up! As related to protobuf v3 upgrade I think we should check:
|
Re (1): tooling for proto2 is not going away anytime soon, and the only upstream tool we need is Re (2): the wire format for proto3 is pretty much identical. |
How does the following plan sound?
|
What is the benefit of having the default values in application code instead of the parser? It seems to me that the default values should be as close to the messages defintion as possible. |
I agree with this more or less.
I am afraid of shitty Protobuf implementations which do not properly encode default values. Having default values on the decoding side would be nicer IMHO. But no strong opinion. |
That's a good point. |
But we will have the defaults on the decoding side. We will use the |
Does this need QA? |
QA is part of #1266 |
Test requested via Asana. |
i'm getting confused, so i'm going to start a meta-issue to document the state of our knowledge
Related Issues
#14, #16, #352
Current Situation
Our code uses protobuf v2, where the following is true:
repeated
field can either be present or missing from the wire formatApplySettings.use_passphrase
can betrue
,false
, or unset. If unset, the setting is not touched. Otherwise it is set to the appropriate value.is None
"Problem Statement
I'm not sure ;)
One concrete problem is that mypy static typing is not powerful enough to test our error checking. When all message fields are optional, the type of that field is set to
Optional[some type]
, and every function that uses it would need to contain something likeassert field is not None
, because the check does not carry across function call boundaries. Actually doing this would crazily complicate our code.A related concrete problem is the already crazy amount of code we spend on input validation -- and we have no good way to ensure that the coverage is exhaustive.
Possible Solutions
Prefill default values for primitive types
When parsing the message, if a field is missing, substitute a base value for that type:
""
for strings,0
for integer-likes, etc. This solves a huge part of validation: type annotations for these fields are no longerOptional
and we can statically check their use.Nested messages are still
None
, so still optional, but that is desirable. And there's much less of them, so the remaining amount of explicit validation is manageable.This is essentially what proto3 suggests, see also #16.
The drawback: we can no longer check whether the field was sent.
In places where this is required, we would have to add explicit flags to the protobuf. For instance, in
ApplySettings
, we would need a new field for every option, to tell if we should set that option, or keep the current setting. This would be backwards-incompatible in most places, and maybe even require rework of some parts of the logic.IMHO this is a no-go.
Only prefill explicit default values
We could annotate fields with
[default=x]
withx
the desired default value. We can then modify the parser so that if a field is missing on wire, the default is used.At the moment, some places have a default annotation, but it is only advisory, see also #352.
This has the advantage that we can pick and choose which fields must be optional and which must not. The code that checks if a field is present can continue to work unchanged, AND we get strong checking for the other places.
Of course, this would involve a full review of all protobuf messages and their field usage.
Make use of
required
fieldsWe could start using
required
field type, and reject messages that don't contain the required fields. See also #14.This seems complementary to the explicit default values option. The usage semantic of
required
is essentially "you must always provide a value", versusdefault
's "if you don't like the default value of<x>
, you can provide a different one."Note however that
required uint32 field
andoptional uint32 field [default=99]
are effectively identical in terms of "defensiveness". A possible attacker can always choose to either send the value explicitly (to avoidrequired
check) or not send it (to triggerdefault
).The drawback is that this introduces possible backwards-compatibility problems: once a consumer has a
required
field, we can never again not send it.But functinally speaking, that is true for all fields. Many of our consumers rely on data in specific fields. We can leave them out of the wire format, but the data will be missing on the other side anyway. This might even be an argument in favor: marking a field
required
will make such consumer fail early if we ever make a change to that field.Conclusion
Given the above problem statement, proto2 is more appropriate for our usecase, and we should decide if we want to start using
default
s andrequired
fields.It is possible to enable both features in
pb2py
now and change the definitions gradually.Did I miss some related problems? Esp. in terms of what proto3 brings us?
The text was updated successfully, but these errors were encountered: