Meta: protobuf optional and required fields #379

matejcik · 2019-08-02T15:38:42Z

i'm getting confused, so i'm going to start a meta-issue to document the state of our knowledge

Related Issues

#14, #16, #352

Current Situation

Our code uses protobuf v2, where the following is true:

Each non-repeated field can either be present or missing from the wire format
Producer can choose whether to send a field or not
Consumer can know whether the message contained the field or not
We make use of this in many places; e.g., ApplySettings.use_passphrase can be true, false, or unset. If unset, the setting is not touched. Otherwise it is set to the appropriate value.
At the same time, a lot of code is spent on validation; just search for "is None".
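
To make the three-state behavior concrete, here is a minimal sketch. The `ApplySettings` dataclass and the `apply_settings` helper below are hypothetical stand-ins for illustration, not the generated message class or the real handler; the assumption is only that a missing field is represented as `None`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ApplySettings:
    # Hypothetical stand-in for the generated message class.
    # None means the field was absent from the wire.
    use_passphrase: Optional[bool] = None


def apply_settings(current: Dict[str, bool], msg: ApplySettings) -> Dict[str, bool]:
    """Return updated settings, touching only fields that were actually sent."""
    updated = dict(current)
    if msg.use_passphrase is not None:
        # Field was present: overwrite with the explicit true/false value.
        updated["use_passphrase"] = msg.use_passphrase
    return updated
```

An unset field leaves the stored value alone, while an explicit `False` overwrites it. This is the distinction that would be lost if missing fields were silently replaced by a base value.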

Problem Statement

I'm not sure ;)

One concrete problem is that mypy static typing is not powerful enough to test our error checking. When all message fields are optional, the type of that field is set to Optional[some type], and every function that uses it would need to contain something like assert field is not None, because the check does not carry across function call boundaries. Actually doing this would crazily complicate our code.
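
A small sketch of the typing problem, using a hypothetical `ExampleMessage` with a single optional field (the names are made up for illustration). The runtime check in `validate` is real, but a static checker has no way to know that it narrows the field's type for the caller.

```python
from typing import Optional


class ExampleMessage:
    # Hypothetical message: every field is Optional because it may be missing.
    def __init__(self, label: Optional[str] = None) -> None:
        self.label = label


def validate(msg: ExampleMessage) -> None:
    """Runtime check; raises if the field is missing."""
    if msg.label is None:
        raise ValueError("label is required")


def shout_label(msg: ExampleMessage) -> str:
    validate(msg)
    # The type checker still sees Optional[str] here, because the check
    # inside validate() does not carry across the call boundary.
    # Without this assert, msg.label.upper() is flagged as a possible
    # attribute access on None.
    assert msg.label is not None
    return msg.label.upper()
```

Every function that touches `msg.label` would need its own `assert`, which is the complication described above.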

A related concrete problem is the already crazy amount of code we spend on input validation -- and we have no good way to ensure that the coverage is exhaustive.

Possible Solutions

Prefill default values for primitive types

When parsing the message, if a field is missing, substitute a base value for that type: "" for strings, 0 for integer-likes, etc. This solves a huge part of validation: type annotations for these fields are no longer Optional and we can statically check their use.

Nested messages are still None, so still optional, but that is desirable. And there are far fewer of them, so the remaining amount of explicit validation is manageable.

This is essentially what proto3 suggests, see also #16.

The drawback: we can no longer check whether the field was sent.
In places where this is required, we would have to add explicit flags to the protobuf. For instance, in ApplySettings, we would need a new field for every option, to tell if we should set that option, or keep the current setting. This would be backwards-incompatible in most places, and maybe even require rework of some parts of the logic.
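
The prefill approach and its drawback can be sketched as follows. This is not how the actual parser is written; `parse_with_prefill`, `BASE_VALUES`, `SCHEMA` and the dict-based "wire" representation are all assumptions made to keep the example short.

```python
from typing import Any, Dict

# Base values for primitive types. Nested message types have no entry,
# so they fall through to None.
BASE_VALUES: Dict[type, Any] = {str: "", int: 0, bool: False, bytes: b""}


class Nested:
    """Hypothetical nested message type."""


def parse_with_prefill(schema: Dict[str, type], wire: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in a base value for every primitive field missing from the wire."""
    parsed = {}
    for name, field_type in schema.items():
        if name in wire:
            parsed[name] = wire[name]
        else:
            parsed[name] = BASE_VALUES.get(field_type)  # None for nested messages
    return parsed


SCHEMA = {"label": str, "use_passphrase": bool, "nested": Nested}

unset = parse_with_prefill(SCHEMA, {})
explicit = parse_with_prefill(SCHEMA, {"use_passphrase": False})
# Both results are identical: the consumer can no longer tell
# "not sent" apart from "sent as False".
```

The primitive fields are never `None`, so their annotations can drop `Optional`, but `unset` and `explicit` compare equal, which is exactly the lost presence information.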

IMHO this is a no-go.

Only prefill explicit default values

We could annotate fields with [default=x] with x the desired default value. We can then modify the parser so that if a field is missing on wire, the default is used.

At the moment, some places have a default annotation, but it is only advisory, see also #352.

This has the advantage that we can pick and choose which fields must be optional and which must not. The code that checks if a field is present can continue to work unchanged, AND we get strong checking for the other places.
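
A minimal sketch of that pick-and-choose behavior, assuming a hypothetical per-message table of declared defaults (`DECLARED_DEFAULTS`) and a dict-based "wire" representation. Neither exists in pb2py in this form; they only illustrate the idea.

```python
from typing import Any, Dict

# Hypothetical field table for one message: name -> declared default.
# None stands for "no [default=...] annotation on this field".
DECLARED_DEFAULTS: Dict[str, Any] = {
    "count": 99,             # as if declared with [default=99]
    "use_passphrase": None,  # no default: presence still matters
}


def parse_with_defaults(wire: Dict[str, Any]) -> Dict[str, Any]:
    """Use the declared default only where one exists; otherwise keep None."""
    return {
        name: wire.get(name, default)
        for name, default in DECLARED_DEFAULTS.items()
    }
```

`count` can be typed as a plain `int` because it always has a value, while `use_passphrase` stays `Optional[bool]` and the existing "is None" checks for it keep working.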

Of course, this would involve a full review of all protobuf messages and their field usage.

Make use of `required` fields

We could start using the required field type, and reject messages that don't contain the required fields. See also #14.

This seems complementary to the explicit default values option. The usage semantic of required is essentially "you must always provide a value", versus default's "if you don't like the default value of <x>, you can provide a different one."

Note however that required uint32 field and optional uint32 field [default=99] are effectively identical in terms of "defensiveness". A possible attacker can always choose to either send the value explicitly (to avoid required check) or not send it (to trigger default).
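
The equivalence can be illustrated with two toy parsers for a single uint32-like field called `value`. Both function names and the dict-based "wire" representation are hypothetical; the point is only that the consumer ends up with an `int` either way.

```python
from typing import Any, Dict


def parse_required(wire: Dict[str, Any]) -> int:
    """Model of `required uint32 value`: reject the message if it is missing."""
    if "value" not in wire:
        raise ValueError("missing required field: value")
    return wire["value"]


def parse_default(wire: Dict[str, Any]) -> int:
    """Model of `optional uint32 value [default=99]`: fall back to 99."""
    return wire.get("value", 99)
```

A sender who wants the consumer to see 99 can send `{"value": 99}` to the first parser or `{}` to the second. In both cases the handler receives a non-optional integer and has to validate the value itself.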

The drawback is that this introduces possible backwards-compatibility problems: once a consumer has a required field, we can never again not send it.
But functionally speaking, that is true for all fields. Many of our consumers rely on data in specific fields. We can leave them out of the wire format, but the data will be missing on the other side anyway. This might even be an argument in favor: marking a field required will make such a consumer fail early if we ever make a change to that field.

Conclusion

Given the above problem statement, proto2 is more appropriate for our use case, and we should decide if we want to start using defaults and required fields.

It is possible to enable both features in pb2py now and change the definitions gradually.

Did I miss some related problems? Esp. in terms of what proto3 brings us?


tsusanka · 2019-08-05T10:44:13Z

Great write-up! Regarding the protobuf v3 upgrade, I think we should check:

(1) if v2 will be supported in the near future (it seems so)
(2) if v3 has some performance advantages (it seems it does not)

matejcik · 2019-08-05T11:41:28Z

Re (1): tooling for proto2 is not going away anytime soon, and the only upstream tool we need is protoc -- which we can freeze at an old version if necessary.
Also we can simulate both required and default with custom field options, making the definition valid in proto3.
All encoding and decoding is done by custom code in the firmware. The only problem could arise in Connect/wallet.

Re (2): the wire format for proto3 is pretty much identical.
There is some space saving because proto3 does not send fields that have the default value, and there are "packed arrays" that are more efficient than the normal ones. I expect the saving to be negligible for us.
I expect speed to be identical too.
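
For a sense of scale, here is a sketch of the packed versus non-packed encoding of a repeated varint field, following the standard protobuf wire format. The helper names are made up for this example and are not taken from the firmware's encoder.

```python
from typing import List


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_unpacked(field_number: int, values: List[int]) -> bytes:
    """Non-packed repeated field: one key per element (wire type 0, varint)."""
    key = encode_varint((field_number << 3) | 0)
    return b"".join(key + encode_varint(v) for v in values)


def encode_packed(field_number: int, values: List[int]) -> bytes:
    """Packed repeated field: one key (wire type 2), a length, then all values."""
    key = encode_varint((field_number << 3) | 2)
    payload = b"".join(encode_varint(v) for v in values)
    return key + encode_varint(len(payload)) + payload
```

For field number 1 with values `[1, 2, 3]`, the non-packed form is 6 bytes (`08 01 08 02 08 03`) and the packed form is 5 bytes (`0a 03 01 02 03`). The difference grows with the number of elements, but it stays small for short arrays.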

prusnak · 2019-08-05T12:17:50Z

How does the following plan sound?

let's stick to proto2
revisit all the fields and use required when the field is really required
revisit all the fields, remove the default values and make sure we handle the missing value correctly in the code (in other words, the application code, not the parser, adds the default value) + also consider whether the default value fields shouldn't become required fields

tsusanka · 2019-08-05T12:23:44Z

revisit all the fields, remove the default values and make sure we handle the missing value correctly in the code (in other words, the application code, not the parser, adds the default value) + also consider whether the default value fields shouldn't become required fields

What is the benefit of having the default values in application code instead of the parser? It seems to me that the default values should be as close to the message definition as possible.

prusnak · 2019-08-05T18:11:38Z

It seems to me that the default values should be as close to the message definition as possible.

I agree with this more or less.

What is the benefit of having the default values in application code instead of the parser?

I am afraid of shitty Protobuf implementations which do not properly encode default values. Having default values on the decoding side would be nicer IMHO. But no strong opinion.

tsusanka · 2019-08-06T07:12:43Z

I am afraid of shitty Protobuf implementations which do not properly encode default values. Having default values on the decoding side would be nicer IMHO. But no strong opinion.

That's a good point.

matejcik · 2019-08-06T10:09:21Z

I am afraid of shitty Protobuf implementations which do not properly encode default values. Having default values on the decoding side would be nicer IMHO. But no strong opinion.

But we will have the defaults on the decoding side. We will use the default option to generate code that sets the default if it's not sent over the wire -- which is the behavior we want. Shitty host-side implementations can be free to fix their own code.

tsusanka · 2020-09-25T10:54:16Z

Does this need QA?

matejcik · 2020-09-25T11:30:22Z

QA is part of #1266

tsusanka · 2020-09-25T11:44:07Z

Test requested via Asana.

matejcik added protobuf Structure of messages exchanged between Trezor and the host meta labels Aug 2, 2019

matejcik mentioned this issue Aug 2, 2019

Investigate upgrading to Protobuf v3 #16

Closed

ZdenekSL added W20 code Code improvements labels Aug 22, 2019

This was referenced Aug 22, 2019

Investigate dropping required fields #14

Closed

Improve handling of default values in Protobuf #352

Closed

tsusanka added this to the backlog milestone Sep 30, 2019

tsusanka modified the milestones: backlog, 2020-07 May 25, 2020

tsusanka mentioned this issue May 25, 2020

Enable mypy in sign_tx code #1001

Closed

tsusanka assigned matejcik Jun 18, 2020

tsusanka modified the milestones: 2020-08, 2020-09 Jul 8, 2020

matejcik mentioned this issue Jul 14, 2020

bump nanopb to 0.4.3 #1105

Closed

tsusanka modified the milestones: 2020-09, 2020-10 Aug 19, 2020

matejcik mentioned this issue Sep 21, 2020

Make use of protobuf required fields for typing the Bitcoin app #1266

Merged

matejcik closed this as completed in #1266 Sep 23, 2020

tsusanka added the needs QA label Sep 25, 2020

tsusanka modified the milestones: 2020-10, 2020-11 Oct 1, 2020

tsusanka modified the milestones: 2020-11, 2020-12 Nov 6, 2020

tsusanka modified the milestones: 2020-12, 2021-01 Dec 28, 2020
