Revisit decoding parameters of associated types #912

Open
divergentdave opened this issue Jan 19, 2024 · 2 comments

@divergentdave
Contributor

From discussion with @cjpatton, here's an idea for a possible future trait change. The Vdaf and Aggregator traits require several associated types for different messages, and most of them are required to implement either Decode or ParameterizedDecode.

  • Vdaf::AggregationParam: Decode
  • Vdaf::PublicShare: ParameterizedDecode<Self>
  • Vdaf::InputShare: for<'a> ParameterizedDecode<(&'a Self, usize)>
  • Vdaf::OutputShare: for<'a> ParameterizedDecode<(&'a Self, &'a Self::AggregationParam)>
  • Vdaf::AggregateShare: for<'a> ParameterizedDecode<(&'a Self, &'a Self::AggregationParam)>
  • Aggregator::PrepareShare: ParameterizedDecode<Self::PrepareState>
  • Aggregator::PrepareMessage: ParameterizedDecode<Self::PrepareState>
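
For reference, here is a simplified sketch of the codec traits involved and of how the bounds above attach to the trait definitions. This is condensed and not the crate's exact signatures (Encode bounds, other associated types, and error details are omitted; CodecError is a stand-in):

```rust
use std::io::Cursor;

/// Stand-in for the crate's codec error type.
#[derive(Debug)]
pub struct CodecError;

/// Decoding that needs no out-of-band context.
pub trait Decode: Sized {
    fn decode(bytes: &mut Cursor<&[u8]>) -> Result<Self, CodecError>;
}

/// Decoding that needs some context `P` that is not part of the wire format.
pub trait ParameterizedDecode<P>: Sized {
    fn decode_with_param(
        decoding_parameter: &P,
        bytes: &mut Cursor<&[u8]>,
    ) -> Result<Self, CodecError>;
}

/// Condensed view of the bounds listed above.
pub trait Vdaf: Sized {
    type Measurement;
    type AggregateResult;
    type AggregationParam: Decode;
    type PublicShare: ParameterizedDecode<Self>;
    type InputShare: for<'a> ParameterizedDecode<(&'a Self, usize)>;
    type OutputShare: for<'a> ParameterizedDecode<(&'a Self, &'a Self::AggregationParam)>;
    type AggregateShare: for<'a> ParameterizedDecode<(&'a Self, &'a Self::AggregationParam)>;
}

pub trait Aggregator: Vdaf {
    type PrepareState; // no Decode/ParameterizedDecode bound today
    type PrepareShare: ParameterizedDecode<Self::PrepareState>;
    type PrepareMessage: ParameterizedDecode<Self::PrepareState>;
}
```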

The Aggregator::PrepareState associated type has no Decode/ParameterizedDecode trait bounds, but it must be deserializable one way or another in order for an aggregator that's implemented as a distributed system to use the VDAF.

Lastly, Vdaf::Measurement and Vdaf::AggregateResult have no such trait bound, and don't need to traverse network connections or be written to disk.

These trait bounds were arrived at through iteration, and notably had to be changed in order to implement Poplar1. It would be better to redesign them in a more principled way, starting from the set of information participants will have a priori when they need to decode each kind of message. This would prevent the need for future changes motivated by other VDAFs (or, alternatively, spare us from falling back to partially self-describing serialization if we decline to make trait changes for a new VDAF that needs more information).

The various existing decoding parameters include the instantiated VDAF, the aggregator's ID (as a usize, though the spec now requires that it fit in one byte), the aggregation parameter, and the prepare state. Note that the prepare state implicitly provides the current preparation round. Here's a first cut of a maximal set of decoding parameters:

| Message | Instantiated VDAF | Aggregator ID | Aggregation parameter | Prepare state or round |
| --- | --- | --- | --- | --- |
| PublicShare | | | | |
| InputShare | | | | |
| AggregationParam | | | | |
| PrepareState¹ | | | | ? |
| PrepareShare | | | | |
| PrepareMessage | | | | |
| OutputShare | | | | |
| AggregateShare | | | | |
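
To make the maximal option concrete, a bound carrying all of these parameters might be spelled roughly as follows, reusing the simplified traits from the sketch above (the u8 aggregator ID and the parameter ordering are illustrative, not a concrete proposal):

```rust
pub trait Aggregator: Vdaf {
    type PrepareState;

    // Hypothetical "kitchen sink" decoding parameter: the instantiated VDAF, the
    // aggregator ID (one byte, per the spec), the aggregation parameter, and the
    // prepare state (which implicitly carries the current round).
    type PrepareShare: for<'a> ParameterizedDecode<(
        &'a Self,
        u8,
        &'a Self::AggregationParam,
        &'a Self::PrepareState,
    )>;
}
```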

Fleshing out decoding parameters could also let us clean up an awkward data flow in Poplar1's prepare state: decoding routines for various types need to switch on the field size (inner vs. leaf), and some determine this directly from the level in the aggregation parameter, while others use an enum discriminant in the prepare state, because that's the only decoding parameter provided.
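
As a loose illustration of the two code paths described above (the types and names below are simplified stand-ins, not Poplar1's actual internals):

```rust
// Simplified stand-ins: Poplar1 uses a smaller field for inner levels of the IDPF
// tree and a larger field for the leaf level.
enum FieldSize {
    Inner,
    Leaf,
}

struct AggregationParam {
    level: usize,      // which level of the tree this aggregation targets
    num_levels: usize, // total number of levels
}

enum PrepareState {
    Inner(Vec<u8>), // placeholder payloads
    Leaf(Vec<u8>),
}

// Some decoders receive the aggregation parameter and can derive the field size
// directly from the level...
fn field_size_from_agg_param(agg_param: &AggregationParam) -> FieldSize {
    if agg_param.level + 1 == agg_param.num_levels {
        FieldSize::Leaf
    } else {
        FieldSize::Inner
    }
}

// ...while others only receive the prepare state, and must fall back on its enum
// discriminant to learn the same fact.
fn field_size_from_prep_state(state: &PrepareState) -> FieldSize {
    match state {
        PrepareState::Inner(_) => FieldSize::Inner,
        PrepareState::Leaf(_) => FieldSize::Leaf,
    }
}
```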

On the other hand, note that the VDAF spec states that prep_next() takes only the PrepState and PrepMessage as input. While aggregators will always have access to the aggregation parameter by the time they execute prep_next(), requiring it as a decoding parameter may force implementations to thread it through functions that did not previously receive it.

I think I previously argued against taking the current round as a decoding parameter when decoding a prepare state, because requiring users to store round numbers alongside prepare state blobs is more complicated than storing prepare state blobs that include their own indication of the round, if needed. One open question I have is whether we should still provide the entire prepare state when decoding prepare shares and prepare messages, instead of just the round number. It seems unlikely but possible that a multi-round VDAF could need to remember something from a previous preparation round until it deserializes a subsequent message from the other aggregator.

In some cases, implementations may reuse one type for multiple associated types. However, opportunities to do so may be limited by the deserialization trait implementations: if two messages need to be deserialized slightly differently but receive the same decoding parameters as context, then the same type can't be used for both. We could ensure implementations never overlap, and make the intent of ParameterizedDecode implementations easier to understand, by adding a zero-sized type to the decoding parameter that indicates what sort of message is being decoded (similar to the typestate pattern). For example, we could declare struct PrepareMessageToken;, and then use the trait bound type PrepareMessage: for<'a> ParameterizedDecode<(&'a Self, u8, &'a Self::AggregationParam, &'a Self::PrepareState, PrepareMessageToken)>;.
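
A minimal sketch of the marker-type idea, reusing the simplified ParameterizedDecode trait and CodecError from the earlier sketch (the concrete types here are hypothetical):

```rust
use std::io::Cursor;

// Zero-sized tokens naming the kind of message being decoded.
pub struct PrepareShareToken;
pub struct PrepareMessageToken;

// Hypothetical context bundle standing in for (&Vdaf, agg id, agg param, prep state).
pub struct DecodingContext;

// One concrete type that a VDAF implementation wants to reuse for both the
// PrepareShare and PrepareMessage associated types. Without the tokens, the two
// impls below would have identical decoding parameter types and would conflict.
pub struct SharedPrepareWire(Vec<u8>);

impl ParameterizedDecode<(DecodingContext, PrepareShareToken)> for SharedPrepareWire {
    fn decode_with_param(
        _param: &(DecodingContext, PrepareShareToken),
        _bytes: &mut Cursor<&[u8]>,
    ) -> Result<Self, CodecError> {
        // Decode using the prepare-share wire format.
        todo!()
    }
}

impl ParameterizedDecode<(DecodingContext, PrepareMessageToken)> for SharedPrepareWire {
    fn decode_with_param(
        _param: &(DecodingContext, PrepareMessageToken),
        _bytes: &mut Cursor<&[u8]>,
    ) -> Result<Self, CodecError> {
        // Decode using the prepare-message wire format.
        todo!()
    }
}
```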

Footnotes

  1. Note that this associated type isn't required to be deserializable by the trait bounds.

@cjpatton
Collaborator

Thanks for the analysis. I think this is a good thing to do.

Boiling things down a bit, there are three pieces of context we might need to decode a given message:

  1. the VDAF parameters (i.e., a &Vdaf)
  2. the step of VDAF execution (i.e., agg id or prep round number)
  3. the aggregation parameter

In redesigning the API, I suggest we consider the following requirement: the VDAF implementation shouldn't have to store any of this context; force the user (i.e., DAP) to manage context instead.

With this requirement in mind, I think our API would look something like this:

| Message | Decoding context |
| --- | --- |
| PublicShare | VDAF |
| InputShare | VDAF, agg id |
| AggregationParam | VDAF |
| PrepareState | VDAF, prep round |
| PrepareShare | VDAF, prep round |
| PrepareMessage | VDAF, prep round |
| OutputShare | VDAF, agg param |
| AggregateShare | VDAF, agg param |
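
Rough sketch of what these bounds could look like on the traits, again using the simplified traits from the earlier sketch (encoding the aggregator ID and the prep round as usize is a placeholder, not a proposal):

```rust
pub trait Vdaf: Sized {
    type AggregationParam: ParameterizedDecode<Self>;                 // VDAF
    type PublicShare: ParameterizedDecode<Self>;                      // VDAF
    type InputShare: for<'a> ParameterizedDecode<(&'a Self, usize)>;  // VDAF, agg id
    type OutputShare: for<'a> ParameterizedDecode<(&'a Self, &'a Self::AggregationParam)>;    // VDAF, agg param
    type AggregateShare: for<'a> ParameterizedDecode<(&'a Self, &'a Self::AggregationParam)>; // VDAF, agg param
}

pub trait Aggregator: Vdaf {
    type PrepareState: for<'a> ParameterizedDecode<(&'a Self, usize)>;   // VDAF, prep round
    type PrepareShare: for<'a> ParameterizedDecode<(&'a Self, usize)>;   // VDAF, prep round
    type PrepareMessage: for<'a> ParameterizedDecode<(&'a Self, usize)>; // VDAF, prep round
}
```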

@cjpatton
Collaborator

cjpatton commented Feb 2, 2024

One extra data point here: In https://github.com/trustworthyComputing/mastic I'm working on adding Prio3 to a protocol whose architecture is significantly different from DAP:

  1. "driver" sends input shares to each "server"
  2. each server responds with its prep share
  3. driver combines the prep shares and sends the prep message to each server
  4. each server decides validity

Step 3 is impossible with the current API because decoding the prep shares requires the prep state, which we don't want to reveal to the driver. To work around this, the driver can relay each server's prep share to its peer, but this results in needless communication overhead.

We might wonder if this is the optimal architecture for VDAF, but it's certainly valid, and it should be supported by our API.
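
If prep shares could instead be decoded with only the instantiated VDAF and the round number (as in the table above), the driver's side of step 3 could decode each server's prep share without ever holding prepare state. A hedged sketch, assuming the simplified traits from earlier in the thread (nothing here is the current prio API):

```rust
use std::io::Cursor;

// Hypothetical driver-side decoding for step 3: the driver receives each server's
// encoded prep share and decodes it with only (&vdaf, round) as context.
fn driver_decode_prep_shares<V>(
    vdaf: &V,
    round: usize,
    encoded_prep_shares: &[Vec<u8>],
) -> Result<Vec<V::PrepareShare>, CodecError>
where
    V: Aggregator,
    V::PrepareShare: for<'a> ParameterizedDecode<(&'a V, usize)>,
{
    encoded_prep_shares
        .iter()
        .map(|bytes| {
            let mut cursor = Cursor::new(bytes.as_slice());
            // No prepare state is involved, so nothing secret is revealed to the driver.
            V::PrepareShare::decode_with_param(&(vdaf, round), &mut cursor)
        })
        .collect()
    // The driver would then combine these into the prep message (the spec's
    // prep_shares_to_prep) and broadcast it back to the servers.
}
```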

cc/ @jimouris
