-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redefine internal Kingdom proto messages to support v2alpha public API. #39
Conversation
2a6e92b
to
3102132
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r2.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @SanjayVas and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 24 at r2 (raw file):
option java_multiple_files = true; // Protocol configuration that need only be known by Duchies.
Does the Kingdom actually need an understanding of this if it's only needed by Duchies? (Can we treat it as bytes to avoid coupling this too tightly to the public API?)
Actually, more generally -- can we reduce what the Kingdom needs to understand at the data layer?
11eb11e
to
ab80333
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 8 files reviewed, 1 unresolved discussion (waiting on @efoxepstein and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 24 at r2 (raw file):
Previously, efoxepstein (Eli Fox-Epstein) wrote…
Does the Kingdom actually need an understanding of this if it's only needed by Duchies? (Can we treat it as bytes to avoid coupling this too tightly to the public API?)
Actually, more generally -- can we reduce what the Kingdom needs to understand at the data layer?
My understanding is that this config must be the same for all Duchies participating in a given computation, but it's not something that MCs need to specify. The easiest way to do this is to have the Kingdom provide it (likely reading the values from some config file).
You're right that the actual values could be in an opaque bytes since the Kingdom doesn't do anything with the fields, but given that the Kingdom is the source for this it should also define the structure. IMO the only places where the Kingdom should persist opaque bytes is when it's acting as a pass-through, holding bytes from an external source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r3.
Reviewable status: 7 of 8 files reviewed, 1 unresolved discussion (waiting on @efoxepstein, @SanjayVas, and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 24 at r2 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
My understanding is that this config must be the same for all Duchies participating in a given computation, but it's not something that MCs need to specify. The easiest way to do this is to have the Kingdom provide it (likely reading the values from some config file).
You're right that the actual values could be in an opaque bytes since the Kingdom doesn't do anything with the fields, but given that the Kingdom is the source for this it should also define the structure. IMO the only places where the Kingdom should persist opaque bytes is when it's acting as a pass-through, holding bytes from an external source.
Would it make sense for the Kingdom to understand the structure at the public API level but not at the data layer then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 8 files reviewed, 1 unresolved discussion (waiting on @efoxepstein and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 24 at r2 (raw file):
Previously, efoxepstein (Eli Fox-Epstein) wrote…
Would it make sense for the Kingdom to understand the structure at the public API level but not at the data layer then?
It needs to understand it sufficiently at the persistence layer to make conversions between API versions, since the same persistence may need to support more than one API version (assuming they are sufficiently compatible but not necessarily binary-compatible).
I suppose we could theoretically persist one API version and convert to others, but that means you need to either keep old API definitions around for reading old rows or do a data migration. (This particular message will most likely be in a config file so migration should be easy, but I don't want to make that assumption in the proto definition).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 8 files reviewed, 1 unresolved discussion (waiting on @efoxepstein, @SanjayVas, and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 24 at r2 (raw file):
Don't we have this problem with all sorts of serialized things like this?
bytes el_gamal_public_key = 1;
and
// Serialized `MeasurementSpec` from public API.
bytes measurement_spec = 2;
I guess one philosophy is enforce a consistent API version for all parties per Measurement and then hope we never need to surface these details for old measurements in UIs.
I don't particularly care what we do here with DuchyProtocolConfig but we should have some philosophy as we're likely to run into this a few more times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 8 files reviewed, 1 unresolved discussion (waiting on @efoxepstein and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 24 at r2 (raw file):
Previously, efoxepstein (Eli Fox-Epstein) wrote…
Don't we have this problem with all sorts of serialized things like this?
bytes el_gamal_public_key = 1;
and
// Serialized `MeasurementSpec` from public API. bytes measurement_spec = 2;
I guess one philosophy is enforce a consistent API version for all parties per Measurement and then hope we never need to surface these details for old measurements in UIs.
I don't particularly care what we do here with DuchyProtocolConfig but we should have some philosophy as we're likely to run into this a few more times.
All other things begin equal, I think erring on the side of explicitly storing things is better. At the very least it helps debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @SanjayVas and @wangyaopw)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @efoxepstein, @kungfucraig, and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 24 at r2 (raw file):
Don't we have this problem with all sorts of serialized things like this?
I guess one philosophy is enforce a consistent API version for all parties per Measurement and then hope we never need to surface these details for old measurements in UIs.
Yes, in some cases the Kingdom is serving as a pass-through for pre-serialized values where the actual bytes are significant (regardless if they deserialize into an equivalent message). In those cases, we can't do much beyond checking or surfacing the API version of the serialized messages to prevent clients from deserializing incorrectly. This is why this PR includes fields for persisting API version. (We likely need to surface these in the API so that old entries can be accessed even on newer API versions).
IMO we should only persist API messages when we don't have a choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @kungfucraig and @wangyaopw)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @kungfucraig and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 27 at r4 (raw file):
// The maximum frequency to reveal in the histogram. int32 maximum_frequency = 1;
@wangyaopw I assume this is something that MCs shouldn't be free to decide, so they can't specify it in the measurement spec. Should this perhaps be in the public protocol config so that MCs can know the max frequency after the measurement is created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @kungfucraig and @SanjayVas)
src/main/proto/wfa/measurement/internal/kingdom/duchy_protocol_config.proto, line 27 at r4 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
// The maximum frequency to reveal in the histogram. int32 maximum_frequency = 1;
@wangyaopw I assume this is something that MCs shouldn't be free to decide, so they can't specify it in the measurement spec. Should this perhaps be in the public protocol config so that MCs can know the max frequency after the measurement is created?
Technically, it is OK for the MC to specify the maximum_frequency. But since it impacts the computation cost, we won't allow them to wildly choose a value. If we set a cap on this value, e.g., 15, then I guess most MCs would just use 15 for it. so it seems not necessary to make it configureable by the MC at all.
The last bucket in the final report would be for "K+", so MC would know what max frequency is used.
ab80333
to
fcca129
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @kungfucraig and @SanjayVas)
src/main/proto/wfa/measurement/internal/kingdom/computation_participant.proto, line 43 at r5 (raw file):
REQUISITION_PARAMS_SET = 2; // This `ComputationParticipant` is ready to participate.
Is this the terminal state? Are there any others we need like "REFUSED" or something?
4b77aeb
to
3c1650f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 8 files reviewed, 1 unresolved discussion (waiting on @efoxepstein)
src/main/proto/wfa/measurement/internal/kingdom/computation_participant.proto, line 43 at r5 (raw file):
Previously, efoxepstein (Eli Fox-Epstein) wrote…
Is this the terminal state? Are there any others we need like "REFUSED" or something?
Yes to both. This is just reflecting the states from the system API, so also updating in #44.
Let me know if you think we need to capture refusal info like we do for Requisitions. I assume the only real case for refusal would be signature verification failure, and most other errors would come from computation stage attempts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 9 files reviewed, 1 unresolved discussion (waiting on @efoxepstein, @kungfucraig, and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/differential_privacy.proto, line 27 at r7 (raw file):
Previously, wangyaopw (Yao Wang) wrote…
I am wondering if we should define such proto in the common repo. They are unlikely to change in the future.
I already created one in any-sketch and is referring to it inside the duchy. We could move it to the common repo and just use it everywhere..
https://github.com/world-federation-of-advertisers/any-sketch/blob/bd54716d9c76664df363c0b16337c09443074b09/src/main/proto/wfa/common/noise_parameters.proto
It's a bad idea to depend on external proto messages in either persistence or APIs. The exceptions is for stable well-known messages, such as google.protobuf.Timestamp
.
This is why we have some messages that are currently duplicated in the public API and internal persistence, since they may evolve somewhat independently (e.g. in the case where an API proto change can be implemented without a data migration).
I don't know if I have sufficiently high confidence in any of our messages not changing, so I'd rather that we can isolate them even at the cost of duplication.
Added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 9 files reviewed, 1 unresolved discussion (waiting on @efoxepstein, @kungfucraig, and @SanjayVas)
src/main/proto/wfa/measurement/internal/kingdom/differential_privacy.proto, line 27 at r7 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
It's a bad idea to depend on external proto messages in either persistence or APIs. The exceptions is for stable well-known messages, such as
google.protobuf.Timestamp
.This is why we have some messages that are currently duplicated in the public API and internal persistence, since they may evolve somewhat independently (e.g. in the case where an API proto change can be implemented without a data migration).
I don't know if I have sufficiently high confidence in any of our messages not changing, so I'd rather that we can isolate them even at the cost of duplication.
These math protos are very unlikely to change no matter what system/protocol we are building.
a44ff5b
to
ec96c81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 19 files reviewed, 1 unresolved discussion (waiting on @efoxepstein)
src/main/proto/wfa/measurement/internal/kingdom/differential_privacy.proto, line 27 at r7 (raw file):
Previously, wangyaopw (Yao Wang) wrote…
These math protos are very unlikely to change no matter what system/protocol we are building.
@efoxepstein Seeking second opinion. Do you think this would be stable enough to consider it a common type? This implies that we would put in in a common type
package and share it among internal and API services as described in https://google.aip.dev/215.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r7, 12 of 12 files at r8.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/differential_privacy.proto, line 27 at r7 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
@efoxepstein Seeking second opinion. Do you think this would be stable enough to consider it a common type? This implies that we would put in in a common
type
package and share it among internal and API services as described in https://google.aip.dev/215.
I think the risk here is pretty low.
ec96c81
to
7ca9ece
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 17 of 19 files reviewed, 1 unresolved discussion (waiting on @efoxepstein and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/differential_privacy.proto, line 27 at r7 (raw file):
Previously, efoxepstein (Eli Fox-Epstein) wrote…
I think the risk here is pretty low.
So I believe the following repos would depend on this common type: any-sketch, cmm, and cmm-api. cmm depends on both cmm-api and any-sketch.
Any thoughts on where we should put this? Should it be a new repo for common protobuf types? Package halo.type
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 17 of 19 files reviewed, 1 unresolved discussion (waiting on @efoxepstein and @wangyaopw)
src/main/proto/wfa/measurement/internal/kingdom/differential_privacy.proto, line 27 at r7 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
So I believe the following repos would depend on this common type: any-sketch, cmm, and cmm-api. cmm depends on both cmm-api and any-sketch.
Any thoughts on where we should put this? Should it be a new repo for common protobuf types? Package
halo.type
?
From offline discussion, we've agreed to keep these separate out of caution since proto converters aren't too difficult to write/maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 17 of 19 files reviewed, all discussions resolved (waiting on @efoxepstein)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r9.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @SanjayVas)
7ca9ece
to
b7d354f
Compare
b7d354f
to
044c22a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 19 of 25 files reviewed, 3 unresolved discussions (waiting on @corbantek, @oliver-amzn, @SanjayVas, and @zachcwc)
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 57 at r10 (raw file):
// specified certificate or an encrypted value cannot be decrypted using // the specified key. CONSENT_SIGNAL_INVALID = 1;
what would a duplicate hash/salt/requisition be considered?
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 93 at r10 (raw file):
message Details { bytes data_provider_public_key = 1;
is this the data provider's encryption key?
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 94 at r10 (raw file):
message Details { bytes data_provider_public_key = 1; bytes data_provider_public_key_signature = 2;
Is this the public signing key for this particular requisition signed with the root certificate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 19 of 25 files reviewed, 3 unresolved discussions (waiting on @corbantek, @oliver-amzn, @stevenwarejones, and @zachcwc)
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 57 at r10 (raw file):
Previously, stevenwarejones (Steven Ware Jones) wrote…
what would a duplicate hash/salt/requisition be considered?
I think that fits under this case, since every measurement should have a unique key.
Note that this is currently just a copy of the enum from the v2alpha public API.
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 93 at r10 (raw file):
Previously, stevenwarejones (Steven Ware Jones) wrote…
is this the data provider's encryption key?
Yes. It's to persist that data from the Requisition
API resource.
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 94 at r10 (raw file):
Previously, stevenwarejones (Steven Ware Jones) wrote…
Is this the public signing key for this particular requisition signed with the root certificate?
This is the signature of the above public key field, which can be verified using the certificate indicated by the external_data_provider_certificate_id
field.
I can add additional documentation for this, but it's just for persisting the equivalent data from the public API.
044c22a
to
640ecdd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 25 files reviewed, 3 unresolved discussions (waiting on @corbantek, @efoxepstein, @oliver-amzn, @SanjayVas, @stevenwarejones, and @zachcwc)
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 93 at r10 (raw file):
Previously, SanjayVas (Sanjay Vasandani) wrote…
Yes. It's to persist that data from the
Requisition
API resource.
what do you think of naming it to data_provider_public_encryption_key? So, its not confused with a signing key.
640ecdd
to
bb1a0d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 17 of 25 files reviewed, 3 unresolved discussions (waiting on @corbantek, @efoxepstein, @oliver-amzn, @stevenwarejones, and @zachcwc)
src/main/proto/wfa/measurement/internal/kingdom/requisition.proto, line 93 at r10 (raw file):
Previously, stevenwarejones (Steven Ware Jones) wrote…
what do you think of naming it to data_provider_public_encryption_key? So, its not confused with a signing key.
I updated the comment to indicate that this is a serialized EncryptionPublicKey
message, which not only makes the purpose clear but also indicates that it includes the key type.
As an aside, signing keys aren't actually ever persisted/exposed by the system on their own. Those are always part of an X.509 certificate, which would be in a field that has certificate
somewhere in the name. Therefore, any public key field in the system is an encryption key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r10, 2 of 2 files at r11.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @corbantek, @oliver-amzn, @stevenwarejones, and @zachcwc)
This is for #52.
This change is