Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove non-constant-time comparisons of secret values. #734

Merged
merged 7 commits into from
Sep 8, 2023

Conversation

branlwyd
Copy link
Member

@branlwyd branlwyd commented Sep 6, 2023

This affects the types representing:

  • Input shares
  • Preparation states
  • Output shares
  • Aggregate shares

(and their constituent types)

Mostly, the comparisons were either dropped entirely or updated to be test-only. Input shares were instead given a constant-time equality implementation, as it is believed this is required for DAP implementations.

PartialEq & Eq implementations are still derived when the "test-util" feature is enabled. This is to ease testing for users of this library.

Closes #722.

@branlwyd
Copy link
Member Author

branlwyd commented Sep 6, 2023

A few comparisons in constant-time equality implementations were left non-constant-time; only the content of secret values was considered as required for constant-time comparisons. Specifically:

  • The Share type's ConstantTimeEq implementation will short-circuit on the type of the share (Leader vs Helper); this is thought to be OK since the content of the secrets inside is still compared in constant time.
  • The Prio3InputShare type's ConstantTimeEq implementation will short-circuit on the presence/absence of joint_rand_blind; since this is a property of the type of VDAF in use, it is not considered a secret. The content (if any) of joint_rand_blind is compared in constant time, as well as all other fields.
  • The Poplar1InputShare type's ConstantTimeEq implementation will short-circuit on the length of corr_inner; since only the content should be a secret, this should be OK. All field contents are otherwise verified in constant time.

Please verify these assertions as part of review. Please also verify that no other types should have their non-constant-time comparisons dropped. I have modified Janus-at-main to use this change successfully, so I am confident that at least Janus can work with these traits.

@branlwyd branlwyd force-pushed the bran/constant-time branch 3 times, most recently from 55ba66b to d74428f Compare September 7, 2023 00:29
This affects the types representing:
  * Input shares
  * Preparation states
  * Output shares
  * Aggregate shares

Mostly, the comparisons were either dropped entirely or updated to be
test-only.  Input shares were instead given a constant-time equality
implementation, as it is believed this is required for DAP
implementations.

PartialEq & Eq implementations are still derived when the "test-util"
feature is enabled. This is to ease testing for users of this library.
@branlwyd branlwyd marked this pull request as ready for review September 7, 2023 00:47
@branlwyd branlwyd requested a review from a team as a code owner September 7, 2023 00:47
@cjpatton
Copy link
Collaborator

cjpatton commented Sep 7, 2023

@branlwyd that seems like the right definition. Basically what our goal is to make runtime depend as little as possible on secrets. For other things the side channel is not important. We don't much care if the attacker knows that the party doing the computation is the Leader or Helper, for example.

Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementations of ConstantTimeEq look good. I'm a bit concerned about API safety, see inline comment.

src/vdaf/poplar1.rs Outdated Show resolved Hide resolved
src/vdaf.rs Outdated
#[derive(Clone, Debug, Eq, PartialEq)]
#[derive(Clone, Debug)]
// Only derive equality checks in test code, as the content of this type is a secret.
#[cfg_attr(feature = "test-util", derive(PartialEq, Eq))]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned that this API might be too permissive. The intent is to fence the unsafe comparison to testing, but IIUC, the user can still shoot themself in the foot if they're not careful. Let's say I enable "test-util" in my crate that I also deploy to production; then I get to use Eq in my production code, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way we could resolve this is to write ConstantTimeEq implementations for the relevant structs/enums, and write PartialEq implementations that dispatch to them. (for an example of this pattern, see IdpfPublicShare) This would provide safe APIs for both downstream dependents and for test code, avoid the need for the test-util Cargo feature, and still give us the convenience benefit of being able to use assert_eq!().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable to me.

Copy link
Member Author

@branlwyd branlwyd Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concern that led to this implementation was (from #722):

But the risk there is that new VDAFs will use derived PartialEq implementations that aren't constant time. So perhaps we would want to remove all PartialEq + Eq trait bounds and instead define some custom ConstantTimeEq trait.

That is, if we expose PartialEq + Eq by default (and expect users of the library to use these traits for comparison), we risk future VDAFs implementing variable-time versions of these traits. This would lead to any DAP implementations built on these VDAFs silently becoming insecure, in a way that would be particularly hard to notice.

OTOH, I like the idea of safe-by-default implementations and especially not requiring callers to know about subtle at all to use the library. I have moved everything relevant over to a constant-time implmementation of PartialEq + Eq.

This required implementing a few more constant-time equality checks than I otherwise would have had to, since a few types had test-only equality derivations before.

Let's say I enable "test-util" in my crate that I also deploy to production; then I get to use Eq in my production code, right?

Correct; if you enable a feature called "test-util" in your production binaries, IMO you are responsible for what comes next. :) This is a somewhat-common pattern for libraries -- for example, subtle's documentation contains the following: "Note: the subtle crate contains debug_asserts to check invariants during debug builds. These invariant checks involve secret-dependent branches, and are not present when compiled in release mode. This crate is intended to be used in release mode."

src/vdaf/prio3.rs Outdated Show resolved Hide resolved
src/vdaf/prio3.rs Outdated Show resolved Hide resolved
src/vdaf.rs Outdated
#[derive(Clone, Debug, Eq, PartialEq)]
#[derive(Clone, Debug)]
// Only derive equality checks in test code, as the content of this type is a secret.
#[cfg_attr(feature = "test-util", derive(PartialEq, Eq))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way we could resolve this is to write ConstantTimeEq implementations for the relevant structs/enums, and write PartialEq implementations that dispatch to them. (for an example of this pattern, see IdpfPublicShare) This would provide safe APIs for both downstream dependents and for test code, avoid the need for the test-util Cargo feature, and still give us the convenience benefit of being able to use assert_eq!().

let mut rslt = self.idpf_key.ct_eq(&other.idpf_key)
& self.corr_seed.ct_eq(&other.corr_seed)
& self.corr_leaf.ct_eq(&other.corr_leaf);
for (x, y) in self.corr_inner.iter().zip(other.corr_inner.iter()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid manually checking lengths and iterating by using the blanket impl<T: ConstantTimeEq> ConstantTimeEq for [T]. Just as the code here does, it short-circuits on slice length. Then, this method body could be simplified to a single four-way & expression.

Copy link
Member Author

@branlwyd branlwyd Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this, unfortunately it doesn't work with an error like:

error[E0599]: the method `ct_eq` exists for struct `Vec<[Field64; 2]>`, but its trait bounds were not satisfied
   --> src/vdaf/poplar1.rs:159:31
    |
159 |             & self.corr_inner.ct_eq(&other.corr_inner)
    |                               ^^^^^ method cannot be called on `Vec<[Field64; 2]>` due to unsatisfied trait bounds
    |
    = note: the following trait bounds were not satisfied:
            `[Field64; 2]: ConstantTimeEq`
            which is required by `[[Field64; 2]]: ConstantTimeEq`

I think the reason [Field64; 2] doesn't satisfy ConstantTimeEq is that it can't be automatically converted to a [Field64]. subtle's const-generics feature seems like it should help -- but it actually only provides a blanket implementation of ConditionallySelectable for [T; N], no ConstantTimeEq implementation. I don't see why one couldn't be written, but this would require waiting on a release of the subtle crate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed dalek-cryptography/subtle#114, we'll see if it lands. The last release was 6 months ago so we may be waiting awhile in any case.

@branlwyd
Copy link
Member Author

branlwyd commented Sep 7, 2023

I added tests for all of the manually-implemented equality traits. They test PartialEq, but since all of the relevant PartialEq implementations fall through to ConstantTimeEq, this is sufficient to cover both traits.

src/vdaf/prio3.rs Outdated Show resolved Hide resolved
src/vdaf/prio3.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting your back into making this API-safe. It was a lot more work than we initially expected, but I think this is a really great improvement. Two high-level comments:

  1. For consistency, I think we should short-circuit any comparisons that don't need to be constant time. The only things that need to be constant time are secrets and secret shares. (See inline comments for specific suggestions.)
  2. It's important to keep in mind that constant time code is "best effort" and doesn't guarantee that side channels are eliminated. One day some academic lab might take a radio receiver or electron microscope to libprio-rs and find they can pull off an attack if both Aggregators are on old Pentium laptops in the same room or something. The documentation and comments should reflect that we have done our best to mitigate some timing side channels, but others may exist. I don't have a specific suggestion here, so maybe this is just something to keep in mind.

src/vdaf/poplar1.rs Outdated Show resolved Hide resolved
src/vdaf/prio3.rs Show resolved Hide resolved
src/vdaf/prio3.rs Outdated Show resolved Hide resolved
src/vdaf/prio3.rs Outdated Show resolved Hide resolved
src/vdaf/prio3.rs Outdated Show resolved Hide resolved
@branlwyd
Copy link
Member Author

branlwyd commented Sep 8, 2023

For consistency, I think we should short-circuit any comparisons that don't need to be constant time. The only things that need to be constant time are secrets and secret shares. (See inline comments for specific suggestions.)

Done, but I'm relying on y'all for which fields specifically are safe to compare in variable-time.

It's important to keep in mind that constant time code is "best effort" and doesn't guarantee that side channels are eliminated. One day some academic lab might take a radio receiver or electron microscope to libprio-rs and find they can pull off an attack if both Aggregators are on old Pentium laptops in the same room or something. The documentation and comments should reflect that we have done our best to mitigate some timing side channels, but others may exist. I don't have a specific suggestion here, so maybe this is just something to keep in mind.

Agreed; I think this should go in something like a "security considerations" section, once one is written.

Getting `option_ct_eq` to work in the last place it needed to required a
few changes.
@branlwyd branlwyd merged commit 753a461 into main Sep 8, 2023
6 checks passed
@branlwyd branlwyd deleted the bran/constant-time branch September 8, 2023 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Comparing Vdaf::PrepareState in constant time
3 participants