Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New arithmetic for k256 #59

Merged
merged 59 commits into from Jul 13, 2020
Merged

Conversation

fjarri
Copy link
Contributor

@fjarri fjarri commented Jun 29, 2020

This PR implements "lazy reduction" arithmetic for SECP256k1, based on libsecp256k1.

(Disclaimer: this is my first non-trivial piece of Rust code. Don't hesitate to point out my mistakes).

  • adds 5x52bit and 10x26bit arithmetic for the internal field (the old Montgomery one is preserved)
  • adds 4x64bit and 8x32bit arithmetic for scalars
  • adds more tests for fields and scalars (using num-bigint as a reference)
  • adds a fixed window point-scalar multiplication

More advanced multiplication routines (wNAF, endomorphism, batch multiplication) will not be included in this PR — it has too much stuff already. Even the windowed multiplication already gives a good performance boost. As compared to the baseline performance, I currently see an almost 3x speedup (300us vs 106us per mul).

The old Montgomery field can be enabled with field-montgomery, 32-bit fields & scalars on a 64-bit target can be enabled with force-32-bit.

@tarcieri tarcieri requested review from tuxxy and str4d Jun 30, 2020
@tarcieri
Copy link
Member

Note: there is a #[bench] test in field.rs, so the whole thing only compiles for nightly now. Not sure what is the best practice here.

A couple options:

  1. Add #[cfg(nightly)] gating, and run the benchmarks as cargo +nightly bench -- --cfg nightly. Somewhat annoying.
  2. On some other issues (Not for merge: quick n' dirty ECDSA #57) we've discussed adding an expose-arithmetic cargo feature (potentially hidden in the https://docs.rs docs, and with appropriate scare language documenting it) which would be useful in this case. It would allow you to use nicer tools for benchmarking too (e.g. criterion)

@tarcieri
Copy link
Member

Looks like some recent changes to arithmetic.rs are now conflicting. Can you rebase please?

Overall this looks exciting!

@fjarri
Copy link
Contributor Author

fjarri commented Jun 30, 2020

Sure, I'm doing some abstracting now, will rebase after.

@fjarri
Copy link
Contributor Author

fjarri commented Jul 5, 2020

Sorry for a bit of a delay on this. I was playing with endomorphism-based multiplication from the original libsecp256k1, but the code related to it is quite large in size and very convoluted, so it is better left for the next PR. For now I'll settle for a simple fixed-window multiplication (which is still an almost 2x improvement over the previous double-and-add) and limit this PR to scalar arithmetic.

@fjarri
Copy link
Contributor Author

fjarri commented Jul 6, 2020

@tarcieri I added the nightly-bench feature as you suggested, and it seems to work (tests run on my machine both on stable and nightly), but CI still fails because this feature gets enabled by --all-features. Is there a way to avoid that somehow? This will also be a problem when I add 32-bit implementations for FieldElement and Scalar, which I planned to make available as features.

@tarcieri
Copy link
Member

tarcieri commented Jul 6, 2020

My suggestion was to add a cfg attribute, not a cargo feature:

#[cfg(nightly)]

@fjarri
Copy link
Contributor Author

fjarri commented Jul 6, 2020

For some reason it doesn't work for me. The marked functions/imports are picked neither in a stable nor in a nightly run. --cfg option is not recognized either:

> cargo +nightly bench -- --cfg nightly
    Finished bench [optimized] target(s) in 0.03s
     Running /Users/bogdan/wb/github/elliptic-curves/target/release/deps/k256-5d871111fa650de9
error: Unrecognized option: 'cfg'
error: bench failed

@tarcieri
Copy link
Member

tarcieri commented Jul 6, 2020

Hmm, that's unfortunate.

I think a better option in general would be finding a way to make it work with criterion, which in addition to working on stable provides a much better benchmarking tool in general.

@fjarri
Copy link
Contributor Author

fjarri commented Jul 6, 2020

I see. I guess I'll just scrap the benchmark in the final version then.

@tarcieri
Copy link
Member

tarcieri commented Jul 6, 2020

Sounds good for now. We can explore adding an expose-arithmetic feature in a follow-up PR which would permit a benchmark using criterion.

@tarcieri
Copy link
Member

tarcieri commented Jul 8, 2020

Not sure why the codecov check is failing, but regardless, don't worry about it

@tarcieri
Copy link
Member

tarcieri commented Jul 11, 2020

@fjarri sorry about #66. After landing it, I realized I'll need a bunch of stuff in this branch anyway for what I'm presently working on (public key recovery from ECDSA signatures) and should probably be using this branch as a base instead

@fjarri
Copy link
Contributor Author

fjarri commented Jul 11, 2020

No worries, seems like a small change anyway. I know I've been taking some time with this PR, but I'm really hoping to get it ready for review this weekend.

@tarcieri
Copy link
Member

tarcieri commented Jul 11, 2020

@fjarri fantastic! I can keep working off your WIP branch in the meantime.

Also: do you plan on adding scalar inversions? It looks like you have most of the prerequisites in place.

(Edit: for a PoC I tried gluing together @nickray's scalar inversions from #57 together with this branch's u64x4 backend in #68)

@fjarri fjarri changed the title [WIP] new arithmetic for k256 New arithmetic for k256 Jul 11, 2020
@fjarri fjarri marked this pull request as ready for review Jul 11, 2020
@fjarri
Copy link
Contributor Author

fjarri commented Jul 11, 2020

@tarcieri , uhh, I just realized you merged some of my commits to upstream. I was intending to squash them, they're quite unkempt, but I guess it's too late now...

k256/Cargo.toml Outdated Show resolved Hide resolved
@fjarri
Copy link
Contributor Author

fjarri commented Jul 11, 2020

Also, I added an addition-chain based inversion for scalars, which should be faster than a generic vartime power.

@tarcieri
Copy link
Member

I just realized you merged some of my commits to upstream. I was intending to squash them, they're quite unkempt, but I guess it's too late now...

@fjarri I used them in a draft PR (#68). None of them have been merged to the master branch.

@fjarri
Copy link
Contributor Author

fjarri commented Jul 11, 2020

Ah, my bad, I'm going to do some squashing then.

k256/src/arithmetic/field.rs Outdated Show resolved Hide resolved
@tarcieri
Copy link
Member

Another small change you might consider making (or otherwise I can) is wiring up the GenerateSecretKey implementation to use your newly added Scalar::generate method:

https://github.com/RustCrypto/elliptic-curves/blob/42dcac1/k256/src/arithmetic.rs#L519-L535

@codecov-commenter
Copy link

codecov-commenter commented Jul 12, 2020

Codecov Report

Merging #59 into master will decrease coverage by 24.67%.
The diff coverage is 45.34%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #59       +/-   ##
===========================================
- Coverage   81.50%   56.82%   -24.68%     
===========================================
  Files          12       18        +6     
  Lines        1395     2965     +1570     
===========================================
+ Hits         1137     1685      +548     
- Misses        258     1280     +1022     
Impacted Files Coverage Δ
k256/src/arithmetic/field/field_10x26.rs 0.00% <0.00%> (ø)
k256/src/arithmetic/field/field_montgomery.rs 0.00% <0.00%> (ø)
k256/src/arithmetic/scalar/scalar_8x32.rs 0.00% <0.00%> (ø)
k256/src/arithmetic/util.rs 48.14% <52.38%> (-18.52%) ⬇️
k256/src/arithmetic/scalar.rs 82.60% <81.76%> (-5.07%) ⬇️
k256/src/arithmetic.rs 84.90% <86.58%> (-0.03%) ⬇️
k256/src/arithmetic/field.rs 92.57% <91.08%> (+7.02%) ⬆️
k256/src/arithmetic/field/field_5x52.rs 93.59% <93.59%> (ø)
k256/src/arithmetic/scalar/scalar_4x64.rs 94.22% <94.22%> (ø)
k256/src/arithmetic/field/field_impl.rs 95.31% <95.31%> (ø)
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 93e7a40...1d7f6a2. Read the comment docs.

@tarcieri
Copy link
Member

tarcieri commented Jul 13, 2020

@fjarri I'd say this is ready to merge unless there are any objections (would definitely like to use this ASAP)

Normally we use GitHub's "Squash and Merge" however if you'd like me to preserve the commit history here (or squash it yourself) you can.

@fjarri
Copy link
Contributor Author

fjarri commented Jul 13, 2020

Squash and merge sounds good to me.

Please give me until the end of the day before merging, I just want to add a bit more comments.

@tarcieri
Copy link
Member

tarcieri commented Jul 13, 2020

@fjarri regarding the SecretKey issue, after talking with some people it sounds like the current method it's using (rejection sampling) is the safest.

Notably it can also be made constant time by always generating a fixed number of samples. The probability of a non-biased RNG generating multiple consecutive candidates which are out of range vanishes rather quickly (after ~4 candidates).

Whether that's actually worthwhile seems to be a matter of debate. For now I think it's fine to leave as-is.

@cargo-dep-bot
Copy link

cargo-dep-bot bot commented Jul 13, 2020

This PR made the following dependency changes:

Added Packages (Duplicate versions in '()'):
	num-bigint 0.3.0
	num-integer 0.1.43

Removed Packages (Remaining versions in '()'):
	fiat-crypto 0.1.5

@fjarri
Copy link
Contributor Author

fjarri commented Jul 13, 2020

Ok, I think it's ready now. I exposed both Scalar::generate_vartime() (the algorithm from generate_secret_key()) and Scalar::generate_biased(), and used the former in generate_secret_key(). We can remove the biased version later if no-one needs it, but at least its drawback is right there in the name.

@tarcieri tarcieri merged commit f932323 into RustCrypto:master Jul 13, 2020
@tarcieri
Copy link
Member

Merged! 🎉

Thanks so much!

@tuxxy
Copy link
Contributor

tuxxy commented Jul 14, 2020

Hey, y'all! Sorry I didn't see these mentions (GitHub notifications don't exactly pull my attention anymore 😅), but it sounds like y'all figured it out anyway.

This PR looks great @fjarri. Excited to build on this for sure!

@daira
Copy link

daira commented Jul 15, 2020

That I'm not qualified to answer. @tuxxy , what do you say? The biased generation is essentially the same as it was for field elements - 512 random bits are generated and modulo reduced.

In my professional opinion that is the best way to do it. It's what we use in Zcash for RedDSA signatures. There is no reason to worry about a ~2-256 bias in any plausible cryptographic context. I think there's no reason to use a name like generate_biased when the bias is so small. (Consider that the probability of the hardware failing in a way that silently introduces a bias is much, much higher than the bias due to the algorithm itself.)

@tarcieri
Copy link
Member

tarcieri commented Jul 31, 2020

Interesting article on rejection sampling vs modular reduction for randomly generating in-range values with a uniform distribution. The conclusion was interesting: why not both?

https://research.kudelskisecurity.com/2020/07/28/the-definitive-guide-to-modulo-bias-and-how-to-avoid-it/

@tarcieri tarcieri mentioned this pull request Aug 11, 2020
@fjarri fjarri deleted the k256-new-arithmetic branch Aug 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants