Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native Poseidon implementation(s) #207

Merged
merged 28 commits into from
Sep 1, 2021
Merged

Native Poseidon implementation(s) #207

merged 28 commits into from
Sep 1, 2021

Conversation

unzvfu
Copy link
Contributor

@unzvfu unzvfu commented Aug 30, 2021

Two implementations of Poseidon, for widths 8 and 12: First implementation is 'naive' taken directly from the definition, and the other uses some trickery to do faster partial rounds (described in the paper). Both versions use a specially crafted MDS matrix.

Performance data is tracked in #4. In summary, the 'fast' implementation is faster on width 12, and the two implementations are similar on width 8. Performance is within about 10-15% of GMiMC with width 8.

Scripts required to generate the constants in the 'fast' implementation and the MDS matrix will be stored in a separate repository.

@unzvfu unzvfu requested a review from dlubarov August 30, 2021 13:32
@unzvfu unzvfu self-assigned this Aug 30, 2021
@dlubarov
Copy link
Contributor

What do you think about having Fields implement PoseidonInterface? I took a stab at it in #209.

Also I should mention, but we've been using rustfmt lately, so please run cargo fmt at some point before merging. Could optionally use #[rustfmt::skip] so that it skips the big constant arrays.

@unzvfu
Copy link
Contributor Author

unzvfu commented Aug 30, 2021

What do you think about having Fields implement PoseidonInterface? I took a stab at it in #209.

Yep, that could work. The constants definitely depend on the choice field, so that makes sense. My only hesitation would be from the fact that the Fields code is already a bit unwieldy. I'll have a look at #209 later today.

ETA: Yep, agree that #209 is an improvement.

Also I should mention, but we've been using rustfmt lately, so please run cargo fmt at some point before merging. Could optionally use #[rustfmt::skip] so that it skips the big constant arrays.

Ah, great to know about #[rustfmt::skip]! I'll use it on the big arrays, and tidy up the other other linting issues.

@unzvfu unzvfu requested a review from dlubarov September 1, 2021 00:10
* Have `Field`s implement `PoseidonInterface`

Rather than having a sort of "dummy struct" implement `PoseidonInterface` with the field as a generic param. I think this seems more natural and type-safe.

The type safety does come at a price -- it would be harder to do dynamic things such as taking `WIDTH` as a command line option -- but I think that's alright.

* Fix missed conflicts.

* cargo fmt fixes.

* Fix to accommodate changes in latest nightly.

Co-authored-by: Hamish Ivey-Law <426294+unzvfu@users.noreply.github.com>
Co-authored-by: Hamish Ivey-Law <hamish@ivey-law.name>
#[unroll_for_loops]
fn sbox_layer(state: &mut [Self; WIDTH]) {
for i in 0..WIDTH {
state[i] = Self::sbox_monomial(state[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if sbox_layer could be even faster if it was written in a more "row oriented" manner? See e.g. winterfell's Rescue implementation.

I suppose if this is really inlined and unrolled, the compiler should be able to interleave things optimally regardless of how we structure the code. I forget if unroll_for_loops works with const generics though? I wish we could put it on a specific loop and have it fail if it it couldn't unroll it.

(Definitely not a blocker in any case, just thinking out loud.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the link to winterfell's Rescue implementation is what you meant? It's actually just a 'big' exponentiation: powers of the base a computed and combined using an addition chain for the exponent. I don't think that helps us, does it?

I wrote up a version of sbox_layer that unrolled and interleaved the computation of the state[i]^7 for i = 0..7 in such a way as to give maximum opportunity for multiplication throughput. Unfortunately the performance was the same.

I have had some issues with unroll_for_loops and const generics in the past, but they generally work okay together. In this case sbox_layer is not unrolled, but that is (presumably) because the code for calculating x^7 is too long at 4 modular multiplications. Other functions are definitely unrolled, like mds_layer for example.

Copy link
Contributor

@dlubarov dlubarov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@unzvfu
Copy link
Contributor Author

unzvfu commented Sep 1, 2021

Bizarrely, somehow the performance has changed a bit somewhere between f118464 and 45613c9. The numbers I'm getting now for width=8 are (name/microseconds/slowdown):

GMiMC             2.07   1.00
Poseidon          3.31   1.60
Poseidon naive    3.04   1.47

So, compared to the numbers in #4, GMiMC is 20% faster, and 'fast' Poseidon is now 10% slower than 'naive' Poseidon (which itself is slightly slower).

I think we can just push ahead though, since they're still very good, and we don't want to be optimising for my old laptop anyway! ;)

@unzvfu unzvfu mentioned this pull request Sep 1, 2021
@unzvfu unzvfu merged commit 92bc65a into main Sep 1, 2021
@unzvfu unzvfu deleted the poseidon-impl branch September 4, 2021 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants