
evaluate SNARK-friendly hash functions, e.g. Poseidon, Rescue-Prime, Griffin, Anemoi #2233

Open
zookozcash opened this issue Apr 2, 2017 · 69 comments
Labels
A-circuit Area: zk-SNARK circuits A-crypto Area: Cryptography C-research Category: Engineering notes in support of design choices I-performance Problems and improvements with respect to performance I-SECURITY Problems and improvements related to security. M-requires-nu A network upgrade is required to implement this. Network Upgrade Wishlist not in Sapling special to Zooko

Comments


zookozcash commented Apr 2, 2017

[MiMC] is the result of the design approach of "let's do a traditional symmetric-flavored secure hash function, but using primitives that are efficient within a SNARK".

@zookozcash zookozcash added A-crypto Area: Cryptography M-requires-nu A network upgrade is required to implement this. maybe in 2.0 I-performance Problems and improvements with respect to performance C-research Category: Engineering notes in support of design choices NU1-sapling Network upgrade: Sapling-specific tasks I-SECURITY Problems and improvements related to security. special to Zooko A-circuit Area: zk-SNARK circuits labels Apr 2, 2017
@daira daira changed the title from "evaluate the SNARK-friendly MiMC secure hash function" to "evaluate the SNARK-friendly MiMChash function" Apr 3, 2017

daira commented Apr 3, 2017

Note that MiMC is a cipher (or permutation). MiMChash (section 2.3 of the original paper) is what we want; probably a prime-field variation on MiMCHash-256b (which is a sponge hash based on MiMC-769/769).

Section 5.1 of that paper says:

The above descriptions of MiMC can also be used to operate over prime fields i.e. a field Fp where p is prime. In that case, it needs to be assured that the cubing in the round function creates a permutation. For this, it is sufficient to require gcd(3, p − 1) = 1.

However for BLS12-381 (https://z.cash/blog/new-snark-curve.html), the most efficient choice of p is 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001, for which gcd(3, p − 1) ≠ 1.


daira commented Apr 3, 2017

Also note that the ~255-bit field Fp (for the BLS12-381 p, or the BN128 one for that matter) is not big enough for MiMC(Fp)-p/p to have comparable security to MiMC(F_(2^n))-769/769. We'd need to do bignum arithmetic with at least three limbs, which could get us to a 764-bit permutation (or use an extension field F_(p^3); not sure if that works for the permutation requirement).


daira commented Apr 3, 2017

The disadvantage of doing bignum arithmetic is that then you have to implement modular reduction, which would significantly increase the cost.


daira commented Apr 3, 2017

Hmm, we could use the LP231 construction of Rogaway and Steinberger (Constructing Cryptographic Hash Functions from Fixed-Key Blockciphers) with MiMC(Fp)-p/p. That would fit the field exactly, giving a ~510 to ~255-bit compression function. (This construction requires 3 calls to the permutation per compression, but we would need at least three times the field size to construct a similarly sized compression function from MiMC using a sponge, so using LP231 appears to pay off.)

[Edit: oh, except that this still hits the gcd(3, p − 1) ≠ 1 problem. But if we decided to use MiMC then we could just generate a new curve with this constraint.]


daira commented Apr 5, 2017

Here's the definition of LP231A from the Rogaway and Steinberger paper:

function LP231A(v1, v2):
    x1 := v1 + 2*v2
    y1 := π1(x1)
    x2 := 2*v1 + 2*v2 + y1
    y2 := π2(x2)
    x3 := 2*v1 + v2 + y2
    y3 := π3(x3)
    w1 := v1 + y1 + y2 + 2*y3
    return w1

(Note that there's an error in the left-hand side of Figure 1 of the paper; y3 := π3(x3) is correct, y3 := π3(x2) is not. The right-hand side of that figure gives the correct general algorithm, of which LP231A is supposed to be a special case.)

In that paper, the focus is on F_(2^128) as the group, so '+' essentially denotes XOR. However, nothing prevents us from applying this construction in Fp instead of F_(2^128), and I believe the security analysis still goes through with '+' as addition mod p. I'll call that variation LP231Ap.
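A direct transcription of LP231A with '+' as addition mod p might look like the following (an illustrative sketch only; the function name lp231a_p is mine, and the permutations are passed in as abstract callables):

```python
# Sketch of the LP231A compression function over F_p, with addition mod p in
# place of XOR. pi1..pi3 are the three fixed permutations; in the proposal
# they would be MiMC instances with distinct round constants.
def lp231a_p(v1, v2, p, pi1, pi2, pi3):
    x1 = (v1 + 2 * v2) % p
    y1 = pi1(x1)
    x2 = (2 * v1 + 2 * v2 + y1) % p
    y2 = pi2(x2)
    x3 = (2 * v1 + v2 + y2) % p
    y3 = pi3(x3)
    # 2:1 compression: two field elements in, one out
    return (v1 + y1 + y2 + 2 * y3) % p
```

The linear combinations here are exactly the ones in the pseudocode above; only the group operation has changed.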

We can instantiate π1..3 using MiMCP(Fp)-p/p with different (fixed) round constants. There is no performance penalty for using distinct sets of constants. Here's the definition of MiMCP(Fp)-p/p, generalized for exponent e:

function MiMCP_{C,e}(x):
    x_0 := x
    for i from 0 up to R-1:
        x_{i+1} := (x_i + C_i)^e
    return x_R

(This differs slightly from the paper because we have an extra round constant C0; note that k is zero. We could set C0 to zero to reproduce MiMCP from the paper.)
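The round loop above translates directly into code (a sketch; the key is fixed to zero as noted, and the constants are placeholders for fixed random-looking field elements):

```python
# Sketch of the generalized MiMCP permutation over F_p: R rounds of
# x := (x + C_i)^e mod p. This is only a permutation of F_p when
# gcd(e, p - 1) = 1.
def mimcp(x, constants, e, p):
    for c in constants:
        x = pow((x + c) % p, e, p)  # one round: add constant, exponentiate
    return x
```

For example, over F_11 (where gcd(3, 10) = 1, so cubing is a permutation) this maps each field element to a distinct output.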

I will call this hash function construction "LongsightL" (Longsight is where I am at the moment; the L is to distinguish it from LongsightF below).

In particular LongsightL-R/p/e refers to the case where the MiMC permutations have R rounds, are defined over Fp, and use exponent e.

When Fp is the internal field of the SNARK, LongsightL-R/p/3 requires 6R multiplications (2 multiplications per round for each of the three permutations). This corresponds to 6R constraints; the operations used in LP231Ap are linear and can be folded into the same constraints as the multiplications.

LongsightL-R/p/5 requires 9R multiplications or constraints (3 multiplications per round for each permutation), i.e. there is 50% overhead for the use of exponent 5. So, it would definitely be preferable to use exponent 3 if the gcd(3, p − 1) ≠ 1 problem can be fixed.

Note that there appears to be an error in the MiMC paper concerning the number of constraints needed; the paper claims in section 6.1 (page 22) that only one R1CS constraint is needed to implement each round of the permutation (which is reflected in the prototype implementation). However this fails to constrain the intermediate value Y in equation (5) on page 22, which is necessary for security.

With exponent 3, to equal the performance of a Pedersen hash estimated at #2234 (comment) , we could have R up to 2675/6 = 445 rounds.

The MiMC paper only defines the number of rounds for MiMC(F_(2^n)), but it can be generalized to an arbitrary finite field Fr as:

    R = ceiling(log_3(r))

So for r = p = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001 which is efficient for BLS12-381, the number of rounds should be 161. (The calculation at the end of section 4.3 suggests that this may be very conservative, at least for protection against known attacks, so it might be possible to reduce the number of rounds.)

LongsightL-161/p/3 doesn't actually work because of the gcd(3, p − 1) ≠ 1 problem, but it needs 966 constraints, so for a similar curve without that problem it would be ~1.95 times faster than Pedersen (1881 constraints), or ~28.5 times faster than SHA256Compress (27536 constraints), or ~1.99 times faster than a subset-sum hash of dimension 5 (~1923 constraints).

LongsightL-161/p/5 does work (on either the BN128 or BLS12-381 curves) and needs 1449 constraints, which makes it ~1.30 times faster than Pedersen, or ~19.0 times faster than SHA256Compress, or ~1.33 times faster than a subset-sum hash of dimension 5.
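The round count and constraint arithmetic above can be reproduced as follows (a quick sanity check; variable names are mine):

```python
# Check the generalized MiMC round count and LongsightL constraint counts
# for the BLS12-381 scalar field modulus quoted above.
import math

p = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001
R = math.ceil(math.log(p, 3))  # R = ceiling(log_3(p))

# LongsightL uses three permutation instances (one per pi in LP231A):
constraints_e3 = 6 * R  # exponent 3: 2 constraints/round x 3 permutations
constraints_e5 = 9 * R  # exponent 5: 3 constraints/round x 3 permutations
```

This gives R = 161, 966 constraints for exponent 3, and 1449 for exponent 5, matching the figures above (966 vs. 1881 for Pedersen is the ~1.95x factor).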


daira commented Apr 5, 2017

Another option to get a ~510 → ~255-bit compression function is to use the Feistel variant of MiMC, i.e. MiMCP(Fp)-2p/p, to construct a ~510-bit permutation, and then truncate the result to a single field element (by taking the element that was modified in the last round). This would require double the number of rounds for the permutation (i.e. 322 rounds), but only one permutation instance is required, so the overall time is one-third less (1.5 times faster). This also does not require the x ↦ x^3 mapping to be a permutation (but I am slightly nervous of it not being one until this has been further analysed).

Security of this variant (let's call it LongsightF) would rely on the fact that truncating a random permutation gives something tightly indistinguishable from a random function.
Specifically, for the case of a permutation (Fp)^2 → (Fp)^2 truncated to Fp, if q is the number of queries such that q ≤ 3p^2/4, then the advantage of an adversary in distinguishing the random permutation from a random function is bounded by q/p^(3/2). And a random function is obviously the best we can do for a collision-resistant hash.


daira commented Apr 5, 2017

So, LongsightF-322/p/3 needs 644 constraints, and is ~2.76 times faster than Pedersen (after further optimization of Pedersen hashing), ~42.8 times faster than SHA256Compress, and ~2.99 times faster than a subset-sum hash of dimension 5.


daira commented Apr 5, 2017

Copying this comment from #647:

If we were to use Pedersen hashes without #647 in place of all of the SHA256Compress instances in the circuit (which might not be possible, but for the sake of argument), it would still take ~3 seconds on our benchmark machine to generate a proof. Using LongsightF-322/p/3 it would still take ~1 second. (In both cases this is neglecting any improvement from using the Groth16 proving system, or any other changes to the circuit.) With #647 as well, proving would take ~1.6 seconds for Pedersen hashes or ~0.6 seconds for LongsightF-322/p/3.

[Note to anyone getting excited about these performance estimates: they are very preliminary and make lots of assumptions; we're certainly not making any commitment that the actual performance of Sapling will look like this.]

[Edit: updated estimates for Pedersen hashes, which are still very approximate.]


ebfull commented Apr 5, 2017

So, from a security perspective there doesn't seem to be any good reason to choose MiMC for the merkle tree. If discrete log is broken, counterfeiting can occur. If collision resistance of the tree is broken, counterfeiting can occur. Thus it is strictly better to rely on discrete log for collision resistance.

From a performance perspective, there needs to be a compelling improvement in performance. I would like to benchmark the following:

  • subset sum d=5
  • mimc using whatever daira thinks is best
  • pedersen hashes

Based on what I'm hearing, I think MiMC will have a negligible improvement in performance (it will not translate to 2x faster than pedersen just because it has 2x less constraints) at the cost of implementation complexity and extra assumptions. It doesn't help that the paper seems to have errors, and we're gluing other papers in...

As for subset sum, Eran thinks it's roughly as efficient as pedersen hashes with d=5.


daira commented Apr 6, 2017

@ebfull wrote:

So, from a security perspective there doesn't seem to be any good reason to choose MiMC for the merkle tree. If discrete log is broken, counterfeiting can occur. If collision resistance of the tree is broken, counterfeiting can occur. Thus it is strictly better to rely on discrete log for collision resistance.

Strictly speaking, "If discrete log is broken" is too much of a simplification, because there are two distinct discrete log problems: the one in the external curve used for the SNARK, and the one in the internal curve that is efficiently implementable in the circuit. Note that the modulus of the field for the inner curve is constrained, and so we cannot trivially dismiss any concern that it might not be as secure. Now, in practice I don't think it is a problem, because it is common to use curves defined over fields with moduli that are even more constrained (in a similar way), such as Curve25519 and many of the others listed at SafeCurves.

I don't think that the fact we would be relying on more than one paper is a good argument against trying to optimize the hashing mode for MiMC to our particular requirements and SNARK. Many other well-regarded hashes are constructed as a hashing mode around a core designed by other researchers (e.g. BLAKE2 uses the ChaCha20 core). We would need to rely on more than one paper for Pedersen, since the original paper does not cover the multiple-generator variant (which I think was proposed in a paper by Chaum, van Heijst and Pfitzmann), nor does it include a tight concrete security analysis (which is in this paper by Bellare, Goldreich and Goldwasser). We also relied on analysis from multiple papers when choosing the current SHA-256 and BLAKE2-based primitives (see #792 for details). There's no particular reason why doing so is a problem.

The error in the Rogaway and Steinberger paper was just a typo. The error in the MiMC paper (if I'm right that it is an error) is more concerning, although it may just indicate inexperience with SNARKs on the part of the authors. My considered opinion is that this does not significantly reflect on the analysis of MiMC as a cipher or random permutation, since:

  • there's no reason to believe they skimped on the number of recommended rounds for performance reasons, and underestimating the number of constraints needed wouldn't have caused them to do so (if anything the opposite);
  • none of the analysis depends on the implementation for SNARKs;
  • this error was always going to be caught as soon as we tried to understand the implementation using libsnark; I think there's little chance that we would have missed it.

I'm aware that proving performance is not exactly proportional to the number of constraints, but I was under the impression that taking it to be so was a pretty close approximation in practice. Of course we do have to benchmark it, but in the meantime, my best guess is that LongsightF-322/p/3 would indeed be about 4 times faster than Pedersen. (If anything the dependence of the proving time on number of constraints is slightly superlinear, which would cause the speed-up to be slightly more than estimated here — although there's also a small constant term which would have the opposite effect. The circuit cost is dominated by hashing, and either Pedersen or MiMC can potentially replace all the current SHA256Compress hashes/PRFs. There will be some minor differences due to input block granularity.)

[Edit: the paragraph below assumes out-of-date performance estimates for Pedersen hashes. The advantage of MiMC is now roughly a factor of 3 rather than 4.]

I don't think that a factor-of-4 difference in performance (if that's even close to accurate) is negligible. There are plenty of applications for which 0.6 seconds to compute a JoinSplit would be acceptable but 2.3 seconds might cause a problem. (Don't assume that the only constraint is how much latency a human user will put up with. That's important, but high-volume automated payments are also important.) Also bear in mind that 0.6 seconds on our 4-core E3-1225 v2 3.20GHz benchmarking machine probably translates to tens of seconds on some lower-end platforms.

I am unconvinced of the security of subset-sum with d=5. Eli said in a past zdep meeting that he is also unconvinced. Of these three options, subset-sum is the one that would keep me up at night worrying that it might be broken without enough warning to switch to something else.


ebfull commented Apr 6, 2017

Note that the modulus of the field for the inner curve is constrained, and so we cannot trivially dismiss any concern that it might not be as secure.

The groups are different but the assumptions are not so independent. The SNARK curve we use has a very low embedding degree and would not be considered safe by safecurves, yet the embedded curve would be. It's very plausible that if the discrete log is failing in the Montgomery curve, our SNARKs are in bad shape. (And by then, I doubt anyone will still give traditional ECC the benefit of the doubt.)

It's also nicer to rely on two related assumptions than to rely on many completely different ones. At the more practical level, using ECC for the CRH simplifies auditing and code review if we're using it elsewhere.

I don't think that the fact we would be relying on more than one paper is a good argument against trying to optimize the hashing mode for MiMC to our particular requirements and SNARK.

I'm definitely not arguing that we shouldn't do analysis and compare. I'm just pointing out that if we're doing homebrew hash function development, we're better off choosing something that has had more eyes on it than MiMC apparently has.

I would happily take the performance improvement of MiMC if it is worth those other costs and risks.


ebfull commented Apr 6, 2017

@daira Would you mind writing a little constraint-system mockup of the best MiMC variant you can construct for the BLS curve and I'll post some benchmarks?


daira commented Apr 7, 2017

Let C_i = FromLE(BLAKE2b("LongsightF-322p3", ToLE_2(i))) mod p.

(ToLE_ℓ(n) is the sequence of ℓ bytes representing n in little-endian order. FromLE(s) is the nonnegative integer represented in little-endian form by s. But any constants that are randomish elements of Fp will do.)

function LongsightF322p3(xL ⦂ Fp, xR ⦂ Fp) {
    for i from 0 up to 321 {
        xL, xR := xR + (xL + C_i)^3, xL
    }
    return xL
}

Additions and cubing are mod p. I'm assuming for the time being that there is no security problem with cubing not being a permutation, when the Feistel network variant is used. We need to ask some experts on hash functions about that.
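A runnable transcription might look like the following. Note this is a sketch, not a vetted implementation: the constant derivation treats "LongsightF-322p3" as the BLAKE2b personalization string (one plausible reading of the definition above, since it is exactly 16 bytes); as noted, any randomish constants would do.

```python
# Sketch of LongsightF-322/p/3 for the BLS12-381 scalar field modulus.
import hashlib

p = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001
ROUNDS = 322

def constant(i):
    # C_i = FromLE(BLAKE2b("LongsightF-322p3", ToLE_2(i))) mod p
    h = hashlib.blake2b(i.to_bytes(2, "little"), person=b"LongsightF-322p3")
    return int.from_bytes(h.digest(), "little") % p

C = [constant(i) for i in range(ROUNDS)]

def longsight_f(xL, xR):
    # Feistel rounds: xL, xR := xR + (xL + C_i)^3, xL   (all mod p)
    for i in range(ROUNDS):
        xL, xR = (xR + pow((xL + C[i]) % p, 3, p)) % p, xL
    return xL
```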


daira commented Apr 7, 2017

Or did you want me to convert it to R1CS? We should probably pair on doing that.

@str4d str4d added this to Performance in Protocol Team Mar 6, 2019
@daira daira changed the title from "evaluate SNARK-friendly hash functions, e.g. MiMC" to "evaluate SNARK-friendly hash functions, e.g. MiMC or Friday" Mar 23, 2019

daira commented Apr 24, 2019

I wrote in #2233 (comment) :

A sponge has parameters c and r where c must be at least twice the security level, and r determines how much input can be processed in each absorption. So for a ~128-bit security level with a ~256-bit field, we want c to be 1 field element, and r must be at least one field element.

  • For a permutation width of 1 field element, sponge is excluded because r would be zero.

  • For a permutation width of 2 field elements, you can use sponge but it requires two [evaluations for] absorption and one [for] squeezing to implement a 2:1 hash. This compares unfavourably to M–P using a 1-element wide permutation, which only requires 2 evaluations of the permutation.

This is incorrect. Up to r field elements of output can be extracted immediately after the last absorption, so this would only require two evaluations for absorption and none for squeezing. Therefore M–P has no advantage.

  • A permutation width of 3 field elements is probably optimal for a sponge if you want to construct a 2:1 hash; it requires one absorption and one squeezing.

This actually requires just one evaluation for absorption.

The rest of my comment was correct.

However, it isn't clear how to construct the triple-width permutation generically from a 1-element wide permutation; you need a dedicated construction. There haven't been many block ciphers with a block size that is three times the natural word size (3-Way is the only one I can think of), so I have no intuition as to how much this costs.

  • For a permutation width of 4 field elements, a single absorption takes 3 field elements as input. This is overkill for many applications. It may be possible to use it with a ternary Merkle tree, but that doubles the size of authentication paths.

  • For a permutation width of n = 5 or more field elements, the input size is definitely overkill. An (n-1)-ary Merkle tree would multiply the size of authentication paths by (n-2), which starts to become quite unwieldy.


khovratovich commented Apr 24, 2019 via email


daira commented Apr 24, 2019

@khovratovich wrote:

The size of authentication paths actually multiplies by (n-2)/log_2(n-1) because the tree has smaller depth.

I stand corrected, thank you. (So, a 3-ary tree has only 26% larger authentication paths, and a 4-ary tree only 50% larger.)
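The corrected factor is easy to check numerically (a quick sketch; the helper name path_factor is mine): an a-ary tree reveals a−1 siblings per level, but is shallower than a binary tree by a factor of log2(a).

```python
import math

def path_factor(arity):
    # authentication-path size relative to a binary tree over the same leaves:
    # (siblings per level) / (depth reduction factor)
    return (arity - 1) / math.log2(arity)
```

This gives 1.0 for arity 2, about 1.26 for arity 3, and exactly 1.5 for arity 4, matching the percentages above.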


daira commented Apr 26, 2019

Friday and Jarvis (the cipher it is based on) are broken by Albrecht, Cid, Grassi, Khovratovich, Lüftenegger, Rechberger, and Schofnegger 2019. The attack expresses the whole cipher or the whole hash compression function as a system of equations that is solved using a Gröbner basis. At the risk of oversimplifying, although this category of attacks is applicable to many algebraic cipher/hash constructions, it is only feasible for Jarvis/Friday (for their originally recommended number of rounds) because of the specific way that they construct the linear layer in each round using two low-degree polynomials.

The paper applies a similar analysis to MiMC but does not break it for the recommended number of rounds.

@daira daira changed the title from "evaluate SNARK-friendly hash functions, e.g. MiMC or Friday" to "evaluate SNARK-friendly hash functions, e.g. MiMC or the MARVELlous family" Apr 26, 2019

TomerAshur commented Apr 26, 2019

Here are a few points from me as a co-designer of Jarvis/Friday (speaking only on behalf of myself here):

  1. I agree that the proposed attack works for Jarvis/Friday with the recommended number of rounds.
  2. I have doubts about the exact complexities reported in the paper. We're waiting for the code of the attack to be published on their Github (see Footnote 7 in the paper) to perform our own evaluation of the attack.
    2.1 Corollary of 1+2: we are not certain about the security of Jarvis/Friday and currently we do not promote nor recommend them for real-world use until we (as a community) better understand their resistance to algebraic attacks.
  3. I have doubts about the comparison with MiMC and disagree with the Albrecht et al.'s conclusion about its security against an analysis similar to what was presented in this paper.


daira commented Apr 26, 2019

Thanks for your input @TomerAshur. I did say "at the risk of oversimplifying" :-) I was also careful to say that the paper doesn't break MiMC; not that a Gröbner basis attack cannot break it (which I would not be qualified to assess one way or another).


daira commented Jun 29, 2019

From https://eprint.iacr.org/2019/426 :

The relative importance of Gröbner basis attacks is illustrated by Jarvis [5] and MiMC [3], two arithmetization-oriented ciphers that were proposed with explicit consideration for a wide range of attacks, but not attacks based on computing Gröbner bases. However, shortly after its publication, a Gröbner basis attack that requires only a single plaintext-ciphertext pair was used to discover non-ideal properties in Jarvis [2]. An investigation of MiMC using the same attack was argued to be infeasible [2, Sec. 6]. While finding the Gröbner basis is easy, the next two steps — monomial order conversion and factorization of the resulting univariate polynomial — are not, owing to the infeasibly large number of parasitical solutions in the field closure.

However, relying on the large number of parasitical solutions for security against Gröbner basis attacks is a new security argument and a risky one. The simple observation that using more than just one plaintext-ciphertext pair makes the system of equations overdetermined, and thus filters out all parasitical extension field solutions with overwhelming probability, seems to undermine this argument. We note that the complexity analysis of overdetermined polynomial system solving requires delicate attention and it is conceivable that the resulting attack is also infeasible but for a different reason. However, the point is that even if this is the case, MiMC's security is not guaranteed by the large number of parasitical solutions. Either way, these observations raise the question whether there is a systematic argument for Gröbner basis security that does not depend on the particular flavor of the attack. In Section 4.5 we answer this question positively by providing such an argument.


dlubarov commented Aug 21, 2019

Wouldn't MiMC + Davies-Meyer work in this context? I see why we would need something like Rogaway-Steinberger if we constrained ourselves to using the permutation MiMCP (where the key is zero), but wouldn't Davies-Meyer work with MiMC-the-block-cipher?

It seems like (the additive variant of) Davies-Meyer would be quite efficient, with just 161 * 2 + 1 = 323 constraints per layer, assuming a different 255-bit curve in which x^3 is a permutation.


daira commented Aug 21, 2019

A Rescue sponge is much more efficient than Davies-Meyer, since it takes full advantage of the fact that the Rescue permutation can be run with a fairly large width (number of field elements).


dlubarov commented Aug 22, 2019

Oh sorry if I was beating a dead horse; I overlooked your comparison of LP231 vs Miyaguchi–Preneel earlier.

This compares unfavourably to M–P using a 1-element wide permutation, which only requires 2 evaluations of the permutation

Could you explain this? I don't understand why more than one evaluation would be needed with D-M or M-P.

@dlubarov

Regarding Poseidon, I think it's assumed within the paper that a 1/x S-box would map zero to itself, costing three constraints per S-box. (@khovratovich could you confirm?) It would be interesting to explore using Poseidon with a plain inversion S-box instead.

In the context of Merkle proofs, the leaves are normally hashes, so I think finding a set of hashes which results in 1/0 would be at least as hard as a generalized birthday attack. But even if an attacker could find a 1/0 case, it seems harmless -- the R1CS instance just wouldn't be satisfiable with those inputs, so the leaves in question wouldn't be included. Honest users should never encounter 1/0, or if they somehow did, they could avoid it by generating a new opening.

If my argument is correct, and if the paper's security analysis holds up, it seems like we could do Merkle proofs in about 2k constraints. Here's a sample configuration:

  • curve: BLS12-381
  • full rounds: 8
  • partial rounds: 118
  • permutation size: 13
  • sponge capacity: 1
  • sponge rate (and Merkle tree arity): 12
  • tree depth for 2^32 elements: 9
  • total constraints per Poseidon instance: 222
  • total constraints per Merkle proof: 1998
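The arithmetic for this configuration can be reproduced as follows (a sketch; it assumes, per the discussion above, one constraint per plain-inverse S-box, and the variable names are mine):

```python
import math

full_rounds, partial_rounds, width = 8, 118, 13
arity = 12  # sponge rate, and Merkle tree arity

# Full rounds apply the S-box to every state element; partial rounds to one.
sboxes = full_rounds * width + partial_rounds  # one constraint per S-box

# Depth of an arity-12 tree over 2^32 leaves.
depth = math.ceil(32 / math.log2(arity))

total = sboxes * depth  # constraints per Merkle proof
```

This reproduces 222 constraints per Poseidon instance, depth 9, and 1998 constraints per Merkle proof.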


khovratovich commented Sep 18, 2019 via email


dlubarov commented Sep 18, 2019

Thanks for clarifying Dmitry. Yeah, it definitely makes sense to use a 3-constraint inverse for a generic construction. Just in the context of private blockchains like Zcash, the leaves of the Merkle tree are hash-based commitments, so we pay that hashing cost regardless.

The leaf values are out of adversarial control in the sense that they must be hash outputs, but the sibling values have no such constraint. So to prevent fabricating membership proofs, it should be hard to find a sibling such that compress(current, sibling) = root. I think that would rule out polynomial based hashes, right? But Poseidon is a very interesting candidate, particularly if it can be used with a "plain" inverse S-box.


daira commented Jul 18, 2022

We are using Poseidon (with $x^5$ S-box) as part of the nullifier PRF construction in Orchard, and it's likely that we'll be using it for Fiat–Shamir in the recursive instantiation of Halo. It's possible that we could use it in future for the note commitment tree, but that wouldn't actually save very much in terms of constraints relative to Sinsemilla (because Sinsemilla takes full advantage of lookups, and Poseidon does not). So I'm not sure this issue needs to remain open.

@daira daira changed the title from "evaluate SNARK-friendly hash functions, e.g. MiMC or the MARVELlous family" to "evaluate SNARK-friendly hash functions, e.g. MiMC, Poseidon, Rescue-Prime, Anemoi" Dec 24, 2022

daira commented Dec 24, 2022

I believe we should reconsider Poseidon and Anemoi for the commitment tree in some future iteration of the protocol (this would be after any update for ZSAs). A few things have changed since this issue was last active:

  • We and others have switched to Plonkish arithmetization, and we understand its trade-offs better now (most of the above cost analysis was done for R1CS).
  • There has been more experience using Poseidon in deployed systems, including as part of the nullifier PRF in Orchard.
  • There is a refinement of Rescue called Rescue-Prime that is worth considering (although it's still somewhat costly to compute outside the circuit, which is the main reason Rescue has lost mindshare and deployment experience to Poseidon).
  • There is a new primitive, Anemoi, that has good efficiency in Plonkish.
  • Similarly, Griffin should also be efficient in Plonkish.
  • Another possibility is Reinforced Concrete, although I am a bit skeptical of its cost relative to other options.
  • There is a new compression mode, Jive, that is optimized for applications such as Merkle trees that only need collision resistance for small inputs. It is described in section 3 of the Anemoi paper, and is applicable to any cryptographic permutation on small vectors of field elements, including Anemoi, Griffin, Poseidon, and Rescue-Prime. (Note: when choosing the number of rounds, some designers may assume that the permutation will be used in a sponge. Care must be taken to select the number of rounds to make it reasonable to treat the primitive as a randomly selected permutation, so that the security analysis for Jive actually applies.)

I personally would now be comfortable using Poseidon with Jive, or possibly even Anemoi or Griffin with Jive if they had obtained sufficient analysis, for the commitment tree in a post-ZSA update of the circuit.
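As I read section 3 of the Anemoi paper, the Jive mode feeds b field elements through the permutation and outputs the sum of all the permutation's inputs and outputs (a feedforward in the Davies-Meyer spirit). A toy sketch under that reading; the shift permutation here is a stand-in for testing, not a real SNARK-friendly primitive:

```python
# Sketch of the Jive_b compression mode: b field elements in, one out.
# perm must be (treatable as) a randomly selected permutation of (F_p)^b
# for the mode's security analysis to apply.
def jive(perm, xs, p):
    ys = perm(xs)
    return (sum(xs) + sum(ys)) % p
```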

@daira daira changed the title from "evaluate SNARK-friendly hash functions, e.g. MiMC, Poseidon, Rescue-Prime, Anemoi" to "evaluate SNARK-friendly hash functions, e.g. Poseidon, Rescue-Prime, Anemoi" Dec 24, 2022

daira commented Dec 24, 2022

Another reason why field arithmetic-based hashes such as Poseidon, Rescue-Prime, Anemoi, and Griffin may win out over hashes that depend on hardness of discrete log such as Sinsemilla in the long term, is that the former could potentially be used in a fully post-quantum protocol (when used with a plausibly post-quantum proof system such as Plonky2, Vortex proofs, or HyperPlonk using FRI).

@daira daira changed the title from "evaluate SNARK-friendly hash functions, e.g. Poseidon, Rescue-Prime, Anemoi" to "evaluate SNARK-friendly hash functions, e.g. Poseidon, Rescue-Prime, Griffin, Anemoi" Dec 25, 2022

khovratovich commented Dec 26, 2022 via email

@weikengchen

Here is evidence supporting the use of Anemoi in Zcash. I did a benchmark of a 6-in-6-out transfer, single-core, on a MacBook Pro.

Zcash Sapling (Groth16): 49.65s
Espresso CAP (UltraPlonk): 40.64s
Zcash Orchard (Halo2Plonk-IPA): 23.76s
2022/1487 using Anemoi (TurboPlonk): 5.44s
