Co-authored-by: Cesar Descalzo <Cesar199999@users.noreply.github.com>
Co-authored-by: Antonio Mejías Gil <anmegi.95@gmail.com>
Joint work with @Cesar199999
Cf. issue #476
General explanation
This PR contains some important steps towards hybrid Merkle trees, by which we mean a Merkle tree where different hash functions are used at different levels. This technique can save substantial prover work while only slightly increasing (recursive) verifier work. In the simplest case, one can use a (natively) very fast, recursion-unfriendly hash function (such as Blake3) for the bottom level of the tree, while maintaining a (natively) slower, recursion-friendly hash (such as Poseidon2) for the rest of the levels. Because the bottom level accounts for half of all the prover's hashing, while only constituting a 1/height fraction of the verifier's (due to the structure of a Merkle path), the aforementioned tradeoff arises.
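As a back-of-the-envelope illustration of this tradeoff (generic to any binary Merkle tree, not specific to this crate), consider a tree with 2^20 leaves:

```rust
// Back-of-the-envelope illustration of the prover/verifier tradeoff for a
// binary Merkle tree with 2^20 leaves (height 20).
fn main() {
    let height: u32 = 20;
    let leaves: u64 = 1 << height;

    // Prover: one compression per internal node. A tree with N leaves has
    // N - 1 internal nodes, and the bottom layer alone contains N / 2 of them.
    let total_compressions = leaves - 1;
    let bottom_compressions = leaves / 2;
    let prover_fraction = bottom_compressions as f64 / total_compressions as f64;

    // Verifier: checking a Merkle path costs `height` compressions, of which
    // exactly one happens at the bottom level.
    let verifier_fraction = 1.0 / f64::from(height);

    // The bottom level is ~half of the prover's work...
    assert!((prover_fraction - 0.5).abs() < 1e-3);
    // ...but only 1/height of the verifier's.
    assert_eq!(verifier_fraction, 0.05);
    println!("prover: {prover_fraction:.4}, verifier: {verifier_fraction:.4}");
}
```

So making the bottom-level hash nearly free roughly halves the prover's compression work while adding only one recursion-unfriendly hash per opened path.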
The new code is meant to work as a toolbox allowing users (even ones importing the crate as a dependency) to mix hashes of their choice and try out different hybrid-hashing and node-conversion strategies (by implementing the relevant traits themselves). This flexibility should enable one to tweak the tradeoff and/or apply it only when it is beneficial.
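The kind of interface this enables can be sketched as follows, with simplified, hypothetical names (`Compress`, `HybridCompress`, `TwoLevel` are illustrative stand-ins, not the crate's actual items; the real trait is described under "Code overview" below):

```rust
// Illustrative sketch only: a hybrid compressor receives level information
// and dispatches to one of its inner compressors accordingly.

trait Compress<T, const N: usize> {
    fn compress(&self, input: [T; N]) -> T;
}

// Mirrors `Compress` but adds the matrix heights and the current level's
// size, so the implementor can pick a hash per level.
trait HybridCompress<T, const N: usize> {
    fn compress(&self, input: [T; N], sizes: &[usize], current_size: usize) -> T;
}

// Toy strategy: one compressor at the bottom level, another everywhere else.
struct TwoLevel<C1, C2> {
    bottom: C1,
    rest: C2,
}

impl<T, const N: usize, C1, C2> HybridCompress<T, N> for TwoLevel<C1, C2>
where
    C1: Compress<T, N>,
    C2: Compress<T, N>,
{
    fn compress(&self, input: [T; N], sizes: &[usize], current_size: usize) -> T {
        // We are at the bottom iff the current size equals the tallest
        // (padded) matrix height.
        let max = sizes.iter().copied().max().unwrap_or(0);
        if current_size == max {
            self.bottom.compress(input)
        } else {
            self.rest.compress(input)
        }
    }
}

// Two toy compressors standing in for, e.g., Blake3 and Poseidon2.
struct XorHash;
impl Compress<u64, 2> for XorHash {
    fn compress(&self, input: [u64; 2]) -> u64 { input[0] ^ input[1] }
}
struct SumHash;
impl Compress<u64, 2> for SumHash {
    fn compress(&self, input: [u64; 2]) -> u64 { input[0].wrapping_add(input[1]) }
}

fn main() {
    let c = TwoLevel { bottom: XorHash, rest: SumHash };
    // Bottom level (current_size == max matrix height): uses XorHash.
    assert_eq!(c.compress([6, 3], &[8, 4], 8), 6 ^ 3);
    // Any other level: uses SumHash.
    assert_eq!(c.compress([6, 3], &[8, 4], 4), 9);
    println!("ok");
}
```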
So far the design only allows hybrid hashing for the compression of tree nodes (the previous generic `C: PseudoCompressionFunction`), not for the digestion of matrix rows into leaves (the previous generic `H: CryptographicHasher`). Our reasoning is that the verifier needs to do the same amount of hashing as the prover in order to digest a row, and so the tradeoff explained above doesn't manifest as clearly in the case of the digestion function. However, implementing hybrid digestion down the line is indeed a possibility which would make the toolbox more general.

Code overview
(the code itself contains detailed information in the form of `//` and `///` documentation)

In essence, a new structure `HybridMerkleTree` has been created whose constructor `new` is generic on `C: HybridPseudoCompressionFunction<...>`, as opposed to the `C: PseudoCompressionFunction<...>` present in the preexisting `MerkleTree`. The trait `HybridPseudoCompressionFunction<T, const N: usize>` closely mirrors `PseudoCompressionFunction<T, const N: usize>` but replaces the original method `fn compress(&self, input: [T; N]) -> T` by `fn compress(&self, input: [T; N], sizes: &[usize], current_size: usize) -> T`. The two extra arguments are meant to inform the hybrid compressor about which level of the Merkle tree it is being called at, as well as the heights of the matrices being committed to, so that the hybrid compressor can decide which of its (possibly many) compression functions to use. Note that the hybrid compressor only accepts one input type `[T; N]`. If its implementor uses several compressors requiring different input types under the hood (e.g. field elements and bytes), that variety is hidden away from the trait and left for the implementor to handle. This design has some tradeoffs we are happy to discuss, but one of its main advantages is that the `HybridMerkleTree` code is almost a carbon copy of the original `MerkleTree`'s: it does not need to worry about differing node types because `HybridPseudoCompressionFunction<T, N>` only exposes one type to it. The only change introduced to `HybridMerkleTree` is passing the heights of the matrices to the compression function.

One implementor of
`HybridPseudoCompressionFunction` is provided: `SimpleHybridCompressor`, which uses the aforementioned strategy of one compressor `C1` at the bottom level and another one, `C2`, at all others. To be precise, if the highest and second-highest numbers of rows differ by a factor of 2 (when rounded up to the nearest power of 2), then `C1` is used both to compress the bottom leaves and to inject the next-to-bottom leaves, which makes sense, since the amount of calls to `compress` is the same in both operations.

The generics and inner workings of `SimpleHybridCompressor` are explained in detail in the code. Perhaps the only point deserving further attention is the generic `NC: NodeConverter<..., ...>` of `SimpleHybridCompressor`. It allows this hybrid compressor to convert back and forth between the input types of `C1` and `C2` in order to always receive and output the unique type present in the trait `HybridPseudoCompressionFunction`.

Two implementations of the node converter are provided, both for
`BabyBear` <-> `u8` conversion in the 256-bit-node case. One is `NodeConverter256BabyBearBytes`, which can handle any `[PackedValue<Val = BabyBear>; 8]` and `[PackedValue<Val = u8>; 32]` (with the same `WIDTH`); and another one, `UnsafeNodeConverter256BabyBearBytes`, which can only handle `[BabyBear; 8]` and `[u8; 32]` (with the individual types acting as implementors of `PackedValue` of `WIDTH` 1). The former implementation is forced to use `PackedValue` trait methods (which only provide references and therefore involve cloning) and requires transposition of the arrays of packed values, and is therefore relatively slow. The latter relies on hard casts (aside from the modular reduction) and is hence both `unsafe` and lightning fast.

Separately, we have added an `IdentityHasher`, which simply pads the vector to be digested with defaults up to the desired output size (and panics if the vector is longer than that, since this would lead to a trivially second-preimage-weak digestion hasher). The relevance of this hasher will be made clear by the benchmark explanation below.

What can be executed and some numbers
We have added three benchmarks to the `merkle_tree` crate. In them, matrices of `BabyBear` elements are generated randomly (both in terms of size and field elements). Then several compressor configurations are used to construct hybrid and plain Merkle trees, and the time is measured. In this description, by "plain-4" we mean "the original `MerkleTree` with a Poseidon2 compressor all throughout and packed `BabyBear` nodes of `WIDTH` 4". By "plain-1" we mean the same with `WIDTH` equal to 1. And by "hybrid" we mean "`HybridMerkleTree` with Blake3 at the bottom and Poseidon2 elsewhere, with `WIDTH` equal to 1 in both cases".

The benchmarks were run on an Apple desktop computer with an M2 and 16 GB of RAM. In one case, it became relevant to also run them on another system: a Linux laptop with an i7-1165G7 and 32 GB of RAM. We refer to these machines as A and B, respectively. The benchmarks and results are as follows:
1. `hybrid_unsafe_vs_safe`: This is simply meant to highlight the speed difference between the two node converters. The exact numbers aren't all that relevant, in that the cost of node conversion is overshadowed by the rest of the tree construction, but e.g. using flamegraph shows that unsafe conversion has a negligible cost compared to the safe one. For that reason, only the `UnsafeNodeConverter` is used for the hybrid tree in all other benchmarks.

2. `hybrid_vs_plain`: This benches plain-1, plain-4 and hybrid using Poseidon2 as the digest hasher in all cases. Leaf digestion takes substantially longer than compression, which highlights one case in which hybrid Merkle trees (in their current form at least) may not be so useful, namely when matrices have many columns and digestion is the dominating cost. These are the numbers:

Crucially, hybrid is not twice as fast as plain-1: even though compression itself should indeed take about half the time (because Blake3 is essentially free compared to Poseidon2), the digestion costs diminish that advantage.

3. `hybrid_vs_plain_identity_hasher`: Here we repeat the same experiment as in bench 2, but we use the `IdentityHasher` in order to digest the leaves. Since this is essentially a free operation, digestion costs no longer muddle the compression gains brought about by the hybrid strategy. Note that this places a restriction on the matrices that form the leaves: at each level, the row length of all concatenated matrices cannot surpass 8 `BabyBear` elements (256 bits), since that would render identity hashing into 256 bits impossible. These are the numbers on machine A:

Here the expected ~50% savings in compression time between plain-1 and hybrid shine through (of course, other smaller costs slightly alter the exact relation). Crucially, the hybrid strategy is not better than the `WIDTH`-4 plain one. This is likely because machine A's architecture can execute operations on `PackedValue`s efficiently, which the hybrid strategy cannot take advantage of (cf. point 1 of "Possible future work" below). However, the numbers on machine B are more flattering:

On this machine, whose architecture does not take advantage of `PackedValue`s, the hybrid strategy is the clear winner.

Possible future work
The point of this draft PR is to receive some feedback on the current code, but also to get your sense of whether it makes sense to develop this further and, if so, in which direction. Here are some things we thought about but haven't started working on yet:
1. It follows from the benchmarks that a very fast tree construction could be achieved if the hybrid tree could handle packed nodes of `WIDTH` 4. We don't have any numbers, but the obstacle to implementing this is purely Rust-based. Summarising: if one wants the `HybridPseudoCompressionFunction` to be able to handle both individual leaves `[W; DIGEST_ELEMENTS]` as well as packed ones `[PW; DIGEST_ELEMENTS]`, one needs to add generics to either the `SimpleHybridCompressor` structure or the trait itself. Both possibilities come with their own difficulties. In the case of the trait, one would be forced to 1) limit the trait to two compression functions (as opposed to as many as one wants, as now); and 2) fill the `HybridMerkleTree` code with several more generics, whereas so far it is as clean as the original `MerkleTree` one. There might be a simpler design/Rust solution we are missing, though.

2. Carry over the hybrid construction to the MMCS, i.e. add methods to prove and verify paths using `HybridPseudoCompressionFunction`s. This should be relatively straightforward.

3. Add more implementations and benches for other hashes present in the codebase, including the required node converters (in particular, switching the current node converters from `BabyBear` to arbitrary 31-bit Montgomery fields shouldn't be too hard).

4. Hybrid digestion, as explained above. This would probably require significant generic/trait work.
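For the MMCS point above, the shape of the verification side can be sketched as follows. This is purely illustrative (hypothetical names, not the crate's API): the point is that the verifier replays the path from leaf to root while threading the level information through, so a hybrid compressor can mirror the prover's per-level choice of hash.

```rust
// Illustrative sketch of hybrid Merkle path verification. `verify_path` is a
// hypothetical helper, not an existing crate item.
fn verify_path<T: PartialEq + Copy>(
    root: T,
    leaf: T,
    mut index: usize,
    siblings: &[T],
    sizes: &[usize],
    compress: impl Fn([T; 2], &[usize], usize) -> T,
) -> bool {
    // Start at the bottom: the tallest (padded) matrix height.
    let mut current_size = sizes.iter().copied().max().unwrap_or(0);
    let mut node = leaf;
    for &sibling in siblings {
        // Order the pair by the index bit, exactly as during construction.
        let input = if index % 2 == 0 { [node, sibling] } else { [sibling, node] };
        node = compress(input, sizes, current_size);
        current_size /= 2;
        index /= 2;
    }
    node == root
}

fn main() {
    // Toy compressor that ignores the level information.
    let add = |input: [u64; 2], _sizes: &[usize], _size: usize| input[0].wrapping_add(input[1]);
    // Hand-built 4-leaf tree: leaves [1, 2, 3, 4] -> level [3, 7] -> root 10.
    let (leaves, level, root) = ([1u64, 2, 3, 4], [3u64, 7], 10u64);
    // Path for leaf index 2: sibling leaf 4, then sibling node 3.
    assert!(verify_path(root, leaves[2], 2, &[leaves[3], level[0]], &[4], add));
    // A wrong root must be rejected.
    assert!(!verify_path(root + 1, leaves[2], 2, &[leaves[3], level[0]], &[4], add));
    println!("ok");
}
```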
N.B.: We have included a cautioning docstring about the dangers of implementing/using hybrid strategies and node conversion without thinking about the security of the resulting configuration (cf. the beginning of the file `hybrid_merkle_tree.rs`).

@mmagician