
Research zkSNARKs blocker: Benchmark and optimize proof time #7

Open
oskarth opened this issue Nov 8, 2019 · 4 comments



@oskarth oskarth commented Nov 8, 2019

Problem

Proving time for Semaphore (https://github.com/kobigurk/semaphore) zkSNARKs using circom, groth and snarkjs is currently far too long: it takes on the order of ~10m to generate a proof. With websnark, it is likely to take ~30s, which might still be too long. We should experiment with native code on mobile here.

Acceptance criteria

While we can potentially use precomputed proofs to some extent, this needs to be on the order of seconds to be feasible for e.g. spam protection signalling of normal messages. This means a performance improvement of a bit more than two orders of magnitude is necessary. Alternatively, performance can be somewhat lower if we get precomputed proofs and batching to work smoothly.

Details

See https://github.com/vacp2p/research/blob/master/zksnarks/semaphore/src/hello.js#L118 for proof generation. This uses snarkjs and circom, running through Node, which is likely one of the slower environments. Additionally, the number of constraints for the snark is around ~60k.

Reducing the problem to a simpler circuit (Pedersen commitment, 5 rounds), see https://ethresear.ch/t/benchmark-circom-vs-bellman-wasm-in-chrome-on-mobile/5261, we note the previous benchmarks of Circom and Bellman in different environments such as Node/WASM/Native:

  • Node: ~50s
  • WASM (Circom and Bellman, incl. on mobile): ~1s
  • Native (Bellman, multicore somewhat useful): ~0.1s

I.e. WASM is ~50x faster and native ~500x faster than Node. There appears to be work in progress and room for improvement in the WASM benchmarks.

The Circom and Bellman circuits both have around ~3.5k constraints, i.e. roughly 5% of our original snark.

MacBook Core-i7 @ 2.6Ghz
circom (node.js):        53685ms
circom (chrome):         53266ms
circom (websnark):       1408ms
bellman (webassembly):   1151ms
bellman (rust, 1 core):  235ms
bellman (rust, 8 cores): 85ms

iPhone XS
bellman (webassembly):   1126ms

For our initial problem, this would mean roughly:

3.5k -> 60k; 20x constraints

> circom (node.js):        53685ms
> circom (websnark):       1408ms
> bellman (rust, 8 cores): 85ms

=> node ~15m
=> wasm ~30s
=> native ~2s
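The extrapolation above can be reproduced with a quick sketch, assuming proving time scales roughly linearly with constraint count (a simplification: the FFT component scales slightly worse):

```python
# Rough extrapolation from the 3.5k-constraint benchmark circuit to the
# ~60k-constraint Semaphore circuit, assuming roughly linear scaling in
# the number of constraints.
SCALE = 60_000 / 3_500  # ~17x; the text above rounds this to 20x

benchmarks_ms = {
    "circom (node.js)": 53_685,
    "circom (websnark)": 1_408,
    "bellman (rust, 8 cores)": 85,
}

for name, ms in benchmarks_ms.items():
    est_s = ms * SCALE / 1000
    print(f"{name}: ~{est_s:.0f}s")
# node lands around ~15m, websnark around ~25-30s, native around ~1.5-2s,
# matching the estimates above.
```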

Possible solutions

1. Use a better environment (WASM or native).

No-brainer. Either use websnark with the existing Circom snarks, or compile Bellman to WASM.

WASM might not be enough, though, especially to get to subsecond proving, and it requires more optimization.

Native would be useful, especially for resource-restricted devices and for spam protection. Additional benchmarks on mobile would be good to have. A big issue is deployment: currently, native would mean something like Bellman, which right now doesn't have a library like Circom for writing SNARKs. This seems to be an ongoing effort:

> We currently work on Circom -> Bellman export so that we can combine the ease of use of Circom language and performance and portability of Bellman. The goal is to make a toolchain that allows you to write snarks in Circom and get super fast .wasm + js bindings as compilation result, hiding the complexity of Bellman under the hood.

2. Decrease number of constraints.

60k could possibly be reduced quite a lot, but this requires additional research (e.g. Merkle tree structure, number of rounds for various hash functions, etc.). If it could be halved, then websnark might be enough.

3. Use a different type of ZKP

Other types of ZKP come with lower proof time, but with other trade-offs (such as proof size). The main reasons to explore this would, from my current POV, be related to: proof time; security assumptions (cryptographic assumptions and trusted setup); and proof size.

4. Precompute proofs

It's possible we can precompute proofs for e.g. spam protection. This means you can instantly signal, and then 'save up' messages. How this would work in practice requires further research.
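One way to picture this is a pool of proofs that is refilled in the background and drained instantly when the user signals. A hypothetical sketch (`generate_proof` is a stand-in for the real, slow prover call; the actual construction of reusable proofs is exactly the open research question):

```python
import queue
import threading
import time

def generate_proof(nonce):
    """Stand-in for the real ~30s proving call."""
    time.sleep(0.01)  # placeholder for expensive proving work
    return f"proof-{nonce}"

class ProofPool:
    """Hypothetical precomputed-proof pool: a background thread keeps the
    pool topped up, so signalling only pops an already-finished proof."""

    def __init__(self, size):
        self.proofs = queue.Queue(maxsize=size)
        threading.Thread(target=self._refill, daemon=True).start()

    def _refill(self):
        nonce = 0
        while True:
            self.proofs.put(generate_proof(nonce))  # blocks when pool is full
            nonce += 1

    def signal(self):
        # Instant while precomputed proofs remain; once the pool is drained
        # the caller waits, which is the 'save up' limit mentioned above.
        return self.proofs.get()

pool = ProofPool(size=4)
print(pool.signal())  # → proof-0
```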

5. Prover as a service?

I don't know what this would look like, but perhaps proof generation could be outsourced. This appears to be StarkWare's business model, if I understand it correctly: https://starkware.co/the-road-ahead-in-2019/

6. Batching of signals

A bit similar to the 'precompute' case, but it might be doable to take the many messages in a 30s period and batch them together. In the human chat case, this would impact UX. Worth exploring, though.

Note

The benchmarks in the problem description and on ethresear.ch might be incorrect, so it's worth checking them again.

@barryWhiteHat barryWhiteHat commented Nov 8, 2019

Semaphore currently uses the Blake2 hash function (for quantum forward secrecy) and the Pedersen hash (a hardened, collision-resistant alternative to MiMC). We also use an EdDSA signature for similar crypto UX; this could be replaced with knowledge of a preimage as the signature.

An example minimal circuit that lacks the quantum forward secrecy is https://github.com/peppersec/tornado-mixer/; it's possibly a better benchmark candidate.

You would need to add Shamir secret sharing of the leaf, but that applies to both.

@mmaller mmaller commented Nov 11, 2019

Regarding Point 4: Precompute proofs

Using Groth16 (and possibly any other ZKP system that doesn't rely on random oracles), I believe this is possible provided that small changes to the input do not drastically change the witness values.

For example, consider two very small Merkle trees Hash(Hash(a), Hash(b)) and Hash(Hash(a), Hash(c)). Then the witness components required to verify that Hash(a) is computed correctly are the same for both, so the prover's work in verifying this value does not necessarily need to be repeated.
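The shared-subtree observation can be checked directly. A toy sketch, using SHA-256 in place of the circuit's actual Pedersen hash:

```python
import hashlib

# Toy illustration of the shared-witness observation: the Hash(a) subtree is
# byte-for-byte identical in both trees, so the constraint assignments
# verifying it could be reused across both proofs.

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

a, b, c = b"a", b"b", b"c"

tree_1 = H(H(a) + H(b))  # Hash(Hash(a), Hash(b))
tree_2 = H(H(a) + H(c))  # Hash(Hash(a), Hash(c))

assert tree_1 != tree_2       # the roots differ...
assert H(a) == H(a)           # ...but the Hash(a) component is shared
print("shared Hash(a) component:", H(a).hex()[:16])
```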

In Groth16, I believe some of the work for computing the first two proof elements A and B can be precomputed for these types of circuits. Regarding the third proof element C, anything related to the output of the FFT will need to be computed every time from scratch (so a guesstimated 1/8 of the total work cannot be aided by precomputation, not including FFT costs).

@kobigurk kobigurk commented Nov 11, 2019

Some of these are avenues that definitely can be explored.

About reducing the number of constraints: we use some methods that are possibly safer, though they come at a cost - we use Pedersen hashes for commitments and Blake2s for nullifier computation. If you believe the algebraic hash functions, such as MiMC, are strong enough to provide sufficient hiding and post-quantum security, and not reveal anything about the hashed inputs - you could use those instead and lower the number of constraints considerably.
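For a sense of why algebraic hashes are so much cheaper in constraints, here is a toy MiMC-like permutation over the BN254 scalar field. The round constants are illustrative, not the production circomlib ones, so this is a sketch of the structure rather than an interoperable implementation:

```python
# Toy sketch of a MiMC-like permutation over the BN254 scalar field.
# Each round is a single low-degree field exponentiation, i.e. only a few
# multiplication constraints per round in R1CS, versus tens of thousands of
# boolean constraints for a bit-oriented hash like Blake2s.
# Round constants below are illustrative, NOT the production circomlib ones.
P = 21888242871839275222246405745257275088548364400416034343698204186575808495617

N_ROUNDS = 91  # round count commonly used for exponent-7 MiMC over BN254

def mimc(x: int, k: int, constants) -> int:
    for c in constants:
        x = pow((x + k + c) % P, 7, P)  # round: x -> (x + k + c)^7 mod p
    return (x + k) % P

constants = [pow(i + 1, 7, P) for i in range(N_ROUNDS)]  # illustrative only
digest = mimc(1, 2, constants)
print(hex(digest)[:18])
```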

About WASM vs JS: I'm not sure you'll get to subsecond with the current tooling, either JS or WASM. That said, a native Zcash spend proof in Bellman takes on a good laptop, as far as I remember, about a second. I've also done some benchmarks in the past with single-threaded Bellman: https://github.com/kobigurk/wasm_proof. With websnark, the situation is better since it's multi-threaded and doesn't have the overhead of Rust -> WASM, though I don't have numbers.

As to prover as a service, this isn't trivial. In the StarkWare case it makes sense - there's no privacy involved. The StarkWare prover, roughly, aggregates a bunch of in-the-clear transactions and proves it was done correctly. Anyone could do it, since there's no secret data involved. In this case, the user has private data they have to input to the proof. There could be interesting methods to explore here, though it's not as direct as just offloading.

Regarding proof precomputation, I agree with @mmaller that some of A and B, and even parts of C (e.g., some of the sA + rB part of it), can be precomputed. One part is moderately expensive and doesn't seem amenable to precomputation: the H coefficients, which is where the FFT happens.

I think your best bet would be to use multicore Bellman on native mobile. I believe this would be by far the most efficient method and would also alleviate the need to download circuit.json, as its equivalent could be efficiently compiled into the program binary. The proving key might also be more efficiently represented, though I'd have to look at it again to be sure. proving_key.bin is already binary, though maybe points are decompressed or there are zero points saved, etc.

It would require the rewrite of the Semaphore circuit in Bellman. With the gadgets already existing in Bellman and sapling-crypto, it shouldn't be a huge undertaking, though not trivial.

@barryWhiteHat barryWhiteHat commented Nov 12, 2019

> It would require the rewrite of the Semaphore circuit in Bellman. With the gadgets already existing in Bellman and sapling-crypto, it shouldn't be a huge undertaking, though not trivial.

Could we use the circom importer that you wrote for the trusted setup to bypass the need to rewrite?
