Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: SNARK pool batching #4882

Closed
wants to merge 4 commits into from
Closed

RFC: SNARK pool batching #4882

wants to merge 4 commits into from

Conversation

emberian
Copy link
Contributor

There are some grammatical tense issues in this, but I have decided not to fix them. This one still has some unresolved questions and it introduces a new operational requirement that I find distasteful. Feedback wanted!

## Detailed design
[detailed-design]: #detailed-design

The proposed mechanism is to have two "batch modes": Batch All and Batch by Prover, and to split provers into two categories: Trusted and Untrusted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think batching snarks that are required immediately (if the block producer is scheduled to create the next block) would also be nice? if we wait too long to batch verify/ or some snarks are in a batch that won't be verified until after the diff is created then we might not be able include as many transactions if they were serialized. But this might not be as beneficial from a performance standpoint.
Also, we have this delay factor in scan state so, if there are enough provers and the coordinator picks the work that are needed first then the nodes producing blocks should have the required work at least a block before they need it (if the delay is at least 1). So maybe it is ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deepthiskumar how can we detect which of these proofs are required in the next block? I think it makes sense to eagerly start a batch before the timeout/max batch size if there's a pending block production deadline, but I'm not sure how to know which proofs to give that treatment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can query the scan state at the best tip to get all the required work for the next block. work_statements_for_new_diff in transaction_snark_scan_state.ml will fetch you that


The timer ensures that, even when SNARK work is coming in slowly, we aren't adding too much latency into the gossip verification.

In order to identify provers into those two categories, we need an unforgeable notion of prover identity. I propose we add a new signature of the SNARK work, using a key that is present in the best tip ledger. This adds a new failure mode to SNARK proofs: signing with a key that's not present in the ledger. Users should be discouraged from storing value in that account. By necessity, the private key will need to be relatively unprotected as it needs to be available on the SNARK coordinator to sign proofs before broadcasting them.
Copy link
Member

@deepthiskumar deepthiskumar May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much time does it takes to check a signature? Is that the 10 ms in 600ms + N*10

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, good point, this doesn't account for the extra signature verification time. I can add a benchmark for this, and I suspect it'll be slower than it could be (using a different signature scheme) but not as slow as not batching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imeckler gave me new observed numbers: 800ms + N*56ms

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The identity can be included as part of the signature-of-knowledge message and thus implicitly included in SNARK verification


An early idea was an extra "not yet known" state before nodes become Untrusted. In the extra state, we don't do any batching, so as to avoid commiting potentially a lot of time to verifying a bogus batch before any reputation is established. This was deemed unnecessary.

One idea was for identifying provers was to use the fee recipient account, since it's already encoded into the SoK. Without adding an additional signature this allows forgery as only public information is needed, and also if the proof fails to verify then nothing can be inferred from it. Using the fee recipient account itself seems unwise as it will be storing actual value, and having the private key just lying around with on the SNARK coordinator is risky (and a regression from today's world where SNARK coordinators and workers need no sensitive information).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I understand the forgery part. The Sok is baked in the snark and so if someone sent it with a different key, the proof would fail to verify?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SoK digest is baked into the snark (and I'm not sure there's a way to extract it? maybe it's a public input), and if the proof fails to verify because it was sent with a different key we still don't know who to blame for the bad proof.

- Upon receipt of the first SNARK work into the queue, start a `MAX_SNARK_DELAY` second timer.
- When the queue hits `MAX_SNARK_BATCH_LENGTH`, or after the timer expires, clear the queue and start the batch verification process.

The timer ensures that, even when SNARK work is coming in slowly, we aren't adding too much latency into the gossip verification.
Copy link
Member

@imeckler imeckler Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emberian is there a need for a MAX_SNARK_BATCH_LENGTH? Or even any kind of timer? It occurs to me that we could maybe do the following.

Have a state

type verifier_state =
  { mutable out_for_verification : Snark_work.t list option
  ; queue : Snark_work.t Queue.t }
  • out_for_verification : Snark_work.t list option represents the proofs which are currently being processed by the verifier. It is None if the verifier is not verifying any proofs at the moment.
  • When a new proof comes in put it in the queue. If out_for_verification = None, then verify that proof immediately.
  • Whenever the verifier returns, flush queue into the verifier and move the queue's contents into out_for_verification. If the verifier returned successfully, then insert the old out_for_verification proofs into the table and regossip them.

Copy link
Member

@imeckler imeckler Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess basically you are suggesting waiting MAX_SNARK_DELAY (which should be understood as max delay beyond mere verification time) in the beginning to get more proofs into that first verification right?

@emberian
Copy link
Contributor Author

Signing works produced (ie, proving possession of the private key) prevents attacks where malicious provers can pay computation to broadcast 0-fee proofs for arbitrary pubkeys.

@emberian
Copy link
Contributor Author

from izzy: zcash foundation have a similar problem, and end up with a similar solution? find it, compare it, link it.

@imeckler
Copy link
Member

imeckler commented Aug 6, 2020

Our concern is mostly DOS. One concern is it easy to circumvent batching by prover and accomplish the same DOS by never using the same prover public key twice right?

@nholland94
Copy link
Member

nholland94 commented Aug 13, 2020

Our concern is mostly DOS. One concern is it easy to circumvent batching by prover and accomplish the same DOS by never using the same prover public key twice right?

@imeckler Because this adds a new requirement that the snark worker public key has to be in the ledger, doing such an attack would involve paying the account creation fee for each attack. This should deter this kind of attack so long as we are able to reduce the cost of handling a bad proof from a prover, which we should be able to do correctly by tuning the amount of "good work" required to move over into the "trusted prover" state.

@deepthiskumar deepthiskumar mentioned this pull request Aug 21, 2020
5 tasks
@emberian emberian added ci-build-me Add this label to trigger a circle+buildkite build for this branch ready-to-merge Adding this label will trigger mergify and trigger CI to run and merge the PR labels Aug 21, 2020
@netlify
Copy link

netlify bot commented Aug 21, 2020

Preview:

Built with commit 94b42f2

https://deploy-preview-4882--o1website2.netlify.app

@emberian emberian closed this Apr 20, 2022
@emberian emberian deleted the rfc/snark-pool-batching branch February 15, 2024 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-build-me Add this label to trigger a circle+buildkite build for this branch consider-for-closing ready-to-merge Adding this label will trigger mergify and trigger CI to run and merge the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants