
Mixed Add Scheduler Question #2

Closed
ajx42 opened this issue Sep 25, 2023 · 2 comments

Comments


ajx42 commented Sep 25, 2023

Hello -- I wasn't sure where to ask this question; it's quite likely a very silly one! I'd really appreciate clarification and help on this!

I was looking at the design in this article: https://eprint.iacr.org/2022/1396.pdf

I'm curious why the delay scheduler is used. Can't we send a new addition to the pipeline as soon as two operands (either points or the aggregation) for the same bucket are ready?

At time $t$, we pull up a reduced scalar $a_t$ and a point $P_t$. Assuming no collision so far, this is simply fed to the MixedAdd pipeline to compute $S_t \leftarrow S_t + P_t$. In the meantime, however, say the point $P_{t+1}$ pulled at $t+1$ collides; if I understand correctly, it will have to wait until the aggregation $S_t$ is ready. Moreover, if we collide again at $t+2$, that point will need to wait until $S_{t+1}$ is ready.

However, at $t+2$ we already have two operands for the bucket ready (even though the aggregation $S_t$ is still in the pipeline). So we can still feed $P_{t+1}$ and $P_{t+2}$ to the adder to get $S'_{t+1,t+2} \leftarrow P_{t+1}+P_{t+2}$. Ultimately, when both $S_t$ and $S'_{t+1,t+2}$ are ready, they can be fed to the pipeline again: $S_{t+2} \leftarrow S_t + S'_{t+1,t+2}$. This extends to a larger number of collisions as well, the idea being that as long as we have two operands ready, we can always dispatch them for the MixedAdd.

Overall the delay in computation reduces from $3T$ to $2T+2$.

In this approach we are doing the same number of operations. It will need additional resources, though: a few FIFOs, likely one per bucket (?).
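For what it's worth, the pairing rule I have in mind can be simulated with a tiny sketch. This is purely illustrative and not the paper's design: the latency $T$, the greedy "pair the two earliest-ready operands" policy, and all names are my assumptions.

```python
import heapq

T = 5  # assumed MixedAdd pipeline latency in cycles, purely illustrative

def final_ready_cycle(operand_cycles, latency=T):
    """Cycle at which one bucket's aggregate is ready if any two ready
    operands (input points or in-flight partial sums) are paired eagerly.
    operand_cycles: cycles at which each operand becomes available."""
    heap = list(operand_cycles)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)  # two earliest-ready operands
        b = heapq.heappop(heap)
        # their sum enters the pipeline once both exist, and is itself
        # a new operand ready `latency` cycles later
        heapq.heappush(heap, max(a, b) + latency)
    return heap[0]

# Bucket hit at t, t+1, t+2 (old aggregate S and P_t both ready at t = 0):
# serial chaining finishes at 3T = 15, eager pairing at 2T + 2 = 12.
print(final_ready_cycle([0, 0, 1, 2]))  # -> 12
```

In this instance the greedy policy reproduces the schedule above: $S + P_t$ is dispatched at $t$, $P_{t+1}+P_{t+2}$ at $t+2$, and the final add finishes at $t + 2T + 2$.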

I'd like to know whether there is some mathematical reason not to do it this way, or whether the resource requirements become too large to handle.

Thanks in advance for reading through it!


0x0ece commented Sep 25, 2023

In short, there's no mathematical reason.

The issues are purely practical: how many resources do you need, and how much time does each extra operation take? Often it's also pretty hard to estimate without an actual experiment.

See also these PRs to gnark-crypto, I think they illustrate the problem pretty well:
Consensys/gnark-crypto#249
Consensys/gnark-crypto#261

Even the check "is this point already in the current batch? (and therefore needs to be added to the queue)" has to be implemented somehow, e.g. by setting a bit to 1 every time you add a point to the current batch... but an extra memory write is pretty expensive at this level of optimization.
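To make that bookkeeping concrete, here's a hypothetical sketch of the idea (names and structure are mine, not gnark-crypto's): one flag per bucket marks whether the current batch already touches it, and colliding points are requeued for a later batch.

```python
# Illustrative only: a per-bucket "already in batch" bit, with collisions
# deferred to the next batch. The flag write per accepted point is the
# extra memory write mentioned above.

def build_batches(points, n_buckets):
    """points: list of (bucket_id, point). Yields batches containing at
    most one addition per bucket; colliding points wait for a later batch."""
    queue = list(points)
    while queue:
        in_batch = [False] * n_buckets  # the per-bucket bit
        batch, deferred = [], []
        for bucket, p in queue:
            if in_batch[bucket]:
                deferred.append((bucket, p))  # collision: requeue
            else:
                in_batch[bucket] = True       # the extra write
                batch.append((bucket, p))
        yield batch
        queue = deferred

# Four points, two hitting bucket 0 twice more -> three batches:
for batch in build_batches([(0, 'a'), (1, 'b'), (0, 'c'), (0, 'd')], 2):
    print(batch)
```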

On the FPGA we just implemented the simplest solution we could, as the Zprize deadline was coming fast. It's certainly possible that someone will come up with an improved scheduler.

@ajx42
Copy link
Author

ajx42 commented Sep 26, 2023

Got it! Thanks for the prompt response, Emanuele!

@ajx42 ajx42 closed this as completed Nov 22, 2023