Hello -- I wasn't sure where to ask this question. It is quite likely a very silly one! I'd really appreciate clarification and help on it.

I was looking at the design in this article: https://eprint.iacr.org/2022/1396.pdf

I'm curious why the delay scheduler is used. Can't we send a new addition to the pipeline as soon as we have two operands (either points or the aggregation) corresponding to the same bucket ready?
At time $t$, we pull a reduced scalar $a_t$ and a point $P_t$. Assuming no collision so far, this is simply fed to the MixedAdd pipeline to compute $S_t \leftarrow S_t + P_t$. However, say the point $P_{t+1}$ pulled at $t+1$ collides with the same bucket; if I understand correctly, it has to wait until the aggregation $S_t$ is ready. Moreover, if we collide again at $t+2$, that point will need to wait until $S_{t+1}$ is ready.
However, at $t+2$ we already have two operands for the bucket ready (even though the aggregation $S_t$ is still in the pipeline). So we can still feed $P_{t+1}$ and $P_{t+2}$ to the adder to get $S'_{t+1,t+2} \leftarrow P_{t+1}+P_{t+2}$. Ultimately, when both $S_t$ and $S'_{t+1,t+2}$ are ready, these can be fed to the pipeline again: $S_{t+2} \leftarrow S_t + S'_{t+1,t+2}$. This can be extended to support a larger number of collisions as well, the idea being that as long as we have two operands ready, we can always dispatch them to the MixedAdd.
Overall, the delay in computation reduces from $3T$ to $2T+2$, where $T$ is the pipeline latency of the adder.
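The $3T$ vs. $2T+2$ figures can be checked with a toy timing model. This is just a sketch under assumed conditions (one point pulled per cycle, adder pipeline latency $T$, three colliding points at $t$, $t+1$, $t+2$); the function names are hypothetical, not from the paper:

```python
# Toy latency model for 3 colliding points to one bucket, with a
# pipelined adder of latency T and one point arriving per cycle.
# All names here are illustrative, not taken from the paper's design.

def sequential_delay(T, k=3):
    """Delay scheduler: each add is dispatched only once the previous
    bucket sum exits the pipeline, so k adds fully serialize."""
    done = 0
    for _ in range(k):
        done = done + T  # dispatched at `done`, finishes T cycles later
    return done

def tree_delay(T):
    """Greedy pairing for k = 3: S + P_t overlaps with P_{t+1} + P_{t+2}."""
    first = 0 + T       # S + P_t: dispatched at t = 0, done at T
    pair = 2 + T        # P_{t+1} + P_{t+2}: dispatched at t + 2, done at 2 + T
    # The final add starts once both operands are out of the pipeline.
    final = max(first, pair) + T
    return final

print(sequential_delay(100))  # 300 = 3T
print(tree_delay(100))        # 202 = 2T + 2
```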
In this approach we are doing the same number of operations. It will need additional resources, though: a few FIFOs, likely one per bucket(?).
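A minimal sketch of what such a scheduler could look like, assuming one FIFO of ready operands per bucket as suggested above. Everything here (class and method names, the job-record shape) is hypothetical and only illustrates the "dispatch whenever two operands are ready" rule, not the paper's actual design:

```python
# Hypothetical greedy bucket scheduler: pair ANY two ready operands
# (fresh points or completed sums) for the same bucket and dispatch them.
from collections import defaultdict, deque

class GreedyBucketScheduler:
    def __init__(self):
        self.ready = defaultdict(deque)    # per-bucket FIFO of ready operands
        self.in_flight = defaultdict(int)  # adds currently in the pipeline

    def push(self, bucket, operand):
        """Called for a freshly pulled point for this bucket."""
        self.ready[bucket].append(operand)

    def dispatch(self):
        """Emit (bucket, a, b) jobs for every bucket with >= 2 ready operands."""
        jobs = []
        for bucket, fifo in self.ready.items():
            while len(fifo) >= 2:
                a, b = fifo.popleft(), fifo.popleft()
                self.in_flight[bucket] += 1
                jobs.append((bucket, a, b))
        return jobs

    def result(self, bucket, value):
        """Pipeline output: the finished sum becomes a ready operand again."""
        self.in_flight[bucket] -= 1
        self.push(bucket, value)
```

For example, pushing three operands for one bucket lets the first pair dispatch immediately, and the returning sum pairs with the third operand as soon as it is back, matching the collision-tree idea in the question.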
I'd like to know whether there is some mathematical reason not to do it this way, or whether the resource requirements become too large to handle.
Thanks in advance for reading through it!
The issues are purely practical: how many resources do you need, and how much time does any extra operation take? Often it's also pretty hard to estimate without an actual experiment.
Even the check "is this point already in the current batch (and therefore needs to be added to the queue)?" needs to be implemented somehow, e.g. by setting a bit to 1 every time you add a point to the current batch... but an extra memory write is pretty expensive at this level of optimization.
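The bit-per-bucket check described here could be sketched as follows. This is only an illustration with hypothetical names; a real FPGA design would keep these bits in registers or BRAM, where the extra write mentioned above has a real cost:

```python
# One occupancy bit per bucket: set when an add for that bucket enters
# the pipeline, cleared when it completes. Hypothetical sketch only.
class OccupancyBits:
    def __init__(self, num_buckets):
        self.busy = [False] * num_buckets

    def try_issue(self, bucket):
        """Return True if the add may go straight to the pipeline;
        False means a collision, so the point must be queued/delayed."""
        if self.busy[bucket]:
            return False
        self.busy[bucket] = True  # the extra memory write noted above
        return True

    def retire(self, bucket):
        """Called when the pipelined add for this bucket completes."""
        self.busy[bucket] = False
```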
On FPGA we just implemented the simplest solution we could as the deadline for the Zprize was coming fast. It's certainly possible that someone will come up with an improved scheduler.