Try to optimize inner-product proof #89

Open · hdevalence opened this issue May 8, 2018 · 11 comments
Labels: E-hard, P-low, T-optimization (Making things faster/smaller), T-research (Open, unsolved problems)

Comments

@hdevalence
Contributor

The inner-product proof spends lots of time in this computation https://github.com/chain/ristretto-bulletproofs/blob/main/src/inner_product_proof.rs#L102-L103 performing double-base scalar multiplications.

The verifier's side inlines all of the computation into a single check. This isn't possible for the prover, which has to produce the L, R points, but (looking quickly) it seems like it should be possible for the prover to only do one multiscalar mul per L and R, and drop all of the double-base scalar muls.

Possibly supersedes #80
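
For concreteness, here's a rough sketch (not the crate's actual code) of what "one multiscalar mul per L" could look like using the curve25519-dalek MultiscalarMul trait. The function compute_l and its argument names are made up for illustration; R would be handled symmetrically.

```rust
use curve25519_dalek::ristretto::RistrettoPoint;
use curve25519_dalek::scalar::Scalar;
use curve25519_dalek::traits::MultiscalarMul;

// Sketch only: build L = <a_L, G_R> + <b_R, H_L> + <a_L, b_R> * Q as a
// single multiscalar multiplication instead of a sequence of double-base
// scalar multiplications.
fn compute_l(
    a_l: &[Scalar],
    b_r: &[Scalar],
    g_r: &[RistrettoPoint],
    h_l: &[RistrettoPoint],
    q: &RistrettoPoint,
) -> RistrettoPoint {
    // c_L = <a_L, b_R>
    let c_l: Scalar = a_l.iter().zip(b_r).map(|(a, b)| a * b).sum();

    // Constant-time multiscalar mul, since a_L and b_R are secret.
    RistrettoPoint::multiscalar_mul(
        a_l.iter().chain(b_r).chain(std::iter::once(&c_l)),
        g_r.iter().chain(h_l).chain(std::iter::once(q)),
    )
}
```

Whether this actually wins depends on how the multiscalar backend scales with input size, which is essentially the question discussed below.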

hdevalence added the T-optimization (Making things faster/smaller), P-low, and E-medium labels on May 8, 2018
@oleganza
Collaborator

oleganza commented May 8, 2018

Seems like it's not the usual multiscalar mul (which produces one point), but a multi-point multiscalar mul. The fully general Pippenger algorithm covers that case (that is, a set of independent exponentiations can be computed faster with Pippenger's algorithm than by doing them one by one), but we have never needed or implemented that yet.

@oleganza
Collaborator

oleganza commented May 8, 2018

Disclaimer: I don't yet have any intuition for how that sequence of independent multiplications could be computed faster than doing one multiplication at a time. In this case the points are all different; only the scalars are the same.

@hdevalence
Contributor Author

What I was thinking of was to try collapsing the L, R computations into a single multiscalar multiplication each, in the same way that the verifier collapses them when performing verification: the verifier doesn't compute the final G, H points via the iterative process, but collapses everything into a single equation involving the original \vec G, \vec H.

However, this would mean that the prover computes 2k multiscalar multiplications of full size 2n+1 each, whereas currently the prover's work shrinks in the lower stages. For instance, here are the multiscalar multiplication sizes used by each strategy, as a {size: count} histogram:

current
{2: 252, 5: 2, 9: 2, 17: 2, 33: 2, 65: 2, 129: 2}
collapsed
{129: 12}

so this would actually end up doing more work.

The goal is to avoid having to compute all of the double-base scalar multiplications, but I think this idea doesn't quite work out.
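
As a back-of-the-envelope check of the "more work" conclusion (my arithmetic, not part of the original discussion), summing size × count over the two histograms above:

```rust
// Back-of-the-envelope check: total scalar-point terms implied by each
// size histogram above. This ignores Pippenger's sublinear savings on the
// larger multiscalar muls, so it is only a rough indicator.
fn main() {
    let current = [(2usize, 252usize), (5, 2), (9, 2), (17, 2), (33, 2), (65, 2), (129, 2)];
    let collapsed = [(129usize, 12usize)];

    let terms = |hist: &[(usize, usize)]| hist.iter().map(|&(size, count)| size * count).sum::<usize>();

    // Prints "current: 1020, collapsed: 1548": the collapsed strategy
    // touches roughly 1.5x more terms, consistent with "more work" above.
    println!("current: {}, collapsed: {}", terms(&current), terms(&collapsed));
}
```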

@oleganza
Collaborator

@hdevalence do you think this is still worth experimenting with, or should we close the issue?

@hdevalence
Contributor Author

There's no milestone attached, so we can leave this open and have it be a point of accumulation for ideas about the inner-product proof.

@hdevalence
Contributor Author

Benedikt mentioned that the libsecp implementation uses a hybrid strategy: it jumps a few steps at once (without shrinking), then cuts the sizes of the vectors. This could be worth exploring at some point in the future, although at the moment we're still faster on the IPP proof for other reasons.

hdevalence added the T-research (Open, unsolved problems) label on Dec 3, 2018
hdevalence changed the title from "Try to optimize inner-product proof by collapsing scalar multiplications" to "Try to optimize inner-product proof" on Dec 11, 2018
@hdevalence
Contributor Author

Another strategy would be to try to speed up the double-base scalar multiplications by vectorizing across points. This would require a change in the upstream curve library, and I'm not sure what kind of API would be generally useful (rather than useful only in this one weird protocol).
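
For what it's worth, one possible shape for such an API, purely hypothetical: nothing below exists in curve25519-dalek, and a real vectorized backend would replace the naive per-element loop.

```rust
use curve25519_dalek::ristretto::RistrettoPoint;
use curve25519_dalek::scalar::Scalar;
use curve25519_dalek::traits::MultiscalarMul;

/// Hypothetical batched double-base scalar multiplication:
/// out[i] = a[i] * P[i] + b[i] * Q[i] for each pair in the batch.
/// A SIMD backend could process the whole batch in shared vector lanes;
/// this reference version just loops and only pins down the signature.
pub fn batch_double_scalar_mul(
    scalars: &[(Scalar, Scalar)],
    points: &[(RistrettoPoint, RistrettoPoint)],
) -> Vec<RistrettoPoint> {
    scalars
        .iter()
        .zip(points)
        .map(|((a, b), (p, q))| RistrettoPoint::multiscalar_mul([a, b], [p, q]))
        .collect()
}
```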

@psivesely

I observed that you can halve the number of multiplications required to scale the witness and generator vectors in each round by computing a' = a_L + u * a_R and b' = b_L + u_inv * b_R (and similarly for G' and H'). There's a proof that soundness is preserved in our recent paper https://eprint.iacr.org/2019/1177.pdf. Specifically, we prove computational witness-extended emulatability of our generalized inner product argument (GIPA), which uses this optimization, and since Bulletproofs is an instantiation of GIPA, security follows.

Another thing to look at is choosing the challenges u to be sparse and/or complex (https://eprint.iacr.org/2005/276.pdf), which can significantly reduce the number of additions it takes to compute each multiplication. Easier to implement is choosing small exponents, i.e., truncating the RO output to 128 bits, though you don't get as much of a speedup as with sparse/complex exponents. The advantage of sparse exponents over small exponents grows with the gap between the security parameter and the scalar field size, so the advantage is greater for pairing-friendly curves. I can't say anything about complex exponents.

Note that if the verifier wishes to do an MSM, the final scalars will no longer be sparse/small, except perhaps for very small instance sizes. I suspect there exists a threshold instance size beyond which the O(n/log n) speedup of Pippenger's gives the verifier more than the constant speedup it would gain by forgoing the MSM and instead rescaling the generators using the sparse/small/complex exponents, as the prover must.
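
For reference, a minimal sketch of the halved fold using curve25519-dalek types; fold_round, the half-splitting convention, and the particular generator fold shown are my own illustration of the idea rather than code from this crate or the paper.

```rust
use curve25519_dalek::ristretto::RistrettoPoint;
use curve25519_dalek::scalar::Scalar;

// Sketch of the halved folding step: one scalar multiplication per output
// element instead of two. The [left half | right half] layout is assumed,
// and the generator fold below is one consistent choice.
fn fold_round(
    a: &[Scalar],
    b: &[Scalar],
    g: &[RistrettoPoint],
    h: &[RistrettoPoint],
    u: Scalar,
) -> (Vec<Scalar>, Vec<Scalar>, Vec<RistrettoPoint>, Vec<RistrettoPoint>) {
    let n = a.len() / 2;
    let u_inv = u.invert();

    // a' = a_L + u * a_R,     b' = b_L + u_inv * b_R
    let a_next: Vec<Scalar> = (0..n).map(|i| a[i] + u * a[n + i]).collect();
    let b_next: Vec<Scalar> = (0..n).map(|i| b[i] + u_inv * b[n + i]).collect();

    // G' = G_L + u_inv * G_R, H' = H_L + u * H_R, so that <a', G'> and
    // <b', H'> retain the <a_L, G_L> + <a_R, G_R> terms; the cross terms
    // are absorbed into the L, R points sent for this round.
    let g_next: Vec<RistrettoPoint> = (0..n).map(|i| g[i] + g[n + i] * u_inv).collect();
    let h_next: Vec<RistrettoPoint> = (0..n).map(|i| h[i] + h[n + i] * u).collect();

    (a_next, b_next, g_next, h_next)
}
```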

@kevaundray
Contributor

> [quotes @psivesely's comment above in full]

Does this differ from the improved inner product in https://eprint.iacr.org/2019/944.pdf?

@psivesely

It seems like they make the same observation w.r.t. halving the multiplications, but provide some different ideas on how to sample the challenges.

@daira

daira commented Mar 19, 2020

Another way to sample the challenges, for curves that support a cubic endomorphism, is given by Algorithm 1 in the Halo paper. See also the proof of correctness in Appendix C. It's optimized for use inside a circuit but might be useful outside one too (probably with projective coordinates instead of affine).
