Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schnorr Batch Verification Interface #2

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

siv2r
Copy link
Owner

@siv2r siv2r commented May 16, 2022

This PR implements the proof of concept for the batch verification interface proposed here.

Todo

  • Create a new module for Batch Verification
    - todo: fix warning on adding ENABLE_MODULE_BATCH for batch APIs in schnorrsig, extrakeys header file (in include/)
  • better refactor for Strauss Batch
    - todo: implement tests
  • Add comprehensive tests
  • Add bencmarks
  • Rewrite commit history for easier review
  • Create PR on libsecp256k1 (post design questions along with it)

@siv2r
Copy link
Owner Author

siv2r commented May 23, 2022

There are two possible ways to design the build for the batch interface.

  1. new batch module
    • add function (present in schnorrsig module) need to be guarded by MODULE_BATCH flags
      • placing batch_add_* functions in new modules might make the usage difficult
      • since the new module will depend on the schnorrsig module (or just extrakeys??)
    • new module can be marked experimental -- (is it worth having experimental modules in the master branch?)
    • schnorrsig module is stabilized, so creating new modules might be a better approach
    • lower binary size
  2. no new module
    • batch_create, batch_destroy funcs in src/secp256k1.c
    • batch_add_* funcs in schnorrsig module
      • no guard required for it
    • easy to use for the user?
      • no extra params during ./configure

I have implemented the no new module build since it is less time-consuming, and the top priority is to get feedback on batch API design. I would be happy to change to new batch module build later (if that makes more sense).

include/secp256k1_schnorrsig.h Outdated Show resolved Hide resolved
@siv2r
Copy link
Owner Author

siv2r commented May 26, 2022

I wanted to place batch_context_add_schnorrsig, batch_context_add_tweak_checks, and batch_context_verify in schnorrsig/batch_add_impl.h. Since, adding them to schnorrsig/main_impl.h will bloat the file.

But that may not be possible since, the batch_context_add_* functions will use schnorrsig_challenge (to compute e), which is defined as static function.

@jonasnick
Copy link

That schnorrsig_schallenge is static shouldn't prevent usage from other files, but you'll probably need to add a declaration that's imported by batch_add_impl.h (see modules/musig/keyagg.h for example).

batch_context_verify doesn't seem to belong into the schnorrsig module though, but rather somewhere that allows using it even if the schnorrsig module isn't enabled.

@siv2r
Copy link
Owner Author

siv2r commented Jun 2, 2022

But that may not be possible since, the batch_context_add_* functions will use schnorrsig_challenge (to compute e), which is defined as static function.

This is not true. Even though schnorrsig_challenge is static, it is defined in a header file (not .c file) hence, it will work as long as schnorrsig/batch_impl.h is included after schnorrsig/main_impl.h in the final C file (i.e, src/secp256k1.c)

@siv2r
Copy link
Owner Author

siv2r commented Jun 2, 2022

The schnorrsig_challenge function call in the musig module works since the musig module is included after the schnorrsig module in src/secp256k1.c (code here).
Changing this order, I get the same error message (as in keep functions in schnorrsig/batch_add_impl.h):

src/modules/schnorrsig/main_impl.h:116:13: error: static declaration ofsecp256k1_schnorrsig_challengefollows non-static declaration
In file included from src/modules/musig/main_impl.h:11,
                 from src/secp256k1.c:818:
src/modules/musig/session_impl.h:457:5: note: previous implicit declaration ofsecp256k1_schnorrsig_challengewas here
  457 |     secp256k1_schnorrsig_challenge(&session_i.challenge, fin_nonce, msg32, 32, agg_pk32);

@siv2r
Copy link
Owner Author

siv2r commented Jun 14, 2022

I have added the Todo's to the description of this PR. A detailed version of it can be found here.

@siv2r siv2r force-pushed the batch-verify-interface branch 2 times, most recently from 8add7ce to 22342d4 Compare June 20, 2022 23:50
@siv2r
Copy link
Owner Author

siv2r commented Jun 21, 2022

The batch verification interface is now available as an optional batch module (experimental).

To enable all batch APIs and example:

Option 1:

./configure --enable-experimental --enable-module-batch --enable-module-schnorrsig --enable-examples
make

Option 2:

./configure --enable-dev-mode
make

Option 2 will enable all libsecp features (except external callbacks and test coverage)

@siv2r
Copy link
Owner Author

siv2r commented Jun 22, 2022

I would like to get a second opinion on the following questions.

Q: What happens if the user enables only the batch module? (i.e, ./configure --enable-module-batch --enable-experimental)

  • option1: enable the schnorrsig module by default
    • this will be very similar to what schnorrsig does for extrakeys
      • schnorrsig module enables extrakeys module even if the user doesn't specify extrakeys
    • Pros: enabling only the batch module is not useful
    • Cons: In the future, if the batch modules support a third type, enabling schnorrsig might not make any sense
  • option 2: show an error message to users (asking them to enable schnorrsig or extrakeys)
    • similar to the error msg shown when the user does not pass --enable-experimental
    • Cons: even though enabling just batch module is not useful (right now). It is not exactly wrong.
  • option 3: do nothing

I prefer option 3 since option 1 and option 2 assume the user might incorrectly build libsecp. We can easily avoid this through simple documentation.


Q: Where should the batch_add_* APIs be present?
This does not matter from the usage point of view. I ask this for better code readability.

  • option1: outside the batch module (in schnorrsig and extrakeys)
    • batch code is spread across different modules (difficult to understand?)
      • each modules that can support batch verification will have a batch_add_impl.h file
      • but the tests for these function will be present in modules/batch/tests_impl.h
    • to use the batch APIs user need to include
      • secp256k1_batch.h
      • secp256k1_schnorrsig.h
      • secp256k1_extrakeys.h
  • option2: inside the batch module
    • All batch code in a single place (easy to understand?)
    • #include <secp256k1_batch.h> will suffice for using all batch APIs

Choosing one option over another doesn't seem to make much difference.

Copy link

@jonasnick jonasnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: What happens if the user enables only the batch module? (i.e, ./configure --enable-module-batch --enable-experimental)

Agree with your analysis that option 3, do nothing, makes the most sense at this point.

Q: Where should the batch_add_* APIs be present?

  • but the tests for these function will be present in modules/batch/tests_impl.h

Why place the tests there? They could also be in the test files for extrakeys and schnorrsig which seems way more natural with option 1.

I prefer option 1 because then all the schnorr code is in the schnorr module. It's possible to see batching more as a compile time feature than a module. Then it makes sense in the schnorrsig module to test for that feature and if so we compile schnorrsig_batch_add. Moreover, most of the logic in that function is related to Schnorr sigs and not batching.

@siv2r
Copy link
Owner Author

siv2r commented Jul 2, 2022

It's possible to see batching more as a compile time feature than a module.

Ah, I see. Then, it makes more sense to have tests in the schnorrsig module.

Copy link

@jonasnick jonasnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created commit that enables batch in CI (untested): https://github.com/jonasnick/secp256k1/commits/batch-verify-interface-jn

@siv2r
Copy link
Owner Author

siv2r commented Jul 5, 2022

Created commit that enables batch in CI

Thanks! I have included them.

@siv2r siv2r force-pushed the batch-verify-interface branch 2 times, most recently from 41ad09e to e8ba116 Compare July 12, 2022 16:55
src/ecmult_impl.h Outdated Show resolved Hide resolved
@siv2r
Copy link
Owner Author

siv2r commented Jul 18, 2022

option 1: batch_verify does not clear the batch after computing the result (currently implemented)

Min, Avg, and Max are calculated over 10 samplings (for each func). iters is the number of signatures/tweaks used during each sampling.

  • batch created with max_terms = iters
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)    
 
schnorrsig_verify             ,   144.0       ,   148.0       ,   155.0    
schnorrsigs_batch_verify      ,   163.0       ,   272.0       ,   286.0
  • batch created with max_terms = 10*iters
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)    
  
schnorrsig_verify             ,   144.0       ,   147.0       ,   150.0    
schnorrsigs_batch_verify      ,   161.0       ,   716.0       ,  1283.0

here, schnorrsigs_batch_verify exec time is unusually large since none of the sigs added during previous samplings was cleared (large batch capacity). Hence, batch verifying i*iters sigs instead of iters sigs during ith sampling.


option 2: _batch_verify clears the batch after computing the result

  • batch created with max_terms = iters
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)    

schnorrsig_verify             ,   144.0       ,   146.0       ,   150.0    
schnorrsigs_batch_verify      ,   156.0       ,   159.0       ,   164.0
  • batch create with max_terms = 10*iters
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)    

schnorrsig_verify             ,   144.0       ,   145.0       ,   150.0    
schnorrsigs_batch_verify      ,   157.0       ,   158.0       ,   163.0

Observations:

  • both are worse than a single verify
  • both give similar performance irrespective of the batch capacity

@siv2r
Copy link
Owner Author

siv2r commented Jul 21, 2022

I have created a new branch (max-terms-threshold) to compare the batch verification performance under different max terms limits (currently set to 80).

I compared the batch verify execution time by varying the max_terms limit from 20 to 200. The performance is optimal in 44 to 80's (most optimal at 55). But the performance is comparable (~2 micro secs difference) to other threshold values. You can find my output file in this google drive folder. I am yet to generate graphs for these.

To generate the benchmark CSV file (for various max_terms limit) on your machine:

  1. Fetch my max-terms-threshold branch
  2. ./batch_threshold.sh <lower_limit> <upper_limit>

@siv2r
Copy link
Owner Author

siv2r commented Jul 22, 2022

Currently:

  1. Batch verifying Schnorr signatures is around 17% faster than single verification.
  2. Batch verifying tweaked pubkey checks is around 5% faster than single verification

@siv2r
Copy link
Owner Author

siv2r commented Jul 23, 2022

The following commits fixed the CI test fails:

  • cc83c32 - bitflip test had the incorrect size (33 instead of 32) for tweak, so it was flipping out of bound bits.
  • 42b00bd - batch_create leaked memory if ARG_CHECK(max_terms != 0) was hit during testing.
  • 47d3716 - adding limit to max_terms made it impossible to test ARG_CHECK(max_terms <= SIZE_MAX/2) condition.

I should avoid pushing new code if CI fails. It makes debugging troublesome at a later stage :'(

Copy link

@jonasnick jonasnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit that refactors the strauss batch code. This results in less code duplication, so I think this is a bit better than the existing code. Feel free to cherry-pick if you agree

https://github.com/jonasnick/secp256k1/commits/batch-verify-interface-jn.

@jonasnick
Copy link

Why would we do the check in batch_add_*? Isn't it sufficient to just do what we're doing right now in batch_create

    if (max_terms > STRAUSS_MAX_TERMS_PER_BATCH) {
        max_terms = STRAUSS_MAX_TERMS_PER_BATCH;
    }

but update STRAUSS_MAX_TERMS_PER_BATCH somthing between 103 and 110?

@siv2r
Copy link
Owner Author

siv2r commented Aug 12, 2022

Isn't it sufficient to just do what we're doing right now in batch_create

Yes, that should suffice. I got confused.

Currently, batch_create allocates space for 2*max_terms number of points on scratch. From the above graphs:

  • batch with only schnorrsig: Optimal STRAUSS_MAX_TERMS_PER_BATCH = 55 sigs (terms) = 110 points on scratch
  • batch with only tweak checks: Optimal STRAUSS_MAX_TERMS_PER_BATCH = 103 tweak checks (terms) = 103 points on scratch

If we make STRAUSS_MAX_TERMS_PER_BATCH = 106, the batch creates space for 212 points on scratch. This batch's capacity = 106 schnorrsig terms = 212 tweak check terms = 212 points. Doesn't 106 sigs (= 212 points) seem less optimal than 80 (= 160 points) sigs? (but there isn't too much performance drop, though)

eyeballing both the graphs, the optimal range is [40, 200] points = [20, 100] schnorrsig terms = [40, 200] tweak check terms. We could generalize this optimal range as [40, 100] terms (here, "term" can mean schnorrsig or tweak check). I put the upper limit at 100 since schnorrsig performance drops quickly after 100 sigs.

The optimal value of STRAUSS_MAX_TERMS_PER_BATCH will vary depending on the batch's ratio of sigs and tweak. If we give more importance to schnorrsig, we must choose the STRAUSS_MAX_TERMS_PER_BATCH value from the lower end of the optimal range [40, 100].

Sorry for introducing the confusing wording "max_term".

@jonasnick
Copy link

Now I'm confused. Term for me always meant a tuple of scalar and point. So "schnorrsig terms" and "tweak terms" doesn't make sense to me.

batch_create allocates space for 2*max_terms number of points on scratch

I think this is confusing indeed. We should only allocate max_terms and explain in the doc of batch_tweak_add that it "uses up" one term and in the doc of batch_schnorrsig_add that it uses up two terms.

the optimal range is [40, 200] points

Hm? The optimal range seems smaller in my eyes.

The optimal value of STRAUSS_MAX_TERMS_PER_BATCH will vary depending on the batch's ratio of sigs and tweak.

Not sure if this is significant or not just the result of random fluctuations in your tweak and schnorrsig benchmarks.

@siv2r
Copy link
Owner Author

siv2r commented Aug 12, 2022

explain in the doc of batch_tweak_add that it "uses up" one term and in the doc of batch_schnorrsig_add that it uses up two terms.

This clears up all the confusion. We can now use STRAUSS_MAX_TERMS_PER_BATCH = 106.
Will calling it "points" (instead of "terms") might make more sense from the developer's perspective? It might not make much difference to the user.

The optimal value of STRAUSS_MAX_TERMS_PER_BATCH will vary depending on the batch's ratio of sigs and tweak.

Not sure if this is significant or not just the result of random fluctuations in your tweak and schnorrsig benchmarks.

I said this because of my definition of "term" ("schnorrsig term" and "tweak check term") and 2*max_terms allocation. I am unable to articulate the exact reason here. I have to think this through. But your above suggestion fixes this since every "term" now is just a single tuple.

@jonasnick
Copy link

We can now use STRAUSS_MAX_TERMS_PER_BATCH = 106.

Ok, cool, I agree.

I think "terms" is slightly better because it's more neutral and may be more likely to not confuse users (who have some preconceived notions about what points are and how they are used, etc.).

@siv2r
Copy link
Owner Author

siv2r commented Aug 14, 2022

Currently, batch_create has the following checks:

/* Check that `max_terms` is less than half the maximum size_t value. This is necessary because
 * `batch_add_schnorrsig` appends two (scalar, point) pairs for one signature */
ARG_CHECK(max_terms <= SIZE_MAX / 2);
/* Check that max_terms is less than 2^31 to ensure the same behavior of this function on 32-bit
 * and 64-bit platforms. */
ARG_CHECK(max_terms < ((uint32_t)1 << 31));

Since we plan to allocate only the max_terms (instead of 2*max_terms) number of points on scratch, the first check will be removed.

Should the second check be changed to ARG_CHECK(max_terms <= UINT32_MAX);? why should batch_create have the same behavior on 32-bit and 64-bit machines? Isn't this limiting the 64-bit machine's power?

Although this is not significant since we have an internal STRAUSS_MAX_TERMS_PER_BATCH limit on max_terms.

@siv2r siv2r force-pushed the batch-verify-interface branch 2 times, most recently from 3b27149 to cee343c Compare August 14, 2022 12:42
@jonasnick
Copy link

I think with the ARG_CHECK(max_terms <= SIZE_MAX / 2); check removed, we can remove the second check as well. Otherwise, without the second check the function would accept up to 2^31 and up to 2^63 on the different architectures. Now it the full range on size_t on both platforms which one could somewhat count as "ensuring the same behavior".

@siv2r siv2r force-pushed the batch-verify-interface branch 2 times, most recently from 8e9204a to 3262ff4 Compare August 15, 2022 19:53
siv2r added a commit that referenced this pull request Aug 19, 2022
This commit generates graphs that visualize the batch verify speed up over
single verification (y-axis) wrt the number of signatures (or tweak checks)
in the batch. The input data points are taken from the batch verify benchmark.

GNU plot was used to generate these graphs. The instructions to reproduce these
graphs (on your local machine) are given in doc/speedup-batch.md

The value of `STRAUSS_MAX_TERMS_PER_BATCH` was calculated (approx) from
the generated graphs. Relevant discussion: #2 (comment)
This commit adds the foundational configuration, build scripts,
and an initial structure for experimental batch module.
This commit adds the _batch_create and _batch_destroy APIs.

Relevant Links:
1. _batch_scratch_size allocation formula is taken from bench ecmult:
https://github.com/bitcoin-core/secp256k1/blob/694ce8fb2d1fd8a3d641d7c33705691d41a2a860/src/bench_ecmult.c#L312.
2. aux_rand16 param in _batch_create enables synthetic randomness for
randomizer generation: sipa/bips#204.
This commit refactors _ecmult_strauss_batch and adds _batch_verify API.

The current _ecmult_strauss_batch only works on empty scratch space. To
make _batch_verify work, we need _ecmult_strauss_batch to support a scratch
space pre-filled with scalars and points. So, it was refactored to do
exactly that.

The _batch_verify API always uses the Strauss algorithm. It doesn't switch
to Pippenger (unlike _ecmult_multi_var). _ecmult_pippenger_batch represents
points as secp256k1_ge whereas _ecmult_strauss_batch represents points as
secp256k1_gej. This makes supporting both Pippenger and Strauss difficult
(at least with the current batch object design). Hence, _batch_verify only
supports Strauss for simplicity.
This commit adds the batch APIs:

1. _batch_add_schnorrsig
    Adds a Schnorr signature to the batch.

2. _batch_add_xonlypub_tweak_check
	Adds a tweaked x-only pubkey check to the batch.

3. _batch_usable
	Checks if a batch can be used by _batch_add_* APIs.

**Side Note:**
Exposing batch_add_schnorrsig in the `secp256k1_schnorrsig.h` header
file (with batch module header guards) will force the user to define
ENABLE_MODULE_BATCH during their code compilation. Hence, it is in a
standalone `secp256k1_schnorrsig_batch` header file. A similar argument
could be made for batch_add_xonlypub_tweak_check.
This commit adds an example C program using the batch API.

GNU Autotools will compile this example only if both batch and
schnorrsig modules are enabled.
siv2r added a commit that referenced this pull request Aug 20, 2022
This commit generates two semi-log graphs that visualize the batch
verification speed up over single verification (y-axis) wrt the number
of signatures (or tweak checks) in the batch (x-axis). The input data
points are taken from the batch verify benchmark.

GNU plot was used to generate these graphs (plot.gp file). The instructions
to reproduce these graphs (on your local machine) are given in
doc/speedup-batch.md file.

The value of `STRAUSS_MAX_TERMS_PER_BATCH` was calculated (approx) from
the generated graphs.
Relevant discussion: #2 (comment)
This commit adds the following tests:
	1. Cirrus scripts
	2. Batch API tests (ordered)
	3. Tagged SHA256 test
	4. BIP340 test vectors: https://github.com/bitcoin/bips/blob/master/bip-0340/test-vectors.csv
	5. Large random test for `strauss_batch` refactor
This commit adds the following tests:
	1. Random bitflip test for randomizer generating function
	2. Random bitflip in Schnorr Signature (_batch_add_schnorrsig test)
	3. NULL arg tests (for both _batch_add APIs)
siv2r added a commit that referenced this pull request Aug 20, 2022
This commit generates two semi-log graphs that visualize the batch
verification speed up over single verification (y-axis) wrt the number
of signatures (or tweak checks) in the batch (x-axis). The input data
points are taken from the batch verify benchmark.

GNU plot was used to generate these graphs (plot.gp file). The instructions
to reproduce these graphs (on your local machine) are given in
doc/speedup-batch.md file.

The value of `STRAUSS_MAX_TERMS_PER_BATCH` was calculated (approx) from
the generated graphs.
Relevant discussion: #2 (comment)
This commit adds benchmarks for Schnorr signature batch verification,Tweaked
pubkey check batch verification, and Tweaked pubkey check (single verification).

For batch verify benchmark, the number of sigs (or checks) in the batch
varies from 1 to SECP256K1_BENCH_ITERS with a 20% increment.
This commit generates two semi-log graphs that visualize the batch
verification speed up over single verification (y-axis) wrt the number
of signatures (or tweak checks) in the batch (x-axis). The input data
points are taken from the batch verify benchmark.

GNU plot was used to generate these graphs (plot.gp file). The instructions
to reproduce these graphs (on your local machine) are given in
doc/speedup-batch.md file.

The value of `STRAUSS_MAX_TERMS_PER_BATCH` was calculated (approx) from
the generated graphs.
Relevant discussion: #2 (comment)
fjahr pushed a commit to fjahr/secp256k1 that referenced this pull request Oct 9, 2023
This commit generates two semi-log graphs that visualize the batch
verification speed up over single verification (y-axis) wrt the number
of signatures (or tweak checks) in the batch (x-axis). The input data
points are taken from the batch verify benchmark.

GNU plot was used to generate these graphs (plot.gp file). The instructions
to reproduce these graphs (on your local machine) are given in
doc/speedup-batch.md file.

The value of `STRAUSS_MAX_TERMS_PER_BATCH` was calculated (approx) from
the generated graphs.
Relevant discussion: siv2r#2 (comment)
fjahr pushed a commit to fjahr/secp256k1 that referenced this pull request Oct 14, 2023
This commit generates two semi-log graphs that visualize the batch
verification speed up over single verification (y-axis) wrt the number
of signatures (or tweak checks) in the batch (x-axis). The input data
points are taken from the batch verify benchmark.

GNU plot was used to generate these graphs (plot.gp file). The instructions
to reproduce these graphs (on your local machine) are given in
doc/speedup-batch.md file.

The value of `STRAUSS_MAX_TERMS_PER_BATCH` was calculated (approx) from
the generated graphs.
Relevant discussion: siv2r#2 (comment)
fjahr pushed a commit to fjahr/secp256k1 that referenced this pull request Feb 24, 2024
This commit generates two semi-log graphs that visualize the batch
verification speed up over single verification (y-axis) wrt the number
of signatures (or tweak checks) in the batch (x-axis). The input data
points are taken from the batch verify benchmark.

GNU plot was used to generate these graphs (plot.gp file). The instructions
to reproduce these graphs (on your local machine) are given in
doc/speedup-batch.md file.

The value of `STRAUSS_MAX_TERMS_PER_BATCH` was calculated (approx) from
the generated graphs.
Relevant discussion: siv2r#2 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants