JeremyRubin
Contributor

@JeremyRubin JeremyRubin commented Jan 19, 2022

This is a mild validation improvement: it caches some signature data when a Taproot script fragment uses CHECKSIGADD multisignatures with SIGHASH_SINGLE. In basic testing I measured about a 0.6% speedup in block validation for a block with a lot of CHECKSIGADDs. That figure covers the whole of block validation, so the impact on the script interpreter alone should be somewhat larger once you subtract things like coin fetching. If desired, I can produce a more specific, shareable bench for this; the code I used to test was just monkey-patching the existing Taproot tests, since generating valid spends is kinda tricky. It's sort of an obvious win, so I'm not sure it needs a rigorous bench, but I will tinker on one while the code is being reviewed for correctness.

The overhead of this approach is that:

  1. ScriptExecutionData is no longer const
  2. around 32 bytes of extra stack space
  3. zero extra hashing since we only cache on first use
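
The trade-off listed above can be illustrated with a minimal sketch. It is not the code from this PR: it assumes the cached value is the hash of the output matching the input (BIP-341's `sha_single_output`), and `ExecData`, `single_output_hash`, and `count_hashes` are hypothetical names made up for illustration. The point is only that the per-input state must be mutable so the first signature check can fill in the cache, and that later checks on the same input do no further hashing.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecData:
    """Hypothetical stand-in for per-input script execution state."""
    output_hash: Optional[bytes] = None  # 32 bytes once filled in on first use
    hash_calls: int = 0                  # instrumentation for this sketch only


def single_output_hash(execdata: ExecData, serialized_output: bytes) -> bytes:
    """Return SHA256 of the input's matching output, hashing it at most once.

    The cache is per input, so the output being hashed never changes
    between calls that share the same ExecData.
    """
    if execdata.output_hash is None:
        execdata.output_hash = hashlib.sha256(serialized_output).digest()
        execdata.hash_calls += 1
    return execdata.output_hash


def count_hashes(num_sig_checks: int, serialized_output: bytes) -> int:
    """Simulate num_sig_checks signature checks on one input; return hashes done."""
    execdata = ExecData()
    for _ in range(num_sig_checks):
        single_output_hash(execdata, serialized_output)
    return execdata.hash_calls


# Arbitrary placeholder bytes: an 8-byte amount followed by a short script.
fake_output = (1000).to_bytes(8, "little") + b"\x01\x51"
print(count_hashes(5, fake_output))
```

With five simulated checks the output is hashed once rather than five times, and with a single check the cost is the same as without the cache.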

@DrahtBot
Contributor

DrahtBot commented Jan 20, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #22793 (Simplify BaseSignatureChecker virtual functions and GenericTransactionSignatureChecker constructors by achow101)
  • #21702 (Implement BIP-119 Validation (CheckTemplateVerify) by JeremyRubin)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jamesob
Contributor

jamesob commented Jan 20, 2022

Benchmarks are showing results consistent with a 0.5% speedup:

| bench name | command |
| --- | --- |
| ibd.local.range.500000.550000 | `bitcoind -dbcache=10000 -debug=coindb -debug=bench -listen=0 -connect=0 -addnode=127.0.0.1:8888 -prune=9999999 -printtoconsole=0 -assumevalid=0` |

#24105 vs. $mergebase (absolute)

| bench name | x | #24105 | $mergebase |
| --- | --- | --- | --- |
| ibd.local.range.500000.550000.total_secs | 2 | 6004.5920 (± 11.7900) | 6035.6871 (± 1.0297) |
| ibd.local.range.500000.550000.peak_rss_KiB | 2 | 7234538.0000 (± 3322.0000) | 7236830.0000 (± 250.0000) |
| ibd.local.range.500000.550000.cpu_kernel_secs | 2 | 357.3100 (± 3.3900) | 359.2450 (± 0.8750) |
| ibd.local.range.500000.550000.cpu_user_secs | 2 | 36706.6300 (± 8.2500) | 36801.1350 (± 10.6050) |

#24105 vs. $mergebase (relative)

| bench name | x | #24105 | $mergebase |
| --- | --- | --- | --- |
| ibd.local.range.500000.550000.total_secs | 2 | 1 | 1.005 |
| ibd.local.range.500000.550000.peak_rss_KiB | 2 | 1 | 1.000 |
| ibd.local.range.500000.550000.cpu_kernel_secs | 2 | 1 | 1.005 |
| ibd.local.range.500000.550000.cpu_user_secs | 2 | 1 | 1.003 |

Member

@jonatack jonatack left a comment

ACK cfa5752

@sipa
Member

sipa commented Jan 20, 2022

utACK cfa5752

@maflcko
Member

maflcko commented Jan 20, 2022

> ibd.local.range.500000.550000

wat?

How can this result in a speedup in syncing a part of the chain that has no taproot inputs at all? This code is almost dead during IBD, so shouldn't at all influence IBD time.

@maflcko
Member

maflcko commented Jan 20, 2022

If someone wants to write a "real" benchmark there is already src/bench/verify_script.cpp which can serve as a template.

@JeremyRubin
Contributor Author

@MarcoFalke I don't think that's a useful template, since what is needed to test this is a Taproot transaction.

I agree that the @jamesob result is suspect for the reason you mentioned.

@maflcko
Member

maflcko commented Jan 20, 2022

Yup, by "template" I meant that all you need to do is create the Taproot transaction. Obviously the benchmark doesn't bench Taproot yet, since it was written before Taproot.

@jamesob
Contributor

jamesob commented Jan 20, 2022

> ibd.local.range.500000.550000
>
> wat?
>
> How can this result in a speedup in syncing a part of the chain that has no taproot inputs at all? This code is almost dead during IBD, so shouldn't at all influence IBD time.

Yeah I just blindly ran

bitcoinperf bench-pr --num-blocks 50_000 \
  --run-id checksigadd-optimize \
  --bitcoind-args='-dbcache=10000 -assumevalid=0' \
  --run-count 2 24105

which anyone is capable of doing (provided they're willing to navigate the somewhat labyrinthine bitcoinperf setup process).

I had only cursorily skimmed the change when doing this; the standard deviation of the timing on this branch's IBD is ~0.2%, so there's a decent chance the result is just noise. But it's not like I cooked the benchmark or anything.
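
As a rough check on the "just noise" reading, the figures can be recomputed from the `total_secs` row of the absolute table above. This is only a sketch: it assumes the ± values are per-branch standard deviations and that the runs are independent, and the helper names are made up for illustration. With only two runs per branch it is weak evidence either way.

```python
import math

# (mean, spread) of total_secs, copied from the absolute table in this thread.
pr_mean, pr_spread = 6004.5920, 11.7900
base_mean, base_spread = 6035.6871, 1.0297


def relative_speedup(base: float, pr: float) -> float:
    """Fraction by which the mergebase run is slower than the PR run."""
    return base / pr - 1.0


def relative_spread(mean: float, spread: float) -> float:
    """Spread expressed as a fraction of the mean."""
    return spread / mean


def combined_spread(a: float, b: float) -> float:
    """Spread of a difference of two measurements, assuming independence."""
    return math.hypot(a, b)


speedup = relative_speedup(base_mean, pr_mean)
noise = relative_spread(pr_mean, pr_spread)
ratio = (base_mean - pr_mean) / combined_spread(pr_spread, base_spread)

print(f"speedup: {speedup:.2%}")
print(f"branch spread: {noise:.2%}")
print(f"difference / combined spread: {ratio:.1f}")
```

Under those assumptions the speedup comes out at about 0.52%, the branch's spread at about 0.20%, and the difference at roughly 2.6 combined spreads, which matches the numbers quoted here but rests on very few samples.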

@sipa
Member

sipa commented Jan 20, 2022

Sometimes even seemingly unrelated code changes can affect performance (e.g. by ordering the functions slightly differently in memory, resulting in different CPU instruction caching or alignment) in small but statistically significant ways.

@maflcko
Member

maflcko commented Jan 22, 2022

review ACK cfa5752

@JeremyRubin
Contributor Author

ready to merge?

Contributor

@theStack theStack left a comment

Code-review ACK cfa5752

nit: could adapt the PR description before merge, the "because of mutable field" part doesn't apply anymore

@JeremyRubin
Contributor Author

nit addressed

@maflcko
Member

maflcko commented Jan 24, 2022

Conceptually, the same optimization also works for BIP-143 hashing, so I am wondering if it is worth keeping the 143/341 symmetry, though this will likely make for a larger/uglier diff.
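
To make the symmetry concrete, here is a hedged sketch rather than a proposed patch. For SIGHASH_SINGLE, BIP-143 commits to a double SHA256 of the matching output, while BIP-341 commits to a single SHA256 of it, so the same cache-on-first-use pattern fits both. `OutputHashCache` and the two hasher functions are hypothetical names used only for illustration.

```python
import hashlib
from typing import Callable, Optional


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bip341_single_output_hash(serialized_output: bytes) -> bytes:
    # BIP-341: sha_single_output is one SHA256 over the matching output.
    return sha256(serialized_output)


def bip143_single_output_hash(serialized_output: bytes) -> bytes:
    # BIP-143: hashOutputs under SIGHASH_SINGLE is a double SHA256
    # over the matching output.
    return sha256(sha256(serialized_output))


class OutputHashCache:
    """Hypothetical per-input cache that works with either hashing rule."""

    def __init__(self, hasher: Callable[[bytes], bytes]) -> None:
        self._hasher = hasher
        self._cached: Optional[bytes] = None
        self.computations = 0  # instrumentation for this sketch only

    def get(self, serialized_output: bytes) -> bytes:
        # One cache per input, so the matching output is the same on every call.
        if self._cached is None:
            self._cached = self._hasher(serialized_output)
            self.computations += 1
        return self._cached


# Arbitrary placeholder bytes standing in for a serialized output.
fake_output = (1000).to_bytes(8, "little") + b"\x01\x51"
taproot_cache = OutputHashCache(bip341_single_output_hash)
segwit_v0_cache = OutputHashCache(bip143_single_output_hash)
for _ in range(3):
    taproot_cache.get(fake_output)
    segwit_v0_cache.get(fake_output)
print(taproot_cache.computations, segwit_v0_cache.computations)
```

Only the hash function differs between the two caches, which is the sense in which the optimization carries over.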

@JeremyRubin
Contributor Author

I think this is good to go as is.

The optimization can be added for BIP-143 later; I can draft a follow-up PR for that.

In any case, it would be separate commits because we want to preserve bisecting.

@maflcko maflcko merged commit bd482b3 into bitcoin:master Jan 25, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jan 28, 2022
cfa5752 Optimize CHECKSIGADD Script Validation (Jeremy Rubin)

ACKs for top commit:
  sipa:
    utACK cfa5752
  MarcoFalke:
    review ACK cfa5752
  jonatack:
    ACK cfa5752
  theStack:
    Code-review ACK cfa5752

Tree-SHA512: d5938773724bb9c97b6fd623ef7efdf7f522af52dc0903ecb88c38a518b628d7915b7eae6a774f7be653dc6bcd92e9abc4dd5e8b11f3a995e01e0102d2113d09
@bitcoin bitcoin locked and limited conversation to collaborators Jan 25, 2023