Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel script verification #2060

Merged
merged 5 commits into from
Jan 18, 2013
Merged

Parallel script verification #2060

merged 5 commits into from
Jan 18, 2013

Conversation

sipa
Copy link
Member

@sipa sipa commented Dec 2, 2012

  • During block verification (when parallelism is requested), script check actions are stored instead of being executed immediately.
  • After every processed transactions, its signature actions are pushed to a CScriptCheckQueue, which maintains a queue and some synchronization mechanism.
  • Two or more threads (if enabled) process elements from this queue, and, and signal the waiting block verification code when they are done.

As cs_main is held the entire time, and all verification must be finished before the block continues processing, this does not reach the best possible performance. It is a less drastic change than some more advanced mechanisms (like doing verification out-of-band entirely, and rolling back blocks when a failure is detected).

This feature is enabled though the -par=N flag.

Depends on #2058 and #2059.

@sipa
Copy link
Member Author

sipa commented Dec 2, 2012

Benchmark result: on my system (an i7-2670QM), a reindex of the first 210000 blocks, with script verification enabled everywhere, and -dbcache=900:

  • HEAD: 3h22m
  • -par=4: 1h14m

With -par=4, CPU usage is around 350% (though the first ~100000 blocks cause lower CPU usage)

@@ -579,6 +588,11 @@ bool AppInit2()
if (fDaemon)
fprintf(stdout, "Bitcoin server starting\n");

if (nScriptCheckThreads) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When -par=1 this would cause no thread to get spawned for verification and matches current behaviour?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If nScriptCheckThreads == 0, there is some special code that just runs the script validation inline, instead of pushing it to queues.

nScriptCheckThreads == 1 shouldn't ever happen - there's some code that turns it into 0 if set to 1.

If nScriptCheckThreads is higher, nScriptCheckThreads-1 actual separate threads are started. When the main block processing thread is done with its normal tasks, it joins the worker thread pool temporarily, becoming the N'th worker, so there are always N threads working.

@sipa
Copy link
Member Author

sipa commented Dec 4, 2012

  • cleaned up the code
  • moved the job queue implementation to checkqueue.h
  • added comments
  • enabled by default (-par=0 autodetects)

unsigned int nNow = 0;
bool fOk = true;
do {
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Small indentation glitch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so? Indentation is 4 spaces...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it's fine ... just looked weird because of the do { above.

@Diapolo
Copy link

Diapolo commented Dec 4, 2012

I love your comments, great work here. I still need to try out the code though :).

nScriptCheckThreads = boost::thread::hardware_concurrency();
if (nScriptCheckThreads <= 1)
nScriptCheckThreads = 0;
else if (nScriptCheckThreads > 64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this (arbitary?) limit of 64 a constant instead of a magic number.

@laanwj
Copy link
Member

laanwj commented Dec 6, 2012

Nice!

@sipa
Copy link
Member Author

sipa commented Dec 6, 2012

I've been doing some benchmark, and it seems the contention on the (single) lock protecting the queue makes the throughput and contention overhead go rather high when using too many threads. At least extrapolating from what I see on my system. more than 8 or 16 threads will probably cause significantly degraded performance. Switching to a per-thread queue is probably better, with jobs assigned in a round-robin way to them, or something more intelligent

That said, rebuilding the coindb from scratch (-dbcache=1000, -par=12, with #2061 and #2062, script checks only after block 193k) takes 13m51s on a hexacore E5-1650 @ 3.2Ghz)...

@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/8f706026e6dee8e38cca0d17acbfc75107d2dcba for binaries and test log.

@sipa
Copy link
Member Author

sipa commented Dec 8, 2012

Changes:

  • Access to the script check queue is now piped through a RAII CScriptCheckQueueControl, which guarantees the queue is fully processed before continuing
  • Print the number of threads used in debug.log
  • Don't store block validation results in signature cache (only mempool transactions are stored), but still use them. This allows multiple threads reading the cache simultaneously.

@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/5c713c9daa1128d407d9c483d1abae9bde6d48ad for binaries and test log.

@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/2f3ae3eebd979c1c4c7f43d9cfbe95f61db93ec6 for binaries and test log.

@gmaxwell
Copy link
Contributor

Just a comment on negative testing results:

I've been running loops of par inside valgrind on fuzzed blockchains with an instrumented copy of Bitcoin that disables most of the block validity tests (so that the fuzzing doesn't cause the chain to be rejected). In 1000 runs, no errors so far— but I did trigger invalid memory accesses after about 100 runs on this code prior to the RAII CScriptCheckQueueControl added in the last patch.

@sipa
Copy link
Member Author

sipa commented Dec 19, 2012

Given that any non-trivial code has at least one bug (see http://www.murphys-laws.com/murphy/murphy-computer.html), this is indeed bad news :(

* During block verification (when parallelism is requested), script
  check actions are stored instead of being executed immediately.
* After every processed transactions, its signature actions are
  pushed to a CScriptCheckQueue, which maintains a queue and some
  synchronization mechanism.
* Two or more threads (if enabled) start processing elements from
  this queue,
* When the block connection code is finished processing transactions,
  it joins the worker pool until the queue is empty.

As cs_main is held the entire time, and all verification must be
finished before the block continues processing, this does not reach
the best possible performance. It is a less drastic change than
some more advanced mechanisms (like doing verification out-of-band
entirely, and rolling back blocks when a failure is detected).

The -par=N flag controls the number of threads (1-16). 0 means auto,
and is the default.
Since block validation happens in parallel, multiple threads may be
accessing the signature cache simultaneously. To prevent contention:
* Turn the signature cache lock into a shared mutex
* Make reading from the cache only acquire a shared lock
* Let block validations not store their results in the cache
@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/ef0f422519de4a3ce47d923e5f8f90cd12349f3e for binaries and test log.

@gavinandresen
Copy link
Contributor

ACK.

Benchmark results on my mac, testing by doing a fresh sync of the -testnet blockchain pulled over the LAN:

Without this pull:
32-bit compile: 270 seconds
64-bit compile: 180 seconds

With this pull:
64-bit, 4-CPU : 125 seconds

gavinandresen added a commit that referenced this pull request Jan 18, 2013
Parallel script verification
@gavinandresen gavinandresen merged commit 0e31ae9 into bitcoin:master Jan 18, 2013
@sipa sipa deleted the parallel branch May 3, 2013 18:53
laudney pushed a commit to reddcoin-project/reddcoin-3.10 that referenced this pull request Mar 19, 2014
@practicalswift
Copy link
Contributor

@gmaxwell Is your instrumented copy that disables most of the block validity tests available on GitHub? I've thought about writing something similar myself to facilitate deeper fuzzing (my current fuzzing is quite shallow) so I'd be very interested in your version :-)

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants