Parallel Validation #254

Merged: 24 commits, Mar 2, 2017

Conversation

ptschip
Collaborator

@ptschip ptschip commented Jan 20, 2017

opening PR on the dev branch

@ptschip ptschip force-pushed the dev_parallel branch 12 times, most recently from 80c2dab to 8feb1c5 Compare January 26, 2017 02:22
@ptschip ptschip force-pushed the dev_parallel branch 3 times, most recently from 26c5e53 to 63ca593 Compare January 31, 2017 02:10
Peter Tschipper and others added 3 commits January 31, 2017 10:49
- add 3 more script check queues
- Create semaphores to control and handle queue selection and throughput
- Update locking - we do not need to hold cs_main when checking inputs
- Python test for Parallel Validation

- Ability to Quit scriptcheck threads

When 4 blocks are running in parallel and a
new, smaller block shows up, we need to be able to interrupt the
script threads that are currently validating one of the blocks
so that we can free up a script check queue for the new, smaller block.
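
A minimal sketch of the interruption idea described above, not the code from this PR: each script check worker polls a shared quit flag between checks so that its queue can be freed early. The names `RunScriptChecks` and `CheckResult`, and the use of `std::atomic<bool>` as the flag, are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical outcome of running one batch of script checks.
enum class CheckResult { AllPassed, Failed, Interrupted };

// Runs the checks in order, polling `quit` before each one so that another
// thread can ask this worker to stop and give up its queue.
inline CheckResult RunScriptChecks(const std::vector<std::function<bool()>>& checks,
                                   const std::atomic<bool>& quit)
{
    for (const auto& check : checks) {
        if (quit.load()) return CheckResult::Interrupted; // asked to stop early
        if (!check()) return CheckResult::Failed;         // a script check failed
    }
    return CheckResult::AllPassed;
}
```

Setting the flag from the thread that receives the smaller block would make the worker return `Interrupted` at its next poll, which is distinct from a validation failure.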

- Update the nBlockSequenceId after the block has advanced the tip

This is important for Parallel Validation.  By decreasing the sequence id
we are indicating that this block represents the current pindexMostWork. This
prevents the losing block from having the pindexMostWork point to it rather
than to the winning block.

- Continuously check for new blocks to connect

With Parallel Validation, new blocks can be accepted while we are connecting at
the tip, so after connecting each block we need to check whether a chain with
more work now exists. If we don't, we can at times end up with a temporarily
unconnected block until the next block gets mined and another attempt is made
to connect the most-work chain.

- Terminate any PV threads during a re-org

PV can only operate on blocks that will advance the current chain tip (fork1). Therefore,
if we need to re-org to another chain (fork2), we have to kill any currently
running PV threads associated with the current chain tip on fork1. This solves
the problem of two forks being mined while an attack block is processing on
fork1. If fork2 continues to be mined and eventually pulls ahead of fork1 in
proof of work, a re-org to fork2 is initiated: the PV threads on fork1 are
terminated, the fork2 blocks are connected, and fork2 becomes the active chain tip.

- Use Chain Work instead of nHeight to self terminate a PV thread

If the Chain Work has changed, either up or down, from where it
was when we started the PV thread, then we exit the thread. Previously
we were using Chain Height, which worked fine, but Chain Work is more understandable
from a coding perspective. We also added a check for when
Chain Work has decreased from the starting point, which would indicate that
a re-org is underway.
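
The self-termination test can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `ChainWork` is a 64-bit stand-in for the node's 256-bit work value, `CompareChainWork` and `WorkChange` are hypothetical names, and the two-argument signature given to `ChainWorkHasChanged` here is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the 256-bit chain work value (assumption for brevity).
using ChainWork = std::uint64_t;

enum class WorkChange { Unchanged, Increased, Decreased };

// Classifies how the tip's chain work moved since the PV thread started.
inline WorkChange CompareChainWork(ChainWork atThreadStart, ChainWork now)
{
    if (now > atThreadStart) return WorkChange::Increased; // the tip advanced
    if (now < atThreadStart) return WorkChange::Decreased; // tip rolled back: re-org underway
    return WorkChange::Unchanged;
}

// The PV thread exits on any change, in either direction.
inline bool ChainWorkHasChanged(ChainWork atThreadStart, ChainWork now)
{
    return CompareChainWork(atThreadStart, now) != WorkChange::Unchanged;
}
```

Comparing work rather than height lets the same test cover both a tip that moved forward and a tip that was rolled back for a re-org.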

-Move ZMQ notifications to ActivateBestChainStep

We must notify ZMQ after each tip is connected rather
than after ActivateBestChain otherwise we can miss a block
when there are several to connect together at once.

This seems to particularly help with sync issues on regtest, where
we're mining many blocks all at the same time.
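
To illustrate why the notification has to sit inside the per-block step, here is a small sketch. `ConnectPendingBlocks` and the `notifyTip` callback are hypothetical names, and block hashes are plain strings for simplicity; none of this is taken from the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Connects each pending block and notifies right after that block becomes the
// tip, instead of notifying once at the end for only the final tip.
inline std::size_t ConnectPendingBlocks(
    const std::vector<std::string>& pendingBlockHashes,
    const std::function<void(const std::string&)>& notifyTip)
{
    std::size_t connected = 0;
    for (const auto& hash : pendingBlockHashes) {
        // ... the actual block connection would happen here ...
        ++connected;
        notifyTip(hash); // one notification per connected block
    }
    return connected;
}
```

With a single notification after the loop, a subscriber would only see the last hash when several blocks connect together.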

-Completely remove cs_main locks from checkinputs and sigs during PV

During the loop where we check inputs and signatures we can completely
remove the cs_main locks by adding an internal cs_main lock to
ChainWorkHasChanged(). This function is rarely invoked, and only when
the thread is about to exit, so it adds no meaningful overhead.

-Create SetLocks() used to set the locking order when returning from PV

If there is an error during PV then we have to make sure the locks
are set in the right order before returning.  cs_main must always
be locked when we return from ConnectBlock() as cs_main is recursive
but we must also ensure that the scriptlock is unlocked before
re-locking cs_main to avoid a potential deadlock.

By creating SetLocks() we can abstract away the setting of the locking
order and prevent any developer confusion or possible introduction of  errors
into the code if future changes are made in the ConnectBlock() function.
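
The locking order described above could look roughly like this. It is a sketch under assumptions: the real `SetLocks()` signature is not shown in this conversation, so `SetLocksSketch` and its parameters are hypothetical, with `std::mutex` standing in for the script lock and `std::recursive_mutex` for cs_main.

```cpp
#include <cassert>
#include <mutex>

// Restores the expected lock state before returning from block connection:
// release the script lock first, then make sure cs_main is held.
inline void SetLocksSketch(std::unique_lock<std::mutex>& scriptLock,
                           std::unique_lock<std::recursive_mutex>& mainLock)
{
    // Unlock the script lock before touching cs_main, so we never wait on
    // cs_main while still holding the script lock.
    if (scriptLock.owns_lock()) scriptLock.unlock();
    // The caller expects cs_main to be locked on return.
    if (!mainLock.owns_lock()) mainLock.lock();
}
```

Keeping both steps in one helper means every error path restores the same order, whatever state the locks were in when the error occurred.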

-Consolidate and simplify the script check thread pool creation

In the past the code for creating the script check threads and pools
was spread across several files; it is now consolidated in
parallel.cpp and parallel.h. It is also much easier to change
the number of script check queues by editing just two
lines in parallel.cpp.

-Add StartShutdown when closing QT Gui

StartShutdown(), which sets the shutdown flag to true, was never
getting called. This would rarely cause a problem in general, but in PV
it can become an issue fairly easily if a user wants
to shut down while many blocks are being connected during IBD.

-Correctly notify the UI of block tip changes

When many blocks are being connected during IBD it was
possible that the UI was not getting updated. This is a Core
bug which happened rarely, but in PV it can happen frequently.

-update IsChainNearlySyncdInit after each block is processed

-Only print out error message when state.Invalid() or state.IsError()

When ActivateBestChain fails during PV because a competing block was
beaten, we don't want to print an error because it really isn't an
error. The block validation was terminated, but the block will be stored
on disk for future use in the event of a re-org.

-Do not allow two thinblocks to be validating at the same time

- Update the UI if we are close to being syncd

Update the block Current Number of Blocks in the UI if we
are close to finishing our sync. This way when users turn off
their node for a while and turn it on they can see each block
being added as it finishes processing.

-Track whether the current block is actually validating inputs

We need to do this in case two copies of the same block arrive
and both launch validation threads. If this happens, then
one of them will begin checking inputs before the other and
we can set a flag which indicates that fact. Then when the other
begins checking inputs it can check whether the same
block is already validating and, if so, exit.
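
One way to picture the flag is a small registry keyed by block hash. This is an illustrative sketch only: the class `ValidatingBlocks`, its methods, and the use of string hashes are assumptions, and the PR may track this state quite differently.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Tracks which blocks currently have a thread checking their inputs.
class ValidatingBlocks
{
public:
    // Returns true if the caller is the first to start checking inputs for
    // this block; a second thread for the same block gets false and can exit.
    bool TryBegin(const std::string& blockHash)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_validating.insert(blockHash).second;
    }

    // Clears the flag once input checking for the block has finished.
    void End(const std::string& blockHash)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_validating.erase(blockHash);
    }

private:
    std::mutex m_mutex;
    std::set<std::string> m_validating;
};
```

A validation thread would call `TryBegin` just before it starts checking inputs and return early when it gets `false`.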

-Use nChainWork to determine whether we have any work to do

When a new block arrives we check whether the chain tip already matches
pindexMostWork. However, in PV pindexMostWork may not necessarily point
to the chain tip, so we use pindexMostWork->nChainWork to determine
whether an attempt should be made to connect this block.

Simplify ConnectBlock and remove unnecessary if(fParallel) statements

-Re-enable SENDHEADERS when Xthins is not on

In order to further remove our dependence on cs_main
and prepare the way for multithreaded transaction processing
we need a set of locks on the UTXO and the in-memory UTXO
cache.

Furthermore, since the reads are fast (once the UTXO is in
memory) there is no value in using a Read/Write lock system
as that would entail more overhead and reduce performance.
However, we do need recursive locks since several of the
methods can be invoked either by themselves or from within
other methods.
and a few comments removed and unnecessary code
@ptschip ptschip changed the title [WIP] Parallel Validation Parallel Validation Feb 1, 2017
@ptschip
Collaborator Author

ptschip commented Feb 1, 2017

@gandrewstone taking this out of WIP...I finally fixed the last problem I was having which was showing up on win32...

@sickpig
Collaborator

sickpig commented Feb 13, 2017

@gandrewstone @ptschip I wonder if you could add the consensus label to this PR

Peter Tschipper and others added 2 commits February 20, 2017 17:04
@sickpig
Collaborator

sickpig commented Feb 23, 2017

travis fails here: https://travis-ci.org/BitcoinUnlimited/BitcoinUnlimited/jobs/204568706#L1530

I've tried to reproduce on my local dev machine and I wasn't able to.

@gandrewstone / @AndrewClifford would you mind restarting Travis job 853.6, just to be sure that it is not an intermittent failure? (Click the "Restart job" button in the upper right corner.)

@ptschip
Collaborator Author

ptschip commented Feb 23, 2017 via email

@sickpig
Collaborator

sickpig commented Feb 23, 2017

@ptschip I've tried 4 times in a row; the test passed every time.

@ptschip
Collaborator Author

ptschip commented Feb 23, 2017 via email

@ptschip
Collaborator Author

ptschip commented Feb 23, 2017 via email

We may connect more than one block at a time and we must
announce all blocks connected.
This fixes a problem with ZMQ after the new atomic
IsInitialBlockDownload() was merged. We have to make sure the value
is updated after each block is found.
@gandrewstone
Collaborator

gandrewstone commented Feb 27, 2017

It is possible to use a CNode::thinBlock object after it's been freed, because CThinBlock::process calls PV.HandleBlockMessage(pfrom, strCommand, pfrom->thinBlock, GetInv()); (line 120), and the thinBlock is passed by reference. PV.Handle... then creates a separate thread. But what if the CNode disconnects while this block processing thread is executing?

You can prevent CNode cleanup-upon-disconnect by calling CNode::AddRef(). Then at the end of the processing thread, call CNode::Release() to allow the CNode to be cleaned up.

@ptschip
Collaborator Author

ptschip commented Feb 27, 2017 via email

When we process a block in PV we are launching a detached thread
which uses a reference to a CNode and therefore we need to add
a reference when running the thread and then release the reference
when the thread is finished.
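
The add-then-release pattern can be sketched with a small RAII guard. This is an assumption-laden illustration, not the PR's code: `NodeStub` is a stand-in for CNode that models only the reference count, and `NodeRefGuard` is a hypothetical helper. Only the `AddRef()`/`Release()` names come from the discussion above.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for CNode that models only the reference count.
struct NodeStub {
    std::atomic<int> nRefCount{0};
    void AddRef() { ++nRefCount; }
    void Release() { --nRefCount; }
};

// Holds a reference for as long as the guard is alive, so the node cannot be
// cleaned up while a processing thread still uses it.
class NodeRefGuard
{
public:
    explicit NodeRefGuard(NodeStub& node) : m_node(node) { m_node.AddRef(); }
    ~NodeRefGuard() { m_node.Release(); }

    // Copying would release the reference twice, so it is disabled.
    NodeRefGuard(const NodeRefGuard&) = delete;
    NodeRefGuard& operator=(const NodeRefGuard&) = delete;

private:
    NodeStub& m_node;
};
```

Declaring the guard as the first statement of the thread function releases the reference on every exit path, including early returns.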
@ptschip
Collaborator Author

ptschip commented Feb 27, 2017

@gandrewstone I added and released the ref at the beginning and end of HandleBlockMessageThread()

…on the chain tip. But the chain tip might have advanced from the time the block was created and the time the height was updated
@gandrewstone
Collaborator

PR issued to solve the bad coinbase height issue...

@gandrewstone gandrewstone merged commit e0749ce into BitcoinUnlimited:dev Mar 2, 2017
@ptschip ptschip deleted the dev_parallel branch March 2, 2017 17:15
4 participants