Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2019 08 16 process header optimization #701

Conversation

Christewart
Copy link
Contributor

@Christewart Christewart commented Aug 20, 2019

This is part of #653

Previously, we were inserting block headers after every tip validation instead of batching them. The goal of this PR is to batch these headers to insert them into the database all at once, to reduce database latency in our tip validation process.

8ba9626 (current master)

[info] ScalaTest
[info] Run completed in 2 minutes.
[info] Total number of tests run: 27
[info] Suites: completed 8, aborted 0
[info] Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.

31d9e31

[info] ScalaTest
[info] Run completed in 2 minutes, 5 seconds.
[info] Total number of tests run: 27
[info] Suites: completed 8, aborted 0
[info] Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 27, Failed 0, Errors 0, Passed 27
[success] Total time: 128 s, completed Aug 20, 2019 9:16:11 AM

As the tests indicate, it seems like we have slightly regressed in at least running the test suite. I don't think this will be true when syncing the real chain though. Our chain tests sync insert roughly 10,000 headers, which is a relatively small sample size to a real chain.

It also looks like @nkohen found that the optimal insert size is not 2,000 -- which is the default number a node will send to us one the network at one time -- but has used batch sizes of 500 in our ChainUnitTest

@Christewart Christewart added the chain work on the chain project label Aug 20, 2019
…), this decouples 'BlockchainUpdate' and Blockchain.connectTip(). This was needed because it doesn't make sense to have a Vector[BlockHeaderDb] returned from Blockchain.connectTip() as we are validating one header. However, to execute batch inserts, we need to accumulate headers somewhere. I choose to do this in BlockchainUpdate with the 'successfulHeaders' field.
@Christewart Christewart force-pushed the 2019-08-16-process-header-optimization branch from 31d9e31 to 9c3ea41 Compare August 21, 2019 12:06
@Christewart
Copy link
Contributor Author

With #708 in master the results for running the test case that is most similar to header processing is this one.

1619d5d

sbt:bitcoin-s-chain-test> testOnly *ChainHandler* -- -z "be able to process and fetch real headers from mainnet"
[info] BitcoindChainHandlerViaZmqTest:
[info] BitcoindChainHandlerViaZmq
[info] ChainHandlerTest:
[info] ChainHandler
[info] - must be able to process and fetch real headers from mainnet
[info] ScalaTest
[info] Run completed in 41 seconds, 476 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 2, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 43 s, completed Aug 21, 2019 7:14:12 AM
sbt:bitcoin-s-chain-test> testOnly *ChainHandler* -- -z "be able to process and fetch real headers from mainnet"
[info] BitcoindChainHandlerViaZmqTest:
[info] BitcoindChainHandlerViaZmq
[info] ChainHandlerTest:
[info] ChainHandler
[info] - must be able to process and fetch real headers from mainnet
[info] ScalaTest
[info] Run completed in 37 seconds, 110 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 2, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 38 s, completed Aug 21, 2019 7:14:54 AM

and this branch (9c3ea41)

[info] BitcoindChainHandlerViaZmqTest:
[info] BitcoindChainHandlerViaZmq
[info] ChainHandlerTest:
[info] ChainHandler
[info] - must be able to process and fetch real headers from mainnet
[info] ScalaTest
[info] Run completed in 42 seconds, 312 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 2, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 67 s, completed Aug 21, 2019 7:16:30 AM
sbt:bitcoin-s-chain-test> testOnly *ChainHandler* -- -z "be able to process and fetch real headers from mainnet"
[info] BitcoindChainHandlerViaZmqTest:
[info] BitcoindChainHandlerViaZmq
[info] ChainHandlerTest:
[info] ChainHandler
[info] - must be able to process and fetch real headers from mainnet
[info] ScalaTest
[info] Run completed in 39 seconds, 206 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 2, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 40 s, completed Aug 21, 2019 7:17:22 AM

@Christewart
Copy link
Contributor Author

Christewart commented Aug 21, 2019

So I pushed up another commit (b206d8d) that creates a specific test case to call processHeaders() and then assert we have the best hash after processing headers. This cuts out a lot of extraneous stuff that was being timed in my other results for benchmarking this PR.

Here is current master with my new test case (1619d5d)

[info] ChainHandlerTest:
[info] ChainHandler
[info] - must benchmark ChainHandler.processHeaders()
[info] ScalaTest
[info] Run completed in 17 seconds, 26 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 19 s, completed Aug 21, 2019 7:55:21 AM
sbt:bitcoin-s-chain-test> testOnly *ChainHandlerTest* -- -z "benchmark ChainHandler.processHeaders()"
[info] ChainHandlerTest:
[info] ChainHandler
[info] - must benchmark ChainHandler.processHeaders()
[info] ScalaTest
[info] Run completed in 17 seconds, 629 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 19 s, completed Aug 21, 2019 7:55:47 AM

and here is the results from b206d8d

[info] ChainHandlerTest:
[info] ChainHandler
[info] - must benchmark ChainHandler.processHeaders()
[info] ScalaTest
[info] Run completed in 5 seconds, 751 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 27 s, completed Aug 21, 2019 7:57:16 AM
sbt:bitcoin-s-chain-test> testOnly *ChainHandlerTest* -- -z "benchmark ChainHandler.processHeaders()"
[info] ChainHandlerTest:
[info] ChainHandler
[info] - must benchmark ChainHandler.processHeaders()
[info] ScalaTest
[info] Run completed in 5 seconds, 650 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 7 s, completed Aug 21, 2019 7:57:27 AM

As you can see there is a pretty significant time reduction here from 17 seconds to 5 seconds.

Here is the method that was being called in previous measurements, and as you can see it doesn't even call processHeaders()

}
}
val headersToBeCreated = {
blockchainUpdates.map(_.successfulHeaders).flatten.distinct.toVector
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also map and flatten to flatMap

@Christewart Christewart requested a review from rorp August 22, 2019 18:17
Copy link
Contributor

@rorp rorp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Christewart Christewart merged commit ab170d0 into bitcoin-s:master Aug 23, 2019
Christewart added a commit that referenced this pull request May 1, 2021
* WIP: implement processHeaders() in ChainHandler instead of processHeader()

* Create ConnectTipResult which is returned from Blockchain.connectTip(), this decouples 'BlockchainUpdate' and Blockchain.connectTip(). This was needed because it doesn't make sense to have a Vector[BlockHeaderDb] returned from Blockchain.connectTip() as we are validating one header. However, to execute batch inserts, we need to accumulate headers somewhere. I choose to do this in BlockchainUpdate with the 'successfulHeaders' field.

* make chain-verification log level INFO

* Add unit test for benchmarking ChainHandler.processHeaders() that cuts out extraneous noise in another test case

* Address code review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chain work on the chain project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants