New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build txindex in parallel with validation #13033
Conversation
If anyone wants to read the old PR and has difficulty loading the page, it helps to use another browser session that's logged out. Only comment of mine I'd like to export to this PR is the suggestion to backport "the migration code that removes txindex without requiring a reindex would be useful for folks who regret having set Tested 3272b17637c5, including |
Reverse chron. summary of ACKs from #11857:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested ACK 3272b17 (round 1).
doc/release-notes-pr11857.md
Outdated
@@ -0,0 +1,11 @@ | |||
Transaction index changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename file? PR changed.
src/index/txindex.cpp
Outdated
@@ -0,0 +1,308 @@ | |||
// Copyright (c) 2017 The Bitcoin Core developers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, 2018? (and 2 more files)
@@ -47,6 +48,8 @@ void TxToJSON(const CTransaction& tx, const uint256 hashBlock, UniValue& entry) | |||
TxToUniv(tx, uint256(), entry, true, RPCSerializationFlags()); | |||
|
|||
if (!hashBlock.IsNull()) { | |||
LOCK(cs_main); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to avoid locking and lookup here, how about adding the argument CBlockIndex* pindex
to TxToJSON
? (I know this is only called from getrawtransaction
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty against adding more to this PR. Seems like a good follow-on cleanup item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair.
src/index/txindex.h
Outdated
/// @param[out] block_hash The hash of the block the transaction is found in. | ||
/// @param[out] tx The raw transaction itself. | ||
/// @return true if transaction is found, false otherwise | ||
bool FindTx(const uint256& tx_hash, uint256& block_hash, CTransactionRef& tx) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, LookupTransaction
.
src/index/txindex.h
Outdated
/// up from far behind, this method does not block and immediately returns false. | ||
bool BlockUntilSyncedToCurrentChain(); | ||
|
||
/// Look up a raw transaction by hash. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop raw
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your wish is my command.
src/index/txindex.h
Outdated
/// | ||
/// @param[in] tx_hash The hash of the transaction to be returned. | ||
/// @param[out] block_hash The hash of the block the transaction is found in. | ||
/// @param[out] tx The raw transaction itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop raw?
// Need to register this ValidationInterface before running Init(), so that | ||
// callbacks are not missed if Init sets m_synced to true. | ||
RegisterValidationInterface(this); | ||
if (!Init()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this case, should exit instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking is that logging and running without the TxIndex is better than FatalError, but I could see it the other way too. Probably a good idea to add a log line at the very least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say running without the desired functionality should exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @promag. If the daemon can't provide the service that is requested, it should fail early. Otherwise you may just complicate debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to FatalError.
src/rpc/rawtransaction.cpp
Outdated
if (!coin.IsSpent()) { | ||
pblockindex = chainActive[coin.nHeight]; | ||
break; | ||
LOCK(cs_main); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of nesting, repeat the lock:
if (!request.params[1].isNull()) {
hashBlock = uint256S(request.params[1].get_str());
LOCK(cs_main);
...
} else {
LOCK(cs_main);
...
}
3272b17
to
40eabf6
Compare
src/index/txindex.cpp
Outdated
} | ||
|
||
if (last_locator_write_time + SYNC_LOCATOR_WRITE_INTERVAL < current_time) { | ||
WriteBestBlock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is called in the middle of ThreadSync()
, shouldn't this be writing a locator for pindex
rather than chainActive
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, wow, oops. Good catch.
40eabf6
to
8f1bfe1
Compare
src/index/txindex.cpp
Outdated
int64_t last_locator_write_time = 0; | ||
while (true) { | ||
if (m_interrupt) { | ||
WriteBestBlock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function takes an argument now.
8f1bfe1
to
70d1881
Compare
ec79b24
to
c6d4c88
Compare
The new TxIndexDB class will be used by a future commit in this change set.
The TxIndex will be responsible for building the transaction index concurrently with the main validation thread by implementing ValidationInterface. This does not process blocks concurrently yet.
TxIndex starts up a background thread to get in sync with the block index before blocks are processed through the ValidationInterface.
In order to preserve getrawtransaction RPC behavior, there needs to be a way for a thread to ensure the transaction index is in sync with the current state of the blockchain.
Now that the transaction index is updated asynchronously, in order to preserve the current behavior of public interfaces, the code blocks until the transaction index is caught up with the current state of the blockchain.
c6d4c88
to
9b27047
Compare
utACK 9b27047 Also compared with a re-merge of master with 806b2f1 (which had utACK from @ryanofsky and @Sjors), with a re-merge of master with 523dd76 (which had utACK from @TheBlueMatt), and with a re-merge of master with ea8be45 (which had Tested ACK from @jamesob) to find the only differences are:
|
9b27047 [doc] Include txindex changes in the release notes. (Jim Posen) ed77dd6 [test] Simple unit test for TxIndex. (Jim Posen) 6d772a3 [rpc] Public interfaces to GetTransaction block until synced. (Jim Posen) a03f804 [index] Move disk IO logic from GetTransaction to TxIndex::FindTx. (Jim Posen) e0a3b80 [validation] Replace tx index code in validation code with TxIndex. (Jim Posen) 8181db8 [init] Initialize and start TxIndex in init code. (Jim Posen) f90c3a6 [index] TxIndex method to wait until caught up. (Jim Posen) 70d510d [index] Allow TxIndex sync thread to be interrupted. (Jim Posen) 94b4f8b [index] TxIndex initial sync thread. (Jim Posen) 34d68bf [index] Create new TxIndex class. (Jim Posen) c88bcec [db] Migration for txindex data to new, separate database. (Jim Posen) 0cb8303 [db] Create separate database for txindex. (Jim Posen) Pull request description: I'm re-opening #11857 as a new pull request because the last one stopped loading for people ------------------------------- This refactors the tx index code to be in it's own class and get built concurrently with validation code. The main benefit is decoupling and moving the txindex into a separate DB. The primary motivation is to lay the groundwork for other indexers that might be desired (such as the [compact filters](bitcoin/bips#636)). The basic idea is that the TxIndex spins up its own thread, which first syncs the txindex to the current block index, then once in sync the BlockConnected ValidationInterface hook writes new blocks. ### DB changes At the suggestion of some other developers, the txindex has been split out into a separate database. A data migration runs at startup on any nodes with a legacy txindex. Currently the migration blocks node initialization until complete. ### Open questions - Should the migration of txindex data from the old DB to the new DB block in init or should it happen in a background thread? The downside to backgrounding it is that `getrawtransaction` would return an error message saying the txindex is syncing while the migration is running. ### Impact In a sample size n=1 test where I synced nodes from scratch, the average time [Index writing](https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L1903) was 3.36ms in master and 1.72ms in this branch. The average time between `UpdateTip` log lines for sequential blocks between 400,000 and IBD end on mainnet was 0.297204s in master and 0.286134s in this branch. Most likely this is just variance in IBD times, but I can try with some more trials if people want. Tree-SHA512: 451fd7d95df89dfafceaa723cdf0f7b137615b531cf5c5035cfb54e9ccc2026cec5ac85edbcf71b7f4e2f102e36e9202b8b3a667e1504a9e1a9976ab1f0079c4
🎉 nice work, @jimpo. |
9b27047 [doc] Include txindex changes in the release notes. (Jim Posen) ed77dd6 [test] Simple unit test for TxIndex. (Jim Posen) 6d772a3 [rpc] Public interfaces to GetTransaction block until synced. (Jim Posen) a03f804 [index] Move disk IO logic from GetTransaction to TxIndex::FindTx. (Jim Posen) e0a3b80 [validation] Replace tx index code in validation code with TxIndex. (Jim Posen) 8181db8 [init] Initialize and start TxIndex in init code. (Jim Posen) f90c3a6 [index] TxIndex method to wait until caught up. (Jim Posen) 70d510d [index] Allow TxIndex sync thread to be interrupted. (Jim Posen) 94b4f8b [index] TxIndex initial sync thread. (Jim Posen) 34d68bf [index] Create new TxIndex class. (Jim Posen) c88bcec [db] Migration for txindex data to new, separate database. (Jim Posen) 0cb8303 [db] Create separate database for txindex. (Jim Posen) Pull request description: I'm re-opening bitcoin#11857 as a new pull request because the last one stopped loading for people ------------------------------- This refactors the tx index code to be in it's own class and get built concurrently with validation code. The main benefit is decoupling and moving the txindex into a separate DB. The primary motivation is to lay the groundwork for other indexers that might be desired (such as the [compact filters](bitcoin/bips#636)). The basic idea is that the TxIndex spins up its own thread, which first syncs the txindex to the current block index, then once in sync the BlockConnected ValidationInterface hook writes new blocks. ### DB changes At the suggestion of some other developers, the txindex has been split out into a separate database. A data migration runs at startup on any nodes with a legacy txindex. Currently the migration blocks node initialization until complete. ### Open questions - Should the migration of txindex data from the old DB to the new DB block in init or should it happen in a background thread? The downside to backgrounding it is that `getrawtransaction` would return an error message saying the txindex is syncing while the migration is running. ### Impact In a sample size n=1 test where I synced nodes from scratch, the average time [Index writing](https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L1903) was 3.36ms in master and 1.72ms in this branch. The average time between `UpdateTip` log lines for sequential blocks between 400,000 and IBD end on mainnet was 0.297204s in master and 0.286134s in this branch. Most likely this is just variance in IBD times, but I can try with some more trials if people want. Tree-SHA512: 451fd7d95df89dfafceaa723cdf0f7b137615b531cf5c5035cfb54e9ccc2026cec5ac85edbcf71b7f4e2f102e36e9202b8b3a667e1504a9e1a9976ab1f0079c4
9b27047 [doc] Include txindex changes in the release notes. (Jim Posen) ed77dd6 [test] Simple unit test for TxIndex. (Jim Posen) 6d772a3 [rpc] Public interfaces to GetTransaction block until synced. (Jim Posen) a03f804 [index] Move disk IO logic from GetTransaction to TxIndex::FindTx. (Jim Posen) e0a3b80 [validation] Replace tx index code in validation code with TxIndex. (Jim Posen) 8181db8 [init] Initialize and start TxIndex in init code. (Jim Posen) f90c3a6 [index] TxIndex method to wait until caught up. (Jim Posen) 70d510d [index] Allow TxIndex sync thread to be interrupted. (Jim Posen) 94b4f8b [index] TxIndex initial sync thread. (Jim Posen) 34d68bf [index] Create new TxIndex class. (Jim Posen) c88bcec [db] Migration for txindex data to new, separate database. (Jim Posen) 0cb8303 [db] Create separate database for txindex. (Jim Posen) Pull request description: I'm re-opening bitcoin#11857 as a new pull request because the last one stopped loading for people ------------------------------- This refactors the tx index code to be in it's own class and get built concurrently with validation code. The main benefit is decoupling and moving the txindex into a separate DB. The primary motivation is to lay the groundwork for other indexers that might be desired (such as the [compact filters](bitcoin/bips#636)). The basic idea is that the TxIndex spins up its own thread, which first syncs the txindex to the current block index, then once in sync the BlockConnected ValidationInterface hook writes new blocks. ### DB changes At the suggestion of some other developers, the txindex has been split out into a separate database. A data migration runs at startup on any nodes with a legacy txindex. Currently the migration blocks node initialization until complete. ### Open questions - Should the migration of txindex data from the old DB to the new DB block in init or should it happen in a background thread? The downside to backgrounding it is that `getrawtransaction` would return an error message saying the txindex is syncing while the migration is running. ### Impact In a sample size n=1 test where I synced nodes from scratch, the average time [Index writing](https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L1903) was 3.36ms in master and 1.72ms in this branch. The average time between `UpdateTip` log lines for sequential blocks between 400,000 and IBD end on mainnet was 0.297204s in master and 0.286134s in this branch. Most likely this is just variance in IBD times, but I can try with some more trials if people want. Tree-SHA512: 451fd7d95df89dfafceaa723cdf0f7b137615b531cf5c5035cfb54e9ccc2026cec5ac85edbcf71b7f4e2f102e36e9202b8b3a667e1504a9e1a9976ab1f0079c4
9b27047 [doc] Include txindex changes in the release notes. (Jim Posen) ed77dd6 [test] Simple unit test for TxIndex. (Jim Posen) 6d772a3 [rpc] Public interfaces to GetTransaction block until synced. (Jim Posen) a03f804 [index] Move disk IO logic from GetTransaction to TxIndex::FindTx. (Jim Posen) e0a3b80 [validation] Replace tx index code in validation code with TxIndex. (Jim Posen) 8181db8 [init] Initialize and start TxIndex in init code. (Jim Posen) f90c3a6 [index] TxIndex method to wait until caught up. (Jim Posen) 70d510d [index] Allow TxIndex sync thread to be interrupted. (Jim Posen) 94b4f8b [index] TxIndex initial sync thread. (Jim Posen) 34d68bf [index] Create new TxIndex class. (Jim Posen) c88bcec [db] Migration for txindex data to new, separate database. (Jim Posen) 0cb8303 [db] Create separate database for txindex. (Jim Posen) Pull request description: I'm re-opening bitcoin#11857 as a new pull request because the last one stopped loading for people ------------------------------- This refactors the tx index code to be in it's own class and get built concurrently with validation code. The main benefit is decoupling and moving the txindex into a separate DB. The primary motivation is to lay the groundwork for other indexers that might be desired (such as the [compact filters](bitcoin/bips#636)). The basic idea is that the TxIndex spins up its own thread, which first syncs the txindex to the current block index, then once in sync the BlockConnected ValidationInterface hook writes new blocks. ### DB changes At the suggestion of some other developers, the txindex has been split out into a separate database. A data migration runs at startup on any nodes with a legacy txindex. Currently the migration blocks node initialization until complete. ### Open questions - Should the migration of txindex data from the old DB to the new DB block in init or should it happen in a background thread? The downside to backgrounding it is that `getrawtransaction` would return an error message saying the txindex is syncing while the migration is running. ### Impact In a sample size n=1 test where I synced nodes from scratch, the average time [Index writing](https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L1903) was 3.36ms in master and 1.72ms in this branch. The average time between `UpdateTip` log lines for sequential blocks between 400,000 and IBD end on mainnet was 0.297204s in master and 0.286134s in this branch. Most likely this is just variance in IBD times, but I can try with some more trials if people want. Tree-SHA512: 451fd7d95df89dfafceaa723cdf0f7b137615b531cf5c5035cfb54e9ccc2026cec5ac85edbcf71b7f4e2f102e36e9202b8b3a667e1504a9e1a9976ab1f0079c4
9b27047 [doc] Include txindex changes in the release notes. (Jim Posen) ed77dd6 [test] Simple unit test for TxIndex. (Jim Posen) 6d772a3 [rpc] Public interfaces to GetTransaction block until synced. (Jim Posen) a03f804 [index] Move disk IO logic from GetTransaction to TxIndex::FindTx. (Jim Posen) e0a3b80 [validation] Replace tx index code in validation code with TxIndex. (Jim Posen) 8181db8 [init] Initialize and start TxIndex in init code. (Jim Posen) f90c3a6 [index] TxIndex method to wait until caught up. (Jim Posen) 70d510d [index] Allow TxIndex sync thread to be interrupted. (Jim Posen) 94b4f8b [index] TxIndex initial sync thread. (Jim Posen) 34d68bf [index] Create new TxIndex class. (Jim Posen) c88bcec [db] Migration for txindex data to new, separate database. (Jim Posen) 0cb8303 [db] Create separate database for txindex. (Jim Posen) Pull request description: I'm re-opening bitcoin#11857 as a new pull request because the last one stopped loading for people ------------------------------- This refactors the tx index code to be in it's own class and get built concurrently with validation code. The main benefit is decoupling and moving the txindex into a separate DB. The primary motivation is to lay the groundwork for other indexers that might be desired (such as the [compact filters](bitcoin/bips#636)). The basic idea is that the TxIndex spins up its own thread, which first syncs the txindex to the current block index, then once in sync the BlockConnected ValidationInterface hook writes new blocks. ### DB changes At the suggestion of some other developers, the txindex has been split out into a separate database. A data migration runs at startup on any nodes with a legacy txindex. Currently the migration blocks node initialization until complete. ### Open questions - Should the migration of txindex data from the old DB to the new DB block in init or should it happen in a background thread? The downside to backgrounding it is that `getrawtransaction` would return an error message saying the txindex is syncing while the migration is running. ### Impact In a sample size n=1 test where I synced nodes from scratch, the average time [Index writing](https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L1903) was 3.36ms in master and 1.72ms in this branch. The average time between `UpdateTip` log lines for sequential blocks between 400,000 and IBD end on mainnet was 0.297204s in master and 0.286134s in this branch. Most likely this is just variance in IBD times, but I can try with some more trials if people want. Tree-SHA512: 451fd7d95df89dfafceaa723cdf0f7b137615b531cf5c5035cfb54e9ccc2026cec5ac85edbcf71b7f4e2f102e36e9202b8b3a667e1504a9e1a9976ab1f0079c4
Merge bitcoin#13033: Build txindex in parallel with validation
I'm re-opening #11857 as a new pull request because the last one stopped loading for people
This refactors the tx index code to be in it's own class and get built concurrently with validation code. The main benefit is decoupling and moving the txindex into a separate DB. The primary motivation is to lay the groundwork for other indexers that might be desired (such as the compact filters). The basic idea is that the TxIndex spins up its own thread, which first syncs the txindex to the current block index, then once in sync the BlockConnected ValidationInterface hook writes new blocks.
DB changes
At the suggestion of some other developers, the txindex has been split out into a separate database. A data migration runs at startup on any nodes with a legacy txindex. Currently the migration blocks node initialization until complete.
Open questions
getrawtransaction
would return an error message saying the txindex is syncing while the migration is running.Impact
In a sample size n=1 test where I synced nodes from scratch, the average time Index writing was 3.36ms in master and 1.72ms in this branch. The average time between
UpdateTip
log lines for sequential blocks between 400,000 and IBD end on mainnet was 0.297204s in master and 0.286134s in this branch. Most likely this is just variance in IBD times, but I can try with some more trials if people want.