-
Notifications
You must be signed in to change notification settings - Fork 37.4k
Switch chainstate db and cache to per-txout model #10195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2cc1901
to
5002396
Compare
This is awesome. |
e09c341
to
e82448a
Compare
Rebased on top of new #10148. |
Rebased without #10148 at popular request. |
Few overrides needed probably:
And few initializations:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. Briefly looked over most of the commits except the big one, which I only briefly skimmed, but am left wondering if there's an easier way to get that +650-881 commit in. At least the first 6 commits could be merged separately (I know the first two are already in another PR), but I'm not sure thats sufficient. Anyway, will circle back around to doing a full review as other things get merged.
test/functional/blockchain.py
Outdated
@@ -52,7 +52,6 @@ def _test_gettxoutsetinfo(self): | |||
assert_equal(res['txouts'], 200) | |||
assert_equal(res['bytes_serialized'], 13924), | |||
assert_equal(len(res['bestblock']), 64) | |||
assert_equal(len(res['hash_serialized']), 64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just switch to hash_serialized_2?
src/txdb.cpp
Outdated
boost::this_thread::interruption_point(); | ||
COldCoins oldcoins; | ||
std::pair<unsigned char, uint256> key; | ||
if (pcursor->GetValue(oldcoins) && pcursor->GetKey(key) && key.first == DB_COINS_OLD) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldnt the first two be an error and not a "done upgrading" result?
7f6f393
to
2d6c9ae
Compare
I'm splitting the commits up more. I've pushed one update already, but I'm working splitting the big commit further. |
482e15d
to
98902de
Compare
Done. There is still one commit that does many things at once ("Switch CCoinsView and chainstate db to per-txout"), but splitting it is nontrivial. If requested, I can try splitting the database format change into a second commit, but that would require adding a bunch of conversion logic in the first commit that just gets removed in the second one. Thoughts? |
a92599b
to
8da478e
Compare
src/undo.h
Outdated
template <typename Stream> | ||
void Unserialize(Stream& s) { | ||
// TODO: avoid reimplementing vector deserializer | ||
uint64_t count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
init to 0 here please.
Rebased and reset the author timestamps (so that GitHub shows them in logical order, $@#!*). |
5898279 scripted-diff: various renames for per-utxo consistency (Pieter Wuille) a5e02bc Increase travis unit test timeout (Pieter Wuille) 73de2c1 Rename CCoinsCacheEntry::coins to coin (Pieter Wuille) 119e552 Merge CCoinsViewCache's GetOutputFor and AccessCoin (Pieter Wuille) 580b023 [MOVEONLY] Move old CCoins class to txdb.cpp (Pieter Wuille) 8b25d2c Upgrade from per-tx database to per-txout (Pieter Wuille) b2af357 Reduce reserved memory space for flushing (Pieter Wuille) 41aa5b7 Pack Coin more tightly (Pieter Wuille) 97072d6 Remove unused CCoins methods (Pieter Wuille) ce23efa Extend coins_tests (Pieter Wuille) 5083079 Switch CCoinsView and chainstate db from per-txid to per-txout (Pieter Wuille) 4ec0d9e Refactor GetUTXOStats in preparation for per-COutPoint iteration (Pieter Wuille) 13870b5 Replace CCoins-based CTxMemPool::pruneSpent with isSpent (Pieter Wuille) 05293f3 Remove ModifyCoins/ModifyNewCoins (Pieter Wuille) 961e483 Switch tests from ModifyCoins to AddCoin/SpendCoin (Pieter Wuille) 8b3868c Switch CScriptCheck to use Coin instead of CCoins (Pieter Wuille) c87b957 Only pass things committed to by tx's witness hash to CScriptCheck (Matt Corallo) f68cdfe Switch from per-tx to per-txout CCoinsViewCache methods in some places (Pieter Wuille) 0003911 Introduce new per-txout CCoinsViewCache functions (Pieter Wuille) bd83111 Optimization: Coin&& to ApplyTxInUndo (Pieter Wuille) cb2c7fd Replace CTxInUndo with Coin (Pieter Wuille) 422634e Introduce Coin, a single unspent output (Pieter Wuille) 7d991b5 Store/allow tx metadata in all undo records (Pieter Wuille) c3aa0c1 Report on-disk size in gettxoutsetinfo (Pieter Wuille) d342424 Remove/ignore tx version in utxo and undo (Pieter Wuille) 7e00322 Add specialization of SipHash for 256 + 32 bit data (Pieter Wuille) e484652 Introduce CHashVerifier to hash read data (Pieter Wuille) f54580e error() in disconnect for disk corruption, not inconsistency (Pieter Wuille) e66dbde Add SizeEstimate to CDBBatch (Pieter Wuille) Tree-SHA512: ce1fb1e40c77d38915cd02189fab7a8b125c7f44d425c85579d872c3bede3a437760997907c99d7b3017ced1c2de54b2ac7223d99d83a6658fe5ef61edef1de3
5898279 scripted-diff: various renames for per-utxo consistency (Pieter Wuille) a5e02bc Increase travis unit test timeout (Pieter Wuille) 73de2c1 Rename CCoinsCacheEntry::coins to coin (Pieter Wuille) 119e552 Merge CCoinsViewCache's GetOutputFor and AccessCoin (Pieter Wuille) 580b023 [MOVEONLY] Move old CCoins class to txdb.cpp (Pieter Wuille) 8b25d2c Upgrade from per-tx database to per-txout (Pieter Wuille) b2af357 Reduce reserved memory space for flushing (Pieter Wuille) 41aa5b7 Pack Coin more tightly (Pieter Wuille) 97072d6 Remove unused CCoins methods (Pieter Wuille) ce23efa Extend coins_tests (Pieter Wuille) 5083079 Switch CCoinsView and chainstate db from per-txid to per-txout (Pieter Wuille) 4ec0d9e Refactor GetUTXOStats in preparation for per-COutPoint iteration (Pieter Wuille) 13870b5 Replace CCoins-based CTxMemPool::pruneSpent with isSpent (Pieter Wuille) 05293f3 Remove ModifyCoins/ModifyNewCoins (Pieter Wuille) 961e483 Switch tests from ModifyCoins to AddCoin/SpendCoin (Pieter Wuille) 8b3868c Switch CScriptCheck to use Coin instead of CCoins (Pieter Wuille) c87b957 Only pass things committed to by tx's witness hash to CScriptCheck (Matt Corallo) f68cdfe Switch from per-tx to per-txout CCoinsViewCache methods in some places (Pieter Wuille) 0003911 Introduce new per-txout CCoinsViewCache functions (Pieter Wuille) bd83111 Optimization: Coin&& to ApplyTxInUndo (Pieter Wuille) cb2c7fd Replace CTxInUndo with Coin (Pieter Wuille) 422634e Introduce Coin, a single unspent output (Pieter Wuille) 7d991b5 Store/allow tx metadata in all undo records (Pieter Wuille) c3aa0c1 Report on-disk size in gettxoutsetinfo (Pieter Wuille) d342424 Remove/ignore tx version in utxo and undo (Pieter Wuille) 7e00322 Add specialization of SipHash for 256 + 32 bit data (Pieter Wuille) e484652 Introduce CHashVerifier to hash read data (Pieter Wuille) f54580e error() in disconnect for disk corruption, not inconsistency (Pieter Wuille) e66dbde Add SizeEstimate to CDBBatch (Pieter Wuille) Tree-SHA512: ce1fb1e40c77d38915cd02189fab7a8b125c7f44d425c85579d872c3bede3a437760997907c99d7b3017ced1c2de54b2ac7223d99d83a6658fe5ef61edef1de3
…of tx 8627946 [RPC] trivial: gettxout no longer shows version of tx (Felix Weis) Pull request description: Since the switch to a per-txout chainstate db in bitcoin#10195, the tx version information is no longer stored. Updated `gettxout` rpc help text accordingly. Tree-SHA512: 3d7f42ef0f649056ece98bf22a1e972d1876324733adc81fa31bc2cd160550c5b6cb8682209fb8e8dbc56a8139ed5f5f0e740945f709039e69d52997ddbca7b8
…of tx 8627946 [RPC] trivial: gettxout no longer shows version of tx (Felix Weis) Pull request description: Since the switch to a per-txout chainstate db in bitcoin#10195, the tx version information is no longer stored. Updated `gettxout` rpc help text accordingly. Tree-SHA512: 3d7f42ef0f649056ece98bf22a1e972d1876324733adc81fa31bc2cd160550c5b6cb8682209fb8e8dbc56a8139ed5f5f0e740945f709039e69d52997ddbca7b8
…of tx 8627946 [RPC] trivial: gettxout no longer shows version of tx (Felix Weis) Pull request description: Since the switch to a per-txout chainstate db in bitcoin#10195, the tx version information is no longer stored. Updated `gettxout` rpc help text accordingly. Tree-SHA512: 3d7f42ef0f649056ece98bf22a1e972d1876324733adc81fa31bc2cd160550c5b6cb8682209fb8e8dbc56a8139ed5f5f0e740945f709039e69d52997ddbca7b8
…of tx 8627946 [RPC] trivial: gettxout no longer shows version of tx (Felix Weis) Pull request description: Since the switch to a per-txout chainstate db in bitcoin#10195, the tx version information is no longer stored. Updated `gettxout` rpc help text accordingly. Tree-SHA512: 3d7f42ef0f649056ece98bf22a1e972d1876324733adc81fa31bc2cd160550c5b6cb8682209fb8e8dbc56a8139ed5f5f0e740945f709039e69d52997ddbca7b8
21fa0a4 [docs] use consistent naming for possible_overwrite (John Newbery) 2685c21 [tests] small whitespace fixup (John Newbery) e993696 scripted-diff: Rename PRUNED to SPENT in coins tests (John Newbery) c205979 [docs] Improve commenting in coins.cpp|h (John Newbery) Pull request description: - Add full commenting for spentness / DIRTYness / FRESHness and which combinations are valid - Remove the 'pruned' terminology, which doesn't make sense since per-txout chainstate db was merged (#10195). - Rename `potential_overwrite` to `possible_overwrite` to standardize terminology (there were previously examples of both, which made searching the codebase difficult). - Make other minor improvements to the comments ACKs for top commit: jonatack: Re-ACK 21fa0a4 per `git diff 98bee55 21fa0a4` the only change since my previous review is the following code commenting diff in `src/coins.cpp::L177-179`; rebuilt/ran unit tests anyway as a sanity check on the unit test changes. Tree-SHA512: 391e01588ef5edb417250080cec17361f982c4454bc5f8c6d78bbd528c68a2bb94373297760691295c24660ce1022ad3ef7599762f736c8eed772ce096d38c3d
83512d2 [Trivial] Mark CCoinsView child classes methods as override (random-zebra) 5643075 Store/allow tx metadata in all undo records (random-zebra) 8c4add4 Report on-disk size in gettxoutsetinfo (random-zebra) 6cb2989 Remove/ignore tx version in utxo and undo (random-zebra) 03b9724 Add specialization of SipHash for 256 + 32 bit data (Pieter Wuille) e3cd496 Introduce CHashVerifier to hash read data (random-zebra) 10baa70 error() in disconnect for disk corruption, not inconsistency (random-zebra) 802f8c8 Add SizeEstimate to CDBBatch (Pieter Wuille) Pull request description: This backports the commits listed as "Preparation" in bitcoin#10195. The rest will be included in a followup PR. Built on top of: - [x] #1772 - [x] #1771 - [x] #1793 Here we: - add in-memory size estimation of a LevelDB batch - remove the usage of `error()` in ApplyTxInUndo/DisconnectBlock - Introduce the new CHashVerifier class to hash read data - Add a specialized version of SipHash for tuples of 256 bits and 32 bits data - Remove the transaction version from utxo and undo data structures - Add on-disk size of the utxo set to gettxoutsetinfo - store tx metadata in all undo records, instead of records corresponding only to the last output of a tx being spent. ACKs for top commit: furszy: looking good, utACK 83512d2. Fuzzbawls: utACK 83512d2 Tree-SHA512: 7db730058ae208535aad5f5989abb9c55fd3722b79efef24fdd8c717687b67272bddaeae5e5a4a2d3e97626a7ec62a5d1ebdb5af0233f4a950b5a0c0b154484f
535d8e4 scripted-diff: various renames for per-utxo consistency (random-zebra) 60c73ad Rename CCoinsCacheEntry::coins to coin (Pieter Wuille) 1166f1f Merge CCoinsViewCache's GetOutputFor and AccessCoin (Pieter Wuille) f25d3c6 [MOVEONLY] Move old CCoins class to txdb.cpp (random-zebra) aab17b3 Upgrade from per-tx database to per-txout (Pieter Wuille) a55eb98 Reduce reserved memory space for flushing (random-zebra) 7d637ea Remove unused CCoins methods (random-zebra) 4b7b1a3 Extend coins_tests (random-zebra) 8cf2dc0 Switch CCoinsView and chainstate db from per-txid to per-txout (random-zebra) b20a662 [Cleanup] Remove unused CCoinsViewCache::IsOutputAvailable (random-zebra) ea82855 Refactor GetUTXOStats in preparation for per-COutPoint iteration (Pieter Wuille) 0f9f2b6 [Core] Fix not-pruned-check in miner for zerocoin mint outputs (random-zebra) 3698d47 Replace CCoins-based CTxMemPool::pruneSpent with isSpent (Pieter Wuille) 11c728a Remove ModifyCoins/ModifyNewCoins (Pieter Wuille) 3c65936 Switch tests from ModifyCoins to AddCoin/SpendCoin (random-zebra) b05fe55 Switch CScriptCheck to use Coin instead of CCoins (random-zebra) 9e5c2ae Only pass things committed to by tx's witness hash to CScriptCheck (random-zebra) 22b0b4a [Cleanup] fix warning in comparisons with nCoinbaseMaturity (random-zebra) 666081d Switch from per-tx to per-txout CCoinsViewCache methods in some places (random-zebra) 53fc2be [Core] PIVX: remove potential_overwrite: BIP34 has always been enforced (random-zebra) 16924aa Introduce new per-txout CCoinsViewCache functions (random-zebra) 2d74bc1 Optimization: Coin&& to ApplyTxInUndo (random-zebra) 3fcb092 Replace CTxInUndo with Coin (random-zebra) 42a8996 [Core] PIVX: Add fCoinStake boolean field to Coin class (random-zebra) a01a9f2 Introduce Coin, a single unspent output (random-zebra) Pull request description: The chainstate database, and its cache, rely on a per-tx model (i.e. it is a map from txids to *lists* of unspent outputs). This PR chages it to a per-txout model (i.e. a map from outpoints to *single* unspent outputs). Pros: - Simpler code. - Avoiding the CPU overhead of deserializing and serializing the unused outputs. - More predictable memory usage. - More easily adaptable to various cache flushing strategies. Cons: - Slightly larger on-disk representation, and sometimes larger in-memory representation (when there are multiple outputs for the same txid in the cache, which becomes optional). Adapted from bitcoin#10195 (adding a `fCoinStake` boolean memeber to the Coin class and considering BIP34 always in effect as per #1775) Based on top of: - [x] #1799 - [x] #1800 - [x] #1797 - [x] #1796 - [x] #1795 - [x] #1793 - [x] #1773 The actual PR starts with "Introduce Coin, a single unspent output" (171bf0b)⚠️ **Note for testers**: Do a backup of the chainstate before testing. With this PR, at startup, the database is automatically upgraded to the new format (so a reindex is not needed). ACKs for top commit: furszy: Code review ACK 535d8e4. Need some live testing now. furszy: ACK 535d8e4 Fuzzbawls: ACK 535d8e4 Tree-SHA512: 8648d31c343ff901f8568c14d0848241c9b8aca0b9a8f89f3e5dbe02d384ba35faf572757e84f979fba1f198d721a2516d0e784b5e0dc8d02b6e308b4278b484
Summary: ``` They're doing the same thing now. ``` Partial backport of core [[bitcoin/bitcoin#10195 | PR10195]]: bitcoin/bitcoin@119e552 This is a leftover from the backport that started with D342. Test Plan: ninja all check-all Reviewers: #bitcoin_abc, PiRK Reviewed By: PiRK Differential Revision: https://reviews.bitcoinabc.org/D8254
21fa0a4 [docs] use consistent naming for possible_overwrite (John Newbery) 2685c21 [tests] small whitespace fixup (John Newbery) e993696 scripted-diff: Rename PRUNED to SPENT in coins tests (John Newbery) c205979 [docs] Improve commenting in coins.cpp|h (John Newbery) Pull request description: - Add full commenting for spentness / DIRTYness / FRESHness and which combinations are valid - Remove the 'pruned' terminology, which doesn't make sense since per-txout chainstate db was merged (bitcoin#10195). - Rename `potential_overwrite` to `possible_overwrite` to standardize terminology (there were previously examples of both, which made searching the codebase difficult). - Make other minor improvements to the comments ACKs for top commit: jonatack: Re-ACK 21fa0a4 per `git diff 98bee55 21fa0a4` the only change since my previous review is the following code commenting diff in `src/coins.cpp::L177-179`; rebuilt/ran unit tests anyway as a sanity check on the unit test changes. Tree-SHA512: 391e01588ef5edb417250080cec17361f982c4454bc5f8c6d78bbd528c68a2bb94373297760691295c24660ce1022ad3ef7599762f736c8eed772ce096d38c3d
21fa0a4 [docs] use consistent naming for possible_overwrite (John Newbery) 2685c21 [tests] small whitespace fixup (John Newbery) e993696 scripted-diff: Rename PRUNED to SPENT in coins tests (John Newbery) c205979 [docs] Improve commenting in coins.cpp|h (John Newbery) Pull request description: - Add full commenting for spentness / DIRTYness / FRESHness and which combinations are valid - Remove the 'pruned' terminology, which doesn't make sense since per-txout chainstate db was merged (bitcoin#10195). - Rename `potential_overwrite` to `possible_overwrite` to standardize terminology (there were previously examples of both, which made searching the codebase difficult). - Make other minor improvements to the comments ACKs for top commit: jonatack: Re-ACK 21fa0a4 per `git diff 98bee55 21fa0a4` the only change since my previous review is the following code commenting diff in `src/coins.cpp::L177-179`; rebuilt/ran unit tests anyway as a sanity check on the unit test changes. Tree-SHA512: 391e01588ef5edb417250080cec17361f982c4454bc5f8c6d78bbd528c68a2bb94373297760691295c24660ce1022ad3ef7599762f736c8eed772ce096d38c3d
21fa0a4 [docs] use consistent naming for possible_overwrite (John Newbery) 2685c21 [tests] small whitespace fixup (John Newbery) e993696 scripted-diff: Rename PRUNED to SPENT in coins tests (John Newbery) c205979 [docs] Improve commenting in coins.cpp|h (John Newbery) Pull request description: - Add full commenting for spentness / DIRTYness / FRESHness and which combinations are valid - Remove the 'pruned' terminology, which doesn't make sense since per-txout chainstate db was merged (bitcoin#10195). - Rename `potential_overwrite` to `possible_overwrite` to standardize terminology (there were previously examples of both, which made searching the codebase difficult). - Make other minor improvements to the comments ACKs for top commit: jonatack: Re-ACK 21fa0a4 per `git diff 98bee55 21fa0a4` the only change since my previous review is the following code commenting diff in `src/coins.cpp::L177-179`; rebuilt/ran unit tests anyway as a sanity check on the unit test changes. Tree-SHA512: 391e01588ef5edb417250080cec17361f982c4454bc5f8c6d78bbd528c68a2bb94373297760691295c24660ce1022ad3ef7599762f736c8eed772ce096d38c3d
21fa0a4 [docs] use consistent naming for possible_overwrite (John Newbery) 2685c21 [tests] small whitespace fixup (John Newbery) e993696 scripted-diff: Rename PRUNED to SPENT in coins tests (John Newbery) c205979 [docs] Improve commenting in coins.cpp|h (John Newbery) Pull request description: - Add full commenting for spentness / DIRTYness / FRESHness and which combinations are valid - Remove the 'pruned' terminology, which doesn't make sense since per-txout chainstate db was merged (bitcoin#10195). - Rename `potential_overwrite` to `possible_overwrite` to standardize terminology (there were previously examples of both, which made searching the codebase difficult). - Make other minor improvements to the comments ACKs for top commit: jonatack: Re-ACK 21fa0a4 per `git diff 98bee55 21fa0a4` the only change since my previous review is the following code commenting diff in `src/coins.cpp::L177-179`; rebuilt/ran unit tests anyway as a sanity check on the unit test changes. Tree-SHA512: 391e01588ef5edb417250080cec17361f982c4454bc5f8c6d78bbd528c68a2bb94373297760691295c24660ce1022ad3ef7599762f736c8eed772ce096d38c3d
Since Bitcoin Core v0.8, we've used a per-tx model for the chainstate database and its cache. This means that the database is effectively a map from txids to the list of unspent outputs for that transaction. This PR changes that to a map from outpoints ((txid,index) pairs) to just the individual unspent output for that outpoint.
The original reason for aggregating the outputs per transaction was to save space: this way, we could avoid duplicating the txid and transaction meta across the multiple outputs. However, LevelDB internally uses an encoding that omits repeated prefix bytes in keys, and because of that, duplicating the txids is not very significant.
There are many advantages for using a per-txout model:
Downsides:
This PR includes:
Tests:
Commit links:
Preparation:
Switch to COutPoint/Coin instead of uint256/CCoins:
Cleanup: