-
Notifications
You must be signed in to change notification settings - Fork 36.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Coinstats Index #19521
Coinstats Index #19521
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
if (gArgs.GetArg("-prune", 0)) { | ||
if (gArgs.GetBoolArg("-txindex", DEFAULT_TXINDEX)) | ||
return InitError(_("Prune mode is incompatible with -txindex.")); | ||
if (gArgs.GetBoolArg("-coinstatsindex", DEFAULT_COINSTATSINDEX)) | ||
return InitError(_("Prune mode is incompatible with -coinstatsindex.")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It syncs over the whole blockchain, processing every block and saving state on each. That allows users to query the stats for every block height (a nice-to-have feature). It's the standard way BaseIndex
currently works, although changing that would probably only require minimal effort. Potential users who I have talked to didn't really care much about this. But I will look into it if this again since progress is slow anyway currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a potential user, it would be nice if I can at least turn on pruning after the index is generated, but I think that's for a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is necessary to disallow the coin stats index when pruning is enabled. The only reason that txindex disallows it is because txindex gets the transaction data by reading it off disk from the block files, so that is incompatible with pruning.
The coin stats index doesn't have the same problem. The only issue would be if there were a large reorg and it needed the undo data that has already been deleted. However this is an issue overall of pruned nodes so it shouldn't be a concern for the coin stats index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reconcile total issuance with the spendable UTXO set, it would be helpful to sum up unspendable amounts.
struct DBVal { | ||
uint64_t transaction_output_count; | ||
uint64_t bogo_size; | ||
CAmount total_amount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the logic, total_amount
is the net increase of coins due to this block?
I don't want to bloat your stats index, but I'd be interested in adding total_prevout_spent_amount
, total_new_outputs_ex_coinbase_amount
, and coinbase_amount
to unpack total_amount
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is the total amount of coins in the UTXO set at that block height. So it's accumulative rather than for this specific block. I added the other values in a draft commit, needs some cleanup but ready to test.
@PierreRochard Thanks a lot for the comments! I am happy to include these values you requested and drafted that up in a new commit (will need another day for clean-up, docs, tests etc. but it should work already). I actually had done some similar work already for my blog post and was thinking about adding some more numbers myself, just opted to keep the changeset small at the end. The RPC now has the values for every block by passing a verbose flag
I just noticed when I was almost done, that I wasn't sure if you would like these requested values to be cumulative or per-block. They are per-block now but it is trivial to change that. Actually, I could even do both. And another questions: should the unspendable output values be included or excluded from the other output values. Example: And I saw your request for the data dump on twitter. Will look into that as a follow-up :) |
fdcc2ca
to
7139c17
Compare
Concept ACK Would be nice to be able to see the |
Concept ACK |
I find it easier to do the cumulative sum in pandas than to back in to the per-block, but the answer also depends on the performance of the underlying index for aggregate queries. If this behaves like a SQL database and we want it to be normalized, it should be per-block, but I'm not religious about it. |
7139c17
to
b40efde
Compare
@PierreRochard Thanks for testing and further feedback. As I have written the tests today and did further cleanup I have found that I hadn't been really doing what you were looking for in the coinbase_amount. I was simply returning the block subsidy there. I am pushing my fixes and test as a separate commit for now since you were already testing with the first version:
I think the formula in
I think the new code also fixes the BIP30 issue you discovered but I am still syncing the new version of the index with not-super-fast hardware so I will be able to check tomorrow. |
Yes, perfect, that equation is exactly right. I'll just delete the coinstats index file for now but does this indexing system have versioning/migrations? |
From testnet:
Looking good! |
Narrator: it lost, which is a good opportunity to test a reorg in real life. For height 679824 I now get 157ec719407b64f9204f464fcc203b04d288ff837a3ce5ca63edeb20a2614903. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review re-ACK 5f96d7d per git range-diff 13d27b4 07201d3 5f96d7d
{ | ||
{"hash_type", RPCArg::Type::STR, RPCArg::Default{"hash_serialized_2"}, "Which UTXO set hash should be calculated. Options: 'hash_serialized_2' (the legacy algorithm), 'muhash', 'none'."}, | ||
{"hash_or_height", RPCArg::Type::NUM, RPCArg::Optional::OMITTED, "The block hash or height of the target height (only available with coinstatsindex)", "", {"", "string or numeric"}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit if you have to rebase, this is the only help argument of the three that doesn't end with a period.
{"hash_or_height", RPCArg::Type::NUM, RPCArg::Optional::OMITTED, "The block hash or height of the target height (only available with coinstatsindex)", "", {"", "string or numeric"}}, | |
{"hash_or_height", RPCArg::Type::NUM, RPCArg::Optional::OMITTED, "The block hash or height of the target height (only available with coinstatsindex).", "", {"", "string or numeric"}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in #21818
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should error when hash_or_height
is set but index is not available or use_index=false
otherwise the result is misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -23,6 +23,7 @@ enum class CoinStatsHashType { | |||
|
|||
struct CCoinsStats | |||
{ | |||
CoinStatsHashType m_hash_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const CoinStatsHashType m_hash_type
and drop constructor.
stats.nHeight = Assert(pindex)->nHeight; | ||
|
||
if (!pindex) { | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unnecessary block.
@@ -140,6 +140,35 @@ static int ComputeNextBlockAndDepth(const CBlockIndex* tip, const CBlockIndex* b | |||
return blockindex == tip ? 1 : -1; | |||
} | |||
|
|||
CBlockIndex* ParseHashOrHeight(const UniValue& param, ChainstateManager& chainman) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, can be static
and move {
to new line.
There is already a check for that and it should error in the case you describe: https://github.com/bitcoin/bitcoin/pull/19521/files#diff-decae4be02fb8a47ab4557fe74a9cb853bdfa3ec0fa1b515c0a1e5de91f4ad0bR1168 If you found a scenario where it doesn't work, please let me know. There is at least a basic test for it here but maybe I have overlooked something: https://github.com/bitcoin/bitcoin/pull/19521/files#diff-a6434325d09a6df4b371513c837907dfc1a97cf540709def4bded9b2a17e3f49R115 |
Code review ACK 5f96d7d |
Namecoin tracks currency and name outputs separately in the UTXO stats. For the recent merge of the coinstats index in upstream bitcoin/bitcoin#19521, we modify the code now to keep this separate tracking also in the new index.
The coinstats index that was created upstream in bitcoin/bitcoin#19521 needs some updates for Xaya, since the genesis block reward is custom (the premine) and spendable.
for (size_t j = 0; j < tx->vout.size(); ++j) { | ||
const CTxOut& out{tx->vout[j]}; | ||
Coin coin{out, pindex->nHeight, tx->IsCoinBase()}; | ||
COutPoint outpoint{tx->GetHash(), static_cast<uint32_t>(j)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could just make j
of type uint32_t
to avoid the cast?
if (tx->IsCoinBase()) { | ||
m_block_coinbase_amount += coin.out.nValue; | ||
} else { | ||
m_block_new_outputs_ex_coinbase_amount += coin.out.nValue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this called m_block...
? The value is not per block, but the total, like all other CAmount members!? It might be good to call all of them m_block
, or m_total
, or just skip the prefix for all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yepp, it used to be per block, but the names were not changed when the values were changed to track totals.
m_muhash.Finalize(out); | ||
value.second.muhash = out; | ||
|
||
return m_db->Write(DBHeightKey(pindex->nHeight), value) && m_db->Write(DB_MUHASH, m_muhash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be a batch to ensure an atomic operation? Otherwise you may end up with a corrupt muhash?
} | ||
} | ||
|
||
if (BaseIndex::Init()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use early return if (!...) return false;
to avoid large multi-line nesting
|
||
uint256 expected_block_hash{pindex->pprev->GetBlockHash()}; | ||
if (read_out.first != expected_block_hash) { | ||
if (!m_db->Read(DBHashKey(expected_block_hash), read_out)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what circumstance would this recovery condition be hit? Also, shouldn't the error message mention that this condition failed as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could happen if there is a reorg that the index has not picked up on yet, so the hash returned from the height index is not on the active chain anymore. I added a warning logging and changed the error()
message to be more accurate.
|
||
uint256 expected_block_hash{pindex->pprev->GetBlockHash()}; | ||
if (read_out.first != expected_block_hash) { | ||
if (!m_db->Read(DBHashKey(expected_block_hash), read_out)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what circumstance would this recovery condition be hit? Also, shouldn't the error message mention that this condition failed as well?
if (args.GetArg("-prune", 0)) { | ||
if (args.GetBoolArg("-txindex", DEFAULT_TXINDEX)) | ||
return InitError(_("Prune mode is incompatible with -txindex.")); | ||
if (args.GetBoolArg("-coinstatsindex", DEFAULT_COINSTATSINDEX)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing {}
😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully removed soon anyway with #21726 :)
self.nodes[0].submitblock(ToHex(block)) | ||
self.sync_all() | ||
|
||
self.wait_until(lambda: not try_rpc(-32603, "Unable to read UTXO set", index_node.gettxoutsetinfo, 'muhash')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed? (sync_all should call syncwithvalidationinterface). Also, if it didn't, shouldn't the index RPC call syncwithvalidationinterface internally by itself, similar to how the wallet does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I missed the option to use the BlockUntilSynced...
function from BaseIndex.
{RPCResult::Type::OBJ, "block_info", "Info on amounts in the block at this block height (only available if coinstatsindex is used)", | ||
{ | ||
{RPCResult::Type::STR_AMOUNT, "prevout_spent", ""}, | ||
{RPCResult::Type::STR_AMOUNT, "coinbase", ""}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty string as documentation? Would be nice to explain this a bit better. Does it include unspendable outputs ...?
Thanks @MarcoFalke , all comments should be addressed in #22047 aside from the init arg brackets which should be made redundant with #21726. |
779e638 coinstats: Add comments for new coinstatsindex values (Fabian Jahr) 5b3d4e7 Index: Improve logging in coinstatsindex (Fabian Jahr) d4356d4 rpc: Block until synced if coinstatsindex is used in gettxoutsetinfo (Fabian Jahr) a5f6791 rpc: Add missing gettxoutsetinfo help docs (Fabian Jahr) 01386bf Index: Return early from failed coinstatsindex init (Fabian Jahr) 1e38423 index: Use batch writing in coinstatsindex WriteBlock (Fabian Jahr) fb65dde scripted-diff: Fix coinstats data member names (Fabian Jahr) 8ea8c92 index: Avoid unnecessary type casts in coinstatsindex (Fabian Jahr) Pull request description: This is a collection of smaller follow-ups to #19521, addressing several post-merge review comments. ACKs for top commit: Sjors: re-utACK 779e638 jonatack: re-ACK 779e638 diff since last review involves doc changes only; rebased to current master and verified clean debug build/no silent conflicts, unit tests, and feature_coinstatsindex functional test laanwj: Code review ACK 779e638 Talkless: re-utACK 779e638 after cosmetic changes. Tree-SHA512: cb0d038d230c582d7fe3041c89b1e04d39971fab3739d540c609cf826754c6c513b12ded08ac92180aec7a9d7a70114ece50357bd1a902de4adaae9f30b8d699
Summary: Division of MuHash objects are very expensive and multiplication relatively cheap. The whole idea of introducing and tracking numerator and denominators seperately as a representation of the internal state was so that divisions would be rare. So using divison in the Remove method did not make any sense and was just a silly mistake which is corrected here. This is a partial backport of [[bitcoin/bitcoin#19521 | core#19521]] bitcoin/bitcoin@2e2648a Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11168
This is part of the "UTXO Set Statistics" project in Core, which aims to make things like "checking coin supply" faster. This stuff, and related things in coinstats.cpp, have always been broken for Elements in the sense that we consider only explicit outputs and we ignore assets (so everything just gets added together to get a meaningless total). It probably wouldn't be too hard to restrict this to only consider policyAsset, but it's out of scope for a rebase IMO. Also, I think this situation is fine .. I don't understand the motivation for this or why Core is merging this when they refuse to merge an address index .. but I guess we'll see if there are users who care about this data and who care about it being meaningful on Elements. Also, apologies for the big diff -- there were some mechanical changes to deal with CT amounts, but most of the changes related to the difference in how fees are accounted for. While I'm not thrilled with this PR, its functional test is really good! So I think what I eventually came up with is internally consistent.
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [2/17] bitcoin/bitcoin@9c8a265 Depends on D11589 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Differential Revision: https://reviews.bitcoinabc.org/D11596
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [3/17] bitcoin/bitcoin@a8a46c4 Depends on D11596 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Differential Revision: https://reviews.bitcoinabc.org/D11597
Summary: Main commit bitcoin/bitcoin@dd58a4d > index: Add Coinstats index > > The index holds the values previously calculated in coinstats.cpp for each block, representing the state of the UTXO set at each height. ----- bitcoin/bitcoin@57a026c > test: Add unit test for Coinstats index ----- bitcoin/bitcoin@8ea8c92 > index: Avoid unnecessary type casts in coinstatsindex ----- Replace two database writes with a single atomic batch write, to avoid a risk of database corruption. See [[ bitcoin/bitcoin@dd58a4d#r634089179 | the corresponding review ]]. bitcoin/bitcoin@1e38423 > index: Use batch writing in coinstatsindex WriteBlock ----- This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [4 & 8/17] and [[bitcoin/bitcoin#22047 | core#22047]] [1 & 3/3] Depends on D11597 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Subscribers: sdulfari Differential Revision: https://reviews.bitcoinabc.org/D11598
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [5/17] bitcoin/bitcoin@3c914d5 partial bitcoin/bitcoin@6a4c0c0 The functional test only checks that "-coinstatsindex" argument does not break anything. The rest of the functional test from commit 6a4c0c09ab is not yet applicable and will be added in the next commit. This includes also minor documentation fixups from [[ bitcoin/bitcoin#21818 | core#21818]] Depends on D11598 and D11595 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11599
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [6 & 7/17] bitcoin/bitcoin@3f166ec bitcoin/bitcoin@6a4c0c0 (removal of 'disk_size' and 'transactions' from the index and test the RPC on specific heights) Notes: - in the source commits there is a call to `::ChainstateActive().ForceFlushStateToDisk();` added, but it is removed in a following commit in the same PR. This is obviously duplicated with the `active_chainstate.ForceFlushStateToDisk()` a few lines later, so I omitted it. - I added a mention about `disk_size` not being available when using the index in the doc of the RPC call. This is added later in the same PR in an unrelated commit in the source material. - the new `hash_or_height` RPC parameter needs to be added to the `CRPCConvertParam`table for the unit tests to pass. This is done in a later commit in the same PR in the source material Depends on D11599 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11600
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [9 & 10/17] bitcoin/bitcoin@ca01bb8 bitcoin/bitcoin@655d929 Depends on D11600 Test Plan: `ninja && test/functional/test_runner.py rpc_misc` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11604
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [11/17] and [[bitcoin/bitcoin#22047 | core#22047]] [2/3] bitcoin/bitcoin@2501576 bitcoin/bitcoin@fb65dde Depends on D11604 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11605
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [12/17] bitcoin/bitcoin@e0938c2 Note that there are minor differences in some amounts compared to the source commit due to differences in tx fees. Depends on D11605 Test Plan: `ninja check-functional` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11606
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [13/17] bitcoin/bitcoin@bb7788b Depends on D11606 Test Plan: `ninja check-functional` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11607
Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [14/17] bitcoin/bitcoin@b936239 Depends on D11607 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11608
… test for coinstatsindex behavior in reorgs Summary: This is a backport of [[bitcoin/bitcoin#19521 | core#19521]] [15 & 16/17] bitcoin/bitcoin@90c966b bitcoin/bitcoin@23fe504 Co-authored-by: Sjors Provoost <sjors@sprovoost.nl> Depends on D11608 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11609
Summary: This concludes backport of [[bitcoin/bitcoin#19521 | core#19521]] [17/17] bitcoin/bitcoin@5f96d7d Depends on D11609 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11610
This is part of the coinstats index project tracked in #18000
While the review of the new UTXO set hash algorithm (MuHash) takes longer recently #19328 was merged which added the possibility to run
gettxoutsetinfo
with a specific hash type. As the first type it addedhash_type=none
which skips the hashing of the UTXO set altogether. This alone did not makegettxoutsetinfo
much faster but it allows the use of an index for the remaining coin statistics even before a new hashing algorithm has been added. Credit to Sjors for the idea to take this intermediate step.Features summary:
-coinstatsindex
which syncs the index in the backgroundgettxoutsetinfo
withhash_type=none
orhash_type=muhash
and will get the response instantly out of the indexgettxoutsetinfo
to see coin statistics at a specific block height