New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add scanblocks RPC call #20664
Add scanblocks RPC call #20664
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
When I call without a height
param it doesn't actually scan (documentation suggests otherwise).
I was able to find some blocks used by addresses in my own wallet, both receiving on and sending from. If anyone wants to ACK #19136 I'll test with a ranged descriptor too.
Sadly the RPC call timed out, so it didn't return the result. You may need to add an abstraction like for rescanwallet
where one can query rescans in progress, but also obtain the result later. That might be a lot of extra code though.
#20295 should also help with prune wallet rescans
Don't forget to drop Conflicts
from the second commit message.
Can you also add a test for passing a descriptor object (with range)?
Other RPC methods that involve descriptors require a checksum; we should probably do that here as well.
For a followup it would be nice if descriptors are automatically expanded, so we don't need to specify a range. This comes with some headaches, see #15845
The documentation should remind the user about false positive block matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested concept ACK
Naming-wise, scanobjects
, startheight
and filtertype
should probably be snakecase.
Seems like block filters are just a potential implementation detail/optimisation here, not part of the fundamental concept... Maybe rename it to |
Concept ACK |
I can do that. But the call doesn't scan blocks (hence your proposed name |
maybe call it |
I think |
Wouldn't it be inconsistent to have |
b245ecc
to
4e87eb7
Compare
4e87eb7
to
9d7ba40
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, not tested
c6f0c5d
to
f22a048
Compare
Overhauled this PR. Fixed all the reported issues. Added the same interface then Also added Some performance reports on an
Single sPUk:
|
Thanks for the update! Unfortunately unlike |
I think we can handle this the same way as I thought about adding dispatched threads. Let's just keep it identical to |
Having long running tasks that you can manage via a UUID does sound useful, but I agree it would be overkill in this PR. |
Github-Pull: bitcoin#20664 Rebased-From: f22a048
f22a048
to
ab315e5
Compare
Rebased |
}, | ||
"[scanobjects,...]"}, | ||
{"start_height", RPCArg::Type::NUM, /*default*/ "0", "height to start to filter from"}, | ||
{"stop_height", RPCArg::Type::NUM, /*default*/ "<tip>", "height to stop to scan"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above.
{ | ||
return RPCHelpMan{"scanblocks", | ||
"\nReturn relevant blockhashes for given descriptors.\n" | ||
"This call may take serval minutes. Make sure to use no RPC timeout (bitcoin-cli -rpcclienttimeout=0)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This call may take serval minutes. Make sure to use no RPC timeout (bitcoin-cli -rpcclienttimeout=0)", | |
"This call may take several minutes. Make sure to use no RPC timeout (bitcoin-cli -rpcclienttimeout=0)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #23549
|
||
// loop through the scan objects, add scripts to the needle_set | ||
GCSFilter::ElementSet needle_set; | ||
for (const UniValue& scanobject : request.params[1].get_array().getValues()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my information: This array seems to unbounded. The bound is probably some maximum size of RPC requests in general. Anyway, I wonder whether it is an issue or not. I guess not because nobody mentioned it here. Could anybody elaborate on this a bit, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only subject to authenticated RPC user input and isn't handled in previous RPC calls (e.g. scantxoutset
) so if someone wants to DoS their own node with a massive input array that seems okay to me.
std::vector<BlockFilter> filters; | ||
const CBlockIndex* start_block = block; // for progress reporting | ||
const CBlockIndex* last_scanned_block = block; | ||
g_scanfilter_should_abort_scan = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my information: Would it make sense to move this line to L2414 and to make the for-loop on L2417 abortable too? Or is it an overkill? I'm not sure how long the for loop can take for "many scanobjects".
If the for-loop can take a long time then this g_scanfilter_should_abort_scan = false;
assignment may mean that a user abort request might be ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is overkill IMO; even for a long list, EvalDescriptorStringOrObject
should be quite fast since it doesn't rely on any disk IO, unlike the calls below.
# Copyright (c) 2021 The Bitcoin Core developers | ||
# Distributed under the MIT software license, see the accompanying | ||
# file COPYING or http://www.opensource.org/licenses/mit-license.php. | ||
"""Test the scanblocks rpc call.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: What about:
"""Test the scanblocks rpc call.""" | |
"""Test the scanblocks RPC call.""" |
?
It looks like it is mostly written in this way in tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #23549
from test_framework.test_framework import BitcoinTestFramework | ||
from test_framework.util import assert_equal, assert_raises_rpc_error | ||
|
||
class scanblocksTest(BitcoinTestFramework): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
class scanblocksTest(BitcoinTestFramework): | |
class ScanblocksTest(BitcoinTestFramework): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #23549
{"stop_height", RPCArg::Type::NUM, /*default*/ "<tip>", "height to stop to scan"}, | ||
{"filtertype", RPCArg::Type::STR, /*default*/ "basic", "The type name of the filter"} | ||
}, | ||
RPCResult{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 6a69dd2 "Add scanblocks RPC call - scan for relevant blocks with descriptors"
This RPCResult
does not reflect the actual results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #23549
block = next; | ||
} | ||
ret.pushKV("from_height", start_block->nHeight); | ||
ret.pushKV("to_height", last_scanned_block->nHeight); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 6a69dd2 "Add scanblocks RPC call - scan for relevant blocks with descriptors"
I agree with @fjahr here. last_scanned_block
isn't actually the last block we have scanned, and if we abort, then to_height
does not accurately reflect where we have actually scanned up to. It would be better to move it up into the if
the guards the actual scanning. But in that case, it's the same as g_scanfilter_progress_height
, so it can just be removed.
self.nodes[0].sendtoaddress("mkS4HXoTYWRTescLGaUTGbtTTYX5EjJyEE", 1.0) # childkey 5 of tpubD6NzVbkrYhZ4WaWSyoBvQwbpLkojyoTZPRsgXELWz3Popb3qkjcJyJUGLnL4qHHoQvao8ESaAstxYSnhyswJ76uZPStJRJCTKvosUCJZL5B | ||
|
||
# mine a block and assure that the mined blockhash is in the filterresult | ||
blockhash = self.nodes[0].generate(1)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 71b7cdb "Add scanblock functional test"
Since the indexes are built asynchronously, it is possible that after generating the block, the block filter index has not updated to include the new block, so scanblocks might not find anything. Instead, we need to wait for it to become synced:
blockhash = self.nodes[0].generate(1)[0] | |
blockhash = self.nodes[0].generate(1)[0] | |
self.wait_until(lambda: all(i["synced"] for i in self.nodes[0].getindexinfo().values())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #23549
Concept ACK There have been various code reviews but no changes since months. @jonasschnelli: are you still working on this PR? (Just checking out before I start to code-review). |
Removed from high-prio for now, but can be added back any time. |
Compiled and tested Successfully 6a69dd2 at Ubuntu 18.04. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
Concept ACK. Would love to see this, or any other mechanism that allows performing wallet rescans in ~2min on good hardware, make it in. |
Rebased here: master...luke-jr:rpc_filterblocks |
Should this be marked "up for grabs" @jonasschnelli ? |
# also test the stop height | ||
assert(blockhash in self.nodes[0].scanblocks("start", ["addr("+addr_1+")"], self.nodes[0].getblockheader(blockhash)['height'], self.nodes[0].getblockheader(blockhash)['height'])['relevant_blocks']) | ||
|
||
# use the stop_height to exclude the relevent block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relevent ==> relevant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #23549
I haven't started review yet, but given @MarcoFalke has done some previous work on this, we should ensure that this change isn't subject to the bugs he points out here: #15845 (comment) |
For what it's worth, I'm working on resurrecting this PR. Hoping to have a branch pushed by end of day. |
Please see #23549 |
626b7c8 fuzz: add scanblocks as safe for fuzzing (James O'Beirne) 94fe545 test: rpc: add scanblocks functional test (Jonas Schnelli) 6ef2566 rpc: add scanblocks - scan for relevant blocks with descriptors (Jonas Schnelli) a4258f6 rpc: move-only: consolidate blockchain scan args (James O'Beirne) Pull request description: Revives #20664. All feedback from the previous PR has either been responded to inline or incorporated here. --- Major changes from Jonas' PR: - consolidated arguments for scantxoutset/scanblocks - substantial cleanup of the functional test Here's the range-diff (`git range-diff master jonasschnelli/2020/12/filterblocks_rpc jamesob/2021-11-scanblocks`): https://gist.github.com/jamesob/aa4a975344209f0316444b8de2ec1d18 ### Original PR description > The `scanblocks` RPC call allows one to get relevant blockhashes from a set of descriptors by scanning all blockfilters in a given range. > > **Example:** > > `scanblocks start '["addr(<bitcoin_address>)"]' 661000` (returns relevant blockhashes for `<bitcoin_address>` from blockrange 661000->tip) > > ## Why is this useful? > **Fast wallet rescans**: get the relevant blocks and only rescan those via `rescanblockchain getblockheader(<hash>)[height] getblockheader(<hash>)[height])`. A future PR may add an option to allow to provide an array of blockhashes to `rescanblockchain`. > > **prune wallet rescans**: (_needs additional changes_): together with a call to fetch blocks from the p2p network if they have been pruned, it would allow to rescan wallets back to the genesis block in pruned mode (relevant #15946). > > **SPV mode** (_needs additional changes_): it would be possible to build the blockfilterindex from the p2p network (rather then deriving them from the blocks) and thus allow some sort of hybrid-SPV mode with moderate bandwidth consumption (related #9483) ACKs for top commit: furszy: diff re-ACK 626b7c8 Tree-SHA512: f84e4dcb851b122b39e9700c58fbc31e899cdcf9b587df9505eaf1f45578cc4253e89ce2a45d1ff21bd213e31ddeedbbcad2c80810f46755b30acc17b07e2873
626b7c8 fuzz: add scanblocks as safe for fuzzing (James O'Beirne) 94fe545 test: rpc: add scanblocks functional test (Jonas Schnelli) 6ef2566 rpc: add scanblocks - scan for relevant blocks with descriptors (Jonas Schnelli) a4258f6 rpc: move-only: consolidate blockchain scan args (James O'Beirne) Pull request description: Revives bitcoin#20664. All feedback from the previous PR has either been responded to inline or incorporated here. --- Major changes from Jonas' PR: - consolidated arguments for scantxoutset/scanblocks - substantial cleanup of the functional test Here's the range-diff (`git range-diff master jonasschnelli/2020/12/filterblocks_rpc jamesob/2021-11-scanblocks`): https://gist.github.com/jamesob/aa4a975344209f0316444b8de2ec1d18 ### Original PR description > The `scanblocks` RPC call allows one to get relevant blockhashes from a set of descriptors by scanning all blockfilters in a given range. > > **Example:** > > `scanblocks start '["addr(<bitcoin_address>)"]' 661000` (returns relevant blockhashes for `<bitcoin_address>` from blockrange 661000->tip) > > ## Why is this useful? > **Fast wallet rescans**: get the relevant blocks and only rescan those via `rescanblockchain getblockheader(<hash>)[height] getblockheader(<hash>)[height])`. A future PR may add an option to allow to provide an array of blockhashes to `rescanblockchain`. > > **prune wallet rescans**: (_needs additional changes_): together with a call to fetch blocks from the p2p network if they have been pruned, it would allow to rescan wallets back to the genesis block in pruned mode (relevant bitcoin#15946). > > **SPV mode** (_needs additional changes_): it would be possible to build the blockfilterindex from the p2p network (rather then deriving them from the blocks) and thus allow some sort of hybrid-SPV mode with moderate bandwidth consumption (related bitcoin#9483) ACKs for top commit: furszy: diff re-ACK 626b7c8 Tree-SHA512: f84e4dcb851b122b39e9700c58fbc31e899cdcf9b587df9505eaf1f45578cc4253e89ce2a45d1ff21bd213e31ddeedbbcad2c80810f46755b30acc17b07e2873
The
scanblocks
RPC call allows one to get relevant blockhashes from a set of descriptors by scanning all blockfilters in a given range.Example:
scanblocks start '["addr(<bitcoin_address>)"]' 661000
(returns relevant blockhashes for
<bitcoin_address>
from blockrange 661000->tip)Why is this useful?
Fast wallet rescans: get the relevant blocks and only rescan those via
rescanblockchain getblockheader(<hash>)[height] getblockheader(<hash>)[height])
. A future PR may add an option to allow to provide an array of blockhashes torescanblockchain
.prune wallet rescans: (needs additional changes): together with a call to fetch blocks from the p2p network if they have been pruned, it would allow to rescan wallets back to the genesis block in pruned mode (relevant #15946).
SPV mode (needs additional changes): it would be possible to build the blockfilterindex from the p2p network (rather then deriving them from the blocks) and thus allow some sort of hybrid-SPV mode with moderate bandwidth consumption (related #9483)