rpc: getblockfrompeer, introduce fetch from "any" functionality #27836
Conversation
Branch force-pushed from 31918e0 to 31ae2d2.

Updated commit "…ted peer": Limited peers only store the last 288 blocks, not all historical blocks.
Approach ACK

Nice to have the option to call getblockfrompeer without explicitly passing a peer id. Often I just want the block from any peer, and the current implementation's random selection of a peer when no id is provided aligns with that.

Tests fail on master and pass on 31ae2d2.

I've left a few suggestions and a question.
```diff
@@ -1780,13 +1780,24 @@ std::optional<std::string> PeerManagerImpl::FetchBlock(NodeId peer_id, const CBl
 {
     if (m_chainman.m_blockman.LoadingBlocks()) return "Loading blocks ...";
 
+    // Only allow fetching from limited peers if the block height is within the allowed window
+    bool allow_limited_peers = false;
```
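The window check being introduced here can be illustrated with a small standalone sketch. The helper name is hypothetical; the 288-block constant corresponds to MIN_BLOCKS_TO_KEEP in Bitcoin Core, which is the depth a NODE_NETWORK_LIMITED peer is expected to serve.

```python
# Hypothetical sketch of the limited-peer height check discussed above.
# A NODE_NETWORK_LIMITED peer is only guaranteed to serve roughly the
# last 288 blocks (MIN_BLOCKS_TO_KEEP in Bitcoin Core).
MIN_BLOCKS_TO_KEEP = 288

def can_fetch_from_limited_peer(tip_height: int, block_height: int) -> bool:
    """Allow the fetch only if the block lies inside the retained window."""
    return block_height > tip_height - MIN_BLOCKS_TO_KEEP

print(can_fetch_from_limited_peer(1000, 900))  # True: within the last 288 blocks
print(can_fetch_from_limited_peer(1000, 100))  # False: historical block
```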
nit: feel free to ignore

```suggestion
bool allow_limited_peers{false};
```
```python
pruned_block_10 = self.nodes[0].getblockhash(10)
assert_raises_rpc_error(-1, "Cannot fetch old block from a limited peer", pruned_node.getblockfrompeer, pruned_block_10, limited_peer_id)
```
It would be nice to include a test specifically targeting the scenario of retrieving a block within the last 288 blocks from a network-limited peer.

The 'prune' option cannot be set below 550 MiB, so it is unclear whether it allows pruning all the way down to the last 288 blocks (that is, whether a node could end up missing some of the last 288 blocks).

Could we just exclude limited peers from fetching blocks altogether?
> The 'prune' option cannot be set below 550 MiB, so it is unclear whether it allows pruning all the way down to the last 288 blocks (that is, whether a node could end up missing some of the last 288 blocks).

Have you seen FindFilesToPrune? MIN_BLOCKS_TO_KEEP prevents pruning from deleting files that contain a block within the last 288 range.
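The invariant described here can be sketched in simplified form. The real FindFilesToPrune operates on whole block files rather than individual heights, and the helper name below is invented for illustration:

```python
# Simplified sketch: pruning never removes blocks within the last
# MIN_BLOCKS_TO_KEEP (288) blocks of the tip. The real FindFilesToPrune
# in Bitcoin Core works on whole block files, not individual heights.
MIN_BLOCKS_TO_KEEP = 288

def prunable_heights(tip_height, stored_heights):
    """Return the stored heights that pruning would be allowed to remove."""
    floor = tip_height - MIN_BLOCKS_TO_KEEP
    return [h for h in stored_heights if h <= floor]

print(prunable_heights(1000, [500, 712, 713, 900]))  # [500, 712]
```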
Oh yeah, the last MIN_BLOCKS_TO_KEEP blocks are protected from pruning, so all pruned nodes retain the last 288 blocks.

Limited peers should just be excluded from fetching blocks, unless there are any cases where the last 288 blocks might not be available?
> Limited peers should just be excluded from fetching blocks, unless there are any cases where the last 288 blocks might not be available?
Not yet. Basically, one of the features I have locally for the ongoing #21267 work is the ability to sync up the node on demand by doing getblockfrompeer requests instead of using the node's automatic sync process. This is later used for the filter-matching block requests too.

That said, no problem excluding limited peers for now. We can enable them later, when the new feature is introduced.
Branch force-pushed from 31ae2d2 to 94e6eea.

Updated commit "…tched": groundwork for introducing the fetch-block-from-"any"-peer functionality. If no 'peer_id' is provided, 'getblockfrompeer' will pick a random full node to fetch the block from.
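The behaviour that commit message describes — picking a random full-node peer when no id is given — might be sketched as follows. The peer representation and helper name are hypothetical, not the PR's actual code:

```python
import random

def pick_peer(peers, peer_id=None):
    """peers: dict mapping peer id -> {'limited': bool}.

    With an explicit peer_id, validate it; otherwise choose a random
    connected peer that does not signal limited service.
    """
    if peer_id is not None:
        if peer_id not in peers:
            raise ValueError("unknown peer")
        if peers[peer_id]["limited"]:
            raise ValueError("cannot fetch historical block from a limited peer")
        return peer_id
    candidates = [pid for pid, info in peers.items() if not info["limited"]]
    if not candidates:
        raise ValueError("no eligible full-node peers connected")
    return random.choice(candidates)
```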
I would much prefer an approach where this is handled by the existing background block fetches, because that one is aware of which peers have which blocks, and can load balance across this, re-issue on timeout, etc. It would make the functionality also usable for e.g. redownloading for wallet rescan after pruning or so.
Guess that you are talking about the random peer selection part? Like, you are expecting to have a set of "retry" blocks in net_processing, so …

Because that is implemented in the follow-up #27837, in a slightly different way, with a ….

The purpose of this PR was to limit the scope of the initial changes to ….

But.. thinking further: what if, instead of returning the initial peer id, we return the tracking request id from …? What do you think?

Side note about the tracker class: …
Yeah. That one, and rescan using compact block filters, are my long-term goals.
@furszy Thanks for giving some background, I do see better where this is coming from now. But yes, my thinking is that it's the wrong approach to assign block requests to random peers, for multiple reasons: …
And it seemed to me that tacking something on, without reusing the existing logic for downloading and tracking blocks, was both unnecessary and inferior.

However, thinking about it more, I realize there is perhaps something I missed, if that's an intended use case: downloading blocks we don't have a block header for. All the existing logic in FindNextBlockToDownload, the reasoning about which peers have what, and the timeout logic/tracking around block downloading in general is based on identifying blocks using their entry in the block index. For downloading blocks just by hash, very little of that is reusable. So perhaps the right (eventual) approach is actually treating these two separately: redownloading of known blocks using the existing logic, but downloading of unknown blocks using a new mechanism.

Regarding identification of tracking requests: is there a need for that? For normal IBD/sync, there is no such functionality either. Perhaps I'm missing the use case for this. If you do need an identifier to track (re)download requests, what about using the block hash itself? Multiple redownload requests for the same block don't make sense anyway.
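Keying manual download requests by the block hash itself, as suggested, could look something like this minimal sketch (class and method names are invented for illustration):

```python
class BlockRequestTracker:
    """Track in-flight manual block requests, keyed by block hash.

    Using the hash as the identifier makes a second request for the
    same block a natural no-op, matching the observation that multiple
    redownload requests for one block don't make sense.
    """

    def __init__(self):
        self._inflight = {}  # block_hash -> peer id serving the request

    def request(self, block_hash, peer_id):
        # Duplicate requests for a hash already in flight are rejected.
        if block_hash in self._inflight:
            return False
        self._inflight[block_hash] = peer_id
        return True

    def completed(self, block_hash):
        # Return the peer that served the block, or None if unknown.
        return self._inflight.pop(block_hash, None)
```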
Thanks sipa for all the feedback!
Ok yeah, now I see the full picture, and agree.
Hmm, yeah.
Yeah, no.. I had one of those fake epiphanies, probably caused by a lack of caffeine in my system.. Block requests are tracked by hash in #27837, so we can easily retrieve their tracking information with it. So, to summarize: I'm going to decouple the commit that prevents users from calling …
Work implemented in #27837. Closing.
Coming from #27652, implementing the first part of it.

The idea is to let users call `getblockfrompeer` without providing any peer id. The node will internally select one peer at random and make the `getdata` request.

This also fixes a bug where the user is allowed to call `getblockfrompeer` providing the id of a peer that signals a "limited" service. As limited peers cannot provide historical blocks, it is not correct to allow the user to do that.