rpc_getblockfrompeer.py intermittent failure: assert_equal(pruneheight, 248); not(249 == 248) #27749
Comments
FWIW I have seen this error once before running tests locally, but I was making changes to [...]
I wasn't able to reproduce it but my spider sense tells me that it will be fixed by #27770.
After digging deeper and analyzing the failed CI log, I got a clearer picture what's going on here.
Blocks are appended to blk*.dat files in whatever order they are received from the network (either directly as [...]
I was surprised to see that even with only a single peer (node2 is only getting blocks from node0) blocks could arrive out of order, but that seems to be due to a mix of receiving full blocks and compact blocks. I wrote up some hacky python-regex-log-parsing magic to compare the order of the actual chain (as generated on node0) with the order in which it is received on node2, confirming what's stated above.
Not sure how to best solve this issue. Is there a way to enforce linear block reception, at least when only having a single peer? Or maybe the best solution would be to not involve the network at node2 at all (for the sake of the pruning part of the test) and just [...]
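The comparison described above can be illustrated with a minimal sketch. It is not the script used in this thread (that script was not posted here): it assumes the block heights have already been extracted from node2's log in arrival order, and the function name `find_out_of_order` and the sample heights are hypothetical.

```python
from typing import List, Tuple


def find_out_of_order(received_heights: List[int]) -> List[Tuple[int, int]]:
    """Return (position, height) for every block that arrived after a higher block."""
    out_of_order = []
    highest_seen = -1
    for position, height in enumerate(received_heights):
        if height < highest_seen:
            # A block with a greater height was already received before this one.
            out_of_order.append((position, height))
        else:
            highest_seen = height
    return out_of_order


# Illustrative arrival order only, not taken from the CI log.
received = [245, 246, 247, 249, 248, 250]
print(find_out_of_order(received))
```

An empty result means node2 received the blocks in the same order node0 generated them.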
Nice analysis!
I think the cause is that a failed DoS check [...]
So I think that an alternative fix would be to add a sync_blocks call between node0 and node2 before letting node0 generate the 400 blocks. This should guarantee that all of these 400 blocks will be processed by compact block reconstruction.
Awesome investigation! It feels good when all dots are connected. mzumzande's patch to enforce sequential block storage looks promising at first glance. Apart from that, I would like to refer to a portion of what I just wrote in #27770 (comment): I believe the main conclusion of the analysis (at least mine) is that the accuracy of the [...]
Considering this aspect, alongside mzumzande's fix, I hold the view that the test is quite fragile and that it would be unwise to keep the current hardcoded pruning numbers. These numbers differ significantly from the block file information (as indicated in comment) and aren't really documented anywhere.
cc @fjahr since he worked on the prune part of this test.
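To illustrate why a hardcoded prune height is sensitive to the order in which blocks land in block files, here is a toy model. It is a deliberately simplified sketch and not Bitcoin Core's implementation: it assumes files hold a fixed number of blocks (real block files are limited by size), that a file is deleted only when all of its blocks are at or below the requested height, and that the reported prune height is one below the lowest block reachable by walking back from the tip while block data is present. All function names are hypothetical.

```python
from typing import List, Set


def store_in_files(arrival_order: List[int], blocks_per_file: int) -> List[List[int]]:
    """Append blocks to files in arrival order (toy model: fixed block count per file)."""
    return [arrival_order[i:i + blocks_per_file]
            for i in range(0, len(arrival_order), blocks_per_file)]


def prune_files(files: List[List[int]], target_height: int) -> Set[int]:
    """Delete whole files whose highest block is <= target_height; return heights still stored."""
    return {h for f in files if max(f) > target_height for h in f}


def reported_prune_height(arrival_order: List[int], blocks_per_file: int,
                          target_height: int) -> int:
    """Walk back from the tip while block data exists; report the height just below."""
    stored = prune_files(store_in_files(arrival_order, blocks_per_file), target_height)
    height = max(arrival_order)  # tip; assumed to survive pruning
    while height - 1 in stored:
        height -= 1
    return height - 1


in_order = list(range(12))
swapped = [0, 1, 2, 3, 4, 5, 6, 8, 7, 9, 10, 11]  # block 8 arrives before block 7

print(reported_prune_height(in_order, blocks_per_file=4, target_height=9))
print(reported_prune_height(swapped, blocks_per_file=4, target_height=9))
```

In this model a single swapped pair at a file boundary shifts the result by one, which has the same shape as the 248 versus 249 mismatch in the issue title.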
This fixes an intermittent error, caused by blocks arriving out of order due to how compact block relay may revert to headers processing when the tip hasn't caught up, and resulting in slightly different pruning behavior. Making sure that all blocks from the previous tests are synced before generating more blocks makes this impossible. See Issue bitcoin#27749 for more details.
Opened #27784 which just adds an additional sync, as suggested in #27770 (comment).
9fe9074 test: add block sync to getblockfrompeer.py (Martin Zumsande)

Pull request description:

This adds an additional `sync_blocks` call, fixing an intermittent error caused by blocks arriving out of order due to how compact block relay may revert to headers processing when the tip hasn't caught up, and resulting in slightly different pruning behavior. Making sure that all blocks from the previous tests are synced before generating more blocks makes this impossible.

See bitcoin#27749 (comment) and bitcoin#27749 (comment) for a more detailed analysis.

bitcoin#27770 is a more long-term approach to avoid having to deal with magic pruneheight numbers in the first place, but that PR introduces a new RPC and needs more discussion.

Fixes bitcoin#27749.

ACKs for top commit:
MarcoFalke: lgtm ACK 9fe9074
theStack: ACK 9fe9074

Tree-SHA512: f3de1ea68725429aeef448c351ea812b805fa216912b112d7db9aceeddb1f2381b705c2577734b0d308e78ec5e0c4d26dc65fc2171f6e21f13061fc71d48216c
Is there an existing issue for this?
Current behaviour
https://cirrus-ci.com/task/4776911524593664?logs=ci#L3147
Expected behaviour
.
Steps to reproduce
CI
Relevant log output
No response
How did you obtain Bitcoin Core
Compiled from source
What version of Bitcoin Core are you using?
current master
Operating system and version
[previous releases, qt5 dev package and depends packages, DEBUG] [focal]
Machine specifications
No response