Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix intermittent error in getblockfrompeer.py #27784

Merged
merged 1 commit into from May 31, 2023

Conversation

mzumsande
Copy link
Contributor

This adds an additional sync_blocks call, fixing an intermittent error caused by blocks arriving out of order due to how compact block relay may revert to headers processing when the tip hasn't caught up, and resulting in slightly different pruning behavior.
Making sure that all blocks from the previous tests are synced before generating more blocks makes this impossible.

See #27749 (comment) and #27749 (comment) for a more detailed analysis.

#27770 is a more long-term approach to avoid having to deal with magic pruneheight numbers in the first place, but that PR introduces a new RPC and needs more discussion.

Fixes #27749.

This fixes an intermittent error, caused by blocks arriving
out of order due to how compact block relay may revert to headers
processing when the tip hasn't caught up, and resulting in slightly
different pruning behavior.

Making sure that all blocks from the previous tests are synced before
generating more blocks makes this impossible.
See Issue bitcoin#27749 for more details.
@DrahtBot
Copy link
Contributor

DrahtBot commented May 30, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK theStack, MarcoFalke

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9fe9074

Unfortunately there seems to be no reliable way to reproduce the issue on master (which could then be also ran on this PR to ensure that it is indeed fixed), but the explanation in #27749 (comment) makes sense to me.

@Sjors
Copy link
Member

Sjors commented May 30, 2023

9fe9074 looks harmless, but I haven't delved into the underlying issue. Can the original be reproduced by running the test in a loop??

CI failure seems spurious.

@mzumsande
Copy link
Contributor Author

Can the original be reproduced by running the test in a loop??

I haven't been able to reproduce it, I think it requires a constellation with a very special timing which should make it very rare even in the CI. One thing I did is check in the log is that node2 doesn't use HeadersDirectFetchBlocks for any blocks > 205, and uses compact block reconstruction instead. This is different in the log of the failed run.

@theStack
Copy link
Contributor

9fe9074 looks harmless, but I haven't delved into the underlying issue. Can the original be reproduced by running the test in a loop??

If you want to give that approach a try, the following shell script (which I have been running for days already, without any issue) should do exactly that:

#!/usr/bin/env bash
set -e
while true; do ./test/functional/rpc_getblockfrompeer.py; done

@maflcko
Copy link
Member

maflcko commented May 31, 2023

lgtm ACK 9fe9074

@fanquake fanquake merged commit 433f17b into bitcoin:master May 31, 2023
15 of 16 checks passed
@mzumsande mzumsande deleted the 202305_fix_getblockfrompeer branch May 31, 2023 14:25
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 31, 2023
9fe9074 test: add block sync to getblockfrompeer.py (Martin Zumsande)

Pull request description:

  This adds an additional `sync_blocks` call, fixing an intermittent error caused by blocks arriving out of order due to how compact block relay may revert to headers processing when the tip hasn't caught up, and resulting in slightly different pruning behavior.
  Making sure that all blocks from the previous tests are synced before generating more blocks makes this impossible.

  See bitcoin#27749 (comment) and bitcoin#27749 (comment) for a more detailed analysis.

  bitcoin#27770 is a more long-term approach to avoid having to deal with magic pruneheight numbers in the first place, but that PR introduces a new RPC and needs more discussion.

  Fixes bitcoin#27749.

ACKs for top commit:
  MarcoFalke:
    lgtm ACK 9fe9074
  theStack:
    ACK 9fe9074

Tree-SHA512: f3de1ea68725429aeef448c351ea812b805fa216912b112d7db9aceeddb1f2381b705c2577734b0d308e78ec5e0c4d26dc65fc2171f6e21f13061fc71d48216c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rpc_getblockfrompeer.py intermittent failure: assert_equal(pruneheight, 248); not(249 == 248)
6 participants