
p2p: For assumeutxo, download snapshot chain before background chain #29519

Open · wants to merge 1 commit into base: master

Conversation

mzumsande (Contributor)

After loading a snapshot, pindexLastCommonBlock is usually already set to some block for existing peers. That means we'd continue syncing the background chain from those peers instead of prioritising the snapshot chain, which defeats the purpose of doing assumeutxo in the first place. Only existing peers are affected by this bug.

DrahtBot (Contributor) commented Feb 29, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type | Reviewers
ACK  | Sjors


fanquake (Member) commented Mar 5, 2024

Note that since v27 will not ship with mainnet assumeutxo, this isn't a blocker; a fix can still be backported to 27.x if deemed necessary.

Sjors (Member) commented Mar 11, 2024

Concept ACK

ac547ea looks reasonable, but it would be nice to somehow add a test for it.

What happens once we've reached the tip and need more peers for background sync? And once background sync is done, is it updated to the tip again?

Would it make sense to introduce pindexLastCommonBackgroundSyncBlock so we have more control over what peers to dedicate to the tip vs background (which might take weeks on some devices)?

mzumsande (Contributor, Author) commented Mar 11, 2024

> ac547ea looks reasonable, but it would be nice to somehow add a test for it.

I tested this by syncing on signet, but I'll think about a functional test; I imagine it would be hard to prevent race conditions in a functional test because we'd:

  • need to first connect to a peer that has the entire chain and start syncing the background chain
  • then load the snapshot (at the same time the peer would continue syncing the background chain)
  • then check that we switch to requesting from the snapshot chain (while the bg chain hopefully hasn't finished downloading yet)

It might be possible though if that peer was a P2PConnection() (that could stall / withhold certain blocks) instead of a real node.

> What happens once we've reached the tip and need more peers for background sync? And once background sync is done, is it updated to the tip again?
>
> Would it make sense to introduce pindexLastCommonBackgroundSyncBlock so we have more control over what peers to dedicate to the tip vs background (which might take weeks on some devices)?

We have this logic in SendMessages: it first tries FindNextBlocksToDownload, which is meant for the active chainstate, and, if that doesn't fill up the vToDownload buffer (e.g. because we are at or near the tip), then calls TryDownloadingHistoricalBlocks specifically for the background chainstate. I think this existing separation should work well; the problem this PR fixes is that FindNextBlocksToDownload would pick historical blocks from the background chain when it shouldn't, because higher-priority blocks are available.

Sjors (Member) commented Mar 12, 2024

I can see how it would be difficult to test, indeed.

I encountered this scenario while testing #29370 on signet. All my peers were only downloading the background state. I only got headers for the tip chain, but it didn't progress. The workaround there was to turn the network off and on, which is consistent with your observation that only existing peers are affected.

utACK ac547ea

fjahr (Contributor) commented Mar 25, 2024

The code and reasoning look good to me. I have been trying to write a test, but I am currently unable to reproduce the behavior described before the change, i.e. to make the test fail on master.

Here is the test so far: fjahr@478fb28

This now does fail on master, but only because it also checks for the debug log I have added here. This log confirms that the test does trigger the behavior change: on master the last common block is set once, while after the change here it is set twice within FindNextBlocksToDownload. But, of course, we would rather have the test fail at the asserts at the end, and that is not the case so far.

@mzumsande do you have an idea why this doesn't reproduce the behavior you have seen on master?

mzumsande (Contributor, Author) commented Mar 25, 2024

> @mzumsande do you have an idea why this doesn't reproduce the behavior you have seen on master?

I tested your branch, and I see the same - this is really interesting, and it works differently than I thought when I opened the PR!

I think the following happens in PeerManagerImpl::FindNextBlocks (after being called from FindNextBlocksToDownload()):
We process up to 1024 blocks ahead of the current pindexWalk:

int nWindowEnd = state->pindexLastCommonBlock->nHeight + BLOCK_DOWNLOAD_WINDOW;

But due to the lines

bitcoin/src/net_processing.cpp, lines 1497 to 1502 in 2102c97:

    if (pindex->nStatus & BLOCK_HAVE_DATA || (activeChain && activeChain->Contains(pindex))) {
        if (activeChain && pindex->HaveNumChainTxs()) {
            state->pindexLastCommonBlock = pindex;
        }
        continue;
    }

we never ask for the blocks below the snapshot block here: even though we don't have their data, they are now part of the active chain, so they are skipped instead.
In the test, we immediately ask for the blocks after the snapshot instead, which is exactly the desired behavior.

So why did I observe a different behavior on signet?
I think it only works like this in the test because there, the snapshot is just 100 blocks away from pindexLastCommonBlock.

If this weren't the case, we'd request nothing in FindNextBlocksToDownload() and only move our pindexLastCommonBlock ahead by a maximum of 1024. Then we'd have capacity left for this peer and would actually ask for the historical blocks via TryDownloadingHistoricalBlocks().

So I think what would actually happen on mainnet and signet is that each time we can process a block request with an existing peer, we'd move pindexLastCommonBlock 1024 blocks ahead and then request historical blocks, until pindexLastCommonBlock has reached the snapshot. So we'd continue downloading the background chain for a while (a few minutes?) with existing peers until we switch over to the snapshot chain (instead of never switching over, as I originally thought when opening this PR).

Still, I think this behavior is not great, and the fix is an improvement: not only do we start downloading the snapshot chain immediately, we also do much less iterating in FindNextBlocksToDownload, where we would otherwise move pindexLastCommonBlock ahead without actually downloading anything.

@fanquake fanquake removed this from the 27.0 milestone Apr 16, 2024