Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing to fetch cfheader corresponding to block header in headers message #27085

Open
Christewart opened this issue Feb 11, 2023 · 12 comments
Open

Comments

@Christewart
Copy link
Contributor

Occasionally when I receive a headers message on the p2p network, and attempt to fetch the cfheader that corresponds to a block header inside of a headers, i get this error message inside of my debug.log

2023-02-07T16:47:15Z [net] Failed to find block filter hashes in index: filter_type=basic, start_height=150, stop_hash=3ca9030aeb6c2721cfbab0116b8d96e2d3c7e00738238010e0bc622566dc2aed

Expected behavior

I should be able to fetch a cfheader for a block after the block has been relayed to me over the p2p network via headers p2p message.

Actual behavior

It fails to fetch the cfheader that corresponds to the block.

To reproduce

I am unable to reproduce reliably, but this does occur ~1/5 times when running test suites on bitcoin-s.

System information

24.0 (although i have seen this behavior for awhile and believe this issue exists in older versions of bitcoind)

The OS/CPU arch does not seem to be a factor in this bug. I can reproduce on Linux/mac machines with older/newer versions of bitcoind

I've attached the debug.log for my bitcoind instance. The block hash in question is 3ca9030aeb6c2721cfbab0116b8d96e2d3c7e00738238010e0bc622566dc2aed.

debug.log

This is related to: bitcoin-s/bitcoin-s#4976

@furszy
Copy link
Member

furszy commented Feb 11, 2023

Even when the peer sent the headers message, the block filter index could still be processing the block signals.
If possible on your tests, call syncwithvalidationinterfacequeue to empty the validation interface queue prior requesting the cfheader or, could also do it right after generating the block/s, so you ensure that the index processes the events.

@Christewart
Copy link
Contributor Author

Christewart commented Feb 11, 2023

Thanks for pointing me in the right direction, this seems to be the equivalent in the bitcoin core python test harness #22311 . I'll patch with this on bitcoin-s and close this issue after i've verifed this is the appropriate fix.

@Christewart
Copy link
Contributor Author

Christewart commented Jan 5, 2024

This still seems to be an issue on 24.2 . I'm going to try and upgrade my nodes used for test harnesses in to use 25.x and 26.x. From a quick look through release notes, it doesn't seem like there was anything in those releases that would fix this bug.

Note, i am calling syncwithvalidationinterfacequeue explicitly after generating blocks on regtest and waiting for that rpc call to complete before continuing with testing logic. This doesn't seem to happen as much on slower machines (x86_64), but on apple silicon it seems fairly reproducible.

2024-01-05T16:33:17Z [rpc] ThreadRPCServer method=generatetoaddress user=random_user_name
... bunch of block generation logs ...
2024-01-05T16:33:17Z [rpc] ThreadRPCServer method=syncwithvalidationinterfacequeue user=random_user_name
...
2024-01-05T16:33:18Z [net] received: getcfheaders (37 bytes) peer=9
2024-01-05T16:33:18Z [net] Failed to find block filter hashes in index: filter_type=basic, start_height=209, stop_hash=2765372306e3a828a72ec307057fcc1885a1600d14087b5729020e8c8a99a8a8
2024-01-05T16:33:18Z [default wallet] AddToWallet d78a57f4bf12014a12da2d1914b9fcc69277d7b4ae738a30e8cc18cdc282d7f5  new
2024-01-05T16:33:18Z [validation] UpdatedBlockTip: new block hash=6b067436661a30cea95eaf93664cc9a2723372c272e80bcb97eeb2330895aecd fork block hash=5a62c32fe94767a36ed71048301e21276fbb2de6a9a392bcc4fb51571fc910fb (in IBD=false)
2024-01-05T16:33:18Z [zmq] Publish hashblock 6b067436661a30cea95eaf93664cc9a2723372c272e80bcb97eeb2330895aecd to tcp://0.0.0.0:27298
2024-01-05T16:33:18Z [zmq] Publish rawblock 6b067436661a30cea95eaf93664cc9a2723372c272e80bcb97eeb2330895aecd to tcp://0.0.0.0:44946
2024-01-05T16:33:18Z [validation] BlockConnected: block hash=60cef82e330f20ee10402d8ec33a6dc6547063711e88c9a354cf0b4da50f4a64 block height=211
2024-01-05T16:33:18Z [zmq] Publish hashtx 8606ba3634d0e87a962e0517635e8e723a57e18c7cb11a0706407c786afc50d3 to tcp://0.0.0.0:37406
2024-01-05T16:33:18Z [zmq] Publish rawtx 8606ba3634d0e87a962e0517635e8e723a57e18c7cb11a0706407c786afc50d3 to tcp://0.0.0.0:31111
2024-01-05T16:33:18Z [leveldb] WriteBatch memory usage: db=txindex, before=0.0MiB, after=0.0MiB
2024-01-05T16:33:18Z [leveldb] WriteBatch memory usage: db=db, before=0.0MiB, after=0.0MiB
2024-01-05T16:33:18Z [default wallet] AddToWallet 8606ba3634d0e87a962e0517635e8e723a57e18c7cb11a0706407c786afc50d3  new
2024-01-05T16:33:18Z [validation] UpdatedBlockTip: new block hash=60cef82e330f20ee10402d8ec33a6dc6547063711e88c9a354cf0b4da50f4a64 fork block hash=6b067436661a30cea95eaf93664cc9a2723372c272e80bcb97eeb2330895aecd (in IBD=false)
2024-01-05T16:33:18Z [zmq] Publish hashblock 60cef82e330f20ee10402d8ec33a6dc6547063711e88c9a354cf0b4da50f4a64 to tcp://0.0.0.0:27298
2024-01-05T16:33:18Z [zmq] Publish rawblock 60cef82e330f20ee10402d8ec33a6dc6547063711e88c9a354cf0b4da50f4a64 to tcp://0.0.0.0:44946
2024-01-05T16:33:18Z [validation] BlockConnected: block hash=2765372306e3a828a72ec307057fcc1885a1600d14087b5729020e8c8a99a8a8 block height=212
2024-01-05T16:33:18Z [zmq] Publish hashtx 36038c1828cd2ea7d2b3432364fa480cd0bdfbaaabf06784668cd8fbc4d67da7 to tcp://0.0.0.0:37406
2024-01-05T16:33:18Z [zmq] Publish rawtx 36038c1828cd2ea7d2b3432364fa480cd0bdfbaaabf06784668cd8fbc4d67da7 to tcp://0.0.0.0:31111
2024-01-05T16:33:18Z [leveldb] WriteBatch memory usage: db=txindex, before=0.0MiB, after=0.0MiB
2024-01-05T16:33:18Z [leveldb] WriteBatch memory usage: db=db, before=0.0MiB, after=0.0MiB
2024-01-05T16:33:18Z [default wallet] AddToWallet 36038c1828cd2ea7d2b3432364fa480cd0bdfbaaabf06784668cd8fbc4d67da7  new
2024-01-05T16:33:18Z [validation] UpdatedBlockTip: new block hash=2765372306e3a828a72ec307057fcc1885a1600d14087b5729020e8c8a99a8a8 fork block hash=60cef82e330f20ee10402d8ec33a6dc6547063711e88c9a354cf0b4da50f4a64 (in IBD=false)
2024-01-05T16:33:18Z [zmq] Publish hashblock 2765372306e3a828a72ec307057fcc1885a1600d14087b5729020e8c8a99a8a8 to tcp://0.0.0.0:27298

EDIT:

i've attached the full log
debug.log

@Christewart Christewart reopened this Jan 5, 2024
@maflcko maflcko added the P2P label Jan 5, 2024
@maflcko
Copy link
Member

maflcko commented Jan 5, 2024

Note, i am not calling syncwithvalidationinterfacequeue explicitly

At least one of the following must happen, otherwise this bug will remain, obviously:

  • Call syncwithvalidationinterfacequeue to produce the compact filter prior to requesting it, to ensure arrival
  • Request the compact filter in a loop (busy polling) until it arrives
  • Change the code to syncwithvalidationinterfacequeue before answering compact filter requests. (This would be my preference, but there was a previous discussion, which I can't find right now)

@Christewart
Copy link
Contributor Author

Note, i am not calling syncwithvalidationinterfacequeue explicitly

Massive apologies, that was a typo. I am calling syncwithvalidationinterfacequeue explicitly. I always call it in conjunction with generatetoaddress

@maflcko
Copy link
Member

maflcko commented Jan 5, 2024

I am calling syncwithvalidationinterfacequeue explicitly.

Are you sure, because the debug log you shared does not mention it?

@Christewart
Copy link
Contributor Author

Christewart commented Jan 5, 2024

I am calling syncwithvalidationinterfacequeue explicitly.

Are you sure, because the debug log you shared does not mention it?

I've modified the post to add those logs and i've attached the full debug.log.

@maflcko
Copy link
Member

maflcko commented Jan 5, 2024

Ok, it is called, but you also have to wait for it to complete and return from the RPC, before continuing the test. Currently it looks like it is called, and processes the block events, but at the same time the P2P interface is asked for the filters.

It may be best to only use a single thread for the test logic.

@Christewart
Copy link
Contributor Author

Christewart commented Jan 5, 2024

Ok, it is called, but you also have to wait for it to complete and return from the RPC, before continuing the test. Currently it looks like it is called, and processes the block events, but at the same time the P2P interface is asked for the filters.

It may be best to only use a single thread for the test logic.

Hmm, you are totally right. You said elsewhere in this thread

Even when the peer sent the headers message, the block filter index could still be processing the block signals.

This is exactly what is happening and the race condition i'm running into.

  1. call generatetoaddress
  2. bitcoind generates blocks and gossips them over the p2p network
  3. I receive inv or headersmsg over the p2p network and process it
  4. I send getcfheaders to the bitcoind after processing the block header to fetch the corresponding filter header
  5. bitcoind doesn't respond since the index is still building.

I don't see why syncwithvalidationinterfacequeue would matter at all - since bitcoind gossips the header regardless which kicks off my logic to request the corresponding filter header / filter.

I guess the question now is why are we gossiping headers before its possible to serve filterheaders / filters that correspond with them if blockfilterindex=1 is enabled?

@maflcko
Copy link
Member

maflcko commented Jan 5, 2024

I guess the question now is why are we gossiping headers before its possible to serve filterheaders / filters that correspond with them if blockfilterindex=1 is enabled?

There was a previous discussion (I can't find it now), but I think it said that the P2P interface may fail, because it reaches to untrusted peers, so a logic to recover is needed either way. If you want to use a trusted interface, use RPC.

Though, there is also P2P encryption and peer permission flags, so my preference would be to sync before replying to compact filter requests on the P2P interface.

@Christewart
Copy link
Contributor Author

There was a previous discussion (I can't find it now)

Do you recall the venue of where it was discussed? Some PR/issue on github or elsewhere? I may go digging.

In general, should I leave this issue open? It seems like this will not be resolved in the short term (or ever maybe).

@maflcko
Copy link
Member

maflcko commented Jan 5, 2024

It was on GitHub (https://github.com/bitcoin/bitcoin)

Christewart added a commit to Christewart/bitcoin-s-core that referenced this issue Feb 16, 2024
…ously querying for children rather than the block hashes at the given height for building the FilterSyncMarker for getcfilter request. Also introduce delay in sending getcfheader because of bitcoin/bitcoin#27085
Christewart added a commit to bitcoin-s/bitcoin-s that referenced this issue Feb 17, 2024
* Fix bug in getFilterSyncMarkerFromStopBlockHeader where we were previously querying for children rather than the block hashes at the given height for building the FilterSyncMarker for getcfilter request. Also introduce delay in sending getcfheader because of bitcoin/bitcoin#27085

* Only fallback to reorg scenario if we find no headers at height + 1

* Remove the best filter's block header from the candidate headers in reorg scenarios
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants