More reasons Wallet stops adding new blocks to chain on Testnet #1259

Open
jarlfr opened this issue May 17, 2016 · 3 comments

Comments

@jarlfr
Contributor

jarlfr commented May 17, 2016

Below are some more reasons I found, while analyzing the problem, that cause a bitcoinj wallet to get stuck and stop following the correct block chain on testnet. We experienced a lot of problems like this on testnet, and I wanted to either fix them or rule out that they could happen in production on mainnet.

I present some ideas for action, but more thought is needed. Testnet forks can be very long, to the degree that they go beyond what SPVBlockStore can handle, so we need to think about this. A solution for a very low-spec (memory/disk) wallet could perhaps do a two-pass download from peers. (I am unsure: does bitcoinj make use of getheaders during normal catch-up?)

Peers do not recognize hashes in the getblocks message

The hashes sent in the getblocks message should help a peer decide which blocks the client needs in order to catch up with the chain. I noticed that the peers' response to getblocks was to send the first 500 blocks of the block chain. These are of course not of much use: they do not change the chain head in the wallet's store, so the wallet stays stuck.

Analysis: bitcoinj seems to fill in 100 hashes starting from its chain head in a linear fashion. If all of them are on a fork that was discarded, the peer cannot find any common block except the genesis block, so it starts there in its inv reply.
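The peer-side behavior described in this analysis can be modeled roughly as follows. This is a minimal sketch, not code from bitcoinj or from any node implementation: the class `LocatorMatch`, the method `startHeight`, and the use of plain strings as block hashes are all hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical model of how a peer picks the starting point for its inv reply. */
final class LocatorMatch {
    /**
     * Returns the height of the first locator hash that is on the peer's best chain,
     * or 0 (genesis) when none of the hashes is known there.
     *
     * @param locator          hashes from the getblocks message, newest first
     * @param bestChainHeights hash -> height for blocks on the peer's best chain
     */
    static int startHeight(List<String> locator, Map<String, Integer> bestChainHeights) {
        for (String hash : locator) {
            Integer height = bestChainHeights.get(hash);
            if (height != null) {
                return height; // the reply starts with the block after this one
            }
        }
        return 0; // no common block found: fall back to genesis
    }
}
```

If every hash in the locator lies on a discarded fork, the loop never matches and the reply starts right after genesis, which would explain receiving the first 500 blocks.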

Action: Use a better set of hashes from the known blocks in the store (5000 for SPVBlockStore). A better selection is proposed on the Bitcoin wiki: "dense to start, but then sparse". This helps, but I ran into the next problem:
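A "dense to start, but then sparse" selection could look something like the sketch below. It only computes block heights; looking up the corresponding hashes in the store, and skipping heights the store no longer holds, is left out. The class and method names are hypothetical and this is not how bitcoinj currently builds its locator.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that picks heights for a "dense, then sparse" block locator. */
final class SparseLocator {
    /**
     * Heights to include in the locator, newest first: the first ten entries
     * go back one block at a time, after that the step doubles on every entry.
     * Genesis (height 0) is always the last entry.
     */
    static List<Integer> heights(int headHeight) {
        List<Integer> result = new ArrayList<>();
        int step = 1;
        for (int h = headHeight; h > 0; h -= step) {
            if (result.size() >= 10) {
                step *= 2; // go sparse once the dense part is filled
            }
            result.add(h);
        }
        result.add(0); // always end with genesis
        return result;
    }
}
```

For a head at height 100 this yields 100 down to 91, then 90, 88, 84, 76, 60, 28 and 0. The number of entries grows only logarithmically with the height, so a locator of modest size can reach far behind a long dead fork.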

getdata for 500 blocks does not trigger a re-organize despite the head being on a dead fork

Requesting blocks with better hashes can still leave the wallet store's chain head unchanged. This results in the same request for blocks being sent again, and the store head is effectively stuck.

Analysis: This happens when the downloaded blocks, despite belonging to the correct chain, do not trigger a re-organize even though the head is on a dead chain. Why? It seems the special difficulty jumps on testnet can make a branch of blocks have less total work despite being very much longer, e.g. several hundred blocks longer than the head. For some reason the network selected this longer chain for many blocks.
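To illustrate how a much longer branch can still carry less total work, here is a small sketch. The work of a block is taken as 2^256 / (target + 1), and the target for a given difficulty is derived from the difficulty-1 target. The branch lengths and difficulties used below are invented for illustration and are not measurements from testnet; the class and method names are hypothetical.

```java
import java.math.BigInteger;

/** Hypothetical illustration of total work versus branch length. */
final class ChainWorkExample {
    // 2^256, the size of the hash space.
    static final BigInteger LARGEST_HASH = BigInteger.ONE.shiftLeft(256);
    // Target that corresponds to difficulty 1 (compact form 0x1d00ffff).
    static final BigInteger DIFF1_TARGET = BigInteger.valueOf(0xFFFF).shiftLeft(208);

    /** Work represented by one block mined at the given difficulty. */
    static BigInteger workPerBlock(long difficulty) {
        BigInteger target = DIFF1_TARGET.divide(BigInteger.valueOf(difficulty));
        return LARGEST_HASH.divide(target.add(BigInteger.ONE));
    }

    /** Total work of a branch in which every block has the same difficulty. */
    static BigInteger branchWork(int blocks, long difficulty) {
        return workPerBlock(difficulty).multiply(BigInteger.valueOf(blocks));
    }
}
```

With these made-up numbers, `branchWork(10, 1000)` is larger than `branchWork(500, 1)`: ten blocks at difficulty 1000 outweigh five hundred blocks at difficulty 1. A wallet whose head sits on the short, high-difficulty branch would therefore not re-organize onto the longer branch until that branch accumulates more work.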

Action: Not sure here. One way is to follow the current rules. In these cases we need to download many blocks (thousands) to trigger a re-org. To make that happen, the store needs to track not only the chain head but also what, at this point, looks like a fork (less total work), so that it can send a different getblocks and getdata to fetch blocks it has not yet downloaded. Note that getblocks can easily be used to ask for multiple branches in one request. But without extending SPVBlockStore this solution does not work, as we run out of space (currently, to manage the re-org, bitcoinj seems to need all blocks in both branches back to the split point).
Another way might be to discard blocks back to the split point and restart with getblocks to peers (but this would trust this peer more than those that resulted in the current head). Does the transaction confidence model allow this two-step operation: first lowering the head and total work, and then following another branch that will only reach higher total work much later? Currently I assume total work can only increase. In general, tx confidence changes for these deep re-org events seem very hard to handle in applications anyway. Any ideas?

To sum up, these particular findings are very unlikely to occur on the main network. It would still be nice if testnet could be made to work reliably with bitcoinj.

@schildbach
Member

Thanks a lot for your analysis! I replied to the mailing list as that's a better place for discussions.

@dcw312
Contributor

dcw312 commented Aug 13, 2016

Is this issue still open? I ask because it looks like an interesting problem to research.

@schildbach
Member

I think so, yes. There has been a bit of discussion on the mailing list (see the topic started May 23), but I must admit I haven't really decided what to do. If you want to help, that's great! Posting your ideas and thoughts to the mailing list would be a good start.


3 participants