Avoid conflicting chain locks when rescan is triggered by new block #533

Closed

@pinheadmz pinheadmz commented Jan 8, 2021

Fix for bcoin-org/bcoin#1006 from upstream in bcoin.

Issue summary:

  • WalletDB somehow falls behind the chain (e.g. after a botched rescan attempt, maybe due to one of the txdb assertion errors)
  • A new block is added to the chain by the network
  • The chain's block-adding process locks the chain
  • WalletDB realizes it's behind and initiates a rescan
  • The rescan also requires a lock on the chain, but the chain is still locked and will never unlock until the rescan is done!
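The sequence above can be reproduced in isolation. The following is a minimal sketch, not hsd code: `Lock`, `addBlock`, `rescan`, `withTimeout` and `demo` are hypothetical names that only model a lock holder awaiting a listener which itself requests the same lock.

```javascript
// Hypothetical minimal mutex: acquire() resolves with an unlock function.
class Lock {
  constructor() {
    this.tail = Promise.resolve();
  }

  acquire() {
    let unlock;
    const next = new Promise((resolve) => {
      unlock = resolve;
    });
    // Wait for the previous holder, then hand out our unlock function.
    const ready = this.tail.then(() => unlock);
    this.tail = next;
    return ready;
  }
}

// Resolves to 'timeout' if `promise` has not settled after `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Models the chain adding a block: hold the lock, then await the listener.
async function addBlock(lock, listener) {
  const unlock = await lock.acquire();
  try {
    await listener(); // stands in for an awaited 'connect' emission
  } finally {
    unlock();
  }
}

// Models the wallet rescanning, which needs the same lock.
async function rescan(lock) {
  const unlock = await lock.acquire();
  unlock();
  return 'rescanned';
}

// The listener awaits the rescan while addBlock still holds the lock,
// so neither side can make progress and the timeout wins the race.
function demo() {
  const lock = new Lock();
  return withTimeout(addBlock(lock, () => rescan(lock)), 50);
}
```

In this model `demo()` settles with the timeout marker rather than completing, which mirrors the test timeout described for the real code path.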

Bug fix summary:
Emit the 'block connect' event synchronously so the original 'connect' event from chain can resolve. The chain 'connect' event is emitted asynchronously, meaning the chain waits for all of its listeners to resolve before program flow continues. The chain event is proxied by nodeclient to the wallet as 'block connect', and this is what ultimately triggers the rescan.
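To illustrate the difference between the two emission styles, here is a small sketch. `TinyEmitter` and `trace` are hypothetical stand-ins written for this example; they are not the emitter used by hsd, and only the ordering of events is meant to carry over.

```javascript
// Hypothetical emitter with a fire-and-forget emit() and an awaited emitAsync().
class TinyEmitter {
  constructor() {
    this.listeners = new Map();
  }

  on(name, fn) {
    if (!this.listeners.has(name))
      this.listeners.set(name, []);
    this.listeners.get(name).push(fn);
  }

  // Calls each listener but ignores any promise it returns.
  emit(name, ...args) {
    for (const fn of this.listeners.get(name) || [])
      fn(...args);
  }

  // Waits for each listener to resolve before moving on.
  async emitAsync(name, ...args) {
    for (const fn of this.listeners.get(name) || [])
      await fn(...args);
  }
}

// Records whether the emitter or the listener finishes first.
async function trace(useAsync) {
  const log = [];
  const emitter = new TinyEmitter();

  let done;
  const finished = new Promise((resolve) => {
    done = resolve;
  });

  emitter.on('block connect', async () => {
    await new Promise(resolve => setTimeout(resolve, 10)); // simulated work
    log.push('listener done');
    done();
  });

  if (useAsync)
    await emitter.emitAsync('block connect');
  else
    emitter.emit('block connect');

  log.push('emitter continues');
  await finished;
  return log;
}
```

With `trace(true)` the emitter resumes only after the listener has finished; with `trace(false)` the emitter resumes first, so it can release whatever it is holding before the listener's work completes.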

Note that this is only an issue when the wallet is run as a plugin. If the wallet is being run as a separate node, it gets the 'block connect' event from the node's HTTP websocket:

hsd/lib/node/http.js

Lines 666 to 680 in a1409dc

this.chain.on('connect', (entry, block, view) => {
  const sockets = this.channel('chain');
  if (!sockets)
    return;
  const raw = entry.encode();
  this.to('chain', 'chain connect', raw);
  for (const socket of sockets) {
    const txs = this.filterBlock(socket, block);
    socket.fire('block connect', raw, txs);
  }
});

The websocket event is necessarily synchronous because it is a one-way firing: it doesn't wait for a response at all. The fact that one mode of operation already uses synchronous events should help justify this pull request.

The side effect of this change is that a few tests assumed the walletDB would always be up to date with the chain as soon as new blocks are added. This is no longer the case, and we should probably always consider wallet and node operations to be asynchronous anyway. For those tests I've added a waiter to ensure the wallet is synced to the chain before continuing.
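A waiter of this kind could look roughly like the sketch below. This is not the helper added in the PR: `waitFor`, its parameters, and the plain `chain` and `wallet` objects are assumptions made for illustration.

```javascript
// Hypothetical polling helper: resolves once predicate() is truthy,
// rejects if that does not happen within `timeout` milliseconds.
async function waitFor(predicate, timeout = 2000, interval = 10) {
  const deadline = Date.now() + timeout;

  while (!(await predicate())) {
    if (Date.now() > deadline)
      throw new Error('Timed out waiting for condition.');
    await new Promise(resolve => setTimeout(resolve, interval));
  }
}

// Stand-in objects: the wallet catches up to the chain a little later.
async function demoWait() {
  const chain = {height: 5};
  const wallet = {height: 3};

  setTimeout(() => {
    wallet.height = chain.height;
  }, 30);

  await waitFor(() => wallet.height === chain.height);
  return wallet.height;
}
```

A test would call something like `waitFor` after adding blocks and before asserting on wallet state, instead of assuming the wallet has already processed them.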

Reviewers can run the new test test/wallet-rescan-test.js after reverting the one-line change in lib/wallet/nodeclient.js and observe that the test will time out. This is because the two conflicting locks will wait for each other forever.

coveralls commented Jan 8, 2021

Pull Request Test Coverage Report for Build 497068965

  • 1 of 1 (100.0%) changed or added relevant line in 1 file are covered.
  • 4 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.2%) to 59.805%

Files with Coverage Reduction    New Missed Lines    %
lib/mining/miner.js              1                   64.58%
lib/net/pool.js                  1                   32.63%
lib/net/peer.js                  2                   35.28%

Totals (Coverage Status)
Change from base Build 480764491:    0.2%
Covered Lines:                       19628
Relevant Lines:                      30587

💛 - Coveralls

@pinheadmz pinheadmz added this to the v2.4.0 milestone Jan 18, 2021
@pinheadmz pinheadmz force-pushed the asyncrescan1 branch 4 times, most recently from a4b8404 to a50faad Compare January 19, 2021 21:35
pinheadmz (Member, Author) commented

Closing for now, although we may still need a fix for this. It turns out we use emitAsync here for a good reason (memory):

bcoin-org/bcoin#932

@pinheadmz pinheadmz closed this Jan 26, 2021