Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[qa] Handle disconnect_node race #13201

Merged
merged 1 commit into from May 10, 2018

Conversation

sdaftuar
Copy link
Member

@sdaftuar sdaftuar commented May 9, 2018

Several tests call disconnect_nodes() on each node-pair in rapid
succession, resulting in a race condition if a node disconnects a peer
in-between the calculation of the nodeid's to disconnect and the
invocation of the disconnectnode rpc call. Handle this.

Several tests call disconnect_nodes() on each node-pair in rapid
succession, resulting in a race condition if a node disconnects a peer
in-between the calculation of the nodeid's to disconnect and the
invocation of the disconnectnode rpc call.  Handle this.
@sdaftuar
Copy link
Member Author

sdaftuar commented May 9, 2018

I believe I encountered the race condition I describe in the OP during a run of the p2p_node_network_limited.py test. An exception was raised in disconnect_all(), where the test disconnects each pair of peers. Here's the relevant test log snippet:

 node0 2018-05-09T16:27:51.966761Z Received a POST request for / from 127.0.0.1:57133 
 node0 2018-05-09T16:27:51.966844Z ThreadRPCServer method=getpeerinfo user=__cookie__ 
 node0 2018-05-09T16:27:51.968238Z Received a POST request for / from 127.0.0.1:57133 
 node0 2018-05-09T16:27:51.973861Z ThreadRPCServer method=disconnectnode user=__cookie__ 
 node0 2018-05-09T16:27:51.980630Z Received a POST request for / from 127.0.0.1:57133 
 node0 2018-05-09T16:27:51.986890Z ThreadRPCServer method=disconnectnode user=__cookie__ 
 node0 2018-05-09T16:27:51.993866Z Received a POST request for / from 127.0.0.1:57133 
 node0 2018-05-09T16:27:51.998753Z ThreadRPCServer method=getpeerinfo user=__cookie__ 
 node0 2018-05-09T16:27:51.999412Z disconnecting peer=0 
 node0 2018-05-09T16:27:52.007234Z disconnecting peer=1 
 node1 2018-05-09T16:27:52.007299Z socket closed 
 node0 2018-05-09T16:27:52.007387Z Cleared nodestate for peer=0 
 node1 2018-05-09T16:27:52.008380Z Received a POST request for / from 127.0.0.1:57134 
 node0 2018-05-09T16:27:52.012901Z Cleared nodestate for peer=1 
 node1 2018-05-09T16:27:52.015917Z disconnecting peer=0 
 node1 2018-05-09T16:27:52.022234Z ThreadRPCServer method=getpeerinfo user=__cookie__ 
 node1 2018-05-09T16:27:52.022266Z Cleared nodestate for peer=0 
 node1 2018-05-09T16:27:52.022394Z socket closed 
 node1 2018-05-09T16:27:52.022435Z disconnecting peer=1 
 node1 2018-05-09T16:27:52.022482Z Cleared nodestate for peer=1 
 node1 2018-05-09T16:27:52.023900Z Received a POST request for / from 127.0.0.1:57134 
 node1 2018-05-09T16:27:52.028597Z ThreadRPCServer method=disconnectnode user=__cookie__ 
 test  2018-05-09T16:27:52.034000Z TestFramework (ERROR): JSONRPC error

So we're telling node0 to disconnect from node1 first, and then we're telling node1 to disconnect from node0. Note that the calculation of which nodes to disconnect node1 from (which happens in the node1.getpeerinfo() call) is before node1 disconnects node0, but we actual call node1.disconnectnode() after that.

@jnewbery Does this look right to you?

@laanwj laanwj added the Tests label May 9, 2018
@jnewbery
Copy link
Contributor

jnewbery commented May 9, 2018

utACK 09c6699.

Seems reasonable - the only way we can receive a RPC_CLIENT_NODE_NOT_CONNECTED is if the requested node to disconnect is not found in conman's vNodes. In that case it's fine to just continue, since we're already disconnected by definition.

@laanwj
Copy link
Member

laanwj commented May 10, 2018

Making disconnection idempotent seems a good idea. utACK 09c6699

@maflcko maflcko added this to the 0.16.1 milestone May 10, 2018
@maflcko
Copy link
Member

maflcko commented May 10, 2018

utACK 09c6699. Nice fixup, that should be backported

@maflcko maflcko merged commit 09c6699 into bitcoin:master May 10, 2018
maflcko pushed a commit that referenced this pull request May 10, 2018
09c6699 [qa] Handle disconnect_node race (Suhas Daftuar)

Pull request description:

  Several tests call disconnect_nodes() on each node-pair in rapid
  succession, resulting in a race condition if a node disconnects a peer
  in-between the calculation of the nodeid's to disconnect and the
  invocation of the disconnectnode rpc call.  Handle this.

Tree-SHA512: 3078cea0006fcb507c812004a777c505eb1e9dda7c6df12dbbe72395a73ff6f6760f597b6492054f5487b34534417ddef5fbad30553c135c288c4b7cfce79223
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request May 17, 2018
Several tests call disconnect_nodes() on each node-pair in rapid
succession, resulting in a race condition if a node disconnects a peer
in-between the calculation of the nodeid's to disconnect and the
invocation of the disconnectnode rpc call.  Handle this.

GitHub-Pull: bitcoin#13201
Rebased-From: 09c6699
laanwj added a commit that referenced this pull request May 24, 2018
acdf433 Hold cs_main while calling UpdatedBlockTip() and ui.NotifyBlockTip (Jesse Cohen)
5ff571e [wallet] [tests] Test disallowed multiwallet params (John Newbery)
4c14e7b [wallet] Fix zapwallettxes/multiwallet interaction. (John Newbery)
4087dd0 RPC Docs: gettxout*: clarify bestblock and unspent counts (David A. Harding)
b8aacd6 [qa] Handle disconnect_node race (Suhas Daftuar)

Pull request description:

  Backports:
  - #13201 [qa] Handle disconnect_node race
  - #13184 RPC Docs: gettxout*: clarify bestblock and unspent counts
  - #13030 [bugfix] [wallet] Fix zapwallettxes/multiwallet interaction.
  - #12988 Hold cs_main while calling UpdatedBlockTip() signal

  to the 0.16 branch.

Tree-SHA512: 8f65002bbafaf9c436f89051b2d79bf6a668fbd07bd317c64af238ed4a7c8efe776864b739a7f2869f1e3daa16f2f4366a85f41b188f9c454879d2c7b309be50
HashUnlimited pushed a commit to chaincoin/chaincoin that referenced this pull request Jun 29, 2018
Several tests call disconnect_nodes() on each node-pair in rapid
succession, resulting in a race condition if a node disconnects a peer
in-between the calculation of the nodeid's to disconnect and the
invocation of the disconnectnode rpc call.  Handle this.

GitHub-Pull: bitcoin#13201
Rebased-From: 09c6699
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Sep 27, 2019
Summary:
09c6699 [qa] Handle disconnect_node race (Suhas Daftuar)

Pull request description:

  Several tests call disconnect_nodes() on each node-pair in rapid
  succession, resulting in a race condition if a node disconnects a peer
  in-between the calculation of the nodeid's to disconnect and the
  invocation of the disconnectnode rpc call.  Handle this.

Tree-SHA512: 3078cea0006fcb507c812004a777c505eb1e9dda7c6df12dbbe72395a73ff6f6760f597b6492054f5487b34534417ddef5fbad30553c135c288c4b7cfce79223

Backport of Core PR13201
bitcoin/bitcoin#13201

Test Plan:
  test_runner.py

Reviewers: deadalnix, Fabien, jasonbcox, O1 Bitcoin ABC, #bitcoin_abc

Reviewed By: jasonbcox, O1 Bitcoin ABC, #bitcoin_abc

Differential Revision: https://reviews.bitcoinabc.org/D4119
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Apr 3, 2020
09c6699 [qa] Handle disconnect_node race (Suhas Daftuar)

Pull request description:

  Several tests call disconnect_nodes() on each node-pair in rapid
  succession, resulting in a race condition if a node disconnects a peer
  in-between the calculation of the nodeid's to disconnect and the
  invocation of the disconnectnode rpc call.  Handle this.

Tree-SHA512: 3078cea0006fcb507c812004a777c505eb1e9dda7c6df12dbbe72395a73ff6f6760f597b6492054f5487b34534417ddef5fbad30553c135c288c4b7cfce79223
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants