Seednode CLOSE_REQUESTED_BY_PEER + Timeout == 90s delay in startup #3797
Comments
The seednodes should not intentionally disconnect nodes. We should find out why that is the case (as you stated, they are probably overloaded, or maybe it's at the restart process...).

@wiz Could you have a look at this issue?

This is a tricky one. Making a change to this behavior (in order to address this issue) might result in huge testing efforts. I suggest a wont-fix, as we have other countermeasures coming up. @ripcurlx @sqrrm?

Can you add monitoring for the number of connections on each seednode and display it on Grafana?

I expect the number of connections for each seednode to be close to their maximum (50 or something?), simply because we only have 8 seednodes and each client fires up 2 connections -> 200 clients can be connected to the seednodes at any point in time. If another client comes along, the seednode decides which connection to close according to the rules. However, you might be able to grep a …
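For reference, the back-of-the-envelope math in that comment as a runnable sketch; the 50-connection cap is the commenter's guess, not a confirmed configuration value:

```java
// Rough seednode capacity estimate from the comment above.
// All numbers are assumptions, not confirmed Bisq configuration values.
public class SeednodeCapacity {
    public static void main(String[] args) {
        int seednodes = 8;             // current number of seednodes
        int maxConnsPerSeednode = 50;  // guessed per-seednode connection cap
        int connsPerClient = 2;        // each client opens 2 seednode connections

        int totalSlots = seednodes * maxConnsPerSeednode;  // 400 connection slots
        int maxClients = totalSlots / connsPerClient;      // 200 concurrent clients
        System.out.printf("Slots: %d, concurrent clients: %d%n", totalSlots, maxClients);
    }
}
```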
Description
During startup, the Bisq client connects to 2 seednodes to sync the initial data. In the event that one seednode closes the connection with CLOSE_REQUESTED_BY_PEER and the other times out, the client hangs at (3/4) for the full 90s timeout window.
There isn't much that can be done about timeouts, but if a seednode closes the connection intentionally, the client could immediately send another request in its place rather than waiting for the "backup" connection to time out (see the sketch below).
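A minimal sketch of that idea, detached from the actual Bisq networking code: fire the two initial requests in parallel and, when one fails fast (e.g. the peer closed it), immediately start a replacement from a backup list instead of sitting out the other request's 90s timeout. Everything here (requestFrom, the node names) is hypothetical, not a Bisq API:

```java
import java.util.Deque;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical illustration only: requestFrom(node) stands in for the real
// seednode data request; in practice it could fail fast (peer-requested close)
// or slowly (timeout).
public class FastRetrySketch {
    static CompletableFuture<String> requestFrom(String node) {
        return CompletableFuture.supplyAsync(() -> "data from " + node);
    }

    // If a request fails, immediately retry with the next backup seednode
    // instead of leaving the startup sequence blocked on one slow connection.
    static CompletableFuture<String> requestWithReplacement(String node, Deque<String> backups) {
        return requestFrom(node)
                .handle((result, err) -> {
                    if (err == null)
                        return CompletableFuture.completedFuture(result);
                    String next = backups.poll();
                    return next == null
                            ? CompletableFuture.<String>failedFuture(err)
                            : requestWithReplacement(next, backups);
                })
                .thenCompose(f -> f);
    }

    public static void main(String[] args) {
        Deque<String> backups = new ConcurrentLinkedDeque<>(List.of("seed3", "seed4"));
        CompletableFuture<String> a = requestWithReplacement("seed1", backups);
        CompletableFuture<String> b = requestWithReplacement("seed2", backups);
        CompletableFuture.allOf(a, b).join();
        System.out.println(a.join() + " | " + b.join());
    }
}
```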
I'm not sure if we track metrics on seednode intentional disconnects, but this might be a sign of overloaded seednodes.
Version
v1.2.4
Steps to reproduce
I haven't deliberately reproduced this scenario for testing, but in my normal usage of Bisq there have been a few instances where (3/4) hangs for the full 90s before cycling to another seednode and proceeding. The logs indicate the error described above.
Expected behaviour
If a seednode intentionally disconnects a starting client, the client will retry with another valid seednode "immediately".
Actual behaviour
Connections intentionally closed by the seednode are not restarted with another valid seednode.
Screenshots
Device or machine
Ubuntu 19.10
Additional info
I don't have a lot of experience with the networking code, but it looks like RequestDataManager::onDisconnect only removes the handler from the internal map and doesn't trigger any action.

Dec-16 09:24:25.738 - CLOSE_REQUESTED_BY_PEER (Seednode 1)
Dec-16 09:25:40.094 - Timeout (Seednode 2)
Dec-16 09:25:43.691 - Retry
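A rough, self-contained sketch of the direction that observation suggests for the handler; only the onDisconnect name and CLOSE_REQUESTED_BY_PEER come from this issue, and everything else (the enum values, handlerMap, retryWithAnotherSeednode) is a hypothetical stand-in for whatever RequestDataManager actually uses internally:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only, NOT the real Bisq classes.
public class RequestDataManagerSketch {
    enum CloseConnectionReason { CLOSE_REQUESTED_BY_PEER, SOCKET_TIMEOUT, OTHER }

    private final Map<String, Object> handlerMap = new ConcurrentHashMap<>();

    public void onDisconnect(CloseConnectionReason reason, String peerAddress) {
        handlerMap.remove(peerAddress);  // roughly what happens today: cleanup only

        if (reason == CloseConnectionReason.CLOSE_REQUESTED_BY_PEER) {
            // The seednode dropped us deliberately; waiting out the surviving
            // request's 90s timeout gains nothing, so retry right away.
            retryWithAnotherSeednode(peerAddress);
        }
    }

    private void retryWithAnotherSeednode(String failedPeer) {
        System.out.println("would immediately request data from a seednode other than " + failedPeer);
    }
}
```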