Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find and switch peers #4408

Merged
merged 17 commits into from Jun 28, 2022
Merged

Find and switch peers #4408

merged 17 commits into from Jun 28, 2022

Conversation

solo1g
Copy link
Contributor

@solo1g solo1g commented Jun 20, 2022

Should do the following:

  • Find a peer to sync with
  • Try syncing with it
  • If any issues with that peer (disconnection, too many connection drops, ignored queries etc), then find and use another.

Key changes

  • Added a PeerFinder which deals with peer discovery. Moved relevant parts from PeerManager to PeerFinder.
  • Added functionality to query peers and get notified if the query times out. Used to switch peers in case the current one does not work. Adds timeouts to Initializing and a new state Waiting in PeerMessageReceiver.
  • Changed InitializeDisconnect state to now disconnect when connecting too. Added StopReconnect state to PeerMessageReceiver to stop reconnect from our side when not connected.
  • Some fixes dealing with edge cases in P2PClient and PeerMessageReceiver.

@solo1g solo1g force-pushed the mp-switch branch 2 times, most recently from ea29d12 to 6a14c00 Compare June 21, 2022 13:25
@solo1g
Copy link
Contributor Author

solo1g commented Jun 21, 2022

So this should be working as intended now. I will leave comments on all the changes before taking the draft off. Meanwhile, this could be tested by running with the following config

peers = []
maxConnectedPeers = 8
enable-peer-discovery = true
use-default-peers = false

@Christewart Christewart added node work for the node project neutrino Implements the Bitcoin-S Neutrino node labels Jun 21, 2022
@Christewart Christewart added this to the 2.0 milestone Jun 21, 2022
@solo1g solo1g mentioned this pull request Jun 22, 2022
@@ -166,13 +166,27 @@ object GetDataMessage extends Factory[GetDataMessage] {
GetDataMessage(Seq(inventory))
}

sealed trait ExpectsResponse {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I needed a way to if the peer failed to respond to a query for eg getheaders etc that is used in sync, so that I can switch to a different peer and continue sync from there. For NetworkPayload that extends this, an ExpectReponse message is sent to the P2PClient right after sending the message. The message initiates a timer in PeerMessageReceiver that notifies if the message timed out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this common when testing? This is an interesting blog post about the bitcoin p2p network possibly slowing down: https://blog.lopp.net/is-bitcoin-network-slowing-down/

it details the logic bitcoin core has about detecting peers that are stalling, probably worth a read. It seems the behavior detailed there is what you experienced as well?

Copy link
Contributor Author

@solo1g solo1g Jun 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting 👀 .
Not very common but did see this a few times. Instead of being connected and not responding, what I found more common was for the peer to straight up disconnect us after initialization. Reconnection would be successful followed by the same behavior.

Current behavior is for timeouts to complete instantly and requery another peer (same if that's all we have) in case of a disconnection and wait for the whole duration in case its still connected. This covers both cases.

f <- isAllDisconnectedF(node)
} yield f
disconnF.map(assert(_))
def peerManager = node.peerManager
Copy link
Contributor Author

@solo1g solo1g Jun 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did away with using indexes and added check to ensure we have 2 peers.

}

//todo: so the disconnection should rather originate from the remote client rather than our side
ignore must "sync with another second peer if the first one is disconnected" in {
Copy link
Contributor Author

@solo1g solo1g Jun 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reconnect after stopping was throwing some exception about a closed port. Looking into it, didn't want to block review on minor stuff so marked to ignore test at the moment,

@@ -26,7 +26,7 @@ case class NeutrinoNode(
nodeConfig: NodeAppConfig,
chainConfig: ChainAppConfig,
actorSystem: ActorSystem,
configPeersOverride: Vector[Peer] = Vector.empty)
paramPeers: Vector[Peer] = Vector.empty)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we don't override the config ones anymore, rather use both. This was one of the suggestion earlier so added in this.

val sendCompactFilterHeaderMsgF = {
peerManager.randomPeerMsgSenderWithCompactFilters
syncPeerMsgSender
Copy link
Contributor Author

@solo1g solo1g Jun 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So unless the peers are all ideal, can't really use a random one, besides without parallel sync, not much point in switching. The previous change too was by me, so fixed it now. Would add this later with properly working parallel sync.

}
isInitializedF
def send(msg: NetworkPayload, peer: Peer): Future[Unit] = {
peerManager.peerData(peer).peerMessageSender.sendMsg(msg)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alll of this is now in PeerManager

onReconnect = node.sync
onReconnect = node.peerManager.onReconnect,
onStop = node.peerManager.onP2PClientStopped,
maxReconnectionTries = 4
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed max attempts to 4 as this would roughly be the time that we spend on each peer waiting for it to initialize. Ig all of these related constant need to be put together.

extends P2PLogger {

/** Returns peers by querying each dns seed once. These will be IPv4 addresses. */
def getPeersFromDnsSeeds: Vector[Peer] = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this from manager to a new Finder to separate things.

logger.debug(s"Removing persistent peer $peer")
val client = peerData(peer).client
_peerData.remove(peer)
//so we need to remove if from the map for connected peers so no more request could be sent to it but we before
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we don't want a peer as one of our persistent ones, we remove it from the peerData map, but still keep it around to check if the client actor for it actually stopped.

@Christewart
Copy link
Contributor

I was able to sync correctly on c5f0f9c with the default out of the box settings 🎉 . I'm going to start playing with the various settings introduced in this PR but wanted to document that things do seem to work out of the box.

Screenshot from 2022-06-24 08-41-51

@Christewart
Copy link
Contributor

Sync worked correctly for me on c5f0f9c

with these settings

bitcoin-s.node.use-default-peers = false
bitcoin-s.node.enable-peer-discovery = true

@Christewart
Copy link
Contributor

Christewart commented Jun 24, 2022

Things work on c5f0f9c for me with IBD and running a few hours with the following settings

bitcoin-s.node.use-default-peers = false
bitcoin-s.node.enable-peer-discovery = true
bitcoin-s.node.maxConnectedPeers = 8

IIUC what the logs means, it should be reworded to something about disconnecting. Also does this mean they disconnected from us or does it mean we disconnected from them?

2022-06-24T16:49:33UTC WARN [PeerManager] onP2PClientStopped called for unknown Peer(51.38.11.15:8333)

@solo1g
Copy link
Contributor Author

solo1g commented Jun 25, 2022

2022-06-24T16:49:33UTC WARN [PeerManager] onP2PClientStopped called for unknown Peer(51.38.11.15:8333)

Was this common? Ideally, you should not be seeing this. This basically means somewhere down the line a peer got "lost" from the PeerManager before it's ciient actor was stopped.

Edit: Fixed this. This happened when one of the peers while it still is initalizing is received again in an addr message and another actor is created for it while still having only one entry in peerData.

@solo1g solo1g requested a review from Christewart June 27, 2022 08:22
@Christewart Christewart merged commit 42564bc into bitcoin-s:master Jun 28, 2022
@Christewart Christewart added this to Done in 2.0 Jun 28, 2022
Christewart pushed a commit that referenced this pull request Jul 4, 2022
* fix node test shared fixtures bug

The cached bitcoind fixtures were used and stopped in UnsyncedNeutrinoNodeTest which causes an error if NeutrinoNodeTest is run at the same time on high performant systems, which is why it escaped CI. Merges NeutrinoNodeTest and UnsyncedNeutrinoNodeTest

* fix possible issues with PeerMessageReceiverTest

This reverts commit 55e7caf.

* fix filter sync issue when wallet creation time indicates already synced

* move switch peer test to NeutrinoNodeWithUnachedBitcoindTest
@Christewart Christewart modified the milestones: 2.0, 1.9.2, 1.9.3 Aug 6, 2022
@Christewart Christewart removed this from Done in 2.0 Aug 6, 2022
@Christewart Christewart added this to In Progress in 1.9.3 via automation Aug 6, 2022
@Christewart Christewart moved this from In Progress to Done in 1.9.3 Aug 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
neutrino Implements the Bitcoin-S Neutrino node node work for the node project
Projects
No open projects
1.9.3
Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants