Cherry pick new dial scheduler from upstream #1192
…20592)

Conflicts:
p2p/dial.go
p2p/server.go
p2p/server_test.go

* p2p: new dial scheduler

This change replaces the peer-to-peer dial scheduler with a new and improved implementation. The new code is better than the previous implementation in two key aspects:

- The time between discovery of a node and dialing that node is significantly lower in the new version. The old dialState kept a buffer of nodes and launched a task to refill it whenever the buffer became empty. This worked well with the discovery interface we used to have, but doesn't really work with the new iterator-based discovery API.
- Selection of static dial candidates (created by Server.AddPeer or through static-nodes.json) performs much better for large amounts of static peers.

Connections to static nodes are now limited like dynamic dials and can no longer overstep MaxPeers or the dial ratio.

* p2p/simulations/adapters: adapt to new NodeDialer interface
* p2p: re-add check for self in checkDial
* p2p: remove peersetCh
* p2p: allow static dials when discovery is disabled
* p2p: add test for dialScheduler.removeStatic
* p2p: remove blank line
* p2p: fix documentation of maxDialPeers
* p2p: change "ok" to "added" in static node log
* p2p: improve dialTask docs

Also increase log level for "Can't resolve node"

* p2p: ensure dial resolver is truly nil without discovery
* p2p: add "looking for peers" log message
* p2p: clean up Server.run comments
* p2p: fix maxDialedConns for maxpeers < dialRatio

Always allocate at least one dial slot unless dialing is disabled using NoDial or MaxPeers == 0. Most importantly, this fixes MaxPeers == 1 to dedicate the sole slot to dialing instead of listening.

* p2p: fix RemovePeer to disconnect the peer again

Also make RemovePeer synchronous and add a test.

* p2p: remove "Connection set up" log message
* p2p: clean up connection logging

We previously logged outgoing connection failures up to three times:

- in SetupConn() as "Setting up connection failed addr=..."
- in setupConn() with an error-specific message and "id=... addr=..."
- in dial() as "Dial error task=..."

This commit ensures a single log message is emitted per failure and adds "id=... addr=... conn=..." everywhere (id= omitted when the ID isn't known yet).

Also avoid printing a log message when a static dial fails but can't be resolved because discv4 is disabled. The light client hit this case all the time, increasing the message count to four lines per failed connection.

* p2p: document that RemovePeer blocks
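For readers following the maxDialedConns fix in the commit message above, here is a minimal sketch of the rule it describes: no dial slots when dialing is disabled or MaxPeers is 0, otherwise at least one. This is not the code from the PR. The standalone function signature and the fallback ratio of 3 are assumptions made for illustration.

```go
package main

import "fmt"

// defaultDialRatio is an assumed fallback used when no ratio is configured.
const defaultDialRatio = 3

// maxDialedConns is a hypothetical standalone version of the limit
// calculation: it returns how many peer slots may be used for outbound dials.
func maxDialedConns(maxPeers, dialRatio int, noDial bool) int {
	// Dialing is disabled entirely: no slots.
	if noDial || maxPeers == 0 {
		return 0
	}
	if dialRatio == 0 {
		dialRatio = defaultDialRatio
	}
	limit := maxPeers / dialRatio
	// Integer division yields 0 when maxPeers < dialRatio, so always
	// keep at least one slot for dialing.
	if limit == 0 {
		limit = 1
	}
	return limit
}

func main() {
	fmt.Println(maxDialedConns(1, 0, false))  // sole slot goes to dialing
	fmt.Println(maxDialedConns(25, 0, false)) // 25 / 3
	fmt.Println(maxDialedConns(25, 0, true))  // dialing disabled
}
```

With MaxPeers == 1 the integer division alone would give zero dial slots, which is the case the commit calls out.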
Static peers are ones we explicitly want to connect to, and so should not be subject to this limit. This is especially important since connections between validators/proxies and other validators/proxies rely on this being so, and because that's how it was prior to the new dial scheduler from upstream.
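The exemption described above can be pictured roughly as follows. This is only a sketch under assumptions: dialKind, canStartDial, and the slot counters are hypothetical names, not identifiers from the PR, and the real scheduler performs additional checks that are left out here.

```go
package main

import "fmt"

// dialKind distinguishes the two dial sources (hypothetical type).
type dialKind int

const (
	dynamicDial dialKind = iota // found through discovery
	staticDial                  // explicitly requested, e.g. via AddPeer
)

// canStartDial reports whether a new dial may be started. Static dials
// skip the slot check; dynamic dials must fit under maxDynamic.
func canStartDial(kind dialKind, activeDynamic, maxDynamic int) bool {
	if kind == staticDial {
		return true
	}
	return activeDynamic < maxDynamic
}

func main() {
	// All dynamic slots are in use in both calls.
	fmt.Println(canStartDial(dynamicDial, 8, 8)) // false
	fmt.Println(canStartDial(staticDial, 8, 8))  // true
}
```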
Removing the "do not merge" tag, now that I've added the commit to exempt static dials from the limits. Unfortunately it involved considerable changes to the unit tests, but at the same time that's a good thing in that it indicates the unit tests were testing what they should be.
p2p/server.go
Outdated
@@ -833,6 +851,10 @@ func (srv *Server) run(dialstate dialer) {
p.RemovePurpose(purpose)
}
}
if ch == nil { |
I know all these changes come from master, but:
nit: I think ch == nil is a convoluted way of asking: HasNoPurpose && not in peers[id]. I would try to make it more explicit what ch == nil actually means (change it to something else).
In fact, I would make ch and sub local variables of the goroutine, and thus reduce their scope. There's no need for them to be visible to the whole function.
Also, I now realize that when removing a static peer we subscribe to all events from all peers; there's probably no easier way of doing this, but it seems like overkill.
Sounds good, I'll move them and add a boolean called disconnecting that can then be used in the if.
As for the subscription, it does look like there is no way to subscribe to only certain events, so there's not much else we could do.
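To make the suggestion in this thread concrete, here is a rough, self-contained sketch of the pattern being discussed: an explicit disconnecting boolean instead of a nil-channel check, with the event channel kept local to the code that waits on it. Everything here (peerEvent, removePeer, waitForDrop, the subscribe and disconnect callbacks) is hypothetical and greatly simplified. It does not model purposes or the real event feed, and it is not the code that ended up in the PR.

```go
package main

import "fmt"

// peerEvent is a simplified stand-in for a peer event (hypothetical type).
type peerEvent struct {
	peer string
	drop bool
}

// waitForDrop reads events until a drop event for id arrives. It reports
// false if the channel is closed first. All events from all peers pass
// through here, so unrelated ones are skipped.
func waitForDrop(events <-chan peerEvent, id string) bool {
	for ev := range events {
		if ev.peer == id && ev.drop {
			return true
		}
	}
	return false
}

// removePeer disconnects id and blocks until its drop event is seen.
// It returns false without waiting if the peer is not connected.
func removePeer(peers map[string]bool, id string,
	subscribe func() <-chan peerEvent, disconnect func(string)) bool {
	// State the condition by name rather than encoding it as "ch != nil".
	disconnecting := peers[id]
	if !disconnecting {
		return false
	}
	// The channel exists only in the scope that needs it. Subscribe before
	// disconnecting so the drop event cannot be missed.
	events := subscribe()
	disconnect(id)
	return waitForDrop(events, id)
}

func main() {
	peers := map[string]bool{"a": true, "b": true}
	feed := make(chan peerEvent, 4)
	subscribe := func() <-chan peerEvent { return feed }
	disconnect := func(id string) {
		feed <- peerEvent{peer: "b", drop: true} // unrelated event
		feed <- peerEvent{peer: id, drop: true}
	}
	fmt.Println(removePeer(peers, "a", subscribe, disconnect))       // true
	fmt.Println(removePeer(peers, "unknown", subscribe, disconnect)) // false
}
```

The filter in waitForDrop reflects the point about subscriptions: since only a subscription to all events is available, the waiting code has to discard everything that is not the drop event it cares about.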
This increment was moved lower down in the function, but not removed here in the conflict resolution in #1192.
Description
Cherry pick the new dial scheduler from upstream. Also cherry-picks ethereum/go-ethereum#20688 because it is a closely-related follow-up to the dial scheduler.
Marked with "do not merge" until I add another commit, as explained below.
Merge notes:
removestaticDone to let us know when the peer has been disconnected fully (using the same method that they used, but in a different place now)
Tested
Related issues
Backwards compatibility
This only modifies how the node starts new connections to other nodes, and doesn't affect the protocols used between nodes, so there aren't backwards compatibility issues.