Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New connection strategy #1807

Closed
wants to merge 13 commits into from

Conversation

@vogel
Copy link
Contributor

commented Oct 10, 2018

Description

Partially resolves #1753

This PR adds new outboundvalid field to PeerRecord that is stored in database. outboundvalid means that at least once it was possible to make valid outbound connection to that peer. Peers with outboundvalid flags are first to try to join to and first to send to other peers as a response to GETPEERS message.

This PR also makes MAX_PENDING_PEERS setting only applicable to incoming connections, as OverlayManagerImpl already has logic for initiating only proper number of outgoing connections at any time, and it would be bad to have all PENDING slots taken by inbound connection from misbehaving peers.

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document
@MonsieurNicolas
Copy link
Contributor

left a comment

Looks good overall! Not too complicated.

I think it plays nice with existing peers too.

There are a couple things missing, use of random (to make the network better connected), and changes to getPeers.

Connection strategy

  • When picking peers to connect to, we should pick a random subset of peers for which Nextattempt is in the past. Right now we have strict ordering with nextattempt

Promotion strategy:

  • When picking peers to promote, we should pick a random subset of peers for which Nextattempt is in the past.

Updates to getPeers

  • We should send random peers from connpeers and complement with random peers from peers
src/main/CommandHandler.cpp Show resolved Hide resolved
src/overlay/OverlayManager.h Outdated Show resolved Hide resolved
src/overlay/OverlayManagerImpl.h Outdated Show resolved Hide resolved
src/overlay/OverlayManagerImpl.cpp Outdated Show resolved Hide resolved
src/overlay/OverlayManagerImpl.cpp Outdated Show resolved Hide resolved
src/overlay/Peer.cpp Outdated Show resolved Hide resolved
src/overlay/Peer.cpp Outdated Show resolved Hide resolved
src/main/Config.cpp Outdated Show resolved Hide resolved
src/overlay/OverlayManagerImpl.cpp Outdated Show resolved Hide resolved
src/overlay/OverlayManagerImpl.cpp Outdated Show resolved Hide resolved

@MonsieurNicolas MonsieurNicolas added this to To Do in v10.1.0 via automation Oct 26, 2018

@MonsieurNicolas MonsieurNicolas moved this from To Do to In progress in v10.1.0 Oct 26, 2018

@vogel

This comment has been minimized.

Copy link
Contributor Author

commented Nov 30, 2018

Updated

@MonsieurNicolas MonsieurNicolas added this to In progress in v10.2.0 via automation Dec 3, 2018

@MonsieurNicolas MonsieurNicolas removed this from In progress in v10.1.0 Dec 3, 2018

src/overlay/PeerRecord.cpp Outdated Show resolved Hide resolved
src/overlay/RandomPeerSource.cpp Outdated Show resolved Hide resolved
src/overlay/RandomPeerSource.cpp Outdated Show resolved Hide resolved
src/overlay/RandomPeerSource.h Outdated Show resolved Hide resolved
src/overlay/RandomPeerSource.cpp Outdated Show resolved Hide resolved
src/overlay/Peer.h Outdated Show resolved Hide resolved
src/overlay/OverlayManagerImpl.cpp Outdated Show resolved Hide resolved
@MonsieurNicolas MonsieurNicolas referenced this pull request Dec 6, 2018
@MonsieurNicolas
Copy link
Contributor

left a comment

some comments to look into

src/overlay/OverlayManagerImpl.cpp Outdated Show resolved Hide resolved
src/overlay/RandomPeerSource.cpp Outdated Show resolved Hide resolved
src/overlay/Peer.cpp Show resolved Hide resolved
@@ -629,7 +629,8 @@ CommandHandler::dropPeer(std::string const& params, std::string& retStr)
auto peer = peers.find(n);
if (peer != peers.end())
{
peer->second->drop(ERR_MISC, "dropped by user");
peer->second->drop(ERR_MISC, "dropped by user",

This comment has been minimized.

Copy link
@MonsieurNicolas

MonsieurNicolas Dec 15, 2018

Contributor

why do we care to distinguish the two modes? It seems that this complexity doesn't really buy us anything

This comment has been minimized.

Copy link
@vogel

vogel Dec 19, 2018

Author Contributor

There are two places where we need to flush write queue before closing connection.
One is when we are dropping peer because of load, and we want to send list of peers to it - before dropping we need to really send that list.

The other one is for promotion to outbound. We need to fully send our hello message in response to other peers hello (because we will drop in a moment, as in case of promotion we are already connected to that peer), so it can see that we can handshake and set outbound value to true.

This comment has been minimized.

Copy link
@MonsieurNicolas

MonsieurNicolas Dec 19, 2018

Contributor

you didn't really answer my question: yes we can have this special handling, my question is, why bother with the "no flush" version?

This comment has been minimized.

Copy link
@vogel

vogel Dec 20, 2018

Author Contributor

So you wan't all drops do the flush? We can do that, that should not do any harm.

This comment has been minimized.

Copy link
@MonsieurNicolas

MonsieurNicolas Dec 20, 2018

Contributor

yeah how master works on that front seems fine

This comment has been minimized.

Copy link
@vogel

vogel Dec 21, 2018

Author Contributor

Current master drops messages for all cases except when dropping because of ERR_LOAD (to send peers).

return;
}

updatePeerRecordAfterEcho();

This comment has been minimized.

Copy link
@MonsieurNicolas

MonsieurNicolas Dec 15, 2018

Contributor

why do we do anything before auth? Previous code seems actually simpler to follow

This comment has been minimized.

Copy link
@vogel

vogel Dec 19, 2018

Author Contributor

Its also for promotion. If some peer connected to us, we may want to promote it to outbound peer. So we connect to it, send our hello and receive hello from it. Then we need to store that this peere is outbound before we reject it because we are already connected to it. So it need to happen inside recvHello but before it drops in response to predicates isAuthenticated or isPending that are checked at the end of recvHello.

This comment has been minimized.

Copy link
@MonsieurNicolas

MonsieurNicolas Dec 19, 2018

Contributor

still not following in the context of this commit: we don't drop a peer that early.

The check if (getRole() == authenticatedIt->second->getRole()) ensures that we don't drop a peer already connected to us if all we're doing is promotion.

so we should always reach the call to updatePeerRecordAfterAuthentication in that situation.

Maybe I am missing something

This comment has been minimized.

Copy link
@vogel

vogel Dec 20, 2018

Author Contributor

This condition was removed later: 2ab5445#diff-239e869bd0b671610b35830852c1cbadL1038

Ordering of changes will be fixed during rebase/squash phase.

src/main/Config.cpp Outdated Show resolved Hide resolved
src/overlay/RandomPeerSource.cpp Outdated Show resolved Hide resolved
src/overlay/RandomPeerSource.cpp Outdated Show resolved Hide resolved
});
newMsg.peers().reserve(peerList.size());
for (auto const& pr : peerList)
auto peers = mApp.getOverlayManager().getRandomPeerSource().getRandomPeers(

This comment has been minimized.

Copy link
@MonsieurNicolas

MonsieurNicolas Dec 15, 2018

Contributor

just realized that there is a small bug here (that was there in master as well I think): because we call backoff when we connect to peers, that means that pending peers (that may be good!) are not returned in this list.

There is another issue here (that may allow us to fix that issue): we are sharing the "randomPeerSource" with the one used for connecting to other peers.
We should not do this: this creates a bunch of problems as it makes the distribution potentially biases for connecting outbound.
By separating those, we can also change the select used for building this list of peers: we don't need to filter by date at all for this scenario, only failure count (don't use peers with failure count more than some high number like 9).

src/overlay/OverlayManagerImpl.cpp Show resolved Hide resolved

@vogel vogel force-pushed the vogel:new-connection-strategy branch 4 times, most recently from 2fa47dc to 1dd268e Jan 8, 2019

vogel added 7 commits Jan 10, 2019
rename PeerRecord files to PeerManager
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
test if preferred wins in both ways
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
test limits of pending peers for both ways
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
use PeerManager to get access to stored peer data
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
add new outbound field to PeerRecord
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
add DropMode for each drop call
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>

@vogel vogel force-pushed the vogel:new-connection-strategy branch from 1dd268e to 011d036 Jan 14, 2019

vogel added 6 commits Jan 10, 2019
split peer data in OverlayManager into inbound and outbound
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
bettter split available connections between inbound and outbound
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
implement outbound promotion
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
allow extra pending connnection from preferred peers
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>
improve connecting to preferred peers at each tick
Signed-off-by: Rafał Malinowski <rafal.przemyslaw.malinowski@gmail.com>

@vogel vogel force-pushed the vogel:new-connection-strategy branch from 011d036 to 1ee4647 Jan 14, 2019

@vogel vogel closed this Jan 15, 2019

v10.2.0 automation moved this from In progress to Done Jan 15, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
3 participants
You can’t perform that action at this time.