server: improve initial peer bootstrapping #1205

Merged

merged 2 commits into lightningnetwork:master on Jun 13, 2018

Conversation

wpaulino
Contributor

@wpaulino wpaulino commented May 8, 2018

In this commit, we address an existing issue with regard to the initial
peer bootstrapping stage. At times, the bootstrappers can be unreliable,
providing addresses for peers that no longer exist or are currently
offline. This would lead to nodes quickly entering the exponential
backoff method used to maintain a minimum target of peers without first
achieving said target.

We address this by separating the peer bootstrapper into two stages: the
initial peer bootstrapping, and maintaining a target set of nodes to
keep an up-to-date view of the network. The initial peer bootstrapping
stage has been made aggressive in order to provide such a view of the
network as quickly as possible. Once done, we continue with the existing
exponential backoff method responsible for maintaining a target set of
nodes.

We also randomize the order of the different bootstrappers to prevent
always querying potentially unreliable bootstrappers first.
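The aggressive first stage can be sketched roughly as follows. This is a toy model: the function name, signature, and addresses are illustrative, not lnd's actual `initialPeerBootstrap`. It walks candidate addresses with no backoff, skipping failed peers, until the target number of connections is reached.

```go
package main

import "fmt"

// initialBootstrap models the aggressive first stage: walk candidate
// addresses with no backoff, skipping failed peers, until the target
// number of connections is reached. It returns the attempts made.
func initialBootstrap(target int, addrs []string, connect func(string) bool) int {
	attempts, connected := 0, 0
	for _, a := range addrs {
		if connected >= target {
			break
		}
		attempts++
		if connect(a) {
			connected++
		}
	}
	return attempts
}

func main() {
	// Only "b" and "d" are reachable in this toy network.
	reachable := map[string]bool{"b": true, "d": true}
	n := initialBootstrap(2, []string{"a", "b", "c", "d"}, func(a string) bool {
		return reachable[a]
	})
	fmt.Println("attempts:", n) // → attempts: 4
}
```

Once the target is met, the existing exponential-backoff maintenance loop takes over.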

@meshcollider meshcollider added the discovery Peer and route discovery / whisper protocol related issues/PRs label May 8, 2018
@Roasbeef Roasbeef added this to the 0.5 milestone May 8, 2018
@Roasbeef Roasbeef assigned aakselrod and unassigned aakselrod May 9, 2018
@Roasbeef Roasbeef requested a review from aakselrod May 9, 2018 23:29
Contributor

@halseth halseth left a comment


Really like how small the diff for this PR turned out :) A much needed optimization!

server.go Outdated
// new connection via our public endpoint, which will require the lock
// and add the peer to the server's internal state.
if done != nil {
done <- struct{}{}
Contributor

you can close the channel instead, which will never block :)

Contributor Author

Fixed.
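The pattern suggested above — closing the channel rather than sending on it — never blocks and releases every receiver; a minimal illustration:

```go
package main

import "fmt"

// signalDone signals completion by closing done. Unlike a send
// (done <- struct{}{}), close never blocks, and every current and
// future receive completes immediately.
func signalDone(done chan struct{}) {
	close(done)
}

func main() {
	done := make(chan struct{})
	signalDone(done)

	// Receives on a closed channel succeed immediately, any number
	// of times.
	<-done
	<-done
	fmt.Println("signalled")
}
```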

server.go Outdated
if err != nil {
return err
}

// ignore is a set used to keep track of peers already retrieved
// from our bootstrappers in order to avoid duplicates.
ignore := make(map[autopilot.NodeID]struct{})
Contributor

this can be initialized inside initialPeerBootstrap?

Contributor Author

Yep, forgot to move this. Fixed.

server.go Outdated
// quickly by discarding peers that are slowing
// us down.
select {
case <-done:
Contributor

If the connection fails, then this won't be sent on, and we will wait the 3 seconds before trying the next peers. Should we instead make this channel a chan error, where we can send an error in case it fails, and close it in case it succeeds?

Contributor Author

Good idea, fixed!
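The resulting pattern — send an error on failure, close on success — might look like this (illustrative only, not the PR's exact `connectToPeer`):

```go
package main

import (
	"errors"
	"fmt"
)

// reportResult delivers a connection outcome on errChan: a send for
// failure, a close for success. Either way the receiver unblocks
// immediately instead of waiting out the full timeout.
func reportResult(fail bool, errChan chan error) {
	if fail {
		errChan <- errors.New("connection failed")
		return
	}
	close(errChan)
}

func main() {
	errChan := make(chan error, 1)
	reportResult(true, errChan)
	fmt.Println(<-errChan) // → connection failed

	errChan = make(chan error, 1)
	reportResult(false, errChan)
	// Receiving from the closed channel yields the zero value, nil.
	fmt.Println(<-errChan) // → <nil>
}
```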

Roasbeef
Roasbeef previously approved these changes Jun 13, 2018
Member

@Roasbeef Roasbeef left a comment


LGTM 👌

Tested locally and confirmed that we'll aggressively converge on our initial set of peers as advertised.

@Roasbeef
Member

Needs a rebase to fix conflicts as well, also we can manually implement the shuffle using rand.Perm.
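A shuffle via rand.Perm, as suggested, could look like this (the bootstrapper names here are placeholders):

```go
package main

import (
	"fmt"
	"math/rand"
)

// shuffleBootstrappers returns the bootstrappers in a random order by
// indexing through a random permutation of [0, n).
func shuffleBootstrappers(bootstrappers []string) []string {
	shuffled := make([]string, len(bootstrappers))
	for i, j := range rand.Perm(len(bootstrappers)) {
		shuffled[i] = bootstrappers[j]
	}
	return shuffled
}

func main() {
	bs := []string{"dns-seed", "channel-graph", "static"}
	fmt.Println(len(shuffleBootstrappers(bs))) // → 3, in a random order
}
```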

@wpaulino wpaulino force-pushed the initial-peer-bootstrap branch 2 times, most recently from c216ba5 to 8826b8d on June 13, 2018 02:27
@wpaulino
Contributor Author

Rebased.

halseth
halseth previously approved these changes Jun 13, 2018
Contributor

@halseth halseth left a comment


LGTM, but I think it could be useful to add some more logs to the logic, for easier debugging of potential connection issues.

server.go Outdated
// quickly by discarding peers that are slowing
// us down.
select {
case <-errChan:
Contributor

log error?

Contributor Author

Fixed.

server.go Outdated
case <-errChan:
// TODO: tune timeout? 3 seconds might be *too*
// aggressive but works well.
case <-time.After(3 * time.Second):
Contributor

handle the timeout somehow? At least log.

Contributor Author

The timeout doesn't need to be handled since we'll just continue to the next peer. Added a trace log though.

server.go Outdated
errChan := make(chan error, 1)
s.connectToPeer(a, errChan)
select {
case <-errChan:
Contributor

log error?

Contributor Author

Fixed.

select {
case err := <-errChan:
return err
case <-s.quit:
Contributor

can also return ErrServerShuttingDown in this case

Contributor Author

Fixed.
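The shutdown-aware wait then reads roughly like this. ErrServerShuttingDown is modelled here as a local sentinel; lnd defines its own in server.go:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrServerShuttingDown is an illustrative sentinel error mirroring
// the one suggested in the review.
var ErrServerShuttingDown = errors.New("server is shutting down")

// waitWithQuit blocks until a connection result arrives or the server
// signals shutdown by closing the quit channel.
func waitWithQuit(errChan chan error, quit chan struct{}) error {
	select {
	case err := <-errChan:
		return err
	case <-quit:
		return ErrServerShuttingDown
	}
}

func main() {
	quit := make(chan struct{})
	close(quit) // simulate shutdown
	fmt.Println(waitWithQuit(make(chan error), quit)) // → server is shutting down
}
```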

In this commit, we address an existing issue with regard to the initial
peer bootstrapping stage. At times, the bootstrappers can be unreliable,
providing addresses for peers that no longer exist or are currently
offline. This would lead to nodes quickly entering the exponential
backoff method used to maintain a minimum target of peers without first
achieving said target.

We address this by separating the peer bootstrapper into two stages: the
initial peer bootstrapping, and maintaining a target set of nodes to
keep an up-to-date view of the network. The initial peer bootstrapping
stage has been made aggressive in order to provide such a view of the
network as quickly as possible. Once done, we continue with the existing
exponential backoff method responsible for maintaining a target set of
nodes.

In this commit, we randomize the order of the different bootstrappers in
order to prevent always querying potentially unreliable bootstrappers
first.
@Roasbeef Roasbeef merged commit dddbfa7 into lightningnetwork:master Jun 13, 2018
@wpaulino wpaulino deleted the initial-peer-bootstrap branch June 13, 2018 23:53