
DHT Bootstrap issue #572

Closed
jbenet opened this issue Jan 15, 2015 · 10 comments

@jbenet
Member

jbenet commented Jan 15, 2015

Currently, DHT bootstrap only runs when the whole core bootstrap runs. It turns out this is not correct, because it conflates IpfsCore's bootstrap requirements with those of the DHT. Namely:

  • IPFS core bootstrap requirement: that we're connected to at least N nodes, where N is a tunable parameter related to the number of bootstrap peers we use.
  • DHT bootstrap requirement: that we have a well-formed routing table (i.e. we have a representative sample of peers in our table).

What this yields is that nodes -- upon starting up -- will connect to a bunch of peers and run DHT bootstrap, so the DHT routing table will be well formed. As time passes, those connections may break momentarily. The core bootstrap routine doesn't do anything about it, because the node may still be connected to more than N bootstrap nodes, so it skips bootstrap altogether. Meanwhile, the DHT's routing table worsens, but the DHT doesn't do anything about it either.
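
For illustration, the skip behaviour above is roughly the following shape (the helper names here are hypothetical, not the actual go-ipfs code):

// sketch only: connectedPeerCount and connectToBootstrapPeers are
// hypothetical helpers standing in for the real connection logic.
func (core *IpfsCore) periodicBootstrap(n int) {
    if core.connectedPeerCount() >= n {
        return // "connected enough": core skips, and the DHT table is never refreshed
    }
    core.connectToBootstrapPeers()
    core.Routing.Bootstrap() // only reached on this cold path
}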

Note that this is not technically the core's problem to fix -- its requirement of being connected to N bootstrap peers is correct. And, as far as it's concerned, it called dht.Bootstrap() at least once. Now, we may or may not lift the requirement that IPFS bootstrap should include good routing-system bootstrap conditions:

2 proposed solutions:

  1. (*IpfsCore).Bootstrap() periodic routine always calls dht.Bootstrap() (never skips).
  2. IpfsDHT uses its own async worker to periodically run its own bootstrapping logic (spawned by either a call to Bootstrap or a call to NewDHT).

(1) is simpler on the DHT. (2) is simpler on the Core.

Example of 1:

func (core *IpfsCore) Bootstrap() {
    ...
    // _always_ call Bootstrap. routing can decide
    // what "being well bootstrapped" means.
    core.Routing.Bootstrap()
    ...
}

Example of 2:

dht := NewDHT(...)
dht.Bootstrap()
// dht is ready to be used, and will continue to be usable.

Feedback on the approaches is appreciated -- I'm leaning towards (2) at the moment. cc @briantigerchow @whyrusleeping
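
To make (2) concrete, here is a minimal sketch of the internal worker, assuming a hypothetical refreshRoutingTable helper and a done channel that the DHT closes on shutdown (the interval is illustrative):

func (dht *IpfsDHT) Bootstrap() {
    go func() {
        tick := time.NewTicker(5 * time.Minute) // illustrative interval
        defer tick.Stop()
        for {
            select {
            case <-tick.C:
                dht.refreshRoutingTable() // hypothetical: re-query the network and refill buckets
            case <-dht.done: // hypothetical: closed when the DHT shuts down
                return
            }
        }
    }()
}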

@whyrusleeping
Member

I think the DHT running its own async worker that periodically bootstraps to maintain its routing table is going to be the best option

@btc
Contributor

btc commented Jan 15, 2015

If the DHT knows best, then it may not make sense to get callers (IpfsNode et al) involved.

If the dht exposes bootstrapping functionality in its interface, dht.Bootstrap() should only perform the operation once and should either a) block during the operation or b) return a future. This way callers retain control over the object.

Periodic bootstrapping might work best as a constructor option. (The constructor is a mess and needs cleanup, but that's beside the point.)
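
A sketch of that constructor-option idea, using the common functional-options pattern (all names below are hypothetical):

type Option func(*IpfsDHT)

// WithPeriodicBootstrap enables an internal bootstrap worker.
func WithPeriodicBootstrap(interval time.Duration) Option {
    return func(dht *IpfsDHT) { dht.bootstrapInterval = interval }
}

func NewDHT(opts ...Option) *IpfsDHT {
    dht := &IpfsDHT{}
    for _, o := range opts {
        o(dht)
    }
    if dht.bootstrapInterval > 0 {
        // worker lifetime is managed by the DHT itself, ending when it closes
        go dht.periodicBootstrapWorker()
    }
    return dht
}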

@whyrusleeping
Member

I think the call to start the bootstrapping routine should be separate; we might have a scenario where we wish to construct a DHT but not start the bootstrapping, so:

ipfsdht := dht.NewDht(...)
go ipfsdht.Bootstrap()

@btc
Contributor

btc commented Jan 15, 2015

This is why I mentioned it as a constructor option. The lifecycle of the periodic-bootstrap goroutine is tied to the lifecycle of the DHT, so it's best to let the DHT manage that internally (so the worker closes when the DHT closes).

@whyrusleeping
Member

Ooooh, I see what you mean. 👍

@jbenet modified the milestone: α Jan 15, 2015
@jbenet self-assigned this Jan 15, 2015
@jbenet
Member Author

jbenet commented Jan 16, 2015

this is why I mention it as a constructor option.

As it stands now, this won't work correctly: we instantiate the DHT well before bootstrap finishes. The right thing to do (TRTTD) is for the network passed in to have some notion of being "online or offline", and for the DHT bootstrap to wait on that. However, that requires lots of changes to how the interfaces work right now, and IMO it's not worth it for this one fix at the moment. (This can be the way going forward, though it should be designed carefully.)

I'll opt for something in the middle:

// the bootstrap function returns a process
func (dht *IpfsDHT) Bootstrap() goprocess.Process { ... }

// if desired, the caller can store that process
bsproc := dht.Bootstrap()

// and stop it when they wish
bsproc.Close()
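
A sketch of how such a Bootstrap could be built on goprocess (bootstrapOnce is a hypothetical helper, and the interval is illustrative):

import (
    "time"

    goprocess "github.com/jbenet/goprocess"
)

func (dht *IpfsDHT) Bootstrap() goprocess.Process {
    return goprocess.Go(func(proc goprocess.Process) {
        tick := time.NewTicker(5 * time.Minute)
        defer tick.Stop()
        for {
            select {
            case <-tick.C:
                dht.bootstrapOnce() // hypothetical: one pass of routing table repair
            case <-proc.Closing(): // closed by the caller or by a parent process
                return
            }
        }
    })
}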

@jbenet
Member Author

jbenet commented Jan 16, 2015

The caller can treat it as a simple io.Closer:

var bsproc io.Closer = dht.Bootstrap()
bsproc.Close()

but it returns a goprocess so the caller can also wire up the signals:

bsproc := dht.Bootstrap()
anotherProc.AddChild(bsproc)
anotherProc.Close() // closes bsproc too

@btc
Contributor

btc commented Jan 16, 2015

(The goprocess should be able to manage closers. It simplifies the interface for users, since the only thing the process could possibly do is close the argument. This occurred to me yesterday with the context closer.)

jbenet/go-ctxgroup@f6f9601

var bsproc io.Closer = dht.Bootstrap()
anotherProc.Foo(bsproc) // closes closer 

@jbenet
Member Author

jbenet commented Jan 16, 2015

Yep, I saw: jbenet/go-ctxgroup#1 (comment)

And that's not exactly correct -- there is value in knowing it's a process and not just a closer: https://github.com/jbenet/goprocess/blob/master/impl-mutex.go#L138

goprocess can do this by attempting a cast, and otherwise wrapping the closer in a process to capture the error properly. One of the important features of goprocess is that it calls Close with at-most-once semantics, and thus never loses the returned error. Note that many closers are written such that subsequent calls to Close return nil. This works in spite of a process receiving multiple signals to close -- all callers receive the same error.
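
For example, a plain io.Closer could be wrapped so its close error is captured with those at-most-once semantics (a sketch; not necessarily how goprocess does the wrapping internally):

// wrapCloser turns an io.Closer into a process; goprocess runs the
// teardown at most once and remembers its error.
func wrapCloser(c io.Closer) goprocess.Process {
    return goprocess.WithTeardown(c.Close)
}

p := wrapCloser(someCloser) // someCloser is any io.Closer
err1 := p.Close()
err2 := p.Close() // same error as err1, even if someCloser.Close would now return nil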

jbenet added commits that referenced this issue between Jan 16 and Jan 23, 2015
@jbenet
Copy link
Member Author

jbenet commented Jan 23, 2015

Addressed in #583

@jbenet closed this as completed Jan 23, 2015
ariescodescream pushed a commit to ariescodescream/go-ipfs that referenced this issue Oct 23, 2021