Service Discovery instead of Protocol Negotiation #32

Stebalien · 2017-08-15T22:01:22Z

Currently, there's a lot of overhead when each connection is established opening up multiple streams to figure out what services the other peer supports. This also tends to be a bit racy and really messy.

Instead, we could introduce a push-based service discovery system. On connect, a peer would announce what protocols it supports up-front (the list shouldn't be that large) and whenever a new service gets added/removed, the peer would tell its peers about this change. We could then avoid an entire round trip when opening new streams because we already know the protocols our peers support. This would also remove the need for a lot of our peer probing code (we could just have the host keep track of which connections support which protocols).

This would also make it possible to implement a pseudo message-oriented system on-top of our stream-oriented communication primitives. Why? Because most consumers of libp2p in go-ipfs are already (mostly) message oriented. That is, we have, for almost every service:

Something to keep track of peers and metadata about them (and keep track of the status of the connection).
Go routines to accept messages from peers and pipe it into a central event loop.
Go routines to send messages back out to peers. Note: they are sent back to peers in-order, this is the only non-message oriented part of this and why implementing this on-top of streams actually makes sense.

We shouldn't have to implement this over-and-over for every service; this is where a lot of our leaks and race conditions come from...

If we had this central service registry, we could use it to transparently manage all streams, opening them as necessary and possibly even managing associated metadata.

dryajov · 2017-08-15T22:50:21Z

@Stebalien I like the general idea and I believe this is to some extent along the same lines as ipfs/notes#237. We need a way to tell what a peer can do and who it knows about (connected to?) and I believe @vyzo and I have both independently proposed something very similar to this while working on the circuit implementations.

whyrusleeping · 2017-08-15T23:35:47Z

We do currently announce what protocols we support up front (via identify). The issue i think is that other protocols dont bother waiting for identify to finish. Great suggestion overall, i agree.

RangerMauve · 2018-01-12T15:48:15Z

How does one use the identify API from js-ipfs?

dryajov · 2018-01-12T18:13:58Z

@RangerMauve afaik identify is a lower level libp2p protocol and we don't expose it with the http API.

RangerMauve · 2018-01-12T19:23:04Z

@dryajov js-ipfs in the browser can run a full IPFS node which uses js-libp2p. Where exactly in libp2p would this call be? Would it be somewhere inside swarm?

dryajov · 2018-01-12T19:55:11Z

@RangerMauve https://github.com/libp2p/js-libp2p-swarm/blob/master/src/connection.js#L35...L84

RangerMauve · 2018-01-12T20:38:04Z

Thank you @dryajov!
So this will give me a list of protocols that are supported (like circuit relay), are there services that don't advertise as protocols that are currently undiscoverable? (Or is service===protocol in libp2p?

Also, it looks like peer.on("peer-mux-established") is what should be used to wait for the info to be available. This may be relevant to libp2p/js-libp2p-websocket-star#41 so that it can discover relay-able peers.

dryajov · 2018-01-12T21:02:01Z

So this will give me a list of protocols that are supported (like circuit relay), are there services that don't advertise as protocols that are currently undiscoverable? (Or is service===protocol in libp2p?

@RangerMauve unfortunately, not really - it will provide you with a more up to date peer-info of the remote peer. It's possible to tell which transports are available based on the peers multiaddrs, but nothing more.

Also, it looks like peer.on("peer-mux-established") is what should be used to wait for the info to be available. This may be relevant to libp2p/js-libp2p-websocket-star#41 so that it can discover relay-able peers.

Yep, in fact, that's what circuit uses to probe incoming peers for relay support.

RangerMauve · 2018-01-12T21:22:55Z

Thanks for the details!

Is this the canonical place to discuss the discovery service?
What can I do to help contribute to this feature?

dryajov · 2018-01-12T21:31:55Z

Is this the canonical place to discuss the discovery service?

Good question, could you provide more info on what is your use case?

dryajov · 2018-01-12T21:42:09Z

@RangerMauve could you create an issue describing what you're looking for, I believe it would help identifying where it falls in the stack and what is required. Also, I believe that service discovery is something that means different things to different folks :) - in the future, it would be helpful to clearly identify, what's a service, whats a protocol and what are general purpose capability discovery.

RangerMauve · 2018-01-12T21:58:47Z

I'm learning as much as I can about the different parts of libp2p and IPFS at the moment, but my goal is to help contribute to making the browser build more fully-featured. I got here while looking into what was missing from getting the browser to participate in the DHT.

There's some effort happening for migrating websocket-star to use circuit-relay instead of having centralized socket.io servers, and that seems to be dependent on circuit-relay working and being able to discover circuit-relay peers, which seems to be related to this issue of lack of service discovery.

RangerMauve · 2018-01-29T17:17:01Z

Question, how does this relate to Resource Discovery from the spec?

Stebalien · 2018-01-29T20:38:05Z

Question, how does this relate to Resource Discovery from the spec?

Mostly unrelated. Resource Discovery allows us to find peers that claim to be able to do something. However, that doesn't necessarily tell us precisely which protocols they support and if they're willing to speak the protocol with us at the current point in time.

The issue here is more about structuring libp2p programs. Instead of having each libp2p service say "Look, a new peer! I wonder if they'll talk with me.", we can have services register with libp2p and tell libp2p "Hey, libp2p, when you see a peer willing to speak protocol X with me, tell me."

The current system (in go-ipfs, at least), is like a bunch of puppies running to the door to see if someone is interested in petting them. What I'd like is a bunch of well-trained dogs that wait to be called when needed.

RangerMauve · 2018-01-29T21:16:43Z

@Stebalien What would need to happen to add a new field to the PeerInfo data type listing service names?
I think that that would be the most obvious place to put it and would make it easy to filter remote peers for ones that offer a specific service we want to connect to.

I think that the bunch of puppies approach may be behind some of the performance issues we're seeing in the browser, too. (I don't have numbers to back that claim up)

Stebalien · 2018-01-29T23:40:36Z

What would need to happen to add a new field to the PeerInfo data type listing service names?

One issue there is that the peerstore is already massive. However, I'd absolutely love that feature. In theory, we could just add it to the Peer message defined in the DHT's protobuf. After that, we'd need to record this information somewhere in the peerstore.

However, I don't think doing that will make anything faster at the moment as, currently, all nodes offer pretty much every service (except, experimental ones like relay and pubsub). This will only really become an issue when we make relaying non-experimental (we need a nice way to advertise "I'm willing to relay").

RangerMauve · 2018-01-30T00:15:29Z

Would writing up a change to PeerInfo be welcome, or should this be discussed more first?

I'm also a little confused by the docs for listening. It implies that one should be using a multicodec, but looking at examples it seems that node.handle can take an arbitrary string.

I think PeerInfo should contain a list of the things that the node is listening on, and I'm a little confused as to what data type that should really be.

Stebalien · 2018-01-30T02:08:06Z

@RangerMauve it definitely needs to be discussed more (and probably in a separate issue). It's a non-trivial change with wide-reaching consequences.

I'm also a little confused by the docs for listening. It implies that one should be using a multicodec, but looking at examples it seems that node.handle can take an arbitrary string.

I believe those docs are old. We use strings for protocols (usually of the form /name/version). As a matter of fact, we originally wanted to use IPFS paths pointing to the spec (/ipfs/QmId/my_spec/1_0.md) but didn't for performance reasons (for now).

I think PeerInfo should contain a list of the things that the node is listening on, and I'm a little confused as to what data type that should really be.

We'd have a list of strings. However, this does motivate adding these protocols to the multicodec table and using multicodecs instead (for space efficiency).

RangerMauve · 2018-01-30T02:57:50Z

@Stebalien is this the correct repo to open the issue in?

motivate adding these protocols to the multicodec table and using multicodecs instead

I think that the nature of the multicodec table will make it harder for people to use custom (application specific) protocols.

What about using hashes of protocol names? That way you can use a string to represent it in code, but store the existence of it in a binary multihash representation. This will also give a bit of privacy as to what the protocol is actually called and will make it more annoying to reverse the purpose of services.

Stebalien · 2018-01-30T18:50:25Z

@RangerMauve

is this the correct repo to open the issue in?

Yes.

think that the nature of the multicodec table will make it harder for people to use custom (application specific) protocols.

We'd would have to support both.

What about using hashes of protocol names? That way you can use a string to represent it in code, but store the existence of it in a binary multihash representation. This will also give a bit of privacy as to what the protocol is actually called and will make it more annoying to reverse the purpose of services.

I believe they'd generally end up longer that way. For example, bitswap is /bitswap/1.1.0 atm (14 bytes) while the hash would probably be 32 bytes. I'm also generally scared of "a bit of privacy" as it tends to lead people into a false sense of privacy where non really exists.

RangerMauve · 2018-01-30T19:02:43Z

I just noticed that there is already a protocols section in the protobuf for the the Identify service.

Isn't that exactly what we were just talking about?

Edit: It's in the go implementation, too.

Stebalien · 2018-01-30T19:28:49Z

@RangerMauve It is. However, we aren't really using it properly.

RangerMauve · 2018-01-30T19:31:36Z

@Stebalien

So the groundwork for this would be:

When handling a new protocol in swarm, add it to your local PeerInfo's protocol list.

In JS
In Go

That should be a pretty simple change, and it's totally backwards compatible.

From there, protocols that rely on other protocols will have to wait for ident to finish and check if a protocol is present in the PeerInfo.
This will have to be delayed while the ecosystem upgrades to versions of libp2p that have this change.

Would PRs for these two changes be welcome, or is there more discussion needed?

Stebalien · 2018-01-30T19:34:16Z

@RangerMauve we need to discuss this in a new issue (and invite anyone interested). Mostly concerning whether or not we actually want this, specific usecases, requirements like signing peer info's, etc.

RangerMauve · 2018-01-30T19:36:02Z

K, I'll write something up. Thank you!

RangerMauve · 2018-03-02T15:26:52Z

Looking at the docs in multiformats-select, it looks like you can get information about what the other end supports through the multistream protocol. I guess this means that one doesn't have to bother using PeerInfo anymore.

dryajov mentioned this issue Sep 7, 2017

automatically try relay ipfs/kubo#4213

Closed

RangerMauve mentioned this issue Feb 3, 2018

Add protocols to identity when adding a new handler #42

Closed

This was referenced Mar 7, 2018

Feat/circuit relay ipfs/interop#6

Merged

remove connect timeouts from circuit tests once service discovery is implemented ipfs/interop#17

Open

Stebalien mentioned this issue Nov 8, 2018

[WIP] [RFC] Multistream-2.0 #95

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Service Discovery instead of Protocol Negotiation #32

Service Discovery instead of Protocol Negotiation #32

Stebalien commented Aug 15, 2017

dryajov commented Aug 15, 2017

whyrusleeping commented Aug 15, 2017

RangerMauve commented Jan 12, 2018

dryajov commented Jan 12, 2018

RangerMauve commented Jan 12, 2018

dryajov commented Jan 12, 2018 •

edited

Loading

RangerMauve commented Jan 12, 2018 •

edited

Loading

dryajov commented Jan 12, 2018

RangerMauve commented Jan 12, 2018

dryajov commented Jan 12, 2018

dryajov commented Jan 12, 2018 •

edited

Loading

RangerMauve commented Jan 12, 2018

RangerMauve commented Jan 29, 2018

Stebalien commented Jan 29, 2018

RangerMauve commented Jan 29, 2018

Stebalien commented Jan 29, 2018

RangerMauve commented Jan 30, 2018

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018 •

edited

Loading

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018 •

edited

Loading

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018

RangerMauve commented Mar 2, 2018

Service Discovery instead of Protocol Negotiation #32

Service Discovery instead of Protocol Negotiation #32

Comments

Stebalien commented Aug 15, 2017

dryajov commented Aug 15, 2017

whyrusleeping commented Aug 15, 2017

RangerMauve commented Jan 12, 2018

dryajov commented Jan 12, 2018

RangerMauve commented Jan 12, 2018

dryajov commented Jan 12, 2018 • edited Loading

RangerMauve commented Jan 12, 2018 • edited Loading

dryajov commented Jan 12, 2018

RangerMauve commented Jan 12, 2018

dryajov commented Jan 12, 2018

dryajov commented Jan 12, 2018 • edited Loading

RangerMauve commented Jan 12, 2018

RangerMauve commented Jan 29, 2018

Stebalien commented Jan 29, 2018

RangerMauve commented Jan 29, 2018

Stebalien commented Jan 29, 2018

RangerMauve commented Jan 30, 2018

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018 • edited Loading

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018 • edited Loading

Stebalien commented Jan 30, 2018

RangerMauve commented Jan 30, 2018

RangerMauve commented Mar 2, 2018

dryajov commented Jan 12, 2018 •

edited

Loading

RangerMauve commented Jan 12, 2018 •

edited

Loading

dryajov commented Jan 12, 2018 •

edited

Loading

RangerMauve commented Jan 30, 2018 •

edited

Loading

RangerMauve commented Jan 30, 2018 •

edited

Loading