Labels & Policies #99

Open
Stebalien opened this issue Oct 13, 2021 · 4 comments

@Stebalien

Background

Currently, connections are generally considered to be equivalent, "global" resources shared by all users of a single libp2p host. However, this isn't always desirable:

  1. Some connections may be transient (solved manually for now), slow, or limited.
  2. Some connections may be expensive.
  3. Some services may want to use separate transports and/or connections for different types of messages (see go-libp2p-pubsub#455, "Support per-topic TCP connections").
  4. Some connections may be "important" and should not be cut. There's currently no good way to specify "policies" in the connection manager.

Proposal

Introduce a concept of "labels" to libp2p implementations where:

  1. Transports, connections, streams, etc. can acquire one or more "labels". Or, more generally, (label, value) tuples.
  2. Requests (connection attempts, new streams, etc.) may specify policies over these labels.
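
As a rough illustration of the two points above, the proposal could be sketched in Go as follows. Every name here (`Labels`, `Policy`, `Require`, `Reject`, `All`) is hypothetical and invented for this sketch; none of it is an existing go-libp2p API.

```go
package main

import "fmt"

// Labels is a hypothetical set of (label, value) pairs attached to a
// transport, connection, or stream. An empty value marks a plain tag.
type Labels map[string]string

// Policy reports whether an object with the given labels may serve a request.
type Policy func(Labels) bool

// Require accepts only objects that carry the given label.
func Require(label string) Policy {
	return func(l Labels) bool {
		_, ok := l[label]
		return ok
	}
}

// Reject accepts only objects that do not carry the given label.
func Reject(label string) Policy {
	return func(l Labels) bool {
		_, ok := l[label]
		return !ok
	}
}

// All combines policies; every one of them must accept the object.
func All(ps ...Policy) Policy {
	return func(l Labels) bool {
		for _, p := range ps {
			if !p(l) {
				return false
			}
		}
		return true
	}
}

func main() {
	relayed := Labels{"transport": "relay", "transient": ""}
	direct := Labels{"transport": "tcp"}

	policy := All(Require("transport"), Reject("transient"))
	fmt.Println("relayed accepted:", policy(relayed))
	fmt.Println("direct accepted:", policy(direct))
}
```

Modelling a policy as a plain function keeps the sketch small; a real design would probably need something more declarative so that policies can be inspected or sent over the network.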

Examples

Example 1: Transient Connections

  1. The relay transport (maybe?) and transient relay connections would be labeled with "transient".
  2. The default dial and "new stream" policies would reject "objects" with the "transient" label.
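
A minimal sketch of this example, assuming connections carry boolean labels and that a request without an explicit policy falls back to a default one. `Conn`, `DefaultPolicy`, `AllowTransient`, and `pickConn` are made-up names for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// Conn is a stand-in for a real connection; Labels holds its tags.
type Conn struct {
	Name   string
	Labels map[string]bool
}

// Policy reports whether a connection may be used for a request.
type Policy func(*Conn) bool

// DefaultPolicy rejects anything labeled "transient".
func DefaultPolicy(c *Conn) bool { return !c.Labels["transient"] }

// AllowTransient is an opt-in policy for callers that can cope with
// connections that may disappear at any time.
func AllowTransient(c *Conn) bool { return true }

// ErrNoConn is returned when no connection satisfies the policy.
var ErrNoConn = errors.New("no connection satisfies the policy")

// pickConn returns the first connection accepted by the policy.
// A nil policy means DefaultPolicy.
func pickConn(conns []*Conn, p Policy) (*Conn, error) {
	if p == nil {
		p = DefaultPolicy
	}
	for _, c := range conns {
		if p(c) {
			return c, nil
		}
	}
	return nil, ErrNoConn
}

func main() {
	conns := []*Conn{{Name: "relayed", Labels: map[string]bool{"transient": true}}}

	if _, err := pickConn(conns, nil); err != nil {
		fmt.Println("default policy:", err)
	}
	if c, err := pickConn(conns, AllowTransient); err == nil {
		fmt.Println("opt-in policy picked:", c.Name)
	}
}
```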

Example 2: Service-specific connections

  1. Hosts needing special-purpose connections would register one transport per connection type per protocol. E.g., there might be a TCP transport for "data" and a TCP transport for "pubsub".
  2. When creating new streams, services would be able to specify "policies" describing which type of connection they want.

Questions:

  • How do we distinguish between these transports in peer records? I assume we'd need to advertise the labels somehow.
  • How do we make sure both sides agree on the labels? Is that necessary? We can probably use identify but this could be... interesting.
  • How do we prevent other services from using connections with "reserved" labels? We'd probably need some form of "default" or "general" label: all connections would inherit this label by default, and all requests (dials, etc.) would require it by default.
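
One way the "general" default from the last question could behave, sketched in Go under the assumption that a connection created with explicit labels does not also receive the default one. `General`, `NewConn`, and `Select` are hypothetical names, and the sketch leaves the advertising and agreement questions open.

```go
package main

import "fmt"

// General is the hypothetical default label shared by ordinary connections.
const General = "general"

// Conn is a stand-in for a real connection.
type Conn struct {
	ID     string
	Labels map[string]struct{}
}

// NewConn labels the connection "general" unless explicit labels are given,
// so a connection reserved for one service is invisible to default requests.
func NewConn(id string, labels ...string) *Conn {
	if len(labels) == 0 {
		labels = []string{General}
	}
	set := make(map[string]struct{}, len(labels))
	for _, l := range labels {
		set[l] = struct{}{}
	}
	return &Conn{ID: id, Labels: set}
}

// Select returns the connections carrying the required label.
// An empty requirement means General, mirroring the default request policy.
func Select(conns []*Conn, required string) []*Conn {
	if required == "" {
		required = General
	}
	var out []*Conn
	for _, c := range conns {
		if _, ok := c.Labels[required]; ok {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	conns := []*Conn{NewConn("shared"), NewConn("pubsub-only", "pubsub")}

	for _, c := range Select(conns, "") {
		fmt.Println("default request may use:", c.ID)
	}
	for _, c := range Select(conns, "pubsub") {
		fmt.Println("pubsub request may use:", c.ID)
	}
}
```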

Example 3: Priorities

A request (dial, etc.) could specify a priority for labels. E.g., prefer connections that don't require file descriptors, reject transient connections.
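
The priority idea can be sketched as a filter followed by a stable sort. The label names `"transient"` and `"uses-fd"`, and the `rank` helper, are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// Conn is a stand-in for a real connection; Labels holds its tags.
type Conn struct {
	Name   string
	Labels map[string]bool
}

// rank drops transient connections, then orders the rest so that
// connections that do not consume a file descriptor come first.
// The sort is stable, so equally ranked connections keep their order.
func rank(conns []Conn) []Conn {
	kept := make([]Conn, 0, len(conns))
	for _, c := range conns {
		if !c.Labels["transient"] {
			kept = append(kept, c)
		}
	}
	sort.SliceStable(kept, func(i, j int) bool {
		return !kept[i].Labels["uses-fd"] && kept[j].Labels["uses-fd"]
	})
	return kept
}

func main() {
	// The labels below are illustrative assumptions, not measured facts.
	conns := []Conn{
		{Name: "tcp", Labels: map[string]bool{"uses-fd": true}},
		{Name: "relayed", Labels: map[string]bool{"transient": true}},
		{Name: "quic"},
	}
	for _, c := range rank(conns) {
		fmt.Println(c.Name)
	}
}
```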

Why Labels

So, the simple solution here is to allow the user to pass a bunch of filters in the context (or in some form of policy). E.g., they could pass a filter/comparator to filter/sort transports, addresses, streams, connections, etc.

However, labels let us abstract this away a bit. Instead of having to handle transports, addresses, streams, and connections, the filters/comparators would operate over labeled abstract "objects".
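
To make the abstraction argument concrete, here is a small Go sketch (generics, Go 1.18+) in which one filter is reused for two unrelated object types. `Labeled`, `Transport`, `Stream`, and `Filter` are hypothetical names chosen for the example.

```go
package main

import "fmt"

// Labeled is the only thing a filter needs to know about an object.
type Labeled interface {
	HasLabel(label string) bool
}

// Transport and Stream are stand-ins for unrelated concrete types.
type Transport struct {
	Name string
	Tags []string
}

type Stream struct {
	Protocol string
	Tags     []string
}

func has(tags []string, label string) bool {
	for _, t := range tags {
		if t == label {
			return true
		}
	}
	return false
}

func (t Transport) HasLabel(l string) bool { return has(t.Tags, l) }
func (s Stream) HasLabel(l string) bool    { return has(s.Tags, l) }

// Filter keeps the objects accepted by keep, whatever their concrete type.
func Filter[T Labeled](objs []T, keep func(Labeled) bool) []T {
	var out []T
	for _, o := range objs {
		if keep(o) {
			out = append(out, o)
		}
	}
	return out
}

func main() {
	notTransient := func(o Labeled) bool { return !o.HasLabel("transient") }

	transports := []Transport{{Name: "tcp"}, {Name: "relay", Tags: []string{"transient"}}}
	streams := []Stream{{Protocol: "/a", Tags: []string{"transient"}}, {Protocol: "/b"}}

	// The same filter value is applied to both kinds of object.
	fmt.Println(Filter(transports, notTransient))
	fmt.Println(Filter(streams, notTransient))
}
```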

@raulk

raulk commented Oct 13, 2021

Before we even approach this subject, we should agree whether we want to invest or not in first-class userland support for N:1 connections (more than one simultaneous connection to a peer).

go-libp2p was built around a policy of "single multiplexed connection per peer", and a departure from that would imply a model breakage, redesign, and significant refactors.

My opinion is "yes"; libp2p should be as flexible as possible to allow the developer to define their own connection strategies. Of course there are tradeoffs between one model and the other, but there's no reason why libp2p should limit possibilities for the user.

Assuming we decide to go in this direction, connection selection is only one problem to solve.

I'm personally not sold on the label approach at all. I think it oversimplifies the problem, and it seems like a poor abstraction for the level of expressiveness we will want to eventually achieve. Labels are arbitrary strings and we'll find them limiting. For example, if a connection is labelled with "ephemeral", it is reasonable to want to enquire the TTL and expiry, which is quite cumbersome with labels.

I would nudge towards using language-specific traits/interfaces semantics, which would allow for more expressiveness and extensibility.

For example, connections could implement the Ephemeral trait (which in Go would be an interface) that offers methods to Lock() the connection, to query the TTL(), and more. We would then use functions for selecting and filtering.
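
A minimal Go sketch of what this could look like, assuming an `Ephemeral` interface with only a `TTL()` method (the `Lock()` method is left out for brevity). The `Conn` interface, both concrete types, and `longLived` are invented for the example and are not go-libp2p types.

```go
package main

import (
	"fmt"
	"time"
)

// Conn is a minimal stand-in for a connection interface.
type Conn interface {
	RemoteAddr() string
}

// Ephemeral is the hypothetical trait: a connection that will expire.
type Ephemeral interface {
	TTL() time.Duration
}

type relayConn struct {
	addr   string
	expiry time.Time
}

func (c *relayConn) RemoteAddr() string { return c.addr }
func (c *relayConn) TTL() time.Duration { return time.Until(c.expiry) }

type tcpConn struct{ addr string }

func (c *tcpConn) RemoteAddr() string { return c.addr }

// minTTL is an arbitrary threshold used by the demo below.
const minTTL = time.Minute

// longLived keeps connections that are either not ephemeral or that
// have at least min left before they expire.
func longLived(conns []Conn, min time.Duration) []Conn {
	var out []Conn
	for _, c := range conns {
		if e, ok := c.(Ephemeral); ok && e.TTL() < min {
			continue
		}
		out = append(out, c)
	}
	return out
}

// demoConns builds one short-lived relay, one long-lived relay, and one TCP conn.
func demoConns() []Conn {
	now := time.Now()
	return []Conn{
		&relayConn{addr: "relay-short", expiry: now.Add(time.Second)},
		&relayConn{addr: "relay-long", expiry: now.Add(time.Hour)},
		&tcpConn{addr: "tcp"},
	}
}

func main() {
	for _, c := range longLived(demoConns(), minTTL) {
		fmt.Println(c.RemoteAddr())
	}
}
```

The selection function relies on a type assertion, so it only sees the trait if the value it is handed implements `Ephemeral` directly.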

However, before we go on designing concrete solutions, we should identify everything that proper N:1 requires. Some questions:

  1. Are protocols able to decide autonomously when new connections are opened (e.g. if we were to support patterns like go-libp2p-pubsub#455, "Support per-topic TCP connections")?
  2. How does competition/contention for connections across protocols work?
  3. How does a thing acquire a connection?
  4. How does a thing know which transports match specific desired traits (introspection)?
  5. How do we establish quotas for transports?
  6. What's the experience with ephemeral connections; how do we proactively notify the acquirer that the connection is about to be killed?
  7. Can connections be reserved for exclusive use (e.g. lock a connection so that nobody else can use it)?
  8. etc.

@Stebalien

> go-libp2p was built around a policy of "single multiplexed connection per peer", and a departure from that would imply a model breakage, redesign, and significant refactors.

Ish? go-libp2p does, in fact, support more than one connection per peer, it just doesn't expose this very well. We're already starting to rely on this for transient connections.

Basically, I agree with you here but feel more strongly: we not only should support this, we're already using it so we might as well make it first-class.

> Labels are arbitrary strings and we'll find them limiting. For example, if a connection is labelled with "ephemeral", it is reasonable to want to enquire the TTL and expiry, which is quite cumbersome with labels.

I agree. In my current proposal, I'm suggesting (label, value) pairs to handle this but... calling this "labels" is a stretch. Maybe "properties"?

I don't think we can reliably solve this in the type system. This is where I got stuck during the last libp2p refactor: handling wrapping, upgrades, etc. with types and type assertions gets nasty. There's also the fact that when X is tunneled over Y, X should generally inherit the properties of Y.

With properties (I'm just going to call them properties now), composition becomes a lot simpler.
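
A rough sketch of why composition gets simpler, assuming properties are string (key, value) pairs and a tunneled connection keeps a pointer to the connection it runs over. `Properties`, `Conn`, and `Property` are hypothetical names for this example.

```go
package main

import "fmt"

// Properties is a hypothetical bag of (key, value) pairs.
type Properties map[string]string

// Conn is a stand-in for a connection that may be tunneled over another one.
type Conn struct {
	props Properties
	inner *Conn // the connection this one is tunneled over, if any
}

// Property looks the key up on this connection first, then on every
// connection it is tunneled over, so X inherits the properties of Y
// without any type assertions on wrapper types.
func (c *Conn) Property(key string) (string, bool) {
	for cur := c; cur != nil; cur = cur.inner {
		if v, ok := cur.props[key]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	relay := &Conn{props: Properties{"transient": "true", "ttl": "2m"}}
	tunneled := &Conn{props: Properties{"security": "noise"}, inner: relay}

	ttl, ok := tunneled.Property("ttl")
	fmt.Println(ttl, ok)

	_, ok = tunneled.Property("quota")
	fmt.Println(ok)
}
```

Because values are plain data here, the same pairs could in principle be serialized and exchanged with the remote peer, which an interface-based trait cannot do directly.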

Note: properties also allow us to communicate information about connections over the network.

@vyzo

vyzo commented Oct 14, 2021

In general I very much like the idea; the labels can carry semantic information for use by various protocols and components of the system.

The main difficulty I see is how those labels get applied. Is it a dial option (e.g. WithLabel(s))? Or is it something that gets applied afterwards, on an existing connection?
Also, some labels (e.g. transient, in a generalization of the current property) are applied by the transport itself.

@Stebalien

I'm expecting labels (properties) to be applied by the transports, connections, etc. at first. I'd also expect the user to be able to "add" labels (e.g., with a WithLabels param) but I'm more interested in the properties of the connection itself at the moment.
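
The ordering described here (transport-applied labels first, user-supplied labels on top) could be sketched with Go's functional-options pattern. `DialOption`, `WithLabels`, and `dial` are hypothetical; the thread only names a `WithLabels` parameter without defining it, and `dial` here just computes the resulting label set instead of opening a connection.

```go
package main

import (
	"fmt"
	"sort"
)

// dialConfig collects the labels that will end up on the new connection.
type dialConfig struct {
	labels map[string]struct{}
}

// DialOption is a functional option applied to a dial request.
type DialOption func(*dialConfig)

// WithLabels is a hypothetical option that adds user-supplied labels.
func WithLabels(labels ...string) DialOption {
	return func(c *dialConfig) {
		for _, l := range labels {
			c.labels[l] = struct{}{}
		}
	}
}

// dial simulates a dial: labels applied by the transport come first,
// then user options add to them. It returns the sorted label set.
func dial(transportLabels []string, opts ...DialOption) []string {
	cfg := &dialConfig{labels: map[string]struct{}{}}
	for _, l := range transportLabels {
		cfg.labels[l] = struct{}{}
	}
	for _, opt := range opts {
		opt(cfg)
	}
	out := make([]string, 0, len(cfg.labels))
	for l := range cfg.labels {
		out = append(out, l)
	}
	sort.Strings(out)
	return out
}

func main() {
	// A relay transport labels its connections "transient"; the caller adds "pubsub".
	fmt.Println(dial([]string{"transient"}, WithLabels("pubsub")))
}
```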
