Add a Stream Migration spec #406

MarcoPolo · 2022-04-13T19:48:30Z

Initial pass at creating a spec for #328. I'll start a PoC as well to start fleshing out some of the ideas and check my understanding.

MarcoPolo · 2022-04-13T19:51:30Z

r? @marten-seemann for the first pass, I'll tag more folks as this progresses. (also I don't think I have permissions to add reviewers to the PR).

marten-seemann

This is a good start!

A few thoughts:

We need to define how the stream labels (A and B) are assigned (multistream).
I’m wondering if each side should use their own namespace for streams, or if they should share a namespace (e.g. by saying that client/server use even/odd stream IDs). Maybe then it would be possible to migrate streams that the peer initiated (is this even desirable?).

connections/stream-migration.md

marten-seemann · 2022-04-13T21:09:10Z

connections/stream-migration.md

+plantuml stream-migration.md -o stream-migration -tsvg
+```
+</details>
+


What happens if the peer misbehaves, e.g. when B just doesn’t send an EOF on stream B? While A can “consider” the stream as closed, it still needs to be closed explicitly, otherwise the stream multiplexer can’t garbage-collect the stream after it has been EOFed from both sides.

I think if the Responder misbehaves and doesn't close B then the stream migration hasn't finished since the new stream isn't in the same state as the old stream. I'm not sure what else we can do besides consider it not spec-compliant.

connections/stream-migration.md

MarcoPolo · 2022-04-14T00:45:32Z

> I’m wondering if each side should use their own namespace for streams, or if they should share a namespace (e.g. by saying that client/server use even/odd stream IDs). Maybe then it would be possible to migrate streams that the peer initiated (is this even desirable?).

Can we rely on the underlying stream to give us a stream ID we can use for identifying streams?

What do you mean by migrating streams that the peer initiated? If I'm migrating a stream from a connection a remote peer started to a connection I started should that matter? I'm just changing how the bytes are getting there but everything else should be the same? Maybe there's something I'm missing here.

marten-seemann · 2022-04-16T12:39:06Z

> Can we rely on the underlying stream to give us a stream ID we can use for identifying streams?

Probably not. While stream IDs are unique within a muxer, they're not unique across muxers. It's probably easier if we introduce a new ID here.

> What do you mean by migrating streams that the peer initiated? If I'm migrating a stream from a connection a remote peer started to a connection I started should that matter? I'm just changing how the bytes are getting there but everything else should be the same?

Maybe I'm just confused by the word "initiator" here. Do we mean the initiator of the connection or the initiator of the stream?

MarcoPolo · 2022-04-16T15:09:28Z

I think this finally clicked for me yesterday. The stream migration protocol is a prefix on top of another protocol. That’s how you can send the "identify this stream as id=A" message. I’ll update the spec on Monday.

vyzo · 2022-04-19T08:03:20Z

connections/stream-migration.md

+### Stream migration protocol id
+
+The stream migration protocol id should follow the format of
+`/streamMigration/1.0.0/<streamID>`. The stream should be an int.


/libp2p/stream-migration/1.0.0

/libp2p/stream-migration/, we can still version this later if necessary.

The protocol name can't include the stream ID. The stream ID is the payload of this protocol.

> The protocol name can't include the stream ID. The stream ID is the payload of this protocol.

Why not? This way you wouldn't need a separate payload message.

The initiator would send the stream-migration protocol ID+stream id, then can send the underlying protocol right away without having to wait for a response.

The responder would read the stream migration protocol and ack it (by echoing back as in multistream select. Note this doc doesn't say this part yet.), then continue negotiating the underlying protocol. If the other node for some reason doesn't support stream-migration (even if we thought it did), it would echo back na, and continue as before.
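A minimal sketch of the pipelining idea described above, assuming the usual multistream-select framing (a uvarint length prefix, then the protocol ID and a trailing newline). The protocol ID strings and the helper names `encodeMsg` and `pipelinedHello` are hypothetical; the thread had not settled on a final protocol name, and the stream ID payload is left out here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Hypothetical protocol IDs, chosen only for illustration.
const (
	multistreamID     = "/multistream/1.0.0"
	streamMigrationID = "/libp2p/stream-migration"
	appProtocolID     = "/example/app/1.0.0"
)

// encodeMsg frames one multistream-select message: a uvarint length
// prefix followed by the protocol ID and a trailing newline.
func encodeMsg(proto string) []byte {
	payload := append([]byte(proto), '\n')
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(len(payload)))
	return append(buf[:n], payload...)
}

// pipelinedHello builds everything the initiator can write before it
// has read a single byte back from the responder.
func pipelinedHello() []byte {
	var out bytes.Buffer
	for _, p := range []string{multistreamID, streamMigrationID, appProtocolID} {
		out.Write(encodeMsg(p))
	}
	return out.Bytes()
}

func main() {
	fmt.Printf("%q\n", pipelinedHello())
}
```

If the responder does not support stream migration, it would answer the second message with `na`, and the initiator would have to fall back; that path is not modelled in this sketch.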

ah, it could be because we cache the protocols seen on the other side, correct?

Yes, exactly. Even worse, we send an Identify Delta message for every new protocol that we add.

Also, logically speaking, the stream ID is not part of the protocol ID, but it's a payload of the stream migration protocol. Counting the bytes on the wire, it makes no difference (or at most a few bytes) whether it's in the protocol name or in the payload.

Not sure this was mentioned before: For the sake of evolvability of the protocol in the future, can the stream ID be sent embedded in a Protobuf? That way we can add new fields in the future.

Good call out. I did this in the PoC, but I’ll update the spec to include this.

connections/stream-migration.md

marten-seemann · 2022-04-19T10:38:09Z

connections/stream-migration.md

+### Stream migration protocol id
+
+The stream migration protocol id should follow the format of
+`/streamMigration/1.0.0/<streamID>`. The stream should be an int.


The protocol name can't include the stream ID. The stream ID is the payload of this protocol.

mxinden · 2022-04-25T07:30:56Z

connections/stream-migration.md

+   match the swarm's behavior in best connection selection.
+
+Note that it's not required that all implementations (and all versions) follow
+the same heuristics since the initiator is driving the migration and specifies


Below I am assuming that Initiator refers to the connection initiator, not the stream initiator. Please correct me in case I am misunderstanding this @MarcoPolo.

Say there are two nodes A and B. Connection AB is initiated by A to B. Connection BA is initiated by B to A.

Say that A and B follow different heuristics to pick the best connection. A chooses AB as the best connection, B chooses BA as the best connection.

If I understand the above correctly, this would result in A moving all the streams it created to AB and B moving all the streams it created to BA. Both connections would thus stay alive.

Potential solution: Instead of allowing both A and B to migrate streams, how about delegating the decision making to the peer with the lower peer ID, e.g. in this case A?

Yes!

I was thinking about this this weekend and came up with a similar solution. Glad to see you also came to the same conclusion. I'll update this spec to make this explicit.

Few thoughts on this:

Currently each peer chooses its own IDs for streams, i.e. there are two distinct spaces of stream IDs. If we want to allow the receiver of a stream to migrate that stream, we need a single stream ID space. One way to realize this would be to mandate the client (roles as seen by the stream muxer) to use odd and the server to use even stream IDs.

I don't think this document should describe how peers would choose AB over BA. This document should only describe how to migrate one libp2p stream from one muxer stream to another. For all that this spec cares about, those streams might (or might not) live on the same underlying connection. We can then use the stream migration protocol as a building block to converge onto one connection (and the peer ID comparison is quite a neat idea, I like it!), but that should probably be described in a different document.

Yes. I was considering adding a boolean flag that would indicate which peer initially identified the stream it's referencing when migrating (was I the initiator of the from stream?). This is the same as using even and odd numbers, since that scheme effectively encodes this boolean in the least significant bit. I'm fine with either way. Maybe it's a little easier to think about even and odds, so I'll do that.

Agreed that describing how to sort connections is out of the scope of this document (I imagine that spec will iterate more and possibly have more subtle details). But I do think this spec should define who is responsible for doing the stream migration. If we end up in the situation where we have two identical connections (A dialed B and B dialed A at roughly the same time) we should describe who is in charge of doing the stream migration. By defining which node starts the stream migration we simplify this protocol and also avoid having to handle cases where both sides start stream migration at the same time.

"Potential solution: Instead of allowing both A and B to migrate streams, how about delegating the decision making to the peer with the lower peer ID, e.g. in this case A?"

Wouldn't this create a bias toward lower peer IDs? Maybe we can hash the concatenation of the two peer IDs (the lower first). If the hash is even, use the lower one; otherwise use the higher one. That way it is deterministic but no peer ID is systematically favored over another.

Maybe it's not worth the extra complexity, though, since for a random ID A there's a 50% chance that it's less than another random ID B (it's equally probable that B is smaller).
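For reference, the hash-based tie-break proposed above could look roughly like the sketch below. It assumes peer IDs are available as raw byte slices and uses SHA-256; neither choice comes from the spec, and `pickMigrator` is a hypothetical name.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// pickMigrator returns which of the two peer IDs drives the migration.
// Both peers run the same function and reach the same answer.
func pickMigrator(a, b []byte) []byte {
	lo, hi := a, b
	if bytes.Compare(a, b) > 0 {
		lo, hi = b, a
	}
	// Hash lower||higher so the input is identical on both sides.
	h := sha256.New()
	h.Write(lo)
	h.Write(hi)
	sum := h.Sum(nil)
	// An even digest (lowest bit of the last byte clear) selects the
	// lower peer ID, an odd digest selects the higher one.
	if sum[len(sum)-1]&1 == 0 {
		return lo
	}
	return hi
}

func main() {
	a, b := []byte("peer-A"), []byte("peer-B")
	// Argument order does not matter.
	fmt.Println(bytes.Equal(pickMigrator(a, b), pickMigrator(b, a)))
}
```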

Yeah, on second thought I agree with Max. I actually don't see the benefit, since it's still 50% odds either way. Updated in 49c0597.

I'm still not convinced that we should specify anything here at all. Stream migration is a general feature, a building block.

The use case we have in mind now is migrating all streams from one connection to another, but we might come up with other use cases in the future. I'd prefer to have stream migration just be a thing that any node, regardless of its peer ID, can use in principle.
For the specific use case of converging on a single connection, comparing peer IDs seems reasonable.

Let me try to rephrase to see if I understand:

1. The protocol should only specify how a node would perform a stream migration.
2. It doesn't define who starts the migration or why.
3. Consolidating connections would be a layer on top of this that defines which node is in charge of migrating streams to empty and close connections.

We just have to make sure that what we design here doesn't block point 3.

If that seems accurate, then I agree we don't need this in here.

also @marten-seemann to highlight a recent change re:

> Currently each peer chooses its own IDs for streams, i.e. there are two distinct spaces of stream IDs. If we want to allow the receiver of a stream to migrate that stream, we need a single stream ID space. One way to realize this would be to mandate the client (roles as seen by the stream muxer) to use odd and the server to use even stream IDs.

I've specced something similar, except the node with the lower peer ID uses even IDs and the node with the higher peer ID uses odd IDs. This lets us avoid having to rely on the stream muxer to give us this role. It also works across connections (it gets confusing if the stream muxer says we are the client on one connection and the server on the other).
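A small sketch of that even/odd scheme, assuming peer IDs compare as raw bytes and that IDs start at 0 and 1. The type `idAllocator` and its methods are hypothetical names, not taken from the spec or the PoC.

```go
package main

import (
	"bytes"
	"fmt"
)

// idAllocator hands out stream-migration IDs from one parity class.
type idAllocator struct {
	next uint64
}

// newIDAllocator gives even IDs to the peer with the lower ID and odd
// IDs to the peer with the higher ID. Equal IDs are not expected, since
// a node does not migrate streams with itself.
func newIDAllocator(local, remote []byte) *idAllocator {
	if bytes.Compare(local, remote) < 0 {
		return &idAllocator{next: 0}
	}
	return &idAllocator{next: 1}
}

// Next returns a fresh ID and advances by two to stay in the same class.
func (a *idAllocator) Next() uint64 {
	id := a.next
	a.next += 2
	return id
}

// labelledByLower reports whether an ID was chosen by the lower peer,
// which is the boolean encoded in the least significant bit.
func labelledByLower(id uint64) bool { return id&1 == 0 }

func main() {
	alloc := newIDAllocator([]byte{0x01}, []byte{0x02})
	fmt.Println(alloc.Next(), alloc.Next())
}
```

Because both sides derive the parity from the peer IDs alone, the result is the same on every connection between the two peers, regardless of which side the muxer considers the client.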

marten-seemann · 2022-05-23T14:52:18Z

connections/stream-migration/streammigration.proto

+  oneof type {
+    Label label = 1;
+    Migrate migrate = 2;
+  }


Does this need an AckMigrate option?

Ah yes, thank you

marten-seemann · 2022-05-23T14:53:36Z

connections/stream-migration.md

+initial stream `1` was half closed, then the final migrated stream `2` should
+also be half closed. Note this may involve an extra step by one of the nodes.
+If a node, had closed writes to its old stream before migration it should also
+close writes to the new stream after migration.


We should probably call out what happens if a node sends new data if the stream was already closed in that direction: that is a connection error.
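To make the half-close rule and the error case concrete, here is a minimal in-memory sketch. `streamState`, `carryOver`, and `onRemoteData` are hypothetical names for illustration only; a real implementation would sit on top of the muxer's stream type.

```go
package main

import (
	"errors"
	"fmt"
)

// streamState is a hypothetical model of one muxer stream's close state.
type streamState struct {
	localWriteClosed  bool // we already sent EOF
	remoteWriteClosed bool // the peer already sent EOF
}

var errDataAfterEOF = errors.New("data received after EOF: treat as a connection error")

// carryOver copies the half-close state of the old stream onto the new
// one, so the migrated stream ends up in the same state.
func carryOver(old streamState, next *streamState) {
	next.localWriteClosed = old.localWriteClosed
	next.remoteWriteClosed = old.remoteWriteClosed
}

// onRemoteData is called for every chunk the peer sends on the stream.
func (s *streamState) onRemoteData(p []byte) error {
	if s.remoteWriteClosed && len(p) > 0 {
		return errDataAfterEOF
	}
	return nil
}

func main() {
	old := streamState{remoteWriteClosed: true}
	var migrated streamState
	carryOver(old, &migrated)
	fmt.Println(migrated.onRemoteData([]byte("late")))
}
```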

marten-seemann · 2022-05-23T14:55:41Z

connections/stream-migration.md

+state of the old stream.
+
+The protocol should only be used when the initiator knows the responder
+understands the stream-migration protocol. Otherwise we waste 1 round trip.


Suggested change

-understands the stream-migration protocol. Otherwise we waste 1 round trip.
+understands the stream-migration protocol. In that case the negotiation of the stream migration protocol can be pipelined with the negotiation of the application protocol, and therefore doesn't cost any additional round trips.

MarcoPolo added 2 commits April 13, 2022 12:46

Initial Pass

4fc768f

Add authors and interest group

b0db416

marten-seemann reviewed Apr 13, 2022

Clarifications

dac9a8d

MarcoPolo marked this pull request as draft April 16, 2022 15:26

Update spec to clarify this is a protocol that prefixes other protocols

37a86ff

vyzo reviewed Apr 19, 2022

marten-seemann reviewed Apr 19, 2022

Don't pass arguments in protocol id

6f03926

MarcoPolo mentioned this pull request Apr 21, 2022

Stream Migration PoC libp2p/go-libp2p#1413

Draft

Update note on protocol ID. Add should on knowing responder supports …

406171f

…the protocol id

mxinden reviewed Apr 25, 2022

MarcoPolo added 3 commits April 25, 2022 10:50

Rename stream A to stream 1 to match uints

99f055e

Update stream migration spec

c0fd69f

Add some links

eef233e

MarcoPolo requested a review from marten-seemann April 25, 2022 15:52

MarcoPolo added 2 commits April 25, 2022 18:17

Simply use the lower peer id

49c0597

Remove designated stream migrator

4e1f106

MarcoPolo marked this pull request as ready for review April 27, 2022 10:10

BigLep assigned MarcoPolo Apr 27, 2022

marten-seemann approved these changes May 23, 2022

Add ack_migrate

a380dc4

Stebalien mentioned this pull request Jan 2, 2023

Idea: Friendly Streams #500

Open

SgtPooki mentioned this pull request Aug 7, 2023

Create spec legend and terminology guideline #565

Open

5 tasks

MarcoPolo mentioned this pull request Aug 7, 2023

Stream Migration Protocol #328

Open

Jorropo mentioned this pull request Mar 8, 2024

feat: add connection selection logic libp2p/go-libp2p#2726

Open

MarcoPolo commented Apr 13, 2022

MarcoPolo commented Apr 13, 2022

marten-seemann left a comment

marten-seemann Apr 13, 2022

MarcoPolo Apr 14, 2022

MarcoPolo commented Apr 14, 2022

marten-seemann commented Apr 16, 2022

MarcoPolo commented Apr 16, 2022

vyzo Apr 19, 2022

marten-seemann Apr 19, 2022

marten-seemann Apr 19, 2022

MarcoPolo Apr 20, 2022 •

edited

MarcoPolo Apr 20, 2022

marten-seemann Apr 20, 2022

mxinden Apr 21, 2022

MarcoPolo Apr 21, 2022

marten-seemann Apr 19, 2022

mxinden Apr 25, 2022

MarcoPolo Apr 25, 2022

marten-seemann Apr 25, 2022

MarcoPolo Apr 25, 2022

bertrandfalguiere Apr 25, 2022

MarcoPolo Apr 25, 2022 •

edited

MarcoPolo Apr 25, 2022 •

edited

marten-seemann Apr 25, 2022 •

edited

MarcoPolo Apr 25, 2022

MarcoPolo Apr 25, 2022

marten-seemann May 23, 2022

MarcoPolo May 23, 2022

marten-seemann May 23, 2022

marten-seemann May 23, 2022
