Do we actually have an established consensus on path definition? #291
Some clarity would be good. The mechanism that I do understand is the number space. It is a tool that we use to number packets sent over one of the possible 4-tuples between two endpoints. This is important for several reasons:
So far, this means selecting a number space and one or several connection IDs to send on a given 4-tuple. Two designs can work: a 1-1 mapping between connection ID and number space, or a 1-N mapping, in which several connection IDs point to the same number space. Up to this point, we do not require any tie between the two directions of communication. The tying happens because if the client starts sending on a 4-tuple, the server may decide to send on the symmetric 4-tuple. The server discovers a possible "sending path" by receiving packets from the client. Thus, a bit of complexity:
The "1-1 mapping between CID and number space ID" solves these requirements, but does induce a discontinuity in packet numbers when a connection ID is rotated. The "1-N mapping between number space and CID" can also solve these issues. It makes CID rotation easy to handle, including when CID rotation is combined with NAT rebinding. It does require some additional code, for example a new "new CID" frame that links a CID to a path identifier. The 1-N mapping also requires extra care when abandoning a path and freeing its resources, because just retiring a CID does not retire the number space. For example, does abandoning a number space automatically retire the CIDs tied to that number space? How should we handle the arrival of packets for a CID that is not retired yet, but is tied to a retired number space?
Back to @yfmascgy's question: I think we get much more clarity if we focus on number spaces instead of on the actual definition of a path. I also think that we should be very explicit in our handling of path directions. For the MP ACK, this is simple: the number space identifier is the one used by the sender of the packets being acknowledged. For the Path Abandon, this is less clear. Is the statement "I will not send anything more using that number space" (i.e., local sender space ID), or "Please don't send anything anymore with that number space" (i.e., remote sender space ID)?
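To make the two mappings concrete, here is a minimal Python sketch. It is not taken from the draft: the `NumberSpace` and `Sender` classes, the CID strings, and the `abandon_space` helper are all hypothetical names chosen for illustration. It shows the packet number discontinuity on CID rotation under the 1-1 mapping, the continuity under the 1-N mapping, and the dangling CIDs left behind when a shared number space is abandoned.

```python
from dataclasses import dataclass, field


@dataclass
class NumberSpace:
    """One packet number space; packet numbers only ever increase."""
    space_id: int
    next_pn: int = 0
    retired: bool = False

    def allocate(self) -> int:
        pn = self.next_pn
        self.next_pn += 1
        return pn


@dataclass
class Sender:
    """Hypothetical sender state mapping each CID to a number space."""
    spaces: dict = field(default_factory=dict)        # space_id -> NumberSpace
    cid_to_space: dict = field(default_factory=dict)  # CID -> space_id

    def bind(self, cid: str, space_id: int) -> None:
        self.spaces.setdefault(space_id, NumberSpace(space_id))
        self.cid_to_space[cid] = space_id

    def send(self, cid: str) -> tuple:
        """Return (space_id, packet_number) for a packet sent with this CID."""
        space = self.spaces[self.cid_to_space[cid]]
        if space.retired:
            raise ValueError(f"CID {cid} is tied to retired space {space.space_id}")
        return (space.space_id, space.allocate())

    def abandon_space(self, space_id: int) -> list:
        """Retire a number space and report the CIDs still pointing at it."""
        self.spaces[space_id].retired = True
        return sorted(c for c, s in self.cid_to_space.items() if s == space_id)


# 1-1 mapping: rotating from cid-a to cid-b switches to a fresh number space.
one_to_one = Sender()
one_to_one.bind("cid-a", 0)
one_to_one.bind("cid-b", 1)
seq_1_1 = [one_to_one.send("cid-a"), one_to_one.send("cid-a"), one_to_one.send("cid-b")]

# 1-N mapping: both CIDs share space 0, so numbering continues across rotation.
one_to_n = Sender()
one_to_n.bind("cid-a", 0)
one_to_n.bind("cid-b", 0)
seq_1_n = [one_to_n.send("cid-a"), one_to_n.send("cid-a"), one_to_n.send("cid-b")]

# Abandoning the shared space leaves two CIDs that are not themselves retired.
dangling = one_to_n.abandon_space(0)
```

In this sketch a packet for a non-retired CID tied to a retired space is simply rejected. That is one possible answer to the question above, picked only to keep the example short.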
@yfmascgy : I think you make a very good point. With middleboxes between QUIC endpoints, the thing that defines a "path" may differ from the perspective of each endpoint - e.g. the source IP address and UDP port that a sender inserts into an IP datagram may not be the same as the source IP address and UDP port that the receiver sees in that datagram. And, as RFC9000 took pains to describe, the source IP address and UDP port seen by the receiver may change over time without the knowledge of the sender, resulting in (passive) connection migration. Without a path identifier that is independent of the 4-tuple, it is hard to see how the endpoints can have any meaningful dialogue regarding "paths". Some would argue that this independent path identifier should be a connection identifier; others would argue that it should be an explicit path identifier selected by an endpoint. It might be worthwhile to look at the events that might result in the declaration of a new path. I don't think a receiver can unilaterally declare a new path based only on the detection of a passive connection migration (i.e. a change in the source IP address and/or UDP port seen in an IP datagram). A sender may determine that it is using a new path based on: a) a change in the source IP address used by the sender, and/or b) a change in the network access point used by the sender. Note that (b) often results in (a), but there are some access networks where the same source IP address can be used across different access points. There are some deployments where (a) and/or (b) are hidden from the peer endpoint by the intervening network. Therefore, the declaration of a new path can only be made by a sender. @huitema : As you might guess, I am not in favour of solutions that require multiple packet number spaces ;-)
Regarding the meaning of a "path", I agree that this is something conceptually difficult to concretise without ambiguity. Over a "path", you actually have two (unidirectional) flows: one on which the endpoint sends packets, and one on which it receives them. While the endpoint has some control over its sending flow (at least at the beginning), it has none over its receiving flow (for instance, passive migration or changes of the 4-tuple due to the network). Regarding the notion of an "explicit Path ID", I think it is useful because the receiving endpoint can figure out what its peer is doing, i.e., it can map a sending flow of its peer 1-1 to one of its receiving flows, even if the 4-tuple or CID changes. This enables "flow continuity" from the endpoint's point of view. @BillGageIETF, regarding multiple packet number spaces, this is something we actually studied, and we presented results at IETF 115. See Alibaba's technical report or the SIGCOMM CCR publication. In particular, using a single packet number space for the whole connection can cause issues when handling paths with very different characteristics.
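As a rough illustration of the "flow continuity" idea, the following Python sketch keys receive-side state on an explicit path ID rather than on the 4-tuple or CID. Everything here is an assumption made for the example: the `Receiver` and `ReceiveFlow` names, the `on_packet` signature, and the premise that each packet can be attributed to a path ID at all (how that ID would be conveyed is exactly what is under discussion).

```python
from dataclasses import dataclass


@dataclass
class ReceiveFlow:
    """Receive-side state for one sending flow of the peer."""
    path_id: int
    four_tuple: tuple
    cid: str
    packets: int = 0


class Receiver:
    """Hypothetical receiver that keys its flows on an explicit path ID."""

    def __init__(self):
        self.flows = {}  # path_id -> ReceiveFlow

    def on_packet(self, path_id: int, four_tuple: tuple, cid: str) -> ReceiveFlow:
        flow = self.flows.get(path_id)
        if flow is None:
            # First packet seen for this path ID: create the flow.
            flow = ReceiveFlow(path_id, four_tuple, cid)
            self.flows[path_id] = flow
        else:
            # Same path ID: keep the flow, just refresh what was observed.
            flow.four_tuple = four_tuple
            flow.cid = cid
        flow.packets += 1
        return flow


rx = Receiver()
rx.on_packet(1, ("10.0.0.1", 4433, "192.0.2.1", 443), "cid-a")
rx.on_packet(1, ("10.0.0.9", 5000, "192.0.2.1", 443), "cid-a")  # NAT rebinding
rx.on_packet(1, ("10.0.0.9", 5000, "192.0.2.1", 443), "cid-b")  # CID rotation
```

After these three packets the receiver still holds a single flow for path ID 1, even though both the 4-tuple and the CID changed along the way.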
@qdeconinck : I don't want to repeat the discussion of issue #214 here. I will just note that since those earlier single-versus-multiple space discussions, the multiple space solution (in draft -06) has resulted in divergence from RFC9000 in some pretty fundamental ways. I will also note (as I did in #214) that those earlier discussions were not aware of the incompatibility that has come to light between draft -06 and the masque QUIC-aware UDP proxy draft. Handling diverse path characteristics within a single packet number space is a solvable problem even if it makes the implementation more complex.
I think we have consensus that a path is something that is identified by the 4-tuple (or 5-tuple, if we consider the address family). In the case of "stable path ID," I think it is just the name that might be confusing; it is actually referring to a particular "send slot" belonging to a QUIC connection. If we are to adopt such a design, I think the way to proceed would be to rename "stable path ID" to something else, rather than going back to the definition of "path."
Comments record here: marten-seemann [on Nov 20, 2023]> (#292 (comment))
The debate over whether to use a stable path ID (#214) may have deeper logical underpinnings than initially perceived, and may hinge on whether there’s an agreed-upon definition of a path.
The current draft considers defining a path as a 4-tuple between two endpoints. This definition currently shapes the draft design, where different 4-tuples are assigned unique IDs and linked to distinct packet number spaces, maintaining logical consistency as this definition alone does not incorporate the concept of path "continuity".
Those with differing views (i.e., stable path ID) essentially contest this definition of a path, perhaps even unknowingly. To adopt a stable path ID, we must redefine what constitutes a path, incorporating the idea of “continuity.”
But defining continuity is challenging. It suggests creating a transition graph where paths sharing an ID can trace back to a common origin, illustrated by transitions like A1 -----> A2 ------> A3, where A is some observable attribute of a path.
An imperfect example of such a transition graph is based on the combination of CID and 4-tuple: (CID0, tuple0) ----> (CID0, tuple1) -----> (CID1, tuple1). However, the first complexity arises when both CID and tuple change simultaneously, as in (CID0, tuple0) ----????-----> (CID1, tuple1). Here, it’s unclear if these represent the same or different paths, making it impossible to construct a definitive transition graph. The second issue is that forcefully linking two such nodes with a path ID could lead to anomalies. For instance, when we tie CID to path ID (#214), you might start on an LTE path (PathID1) and have a standby path on Wi-Fi (PathID2) in a restaurant. Moving to a shopping mall and connecting to its Wi-Fi, you initially use a CID associated with PathID2. Yet, even after a CID rotation following path validation, you remain linked to PathID2, because you would rotate to another CID assigned to that path ID, despite the Wi-Fi path, its ISP, and the MTU potentially being different. There is another proposal: stick to the loose path ID model but allow reuse of the same packet number space. Again, the logic hinges on whether we have a well-defined notion of continuity from which to build a definitive transition graph, so that we can link two paths to the same packet number space without ambiguity.
Given the years of back-and-forth on this issue, it seems helpful to first reach a consensus on the path definition. The design should logically follow from this agreed definition, similar to how different assumptions about parallel lines underpin Euclidean and non-Euclidean geometries. Without a clear definition, when we say "This is the same path because it has the same identifier", we risk circular reasoning, a logical fallacy we should avoid.
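The ambiguity in the (CID, 4-tuple) transition graph can be sketched in a few lines of Python. The rule used here is an assumption made for illustration, not something the draft specifies: a transition counts as continuous if at least one of the two attributes is unchanged, and as ambiguous if both change at once. The `Observation`, `classify_transition`, and `trace` names are hypothetical.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """The observable attributes of a path at one point in time."""
    cid: str
    four_tuple: str


def classify_transition(prev: Observation, curr: Observation) -> str:
    """Classify one edge of the transition graph under the assumed rule."""
    cid_same = prev.cid == curr.cid
    tuple_same = prev.four_tuple == curr.four_tuple
    if cid_same and tuple_same:
        return "unchanged"
    if cid_same or tuple_same:
        # One attribute carries over, so the two nodes can be linked.
        return "continuous"
    # Both changed at once: nothing observable links the two nodes.
    return "ambiguous"


def trace(observations: list) -> list:
    """Classify every consecutive pair of observations."""
    return [classify_transition(a, b) for a, b in zip(observations, observations[1:])]


# (CID0, tuple0) ----> (CID0, tuple1) -----> (CID1, tuple1)
stepwise = trace([
    Observation("CID0", "tuple0"),
    Observation("CID0", "tuple1"),
    Observation("CID1", "tuple1"),
])

# (CID0, tuple0) ----????-----> (CID1, tuple1)
simultaneous = trace([
    Observation("CID0", "tuple0"),
    Observation("CID1", "tuple1"),
])
```

The same start and end nodes yield a fully continuous chain when the changes happen one at a time, and an unresolvable edge when they happen together, which is why this attribute pair alone cannot define continuity.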
I could be wrong, but I think if we want a design that stands the test of time, we need to address the path definition unequivocally. Therefore, I suggest calling for a consensus on a formal path definition. Two questions: (1) Do we want to incorporate "continuity" in the definition or stick to the current 4-tuple definition? (2) How do we come up with a formal definition that incorporates the idea of continuity?