Packet number echo with variable-length numbering #391
Thanks for writing this up. We can worry about the details later, but this question of exposing largest acked needs discussion. A couple of comments in the meanwhile: |
+1 to Jana's point II. Given that in that case every ACK needs to have the echo, but not every packet that has an echo needs to be an ACK, I don't think we expose anything that shouldn't be exposed. |
Interesting idea, but if we remove the Largest Acked from the ACK frame itself, then we lose the property that every frame is self-describing (which, if I remember correctly, was considered a desirable property in Tokyo). From an implementor's perspective, this would introduce a lot of coupling between code parts that are otherwise separate. In the end, all we save is 4 bytes per RTT, so I'm not sure if it's really worth it. |
To summarize (will also post on quic@)... there seem to be three possibilities emerging here:

Option 1: Echo (at least) once per RTT. Leave ACK frames unchanged. (This is what's written up in this PR.)
Rationale: What happens in the frame layer stays in the frame layer. Packet number echo is a pure packet-layer change to the transport protocol. Since this is additional overhead, we should work to minimize it, hence once per RTT, the minimum frequency that works for passive latency measurement.
Upsides: It's a tiny change to the protocol as presently described, for a low overhead cost per RTT (1-4 bytes, usually 2). The ACK frame remains self-describing, so the ACK frame handling component in an implementation can remain separate from the packet handling component. Endpoints can choose to echo more frequently (e.g., when configured to do so for debugging purposes).
Downsides: An echoing endpoint with a passively-cooperative peer can intentionally skew passive measurements, since the value in the packet header is not necessarily joined to the value in the ACK frame. The utility and practicality of such a thing is debatable. It's not clear how good the resulting number/echo stream will be for one-point loss and reordering estimation.

Option 2: Echo max-acked on every packet with an ACK frame, and remove it from the ACK frame.
Rationale: This information already appears in the ACK layer, so there's no additional overhead.
Upsides: Echo frequency is higher; this should make loss and reordering estimation work better at a single observation point. No additional overhead.
Downsides: Increased complexity of implementations, which have to pass packet number echo information up the stack. No endpoint control over how often packet numbers are echoed, since max-ack is required by the transport mechanisms.
Mixedsides (depending on your proclivities): The presence of an ACK frame is now exposed to the path.
ACK-frame-containing packets can (and given the history of TCP, probably will) be treated differently by the network in certain circumstances. Option
|
Option 1 is more conservative, and what I had first envisioned, but Option 2 is certainly enticing. I secretly suspect that both ingress and egress largest packet numbers will have to be stored in the PCB somewhere, so it might not be too painful to implement. (Although if incoming packets are reordered, so the largest acked field drops momentarily, that could get messy...) Whether or not a packet has any user data is opaque enough to make any ack-dropping middlebox behavior so dangerous that I doubt anyone would actually try it. So I think I am leaning towards option 2. The strongest counterargument I can see is @ianswett's off-github comment that under option 1, endpoints can simply eliminate the echo if middleboxes are doing bad things with it. |
From the list:

"The first option means that an implementation could not send the echo, for example if a network was using it in a way that was harmful to users. This provides an incentive for networks that do use it to either do nothing active with it or use it to improve user experience. (see Marcus Ihlar's thread for that)

I think the largest upside (or downside, depending upon your perspective) of the second option, where the largest acked is included with every frame, is that it means one MUST implement this correctly to implement QUIC, at least if they want to use the standard ACK frame.

Personally, my largest concern with the second case is that of differential treatment of acks. What if an ack packet contains other data, and a middlebox decides that only the most recent ack needs to be delivered (a common TCP approach), so the data bundled with the ack is inadvertently lost?

At this point, I lean towards the first option, because I understand the implications better, but I could be convinced otherwise. Either way, I think both of these are potentially worthy of including and I think they're preferable to doing nothing."

In regards to reordering, the largest ack never decreases, even if packets are received out of order, so this should make decoding the value quite easy for middleboxes. Any packet with an echo present should be ignored if its packet number is less than that of the last echo-carrying packet you saw (because that means reordering in the ack stream); otherwise the largest acked is monotonically increasing.

Before we do add this, I'd really like to hear from some network operators about how critical this is. Ericsson clearly has a use case, but I wouldn't want to add this for a single company. My take is: If we give network operators the right information, they'll be able to operate their networks better, and that will benefit QUIC in the long run. If we don't think this information is critical to operations, it just adds one more thing to ossify. |
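To make the middlebox rule above concrete, here is a minimal sketch of how an observer might filter echoes for one direction of a flow. It assumes the observer has already decoded full packet numbers and echo values (variable-length decoding is out of scope), and the names `EchoTracker` and `observe` are hypothetical, not from any draft or implementation.

```python
from typing import Optional


class EchoTracker:
    """Hypothetical per-direction state kept by an on-path observer."""

    def __init__(self) -> None:
        # Packet number of the newest echo-carrying packet seen so far.
        self.last_pn: Optional[int] = None
        # Echo (largest acked) value carried by that packet.
        self.last_echo: Optional[int] = None

    def observe(self, pn: int, echo: Optional[int]) -> Optional[int]:
        """Return the echo if it is usable, or None if the packet is ignored."""
        if echo is None:
            return None  # packet carries no echo
        if self.last_pn is not None and pn < self.last_pn:
            return None  # older than the last echo-carrying packet: reordered
        self.last_pn = pn
        self.last_echo = echo
        return echo
```

With this filter, the sequence of accepted echoes never goes backwards as long as the sender's largest acked never decreases, which is the property the comment relies on.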
I made this point quite unclearly in my comment. The annoying thing, implementation-wise, about option 2 is having to pass the header field all the way into ACK frame processing. What I was trying to say is that largest_acked is probably a PCB variable anyway, so header processing just writes the largest_acked value into the PCB and ACK processing can pull it from there. If the ACK stream is reordered, I'm not going to rewrite pcb->largest_acked to a lower value. So I'm stuck passing the value down again. Not the end of the world, but just trying to address complexity. |
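A rough sketch of the coupling described here, under an assumed Option 2 layout in which the ACK frame carries only a first block length relative to the echoed largest acked. The `Pcb` class, `on_short_header`, and `acked_range` are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Pcb:
    """Hypothetical protocol control block holding connection state."""
    largest_acked: int = -1


def on_short_header(pcb: Pcb, echo: int) -> int:
    """Header processing: record the echo, but never lower the PCB value."""
    pcb.largest_acked = max(pcb.largest_acked, echo)
    # The raw echo still has to be handed to frame processing, because a
    # reordered packet's ACK frame is relative to its own (older) echo.
    return echo


def acked_range(largest_acked: int, first_block_len: int) -> range:
    """ACK processing: packet numbers covered by the first ACK block."""
    return range(largest_acked - first_block_len, largest_acked + 1)


pcb = Pcb()
newer = on_short_header(pcb, 20)
older = on_short_header(pcb, 15)  # arrives late; PCB stays at 20
print(list(acked_range(older, 2)))              # [13, 14, 15]
print(list(acked_range(pcb.largest_acked, 2)))  # [18, 19, 20], wrong for this packet
```

The last two lines show why pulling the value from the PCB alone is not enough when the ACK stream is reordered: the late packet's ACK block would be applied to the wrong packet numbers.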
There's another consideration: how useful the information is for passive measurement. Here, Option 2 edges out Option 1 for passive RTT measurement -- since the echo signal is more frequent, observation points can have higher-fidelity estimates with less state. Given this, I'm leaning in favor of option 2 as well. Echo isn't very helpful for loss measurement -- most of the literature on one-point loss measurement makes use of the fact that TCP exposes retransmissions. After spending a few minutes at the whiteboard this morning, the only not-terrible idea that comes to mind is having a detailed RTT series (as possible with Option 2, but not with option 1) and estimating events generated by loss bursts by looking for transients in residuals in the RTT series. Loss estimation that doesn't suck needs additional information, such as adding a ConEx-like signal as discussed in #279. |
Why couple the echo to the presence of ACK? Why not send it on every packet then? |
It's not necessarily coupled. If the max-acked field comes out of the ACK frame, a sender MUST send an echo on any packet containing an ACK frame, and MAY on any other packet with a short header. Indeed, if we want to make sure that nobody gets the idea that "packet number echo == ACK-only packet a la TCP" we should ensure that echo also appears on some non-ACK packets.

Thinking about this a bit more... Consider the common case of an extremely asymmetric flow (e.g. big HTTP object GET). The client is going to be sending primarily ACK frames, so will be sending lots of echoes. The server won't be ACKing anything, since it got the whole request in the first few packets. The max packet number seen keeps going up, since it's getting a lot of packets full of ACKs, but it's not ACKing them. So you're only seeing echoes in one direction. That's bad for one-point RTT measurement. (For those not used to thinking about measuring RTT at a single point, here's an illustration I made a few years ago; the main point is you need sufficient samples in both directions so you can add "forward-side" and "reverse-side" estimates together.) So just "echo on ACK" is probably insufficient. Let me suggest the following simplification, then:

Option 3: Echo max-acked / max-seen on every packet with a short header, and remove it from the ACK frame.
Rationale: This ensures sufficient samples for high-fidelity RTT measurement regardless of flow asymmetry, and offsets the overhead somewhat and eliminates redundancy by removing the echoed information from the ACK frame.
Upsides: It's very simple to know when to echo. The presence of an ACK frame is not exposed to the path. This provides the best possible signal for passive RTT measurement.
Downsides: A little more overhead. Shares with option 2 the lack of frame self-description for ACK frames. |
@britram I think it's ok to provide an RTT measurement less frequently when little data is flowing. The removal of STOP_WAITING and acking acks results in sending a retransmittable frame with an ack approximately once per RTT. So both directions should get an RTT estimate approximately once per RTT. @martinthomson Connections are typically bandwidth limited in one direction, but commonly not both. We'd like to limit the overhead this adds, particularly on the bandwidth limited direction. So putting it on every packet containing an ack is a reasonable compromise. I'm still apprehensive about making this required in any way. Past experience with exposing more information to the network has not been uniformly positive. |
The issue here is that each direction isn't providing an RTT measurement; it's providing a measurement only for the component of the RTT between the observation point and one endpoint. Referring to the illustration, an observation point close to the receiver in an asymmetric flow gets the RTT component toward the receiver far more often than the RTT component toward the sender.
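The point about per-direction components can be sketched as follows. This is an illustrative model only: `OnePointRtt`, the direction labels `"fwd"`/`"rev"`, and integer millisecond timestamps are all assumptions, and the sketch ignores endpoint ACK delay, reordering, and state cleanup.

```python
from typing import Dict, Optional


class OnePointRtt:
    """Hypothetical observer combining two one-sided RTT components."""

    def __init__(self) -> None:
        # direction -> packet number -> time the packet passed the observer
        self.seen_at: Dict[str, Dict[int, int]] = {"fwd": {}, "rev": {}}
        # component[d]: delay from the observer to the endpoint that receives
        # direction-d packets and back, learned from echoes in the other direction
        self.component: Dict[str, Optional[int]] = {"fwd": None, "rev": None}

    def observe(self, direction: str, now_ms: int, pn: int,
                echo: Optional[int] = None) -> None:
        other = "rev" if direction == "fwd" else "fwd"
        self.seen_at[direction][pn] = now_ms
        if echo is not None:
            # The echo names a packet that earlier travelled in the other direction.
            t0 = self.seen_at[other].pop(echo, None)
            if t0 is not None:
                self.component[other] = now_ms - t0

    def rtt(self) -> Optional[int]:
        """Full RTT needs a sample from both sides of the observation point."""
        fwd, rev = self.component["fwd"], self.component["rev"]
        if fwd is None or rev is None:
            return None
        return fwd + rev
```

If echoes only ever appear in one direction, one of the two components never gets a sample and `rtt()` stays `None`, which is the asymmetry problem described above.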
Ah, right. Sorry, still thinking in TCP terms. Okay, so Option 2 echoes on (practically) a superset of Option 1. FWIW I continue to share your apprehension about making this non-optional. I like the idea of allowing application/user veto, which is only possible when the packet number echo is redundant with the ACK frames. |
While true, I'm not sure in practice that middlebox operators care about both sides of the path. In many cases, one side is "network I'm responsible for" and one is "the internet." In others, I might be using RTT to detect congestion and throttle egress accordingly, which implies that there's data going out. All that said, I prefer option 3 until someone gives me an example of the echo being used maliciously. |
I like Option 3. It is especially powerful when combined with Loss bit from #279. |
Replaces #357 and #367. It adds an optional packet number echo to the short header, with the echo taking the same size as the packet number, and includes guidance on selecting the packet number size. This PR introduces no technical changes compared to #367.