compensation of ack_delay is fragile against errors #2060
Thanks, I think that's a good suggestion. Writing a PR now.
You still want to have the if check, to avoid a large value due to overflow. Which unfortunately leaves us at the same place. I think what you want is:
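A sketch of the kind of overflow-safe guard being described here, assuming unsigned time values (names and structure are illustrative, not the draft's pseudocode):

```python
# Illustrative sketch: a guarded equivalent of
# latest_rtt = max(latest_rtt - ack_delay, min_rtt) that never lets an
# unsigned subtraction underflow into a huge value.
def adjust_latest_rtt(latest_rtt, min_rtt, ack_delay):
    if latest_rtt - min_rtt > ack_delay:
        # Subtraction is safe: the result stays above min_rtt.
        return latest_rtt - ack_delay
    else:
        # Reported delay is implausibly large; clamp to the floor.
        return min_rtt
```

Written this way it is four lines instead of one, but it avoids relying on signed arithmetic.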
That said, I'm not sure that this is a good change. Basically, if you have a bad clock at the receiver, this means that you constantly underestimate the latest RTT sample. That is bad for two reasons: (i) you miss any queues that might be building up because you underestimate latest_rtt, and (ii) the smoothed_rtt value that might be used for pacing uses this underestimated value. OTOH, in the current world, if the receiver is consistently bad in reporting ack_delay, the sender becomes less aggressive than it should be, since the latest_rtt is larger than it should be. Pacing rate is lower, and any sender queueing estimate that uses latest_rtt detects a queue when there isn't one. This isn't unsafe, it's just poor performance, which should be expected for a receiver with a poor clock.
That code produces the same outcome as Ian's patch, though it's 4 lines instead of 1. I tend to think that this sort of defensive guard is useful, though. If a peer under- or over-reports the ACK delay, that's going to produce bad results, but someone with a clock that janky is going to have a bad time no matter what.
Isn't that mathematically equivalent to
What type of behavior are you assuming for a "bad clock"? The issue with the current approach is that it considers the true ack_delay to be zero when the value reported by the peer is huge. Assuming that we cannot tell how good the clock of the peer (or the endpoint) is, we should round ack_delay to the nearest reasonable value. That is, if
@kazuho @martinthomson: Mathematically, yes, but I wanted to protect against overflow. @kazuho: I'm concerned that if the ack_delay being reported is too large, I don't know how much of that is in error and how much of that is accurate. The patch assumes that exactly latest_rtt - min_rtt is the correct ack_delay, but there's no evidence of it. The problem with the proposed patch is that a client that reports an ack_delay that is too large will automatically cause the sender to always believe that the RTT is not growing at all. (Imagine the case where ack_delay is always reported as MAX_INT. This means that the sender will always end up with latest_rtt = min_rtt.) I'm suggesting that if there's an indication that the peer is feeding in values that are too large for ack_delay (which should not happen consistently or all the time), then we should ignore ack_delay.
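The failure mode described above can be made concrete with a small sketch (numbers and names are made up; 2**62 - 1 is the QUIC varint maximum):

```python
# A peer that always reports a huge ack_delay pins every clamped RTT
# sample at min_rtt, so the sender never sees the RTT growing.
MAX_ACK_DELAY_REPORT = 2**62 - 1  # largest value a QUIC varint can carry

def clamped_sample(latest_rtt, min_rtt, ack_delay):
    return max(latest_rtt - ack_delay, min_rtt)

min_rtt = 25
samples = [clamped_sample(rtt, min_rtt, MAX_ACK_DELAY_REPORT)
           for rtt in (25, 50, 100, 400)]  # true RTT is growing
# samples == [25, 25, 25, 25]: latest_rtt always collapses to min_rtt
```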
I thought that we had accepted the costs of integrating ack_delay in calculations (it might be wrong) against the benefits (we get better accounting in the majority of the cases where it is set accurately). |
In general, yes, but in this case, I don't see a reason to use any part of
ack_delay when we know it's incorrect.
Would it be better to disregard ack_delay for all future packets on the connection then?
If folks feel strongly about this, I'm fine with it. I don't think it makes
sense, but it does seem like I'm in the minority, and if you all think it
makes sense, that's what we'll do.
I don't feel strongly. I'm just trying to understand. It's a little odd that we would take ack_delay into account conditionally on a per-packet basis.
In general, I think having an inflated RTT is better than a lower-than-actual RTT, since an inflated RTT leads to a less aggressive sender and a smaller-than-actual RTT leads to a more aggressive sender. Admittedly, it's possible for there to be error when latest_rtt - (inflated) ack_delay > min_rtt. That's the error that I had considered an accepted cost of using ack_delay. It's fair to ask what we should do about a receiver that we suspect has a bad clock... I wouldn't be opposed to ignoring ack_delay entirely for the rest of the connection... I wonder if there are counter-cases that I'm missing, though, where this ends up being too aggressive. In general, the receiver should be rounding down when reporting delay, not up.
I do not think that that is an issue. If the clock has jitter, the observed SRTT using the proposed formula will end up at a value that is either equal to or slightly higher than the true SRTT. If the clock occasionally jumps (to adjust time), the observed SRTT using the proposed formula would also be either equal to or slightly higher than the true SRTT. So I would assume that the proposed formula has no issues in the cases above.
I can understand your concern about broken implementations, but I am not sure the current defense is good enough if we do need to take care of them. Consider the case of an endpoint that always sets a bogus ack_delay. To put it another way, it seems to me that there is no reliable way of detecting peers sending broken ack_delay values. Or alternatively, we might want to state a requirement for QUIC stacks running with an unreliable clock.
FYI, my experience with clocks is that they're either noisy (and if so, over very short timescales, <1ms) or they jump suddenly. In the case of a sudden jump, any protection that largely ignores the jump should be fine. I like Kazuho's suggestion because it changes a case that's currently discontinuous (1us of extra max ack delay and the RTT sample can jump by tens of ms) into one that's continuous. Discontinuities in congestion control and loss recovery tend to lead to weird corner cases. The current pseudocode is what gQUIC currently does, which is why it's there, but especially now that max ack delay is explicitly communicated, I think Kazuho's suggestion is a good one. There is the question of what to do if the specified ack_delay is larger than the value specified in the transport params. Possibly we should only adjust by min(ack_delay, max_ack_delay), and then if the alarm is always waking up late, SRTT will include that lateness even though max_ack_delay does not?
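The discontinuity can be seen numerically. In this sketch (made-up microsecond values), one extra microsecond of reported delay flips the draft-style conditional from full compensation to none, while the clamped form changes smoothly:

```python
# Draft-style behavior: compensate only if the full delay can be
# subtracted without going below min_rtt; otherwise ignore it entirely.
def conditional(latest_rtt, min_rtt, ack_delay):
    if latest_rtt - min_rtt > ack_delay:
        return latest_rtt - ack_delay
    return latest_rtt

# Proposed behavior: subtract, but never go below min_rtt.
def clamped(latest_rtt, min_rtt, ack_delay):
    return max(latest_rtt - ack_delay, min_rtt)

min_rtt, latest_rtt = 25_000, 75_000  # microseconds
jumpy = [conditional(latest_rtt, min_rtt, d) for d in (49_999, 50_000)]
smooth = [clamped(latest_rtt, min_rtt, d) for d in (49_999, 50_000)]
# jumpy == [25_001, 75_000]: a 50 ms jump from 1 us of extra delay
# smooth == [25_001, 25_000]: continuous degradation
```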
Yeah. "Discontinuity" sounds like the best word to explain the issue in the current approach.
I am not sure if min(ack_delay, max_ack_delay) is desirable, because ack_delay would typically be greater than max_ack_delay for retransmitted ACKs.
In regards to min(ack_delay, max_ack_delay), it's critical to consider what srtt is used for, which is loss detection and timers (TLP and RTO). If there is a high rate of ACK loss, or if the peer's alarms are persistently waking up late, then the SRTT and/or RTTVAR will increase, thereby increasing these timers. If this happens very infrequently, then the EWMA will forget them relatively quickly. I think that seems appropriate? To think of the pathological case: what if a peer communicates a max_ack_delay of 0? I think it's important that the rest of the recovery machinery still functions correctly in that case.
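A minimal sketch of the min(ack_delay, max_ack_delay) cap under discussion, including the pathological max_ack_delay = 0 case (helper names are made up):

```python
# Cap the reported delay at the peer's advertised max_ack_delay before
# compensating, then clamp at min_rtt as proposed above.
def capped_sample(latest_rtt, min_rtt, ack_delay, max_ack_delay):
    effective_delay = min(ack_delay, max_ack_delay)
    return max(latest_rtt - effective_delay, min_rtt)

# With max_ack_delay == 0 the cap removes all compensation: late peer
# alarms then flow into SRTT/RTTVAR, inflating the timers instead.
```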
I had not thought about that, and admittedly I am unfamiliar with loss recovery; however, I wonder if that's appropriate. IIUC, the reason we use X * SRTT, where X is slightly greater than one, is because we expect an ACK to arrive within SRTT plus some jitter. However, in the case of ACK loss, the SRTT under this computation would become too large for the case where there is no loss, or too small for the case where there is loss. For example, with ACK loss at 33%, SRTT becomes 33% greater than the RTT. I am not sure that is a desirable behavior. If we can assume that ACK loss happens infrequently, IMO the correct thing to do would be to ignore the effect of ACK retransmissions.
It's important to consider the fact that typically, even if there is some ACK loss, so many immediate ACKs are being sent (particularly when acking every other packet) that the skew is minimal. The interesting case is at the tail, where the final ACK is lost and the next ACK arrives a long time afterwards.
I think I might have a simpler case that illustrates the problem with the capping. What happens if an endpoint receives an ACK-only packet, sleeps for 10 seconds, then sends some data? The sent packet will contain data and an ACK frame with an ack_delay field of 10 seconds. Don't we need to subtract 10 seconds in this case?
Responding to the most recent couple of comments first: SRTT does not include ack_delay, which is why we add max_ack_delay back into the RTO computation. So, even in the case of ACK loss, the SRTT should be the same... the reported ack_delay might keep increasing. There is the question of what to do if ack_delay gets larger than max_ack_delay, which is why I had argued (in an earlier thread) that we should prefer observed ack_delays over explicitly stated ones. I think we had discussed monitoring ack_delays and using that value over time, but left it for later...
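The point about adding max_ack_delay back can be sketched as follows (the 4 * rttvar term follows the usual TCP-style timeout shape; treat the exact formula as illustrative rather than a quote from the draft):

```python
# Because SRTT excludes ack_delay, the retransmission timeout adds the
# peer's max_ack_delay back in so the timer covers delayed ACKs.
def rto_duration(smoothed_rtt, rttvar, max_ack_delay):
    return smoothed_rtt + 4 * rttvar + max_ack_delay
```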
I think there should be a goal of making both the ack_delay always 0 case and the max_ack_delay of 0 case converge to the same behavior? I think the old approach was problematic because it used a max filter for something that potentially had some loss and tail latency that was not typical, but could permanently drive up tail latency. As such, I think limiting by no more than max_ack_delay is a good real-world compromise.
I understand the argument about sudden increases when the ack_delay is increased... but that is exactly my point as well. It's easy to see a sudden decrease in latest_rtt with the patch when ack_delay is increased. Consider a clock adjustment at the client where the clock "caught up" with real time (I've seen this happen), and the reported ack_delay is quite large. If the actual RTT had departed a fair bit from min_rtt, then this single misreported ack_delay will cause the sender's estimate of latest_rtt to drop to min_rtt.

As @ianswett notes, what matters is where this erroneous value -- too small or too large -- gets used. It's used in RTO, TLP, and perhaps in pacing. My argument is that having too large a value -- as would happen with the spec -- is better than having too small a value, because that's the conservative thing to do.

For context, we should be mindful that any sudden and occasional increase/decrease in ack_delay is easily absorbed by the RTT filters; that's in fact exactly why the EWMA filter is weighted as it is. Any single sample does not change the SRTT or RTTVAR by much. Individual samples can be noisy. Ultimately, as long as it's an RTT spike, it actually doesn't matter much. It only matters if it's persistent.
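The absorption argument can be checked with the standard EWMA gains (1/8 for SRTT, 1/4 for RTTVAR); the numbers here are made up:

```python
# A single outlier moves SRTT by only 1/8 of the error, so one
# misreported ack_delay cannot swing the estimator far on its own.
def ewma_update(srtt, rttvar, sample):
    rttvar = 0.75 * rttvar + 0.25 * abs(srtt - sample)
    srtt = 0.875 * srtt + 0.125 * sample
    return srtt, rttvar

srtt, rttvar = 100.0, 10.0
srtt, rttvar = ewma_update(srtt, rttvar, 500.0)  # one 5x RTT spike
# srtt == 150.0: pulled toward the spike, but nowhere near 500
```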
@ianswett: I agree that we compromised. I was just noting that the ideal solution might be to use an EWMA, perhaps, but we're settling for simplicity right now. That has a cost... and these corner cases are that cost, I think. I'm tempted to suggest that we simply drop the RTT samples where the ack_delay does not make sense. This should not be the common case anyway.
Thanks to @ianswett and @janaiyengar for all the conversation. For connections that are advancing, I now think that what @ianswett says makes sense: capping the compensation.

However, in the case of a heavily congested network (or after a disruption), capping the number means that we could have an unreasonably high SRTT. Seeing that is why I raised the issue. Considering that, I think I might prefer @janaiyengar's suggestion to skip updating RTT information based on ACK frames with ack_delays that do not make sense.
@kazuho, I see two issues here that we are discussing, and I'll state them explicitly, to ensure that I'm not confused :-)

Does this seem right?
Yes, those are the two issues, and they're somewhat separate. For 1, I'm concerned we could ignore samples for a long time under some circumstances. If the RTT is 25ms and a connection is always sending 1 packet and hitting the delayed ack timer, then the min_rtt will be 50ms, not 25ms. And I believe ignoring SRTT samples that are less than min_rtt would result in ignoring every SRTT sample until an immediate ack was sent. Admittedly, we say to ACK crypto packets immediately, but if one used a 1ms timer as 'immediate', I think the same problem results? For 2, I think putting it into SRTT is the natural thing. Adding another EWMA doesn't seem worthwhile, and in most cases it is very similar to incorporating it into SRTT.
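The delayed-ack concern in point 1 can be made concrete with a tiny numeric sketch (made-up values):

```python
# If every ACK so far hit the delayed-ack timer, min_rtt is learned as
# path RTT plus the ack delay, not the true path RTT.
path_rtt = 25          # true path RTT in ms
delayed_ack = 25       # receiver's delayed-ack timer in ms
min_rtt = path_rtt + delayed_ack  # sender believes min_rtt is 50 ms

# A later sample, correctly adjusted for ack_delay, equals path_rtt;
# a rule that drops samples below min_rtt would drop every such sample.
adjusted_sample = path_rtt
dropped = adjusted_sample < min_rtt  # True until an immediate ACK arrives
```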
@kazuho, up above you were concerned about a 10-second RTT sample due to bundling an ACK of an ACK. That should not be an issue, because an RTT sample should only be taken when the largest acked is a retransmittable packet. But now that I look at the text and pseudocode, the retransmittable qualification is missing, so I'll write up a PR to fix that.
@janaiyengar writes:
I hope you reported this as a bug to the client's implementers: most operating systems provide a monotonic clock.
Thank you for the clarification and the PR. They make me less worried about the issue :-)
@dtikhonov: The clock jumped up. It was monotonic.
@ianswett: Responding to your comment (here) and PR #2062. For (1), yes, that is true, but we have already put a stake in the ground: min_rtt is only driven by what a sender measures, not by reported ack_delay. Under that framework, no SRTT sample that is smaller than the measured min_rtt will be valid. That is a consequence we know, and hopefully we should be measuring the true min_rtt often enough that this isn't a problem. Basically, my core argument is this: SRTT is used for loss detection and potentially congestion control and pacing.

Given that this is a mis-reporting receiver, a natural consequence of mis-reporting should be poor performance, not more aggressive sending. I still think the sender should ignore a computed latest_rtt sample that appears to be smaller than min_rtt.
There are certainly cases where this does not require a mis-reporting receiver (i.e., what I described above), but I think your argument is that you don't care about them? If so, this does make it somewhat critical to send a reasonable number of non-delayed ACKs to ensure min_rtt is discovered relatively quickly. I completely agree that I don't want to change the definition of min_rtt, FWIW.
@ianswett: Sorry, I was referring to a mis-reporting receiver in the case where the min_rtt is known. In the case you mention, yes, that is a potential condition where a sender might have poor performance too, but I imagine this is not something we would expect to happen for very long, especially given that all the handshake packets should be acked immediately. Using a 1ms timer for "immediate" means that's just a part of the min_rtt... nothing a sender can do about that. Basically, I don't think optimistically compensating at the sender for receiver delays is the way to go. If the RTT sample seems broken, I would argue it should be ignored.
As commented on the PR, I'm increasingly thinking we should limit by max_ack_delay and then, if it still seems invalid, ignore ack_delay, as in the current text, but not ignore the entire sample, since even without the ack_delay adjustment, we expect the RTT sample to be somewhat useful.
Limiting to max_ack_delay sounds fine to me, as I've said earlier. If the RTT sample is used without ack_delay (current spec text) instead of being ignored, this causes the SRTT to increase, which is safe. I'm fine with it.
Limiting to max_ack_delay sounds good to me, especially considering the fact that we now only take RTT samples from ACK-eliciting packets.
SGTM, closing the PR and this issue then. We can always revisit it if we have some new data that indicates the current text is a problem.
The logic in the UpdateRtt function that compensates for ack_delay is as follows.

The issue here is that the delay is completely ignored if `ack_delay` is a bit greater than the true value due to measurement errors, etc. If this happens on a congested network where ACKs get lost, `latest_rtt` ends up at an incredibly high value. IMO this should be `latest_rtt = max(latest_rtt - ack_delay, min_rtt)`.
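For reference, the compensation step being discussed can be sketched as follows (a reconstruction from the surrounding discussion, not the draft's verbatim pseudocode):

```python
# Reconstruction of the draft-style compensation: ack_delay is
# subtracted only when doing so leaves latest_rtt above min_rtt;
# otherwise the reported delay is ignored entirely.
def update_rtt(latest_rtt, min_rtt, ack_delay):
    if latest_rtt - min_rtt > ack_delay:
        latest_rtt -= ack_delay
    return latest_rtt
```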