diff --git a/draft-ietf-quic-recovery.md b/draft-ietf-quic-recovery.md index e343fbeb06..9da73ae7d7 100644 --- a/draft-ietf-quic-recovery.md +++ b/draft-ietf-quic-recovery.md @@ -239,101 +239,19 @@ forward progress without relying on timeouts. QUIC endpoints measure the delay incurred between when a packet is received and when the corresponding acknowledgment is sent, allowing a peer to maintain a -more accurate round-trip time estimate (see {{host-delay}}). - - -# Generating Acknowledgements {#generating-acks} - -An acknowledgement SHOULD be sent immediately upon receipt of a second -ack-eliciting packet. QUIC recovery algorithms do not assume the peer sends -an ACK immediately when receiving a second ack-eliciting packet. - -In order to accelerate loss recovery and reduce timeouts, an endpoint SHOULD -immediately send an ACK frame when it receives an out-of-order packet that is -ACK-eliciting. The endpoint MAY continue sending ACK frames immediately on each -subsequently received packet, but the endpoint SHOULD return to acknowledging -every other packet after a period of 1/8 x RTT, unless more ACK-eliciting -packets are received out of order. If every subsequent ACK-eliciting packet -arrives out of order, then an ACK frame SHOULD be sent immediately for every -received ACK-eliciting packet. - -If a packet is received with the ECN Congestion Experienced (CE) codepoint in -the IP header, the endpoint SHOULD respond with an ACK frame immediately, even -if the packet is received in order. Doing so reduces the peer's response time -to congestion events. - -If multiple packets are already available at an endpoint, the endpoint MAY -process them all before sending an ACK frame in response. The endpoint can -determine whether an acknowledgement should be sent immediately or delayed after -processing the batch. 
- -## Crypto Handshake Data - -In order to quickly complete the handshake and avoid spurious retransmissions -due to crypto retransmission timeouts, crypto packets SHOULD use a very short -ack delay, such as the local timer granularity. ACK frames SHOULD be sent -immediately when the crypto stack indicates all data for that packet number -space has been received. - -## ACK Ranges - -When an ACK frame is sent, one or more ranges of acknowledged packets are -included. Including older packets reduces the chance of spurious retransmits -caused by losing previously sent ACK frames, at the cost of larger ACK frames. - -ACK frames SHOULD always acknowledge the most recently received packets, and the -more out-of-order the packets are, the more important it is to send an updated -ACK frame quickly, to prevent the peer from declaring a packet as lost and -spuriously retransmitting the frames it contains. - -Below is one recommended approach for determining what packets to include in an -ACK frame. - -## Receiver Tracking of ACK Frames - -When a packet containing an ACK frame is sent, the largest acknowledged in that -frame may be saved. When a packet containing an ACK frame is acknowledged, the -receiver can stop acknowledging packets less than or equal to the largest -acknowledged in the sent ACK frame. - -In cases without ACK frame loss, this algorithm allows for a minimum of 1 RTT -of reordering. In cases with ACK frame loss and reordering, this approach does -not guarantee that every acknowledgement is seen by the sender before it is no -longer included in the ACK frame. Packets could be received out of order and -all subsequent ACK frames containing them could be lost. In this case, the -loss recovery algorithm may cause spurious retransmits, but the sender will -continue making forward progress. 
- -## Measuring and Reporting Host Delay {#host-delay} - -An endpoint measures the delays intentionally introduced between when an -ACK-eliciting packet is received and the corresponding acknowledgment is sent. -The endpoint encodes this delay for the largest acknowledged packet in the -Ack Delay field of an ACK frame (see Section 19.3 of {{QUIC-TRANSPORT}}). -This allows the receiver of the ACK to adjust for any intentional delays, -which is important for delayed acknowledgements, when estimating the path RTT. -A packet might be held in the OS kernel or elsewhere on the host before being -processed. An endpoint SHOULD NOT include these unintentional delays when -populating the Ack Delay field in an ACK frame. - -An endpoint MUST NOT excessively delay acknowledgements of ack-eliciting -packets. The maximum ack delay is communicated in the max_ack_delay transport -parameter; see Section 18.2 of {{QUIC-TRANSPORT}}. max_ack_delay implies an -explicit contract: an endpoint promises to never delay acknowledgments of an -ack-eliciting packet by more than the indicated value. If it does, any excess -accrues to the RTT estimate and could result in spurious retransmissions from -the peer. For Initial and Handshake packets, a max_ack_delay of 0 is used. +more accurate round-trip time estimate (see Section 13.2 of {{QUIC-TRANSPORT}}). # Estimating the Round-Trip Time {#compute-rtt} At a high level, an endpoint measures the time from when a packet was sent to when it is acknowledged as a round-trip time (RTT) sample. The endpoint uses -RTT samples and peer-reported host delays ({{host-delay}}) to generate a -statistical description of the connection's RTT. An endpoint computes the -following three values: the minimum value observed over the lifetime of the -connection (min_rtt), an exponentially-weighted moving average (smoothed_rtt), -and the variance in the observed RTT samples (rttvar). 
+RTT samples and peer-reported host delays (see Section 13.2 of +{{QUIC-TRANSPORT}}) to generate a statistical description of the connection's +RTT. An endpoint computes the following three values: the minimum value +observed over the lifetime of the connection (min_rtt), an +exponentially-weighted moving average (smoothed_rtt), and the variance in the +observed RTT samples (rttvar). ## Generating RTT samples {#latest-rtt} @@ -380,10 +298,10 @@ min_rtt is set to the latest_rtt on the first sample in a connection, and to the lesser of min_rtt and latest_rtt on subsequent samples. An endpoint uses only locally observed times in computing the min_rtt and does -not adjust for host delays reported by the peer ({{host-delay}}). Doing so -allows the endpoint to set a lower bound for the smoothed_rtt based entirely on -what it observes (see {{smoothed-rtt}}), and limits potential underestimation -due to erroneously-reported delays by the peer. +not adjust for host delays reported by the peer. Doing so allows the endpoint +to set a lower bound for the smoothed_rtt based entirely on what it observes +(see {{smoothed-rtt}}), and limits potential underestimation due to +erroneously-reported delays by the peer. ## Estimating smoothed_rtt and rttvar {#smoothed-rtt} @@ -391,15 +309,14 @@ smoothed_rtt is an exponentially-weighted moving average of an endpoint's RTT samples, and rttvar is the endpoint's estimated variance in the RTT samples. The calculation of smoothed_rtt uses path latency after adjusting RTT samples -for host delays ({{host-delay}}). For packets sent in the ApplicationData -packet number space, a peer limits any delay in sending an acknowledgement for -an ack-eliciting packet to no greater than the value it advertised in the -max_ack_delay transport parameter. 
Consequently, when a peer reports an Ack -Delay that is greater than its max_ack_delay, the delay is attributed to reasons -out of the peer's control, such as scheduler latency at the peer or loss of -previous ACK frames. Any delays beyond the peer's max_ack_delay are therefore -considered effectively part of path delay and incorporated into the smoothed_rtt -estimate. +for host delays. For packets sent in the ApplicationData packet number space, +a peer limits any delay in sending an acknowledgement for an ack-eliciting +packet to no greater than the value it advertised in the max_ack_delay transport +parameter. Consequently, when a peer reports an Ack Delay that is greater than +its max_ack_delay, the delay is attributed to reasons out of the peer's control, +such as scheduler latency at the peer or loss of previous ACK frames. Any +delays beyond the peer's max_ack_delay are therefore considered effectively +part of path delay and incorporated into the smoothed_rtt estimate. When adjusting an RTT sample using peer-reported acknowledgement delays, an endpoint: @@ -408,7 +325,7 @@ endpoint: Initial and Handshake packet number space. - MUST use the lesser of the value reported in Ack Delay field of the ACK frame - and the peer's max_ack_delay transport parameter ({{host-delay}}). + and the peer's max_ack_delay transport parameter. - MUST NOT apply the adjustment if the resulting RTT sample is smaller than the min_rtt. This limits the underestimation that a misreporting peer can cause @@ -445,7 +362,7 @@ this section provides a description of these algorithms. If a packet is lost, the QUIC transport needs to recover from that loss, such as by retransmitting the data, sending an updated frame, or abandoning the -frame. For more information, see Section 13.2 of {{QUIC-TRANSPORT}}. +frame. For more information, see Section 13.3 of {{QUIC-TRANSPORT}}. 
## Acknowledgement-based Detection {#ack-loss-detection} diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 485484689a..4b74190ab5 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -2963,7 +2963,7 @@ streams as necessary in outgoing packets without losing transmission efficiency to underfilled packets. -## Packet Processing and Acknowledgment {#processing-and-ack} +## Packet Processing {#processing} A packet MUST NOT be acknowledged until packet protection has been successfully removed and all frames contained in the packet have been processed. For STREAM @@ -2979,21 +2979,57 @@ packet. expectations about what implementations do with packets that have errors after valid frames? --> + +## Generating Acknowledgements {#generating-acks} + +Endpoints acknowledge all packets they receive and process. However, only +ack-eliciting packets (see {{QUIC-RECOVERY}}) trigger the sending of an ACK +frame. Packets that are not ack-eliciting are only acknowledged when an ACK +frame is sent for other reasons. + +When sending a packet for any reason, an endpoint should attempt to bundle an +ACK frame if one has not been sent recently. Doing so helps with timely loss +detection at the peer. + +In general, frequent feedback from a receiver improves loss and congestion +response, but this has to be balanced against excessive load generated by a +receiver that sends an ACK frame in response to every ack-eliciting packet. The +guidance offered below seeks to strike this balance. + ### Sending ACK Frames -An endpoint sends ACK frames to acknowledge packets it has received and -processed. +An ACK frame SHOULD be generated for at least every second ack-eliciting packet. +This recommendation is in keeping with standard practice for TCP {{?RFC5681}}. -Packets containing only ACK frames are not congestion controlled, so there are -limits on how frequently they can be sent. 
An endpoint MUST NOT send more than -one ACK-frame-only packet in response to receiving an ACK-eliciting packet -(one containing frames other than ACK and/or PADDING). An endpoint MUST NOT -send a packet containing only an ACK frame in response to a non-ACK-eliciting -packet (one containing only ACK and/or PADDING frames), even if there are -packet gaps which precede the received packet. Limiting ACK frames avoids an -infinite feedback loop of acknowledgements, which could prevent the connection -from ever becoming idle. However, the endpoint acknowledges non-ACK-eliciting -packets when it sends an ACK frame. +A receiver's delayed acknowledgment timer SHOULD NOT exceed the current RTT +estimate or the value it indicates in the `max_ack_delay` transport parameter. +This ensures an acknowledgment is sent at least once per RTT when packets +needing acknowledgement are received. The sender can use the receiver's +`max_ack_delay` value in determining timeouts for timer-based retransmission. + +In order to assist loss detection at the sender, an endpoint SHOULD send an ACK +frame immediately on receiving an ack-eliciting packet that is out of order. The +endpoint MAY continue sending ACK frames immediately on each subsequently +received packet, but the endpoint SHOULD return to acknowledging every other +packet after a period of 1/8 x RTT, unless more ACK-eliciting packets are +received out of order. If every subsequent ACK-eliciting packet arrives out of +order, then an ACK frame SHOULD be sent immediately for every received +ACK-eliciting packet. + +Similarly, packets marked with the ECN Congestion Experienced (CE) codepoint in +the IP header SHOULD be acknowledged immediately, to reduce the peer's response +time to congestion events. + +As an optimization, a receiver MAY process multiple packets before sending any +ACK frames in response. 
In this case the receiver can determine whether an +immediate or delayed acknowledgement should be generated after processing +incoming packets. + +Acknowledgements of packets carrying CRYPTO frames SHOULD be minimally delayed, +to complete the handshake with minimal latency. Delaying them by a small amount, +such as the local timer granularity, allows the endpoint to bundle any data sent +in response with the ACK frame. ACK frames SHOULD be sent immediately when the +crypto stack indicates all data for that packet number space has been received. Packets containing PADDING frames are considered to be in flight for congestion control purposes {{QUIC-RECOVERY}}. Sending only PADDING frames might cause the @@ -3002,29 +3038,64 @@ sender to become limited by the congestion controller (as described in receiver. Therefore, a sender SHOULD ensure that other frames are sent in addition to PADDING frames to elicit acknowledgments from the receiver. -An endpoint that is only sending ACK frames will not receive -acknowledgments from its peer unless those acknowledgements are included in -packets with ACK-eliciting frames. An endpoint SHOULD bundle ACK frames with -other frames when there are new ACK-eliciting packets to acknowledge. -When only non-ACK-eliciting packets need to be acknowledged, an endpoint MAY -wait until an ACK-eliciting packet has been received to bundle an ACK frame -with outgoing frames. +An endpoint that is only sending ACK frames will not receive acknowledgments +from its peer unless those acknowledgements are included in packets with +ACK-eliciting frames. An endpoint SHOULD bundle ACK frames with other frames +when there are new ACK-eliciting packets to acknowledge. When only +non-ACK-eliciting packets need to be acknowledged, an endpoint MAY wait until an +ACK-eliciting packet has been received to bundle an ACK frame with outgoing +frames. + +The algorithms in {{QUIC-RECOVERY}} are resilient to receivers that do not +follow the guidance offered above. 
However, an implementor should only deviate from +these requirements after careful consideration of the performance implications +of doing so. + +Packets containing only ACK frames are not congestion controlled, so there are +limits on how frequently they can be sent. An endpoint MUST NOT send more than +one ACK-frame-only packet in response to receiving an ACK-eliciting packet (one +containing frames other than ACK and/or PADDING). An endpoint MUST NOT send a +packet containing only an ACK frame in response to a non-ACK-eliciting packet +(one containing only ACK and/or PADDING frames), even if there are packet gaps +which precede the received packet. Limiting ACK frames avoids an infinite +feedback loop of acknowledgements, which could prevent the connection from ever +becoming idle. However, the endpoint acknowledges non-ACK-eliciting packets when +it sends an ACK frame. An endpoint SHOULD treat receipt of an acknowledgment for a packet it did not send as a connection error of type PROTOCOL_VIOLATION, if it is able to detect the condition. -The receiver's delayed acknowledgment timer SHOULD NOT exceed the current RTT -estimate or the value it indicates in the `max_ack_delay` transport parameter. -This ensures an acknowledgment is sent at least once per RTT when packets -needing acknowledgement are received. The sender can use the receiver's -`max_ack_delay` value in determining timeouts for timer-based retransmission. +### Managing ACK Ranges + +When an ACK frame is sent, one or more ranges of acknowledged packets are +included. Including older packets reduces the chance of spurious retransmits +caused by losing previously sent ACK frames, at the cost of larger ACK frames. -Strategies and implications of the frequency of generating acknowledgments are -discussed in more detail in {{QUIC-RECOVERY}}. 
+ACK frames SHOULD always acknowledge the most recently received packets, and the +more out-of-order the packets are, the more important it is to send an updated +ACK frame quickly, to prevent the peer from declaring a packet as lost and +spuriously retransmitting the frames it contains. +{{ack-tracking}} and {{ack-limiting}} describe an exemplary approach for +determining what packets to acknowledge in each ACK frame. -### Limiting ACK Ranges +### Receiver Tracking of ACK Frames {#ack-tracking} + +When a packet containing an ACK frame is sent, the largest acknowledged in that +frame may be saved. When a packet containing an ACK frame is acknowledged, the +receiver can stop acknowledging packets less than or equal to the largest +acknowledged in the sent ACK frame. + +In cases without ACK frame loss, this algorithm allows for a minimum of 1 RTT +of reordering. In cases with ACK frame loss and reordering, this approach does +not guarantee that every acknowledgement is seen by the sender before it is no +longer included in the ACK frame. Packets could be received out of order and +all subsequent ACK frames containing them could be lost. In this case, the +loss recovery algorithm could cause spurious retransmits, but the sender will +continue making forward progress. + +### Limiting ACK Ranges {#ack-limiting} To limit ACK Ranges (see {{ack-ranges}}) to those that have not yet been received by the sender, the receiver SHOULD track which ACK frames have been @@ -3046,12 +3117,26 @@ to unnecessarily retransmit some data. Standard QUIC algorithms acknowledged. Therefore, the receiver SHOULD repeatedly acknowledge newly received packets in preference to packets received in the past. -An endpoint SHOULD treat receipt of an acknowledgment for a packet it did not -send as a connection error of type PROTOCOL_VIOLATION, if it is able to detect -the condition. 
This includes receiving an ACK frame containing a packet number -that the endpoint has not sent, as well as acknowledgements for 0-RTT packets -when the server has rejected the use of 0-RTT. - +### Measuring and Reporting Host Delay {#host-delay} + +An endpoint measures the delays intentionally introduced between when an +ACK-eliciting packet is received and the corresponding acknowledgment is sent. +The endpoint encodes this delay for the largest acknowledged packet in the Ack +Delay field of an ACK frame (see {{frame-ack}}). This allows the receiver of the +ACK to adjust for any intentional delays, which is important for getting a +better estimate of the path RTT when acknowledgments are delayed. A packet might +be held in the OS kernel or elsewhere on the host before being processed. An +endpoint MUST NOT include delays that it does not control when populating the +Ack Delay field in an ACK frame. + +An endpoint MUST NOT excessively delay acknowledgements of ack-eliciting +packets. An endpoint commits to a maximum delay using the max_ack_delay +transport parameter; see {{transport-parameter-definitions}}. max_ack_delay +declares an explicit contract: an endpoint promises to never delay +acknowledgments of an ack-eliciting packet by more than the indicated value. If +it does, any excess accrues to the RTT estimate and could result in delayed +retransmissions from the peer. For Initial and Handshake packets, a +max_ack_delay of 0 is used. ### ACK Frames and Packet Protection @@ -3066,9 +3151,6 @@ unable to use these acknowledgments if the server cryptographic handshake messages are delayed or lost. Note that the same limitation applies to other data sent by the server protected by the 1-RTT keys. -Endpoints SHOULD send acknowledgments for packets containing CRYPTO frames with -a reduced delay; see Section 6.2 of {{QUIC-RECOVERY}}. 
- ## Retransmission of Information @@ -3195,7 +3277,7 @@ during connection establishment and when migrating to a new path On receiving a QUIC packet with an ECT or CE codepoint, an ECN-enabled endpoint that can access the ECN codepoints from the enclosing IP packet increases the corresponding ECT(0), ECT(1), or CE count, and includes these counts in -subsequent ACK frames (see {{processing-and-ack}} and {{frame-ack}}). Note +subsequent ACK frames (see {{generating-acks}} and {{frame-ack}}). Note that this requires being able to read the ECN codepoints from the enclosing IP packet, which is not possible on all platforms. @@ -3204,7 +3286,7 @@ local ECN codepoint counts; see ({{security-ecn}}) for relevant security concerns. If an endpoint receives a QUIC packet without an ECT or CE codepoint in the IP -packet header, it responds per {{processing-and-ack}} with an ACK frame without +packet header, it responds per {{generating-acks}} with an ACK frame without increasing any ECN counts. If an endpoint does not implement ECN support or does not have access to received ECN codepoints, it does not increase ECN counts.