Commit

WGLC review nits
MikeBishop committed Jul 8, 2020
1 parent ae8cb06 commit 3b24fdf
Showing 1 changed file with 62 additions and 64 deletions.
126 changes: 62 additions & 64 deletions draft-ietf-quic-recovery.md
@@ -103,12 +103,12 @@ TCP implementations.

Definitions of terms that are used in this document:

Ack-eliciting Frames:
Ack-eliciting frames:

: All frames other than ACK, PADDING, and CONNECTION_CLOSE are considered
ack-eliciting.

Ack-eliciting Packets:
Ack-eliciting packets:

: Packets that contain ack-eliciting frames elicit an ACK from the receiver
within the maximum ack delay and are called ack-eliciting packets.
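
The two definitions above reduce to a simple classification rule. The following is a minimal sketch, not part of the draft; it assumes frame types are represented as plain strings, and the function names are hypothetical.

```python
# Frame types that do NOT elicit an acknowledgement, per the definition above.
# Representing frame types as strings is an assumption made for illustration.
NON_ACK_ELICITING = frozenset({"ACK", "PADDING", "CONNECTION_CLOSE"})


def is_ack_eliciting_frame(frame_type: str) -> bool:
    """Every frame other than ACK, PADDING, and CONNECTION_CLOSE is ack-eliciting."""
    return frame_type not in NON_ACK_ELICITING


def is_ack_eliciting_packet(frame_types) -> bool:
    """A packet is ack-eliciting if it contains at least one ack-eliciting frame."""
    return any(is_ack_eliciting_frame(t) for t in frame_types)
```

Under this sketch, a packet carrying only ACK and PADDING frames is not ack-eliciting, while adding a single STREAM frame makes it so.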
@@ -129,8 +129,8 @@ number space for the lifetime of a connection. Packet numbers are sent in
monotonically increasing order within a space, preventing ambiguity.

This design obviates the need for disambiguating between transmissions and
retransmissions and eliminates significant complexity from QUIC's interpretation
of TCP loss detection mechanisms.
retransmissions; this eliminates significant complexity from QUIC's
interpretation of TCP loss detection mechanisms.

QUIC packets can contain multiple frames of different types. The recovery
mechanisms ensure that data and frames that need reliable delivery are
@@ -154,9 +154,9 @@ of frames contained in a packet affect recovery and congestion control logic:
# Relevant Differences Between QUIC and TCP

Readers familiar with TCP's loss detection and congestion control will find
algorithms here that parallel well-known TCP ones. Protocol differences between
QUIC and TCP however contribute to algorithmic differences. We briefly describe
these protocol differences below.
algorithms here that parallel well-known TCP ones. However, protocol differences
between QUIC and TCP contribute to algorithmic differences. These protocol
differences are briefly described below.

## Separate Packet Number Spaces

@@ -180,7 +180,7 @@ QUIC's packet number is strictly increasing within a packet number space,
and directly encodes transmission order. A higher packet number signifies
that the packet was sent later, and a lower packet number signifies that
the packet was sent earlier. When a packet containing ack-eliciting
frames is detected lost, QUIC rebundles necessary frames in a new packet
frames is detected lost, QUIC includes necessary frames in a new packet
with a new packet number, removing ambiguity about which packet is
acknowledged when an ACK is received. Consequently, more accurate RTT
measurements can be made, spurious retransmissions are trivially detected, and
@@ -205,8 +205,8 @@ once across multiple round trips.
## No Reneging

QUIC ACKs contain information that is similar to TCP SACK, but QUIC does not
allow any acked packet to be reneged, greatly simplifying implementations on
both sides and reducing memory pressure on the sender.
allow any acknowledged packet to be reneged, greatly simplifying implementations
on both sides and reducing memory pressure on the sender.

## More ACK Ranges

@@ -232,7 +232,7 @@ in-flight lost, QUIC allows probe packets to temporarily exceed the congestion
window whenever the timer expires.

In doing this, QUIC avoids unnecessary congestion window reductions, obviating
the need for correcting mechanisms such as F-RTO {{?RFC5682}}. Since QUIC does
the need for correcting mechanisms such as F-RTO ({{?RFC5682}}). Since QUIC does
not collapse the congestion window on a PTO expiration, a QUIC sender is not
limited from sending more in-flight packets after a PTO expiration if it still
has available congestion window. This occurs when a sender is
@@ -291,7 +291,7 @@ An RTT sample is generated using only the largest acknowledged packet in the
received ACK frame. This is because a peer reports ACK delays for only the
largest acknowledged packet in an ACK frame. While the reported ACK delay is
not used by the RTT sample measurement, it is used to adjust the RTT sample in
subsequent computations of smoothed_rtt and rttvar {{smoothed-rtt}}.
subsequent computations of smoothed_rtt and rttvar ({{smoothed-rtt}}).

To avoid generating multiple RTT samples for a single packet, an ACK frame
SHOULD NOT be used to update RTT estimates if it does not newly acknowledge the
@@ -326,7 +326,7 @@ erroneously-reported delays by the peer.
The RTT for a network path may change over time. If a path's actual RTT
decreases, the min_rtt will adapt immediately on the first low sample. If
the path's actual RTT increases, the min_rtt will not adapt to it, allowing
future RTT samples that are smaller than the new RTT be included in
future RTT samples that are smaller than the new RTT to be included in
smoothed_rtt.
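
The asymmetric behaviour described in this paragraph can be illustrated with a small sketch. It is not taken from the draft; the class name and the use of seconds as floats are assumptions made for illustration.

```python
class MinRttTracker:
    """Tracks the minimum RTT observed on a path (hypothetical helper)."""

    def __init__(self):
        self.min_rtt = None  # no sample received yet

    def on_rtt_sample(self, latest_rtt: float) -> float:
        # A lower sample replaces min_rtt immediately; a higher sample never
        # raises it, so min_rtt does not follow an increase in the path RTT.
        if self.min_rtt is None or latest_rtt < self.min_rtt:
            self.min_rtt = latest_rtt
        return self.min_rtt
```

Feeding samples of 50 ms, 30 ms, and 80 ms in that order leaves min_rtt at 30 ms: the drop is adopted on the first low sample and the later increase is ignored.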

## Estimating smoothed_rtt and rttvar {#smoothed-rtt}
@@ -338,7 +338,7 @@ mean variation.
The calculation of smoothed_rtt uses path latency after adjusting RTT samples
for acknowledgement delays. These delays are computed using the ACK Delay
field of the ACK frame as described in Section 19.3 of {{QUIC-TRANSPORT}}.
For packets sent in the ApplicationData packet number space, a peer limits
For packets sent in the Application Data packet number space, a peer limits
any delay in sending an acknowledgement for an ack-eliciting packet to no
greater than the value it advertised in the max_ack_delay transport parameter.
Consequently, when a peer reports an Ack Delay that is greater than its
@@ -399,12 +399,12 @@ as by retransmitting the data, sending an updated frame, or abandoning the
frame. For more information, see Section 13.3 of {{QUIC-TRANSPORT}}.


## Acknowledgement-based Detection {#ack-loss-detection}
## Acknowledgement-Based Detection {#ack-loss-detection}

Acknowledgement-based loss detection implements the spirit of TCP's Fast
Retransmit {{?RFC5681}}, Early Retransmit {{?RFC5827}}, FACK {{FACK}}, SACK loss
recovery {{?RFC6675}}, and RACK {{?RACK=I-D.ietf-tcpm-rack}}. This section
provides an overview of how these algorithms are implemented in QUIC.
Retransmit ({{?RFC5681}}), Early Retransmit ({{?RFC5827}}), FACK ({{FACK}}),
SACK loss recovery ({{?RFC6675}}), and RACK ({{?RACK=I-D.ietf-tcpm-rack}}). This
section provides an overview of how these algorithms are implemented in QUIC.

A packet is declared lost if it meets all the following conditions:

@@ -430,13 +430,13 @@ thresholds to minimize recovery latency.

The RECOMMENDED initial value for the packet reordering threshold
(kPacketThreshold) is 3, based on best practices for TCP loss detection
{{?RFC5681}} {{?RFC6675}}. Implementations SHOULD NOT use a packet threshold
less than 3, to keep in line with TCP {{?RFC5681}}.
({{?RFC5681}}, {{?RFC6675}}). Implementations SHOULD NOT use a packet threshold
less than 3, to keep in line with TCP ({{?RFC5681}}).

Some networks may exhibit higher degrees of reordering, causing a sender to
detect spurious losses. Algorithms that increase the reordering threshold after
spuriously detecting losses, such as TCP-NCR {{?RFC4653}}, have proven to be
useful in TCP and are expected to at least as useful in QUIC. Re-ordering
spuriously detecting losses, such as TCP-NCR ({{?RFC4653}}), have proven to be
useful in TCP and are expected to be at least as useful in QUIC. Re-ordering
could be more common with QUIC than TCP, because network elements cannot observe
and fix the order of out-of-order packets.
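
As an illustration of the packet reordering threshold only, the check can be sketched as below. This is a simplified sketch, not the draft's pseudocode: the function name is hypothetical, and it assumes the caller has already established that the packet is unacknowledged, in flight, and was sent before an acknowledged packet. The time threshold is not modelled.

```python
K_PACKET_THRESHOLD = 3  # RECOMMENDED initial value stated above


def lost_by_packet_threshold(packet_number: int,
                             largest_acked: int,
                             threshold: int = K_PACKET_THRESHOLD) -> bool:
    """Return True if an unacknowledged packet trails the largest
    acknowledged packet number by at least `threshold` packets."""
    return largest_acked >= packet_number + threshold
```

With the default threshold, packet 7 is declared lost once packet 10 is acknowledged, whereas packet 8 is not. An implementation that adapts to reordering could pass a larger `threshold`, in the spirit of the TCP-NCR approach mentioned above.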

@@ -486,11 +486,11 @@ prior unacknowledged packets to be marked as lost. When an acknowledgement
is received that newly acknowledges packets, loss detection proceeds as
dictated by packet and time threshold mechanisms; see {{ack-loss-detection}}.

As with loss detection, the probe timeout is per packet number space.
The PTO algorithm used in QUIC implements the reliability functions of
Tail Loss Probe {{?RACK}}, RTO {{?RFC5681}}, and F-RTO algorithms for
TCP {{?RFC5682}}. The timeout computation is based on TCP's retransmission
timeout period {{?RFC6298}}.
As with loss detection, the probe timeout is per packet number space. The PTO
algorithm used in QUIC implements the reliability functions of Tail Loss Probe
({{?RACK}}), RTO ({{?RFC5681}}), and F-RTO ({{?RFC5682}}) algorithms for TCP.
The timeout computation is based on TCP's retransmission timeout period
({{?RFC6298}}).

### Computing PTO

@@ -503,11 +503,11 @@ PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay

The PTO period is the amount of time that a sender ought to wait for an
acknowledgement of a sent packet. This time period includes the estimated
network roundtrip-time (smoothed_rtt), the variation in the estimate (4*rttvar),
and max_ack_delay, to account for the maximum time by which a receiver might
delay sending an acknowledgement. When the PTO is armed for Initial or
Handshake packet number spaces, the max_ack_delay is 0, as specified in
13.2.1 of {{QUIC-TRANSPORT}}.
network round-trip time (smoothed_rtt), the variation in the estimate
(4*rttvar), and max_ack_delay, to account for the maximum time by which a
receiver might delay sending an acknowledgement. When the PTO is armed for
Initial or Handshake packet number spaces, the max_ack_delay is 0, as specified
in 13.2.1 of {{QUIC-TRANSPORT}}.

The PTO value MUST be set to at least kGranularity, to avoid the timer expiring
immediately.
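
The PTO formula above can be written out as a short sketch. The function name is hypothetical, times are assumed to be seconds as floats, and the 1 ms default for the timer granularity is an assumption made for this example rather than a value stated in this section.

```python
def pto_period(smoothed_rtt: float,
               rttvar: float,
               max_ack_delay: float,
               granularity: float = 0.001) -> float:
    """PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay.

    Pass max_ack_delay=0 for the Initial and Handshake packet number spaces.
    The default `granularity` of 1 ms is an assumed value for illustration.
    """
    return smoothed_rtt + max(4 * rttvar, granularity) + max_ack_delay
```

Because the variation term is floored at the granularity, the period stays strictly above smoothed_rtt even when rttvar is zero, which keeps the timer from expiring immediately on a very stable path.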
@@ -520,9 +520,9 @@ correct packet number space.

When ack-eliciting packets in multiple packet number spaces are in flight,
the timer MUST be set for the packet number space with the earliest timeout,
with one exception. The ApplicationData packet number space (Section 4.1.1
with one exception. The Application Data packet number space (Section 4.1.1
of {{QUIC-TLS}}) MUST be ignored until the handshake completes. Not arming
the PTO for ApplicationData prevents a client from retransmitting a 0-RTT
the PTO for Application Data prevents a client from retransmitting a 0-RTT
packet on a PTO expiration before confirming that the server is able to
decrypt 0-RTT packets, and prevents a server from sending a 1-RTT packet on
a PTO expiration before it has the keys to process an acknowledgement.
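
The selection rule in this paragraph can be sketched as follows. This is a simplified illustration, not the draft's GetPtoTimeAndSpace() pseudocode: it assumes the per-space timeout has already been computed, and the function name, the string labels for spaces, and the use of None for "no ack-eliciting packets in flight" are all assumptions.

```python
from typing import Dict, Optional, Tuple

SPACES = ("Initial", "Handshake", "ApplicationData")


def earliest_pto_space(timeouts: Dict[str, Optional[float]],
                       handshake_complete: bool) -> Optional[Tuple[float, str]]:
    """Pick the packet number space whose PTO fires first.

    `timeouts` maps a space name to its absolute PTO time, or None when that
    space has no ack-eliciting packets in flight.
    """
    best = None
    for space in SPACES:
        t = timeouts.get(space)
        if t is None:
            continue  # nothing in flight in this space
        if space == "ApplicationData" and not handshake_complete:
            continue  # ignored until the handshake completes
        if best is None or t < best[0]:
            best = (t, space)
    return best
```

For example, with a Handshake timeout at t=2.0 and an Application Data timeout at t=1.0, the sketch selects Handshake while the handshake is incomplete and Application Data afterwards.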
@@ -534,8 +534,7 @@ might take longer to respond to packets during the handshake than otherwise. To
protect such a server from repeated client probes, the PTO backoff is not reset
at a client that is not yet certain that the server has finished validating the
client's address. That is, a client does not reset the PTO backoff factor on
receiving acknowledgements until it receives a HANDSHAKE_DONE frame or an
acknowledgement for one of its Handshake or 1-RTT packets.
receiving acknowledgements until the handshake is confirmed.

This exponential reduction in the sender's rate is important because
consecutive PTOs might be caused by loss of packets or acknowledgements due to
@@ -548,7 +547,7 @@ in the Handshake packet number space.
The life of a connection that is experiencing consecutive PTOs is limited by
the endpoint's idle timeout.

The probe timer MUST NOT be set if the time threshold {{time-threshold}} loss
The probe timer MUST NOT be set if the time threshold ({{time-threshold}}) loss
detection timer is set. The time threshold loss detection timer is expected
to both expire earlier than the PTO and be less likely to spuriously retransmit
data.
@@ -565,7 +564,7 @@ PATH_RESPONSE to set the initial RTT (see kInitialRtt in
{{constants-of-interest}}) for a new path, but the delay SHOULD NOT be
considered an RTT sample.

Prior to handshake completion, when few to none RTT samples have been
Prior to handshake completion, when few to no RTT samples have been
generated, it is possible that the probe timer expiration is due to an
incorrect RTT estimate at the client. To allow the client to improve its RTT
estimate, the new packet that it sends MUST be ack-eliciting.
@@ -587,7 +586,7 @@ received from the client, because packets sent on PTO count against the
anti-amplification limit. Note that the server could fail to validate the
client's address even if 0-RTT is accepted.

Since the server could be blocked until more packets are received from the
Since the server could be blocked until more bytes are received from the
client, it is the client's responsibility to send packets to unblock the server
until it is certain that the server has finished its address validation
(see Section 8 of {{QUIC-TRANSPORT}}). That is, the client MUST set the
@@ -612,7 +611,7 @@ it may assume some or all of the server's Initial packets were lost.

To speed up handshake completion under these conditions, an endpoint MAY send
a packet containing unacknowledged CRYPTO data earlier than the PTO expiry,
subject to address validation limits; see Section 8.1 of {{QUIC-TRANSPORT}}.
subject to the address validation limits in Section 8.1 of {{QUIC-TRANSPORT}}.

Peers can also use coalesced packets to ensure that each datagram elicits at
least one acknowledgement. For example, clients can coalesce an Initial packet
@@ -633,7 +632,7 @@ In addition to sending data in the packet number space for which the timer
expired, the sender SHOULD send ack-eliciting packets from other packet
number spaces with in-flight data, coalescing packets if possible. This is
particularly valuable when the server has both Initial and Handshake data
in-flight or the client has both Handshake and ApplicationData in-flight,
in-flight or the client has both Handshake and Application Data in-flight,
because the peer might only have receive keys for one of the two packet number
spaces.

@@ -711,18 +710,18 @@ before Initial packets, early 0-RTT packets will be declared lost, but that
is expected to be infrequent.

It is expected that keys are discarded after packets encrypted with them would
be acknowledged or declared lost. Initial secrets however might be destroyed
sooner, as soon as handshake keys are available; see Section 4.11.1 of
{{QUIC-TLS}}.
be acknowledged or declared lost. However, Initial secrets are destroyed as
soon as handshake keys are available to both client and server; see Section
4.11.1 of {{QUIC-TLS}}.

# Congestion Control {#congestion-control}

This document specifies a congestion controller for QUIC similar to
TCP NewReno {{?RFC6582}}.
TCP NewReno ({{?RFC6582}}).

The signals QUIC provides for congestion control are generic and are designed to
support different algorithms. Endpoints can unilaterally choose a different
algorithm to use, such as Cubic {{?RFC8312}}.
algorithm to use, such as Cubic ({{?RFC8312}}).

If an endpoint uses a different controller than that specified in this document,
the chosen controller MUST conform to the congestion control guidelines
@@ -743,7 +742,7 @@ is sent on a PTO timer expiration; see {{pto}}.

## Explicit Congestion Notification {#congestion-ecn}

If a path has been verified to support ECN {{?RFC3168}} {{?RFC8311}}, QUIC
If a path has been verified to support ECN ({{?RFC3168}}, {{?RFC8311}}), QUIC
treats a Congestion Experienced (CE) codepoint in the IP header as a signal of
congestion. This document specifies an endpoint's response when its peer
receives packets with the ECN-CE codepoint.
@@ -836,11 +835,11 @@ experiencing persistent congestion. Commonly, this can be established by
consecutive PTOs, but since the PTO timer is reset when a new ack-eliciting
packet is sent, an explicit duration must be used to account for those cases
where PTOs do not occur or are substantially delayed. The rationale for this
threshold is to enable a sender to use initial PTOs for aggressive probing,
as TCP does with Tail Loss Probe (TLP) {{RACK}}, before establishing persistent
congestion, as TCP does with a Retransmission Timeout (RTO) {{?RFC5681}}.
The RECOMMENDED value for kPersistentCongestionThreshold is 3, which is
approximately equivalent to two TLPs before an RTO in TCP.
threshold is to enable a sender to use initial PTOs for aggressive probing, as
TCP does with Tail Loss Probe (TLP; see {{RACK}}), before establishing
persistent congestion, as TCP does with a Retransmission Timeout (RTO; see
{{?RFC5681}}). The RECOMMENDED value for kPersistentCongestionThreshold is 3,
which is approximately equivalent to two TLPs before an RTO in TCP.

This duration is computed as follows:

@@ -881,7 +880,7 @@ When persistent congestion is established, the sender's congestion window MUST
be reduced to the minimum congestion window (kMinimumWindow). This response of
collapsing the congestion window on persistent congestion is functionally
similar to a sender's response on a Retransmission Timeout (RTO) in TCP
{{RFC5681}} after Tail Loss Probes (TLP) {{RACK}}.
({{RFC5681}}) after Tail Loss Probes (TLP; see {{RACK}}).

## Pacing {#pacing}

@@ -985,9 +984,9 @@ limits and so no advantage is gained by doing so.

Endpoints choose the congestion controller that they use. Though congestion
controllers generally treat reports of ECN-CE markings as equivalent to loss
[RFC8311], the exact response for each controller could be different. Failure
to correctly respond to information about ECN markings is therefore difficult to
detect.
({{?RFC8311}}), the exact response for each controller could be different.
Failure to correctly respond to information about ECN markings is therefore
difficult to detect.


# IANA Considerations
@@ -1097,7 +1096,7 @@ min_rtt:

max_ack_delay:
: The maximum amount of time by which the receiver intends to delay
acknowledgments for packets in the ApplicationData packet number space. The
acknowledgments for packets in the Application Data packet number space. The
actual ack_delay in a received ACK frame may be larger due to late timers,
reordering, or lost ACK frames.

@@ -1291,10 +1290,10 @@ GetPtoTimeAndSpace():
if (no in-flight packets in space):
continue;
if (space == ApplicationData):
// Skip ApplicationData until handshake complete.
// Skip Application Data until handshake complete.
if (handshake is not complete):
return pto_timeout, pto_space
// Include max_ack_delay and backoff for ApplicationData.
// Include max_ack_delay and backoff for Application Data.
duration += max_ack_delay * (2 ^ pto_count)

t = time_of_last_ack_eliciting_packet[space] + duration
@@ -1438,12 +1437,11 @@ kMinimumWindow:

kLossReductionFactor:
: Reduction in congestion window when a new loss event is detected.
The {{congestion-control}} section recommends a value is 0.5.
{{congestion-control}} recommends a value of 0.5.

kPersistentCongestionThreshold:
: Period of time for persistent congestion to be established, specified
as a PTO multiplier. The {{persistent-congestion}} section recommends a
value of 3.
: Period of time for persistent congestion to be established, specified as a PTO
multiplier. {{persistent-congestion}} recommends a value of 3.
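
To show how these two constants are applied, here is a minimal sketch. It is not the draft's pseudocode: the function names are hypothetical, the minimum window is passed in by the caller because its value is not given in this excerpt, and flooring the reduced window at the minimum window is an assumption of the sketch.

```python
K_LOSS_REDUCTION_FACTOR = 0.5            # recommended value stated above
K_PERSISTENT_CONGESTION_THRESHOLD = 3    # recommended value stated above


def on_new_loss_event(congestion_window: int, minimum_window: int) -> int:
    """Scale the window by the loss reduction factor on a new loss event,
    never going below the caller-supplied minimum window (assumed floor)."""
    reduced = int(congestion_window * K_LOSS_REDUCTION_FACTOR)
    return max(reduced, minimum_window)


def persistent_congestion_duration(
        pto_period: float,
        threshold: int = K_PERSISTENT_CONGESTION_THRESHOLD) -> float:
    """The persistent congestion period expressed as a multiple of the PTO."""
    return pto_period * threshold
```

With a 20000-byte window and a 2400-byte minimum, a loss event halves the window to 10000 bytes; a PTO period of 165 ms gives a persistent congestion duration of roughly 495 ms.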


## Variables of interest {#vars-of-interest}
@@ -1465,7 +1463,7 @@ ecn_ce_counters\[kPacketNumberSpace]:

bytes_in_flight:
: The sum of the size in bytes of all sent packets that contain at least one
ack-eliciting or PADDING frame, and have not been acked or declared
ack-eliciting or PADDING frame, and have not been acknowledged or declared
lost. The size does not include IP or UDP overhead, but does include the QUIC
header and AEAD overhead. Packets only containing ACK frames do not count
towards bytes_in_flight to ensure congestion control does not impede