Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gorry's ECN rewrite #4059

Merged
merged 16 commits into from
Sep 15, 2020
20 changes: 10 additions & 10 deletions draft-ietf-quic-transport.md
Original file line number Diff line number Diff line change
Expand Up @@ -3889,13 +3889,13 @@ see {{ecn-validation}}.

Use of ECN requires the receiving endpoint to read the ECN field from an IP
packet, which is not possible on all platforms. If an endpoint does not
implement ECN support or does not have access to received ECN field, it
implement ECN support or does not have access to received ECN fields, it
does not report ECN counts for packets it receives.

Even if an endpoint does not set an ECT field on packets it sends, the endpoint
MUST provide feedback about ECN markings it receives, if these are accessible.
Failing to report the ECN counts will cause the sender to disable use of ECN
for packets to this receiver.
for this connection.

On receiving an IP packet with an ECT(0), ECT(1) or CE codepoint, an
ECN-enabled endpoint accesses the ECN field and increases the corresponding
Expand All @@ -3904,11 +3904,11 @@ frames; see {{generating-acks}} and {{frame-ack}}.

Each packet number space maintains separate acknowledgement state and separate
ECN counts. Coalesced QUIC packets (see {{packet-coalesce}}) share the same IP
header so the ECN counts are incremented once for each QUIC packet.
header so the ECN counts are incremented once for each coalesced QUIC packet.

For example, if one each of an Initial, Handshake, and 1-RTT QUIC packet are
coalesced into a single UDP datagram, the ECN counts for all three packet
number spaces will be incremented by one each.
number spaces will be incremented by one each, based on the ECN field of the single IP header.

ECN counts are only incremented when QUIC packets from the received IP
packet are processed. As such, duplicate QUIC packets are not processed and
Expand All @@ -3933,7 +3933,7 @@ To perform ECN validation for a new path:
that ECN validation has failed.

If an endpoint has cause to expect that IP packets with an ECT codepoint might
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting ECN on 10 packets if you think they're going to be lost seems odd and I can't imagine anyone would do that. This text is such a departure from the existing text, I'm unsure how to fix it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a departure from existing text -- see lines 3953-3956 in the old text.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I’m going to chime-in here. Indeed it was in the “original” text, which is why I included this. However, we now have several people (me included) who doubt this is yet correct, so here are my thoughts:

(1) I think it is important to be robust and describe how the protocol falls back in the corner case where the path drops IP packets with a non-zero ECN field - and we should very clearly say this.
(2) I think this can actually happen, but we should not make it sound like this often happens in general. I’m confident this is not “normal”.
(3) If we need to measure the path characteristic, using 10 packets seems a good test to me. However, as a fall back detection mechanism in a transport protocol this seems odd to stop using ECN after 10 packets when we expect that ECT is forwarded. Note: The text says {{ecn-ack}} says 10, the current text does not.

So.... This is a proposed change: We seem to be converging on clear text, would it be possible to remove the "10 packets" and revise this mechanism to say that the ECN counts need to increment, and pathological loss with no incrementing of counts is an indication that the sender should become “failed” for this connection? I think the text already virtually says this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Pathological loss" isn't going to be good enough as a mechanism, and there's a rathole waiting there to define precisely what "pathological" means. We chose 10 packets based on the IW, and also because we wanted to start somewhere and then change the mechanism as we learned more through experience.

I suggest that we don't muddle this particular (large) PR with an additional, potentially-contentious discussion about a mechanism change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like Jana's proposal to not muddle this particular (large) PR with this "issue". Would others agree to finalise the PR and then discuss this as a separate topic - afterwards?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed this all ready once. I'm fine with 10 packet as it's the initial window (but maybe that's what we should say), however, given the result is the same I would rather state it as enable ECT marking and if all packet of your first flight are lost validation fails and you disable. Describing this specifically as probably makes it just sounds more complicated that it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The github diff didn't line up well, hence my confusion. The way this is described still seems a bit unrealistic to me, but I can see this PR doesn't substantively change it, so I'm fine with that.

be dropped by faulty network element, the endpoint could set an ECT codepoint
be dropped by a faulty network element, the endpoint could set an ECT codepoint
for only the first ten outgoing packets on a path, or for a period of three
RTTs. If all packets marked with non-zero ECN codepoints are subsequently lost,
it can disable marking on the assumption that the marking causes in loss.
Expand All @@ -3950,19 +3950,19 @@ perform ECN validation using the reported ECT(1) counts.

#### Receiving ACK Frames with ECN Counts {#ecn-ack}

Erroneous application of ECN-CE markings by the network can result in degraded
Erroneous application of CE markings by the network can result in degraded
connection performance. An endpoint that receives an ACK frame with ECN counts
therefore validates the counts before using them. It performs this validation by
comparing newly received counts against those from the last successfully
processed ACK frame. Any increase in the ECN counts is validated based on the
ECN markings that were applied to packets that are newly acknowledged in the ACK
frame.

If an ACK frame newly acknowledges a packet that the endpoint sent setting
either the ECT(0) or ECT(1) codepoint, ECN validation fails if the
If an ACK frame newly acknowledges a packet that the endpoint sent with
either the ECT(0) or ECT(1) codepoint set, ECN validation fails if the
corresponding ECN counts are not present in the ACK frame. This check detects a
network element that zeroes the ECN field or a peer that is unable to access
ECN the markings.
network element that zeroes the ECN field or a peer that does not report
ECN markings.

ECN validation also fails if the sum of the increase in ECT(0) and ECN-CE counts
is less than the number of newly acknowledged packets that were originally sent
Expand Down