
Required state for retaining unacked RETIRE_CONNECTION_ID frames is unbound #3509

Closed
kazuho opened this issue Mar 10, 2020 · 53 comments · Fixed by #3547
Labels
-transport design An issue that affects the design of the protocol; resolution requires consensus. has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list.

Comments

@kazuho
Member

kazuho commented Mar 10, 2020

When an endpoint sends a RETIRE_CONNECTION_ID frame carrying a connection ID that has been active until that point, that endpoint is expected to open room for accepting a new connection ID.

That means that if the peer does not acknowledge the packets that contain RETIRE_CONNECTION_ID frames, but keeps on sending NEW_CONNECTION_ID frames with increasing values in the Retire Prior To field, the number of unacknowledged RETIRE_CONNECTION_ID frames (or, to be precise, the number of unacked retiring connection IDs) keeps on increasing.

@kazuho
Member Author

kazuho commented Mar 10, 2020

To address the issue, I think we need to do one of the following:

  • State that the attack exists, and that QUIC stacks should implement mitigations.
  • Introduce a MAX_CONNECTION_IDS frame. The reason we do not have a similar issue in stream concurrency control is that the credit is controlled by MAX_STREAMS frames, rather than using FIN (or reset) as an implicit signal to indicate the availability of new credit. We could do the same for connection IDs (a sketch of that idea follows below).
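
For illustration only, here is a minimal sketch (in Python) of how a hypothetical MAX_CONNECTION_IDS frame could work, mirroring the MAX_STREAMS credit model mentioned above. The frame name, class, and fields are assumptions made for this sketch, not part of the draft.

```python
# Sketch of the proposed (hypothetical) credit model: the issuer may only issue
# connection IDs whose sequence number is below the credit explicitly granted
# by the peer, just as MAX_STREAMS bounds stream creation.

class CidIssuer:
    def __init__(self, initial_credit):
        self.max_connection_ids = initial_credit  # credit granted by the peer
        self.next_seq = 0                         # next sequence number to issue

    def on_max_connection_ids(self, new_limit):
        # Credit only ever grows, like MAX_STREAMS / MAX_DATA.
        self.max_connection_ids = max(self.max_connection_ids, new_limit)

    def can_issue(self):
        return self.next_seq < self.max_connection_ids

    def issue(self):
        # Caller would place the returned sequence number in a NEW_CONNECTION_ID frame.
        assert self.can_issue()
        seq = self.next_seq
        self.next_seq += 1
        return seq
```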

@mnot mnot added this to Triage in Late Stage Processing Mar 11, 2020
@martinthomson
Member

Mathematically, this might be unbounded, but it's more likely that in practice the limit is 2*active_connection_id_limit as it would be unlikely that unacknowledged RETIRE_CONNECTION_ID frames would remain so once more NEW_CONNECTION_ID frames start arriving. More so as the issuer of those NEW_CONNECTION_ID frames has to have received the RETIRE_CONNECTION_ID frames.

Adding new frames for this seems a little unwieldy, so I'd prefer to just acknowledge that the problem exists.

Note also that Retire Prior To has the effect of making this limit squishy in other ways.

(This is a design issue.)

@marten-seemann
Contributor

I'm not sure I understand the attack here. Why would an implementation need to keep track of acknowledgements of RETIRE_CONNECTION_ID frames (other than for purposes of loss recovery, but then this applies to any retransmittable frame type)?

An endpoint sends a RETIRE_CONNECTION_ID frame when it decides to not use the respective CID any longer. At this point, it can forget about the connection ID altogether. It might make sense to hold on to the stateless reset token for a while (3 PTO?) longer, in order to be able to detect stateless resets, but this is an optimization and wouldn't lead to unbounded state anyway.

@kazuho
Member Author

kazuho commented Mar 11, 2020

@martinthomson

Mathematically, this might be unbounded, but it's more likely that in practice the limit is 2*active_connection_id_limit as it would be unlikely that unacknowledged RETIRE_CONNECTION_ID frames would remain so once more NEW_CONNECTION_ID frames start arriving. More so as the issuer of those NEW_CONNECTION_ID frames has to have received the RETIRE_CONNECTION_ID frames.

I think 2*active_connection_id_limit is fine, assuming that the endpoint would just stop retransmitting the oldest RETIRE_CONNECTION_ID frames. That limit would be too strict to be used for detecting potential attacks (and therefore for closing the connection with PROTOCOL_VIOLATION).

@kazuho
Member Author

kazuho commented Mar 11, 2020

@marten-seemann

I'm not sure I understand the attack here. Why would an implementation need to keep track of acknowledgements of RETIRE_CONNECTION_ID frames (other than for purposes of loss recovery, but then this applies to any retransmittable frame type)?

This is about keeping state for retransmission in order to recover from loss. As stated in my first comment, the issue is unique to RETIRE_CONNECTION_ID frames, because this frame type is (ab)used for providing the peer additional credit, whereas other flow-control mechanisms use dedicated frame types (e.g., MAX_*) for communicating credit.

@marten-seemann
Contributor

Thank you for the explanation, @kazuho. I think I now understand the problem.

Mathematically, this might be unbounded, but it's more likely that in practice the limit is 2*active_connection_id_limit as it would be unlikely that unacknowledged RETIRE_CONNECTION_ID frames would remain so once more NEW_CONNECTION_ID frames start arriving. More so as the issuer of those NEW_CONNECTION_ID frames has to have received the RETIRE_CONNECTION_ID frames.

This would require endpoints to make assumptions about how the peer handles the active_connection_id_limit and how it issues new CIDs. This can get complicated really quickly and leave the connection in an unrecoverable state (or at the least a state with fewer CIDs than it should have). Introducing a new frame, as @kazuho suggested, seems a lot cleaner to me.

@kazuho
Member Author

kazuho commented Mar 12, 2020

Just to be clear, while I used RPT as a way of explaining the design issue, I think it is not a requirement.

A malicious client can repeatedly migrate to a new address, initiating the use of new CID pairs, while intentionally not acknowledging packets containing RETIRE_CONNECTION_ID frames sent from the server. If the client does that, the number of CIDs that the server has to track for retirement increases over time.

@erickinnear
Contributor

Clarification question: If the person sending RCID isn't getting ACKs back, I think we still need to assume that those CIDs are being replaced? If not, then the non-RPT side of this would just mean the person sending RCID frames runs out of CIDs to use and has to stop.

So the problem statement is that this person is running over their max_active_cid limit from the perspective of the person withholding ACKs, but each side has their own view, and so the person retiring them stopped counting them as active once the frame was sent (but not acknowledged).
And that can happen via RPT as well, just because it means that they're going to be instructed to send the RCID, but the problem is the same either way?

(Just want to make sure I'm understanding the problem correctly)

@kazuho
Member Author

kazuho commented Mar 12, 2020

@erickinnear Yes, I think your problem statement is correct.

Or to be even more concise, the problem is that the issuer of CIDs is given additional credit when it receives RCID, which happens before the consumer of CIDs drops the state it needs to retain for retransmitting RCID.

For the attack to work, an attacker has to let the peer consume and retire CIDs. RPT and intentional migration are examples that make that happen.

@MikeBishop
Contributor

So the issue here is fundamentally that you can induce the peer to send retransmittable frames which aren't subject to some kind of flow control, and this is the only instance we have of that.

STOP_SENDING and failing to acknowledge the RESET_STREAM appears to have a similar profile, except that the number of RESET_STREAMs you could have outstanding is bounded by the number of streams the peer has opened, which they control.

You could execute the same attack at a larger scale by choosing particular regions of stream data to never acknowledge, forcing the peer to keep them in memory for retransmission, but the peer could stop sending until that region gets through (applying backpressure) or simply reset the stream. Since a RESET_STREAM is smaller than the data region and only one is needed per stream, this collapses into the previous case.

What's different here is that the endpoint being attacked doesn't choose whether to retire these CIDs and has no mechanism to apply backpressure.

It seems like there are a couple paths to resolving this.

  • As you suggest, we could define an explicit frame that grants the additional CID space, and implementations would not send / could withhold this frame until the RCID is ACK'd. This creates a backpressure mechanism and has the side benefit of allowing an implementation to vary the number of CIDs it wants to track over time, but the disadvantage of requiring an extra round trip before CIDs can be replaced in the normal case.
  • We could mimic the "collapsing" behavior above by defining a frame that retires all CIDs up to a given sequence number. Then the implementation under memory pressure chooses to retire a few extra CIDs and collapses the whole set waiting to be retransmitted to a single frame.

@kazuho
Member Author

kazuho commented Mar 24, 2020

@MikeBishop I think your summarization of the problem is accurate.

Regarding an attacker selectively acking packets carrying STREAM frames, yes, the attack exists, but the sender can mitigate that problem in certain ways. We already state that in Section 21.5.

This issue is the only case in which the protocol cannot be implemented properly with a bounded amount of state.

Regarding how we should fix the problem, if we are going to make a protocol change, my preference goes to applying the same design principle that we have in other places (i.e., introduce a MAX_CONNECTION_IDS frame). I do not like designing a mechanism specific to this purpose, as there is a chance that we might make mistakes.

I can also live with what @martinthomson proposed in #3509 (comment) (i.e. encourage leaky behavior).

@ianswett
Contributor

Similar to @martinthomson I'd prefer to acknowledge the problem and indicate that peers may want to limit the number of unacknowledged RETIRE_CONNECTION_ID frames to the number of connection IDs they support.

The other option I can imagine is to make the RETIRE_CONNECTION_ID frame cumulative, which provides a strong incentive to use connection IDs in order. Though we may have to make an exception for the connection ID in the received packet if we did that.

@martinthomson
Member

So the bounds on this aren't that terrible. Say you had to retire 1000 connection IDs all at once. You could track each one individually as you send RETIRE_CONNECTION_ID. That's 1000 things to track, which might be more state than you want (it's not 1000 bits because you need to maintain relationships with packet numbers).

Or you could just keep CID[smallest unacknowledged retire frame] - CID[largest acknowledged retire frame] strictly limited, no matter how many connection IDs you need to retire. Tracking that, plus what you have been requested to retire should bound the state required to implement retirement.

In other words, you do control how much state you commit to retirement, just like everything else.
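
For concreteness, here is a minimal sketch of the bounded-window idea described above: cap how many retirements can be unacknowledged at once and queue the rest. The class and the MAX_UNACKED_RETIRES constant are illustrative assumptions, not names from the draft or any implementation.

```python
# Sketch: keep the span of unacknowledged RETIRE_CONNECTION_ID frames strictly
# limited; retirements beyond the window simply wait in a pending queue.

MAX_UNACKED_RETIRES = 8  # illustrative local constant

class RetireTracker:
    def __init__(self):
        self.pending = []     # sequence numbers we still need to retire
        self.unacked = set()  # sequence numbers with RCID frames in flight

    def request_retire(self, seq):
        self.pending.append(seq)

    def frames_to_send(self):
        # Move pending retirements into flight only while the window has room;
        # return the sequence numbers to place into RETIRE_CONNECTION_ID frames.
        out = []
        while self.pending and len(self.unacked) < MAX_UNACKED_RETIRES:
            seq = self.pending.pop(0)
            self.unacked.add(seq)
            out.append(seq)
        return out

    def on_ack(self, seq):
        # An acknowledgment frees room for the next pending retirement.
        self.unacked.discard(seq)
```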

@martinduke
Contributor

@martinthomson

Or you could just keep CID[smallest unacknowledged retire frame] - CID[largest acknowledged retire frame] strictly limited, no matter how many connection IDs you need to retire. Tracking that, plus what you have been requested to retire should bound the state required to implement retirement.

That's my preferred solution, but in my reading of the spec this is not allowed. We MUST send RCID for each sequence number for which the peer has sent retire-prior-to. I think the language in Section 5.1.2 needs to change.

@martinduke
Contributor

@ianswett's idea of allowing RCID to be cumulative is also good and computationally cleaner, although we'd need a different frame type or an additional field in the frame (I don't think we can get rid of retiring non-contiguous sequence numbers).

@martinduke
Contributor

I'm happy to file a PR for the @martinthomson or @ianswett approach based on the overall sentiment.

@MikeBishop
Contributor

One purely implementation option: It's always legitimate to repeat a frame. Therefore, if you track only the smallest and largest un-acked RCID sequence numbers, you can just keep retiring the whole range every time you retransmit. ACKs will shrink the range (hopefully to zero). There's no harm beyond unnecessary bytes in retransmitting something a few extra times to avoid excessive state tracking.
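
A minimal sketch of this idea, assuming for simplicity that acknowledgments arrive roughly in order; the class name and methods are illustrative.

```python
# Sketch: track only the smallest and largest unacknowledged RCID sequence
# numbers and retire the whole range on every retransmission. Repeating a
# frame is always legitimate, so the only cost is a few redundant bytes.

class RangeRetirer:
    def __init__(self):
        self.lo = None  # smallest unacked retired sequence number
        self.hi = None  # largest unacked retired sequence number

    def retire(self, seq):
        self.lo = seq if self.lo is None else min(self.lo, seq)
        self.hi = seq if self.hi is None else max(self.hi, seq)

    def frames_to_retransmit(self):
        if self.lo is None:
            return []
        # Re-send a RETIRE_CONNECTION_ID frame for every value in the range.
        return list(range(self.lo, self.hi + 1))

    def on_ack_up_to(self, seq):
        # Simplification: assume acks shrink the range from below.
        if self.lo is not None and seq >= self.lo:
            self.lo = seq + 1
            if self.lo > self.hi:
                self.lo = self.hi = None
```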

@kazuho
Member Author

kazuho commented Mar 25, 2020

@ianswett

The other option I can imagine is to make the RETIRE_CONNECTION_ID frame cumulative, which provides a strong incentive to use connection IDs in order. Though we may have to make an exception for the connection ID in the received packet if we did that.

While I won't say that's impossible, I'd argue that the design is going to be complex.

When CIDs are used in a legitimate manner, they are not going to be retired one by one. A CID is retired when a path using that CID is abandoned.

Consider the case where an endpoint has 4 active CIDs (CID0 to CID3), and two paths. One path could be using CID0, the other path could be using CID3. When the endpoint abandons the path that is using CID3, it has to retire CID3 but cannot retire the others.

@MikeBishop The number of CIDs that can be sent on the wire at a time is restricted by the CWND; therefore there is no guarantee that you can send RCID frames for all the CIDs that you want to retire at once. In other words, I think the approach at best collapses into what @martinthomson has proposed (i.e., leaky behavior).

@ianswett
Contributor

At this point, I also think @martinthomson's solution is the best approach, since it is a very minor change, and changing RETIRE_CONNECTION_ID to be cumulative is a larger change than I want to accept at this point. @kazuho, thanks for your examples.

@ianswett
Contributor

One idea: Can we simplify everything by not requiring sending RETIRE_CONNECTION_ID for CIDs less than the Retire Prior To? RETIRE_CONNECTION_ID would then only be used when a CID was used and needed to be retired so more CIDs could be issued to the user of the CID.

@martinduke
Contributor

@ianswett see my comment in the PR. More disruptive to current implementations, and eliminates a useful signal. The current proposal will not matter for well-behaved clients but will protect servers from some attacks.

@ianswett
Contributor

We haven't implemented this yet, so it's not disruptive to our implementation, FWIW.

I wonder how useful the signal is. Retire Prior To is really saying "These are going away sometime soon, whether you like it or not" to the peer. If an implementation never sent RETIRE_CONNECTION_ID, would that create a problem of some sort? I suspect someone will do just that.

@kazuho
Member Author

kazuho commented Mar 26, 2020

@ianswett As stated in #3509 (comment), this attack can be mounted without using Retire Prior To.

@martinduke
Contributor

@kazuho in this case isn't the client limited by active_connection_id_limit? As a server, if the client has in no way indicated it has accepted my retirement, I'm not going to accept new CIDs.

The NCID/RPT case is different because the spec specifically allows the client to send NCID speculatively, assuming the server will immediately retire the CID.

@kazuho
Member Author

kazuho commented Mar 27, 2020

@martinduke

in this case isn't the client limited by active_connection_id_limit?

The attack here is that when a malicious client retires a CID to which the server has responded, that server would retire the CID that it has used on that path, and also provide a new CID to the client. But the client never ACKs the RCID frames that the server sends.

As an example, consider the case where a client uses sCID1 on a new path, the server responds on that path using cCID1, and the client then abandons that path and sends RCID(seq=sCID1). On receiving this RCID frame, the server would send RCID(seq=cCID1) and also NCID(sCID2). The client intentionally does not ack the packet carrying these frames, but uses sCID2 on a new path, which carries NCID(cCID2), repeating this procedure.

@martinduke
Contributor

Ah, I see now that my defense doesn't work, because if I reject the NCID I have to kill the connection, and it may just be because of unlucky ack losses.

This whole section is not very precisely written and suffers from two-generals problems about when a CID is actually retired.

I will revise to account for this, but I'm not sure how.

@kazuho
Member Author

kazuho commented Mar 30, 2020

@martinthomson

I know that you have characterized this as "leaky", but I'm not sure that I understand exactly what is being leaked. Nothing is lost in #3547, except that the endpoint requesting retirement has to tolerate a potentially extended period without old connection IDs being retired.

In the approach we take in #3547, there is a possibility that the issuer of the CID might not receive the sequence numbers of all the CIDs that have been retired by the peer, when the issuer is issuing new CIDs at a high rate and the receiver is also consuming them at a high rate. That is the consequence of endpoints limiting the number of in-flight RCID frames they track.

I think we can provide advice so that the "leak" would not be an issue in practice, and assuming we do that, I am happy with #3547 being the solution.

I think #3553 is cleaner, but I acknowledge that it is more disruptive than #3547. Part of my intent behind creating #3553 is to see what a clean fix (in terms of state machinery) would look like, so that it can be compared against #3547.

In case this helps, I have no big preference between #3547 and #3550. There is a certain elegance to #3550 that might tip me toward that, but it is slightly more disruptive.

As stated in #3550 (comment), #3550 by itself only partly fixes the problem. I am not against adopting #3550 along with #3547 though.

#3548 is also a partial fix; I think that it is a non-starter.

@martinthomson
Member

I think that I understand your point; thanks for your patience.

I didn't really classify it as a leak though, so the choice of words threw me a little; it's more of a lag.

Let's say that you have a pathological endpoint that sends an infinite series of NEW_CONNECTION_ID frames with Sequence Number N and Retire Prior To of (N-1). That's legal and will never violate any limits. But unless they maintain acknowledgments for RETIRE_CONNECTION_ID frames at a similar rate (which, to be fair should be easy as sizeof(NEW_CONNECTION_ID) > sizeof(RETIRE_CONNECTION_ID) > sizeof(ACK) for the same values), they could get far ahead of their peer.

I don't think that it is strictly a leak, just a lag between the connection IDs being pushed and the connection IDs being successfully retired. Without #3547 or something else this could be problematic. And I now understand why #3550 is really just an orthogonal refinement, though it might make other defenses less likely to be required by making sizeof(sum(RETIRE_CONNECTION_ID)) zero in cases where an endpoint isn't using discretionary retirement.

The defense in #3553 is to make sizeof(sum(RETIRE_CONNECTION_ID)) effectively constant, no matter how many frames need to be retired. My main objection is really that it is a maximal design where only a minimal one is warranted.

It's probably worth pointing out that there is also the simplest defense: If you can't keep up with RETIRE_CONNECTION_ID, stop talking to the peer that does this to you. This is just another example from that category of abuse we don't have specific protocol mechanisms to mitigate.

@ianswett
Contributor

I agree that I'd like to see a minimal solution, so I'm leery of heading towards #3553.

In order for a peer to need to send a lot of RETIRE_CONNECTION_ID frames without Retire Prior To, a lot of 5-tuple + CID changes need to be coming in. One way to rate limit that naturally is to stop giving out CIDs if the peer hasn't acknowledged your RETIRE_CONNECTION_ID frames.

i.e., limit the number of NEW_CONNECTION_ID frames in flight to:
max_active_connection_ids + 1 - num(unacknowledged RETIRE_CONNECTION_ID frames)

So if the peer stops acknowledging your retirements, you stop giving them new CIDs. I think this avoids the need for any other limits? I added 1 in my example so one NCID could be sent even if all of the peer's CIDs had just been retired, though that concern disappears if we decided to also do something like #3550.
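
A rough sketch of this back-pressure check, using illustrative names (the function and parameters are assumptions for this sketch, not draft terminology):

```python
# Sketch: stop issuing NEW_CONNECTION_ID frames while the peer is sitting on
# unacknowledged RETIRE_CONNECTION_ID frames, per the formula above.

def may_send_new_connection_id(max_active_connection_ids,
                               num_unacked_retire_frames,
                               num_ncid_in_flight):
    # Budget: max_active_connection_ids + 1 - num(unacked RETIRE_CONNECTION_ID).
    budget = max_active_connection_ids + 1 - num_unacked_retire_frames
    return num_ncid_in_flight < budget
```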

@martinduke
Contributor

What a mess we've made! Is there interest in doing a Zoom about this? Maybe 0900 Tokyo/1100 Sydney/1700 Seattle/2000 Boston on Tue/Wed?

The root cause is that the flow control mechanism isn't all that well designed. IMO #3553 is the actual correct fix to this problem, but as people say, it's disruptive.

#3548 uses ACKs as an implicit signal, which seems like a problem. #3553 has a different implicit signal, and I don't think it solves the case where the client is withholding acks for unsolicited RCIDs.

I really think we should just use #3547 or #3553 depending on our appetite for changing the wire image. #3547 is great for the RCID sender, and the receiver won't miss any RCIDs unless it's doing pathological stuff or the ack loss patterns are tremendously unlucky.

#3553, OTOH, is a comprehensive fix to the problem.

@ianswett
Contributor

I think #3547 with some added text about limiting outstanding NCIDs when there are lots of RCIDs would be sufficiently robust.

@kazuho
Member Author

kazuho commented Mar 31, 2020

I'm fine with trying to see if we can reach an agreement on #3547.

@marten-seemann
Contributor

I'd prefer #3553 over #3547. I can't really see why it would be more disruptive than #3547.

The problem with #3547 is that it will likely work fine in most cases, but there are some corner cases (when the ACKs for RETIRE_CONNECTION_ID frames are lost) in which things will break. Properly accounting for these corner cases requires some non-trivial tracking logic in order to avoid risking the connection being closed with a CONNECTION_ID_LIMIT_ERROR.

On the other hand, #3553 seems to be conceptually simpler. I'd rather implement a few changes to my frame parser than introduce some ACK-tracking logic.

@kazuho
Member Author

kazuho commented Mar 31, 2020

Just to give another example of how an endpoint might end up having to track the retirement of more than active_connection_id_limit CIDs, even when talking to a well-behaved peer:

  • A server tries to supply three active CIDs to the peer. At the moment, it has provided CID0, CID1, CID2 to the client.
  • The client sends RCID(CID1), RCID(CID2). The server provides NCID(CID3), NCID(CID4). The two NCID frames are delivered to the client, but the ACK for the RCID frames gets lost.
  • The client decides to probe two paths concurrently, by using CID3 and CID4, but shortly after the packets are sent on those paths, the underlying network for those two paths disappears.
  • The client now needs to make sure that RCID frames carrying 4 CIDs reach the peer.

In this example, we might argue that it would be possible for the client to figure out that CID3 and CID4 were provided as substitutions for CID1 and CID2. But do we want to implement logic that detects such a condition in our code? Note that tracking substitutions can become tricky; if RCID(CID1) and RCID(CID2) were sent in different packets, it would be impossible for the client to tell whether CID4 was provided as a substitute for CID1 or for CID2.

@nibanks
Member

nibanks commented Mar 31, 2020

I agree with @kazuho that if we go with #3547 the limit should be higher than active_connection_id_limit.

@LPardue LPardue added the design An issue that affects the design of the protocol; resolution requires consensus. label Mar 31, 2020
@project-bot project-bot bot moved this from Triage to Design Issues in Late Stage Processing Mar 31, 2020
@janaiyengar
Contributor

I finally caught up with this issue, and implementer advice is definitely my preferred way forward. We're going to need some complexity (and some magic numbers) no matter what; let's do this while limiting mechanism to a minimum. To that end, I agree with #3547 as a good resolution.

@marten-seemann
Contributor

While increasing the limit will certainly help, this introduces a really unpleasant flakiness in the protocol. QUIC, so far, has the property that no amount or pattern of packet loss will lead to a protocol violation (it might lead to a connection timeout if all packets are blackholed, but that's a different kind of error). This means that when running QUIC in production, screening for (transport-level) errors will be very useful for finding bugs in implementations.

My fear regarding #3547 is that no matter how high we choose the limit, there will always be a loss pattern that will trigger a protocol violation.

@kazuho
Member Author

kazuho commented Apr 1, 2020

@marten-seemann

My fear regarding #3547 is that no matter how high we choose the limit, there will always be a loss pattern that will trigger a protocol violation.

Thank you for raising the concern. I think the recommended behavior when exceeding that limit is to stop sending RCID frames for some sequence numbers rather than triggering a protocol violation. That's actually what I had in mind, hence calling it "leaky."

The benefit of not retiring some sequence numbers is that the worst case becomes endpoints running out of CIDs - something that is expected to happen on ordinary connections too. Note also that in HTTP/3, reestablishing a connection in 0-RTT is always a viable option.

To summarize, I think that the concern can be resolved for HTTP/3 by changing the recommended behavior from resetting the connection to not sending RCID frames for some sequence numbers.

@ianswett
Contributor

ianswett commented Apr 1, 2020

@kazuho I don't think not sending RCID frames is that easy. Tracking something that needs to be sent consumes comparable state to tracking it in flight. Or are you suggesting a different algorithm that indicates an RCID frame doesn't need to be sent?

If the peer has to wait for one Retire Prior To to take effect before it's increased again, that leaves only the peer migrating really quickly and lost ACKs as the unbounded case. As @kazuho said, I think it's more likely a well-behaved peer would run out of CIDs in this case than any other failure mode. I previously suggested stopping sending out NCID frames when the RCID frame limit had been reached to force the issue, but that just changes who has to close the connection.

FWIW, we already have sanity checks where our implementation closes the connection when a data structure becomes too large. The limits are high enough that they're almost never hit unless there is a bug, but they've been helpful in finding bugs. I would advocate everyone have these. @marten-seemann You'll never get to 0 transport errors. Some implementations will have a bug, and some peers or servers will have faulty or corrupted memory. For example, in Google QUIC we found we received the ClientHello on a stream other than 1 surprisingly often.

@martinthomson
Member

I think that the idea is to reduce the set of RCID frames needed to a range of sequence numbers. You only have to remember the few exceptions for sequence numbers that are in use, sequence numbers for which you have outstanding frames, and a range of sequence numbers that require retirement.

As long as your discretionary use is sequential, this is bounded state. As mentioned, the risk is that lack of acknowledgment leads to potentially large delays in getting new connection IDs, but if you ever run short, abandon the connection. Either you have a pathological loss pattern or your peer is messing with you. In either case, it is probably not worth continuing.
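
One way to picture the range-plus-exceptions bookkeeping described above is the following minimal sketch; the class, field names, and the in-order-ack simplification are assumptions made for illustration.

```python
# Sketch: bounded retirement state as one contiguous range plus a small set of
# exceptions (sequence numbers in the range that are still in use).

class RetireRange:
    def __init__(self):
        self.acked_below = 0   # retirements below this are acknowledged
        self.retire_below = 0  # everything below this must be retired ...
        self.in_use = set()    # ... except these, which are still active

    def retire_prior_to(self, seq):
        self.retire_below = max(self.retire_below, seq)

    def mark_in_use(self, seq):
        self.in_use.add(seq)

    def frames_to_send(self, max_frames):
        # Emit at most max_frames RETIRE_CONNECTION_ID frames, lowest first;
        # anything beyond that simply lags until earlier retirements are acked.
        return [s for s in range(self.acked_below, self.retire_below)
                if s not in self.in_use][:max_frames]

    def on_ack(self, seq):
        # Simplification: assume retirements are acknowledged roughly in order.
        if seq == self.acked_below:
            self.acked_below += 1
```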

@kazuho
Member Author

kazuho commented Apr 1, 2020

@martinthomson

I think that the idea is to reduce the set of RCID frames needed to a range of sequence numbers. You only have to remember the few exceptions for sequence numbers that are in use, sequence numbers for which you have outstanding frames, and a range of sequence numbers that require retirement.

As long as your discretionary use is sequential, this is bounded state.

Yes, this is one method of limiting the state that is being used for retaining sequence numbers that need to be sent. And it works even when discretionary use is not sequential, as the number of "exceptions" (i.e., gaps) is bounded by max_connection_id_limit. The other assumption of this method is that you would somehow bound the state used to track RCID frames in flight. There are various ways of accomplishing that, e.g., having a limit on the number of RCID frames sent at once, or tracking a bunch of in-flight RCID frames as one meta-frame.

And it is correct that the outcome would be a "lag" in this method.

The downside of this method is that it might be a disruptive change to existing implementations. It is essentially equivalent to what #3553 proposes, with the exception being that tracking RCID frames is difficult (as you may have many to track for sending).

For existing implementations, it would be easier to adopt a "leaky" method, i.e.:

  • let endpoints have a counter of the number of sequence numbers that are retired but not yet acknowledged
  • when a CID is being retired, if the value of the counter is below a certain threshold (i.e., 2 * max_connection_id_limit), the counter is incremented, and an RCID frame carrying the sequence number of that CID is added to the outstanding queue. If the counter was not below the threshold, no RCID frame will be sent (leak!)
  • when an RCID frame is deemed lost, it will be resubmitted into the outstanding queue
  • when an RCID frame is acknowledged before being deemed lost, the counter is decremented
  • when an RCID frame is late-acked, we do nothing

Assuming that the solution we prefer is the least-disruptive approach, I think we need to recommend the latter. If we have a distaste for the latter and prefer the former, I think we'd better switch to #3553, as it more cleanly reflects the design of the former.

@ianswett I think the latter is not that big a change to existing implementations.
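
A minimal sketch of this leaky counter, following the bullets above; the class name and the way the threshold is derived are assumptions for illustration, and the caller is assumed to invoke on_ack only for frames acknowledged before being declared lost.

```python
# Sketch of the "leaky" approach: cap the number of retired-but-unacknowledged
# sequence numbers, and silently drop ("leak") retirements beyond the cap.

class LeakyRetirer:
    def __init__(self, max_connection_id_limit):
        self.threshold = 2 * max_connection_id_limit
        self.unacked = 0       # retired sequence numbers not yet acknowledged
        self.outstanding = []  # RCID frames queued for (re)transmission

    def retire(self, seq):
        if self.unacked < self.threshold:
            self.unacked += 1
            self.outstanding.append(seq)
        # else: leak -- no RCID frame is ever sent for this sequence number

    def on_lost(self, seq):
        # A lost RCID frame is resubmitted into the outstanding queue.
        self.outstanding.append(seq)

    def on_ack(self, seq):
        # Only for frames acknowledged before being deemed lost; a late ack
        # after loss-and-resubmission does nothing (handled by the caller).
        self.unacked -= 1
```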

@martinthomson
Member

Yep, I think that is a good approach. Obviously, the range encoding is superior in terms of how much state it can hold, but a simple list is fine. And the effects of lag will be limited by the tolerance for retaining retired connection IDs, which seems proportional to me.

@martinduke
Contributor

@kazuho I doubt that's less impactful on existing implementations, certainly from an interoperability standpoint. Today endpoints should hold on to incoming CIDs until they are formally retired. Making them leaky can lead to trapped state.

Sending them all, but with a delay, adds no state but solves this problem.

@martinthomson
Member

After a lot of back-and-forth, I think that we've all generally converged on #3547.

@martinthomson martinthomson added the proposal-ready An issue which has a proposal that is believed to be ready for a consensus call. label Apr 8, 2020
@martinthomson martinthomson moved this from Design Issues to Consensus Emerging in Late Stage Processing Apr 14, 2020
@LPardue LPardue added call-issued An issue that the Chairs have issued a Consensus call for. and removed proposal-ready An issue which has a proposal that is believed to be ready for a consensus call. labels Apr 29, 2020
@project-bot project-bot bot moved this from Consensus Emerging to Consensus Call issued in Late Stage Processing Apr 29, 2020
@LPardue LPardue added has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list. and removed call-issued An issue that the Chairs have issued a Consensus call for. labels May 13, 2020
@project-bot project-bot bot moved this from Consensus Call issued to Consensus Declared in Late Stage Processing May 13, 2020
Late Stage Processing automation moved this from Consensus Declared to Issue Handled May 17, 2020