Split Connection Closure discussion into two parts #298

Merged: 18 commits, Apr 14, 2021

Conversation

Member

@LPardue LPardue commented Mar 24, 2021

Fixes #297 and #175.

This ended up much larger than I anticipated, but I found the old Connection Closure text quite hard to modify in a way that let me frame the graceful close additions.

I think this is a net positive structural move, even if we might need to bash the section ordering and contained text.

Member

@martinthomson martinthomson left a comment

This looked good up until the point where it started to talk about stream limit management.

QUIC defines an error code space that is used for error handling. QUIC
encourages endpoints to use the most-specific code, although any applicable code
is permitted including generic ones. Applications using QUIC can define an error
Member

Suggested change
is permitted including generic ones. Applications using QUIC can define an error
is permitted including generic ones.
Applications using QUIC can define an error

Comment on lines 617 to 624
{{error-handling}}). Immediate close causes all streams to become immediately
closed. QUIC endpoints can manage the cumulative maximum number of streams they
would allow to be opened using the MAX_STREAMS frames but there is no mechanism
to reduce the value. An application that uses QUIC might commit to a number of
openable streams but require the connection to be closed (for example, a
scheduled maintenance period). Depending on how an application uses QUIC streams
(see {{use-of-streams}}), abrupt closure of actively-used streams may be
undesirable or detrimental. In contrast, waiting for an endpoint to exhaust the
Member

This bit about streams completely surprised me. It's now into flow control and maximum stream limits, which really don't belong here. If you like the text, I would suggest finding it another home.

Contributor

Yes, section 4 is on use of streams. Maybe we can just add another short subsection there to talk about the max stream limit.

Member Author

Will see what I can do. This text was really the main thing I was looking for in my issue #175, because shutdowns affect streams but QUIC doesn't talk about what termination of streams means for an app. I realise now that a subsection for stream limits and termination is probably useful, and then this part can cross-reference that as a "global reset of all streams" type thing.

Comment on lines 627 to 631
enacting an immediate close. Alternatively, a graceful close mechanism can be
used to communicate the intention to explicitly close the connection at some
future point. QUIC does not provide any mechanism for graceful connection
termination, applications using QUIC can define their own graceful termination
process (see, for example, {{Section 5.2 of QUIC-HTTP}}).
Member

The text on graceful close should be its own paragraph.

Comment on lines 633 to 635
A stateless reset is an option of last resort for an endpoint that does not have
access to connection state. It is not expected that application using QUIC need
information or knowledge that a stateless reset was triggered.
Member

This is a little odd. I would phrase the last sentence here more along the lines of "receiving a stateless reset is an indication of an unrecoverable error distinct from connection errors in that there is no application-layer information provided"

Contributor

@MikeBishop MikeBishop left a comment

I like the general outline, but lots of editorial nits here. Also, you're inconsistent on whether the "l" gets doubled in "signaling/signalling" and "signaled/signalled."

Co-authored-by: Mike Bishop <mbishop@evequefou.be>
Co-authored-by: Martin Thomson <mt@lowentropy.net>
Member Author

LPardue commented Apr 12, 2021

Many thanks for the editorial feedback. I believe this now addresses it all. I'll defer to @mirjak to merge this one; I strongly suggest the "squash and merge" button :)

messages at the transport layer to avoid unnecessary load, as specified in
{{Section 10.1.2 of QUIC}}. Alternatively, applications using QUIC could define
their own mechanism, such as an application-layer ping, that achieves a similar
result. See {{resumption-v-keepalive}} for further guidance on keep-alives.
Contributor

I'd say you should "only" use an application-layer ping if you also need that information for the application-layer logic itself. If you do application-layer pings, they just look like payload traffic to the transport, which means that even if QUIC is configured to send keep-alives it will not send any because it never becomes idle. So I'm not sure we actually need to say anything about application-layer pings here at all...?

Member Author

That's already what this paragraph says IMO: if there is no useful application data to send, you can use QUIC keepalives or design an application mechanism. The important thing is that applications might have their own idle timeouts above QUIC (for example, time between HTTP requests, time between DATA frames) and it's totally fine for applications to have those and design mechanisms that avoid them being triggered.

Contributor

Yes, the tiny difference I would be looking for is to say it more like: if there is no useful application data to send, you should use QUIC keepalives, and only design an application mechanism if that is needed on the app layer anyway for something else. Or to put it differently: please don't design your own app-layer ping just because you can.

Member Author

I disagree. An application protocol designer probably wants wide interoperability, and they could be sorely disappointed to find that QUIC implementations don't all expose a keep-alive mechanism. The best way to mitigate that risk is to pretend transport keepalives don't exist and to always have an application one. Doing it that way means that all implementations of an application protocol can expect the same behaviour.

The text here attempts to tread a fine line by stating what is possible but without recommending any particular course of action.

Contributor

You are right on the point that this discussion actually doesn't belong in this document, as that question is not QUIC-specific. Therefore I would still advocate for removing the second-to-last sentence here about app-layer pings, because of course applications can always do that, but it is also not specific to applications that use QUIC.

Member Author

Hmm. So given that this doc is focussed on pointing out caveats for applications, how about we swizzle the text a bit to say:

"Application data exchanged on streams or in datagrams defers the QUIC idle timeout. Applications that provide their own keep-alive mechanisms will therefore keep a QUIC connection alive. Applications that don't provide their own keep-alive might be able to use a transport-layer mechanism (see {{Section 10.1.2 of QUIC}}, and {{resumption-v-keepalive}}). However, QUIC implementation interfaces for controlling such transport behaviour can vary, affecting the robustness of such designs."
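
For illustration only, here is a minimal sketch of the point that application traffic defers the idle timeout. It is a toy model, not a real QUIC API: `IdleTimer`, `stays_alive`, and the 30-second timeout are all assumptions made up for this example, and it treats any exchange as restarting the idle period, which simplifies the actual transport rules.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class IdleTimer:
    """Toy idle timer; not a real QUIC API, all names are hypothetical."""

    timeout: float              # negotiated idle timeout, in seconds
    last_activity: float = 0.0  # time of the most recent exchange

    def expired(self, now: float) -> bool:
        return now - self.last_activity >= self.timeout


def stays_alive(timer: IdleTimer, activity: Iterable[float], until: float) -> bool:
    """Return True if no gap between exchanges reaches the idle timeout."""
    for t in sorted(activity):
        if t > until:
            break
        if timer.expired(t):
            return False         # the connection idled out before this exchange
        timer.last_activity = t  # any exchange restarts the idle period
    return not timer.expired(until)


# Application-level pings every 20 s against an assumed 30 s idle timeout.
with_pings = stays_alive(IdleTimer(timeout=30.0), [20, 40, 60, 80], until=100)
# No application traffic and no transport keep-alive.
silent = stays_alive(IdleTimer(timeout=30.0), [], until=100)
```

In this model the application pings alone keep the connection open, so a transport keep-alive would never need to fire; without either, the connection idles out.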

Contributor

WFM. Thanks!

scheduled maintenance period). Depending on how an application uses QUIC
streams, abrupt closure of actively used streams may be undesirable or
detrimental. In contrast, waiting for an endpoint to exhaust the advertised
limit may not suit application or operational needs. Applications using QUIC can
Contributor

@mirjak mirjak Apr 12, 2021

I think I'm missing some connection here: why should an endpoint ever wait until all streams are exhausted...?

Member Author

So imagine an HTTP/3 server that advertises that up to 10 bidirectional client streams can be created. The client has created 2 streams for requests, and the responses are large payloads.

The server gets scheduled for an upgrade and the operator wants to drain the connection gracefully. It is happy to let the large downloads conclude but it wants new requests to get routed to a new machine. Any form of connection closure at this stage risks affecting the in-flight responses. Connection closure when all streams are exhausted is a much safer option.

The client believes it can create 8 more streams, and it is suboptimal to create a new connection while the current one appears to be able to handle what is needed. It could also be bad to open concurrent connections opportunistically in anticipation of a server having to gracefully close. The server could just wait for the client to create those 8 streams and then instantly reject them with STOP_SENDING, RESET_STREAM, or an HTTP response. That's a very chatty process, and might delay the client from being able to fulfill its duties in a timely manner.

HTTP/3 GOAWAY is a graceful close signal that permits a server to tell the client that it won't accept new requests, but it doesn't interrupt active streams. A client that receives GOAWAY could decide to open a new connection in parallel to the active transfers completing.
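
As a rough sketch of the GOAWAY behaviour described above, assuming the signal carries a stream identifier and that client-initiated bidirectional streams are numbered 0, 4, 8, and so on: requests below the identifier run to completion, and anything at or above it is retried on another connection. The helper name `apply_goaway` is hypothetical and does not come from any HTTP/3 library.

```python
from typing import Iterable, List, Tuple


def apply_goaway(active: Iterable[int], goaway_id: int) -> Tuple[List[int], List[int]]:
    """Split request streams into (allowed to finish, to retry elsewhere)."""
    ids = sorted(active)
    finish = [s for s in ids if s < goaway_id]   # not interrupted by the signal
    retry = [s for s in ids if s >= goaway_id]   # should move to a new connection
    return finish, retry


# Two large downloads are in flight on streams 0 and 4; a third request on
# stream 8 raced with the signal.
finish, retry = apply_goaway([0, 4, 8], goaway_id=8)
```

The point of the sketch is that the in-flight downloads are untouched while the client learns immediately that new work belongs elsewhere, instead of discovering it one rejected stream at a time.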

Contributor

I understand why you need/want GOAWAY at the application. I just think that waiting until streams are exhausted is never really an option, because as soon as one stream is closed the client can immediately open another one. I think the two options you have are: you either wait until the connection is done entirely, or you have an application-layer mechanism that indicates that no new streams should be created.

Member Author

The client can only open as many streams as MAX_STREAMS allows. So if a server wants to drain a connection without a GOAWAY, it can stop granting stream credits (this is what I mean by exhaustion) and eventually the client will run out of things to do. From the other guidance in the spec, a client with nothing to do should eventually give up either via idle timeout or explicit close.
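
To make the stream credit point concrete, here is a small sketch of a cumulative stream limit that can only grow. The class `StreamCredit` and its methods are hypothetical names for this example rather than any implementation's interface; the numbers follow the 10-stream scenario earlier in this thread.

```python
class StreamCredit:
    """Toy model of a cumulative stream limit (all names are hypothetical)."""

    def __init__(self, initial_limit: int) -> None:
        self.limit = initial_limit  # cumulative count the peer may ever open
        self.opened = 0

    def raise_limit(self, new_limit: int) -> None:
        # The limit can only grow; a value that does not increase it is ignored.
        self.limit = max(self.limit, new_limit)

    def try_open(self) -> bool:
        if self.opened >= self.limit:
            return False            # peer is blocked until the limit is raised
        self.opened += 1
        return True


credit = StreamCredit(initial_limit=10)
in_flight = [credit.try_open() for _ in range(2)]        # two requests in flight
credit.raise_limit(5)                                    # attempted reduction: no effect
remaining = sum(credit.try_open() for _ in range(20))    # client keeps trying
```

Because the server simply stops raising the limit, the client gets its remaining 8 streams and is then blocked; closing a stream does not hand back credit in this model.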

Contributor

Okay, you are right. I was misremembering how MAX_STREAMS works (it would maybe make sense to add a sentence). So basically what you are saying is that if the server wants to close the connection but doesn't want to interrupt ongoing activities, the only thing it can do is to not issue any new stream credits and wait until the client is done, because it either has nothing to send anymore or reaches the stream limit (if you don't have a graceful close at the application level). I think this can be worded more simply than talking about "exhausting streams" :-)

Member Author

I updated the text to address your comments, PTAL

detrimental. In contrast, waiting for an endpoint to exhaust the advertised
limit may not suit application or operational needs. Applications using QUIC can
use conservative stream limits and run to completion before enacting an
immediate close. Alternatively, a graceful close mechanism can be used to
Contributor

Do you mean graceful close at the application layer?

Member Author

ah good spot. I lost the context when moving the text to this section, so let me make it clear again.

Contributor

@mirjak mirjak left a comment

proposed edits...

(see {{connection-termination}}) causes abrupt closure of actively used streams.
Depending on how an application uses QUIC streams, this could be undesirable
or detrimental to behaviour or performance. An alternative to immediate close
is to wait for a peer to consume all of the advertised stream limit. However, the
Contributor

Suggested change
is to wait for a peer to consume all of the advertised stream limit. However, the
is to not issue any new stream credits and wait for a peer to close the connection,
either because there is no additional data to send or it is forced to close when it
consumed all of the advertised stream limit. However, the

Getting a bit lengthy; it's just a proposal...

Member Author

how about we put your suggested text about how endpoints might close when the limit is reached, then the sentence here can say

"An alternative to immediate close is to stop increasing the stream limit and wait for the peer to consume the remaining streams."

?

Contributor

Sounds good!

LPardue and others added 2 commits April 12, 2021 21:51
Co-authored-by: mirjak <mirja.kuehlewind@ericsson.com>
Co-authored-by: mirjak <mirja.kuehlewind@ericsson.com>
Co-authored-by: mirjak <mirja.kuehlewind@ericsson.com>
@britram britram merged commit 21483d9 into quicwg:master Apr 14, 2021
Successfully merging this pull request may close these issues.

Better discuss the nuance between transport error codes and application error codes
5 participants