From a0a991a616dc190728e1695af790c2b1b9475cd8 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Wed, 3 Oct 2018 15:51:24 -0700 Subject: [PATCH 01/57] Editorial commitment --- draft-ietf-quic-http.md | 4 ++++ draft-ietf-quic-qpack.md | 4 ++++ draft-ietf-quic-transport.md | 4 ++++ 3 files changed, 12 insertions(+) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 465fd41d27..b7f890153b 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -1809,6 +1809,10 @@ Sender: > **RFC Editor's Note:** Please remove this section prior to publication of a > final version of this document. +## Since draft-ietf-quic-http-15 + +Substantial editorial reorganization; no technical changes. + ## Since draft-ietf-quic-http-14 - Recommend sensible values for QUIC transport parameters (#1720,#1806) diff --git a/draft-ietf-quic-qpack.md b/draft-ietf-quic-qpack.md index b71620fc32..1fa2a57b87 100644 --- a/draft-ietf-quic-qpack.md +++ b/draft-ietf-quic-qpack.md @@ -1229,6 +1229,10 @@ Code" registry established in {{QUIC-HTTP}}. > **RFC Editor's Note:** Please remove this section prior to publication of a > final version of this document. +## Since draft-ietf-quic-qpack-03 + +Substantial editorial reorganization; no technical changes. + ## Since draft-ietf-quic-qpack-02 - Largest Reference encoded modulo MaxEntries (#1763) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 0bee3cba61..968fd471d7 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -5274,6 +5274,10 @@ Issue and pull request numbers are listed with a leading octothorp. ## Since draft-ietf-quic-transport-14 +Substantial editorial reorganization; no technical changes. 
+ +## Since draft-ietf-quic-transport-14 + - Merge ACK and ACK_ECN (#1778, #1801) - Explicitly communicate max_ack_delay (#981, #1781) - Validate original connection ID after Retry packets (#1710, #1486, #1793) From f444a7ac2bb27181ea7c42b3a819ec4b4053a791 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 5 Oct 2018 10:31:11 -0700 Subject: [PATCH 02/57] Move content into a Request Lifecycle section --- draft-ietf-quic-http.md | 569 ++++++++++++++++++++-------------------- 1 file changed, 288 insertions(+), 281 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index b7f890153b..7d8105efb4 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -256,7 +256,11 @@ frames, exposing the data contained within as a reliable byte stream to the application. When HTTP headers and data are sent over QUIC, the QUIC layer handles most of -the stream management. +the stream management. HTTP does not need to do any separate multiplexing when +using QUIC - data sent over a QUIC stream always maps to a particular HTTP +transaction or connection context. + +## Bidirectional Streams All client-initiated bidirectional streams are used for HTTP requests and responses. A bidirectional stream ensures that the response can be readily @@ -269,252 +273,13 @@ HTTP/QUIC server SHOULD send non-zero values for the QUIC transport parameters recommended that `initial_max_bidi_streams` be no smaller than 100, so as to not unnecessarily limit parallelism. -These streams carry frames related to the request/response (see {{frames}}). -When a stream terminates cleanly, if the last frame on the stream was truncated, -this MUST be treated as a connection error (see HTTP_MALFORMED_FRAME in -{{http-error-codes}}). Streams which terminate abruptly may be reset at any -point in the frame. - -HTTP/QUIC does not use server-initiated bidirectional streams. The use of -unidirectional streams is discussed in {{unidirectional-streams}}. 
Both clients -and servers SHOULD send a value of three or greater for the QUIC transport -parameter `initial_max_uni_streams`. - -HTTP does not need to do any separate multiplexing when using QUIC - data sent -over a QUIC stream always maps to a particular HTTP transaction. Requests and -responses are considered complete when the corresponding QUIC stream is closed -in the appropriate direction. - -## HTTP Message Exchanges {#request-response} - -A client sends an HTTP request on a client-initiated bidirectional QUIC -stream. A server sends an HTTP response on the same stream as the request. - -An HTTP message (request or response) consists of: - -1. one header block (see {{frame-headers}}) containing the message header (see - {{!RFC7230}}, Section 3.2), - -2. the payload body (see {{!RFC7230}}, Section 3.3), sent as a series of DATA - frames (see {{frame-data}}), - -3. optionally, one header block containing the trailer-part, if present (see - {{!RFC7230}}, Section 4.1.2). - -In addition, prior to sending the message header block indicated above, a -response may contain zero or more header blocks containing the message headers -of informational (1xx) HTTP responses (see {{!RFC7230}}, Section 3.2 and -{{!RFC7231}}, Section 6.2). - -A server MAY interleave one or more PUSH_PROMISE frames (see -{{frame-push-promise}}) with the frames of a response message. These -PUSH_PROMISE frames are not part of the response; see {{server-push}} for more -details. - -The "chunked" transfer encoding defined in Section 4.1 of {{!RFC7230}} MUST NOT -be used. - -Trailing header fields are carried in an additional header block following the -body. Senders MUST send only one header block in the trailers section; -receivers MUST discard any subsequent header blocks. +These streams carry frames related to the request/response (see +{{request-response}}). 
When a stream terminates cleanly, if the last frame on +the stream was truncated, this MUST be treated as a connection error (see +HTTP_MALFORMED_FRAME in {{http-error-codes}}). Streams which terminate abruptly +may be reset at any point in the frame. -An HTTP request/response exchange fully consumes a bidirectional QUIC stream. -After sending a request, a client closes the stream for sending; after sending a -response, the server closes the stream for sending and the QUIC stream is fully -closed. - -A server can send a complete response prior to the client sending an entire -request if the response does not depend on any portion of the request that has -not been sent and received. When this is true, a server MAY request that the -client abort transmission of a request without error by triggering a QUIC -STOP_SENDING with error code HTTP_EARLY_RESPONSE, sending a complete response, -and cleanly closing its streams. Clients MUST NOT discard complete responses as -a result of having their request terminated abruptly, though clients can always -discard responses at their discretion for other reasons. - -Changes to the state of a request stream, including receiving a RST_STREAM with -any error code, do not affect the state of the server's response. Servers do not -abort a response in progress solely due to a state change on the request stream. -However, if the request stream terminates without containing a usable HTTP -request, the server SHOULD abort its response with the error code -HTTP_INCOMPLETE_REQUEST. - -### Header Formatting and Compression - -HTTP header fields carry information as a series of key-value pairs. For a -listing of registered HTTP header fields, see the "Message Header Field" -registry maintained at . - -Just as in previous versions of HTTP, header field names are strings of ASCII -characters that are compared in a case-insensitive fashion. 
Properties of HTTP -header field names and values are discussed in more detail in Section 3.2 of -{{!RFC7230}}, though the wire rendering in HTTP/QUIC differs. As in HTTP/2, -header field names MUST be converted to lowercase prior to their encoding. A -request or response containing uppercase header field names MUST be treated as -malformed. - -As in HTTP/2, HTTP/QUIC uses special pseudo-header fields beginning with ':' -character (ASCII 0x3a) to convey the target URI, the method of the request, and -the status code for the response. These pseudo-header fields are defined in -Section 8.1.2.3 and 8.1.2.4 of {{!RFC7540}}. Pseudo-header fields are not HTTP -header fields. Endpoints MUST NOT generate pseudo-header fields other than -those defined in {{!RFC7540}}. The restrictions on the use of pseudo-header -fields in Section 8.1.2.1 of {{!RFC7540}} also apply to HTTP/QUIC. - -HTTP/QUIC uses QPACK header compression as described in [QPACK], a variation of -HPACK which allows the flexibility to avoid header-compression-induced -head-of-line blocking. See that document for additional details. - -### The CONNECT Method - -The pseudo-method CONNECT ({{!RFC7231}}, Section 4.3.6) is primarily used with -HTTP proxies to establish a TLS session with an origin server for the purposes -of interacting with "https" resources. In HTTP/1.x, CONNECT is used to convert -an entire HTTP connection into a tunnel to a remote host. In HTTP/2, the CONNECT -method is used to establish a tunnel over a single HTTP/2 stream to a remote -host for similar purposes. - -A CONNECT request in HTTP/QUIC functions in the same manner as in HTTP/2. The -request MUST be formatted as described in {{!RFC7540}}, Section 8.3. A CONNECT -request that does not conform to these restrictions is malformed. The request -stream MUST NOT be half-closed at the end of the request. 
- -A proxy that supports CONNECT establishes a TCP connection ({{!RFC0793}}) to the -server identified in the ":authority" pseudo-header field. Once this connection -is successfully established, the proxy sends a HEADERS frame containing a 2xx -series status code to the client, as defined in {{!RFC7231}}, Section 4.3.6. - -All DATA frames on the request stream correspond to data sent on the TCP -connection. Any DATA frame sent by the client is transmitted by the proxy to the -TCP server; data received from the TCP server is packaged into DATA frames by -the proxy. Note that the size and number of TCP segments is not guaranteed to -map predictably to the size and number of HTTP DATA or QUIC STREAM frames. - -The TCP connection can be closed by either peer. When the client ends the -request stream (that is, the receive stream at the proxy enters the "Data Recvd" -state), the proxy will set the FIN bit on its connection to the TCP server. When -the proxy receives a packet with the FIN bit set, it will terminate the send -stream that it sends to client. TCP connections which remain half-closed in a -single direction are not invalid, but are often handled poorly by servers, so -clients SHOULD NOT close a stream for sending while they still expect to receive -data from the target of the CONNECT. - -A TCP connection error is signaled with RST_STREAM. A proxy treats any error in -the TCP connection, which includes receiving a TCP segment with the RST bit set, -as a stream error of type HTTP_CONNECT_ERROR ({{http-error-codes}}). -Correspondingly, a proxy MUST send a TCP segment with the RST bit set if it -detects an error with the stream or the QUIC connection. - -### Request Cancellation - -Either client or server can cancel requests by aborting the stream (QUIC -RST_STREAM or STOP_SENDING frames, as appropriate) with an error code of -HTTP_REQUEST_CANCELLED ({{http-error-codes}}). 
When the client cancels a -response, it indicates that this response is no longer of interest. Clients -SHOULD cancel requests by aborting both directions of a stream. - -When the server cancels its response stream using HTTP_REQUEST_CANCELLED, it -indicates that no application processing was performed. The client can treat -requests cancelled by the server as though they had never been sent at all, -thereby allowing them to be retried later on a new connection. Servers MUST NOT -use the HTTP_REQUEST_CANCELLED status for requests which were partially or fully -processed. - - Note: - : In this context, "processed" means that some data from the stream was - passed to some higher layer of software that might have taken some action as - a result. - -If a stream is cancelled after receiving a complete response, the client MAY -ignore the cancellation and use the response. However, if a stream is cancelled -after receiving a partial response, the response SHOULD NOT be used. -Automatically retrying such requests is not possible, unless this is otherwise -permitted (e.g., idempotent actions like GET, PUT, or DELETE). - - -## Request Prioritization {#priority} - -HTTP/QUIC uses a priority scheme similar to that described in {{!RFC7540}}, -Section 5.3. In this priority scheme, a given stream can be designated as -dependent upon another request, which expresses the preference that the latter -stream (the "parent" request) be allocated resources before the former stream -(the "dependent" request). Taken together, the dependencies across all requests -in a connection form a dependency tree. The structure of the dependency tree -changes as PRIORITY frames add, remove, or change the dependency links between -requests. - -The PRIORITY frame {{frame-priority}} identifies a prioritized element. 
The -elements which can be prioritized are: - -- Requests, identified by the ID of the request stream -- Pushes, identified by the Push ID of the promised resource - ({{frame-push-promise}}) -- Placeholders, identified by a Placeholder ID - -An element can depend on another element or on the root of the tree. A -reference to an element which is no longer in the tree is treated as a reference -to the root of the tree. - -Only a client can send PRIORITY frames. A server MUST NOT send a PRIORITY -frame. - -### Placeholders - -In HTTP/2, certain implementations used closed or unused streams as placeholders -in describing the relative priority of requests. However, this created -confusion as servers could not reliably identify which elements of the priority -tree could safely be discarded. Clients could potentially reference closed -streams long after the server had discarded state, leading to disparate views of -the prioritization the client had attempted to express. - -In HTTP/QUIC, a number of placeholders are explicitly permitted by the server -using the `SETTINGS_NUM_PLACEHOLDERS` setting. Because the server commits to -maintain these IDs in the tree, clients can use them with confidence that the -server will not have discarded the state. - -Placeholders are identified by an ID between zero and one less than the number -of placeholders the server has permitted. - -### Priority Tree Maintenance - -Servers can aggressively prune inactive regions from the priority tree, because -placeholders will be used to "root" any persistent structure of the tree which -the client cares about retaining. For prioritization purposes, a node in the -tree is considered "inactive" when the corresponding stream has been closed for -at least two round-trip times (using any reasonable estimate available on the -server). This delay helps mitigate race conditions where the server has pruned -a node the client believed was still active and used as a Stream Dependency. 
- -Specifically, the server MAY at any time: - -- Identify and discard branches of the tree containing only inactive nodes - (i.e. a node with only other inactive nodes as descendants, along with those - descendants) -- Identify and condense interior regions of the tree containing only inactive - nodes, allocating weight appropriately - -~~~~~~~~~~ drawing - x x x - | | | - P P P - / \ | | - I I ==> I ==> A - / \ | | - A I A A - | | - A A -~~~~~~~~~~ -{: #fig-pruning title="Example of Priority Tree Pruning"} - -In the example in {{fig-pruning}}, `P` represents a Placeholder, `A` represents -an active node, and `I` represents an inactive node. In the first step, the -server discards two inactive branches (each a single node). In the second step, -the server condenses an interior inactive node. Note that these transformations -will result in no change in the resources allocated to a particular active -stream. - -Clients SHOULD assume the server is actively performing such pruning and SHOULD -NOT declare a dependency on a stream it knows to have been closed. +HTTP/QUIC does not use server-initiated bidirectional streams. ## Unidirectional Streams @@ -537,6 +302,9 @@ defined in this document: control streams ({{control-streams}}) and push streams ({{server-push}}). Other stream types can be defined by extensions to HTTP/QUIC. +Both clients and servers SHOULD send a value of three or greater for the QUIC +transport parameter `initial_max_uni_streams`. + If the stream header indicates a stream type which is not supported by the recipient, the remainder of the stream cannot be consumed as the semantics are unknown. Recipients of unknown stream types MAY trigger a QUIC STOP_SENDING @@ -578,31 +346,13 @@ stream. This allows either peer to send data as soon they are able. Depending on whether 0-RTT is enabled on the connection, either client or server might be able to send stream data first after the cryptographic handshake completes. 
-### Server Push - -HTTP/QUIC server push is similar to what is described in HTTP/2 {{!RFC7540}}, -but uses different mechanisms. - -The PUSH_PROMISE frame ({{frame-push-promise}}) is sent on the client-initiated -bidirectional stream that carried the request that generated the push. This -allows the server push to be associated with a request. Ordering of a -PUSH_PROMISE in relation to certain parts of the response is important (see -Section 8.2.1 of {{!RFC7540}}). - -The PUSH_PROMISE frame does not reference a stream; it contains a Push ID that -uniquely identifies a server push. This allows a server to fulfill promises in -the order that best suits its needs. The same Push ID can be used in multiple -PUSH_PROMISE frames (see {{frame-push-promise}}). When a server later fulfills -a promise, the server push response is conveyed on a push stream. +### Push Streams A push stream is indicated by a stream type of `0x50` (ASCII 'P'), followed by the Push ID of the promise that it fulfills, encoded as a variable-length integer. The remaining data on this stream consists of HTTP/QUIC frames, as -defined in {{frames}}, and carries the response side of an HTTP message exchange -as described in {{request-response}}. The header of the request message is -carried by a PUSH_PROMISE frame (see {{frame-push-promise}}) on the request -stream which generated the push. Promised requests MUST conform to the -requirements in Section 8.2 of {{!RFC7540}}. +defined in {{frames}}, and fulfills a promised server push, as described in +{{server-push}}. Only servers can push; if a server receives a client-initiated push stream, this MUST be treated as a stream error of type HTTP_WRONG_STREAM_DIRECTION. @@ -616,24 +366,10 @@ this MUST be treated as a stream error of type HTTP_WRONG_STREAM_DIRECTION. ~~~~~~~~~~ {: #fig-push-stream-header title="Push Stream Header"} -Server push is only enabled on a connection when a client sends a MAX_PUSH_ID -frame (see {{frame-max-push-id}}). 
A server cannot use server push -until it receives a MAX_PUSH_ID frame. A client sends additional MAX_PUSH_ID -frames to control the number of pushes that a server can promise. A server -SHOULD use Push IDs sequentially, starting at 0. A client MUST treat receipt -of a push stream with a Push ID that is greater than the maximum Push ID as a -connection error of type HTTP_PUSH_LIMIT_EXCEEDED. - Each Push ID MUST only be used once in a push stream header. If a push stream header includes a Push ID that was used in another push stream header, the client MUST treat this as a connection error of type HTTP_DUPLICATE_PUSH. -If a promised server push is not needed by the client, the client SHOULD send a -CANCEL_PUSH frame. If the push stream is already open, a QUIC STOP_SENDING frame -with an appropriate error code can be used instead (e.g., HTTP_PUSH_REFUSED, -HTTP_PUSH_ALREADY_IN_CACHE; see {{errors}}). This asks the server not to -transfer the data and indicates that it will be discarded upon receipt. - # HTTP Framing Layer {#http-framing-layer} @@ -1085,6 +821,277 @@ A server MUST treat a MAX_PUSH_ID frame payload that does not contain a single variable-length integer as a connection error of type HTTP_MALFORMED_FRAME. +# HTTP Request Lifecycle + +## HTTP Message Exchanges {#request-response} + +A client sends an HTTP request on a client-initiated bidirectional QUIC +stream. A server sends an HTTP response on the same stream as the request. + +An HTTP message (request or response) consists of: + +1. one header block (see {{frame-headers}}) containing the message header (see + {{!RFC7230}}, Section 3.2), + +2. the payload body (see {{!RFC7230}}, Section 3.3), sent as a series of DATA + frames (see {{frame-data}}), + +3. optionally, one header block containing the trailer-part, if present (see + {{!RFC7230}}, Section 4.1.2). 
+ +In addition, prior to sending the message header block indicated above, a +response contains zero or more header blocks containing the message headers of +informational (1xx) HTTP responses (see {{!RFC7230}}, Section 3.2 and +{{!RFC7231}}, Section 6.2). + +A server MAY interleave one or more PUSH_PROMISE frames (see +{{frame-push-promise}}) with the frames of a response message. These +PUSH_PROMISE frames are not part of the response; see {{server-push}} for more +details. + +The "chunked" transfer encoding defined in Section 4.1 of {{!RFC7230}} MUST NOT +be used. + +Trailing header fields are carried in an additional header block following the +body. Senders MUST send only one header block in the trailers section; +receivers MUST discard any subsequent header blocks. + +An HTTP request/response exchange fully consumes a bidirectional QUIC stream. +After sending a request, a client closes the stream for sending; after sending a +response, the server closes the stream for sending and the QUIC stream is fully +closed. Requests and responses are considered complete when the corresponding +QUIC stream is closed in the appropriate direction. + +A server can send a complete response prior to the client sending an entire +request if the response does not depend on any portion of the request that has +not been sent and received. When this is true, a server MAY request that the +client abort transmission of a request without error by triggering a QUIC +STOP_SENDING with error code HTTP_EARLY_RESPONSE, sending a complete response, +and cleanly closing its stream. Clients MUST NOT discard complete responses as +a result of having their request terminated abruptly, though clients can always +discard responses at their discretion for other reasons. + +Changes to the state of a request stream, including receiving a RST_STREAM with +any error code, do not affect the state of the server's response. 
Servers do not +abort a response in progress solely due to a state change on the request stream. +However, if the request stream terminates without containing a usable HTTP +request, the server SHOULD abort its response with the error code +HTTP_INCOMPLETE_REQUEST. + + +### Header Formatting and Compression + +HTTP header fields carry information as a series of key-value pairs. For a +listing of registered HTTP header fields, see the "Message Header Field" +registry maintained at . + +Just as in previous versions of HTTP, header field names are strings of ASCII +characters that are compared in a case-insensitive fashion. Properties of HTTP +header field names and values are discussed in more detail in Section 3.2 of +{{!RFC7230}}, though the wire rendering in HTTP/QUIC differs. As in HTTP/2, +header field names MUST be converted to lowercase prior to their encoding. A +request or response containing uppercase header field names MUST be treated as +malformed. + +As in HTTP/2, HTTP/QUIC uses special pseudo-header fields beginning with ':' +character (ASCII 0x3a) to convey the target URI, the method of the request, and +the status code for the response. These pseudo-header fields are defined in +Section 8.1.2.3 and 8.1.2.4 of {{!RFC7540}}. Pseudo-header fields are not HTTP +header fields. Endpoints MUST NOT generate pseudo-header fields other than +those defined in {{!RFC7540}}. The restrictions on the use of pseudo-header +fields in Section 8.1.2.1 of {{!RFC7540}} also apply to HTTP/QUIC. + +HTTP/QUIC uses QPACK header compression as described in [QPACK], a variation of +HPACK which allows the flexibility to avoid header-compression-induced +head-of-line blocking. See that document for additional details. + +### Request Cancellation + +Either client or server can cancel requests by aborting the stream (QUIC +RST_STREAM or STOP_SENDING frames, as appropriate) with an error code of +HTTP_REQUEST_CANCELLED ({{http-error-codes}}). 
When the client cancels a +response, it indicates that this response is no longer of interest. Clients +SHOULD cancel requests by aborting both directions of a stream. + +When the server cancels its response stream using HTTP_REQUEST_CANCELLED, it +indicates that no application processing was performed. The client can treat +requests cancelled by the server as though they had never been sent at all, +thereby allowing them to be retried later on a new connection. Servers MUST NOT +use the HTTP_REQUEST_CANCELLED status for requests which were partially or fully +processed. + + Note: + : In this context, "processed" means that some data from the stream was + passed to some higher layer of software that might have taken some action as + a result. + +If a stream is cancelled after receiving a complete response, the client MAY +ignore the cancellation and use the response. However, if a stream is cancelled +after receiving a partial response, the response SHOULD NOT be used. +Automatically retrying such requests is not possible, unless this is otherwise +permitted (e.g., idempotent actions like GET, PUT, or DELETE). + + +## The CONNECT Method + +The pseudo-method CONNECT ({{!RFC7231}}, Section 4.3.6) is primarily used with +HTTP proxies to establish a TLS session with an origin server for the purposes +of interacting with "https" resources. In HTTP/1.x, CONNECT is used to convert +an entire HTTP connection into a tunnel to a remote host. In HTTP/2, the CONNECT +method is used to establish a tunnel over a single HTTP/2 stream to a remote +host for similar purposes. + +A CONNECT request in HTTP/QUIC functions in the same manner as in HTTP/2. The +request MUST be formatted as described in {{!RFC7540}}, Section 8.3. A CONNECT +request that does not conform to these restrictions is malformed. The request +stream MUST NOT be half-closed at the end of the request. 
+ +A proxy that supports CONNECT establishes a TCP connection ({{!RFC0793}}) to the +server identified in the ":authority" pseudo-header field. Once this connection +is successfully established, the proxy sends a HEADERS frame containing a 2xx +series status code to the client, as defined in {{!RFC7231}}, Section 4.3.6. + +All DATA frames on the request stream correspond to data sent on the TCP +connection. Any DATA frame sent by the client is transmitted by the proxy to the +TCP server; data received from the TCP server is packaged into DATA frames by +the proxy. Note that the size and number of TCP segments is not guaranteed to +map predictably to the size and number of HTTP DATA or QUIC STREAM frames. + +The TCP connection can be closed by either peer. When the client ends the +request stream (that is, the receive stream at the proxy enters the "Data Recvd" +state), the proxy will set the FIN bit on its connection to the TCP server. When +the proxy receives a packet with the FIN bit set, it will terminate the send +stream that it sends to the client. TCP connections which remain half-closed in +a single direction are not invalid, but are often handled poorly by servers, so +clients SHOULD NOT close a stream for sending while they still expect to receive +data from the target of the CONNECT. + +A TCP connection error is signaled with RST_STREAM. A proxy treats any error in +the TCP connection, which includes receiving a TCP segment with the RST bit set, +as a stream error of type HTTP_CONNECT_ERROR ({{http-error-codes}}). +Correspondingly, a proxy MUST send a TCP segment with the RST bit set if it +detects an error with the stream or the QUIC connection. + +## Request Prioritization {#priority} + +HTTP/QUIC uses a priority scheme similar to that described in {{!RFC7540}}, +Section 5.3. 
In this priority scheme, a given stream can be designated as +dependent upon another request, which expresses the preference that the latter +stream (the "parent" request) be allocated resources before the former stream +(the "dependent" request). Taken together, the dependencies across all requests +in a connection form a dependency tree. The structure of the dependency tree +changes as PRIORITY frames add, remove, or change the dependency links between +requests. + +The PRIORITY frame {{frame-priority}} identifies a prioritized element. The +elements which can be prioritized are: + +- Requests, identified by the ID of the request stream +- Pushes, identified by the Push ID of the promised resource + ({{frame-push-promise}}) +- Placeholders, identified by a Placeholder ID + +An element can depend on another element or on the root of the tree. A +reference to an element which is no longer in the tree is treated as a reference +to the root of the tree. + +Only a client can send PRIORITY frames. A server MUST NOT send a PRIORITY +frame. + +### Placeholders + +In HTTP/2, certain implementations used closed or unused streams as placeholders +in describing the relative priority of requests. However, this created +confusion as servers could not reliably identify which elements of the priority +tree could safely be discarded. Clients could potentially reference closed +streams long after the server had discarded state, leading to disparate views of +the prioritization the client had attempted to express. + +In HTTP/QUIC, a number of placeholders are explicitly permitted by the server +using the `SETTINGS_NUM_PLACEHOLDERS` setting. Because the server commits to +maintain these IDs in the tree, clients can use them with confidence that the +server will not have discarded the state. + +Placeholders are identified by an ID between zero and one less than the number +of placeholders the server has permitted. 
+ +### Priority Tree Maintenance + +Servers can aggressively prune inactive regions from the priority tree, because +placeholders will be used to "root" any persistent structure of the tree which +the client cares about retaining. For prioritization purposes, a node in the +tree is considered "inactive" when the corresponding stream has been closed for +at least two round-trip times (using any reasonable estimate available on the +server). This delay helps mitigate race conditions where the server has pruned +a node the client believed was still active and used as a Stream Dependency. + +Specifically, the server MAY at any time: + +- Identify and discard branches of the tree containing only inactive nodes + (i.e. a node with only other inactive nodes as descendants, along with those + descendants) +- Identify and condense interior regions of the tree containing only inactive + nodes, allocating weight appropriately + +~~~~~~~~~~ drawing + x x x + | | | + P P P + / \ | | + I I ==> I ==> A + / \ | | + A I A A + | | + A A +~~~~~~~~~~ +{: #fig-pruning title="Example of Priority Tree Pruning"} + +In the example in {{fig-pruning}}, `P` represents a Placeholder, `A` represents +an active node, and `I` represents an inactive node. In the first step, the +server discards two inactive branches (each a single node). In the second step, +the server condenses an interior inactive node. Note that these transformations +will result in no change in the resources allocated to a particular active +stream. + +Clients SHOULD assume the server is actively performing such pruning and SHOULD +NOT declare a dependency on a stream it knows to have been closed. + +## Server Push + +HTTP/QUIC server push is similar to what is described in HTTP/2 {{!RFC7540}}, +but uses different mechanisms. + +Each promised response is assigned a unique Push ID that uniquely identifies a +server push. This allows a server to fulfill promises in the order that best +suits its needs. 
The same Push ID can be used in multiple PUSH_PROMISE frames +(see {{frame-push-promise}}). + +Server push is only enabled on a connection when a client sends a MAX_PUSH_ID +frame (see {{frame-max-push-id}}). A server cannot use server push +until it receives a MAX_PUSH_ID frame. A client sends additional MAX_PUSH_ID +frames to control the number of pushes that a server can promise. A server +SHOULD use Push IDs sequentially, starting at 0. A client MUST treat receipt +of a push stream with a Push ID that is greater than the maximum Push ID as a +connection error of type HTTP_PUSH_LIMIT_EXCEEDED. + +The header of the request message is carried by a PUSH_PROMISE frame (see +{{frame-push-promise}}) on the request stream which generated the push. This +allows the server push to be associated with a request. Ordering of a +PUSH_PROMISE in relation to certain parts of the response is important (see +Section 8.2.1 of {{!RFC7540}}). Promised requests MUST conform to the +requirements in Section 8.2 of {{!RFC7540}}. + +When a server later fulfills a promise, the server push response is conveyed on +a push stream (see {{push-streams}}). The push stream identifies the Push ID of +the promise that it fulfills, then contains a response to the promised request +using the same format described for responses in {{request-response}}. + +If a promised server push is not needed by the client, the client SHOULD send a +CANCEL_PUSH frame. If the push stream is already open or opens after sending the +CANCEL_PUSH frame, a QUIC STOP_SENDING frame with an appropriate error code can +also be used (e.g., HTTP_PUSH_REFUSED, HTTP_PUSH_ALREADY_IN_CACHE; see +{{errors}}). This asks the server not to transfer additional data and indicates +that it will be discarded upon receipt. 
# Connection Closure From c1ab7279b1382bbeb64395067b88378ad5143e5a Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 5 Oct 2018 10:35:46 -0700 Subject: [PATCH 03/57] Move Extensions before Error Handling --- draft-ietf-quic-http.md | 76 ++++++++++++++++++++--------------------- 1 file changed, 38 insertions(+), 38 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 7d8105efb4..325f8f52a0 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -1199,6 +1199,44 @@ interrupts connectivity. If a connection terminates without a GOAWAY frame, clients MUST assume that any request which was sent, whether in whole or in part, might have been processed. +# Extensions to HTTP/QUIC + +HTTP/QUIC permits extension of the protocol. Within the limitations described +in this section, protocol extensions can be used to provide additional services +or alter any aspect of the protocol. Extensions are effective only within the +scope of a single HTTP/QUIC connection. + +This applies to the protocol elements defined in this document. This does not +affect the existing options for extending HTTP, such as defining new methods, +status codes, or header fields. + +Extensions are permitted to use new frame types ({{frames}}), new settings +({{settings-parameters}}), new error codes ({{errors}}), or new unidirectional +stream types ({{unidirectional-streams}}). Registries are established for +managing these extension points: frame types ({{iana-frames}}), settings +({{iana-settings}}), error codes ({{iana-error-codes}}), and stream types +({{iana-stream-types}}). + +Implementations MUST ignore unknown or unsupported values in all extensible +protocol elements. Implementations MUST discard frames and unidirectional +streams that have unknown or unsupported types. This means that any of these +extension points can be safely used by extensions without prior arrangement or +negotiation. 
+ +Extensions that could change the semantics of existing protocol components MUST +be negotiated before being used. For example, an extension that changes the +layout of the HEADERS frame cannot be used until the peer has given a positive +signal that this is acceptable. In this case, it could also be necessary to +coordinate when the revised layout comes into effect. + +This document doesn't mandate a specific method for negotiating the use of an +extension but notes that a setting ({{settings-parameters}}) could be used for +that purpose. If both peers set a value that indicates willingness to use the +extension, then the extension can be used. If a setting is used for extension +negotiation, the default value MUST be defined in such a fashion that the +extension is disabled if the setting is omitted. + + # Error Handling {#errors} QUIC allows the application to abruptly terminate (reset) individual streams or @@ -1290,44 +1328,6 @@ HTTP_MALFORMED_FRAME (0x01XX): be indicated with the code (0x10D). -# Extensions to HTTP/QUIC - -HTTP/QUIC permits extension of the protocol. Within the limitations described -in this section, protocol extensions can be used to provide additional services -or alter any aspect of the protocol. Extensions are effective only within the -scope of a single HTTP/QUIC connection. - -This applies to the protocol elements defined in this document. This does not -affect the existing options for extending HTTP, such as defining new methods, -status codes, or header fields. - -Extensions are permitted to use new frame types ({{frames}}), new settings -({{settings-parameters}}), new error codes ({{errors}}), or new unidirectional -stream types ({{unidirectional-streams}}). Registries are established for -managing these extension points: frame types ({{iana-frames}}), settings -({{iana-settings}}), error codes ({{iana-error-codes}}), and stream types -({{iana-stream-types}}). 
- -Implementations MUST ignore unknown or unsupported values in all extensible -protocol elements. Implementations MUST discard frames and unidirectional -streams that have unknown or unsupported types. This means that any of these -extension points can be safely used by extensions without prior arrangement or -negotiation. - -Extensions that could change the semantics of existing protocol components MUST -be negotiated before being used. For example, an extension that changes the -layout of the HEADERS frame cannot be used until the peer has given a positive -signal that this is acceptable. In this case, it could also be necessary to -coordinate when the revised layout comes into effect. - -This document doesn't mandate a specific method for negotiating the use of an -extension but notes that a setting ({{settings-parameters}}) could be used for -that purpose. If both peers set a value that indicates willingness to use the -extension, then the extension can be used. If a setting is used for extension -negotiation, the default value MUST be defined in such a fashion that the -extension is disabled if the setting is omitted. - - # Considerations for Transitioning from HTTP/2 HTTP/QUIC is strongly informed by HTTP/2, and bears many similarities. This From 9bd4e44f9e6d7b2d0b598ffa34df3735418f111e Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 5 Oct 2018 10:37:06 -0700 Subject: [PATCH 04/57] Move H2 Considerations to an appendix --- draft-ietf-quic-http.md | 460 ++++++++++++++++++++-------------------- 1 file changed, 230 insertions(+), 230 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 325f8f52a0..5ecfd300a3 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -1328,236 +1328,6 @@ HTTP_MALFORMED_FRAME (0x01XX): be indicated with the code (0x10D). -# Considerations for Transitioning from HTTP/2 - -HTTP/QUIC is strongly informed by HTTP/2, and bears many similarities. 
This -section describes the approach taken to design HTTP/QUIC, points out important -differences from HTTP/2, and describes how to map HTTP/2 extensions into -HTTP/QUIC. - -HTTP/QUIC begins from the premise that HTTP/2 code reuse is a useful feature, -but not a hard requirement. HTTP/QUIC departs from HTTP/2 primarily where -necessary to accommodate the differences in behavior between QUIC and TCP (lack -of ordering, support for streams). We intend to avoid gratuitous changes which -make it difficult or impossible to build extensions with the same semantics -applicable to both protocols at once. - -These departures are noted in this section. - -## Streams {#h2-streams} - -HTTP/QUIC permits use of a larger number of streams (2^62-1) than HTTP/2. The -considerations about exhaustion of stream identifier space apply, though the -space is significantly larger such that it is likely that other limits in QUIC -are reached first, such as the limit on the connection flow control window. - -## HTTP Frame Types {#h2-frames} - -Many framing concepts from HTTP/2 can be elided away on QUIC, because the -transport deals with them. Because frames are already on a stream, they can omit -the stream number. Because frames do not block multiplexing (QUIC's multiplexing -occurs below this layer), the support for variable-maximum-length packets can be -removed. Because stream termination is handled by QUIC, an END_STREAM flag is -not required. This permits the removal of the Flags field from the generic -frame layout. - -Frame payloads are largely drawn from {{!RFC7540}}. However, QUIC includes many -features (e.g. flow control) which are also present in HTTP/2. In these cases, -the HTTP mapping does not re-implement them. As a result, several HTTP/2 frame -types are not required in HTTP/QUIC. Where an HTTP/2-defined frame is no longer -used, the frame ID has been reserved in order to maximize portability between -HTTP/2 and HTTP/QUIC implementations. 
However, even equivalent frames between -the two mappings are not identical. - -Many of the differences arise from the fact that HTTP/2 provides an absolute -ordering between frames across all streams, while QUIC provides this guarantee -on each stream only. As a result, if a frame type makes assumptions that frames -from different streams will still be received in the order sent, HTTP/QUIC will -break them. - -For example, implicit in the HTTP/2 prioritization scheme is the notion of -in-order delivery of priority changes (i.e., dependency tree mutations): since -operations on the dependency tree such as reparenting a subtree are not -commutative, both sender and receiver must apply them in the same order to -ensure that both sides have a consistent view of the stream dependency tree. -HTTP/2 specifies priority assignments in PRIORITY frames and (optionally) in -HEADERS frames. To achieve in-order delivery of priority changes in HTTP/QUIC, -PRIORITY frames are sent on the control stream and the PRIORITY section is -removed from the HEADERS frame. - -Likewise, HPACK was designed with the assumption of in-order delivery. A -sequence of encoded header blocks must arrive (and be decoded) at an endpoint in -the same order in which they were encoded. This ensures that the dynamic state -at the two endpoints remains in sync. As a result, HTTP/QUIC uses a modified -version of HPACK, described in [QPACK]. - -Frame type definitions in HTTP/QUIC often use the QUIC variable-length integer -encoding. In particular, Stream IDs use this encoding, which allow for a larger -range of possible values than the encoding used in HTTP/2. Some frames in -HTTP/QUIC use an identifier rather than a Stream ID (e.g. Push IDs in PRIORITY -frames). Redefinition of the encoding of extension frame types might be -necessary if the encoding includes a Stream ID. 
- -Because the Flags field is not present in generic HTTP/QUIC frames, those frames -which depend on the presence of flags need to allocate space for flags as part -of their frame payload. - -Other than this issue, frame type HTTP/2 extensions are typically portable to -QUIC simply by replacing Stream 0 in HTTP/2 with a control stream in HTTP/QUIC. -HTTP/QUIC extensions will not assume ordering, but would not be harmed by -ordering, and would be portable to HTTP/2 in the same manner. - -Below is a listing of how each HTTP/2 frame type is mapped: - -DATA (0x0): -: Padding is not defined in HTTP/QUIC frames. See {{frame-data}}. - -HEADERS (0x1): -: As described above, the PRIORITY region of HEADERS is not supported. A - separate PRIORITY frame MUST be used. Padding is not defined in HTTP/QUIC - frames. See {{frame-headers}}. - -PRIORITY (0x2): -: As described above, the PRIORITY frame is sent on the control stream and can - reference either a Stream ID or a Push ID. See {{frame-priority}}. - -RST_STREAM (0x3): -: RST_STREAM frames do not exist, since QUIC provides stream lifecycle - management. The same code point is used for the CANCEL_PUSH frame - ({{frame-cancel-push}}). - -SETTINGS (0x4): -: SETTINGS frames are sent only at the beginning of the connection. See - {{frame-settings}} and {{h2-settings}}. - -PUSH_PROMISE (0x5): -: The PUSH_PROMISE does not reference a stream; instead the push stream - references the PUSH_PROMISE frame using a Push ID. See - {{frame-push-promise}}. - -PING (0x6): -: PING frames do not exist, since QUIC provides equivalent functionality. - -GOAWAY (0x7): -: GOAWAY is sent only from server to client and does not contain an error code. - See {{frame-goaway}}. - -WINDOW_UPDATE (0x8): -: WINDOW_UPDATE frames do not exist, since QUIC provides flow control. - -CONTINUATION (0x9): -: CONTINUATION frames do not exist; instead, larger HEADERS/PUSH_PROMISE - frames than HTTP/2 are permitted, and HEADERS frames can be used in series. 
- -Frame types defined by extensions to HTTP/2 need to be separately registered for -HTTP/QUIC if still applicable. The IDs of frames defined in {{!RFC7540}} have -been reserved for simplicity. See {{iana-frames}}. - -## HTTP/2 SETTINGS Parameters {#h2-settings} - -An important difference from HTTP/2 is that settings are sent once, at the -beginning of the connection, and thereafter cannot change. This eliminates -many corner cases around synchronization of changes. - -Some transport-level options that HTTP/2 specifies via the SETTINGS frame are -superseded by QUIC transport parameters in HTTP/QUIC. The HTTP-level options -that are retained in HTTP/QUIC have the same value as in HTTP/2. - -Below is a listing of how each HTTP/2 SETTINGS parameter is mapped: - -SETTINGS_HEADER_TABLE_SIZE: -: See {{settings-parameters}}. - -SETTINGS_ENABLE_PUSH: -: This is removed in favor of the MAX_PUSH_ID which provides a more granular - control over server push. - -SETTINGS_MAX_CONCURRENT_STREAMS: -: QUIC controls the largest open Stream ID as part of its flow control logic. - Specifying SETTINGS_MAX_CONCURRENT_STREAMS in the SETTINGS frame is an error. - -SETTINGS_INITIAL_WINDOW_SIZE: -: QUIC requires both stream and connection flow control window sizes to be - specified in the initial transport handshake. Specifying - SETTINGS_INITIAL_WINDOW_SIZE in the SETTINGS frame is an error. - -SETTINGS_MAX_FRAME_SIZE: -: This setting has no equivalent in HTTP/QUIC. Specifying it in the SETTINGS - frame is an error. - -SETTINGS_MAX_HEADER_LIST_SIZE: -: See {{settings-parameters}}. - -In HTTP/QUIC, setting values are variable-length integers (6, 14, 30, or 62 bits -long) rather than fixed-length 32-bit fields as in HTTP/2. This will often -produce a shorter encoding, but can produce a longer encoding for settings which -use the full 32-bit space. Settings ported from HTTP/2 might choose to redefine -the format of their settings to avoid using the 62-bit encoding. 
- -Settings need to be defined separately for HTTP/2 and HTTP/QUIC. The IDs of -settings defined in {{!RFC7540}} have been reserved for simplicity. See -{{iana-settings}}. - - -## HTTP/2 Error Codes - -QUIC has the same concepts of "stream" and "connection" errors that HTTP/2 -provides. However, because the error code space is shared between multiple -components, there is no direct portability of HTTP/2 error codes. - -The HTTP/2 error codes defined in Section 7 of {{!RFC7540}} map to the HTTP/QUIC -error codes as follows: - -NO_ERROR (0x0): -: HTTP_NO_ERROR in {{http-error-codes}}. - -PROTOCOL_ERROR (0x1): -: No single mapping. See new HTTP_MALFORMED_FRAME error codes defined in - {{http-error-codes}}. - -INTERNAL_ERROR (0x2): -: HTTP_INTERNAL_ERROR in {{http-error-codes}}. - -FLOW_CONTROL_ERROR (0x3): -: Not applicable, since QUIC handles flow control. Would provoke a - QUIC_FLOW_CONTROL_RECEIVED_TOO_MUCH_DATA from the QUIC layer. - -SETTINGS_TIMEOUT (0x4): -: Not applicable, since no acknowledgement of SETTINGS is defined. - -STREAM_CLOSED (0x5): -: Not applicable, since QUIC handles stream management. Would provoke a - QUIC_STREAM_DATA_AFTER_TERMINATION from the QUIC layer. - -FRAME_SIZE_ERROR (0x6): -: No single mapping. See new error codes defined in {{http-error-codes}}. - -REFUSED_STREAM (0x7): -: Not applicable, since QUIC handles stream management. Would provoke a - QUIC_TOO_MANY_OPEN_STREAMS from the QUIC layer. - -CANCEL (0x8): -: HTTP_REQUEST_CANCELLED in {{http-error-codes}}. - -COMPRESSION_ERROR (0x9): -: HTTP_QPACK_DECOMPRESSION_FAILED in [QPACK]. - -CONNECT_ERROR (0xa): -: HTTP_CONNECT_ERROR in {{http-error-codes}}. - -ENHANCE_YOUR_CALM (0xb): -: HTTP_EXCESSIVE_LOAD in {{http-error-codes}}. - -INADEQUATE_SECURITY (0xc): -: Not applicable, since QUIC is assumed to provide sufficient security on all - connections. - -HTTP_1_1_REQUIRED (0xd): -: HTTP_VERSION_FALLBACK in {{http-error-codes}}. 
- -Error codes need to be defined for HTTP/2 and HTTP/QUIC separately. See -{{iana-error-codes}}. - # Security Considerations The security considerations of HTTP/QUIC should be comparable to those of HTTP/2 @@ -1811,6 +1581,236 @@ Sender: --- back +# Considerations for Transitioning from HTTP/2 + +HTTP/QUIC is strongly informed by HTTP/2, and bears many similarities. This +section describes the approach taken to design HTTP/QUIC, points out important +differences from HTTP/2, and describes how to map HTTP/2 extensions into +HTTP/QUIC. + +HTTP/QUIC begins from the premise that HTTP/2 code reuse is a useful feature, +but not a hard requirement. HTTP/QUIC departs from HTTP/2 primarily where +necessary to accommodate the differences in behavior between QUIC and TCP (lack +of ordering, support for streams). We intend to avoid gratuitous changes which +make it difficult or impossible to build extensions with the same semantics +applicable to both protocols at once. + +These departures are noted in this section. + +## Streams {#h2-streams} + +HTTP/QUIC permits use of a larger number of streams (2^62-1) than HTTP/2. The +considerations about exhaustion of stream identifier space apply, though the +space is significantly larger such that it is likely that other limits in QUIC +are reached first, such as the limit on the connection flow control window. + +## HTTP Frame Types {#h2-frames} + +Many framing concepts from HTTP/2 can be elided away on QUIC, because the +transport deals with them. Because frames are already on a stream, they can omit +the stream number. Because frames do not block multiplexing (QUIC's multiplexing +occurs below this layer), the support for variable-maximum-length packets can be +removed. Because stream termination is handled by QUIC, an END_STREAM flag is +not required. This permits the removal of the Flags field from the generic +frame layout. + +Frame payloads are largely drawn from {{!RFC7540}}. However, QUIC includes many +features (e.g. 
flow control) which are also present in HTTP/2. In these cases, +the HTTP mapping does not re-implement them. As a result, several HTTP/2 frame +types are not required in HTTP/QUIC. Where an HTTP/2-defined frame is no longer +used, the frame ID has been reserved in order to maximize portability between +HTTP/2 and HTTP/QUIC implementations. However, even equivalent frames between +the two mappings are not identical. + +Many of the differences arise from the fact that HTTP/2 provides an absolute +ordering between frames across all streams, while QUIC provides this guarantee +on each stream only. As a result, if a frame type makes assumptions that frames +from different streams will still be received in the order sent, HTTP/QUIC will +break them. + +For example, implicit in the HTTP/2 prioritization scheme is the notion of +in-order delivery of priority changes (i.e., dependency tree mutations): since +operations on the dependency tree such as reparenting a subtree are not +commutative, both sender and receiver must apply them in the same order to +ensure that both sides have a consistent view of the stream dependency tree. +HTTP/2 specifies priority assignments in PRIORITY frames and (optionally) in +HEADERS frames. To achieve in-order delivery of priority changes in HTTP/QUIC, +PRIORITY frames are sent on the control stream and the PRIORITY section is +removed from the HEADERS frame. + +Likewise, HPACK was designed with the assumption of in-order delivery. A +sequence of encoded header blocks must arrive (and be decoded) at an endpoint in +the same order in which they were encoded. This ensures that the dynamic state +at the two endpoints remains in sync. As a result, HTTP/QUIC uses a modified +version of HPACK, described in [QPACK]. + +Frame type definitions in HTTP/QUIC often use the QUIC variable-length integer +encoding. In particular, Stream IDs use this encoding, which allow for a larger +range of possible values than the encoding used in HTTP/2. 
Some frames in +HTTP/QUIC use an identifier rather than a Stream ID (e.g. Push IDs in PRIORITY +frames). Redefinition of the encoding of extension frame types might be +necessary if the encoding includes a Stream ID. + +Because the Flags field is not present in generic HTTP/QUIC frames, those frames +which depend on the presence of flags need to allocate space for flags as part +of their frame payload. + +Other than this issue, frame type HTTP/2 extensions are typically portable to +QUIC simply by replacing Stream 0 in HTTP/2 with a control stream in HTTP/QUIC. +HTTP/QUIC extensions will not assume ordering, but would not be harmed by +ordering, and would be portable to HTTP/2 in the same manner. + +Below is a listing of how each HTTP/2 frame type is mapped: + +DATA (0x0): +: Padding is not defined in HTTP/QUIC frames. See {{frame-data}}. + +HEADERS (0x1): +: As described above, the PRIORITY region of HEADERS is not supported. A + separate PRIORITY frame MUST be used. Padding is not defined in HTTP/QUIC + frames. See {{frame-headers}}. + +PRIORITY (0x2): +: As described above, the PRIORITY frame is sent on the control stream and can + reference either a Stream ID or a Push ID. See {{frame-priority}}. + +RST_STREAM (0x3): +: RST_STREAM frames do not exist, since QUIC provides stream lifecycle + management. The same code point is used for the CANCEL_PUSH frame + ({{frame-cancel-push}}). + +SETTINGS (0x4): +: SETTINGS frames are sent only at the beginning of the connection. See + {{frame-settings}} and {{h2-settings}}. + +PUSH_PROMISE (0x5): +: The PUSH_PROMISE does not reference a stream; instead the push stream + references the PUSH_PROMISE frame using a Push ID. See + {{frame-push-promise}}. + +PING (0x6): +: PING frames do not exist, since QUIC provides equivalent functionality. + +GOAWAY (0x7): +: GOAWAY is sent only from server to client and does not contain an error code. + See {{frame-goaway}}. 
+ +WINDOW_UPDATE (0x8): +: WINDOW_UPDATE frames do not exist, since QUIC provides flow control. + +CONTINUATION (0x9): +: CONTINUATION frames do not exist; instead, larger HEADERS/PUSH_PROMISE + frames than HTTP/2 are permitted, and HEADERS frames can be used in series. + +Frame types defined by extensions to HTTP/2 need to be separately registered for +HTTP/QUIC if still applicable. The IDs of frames defined in {{!RFC7540}} have +been reserved for simplicity. See {{iana-frames}}. + +## HTTP/2 SETTINGS Parameters {#h2-settings} + +An important difference from HTTP/2 is that settings are sent once, at the +beginning of the connection, and thereafter cannot change. This eliminates +many corner cases around synchronization of changes. + +Some transport-level options that HTTP/2 specifies via the SETTINGS frame are +superseded by QUIC transport parameters in HTTP/QUIC. The HTTP-level options +that are retained in HTTP/QUIC have the same value as in HTTP/2. + +Below is a listing of how each HTTP/2 SETTINGS parameter is mapped: + +SETTINGS_HEADER_TABLE_SIZE: +: See {{settings-parameters}}. + +SETTINGS_ENABLE_PUSH: +: This is removed in favor of the MAX_PUSH_ID which provides a more granular + control over server push. + +SETTINGS_MAX_CONCURRENT_STREAMS: +: QUIC controls the largest open Stream ID as part of its flow control logic. + Specifying SETTINGS_MAX_CONCURRENT_STREAMS in the SETTINGS frame is an error. + +SETTINGS_INITIAL_WINDOW_SIZE: +: QUIC requires both stream and connection flow control window sizes to be + specified in the initial transport handshake. Specifying + SETTINGS_INITIAL_WINDOW_SIZE in the SETTINGS frame is an error. + +SETTINGS_MAX_FRAME_SIZE: +: This setting has no equivalent in HTTP/QUIC. Specifying it in the SETTINGS + frame is an error. + +SETTINGS_MAX_HEADER_LIST_SIZE: +: See {{settings-parameters}}. + +In HTTP/QUIC, setting values are variable-length integers (6, 14, 30, or 62 bits +long) rather than fixed-length 32-bit fields as in HTTP/2. 
This will often +produce a shorter encoding, but can produce a longer encoding for settings which +use the full 32-bit space. Settings ported from HTTP/2 might choose to redefine +the format of their settings to avoid using the 62-bit encoding. + +Settings need to be defined separately for HTTP/2 and HTTP/QUIC. The IDs of +settings defined in {{!RFC7540}} have been reserved for simplicity. See +{{iana-settings}}. + + +## HTTP/2 Error Codes + +QUIC has the same concepts of "stream" and "connection" errors that HTTP/2 +provides. However, because the error code space is shared between multiple +components, there is no direct portability of HTTP/2 error codes. + +The HTTP/2 error codes defined in Section 7 of {{!RFC7540}} map to the HTTP/QUIC +error codes as follows: + +NO_ERROR (0x0): +: HTTP_NO_ERROR in {{http-error-codes}}. + +PROTOCOL_ERROR (0x1): +: No single mapping. See new HTTP_MALFORMED_FRAME error codes defined in + {{http-error-codes}}. + +INTERNAL_ERROR (0x2): +: HTTP_INTERNAL_ERROR in {{http-error-codes}}. + +FLOW_CONTROL_ERROR (0x3): +: Not applicable, since QUIC handles flow control. Would provoke a + QUIC_FLOW_CONTROL_RECEIVED_TOO_MUCH_DATA from the QUIC layer. + +SETTINGS_TIMEOUT (0x4): +: Not applicable, since no acknowledgement of SETTINGS is defined. + +STREAM_CLOSED (0x5): +: Not applicable, since QUIC handles stream management. Would provoke a + QUIC_STREAM_DATA_AFTER_TERMINATION from the QUIC layer. + +FRAME_SIZE_ERROR (0x6): +: No single mapping. See new error codes defined in {{http-error-codes}}. + +REFUSED_STREAM (0x7): +: Not applicable, since QUIC handles stream management. Would provoke a + QUIC_TOO_MANY_OPEN_STREAMS from the QUIC layer. + +CANCEL (0x8): +: HTTP_REQUEST_CANCELLED in {{http-error-codes}}. + +COMPRESSION_ERROR (0x9): +: HTTP_QPACK_DECOMPRESSION_FAILED in [QPACK]. + +CONNECT_ERROR (0xa): +: HTTP_CONNECT_ERROR in {{http-error-codes}}. + +ENHANCE_YOUR_CALM (0xb): +: HTTP_EXCESSIVE_LOAD in {{http-error-codes}}. 
+ +INADEQUATE_SECURITY (0xc): +: Not applicable, since QUIC is assumed to provide sufficient security on all + connections. + +HTTP_1_1_REQUIRED (0xd): +: HTTP_VERSION_FALLBACK in {{http-error-codes}}. + +Error codes need to be defined for HTTP/2 and HTTP/QUIC separately. See +{{iana-error-codes}}. + # Change Log > **RFC Editor's Note:** Please remove this section prior to publication of a From 6305be849d275486968f16f1175db0b1dd374cbe Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 14:33:42 -0700 Subject: [PATCH 05/57] Expand introduction, discuss relationship with HTTP and TCP --- draft-ietf-quic-http.md | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 5ecfd300a3..9b76ebdb82 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -86,12 +86,24 @@ code and issues list for this draft can be found at # Introduction -The QUIC transport protocol has several features that are desirable in a -transport for HTTP, such as stream multiplexing, per-stream flow control, and -low-latency connection establishment. This document describes a mapping of HTTP -semantics over QUIC, drawing heavily on the existing TCP mapping, HTTP/2. -Specifically, this document identifies HTTP/2 features that are subsumed by -QUIC, and describes how the other features can be implemented atop QUIC. +HTTP semantics are used for a broad range of services on the Internet. These +semantics have commonly been used with two different TCP mappings, HTTP/1.1 and +HTTP/2. HTTP/2 introduced a framing and multiplexing layer to improve latency +without modifying the transport layer. However, TCP's lack of visibility into +parallel requests in both mappings limited the possible performance gains. + +The QUIC transport protocol incorporates stream multiplexing and per-stream +flow control, similar to that provided by the HTTP/2 framing layer.
By providing +reliability at the stream level and congestion control across the entire +connection, it has the capability to improve the performance of HTTP compared to +a TCP mapping. QUIC also incorporates TLS 1.3 at the transport layer, offering +comparable security to running TLS over TCP, but with improved connection setup +latency. + +This document describes a mapping of HTTP semantics over the QUIC transport +protocol, drawing heavily on the design of HTTP/2. This document identifies HTTP/2 +features that are subsumed by QUIC, and describes how the other features can be +implemented atop QUIC. QUIC is described in {{QUIC-TRANSPORT}}. For a full description of HTTP/2, see {{!RFC7540}}. From d755aac12716788232d093d9361d7a36b418312a Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 14:36:25 -0700 Subject: [PATCH 06/57] Fix reserved version example --- draft-ietf-quic-http.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 9b76ebdb82..4d5b825ada 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -203,17 +203,17 @@ parameter MUST NOT occur more than once; clients SHOULD process only the first occurrence. For example, suppose a server supported both version 0x00000001 and the version -rendered in ASCII as "Q034". If it opted to include the reserved versions (from -Section 4 of {{QUIC-TRANSPORT}}) 0x0 and 0x1abadaba, it could specify the +rendered in ASCII as "Q034". If it also opted to include the reserved version +(from Section 3 of {{QUIC-TRANSPORT}}) 0x1abadaba, it could specify the following header field: ~~~ example -Alt-Svc: hq=":49288";quic="1,1abadaba,51303334,0" +Alt-Svc: hq=":49288";quic="1,1abadaba,51303334" ~~~ -A client acting on this header field would drop the reserved versions (because -it does not support them), then attempt to connect to the alternative using the -first version in the list which it does support.
+A client acting on this header field would drop the reserved version (not +supported), then attempt to connect to the alternative using the first version +in the list which it does support, if any. ## Connection Establishment {#connection-establishment} From 3fadd85332e7fd83d3d36c5e9894b1b51aef7529 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 14:38:20 -0700 Subject: [PATCH 07/57] Narrow restriction on server sending before SETTINGS --- draft-ietf-quic-http.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 4d5b825ada..8f30e91c16 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -233,8 +233,8 @@ the initial crypto handshake, HTTP/QUIC-specific settings are conveyed in the SETTINGS frame. After the QUIC connection is established, a SETTINGS frame ({{frame-settings}}) MUST be sent by each endpoint as the initial frame of their respective HTTP control stream (see {{control-streams}}). The server MUST NOT -send data on any other stream until the client's SETTINGS frame has been -received. +process any request streams or send responses until the client's SETTINGS frame +has been received. ## Connection Reuse From 971da604998eeee8128d22143cd705c1d394a806 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 14:41:59 -0700 Subject: [PATCH 08/57] Better discussion of QUIC streams --- draft-ietf-quic-http.md | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 8f30e91c16..8dae7a707c 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -263,9 +263,13 @@ management of HTTP/QUIC connections. A QUIC stream provides reliable in-order delivery of bytes, but makes no guarantees about order of delivery with regard to bytes on other streams. On the wire, data is framed into QUIC STREAM frames, but this framing is invisible to -the HTTP framing layer. 
A QUIC receiver buffers and orders received STREAM -frames, exposing the data contained within as a reliable byte stream to the -application. +the HTTP framing layer. The transport layer buffers and orders received QUIC +STREAM frames, exposing the data contained within as a reliable byte stream to +the application. + +QUIC streams can be either unidirectional, carrying data only from initiator to +receiver, or bidirectional. Streams can be initiated by either the client or +the server. For more detail on QUIC streams, see {{QUIC-TRANSPORT}}, Section 9. When HTTP headers and data are sent over QUIC, the QUIC layer handles most of the stream management. HTTP does not need to do any separate multiplexing when @@ -291,7 +295,9 @@ the stream was truncated, this MUST be treated as a connection error (see HTTP_MALFORMED_FRAME in {{http-error-codes}}). Streams which terminate abruptly may be reset at any point in the frame. -HTTP/QUIC does not use server-initiated bidirectional streams. +HTTP/QUIC does not use server-initiated bidirectional streams; clients MUST omit +or specify a value of zero for the QUIC transport parameter +`initial_max_bidi_streams`. ## Unidirectional Streams From f3f5accaad26adb7b96415b2a55e8d6026495bd1 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 14:53:27 -0700 Subject: [PATCH 09/57] Fix references, reorder unidirectional stream section --- draft-ietf-quic-http.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 8dae7a707c..c1e4266b4f 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -317,7 +317,7 @@ this header is determined by the stream type. Some stream types are reserved ({{stream-grease}}). Two stream types are defined in this document: control streams ({{control-streams}}) and push streams -({{server-push}}). Other stream types can be defined by extensions to +({{push-streams}}). 
Other stream types can be defined by extensions to HTTP/QUIC. Both clients and servers SHOULD send a value of three or greater for the QUIC @@ -334,17 +334,6 @@ them. However, stream types which could modify the state or semantics of existing protocol components, including QPACK or other extensions, MUST NOT be sent until the peer is known to support them. -### Reserved Stream Types {#stream-grease} - -Stream types of the format `0x1f * N` are reserved to exercise the requirement -that unknown types be ignored. These streams have no semantic meaning, and can -be sent when application-layer padding is desired. They MAY also be sent on -connections where no request data is currently being transferred. Endpoints MUST -NOT consider these streams to have any meaning upon receipt. - -The payload and length of the stream are selected in any manner the -implementation chooses. - ### Control Streams The control stream is indicated by a stream type of `0x43` (ASCII 'C'). Data on @@ -369,8 +358,8 @@ able to send stream data first after the cryptographic handshake completes. A push stream is indicated by a stream type of `0x50` (ASCII 'P'), followed by the Push ID of the promise that it fulfills, encoded as a variable-length integer. The remaining data on this stream consists of HTTP/QUIC frames, as -defined in {{frames}}, and fulfills a promised server push, as described in -{{server-push}}. +defined in {{frames}}, and fulfills a promised server push. Server push and +Push IDs are described in {{server-push}}. Only servers can push; if a server receives a client-initiated push stream, this MUST be treated as a stream error of type HTTP_WRONG_STREAM_DIRECTION. @@ -388,6 +377,17 @@ Each Push ID MUST only be used once in a push stream header. If a push stream header includes a Push ID that was used in another push stream header, the client MUST treat this as a connection error of type HTTP_DUPLICATE_PUSH. 
+### Reserved Stream Types {#stream-grease} + +Stream types of the format `0x1f * N` are reserved to exercise the requirement +that unknown types be ignored. These streams have no semantic meaning, and can +be sent when application-layer padding is desired. They MAY also be sent on +connections where no request data is currently being transferred. Endpoints MUST +NOT consider these streams to have any meaning upon receipt. + +The payload and length of the stream are selected in any manner the +implementation chooses. + # HTTP Framing Layer {#http-framing-layer} From aefe5a40d139750eead5d849bcd0ac339d283514 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 15:06:42 -0700 Subject: [PATCH 10/57] General framing layer fixups --- draft-ietf-quic-http.md | 40 ++++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index c1e4266b4f..5b3eeeaf13 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -392,8 +392,8 @@ implementation chooses. # HTTP Framing Layer {#http-framing-layer} Frames are used on the control stream, request streams, and push streams. This -section describes HTTP framing in QUIC and highlights some differences from -HTTP/2 framing. For more detail on differences from HTTP/2, see {{h2-frames}}. +section describes HTTP framing in QUIC. For a comparison with HTTP/2 frames, +see {{h2-frames}}. ## Frame Layout @@ -422,21 +422,13 @@ A frame includes the following fields: Frame Payload: : A payload, the semantics of which are determined by the Type field. +Each frame's payload MUST contain exactly the identified fields. A frame that +contains additional octets after the identified fields or a frame that +terminates before the end of the identified fields MUST be treated as a +connection error of type HTTP_MALFORMED_FRAME. 
## Frame Definitions {#frames} -### Reserved Frame Types {#frame-grease} - -Frame types of the format `0xb + (0x1f * N)` are reserved to exercise the -requirement that unknown types be ignored. These frames have no semantic -meaning, and can be sent when application-layer padding is desired. They MAY -also be sent on connections where no request data is currently being -transferred. Endpoints MUST NOT consider these frames to have any meaning upon -receipt. - -The payload and length of the frames are selected in any manner the -implementation chooses. - ### DATA {#frame-data} DATA frames (type=0x0) convey arbitrary, variable-length sequences of octets @@ -566,11 +558,6 @@ type HTTP_MALFORMED_FRAME. A PRIORITY frame that references a non-existent Push ID or a Placeholder ID greater than the server's limit MUST be treated as a HTTP_MALFORMED_FRAME error. -A PRIORITY frame MUST contain only the identified fields. A PRIORITY frame that -contains more or fewer fields, or a PRIORITY frame that includes a truncated -integer encoding MUST be treated as a connection error of type -HTTP_MALFORMED_FRAME. - ### CANCEL_PUSH {#frame-cancel-push} @@ -839,6 +826,19 @@ A server MUST treat a MAX_PUSH_ID frame payload that does not contain a single variable-length integer as a connection error of type HTTP_MALFORMED_FRAME. +### Reserved Frame Types {#frame-grease} + +Frame types of the format `0xb + (0x1f * N)` are reserved to exercise the +requirement that unknown types be ignored ({{extensions}}). These frames have no +semantic value, and can be sent when application-layer padding is desired. They +MAY also be sent on connections where no request data is currently being +transferred. Endpoints MUST NOT consider these frames to have any meaning upon +receipt. + +The payload and length of the frames are selected in any manner the +implementation chooses. + + # HTTP Request Lifecycle ## HTTP Message Exchanges {#request-response} @@ -1217,7 +1217,7 @@ interrupts connectivity. 
If a connection terminates without a GOAWAY frame, clients MUST assume that any request which was sent, whether in whole or in part, might have been processed. -# Extensions to HTTP/QUIC +# Extensions to HTTP/QUIC {#extensions} HTTP/QUIC permits extension of the protocol. Within the limitations described in this section, protocol extensions can be used to provide additional services From a8036b2b0ec52a124954a363407ed3ca67e398da Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 15:07:31 -0700 Subject: [PATCH 11/57] PRIORITY fixups; fixes #1848 --- draft-ietf-quic-http.md | 27 ++++++++++----------------- 1 file changed, 10 insertions(+), 17 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 5b3eeeaf13..a44c2927d2 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -469,24 +469,17 @@ HEADERS frames can only be sent on request / push streams. ### PRIORITY {#frame-priority} -The PRIORITY (type=0x02) frame specifies the sender-advised priority of a stream -and is substantially different in format from {{!RFC7540}}. In order to ensure -that prioritization is processed in a consistent order, PRIORITY frames MUST be -sent on the control stream. A PRIORITY frame sent on any other stream MUST be -treated as a HTTP_WRONG_STREAM error. - -The format has been modified to accommodate not being sent on a request stream, -to allow for identification of server pushes, and the larger stream ID space of -QUIC. The semantics of the Stream Dependency, Weight, and E flag are otherwise -the same as in HTTP/2. +The PRIORITY (type=0x02) frame specifies the sender-advised priority of a +stream. In order to ensure that prioritization is processed in a consistent +order, PRIORITY frames MUST be sent on the control stream. A PRIORITY frame +sent on any other stream MUST be treated as a connection error of type +HTTP_WRONG_STREAM. 
~~~~~~~~~~ drawing 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|PT |DT |Empty|E| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Prioritized Element ID (i) ... +|PT |DT |Empty|E| Prioritized Element ID (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Element Dependency ID (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ @@ -519,9 +512,9 @@ The PRIORITY frame payload has the following fields: Element Dependency ID: : A variable-length integer that identifies the element on which a dependency - is being expressed. Depending on the value of Dependency Type, this - contains the Stream ID of a request stream, the Push ID of a promised - resource, or a Placeholder ID of a placeholder. For details of + is being expressed. Depending on the value of Dependency Type, this contains + the Stream ID of a request stream, the Push ID of a promised resource, a + Placeholder ID of a placeholder, or is ignored. For details of dependencies, see {{priority}} and {{!RFC7540}}, Section 5.3. Weight: @@ -1690,7 +1683,7 @@ HEADERS (0x1): PRIORITY (0x2): : As described above, the PRIORITY frame is sent on the control stream and can - reference either a Stream ID or a Push ID. See {{frame-priority}}. + reference a variety of identifiers. See {{frame-priority}}. 
RST_STREAM (0x3): : RST_STREAM frames do not exist, since QUIC provides stream lifecycle From 15d51bb68eaac9bcdb767fba4319caaa20339f26 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 15:07:50 -0700 Subject: [PATCH 12/57] CANCEL_PUSH fixups --- draft-ietf-quic-http.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index a44c2927d2..959f20f394 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -554,23 +554,23 @@ greater than the server's limit MUST be treated as a HTTP_MALFORMED_FRAME error. ### CANCEL_PUSH {#frame-cancel-push} -The CANCEL_PUSH frame (type=0x3) is used to request cancellation of server push -prior to the push stream being created. The CANCEL_PUSH frame identifies a -server push request by Push ID (see {{frame-push-promise}}) using a +The CANCEL_PUSH frame (type=0x3) is used to request cancellation of a server +push prior to the push stream being created. The CANCEL_PUSH frame identifies a +server push by Push ID (see {{frame-push-promise}}), encoded as a variable-length integer. When a server receives this frame, it aborts sending the response for the identified server push. If the server has not yet started to send the server -push, it can use the receipt of a CANCEL_PUSH frame to avoid opening a +push, it can use the receipt of a CANCEL_PUSH frame to avoid opening a push stream. If the push stream has been opened by the server, the server SHOULD -send a QUIC RST_STREAM frame on those streams and cease transmission of the +send a QUIC RST_STREAM frame on that stream and cease transmission of the response. -A server can send this frame to indicate that it won't be sending a response -prior to creation of a push stream. Once the push stream has been created, -sending CANCEL_PUSH has no effect on the state of the push stream. A QUIC -RST_STREAM frame SHOULD be used instead to cancel transmission of the server -push response. 
+A server can send this frame to indicate that it will not be fulfilling a +promise prior to creation of a push stream. Once the push stream has been +created, sending CANCEL_PUSH has no effect on the state of the push stream. A +QUIC RST_STREAM frame SHOULD be used instead to abort transmission of the +server push response. A CANCEL_PUSH frame is sent on the control stream. Sending a CANCEL_PUSH frame on a stream other than the control stream MUST be treated as a stream error of From 9e863ab7ebdbe4d30a83eca731f9ebaa347600dd Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 15:15:43 -0700 Subject: [PATCH 13/57] SETTINGS fixups --- draft-ietf-quic-http.md | 30 ++++++++++++++++-------------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 959f20f394..a0416ecc05 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -600,9 +600,10 @@ HTTP_MALFORMED_FRAME. ### SETTINGS {#frame-settings} The SETTINGS frame (type=0x4) conveys configuration parameters that affect how -endpoints communicate, such as preferences and constraints on peer behavior, and -is different from {{!RFC7540}}. Individually, a SETTINGS parameter can also be -referred to as a "setting". +endpoints communicate, such as preferences and constraints on peer behavior. +Individually, a SETTINGS parameter can also be referred to as a "setting"; the +identifier and value of each setting parameter can be referred to as a "setting +identifier" and a "setting value". SETTINGS parameters are not negotiated; they describe characteristics of the sending peer, which can be used by the receiving peer. However, a negotiation @@ -630,19 +631,19 @@ QUIC variable-length integer encoding. | Identifier (16) | Value (i) ... 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~~~~~~~~~~~~~ -{: #fig-ext-settings title="SETTINGS value format"} +{: #fig-ext-settings title="SETTINGS parameter format"} Each value MUST be compared against the remaining length of the SETTINGS frame. -Any value which purports to cross the end of the frame MUST cause the SETTINGS -frame to be considered malformed and trigger a connection error of type -HTTP_MALFORMED_FRAME. +A variable-length integer value which cannot fit within the remaining length of +the SETTINGS frame MUST cause the SETTINGS frame to be considered malformed and +trigger a connection error of type HTTP_MALFORMED_FRAME. An implementation MUST ignore the contents for any SETTINGS identifier it does not understand. SETTINGS frames always apply to a connection, never a single stream. A SETTINGS -frame MUST be sent as the first frame of either control stream (see -{{stream-mapping}}) by each peer, and MUST NOT be sent subsequently or on any +frame MUST be sent as the first frame of each control stream (see +{{control-streams}}) by each peer, and MUST NOT be sent subsequently or on any other stream. If an endpoint receives a SETTINGS frame on a different stream, the endpoint MUST respond with a connection error of type HTTP_WRONG_STREAM. If an endpoint receives a second SETTINGS frame, the endpoint MUST respond with a @@ -663,17 +664,18 @@ The following settings are defined in HTTP/QUIC: SETTINGS_MAX_HEADER_LIST_SIZE (0x6): : The default value is unlimited. -Settings values of the format `0x?a?a` are reserved to exercise the requirement -that unknown parameters be ignored. Such settings have no defined meaning. -Endpoints SHOULD include at least one such setting in their SETTINGS frame. -Endpoints MUST NOT consider such settings to have any meaning upon receipt. +Setting identifiers of the format `0x?a?a` are reserved to exercise the +requirement that unknown identifiers be ignored. Such settings have no defined +meaning. 
Endpoints SHOULD include at least one such setting in their SETTINGS +frame. Endpoints MUST NOT consider such settings to have any meaning upon +receipt. Because the setting has no defined meaning, the value of the setting can be any value the implementation selects. Additional settings MAY be defined by extensions to HTTP/QUIC. -#### Initial SETTINGS Values +#### Initialization When a 0-RTT QUIC connection is being used, the client's initial requests will be sent before the arrival of the server's SETTINGS frame. Clients MUST store From de88400a856474af3b468c9b9a501d306d1c607d Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 15:23:56 -0700 Subject: [PATCH 14/57] Import definition of MAX_HEADER_LIST_SIZE --- draft-ietf-quic-http.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index a0416ecc05..4740d75bd7 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -918,6 +918,14 @@ HTTP/QUIC uses QPACK header compression as described in [QPACK], a variation of HPACK which allows the flexibility to avoid header-compression-induced head-of-line blocking. See that document for additional details. +An HTTP/QUIC implementation MAY impose a limit on the maximum size of the header +it will accept on an individual HTTP message. This limit is conveyed as a +number of octets in the `SETTINGS_MAX_HEADER_LIST_SIZE` parameter. The size of +an header block is calculated based on the uncompressed size of header fields, +including the length of the name and value in octets plus an overhead of 32 +octets for each header field. Encountering a message header larger than this +value SHOULD be treated as a stream error of type `HTTP_EXCESSIVE_LOAD`. 
+ ### Request Cancellation Either client or server can cancel requests by aborting the stream (QUIC From ce52c50cfc74493fb5b5b2169cdd9d9a27aefb65 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 15:25:22 -0700 Subject: [PATCH 15/57] PUSH_PROMISE fixups --- draft-ietf-quic-http.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 4740d75bd7..0671d412f6 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -699,8 +699,8 @@ prior to receiving and processing the server's SETTINGS frame. ### PUSH_PROMISE {#frame-push-promise} -The PUSH_PROMISE frame (type=0x05) is used to carry a request header set from -server to client, as in HTTP/2. +The PUSH_PROMISE frame (type=0x05) is used to carry a promised request header +set from server to client, as in HTTP/2. ~~~~~~~~~~ drawing 0 1 2 3 @@ -716,8 +716,8 @@ server to client, as in HTTP/2. The payload consists of: Push ID: -: A variable-length integer that identifies the server push request. A push ID - is used in push stream header ({{server-push}}), CANCEL_PUSH frames +: A variable-length integer that identifies the server push operation. A push + ID is used in push stream headers ({{server-push}}), CANCEL_PUSH frames ({{frame-cancel-push}}), and PRIORITY frames ({{frame-priority}}). Header Block: From 78de8d628b0d51145214dd5b1426b6f865e78904 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Thu, 11 Oct 2018 15:26:20 -0700 Subject: [PATCH 16/57] MAX_PUSH_ID fixup --- draft-ietf-quic-http.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 0671d412f6..026136bcfc 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -800,7 +800,7 @@ a MAX_PUSH_ID frame as a connection error of type HTTP_MALFORMED_FRAME. 
The maximum Push ID is unset when a connection is created, meaning that a server cannot push until it receives a MAX_PUSH_ID frame. A client that wishes to manage the number of promised server pushes can increase the maximum Push ID by -sending a MAX_PUSH_ID frame as the server fulfills or cancels server pushes. +sending MAX_PUSH_ID frames as the server fulfills or cancels server pushes. ~~~~~~~~~~ drawing 0 1 2 3 From 5be7761ec2f33bf63229189c1e21cddfeaee4a62 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 09:34:37 -0700 Subject: [PATCH 17/57] Headers and message formatting fixups --- draft-ietf-quic-http.md | 41 +++++++++++++++++++++-------------------- 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 026136bcfc..b1f926e8ef 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -843,20 +843,15 @@ stream. A server sends an HTTP response on the same stream as the request. An HTTP message (request or response) consists of: -1. one header block (see {{frame-headers}}) containing the message header (see - {{!RFC7230}}, Section 3.2), +1. the message header (see {{!RFC7230}}, Section 3.2), sent as a single HEADERS + frame (see {{frame-headers}}), 2. the payload body (see {{!RFC7230}}, Section 3.3), sent as a series of DATA frames (see {{frame-data}}), -3. optionally, one header block containing the trailer-part, if present (see +3. optionally, one HEADERS frame containing the trailer-part, if present (see {{!RFC7230}}, Section 4.1.2). -In addition, prior to sending the message header block indicated above, a -response contains zero or more header blocks containing the message headers of -informational (1xx) HTTP responses (see {{!RFC7230}}, Section 3.2 and -{{!RFC7231}}, Section 6.2). - A server MAY interleave one or more PUSH_PROMISE frames (see {{frame-push-promise}}) with the frames of a response message. 
These PUSH_PROMISE frames are not part of the response; see {{server-push}} for more @@ -869,20 +864,25 @@ Trailing header fields are carried in an additional header block following the body. Senders MUST send only one header block in the trailers section; receivers MUST discard any subsequent header blocks. +A response MAY consist of multiple messages when and only when one or more +informational responses (1xx, see {{!RFC7231}}, Section 6.2) precede a final +response to the same request. Non-final responses do not contain a payload body +or trailers. + An HTTP request/response exchange fully consumes a bidirectional QUIC stream. After sending a request, a client closes the stream for sending; after sending a -response, the server closes the stream for sending and the QUIC stream is fully -closed. Requests and responses are considered complete when the corresponding -QUIC stream is closed in the appropriate direction. +final response, the server closes the stream for sending and the QUIC stream is +fully closed. Requests and responses are considered complete when the +corresponding QUIC stream is closed in the appropriate direction. A server can send a complete response prior to the client sending an entire request if the response does not depend on any portion of the request that has not been sent and received. When this is true, a server MAY request that the client abort transmission of a request without error by triggering a QUIC -STOP_SENDING with error code HTTP_EARLY_RESPONSE, sending a complete response, -and cleanly closing its stream. Clients MUST NOT discard complete responses as -a result of having their request terminated abruptly, though clients can always -discard responses at their discretion for other reasons. +STOP_SENDING frame with error code HTTP_EARLY_RESPONSE, sending a complete +response, and cleanly closing its stream. 
Clients MUST NOT discard complete +responses as a result of having their request terminated abruptly, though +clients can always discard responses at their discretion for other reasons. Changes to the state of a request stream, including receiving a RST_STREAM with any error code, do not affect the state of the server's response. Servers do not @@ -894,9 +894,10 @@ HTTP_INCOMPLETE_REQUEST. ### Header Formatting and Compression -HTTP header fields carry information as a series of key-value pairs. For a -listing of registered HTTP header fields, see the "Message Header Field" -registry maintained at . +HTTP message headers carry information as a series of key-value pairs, called +header fields. For a listing of registered HTTP header fields, see the "Message +Header Field" registry maintained at +. Just as in previous versions of HTTP, header field names are strings of ASCII characters that are compared in a case-insensitive fashion. Properties of HTTP @@ -906,7 +907,7 @@ header field names MUST be converted to lowercase prior to their encoding. A request or response containing uppercase header field names MUST be treated as malformed. -As in HTTP/2, HTTP/QUIC uses special pseudo-header fields beginning with ':' +As in HTTP/2, HTTP/QUIC uses special pseudo-header fields beginning with the ':' character (ASCII 0x3a) to convey the target URI, the method of the request, and the status code for the response. These pseudo-header fields are defined in Section 8.1.2.3 and 8.1.2.4 of {{!RFC7540}}. Pseudo-header fields are not HTTP @@ -921,7 +922,7 @@ head-of-line blocking. See that document for additional details. An HTTP/QUIC implementation MAY impose a limit on the maximum size of the header it will accept on an individual HTTP message. This limit is conveyed as a number of octets in the `SETTINGS_MAX_HEADER_LIST_SIZE` parameter. 
The size of -an header block is calculated based on the uncompressed size of header fields, +a header list is calculated based on the uncompressed size of header fields, including the length of the name and value in octets plus an overhead of 32 octets for each header field. Encountering a message header larger than this value SHOULD be treated as a stream error of type `HTTP_EXCESSIVE_LOAD`. From ec75492f9c9f4df8004cb773b3619dc5c919cd96 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 09:37:41 -0700 Subject: [PATCH 18/57] Cancellation and CONNECT fixups --- draft-ietf-quic-http.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index b1f926e8ef..822000e12d 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -930,12 +930,12 @@ value SHOULD be treated as a stream error of type `HTTP_EXCESSIVE_LOAD`. ### Request Cancellation Either client or server can cancel requests by aborting the stream (QUIC -RST_STREAM or STOP_SENDING frames, as appropriate) with an error code of +RST_STREAM and/or STOP_SENDING frames, as appropriate) with an error code of HTTP_REQUEST_CANCELLED ({{http-error-codes}}). When the client cancels a -response, it indicates that this response is no longer of interest. Clients -SHOULD cancel requests by aborting both directions of a stream. +response, it indicates that this response is no longer of interest. +Implementations SHOULD cancel requests by aborting both directions of a stream. -When the server cancels its response stream using HTTP_REQUEST_CANCELLED, it +When the server aborts its response stream using HTTP_REQUEST_CANCELLED, it indicates that no application processing was performed. The client can treat requests cancelled by the server as though they had never been sent at all, thereby allowing them to be retried later on a new connection. Servers MUST NOT @@ -966,14 +966,14 @@ host for similar purposes. 
A CONNECT request in HTTP/QUIC functions in the same manner as in HTTP/2. The request MUST be formatted as described in {{!RFC7540}}, Section 8.3. A CONNECT request that does not conform to these restrictions is malformed. The request -stream MUST NOT be half-closed at the end of the request. +stream MUST NOT be closed at the end of the request. A proxy that supports CONNECT establishes a TCP connection ({{!RFC0793}}) to the server identified in the ":authority" pseudo-header field. Once this connection is successfully established, the proxy sends a HEADERS frame containing a 2xx series status code to the client, as defined in {{!RFC7231}}, Section 4.3.6. -All DATA frames on the request stream correspond to data sent on the TCP +All DATA frames on the stream correspond to data sent or received on the TCP connection. Any DATA frame sent by the client is transmitted by the proxy to the TCP server; data received from the TCP server is packaged into DATA frames by the proxy. Note that the size and number of TCP segments is not guaranteed to From d9ae519b8d4501840ed0c486a090b7043a82c145 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 10:20:32 -0700 Subject: [PATCH 19/57] Tighten some redundancies --- draft-ietf-quic-http.md | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 822000e12d..60c314ff36 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -1017,9 +1017,6 @@ An element can depend on another element or on the root of the tree. A reference to an element which is no longer in the tree is treated as a reference to the root of the tree. -Only a client can send PRIORITY frames. A server MUST NOT send a PRIORITY -frame. - ### Placeholders In HTTP/2, certain implementations used closed or unused streams as placeholders @@ -1083,10 +1080,9 @@ NOT declare a dependency on a stream it knows to have been closed. 
HTTP/QUIC server push is similar to what is described in HTTP/2 {{!RFC7540}}, but uses different mechanisms. -Each promised response is assigned a unique Push ID that uniquely identifies a -server push. This allows a server to fulfill promises in the order that best -suits its needs. The same Push ID can be used in multiple PUSH_PROMISE frames -(see {{frame-push-promise}}). +Each server push is identified by a unique Push ID. The same Push ID can be used +in one or more PUSH_PROMISE frames (see {{frame-push-promise}}), then included +with the push stream which ultimately fulfills those promises. Server push is only enabled on a connection when a client sends a MAX_PUSH_ID frame (see {{frame-max-push-id}}). A server cannot use server push @@ -1098,7 +1094,7 @@ connection error of type HTTP_PUSH_LIMIT_EXCEEDED. The header of the request message is carried by a PUSH_PROMISE frame (see {{frame-push-promise}}) on the request stream which generated the push. This -allows the server push to be associated with a request. Ordering of a +allows the server push to be associated with a client request. Ordering of a PUSH_PROMISE in relation to certain parts of the response is important (see Section 8.2.1 of {{!RFC7540}}). Promised requests MUST conform to the requirements in Section 8.2 of {{!RFC7540}}. From c35349504c60d528aecb325a0b2b69f6cbbdd7bb Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 10:41:20 -0700 Subject: [PATCH 20/57] Explicitly mention error upgrades --- draft-ietf-quic-http.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 60c314ff36..59c30165cd 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -1260,7 +1260,8 @@ extension is disabled if the setting is omitted. QUIC allows the application to abruptly terminate (reset) individual streams or the entire connection when an error is encountered. 
These are referred to as "stream errors" or "connection errors" and are described in more detail in -{{QUIC-TRANSPORT}}. +{{QUIC-TRANSPORT}}. An endpoint MAY choose to treat a stream error as a +connection error. This section describes HTTP/QUIC-specific error codes which can be used to express the cause of a connection or stream error. From f23e01b80c98a2e32049a0e3ed6c5759134c7bed Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 10:44:13 -0700 Subject: [PATCH 21/57] Generalize varint text in Security Considerations --- draft-ietf-quic-http.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 59c30165cd..53b8b00790 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -1350,18 +1350,19 @@ HTTP_MALFORMED_FRAME (0x01XX): # Security Considerations The security considerations of HTTP/QUIC should be comparable to those of HTTP/2 -with TLS. Note that where HTTP/2 employs PADDING frames to make a connection -more resistant to traffic analysis, HTTP/QUIC can rely on QUIC's own PADDING -frames or employ the reserved frame and stream types discussed in -{{frame-grease}} and {{stream-grease}}. +with TLS. Note that where HTTP/2 employs PADDING frames and Padding fields in +other frames to make a connection more resistant to traffic analysis, HTTP/QUIC +can rely on QUIC PADDING frames or employ the reserved frame and stream types +discussed in {{frame-grease}} and {{stream-grease}}. When HTTP Alternative Services is used for discovery for HTTP/QUIC endpoints, the security considerations of {{!ALTSVC}} also apply. -The modified SETTINGS format contains nested length elements, which could pose -a security risk to an incautious implementer. A SETTINGS frame parser MUST -ensure that the length of the frame exactly matches the length of the settings -it contains. 
+Several protocol elements contain nested length elements, typically in the form +of frames with an explicit length containing variable-length integers. This +could pose a security risk to an incautious implementer. An implementation MUST +ensure that the length of a frame exactly matches the length of the fields it +contains. # IANA Considerations From febb274725311d6b41f8880c250aabd7fe06e6c3 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 10:53:03 -0700 Subject: [PATCH 22/57] Keep H2 Considerations up-to-date --- draft-ietf-quic-http.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 53b8b00790..82898ce5cd 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -1608,7 +1608,7 @@ section describes the approach taken to design HTTP/QUIC, points out important differences from HTTP/2, and describes how to map HTTP/2 extensions into HTTP/QUIC. -HTTP/QUIC begins from the premise that HTTP/2 code reuse is a useful feature, +HTTP/QUIC begins from the premise that similarity to HTTP/2 is preferable, but not a hard requirement. HTTP/QUIC departs from HTTP/2 primarily where necessary to accommodate the differences in behavior between QUIC and TCP (lack of ordering, support for streams). We intend to avoid gratuitous changes which @@ -1720,7 +1720,7 @@ WINDOW_UPDATE (0x8): CONTINUATION (0x9): : CONTINUATION frames do not exist; instead, larger HEADERS/PUSH_PROMISE - frames than HTTP/2 are permitted, and HEADERS frames can be used in series. + frames than HTTP/2 are permitted. Frame types defined by extensions to HTTP/2 need to be separately registered for HTTP/QUIC if still applicable. The IDs of frames defined in {{!RFC7540}} have @@ -1739,7 +1739,7 @@ that are retained in HTTP/QUIC have the same value as in HTTP/2. Below is a listing of how each HTTP/2 SETTINGS parameter is mapped: SETTINGS_HEADER_TABLE_SIZE: -: See {{settings-parameters}}. 
+: See [QPACK]. SETTINGS_ENABLE_PUSH: : This is removed in favor of the MAX_PUSH_ID which provides a more granular @@ -1775,8 +1775,7 @@ settings defined in {{!RFC7540}} have been reserved for simplicity. See ## HTTP/2 Error Codes QUIC has the same concepts of "stream" and "connection" errors that HTTP/2 -provides. However, because the error code space is shared between multiple -components, there is no direct portability of HTTP/2 error codes. +provides. However, there is no direct portability of HTTP/2 error codes. The HTTP/2 error codes defined in Section 7 of {{!RFC7540}} map to the HTTP/QUIC error codes as follows: @@ -1803,17 +1802,17 @@ STREAM_CLOSED (0x5): QUIC_STREAM_DATA_AFTER_TERMINATION from the QUIC layer. FRAME_SIZE_ERROR (0x6): -: No single mapping. See new error codes defined in {{http-error-codes}}. +: HTTP_MALFORMED_FRAME error codes defined in {{http-error-codes}}. REFUSED_STREAM (0x7): : Not applicable, since QUIC handles stream management. Would provoke a - QUIC_TOO_MANY_OPEN_STREAMS from the QUIC layer. + STREAM_ID_ERROR from the QUIC layer. CANCEL (0x8): : HTTP_REQUEST_CANCELLED in {{http-error-codes}}. COMPRESSION_ERROR (0x9): -: HTTP_QPACK_DECOMPRESSION_FAILED in [QPACK]. +: Multiple error codes are defined in [QPACK]. CONNECT_ERROR (0xa): : HTTP_CONNECT_ERROR in {{http-error-codes}}. From f7e9118a3c97b0d1e523ac65af82b13c697ffe5a Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 10:54:49 -0700 Subject: [PATCH 23/57] has incorporates --- draft-ietf-quic-http.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 82898ce5cd..ca39ab458e 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -92,8 +92,8 @@ HTTP/2. HTTP/2 introduced a framing and multiplexing layer to improve latency without modifying the transport layer. However, TCP's lack of visibility into parallel requests in both mappings limited the possible performance gains. 
-The QUIC transport protocol has incorporates stream multiplexing and per-stream -flow control, similar to that provided by the HTTP/2 framing layer. By providing +The QUIC transport protocol incorporates stream multiplexing and per-stream flow +control, similar to that provided by the HTTP/2 framing layer. By providing reliability at the stream level and congestion control across the entire connection, it has the capability to improve the performance of HTTP compared to a TCP mapping. QUIC also incorporates TLS 1.3 at the transport layer, offering From b2d119eb5d6bb2ec9a8bf54a8d9ec8b389345fe0 Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 11:01:20 -0700 Subject: [PATCH 24/57] Use a/the consistently --- draft-ietf-quic-http.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index ca39ab458e..86d4ea4dfc 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -513,7 +513,7 @@ The PRIORITY frame payload has the following fields: Element Dependency ID: : A variable-length integer that identifies the element on which a dependency is being expressed. Depending on the value of Dependency Type, this contains - the Stream ID of a request stream, the Push ID of a promised resource, a + the Stream ID of a request stream, the Push ID of a promised resource, the Placeholder ID of a placeholder, or is ignored. For details of dependencies, see {{priority}} and {{!RFC7540}}, Section 5.3. 
From 5ebd7caa395ce2a91a6edc465b0bf63e7f5a5a41 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Fri, 12 Oct 2018 14:48:01 -0700 Subject: [PATCH 25/57] Moved stream stuff up and added intro text to describe structure of doc --- draft-ietf-quic-transport.md | 6833 +++++++++++++++++----------------- 1 file changed, 3421 insertions(+), 3412 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 968fd471d7..be2ba769a1 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -134,10 +134,19 @@ encrypts most of the data it exchanges, including its signaling. This allows the protocol to evolve without incurring a dependency on upgrades to middleboxes. -This document describes the core QUIC protocol, including the conceptual design, -wire format, and mechanisms of the QUIC protocol for connection establishment, -stream multiplexing, stream and connection-level flow control, connection -migration, and data reliability. +This document describes the core QUIC protocol, and is structured as follows: + +* Streams, QUIC's service abstraction to applications, including stream + multiplexing, and stream and connection-level flow control ({{streams}}, + {{stream-states}}, and {{flow-control}}); + +* Connections, including connection establishment, migration, and shutdown; + +* Packets and Frames, including QUIC's model and mechanics of reliability + (acknowledgements and retransmission) and packet sizing; + +* Wire format, including versioning, packet formats, frame formats, and error + codes. Accompanying documents describe QUIC's loss detection and congestion control {{QUIC-RECOVERY}}, and the use of TLS 1.3 for key negotiation {{QUIC-TLS}}. @@ -145,7 +154,7 @@ Accompanying documents describe QUIC's loss detection and congestion control QUIC version 1 conforms to the protocol invariants in {{QUIC-INVARIANTS}}. 
-# Conventions and Definitions +## Conventions and Definitions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this @@ -208,3173 +217,3358 @@ x (*) ... : Indicates that x is variable-length -# Versions {#versions} +# Streams: QUIC's Data Structuring Abstraction {#streams} -QUIC versions are identified using a 32-bit unsigned number. +Streams in QUIC provide a lightweight, ordered byte-stream abstraction. -The version 0x00000000 is reserved to represent version negotiation. This -version of the specification is identified by the number 0x00000001. +There are two basic types of stream in QUIC. Unidirectional streams carry data +in one direction: from the initiator of the stream to its peer; +bidirectional streams allow for data to be sent in both directions. Different +stream identifiers are used to distinguish between unidirectional and +bidirectional streams, as well as to create a separation between streams that +are initiated by the client and server (see {{stream-id}}). -Other versions of QUIC might have different properties to this version. The -properties of QUIC that are guaranteed to be consistent across all versions of -the protocol are described in {{QUIC-INVARIANTS}}. +Either type of stream can be created by either endpoint, can concurrently send +data interleaved with other streams, and can be cancelled. -Version 0x00000001 of QUIC uses TLS as a cryptographic handshake protocol, as -described in {{QUIC-TLS}}. +Stream offsets allow for the octets on a stream to be placed in order. An +endpoint MUST be capable of delivering data received on a stream in order. +Implementations MAY choose to offer the ability to deliver data out of order. +There is no means of ensuring ordering between octets on different streams. -Versions with the most significant 16 bits of the version number cleared are -reserved for use in future IETF consensus documents. 
+The creation and destruction of streams are expected to have minimal bandwidth +and computational cost. A single STREAM frame may create, carry data for, and +terminate a stream, or a stream may last the entire duration of a connection. -Versions that follow the pattern 0x?a?a?a?a are reserved for use in forcing -version negotiation to be exercised. That is, any version number where the low -four bits of all octets is 1010 (in binary). A client or server MAY advertise -support for any of these reserved versions. +Streams are individually flow controlled, allowing an endpoint to limit memory +commitment and to apply back pressure. The creation of streams is also flow +controlled, with each peer declaring the maximum stream ID it is willing to +accept at a given time. -Reserved version numbers will probably never represent a real protocol; a client -MAY use one of these version numbers with the expectation that the server will -initiate version negotiation; a server MAY advertise support for one of these -versions and can expect that clients ignore the value. +An alternative view of QUIC streams is as an elastic "message" abstraction, +similar to the way ephemeral streams are used in SST +{{?SST=DOI.10.1145/1282427.1282421}}, which may be a more appealing description +for some applications. -\[\[RFC editor: please remove the remainder of this section before -publication.]] -The version number for the final version of this specification (0x00000001), is -reserved for the version of the protocol that is published as an RFC. +## Stream Identifiers {#stream-id} -Version numbers used to identify IETF drafts are created by adding the draft -number to 0xff000000. For example, draft-ietf-quic-transport-13 would be -identified as 0xff00000D. +Streams are identified by an unsigned 62-bit integer, referred to as the Stream +ID. 
The least significant two bits of the Stream ID are used to identify the +type of stream (unidirectional or bidirectional) and the initiator of the +stream. -Implementors are encouraged to register version numbers of QUIC that they are -using for private experimentation on the GitHub wiki at -\. +The least significant bit (0x1) of the Stream ID identifies the initiator of the +stream. Clients initiate even-numbered streams (those with the least +significant bit set to 0); servers initiate odd-numbered streams (with the bit +set to 1). Separation of the stream identifiers ensures that client and server +are able to open streams without the latency imposed by negotiating for an +identifier. +If an endpoint receives a frame for a stream that it expects to initiate (i.e., +odd-numbered for the client or even-numbered for the server), but which it has +not yet opened, it MUST close the connection with error code STREAM_STATE_ERROR. -# Packet Types and Formats +The second least significant bit (0x2) of the Stream ID differentiates between +unidirectional streams and bidirectional streams. Unidirectional streams always +have this bit set to 1 and bidirectional streams have this bit set to 0. -We first describe QUIC's packet types and their formats, since some are -referenced in subsequent mechanisms. +The two type bits from a Stream ID therefore identify streams as summarized in +{{stream-id-types}}. -All numeric values are encoded in network byte order (that is, big-endian) and -all field sizes are in bits. When discussing individual bits of fields, the -least significant bit is referred to as bit 0. Hexadecimal notation is used for -describing the value of fields. 
+| Low Bits | Stream Type | +|:---------|:---------------------------------| +| 0x0 | Client-Initiated, Bidirectional | +| 0x1 | Server-Initiated, Bidirectional | +| 0x2 | Client-Initiated, Unidirectional | +| 0x3 | Server-Initiated, Unidirectional | +{: #stream-id-types title="Stream ID Types"} -Any QUIC packet has either a long or a short header, as indicated by the Header -Form bit. Long headers are expected to be used early in the connection before -version negotiation and establishment of 1-RTT keys. Short headers are minimal -version-specific headers, which are used after version negotiation and 1-RTT -keys are established. +The first bi-directional stream opened by the client is stream 0. -## Long Header {#long-header} +A QUIC endpoint MUST NOT reuse a Stream ID. Streams of each type are created in +numeric order. Streams that are used out of order result in opening all +lower-numbered streams of the same type in the same direction. -~~~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| Type (7) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Packet Number (8/16/32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Payload (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~~~ -{: #fig-long-header title="Long Header Packet Format"} +Stream IDs are encoded as a variable-length integer (see {{integer-encoding}}). 
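Editorial sketch (not part of the patch): the Stream ID layout added in this hunk can be illustrated with a few lines of Python. This is a minimal sketch, assuming only what the added text states: the two least significant bits carry the initiator and the directionality, and streams of each type are created in numeric order. The names `StreamType`, `stream_type`, `is_server_initiated`, `is_unidirectional`, and `nth_stream_id` are hypothetical and do not come from the draft.

```python
from enum import IntEnum


class StreamType(IntEnum):
    """Hypothetical names for the two low bits of a Stream ID (see the table above)."""
    CLIENT_BIDI = 0x0   # Client-Initiated, Bidirectional
    SERVER_BIDI = 0x1   # Server-Initiated, Bidirectional
    CLIENT_UNI = 0x2    # Client-Initiated, Unidirectional
    SERVER_UNI = 0x3    # Server-Initiated, Unidirectional


def stream_type(stream_id: int) -> StreamType:
    """Classify a Stream ID by its two least significant bits."""
    if not 0 <= stream_id < 2**62:
        raise ValueError("Stream ID must fit in an unsigned 62-bit integer")
    return StreamType(stream_id & 0x3)


def is_server_initiated(stream_id: int) -> bool:
    """Bit 0x1: 0 means client-initiated, 1 means server-initiated."""
    return bool(stream_id & 0x1)


def is_unidirectional(stream_id: int) -> bool:
    """Bit 0x2: 0 means bidirectional, 1 means unidirectional."""
    return bool(stream_id & 0x2)


def nth_stream_id(n: int, kind: StreamType) -> int:
    """Assumption: the n-th stream (counting from 0) of a type has ID 4*n + type bits."""
    if n < 0:
        raise ValueError("stream index must be non-negative")
    return (n << 2) | int(kind)
```

Under these assumptions, `nth_stream_id(0, StreamType.CLIENT_BIDI)` yields 0, which matches the statement that the first bidirectional stream opened by the client is stream 0.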
-Long headers are used for packets that are sent prior to the completion of -version negotiation and establishment of 1-RTT keys. Once both conditions are -met, a sender switches to sending packets using the short header -({{short-header}}). The long form allows for special packets - such as the -Version Negotiation packet - to be represented in this uniform fixed-length -packet format. Packets that use the long header contain the following fields: -Header Form: +## Stream Concurrency {#stream-concurrency} -: The most significant bit (0x80) of octet 0 (the first octet) is set to 1 for - long headers. +An endpoint limits the number of concurrently active incoming streams by +adjusting the maximum stream ID. An initial value is set in the transport +parameters (see {{transport-parameter-definitions}}) and is subsequently +increased by MAX_STREAM_ID frames (see {{frame-max-stream-id}}). -Long Packet Type: +The maximum stream ID is specific to each endpoint and applies only to the peer +that receives the setting. That is, clients specify the maximum stream ID the +server can initiate, and servers specify the maximum stream ID the client can +initiate. Each endpoint may respond on streams initiated by the other peer, +regardless of whether it is permitted to initiate new streams. -: The remaining seven bits of octet 0 contain the packet type. This field can - indicate one of 128 packet types. The types specified for this version are - listed in {{long-packet-types}}. +Endpoints MUST NOT exceed the limit set by their peer. An endpoint that +receives a STREAM frame with an ID greater than the limit it has sent MUST treat +this as a stream error of type STREAM_ID_ERROR ({{error-handling}}), unless this +is a result of a change in the initial limits (see {{zerortt-parameters}}). -Version: +A receiver cannot renege on an advertisement; that is, once a receiver +advertises a stream ID via a MAX_STREAM_ID frame, advertising a smaller maximum +ID has no effect. 
A sender MUST ignore any MAX_STREAM_ID frame that does not +increase the maximum stream ID. -: The QUIC Version is a 32-bit field that follows the Type. This field - indicates which version of QUIC is in use and determines how the rest of the - protocol fields are interpreted. -DCIL and SCIL: +## Sending and Receiving Data -: The octet following the version contains the lengths of the two connection ID - fields that follow it. These lengths are encoded as two 4-bit unsigned - integers. The Destination Connection ID Length (DCIL) field occupies the 4 - high bits of the octet and the Source Connection ID Length (SCIL) field - occupies the 4 low bits of the octet. An encoded length of 0 indicates that - the connection ID is also 0 octets in length. Non-zero encoded lengths are - increased by 3 to get the full length of the connection ID, producing a length - between 4 and 18 octets inclusive. For example, an octet with the value 0x50 - describes an 8-octet Destination Connection ID and a zero-length Source - Connection ID. +Once a stream is created, endpoints may use the stream to send and receive data. +Each endpoint may send a series of STREAM frames encapsulating data on a stream +until the stream is terminated in that direction. Streams are an ordered +byte-stream abstraction, and they have no other structure within them. STREAM +frame boundaries are not expected to be preserved in retransmissions from the +sender or during delivery to the application at the receiver. -Destination Connection ID: +When new data is to be sent on a stream, a sender MUST set the encapsulating +STREAM frame's offset field to the stream offset of the first byte of this new +data. The first octet of data on a stream has an offset of 0. An endpoint is +expected to send every stream octet. The largest offset delivered on a stream +MUST be less than 2^62. -: The Destination Connection ID field follows the connection ID lengths and is - either 0 octets in length or between 4 and 18 octets. 
- {{connection-id-encoding}} describes the use of this field in more detail. +QUIC makes no specific allowances for partial reliability or delivery of stream +data out of order. Endpoints MUST be able to deliver stream data to an +application as an ordered byte-stream. Delivering an ordered byte-stream +requires that an endpoint buffer any data that is received out of order, up to +the advertised flow control limit. -Source Connection ID: +An endpoint could receive the same octets multiple times; octets that have +already been received can be discarded. The value for a given octet MUST NOT +change if it is sent multiple times; an endpoint MAY treat receipt of a changed +octet as a connection error of type PROTOCOL_VIOLATION. -: The Source Connection ID field follows the Destination Connection ID and is - either 0 octets in length or between 4 and 18 octets. - {{connection-id-encoding}} describes the use of this field in more detail. +An endpoint MUST NOT send data on any stream without ensuring that it is within +the data limits set by its peer. -Length: +Flow control is described in detail in {{flow-control}}, and congestion control +is described in the companion document {{QUIC-RECOVERY}}. -: The length of the remainder of the packet (that is, the Packet Number and - Payload fields) in octets, encoded as a variable-length integer - ({{integer-encoding}}). -Packet Number: +## Stream Prioritization -: The packet number field is 1, 2, or 4 octets long. The packet number has - confidentiality protection separate from packet protection, as described - in Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is - encoded in the plaintext packet number. See {{packet-numbers}} for details. +Stream multiplexing has a significant effect on application performance if +resources allocated to streams are correctly prioritized. 
Experience with other +multiplexed protocols, such as HTTP/2 {{?HTTP2}}, shows that effective +prioritization strategies have a significant positive impact on performance. -Payload: +QUIC does not provide frames for exchanging prioritization information. Instead +it relies on receiving priority information from the application that uses QUIC. +Protocols that use QUIC are able to define any prioritization scheme that suits +their application semantics. A protocol might define explicit messages for +signaling priority, such as those defined in HTTP/2; it could define rules that +allow an endpoint to determine priority based on context; or it could leave the +determination to the application. -: The payload of the packet. +A QUIC implementation SHOULD provide ways in which an application can indicate +the relative priority of streams. When deciding which streams to dedicate +resources to, QUIC SHOULD use the information provided by the application. +Failure to account for priority of streams can result in suboptimal performance. -The following packet types are defined: +Stream priority is most relevant when deciding which stream data will be +transmitted. Often, there will be limits on what can be transmitted as a result +of connection flow control or the current congestion controller state. -| Type | Name | Section | -|:-----|:------------------------------|:----------------------------| -| 0x7F | Initial | {{packet-initial}} | -| 0x7E | Retry | {{packet-retry}} | -| 0x7D | Handshake | {{packet-handshake}} | -| 0x7C | 0-RTT Protected | {{packet-protected}} | -{: #long-packet-types title="Long Header Packet Types"} +Giving preference to the transmission of its own management frames ensures that +the protocol functions efficiently. That is, prioritizing frames other than +STREAM frames ensures that loss recovery, congestion control, and flow control +operate effectively. 
-The header form, type, connection ID lengths octet, destination and source -connection IDs, and version fields of a long header packet are -version-independent. The packet number and values for packet types defined in -{{long-packet-types}} are version-specific. See {{QUIC-INVARIANTS}} for details -on how packets from different versions of QUIC are interpreted. +CRYPTO frames SHOULD be prioritized over other streams prior to the completion +of the cryptographic handshake. This includes the retransmission of the second +flight of client handshake messages, that is, the TLS Finished and any client +authentication messages. -The interpretation of the fields and the payload are specific to a version and -packet type. Type-specific semantics for this version are described in the -following sections. +STREAM data in frames determined to be lost SHOULD be retransmitted before +sending new data, unless application priorities indicate otherwise. +Retransmitting lost stream data can fill in gaps, which allows the peer to +consume already received data and free up the flow control window. -The end of the packet is determined by the Length field. The Length field -covers both the Packet Number and Payload fields, both of which are -confidentiality protected and initially of unknown length. The size of the -Payload field is learned once the packet number protection is removed. -Senders can sometimes coalesce multiple packets into one UDP datagram. See -{{packet-coalesce}} for more details. +# Stream States: Life of a Stream {#stream-states} +This section describes the two types of QUIC stream in terms of the states of +their send or receive components. Two state machines are described: one for +streams on which an endpoint transmits data ({{stream-send-states}}); another +for streams from which an endpoint receives data ({{stream-recv-states}}). -## Short Header +Unidirectional streams use the applicable state machine directly. Bidirectional +streams use both state machines. 
For the most part, the use of these state +machines is the same whether the stream is unidirectional or bidirectional. The +conditions for opening a stream are slightly more complex for a bidirectional +stream because the opening of either send or receive sides causes the stream +to open in both directions. -~~~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|0|K|1|1|0|R R R| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Packet Number (8/16/32) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Protected Payload (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~~~ -{: #fig-short-header title="Short Header Packet Format"} +An endpoint can open streams up to its maximum stream limit in any order, +however endpoints SHOULD open the send side of streams for each type in order. -The short header can be used after the version and 1-RTT keys are negotiated. -Packets that use the short header contain the following fields: +Note: -Header Form: +: These states are largely informative. This document uses stream states to + describe rules for when and how different types of frames can be sent and the + reactions that are expected when different types of frames are received. + Though these state machines are intended to be useful in implementing QUIC, + these states aren't intended to constrain implementations. An implementation + can define a different state machine as long as its behavior is consistent + with an implementation that implements these states. -: The most significant bit (0x80) of octet 0 is set to 0 for the short header. 
-Key Phase Bit: +## Send Stream States {#stream-send-states} -: The second bit (0x40) of octet 0 indicates the key phase, which allows a - recipient of a packet to identify the packet protection keys that are used to - protect the packet. See {{QUIC-TLS}} for details. +{{fig-stream-send-states}} shows the states for the part of a stream that sends +data to a peer. -\[\[Editor's Note: this section should be removed and the bit definitions -changed before this draft goes to the IESG.]] +~~~ + o + | Create Stream (Sending) + | Create Bidirectional Stream (Receiving) + v + +-------+ + | Ready | Send RST_STREAM + | |-----------------------. + +-------+ | + | | + | Send STREAM / | + | STREAM_BLOCKED | + | | + | Create Bidirectional | + | Stream (Receiving) | + v | + +-------+ | + | Send | Send RST_STREAM | + | |---------------------->| + +-------+ | + | | + | Send STREAM + FIN | + v v + +-------+ +-------+ + | Data | Send RST_STREAM | Reset | + | Sent |------------------>| Sent | + +-------+ +-------+ + | | + | Recv All ACKs | Recv ACK + v v + +-------+ +-------+ + | Data | | Reset | + | Recvd | | Recvd | + +-------+ +-------+ +~~~ +{: #fig-stream-send-states title="States for Send Streams"} -Third Bit: +The sending part of stream that the endpoint initiates (types 0 and 2 for +clients, 1 and 3 for servers) is opened by the application or application +protocol. The "Ready" state represents a newly created stream that is able to +accept data from the application. Stream data might be buffered in this state +in preparation for sending. -: The third bit (0x20) of octet 0 is set to 1. +Sending the first STREAM or STREAM_BLOCKED frame causes a send stream to enter +the "Send" state. An implementation might choose to defer allocating a Stream +ID to a send stream until it sends the first frame and enters this state, which +can allow for better stream prioritization. 
-\[\[Editor's Note: this section should be removed and the bit definitions -changed before this draft goes to the IESG.]] +The sending part of a bidirectional stream initiated by a peer (type 0 for a +server, type 1 for a client) enters the "Ready" state then immediately +transitions to the "Send" state if the receiving part enters the "Recv" state. -Fourth Bit: +In the "Send" state, an endpoint transmits - and retransmits as necessary - data +in STREAM frames. The endpoint respects the flow control limits of its peer, +accepting MAX_STREAM_DATA frames. An endpoint in the "Send" state generates +STREAM_BLOCKED frames if it encounters flow control limits. -: The fourth bit (0x10) of octet 0 is set to 1. +After the application indicates that stream data is complete and a STREAM frame +containing the FIN bit is sent, the send stream enters the "Data Sent" state. +From this state, the endpoint only retransmits stream data as necessary. The +endpoint no longer needs to track flow control limits or send STREAM_BLOCKED +frames for a send stream in this state. The endpoint can ignore any +MAX_STREAM_DATA frames it receives from its peer in this state; MAX_STREAM_DATA +frames might be received until the peer receives the final stream offset. -\[\[Editor's Note: this section should be removed and the bit definitions -changed before this draft goes to the IESG.]] +Once all stream data has been successfully acknowledged, the send stream enters +the "Data Recvd" state, which is a terminal state. -Google QUIC Demultiplexing Bit: +From any of the "Ready", "Send", or "Data Sent" states, an application can +signal that it wishes to abandon transmission of stream data. Similarly, the +endpoint might receive a STOP_SENDING frame from its peer. In either case, the +endpoint sends a RST_STREAM frame, which causes the stream to enter the "Reset +Sent" state. -: The fifth bit (0x8) of octet 0 is set to 0. 
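Editorial sketch (not part of the patch): the send-side transitions shown in the "States for Send Streams" figure above can be written as a lookup table. This is a rough illustration under stated assumptions, not an implementation of the draft. The names `SendState`, `Event`, `TRANSITIONS`, and `next_state` are hypothetical, the first frame carrying FIN is modelled as two separate events, and retransmissions and flow control are ignored.

```python
from enum import Enum, auto


class SendState(Enum):
    """States from the send stream figure; the Python names are hypothetical."""
    READY = auto()
    SEND = auto()
    DATA_SENT = auto()
    DATA_RECVD = auto()    # terminal
    RESET_SENT = auto()
    RESET_RECVD = auto()   # terminal


class Event(Enum):
    """Simplified events; each corresponds to one edge label in the figure."""
    SEND_STREAM = auto()   # send STREAM or STREAM_BLOCKED
    SEND_FIN = auto()      # send STREAM with the FIN bit
    ALL_ACKED = auto()     # all stream data acknowledged
    SEND_RST = auto()      # send RST_STREAM
    RST_ACKED = auto()     # RST_STREAM acknowledged


# (current state, event) -> next state, as read from the figure.
TRANSITIONS = {
    (SendState.READY, Event.SEND_STREAM): SendState.SEND,
    (SendState.SEND, Event.SEND_STREAM): SendState.SEND,
    (SendState.SEND, Event.SEND_FIN): SendState.DATA_SENT,
    (SendState.DATA_SENT, Event.ALL_ACKED): SendState.DATA_RECVD,
    (SendState.READY, Event.SEND_RST): SendState.RESET_SENT,
    (SendState.SEND, Event.SEND_RST): SendState.RESET_SENT,
    (SendState.DATA_SENT, Event.SEND_RST): SendState.RESET_SENT,
    (SendState.RESET_SENT, Event.RST_ACKED): SendState.RESET_RECVD,
}


def next_state(state: SendState, event: Event) -> SendState:
    """Return the next state, or raise if the figure has no such edge."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event.name}") from None
```

A table keeps the sketch close to the figure: each entry is one arrow, and anything not listed is rejected. The draft itself notes that these states are largely informative, so a real implementation is free to structure this differently.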
This allows implementations of - Google QUIC to distinguish Google QUIC packets from short header packets sent - by a client because Google QUIC servers expect the connection ID to always be - present. - The special interpretation of this bit SHOULD be removed from this - specification when Google QUIC has finished transitioning to the new header - format. +An endpoint MAY send a RST_STREAM as the first frame on a send stream; this +causes the send stream to open and then immediately transition to the "Reset +Sent" state. -Reserved: +Once a packet containing a RST_STREAM has been acknowledged, the send stream +enters the "Reset Recvd" state, which is a terminal state. -: The sixth, seventh, and eighth bits (0x7) of octet 0 are reserved for - experimentation. Endpoints MUST ignore these bits on packets they receive - unless they are participating in an experiment that uses these bits. An - endpoint not actively using these bits SHOULD set the value randomly on - packets they send to protect against unwanted inference about particular - values. -Destination Connection ID: +## Receive Stream States {#stream-recv-states} -: The Destination Connection ID is a connection ID that is chosen by the - intended recipient of the packet. See {{connection-id}} for more details. +{{fig-stream-recv-states}} shows the states for the part of a stream that +receives data from a peer. The states for a receive stream mirror only some of +the states of the send stream at the peer. A receive stream doesn't track +states on the send stream that cannot be observed, such as the "Ready" state; +instead, receive streams track the delivery of data to the application or +application protocol some of which cannot be observed by the sender. -Packet Number: +~~~ + o + | Recv STREAM / STREAM_BLOCKED / RST_STREAM + | Create Bidirectional Stream (Sending) + | Recv MAX_STREAM_DATA + | Create Higher-Numbered Stream + v + +-------+ + | Recv | Recv RST_STREAM + | |-----------------------. 
+ +-------+ | + | | + | Recv STREAM + FIN | + v | + +-------+ | + | Size | Recv RST_STREAM | + | Known |---------------------->| + +-------+ | + | | + | Recv All Data | + v v + +-------+ Recv RST_STREAM +-------+ + | Data |--- (optional) --->| Reset | + | Recvd | Recv All Data | Recvd | + +-------+<-- (optional) ----+-------+ + | | + | App Read All Data | App Read RST + v v + +-------+ +-------+ + | Data | | Reset | + | Read | | Read | + +-------+ +-------+ +~~~ +{: #fig-stream-recv-states title="States for Receive Streams"} -: The packet number field is 1, 2, or 4 octets long. The packet number has - confidentiality protection separate from packet protection, as described in - Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is encoded - in the plaintext packet number. See {{packet-numbers}} for details. +The receiving part of a stream initiated by a peer (types 1 and 3 for a client, +or 0 and 2 for a server) are created when the first STREAM, STREAM_BLOCKED, +RST_STREAM, or MAX_STREAM_DATA (bidirectional only, see below) is received for +that stream. The initial state for a receive stream is "Recv". Receiving a +RST_STREAM frame causes the receive stream to immediately transition to the +"Reset Recvd". -Protected Payload: +The receive stream enters the "Recv" state when the sending part of a +bidirectional stream initiated by the endpoint (type 0 for a client, type 1 for +a server) enters the "Ready" state. -: Packets with a short header always include a 1-RTT protected payload. +A bidirectional stream also opens when a MAX_STREAM_DATA frame is received. +Receiving a MAX_STREAM_DATA frame implies that the remote peer has opened the +stream and is providing flow control credit. A MAX_STREAM_DATA frame might +arrive before a STREAM or STREAM_BLOCKED frame if packets are lost or reordered. -The header form and connection ID field of a short header packet are -version-independent. The remaining fields are specific to the selected QUIC -version. 
See {{QUIC-INVARIANTS}} for details on how packets from different -versions of QUIC are interpreted. +Before creating a stream, all lower-numbered streams of the same type MUST be +created. That means that receipt of a frame that would open a stream causes all +lower-numbered streams of the same type to be opened in numeric order. This +ensures that the creation order for streams is consistent on both endpoints. +In the "Recv" state, the endpoint receives STREAM and STREAM_BLOCKED frames. +Incoming data is buffered and can be reassembled into the correct order for +delivery to the application. As data is consumed by the application and buffer +space becomes available, the endpoint sends MAX_STREAM_DATA frames to allow the +peer to send more data. -## Version Negotiation Packet {#packet-version} +When a STREAM frame with a FIN bit is received, the final offset (see +{{final-offset}}) is known. The receive stream enters the "Size Known" state. +In this state, the endpoint no longer needs to send MAX_STREAM_DATA frames, it +only receives any retransmissions of stream data. -A Version Negotiation packet is inherently not version-specific, and does not -use the long packet header (see {{long-header}}. Upon receipt by a client, it -will appear to be a packet using the long header, but will be identified as a -Version Negotiation packet based on the Version field having a value of 0. +Once all data for the stream has been received, the receive stream enters the +"Data Recvd" state. This might happen as a result of receiving the same STREAM +frame that causes the transition to "Size Known". In this state, the endpoint +has all stream data. Any STREAM or STREAM_BLOCKED frames it receives for the +stream can be discarded. -The Version Negotiation packet is a response to a client packet that contains a -version that is not supported by the server, and is only sent by servers. 
+The "Data Recvd" state persists until stream data has been delivered to the +application or application protocol. Once stream data has been delivered, the +stream enters the "Data Read" state, which is a terminal state. -The layout of a Version Negotiation packet is: +Receiving a RST_STREAM frame in the "Recv" or "Size Known" states causes the +stream to enter the "Reset Recvd" state. This might cause the delivery of +stream data to the application to be interrupted. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| Unused (7) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Supported Version 1 (32) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| [Supported Version 2 (32)] ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| [Supported Version N (32)] ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #version-negotiation-format title="Version Negotiation Packet"} - -The value in the Unused field is selected randomly by the server. +It is possible that all stream data is received when a RST_STREAM is received +(that is, from the "Data Recvd" state). Similarly, it is possible for remaining +stream data to arrive after receiving a RST_STREAM frame (the "Reset Recvd" +state). An implementation is able to manage this situation as they choose. 
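The receive-side figure can be sketched the same way. This is a hedged illustration, not part of the patch: the event names and the `take_optional` flag are assumptions made for this example. The flag models the two transitions the figure marks as optional, which an implementation may or may not take.

```python
from enum import Enum


class RecvState(Enum):
    RECV = "Recv"
    SIZE_KNOWN = "Size Known"
    DATA_RECVD = "Data Recvd"
    RESET_RECVD = "Reset Recvd"
    DATA_READ = "Data Read"
    RESET_READ = "Reset Read"


# (current state, event) -> next state, read off the receive-stream figure.
# Event names are hypothetical labels, not protocol identifiers.
RECV_TRANSITIONS = {
    (RecvState.RECV, "recv_stream_fin"): RecvState.SIZE_KNOWN,
    (RecvState.RECV, "recv_rst_stream"): RecvState.RESET_RECVD,
    (RecvState.SIZE_KNOWN, "recv_all_data"): RecvState.DATA_RECVD,
    (RecvState.SIZE_KNOWN, "recv_rst_stream"): RecvState.RESET_RECVD,
    (RecvState.DATA_RECVD, "app_read_all_data"): RecvState.DATA_READ,
    (RecvState.DATA_RECVD, "recv_rst_stream"): RecvState.RESET_RECVD,
    (RecvState.RESET_RECVD, "recv_all_data"): RecvState.DATA_RECVD,
    (RecvState.RESET_RECVD, "app_read_rst"): RecvState.RESET_READ,
}

# Transitions the figure labels "(optional)".
OPTIONAL = {
    (RecvState.DATA_RECVD, "recv_rst_stream"),
    (RecvState.RESET_RECVD, "recv_all_data"),
}


def next_recv_state(state, event, take_optional=False):
    """Return the next receive state; optional transitions are skipped by default."""
    key = (state, event)
    if key in OPTIONAL and not take_optional:
        return state  # the implementation chooses to stay where it is
    if key not in RECV_TRANSITIONS:
        raise ValueError(f"no transition from {state.value!r} on {event!r}")
    return RECV_TRANSITIONS[key]
```

With `take_optional=False`, a RST_STREAM that arrives after all data has been received leaves the stream in "Data Recvd", so the application still reads the complete data.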
+Sending RST_STREAM means that an endpoint cannot guarantee delivery of stream +data; however there is no requirement that stream data not be delivered if a +RST_STREAM is received. An implementation MAY interrupt delivery of stream +data, discard any data that was not consumed, and signal the existence of the +RST_STREAM immediately. Alternatively, the RST_STREAM signal might be +suppressed or withheld if stream data is completely received. In the latter +case, the receive stream effectively transitions to "Data Recvd" from "Reset +Recvd". -The Version field of a Version Negotiation packet MUST be set to 0x00000000. +Once the application has been delivered the signal indicating that the receive +stream was reset, the receive stream transitions to the "Reset Read" state, +which is a terminal state. -The server MUST include the value from the Source Connection ID field of the -packet it receives in the Destination Connection ID field. The value for Source -Connection ID MUST be copied from the Destination Connection ID of the received -packet, which is initially randomly selected by a client. Echoing both -connection IDs gives clients some assurance that the server received the packet -and that the Version Negotiation packet was not generated by an off-path -attacker. -The remainder of the Version Negotiation packet is a list of 32-bit versions -which the server supports. +## Permitted Frame Types -A Version Negotiation packet cannot be explicitly acknowledged in an ACK frame -by a client. Receiving another Initial packet implicitly acknowledges a Version -Negotiation packet. +The sender of a stream sends just three frame types that affect the state of a +stream at either sender or receiver: STREAM ({{frame-stream}}), STREAM_BLOCKED +({{frame-stream-blocked}}), and RST_STREAM ({{frame-rst-stream}}). -The Version Negotiation packet does not include the Packet Number and Length -fields present in other packets that use the long header form. 
Consequently, -a Version Negotiation packet consumes an entire UDP datagram. +A sender MUST NOT send any of these frames from a terminal state ("Data Recvd" +or "Reset Recvd"). A sender MUST NOT send STREAM or STREAM_BLOCKED after +sending a RST_STREAM; that is, in the "Reset Sent" state in addition to the +terminal states. A receiver could receive any of these frames in any state, but +only due to the possibility of delayed delivery of packets carrying them. -See {{version-negotiation}} for a description of the version negotiation -process. +The receiver of a stream sends MAX_STREAM_DATA ({{frame-max-stream-data}}) and +STOP_SENDING frames ({{frame-stop-sending}}). +The receiver only sends MAX_STREAM_DATA in the "Recv" state. A receiver can +send STOP_SENDING in any state where it has not received a RST_STREAM frame; +that is states other than "Reset Recvd" or "Reset Read". However there is +little value in sending a STOP_SENDING frame after all stream data has been +received in the "Data Recvd" state. A sender could receive these frames in any +state as a result of delayed delivery of packets. -## Retry Packet {#packet-retry} -A Retry packet uses a long packet header with a type value of 0x7E. It carries -an address validation token created by the server. It is used by a server that -wishes to perform a stateless retry (see {{stateless-retry}}). +## Bidirectional Stream States {#stream-bidi-states} -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| 0x7e | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ODCIL(8) | Original Destination Connection ID (*) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Retry Token (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #retry-format title="Retry Packet"} +A bidirectional stream is composed of a send stream and a receive stream. +Implementations may represent states of the bidirectional stream as composites +of send and receive stream states. The simplest model presents the stream as +"open" when either send or receive stream is in a non-terminal state and +"closed" when both send and receive streams are in a terminal state. -A Retry packet (shown in {{retry-format}}) only uses the invariant portion of -the long packet header {{QUIC-INVARIANTS}}; that is, the fields up to and -including the Destination and Source Connection ID fields. A Retry packet does -not contain any protected fields. Like Version Negotiation, a Retry packet -contains the long header including the connection IDs, but omits the Length, -Packet Number, and Payload fields. These are replaced with: +{{stream-bidi-mapping}} shows a more complex mapping of bidirectional stream +states that loosely correspond to the stream states in HTTP/2 +{{?HTTP2=RFC7540}}. This shows that multiple states on send or receive streams +are mapped to the same composite state. Note that this is just one possibility +for such a mapping; this mapping requires that data is acknowledged before the +transition to a "closed" or "half-closed" state. 
-ODCIL: +| Send Stream | Receive Stream | Composite State | +|:-----------------------|:-----------------------|:---------------------| +| No Stream/Ready | No Stream/Recv *1 | idle | +| Ready/Send/Data Sent | Recv/Size Known | open | +| Ready/Send/Data Sent | Data Recvd/Data Read | half-closed (remote) | +| Ready/Send/Data Sent | Reset Recvd/Reset Read | half-closed (remote) | +| Data Recvd | Recv/Size Known | half-closed (local) | +| Reset Sent/Reset Recvd | Recv/Size Known | half-closed (local) | +| Data Recvd | Recv/Size Known | half-closed (local) | +| Reset Sent/Reset Recvd | Data Recvd/Data Read | closed | +| Reset Sent/Reset Recvd | Reset Recvd/Reset Read | closed | +| Data Recvd | Data Recvd/Data Read | closed | +| Data Recvd | Reset Recvd/Reset Read | closed | +{: #stream-bidi-mapping title="Possible Mapping of Stream States to HTTP/2"} -: The length of the Original Destination Connection ID field. The length is - encoded in the least significant 4 bits of the octet, using the same encoding - as the DCIL and SCIL fields. The most significant 4 bits of this octet are - reserved. Unless a use for these bits has been negotiated, endpoints SHOULD - send randomized values and MUST ignore any value that it receives. +Note (*1): -Original Destination Connection ID: +: A stream is considered "idle" if it has not yet been created, or if the + receive stream is in the "Recv" state without yet having received any frames. -: The Original Destination Connection ID contains the value of the Destination - Connection ID from the Initial packet that this Retry is in response to. The - length of this field is given in ODCIL. -Retry Token: +## Solicited State Transitions -: An opaque token that the server can use to validate the client's address. +If an endpoint is no longer interested in the data it is receiving on a stream, +it MAY send a STOP_SENDING frame identifying that stream to prompt closure of +the stream in the opposite direction. 
This typically indicates that the +receiving application is no longer reading data it receives from the stream, but +is not a guarantee that incoming data will be ignored. -The server populates the Destination Connection ID with the connection ID that -the client included in the Source Connection ID of the Initial packet. +STREAM frames received after sending STOP_SENDING are still counted toward the +connection and stream flow-control windows, even though these frames will be +discarded upon receipt. This avoids potential ambiguity about which STREAM +frames count toward flow control. -The server includes a connection ID of its choice in the Source Connection ID -field. This value MUST not be equal to the Destination Connection ID field of -the packet sent by the client. The client MUST use this connection ID in the -Destination Connection ID of subsequent packets that it sends. +A STOP_SENDING frame requests that the receiving endpoint send a RST_STREAM +frame. An endpoint that receives a STOP_SENDING frame MUST send a RST_STREAM +frame for that stream, and can use an error code of STOPPING. If the +STOP_SENDING frame is received on a send stream that is already in the "Data +Sent" state, a RST_STREAM frame MAY still be sent in order to cancel +retransmission of previously-sent STREAM frames. -A server MAY send Retry packets in response to Initial and 0-RTT packets. A -server can either discard or buffer 0-RTT packets that it receives. A server -can send multiple Retry packets as it receives Initial or 0-RTT packets. +STOP_SENDING SHOULD only be sent for a receive stream that has not been +reset. STOP_SENDING is most useful for streams in the "Recv" or "Size Known" +states. -A client MUST accept and process at most one Retry packet for each connection -attempt. After the client has received and processed an Initial or Retry packet -from the server, it MUST discard any subsequent Retry packets that it receives. 
+An endpoint is expected to send another STOP_SENDING frame if a packet +containing a previous STOP_SENDING is lost. However, once either all stream +data or a RST_STREAM frame has been received for the stream - that is, the +stream is in any state other than "Recv" or "Size Known" - sending a +STOP_SENDING frame is unnecessary. -Clients MUST discard Retry packets that contain an Original Destination -Connection ID field that does not match the Destination Connection ID from its -Initial packet. This prevents an off-path attacker from injecting a Retry -packet. -The client responds to a Retry packet with an Initial packet that includes the -provided Retry Token to continue connection establishment. +# Flow Control {#flow-control} -A client sets the Destination Connection ID field of this Initial packet to the -value from the Source Connection ID in the Retry packet. Changing Destination -Connection ID also results in a change to the keys used to protect the Initial -packet. It also sets the Token field to the token provided in the Retry. The -client MUST NOT change the Source Connection ID because the server could include -the connection ID as part of its token validation logic (see {{tokens}}). +It is necessary to limit the amount of data that a sender may have outstanding +at any time, so as to prevent a fast sender from overwhelming a slow receiver, +or to prevent a malicious sender from consuming significant resources at a +receiver. This section describes QUIC's flow-control mechanisms. -All subsequent Initial packets from the client MUST use the connection ID and -token values from the Retry packet. Aside from this, the Initial packet sent -by the client is subject to the same restrictions as the first Initial packet. -A client can either reuse the cryptographic handshake message or construct a -new one at its discretion. +QUIC employs a credit-based flow-control scheme similar to HTTP/2's flow control +{{?HTTP2}}. 
A receiver advertises the number of octets it is prepared to +receive on a given stream and for the entire connection. This leads to two +levels of flow control in QUIC: (i) Connection flow control, which prevents +senders from exceeding a receiver's buffer capacity for the connection, and (ii) +Stream flow control, which prevents a single stream from consuming the entire +receive buffer for a connection. -A client MAY attempt 0-RTT after receiving a Retry packet by sending 0-RTT -packets to the connection ID provided by the server. A client that sends -additional 0-RTT packets without constructing a new cryptographic handshake -message MUST NOT reset the packet number to 0 after a Retry packet, see -{{retry-0rtt-pn}}. +A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the sender +to advertise additional credit. MAX_STREAM_DATA frames send the +maximum absolute byte offset of a stream, while MAX_DATA sends the +maximum of the sum of the absolute byte offsets of all streams. -A server acknowledges the use of a Retry packet for a connection using the -original_connection_id transport parameter (see -{{transport-parameter-definitions}}). If the server sends a Retry packet, it -MUST include the value of the Original Destination Connection ID field of the -Retry packet (that is, the Destination Connection ID field from the client's -first Initial packet) in the transport parameter. +A receiver MAY advertise a larger offset at any point by sending MAX_DATA or +MAX_STREAM_DATA frames. A receiver cannot renege on an advertisement; that is, +once a receiver advertises an offset, advertising a smaller offset has no +effect. A sender MUST therefore ignore any MAX_DATA or MAX_STREAM_DATA frames +that do not increase flow control limits. -If the client received and processed a Retry packet, it validates that the -original_connection_id transport parameter is present and correct; otherwise, it -validates that the transport parameter is absent. 
A client MUST treat a failed -validation as a connection error of type TRANSPORT_PARAMETER_ERROR. +A receiver MUST close the connection with a FLOW_CONTROL_ERROR error +({{error-handling}}) if the peer violates the advertised connection or stream +data limits. -A Retry packet does not include a packet number and cannot be explicitly -acknowledged by a client. +A sender SHOULD send BLOCKED or STREAM_BLOCKED frames to indicate it has data to +write but is blocked by flow control limits. These frames are expected to be +sent infrequently in common cases, but they are considered useful for debugging +and monitoring purposes. +A receiver advertises credit for a stream by sending a MAX_STREAM_DATA frame +with the Stream ID set appropriately. A receiver could use the current offset of +data consumed to determine the flow control offset to be advertised. A receiver +MAY send MAX_STREAM_DATA frames in multiple packets in order to make sure that +the sender receives an update before running out of flow control credit, even if +one of the packets is lost. -## Cryptographic Handshake Packets {#handshake-packets} +Connection flow control is a limit to the total bytes of stream data sent in +STREAM frames on all streams. A receiver advertises credit for a connection by +sending a MAX_DATA frame. A receiver maintains a cumulative sum of bytes +received on all contributing streams, which are used to check for flow control +violations. A receiver might use a sum of bytes consumed on all contributing +streams to determine the maximum data limit to be advertised. -Once version negotiation is complete, the cryptographic handshake is used to -agree on cryptographic keys. The cryptographic handshake is carried in Initial -({{packet-initial}}) and Handshake ({{packet-handshake}}) packets. +## Edge Cases and Other Considerations -All these packets use the long header and contain the current QUIC version in -the version field. 
+There are some edge cases which must be considered when dealing with stream and +connection level flow control. Given enough time, both endpoints must agree on +flow control state. If one end believes it can send more than the other end is +willing to receive, the connection will be torn down when too much data arrives. -In order to prevent tampering by version-unaware middleboxes, Initial -packets are protected with connection- and version-specific keys -(Initial keys) as described in {{QUIC-TLS}}. This protection does not -provide confidentiality or integrity against on-path attackers, but -provides some level of protection against off-path attackers. +Conversely if a sender believes it is blocked, while endpoint B expects more +data can be received, then the connection can be in a deadlock, with the sender +waiting for a MAX_DATA or MAX_STREAM_DATA frame which will never come. +On receipt of a RST_STREAM frame, an endpoint will tear down state for the +matching stream and ignore further data arriving on that stream. This could +result in the endpoints getting out of sync, since the RST_STREAM frame may have +arrived out of order and there may be further bytes in flight. The data sender +would have counted the data against its connection level flow control budget, +but a receiver that has not received these bytes would not know to include them +as well. The receiver must learn the number of bytes that were sent on the +stream to make the same adjustment in its connection flow controller. -## Initial Packet {#packet-initial} +To avoid this de-synchronization, a RST_STREAM sender MUST include the final +byte offset sent on the stream in the RST_STREAM frame. On receiving a +RST_STREAM frame, a receiver definitively knows how many bytes were sent on that +stream before the RST_STREAM frame, and the receiver MUST use the final offset +to account for all bytes sent on the stream in its connection level flow +controller. 
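The accounting described above can be illustrated with a small receiver-side sketch. It is not part of the patch: the class `ConnectionFlowAccounting`, its method names, and the `FlowControlError` exception are hypothetical, and validation of inconsistent final offsets is deliberately left out.

```python
class FlowControlError(Exception):
    """Hypothetical stand-in for a FLOW_CONTROL_ERROR connection error."""


class ConnectionFlowAccounting:
    """Receiver-side sum of bytes counted against the connection limit."""

    def __init__(self, max_data):
        self.max_data = max_data      # connection limit this receiver advertised
        self.highest_offset = {}      # stream_id -> highest byte offset seen
        self.total_received = 0       # sum of highest offsets over all streams

    def _advance(self, stream_id, new_offset):
        old = self.highest_offset.get(stream_id, 0)
        if new_offset <= old:
            return                    # reordered or retransmitted data adds nothing
        delta = new_offset - old
        if self.total_received + delta > self.max_data:
            raise FlowControlError("peer exceeded the connection data limit")
        self.total_received += delta
        self.highest_offset[stream_id] = new_offset

    def on_stream_frame(self, stream_id, offset, length):
        self._advance(stream_id, offset + length)

    def on_rst_stream(self, stream_id, final_offset):
        # The final offset accounts for bytes that were sent but never arrived,
        # which keeps both endpoints' connection-level totals in agreement.
        self._advance(stream_id, final_offset)
```

For example, if 100 bytes have arrived on a stream and a RST_STREAM then reports a final offset of 300, the receiver counts 300 bytes against the connection limit, matching what the sender already charged to its own budget.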
-The Initial packet uses long headers with a type value of 0x7F. It carries the -first CRYPTO frames sent by the client and server to perform key exchange, and -carries ACKs in either direction. The Initial packet is protected by Initial -keys as described in {{QUIC-TLS}}. +### Response to a RST_STREAM -The Initial packet (shown in {{initial-format}}) has two additional header -fields that are added to the Long Header before the Length field. +RST_STREAM terminates one direction of a stream abruptly. Whether any action or +response can or should be taken on the data already received is an +application-specific issue, but it will often be the case that upon receipt of a +RST_STREAM an endpoint will choose to stop sending data in its own direction. If +the sender of a RST_STREAM wishes to explicitly state that no future data will +be processed, that endpoint MAY send a STOP_SENDING frame at the same time. -~~~ -+-+-+-+-+-+-+-+-+ -|1| 0x7f | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Token Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Token (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Packet Number (8/16/32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Payload (*) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #initial-format title="Initial Packet"} +### Data Limit Increments {#fc-credit} -These fields include the token that was previously provided in a Retry packet or -NEW_TOKEN frame: +This document leaves when and how many bytes to advertise in a MAX_DATA or +MAX_STREAM_DATA to implementations, but offers a few considerations. These +frames contribute to connection overhead. Therefore frequently sending frames +with small changes is undesirable. At the same time, infrequent updates require +larger increments to limits if blocking is to be avoided. Thus, larger updates +require a receiver to commit to larger resource commitments. Thus there is a +trade-off between resource commitment and overhead when determining how large a +limit is advertised. -Token Length: +A receiver MAY use an autotuning mechanism to tune the frequency and amount that +it increases data limits based on a round-trip time estimate and the rate at +which the receiving application consumes data, similar to common TCP +implementations. -: A variable-length integer specifying the length of the Token field, in bytes. - This value is zero if no token is present. Initial packets sent by the server - MUST set the Token Length field to zero; clients that receive an Initial - packet with a non-zero Token Length field MUST either discard the packet or - generate a connection error of type PROTOCOL_VIOLATION. +## Stream Limit Increment -Token: +As with flow control, this document leaves when and how many streams to make +available to a peer via MAX_STREAM_ID to implementations, but offers a few +considerations. MAX_STREAM_ID frames constitute minimal overhead, while +withholding MAX_STREAM_ID frames can prevent the peer from using the available +parallelism. -: The value of the token. +Implementations will likely want to increase the maximum stream ID as +peer-initiated streams close. 
A receiver MAY also advance the maximum stream ID +based on current activity, system conditions, and other environmental factors. -The client and server use the Initial packet type for any packet that contains -an initial cryptographic handshake message. This includes all cases where a new -packet containing the initial cryptographic message needs to be created, such as -the packets sent after receiving a Version Negotiation ({{packet-version}}) or -Retry packet ({{packet-retry}}). -A server sends its first Initial packet in response to a client Initial. A -server may send multiple Initial packets. The cryptographic key exchange could -require multiple round trips or retransmissions of this data. +### Blocking on Flow Control {#blocking} -The payload of an Initial packet includes a CRYPTO frame (or frames) containing -a cryptographic handshake message, ACK frames, or both. PADDING and -CONNECTION_CLOSE frames are also permitted. An endpoint that receives an -Initial packet containing other frames can either discard the packet as spurious -or treat it as a connection error. +If a sender does not receive a MAX_DATA or MAX_STREAM_DATA frame when it has run +out of flow control credit, the sender will be blocked and SHOULD send a BLOCKED +or STREAM_BLOCKED frame. These frames are expected to be useful for debugging +at the receiver; they do not require any other action. A receiver SHOULD NOT +wait for a BLOCKED or STREAM_BLOCKED frame before sending MAX_DATA or +MAX_STREAM_DATA, since doing so will mean that a sender is unable to send for an +entire round trip. -The first packet sent by a client always includes a CRYPTO frame that contains -the entirety of the first cryptographic handshake message. This packet, and the -cryptographic handshake message, MUST fit in a single UDP datagram (see -{{handshake}}). The first CRYPTO frame sent always begins at an offset of 0 -(see {{handshake}}). 
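A sender's view of stream-level credit can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the patch: the class `StreamSendCredit` and its methods are hypothetical, and the sketch assumes the sender reports being blocked at most once per data limit.

```python
class StreamSendCredit:
    """Sender-side tracking of one stream's flow control credit."""

    def __init__(self, initial_limit):
        self.limit = initial_limit        # highest offset the peer has allowed
        self.sent = 0                     # bytes sent so far on this stream
        self.blocked_reported_at = None   # limit for which blocking was reported

    def on_max_stream_data(self, new_limit):
        # Frames that do not increase the limit are ignored.
        if new_limit > self.limit:
            self.limit = new_limit

    def try_send(self, nbytes):
        """Return (bytes_allowed, should_send_stream_blocked)."""
        allowed = min(nbytes, self.limit - self.sent)
        self.sent += allowed
        # Blocked means there was more to write than the credit permitted.
        blocked = allowed < nbytes and self.blocked_reported_at != self.limit
        if blocked:
            self.blocked_reported_at = self.limit
        return allowed, blocked
```

The second element of the result tells the caller whether a STREAM_BLOCKED frame is worth sending now. It becomes true again only after the limit has been raised and then reached once more.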
+For smooth operation of the congestion controller, it is generally considered +best to not let the sender go into quiescence if avoidable. To avoid blocking a +sender, and to reasonably account for the possibility of loss, a receiver should +send a MAX_DATA or MAX_STREAM_DATA frame at least two round trips before it +expects the sender to get blocked. -Note that if the server sends a HelloRetryRequest, the client will send a second -Initial packet. This Initial packet will continue the cryptographic handshake -and will contain a CRYPTO frame with an offset matching the size of the CRYPTO -frame sent in the first Initial packet. Cryptographic handshake messages -subsequent to the first do not need to fit within a single UDP datagram. +A sender sends a single BLOCKED or STREAM_BLOCKED frame only once when it +reaches a data limit. A sender SHOULD NOT send multiple BLOCKED or +STREAM_BLOCKED frames for the same data limit, unless the original frame is +determined to be lost. Another BLOCKED or STREAM_BLOCKED frame can be sent +after the data limit is increased. -### Connection IDs +## Stream Final Offset {#final-offset} -When an Initial packet is sent by a client which has not previously received a -Retry packet from the server, it populates the Destination Connection ID field -with an unpredictable value. This MUST be at least 8 octets in length. Until a -packet is received from the server, the client MUST use the same value unless it -abandons the connection attempt and starts a new one. The initial Destination -Connection ID is used to determine packet protection keys for Initial packets. +The final offset is the count of the number of octets that are transmitted on a +stream. For a stream that is reset, the final offset is carried explicitly in +a RST_STREAM frame. Otherwise, the final offset is the offset of the end of the +data carried in a STREAM frame marked with a FIN flag, or 0 in the case of +incoming unidirectional streams. 
-The client populates the Source Connection ID field with a value of its choosing -and sets the SCIL field to match. +An endpoint will know the final offset for a stream when the receive stream +enters the "Size Known" or "Reset Recvd" state. -The Destination Connection ID field in the server's Initial packet contains a -connection ID that is chosen by the recipient of the packet (i.e., the client); -the Source Connection ID includes the connection ID that the sender of the -packet wishes to use (see {{connection-id}}). The server MUST use consistent -Source Connection IDs during the handshake. +An endpoint MUST NOT send data on a stream at or beyond the final offset. -On first receiving an Initial or Retry packet from the server, the client uses -the Source Connection ID supplied by the server as the Destination Connection ID -for subsequent packets. That means that a client might change the Destination -Connection ID twice during connection establishment. Once a client has received -an Initial packet from the server, it MUST discard any packet it receives with a -different Source Connection ID. +Once a final offset for a stream is known, it cannot change. If a RST_STREAM or +STREAM frame causes the final offset to change for a stream, an endpoint SHOULD +respond with a FINAL_OFFSET_ERROR error (see {{error-handling}}). A receiver +SHOULD treat receipt of data at or beyond the final offset as a +FINAL_OFFSET_ERROR error, even after a stream is closed. Generating these +errors is not mandatory, but only because requiring that an endpoint generate +these errors also means that the endpoint needs to maintain the final offset +state for closed streams, which could mean a significant state commitment. +## Flow Control for Cryptographic Handshake {#flow-control-crypto} -### Tokens +Data sent in CRYPTO frames is not flow controlled in the same way as STREAM +frames. 
QUIC relies on the cryptographic protocol implementation to avoid +excessive buffering of data, see {{QUIC-TLS}}. The implementation SHOULD +provide an interface to QUIC to tell it about its buffering limits so that there +is not excessive buffering at multiple layers. -If the client has a token received in a NEW_TOKEN frame on a previous connection -to what it believes to be the same server, it can include that value in the -Token field of its Initial packet. -A token allows a server to correlate activity between connections. -Specifically, the connection where the token was issued, and any connection -where it is used. Clients that want to break continuity of identity with a -server MAY discard tokens provided using the NEW_TOKEN frame. Tokens obtained -in Retry packets MUST NOT be discarded. +# Versions {#versions} -A client SHOULD NOT reuse a token. Reusing a token allows connections to be -linked by entities on the network path (see {{migration-linkability}}). A -client MUST NOT reuse a token if it believes that its point of network -attachment has changed since the token was last used; that is, if there is a -change in its local IP address or network interface. A client needs to start -the connection process over if it migrates prior to completing the handshake. +QUIC versions are identified using a 32-bit unsigned number. -When a server receives an Initial packet with an address validation token, it -SHOULD attempt to validate it. If the token is invalid then the server SHOULD -proceed as if the client did not have a validated address, including potentially -sending a Retry. If the validation succeeds, the server SHOULD then allow the -handshake to proceed (see {{stateless-retry}}). +The version 0x00000000 is reserved to represent version negotiation. This +version of the specification is identified by the number 0x00000001. -Note: +Other versions of QUIC might have different properties to this version. 
The +properties of QUIC that are guaranteed to be consistent across all versions of +the protocol are described in {{QUIC-INVARIANTS}}. -: The rationale for treating the client as unvalidated rather than discarding - the packet is that the client might have received the token in a previous - connection using the NEW_TOKEN frame, and if the server has lost state, it - might be unable to validate the token at all, leading to connection failure if - the packet is discarded. A server MAY encode tokens provided with NEW_TOKEN - frames and Retry packets differently, and validate the latter more strictly. +Version 0x00000001 of QUIC uses TLS as a cryptographic handshake protocol, as +described in {{QUIC-TLS}}. -In a stateless design, a server can use encrypted and authenticated tokens to -pass information to clients that the server can later recover and use to -validate a client address. Tokens are not integrated into the cryptographic -handshake and so they are not authenticated. For instance, a client might be -able to reuse a token. To avoid attacks that exploit this property, a server -can limit its use of tokens to only the information needed validate client -addresses. +Versions with the most significant 16 bits of the version number cleared are +reserved for use in future IETF consensus documents. +Versions that follow the pattern 0x?a?a?a?a are reserved for use in forcing +version negotiation to be exercised. That is, any version number where the low +four bits of all octets is 1010 (in binary). A client or server MAY advertise +support for any of these reserved versions. -### Starting Packet Numbers +Reserved version numbers will probably never represent a real protocol; a client +MAY use one of these version numbers with the expectation that the server will +initiate version negotiation; a server MAY advertise support for one of these +versions and can expect that clients ignore the value. 
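
The version-number rules in the added Versions text can be sketched as a few bit tests. This is an illustration only, not part of the patch: the function and constant names are hypothetical, and the checks encode just the rules quoted above (0x00000000 for version negotiation, 0x00000001 for this specification, the 0x?a?a?a?a pattern, and the cleared upper 16 bits).

```python
# Hypothetical helper names; only the numeric rules come from the draft text.
NEGOTIATION_VERSION = 0x00000000  # reserved to represent version negotiation
QUIC_VERSION_1 = 0x00000001       # this version of the specification


def forces_version_negotiation(version: int) -> bool:
    """True if the low four bits of every octet are 1010 (pattern 0x?a?a?a?a)."""
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def is_ietf_reserved(version: int) -> bool:
    """True if the most significant 16 bits of the version number are cleared."""
    return (version >> 16) == 0


def classify(version: int) -> str:
    """Return a coarse label for a 32-bit version number."""
    if version == NEGOTIATION_VERSION:
        return "version-negotiation"
    if version == QUIC_VERSION_1:
        return "quic-v1"
    if forces_version_negotiation(version):
        return "reserved-forcing-negotiation"
    if is_ietf_reserved(version):
        return "reserved-ietf"
    return "other"
```

The order of the checks matters only for the two specific values, which also fall inside the range reserved for IETF consensus documents.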
-The first Initial packet sent by either endpoint contains a packet number of -0. The packet number MUST increase monotonically thereafter. Initial packets -are in a different packet number space to other packets (see -{{packet-numbers}}). +\[\[RFC editor: please remove the remainder of this section before +publication.]] +The version number for the final version of this specification (0x00000001), is +reserved for the version of the protocol that is published as an RFC. -### 0-RTT Packet Numbers {#retry-0rtt-pn} +Version numbers used to identify IETF drafts are created by adding the draft +number to 0xff000000. For example, draft-ietf-quic-transport-13 would be +identified as 0xff00000D. -Packet numbers for 0-RTT protected packets use the same space as 1-RTT protected -packets. +Implementors are encouraged to register version numbers of QUIC that they are +using for private experimentation on the GitHub wiki at +\. -After a client receives a Retry or Version Negotiation packet, 0-RTT packets are -likely to have been lost or discarded by the server. A client MAY attempt to -resend data in 0-RTT packets after it sends a new Initial packet. -A client MUST NOT reset the packet number it uses for 0-RTT packets. The keys -used to protect 0-RTT packets will not change as a result of responding to a -Retry or Version Negotiation packet unless the client also regenerates the -cryptographic handshake message. Sending packets with the same packet number in -that case is likely to compromise the packet protection for all 0-RTT packets -because the same key and nonce could be used to protect different content. +# Packet Types and Formats -Receiving a Retry or Version Negotiation packet, especially a Retry that changes -the connection ID used for subsequent packets, indicates a strong possibility -that 0-RTT packets could be lost. A client only receives acknowledgments for -its 0-RTT packets once the handshake is complete. 
Consequently, a server might -expect 0-RTT packets to start with a packet number of 0. Therefore, in -determining the length of the packet number encoding for 0-RTT packets, a client -MUST assume that all packets up to the current packet number are in flight, -starting from a packet number of 0. Thus, 0-RTT packets could need to use a -longer packet number encoding. +We first describe QUIC's packet types and their formats, since some are +referenced in subsequent mechanisms. -A client SHOULD instead generate a fresh cryptographic handshake message and -start packet numbers from 0. This ensures that new 0-RTT packets will not use -the same keys, avoiding any risk of key and nonce reuse; this also prevents -0-RTT packets from previous handshake attempts from being accepted as part of -the connection. +All numeric values are encoded in network byte order (that is, big-endian) and +all field sizes are in bits. When discussing individual bits of fields, the +least significant bit is referred to as bit 0. Hexadecimal notation is used for +describing the value of fields. +Any QUIC packet has either a long or a short header, as indicated by the Header +Form bit. Long headers are expected to be used early in the connection before +version negotiation and establishment of 1-RTT keys. Short headers are minimal +version-specific headers, which are used after version negotiation and 1-RTT +keys are established. -### Minimum Packet Size +## Long Header {#long-header} -The payload of a UDP datagram carrying the Initial packet MUST be expanded to at -least 1200 octets (see {{packetization}}), by adding PADDING frames to the -Initial packet and/or by combining the Initial packet with a 0-RTT packet (see -{{packet-coalesce}}). 
+~~~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| Type (7) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Version (32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Length (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Packet Number (8/16/32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Payload (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~~~ +{: #fig-long-header title="Long Header Packet Format"} +Long headers are used for packets that are sent prior to the completion of +version negotiation and establishment of 1-RTT keys. Once both conditions are +met, a sender switches to sending packets using the short header +({{short-header}}). The long form allows for special packets - such as the +Version Negotiation packet - to be represented in this uniform fixed-length +packet format. Packets that use the long header contain the following fields: -## Handshake Packet {#packet-handshake} +Header Form: -A Handshake packet uses long headers with a type value of 0x7D. It is -used to carry acknowledgments and cryptographic handshake messages from the -server and client. +: The most significant bit (0x80) of octet 0 (the first octet) is set to 1 for + long headers. -A server sends its cryptographic handshake in one or more Handshake packets in -response to an Initial packet if it does not send a Retry packet. Once a client -has received a Handshake packet from a server, it uses Handshake packets to send -subsequent cryptographic handshake messages and acknowledgments to the server. 
+Long Packet Type: -The Destination Connection ID field in a Handshake packet contains a connection -ID that is chosen by the recipient of the packet; the Source Connection ID -includes the connection ID that the sender of the packet wishes to use (see -{{connection-id-encoding}}). +: The remaining seven bits of octet 0 contain the packet type. This field can + indicate one of 128 packet types. The types specified for this version are + listed in {{long-packet-types}}. -The first Handshake packet sent by a server contains a packet number of 0. -Handshake packets are their own packet number space. Packet numbers are -incremented normally for other Handshake packets. +Version: -Servers MUST NOT send more than three times as many bytes as the number of bytes -received prior to verifying the client's address. Source addresses can be -verified through an address validation token (delivered via a Retry packet or -a NEW_TOKEN frame) or by processing any message from the client encrypted using -the Handshake keys. This limit exists to mitigate amplification attacks. +: The QUIC Version is a 32-bit field that follows the Type. This field + indicates which version of QUIC is in use and determines how the rest of the + protocol fields are interpreted. -In order to prevent this limit causing a handshake deadlock, the client SHOULD -always send a packet upon a handshake timeout, as described in -{{QUIC-RECOVERY}}. If the client has no data to retransmit and does not have -Handshake keys, it SHOULD send an Initial packet in a UDP datagram of at least -1200 octets. If the client has Handshake keys, it SHOULD send a Handshake -packet. +DCIL and SCIL: -The payload of this packet contains CRYPTO frames and could contain PADDING, or -ACK frames. Handshake packets MAY contain CONNECTION_CLOSE or APPLICATION_CLOSE -frames. Endpoints MUST treat receipt of Handshake packets with other frames as -a connection error. 
+: The octet following the version contains the lengths of the two connection ID + fields that follow it. These lengths are encoded as two 4-bit unsigned + integers. The Destination Connection ID Length (DCIL) field occupies the 4 + high bits of the octet and the Source Connection ID Length (SCIL) field + occupies the 4 low bits of the octet. An encoded length of 0 indicates that + the connection ID is also 0 octets in length. Non-zero encoded lengths are + increased by 3 to get the full length of the connection ID, producing a length + between 4 and 18 octets inclusive. For example, an octet with the value 0x50 + describes an 8-octet Destination Connection ID and a zero-length Source + Connection ID. +Destination Connection ID: -## Protected Packets {#packet-protected} +: The Destination Connection ID field follows the connection ID lengths and is + either 0 octets in length or between 4 and 18 octets. + {{connection-id-encoding}} describes the use of this field in more detail. -All QUIC packets use packet protection. Packets that are protected with the -static handshake keys or the 0-RTT keys are sent with long headers; all packets -protected with 1-RTT keys are sent with short headers. The different packet -types explicitly indicate the encryption level and therefore the keys that are -used to remove packet protection. 0-RTT and 1-RTT protected packets share a -single packet number space. +Source Connection ID: -Packets protected with handshake keys only use packet protection to ensure that -the sender of the packet is on the network path. This packet protection is not -effective confidentiality protection; any entity that receives the Initial -packet from a client can recover the keys necessary to remove packet protection -or to generate packets that will be successfully authenticated. +: The Source Connection ID field follows the Destination Connection ID and is + either 0 octets in length or between 4 and 18 octets. 
+ {{connection-id-encoding}} describes the use of this field in more detail. -Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality -and data origin authentication; the cryptographic handshake ensures that only -the communicating endpoints receive the corresponding keys. +Length: -Packets protected with 0-RTT keys use a type value of 0x7C. The connection ID -fields for a 0-RTT packet MUST match the values used in the Initial packet -({{packet-initial}}). +: The length of the remainder of the packet (that is, the Packet Number and + Payload fields) in octets, encoded as a variable-length integer + ({{integer-encoding}}). -The version field for protected packets is the current QUIC version. +Packet Number: -The packet number field contains a packet number, which has additional -confidentiality protection that is applied after packet protection is applied -(see {{QUIC-TLS}} for details). The underlying packet number increases with -each packet sent, see {{packet-numbers}} for details. +: The packet number field is 1, 2, or 4 octets long. The packet number has + confidentiality protection separate from packet protection, as described + in Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is + encoded in the plaintext packet number. See {{packet-numbers}} for details. -The payload is protected using authenticated encryption. {{QUIC-TLS}} describes -packet protection in detail. After decryption, the plaintext consists of a -sequence of frames, as described in {{frames}}. +Payload: +: The payload of the packet. -## Coalescing Packets {#packet-coalesce} +The following packet types are defined: -A sender can coalesce multiple QUIC packets (typically a Cryptographic Handshake -packet and a Protected packet) into one UDP datagram. This can reduce the -number of UDP datagrams needed to send application data during the handshake and -immediately afterwards. 
It is not necessary for senders to coalesce -packets, though failing to do so will require sending a significantly -larger number of datagrams during the handshake. Receivers MUST -be able to process coalesced packets. +| Type | Name | Section | +|:-----|:------------------------------|:----------------------------| +| 0x7F | Initial | {{packet-initial}} | +| 0x7E | Retry | {{packet-retry}} | +| 0x7D | Handshake | {{packet-handshake}} | +| 0x7C | 0-RTT Protected | {{packet-protected}} | +{: #long-packet-types title="Long Header Packet Types"} -Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, -Handshake, 1-RTT) makes it more likely the receiver will be able to process all -the packets in a single pass. A packet with a short header does not include a -length, so it will always be the last packet included in a UDP datagram. +The header form, type, connection ID lengths octet, destination and source +connection IDs, and version fields of a long header packet are +version-independent. The packet number and values for packet types defined in +{{long-packet-types}} are version-specific. See {{QUIC-INVARIANTS}} for details +on how packets from different versions of QUIC are interpreted. -Senders MUST NOT coalesce QUIC packets with different Destination Connection -IDs into a single UDP datagram. Receivers SHOULD ignore any subsequent packets -with a different Destination Connection ID than the first packet in the -datagram. +The interpretation of the fields and the payload are specific to a version and +packet type. Type-specific semantics for this version are described in the +following sections. -Every QUIC packet that is coalesced into a single UDP datagram is separate and -complete. Though the values of some fields in the packet header might be -redundant, no fields are omitted. 
The receiver of coalesced QUIC packets MUST -individually process each QUIC packet and separately acknowledge them, as if -they were received as the payload of different UDP datagrams. If one or more -packets in a datagram cannot be processed yet (because the keys are not yet -available) or processing fails (decryption failure, unknown type, etc.), the -receiver MUST still attempt to process the remaining packets. The skipped -packets MAY either be discarded or buffered for later processing, just as if the -packets were received out-of-order in separate datagrams. +The end of the packet is determined by the Length field. The Length field +covers both the Packet Number and Payload fields, both of which are +confidentiality protected and initially of unknown length. The size of the +Payload field is learned once the packet number protection is removed. -Retry ({{packet-retry}}) and Version Negotiation ({{packet-version}}) packets -cannot be coalesced. +Senders can sometimes coalesce multiple packets into one UDP datagram. See +{{packet-coalesce}} for more details. -## Connection ID Encoding +## Short Header -A connection ID is used to ensure consistent routing of packets, as described in -{{connection-id}}. The long header contains two connection IDs: the Destination -Connection ID is chosen by the recipient of the packet and is used to provide -consistent routing; the Source Connection ID is used to set the Destination -Connection ID used by the peer. +~~~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|0|K|1|1|0|R R R| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Packet Number (8/16/32) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Protected Payload (*) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~~~ +{: #fig-short-header title="Short Header Packet Format"} -During the handshake, packets with the long header are used to establish the -connection ID that each endpoint uses. Each endpoint uses the Source Connection -ID field to specify the connection ID that is used in the Destination Connection -ID field of packets being sent to them. Upon receiving a packet, each endpoint -sets the Destination Connection ID it sends to match the value of the Source -Connection ID that they receive. +The short header can be used after the version and 1-RTT keys are negotiated. +Packets that use the short header contain the following fields: -During the handshake, a client can receive both a Retry and an Initial packet, -and thus be given two opportunities to update the Destination Connection ID it -sends. A client MUST only change the value it sends in the Destination -Connection ID in response to the first packet of each type it receives from the -server (Retry or Initial); a server MUST set its value based on the Initial -packet. Any additional changes are not permitted; if subsequent packets of -those types include a different Source Connection ID, they MUST be discarded. -This avoids problems that might arise from stateless processing of multiple -Initial packets producing different connection IDs. +Header Form: -Short headers only include the Destination Connection ID and omit the explicit -length. The length of the Destination Connection ID field is expected to be -known to endpoints. +: The most significant bit (0x80) of octet 0 is set to 0 for the short header. -Endpoints using a connection-ID based load balancer could agree with the load -balancer on a fixed or minimum length and on an encoding for connection IDs. -This fixed portion could encode an explicit length, which allows the entire -connection ID to vary in length and still be used by the load balancer. 
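
As an illustration of the connection ID length octet defined for the long header (DCIL in the four high bits, SCIL in the four low bits, non-zero values increased by 3), a minimal decoding sketch might look like the following. The function name is hypothetical and not taken from the draft.

```python
from typing import Tuple


def decode_cid_lengths(octet: int) -> Tuple[int, int]:
    """Return (destination_cid_len, source_cid_len) in octets.

    An encoded length of 0 means a zero-length connection ID; any other
    encoded value is increased by 3, giving 4 to 18 octets.
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("length octet must fit in one byte")

    def expand(nibble: int) -> int:
        return 0 if nibble == 0 else nibble + 3

    dcil = expand(octet >> 4)    # four high bits
    scil = expand(octet & 0x0F)  # four low bits
    return dcil, scil
```

With the example value from the text, `decode_cid_lengths(0x50)` yields an 8-octet Destination Connection ID and a zero-length Source Connection ID.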
- -The very first packet sent by a client includes a random value for Destination -Connection ID. The same value MUST be used for all 0-RTT packets sent on that -connection ({{packet-protected}}). This randomized value is used to determine -the packet protection keys for Initial packets (see Section 5.2 of -{{QUIC-TLS}}). - -A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs -selected by the client, swapped to ensure correct routing toward the client. +Key Phase Bit: -The connection ID can change over the lifetime of a connection, especially in -response to connection migration ({{migration}}). NEW_CONNECTION_ID frames -({{frame-new-connection-id}}) are used to provide new connection ID values. +: The second bit (0x40) of octet 0 indicates the key phase, which allows a + recipient of a packet to identify the packet protection keys that are used to + protect the packet. See {{QUIC-TLS}} for details. +\[\[Editor's Note: this section should be removed and the bit definitions +changed before this draft goes to the IESG.]] -## Packet Numbers {#packet-numbers} +Third Bit: -The packet number is an integer in the range 0 to 2^62-1. The value is used in -determining the cryptographic nonce for packet protection. Each endpoint -maintains a separate packet number for sending and receiving. +: The third bit (0x20) of octet 0 is set to 1. -Packet numbers are divided into 3 spaces in QUIC: +\[\[Editor's Note: this section should be removed and the bit definitions +changed before this draft goes to the IESG.]] -- Initial space: All Initial packets {{packet-initial}} are in this space. -- Handshake space: All Handshake packets {{packet-handshake}} are in this space. -- Application data space: All 0-RTT and 1-RTT encrypted packets - {{packet-protected}} are in this space. +Fourth Bit: -As described in {{QUIC-TLS}}, each packet type uses different protection keys. +: The fourth bit (0x10) of octet 0 is set to 1. 
-Conceptually, a packet number space is the context in which a packet can be -processed and acknowledged. Initial packets can only be sent with Initial -packet protection keys and acknowledged in packets which are also Initial -packets. Similarly, Handshake packets are sent at the Handshake encryption -level and can only be acknowledged in Handshake packets. +\[\[Editor's Note: this section should be removed and the bit definitions +changed before this draft goes to the IESG.]] -This enforces cryptographic separation between the data sent in the different -packet sequence number spaces. Each packet number space starts at packet number -0. Subsequent packets sent in the same packet number space MUST increase the -packet number by at least one. +Google QUIC Demultiplexing Bit: -0-RTT and 1-RTT data exist in the same packet number space to make loss recovery -algorithms easier to implement between the two packet types. +: The fifth bit (0x8) of octet 0 is set to 0. This allows implementations of + Google QUIC to distinguish Google QUIC packets from short header packets sent + by a client because Google QUIC servers expect the connection ID to always be + present. + The special interpretation of this bit SHOULD be removed from this + specification when Google QUIC has finished transitioning to the new header + format. -A QUIC endpoint MUST NOT reuse a packet number within the same packet number -space in one connection (that is, under the same cryptographic keys). If the -packet number for sending reaches 2^62 - 1, the sender MUST close the connection -without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY -send a Stateless Reset ({{stateless-reset}}) in response to further packets that -it receives. +Reserved: -In the QUIC long and short packet headers, the number of bits required to -represent the packet number is reduced by including only a variable number of -the least significant bits of the packet number. 
One or two of the most -significant bits of the first octet determine how many bits of the packet -number are provided, as shown in {{pn-encodings}}. +: The sixth, seventh, and eighth bits (0x7) of octet 0 are reserved for + experimentation. Endpoints MUST ignore these bits on packets they receive + unless they are participating in an experiment that uses these bits. An + endpoint not actively using these bits SHOULD set the value randomly on + packets they send to protect against unwanted inference about particular + values. -| First octet pattern | Encoded Length | Bits Present | -|:--------------------|:---------------|:-------------| -| 0b0xxxxxxx | 1 octet | 7 | -| 0b10xxxxxx | 2 | 14 | -| 0b11xxxxxx | 4 | 30 | -{: #pn-encodings title="Packet Number Encodings for Packet Headers"} +Destination Connection ID: -Note that these encodings are similar to those in {{integer-encoding}}, but -use different values. +: The Destination Connection ID is a connection ID that is chosen by the + intended recipient of the packet. See {{connection-id}} for more details. -The encoded packet number is protected as described in Section 5.3 -{{QUIC-TLS}}. Protection of the packet number is removed prior to recovering the -full packet number. The full packet number is reconstructed at the receiver -based on the number of significant bits present, the value of those bits, and -the largest packet number received on a successfully authenticated -packet. Recovering the full packet number is necessary to successfully remove -packet protection. +Packet Number: -Once packet number protection is removed, the packet number is decoded by -finding the packet number value that is closest to the next expected packet. -The next expected packet is the highest received packet number plus one. For -example, if the highest successfully authenticated packet had a packet number of -0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as -0xaa8309b3. 
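
The decoding rule described above (pick the value closest to the next expected packet number) can be sketched as follows. This is a hedged illustration rather than the pseudo-code the draft refers to: the function names are hypothetical, and the sketch assumes packet number protection has already been removed so that the first octet pattern and the truncated value are available.

```python
from typing import Tuple


def pn_encoding(first_octet: int) -> Tuple[int, int]:
    """Map the first octet pattern to (encoded length in octets, bits present)."""
    if first_octet & 0x80 == 0x00:   # 0b0xxxxxxx
        return 1, 7
    if first_octet & 0xC0 == 0x80:   # 0b10xxxxxx
        return 2, 14
    return 4, 30                     # 0b11xxxxxx


def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Recover the full packet number closest to largest_pn + 1."""
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1
    # Combine the high bits of the expected value with the received low bits.
    candidate = (expected_pn & ~pn_mask) | truncated_pn
    # Shift by one window if that lands closer to the expected value.
    if candidate <= expected_pn - pn_hwin:
        return candidate + pn_win
    if candidate > expected_pn + pn_hwin and candidate >= pn_win:
        return candidate - pn_win
    return candidate
```

For the example in the text, a largest authenticated packet number of 0xaa82f30e and a 14-bit value of 0x9b3 decode to 0xaa8309b3.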
-Example pseudo-code for packet number decoding can be found in -{{sample-packet-number-decoding}}. +: The packet number field is 1, 2, or 4 octets long. The packet number has + confidentiality protection separate from packet protection, as described in + Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is encoded + in the plaintext packet number. See {{packet-numbers}} for details. -The sender MUST use a packet number size able to represent more than twice as -large a range than the difference between the largest acknowledged packet and -packet number being sent. A peer receiving the packet will then correctly -decode the packet number, unless the packet is delayed in transit such that it -arrives after many higher-numbered packets have been received. An endpoint -SHOULD use a large enough packet number encoding to allow the packet number to -be recovered even if the packet arrives after packets that are sent afterwards. +Protected Payload: -As a result, the size of the packet number encoding is at least one more than -the base 2 logarithm of the number of contiguous unacknowledged packet numbers, -including the new packet. +: Packets with a short header always include a 1-RTT protected payload. -For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, -sending a packet with a number of 0x6b2d79 requires a packet number encoding -with 14 bits or more; whereas the 30-bit packet number encoding is needed to -send a packet with a number of 0x6bc107. +The header form and connection ID field of a short header packet are +version-independent. The remaining fields are specific to the selected QUIC +version. See {{QUIC-INVARIANTS}} for details on how packets from different +versions of QUIC are interpreted. -A receiver MUST discard a newly unprotected packet unless it is certain that it -has not processed another packet with the same packet number from the same -packet number space. 
Duplicate suppression MUST happen after removing packet -protection for the reasons described in Section 9.3 of {{QUIC-TLS}}. An -efficient algorithm for duplicate suppression can be found in Section 3.4.3 of -{{?RFC2406}}. -A Version Negotiation packet ({{packet-version}}) does not include a packet -number. The Retry packet ({{packet-retry}}) has special rules for populating -the packet number field. +## Version Negotiation Packet {#packet-version} +A Version Negotiation packet is inherently not version-specific, and does not +use the long packet header (see {{long-header}}). Upon receipt by a client, it +will appear to be a packet using the long header, but will be identified as a +Version Negotiation packet based on the Version field having a value of 0. -# Frames and Frame Types {#frames} +The Version Negotiation packet is a response to a client packet that contains a +version that is not supported by the server, and is only sent by servers. -The payload of all packets, after removing packet protection, consists of a -sequence of frames, as shown in {{packet-frames}}. Version Negotiation and -Stateless Reset do not contain frames. +The layout of a Version Negotiation packet is: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| Unused (7) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame 1 (*) ... +| Version (32) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame 2 (*) ... +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Supported Version 1 (32) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| [Supported Version 2 (32)] ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame N (*) ... +| [Supported Version N (32)] ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #packet-frames title="QUIC Payload"} +{: #version-negotiation-format title="Version Negotiation Packet"} -QUIC payloads MUST contain at least one frame, and MAY contain multiple -frames and multiple frame types. +The value in the Unused field is selected randomly by the server. -Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC packet -boundary. Each frame begins with a Frame Type, indicating its type, followed by -additional type-dependent fields: +The Version field of a Version Negotiation packet MUST be set to 0x00000000. + +The server MUST include the value from the Source Connection ID field of the +packet it receives in the Destination Connection ID field. The value for Source +Connection ID MUST be copied from the Destination Connection ID of the received +packet, which is initially randomly selected by a client. Echoing both +connection IDs gives clients some assurance that the server received the packet +and that the Version Negotiation packet was not generated by an off-path +attacker. + +The remainder of the Version Negotiation packet is a list of 32-bit versions +which the server supports. + +A Version Negotiation packet cannot be explicitly acknowledged in an ACK frame +by a client. Receiving another Initial packet implicitly acknowledges a Version +Negotiation packet. + +The Version Negotiation packet does not include the Packet Number and Length +fields present in other packets that use the long header form. Consequently, +a Version Negotiation packet consumes an entire UDP datagram. + +See {{version-negotiation}} for a description of the version negotiation +process. + + +## Retry Packet {#packet-retry} + +A Retry packet uses a long packet header with a type value of 0x7E. 
It carries +an address validation token created by the server. It is used by a server that +wishes to perform a stateless retry (see {{stateless-retry}}). ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| 0x7e | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame Type (i) ... +| Version (32) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Type-Dependent Fields (*) ... +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ODCIL(8) | Original Destination Connection ID (*) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Retry Token (*) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #frame-layout title="Generic Frame Layout"} - -The frame types defined in this specification are listed in {{frame-types}}. -The Frame Type in STREAM frames is used to carry other frame-specific flags. -For all other frames, the Frame Type field simply identifies the frame. These -frames are explained in more detail as they are referenced later in the -document. 
+{: #retry-format title="Retry Packet"} -| Type Value | Frame Type Name | Definition | -|:------------|:---------------------|:-------------------------------| -| 0x00 | PADDING | {{frame-padding}} | -| 0x01 | RST_STREAM | {{frame-rst-stream}} | -| 0x02 | CONNECTION_CLOSE | {{frame-connection-close}} | -| 0x03 | APPLICATION_CLOSE | {{frame-application-close}} | -| 0x04 | MAX_DATA | {{frame-max-data}} | -| 0x05 | MAX_STREAM_DATA | {{frame-max-stream-data}} | -| 0x06 | MAX_STREAM_ID | {{frame-max-stream-id}} | -| 0x07 | PING | {{frame-ping}} | -| 0x08 | BLOCKED | {{frame-blocked}} | -| 0x09 | STREAM_BLOCKED | {{frame-stream-blocked}} | -| 0x0a | STREAM_ID_BLOCKED | {{frame-stream-id-blocked}} | -| 0x0b | NEW_CONNECTION_ID | {{frame-new-connection-id}} | -| 0x0c | STOP_SENDING | {{frame-stop-sending}} | -| 0x0d | RETIRE_CONNECTION_ID | {{frame-retire-connection-id}} | -| 0x0e | PATH_CHALLENGE | {{frame-path-challenge}} | -| 0x0f | PATH_RESPONSE | {{frame-path-response}} | -| 0x10 - 0x17 | STREAM | {{frame-stream}} | -| 0x18 | CRYPTO | {{frame-crypto}} | -| 0x19 | NEW_TOKEN | {{frame-new-token}} | -| 0x1a - 0x1b | ACK | {{frame-ack}} | -{: #frame-types title="Frame Types"} +A Retry packet (shown in {{retry-format}}) only uses the invariant portion of +the long packet header {{QUIC-INVARIANTS}}; that is, the fields up to and +including the Destination and Source Connection ID fields. A Retry packet does +not contain any protected fields. Like Version Negotiation, a Retry packet +contains the long header including the connection IDs, but omits the Length, +Packet Number, and Payload fields. These are replaced with: -All QUIC frames are idempotent. That is, a valid frame does not cause -undesirable side effects or errors when received more than once. +ODCIL: -The Frame Type field uses a variable length integer encoding (see -{{integer-encoding}}) with one exception. 
To ensure simple and efficient -implementations of frame parsing, a frame type MUST use the shortest possible -encoding. Though a two-, four- or eight-octet encoding of the frame types -defined in this document is possible, the Frame Type field for these frames is -encoded on a single octet. For instance, though 0x4007 is a legitimate -two-octet encoding for a variable-length integer with a value of 7, PING frames -are always encoded as a single octet with the value 0x07. An endpoint MUST -treat the receipt of a frame type that uses a longer encoding than necessary as -a connection error of type PROTOCOL_VIOLATION. +: The length of the Original Destination Connection ID field. The length is + encoded in the least significant 4 bits of the octet, using the same encoding + as the DCIL and SCIL fields. The most significant 4 bits of this octet are + reserved. Unless a use for these bits has been negotiated, endpoints SHOULD + send randomized values and MUST ignore any value that they receive. +Original Destination Connection ID: -## Extension Frames +: The Original Destination Connection ID contains the value of the Destination + Connection ID from the Initial packet that this Retry is in response to. The + length of this field is given in ODCIL. -QUIC frames do not use a self-describing encoding. An endpoint therefore needs -to understand the syntax of all frames before it can successfully process a -packet. This allows for efficient encoding of frames, but it means that an -endpoint cannot send a frame of a type that is unknown to its peer. +Retry Token: -An extension to QUIC that wishes to use a new type of frame MUST first ensure -that a peer is able to understand the frame. An endpoint can use a transport -parameter to signal its willingness to receive one or more extension frame types -with the one transport parameter. +: An opaque token that the server can use to validate the client's address.
-Extension frames MUST be congestion controlled and MUST cause an ACK frame to -be sent. The exception is extension frames that replace or supplement the ACK -frame. Extension frames are not included in flow control unless specified -in the extension. +The server populates the Destination Connection ID with the connection ID that +the client included in the Source Connection ID of the Initial packet. -An IANA registry is used to manage the assignment of frame types, see -{{iana-frames}}. +The server includes a connection ID of its choice in the Source Connection ID +field. This value MUST NOT be equal to the Destination Connection ID field of +the packet sent by the client. The client MUST use this connection ID in the +Destination Connection ID of subsequent packets that it sends. +A server MAY send Retry packets in response to Initial and 0-RTT packets. A +server can either discard or buffer 0-RTT packets that it receives. A server +can send multiple Retry packets as it receives Initial or 0-RTT packets. -# Life of a Connection +A client MUST accept and process at most one Retry packet for each connection +attempt. After the client has received and processed an Initial or Retry packet +from the server, it MUST discard any subsequent Retry packets that it receives. -A QUIC connection is a single conversation between two QUIC endpoints. QUIC's -connection establishment intertwines version negotiation with the cryptographic -and transport handshakes to reduce connection establishment latency, as -described in {{handshake}}. Once established, a connection may migrate to a -different IP or port at either endpoint, due to NAT rebinding or mobility, as -described in {{migration}}. Finally, a connection may be terminated by either -endpoint, as described in {{termination}}. +Clients MUST discard Retry packets that contain an Original Destination +Connection ID field that does not match the Destination Connection ID from its +Initial packet.
This prevents an off-path attacker from injecting a Retry +packet. -## Connection ID +The client responds to a Retry packet with an Initial packet that includes the +provided Retry Token to continue connection establishment. -Each connection possesses a set of identifiers, any of which could be used to -distinguish it from other connections. Connection IDs are selected -independently in each direction. Each Connection ID has an associated sequence -number to assist in deduplicating messages. +A client sets the Destination Connection ID field of this Initial packet to the +value from the Source Connection ID in the Retry packet. Changing Destination +Connection ID also results in a change to the keys used to protect the Initial +packet. It also sets the Token field to the token provided in the Retry. The +client MUST NOT change the Source Connection ID because the server could include +the connection ID as part of its token validation logic (see {{tokens}}). -The primary function of a connection ID is to ensure that changes in addressing -at lower protocol layers (UDP, IP, and below) don't cause packets for a QUIC -connection to be delivered to the wrong endpoint. Each endpoint selects -connection IDs using an implementation-specific (and perhaps -deployment-specific) method which will allow packets with that connection ID to -be routed back to the endpoint and identified by the endpoint upon receipt. +All subsequent Initial packets from the client MUST use the connection ID and +token values from the Retry packet. Aside from this, the Initial packet sent +by the client is subject to the same restrictions as the first Initial packet. +A client can either reuse the cryptographic handshake message or construct a +new one at its discretion. -Connection IDs MUST NOT contain any information that can be used to correlate -them with other connection IDs for the same connection. 
As a trivial example, -this means the same connection ID MUST NOT be issued more than once on the same -connection. +A client MAY attempt 0-RTT after receiving a Retry packet by sending 0-RTT +packets to the connection ID provided by the server. A client that sends +additional 0-RTT packets without constructing a new cryptographic handshake +message MUST NOT reset the packet number to 0 after a Retry packet, see +{{retry-0rtt-pn}}. -A zero-length connection ID MAY be used when the connection ID is not needed for -routing and the address/port tuple of packets is sufficient to identify a -connection. An endpoint whose peer has selected a zero-length connection ID MUST -continue to use a zero-length connection ID for the lifetime of the connection -and MUST NOT send packets from any other local address. +A server acknowledges the use of a Retry packet for a connection using the +original_connection_id transport parameter (see +{{transport-parameter-definitions}}). If the server sends a Retry packet, it +MUST include the value of the Original Destination Connection ID field of the +Retry packet (that is, the Destination Connection ID field from the client's +first Initial packet) in the transport parameter. -When an endpoint has requested a non-zero-length connection ID, it needs to -ensure that the peer has a supply of connection IDs from which to choose for -packets sent to the endpoint. These connection IDs are supplied by the endpoint -using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). +If the client received and processed a Retry packet, it validates that the +original_connection_id transport parameter is present and correct; otherwise, it +validates that the transport parameter is absent. A client MUST treat a failed +validation as a connection error of type TRANSPORT_PARAMETER_ERROR. +A Retry packet does not include a packet number and cannot be explicitly +acknowledged by a client. 
-### Issuing Connection IDs -The initial connection ID issued by an endpoint is the Source Connection ID -during the handshake. The sequence number of the initial connection ID is 0. If -the preferred_address transport parameter is sent, the sequence number of the -supplied connection ID is 1. Subsequent connection IDs are communicated to the -peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}), and the -sequence number on each newly-issued connection ID MUST increase by 1. The -connection ID randomly selected by the client in the Initial packet and any -connection ID provided by a Reset packet are not assigned sequence numbers -unless a server opts to retain them as its initial connection ID. +## Cryptographic Handshake Packets {#handshake-packets} -When an endpoint issues a connection ID, it MUST accept packets that carry this -connection ID for the duration of the connection or until its peer invalidates -the connection ID via a RETIRE_CONNECTION_ID frame -({{frame-retire-connection-id}}). +Once version negotiation is complete, the cryptographic handshake is used to +agree on cryptographic keys. The cryptographic handshake is carried in Initial +({{packet-initial}}) and Handshake ({{packet-handshake}}) packets. -An endpoint SHOULD ensure that its peer has a sufficient number of available and -unused connection IDs. While each endpoint independently chooses how many -connection IDs to issue, endpoints SHOULD provide and maintain at least eight -connection IDs. The endpoint can do this by always supplying a new connection -ID when a connection ID is retired by its peer or when the endpoint receives a -packet with a previously unused connection ID. Endpoints that initiate -migration and require non-zero-length connection IDs SHOULD provide their peers -with new connection IDs before migration, or risk the peer closing the -connection. +All these packets use the long header and contain the current QUIC version in +the version field. 
+In order to prevent tampering by version-unaware middleboxes, Initial +packets are protected with connection- and version-specific keys +(Initial keys) as described in {{QUIC-TLS}}. This protection does not +provide confidentiality or integrity against on-path attackers, but +provides some level of protection against off-path attackers. -### Consuming and Retiring Connection IDs {#retiring-cids} -An endpoint can change the connection ID it uses for a peer to another available -one at any time during the connection. An endpoint consumes connection IDs in -response to a migrating peer, see {{migration-linkability}} for more. +## Initial Packet {#packet-initial} -An endpoint maintains a set of connection IDs received from its peer, any of -which it can use when sending packets. When the endpoint wishes to remove a -connection ID from use, it sends a RETIRE_CONNECTION_ID frame to its peer, -indicating that the peer might bring a new connection ID into circulation using -the NEW_CONNECTION_ID frame. +The Initial packet uses long headers with a type value of 0x7F. It carries the +first CRYPTO frames sent by the client and server to perform key exchange, and +carries ACKs in either direction. The Initial packet is protected by Initial +keys as described in {{QUIC-TLS}}. -An endpoint that retires a connection ID can retain knowledge of that connection -ID for a period of time after sending the RETIRE_CONNECTION_ID frame, or until -that frame is acknowledged. +The Initial packet (shown in {{initial-format}}) has two additional header +fields that are added to the Long Header before the Length field. -As discussed in {{migration-linkability}}, each connection ID MUST be used on -packets sent from only one local address. An endpoint that migrates away from a -local address SHOULD retire all connection IDs used on that address once it no -longer plans to use that address. 
+~~~ ++-+-+-+-+-+-+-+-+ +|1| 0x7f | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Version (32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Token Length (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Token (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Length (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Packet Number (8/16/32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Payload (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #initial-format title="Initial Packet"} +These fields include the token that was previously provided in a Retry packet or +NEW_TOKEN frame: -## Matching Packets to Connections {#packet-handling} +Token Length: -Incoming packets are classified on receipt. Packets can either be associated -with an existing connection, or - for servers - potentially create a new -connection. +: A variable-length integer specifying the length of the Token field, in bytes. + This value is zero if no token is present. Initial packets sent by the server + MUST set the Token Length field to zero; clients that receive an Initial + packet with a non-zero Token Length field MUST either discard the packet or + generate a connection error of type PROTOCOL_VIOLATION. -Hosts try to associate a packet with an existing connection. If the packet has a -Destination Connection ID corresponding to an existing connection, QUIC -processes that packet accordingly. Note that more than one connection ID can be -associated with a connection; see {{connection-id}}. 
+Token: -If the Destination Connection ID is zero length and the packet matches the -address/port tuple of a connection where the host did not require connection -IDs, QUIC processes the packet as part of that connection. Endpoints MUST drop -packets with zero-length Destination Connection ID fields if they do not -correspond to a single connection. +: The value of the token. -Endpoints SHOULD send a Stateless Reset ({{stateless-reset}}) for any packets -that cannot be attributed to an existing connection. +The client and server use the Initial packet type for any packet that contains +an initial cryptographic handshake message. This includes all cases where a new +packet containing the initial cryptographic message needs to be created, such as +the packets sent after receiving a Version Negotiation ({{packet-version}}) or +Retry packet ({{packet-retry}}). +A server sends its first Initial packet in response to a client Initial. A +server may send multiple Initial packets. The cryptographic key exchange could +require multiple round trips or retransmissions of this data. -### Client Packet Handling {#client-pkt-handling} +The payload of an Initial packet includes a CRYPTO frame (or frames) containing +a cryptographic handshake message, ACK frames, or both. PADDING and +CONNECTION_CLOSE frames are also permitted. An endpoint that receives an +Initial packet containing other frames can either discard the packet as spurious +or treat it as a connection error. -Valid packets sent to clients always include a Destination Connection ID that -matches the value the client selects. Clients that choose to receive -zero-length connection IDs can use the address/port tuple to identify a -connection. Packets that don't match an existing connection are discarded. +The first packet sent by a client always includes a CRYPTO frame that contains +the entirety of the first cryptographic handshake message. 
This packet, and the +cryptographic handshake message, MUST fit in a single UDP datagram (see +{{handshake}}). The first CRYPTO frame sent always begins at an offset of 0 +(see {{handshake}}). -Due to packet reordering or loss, clients might receive packets for a connection -that are encrypted with a key it has not yet computed. Clients MAY drop these -packets, or MAY buffer them in anticipation of later packets that allow it to -compute the key. +Note that if the server sends a HelloRetryRequest, the client will send a second +Initial packet. This Initial packet will continue the cryptographic handshake +and will contain a CRYPTO frame with an offset matching the size of the CRYPTO +frame sent in the first Initial packet. Cryptographic handshake messages +subsequent to the first do not need to fit within a single UDP datagram. -If a client receives a packet that has an unsupported version, it MUST discard -that packet. +### Connection IDs -### Server Packet Handling {#server-pkt-handling} - -If a server receives a packet that has an unsupported version, but the packet is -sufficiently large to initiate a new connection for any version supported by the -server, it SHOULD send a Version Negotiation packet as described in -{{send-vn}}. Servers MAY rate control these packets to avoid storms of Version -Negotiation packets. +When an Initial packet is sent by a client which has not previously received a +Retry packet from the server, it populates the Destination Connection ID field +with an unpredictable value. This MUST be at least 8 octets in length. Until a +packet is received from the server, the client MUST use the same value unless it +abandons the connection attempt and starts a new one. The initial Destination +Connection ID is used to determine packet protection keys for Initial packets. -The first packet for an unsupported version can use different semantics and -encodings for any version-specific field. 
In particular, different packet -protection keys might be used for different versions. Servers that do not -support a particular version are unlikely to be able to decrypt the payload of -the packet. Servers SHOULD NOT attempt to decode or decrypt a packet from an -unknown version, but instead send a Version Negotiation packet, provided that -the packet is sufficiently long. +The client populates the Source Connection ID field with a value of its choosing +and sets the SCIL field to match. -Servers MUST drop other packets that contain unsupported versions. +The Destination Connection ID field in the server's Initial packet contains a +connection ID that is chosen by the recipient of the packet (i.e., the client); +the Source Connection ID includes the connection ID that the sender of the +packet wishes to use (see {{connection-id}}). The server MUST use consistent +Source Connection IDs during the handshake. -Packets with a supported version, or no version field, are matched to a -connection as described in {{packet-handling}}. If not matched, the server -continues below. +On first receiving an Initial or Retry packet from the server, the client uses +the Source Connection ID supplied by the server as the Destination Connection ID +for subsequent packets. That means that a client might change the Destination +Connection ID twice during connection establishment. Once a client has received +an Initial packet from the server, it MUST discard any packet it receives with a +different Source Connection ID. -If the packet is an Initial packet fully conforming with the specification, the -server proceeds with the handshake ({{handshake}}). This commits the server to -the version that the client selected. -If a server isn't currently accepting any new connections, it SHOULD send an -Initial packet containing a CONNECTION_CLOSE frame with error code -SERVER_BUSY. 
+### Tokens -If the packet is a 0-RTT packet, the server MAY buffer a limited number of these -packets in anticipation of a late-arriving Initial Packet. Clients are forbidden -from sending Handshake packets prior to receiving a server response, so servers -SHOULD ignore any such packets. +If the client has a token received in a NEW_TOKEN frame on a previous connection +to what it believes to be the same server, it can include that value in the +Token field of its Initial packet. -Servers MUST drop incoming packets under all other circumstances. +A token allows a server to correlate activity between connections, +specifically the connection where the token was issued and any connection +where it is used. Clients that want to break continuity of identity with a +server MAY discard tokens provided using the NEW_TOKEN frame. Tokens obtained +in Retry packets MUST NOT be discarded. -## Version Negotiation +A client SHOULD NOT reuse a token. Reusing a token allows connections to be +linked by entities on the network path (see {{migration-linkability}}). A +client MUST NOT reuse a token if it believes that its point of network +attachment has changed since the token was last used; that is, if there is a +change in its local IP address or network interface. A client needs to start +the connection process over if it migrates prior to completing the handshake. -Version negotiation ensures that client and server agree to a QUIC version -that is mutually supported. A server sends a Version Negotiation packet in -response to each packet that might initiate a new connection, see -{{packet-handling}} for details. +When a server receives an Initial packet with an address validation token, it +SHOULD attempt to validate it. If the token is invalid then the server SHOULD +proceed as if the client did not have a validated address, including potentially +sending a Retry. If the validation succeeds, the server SHOULD then allow the +handshake to proceed (see {{stateless-retry}}).
-The size of the first packet sent by a client will determine whether a server -sends a Version Negotiation packet. Clients that support multiple QUIC versions -SHOULD pad the first packet they send to the largest of the minimum packet sizes -across all versions they support. This ensures that the server responds if there -is a mutually supported version. +Note: -### Sending Version Negotiation Packets {#send-vn} +: The rationale for treating the client as unvalidated rather than discarding + the packet is that the client might have received the token in a previous + connection using the NEW_TOKEN frame, and if the server has lost state, it + might be unable to validate the token at all, leading to connection failure if + the packet is discarded. A server MAY encode tokens provided with NEW_TOKEN + frames and Retry packets differently, and validate the latter more strictly. -If the version selected by the client is not acceptable to the server, the -server responds with a Version Negotiation packet (see {{packet-version}}). -This includes a list of versions that the server will accept. +In a stateless design, a server can use encrypted and authenticated tokens to +pass information to clients that the server can later recover and use to +validate a client address. Tokens are not integrated into the cryptographic +handshake and so they are not authenticated. For instance, a client might be +able to reuse a token. To avoid attacks that exploit this property, a server +can limit its use of tokens to only the information needed to validate client +addresses. -This system allows a server to process packets with unsupported versions without -retaining state. Though either the Initial packet or the Version Negotiation -packet that is sent in response could be lost, the client will send new packets -until it successfully receives a response or it abandons the connection attempt.
+### Starting Packet Numbers -### Handling Version Negotiation Packets {#handle-vn} +The first Initial packet sent by either endpoint contains a packet number of +0. The packet number MUST increase monotonically thereafter. Initial packets +are in a different packet number space to other packets (see +{{packet-numbers}}). -When the client receives a Version Negotiation packet, it first checks that the -Destination and Source Connection ID fields match the Source and Destination -Connection ID fields in a packet that the client sent. If this check fails, the -packet MUST be discarded. -Once the Version Negotiation packet is determined to be valid, the client then -selects an acceptable protocol version from the list provided by the server. -The client then attempts to create a connection using that version. Though the -content of the Initial packet the client sends might not change in response to -version negotiation, a client MUST increase the packet number it uses on every -packet it sends. Packets MUST continue to use long headers and MUST include the -new negotiated protocol version. +### 0-RTT Packet Numbers {#retry-0rtt-pn} -The client MUST use the long header format and include its selected version on -all packets until it has 1-RTT keys and it has received a packet from the server -which is not a Version Negotiation packet. +Packet numbers for 0-RTT protected packets use the same space as 1-RTT protected +packets. -A client MUST NOT change the version it uses unless it is in response to a -Version Negotiation packet from the server. Once a client receives a packet -from the server which is not a Version Negotiation packet, it MUST discard other -Version Negotiation packets on the same connection. Similarly, a client MUST -ignore a Version Negotiation packet if it has already received and acted on a -Version Negotiation packet. 
+After a client receives a Retry or Version Negotiation packet, 0-RTT packets are +likely to have been lost or discarded by the server. A client MAY attempt to +resend data in 0-RTT packets after it sends a new Initial packet. -A client MUST ignore a Version Negotiation packet that lists the client's chosen -version. +A client MUST NOT reset the packet number it uses for 0-RTT packets. The keys +used to protect 0-RTT packets will not change as a result of responding to a +Retry or Version Negotiation packet unless the client also regenerates the +cryptographic handshake message. Sending packets with the same packet number in +that case is likely to compromise the packet protection for all 0-RTT packets +because the same key and nonce could be used to protect different content. -A client MAY attempt 0-RTT after receiving a Version Negotiation packet. A -client that sends additional 0-RTT packets MUST NOT reset the packet number to 0 -as a result, see {{retry-0rtt-pn}}. +Receiving a Retry or Version Negotiation packet, especially a Retry that changes +the connection ID used for subsequent packets, indicates a strong possibility +that 0-RTT packets could be lost. A client only receives acknowledgments for +its 0-RTT packets once the handshake is complete. Consequently, a server might +expect 0-RTT packets to start with a packet number of 0. Therefore, in +determining the length of the packet number encoding for 0-RTT packets, a client +MUST assume that all packets up to the current packet number are in flight, +starting from a packet number of 0. Thus, 0-RTT packets could need to use a +longer packet number encoding. -Version negotiation packets have no cryptographic protection. The result of the -negotiation MUST be revalidated as part of the cryptographic handshake (see -{{version-validation}}). +A client SHOULD instead generate a fresh cryptographic handshake message and +start packet numbers from 0. 
This ensures that new 0-RTT packets will not use +the same keys, avoiding any risk of key and nonce reuse; this also prevents +0-RTT packets from previous handshake attempts from being accepted as part of +the connection. -### Using Reserved Versions +### Minimum Packet Size -For a server to use a new version in the future, clients must correctly handle -unsupported versions. To help ensure this, a server SHOULD include a reserved -version (see {{versions}}) while generating a Version Negotiation packet. +The payload of a UDP datagram carrying the Initial packet MUST be expanded to at +least 1200 octets (see {{packetization}}), by adding PADDING frames to the +Initial packet and/or by combining the Initial packet with a 0-RTT packet (see +{{packet-coalesce}}). -The design of version negotiation permits a server to avoid maintaining state -for packets that it rejects in this fashion. The validation of version -negotiation (see {{version-validation}}) only validates the result of version -negotiation, which is the same no matter which reserved version was sent. -A server MAY therefore send different reserved version numbers in the Version -Negotiation Packet and in its transport parameters. -A client MAY send a packet using a reserved version number. This can be used to -solicit a list of supported versions from a server. +## Handshake Packet {#packet-handshake} +A Handshake packet uses long headers with a type value of 0x7D. It is +used to carry acknowledgments and cryptographic handshake messages from the +server and client. -## Cryptographic and Transport Handshake {#handshake} +A server sends its cryptographic handshake in one or more Handshake packets in +response to an Initial packet if it does not send a Retry packet. Once a client +has received a Handshake packet from a server, it uses Handshake packets to send +subsequent cryptographic handshake messages and acknowledgments to the server. 
-QUIC relies on a combined cryptographic and transport handshake to minimize -connection establishment latency. QUIC uses the CRYPTO frame {{frame-crypto}} -to transmit the cryptographic handshake. Version 0x00000001 of QUIC uses TLS -1.3 as described in {{QUIC-TLS}}; a different QUIC version number could indicate -that a different cryptographic handshake protocol is in use. +The Destination Connection ID field in a Handshake packet contains a connection +ID that is chosen by the recipient of the packet; the Source Connection ID +includes the connection ID that the sender of the packet wishes to use (see +{{connection-id-encoding}}). -QUIC provides reliable, ordered delivery of the cryptographic handshake -data. QUIC packet protection ensures confidentiality and integrity protection -that meets the requirements of the cryptographic handshake protocol: +The first Handshake packet sent by a server contains a packet number of 0. +Handshake packets are their own packet number space. Packet numbers are +incremented normally for other Handshake packets. -* authenticated key exchange, where +Servers MUST NOT send more than three times as many bytes as the number of bytes +received prior to verifying the client's address. Source addresses can be +verified through an address validation token (delivered via a Retry packet or +a NEW_TOKEN frame) or by processing any message from the client encrypted using +the Handshake keys. This limit exists to mitigate amplification attacks. - * a server is always authenticated, +In order to prevent this limit causing a handshake deadlock, the client SHOULD +always send a packet upon a handshake timeout, as described in +{{QUIC-RECOVERY}}. If the client has no data to retransmit and does not have +Handshake keys, it SHOULD send an Initial packet in a UDP datagram of at least +1200 octets. If the client has Handshake keys, it SHOULD send a Handshake +packet. 
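
The three-times sending limit described in the added text can be tracked with two counters. The following is a minimal Python sketch, not taken from the draft: the class `AmplificationLimiter` and its method names are hypothetical, and it assumes the limit is accounted in UDP payload bytes per datagram.

```python
class AmplificationLimiter:
    """Tracks the pre-validation sending budget of a server (illustrative only)."""

    def __init__(self) -> None:
        self.bytes_received = 0
        self.bytes_sent = 0
        self.address_validated = False

    def on_datagram_received(self, size: int) -> None:
        # Every byte received from the client raises the budget.
        self.bytes_received += size

    def on_address_validated(self) -> None:
        # For example, a valid token arrived or a Handshake-protected
        # message from the client was processed.
        self.address_validated = True

    def can_send(self, size: int) -> bool:
        # Before validation, total bytes sent must not exceed three times
        # the total bytes received.
        if self.address_validated:
            return True
        return self.bytes_sent + size <= 3 * self.bytes_received

    def on_datagram_sent(self, size: int) -> None:
        self.bytes_sent += size
```

With a single 1200-octet Initial datagram received, this sketch allows up to 3600 octets to be sent before the client's address is validated.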
- * a client is optionally authenticated, +The payload of this packet contains CRYPTO frames and could contain PADDING, or +ACK frames. Handshake packets MAY contain CONNECTION_CLOSE or APPLICATION_CLOSE +frames. Endpoints MUST treat receipt of Handshake packets with other frames as +a connection error. - * every connection produces distinct and unrelated keys, - * keying material is usable for packet protection for both 0-RTT and 1-RTT - packets, and +## Protected Packets {#packet-protected} - * 1-RTT keys have forward secrecy +All QUIC packets use packet protection. Packets that are protected with the +static handshake keys or the 0-RTT keys are sent with long headers; all packets +protected with 1-RTT keys are sent with short headers. The different packet +types explicitly indicate the encryption level and therefore the keys that are +used to remove packet protection. 0-RTT and 1-RTT protected packets share a +single packet number space. -* authenticated values for the transport parameters of the peer (see - {{transport-parameters}}) +Packets protected with handshake keys only use packet protection to ensure that +the sender of the packet is on the network path. This packet protection is not +effective confidentiality protection; any entity that receives the Initial +packet from a client can recover the keys necessary to remove packet protection +or to generate packets that will be successfully authenticated. -* authenticated confirmation of version negotiation (see {{version-validation}}) +Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality +and data origin authentication; the cryptographic handshake ensures that only +the communicating endpoints receive the corresponding keys. -* authenticated negotiation of an application protocol (TLS uses ALPN - {{?RFC7301}} for this purpose) +Packets protected with 0-RTT keys use a type value of 0x7C. 
The connection ID +fields for a 0-RTT packet MUST match the values used in the Initial packet +({{packet-initial}}). -* for the server, the ability to carry data that provides assurance that the - client can receive packets that are addressed with the transport address that - is claimed by the client (see {{address-validation}}) +The version field for protected packets is the current QUIC version. -The first CRYPTO frame MUST be sent in a single packet. Any second attempt -that is triggered by address validation MUST also be sent within a single -packet. This avoids having to reassemble a message from multiple packets. +The packet number field contains a packet number, which has additional +confidentiality protection that is applied after packet protection is applied +(see {{QUIC-TLS}} for details). The underlying packet number increases with +each packet sent, see {{packet-numbers}} for details. -The first client packet of the cryptographic handshake protocol MUST fit within -a 1232 octet QUIC packet payload. This includes overheads that reduce the space -available to the cryptographic handshake protocol. - -The CRYPTO frame can be sent in different packet number spaces. CRYPTO frames -in each packet number space carry a separate sequence of handshake data starting -from an offset of 0. - -## Example Handshake Flows - -Details of how TLS is integrated with QUIC are provided in {{QUIC-TLS}}, but -some examples are provided here. - -{{tls-1rtt-handshake}} provides an overview of the 1-RTT handshake. Each line -shows a QUIC packet with the packet type and packet number shown first, followed -by the frames that are typically contained in those packets. So, for instance -the first packet is of type Initial, with packet number 0, and contains a CRYPTO -frame carrying the ClientHello. 
- -Note that multiple QUIC packets -- even of different encryption levels -- may be -coalesced into a single UDP datagram (see {{packet-coalesce}}), and so this -handshake may consist of as few as 4 UDP datagrams, or any number more. For -instance, the server's first flight contains packets from the Initial encryption -level (obfuscation), the Handshake level, and "0.5-RTT data" from the server at -the 1-RTT encryption level. - -~~~~ -Client Server +The payload is protected using authenticated encryption. {{QUIC-TLS}} describes +packet protection in detail. After decryption, the plaintext consists of a +sequence of frames, as described in {{frames}}. -Initial[0]: CRYPTO[CH] -> - Initial[0]: CRYPTO[SH] ACK[0] - Handshake[0]: CRYPTO[EE, CERT, CV, FIN] - <- 1-RTT[0]: STREAM[1, "..."] +## Coalescing Packets {#packet-coalesce} -Initial[1]: ACK[0] -Handshake[0]: CRYPTO[FIN], ACK[0] -1-RTT[0]: STREAM[0, "..."], ACK[0] -> +A sender can coalesce multiple QUIC packets (typically a Cryptographic Handshake +packet and a Protected packet) into one UDP datagram. This can reduce the +number of UDP datagrams needed to send application data during the handshake and +immediately afterwards. It is not necessary for senders to coalesce +packets, though failing to do so will require sending a significantly +larger number of datagrams during the handshake. Receivers MUST +be able to process coalesced packets. - 1-RTT[1]: STREAM[55, "..."], ACK[0] - <- Handshake[1]: ACK[0] -~~~~ -{: #tls-1rtt-handshake title="Example 1-RTT Handshake"} +Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, +Handshake, 1-RTT) makes it more likely the receiver will be able to process all +the packets in a single pass. A packet with a short header does not include a +length, so it will always be the last packet included in a UDP datagram. +Senders MUST NOT coalesce QUIC packets with different Destination Connection +IDs into a single UDP datagram. 
Receivers SHOULD ignore any subsequent packets +with a different Destination Connection ID than the first packet in the +datagram. -{{tls-0rtt-handshake}} shows an example of a connection with a 0-RTT handshake -and a single packet of 0-RTT data. Note that as described in {{packet-numbers}}, -the server ACKs the 0-RTT data at the 1-RTT encryption level, and the client's -sequence numbers at the 1-RTT encryption level continue to increment from its -0-RTT packets. +Every QUIC packet that is coalesced into a single UDP datagram is separate and +complete. Though the values of some fields in the packet header might be +redundant, no fields are omitted. The receiver of coalesced QUIC packets MUST +individually process each QUIC packet and separately acknowledge them, as if +they were received as the payload of different UDP datagrams. If one or more +packets in a datagram cannot be processed yet (because the keys are not yet +available) or processing fails (decryption failure, unknown type, etc.), the +receiver MUST still attempt to process the remaining packets. The skipped +packets MAY either be discarded or buffered for later processing, just as if the +packets were received out-of-order in separate datagrams. -~~~~ -Client Server +Retry ({{packet-retry}}) and Version Negotiation ({{packet-version}}) packets +cannot be coalesced. -Initial[0]: CRYPTO[CH] -0-RTT[0]: STREAM[0, "..."] -> - Initial[0]: CRYPTO[SH] ACK[0] - Handshake[0] CRYPTO[EE, CERT, CV, FIN] - <- 1-RTT[0]: STREAM[1, "..."] ACK[0] +## Connection ID Encoding -Initial[1]: ACK[0] -0-RTT[1]: CRYPTO[EOED] -Handshake[0]: CRYPTO[FIN], ACK[0] -1-RTT[2]: STREAM[0, "..."] ACK[0] -> +A connection ID is used to ensure consistent routing of packets, as described in +{{connection-id}}. 
The long header contains two connection IDs: the Destination +Connection ID is chosen by the recipient of the packet and is used to provide +consistent routing; the Source Connection ID is used to set the Destination +Connection ID used by the peer. - 1-RTT[1]: STREAM[55, "..."], ACK[1,2] - <- Handshake[1]: ACK[0] -~~~~ -{: #tls-0rtt-handshake title="Example 0-RTT Handshake"} +During the handshake, packets with the long header are used to establish the +connection ID that each endpoint uses. Each endpoint uses the Source Connection +ID field to specify the connection ID that is used in the Destination Connection +ID field of packets being sent to them. Upon receiving a packet, each endpoint +sets the Destination Connection ID it sends to match the value of the Source +Connection ID that they receive. +During the handshake, a client can receive both a Retry and an Initial packet, +and thus be given two opportunities to update the Destination Connection ID it +sends. A client MUST only change the value it sends in the Destination +Connection ID in response to the first packet of each type it receives from the +server (Retry or Initial); a server MUST set its value based on the Initial +packet. Any additional changes are not permitted; if subsequent packets of +those types include a different Source Connection ID, they MUST be discarded. +This avoids problems that might arise from stateless processing of multiple +Initial packets producing different connection IDs. -## Transport Parameters +Short headers only include the Destination Connection ID and omit the explicit +length. The length of the Destination Connection ID field is expected to be +known to endpoints. -During connection establishment, both endpoints make authenticated declarations -of their transport parameters. These declarations are made unilaterally by each -endpoint. 
Endpoints are required to comply with the restrictions implied by -these parameters; the description of each parameter includes rules for its -handling. +Endpoints using a connection-ID based load balancer could agree with the load +balancer on a fixed or minimum length and on an encoding for connection IDs. +This fixed portion could encode an explicit length, which allows the entire +connection ID to vary in length and still be used by the load balancer. -The format of the transport parameters is the TransportParameters struct from -{{figure-transport-parameters}}. This is described using the presentation -language from Section 3 of {{!TLS13=RFC8446}}. +The very first packet sent by a client includes a random value for Destination +Connection ID. The same value MUST be used for all 0-RTT packets sent on that +connection ({{packet-protected}}). This randomized value is used to determine +the packet protection keys for Initial packets (see Section 5.2 of +{{QUIC-TLS}}). -~~~ - uint32 QuicVersion; +A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs +selected by the client, swapped to ensure correct routing toward the client. - enum { - initial_max_stream_data_bidi_local(0), - initial_max_data(1), - initial_max_bidi_streams(2), - idle_timeout(3), - preferred_address(4), - max_packet_size(5), - stateless_reset_token(6), - ack_delay_exponent(7), - initial_max_uni_streams(8), - disable_migration(9), - initial_max_stream_data_bidi_remote(10), - initial_max_stream_data_uni(11), - max_ack_delay(12), - original_connection_id(13), - (65535) - } TransportParameterId; +The connection ID can change over the lifetime of a connection, especially in +response to connection migration ({{migration}}). NEW_CONNECTION_ID frames +({{frame-new-connection-id}}) are used to provide new connection ID values. 
- struct { - TransportParameterId parameter; - opaque value<0..2^16-1>; - } TransportParameter; - struct { - select (Handshake.msg_type) { - case client_hello: - QuicVersion initial_version; +## Packet Numbers {#packet-numbers} - case encrypted_extensions: - QuicVersion negotiated_version; - QuicVersion supported_versions<4..2^8-4>; - }; - TransportParameter parameters<22..2^16-1>; - } TransportParameters; +The packet number is an integer in the range 0 to 2^62-1. The value is used in +determining the cryptographic nonce for packet protection. Each endpoint +maintains a separate packet number for sending and receiving. - struct { - enum { IPv4(4), IPv6(6), (15) } ipVersion; - opaque ipAddress<4..2^8-1>; - uint16 port; - opaque connectionId<0..18>; - opaque statelessResetToken[16]; - } PreferredAddress; -~~~ -{: #figure-transport-parameters title="Definition of TransportParameters"} +Packet numbers are divided into 3 spaces in QUIC: -The `extension_data` field of the quic_transport_parameters extension defined in -{{QUIC-TLS}} contains a TransportParameters value. TLS encoding rules are -therefore used to encode the transport parameters. +- Initial space: All Initial packets {{packet-initial}} are in this space. +- Handshake space: All Handshake packets {{packet-handshake}} are in this space. +- Application data space: All 0-RTT and 1-RTT encrypted packets + {{packet-protected}} are in this space. -QUIC encodes transport parameters into a sequence of octets, which are then -included in the cryptographic handshake. Once the handshake completes, the -transport parameters declared by the peer are available. Each endpoint -validates the value provided by its peer. In particular, version negotiation -MUST be validated (see {{version-validation}}) before the connection -establishment is considered properly complete. +As described in {{QUIC-TLS}}, each packet type uses different protection keys. 
-Definitions for each of the defined transport parameters are included in -{{transport-parameter-definitions}}. Any given parameter MUST appear -at most once in a given transport parameters extension. An endpoint MUST -treat receipt of duplicate transport parameters as a connection error of -type TRANSPORT_PARAMETER_ERROR. +Conceptually, a packet number space is the context in which a packet can be +processed and acknowledged. Initial packets can only be sent with Initial +packet protection keys and acknowledged in packets which are also Initial +packets. Similarly, Handshake packets are sent at the Handshake encryption +level and can only be acknowledged in Handshake packets. +This enforces cryptographic separation between the data sent in the different +packet sequence number spaces. Each packet number space starts at packet number +0. Subsequent packets sent in the same packet number space MUST increase the +packet number by at least one. -### Transport Parameter Definitions +0-RTT and 1-RTT data exist in the same packet number space to make loss recovery +algorithms easier to implement between the two packet types. -An endpoint MAY use the following transport parameters: +A QUIC endpoint MUST NOT reuse a packet number within the same packet number +space in one connection (that is, under the same cryptographic keys). If the +packet number for sending reaches 2^62 - 1, the sender MUST close the connection +without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY +send a Stateless Reset ({{stateless-reset}}) in response to further packets that +it receives. -initial_max_data (0x0001): +In the QUIC long and short packet headers, the number of bits required to +represent the packet number is reduced by including only a variable number of +the least significant bits of the packet number. One or two of the most +significant bits of the first octet determine how many bits of the packet +number are provided, as shown in {{pn-encodings}}. 
-: The initial maximum data parameter contains the initial value for the maximum - amount of data that can be sent on the connection. This parameter is encoded - as an unsigned 32-bit integer in units of octets. This is equivalent to - sending a MAX_DATA ({{frame-max-data}}) for the connection immediately after - completing the handshake. If the transport parameter is absent, the connection - starts with a flow control limit of 0. +| First octet pattern | Encoded Length | Bits Present | +|:--------------------|:---------------|:-------------| +| 0b0xxxxxxx | 1 octet | 7 | +| 0b10xxxxxx | 2 | 14 | +| 0b11xxxxxx | 4 | 30 | +{: #pn-encodings title="Packet Number Encodings for Packet Headers"} -initial_max_bidi_streams (0x0002): +Note that these encodings are similar to those in {{integer-encoding}}, but +use different values. -: The initial maximum bidirectional streams parameter contains the initial - maximum number of bidirectional streams the peer may initiate, encoded as an - unsigned 16-bit integer. If this parameter is absent or zero, bidirectional - streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this - parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) - immediately after completing the handshake containing the corresponding Stream - ID. For example, a value of 0x05 would be equivalent to receiving a - MAX_STREAM_ID containing 16 when received by a client or 17 when received by a - server. +The encoded packet number is protected as described in Section 5.3 +{{QUIC-TLS}}. Protection of the packet number is removed prior to recovering the +full packet number. The full packet number is reconstructed at the receiver +based on the number of significant bits present, the value of those bits, and +the largest packet number received on a successfully authenticated +packet. Recovering the full packet number is necessary to successfully remove +packet protection. 
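
The first-octet patterns in the packet number encoding table can be read with two bit tests. The following is a minimal Python sketch under stated assumptions: packet number protection has already been removed, the field is in network byte order, and the function name `read_truncated_pn` is hypothetical rather than something defined by the draft.

```python
from typing import Tuple


def read_truncated_pn(buf: bytes) -> Tuple[int, int, int]:
    """Return (truncated_value, bits_present, octets_consumed)."""
    if not buf:
        raise ValueError("empty packet number field")
    first = buf[0]
    if first & 0x80 == 0:          # 0b0xxxxxxx: 1 octet, 7 bits
        length, bits = 1, 7
    elif first & 0xC0 == 0x80:     # 0b10xxxxxx: 2 octets, 14 bits
        length, bits = 2, 14
    else:                          # 0b11xxxxxx: 4 octets, 30 bits
        length, bits = 4, 30
    if len(buf) < length:
        raise ValueError("truncated packet number field")
    # Drop the pattern bits and keep only the packet number bits.
    value = int.from_bytes(buf[:length], "big") & ((1 << bits) - 1)
    return value, bits, length
```

The returned bit count is what a receiver would feed into full packet number recovery.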
-initial_max_uni_streams (0x0008): +Once packet number protection is removed, the packet number is decoded by +finding the packet number value that is closest to the next expected packet. +The next expected packet is the highest received packet number plus one. For +example, if the highest successfully authenticated packet had a packet number of +0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as +0xaa8309b3. +Example pseudo-code for packet number decoding can be found in +{{sample-packet-number-decoding}}. -: The initial maximum unidirectional streams parameter contains the initial - maximum number of unidirectional streams the peer may initiate, encoded as an - unsigned 16-bit integer. If this parameter is absent or zero, unidirectional - streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this - parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) - immediately after completing the handshake containing the corresponding Stream - ID. For example, a value of 0x05 would be equivalent to receiving a - MAX_STREAM_ID containing 18 when received by a client or 19 when received by a - server. +The sender MUST use a packet number size able to represent more than twice as +large a range than the difference between the largest acknowledged packet and +packet number being sent. A peer receiving the packet will then correctly +decode the packet number, unless the packet is delayed in transit such that it +arrives after many higher-numbered packets have been received. An endpoint +SHOULD use a large enough packet number encoding to allow the packet number to +be recovered even if the packet arrives after packets that are sent afterwards. -idle_timeout (0x0003): +As a result, the size of the packet number encoding is at least one more than +the base 2 logarithm of the number of contiguous unacknowledged packet numbers, +including the new packet. 
-: The idle timeout is a value in seconds that is encoded as an unsigned 16-bit - integer. If this parameter is absent or zero then the idle timeout is - disabled. +For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, +sending a packet with a number of 0x6b2d79 requires a packet number encoding +with 14 bits or more; whereas the 30-bit packet number encoding is needed to +send a packet with a number of 0x6bc107. -max_packet_size (0x0005): +A receiver MUST discard a newly unprotected packet unless it is certain that it +has not processed another packet with the same packet number from the same +packet number space. Duplicate suppression MUST happen after removing packet +protection for the reasons described in Section 9.3 of {{QUIC-TLS}}. An +efficient algorithm for duplicate suppression can be found in Section 3.4.3 of +{{?RFC2406}}. -: The maximum packet size parameter places a limit on the size of packets that - the endpoint is willing to receive, encoded as an unsigned 16-bit integer. - This indicates that packets larger than this limit will be dropped. The - default for this parameter is the maximum permitted UDP payload of 65527. - Values below 1200 are invalid. This limit only applies to protected packets - ({{packet-protected}}). +A Version Negotiation packet ({{packet-version}}) does not include a packet +number. The Retry packet ({{packet-retry}}) has special rules for populating +the packet number field. -ack_delay_exponent (0x0007): -: An 8-bit unsigned integer value indicating an exponent used to decode the ACK - Delay field in the ACK frame, see {{frame-ack}}. If this value is absent, a - default value of 3 is assumed (indicating a multiplier of 8). The default - value is also used for ACK frames that are sent in Initial and Handshake - packets. Values above 20 are invalid. 
+# Frames and Frame Types {#frames} -disable_migration (0x0009): +The payload of all packets, after removing packet protection, consists of a +sequence of frames, as shown in {{packet-frames}}. Version Negotiation and +Stateless Reset do not contain frames. -: The endpoint does not support connection migration ({{migration}}). Peers MUST - NOT send any packets, including probing packets ({{probing}}), from a local - address other than that used to perform the handshake. This parameter is a - zero-length value. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame 1 (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame 2 (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame N (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #packet-frames title="QUIC Payload"} -max_ack_delay (0x000c): +QUIC payloads MUST contain at least one frame, and MAY contain multiple +frames and multiple frame types. -: An 8 bit unsigned integer value indicating the maximum amount of time in - milliseconds by which it will delay sending of acknowledgments. If this - value is absent, a default of 25 milliseconds is assumed. +Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC packet +boundary. Each frame begins with a Frame Type, indicating its type, followed by +additional type-dependent fields: -Either peer MAY advertise an initial value for the flow control on each type of -stream on which they might receive data. Each of the following transport -parameters is encoded as an unsigned 32-bit integer in units of octets: +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame Type (i) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Type-Dependent Fields (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #frame-layout title="Generic Frame Layout"} -initial_max_stream_data_bidi_local (0x0000): +The frame types defined in this specification are listed in {{frame-types}}. +The Frame Type in STREAM frames is used to carry other frame-specific flags. +For all other frames, the Frame Type field simply identifies the frame. These +frames are explained in more detail as they are referenced later in the +document. -: The initial stream maximum data for bidirectional, locally-initiated streams - parameter contains the initial flow control limit for newly created - bidirectional streams opened by the endpoint that sets the transport - parameter. In client transport parameters, this applies to streams with an - identifier ending in 0x0; in server transport parameters, this applies to - streams ending in 0x1. +| Type Value | Frame Type Name | Definition | +|:------------|:---------------------|:-------------------------------| +| 0x00 | PADDING | {{frame-padding}} | +| 0x01 | RST_STREAM | {{frame-rst-stream}} | +| 0x02 | CONNECTION_CLOSE | {{frame-connection-close}} | +| 0x03 | APPLICATION_CLOSE | {{frame-application-close}} | +| 0x04 | MAX_DATA | {{frame-max-data}} | +| 0x05 | MAX_STREAM_DATA | {{frame-max-stream-data}} | +| 0x06 | MAX_STREAM_ID | {{frame-max-stream-id}} | +| 0x07 | PING | {{frame-ping}} | +| 0x08 | BLOCKED | {{frame-blocked}} | +| 0x09 | STREAM_BLOCKED | {{frame-stream-blocked}} | +| 0x0a | STREAM_ID_BLOCKED | {{frame-stream-id-blocked}} | +| 0x0b | NEW_CONNECTION_ID | {{frame-new-connection-id}} | +| 0x0c | STOP_SENDING | {{frame-stop-sending}} | +| 0x0d | RETIRE_CONNECTION_ID | {{frame-retire-connection-id}} | +| 0x0e | PATH_CHALLENGE | {{frame-path-challenge}} | +| 0x0f | PATH_RESPONSE | {{frame-path-response}} | +| 0x10 - 0x17 | STREAM | {{frame-stream}} | +| 0x18 | CRYPTO | 
{{frame-crypto}} | +| 0x19 | NEW_TOKEN | {{frame-new-token}} | +| 0x1a - 0x1b | ACK | {{frame-ack}} | +{: #frame-types title="Frame Types"} -initial_max_stream_data_bidi_remote (0x000a): +All QUIC frames are idempotent. That is, a valid frame does not cause +undesirable side effects or errors when received more than once. -: The initial stream maximum data for bidirectional, peer-initiated streams - parameter contains the initial flow control limit for newly created - bidirectional streams opened by the endpoint that receives the transport - parameter. In client transport parameters, this applies to streams with an - identifier ending in 0x1; in server transport parameters, this applies to - streams ending in 0x0. +The Frame Type field uses a variable length integer encoding (see +{{integer-encoding}}) with one exception. To ensure simple and efficient +implementations of frame parsing, a frame type MUST use the shortest possible +encoding. Though a two-, four- or eight-octet encoding of the frame types +defined in this document is possible, the Frame Type field for these frames is +encoded on a single octet. For instance, though 0x4007 is a legitimate +two-octet encoding for a variable-length integer with a value of 7, PING frames +are always encoded as a single octet with the value 0x07. An endpoint MUST +treat the receipt of a frame type that uses a longer encoding than necessary as +a connection error of type PROTOCOL_VIOLATION. -initial_max_stream_data_uni (0x000b): -: The initial stream maximum data for unidirectional streams parameter contains - the initial flow control limit for newly created unidirectional streams opened - by the endpoint that receives the transport parameter. In client transport - parameters, this applies to streams with an identifier ending in 0x3; in - server transport parameters, this applies to streams ending in 0x2. 
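
The shortest-encoding rule for the Frame Type field can be checked while parsing. The following is a minimal Python sketch that assumes the variable-length integer layout of {{integer-encoding}}: the two most significant bits of the first octet select a length of 1, 2, 4 or 8 octets, and the remaining bits carry the value. The names `decode_varint`, `read_frame_type` and `ProtocolViolation` are hypothetical.

```python
from typing import Tuple


class ProtocolViolation(Exception):
    """Stands in for a connection error of type PROTOCOL_VIOLATION."""


def decode_varint(buf: bytes) -> Tuple[int, int]:
    """Return (value, octets_consumed) for a variable-length integer."""
    if not buf:
        raise ValueError("empty buffer")
    length = 1 << (buf[0] >> 6)            # 1, 2, 4 or 8 octets
    if len(buf) < length:
        raise ValueError("truncated integer")
    usable_bits = 8 * length - 2           # 6, 14, 30 or 62 bits
    value = int.from_bytes(buf[:length], "big") & ((1 << usable_bits) - 1)
    return value, length


def minimal_varint_length(value: int) -> int:
    """Smallest encoding length, in octets, that can hold the value."""
    if value < (1 << 6):
        return 1
    if value < (1 << 14):
        return 2
    if value < (1 << 30):
        return 4
    return 8


def read_frame_type(buf: bytes) -> Tuple[int, int]:
    """Decode a Frame Type and reject encodings longer than necessary."""
    value, length = decode_varint(buf)
    if length != minimal_varint_length(value):
        raise ProtocolViolation("frame type not encoded in shortest form")
    return value, length
```

In this sketch the single octet 0x07 is accepted as a PING frame type, while the two-octet sequence 0x4007 decodes to the same value but is rejected.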
+## Extension Frames -If present, transport parameters that set initial stream flow control limits are -equivalent to sending a MAX_STREAM_DATA frame ({{frame-max-stream-data}}) on -every stream of the corresponding type immediately after opening. If the -transport parameter is absent, streams of that type start with a flow control -limit of 0. +QUIC frames do not use a self-describing encoding. An endpoint therefore needs +to understand the syntax of all frames before it can successfully process a +packet. This allows for efficient encoding of frames, but it means that an +endpoint cannot send a frame of a type that is unknown to its peer. -A server MUST include the original_connection_id transport parameter if it sent -a Retry packet: +An extension to QUIC that wishes to use a new type of frame MUST first ensure +that a peer is able to understand the frame. An endpoint can use a transport +parameter to signal its willingness to receive one or more extension frame types +with the one transport parameter. -original_connection_id (0x000d): +Extension frames MUST be congestion controlled and MUST cause an ACK frame to +be sent. The exception is extension frames that replace or supplement the ACK +frame. Extension frames are not included in flow control unless specified +in the extension. -: The value of the Destination Connection ID field from the first Initial packet - sent by the client. This transport parameter is only sent by the server. +An IANA registry is used to manage the assignment of frame types, see +{{iana-frames}}. -A server MAY include the following transport parameters: -stateless_reset_token (0x0006): +# Life of a Connection -: The Stateless Reset Token is used in verifying a stateless reset, see - {{stateless-reset}}. This parameter is a sequence of 16 octets. +A QUIC connection is a single conversation between two QUIC endpoints. 
QUIC's +connection establishment intertwines version negotiation with the cryptographic +and transport handshakes to reduce connection establishment latency, as +described in {{handshake}}. Once established, a connection may migrate to a +different IP or port at either endpoint, due to NAT rebinding or mobility, as +described in {{migration}}. Finally, a connection may be terminated by either +endpoint, as described in {{termination}}. -preferred_address (0x0004): +## Connection ID -: The server's Preferred Address is used to effect a change in server address at - the end of the handshake, as described in {{preferred-address}}. +Each connection possesses a set of identifiers, any of which could be used to +distinguish it from other connections. Connection IDs are selected +independently in each direction. Each Connection ID has an associated sequence +number to assist in deduplicating messages. -A client MUST NOT include a stateless reset token or a preferred address. A -server MUST treat receipt of either transport parameter as a connection error of -type TRANSPORT_PARAMETER_ERROR. +The primary function of a connection ID is to ensure that changes in addressing +at lower protocol layers (UDP, IP, and below) don't cause packets for a QUIC +connection to be delivered to the wrong endpoint. Each endpoint selects +connection IDs using an implementation-specific (and perhaps +deployment-specific) method which will allow packets with that connection ID to +be routed back to the endpoint and identified by the endpoint upon receipt. +Connection IDs MUST NOT contain any information that can be used to correlate +them with other connection IDs for the same connection. As a trivial example, +this means the same connection ID MUST NOT be issued more than once on the same +connection. 
-### Values of Transport Parameters for 0-RTT {#zerortt-parameters} +A zero-length connection ID MAY be used when the connection ID is not needed for +routing and the address/port tuple of packets is sufficient to identify a +connection. An endpoint whose peer has selected a zero-length connection ID MUST +continue to use a zero-length connection ID for the lifetime of the connection +and MUST NOT send packets from any other local address. -A client that attempts to send 0-RTT data MUST remember the transport parameters -used by the server. The transport parameters that the server advertises during -connection establishment apply to all connections that are resumed using the -keying material established during that handshake. Remembered transport -parameters apply to the new connection until the handshake completes and new -transport parameters from the server can be provided. +When an endpoint has requested a non-zero-length connection ID, it needs to +ensure that the peer has a supply of connection IDs from which to choose for +packets sent to the endpoint. These connection IDs are supplied by the endpoint +using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). -A server can remember the transport parameters that it advertised, or store an -integrity-protected copy of the values in the ticket and recover the information -when accepting 0-RTT data. A server uses the transport parameters in -determining whether to accept 0-RTT data. -A server MAY accept 0-RTT and subsequently provide different values for -transport parameters for use in the new connection. If 0-RTT data is accepted -by the server, the server MUST NOT reduce any limits or alter any values that -might be violated by the client with its 0-RTT data. 
In particular, a server -that accepts 0-RTT data MUST NOT set values for initial_max_data, -initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, and -initial_max_stream_data_uni that are smaller than the remembered value of those -parameters. Similarly, a server MUST NOT reduce the value of -initial_max_bidi_streams or initial_max_uni_streams. +### Issuing Connection IDs -Omitting or setting a zero value for certain transport parameters can result in -0-RTT data being enabled, but not usable. The applicable subset of transport -parameters that permit sending of application data SHOULD be set to non-zero -values for 0-RTT. This includes initial_max_data and either -initial_max_bidi_streams and initial_max_stream_data_bidi_remote, or -initial_max_uni_streams and initial_max_stream_data_uni. +The initial connection ID issued by an endpoint is the Source Connection ID +during the handshake. The sequence number of the initial connection ID is 0. If +the preferred_address transport parameter is sent, the sequence number of the +supplied connection ID is 1. Subsequent connection IDs are communicated to the +peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}), and the +sequence number on each newly-issued connection ID MUST increase by 1. The +connection ID randomly selected by the client in the Initial packet and any +connection ID provided by a Reset packet are not assigned sequence numbers +unless a server opts to retain them as its initial connection ID. -The value of the server's previous preferred_address MUST NOT be used when -establishing a new connection; rather, the client should wait to observe the -server's new preferred_address value in the handshake. +When an endpoint issues a connection ID, it MUST accept packets that carry this +connection ID for the duration of the connection or until its peer invalidates +the connection ID via a RETIRE_CONNECTION_ID frame +({{frame-retire-connection-id}}). 
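The sequence-number bookkeeping for issued connection IDs might look like the sketch below. It is illustrative only: the `ConnectionIdIssuer` class and its methods are hypothetical, random connection IDs are just one possible implementation-specific choice, and it assumes that RETIRE_CONNECTION_ID identifies a connection ID by its sequence number.

```python
import os
from typing import Dict, Optional, Tuple


class ConnectionIdIssuer:
    """Tracks the connection IDs an endpoint has issued to its peer."""

    def __init__(self, initial_cid: bytes,
                 preferred_address_cid: Optional[bytes] = None) -> None:
        # The handshake Source Connection ID has sequence number 0.
        self._active: Dict[int, bytes] = {0: initial_cid}
        self._ever_issued = {initial_cid}
        self._next_seq = 1
        if preferred_address_cid is not None:
            # A connection ID sent in preferred_address has sequence number 1.
            self._active[1] = preferred_address_cid
            self._ever_issued.add(preferred_address_cid)
            self._next_seq = 2

    def issue(self, length: int = 8) -> Tuple[int, bytes]:
        """Create a connection ID to send in a NEW_CONNECTION_ID frame."""
        cid = os.urandom(length)
        while cid in self._ever_issued:  # never issue the same ID twice
            cid = os.urandom(length)
        seq = self._next_seq
        self._next_seq += 1  # each newly issued ID increases by 1
        self._active[seq] = cid
        self._ever_issued.add(cid)
        return seq, cid

    def retire(self, seq: int) -> None:
        """Handle a RETIRE_CONNECTION_ID frame from the peer."""
        self._active.pop(seq, None)

    def accepts(self, cid: bytes) -> bool:
        """Packets carrying an issued, unretired connection ID are accepted."""
        return cid in self._active.values()
```

An endpoint that sent a preferred_address would see its first call to `issue()` return sequence number 2; without one, the first call returns 1.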
-A server MUST reject 0-RTT data or even abort a handshake if the implied values -for transport parameters cannot be supported. +An endpoint SHOULD ensure that its peer has a sufficient number of available and +unused connection IDs. While each endpoint independently chooses how many +connection IDs to issue, endpoints SHOULD provide and maintain at least eight +connection IDs. The endpoint can do this by always supplying a new connection +ID when a connection ID is retired by its peer or when the endpoint receives a +packet with a previously unused connection ID. Endpoints that initiate +migration and require non-zero-length connection IDs SHOULD provide their peers +with new connection IDs before migration, or risk the peer closing the +connection. -### New Transport Parameters +### Consuming and Retiring Connection IDs {#retiring-cids} -New transport parameters can be used to negotiate new protocol behavior. An -endpoint MUST ignore transport parameters that it does not support. Absence of -a transport parameter therefore disables any optional protocol feature that is -negotiated using the parameter. +An endpoint can change the connection ID it uses for a peer to another available +one at any time during the connection. An endpoint consumes connection IDs in +response to a migrating peer, see {{migration-linkability}} for more. -New transport parameters can be registered according to the rules in -{{iana-transport-parameters}}. +An endpoint maintains a set of connection IDs received from its peer, any of +which it can use when sending packets. When the endpoint wishes to remove a +connection ID from use, it sends a RETIRE_CONNECTION_ID frame to its peer, +indicating that the peer might bring a new connection ID into circulation using +the NEW_CONNECTION_ID frame. +An endpoint that retires a connection ID can retain knowledge of that connection +ID for a period of time after sending the RETIRE_CONNECTION_ID frame, or until +that frame is acknowledged. 
-### Version Negotiation Validation {#version-validation} +As discussed in {{migration-linkability}}, each connection ID MUST be used on +packets sent from only one local address. An endpoint that migrates away from a +local address SHOULD retire all connection IDs used on that address once it no +longer plans to use that address. -Though the cryptographic handshake has integrity protection, two forms of QUIC -version downgrade are possible. In the first, an attacker replaces the QUIC -version in the Initial packet. In the second, a fake Version Negotiation packet -is sent by an attacker. To protect against these attacks, the transport -parameters include three fields that encode version information. These -parameters are used to retroactively authenticate the choice of version (see -{{version-negotiation}}). -The cryptographic handshake provides integrity protection for the negotiated -version as part of the transport parameters (see {{transport-parameters}}). As -a result, attacks on version negotiation by an attacker can be detected. +## Matching Packets to Connections {#packet-handling} -The client includes the initial_version field in its transport parameters. The -initial_version is the version that the client initially attempted to use. If -the server did not send a Version Negotiation packet {{packet-version}}, this -will be identical to the negotiated_version field in the server transport -parameters. +Incoming packets are classified on receipt. Packets can either be associated +with an existing connection, or - for servers - potentially create a new +connection. -A server that processes all packets in a stateful fashion can remember how -version negotiation was performed and validate the initial_version value. +Hosts try to associate a packet with an existing connection. If the packet has a +Destination Connection ID corresponding to an existing connection, QUIC +processes that packet accordingly. 
Note that more than one connection ID can be +associated with a connection; see {{connection-id}}. -A server that does not maintain state for every packet it receives (i.e., a -stateless server) uses a different process. If the initial_version matches the -version of QUIC that is in use, a stateless server can accept the value. +If the Destination Connection ID is zero length and the packet matches the +address/port tuple of a connection where the host did not require connection +IDs, QUIC processes the packet as part of that connection. Endpoints MUST drop +packets with zero-length Destination Connection ID fields if they do not +correspond to a single connection. -If the initial_version is different from the version of QUIC that is in use, a -stateless server MUST check that it would have sent a Version Negotiation packet -if it had received a packet with the indicated initial_version. If a server -would have accepted the version included in the initial_version and the value -differs from the QUIC version that is in use, the server MUST terminate the -connection with a VERSION_NEGOTIATION_ERROR error. +Endpoints SHOULD send a Stateless Reset ({{stateless-reset}}) for any packets +that cannot be attributed to an existing connection. -The server includes both the version of QUIC that is in use and a list of the -QUIC versions that the server supports. -The negotiated_version field is the version that is in use. This MUST be set by -the server to the value that is on the Initial packet that it accepts (not an -Initial packet that triggers a Retry or Version Negotiation packet). A client -that receives a negotiated_version that does not match the version of QUIC that -is in use MUST terminate the connection with a VERSION_NEGOTIATION_ERROR error -code. +### Client Packet Handling {#client-pkt-handling} -The server includes a list of versions that it would send in any version -negotiation packet ({{packet-version}}) in the supported_versions field. 
The -server populates this field even if it did not send a version negotiation -packet. +Valid packets sent to clients always include a Destination Connection ID that +matches the value the client selects. Clients that choose to receive +zero-length connection IDs can use the address/port tuple to identify a +connection. Packets that don't match an existing connection are discarded. -The client validates that the negotiated_version is included in the -supported_versions list and - if version negotiation was performed - that it -would have selected the negotiated version. A client MUST terminate the -connection with a VERSION_NEGOTIATION_ERROR error code if the current QUIC -version is not listed in the supported_versions list. A client MUST terminate -with a VERSION_NEGOTIATION_ERROR error code if version negotiation occurred but -it would have selected a different version based on the value of the -supported_versions list. +Due to packet reordering or loss, clients might receive packets for a connection +that are encrypted with a key it has not yet computed. Clients MAY drop these +packets, or MAY buffer them in anticipation of later packets that allow it to +compute the key. -When an endpoint accepts multiple QUIC versions, it can potentially interpret -transport parameters as they are defined by any of the QUIC versions it -supports. The version field in the QUIC packet header is authenticated using -transport parameters. The position and the format of the version fields in -transport parameters MUST either be identical across different QUIC versions, or -be unambiguously different to ensure no confusion about their interpretation. -One way that a new format could be introduced is to define a TLS extension with -a different codepoint. +If a client receives a packet that has an unsupported version, it MUST discard +that packet. 
-## Stateless Retries {#stateless-retry} +### Server Packet Handling {#server-pkt-handling} -A server can process an Initial packet from a client without committing any -state. This allows a server to perform address validation -({{address-validation}}), or to defer connection establishment costs. +If a server receives a packet that has an unsupported version, but the packet is +sufficiently large to initiate a new connection for any version supported by the +server, it SHOULD send a Version Negotiation packet as described in +{{send-vn}}. Servers MAY rate control these packets to avoid storms of Version +Negotiation packets. -A server that generates a response to an Initial packet without retaining -connection state MUST use the Retry packet ({{packet-retry}}). This packet -causes a client to restart the connection attempt and includes the token in the -new Initial packet ({{packet-initial}}) to prove source address ownership. +The first packet for an unsupported version can use different semantics and +encodings for any version-specific field. In particular, different packet +protection keys might be used for different versions. Servers that do not +support a particular version are unlikely to be able to decrypt the payload of +the packet. Servers SHOULD NOT attempt to decode or decrypt a packet from an +unknown version, but instead send a Version Negotiation packet, provided that +the packet is sufficiently long. +Servers MUST drop other packets that contain unsupported versions. -## Using Explicit Congestion Notification {#using-ecn} +Packets with a supported version, or no version field, are matched to a +connection as described in {{packet-handling}}. If not matched, the server +continues below. -QUIC endpoints use Explicit Congestion Notification (ECN) {{!RFC3168}} to detect -and respond to network congestion. ECN allows a network node to indicate -congestion in the network by setting a codepoint in the IP header of a packet -instead of dropping it. 
Endpoints react to congestion by reducing their sending -rate in response, as described in {{QUIC-RECOVERY}}. +If the packet is an Initial packet fully conforming with the specification, the +server proceeds with the handshake ({{handshake}}). This commits the server to +the version that the client selected. -To use ECN, QUIC endpoints first determine whether a path supports ECN marking -and the peer is able to access the ECN codepoint in the IP header. A network -path does not support ECN if ECN marked packets get dropped or ECN markings are -rewritten on the path. An endpoint verifies the path, both during connection -establishment and when migrating to a new path (see {{migration}}). +If a server isn't currently accepting any new connections, it SHOULD send an +Initial packet containing a CONNECTION_CLOSE frame with error code +SERVER_BUSY. -Each endpoint independently verifies and enables use of ECN by setting the IP -header ECN codepoint to ECN Capable Transport (ECT) for the path from it to the -other peer. Even if ECN is not used on the path to the peer, the endpoint MUST -provide feedback about ECN markings received (if accessible). +If the packet is a 0-RTT packet, the server MAY buffer a limited number of these +packets in anticipation of a late-arriving Initial Packet. Clients are forbidden +from sending Handshake packets prior to receiving a server response, so servers +SHOULD ignore any such packets. -To verify both that a path supports ECN and the peer can provide ECN feedback, -an endpoint MUST set the ECT(0) codepoint in the IP header of all outgoing -packets {{!RFC8311}}. +Servers MUST drop incoming packets under all other circumstances. -If an ECT codepoint set in the IP header is not corrupted by a network device, -then a received packet contains either the codepoint sent by the peer or the -Congestion Experienced (CE) codepoint set by a network device that is -experiencing congestion. 
+## Version Negotiation -On receiving a packet with an ECT or CE codepoint, an endpoint that can access -the IP ECN codepoints increases the corresponding ECT(0), ECT(1), or CE count, -and includes these counters in subsequent (see {{processing-and-ack}}) ACK -frames (see {{frame-ack}}). +Version negotiation ensures that client and server agree to a QUIC version +that is mutually supported. A server sends a Version Negotiation packet in +response to each packet that might initiate a new connection, see +{{packet-handling}} for details. -A packet detected by a receiver as a duplicate does not affect the receiver's -local ECN codepoint counts; see ({{security-ecn}}) for relevant security -concerns. +The size of the first packet sent by a client will determine whether a server +sends a Version Negotiation packet. Clients that support multiple QUIC versions +SHOULD pad the first packet they send to the largest of the minimum packet sizes +across all versions they support. This ensures that the server responds if there +is a mutually supported version. -If an endpoint receives a packet without an ECT or CE codepoint, it responds per -{{processing-and-ack}} with an ACK frame. +### Sending Version Negotiation Packets {#send-vn} -If an endpoint does not have access to received ECN codepoints, it acknowledges -received packets per {{processing-and-ack}} with an ACK frame. +If the version selected by the client is not acceptable to the server, the +server responds with a Version Negotiation packet (see {{packet-version}}). +This includes a list of versions that the server will accept. -If a packet sent with an ECT codepoint is newly acknowledged by the peer in an -ACK frame, the endpoint stops setting ECT codepoints in subsequent packets, with -the expectation that either the network or the peer no longer supports ECN. +This system allows a server to process packets with unsupported versions without +retaining state. 
Though either the Initial packet or the Version Negotiation +packet that is sent in response could be lost, the client will send new packets +until it successfully receives a response or it abandons the connection attempt. -To protect the connection from arbitrary corruption of ECN codepoints by the -network, an endpoint verifies the following when an ACK frame is received: -* The increase in ECT(0) and ECT(1) counters MUST be at least the number of - packets newly acknowledged that were sent with the corresponding codepoint. +### Handling Version Negotiation Packets {#handle-vn} -* The total increase in ECT(0), ECT(1), and CE counters reported in the ACK - frame MUST be at least the total number of packets newly acknowledged in this - ACK frame. +When the client receives a Version Negotiation packet, it first checks that the +Destination and Source Connection ID fields match the Source and Destination +Connection ID fields in a packet that the client sent. If this check fails, the +packet MUST be discarded. -An endpoint could miss acknowledgements for a packet when ACK frames are lost. -It is therefore possible for the total increase in ECT(0), ECT(1), and CE -counters to be greater than the number of packets acknowledged in an ACK frame. -When this happens, the local reference counts MUST be increased to match the -counters in the ACK frame. +Once the Version Negotiation packet is determined to be valid, the client then +selects an acceptable protocol version from the list provided by the server. +The client then attempts to create a connection using that version. Though the +content of the Initial packet the client sends might not change in response to +version negotiation, a client MUST increase the packet number it uses on every +packet it sends. Packets MUST continue to use long headers and MUST include the +new negotiated protocol version. 
-Upon successful verification, an endpoint continues to set ECT codepoints in -subsequent packets with the expectation that the path is ECN-capable. +The client MUST use the long header format and include its selected version on +all packets until it has 1-RTT keys and it has received a packet from the server +which is not a Version Negotiation packet. -If verification fails, then the endpoint ceases setting ECT codepoints in -subsequent packets with the expectation that either the network or the peer does -not support ECN. +A client MUST NOT change the version it uses unless it is in response to a +Version Negotiation packet from the server. Once a client receives a packet +from the server which is not a Version Negotiation packet, it MUST discard other +Version Negotiation packets on the same connection. Similarly, a client MUST +ignore a Version Negotiation packet if it has already received and acted on a +Version Negotiation packet. -If an endpoint sets ECT codepoints on outgoing packets and encounters a -retransmission timeout due to the absence of acknowledgments from the peer (see -{{QUIC-RECOVERY}}), or if an endpoint has reason to believe that a network -element might be corrupting ECN codepoints, the endpoint MAY cease setting ECT -codepoints in subsequent packets. Doing so allows the connection to traverse -network elements that drop or corrupt ECN codepoints in the IP header. +A client MUST ignore a Version Negotiation packet that lists the client's chosen +version. +A client MAY attempt 0-RTT after receiving a Version Negotiation packet. A +client that sends additional 0-RTT packets MUST NOT reset the packet number to 0 +as a result, see {{retry-0rtt-pn}}. -## Proof of Source Address Ownership {#address-validation} +Version negotiation packets have no cryptographic protection. The result of the +negotiation MUST be revalidated as part of the cryptographic handshake (see +{{version-validation}}). 
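The client-side rules for Version Negotiation packets can be collected into a single decision function. The sketch below is an assumption-laden illustration: the class and field names are hypothetical, the version numbers in the usage note are placeholders, and picking the first mutually supported version in the client's preference order is only one possible selection policy.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class VersionNegotiationPacket:
    dcid: bytes
    scid: bytes
    supported_versions: Sequence[int]


@dataclass
class ClientVersionState:
    sent_dcid: bytes           # Destination Connection ID the client sent
    sent_scid: bytes           # Source Connection ID the client sent
    chosen_version: int
    supported: Sequence[int]   # client's versions, most preferred first
    acted_on_vn: bool = False
    seen_other_server_packet: bool = False

    def handle_version_negotiation(
            self, pkt: VersionNegotiationPacket) -> Optional[int]:
        """Return the newly selected version, or None if the packet is ignored."""
        # The connection IDs must mirror those in a packet the client sent.
        if pkt.dcid != self.sent_scid or pkt.scid != self.sent_dcid:
            return None
        # Ignore once any other server packet or an earlier VN was processed.
        if self.seen_other_server_packet or self.acted_on_vn:
            return None
        # Ignore a packet that lists the version the client already chose.
        if self.chosen_version in pkt.supported_versions:
            return None
        for version in self.supported:
            if version in pkt.supported_versions:
                self.chosen_version = version
                self.acted_on_vn = True
                return version
        return None  # no mutually supported version; caller gives up
```

For example, a client that prefers `0xff00000f` but also supports `0xff00000e` switches to `0xff00000e` when the server lists only that version alongside one it does not recognize, and ignores any later Version Negotiation packet. The selected version still has to be validated during the cryptographic handshake.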
-Transport protocols commonly spend a round trip checking that a client owns the -transport address (IP and port) that it claims. Verifying that a client can -receive packets sent to its claimed transport address protects against spoofing -of this information by malicious clients. -This technique is used primarily to avoid QUIC from being used for traffic -amplification attack. In such an attack, a packet is sent to a server with -spoofed source address information that identifies a victim. If a server -generates more or larger packets in response to that packet, the attacker can -use the server to send more data toward the victim than it would be able to send -on its own. +### Using Reserved Versions -Several methods are used in QUIC to mitigate this attack. Firstly, the initial -handshake packet is sent in a UDP datagram that contains at least 1200 octets of -UDP payload. This allows a server to send a similar amount of data without -risking causing an amplification attack toward an unproven remote address. +For a server to use a new version in the future, clients must correctly handle +unsupported versions. To help ensure this, a server SHOULD include a reserved +version (see {{versions}}) while generating a Version Negotiation packet. -A server eventually confirms that a client has received its messages when the -first Handshake-level message is received. This might be insufficient, -either because the server wishes to avoid the computational cost of completing -the handshake, or it might be that the size of the packets that are sent during -the handshake is too large. This is especially important for 0-RTT, where the -server might wish to provide application data traffic - such as a response to a -request - in response to the data carried in the early data from the client. +The design of version negotiation permits a server to avoid maintaining state +for packets that it rejects in this fashion. 
The validation of version +negotiation (see {{version-validation}}) only validates the result of version +negotiation, which is the same no matter which reserved version was sent. +A server MAY therefore send different reserved version numbers in the Version +Negotiation Packet and in its transport parameters. -To send additional data prior to completing the cryptographic handshake, the -server then needs to validate that the client owns the address that it claims. +A client MAY send a packet using a reserved version number. This can be used to +solicit a list of supported versions from a server. -Source address validation is therefore performed by the core transport -protocol during the establishment of a connection. -A different type of source address validation is performed after a connection -migration, see {{migrate-validate}}. +## Cryptographic and Transport Handshake {#handshake} +QUIC relies on a combined cryptographic and transport handshake to minimize +connection establishment latency. QUIC uses the CRYPTO frame {{frame-crypto}} +to transmit the cryptographic handshake. Version 0x00000001 of QUIC uses TLS +1.3 as described in {{QUIC-TLS}}; a different QUIC version number could indicate +that a different cryptographic handshake protocol is in use. -### Client Address Validation Procedure +QUIC provides reliable, ordered delivery of the cryptographic handshake +data. QUIC packet protection ensures confidentiality and integrity protection +that meets the requirements of the cryptographic handshake protocol: -QUIC uses token-based address validation. Any time the server wishes -to validate a client address, it provides the client with a token. As -long as the token's authenticity can be checked (see -{{token-integrity}}) and the client is able to return that token, it -proves to the server that it received the token. 
+* authenticated key exchange, where -Upon receiving the client's Initial packet, the server can request -address validation by sending a Retry packet containing a token. This -token is repeated in the client's next Initial packet. Because the -token is consumed by the server that generates it, there is no need -for a single well-defined format. A token could include information -about the claimed client address (IP and port), a timestamp, and any -other supplementary information the server will need to validate the -token in the future. + * a server is always authenticated, -The Retry packet is sent to the client and a legitimate client will -respond with an Initial packet containing the token from the Retry packet -when it continues the handshake. In response to receiving the token, a -server can either abort the connection or permit it to proceed. + * a client is optionally authenticated, -A connection MAY be accepted without address validation - or with only limited -validation - but a server SHOULD limit the data it sends toward an unvalidated -address. Successful completion of the cryptographic handshake implicitly -provides proof that the client has received packets from the server. + * every connection produces distinct and unrelated keys, -The client should allow for additional Retry packets being sent in -response to Initial packets sent containing a token. There are several -situations in which the server might not be able to use the previously -generated token to validate the client's address and must send a new -Retry. A reasonable limit to the number of tries the client allows -for, before giving up, is 3. That is, the client MUST echo the -address validation token from a new Retry packet up to 3 times. After -that, it MAY give up on the connection attempt. 
+ * keying material is usable for packet protection for both 0-RTT and 1-RTT + packets, and + * 1-RTT keys have forward secrecy -### Address Validation for Future Connections +* authenticated values for the transport parameters of the peer (see + {{transport-parameters}}) -A server MAY provide clients with an address validation token during one -connection that can be used on a subsequent connection. Address validation is -especially important with 0-RTT because a server potentially sends a significant -amount of data to a client in response to 0-RTT data. +* authenticated confirmation of version negotiation (see {{version-validation}}) -The server uses the NEW_TOKEN frame {{frame-new-token}} to provide the -client with an address validation token that can be used to validate -future connections. The client may then use this token to validate -future connections by including it in the Initial packet's header. -The client MUST NOT use the token provided in a Retry for future -connections. +* authenticated negotiation of an application protocol (TLS uses ALPN + {{?RFC7301}} for this purpose) -Unlike the token that is created for a Retry packet, there might be some time -between when the token is created and when the token is subsequently used. -Thus, a resumption token SHOULD include an expiration time. The server MAY -include either an explicit expiration time or an issued timestamp and -dynamically calculate the expiration time. It is also unlikely that the client -port number is the same on two different connections; validating the port is -therefore unlikely to be successful. +* for the server, the ability to carry data that provides assurance that the + client can receive packets that are addressed with the transport address that + is claimed by the client (see {{address-validation}}) +The first CRYPTO frame MUST be sent in a single packet. Any second attempt +that is triggered by address validation MUST also be sent within a single +packet. 
This avoids having to reassemble a message from multiple packets. -### Address Validation Token Integrity {#token-integrity} +The first client packet of the cryptographic handshake protocol MUST fit within +a 1232 octet QUIC packet payload. This includes overheads that reduce the space +available to the cryptographic handshake protocol. -An address validation token MUST be difficult to guess. Including a large -enough random value in the token would be sufficient, but this depends on the -server remembering the value it sends to clients. +The CRYPTO frame can be sent in different packet number spaces. CRYPTO frames +in each packet number space carry a separate sequence of handshake data starting +from an offset of 0. -A token-based scheme allows the server to offload any state associated with -validation to the client. For this design to work, the token MUST be covered by -integrity protection against modification or falsification by clients. Without -integrity protection, malicious clients could generate or guess values for -tokens that would be accepted by the server. Only the server requires access to -the integrity protection key for tokens. +## Example Handshake Flows +Details of how TLS is integrated with QUIC are provided in {{QUIC-TLS}}, but +some examples are provided here. -## Path Validation {#migrate-validate} +{{tls-1rtt-handshake}} provides an overview of the 1-RTT handshake. Each line +shows a QUIC packet with the packet type and packet number shown first, followed +by the frames that are typically contained in those packets. So, for instance +the first packet is of type Initial, with packet number 0, and contains a CRYPTO +frame carrying the ClientHello. -Path validation is used by an endpoint to verify reachability of a peer over a -specific path. That is, it tests reachability between a specific local address -and a specific peer address, where an address is the two-tuple of IP address and -port. 
Path validation tests that packets can be both sent to and received from -a peer. +Note that multiple QUIC packets -- even of different encryption levels -- may be +coalesced into a single UDP datagram (see {{packet-coalesce}}), and so this +handshake may consist of as few as 4 UDP datagrams, or any number more. For +instance, the server's first flight contains packets from the Initial encryption +level (obfuscation), the Handshake level, and "0.5-RTT data" from the server at +the 1-RTT encryption level. -Path validation is used during connection migration (see {{migration}} and -{{preferred-address}}) by the migrating endpoint to verify reachability of a -peer from a new local address. Path validation is also used by the peer to -verify that the migrating endpoint is able to receive packets sent to its -new address. That is, that the packets received from the migrating endpoint do -not carry a spoofed source address. +~~~~ +Client Server -Path validation can be used at any time by either endpoint. For instance, an -endpoint might check that a peer is still in possession of its address after a -period of quiescence. +Initial[0]: CRYPTO[CH] -> -Path validation is not designed as a NAT traversal mechanism. Though the -mechanism described here might be effective for the creation of NAT bindings -that support NAT traversal, the expectation is that one or other peer is able to -receive packets without first having sent a packet on that path. Effective NAT -traversal needs additional synchronization mechanisms that are not provided -here. + Initial[0]: CRYPTO[SH] ACK[0] + Handshake[0]: CRYPTO[EE, CERT, CV, FIN] + <- 1-RTT[0]: STREAM[1, "..."] -An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that are used for -path validation with other frames. For instance, an endpoint may pad a packet -carrying a PATH_CHALLENGE for PMTU discovery, or an endpoint may bundle a -PATH_RESPONSE with its own PATH_CHALLENGE.
+Initial[1]: ACK[0] +Handshake[0]: CRYPTO[FIN], ACK[0] +1-RTT[0]: STREAM[0, "..."], ACK[0] -> -When probing a new path, an endpoint might want to ensure that its peer has an -unused connection ID available for responses. The endpoint can send -NEW_CONNECTION_ID and PATH_CHALLENGE frames in the same packet. This ensures -that an unused connection ID will be available to the peer when sending a -response. + 1-RTT[1]: STREAM[55, "..."], ACK[0] + <- Handshake[1]: ACK[0] +~~~~ +{: #tls-1rtt-handshake title="Example 1-RTT Handshake"} -### Initiation -To initiate path validation, an endpoint sends a PATH_CHALLENGE frame containing -a random payload on the path to be validated. +{{tls-0rtt-handshake}} shows an example of a connection with a 0-RTT handshake +and a single packet of 0-RTT data. Note that as described in {{packet-numbers}}, +the server ACKs the 0-RTT data at the 1-RTT encryption level, and the client's +sequence numbers at the 1-RTT encryption level continue to increment from its +0-RTT packets. -An endpoint MAY send additional PATH_CHALLENGE frames to handle packet loss. An -endpoint SHOULD NOT send a PATH_CHALLENGE more frequently than it would an -Initial packet, ensuring that connection migration is no more load on a new path -than establishing a new connection. +~~~~ +Client Server -The endpoint MUST use fresh random data in every PATH_CHALLENGE frame so that it -can associate the peer's response with the causative PATH_CHALLENGE. +Initial[0]: CRYPTO[CH] +0-RTT[0]: STREAM[0, "..."] -> + Initial[0]: CRYPTO[SH] ACK[0] + Handshake[0]: CRYPTO[EE, CERT, CV, FIN] + <- 1-RTT[0]: STREAM[1, "..."] ACK[0] -### Response +Initial[1]: ACK[0] +0-RTT[1]: CRYPTO[EOED] +Handshake[0]: CRYPTO[FIN], ACK[0] +1-RTT[2]: STREAM[0, "..."] ACK[0] -> -On receiving a PATH_CHALLENGE frame, an endpoint MUST respond immediately by -echoing the data contained in the PATH_CHALLENGE frame in a PATH_RESPONSE frame, -with the following stipulation.
Since a PATH_CHALLENGE might be sent from a -spoofed address, an endpoint MAY limit the rate at which it sends PATH_RESPONSE -frames and MAY silently discard PATH_CHALLENGE frames that would cause it to -respond at a higher rate. + 1-RTT[1]: STREAM[55, "..."], ACK[1,2] + <- Handshake[1]: ACK[0] +~~~~ +{: #tls-0rtt-handshake title="Example 0-RTT Handshake"} -To ensure that packets can be both sent to and received from the peer, the -PATH_RESPONSE MUST be sent on the same path as the triggering PATH_CHALLENGE: -from the same local address on which the PATH_CHALLENGE was received, to the -same remote address from which the PATH_CHALLENGE was received. +## Transport Parameters -### Completion +During connection establishment, both endpoints make authenticated declarations +of their transport parameters. These declarations are made unilaterally by each +endpoint. Endpoints are required to comply with the restrictions implied by +these parameters; the description of each parameter includes rules for its +handling. -A new address is considered valid when a PATH_RESPONSE frame is received -containing data that was sent in a previous PATH_CHALLENGE. Receipt of an -acknowledgment for a packet containing a PATH_CHALLENGE frame is not adequate -validation, since the acknowledgment can be spoofed by a malicious peer. +The format of the transport parameters is the TransportParameters struct from +{{figure-transport-parameters}}. This is described using the presentation +language from Section 3 of {{!TLS13=RFC8446}}. -For path validation to be successful, a PATH_RESPONSE frame MUST be received -from the same remote address to which the corresponding PATH_CHALLENGE was -sent. If a PATH_RESPONSE frame is received from a different remote address than -the one to which the PATH_CHALLENGE was sent, path validation is considered to -have failed, even if the data matches that sent in the PATH_CHALLENGE. 
+~~~ + uint32 QuicVersion; -Additionally, the PATH_RESPONSE frame MUST be received on the same local address -from which the corresponding PATH_CHALLENGE was sent. If a PATH_RESPONSE frame -is received on a different local address than the one from which the -PATH_CHALLENGE was sent, path validation is considered to have failed, even if -the data matches that sent in the PATH_CHALLENGE. Thus, the endpoint considers -the path to be valid when a PATH_RESPONSE frame is received on the same path -with the same payload as the PATH_CHALLENGE frame. + enum { + initial_max_stream_data_bidi_local(0), + initial_max_data(1), + initial_max_bidi_streams(2), + idle_timeout(3), + preferred_address(4), + max_packet_size(5), + stateless_reset_token(6), + ack_delay_exponent(7), + initial_max_uni_streams(8), + disable_migration(9), + initial_max_stream_data_bidi_remote(10), + initial_max_stream_data_uni(11), + max_ack_delay(12), + original_connection_id(13), + (65535) + } TransportParameterId; + struct { + TransportParameterId parameter; + opaque value<0..2^16-1>; + } TransportParameter; -### Abandonment + struct { + select (Handshake.msg_type) { + case client_hello: + QuicVersion initial_version; -An endpoint SHOULD abandon path validation after sending some number of -PATH_CHALLENGE frames or after some time has passed. When setting this timer, -implementations are cautioned that the new path could have a longer round-trip -time than the original. + case encrypted_extensions: + QuicVersion negotiated_version; + QuicVersion supported_versions<4..2^8-4>; + }; + TransportParameter parameters<22..2^16-1>; + } TransportParameters; -Note that the endpoint might receive packets containing other frames on the new -path, but a PATH_RESPONSE frame with appropriate data is required for path -validation to succeed. 
+ struct { + enum { IPv4(4), IPv6(6), (15) } ipVersion; + opaque ipAddress<4..2^8-1>; + uint16 port; + opaque connectionId<0..18>; + opaque statelessResetToken[16]; + } PreferredAddress; +~~~ +{: #figure-transport-parameters title="Definition of TransportParameters"} -If path validation fails, the path is deemed unusable. This does not -necessarily imply a failure of the connection - endpoints can continue sending -packets over other paths as appropriate. If no paths are available, an endpoint -can wait for a new path to become available or close the connection. +The `extension_data` field of the quic_transport_parameters extension defined in +{{QUIC-TLS}} contains a TransportParameters value. TLS encoding rules are +therefore used to encode the transport parameters. -A path validation might be abandoned for other reasons besides -failure. Primarily, this happens if a connection migration to a new path is -initiated while a path validation on the old path is in progress. +QUIC encodes transport parameters into a sequence of octets, which are then +included in the cryptographic handshake. Once the handshake completes, the +transport parameters declared by the peer are available. Each endpoint +validates the value provided by its peer. In particular, version negotiation +MUST be validated (see {{version-validation}}) before the connection +establishment is considered properly complete. + +Definitions for each of the defined transport parameters are included in +{{transport-parameter-definitions}}. Any given parameter MUST appear +at most once in a given transport parameters extension. An endpoint MUST +treat receipt of duplicate transport parameters as a connection error of +type TRANSPORT_PARAMETER_ERROR. -## Connection Migration {#migration} +### Transport Parameter Definitions -QUIC allows connections to survive changes to endpoint addresses (that is, IP -address and/or port), such as those caused by an endpoint migrating to a new -network. 
This section describes the process by which an endpoint migrates to a -new address. +An endpoint MAY use the following transport parameters: -An endpoint MUST NOT initiate connection migration before the handshake is -finished and the endpoint has 1-RTT keys. The design of QUIC relies on -endpoints retaining a stable address for the duration of the handshake. +initial_max_data (0x0001): -An endpoint also MUST NOT initiate connection migration if the peer sent the -`disable_migration` transport parameter during the handshake. An endpoint which -has sent this transport parameter, but detects that a peer has nonetheless -migrated to a different network MAY treat this as a connection error of type -INVALID_MIGRATION. +: The initial maximum data parameter contains the initial value for the maximum + amount of data that can be sent on the connection. This parameter is encoded + as an unsigned 32-bit integer in units of octets. This is equivalent to + sending a MAX_DATA ({{frame-max-data}}) for the connection immediately after + completing the handshake. If the transport parameter is absent, the connection + starts with a flow control limit of 0. -Not all changes of peer address are intentional migrations. The peer could -experience NAT rebinding: a change of address due to a middlebox, usually a NAT, -allocating a new outgoing port or even a new outgoing IP address for a flow. -Endpoints SHOULD perform path validation ({{migrate-validate}}) if a NAT -rebinding does not cause the connection to fail. +initial_max_bidi_streams (0x0002): -This document limits migration of connections to new client addresses, except as -described in {{preferred-address}}. Clients are responsible for initiating all -migrations. Servers do not send non-probing packets (see {{probing}}) toward a -client address until they see a non-probing packet from that address. If a -client receives packets from an unknown server address, the client MAY discard -these packets. 
+: The initial maximum bidirectional streams parameter contains the initial + maximum number of bidirectional streams the peer may initiate, encoded as an + unsigned 16-bit integer. If this parameter is absent or zero, bidirectional + streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this + parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) + immediately after completing the handshake containing the corresponding Stream + ID. For example, a value of 0x05 would be equivalent to receiving a + MAX_STREAM_ID containing 16 when received by a client or 17 when received by a + server. +initial_max_uni_streams (0x0008): -### Probing a New Path {#probing} +: The initial maximum unidirectional streams parameter contains the initial + maximum number of unidirectional streams the peer may initiate, encoded as an + unsigned 16-bit integer. If this parameter is absent or zero, unidirectional + streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this + parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) + immediately after completing the handshake containing the corresponding Stream + ID. For example, a value of 0x05 would be equivalent to receiving a + MAX_STREAM_ID containing 18 when received by a client or 19 when received by a + server. -An endpoint MAY probe for peer reachability from a new local address using path -validation {{migrate-validate}} prior to migrating the connection to the new -local address. Failure of path validation simply means that the new path is not -usable for this connection. Failure to validate a path does not cause the -connection to end unless there are no valid alternative paths available. +idle_timeout (0x0003): + +: The idle timeout is a value in seconds that is encoded as an unsigned 16-bit + integer. If this parameter is absent or zero then the idle timeout is + disabled. 
-An endpoint uses a new connection ID for probes sent from a new local address, -see {{migration-linkability}} for further discussion. An endpoint that uses -a new local address needs to ensure that at least one new connection ID is -available at the peer. That can be achieved by including a NEW_CONNECTION_ID -frame in the probe. +max_packet_size (0x0005): -Receiving a PATH_CHALLENGE frame from a peer indicates that the peer is probing -for reachability on a path. An endpoint sends a PATH_RESPONSE in response as per -{{migrate-validate}}. +: The maximum packet size parameter places a limit on the size of packets that + the endpoint is willing to receive, encoded as an unsigned 16-bit integer. + This indicates that packets larger than this limit will be dropped. The + default for this parameter is the maximum permitted UDP payload of 65527. + Values below 1200 are invalid. This limit only applies to protected packets + ({{packet-protected}}). -PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames are -"probing frames", and all other frames are "non-probing frames". A packet -containing only probing frames is a "probing packet", and a packet containing -any other frame is a "non-probing packet". +ack_delay_exponent (0x0007): +: An 8-bit unsigned integer value indicating an exponent used to decode the ACK + Delay field in the ACK frame, see {{frame-ack}}. If this value is absent, a + default value of 3 is assumed (indicating a multiplier of 8). The default + value is also used for ACK frames that are sent in Initial and Handshake + packets. Values above 20 are invalid. -### Initiating Connection Migration {#initiating-migration} +disable_migration (0x0009): -An endpoint can migrate a connection to a new local address by sending packets -containing frames other than probing frames from that address. +: The endpoint does not support connection migration ({{migration}}). 
Peers MUST + NOT send any packets, including probing packets ({{probing}}), from a local + address other than that used to perform the handshake. This parameter is a + zero-length value. -Each endpoint validates its peer's address during connection establishment. -Therefore, a migrating endpoint can send to its peer knowing that the peer is -willing to receive at the peer's current address. Thus an endpoint can migrate -to a new local address without first validating the peer's address. +max_ack_delay (0x000c): -When migrating, the new path might not support the endpoint's current sending -rate. Therefore, the endpoint resets its congestion controller, as described in -{{migration-cc}}. +: An 8-bit unsigned integer value indicating the maximum amount of time in + milliseconds by which the endpoint will delay sending of acknowledgments. If this + value is absent, a default of 25 milliseconds is assumed. -The new path might not have the same ECN capability. Therefore, the endpoint -verifies ECN capability as described in {{using-ecn}}. +Either peer MAY advertise an initial value for the flow control on each type of +stream on which they might receive data. Each of the following transport +parameters is encoded as an unsigned 32-bit integer in units of octets: -Receiving acknowledgments for data sent on the new path serves as proof of the -peer's reachability from the new address. Note that since acknowledgments may -be received on any path, return reachability on the new path is not -established. To establish return reachability on the new path, an endpoint MAY -concurrently initiate path validation {{migrate-validate}} on the new path. +initial_max_stream_data_bidi_local (0x0000): +: The initial stream maximum data for bidirectional, locally-initiated streams + parameter contains the initial flow control limit for newly created + bidirectional streams opened by the endpoint that sets the transport + parameter.
In client transport parameters, this applies to streams with an + identifier ending in 0x0; in server transport parameters, this applies to + streams ending in 0x1. -### Responding to Connection Migration {#migration-response} +initial_max_stream_data_bidi_remote (0x000a): -Receiving a packet from a new peer address containing a non-probing frame -indicates that the peer has migrated to that address. +: The initial stream maximum data for bidirectional, peer-initiated streams + parameter contains the initial flow control limit for newly created + bidirectional streams opened by the endpoint that receives the transport + parameter. In client transport parameters, this applies to streams with an + identifier ending in 0x1; in server transport parameters, this applies to + streams ending in 0x0. -In response to such a packet, an endpoint MUST start sending subsequent packets -to the new peer address and MUST initiate path validation ({{migrate-validate}}) -to verify the peer's ownership of the unvalidated address. +initial_max_stream_data_uni (0x000b): -An endpoint MAY send data to an unvalidated peer address, but it MUST protect -against potential attacks as described in {{address-spoofing}} and -{{on-path-spoofing}}. An endpoint MAY skip validation of a peer address if that -address has been seen recently. +: The initial stream maximum data for unidirectional streams parameter contains + the initial flow control limit for newly created unidirectional streams opened + by the endpoint that receives the transport parameter. In client transport + parameters, this applies to streams with an identifier ending in 0x3; in + server transport parameters, this applies to streams ending in 0x2. -An endpoint only changes the address that it sends packets to in response to the -highest-numbered non-probing packet. This ensures that an endpoint does not send -packets to an old peer address in the case that it receives reordered packets. 
+If present, transport parameters that set initial stream flow control limits are +equivalent to sending a MAX_STREAM_DATA frame ({{frame-max-stream-data}}) on +every stream of the corresponding type immediately after opening. If the +transport parameter is absent, streams of that type start with a flow control +limit of 0. -After changing the address to which it sends non-probing packets, an endpoint -could abandon any path validation for other addresses. +A server MUST include the original_connection_id transport parameter if it sent +a Retry packet: -Receiving a packet from a new peer address might be the result of a NAT -rebinding at the peer. +original_connection_id (0x000d): -After verifying a new client address, the server SHOULD send new address -validation tokens ({{address-validation}}) to the client. +: The value of the Destination Connection ID field from the first Initial packet + sent by the client. This transport parameter is only sent by the server. +A server MAY include the following transport parameters: -#### Handling Address Spoofing by a Peer {#address-spoofing} +stateless_reset_token (0x0006): -It is possible that a peer is spoofing its source address to cause an endpoint -to send excessive amounts of data to an unwilling host. If the endpoint sends -significantly more data than the spoofing peer, connection migration might be -used to amplify the volume of data that an attacker can generate toward a -victim. +: The Stateless Reset Token is used in verifying a stateless reset, see + {{stateless-reset}}. This parameter is a sequence of 16 octets. -As described in {{migration-response}}, an endpoint is required to validate a -peer's new address to confirm the peer's possession of the new address. Until a -peer's address is deemed valid, an endpoint MUST limit the rate at which it -sends data to this address. 
The endpoint MUST NOT send more than a minimum -congestion window's worth of data per estimated round-trip time (kMinimumWindow, -as defined in {{QUIC-RECOVERY}}). In the absence of this limit, an endpoint -risks being used for a denial of service attack against an unsuspecting victim. -Note that since the endpoint will not have any round-trip time measurements to -this address, the estimate SHOULD be the default initial value (see -{{QUIC-RECOVERY}}). +preferred_address (0x0004): -If an endpoint skips validation of a peer address as described in -{{migration-response}}, it does not need to limit its sending rate. +: The server's Preferred Address is used to effect a change in server address at + the end of the handshake, as described in {{preferred-address}}. +A client MUST NOT include a stateless reset token or a preferred address. A +server MUST treat receipt of either transport parameter as a connection error of +type TRANSPORT_PARAMETER_ERROR. -#### Handling Address Spoofing by an On-path Attacker {#on-path-spoofing} -An on-path attacker could cause a spurious connection migration by copying and -forwarding a packet with a spoofed address such that it arrives before the -original packet. The packet with the spoofed address will be seen to come from -a migrating connection, and the original packet will be seen as a duplicate and -dropped. After a spurious migration, validation of the source address will fail -because the entity at the source address does not have the necessary -cryptographic keys to read or respond to the PATH_CHALLENGE frame that is sent -to it even if it wanted to. +### Values of Transport Parameters for 0-RTT {#zerortt-parameters} -To protect the connection from failing due to such a spurious migration, an -endpoint MUST revert to using the last validated peer address when validation of -a new peer address fails. +A client that attempts to send 0-RTT data MUST remember the transport parameters +used by the server. 
The transport parameters that the server advertises during +connection establishment apply to all connections that are resumed using the +keying material established during that handshake. Remembered transport +parameters apply to the new connection until the handshake completes and new +transport parameters from the server can be provided. -If an endpoint has no state about the last validated peer address, it MUST close -the connection silently by discarding all connection state. This results in new -packets on the connection being handled generically. For instance, an endpoint -MAY send a stateless reset in response to any further incoming packets. +A server can remember the transport parameters that it advertised, or store an +integrity-protected copy of the values in the ticket and recover the information +when accepting 0-RTT data. A server uses the transport parameters in +determining whether to accept 0-RTT data. -Note that receipt of packets with higher packet numbers from the legitimate peer -address will trigger another connection migration. This will cause the -validation of the address of the spurious migration to be abandoned. +A server MAY accept 0-RTT and subsequently provide different values for +transport parameters for use in the new connection. If 0-RTT data is accepted +by the server, the server MUST NOT reduce any limits or alter any values that +might be violated by the client with its 0-RTT data. In particular, a server +that accepts 0-RTT data MUST NOT set values for initial_max_data, +initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, and +initial_max_stream_data_uni that are smaller than the remembered value of those +parameters. Similarly, a server MUST NOT reduce the value of +initial_max_bidi_streams or initial_max_uni_streams. -### Loss Detection and Congestion Control {#migration-cc} +Omitting or setting a zero value for certain transport parameters can result in +0-RTT data being enabled, but not usable. 
The applicable subset of transport +parameters that permit sending of application data SHOULD be set to non-zero +values for 0-RTT. This includes initial_max_data and either +initial_max_bidi_streams and initial_max_stream_data_bidi_remote, or +initial_max_uni_streams and initial_max_stream_data_uni. -The capacity available on the new path might not be the same as the old path. -Packets sent on the old path SHOULD NOT contribute to congestion control or RTT -estimation for the new path. +The value of the server's previous preferred_address MUST NOT be used when +establishing a new connection; rather, the client should wait to observe the +server's new preferred_address value in the handshake. -On confirming a peer's ownership of its new address, an endpoint SHOULD -immediately reset the congestion controller and round-trip time estimator for -the new path. +A server MUST reject 0-RTT data or even abort a handshake if the implied values +for transport parameters cannot be supported. -An endpoint MUST NOT return to the send rate used for the previous path unless -it is reasonably sure that the previous send rate is valid for the new path. -For instance, a change in the client's port number is likely indicative of a -rebinding in a middlebox and not a complete change in path. This determination -likely depends on heuristics, which could be imperfect; if the new path capacity -is significantly reduced, ultimately this relies on the congestion controller -responding to congestion signals and reducing send rates appropriately. -There may be apparent reordering at the receiver when an endpoint sends data and -probes from/to multiple addresses during the migration period, since the two -resulting paths may have different round-trip times. A receiver of packets on -multiple paths will still send ACK frames covering all received packets. 
+### New Transport Parameters -While multiple paths might be used during connection migration, a single -congestion control context and a single loss recovery context (as described in -{{QUIC-RECOVERY}}) may be adequate. A sender can make exceptions for probe -packets so that their loss detection is independent and does not unduly cause -the congestion controller to reduce its sending rate. An endpoint might set a -separate timer when a PATH_CHALLENGE is sent, which is cancelled when the -corresponding PATH_RESPONSE is received. If the timer fires before the -PATH_RESPONSE is received, the endpoint might send a new PATH_CHALLENGE, and -restart the timer for a longer period of time. +New transport parameters can be used to negotiate new protocol behavior. An +endpoint MUST ignore transport parameters that it does not support. Absence of +a transport parameter therefore disables any optional protocol feature that is +negotiated using the parameter. +New transport parameters can be registered according to the rules in +{{iana-transport-parameters}}. -### Privacy Implications of Connection Migration {#migration-linkability} -Using a stable connection ID on multiple network paths allows a passive observer -to correlate activity between those paths. An endpoint that moves between -networks might not wish to have their activity correlated by any entity other -than their peer, so different connection IDs are used when sending from -different local addresses, as discussed in {{connection-id}}. For this to be -effective, endpoints need to ensure that connection IDs they provide cannot be -linked by any other entity. +### Version Negotiation Validation {#version-validation} -This eliminates the use of the connection ID for linking activity from -the same connection on different networks. Protection of packet numbers ensures -that packet numbers cannot be used to correlate activity.
This does not prevent -other properties of packets, such as timing and size, from being used to -correlate activity. +Though the cryptographic handshake has integrity protection, two forms of QUIC +version downgrade are possible. In the first, an attacker replaces the QUIC +version in the Initial packet. In the second, a fake Version Negotiation packet +is sent by an attacker. To protect against these attacks, the transport +parameters include three fields that encode version information. These +parameters are used to retroactively authenticate the choice of version (see +{{version-negotiation}}). -Clients MAY move to a new connection ID at any time based on -implementation-specific concerns. For example, after a period of network -inactivity NAT rebinding might occur when the client begins sending data again. +The cryptographic handshake provides integrity protection for the negotiated +version as part of the transport parameters (see {{transport-parameters}}). As +a result, attacks on version negotiation by an attacker can be detected. -A client might wish to reduce linkability by employing a new connection ID and -source UDP port when sending traffic after a period of inactivity. Changing the -UDP port from which it sends packets at the same time might cause the packet to -appear as a connection migration. This ensures that the mechanisms that support -migration are exercised even for clients that don't experience NAT rebindings or -genuine migrations. Changing port number can cause a peer to reset its -congestion state (see {{migration-cc}}), so the port SHOULD only be changed -infrequently. +The client includes the initial_version field in its transport parameters. The +initial_version is the version that the client initially attempted to use. If +the server did not send a Version Negotiation packet {{packet-version}}, this +will be identical to the negotiated_version field in the server transport +parameters. 
-Endpoints that use connection IDs with length greater than zero could have their -activity correlated if their peers keep using the same destination connection ID -after migration. Endpoints that receive packets with a previously unused -Destination Connection ID SHOULD change to sending packets with a connection ID -that has not been used on any other network path. The goal here is to ensure -that packets sent on different paths cannot be correlated. To fulfill this -privacy requirement, endpoints that initiate migration and use connection IDs -with length greater than zero SHOULD provide their peers with new connection IDs -before migration. +A server that processes all packets in a stateful fashion can remember how +version negotiation was performed and validate the initial_version value. -Caution: +A server that does not maintain state for every packet it receives (i.e., a +stateless server) uses a different process. If the initial_version matches the +version of QUIC that is in use, a stateless server can accept the value. -: If both endpoints change connection ID in response to seeing a change in - connection ID from their peer, then this can trigger an infinite sequence of - changes. +If the initial_version is different from the version of QUIC that is in use, a +stateless server MUST check that it would have sent a Version Negotiation packet +if it had received a packet with the indicated initial_version. If a server +would have accepted the version included in the initial_version and the value +differs from the QUIC version that is in use, the server MUST terminate the +connection with a VERSION_NEGOTIATION_ERROR error. -## Server's Preferred Address {#preferred-address} +The server includes both the version of QUIC that is in use and a list of the +QUIC versions that the server supports. -QUIC allows servers to accept connections on one IP address and attempt to -transfer these connections to a more preferred address shortly after the -handshake. 
This is particularly useful when clients initially connect to an -address shared by multiple servers but would prefer to use a unicast address to -ensure connection stability. This section describes the protocol for migrating a -connection to a preferred server address. +The negotiated_version field is the version that is in use. This MUST be set by +the server to the value that is on the Initial packet that it accepts (not an +Initial packet that triggers a Retry or Version Negotiation packet). A client +that receives a negotiated_version that does not match the version of QUIC that +is in use MUST terminate the connection with a VERSION_NEGOTIATION_ERROR error +code. -Migrating a connection to a new server address mid-connection is left for future -work. If a client receives packets from a new server address not indicated by -the preferred_address transport parameter, the client SHOULD discard these -packets. +The server includes a list of versions that it would send in any version +negotiation packet ({{packet-version}}) in the supported_versions field. The +server populates this field even if it did not send a version negotiation +packet. -### Communicating A Preferred Address +The client validates that the negotiated_version is included in the +supported_versions list and - if version negotiation was performed - that it +would have selected the negotiated version. A client MUST terminate the +connection with a VERSION_NEGOTIATION_ERROR error code if the current QUIC +version is not listed in the supported_versions list. A client MUST terminate +with a VERSION_NEGOTIATION_ERROR error code if version negotiation occurred but +it would have selected a different version based on the value of the +supported_versions list. -A server conveys a preferred address by including the preferred_address -transport parameter in the TLS handshake. 
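The client-side checks on negotiated_version and supported_versions described above can be sketched as follows. This is a minimal illustration, not text from the draft: the `ServerTransportParameters` class, the `QuicConnectionError` exception, and the `select_version` callback (standing in for the client's own version preference logic) are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

VERSION_NEGOTIATION_ERROR = "VERSION_NEGOTIATION_ERROR"


class QuicConnectionError(Exception):
    """Hypothetical exception used to terminate the connection."""


@dataclass(frozen=True)
class ServerTransportParameters:
    # Illustrative container for the two server-provided version fields.
    negotiated_version: int
    supported_versions: Sequence[int]


def validate_server_versions(
    params: ServerTransportParameters,
    version_in_use: int,
    version_negotiation_occurred: bool,
    select_version: Callable[[Sequence[int]], int],
) -> None:
    """Raise QuicConnectionError if any client-side version check fails."""
    # negotiated_version must match the version of QUIC that is in use.
    if params.negotiated_version != version_in_use:
        raise QuicConnectionError(VERSION_NEGOTIATION_ERROR)
    # The current version must be listed in supported_versions.
    if version_in_use not in params.supported_versions:
        raise QuicConnectionError(VERSION_NEGOTIATION_ERROR)
    # If version negotiation occurred, the client must confirm that it would
    # have selected the same version from the authenticated list.
    if version_negotiation_occurred:
        if select_version(params.supported_versions) != version_in_use:
            raise QuicConnectionError(VERSION_NEGOTIATION_ERROR)
```

A client that prefers the highest mutually supported version could pass something like `lambda offered: max(v for v in offered if v in my_versions)` as `select_version`; that preference rule is an assumption, since the text leaves version selection to the client.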
+When an endpoint accepts multiple QUIC versions, it can potentially interpret +transport parameters as they are defined by any of the QUIC versions it +supports. The version field in the QUIC packet header is authenticated using +transport parameters. The position and the format of the version fields in +transport parameters MUST either be identical across different QUIC versions, or +be unambiguously different to ensure no confusion about their interpretation. +One way that a new format could be introduced is to define a TLS extension with +a different codepoint. -Once the handshake is finished, the client SHOULD initiate path validation (see -{{migrate-validate}}) of the server's preferred address using the connection ID -provided in the preferred_address transport parameter. -If path validation succeeds, the client SHOULD immediately begin sending all -future packets to the new server address using the new connection ID and -discontinue use of the old server address. If path validation fails, the client -MUST continue sending all future packets to the server's original IP address. +## Stateless Retries {#stateless-retry} +A server can process an Initial packet from a client without committing any +state. This allows a server to perform address validation +({{address-validation}}), or to defer connection establishment costs. -### Responding to Connection Migration +A server that generates a response to an Initial packet without retaining +connection state MUST use the Retry packet ({{packet-retry}}). This packet +causes a client to restart the connection attempt and includes the token in the +new Initial packet ({{packet-initial}}) to prove source address ownership. -A server might receive a packet addressed to its preferred IP address at any -time after the handshake is completed. 
If this packet contains a PATH_CHALLENGE -frame, the server sends a PATH_RESPONSE frame as per {{migrate-validate}}, but -the server MUST continue sending all other packets from its original IP address. -The server SHOULD also initiate path validation of the client using its -preferred address and the address from which it received the client probe. This -helps to guard against spurious migration initiated by an attacker. +## Using Explicit Congestion Notification {#using-ecn} -Once the server has completed its path validation and has received a non-probing -packet with a new largest packet number on its preferred address, the server -begins sending to the client exclusively from its preferred IP address. It -SHOULD drop packets for this connection received on the old IP address, but MAY -continue to process delayed packets. +QUIC endpoints use Explicit Congestion Notification (ECN) {{!RFC3168}} to detect +and respond to network congestion. ECN allows a network node to indicate +congestion in the network by setting a codepoint in the IP header of a packet +instead of dropping it. Endpoints react to congestion by reducing their sending +rate in response, as described in {{QUIC-RECOVERY}}. +To use ECN, QUIC endpoints first determine whether a path supports ECN marking +and the peer is able to access the ECN codepoint in the IP header. A network +path does not support ECN if ECN marked packets get dropped or ECN markings are +rewritten on the path. An endpoint verifies the path, both during connection +establishment and when migrating to a new path (see {{migration}}). -### Interaction of Client Migration and Preferred Address +Each endpoint independently verifies and enables use of ECN by setting the IP +header ECN codepoint to ECN Capable Transport (ECT) for the path from it to the +other peer. Even if ECN is not used on the path to the peer, the endpoint MUST +provide feedback about ECN markings received (if accessible). 
-A client might need to perform a connection migration before it has migrated to -the server's preferred address. In this case, the client SHOULD perform path -validation to both the original and preferred server address from the client's -new address concurrently. +To verify both that a path supports ECN and the peer can provide ECN feedback, +an endpoint MUST set the ECT(0) codepoint in the IP header of all outgoing +packets {{!RFC8311}}. -If path validation of the server's preferred address succeeds, the client MUST -abandon validation of the original address and migrate to using the server's -preferred address. If path validation of the server's preferred address fails, -but validation of the server's original address succeeds, the client MAY migrate -to using the original address from the client's new address. +If an ECT codepoint set in the IP header is not corrupted by a network device, +then a received packet contains either the codepoint sent by the peer or the +Congestion Experienced (CE) codepoint set by a network device that is +experiencing congestion. -If the connection to the server's preferred address is not from the same client -address, the server MUST protect against potential attacks as described in -{{address-spoofing}} and {{on-path-spoofing}}. In addition to intentional -simultaneous migration, this might also occur because the client's access -network used a different NAT binding for the server's preferred address. +On receiving a packet with an ECT or CE codepoint, an endpoint that can access +the IP ECN codepoints increases the corresponding ECT(0), ECT(1), or CE count, +and includes these counters in subsequent ACK frames (see +{{processing-and-ack}} and {{frame-ack}}). -Servers SHOULD initiate path validation to the client's new address upon -receiving a probe packet from a different address.
Servers MUST NOT send more -than a minimum congestion window's worth of non-probing packets to the new -address before path validation is complete. +A packet detected by a receiver as a duplicate does not affect the receiver's +local ECN codepoint counts; see ({{security-ecn}}) for relevant security +concerns. +If an endpoint receives a packet without an ECT or CE codepoint, it responds per +{{processing-and-ack}} with an ACK frame. -## Connection Termination {#termination} +If an endpoint does not have access to received ECN codepoints, it acknowledges +received packets per {{processing-and-ack}} with an ACK frame. -Connections should remain open until they become idle for a pre-negotiated -period of time. A QUIC connection, once established, can be terminated in one -of three ways: +If a packet sent with an ECT codepoint is newly acknowledged by the peer in an +ACK frame, the endpoint stops setting ECT codepoints in subsequent packets, with +the expectation that either the network or the peer no longer supports ECN. -* idle timeout ({{idle-timeout}}) -* immediate close ({{immediate-close}}) -* stateless reset ({{stateless-reset}}) +To protect the connection from arbitrary corruption of ECN codepoints by the +network, an endpoint verifies the following when an ACK frame is received: +* The increase in ECT(0) and ECT(1) counters MUST be at least the number of + packets newly acknowledged that were sent with the corresponding codepoint. -### Closing and Draining Connection States {#draining} +* The total increase in ECT(0), ECT(1), and CE counters reported in the ACK + frame MUST be at least the total number of packets newly acknowledged in this + ACK frame. -The closing and draining connection states exist to ensure that connections -close cleanly and that delayed or reordered packets are properly discarded. -These states SHOULD persist for three times the current Retransmission Timeout -(RTO) interval as defined in {{QUIC-RECOVERY}}. 
+An endpoint could miss acknowledgements for a packet when ACK frames are lost. +It is therefore possible for the total increase in ECT(0), ECT(1), and CE +counters to be greater than the number of packets acknowledged in an ACK frame. +When this happens, the local reference counts MUST be increased to match the +counters in the ACK frame. -An endpoint enters a closing period after initiating an immediate close -({{immediate-close}}). While closing, an endpoint MUST NOT send packets unless -they contain a CONNECTION_CLOSE or APPLICATION_CLOSE frame (see -{{immediate-close}} for details). +Upon successful verification, an endpoint continues to set ECT codepoints in +subsequent packets with the expectation that the path is ECN-capable. -In the closing state, only a packet containing a closing frame can be sent. An -endpoint retains only enough information to generate a packet containing a -closing frame and to identify packets as belonging to the connection. The -connection ID and QUIC version is sufficient information to identify packets for -a closing connection; an endpoint can discard all other connection state. An -endpoint MAY retain packet protection keys for incoming packets to allow it to -read and process a closing frame. +If verification fails, then the endpoint ceases setting ECT codepoints in +subsequent packets with the expectation that either the network or the peer does +not support ECN. -The draining state is entered once an endpoint receives a signal that its peer -is closing or draining. While otherwise identical to the closing state, an -endpoint in the draining state MUST NOT send any packets. Retaining packet -protection keys is unnecessary once a connection is in the draining state. 
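The ACK-time verification of ECN counters described above could look roughly like the sketch below. It transcribes the two MUST-level checks literally as they are written in the added text, and on success raises the local reference counts to match the ACK frame. The `EcnCounts` type and the function and parameter names are illustrative assumptions, not part of the draft.

```python
from dataclasses import dataclass


@dataclass
class EcnCounts:
    # Cumulative counts of packets seen with each codepoint.
    ect0: int = 0
    ect1: int = 0
    ce: int = 0


def validate_ecn_feedback(
    reference: EcnCounts,
    reported: EcnCounts,
    newly_acked_ect0: int,
    newly_acked_ect1: int,
    newly_acked_total: int,
) -> bool:
    """Return True if the counters reported in an ACK frame pass verification.

    `reference` holds the counts from the last accepted ACK frame and is
    updated in place on success. `newly_acked_*` count the packets newly
    acknowledged by this ACK frame (per codepoint sent, and in total).
    """
    d_ect0 = reported.ect0 - reference.ect0
    d_ect1 = reported.ect1 - reference.ect1
    d_ce = reported.ce - reference.ce

    # Check 1: each ECT counter grows by at least the number of newly
    # acknowledged packets that were sent with that codepoint.
    if d_ect0 < newly_acked_ect0 or d_ect1 < newly_acked_ect1:
        return False
    # Check 2: the combined growth covers every newly acknowledged packet.
    if d_ect0 + d_ect1 + d_ce < newly_acked_total:
        return False

    # Lost ACK frames can make the increase larger than expected; the local
    # reference counts are raised to match the ACK frame either way.
    reference.ect0, reference.ect1, reference.ce = (
        reported.ect0,
        reported.ect1,
        reported.ce,
    )
    return True
```

When this function returns False, the endpoint would stop setting ECT codepoints on subsequent packets, as the surrounding text describes.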
+If an endpoint sets ECT codepoints on outgoing packets and encounters a +retransmission timeout due to the absence of acknowledgments from the peer (see +{{QUIC-RECOVERY}}), or if an endpoint has reason to believe that a network +element might be corrupting ECN codepoints, the endpoint MAY cease setting ECT +codepoints in subsequent packets. Doing so allows the connection to traverse +network elements that drop or corrupt ECN codepoints in the IP header. -An endpoint MAY transition from the closing period to the draining period if it -can confirm that its peer is also closing or draining. Receiving a closing -frame is sufficient confirmation, as is receiving a stateless reset. The -draining period SHOULD end when the closing period would have ended. In other -words, the endpoint can use the same end time, but cease retransmission of the -closing packet. -Disposing of connection state prior to the end of the closing or draining period -could cause delayed or reordered packets to be handled poorly. Endpoints that -have some alternative means to ensure that late-arriving packets on the -connection do not create QUIC state, such as those that are able to close the -UDP socket, MAY use an abbreviated draining period which can allow for faster -resource recovery. Servers that retain an open socket for accepting new -connections SHOULD NOT exit the closing or draining period early. +## Proof of Source Address Ownership {#address-validation} -Once the closing or draining period has ended, an endpoint SHOULD discard all -connection state. This results in new packets on the connection being handled -generically. For instance, an endpoint MAY send a stateless reset in response -to any further incoming packets. +Transport protocols commonly spend a round trip checking that a client owns the +transport address (IP and port) that it claims. 
Verifying that a client can +receive packets sent to its claimed transport address protects against spoofing +of this information by malicious clients. -The draining and closing periods do not apply when a stateless reset -({{stateless-reset}}) is sent. +This technique is used primarily to prevent QUIC from being used for traffic +amplification attacks. In such an attack, a packet is sent to a server with +spoofed source address information that identifies a victim. If a server +generates more or larger packets in response to that packet, the attacker can +use the server to send more data toward the victim than it would be able to send +on its own. -An endpoint is not expected to handle key updates when it is closing or -draining. A key update might prevent the endpoint from moving from the closing -state to draining, but it otherwise has no impact. +Several methods are used in QUIC to mitigate this attack. Firstly, the initial +handshake packet is sent in a UDP datagram that contains at least 1200 octets of +UDP payload. This allows a server to send a similar amount of data without +risking causing an amplification attack toward an unproven remote address. -An endpoint could receive packets from a new source address, indicating a client -connection migration ({{migration}}), while in the closing period. An endpoint -in the closing state MUST strictly limit the number of packets it sends to this -new address until the address is validated (see {{migrate-validate}}). A server -in the closing state MAY instead choose to discard packets received from a new -source address. +A server eventually confirms that a client has received its messages when the +first Handshake-level message is received. This might be insufficient, +either because the server wishes to avoid the computational cost of completing +the handshake, or because the packets that are sent during +the handshake are too large.
This is especially important for 0-RTT, where the +server might wish to provide application data traffic - such as a response to a +request - in response to the data carried in the early data from the client. + +To send additional data prior to completing the cryptographic handshake, the +server then needs to validate that the client owns the address that it claims. + +Source address validation is therefore performed by the core transport +protocol during the establishment of a connection. +A different type of source address validation is performed after a connection +migration, see {{migrate-validate}}. -### Idle Timeout -If the idle timeout is enabled, a connection that remains idle for longer than -the advertised idle timeout (see {{transport-parameter-definitions}}) is closed. -A connection enters the draining state when the idle timeout expires. +### Client Address Validation Procedure -Each endpoint advertises their own idle timeout to their peer. The idle timeout -starts from the last packet received. In order to ensure that initiating new -activity postpones an idle timeout, an endpoint restarts this timer when sending -a packet. An endpoint does not postpone the idle timeout if another packet has -been sent containing frames other than ACK or PADDING, and that other packet has -not been acknowledged or declared lost. Packets that contain only ACK or -PADDING frames are not acknowledged until an endpoint has other frames to send, -so they could prevent the timeout from being refreshed. +QUIC uses token-based address validation. Any time the server wishes +to validate a client address, it provides the client with a token. As +long as the token's authenticity can be checked (see +{{token-integrity}}) and the client is able to return that token, it +proves to the server that it received the token. -The value for an idle timeout can be asymmetric. The value advertised by an -endpoint is only used to determine whether the connection is live at that -endpoint. 
An endpoint that sends packets near the end of the idle timeout -period of a peer risks having those packets discarded if its peer enters the -draining state before the packets arrive. If a peer could timeout within an RTO -(see Section 4.3.3 of {{QUIC-RECOVERY}}), it is advisable to test for liveness -before sending any data that cannot be retried safely. +Upon receiving the client's Initial packet, the server can request +address validation by sending a Retry packet containing a token. This +token is repeated in the client's next Initial packet. Because the +token is consumed by the server that generates it, there is no need +for a single well-defined format. A token could include information +about the claimed client address (IP and port), a timestamp, and any +other supplementary information the server will need to validate the +token in the future. +The Retry packet is sent to the client and a legitimate client will +respond with an Initial packet containing the token from the Retry packet +when it continues the handshake. In response to receiving the token, a +server can either abort the connection or permit it to proceed. -### Immediate Close +A connection MAY be accepted without address validation - or with only limited +validation - but a server SHOULD limit the data it sends toward an unvalidated +address. Successful completion of the cryptographic handshake implicitly +provides proof that the client has received packets from the server. -An endpoint sends a closing frame (CONNECTION_CLOSE or APPLICATION_CLOSE) to -terminate the connection immediately. Any closing frame causes all streams to -immediately become closed; open streams can be assumed to be implicitly reset. +The client should allow for additional Retry packets being sent in +response to Initial packets sent containing a token. There are several +situations in which the server might not be able to use the previously +generated token to validate the client's address and must send a new +Retry. 
A reasonable limit to the number of tries the client allows +for, before giving up, is 3. That is, the client MUST echo the +address validation token from a new Retry packet up to 3 times. After +that, it MAY give up on the connection attempt. -After sending a closing frame, endpoints immediately enter the closing state. -During the closing period, an endpoint that sends a closing frame SHOULD respond -to any packet that it receives with another packet containing a closing frame. -To minimize the state that an endpoint maintains for a closing connection, -endpoints MAY send the exact same packet. However, endpoints SHOULD limit the -number of packets they generate containing a closing frame. For instance, an -endpoint could progressively increase the number of packets that it receives -before sending additional packets or increase the time between packets. -Note: +### Address Validation for Future Connections -: Allowing retransmission of a packet contradicts other advice in this document - that recommends the creation of new packet numbers for every packet. Sending - new packet numbers is primarily of advantage to loss recovery and congestion - control, which are not expected to be relevant for a closed connection. - Retransmitting the final packet requires less state. +A server MAY provide clients with an address validation token during one +connection that can be used on a subsequent connection. Address validation is +especially important with 0-RTT because a server potentially sends a significant +amount of data to a client in response to 0-RTT data. -After receiving a closing frame, endpoints enter the draining state. An -endpoint that receives a closing frame MAY send a single packet containing a -closing frame before entering the draining state, using a CONNECTION_CLOSE frame -and a NO_ERROR code if appropriate. 
An endpoint MUST NOT send further packets, -which could result in a constant exchange of closing frames until the closing -period on either peer ended. +The server uses the NEW_TOKEN frame {{frame-new-token}} to provide the +client with an address validation token that can be used to validate +future connections. The client may then use this token to validate +future connections by including it in the Initial packet's header. +The client MUST NOT use the token provided in a Retry for future +connections. -An immediate close can be used after an application protocol has arranged to -close a connection. This might be after the application protocols negotiates a -graceful shutdown. The application protocol exchanges whatever messages that -are needed to cause both endpoints to agree to close the connection, after which -the application requests that the connection be closed. The application -protocol can use an APPLICATION_CLOSE message with an appropriate error code to -signal closure. +Unlike the token that is created for a Retry packet, there might be some time +between when the token is created and when the token is subsequently used. +Thus, a resumption token SHOULD include an expiration time. The server MAY +include either an explicit expiration time or an issued timestamp and +dynamically calculate the expiration time. It is also unlikely that the client +port number is the same on two different connections; validating the port is +therefore unlikely to be successful. -### Stateless Reset {#stateless-reset} +### Address Validation Token Integrity {#token-integrity} -A stateless reset is provided as an option of last resort for an endpoint that -does not have access to the state of a connection. A crash or outage might -result in peers continuing to send data to an endpoint that is unable to -properly continue the connection. An endpoint that wishes to communicate a -fatal connection error MUST use a closing frame if it has sufficient state to do -so. 
+An address validation token MUST be difficult to guess. Including a large +enough random value in the token would be sufficient, but this depends on the +server remembering the value it sends to clients. -To support this process, a token is sent by endpoints. The token is carried in -the NEW_CONNECTION_ID frame sent by either peer, and servers can specify the -stateless_reset_token transport parameter during the handshake (clients cannot -because their transport parameters don't have confidentiality protection). This -value is protected by encryption, so only client and server know this value. -Tokens sent via NEW_CONNECTION_ID frames are invalidated when their associated -connection ID is retired via a RETIRE_CONNECTION_ID frame -({{frame-retire-connection-id}}). +A token-based scheme allows the server to offload any state associated with +validation to the client. For this design to work, the token MUST be covered by +integrity protection against modification or falsification by clients. Without +integrity protection, malicious clients could generate or guess values for +tokens that would be accepted by the server. Only the server requires access to +the integrity protection key for tokens. -An endpoint that receives packets that it cannot process sends a packet in the -following layout: -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|0|K|1|1|0|0|0|0| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Random Octets (160..) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| | -+ + -| | -+ Stateless Reset Token (128) + -| | -+ + -| | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #fig-stateless-reset title="Stateless Reset Packet"} +## Path Validation {#migrate-validate} -This design ensures that a stateless reset packet is - to the extent possible - -indistinguishable from a regular packet with a short header. 
+Path validation is used by an endpoint to verify reachability of a peer over a +specific path. That is, it tests reachability between a specific local address +and a specific peer address, where an address is the two-tuple of IP address and +port. Path validation tests that packets can be both sent to and received from +a peer. -The message consists of a header octet, followed by an arbitrary number of -random octets, followed by a Stateless Reset Token. +Path validation is used during connection migration (see {{migration}} and +{{preferred-address}}) by the migrating endpoint to verify reachability of a +peer from a new local address. Path validation is also used by the peer to +verify that the migrating endpoint is able to receive packets sent to its +new address. That is, it verifies that the packets received from the migrating +endpoint do not carry a spoofed source address. -A stateless reset will be interpreted by a recipient as a packet with a short -header. For the packet to appear as valid, the Random Octets field needs to -include at least 20 octets of random or unpredictable values. This is intended -to allow for a destination connection ID of the maximum length permitted, a -packet number, and minimal payload. The Stateless Reset Token corresponds to -the minimum expansion of the packet protection AEAD. More random octets might -be necessary if the endpoint could have negotiated a packet protection scheme -with a larger minimum AEAD expansion. +Path validation can be used at any time by either endpoint. For instance, an +endpoint might check that a peer is still in possession of its address after a +period of quiescence. -An endpoint SHOULD NOT send a stateless reset that is significantly larger than -the packet it receives. Endpoints MUST discard packets that are too small to be -valid QUIC packets. With the set of AEAD functions defined in {{QUIC-TLS}}, -packets less than 19 octets long are never valid.
+Path validation is not designed as a NAT traversal mechanism. Though the +mechanism described here might be effective for the creation of NAT bindings +that support NAT traversal, the expectation is that one or other peer is able to +receive packets without first having sent a packet on that path. Effective NAT +traversal needs additional synchronization mechanisms that are not provided +here. -An endpoint MAY send a stateless reset in response to a packet with a long -header. This would not be effective if the stateless reset token was not yet -available to a peer. In this QUIC version, packets with a long header are only -used during connection establishment. Because the stateless reset token is not -available until connection establishment is complete or near completion, -ignoring an unknown packet with a long header might be more effective. +An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that are used for +path validation with other frames. For instance, an endpoint may pad a packet +carrying a PATH_CHALLENGE for PMTU discovery, or an endpoint may bundle a +PATH_RESPONSE with its own PATH_CHALLENGE. -An endpoint cannot determine the Source Connection ID from a packet with a short -header, therefore it cannot set the Destination Connection ID in the stateless -reset packet. The Destination Connection ID will therefore differ from the -value used in previous packets. A random Destination Connection ID makes the -connection ID appear to be the result of moving to a new connection ID that was -provided using a NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). +When probing a new path, an endpoint might want to ensure that its peer has an +unused connection ID available for responses. The endpoint can send +NEW_CONNECTION_ID and PATH_CHALLENGE frames in the same packet. This ensures +that an unused connection ID will be available to the peer when sending a +response. 
-Using a randomized connection ID results in two problems: +### Initiation -* The packet might not reach the peer. If the Destination Connection ID is - critical for routing toward the peer, then this packet could be incorrectly - routed. This might also trigger another Stateless Reset in response, see - {{reset-looping}}. A Stateless Reset that is not correctly routed is - ineffective in causing errors to be quickly detected and recovered. In this - case, endpoints will need to rely on other methods - such as timers - to - detect that the connection has failed. +To initiate path validation, an endpoint sends a PATH_CHALLENGE frame containing +a random payload on the path to be validated. -* The randomly generated connection ID can be used by entities other than the - peer to identify this as a potential stateless reset. An endpoint that - occasionally uses different connection IDs might introduce some uncertainty - about this. +An endpoint MAY send additional PATH_CHALLENGE frames to handle packet loss. An +endpoint SHOULD NOT send a PATH_CHALLENGE more frequently than it would an +Initial packet, ensuring that connection migration is no more load on a new path +than establishing a new connection. -Finally, the last 16 octets of the packet are set to the value of the Stateless -Reset Token. +The endpoint MUST use fresh random data in every PATH_CHALLENGE frame so that it +can associate the peer's response with the causative PATH_CHALLENGE. -A stateless reset is not appropriate for signaling error conditions. An -endpoint that wishes to communicate a fatal connection error MUST use a -CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has sufficient state to do so. -This stateless reset design is specific to QUIC version 1. An endpoint that -supports multiple versions of QUIC needs to generate a stateless reset that will -be accepted by peers that support any version that the endpoint might support -(or might have supported prior to losing state). 
Designers of new versions of -QUIC need to be aware of this and either reuse this design, or use a portion of -the packet other than the last 16 octets for carrying data. +### Response + +On receiving a PATH_CHALLENGE frame, an endpoint MUST respond immediately by +echoing the data contained in the PATH_CHALLENGE frame in a PATH_RESPONSE frame, +with the following stipulation. Since a PATH_CHALLENGE might be sent from a +spoofed address, an endpoint MAY limit the rate at which it sends PATH_RESPONSE +frames and MAY silently discard PATH_CHALLENGE frames that would cause it to +respond at a higher rate. +To ensure that packets can be both sent to and received from the peer, the +PATH_RESPONSE MUST be sent on the same path as the triggering PATH_CHALLENGE: +from the same local address on which the PATH_CHALLENGE was received, to the +same remote address from which the PATH_CHALLENGE was received. -#### Detecting a Stateless Reset -An endpoint detects a potential stateless reset when a packet with a short -header either cannot be decrypted or is marked as a duplicate packet. The -endpoint then compares the last 16 octets of the packet with the Stateless Reset -Token provided by its peer, either in a NEW_CONNECTION_ID frame or the server's -transport parameters. If these values are identical, the endpoint MUST enter -the draining period and not send any further packets on this connection. If the -comparison fails, the packet can be discarded. +### Completion +A new address is considered valid when a PATH_RESPONSE frame is received +containing data that was sent in a previous PATH_CHALLENGE. Receipt of an +acknowledgment for a packet containing a PATH_CHALLENGE frame is not adequate +validation, since the acknowledgment can be spoofed by a malicious peer. -#### Calculating a Stateless Reset Token {#reset-token} +For path validation to be successful, a PATH_RESPONSE frame MUST be received +from the same remote address to which the corresponding PATH_CHALLENGE was +sent. 
If a PATH_RESPONSE frame is received from a different remote address than +the one to which the PATH_CHALLENGE was sent, path validation is considered to +have failed, even if the data matches that sent in the PATH_CHALLENGE. -The stateless reset token MUST be difficult to guess. In order to create a -Stateless Reset Token, an endpoint could randomly generate {{!RFC4086}} a secret -for every connection that it creates. However, this presents a coordination -problem when there are multiple instances in a cluster or a storage problem for -an endpoint that might lose state. Stateless reset specifically exists to -handle the case where state is lost, so this approach is suboptimal. +Additionally, the PATH_RESPONSE frame MUST be received on the same local address +from which the corresponding PATH_CHALLENGE was sent. If a PATH_RESPONSE frame +is received on a different local address than the one from which the +PATH_CHALLENGE was sent, path validation is considered to have failed, even if +the data matches that sent in the PATH_CHALLENGE. Thus, the endpoint considers +the path to be valid when a PATH_RESPONSE frame is received on the same path +with the same payload as the PATH_CHALLENGE frame. -A single static key can be used across all connections to the same endpoint by -generating the proof using a second iteration of a preimage-resistant function -that takes a static key and the connection ID chosen by the endpoint (see -{{connection-id}}) as input. An endpoint could use HMAC {{?RFC2104}} (for -example, HMAC(static_key, connection_id)) or HKDF {{?RFC5869}} (for example, -using the static key as input keying material, with the connection ID as salt). -The output of this function is truncated to 16 octets to produce the Stateless -Reset Token for that connection. -An endpoint that loses state can use the same method to generate a valid -Stateless Reset Token. The connection ID comes from the packet that the -endpoint receives. 
+### Abandonment -This design relies on the peer always sending a connection ID in its packets so -that the endpoint can use the connection ID from a packet to reset the -connection. An endpoint that uses this design MUST either use the same -connection ID length for all connections or encode the length of the connection -ID such that it can be recovered without state. In addition, it MUST NOT -provide a zero-length connection ID. +An endpoint SHOULD abandon path validation after sending some number of +PATH_CHALLENGE frames or after some time has passed. When setting this timer, +implementations are cautioned that the new path could have a longer round-trip +time than the original. -Revealing the Stateless Reset Token allows any entity to terminate the -connection, so a value can only be used once. This method for choosing the -Stateless Reset Token means that the combination of connection ID and static key -cannot occur for another connection. A denial of service attack is possible if -the same connection ID is used by instances that share a static key, or if an -attacker can cause a packet to be routed to an instance that has no state but -the same static key (see {{reset-oracle}}). A connection ID from a connection -that is reset by revealing the Stateless Reset Token cannot be reused for new -connections at nodes that share a static key. +Note that the endpoint might receive packets containing other frames on the new +path, but a PATH_RESPONSE frame with appropriate data is required for path +validation to succeed. -Note that Stateless Reset packets do not have any cryptographic protection. +If path validation fails, the path is deemed unusable. This does not +necessarily imply a failure of the connection - endpoints can continue sending +packets over other paths as appropriate. If no paths are available, an endpoint +can wait for a new path to become available or close the connection. +A path validation might be abandoned for other reasons besides +failure. 
Primarily, this happens if a connection migration to a new path is +initiated while a path validation on the old path is in progress. -#### Looping {#reset-looping} -The design of a Stateless Reset is such that it is indistinguishable from a -valid packet. This means that a Stateless Reset might trigger the sending of a -Stateless Reset in response, which could lead to infinite exchanges. +## Connection Migration {#migration} -An endpoint MUST ensure that every Stateless Reset that it sends is smaller than -the packet which triggered it, unless it maintains state sufficient to prevent -looping. In the event of a loop, this results in packets eventually being too -small to trigger a response. +QUIC allows connections to survive changes to endpoint addresses (that is, IP +address and/or port), such as those caused by an endpoint migrating to a new +network. This section describes the process by which an endpoint migrates to a +new address. -An endpoint can remember the number of Stateless Reset packets that it has sent -and stop generating new Stateless Reset packets once a limit is reached. Using -separate limits for different remote addresses will ensure that Stateless Reset -packets can be used to close connections when other peers or connections have -exhausted limits. +An endpoint MUST NOT initiate connection migration before the handshake is +finished and the endpoint has 1-RTT keys. The design of QUIC relies on +endpoints retaining a stable address for the duration of the handshake. -Reducing the size of a Stateless Reset below the recommended minimum size of 37 -octets could mean that the packet could reveal to an observer that it is a -Stateless Reset. Conversely, refusing to send a Stateless Reset in response to -a small packet might result in Stateless Reset not being useful in detecting -cases of broken connections where only very small packets are sent; such -failures might only be detected by other means, such as timers. 
+An endpoint also MUST NOT initiate connection migration if the peer sent the +`disable_migration` transport parameter during the handshake. An endpoint which +has sent this transport parameter, but detects that a peer has nonetheless +migrated to a different network MAY treat this as a connection error of type +INVALID_MIGRATION. -An endpoint can increase the odds that a packet will trigger a Stateless Reset -if it cannot be processed by padding it to at least 38 octets. +Not all changes of peer address are intentional migrations. The peer could +experience NAT rebinding: a change of address due to a middlebox, usually a NAT, +allocating a new outgoing port or even a new outgoing IP address for a flow. +Endpoints SHOULD perform path validation ({{migrate-validate}}) if a NAT +rebinding does not cause the connection to fail. +This document limits migration of connections to new client addresses, except as +described in {{preferred-address}}. Clients are responsible for initiating all +migrations. Servers do not send non-probing packets (see {{probing}}) toward a +client address until they see a non-probing packet from that address. If a +client receives packets from an unknown server address, the client MAY discard +these packets. -# Frame Types and Formats -As described in {{frames}}, packets contain one or more frames. This section -describes the format and semantics of the core QUIC frame types. +### Probing a New Path {#probing} +An endpoint MAY probe for peer reachability from a new local address using path +validation {{migrate-validate}} prior to migrating the connection to the new +local address. Failure of path validation simply means that the new path is not +usable for this connection. Failure to validate a path does not cause the +connection to end unless there are no valid alternative paths available. 
-## Variable-Length Integer Encoding {#integer-encoding} +An endpoint uses a new connection ID for probes sent from a new local address; +see {{migration-linkability}} for further discussion. An endpoint that uses +a new local address needs to ensure that at least one new connection ID is +available at the peer. That can be achieved by including a NEW_CONNECTION_ID +frame in the probe. -QUIC frames commonly use a variable-length encoding for non-negative integer -values. This encoding ensures that smaller integer values need fewer octets to -encode. +Receiving a PATH_CHALLENGE frame from a peer indicates that the peer is probing +for reachability on a path. An endpoint sends a PATH_RESPONSE in response as per +{{migrate-validate}}. -The QUIC variable-length integer encoding reserves the two most significant bits -of the first octet to encode the base 2 logarithm of the integer encoding length -in octets. The integer value is encoded on the remaining bits, in network byte -order. +PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames are +"probing frames", and all other frames are "non-probing frames". A packet +containing only probing frames is a "probing packet", and a packet containing +any other frame is a "non-probing packet". -This means that integers are encoded on 1, 2, 4, or 8 octets and can encode 6, -14, 30, or 62 bit values respectively. {{integer-summary}} summarizes the -encoding properties. 
-| 2Bit | Length | Usable Bits | Range | -|:-----|:-------|:------------|:----------------------| -| 00 | 1 | 6 | 0-63 | -| 01 | 2 | 14 | 0-16383 | -| 10 | 4 | 30 | 0-1073741823 | -| 11 | 8 | 62 | 0-4611686018427387903 | -{: #integer-summary title="Summary of Integer Encodings"} +### Initiating Connection Migration {#initiating-migration} -For example, the eight octet sequence c2 19 7c 5e ff 14 e8 8c (in hexadecimal) -decodes to the decimal value 151288809941952652; the four octet sequence 9d 7f -3e 7d decodes to 494878333; the two octet sequence 7b bd decodes to 15293; and -the single octet 25 decodes to 37 (as does the two octet sequence 40 25). +An endpoint can migrate a connection to a new local address by sending packets +containing frames other than probing frames from that address. -Error codes ({{error-codes}}) are described using integers, but do not use this -encoding. +Each endpoint validates its peer's address during connection establishment. +Therefore, a migrating endpoint can send to its peer knowing that the peer is +willing to receive at the peer's current address. Thus an endpoint can migrate +to a new local address without first validating the peer's address. +When migrating, the new path might not support the endpoint's current sending +rate. Therefore, the endpoint resets its congestion controller, as described in +{{migration-cc}}. -## PADDING Frame {#frame-padding} +The new path might not have the same ECN capability. Therefore, the endpoint +verifies ECN capability as described in {{using-ecn}}. -The PADDING frame (type=0x00) has no semantic value. PADDING frames can be used -to increase the size of a packet. Padding can be used to increase an initial -client packet to the minimum required size, or to provide protection against -traffic analysis for protected packets. +Receiving acknowledgments for data sent on the new path serves as proof of the +peer's reachability from the new address. 
Note that since acknowledgments may +be received on any path, return reachability on the new path is not +established. To establish return reachability on the new path, an endpoint MAY +concurrently initiate path validation {{migrate-validate}} on the new path. -A PADDING frame has no content. That is, a PADDING frame consists of the single -octet that identifies the frame as a PADDING frame. +### Responding to Connection Migration {#migration-response} -## RST_STREAM Frame {#frame-rst-stream} +Receiving a packet from a new peer address containing a non-probing frame +indicates that the peer has migrated to that address. -An endpoint may use a RST_STREAM frame (type=0x01) to abruptly terminate a -stream. +In response to such a packet, an endpoint MUST start sending subsequent packets +to the new peer address and MUST initiate path validation ({{migrate-validate}}) +to verify the peer's ownership of the unvalidated address. -After sending a RST_STREAM, an endpoint ceases transmission and retransmission -of STREAM frames on the identified stream. A receiver of RST_STREAM can discard -any data that it already received on that stream. +An endpoint MAY send data to an unvalidated peer address, but it MUST protect +against potential attacks as described in {{address-spoofing}} and +{{on-path-spoofing}}. An endpoint MAY skip validation of a peer address if that +address has been seen recently. -An endpoint that receives a RST_STREAM frame for a send-only stream MUST -terminate the connection with error PROTOCOL_VIOLATION. +An endpoint only changes the address that it sends packets to in response to the +highest-numbered non-probing packet. This ensures that an endpoint does not send +packets to an old peer address in the case that it receives reordered packets. -The RST_STREAM frame is as follows: +After changing the address to which it sends non-probing packets, an endpoint +could abandon any path validation for other addresses. 
-~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Stream ID (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Application Error Code (16) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Final Offset (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +Receiving a packet from a new peer address might be the result of a NAT +rebinding at the peer. -The fields are: +After verifying a new client address, the server SHOULD send new address +validation tokens ({{address-validation}}) to the client. -Stream ID: -: A variable-length integer encoding of the Stream ID of the stream being - terminated. +#### Handling Address Spoofing by a Peer {#address-spoofing} -Application Protocol Error Code: +It is possible that a peer is spoofing its source address to cause an endpoint +to send excessive amounts of data to an unwilling host. If the endpoint sends +significantly more data than the spoofing peer, connection migration might be +used to amplify the volume of data that an attacker can generate toward a +victim. -: A 16-bit application protocol error code (see {{app-error-codes}}) which - indicates why the stream is being closed. +As described in {{migration-response}}, an endpoint is required to validate a +peer's new address to confirm the peer's possession of the new address. Until a +peer's address is deemed valid, an endpoint MUST limit the rate at which it +sends data to this address. The endpoint MUST NOT send more than a minimum +congestion window's worth of data per estimated round-trip time (kMinimumWindow, +as defined in {{QUIC-RECOVERY}}). In the absence of this limit, an endpoint +risks being used for a denial of service attack against an unsuspecting victim. 
+Note that since the endpoint will not have any round-trip time measurements to +this address, the estimate SHOULD be the default initial value (see +{{QUIC-RECOVERY}}). -Final Offset: +If an endpoint skips validation of a peer address as described in +{{migration-response}}, it does not need to limit its sending rate. -: A variable-length integer indicating the absolute byte offset of the end of - data written on this stream by the RST_STREAM sender. +#### Handling Address Spoofing by an On-path Attacker {#on-path-spoofing} + +An on-path attacker could cause a spurious connection migration by copying and +forwarding a packet with a spoofed address such that it arrives before the +original packet. The packet with the spoofed address will be seen to come from +a migrating connection, and the original packet will be seen as a duplicate and +dropped. After a spurious migration, validation of the source address will fail +because the entity at the source address does not have the necessary +cryptographic keys to read or respond to the PATH_CHALLENGE frame that is sent +to it even if it wanted to. -## CONNECTION_CLOSE frame {#frame-connection-close} +To protect the connection from failing due to such a spurious migration, an +endpoint MUST revert to using the last validated peer address when validation of +a new peer address fails. -An endpoint sends a CONNECTION_CLOSE frame (type=0x02) to notify its peer that -the connection is being closed. CONNECTION_CLOSE is used to signal errors at -the QUIC layer, or the absence of errors (with the NO_ERROR code). +If an endpoint has no state about the last validated peer address, it MUST close +the connection silently by discarding all connection state. This results in new +packets on the connection being handled generically. For instance, an endpoint +MAY send a stateless reset in response to any further incoming packets. 
-If there are open streams that haven't been explicitly closed, they are -implicitly closed when the connection is closed. +Note that receipt of packets with higher packet numbers from the legitimate peer +address will trigger another connection migration. This will cause the +validation of the address of the spurious migration to be abandoned. -The CONNECTION_CLOSE frame is as follows: +### Loss Detection and Congestion Control {#migration-cc} -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Error Code (16) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame Type (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Reason Phrase Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Reason Phrase (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +The capacity available on the new path might not be the same as the old path. +Packets sent on the old path SHOULD NOT contribute to congestion control or RTT +estimation for the new path. -The fields of a CONNECTION_CLOSE frame are as follows: +On confirming a peer's ownership of its new address, an endpoint SHOULD +immediately reset the congestion controller and round-trip time estimator for +the new path. -Error Code: +An endpoint MUST NOT return to the send rate used for the previous path unless +it is reasonably sure that the previous send rate is valid for the new path. +For instance, a change in the client's port number is likely indicative of a +rebinding in a middlebox and not a complete change in path. This determination +likely depends on heuristics, which could be imperfect; if the new path capacity +is significantly reduced, ultimately this relies on the congestion controller +responding to congestion signals and reducing send rates appropriately. 
-: A 16-bit error code which indicates the reason for closing this connection. - CONNECTION_CLOSE uses codes from the space defined in {{error-codes}}. +There may be apparent reordering at the receiver when an endpoint sends data and +probes from/to multiple addresses during the migration period, since the two +resulting paths may have different round-trip times. A receiver of packets on +multiple paths will still send ACK frames covering all received packets. -Frame Type: +While multiple paths might be used during connection migration, a single +congestion control context and a single loss recovery context (as described in +{{QUIC-RECOVERY}}) may be adequate. A sender can make exceptions for probe +packets so that their loss detection is independent and does not unduly cause +the congestion controller to reduce its sending rate. An endpoint might set a +separate timer when a PATH_CHALLENGE is sent, which is cancelled when the +corresponding PATH_RESPONSE is received. If the timer fires before the +PATH_RESPONSE is received, the endpoint might send a new PATH_CHALLENGE, and +restart the timer for a longer period of time. -: A variable-length integer encoding the type of frame that triggered the error. - A value of 0 (equivalent to the mention of the PADDING frame) is used when the - frame type is unknown. -Reason Phrase Length: +### Privacy Implications of Connection Migration {#migration-linkability} -: A variable-length integer specifying the length of the reason phrase in bytes. - Note that a CONNECTION_CLOSE frame cannot be split between packets, so in - practice any limits on packet size will also limit the space available for a - reason phrase. +Using a stable connection ID on multiple network paths allows a passive observer +to correlate activity between those paths. 
An endpoint that moves between +networks might not wish to have their activity correlated by any entity other +than their peer, so different connection IDs are used when sending from +different local addresses, as discussed in {{connection-id}}. For this to be +effective, endpoints need to ensure that connection IDs they provide cannot be +linked by any other entity. -Reason Phrase: +This eliminates the use of the connection ID for linking activity from +the same connection on different networks. Protection of packet numbers ensures +that packet numbers cannot be used to correlate activity. This does not prevent +other properties of packets, such as timing and size, from being used to +correlate activity. -: A human-readable explanation for why the connection was closed. This can be - zero length if the sender chooses to not give details beyond the Error Code. - This SHOULD be a UTF-8 encoded string {{!RFC3629}}. +Clients MAY move to a new connection ID at any time based on +implementation-specific concerns. For example, after a period of network +inactivity NAT rebinding might occur when the client begins sending data again. +A client might wish to reduce linkability by employing a new connection ID and +source UDP port when sending traffic after a period of inactivity. Changing the +UDP port from which it sends packets at the same time might cause the packet to +appear as a connection migration. This ensures that the mechanisms that support +migration are exercised even for clients that don't experience NAT rebindings or +genuine migrations. Changing port number can cause a peer to reset its +congestion state (see {{migration-cc}}), so the port SHOULD only be changed +infrequently. -## APPLICATION_CLOSE frame {#frame-application-close} +Endpoints that use connection IDs with length greater than zero could have their +activity correlated if their peers keep using the same destination connection ID +after migration. 
Endpoints that receive packets with a previously unused +Destination Connection ID SHOULD change to sending packets with a connection ID +that has not been used on any other network path. The goal here is to ensure +that packets sent on different paths cannot be correlated. To fulfill this +privacy requirement, endpoints that initiate migration and use connection IDs +with length greater than zero SHOULD provide their peers with new connection IDs +before migration. -An APPLICATION_CLOSE frame (type=0x03) is used to signal an error with the -protocol that uses QUIC. +Caution: -The APPLICATION_CLOSE frame is as follows: +: If both endpoints change connection ID in response to seeing a change in + connection ID from their peer, then this can trigger an infinite sequence of + changes. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Error Code (16) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Reason Phrase Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Reason Phrase (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +## Server's Preferred Address {#preferred-address} -The fields of an APPLICATION_CLOSE frame are as follows: +QUIC allows servers to accept connections on one IP address and attempt to +transfer these connections to a more preferred address shortly after the +handshake. This is particularly useful when clients initially connect to an +address shared by multiple servers but would prefer to use a unicast address to +ensure connection stability. This section describes the protocol for migrating a +connection to a preferred server address. -Error Code: +Migrating a connection to a new server address mid-connection is left for future +work. 
If a client receives packets from a new server address not indicated by +the preferred_address transport parameter, the client SHOULD discard these +packets. -: A 16-bit error code which indicates the reason for closing this connection. - APPLICATION_CLOSE uses codes from the application protocol error code space, - see {{app-error-codes}}. +### Communicating A Preferred Address -Reason Phrase Length: +A server conveys a preferred address by including the preferred_address +transport parameter in the TLS handshake. -: This field is identical in format and semantics to the Reason Phrase Length - field from CONNECTION_CLOSE. +Once the handshake is finished, the client SHOULD initiate path validation (see +{{migrate-validate}}) of the server's preferred address using the connection ID +provided in the preferred_address transport parameter. -Reason Phrase: +If path validation succeeds, the client SHOULD immediately begin sending all +future packets to the new server address using the new connection ID and +discontinue use of the old server address. If path validation fails, the client +MUST continue sending all future packets to the server's original IP address. -: This field is identical in format and semantics to the Reason Phrase field - from CONNECTION_CLOSE. -APPLICATION_CLOSE has similar format and semantics to the CONNECTION_CLOSE frame -({{frame-connection-close}}). Aside from the semantics of the Error Code field -and the omission of the Frame Type field, both frames are used to close the -connection. +### Responding to Connection Migration +A server might receive a packet addressed to its preferred IP address at any +time after the handshake is completed. If this packet contains a PATH_CHALLENGE +frame, the server sends a PATH_RESPONSE frame as per {{migrate-validate}}, but +the server MUST continue sending all other packets from its original IP address. 
-## MAX_DATA Frame {#frame-max-data} +The server SHOULD also initiate path validation of the client using its +preferred address and the address from which it received the client probe. This +helps to guard against spurious migration initiated by an attacker. -The MAX_DATA frame (type=0x04) is used in flow control to inform the peer of -the maximum amount of data that can be sent on the connection as a whole. +Once the server has completed its path validation and has received a non-probing +packet with a new largest packet number on its preferred address, the server +begins sending to the client exclusively from its preferred IP address. It +SHOULD drop packets for this connection received on the old IP address, but MAY +continue to process delayed packets. -The frame is as follows: -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Maximum Data (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +### Interaction of Client Migration and Preferred Address -The fields in the MAX_DATA frame are as follows: +A client might need to perform a connection migration before it has migrated to +the server's preferred address. In this case, the client SHOULD perform path +validation to both the original and preferred server address from the client's +new address concurrently. -Maximum Data: +If path validation of the server's preferred address succeeds, the client MUST +abandon validation of the original address and migrate to using the server's +preferred address. If path validation of the server's preferred address fails, +but validation of the server's original address succeeds, the client MAY migrate +to using the original address from the client's new address. -: A variable-length integer indicating the maximum amount of data that can be - sent on the entire connection, in units of octets. 
+If the connection to the server's preferred address is not from the same client +address, the server MUST protect against potential attacks as described in +{{address-spoofing}} and {{on-path-spoofing}}. In addition to intentional +simultaneous migration, this might also occur because the client's access +network used a different NAT binding for the server's preferred address. -All data sent in STREAM frames counts toward this limit. The sum of the largest -received offsets on all streams - including streams in terminal states - MUST -NOT exceed the value advertised by a receiver. An endpoint MUST terminate a -connection with a FLOW_CONTROL_ERROR error if it receives more data than the -maximum data value that it has sent, unless this is a result of a change in -the initial limits (see {{zerortt-parameters}}). +Servers SHOULD initiate path validation to the client's new address upon +receiving a probe packet from a different address. Servers MUST NOT send more +than a minimum congestion window's worth of non-probing packets to the new +address before path validation is complete. -## MAX_STREAM_DATA Frame {#frame-max-stream-data} +## Connection Termination {#termination} -The MAX_STREAM_DATA frame (type=0x05) is used in flow control to inform a peer -of the maximum amount of data that can be sent on a stream. +Connections should remain open until they become idle for a pre-negotiated +period of time. A QUIC connection, once established, can be terminated in one +of three ways: -An endpoint that receives a MAX_STREAM_DATA frame for a receive-only stream -MUST terminate the connection with error PROTOCOL_VIOLATION. +* idle timeout ({{idle-timeout}}) +* immediate close ({{immediate-close}}) +* stateless reset ({{stateless-reset}}) -An endpoint that receives a MAX_STREAM_DATA frame for a send-only stream -it has not opened MUST terminate the connection with error PROTOCOL_VIOLATION. 
-Note that an endpoint may legally receive a MAX_STREAM_DATA frame on a -bidirectional stream it has not opened. +### Closing and Draining Connection States {#draining} -The frame is as follows: +The closing and draining connection states exist to ensure that connections +close cleanly and that delayed or reordered packets are properly discarded. +These states SHOULD persist for three times the current Retransmission Timeout +(RTO) interval as defined in {{QUIC-RECOVERY}}. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Stream ID (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Maximum Stream Data (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +An endpoint enters a closing period after initiating an immediate close +({{immediate-close}}). While closing, an endpoint MUST NOT send packets unless +they contain a CONNECTION_CLOSE or APPLICATION_CLOSE frame (see +{{immediate-close}} for details). -The fields in the MAX_STREAM_DATA frame are as follows: +In the closing state, only a packet containing a closing frame can be sent. An +endpoint retains only enough information to generate a packet containing a +closing frame and to identify packets as belonging to the connection. The +connection ID and QUIC version is sufficient information to identify packets for +a closing connection; an endpoint can discard all other connection state. An +endpoint MAY retain packet protection keys for incoming packets to allow it to +read and process a closing frame. -Stream ID: +The draining state is entered once an endpoint receives a signal that its peer +is closing or draining. While otherwise identical to the closing state, an +endpoint in the draining state MUST NOT send any packets. Retaining packet +protection keys is unnecessary once a connection is in the draining state. 
-: The stream ID of the stream that is affected encoded as a variable-length - integer. +An endpoint MAY transition from the closing period to the draining period if it +can confirm that its peer is also closing or draining. Receiving a closing +frame is sufficient confirmation, as is receiving a stateless reset. The +draining period SHOULD end when the closing period would have ended. In other +words, the endpoint can use the same end time, but cease retransmission of the +closing packet. -Maximum Stream Data: +Disposing of connection state prior to the end of the closing or draining period +could cause delayed or reordered packets to be handled poorly. Endpoints that +have some alternative means to ensure that late-arriving packets on the +connection do not create QUIC state, such as those that are able to close the +UDP socket, MAY use an abbreviated draining period which can allow for faster +resource recovery. Servers that retain an open socket for accepting new +connections SHOULD NOT exit the closing or draining period early. -: A variable-length integer indicating the maximum amount of data that can be - sent on the identified stream, in units of octets. +Once the closing or draining period has ended, an endpoint SHOULD discard all +connection state. This results in new packets on the connection being handled +generically. For instance, an endpoint MAY send a stateless reset in response +to any further incoming packets. -When counting data toward this limit, an endpoint accounts for the largest -received offset of data that is sent or received on the stream. Loss or -reordering can mean that the largest received offset on a stream can be greater -than the total size of data received on that stream. Receiving STREAM frames -might not increase the largest received offset. +The draining and closing periods do not apply when a stateless reset +({{stateless-reset}}) is sent. 
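
The closing and draining rules added above can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the draft's normative behaviour: the names `State`, `ConnectionCloser`, `initiate_close`, `on_peer_closing`, `may_send`, and `on_timer` are hypothetical, time is a plain number of seconds, and the caller supplies the current RTO.

```python
from enum import Enum, auto


class State(Enum):
    OPEN = auto()
    CLOSING = auto()
    DRAINING = auto()
    DISCARDED = auto()


class ConnectionCloser:
    """Hypothetical helper tracking the closing and draining periods."""

    def __init__(self):
        self.state = State.OPEN
        self.deadline = None  # time at which connection state is discarded

    def initiate_close(self, now, rto):
        # Sending a closing frame starts the closing period (3 x RTO).
        if self.state is State.OPEN:
            self.state = State.CLOSING
            self.deadline = now + 3 * rto

    def on_peer_closing(self, now, rto):
        # A received closing frame or stateless reset confirms that the
        # peer is closing or draining. If we were already closing, the
        # draining period keeps the same end time.
        if self.state is State.OPEN:
            self.deadline = now + 3 * rto
        if self.state in (State.OPEN, State.CLOSING):
            self.state = State.DRAINING

    def may_send(self, has_closing_frame):
        if self.state is State.OPEN:
            return True
        if self.state is State.CLOSING:
            # Only packets carrying a closing frame may be sent.
            return has_closing_frame
        # Draining or discarded: send nothing.
        return False

    def on_timer(self, now):
        if self.state in (State.CLOSING, State.DRAINING) and now >= self.deadline:
            self.state = State.DISCARDED
```

An endpoint using such a helper would call `on_timer` from its event loop and could answer any later packets generically, for example with a stateless reset.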
-The data sent on a stream MUST NOT exceed the largest maximum stream data value -advertised by the receiver. An endpoint MUST terminate a connection with a -FLOW_CONTROL_ERROR error if it receives more data than the largest maximum -stream data that it has sent for the affected stream, unless this is a result of -a change in the initial limits (see {{zerortt-parameters}}). +An endpoint is not expected to handle key updates when it is closing or +draining. A key update might prevent the endpoint from moving from the closing +state to draining, but it otherwise has no impact. +An endpoint could receive packets from a new source address, indicating a client +connection migration ({{migration}}), while in the closing period. An endpoint +in the closing state MUST strictly limit the number of packets it sends to this +new address until the address is validated (see {{migrate-validate}}). A server +in the closing state MAY instead choose to discard packets received from a new +source address. -## MAX_STREAM_ID Frame {#frame-max-stream-id} -The MAX_STREAM_ID frame (type=0x06) informs the peer of the maximum stream ID -that they are permitted to open. +### Idle Timeout -The frame is as follows: +If the idle timeout is enabled, a connection that remains idle for longer than +the advertised idle timeout (see {{transport-parameter-definitions}}) is closed. +A connection enters the draining state when the idle timeout expires. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Maximum Stream ID (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +Each endpoint advertises their own idle timeout to their peer. The idle timeout +starts from the last packet received. In order to ensure that initiating new +activity postpones an idle timeout, an endpoint restarts this timer when sending +a packet. 
An endpoint does not postpone the idle timeout if another packet has +been sent containing frames other than ACK or PADDING, and that other packet has +not been acknowledged or declared lost. Packets that contain only ACK or +PADDING frames are not acknowledged until an endpoint has other frames to send, +so they could prevent the timeout from being refreshed. -The fields in the MAX_STREAM_ID frame are as follows: +The value for an idle timeout can be asymmetric. The value advertised by an +endpoint is only used to determine whether the connection is live at that +endpoint. An endpoint that sends packets near the end of the idle timeout +period of a peer risks having those packets discarded if its peer enters the +draining state before the packets arrive. If a peer could timeout within an RTO +(see Section 4.3.3 of {{QUIC-RECOVERY}}), it is advisable to test for liveness +before sending any data that cannot be retried safely. -Maximum Stream ID: -: ID of the maximum unidirectional or bidirectional peer-initiated stream ID for - the connection encoded as a variable-length integer. The limit applies to - unidirectional steams if the second least signification bit of the stream ID - is 1, and applies to bidirectional streams if it is 0. -Loss or reordering can mean that a MAX_STREAM_ID frame can be received which -states a lower stream limit than the client has previously received. -MAX_STREAM_ID frames which do not increase the maximum stream ID MUST be -ignored. +### Immediate Close -A peer MUST NOT initiate a stream with a higher stream ID than the greatest -maximum stream ID it has received. An endpoint MUST terminate a connection with -a STREAM_ID_ERROR error if a peer initiates a stream with a higher stream ID -than it has sent, unless this is a result of a change in the initial limits (see -{{zerortt-parameters}}). +An endpoint sends a closing frame (CONNECTION_CLOSE or APPLICATION_CLOSE) to +terminate the connection immediately. 
Any closing frame causes all streams to +immediately become closed; open streams can be assumed to be implicitly reset. +After sending a closing frame, endpoints immediately enter the closing state. +During the closing period, an endpoint that sends a closing frame SHOULD respond +to any packet that it receives with another packet containing a closing frame. +To minimize the state that an endpoint maintains for a closing connection, +endpoints MAY send the exact same packet. However, endpoints SHOULD limit the +number of packets they generate containing a closing frame. For instance, an +endpoint could progressively increase the number of packets that it receives +before sending additional packets or increase the time between packets. -## PING Frame {#frame-ping} +Note: -Endpoints can use PING frames (type=0x07) to verify that their peers are still -alive or to check reachability to the peer. The PING frame contains no -additional fields. +: Allowing retransmission of a packet contradicts other advice in this document + that recommends the creation of new packet numbers for every packet. Sending + new packet numbers is primarily of advantage to loss recovery and congestion + control, which are not expected to be relevant for a closed connection. + Retransmitting the final packet requires less state. -The receiver of a PING frame simply needs to acknowledge the packet containing -this frame. +After receiving a closing frame, endpoints enter the draining state. An +endpoint that receives a closing frame MAY send a single packet containing a +closing frame before entering the draining state, using a CONNECTION_CLOSE frame +and a NO_ERROR code if appropriate. An endpoint MUST NOT send further packets, +which could result in a constant exchange of closing frames until the closing +period on either peer ended. -The PING frame can be used to keep a connection alive when an application or -application protocol wishes to prevent the connection from timing out. 
An -application protocol SHOULD provide guidance about the conditions under which -generating a PING is recommended. This guidance SHOULD indicate whether it is -the client or the server that is expected to send the PING. Having both -endpoints send PING frames without coordination can produce an excessive number -of packets and poor performance. +An immediate close can be used after an application protocol has arranged to +close a connection. This might be after the application protocols negotiates a +graceful shutdown. The application protocol exchanges whatever messages that +are needed to cause both endpoints to agree to close the connection, after which +the application requests that the connection be closed. The application +protocol can use an APPLICATION_CLOSE message with an appropriate error code to +signal closure. -A connection will time out if no packets are sent or received for a period -longer than the time specified in the idle_timeout transport parameter (see -{{termination}}). However, state in middleboxes might time out earlier than -that. Though REQ-5 in {{?RFC4787}} recommends a 2 minute timeout interval, -experience shows that sending packets every 15 to 30 seconds is necessary to -prevent the majority of middleboxes from losing state for UDP flows. +### Stateless Reset {#stateless-reset} -## BLOCKED Frame {#frame-blocked} +A stateless reset is provided as an option of last resort for an endpoint that +does not have access to the state of a connection. A crash or outage might +result in peers continuing to send data to an endpoint that is unable to +properly continue the connection. An endpoint that wishes to communicate a +fatal connection error MUST use a closing frame if it has sufficient state to do +so. -A sender SHOULD send a BLOCKED frame (type=0x08) when it wishes to send data, -but is unable to due to connection-level flow control (see {{blocking}}). 
-BLOCKED frames can be used as input to tuning of flow control algorithms (see -{{fc-credit}}). +To support this process, a token is sent by endpoints. The token is carried in +the NEW_CONNECTION_ID frame sent by either peer, and servers can specify the +stateless_reset_token transport parameter during the handshake (clients cannot +because their transport parameters don't have confidentiality protection). This +value is protected by encryption, so only client and server know this value. +Tokens sent via NEW_CONNECTION_ID frames are invalidated when their associated +connection ID is retired via a RETIRE_CONNECTION_ID frame +({{frame-retire-connection-id}}). -The BLOCKED frame is as follows: +An endpoint that receives packets that it cannot process sends a packet in the +following layout: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|0|K|1|1|0|0|0|0| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Offset (i) ... +| Random Octets (160..) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| | ++ + +| | ++ Stateless Reset Token (128) + +| | ++ + +| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ +{: #fig-stateless-reset title="Stateless Reset Packet"} -The BLOCKED frame contains a single field. - -Offset: - -: A variable-length integer indicating the connection-level offset at which - the blocking occurred. - +This design ensures that a stateless reset packet is - to the extent possible - +indistinguishable from a regular packet with a short header. -## STREAM_BLOCKED Frame {#frame-stream-blocked} +The message consists of a header octet, followed by an arbitrary number of +random octets, followed by a Stateless Reset Token. -A sender SHOULD send a STREAM_BLOCKED frame (type=0x09) when it wishes to send -data, but is unable to due to stream-level flow control. This frame is -analogous to BLOCKED ({{frame-blocked}}). 
+A stateless reset will be interpreted by a recipient as a packet with a short +header. For the packet to appear as valid, the Random Octets field needs to +include at least 20 octets of random or unpredictable values. This is intended +to allow for a destination connection ID of the maximum length permitted, a +packet number, and minimal payload. The Stateless Reset Token corresponds to +the minimum expansion of the packet protection AEAD. More random octets might +be necessary if the endpoint could have negotiated a packet protection scheme +with a larger minimum AEAD expansion. -An endpoint that receives a STREAM_BLOCKED frame for a send-only stream MUST -terminate the connection with error PROTOCOL_VIOLATION. +An endpoint SHOULD NOT send a stateless reset that is significantly larger than +the packet it receives. Endpoints MUST discard packets that are too small to be +valid QUIC packets. With the set of AEAD functions defined in {{QUIC-TLS}}, +packets less than 19 octets long are never valid. -The STREAM_BLOCKED frame is as follows: +An endpoint MAY send a stateless reset in response to a packet with a long +header. This would not be effective if the stateless reset token was not yet +available to a peer. In this QUIC version, packets with a long header are only +used during connection establishment. Because the stateless reset token is not +available until connection establishment is complete or near completion, +ignoring an unknown packet with a long header might be more effective. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Stream ID (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Offset (i) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +An endpoint cannot determine the Source Connection ID from a packet with a short +header, therefore it cannot set the Destination Connection ID in the stateless +reset packet. The Destination Connection ID will therefore differ from the +value used in previous packets. A random Destination Connection ID makes the +connection ID appear to be the result of moving to a new connection ID that was +provided using a NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). -The STREAM_BLOCKED frame contains two fields: +Using a randomized connection ID results in two problems: -Stream ID: +* The packet might not reach the peer. If the Destination Connection ID is + critical for routing toward the peer, then this packet could be incorrectly + routed. This might also trigger another Stateless Reset in response, see + {{reset-looping}}. A Stateless Reset that is not correctly routed is + ineffective in causing errors to be quickly detected and recovered. In this + case, endpoints will need to rely on other methods - such as timers - to + detect that the connection has failed. -: A variable-length integer indicating the stream which is flow control blocked. +* The randomly generated connection ID can be used by entities other than the + peer to identify this as a potential stateless reset. An endpoint that + occasionally uses different connection IDs might introduce some uncertainty + about this. -Offset: +Finally, the last 16 octets of the packet are set to the value of the Stateless +Reset Token. -: A variable-length integer indicating the offset of the stream at which the - blocking occurred. +A stateless reset is not appropriate for signaling error conditions. An +endpoint that wishes to communicate a fatal connection error MUST use a +CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has sufficient state to do so. +This stateless reset design is specific to QUIC version 1. 
An endpoint that +supports multiple versions of QUIC needs to generate a stateless reset that will +be accepted by peers that support any version that the endpoint might support +(or might have supported prior to losing state). Designers of new versions of +QUIC need to be aware of this and either reuse this design, or use a portion of +the packet other than the last 16 octets for carrying data. -## STREAM_ID_BLOCKED Frame {#frame-stream-id-blocked} -A sender MAY send a STREAM_ID_BLOCKED frame (type=0x0a) when it wishes to open a -stream, but is unable to due to the maximum stream ID limit set by its peer (see -{{frame-max-stream-id}}). This does not open the stream, but informs the peer -that a new stream was needed, but the stream limit prevented the creation of the -stream. +#### Detecting a Stateless Reset -The STREAM_ID_BLOCKED frame is as follows: +An endpoint detects a potential stateless reset when a packet with a short +header either cannot be decrypted or is marked as a duplicate packet. The +endpoint then compares the last 16 octets of the packet with the Stateless Reset +Token provided by its peer, either in a NEW_CONNECTION_ID frame or the server's +transport parameters. If these values are identical, the endpoint MUST enter +the draining period and not send any further packets on this connection. If the +comparison fails, the packet can be discarded. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Stream ID (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -The STREAM_ID_BLOCKED frame contains a single field. +#### Calculating a Stateless Reset Token {#reset-token} -Stream ID: +The stateless reset token MUST be difficult to guess. In order to create a +Stateless Reset Token, an endpoint could randomly generate {{!RFC4086}} a secret +for every connection that it creates. 
However, this presents a coordination +problem when there are multiple instances in a cluster or a storage problem for +an endpoint that might lose state. Stateless reset specifically exists to +handle the case where state is lost, so this approach is suboptimal. -: A variable-length integer indicating the highest stream ID that the sender - was permitted to open. +A single static key can be used across all connections to the same endpoint by +generating the proof using a second iteration of a preimage-resistant function +that takes a static key and the connection ID chosen by the endpoint (see +{{connection-id}}) as input. An endpoint could use HMAC {{?RFC2104}} (for +example, HMAC(static_key, connection_id)) or HKDF {{?RFC5869}} (for example, +using the static key as input keying material, with the connection ID as salt). +The output of this function is truncated to 16 octets to produce the Stateless +Reset Token for that connection. -## NEW_CONNECTION_ID Frame {#frame-new-connection-id} +An endpoint that loses state can use the same method to generate a valid +Stateless Reset Token. The connection ID comes from the packet that the +endpoint receives. -An endpoint sends a NEW_CONNECTION_ID frame (type=0x0b) to provide its peer with -alternative connection IDs that can be used to break linkability when migrating -connections (see {{migration-linkability}}). +This design relies on the peer always sending a connection ID in its packets so +that the endpoint can use the connection ID from a packet to reset the +connection. An endpoint that uses this design MUST either use the same +connection ID length for all connections or encode the length of the connection +ID such that it can be recovered without state. In addition, it MUST NOT +provide a zero-length connection ID. -The NEW_CONNECTION_ID frame is as follows: +Revealing the Stateless Reset Token allows any entity to terminate the +connection, so a value can only be used once. 
This method for choosing the +Stateless Reset Token means that the combination of connection ID and static key +cannot occur for another connection. A denial of service attack is possible if +the same connection ID is used by instances that share a static key, or if an +attacker can cause a packet to be routed to an instance that has no state but +the same static key (see {{reset-oracle}}). A connection ID from a connection +that is reset by revealing the Stateless Reset Token cannot be reused for new +connections at nodes that share a static key. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Length (8) | Sequence Number (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Connection ID (32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| | -+ + -| | -+ Stateless Reset Token (128) + -| | -+ + -| | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +Note that Stateless Reset packets do not have any cryptographic protection. -The fields are: -Length: +#### Looping {#reset-looping} -: An 8-bit unsigned integer containing the length of the connection ID. Values - less than 4 and greater than 18 are invalid and MUST be treated as a - connection error of type PROTOCOL_VIOLATION. +The design of a Stateless Reset is such that it is indistinguishable from a +valid packet. This means that a Stateless Reset might trigger the sending of a +Stateless Reset in response, which could lead to infinite exchanges. -Sequence Number: +An endpoint MUST ensure that every Stateless Reset that it sends is smaller than +the packet which triggered it, unless it maintains state sufficient to prevent +looping. In the event of a loop, this results in packets eventually being too +small to trigger a response. -: The sequence number assigned to the connection ID by the sender. See - {{issuing-connection-ids}}. 
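
The token derivation, packet layout, and size rule described in the added text can be sketched as follows. This is an illustration only: the text names HMAC but not a hash function, so SHA-256 is an assumption here, and the helper names (`stateless_reset_token`, `build_stateless_reset`, `is_stateless_reset`) are hypothetical. The sketch also assumes a policy of sending nothing rather than shrinking a reset below 37 octets; the text leaves that trade-off to the implementation.

```python
import hashlib
import hmac
import os

TOKEN_LEN = 16           # Stateless Reset Token is 128 bits
MIN_RANDOM_OCTETS = 20   # minimum Random Octets for the packet to look valid
MIN_RESET_LEN = 1 + MIN_RANDOM_OCTETS + TOKEN_LEN  # 37 octets


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    # HMAC(static_key, connection_id), truncated to 16 octets.
    # SHA-256 is an assumption; the text does not fix the hash.
    return hmac.new(static_key, connection_id, hashlib.sha256).digest()[:TOKEN_LEN]


def build_stateless_reset(static_key, connection_id, trigger_len, key_phase=0):
    # Keep the reset smaller than the packet that triggered it so that
    # two stateless endpoints cannot loop indefinitely.
    size = trigger_len - 1
    if size < MIN_RESET_LEN:
        return None  # assumed policy: stay silent instead of shrinking further
    header = bytes([0x30 | ((key_phase & 1) << 6)])  # bits 0|K|1|1|0|0|0|0
    random_octets = os.urandom(size - 1 - TOKEN_LEN)
    return header + random_octets + stateless_reset_token(static_key, connection_id)


def is_stateless_reset(packet: bytes, expected_token: bytes) -> bool:
    # Compare the last 16 octets with the token the peer provided.
    if len(packet) < TOKEN_LEN:
        return False
    return hmac.compare_digest(packet[-TOKEN_LEN:], expected_token)
```

The receiving side would run `is_stateless_reset` only on short-header packets that fail decryption or look like duplicates, and enter the draining period on a match.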
+An endpoint can remember the number of Stateless Reset packets that it has sent +and stop generating new Stateless Reset packets once a limit is reached. Using +separate limits for different remote addresses will ensure that Stateless Reset +packets can be used to close connections when other peers or connections have +exhausted limits. -Connection ID: +Reducing the size of a Stateless Reset below the recommended minimum size of 37 +octets could mean that the packet could reveal to an observer that it is a +Stateless Reset. Conversely, refusing to send a Stateless Reset in response to +a small packet might result in Stateless Reset not being useful in detecting +cases of broken connections where only very small packets are sent; such +failures might only be detected by other means, such as timers. -: A connection ID of the specified length. +An endpoint can increase the odds that a packet will trigger a Stateless Reset +if it cannot be processed by padding it to at least 38 octets. -Stateless Reset Token: -: A 128-bit value that will be used for a stateless reset when the associated - connection ID is used (see {{stateless-reset}}). +# Frame Types and Formats -An endpoint MUST NOT send this frame if it currently requires that its peer send -packets with a zero-length Destination Connection ID. Changing the length of a -connection ID to or from zero-length makes it difficult to identify when the -value of the connection ID changed. An endpoint that is sending packets with a -zero-length Destination Connection ID MUST treat receipt of a NEW_CONNECTION_ID -frame as a connection error of type PROTOCOL_VIOLATION. +As described in {{frames}}, packets contain one or more frames. This section +describes the format and semantics of the core QUIC frame types. -Transmission errors, timeouts and retransmissions might cause the same -NEW_CONNECTION_ID frame to be received multiple times. Receipt of the same -frame multiple times MUST NOT be treated as a connection error. 
A receiver can -use the sequence number supplied in the NEW_CONNECTION_ID frame to identify new -connection IDs from old ones. -If an endpoint receives a NEW_CONNECTION_ID frame that repeats a previously -issued connection ID with a different Stateless Reset Token or a different -sequence number, the endpoint MAY treat that receipt as a connection error of -type PROTOCOL_VIOLATION. +## Variable-Length Integer Encoding {#integer-encoding} -## RETIRE_CONNECTION_ID Frame {#frame-retire-connection-id} +QUIC frames commonly use a variable-length encoding for non-negative integer +values. This encoding ensures that smaller integer values need fewer octets to +encode. -An endpoint sends a RETIRE_CONNECTION_ID frame (type=0x1b) to indicate that it -will no longer use a connection ID that was issued by its peer. This may include -the connection ID provided during the handshake. Sending a RETIRE_CONNECTION_ID -frame also serves as a request to the peer to send additional connection IDs for -future use (see {{connection-id}}). New connection IDs can be delivered to a -peer using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). +The QUIC variable-length integer encoding reserves the two most significant bits +of the first octet to encode the base 2 logarithm of the integer encoding length +in octets. The integer value is encoded on the remaining bits, in network byte +order. -Retiring a connection ID invalidates the stateless reset token associated with -that connection ID. +This means that integers are encoded on 1, 2, 4, or 8 octets and can encode 6, +14, 30, or 62 bit values respectively. {{integer-summary}} summarizes the +encoding properties. 
-The RETIRE_CONNECTION_ID frame is as follows: +| 2Bit | Length | Usable Bits | Range | +|:-----|:-------|:------------|:----------------------| +| 00 | 1 | 6 | 0-63 | +| 01 | 2 | 14 | 0-16383 | +| 10 | 4 | 30 | 0-1073741823 | +| 11 | 8 | 62 | 0-4611686018427387903 | +{: #integer-summary title="Summary of Integer Encodings"} -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Sequence Number (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ +For example, the eight octet sequence c2 19 7c 5e ff 14 e8 8c (in hexadecimal) +decodes to the decimal value 151288809941952652; the four octet sequence 9d 7f +3e 7d decodes to 494878333; the two octet sequence 7b bd decodes to 15293; and +the single octet 25 decodes to 37 (as does the two octet sequence 40 25). -The fields are: +Error codes ({{error-codes}}) are described using integers, but do not use this +encoding. -Sequence Number: -: The sequence number of the connection ID being retired. See - {{retiring-cids}}. +## PADDING Frame {#frame-padding} -Receipt of a RETIRE_CONNECTION_ID frame containing a sequence number greater -than any previously sent to the peer MAY be treated as a connection error of -type PROTOCOL_VIOLATION. +The PADDING frame (type=0x00) has no semantic value. PADDING frames can be used +to increase the size of a packet. Padding can be used to increase an initial +client packet to the minimum required size, or to provide protection against +traffic analysis for protected packets. -An endpoint cannot send this frame if it was provided with a zero-length -connection ID by its peer. An endpoint that provides a zero-length connection -ID MUST treat receipt of a RETIRE_CONNECTION_ID frame as a connection error of -type PROTOCOL_VIOLATION. +A PADDING frame has no content. That is, a PADDING frame consists of the single +octet that identifies the frame as a PADDING frame. 
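
The variable-length integer encoding described above can be sketched in a few lines. The function names `encode_varint` and `decode_varint` are hypothetical; the sketch always picks the shortest encoding when writing, while the decoder accepts any length, which is why `40 25` and `25` both yield 37.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using the shortest 1/2/4/8-octet form."""
    if value < 0:
        raise ValueError("value must be non-negative")
    for prefix, length in enumerate((1, 2, 4, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            # The two most significant bits carry log2 of the length.
            return (value | (prefix << usable_bits)).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")


def decode_varint(data: bytes):
    """Return (value, octets_consumed) for the integer at the start of data."""
    if not data:
        raise ValueError("empty input")
    length = 1 << (data[0] >> 6)
    if len(data) < length:
        raise ValueError("truncated integer")
    usable_bits = 8 * length - 2
    value = int.from_bytes(data[:length], "big") & ((1 << usable_bits) - 1)
    return value, length
```

Running the decoder over the example octet sequences from the text reproduces the listed decimal values.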
-## STOP_SENDING Frame {#frame-stop-sending} +## RST_STREAM Frame {#frame-rst-stream} -An endpoint may use a STOP_SENDING frame (type=0x0c) to communicate that -incoming data is being discarded on receipt at application request. This -signals a peer to abruptly terminate transmission on a stream. +An endpoint may use a RST_STREAM frame (type=0x01) to abruptly terminate a +stream. -Receipt of a STOP_SENDING frame is only valid for a send stream that exists and -is not in the "Ready" state (see {{stream-send-states}}). Receiving a -STOP_SENDING frame for a send stream that is "Ready" or non-existent MUST be -treated as a connection error of type PROTOCOL_VIOLATION. An endpoint that -receives a STOP_SENDING frame for a receive-only stream MUST terminate the -connection with error PROTOCOL_VIOLATION. +After sending a RST_STREAM, an endpoint ceases transmission and retransmission +of STREAM frames on the identified stream. A receiver of RST_STREAM can discard +any data that it already received on that stream. -The STOP_SENDING frame is as follows: +An endpoint that receives a RST_STREAM frame for a send-only stream MUST +terminate the connection with error PROTOCOL_VIOLATION. + +The RST_STREAM frame is as follows: ~~~ 0 1 2 3 @@ -3383,1364 +3577,1179 @@ The STOP_SENDING frame is as follows: | Stream ID (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Application Error Code (16) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Final Offset (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ The fields are: Stream ID: -: A variable-length integer carrying the Stream ID of the stream being ignored. - -Application Error Code: +: A variable-length integer encoding of the Stream ID of the stream being + terminated. -: A 16-bit, application-specified reason the sender is ignoring the stream (see - {{app-error-codes}}). 
+Application Protocol Error Code: +: A 16-bit application protocol error code (see {{app-error-codes}}) which + indicates why the stream is being closed. -## ACK Frame {#frame-ack} +Final Offset: -Receivers send ACK frames (types 0x1a and 0x1b) to inform senders of packets -they have received and processed. The ACK frame contains one or more ACK Blocks. -ACK Blocks are ranges of acknowledged packets. If the frame type is 0x1b, ACK -frames also contain the sum of ECN marks received on the connection up until -this point. +: A variable-length integer indicating the absolute byte offset of the end of + data written on this stream by the RST_STREAM sender. -QUIC acknowledgements are irrevocable. Once acknowledged, a packet remains -acknowledged, even if it does not appear in a future ACK frame. This is unlike -TCP SACKs ({{?RFC2018}}). -It is expected that a sender will reuse the same packet number across different -packet number spaces. ACK frames only acknowledge the packet numbers that were -transmitted by the sender in the same packet number space of the packet that the -ACK was received in. +## CONNECTION_CLOSE frame {#frame-connection-close} -Version Negotiation and Retry packets cannot be acknowledged because they do not -contain a packet number. Rather than relying on ACK frames, these packets are -implicitly acknowledged by the next Initial packet sent by the client. +An endpoint sends a CONNECTION_CLOSE frame (type=0x02) to notify its peer that +the connection is being closed. CONNECTION_CLOSE is used to signal errors at +the QUIC layer, or the absence of errors (with the NO_ERROR code). -An ACK frame is shown below. +If there are open streams that haven't been explicitly closed, they are +implicitly closed when the connection is closed. + +The CONNECTION_CLOSE frame is as follows: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Largest Acknowledged (i) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ACK Delay (i) ... +| Error Code (16) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ACK Block Count (i) ... +| Frame Type (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ACK Blocks (*) ... +| Reason Phrase Length (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| [ECN Section] ... +| Reason Phrase (*) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #ack-format title="ACK Frame Format"} - -The fields in the ACK frame are as follows: -Largest Acknowledged: +The fields of a CONNECTION_CLOSE frame are as follows: -: A variable-length integer representing the largest packet number the peer is - acknowledging; this is usually the largest packet number that the peer has - received prior to generating the ACK frame. Unlike the packet number in the - QUIC long or short header, the value in an ACK frame is not truncated. +Error Code: -ACK Delay: +: A 16-bit error code which indicates the reason for closing this connection. + CONNECTION_CLOSE uses codes from the space defined in {{error-codes}}. -: A variable-length integer including the time in microseconds that the largest - acknowledged packet, as indicated in the Largest Acknowledged field, was - received by this peer to when this ACK was sent. The value of the ACK Delay - field is scaled by multiplying the encoded value by 2 to the power of the - value of the `ack_delay_exponent` transport parameter set by the sender of the - ACK frame. The `ack_delay_exponent` defaults to 3, or a multiplier of 8 (see - {{transport-parameter-definitions}}). Scaling in this fashion allows for a - larger range of values with a shorter encoding at the cost of lower - resolution. +Frame Type: -ACK Block Count: +: A variable-length integer encoding the type of frame that triggered the error. 
+ A value of 0 (equivalent to the mention of the PADDING frame) is used when the + frame type is unknown. -: A variable-length integer specifying the number of Additional ACK Block (and - Gap) fields after the First ACK Block. +Reason Phrase Length: -ACK Blocks: +: A variable-length integer specifying the length of the reason phrase in bytes. + Note that a CONNECTION_CLOSE frame cannot be split between packets, so in + practice any limits on packet size will also limit the space available for a + reason phrase. -: Contains one or more blocks of packet numbers which have been successfully - received, see {{ack-block-section}}. +Reason Phrase: +: A human-readable explanation for why the connection was closed. This can be + zero length if the sender chooses to not give details beyond the Error Code. + This SHOULD be a UTF-8 encoded string {{!RFC3629}}. -### ACK Block Section {#ack-block-section} -The ACK Block Section consists of alternating Gap and ACK Block fields in -descending packet number order. A First Ack Block field is followed by a -variable number of alternating Gap and Additional ACK Blocks. The number of -Gap and Additional ACK Block fields is determined by the ACK Block Count field. +## APPLICATION_CLOSE frame {#frame-application-close} -Gap and ACK Block fields use a relative integer encoding for efficiency. Though -each encoded value is positive, the values are subtracted, so that each ACK -Block describes progressively lower-numbered packets. As long as contiguous -ranges of packets are small, the variable-length integer encoding ensures that -each range can be expressed in a small number of octets. +An APPLICATION_CLOSE frame (type=0x03) is used to signal an error with the +protocol that uses QUIC. -The ACK frame uses the least significant bit(bit (that is, type 0x1b) to -indicate ECN feedback and report receipt of packets with ECN codepoints of -ECT(0), ECT(1), or CE in the packet's IP header. 
+The APPLICATION_CLOSE frame is as follows: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| First ACK Block (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Gap (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Additional ACK Block (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Gap (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Additional ACK Block (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - ... +| Error Code (16) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Gap (i) ... +| Reason Phrase Length (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Additional ACK Block (i) ... +| Reason Phrase (*) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #ack-block-format title="ACK Block Section"} - -Each ACK Block acknowledges a contiguous range of packets by indicating the -number of acknowledged packets that precede the largest packet number in that -block. A value of zero indicates that only the largest packet number is -acknowledged. Larger ACK Block values indicate a larger range, with -corresponding lower values for the smallest packet number in the range. Thus, -given a largest packet number for the ACK, the smallest value is determined by -the formula: - -~~~ - smallest = largest - ack_block -~~~ - -The range of packets that are acknowledged by the ACK Block include the range -from the smallest packet number to the largest, inclusive. - -The largest value for the First ACK Block is determined by the Largest -Acknowledged field; the largest for Additional ACK Blocks is determined by -cumulatively subtracting the size of all preceding ACK Blocks and Gaps. - -Each Gap indicates a range of packets that are not being acknowledged. 
The -number of packets in the gap is one higher than the encoded value of the Gap -Field. -The value of the Gap field establishes the largest packet number value for the -ACK Block that follows the gap using the following formula: +The fields of an APPLICATION_CLOSE frame are as follows: -~~~ - largest = previous_smallest - gap - 2 -~~~ +Error Code: -If the calculated value for largest or smallest packet number for any ACK Block -is negative, an endpoint MUST generate a connection error of type -FRAME_ENCODING_ERROR indicating an error in an ACK frame. +: A 16-bit error code which indicates the reason for closing this connection. + APPLICATION_CLOSE uses codes from the application protocol error code space, + see {{app-error-codes}}. -The fields in the ACK Block Section are: +Reason Phrase Length: -First ACK Block: +: This field is identical in format and semantics to the Reason Phrase Length + field from CONNECTION_CLOSE. -: A variable-length integer indicating the number of contiguous packets - preceding the Largest Acknowledged that are being acknowledged. +Reason Phrase: -Gap (repeated): +: This field is identical in format and semantics to the Reason Phrase field + from CONNECTION_CLOSE. -: A variable-length integer indicating the number of contiguous unacknowledged - packets preceding the packet number one lower than the smallest in the - preceding ACK Block. +APPLICATION_CLOSE has similar format and semantics to the CONNECTION_CLOSE frame +({{frame-connection-close}}). Aside from the semantics of the Error Code field +and the omission of the Frame Type field, both frames are used to close the +connection. -Additional ACK Block (repeated): -: A variable-length integer indicating the number of contiguous acknowledged - packets preceding the largest packet number, as determined by the - preceding Gap. 
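As an illustration of the ACK Block arithmetic described above, the following is a minimal sketch in Python, not taken from the draft. The function name `decode_ack_ranges` and its argument layout are hypothetical; it assumes the variable-length integers have already been decoded, and it signals the FRAME_ENCODING_ERROR condition with a plain `ValueError`.

```python
def decode_ack_ranges(largest_acknowledged, first_ack_block, gaps_and_blocks):
    """Expand an ACK Block Section into inclusive (smallest, largest) ranges.

    Hypothetical helper: `gaps_and_blocks` is a list of (gap, ack_block)
    pairs, in the order they appear after the First ACK Block.
    Ranges are returned in descending packet number order.
    """
    ranges = []

    # First ACK Block: smallest = largest - ack_block
    largest = largest_acknowledged
    smallest = largest - first_ack_block
    if smallest < 0:
        raise ValueError("FRAME_ENCODING_ERROR: negative packet number")
    ranges.append((smallest, largest))

    for gap, ack_block in gaps_and_blocks:
        # largest = previous_smallest - gap - 2
        largest = smallest - gap - 2
        if largest < 0:
            raise ValueError("FRAME_ENCODING_ERROR: negative packet number")
        # smallest = largest - ack_block
        smallest = largest - ack_block
        if smallest < 0:
            raise ValueError("FRAME_ENCODING_ERROR: negative packet number")
        ranges.append((smallest, largest))

    return ranges
```

For example, with Largest Acknowledged 100, a First ACK Block of 2, and one (Gap, Additional ACK Block) pair of (1, 3), the sketch yields the ranges 98..100 and 92..95, leaving packets 96 and 97 (Gap + 1 packets) unacknowledged.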
+## MAX_DATA Frame {#frame-max-data} -### ECN section +The MAX_DATA frame (type=0x04) is used in flow control to inform the peer of +the maximum amount of data that can be sent on the connection as a whole. -The ECN section should only be parsed when the ACK frame type byte is 0x1b. -The ECN section consists of 3 ECN counters as shown below. +The frame is as follows: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ECT(0) Count (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ECT(1) Count (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ECN-CE Count (i) ... +| Maximum Data (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -ECT(0) Count: -: A variable-length integer representing the total number packets received with - the ECT(0) codepoint. +The fields in the MAX_DATA frame are as follows: -ECT(1) Count: -: A variable-length integer representing the total number packets received with - the ECT(1) codepoint. +Maximum Data: -CE Count: -: A variable-length integer representing the total number packets received with - the CE codepoint. +: A variable-length integer indicating the maximum amount of data that can be + sent on the entire connection, in units of octets. -### Sending ACK Frames +All data sent in STREAM frames counts toward this limit. The sum of the largest +received offsets on all streams - including streams in terminal states - MUST +NOT exceed the value advertised by a receiver. An endpoint MUST terminate a +connection with a FLOW_CONTROL_ERROR error if it receives more data than the +maximum data value that it has sent, unless this is a result of a change in +the initial limits (see {{zerortt-parameters}}). -Implementations MUST NOT generate packets that only contain ACK frames in -response to packets which only contain ACK and PADDING frames. 
However, they -MUST acknowledge packets containing only ACK and PADDING frames when sending -ACK frames in response to other packets. Implementations MUST NOT send more -than one packet containing only an ACK frame per received packet that contains -frames other than ACK and PADDING frames. Packets containing frames besides -ACK and PADDING MUST be acknowledged immediately or when a delayed ack timer -expires. -The receiver's delayed acknowledgment timer SHOULD NOT exceed the current RTT -estimate or the value it indicates in the `max_ack_delay` transport parameter. -This ensures an acknowledgment is sent at least once per RTT when packets -needing acknowledgement are received. The sender can use the receiver's -`max_ack_delay` value in determining timeouts for timer-based retransmission. +## MAX_STREAM_DATA Frame {#frame-max-stream-data} -An acknowledgment SHOULD be sent immediately after receiving 2 packets that -require acknowledgement, unless multiple packets are received together. +The MAX_STREAM_DATA frame (type=0x05) is used in flow control to inform a peer +of the maximum amount of data that can be sent on a stream. -To limit ACK Blocks to those that have not yet been received by the sender, the -receiver SHOULD track which ACK frames have been acknowledged by its peer. Once -an ACK frame has been acknowledged, the packets it acknowledges SHOULD NOT be -acknowledged again. +An endpoint that receives a MAX_STREAM_DATA frame for a receive-only stream +MUST terminate the connection with error PROTOCOL_VIOLATION. -Because ACK frames are not sent in response to ACK-only packets, a receiver that -is only sending ACK frames will only receive acknowledgements for its packets -if the sender includes them in packets with non-ACK frames. A sender SHOULD -bundle ACK frames with other frames when possible. +An endpoint that receives a MAX_STREAM_DATA frame for a send-only stream +it has not opened MUST terminate the connection with error PROTOCOL_VIOLATION. 
-Endpoints can only acknowledge packets sent in a particular packet number -space by sending ACK frames in packets from the same packet number space. +Note that an endpoint may legally receive a MAX_STREAM_DATA frame on a +bidirectional stream it has not opened. -To limit receiver state or the size of ACK frames, a receiver MAY limit the -number of ACK Blocks it sends. A receiver can do this even without receiving -acknowledgment of its ACK frames, with the knowledge this could cause the sender -to unnecessarily retransmit some data. Standard QUIC {{QUIC-RECOVERY}} -algorithms declare packets lost after sufficiently newer packets are -acknowledged. Therefore, the receiver SHOULD repeatedly acknowledge newly -received packets in preference to packets received in the past. +The frame is as follows: -### ACK Frames and Packet Protection +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Stream ID (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Maximum Stream Data (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ -ACK frames MUST only be carried in a packet that has the same packet -number space as the packet being ACKed (see {{packet-protected}}). For -instance, packets that are protected with 1-RTT keys MUST be -acknowledged in packets that are also protected with 1-RTT keys. +The fields in the MAX_STREAM_DATA frame are as follows: -Packets that a client sends with 0-RTT packet protection MUST be acknowledged by -the server in packets protected by 1-RTT keys. This can mean that the client is -unable to use these acknowledgments if the server cryptographic handshake -messages are delayed or lost. Note that the same limitation applies to other -data sent by the server protected by the 1-RTT keys. +Stream ID: + +: The stream ID of the stream that is affected encoded as a variable-length + integer. 
+ +Maximum Stream Data: + +: A variable-length integer indicating the maximum amount of data that can be + sent on the identified stream, in units of octets. + +When counting data toward this limit, an endpoint accounts for the largest +received offset of data that is sent or received on the stream. Loss or +reordering can mean that the largest received offset on a stream can be greater +than the total size of data received on that stream. Receiving STREAM frames +might not increase the largest received offset. -Endpoints SHOULD send acknowledgments for packets containing CRYPTO frames with -a reduced delay; see Section 4.3.1 of {{QUIC-RECOVERY}}. +The data sent on a stream MUST NOT exceed the largest maximum stream data value +advertised by the receiver. An endpoint MUST terminate a connection with a +FLOW_CONTROL_ERROR error if it receives more data than the largest maximum +stream data that it has sent for the affected stream, unless this is a result of +a change in the initial limits (see {{zerortt-parameters}}). -## PATH_CHALLENGE Frame {#frame-path-challenge} -Endpoints can use PATH_CHALLENGE frames (type=0x0e) to check reachability to the -peer and for path validation during connection migration. +## MAX_STREAM_ID Frame {#frame-max-stream-id} -PATH_CHALLENGE frames contain an 8-byte payload. +The MAX_STREAM_ID frame (type=0x06) informs the peer of the maximum stream ID +that they are permitted to open. + +The frame is as follows: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| | -+ Data (8) + -| | +| Maximum Stream ID (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -Data: +The fields in the MAX_STREAM_ID frame are as follows: -: This 8-byte field contains arbitrary data. +Maximum Stream ID: +: ID of the maximum unidirectional or bidirectional peer-initiated stream ID for + the connection encoded as a variable-length integer. 
The limit applies to + unidirectional streams if the second least significant bit of the stream ID + is 1, and applies to bidirectional streams if it is 0. -A PATH_CHALLENGE frame containing 8 octets that are hard to guess is sufficient -to ensure that it is easier to receive the packet than it is to guess the value -correctly. +Loss or reordering can mean that a MAX_STREAM_ID frame can be received which +states a lower stream limit than the client has previously received. +MAX_STREAM_ID frames which do not increase the maximum stream ID MUST be +ignored. -The recipient of this frame MUST generate a PATH_RESPONSE frame -({{frame-path-response}}) containing the same Data. +A peer MUST NOT initiate a stream with a higher stream ID than the greatest +maximum stream ID it has received. An endpoint MUST terminate a connection with +a STREAM_ID_ERROR error if a peer initiates a stream with a higher stream ID +than it has sent, unless this is a result of a change in the initial limits (see +{{zerortt-parameters}}). -## PATH_RESPONSE Frame {#frame-path-response} +## PING Frame {#frame-ping} -The PATH_RESPONSE frame (type=0x0f) is sent in response to a PATH_CHALLENGE -frame. Its format is identical to the PATH_CHALLENGE frame -({{frame-path-challenge}}). +Endpoints can use PING frames (type=0x07) to verify that their peers are still +alive or to check reachability to the peer. The PING frame contains no +additional fields. -If the content of a PATH_RESPONSE frame does not match the content of a -PATH_CHALLENGE frame previously sent by the endpoint, the endpoint MAY generate -a connection error of type PROTOCOL_VIOLATION. +The receiver of a PING frame simply needs to acknowledge the packet containing +this frame. + +The PING frame can be used to keep a connection alive when an application or +application protocol wishes to prevent the connection from timing out. 
An +application protocol SHOULD provide guidance about the conditions under which +generating a PING is recommended. This guidance SHOULD indicate whether it is +the client or the server that is expected to send the PING. Having both +endpoints send PING frames without coordination can produce an excessive number +of packets and poor performance. +A connection will time out if no packets are sent or received for a period +longer than the time specified in the idle_timeout transport parameter (see +{{termination}}). However, state in middleboxes might time out earlier than +that. Though REQ-5 in {{?RFC4787}} recommends a 2 minute timeout interval, +experience shows that sending packets every 15 to 30 seconds is necessary to +prevent the majority of middleboxes from losing state for UDP flows. -## NEW_TOKEN frame {#frame-new-token} -A server sends a NEW_TOKEN frame (type=0x19) to provide the client a token to -send in the header of an Initial packet for a future connection. +## BLOCKED Frame {#frame-blocked} -The NEW_TOKEN frame is as follows: +A sender SHOULD send a BLOCKED frame (type=0x08) when it wishes to send data, +but is unable to due to connection-level flow control (see {{blocking}}). +BLOCKED frames can be used as input to tuning of flow control algorithms (see +{{fc-credit}}). + +The BLOCKED frame is as follows: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Token Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Token (*) ... +| Offset (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -The fields of a NEW_TOKEN frame are as follows: - -Token Length: - -: A variable-length integer specifying the length of the token in bytes. - -Token: - -: An opaque blob that the client may use with a future Initial packet. - +The BLOCKED frame contains a single field. 
-## STREAM Frames {#frame-stream} +Offset: -STREAM frames implicitly create a stream and carry stream data. The STREAM -frame takes the form 0b00010XXX (or the set of values from 0x10 to 0x17). The -value of the three low-order bits of the frame type determine the fields that -are present in the frame. +: A variable-length integer indicating the connection-level offset at which + the blocking occurred. -* The OFF bit (0x04) in the frame type is set to indicate that there is an - Offset field present. When set to 1, the Offset field is present; when set to - 0, the Offset field is absent and the Stream Data starts at an offset of 0 - (that is, the frame contains the first octets of the stream, or the end of a - stream that includes no data). -* The LEN bit (0x02) in the frame type is set to indicate that there is a Length - field present. If this bit is set to 0, the Length field is absent and the - Stream Data field extends to the end of the packet. If this bit is set to 1, - the Length field is present. +## STREAM_BLOCKED Frame {#frame-stream-blocked} -* The FIN bit (0x01) of the frame type is set only on frames that contain the - final offset of the stream. Setting this bit indicates that the frame - marks the end of the stream. +A sender SHOULD send a STREAM_BLOCKED frame (type=0x09) when it wishes to send +data, but is unable to due to stream-level flow control. This frame is +analogous to BLOCKED ({{frame-blocked}}). -An endpoint that receives a STREAM frame for a send-only stream MUST terminate -the connection with error PROTOCOL_VIOLATION. +An endpoint that receives a STREAM_BLOCKED frame for a send-only stream MUST +terminate the connection with error PROTOCOL_VIOLATION. -A STREAM frame is shown below. +The STREAM_BLOCKED frame is as follows: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Stream ID (i) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| [Offset (i)] ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| [Length (i)] ... +| Stream ID (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Stream Data (*) ... +| Offset (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #stream-format title="STREAM Frame Format"} -The STREAM frame contains the following fields: +The STREAM_BLOCKED frame contains two fields: Stream ID: -: A variable-length integer indicating the stream ID of the stream (see - {{stream-id}}). +: A variable-length integer indicating the stream which is flow control blocked. Offset: -: A variable-length integer specifying the byte offset in the stream for the - data in this STREAM frame. This field is present when the OFF bit is set to - 1. When the Offset field is absent, the offset is 0. - -Length: +: A variable-length integer indicating the offset of the stream at which the + blocking occurred. -: A variable-length integer specifying the length of the Stream Data field in - this STREAM frame. This field is present when the LEN bit is set to 1. When - the LEN bit is set to 0, the Stream Data field consumes all the remaining - octets in the packet. -Stream Data: +## STREAM_ID_BLOCKED Frame {#frame-stream-id-blocked} -: The bytes from the designated stream to be delivered. +A sender MAY send a STREAM_ID_BLOCKED frame (type=0x0a) when it wishes to open a +stream, but is unable to due to the maximum stream ID limit set by its peer (see +{{frame-max-stream-id}}). This does not open the stream, but informs the peer +that a new stream was needed, but the stream limit prevented the creation of the +stream. -When a Stream Data field has a length of 0, the offset in the STREAM frame is -the offset of the next byte that would be sent. +The STREAM_ID_BLOCKED frame is as follows: -The first byte in the stream has an offset of 0. 
The largest offset delivered -on a stream - the sum of the re-constructed offset and data length - MUST be -less than 2^62. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Stream ID (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ -Stream multiplexing is achieved by interleaving STREAM frames from multiple -streams into one or more QUIC packets. A single QUIC packet can include -multiple STREAM frames from one or more streams. +The STREAM_ID_BLOCKED frame contains a single field. -Implementation note: One of the benefits of QUIC is avoidance of head-of-line -blocking across multiple streams. When a packet loss occurs, only streams with -data in that packet are blocked waiting for a retransmission to be received, -while other streams can continue making progress. Note that when data from -multiple streams is bundled into a single QUIC packet, loss of that packet -blocks all those streams from making progress. An implementation is therefore -advised to bundle as few streams as necessary in outgoing packets without losing -transmission efficiency to underfilled packets. +Stream ID: +: A variable-length integer indicating the highest stream ID that the sender + was permitted to open. -## CRYPTO Frame {#frame-crypto} +## NEW_CONNECTION_ID Frame {#frame-new-connection-id} -The CRYPTO frame (type=0x18) is used to transmit cryptographic handshake -messages. It can be sent in all packet types. The CRYPTO frame offers the -cryptographic protocol an in-order stream of bytes. CRYPTO frames are -functionally identical to STREAM frames, except that they do not bear a stream -identifier; they are not flow controlled; and they do not carry markers for -optional offset, optional length, and the end of the stream. 
+An endpoint sends a NEW_CONNECTION_ID frame (type=0x0b) to provide its peer with +alternative connection IDs that can be used to break linkability when migrating +connections (see {{migration-linkability}}). -A CRYPTO frame is shown below. +The NEW_CONNECTION_ID frame is as follows: ~~~ 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Offset (i) ... +| Length (8) | Sequence Number (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Length (i) ... +| Connection ID (32..144) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Crypto Data (*) ... +| | ++ + +| | ++ Stateless Reset Token (128) + +| | ++ + +| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #crypto-format title="CRYPTO Frame Format"} - -The CRYPTO frame contains the following fields: - -Offset: -: A variable-length integer specifying the byte offset in the stream for the - data in this CRYPTO frame. +The fields are: Length: -: A variable-length integer specifying the length of the Crypto Data field in - this CRYPTO frame. - -Crypto Data: - -: The cryptographic message data. - -There is a separate flow of cryptographic handshake data in each encryption -level, each of which starts at an offset of 0. This implies that each encryption -level is treated as a separate CRYPTO stream of data. - -Unlike STREAM frames, which include a Stream ID indicating to which stream the -data belongs, the CRYPTO frame carries data for a single stream per encryption -level. The stream does not have an explicit end, so CRYPTO frames do not have a -FIN bit. - +: An 8-bit unsigned integer containing the length of the connection ID. Values + less than 4 and greater than 18 are invalid and MUST be treated as a + connection error of type PROTOCOL_VIOLATION. 
-# Packetization and Reliability {#packetization} +Sequence Number: -A sender bundles one or more frames in a QUIC packet (see {{frames}}). +: The sequence number assigned to the connection ID by the sender. See + {{issuing-connection-ids}}. -A sender SHOULD minimize per-packet bandwidth and computational costs by -bundling as many frames as possible within a QUIC packet. A sender MAY wait for -a short period of time to bundle multiple frames before sending a packet that is -not maximally packed, to avoid sending out large numbers of small packets. An -implementation may use knowledge about application sending behavior or -heuristics to determine whether and for how long to wait. This waiting period -is an implementation decision, and an implementation should be careful to delay -conservatively, since any delay is likely to increase application-visible -latency. +Connection ID: +: A connection ID of the specified length. -## Packet Processing and Acknowledgment {#processing-and-ack} +Stateless Reset Token: -A packet MUST NOT be acknowledged until packet protection has been successfully -removed and all frames contained in the packet have been processed. For STREAM -frames, this means the data has been enqueued in preparation to be received by -the application protocol, but it does not require that data is delivered and -consumed. +: A 128-bit value that will be used for a stateless reset when the associated + connection ID is used (see {{stateless-reset}}). -Once the packet has been fully processed, a receiver acknowledges receipt by -sending one or more ACK frames containing the packet number of the received -packet. To avoid creating an indefinite feedback loop, an endpoint MUST NOT -send an ACK frame in response to a packet containing only ACK or PADDING frames, -even if there are packet gaps which precede the received packet. The endpoint -MUST acknowledge packets containing only ACK or PADDING frames in the next ACK -frame that it sends. 
+An endpoint MUST NOT send this frame if it currently requires that its peer send +packets with a zero-length Destination Connection ID. Changing the length of a +connection ID to or from zero-length makes it difficult to identify when the +value of the connection ID changed. An endpoint that is sending packets with a +zero-length Destination Connection ID MUST treat receipt of a NEW_CONNECTION_ID +frame as a connection error of type PROTOCOL_VIOLATION. -While PADDING frames do not elicit an ACK frame from a receiver, they are -considered to be in flight for congestion control purposes -{{QUIC-RECOVERY}}. Sending only PADDING frames might cause the sender to become -limited by the congestion controller (as described in {{QUIC-RECOVERY}}) with no -acknowledgments forthcoming from the receiver. Therefore, a sender should ensure -that other frames are sent in addition to PADDING frames to elicit -acknowledgments from the receiver. +Transmission errors, timeouts and retransmissions might cause the same +NEW_CONNECTION_ID frame to be received multiple times. Receipt of the same +frame multiple times MUST NOT be treated as a connection error. A receiver can +use the sequence number supplied in the NEW_CONNECTION_ID frame to identify new +connection IDs from old ones. -Strategies and implications of the frequency of generating acknowledgments are -discussed in more detail in {{QUIC-RECOVERY}}. +If an endpoint receives a NEW_CONNECTION_ID frame that repeats a previously +issued connection ID with a different Stateless Reset Token or a different +sequence number, the endpoint MAY treat that receipt as a connection error of +type PROTOCOL_VIOLATION. +## RETIRE_CONNECTION_ID Frame {#frame-retire-connection-id} -## Retransmission of Information +An endpoint sends a RETIRE_CONNECTION_ID frame (type=0x1b) to indicate that it +will no longer use a connection ID that was issued by its peer. This may include +the connection ID provided during the handshake. 
Sending a RETIRE_CONNECTION_ID +frame also serves as a request to the peer to send additional connection IDs for +future use (see {{connection-id}}). New connection IDs can be delivered to a +peer using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). -QUIC packets that are determined to be lost are not retransmitted whole. The -same applies to the frames that are contained within lost packets. Instead, the -information that might be carried in frames is sent again in new frames as -needed. +Retiring a connection ID invalidates the stateless reset token associated with +that connection ID. -New frames and packets are used to carry information that is determined to have -been lost. In general, information is sent again when a packet containing that -information is determined to be lost and sending ceases when a packet -containing that information is acknowledged. +The RETIRE_CONNECTION_ID frame is as follows: -* Data sent in CRYPTO frames is retransmitted according to the rules in - {{QUIC-RECOVERY}}, until either all data has been acknowledged or the crypto - state machine implicitly knows that the peer received the data. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Sequence Number (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ -* Application data sent in STREAM frames is retransmitted in new STREAM frames - unless the endpoint has sent a RST_STREAM for that stream. Once an endpoint - sends a RST_STREAM frame, no further STREAM frames are needed. +The fields are: -* The most recent set of acknowledgments are sent in ACK frames. An ACK frame - SHOULD contain all unacknowledged acknowledgments, as described in - {{sending-ack-frames}}. 
+Sequence Number: -* Cancellation of stream transmission, as carried in a RST_STREAM frame, is - sent until acknowledged or until all stream data is acknowledged by the peer - (that is, either the "Reset Recvd" or "Data Recvd" state is reached on the - send stream). The content of a RST_STREAM frame MUST NOT change when it is - sent again. +: The sequence number of the connection ID being retired. See + {{retiring-cids}}. -* Similarly, a request to cancel stream transmission, as encoded in a - STOP_SENDING frame, is sent until the receive stream enters either a "Data - Recvd" or "Reset Recvd" state, see {{solicited-state-transitions}}. +Receipt of a RETIRE_CONNECTION_ID frame containing a sequence number greater +than any previously sent to the peer MAY be treated as a connection error of +type PROTOCOL_VIOLATION. -* Connection close signals, including those that use CONNECTION_CLOSE and - APPLICATION_CLOSE frames, are not sent again when packet loss is detected, but - as described in {{termination}}. +An endpoint cannot send this frame if it was provided with a zero-length +connection ID by its peer. An endpoint that provides a zero-length connection +ID MUST treat receipt of a RETIRE_CONNECTION_ID frame as a connection error of +type PROTOCOL_VIOLATION. -* The current connection maximum data is sent in MAX_DATA frames. An updated - value is sent in a MAX_DATA frame if the packet containing the most recently - sent MAX_DATA frame is declared lost, or when the endpoint decides to update - the limit. Care is necessary to avoid sending this frame too often as the - limit can increase frequently and cause an unnecessarily large number of - MAX_DATA frames to be sent. -* The current maximum stream data offset is sent in MAX_STREAM_DATA frames. - Like MAX_DATA, an updated value is sent when the packet containing - the most recent MAX_STREAM_DATA frame for a stream is lost or when the limit - is updated, with care taken to prevent the frame from being sent too often. 
An - endpoint SHOULD stop sending MAX_STREAM_DATA frames when the receive stream - enters a "Size Known" state. +## STOP_SENDING Frame {#frame-stop-sending} -* The maximum stream ID for a stream of a given type is sent in MAX_STREAM_ID - frames. Like MAX_DATA, an updated value is sent when a packet containing the - most recent MAX_STREAM_ID for a stream type frame is declared lost or when - the limit is updated, with care taken to prevent the frame from being sent - too often. +An endpoint may use a STOP_SENDING frame (type=0x0c) to communicate that +incoming data is being discarded on receipt at application request. This +signals a peer to abruptly terminate transmission on a stream. -* Blocked signals are carried in BLOCKED, STREAM_BLOCKED, and STREAM_ID_BLOCKED - frames. BLOCKED streams have connection scope, STREAM_BLOCKED frames have - stream scope, and STREAM_ID_BLOCKED frames are scoped to a specific stream - type. New frames are sent if packets containing the most recent frame for a - scope is lost, but only while the endpoint is blocked on the corresponding - limit. These frames always include the limit that is causing blocking at the - time that they are transmitted. +Receipt of a STOP_SENDING frame is only valid for a send stream that exists and +is not in the "Ready" state (see {{stream-send-states}}). Receiving a +STOP_SENDING frame for a send stream that is "Ready" or non-existent MUST be +treated as a connection error of type PROTOCOL_VIOLATION. An endpoint that +receives a STOP_SENDING frame for a receive-only stream MUST terminate the +connection with error PROTOCOL_VIOLATION. -* A liveness or path validation check using PATH_CHALLENGE frames is sent - periodically until a matching PATH_RESPONSE frame is received or until there - is no remaining need for liveness or path validation checking. PATH_CHALLENGE - frames include a different payload each time they are sent. 
+The STOP_SENDING frame is as follows: -* Responses to path validation using PATH_RESPONSE frames are sent just once. - A new PATH_CHALLENGE frame will be sent if another PATH_RESPONSE frame is - needed. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Stream ID (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Application Error Code (16) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ -* New connection IDs are sent in NEW_CONNECTION_ID frames and retransmitted if - the packet containing them is lost. Retransmissions of this frame carry the - same sequence number value. Likewise, retired connection IDs are sent in - RETIRE_CONNECTION_ID frames and retransmitted if the packet containing them is - lost. +The fields are: -* PADDING frames contain no information, so lost PADDING frames do not require - repair. +Stream ID: -Upon detecting losses, a sender MUST take appropriate congestion control action. -The details of loss detection and congestion control are described in -{{QUIC-RECOVERY}}. +: A variable-length integer carrying the Stream ID of the stream being ignored. +Application Error Code: -## Packet Size {#packet-size} +: A 16-bit, application-specified reason the sender is ignoring the stream (see + {{app-error-codes}}). -The QUIC packet size includes the QUIC header and integrity check, but not the -UDP or IP header. -Clients MUST ensure that the first Initial packet they send is sent in a UDP -datagram that is at least 1200 octets. Padding the Initial packet or including a -0-RTT packet in the same datagram are ways to meet this requirement. Sending a -UDP datagram of this size ensures that the network path supports a reasonable -Maximum Transmission Unit (MTU), and helps reduce the amplitude of amplification -attacks caused by server responses toward an unverified client address. 
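As an illustration of the STOP_SENDING layout described above, the following Python sketch serializes one frame. It is a sketch under stated assumptions, not text from the patch: the helper names `encode_varint` and `encode_stop_sending` are hypothetical, the frame type is assumed to occupy a single byte (as the `type=0x0c` notation suggests), and the Stream ID is assumed to use the 2-bit-length-prefix variable-length integer encoding referenced elsewhere in the draft as {{integer-encoding}}.

```python
import struct

STOP_SENDING = 0x0C  # frame type given in the text (type=0x0c)


def encode_varint(value: int) -> bytes:
    """Encode a variable-length integer in 1, 2, 4, or 8 bytes.

    Assumes the scheme where the two most significant bits of the
    first byte give the encoded length (00, 01, 10, 11).
    """
    if value < 0 or value >= 1 << 62:
        raise ValueError("value does not fit in 62 bits")
    if value < 1 << 6:
        return struct.pack("!B", value)
    if value < 1 << 14:
        return struct.pack("!H", value | 0x4000)
    if value < 1 << 30:
        return struct.pack("!I", value | 0x80000000)
    return struct.pack("!Q", value | 0xC000000000000000)


def encode_stop_sending(stream_id: int, app_error_code: int) -> bytes:
    """Serialize type byte, Stream ID (i), Application Error Code (16)."""
    if not 0 <= app_error_code <= 0xFFFF:
        raise ValueError("Application Error Code is a 16-bit field")
    return (
        bytes([STOP_SENDING])
        + encode_varint(stream_id)
        + struct.pack("!H", app_error_code)
    )
```

For a small Stream ID the whole frame fits in four bytes: one for the type, one for the Stream ID, and two for the error code.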
+## ACK Frame {#frame-ack} -The datagram containing the first Initial packet from a client MAY exceed 1200 -octets if the client believes that the Path Maximum Transmission Unit (PMTU) -supports the size that it chooses. +Receivers send ACK frames (types 0x1a and 0x1b) to inform senders of packets +they have received and processed. The ACK frame contains one or more ACK Blocks. +ACK Blocks are ranges of acknowledged packets. If the frame type is 0x1b, ACK +frames also contain the sum of ECN marks received on the connection up until +this point. -A server MAY send a CONNECTION_CLOSE frame with error code PROTOCOL_VIOLATION in -response to the first Initial packet it receives from a client if the UDP -datagram is smaller than 1200 octets. It MUST NOT send any other frame type in -response, or otherwise behave as if any part of the offending packet was -processed as valid. +QUIC acknowledgements are irrevocable. Once acknowledged, a packet remains +acknowledged, even if it does not appear in a future ACK frame. This is unlike +TCP SACKs ({{?RFC2018}}). + +It is expected that a sender will reuse the same packet number across different +packet number spaces. ACK frames only acknowledge the packet numbers that were +transmitted by the sender in the same packet number space of the packet that the +ACK was received in. +Version Negotiation and Retry packets cannot be acknowledged because they do not +contain a packet number. Rather than relying on ACK frames, these packets are +implicitly acknowledged by the next Initial packet sent by the client. -## Path Maximum Transmission Unit +An ACK frame is shown below. -The Path Maximum Transmission Unit (PMTU) is the maximum size of the entire IP -header, UDP header, and UDP payload. The UDP payload includes the QUIC packet -header, protected payload, and any authentication fields. 
+~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Largest Acknowledged (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ACK Delay (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ACK Block Count (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ACK Blocks (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| [ECN Section] ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #ack-format title="ACK Frame Format"} -All QUIC packets SHOULD be sized to fit within the estimated PMTU to avoid IP -fragmentation or packet drops. To optimize bandwidth efficiency, endpoints -SHOULD use Packetization Layer PMTU Discovery ({{!PLPMTUD=RFC4821}}). Endpoints -MAY use PMTU Discovery ({{!PMTUDv4=RFC1191}}, {{!PMTUDv6=RFC8201}}) for -detecting the PMTU, setting the PMTU appropriately, and storing the result of -previous PMTU determinations. +The fields in the ACK frame are as follows: -In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP packets -larger than 1280 octets. Assuming the minimum IP header size, this results in -a QUIC packet size of 1232 octets for IPv6 and 1252 octets for IPv4. Some -QUIC implementations MAY be more conservative in computing allowed QUIC packet -size given unknown tunneling overheads or IP header options. +Largest Acknowledged: -QUIC endpoints that implement any kind of PMTU discovery SHOULD maintain an -estimate for each combination of local and remote IP addresses. Each pairing of -local and remote addresses could have a different maximum MTU in the path. +: A variable-length integer representing the largest packet number the peer is + acknowledging; this is usually the largest packet number that the peer has + received prior to generating the ACK frame. 
Unlike the packet number in the + QUIC long or short header, the value in an ACK frame is not truncated. -QUIC depends on the network path supporting an MTU of at least 1280 octets. This -is the IPv6 minimum MTU and therefore also supported by most modern IPv4 -networks. An endpoint MUST NOT reduce its MTU below this number, even if it -receives signals that indicate a smaller limit might exist. +ACK Delay: + +: A variable-length integer encoding the time in microseconds from when the largest + acknowledged packet, as indicated in the Largest Acknowledged field, was + received by this peer to when this ACK was sent. The value of the ACK Delay + field is scaled by multiplying the encoded value by 2 to the power of the + value of the `ack_delay_exponent` transport parameter set by the sender of the + ACK frame. The `ack_delay_exponent` defaults to 3, or a multiplier of 8 (see + {{transport-parameter-definitions}}). Scaling in this fashion allows for a + larger range of values with a shorter encoding at the cost of lower + resolution. + +ACK Block Count: -If a QUIC endpoint determines that the PMTU between any pair of local and remote -IP addresses has fallen below 1280 octets, it MUST immediately cease sending -QUIC packets on the affected path. This could result in termination of the -connection if an alternative path cannot be found. +: A variable-length integer specifying the number of Additional ACK Block (and + Gap) fields after the First ACK Block. +ACK Blocks: -### IPv4 PMTU Discovery {#v4-pmtud} +: Contains one or more blocks of packet numbers which have been successfully + received, see {{ack-block-section}}. -Traditional ICMP-based path MTU discovery in IPv4 {{!PMTUDv4}} is potentially -vulnerable to off-path attacks that successfully guess the IP/port 4-tuple and -reduce the MTU to a bandwidth-inefficient value.
TCP connections mitigate this -risk by using the (at minimum) 8 bytes of transport header echoed in the ICMP -message to validate the TCP sequence number as valid for the current -connection. However, as QUIC operates over UDP, in IPv4 the echoed information -could consist only of the IP and UDP headers, which usually has insufficient -entropy to mitigate off-path attacks. -As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps to -mitigate this risk. For instance, an application could: +### ACK Block Section {#ack-block-section} -* Set the IPv4 Don't Fragment (DF) bit on a small proportion of packets, so that -most invalid ICMP messages arrive when there are no DF packets outstanding, and -can therefore be identified as spurious. +The ACK Block Section consists of alternating Gap and ACK Block fields in +descending packet number order. A First ACK Block field is followed by a +variable number of alternating Gap and Additional ACK Blocks. The number of +Gap and Additional ACK Block fields is determined by the ACK Block Count field. -* Store additional information from the IP or UDP headers from DF packets (for -example, the IP ID or UDP checksum) to further authenticate incoming Datagram -Too Big messages. +Gap and ACK Block fields use a relative integer encoding for efficiency. Though +each encoded value is positive, the values are subtracted, so that each ACK +Block describes progressively lower-numbered packets. As long as contiguous +ranges of packets are small, the variable-length integer encoding ensures that +each range can be expressed in a small number of octets. -* Any reduction in PMTU due to a report contained in an ICMP packet is -provisional until QUIC's loss detection algorithm determines that the packet is -actually lost. +The ACK frame uses the least significant bit (that is, type 0x1b) to +indicate ECN feedback and report receipt of packets with ECN codepoints of +ECT(0), ECT(1), or CE in the packet's IP header.
+~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| First ACK Block (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Gap (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Additional ACK Block (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Gap (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Additional ACK Block (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Gap (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Additional ACK Block (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #ack-block-format title="ACK Block Section"} -### Special Considerations for Packetization Layer PMTU Discovery +Each ACK Block acknowledges a contiguous range of packets by indicating the +number of acknowledged packets that precede the largest packet number in that +block. A value of zero indicates that only the largest packet number is +acknowledged. Larger ACK Block values indicate a larger range, with +corresponding lower values for the smallest packet number in the range. Thus, +given a largest packet number for the ACK, the smallest value is determined by +the formula: +~~~ + smallest = largest - ack_block +~~~ -The PADDING frame provides a useful option for PMTU probe packets. PADDING -frames generate acknowledgements, but they need not be delivered reliably. As a -result, the loss of PADDING frames in probe packets does not require -delay-inducing retransmission. However, PADDING frames do consume congestion -window, which may delay the transmission of subsequent application data. 
+The range of packets acknowledged by the ACK Block includes every packet +from the smallest packet number to the largest, inclusive. -When implementing the algorithm in Section 7.2 of {{!PLPMTUD}}, the initial -value of search_low SHOULD be consistent with the IPv6 minimum packet size. -Paths that do not support this size cannot deliver Initial packets, and -therefore are not QUIC-compliant. +The largest value for the First ACK Block is determined by the Largest +Acknowledged field; the largest for Additional ACK Blocks is determined by +cumulatively subtracting the size of all preceding ACK Blocks and Gaps. -Section 7.3 of {{!PLPMTUD}} discusses trade-offs between small and large -increases in the size of probe packets. As QUIC probe packets need not contain -application data, aggressive increases in probe size carry fewer consequences. +Each Gap indicates a range of packets that are not being acknowledged. The +number of packets in the gap is one higher than the encoded value of the Gap +field. +The value of the Gap field establishes the largest packet number value for the +ACK Block that follows the gap using the following formula: -# Streams: QUIC's Data Structuring Abstraction {#streams} +~~~ + largest = previous_smallest - gap - 2 +~~~ -Streams in QUIC provide a lightweight, ordered byte-stream abstraction. +If the calculated value for largest or smallest packet number for any ACK Block +is negative, an endpoint MUST generate a connection error of type +FRAME_ENCODING_ERROR indicating an error in an ACK frame. -There are two basic types of stream in QUIC. Unidirectional streams carry data -in one direction: from the initiator of the stream to its peer; -bidirectional streams allow for data to be sent in both directions. Different -stream identifiers are used to distinguish between unidirectional and -bidirectional streams, as well as to create a separation between streams that -are initiated by the client and server (see {{stream-id}}).
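The two formulas in the added text (`smallest = largest - ack_block` and `largest = previous_smallest - gap - 2`) can be checked with a short Python sketch that expands already-decoded ACK Block Section fields into inclusive packet-number ranges. The function name `ack_ranges` and its argument layout are hypothetical, and the `ValueError` merely stands in for the FRAME_ENCODING_ERROR connection error.

```python
from typing import List, Sequence, Tuple


def ack_ranges(
    largest_acknowledged: int,
    first_ack_block: int,
    gaps_and_blocks: Sequence[Tuple[int, int]] = (),
) -> List[Tuple[int, int]]:
    """Expand ACK Block fields into (smallest, largest) inclusive ranges.

    gaps_and_blocks holds (Gap, Additional ACK Block) pairs in wire
    order, that is, in descending packet number order.
    """
    largest = largest_acknowledged
    smallest = largest - first_ack_block          # smallest = largest - ack_block
    if smallest < 0:
        raise ValueError("FRAME_ENCODING_ERROR: negative packet number")
    ranges = [(smallest, largest)]

    for gap, ack_block in gaps_and_blocks:
        largest = smallest - gap - 2              # largest = previous_smallest - gap - 2
        smallest = largest - ack_block
        if largest < 0 or smallest < 0:
            raise ValueError("FRAME_ENCODING_ERROR: negative packet number")
        ranges.append((smallest, largest))
    return ranges
```

With Largest Acknowledged 100, First ACK Block 2, and pairs (1, 3) and (0, 0), the ranges are 98..100, 92..95, and 90..90. The gaps between them contain two packets (96, 97) and one packet (91), each one more than the encoded Gap value.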
+The fields in the ACK Block Section are: -Either type of stream can be created by either endpoint, can concurrently send -data interleaved with other streams, and can be cancelled. +First ACK Block: -Stream offsets allow for the octets on a stream to be placed in order. An -endpoint MUST be capable of delivering data received on a stream in order. -Implementations MAY choose to offer the ability to deliver data out of order. -There is no means of ensuring ordering between octets on different streams. +: A variable-length integer indicating the number of contiguous packets + preceding the Largest Acknowledged that are being acknowledged. -The creation and destruction of streams are expected to have minimal bandwidth -and computational cost. A single STREAM frame may create, carry data for, and -terminate a stream, or a stream may last the entire duration of a connection. +Gap (repeated): -Streams are individually flow controlled, allowing an endpoint to limit memory -commitment and to apply back pressure. The creation of streams is also flow -controlled, with each peer declaring the maximum stream ID it is willing to -accept at a given time. +: A variable-length integer indicating the number of contiguous unacknowledged + packets preceding the packet number one lower than the smallest in the + preceding ACK Block. -An alternative view of QUIC streams is as an elastic "message" abstraction, -similar to the way ephemeral streams are used in SST -{{?SST=DOI.10.1145/1282427.1282421}}, which may be a more appealing description -for some applications. +Additional ACK Block (repeated): +: A variable-length integer indicating the number of contiguous acknowledged + packets preceding the largest packet number, as determined by the + preceding Gap. -## Stream Identifiers {#stream-id} +### ECN section -Streams are identified by an unsigned 62-bit integer, referred to as the Stream -ID. 
The least significant two bits of the Stream ID are used to identify the -type of stream (unidirectional or bidirectional) and the initiator of the -stream. +The ECN section should only be parsed when the ACK frame type byte is 0x1b. +The ECN section consists of 3 ECN counters as shown below. -The least significant bit (0x1) of the Stream ID identifies the initiator of the -stream. Clients initiate even-numbered streams (those with the least -significant bit set to 0); servers initiate odd-numbered streams (with the bit -set to 1). Separation of the stream identifiers ensures that client and server -are able to open streams without the latency imposed by negotiating for an -identifier. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ECT(0) Count (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ECT(1) Count (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ECN-CE Count (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ -If an endpoint receives a frame for a stream that it expects to initiate (i.e., -odd-numbered for the client or even-numbered for the server), but which it has -not yet opened, it MUST close the connection with error code STREAM_STATE_ERROR. +ECT(0) Count: +: A variable-length integer representing the total number of packets received with + the ECT(0) codepoint. -The second least significant bit (0x2) of the Stream ID differentiates between -unidirectional streams and bidirectional streams. Unidirectional streams always -have this bit set to 1 and bidirectional streams have this bit set to 0. +ECT(1) Count: +: A variable-length integer representing the total number of packets received with + the ECT(1) codepoint. -The two type bits from a Stream ID therefore identify streams as summarized in -{{stream-id-types}}.
+ECN-CE Count: +: A variable-length integer representing the total number of packets received with + the CE codepoint. -| Low Bits | Stream Type | -|:---------|:---------------------------------| -| 0x0 | Client-Initiated, Bidirectional | -| 0x1 | Server-Initiated, Bidirectional | -| 0x2 | Client-Initiated, Unidirectional | -| 0x3 | Server-Initiated, Unidirectional | -{: #stream-id-types title="Stream ID Types"} +### Sending ACK Frames -The first bi-directional stream opened by the client is stream 0. +Implementations MUST NOT generate packets that only contain ACK frames in +response to packets which only contain ACK and PADDING frames. However, they +MUST acknowledge packets containing only ACK and PADDING frames when sending +ACK frames in response to other packets. Implementations MUST NOT send more +than one packet containing only an ACK frame per received packet that contains +frames other than ACK and PADDING frames. Packets containing frames besides +ACK and PADDING MUST be acknowledged immediately or when a delayed ack timer +expires. -A QUIC endpoint MUST NOT reuse a Stream ID. Streams of each type are created in -numeric order. Streams that are used out of order result in opening all -lower-numbered streams of the same type in the same direction. +The receiver's delayed acknowledgment timer SHOULD NOT exceed the current RTT +estimate or the value it indicates in the `max_ack_delay` transport parameter. +This ensures an acknowledgment is sent at least once per RTT when packets +needing acknowledgement are received. The sender can use the receiver's +`max_ack_delay` value in determining timeouts for timer-based retransmission. -Stream IDs are encoded as a variable-length integer (see {{integer-encoding}}). +An acknowledgment SHOULD be sent immediately after receiving 2 packets that +require acknowledgement, unless multiple packets are received together.
+To limit ACK Blocks to those that have not yet been received by the sender, the +receiver SHOULD track which ACK frames have been acknowledged by its peer. Once +an ACK frame has been acknowledged, the packets it acknowledges SHOULD NOT be +acknowledged again. -## Stream States {#stream-states} +Because ACK frames are not sent in response to ACK-only packets, a receiver that +is only sending ACK frames will only receive acknowledgements for its packets +if the sender includes them in packets with non-ACK frames. A sender SHOULD +bundle ACK frames with other frames when possible. -This section describes the two types of QUIC stream in terms of the states of -their send or receive components. Two state machines are described: one for -streams on which an endpoint transmits data ({{stream-send-states}}); another -for streams from which an endpoint receives data ({{stream-recv-states}}). +Endpoints can only acknowledge packets sent in a particular packet number +space by sending ACK frames in packets from the same packet number space. -Unidirectional streams use the applicable state machine directly. Bidirectional -streams use both state machines. For the most part, the use of these state -machines is the same whether the stream is unidirectional or bidirectional. The -conditions for opening a stream are slightly more complex for a bidirectional -stream because the opening of either send or receive sides causes the stream -to open in both directions. +To limit receiver state or the size of ACK frames, a receiver MAY limit the +number of ACK Blocks it sends. A receiver can do this even without receiving +acknowledgment of its ACK frames, with the knowledge this could cause the sender +to unnecessarily retransmit some data. Standard QUIC {{QUIC-RECOVERY}} +algorithms declare packets lost after sufficiently newer packets are +acknowledged. Therefore, the receiver SHOULD repeatedly acknowledge newly +received packets in preference to packets received in the past. 
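One way to read the rules in the "Sending ACK Frames" text is as a small scheduler on the receive side. The Python sketch below is an illustration only: `AckScheduler`, `requires_ack`, and the string frame-type labels are hypothetical names, time is passed in as plain seconds, and the "unless multiple packets are received together" exception is deliberately not modelled.

```python
from typing import Iterable, Optional


def requires_ack(frame_types: Iterable[str]) -> bool:
    """True if the packet carries any frame other than ACK or PADDING."""
    return any(t not in ("ACK", "PADDING") for t in frame_types)


class AckScheduler:
    """Decides when a receiver should emit an ACK frame."""

    def __init__(self, max_ack_delay: float, rtt_estimate: float):
        # The delayed-ack timer should exceed neither the RTT estimate
        # nor the advertised max_ack_delay.
        self.delay = min(max_ack_delay, rtt_estimate)
        self.pending = 0
        self.deadline: Optional[float] = None

    def on_packet(self, now: float, frame_types: Iterable[str]) -> bool:
        """Record a received packet; return True to send an ACK now."""
        if not requires_ack(frame_types):
            # ACK/PADDING-only packets never trigger an ACK-only packet;
            # they are simply covered by the next ACK that is sent.
            return False
        self.pending += 1
        if self.deadline is None:
            self.deadline = now + self.delay
        # Acknowledge immediately after 2 packets that require it.
        return self.pending >= 2

    def timer_expired(self, now: float) -> bool:
        """True once the delayed-ack timer has run out."""
        return self.deadline is not None and now >= self.deadline

    def ack_sent(self) -> None:
        """Reset state after an ACK frame has been sent."""
        self.pending = 0
        self.deadline = None
```

A caller would invoke `ack_sent()` whenever it emits an ACK frame, whether because `on_packet` returned true, the timer expired, or the ACK was bundled with other outgoing frames.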
-An endpoint can open streams up to its maximum stream limit in any order, -however endpoints SHOULD open the send side of streams for each type in order. +### ACK Frames and Packet Protection -Note: +ACK frames MUST only be carried in a packet that has the same packet +number space as the packet being ACKed (see {{packet-protected}}). For +instance, packets that are protected with 1-RTT keys MUST be +acknowledged in packets that are also protected with 1-RTT keys. -: These states are largely informative. This document uses stream states to - describe rules for when and how different types of frames can be sent and the - reactions that are expected when different types of frames are received. - Though these state machines are intended to be useful in implementing QUIC, - these states aren't intended to constrain implementations. An implementation - can define a different state machine as long as its behavior is consistent - with an implementation that implements these states. +Packets that a client sends with 0-RTT packet protection MUST be acknowledged by +the server in packets protected by 1-RTT keys. This can mean that the client is +unable to use these acknowledgments if the server cryptographic handshake +messages are delayed or lost. Note that the same limitation applies to other +data sent by the server protected by the 1-RTT keys. +Endpoints SHOULD send acknowledgments for packets containing CRYPTO frames with +a reduced delay; see Section 4.3.1 of {{QUIC-RECOVERY}}. -### Send Stream States {#stream-send-states} +## PATH_CHALLENGE Frame {#frame-path-challenge} -{{fig-stream-send-states}} shows the states for the part of a stream that sends -data to a peer. +Endpoints can use PATH_CHALLENGE frames (type=0x0e) to check reachability to the +peer and for path validation during connection migration. + +PATH_CHALLENGE frames contain an 8-byte payload. 
~~~ - o - | Create Stream (Sending) - | Create Bidirectional Stream (Receiving) - v - +-------+ - | Ready | Send RST_STREAM - | |-----------------------. - +-------+ | - | | - | Send STREAM / | - | STREAM_BLOCKED | - | | - | Create Bidirectional | - | Stream (Receiving) | - v | - +-------+ | - | Send | Send RST_STREAM | - | |---------------------->| - +-------+ | - | | - | Send STREAM + FIN | - v v - +-------+ +-------+ - | Data | Send RST_STREAM | Reset | - | Sent |------------------>| Sent | - +-------+ +-------+ - | | - | Recv All ACKs | Recv ACK - v v - +-------+ +-------+ - | Data | | Reset | - | Recvd | | Recvd | - +-------+ +-------+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| | ++ Data (8) + +| | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #fig-stream-send-states title="States for Send Streams"} -The sending part of stream that the endpoint initiates (types 0 and 2 for -clients, 1 and 3 for servers) is opened by the application or application -protocol. The "Ready" state represents a newly created stream that is able to -accept data from the application. Stream data might be buffered in this state -in preparation for sending. +Data: -Sending the first STREAM or STREAM_BLOCKED frame causes a send stream to enter -the "Send" state. An implementation might choose to defer allocating a Stream -ID to a send stream until it sends the first frame and enters this state, which -can allow for better stream prioritization. +: This 8-byte field contains arbitrary data. -The sending part of a bidirectional stream initiated by a peer (type 0 for a -server, type 1 for a client) enters the "Ready" state then immediately -transitions to the "Send" state if the receiving part enters the "Recv" state. 
+A PATH_CHALLENGE frame containing 8 octets that are hard to guess is sufficient +to ensure that it is easier to receive the packet than it is to guess the value +correctly. -In the "Send" state, an endpoint transmits - and retransmits as necessary - data -in STREAM frames. The endpoint respects the flow control limits of its peer, -accepting MAX_STREAM_DATA frames. An endpoint in the "Send" state generates -STREAM_BLOCKED frames if it encounters flow control limits. +The recipient of this frame MUST generate a PATH_RESPONSE frame +({{frame-path-response}}) containing the same Data. -After the application indicates that stream data is complete and a STREAM frame -containing the FIN bit is sent, the send stream enters the "Data Sent" state. -From this state, the endpoint only retransmits stream data as necessary. The -endpoint no longer needs to track flow control limits or send STREAM_BLOCKED -frames for a send stream in this state. The endpoint can ignore any -MAX_STREAM_DATA frames it receives from its peer in this state; MAX_STREAM_DATA -frames might be received until the peer receives the final stream offset. -Once all stream data has been successfully acknowledged, the send stream enters -the "Data Recvd" state, which is a terminal state. +## PATH_RESPONSE Frame {#frame-path-response} -From any of the "Ready", "Send", or "Data Sent" states, an application can -signal that it wishes to abandon transmission of stream data. Similarly, the -endpoint might receive a STOP_SENDING frame from its peer. In either case, the -endpoint sends a RST_STREAM frame, which causes the stream to enter the "Reset -Sent" state. +The PATH_RESPONSE frame (type=0x0f) is sent in response to a PATH_CHALLENGE +frame. Its format is identical to the PATH_CHALLENGE frame +({{frame-path-challenge}}). -An endpoint MAY send a RST_STREAM as the first frame on a send stream; this -causes the send stream to open and then immediately transition to the "Reset -Sent" state. 
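The PATH_CHALLENGE / PATH_RESPONSE exchange can be sketched in a few lines of Python. This is a hedged illustration, not part of the patch: the function names are hypothetical, frames are modelled as a one-byte type followed by the 8-byte Data field, and the constant-time comparison is a design choice of this sketch rather than a requirement stated in the text.

```python
import hmac
import secrets
from typing import Iterable

PATH_CHALLENGE = 0x0E  # type=0x0e in the text
PATH_RESPONSE = 0x0F   # type=0x0f in the text


def new_path_challenge() -> bytes:
    """Build a PATH_CHALLENGE frame with 8 hard-to-guess octets."""
    return bytes([PATH_CHALLENGE]) + secrets.token_bytes(8)


def path_response_for(challenge_frame: bytes) -> bytes:
    """Echo the Data field of a PATH_CHALLENGE in a PATH_RESPONSE."""
    if len(challenge_frame) != 9 or challenge_frame[0] != PATH_CHALLENGE:
        raise ValueError("not a PATH_CHALLENGE frame")
    return bytes([PATH_RESPONSE]) + challenge_frame[1:]


def response_matches(response_frame: bytes, outstanding: Iterable[bytes]) -> bool:
    """Check a PATH_RESPONSE against challenges this endpoint has sent."""
    if len(response_frame) != 9 or response_frame[0] != PATH_RESPONSE:
        return False
    data = response_frame[1:]
    return any(hmac.compare_digest(data, c[1:]) for c in outstanding)
```

An endpoint would keep each challenge it sends in `outstanding` until a matching response arrives or the validation attempt is abandoned.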
+If the content of a PATH_RESPONSE frame does not match the content of a +PATH_CHALLENGE frame previously sent by the endpoint, the endpoint MAY generate +a connection error of type PROTOCOL_VIOLATION. -Once a packet containing a RST_STREAM has been acknowledged, the send stream -enters the "Reset Recvd" state, which is a terminal state. +## NEW_TOKEN frame {#frame-new-token} -### Receive Stream States {#stream-recv-states} +A server sends a NEW_TOKEN frame (type=0x19) to provide the client a token to +send in the header of an Initial packet for a future connection. -{{fig-stream-recv-states}} shows the states for the part of a stream that -receives data from a peer. The states for a receive stream mirror only some of -the states of the send stream at the peer. A receive stream doesn't track -states on the send stream that cannot be observed, such as the "Ready" state; -instead, receive streams track the delivery of data to the application or -application protocol some of which cannot be observed by the sender. +The NEW_TOKEN frame is as follows: ~~~ - o - | Recv STREAM / STREAM_BLOCKED / RST_STREAM - | Create Bidirectional Stream (Sending) - | Recv MAX_STREAM_DATA - | Create Higher-Numbered Stream - v - +-------+ - | Recv | Recv RST_STREAM - | |-----------------------. - +-------+ | - | | - | Recv STREAM + FIN | - v | - +-------+ | - | Size | Recv RST_STREAM | - | Known |---------------------->| - +-------+ | - | | - | Recv All Data | - v v - +-------+ Recv RST_STREAM +-------+ - | Data |--- (optional) --->| Reset | - | Recvd | Recv All Data | Recvd | - +-------+<-- (optional) ----+-------+ - | | - | App Read All Data | App Read RST - v v - +-------+ +-------+ - | Data | | Reset | - | Read | | Read | - +-------+ +-------+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Token Length (i) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Token (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #fig-stream-recv-states title="States for Receive Streams"} -The receiving part of a stream initiated by a peer (types 1 and 3 for a client, -or 0 and 2 for a server) are created when the first STREAM, STREAM_BLOCKED, -RST_STREAM, or MAX_STREAM_DATA (bidirectional only, see below) is received for -that stream. The initial state for a receive stream is "Recv". Receiving a -RST_STREAM frame causes the receive stream to immediately transition to the -"Reset Recvd". +The fields of a NEW_TOKEN frame are as follows: -The receive stream enters the "Recv" state when the sending part of a -bidirectional stream initiated by the endpoint (type 0 for a client, type 1 for -a server) enters the "Ready" state. +Token Length: -A bidirectional stream also opens when a MAX_STREAM_DATA frame is received. -Receiving a MAX_STREAM_DATA frame implies that the remote peer has opened the -stream and is providing flow control credit. A MAX_STREAM_DATA frame might -arrive before a STREAM or STREAM_BLOCKED frame if packets are lost or reordered. +: A variable-length integer specifying the length of the token in bytes. -Before creating a stream, all lower-numbered streams of the same type MUST be -created. That means that receipt of a frame that would open a stream causes all -lower-numbered streams of the same type to be opened in numeric order. This -ensures that the creation order for streams is consistent on both endpoints. +Token: -In the "Recv" state, the endpoint receives STREAM and STREAM_BLOCKED frames. -Incoming data is buffered and can be reassembled into the correct order for -delivery to the application. As data is consumed by the application and buffer -space becomes available, the endpoint sends MAX_STREAM_DATA frames to allow the -peer to send more data. 
+: An opaque blob that the client may use with a future Initial packet. -When a STREAM frame with a FIN bit is received, the final offset (see -{{final-offset}}) is known. The receive stream enters the "Size Known" state. -In this state, the endpoint no longer needs to send MAX_STREAM_DATA frames, it -only receives any retransmissions of stream data. -Once all data for the stream has been received, the receive stream enters the -"Data Recvd" state. This might happen as a result of receiving the same STREAM -frame that causes the transition to "Size Known". In this state, the endpoint -has all stream data. Any STREAM or STREAM_BLOCKED frames it receives for the -stream can be discarded. +## STREAM Frames {#frame-stream} -The "Data Recvd" state persists until stream data has been delivered to the -application or application protocol. Once stream data has been delivered, the -stream enters the "Data Read" state, which is a terminal state. +STREAM frames implicitly create a stream and carry stream data. The STREAM +frame takes the form 0b00010XXX (or the set of values from 0x10 to 0x17). The +value of the three low-order bits of the frame type determine the fields that +are present in the frame. -Receiving a RST_STREAM frame in the "Recv" or "Size Known" states causes the -stream to enter the "Reset Recvd" state. This might cause the delivery of -stream data to the application to be interrupted. +* The OFF bit (0x04) in the frame type is set to indicate that there is an + Offset field present. When set to 1, the Offset field is present; when set to + 0, the Offset field is absent and the Stream Data starts at an offset of 0 + (that is, the frame contains the first octets of the stream, or the end of a + stream that includes no data). -It is possible that all stream data is received when a RST_STREAM is received -(that is, from the "Data Recvd" state). 
Similarly, it is possible for remaining -stream data to arrive after receiving a RST_STREAM frame (the "Reset Recvd" -state). An implementation is able to manage this situation as they choose. -Sending RST_STREAM means that an endpoint cannot guarantee delivery of stream -data; however there is no requirement that stream data not be delivered if a -RST_STREAM is received. An implementation MAY interrupt delivery of stream -data, discard any data that was not consumed, and signal the existence of the -RST_STREAM immediately. Alternatively, the RST_STREAM signal might be -suppressed or withheld if stream data is completely received. In the latter -case, the receive stream effectively transitions to "Data Recvd" from "Reset -Recvd". +* The LEN bit (0x02) in the frame type is set to indicate that there is a Length + field present. If this bit is set to 0, the Length field is absent and the + Stream Data field extends to the end of the packet. If this bit is set to 1, + the Length field is present. -Once the application has been delivered the signal indicating that the receive -stream was reset, the receive stream transitions to the "Reset Read" state, -which is a terminal state. +* The FIN bit (0x01) of the frame type is set only on frames that contain the + final offset of the stream. Setting this bit indicates that the frame + marks the end of the stream. +An endpoint that receives a STREAM frame for a send-only stream MUST terminate +the connection with error PROTOCOL_VIOLATION. + +A STREAM frame is shown below. + +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Stream ID (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| [Offset (i)] ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| [Length (i)] ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Stream Data (*) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #stream-format title="STREAM Frame Format"} + +The STREAM frame contains the following fields: + +Stream ID: + +: A variable-length integer indicating the stream ID of the stream (see + {{stream-id}}). + +Offset: + +: A variable-length integer specifying the byte offset in the stream for the + data in this STREAM frame. This field is present when the OFF bit is set to + 1. When the Offset field is absent, the offset is 0. -### Permitted Frame Types +Length: -The sender of a stream sends just three frame types that affect the state of a -stream at either sender or receiver: STREAM ({{frame-stream}}), STREAM_BLOCKED -({{frame-stream-blocked}}), and RST_STREAM ({{frame-rst-stream}}). +: A variable-length integer specifying the length of the Stream Data field in + this STREAM frame. This field is present when the LEN bit is set to 1. When + the LEN bit is set to 0, the Stream Data field consumes all the remaining + octets in the packet. -A sender MUST NOT send any of these frames from a terminal state ("Data Recvd" -or "Reset Recvd"). A sender MUST NOT send STREAM or STREAM_BLOCKED after -sending a RST_STREAM; that is, in the "Reset Sent" state in addition to the -terminal states. A receiver could receive any of these frames in any state, but -only due to the possibility of delayed delivery of packets carrying them. +Stream Data: -The receiver of a stream sends MAX_STREAM_DATA ({{frame-max-stream-data}}) and -STOP_SENDING frames ({{frame-stop-sending}}). +: The bytes from the designated stream to be delivered. -The receiver only sends MAX_STREAM_DATA in the "Recv" state. A receiver can -send STOP_SENDING in any state where it has not received a RST_STREAM frame; -that is states other than "Reset Recvd" or "Reset Read". However there is -little value in sending a STOP_SENDING frame after all stream data has been -received in the "Data Recvd" state. 
A sender could receive these frames in any -state as a result of delayed delivery of packets. +When a Stream Data field has a length of 0, the offset in the STREAM frame is +the offset of the next byte that would be sent. +The first byte in the stream has an offset of 0. The largest offset delivered +on a stream - the sum of the re-constructed offset and data length - MUST be +less than 2^62. -### Bidirectional Stream States {#stream-bidi-states} +Stream multiplexing is achieved by interleaving STREAM frames from multiple +streams into one or more QUIC packets. A single QUIC packet can include +multiple STREAM frames from one or more streams. -A bidirectional stream is composed of a send stream and a receive stream. -Implementations may represent states of the bidirectional stream as composites -of send and receive stream states. The simplest model presents the stream as -"open" when either send or receive stream is in a non-terminal state and -"closed" when both send and receive streams are in a terminal state. +Implementation note: One of the benefits of QUIC is avoidance of head-of-line +blocking across multiple streams. When a packet loss occurs, only streams with +data in that packet are blocked waiting for a retransmission to be received, +while other streams can continue making progress. Note that when data from +multiple streams is bundled into a single QUIC packet, loss of that packet +blocks all those streams from making progress. An implementation is therefore +advised to bundle as few streams as necessary in outgoing packets without losing +transmission efficiency to underfilled packets. -{{stream-bidi-mapping}} shows a more complex mapping of bidirectional stream -states that loosely correspond to the stream states in HTTP/2 -{{?HTTP2=RFC7540}}. This shows that multiple states on send or receive streams -are mapped to the same composite state. 
Note that this is just one possibility -for such a mapping; this mapping requires that data is acknowledged before the -transition to a "closed" or "half-closed" state. -| Send Stream | Receive Stream | Composite State | -|:-----------------------|:-----------------------|:---------------------| -| No Stream/Ready | No Stream/Recv *1 | idle | -| Ready/Send/Data Sent | Recv/Size Known | open | -| Ready/Send/Data Sent | Data Recvd/Data Read | half-closed (remote) | -| Ready/Send/Data Sent | Reset Recvd/Reset Read | half-closed (remote) | -| Data Recvd | Recv/Size Known | half-closed (local) | -| Reset Sent/Reset Recvd | Recv/Size Known | half-closed (local) | -| Data Recvd | Recv/Size Known | half-closed (local) | -| Reset Sent/Reset Recvd | Data Recvd/Data Read | closed | -| Reset Sent/Reset Recvd | Reset Recvd/Reset Read | closed | -| Data Recvd | Data Recvd/Data Read | closed | -| Data Recvd | Reset Recvd/Reset Read | closed | -{: #stream-bidi-mapping title="Possible Mapping of Stream States to HTTP/2"} +## CRYPTO Frame {#frame-crypto} -Note (*1): +The CRYPTO frame (type=0x18) is used to transmit cryptographic handshake +messages. It can be sent in all packet types. The CRYPTO frame offers the +cryptographic protocol an in-order stream of bytes. CRYPTO frames are +functionally identical to STREAM frames, except that they do not bear a stream +identifier; they are not flow controlled; and they do not carry markers for +optional offset, optional length, and the end of the stream. -: A stream is considered "idle" if it has not yet been created, or if the - receive stream is in the "Recv" state without yet having received any frames. +A CRYPTO frame is shown below. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Offset (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Length (i) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Crypto Data (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #crypto-format title="CRYPTO Frame Format"} -## Solicited State Transitions +The CRYPTO frame contains the following fields: -If an endpoint is no longer interested in the data it is receiving on a stream, -it MAY send a STOP_SENDING frame identifying that stream to prompt closure of -the stream in the opposite direction. This typically indicates that the -receiving application is no longer reading data it receives from the stream, but -is not a guarantee that incoming data will be ignored. +Offset: -STREAM frames received after sending STOP_SENDING are still counted toward the -connection and stream flow-control windows, even though these frames will be -discarded upon receipt. This avoids potential ambiguity about which STREAM -frames count toward flow control. +: A variable-length integer specifying the byte offset in the stream for the + data in this CRYPTO frame. -A STOP_SENDING frame requests that the receiving endpoint send a RST_STREAM -frame. An endpoint that receives a STOP_SENDING frame MUST send a RST_STREAM -frame for that stream, and can use an error code of STOPPING. If the -STOP_SENDING frame is received on a send stream that is already in the "Data -Sent" state, a RST_STREAM frame MAY still be sent in order to cancel -retransmission of previously-sent STREAM frames. +Length: -STOP_SENDING SHOULD only be sent for a receive stream that has not been -reset. STOP_SENDING is most useful for streams in the "Recv" or "Size Known" -states. +: A variable-length integer specifying the length of the Crypto Data field in + this CRYPTO frame. -An endpoint is expected to send another STOP_SENDING frame if a packet -containing a previous STOP_SENDING is lost. 
However, once either all stream -data or a RST_STREAM frame has been received for the stream - that is, the -stream is in any state other than "Recv" or "Size Known" - sending a -STOP_SENDING frame is unnecessary. +Crypto Data: +: The cryptographic message data. -## Stream Concurrency {#stream-concurrency} +There is a separate flow of cryptographic handshake data in each encryption +level, each of which starts at an offset of 0. This implies that each encryption +level is treated as a separate CRYPTO stream of data. -An endpoint limits the number of concurrently active incoming streams by -adjusting the maximum stream ID. An initial value is set in the transport -parameters (see {{transport-parameter-definitions}}) and is subsequently -increased by MAX_STREAM_ID frames (see {{frame-max-stream-id}}). +Unlike STREAM frames, which include a Stream ID indicating to which stream the +data belongs, the CRYPTO frame carries data for a single stream per encryption +level. The stream does not have an explicit end, so CRYPTO frames do not have a +FIN bit. -The maximum stream ID is specific to each endpoint and applies only to the peer -that receives the setting. That is, clients specify the maximum stream ID the -server can initiate, and servers specify the maximum stream ID the client can -initiate. Each endpoint may respond on streams initiated by the other peer, -regardless of whether it is permitted to initiate new streams. -Endpoints MUST NOT exceed the limit set by their peer. An endpoint that -receives a STREAM frame with an ID greater than the limit it has sent MUST treat -this as a stream error of type STREAM_ID_ERROR ({{error-handling}}), unless this -is a result of a change in the initial limits (see {{zerortt-parameters}}). +# Packetization and Reliability {#packetization} -A receiver cannot renege on an advertisement; that is, once a receiver -advertises a stream ID via a MAX_STREAM_ID frame, advertising a smaller maximum -ID has no effect. 
A sender MUST ignore any MAX_STREAM_ID frame that does not -increase the maximum stream ID. +A sender bundles one or more frames in a QUIC packet (see {{frames}}). +A sender SHOULD minimize per-packet bandwidth and computational costs by +bundling as many frames as possible within a QUIC packet. A sender MAY wait for +a short period of time to bundle multiple frames before sending a packet that is +not maximally packed, to avoid sending out large numbers of small packets. An +implementation may use knowledge about application sending behavior or +heuristics to determine whether and for how long to wait. This waiting period +is an implementation decision, and an implementation should be careful to delay +conservatively, since any delay is likely to increase application-visible +latency. -## Sending and Receiving Data -Once a stream is created, endpoints may use the stream to send and receive data. -Each endpoint may send a series of STREAM frames encapsulating data on a stream -until the stream is terminated in that direction. Streams are an ordered -byte-stream abstraction, and they have no other structure within them. STREAM -frame boundaries are not expected to be preserved in retransmissions from the -sender or during delivery to the application at the receiver. +## Packet Processing and Acknowledgment {#processing-and-ack} -When new data is to be sent on a stream, a sender MUST set the encapsulating -STREAM frame's offset field to the stream offset of the first byte of this new -data. The first octet of data on a stream has an offset of 0. An endpoint is -expected to send every stream octet. The largest offset delivered on a stream -MUST be less than 2^62. +A packet MUST NOT be acknowledged until packet protection has been successfully +removed and all frames contained in the packet have been processed. 
For STREAM +frames, this means the data has been enqueued in preparation to be received by +the application protocol, but it does not require that data is delivered and +consumed. -QUIC makes no specific allowances for partial reliability or delivery of stream -data out of order. Endpoints MUST be able to deliver stream data to an -application as an ordered byte-stream. Delivering an ordered byte-stream -requires that an endpoint buffer any data that is received out of order, up to -the advertised flow control limit. +Once the packet has been fully processed, a receiver acknowledges receipt by +sending one or more ACK frames containing the packet number of the received +packet. To avoid creating an indefinite feedback loop, an endpoint MUST NOT +send an ACK frame in response to a packet containing only ACK or PADDING frames, +even if there are packet gaps which precede the received packet. The endpoint +MUST acknowledge packets containing only ACK or PADDING frames in the next ACK +frame that it sends. -An endpoint could receive the same octets multiple times; octets that have -already been received can be discarded. The value for a given octet MUST NOT -change if it is sent multiple times; an endpoint MAY treat receipt of a changed -octet as a connection error of type PROTOCOL_VIOLATION. +While PADDING frames do not elicit an ACK frame from a receiver, they are +considered to be in flight for congestion control purposes +{{QUIC-RECOVERY}}. Sending only PADDING frames might cause the sender to become +limited by the congestion controller (as described in {{QUIC-RECOVERY}}) with no +acknowledgments forthcoming from the receiver. Therefore, a sender should ensure +that other frames are sent in addition to PADDING frames to elicit +acknowledgments from the receiver. -An endpoint MUST NOT send data on any stream without ensuring that it is within -the data limits set by its peer. 
+Strategies and implications of the frequency of generating acknowledgments are +discussed in more detail in {{QUIC-RECOVERY}}. -Flow control is described in detail in {{flow-control}}, and congestion control -is described in the companion document {{QUIC-RECOVERY}}. +## Retransmission of Information -## Stream Prioritization +QUIC packets that are determined to be lost are not retransmitted whole. The +same applies to the frames that are contained within lost packets. Instead, the +information that might be carried in frames is sent again in new frames as +needed. -Stream multiplexing has a significant effect on application performance if -resources allocated to streams are correctly prioritized. Experience with other -multiplexed protocols, such as HTTP/2 {{?HTTP2}}, shows that effective -prioritization strategies have a significant positive impact on performance. +New frames and packets are used to carry information that is determined to have +been lost. In general, information is sent again when a packet containing that +information is determined to be lost and sending ceases when a packet +containing that information is acknowledged. -QUIC does not provide frames for exchanging prioritization information. Instead -it relies on receiving priority information from the application that uses QUIC. -Protocols that use QUIC are able to define any prioritization scheme that suits -their application semantics. A protocol might define explicit messages for -signaling priority, such as those defined in HTTP/2; it could define rules that -allow an endpoint to determine priority based on context; or it could leave the -determination to the application. +* Data sent in CRYPTO frames is retransmitted according to the rules in + {{QUIC-RECOVERY}}, until either all data has been acknowledged or the crypto + state machine implicitly knows that the peer received the data. 
-A QUIC implementation SHOULD provide ways in which an application can indicate -the relative priority of streams. When deciding which streams to dedicate -resources to, QUIC SHOULD use the information provided by the application. -Failure to account for priority of streams can result in suboptimal performance. +* Application data sent in STREAM frames is retransmitted in new STREAM frames + unless the endpoint has sent a RST_STREAM for that stream. Once an endpoint + sends a RST_STREAM frame, no further STREAM frames are needed. -Stream priority is most relevant when deciding which stream data will be -transmitted. Often, there will be limits on what can be transmitted as a result -of connection flow control or the current congestion controller state. +* The most recent set of acknowledgments are sent in ACK frames. An ACK frame + SHOULD contain all unacknowledged acknowledgments, as described in + {{sending-ack-frames}}. -Giving preference to the transmission of its own management frames ensures that -the protocol functions efficiently. That is, prioritizing frames other than -STREAM frames ensures that loss recovery, congestion control, and flow control -operate effectively. +* Cancellation of stream transmission, as carried in a RST_STREAM frame, is + sent until acknowledged or until all stream data is acknowledged by the peer + (that is, either the "Reset Recvd" or "Data Recvd" state is reached on the + send stream). The content of a RST_STREAM frame MUST NOT change when it is + sent again. -CRYPTO frames SHOULD be prioritized over other streams prior to the completion -of the cryptographic handshake. This includes the retransmission of the second -flight of client handshake messages, that is, the TLS Finished and any client -authentication messages. 
+* Similarly, a request to cancel stream transmission, as encoded in a + STOP_SENDING frame, is sent until the receive stream enters either a "Data + Recvd" or "Reset Recvd" state, see {{solicited-state-transitions}}. -STREAM data in frames determined to be lost SHOULD be retransmitted before -sending new data, unless application priorities indicate otherwise. -Retransmitting lost stream data can fill in gaps, which allows the peer to -consume already received data and free up the flow control window. +* Connection close signals, including those that use CONNECTION_CLOSE and + APPLICATION_CLOSE frames, are not sent again when packet loss is detected, but + as described in {{termination}}. +* The current connection maximum data is sent in MAX_DATA frames. An updated + value is sent in a MAX_DATA frame if the packet containing the most recently + sent MAX_DATA frame is declared lost, or when the endpoint decides to update + the limit. Care is necessary to avoid sending this frame too often as the + limit can increase frequently and cause an unnecessarily large number of + MAX_DATA frames to be sent. -# Flow Control {#flow-control} +* The current maximum stream data offset is sent in MAX_STREAM_DATA frames. + Like MAX_DATA, an updated value is sent when the packet containing + the most recent MAX_STREAM_DATA frame for a stream is lost or when the limit + is updated, with care taken to prevent the frame from being sent too often. An + endpoint SHOULD stop sending MAX_STREAM_DATA frames when the receive stream + enters a "Size Known" state. -It is necessary to limit the amount of data that a sender may have outstanding -at any time, so as to prevent a fast sender from overwhelming a slow receiver, -or to prevent a malicious sender from consuming significant resources at a -receiver. This section describes QUIC's flow-control mechanisms. +* The maximum stream ID for a stream of a given type is sent in MAX_STREAM_ID + frames. 
Like MAX_DATA, an updated value is sent when a packet containing the + most recent MAX_STREAM_ID for a stream type frame is declared lost or when + the limit is updated, with care taken to prevent the frame from being sent + too often. -QUIC employs a credit-based flow-control scheme similar to HTTP/2's flow control -{{?HTTP2}}. A receiver advertises the number of octets it is prepared to -receive on a given stream and for the entire connection. This leads to two -levels of flow control in QUIC: (i) Connection flow control, which prevents -senders from exceeding a receiver's buffer capacity for the connection, and (ii) -Stream flow control, which prevents a single stream from consuming the entire -receive buffer for a connection. +* Blocked signals are carried in BLOCKED, STREAM_BLOCKED, and STREAM_ID_BLOCKED + frames. BLOCKED streams have connection scope, STREAM_BLOCKED frames have + stream scope, and STREAM_ID_BLOCKED frames are scoped to a specific stream + type. New frames are sent if packets containing the most recent frame for a + scope is lost, but only while the endpoint is blocked on the corresponding + limit. These frames always include the limit that is causing blocking at the + time that they are transmitted. -A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the sender -to advertise additional credit. MAX_STREAM_DATA frames send the -maximum absolute byte offset of a stream, while MAX_DATA sends the -maximum of the sum of the absolute byte offsets of all streams. +* A liveness or path validation check using PATH_CHALLENGE frames is sent + periodically until a matching PATH_RESPONSE frame is received or until there + is no remaining need for liveness or path validation checking. PATH_CHALLENGE + frames include a different payload each time they are sent. -A receiver MAY advertise a larger offset at any point by sending MAX_DATA or -MAX_STREAM_DATA frames. 
A receiver cannot renege on an advertisement; that is, -once a receiver advertises an offset, advertising a smaller offset has no -effect. A sender MUST therefore ignore any MAX_DATA or MAX_STREAM_DATA frames -that do not increase flow control limits. +* Responses to path validation using PATH_RESPONSE frames are sent just once. + A new PATH_CHALLENGE frame will be sent if another PATH_RESPONSE frame is + needed. -A receiver MUST close the connection with a FLOW_CONTROL_ERROR error -({{error-handling}}) if the peer violates the advertised connection or stream -data limits. +* New connection IDs are sent in NEW_CONNECTION_ID frames and retransmitted if + the packet containing them is lost. Retransmissions of this frame carry the + same sequence number value. Likewise, retired connection IDs are sent in + RETIRE_CONNECTION_ID frames and retransmitted if the packet containing them is + lost. -A sender SHOULD send BLOCKED or STREAM_BLOCKED frames to indicate it has data to -write but is blocked by flow control limits. These frames are expected to be -sent infrequently in common cases, but they are considered useful for debugging -and monitoring purposes. +* PADDING frames contain no information, so lost PADDING frames do not require + repair. -A receiver advertises credit for a stream by sending a MAX_STREAM_DATA frame -with the Stream ID set appropriately. A receiver could use the current offset of -data consumed to determine the flow control offset to be advertised. A receiver -MAY send MAX_STREAM_DATA frames in multiple packets in order to make sure that -the sender receives an update before running out of flow control credit, even if -one of the packets is lost. +Upon detecting losses, a sender MUST take appropriate congestion control action. +The details of loss detection and congestion control are described in +{{QUIC-RECOVERY}}. -Connection flow control is a limit to the total bytes of stream data sent in -STREAM frames on all streams. 
A receiver advertises credit for a connection by -sending a MAX_DATA frame. A receiver maintains a cumulative sum of bytes -received on all contributing streams, which are used to check for flow control -violations. A receiver might use a sum of bytes consumed on all contributing -streams to determine the maximum data limit to be advertised. -## Edge Cases and Other Considerations +## Packet Size {#packet-size} -There are some edge cases which must be considered when dealing with stream and -connection level flow control. Given enough time, both endpoints must agree on -flow control state. If one end believes it can send more than the other end is -willing to receive, the connection will be torn down when too much data arrives. +The QUIC packet size includes the QUIC header and integrity check, but not the +UDP or IP header. -Conversely if a sender believes it is blocked, while endpoint B expects more -data can be received, then the connection can be in a deadlock, with the sender -waiting for a MAX_DATA or MAX_STREAM_DATA frame which will never come. +Clients MUST ensure that the first Initial packet they send is sent in a UDP +datagram that is at least 1200 octets. Padding the Initial packet or including a +0-RTT packet in the same datagram are ways to meet this requirement. Sending a +UDP datagram of this size ensures that the network path supports a reasonable +Maximum Transmission Unit (MTU), and helps reduce the amplitude of amplification +attacks caused by server responses toward an unverified client address. -On receipt of a RST_STREAM frame, an endpoint will tear down state for the -matching stream and ignore further data arriving on that stream. This could -result in the endpoints getting out of sync, since the RST_STREAM frame may have -arrived out of order and there may be further bytes in flight. 
The data sender -would have counted the data against its connection level flow control budget, -but a receiver that has not received these bytes would not know to include them -as well. The receiver must learn the number of bytes that were sent on the -stream to make the same adjustment in its connection flow controller. +The datagram containing the first Initial packet from a client MAY exceed 1200 +octets if the client believes that the Path Maximum Transmission Unit (PMTU) +supports the size that it chooses. -To avoid this de-synchronization, a RST_STREAM sender MUST include the final -byte offset sent on the stream in the RST_STREAM frame. On receiving a -RST_STREAM frame, a receiver definitively knows how many bytes were sent on that -stream before the RST_STREAM frame, and the receiver MUST use the final offset -to account for all bytes sent on the stream in its connection level flow -controller. +A server MAY send a CONNECTION_CLOSE frame with error code PROTOCOL_VIOLATION in +response to the first Initial packet it receives from a client if the UDP +datagram is smaller than 1200 octets. It MUST NOT send any other frame type in +response, or otherwise behave as if any part of the offending packet was +processed as valid. -### Response to a RST_STREAM -RST_STREAM terminates one direction of a stream abruptly. Whether any action or -response can or should be taken on the data already received is an -application-specific issue, but it will often be the case that upon receipt of a -RST_STREAM an endpoint will choose to stop sending data in its own direction. If -the sender of a RST_STREAM wishes to explicitly state that no future data will -be processed, that endpoint MAY send a STOP_SENDING frame at the same time. +## Path Maximum Transmission Unit -### Data Limit Increments {#fc-credit} +The Path Maximum Transmission Unit (PMTU) is the maximum size of the entire IP +header, UDP header, and UDP payload. 
The UDP payload includes the QUIC packet +header, protected payload, and any authentication fields. -This document leaves when and how many bytes to advertise in a MAX_DATA or -MAX_STREAM_DATA to implementations, but offers a few considerations. These -frames contribute to connection overhead. Therefore frequently sending frames -with small changes is undesirable. At the same time, infrequent updates require -larger increments to limits if blocking is to be avoided. Thus, larger updates -require a receiver to commit to larger resource commitments. Thus there is a -trade-off between resource commitment and overhead when determining how large a -limit is advertised. +All QUIC packets SHOULD be sized to fit within the estimated PMTU to avoid IP +fragmentation or packet drops. To optimize bandwidth efficiency, endpoints +SHOULD use Packetization Layer PMTU Discovery ({{!PLPMTUD=RFC4821}}). Endpoints +MAY use PMTU Discovery ({{!PMTUDv4=RFC1191}}, {{!PMTUDv6=RFC8201}}) for +detecting the PMTU, setting the PMTU appropriately, and storing the result of +previous PMTU determinations. -A receiver MAY use an autotuning mechanism to tune the frequency and amount that -it increases data limits based on a round-trip time estimate and the rate at -which the receiving application consumes data, similar to common TCP -implementations. +In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP packets +larger than 1280 octets. Assuming the minimum IP header size, this results in +a QUIC packet size of 1232 octets for IPv6 and 1252 octets for IPv4. Some +QUIC implementations MAY be more conservative in computing allowed QUIC packet +size given unknown tunneling overheads or IP header options. -## Stream Limit Increment +QUIC endpoints that implement any kind of PMTU discovery SHOULD maintain an +estimate for each combination of local and remote IP addresses. Each pairing of +local and remote addresses could have a different maximum MTU in the path. 
-As with flow control, this document leaves when and how many streams to make -available to a peer via MAX_STREAM_ID to implementations, but offers a few -considerations. MAX_STREAM_ID frames constitute minimal overhead, while -withholding MAX_STREAM_ID frames can prevent the peer from using the available -parallelism. +QUIC depends on the network path supporting an MTU of at least 1280 octets. This +is the IPv6 minimum MTU and therefore also supported by most modern IPv4 +networks. An endpoint MUST NOT reduce its MTU below this number, even if it +receives signals that indicate a smaller limit might exist. -Implementations will likely want to increase the maximum stream ID as -peer-initiated streams close. A receiver MAY also advance the maximum stream ID -based on current activity, system conditions, and other environmental factors. +If a QUIC endpoint determines that the PMTU between any pair of local and remote +IP addresses has fallen below 1280 octets, it MUST immediately cease sending +QUIC packets on the affected path. This could result in termination of the +connection if an alternative path cannot be found. -### Blocking on Flow Control {#blocking} +### IPv4 PMTU Discovery {#v4-pmtud} -If a sender does not receive a MAX_DATA or MAX_STREAM_DATA frame when it has run -out of flow control credit, the sender will be blocked and SHOULD send a BLOCKED -or STREAM_BLOCKED frame. These frames are expected to be useful for debugging -at the receiver; they do not require any other action. A receiver SHOULD NOT -wait for a BLOCKED or STREAM_BLOCKED frame before sending MAX_DATA or -MAX_STREAM_DATA, since doing so will mean that a sender is unable to send for an -entire round trip. +Traditional ICMP-based path MTU discovery in IPv4 {{!PMTUDv4}} is potentially +vulnerable to off-path attacks that successfully guess the IP/port 4-tuple and +reduce the MTU to a bandwidth-inefficient value. 
TCP connections mitigate this +risk by using the (at minimum) 8 bytes of transport header echoed in the ICMP +message to validate the TCP sequence number as valid for the current +connection. However, as QUIC operates over UDP, in IPv4 the echoed information +could consist only of the IP and UDP headers, which usually has insufficient +entropy to mitigate off-path attacks. -For smooth operation of the congestion controller, it is generally considered -best to not let the sender go into quiescence if avoidable. To avoid blocking a -sender, and to reasonably account for the possibility of loss, a receiver should -send a MAX_DATA or MAX_STREAM_DATA frame at least two round trips before it -expects the sender to get blocked. +As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps to +mitigate this risk. For instance, an application could: -A sender sends a single BLOCKED or STREAM_BLOCKED frame only once when it -reaches a data limit. A sender SHOULD NOT send multiple BLOCKED or -STREAM_BLOCKED frames for the same data limit, unless the original frame is -determined to be lost. Another BLOCKED or STREAM_BLOCKED frame can be sent -after the data limit is increased. +* Set the IPv4 Don't Fragment (DF) bit on a small proportion of packets, so that +most invalid ICMP messages arrive when there are no DF packets outstanding, and +can therefore be identified as spurious. +* Store additional information from the IP or UDP headers from DF packets (for +example, the IP ID or UDP checksum) to further authenticate incoming Datagram +Too Big messages. -## Stream Final Offset {#final-offset} +* Any reduction in PMTU due to a report contained in an ICMP packet is +provisional until QUIC's loss detection algorithm determines that the packet is +actually lost. -The final offset is the count of the number of octets that are transmitted on a -stream. For a stream that is reset, the final offset is carried explicitly in -a RST_STREAM frame. 
Otherwise, the final offset is the offset of the end of the -data carried in a STREAM frame marked with a FIN flag, or 0 in the case of -incoming unidirectional streams. -An endpoint will know the final offset for a stream when the receive stream -enters the "Size Known" or "Reset Recvd" state. +### Special Considerations for Packetization Layer PMTU Discovery -An endpoint MUST NOT send data on a stream at or beyond the final offset. -Once a final offset for a stream is known, it cannot change. If a RST_STREAM or -STREAM frame causes the final offset to change for a stream, an endpoint SHOULD -respond with a FINAL_OFFSET_ERROR error (see {{error-handling}}). A receiver -SHOULD treat receipt of data at or beyond the final offset as a -FINAL_OFFSET_ERROR error, even after a stream is closed. Generating these -errors is not mandatory, but only because requiring that an endpoint generate -these errors also means that the endpoint needs to maintain the final offset -state for closed streams, which could mean a significant state commitment. +The PADDING frame provides a useful option for PMTU probe packets. PADDING +frames generate acknowledgements, but they need not be delivered reliably. As a +result, the loss of PADDING frames in probe packets does not require +delay-inducing retransmission. However, PADDING frames do consume congestion +window, which may delay the transmission of subsequent application data. -## Flow Control for Cryptographic Handshake {#flow-control-crypto} +When implementing the algorithm in Section 7.2 of {{!PLPMTUD}}, the initial +value of search_low SHOULD be consistent with the IPv6 minimum packet size. +Paths that do not support this size cannot deliver Initial packets, and +therefore are not QUIC-compliant. -Data sent in CRYPTO frames is not flow controlled in the same way as STREAM -frames. QUIC relies on the cryptographic protocol implementation to avoid -excessive buffering of data, see {{QUIC-TLS}}. 
The implementation SHOULD -provide an interface to QUIC to tell it about its buffering limits so that there -is not excessive buffering at multiple layers. +Section 7.3 of {{!PLPMTUD}} discusses trade-offs between small and large +increases in the size of probe packets. As QUIC probe packets need not contain +application data, aggressive increases in probe size carry fewer consequences. # Error Handling From d57c08d436598eb20a1ab59dbcee1af8251c9e3f Mon Sep 17 00:00:00 2001 From: Mike Bishop Date: Fri, 12 Oct 2018 15:35:52 -0700 Subject: [PATCH 26/57] Lucas's s/p/P/ --- draft-ietf-quic-http.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-http.md b/draft-ietf-quic-http.md index 86d4ea4dfc..dab046c6c7 100644 --- a/draft-ietf-quic-http.md +++ b/draft-ietf-quic-http.md @@ -716,7 +716,7 @@ set from server to client, as in HTTP/2. The payload consists of: Push ID: -: A variable-length integer that identifies the server push operation. A push +: A variable-length integer that identifies the server push operation. A Push ID is used in push stream headers ({{server-push}}), CANCEL_PUSH frames ({{frame-cancel-push}}), and PRIORITY frames ({{frame-priority}}). From c705fc1417edd3f66a7a64ec495520efab325efc Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Fri, 12 Oct 2018 22:41:03 -0700 Subject: [PATCH 27/57] Moved major sections around. --- draft-ietf-quic-transport.md | 4545 +++++++++++++++++----------------- 1 file changed, 2277 insertions(+), 2268 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index be2ba769a1..a506524663 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -871,2640 +871,2862 @@ provide an interface to QUIC to tell it about its buffering limits so that there is not excessive buffering at multiple layers. -# Versions {#versions} +# Connections {#connection} -QUIC versions are identified using a 32-bit unsigned number. 
+A QUIC connection is a single conversation between two QUIC endpoints. QUIC's +connection establishment intertwines version negotiation with the cryptographic +and transport handshakes to reduce connection establishment latency, as +described in {{handshake}}. Once established, a connection may migrate to a +different IP or port at either endpoint, due to NAT rebinding or mobility, as +described in {{migration}}. Finally, a connection may be terminated by either +endpoint, as described in {{termination}}. -The version 0x00000000 is reserved to represent version negotiation. This -version of the specification is identified by the number 0x00000001. +## Connection ID -Other versions of QUIC might have different properties to this version. The -properties of QUIC that are guaranteed to be consistent across all versions of -the protocol are described in {{QUIC-INVARIANTS}}. +Each connection possesses a set of identifiers, any of which could be used to +distinguish it from other connections. Connection IDs are selected +independently in each direction. Each Connection ID has an associated sequence +number to assist in deduplicating messages. -Version 0x00000001 of QUIC uses TLS as a cryptographic handshake protocol, as -described in {{QUIC-TLS}}. +The primary function of a connection ID is to ensure that changes in addressing +at lower protocol layers (UDP, IP, and below) don't cause packets for a QUIC +connection to be delivered to the wrong endpoint. Each endpoint selects +connection IDs using an implementation-specific (and perhaps +deployment-specific) method which will allow packets with that connection ID to +be routed back to the endpoint and identified by the endpoint upon receipt. -Versions with the most significant 16 bits of the version number cleared are -reserved for use in future IETF consensus documents. +Connection IDs MUST NOT contain any information that can be used to correlate +them with other connection IDs for the same connection. 
As a trivial example, +this means the same connection ID MUST NOT be issued more than once on the same +connection. -Versions that follow the pattern 0x?a?a?a?a are reserved for use in forcing -version negotiation to be exercised. That is, any version number where the low -four bits of all octets is 1010 (in binary). A client or server MAY advertise -support for any of these reserved versions. +A zero-length connection ID MAY be used when the connection ID is not needed for +routing and the address/port tuple of packets is sufficient to identify a +connection. An endpoint whose peer has selected a zero-length connection ID MUST +continue to use a zero-length connection ID for the lifetime of the connection +and MUST NOT send packets from any other local address. -Reserved version numbers will probably never represent a real protocol; a client -MAY use one of these version numbers with the expectation that the server will -initiate version negotiation; a server MAY advertise support for one of these -versions and can expect that clients ignore the value. +When an endpoint has requested a non-zero-length connection ID, it needs to +ensure that the peer has a supply of connection IDs from which to choose for +packets sent to the endpoint. These connection IDs are supplied by the endpoint +using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). -\[\[RFC editor: please remove the remainder of this section before -publication.]] -The version number for the final version of this specification (0x00000001), is -reserved for the version of the protocol that is published as an RFC. +### Issuing Connection IDs -Version numbers used to identify IETF drafts are created by adding the draft -number to 0xff000000. For example, draft-ietf-quic-transport-13 would be -identified as 0xff00000D. +The initial connection ID issued by an endpoint is the Source Connection ID +during the handshake. The sequence number of the initial connection ID is 0. 
If +the preferred_address transport parameter is sent, the sequence number of the +supplied connection ID is 1. Subsequent connection IDs are communicated to the +peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}), and the +sequence number on each newly-issued connection ID MUST increase by 1. The +connection ID randomly selected by the client in the Initial packet and any +connection ID provided by a Reset packet are not assigned sequence numbers +unless a server opts to retain them as its initial connection ID. -Implementors are encouraged to register version numbers of QUIC that they are -using for private experimentation on the GitHub wiki at -\. +When an endpoint issues a connection ID, it MUST accept packets that carry this +connection ID for the duration of the connection or until its peer invalidates +the connection ID via a RETIRE_CONNECTION_ID frame +({{frame-retire-connection-id}}). +An endpoint SHOULD ensure that its peer has a sufficient number of available and +unused connection IDs. While each endpoint independently chooses how many +connection IDs to issue, endpoints SHOULD provide and maintain at least eight +connection IDs. The endpoint can do this by always supplying a new connection +ID when a connection ID is retired by its peer or when the endpoint receives a +packet with a previously unused connection ID. Endpoints that initiate +migration and require non-zero-length connection IDs SHOULD provide their peers +with new connection IDs before migration, or risk the peer closing the +connection. -# Packet Types and Formats -We first describe QUIC's packet types and their formats, since some are -referenced in subsequent mechanisms. +### Consuming and Retiring Connection IDs {#retiring-cids} -All numeric values are encoded in network byte order (that is, big-endian) and -all field sizes are in bits. When discussing individual bits of fields, the -least significant bit is referred to as bit 0. 
Hexadecimal notation is used for -describing the value of fields. +An endpoint can change the connection ID it uses for a peer to another available +one at any time during the connection. An endpoint consumes connection IDs in +response to a migrating peer, see {{migration-linkability}} for more. -Any QUIC packet has either a long or a short header, as indicated by the Header -Form bit. Long headers are expected to be used early in the connection before -version negotiation and establishment of 1-RTT keys. Short headers are minimal -version-specific headers, which are used after version negotiation and 1-RTT -keys are established. +An endpoint maintains a set of connection IDs received from its peer, any of +which it can use when sending packets. When the endpoint wishes to remove a +connection ID from use, it sends a RETIRE_CONNECTION_ID frame to its peer, +indicating that the peer might bring a new connection ID into circulation using +the NEW_CONNECTION_ID frame. -## Long Header {#long-header} +An endpoint that retires a connection ID can retain knowledge of that connection +ID for a period of time after sending the RETIRE_CONNECTION_ID frame, or until +that frame is acknowledged. -~~~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| Type (7) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Packet Number (8/16/32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Payload (*) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~~~ -{: #fig-long-header title="Long Header Packet Format"} +As discussed in {{migration-linkability}}, each connection ID MUST be used on +packets sent from only one local address. An endpoint that migrates away from a +local address SHOULD retire all connection IDs used on that address once it no +longer plans to use that address. -Long headers are used for packets that are sent prior to the completion of -version negotiation and establishment of 1-RTT keys. Once both conditions are -met, a sender switches to sending packets using the short header -({{short-header}}). The long form allows for special packets - such as the -Version Negotiation packet - to be represented in this uniform fixed-length -packet format. Packets that use the long header contain the following fields: -Header Form: +## Matching Packets to Connections {#packet-handling} -: The most significant bit (0x80) of octet 0 (the first octet) is set to 1 for - long headers. +Incoming packets are classified on receipt. Packets can either be associated +with an existing connection, or - for servers - potentially create a new +connection. -Long Packet Type: +Hosts try to associate a packet with an existing connection. If the packet has a +Destination Connection ID corresponding to an existing connection, QUIC +processes that packet accordingly. Note that more than one connection ID can be +associated with a connection; see {{connection-id}}. -: The remaining seven bits of octet 0 contain the packet type. This field can - indicate one of 128 packet types. The types specified for this version are - listed in {{long-packet-types}}. +If the Destination Connection ID is zero length and the packet matches the +address/port tuple of a connection where the host did not require connection +IDs, QUIC processes the packet as part of that connection. 
Endpoints MUST drop +packets with zero-length Destination Connection ID fields if they do not +correspond to a single connection. -Version: +Endpoints SHOULD send a Stateless Reset ({{stateless-reset}}) for any packets +that cannot be attributed to an existing connection. -: The QUIC Version is a 32-bit field that follows the Type. This field - indicates which version of QUIC is in use and determines how the rest of the - protocol fields are interpreted. -DCIL and SCIL: +### Client Packet Handling {#client-pkt-handling} -: The octet following the version contains the lengths of the two connection ID - fields that follow it. These lengths are encoded as two 4-bit unsigned - integers. The Destination Connection ID Length (DCIL) field occupies the 4 - high bits of the octet and the Source Connection ID Length (SCIL) field - occupies the 4 low bits of the octet. An encoded length of 0 indicates that - the connection ID is also 0 octets in length. Non-zero encoded lengths are - increased by 3 to get the full length of the connection ID, producing a length - between 4 and 18 octets inclusive. For example, an octet with the value 0x50 - describes an 8-octet Destination Connection ID and a zero-length Source - Connection ID. +Valid packets sent to clients always include a Destination Connection ID that +matches the value the client selects. Clients that choose to receive +zero-length connection IDs can use the address/port tuple to identify a +connection. Packets that don't match an existing connection are discarded. -Destination Connection ID: +Due to packet reordering or loss, clients might receive packets for a connection +that are encrypted with a key it has not yet computed. Clients MAY drop these +packets, or MAY buffer them in anticipation of later packets that allow it to +compute the key. -: The Destination Connection ID field follows the connection ID lengths and is - either 0 octets in length or between 4 and 18 octets. 
- {{connection-id-encoding}} describes the use of this field in more detail. +If a client receives a packet that has an unsupported version, it MUST discard +that packet. -Source Connection ID: -: The Source Connection ID field follows the Destination Connection ID and is - either 0 octets in length or between 4 and 18 octets. - {{connection-id-encoding}} describes the use of this field in more detail. +### Server Packet Handling {#server-pkt-handling} -Length: +If a server receives a packet that has an unsupported version, but the packet is +sufficiently large to initiate a new connection for any version supported by the +server, it SHOULD send a Version Negotiation packet as described in +{{send-vn}}. Servers MAY rate control these packets to avoid storms of Version +Negotiation packets. -: The length of the remainder of the packet (that is, the Packet Number and - Payload fields) in octets, encoded as a variable-length integer - ({{integer-encoding}}). +The first packet for an unsupported version can use different semantics and +encodings for any version-specific field. In particular, different packet +protection keys might be used for different versions. Servers that do not +support a particular version are unlikely to be able to decrypt the payload of +the packet. Servers SHOULD NOT attempt to decode or decrypt a packet from an +unknown version, but instead send a Version Negotiation packet, provided that +the packet is sufficiently long. -Packet Number: +Servers MUST drop other packets that contain unsupported versions. -: The packet number field is 1, 2, or 4 octets long. The packet number has - confidentiality protection separate from packet protection, as described - in Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is - encoded in the plaintext packet number. See {{packet-numbers}} for details. +Packets with a supported version, or no version field, are matched to a +connection as described in {{packet-handling}}. 
If not matched, the server +continues below. -Payload: +If the packet is an Initial packet fully conforming with the specification, the +server proceeds with the handshake ({{handshake}}). This commits the server to +the version that the client selected. -: The payload of the packet. +If a server isn't currently accepting any new connections, it SHOULD send an +Initial packet containing a CONNECTION_CLOSE frame with error code +SERVER_BUSY. -The following packet types are defined: +If the packet is a 0-RTT packet, the server MAY buffer a limited number of these +packets in anticipation of a late-arriving Initial Packet. Clients are forbidden +from sending Handshake packets prior to receiving a server response, so servers +SHOULD ignore any such packets. -| Type | Name | Section | -|:-----|:------------------------------|:----------------------------| -| 0x7F | Initial | {{packet-initial}} | -| 0x7E | Retry | {{packet-retry}} | -| 0x7D | Handshake | {{packet-handshake}} | -| 0x7C | 0-RTT Protected | {{packet-protected}} | -{: #long-packet-types title="Long Header Packet Types"} +Servers MUST drop incoming packets under all other circumstances. -The header form, type, connection ID lengths octet, destination and source -connection IDs, and version fields of a long header packet are -version-independent. The packet number and values for packet types defined in -{{long-packet-types}} are version-specific. See {{QUIC-INVARIANTS}} for details -on how packets from different versions of QUIC are interpreted. -The interpretation of the fields and the payload are specific to a version and -packet type. Type-specific semantics for this version are described in the -following sections. -The end of the packet is determined by the Length field. The Length field -covers both the Packet Number and Payload fields, both of which are -confidentiality protected and initially of unknown length. The size of the -Payload field is learned once the packet number protection is removed. 
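
The server packet handling rules added in this hunk can be read as a short decision procedure. The sketch below is one possible reading, not the draft's normative algorithm. The `Action` names, the `classify` function, its parameters, and the supported-version set are all hypothetical. The 1200-octet threshold assumes the minimum Initial datagram size of this QUIC version, and the check that an Initial packet fully conforms to the specification is assumed to have happened already:

```python
from enum import Enum, auto


class Action(Enum):
    SEND_VERSION_NEGOTIATION = auto()
    DROP = auto()
    PROCESS_ON_CONNECTION = auto()
    START_HANDSHAKE = auto()
    REFUSE_SERVER_BUSY = auto()
    BUFFER_0RTT = auto()


SUPPORTED_VERSIONS = {0x00000001}  # assumption: this server speaks only version 1
MIN_INITIAL_DATAGRAM = 1200        # assumption: smallest datagram that can start a connection


def classify(version, packet_type, datagram_len, matches_connection,
             accepting=True, can_buffer=True):
    """Pick the server's action for one incoming packet.

    `version` is None for short-header packets, which carry no version field.
    """
    if version is not None and version not in SUPPORTED_VERSIONS:
        # Unsupported version: answer only if the packet is large enough to
        # have started a connection; never try to decode or decrypt it.
        if datagram_len >= MIN_INITIAL_DATAGRAM:
            return Action.SEND_VERSION_NEGOTIATION
        return Action.DROP
    if matches_connection:
        return Action.PROCESS_ON_CONNECTION
    if packet_type == "initial":
        # A conforming Initial either starts the handshake or is refused.
        return Action.START_HANDSHAKE if accepting else Action.REFUSE_SERVER_BUSY
    if packet_type == "0rtt" and can_buffer:
        # Buffer a limited number of 0-RTT packets for a late Initial.
        return Action.BUFFER_0RTT
    # Everything else, including early Handshake packets, is dropped.
    return Action.DROP
```

A real server would also rate limit Version Negotiation packets and bound the 0-RTT buffer; both are left out of this sketch.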
- -Senders can sometimes coalesce multiple packets into one UDP datagram. See -{{packet-coalesce}} for more details. +# Version Negotiation {#version-negotiation} +Version negotiation ensures that client and server agree to a QUIC version +that is mutually supported. A server sends a Version Negotiation packet in +response to each packet that might initiate a new connection, see +{{packet-handling}} for details. -## Short Header +The size of the first packet sent by a client will determine whether a server +sends a Version Negotiation packet. Clients that support multiple QUIC versions +SHOULD pad the first packet they send to the largest of the minimum packet sizes +across all versions they support. This ensures that the server responds if there +is a mutually supported version. -~~~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|0|K|1|1|0|R R R| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Packet Number (8/16/32) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Protected Payload (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~~~ -{: #fig-short-header title="Short Header Packet Format"} +## Sending Version Negotiation Packets {#send-vn} -The short header can be used after the version and 1-RTT keys are negotiated. -Packets that use the short header contain the following fields: +If the version selected by the client is not acceptable to the server, the +server responds with a Version Negotiation packet (see {{packet-version}}). +This includes a list of versions that the server will accept. -Header Form: +This system allows a server to process packets with unsupported versions without +retaining state. 
Though either the Initial packet or the Version Negotiation +packet that is sent in response could be lost, the client will send new packets +until it successfully receives a response or it abandons the connection attempt. -: The most significant bit (0x80) of octet 0 is set to 0 for the short header. -Key Phase Bit: +## Handling Version Negotiation Packets {#handle-vn} -: The second bit (0x40) of octet 0 indicates the key phase, which allows a - recipient of a packet to identify the packet protection keys that are used to - protect the packet. See {{QUIC-TLS}} for details. +When the client receives a Version Negotiation packet, it first checks that the +Destination and Source Connection ID fields match the Source and Destination +Connection ID fields in a packet that the client sent. If this check fails, the +packet MUST be discarded. -\[\[Editor's Note: this section should be removed and the bit definitions -changed before this draft goes to the IESG.]] +Once the Version Negotiation packet is determined to be valid, the client then +selects an acceptable protocol version from the list provided by the server. +The client then attempts to create a connection using that version. Though the +content of the Initial packet the client sends might not change in response to +version negotiation, a client MUST increase the packet number it uses on every +packet it sends. Packets MUST continue to use long headers and MUST include the +new negotiated protocol version. -Third Bit: +The client MUST use the long header format and include its selected version on +all packets until it has 1-RTT keys and it has received a packet from the server +which is not a Version Negotiation packet. -: The third bit (0x20) of octet 0 is set to 1. +A client MUST NOT change the version it uses unless it is in response to a +Version Negotiation packet from the server. 
Once a client receives a packet +from the server which is not a Version Negotiation packet, it MUST discard other +Version Negotiation packets on the same connection. Similarly, a client MUST +ignore a Version Negotiation packet if it has already received and acted on a +Version Negotiation packet. -\[\[Editor's Note: this section should be removed and the bit definitions -changed before this draft goes to the IESG.]] +A client MUST ignore a Version Negotiation packet that lists the client's chosen +version. -Fourth Bit: +A client MAY attempt 0-RTT after receiving a Version Negotiation packet. A +client that sends additional 0-RTT packets MUST NOT reset the packet number to 0 +as a result, see {{retry-0rtt-pn}}. -: The fourth bit (0x10) of octet 0 is set to 1. +Version negotiation packets have no cryptographic protection. The result of the +negotiation MUST be revalidated as part of the cryptographic handshake (see +{{version-validation}}). -\[\[Editor's Note: this section should be removed and the bit definitions -changed before this draft goes to the IESG.]] -Google QUIC Demultiplexing Bit: +## Using Reserved Versions -: The fifth bit (0x8) of octet 0 is set to 0. This allows implementations of - Google QUIC to distinguish Google QUIC packets from short header packets sent - by a client because Google QUIC servers expect the connection ID to always be - present. - The special interpretation of this bit SHOULD be removed from this - specification when Google QUIC has finished transitioning to the new header - format. +For a server to use a new version in the future, clients must correctly handle +unsupported versions. To help ensure this, a server SHOULD include a reserved +version (see {{versions}}) while generating a Version Negotiation packet. -Reserved: +The design of version negotiation permits a server to avoid maintaining state +for packets that it rejects in this fashion. 
The validation of version +negotiation (see {{version-validation}}) only validates the result of version +negotiation, which is the same no matter which reserved version was sent. +A server MAY therefore send different reserved version numbers in the Version +Negotiation Packet and in its transport parameters. -: The sixth, seventh, and eighth bits (0x7) of octet 0 are reserved for - experimentation. Endpoints MUST ignore these bits on packets they receive - unless they are participating in an experiment that uses these bits. An - endpoint not actively using these bits SHOULD set the value randomly on - packets they send to protect against unwanted inference about particular - values. +A client MAY send a packet using a reserved version number. This can be used to +solicit a list of supported versions from a server. -Destination Connection ID: -: The Destination Connection ID is a connection ID that is chosen by the - intended recipient of the packet. See {{connection-id}} for more details. -Packet Number: +# Cryptographic and Transport Handshake {#handshake} -: The packet number field is 1, 2, or 4 octets long. The packet number has - confidentiality protection separate from packet protection, as described in - Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is encoded - in the plaintext packet number. See {{packet-numbers}} for details. +QUIC relies on a combined cryptographic and transport handshake to minimize +connection establishment latency. QUIC uses the CRYPTO frame {{frame-crypto}} +to transmit the cryptographic handshake. Version 0x00000001 of QUIC uses TLS +1.3 as described in {{QUIC-TLS}}; a different QUIC version number could indicate +that a different cryptographic handshake protocol is in use. -Protected Payload: +QUIC provides reliable, ordered delivery of the cryptographic handshake +data. 
QUIC packet protection ensures confidentiality and integrity protection +that meets the requirements of the cryptographic handshake protocol: -: Packets with a short header always include a 1-RTT protected payload. +* authenticated key exchange, where -The header form and connection ID field of a short header packet are -version-independent. The remaining fields are specific to the selected QUIC -version. See {{QUIC-INVARIANTS}} for details on how packets from different -versions of QUIC are interpreted. + * a server is always authenticated, + * a client is optionally authenticated, -## Version Negotiation Packet {#packet-version} + * every connection produces distinct and unrelated keys, -A Version Negotiation packet is inherently not version-specific, and does not -use the long packet header (see {{long-header}}. Upon receipt by a client, it -will appear to be a packet using the long header, but will be identified as a -Version Negotiation packet based on the Version field having a value of 0. + * keying material is usable for packet protection for both 0-RTT and 1-RTT + packets, and -The Version Negotiation packet is a response to a client packet that contains a -version that is not supported by the server, and is only sent by servers. + * 1-RTT keys have forward secrecy -The layout of a Version Negotiation packet is: +* authenticated values for the transport parameters of the peer (see + {{transport-parameters}}) -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| Unused (7) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Supported Version 1 (32) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| [Supported Version 2 (32)] ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| [Supported Version N (32)] ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #version-negotiation-format title="Version Negotiation Packet"} +* authenticated confirmation of version negotiation (see {{version-validation}}) -The value in the Unused field is selected randomly by the server. +* authenticated negotiation of an application protocol (TLS uses ALPN + {{?RFC7301}} for this purpose) -The Version field of a Version Negotiation packet MUST be set to 0x00000000. +* for the server, the ability to carry data that provides assurance that the + client can receive packets that are addressed with the transport address that + is claimed by the client (see {{address-validation}}) -The server MUST include the value from the Source Connection ID field of the -packet it receives in the Destination Connection ID field. The value for Source -Connection ID MUST be copied from the Destination Connection ID of the received -packet, which is initially randomly selected by a client. Echoing both -connection IDs gives clients some assurance that the server received the packet -and that the Version Negotiation packet was not generated by an off-path -attacker. +The first CRYPTO frame MUST be sent in a single packet. Any second attempt +that is triggered by address validation MUST also be sent within a single +packet. This avoids having to reassemble a message from multiple packets. -The remainder of the Version Negotiation packet is a list of 32-bit versions -which the server supports. +The first client packet of the cryptographic handshake protocol MUST fit within +a 1232 octet QUIC packet payload. 
This includes overheads that reduce the space +available to the cryptographic handshake protocol. -A Version Negotiation packet cannot be explicitly acknowledged in an ACK frame -by a client. Receiving another Initial packet implicitly acknowledges a Version -Negotiation packet. +The CRYPTO frame can be sent in different packet number spaces. CRYPTO frames +in each packet number space carry a separate sequence of handshake data starting +from an offset of 0. -The Version Negotiation packet does not include the Packet Number and Length -fields present in other packets that use the long header form. Consequently, -a Version Negotiation packet consumes an entire UDP datagram. +## Example Handshake Flows -See {{version-negotiation}} for a description of the version negotiation -process. +Details of how TLS is integrated with QUIC are provided in {{QUIC-TLS}}, but +some examples are provided here. +{{tls-1rtt-handshake}} provides an overview of the 1-RTT handshake. Each line +shows a QUIC packet with the packet type and packet number shown first, followed +by the frames that are typically contained in those packets. So, for instance +the first packet is of type Initial, with packet number 0, and contains a CRYPTO +frame carrying the ClientHello. -## Retry Packet {#packet-retry} +Note that multiple QUIC packets -- even of different encryption levels -- may be +coalesced into a single UDP datagram (see {{packet-coalesce}}), and so this +handshake may consist of as few as 4 UDP datagrams, or any number more. For +instance, the server's first flight contains packets from the Initial encryption +level (obfuscation), the Handshake level, and "0.5-RTT data" from the server at +the 1-RTT encryption level. -A Retry packet uses a long packet header with a type value of 0x7E. It carries -an address validation token created by the server. It is used by a server that -wishes to perform a stateless retry (see {{stateless-retry}}). 
+~~~~ +Client Server -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| 0x7e | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ODCIL(8) | Original Destination Connection ID (*) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Retry Token (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #retry-format title="Retry Packet"} +Initial[0]: CRYPTO[CH] -> -A Retry packet (shown in {{retry-format}}) only uses the invariant portion of -the long packet header {{QUIC-INVARIANTS}}; that is, the fields up to and -including the Destination and Source Connection ID fields. A Retry packet does -not contain any protected fields. Like Version Negotiation, a Retry packet -contains the long header including the connection IDs, but omits the Length, -Packet Number, and Payload fields. These are replaced with: + Initial[0]: CRYPTO[SH] ACK[0] + Handshake[0]: CRYPTO[EE, CERT, CV, FIN] + <- 1-RTT[0]: STREAM[1, "..."] -ODCIL: +Initial[1]: ACK[0] +Handshake[0]: CRYPTO[FIN], ACK[0] +1-RTT[0]: STREAM[0, "..."], ACK[0] -> -: The length of the Original Destination Connection ID field. The length is - encoded in the least significant 4 bits of the octet, using the same encoding - as the DCIL and SCIL fields. The most significant 4 bits of this octet are - reserved. Unless a use for these bits has been negotiated, endpoints SHOULD - send randomized values and MUST ignore any value that it receives. 
+ 1-RTT[1]: STREAM[55, "..."], ACK[0] + <- Handshake[1]: ACK[0] +~~~~ +{: #tls-1rtt-handshake title="Example 1-RTT Handshake"} -Original Destination Connection ID: -: The Original Destination Connection ID contains the value of the Destination - Connection ID from the Initial packet that this Retry is in response to. The - length of this field is given in ODCIL. +{{tls-0rtt-handshake}} shows an example of a connection with a 0-RTT handshake +and a single packet of 0-RTT data. Note that as described in {{packet-numbers}}, +the server ACKs the 0-RTT data at the 1-RTT encryption level, and the client's +sequence numbers at the 1-RTT encryption level continue to increment from its +0-RTT packets. -Retry Token: +~~~~ +Client Server -: An opaque token that the server can use to validate the client's address. +Initial[0]: CRYPTO[CH] +0-RTT[0]: STREAM[0, "..."] -> -The server populates the Destination Connection ID with the connection ID that -the client included in the Source Connection ID of the Initial packet. + Initial[0]: CRYPTO[SH] ACK[0] + Handshake[0] CRYPTO[EE, CERT, CV, FIN] + <- 1-RTT[0]: STREAM[1, "..."] ACK[0] -The server includes a connection ID of its choice in the Source Connection ID -field. This value MUST not be equal to the Destination Connection ID field of -the packet sent by the client. The client MUST use this connection ID in the -Destination Connection ID of subsequent packets that it sends. +Initial[1]: ACK[0] +0-RTT[1]: CRYPTO[EOED] +Handshake[0]: CRYPTO[FIN], ACK[0] +1-RTT[2]: STREAM[0, "..."] ACK[0] -> -A server MAY send Retry packets in response to Initial and 0-RTT packets. A -server can either discard or buffer 0-RTT packets that it receives. A server -can send multiple Retry packets as it receives Initial or 0-RTT packets. + 1-RTT[1]: STREAM[55, "..."], ACK[1,2] + <- Handshake[1]: ACK[0] +~~~~ +{: #tls-0rtt-handshake title="Example 0-RTT Handshake"} -A client MUST accept and process at most one Retry packet for each connection -attempt. 
After the client has received and processed an Initial or Retry packet -from the server, it MUST discard any subsequent Retry packets that it receives. -Clients MUST discard Retry packets that contain an Original Destination -Connection ID field that does not match the Destination Connection ID from its -Initial packet. This prevents an off-path attacker from injecting a Retry -packet. +## Transport Parameters -The client responds to a Retry packet with an Initial packet that includes the -provided Retry Token to continue connection establishment. +During connection establishment, both endpoints make authenticated declarations +of their transport parameters. These declarations are made unilaterally by each +endpoint. Endpoints are required to comply with the restrictions implied by +these parameters; the description of each parameter includes rules for its +handling. -A client sets the Destination Connection ID field of this Initial packet to the -value from the Source Connection ID in the Retry packet. Changing Destination -Connection ID also results in a change to the keys used to protect the Initial -packet. It also sets the Token field to the token provided in the Retry. The -client MUST NOT change the Source Connection ID because the server could include -the connection ID as part of its token validation logic (see {{tokens}}). +The format of the transport parameters is the TransportParameters struct from +{{figure-transport-parameters}}. This is described using the presentation +language from Section 3 of {{!TLS13=RFC8446}}. -All subsequent Initial packets from the client MUST use the connection ID and -token values from the Retry packet. Aside from this, the Initial packet sent -by the client is subject to the same restrictions as the first Initial packet. -A client can either reuse the cryptographic handshake message or construct a -new one at its discretion. 
+~~~ + uint32 QuicVersion; -A client MAY attempt 0-RTT after receiving a Retry packet by sending 0-RTT -packets to the connection ID provided by the server. A client that sends -additional 0-RTT packets without constructing a new cryptographic handshake -message MUST NOT reset the packet number to 0 after a Retry packet, see -{{retry-0rtt-pn}}. + enum { + initial_max_stream_data_bidi_local(0), + initial_max_data(1), + initial_max_bidi_streams(2), + idle_timeout(3), + preferred_address(4), + max_packet_size(5), + stateless_reset_token(6), + ack_delay_exponent(7), + initial_max_uni_streams(8), + disable_migration(9), + initial_max_stream_data_bidi_remote(10), + initial_max_stream_data_uni(11), + max_ack_delay(12), + original_connection_id(13), + (65535) + } TransportParameterId; -A server acknowledges the use of a Retry packet for a connection using the -original_connection_id transport parameter (see -{{transport-parameter-definitions}}). If the server sends a Retry packet, it -MUST include the value of the Original Destination Connection ID field of the -Retry packet (that is, the Destination Connection ID field from the client's -first Initial packet) in the transport parameter. + struct { + TransportParameterId parameter; + opaque value<0..2^16-1>; + } TransportParameter; -If the client received and processed a Retry packet, it validates that the -original_connection_id transport parameter is present and correct; otherwise, it -validates that the transport parameter is absent. A client MUST treat a failed -validation as a connection error of type TRANSPORT_PARAMETER_ERROR. + struct { + select (Handshake.msg_type) { + case client_hello: + QuicVersion initial_version; -A Retry packet does not include a packet number and cannot be explicitly -acknowledged by a client. 
+ case encrypted_extensions: + QuicVersion negotiated_version; + QuicVersion supported_versions<4..2^8-4>; + }; + TransportParameter parameters<22..2^16-1>; + } TransportParameters; + struct { + enum { IPv4(4), IPv6(6), (15) } ipVersion; + opaque ipAddress<4..2^8-1>; + uint16 port; + opaque connectionId<0..18>; + opaque statelessResetToken[16]; + } PreferredAddress; +~~~ +{: #figure-transport-parameters title="Definition of TransportParameters"} -## Cryptographic Handshake Packets {#handshake-packets} +The `extension_data` field of the quic_transport_parameters extension defined in +{{QUIC-TLS}} contains a TransportParameters value. TLS encoding rules are +therefore used to encode the transport parameters. -Once version negotiation is complete, the cryptographic handshake is used to -agree on cryptographic keys. The cryptographic handshake is carried in Initial -({{packet-initial}}) and Handshake ({{packet-handshake}}) packets. +QUIC encodes transport parameters into a sequence of octets, which are then +included in the cryptographic handshake. Once the handshake completes, the +transport parameters declared by the peer are available. Each endpoint +validates the value provided by its peer. In particular, version negotiation +MUST be validated (see {{version-validation}}) before the connection +establishment is considered properly complete. -All these packets use the long header and contain the current QUIC version in -the version field. +Definitions for each of the defined transport parameters are included in +{{transport-parameter-definitions}}. Any given parameter MUST appear +at most once in a given transport parameters extension. An endpoint MUST +treat receipt of duplicate transport parameters as a connection error of +type TRANSPORT_PARAMETER_ERROR. -In order to prevent tampering by version-unaware middleboxes, Initial -packets are protected with connection- and version-specific keys -(Initial keys) as described in {{QUIC-TLS}}. 
This protection does not -provide confidentiality or integrity against on-path attackers, but -provides some level of protection against off-path attackers. +### Transport Parameter Definitions -## Initial Packet {#packet-initial} +An endpoint MAY use the following transport parameters: -The Initial packet uses long headers with a type value of 0x7F. It carries the -first CRYPTO frames sent by the client and server to perform key exchange, and -carries ACKs in either direction. The Initial packet is protected by Initial -keys as described in {{QUIC-TLS}}. +initial_max_data (0x0001): -The Initial packet (shown in {{initial-format}}) has two additional header -fields that are added to the Long Header before the Length field. +: The initial maximum data parameter contains the initial value for the maximum + amount of data that can be sent on the connection. This parameter is encoded + as an unsigned 32-bit integer in units of octets. This is equivalent to + sending a MAX_DATA ({{frame-max-data}}) for the connection immediately after + completing the handshake. If the transport parameter is absent, the connection + starts with a flow control limit of 0. -~~~ -+-+-+-+-+-+-+-+-+ -|1| 0x7f | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Token Length (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Token (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Length (i) ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Packet Number (8/16/32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Payload (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #initial-format title="Initial Packet"} +initial_max_bidi_streams (0x0002): -These fields include the token that was previously provided in a Retry packet or -NEW_TOKEN frame: +: The initial maximum bidirectional streams parameter contains the initial + maximum number of bidirectional streams the peer may initiate, encoded as an + unsigned 16-bit integer. If this parameter is absent or zero, bidirectional + streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this + parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) + immediately after completing the handshake containing the corresponding Stream + ID. For example, a value of 0x05 would be equivalent to receiving a + MAX_STREAM_ID containing 16 when received by a client or 17 when received by a + server. -Token Length: +initial_max_uni_streams (0x0008): -: A variable-length integer specifying the length of the Token field, in bytes. - This value is zero if no token is present. Initial packets sent by the server - MUST set the Token Length field to zero; clients that receive an Initial - packet with a non-zero Token Length field MUST either discard the packet or - generate a connection error of type PROTOCOL_VIOLATION. +: The initial maximum unidirectional streams parameter contains the initial + maximum number of unidirectional streams the peer may initiate, encoded as an + unsigned 16-bit integer. If this parameter is absent or zero, unidirectional + streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this + parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) + immediately after completing the handshake containing the corresponding Stream + ID. 
For example, a value of 0x05 would be equivalent to receiving a + MAX_STREAM_ID containing 18 when received by a client or 19 when received by a + server. -Token: +idle_timeout (0x0003): -: The value of the token. +: The idle timeout is a value in seconds that is encoded as an unsigned 16-bit + integer. If this parameter is absent or zero then the idle timeout is + disabled. -The client and server use the Initial packet type for any packet that contains -an initial cryptographic handshake message. This includes all cases where a new -packet containing the initial cryptographic message needs to be created, such as -the packets sent after receiving a Version Negotiation ({{packet-version}}) or -Retry packet ({{packet-retry}}). +max_packet_size (0x0005): -A server sends its first Initial packet in response to a client Initial. A -server may send multiple Initial packets. The cryptographic key exchange could -require multiple round trips or retransmissions of this data. +: The maximum packet size parameter places a limit on the size of packets that + the endpoint is willing to receive, encoded as an unsigned 16-bit integer. + This indicates that packets larger than this limit will be dropped. The + default for this parameter is the maximum permitted UDP payload of 65527. + Values below 1200 are invalid. This limit only applies to protected packets + ({{packet-protected}}). -The payload of an Initial packet includes a CRYPTO frame (or frames) containing -a cryptographic handshake message, ACK frames, or both. PADDING and -CONNECTION_CLOSE frames are also permitted. An endpoint that receives an -Initial packet containing other frames can either discard the packet as spurious -or treat it as a connection error. +ack_delay_exponent (0x0007): -The first packet sent by a client always includes a CRYPTO frame that contains -the entirety of the first cryptographic handshake message. 
This packet, and the -cryptographic handshake message, MUST fit in a single UDP datagram (see -{{handshake}}). The first CRYPTO frame sent always begins at an offset of 0 -(see {{handshake}}). +: An 8-bit unsigned integer value indicating an exponent used to decode the ACK + Delay field in the ACK frame, see {{frame-ack}}. If this value is absent, a + default value of 3 is assumed (indicating a multiplier of 8). The default + value is also used for ACK frames that are sent in Initial and Handshake + packets. Values above 20 are invalid. -Note that if the server sends a HelloRetryRequest, the client will send a second -Initial packet. This Initial packet will continue the cryptographic handshake -and will contain a CRYPTO frame with an offset matching the size of the CRYPTO -frame sent in the first Initial packet. Cryptographic handshake messages -subsequent to the first do not need to fit within a single UDP datagram. +disable_migration (0x0009): +: The endpoint does not support connection migration ({{migration}}). Peers MUST + NOT send any packets, including probing packets ({{probing}}), from a local + address other than that used to perform the handshake. This parameter is a + zero-length value. -### Connection IDs +max_ack_delay (0x000c): -When an Initial packet is sent by a client which has not previously received a -Retry packet from the server, it populates the Destination Connection ID field -with an unpredictable value. This MUST be at least 8 octets in length. Until a -packet is received from the server, the client MUST use the same value unless it -abandons the connection attempt and starts a new one. The initial Destination -Connection ID is used to determine packet protection keys for Initial packets. +: An 8 bit unsigned integer value indicating the maximum amount of time in + milliseconds by which it will delay sending of acknowledgments. If this + value is absent, a default of 25 milliseconds is assumed. 
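The worked figures given for initial_max_bidi_streams and initial_max_uni_streams (a value of 0x05 mapping to 16, 17, 18, or 19) can be reproduced with a short calculation. The following is a minimal sketch, not part of the draft. It assumes that the least significant bit of a stream ID identifies the initiator (0 for client, 1 for server) and that the next bit marks unidirectional streams. The function name and its arguments are hypothetical.

```python
def equivalent_max_stream_id(stream_count: int, receiver: str,
                             unidirectional: bool) -> int:
    """Return the stream ID an equivalent MAX_STREAM_ID frame would carry.

    `receiver` is the endpoint that received the transport parameter,
    which is the endpoint allowed to open the streams being counted.
    """
    if stream_count < 1:
        raise ValueError("a zero count permits no streams, so no ID applies")
    if receiver not in ("client", "server"):
        raise ValueError("receiver must be 'client' or 'server'")
    # Assumed stream ID layout: bit 0 = initiator, bit 1 = directionality.
    initiator_bit = 0 if receiver == "client" else 1
    direction_bit = 2 if unidirectional else 0
    # Stream IDs of one type are spaced 4 apart, starting at the type bits.
    return 4 * (stream_count - 1) + direction_bit + initiator_bit
```

With a count of 5, this yields 16 and 17 for bidirectional streams and 18 and 19 for unidirectional streams, matching the examples in the parameter definitions.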
-The client populates the Source Connection ID field with a value of its choosing -and sets the SCIL field to match. +Either peer MAY advertise an initial value for the flow control on each type of +stream on which they might receive data. Each of the following transport +parameters is encoded as an unsigned 32-bit integer in units of octets: -The Destination Connection ID field in the server's Initial packet contains a -connection ID that is chosen by the recipient of the packet (i.e., the client); -the Source Connection ID includes the connection ID that the sender of the -packet wishes to use (see {{connection-id}}). The server MUST use consistent -Source Connection IDs during the handshake. +initial_max_stream_data_bidi_local (0x0000): -On first receiving an Initial or Retry packet from the server, the client uses -the Source Connection ID supplied by the server as the Destination Connection ID -for subsequent packets. That means that a client might change the Destination -Connection ID twice during connection establishment. Once a client has received -an Initial packet from the server, it MUST discard any packet it receives with a -different Source Connection ID. +: The initial stream maximum data for bidirectional, locally-initiated streams + parameter contains the initial flow control limit for newly created + bidirectional streams opened by the endpoint that sets the transport + parameter. In client transport parameters, this applies to streams with an + identifier ending in 0x0; in server transport parameters, this applies to + streams ending in 0x1. +initial_max_stream_data_bidi_remote (0x000a): -### Tokens +: The initial stream maximum data for bidirectional, peer-initiated streams + parameter contains the initial flow control limit for newly created + bidirectional streams opened by the endpoint that receives the transport + parameter. 
In client transport parameters, this applies to streams with an + identifier ending in 0x1; in server transport parameters, this applies to + streams ending in 0x0. -If the client has a token received in a NEW_TOKEN frame on a previous connection -to what it believes to be the same server, it can include that value in the -Token field of its Initial packet. +initial_max_stream_data_uni (0x000b): -A token allows a server to correlate activity between connections. -Specifically, the connection where the token was issued, and any connection -where it is used. Clients that want to break continuity of identity with a -server MAY discard tokens provided using the NEW_TOKEN frame. Tokens obtained -in Retry packets MUST NOT be discarded. +: The initial stream maximum data for unidirectional streams parameter contains + the initial flow control limit for newly created unidirectional streams opened + by the endpoint that receives the transport parameter. In client transport + parameters, this applies to streams with an identifier ending in 0x3; in + server transport parameters, this applies to streams ending in 0x2. -A client SHOULD NOT reuse a token. Reusing a token allows connections to be -linked by entities on the network path (see {{migration-linkability}}). A -client MUST NOT reuse a token if it believes that its point of network -attachment has changed since the token was last used; that is, if there is a -change in its local IP address or network interface. A client needs to start -the connection process over if it migrates prior to completing the handshake. +If present, transport parameters that set initial stream flow control limits are +equivalent to sending a MAX_STREAM_DATA frame ({{frame-max-stream-data}}) on +every stream of the corresponding type immediately after opening. If the +transport parameter is absent, streams of that type start with a flow control +limit of 0. 
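Choosing which of the three stream flow control parameters applies to a given stream depends on who sent the parameters and who opened the stream. The sketch below illustrates that selection under the same assumed stream ID layout that the definitions imply (bit 0 for the initiator, bit 1 for directionality). The function names are hypothetical and the code is illustrative only.

```python
from typing import Dict, Optional


def limit_parameter_for_stream(stream_id: int,
                               params_sender: str) -> Optional[str]:
    """Name the parameter, from `params_sender`'s transport parameters,
    that sets the initial flow control limit for `stream_id`.

    Returns None for unidirectional streams opened by `params_sender`,
    because that endpoint only sends on them and advertises no limit.
    """
    if params_sender not in ("client", "server"):
        raise ValueError("params_sender must be 'client' or 'server'")
    initiator = "server" if stream_id & 0x1 else "client"
    unidirectional = bool(stream_id & 0x2)
    opened_by_sender = initiator == params_sender
    if unidirectional:
        return None if opened_by_sender else "initial_max_stream_data_uni"
    if opened_by_sender:
        return "initial_max_stream_data_bidi_local"
    return "initial_max_stream_data_bidi_remote"


def initial_stream_limit(stream_id: int, params_sender: str,
                         params: Dict[str, int]) -> int:
    """Look up the initial limit; an absent parameter means a limit of 0."""
    name = limit_parameter_for_stream(stream_id, params_sender)
    if name is None:
        return 0
    return params.get(name, 0)
```

For example, in client transport parameters stream 0 is governed by initial_max_stream_data_bidi_local and stream 3 by initial_max_stream_data_uni, as the definitions state.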
-When a server receives an Initial packet with an address validation token, it -SHOULD attempt to validate it. If the token is invalid then the server SHOULD -proceed as if the client did not have a validated address, including potentially -sending a Retry. If the validation succeeds, the server SHOULD then allow the -handshake to proceed (see {{stateless-retry}}). +A server MUST include the original_connection_id transport parameter if it sent +a Retry packet: -Note: +original_connection_id (0x000d): -: The rationale for treating the client as unvalidated rather than discarding - the packet is that the client might have received the token in a previous - connection using the NEW_TOKEN frame, and if the server has lost state, it - might be unable to validate the token at all, leading to connection failure if - the packet is discarded. A server MAY encode tokens provided with NEW_TOKEN - frames and Retry packets differently, and validate the latter more strictly. +: The value of the Destination Connection ID field from the first Initial packet + sent by the client. This transport parameter is only sent by the server. -In a stateless design, a server can use encrypted and authenticated tokens to -pass information to clients that the server can later recover and use to -validate a client address. Tokens are not integrated into the cryptographic -handshake and so they are not authenticated. For instance, a client might be -able to reuse a token. To avoid attacks that exploit this property, a server -can limit its use of tokens to only the information needed validate client -addresses. +A server MAY include the following transport parameters: +stateless_reset_token (0x0006): -### Starting Packet Numbers +: The Stateless Reset Token is used in verifying a stateless reset, see + {{stateless-reset}}. This parameter is a sequence of 16 octets. -The first Initial packet sent by either endpoint contains a packet number of -0. 
The packet number MUST increase monotonically thereafter. Initial packets -are in a different packet number space to other packets (see -{{packet-numbers}}). +preferred_address (0x0004): +: The server's Preferred Address is used to effect a change in server address at + the end of the handshake, as described in {{preferred-address}}. -### 0-RTT Packet Numbers {#retry-0rtt-pn} +A client MUST NOT include a stateless reset token or a preferred address. A +server MUST treat receipt of either transport parameter as a connection error of +type TRANSPORT_PARAMETER_ERROR. -Packet numbers for 0-RTT protected packets use the same space as 1-RTT protected -packets. -After a client receives a Retry or Version Negotiation packet, 0-RTT packets are -likely to have been lost or discarded by the server. A client MAY attempt to -resend data in 0-RTT packets after it sends a new Initial packet. +### Values of Transport Parameters for 0-RTT {#zerortt-parameters} -A client MUST NOT reset the packet number it uses for 0-RTT packets. The keys -used to protect 0-RTT packets will not change as a result of responding to a -Retry or Version Negotiation packet unless the client also regenerates the -cryptographic handshake message. Sending packets with the same packet number in -that case is likely to compromise the packet protection for all 0-RTT packets -because the same key and nonce could be used to protect different content. +A client that attempts to send 0-RTT data MUST remember the transport parameters +used by the server. The transport parameters that the server advertises during +connection establishment apply to all connections that are resumed using the +keying material established during that handshake. Remembered transport +parameters apply to the new connection until the handshake completes and new +transport parameters from the server can be provided. 
-Receiving a Retry or Version Negotiation packet, especially a Retry that changes -the connection ID used for subsequent packets, indicates a strong possibility -that 0-RTT packets could be lost. A client only receives acknowledgments for -its 0-RTT packets once the handshake is complete. Consequently, a server might -expect 0-RTT packets to start with a packet number of 0. Therefore, in -determining the length of the packet number encoding for 0-RTT packets, a client -MUST assume that all packets up to the current packet number are in flight, -starting from a packet number of 0. Thus, 0-RTT packets could need to use a -longer packet number encoding. +A server can remember the transport parameters that it advertised, or store an +integrity-protected copy of the values in the ticket and recover the information +when accepting 0-RTT data. A server uses the transport parameters in +determining whether to accept 0-RTT data. -A client SHOULD instead generate a fresh cryptographic handshake message and -start packet numbers from 0. This ensures that new 0-RTT packets will not use -the same keys, avoiding any risk of key and nonce reuse; this also prevents -0-RTT packets from previous handshake attempts from being accepted as part of -the connection. +A server MAY accept 0-RTT and subsequently provide different values for +transport parameters for use in the new connection. If 0-RTT data is accepted +by the server, the server MUST NOT reduce any limits or alter any values that +might be violated by the client with its 0-RTT data. In particular, a server +that accepts 0-RTT data MUST NOT set values for initial_max_data, +initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, and +initial_max_stream_data_uni that are smaller than the remembered value of those +parameters. Similarly, a server MUST NOT reduce the value of +initial_max_bidi_streams or initial_max_uni_streams. 
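A server that accepts 0-RTT could check its new transport parameters against the remembered ones before sending them. The following is a minimal sketch of such a check, assuming parameters are held as a mapping from name to integer value and that an absent parameter counts as 0. The function and constant names are hypothetical.

```python
from typing import Dict, List

# Parameters that a server accepting 0-RTT must not reduce.
MUST_NOT_REDUCE = (
    "initial_max_data",
    "initial_max_stream_data_bidi_local",
    "initial_max_stream_data_bidi_remote",
    "initial_max_stream_data_uni",
    "initial_max_bidi_streams",
    "initial_max_uni_streams",
)


def reduced_parameters(remembered: Dict[str, int],
                       new: Dict[str, int]) -> List[str]:
    """List the parameters whose new value is below the remembered value.

    An empty result means the new values are compatible with accepting
    0-RTT data sent under the remembered limits.
    """
    return [
        name for name in MUST_NOT_REDUCE
        if new.get(name, 0) < remembered.get(name, 0)
    ]
```

A server that finds a non-empty result would either raise the offending values or decline 0-RTT.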
+Omitting or setting a zero value for certain transport parameters can result in +0-RTT data being enabled, but not usable. The applicable subset of transport +parameters that permit sending of application data SHOULD be set to non-zero +values for 0-RTT. This includes initial_max_data and either +initial_max_bidi_streams and initial_max_stream_data_bidi_remote, or +initial_max_uni_streams and initial_max_stream_data_uni. -### Minimum Packet Size +The value of the server's previous preferred_address MUST NOT be used when +establishing a new connection; rather, the client should wait to observe the +server's new preferred_address value in the handshake. -The payload of a UDP datagram carrying the Initial packet MUST be expanded to at -least 1200 octets (see {{packetization}}), by adding PADDING frames to the -Initial packet and/or by combining the Initial packet with a 0-RTT packet (see -{{packet-coalesce}}). +A server MUST reject 0-RTT data or even abort a handshake if the implied values +for transport parameters cannot be supported. -## Handshake Packet {#packet-handshake} +### New Transport Parameters -A Handshake packet uses long headers with a type value of 0x7D. It is -used to carry acknowledgments and cryptographic handshake messages from the -server and client. +New transport parameters can be used to negotiate new protocol behavior. An +endpoint MUST ignore transport parameters that it does not support. Absence of +a transport parameter therefore disables any optional protocol feature that is +negotiated using the parameter. -A server sends its cryptographic handshake in one or more Handshake packets in -response to an Initial packet if it does not send a Retry packet. Once a client -has received a Handshake packet from a server, it uses Handshake packets to send -subsequent cryptographic handshake messages and acknowledgments to the server. +New transport parameters can be registered according to the rules in +{{iana-transport-parameters}}. 
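The handling rules for unknown and duplicate parameters can be illustrated with a small parser for the `parameters` vector of the TransportParameters struct. This is a sketch under stated assumptions, not a reference implementation: it relies on TLS encoding rules giving the vector and each `value` a 2-octet length prefix and each identifier 2 octets, it covers only the `parameters` field (not the version fields), and it does not enforce the 22-octet minimum. The names `parse_parameters` and `TransportParameterError` are hypothetical.

```python
import struct
from typing import Dict

# Identifiers from the TransportParameterId enum.
KNOWN_PARAMETERS = {
    0: "initial_max_stream_data_bidi_local",
    1: "initial_max_data",
    2: "initial_max_bidi_streams",
    3: "idle_timeout",
    4: "preferred_address",
    5: "max_packet_size",
    6: "stateless_reset_token",
    7: "ack_delay_exponent",
    8: "initial_max_uni_streams",
    9: "disable_migration",
    10: "initial_max_stream_data_bidi_remote",
    11: "initial_max_stream_data_uni",
    12: "max_ack_delay",
    13: "original_connection_id",
}


class TransportParameterError(Exception):
    """Stands in for a connection error of type TRANSPORT_PARAMETER_ERROR."""


def parse_parameters(data: bytes) -> Dict[str, bytes]:
    """Parse a length-prefixed vector of TransportParameter entries."""
    if len(data) < 2:
        raise TransportParameterError("missing vector length")
    (total,) = struct.unpack_from("!H", data, 0)
    if total != len(data) - 2:
        raise TransportParameterError("vector length mismatch")
    offset = 2
    seen = set()
    parsed: Dict[str, bytes] = {}
    while offset < len(data):
        if len(data) - offset < 4:
            raise TransportParameterError("truncated parameter header")
        param_id, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise TransportParameterError("truncated parameter value")
        value = data[offset:offset + length]
        offset += length
        # Duplicates are an error even for parameters that are not understood.
        if param_id in seen:
            raise TransportParameterError("duplicate parameter")
        seen.add(param_id)
        name = KNOWN_PARAMETERS.get(param_id)
        if name is not None:  # unsupported parameters are ignored
            parsed[name] = value
    return parsed
```

Rejecting duplicates before the known-identifier lookup is a design choice in this sketch; the text requires only that a duplicate be treated as a connection error.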
-The Destination Connection ID field in a Handshake packet contains a connection -ID that is chosen by the recipient of the packet; the Source Connection ID -includes the connection ID that the sender of the packet wishes to use (see -{{connection-id-encoding}}). -The first Handshake packet sent by a server contains a packet number of 0. -Handshake packets are their own packet number space. Packet numbers are -incremented normally for other Handshake packets. +### Version Negotiation Validation {#version-validation} -Servers MUST NOT send more than three times as many bytes as the number of bytes -received prior to verifying the client's address. Source addresses can be -verified through an address validation token (delivered via a Retry packet or -a NEW_TOKEN frame) or by processing any message from the client encrypted using -the Handshake keys. This limit exists to mitigate amplification attacks. +Though the cryptographic handshake has integrity protection, two forms of QUIC +version downgrade are possible. In the first, an attacker replaces the QUIC +version in the Initial packet. In the second, a fake Version Negotiation packet +is sent by an attacker. To protect against these attacks, the transport +parameters include three fields that encode version information. These +parameters are used to retroactively authenticate the choice of version (see +{{version-negotiation}}). -In order to prevent this limit causing a handshake deadlock, the client SHOULD -always send a packet upon a handshake timeout, as described in -{{QUIC-RECOVERY}}. If the client has no data to retransmit and does not have -Handshake keys, it SHOULD send an Initial packet in a UDP datagram of at least -1200 octets. If the client has Handshake keys, it SHOULD send a Handshake -packet. +The cryptographic handshake provides integrity protection for the negotiated +version as part of the transport parameters (see {{transport-parameters}}). 
As +a result, attacks on version negotiation by an attacker can be detected. -The payload of this packet contains CRYPTO frames and could contain PADDING, or -ACK frames. Handshake packets MAY contain CONNECTION_CLOSE or APPLICATION_CLOSE -frames. Endpoints MUST treat receipt of Handshake packets with other frames as -a connection error. +The client includes the initial_version field in its transport parameters. The +initial_version is the version that the client initially attempted to use. If +the server did not send a Version Negotiation packet {{packet-version}}, this +will be identical to the negotiated_version field in the server transport +parameters. +A server that processes all packets in a stateful fashion can remember how +version negotiation was performed and validate the initial_version value. -## Protected Packets {#packet-protected} +A server that does not maintain state for every packet it receives (i.e., a +stateless server) uses a different process. If the initial_version matches the +version of QUIC that is in use, a stateless server can accept the value. -All QUIC packets use packet protection. Packets that are protected with the -static handshake keys or the 0-RTT keys are sent with long headers; all packets -protected with 1-RTT keys are sent with short headers. The different packet -types explicitly indicate the encryption level and therefore the keys that are -used to remove packet protection. 0-RTT and 1-RTT protected packets share a -single packet number space. +If the initial_version is different from the version of QUIC that is in use, a +stateless server MUST check that it would have sent a Version Negotiation packet +if it had received a packet with the indicated initial_version. If a server +would have accepted the version included in the initial_version and the value +differs from the QUIC version that is in use, the server MUST terminate the +connection with a VERSION_NEGOTIATION_ERROR error. 
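The stateless server check on initial_version described above can be sketched as follows. This is a hedged illustration rather than normative text: the function name `check_initial_version`, the string error value, and the use of a plain list of supported versions are assumptions made for the example.

```python
VERSION_NEGOTIATION_ERROR = "VERSION_NEGOTIATION_ERROR"


def check_initial_version(initial_version, version_in_use, supported_versions):
    """Validate the client's initial_version without per-packet state.

    Returns None when the value is acceptable, or the error to close
    the connection with.
    """
    if initial_version == version_in_use:
        # No version negotiation is implied; accept the value.
        return None
    if initial_version in supported_versions:
        # The server would have accepted this version directly and would
        # not have sent a Version Negotiation packet, so the change of
        # version was not caused by this server.
        return VERSION_NEGOTIATION_ERROR
    # The server would have sent Version Negotiation for this version.
    return None
```

The version numbers passed in are opaque integers here; the sketch only relies on equality and membership.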
-Packets protected with handshake keys only use packet protection to ensure that -the sender of the packet is on the network path. This packet protection is not -effective confidentiality protection; any entity that receives the Initial -packet from a client can recover the keys necessary to remove packet protection -or to generate packets that will be successfully authenticated. - -Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality -and data origin authentication; the cryptographic handshake ensures that only -the communicating endpoints receive the corresponding keys. +The server includes both the version of QUIC that is in use and a list of the +QUIC versions that the server supports. -Packets protected with 0-RTT keys use a type value of 0x7C. The connection ID -fields for a 0-RTT packet MUST match the values used in the Initial packet -({{packet-initial}}). +The negotiated_version field is the version that is in use. This MUST be set by +the server to the value that is on the Initial packet that it accepts (not an +Initial packet that triggers a Retry or Version Negotiation packet). A client +that receives a negotiated_version that does not match the version of QUIC that +is in use MUST terminate the connection with a VERSION_NEGOTIATION_ERROR error +code. -The version field for protected packets is the current QUIC version. +The server includes a list of versions that it would send in any version +negotiation packet ({{packet-version}}) in the supported_versions field. The +server populates this field even if it did not send a version negotiation +packet. -The packet number field contains a packet number, which has additional -confidentiality protection that is applied after packet protection is applied -(see {{QUIC-TLS}} for details). The underlying packet number increases with -each packet sent, see {{packet-numbers}} for details. 
+The client validates that the negotiated_version is included in the +supported_versions list and - if version negotiation was performed - that it +would have selected the negotiated version. A client MUST terminate the +connection with a VERSION_NEGOTIATION_ERROR error code if the current QUIC +version is not listed in the supported_versions list. A client MUST terminate +with a VERSION_NEGOTIATION_ERROR error code if version negotiation occurred but +it would have selected a different version based on the value of the +supported_versions list. -The payload is protected using authenticated encryption. {{QUIC-TLS}} describes -packet protection in detail. After decryption, the plaintext consists of a -sequence of frames, as described in {{frames}}. +When an endpoint accepts multiple QUIC versions, it can potentially interpret +transport parameters as they are defined by any of the QUIC versions it +supports. The version field in the QUIC packet header is authenticated using +transport parameters. The position and the format of the version fields in +transport parameters MUST either be identical across different QUIC versions, or +be unambiguously different to ensure no confusion about their interpretation. +One way that a new format could be introduced is to define a TLS extension with +a different codepoint. -## Coalescing Packets {#packet-coalesce} +## Proof of Source Address Ownership {#address-validation} -A sender can coalesce multiple QUIC packets (typically a Cryptographic Handshake -packet and a Protected packet) into one UDP datagram. This can reduce the -number of UDP datagrams needed to send application data during the handshake and -immediately afterwards. It is not necessary for senders to coalesce -packets, though failing to do so will require sending a significantly -larger number of datagrams during the handshake. Receivers MUST -be able to process coalesced packets. 
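The client-side checks on negotiated_version and supported_versions can be sketched in the same way. The function name `client_validate_versions`, the ordered preference list, and the `negotiation_occurred` flag are hypothetical; the specification does not prescribe how a client records its preferences.

```python
VERSION_NEGOTIATION_ERROR = "VERSION_NEGOTIATION_ERROR"


def client_validate_versions(version_in_use, negotiated_version,
                             supported_versions, client_preference,
                             negotiation_occurred):
    """Return None if the server's version fields are consistent.

    `client_preference` lists the client's versions, most preferred
    first (an assumption of this sketch).
    """
    if negotiated_version != version_in_use:
        return VERSION_NEGOTIATION_ERROR
    if version_in_use not in supported_versions:
        return VERSION_NEGOTIATION_ERROR
    if negotiation_occurred:
        # Recompute the choice the client would have made from the
        # authenticated list and compare it with the version in use.
        choice = next(
            (v for v in client_preference if v in supported_versions), None)
        if choice != version_in_use:
            return VERSION_NEGOTIATION_ERROR
    return None
```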
+Transport protocols commonly spend a round trip checking that a client owns the +transport address (IP and port) that it claims. Verifying that a client can +receive packets sent to its claimed transport address protects against spoofing +of this information by malicious clients. -Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, -Handshake, 1-RTT) makes it more likely the receiver will be able to process all -the packets in a single pass. A packet with a short header does not include a -length, so it will always be the last packet included in a UDP datagram. +This technique is used primarily to prevent QUIC from being used for traffic +amplification attacks. In such an attack, a packet is sent to a server with +spoofed source address information that identifies a victim. If a server +generates more or larger packets in response to that packet, the attacker can +use the server to send more data toward the victim than it would be able to send +on its own. -Senders MUST NOT coalesce QUIC packets with different Destination Connection -IDs into a single UDP datagram. Receivers SHOULD ignore any subsequent packets -with a different Destination Connection ID than the first packet in the -datagram. +Several methods are used in QUIC to mitigate this attack. Firstly, the initial +handshake packet is sent in a UDP datagram that contains at least 1200 octets of +UDP payload. This allows a server to send a similar amount of data without +risking causing an amplification attack toward an unproven remote address. -Every QUIC packet that is coalesced into a single UDP datagram is separate and -complete. Though the values of some fields in the packet header might be -redundant, no fields are omitted. The receiver of coalesced QUIC packets MUST -individually process each QUIC packet and separately acknowledge them, as if -they were received as the payload of different UDP datagrams. 
If one or more -packets in a datagram cannot be processed yet (because the keys are not yet -available) or processing fails (decryption failure, unknown type, etc.), the -receiver MUST still attempt to process the remaining packets. The skipped -packets MAY either be discarded or buffered for later processing, just as if the -packets were received out-of-order in separate datagrams. +A server eventually confirms that a client has received its messages when the +first Handshake-level message is received. This might be insufficient, +either because the server wishes to avoid the computational cost of completing +the handshake, or it might be that the size of the packets that are sent during +the handshake is too large. This is especially important for 0-RTT, where the +server might wish to provide application data traffic - such as a response to a +request - in response to the data carried in the early data from the client. -Retry ({{packet-retry}}) and Version Negotiation ({{packet-version}}) packets -cannot be coalesced. +To send additional data prior to completing the cryptographic handshake, the +server then needs to validate that the client owns the address that it claims. +Source address validation is therefore performed by the core transport +protocol during the establishment of a connection. -## Connection ID Encoding +A different type of source address validation is performed after a connection +migration, see {{migrate-validate}}. -A connection ID is used to ensure consistent routing of packets, as described in -{{connection-id}}. The long header contains two connection IDs: the Destination -Connection ID is chosen by the recipient of the packet and is used to provide -consistent routing; the Source Connection ID is used to set the Destination -Connection ID used by the peer. -During the handshake, packets with the long header are used to establish the -connection ID that each endpoint uses. 
Each endpoint uses the Source Connection -ID field to specify the connection ID that is used in the Destination Connection -ID field of packets being sent to them. Upon receiving a packet, each endpoint -sets the Destination Connection ID it sends to match the value of the Source -Connection ID that they receive. +### Client Address Validation Procedure -During the handshake, a client can receive both a Retry and an Initial packet, -and thus be given two opportunities to update the Destination Connection ID it -sends. A client MUST only change the value it sends in the Destination -Connection ID in response to the first packet of each type it receives from the -server (Retry or Initial); a server MUST set its value based on the Initial -packet. Any additional changes are not permitted; if subsequent packets of -those types include a different Source Connection ID, they MUST be discarded. -This avoids problems that might arise from stateless processing of multiple -Initial packets producing different connection IDs. +QUIC uses token-based address validation. Any time the server wishes +to validate a client address, it provides the client with a token. As +long as the token's authenticity can be checked (see +{{token-integrity}}) and the client is able to return that token, it +proves to the server that it received the token. -Short headers only include the Destination Connection ID and omit the explicit -length. The length of the Destination Connection ID field is expected to be -known to endpoints. +Upon receiving the client's Initial packet, the server can request +address validation by sending a Retry packet containing a token. This +token is repeated in the client's next Initial packet. Because the +token is consumed by the server that generates it, there is no need +for a single well-defined format. 
A token could include information +about the claimed client address (IP and port), a timestamp, and any +other supplementary information the server will need to validate the +token in the future. -Endpoints using a connection-ID based load balancer could agree with the load -balancer on a fixed or minimum length and on an encoding for connection IDs. -This fixed portion could encode an explicit length, which allows the entire -connection ID to vary in length and still be used by the load balancer. +The Retry packet is sent to the client and a legitimate client will +respond with an Initial packet containing the token from the Retry packet +when it continues the handshake. In response to receiving the token, a +server can either abort the connection or permit it to proceed. -The very first packet sent by a client includes a random value for Destination -Connection ID. The same value MUST be used for all 0-RTT packets sent on that -connection ({{packet-protected}}). This randomized value is used to determine -the packet protection keys for Initial packets (see Section 5.2 of -{{QUIC-TLS}}). +A connection MAY be accepted without address validation - or with only limited +validation - but a server SHOULD limit the data it sends toward an unvalidated +address. Successful completion of the cryptographic handshake implicitly +provides proof that the client has received packets from the server. -A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs -selected by the client, swapped to ensure correct routing toward the client. +The client should allow for additional Retry packets being sent in +response to Initial packets sent containing a token. There are several +situations in which the server might not be able to use the previously +generated token to validate the client's address and must send a new +Retry. A reasonable limit to the number of tries the client allows +for, before giving up, is 3. 
That is, the client MUST echo the +address validation token from a new Retry packet up to 3 times. After +that, it MAY give up on the connection attempt. -The connection ID can change over the lifetime of a connection, especially in -response to connection migration ({{migration}}). NEW_CONNECTION_ID frames -({{frame-new-connection-id}}) are used to provide new connection ID values. +### Address Validation for Future Connections -## Packet Numbers {#packet-numbers} +A server MAY provide clients with an address validation token during one +connection that can be used on a subsequent connection. Address validation is +especially important with 0-RTT because a server potentially sends a significant +amount of data to a client in response to 0-RTT data. -The packet number is an integer in the range 0 to 2^62-1. The value is used in -determining the cryptographic nonce for packet protection. Each endpoint -maintains a separate packet number for sending and receiving. +The server uses the NEW_TOKEN frame {{frame-new-token}} to provide the +client with an address validation token that can be used to validate +future connections. The client may then use this token to validate +future connections by including it in the Initial packet's header. +The client MUST NOT use the token provided in a Retry for future +connections. -Packet numbers are divided into 3 spaces in QUIC: +Unlike the token that is created for a Retry packet, there might be some time +between when the token is created and when the token is subsequently used. +Thus, a resumption token SHOULD include an expiration time. The server MAY +include either an explicit expiration time or an issued timestamp and +dynamically calculate the expiration time. It is also unlikely that the client +port number is the same on two different connections; validating the port is +therefore unlikely to be successful. -- Initial space: All Initial packets {{packet-initial}} are in this space. 
-- Handshake space: All Handshake packets {{packet-handshake}} are in this space. -- Application data space: All 0-RTT and 1-RTT encrypted packets - {{packet-protected}} are in this space. -As described in {{QUIC-TLS}}, each packet type uses different protection keys. +### Address Validation Token Integrity {#token-integrity} -Conceptually, a packet number space is the context in which a packet can be -processed and acknowledged. Initial packets can only be sent with Initial -packet protection keys and acknowledged in packets which are also Initial -packets. Similarly, Handshake packets are sent at the Handshake encryption -level and can only be acknowledged in Handshake packets. +An address validation token MUST be difficult to guess. Including a large +enough random value in the token would be sufficient, but this depends on the +server remembering the value it sends to clients. -This enforces cryptographic separation between the data sent in the different -packet sequence number spaces. Each packet number space starts at packet number -0. Subsequent packets sent in the same packet number space MUST increase the -packet number by at least one. +A token-based scheme allows the server to offload any state associated with +validation to the client. For this design to work, the token MUST be covered by +integrity protection against modification or falsification by clients. Without +integrity protection, malicious clients could generate or guess values for +tokens that would be accepted by the server. Only the server requires access to +the integrity protection key for tokens. -0-RTT and 1-RTT data exist in the same packet number space to make loss recovery -algorithms easier to implement between the two packet types. +## Stateless Retries {#stateless-retry} -A QUIC endpoint MUST NOT reuse a packet number within the same packet number -space in one connection (that is, under the same cryptographic keys). 
If the -packet number for sending reaches 2^62 - 1, the sender MUST close the connection -without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY -send a Stateless Reset ({{stateless-reset}}) in response to further packets that -it receives. +A server can process an Initial packet from a client without committing any +state. This allows a server to perform address validation +({{address-validation}}), or to defer connection establishment costs. -In the QUIC long and short packet headers, the number of bits required to -represent the packet number is reduced by including only a variable number of -the least significant bits of the packet number. One or two of the most -significant bits of the first octet determine how many bits of the packet -number are provided, as shown in {{pn-encodings}}. +A server that generates a response to an Initial packet without retaining +connection state MUST use the Retry packet ({{packet-retry}}). This packet +causes a client to restart the connection attempt and includes the token in the +new Initial packet ({{packet-initial}}) to prove source address ownership. -| First octet pattern | Encoded Length | Bits Present | -|:--------------------|:---------------|:-------------| -| 0b0xxxxxxx | 1 octet | 7 | -| 0b10xxxxxx | 2 | 14 | -| 0b11xxxxxx | 4 | 30 | -{: #pn-encodings title="Packet Number Encodings for Packet Headers"} -Note that these encodings are similar to those in {{integer-encoding}}, but -use different values. -The encoded packet number is protected as described in Section 5.3 -{{QUIC-TLS}}. Protection of the packet number is removed prior to recovering the -full packet number. The full packet number is reconstructed at the receiver -based on the number of significant bits present, the value of those bits, and -the largest packet number received on a successfully authenticated -packet. Recovering the full packet number is necessary to successfully remove -packet protection. 
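The packet number encodings in the table above can be illustrated with a short sketch that operates on the packet number field after packet number protection has been removed. The function names are hypothetical, and the sketch assumes the field is in network byte order with the length pattern in the most significant bits of the first octet.

```python
def encode_truncated_pn(full_pn, bits):
    """Encode the `bits` least significant bits of `full_pn`.

    `bits` must be 7, 14 or 30, matching the table of encodings.
    """
    value = full_pn & ((1 << bits) - 1)
    if bits == 7:
        return bytes([value])                       # 0b0xxxxxxx
    if bits == 14:
        return (0x8000 | value).to_bytes(2, "big")  # 0b10xxxxxx
    if bits == 30:
        return (0xC0000000 | value).to_bytes(4, "big")  # 0b11xxxxxx
    raise ValueError("unsupported packet number size")


def parse_truncated_pn(data):
    """Return (truncated_pn, bits, octets_consumed) from `data`."""
    first = data[0]
    if first & 0x80 == 0:
        return first & 0x7F, 7, 1
    if first & 0xC0 == 0x80:
        length, bits = 2, 14
    else:
        length, bits = 4, 30
    if len(data) < length:
        raise ValueError("truncated packet number field")
    value = int.from_bytes(data[:length], "big") & ((1 << bits) - 1)
    return value, bits, length
```

The value returned by `parse_truncated_pn` is only the truncated packet number; the receiver still has to reconstruct the full packet number from it.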
+# Path Validation {#migrate-validate} -Once packet number protection is removed, the packet number is decoded by -finding the packet number value that is closest to the next expected packet. -The next expected packet is the highest received packet number plus one. For -example, if the highest successfully authenticated packet had a packet number of -0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as -0xaa8309b3. -Example pseudo-code for packet number decoding can be found in -{{sample-packet-number-decoding}}. +Path validation is used by an endpoint to verify reachability of a peer over a +specific path. That is, it tests reachability between a specific local address +and a specific peer address, where an address is the two-tuple of IP address and +port. Path validation tests that packets can be both sent to and received from +a peer. -The sender MUST use a packet number size able to represent more than twice as -large a range than the difference between the largest acknowledged packet and -packet number being sent. A peer receiving the packet will then correctly -decode the packet number, unless the packet is delayed in transit such that it -arrives after many higher-numbered packets have been received. An endpoint -SHOULD use a large enough packet number encoding to allow the packet number to -be recovered even if the packet arrives after packets that are sent afterwards. +Path validation is used during connection migration (see {{migration}} and +{{preferred-address}}) by the migrating endpoint to verify reachability of a +peer from a new local address. Path validation is also used by the peer to +verify that the migrating endpoint is able to receive packets sent to its +new address. That is, that the packets received from the migrating endpoint do +not carry a spoofed source address. 
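The packet number decoding rule (pick the value closest to the next expected packet number) can be sketched as below. This is an illustrative version and is not a copy of the pseudo-code in {{sample-packet-number-decoding}}; the function and variable names are assumptions.

```python
def decode_packet_number(largest_pn, truncated_pn, pn_nbits):
    """Recover a full packet number from its truncated encoding.

    largest_pn   -- largest packet number successfully authenticated
    truncated_pn -- value carried in the packet header
    pn_nbits     -- number of bits in the encoding (7, 14 or 30)
    """
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1
    # Combine the high bits of the expected value with the received bits.
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn
    # Move the candidate by one window if that brings it closer to the
    # expected value, staying within the range 0 to 2^62-1.
    if (candidate_pn <= expected_pn - pn_hwin
            and candidate_pn < (1 << 62) - pn_win):
        return candidate_pn + pn_win
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win
    return candidate_pn
```

With the example values from the text, `decode_packet_number(0xaa82f30e, 0x9b3, 14)` yields `0xaa8309b3`.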
-As a result, the size of the packet number encoding is at least one more than -the base 2 logarithm of the number of contiguous unacknowledged packet numbers, -including the new packet. +Path validation can be used at any time by either endpoint. For instance, an +endpoint might check that a peer is still in possession of its address after a +period of quiescence. -For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, -sending a packet with a number of 0x6b2d79 requires a packet number encoding -with 14 bits or more; whereas the 30-bit packet number encoding is needed to -send a packet with a number of 0x6bc107. +Path validation is not designed as a NAT traversal mechanism. Though the +mechanism described here might be effective for the creation of NAT bindings +that support NAT traversal, the expectation is that one or other peer is able to +receive packets without first having sent a packet on that path. Effective NAT +traversal needs additional synchronization mechanisms that are not provided +here. -A receiver MUST discard a newly unprotected packet unless it is certain that it -has not processed another packet with the same packet number from the same -packet number space. Duplicate suppression MUST happen after removing packet -protection for the reasons described in Section 9.3 of {{QUIC-TLS}}. An -efficient algorithm for duplicate suppression can be found in Section 3.4.3 of -{{?RFC2406}}. +An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that are used for +path validation with other frames. For instance, an endpoint may pad a packet +carrying a PATH_CHALLENGE for PMTU discovery, or an endpoint may bundle a +PATH_RESPONSE with its own PATH_CHALLENGE. -A Version Negotiation packet ({{packet-version}}) does not include a packet -number. The Retry packet ({{packet-retry}}) has special rules for populating -the packet number field. 
+When probing a new path, an endpoint might want to ensure that its peer has an +unused connection ID available for responses. The endpoint can send +NEW_CONNECTION_ID and PATH_CHALLENGE frames in the same packet. This ensures +that an unused connection ID will be available to the peer when sending a +response. +## Initiation -# Frames and Frame Types {#frames} +To initiate path validation, an endpoint sends a PATH_CHALLENGE frame containing +a random payload on the path to be validated. -The payload of all packets, after removing packet protection, consists of a -sequence of frames, as shown in {{packet-frames}}. Version Negotiation and -Stateless Reset do not contain frames. +An endpoint MAY send additional PATH_CHALLENGE frames to handle packet loss. An +endpoint SHOULD NOT send a PATH_CHALLENGE more frequently than it would an +Initial packet, ensuring that connection migration is no more load on a new path +than establishing a new connection. -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame 1 (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame 2 (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame N (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #packet-frames title="QUIC Payload"} +The endpoint MUST use fresh random data in every PATH_CHALLENGE frame so that it +can associate the peer's response with the causative PATH_CHALLENGE. -QUIC payloads MUST contain at least one frame, and MAY contain multiple -frames and multiple frame types. -Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC packet -boundary. 
Each frame begins with a Frame Type, indicating its type, followed by -additional type-dependent fields: +## Response -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame Type (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Type-Dependent Fields (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #frame-layout title="Generic Frame Layout"} +On receiving a PATH_CHALLENGE frame, an endpoint MUST respond immediately by +echoing the data contained in the PATH_CHALLENGE frame in a PATH_RESPONSE frame, +with the following stipulation. Since a PATH_CHALLENGE might be sent from a +spoofed address, an endpoint MAY limit the rate at which it sends PATH_RESPONSE +frames and MAY silently discard PATH_CHALLENGE frames that would cause it to +respond at a higher rate. -The frame types defined in this specification are listed in {{frame-types}}. -The Frame Type in STREAM frames is used to carry other frame-specific flags. -For all other frames, the Frame Type field simply identifies the frame. These -frames are explained in more detail as they are referenced later in the -document. +To ensure that packets can be both sent to and received from the peer, the +PATH_RESPONSE MUST be sent on the same path as the triggering PATH_CHALLENGE: +from the same local address on which the PATH_CHALLENGE was received, to the +same remote address from which the PATH_CHALLENGE was received. 
-| Type Value | Frame Type Name | Definition | -|:------------|:---------------------|:-------------------------------| -| 0x00 | PADDING | {{frame-padding}} | -| 0x01 | RST_STREAM | {{frame-rst-stream}} | -| 0x02 | CONNECTION_CLOSE | {{frame-connection-close}} | -| 0x03 | APPLICATION_CLOSE | {{frame-application-close}} | -| 0x04 | MAX_DATA | {{frame-max-data}} | -| 0x05 | MAX_STREAM_DATA | {{frame-max-stream-data}} | -| 0x06 | MAX_STREAM_ID | {{frame-max-stream-id}} | -| 0x07 | PING | {{frame-ping}} | -| 0x08 | BLOCKED | {{frame-blocked}} | -| 0x09 | STREAM_BLOCKED | {{frame-stream-blocked}} | -| 0x0a | STREAM_ID_BLOCKED | {{frame-stream-id-blocked}} | -| 0x0b | NEW_CONNECTION_ID | {{frame-new-connection-id}} | -| 0x0c | STOP_SENDING | {{frame-stop-sending}} | -| 0x0d | RETIRE_CONNECTION_ID | {{frame-retire-connection-id}} | -| 0x0e | PATH_CHALLENGE | {{frame-path-challenge}} | -| 0x0f | PATH_RESPONSE | {{frame-path-response}} | -| 0x10 - 0x17 | STREAM | {{frame-stream}} | -| 0x18 | CRYPTO | {{frame-crypto}} | -| 0x19 | NEW_TOKEN | {{frame-new-token}} | -| 0x1a - 0x1b | ACK | {{frame-ack}} | -{: #frame-types title="Frame Types"} -All QUIC frames are idempotent. That is, a valid frame does not cause -undesirable side effects or errors when received more than once. +## Completion -The Frame Type field uses a variable length integer encoding (see -{{integer-encoding}}) with one exception. To ensure simple and efficient -implementations of frame parsing, a frame type MUST use the shortest possible -encoding. Though a two-, four- or eight-octet encoding of the frame types -defined in this document is possible, the Frame Type field for these frames is -encoded on a single octet. For instance, though 0x4007 is a legitimate -two-octet encoding for a variable-length integer with a value of 7, PING frames -are always encoded as a single octet with the value 0x07. 
An endpoint MUST -treat the receipt of a frame type that uses a longer encoding than necessary as -a connection error of type PROTOCOL_VIOLATION. +A new address is considered valid when a PATH_RESPONSE frame is received +containing data that was sent in a previous PATH_CHALLENGE. Receipt of an +acknowledgment for a packet containing a PATH_CHALLENGE frame is not adequate +validation, since the acknowledgment can be spoofed by a malicious peer. +For path validation to be successful, a PATH_RESPONSE frame MUST be received +from the same remote address to which the corresponding PATH_CHALLENGE was +sent. If a PATH_RESPONSE frame is received from a different remote address than +the one to which the PATH_CHALLENGE was sent, path validation is considered to +have failed, even if the data matches that sent in the PATH_CHALLENGE. -## Extension Frames +Additionally, the PATH_RESPONSE frame MUST be received on the same local address +from which the corresponding PATH_CHALLENGE was sent. If a PATH_RESPONSE frame +is received on a different local address than the one from which the +PATH_CHALLENGE was sent, path validation is considered to have failed, even if +the data matches that sent in the PATH_CHALLENGE. Thus, the endpoint considers +the path to be valid when a PATH_RESPONSE frame is received on the same path +with the same payload as the PATH_CHALLENGE frame. -QUIC frames do not use a self-describing encoding. An endpoint therefore needs -to understand the syntax of all frames before it can successfully process a -packet. This allows for efficient encoding of frames, but it means that an -endpoint cannot send a frame of a type that is unknown to its peer. -An extension to QUIC that wishes to use a new type of frame MUST first ensure -that a peer is able to understand the frame. An endpoint can use a transport -parameter to signal its willingness to receive one or more extension frame types -with the one transport parameter. 
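The initiation and completion rules for path validation can be sketched from the point of view of the endpoint that sends PATH_CHALLENGE. This is a minimal sketch under stated assumptions: the class name `PathValidator`, the tuple representation of addresses, and the 8-octet payload size are choices made for the example, and timers, retransmission and abandonment are left out.

```python
import os


class PathValidator:
    """Track outstanding PATH_CHALLENGE payloads and validated paths."""

    def __init__(self):
        self._outstanding = {}   # payload -> (local_addr, remote_addr)
        self.validated = set()   # set of (local_addr, remote_addr)

    def new_challenge(self, local_addr, remote_addr):
        """Return fresh random data for a PATH_CHALLENGE on this path."""
        data = os.urandom(8)     # assumed payload size for this sketch
        self._outstanding[data] = (local_addr, remote_addr)
        return data

    def on_path_response(self, data, local_addr, remote_addr):
        """Process a PATH_RESPONSE; return True if the path is validated."""
        path = self._outstanding.pop(data, None)
        if path is None:
            return False         # data matches no outstanding challenge
        if path != (local_addr, remote_addr):
            # Correct data on a different path: this attempt has failed.
            return False
        self.validated.add(path)
        return True
```

Because an acknowledgment for the packet carrying the PATH_CHALLENGE is never consulted, a spoofed acknowledgment cannot validate a path in this sketch.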
+## Abandonment -Extension frames MUST be congestion controlled and MUST cause an ACK frame to -be sent. The exception is extension frames that replace or supplement the ACK -frame. Extension frames are not included in flow control unless specified -in the extension. +An endpoint SHOULD abandon path validation after sending some number of +PATH_CHALLENGE frames or after some time has passed. When setting this timer, +implementations are cautioned that the new path could have a longer round-trip +time than the original. -An IANA registry is used to manage the assignment of frame types, see -{{iana-frames}}. +Note that the endpoint might receive packets containing other frames on the new +path, but a PATH_RESPONSE frame with appropriate data is required for path +validation to succeed. +If path validation fails, the path is deemed unusable. This does not +necessarily imply a failure of the connection - endpoints can continue sending +packets over other paths as appropriate. If no paths are available, an endpoint +can wait for a new path to become available or close the connection. -# Life of a Connection +A path validation might be abandoned for other reasons besides +failure. Primarily, this happens if a connection migration to a new path is +initiated while a path validation on the old path is in progress. -A QUIC connection is a single conversation between two QUIC endpoints. QUIC's -connection establishment intertwines version negotiation with the cryptographic -and transport handshakes to reduce connection establishment latency, as -described in {{handshake}}. Once established, a connection may migrate to a -different IP or port at either endpoint, due to NAT rebinding or mobility, as -described in {{migration}}. Finally, a connection may be terminated by either -endpoint, as described in {{termination}}. 
-## Connection ID +# Connection Migration {#migration} -Each connection possesses a set of identifiers, any of which could be used to -distinguish it from other connections. Connection IDs are selected -independently in each direction. Each Connection ID has an associated sequence -number to assist in deduplicating messages. +QUIC allows connections to survive changes to endpoint addresses (that is, IP +address and/or port), such as those caused by an endpoint migrating to a new +network. This section describes the process by which an endpoint migrates to a +new address. -The primary function of a connection ID is to ensure that changes in addressing -at lower protocol layers (UDP, IP, and below) don't cause packets for a QUIC -connection to be delivered to the wrong endpoint. Each endpoint selects -connection IDs using an implementation-specific (and perhaps -deployment-specific) method which will allow packets with that connection ID to -be routed back to the endpoint and identified by the endpoint upon receipt. +An endpoint MUST NOT initiate connection migration before the handshake is +finished and the endpoint has 1-RTT keys. The design of QUIC relies on +endpoints retaining a stable address for the duration of the handshake. -Connection IDs MUST NOT contain any information that can be used to correlate -them with other connection IDs for the same connection. As a trivial example, -this means the same connection ID MUST NOT be issued more than once on the same -connection. +An endpoint also MUST NOT initiate connection migration if the peer sent the +`disable_migration` transport parameter during the handshake. An endpoint which +has sent this transport parameter, but detects that a peer has nonetheless +migrated to a different network MAY treat this as a connection error of type +INVALID_MIGRATION. 
-A zero-length connection ID MAY be used when the connection ID is not needed for -routing and the address/port tuple of packets is sufficient to identify a -connection. An endpoint whose peer has selected a zero-length connection ID MUST -continue to use a zero-length connection ID for the lifetime of the connection -and MUST NOT send packets from any other local address. +Not all changes of peer address are intentional migrations. The peer could +experience NAT rebinding: a change of address due to a middlebox, usually a NAT, +allocating a new outgoing port or even a new outgoing IP address for a flow. +Endpoints SHOULD perform path validation ({{migrate-validate}}) if a NAT +rebinding does not cause the connection to fail. -When an endpoint has requested a non-zero-length connection ID, it needs to -ensure that the peer has a supply of connection IDs from which to choose for -packets sent to the endpoint. These connection IDs are supplied by the endpoint -using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). +This document limits migration of connections to new client addresses, except as +described in {{preferred-address}}. Clients are responsible for initiating all +migrations. Servers do not send non-probing packets (see {{probing}}) toward a +client address until they see a non-probing packet from that address. If a +client receives packets from an unknown server address, the client MAY discard +these packets. -### Issuing Connection IDs +## Probing a New Path {#probing} -The initial connection ID issued by an endpoint is the Source Connection ID -during the handshake. The sequence number of the initial connection ID is 0. If -the preferred_address transport parameter is sent, the sequence number of the -supplied connection ID is 1. Subsequent connection IDs are communicated to the -peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}), and the -sequence number on each newly-issued connection ID MUST increase by 1. 
The -connection ID randomly selected by the client in the Initial packet and any -connection ID provided by a Reset packet are not assigned sequence numbers -unless a server opts to retain them as its initial connection ID. - -When an endpoint issues a connection ID, it MUST accept packets that carry this -connection ID for the duration of the connection or until its peer invalidates -the connection ID via a RETIRE_CONNECTION_ID frame -({{frame-retire-connection-id}}). - -An endpoint SHOULD ensure that its peer has a sufficient number of available and -unused connection IDs. While each endpoint independently chooses how many -connection IDs to issue, endpoints SHOULD provide and maintain at least eight -connection IDs. The endpoint can do this by always supplying a new connection -ID when a connection ID is retired by its peer or when the endpoint receives a -packet with a previously unused connection ID. Endpoints that initiate -migration and require non-zero-length connection IDs SHOULD provide their peers -with new connection IDs before migration, or risk the peer closing the -connection. +An endpoint MAY probe for peer reachability from a new local address using path +validation {{migrate-validate}} prior to migrating the connection to the new +local address. Failure of path validation simply means that the new path is not +usable for this connection. Failure to validate a path does not cause the +connection to end unless there are no valid alternative paths available. +An endpoint uses a new connection ID for probes sent from a new local address, +see {{migration-linkability}} for further discussion. An endpoint that uses +a new local address needs to ensure that at least one new connection ID is +available at the peer. That can be achieved by including a NEW_CONNECTION_ID +frame in the probe. -### Consuming and Retiring Connection IDs {#retiring-cids} +Receiving a PATH_CHALLENGE frame from a peer indicates that the peer is probing +for reachability on a path. 
An endpoint sends a PATH_RESPONSE in response as per +{{migrate-validate}}. -An endpoint can change the connection ID it uses for a peer to another available -one at any time during the connection. An endpoint consumes connection IDs in -response to a migrating peer, see {{migration-linkability}} for more. +PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames are +"probing frames", and all other frames are "non-probing frames". A packet +containing only probing frames is a "probing packet", and a packet containing +any other frame is a "non-probing packet". -An endpoint maintains a set of connection IDs received from its peer, any of -which it can use when sending packets. When the endpoint wishes to remove a -connection ID from use, it sends a RETIRE_CONNECTION_ID frame to its peer, -indicating that the peer might bring a new connection ID into circulation using -the NEW_CONNECTION_ID frame. -An endpoint that retires a connection ID can retain knowledge of that connection -ID for a period of time after sending the RETIRE_CONNECTION_ID frame, or until -that frame is acknowledged. +## Initiating Connection Migration {#initiating-migration} -As discussed in {{migration-linkability}}, each connection ID MUST be used on -packets sent from only one local address. An endpoint that migrates away from a -local address SHOULD retire all connection IDs used on that address once it no -longer plans to use that address. +An endpoint can migrate a connection to a new local address by sending packets +containing frames other than probing frames from that address. +Each endpoint validates its peer's address during connection establishment. +Therefore, a migrating endpoint can send to its peer knowing that the peer is +willing to receive at the peer's current address. Thus an endpoint can migrate +to a new local address without first validating the peer's address. 
-## Matching Packets to Connections {#packet-handling} +When migrating, the new path might not support the endpoint's current sending +rate. Therefore, the endpoint resets its congestion controller, as described in +{{migration-cc}}. -Incoming packets are classified on receipt. Packets can either be associated -with an existing connection, or - for servers - potentially create a new -connection. +The new path might not have the same ECN capability. Therefore, the endpoint +verifies ECN capability as described in {{using-ecn}}. -Hosts try to associate a packet with an existing connection. If the packet has a -Destination Connection ID corresponding to an existing connection, QUIC -processes that packet accordingly. Note that more than one connection ID can be -associated with a connection; see {{connection-id}}. +Receiving acknowledgments for data sent on the new path serves as proof of the +peer's reachability from the new address. Note that since acknowledgments may +be received on any path, return reachability on the new path is not +established. To establish return reachability on the new path, an endpoint MAY +concurrently initiate path validation {{migrate-validate}} on the new path. -If the Destination Connection ID is zero length and the packet matches the -address/port tuple of a connection where the host did not require connection -IDs, QUIC processes the packet as part of that connection. Endpoints MUST drop -packets with zero-length Destination Connection ID fields if they do not -correspond to a single connection. -Endpoints SHOULD send a Stateless Reset ({{stateless-reset}}) for any packets -that cannot be attributed to an existing connection. +## Responding to Connection Migration {#migration-response} +Receiving a packet from a new peer address containing a non-probing frame +indicates that the peer has migrated to that address. 
-### Client Packet Handling {#client-pkt-handling} +In response to such a packet, an endpoint MUST start sending subsequent packets +to the new peer address and MUST initiate path validation ({{migrate-validate}}) +to verify the peer's ownership of the unvalidated address. -Valid packets sent to clients always include a Destination Connection ID that -matches the value the client selects. Clients that choose to receive -zero-length connection IDs can use the address/port tuple to identify a -connection. Packets that don't match an existing connection are discarded. +An endpoint MAY send data to an unvalidated peer address, but it MUST protect +against potential attacks as described in {{address-spoofing}} and +{{on-path-spoofing}}. An endpoint MAY skip validation of a peer address if that +address has been seen recently. -Due to packet reordering or loss, clients might receive packets for a connection -that are encrypted with a key it has not yet computed. Clients MAY drop these -packets, or MAY buffer them in anticipation of later packets that allow it to -compute the key. +An endpoint only changes the address that it sends packets to in response to the +highest-numbered non-probing packet. This ensures that an endpoint does not send +packets to an old peer address in the case that it receives reordered packets. -If a client receives a packet that has an unsupported version, it MUST discard -that packet. +After changing the address to which it sends non-probing packets, an endpoint +could abandon any path validation for other addresses. +Receiving a packet from a new peer address might be the result of a NAT +rebinding at the peer. -### Server Packet Handling {#server-pkt-handling} +After verifying a new client address, the server SHOULD send new address +validation tokens ({{address-validation}}) to the client. 
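The probing/non-probing distinction and the rule for switching the peer address can be sketched together. This is an illustrative Python sketch only: `PeerAddressTracker`, its method names, and the use of frame-type strings are assumptions, and path validation itself as well as the spoofing protections are left to the caller.

```python
from typing import Iterable, Tuple

Address = Tuple[str, int]  # hypothetical (ip, port) representation

# Frame types the text classifies as "probing frames".
PROBING_FRAMES = frozenset(
    {"PATH_CHALLENGE", "PATH_RESPONSE", "NEW_CONNECTION_ID", "PADDING"}
)


def is_probing_packet(frame_types: Iterable[str]) -> bool:
    """A packet is probing if every frame it carries is a probing frame.

    An empty frame list also returns True here; that case is an artifact
    of this sketch, not something the text addresses.
    """
    return all(t in PROBING_FRAMES for t in frame_types)


class PeerAddressTracker:
    """Tracks which peer address packets are sent to (hypothetical helper)."""

    def __init__(self, initial_addr: Address) -> None:
        self.send_addr = initial_addr
        self.highest_non_probing_pn = -1

    def on_packet(self, pn: int, src_addr: Address,
                  frame_types: Iterable[str]) -> bool:
        """Process a received packet.

        Returns True when the send address changed, meaning the caller
        should initiate path validation toward the new address.
        """
        if is_probing_packet(frame_types):
            return False  # probes never move the connection
        if pn <= self.highest_non_probing_pn:
            return False  # reordered or duplicate packet: ignore its address
        self.highest_non_probing_pn = pn
        if src_addr == self.send_addr:
            return False
        self.send_addr = src_addr
        return True
```

Keying the decision on the highest-numbered non-probing packet is what keeps a late, reordered packet from pulling the connection back to an old address.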
-If a server receives a packet that has an unsupported version, but the packet is -sufficiently large to initiate a new connection for any version supported by the -server, it SHOULD send a Version Negotiation packet as described in -{{send-vn}}. Servers MAY rate control these packets to avoid storms of Version -Negotiation packets. -The first packet for an unsupported version can use different semantics and -encodings for any version-specific field. In particular, different packet -protection keys might be used for different versions. Servers that do not -support a particular version are unlikely to be able to decrypt the payload of -the packet. Servers SHOULD NOT attempt to decode or decrypt a packet from an -unknown version, but instead send a Version Negotiation packet, provided that -the packet is sufficiently long. +### Handling Address Spoofing by a Peer {#address-spoofing} -Servers MUST drop other packets that contain unsupported versions. +It is possible that a peer is spoofing its source address to cause an endpoint +to send excessive amounts of data to an unwilling host. If the endpoint sends +significantly more data than the spoofing peer, connection migration might be +used to amplify the volume of data that an attacker can generate toward a +victim. -Packets with a supported version, or no version field, are matched to a -connection as described in {{packet-handling}}. If not matched, the server -continues below. +As described in {{migration-response}}, an endpoint is required to validate a +peer's new address to confirm the peer's possession of the new address. Until a +peer's address is deemed valid, an endpoint MUST limit the rate at which it +sends data to this address. The endpoint MUST NOT send more than a minimum +congestion window's worth of data per estimated round-trip time (kMinimumWindow, +as defined in {{QUIC-RECOVERY}}). 
In the absence of this limit, an endpoint +risks being used for a denial of service attack against an unsuspecting victim. +Note that since the endpoint will not have any round-trip time measurements to +this address, the estimate SHOULD be the default initial value (see +{{QUIC-RECOVERY}}). -If the packet is an Initial packet fully conforming with the specification, the -server proceeds with the handshake ({{handshake}}). This commits the server to -the version that the client selected. +If an endpoint skips validation of a peer address as described in +{{migration-response}}, it does not need to limit its sending rate. -If a server isn't currently accepting any new connections, it SHOULD send an -Initial packet containing a CONNECTION_CLOSE frame with error code -SERVER_BUSY. -If the packet is a 0-RTT packet, the server MAY buffer a limited number of these -packets in anticipation of a late-arriving Initial Packet. Clients are forbidden -from sending Handshake packets prior to receiving a server response, so servers -SHOULD ignore any such packets. +### Handling Address Spoofing by an On-path Attacker {#on-path-spoofing} -Servers MUST drop incoming packets under all other circumstances. +An on-path attacker could cause a spurious connection migration by copying and +forwarding a packet with a spoofed address such that it arrives before the +original packet. The packet with the spoofed address will be seen to come from +a migrating connection, and the original packet will be seen as a duplicate and +dropped. After a spurious migration, validation of the source address will fail +because the entity at the source address does not have the necessary +cryptographic keys to read or respond to the PATH_CHALLENGE frame that is sent +to it even if it wanted to. -## Version Negotiation +To protect the connection from failing due to such a spurious migration, an +endpoint MUST revert to using the last validated peer address when validation of +a new peer address fails. 
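The sending limit for an unvalidated peer address described under {{address-spoofing}} can be sketched as a per-round-trip byte budget. The Python below is a rough illustration under stated assumptions: the class name, the fixed-window accounting, and the example values passed to the constructor are hypothetical, and the actual kMinimumWindow and default initial RTT come from {{QUIC-RECOVERY}}, not from this sketch.

```python
from typing import Optional


class UnvalidatedAddressLimiter:
    """Caps bytes sent to an unvalidated address per estimated RTT (hypothetical)."""

    def __init__(self, min_window_bytes: int, rtt_estimate_s: float) -> None:
        # min_window_bytes stands in for kMinimumWindow; rtt_estimate_s
        # stands in for the default initial RTT. Both are caller-supplied
        # assumptions here.
        self.min_window_bytes = min_window_bytes
        self.rtt_estimate_s = rtt_estimate_s
        self.window_start: Optional[float] = None
        self.sent_in_window = 0

    def try_send(self, size: int, now: float) -> bool:
        """Return True and charge the budget if `size` bytes may be sent at `now`."""
        # Start a fresh budget once a full estimated RTT has elapsed.
        if (self.window_start is None
                or now - self.window_start >= self.rtt_estimate_s):
            self.window_start = now
            self.sent_in_window = 0
        if self.sent_in_window + size > self.min_window_bytes:
            return False  # over budget for this RTT; caller must wait
        self.sent_in_window += size
        return True
```

Fixed windows are the simplest accounting that respects the limit within each window; an implementation that cares about bursts across window boundaries might prefer a sliding window or token bucket. Once the address is validated, or if validation is skipped for a recently seen address, the limiter would simply no longer be consulted.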
-Version negotiation ensures that client and server agree to a QUIC version -that is mutually supported. A server sends a Version Negotiation packet in -response to each packet that might initiate a new connection, see -{{packet-handling}} for details. +If an endpoint has no state about the last validated peer address, it MUST close +the connection silently by discarding all connection state. This results in new +packets on the connection being handled generically. For instance, an endpoint +MAY send a stateless reset in response to any further incoming packets. -The size of the first packet sent by a client will determine whether a server -sends a Version Negotiation packet. Clients that support multiple QUIC versions -SHOULD pad the first packet they send to the largest of the minimum packet sizes -across all versions they support. This ensures that the server responds if there -is a mutually supported version. +Note that receipt of packets with higher packet numbers from the legitimate peer +address will trigger another connection migration. This will cause the +validation of the address of the spurious migration to be abandoned. -### Sending Version Negotiation Packets {#send-vn} +## Loss Detection and Congestion Control {#migration-cc} -If the version selected by the client is not acceptable to the server, the -server responds with a Version Negotiation packet (see {{packet-version}}). -This includes a list of versions that the server will accept. +The capacity available on the new path might not be the same as the old path. +Packets sent on the old path SHOULD NOT contribute to congestion control or RTT +estimation for the new path. -This system allows a server to process packets with unsupported versions without -retaining state. Though either the Initial packet or the Version Negotiation -packet that is sent in response could be lost, the client will send new packets -until it successfully receives a response or it abandons the connection attempt. 
+On confirming a peer's ownership of its new address, an endpoint SHOULD +immediately reset the congestion controller and round-trip time estimator for +the new path. +An endpoint MUST NOT return to the send rate used for the previous path unless +it is reasonably sure that the previous send rate is valid for the new path. +For instance, a change in the client's port number is likely indicative of a +rebinding in a middlebox and not a complete change in path. This determination +likely depends on heuristics, which could be imperfect; if the new path capacity +is significantly reduced, ultimately this relies on the congestion controller +responding to congestion signals and reducing send rates appropriately. -### Handling Version Negotiation Packets {#handle-vn} +There may be apparent reordering at the receiver when an endpoint sends data and +probes from/to multiple addresses during the migration period, since the two +resulting paths may have different round-trip times. A receiver of packets on +multiple paths will still send ACK frames covering all received packets. -When the client receives a Version Negotiation packet, it first checks that the -Destination and Source Connection ID fields match the Source and Destination -Connection ID fields in a packet that the client sent. If this check fails, the -packet MUST be discarded. +While multiple paths might be used during connection migration, a single +congestion control context and a single loss recovery context (as described in +{{QUIC-RECOVERY}}) may be adequate. A sender can make exceptions for probe +packets so that their loss detection is independent and does not unduly cause +the congestion controller to reduce its sending rate. An endpoint might set a +separate timer when a PATH_CHALLENGE is sent, which is cancelled when the +corresponding PATH_RESPONSE is received. 
If the timer fires before the +PATH_RESPONSE is received, the endpoint might send a new PATH_CHALLENGE, and +restart the timer for a longer period of time. -Once the Version Negotiation packet is determined to be valid, the client then -selects an acceptable protocol version from the list provided by the server. -The client then attempts to create a connection using that version. Though the -content of the Initial packet the client sends might not change in response to -version negotiation, a client MUST increase the packet number it uses on every -packet it sends. Packets MUST continue to use long headers and MUST include the -new negotiated protocol version. -The client MUST use the long header format and include its selected version on -all packets until it has 1-RTT keys and it has received a packet from the server -which is not a Version Negotiation packet. +## Privacy Implications of Connection Migration {#migration-linkability} -A client MUST NOT change the version it uses unless it is in response to a -Version Negotiation packet from the server. Once a client receives a packet -from the server which is not a Version Negotiation packet, it MUST discard other -Version Negotiation packets on the same connection. Similarly, a client MUST -ignore a Version Negotiation packet if it has already received and acted on a -Version Negotiation packet. +Using a stable connection ID on multiple network paths allows a passive observer +to correlate activity between those paths. An endpoint that moves between +networks might not wish to have their activity correlated by any entity other +than their peer, so different connection IDs are used when sending from +different local addresses, as discussed in {{connection-id}}. For this to be +effective, endpoints need to ensure that connection IDs they provide cannot be +linked by any other entity. -A client MUST ignore a Version Negotiation packet that lists the client's chosen -version.
+This eliminates the use of the connection ID for linking activity from +the same connection on different networks. Protection of packet numbers ensures +that packet numbers cannot be used to correlate activity. This does not prevent +other properties of packets, such as timing and size, from being used to +correlate activity. -A client MAY attempt 0-RTT after receiving a Version Negotiation packet. A -client that sends additional 0-RTT packets MUST NOT reset the packet number to 0 -as a result, see {{retry-0rtt-pn}}. - -Version negotiation packets have no cryptographic protection. The result of the -negotiation MUST be revalidated as part of the cryptographic handshake (see -{{version-validation}}). +Clients MAY move to a new connection ID at any time based on +implementation-specific concerns. For example, after a period of network +inactivity NAT rebinding might occur when the client begins sending data again. +A client might wish to reduce linkability by employing a new connection ID and +source UDP port when sending traffic after a period of inactivity. Changing the +UDP port from which it sends packets at the same time might cause the packet to +appear as a connection migration. This ensures that the mechanisms that support +migration are exercised even for clients that don't experience NAT rebindings or +genuine migrations. Changing port number can cause a peer to reset its +congestion state (see {{migration-cc}}), so the port SHOULD only be changed +infrequently. -### Using Reserved Versions +Endpoints that use connection IDs with length greater than zero could have their +activity correlated if their peers keep using the same destination connection ID +after migration. Endpoints that receive packets with a previously unused +Destination Connection ID SHOULD change to sending packets with a connection ID +that has not been used on any other network path. The goal here is to ensure +that packets sent on different paths cannot be correlated. 
To fulfill this +privacy requirement, endpoints that initiate migration and use connection IDs +with length greater than zero SHOULD provide their peers with new connection IDs +before migration. -For a server to use a new version in the future, clients must correctly handle -unsupported versions. To help ensure this, a server SHOULD include a reserved -version (see {{versions}}) while generating a Version Negotiation packet. +Caution: -The design of version negotiation permits a server to avoid maintaining state -for packets that it rejects in this fashion. The validation of version -negotiation (see {{version-validation}}) only validates the result of version -negotiation, which is the same no matter which reserved version was sent. -A server MAY therefore send different reserved version numbers in the Version -Negotiation Packet and in its transport parameters. +: If both endpoints change connection ID in response to seeing a change in + connection ID from their peer, then this can trigger an infinite sequence of + changes. -A client MAY send a packet using a reserved version number. This can be used to -solicit a list of supported versions from a server. +# Server's Preferred Address {#preferred-address} -## Cryptographic and Transport Handshake {#handshake} +QUIC allows servers to accept connections on one IP address and attempt to +transfer these connections to a more preferred address shortly after the +handshake. This is particularly useful when clients initially connect to an +address shared by multiple servers but would prefer to use a unicast address to +ensure connection stability. This section describes the protocol for migrating a +connection to a preferred server address. -QUIC relies on a combined cryptographic and transport handshake to minimize -connection establishment latency. QUIC uses the CRYPTO frame {{frame-crypto}} -to transmit the cryptographic handshake. 
Version 0x00000001 of QUIC uses TLS -1.3 as described in {{QUIC-TLS}}; a different QUIC version number could indicate -that a different cryptographic handshake protocol is in use. +Migrating a connection to a new server address mid-connection is left for future +work. If a client receives packets from a new server address not indicated by +the preferred_address transport parameter, the client SHOULD discard these +packets. -QUIC provides reliable, ordered delivery of the cryptographic handshake -data. QUIC packet protection ensures confidentiality and integrity protection -that meets the requirements of the cryptographic handshake protocol: +## Communicating A Preferred Address -* authenticated key exchange, where +A server conveys a preferred address by including the preferred_address +transport parameter in the TLS handshake. - * a server is always authenticated, +Once the handshake is finished, the client SHOULD initiate path validation (see +{{migrate-validate}}) of the server's preferred address using the connection ID +provided in the preferred_address transport parameter. - * a client is optionally authenticated, +If path validation succeeds, the client SHOULD immediately begin sending all +future packets to the new server address using the new connection ID and +discontinue use of the old server address. If path validation fails, the client +MUST continue sending all future packets to the server's original IP address. - * every connection produces distinct and unrelated keys, - * keying material is usable for packet protection for both 0-RTT and 1-RTT - packets, and +## Responding to Connection Migration - * 1-RTT keys have forward secrecy +A server might receive a packet addressed to its preferred IP address at any +time after the handshake is completed. If this packet contains a PATH_CHALLENGE +frame, the server sends a PATH_RESPONSE frame as per {{migrate-validate}}, but +the server MUST continue sending all other packets from its original IP address. 
-* authenticated values for the transport parameters of the peer (see - {{transport-parameters}}) +The server SHOULD also initiate path validation of the client using its +preferred address and the address from which it received the client probe. This +helps to guard against spurious migration initiated by an attacker. -* authenticated confirmation of version negotiation (see {{version-validation}}) +Once the server has completed its path validation and has received a non-probing +packet with a new largest packet number on its preferred address, the server +begins sending to the client exclusively from its preferred IP address. It +SHOULD drop packets for this connection received on the old IP address, but MAY +continue to process delayed packets. -* authenticated negotiation of an application protocol (TLS uses ALPN - {{?RFC7301}} for this purpose) -* for the server, the ability to carry data that provides assurance that the - client can receive packets that are addressed with the transport address that - is claimed by the client (see {{address-validation}}) +## Interaction of Client Migration and Preferred Address -The first CRYPTO frame MUST be sent in a single packet. Any second attempt -that is triggered by address validation MUST also be sent within a single -packet. This avoids having to reassemble a message from multiple packets. +A client might need to perform a connection migration before it has migrated to +the server's preferred address. In this case, the client SHOULD perform path +validation to both the original and preferred server address from the client's +new address concurrently. -The first client packet of the cryptographic handshake protocol MUST fit within -a 1232 octet QUIC packet payload. This includes overheads that reduce the space -available to the cryptographic handshake protocol. 
+If path validation of the server's preferred address succeeds, the client MUST +abandon validation of the original address and migrate to using the server's +preferred address. If path validation of the server's preferred address fails, +but validation of the server's original address succeeds, the client MAY migrate +to using the original address from the client's new address. -The CRYPTO frame can be sent in different packet number spaces. CRYPTO frames -in each packet number space carry a separate sequence of handshake data starting -from an offset of 0. +If the connection to the server's preferred address is not from the same client +address, the server MUST protect against potential attacks as described in +{{address-spoofing}} and {{on-path-spoofing}}. In addition to intentional +simultaneous migration, this might also occur because the client's access +network used a different NAT binding for the server's preferred address. -## Example Handshake Flows +Servers SHOULD initiate path validation to the client's new address upon +receiving a probe packet from a different address. Servers MUST NOT send more +than a minimum congestion window's worth of non-probing packets to the new +address before path validation is complete. -Details of how TLS is integrated with QUIC are provided in {{QUIC-TLS}}, but -some examples are provided here. -{{tls-1rtt-handshake}} provides an overview of the 1-RTT handshake. Each line -shows a QUIC packet with the packet type and packet number shown first, followed -by the frames that are typically contained in those packets. So, for instance -the first packet is of type Initial, with packet number 0, and contains a CRYPTO -frame carrying the ClientHello. 
+# Using Explicit Congestion Notification {#using-ecn} -Note that multiple QUIC packets -- even of different encryption levels -- may be -coalesced into a single UDP datagram (see {{packet-coalesce}}), and so this -handshake may consist of as few as 4 UDP datagrams, or any number more. For -instance, the server's first flight contains packets from the Initial encryption -level (obfuscation), the Handshake level, and "0.5-RTT data" from the server at -the 1-RTT encryption level. +QUIC endpoints use Explicit Congestion Notification (ECN) {{!RFC3168}} to detect +and respond to network congestion. ECN allows a network node to indicate +congestion in the network by setting a codepoint in the IP header of a packet +instead of dropping it. Endpoints react to congestion by reducing their sending +rate in response, as described in {{QUIC-RECOVERY}}. -~~~~ -Client Server +To use ECN, QUIC endpoints first determine whether a path supports ECN marking +and the peer is able to access the ECN codepoint in the IP header. A network +path does not support ECN if ECN marked packets get dropped or ECN markings are +rewritten on the path. An endpoint verifies the path, both during connection +establishment and when migrating to a new path (see {{migration}}). -Initial[0]: CRYPTO[CH] -> +Each endpoint independently verifies and enables use of ECN by setting the IP +header ECN codepoint to ECN Capable Transport (ECT) for the path from it to the +other peer. Even if ECN is not used on the path to the peer, the endpoint MUST +provide feedback about ECN markings received (if accessible). - Initial[0]: CRYPTO[SH] ACK[0] - Handshake[0]: CRYPTO[EE, CERT, CV, FIN] - <- 1-RTT[0]: STREAM[1, "..."] +To verify both that a path supports ECN and the peer can provide ECN feedback, +an endpoint MUST set the ECT(0) codepoint in the IP header of all outgoing +packets {{!RFC8311}}. 
-Initial[1]: ACK[0] -Handshake[0]: CRYPTO[FIN], ACK[0] -1-RTT[0]: STREAM[0, "..."], ACK[0] -> +If an ECT codepoint set in the IP header is not corrupted by a network device, +then a received packet contains either the codepoint sent by the peer or the +Congestion Experienced (CE) codepoint set by a network device that is +experiencing congestion. - 1-RTT[1]: STREAM[55, "..."], ACK[0] - <- Handshake[1]: ACK[0] -~~~~ -{: #tls-1rtt-handshake title="Example 1-RTT Handshake"} +On receiving a packet with an ECT or CE codepoint, an endpoint that can access +the IP ECN codepoints increases the corresponding ECT(0), ECT(1), or CE count, +and includes these counters in subsequent ACK frames (see +{{processing-and-ack}} and {{frame-ack}}). +A packet detected by a receiver as a duplicate does not affect the receiver's +local ECN codepoint counts; see {{security-ecn}} for relevant security +concerns. -{{tls-0rtt-handshake}} shows an example of a connection with a 0-RTT handshake -and a single packet of 0-RTT data. Note that as described in {{packet-numbers}}, -the server ACKs the 0-RTT data at the 1-RTT encryption level, and the client's -sequence numbers at the 1-RTT encryption level continue to increment from its -0-RTT packets. +If an endpoint receives a packet without an ECT or CE codepoint, it responds per +{{processing-and-ack}} with an ACK frame. -~~~~ -Client Server +If an endpoint does not have access to received ECN codepoints, it acknowledges +received packets per {{processing-and-ack}} with an ACK frame. -Initial[0]: CRYPTO[CH] -0-RTT[0]: STREAM[0, "..."] -> +If a packet sent with an ECT codepoint is newly acknowledged by the peer in an +ACK frame without ECN feedback, the endpoint stops setting ECT codepoints in +subsequent packets, with the expectation that either the network or the peer no +longer supports ECN.
- Initial[0]: CRYPTO[SH] ACK[0] - Handshake[0] CRYPTO[EE, CERT, CV, FIN] - <- 1-RTT[0]: STREAM[1, "..."] ACK[0] +To protect the connection from arbitrary corruption of ECN codepoints by the +network, an endpoint verifies the following when an ACK frame is received: -Initial[1]: ACK[0] -0-RTT[1]: CRYPTO[EOED] -Handshake[0]: CRYPTO[FIN], ACK[0] -1-RTT[2]: STREAM[0, "..."] ACK[0] -> +* The increase in ECT(0) and ECT(1) counters MUST be at least the number of + packets newly acknowledged that were sent with the corresponding codepoint. - 1-RTT[1]: STREAM[55, "..."], ACK[1,2] - <- Handshake[1]: ACK[0] -~~~~ -{: #tls-0rtt-handshake title="Example 0-RTT Handshake"} +* The total increase in ECT(0), ECT(1), and CE counters reported in the ACK + frame MUST be at least the total number of packets newly acknowledged in this + ACK frame. +An endpoint could miss acknowledgements for a packet when ACK frames are lost. +It is therefore possible for the total increase in ECT(0), ECT(1), and CE +counters to be greater than the number of packets acknowledged in an ACK frame. +When this happens, the local reference counts MUST be increased to match the +counters in the ACK frame. -## Transport Parameters +Upon successful verification, an endpoint continues to set ECT codepoints in +subsequent packets with the expectation that the path is ECN-capable. -During connection establishment, both endpoints make authenticated declarations -of their transport parameters. These declarations are made unilaterally by each -endpoint. Endpoints are required to comply with the restrictions implied by -these parameters; the description of each parameter includes rules for its -handling. +If verification fails, then the endpoint ceases setting ECT codepoints in +subsequent packets with the expectation that either the network or the peer does +not support ECN. -The format of the transport parameters is the TransportParameters struct from -{{figure-transport-parameters}}. 
This is described using the presentation -language from Section 3 of {{!TLS13=RFC8446}}. +If an endpoint sets ECT codepoints on outgoing packets and encounters a +retransmission timeout due to the absence of acknowledgments from the peer (see +{{QUIC-RECOVERY}}), or if an endpoint has reason to believe that a network +element might be corrupting ECN codepoints, the endpoint MAY cease setting ECT +codepoints in subsequent packets. Doing so allows the connection to traverse +network elements that drop or corrupt ECN codepoints in the IP header. -~~~ - uint32 QuicVersion; - enum { - initial_max_stream_data_bidi_local(0), - initial_max_data(1), - initial_max_bidi_streams(2), - idle_timeout(3), - preferred_address(4), - max_packet_size(5), - stateless_reset_token(6), - ack_delay_exponent(7), - initial_max_uni_streams(8), - disable_migration(9), - initial_max_stream_data_bidi_remote(10), - initial_max_stream_data_uni(11), - max_ack_delay(12), - original_connection_id(13), - (65535) - } TransportParameterId; +# Connection Termination {#termination} - struct { - TransportParameterId parameter; - opaque value<0..2^16-1>; - } TransportParameter; +Connections should remain open until they become idle for a pre-negotiated +period of time. 
A QUIC connection, once established, can be terminated in one +of three ways: - struct { - select (Handshake.msg_type) { - case client_hello: - QuicVersion initial_version; +* idle timeout ({{idle-timeout}}) +* immediate close ({{immediate-close}}) +* stateless reset ({{stateless-reset}}) - case encrypted_extensions: - QuicVersion negotiated_version; - QuicVersion supported_versions<4..2^8-4>; - }; - TransportParameter parameters<22..2^16-1>; - } TransportParameters; - struct { - enum { IPv4(4), IPv6(6), (15) } ipVersion; - opaque ipAddress<4..2^8-1>; - uint16 port; - opaque connectionId<0..18>; - opaque statelessResetToken[16]; - } PreferredAddress; -~~~ -{: #figure-transport-parameters title="Definition of TransportParameters"} +### Closing and Draining Connection States {#draining} -The `extension_data` field of the quic_transport_parameters extension defined in -{{QUIC-TLS}} contains a TransportParameters value. TLS encoding rules are -therefore used to encode the transport parameters. +The closing and draining connection states exist to ensure that connections +close cleanly and that delayed or reordered packets are properly discarded. +These states SHOULD persist for three times the current Retransmission Timeout +(RTO) interval as defined in {{QUIC-RECOVERY}}. -QUIC encodes transport parameters into a sequence of octets, which are then -included in the cryptographic handshake. Once the handshake completes, the -transport parameters declared by the peer are available. Each endpoint -validates the value provided by its peer. In particular, version negotiation -MUST be validated (see {{version-validation}}) before the connection -establishment is considered properly complete. +An endpoint enters a closing period after initiating an immediate close +({{immediate-close}}). While closing, an endpoint MUST NOT send packets unless +they contain a CONNECTION_CLOSE or APPLICATION_CLOSE frame (see +{{immediate-close}} for details). 
-Definitions for each of the defined transport parameters are included in -{{transport-parameter-definitions}}. Any given parameter MUST appear -at most once in a given transport parameters extension. An endpoint MUST -treat receipt of duplicate transport parameters as a connection error of -type TRANSPORT_PARAMETER_ERROR. +In the closing state, only a packet containing a closing frame can be sent. An +endpoint retains only enough information to generate a packet containing a +closing frame and to identify packets as belonging to the connection. The +connection ID and QUIC version are sufficient information to identify packets for +a closing connection; an endpoint can discard all other connection state. An +endpoint MAY retain packet protection keys for incoming packets to allow it to +read and process a closing frame. +The draining state is entered once an endpoint receives a signal that its peer +is closing or draining. While otherwise identical to the closing state, an +endpoint in the draining state MUST NOT send any packets. Retaining packet +protection keys is unnecessary once a connection is in the draining state. -### Transport Parameter Definitions +An endpoint MAY transition from the closing period to the draining period if it +can confirm that its peer is also closing or draining. Receiving a closing +frame is sufficient confirmation, as is receiving a stateless reset. The +draining period SHOULD end when the closing period would have ended. In other +words, the endpoint can use the same end time, but cease retransmission of the +closing packet. -An endpoint MAY use the following transport parameters: +Disposing of connection state prior to the end of the closing or draining period +could cause delayed or reordered packets to be handled poorly.
Endpoints that +have some alternative means to ensure that late-arriving packets on the +connection do not create QUIC state, such as those that are able to close the +UDP socket, MAY use an abbreviated draining period which can allow for faster +resource recovery. Servers that retain an open socket for accepting new +connections SHOULD NOT exit the closing or draining period early. -initial_max_data (0x0001): +Once the closing or draining period has ended, an endpoint SHOULD discard all +connection state. This results in new packets on the connection being handled +generically. For instance, an endpoint MAY send a stateless reset in response +to any further incoming packets. -: The initial maximum data parameter contains the initial value for the maximum - amount of data that can be sent on the connection. This parameter is encoded - as an unsigned 32-bit integer in units of octets. This is equivalent to - sending a MAX_DATA ({{frame-max-data}}) for the connection immediately after - completing the handshake. If the transport parameter is absent, the connection - starts with a flow control limit of 0. +The draining and closing periods do not apply when a stateless reset +({{stateless-reset}}) is sent. -initial_max_bidi_streams (0x0002): +An endpoint is not expected to handle key updates when it is closing or +draining. A key update might prevent the endpoint from moving from the closing +state to draining, but it otherwise has no impact. -: The initial maximum bidirectional streams parameter contains the initial - maximum number of bidirectional streams the peer may initiate, encoded as an - unsigned 16-bit integer. If this parameter is absent or zero, bidirectional - streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this - parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) - immediately after completing the handshake containing the corresponding Stream - ID. 
For example, a value of 0x05 would be equivalent to receiving a - MAX_STREAM_ID containing 16 when received by a client or 17 when received by a - server. +An endpoint could receive packets from a new source address, indicating a client +connection migration ({{migration}}), while in the closing period. An endpoint +in the closing state MUST strictly limit the number of packets it sends to this +new address until the address is validated (see {{migrate-validate}}). A server +in the closing state MAY instead choose to discard packets received from a new +source address. -initial_max_uni_streams (0x0008): -: The initial maximum unidirectional streams parameter contains the initial - maximum number of unidirectional streams the peer may initiate, encoded as an - unsigned 16-bit integer. If this parameter is absent or zero, unidirectional - streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this - parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) - immediately after completing the handshake containing the corresponding Stream - ID. For example, a value of 0x05 would be equivalent to receiving a - MAX_STREAM_ID containing 18 when received by a client or 19 when received by a - server. +### Idle Timeout -idle_timeout (0x0003): +If the idle timeout is enabled, a connection that remains idle for longer than +the advertised idle timeout (see {{transport-parameter-definitions}}) is closed. +A connection enters the draining state when the idle timeout expires. -: The idle timeout is a value in seconds that is encoded as an unsigned 16-bit - integer. If this parameter is absent or zero then the idle timeout is - disabled. +Each endpoint advertises its own idle timeout to its peer. The idle timeout +starts from the last packet received. In order to ensure that initiating new +activity postpones an idle timeout, an endpoint restarts this timer when sending +a packet.
An endpoint does not postpone the idle timeout if another packet has +been sent containing frames other than ACK or PADDING, and that other packet has +not been acknowledged or declared lost. Packets that contain only ACK or +PADDING frames are not acknowledged until an endpoint has other frames to send, +so they could prevent the timeout from being refreshed. -max_packet_size (0x0005): +The value for an idle timeout can be asymmetric. The value advertised by an +endpoint is only used to determine whether the connection is live at that +endpoint. An endpoint that sends packets near the end of the idle timeout +period of a peer risks having those packets discarded if its peer enters the +draining state before the packets arrive. If a peer could time out within an RTO +(see Section 4.3.3 of {{QUIC-RECOVERY}}), it is advisable to test for liveness +before sending any data that cannot be retried safely. -: The maximum packet size parameter places a limit on the size of packets that - the endpoint is willing to receive, encoded as an unsigned 16-bit integer. - This indicates that packets larger than this limit will be dropped. The - default for this parameter is the maximum permitted UDP payload of 65527. - Values below 1200 are invalid. This limit only applies to protected packets - ({{packet-protected}}). -ack_delay_exponent (0x0007): +### Immediate Close -: An 8-bit unsigned integer value indicating an exponent used to decode the ACK - Delay field in the ACK frame, see {{frame-ack}}. If this value is absent, a - default value of 3 is assumed (indicating a multiplier of 8). The default - value is also used for ACK frames that are sent in Initial and Handshake - packets. Values above 20 are invalid. +An endpoint sends a closing frame (CONNECTION_CLOSE or APPLICATION_CLOSE) to +terminate the connection immediately. Any closing frame causes all streams to +immediately become closed; open streams can be assumed to be implicitly reset.
-disable_migration (0x0009): +After sending a closing frame, endpoints immediately enter the closing state. +During the closing period, an endpoint that sends a closing frame SHOULD respond +to any packet that it receives with another packet containing a closing frame. +To minimize the state that an endpoint maintains for a closing connection, +endpoints MAY send the exact same packet. However, endpoints SHOULD limit the +number of packets they generate containing a closing frame. For instance, an +endpoint could progressively increase the number of packets that it receives +before sending additional packets or increase the time between packets. -: The endpoint does not support connection migration ({{migration}}). Peers MUST - NOT send any packets, including probing packets ({{probing}}), from a local - address other than that used to perform the handshake. This parameter is a - zero-length value. +Note: -max_ack_delay (0x000c): +: Allowing retransmission of a packet contradicts other advice in this document + that recommends the creation of new packet numbers for every packet. Sending + new packet numbers is primarily of advantage to loss recovery and congestion + control, which are not expected to be relevant for a closed connection. + Retransmitting the final packet requires less state. -: An 8 bit unsigned integer value indicating the maximum amount of time in - milliseconds by which it will delay sending of acknowledgments. If this - value is absent, a default of 25 milliseconds is assumed. +After receiving a closing frame, endpoints enter the draining state. An +endpoint that receives a closing frame MAY send a single packet containing a +closing frame before entering the draining state, using a CONNECTION_CLOSE frame +and a NO_ERROR code if appropriate. An endpoint MUST NOT send further packets, +which could result in a constant exchange of closing frames until the closing +period on either peer ended. 
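The advice above on limiting the number of closing-frame packets could be sketched roughly as follows. This is only an illustration: the `ClosingResponder` class, its doubling policy, and the `max_threshold` cap are assumptions made for the example and are not defined by the draft, which asks only that an endpoint progressively increase the number of packets it receives (or the time it waits) before responding again.

```python
class ClosingResponder:
    """Hypothetical helper: decides when to re-send the stored closing packet.

    The doubling policy and the cap are illustrative assumptions; the draft
    only suggests progressively increasing the number of received packets
    between responses.
    """

    def __init__(self, closing_packet: bytes, max_threshold: int = 1024):
        self.closing_packet = closing_packet  # the exact same packet is re-sent
        self.threshold = 1                    # packets needed before next response
        self.received = 0                     # packets seen since last response
        self.max_threshold = max_threshold

    def on_packet_received(self):
        """Return the closing packet to send, or None to stay silent."""
        self.received += 1
        if self.received < self.threshold:
            return None
        # Respond, then require twice as many packets before the next response.
        self.received = 0
        self.threshold = min(self.threshold * 2, self.max_threshold)
        return self.closing_packet


responder = ClosingResponder(b"closing-packet")
pattern = [responder.on_packet_received() is not None for _ in range(7)]
print(pattern)
```

Re-sending one stored packet keeps the retained state to a single buffer and two counters, which matches the goal of minimizing state for a closing connection.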
-Either peer MAY advertise an initial value for the flow control on each type of -stream on which they might receive data. Each of the following transport -parameters is encoded as an unsigned 32-bit integer in units of octets: +An immediate close can be used after an application protocol has arranged to +close a connection. This might be after the application protocol negotiates a +graceful shutdown. The application protocol exchanges whatever messages +are needed to cause both endpoints to agree to close the connection, after which +the application requests that the connection be closed. The application +protocol can use an APPLICATION_CLOSE message with an appropriate error code to +signal closure. -initial_max_stream_data_bidi_local (0x0000): -: The initial stream maximum data for bidirectional, locally-initiated streams - parameter contains the initial flow control limit for newly created - bidirectional streams opened by the endpoint that sets the transport - parameter. In client transport parameters, this applies to streams with an - identifier ending in 0x0; in server transport parameters, this applies to - streams ending in 0x1. +### Stateless Reset {#stateless-reset} -initial_max_stream_data_bidi_remote (0x000a): +A stateless reset is provided as an option of last resort for an endpoint that +does not have access to the state of a connection. A crash or outage might +result in peers continuing to send data to an endpoint that is unable to +properly continue the connection. An endpoint that wishes to communicate a +fatal connection error MUST use a closing frame if it has sufficient state to do +so. -: The initial stream maximum data for bidirectional, peer-initiated streams - parameter contains the initial flow control limit for newly created - bidirectional streams opened by the endpoint that receives the transport - parameter.
In client transport parameters, this applies to streams with an - identifier ending in 0x1; in server transport parameters, this applies to - streams ending in 0x0. +To support this process, a token is sent by endpoints. The token is carried in +the NEW_CONNECTION_ID frame sent by either peer, and servers can specify the +stateless_reset_token transport parameter during the handshake (clients cannot +because their transport parameters don't have confidentiality protection). This +value is protected by encryption, so only client and server know this value. +Tokens sent via NEW_CONNECTION_ID frames are invalidated when their associated +connection ID is retired via a RETIRE_CONNECTION_ID frame +({{frame-retire-connection-id}}). -initial_max_stream_data_uni (0x000b): +An endpoint that receives packets that it cannot process sends a packet in the +following layout: -: The initial stream maximum data for unidirectional streams parameter contains - the initial flow control limit for newly created unidirectional streams opened - by the endpoint that receives the transport parameter. In client transport - parameters, this applies to streams with an identifier ending in 0x3; in - server transport parameters, this applies to streams ending in 0x2. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|0|K|1|1|0|0|0|0| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Random Octets (160..) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| | ++ + +| | ++ Stateless Reset Token (128) + +| | ++ + +| | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #fig-stateless-reset title="Stateless Reset Packet"} -If present, transport parameters that set initial stream flow control limits are -equivalent to sending a MAX_STREAM_DATA frame ({{frame-max-stream-data}}) on -every stream of the corresponding type immediately after opening. 
If the -transport parameter is absent, streams of that type start with a flow control -limit of 0. +This design ensures that a stateless reset packet is - to the extent possible - +indistinguishable from a regular packet with a short header. -A server MUST include the original_connection_id transport parameter if it sent -a Retry packet: - -original_connection_id (0x000d): - -: The value of the Destination Connection ID field from the first Initial packet - sent by the client. This transport parameter is only sent by the server. +The message consists of a header octet, followed by an arbitrary number of +random octets, followed by a Stateless Reset Token. -A server MAY include the following transport parameters: +A stateless reset will be interpreted by a recipient as a packet with a short +header. For the packet to appear as valid, the Random Octets field needs to +include at least 20 octets of random or unpredictable values. This is intended +to allow for a destination connection ID of the maximum length permitted, a +packet number, and minimal payload. The Stateless Reset Token corresponds to +the minimum expansion of the packet protection AEAD. More random octets might +be necessary if the endpoint could have negotiated a packet protection scheme +with a larger minimum AEAD expansion. -stateless_reset_token (0x0006): +An endpoint SHOULD NOT send a stateless reset that is significantly larger than +the packet it receives. Endpoints MUST discard packets that are too small to be +valid QUIC packets. With the set of AEAD functions defined in {{QUIC-TLS}}, +packets less than 19 octets long are never valid. -: The Stateless Reset Token is used in verifying a stateless reset, see - {{stateless-reset}}. This parameter is a sequence of 16 octets. +An endpoint MAY send a stateless reset in response to a packet with a long +header. This would not be effective if the stateless reset token was not yet +available to a peer. 
In this QUIC version, packets with a long header are only +used during connection establishment. Because the stateless reset token is not +available until connection establishment is complete or near completion, +ignoring an unknown packet with a long header might be more effective. -preferred_address (0x0004): +An endpoint cannot determine the Source Connection ID from a packet with a short +header, therefore it cannot set the Destination Connection ID in the stateless +reset packet. The Destination Connection ID will therefore differ from the +value used in previous packets. A random Destination Connection ID makes the +connection ID appear to be the result of moving to a new connection ID that was +provided using a NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). -: The server's Preferred Address is used to effect a change in server address at - the end of the handshake, as described in {{preferred-address}}. +Using a randomized connection ID results in two problems: -A client MUST NOT include a stateless reset token or a preferred address. A -server MUST treat receipt of either transport parameter as a connection error of -type TRANSPORT_PARAMETER_ERROR. +* The packet might not reach the peer. If the Destination Connection ID is + critical for routing toward the peer, then this packet could be incorrectly + routed. This might also trigger another Stateless Reset in response, see + {{reset-looping}}. A Stateless Reset that is not correctly routed is + ineffective in causing errors to be quickly detected and recovered. In this + case, endpoints will need to rely on other methods - such as timers - to + detect that the connection has failed. +* The randomly generated connection ID can be used by entities other than the + peer to identify this as a potential stateless reset. An endpoint that + occasionally uses different connection IDs might introduce some uncertainty + about this. 
-### Values of Transport Parameters for 0-RTT {#zerortt-parameters} +Finally, the last 16 octets of the packet are set to the value of the Stateless +Reset Token. -A client that attempts to send 0-RTT data MUST remember the transport parameters -used by the server. The transport parameters that the server advertises during -connection establishment apply to all connections that are resumed using the -keying material established during that handshake. Remembered transport -parameters apply to the new connection until the handshake completes and new -transport parameters from the server can be provided. +A stateless reset is not appropriate for signaling error conditions. An +endpoint that wishes to communicate a fatal connection error MUST use a +CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has sufficient state to do so. -A server can remember the transport parameters that it advertised, or store an -integrity-protected copy of the values in the ticket and recover the information -when accepting 0-RTT data. A server uses the transport parameters in -determining whether to accept 0-RTT data. +This stateless reset design is specific to QUIC version 1. An endpoint that +supports multiple versions of QUIC needs to generate a stateless reset that will +be accepted by peers that support any version that the endpoint might support +(or might have supported prior to losing state). Designers of new versions of +QUIC need to be aware of this and either reuse this design, or use a portion of +the packet other than the last 16 octets for carrying data. -A server MAY accept 0-RTT and subsequently provide different values for -transport parameters for use in the new connection. If 0-RTT data is accepted -by the server, the server MUST NOT reduce any limits or alter any values that -might be violated by the client with its 0-RTT data. 
In particular, a server -that accepts 0-RTT data MUST NOT set values for initial_max_data, -initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, and -initial_max_stream_data_uni that are smaller than the remembered value of those -parameters. Similarly, a server MUST NOT reduce the value of -initial_max_bidi_streams or initial_max_uni_streams. -Omitting or setting a zero value for certain transport parameters can result in -0-RTT data being enabled, but not usable. The applicable subset of transport -parameters that permit sending of application data SHOULD be set to non-zero -values for 0-RTT. This includes initial_max_data and either -initial_max_bidi_streams and initial_max_stream_data_bidi_remote, or -initial_max_uni_streams and initial_max_stream_data_uni. +#### Detecting a Stateless Reset -The value of the server's previous preferred_address MUST NOT be used when -establishing a new connection; rather, the client should wait to observe the -server's new preferred_address value in the handshake. +An endpoint detects a potential stateless reset when a packet with a short +header either cannot be decrypted or is marked as a duplicate packet. The +endpoint then compares the last 16 octets of the packet with the Stateless Reset +Token provided by its peer, either in a NEW_CONNECTION_ID frame or the server's +transport parameters. If these values are identical, the endpoint MUST enter +the draining period and not send any further packets on this connection. If the +comparison fails, the packet can be discarded. -A server MUST reject 0-RTT data or even abort a handshake if the implied values -for transport parameters cannot be supported. +#### Calculating a Stateless Reset Token {#reset-token} -### New Transport Parameters +The stateless reset token MUST be difficult to guess. In order to create a +Stateless Reset Token, an endpoint could randomly generate {{!RFC4086}} a secret +for every connection that it creates. 
However, this presents a coordination +problem when there are multiple instances in a cluster or a storage problem for +an endpoint that might lose state. Stateless reset specifically exists to +handle the case where state is lost, so this approach is suboptimal. -New transport parameters can be used to negotiate new protocol behavior. An -endpoint MUST ignore transport parameters that it does not support. Absence of -a transport parameter therefore disables any optional protocol feature that is -negotiated using the parameter. +A single static key can be used across all connections to the same endpoint by +generating the proof using a second iteration of a preimage-resistant function +that takes a static key and the connection ID chosen by the endpoint (see +{{connection-id}}) as input. An endpoint could use HMAC {{?RFC2104}} (for +example, HMAC(static_key, connection_id)) or HKDF {{?RFC5869}} (for example, +using the static key as input keying material, with the connection ID as salt). +The output of this function is truncated to 16 octets to produce the Stateless +Reset Token for that connection. -New transport parameters can be registered according to the rules in -{{iana-transport-parameters}}. +An endpoint that loses state can use the same method to generate a valid +Stateless Reset Token. The connection ID comes from the packet that the +endpoint receives. +This design relies on the peer always sending a connection ID in its packets so +that the endpoint can use the connection ID from a packet to reset the +connection. An endpoint that uses this design MUST either use the same +connection ID length for all connections or encode the length of the connection +ID such that it can be recovered without state. In addition, it MUST NOT +provide a zero-length connection ID. -### Version Negotiation Validation {#version-validation} +Revealing the Stateless Reset Token allows any entity to terminate the +connection, so a value can only be used once. 
This method for choosing the +Stateless Reset Token means that the combination of connection ID and static key +cannot occur for another connection. A denial of service attack is possible if +the same connection ID is used by instances that share a static key, or if an +attacker can cause a packet to be routed to an instance that has no state but +the same static key (see {{reset-oracle}}). A connection ID from a connection +that is reset by revealing the Stateless Reset Token cannot be reused for new +connections at nodes that share a static key. -Though the cryptographic handshake has integrity protection, two forms of QUIC -version downgrade are possible. In the first, an attacker replaces the QUIC -version in the Initial packet. In the second, a fake Version Negotiation packet -is sent by an attacker. To protect against these attacks, the transport -parameters include three fields that encode version information. These -parameters are used to retroactively authenticate the choice of version (see -{{version-negotiation}}). +Note that Stateless Reset packets do not have any cryptographic protection. -The cryptographic handshake provides integrity protection for the negotiated -version as part of the transport parameters (see {{transport-parameters}}). As -a result, attacks on version negotiation by an attacker can be detected. -The client includes the initial_version field in its transport parameters. The -initial_version is the version that the client initially attempted to use. If -the server did not send a Version Negotiation packet {{packet-version}}, this -will be identical to the negotiated_version field in the server transport -parameters. +#### Looping {#reset-looping} -A server that processes all packets in a stateful fashion can remember how -version negotiation was performed and validate the initial_version value. +The design of a Stateless Reset is such that it is indistinguishable from a +valid packet. 
This means that a Stateless Reset might trigger the sending of a +Stateless Reset in response, which could lead to infinite exchanges. -A server that does not maintain state for every packet it receives (i.e., a -stateless server) uses a different process. If the initial_version matches the -version of QUIC that is in use, a stateless server can accept the value. +An endpoint MUST ensure that every Stateless Reset that it sends is smaller than +the packet which triggered it, unless it maintains state sufficient to prevent +looping. In the event of a loop, this results in packets eventually being too +small to trigger a response. -If the initial_version is different from the version of QUIC that is in use, a -stateless server MUST check that it would have sent a Version Negotiation packet -if it had received a packet with the indicated initial_version. If a server -would have accepted the version included in the initial_version and the value -differs from the QUIC version that is in use, the server MUST terminate the -connection with a VERSION_NEGOTIATION_ERROR error. +An endpoint can remember the number of Stateless Reset packets that it has sent +and stop generating new Stateless Reset packets once a limit is reached. Using +separate limits for different remote addresses will ensure that Stateless Reset +packets can be used to close connections when other peers or connections have +exhausted limits. -The server includes both the version of QUIC that is in use and a list of the -QUIC versions that the server supports. +Reducing the size of a Stateless Reset below the recommended minimum size of 37 +octets could mean that the packet could reveal to an observer that it is a +Stateless Reset. Conversely, refusing to send a Stateless Reset in response to +a small packet might result in Stateless Reset not being useful in detecting +cases of broken connections where only very small packets are sent; such +failures might only be detected by other means, such as timers. 
-The negotiated_version field is the version that is in use. This MUST be set by -the server to the value that is on the Initial packet that it accepts (not an -Initial packet that triggers a Retry or Version Negotiation packet). A client -that receives a negotiated_version that does not match the version of QUIC that -is in use MUST terminate the connection with a VERSION_NEGOTIATION_ERROR error -code. +An endpoint can increase the odds that a packet will trigger a Stateless Reset +if it cannot be processed by padding it to at least 38 octets. -The server includes a list of versions that it would send in any version -negotiation packet ({{packet-version}}) in the supported_versions field. The -server populates this field even if it did not send a version negotiation -packet. -The client validates that the negotiated_version is included in the -supported_versions list and - if version negotiation was performed - that it -would have selected the negotiated version. A client MUST terminate the -connection with a VERSION_NEGOTIATION_ERROR error code if the current QUIC -version is not listed in the supported_versions list. A client MUST terminate -with a VERSION_NEGOTIATION_ERROR error code if version negotiation occurred but -it would have selected a different version based on the value of the -supported_versions list. -When an endpoint accepts multiple QUIC versions, it can potentially interpret -transport parameters as they are defined by any of the QUIC versions it -supports. The version field in the QUIC packet header is authenticated using -transport parameters. The position and the format of the version fields in -transport parameters MUST either be identical across different QUIC versions, or -be unambiguously different to ensure no confusion about their interpretation. -One way that a new format could be introduced is to define a TLS extension with -a different codepoint. 
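The Stateless Reset Token derivation described in the added text above (a keyed function over a static key and the connection ID, truncated to 16 octets) can be sketched as follows. This is a minimal illustration, not the draft's normative procedure: the choice of SHA-256 as the hash under HMAC and the function name `stateless_reset_token` are assumptions made for the example.

```python
import hashlib
import hmac

TOKEN_LEN = 16  # octets; the text truncates the function output to 16 octets


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """Derive a token as HMAC(static_key, connection_id), truncated.

    SHA-256 is an assumption of this sketch; the text only asks for a
    preimage-resistant function such as HMAC or HKDF.
    """
    if len(connection_id) == 0:
        # The design relies on the peer always sending a connection ID,
        # so a zero-length connection ID cannot be supported.
        raise ValueError("zero-length connection ID is not usable here")
    digest = hmac.new(static_key, connection_id, hashlib.sha256).digest()
    return digest[:TOKEN_LEN]
```

Because the token depends only on the static key and the connection ID, an instance that has lost all connection state can recompute it from the connection ID of an incoming packet, which is the property the text relies on.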
+# Packetization and Reliability {#packetization} +A sender bundles one or more frames in a QUIC packet (see {{frames}}). -## Stateless Retries {#stateless-retry} +A sender SHOULD minimize per-packet bandwidth and computational costs by +bundling as many frames as possible within a QUIC packet. A sender MAY wait for +a short period of time to bundle multiple frames before sending a packet that is +not maximally packed, to avoid sending out large numbers of small packets. An +implementation may use knowledge about application sending behavior or +heuristics to determine whether and for how long to wait. This waiting period +is an implementation decision, and an implementation should be careful to delay +conservatively, since any delay is likely to increase application-visible +latency. -A server can process an Initial packet from a client without committing any -state. This allows a server to perform address validation -({{address-validation}}), or to defer connection establishment costs. -A server that generates a response to an Initial packet without retaining -connection state MUST use the Retry packet ({{packet-retry}}). This packet -causes a client to restart the connection attempt and includes the token in the -new Initial packet ({{packet-initial}}) to prove source address ownership. +## Packet Processing and Acknowledgment {#processing-and-ack} +A packet MUST NOT be acknowledged until packet protection has been successfully +removed and all frames contained in the packet have been processed. For STREAM +frames, this means the data has been enqueued in preparation to be received by +the application protocol, but it does not require that data is delivered and +consumed. -## Using Explicit Congestion Notification {#using-ecn} +Once the packet has been fully processed, a receiver acknowledges receipt by +sending one or more ACK frames containing the packet number of the received +packet. 
To avoid creating an indefinite feedback loop, an endpoint MUST NOT +send an ACK frame in response to a packet containing only ACK or PADDING frames, +even if there are packet gaps which precede the received packet. The endpoint +MUST acknowledge packets containing only ACK or PADDING frames in the next ACK +frame that it sends. -QUIC endpoints use Explicit Congestion Notification (ECN) {{!RFC3168}} to detect -and respond to network congestion. ECN allows a network node to indicate -congestion in the network by setting a codepoint in the IP header of a packet -instead of dropping it. Endpoints react to congestion by reducing their sending -rate in response, as described in {{QUIC-RECOVERY}}. - -To use ECN, QUIC endpoints first determine whether a path supports ECN marking -and the peer is able to access the ECN codepoint in the IP header. A network -path does not support ECN if ECN marked packets get dropped or ECN markings are -rewritten on the path. An endpoint verifies the path, both during connection -establishment and when migrating to a new path (see {{migration}}). - -Each endpoint independently verifies and enables use of ECN by setting the IP -header ECN codepoint to ECN Capable Transport (ECT) for the path from it to the -other peer. Even if ECN is not used on the path to the peer, the endpoint MUST -provide feedback about ECN markings received (if accessible). - -To verify both that a path supports ECN and the peer can provide ECN feedback, -an endpoint MUST set the ECT(0) codepoint in the IP header of all outgoing -packets {{!RFC8311}}. +While PADDING frames do not elicit an ACK frame from a receiver, they are +considered to be in flight for congestion control purposes +{{QUIC-RECOVERY}}. Sending only PADDING frames might cause the sender to become +limited by the congestion controller (as described in {{QUIC-RECOVERY}}) with no +acknowledgments forthcoming from the receiver. 
Therefore, a sender should ensure +that other frames are sent in addition to PADDING frames to elicit +acknowledgments from the receiver. -If an ECT codepoint set in the IP header is not corrupted by a network device, -then a received packet contains either the codepoint sent by the peer or the -Congestion Experienced (CE) codepoint set by a network device that is -experiencing congestion. +Strategies and implications of the frequency of generating acknowledgments are +discussed in more detail in {{QUIC-RECOVERY}}. -On receiving a packet with an ECT or CE codepoint, an endpoint that can access -the IP ECN codepoints increases the corresponding ECT(0), ECT(1), or CE count, -and includes these counters in subsequent (see {{processing-and-ack}}) ACK -frames (see {{frame-ack}}). -A packet detected by a receiver as a duplicate does not affect the receiver's -local ECN codepoint counts; see ({{security-ecn}}) for relevant security -concerns. +## Retransmission of Information -If an endpoint receives a packet without an ECT or CE codepoint, it responds per -{{processing-and-ack}} with an ACK frame. +QUIC packets that are determined to be lost are not retransmitted whole. The +same applies to the frames that are contained within lost packets. Instead, the +information that might be carried in frames is sent again in new frames as +needed. -If an endpoint does not have access to received ECN codepoints, it acknowledges -received packets per {{processing-and-ack}} with an ACK frame. +New frames and packets are used to carry information that is determined to have +been lost. In general, information is sent again when a packet containing that +information is determined to be lost and sending ceases when a packet +containing that information is acknowledged. 
-If a packet sent with an ECT codepoint is newly acknowledged by the peer in an -ACK frame, the endpoint stops setting ECT codepoints in subsequent packets, with -the expectation that either the network or the peer no longer supports ECN. +* Data sent in CRYPTO frames is retransmitted according to the rules in + {{QUIC-RECOVERY}}, until either all data has been acknowledged or the crypto + state machine implicitly knows that the peer received the data. -To protect the connection from arbitrary corruption of ECN codepoints by the -network, an endpoint verifies the following when an ACK frame is received: +* Application data sent in STREAM frames is retransmitted in new STREAM frames + unless the endpoint has sent a RST_STREAM for that stream. Once an endpoint + sends a RST_STREAM frame, no further STREAM frames are needed. -* The increase in ECT(0) and ECT(1) counters MUST be at least the number of - packets newly acknowledged that were sent with the corresponding codepoint. +* The most recent set of acknowledgments are sent in ACK frames. An ACK frame + SHOULD contain all unacknowledged acknowledgments, as described in + {{sending-ack-frames}}. -* The total increase in ECT(0), ECT(1), and CE counters reported in the ACK - frame MUST be at least the total number of packets newly acknowledged in this - ACK frame. +* Cancellation of stream transmission, as carried in a RST_STREAM frame, is + sent until acknowledged or until all stream data is acknowledged by the peer + (that is, either the "Reset Recvd" or "Data Recvd" state is reached on the + send stream). The content of a RST_STREAM frame MUST NOT change when it is + sent again. -An endpoint could miss acknowledgements for a packet when ACK frames are lost. -It is therefore possible for the total increase in ECT(0), ECT(1), and CE -counters to be greater than the number of packets acknowledged in an ACK frame. -When this happens, the local reference counts MUST be increased to match the -counters in the ACK frame. 
+* Similarly, a request to cancel stream transmission, as encoded in a + STOP_SENDING frame, is sent until the receive stream enters either a "Data + Recvd" or "Reset Recvd" state, see {{solicited-state-transitions}}. -Upon successful verification, an endpoint continues to set ECT codepoints in -subsequent packets with the expectation that the path is ECN-capable. +* Connection close signals, including those that use CONNECTION_CLOSE and + APPLICATION_CLOSE frames, are not sent again when packet loss is detected, but + as described in {{termination}}. -If verification fails, then the endpoint ceases setting ECT codepoints in -subsequent packets with the expectation that either the network or the peer does -not support ECN. +* The current connection maximum data is sent in MAX_DATA frames. An updated + value is sent in a MAX_DATA frame if the packet containing the most recently + sent MAX_DATA frame is declared lost, or when the endpoint decides to update + the limit. Care is necessary to avoid sending this frame too often as the + limit can increase frequently and cause an unnecessarily large number of + MAX_DATA frames to be sent. -If an endpoint sets ECT codepoints on outgoing packets and encounters a -retransmission timeout due to the absence of acknowledgments from the peer (see -{{QUIC-RECOVERY}}), or if an endpoint has reason to believe that a network -element might be corrupting ECN codepoints, the endpoint MAY cease setting ECT -codepoints in subsequent packets. Doing so allows the connection to traverse -network elements that drop or corrupt ECN codepoints in the IP header. +* The current maximum stream data offset is sent in MAX_STREAM_DATA frames. + Like MAX_DATA, an updated value is sent when the packet containing + the most recent MAX_STREAM_DATA frame for a stream is lost or when the limit + is updated, with care taken to prevent the frame from being sent too often. 
An + endpoint SHOULD stop sending MAX_STREAM_DATA frames when the receive stream + enters a "Size Known" state. +* The maximum stream ID for a stream of a given type is sent in MAX_STREAM_ID + frames. Like MAX_DATA, an updated value is sent when a packet containing the + most recent MAX_STREAM_ID frame for a stream type is declared lost or when + the limit is updated, with care taken to prevent the frame from being sent + too often. -## Proof of Source Address Ownership {#address-validation} +* Blocked signals are carried in BLOCKED, STREAM_BLOCKED, and STREAM_ID_BLOCKED + frames. BLOCKED frames have connection scope, STREAM_BLOCKED frames have + stream scope, and STREAM_ID_BLOCKED frames are scoped to a specific stream + type. New frames are sent if the packet containing the most recent frame for a + scope is lost, but only while the endpoint is blocked on the corresponding + limit. These frames always include the limit that is causing blocking at the + time that they are transmitted. -Transport protocols commonly spend a round trip checking that a client owns the -transport address (IP and port) that it claims. Verifying that a client can -receive packets sent to its claimed transport address protects against spoofing -of this information by malicious clients. +* A liveness or path validation check using PATH_CHALLENGE frames is sent + periodically until a matching PATH_RESPONSE frame is received or until there + is no remaining need for liveness or path validation checking. PATH_CHALLENGE + frames include a different payload each time they are sent. -This technique is used primarily to avoid QUIC from being used for traffic -amplification attack. In such an attack, a packet is sent to a server with -spoofed source address information that identifies a victim. If a server -generates more or larger packets in response to that packet, the attacker can -use the server to send more data toward the victim than it would be able to send -on its own.
+* Responses to path validation using PATH_RESPONSE frames are sent just once. + A new PATH_CHALLENGE frame will be sent if another PATH_RESPONSE frame is + needed. -Several methods are used in QUIC to mitigate this attack. Firstly, the initial -handshake packet is sent in a UDP datagram that contains at least 1200 octets of -UDP payload. This allows a server to send a similar amount of data without -risking causing an amplification attack toward an unproven remote address. +* New connection IDs are sent in NEW_CONNECTION_ID frames and retransmitted if + the packet containing them is lost. Retransmissions of this frame carry the + same sequence number value. Likewise, retired connection IDs are sent in + RETIRE_CONNECTION_ID frames and retransmitted if the packet containing them is + lost. -A server eventually confirms that a client has received its messages when the -first Handshake-level message is received. This might be insufficient, -either because the server wishes to avoid the computational cost of completing -the handshake, or it might be that the size of the packets that are sent during -the handshake is too large. This is especially important for 0-RTT, where the -server might wish to provide application data traffic - such as a response to a -request - in response to the data carried in the early data from the client. +* PADDING frames contain no information, so lost PADDING frames do not require + repair. -To send additional data prior to completing the cryptographic handshake, the -server then needs to validate that the client owns the address that it claims. +Upon detecting losses, a sender MUST take appropriate congestion control action. +The details of loss detection and congestion control are described in +{{QUIC-RECOVERY}}. -Source address validation is therefore performed by the core transport -protocol during the establishment of a connection. 
-A different type of source address validation is performed after a connection -migration, see {{migrate-validate}}. +# Packet Size {#packet-size} +The QUIC packet size includes the QUIC header and integrity check, but not the +UDP or IP header. -### Client Address Validation Procedure +Clients MUST ensure that the first Initial packet they send is sent in a UDP +datagram that is at least 1200 octets. Padding the Initial packet or including a +0-RTT packet in the same datagram are ways to meet this requirement. Sending a +UDP datagram of this size ensures that the network path supports a reasonable +Maximum Transmission Unit (MTU), and helps reduce the amplitude of amplification +attacks caused by server responses toward an unverified client address. -QUIC uses token-based address validation. Any time the server wishes -to validate a client address, it provides the client with a token. As -long as the token's authenticity can be checked (see -{{token-integrity}}) and the client is able to return that token, it -proves to the server that it received the token. +The datagram containing the first Initial packet from a client MAY exceed 1200 +octets if the client believes that the Path Maximum Transmission Unit (PMTU) +supports the size that it chooses. -Upon receiving the client's Initial packet, the server can request -address validation by sending a Retry packet containing a token. This -token is repeated in the client's next Initial packet. Because the -token is consumed by the server that generates it, there is no need -for a single well-defined format. A token could include information -about the claimed client address (IP and port), a timestamp, and any -other supplementary information the server will need to validate the -token in the future. +A server MAY send a CONNECTION_CLOSE frame with error code PROTOCOL_VIOLATION in +response to the first Initial packet it receives from a client if the UDP +datagram is smaller than 1200 octets. 
It MUST NOT send any other frame type in +response, or otherwise behave as if any part of the offending packet was +processed as valid. -The Retry packet is sent to the client and a legitimate client will -respond with an Initial packet containing the token from the Retry packet -when it continues the handshake. In response to receiving the token, a -server can either abort the connection or permit it to proceed. -A connection MAY be accepted without address validation - or with only limited -validation - but a server SHOULD limit the data it sends toward an unvalidated -address. Successful completion of the cryptographic handshake implicitly -provides proof that the client has received packets from the server. +## Path Maximum Transmission Unit -The client should allow for additional Retry packets being sent in -response to Initial packets sent containing a token. There are several -situations in which the server might not be able to use the previously -generated token to validate the client's address and must send a new -Retry. A reasonable limit to the number of tries the client allows -for, before giving up, is 3. That is, the client MUST echo the -address validation token from a new Retry packet up to 3 times. After -that, it MAY give up on the connection attempt. +The Path Maximum Transmission Unit (PMTU) is the maximum size of the entire IP +header, UDP header, and UDP payload. The UDP payload includes the QUIC packet +header, protected payload, and any authentication fields. +All QUIC packets SHOULD be sized to fit within the estimated PMTU to avoid IP +fragmentation or packet drops. To optimize bandwidth efficiency, endpoints +SHOULD use Packetization Layer PMTU Discovery ({{!PLPMTUD=RFC4821}}). Endpoints +MAY use PMTU Discovery ({{!PMTUDv4=RFC1191}}, {{!PMTUDv6=RFC8201}}) for +detecting the PMTU, setting the PMTU appropriately, and storing the result of +previous PMTU determinations. 
-### Address Validation for Future Connections +In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP packets +larger than 1280 octets. Assuming the minimum IP header size, this results in +a QUIC packet size of 1232 octets for IPv6 and 1252 octets for IPv4. Some +QUIC implementations MAY be more conservative in computing allowed QUIC packet +size given unknown tunneling overheads or IP header options. -A server MAY provide clients with an address validation token during one -connection that can be used on a subsequent connection. Address validation is -especially important with 0-RTT because a server potentially sends a significant -amount of data to a client in response to 0-RTT data. +QUIC endpoints that implement any kind of PMTU discovery SHOULD maintain an +estimate for each combination of local and remote IP addresses. Each pairing of +local and remote addresses could have a different maximum MTU in the path. -The server uses the NEW_TOKEN frame {{frame-new-token}} to provide the -client with an address validation token that can be used to validate -future connections. The client may then use this token to validate -future connections by including it in the Initial packet's header. -The client MUST NOT use the token provided in a Retry for future -connections. +QUIC depends on the network path supporting an MTU of at least 1280 octets. This +is the IPv6 minimum MTU and therefore also supported by most modern IPv4 +networks. An endpoint MUST NOT reduce its MTU below this number, even if it +receives signals that indicate a smaller limit might exist. -Unlike the token that is created for a Retry packet, there might be some time -between when the token is created and when the token is subsequently used. -Thus, a resumption token SHOULD include an expiration time. The server MAY -include either an explicit expiration time or an issued timestamp and -dynamically calculate the expiration time. 
It is also unlikely that the client -port number is the same on two different connections; validating the port is -therefore unlikely to be successful. +If a QUIC endpoint determines that the PMTU between any pair of local and remote +IP addresses has fallen below 1280 octets, it MUST immediately cease sending +QUIC packets on the affected path. This could result in termination of the +connection if an alternative path cannot be found. -### Address Validation Token Integrity {#token-integrity} +### IPv4 PMTU Discovery {#v4-pmtud} -An address validation token MUST be difficult to guess. Including a large -enough random value in the token would be sufficient, but this depends on the -server remembering the value it sends to clients. +Traditional ICMP-based path MTU discovery in IPv4 {{!PMTUDv4}} is potentially +vulnerable to off-path attacks that successfully guess the IP/port 4-tuple and +reduce the MTU to a bandwidth-inefficient value. TCP connections mitigate this +risk by using the (at minimum) 8 bytes of transport header echoed in the ICMP +message to validate the TCP sequence number as valid for the current +connection. However, as QUIC operates over UDP, in IPv4 the echoed information +could consist only of the IP and UDP headers, which usually has insufficient +entropy to mitigate off-path attacks. -A token-based scheme allows the server to offload any state associated with -validation to the client. For this design to work, the token MUST be covered by -integrity protection against modification or falsification by clients. Without -integrity protection, malicious clients could generate or guess values for -tokens that would be accepted by the server. Only the server requires access to -the integrity protection key for tokens. +As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps to +mitigate this risk. 
For instance, an application could: +* Set the IPv4 Don't Fragment (DF) bit on a small proportion of packets, so that +most invalid ICMP messages arrive when there are no DF packets outstanding, and +can therefore be identified as spurious. -## Path Validation {#migrate-validate} +* Store additional information from the IP or UDP headers from DF packets (for +example, the IP ID or UDP checksum) to further authenticate incoming Datagram +Too Big messages. -Path validation is used by an endpoint to verify reachability of a peer over a -specific path. That is, it tests reachability between a specific local address -and a specific peer address, where an address is the two-tuple of IP address and -port. Path validation tests that packets can be both sent to and received from -a peer. +* Any reduction in PMTU due to a report contained in an ICMP packet is +provisional until QUIC's loss detection algorithm determines that the packet is +actually lost. -Path validation is used during connection migration (see {{migration}} and -{{preferred-address}}) by the migrating endpoint to verify reachability of a -peer from a new local address. Path validation is also used by the peer to -verify that the migrating endpoint is able to receive packets sent to the its -new address. That is, that the packets received from the migrating endpoint do -not carry a spoofed source address. -Path validation can be used at any time by either endpoint. For instance, an -endpoint might check that a peer is still in possession of its address after a -period of quiescence. +## Special Considerations for Packetization Layer PMTU Discovery -Path validation is not designed as a NAT traversal mechanism. Though the -mechanism described here might be effective for the creation of NAT bindings -that support NAT traversal, the expectation is that one or other peer is able to -receive packets without first having sent a packet on that path. 
Effective NAT -traversal needs additional synchronization mechanisms that are not provided -here. -An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that are used for -path validation with other frames. For instance, an endpoint may pad a packet -carrying a PATH_CHALLENGE for PMTU discovery, or an endpoint may bundle a -PATH_RESPONSE with its own PATH_CHALLENGE. +The PADDING frame provides a useful option for PMTU probe packets. PADDING +frames generate acknowledgements, but they need not be delivered reliably. As a +result, the loss of PADDING frames in probe packets does not require +delay-inducing retransmission. However, PADDING frames do consume congestion +window, which may delay the transmission of subsequent application data. -When probing a new path, an endpoint might want to ensure that its peer has an -unused connection ID available for responses. The endpoint can send -NEW_CONNECTION_ID and PATH_CHALLENGE frames in the same packet. This ensures -that an unused connection ID will be available to the peer when sending a -response. +When implementing the algorithm in Section 7.2 of {{!PLPMTUD}}, the initial +value of search_low SHOULD be consistent with the IPv6 minimum packet size. +Paths that do not support this size cannot deliver Initial packets, and +therefore are not QUIC-compliant. -### Initiation +Section 7.3 of {{!PLPMTUD}} discusses trade-offs between small and large +increases in the size of probe packets. As QUIC probe packets need not contain +application data, aggressive increases in probe size carry fewer consequences. -To initiate path validation, an endpoint sends a PATH_CHALLENGE frame containing -a random payload on the path to be validated. -An endpoint MAY send additional PATH_CHALLENGE frames to handle packet loss. An -endpoint SHOULD NOT send a PATH_CHALLENGE more frequently than it would an -Initial packet, ensuring that connection migration is no more load on a new path -than establishing a new connection. 
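The packet size figures quoted in the added Path Maximum Transmission Unit text (1232 octets for IPv6 and 1252 octets for IPv4 when IP packets are capped at 1280 octets) follow from subtracting the minimum IP header and the UDP header. A small sketch of that arithmetic, with the function name `max_quic_packet_size` chosen only for illustration:

```python
IPV4_MIN_HEADER = 20  # octets, IPv4 header without options
IPV6_HEADER = 40      # octets, fixed IPv6 header without extension headers
UDP_HEADER = 8        # octets
DEFAULT_MAX_IP_PACKET = 1280  # octets; limit used when no PMTU discovery runs


def max_quic_packet_size(ip_version: int,
                         ip_packet_size: int = DEFAULT_MAX_IP_PACKET) -> int:
    """Return the largest QUIC packet that fits, assuming minimum headers."""
    if ip_version == 4:
        ip_header = IPV4_MIN_HEADER
    elif ip_version == 6:
        ip_header = IPV6_HEADER
    else:
        raise ValueError("ip_version must be 4 or 6")
    return ip_packet_size - ip_header - UDP_HEADER
```

As the text notes, an implementation may choose a smaller value than this to leave room for tunneling overhead or IP header options that the sketch does not model.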
-The endpoint MUST use fresh random data in every PATH_CHALLENGE frame so that it -can associate the peer's response with the causative PATH_CHALLENGE. +# Versions {#versions} +QUIC versions are identified using a 32-bit unsigned number. -### Response +The version 0x00000000 is reserved to represent version negotiation. This +version of the specification is identified by the number 0x00000001. -On receiving a PATH_CHALLENGE frame, an endpoint MUST respond immediately by -echoing the data contained in the PATH_CHALLENGE frame in a PATH_RESPONSE frame, -with the following stipulation. Since a PATH_CHALLENGE might be sent from a -spoofed address, an endpoint MAY limit the rate at which it sends PATH_RESPONSE -frames and MAY silently discard PATH_CHALLENGE frames that would cause it to -respond at a higher rate. +Other versions of QUIC might have different properties to this version. The +properties of QUIC that are guaranteed to be consistent across all versions of +the protocol are described in {{QUIC-INVARIANTS}}. -To ensure that packets can be both sent to and received from the peer, the -PATH_RESPONSE MUST be sent on the same path as the triggering PATH_CHALLENGE: -from the same local address on which the PATH_CHALLENGE was received, to the -same remote address from which the PATH_CHALLENGE was received. +Version 0x00000001 of QUIC uses TLS as a cryptographic handshake protocol, as +described in {{QUIC-TLS}}. +Versions with the most significant 16 bits of the version number cleared are +reserved for use in future IETF consensus documents. -### Completion +Versions that follow the pattern 0x?a?a?a?a are reserved for use in forcing +version negotiation to be exercised. That is, any version number where the low +four bits of all octets is 1010 (in binary). A client or server MAY advertise +support for any of these reserved versions. -A new address is considered valid when a PATH_RESPONSE frame is received -containing data that was sent in a previous PATH_CHALLENGE. 
Receipt of an -acknowledgment for a packet containing a PATH_CHALLENGE frame is not adequate -validation, since the acknowledgment can be spoofed by a malicious peer. +Reserved version numbers will probably never represent a real protocol; a client +MAY use one of these version numbers with the expectation that the server will +initiate version negotiation; a server MAY advertise support for one of these +versions and can expect that clients ignore the value. -For path validation to be successful, a PATH_RESPONSE frame MUST be received -from the same remote address to which the corresponding PATH_CHALLENGE was -sent. If a PATH_RESPONSE frame is received from a different remote address than -the one to which the PATH_CHALLENGE was sent, path validation is considered to -have failed, even if the data matches that sent in the PATH_CHALLENGE. +\[\[RFC editor: please remove the remainder of this section before +publication.]] -Additionally, the PATH_RESPONSE frame MUST be received on the same local address -from which the corresponding PATH_CHALLENGE was sent. If a PATH_RESPONSE frame -is received on a different local address than the one from which the -PATH_CHALLENGE was sent, path validation is considered to have failed, even if -the data matches that sent in the PATH_CHALLENGE. Thus, the endpoint considers -the path to be valid when a PATH_RESPONSE frame is received on the same path -with the same payload as the PATH_CHALLENGE frame. +The version number for the final version of this specification (0x00000001), is +reserved for the version of the protocol that is published as an RFC. +Version numbers used to identify IETF drafts are created by adding the draft +number to 0xff000000. For example, draft-ietf-quic-transport-13 would be +identified as 0xff00000D. -### Abandonment +Implementors are encouraged to register version numbers of QUIC that they are +using for private experimentation on the GitHub wiki at +\. 
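The version number rules in the added Versions text can be expressed as a few bit operations. The sketch below is illustrative only; the helper names are hypothetical and do not come from the draft.

```python
def draft_version(draft_number: int) -> int:
    """Version number for an IETF draft: the draft number added to 0xff000000."""
    return 0xFF000000 + draft_number


def forces_version_negotiation(version: int) -> bool:
    """True for the 0x?a?a?a?a pattern: low four bits of every octet are 1010."""
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def reserved_for_ietf(version: int) -> bool:
    """True when the most significant 16 bits of the version are cleared."""
    return (version >> 16) == 0
```

For example, `draft_version(13)` yields `0xff00000d`, matching the draft-13 example in the text, and `forces_version_negotiation(0x1a2a3a4a)` is true.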
-An endpoint SHOULD abandon path validation after sending some number of -PATH_CHALLENGE frames or after some time has passed. When setting this timer, -implementations are cautioned that the new path could have a longer round-trip -time than the original. -Note that the endpoint might receive packets containing other frames on the new -path, but a PATH_RESPONSE frame with appropriate data is required for path -validation to succeed. -If path validation fails, the path is deemed unusable. This does not -necessarily imply a failure of the connection - endpoints can continue sending -packets over other paths as appropriate. If no paths are available, an endpoint -can wait for a new path to become available or close the connection. +# Packet Types and Formats -A path validation might be abandoned for other reasons besides -failure. Primarily, this happens if a connection migration to a new path is -initiated while a path validation on the old path is in progress. +We first describe QUIC's packet types and their formats, since some are +referenced in subsequent mechanisms. +All numeric values are encoded in network byte order (that is, big-endian) and +all field sizes are in bits. When discussing individual bits of fields, the +least significant bit is referred to as bit 0. Hexadecimal notation is used for +describing the value of fields. -## Connection Migration {#migration} +Any QUIC packet has either a long or a short header, as indicated by the Header +Form bit. Long headers are expected to be used early in the connection before +version negotiation and establishment of 1-RTT keys. Short headers are minimal +version-specific headers, which are used after version negotiation and 1-RTT +keys are established. -QUIC allows connections to survive changes to endpoint addresses (that is, IP -address and/or port), such as those caused by an endpoint migrating to a new -network. This section describes the process by which an endpoint migrates to a -new address. 
+## Long Header {#long-header} -An endpoint MUST NOT initiate connection migration before the handshake is -finished and the endpoint has 1-RTT keys. The design of QUIC relies on -endpoints retaining a stable address for the duration of the handshake. +~~~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| Type (7) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Version (32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Length (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Packet Number (8/16/32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Payload (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~~~ +{: #fig-long-header title="Long Header Packet Format"} -An endpoint also MUST NOT initiate connection migration if the peer sent the -`disable_migration` transport parameter during the handshake. An endpoint which -has sent this transport parameter, but detects that a peer has nonetheless -migrated to a different network MAY treat this as a connection error of type -INVALID_MIGRATION. +Long headers are used for packets that are sent prior to the completion of +version negotiation and establishment of 1-RTT keys. Once both conditions are +met, a sender switches to sending packets using the short header +({{short-header}}). The long form allows for special packets - such as the +Version Negotiation packet - to be represented in this uniform fixed-length +packet format. 
Packets that use the long header contain the following fields: -Not all changes of peer address are intentional migrations. The peer could -experience NAT rebinding: a change of address due to a middlebox, usually a NAT, -allocating a new outgoing port or even a new outgoing IP address for a flow. -Endpoints SHOULD perform path validation ({{migrate-validate}}) if a NAT -rebinding does not cause the connection to fail. +Header Form: -This document limits migration of connections to new client addresses, except as -described in {{preferred-address}}. Clients are responsible for initiating all -migrations. Servers do not send non-probing packets (see {{probing}}) toward a -client address until they see a non-probing packet from that address. If a -client receives packets from an unknown server address, the client MAY discard -these packets. +: The most significant bit (0x80) of octet 0 (the first octet) is set to 1 for + long headers. +Long Packet Type: -### Probing a New Path {#probing} +: The remaining seven bits of octet 0 contain the packet type. This field can + indicate one of 128 packet types. The types specified for this version are + listed in {{long-packet-types}}. -An endpoint MAY probe for peer reachability from a new local address using path -validation {{migrate-validate}} prior to migrating the connection to the new -local address. Failure of path validation simply means that the new path is not -usable for this connection. Failure to validate a path does not cause the -connection to end unless there are no valid alternative paths available. +Version: -An endpoint uses a new connection ID for probes sent from a new local address, -see {{migration-linkability}} for further discussion. An endpoint that uses -a new local address needs to ensure that at least one new connection ID is -available at the peer. That can be achieved by including a NEW_CONNECTION_ID -frame in the probe. +: The QUIC Version is a 32-bit field that follows the Type. 
This field + indicates which version of QUIC is in use and determines how the rest of the + protocol fields are interpreted. -Receiving a PATH_CHALLENGE frame from a peer indicates that the peer is probing -for reachability on a path. An endpoint sends a PATH_RESPONSE in response as per -{{migrate-validate}}. +DCIL and SCIL: -PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames are -"probing frames", and all other frames are "non-probing frames". A packet -containing only probing frames is a "probing packet", and a packet containing -any other frame is a "non-probing packet". +: The octet following the version contains the lengths of the two connection ID + fields that follow it. These lengths are encoded as two 4-bit unsigned + integers. The Destination Connection ID Length (DCIL) field occupies the 4 + high bits of the octet and the Source Connection ID Length (SCIL) field + occupies the 4 low bits of the octet. An encoded length of 0 indicates that + the connection ID is also 0 octets in length. Non-zero encoded lengths are + increased by 3 to get the full length of the connection ID, producing a length + between 4 and 18 octets inclusive. For example, an octet with the value 0x50 + describes an 8-octet Destination Connection ID and a zero-length Source + Connection ID. +Destination Connection ID: -### Initiating Connection Migration {#initiating-migration} +: The Destination Connection ID field follows the connection ID lengths and is + either 0 octets in length or between 4 and 18 octets. + {{connection-id-encoding}} describes the use of this field in more detail. -An endpoint can migrate a connection to a new local address by sending packets -containing frames other than probing frames from that address. +Source Connection ID: -Each endpoint validates its peer's address during connection establishment. -Therefore, a migrating endpoint can send to its peer knowing that the peer is -willing to receive at the peer's current address. 
Thus an endpoint can migrate -to a new local address without first validating the peer's address. - -When migrating, the new path might not support the endpoint's current sending -rate. Therefore, the endpoint resets its congestion controller, as described in -{{migration-cc}}. - -The new path might not have the same ECN capability. Therefore, the endpoint -verifies ECN capability as described in {{using-ecn}}. +: The Source Connection ID field follows the Destination Connection ID and is + either 0 octets in length or between 4 and 18 octets. + {{connection-id-encoding}} describes the use of this field in more detail. -Receiving acknowledgments for data sent on the new path serves as proof of the -peer's reachability from the new address. Note that since acknowledgments may -be received on any path, return reachability on the new path is not -established. To establish return reachability on the new path, an endpoint MAY -concurrently initiate path validation {{migrate-validate}} on the new path. +Length: +: The length of the remainder of the packet (that is, the Packet Number and + Payload fields) in octets, encoded as a variable-length integer + ({{integer-encoding}}). -### Responding to Connection Migration {#migration-response} +Packet Number: -Receiving a packet from a new peer address containing a non-probing frame -indicates that the peer has migrated to that address. +: The packet number field is 1, 2, or 4 octets long. The packet number has + confidentiality protection separate from packet protection, as described + in Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is + encoded in the plaintext packet number. See {{packet-numbers}} for details. -In response to such a packet, an endpoint MUST start sending subsequent packets -to the new peer address and MUST initiate path validation ({{migrate-validate}}) -to verify the peer's ownership of the unvalidated address. 
+Payload: -An endpoint MAY send data to an unvalidated peer address, but it MUST protect -against potential attacks as described in {{address-spoofing}} and -{{on-path-spoofing}}. An endpoint MAY skip validation of a peer address if that -address has been seen recently. +: The payload of the packet. -An endpoint only changes the address that it sends packets to in response to the -highest-numbered non-probing packet. This ensures that an endpoint does not send -packets to an old peer address in the case that it receives reordered packets. +The following packet types are defined: -After changing the address to which it sends non-probing packets, an endpoint -could abandon any path validation for other addresses. +| Type | Name | Section | +|:-----|:------------------------------|:----------------------------| +| 0x7F | Initial | {{packet-initial}} | +| 0x7E | Retry | {{packet-retry}} | +| 0x7D | Handshake | {{packet-handshake}} | +| 0x7C | 0-RTT Protected | {{packet-protected}} | +{: #long-packet-types title="Long Header Packet Types"} -Receiving a packet from a new peer address might be the result of a NAT -rebinding at the peer. +The header form, type, connection ID lengths octet, destination and source +connection IDs, and version fields of a long header packet are +version-independent. The packet number and values for packet types defined in +{{long-packet-types}} are version-specific. See {{QUIC-INVARIANTS}} for details +on how packets from different versions of QUIC are interpreted. -After verifying a new client address, the server SHOULD send new address -validation tokens ({{address-validation}}) to the client. +The interpretation of the fields and the payload are specific to a version and +packet type. Type-specific semantics for this version are described in the +following sections. +The end of the packet is determined by the Length field. 
The Length field +covers both the Packet Number and Payload fields, both of which are +confidentiality protected and initially of unknown length. The size of the +Payload field is learned once the packet number protection is removed. -#### Handling Address Spoofing by a Peer {#address-spoofing} +Senders can sometimes coalesce multiple packets into one UDP datagram. See +{{packet-coalesce}} for more details. -It is possible that a peer is spoofing its source address to cause an endpoint -to send excessive amounts of data to an unwilling host. If the endpoint sends -significantly more data than the spoofing peer, connection migration might be -used to amplify the volume of data that an attacker can generate toward a -victim. -As described in {{migration-response}}, an endpoint is required to validate a -peer's new address to confirm the peer's possession of the new address. Until a -peer's address is deemed valid, an endpoint MUST limit the rate at which it -sends data to this address. The endpoint MUST NOT send more than a minimum -congestion window's worth of data per estimated round-trip time (kMinimumWindow, -as defined in {{QUIC-RECOVERY}}). In the absence of this limit, an endpoint -risks being used for a denial of service attack against an unsuspecting victim. -Note that since the endpoint will not have any round-trip time measurements to -this address, the estimate SHOULD be the default initial value (see -{{QUIC-RECOVERY}}). +## Short Header -If an endpoint skips validation of a peer address as described in -{{migration-response}}, it does not need to limit its sending rate. +~~~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|0|K|1|1|0|R R R| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Packet Number (8/16/32) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Protected Payload (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~~~ +{: #fig-short-header title="Short Header Packet Format"} +The short header can be used after the version and 1-RTT keys are negotiated. +Packets that use the short header contain the following fields: -#### Handling Address Spoofing by an On-path Attacker {#on-path-spoofing} +Header Form: -An on-path attacker could cause a spurious connection migration by copying and -forwarding a packet with a spoofed address such that it arrives before the -original packet. The packet with the spoofed address will be seen to come from -a migrating connection, and the original packet will be seen as a duplicate and -dropped. After a spurious migration, validation of the source address will fail -because the entity at the source address does not have the necessary -cryptographic keys to read or respond to the PATH_CHALLENGE frame that is sent -to it even if it wanted to. +: The most significant bit (0x80) of octet 0 is set to 0 for the short header. -To protect the connection from failing due to such a spurious migration, an -endpoint MUST revert to using the last validated peer address when validation of -a new peer address fails. +Key Phase Bit: -If an endpoint has no state about the last validated peer address, it MUST close -the connection silently by discarding all connection state. This results in new -packets on the connection being handled generically. For instance, an endpoint -MAY send a stateless reset in response to any further incoming packets. +: The second bit (0x40) of octet 0 indicates the key phase, which allows a + recipient of a packet to identify the packet protection keys that are used to + protect the packet. See {{QUIC-TLS}} for details. -Note that receipt of packets with higher packet numbers from the legitimate peer -address will trigger another connection migration. 
This will cause the -validation of the address of the spurious migration to be abandoned. +\[\[Editor's Note: this section should be removed and the bit definitions +changed before this draft goes to the IESG.]] -### Loss Detection and Congestion Control {#migration-cc} +Third Bit: -The capacity available on the new path might not be the same as the old path. -Packets sent on the old path SHOULD NOT contribute to congestion control or RTT -estimation for the new path. +: The third bit (0x20) of octet 0 is set to 1. -On confirming a peer's ownership of its new address, an endpoint SHOULD -immediately reset the congestion controller and round-trip time estimator for -the new path. +\[\[Editor's Note: this section should be removed and the bit definitions +changed before this draft goes to the IESG.]] -An endpoint MUST NOT return to the send rate used for the previous path unless -it is reasonably sure that the previous send rate is valid for the new path. -For instance, a change in the client's port number is likely indicative of a -rebinding in a middlebox and not a complete change in path. This determination -likely depends on heuristics, which could be imperfect; if the new path capacity -is significantly reduced, ultimately this relies on the congestion controller -responding to congestion signals and reducing send rates appropriately. +Fourth Bit: -There may be apparent reordering at the receiver when an endpoint sends data and -probes from/to multiple addresses during the migration period, since the two -resulting paths may have different round-trip times. A receiver of packets on -multiple paths will still send ACK frames covering all received packets. +: The fourth bit (0x10) of octet 0 is set to 1. -While multiple paths might be used during connection migration, a single -congestion control context and a single loss recovery context (as described in -{{QUIC-RECOVERY}}) may be adequate. 
A sender can make exceptions for probe -packets so that their loss detection is independent and does not unduly cause -the congestion controller to reduce its sending rate. An endpoint might set a -separate timer when a PATH_CHALLENGE is sent, which is cancelled when the -corresponding PATH_RESPONSE is received. If the timer fires before the -PATH_RESPONSE is received, the endpoint might send a new PATH_CHALLENGE, and -restart the timer for a longer period of time. +\[\[Editor's Note: this section should be removed and the bit definitions +changed before this draft goes to the IESG.]] +Google QUIC Demultiplexing Bit: -### Privacy Implications of Connection Migration {#migration-linkability} +: The fifth bit (0x8) of octet 0 is set to 0. This allows implementations of + Google QUIC to distinguish Google QUIC packets from short header packets sent + by a client because Google QUIC servers expect the connection ID to always be + present. + The special interpretation of this bit SHOULD be removed from this + specification when Google QUIC has finished transitioning to the new header + format. -Using a stable connection ID on multiple network paths allows a passive observer -to correlate activity between those paths. An endpoint that moves between -networks might not wish to have their activity correlated by any entity other -than their peer, so different connection IDs are used when sending from -different local addresses, as discussed in {{connection-id}}. For this to be -effective endpoints need to ensure that connections IDs they provide cannot be -linked by any other entity. +Reserved: -This eliminates the use of the connection ID for linking activity from -the same connection on different networks. Protection of packet numbers ensures -that packet numbers cannot be used to correlate activity. This does not prevent -other properties of packets, such as timing and size, from being used to -correlate activity. 
+: The sixth, seventh, and eighth bits (0x7) of octet 0 are reserved for + experimentation. Endpoints MUST ignore these bits on packets they receive + unless they are participating in an experiment that uses these bits. An + endpoint not actively using these bits SHOULD set the value randomly on + packets they send to protect against unwanted inference about particular + values. -Clients MAY move to a new connection ID at any time based on -implementation-specific concerns. For example, after a period of network -inactivity NAT rebinding might occur when the client begins sending data again. +Destination Connection ID: -A client might wish to reduce linkability by employing a new connection ID and -source UDP port when sending traffic after a period of inactivity. Changing the -UDP port from which it sends packets at the same time might cause the packet to -appear as a connection migration. This ensures that the mechanisms that support -migration are exercised even for clients that don't experience NAT rebindings or -genuine migrations. Changing port number can cause a peer to reset its -congestion state (see {{migration-cc}}), so the port SHOULD only be changed -infrequently. +: The Destination Connection ID is a connection ID that is chosen by the + intended recipient of the packet. See {{connection-id}} for more details. -Endpoints that use connection IDs with length greater than zero could have their -activity correlated if their peers keep using the same destination connection ID -after migration. Endpoints that receive packets with a previously unused -Destination Connection ID SHOULD change to sending packets with a connection ID -that has not been used on any other network path. The goal here is to ensure -that packets sent on different paths cannot be correlated. To fulfill this -privacy requirement, endpoints that initiate migration and use connection IDs -with length greater than zero SHOULD provide their peers with new connection IDs -before migration. 
+Packet Number: -Caution: +: The packet number field is 1, 2, or 4 octets long. The packet number has + confidentiality protection separate from packet protection, as described in + Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is encoded + in the plaintext packet number. See {{packet-numbers}} for details. -: If both endpoints change connection ID in response to seeing a change in - connection ID from their peer, then this can trigger an infinite sequence of - changes. +Protected Payload: -## Server's Preferred Address {#preferred-address} +: Packets with a short header always include a 1-RTT protected payload. -QUIC allows servers to accept connections on one IP address and attempt to -transfer these connections to a more preferred address shortly after the -handshake. This is particularly useful when clients initially connect to an -address shared by multiple servers but would prefer to use a unicast address to -ensure connection stability. This section describes the protocol for migrating a -connection to a preferred server address. +The header form and connection ID field of a short header packet are +version-independent. The remaining fields are specific to the selected QUIC +version. See {{QUIC-INVARIANTS}} for details on how packets from different +versions of QUIC are interpreted. -Migrating a connection to a new server address mid-connection is left for future -work. If a client receives packets from a new server address not indicated by -the preferred_address transport parameter, the client SHOULD discard these -packets. -### Communicating A Preferred Address +## Version Negotiation Packet {#packet-version} -A server conveys a preferred address by including the preferred_address -transport parameter in the TLS handshake. +A Version Negotiation packet is inherently not version-specific, and does not +use the long packet header (see {{long-header}}).
Upon receipt by a client, it +will appear to be a packet using the long header, but will be identified as a +Version Negotiation packet based on the Version field having a value of 0. -Once the handshake is finished, the client SHOULD initiate path validation (see -{{migrate-validate}}) of the server's preferred address using the connection ID -provided in the preferred_address transport parameter. +The Version Negotiation packet is a response to a client packet that contains a +version that is not supported by the server, and is only sent by servers. -If path validation succeeds, the client SHOULD immediately begin sending all -future packets to the new server address using the new connection ID and -discontinue use of the old server address. If path validation fails, the client -MUST continue sending all future packets to the server's original IP address. +The layout of a Version Negotiation packet is: +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| Unused (7) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Version (32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Supported Version 1 (32) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| [Supported Version 2 (32)] ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| [Supported Version N (32)] ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #version-negotiation-format title="Version Negotiation Packet"} -### Responding to Connection Migration - -A server might receive a packet addressed to its preferred IP address at any -time after the handshake is completed. If this packet contains a PATH_CHALLENGE -frame, the server sends a PATH_RESPONSE frame as per {{migrate-validate}}, but -the server MUST continue sending all other packets from its original IP address. - -The server SHOULD also initiate path validation of the client using its -preferred address and the address from which it received the client probe. This -helps to guard against spurious migration initiated by an attacker. +The value in the Unused field is selected randomly by the server. -Once the server has completed its path validation and has received a non-probing -packet with a new largest packet number on its preferred address, the server -begins sending to the client exclusively from its preferred IP address. It -SHOULD drop packets for this connection received on the old IP address, but MAY -continue to process delayed packets. +The Version field of a Version Negotiation packet MUST be set to 0x00000000. +The server MUST include the value from the Source Connection ID field of the +packet it receives in the Destination Connection ID field. The value for Source +Connection ID MUST be copied from the Destination Connection ID of the received +packet, which is initially randomly selected by a client. Echoing both +connection IDs gives clients some assurance that the server received the packet +and that the Version Negotiation packet was not generated by an off-path +attacker. -### Interaction of Client Migration and Preferred Address +The remainder of the Version Negotiation packet is a list of 32-bit versions +which the server supports. -A client might need to perform a connection migration before it has migrated to -the server's preferred address. 
In this case, the client SHOULD perform path -validation to both the original and preferred server address from the client's -new address concurrently. +A Version Negotiation packet cannot be explicitly acknowledged in an ACK frame +by a client. Receiving another Initial packet implicitly acknowledges a Version +Negotiation packet. -If path validation of the server's preferred address succeeds, the client MUST -abandon validation of the original address and migrate to using the server's -preferred address. If path validation of the server's preferred address fails, -but validation of the server's original address succeeds, the client MAY migrate -to using the original address from the client's new address. +The Version Negotiation packet does not include the Packet Number and Length +fields present in other packets that use the long header form. Consequently, +a Version Negotiation packet consumes an entire UDP datagram. -If the connection to the server's preferred address is not from the same client -address, the server MUST protect against potential attacks as described in -{{address-spoofing}} and {{on-path-spoofing}}. In addition to intentional -simultaneous migration, this might also occur because the client's access -network used a different NAT binding for the server's preferred address. +See {{version-negotiation}} for a description of the version negotiation +process. -Servers SHOULD initiate path validation to the client's new address upon -receiving a probe packet from a different address. Servers MUST NOT send more -than a minimum congestion window's worth of non-probing packets to the new -address before path validation is complete. +## Retry Packet {#packet-retry} -## Connection Termination {#termination} +A Retry packet uses a long packet header with a type value of 0x7E. It carries +an address validation token created by the server. It is used by a server that +wishes to perform a stateless retry (see {{stateless-retry}}). 
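Both the Version Negotiation packet described above and the Retry packet introduced here rely on the leading fields of the long header: the Header Form bit, the Version, the DCIL/SCIL octet, and the two connection IDs. The following sketch shows one way a receiver might decode those fields and then read the list of supported versions from a Version Negotiation packet. The function names and error handling are assumptions made for this example, not something defined by the draft.

```python
def decode_cid_len(nibble: int) -> int:
    """Decode a 4-bit DCIL/SCIL value: 0 means no connection ID, otherwise add 3."""
    return 0 if nibble == 0 else nibble + 3


def parse_long_header_prefix(datagram: bytes):
    """Return (version, dcid, scid, rest) for a packet that uses the long header."""
    if len(datagram) < 6:
        raise ValueError("too short for a long header")
    if not datagram[0] & 0x80:
        raise ValueError("Header Form bit is 0, so this is a short header")
    version = int.from_bytes(datagram[1:5], "big")  # network byte order
    dcil = decode_cid_len(datagram[5] >> 4)    # high 4 bits of the octet
    scil = decode_cid_len(datagram[5] & 0x0F)  # low 4 bits of the octet
    end = 6 + dcil + scil
    if len(datagram) < end:
        raise ValueError("truncated connection IDs")
    return version, datagram[6:6 + dcil], datagram[6 + dcil:end], datagram[end:]


def parse_version_negotiation(datagram: bytes):
    """Return (dcid, scid, supported_versions) from a Version Negotiation packet."""
    version, dcid, scid, rest = parse_long_header_prefix(datagram)
    if version != 0x00000000:
        raise ValueError("Version field is not 0")
    # Assumption for this sketch: a trailing partial version is treated as an error.
    if len(rest) % 4:
        raise ValueError("version list is not a whole number of 32-bit values")
    versions = [int.from_bytes(rest[i:i + 4], "big") for i in range(0, len(rest), 4)]
    return dcid, scid, versions
```

Because a Version Negotiation packet has no Length field, the sketch treats everything after the Source Connection ID as the version list, consistent with the statement that such a packet consumes an entire UDP datagram. A client would additionally compare the returned connection IDs against the ones it sent, since the server is required to echo them with source and destination swapped.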
-Connections should remain open until they become idle for a pre-negotiated -period of time. A QUIC connection, once established, can be terminated in one -of three ways: +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| 0x7e | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Version (32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ODCIL(8) | Original Destination Connection ID (*) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Retry Token (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #retry-format title="Retry Packet"} -* idle timeout ({{idle-timeout}}) -* immediate close ({{immediate-close}}) -* stateless reset ({{stateless-reset}}) +A Retry packet (shown in {{retry-format}}) only uses the invariant portion of +the long packet header {{QUIC-INVARIANTS}}; that is, the fields up to and +including the Destination and Source Connection ID fields. A Retry packet does +not contain any protected fields. Like Version Negotiation, a Retry packet +contains the long header including the connection IDs, but omits the Length, +Packet Number, and Payload fields. These are replaced with: +ODCIL: -### Closing and Draining Connection States {#draining} +: The length of the Original Destination Connection ID field. The length is + encoded in the least significant 4 bits of the octet, using the same encoding + as the DCIL and SCIL fields. The most significant 4 bits of this octet are + reserved. 
Unless a use for these bits has been negotiated, endpoints SHOULD + send randomized values and MUST ignore any value that it receives. -The closing and draining connection states exist to ensure that connections -close cleanly and that delayed or reordered packets are properly discarded. -These states SHOULD persist for three times the current Retransmission Timeout -(RTO) interval as defined in {{QUIC-RECOVERY}}. +Original Destination Connection ID: -An endpoint enters a closing period after initiating an immediate close -({{immediate-close}}). While closing, an endpoint MUST NOT send packets unless -they contain a CONNECTION_CLOSE or APPLICATION_CLOSE frame (see -{{immediate-close}} for details). +: The Original Destination Connection ID contains the value of the Destination + Connection ID from the Initial packet that this Retry is in response to. The + length of this field is given in ODCIL. -In the closing state, only a packet containing a closing frame can be sent. An -endpoint retains only enough information to generate a packet containing a -closing frame and to identify packets as belonging to the connection. The -connection ID and QUIC version is sufficient information to identify packets for -a closing connection; an endpoint can discard all other connection state. An -endpoint MAY retain packet protection keys for incoming packets to allow it to -read and process a closing frame. +Retry Token: -The draining state is entered once an endpoint receives a signal that its peer -is closing or draining. While otherwise identical to the closing state, an -endpoint in the draining state MUST NOT send any packets. Retaining packet -protection keys is unnecessary once a connection is in the draining state. +: An opaque token that the server can use to validate the client's address. -An endpoint MAY transition from the closing period to the draining period if it -can confirm that its peer is also closing or draining. 
Receiving a closing -frame is sufficient confirmation, as is receiving a stateless reset. The -draining period SHOULD end when the closing period would have ended. In other -words, the endpoint can use the same end time, but cease retransmission of the -closing packet. +The server populates the Destination Connection ID with the connection ID that +the client included in the Source Connection ID of the Initial packet. -Disposing of connection state prior to the end of the closing or draining period -could cause delayed or reordered packets to be handled poorly. Endpoints that -have some alternative means to ensure that late-arriving packets on the -connection do not create QUIC state, such as those that are able to close the -UDP socket, MAY use an abbreviated draining period which can allow for faster -resource recovery. Servers that retain an open socket for accepting new -connections SHOULD NOT exit the closing or draining period early. +The server includes a connection ID of its choice in the Source Connection ID +field. This value MUST NOT be equal to the Destination Connection ID field of +the packet sent by the client. The client MUST use this connection ID in the +Destination Connection ID of subsequent packets that it sends. -Once the closing or draining period has ended, an endpoint SHOULD discard all -connection state. This results in new packets on the connection being handled -generically. For instance, an endpoint MAY send a stateless reset in response -to any further incoming packets. +A server MAY send Retry packets in response to Initial and 0-RTT packets. A +server can either discard or buffer 0-RTT packets that it receives. A server +can send multiple Retry packets as it receives Initial or 0-RTT packets. -The draining and closing periods do not apply when a stateless reset -({{stateless-reset}}) is sent. +A client MUST accept and process at most one Retry packet for each connection +attempt.
After the client has received and processed an Initial or Retry packet +from the server, it MUST discard any subsequent Retry packets that it receives. -An endpoint is not expected to handle key updates when it is closing or -draining. A key update might prevent the endpoint from moving from the closing -state to draining, but it otherwise has no impact. +Clients MUST discard Retry packets that contain an Original Destination +Connection ID field that does not match the Destination Connection ID from its +Initial packet. This prevents an off-path attacker from injecting a Retry +packet. -An endpoint could receive packets from a new source address, indicating a client -connection migration ({{migration}}), while in the closing period. An endpoint -in the closing state MUST strictly limit the number of packets it sends to this -new address until the address is validated (see {{migrate-validate}}). A server -in the closing state MAY instead choose to discard packets received from a new -source address. +The client responds to a Retry packet with an Initial packet that includes the +provided Retry Token to continue connection establishment. +A client sets the Destination Connection ID field of this Initial packet to the +value from the Source Connection ID in the Retry packet. Changing Destination +Connection ID also results in a change to the keys used to protect the Initial +packet. It also sets the Token field to the token provided in the Retry. The +client MUST NOT change the Source Connection ID because the server could include +the connection ID as part of its token validation logic (see {{tokens}}). -### Idle Timeout +All subsequent Initial packets from the client MUST use the connection ID and +token values from the Retry packet. Aside from this, the Initial packet sent +by the client is subject to the same restrictions as the first Initial packet. +A client can either reuse the cryptographic handshake message or construct a +new one at its discretion. 
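The client-side Retry rules in the added text above (accept at most one Retry per attempt, discard a Retry whose Original Destination Connection ID does not match, adopt the server's Source Connection ID, keep the client's own Source Connection ID, and echo the token) can be sketched roughly as follows. This is a minimal illustration, not an implementation from the draft; `RetryPacket`, `ClientAttempt`, and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RetryPacket:
    """Hypothetical parsed view of a Retry packet (field names are illustrative)."""
    source_cid: bytes
    original_dcid: bytes
    token: bytes


@dataclass
class ClientAttempt:
    """Hypothetical client state for one connection attempt."""
    source_cid: bytes
    dest_cid: bytes                    # random DCID used in the first Initial
    token: bytes = b""
    server_packet_seen: bool = False   # True once an Initial or Retry was processed

    def __post_init__(self) -> None:
        # Remember the DCID of the first Initial for the ODCID check.
        self.original_dcid = self.dest_cid

    def on_retry(self, retry: RetryPacket) -> bool:
        """Return True if the Retry was accepted, False if it was discarded."""
        if self.server_packet_seen:
            return False               # at most one Retry per attempt
        if retry.original_dcid != self.original_dcid:
            return False               # possible off-path injection
        self.dest_cid = retry.source_cid   # new DCID for later packets
        self.token = retry.token           # echoed in the next Initial
        self.server_packet_seen = True     # source_cid is deliberately unchanged
        return True
```

A real client would also set `server_packet_seen` when it processes a server Initial, which this sketch leaves out.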
-If the idle timeout is enabled, a connection that remains idle for longer than -the advertised idle timeout (see {{transport-parameter-definitions}}) is closed. -A connection enters the draining state when the idle timeout expires. +A client MAY attempt 0-RTT after receiving a Retry packet by sending 0-RTT +packets to the connection ID provided by the server. A client that sends +additional 0-RTT packets without constructing a new cryptographic handshake +message MUST NOT reset the packet number to 0 after a Retry packet, see +{{retry-0rtt-pn}}. -Each endpoint advertises their own idle timeout to their peer. The idle timeout -starts from the last packet received. In order to ensure that initiating new -activity postpones an idle timeout, an endpoint restarts this timer when sending -a packet. An endpoint does not postpone the idle timeout if another packet has -been sent containing frames other than ACK or PADDING, and that other packet has -not been acknowledged or declared lost. Packets that contain only ACK or -PADDING frames are not acknowledged until an endpoint has other frames to send, -so they could prevent the timeout from being refreshed. +A server acknowledges the use of a Retry packet for a connection using the +original_connection_id transport parameter (see +{{transport-parameter-definitions}}). If the server sends a Retry packet, it +MUST include the value of the Original Destination Connection ID field of the +Retry packet (that is, the Destination Connection ID field from the client's +first Initial packet) in the transport parameter. -The value for an idle timeout can be asymmetric. The value advertised by an -endpoint is only used to determine whether the connection is live at that -endpoint. An endpoint that sends packets near the end of the idle timeout -period of a peer risks having those packets discarded if its peer enters the -draining state before the packets arrive. 
If a peer could timeout within an RTO -(see Section 4.3.3 of {{QUIC-RECOVERY}}), it is advisable to test for liveness -before sending any data that cannot be retried safely. +If the client received and processed a Retry packet, it validates that the +original_connection_id transport parameter is present and correct; otherwise, it +validates that the transport parameter is absent. A client MUST treat a failed +validation as a connection error of type TRANSPORT_PARAMETER_ERROR. +A Retry packet does not include a packet number and cannot be explicitly +acknowledged by a client. -### Immediate Close -An endpoint sends a closing frame (CONNECTION_CLOSE or APPLICATION_CLOSE) to -terminate the connection immediately. Any closing frame causes all streams to -immediately become closed; open streams can be assumed to be implicitly reset. +## Cryptographic Handshake Packets {#handshake-packets} -After sending a closing frame, endpoints immediately enter the closing state. -During the closing period, an endpoint that sends a closing frame SHOULD respond -to any packet that it receives with another packet containing a closing frame. -To minimize the state that an endpoint maintains for a closing connection, -endpoints MAY send the exact same packet. However, endpoints SHOULD limit the -number of packets they generate containing a closing frame. For instance, an -endpoint could progressively increase the number of packets that it receives -before sending additional packets or increase the time between packets. +Once version negotiation is complete, the cryptographic handshake is used to +agree on cryptographic keys. The cryptographic handshake is carried in Initial +({{packet-initial}}) and Handshake ({{packet-handshake}}) packets. -Note: - -: Allowing retransmission of a packet contradicts other advice in this document - that recommends the creation of new packet numbers for every packet. 
Sending - new packet numbers is primarily of advantage to loss recovery and congestion - control, which are not expected to be relevant for a closed connection. - Retransmitting the final packet requires less state. - -After receiving a closing frame, endpoints enter the draining state. An -endpoint that receives a closing frame MAY send a single packet containing a -closing frame before entering the draining state, using a CONNECTION_CLOSE frame -and a NO_ERROR code if appropriate. An endpoint MUST NOT send further packets, -which could result in a constant exchange of closing frames until the closing -period on either peer ended. - -An immediate close can be used after an application protocol has arranged to -close a connection. This might be after the application protocols negotiates a -graceful shutdown. The application protocol exchanges whatever messages that -are needed to cause both endpoints to agree to close the connection, after which -the application requests that the connection be closed. The application -protocol can use an APPLICATION_CLOSE message with an appropriate error code to -signal closure. +All these packets use the long header and contain the current QUIC version in +the version field. +In order to prevent tampering by version-unaware middleboxes, Initial +packets are protected with connection- and version-specific keys +(Initial keys) as described in {{QUIC-TLS}}. This protection does not +provide confidentiality or integrity against on-path attackers, but +provides some level of protection against off-path attackers. -### Stateless Reset {#stateless-reset} -A stateless reset is provided as an option of last resort for an endpoint that -does not have access to the state of a connection. A crash or outage might -result in peers continuing to send data to an endpoint that is unable to -properly continue the connection. 
An endpoint that wishes to communicate a -fatal connection error MUST use a closing frame if it has sufficient state to do -so. +## Initial Packet {#packet-initial} -To support this process, a token is sent by endpoints. The token is carried in -the NEW_CONNECTION_ID frame sent by either peer, and servers can specify the -stateless_reset_token transport parameter during the handshake (clients cannot -because their transport parameters don't have confidentiality protection). This -value is protected by encryption, so only client and server know this value. -Tokens sent via NEW_CONNECTION_ID frames are invalidated when their associated -connection ID is retired via a RETIRE_CONNECTION_ID frame -({{frame-retire-connection-id}}). +The Initial packet uses long headers with a type value of 0x7F. It carries the +first CRYPTO frames sent by the client and server to perform key exchange, and +carries ACKs in either direction. The Initial packet is protected by Initial +keys as described in {{QUIC-TLS}}. -An endpoint that receives packets that it cannot process sends a packet in the -following layout: +The Initial packet (shown in {{initial-format}}) has two additional header +fields that are added to the Long Header before the Length field. ~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ -|0|K|1|1|0|0|0|0| +|1| 0x7f | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Random Octets (160..) ... +| Version (32) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| | -+ + -| | -+ Stateless Reset Token (128) + -| | -+ + -| | +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Token Length (i) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Token (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Length (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Packet Number (8/16/32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Payload (*) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~~~ -{: #fig-stateless-reset title="Stateless Reset Packet"} - -This design ensures that a stateless reset packet is - to the extent possible - -indistinguishable from a regular packet with a short header. +{: #initial-format title="Initial Packet"} -The message consists of a header octet, followed by an arbitrary number of -random octets, followed by a Stateless Reset Token. +These fields include the token that was previously provided in a Retry packet or +NEW_TOKEN frame: -A stateless reset will be interpreted by a recipient as a packet with a short -header. For the packet to appear as valid, the Random Octets field needs to -include at least 20 octets of random or unpredictable values. This is intended -to allow for a destination connection ID of the maximum length permitted, a -packet number, and minimal payload. The Stateless Reset Token corresponds to -the minimum expansion of the packet protection AEAD. More random octets might -be necessary if the endpoint could have negotiated a packet protection scheme -with a larger minimum AEAD expansion. +Token Length: -An endpoint SHOULD NOT send a stateless reset that is significantly larger than -the packet it receives. Endpoints MUST discard packets that are too small to be -valid QUIC packets. With the set of AEAD functions defined in {{QUIC-TLS}}, -packets less than 19 octets long are never valid. +: A variable-length integer specifying the length of the Token field, in bytes. + This value is zero if no token is present. 
Initial packets sent by the server + MUST set the Token Length field to zero; clients that receive an Initial + packet with a non-zero Token Length field MUST either discard the packet or + generate a connection error of type PROTOCOL_VIOLATION. -An endpoint MAY send a stateless reset in response to a packet with a long -header. This would not be effective if the stateless reset token was not yet -available to a peer. In this QUIC version, packets with a long header are only -used during connection establishment. Because the stateless reset token is not -available until connection establishment is complete or near completion, -ignoring an unknown packet with a long header might be more effective. +Token: -An endpoint cannot determine the Source Connection ID from a packet with a short -header, therefore it cannot set the Destination Connection ID in the stateless -reset packet. The Destination Connection ID will therefore differ from the -value used in previous packets. A random Destination Connection ID makes the -connection ID appear to be the result of moving to a new connection ID that was -provided using a NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). +: The value of the token. -Using a randomized connection ID results in two problems: +The client and server use the Initial packet type for any packet that contains +an initial cryptographic handshake message. This includes all cases where a new +packet containing the initial cryptographic message needs to be created, such as +the packets sent after receiving a Version Negotiation ({{packet-version}}) or +Retry packet ({{packet-retry}}). -* The packet might not reach the peer. If the Destination Connection ID is - critical for routing toward the peer, then this packet could be incorrectly - routed. This might also trigger another Stateless Reset in response, see - {{reset-looping}}. A Stateless Reset that is not correctly routed is - ineffective in causing errors to be quickly detected and recovered. 
In this - case, endpoints will need to rely on other methods - such as timers - to - detect that the connection has failed. +A server sends its first Initial packet in response to a client Initial. A +server may send multiple Initial packets. The cryptographic key exchange could +require multiple round trips or retransmissions of this data. -* The randomly generated connection ID can be used by entities other than the - peer to identify this as a potential stateless reset. An endpoint that - occasionally uses different connection IDs might introduce some uncertainty - about this. +The payload of an Initial packet includes a CRYPTO frame (or frames) containing +a cryptographic handshake message, ACK frames, or both. PADDING and +CONNECTION_CLOSE frames are also permitted. An endpoint that receives an +Initial packet containing other frames can either discard the packet as spurious +or treat it as a connection error. -Finally, the last 16 octets of the packet are set to the value of the Stateless -Reset Token. +The first packet sent by a client always includes a CRYPTO frame that contains +the entirety of the first cryptographic handshake message. This packet, and the +cryptographic handshake message, MUST fit in a single UDP datagram (see +{{handshake}}). The first CRYPTO frame sent always begins at an offset of 0 +(see {{handshake}}). -A stateless reset is not appropriate for signaling error conditions. An -endpoint that wishes to communicate a fatal connection error MUST use a -CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has sufficient state to do so. +Note that if the server sends a HelloRetryRequest, the client will send a second +Initial packet. This Initial packet will continue the cryptographic handshake +and will contain a CRYPTO frame with an offset matching the size of the CRYPTO +frame sent in the first Initial packet. Cryptographic handshake messages +subsequent to the first do not need to fit within a single UDP datagram. 
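As an illustration of the Initial packet layout shown in the figure above, the following sketch parses the unprotected header fields up to the Length field. It rests on two assumptions that come from the wider transport draft rather than from this hunk: variable-length integers use the top two bits of the first octet to select a 1, 2, 4, or 8 octet encoding, and a non-zero DCIL or SCIL nibble encodes a length of nibble + 3 octets (consistent with the `0/32..144` bit ranges in the figure). Function names and the returned dictionary keys are hypothetical.

```python
import struct
from typing import Dict, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated integer")
    first = buf[pos]
    length = 1 << (first >> 6)  # top two bits select 1, 2, 4 or 8 octets
    if pos + length > len(buf):
        raise ValueError("truncated integer")
    value = first & 0x3F
    for octet in buf[pos + 1:pos + length]:
        value = (value << 8) | octet
    return value, pos + length


def cid_length(nibble: int) -> int:
    """Assumed DCIL/SCIL encoding: 0 means absent, otherwise nibble + 3 octets."""
    return 0 if nibble == 0 else nibble + 3


def parse_initial_header(buf: bytes) -> Dict[str, object]:
    """Parse the unprotected fields of an Initial packet up to the Length field."""
    if len(buf) < 6 or buf[0] != 0xFF:  # 0x80 (long header) | 0x7F (Initial)
        raise ValueError("not an Initial packet")
    version = struct.unpack_from("!I", buf, 1)[0]
    dcil, scil = cid_length(buf[5] >> 4), cid_length(buf[5] & 0x0F)
    pos = 6
    dcid, pos = buf[pos:pos + dcil], pos + dcil
    scid, pos = buf[pos:pos + scil], pos + scil
    token_len, pos = read_varint(buf, pos)
    token, pos = buf[pos:pos + token_len], pos + token_len
    length, pos = read_varint(buf, pos)
    return {"version": version, "dcid": dcid, "scid": scid,
            "token": token, "length": length, "pn_offset": pos}
```

The packet number and payload are protected, so the sketch stops at the offset where the packet number begins.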
-This stateless reset design is specific to QUIC version 1. An endpoint that -supports multiple versions of QUIC needs to generate a stateless reset that will -be accepted by peers that support any version that the endpoint might support -(or might have supported prior to losing state). Designers of new versions of -QUIC need to be aware of this and either reuse this design, or use a portion of -the packet other than the last 16 octets for carrying data. +### Connection IDs -#### Detecting a Stateless Reset +When an Initial packet is sent by a client which has not previously received a +Retry packet from the server, it populates the Destination Connection ID field +with an unpredictable value. This MUST be at least 8 octets in length. Until a +packet is received from the server, the client MUST use the same value unless it +abandons the connection attempt and starts a new one. The initial Destination +Connection ID is used to determine packet protection keys for Initial packets. -An endpoint detects a potential stateless reset when a packet with a short -header either cannot be decrypted or is marked as a duplicate packet. The -endpoint then compares the last 16 octets of the packet with the Stateless Reset -Token provided by its peer, either in a NEW_CONNECTION_ID frame or the server's -transport parameters. If these values are identical, the endpoint MUST enter -the draining period and not send any further packets on this connection. If the -comparison fails, the packet can be discarded. +The client populates the Source Connection ID field with a value of its choosing +and sets the SCIL field to match. +The Destination Connection ID field in the server's Initial packet contains a +connection ID that is chosen by the recipient of the packet (i.e., the client); +the Source Connection ID includes the connection ID that the sender of the +packet wishes to use (see {{connection-id}}). The server MUST use consistent +Source Connection IDs during the handshake. 
-#### Calculating a Stateless Reset Token {#reset-token} +On first receiving an Initial or Retry packet from the server, the client uses +the Source Connection ID supplied by the server as the Destination Connection ID +for subsequent packets. That means that a client might change the Destination +Connection ID twice during connection establishment. Once a client has received +an Initial packet from the server, it MUST discard any packet it receives with a +different Source Connection ID. -The stateless reset token MUST be difficult to guess. In order to create a -Stateless Reset Token, an endpoint could randomly generate {{!RFC4086}} a secret -for every connection that it creates. However, this presents a coordination -problem when there are multiple instances in a cluster or a storage problem for -an endpoint that might lose state. Stateless reset specifically exists to -handle the case where state is lost, so this approach is suboptimal. -A single static key can be used across all connections to the same endpoint by -generating the proof using a second iteration of a preimage-resistant function -that takes a static key and the connection ID chosen by the endpoint (see -{{connection-id}}) as input. An endpoint could use HMAC {{?RFC2104}} (for -example, HMAC(static_key, connection_id)) or HKDF {{?RFC5869}} (for example, -using the static key as input keying material, with the connection ID as salt). -The output of this function is truncated to 16 octets to produce the Stateless -Reset Token for that connection. +### Tokens -An endpoint that loses state can use the same method to generate a valid -Stateless Reset Token. The connection ID comes from the packet that the -endpoint receives. +If the client has a token received in a NEW_TOKEN frame on a previous connection +to what it believes to be the same server, it can include that value in the +Token field of its Initial packet. 
-This design relies on the peer always sending a connection ID in its packets so -that the endpoint can use the connection ID from a packet to reset the -connection. An endpoint that uses this design MUST either use the same -connection ID length for all connections or encode the length of the connection -ID such that it can be recovered without state. In addition, it MUST NOT -provide a zero-length connection ID. +A token allows a server to correlate activity between connections: +specifically, the connection where the token was issued and any connection +where it is used. Clients that want to break continuity of identity with a +server MAY discard tokens provided using the NEW_TOKEN frame. Tokens obtained +in Retry packets MUST NOT be discarded. -Revealing the Stateless Reset Token allows any entity to terminate the -connection, so a value can only be used once. This method for choosing the -Stateless Reset Token means that the combination of connection ID and static key -cannot occur for another connection. A denial of service attack is possible if -the same connection ID is used by instances that share a static key, or if an -attacker can cause a packet to be routed to an instance that has no state but -the same static key (see {{reset-oracle}}). A connection ID from a connection -that is reset by revealing the Stateless Reset Token cannot be reused for new -connections at nodes that share a static key. +A client SHOULD NOT reuse a token. Reusing a token allows connections to be +linked by entities on the network path (see {{migration-linkability}}). A +client MUST NOT reuse a token if it believes that its point of network +attachment has changed since the token was last used; that is, if there is a +change in its local IP address or network interface. A client needs to start +the connection process over if it migrates prior to completing the handshake. -Note that Stateless Reset packets do not have any cryptographic protection.
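The HMAC-based token derivation and the last-16-octets comparison described in the Stateless Reset text above can be sketched as follows. The text names HMAC only generically; the use of SHA-256 here is an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac

TOKEN_LEN = 16  # the Stateless Reset Token is 128 bits


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """Derive a token as HMAC(static_key, connection_id), truncated to 16 octets."""
    if not connection_id:
        # The scheme needs a connection ID in every packet to recover the token.
        raise ValueError("zero-length connection IDs cannot be used here")
    digest = hmac.new(static_key, connection_id, hashlib.sha256).digest()
    return digest[:TOKEN_LEN]


def is_stateless_reset(packet: bytes, expected_token: bytes) -> bool:
    """Compare the last 16 octets of an undecryptable packet with the token."""
    if len(packet) < TOKEN_LEN:
        return False
    return hmac.compare_digest(packet[-TOKEN_LEN:], expected_token)
```

Because the token depends only on the static key and the connection ID, an instance that has lost all connection state can recompute it from the connection ID in an incoming packet.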
+When a server receives an Initial packet with an address validation token, it +SHOULD attempt to validate it. If the token is invalid then the server SHOULD +proceed as if the client did not have a validated address, including potentially +sending a Retry. If the validation succeeds, the server SHOULD then allow the +handshake to proceed (see {{stateless-retry}}). +Note: -#### Looping {#reset-looping} +: The rationale for treating the client as unvalidated rather than discarding + the packet is that the client might have received the token in a previous + connection using the NEW_TOKEN frame, and if the server has lost state, it + might be unable to validate the token at all, leading to connection failure if + the packet is discarded. A server MAY encode tokens provided with NEW_TOKEN + frames and Retry packets differently, and validate the latter more strictly. -The design of a Stateless Reset is such that it is indistinguishable from a -valid packet. This means that a Stateless Reset might trigger the sending of a -Stateless Reset in response, which could lead to infinite exchanges. +In a stateless design, a server can use encrypted and authenticated tokens to +pass information to clients that the server can later recover and use to +validate a client address. Tokens are not integrated into the cryptographic +handshake and so they are not authenticated. For instance, a client might be +able to reuse a token. To avoid attacks that exploit this property, a server +can limit its use of tokens to only the information needed to validate client +addresses. -An endpoint MUST ensure that every Stateless Reset that it sends is smaller than -the packet which triggered it, unless it maintains state sufficient to prevent -looping. In the event of a loop, this results in packets eventually being too -small to trigger a response.
-An endpoint can remember the number of Stateless Reset packets that it has sent -and stop generating new Stateless Reset packets once a limit is reached. Using -separate limits for different remote addresses will ensure that Stateless Reset -packets can be used to close connections when other peers or connections have -exhausted limits. +### Starting Packet Numbers -Reducing the size of a Stateless Reset below the recommended minimum size of 37 -octets could mean that the packet could reveal to an observer that it is a -Stateless Reset. Conversely, refusing to send a Stateless Reset in response to -a small packet might result in Stateless Reset not being useful in detecting -cases of broken connections where only very small packets are sent; such -failures might only be detected by other means, such as timers. +The first Initial packet sent by either endpoint contains a packet number of +0. The packet number MUST increase monotonically thereafter. Initial packets +are in a different packet number space to other packets (see +{{packet-numbers}}). + + +### 0-RTT Packet Numbers {#retry-0rtt-pn} + +Packet numbers for 0-RTT protected packets use the same space as 1-RTT protected +packets. + +After a client receives a Retry or Version Negotiation packet, 0-RTT packets are +likely to have been lost or discarded by the server. A client MAY attempt to +resend data in 0-RTT packets after it sends a new Initial packet. + +A client MUST NOT reset the packet number it uses for 0-RTT packets. The keys +used to protect 0-RTT packets will not change as a result of responding to a +Retry or Version Negotiation packet unless the client also regenerates the +cryptographic handshake message. Sending packets with the same packet number in +that case is likely to compromise the packet protection for all 0-RTT packets +because the same key and nonce could be used to protect different content. 
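The loop-prevention measures in the Stateless Reset text above (send a reset that is smaller than the triggering packet, and keep a per-address count of resets sent) could be combined roughly as in the sketch below. The class name, the default limit of 8, and the choice to stay silent rather than go below 37 octets are all assumptions made for illustration; the text leaves these trade-offs to the implementation.

```python
from collections import defaultdict
from typing import Optional

MIN_RESET_SIZE = 37  # recommended minimum size of a Stateless Reset, per the text


class ResetResponder:
    """Hypothetical helper deciding whether, and how large, a reset to send."""

    def __init__(self, per_address_limit: int = 8) -> None:
        # The limit value is an arbitrary placeholder, not taken from the draft.
        self.per_address_limit = per_address_limit
        self.sent_by_address = defaultdict(int)

    def reset_size_for(self, remote_addr: str, trigger_size: int) -> Optional[int]:
        """Return the size of the reset to send, or None to stay silent."""
        if self.sent_by_address[remote_addr] >= self.per_address_limit:
            return None
        size = trigger_size - 1  # strictly smaller than the triggering packet
        if size < MIN_RESET_SIZE:
            return None  # this sketch refuses rather than reveal a short reset
        self.sent_by_address[remote_addr] += 1
        return size
```

With two such endpoints answering each other, each response shrinks by one octet until it falls below the minimum, so the exchange ends even without the per-address counter.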
+ +Receiving a Retry or Version Negotiation packet, especially a Retry that changes +the connection ID used for subsequent packets, indicates a strong possibility +that 0-RTT packets could be lost. A client only receives acknowledgments for +its 0-RTT packets once the handshake is complete. Consequently, a server might +expect 0-RTT packets to start with a packet number of 0. Therefore, in +determining the length of the packet number encoding for 0-RTT packets, a client +MUST assume that all packets up to the current packet number are in flight, +starting from a packet number of 0. Thus, 0-RTT packets could need to use a +longer packet number encoding. + +A client SHOULD instead generate a fresh cryptographic handshake message and +start packet numbers from 0. This ensures that new 0-RTT packets will not use +the same keys, avoiding any risk of key and nonce reuse; this also prevents +0-RTT packets from previous handshake attempts from being accepted as part of +the connection. + + +### Minimum Packet Size + +The payload of a UDP datagram carrying the Initial packet MUST be expanded to at +least 1200 octets (see {{packetization}}), by adding PADDING frames to the +Initial packet and/or by combining the Initial packet with a 0-RTT packet (see +{{packet-coalesce}}). + + +## Handshake Packet {#packet-handshake} + +A Handshake packet uses long headers with a type value of 0x7D. It is +used to carry acknowledgments and cryptographic handshake messages from the +server and client. + +A server sends its cryptographic handshake in one or more Handshake packets in +response to an Initial packet if it does not send a Retry packet. Once a client +has received a Handshake packet from a server, it uses Handshake packets to send +subsequent cryptographic handshake messages and acknowledgments to the server. 
+ +The Destination Connection ID field in a Handshake packet contains a connection +ID that is chosen by the recipient of the packet; the Source Connection ID +includes the connection ID that the sender of the packet wishes to use (see +{{connection-id-encoding}}). + +The first Handshake packet sent by a server contains a packet number of 0. +Handshake packets use their own packet number space. Packet numbers are +incremented normally for other Handshake packets. + +Servers MUST NOT send more than three times as many bytes as the number of bytes +received prior to verifying the client's address. Source addresses can be +verified through an address validation token (delivered via a Retry packet or +a NEW_TOKEN frame) or by processing any message from the client encrypted using +the Handshake keys. This limit exists to mitigate amplification attacks. + +In order to prevent this limit causing a handshake deadlock, the client SHOULD +always send a packet upon a handshake timeout, as described in +{{QUIC-RECOVERY}}. If the client has no data to retransmit and does not have +Handshake keys, it SHOULD send an Initial packet in a UDP datagram of at least +1200 octets. If the client has Handshake keys, it SHOULD send a Handshake +packet. + +The payload of this packet contains CRYPTO frames and could contain PADDING or +ACK frames. Handshake packets MAY contain CONNECTION_CLOSE or APPLICATION_CLOSE +frames. Endpoints MUST treat receipt of Handshake packets with other frames as +a connection error. + + +## Protected Packets {#packet-protected} + +All QUIC packets use packet protection. Packets that are protected with the +static handshake keys or the 0-RTT keys are sent with long headers; all packets +protected with 1-RTT keys are sent with short headers. The different packet +types explicitly indicate the encryption level and therefore the keys that are +used to remove packet protection. 0-RTT and 1-RTT protected packets share a +single packet number space.
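The three-times sending limit that applies before the client's address is verified, described in the Handshake Packet text above, amounts to simple byte accounting. The sketch below is one possible shape for it; the class and method names are hypothetical, and a real server would key this state per client address.

```python
class AmplificationLimiter:
    """Hypothetical per-path byte accounting for an unvalidated client address."""

    FACTOR = 3  # server may send at most three times what it has received

    def __init__(self) -> None:
        self.bytes_received = 0
        self.bytes_sent = 0
        self.address_validated = False

    def on_datagram_received(self, size: int) -> None:
        self.bytes_received += size

    def on_address_validated(self) -> None:
        # Called after a valid token or a client message under Handshake keys.
        self.address_validated = True

    def can_send(self, size: int) -> bool:
        if self.address_validated:
            return True
        return self.bytes_sent + size <= self.FACTOR * self.bytes_received

    def on_datagram_sent(self, size: int) -> None:
        if not self.can_send(size):
            raise RuntimeError("would exceed the anti-amplification limit")
        self.bytes_sent += size
```

A server that hits the limit has to wait for more bytes from the client, which is why the text asks the client to send a packet on every handshake timeout.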
+ +Packets protected with handshake keys only use packet protection to ensure that +the sender of the packet is on the network path. This packet protection is not +effective confidentiality protection; any entity that receives the Initial +packet from a client can recover the keys necessary to remove packet protection +or to generate packets that will be successfully authenticated. + +Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality +and data origin authentication; the cryptographic handshake ensures that only +the communicating endpoints receive the corresponding keys. + +Packets protected with 0-RTT keys use a type value of 0x7C. The connection ID +fields for a 0-RTT packet MUST match the values used in the Initial packet +({{packet-initial}}). + +The version field for protected packets is the current QUIC version. + +The packet number field contains a packet number, which has additional +confidentiality protection that is applied after packet protection is applied +(see {{QUIC-TLS}} for details). The underlying packet number increases with +each packet sent, see {{packet-numbers}} for details. + +The payload is protected using authenticated encryption. {{QUIC-TLS}} describes +packet protection in detail. After decryption, the plaintext consists of a +sequence of frames, as described in {{frames}}. + + +## Coalescing Packets {#packet-coalesce} + +A sender can coalesce multiple QUIC packets (typically a Cryptographic Handshake +packet and a Protected packet) into one UDP datagram. This can reduce the +number of UDP datagrams needed to send application data during the handshake and +immediately afterwards. It is not necessary for senders to coalesce +packets, though failing to do so will require sending a significantly +larger number of datagrams during the handshake. Receivers MUST +be able to process coalesced packets. 
+ +Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, +Handshake, 1-RTT) makes it more likely the receiver will be able to process all +the packets in a single pass. A packet with a short header does not include a +length, so it will always be the last packet included in a UDP datagram. + +Senders MUST NOT coalesce QUIC packets with different Destination Connection +IDs into a single UDP datagram. Receivers SHOULD ignore any subsequent packets +with a different Destination Connection ID than the first packet in the +datagram. + +Every QUIC packet that is coalesced into a single UDP datagram is separate and +complete. Though the values of some fields in the packet header might be +redundant, no fields are omitted. The receiver of coalesced QUIC packets MUST +individually process each QUIC packet and separately acknowledge them, as if +they were received as the payload of different UDP datagrams. If one or more +packets in a datagram cannot be processed yet (because the keys are not yet +available) or processing fails (decryption failure, unknown type, etc.), the +receiver MUST still attempt to process the remaining packets. The skipped +packets MAY either be discarded or buffered for later processing, just as if the +packets were received out-of-order in separate datagrams. + +Retry ({{packet-retry}}) and Version Negotiation ({{packet-version}}) packets +cannot be coalesced. + + +## Connection ID Encoding + +A connection ID is used to ensure consistent routing of packets, as described in +{{connection-id}}. The long header contains two connection IDs: the Destination +Connection ID is chosen by the recipient of the packet and is used to provide +consistent routing; the Source Connection ID is used to set the Destination +Connection ID used by the peer. + +During the handshake, packets with the long header are used to establish the +connection ID that each endpoint uses. 
Each endpoint uses the Source Connection +ID field to specify the connection ID that is used in the Destination Connection +ID field of packets being sent to them. Upon receiving a packet, each endpoint +sets the Destination Connection ID it sends to match the value of the Source +Connection ID that they receive. + +During the handshake, a client can receive both a Retry and an Initial packet, +and thus be given two opportunities to update the Destination Connection ID it +sends. A client MUST only change the value it sends in the Destination +Connection ID in response to the first packet of each type it receives from the +server (Retry or Initial); a server MUST set its value based on the Initial +packet. Any additional changes are not permitted; if subsequent packets of +those types include a different Source Connection ID, they MUST be discarded. +This avoids problems that might arise from stateless processing of multiple +Initial packets producing different connection IDs. + +Short headers only include the Destination Connection ID and omit the explicit +length. The length of the Destination Connection ID field is expected to be +known to endpoints. + +Endpoints using a connection-ID based load balancer could agree with the load +balancer on a fixed or minimum length and on an encoding for connection IDs. +This fixed portion could encode an explicit length, which allows the entire +connection ID to vary in length and still be used by the load balancer. + +The very first packet sent by a client includes a random value for Destination +Connection ID. The same value MUST be used for all 0-RTT packets sent on that +connection ({{packet-protected}}). This randomized value is used to determine +the packet protection keys for Initial packets (see Section 5.2 of +{{QUIC-TLS}}). + +A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs +selected by the client, swapped to ensure correct routing toward the client. 
+ +The connection ID can change over the lifetime of a connection, especially in +response to connection migration ({{migration}}). NEW_CONNECTION_ID frames +({{frame-new-connection-id}}) are used to provide new connection ID values. + + +## Packet Numbers {#packet-numbers} + +The packet number is an integer in the range 0 to 2^62-1. The value is used in +determining the cryptographic nonce for packet protection. Each endpoint +maintains a separate packet number for sending and receiving. + +Packet numbers are divided into 3 spaces in QUIC: + +- Initial space: All Initial packets {{packet-initial}} are in this space. +- Handshake space: All Handshake packets {{packet-handshake}} are in this space. +- Application data space: All 0-RTT and 1-RTT encrypted packets + {{packet-protected}} are in this space. + +As described in {{QUIC-TLS}}, each packet type uses different protection keys. + +Conceptually, a packet number space is the context in which a packet can be +processed and acknowledged. Initial packets can only be sent with Initial +packet protection keys and acknowledged in packets which are also Initial +packets. Similarly, Handshake packets are sent at the Handshake encryption +level and can only be acknowledged in Handshake packets. + +This enforces cryptographic separation between the data sent in the different +packet sequence number spaces. Each packet number space starts at packet number +0. Subsequent packets sent in the same packet number space MUST increase the +packet number by at least one. + +0-RTT and 1-RTT data exist in the same packet number space to make loss recovery +algorithms easier to implement between the two packet types. + +A QUIC endpoint MUST NOT reuse a packet number within the same packet number +space in one connection (that is, under the same cryptographic keys). 
If the +packet number for sending reaches 2^62 - 1, the sender MUST close the connection +without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY +send a Stateless Reset ({{stateless-reset}}) in response to further packets that +it receives. + +In the QUIC long and short packet headers, the number of bits required to +represent the packet number is reduced by including only a variable number of +the least significant bits of the packet number. One or two of the most +significant bits of the first octet determine how many bits of the packet +number are provided, as shown in {{pn-encodings}}. + +| First octet pattern | Encoded Length | Bits Present | +|:--------------------|:---------------|:-------------| +| 0b0xxxxxxx | 1 octet | 7 | +| 0b10xxxxxx | 2 | 14 | +| 0b11xxxxxx | 4 | 30 | +{: #pn-encodings title="Packet Number Encodings for Packet Headers"} + +Note that these encodings are similar to those in {{integer-encoding}}, but +use different values. + +The encoded packet number is protected as described in Section 5.3 +{{QUIC-TLS}}. Protection of the packet number is removed prior to recovering the +full packet number. The full packet number is reconstructed at the receiver +based on the number of significant bits present, the value of those bits, and +the largest packet number received on a successfully authenticated +packet. Recovering the full packet number is necessary to successfully remove +packet protection. + +Once packet number protection is removed, the packet number is decoded by +finding the packet number value that is closest to the next expected packet. +The next expected packet is the highest received packet number plus one. For +example, if the highest successfully authenticated packet had a packet number of +0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as +0xaa8309b3. +Example pseudo-code for packet number decoding can be found in +{{sample-packet-number-decoding}}. 
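
The decoding rule described above can be sketched in Python. This is an illustrative sketch, not the draft's own sample code: the function names `parse_truncated_pn` and `decode_packet_number` are hypothetical, and the sketch assumes packet number protection has already been removed from the input octets.

```python
PN_LIMIT = 1 << 62  # packet numbers lie in the range 0 to 2^62-1


def parse_truncated_pn(data: bytes) -> tuple[int, int, int]:
    """Read an encoded packet number per the first-octet patterns in the
    table above. Returns (truncated value, bits present, octets consumed)."""
    first = data[0]
    if first & 0x80 == 0:            # 0b0xxxxxxx: 1 octet, 7 bits
        return first & 0x7F, 7, 1
    if first & 0xC0 == 0x80:         # 0b10xxxxxx: 2 octets, 14 bits
        return int.from_bytes(data[:2], "big") & 0x3FFF, 14, 2
    # 0b11xxxxxx: 4 octets, 30 bits
    return int.from_bytes(data[:4], "big") & 0x3FFFFFFF, 30, 4


def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Pick the full packet number closest to the next expected packet
    (largest authenticated packet number plus one)."""
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1
    # Combine the high bits of the expected value with the received low bits.
    candidate = (expected_pn & ~pn_mask) | truncated_pn
    # Shift by one window if that lands closer to the expected value.
    if candidate <= expected_pn - pn_hwin and candidate < PN_LIMIT - pn_win:
        return candidate + pn_win
    if candidate > expected_pn + pn_hwin and candidate >= pn_win:
        return candidate - pn_win
    return candidate


# The worked example from the text: 14-bit value 0x9b3 after 0xaa82f30e.
value, nbits, _ = parse_truncated_pn(bytes([0x89, 0xB3]))
assert decode_packet_number(0xAA82F30E, value, nbits) == 0xAA8309B3
```

The candidate is adjusted by at most one window in either direction, which is why a packet delayed behind many higher-numbered packets can decode incorrectly if the sender chose too small an encoding.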
+ +The sender MUST use a packet number size able to represent more than twice as +large a range than the difference between the largest acknowledged packet and +packet number being sent. A peer receiving the packet will then correctly +decode the packet number, unless the packet is delayed in transit such that it +arrives after many higher-numbered packets have been received. An endpoint +SHOULD use a large enough packet number encoding to allow the packet number to +be recovered even if the packet arrives after packets that are sent afterwards. + +As a result, the size of the packet number encoding is at least one more than +the base 2 logarithm of the number of contiguous unacknowledged packet numbers, +including the new packet. + +For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, +sending a packet with a number of 0x6b2d79 requires a packet number encoding +with 14 bits or more; whereas the 30-bit packet number encoding is needed to +send a packet with a number of 0x6bc107. + +A receiver MUST discard a newly unprotected packet unless it is certain that it +has not processed another packet with the same packet number from the same +packet number space. Duplicate suppression MUST happen after removing packet +protection for the reasons described in Section 9.3 of {{QUIC-TLS}}. An +efficient algorithm for duplicate suppression can be found in Section 3.4.3 of +{{?RFC2406}}. + +A Version Negotiation packet ({{packet-version}}) does not include a packet +number. The Retry packet ({{packet-retry}}) has special rules for populating +the packet number field. + + +# Frames and Frame Types {#frames} + +The payload of all packets, after removing packet protection, consists of a +sequence of frames, as shown in {{packet-frames}}. Version Negotiation and +Stateless Reset do not contain frames. 
+ +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame 1 (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame 2 (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame N (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #packet-frames title="QUIC Payload"} + +QUIC payloads MUST contain at least one frame, and MAY contain multiple +frames and multiple frame types. + +Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC packet +boundary. Each frame begins with a Frame Type, indicating its type, followed by +additional type-dependent fields: + +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame Type (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Type-Dependent Fields (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #frame-layout title="Generic Frame Layout"} + +The frame types defined in this specification are listed in {{frame-types}}. +The Frame Type in STREAM frames is used to carry other frame-specific flags. +For all other frames, the Frame Type field simply identifies the frame. These +frames are explained in more detail as they are referenced later in the +document. 
+ +| Type Value | Frame Type Name | Definition | +|:------------|:---------------------|:-------------------------------| +| 0x00 | PADDING | {{frame-padding}} | +| 0x01 | RST_STREAM | {{frame-rst-stream}} | +| 0x02 | CONNECTION_CLOSE | {{frame-connection-close}} | +| 0x03 | APPLICATION_CLOSE | {{frame-application-close}} | +| 0x04 | MAX_DATA | {{frame-max-data}} | +| 0x05 | MAX_STREAM_DATA | {{frame-max-stream-data}} | +| 0x06 | MAX_STREAM_ID | {{frame-max-stream-id}} | +| 0x07 | PING | {{frame-ping}} | +| 0x08 | BLOCKED | {{frame-blocked}} | +| 0x09 | STREAM_BLOCKED | {{frame-stream-blocked}} | +| 0x0a | STREAM_ID_BLOCKED | {{frame-stream-id-blocked}} | +| 0x0b | NEW_CONNECTION_ID | {{frame-new-connection-id}} | +| 0x0c | STOP_SENDING | {{frame-stop-sending}} | +| 0x0d | RETIRE_CONNECTION_ID | {{frame-retire-connection-id}} | +| 0x0e | PATH_CHALLENGE | {{frame-path-challenge}} | +| 0x0f | PATH_RESPONSE | {{frame-path-response}} | +| 0x10 - 0x17 | STREAM | {{frame-stream}} | +| 0x18 | CRYPTO | {{frame-crypto}} | +| 0x19 | NEW_TOKEN | {{frame-new-token}} | +| 0x1a - 0x1b | ACK | {{frame-ack}} | +{: #frame-types title="Frame Types"} + +All QUIC frames are idempotent. That is, a valid frame does not cause +undesirable side effects or errors when received more than once. + +The Frame Type field uses a variable length integer encoding (see +{{integer-encoding}}) with one exception. To ensure simple and efficient +implementations of frame parsing, a frame type MUST use the shortest possible +encoding. Though a two-, four- or eight-octet encoding of the frame types +defined in this document is possible, the Frame Type field for these frames is +encoded on a single octet. For instance, though 0x4007 is a legitimate +two-octet encoding for a variable-length integer with a value of 7, PING frames +are always encoded as a single octet with the value 0x07. 
An endpoint MUST +treat the receipt of a frame type that uses a longer encoding than necessary as +a connection error of type PROTOCOL_VIOLATION. -An endpoint can increase the odds that a packet will trigger a Stateless Reset -if it cannot be processed by padding it to at least 38 octets. # Frame Types and Formats @@ -4517,239 +4739,26 @@ level. The stream does not have an explicit end, so CRYPTO frames do not have a FIN bit. -# Packetization and Reliability {#packetization} - -A sender bundles one or more frames in a QUIC packet (see {{frames}}). - -A sender SHOULD minimize per-packet bandwidth and computational costs by -bundling as many frames as possible within a QUIC packet. A sender MAY wait for -a short period of time to bundle multiple frames before sending a packet that is -not maximally packed, to avoid sending out large numbers of small packets. An -implementation may use knowledge about application sending behavior or -heuristics to determine whether and for how long to wait. This waiting period -is an implementation decision, and an implementation should be careful to delay -conservatively, since any delay is likely to increase application-visible -latency. - - -## Packet Processing and Acknowledgment {#processing-and-ack} - -A packet MUST NOT be acknowledged until packet protection has been successfully -removed and all frames contained in the packet have been processed. For STREAM -frames, this means the data has been enqueued in preparation to be received by -the application protocol, but it does not require that data is delivered and -consumed. - -Once the packet has been fully processed, a receiver acknowledges receipt by -sending one or more ACK frames containing the packet number of the received -packet. To avoid creating an indefinite feedback loop, an endpoint MUST NOT -send an ACK frame in response to a packet containing only ACK or PADDING frames, -even if there are packet gaps which precede the received packet. 
The endpoint -MUST acknowledge packets containing only ACK or PADDING frames in the next ACK -frame that it sends. - -While PADDING frames do not elicit an ACK frame from a receiver, they are -considered to be in flight for congestion control purposes -{{QUIC-RECOVERY}}. Sending only PADDING frames might cause the sender to become -limited by the congestion controller (as described in {{QUIC-RECOVERY}}) with no -acknowledgments forthcoming from the receiver. Therefore, a sender should ensure -that other frames are sent in addition to PADDING frames to elicit -acknowledgments from the receiver. - -Strategies and implications of the frequency of generating acknowledgments are -discussed in more detail in {{QUIC-RECOVERY}}. - - -## Retransmission of Information - -QUIC packets that are determined to be lost are not retransmitted whole. The -same applies to the frames that are contained within lost packets. Instead, the -information that might be carried in frames is sent again in new frames as -needed. - -New frames and packets are used to carry information that is determined to have -been lost. In general, information is sent again when a packet containing that -information is determined to be lost and sending ceases when a packet -containing that information is acknowledged. - -* Data sent in CRYPTO frames is retransmitted according to the rules in - {{QUIC-RECOVERY}}, until either all data has been acknowledged or the crypto - state machine implicitly knows that the peer received the data. - -* Application data sent in STREAM frames is retransmitted in new STREAM frames - unless the endpoint has sent a RST_STREAM for that stream. Once an endpoint - sends a RST_STREAM frame, no further STREAM frames are needed. - -* The most recent set of acknowledgments are sent in ACK frames. An ACK frame - SHOULD contain all unacknowledged acknowledgments, as described in - {{sending-ack-frames}}. 
- -* Cancellation of stream transmission, as carried in a RST_STREAM frame, is - sent until acknowledged or until all stream data is acknowledged by the peer - (that is, either the "Reset Recvd" or "Data Recvd" state is reached on the - send stream). The content of a RST_STREAM frame MUST NOT change when it is - sent again. - -* Similarly, a request to cancel stream transmission, as encoded in a - STOP_SENDING frame, is sent until the receive stream enters either a "Data - Recvd" or "Reset Recvd" state, see {{solicited-state-transitions}}. - -* Connection close signals, including those that use CONNECTION_CLOSE and - APPLICATION_CLOSE frames, are not sent again when packet loss is detected, but - as described in {{termination}}. - -* The current connection maximum data is sent in MAX_DATA frames. An updated - value is sent in a MAX_DATA frame if the packet containing the most recently - sent MAX_DATA frame is declared lost, or when the endpoint decides to update - the limit. Care is necessary to avoid sending this frame too often as the - limit can increase frequently and cause an unnecessarily large number of - MAX_DATA frames to be sent. - -* The current maximum stream data offset is sent in MAX_STREAM_DATA frames. - Like MAX_DATA, an updated value is sent when the packet containing - the most recent MAX_STREAM_DATA frame for a stream is lost or when the limit - is updated, with care taken to prevent the frame from being sent too often. An - endpoint SHOULD stop sending MAX_STREAM_DATA frames when the receive stream - enters a "Size Known" state. - -* The maximum stream ID for a stream of a given type is sent in MAX_STREAM_ID - frames. Like MAX_DATA, an updated value is sent when a packet containing the - most recent MAX_STREAM_ID for a stream type frame is declared lost or when - the limit is updated, with care taken to prevent the frame from being sent - too often. - -* Blocked signals are carried in BLOCKED, STREAM_BLOCKED, and STREAM_ID_BLOCKED - frames. 
BLOCKED streams have connection scope, STREAM_BLOCKED frames have - stream scope, and STREAM_ID_BLOCKED frames are scoped to a specific stream - type. New frames are sent if packets containing the most recent frame for a - scope is lost, but only while the endpoint is blocked on the corresponding - limit. These frames always include the limit that is causing blocking at the - time that they are transmitted. - -* A liveness or path validation check using PATH_CHALLENGE frames is sent - periodically until a matching PATH_RESPONSE frame is received or until there - is no remaining need for liveness or path validation checking. PATH_CHALLENGE - frames include a different payload each time they are sent. - -* Responses to path validation using PATH_RESPONSE frames are sent just once. - A new PATH_CHALLENGE frame will be sent if another PATH_RESPONSE frame is - needed. - -* New connection IDs are sent in NEW_CONNECTION_ID frames and retransmitted if - the packet containing them is lost. Retransmissions of this frame carry the - same sequence number value. Likewise, retired connection IDs are sent in - RETIRE_CONNECTION_ID frames and retransmitted if the packet containing them is - lost. - -* PADDING frames contain no information, so lost PADDING frames do not require - repair. - -Upon detecting losses, a sender MUST take appropriate congestion control action. -The details of loss detection and congestion control are described in -{{QUIC-RECOVERY}}. - - -## Packet Size {#packet-size} - -The QUIC packet size includes the QUIC header and integrity check, but not the -UDP or IP header. - -Clients MUST ensure that the first Initial packet they send is sent in a UDP -datagram that is at least 1200 octets. Padding the Initial packet or including a -0-RTT packet in the same datagram are ways to meet this requirement. 
Sending a -UDP datagram of this size ensures that the network path supports a reasonable -Maximum Transmission Unit (MTU), and helps reduce the amplitude of amplification -attacks caused by server responses toward an unverified client address. - -The datagram containing the first Initial packet from a client MAY exceed 1200 -octets if the client believes that the Path Maximum Transmission Unit (PMTU) -supports the size that it chooses. - -A server MAY send a CONNECTION_CLOSE frame with error code PROTOCOL_VIOLATION in -response to the first Initial packet it receives from a client if the UDP -datagram is smaller than 1200 octets. It MUST NOT send any other frame type in -response, or otherwise behave as if any part of the offending packet was -processed as valid. - - -## Path Maximum Transmission Unit - -The Path Maximum Transmission Unit (PMTU) is the maximum size of the entire IP -header, UDP header, and UDP payload. The UDP payload includes the QUIC packet -header, protected payload, and any authentication fields. - -All QUIC packets SHOULD be sized to fit within the estimated PMTU to avoid IP -fragmentation or packet drops. To optimize bandwidth efficiency, endpoints -SHOULD use Packetization Layer PMTU Discovery ({{!PLPMTUD=RFC4821}}). Endpoints -MAY use PMTU Discovery ({{!PMTUDv4=RFC1191}}, {{!PMTUDv6=RFC8201}}) for -detecting the PMTU, setting the PMTU appropriately, and storing the result of -previous PMTU determinations. - -In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP packets -larger than 1280 octets. Assuming the minimum IP header size, this results in -a QUIC packet size of 1232 octets for IPv6 and 1252 octets for IPv4. Some -QUIC implementations MAY be more conservative in computing allowed QUIC packet -size given unknown tunneling overheads or IP header options. - -QUIC endpoints that implement any kind of PMTU discovery SHOULD maintain an -estimate for each combination of local and remote IP addresses. 
Each pairing of -local and remote addresses could have a different maximum MTU in the path. - -QUIC depends on the network path supporting an MTU of at least 1280 octets. This -is the IPv6 minimum MTU and therefore also supported by most modern IPv4 -networks. An endpoint MUST NOT reduce its MTU below this number, even if it -receives signals that indicate a smaller limit might exist. - -If a QUIC endpoint determines that the PMTU between any pair of local and remote -IP addresses has fallen below 1280 octets, it MUST immediately cease sending -QUIC packets on the affected path. This could result in termination of the -connection if an alternative path cannot be found. - - -### IPv4 PMTU Discovery {#v4-pmtud} - -Traditional ICMP-based path MTU discovery in IPv4 {{!PMTUDv4}} is potentially -vulnerable to off-path attacks that successfully guess the IP/port 4-tuple and -reduce the MTU to a bandwidth-inefficient value. TCP connections mitigate this -risk by using the (at minimum) 8 bytes of transport header echoed in the ICMP -message to validate the TCP sequence number as valid for the current -connection. However, as QUIC operates over UDP, in IPv4 the echoed information -could consist only of the IP and UDP headers, which usually has insufficient -entropy to mitigate off-path attacks. - -As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps to -mitigate this risk. For instance, an application could: - -* Set the IPv4 Don't Fragment (DF) bit on a small proportion of packets, so that -most invalid ICMP messages arrive when there are no DF packets outstanding, and -can therefore be identified as spurious. - -* Store additional information from the IP or UDP headers from DF packets (for -example, the IP ID or UDP checksum) to further authenticate incoming Datagram -Too Big messages. - -* Any reduction in PMTU due to a report contained in an ICMP packet is -provisional until QUIC's loss detection algorithm determines that the packet is -actually lost. 
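
The 1232- and 1252-octet QUIC packet sizes quoted in the Path Maximum Transmission Unit text above follow from subtracting the minimum IP header and the UDP header from a 1280-octet IP packet. A small sketch of that arithmetic follows; the constant and function names are hypothetical, and the header sizes assume no IPv4 options or IPv6 extension headers.

```python
MIN_PMTU = 1280                            # smallest path MTU QUIC relies on
UDP_HEADER = 8                             # fixed UDP header size in octets
IP_HEADER_MIN = {"ipv4": 20, "ipv6": 40}   # minimum IP header sizes in octets


def max_quic_packet_size(ip_version: str, pmtu: int = MIN_PMTU) -> int:
    """Largest QUIC packet (UDP payload) that fits in one IP packet of
    size `pmtu`, assuming minimum-size IP headers."""
    return pmtu - IP_HEADER_MIN[ip_version] - UDP_HEADER


assert max_quic_packet_size("ipv6") == 1232
assert max_quic_packet_size("ipv4") == 1252
```

Tunneling overhead or IP options reduce these figures further, which is why the text allows implementations to be more conservative.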
- +## Extension Frames -### Special Considerations for Packetization Layer PMTU Discovery +QUIC frames do not use a self-describing encoding. An endpoint therefore needs +to understand the syntax of all frames before it can successfully process a +packet. This allows for efficient encoding of frames, but it means that an +endpoint cannot send a frame of a type that is unknown to its peer. +An extension to QUIC that wishes to use a new type of frame MUST first ensure +that a peer is able to understand the frame. An endpoint can use a transport +parameter to signal its willingness to receive one or more extension frame types +with the one transport parameter. -The PADDING frame provides a useful option for PMTU probe packets. PADDING -frames generate acknowledgements, but they need not be delivered reliably. As a -result, the loss of PADDING frames in probe packets does not require -delay-inducing retransmission. However, PADDING frames do consume congestion -window, which may delay the transmission of subsequent application data. +Extension frames MUST be congestion controlled and MUST cause an ACK frame to +be sent. The exception is extension frames that replace or supplement the ACK +frame. Extension frames are not included in flow control unless specified +in the extension. -When implementing the algorithm in Section 7.2 of {{!PLPMTUD}}, the initial -value of search_low SHOULD be consistent with the IPv6 minimum packet size. -Paths that do not support this size cannot deliver Initial packets, and -therefore are not QUIC-compliant. +An IANA registry is used to manage the assignment of frame types, see +{{iana-frames}}. -Section 7.3 of {{!PLPMTUD}} discusses trade-offs between small and large -increases in the size of probe packets. As QUIC probe packets need not contain -application data, aggressive increases in probe size carry fewer consequences. 
# Error Handling From c4ce2c29be13de5146556c2a49ef3c8aa57cc317 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Fri, 12 Oct 2018 23:02:02 -0700 Subject: [PATCH 28/57] created packets and frames section and moved things about --- draft-ietf-quic-transport.md | 1079 +++++++++++++++++----------------- 1 file changed, 542 insertions(+), 537 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index a506524663..2ed3e9f04c 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -2498,317 +2498,622 @@ if it cannot be processed by padding it to at least 38 octets. -# Packetization and Reliability {#packetization} +# Packets and Frames {#packets-frames} -A sender bundles one or more frames in a QUIC packet (see {{frames}}). +Any QUIC packet, with the exception of the Version Negotiation packet, has +either a long or a short header, as indicated by the Header Form bit. Long +headers are expected to be used early in the connection before the establishment +of 1-RTT keys. Packets that carry the long header are Initial +{{packet-initial}}, Retry {{packet-retry}}, Handshake {{packet-handshake}}, and +0-RTT Protected packets {{packet-protected}}. Packets that carry Short headers +are minimal version-specific headers, which are used after version negotiation +and 1-RTT keys are established, and are described in {{short-header}}. Version +Negotiation packets are described in {{packet-version}}. -A sender SHOULD minimize per-packet bandwidth and computational costs by -bundling as many frames as possible within a QUIC packet. A sender MAY wait for -a short period of time to bundle multiple frames before sending a packet that is -not maximally packed, to avoid sending out large numbers of small packets. An -implementation may use knowledge about application sending behavior or -heuristics to determine whether and for how long to wait. 
This waiting period -is an implementation decision, and an implementation should be careful to delay -conservatively, since any delay is likely to increase application-visible -latency. +## Protected Packets {#packet-protected} -## Packet Processing and Acknowledgment {#processing-and-ack} +All QUIC packets use packet protection. Packets that are protected with the +static handshake keys or the 0-RTT keys are sent with long headers; all packets +protected with 1-RTT keys are sent with short headers. The different packet +types explicitly indicate the encryption level and therefore the keys that are +used to remove packet protection. 0-RTT and 1-RTT protected packets share a +single packet number space. -A packet MUST NOT be acknowledged until packet protection has been successfully -removed and all frames contained in the packet have been processed. For STREAM -frames, this means the data has been enqueued in preparation to be received by -the application protocol, but it does not require that data is delivered and -consumed. +Packets protected with handshake keys only use packet protection to ensure that +the sender of the packet is on the network path. This packet protection is not +effective confidentiality protection; any entity that receives the Initial +packet from a client can recover the keys necessary to remove packet protection +or to generate packets that will be successfully authenticated. -Once the packet has been fully processed, a receiver acknowledges receipt by -sending one or more ACK frames containing the packet number of the received -packet. To avoid creating an indefinite feedback loop, an endpoint MUST NOT -send an ACK frame in response to a packet containing only ACK or PADDING frames, -even if there are packet gaps which precede the received packet. The endpoint -MUST acknowledge packets containing only ACK or PADDING frames in the next ACK -frame that it sends. 
+Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality +and data origin authentication; the cryptographic handshake ensures that only +the communicating endpoints receive the corresponding keys. -While PADDING frames do not elicit an ACK frame from a receiver, they are -considered to be in flight for congestion control purposes -{{QUIC-RECOVERY}}. Sending only PADDING frames might cause the sender to become -limited by the congestion controller (as described in {{QUIC-RECOVERY}}) with no -acknowledgments forthcoming from the receiver. Therefore, a sender should ensure -that other frames are sent in addition to PADDING frames to elicit -acknowledgments from the receiver. +Packets protected with 0-RTT keys use a type value of 0x7C. The connection ID +fields for a 0-RTT packet MUST match the values used in the Initial packet +({{packet-initial}}). -Strategies and implications of the frequency of generating acknowledgments are -discussed in more detail in {{QUIC-RECOVERY}}. +The version field for protected packets is the current QUIC version. +The packet number field contains a packet number, which has additional +confidentiality protection that is applied after packet protection is applied +(see {{QUIC-TLS}} for details). The underlying packet number increases with +each packet sent, see {{packet-numbers}} for details. -## Retransmission of Information +The payload is protected using authenticated encryption. {{QUIC-TLS}} describes +packet protection in detail. After decryption, the plaintext consists of a +sequence of frames, as described in {{frames}}. -QUIC packets that are determined to be lost are not retransmitted whole. The -same applies to the frames that are contained within lost packets. Instead, the -information that might be carried in frames is sent again in new frames as -needed. -New frames and packets are used to carry information that is determined to have -been lost. 
In general, information is sent again when a packet containing that -information is determined to be lost and sending ceases when a packet -containing that information is acknowledged. +## Coalescing Packets {#packet-coalesce} -* Data sent in CRYPTO frames is retransmitted according to the rules in - {{QUIC-RECOVERY}}, until either all data has been acknowledged or the crypto - state machine implicitly knows that the peer received the data. +A sender can coalesce multiple QUIC packets (typically a Cryptographic Handshake +packet and a Protected packet) into one UDP datagram. This can reduce the +number of UDP datagrams needed to send application data during the handshake and +immediately afterwards. It is not necessary for senders to coalesce +packets, though failing to do so will require sending a significantly +larger number of datagrams during the handshake. Receivers MUST +be able to process coalesced packets. -* Application data sent in STREAM frames is retransmitted in new STREAM frames - unless the endpoint has sent a RST_STREAM for that stream. Once an endpoint - sends a RST_STREAM frame, no further STREAM frames are needed. +Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, +Handshake, 1-RTT) makes it more likely the receiver will be able to process all +the packets in a single pass. A packet with a short header does not include a +length, so it will always be the last packet included in a UDP datagram. -* The most recent set of acknowledgments are sent in ACK frames. An ACK frame - SHOULD contain all unacknowledged acknowledgments, as described in - {{sending-ack-frames}}. +Senders MUST NOT coalesce QUIC packets with different Destination Connection +IDs into a single UDP datagram. Receivers SHOULD ignore any subsequent packets +with a different Destination Connection ID than the first packet in the +datagram. 
-* Cancellation of stream transmission, as carried in a RST_STREAM frame, is - sent until acknowledged or until all stream data is acknowledged by the peer - (that is, either the "Reset Recvd" or "Data Recvd" state is reached on the - send stream). The content of a RST_STREAM frame MUST NOT change when it is - sent again. +Every QUIC packet that is coalesced into a single UDP datagram is separate and +complete. Though the values of some fields in the packet header might be +redundant, no fields are omitted. The receiver of coalesced QUIC packets MUST +individually process each QUIC packet and separately acknowledge them, as if +they were received as the payload of different UDP datagrams. If one or more +packets in a datagram cannot be processed yet (because the keys are not yet +available) or processing fails (decryption failure, unknown type, etc.), the +receiver MUST still attempt to process the remaining packets. The skipped +packets MAY either be discarded or buffered for later processing, just as if the +packets were received out-of-order in separate datagrams. -* Similarly, a request to cancel stream transmission, as encoded in a - STOP_SENDING frame, is sent until the receive stream enters either a "Data - Recvd" or "Reset Recvd" state, see {{solicited-state-transitions}}. +Retry ({{packet-retry}}) and Version Negotiation ({{packet-version}}) packets +cannot be coalesced. -* Connection close signals, including those that use CONNECTION_CLOSE and - APPLICATION_CLOSE frames, are not sent again when packet loss is detected, but - as described in {{termination}}. -* The current connection maximum data is sent in MAX_DATA frames. An updated - value is sent in a MAX_DATA frame if the packet containing the most recently - sent MAX_DATA frame is declared lost, or when the endpoint decides to update - the limit. 
Care is necessary to avoid sending this frame too often as the - limit can increase frequently and cause an unnecessarily large number of - MAX_DATA frames to be sent. +## Connection ID Encoding -* The current maximum stream data offset is sent in MAX_STREAM_DATA frames. - Like MAX_DATA, an updated value is sent when the packet containing - the most recent MAX_STREAM_DATA frame for a stream is lost or when the limit - is updated, with care taken to prevent the frame from being sent too often. An - endpoint SHOULD stop sending MAX_STREAM_DATA frames when the receive stream - enters a "Size Known" state. +A connection ID is used to ensure consistent routing of packets, as described in +{{connection-id}}. The long header contains two connection IDs: the Destination +Connection ID is chosen by the recipient of the packet and is used to provide +consistent routing; the Source Connection ID is used to set the Destination +Connection ID used by the peer. -* The maximum stream ID for a stream of a given type is sent in MAX_STREAM_ID - frames. Like MAX_DATA, an updated value is sent when a packet containing the - most recent MAX_STREAM_ID for a stream type frame is declared lost or when - the limit is updated, with care taken to prevent the frame from being sent - too often. +During the handshake, packets with the long header are used to establish the +connection ID that each endpoint uses. Each endpoint uses the Source Connection +ID field to specify the connection ID that is used in the Destination Connection +ID field of packets being sent to them. Upon receiving a packet, each endpoint +sets the Destination Connection ID it sends to match the value of the Source +Connection ID that they receive. -* Blocked signals are carried in BLOCKED, STREAM_BLOCKED, and STREAM_ID_BLOCKED - frames. BLOCKED streams have connection scope, STREAM_BLOCKED frames have - stream scope, and STREAM_ID_BLOCKED frames are scoped to a specific stream - type. 
New frames are sent if packets containing the most recent frame for a - scope is lost, but only while the endpoint is blocked on the corresponding - limit. These frames always include the limit that is causing blocking at the - time that they are transmitted. +During the handshake, a client can receive both a Retry and an Initial packet, +and thus be given two opportunities to update the Destination Connection ID it +sends. A client MUST only change the value it sends in the Destination +Connection ID in response to the first packet of each type it receives from the +server (Retry or Initial); a server MUST set its value based on the Initial +packet. Any additional changes are not permitted; if subsequent packets of +those types include a different Source Connection ID, they MUST be discarded. +This avoids problems that might arise from stateless processing of multiple +Initial packets producing different connection IDs. -* A liveness or path validation check using PATH_CHALLENGE frames is sent - periodically until a matching PATH_RESPONSE frame is received or until there - is no remaining need for liveness or path validation checking. PATH_CHALLENGE - frames include a different payload each time they are sent. +Short headers only include the Destination Connection ID and omit the explicit +length. The length of the Destination Connection ID field is expected to be +known to endpoints. -* Responses to path validation using PATH_RESPONSE frames are sent just once. - A new PATH_CHALLENGE frame will be sent if another PATH_RESPONSE frame is - needed. +Endpoints using a connection-ID based load balancer could agree with the load +balancer on a fixed or minimum length and on an encoding for connection IDs. +This fixed portion could encode an explicit length, which allows the entire +connection ID to vary in length and still be used by the load balancer. -* New connection IDs are sent in NEW_CONNECTION_ID frames and retransmitted if - the packet containing them is lost. 
Retransmissions of this frame carry the - same sequence number value. Likewise, retired connection IDs are sent in - RETIRE_CONNECTION_ID frames and retransmitted if the packet containing them is - lost. +The very first packet sent by a client includes a random value for Destination +Connection ID. The same value MUST be used for all 0-RTT packets sent on that +connection ({{packet-protected}}). This randomized value is used to determine +the packet protection keys for Initial packets (see Section 5.2 of +{{QUIC-TLS}}). -* PADDING frames contain no information, so lost PADDING frames do not require - repair. +A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs +selected by the client, swapped to ensure correct routing toward the client. -Upon detecting losses, a sender MUST take appropriate congestion control action. -The details of loss detection and congestion control are described in -{{QUIC-RECOVERY}}. +The connection ID can change over the lifetime of a connection, especially in +response to connection migration ({{migration}}). NEW_CONNECTION_ID frames +({{frame-new-connection-id}}) are used to provide new connection ID values. -# Packet Size {#packet-size} +## Packet Numbers {#packet-numbers} -The QUIC packet size includes the QUIC header and integrity check, but not the -UDP or IP header. +The packet number is an integer in the range 0 to 2^62-1, present in all long +and short header packets. This number is used in determining the cryptographic +nonce for packet protection. Each endpoint maintains a separate packet number +for sending and receiving. -Clients MUST ensure that the first Initial packet they send is sent in a UDP -datagram that is at least 1200 octets. Padding the Initial packet or including a -0-RTT packet in the same datagram are ways to meet this requirement. 
Sending a -UDP datagram of this size ensures that the network path supports a reasonable -Maximum Transmission Unit (MTU), and helps reduce the amplitude of amplification -attacks caused by server responses toward an unverified client address. +A Version Negotiation packet ({{packet-version}}) does not include a packet +number. The Retry packet ({{packet-retry}}) has special rules for populating +the packet number field. -The datagram containing the first Initial packet from a client MAY exceed 1200 -octets if the client believes that the Path Maximum Transmission Unit (PMTU) -supports the size that it chooses. +Packet numbers are divided into 3 spaces in QUIC: -A server MAY send a CONNECTION_CLOSE frame with error code PROTOCOL_VIOLATION in -response to the first Initial packet it receives from a client if the UDP -datagram is smaller than 1200 octets. It MUST NOT send any other frame type in -response, or otherwise behave as if any part of the offending packet was -processed as valid. +- Initial space: All Initial packets {{packet-initial}} are in this space. +- Handshake space: All Handshake packets {{packet-handshake}} are in this space. +- Application data space: All 0-RTT and 1-RTT encrypted packets + {{packet-protected}} are in this space. +As described in {{QUIC-TLS}}, each packet type uses different protection keys. -## Path Maximum Transmission Unit +Conceptually, a packet number space is the context in which a packet can be +processed and acknowledged. Initial packets can only be sent with Initial +packet protection keys and acknowledged in packets which are also Initial +packets. Similarly, Handshake packets are sent at the Handshake encryption +level and can only be acknowledged in Handshake packets. -The Path Maximum Transmission Unit (PMTU) is the maximum size of the entire IP -header, UDP header, and UDP payload. The UDP payload includes the QUIC packet -header, protected payload, and any authentication fields. 
+This enforces cryptographic separation between the data sent in the different +packet sequence number spaces. Each packet number space starts at packet number +0. Subsequent packets sent in the same packet number space MUST increase the +packet number by at least one. -All QUIC packets SHOULD be sized to fit within the estimated PMTU to avoid IP -fragmentation or packet drops. To optimize bandwidth efficiency, endpoints -SHOULD use Packetization Layer PMTU Discovery ({{!PLPMTUD=RFC4821}}). Endpoints -MAY use PMTU Discovery ({{!PMTUDv4=RFC1191}}, {{!PMTUDv6=RFC8201}}) for -detecting the PMTU, setting the PMTU appropriately, and storing the result of -previous PMTU determinations. +0-RTT and 1-RTT data exist in the same packet number space to make loss recovery +algorithms easier to implement between the two packet types. -In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP packets -larger than 1280 octets. Assuming the minimum IP header size, this results in -a QUIC packet size of 1232 octets for IPv6 and 1252 octets for IPv4. Some -QUIC implementations MAY be more conservative in computing allowed QUIC packet -size given unknown tunneling overheads or IP header options. +A QUIC endpoint MUST NOT reuse a packet number within the same packet number +space in one connection (that is, under the same cryptographic keys). If the +packet number for sending reaches 2^62 - 1, the sender MUST close the connection +without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY +send a Stateless Reset ({{stateless-reset}}) in response to further packets that +it receives. -QUIC endpoints that implement any kind of PMTU discovery SHOULD maintain an -estimate for each combination of local and remote IP addresses. Each pairing of -local and remote addresses could have a different maximum MTU in the path. 
+In the QUIC long and short packet headers, the number of bits required to +represent the packet number is reduced by including only a variable number of +the least significant bits of the packet number. One or two of the most +significant bits of the first octet determine how many bits of the packet +number are provided, as shown in {{pn-encodings}}. -QUIC depends on the network path supporting an MTU of at least 1280 octets. This -is the IPv6 minimum MTU and therefore also supported by most modern IPv4 -networks. An endpoint MUST NOT reduce its MTU below this number, even if it -receives signals that indicate a smaller limit might exist. +| First octet pattern | Encoded Length | Bits Present | +|:--------------------|:---------------|:-------------| +| 0b0xxxxxxx | 1 octet | 7 | +| 0b10xxxxxx | 2 | 14 | +| 0b11xxxxxx | 4 | 30 | +{: #pn-encodings title="Packet Number Encodings for Packet Headers"} -If a QUIC endpoint determines that the PMTU between any pair of local and remote -IP addresses has fallen below 1280 octets, it MUST immediately cease sending -QUIC packets on the affected path. This could result in termination of the -connection if an alternative path cannot be found. +Note that these encodings are similar to those in {{integer-encoding}}, but +use different values. +The encoded packet number is protected as described in Section 5.3 +{{QUIC-TLS}}. Protection of the packet number is removed prior to recovering the +full packet number. The full packet number is reconstructed at the receiver +based on the number of significant bits present, the value of those bits, and +the largest packet number received on a successfully authenticated +packet. Recovering the full packet number is necessary to successfully remove +packet protection. -### IPv4 PMTU Discovery {#v4-pmtud} +Once packet number protection is removed, the packet number is decoded by +finding the packet number value that is closest to the next expected packet. 
+The next expected packet is the highest received packet number plus one. For +example, if the highest successfully authenticated packet had a packet number of +0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as +0xaa8309b3. +Example pseudo-code for packet number decoding can be found in +{{sample-packet-number-decoding}}. -Traditional ICMP-based path MTU discovery in IPv4 {{!PMTUDv4}} is potentially -vulnerable to off-path attacks that successfully guess the IP/port 4-tuple and -reduce the MTU to a bandwidth-inefficient value. TCP connections mitigate this -risk by using the (at minimum) 8 bytes of transport header echoed in the ICMP -message to validate the TCP sequence number as valid for the current -connection. However, as QUIC operates over UDP, in IPv4 the echoed information -could consist only of the IP and UDP headers, which usually has insufficient -entropy to mitigate off-path attacks. +The sender MUST use a packet number size able to represent more than twice as +large a range than the difference between the largest acknowledged packet and +packet number being sent. A peer receiving the packet will then correctly +decode the packet number, unless the packet is delayed in transit such that it +arrives after many higher-numbered packets have been received. An endpoint +SHOULD use a large enough packet number encoding to allow the packet number to +be recovered even if the packet arrives after packets that are sent afterwards. -As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps to -mitigate this risk. For instance, an application could: +As a result, the size of the packet number encoding is at least one more than +the base 2 logarithm of the number of contiguous unacknowledged packet numbers, +including the new packet. 
-* Set the IPv4 Don't Fragment (DF) bit on a small proportion of packets, so that -most invalid ICMP messages arrive when there are no DF packets outstanding, and -can therefore be identified as spurious. +For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, +sending a packet with a number of 0x6b2d79 requires a packet number encoding +with 14 bits or more; whereas the 30-bit packet number encoding is needed to +send a packet with a number of 0x6bc107. -* Store additional information from the IP or UDP headers from DF packets (for -example, the IP ID or UDP checksum) to further authenticate incoming Datagram -Too Big messages. +A receiver MUST discard a newly unprotected packet unless it is certain that it +has not processed another packet with the same packet number from the same +packet number space. Duplicate suppression MUST happen after removing packet +protection for the reasons described in Section 9.3 of {{QUIC-TLS}}. An +efficient algorithm for duplicate suppression can be found in Section 3.4.3 of +{{?RFC2406}}. -* Any reduction in PMTU due to a report contained in an ICMP packet is -provisional until QUIC's loss detection algorithm determines that the packet is -actually lost. +## Frames and Frame Types {#frames} -## Special Considerations for Packetization Layer PMTU Discovery +The payload of all packets, after removing packet protection, consists of a +sequence of frames, as shown in {{packet-frames}}. Version Negotiation and +Stateless Reset do not contain frames. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame 1 (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame 2 (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame N (*) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #packet-frames title="QUIC Payload"} -The PADDING frame provides a useful option for PMTU probe packets. PADDING -frames generate acknowledgements, but they need not be delivered reliably. As a -result, the loss of PADDING frames in probe packets does not require -delay-inducing retransmission. However, PADDING frames do consume congestion -window, which may delay the transmission of subsequent application data. +QUIC payloads MUST contain at least one frame, and MAY contain multiple +frames and multiple frame types. -When implementing the algorithm in Section 7.2 of {{!PLPMTUD}}, the initial -value of search_low SHOULD be consistent with the IPv6 minimum packet size. -Paths that do not support this size cannot deliver Initial packets, and -therefore are not QUIC-compliant. +Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC packet +boundary. Each frame begins with a Frame Type, indicating its type, followed by +additional type-dependent fields: -Section 7.3 of {{!PLPMTUD}} discusses trade-offs between small and large -increases in the size of probe packets. As QUIC probe packets need not contain -application data, aggressive increases in probe size carry fewer consequences. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Frame Type (i) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Type-Dependent Fields (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #frame-layout title="Generic Frame Layout"} +The frame types defined in this specification are listed in {{frame-types}}. +The Frame Type in STREAM frames is used to carry other frame-specific flags. +For all other frames, the Frame Type field simply identifies the frame. These +frames are explained in more detail in {{frame-formats}}. 
+| Type Value | Frame Type Name | Definition | +|:------------|:---------------------|:-------------------------------| +| 0x00 | PADDING | {{frame-padding}} | +| 0x01 | RST_STREAM | {{frame-rst-stream}} | +| 0x02 | CONNECTION_CLOSE | {{frame-connection-close}} | +| 0x03 | APPLICATION_CLOSE | {{frame-application-close}} | +| 0x04 | MAX_DATA | {{frame-max-data}} | +| 0x05 | MAX_STREAM_DATA | {{frame-max-stream-data}} | +| 0x06 | MAX_STREAM_ID | {{frame-max-stream-id}} | +| 0x07 | PING | {{frame-ping}} | +| 0x08 | BLOCKED | {{frame-blocked}} | +| 0x09 | STREAM_BLOCKED | {{frame-stream-blocked}} | +| 0x0a | STREAM_ID_BLOCKED | {{frame-stream-id-blocked}} | +| 0x0b | NEW_CONNECTION_ID | {{frame-new-connection-id}} | +| 0x0c | STOP_SENDING | {{frame-stop-sending}} | +| 0x0d | RETIRE_CONNECTION_ID | {{frame-retire-connection-id}} | +| 0x0e | PATH_CHALLENGE | {{frame-path-challenge}} | +| 0x0f | PATH_RESPONSE | {{frame-path-response}} | +| 0x10 - 0x17 | STREAM | {{frame-stream}} | +| 0x18 | CRYPTO | {{frame-crypto}} | +| 0x19 | NEW_TOKEN | {{frame-new-token}} | +| 0x1a - 0x1b | ACK | {{frame-ack}} | +{: #frame-types title="Frame Types"} -# Versions {#versions} +All QUIC frames are idempotent. That is, a valid frame does not cause +undesirable side effects or errors when received more than once. -QUIC versions are identified using a 32-bit unsigned number. +The Frame Type field uses a variable length integer encoding (see +{{integer-encoding}}) with one exception. To ensure simple and efficient +implementations of frame parsing, a frame type MUST use the shortest possible +encoding. Though a two-, four- or eight-octet encoding of the frame types +defined in this document is possible, the Frame Type field for these frames is +encoded on a single octet. For instance, though 0x4007 is a legitimate +two-octet encoding for a variable-length integer with a value of 7, PING frames +are always encoded as a single octet with the value 0x07. 
An endpoint MUST +treat the receipt of a frame type that uses a longer encoding than necessary as +a connection error of type PROTOCOL_VIOLATION. -The version 0x00000000 is reserved to represent version negotiation. This -version of the specification is identified by the number 0x00000001. -Other versions of QUIC might have different properties to this version. The -properties of QUIC that are guaranteed to be consistent across all versions of -the protocol are described in {{QUIC-INVARIANTS}}. -Version 0x00000001 of QUIC uses TLS as a cryptographic handshake protocol, as -described in {{QUIC-TLS}}. +# Packetization and Reliability {#packetization} -Versions with the most significant 16 bits of the version number cleared are -reserved for use in future IETF consensus documents. +A sender bundles one or more frames in a QUIC packet (see {{frames}}). -Versions that follow the pattern 0x?a?a?a?a are reserved for use in forcing -version negotiation to be exercised. That is, any version number where the low -four bits of all octets is 1010 (in binary). A client or server MAY advertise -support for any of these reserved versions. +A sender SHOULD minimize per-packet bandwidth and computational costs by +bundling as many frames as possible within a QUIC packet. A sender MAY wait for +a short period of time to bundle multiple frames before sending a packet that is +not maximally packed, to avoid sending out large numbers of small packets. An +implementation may use knowledge about application sending behavior or +heuristics to determine whether and for how long to wait. This waiting period +is an implementation decision, and an implementation should be careful to delay +conservatively, since any delay is likely to increase application-visible +latency. 
-Reserved version numbers will probably never represent a real protocol; a client -MAY use one of these version numbers with the expectation that the server will -initiate version negotiation; a server MAY advertise support for one of these -versions and can expect that clients ignore the value. -\[\[RFC editor: please remove the remainder of this section before -publication.]] +## Packet Processing and Acknowledgment {#processing-and-ack} -The version number for the final version of this specification (0x00000001), is -reserved for the version of the protocol that is published as an RFC. +A packet MUST NOT be acknowledged until packet protection has been successfully +removed and all frames contained in the packet have been processed. For STREAM +frames, this means the data has been enqueued in preparation to be received by +the application protocol, but it does not require that data is delivered and +consumed. -Version numbers used to identify IETF drafts are created by adding the draft -number to 0xff000000. For example, draft-ietf-quic-transport-13 would be -identified as 0xff00000D. +Once the packet has been fully processed, a receiver acknowledges receipt by +sending one or more ACK frames containing the packet number of the received +packet. To avoid creating an indefinite feedback loop, an endpoint MUST NOT +send an ACK frame in response to a packet containing only ACK or PADDING frames, +even if there are packet gaps which precede the received packet. The endpoint +MUST acknowledge packets containing only ACK or PADDING frames in the next ACK +frame that it sends. -Implementors are encouraged to register version numbers of QUIC that they are -using for private experimentation on the GitHub wiki at -\. +While PADDING frames do not elicit an ACK frame from a receiver, they are +considered to be in flight for congestion control purposes +{{QUIC-RECOVERY}}. 
Sending only PADDING frames might cause the sender to become +limited by the congestion controller (as described in {{QUIC-RECOVERY}}) with no +acknowledgments forthcoming from the receiver. Therefore, a sender should ensure +that other frames are sent in addition to PADDING frames to elicit +acknowledgments from the receiver. +Strategies and implications of the frequency of generating acknowledgments are +discussed in more detail in {{QUIC-RECOVERY}}. -# Packet Types and Formats +## Retransmission of Information -We first describe QUIC's packet types and their formats, since some are -referenced in subsequent mechanisms. +QUIC packets that are determined to be lost are not retransmitted whole. The +same applies to the frames that are contained within lost packets. Instead, the +information that might be carried in frames is sent again in new frames as +needed. -All numeric values are encoded in network byte order (that is, big-endian) and -all field sizes are in bits. When discussing individual bits of fields, the -least significant bit is referred to as bit 0. Hexadecimal notation is used for -describing the value of fields. +New frames and packets are used to carry information that is determined to have +been lost. In general, information is sent again when a packet containing that +information is determined to be lost and sending ceases when a packet +containing that information is acknowledged. -Any QUIC packet has either a long or a short header, as indicated by the Header -Form bit. Long headers are expected to be used early in the connection before -version negotiation and establishment of 1-RTT keys. Short headers are minimal -version-specific headers, which are used after version negotiation and 1-RTT -keys are established. +* Data sent in CRYPTO frames is retransmitted according to the rules in + {{QUIC-RECOVERY}}, until either all data has been acknowledged or the crypto + state machine implicitly knows that the peer received the data. 
-## Long Header {#long-header} +* Application data sent in STREAM frames is retransmitted in new STREAM frames + unless the endpoint has sent a RST_STREAM for that stream. Once an endpoint + sends a RST_STREAM frame, no further STREAM frames are needed. -~~~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| Type (7) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +* The most recent set of acknowledgments are sent in ACK frames. An ACK frame + SHOULD contain all unacknowledged acknowledgments, as described in + {{sending-ack-frames}}. + +* Cancellation of stream transmission, as carried in a RST_STREAM frame, is + sent until acknowledged or until all stream data is acknowledged by the peer + (that is, either the "Reset Recvd" or "Data Recvd" state is reached on the + send stream). The content of a RST_STREAM frame MUST NOT change when it is + sent again. + +* Similarly, a request to cancel stream transmission, as encoded in a + STOP_SENDING frame, is sent until the receive stream enters either a "Data + Recvd" or "Reset Recvd" state, see {{solicited-state-transitions}}. + +* Connection close signals, including those that use CONNECTION_CLOSE and + APPLICATION_CLOSE frames, are not sent again when packet loss is detected, but + as described in {{termination}}. + +* The current connection maximum data is sent in MAX_DATA frames. An updated + value is sent in a MAX_DATA frame if the packet containing the most recently + sent MAX_DATA frame is declared lost, or when the endpoint decides to update + the limit. 
Care is necessary to avoid sending this frame too often as the + limit can increase frequently and cause an unnecessarily large number of + MAX_DATA frames to be sent. + +* The current maximum stream data offset is sent in MAX_STREAM_DATA frames. + Like MAX_DATA, an updated value is sent when the packet containing + the most recent MAX_STREAM_DATA frame for a stream is lost or when the limit + is updated, with care taken to prevent the frame from being sent too often. An + endpoint SHOULD stop sending MAX_STREAM_DATA frames when the receive stream + enters a "Size Known" state. + +* The maximum stream ID for a stream of a given type is sent in MAX_STREAM_ID + frames. Like MAX_DATA, an updated value is sent when a packet containing the + most recent MAX_STREAM_ID for a stream type frame is declared lost or when + the limit is updated, with care taken to prevent the frame from being sent + too often. + +* Blocked signals are carried in BLOCKED, STREAM_BLOCKED, and STREAM_ID_BLOCKED + frames. BLOCKED streams have connection scope, STREAM_BLOCKED frames have + stream scope, and STREAM_ID_BLOCKED frames are scoped to a specific stream + type. New frames are sent if packets containing the most recent frame for a + scope is lost, but only while the endpoint is blocked on the corresponding + limit. These frames always include the limit that is causing blocking at the + time that they are transmitted. + +* A liveness or path validation check using PATH_CHALLENGE frames is sent + periodically until a matching PATH_RESPONSE frame is received or until there + is no remaining need for liveness or path validation checking. PATH_CHALLENGE + frames include a different payload each time they are sent. + +* Responses to path validation using PATH_RESPONSE frames are sent just once. + A new PATH_CHALLENGE frame will be sent if another PATH_RESPONSE frame is + needed. 
+ +* New connection IDs are sent in NEW_CONNECTION_ID frames and retransmitted if + the packet containing them is lost. Retransmissions of this frame carry the + same sequence number value. Likewise, retired connection IDs are sent in + RETIRE_CONNECTION_ID frames and retransmitted if the packet containing them is + lost. + +* PADDING frames contain no information, so lost PADDING frames do not require + repair. + +Upon detecting losses, a sender MUST take appropriate congestion control action. +The details of loss detection and congestion control are described in +{{QUIC-RECOVERY}}. + + +# Packet Size {#packet-size} + +The QUIC packet size includes the QUIC header and integrity check, but not the +UDP or IP header. + +Clients MUST ensure that the first Initial packet they send is sent in a UDP +datagram that is at least 1200 octets. Padding the Initial packet or including a +0-RTT packet in the same datagram are ways to meet this requirement. Sending a +UDP datagram of this size ensures that the network path supports a reasonable +Maximum Transmission Unit (MTU), and helps reduce the amplitude of amplification +attacks caused by server responses toward an unverified client address. + +The datagram containing the first Initial packet from a client MAY exceed 1200 +octets if the client believes that the Path Maximum Transmission Unit (PMTU) +supports the size that it chooses. + +A server MAY send a CONNECTION_CLOSE frame with error code PROTOCOL_VIOLATION in +response to the first Initial packet it receives from a client if the UDP +datagram is smaller than 1200 octets. It MUST NOT send any other frame type in +response, or otherwise behave as if any part of the offending packet was +processed as valid. + + +## Path Maximum Transmission Unit + +The Path Maximum Transmission Unit (PMTU) is the maximum size of the entire IP +header, UDP header, and UDP payload. The UDP payload includes the QUIC packet +header, protected payload, and any authentication fields. 
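
Because the PMTU counts the IP and UDP headers as well as the UDP payload, the room left for a QUIC packet is the PMTU minus those two headers. The following is a minimal sketch of that arithmetic, assuming minimum-size IP headers (20 octets for IPv4, 40 for IPv6, no options or extension headers) and the fixed 8-octet UDP header. The names `max_quic_packet_size`, `MIN_IP_HEADER_OCTETS`, and `UDP_HEADER_OCTETS` are illustrative and do not come from the draft.

```python
# Illustrative helper names; header sizes assume no IPv4 options and no IPv6
# extension headers (these are assumptions, not text from the draft).
UDP_HEADER_OCTETS = 8
MIN_IP_HEADER_OCTETS = {"ipv4": 20, "ipv6": 40}


def max_quic_packet_size(pmtu: int, ip_version: str) -> int:
    """Return the largest QUIC packet that fits in one IP packet of size pmtu."""
    if ip_version not in MIN_IP_HEADER_OCTETS:
        raise ValueError("ip_version must be 'ipv4' or 'ipv6'")
    return pmtu - MIN_IP_HEADER_OCTETS[ip_version] - UDP_HEADER_OCTETS
```

With a PMTU of 1280 octets this gives 1232 octets over IPv6 and 1252 octets over IPv4. Tunneling overhead or IP options would reduce the result further, which is why an implementation might choose a more conservative value.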
+ +All QUIC packets SHOULD be sized to fit within the estimated PMTU to avoid IP +fragmentation or packet drops. To optimize bandwidth efficiency, endpoints +SHOULD use Packetization Layer PMTU Discovery ({{!PLPMTUD=RFC4821}}). Endpoints +MAY use PMTU Discovery ({{!PMTUDv4=RFC1191}}, {{!PMTUDv6=RFC8201}}) for +detecting the PMTU, setting the PMTU appropriately, and storing the result of +previous PMTU determinations. + +In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP packets +larger than 1280 octets. Assuming the minimum IP header size, this results in +a QUIC packet size of 1232 octets for IPv6 and 1252 octets for IPv4. Some +QUIC implementations MAY be more conservative in computing allowed QUIC packet +size given unknown tunneling overheads or IP header options. + +QUIC endpoints that implement any kind of PMTU discovery SHOULD maintain an +estimate for each combination of local and remote IP addresses. Each pairing of +local and remote addresses could have a different maximum MTU in the path. + +QUIC depends on the network path supporting an MTU of at least 1280 octets. This +is the IPv6 minimum MTU and therefore also supported by most modern IPv4 +networks. An endpoint MUST NOT reduce its MTU below this number, even if it +receives signals that indicate a smaller limit might exist. + +If a QUIC endpoint determines that the PMTU between any pair of local and remote +IP addresses has fallen below 1280 octets, it MUST immediately cease sending +QUIC packets on the affected path. This could result in termination of the +connection if an alternative path cannot be found. + + +### IPv4 PMTU Discovery {#v4-pmtud} + +Traditional ICMP-based path MTU discovery in IPv4 {{!PMTUDv4}} is potentially +vulnerable to off-path attacks that successfully guess the IP/port 4-tuple and +reduce the MTU to a bandwidth-inefficient value. 
TCP connections mitigate this +risk by using the (at minimum) 8 bytes of transport header echoed in the ICMP +message to validate the TCP sequence number as valid for the current +connection. However, as QUIC operates over UDP, in IPv4 the echoed information +could consist only of the IP and UDP headers, which usually has insufficient +entropy to mitigate off-path attacks. + +As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps to +mitigate this risk. For instance, an application could: + +* Set the IPv4 Don't Fragment (DF) bit on a small proportion of packets, so that +most invalid ICMP messages arrive when there are no DF packets outstanding, and +can therefore be identified as spurious. + +* Store additional information from the IP or UDP headers from DF packets (for +example, the IP ID or UDP checksum) to further authenticate incoming Datagram +Too Big messages. + +* Any reduction in PMTU due to a report contained in an ICMP packet is +provisional until QUIC's loss detection algorithm determines that the packet is +actually lost. + + +## Special Considerations for Packetization Layer PMTU Discovery + + +The PADDING frame provides a useful option for PMTU probe packets. PADDING +frames generate acknowledgements, but they need not be delivered reliably. As a +result, the loss of PADDING frames in probe packets does not require +delay-inducing retransmission. However, PADDING frames do consume congestion +window, which may delay the transmission of subsequent application data. + +When implementing the algorithm in Section 7.2 of {{!PLPMTUD}}, the initial +value of search_low SHOULD be consistent with the IPv6 minimum packet size. +Paths that do not support this size cannot deliver Initial packets, and +therefore are not QUIC-compliant. + +Section 7.3 of {{!PLPMTUD}} discusses trade-offs between small and large +increases in the size of probe packets. 
As QUIC probe packets need not contain +application data, aggressive increases in probe size carry fewer consequences. + + + +# Versions {#versions} + +QUIC versions are identified using a 32-bit unsigned number. + +The version 0x00000000 is reserved to represent version negotiation. This +version of the specification is identified by the number 0x00000001. + +Other versions of QUIC might have different properties to this version. The +properties of QUIC that are guaranteed to be consistent across all versions of +the protocol are described in {{QUIC-INVARIANTS}}. + +Version 0x00000001 of QUIC uses TLS as a cryptographic handshake protocol, as +described in {{QUIC-TLS}}. + +Versions with the most significant 16 bits of the version number cleared are +reserved for use in future IETF consensus documents. + +Versions that follow the pattern 0x?a?a?a?a are reserved for use in forcing +version negotiation to be exercised. That is, any version number where the low +four bits of all octets is 1010 (in binary). A client or server MAY advertise +support for any of these reserved versions. + +Reserved version numbers will probably never represent a real protocol; a client +MAY use one of these version numbers with the expectation that the server will +initiate version negotiation; a server MAY advertise support for one of these +versions and can expect that clients ignore the value. + +\[\[RFC editor: please remove the remainder of this section before +publication.]] + +The version number for the final version of this specification (0x00000001), is +reserved for the version of the protocol that is published as an RFC. + +Version numbers used to identify IETF drafts are created by adding the draft +number to 0xff000000. For example, draft-ietf-quic-transport-13 would be +identified as 0xff00000D. + +Implementors are encouraged to register version numbers of QUIC that they are +using for private experimentation on the GitHub wiki at +\. 
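
The version number rules in this section can be sketched as a few integer checks. This is an illustration only, not part of the specification. The function names are hypothetical, and the upper bound on the draft number is an assumption made so that the sum stays within 32 bits.

```python
DRAFT_VERSION_BASE = 0xFF000000  # draft versions are this base plus the draft number


def draft_version(draft_number: int) -> int:
    """Return the version number identifying a given IETF draft."""
    # Assumed bound: keep the result inside the 32-bit version field.
    if not 0 <= draft_number <= 0x00FFFFFF:
        raise ValueError("draft number out of range for this sketch")
    return DRAFT_VERSION_BASE + draft_number


def is_reserved_for_negotiation(version: int) -> bool:
    """True when the low four bits of every octet are 0b1010 (0x?a?a?a?a)."""
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def is_reserved_for_ietf(version: int) -> bool:
    """True when the most significant 16 bits of the version are cleared."""
    return (version >> 16) == 0
```

For example, `draft_version(13)` yields 0xff00000d, matching the example above, and a value such as 0x1a2a3a4a matches the pattern reserved for exercising version negotiation.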
+ + + +# Packet Types and Formats {#packet-format} + +All numeric values are encoded in network byte order (that is, big-endian) and +all field sizes are in bits. When discussing individual bits of fields, the +least significant bit is referred to as bit 0. Hexadecimal notation is used for +describing the value of fields. + +## Long Header {#long-header} + +~~~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| Type (7) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Version (32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Length (i) ... +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Packet Number (8/16/32) | @@ -2913,7 +3218,7 @@ Senders can sometimes coalesce multiple packets into one UDP datagram. See {{packet-coalesce}} for more details. -## Short Header +## Short Header {#short-header} ~~~~~ 0 1 2 3 @@ -3428,308 +3733,8 @@ frames. Endpoints MUST treat receipt of Handshake packets with other frames as a connection error. -## Protected Packets {#packet-protected} - -All QUIC packets use packet protection. Packets that are protected with the -static handshake keys or the 0-RTT keys are sent with long headers; all packets -protected with 1-RTT keys are sent with short headers. The different packet -types explicitly indicate the encryption level and therefore the keys that are -used to remove packet protection. 0-RTT and 1-RTT protected packets share a -single packet number space. - -Packets protected with handshake keys only use packet protection to ensure that -the sender of the packet is on the network path. 
This packet protection is not -effective confidentiality protection; any entity that receives the Initial -packet from a client can recover the keys necessary to remove packet protection -or to generate packets that will be successfully authenticated. - -Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality -and data origin authentication; the cryptographic handshake ensures that only -the communicating endpoints receive the corresponding keys. - -Packets protected with 0-RTT keys use a type value of 0x7C. The connection ID -fields for a 0-RTT packet MUST match the values used in the Initial packet -({{packet-initial}}). - -The version field for protected packets is the current QUIC version. - -The packet number field contains a packet number, which has additional -confidentiality protection that is applied after packet protection is applied -(see {{QUIC-TLS}} for details). The underlying packet number increases with -each packet sent, see {{packet-numbers}} for details. - -The payload is protected using authenticated encryption. {{QUIC-TLS}} describes -packet protection in detail. After decryption, the plaintext consists of a -sequence of frames, as described in {{frames}}. - - -## Coalescing Packets {#packet-coalesce} - -A sender can coalesce multiple QUIC packets (typically a Cryptographic Handshake -packet and a Protected packet) into one UDP datagram. This can reduce the -number of UDP datagrams needed to send application data during the handshake and -immediately afterwards. It is not necessary for senders to coalesce -packets, though failing to do so will require sending a significantly -larger number of datagrams during the handshake. Receivers MUST -be able to process coalesced packets. - -Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, -Handshake, 1-RTT) makes it more likely the receiver will be able to process all -the packets in a single pass. 
A packet with a short header does not include a -length, so it will always be the last packet included in a UDP datagram. - -Senders MUST NOT coalesce QUIC packets with different Destination Connection -IDs into a single UDP datagram. Receivers SHOULD ignore any subsequent packets -with a different Destination Connection ID than the first packet in the -datagram. - -Every QUIC packet that is coalesced into a single UDP datagram is separate and -complete. Though the values of some fields in the packet header might be -redundant, no fields are omitted. The receiver of coalesced QUIC packets MUST -individually process each QUIC packet and separately acknowledge them, as if -they were received as the payload of different UDP datagrams. If one or more -packets in a datagram cannot be processed yet (because the keys are not yet -available) or processing fails (decryption failure, unknown type, etc.), the -receiver MUST still attempt to process the remaining packets. The skipped -packets MAY either be discarded or buffered for later processing, just as if the -packets were received out-of-order in separate datagrams. - -Retry ({{packet-retry}}) and Version Negotiation ({{packet-version}}) packets -cannot be coalesced. - - -## Connection ID Encoding - -A connection ID is used to ensure consistent routing of packets, as described in -{{connection-id}}. The long header contains two connection IDs: the Destination -Connection ID is chosen by the recipient of the packet and is used to provide -consistent routing; the Source Connection ID is used to set the Destination -Connection ID used by the peer. - -During the handshake, packets with the long header are used to establish the -connection ID that each endpoint uses. Each endpoint uses the Source Connection -ID field to specify the connection ID that is used in the Destination Connection -ID field of packets being sent to them. 
Upon receiving a packet, each endpoint -sets the Destination Connection ID it sends to match the value of the Source -Connection ID that they receive. - -During the handshake, a client can receive both a Retry and an Initial packet, -and thus be given two opportunities to update the Destination Connection ID it -sends. A client MUST only change the value it sends in the Destination -Connection ID in response to the first packet of each type it receives from the -server (Retry or Initial); a server MUST set its value based on the Initial -packet. Any additional changes are not permitted; if subsequent packets of -those types include a different Source Connection ID, they MUST be discarded. -This avoids problems that might arise from stateless processing of multiple -Initial packets producing different connection IDs. - -Short headers only include the Destination Connection ID and omit the explicit -length. The length of the Destination Connection ID field is expected to be -known to endpoints. - -Endpoints using a connection-ID based load balancer could agree with the load -balancer on a fixed or minimum length and on an encoding for connection IDs. -This fixed portion could encode an explicit length, which allows the entire -connection ID to vary in length and still be used by the load balancer. - -The very first packet sent by a client includes a random value for Destination -Connection ID. The same value MUST be used for all 0-RTT packets sent on that -connection ({{packet-protected}}). This randomized value is used to determine -the packet protection keys for Initial packets (see Section 5.2 of -{{QUIC-TLS}}). - -A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs -selected by the client, swapped to ensure correct routing toward the client. - -The connection ID can change over the lifetime of a connection, especially in -response to connection migration ({{migration}}). 
NEW_CONNECTION_ID frames -({{frame-new-connection-id}}) are used to provide new connection ID values. - - -## Packet Numbers {#packet-numbers} - -The packet number is an integer in the range 0 to 2^62-1. The value is used in -determining the cryptographic nonce for packet protection. Each endpoint -maintains a separate packet number for sending and receiving. - -Packet numbers are divided into 3 spaces in QUIC: - -- Initial space: All Initial packets {{packet-initial}} are in this space. -- Handshake space: All Handshake packets {{packet-handshake}} are in this space. -- Application data space: All 0-RTT and 1-RTT encrypted packets - {{packet-protected}} are in this space. - -As described in {{QUIC-TLS}}, each packet type uses different protection keys. - -Conceptually, a packet number space is the context in which a packet can be -processed and acknowledged. Initial packets can only be sent with Initial -packet protection keys and acknowledged in packets which are also Initial -packets. Similarly, Handshake packets are sent at the Handshake encryption -level and can only be acknowledged in Handshake packets. - -This enforces cryptographic separation between the data sent in the different -packet sequence number spaces. Each packet number space starts at packet number -0. Subsequent packets sent in the same packet number space MUST increase the -packet number by at least one. - -0-RTT and 1-RTT data exist in the same packet number space to make loss recovery -algorithms easier to implement between the two packet types. - -A QUIC endpoint MUST NOT reuse a packet number within the same packet number -space in one connection (that is, under the same cryptographic keys). If the -packet number for sending reaches 2^62 - 1, the sender MUST close the connection -without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY -send a Stateless Reset ({{stateless-reset}}) in response to further packets that -it receives. 
- -In the QUIC long and short packet headers, the number of bits required to -represent the packet number is reduced by including only a variable number of -the least significant bits of the packet number. One or two of the most -significant bits of the first octet determine how many bits of the packet -number are provided, as shown in {{pn-encodings}}. - -| First octet pattern | Encoded Length | Bits Present | -|:--------------------|:---------------|:-------------| -| 0b0xxxxxxx | 1 octet | 7 | -| 0b10xxxxxx | 2 | 14 | -| 0b11xxxxxx | 4 | 30 | -{: #pn-encodings title="Packet Number Encodings for Packet Headers"} - -Note that these encodings are similar to those in {{integer-encoding}}, but -use different values. - -The encoded packet number is protected as described in Section 5.3 -{{QUIC-TLS}}. Protection of the packet number is removed prior to recovering the -full packet number. The full packet number is reconstructed at the receiver -based on the number of significant bits present, the value of those bits, and -the largest packet number received on a successfully authenticated -packet. Recovering the full packet number is necessary to successfully remove -packet protection. - -Once packet number protection is removed, the packet number is decoded by -finding the packet number value that is closest to the next expected packet. -The next expected packet is the highest received packet number plus one. For -example, if the highest successfully authenticated packet had a packet number of -0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as -0xaa8309b3. -Example pseudo-code for packet number decoding can be found in -{{sample-packet-number-decoding}}. - -The sender MUST use a packet number size able to represent more than twice as -large a range than the difference between the largest acknowledged packet and -packet number being sent. 
A peer receiving the packet will then correctly -decode the packet number, unless the packet is delayed in transit such that it -arrives after many higher-numbered packets have been received. An endpoint -SHOULD use a large enough packet number encoding to allow the packet number to -be recovered even if the packet arrives after packets that are sent afterwards. - -As a result, the size of the packet number encoding is at least one more than -the base 2 logarithm of the number of contiguous unacknowledged packet numbers, -including the new packet. - -For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, -sending a packet with a number of 0x6b2d79 requires a packet number encoding -with 14 bits or more; whereas the 30-bit packet number encoding is needed to -send a packet with a number of 0x6bc107. - -A receiver MUST discard a newly unprotected packet unless it is certain that it -has not processed another packet with the same packet number from the same -packet number space. Duplicate suppression MUST happen after removing packet -protection for the reasons described in Section 9.3 of {{QUIC-TLS}}. An -efficient algorithm for duplicate suppression can be found in Section 3.4.3 of -{{?RFC2406}}. - -A Version Negotiation packet ({{packet-version}}) does not include a packet -number. The Retry packet ({{packet-retry}}) has special rules for populating -the packet number field. - - -# Frames and Frame Types {#frames} - -The payload of all packets, after removing packet protection, consists of a -sequence of frames, as shown in {{packet-frames}}. Version Negotiation and -Stateless Reset do not contain frames. - -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame 1 (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame 2 (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - ... 
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame N (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #packet-frames title="QUIC Payload"} - -QUIC payloads MUST contain at least one frame, and MAY contain multiple -frames and multiple frame types. - -Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC packet -boundary. Each frame begins with a Frame Type, indicating its type, followed by -additional type-dependent fields: - -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Frame Type (i) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Type-Dependent Fields (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #frame-layout title="Generic Frame Layout"} - -The frame types defined in this specification are listed in {{frame-types}}. -The Frame Type in STREAM frames is used to carry other frame-specific flags. -For all other frames, the Frame Type field simply identifies the frame. These -frames are explained in more detail as they are referenced later in the -document. 
- -| Type Value | Frame Type Name | Definition | -|:------------|:---------------------|:-------------------------------| -| 0x00 | PADDING | {{frame-padding}} | -| 0x01 | RST_STREAM | {{frame-rst-stream}} | -| 0x02 | CONNECTION_CLOSE | {{frame-connection-close}} | -| 0x03 | APPLICATION_CLOSE | {{frame-application-close}} | -| 0x04 | MAX_DATA | {{frame-max-data}} | -| 0x05 | MAX_STREAM_DATA | {{frame-max-stream-data}} | -| 0x06 | MAX_STREAM_ID | {{frame-max-stream-id}} | -| 0x07 | PING | {{frame-ping}} | -| 0x08 | BLOCKED | {{frame-blocked}} | -| 0x09 | STREAM_BLOCKED | {{frame-stream-blocked}} | -| 0x0a | STREAM_ID_BLOCKED | {{frame-stream-id-blocked}} | -| 0x0b | NEW_CONNECTION_ID | {{frame-new-connection-id}} | -| 0x0c | STOP_SENDING | {{frame-stop-sending}} | -| 0x0d | RETIRE_CONNECTION_ID | {{frame-retire-connection-id}} | -| 0x0e | PATH_CHALLENGE | {{frame-path-challenge}} | -| 0x0f | PATH_RESPONSE | {{frame-path-response}} | -| 0x10 - 0x17 | STREAM | {{frame-stream}} | -| 0x18 | CRYPTO | {{frame-crypto}} | -| 0x19 | NEW_TOKEN | {{frame-new-token}} | -| 0x1a - 0x1b | ACK | {{frame-ack}} | -{: #frame-types title="Frame Types"} - -All QUIC frames are idempotent. That is, a valid frame does not cause -undesirable side effects or errors when received more than once. - -The Frame Type field uses a variable length integer encoding (see -{{integer-encoding}}) with one exception. To ensure simple and efficient -implementations of frame parsing, a frame type MUST use the shortest possible -encoding. Though a two-, four- or eight-octet encoding of the frame types -defined in this document is possible, the Frame Type field for these frames is -encoded on a single octet. For instance, though 0x4007 is a legitimate -two-octet encoding for a variable-length integer with a value of 7, PING frames -are always encoded as a single octet with the value 0x07. 
An endpoint MUST -treat the receipt of a frame type that uses a longer encoding than necessary as -a connection error of type PROTOCOL_VIOLATION. - - -# Frame Types and Formats +# Frame Types and Formats {#frame-formats} As described in {{frames}}, packets contain one or more frames. This section describes the format and semantics of the core QUIC frame types. From 438b9650dcbd28453004d3ffe6296b16277fe7bb Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Mon, 15 Oct 2018 17:02:17 -0700 Subject: [PATCH 29/57] modified intro, made error codes first level section --- draft-ietf-quic-transport.md | 153 ++++++++++++++++++----------------- 1 file changed, 80 insertions(+), 73 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 2ed3e9f04c..72cecf4c36 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -134,19 +134,25 @@ encrypts most of the data it exchanges, including its signaling. This allows the protocol to evolve without incurring a dependency on upgrades to middleboxes. 
+## Document Structure + This document describes the core QUIC protocol, and is structured as follows: -* Streams, QUIC's service abstraction to applications, including stream - multiplexing, and stream and connection-level flow control ({{streams}}, - {{stream-states}}, and {{flow-control}}); +* streams, QUIC's service abstraction to applications, including stream + multiplexing, and stream and connection-level flow control ({{streams}} - + {{flow-control}}); + +* connections, including version negotiation and establishment, usage, + migration, and shutdown ({{connections}} - {{termination}}); -* Connections, including connection establishment, migration, and shutdown; +* error handling ({{error-handling}}); -* Packets and Frames, including QUIC's model and mechanics of reliability - (acknowledgements and retransmission) and packet sizing; +* packets and frames, including QUIC's model and mechanics of reliability + (acknowledgements and retransmission) and packet sizing ({{packets-frames}} - + {{packet-size}}); -* Wire format, including versioning, packet formats, frame formats, and error - codes. +* wire format, including QUIC's version format, packet formats, frame formats, + and error codes ({{versions}} - {{error-codes}}). Accompanying documents describe QUIC's loss detection and congestion control {{QUIC-RECOVERY}}, and the use of TLS 1.3 for key negotiation {{QUIC-TLS}}. @@ -871,7 +877,7 @@ provide an interface to QUIC to tell it about its buffering limits so that there is not excessive buffering at multiple layers. -# Connections {#connection} +# Connections {#connections} A QUIC connection is a single conversation between two QUIC endpoints. QUIC's connection establishment intertwines version negotiation with the cryptographic @@ -2498,6 +2504,69 @@ if it cannot be processed by padding it to at least 38 octets. +# Error Handling {#error-handling} + +An endpoint that detects an error SHOULD signal the existence of that error to +its peer. 
Both transport-level and application-level errors can affect an +entire connection (see {{connection-errors}}), while only application-level +errors can be isolated to a single stream (see {{stream-errors}}). + +The most appropriate error code ({{error-codes}}) SHOULD be included in the +frame that signals the error. Where this specification identifies error +conditions, it also identifies the error code that is used. + +A stateless reset ({{stateless-reset}}) is not suitable for any error that can +be signaled with a CONNECTION_CLOSE, APPLICATION_CLOSE, or RST_STREAM frame. A +stateless reset MUST NOT be used by an endpoint that has the state necessary to +send a frame on the connection. + + +## Connection Errors + +Errors that result in the connection being unusable, such as an obvious +violation of protocol semantics or corruption of state that affects an entire +connection, MUST be signaled using a CONNECTION_CLOSE or APPLICATION_CLOSE frame +({{frame-connection-close}}, {{frame-application-close}}). An endpoint MAY close +the connection in this manner even if the error only affects a single stream. + +Application protocols can signal application-specific protocol errors using the +APPLICATION_CLOSE frame. Errors that are specific to the transport, including +all those described in this document, are carried in a CONNECTION_CLOSE frame. +Other than the type of error code they carry, these frames are identical in +format and semantics. + +A CONNECTION_CLOSE or APPLICATION_CLOSE frame could be sent in a packet that is +lost. An endpoint SHOULD be prepared to retransmit a packet containing either +frame type if it receives more packets on a terminated connection. Limiting the +number of retransmissions and the time over which this final packet is sent +limits the effort expended on terminated connections. + +An endpoint that chooses not to retransmit packets containing CONNECTION_CLOSE +or APPLICATION_CLOSE risks a peer missing the first such packet. 
The only +mechanism available to an endpoint that continues to receive data for a +terminated connection is to use the stateless reset process +({{stateless-reset}}). + +An endpoint that receives an invalid CONNECTION_CLOSE or APPLICATION_CLOSE frame +MUST NOT signal the existence of the error to its peer. + + +## Stream Errors + +If an application-level error affects a single stream, but otherwise leaves the +connection in a recoverable state, the endpoint can send a RST_STREAM frame +({{frame-rst-stream}}) with an appropriate error code to terminate just the +affected stream. + +Other than STOPPING ({{solicited-state-transitions}}), RST_STREAM MUST be +instigated by the application and MUST carry an application error code. +Resetting a stream without knowledge of the application protocol could cause the +protocol to enter an unrecoverable state. Application protocols might require +certain streams to be reliably delivered in order to guarantee consistent state +between endpoints. + + + # Packets and Frames {#packets-frames} Any QUIC packet, with the exception of the Version Negotiation packet, has @@ -3091,7 +3160,7 @@ using for private experimentation on the GitHub wiki at -# Packet Types and Formats {#packet-format} +# Packet Types and Formats {#packet-formats} All numeric values are encoded in network byte order (that is, big-endian) and all field sizes are in bits. When discussing individual bits of fields, the @@ -4766,69 +4835,7 @@ An IANA registry is used to manage the assignment of frame types, see -# Error Handling - -An endpoint that detects an error SHOULD signal the existence of that error to -its peer. Both transport-level and application-level errors can affect an -entire connection (see {{connection-errors}}), while only application-level -errors can be isolated to a single stream (see {{stream-errors}}). - -The most appropriate error code ({{error-codes}}) SHOULD be included in the -frame that signals the error. 
Where this specification identifies error -conditions, it also identifies the error code that is used. - -A stateless reset ({{stateless-reset}}) is not suitable for any error that can -be signaled with a CONNECTION_CLOSE, APPLICATION_CLOSE, or RST_STREAM frame. A -stateless reset MUST NOT be used by an endpoint that has the state necessary to -send a frame on the connection. - - -## Connection Errors - -Errors that result in the connection being unusable, such as an obvious -violation of protocol semantics or corruption of state that affects an entire -connection, MUST be signaled using a CONNECTION_CLOSE or APPLICATION_CLOSE frame -({{frame-connection-close}}, {{frame-application-close}}). An endpoint MAY close -the connection in this manner even if the error only affects a single stream. - -Application protocols can signal application-specific protocol errors using the -APPLICATION_CLOSE frame. Errors that are specific to the transport, including -all those described in this document, are carried in a CONNECTION_CLOSE frame. -Other than the type of error code they carry, these frames are identical in -format and semantics. - -A CONNECTION_CLOSE or APPLICATION_CLOSE frame could be sent in a packet that is -lost. An endpoint SHOULD be prepared to retransmit a packet containing either -frame type if it receives more packets on a terminated connection. Limiting the -number of retransmissions and the time over which this final packet is sent -limits the effort expended on terminated connections. - -An endpoint that chooses not to retransmit packets containing CONNECTION_CLOSE -or APPLICATION_CLOSE risks a peer missing the first such packet. The only -mechanism available to an endpoint that continues to receive data for a -terminated connection is to use the stateless reset process -({{stateless-reset}}). - -An endpoint that receives an invalid CONNECTION_CLOSE or APPLICATION_CLOSE frame -MUST NOT signal the existence of the error to its peer. 
- - -## Stream Errors - -If an application-level error affects a single stream, but otherwise leaves the -connection in a recoverable state, the endpoint can send a RST_STREAM frame -({{frame-rst-stream}}) with an appropriate error code to terminate just the -affected stream. - -Other than STOPPING ({{solicited-state-transitions}}), RST_STREAM MUST be -instigated by the application and MUST carry an application error code. -Resetting a stream without knowledge of the application protocol could cause the -protocol to enter an unrecoverable state. Application protocols might require -certain streams to be reliably delivered in order to guarantee consistent state -between endpoints. - - -## Transport Error Codes {#error-codes} +# Transport Error Codes {#error-codes} QUIC error codes are 16-bit unsigned integers. From 53f0badcfab54967edd5f3c28b5ad43e2b03bb90 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Tue, 16 Oct 2018 15:10:32 -0700 Subject: [PATCH 30/57] Done through connection migration section --- draft-ietf-quic-transport.md | 800 ++++++++++++++++++----------------- 1 file changed, 410 insertions(+), 390 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 72cecf4c36..b5a0cb1a44 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -223,7 +223,7 @@ x (*) ... : Indicates that x is variable-length -# Streams: QUIC's Data Structuring Abstraction {#streams} +# Streams {#streams} Streams in QUIC provide a lightweight, ordered byte-stream abstraction. @@ -237,15 +237,17 @@ are initiated by the client and server (see {{stream-id}}). Either type of stream can be created by either endpoint, can concurrently send data interleaved with other streams, and can be cancelled. +Streams can be created by sending data. Other processes associated with stream +management - ending, cancelling, and managing flow control - are all designed to +impose minimal overheads. 
For instance, a single STREAM frame ({{frame-stream}}) +can open, carry data for, and close a stream. Streams can also be long-lived and +can last the entire duration of a connection. + Stream offsets allow for the octets on a stream to be placed in order. An endpoint MUST be capable of delivering data received on a stream in order. Implementations MAY choose to offer the ability to deliver data out of order. There is no means of ensuring ordering between octets on different streams. -The creation and destruction of streams are expected to have minimal bandwidth -and computational cost. A single STREAM frame may create, carry data for, and -terminate a stream, or a stream may last the entire duration of a connection. - Streams are individually flow controlled, allowing an endpoint to limit memory commitment and to apply back pressure. The creation of streams is also flow controlled, with each peer declaring the maximum stream ID it is willing to @@ -260,9 +262,10 @@ for some applications. ## Stream Identifiers {#stream-id} Streams are identified by an unsigned 62-bit integer, referred to as the Stream -ID. The least significant two bits of the Stream ID are used to identify the -type of stream (unidirectional or bidirectional) and the initiator of the -stream. +ID. Stream IDs are encoded as a variable-length integer (see +{{integer-encoding}}). The least significant two bits of the Stream ID are used +to identify the type of stream (unidirectional or bidirectional) and the +initiator of the stream. The least significant bit (0x1) of the Stream ID identifies the initiator of the stream. Clients initiate even-numbered streams (those with the least @@ -290,21 +293,18 @@ The two type bits from a Stream ID therefore identify streams as summarized in | 0x3 | Server-Initiated, Unidirectional | {: #stream-id-types title="Stream ID Types"} -The first bi-directional stream opened by the client is stream 0. +The first bidirectional stream opened by the client is stream 0. 
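
As an illustration of the Stream ID Types table above, the following is a minimal sketch of how an implementation might decode the two low-order type bits. The names `Initiator`, `Direction`, and `classify_stream_id` are hypothetical and do not come from the draft.

```python
from enum import Enum


class Initiator(Enum):
    CLIENT = 0  # least significant bit (0x1) clear: even-numbered stream
    SERVER = 1  # least significant bit (0x1) set: odd-numbered stream


class Direction(Enum):
    BIDIRECTIONAL = 0   # second least significant bit (0x2) clear
    UNIDIRECTIONAL = 1  # second least significant bit (0x2) set


def classify_stream_id(stream_id: int) -> tuple[Initiator, Direction]:
    """Decode the type bits of a Stream ID (hypothetical helper)."""
    # Stream IDs are unsigned 62-bit integers.
    if not 0 <= stream_id < 2**62:
        raise ValueError("Stream ID must fit in 62 bits")
    initiator = Initiator(stream_id & 0x1)
    direction = Direction((stream_id & 0x2) >> 1)
    return initiator, direction
```

For example, stream 0 decodes as client-initiated and bidirectional, and stream 3 as server-initiated and unidirectional.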
A QUIC endpoint MUST NOT reuse a Stream ID. Streams of each type are created in numeric order. Streams that are used out of order result in opening all lower-numbered streams of the same type in the same direction. -Stream IDs are encoded as a variable-length integer (see {{integer-encoding}}). - ## Stream Concurrency {#stream-concurrency} -An endpoint limits the number of concurrently active incoming streams by -adjusting the maximum stream ID. An initial value is set in the transport -parameters (see {{transport-parameter-definitions}}) and is subsequently -increased by MAX_STREAM_ID frames (see {{frame-max-stream-id}}). +QUIC allows for an arbitrary number of streams to operate concurrently. An +endpoint limits the number of concurrently active incoming streams by limiting +the maximum stream ID (see {{stream-limit-increment}}). The maximum stream ID is specific to each endpoint and applies only to the peer that receives the setting. That is, clients specify the maximum stream ID the @@ -325,12 +325,14 @@ increase the maximum stream ID. ## Sending and Receiving Data -Once a stream is created, endpoints may use the stream to send and receive data. -Each endpoint may send a series of STREAM frames encapsulating data on a stream -until the stream is terminated in that direction. Streams are an ordered -byte-stream abstraction, and they have no other structure within them. STREAM -frame boundaries are not expected to be preserved in retransmissions from the -sender or during delivery to the application at the receiver. +Endpoints use streams to send and receive data. Endpoints send STREAM frames, +which encapsulate data for a stream. STREAM frames carry a flag that can be used +to signal the end of a stream. + +Streams are an ordered byte-stream abstraction with no other structure that is +visible to QUIC.
STREAM frame boundaries are not expected to be preserved when data +is transmitted, when data is retransmitted after packet loss, or when data is +delivered to the application at the receiver. When new data is to be sent on a stream, a sender MUST set the encapsulating STREAM frame's offset field to the stream offset of the first byte of this new @@ -350,10 +352,8 @@ change if it is sent multiple times; an endpoint MAY treat receipt of a changed octet as a connection error of type PROTOCOL_VIOLATION. An endpoint MUST NOT send data on any stream without ensuring that it is within -the data limits set by its peer. - -Flow control is described in detail in {{flow-control}}, and congestion control -is described in the companion document {{QUIC-RECOVERY}}. +the data limits set by its peer. Flow control is described in detail in +{{flow-control}}. ## Stream Prioritization @@ -714,35 +714,21 @@ STOP_SENDING frame is unnecessary. It is necessary to limit the amount of data that a sender may have outstanding at any time, so as to prevent a fast sender from overwhelming a slow receiver, or to prevent a malicious sender from consuming significant resources at a -receiver. This section describes QUIC's flow-control mechanisms. - -QUIC employs a credit-based flow-control scheme similar to HTTP/2's flow control -{{?HTTP2}}. A receiver advertises the number of octets it is prepared to -receive on a given stream and for the entire connection. This leads to two -levels of flow control in QUIC: (i) Connection flow control, which prevents -senders from exceeding a receiver's buffer capacity for the connection, and (ii) -Stream flow control, which prevents a single stream from consuming the entire -receive buffer for a connection. - -A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the sender -to advertise additional credit.
MAX_STREAM_DATA frames send the -maximum absolute byte offset of a stream, while MAX_DATA sends the -maximum of the sum of the absolute byte offsets of all streams. - -A receiver MAY advertise a larger offset at any point by sending MAX_DATA or -MAX_STREAM_DATA frames. A receiver cannot renege on an advertisement; that is, -once a receiver advertises an offset, advertising a smaller offset has no -effect. A sender MUST therefore ignore any MAX_DATA or MAX_STREAM_DATA frames -that do not increase flow control limits. +receiver. To this end, QUIC employs a credit-based flow-control scheme similar +to that in HTTP/2 {{?HTTP2}}. A receiver advertises the number of octets it is +prepared to receive on a given stream and for the entire connection. This leads +to two levels of flow control in QUIC: -A receiver MUST close the connection with a FLOW_CONTROL_ERROR error -({{error-handling}}) if the peer violates the advertised connection or stream -data limits. +* Stream flow control, which prevents a single stream from consuming the entire + receive buffer for a connection. -A sender SHOULD send BLOCKED or STREAM_BLOCKED frames to indicate it has data to -write but is blocked by flow control limits. These frames are expected to be -sent infrequently in common cases, but they are considered useful for debugging -and monitoring purposes. +* Connection flow control, which prevents senders from exceeding a receiver's + buffer capacity for the connection. + +A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the sender to +advertise additional credit. MAX_STREAM_DATA frames send the maximum absolute +byte offset of a stream, while MAX_DATA frames send the maximum of the sum of +the absolute byte offsets of all streams. A receiver advertises credit for a stream by sending a MAX_STREAM_DATA frame with the Stream ID set appropriately.
A receiver could use the current offset of @@ -755,19 +741,35 @@ Connection flow control is a limit to the total bytes of stream data sent in STREAM frames on all streams. A receiver advertises credit for a connection by sending a MAX_DATA frame. A receiver maintains a cumulative sum of bytes received on all contributing streams, which are used to check for flow control -violations. A receiver might use a sum of bytes consumed on all contributing -streams to determine the maximum data limit to be advertised. +violations. A receiver might use a sum of bytes consumed on all streams to +determine the maximum data limit to be advertised. -## Edge Cases and Other Considerations +A receiver MAY advertise a larger offset at any point by sending MAX_STREAM_DATA +or MAX_DATA frames. A receiver cannot renege on an advertisement; that is, once +a receiver advertises an offset, advertising a smaller offset has no effect. A +sender MUST therefore ignore any MAX_STREAM_DATA or MAX_DATA frames that do not +increase flow control limits. + +A receiver MUST close the connection with a FLOW_CONTROL_ERROR error +({{error-handling}}) if the peer violates the advertised connection or stream +data limits. + +A sender SHOULD send STREAM_BLOCKED or BLOCKED frames to indicate it has data to +write but is blocked by flow control limits. These frames are expected to be +sent infrequently in common cases, but they are considered useful for debugging +and monitoring purposes. + +(TODO: Add something about max_stream_id) + +## Handling of Stream Cancellation There are some edge cases which must be considered when dealing with stream and connection level flow control. Given enough time, both endpoints must agree on flow control state. If one end believes it can send more than the other end is willing to receive, the connection will be torn down when too much data arrives. 
- Conversely if a sender believes it is blocked, while endpoint B expects more data can be received, then the connection can be in a deadlock, with the sender -waiting for a MAX_DATA or MAX_STREAM_DATA frame which will never come. +waiting for a MAX_STREAM_DATA or MAX_DATA frame which will never come. On receipt of a RST_STREAM frame, an endpoint will tear down state for the matching stream and ignore further data arriving on that stream. This could @@ -778,14 +780,12 @@ but a receiver that has not received these bytes would not know to include them as well. The receiver must learn the number of bytes that were sent on the stream to make the same adjustment in its connection flow controller. -To avoid this de-synchronization, a RST_STREAM sender MUST include the final -byte offset sent on the stream in the RST_STREAM frame. On receiving a -RST_STREAM frame, a receiver definitively knows how many bytes were sent on that -stream before the RST_STREAM frame, and the receiver MUST use the final offset -to account for all bytes sent on the stream in its connection level flow -controller. - -### Response to a RST_STREAM +To ensure that endpoints maintain a consistent connection-level flow control +state, the RST_STREAM frame {{frame-rst-stream}} includes the largest offset of +data sent on the stream. On receiving a RST_STREAM frame, a receiver +definitively knows how many bytes were sent on that stream before the RST_STREAM +frame, and the receiver MUST use the final offset to account for all bytes sent +on the stream in its connection level flow controller. RST_STREAM terminates one direction of a stream abruptly. Whether any action or response can or should be taken on the data already received is an @@ -794,50 +794,34 @@ RST_STREAM an endpoint will choose to stop sending data in its own direction. If the sender of a RST_STREAM wishes to explicitly state that no future data will be processed, that endpoint MAY send a STOP_SENDING frame at the same time. 
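
The accounting described above, where a receiver uses the final offset carried in RST_STREAM to keep its connection-level flow controller in step with the sender, can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, and validation of a final offset that is lower than data already received is deliberately left out.

```python
class FlowControlError(Exception):
    """Raised when the peer exceeds the advertised connection data limit."""


class ConnectionFlowAccounting:
    """Receiver-side connection flow control bookkeeping (hypothetical sketch)."""

    def __init__(self, max_data: int) -> None:
        self.max_data = max_data        # limit advertised via MAX_DATA
        self.highest_offset = {}        # stream_id -> highest offset seen so far
        self.total_received = 0         # sum of highest offsets over all streams

    def _advance(self, stream_id: int, new_highest: int) -> None:
        old = self.highest_offset.get(stream_id, 0)
        if new_highest > old:
            # Only the newly covered bytes count against the connection limit.
            self.total_received += new_highest - old
            self.highest_offset[stream_id] = new_highest
        if self.total_received > self.max_data:
            raise FlowControlError("connection data limit exceeded")

    def on_stream_frame(self, stream_id: int, offset: int, length: int) -> None:
        self._advance(stream_id, offset + length)

    def on_rst_stream(self, stream_id: int, final_offset: int) -> None:
        # Bytes the sender transmitted but the receiver never saw are still
        # charged, so both endpoints agree on connection-level credit.
        self._advance(stream_id, final_offset)
```

In this sketch, a stream that delivered 100 bytes and is then reset with a final offset of 400 is charged 400 bytes in total against the connection limit.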
-### Data Limit Increments {#fc-credit} + +## Data Limit Increments {#fc-credit} This document leaves when and how many bytes to advertise in a MAX_DATA or MAX_STREAM_DATA to implementations, but offers a few considerations. These frames contribute to connection overhead. Therefore frequently sending frames -with small changes is undesirable. At the same time, infrequent updates require -larger increments to limits if blocking is to be avoided. Thus, larger updates -require a receiver to commit to larger resource commitments. Thus there is a -trade-off between resource commitment and overhead when determining how large a -limit is advertised. +with small changes is undesirable. At the same time, larger increments to +limits are necessary to avoid blocking if updates are less frequent, requiring +larger resource commitments at the receiver. Thus there is a trade-off between +resource commitment and overhead when determining how large a limit is +advertised. A receiver MAY use an autotuning mechanism to tune the frequency and amount that it increases data limits based on a round-trip time estimate and the rate at which the receiving application consumes data, similar to common TCP implementations. -## Stream Limit Increment - -As with flow control, this document leaves when and how many streams to make -available to a peer via MAX_STREAM_ID to implementations, but offers a few -considerations. MAX_STREAM_ID frames constitute minimal overhead, while -withholding MAX_STREAM_ID frames can prevent the peer from using the available -parallelism. - -Implementations will likely want to increase the maximum stream ID as -peer-initiated streams close. A receiver MAY also advance the maximum stream ID -based on current activity, system conditions, and other environmental factors. - +If a sender runs out of flow control credit, it will be unable to send new +data. That is, the sender is blocked. A blocked sender SHOULD send a +STREAM_BLOCKED or BLOCKED frame. 
A receiver uses these frames for debugging +purposes. A receiver SHOULD NOT wait for a STREAM_BLOCKED or BLOCKED frame +before sending MAX_STREAM_DATA or MAX_DATA, since doing so will mean that a +sender will be blocked for an entire round trip. -### Blocking on Flow Control {#blocking} - -If a sender does not receive a MAX_DATA or MAX_STREAM_DATA frame when it has run -out of flow control credit, the sender will be blocked and SHOULD send a BLOCKED -or STREAM_BLOCKED frame. These frames are expected to be useful for debugging -at the receiver; they do not require any other action. A receiver SHOULD NOT -wait for a BLOCKED or STREAM_BLOCKED frame before sending MAX_DATA or -MAX_STREAM_DATA, since doing so will mean that a sender is unable to send for an -entire round trip. - -For smooth operation of the congestion controller, it is generally considered -best to not let the sender go into quiescence if avoidable. To avoid blocking a -sender, and to reasonably account for the possibility of loss, a receiver should -send a MAX_DATA or MAX_STREAM_DATA frame at least two round trips before it -expects the sender to get blocked. +It is generally considered best to not let the sender go into quiescence if +avoidable. To avoid blocking a sender, and to reasonably account for the +possibility of loss, a receiver should send a MAX_DATA or MAX_STREAM_DATA frame +at least two round trips before it expects the sender to get blocked. A sender sends a single BLOCKED or STREAM_BLOCKED frame only once when it reaches a data limit. A sender SHOULD NOT send multiple BLOCKED or @@ -868,6 +852,7 @@ errors is not mandatory, but only because requiring that an endpoint generate these errors also means that the endpoint needs to maintain the final offset state for closed streams, which could mean a significant state commitment. 
+ ## Flow Control for Cryptographic Handshake {#flow-control-crypto} Data sent in CRYPTO frames is not flow controlled in the same way as STREAM @@ -877,6 +862,24 @@ provide an interface to QUIC to tell it about its buffering limits so that there is not excessive buffering at multiple layers. +## Stream Limit Increment {#stream-limit-increment} + +An endpoint limits the number of concurrently active incoming streams by +limiting the maximum stream ID. An initial value is set in the transport +parameters (see {{transport-parameter-definitions}}) and is subsequently +increased by MAX_STREAM_ID frames (see {{frame-max-stream-id}}). + +As with stream and connection flow control, this document leaves when and how +many streams to make available to a peer via MAX_STREAM_ID to implementations, +but offers a few considerations. MAX_STREAM_ID frames constitute minimal +overhead, while withholding MAX_STREAM_ID frames can prevent the peer from using +the available parallelism. + +Implementations will likely want to increase the maximum stream ID as +peer-initiated streams close. A receiver MAY also advance the maximum stream ID +based on current activity, system conditions, and other environmental factors. + + # Connections {#connections} A QUIC connection is a single conversation between two QUIC endpoints. QUIC's @@ -891,8 +894,7 @@ endpoint, as described in {{termination}}. Each connection possesses a set of identifiers, any of which could be used to distinguish it from other connections. Connection IDs are selected -independently in each direction. Each Connection ID has an associated sequence -number to assist in deduplicating messages. +independently in each direction. The primary function of a connection ID is to ensure that changes in addressing at lower protocol layers (UDP, IP, and below) don't cause packets for a QUIC @@ -920,9 +922,11 @@ using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). 
### Issuing Connection IDs -The initial connection ID issued by an endpoint is the Source Connection ID -during the handshake. The sequence number of the initial connection ID is 0. If -the preferred_address transport parameter is sent, the sequence number of the +Each Connection ID has an associated sequence number to assist in deduplicating +messages. The initial connection ID issued by an endpoint is sent in the Source +Connection ID field of the long packet header ({{long-header}}) during the +handshake. The sequence number of the initial connection ID is 0. If the +preferred_address transport parameter is sent, the sequence number of the supplied connection ID is 1. Subsequent connection IDs are communicated to the peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}), and the sequence number on each newly-issued connection ID MUST increase by 1. The @@ -988,6 +992,8 @@ correspond to a single connection. Endpoints SHOULD send a Stateless Reset ({{stateless-reset}}) for any packets that cannot be attributed to an existing connection. + ### Client Packet Handling {#client-pkt-handling} @@ -1024,8 +1030,9 @@ the packet is sufficiently long. Servers MUST drop other packets that contain unsupported versions. Packets with a supported version, or no version field, are matched to a -connection as described in {{packet-handling}}. If not matched, the server -continues below. +connection using the connection ID or - for packets with zero-length connection +IDs - the address tuple. If the packet doesn't match an existing connection, +the server continues below. If the packet is an Initial packet fully conforming with the specification, the server proceeds with the handshake ({{handshake}}). This commits the server to @@ -1043,6 +1050,15 @@ SHOULD ignore any such packets. Servers MUST drop incoming packets under all other circumstances. +## Life of a QUIC Connection {#connection-lifecycle} + +TBD. 
+ + + # Version Negotiation {#version-negotiation} @@ -1057,6 +1073,7 @@ SHOULD pad the first packet they send to the largest of the minimum packet sizes across all versions they support. This ensures that the server responds if there is a mutually supported version. + ## Sending Version Negotiation Packets {#send-vn} If the version selected by the client is not acceptable to the server, the @@ -1081,8 +1098,8 @@ selects an acceptable protocol version from the list provided by the server. The client then attempts to create a connection using that version. Though the content of the Initial packet the client sends might not change in response to version negotiation, a client MUST increase the packet number it uses on every -packet it sends. Packets MUST continue to use long headers and MUST include the -new negotiated protocol version. +packet it sends. Packets MUST continue to use long headers ({{long-header}}) +and MUST include the new negotiated protocol version. The client MUST use the long header format and include its selected version on all packets until it has 1-RTT keys and it has received a packet from the server @@ -1125,6 +1142,113 @@ solicit a list of supported versions from a server. +# Proof of Source Address Ownership {#address-validation} + + + +Address validation is used by QUIC to avoid being used for a traffic +amplification attack. In such an attack, a packet is sent to a server with +spoofed source address information that identifies a victim. If a server +generates more or larger packets in response to that packet, the attacker can +use the server to send more data toward the victim than it would be able to send +on its own. + +The primary defense against amplification attack is verifying that a client is +able to receive packets at the transport address that it claims. QUIC also +requires that clients send UDP datagrams with at least 1200 octets of payload +until the server has completed address validation. 
A server can thereby send +more data to an unproven address without increasing the amplification advantage +gained by an attacker. + +A server eventually confirms that a client has received its messages when the +first Handshake-level message is received. This might be insufficient, either +because the server wishes to avoid the computational cost of completing the +handshake, or it might be that the size of the packets that are sent during the +handshake is too large. This is especially important for 0-RTT, where the +server might wish to provide application data traffic - such as a response to a +request - in response to the data carried in the early data from the client. + +To send additional data prior to completing the cryptographic handshake, the +server then needs to validate that the client owns the address that it claims. + +QUIC therefore performs source address validation during connection +establishment. + +A different type of source address validation is performed after a connection +migration, see {{migrate-validate}}. + + +### Client Address Validation Procedure + +QUIC uses token-based address validation. Any time the server wishes to +validate a client address, it provides the client with a token. As long as it +is not possible for an attacker to generate a valid token for its address (see +{{token-integrity}}) and the client is able to return that token, it proves to +the server that it received the token. + +Upon receiving the client's Initial packet, the server can request address +validation by sending a Retry packet containing a token. This token is repeated +in the client's next Initial packet. + +There is no need for a single well-defined format for the token because the +server that generates the token also consumes it. A token could include +information about the claimed client address (IP and port), a timestamp, and any +other supplementary information the server will need to validate the token in +the future. 
The only requirement is that a valid token be difficult to guess +for an attacker. + +The Retry packet is sent to the client and a legitimate client will respond with +an Initial packet containing the token from the Retry packet when it continues +the handshake. In response to receiving the token, a server can either abort +the connection or permit it to proceed. + +A connection MAY be accepted without address validation - or with only limited +validation - but a server SHOULD limit the data it sends toward an unvalidated +address. Successful completion of the cryptographic handshake implicitly +provides proof that the client has received packets from the server. + + +### Address Validation for Future Connections + +A server MAY provide clients with an address validation token during one +connection that can be used on a subsequent connection. Address validation is +especially important with 0-RTT because a server potentially sends a significant +amount of data to a client in response to 0-RTT data. + +The server uses the NEW_TOKEN frame {{frame-new-token}} to provide the +client with an address validation token that can be used to validate +future connections. The client may then use this token to validate +future connections by including it in the Initial packet's header. +The client MUST NOT use the token provided in a Retry for future +connections. + +Unlike the token that is created for a Retry packet, there might be some time +between when the token is created and when the token is subsequently used. +Thus, a resumption token SHOULD include an expiration time. The server MAY +include either an explicit expiration time or an issued timestamp and +dynamically calculate the expiration time. It is also unlikely that the client +port number is the same on two different connections; validating the port is +therefore unlikely to be successful. 
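
The draft leaves the token format to the server. Purely as an illustration, a token that carries an issue timestamp and is bound to the client's IP address with an integrity tag might look like the sketch below. Everything here is an assumption rather than part of the specification: the function names, the HMAC-SHA-256 construction, the 8-octet timestamp layout, and the default lifetime. The port is left out of the bound data because, as noted above, it is unlikely to match across connections.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # length of an HMAC-SHA-256 tag in octets


def make_token(key: bytes, client_ip: str, now: int) -> bytes:
    """Build a token: 8-octet issue time followed by an integrity tag."""
    issued = struct.pack("!Q", now)
    tag = hmac.new(key, client_ip.encode() + issued, hashlib.sha256).digest()
    return issued + tag


def validate_token(key: bytes, client_ip: str, token: bytes, now: int,
                   lifetime: int = 3600) -> bool:
    """Check the integrity tag and the expiry of a token (hypothetical format)."""
    if len(token) != 8 + TAG_LEN:
        return False
    issued, tag = token[:8], token[8:]
    expected = hmac.new(key, client_ip.encode() + issued, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    if not hmac.compare_digest(tag, expected):
        return False
    (issued_at,) = struct.unpack("!Q", issued)
    # Expiry is computed dynamically from the issue timestamp.
    return 0 <= now - issued_at <= lifetime
```

Only the server needs the key, so no per-client state has to be kept. A real deployment would also need a way to tell Retry tokens apart from tokens issued for future connections, which this sketch does not attempt.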
+ +A resumption token SHOULD be easily distinguishable from tokens that are sent in +Retry packets as they are carried in the same field. + +### Address Validation Token Integrity {#token-integrity} + +An address validation token MUST be difficult to guess. Including a large +enough random value in the token would be sufficient, but this depends on the +server remembering the value it sends to clients. + +A token-based scheme allows the server to offload any state associated with +validation to the client. For this design to work, the token MUST be covered by +integrity protection against modification or falsification by clients. Without +integrity protection, malicious clients could generate or guess values for +tokens that would be accepted by the server. Only the server requires access to +the integrity protection key for tokens. + + + # Cryptographic and Transport Handshake {#handshake} QUIC relies on a combined cryptographic and transport handshake to minimize @@ -1158,21 +1282,19 @@ that meets the requirements of the cryptographic handshake protocol: * authenticated negotiation of an application protocol (TLS uses ALPN {{?RFC7301}} for this purpose) -* for the server, the ability to carry data that provides assurance that the - client can receive packets that are addressed with the transport address that - is claimed by the client (see {{address-validation}}) - -The first CRYPTO frame MUST be sent in a single packet. Any second attempt -that is triggered by address validation MUST also be sent within a single -packet. This avoids having to reassemble a message from multiple packets. +The first CRYPTO frame from a client MUST be sent in a single packet. Any +second attempt that is triggered by address validation MUST also be sent within +a single packet. This avoids having to reassemble a message from multiple +packets. The first client packet of the cryptographic handshake protocol MUST fit within a 1232 octet QUIC packet payload. 
This includes overheads that reduce the space available to the cryptographic handshake protocol. -The CRYPTO frame can be sent in different packet number spaces. CRYPTO frames -in each packet number space carry a separate sequence of handshake data starting -from an offset of 0. +The CRYPTO frame can be sent in different packet number spaces. The sequence +numbers used by CRYPTO frames to ensure ordered delivery of cryptographic +handshake data start from zero in each packet number space. + ## Example Handshake Flows @@ -1228,7 +1350,6 @@ Initial[0]: CRYPTO[CH] <- 1-RTT[0]: STREAM[1, "..."] ACK[0] Initial[1]: ACK[0] -0-RTT[1]: CRYPTO[EOED] Handshake[0]: CRYPTO[FIN], ACK[0] 1-RTT[2]: STREAM[0, "..."] ACK[0] -> @@ -1238,6 +1359,56 @@ Handshake[0]: CRYPTO[FIN], ACK[0] {: #tls-0rtt-handshake title="Example 0-RTT Handshake"} +## Negotiating Connection IDs + + + +A connection ID is used to ensure consistent routing of packets, as described in +{{connection-id}}. The long header contains two connection IDs: the Destination +Connection ID is chosen by the recipient of the packet and is used to provide +consistent routing; the Source Connection ID is used to set the Destination +Connection ID used by the peer. + +During the handshake, packets with the long header ({{long-header}}) are used to +establish the connection ID that each endpoint uses. Each endpoint uses the +Source Connection ID field to specify the connection ID that is used in the +Destination Connection ID field of packets being sent to them. Upon receiving a +packet, each endpoint sets the Destination Connection ID it sends to match the +value of the Source Connection ID that they receive. + +During the handshake, a client can receive both a Retry and an Initial packet, +and thus be given two opportunities to update the Destination Connection ID it +sends. 
A client MUST only change the value it sends in the Destination +Connection ID in response to the first packet of each type it receives from the +server (Retry or Initial); a server MUST set its value based on the Initial +packet. Any additional changes are not permitted; if subsequent packets of +those types include a different Source Connection ID, they MUST be discarded. +This avoids problems that might arise from stateless processing of multiple +Initial packets producing different connection IDs. + +Packets with short headers ({{short-header}}) only include the Destination +Connection ID and omit the explicit length. The length of the Destination +Connection ID field is expected to be known to endpoints. + +Endpoints using a connection-ID based load balancer could agree with the load +balancer on a fixed or minimum length and on an encoding for connection IDs. +This fixed portion could encode an explicit length, which allows the entire +connection ID to vary in length and still be used by the load balancer. + +The very first packet sent by a client includes a random value for Destination +Connection ID. The same value MUST be used for all 0-RTT packets sent on that +connection ({{packet-protected}}). This randomized value is used to determine +the packet protection keys for Initial packets (see Section 5.2 of +{{QUIC-TLS}}). + +A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs +selected by the client, swapped to ensure correct routing toward the client. + +The connection ID can change over the lifetime of a connection, especially in +response to connection migration ({{migration}}). NEW_CONNECTION_ID frames +({{frame-new-connection-id}}) are used to provide new connection ID values. + + ## Transport Parameters During connection establishment, both endpoints make authenticated declarations @@ -1285,7 +1456,7 @@ language from Section 3 of {{!TLS13=RFC8446}}. 
QuicVersion negotiated_version; QuicVersion supported_versions<4..2^8-4>; }; - TransportParameter parameters<22..2^16-1>; + TransportParameter parameters<0..2^16-1>; } TransportParameters; struct { @@ -1300,7 +1471,7 @@ language from Section 3 of {{!TLS13=RFC8446}}. The `extension_data` field of the quic_transport_parameters extension defined in {{QUIC-TLS}} contains a TransportParameters value. TLS encoding rules are -therefore used to encode the transport parameters. +therefore used to describe the encoding of transport parameters. QUIC encodes transport parameters into a sequence of octets, which are then included in the cryptographic handshake. Once the handshake completes, the @@ -1445,9 +1616,9 @@ preferred_address (0x0004): : The server's Preferred Address is used to effect a change in server address at the end of the handshake, as described in {{preferred-address}}. -A client MUST NOT include a stateless reset token or a preferred address. A -server MUST treat receipt of either transport parameter as a connection error of -type TRANSPORT_PARAMETER_ERROR. +A client MUST NOT include an original connection ID, a stateless reset token, or +a preferred address. A server MUST treat receipt of any of these transport +parameters as a connection error of type TRANSPORT_PARAMETER_ERROR. ### Values of Transport Parameters for 0-RTT {#zerortt-parameters} @@ -1469,10 +1640,10 @@ transport parameters for use in the new connection. If 0-RTT data is accepted by the server, the server MUST NOT reduce any limits or alter any values that might be violated by the client with its 0-RTT data. In particular, a server that accepts 0-RTT data MUST NOT set values for initial_max_data, -initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, and -initial_max_stream_data_uni that are smaller than the remembered value of those -parameters. Similarly, a server MUST NOT reduce the value of -initial_max_bidi_streams or initial_max_uni_streams. 
+initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, +initial_max_stream_data_uni, initial_max_bidi_streams, or +initial_max_uni_streams that are smaller than the remembered value of those +parameters. Omitting or setting a zero value for certain transport parameters can result in 0-RTT data being enabled, but not usable. The applicable subset of transport @@ -1568,117 +1739,8 @@ One way that a new format could be introduced is to define a TLS extension with a different codepoint. -## Proof of Source Address Ownership {#address-validation} - -Transport protocols commonly spend a round trip checking that a client owns the -transport address (IP and port) that it claims. Verifying that a client can -receive packets sent to its claimed transport address protects against spoofing -of this information by malicious clients. - -This technique is used primarily to avoid QUIC from being used for traffic -amplification attack. In such an attack, a packet is sent to a server with -spoofed source address information that identifies a victim. If a server -generates more or larger packets in response to that packet, the attacker can -use the server to send more data toward the victim than it would be able to send -on its own. - -Several methods are used in QUIC to mitigate this attack. Firstly, the initial -handshake packet is sent in a UDP datagram that contains at least 1200 octets of -UDP payload. This allows a server to send a similar amount of data without -risking causing an amplification attack toward an unproven remote address. - -A server eventually confirms that a client has received its messages when the -first Handshake-level message is received. This might be insufficient, -either because the server wishes to avoid the computational cost of completing -the handshake, or it might be that the size of the packets that are sent during -the handshake is too large. 
This is especially important for 0-RTT, where the -server might wish to provide application data traffic - such as a response to a -request - in response to the data carried in the early data from the client. - -To send additional data prior to completing the cryptographic handshake, the -server then needs to validate that the client owns the address that it claims. - -Source address validation is therefore performed by the core transport -protocol during the establishment of a connection. - -A different type of source address validation is performed after a connection -migration, see {{migrate-validate}}. - - -### Client Address Validation Procedure - -QUIC uses token-based address validation. Any time the server wishes -to validate a client address, it provides the client with a token. As -long as the token's authenticity can be checked (see -{{token-integrity}}) and the client is able to return that token, it -proves to the server that it received the token. - -Upon receiving the client's Initial packet, the server can request -address validation by sending a Retry packet containing a token. This -token is repeated in the client's next Initial packet. Because the -token is consumed by the server that generates it, there is no need -for a single well-defined format. A token could include information -about the claimed client address (IP and port), a timestamp, and any -other supplementary information the server will need to validate the -token in the future. - -The Retry packet is sent to the client and a legitimate client will -respond with an Initial packet containing the token from the Retry packet -when it continues the handshake. In response to receiving the token, a -server can either abort the connection or permit it to proceed. - -A connection MAY be accepted without address validation - or with only limited -validation - but a server SHOULD limit the data it sends toward an unvalidated -address. 
Successful completion of the cryptographic handshake implicitly -provides proof that the client has received packets from the server. - -The client should allow for additional Retry packets being sent in -response to Initial packets sent containing a token. There are several -situations in which the server might not be able to use the previously -generated token to validate the client's address and must send a new -Retry. A reasonable limit to the number of tries the client allows -for, before giving up, is 3. That is, the client MUST echo the -address validation token from a new Retry packet up to 3 times. After -that, it MAY give up on the connection attempt. - - -### Address Validation for Future Connections - -A server MAY provide clients with an address validation token during one -connection that can be used on a subsequent connection. Address validation is -especially important with 0-RTT because a server potentially sends a significant -amount of data to a client in response to 0-RTT data. - -The server uses the NEW_TOKEN frame {{frame-new-token}} to provide the -client with an address validation token that can be used to validate -future connections. The client may then use this token to validate -future connections by including it in the Initial packet's header. -The client MUST NOT use the token provided in a Retry for future -connections. - -Unlike the token that is created for a Retry packet, there might be some time -between when the token is created and when the token is subsequently used. -Thus, a resumption token SHOULD include an expiration time. The server MAY -include either an explicit expiration time or an issued timestamp and -dynamically calculate the expiration time. It is also unlikely that the client -port number is the same on two different connections; validating the port is -therefore unlikely to be successful. - - -### Address Validation Token Integrity {#token-integrity} - -An address validation token MUST be difficult to guess. 
Including a large -enough random value in the token would be sufficient, but this depends on the -server remembering the value it sends to clients. - -A token-based scheme allows the server to offload any state associated with -validation to the client. For this design to work, the token MUST be covered by -integrity protection against modification or falsification by clients. Without -integrity protection, malicious clients could generate or guess values for -tokens that would be accepted by the server. Only the server requires access to -the integrity protection key for tokens. - ## Stateless Retries {#stateless-retry} + A server can process an Initial packet from a client without committing any state. This allows a server to perform address validation @@ -1718,7 +1780,7 @@ traversal needs additional synchronization mechanisms that are not provided here. An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that are used for -path validation with other frames. For instance, an endpoint may pad a packet +path validation with other frames. In particular, an endpoint may pad a packet carrying a PATH_CHALLENGE for PMTU discovery, or an endpoint may bundle a PATH_RESPONSE with its own PATH_CHALLENGE. @@ -1728,36 +1790,37 @@ NEW_CONNECTION_ID and PATH_CHALLENGE frames in the same packet. This ensures that an unused connection ID will be available to the peer when sending a response. -## Initiation + +## Initiating Path Validation To initiate path validation, an endpoint sends a PATH_CHALLENGE frame containing a random payload on the path to be validated. -An endpoint MAY send additional PATH_CHALLENGE frames to handle packet loss. An -endpoint SHOULD NOT send a PATH_CHALLENGE more frequently than it would an -Initial packet, ensuring that connection migration is no more load on a new path -than establishing a new connection. +An endpoint MAY send multiple PATH_CHALLENGE frames to guard against packet +loss. 
An endpoint SHOULD NOT send a PATH_CHALLENGE more frequently than it +would an Initial packet, ensuring that connection migration is no more load on a +new path than establishing a new connection. The endpoint MUST use fresh random data in every PATH_CHALLENGE frame so that it can associate the peer's response with the causative PATH_CHALLENGE. -## Response +## Path Validation Responses On receiving a PATH_CHALLENGE frame, an endpoint MUST respond immediately by -echoing the data contained in the PATH_CHALLENGE frame in a PATH_RESPONSE frame, -with the following stipulation. Since a PATH_CHALLENGE might be sent from a -spoofed address, an endpoint MAY limit the rate at which it sends PATH_RESPONSE -frames and MAY silently discard PATH_CHALLENGE frames that would cause it to -respond at a higher rate. +echoing the data contained in the PATH_CHALLENGE frame in a PATH_RESPONSE frame. +However, because a PATH_CHALLENGE might be sent from a spoofed address, an +endpoint MUST limit the rate at which it sends PATH_RESPONSE frames and MAY +silently discard PATH_CHALLENGE frames that would cause it to respond at a +higher rate. To ensure that packets can be both sent to and received from the peer, the -PATH_RESPONSE MUST be sent on the same path as the triggering PATH_CHALLENGE: -from the same local address on which the PATH_CHALLENGE was received, to the -same remote address from which the PATH_CHALLENGE was received. +PATH_RESPONSE MUST be sent on the same path as the triggering PATH_CHALLENGE. +That is, from the same local address on which the PATH_CHALLENGE was received, +to the same remote address from which the PATH_CHALLENGE was received. -## Completion +## Successful Path Validation A new address is considered valid when a PATH_RESPONSE frame is received containing data that was sent in a previous PATH_CHALLENGE. 
Receipt of an @@ -1779,21 +1842,24 @@ the path to be valid when a PATH_RESPONSE frame is received on the same path with the same payload as the PATH_CHALLENGE frame. -## Abandonment +## Failed Path Validation + +Path validation only fails when the endpoint attempting to validate the path +abandons its attempt to validate the path. -An endpoint SHOULD abandon path validation after sending some number of -PATH_CHALLENGE frames or after some time has passed. When setting this timer, -implementations are cautioned that the new path could have a longer round-trip -time than the original. +Endpoints SHOULD abandon path validation based on a timer. When setting this +timer, implementations are cautioned that the new path could have a longer +round-trip time than the original. Note that the endpoint might receive packets containing other frames on the new path, but a PATH_RESPONSE frame with appropriate data is required for path validation to succeed. -If path validation fails, the path is deemed unusable. This does not -necessarily imply a failure of the connection - endpoints can continue sending -packets over other paths as appropriate. If no paths are available, an endpoint -can wait for a new path to become available or close the connection. +When an endpoint abandons path validation, it determines that the path is +unusable. This does not necessarily imply a failure of the connection - +endpoints can continue sending packets over other paths as appropriate. If no +paths are available, an endpoint can wait for a new path to become available or +close the connection. A path validation might be abandoned for other reasons besides failure. Primarily, this happens if a connection migration to a new path is @@ -1802,10 +1868,10 @@ initiated while a path validation on the old path is in progress. 
# Connection Migration {#migration} -QUIC allows connections to survive changes to endpoint addresses (that is, IP -address and/or port), such as those caused by an endpoint migrating to a new -network. This section describes the process by which an endpoint migrates to a -new address. +The use of a connection ID allows connections to survive changes to endpoint +addresses (that is, IP address and/or port), such as those caused by an endpoint +migrating to a new network. This section describes the process by which an +endpoint migrates to a new address. An endpoint MUST NOT initiate connection migration before the handshake is finished and the endpoint has 1-RTT keys. The design of QUIC relies on @@ -1820,8 +1886,9 @@ INVALID_MIGRATION. Not all changes of peer address are intentional migrations. The peer could experience NAT rebinding: a change of address due to a middlebox, usually a NAT, allocating a new outgoing port or even a new outgoing IP address for a flow. -Endpoints SHOULD perform path validation ({{migrate-validate}}) if a NAT -rebinding does not cause the connection to fail. +NAT rebinding is not connection migration as defined in this section, though an +endpoint SHOULD perform path validation ({{migrate-validate}}) if it detects a +change in the IP address of its peer. This document limits migration of connections to new client addresses, except as described in {{preferred-address}}. Clients are responsible for initiating all @@ -1858,7 +1925,7 @@ any other frame is a "non-probing packet". ## Initiating Connection Migration {#initiating-migration} An endpoint can migrate a connection to a new local address by sending packets -containing frames other than probing frames from that address. +containing non-probing frames from that address. Each endpoint validates its peer's address during connection establishment. Therefore, a migrating endpoint can send to its peer knowing that the peer is @@ -2034,7 +2101,7 @@ Caution: changes. 
-# Server's Preferred Address {#preferred-address} +## Server's Preferred Address {#preferred-address} QUIC allows servers to accept connections on one IP address and attempt to transfer these connections to a more preferred address shortly after the @@ -2048,7 +2115,7 @@ work. If a client receives packets from a new server address not indicated by the preferred_address transport parameter, the client SHOULD discard these packets. -## Communicating A Preferred Address +### Communicating A Preferred Address A server conveys a preferred address by including the preferred_address transport parameter in the TLS handshake. @@ -2063,12 +2130,13 @@ discontinue use of the old server address. If path validation fails, the client MUST continue sending all future packets to the server's original IP address. -## Responding to Connection Migration +### Responding to Connection Migration A server might receive a packet addressed to its preferred IP address at any -time after the handshake is completed. If this packet contains a PATH_CHALLENGE -frame, the server sends a PATH_RESPONSE frame as per {{migrate-validate}}, but -the server MUST continue sending all other packets from its original IP address. +time after it accepts a connection. If this packet contains a PATH_CHALLENGE +frame, the server sends a PATH_RESPONSE frame as per {{migrate-validate}}. The +server MAY send other non-probing frames from its preferred address, but MUST +continue sending all probing packets from its original IP address. The server SHOULD also initiate path validation of the client using its preferred address and the address from which it received the client probe. This @@ -2076,12 +2144,12 @@ helps to guard against spurious migration initiated by an attacker. Once the server has completed its path validation and has received a non-probing packet with a new largest packet number on its preferred address, the server -begins sending to the client exclusively from its preferred IP address. 
It -SHOULD drop packets for this connection received on the old IP address, but MAY -continue to process delayed packets. +begins sending non-probing packets to the client exclusively from its preferred +IP address. It SHOULD drop packets for this connection received on the old IP +address, but MAY continue to process delayed packets. -## Interaction of Client Migration and Preferred Address +### Interaction of Client Migration and Preferred Address A client might need to perform a connection migration before it has migrated to the server's preferred address. In this case, the client SHOULD perform path @@ -2090,9 +2158,9 @@ new address concurrently. If path validation of the server's preferred address succeeds, the client MUST abandon validation of the original address and migrate to using the server's -preferred address. If path validation of the server's preferred address fails, +preferred address. If path validation of the server's preferred address fails but validation of the server's original address succeeds, the client MAY migrate -to using the original address from the client's new address. +to its new address and continue sending to the server's original address. If the connection to the server's preferred address is not from the same client address, the server MUST protect against potential attacks as described in @@ -2108,6 +2176,9 @@ address before path validation is complete. # Using Explicit Congestion Notification {#using-ecn} + + QUIC endpoints use Explicit Congestion Notification (ECN) {{!RFC3168}} to detect and respond to network congestion. ECN allows a network node to indicate congestion in the network by setting a codepoint in the IP header of a packet @@ -2205,15 +2276,12 @@ These states SHOULD persist for three times the current Retransmission Timeout An endpoint enters a closing period after initiating an immediate close ({{immediate-close}}). 
While closing, an endpoint MUST NOT send packets unless they contain a CONNECTION_CLOSE or APPLICATION_CLOSE frame (see -{{immediate-close}} for details). - -In the closing state, only a packet containing a closing frame can be sent. An -endpoint retains only enough information to generate a packet containing a -closing frame and to identify packets as belonging to the connection. The -connection ID and QUIC version is sufficient information to identify packets for -a closing connection; an endpoint can discard all other connection state. An -endpoint MAY retain packet protection keys for incoming packets to allow it to -read and process a closing frame. +{{immediate-close}} for details). An endpoint retains only enough information +to generate a packet containing a closing frame and to identify packets as +belonging to the connection. The connection ID and QUIC version is sufficient +information to identify packets for a closing connection; an endpoint can +discard all other connection state. An endpoint MAY retain packet protection +keys for incoming packets to allow it to read and process a closing frame. The draining state is entered once an endpoint receives a signal that its peer is closing or draining. While otherwise identical to the closing state, an @@ -2459,7 +2527,7 @@ This design relies on the peer always sending a connection ID in its packets so that the endpoint can use the connection ID from a packet to reset the connection. An endpoint that uses this design MUST either use the same connection ID length for all connections or encode the length of the connection -ID such that it can be recovered without state. In addition, it MUST NOT +ID such that it can be recovered without state. In addition, it cannot provide a zero-length connection ID. Revealing the Stateless Reset Token allows any entity to terminate the @@ -2569,71 +2637,68 @@ between endpoints. 
# Packets and Frames {#packets-frames} -Any QUIC packet, with the exception of the Version Negotiation packet, has -either a long or a short header, as indicated by the Header Form bit. Long -headers are expected to be used early in the connection before the establishment -of 1-RTT keys. Packets that carry the long header are Initial -{{packet-initial}}, Retry {{packet-retry}}, Handshake {{packet-handshake}}, and -0-RTT Protected packets {{packet-protected}}. Packets that carry Short headers -are minimal version-specific headers, which are used after version negotiation -and 1-RTT keys are established, and are described in {{short-header}}. Version -Negotiation packets are described in {{packet-version}}. +QUIC endpoints communicate by exchanging packets. Packets are carried in UDP +datagrams (see {{packet-coalesce}}) and have confidentiality and integrity +protection (see {{packet-protected}}). +This version of QUIC uses the long packet header (see {{header-long}}) during +connection establishment and the short header (see {{header-short}}) once 1-RTT +keys have been established. -## Protected Packets {#packet-protected} +Packets that carry the long header are Initial {{packet-initial}}, Retry +{{packet-retry}}, Handshake {{packet-handshake}}, and 0-RTT Protected packets +{{packet-protected}}. -All QUIC packets use packet protection. Packets that are protected with the -static handshake keys or the 0-RTT keys are sent with long headers; all packets -protected with 1-RTT keys are sent with short headers. The different packet -types explicitly indicate the encryption level and therefore the keys that are -used to remove packet protection. 0-RTT and 1-RTT protected packets share a -single packet number space. +Packets with the short header are designed for minimal overhead and are used +after a connection is established. -Packets protected with handshake keys only use packet protection to ensure that -the sender of the packet is on the network path. 
This packet protection is not -effective confidentiality protection; any entity that receives the Initial -packet from a client can recover the keys necessary to remove packet protection -or to generate packets that will be successfully authenticated. +Version negotiation uses a packet with a special format (see +{{packet-version}}). -Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality -and data origin authentication; the cryptographic handshake ensures that only -the communicating endpoints receive the corresponding keys. -Packets protected with 0-RTT keys use a type value of 0x7C. The connection ID -fields for a 0-RTT packet MUST match the values used in the Initial packet -({{packet-initial}}). +## Protected Packets {#packet-protected} -The version field for protected packets is the current QUIC version. +All QUIC packets except Version Negotiation and Retry packets use authenticated +encryption with additional data (AEAD) {{!RFC5116}} to provide confidentiality +and integrity protection. Details of packet protection are found in +{{QUIC-TLS}}; this section includes an overview of the process. + +Initial packets are protected using keys that are statically derived. This +packet protection is not effective confidentiality protection; it only exists to +ensure that the sender of the packet is on the network path. Any entity that +receives the Initial packet from a client can recover the keys necessary to +remove packet protection or to generate packets that will be successfully +authenticated. + +All other packets are protected with keys derived from the cryptographic +handshake. The packet type in the long header or the key phase in the short +header identifies which encryption level - and therefore which keys - are +used.
Packets protected with 0-RTT and 1-RTT keys are expected +to have confidentiality and data origin authentication; the cryptographic +handshake ensures that only the communicating endpoints receive the +corresponding keys. The packet number field contains a packet number, which has additional confidentiality protection that is applied after packet protection is applied (see {{QUIC-TLS}} for details). The underlying packet number increases with each packet sent, see {{packet-numbers}} for details. -The payload is protected using authenticated encryption. {{QUIC-TLS}} describes -packet protection in detail. After decryption, the plaintext consists of a -sequence of frames, as described in {{frames}}. - ## Coalescing Packets {#packet-coalesce} -A sender can coalesce multiple QUIC packets (typically a Cryptographic Handshake -packet and a Protected packet) into one UDP datagram. This can reduce the -number of UDP datagrams needed to send application data during the handshake and -immediately afterwards. It is not necessary for senders to coalesce -packets, though failing to do so will require sending a significantly -larger number of datagrams during the handshake. Receivers MUST -be able to process coalesced packets. +A sender can coalesce multiple QUIC packets into one UDP datagram. This can +reduce the number of UDP datagrams needed to complete the cryptographic +handshake and start sending data. Receivers MUST be able to process +coalesced packets. Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, Handshake, 1-RTT) makes it more likely the receiver will be able to process all the packets in a single pass. A packet with a short header does not include a length, so it will always be the last packet included in a UDP datagram. -Senders MUST NOT coalesce QUIC packets with different Destination Connection -IDs into a single UDP datagram.
Receivers SHOULD ignore any subsequent packets -with a different Destination Connection ID than the first packet in the -datagram. +Senders MUST NOT coalesce QUIC packets for different connections into a single +UDP datagram. Receivers SHOULD ignore any subsequent packets with a different +Destination Connection ID than the first packet in the datagram. Every QUIC packet that is coalesced into a single UDP datagram is separate and complete. Though the values of some fields in the packet header might be @@ -2646,68 +2711,23 @@ receiver MUST still attempt to process the remaining packets. The skipped packets MAY either be discarded or buffered for later processing, just as if the packets were received out-of-order in separate datagrams. -Retry ({{packet-retry}}) and Version Negotiation ({{packet-version}}) packets -cannot be coalesced. - + -## Connection ID Encoding - -A connection ID is used to ensure consistent routing of packets, as described in -{{connection-id}}. The long header contains two connection IDs: the Destination -Connection ID is chosen by the recipient of the packet and is used to provide -consistent routing; the Source Connection ID is used to set the Destination -Connection ID used by the peer. - -During the handshake, packets with the long header are used to establish the -connection ID that each endpoint uses. Each endpoint uses the Source Connection -ID field to specify the connection ID that is used in the Destination Connection -ID field of packets being sent to them. Upon receiving a packet, each endpoint -sets the Destination Connection ID it sends to match the value of the Source -Connection ID that they receive. - -During the handshake, a client can receive both a Retry and an Initial packet, -and thus be given two opportunities to update the Destination Connection ID it -sends. 
A client MUST only change the value it sends in the Destination -Connection ID in response to the first packet of each type it receives from the -server (Retry or Initial); a server MUST set its value based on the Initial -packet. Any additional changes are not permitted; if subsequent packets of -those types include a different Source Connection ID, they MUST be discarded. -This avoids problems that might arise from stateless processing of multiple -Initial packets producing different connection IDs. - -Short headers only include the Destination Connection ID and omit the explicit -length. The length of the Destination Connection ID field is expected to be -known to endpoints. - -Endpoints using a connection-ID based load balancer could agree with the load -balancer on a fixed or minimum length and on an encoding for connection IDs. -This fixed portion could encode an explicit length, which allows the entire -connection ID to vary in length and still be used by the load balancer. - -The very first packet sent by a client includes a random value for Destination -Connection ID. The same value MUST be used for all 0-RTT packets sent on that -connection ({{packet-protected}}). This randomized value is used to determine -the packet protection keys for Initial packets (see Section 5.2 of -{{QUIC-TLS}}). - -A Version Negotiation ({{packet-version}}) packet MUST use both connection IDs -selected by the client, swapped to ensure correct routing toward the client. - -The connection ID can change over the lifetime of a connection, especially in -response to connection migration ({{migration}}). NEW_CONNECTION_ID frames -({{frame-new-connection-id}}) are used to provide new connection ID values. +Retry packets ({{packet-retry}}), Version Negotiation packets +({{packet-version}}), and packets with a short header cannot be followed by +other packets in the same UDP datagram. 
## Packet Numbers {#packet-numbers} -The packet number is an integer in the range 0 to 2^62-1, present in all long -and short header packets. This number is used in determining the cryptographic -nonce for packet protection. Each endpoint maintains a separate packet number -for sending and receiving. +The packet number is an integer in the range 0 to 2^62-1. Where present, packet +numbers are encoded as a variable-length integer (see {{integer-encoding}}). +This number is used in determining the cryptographic nonce for packet +protection. Each endpoint maintains a separate packet number for sending and +receiving. -A Version Negotiation packet ({{packet-version}}) does not include a packet -number. The Retry packet ({{packet-retry}}) has special rules for populating -the packet number field. +Version Negotiation ({{packet-version}}) and Retry {{packet-retry}} packets do +not include a packet number. Packet numbers are divided into 3 spaces in QUIC: From 9600d976e421bb846fdf85831e3864ca16300af8 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Wed, 17 Oct 2018 19:33:08 -0700 Subject: [PATCH 31/57] addresses mt's comments. --- draft-ietf-quic-transport.md | 855 ++++++++++++++++++----------------- 1 file changed, 429 insertions(+), 426 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index b5a0cb1a44..44218c1a40 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -304,7 +304,7 @@ lower-numbered streams of the same type in the same direction. QUIC allows for an arbitrary number of streams to operate concurrently. An endpoint limits the number of concurrently active incoming streams by limiting -the maximum stream ID (see {{stream-limit-increments}}). +the maximum stream ID (see {{stream-limit-increment}}). The maximum stream ID is specific to each endpoint and applies only to the peer that receives the setting. 
That is, clients specify the maximum stream ID the @@ -890,7 +890,7 @@ different IP or port at either endpoint, due to NAT rebinding or mobility, as described in {{migration}}. Finally, a connection may be terminated by either endpoint, as described in {{termination}}. -## Connection ID +## Connection ID {#connection-id} Each connection possesses a set of identifiers, any of which could be used to distinguish it from other connections. Connection IDs are selected @@ -1171,10 +1171,22 @@ request - in response to the data carried in the early data from the client. To send additional data prior to completing the cryptographic handshake, the server then needs to validate that the client owns the address that it claims. - QUIC therefore performs source address validation during connection establishment. +Servers MUST NOT send more than three times as many bytes as the number of bytes +received prior to verifying the client's address. Source addresses can be +verified through an address validation token (delivered via a Retry packet or +a NEW_TOKEN frame) or by processing any message from the client encrypted using +the Handshake keys. This limit exists to mitigate amplification attacks. + +In order to prevent this limit causing a handshake deadlock, the client SHOULD +always send a packet upon a handshake timeout, as described in +{{QUIC-RECOVERY}}. If the client has no data to retransmit and does not have +Handshake keys, it SHOULD send an Initial packet in a UDP datagram of at least +1200 octets. If the client has Handshake keys, it SHOULD send a Handshake +packet. + A different type of source address validation is performed after a connection migration, see {{migrate-validate}}. @@ -1234,6 +1246,50 @@ therefore unlikely to be successful. A resumption token SHOULD be easily distinguishable from tokens that are sent in Retry packets as they are carried in the same field. 
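As an illustration of the amplification limit added earlier in this section (a server sends no more than three times the bytes it has received until the client's address is verified), the byte accounting could be sketched as follows. This is a minimal sketch only: the class name `UnvalidatedPeer` and its methods are hypothetical and do not come from the draft, and the sketch counts whole UDP datagram sizes as an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# "three times as many bytes" from the text; the constant name is hypothetical.
AMPLIFICATION_FACTOR = 3


@dataclass
class UnvalidatedPeer:
    """Hypothetical per-client byte accounting used before address validation."""

    bytes_received: int = 0
    bytes_sent: int = 0
    validated: bool = False

    def on_datagram_received(self, size: int) -> None:
        # Every byte received from the client raises the sending budget.
        self.bytes_received += size

    def on_address_validated(self) -> None:
        # For example: a valid token was presented, or a packet protected
        # with Handshake keys was successfully processed.
        self.validated = True

    def send_allowance(self) -> Optional[int]:
        """Bytes that may still be sent; None means the limit no longer applies."""
        if self.validated:
            return None
        budget = AMPLIFICATION_FACTOR * self.bytes_received
        return max(0, budget - self.bytes_sent)

    def try_send(self, size: int) -> bool:
        """Record a send if it fits in the budget; return False otherwise."""
        allowance = self.send_allowance()
        if allowance is not None and size > allowance:
            return False
        self.bytes_sent += size
        return True
```

A server that refuses a send in this sketch simply waits until more data arrives from the client or the address is validated; the client-side handshake timeout described above is what prevents that wait from becoming a deadlock.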
+ +### Tokens + + +If the client has a token received in a NEW_TOKEN frame on a previous connection +to what it believes to be the same server, it can include that value in the +Token field of its Initial packet. + +A token allows a server to correlate activity between connections. +Specifically, the connection where the token was issued, and any connection +where it is used. Clients that want to break continuity of identity with a +server MAY discard tokens provided using the NEW_TOKEN frame. Tokens obtained +in Retry packets MUST NOT be discarded. + +A client SHOULD NOT reuse a token. Reusing a token allows connections to be +linked by entities on the network path (see {{migration-linkability}}). A +client MUST NOT reuse a token if it believes that its point of network +attachment has changed since the token was last used; that is, if there is a +change in its local IP address or network interface. A client needs to start +the connection process over if it migrates prior to completing the handshake. + +When a server receives an Initial packet with an address validation token, it +SHOULD attempt to validate it. If the token is invalid then the server SHOULD +proceed as if the client did not have a validated address, including potentially +sending a Retry. If the validation succeeds, the server SHOULD then allow the +handshake to proceed (see {{stateless-retry}}). + +Note: + +: The rationale for treating the client as unvalidated rather than discarding + the packet is that the client might have received the token in a previous + connection using the NEW_TOKEN frame, and if the server has lost state, it + might be unable to validate the token at all, leading to connection failure if + the packet is discarded. A server MAY encode tokens provided with NEW_TOKEN + frames and Retry packets differently, and validate the latter more strictly. 
+ +In a stateless design, a server can use encrypted and authenticated tokens to +pass information to clients that the server can later recover and use to +validate a client address. Tokens are not integrated into the cryptographic +handshake and so they are not authenticated. For instance, a client might be +able to reuse a token. To avoid attacks that exploit this property, a server +can limit its use of tokens to only the information needed validate client +addresses. + ### Address Validation Token Integrity {#token-integrity} An address validation token MUST be difficult to guess. Including a large @@ -1301,6 +1357,10 @@ handshake data start from zero in each packet number space. Details of how TLS is integrated with QUIC are provided in {{QUIC-TLS}}, but some examples are provided here. +Once version negotiation is complete, the cryptographic handshake is used to +agree on cryptographic keys. The cryptographic handshake is carried in Initial +({{packet-initial}}) and Handshake ({{packet-handshake}}) packets. + {{tls-1rtt-handshake}} provides an overview of the 1-RTT handshake. Each line shows a QUIC packet with the packet type and packet number shown first, followed by the frames that are typically contained in those packets. So, for instance @@ -1359,7 +1419,7 @@ Handshake[0]: CRYPTO[FIN], ACK[0] {: #tls-0rtt-handshake title="Example 0-RTT Handshake"} -## Negotiating Connection IDs +## Negotiating Connection IDs {#negotiating-connection-ids} @@ -1376,15 +1436,36 @@ Destination Connection ID field of packets being sent to them. Upon receiving a packet, each endpoint sets the Destination Connection ID it sends to match the value of the Source Connection ID that they receive. -During the handshake, a client can receive both a Retry and an Initial packet, -and thus be given two opportunities to update the Destination Connection ID it -sends. 
A client MUST only change the value it sends in the Destination -Connection ID in response to the first packet of each type it receives from the -server (Retry or Initial); a server MUST set its value based on the Initial -packet. Any additional changes are not permitted; if subsequent packets of -those types include a different Source Connection ID, they MUST be discarded. -This avoids problems that might arise from stateless processing of multiple -Initial packets producing different connection IDs. +When an Initial packet is sent by a client which has not previously received a +Retry packet from the server, it populates the Destination Connection ID field +with an unpredictable value. This MUST be at least 8 octets in length. Until a +packet is received from the server, the client MUST use the same value unless it +abandons the connection attempt and starts a new one. The initial Destination +Connection ID is used to determine packet protection keys for Initial packets. + +The client populates the Source Connection ID field with a value of its choosing +and sets the SCIL field to match. + +The Destination Connection ID field in the server's Initial packet contains a +connection ID that is chosen by the recipient of the packet (i.e., the client); +the Source Connection ID includes the connection ID that the sender of the +packet wishes to use (see {{connection-id}}). The server MUST use consistent +Source Connection IDs during the handshake. + +On first receiving an Initial or Retry packet from the server, the client uses +the Source Connection ID supplied by the server as the Destination Connection ID +for subsequent packets. That means that a client might change the Destination +Connection ID twice during connection establishment. Once a client has received +an Initial packet from the server, it MUST discard any packet it receives with a +different Source Connection ID. 
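The client-side rules in this section can be sketched as a small state holder. This is an illustrative sketch under stated assumptions, not an implementation from the draft: the class `ClientConnectionIds` and the string packet-type labels are hypothetical, and only server Initial and Retry packets are modelled.

```python
import os


class ClientConnectionIds:
    """Hypothetical tracker for the connection IDs a client uses in a handshake."""

    def __init__(self, source_cid: bytes) -> None:
        self.source_cid = source_cid
        # Unpredictable initial Destination Connection ID, at least 8 octets.
        self.destination_cid = os.urandom(8)
        self.seen_retry = False
        self.seen_initial = False

    def on_server_packet(self, packet_type: str, server_scid: bytes) -> bool:
        """Return True if the packet is accepted, False if it is discarded."""
        if packet_type not in ("initial", "retry"):
            raise ValueError("only Initial and Retry packets are modelled here")

        if self.seen_initial:
            # After the server's Initial, a different Source Connection ID
            # causes the packet to be discarded.
            return server_scid == self.destination_cid

        if packet_type == "retry":
            if self.seen_retry:
                # Only the first Retry may change the value.
                return server_scid == self.destination_cid
            self.seen_retry = True
        else:
            self.seen_initial = True

        # First packet of this type: adopt the server's Source Connection ID
        # as the Destination Connection ID for subsequent packets.
        self.destination_cid = server_scid
        return True
```

With a Retry followed by an Initial, the Destination Connection ID changes twice, which matches the "might change ... twice" observation in the text.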
+ +A client MUST only change the value it sends in the Destination Connection ID in +response to the first packet of each type it receives from the server (Retry or +Initial); a server MUST set its value based on the Initial packet. Any +additional changes are not permitted; if subsequent packets of those types +include a different Source Connection ID, they MUST be discarded. This avoids +problems that might arise from stateless processing of multiple Initial packets +producing different connection IDs. Packets with short headers ({{short-header}}) only include the Destination Connection ID and omit the explicit length. The length of the Destination @@ -2177,7 +2258,7 @@ address before path validation is complete. # Using Explicit Congestion Notification {#using-ecn} +section. --> QUIC endpoints use Explicit Congestion Notification (ECN) {{!RFC3168}} to detect and respond to network congestion. ECN allows a network node to indicate @@ -2641,8 +2722,8 @@ QUIC endpoints communicate by exchanging packets. Packets are carried in UDP datagrams (see {{packet-coalesce}}) and have confidentiality and integrity protection (see {{packet-protected}}). -This version of QUIC uses the long packet header (see {{header-long}}) during -connection establishment and the short header (see {{header-short}}) once 1-RTT +This version of QUIC uses the long packet header (see {{long-header}}) during +connection establishment and the short header (see {{short-header}}) once 1-RTT keys have been established. Packets that carry the long header are Initial {{packet-initial}}, Retry @@ -2759,56 +2840,6 @@ without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY send a Stateless Reset ({{stateless-reset}}) in response to further packets that it receives. -In the QUIC long and short packet headers, the number of bits required to -represent the packet number is reduced by including only a variable number of -the least significant bits of the packet number. 
One or two of the most -significant bits of the first octet determine how many bits of the packet -number are provided, as shown in {{pn-encodings}}. - -| First octet pattern | Encoded Length | Bits Present | -|:--------------------|:---------------|:-------------| -| 0b0xxxxxxx | 1 octet | 7 | -| 0b10xxxxxx | 2 | 14 | -| 0b11xxxxxx | 4 | 30 | -{: #pn-encodings title="Packet Number Encodings for Packet Headers"} - -Note that these encodings are similar to those in {{integer-encoding}}, but -use different values. - -The encoded packet number is protected as described in Section 5.3 -{{QUIC-TLS}}. Protection of the packet number is removed prior to recovering the -full packet number. The full packet number is reconstructed at the receiver -based on the number of significant bits present, the value of those bits, and -the largest packet number received on a successfully authenticated -packet. Recovering the full packet number is necessary to successfully remove -packet protection. - -Once packet number protection is removed, the packet number is decoded by -finding the packet number value that is closest to the next expected packet. -The next expected packet is the highest received packet number plus one. For -example, if the highest successfully authenticated packet had a packet number of -0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as -0xaa8309b3. -Example pseudo-code for packet number decoding can be found in -{{sample-packet-number-decoding}}. - -The sender MUST use a packet number size able to represent more than twice as -large a range than the difference between the largest acknowledged packet and -packet number being sent. A peer receiving the packet will then correctly -decode the packet number, unless the packet is delayed in transit such that it -arrives after many higher-numbered packets have been received. 
An endpoint -SHOULD use a large enough packet number encoding to allow the packet number to -be recovered even if the packet arrives after packets that are sent afterwards. - -As a result, the size of the packet number encoding is at least one more than -the base 2 logarithm of the number of contiguous unacknowledged packet numbers, -including the new packet. - -For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, -sending a packet with a number of 0x6b2d79 requires a packet number encoding -with 14 bits or more; whereas the 30-bit packet number encoding is needed to -send a packet with a number of 0x6bc107. - A receiver MUST discard a newly unprotected packet unless it is certain that it has not processed another packet with the same packet number from the same packet number space. Duplicate suppression MUST happen after removing packet @@ -2816,12 +2847,18 @@ protection for the reasons described in Section 9.3 of {{QUIC-TLS}}. An efficient algorithm for duplicate suppression can be found in Section 3.4.3 of {{?RFC2406}}. +Packet number encoding at a sender and decoding at a receiver are described in +{{packet-encoding}}. + ## Frames and Frame Types {#frames} -The payload of all packets, after removing packet protection, consists of a -sequence of frames, as shown in {{packet-frames}}. Version Negotiation and -Stateless Reset do not contain frames. +The payload of QUIC packets, after removing packet protection, commonly consists +of a sequence of frames, as shown in {{packet-frames}}. Version Negotiation, +Stateless Reset, and Retry packets do not contain frames. + + ~~~ 0 1 2 3 @@ -2838,8 +2875,8 @@ Stateless Reset do not contain frames. ~~~ {: #packet-frames title="QUIC Payload"} -QUIC payloads MUST contain at least one frame, and MAY contain multiple -frames and multiple frame types. +QUIC payloads MUST contain at least one frame, and MAY contain multiple frames +and multiple frame types. 
Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC packet boundary. Each frame begins with a Frame Type, indicating its type, followed by @@ -2905,16 +2942,29 @@ a connection error of type PROTOCOL_VIOLATION. A sender bundles one or more frames in a QUIC packet (see {{frames}}). -A sender SHOULD minimize per-packet bandwidth and computational costs by -bundling as many frames as possible within a QUIC packet. A sender MAY wait for -a short period of time to bundle multiple frames before sending a packet that is -not maximally packed, to avoid sending out large numbers of small packets. An +A sender can minimize per-packet bandwidth and computational costs by bundling +as many frames as possible within a QUIC packet. A sender MAY wait for a short +period of time to bundle multiple frames before sending a packet that is not +maximally packed, to avoid sending out large numbers of small packets. An implementation may use knowledge about application sending behavior or heuristics to determine whether and for how long to wait. This waiting period is an implementation decision, and an implementation should be careful to delay conservatively, since any delay is likely to increase application-visible latency. +Stream multiplexing is achieved by interleaving STREAM frames from multiple +streams into one or more QUIC packets. A single QUIC packet can include +multiple STREAM frames from one or more streams. + +One of the benefits of QUIC is avoidance of head-of-line blocking across +multiple streams. When a packet loss occurs, only streams with data in that +packet are blocked waiting for a retransmission to be received, while other +streams can continue making progress. Note that when data from multiple streams +is bundled into a single QUIC packet, loss of that packet blocks all those +streams from making progress. 
Implementations are advised to bundle as few +streams as necessary in outgoing packets without losing transmission efficiency +to underfilled packets. + ## Packet Processing and Acknowledgment {#processing-and-ack} @@ -2926,11 +2976,21 @@ consumed. Once the packet has been fully processed, a receiver acknowledges receipt by sending one or more ACK frames containing the packet number of the received -packet. To avoid creating an indefinite feedback loop, an endpoint MUST NOT -send an ACK frame in response to a packet containing only ACK or PADDING frames, -even if there are packet gaps which precede the received packet. The endpoint -MUST acknowledge packets containing only ACK or PADDING frames in the next ACK -frame that it sends. +packet. + + + +### Sending ACK Frames + + + +To avoid creating an indefinite feedback loop, an endpoint MUST NOT send an ACK +frame in response to a packet containing only ACK or PADDING frames, even if +there are packet gaps which precede the received packet. The endpoint MUST +however acknowledge packets containing only ACK or PADDING frames when sending +ACK frames in response to other packets. While PADDING frames do not elicit an ACK frame from a receiver, they are considered to be in flight for congestion control purposes @@ -2940,9 +3000,52 @@ acknowledgments forthcoming from the receiver. Therefore, a sender should ensure that other frames are sent in addition to PADDING frames to elicit acknowledgments from the receiver. +An endpoint MUST NOT send more than one packet containing only an ACK frame per +received packet that contains frames other than ACK and PADDING frames. + +The receiver's delayed acknowledgment timer SHOULD NOT exceed the current RTT +estimate or the value it indicates in the `max_ack_delay` transport parameter. +This ensures an acknowledgment is sent at least once per RTT when packets +needing acknowledgement are received. 
The sender can use the receiver's +`max_ack_delay` value in determining timeouts for timer-based retransmission. + Strategies and implications of the frequency of generating acknowledgments are discussed in more detail in {{QUIC-RECOVERY}}. +To limit ACK Blocks to those that have not yet been received by the sender, the +receiver SHOULD track which ACK frames have been acknowledged by its peer. Once +an ACK frame has been acknowledged, the packets it acknowledges SHOULD NOT be +acknowledged again. + +Because ACK frames are not sent in response to ACK-only packets, a receiver that +is only sending ACK frames will only receive acknowledgements for its packets +if the sender includes them in packets with non-ACK frames. A sender SHOULD +bundle ACK frames with other frames when possible. + +To limit receiver state or the size of ACK frames, a receiver MAY limit the +number of ACK Blocks it sends. A receiver can do this even without receiving +acknowledgment of its ACK frames, with the knowledge this could cause the sender +to unnecessarily retransmit some data. Standard QUIC {{QUIC-RECOVERY}} +algorithms declare packets lost after sufficiently newer packets are +acknowledged. Therefore, the receiver SHOULD repeatedly acknowledge newly +received packets in preference to packets received in the past. + +### ACK Frames and Packet Protection + +ACK frames MUST only be carried in a packet that has the same packet +number space as the packet being ACKed (see {{packet-protected}}). For +instance, packets that are protected with 1-RTT keys MUST be +acknowledged in packets that are also protected with 1-RTT keys. + +Packets that a client sends with 0-RTT packet protection MUST be acknowledged by +the server in packets protected by 1-RTT keys. This can mean that the client is +unable to use these acknowledgments if the server cryptographic handshake +messages are delayed or lost. Note that the same limitation applies to other +data sent by the server protected by the 1-RTT keys. 
+ +Endpoints SHOULD send acknowledgments for packets containing CRYPTO frames with +a reduced delay; see Section 4.3.1 of {{QUIC-RECOVERY}}. + ## Retransmission of Information @@ -3045,6 +3148,10 @@ UDP datagram of this size ensures that the network path supports a reasonable Maximum Transmission Unit (MTU), and helps reduce the amplitude of amplification attacks caused by server responses toward an unverified client address. +The payload of a UDP datagram carrying the Initial packet MUST be expanded to at +least 1200 octets, by adding PADDING frames to the Initial packet and/or by +combining the Initial packet with a 0-RTT packet (see {{packet-coalesce}}). + The datagram containing the first Initial packet from a client MAY exceed 1200 octets if the client believes that the Path Maximum Transmission Unit (PMTU) supports the size that it chooses. @@ -3055,6 +3162,9 @@ datagram is smaller than 1200 octets. It MUST NOT send any other frame type in response, or otherwise behave as if any part of the offending packet was processed as valid. +The server MUST also limit the number of bytes it sends before validating the +address of the client, see {{address-validation}}. + ## Path Maximum Transmission Unit @@ -3180,14 +3290,101 @@ using for private experimentation on the GitHub wiki at -# Packet Types and Formats {#packet-formats} +# Packet Formats {#packet-formats} All numeric values are encoded in network byte order (that is, big-endian) and -all field sizes are in bits. When discussing individual bits of fields, the -least significant bit is referred to as bit 0. Hexadecimal notation is used for -describing the value of fields. +all field sizes are in bits. Hexadecimal notation is used for describing the +value of fields. + + +## Variable-Length Integer Encoding {#integer-encoding} + +QUIC packets and frames commonly use a variable-length encoding for non-negative +integer values. This encoding ensures that smaller integer values need fewer +octets to encode. 
+ +The QUIC variable-length integer encoding reserves the two most significant bits +of the first octet to encode the base 2 logarithm of the integer encoding length +in octets. The integer value is encoded on the remaining bits, in network byte +order. + +This means that integers are encoded on 1, 2, 4, or 8 octets and can encode 6, +14, 30, or 62 bit values respectively. {{integer-summary}} summarizes the +encoding properties. + +| 2Bit | Length | Usable Bits | Range | +|:-----|:-------|:------------|:----------------------| +| 00 | 1 | 6 | 0-63 | +| 01 | 2 | 14 | 0-16383 | +| 10 | 4 | 30 | 0-1073741823 | +| 11 | 8 | 62 | 0-4611686018427387903 | +{: #integer-summary title="Summary of Integer Encodings"} + +For example, the eight octet sequence c2 19 7c 5e ff 14 e8 8c (in hexadecimal) +decodes to the decimal value 151288809941952652; the four octet sequence 9d 7f +3e 7d decodes to 494878333; the two octet sequence 7b bd decodes to 15293; and +the single octet 25 decodes to 37 (as does the two octet sequence 40 25). + +Error codes ({{error-codes}}) and versions {{versions}} are described using +integers, but do not use this encoding. + +## Packet Number Encoding and Decoding {#packet-encoding} + + + +Packet numbers in long and short packet headers are encoded as follows. The +number of bits required to represent the packet number is first reduced by +including only a variable number of the least significant bits of the packet +number. One or two of the most significant bits of the first octet are then +used to represent how many bits of the packet number are provided, as shown in +{{pn-encodings}}. + +| First octet pattern | Encoded Length | Bits Present | +|:--------------------|:---------------|:-------------| +| 0b0xxxxxxx | 1 octet | 7 | +| 0b10xxxxxx | 2 | 14 | +| 0b11xxxxxx | 4 | 30 | +{: #pn-encodings title="Packet Number Encodings for Packet Headers"} + +Note that these encodings are similar to those in {{integer-encoding}}, but +use different values. 
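To make the difference between the two encodings concrete, the following sketch decodes both the variable-length integer of {{integer-encoding}} and the truncated packet number prefix of {{pn-encodings}}. The function names are hypothetical, and the packet number helper assumes that packet number protection has already been removed from its input.

```python
from typing import Tuple


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Decode a variable-length integer; return (value, octets consumed)."""
    if not data:
        raise ValueError("empty input")
    prefix = data[0] >> 6          # two most significant bits of the first octet
    length = 1 << prefix           # 1, 2, 4, or 8 octets
    if len(data) < length:
        raise ValueError("truncated variable-length integer")
    usable_bits = 8 * length - 2   # 6, 14, 30, or 62 usable bits
    value = int.from_bytes(data[:length], "big") & ((1 << usable_bits) - 1)
    return value, length


def decode_header_packet_number(data: bytes) -> Tuple[int, int]:
    """Decode a truncated packet number; return (value, octets consumed).

    Assumes packet number protection has already been removed.
    """
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if (first & 0x80) == 0x00:     # 0b0xxxxxxx: 1 octet, 7 bits
        length, bits = 1, 7
    elif (first & 0xC0) == 0x80:   # 0b10xxxxxx: 2 octets, 14 bits
        length, bits = 2, 14
    else:                          # 0b11xxxxxx: 4 octets, 30 bits
        length, bits = 4, 30
    if len(data) < length:
        raise ValueError("truncated packet number")
    value = int.from_bytes(data[:length], "big") & ((1 << bits) - 1)
    return value, length
```

Applied to the examples in {{integer-encoding}}, `decode_varint(bytes.fromhex("7bbd"))` yields the value 15293 from two octets, while both `25` and `40 25` yield 37.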
+ +Finally, the encoded packet number is protected as described in Section 5.3 of +{{QUIC-TLS}}. + +The sender MUST use a packet number size able to represent more than twice as +large a range than the difference between the largest acknowledged packet and +packet number being sent. A peer receiving the packet will then correctly +decode the packet number, unless the packet is delayed in transit such that it +arrives after many higher-numbered packets have been received. An endpoint +SHOULD use a large enough packet number encoding to allow the packet number to +be recovered even if the packet arrives after packets that are sent afterwards. + +As a result, the size of the packet number encoding is at least one more than +the base 2 logarithm of the number of contiguous unacknowledged packet numbers, +including the new packet. + +For example, if an endpoint has received an acknowledgment for packet 0x6afa2f, +sending a packet with a number of 0x6b2d79 requires a packet number encoding +with 14 bits or more; whereas the 30-bit packet number encoding is needed to +send a packet with a number of 0x6bc107. + +At a receiver, protection of the packet number is removed prior to recovering +the full packet number. The full packet number is then reconstructed based on +the number of significant bits present, the value of those bits, and the largest +packet number received on a successfully authenticated packet. Recovering the +full packet number is necessary to successfully remove packet protection. + +Once packet number protection is removed, the packet number is decoded by +finding the packet number value that is closest to the next expected packet. +The next expected packet is the highest received packet number plus one. For +example, if the highest successfully authenticated packet had a packet number of +0xaa82f30e, then a packet containing a 14-bit value of 0x9b3 will be decoded as +0xaa8309b3. 
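The receiver-side reconstruction described here, choosing the full packet number closest to the next expected one, can be sketched as below. This is one common way to implement the rule, written as a sketch rather than a copy of the draft's own sample pseudo-code; the function and parameter names are assumptions.

```python
def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Recover a full packet number from its truncated form.

    largest_pn:   largest packet number received in a successfully
                  authenticated packet in this packet number space
    truncated_pn: value carried in the header (protection already removed)
    pn_nbits:     number of bits present in the header (7, 14, or 30)
    """
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1

    # Start from the expected packet number with its low bits replaced.
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn

    # Move by one window if that brings the candidate closer to the expected
    # value, without leaving the range 0 to 2^62-1.
    if candidate_pn <= expected_pn - pn_hwin and candidate_pn < (1 << 62) - pn_win:
        return candidate_pn + pn_win
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win
    return candidate_pn
```

For the example in the text, `decode_packet_number(0xaa82f30e, 0x9b3, 14)` returns `0xaa8309b3`: the first candidate, `0xaa82c9b3`, is more than half a window below the expected value and is therefore moved up by one window.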
Example pseudo-code for packet number decoding can be found in +{{sample-packet-number-decoding}}. + -## Long Header {#long-header} +## Long Header Packet {#long-header} ~~~~~ 0 1 2 3 @@ -3253,13 +3450,13 @@ Destination Connection ID: : The Destination Connection ID field follows the connection ID lengths and is either 0 octets in length or between 4 and 18 octets. - {{connection-id-encoding}} describes the use of this field in more detail. + {{negotiating-connection-ids}} describes the use of this field in more detail. Source Connection ID: : The Source Connection ID field follows the Destination Connection ID and is either 0 octets in length or between 4 and 18 octets. - {{connection-id-encoding}} describes the use of this field in more detail. + {{negotiating-connection-ids}} describes the use of this field in more detail. Length: @@ -3270,9 +3467,9 @@ Length: Packet Number: : The packet number field is 1, 2, or 4 octets long. The packet number has - confidentiality protection separate from packet protection, as described - in Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is - encoded in the plaintext packet number. See {{packet-numbers}} for details. + confidentiality protection separate from packet protection, as described in + Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is encoded + in the plaintext packet number. See {{packet-encoding}} for details. Payload: @@ -3280,6 +3477,11 @@ Payload: The following packet types are defined: + + | Type | Name | Section | |:-----|:------------------------------|:----------------------------| | 0x7F | Initial | {{packet-initial}} | @@ -3301,13 +3503,11 @@ following sections. The end of the packet is determined by the Length field. The Length field covers both the Packet Number and Payload fields, both of which are confidentiality protected and initially of unknown length. The size of the -Payload field is learned once the packet number protection is removed. 
- -Senders can sometimes coalesce multiple packets into one UDP datagram. See -{{packet-coalesce}} for more details. +Payload field is learned once the packet number protection is removed. The +Length field enables packet coalescing ({{packet-coalesce}}). -## Short Header {#short-header} +## Short Header Packet {#short-header} ~~~~~ 0 1 2 3 @@ -3383,7 +3583,7 @@ Packet Number: : The packet number field is 1, 2, or 4 octets long. The packet number has confidentiality protection separate from packet protection, as described in Section 5.3 of {{QUIC-TLS}}. The length of the packet number field is encoded - in the plaintext packet number. See {{packet-numbers}} for details. + in the plaintext packet number. See {{packet-encoding}} for details. Protected Payload: @@ -3398,7 +3598,7 @@ versions of QUIC are interpreted. ## Version Negotiation Packet {#packet-version} A Version Negotiation packet is inherently not version-specific, and does not -use the long packet header (see {{long-header}}. Upon receipt by a client, it +use the long packet header (see {{long-header}}). Upon receipt by a client, it will appear to be a packet using the long header, but will be identified as a Version Negotiation packet based on the Version field having a value of 0. @@ -3459,142 +3659,20 @@ See {{version-negotiation}} for a description of the version negotiation process. -## Retry Packet {#packet-retry} +## Initial Packet {#packet-initial} -A Retry packet uses a long packet header with a type value of 0x7E. It carries -an address validation token created by the server. It is used by a server that -wishes to perform a stateless retry (see {{stateless-retry}}). 
- -~~~ - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 -+-+-+-+-+-+-+-+-+ -|1| 0x7e | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Version (32) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -|DCIL(4)|SCIL(4)| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Destination Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Source Connection ID (0/32..144) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| ODCIL(8) | Original Destination Connection ID (*) | -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -| Retry Token (*) ... -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -~~~ -{: #retry-format title="Retry Packet"} - -A Retry packet (shown in {{retry-format}}) only uses the invariant portion of -the long packet header {{QUIC-INVARIANTS}}; that is, the fields up to and -including the Destination and Source Connection ID fields. A Retry packet does -not contain any protected fields. Like Version Negotiation, a Retry packet -contains the long header including the connection IDs, but omits the Length, -Packet Number, and Payload fields. These are replaced with: - -ODCIL: - -: The length of the Original Destination Connection ID field. The length is - encoded in the least significant 4 bits of the octet, using the same encoding - as the DCIL and SCIL fields. The most significant 4 bits of this octet are - reserved. Unless a use for these bits has been negotiated, endpoints SHOULD - send randomized values and MUST ignore any value that it receives. - -Original Destination Connection ID: - -: The Original Destination Connection ID contains the value of the Destination - Connection ID from the Initial packet that this Retry is in response to. The - length of this field is given in ODCIL. 
- -Retry Token: - -: An opaque token that the server can use to validate the client's address. - -The server populates the Destination Connection ID with the connection ID that -the client included in the Source Connection ID of the Initial packet. - -The server includes a connection ID of its choice in the Source Connection ID -field. This value MUST not be equal to the Destination Connection ID field of -the packet sent by the client. The client MUST use this connection ID in the -Destination Connection ID of subsequent packets that it sends. - -A server MAY send Retry packets in response to Initial and 0-RTT packets. A -server can either discard or buffer 0-RTT packets that it receives. A server -can send multiple Retry packets as it receives Initial or 0-RTT packets. - -A client MUST accept and process at most one Retry packet for each connection -attempt. After the client has received and processed an Initial or Retry packet -from the server, it MUST discard any subsequent Retry packets that it receives. - -Clients MUST discard Retry packets that contain an Original Destination -Connection ID field that does not match the Destination Connection ID from its -Initial packet. This prevents an off-path attacker from injecting a Retry -packet. - -The client responds to a Retry packet with an Initial packet that includes the -provided Retry Token to continue connection establishment. - -A client sets the Destination Connection ID field of this Initial packet to the -value from the Source Connection ID in the Retry packet. Changing Destination -Connection ID also results in a change to the keys used to protect the Initial -packet. It also sets the Token field to the token provided in the Retry. The -client MUST NOT change the Source Connection ID because the server could include -the connection ID as part of its token validation logic (see {{tokens}}). 
- -All subsequent Initial packets from the client MUST use the connection ID and -token values from the Retry packet. Aside from this, the Initial packet sent -by the client is subject to the same restrictions as the first Initial packet. -A client can either reuse the cryptographic handshake message or construct a -new one at its discretion. - -A client MAY attempt 0-RTT after receiving a Retry packet by sending 0-RTT -packets to the connection ID provided by the server. A client that sends -additional 0-RTT packets without constructing a new cryptographic handshake -message MUST NOT reset the packet number to 0 after a Retry packet, see -{{retry-0rtt-pn}}. - -A server acknowledges the use of a Retry packet for a connection using the -original_connection_id transport parameter (see -{{transport-parameter-definitions}}). If the server sends a Retry packet, it -MUST include the value of the Original Destination Connection ID field of the -Retry packet (that is, the Destination Connection ID field from the client's -first Initial packet) in the transport parameter. - -If the client received and processed a Retry packet, it validates that the -original_connection_id transport parameter is present and correct; otherwise, it -validates that the transport parameter is absent. A client MUST treat a failed -validation as a connection error of type TRANSPORT_PARAMETER_ERROR. - -A Retry packet does not include a packet number and cannot be explicitly -acknowledged by a client. - - -## Cryptographic Handshake Packets {#handshake-packets} - -Once version negotiation is complete, the cryptographic handshake is used to -agree on cryptographic keys. The cryptographic handshake is carried in Initial -({{packet-initial}}) and Handshake ({{packet-handshake}}) packets. - -All these packets use the long header and contain the current QUIC version in -the version field. 
- -In order to prevent tampering by version-unaware middleboxes, Initial -packets are protected with connection- and version-specific keys -(Initial keys) as described in {{QUIC-TLS}}. This protection does not -provide confidentiality or integrity against on-path attackers, but -provides some level of protection against off-path attackers. - - -## Initial Packet {#packet-initial} - -The Initial packet uses long headers with a type value of 0x7F. It carries the -first CRYPTO frames sent by the client and server to perform key exchange, and -carries ACKs in either direction. The Initial packet is protected by Initial -keys as described in {{QUIC-TLS}}. - -The Initial packet (shown in {{initial-format}}) has two additional header -fields that are added to the Long Header before the Length field. +An Initial packet uses long headers with a type value of 0x7F. It carries the +first CRYPTO frames sent by the client and server to perform key exchange, and +carries ACKs in either direction. + +In order to prevent tampering by version-unaware middleboxes, Initial packets +are protected with connection- and version-specific keys (Initial keys) as +described in {{QUIC-TLS}}. This protection does not provide confidentiality or +integrity against on-path attackers, but provides some level of protection +against off-path attackers. + +An Initial packet (shown in {{initial-format}}) has two additional header fields +that are added to the Long Header before the Length field. ~~~ +-+-+-+-+-+-+-+-+ @@ -3664,76 +3742,6 @@ and will contain a CRYPTO frame with an offset matching the size of the CRYPTO frame sent in the first Initial packet. Cryptographic handshake messages subsequent to the first do not need to fit within a single UDP datagram. - -### Connection IDs - -When an Initial packet is sent by a client which has not previously received a -Retry packet from the server, it populates the Destination Connection ID field -with an unpredictable value. 
This MUST be at least 8 octets in length. Until a -packet is received from the server, the client MUST use the same value unless it -abandons the connection attempt and starts a new one. The initial Destination -Connection ID is used to determine packet protection keys for Initial packets. - -The client populates the Source Connection ID field with a value of its choosing -and sets the SCIL field to match. - -The Destination Connection ID field in the server's Initial packet contains a -connection ID that is chosen by the recipient of the packet (i.e., the client); -the Source Connection ID includes the connection ID that the sender of the -packet wishes to use (see {{connection-id}}). The server MUST use consistent -Source Connection IDs during the handshake. - -On first receiving an Initial or Retry packet from the server, the client uses -the Source Connection ID supplied by the server as the Destination Connection ID -for subsequent packets. That means that a client might change the Destination -Connection ID twice during connection establishment. Once a client has received -an Initial packet from the server, it MUST discard any packet it receives with a -different Source Connection ID. - - -### Tokens - -If the client has a token received in a NEW_TOKEN frame on a previous connection -to what it believes to be the same server, it can include that value in the -Token field of its Initial packet. - -A token allows a server to correlate activity between connections. -Specifically, the connection where the token was issued, and any connection -where it is used. Clients that want to break continuity of identity with a -server MAY discard tokens provided using the NEW_TOKEN frame. Tokens obtained -in Retry packets MUST NOT be discarded. - -A client SHOULD NOT reuse a token. Reusing a token allows connections to be -linked by entities on the network path (see {{migration-linkability}}). 
A -client MUST NOT reuse a token if it believes that its point of network -attachment has changed since the token was last used; that is, if there is a -change in its local IP address or network interface. A client needs to start -the connection process over if it migrates prior to completing the handshake. - -When a server receives an Initial packet with an address validation token, it -SHOULD attempt to validate it. If the token is invalid then the server SHOULD -proceed as if the client did not have a validated address, including potentially -sending a Retry. If the validation succeeds, the server SHOULD then allow the -handshake to proceed (see {{stateless-retry}}). - -Note: - -: The rationale for treating the client as unvalidated rather than discarding - the packet is that the client might have received the token in a previous - connection using the NEW_TOKEN frame, and if the server has lost state, it - might be unable to validate the token at all, leading to connection failure if - the packet is discarded. A server MAY encode tokens provided with NEW_TOKEN - frames and Retry packets differently, and validate the latter more strictly. - -In a stateless design, a server can use encrypted and authenticated tokens to -pass information to clients that the server can later recover and use to -validate a client address. Tokens are not integrated into the cryptographic -handshake and so they are not authenticated. For instance, a client might be -able to reuse a token. To avoid attacks that exploit this property, a server -can limit its use of tokens to only the information needed validate client -addresses. - - ### Starting Packet Numbers The first Initial packet sent by either endpoint contains a packet number of @@ -3741,9 +3749,13 @@ The first Initial packet sent by either endpoint contains a packet number of are in a different packet number space to other packets (see {{packet-numbers}}). 
- ### 0-RTT Packet Numbers {#retry-0rtt-pn} + + + Packet numbers for 0-RTT protected packets use the same space as 1-RTT protected packets. @@ -3775,90 +3787,150 @@ the same keys, avoiding any risk of key and nonce reuse; this also prevents the connection. -### Minimum Packet Size - -The payload of a UDP datagram carrying the Initial packet MUST be expanded to at -least 1200 octets (see {{packetization}}), by adding PADDING frames to the -Initial packet and/or by combining the Initial packet with a 0-RTT packet (see -{{packet-coalesce}}). - - ## Handshake Packet {#packet-handshake} A Handshake packet uses long headers with a type value of 0x7D. It is used to carry acknowledgments and cryptographic handshake messages from the server and client. -A server sends its cryptographic handshake in one or more Handshake packets in -response to an Initial packet if it does not send a Retry packet. Once a client -has received a Handshake packet from a server, it uses Handshake packets to send -subsequent cryptographic handshake messages and acknowledgments to the server. +Once a client has received a Handshake packet from a server, it uses Handshake +packets to send subsequent cryptographic handshake messages and acknowledgments +to the server. The Destination Connection ID field in a Handshake packet contains a connection ID that is chosen by the recipient of the packet; the Source Connection ID includes the connection ID that the sender of the packet wishes to use (see -{{connection-id-encoding}}). +{{negotiating-connection-ids}}). The first Handshake packet sent by a server contains a packet number of 0. Handshake packets are their own packet number space. Packet numbers are incremented normally for other Handshake packets. -Servers MUST NOT send more than three times as many bytes as the number of bytes -received prior to verifying the client's address. 
Source addresses can be -verified through an address validation token (delivered via a Retry packet or -a NEW_TOKEN frame) or by processing any message from the client encrypted using -the Handshake keys. This limit exists to mitigate amplification attacks. - -In order to prevent this limit causing a handshake deadlock, the client SHOULD -always send a packet upon a handshake timeout, as described in -{{QUIC-RECOVERY}}. If the client has no data to retransmit and does not have -Handshake keys, it SHOULD send an Initial packet in a UDP datagram of at least -1200 octets. If the client has Handshake keys, it SHOULD send a Handshake -packet. - The payload of this packet contains CRYPTO frames and could contain PADDING, or ACK frames. Handshake packets MAY contain CONNECTION_CLOSE or APPLICATION_CLOSE frames. Endpoints MUST treat receipt of Handshake packets with other frames as a connection error. +## Retry Packet {#packet-retry} -# Frame Types and Formats {#frame-formats} +A Retry packet uses a long packet header with a type value of 0x7E. It carries +an address validation token created by the server. It is used by a server that +wishes to perform a stateless retry (see {{stateless-retry}}). -As described in {{frames}}, packets contain one or more frames. This section -describes the format and semantics of the core QUIC frame types. +~~~ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+ +|1| 0x7e | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Version (32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +|DCIL(4)|SCIL(4)| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Destination Connection ID (0/32..144) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Source Connection ID (0/32..144) ... 
++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| ODCIL(8) | Original Destination Connection ID (*) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| Retry Token (*) ... ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~~~ +{: #retry-format title="Retry Packet"} +A Retry packet (shown in {{retry-format}}) only uses the invariant portion of +the long packet header {{QUIC-INVARIANTS}}; that is, the fields up to and +including the Destination and Source Connection ID fields. A Retry packet does +not contain any protected fields. Like Version Negotiation, a Retry packet +contains the long header including the connection IDs, but omits the Length, +Packet Number, and Payload fields. These are replaced with: -## Variable-Length Integer Encoding {#integer-encoding} +ODCIL: -QUIC frames commonly use a variable-length encoding for non-negative integer -values. This encoding ensures that smaller integer values need fewer octets to -encode. +: The length of the Original Destination Connection ID field. The length is + encoded in the least significant 4 bits of the octet, using the same encoding + as the DCIL and SCIL fields. The most significant 4 bits of this octet are + reserved. Unless a use for these bits has been negotiated, endpoints SHOULD + send randomized values and MUST ignore any value that it receives. -The QUIC variable-length integer encoding reserves the two most significant bits -of the first octet to encode the base 2 logarithm of the integer encoding length -in octets. The integer value is encoded on the remaining bits, in network byte -order. +Original Destination Connection ID: -This means that integers are encoded on 1, 2, 4, or 8 octets and can encode 6, -14, 30, or 62 bit values respectively. {{integer-summary}} summarizes the -encoding properties. 
+: The Original Destination Connection ID contains the value of the Destination + Connection ID from the Initial packet that this Retry is in response to. The + length of this field is given in ODCIL. -| 2Bit | Length | Usable Bits | Range | -|:-----|:-------|:------------|:----------------------| -| 00 | 1 | 6 | 0-63 | -| 01 | 2 | 14 | 0-16383 | -| 10 | 4 | 30 | 0-1073741823 | -| 11 | 8 | 62 | 0-4611686018427387903 | -{: #integer-summary title="Summary of Integer Encodings"} +Retry Token: + +: An opaque token that the server can use to validate the client's address. + + + +The server populates the Destination Connection ID with the connection ID that +the client included in the Source Connection ID of the Initial packet. + +The server includes a connection ID of its choice in the Source Connection ID +field. This value MUST not be equal to the Destination Connection ID field of +the packet sent by the client. The client MUST use this connection ID in the +Destination Connection ID of subsequent packets that it sends. + +A server MAY send Retry packets in response to Initial and 0-RTT packets. A +server can either discard or buffer 0-RTT packets that it receives. A server +can send multiple Retry packets as it receives Initial or 0-RTT packets. + +A client MUST accept and process at most one Retry packet for each connection +attempt. After the client has received and processed an Initial or Retry packet +from the server, it MUST discard any subsequent Retry packets that it receives. + +Clients MUST discard Retry packets that contain an Original Destination +Connection ID field that does not match the Destination Connection ID from its +Initial packet. This prevents an off-path attacker from injecting a Retry +packet. + +The client responds to a Retry packet with an Initial packet that includes the +provided Retry Token to continue connection establishment. 
+ +A client sets the Destination Connection ID field of this Initial packet to the +value from the Source Connection ID in the Retry packet. Changing Destination +Connection ID also results in a change to the keys used to protect the Initial +packet. It also sets the Token field to the token provided in the Retry. The +client MUST NOT change the Source Connection ID because the server could include +the connection ID as part of its token validation logic (see {{tokens}}). + +All subsequent Initial packets from the client MUST use the connection ID and +token values from the Retry packet. Aside from this, the Initial packet sent +by the client is subject to the same restrictions as the first Initial packet. +A client can either reuse the cryptographic handshake message or construct a +new one at its discretion. + +A client MAY attempt 0-RTT after receiving a Retry packet by sending 0-RTT +packets to the connection ID provided by the server. A client that sends +additional 0-RTT packets without constructing a new cryptographic handshake +message MUST NOT reset the packet number to 0 after a Retry packet, see +{{retry-0rtt-pn}}. + +A server acknowledges the use of a Retry packet for a connection using the +original_connection_id transport parameter (see +{{transport-parameter-definitions}}). If the server sends a Retry packet, it +MUST include the value of the Original Destination Connection ID field of the +Retry packet (that is, the Destination Connection ID field from the client's +first Initial packet) in the transport parameter. + +If the client received and processed a Retry packet, it validates that the +original_connection_id transport parameter is present and correct; otherwise, it +validates that the transport parameter is absent. A client MUST treat a failed +validation as a connection error of type TRANSPORT_PARAMETER_ERROR. + +A Retry packet does not include a packet number and cannot be explicitly +acknowledged by a client. 
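The client-side Retry rules in the section added above can be sketched as a small state update. This is a minimal illustration in Python, not part of the patch or of any QUIC implementation: the names `RetryPacket`, `ClientHandshakeState`, `handle_retry`, and `validate_original_connection_id` are hypothetical, and packet parsing, key derivation, and error signalling are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetryPacket:
    # Hypothetical, already-parsed view of a Retry packet.
    source_cid: bytes
    original_destination_cid: bytes
    token: bytes


@dataclass
class ClientHandshakeState:
    # Hypothetical client bookkeeping for one connection attempt.
    source_cid: bytes                 # never changed in response to a Retry
    original_destination_cid: bytes   # DCID of the client's first Initial
    destination_cid: bytes            # DCID used for packets sent from now on
    token: bytes = b""
    server_packet_processed: bool = False  # an Initial or Retry was processed
    retry_processed: bool = False


def handle_retry(state: ClientHandshakeState, retry: RetryPacket) -> bool:
    """Apply a Retry packet; return False if it has to be discarded."""
    # At most one Retry is processed, and none after a server Initial.
    if state.server_packet_processed:
        return False
    # A mismatched Original Destination Connection ID is discarded.
    if retry.original_destination_cid != state.original_destination_cid:
        return False
    # Later Initial packets use the server-chosen connection ID and the token;
    # the client's own Source Connection ID stays the same.
    state.destination_cid = retry.source_cid
    state.token = retry.token
    state.server_packet_processed = True
    state.retry_processed = True
    return True


def validate_original_connection_id(state: ClientHandshakeState,
                                    param: Optional[bytes]) -> bool:
    """Check the original_connection_id transport parameter (None if absent)."""
    if state.retry_processed:
        return param == state.original_destination_cid
    return param is None
```

A caller would treat a `False` result from `validate_original_connection_id` as a connection error of type TRANSPORT_PARAMETER_ERROR, as the text requires; how that error is raised is outside this sketch.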
-For example, the eight octet sequence c2 19 7c 5e ff 14 e8 8c (in hexadecimal) -decodes to the decimal value 151288809941952652; the four octet sequence 9d 7f -3e 7d decodes to 494878333; the two octet sequence 7b bd decodes to 15293; and -the single octet 25 decodes to 37 (as does the two octet sequence 40 25). -Error codes ({{error-codes}}) are described using integers, but do not use this -encoding. + +# Frame Types and Formats {#frame-formats} + +As described in {{frames}}, packets contain one or more frames. This section +describes the format and semantics of the core QUIC frame types. ## PADDING Frame {#frame-padding} @@ -4155,7 +4227,7 @@ prevent the majority of middleboxes from losing state for UDP flows. ## BLOCKED Frame {#frame-blocked} A sender SHOULD send a BLOCKED frame (type=0x08) when it wishes to send data, -but is unable to due to connection-level flow control (see {{blocking}}). +but is unable to due to connection-level flow control (see {{flow-control}}). BLOCKED frames can be used as input to tuning of flow control algorithms (see {{fc-credit}}). @@ -4571,62 +4643,6 @@ CE Count: : A variable-length integer representing the total number packets received with the CE codepoint. -### Sending ACK Frames - -Implementations MUST NOT generate packets that only contain ACK frames in -response to packets which only contain ACK and PADDING frames. However, they -MUST acknowledge packets containing only ACK and PADDING frames when sending -ACK frames in response to other packets. Implementations MUST NOT send more -than one packet containing only an ACK frame per received packet that contains -frames other than ACK and PADDING frames. Packets containing frames besides -ACK and PADDING MUST be acknowledged immediately or when a delayed ack timer -expires. - -The receiver's delayed acknowledgment timer SHOULD NOT exceed the current RTT -estimate or the value it indicates in the `max_ack_delay` transport parameter. 
-This ensures an acknowledgment is sent at least once per RTT when packets -needing acknowledgement are received. The sender can use the receiver's -`max_ack_delay` value in determining timeouts for timer-based retransmission. - -An acknowledgment SHOULD be sent immediately after receiving 2 packets that -require acknowledgement, unless multiple packets are received together. - -To limit ACK Blocks to those that have not yet been received by the sender, the -receiver SHOULD track which ACK frames have been acknowledged by its peer. Once -an ACK frame has been acknowledged, the packets it acknowledges SHOULD NOT be -acknowledged again. - -Because ACK frames are not sent in response to ACK-only packets, a receiver that -is only sending ACK frames will only receive acknowledgements for its packets -if the sender includes them in packets with non-ACK frames. A sender SHOULD -bundle ACK frames with other frames when possible. - -Endpoints can only acknowledge packets sent in a particular packet number -space by sending ACK frames in packets from the same packet number space. - -To limit receiver state or the size of ACK frames, a receiver MAY limit the -number of ACK Blocks it sends. A receiver can do this even without receiving -acknowledgment of its ACK frames, with the knowledge this could cause the sender -to unnecessarily retransmit some data. Standard QUIC {{QUIC-RECOVERY}} -algorithms declare packets lost after sufficiently newer packets are -acknowledged. Therefore, the receiver SHOULD repeatedly acknowledge newly -received packets in preference to packets received in the past. - -### ACK Frames and Packet Protection - -ACK frames MUST only be carried in a packet that has the same packet -number space as the packet being ACKed (see {{packet-protected}}). For -instance, packets that are protected with 1-RTT keys MUST be -acknowledged in packets that are also protected with 1-RTT keys. 
- -Packets that a client sends with 0-RTT packet protection MUST be acknowledged by -the server in packets protected by 1-RTT keys. This can mean that the client is -unable to use these acknowledgments if the server cryptographic handshake -messages are delayed or lost. Note that the same limitation applies to other -data sent by the server protected by the 1-RTT keys. - -Endpoints SHOULD send acknowledgments for packets containing CRYPTO frames with -a reduced delay; see Section 4.3.1 of {{QUIC-RECOVERY}}. ## PATH_CHALLENGE Frame {#frame-path-challenge} @@ -4769,19 +4785,6 @@ The first byte in the stream has an offset of 0. The largest offset delivered on a stream - the sum of the re-constructed offset and data length - MUST be less than 2^62. -Stream multiplexing is achieved by interleaving STREAM frames from multiple -streams into one or more QUIC packets. A single QUIC packet can include -multiple STREAM frames from one or more streams. - -Implementation note: One of the benefits of QUIC is avoidance of head-of-line -blocking across multiple streams. When a packet loss occurs, only streams with -data in that packet are blocked waiting for a retransmission to be received, -while other streams can continue making progress. Note that when data from -multiple streams is bundled into a single QUIC packet, loss of that packet -blocks all those streams from making progress. An implementation is therefore -advised to bundle as few streams as necessary in outgoing packets without losing -transmission efficiency to underfilled packets. 
- ## CRYPTO Frame {#frame-crypto} From 209a2699855b8f8bf3fce6691cd9a64ee661ccbf Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Wed, 17 Oct 2018 20:01:36 -0700 Subject: [PATCH 32/57] detailed structure in intro --- draft-ietf-quic-transport.md | 200 +++++++++++++++++++---------------- 1 file changed, 109 insertions(+), 91 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 44218c1a40..7ccb6700a5 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -138,21 +138,30 @@ middleboxes. This document describes the core QUIC protocol, and is structured as follows: -* streams, QUIC's service abstraction to applications, including stream - multiplexing, and stream and connection-level flow control ({{streams}} - - {{flow-control}}); - -* connections, including version negotiation and establishment, usage, - migration, and shutdown ({{connections}} - {{termination}}); - -* error handling ({{error-handling}}); - -* packets and frames, including QUIC's model and mechanics of reliability - (acknowledgements and retransmission) and packet sizing ({{packets-frames}} - - {{packet-size}}); - -* wire format, including QUIC's version format, packet formats, frame formats, - and error codes ({{versions}} - {{error-codes}}). +* Streams are the basic service abstraction that QUIC provides. +** {{streams}} describes core concepts related to streams, +** {{stream-states}} provides a reference model for stream states, and +** {{flow-control}} outlines the operation of flow control. + +* Connections are the context in which QUIC endpoints communicate. +** {{connections}} describes core concepts related to connections, +** {{version-negotiation}} describes version negotiation, +** {{handshake}} detail the process for establishing connections, +** {{migration}} describes how endpoints migrate a connection to use a new network paths, and +** {{termination}} lists the options for terminating an open connection. 
+ +* Packets and frames are the basic unit used by QUIC to communicate. +** {{packets-frames}} describes concepts related to packets and frames, +** {{packetization}} defines models for the transmission, retransmission, and + acknowledgement of information, and +** {{packet-size}} contains a rules for managing the size of packets. + +* Details of encoding of QUIC protocol elements is described in: +** {{versions}} (Versions), +** {{packet-formats}} (Packet Headers), +** {{transport-parameter-encoding}} (Transport Parameters), +** {{frame-formats}} (Frames), and +** {{error-codes}} (Errors). Accompanying documents describe QUIC's loss detection and congestion control {{QUIC-RECOVERY}}, and the use of TLS 1.3 for key negotiation {{QUIC-TLS}}. @@ -1490,7 +1499,7 @@ response to connection migration ({{migration}}). NEW_CONNECTION_ID frames ({{frame-new-connection-id}}) are used to provide new connection ID values. -## Transport Parameters +## Transport Parameters {#transport-parameters} During connection establishment, both endpoints make authenticated declarations of their transport parameters. These declarations are made unilaterally by each @@ -1498,77 +1507,22 @@ endpoint. Endpoints are required to comply with the restrictions implied by these parameters; the description of each parameter includes rules for its handling. -The format of the transport parameters is the TransportParameters struct from -{{figure-transport-parameters}}. This is described using the presentation -language from Section 3 of {{!TLS13=RFC8446}}. +The encoding of the transport parameters is detailed in +{{transport-parameter-encoding}}. 
-~~~ - uint32 QuicVersion; - - enum { - initial_max_stream_data_bidi_local(0), - initial_max_data(1), - initial_max_bidi_streams(2), - idle_timeout(3), - preferred_address(4), - max_packet_size(5), - stateless_reset_token(6), - ack_delay_exponent(7), - initial_max_uni_streams(8), - disable_migration(9), - initial_max_stream_data_bidi_remote(10), - initial_max_stream_data_uni(11), - max_ack_delay(12), - original_connection_id(13), - (65535) - } TransportParameterId; - - struct { - TransportParameterId parameter; - opaque value<0..2^16-1>; - } TransportParameter; - - struct { - select (Handshake.msg_type) { - case client_hello: - QuicVersion initial_version; - - case encrypted_extensions: - QuicVersion negotiated_version; - QuicVersion supported_versions<4..2^8-4>; - }; - TransportParameter parameters<0..2^16-1>; - } TransportParameters; - - struct { - enum { IPv4(4), IPv6(6), (15) } ipVersion; - opaque ipAddress<4..2^8-1>; - uint16 port; - opaque connectionId<0..18>; - opaque statelessResetToken[16]; - } PreferredAddress; -~~~ -{: #figure-transport-parameters title="Definition of TransportParameters"} - -The `extension_data` field of the quic_transport_parameters extension defined in -{{QUIC-TLS}} contains a TransportParameters value. TLS encoding rules are -therefore used to describe the encoding of transport parameters. - -QUIC encodes transport parameters into a sequence of octets, which are then -included in the cryptographic handshake. Once the handshake completes, the -transport parameters declared by the peer are available. Each endpoint -validates the value provided by its peer. In particular, version negotiation -MUST be validated (see {{version-validation}}) before the connection -establishment is considered properly complete. +QUIC includes the encoded transport parameters in the cryptographic handshake. +Once the handshake completes, the transport parameters declared by the peer are +available. Each endpoint validates the value provided by its peer. 
In +particular, version negotiation MUST be validated (see {{version-validation}}) +before the connection establishment is considered properly complete. Definitions for each of the defined transport parameters are included in -{{transport-parameter-definitions}}. Any given parameter MUST appear -at most once in a given transport parameters extension. An endpoint MUST -treat receipt of duplicate transport parameters as a connection error of -type TRANSPORT_PARAMETER_ERROR. - +{{transport-parameter-definitions}}. Any given parameter MUST appear at most +once in a given transport parameters extension. An endpoint MUST treat receipt +of duplicate transport parameters as a connection error of type +TRANSPORT_PARAMETER_ERROR. -### Transport Parameter Definitions +### Transport Parameter Definitions {#transport-parameter-definitions} An endpoint MAY use the following transport parameters: @@ -3290,14 +3244,7 @@ using for private experimentation on the GitHub wiki at -# Packet Formats {#packet-formats} - -All numeric values are encoded in network byte order (that is, big-endian) and -all field sizes are in bits. Hexadecimal notation is used for describing the -value of fields. - - -## Variable-Length Integer Encoding {#integer-encoding} +# Variable-Length Integer Encoding {#integer-encoding} QUIC packets and frames commonly use a variable-length encoding for non-negative integer values. This encoding ensures that smaller integer values need fewer @@ -3328,6 +3275,15 @@ the single octet 25 decodes to 37 (as does the two octet sequence 40 25). Error codes ({{error-codes}}) and versions {{versions}} are described using integers, but do not use this encoding. + + +# Packet Formats {#packet-formats} + +All numeric values are encoded in network byte order (that is, big-endian) and +all field sizes are in bits. Hexadecimal notation is used for describing the +value of fields. 
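The variable-length integer encoding that this patch promotes to its own section can be illustrated with a short decoder and encoder. The sketch below is in Python with hypothetical function names (`decode_varint`, `encode_varint`). It follows the rule stated in the draft text: the two most significant bits of the first octet give the base 2 logarithm of the length, and the remaining bits carry the value in network byte order.

```python
from typing import Tuple


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one variable-length integer; return (value, octets consumed)."""
    first = data[offset]
    length = 1 << (first >> 6)          # 2-bit prefix: 1, 2, 4, or 8 octets
    if offset + length > len(data):
        raise ValueError("truncated variable-length integer")
    value = first & 0x3F                # low 6 bits of the first octet
    for octet in data[offset + 1:offset + length]:
        value = (value << 8) | octet    # remaining octets, big-endian
    return value, length


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer below 2^62 in the shortest form."""
    if value < 0:
        raise ValueError("value must be non-negative")
    for prefix, length in enumerate((1, 2, 4, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            encoded = (prefix << usable_bits) | value
            return encoded.to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")
```

Decoding the draft's own examples with this sketch gives the values the text lists, for instance `decode_varint(bytes.fromhex("7bbd"))` yields `(15293, 2)`. The decoder also accepts non-minimal encodings such as `40 25`, which the draft notes decodes to 37.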
+ + ## Packet Number Encoding and Decoding {#packet-encoding} @@ -3926,6 +3882,68 @@ A Retry packet does not include a packet number and cannot be explicitly acknowledged by a client. +# Transport Parameter Encoding {#transport-parameter-encoding} + +The format of the transport parameters is the TransportParameters struct from +{{figure-transport-parameters}}. This is described using the presentation +language from Section 3 of {{!TLS13=RFC8446}}. + +~~~ + uint32 QuicVersion; + + enum { + initial_max_stream_data_bidi_local(0), + initial_max_data(1), + initial_max_bidi_streams(2), + idle_timeout(3), + preferred_address(4), + max_packet_size(5), + stateless_reset_token(6), + ack_delay_exponent(7), + initial_max_uni_streams(8), + disable_migration(9), + initial_max_stream_data_bidi_remote(10), + initial_max_stream_data_uni(11), + max_ack_delay(12), + original_connection_id(13), + (65535) + } TransportParameterId; + + struct { + TransportParameterId parameter; + opaque value<0..2^16-1>; + } TransportParameter; + + struct { + select (Handshake.msg_type) { + case client_hello: + QuicVersion initial_version; + + case encrypted_extensions: + QuicVersion negotiated_version; + QuicVersion supported_versions<4..2^8-4>; + }; + TransportParameter parameters<0..2^16-1>; + } TransportParameters; + + struct { + enum { IPv4(4), IPv6(6), (15) } ipVersion; + opaque ipAddress<4..2^8-1>; + uint16 port; + opaque connectionId<0..18>; + opaque statelessResetToken[16]; + } PreferredAddress; +~~~ +{: #figure-transport-parameters title="Definition of TransportParameters"} + +The `extension_data` field of the quic_transport_parameters extension defined in +{{QUIC-TLS}} contains a TransportParameters value. TLS encoding rules are +therefore used to describe the encoding of transport parameters. + +QUIC encodes transport parameters into a sequence of octets, which are then +included in the cryptographic handshake. 
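As a rough illustration of the TLS presentation language used above, the `parameters<0..2^16-1>` vector of `TransportParameter` entries can be walked as follows. This Python sketch makes several assumptions: the function name `parse_parameters` is hypothetical, the input is only the `parameters` vector (the version fields that precede it in `TransportParameters` are not handled), and values are kept as opaque octets because their per-parameter encodings are defined elsewhere in the draft.

```python
import struct
from typing import Dict


def parse_parameters(buf: bytes) -> Dict[int, bytes]:
    """Parse a TLS-style vector of (uint16 id, opaque value<0..2^16-1>) entries."""
    if len(buf) < 2:
        raise ValueError("missing vector length")
    (total,) = struct.unpack_from("!H", buf, 0)   # 2-octet vector length
    if total != len(buf) - 2:
        raise ValueError("vector length does not match input")
    params: Dict[int, bytes] = {}
    pos, end = 2, 2 + total
    while pos < end:
        if end - pos < 4:
            raise ValueError("truncated parameter header")
        param_id, length = struct.unpack_from("!HH", buf, pos)
        pos += 4
        if end - pos < length:
            raise ValueError("truncated parameter value")
        if param_id in params:
            # The draft treats duplicates as TRANSPORT_PARAMETER_ERROR.
            raise ValueError("duplicate transport parameter")
        params[param_id] = buf[pos:pos + length]
        pos += length
    return params
```

For example, a vector holding `idle_timeout(3)` and `max_ack_delay(12)` with made-up values would parse into a dictionary keyed by 3 and 12; interpreting the value octets is left to the caller.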
+ + # Frame Types and Formats {#frame-formats} From d927b6af974a164f9fc14252f20997757fd4ef14 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Wed, 17 Oct 2018 20:06:28 -0700 Subject: [PATCH 33/57] most of martinduke comments --- draft-ietf-quic-transport.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 7ccb6700a5..d514619380 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -192,7 +192,8 @@ Endpoint: Stream: -: A logical, bi-directional channel of ordered bytes within a QUIC connection. +: A logical unidirectional or bidirectional channel of ordered bytes within a + QUIC connection. Connection: @@ -328,7 +329,7 @@ is a result of a change in the initial limits (see {{zerortt-parameters}}). A receiver cannot renege on an advertisement; that is, once a receiver advertises a stream ID via a MAX_STREAM_ID frame, advertising a smaller maximum -ID has no effect. A sender MUST ignore any MAX_STREAM_ID frame that does not +ID has no effect. A receiver MUST ignore any MAX_STREAM_ID frame that does not increase the maximum stream ID. @@ -344,7 +345,7 @@ is transmitted, when data is retransmitted after packet loss, or when data is delivered to the application at the receiver. When new data is to be sent on a stream, a sender MUST set the encapsulating -STREAM frame's offset field to the stream offset of the first byte of this new +STREAM frame's offset field to the stream offset of the first octet of this new data. The first octet of data on a stream has an offset of 0. An endpoint is expected to send every stream octet. The largest offset delivered on a stream MUST be less than 2^62. 
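
Reviewer note on the hunks above: this patch keeps the 2^62 bound on stream offsets, and an earlier patch in the series promotes "Variable-Length Integer Encoding" to a top-level section with the example that the single octet 25 decodes to 37, as does the two octet sequence 40 25. A minimal sketch of that encoding follows. It assumes the draft's usual layout: the two most significant bits of the first octet select a length of 1, 2, 4, or 8 octets, and the remaining bits carry the value in network byte order. The function names `encode_varint` and `decode_varint` are hypothetical, chosen for this illustration, and are not taken from the draft.

```python
from typing import Tuple

# Largest value a QUIC variable-length integer can carry (62 usable bits).
MAX_VARINT = 2**62 - 1


def encode_varint(value: int) -> bytes:
    """Encode value using the shortest of the 1, 2, 4 or 8 octet forms.

    Hypothetical helper for illustration; not an API defined by the draft.
    """
    if value < 0 or value > MAX_VARINT:
        raise ValueError("value does not fit in 62 bits")
    for prefix, length in ((0, 1), (1, 2), (2, 4), (3, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            # Place the 2-bit length prefix above the usable value bits.
            raw = value | (prefix << usable_bits)
            return raw.to_bytes(length, "big")
    raise AssertionError("unreachable: range was checked above")


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, number of octets consumed) from the start of data."""
    if not data:
        raise ValueError("empty input")
    # The top two bits of the first octet give log2 of the total length.
    length = 1 << (data[0] >> 6)
    if len(data) < length:
        raise ValueError("truncated variable-length integer")
    raw = int.from_bytes(data[:length], "big")
    # Mask off the 2-bit length prefix to recover the value.
    value = raw & ((1 << (8 * length - 2)) - 1)
    return value, length
```

Under these assumptions, `decode_varint(bytes.fromhex("25"))` and `decode_varint(bytes.fromhex("4025"))` both yield the value 37, matching the example in the draft text. The encoder rejects anything at or above 2^62, which is consistent with the limit on stream offsets.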
From af51da7f5ff2256f15efd3f8871dfc8aa29a523e Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Wed, 17 Oct 2018 20:13:21 -0700 Subject: [PATCH 34/57] added initial values in flow control --- draft-ietf-quic-transport.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index d514619380..d47536a256 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -735,6 +735,9 @@ to two levels of flow control in QUIC: * Connection flow control, which prevents senders from exceeding a receiver's buffer capacity for the connection, and +A data receiver sets initial credits for all streams by sending transport +parameters during the handshake ({{transport-parameters}}). + A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the sender to advertise additional credit. MAX_STREAM_DATA frames send the maximum absolute byte offset of a stream, while MAX_DATA frames send the maximum of the sum of From d5a1e49161dc7d43fca83fd2ab065faf10d5a394 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Wed, 17 Oct 2018 20:15:37 -0700 Subject: [PATCH 35/57] lint --- draft-ietf-quic-transport.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index d47536a256..25f2cd68d3 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -147,7 +147,8 @@ This document describes the core QUIC protocol, and is structured as follows: ** {{connections}} describes core concepts related to connections, ** {{version-negotiation}} describes version negotiation, ** {{handshake}} detail the process for establishing connections, -** {{migration}} describes how endpoints migrate a connection to use a new network paths, and +** {{migration}} describes how endpoints migrate a connection to use a new + network paths, and ** {{termination}} lists the options for terminating an open connection. 
* Packets and frames are the basic unit used by QUIC to communicate. From db2e9b240cdc86c8d7b061152cf05f701fe7c4fc Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 07:59:24 -0700 Subject: [PATCH 36/57] max_stream_id --- draft-ietf-quic-transport.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 25f2cd68d3..edd60cc1df 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -773,7 +773,8 @@ write but is blocked by flow control limits. These frames are expected to be sent infrequently in common cases, but they are considered useful for debugging and monitoring purposes. -(TODO: Add something about max_stream_id) +A similar method is used to control the number of open streams +see {{stream-limit-increment}} for details. ## Handling of Stream Cancellation From ca49ae4cdf1b2470de3e0cfa3b1b0ab9e7b28a62 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Thu, 18 Oct 2018 08:01:33 -0700 Subject: [PATCH 37/57] parenthesize --- draft-ietf-quic-transport.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index edd60cc1df..76668c3bbc 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -773,8 +773,8 @@ write but is blocked by flow control limits. These frames are expected to be sent infrequently in common cases, but they are considered useful for debugging and monitoring purposes. -A similar method is used to control the number of open streams -see {{stream-limit-increment}} for details. +A similar method is used to control the number of open streams (see +{{stream-limit-increment}} for details). 
## Handling of Stream Cancellation From 0368ca91704757d307072511cb38f0968b32f72a Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:02:00 -0700 Subject: [PATCH 38/57] parenthesize --- draft-ietf-quic-transport.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 76668c3bbc..db74bc7290 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -796,7 +796,7 @@ as well. The receiver must learn the number of bytes that were sent on the stream to make the same adjustment in its connection flow controller. To ensure that endpoints maintain a consistent connection-level flow control -state, the RST_STREAM frame {{frame-rst-stream}} includes the largest offset of +state, the RST_STREAM frame ({{frame-rst-stream}}) includes the largest offset of data sent on the stream. On receiving a RST_STREAM frame, a receiver definitively knows how many bytes were sent on that stream before the RST_STREAM frame, and the receiver MUST use the final offset to account for all bytes sent From 1d4f7d395a4dc6877b6ed81a77fc7d9032680ff6 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:02:46 -0700 Subject: [PATCH 39/57] stream-id-blocked frame --- draft-ietf-quic-transport.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index db74bc7290..6ee398cd56 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -890,6 +890,8 @@ but offers a few considerations. MAX_STREAM_ID frames constitute minimal overhead, while withholding MAX_STREAM_ID frames can prevent the peer from using the available parallelism. +The STREAM_ID_BLOCKED frame ({{frame-stream-id-blocked}}) can be +used to signal a shortage of available streams. Implementations will likely want to increase the maximum stream ID as peer-initiated streams close. 
A receiver MAY also advance the maximum stream ID based on current activity, system conditions, and other environmental factors. From c3426ee9042e67a9d8a4db4e34d22073e2c6cbdb Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:03:14 -0700 Subject: [PATCH 40/57] s/intertwines/combines --- draft-ietf-quic-transport.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 6ee398cd56..5eab0c6e98 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -900,7 +900,7 @@ based on current activity, system conditions, and other environmental factors. # Connections {#connections} A QUIC connection is a single conversation between two QUIC endpoints. QUIC's -connection establishment intertwines version negotiation with the cryptographic +connection establishment combines version negotiation with the cryptographic and transport handshakes to reduce connection establishment latency, as described in {{handshake}}. Once established, a connection may migrate to a different IP or port at either endpoint, due to NAT rebinding or mobility, as From 140a9c601c8a60967e3e06c6231231c5475f29b4 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:03:42 -0700 Subject: [PATCH 41/57] unnecessary text --- draft-ietf-quic-transport.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 5eab0c6e98..77a506269b 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -903,7 +903,7 @@ A QUIC connection is a single conversation between two QUIC endpoints. QUIC's connection establishment combines version negotiation with the cryptographic and transport handshakes to reduce connection establishment latency, as described in {{handshake}}. 
Once established, a connection may migrate to a -different IP or port at either endpoint, due to NAT rebinding or mobility, as +different IP or port at either endpoint as described in {{migration}}. Finally, a connection may be terminated by either endpoint, as described in {{termination}}. From 23d95c3fc683976506d9465083038dbe32f994dc Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:06:36 -0700 Subject: [PATCH 42/57] connection id para --- draft-ietf-quic-transport.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 77a506269b..614fb8c0c3 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -909,7 +909,10 @@ endpoint, as described in {{termination}}. ## Connection ID {#connection-id} -Each connection possesses a set of identifiers, any of which could be used to +Each connection possesses a set of connection identifiers, or connection IDs, +each of which can be identify the connection. Connection IDs are independently +selected by endpoints; each endpoint selects the connection IDs that its peer +uses. distinguish it from other connections. Connection IDs are selected independently in each direction. From d3cf19fd880fc281e208e75899834ff241b5f7c6 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:06:47 -0700 Subject: [PATCH 43/57] connection id contd --- draft-ietf-quic-transport.md | 1 - 1 file changed, 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 614fb8c0c3..cbc9837739 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -913,7 +913,6 @@ Each connection possesses a set of connection identifiers, or connection IDs, each of which can be identify the connection. Connection IDs are independently selected by endpoints; each endpoint selects the connection IDs that its peer uses. -distinguish it from other connections. 
Connection IDs are selected independently in each direction. The primary function of a connection ID is to ensure that changes in addressing From 4c9858f1a212fe7c7f871048389612a64dc2d922 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:06:56 -0700 Subject: [PATCH 44/57] connection id contd --- draft-ietf-quic-transport.md | 1 - 1 file changed, 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index cbc9837739..eefa2f0dc6 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -913,7 +913,6 @@ Each connection possesses a set of connection identifiers, or connection IDs, each of which can be identify the connection. Connection IDs are independently selected by endpoints; each endpoint selects the connection IDs that its peer uses. -independently in each direction. The primary function of a connection ID is to ensure that changes in addressing at lower protocol layers (UDP, IP, and below) don't cause packets for a QUIC From 0dfa88580bb9fc3aa32783624b60c4422d6b2b8f Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:07:21 -0700 Subject: [PATCH 45/57] s/subsequent/additional --- draft-ietf-quic-transport.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index eefa2f0dc6..ddeb5771b6 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -945,7 +945,9 @@ messages. The initial connection ID issued by an endpoint is sent in the Source Connection ID field of the long packet header ({{long-header}}) during the handshake. The sequence number of the initial connection ID is 0. If the preferred_address transport parameter is sent, the sequence number of the -supplied connection ID is 1. Subsequent connection IDs are communicated to the +supplied connection ID is 1. 
+ +Additional connection IDs are communicated to the peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}), and the sequence number on each newly-issued connection ID MUST increase by 1. The connection ID randomly selected by the client in the Initial packet and any From a6262f9116957886f682c5c3fb878afac512cbb8 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:07:36 -0700 Subject: [PATCH 46/57] shorter sentences --- draft-ietf-quic-transport.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index ddeb5771b6..4ab172fd5b 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -948,7 +948,7 @@ preferred_address transport parameter is sent, the sequence number of the supplied connection ID is 1. Additional connection IDs are communicated to the -peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}), and the +peer using NEW_CONNECTION_ID frames ({{frame-new-connection-id}}). The sequence number on each newly-issued connection ID MUST increase by 1. The connection ID randomly selected by the client in the Initial packet and any connection ID provided by a Reset packet are not assigned sequence numbers From b90815f228909371be135bb055993987a59dcf9b Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:08:34 -0700 Subject: [PATCH 47/57] packet protection TODO --- draft-ietf-quic-transport.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 4ab172fd5b..940c3ccba2 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -1012,7 +1012,8 @@ correspond to a single connection. Endpoints SHOULD send a Stateless Reset ({{stateless-reset}}) for any packets that cannot be attributed to an existing connection. 
- ### Client Packet Handling {#client-pkt-handling} From bbb241e8e59cf45f6643f95aeee63dd1ad932718 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:08:52 -0700 Subject: [PATCH 48/57] packet protection TODO contd --- draft-ietf-quic-transport.md | 1 - 1 file changed, 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 940c3ccba2..edbcedae4d 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -1014,7 +1014,6 @@ that cannot be attributed to an existing connection. Packets that are matched to an existing connection, but for which the endpoint cannot remove packet protection, are discarded. -here. We probably want that.--> ### Client Packet Handling {#client-pkt-handling} From 2a00e0b7316c06bec7693abf1ad1408b3b6fa96b Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 08:09:03 -0700 Subject: [PATCH 49/57] s/the/a --- draft-ietf-quic-transport.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index edbcedae4d..45ecdc7416 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -1018,7 +1018,7 @@ cannot remove packet protection, are discarded. ### Client Packet Handling {#client-pkt-handling} Valid packets sent to clients always include a Destination Connection ID that -matches the value the client selects. Clients that choose to receive +matches a value the client selects. Clients that choose to receive zero-length connection IDs can use the address/port tuple to identify a connection. Packets that don't match an existing connection are discarded. 
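
Reviewer note on the connection ID patches in this stretch of the series (PATCH 42/57 through 46/57): the reworded text says the initial connection ID has sequence number 0, the connection ID supplied in preferred_address has sequence number 1 if that parameter is sent, and each connection ID issued in a NEW_CONNECTION_ID frame increases the sequence number by 1. The sketch below is one way to read that rule. The class `ConnectionIdIssuer`, its method names, and the 8-octet default length are all assumptions made for illustration and do not come from the draft.

```python
import os


class ConnectionIdIssuer:
    """Tracks connection IDs one endpoint issues for its peer to use.

    Hypothetical bookkeeping helper; the list index doubles as the
    sequence number, so numbers increase by exactly 1 per issued ID.
    """

    def __init__(self, cid_length=8, has_preferred_address=False):
        self.cid_length = cid_length  # assumed length, for illustration only
        self.issued = []
        # Sequence 0: sent in the Source Connection ID field during the handshake.
        self._issue()
        if has_preferred_address:
            # Sequence 1: carried in the preferred_address transport parameter.
            self._issue()

    def _issue(self):
        cid = os.urandom(self.cid_length)
        self.issued.append(cid)
        return len(self.issued) - 1, cid

    def new_connection_id(self):
        """Return the (sequence, connection ID) pair for a NEW_CONNECTION_ID frame."""
        return self._issue()


# Without preferred_address, the first NEW_CONNECTION_ID carries sequence 1;
# with it, sequence 1 is already taken and the first frame carries sequence 2.
plain = ConnectionIdIssuer()
first_seq, _ = plain.new_connection_id()
preferred = ConnectionIdIssuer(has_preferred_address=True)
first_seq_preferred, _ = preferred.new_connection_id()
print(first_seq, first_seq_preferred)
```

Keeping the sequence number implicit in the list index is a design choice of this sketch: it makes a skipped or repeated sequence number impossible by construction, at the cost of never forgetting retired IDs.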
From 2986d5b90f1dd8317554d60e8b70c0e82871cd87 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 13:19:13 -0700 Subject: [PATCH 50/57] s/acks/acknowledges --- draft-ietf-quic-transport.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 45ecdc7416..f9404cda1c 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -1415,7 +1415,8 @@ Handshake[0]: CRYPTO[FIN], ACK[0] {{tls-0rtt-handshake}} shows an example of a connection with a 0-RTT handshake and a single packet of 0-RTT data. Note that as described in {{packet-numbers}}, -the server ACKs the 0-RTT data at the 1-RTT encryption level, and the client's +the server acknowledges 0-RTT data at the 1-RTT encryption level, and the +client sends 1-RTT packets in the same packet number space. sequence numbers at the 1-RTT encryption level continue to increment from its 0-RTT packets. From c631c3171ed7bead8240f83e3b48cd6eb52bf474 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 13:21:18 -0700 Subject: [PATCH 51/57] unnecessary text --- draft-ietf-quic-transport.md | 1 - 1 file changed, 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index f9404cda1c..94b900a2f6 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -1417,7 +1417,6 @@ Handshake[0]: CRYPTO[FIN], ACK[0] and a single packet of 0-RTT data. Note that as described in {{packet-numbers}}, the server acknowledges 0-RTT data at the 1-RTT encryption level, and the client sends 1-RTT packets in the same packet number space. -sequence numbers at the 1-RTT encryption level continue to increment from its 0-RTT packets. 
~~~~ From 6a980bfd5c8c1e8244045449ed7ed24ec7bf6301 Mon Sep 17 00:00:00 2001 From: Martin Thomson Date: Thu, 18 Oct 2018 13:21:28 -0700 Subject: [PATCH 52/57] unnecessary text --- draft-ietf-quic-transport.md | 1 - 1 file changed, 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 94b900a2f6..9868b9ce1a 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -1417,7 +1417,6 @@ Handshake[0]: CRYPTO[FIN], ACK[0] and a single packet of 0-RTT data. Note that as described in {{packet-numbers}}, the server acknowledges 0-RTT data at the 1-RTT encryption level, and the client sends 1-RTT packets in the same packet number space. -0-RTT packets. ~~~~ Client Server From d6c92675a9d3933838c0daa1b2dff1db17166012 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Thu, 18 Oct 2018 14:04:50 -0700 Subject: [PATCH 53/57] last of mt's comments --- draft-ietf-quic-transport.md | 286 ++++++++++++++++++----------------- 1 file changed, 147 insertions(+), 139 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 9868b9ce1a..9c79158ccd 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -1532,138 +1532,8 @@ once in a given transport parameters extension. An endpoint MUST treat receipt of duplicate transport parameters as a connection error of type TRANSPORT_PARAMETER_ERROR. -### Transport Parameter Definitions {#transport-parameter-definitions} - -An endpoint MAY use the following transport parameters: - -initial_max_data (0x0001): - -: The initial maximum data parameter contains the initial value for the maximum - amount of data that can be sent on the connection. This parameter is encoded - as an unsigned 32-bit integer in units of octets. This is equivalent to - sending a MAX_DATA ({{frame-max-data}}) for the connection immediately after - completing the handshake. If the transport parameter is absent, the connection - starts with a flow control limit of 0. 
- -initial_max_bidi_streams (0x0002): - -: The initial maximum bidirectional streams parameter contains the initial - maximum number of bidirectional streams the peer may initiate, encoded as an - unsigned 16-bit integer. If this parameter is absent or zero, bidirectional - streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this - parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) - immediately after completing the handshake containing the corresponding Stream - ID. For example, a value of 0x05 would be equivalent to receiving a - MAX_STREAM_ID containing 16 when received by a client or 17 when received by a - server. - -initial_max_uni_streams (0x0008): - -: The initial maximum unidirectional streams parameter contains the initial - maximum number of unidirectional streams the peer may initiate, encoded as an - unsigned 16-bit integer. If this parameter is absent or zero, unidirectional - streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this - parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) - immediately after completing the handshake containing the corresponding Stream - ID. For example, a value of 0x05 would be equivalent to receiving a - MAX_STREAM_ID containing 18 when received by a client or 19 when received by a - server. - -idle_timeout (0x0003): - -: The idle timeout is a value in seconds that is encoded as an unsigned 16-bit - integer. If this parameter is absent or zero then the idle timeout is - disabled. - -max_packet_size (0x0005): - -: The maximum packet size parameter places a limit on the size of packets that - the endpoint is willing to receive, encoded as an unsigned 16-bit integer. - This indicates that packets larger than this limit will be dropped. The - default for this parameter is the maximum permitted UDP payload of 65527. - Values below 1200 are invalid. This limit only applies to protected packets - ({{packet-protected}}). 
- -ack_delay_exponent (0x0007): - -: An 8-bit unsigned integer value indicating an exponent used to decode the ACK - Delay field in the ACK frame, see {{frame-ack}}. If this value is absent, a - default value of 3 is assumed (indicating a multiplier of 8). The default - value is also used for ACK frames that are sent in Initial and Handshake - packets. Values above 20 are invalid. - -disable_migration (0x0009): - -: The endpoint does not support connection migration ({{migration}}). Peers MUST - NOT send any packets, including probing packets ({{probing}}), from a local - address other than that used to perform the handshake. This parameter is a - zero-length value. - -max_ack_delay (0x000c): - -: An 8 bit unsigned integer value indicating the maximum amount of time in - milliseconds by which it will delay sending of acknowledgments. If this - value is absent, a default of 25 milliseconds is assumed. - -Either peer MAY advertise an initial value for the flow control on each type of -stream on which they might receive data. Each of the following transport -parameters is encoded as an unsigned 32-bit integer in units of octets: - -initial_max_stream_data_bidi_local (0x0000): - -: The initial stream maximum data for bidirectional, locally-initiated streams - parameter contains the initial flow control limit for newly created - bidirectional streams opened by the endpoint that sets the transport - parameter. In client transport parameters, this applies to streams with an - identifier ending in 0x0; in server transport parameters, this applies to - streams ending in 0x1. - -initial_max_stream_data_bidi_remote (0x000a): - -: The initial stream maximum data for bidirectional, peer-initiated streams - parameter contains the initial flow control limit for newly created - bidirectional streams opened by the endpoint that receives the transport - parameter. 
In client transport parameters, this applies to streams with an - identifier ending in 0x1; in server transport parameters, this applies to - streams ending in 0x0. - -initial_max_stream_data_uni (0x000b): - -: The initial stream maximum data for unidirectional streams parameter contains - the initial flow control limit for newly created unidirectional streams opened - by the endpoint that receives the transport parameter. In client transport - parameters, this applies to streams with an identifier ending in 0x3; in - server transport parameters, this applies to streams ending in 0x2. - -If present, transport parameters that set initial stream flow control limits are -equivalent to sending a MAX_STREAM_DATA frame ({{frame-max-stream-data}}) on -every stream of the corresponding type immediately after opening. If the -transport parameter is absent, streams of that type start with a flow control -limit of 0. - -A server MUST include the original_connection_id transport parameter if it sent -a Retry packet: - -original_connection_id (0x000d): - -: The value of the Destination Connection ID field from the first Initial packet - sent by the client. This transport parameter is only sent by the server. - -A server MAY include the following transport parameters: - -stateless_reset_token (0x0006): - -: The Stateless Reset Token is used in verifying a stateless reset, see - {{stateless-reset}}. This parameter is a sequence of 16 octets. - -preferred_address (0x0004): - -: The server's Preferred Address is used to effect a change in server address at - the end of the handshake, as described in {{preferred-address}}. - -A client MUST NOT include an original connection ID, a stateless reset token, or -a preferred address. A server MUST treat receipt of any of these transport -parameters as a connection error of type TRANSPORT_PARAMETER_ERROR. +A server MUST include the original_connection_id transport parameter +({{transport-parameter-definitions}}) if it sent a Retry packet. 
### Values of Transport Parameters for 0-RTT {#zerortt-parameters} @@ -1687,8 +1557,8 @@ might be violated by the client with its 0-RTT data. In particular, a server that accepts 0-RTT data MUST NOT set values for initial_max_data, initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, initial_max_stream_data_uni, initial_max_bidi_streams, or -initial_max_uni_streams that are smaller than the remembered value of those -parameters. +initial_max_uni_streams ({{transport-parameter-definitions}}) that are smaller +than the remembered value of those parameters. Omitting or setting a zero value for certain transport parameters can result in 0-RTT data being enabled, but not usable. The applicable subset of transport @@ -1727,8 +1597,9 @@ parameters are used to retroactively authenticate the choice of version (see {{version-negotiation}}). The cryptographic handshake provides integrity protection for the negotiated -version as part of the transport parameters (see {{transport-parameters}}). As -a result, attacks on version negotiation by an attacker can be detected. +version as part of the transport parameters (see +{{transport-parameter-definitions}}). As a result, attacks on version +negotiation by an attacker can be detected. The client includes the initial_version field in its transport parameters. The initial_version is the version that the client initially attempted to use. If @@ -1751,7 +1622,8 @@ differs from the QUIC version that is in use, the server MUST terminate the connection with a VERSION_NEGOTIATION_ERROR error. The server includes both the version of QUIC that is in use and a list of the -QUIC versions that the server supports. +QUIC versions that the server supports (see +{{transport-parameter-definitions}}). The negotiated_version field is the version that is in use. This MUST be set by the server to the value that is on the Initial packet that it accepts (not an @@ -3296,8 +3168,6 @@ value of fields. 
## Packet Number Encoding and Decoding {#packet-encoding} - - Packet numbers in long and short packet headers are encoded as follows. The number of bits required to represent the packet number is first reduced by including only a variable number of the least significant bits of the packet @@ -3710,6 +3580,8 @@ subsequent to the first do not need to fit within a single UDP datagram. ### Starting Packet Numbers + + The first Initial packet sent by either endpoint contains a packet number of 0. The packet number MUST increase monotonically thereafter. Initial packets are in a different packet number space to other packets (see @@ -3954,6 +3826,142 @@ QUIC encodes transport parameters into a sequence of octets, which are then included in the cryptographic handshake. +### Transport Parameter Definitions {#transport-parameter-definitions} + + + +An endpoint MAY use the following transport parameters: + +idle_timeout (0x0003): + +: The idle timeout is a value in seconds that is encoded as an unsigned 16-bit + integer. If this parameter is absent or zero then the idle timeout is + disabled. + +max_packet_size (0x0005): + +: The maximum packet size parameter places a limit on the size of packets that + the endpoint is willing to receive, encoded as an unsigned 16-bit integer. + This indicates that packets larger than this limit will be dropped. The + default for this parameter is the maximum permitted UDP payload of 65527. + Values below 1200 are invalid. This limit only applies to protected packets + ({{packet-protected}}). + +ack_delay_exponent (0x0007): + +: An 8-bit unsigned integer value indicating an exponent used to decode the ACK + Delay field in the ACK frame, see {{frame-ack}}. If this value is absent, a + default value of 3 is assumed (indicating a multiplier of 8). The default + value is also used for ACK frames that are sent in Initial and Handshake + packets. Values above 20 are invalid. 
+ +disable_migration (0x0009): + +: The endpoint does not support connection migration ({{migration}}). Peers MUST + NOT send any packets, including probing packets ({{probing}}), from a local + address other than that used to perform the handshake. This parameter is a + zero-length value. + +max_ack_delay (0x000c): + +: An 8 bit unsigned integer value indicating the maximum amount of time in + milliseconds by which it will delay sending of acknowledgments. If this + value is absent, a default of 25 milliseconds is assumed. + +Either peer MAY advertise an initial value for flow control of each type of +stream on which they might receive data. Each of the following transport +parameters is encoded as an unsigned 32-bit integer in units of octets: + +initial_max_stream_data_bidi_local (0x0000): + +: The initial stream maximum data for bidirectional, locally-initiated streams + parameter contains the initial flow control limit for newly created + bidirectional streams opened by the endpoint that sets the transport + parameter. In client transport parameters, this applies to streams with an + identifier ending in 0x0; in server transport parameters, this applies to + streams ending in 0x1. + +initial_max_stream_data_bidi_remote (0x000a): + +: The initial stream maximum data for bidirectional, peer-initiated streams + parameter contains the initial flow control limit for newly created + bidirectional streams opened by the endpoint that receives the transport + parameter. In client transport parameters, this applies to streams with an + identifier ending in 0x1; in server transport parameters, this applies to + streams ending in 0x0. + +initial_max_stream_data_uni (0x000b): + +: The initial stream maximum data for unidirectional streams parameter contains + the initial flow control limit for newly created unidirectional streams opened + by the endpoint that receives the transport parameter. 
In client transport + parameters, this applies to streams with an identifier ending in 0x3; in + server transport parameters, this applies to streams ending in 0x2. + +If present, transport parameters that set initial flow control limits +(initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, and +initial_max_stream_data_uni) are equivalent to sending a MAX_STREAM_DATA frame +({{frame-max-stream-data}}) on every stream of the corresponding type +immediately after opening. If the transport parameter is absent, streams of +that type start with a flow control limit of 0. + +initial_max_data (0x0001): + +: The initial maximum data parameter contains the initial value for the maximum + amount of data that can be sent on the connection. This parameter is encoded + as an unsigned 32-bit integer in units of octets. This is equivalent to + sending a MAX_DATA ({{frame-max-data}}) for the connection immediately after + completing the handshake. If the transport parameter is absent, the connection + starts with a flow control limit of 0. + +initial_max_bidi_streams (0x0002): + +: The initial maximum bidirectional streams parameter contains the initial + maximum number of bidirectional streams the peer may initiate, encoded as an + unsigned 16-bit integer. If this parameter is absent or zero, bidirectional + streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this + parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) + immediately after completing the handshake containing the corresponding Stream + ID. For example, a value of 0x05 would be equivalent to receiving a + MAX_STREAM_ID containing 16 when received by a client or 17 when received by a + server. + +initial_max_uni_streams (0x0008): + +: The initial maximum unidirectional streams parameter contains the initial + maximum number of unidirectional streams the peer may initiate, encoded as an + unsigned 16-bit integer. 
If this parameter is absent or zero, unidirectional + streams cannot be created until a MAX_STREAM_ID frame is sent. Setting this + parameter is equivalent to sending a MAX_STREAM_ID ({{frame-max-stream-id}}) + immediately after completing the handshake containing the corresponding Stream + ID. For example, a value of 0x05 would be equivalent to receiving a + MAX_STREAM_ID containing 18 when received by a client or 19 when received by a + server. + +A server MUST include the following transport parameter if it sent a Retry packet: + +original_connection_id (0x000d): + +: The value of the Destination Connection ID field from the first Initial packet + sent by the client. This transport parameter is only sent by the server. + +A server MAY include the following transport parameters: + +stateless_reset_token (0x0006): + +: The Stateless Reset Token is used in verifying a stateless reset, see + {{stateless-reset}}. This parameter is a sequence of 16 octets. + +preferred_address (0x0004): + +: The server's Preferred Address is used to effect a change in server address at + the end of the handshake, as described in {{preferred-address}}. + +A client MUST NOT include an original connection ID, a stateless reset token, or +a preferred address. A server MUST treat receipt of any of these transport +parameters as a connection error of type TRANSPORT_PARAMETER_ERROR. + + # Frame Types and Formats {#frame-formats} From e13b6b3262ea6c0d85a2ed46510107026d59c4d0 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Thu, 18 Oct 2018 14:07:58 -0700 Subject: [PATCH 54/57] lint --- draft-ietf-quic-transport.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 9c79158ccd..099479f73c 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -3938,7 +3938,8 @@ initial_max_uni_streams (0x0008): MAX_STREAM_ID containing 18 when received by a client or 19 when received by a server. 
-A server MUST include the following transport parameter if it sent a Retry packet: +A server MUST include the following transport parameter if it sent a Retry +packet: original_connection_id (0x000d): From dd74eda33def1e3e412420c977e3913493a53cdf Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Thu, 18 Oct 2018 14:16:36 -0700 Subject: [PATCH 55/57] lint --- draft-ietf-quic-transport.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 099479f73c..4f4becee25 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -796,8 +796,8 @@ as well. The receiver must learn the number of bytes that were sent on the stream to make the same adjustment in its connection flow controller. To ensure that endpoints maintain a consistent connection-level flow control -state, the RST_STREAM frame ({{frame-rst-stream}}) includes the largest offset of -data sent on the stream. On receiving a RST_STREAM frame, a receiver +state, the RST_STREAM frame ({{frame-rst-stream}}) includes the largest offset +of data sent on the stream. On receiving a RST_STREAM frame, a receiver definitively knows how many bytes were sent on that stream before the RST_STREAM frame, and the receiver MUST use the final offset to account for all bytes sent on the stream in its connection level flow controller. From 84b5c577281f159f67aa76873f311eb16031a7a4 Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Thu, 18 Oct 2018 14:25:27 -0700 Subject: [PATCH 56/57] correct changelog --- draft-ietf-quic-transport.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 4f4becee25..0dc2641955 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -5362,7 +5362,7 @@ DecodePacketNumber(largest_pn, truncated_pn, pn_nbits): Issue and pull request numbers are listed with a leading octothorp. 
-## Since draft-ietf-quic-transport-14 +## Since draft-ietf-quic-transport-15 Substantial editorial reorganization; no technical changes. From bce33e240951c6546e3f064b4658ec3c335fd17a Mon Sep 17 00:00:00 2001 From: Jana Iyengar Date: Thu, 18 Oct 2018 14:51:19 -0700 Subject: [PATCH 57/57] fix structure in intro --- draft-ietf-quic-transport.md | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/draft-ietf-quic-transport.md b/draft-ietf-quic-transport.md index 87807ba870..95223f0b38 100644 --- a/draft-ietf-quic-transport.md +++ b/draft-ietf-quic-transport.md @@ -139,30 +139,30 @@ middleboxes. This document describes the core QUIC protocol, and is structured as follows: * Streams are the basic service abstraction that QUIC provides. -** {{streams}} describes core concepts related to streams, -** {{stream-states}} provides a reference model for stream states, and -** {{flow-control}} outlines the operation of flow control. + - {{streams}} describes core concepts related to streams, + - {{stream-states}} provides a reference model for stream states, and + - {{flow-control}} outlines the operation of flow control. * Connections are the context in which QUIC endpoints communicate. -** {{connections}} describes core concepts related to connections, -** {{version-negotiation}} describes version negotiation, -** {{handshake}} detail the process for establishing connections, -** {{migration}} describes how endpoints migrate a connection to use a new - network paths, and -** {{termination}} lists the options for terminating an open connection. + - {{connections}} describes core concepts related to connections, + - {{version-negotiation}} describes version negotiation, + - {{handshake}} details the process for establishing connections, + - {{migration}} describes how endpoints migrate a connection to use a new + network path, and + - {{termination}} lists the options for terminating an open connection.
* Packets and frames are the basic unit used by QUIC to communicate. -** {{packets-frames}} describes concepts related to packets and frames, -** {{packetization}} defines models for the transmission, retransmission, and - acknowledgement of information, and -** {{packet-size}} contains a rules for managing the size of packets. + - {{packets-frames}} describes concepts related to packets and frames, + - {{packetization}} defines models for the transmission, retransmission, and + acknowledgement of information, and + - {{packet-size}} contains rules for managing the size of packets. * Details of encoding of QUIC protocol elements is described in: -** {{versions}} (Versions), -** {{packet-formats}} (Packet Headers), -** {{transport-parameter-encoding}} (Transport Parameters), -** {{frame-formats}} (Frames), and -** {{error-codes}} (Errors). + - {{versions}} (Versions), + - {{packet-formats}} (Packet Headers), + - {{transport-parameter-encoding}} (Transport Parameters), + - {{frame-formats}} (Frames), and + - {{error-codes}} (Errors). Accompanying documents describe QUIC's loss detection and congestion control {{QUIC-RECOVERY}}, and the use of TLS 1.3 for key negotiation {{QUIC-TLS}}.
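
Reviewer note on the Transport Parameter Definitions text moved earlier in this series: the worked examples for initial_max_bidi_streams and initial_max_uni_streams (a value of 0x05 mapping to MAX_STREAM_ID 16/17 and 18/19) can be checked with a small calculation. The sketch below is only an illustration, not part of the draft. The function name `equivalent_max_stream_id` and its arguments are hypothetical, and it assumes the low two bits of a stream ID carry the initiator and directionality as the parameter definitions describe (0x0 and 0x2 client-initiated, 0x1 and 0x3 server-initiated).

```python
from typing import Optional


def equivalent_max_stream_id(count: int, receiver: str,
                             unidirectional: bool) -> Optional[int]:
    """Return the Stream ID an equivalent MAX_STREAM_ID frame would carry.

    Hypothetical helper for illustration only.
    count          -- value of initial_max_bidi_streams or
                      initial_max_uni_streams (unsigned 16-bit integer)
    receiver       -- "client" or "server": the endpoint that receives the
                      transport parameter, which is the endpoint allowed to
                      initiate the streams
    unidirectional -- True for initial_max_uni_streams
    """
    if not 0 <= count <= 0xFFFF:
        raise ValueError("count must fit in an unsigned 16-bit integer")
    if receiver not in ("client", "server"):
        raise ValueError("receiver must be 'client' or 'server'")
    if count == 0:
        # Absent or zero: no stream of this type can be created until a
        # MAX_STREAM_ID frame is sent, so there is no equivalent Stream ID.
        return None
    initiator_bit = 0 if receiver == "client" else 1  # 0 = client-initiated
    direction_bit = 2 if unidirectional else 0        # 2 = unidirectional
    # Stream IDs of one type are spaced four apart, starting at the type bits.
    return 4 * (count - 1) + (initiator_bit | direction_bit)


if __name__ == "__main__":
    for uni in (False, True):
        for side in ("client", "server"):
            print(side, "uni" if uni else "bidi",
                  equivalent_max_stream_id(5, side, uni))
```

With a count of 5 this reproduces the four values quoted in the definitions, which suggests the examples in the moved text are internally consistent.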