diff --git a/ianswett-min-cwnd/draft-ietf-quic-http.html b/ianswett-min-cwnd/draft-ietf-quic-http.html
index a70683f489..f0b29c4975 100644
--- a/ianswett-min-cwnd/draft-ietf-quic-http.html
+++ b/ianswett-min-cwnd/draft-ietf-quic-http.html
@@ -831,7 +831,7 @@
- This Internet-Draft will expire on 21 October 2020.¶
+ This Internet-Draft will expire on 22 October 2020.¶

3.1.7. Probe Timeout Replaces RTO and TLP¶
QUIC recommends a minimum congestion window of 2 packets instead of TCP's 1. Two packets avoid waiting for a delayed acknowledgement and allow the PTO to

@@ -2251,10 +2251,10 @@
- This Internet-Draft will expire on 21 October 2020.¶
+ This Internet-Draft will expire on 22 October 2020.¶

Internet-Draft | HTTP/3 | March 2020 |
Bishop | Expires 22 September 2020 | [Page] |
The QUIC transport protocol has several features that are desirable in a transport for HTTP, such as stream multiplexing, per-stream flow control, and low-latency connection establishment. This document describes a mapping of HTTP semantics over QUIC. This document also identifies HTTP/2 features that are subsumed by QUIC, and describes how HTTP/2 extensions can be ported to HTTP/3.¶

Discussion of this draft takes place on the QUIC working group mailing list (quic@ietf.org), which is archived at https://mailarchive.ietf.org/arch/search/?email_list=quic.¶

Working Group information can be found at https://github.com/quicwg; source code and issues list for this draft can be found at https://github.com/quicwg/base-drafts/labels/-http.¶

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶

This Internet-Draft will expire on 22 September 2020.¶

Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.¶

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
HTTP semantics are used for a broad range of services on the Internet. These semantics have commonly been used with two different TCP mappings, HTTP/1.1 and HTTP/2. HTTP/3 supports the same semantics over a new transport protocol, QUIC.¶

HTTP/1.1 is a TCP mapping which uses whitespace-delimited text fields to convey HTTP messages. While these exchanges are human-readable, using whitespace for message formatting leads to parsing difficulties and requires workarounds to tolerate variant behavior. Because each connection can transfer only a single HTTP request or response at a time in each direction, multiple parallel TCP connections are often used, reducing the ability of the congestion controller to accurately manage traffic between endpoints.¶

HTTP/2 introduced a binary framing and multiplexing layer to improve latency without modifying the transport layer. However, because the parallel nature of HTTP/2's multiplexing is not visible to TCP's loss recovery mechanisms, a lost or reordered packet causes all active transactions to experience a stall regardless of whether that transaction was impacted by the lost packet.¶

The QUIC transport protocol incorporates stream multiplexing and per-stream flow control, similar to that provided by the HTTP/2 framing layer. By providing reliability at the stream level and congestion control across the entire connection, it has the capability to improve the performance of HTTP compared to a TCP mapping. QUIC also incorporates TLS 1.3 [TLS13] at the transport layer, offering comparable security to running TLS over TCP, with the improved connection setup latency of TCP Fast Open [TFO].¶

This document defines a mapping of HTTP semantics over the QUIC transport protocol, drawing heavily on the design of HTTP/2. While delegating stream lifetime and flow control issues to QUIC, a similar binary framing is used on each stream. Some HTTP/2 features are subsumed by QUIC, while other features are implemented atop QUIC.¶

QUIC is described in [QUIC-TRANSPORT]. For a full description of HTTP/2, see [HTTP2].¶

HTTP/3 provides a transport for HTTP semantics using the QUIC transport protocol and an internal framing layer similar to HTTP/2.¶

Once a client knows that an HTTP/3 server exists at a certain endpoint, it opens a QUIC connection. QUIC provides protocol negotiation, stream-based multiplexing, and flow control. An HTTP/3 endpoint can be discovered using HTTP Alternative Services; this process is described in greater detail in Section 3.2.¶

Within each stream, the basic unit of HTTP/3 communication is a frame (Section 7.2). Each frame type serves a different purpose. For example, HEADERS and DATA frames form the basis of HTTP requests and responses (Section 4.1).¶

Multiplexing of requests is performed using the QUIC stream abstraction, described in Section 2 of [QUIC-TRANSPORT]. Each request-response pair consumes a single QUIC stream. Streams are independent of each other, so one stream that is blocked or suffers packet loss does not prevent progress on other streams.¶

Server push is an interaction mode introduced in HTTP/2 [HTTP2] which permits a server to push a request-response exchange to a client in anticipation of the client making the indicated request. This trades off network usage against a potential latency gain. Several HTTP/3 frames are used to manage server push, such as PUSH_PROMISE, MAX_PUSH_ID, and CANCEL_PUSH.¶

As in HTTP/2, request and response headers are compressed for transmission. Because HPACK [HPACK] relies on in-order transmission of compressed header blocks (a guarantee not provided by QUIC), HTTP/3 replaces HPACK with QPACK [QPACK]. QPACK uses separate unidirectional streams to modify and track header table state, while header blocks refer to the state of the table without modifying it.¶

The following sections provide a detailed overview of the connection lifecycle and key concepts:¶

The details of the wire protocol and interactions with the transport are described in subsequent sections:¶

Additional resources are provided in the final sections:¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶

Field definitions are given in Augmented Backus-Naur Form (ABNF), as defined in [RFC5234].¶

This document uses the variable-length integer encoding from [QUIC-TRANSPORT].¶
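The variable-length integer encoding uses the two most significant bits of the first byte to signal the total encoded length (1, 2, 4, or 8 bytes), with the remaining bits carrying the value in network byte order. A minimal sketch, assuming the encoding rules from [QUIC-TRANSPORT]:

```python
def encode_varint(v: int) -> bytes:
    # QUIC variable-length integer: the two high bits of the first byte
    # encode log2 of the length (00=1, 01=2, 10=4, 11=8 bytes).
    if v < 0x40:
        return v.to_bytes(1, "big")
    if v < 0x4000:
        return (v | 0x4000).to_bytes(2, "big")
    if v < 0x40000000:
        return (v | 0x80000000).to_bytes(4, "big")
    if v < 0x4000000000000000:
        return (v | 0xC000000000000000).to_bytes(8, "big")
    raise ValueError("value too large for a variable-length integer")

def decode_varint(data: bytes):
    # Returns (value, number of bytes consumed).
    length = 1 << (data[0] >> 6)
    value = int.from_bytes(data[:length], "big") & ((1 << (8 * length - 2)) - 1)
    return value, length
```

For example, 15293 encodes on two bytes as 0x7bbd, and the decoder recovers the value plus the length consumed, which lets a parser walk a sequence of varint-prefixed fields.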
The following terms are used:¶

The term "payload body" is defined in Section 3.3 of [RFC7230].¶

Finally, the terms "gateway", "intermediary", "proxy", and "tunnel" are defined in Section 2.3 of [RFC7230]. Intermediaries act as both client and server at different times.¶
HTTP/3 uses the token "h3" to identify itself in ALPN and Alt-Svc. Only implementations of the final, published RFC can identify themselves as "h3". Until such an RFC exists, implementations MUST NOT identify themselves using this string.¶

Implementations of draft versions of the protocol MUST add the string "-" and the corresponding draft number to the identifier. For example, draft-ietf-quic-http-01 is identified using the string "h3-01".¶

Draft versions MUST use the corresponding draft transport version as their transport. For example, the application protocol defined in draft-ietf-quic-http-25 uses the transport defined in draft-ietf-quic-transport-25.¶

Non-compatible experiments that are based on these draft versions MUST append the string "-" and an experiment name to the identifier. For example, an experimental implementation based on draft-ietf-quic-http-09 which reserves an extra stream for unsolicited transmission of 1980s pop music might identify itself as "h3-09-rickroll". Note that any label MUST conform to the "token" syntax defined in Section 3.2.6 of [RFC7230]. Experimenters are encouraged to coordinate their experiments on the quic@ietf.org mailing list.¶
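The token rules above can be sketched as a small helper; the function name `alpn_token` is hypothetical, not from the draft:

```python
def alpn_token(draft=None, experiment=None) -> str:
    # "h3" identifies the final RFC; "h3-NN" identifies draft NN
    # (zero-padded to two digits); a further "-name" suffix marks a
    # non-compatible experiment, e.g. "h3-09-rickroll".
    token = "h3"
    if draft is not None:
        token += "-{:02d}".format(draft)
    if experiment is not None:
        token += "-" + experiment
    return token
```

A draft-25 implementation would thus offer "h3-25" in its TLS ALPN extension, never the bare "h3".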
An HTTP origin advertises the availability of an equivalent HTTP/3 endpoint via the Alt-Svc HTTP response header field or the HTTP/2 ALTSVC frame ([ALTSVC]), using the ALPN token defined in Section 3.3.¶

For example, an origin could indicate in an HTTP response that HTTP/3 was available on UDP port 50781 at the same hostname by including the following header field:¶

Alt-Svc: h3=":50781"¶
On receipt of an Alt-Svc record indicating HTTP/3 support, a client MAY attempt to establish a QUIC connection to the indicated host and port and, if successful, send HTTP requests using the mapping described in this document.¶

Connectivity problems (e.g., firewall blocking UDP) can result in QUIC connection establishment failure, in which case the client SHOULD continue using the existing connection or try another alternative endpoint offered by the origin.¶

Servers MAY serve HTTP/3 on any UDP port, since an alternative always includes an explicit port.¶
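A minimal sketch of extracting (protocol, host, port) tuples from an Alt-Svc value like the example above; this is an illustration only, and it deliberately ignores Alt-Svc parameters such as `ma=` that real header values may carry:

```python
def parse_alt_svc(value: str):
    # Split a bare Alt-Svc value into (protocol, host, port) tuples.
    # An empty host means "same host as the origin".
    entries = []
    for item in value.split(","):
        proto, _, authority = item.strip().partition("=")
        host, _, port = authority.strip('"').rpartition(":")
        entries.append((proto, host, int(port)))
    return entries
```

For `Alt-Svc: h3=":50781"` this yields one entry with an empty host, telling the client to try QUIC on UDP port 50781 at the origin's own hostname.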
HTTP/3 relies on QUIC version 1 as the underlying transport. The use of other QUIC transport versions with HTTP/3 MAY be defined by future specifications.¶

QUIC version 1 uses TLS version 1.3 or greater as its handshake protocol. HTTP/3 clients MUST support a mechanism to indicate the target host to the server during the TLS handshake. If the server is identified by a DNS name, clients MUST send the Server Name Indication (SNI) [RFC6066] TLS extension unless an alternative mechanism to indicate the target host is used.¶

QUIC connections are established as described in [QUIC-TRANSPORT]. During connection establishment, HTTP/3 support is indicated by selecting the ALPN token "h3" in the TLS handshake. Support for other application-layer protocols MAY be offered in the same handshake.¶

While connection-level options pertaining to the core QUIC protocol are set in the initial crypto handshake, HTTP/3-specific settings are conveyed in the SETTINGS frame. After the QUIC connection is established, a SETTINGS frame (Section 7.2.4) MUST be sent by each endpoint as the initial frame of their respective HTTP control stream (see Section 6.2.1).¶

HTTP/3 connections are persistent across multiple requests. For best performance, it is expected that clients will not close connections until it is determined that no further communication with a server is necessary (for example, when a user navigates away from a particular web page) or until the server closes the connection.¶

Once a connection exists to a server endpoint, this connection MAY be reused for requests with multiple different URI authority components. The client MAY send any requests for which the client considers the server authoritative.¶

An authoritative HTTP/3 endpoint is typically discovered because the client has received an Alt-Svc record from the request's origin which nominates the endpoint as a valid HTTP Alternative Service for that origin. As required by [RFC7838], clients MUST check that the nominated server can present a valid certificate for the origin before considering it authoritative. Clients MUST NOT assume that an HTTP/3 endpoint is authoritative for other origins without an explicit signal.¶

Clients SHOULD NOT open more than one HTTP/3 connection to a given host and port pair, where the host is derived from a URI, a selected alternative service [ALTSVC], or a configured proxy. A client MAY open multiple connections to the same IP address and UDP port using different transport or TLS configurations but SHOULD avoid creating multiple connections with the same configuration.¶
Prior to making requests for an origin whose scheme is not "https", the client MUST ensure the server is willing to serve that scheme. If the client intends to make requests for an origin whose scheme is "http", this means that it MUST obtain a valid http-opportunistic response for the origin as described in [RFC8164] prior to making any such requests. Other schemes might define other mechanisms.¶
Servers are encouraged to maintain open connections for as long as possible but are permitted to terminate idle connections if necessary. When either endpoint chooses to close the HTTP/3 session, the terminating endpoint SHOULD first send a GOAWAY frame (Section 5.2) so that both endpoints can reliably determine whether previously sent frames have been processed and gracefully complete or terminate any necessary remaining tasks.¶

A server that does not wish clients to reuse connections for a particular origin can indicate that it is not authoritative for a request by sending a 421 (Misdirected Request) status code in response to the request (see Section 9.1.2 of [HTTP2]).¶

A client sends an HTTP request on a client-initiated bidirectional QUIC stream. A client MUST send only a single request on a given stream. A server sends zero or more non-final HTTP responses on the same stream as the request, followed by a single final HTTP response, as detailed below.¶

Pushed responses are sent on a server-initiated unidirectional QUIC stream (see Section 6.2.2). A server sends zero or more non-final HTTP responses, followed by a single final HTTP response, in the same manner as a standard response. Push is described in more detail in Section 4.4.¶

On a given stream, receipt of multiple requests or receipt of an additional HTTP response following a final HTTP response MUST be treated as malformed (Section 4.1.3).¶

An HTTP message (request or response) consists of:¶

Receipt of DATA and HEADERS frames in any other sequence MUST be treated as a connection error of type H3_FRAME_UNEXPECTED (Section 8).¶

A server MAY send one or more PUSH_PROMISE frames (see Section 7.2.5) before, after, or interleaved with the frames of a response message. These PUSH_PROMISE frames are not part of the response; see Section 4.4 for more details. These frames are not permitted in pushed responses; a pushed response which includes PUSH_PROMISE frames MUST be treated as a connection error of type H3_FRAME_UNEXPECTED.¶

Frames of unknown types (Section 9), including reserved frames (Section 7.2.8), MAY be sent on a request or push stream before, after, or interleaved with other frames described in this section.¶

The HEADERS and PUSH_PROMISE frames might reference updates to the QPACK dynamic table. While these updates are not directly part of the message exchange, they must be received and processed before the message can be consumed. See Section 4.1.1 for more details.¶

The "chunked" transfer encoding defined in Section 4.1 of [RFC7230] MUST NOT be used.¶

A response MAY consist of multiple messages when and only when one or more informational responses (1xx; see Section 6.2 of [RFC7231]) precede a final response to the same request. Non-final responses do not contain a payload body or trailers.¶
If an endpoint receives an invalid sequence of frames on either a request or a push stream, it MUST respond with a connection error of type H3_FRAME_UNEXPECTED (Section 8). In particular, a DATA frame before any HEADERS frame, or a HEADERS or DATA frame after the trailing HEADERS frame, is considered invalid.¶
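The valid HEADERS/DATA ordering can be sketched as a tiny state machine. This is a simplification for illustration: it collapses non-final (1xx) header sections into a single header state and ignores PUSH_PROMISE and unknown frame types, which the text above permits to appear interleaved:

```python
def check_frame_sequence(frames) -> bool:
    # Valid shape on a request stream: header section, then optional
    # DATA frames, then an optional trailing header section, then nothing.
    # Any other sequence is H3_FRAME_UNEXPECTED.
    state = "start"
    for frame in frames:
        if state == "start" and frame == "HEADERS":
            state = "message"          # message header received
        elif state == "message" and frame == "DATA":
            state = "message"          # body frames may repeat
        elif state == "message" and frame == "HEADERS":
            state = "trailers"         # trailing header section
        else:
            return False               # invalid sequence
    return True
```

A DATA frame before any HEADERS, or anything after the trailing HEADERS, falls through to the error branch, matching the two invalid cases called out above.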
An HTTP request/response exchange fully consumes a client-initiated bidirectional QUIC stream. After sending a request, a client MUST close the stream for sending. Unless using the CONNECT method (see Section 4.2), clients MUST NOT make stream closure dependent on receiving a response to their request. After sending a final response, the server MUST close the stream for sending. At this point, the QUIC stream is fully closed.¶

When a stream is closed, this indicates the end of an HTTP message. Because some messages are large or unbounded, endpoints SHOULD begin processing partial HTTP messages once enough of the message has been received to make progress. If a client stream terminates without enough of the HTTP message to provide a complete response, the server SHOULD abort its response with the error code H3_REQUEST_INCOMPLETE.¶

A server can send a complete response prior to the client sending an entire request if the response does not depend on any portion of the request that has not been sent and received. When the server does not need to receive the remainder of the request, it MAY abort reading the request stream, send a complete response, and cleanly close the sending part of the stream. The error code H3_NO_ERROR SHOULD be used when requesting that the client stop sending on the request stream. Clients MUST NOT discard complete responses as a result of having their request terminated abruptly, though clients can always discard responses at their discretion for other reasons. If the server sends a partial or complete response but does not abort reading, clients SHOULD continue sending the body of the request and close the stream normally.¶

HTTP message headers carry information as a series of key-value pairs, called header fields. For a listing of registered HTTP header fields, see the "Message Header Field" registry maintained at https://www.iana.org/assignments/message-headers.¶

Just as in previous versions of HTTP, header field names are strings of ASCII characters that are compared in a case-insensitive fashion. Properties of HTTP header field names and values are discussed in more detail in Section 3.2 of [RFC7230], though the wire rendering in HTTP/3 differs. As in HTTP/2, header field names MUST be converted to lowercase prior to their encoding. A request or response containing uppercase header field names MUST be treated as malformed (Section 4.1.3).¶

Like HTTP/2, HTTP/3 does not use the Connection header field to indicate connection-specific header fields; in this protocol, connection-specific metadata is conveyed by other means. An endpoint MUST NOT generate an HTTP/3 message containing connection-specific header fields; any message containing connection-specific header fields MUST be treated as malformed (Section 4.1.3).¶

The only exception to this is the TE header field, which MAY be present in an HTTP/3 request; when it is, it MUST NOT contain any value other than "trailers".¶

This means that an intermediary transforming an HTTP/1.x message to HTTP/3 will need to remove any header fields nominated by the Connection header field, along with the Connection header field itself. Such intermediaries SHOULD also remove other connection-specific header fields, such as Keep-Alive, Proxy-Connection, Transfer-Encoding, and Upgrade, even if they are not nominated by the Connection header field.¶

As in HTTP/2, HTTP/3 uses special pseudo-header fields beginning with the ':' character (ASCII 0x3a) to convey the target URI, the method of the request, and the status code for the response.¶

Pseudo-header fields are not HTTP header fields. Endpoints MUST NOT generate pseudo-header fields other than those defined in this document, except as negotiated via an extension; see Section 9.¶

Pseudo-header fields are only valid in the context in which they are defined. Pseudo-header fields defined for requests MUST NOT appear in responses; pseudo-header fields defined for responses MUST NOT appear in requests. Pseudo-header fields MUST NOT appear in trailers. Endpoints MUST treat a request or response that contains undefined or invalid pseudo-header fields as malformed (Section 4.1.3).¶

All pseudo-header fields MUST appear in the header block before regular header fields. Any request or response that contains a pseudo-header field that appears in a header block after a regular header field MUST be treated as malformed (Section 4.1.3).¶
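The ordering rule above is simple to check mechanically; a minimal sketch of such a validator:

```python
def validate_field_order(fields) -> bool:
    # All pseudo-header fields (names starting with ':') must precede
    # regular header fields; a pseudo-header after a regular field
    # makes the request or response malformed.
    seen_regular = False
    for name, _value in fields:
        if name.startswith(":"):
            if seen_regular:
                return False
        else:
            seen_regular = True
    return True
```

A fuller implementation would also reject uppercase names and pseudo-header fields not defined for the message's direction, per the preceding paragraphs.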
The following pseudo-header fields are defined for requests:¶

All HTTP/3 requests MUST include exactly one value for the ":method", ":scheme", and ":path" pseudo-header fields, unless it is a CONNECT request (Section 4.2). An HTTP request that omits mandatory pseudo-header fields or contains invalid values for those fields is malformed (Section 4.1.3).¶

HTTP/3 does not define a way to carry the version identifier that is included in the HTTP/1.1 request line.¶

For responses, a single ":status" pseudo-header field is defined that carries the HTTP status code field (see Section 6 of [RFC7231]). This pseudo-header field MUST be included in all responses; otherwise, the response is malformed (Section 4.1.3).¶

HTTP/3 does not define a way to carry the version or reason phrase that is included in an HTTP/1.1 status line.¶

HTTP/3 uses QPACK header compression as described in [QPACK], a variation of HPACK which allows the flexibility to avoid header-compression-induced head-of-line blocking. See that document for additional details.¶

To allow for better compression efficiency, the cookie header field [RFC6265] MAY be split into separate header fields, each with one or more cookie-pairs, before compression. If a decompressed header list contains multiple cookie header fields, these MUST be concatenated into a single octet string using the two-octet delimiter of 0x3B, 0x20 (the ASCII string "; ") before being passed into a context other than HTTP/2 or HTTP/3, such as an HTTP/1.1 connection, or a generic HTTP server application.¶
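The re-concatenation step can be sketched as follows; note that this illustration moves the merged cookie field to the end of the list, which the text does not require:

```python
def merge_cookie_fields(headers):
    # Re-join split cookie header fields with the two-octet "; "
    # (0x3B, 0x20) delimiter before passing the message to a context
    # other than HTTP/2 or HTTP/3, such as an HTTP/1.1 hop.
    cookies = [value for name, value in headers if name == "cookie"]
    others = [(name, value) for name, value in headers if name != "cookie"]
    if cookies:
        others.append(("cookie", "; ".join(cookies)))
    return others
```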
An HTTP/3 implementation MAY impose a limit on the maximum size of the message header it will accept on an individual HTTP message. A server that receives a larger header field list than it is willing to handle can send an HTTP 431 (Request Header Fields Too Large) status code [RFC6585]. A client can discard responses that it cannot process. The size of a header field list is calculated based on the uncompressed size of header fields, including the length of the name and value in bytes plus an overhead of 32 bytes for each header field.¶
If an implementation wishes to advise its peer of this limit, it can be conveyed as a number of bytes in the SETTINGS_MAX_HEADER_LIST_SIZE parameter. An implementation which has received this parameter SHOULD NOT send an HTTP message header which exceeds the indicated size, as the peer will likely refuse to process it. However, because this limit is applied at each hop, messages below this limit are not guaranteed to be accepted.¶
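The size calculation described above (uncompressed name and value lengths plus 32 bytes of per-field overhead) reduces to a one-liner:

```python
def header_list_size(fields) -> int:
    # Uncompressed size of a header field list, as compared against
    # SETTINGS_MAX_HEADER_LIST_SIZE: byte length of each name and value,
    # plus a fixed 32-byte overhead per field.
    return sum(len(name.encode()) + len(value.encode()) + 32
               for name, value in fields)
```

For instance, the single field (":method", "GET") contributes 7 + 3 + 32 = 42 bytes.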
Clients can cancel requests by resetting and aborting the request stream with an error code of H3_REQUEST_CANCELLED (Section 8.1). When the client aborts reading a response, it indicates that this response is no longer of interest. Implementations SHOULD cancel requests by abruptly terminating any directions of a stream that are still open.¶

When the server rejects a request without performing any application processing, it SHOULD abort its response stream with the error code H3_REQUEST_REJECTED. In this context, "processed" means that some data from the stream was passed to some higher layer of software that might have taken some action as a result. The client can treat requests rejected by the server as though they had never been sent at all, thereby allowing them to be retried later on a new connection. Servers MUST NOT use the H3_REQUEST_REJECTED error code for requests which were partially or fully processed. When a server abandons a response after partial processing, it SHOULD abort its response stream with the error code H3_REQUEST_CANCELLED.¶

When a client resets a request with the error code H3_REQUEST_CANCELLED, a server MAY abruptly terminate the response using the error code H3_REQUEST_REJECTED if no processing was performed. Clients MUST NOT use the H3_REQUEST_REJECTED error code, except when a server has requested closure of the request stream with this error code.¶

If a stream is cancelled after receiving a complete response, the client MAY ignore the cancellation and use the response. However, if a stream is cancelled after receiving a partial response, the response SHOULD NOT be used. Automatically retrying such requests is not possible, unless this is otherwise permitted (e.g., idempotent actions like GET, PUT, or DELETE).¶

A malformed request or response is one that is an otherwise valid sequence of frames but is invalid due to:¶
A request or response that includes a payload body can include a content-length header field. A request or response is also malformed if the value of a content-length header field does not equal the sum of the DATA frame payload lengths that form the body. A response that is defined to have no payload, as described in Section 3.3.2 of [RFC7230], can have a non-zero content-length header field, even though no content is included in DATA frames.¶
Intermediaries that process HTTP requests or responses (i.e., any intermediary not acting as a tunnel) MUST NOT forward a malformed request or response. Malformed requests or responses that are detected MUST be treated as a stream error (Section 8) of type H3_GENERAL_PROTOCOL_ERROR.¶

For malformed requests, a server MAY send an HTTP response prior to closing or resetting the stream. Clients MUST NOT accept a malformed response. Note that these requirements are intended to protect against several types of common attacks against HTTP; they are deliberately strict because being permissive can expose implementations to these vulnerabilities.¶

The CONNECT method requests that the recipient establish a tunnel to the destination origin server identified by the request-target (Section 4.3.6 of [RFC7231]). It is primarily used with HTTP proxies to establish a TLS session with an origin server for the purposes of interacting with "https" resources.¶

In HTTP/1.x, CONNECT is used to convert an entire HTTP connection into a tunnel to a remote host. In HTTP/2 and HTTP/3, the CONNECT method is used to establish a tunnel over a single stream.¶

A CONNECT request MUST be constructed as follows:¶

The request stream remains open at the end of the request to carry the data to be transferred. A CONNECT request that does not conform to these restrictions is malformed (see Section 4.1.3).¶

A proxy that supports CONNECT establishes a TCP connection ([RFC0793]) to the server identified in the ":authority" pseudo-header field. Once this connection is successfully established, the proxy sends a HEADERS frame containing a 2xx series status code to the client, as defined in Section 4.3.6 of [RFC7231].¶

All DATA frames on the stream correspond to data sent or received on the TCP connection. Any DATA frame sent by the client is transmitted by the proxy to the TCP server; data received from the TCP server is packaged into DATA frames by the proxy. Note that the size and number of TCP segments is not guaranteed to map predictably to the size and number of HTTP DATA or QUIC STREAM frames.¶

Once the CONNECT method has completed, only DATA frames are permitted to be sent on the stream. Extension frames MAY be used if specifically permitted by the definition of the extension. Receipt of any other frame type MUST be treated as a connection error of type H3_FRAME_UNEXPECTED.¶

The TCP connection can be closed by either peer. When the client ends the request stream (that is, the receive stream at the proxy enters the "Data Recvd" state), the proxy will set the FIN bit on its connection to the TCP server. When the proxy receives a packet with the FIN bit set, it will terminate the send stream that it sends to the client. TCP connections which remain half-closed in a single direction are not invalid, but are often handled poorly by servers, so clients SHOULD NOT close a stream for sending while they still expect to receive data from the target of the CONNECT.¶

A TCP connection error is signaled by abruptly terminating the stream. A proxy treats any error in the TCP connection, which includes receiving a TCP segment with the RST bit set, as a stream error of type H3_CONNECT_ERROR (Section 8.1). Correspondingly, if a proxy detects an error with the stream or the QUIC connection, it MUST close the TCP connection. If the underlying TCP implementation permits it, the proxy SHOULD send a TCP segment with the RST bit set.¶

HTTP/3 does not support the HTTP Upgrade mechanism (Section 6.7 of [RFC7230]) or the 101 (Switching Protocols) informational status code (Section 6.2.2 of [RFC7231]).¶

Server push is an interaction mode which permits a server to push a request-response exchange to a client in anticipation of the client making the indicated request. This trades off network usage against a potential latency gain. HTTP/3 server push is similar to what is described in HTTP/2 [HTTP2], but uses different mechanisms.¶

Each server push is identified by a unique Push ID. This Push ID is used in one or more PUSH_PROMISE frames (see Section 7.2.5) that carry the request headers, then included with the push stream which ultimately fulfills those promises. When the same Push ID is promised on multiple request streams, the decompressed request header sets MUST contain the same fields in the same order, and both the name and the value in each field MUST be exact matches.¶

Server push is only enabled on a connection when a client sends a MAX_PUSH_ID frame (see Section 7.2.7). A server cannot use server push until it receives a MAX_PUSH_ID frame. A client sends additional MAX_PUSH_ID frames to control the number of pushes that a server can promise. A server SHOULD use Push IDs sequentially, starting at 0. A client MUST treat receipt of a push stream with a Push ID that is greater than the maximum Push ID as a connection error of type H3_ID_ERROR.¶
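The Push ID gating rule above amounts to a single predicate; the function name `push_id_allowed` is illustrative, not from the draft:

```python
def push_id_allowed(push_id, max_push_id) -> bool:
    # Push is disabled until the client's first MAX_PUSH_ID frame arrives
    # (represented here as max_push_id = None). A push stream whose Push ID
    # exceeds the advertised maximum is a connection error of type
    # H3_ID_ERROR.
    return max_push_id is not None and push_id <= max_push_id
```

The client raises the limit by sending further MAX_PUSH_ID frames, so the server's usable Push ID space only ever grows over the connection's lifetime.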
-The header of the request message is carried by a PUSH_PROMISE frame (see -Section 7.2.5) on the request stream which generated the push. This -allows the server push to be associated with a client request.¶
-Not all requests can be pushed. A server MAY push requests which have the -following properties:¶
-Clients SHOULD send a CANCEL_PUSH frame upon receipt of a PUSH_PROMISE frame -carrying a request which is not cacheable, is not known to be safe, or that -indicates the presence of a request body. If the pushed response arrives on a -push stream, this MAY be treated as a stream error of type -H3_STREAM_CREATION_ERROR.¶
-The server MUST include a value in the ":authority" pseudo-header field for -which the server is authoritative (see Section 3.4). A client SHOULD -send a CANCEL_PUSH frame upon receipt of a PUSH_PROMISE frame carrying a request -for which it does not consider the server authoritative. If the pushed response -arrives on a push stream, this MAY be treated as a stream error of type -H3_STREAM_CREATION_ERROR.¶
-Each pushed response is associated with one or more client requests. The push -is associated with the request stream on which the PUSH_PROMISE frame was -received. The same server push can be associated with additional client -requests using a PUSH_PROMISE frame with the same Push ID on multiple request -streams. These associations do not affect the operation of the protocol, but -MAY be considered by user agents when deciding how to use pushed resources.¶
-Ordering of a PUSH_PROMISE in relation to certain parts of the response is -important. The server SHOULD send PUSH_PROMISE frames prior to sending HEADERS -or DATA frames that reference the promised responses. This reduces the chance -that a client requests a resource that will be pushed by the server.¶
-When a server later fulfills a promise, the server push response is conveyed on -a push stream (see Section 6.2.2). The push stream identifies the Push ID of -the promise that it fulfills, then contains a response to the promised request -using the same format described for responses in Section 4.1.¶
-Due to reordering, push stream data can arrive before the corresponding -PUSH_PROMISE frame. When a client receives a new push stream with an -as-yet-unknown Push ID, both the associated client request and the pushed -request headers are unknown. The client can buffer the stream data in -expectation of the matching PUSH_PROMISE. The client can use stream flow control -(see section 4.1 of [QUIC-TRANSPORT]) to limit the amount of data a server may -commit to the pushed stream.¶
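The buffering described above can be sketched as follows; the class and method names (`PushStreamBuffer`, `on_push_promise`) are illustrative and not part of the protocol:

```python
class PushStreamBuffer:
    """Holds push stream data whose Push ID has no matching PUSH_PROMISE yet.

    A real implementation would also rely on the per-stream flow control
    limit mentioned above to bound this buffering; that bookkeeping lives
    in the QUIC layer.
    """

    def __init__(self):
        self.known = set()    # Push IDs already seen in PUSH_PROMISE frames
        self.pending = {}     # push_id -> buffered bytes awaiting a promise

    def on_push_stream_data(self, push_id, data):
        """Return data ready for processing, or None if it must be buffered."""
        if push_id in self.known:
            return data
        self.pending[push_id] = self.pending.get(push_id, b"") + data
        return None

    def on_push_promise(self, push_id):
        """Record a promise; return any data that arrived ahead of it."""
        self.known.add(push_id)
        return self.pending.pop(push_id, b"")
```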
-If a promised server push is not needed by the client, the client SHOULD send a -CANCEL_PUSH frame. If the push stream is already open or opens after sending the -CANCEL_PUSH frame, the client can abort reading the stream with an error code of -H3_REQUEST_CANCELLED. This asks the server not to transfer additional data and -indicates that it will be discarded upon receipt.¶
-Pushed responses that are cacheable (see Section 3 of [RFC7234]) can be -stored by the client, if it implements an HTTP cache. Pushed responses are -considered successfully validated on the origin server (e.g., if the "no-cache" -cache response directive is present (Section 5.2.2 of [RFC7234])) at the time -the pushed response is received.¶
-Pushed responses that are not cacheable MUST NOT be stored by any HTTP cache. -They MAY be made available to the application separately.¶
-Once established, an HTTP/3 connection can be used for many requests and -responses over time until the connection is closed. Connection closure can -happen in any of several different ways.¶
-Each QUIC endpoint declares an idle timeout during the handshake. If the -connection remains idle (no packets received) for longer than this duration, the -peer will assume that the connection has been closed. HTTP/3 implementations -will need to open a new connection for new requests if the existing connection -has been idle for longer than the server's advertised idle timeout, and SHOULD -do so if approaching the idle timeout.¶
-HTTP clients are expected to request that the transport keep connections open -while there are responses outstanding for requests or server pushes, as -described in Section 19.2 of [QUIC-TRANSPORT]. If the client is not expecting -a response from the server, allowing an idle connection to time out is preferred -over expending effort maintaining a connection that might not be needed. A -gateway MAY maintain connections in anticipation of need rather than incur the -latency cost of connection establishment to servers. Servers SHOULD NOT actively -keep connections open.¶
-Even when a connection is not idle, either endpoint can decide to stop using the -connection and initiate a graceful connection close. Endpoints initiate the -graceful shutdown of a connection by sending a GOAWAY frame (Section 7.2.6). -The GOAWAY frame contains an identifier that indicates to the receiver the range -of requests or pushes that were or might be processed in this connection. The -server sends a client-initiated bidirectional Stream ID; the client sends a Push -ID. Requests or pushes with the indicated identifier or greater are rejected by -the sender of the GOAWAY. This identifier MAY be zero if no requests or pushes -were processed.¶
-The information in the GOAWAY frame enables a client and server to agree on -which requests or pushes were accepted prior to the connection shutdown. Upon -sending a GOAWAY frame, the endpoint SHOULD explicitly cancel (see -Section 4.1.2 and Section 7.2.3) any requests or pushes that -have identifiers greater than or equal to that indicated, in order to clean up -transport state for the affected streams. The endpoint SHOULD continue to do so -as more requests or pushes arrive.¶
-Endpoints MUST NOT initiate new requests or promise new pushes on the connection -after receipt of a GOAWAY frame from the peer. Clients MAY establish a new -connection to send additional requests.¶
-Some requests or pushes might already be in transit:¶
-Upon receipt of a GOAWAY frame, if the client has already sent requests with -a Stream ID greater than or equal to the identifier received in a GOAWAY -frame, those requests will not be processed. Clients can safely retry -unprocessed requests on a different connection.¶
--Requests on Stream IDs less than the Stream ID in a GOAWAY frame from the -server might have been processed; their status cannot be known until a -response is received, the stream is reset individually, another GOAWAY is -received, or the connection terminates.¶
--Servers MAY reject individual requests on streams below the indicated ID if -these requests were not processed.¶
-Servers SHOULD send a GOAWAY frame when the closing of a connection is known -in advance, even if the advance notice is small, so that the remote peer can -know whether a request has been partially processed or not. For example, if an -HTTP client sends a POST at the same time that a server closes a QUIC -connection, the client cannot know if the server started to process that POST -request if the server does not send a GOAWAY frame to indicate what streams it -might have acted on.¶
-A client that is unable to retry requests loses all requests that are in flight -when the server closes the connection. An endpoint MAY send multiple GOAWAY -frames indicating different identifiers, but MUST NOT increase the identifier -value they send, since clients might already have retried unprocessed requests -on another connection.¶
-An endpoint that is attempting to gracefully shut down a connection can send a -GOAWAY frame with a value set to the maximum possible value (2^62-4 for servers, -2^62-1 for clients). This ensures that the peer stops creating new requests or -pushes. After allowing time for any in-flight requests or pushes to arrive, the -endpoint can send another GOAWAY frame indicating which requests or pushes it -might accept before the end of the connection. This ensures that a connection -can be cleanly shut down without losing requests.¶
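The two-step shutdown can be sketched as below. The constants follow from QUIC's 62-bit variable-length integers (client-initiated bidirectional Stream IDs are multiples of 4, so the largest is 2^62 - 4); the function name is illustrative:

```python
MAX_VARINT = 2**62 - 1             # largest QUIC variable-length integer
SERVER_FIRST_GOAWAY = 2**62 - 4    # largest client-initiated bidi Stream ID
CLIENT_FIRST_GOAWAY = MAX_VARINT   # largest possible Push ID

def graceful_goaway_ids(is_server, last_accepted_id):
    """Identifiers for a two-step graceful shutdown.

    First send the maximum value to stop new requests/pushes, then,
    after in-flight work has had time to arrive, send the real cutoff.
    """
    first = SERVER_FIRST_GOAWAY if is_server else CLIENT_FIRST_GOAWAY
    assert last_accepted_id <= first, "GOAWAY identifiers must not increase"
    return [first, last_accepted_id]
```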
-A client has more flexibility in the value it chooses for the Push ID in a -GOAWAY that it sends. A value of 2^62 - 1 indicates that the server can -continue fulfilling pushes which have already been promised, and the client can -continue granting push credit as needed (see Section 7.2.7). A smaller -value indicates the client will reject pushes with Push IDs greater than or -equal to this value. Like the server, the client MAY send subsequent GOAWAY -frames so long as the specified Push ID is strictly smaller than all previously -sent values.¶
-Even when a GOAWAY indicates that a given request or push will not be processed -or accepted upon receipt, the underlying transport resources still exist. The -endpoint that initiated these requests can cancel them to clean up transport -state.¶
Once all accepted requests and pushes have been processed, the endpoint can permit the connection to become idle, or MAY initiate an immediate closure of the connection. An endpoint that completes a graceful shutdown SHOULD use the H3_NO_ERROR code when closing the connection.¶
-If a client has consumed all available bidirectional stream IDs with requests, -the server need not send a GOAWAY frame, since the client is unable to make -further requests.¶
-An HTTP/3 implementation can immediately close the QUIC connection at any time. -This results in sending a QUIC CONNECTION_CLOSE frame to the peer indicating -that the application layer has terminated the connection. The application error -code in this frame indicates to the peer why the connection is being closed. -See Section 8 for error codes which can be used when closing a connection in -HTTP/3.¶
-Before closing the connection, a GOAWAY frame MAY be sent to allow the client to -retry some requests. Including the GOAWAY frame in the same packet as the QUIC -CONNECTION_CLOSE frame improves the chances of the frame being received by -clients.¶
-For various reasons, the QUIC transport could indicate to the application layer -that the connection has terminated. This might be due to an explicit closure -by the peer, a transport-level error, or a change in network topology which -interrupts connectivity.¶
-If a connection terminates without a GOAWAY frame, clients MUST assume that any -request which was sent, whether in whole or in part, might have been processed.¶
-A QUIC stream provides reliable in-order delivery of bytes, but makes no -guarantees about order of delivery with regard to bytes on other streams. On the -wire, data is framed into QUIC STREAM frames, but this framing is invisible to -the HTTP framing layer. The transport layer buffers and orders received QUIC -STREAM frames, exposing the data contained within as a reliable byte stream to -the application. Although QUIC permits out-of-order delivery within a stream, -HTTP/3 does not make use of this feature.¶
-QUIC streams can be either unidirectional, carrying data only from initiator to -receiver, or bidirectional. Streams can be initiated by either the client or -the server. For more detail on QUIC streams, see Section 2 of -[QUIC-TRANSPORT].¶
-When HTTP headers and data are sent over QUIC, the QUIC layer handles most of -the stream management. HTTP does not need to do any separate multiplexing when -using QUIC - data sent over a QUIC stream always maps to a particular HTTP -transaction or connection context.¶
All client-initiated bidirectional streams are used for HTTP requests and responses. A bidirectional stream ensures that the response can be readily correlated with the request. This means that the client's first request occurs on QUIC stream 0, with subsequent requests on streams 4, 8, and so on. In order to permit these streams to open, an HTTP/3 server SHOULD configure non-zero minimum values for the number of permitted streams and the initial stream flow control window. So as to not unnecessarily limit parallelism, at least 100 requests SHOULD be permitted at a time.¶
-HTTP/3 does not use server-initiated bidirectional streams, though an extension -could define a use for these streams. Clients MUST treat receipt of a -server-initiated bidirectional stream as a connection error of type -H3_STREAM_CREATION_ERROR unless such an extension has been negotiated.¶
-Unidirectional streams, in either direction, are used for a range of purposes. -The purpose is indicated by a stream type, which is sent as a variable-length -integer at the start of the stream. The format and structure of data that -follows this integer is determined by the stream type.¶
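The variable-length integer encoding is QUIC's (Section 16 of [QUIC-TRANSPORT]): the two most significant bits of the first byte select a 1-, 2-, 4-, or 8-byte encoding. A decoder sketch:

```python
def decode_varint(buf, offset=0):
    """Decode a QUIC variable-length integer starting at `offset`.

    Returns (value, bytes_consumed). The top two bits of the first byte
    encode the length: 0b00 -> 1 byte, 0b01 -> 2, 0b10 -> 4, 0b11 -> 8.
    """
    first = buf[offset]
    length = 1 << (first >> 6)
    if offset + length > len(buf):
        raise ValueError("truncated varint")
    value = first & 0x3F              # clear the two length bits
    for b in buf[offset + 1 : offset + length]:
        value = (value << 8) | b
    return value, length
```

For example, a unidirectional stream beginning with the single byte 0x00 carries the control stream type, and 0x01 the push stream type.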
-Some stream types are reserved (Section 6.2.3). Two stream types are -defined in this document: control streams (Section 6.2.1) and push streams -(Section 6.2.2). [QPACK] defines two additional stream types. Other stream -types can be defined by extensions to HTTP/3; see Section 9 for more -details.¶
-The performance of HTTP/3 connections in the early phase of their lifetime is -sensitive to the creation and exchange of data on unidirectional streams. -Endpoints that excessively restrict the number of streams or the flow control -window of these streams will increase the chance that the remote peer reaches -the limit early and becomes blocked. In particular, implementations should -consider that remote peers may wish to exercise reserved stream behavior -(Section 6.2.3) with some of the unidirectional streams they are permitted -to use. To avoid blocking, the transport parameters sent by both clients and -servers MUST allow the peer to create at least one unidirectional stream for the -HTTP control stream plus the number of unidirectional streams required by -mandatory extensions (three being the minimum number required for the base -HTTP/3 protocol and QPACK), and SHOULD provide at least 1,024 bytes of flow -control credit to each stream.¶
-Note that an endpoint is not required to grant additional credits to create more -unidirectional streams if its peer consumes all the initial credits before -creating the critical unidirectional streams. Endpoints SHOULD create the HTTP -control stream as well as the unidirectional streams required by mandatory -extensions (such as the QPACK encoder and decoder streams) first, and then -create additional streams as allowed by their peer.¶
-If the stream header indicates a stream type which is not supported by the -recipient, the remainder of the stream cannot be consumed as the semantics are -unknown. Recipients of unknown stream types MAY abort reading of the stream with -an error code of H3_STREAM_CREATION_ERROR, but MUST NOT consider such streams -to be a connection error of any kind.¶
-Implementations MAY send stream types before knowing whether the peer supports -them. However, stream types which could modify the state or semantics of -existing protocol components, including QPACK or other extensions, MUST NOT be -sent until the peer is known to support them.¶
-A sender can close or reset a unidirectional stream unless otherwise specified. -A receiver MUST tolerate unidirectional streams being closed or reset prior to -the reception of the unidirectional stream header.¶
A control stream is indicated by a stream type of 0x00. Data on this stream consists of HTTP/3 frames, as defined in Section 7.2.¶
Each side MUST initiate a single control stream at the beginning of the -connection and send its SETTINGS frame as the first frame on this stream. If -the first frame of the control stream is any other frame type, this MUST be -treated as a connection error of type H3_MISSING_SETTINGS. Only one control -stream per peer is permitted; receipt of a second stream which claims to be a -control stream MUST be treated as a connection error of type -H3_STREAM_CREATION_ERROR. The sender MUST NOT close the control stream, and -the receiver MUST NOT request that the sender close the control stream. If -either control stream is closed at any point, this MUST be treated as a -connection error of type H3_CLOSED_CRITICAL_STREAM.¶
-A pair of unidirectional streams is used rather than a single bidirectional -stream. This allows either peer to send data as soon as it is able. Depending -on whether 0-RTT is enabled on the connection, either client or server might be -able to send stream data first after the cryptographic handshake completes.¶
-Server push is an optional feature introduced in HTTP/2 that allows a server to -initiate a response before a request has been made. See Section 4.4 for -more details.¶
A push stream is indicated by a stream type of 0x01, followed by the Push ID of the promise that it fulfills, encoded as a variable-length integer. The remaining data on this stream consists of HTTP/3 frames, as defined in Section 7.2, and fulfills a promised server push by zero or more non-final HTTP responses followed by a single final HTTP response, as defined in Section 4.1. Server push and Push IDs are described in Section 4.4.¶
Only servers can push; if a server receives a client-initiated push stream, this -MUST be treated as a connection error of type H3_STREAM_CREATION_ERROR.¶
-Each Push ID MUST only be used once in a push stream header. If a push stream -header includes a Push ID that was used in another push stream header, the -client MUST treat this as a connection error of type H3_ID_ERROR.¶
Stream types of the format 0x1f * N + 0x21 for integer values of N are reserved to exercise the requirement that unknown types be ignored. These streams have no semantics, and can be sent when application-layer padding is desired. They MAY also be sent on connections where no data is currently being transferred. Endpoints MUST NOT consider these streams to have any meaning upon receipt.¶
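A quick check for the reserved pattern, plus a generator for picking one to send; function names are illustrative:

```python
import random

def is_reserved_stream_type(t):
    """True if t == 0x1f * N + 0x21 for some non-negative integer N."""
    return t >= 0x21 and (t - 0x21) % 0x1f == 0

def random_reserved_stream_type():
    """Pick a reserved type that still fits in a QUIC varint (< 2**62)."""
    n = random.randrange((2**62 - 1 - 0x21) // 0x1f + 1)
    return 0x1f * n + 0x21
```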
The payload and length of the stream are selected in any manner the -implementation chooses. Implementations MAY terminate these streams cleanly, or -MAY abruptly terminate them. When terminating abruptly, the error code -H3_NO_ERROR or a reserved error code (Section 8.1) SHOULD be used.¶
HTTP frames are carried on QUIC streams, as described in Section 6. HTTP/3 defines three stream types: control stream, request stream, and push stream. This section describes HTTP/3 frame formats and the stream types on which they are permitted; see Table 1 for an overview. A comparison between HTTP/2 and HTTP/3 frames is provided in Appendix A.2.¶
Frame | Control Stream | Request Stream | Push Stream | Section
---|---|---|---|---
DATA | No | Yes | Yes | Section 7.2.1
HEADERS | No | Yes | Yes | Section 7.2.2
CANCEL_PUSH | Yes | No | No | Section 7.2.3
SETTINGS | Yes (1) | No | No | Section 7.2.4
PUSH_PROMISE | No | Yes | No | Section 7.2.5
GOAWAY | Yes | No | No | Section 7.2.6
MAX_PUSH_ID | Yes | No | No | Section 7.2.7
Reserved | Yes | Yes | Yes | Section 7.2.8
Certain frames can only occur as the first frame of a particular stream type; -these are indicated in Table 1 with a (1). Specific guidance -is provided in the relevant section.¶
-Note that, unlike QUIC frames, HTTP/3 frames can span multiple packets.¶
-All frames have the following format:¶
A frame includes the following fields:¶
Type: A variable-length integer that identifies the frame type.¶
Length: A variable-length integer that describes the length in bytes of the Frame Payload.¶
Frame Payload: A payload, the semantics of which are determined by the Type field.¶
-Each frame's payload MUST contain exactly the fields identified in its -description. A frame payload that contains additional bytes after the -identified fields or a frame payload that terminates before the end of the -identified fields MUST be treated as a connection error of type -H3_FRAME_ERROR.¶
-When a stream terminates cleanly, if the last frame on the stream was truncated, -this MUST be treated as a connection error (Section 8) of type -H3_FRAME_ERROR. Streams which terminate abruptly may be reset at any point in -a frame.¶
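Parsing one frame from a buffered stream can be sketched as follows, raising on the truncation condition above; `parse_frame` and `_varint` are illustrative names:

```python
def _varint(buf, off):
    """QUIC variable-length integer; returns (value, next offset)."""
    first = buf[off]
    length = 1 << (first >> 6)
    if off + length > len(buf):
        raise ValueError("truncated varint")
    value = first & 0x3F
    for b in buf[off + 1 : off + length]:
        value = (value << 8) | b
    return value, off + length

def parse_frame(buf, off=0):
    """Parse one frame: Type and Length varints, then Length payload bytes.

    Raises if the buffer ends mid-frame -- the condition a receiver maps
    to H3_FRAME_ERROR when the stream has terminated cleanly.
    """
    ftype, off = _varint(buf, off)
    length, off = _varint(buf, off)
    if off + length > len(buf):
        raise ValueError("truncated frame (H3_FRAME_ERROR)")
    return ftype, buf[off : off + length], off + length
```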
-DATA frames (type=0x0) convey arbitrary, variable-length sequences of bytes -associated with an HTTP request or response payload.¶
-DATA frames MUST be associated with an HTTP request or response. If a DATA -frame is received on a control stream, the recipient MUST respond with a -connection error (Section 8) of type H3_FRAME_UNEXPECTED.¶
-The HEADERS frame (type=0x1) is used to carry a header block, compressed using -QPACK. See [QPACK] for more details.¶
-HEADERS frames can only be sent on request / push streams. If a HEADERS frame -is received on a control stream, the recipient MUST respond with a connection -error (Section 8) of type H3_FRAME_UNEXPECTED.¶
-The CANCEL_PUSH frame (type=0x3) is used to request cancellation of a server -push prior to the push stream being received. The CANCEL_PUSH frame identifies -a server push by Push ID (see Section 7.2.5), encoded as a -variable-length integer.¶
-When a client sends CANCEL_PUSH, it is indicating that it does not wish to -receive the promised resource. The server SHOULD abort sending the resource, -but the mechanism to do so depends on the state of the corresponding push -stream. If the server has not yet created a push stream, it does not create -one. If the push stream is open, the server SHOULD abruptly terminate that -stream. If the push stream has already ended, the server MAY still abruptly -terminate the stream or MAY take no action.¶
-When a server sends CANCEL_PUSH, it is indicating that it will not be fulfilling -a promise and has not created a push stream. The client should not expect the -corresponding promise to be fulfilled.¶
-Sending CANCEL_PUSH has no direct effect on the state of existing push streams. -A server SHOULD NOT send a CANCEL_PUSH when it has already created a -corresponding push stream, and a client SHOULD NOT send a CANCEL_PUSH when it -has already received a corresponding push stream.¶
-A CANCEL_PUSH frame is sent on the control stream. Receiving a CANCEL_PUSH -frame on a stream other than the control stream MUST be treated as a connection -error of type H3_FRAME_UNEXPECTED.¶
-The CANCEL_PUSH frame carries a Push ID encoded as a variable-length integer. -The Push ID identifies the server push that is being cancelled (see -Section 7.2.5). If a CANCEL_PUSH frame is received which references a -Push ID greater than currently allowed on the connection, this MUST be treated -as a connection error of type H3_ID_ERROR.¶
-If the client receives a CANCEL_PUSH frame, that frame might identify a Push ID -that has not yet been mentioned by a PUSH_PROMISE frame due to reordering. If a -server receives a CANCEL_PUSH frame for a Push ID that has not yet been -mentioned by a PUSH_PROMISE frame, this MUST be treated as a connection error of -type H3_ID_ERROR.¶
-The SETTINGS frame (type=0x4) conveys configuration parameters that affect how -endpoints communicate, such as preferences and constraints on peer behavior. -Individually, a SETTINGS parameter can also be referred to as a "setting"; the -identifier and value of each setting parameter can be referred to as a "setting -identifier" and a "setting value".¶
-SETTINGS frames always apply to a connection, never a single stream. A SETTINGS -frame MUST be sent as the first frame of each control stream (see -Section 6.2.1) by each peer, and MUST NOT be sent subsequently. If -an endpoint receives a second SETTINGS frame on the control stream, the endpoint -MUST respond with a connection error of type H3_FRAME_UNEXPECTED.¶
-SETTINGS frames MUST NOT be sent on any stream other than the control stream. -If an endpoint receives a SETTINGS frame on a different stream, the endpoint -MUST respond with a connection error of type H3_FRAME_UNEXPECTED.¶
-SETTINGS parameters are not negotiated; they describe characteristics of the -sending peer, which can be used by the receiving peer. However, a negotiation -can be implied by the use of SETTINGS - each peer uses SETTINGS to advertise a -set of supported values. The definition of the setting would describe how each -peer combines the two sets to conclude which choice will be used. SETTINGS does -not provide a mechanism to identify when the choice takes effect.¶
-Different values for the same parameter can be advertised by each peer. For -example, a client might be willing to consume a very large response header, -while servers are more cautious about request size.¶
-The same setting identifier MUST NOT occur more than once in the SETTINGS frame. -A receiver MAY treat the presence of duplicate setting identifiers as a -connection error of type H3_SETTINGS_ERROR.¶
-The payload of a SETTINGS frame consists of zero or more parameters. Each -parameter consists of a setting identifier and a value, both encoded as QUIC -variable-length integers.¶
-An implementation MUST ignore the contents for any SETTINGS identifier it does -not understand.¶
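A parser sketch for the payload described above; it returns unknown identifiers untouched so the caller can ignore their contents, and enforces the duplicate rule (names are illustrative):

```python
def parse_settings(payload):
    """Parse (identifier, value) varint pairs from a SETTINGS payload.

    Duplicate identifiers raise, mirroring the optional H3_SETTINGS_ERROR
    treatment; unknown identifiers are kept so the caller can skip them.
    """
    def varint(off):
        first = payload[off]
        length = 1 << (first >> 6)
        value = first & 0x3F
        for b in payload[off + 1 : off + length]:
            value = (value << 8) | b
        return value, off + length

    settings, off = {}, 0
    while off < len(payload):
        ident, off = varint(off)
        value, off = varint(off)
        if ident in settings:
            raise ValueError("duplicate setting (H3_SETTINGS_ERROR)")
        settings[ident] = value
    return settings
```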
-The following settings are defined in HTTP/3:¶
Setting identifiers of the format 0x1f * N + 0x21 for integer values of N are reserved to exercise the requirement that unknown identifiers be ignored. Such settings have no defined meaning. Endpoints SHOULD include at least one such setting in their SETTINGS frame. Endpoints MUST NOT consider such settings to have any meaning upon receipt.¶
Because the setting has no defined meaning, the value of the setting can be any -value the implementation selects.¶
-Additional settings can be defined by extensions to HTTP/3; see Section 9 -for more details.¶
-An HTTP implementation MUST NOT send frames or requests which would be invalid -based on its current understanding of the peer's settings.¶
-All settings begin at an initial value. Each endpoint SHOULD use these initial -values to send messages before the peer's SETTINGS frame has arrived, as packets -carrying the settings can be lost or delayed. When the SETTINGS frame arrives, -any settings are changed to their new values.¶
-This removes the need to wait for the SETTINGS frame before sending messages. -Endpoints MUST NOT require any data to be received from the peer prior to -sending the SETTINGS frame; settings MUST be sent as soon as the transport is -ready to send data.¶
-For servers, the initial value of each client setting is the default value.¶
-For clients using a 1-RTT QUIC connection, the initial value of each server -setting is the default value. 1-RTT keys will always become available prior to -SETTINGS arriving, even if the server sends SETTINGS immediately. Clients SHOULD -NOT wait indefinitely for SETTINGS to arrive before sending requests, but SHOULD -process received datagrams in order to increase the likelihood of processing -SETTINGS before sending the first request.¶
-When a 0-RTT QUIC connection is being used, the initial value of each server -setting is the value used in the previous session. Clients SHOULD store the -settings the server provided in the connection where resumption information was -provided, but MAY opt not to store settings in certain cases (e.g., if the -session ticket is received before the SETTINGS frame). A client MUST comply with -stored settings - or default values, if no values are stored - when attempting -0-RTT. Once a server has provided new settings, clients MUST comply with those -values.¶
-A server can remember the settings that it advertised, or store an -integrity-protected copy of the values in the ticket and recover the information -when accepting 0-RTT data. A server uses the HTTP/3 settings values in -determining whether to accept 0-RTT data. If the server cannot determine that -the settings remembered by a client are compatible with its current settings, it -MUST NOT accept 0-RTT data. Remembered settings are compatible if a client -complying with those settings would not violate the server's current settings.¶
-A server MAY accept 0-RTT and subsequently provide different settings in its -SETTINGS frame. If 0-RTT data is accepted by the server, its SETTINGS frame MUST -NOT reduce any limits or alter any values that might be violated by the client -with its 0-RTT data. The server MUST include all settings which differ from -their default values. If a server accepts 0-RTT but then sends settings that -are not compatible with the previously specified settings, this MUST be treated -as a connection error of type H3_SETTINGS_ERROR. If a server accepts 0-RTT but -then sends a SETTINGS frame that omits a setting value that the client -understands (apart from reserved setting identifiers) that was previously -specified to have a non-default value, this MUST be treated as a connection -error of type H3_SETTINGS_ERROR.¶
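One way to realize the compatibility rule is sketched below, under the simplifying assumption that every setting involved is an upper limit the client must stay within; real settings may need per-identifier rules, and `compatible` is an illustrative name:

```python
def compatible(remembered, current, defaults=None):
    """True if a client complying with `remembered` cannot violate `current`.

    Assumes each setting is an upper limit: compatibility then means no
    limit shrank relative to what the client remembers. Settings absent
    from `current` fall back to their defaults (assumed 0 if unlisted).
    """
    defaults = defaults or {}
    for ident, old_limit in remembered.items():
        new_limit = current.get(ident, defaults.get(ident, 0))
        if new_limit < old_limit:
            return False
    return True
```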
-The PUSH_PROMISE frame (type=0x5) is used to carry a promised request header -set from server to client on a request stream, as in HTTP/2.¶
-The payload consists of:¶
-A server MUST NOT use a Push ID that is larger than the client has provided in a -MAX_PUSH_ID frame (Section 7.2.7). A client MUST treat receipt of a -PUSH_PROMISE frame that contains a larger Push ID than the client has advertised -as a connection error of H3_ID_ERROR.¶
-A server MAY use the same Push ID in multiple PUSH_PROMISE frames. If so, the -decompressed request header sets MUST contain the same fields in the same -order, and both the name and the value in each field MUST be exact -matches. Clients SHOULD compare the request header sets for resources promised -multiple times. If a client receives a Push ID that has already been promised -and detects a mismatch, it MUST respond with a connection error of type -H3_GENERAL_PROTOCOL_ERROR. If the decompressed header sets match exactly, the -client SHOULD associate the pushed content with each stream on which -a PUSH_PROMISE was received.¶
-Allowing duplicate references to the same Push ID is primarily to reduce -duplication caused by concurrent requests. A server SHOULD avoid reusing a Push -ID over a long period. Clients are likely to consume server push responses and -not retain them for reuse over time. Clients that see a PUSH_PROMISE that uses -a Push ID that they have already consumed and discarded are forced to ignore the -PUSH_PROMISE.¶
-If a PUSH_PROMISE frame is received on the control stream, the client MUST -respond with a connection error (Section 8) of type H3_FRAME_UNEXPECTED.¶
-A client MUST NOT send a PUSH_PROMISE frame. A server MUST treat the receipt -of a PUSH_PROMISE frame as a connection error of type H3_FRAME_UNEXPECTED.¶
-See Section 4.4 for a description of the overall server push mechanism.¶
-The GOAWAY frame (type=0x7) is used to initiate graceful shutdown of a -connection by either endpoint. GOAWAY allows an endpoint to stop accepting new -requests or pushes while still finishing processing of previously received -requests and pushes. This enables administrative actions, like server -maintenance. GOAWAY by itself does not close a connection.¶
-The GOAWAY frame is always sent on the control stream. In the server to client -direction, it carries a QUIC Stream ID for a client-initiated bidirectional -stream encoded as a variable-length integer. A client MUST treat receipt of a -GOAWAY frame containing a Stream ID of any other type as a connection error of -type H3_ID_ERROR.¶
-In the client to server direction, the GOAWAY frame carries a Push ID encoded as -a variable-length integer.¶
-The GOAWAY frame applies to the connection, not a specific stream. A client -MUST treat a GOAWAY frame on a stream other than the control stream as a -connection error (Section 8) of type H3_FRAME_UNEXPECTED.¶
-See Section 5.2 for more information on the use of the GOAWAY frame.¶
-The MAX_PUSH_ID frame (type=0xD) is used by clients to control the number of -server pushes that the server can initiate. This sets the maximum value for a -Push ID that the server can use in PUSH_PROMISE and CANCEL_PUSH frames. -Consequently, this also limits the number of push streams that the server can -initiate in addition to the limit maintained by the QUIC transport.¶
-The MAX_PUSH_ID frame is always sent on the control stream. Receipt of a -MAX_PUSH_ID frame on any other stream MUST be treated as a connection error of -type H3_FRAME_UNEXPECTED.¶
-A server MUST NOT send a MAX_PUSH_ID frame. A client MUST treat the receipt of -a MAX_PUSH_ID frame as a connection error of type H3_FRAME_UNEXPECTED.¶
-The maximum Push ID is unset when a connection is created, meaning that a server -cannot push until it receives a MAX_PUSH_ID frame. A client that wishes to -manage the number of promised server pushes can increase the maximum Push ID by -sending MAX_PUSH_ID frames as the server fulfills or cancels server pushes.¶
-The MAX_PUSH_ID frame carries a single variable-length integer that identifies -the maximum value for a Push ID that the server can use (see -Section 7.2.5). A MAX_PUSH_ID frame cannot reduce the maximum Push ID; -receipt of a MAX_PUSH_ID that contains a smaller value than previously received -MUST be treated as a connection error of type H3_ID_ERROR.¶
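Client-side bookkeeping for this frame might look like the following sketch; `PushCredit` is an illustrative name, and the server performs the mirrored checks:

```python
class PushCredit:
    """Tracks the maximum Push ID a client has granted via MAX_PUSH_ID."""

    def __init__(self):
        self.max_push_id = None   # unset at connection start: no pushes yet

    def grant(self, new_max):
        """Raise the limit; the frame's value must never decrease."""
        if self.max_push_id is not None and new_max < self.max_push_id:
            raise ValueError("MAX_PUSH_ID reduced (H3_ID_ERROR)")
        self.max_push_id = new_max

    def allows(self, push_id):
        """Whether the server may use this Push ID in a promise."""
        return self.max_push_id is not None and push_id <= self.max_push_id
```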
Frame types of the format 0x1f * N + 0x21 for integer values of N are reserved to exercise the requirement that unknown types be ignored (Section 9). These frames have no semantics, and can be sent on any open stream when application-layer padding is desired. They MAY also be sent on connections where no data is currently being transferred. Endpoints MUST NOT consider these frames to have any meaning upon receipt.¶
The payload and length of the frames are selected in any manner the -implementation chooses.¶
-Frame types which were used in HTTP/2 where there is no corresponding HTTP/3 -frame have also been reserved (Section 11.2.1). These frame types MUST NOT be -sent, and receipt MAY be treated as an error of type H3_FRAME_UNEXPECTED.¶
-QUIC allows the application to abruptly terminate (reset) individual streams or -the entire connection when an error is encountered. These are referred to as -"stream errors" or "connection errors" and are described in more detail in -[QUIC-TRANSPORT].¶
-An endpoint MAY choose to treat a stream error as a connection error under -certain circumstances. Implementations need to consider the impact on -outstanding requests before making this choice.¶
-Because new error codes can be defined without negotiation (see Section 9), -use of an error code in an unexpected context or receipt of an unknown error -code MUST be treated as equivalent to H3_NO_ERROR. However, closing a stream -can have other effects regardless of the error code (see Section 4.1).¶
-This section describes HTTP/3-specific error codes which can be used to express -the cause of a connection or stream error.¶
-The following error codes are defined for use when abruptly terminating streams, -aborting reading of streams, or immediately closing connections.¶
-Error codes of the format 0x1f * N + 0x21 for non-negative integer values of N are reserved to exercise the requirement that unknown error codes be treated as equivalent to H3_NO_ERROR (Section 9). Implementations SHOULD select an error code from this space with some probability when they would have sent H3_NO_ERROR.¶
HTTP/3 permits extension of the protocol. Within the limitations described in -this section, protocol extensions can be used to provide additional services or -alter any aspect of the protocol. Extensions are effective only within the -scope of a single HTTP/3 connection.¶
-This applies to the protocol elements defined in this document. This does not -affect the existing options for extending HTTP, such as defining new methods, -status codes, or header fields.¶
-Extensions are permitted to use new frame types (Section 7.2), new settings -(Section 7.2.4.1), new error codes (Section 8), or new unidirectional -stream types (Section 6.2). Registries are established for -managing these extension points: frame types (Section 11.2.1), settings -(Section 11.2.2), error codes (Section 11.2.3), and stream types -(Section 11.2.4).¶
-Implementations MUST ignore unknown or unsupported values in all extensible -protocol elements. Implementations MUST discard frames and unidirectional -streams that have unknown or unsupported types. This means that any of these -extension points can be safely used by extensions without prior arrangement or -negotiation. However, where a known frame type is required to be in a specific -location, such as the SETTINGS frame as the first frame of the control stream -(see Section 6.2.1), an unknown frame type does not satisfy that -requirement and SHOULD be treated as an error.¶
-Extensions that could change the semantics of existing protocol components MUST -be negotiated before being used. For example, an extension that changes the -layout of the HEADERS frame cannot be used until the peer has given a positive -signal that this is acceptable. Coordinating when such a revised layout comes -into effect could prove complex. As such, allocating new identifiers for -new definitions of existing protocol elements is likely to be more effective.¶
-This document doesn't mandate a specific method for negotiating the use of an -extension but notes that a setting (Section 7.2.4.1) could be used for -that purpose. If both peers set a value that indicates willingness to use the -extension, then the extension can be used. If a setting is used for extension -negotiation, the default value MUST be defined in such a fashion that the -extension is disabled if the setting is omitted.¶
-The security considerations of HTTP/3 should be comparable to those of HTTP/2 -with TLS; the considerations from Section 10 of [HTTP2] apply in addition to -those listed here.¶
-TODO: This is going to be a big import, probably worthy of its own PR.¶
-When HTTP Alternative Services is used for discovery for HTTP/3 endpoints, the -security considerations of [ALTSVC] also apply.¶
-Where HTTP/2 employs PADDING frames and Padding fields in other frames to make a -connection more resistant to traffic analysis, HTTP/3 can either rely on -transport-layer padding or employ the reserved frame and stream types discussed -in Section 7.2.8 and Section 6.2.3. These methods of padding produce -different results in terms of the granularity of padding, how padding is -arranged in relation to the information that is being protected, whether -padding is applied in the case of packet loss, and how an implementation might -control padding.¶
-Several protocol elements contain nested length elements, typically in the form -of frames with an explicit length containing variable-length integers. This -could pose a security risk to an incautious implementer. An implementation MUST -ensure that the length of a frame exactly matches the length of the fields it -contains.¶
-The use of 0-RTT with HTTP/3 creates an exposure to replay attack. The -anti-replay mitigations in [HTTP-REPLAY] MUST be applied when using -HTTP/3 with 0-RTT.¶
-Certain HTTP implementations use the client address for logging or -access-control purposes. Since a QUIC client's address might change during a -connection (and future versions might support simultaneous use of multiple -addresses), such implementations will need to either actively retrieve the -client's current address or addresses when they are relevant or explicitly -accept that the original address might change.¶
-This document registers a new ALPN protocol ID (Section 11.1) and creates new -registries that manage the assignment of codepoints in HTTP/3.¶
-This document creates a new registration for the identification of -HTTP/3 in the "Application Layer Protocol Negotiation (ALPN) -Protocol IDs" registry established in [RFC7301].¶
-The "h3" string identifies HTTP/3:¶
- -New registries created in this document operate under the QUIC registration -policy documented in Section 22.1 of [QUIC-TRANSPORT]. These registries all -include the common set of fields listed in Section 22.1.1 of [QUIC-TRANSPORT].¶
-The initial allocations in these registries created in this document are all -assigned permanent status and list as contact both the IESG (iesg@ietf.org) and -the HTTP working group (ietf-http-wg@w3.org).¶
-This document establishes a registry for HTTP/3 frame type codes. The "HTTP/3 -Frame Type" registry governs a 62-bit space. This registry follows the QUIC -registry policy; see Section 11.2. Permanent registrations in this registry -are assigned using the Specification Required policy [RFC8126], except for -values between 0x00 and 0x3f (in hexadecimal; inclusive), which are assigned -using Standards Action or IESG Approval as defined in Section 4.9 and 4.10 of -[RFC8126].¶
-While this registry is separate from the "HTTP/2 Frame Type" registry defined in -[HTTP2], it is preferable that the assignments parallel each other where the -code spaces overlap. If an entry is present in only one registry, every effort -SHOULD be made to avoid assigning the corresponding value to an unrelated -operation.¶
-In addition to common fields as described in Section 11.2, permanent -registrations in this registry MUST include the following field:¶
-Specifications of frame types MUST include a description of the frame layout and -its semantics, including any parts of the frame that are conditionally present.¶
-The entries in Table 2 are registered by this document.¶
| Frame Type   | Value | Specification   |
|--------------|-------|-----------------|
| DATA         | 0x0   | Section 7.2.1   |
| HEADERS      | 0x1   | Section 7.2.2   |
| Reserved     | 0x2   | N/A             |
| CANCEL_PUSH  | 0x3   | Section 7.2.3   |
| SETTINGS     | 0x4   | Section 7.2.4   |
| PUSH_PROMISE | 0x5   | Section 7.2.5   |
| Reserved     | 0x6   | N/A             |
| GOAWAY       | 0x7   | Section 7.2.6   |
| Reserved     | 0x8   | N/A             |
| Reserved     | 0x9   | N/A             |
| MAX_PUSH_ID  | 0xD   | Section 7.2.7   |
Additionally, each code of the format 0x1f * N + 0x21 for non-negative integer values of N (that is, 0x21, 0x40, ..., through 0x3FFFFFFFFFFFFFFE) MUST NOT be assigned by IANA.¶
This document establishes a registry for HTTP/3 settings. The "HTTP/3 Settings" -registry governs a 62-bit space. This registry follows the QUIC registry -policy; see Section 11.2. Permanent registrations in this registry are -assigned using the Specification Required policy [RFC8126], except for values -between 0x00 and 0x3f (in hexadecimal; inclusive), which are assigned using -Standards Action or IESG Approval as defined in Section 4.9 and 4.10 of -[RFC8126].¶
-While this registry is separate from the "HTTP/2 Settings" registry defined in -[HTTP2], it is preferable that the assignments parallel each other. If an -entry is present in only one registry, every effort SHOULD be made to avoid -assigning the corresponding value to an unrelated operation.¶
-In addition to common fields as described in Section 11.2, permanent -registrations in this registry MUST include the following fields:¶
-The entries in Table 3 are registered by this document.¶
| Setting Name         | Value | Specification   | Default   |
|----------------------|-------|-----------------|-----------|
| Reserved             | 0x2   | N/A             | N/A       |
| Reserved             | 0x3   | N/A             | N/A       |
| Reserved             | 0x4   | N/A             | N/A       |
| Reserved             | 0x5   | N/A             | N/A       |
| MAX_HEADER_LIST_SIZE | 0x6   | Section 7.2.4.1 | Unlimited |
Additionally, each code of the format 0x1f * N + 0x21 for non-negative integer values of N (that is, 0x21, 0x40, ..., through 0x3FFFFFFFFFFFFFFE) MUST NOT be assigned by IANA.¶
This document establishes a registry for HTTP/3 error codes. The "HTTP/3 Error -Code" registry manages a 62-bit space. This registry follows the QUIC registry -policy; see Section 11.2. Permanent registrations in this registry are -assigned using the Specification Required policy [RFC8126], except for values -between 0x00 and 0x3f (in hexadecimal; inclusive), which are assigned using -Standards Action or IESG Approval as defined in Section 4.9 and 4.10 of -[RFC8126].¶
-Registrations for error codes are required to include a description -of the error code. An expert reviewer is advised to examine new -registrations for possible duplication with existing error codes. -Use of existing registrations is to be encouraged, but not mandated.¶
-In addition to common fields as described in Section 11.2, permanent -registrations in this registry MUST include the following fields:¶
-The entries in Table 4 are registered by this document.¶
| Name                      | Value  | Description                              | Specification |
|---------------------------|--------|------------------------------------------|---------------|
| H3_NO_ERROR               | 0x0100 | No error                                 | Section 8.1   |
| H3_GENERAL_PROTOCOL_ERROR | 0x0101 | General protocol error                   | Section 8.1   |
| H3_INTERNAL_ERROR         | 0x0102 | Internal error                           | Section 8.1   |
| H3_STREAM_CREATION_ERROR  | 0x0103 | Stream creation error                    | Section 8.1   |
| H3_CLOSED_CRITICAL_STREAM | 0x0104 | Critical stream was closed               | Section 8.1   |
| H3_FRAME_UNEXPECTED       | 0x0105 | Frame not permitted in the current state | Section 8.1   |
| H3_FRAME_ERROR            | 0x0106 | Frame violated layout or size rules      | Section 8.1   |
| H3_EXCESSIVE_LOAD         | 0x0107 | Peer generating excessive load           | Section 8.1   |
| H3_ID_ERROR               | 0x0108 | An identifier was used incorrectly       | Section 8.1   |
| H3_SETTINGS_ERROR         | 0x0109 | SETTINGS frame contained invalid values  | Section 8.1   |
| H3_MISSING_SETTINGS       | 0x010A | No SETTINGS frame received               | Section 8.1   |
| H3_REQUEST_REJECTED       | 0x010B | Request not processed                    | Section 8.1   |
| H3_REQUEST_CANCELLED      | 0x010C | Data no longer needed                    | Section 8.1   |
| H3_REQUEST_INCOMPLETE     | 0x010D | Stream terminated early                  | Section 8.1   |
| H3_CONNECT_ERROR          | 0x010F | TCP reset or error on CONNECT request    | Section 8.1   |
| H3_VERSION_FALLBACK       | 0x0110 | Retry over HTTP/1.1                      | Section 8.1   |
Additionally, each code of the format 0x1f * N + 0x21 for non-negative integer values of N (that is, 0x21, 0x40, ..., through 0x3FFFFFFFFFFFFFFE) MUST NOT be assigned by IANA.¶
This document establishes a registry for HTTP/3 unidirectional stream types. The -"HTTP/3 Stream Type" registry governs a 62-bit space. This registry follows the -QUIC registry policy; see Section 11.2. Permanent registrations in this -registry are assigned using the Specification Required policy [RFC8126], -except for values between 0x00 and 0x3f (in hexadecimal; inclusive), which are -assigned using Standards Action or IESG Approval as defined in Section 4.9 and -4.10 of [RFC8126].¶
-In addition to common fields as described in Section 11.2, permanent -registrations in this registry MUST include the following fields:¶
-Specifications for permanent registrations MUST include a description of the -stream type, including the layout semantics of the stream contents.¶
-The entries in the following table are registered by this document.¶
| Stream Type    | Value | Specification | Sender |
|----------------|-------|---------------|--------|
| Control Stream | 0x00  | Section 6.2.1 | Both   |
| Push Stream    | 0x01  | Section 4.4   | Server |
Additionally, each code of the format 0x1f * N + 0x21 for non-negative integer values of N (that is, 0x21, 0x40, ..., through 0x3FFFFFFFFFFFFFFE) MUST NOT be assigned by IANA.¶
HTTP/3 is strongly informed by HTTP/2, and bears many similarities. This -section describes the approach taken to design HTTP/3, points out important -differences from HTTP/2, and describes how to map HTTP/2 extensions into HTTP/3.¶
-HTTP/3 begins from the premise that similarity to HTTP/2 is preferable, but not a hard requirement. HTTP/3 departs from HTTP/2 where QUIC differs from TCP, either to take advantage of QUIC features (like streams) or to accommodate important shortcomings (such as a lack of total ordering). These differences make HTTP/3 similar to HTTP/2 in key aspects, such as the relationship of requests and responses to streams. However, the details of the HTTP/3 design are substantially different from those of HTTP/2.¶
-These departures are noted in this section.¶
-HTTP/3 permits use of a larger number of streams (2^62-1) than HTTP/2. The -considerations about exhaustion of stream identifier space apply, though the -space is significantly larger such that it is likely that other limits in QUIC -are reached first, such as the limit on the connection flow control window.¶
-In contrast to HTTP/2, stream concurrency in HTTP/3 is managed by QUIC. QUIC -considers a stream closed when all data has been received and sent data has been -acknowledged by the peer. HTTP/2 considers a stream closed when the frame -containing the END_STREAM bit has been committed to the transport. As a result, -the stream for an equivalent exchange could remain "active" for a longer period -of time. HTTP/3 servers might choose to permit a larger number of concurrent -client-initiated bidirectional streams to achieve equivalent concurrency to -HTTP/2, depending on the expected usage patterns.¶
-Due to the presence of other unidirectional stream types, HTTP/3 does not rely -exclusively on the number of concurrent unidirectional streams to control the -number of concurrent in-flight pushes. Instead, HTTP/3 clients use the -MAX_PUSH_ID frame to control the number of pushes received from an HTTP/3 -server.¶
-Many framing concepts from HTTP/2 can be elided on QUIC, because the transport -deals with them. Because frames are already on a stream, they can omit the -stream number. Because frames do not block multiplexing (QUIC's multiplexing -occurs below this layer), the support for variable-maximum-length packets can be -removed. Because stream termination is handled by QUIC, an END_STREAM flag is -not required. This permits the removal of the Flags field from the generic -frame layout.¶
-Frame payloads are largely drawn from [HTTP2]. However, QUIC includes many -features (e.g., flow control) which are also present in HTTP/2. In these cases, -the HTTP mapping does not re-implement them. As a result, several HTTP/2 frame -types are not required in HTTP/3. Where an HTTP/2-defined frame is no longer -used, the frame ID has been reserved in order to maximize portability between -HTTP/2 and HTTP/3 implementations. However, even equivalent frames between the -two mappings are not identical.¶
-Many of the differences arise from the fact that HTTP/2 provides an absolute -ordering between frames across all streams, while QUIC provides this guarantee -on each stream only. As a result, if a frame type makes assumptions that frames -from different streams will still be received in the order sent, HTTP/3 will -break them.¶
-Some examples of feature adaptations are described below, as well as general -guidance to extension frame implementors converting an HTTP/2 extension to -HTTP/3.¶
-HTTP/2 specifies priority assignments in PRIORITY frames and (optionally) in -HEADERS frames. HTTP/3 does not provide a means of signaling priority.¶
-Note that while there is no explicit signaling for priority, this does not mean -that prioritization is not important for achieving good performance.¶
-HPACK was designed with the assumption of in-order delivery. A sequence of -encoded header blocks must arrive (and be decoded) at an endpoint in the same -order in which they were encoded. This ensures that the dynamic state at the two -endpoints remains in sync.¶
-Because this total ordering is not provided by QUIC, HTTP/3 uses a modified -version of HPACK, called QPACK. QPACK uses a single unidirectional stream to -make all modifications to the dynamic table, ensuring a total order of updates. -All frames which contain encoded headers merely reference the table state at a -given time without modifying it.¶
- -Frame type definitions in HTTP/3 often use the QUIC variable-length integer -encoding. In particular, Stream IDs use this encoding, which allows for a -larger range of possible values than the encoding used in HTTP/2. Some frames -in HTTP/3 use an identifier rather than a Stream ID (e.g., Push -IDs). Redefinition of the encoding of extension frame types might be necessary -if the encoding includes a Stream ID.¶
-Because the Flags field is not present in generic HTTP/3 frames, those frames -which depend on the presence of flags need to allocate space for flags as part -of their frame payload.¶
-Other than these issues, frame type HTTP/2 extensions are typically portable to QUIC simply by replacing Stream 0 in HTTP/2 with a control stream in HTTP/3. HTTP/3 extensions will not assume ordering, but would not be harmed by ordering, and would be portable to HTTP/2 in the same manner.¶
-Frame types defined by extensions to HTTP/2 need to be separately registered for -HTTP/3 if still applicable. The IDs of frames defined in [HTTP2] have been -reserved for simplicity. Note that the frame type space in HTTP/3 is -substantially larger (62 bits versus 8 bits), so many HTTP/3 frame types have no -equivalent HTTP/2 code points. See Section 11.2.1.¶
-An important difference from HTTP/2 is that settings are sent once, as the first -frame of the control stream, and thereafter cannot change. This eliminates many -corner cases around synchronization of changes.¶
-Some transport-level options that HTTP/2 specifies via the SETTINGS frame are -superseded by QUIC transport parameters in HTTP/3. The HTTP-level options that -are retained in HTTP/3 have the same value as in HTTP/2.¶
-Below is a listing of how each HTTP/2 SETTINGS parameter is mapped:¶
-In HTTP/3, setting values are variable-length integers (6, 14, 30, or 62 bits -long) rather than fixed-length 32-bit fields as in HTTP/2. This will often -produce a shorter encoding, but can produce a longer encoding for settings which -use the full 32-bit space. Settings ported from HTTP/2 might choose to redefine -their value to limit it to 30 bits for more efficient encoding, or to make use -of the 62-bit space if more than 30 bits are required.¶
-Settings need to be defined separately for HTTP/2 and HTTP/3. The IDs of -settings defined in [HTTP2] have been reserved for simplicity. Note that -the settings identifier space in HTTP/3 is substantially larger (62 bits versus -16 bits), so many HTTP/3 settings have no equivalent HTTP/2 code point. See -Section 11.2.2.¶
-As QUIC streams might arrive out-of-order, endpoints are advised to not wait for -the peers' settings to arrive before responding to other streams. See -Section 7.2.4.2.¶
-QUIC has the same concepts of "stream" and "connection" errors that HTTP/2 -provides. However, there is no direct portability of HTTP/2 error codes to -HTTP/3 error codes; the values are shifted in order to prevent accidental -or implicit conversion.¶
-The HTTP/2 error codes defined in Section 7 of [HTTP2] logically map to -the HTTP/3 error codes as follows:¶
-Error codes need to be defined for HTTP/2 and HTTP/3 separately. See -Section 11.2.3.¶
-Further changes to error codes (#2662,#2551):¶
-http-opportunistic resource (RFC 8164) when scheme is http (#2439, #2973)¶
-Changes to SETTINGS frames in 0-RTT (#2972,#2790,#2945):¶
-No changes¶
-Extensive changes to error codes and conditions of their sending¶
-Use variable-length integers throughout (#2437,#2233,#2253,#2275)¶
- -Changes to PRIORITY frame (#1865, #2075)¶
- -Substantial editorial reorganization; no technical changes.¶
-None.¶
-SETTINGS changes (#181):¶
- -The original authors of this specification were Robbie Shade and Mike Warres.¶
-A substantial portion of Mike's contribution was supported by Microsoft during -his employment there.¶
-Internet-Draft | -QUIC Invariants | -March 2020 | -
Thomson | -Expires 22 September 2020 | -[Page] | -
This document defines the properties of the QUIC transport protocol that are -expected to remain unchanged over time as new versions of the protocol are -developed.¶
-Discussion of this draft takes place on the QUIC working group mailing list -(quic@ietf.org), which is archived at -https://mailarchive.ietf.org/arch/search/?email_list=quic.¶
-Working Group information can be found at https://github.com/quicwg; source -code and issues list for this draft can be found at -https://github.com/quicwg/base-drafts/labels/-invariants.¶
-- This Internet-Draft is submitted in full conformance with the - provisions of BCP 78 and BCP 79.¶
-- Internet-Drafts are working documents of the Internet Engineering Task - Force (IETF). Note that other groups may also distribute working - documents as Internet-Drafts. The list of current Internet-Drafts is - at https://datatracker.ietf.org/drafts/current/.¶
-- Internet-Drafts are draft documents valid for a maximum of six months - and may be updated, replaced, or obsoleted by other documents at any - time. It is inappropriate to use Internet-Drafts as reference - material or to cite them other than as "work in progress."¶
-- This Internet-Draft will expire on 22 September 2020.¶
-- Copyright (c) 2020 IETF Trust and the persons identified as the - document authors. All rights reserved.¶
-- This document is subject to BCP 78 and the IETF Trust's Legal - Provisions Relating to IETF Documents - (https://trustee.ietf.org/license-info) in effect on the date of - publication of this document. Please review these documents - carefully, as they describe your rights and restrictions with - respect to this document. Code Components extracted from this - document must include Simplified BSD License text as described in - Section 4.e of the Trust Legal Provisions and are provided without - warranty as described in the Simplified BSD License.¶
-In addition to providing secure, multiplexed transport, QUIC [QUIC-TRANSPORT] -includes the ability to negotiate a version. This allows the protocol to change -over time in response to new requirements. Many characteristics of the protocol -will change between versions.¶
-This document describes the subset of QUIC that is intended to remain stable as -new versions are developed and deployed. All of these invariants are -IP-version-independent.¶
-The primary goal of this document is to ensure that it is possible to deploy new -versions of QUIC. By documenting the properties that can't change, this -document aims to preserve the ability to change any other aspect of the -protocol. Thus, unless specifically described in this document, any aspect of -the protocol can change between different versions.¶
-Appendix A is a non-exhaustive list of some incorrect assumptions that -might be made based on knowledge of QUIC version 1; these do not apply to every -version of QUIC.¶
-The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", -"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this -document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] -when, and only when, they appear in all capitals, as shown here.¶
-This document uses terms and notational conventions from [QUIC-TRANSPORT].¶
-QUIC is a connection-oriented protocol between two endpoints. Those endpoints -exchange UDP datagrams. These UDP datagrams contain QUIC packets. QUIC -endpoints use QUIC packets to establish a QUIC connection, which is shared -protocol state between those endpoints.¶
-A QUIC packet is the content of the UDP datagrams exchanged by QUIC endpoints. -This document describes the contents of those datagrams.¶
-QUIC defines two types of packet header: long and short. Packets with long -headers are identified by the most significant bit of the first byte being set; -packets with a short header have that bit cleared.¶
-Aside from the values described here, the payload of QUIC packets is -version-specific and of arbitrary length.¶
-Long headers take the form described in Figure 1. Bits that have -version-specific semantics are marked with an X.¶
-A QUIC packet with a long header has the high bit of the first byte set to 1. -All other bits in that byte are version specific.¶
-The next four bytes include a 32-bit Version field (see Section 4.4).¶
-The next byte contains the length in bytes of the Destination Connection ID (see -Section 4.3) field that follows it. This length is encoded as an 8-bit -unsigned integer. The Destination Connection ID field follows the DCID Len -field and is between 0 and 255 bytes in length.¶
-The next byte contains the length in bytes of the Source Connection ID field that follows it. This length is encoded as an 8-bit unsigned integer. The Source Connection ID field follows the SCID Len field and is between 0 and 255 bytes in length.¶
-The remainder of the packet contains version-specific content.¶
-Short headers take the form described in Figure 2. Bits that have -version-specific semantics are marked with an X.¶
-A QUIC packet with a short header has the high bit of the first byte set to 0.¶
-A QUIC packet with a short header includes a Destination Connection ID -immediately following the first byte. The short header does not include the -Connection ID Lengths, Source Connection ID, or Version fields. The length of -the Destination Connection ID is not specified in packets with a short header -and is not constrained by this specification.¶
-The remainder of the packet has version-specific semantics.¶
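The invariant portions of the two header forms described above can be extracted without knowing the QUIC version. The following is a sketch, not a full parser; it performs no bounds checking and, for short headers, can recover nothing beyond the first byte since the DCID length is not carried in the packet.

```python
# Sketch: parse only the version-invariant parts of a QUIC packet.
def parse_invariant_header(datagram):
    first = datagram[0]
    if first & 0x80:  # long header: high bit of the first byte is set
        version = int.from_bytes(datagram[1:5], "big")
        dcid_len = datagram[5]                      # 8-bit DCID Len
        dcid = datagram[6:6 + dcid_len]
        scid_len = datagram[6 + dcid_len]           # 8-bit SCID Len
        scid_start = 7 + dcid_len
        scid = datagram[scid_start:scid_start + scid_len]
        return {"long": True, "version": version, "dcid": dcid, "scid": scid}
    # Short header: only the leading byte is invariant; the Destination
    # Connection ID that follows has no length field, so its extent is
    # unknown without version- or deployment-specific knowledge.
    return {"long": False}
```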
-A connection ID is an opaque field of arbitrary length.¶
-The primary function of a connection ID is to ensure that changes in addressing -at lower protocol layers (UDP, IP, and below) don't cause packets for a QUIC -connection to be delivered to the wrong endpoint. The connection ID is used by -endpoints and the intermediaries that support them to ensure that each QUIC -packet can be delivered to the correct instance of an endpoint. At the -endpoint, the connection ID is used to identify which QUIC connection the packet -is intended for.¶
-The connection ID is chosen by each endpoint using version-specific methods. -Packets for the same QUIC connection might use different connection ID values.¶
-QUIC versions are identified with a 32-bit integer, encoded in network byte -order. Version 0 is reserved for version negotiation (see -Section 5). All other version numbers are potentially valid.¶
-The properties described in this document apply to all versions of QUIC. A -protocol that does not conform to the properties described in this document is -not QUIC. Future documents might describe additional properties which apply to -a specific QUIC version, or to a range of QUIC versions.¶
-A QUIC endpoint that receives a packet with a long header and a version it -either does not understand or does not support might send a Version Negotiation -packet in response. Packets with a short header do not trigger version -negotiation.¶
-A Version Negotiation packet sets the high bit of the first byte, and thus it -conforms with the format of a packet with a long header as defined in -Section 4.1. A Version Negotiation packet is identifiable as such by the -Version field, which is set to 0x00000000.¶
-The Version Negotiation packet contains a list of Supported Version fields, each -identifying a version that the endpoint sending the packet supports. The -Supported Version fields follow the Version field. A Version Negotiation packet -contains no other fields. An endpoint MUST ignore a packet that contains no -Supported Version fields, or a truncated Supported Version.¶
-Version Negotiation packets do not use integrity or confidentiality protection. -A specific QUIC version might authenticate the packet as part of its connection -establishment process.¶
-An endpoint MUST include the value from the Source Connection ID field of the -packet it receives in the Destination Connection ID field. The value for Source -Connection ID MUST be copied from the Destination Connection ID of the received -packet, which is initially randomly selected by a client. Echoing both -connection IDs gives clients some assurance that the server received the packet -and that the Version Negotiation packet was not generated by an off-path -attacker.¶
-An endpoint that receives a Version Negotiation packet might change the version -that it decides to use for subsequent packets. The conditions under which an -endpoint changes QUIC version will depend on the version of QUIC that it -chooses.¶
-See [QUIC-TRANSPORT] for a more thorough description of how an endpoint that -supports QUIC version 1 generates and consumes a Version Negotiation packet.¶
-It is possible that middleboxes could use traits of a specific version of QUIC -and assume that when other versions of QUIC exhibit similar traits the same -underlying semantic is being expressed. There are potentially many such traits -(see Appendix A). Some effort has been made to either eliminate or -obscure some observable traits in QUIC version 1, but many of these remain. -Other QUIC versions might make different design decisions and so exhibit -different traits.¶
-The QUIC version number does not appear in all QUIC packets, which means that -reliably extracting information from a flow based on version-specific traits -requires that middleboxes retain state for every connection ID they see.¶
-The Version Negotiation packet described in this document is not -integrity-protected; it only has modest protection against insertion by off-path -attackers. QUIC versions MUST define a mechanism that authenticates the values -it contains.¶
-This document makes no request of IANA.¶
-There are several traits of QUIC version 1 [QUIC-TRANSPORT] that are not -protected from observation, but are nonetheless considered to be changeable when -a new version is deployed.¶
-This section lists a sampling of incorrect assumptions that might be made based on knowledge of QUIC version 1. Some of these statements are not even true for QUIC version 1. This is not an exhaustive list; it is intended to be illustrative only.¶
-The following statements are NOT guaranteed to be true for every QUIC version:¶
-Internet-Draft | -QPACK | -March 2020 | -
Krasic, et al. | -Expires 22 September 2020 | -[Page] | -
This specification defines QPACK, a compression format for efficiently -representing HTTP header fields, to be used in HTTP/3. This is a variation of -HPACK header compression that seeks to reduce head-of-line blocking.¶
-Discussion of this draft takes place on the QUIC working group mailing list -(quic@ietf.org), which is archived at -https://mailarchive.ietf.org/arch/search/?email_list=quic.¶
-Working Group information can be found at https://github.com/quicwg; source -code and issues list for this draft can be found at -https://github.com/quicwg/base-drafts/labels/-qpack.¶
-- This Internet-Draft is submitted in full conformance with the - provisions of BCP 78 and BCP 79.¶
-- Internet-Drafts are working documents of the Internet Engineering Task - Force (IETF). Note that other groups may also distribute working - documents as Internet-Drafts. The list of current Internet-Drafts is - at https://datatracker.ietf.org/drafts/current/.¶
-- Internet-Drafts are draft documents valid for a maximum of six months - and may be updated, replaced, or obsoleted by other documents at any - time. It is inappropriate to use Internet-Drafts as reference - material or to cite them other than as "work in progress."¶
-- This Internet-Draft will expire on 22 September 2020.¶
-- Copyright (c) 2020 IETF Trust and the persons identified as the - document authors. All rights reserved.¶
-- This document is subject to BCP 78 and the IETF Trust's Legal - Provisions Relating to IETF Documents - (https://trustee.ietf.org/license-info) in effect on the date of - publication of this document. Please review these documents - carefully, as they describe your rights and restrictions with - respect to this document. Code Components extracted from this - document must include Simplified BSD License text as described in - Section 4.e of the Trust Legal Provisions and are provided without - warranty as described in the Simplified BSD License.¶
-The QUIC transport protocol [QUIC-TRANSPORT] is designed to support HTTP -semantics, and its design subsumes many of the features of HTTP/2 [RFC7540]. -HTTP/2 uses HPACK [RFC7541] for header compression. If HPACK were used for -HTTP/3 [HTTP3], it would induce head-of-line blocking due to built-in -assumptions of a total ordering across frames on all streams.¶
-QPACK reuses core concepts from HPACK, but is redesigned to allow correctness in -the presence of out-of-order delivery, with flexibility for implementations to -balance between resilience against head-of-line blocking and optimal compression -ratio. The design goals are to closely approach the compression ratio of HPACK -with substantially less head-of-line blocking under the same loss conditions.¶
-The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", -"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this -document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] -when, and only when, they appear in all capitals, as shown here.¶
-Definitions of terms that are used in this document:¶
-QPACK is a name, not an acronym.¶
-Diagrams use the format described in Section 3.1 of [RFC2360], with the -following additional conventions:¶
-Like HPACK, QPACK uses two tables for associating header fields to indices. The -static table (Section 3.1) is predefined and contains common header -fields (some of them with an empty value). The dynamic table -(Section 3.2) is built up over the course of the connection and can -be used by the encoder to index header fields in the encoded header lists.¶
-QPACK defines unidirectional streams for sending instructions from encoder to -decoder and vice versa.¶
-An encoder converts a header list into a header block by emitting either an -indexed or a literal representation for each header field in the list; see -Section 4.5. Indexed representations achieve high -compression by replacing the literal name and possibly the value with an index -to either the static or dynamic table. References to the static table and -literal representations do not require any dynamic state and never risk -head-of-line blocking. References to the dynamic table risk head-of-line -blocking if the encoder has not received an acknowledgement indicating the entry -is available at the decoder.¶
-An encoder MAY insert any entry in the dynamic table it chooses; it is not -limited to header fields it is compressing.¶
-QPACK preserves the ordering of header fields within each header list. An -encoder MUST emit header field representations in the order they appear in the -input header list.¶
-QPACK is designed to contain the more complex state tracking to the encoder, -while the decoder is relatively simple.¶
-Inserting entries into the dynamic table might not be possible if the table -contains entries which cannot be evicted.¶
-A dynamic table entry cannot be evicted immediately after insertion, even if it -has never been referenced. Once the insertion of a dynamic table entry has been -acknowledged and there are no outstanding references to the entry in -unacknowledged header blocks, the entry becomes evictable. Note that -references on the encoder stream never preclude the eviction of an entry, -because those references are guaranteed to be processed before the instruction -evicting the entry.¶
-If the dynamic table does not contain enough room for a new entry without -evicting other entries, and the entries which would be evicted are not -evictable, the encoder MUST NOT insert that entry into the dynamic table -(including duplicates of existing entries). In order to avoid this, an encoder -that uses the dynamic table has to keep track of each dynamic table entry -referenced by each header block until that header block is acknowledged by the -decoder (see Section 4.4.1).¶
-To ensure that the encoder is not prevented from adding new entries, the encoder -can avoid referencing entries that are close to eviction. Rather than -reference such an entry, the encoder can emit a Duplicate instruction -(Section 4.3.4), and reference the duplicate instead.¶
-Determining which entries are too close to eviction to reference is an encoder -preference. One heuristic is to target a fixed amount of available space in the -dynamic table: either unused space or space that can be reclaimed by evicting -non-blocking entries. To achieve this, the encoder can maintain a draining -index, which is the smallest absolute index (Section 3.2.4) in the dynamic table -that it will emit a reference for. As new entries are inserted, the encoder -increases the draining index to maintain the section of the table that it will -not reference. If the encoder does not create new references to entries with an -absolute index lower than the draining index, the number of unacknowledged -references to those entries will eventually become zero, allowing them to be -evicted.¶
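The draining-index heuristic described above can be sketched as follows. This is an illustrative Python sketch, not part of the specification; the function name and the representation of entries as a dict of absolute index to entry size are assumptions.¶

```python
def draining_index(entries, capacity, target_free):
    """Return the smallest absolute index the encoder will still reference.

    `entries` maps absolute index -> entry size (Section 3.2.1), oldest
    first; the heuristic reserves `target_free` bytes that are either
    unused or reclaimable by evicting the oldest (draining) entries.
    """
    used = sum(entries.values())
    # How many bytes must come from evicting draining entries.
    reclaim_needed = target_free - (capacity - used)
    drain = min(entries) if entries else 0
    reclaimed = 0
    for idx in sorted(entries):
        if reclaimed >= reclaim_needed:
            break
        # Mark this entry as draining: stop referencing it so its
        # outstanding reference count can fall to zero.
        reclaimed += entries[idx]
        drain = idx + 1
    return drain
```

Entries with an absolute index below the returned value are "draining": the encoder duplicates them instead of referencing them, so they eventually become evictable.¶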
-Because QUIC does not guarantee order between data on different streams, a -decoder might encounter a header block that references a dynamic table entry -that it has not yet received.¶
-Each header block contains a Required Insert Count (Section 4.5.1), the -lowest possible value for the Insert Count with which the header block can be -decoded. For a header block with references to the dynamic table, the Required -Insert Count is one larger than the largest absolute index of all referenced -dynamic table entries. For a header block with no references to the dynamic -table, the Required Insert Count is zero.¶
-When the decoder receives a header block with a Required Insert Count greater -than its own Insert Count, the stream cannot be processed immediately, and is -considered "blocked"; see Section 2.2.1.¶
-The decoder specifies an upper bound on the number of streams which can be -blocked using the SETTINGS_QPACK_BLOCKED_STREAMS setting; see Section 5. -An encoder MUST limit the number of streams which could become blocked to the -value of SETTINGS_QPACK_BLOCKED_STREAMS at all times. If a decoder encounters -more blocked streams than it promised to support, it MUST treat this as a -connection error of type QPACK_DECOMPRESSION_FAILED.¶
-Note that the decoder might not become blocked on every stream which risks -becoming blocked.¶
-An encoder can decide whether to risk having a stream become blocked. If -permitted by the value of SETTINGS_QPACK_BLOCKED_STREAMS, compression efficiency -can often be improved by referencing dynamic table entries that are still in -transit, but if there is loss or reordering the stream can become blocked at the -decoder. An encoder can avoid the risk of blocking by only referencing dynamic -table entries which have been acknowledged, but this could mean using -literals. Since literals make the header block larger, this can result in the -encoder becoming blocked on congestion or flow control limits.¶
-Writing instructions on streams that are limited by flow control can produce -deadlocks.¶
-A decoder might stop issuing flow control credit on the stream that carries a -header block until the necessary updates are received on the encoder -stream. If the granting of flow control credit on the encoder stream (or the -connection as a whole) depends on the consumption and release of data on the -stream carrying the header block, a deadlock might result.¶
-More generally, a stream containing a large instruction can become deadlocked if -the decoder withholds flow control credit until the instruction is completely -received.¶
-To avoid these deadlocks, an encoder SHOULD avoid writing an instruction unless -sufficient stream and connection flow control credit is available for the entire -instruction.¶
-The Known Received Count is the total number of dynamic table insertions and -duplications acknowledged by the decoder. The encoder tracks the Known Received -Count in order to identify which dynamic table entries can be referenced without -potentially blocking a stream. The decoder tracks the Known Received Count in -order to be able to send Insert Count Increment instructions.¶
-A Header Acknowledgement instruction (Section 4.4.1) implies that the decoder has received all dynamic table state necessary to process the corresponding header block. If the Required Insert Count of the acknowledged header block is greater than the current Known Received Count, Known Received Count is updated to the value of the Required Insert Count.¶
-An Insert Count Increment instruction (Section 4.4.3) increases the Known Received Count by its Increment parameter. See Section 2.2.2.3 for guidance.¶
-As in HPACK, the decoder processes header blocks and emits the corresponding -header lists. It also processes instructions received on the encoder stream that -modify the dynamic table. Note that header blocks and encoder stream -instructions arrive on separate streams. This is unlike HPACK, where header -blocks can contain instructions that modify the dynamic table, and there is no -dedicated stream of HPACK instructions.¶
-The decoder MUST emit header fields in the order their representations appear in -the input header block.¶
-Upon receipt of a header block, the decoder examines the Required Insert Count. -When the Required Insert Count is less than or equal to the decoder's Insert -Count, the header block can be processed immediately. Otherwise, the stream on -which the header block was received becomes blocked.¶
-While blocked, header block data SHOULD remain in the blocked stream's flow -control window. A stream becomes unblocked when the Insert Count becomes -greater than or equal to the Required Insert Count for all header blocks the -decoder has started reading from the stream.¶
-When processing header blocks, the decoder expects the Required Insert Count to -equal the lowest possible value for the Insert Count with which the header block -can be decoded, as prescribed in Section 2.1.2. If it encounters a -Required Insert Count smaller than expected, it MUST treat this as a connection -error of type QPACK_DECOMPRESSION_FAILED; see Section 2.2.3. If it -encounters a Required Insert Count larger than expected, it MAY treat this as a -connection error of type QPACK_DECOMPRESSION_FAILED.¶
-The decoder signals the following events by emitting decoder instructions -(Section 4.4) on the decoder stream.¶
-After the decoder finishes decoding a header block containing dynamic table -references, it MUST emit a Header Acknowledgement instruction -(Section 4.4.1). A stream may carry multiple header blocks in the -case of intermediate responses, trailers, and pushed requests. The encoder -interprets each Header Acknowledgement instruction as acknowledging the earliest -unacknowledged header block containing dynamic table references sent on the -given stream.¶
-When an endpoint receives a stream reset before the end of a stream or before -all header blocks are processed on that stream, or when it abandons reading of a -stream, it generates a Stream Cancellation instruction; see -Section 4.4.2. This signals to the encoder that all references to the -dynamic table on that stream are no longer outstanding. A decoder with a -maximum dynamic table capacity (Section 3.2.3) equal to -zero MAY omit sending Stream Cancellations, because the encoder cannot have -any dynamic table references. An encoder cannot infer from this instruction -that any updates to the dynamic table have been received.¶
-The Header Acknowledgement and Stream Cancellation instructions permit the -encoder to remove references to entries in the dynamic table. When an entry -with absolute index lower than the Known Received Count has zero references, -then it is considered evictable; see Section 2.1.1.¶
-After receiving new table entries on the encoder stream, the decoder chooses -when to emit Insert Count Increment instructions; see -Section 4.4.3. Emitting this instruction after adding each new -dynamic table entry will provide the timeliest feedback to the encoder, but -could be redundant with other decoder feedback. By delaying an Insert Count -Increment instruction, the decoder might be able to coalesce multiple Insert -Count Increment instructions, or replace them entirely with Header -Acknowledgements; see Section 4.4.1. However, delaying too long -may lead to compression inefficiencies if the encoder waits for an entry to be -acknowledged before using it.¶
-If the decoder encounters a reference in a header block representation to a -dynamic table entry which has already been evicted or which has an absolute -index greater than or equal to the declared Required Insert Count -(Section 4.5.1), it MUST treat this as a connection error of type -QPACK_DECOMPRESSION_FAILED.¶
-If the decoder encounters a reference in an encoder instruction to a dynamic -table entry which has already been evicted, it MUST treat this as a connection -error of type QPACK_ENCODER_STREAM_ERROR.¶
-Unlike in HPACK, entries in the QPACK static and dynamic tables are addressed -separately. The following sections describe how entries in each table are -addressed.¶
-The static table consists of a predefined static list of header fields, each of -which has a fixed index over time. Its entries are defined in Appendix A.¶
-All entries in the static table have a name and a value. However, values can be -empty (that is, have a length of 0). Each entry is identified by a unique -index.¶
-Note that the QPACK static table is indexed from 0, whereas the HPACK static -table is indexed from 1.¶
-When the decoder encounters an invalid static table index in a header block representation, it MUST treat this as a connection error of type QPACK_DECOMPRESSION_FAILED. If this index is received on the encoder stream, this MUST be treated as a connection error of type QPACK_ENCODER_STREAM_ERROR.¶
-The dynamic table consists of a list of header fields maintained in first-in, -first-out order. Each HTTP/3 endpoint holds a dynamic table that is initially -empty. Entries are added by encoder instructions received on the encoder -stream; see Section 4.3.¶
-The dynamic table can contain duplicate entries (i.e., entries with the same -name and same value). Therefore, duplicate entries MUST NOT be treated as an -error by the decoder.¶
-Dynamic table entries can have empty values.¶
-The size of the dynamic table is the sum of the size of its entries.¶
-The size of an entry is the sum of its name's length in bytes, its value's -length in bytes, and 32. The size of an entry is calculated using the length of -its name and value without Huffman encoding applied.¶
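The entry-size rule above is simple enough to state directly in code; the following helper is an illustrative sketch (the function name is not from the specification):¶

```python
def entry_size(name: bytes, value: bytes) -> int:
    # Section 3.2.1: name length + value length + 32 bytes of overhead,
    # measured on the raw strings, before any Huffman encoding.
    return len(name) + len(value) + 32
```

The 32-byte overhead means even an entry with empty name and value consumes 32 bytes of table capacity.¶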
-The encoder sets the capacity of the dynamic table, which serves as the upper -limit on its size. The initial capacity of the dynamic table is zero. The -encoder sends a Set Dynamic Table Capacity instruction -(Section 4.3.1) with a non-zero capacity to begin using the dynamic -table.¶
-Before a new entry is added to the dynamic table, entries are evicted from the -end of the dynamic table until the size of the dynamic table is less than or -equal to (table capacity - size of new entry). The encoder MUST NOT cause a -dynamic table entry to be evicted unless that entry is evictable; see -Section 2.1.1. The new entry is then added to the table. It is an -error if the encoder attempts to add an entry that is larger than the dynamic -table capacity; the decoder MUST treat this as a connection error of type -QPACK_ENCODER_STREAM_ERROR.¶
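The insertion rule above can be sketched as follows. This is a hedged illustration, not the specification's algorithm; the list-of-tuples table representation and the function name are assumptions.¶

```python
def try_insert(table, capacity, new_size):
    """Evict evictable entries from the oldest end until the new entry fits.

    `table` is a list of (size, evictable) tuples, oldest first.
    Returns the updated table, or None if insertion is not possible.
    """
    if new_size > capacity:
        # Entry larger than the table capacity: connection error.
        raise ValueError("QPACK_ENCODER_STREAM_ERROR")
    table = list(table)
    while sum(s for s, _ in table) + new_size > capacity:
        if not table or not table[0][1]:
            return None  # oldest entry not evictable: MUST NOT insert
        table.pop(0)  # evict the oldest entry
    table.append((new_size, False))  # new entry starts unacknowledged
    return table
```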
-A new entry can reference an entry in the dynamic table that will be evicted -when adding this new entry into the dynamic table. Implementations are -cautioned to avoid deleting the referenced name or value if the referenced entry -is evicted from the dynamic table prior to inserting the new entry.¶
-Whenever the dynamic table capacity is reduced by the encoder -(Section 4.3.1), entries are evicted from the end of the dynamic -table until the size of the dynamic table is less than or equal to the new table -capacity. This mechanism can be used to completely clear entries from the -dynamic table by setting a capacity of 0, which can subsequently be restored.¶
-To bound the memory requirements of the decoder, the decoder limits the maximum value the encoder is permitted to set for the dynamic table capacity. In HTTP/3, this limit is determined by the value of SETTINGS_QPACK_MAX_TABLE_CAPACITY sent by the decoder; see Section 5. The encoder MUST NOT set a dynamic table capacity that exceeds this maximum, but it can choose to use a lower dynamic table capacity; see Section 4.3.1.¶
-For clients using 0-RTT data in HTTP/3, the server's maximum table capacity is the remembered value of the setting, or zero if the value was not previously sent. When the client's 0-RTT value of the setting is zero, the server MAY set it to a non-zero value in its SETTINGS frame. If the remembered value is non-zero, the server MUST send the same non-zero value in its SETTINGS frame. If it specifies any other value, or omits SETTINGS_QPACK_MAX_TABLE_CAPACITY from SETTINGS, the encoder MUST treat this as a connection error of type QPACK_DECODER_STREAM_ERROR.¶
-For HTTP/3 servers and HTTP/3 clients when 0-RTT is not attempted or is -rejected, the maximum table capacity is 0 until the encoder processes a SETTINGS -frame with a non-zero value of SETTINGS_QPACK_MAX_TABLE_CAPACITY.¶
-When the maximum table capacity is zero, the encoder MUST NOT insert entries -into the dynamic table, and MUST NOT send any encoder instructions on the -encoder stream.¶
-Each entry possesses an absolute index which is fixed for the lifetime of that -entry. The first entry inserted has an absolute index of "0"; indices increase -by one with each insertion.¶
-Relative indices begin at zero and increase in the opposite direction from the -absolute index. Determining which entry has a relative index of "0" depends on -the context of the reference.¶
-In encoder instructions (Section 4.3), a relative index of "0" -refers to the most recently inserted value in the dynamic table. Note that this -means the entry referenced by a given relative index will change while -interpreting instructions on the encoder stream.¶
- -Unlike in encoder instructions, relative indices in header block representations -are relative to the Base at the beginning of the header block; see -Section 4.5.1. This ensures that references are stable even if header -blocks and dynamic table updates are processed out of order.¶
-In a header block a relative index of "0" refers to the entry with absolute -index equal to Base - 1.¶
- -Post-Base indices are used in header block instructions for entries with -absolute indices greater than or equal to Base, starting at 0 for the entry with -absolute index equal to Base, and increasing in the same direction as the -absolute index.¶
-Post-Base indices allow an encoder to process a header block in a single pass -and include references to entries added while processing this (or other) header -blocks.¶
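The three index spaces of Sections 3.2.4 through 3.2.6 map onto absolute indices as sketched below; the helper names are illustrative, not from the specification.¶

```python
def abs_from_encoder_relative(insert_count: int, rel: int) -> int:
    # In encoder instructions, relative index 0 is the most recently
    # inserted entry, so it shifts as instructions are processed.
    return insert_count - 1 - rel

def abs_from_block_relative(base: int, rel: int) -> int:
    # In header blocks, relative index 0 is the entry with absolute
    # index Base - 1, so references are stable for a given block.
    return base - 1 - rel

def abs_from_post_base(base: int, post: int) -> int:
    # Post-Base index 0 is the entry with absolute index equal to Base,
    # counting in the same direction as absolute indices.
    return base + post
```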
- -The prefixed integer from Section 5.1 of [RFC7541] is used heavily throughout -this document. The format from [RFC7541] is used unmodified. Note, however, -that QPACK uses some prefix sizes not actually used in HPACK.¶
-QPACK implementations MUST be able to decode integers up to and including 62 -bits long.¶
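The prefixed integer representation of Section 5.1 of [RFC7541] can be sketched in Python as follows (illustrative helper names; the `first_byte_flags` parameter for carrying pattern bits above the prefix is an assumption of this sketch):¶

```python
def encode_prefix_int(value: int, prefix_bits: int,
                      first_byte_flags: int = 0) -> bytes:
    """Encode `value` as an N-bit prefix integer (RFC 7541, Section 5.1)."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([first_byte_flags | value])
    out = [first_byte_flags | limit]  # prefix saturated: continue in octets
    value -= limit
    while value >= 128:
        out.append((value % 128) | 0x80)  # 7 bits of value, continuation set
        value //= 128
    out.append(value)
    return bytes(out)

def decode_prefix_int(data: bytes, prefix_bits: int):
    """Return (value, bytes_consumed)."""
    limit = (1 << prefix_bits) - 1
    value = data[0] & limit
    if value < limit:
        return value, 1
    shift, i = 0, 1
    while True:
        b = data[i]
        value += (b & 0x7F) << shift
        shift += 7
        i += 1
        if not b & 0x80:  # continuation bit clear: last octet
            return value, i
```

For example, 1337 with a 5-bit prefix encodes to the three octets 31, 154, 10, matching the worked example in [RFC7541].¶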
-The string literal defined by Section 5.2 of [RFC7541] is also used throughout. -This string format includes optional Huffman encoding.¶
-HPACK defines string literals to begin on a byte boundary. They begin with a -single bit flag, denoted as 'H' in this document (indicating whether the string -is Huffman-coded), followed by the Length encoded as a 7-bit prefix integer, -and finally Length bytes of data. When Huffman encoding is enabled, the Huffman -table from Appendix B of [RFC7541] is used without modification.¶
-This document expands the definition of string literals and permits them to -begin other than on a byte boundary. An "N-bit prefix string literal" begins -with the same Huffman flag, followed by the length encoded as an (N-1)-bit -prefix integer. The prefix size, N, can have a value between 2 and 8 inclusive. -The remainder of the string literal is unmodified.¶
-A string literal without a prefix length noted is an 8-bit prefix string literal -and follows the definitions in [RFC7541] without modification.¶
-QPACK defines two unidirectional stream types:¶
-An encoder stream is a unidirectional stream of type 0x02. It carries an unframed sequence of encoder instructions from encoder to decoder.¶
-A decoder stream is a unidirectional stream of type 0x03. It carries an unframed sequence of decoder instructions from decoder to encoder.¶
-HTTP/3 endpoints contain a QPACK encoder and decoder. Each endpoint MUST -initiate at most one encoder stream and at most one decoder stream. Receipt of a -second instance of either stream type MUST be treated as a connection error of -type H3_STREAM_CREATION_ERROR. These streams MUST NOT be closed. Closure of -either unidirectional stream type MUST be treated as a connection error of type -H3_CLOSED_CRITICAL_STREAM.¶
-An endpoint MAY avoid creating an encoder stream if it will not be used (for example, if its encoder does not wish to use the dynamic table, or if the maximum size of the dynamic table permitted by the peer is zero).¶
-An endpoint MAY avoid creating a decoder stream if its decoder sets the maximum -capacity of the dynamic table to zero.¶
-An endpoint MUST allow its peer to create an encoder stream and a decoder stream -even if the connection's settings prevent their use.¶
-An encoder sends encoder instructions on the encoder stream to set the capacity -of the dynamic table and add dynamic table entries. Instructions adding table -entries can use existing entries to avoid transmitting redundant information. -The name can be transmitted as a reference to an existing entry in the static or -the dynamic table or as a string literal. For entries which already exist in -the dynamic table, the full entry can also be used by reference, creating a -duplicate entry.¶
-This section specifies the following encoder instructions.¶
-An encoder informs the decoder of a change to the dynamic table capacity using -an instruction which begins with the '001' three-bit pattern. This is followed -by the new dynamic table capacity represented as an integer with a 5-bit prefix; -see Section 4.1.1.¶
-The new capacity MUST be lower than or equal to the limit described in -Section 3.2.3. In HTTP/3, this limit is the value of the -SETTINGS_QPACK_MAX_TABLE_CAPACITY parameter (Section 5) received from -the decoder. The decoder MUST treat a new dynamic table capacity value that -exceeds this limit as a connection error of type QPACK_ENCODER_STREAM_ERROR.¶
-Reducing the dynamic table capacity can cause entries to be evicted; see -Section 3.2.2. This MUST NOT cause the eviction of entries which are not -evictable; see Section 2.1.1. Changing the capacity of the dynamic -table is not acknowledged as this instruction does not insert an entry.¶
-An encoder adds an entry to the dynamic table where the header field name -matches the header field name of an entry stored in the static or the dynamic -table using an instruction that starts with the '1' one-bit pattern. The second -('T') bit indicates whether the reference is to the static or dynamic table. The -6-bit prefix integer (Section 4.1.1) that follows is used to locate -the table entry for the header name. When T=1, the number represents the static -table index; when T=0, the number is the relative index of the entry in the -dynamic table.¶
-The header name reference is followed by the header field value represented as a -string literal; see Section 4.1.2.¶
- -An encoder adds an entry to the dynamic table where both the header field name -and the header field value are represented as string literals using an -instruction that starts with the '01' two-bit pattern.¶
-This is followed by the name represented as a 6-bit prefix string literal, and -the value represented as an 8-bit prefix string literal; see -Section 4.1.2.¶
- -An encoder duplicates an existing entry in the dynamic table using an -instruction that begins with the '000' three-bit pattern. This is followed by -the relative index of the existing entry represented as an integer with a 5-bit -prefix; see Section 4.1.1.¶
-The existing entry is re-inserted into the dynamic table without resending -either the name or the value. This is useful to avoid adding a reference to an -older entry, which might block inserting new entries.¶
-A decoder sends decoder instructions on the decoder stream to inform the encoder -about the processing of header blocks and table updates to ensure consistency of -the dynamic table.¶
-This section specifies the following decoder instructions.¶
-After processing a header block whose declared Required Insert Count is not -zero, the decoder emits a Header Acknowledgement instruction. The instruction -begins with the '1' one-bit pattern which is followed by the header block's -associated stream ID encoded as a 7-bit prefix integer; see -Section 4.1.1.¶
-This instruction is used as described in Section 2.1.4 and -in Section 2.2.2.¶
-If an encoder receives a Header Acknowledgement instruction referring to a -stream on which every header block with a non-zero Required Insert Count has -already been acknowledged, that MUST be treated as a connection error of type -QPACK_DECODER_STREAM_ERROR.¶
-The Header Acknowledgement instruction might increase the Known Received Count; -see Section 2.1.4.¶
-When a stream is reset or reading is abandoned, the decoder emits a Stream -Cancellation instruction. The instruction begins with the '01' two-bit -pattern, which is followed by the stream ID of the affected stream encoded as a -6-bit prefix integer.¶
-This instruction is used as described in Section 2.2.2.¶
-The Insert Count Increment instruction begins with the '00' two-bit pattern, -followed by the Increment encoded as a 6-bit prefix integer. This instruction -increases the Known Received Count (Section 2.1.4) by the value of -the Increment parameter. The decoder should send an Increment value that -increases the Known Received Count to the total number of dynamic table -insertions and duplications processed so far.¶
-An encoder that receives an Increment field equal to zero, or one that increases the Known Received Count beyond what the encoder has sent, MUST treat this as a connection error of type QPACK_DECODER_STREAM_ERROR.¶
-A header block consists of a prefix and a possibly empty sequence of -representations defined in this section. Each representation corresponds to a -single header field. These representations reference the static table or the -dynamic table in a particular state, but do not modify that state.¶
-Header blocks are carried in frames on streams defined by the enclosing -protocol.¶
-Each header block is prefixed with two integers. The Required Insert Count is encoded as an integer with an 8-bit prefix using the encoding described in Section 4.5.1.1. The Base is encoded as a sign bit ('S') and a Delta Base value with a 7-bit prefix; see Section 4.5.1.2.¶
-Required Insert Count identifies the state of the dynamic table needed to -process the header block. Blocking decoders use the Required Insert Count to -determine when it is safe to process the rest of the block.¶
-The encoder transforms the Required Insert Count as follows before encoding:¶
-- if ReqInsertCount == 0: - EncInsertCount = 0 - else: - EncInsertCount = (ReqInsertCount mod (2 * MaxEntries)) + 1 -¶ -
Here MaxEntries is the maximum number of entries that the dynamic table can have. The smallest entry has empty name and value strings and has the size of 32. Hence MaxEntries is calculated as¶
- MaxEntries = floor( MaxTableCapacity / 32 ) -¶ -
MaxTableCapacity is the maximum capacity of the dynamic table as specified by the decoder; see Section 3.2.3.¶
This encoding limits the length of the prefix on long-lived connections.¶
-The decoder can reconstruct the Required Insert Count using an algorithm such as -the following. If the decoder encounters a value of EncodedInsertCount that -could not have been produced by a conformant encoder, it MUST treat this as a -connection error of type QPACK_DECOMPRESSION_FAILED.¶
-TotalNumberOfInserts is the total number of inserts into the decoder's dynamic -table.¶
-- FullRange = 2 * MaxEntries - if EncodedInsertCount == 0: - ReqInsertCount = 0 - else: - if EncodedInsertCount > FullRange: - Error - MaxValue = TotalNumberOfInserts + MaxEntries - - # MaxWrapped is the largest possible value of - # ReqInsertCount that is 0 mod 2*MaxEntries - MaxWrapped = floor(MaxValue / FullRange) * FullRange - ReqInsertCount = MaxWrapped + EncodedInsertCount - 1 - - # If ReqInsertCount exceeds MaxValue, the Encoder's value - # must have wrapped one fewer time - if ReqInsertCount > MaxValue: - if ReqInsertCount <= FullRange: - Error - ReqInsertCount -= FullRange - - # Value of 0 must be encoded as 0. - if ReqInsertCount == 0: - Error -¶ -
For example, if the dynamic table is 100 bytes, then the Required Insert Count will be encoded modulo 6. If a decoder has received 10 inserts, then an encoded value of 4 indicates that the Required Insert Count is 9 for the header block.¶
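The encoder-side transformation and the decoder-side reconstruction above can be rendered as runnable Python (a sketch; the function names are illustrative):¶

```python
def encode_req_insert_count(req: int, max_table_capacity: int) -> int:
    # Section 4.5.1.1: encoded modulo 2 * MaxEntries, offset by one
    # so that zero is unambiguous.
    max_entries = max_table_capacity // 32
    if req == 0:
        return 0
    return (req % (2 * max_entries)) + 1

def decode_req_insert_count(encoded: int, max_table_capacity: int,
                            total_inserts: int) -> int:
    # Mirrors the decoder pseudocode; raises on values no conformant
    # encoder could have produced.
    max_entries = max_table_capacity // 32
    full_range = 2 * max_entries
    if encoded == 0:
        return 0
    if encoded > full_range:
        raise ValueError("QPACK_DECOMPRESSION_FAILED")
    max_value = total_inserts + max_entries
    # Largest possible ReqInsertCount that is 0 mod 2*MaxEntries.
    max_wrapped = (max_value // full_range) * full_range
    req = max_wrapped + encoded - 1
    if req > max_value:
        # The encoder's value must have wrapped one fewer time.
        if req <= full_range:
            raise ValueError("QPACK_DECOMPRESSION_FAILED")
        req -= full_range
    if req == 0:
        raise ValueError("QPACK_DECOMPRESSION_FAILED")
    return req
```

With a 100-byte table (MaxEntries = 3) and 10 inserts processed, an encoded value of 4 decodes to a Required Insert Count of 9, consistent with the example above.¶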
-The Base is used to resolve references in the dynamic table as described in Section 3.2.5.¶
To save space, the Base is encoded relative to the Required Insert Count using a one-bit sign ('S') and the Delta Base value. A sign bit of 0 indicates that the Base is greater than or equal to the value of the Required Insert Count; the decoder adds the value of Delta Base to the Required Insert Count to determine the value of the Base. A sign bit of 1 indicates that the Base is less than the Required Insert Count; the decoder subtracts the value of Delta Base from the Required Insert Count and also subtracts one to determine the value of the Base. That is:¶
- if S == 0: - Base = ReqInsertCount + DeltaBase - else: - Base = ReqInsertCount - DeltaBase - 1 -¶ -
A single-pass encoder determines the Base before encoding a header block. If -the encoder inserted entries in the dynamic table while encoding the header -block, Required Insert Count will be greater than the Base, so the encoded -difference is negative and the sign bit is set to 1. If the header block did -not reference the most recent entry in the table and did not insert any new -entries, the Base will be greater than the Required Insert Count, so the delta -will be positive and the sign bit is set to 0.¶
-An encoder that produces table updates before encoding a header block might set -Base to the value of Required Insert Count. In such case, both the sign bit and -the Delta Base will be set to zero.¶
-A header block that does not reference the dynamic table can use any value for -the Base; setting Delta Base to zero is one of the most efficient encodings.¶
-For example, with a Required Insert Count of 9, a decoder receives an S bit of 1 -and a Delta Base of 2. This sets the Base to 6 and enables post-base indexing -for three entries. In this example, a relative index of 1 refers to the 5th -entry that was added to the table; a post-base index of 1 refers to the 8th -entry.¶
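The Base computation and the example above can be sketched in executable form. This is a minimal illustration assuming zero-based absolute indexing, where the first entry added to the table has absolute index 0 (function names are illustrative):

```python
def decode_base(req_insert_count, sign, delta_base):
    """Compute the Base from the sign bit and Delta Base."""
    if sign == 0:
        # Base is greater than or equal to the Required Insert Count
        return req_insert_count + delta_base
    # Base is less than the Required Insert Count
    return req_insert_count - delta_base - 1

def relative_to_absolute(base, rel_index):
    # Relative index 0 refers to the newest entry with absolute index < Base
    return base - 1 - rel_index

def postbase_to_absolute(base, pb_index):
    # Post-base index 0 refers to the entry with absolute index equal to Base
    return base + pb_index

base = decode_base(9, 1, 2)            # S=1, Delta Base=2 -> Base = 6
print(base)                             # 6
print(relative_to_absolute(base, 1))    # absolute index 4: the 5th entry added
print(postbase_to_absolute(base, 1))    # absolute index 7: the 8th entry added
```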
-An indexed header field representation identifies an entry in the static table, -or an entry in the dynamic table with an absolute index less than the value of -the Base.¶
- -This representation starts with the '1' 1-bit pattern, followed by the 'T' bit -indicating whether the reference is into the static or dynamic table. The 6-bit -prefix integer (Section 4.1.1) that follows is used to locate the -table entry for the header field. When T=1, the number represents the static -table index; when T=0, the number is the relative index of the entry in the -dynamic table.¶
-An indexed header field with post-base index representation identifies an entry -in the dynamic table with an absolute index greater than or equal to the value -of the Base.¶
- -This representation starts with the '0001' 4-bit pattern. This is followed by -the post-base index (Section 3.2.6) of the matching header field, represented as -an integer with a 4-bit prefix; see Section 4.1.1.¶
-A literal header field with name reference representation encodes a header field -where the header field name matches the header field name of an entry in the -static table, or the header field name of an entry in the dynamic table with an -absolute index less than the value of the Base.¶
- -This representation starts with the '01' two-bit pattern. The following bit, -'N', indicates whether an intermediary is permitted to add this header to the -dynamic header table on subsequent hops. When the 'N' bit is set, the encoded -header MUST always be encoded with a literal representation. In particular, when -a peer sends a header field that it received represented as a literal header -field with the 'N' bit set, it MUST use a literal representation to forward this -header field. This bit is intended for protecting header field values that are -not to be put at risk by compressing them; see Section 7 for -more details.¶
-The fourth ('T') bit indicates whether the reference is to the static or dynamic -table. The 4-bit prefix integer (Section 4.1.1) that follows is -used to locate the table entry for the header name. When T=1, the number -represents the static table index; when T=0, the number is the relative index of -the entry in the dynamic table.¶
-Only the header field name is taken from the dynamic table entry; the header -field value is encoded as an 8-bit prefix string literal; see -Section 4.1.2.¶
-A literal header field with post-base name reference representation encodes a -header field where the header field name matches the header field name of a -dynamic table entry with an absolute index greater than or equal to the value of -the Base.¶
- -This representation starts with the '0000' four-bit pattern. The fifth bit is -the 'N' bit as described in Section 4.5.4. This is followed by a -post-base index of the dynamic table entry (Section 3.2.6) encoded as an -integer with a 3-bit prefix; see Section 4.1.1.¶
-Only the header field name is taken from the dynamic table entry; the header -field value is encoded as an 8-bit prefix string literal; see -Section 4.1.2.¶
-The literal header field without name reference representation encodes a header -field name and a header field value as string literals.¶
- -This representation begins with the '001' three-bit pattern. The fourth bit is -the 'N' bit as described in Section 4.5.4. The name follows, -represented as a 4-bit prefix string literal, then the value, represented as an -8-bit prefix string literal; see Section 4.1.2.¶
-QPACK defines two settings which are included in the HTTP/3 SETTINGS frame.¶
-The following error codes are defined for HTTP/3 to indicate failures of -QPACK which prevent the connection from continuing:¶
-TBD. Also see Section 7.1 of [RFC7541].¶
-While the negotiated limit on the dynamic table size accounts for much of the -memory that can be consumed by a QPACK implementation, data which cannot be -immediately sent due to flow control is not affected by this limit. -Implementations should limit the size of unsent data, especially on the decoder -stream where flexibility to choose what to send is limited. Possible responses -to an excess of unsent data might include limiting the ability of the peer to -open new streams, reading only from the encoder stream, or closing the -connection.¶
-This document specifies two settings. The entries in the following table are -registered in the "HTTP/3 Settings" registry established in [HTTP3].¶
-Setting Name | -Code | -Specification | -Default | -
---|---|---|---|
QPACK_MAX_TABLE_CAPACITY | -0x1 | -- Section 5 - | -0 | -
QPACK_BLOCKED_STREAMS | -0x7 | -- Section 5 - | -0 | -
This document specifies two stream types. The entries in the following table are -registered in the "HTTP/3 Stream Type" registry established in [HTTP3].¶
-Stream Type | -Code | -Specification | -Sender | -
---|---|---|---|
QPACK Encoder Stream | -0x02 | -- Section 4.2 - | -Both | -
QPACK Decoder Stream | -0x03 | -- Section 4.2 - | -Both | -
This document specifies three error codes. The entries in the following table -are registered in the "HTTP/3 Error Code" registry established in [HTTP3].¶
-Name | -Code | -Description | -Specification | -
---|---|---|---|
QPACK_DECOMPRESSION_FAILED | -0x200 | -Decompression of a header block failed | -- Section 6 - | -
QPACK_ENCODER_STREAM_ERROR | -0x201 | -Error on the encoder stream | -- Section 6 - | -
QPACK_DECODER_STREAM_ERROR | -0x202 | -Error on the decoder stream | -- Section 6 - | -
Index | -Name | -Value | -
---|---|---|
0 | -:authority | -- |
1 | -:path | -/ | -
2 | -age | -0 | -
3 | -content-disposition | -- |
4 | -content-length | -0 | -
5 | -cookie | -- |
6 | -date | -- |
7 | -etag | -- |
8 | -if-modified-since | -- |
9 | -if-none-match | -- |
10 | -last-modified | -- |
11 | -link | -- |
12 | -location | -- |
13 | -referer | -- |
14 | -set-cookie | -- |
15 | -:method | -CONNECT | -
16 | -:method | -DELETE | -
17 | -:method | -GET | -
18 | -:method | -HEAD | -
19 | -:method | -OPTIONS | -
20 | -:method | -POST | -
21 | -:method | -PUT | -
22 | -:scheme | -http | -
23 | -:scheme | -https | -
24 | -:status | -103 | -
25 | -:status | -200 | -
26 | -:status | -304 | -
27 | -:status | -404 | -
28 | -:status | -503 | -
29 | -accept | -*/* | -
30 | -accept | -application/dns-message | -
31 | -accept-encoding | -gzip, deflate, br | -
32 | -accept-ranges | -bytes | -
33 | -access-control-allow-headers | -cache-control | -
34 | -access-control-allow-headers | -content-type | -
35 | -access-control-allow-origin | -* | -
36 | -cache-control | -max-age=0 | -
37 | -cache-control | -max-age=2592000 | -
38 | -cache-control | -max-age=604800 | -
39 | -cache-control | -no-cache | -
40 | -cache-control | -no-store | -
41 | -cache-control | -public, max-age=31536000 | -
42 | -content-encoding | -br | -
43 | -content-encoding | -gzip | -
44 | -content-type | -application/dns-message | -
45 | -content-type | -application/javascript | -
46 | -content-type | -application/json | -
47 | -content-type | -application/x-www-form-urlencoded | -
48 | -content-type | -image/gif | -
49 | -content-type | -image/jpeg | -
50 | -content-type | -image/png | -
51 | -content-type | -text/css | -
52 | -content-type | -text/html; charset=utf-8 | -
53 | -content-type | -text/plain | -
54 | -content-type | -text/plain;charset=utf-8 | -
55 | -range | -bytes=0- | -
56 | -strict-transport-security | -max-age=31536000 | -
57 | -strict-transport-security | -max-age=31536000; includesubdomains | -
58 | -strict-transport-security | -max-age=31536000; includesubdomains; preload | -
59 | -vary | -accept-encoding | -
60 | -vary | -origin | -
61 | -x-content-type-options | -nosniff | -
62 | -x-xss-protection | -1; mode=block | -
63 | -:status | -100 | -
64 | -:status | -204 | -
65 | -:status | -206 | -
66 | -:status | -302 | -
67 | -:status | -400 | -
68 | -:status | -403 | -
69 | -:status | -421 | -
70 | -:status | -425 | -
71 | -:status | -500 | -
72 | -accept-language | -- |
73 | -access-control-allow-credentials | -FALSE | -
74 | -access-control-allow-credentials | -TRUE | -
75 | -access-control-allow-headers | -* | -
76 | -access-control-allow-methods | -get | -
77 | -access-control-allow-methods | -get, post, options | -
78 | -access-control-allow-methods | -options | -
79 | -access-control-expose-headers | -content-length | -
80 | -access-control-request-headers | -content-type | -
81 | -access-control-request-method | -get | -
82 | -access-control-request-method | -post | -
83 | -alt-svc | -clear | -
84 | -authorization | -- |
85 | -content-security-policy | -script-src 'none'; object-src 'none'; base-uri 'none' | -
86 | -early-data | -1 | -
87 | -expect-ct | -- |
88 | -forwarded | -- |
89 | -if-range | -- |
90 | -origin | -- |
91 | -purpose | -prefetch | -
92 | -server | -- |
93 | -timing-allow-origin | -* | -
94 | -upgrade-insecure-requests | -1 | -
95 | -user-agent | -- |
96 | -x-forwarded-for | -- |
97 | -x-frame-options | -deny | -
98 | -x-frame-options | -sameorigin | -
Pseudo-code for single pass encoding, excluding handling of duplicates, -non-blocking mode, and reference tracking.¶
-
baseIndex = dynamicTable.baseIndex
largestReference = 0
for header in headers:
  staticIdx = staticTable.getIndex(header)
  if staticIdx:
    encodeIndexReference(streamBuffer, staticIdx)
    continue

  dynamicIdx = dynamicTable.getIndex(header)
  if !dynamicIdx:
    # No matching entry.  Either insert+index or encode literal
    nameIdx = getNameIndex(header)
    if shouldIndex(header) and dynamicTable.canIndex(header):
      encodeLiteralWithIncrementalIndex(controlBuffer, nameIdx,
                                        header)
      dynamicTable.add(header)
      dynamicIdx = dynamicTable.baseIndex

  if !dynamicIdx:
    # Couldn't index it, literal
    if nameIdx <= staticTable.size:
      encodeLiteral(streamBuffer, nameIdx, header)
    else:
      # encode literal, possibly with nameIdx above baseIndex
      encodeDynamicLiteral(streamBuffer, nameIdx, baseIndex,
                           header)
      largestReference = max(largestReference,
                             dynamicTable.toAbsolute(nameIdx))
  else:
    # Dynamic index reference
    assert(dynamicIdx)
    largestReference = max(largestReference, dynamicIdx)
    # Encode dynamicIdx, possibly with dynamicIdx above baseIndex
    encodeDynamicIndexReference(streamBuffer, dynamicIdx,
                                baseIndex)

# encode the prefix
encodeInteger(prefixBuffer, 0x00, largestReference, 8)
if baseIndex >= largestReference:
  encodeInteger(prefixBuffer, 0, baseIndex - largestReference, 7)
else:
  encodeInteger(prefixBuffer, 0x80,
                largestReference - baseIndex, 7)

return controlBuffer, prefixBuffer + streamBuffer
¶
No changes¶
-Editorial changes only¶
-Editorial changes only¶
-Editorial changes only¶
-This draft draws heavily on the text of [RFC7541]. The indirect input of -those authors is gratefully acknowledged, as well as ideas from:¶
-Buck's contribution was supported by Google during his employment there.¶
-A substantial portion of Mike's contribution was supported by Microsoft during -his employment there.¶
-Internet-Draft | -QUIC Loss Detection | -March 2020 | -
Iyengar & Swett | -Expires 22 September 2020 | -[Page] | -
This document describes loss detection and congestion control mechanisms for -QUIC.¶
-Discussion of this draft takes place on the QUIC working group mailing list -(quic@ietf.org), which is archived at -https://mailarchive.ietf.org/arch/search/?email_list=quic.¶
-Working Group information can be found at https://github.com/quicwg; source -code and issues list for this draft can be found at -https://github.com/quicwg/base-drafts/labels/-recovery.¶
-- This Internet-Draft is submitted in full conformance with the - provisions of BCP 78 and BCP 79.¶
-- Internet-Drafts are working documents of the Internet Engineering Task - Force (IETF). Note that other groups may also distribute working - documents as Internet-Drafts. The list of current Internet-Drafts is - at https://datatracker.ietf.org/drafts/current/.¶
-- Internet-Drafts are draft documents valid for a maximum of six months - and may be updated, replaced, or obsoleted by other documents at any - time. It is inappropriate to use Internet-Drafts as reference - material or to cite them other than as "work in progress."¶
-- This Internet-Draft will expire on 22 September 2020.¶
-- Copyright (c) 2020 IETF Trust and the persons identified as the - document authors. All rights reserved.¶
-- This document is subject to BCP 78 and the IETF Trust's Legal - Provisions Relating to IETF Documents - (https://trustee.ietf.org/license-info) in effect on the date of - publication of this document. Please review these documents - carefully, as they describe your rights and restrictions with - respect to this document. Code Components extracted from this - document must include Simplified BSD License text as described in - Section 4.e of the Trust Legal Provisions and are provided without - warranty as described in the Simplified BSD License.¶
-QUIC is a new multiplexed and secure transport protocol atop UDP, specified in -[QUIC-TRANSPORT]. This document describes congestion control and loss -recovery for QUIC. Mechanisms described in this document follow the spirit -of existing TCP congestion control and loss recovery mechanisms, described in -RFCs, various Internet-drafts, or academic papers, and also those prevalent in -TCP implementations.¶
-The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", -"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this -document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] -when, and only when, they appear in all capitals, as shown here.¶
-Definitions of terms that are used in this document:¶
-All transmissions in QUIC are sent with a packet-level header, which indicates -the encryption level and includes a packet sequence number (referred to below as -a packet number). The encryption level indicates the packet number space, as -described in [QUIC-TRANSPORT]. Packet numbers never repeat within a packet -number space for the lifetime of a connection. Packet numbers are sent in -monotonically increasing order within a space, preventing ambiguity.¶
-This design obviates the need for disambiguating between transmissions and -retransmissions and eliminates significant complexity from QUIC's interpretation -of TCP loss detection mechanisms.¶
-QUIC packets can contain multiple frames of different types. The recovery -mechanisms ensure that data and frames that need reliable delivery are -acknowledged or declared lost and sent in new packets as necessary. The types -of frames contained in a packet affect recovery and congestion control logic:¶
-Readers familiar with TCP's loss detection and congestion control will find
algorithms here that parallel well-known TCP ones. However, protocol differences
between QUIC and TCP contribute to algorithmic differences. We briefly describe
these protocol differences below.¶
-QUIC uses separate packet number spaces for each encryption level, except that
0-RTT and all generations of 1-RTT keys use the same packet number space.
Separate packet number spaces ensure that acknowledgement of packets sent with
one level of encryption will not cause spurious retransmission of packets sent
with a different encryption level. Congestion control and round-trip time (RTT)
measurement are unified across packet number spaces.¶
-TCP conflates transmission order at the sender with delivery order at the -receiver, which results in retransmissions of the same data carrying the same -sequence number, and consequently leads to "retransmission ambiguity". QUIC -separates the two. QUIC uses a packet number to indicate transmission order. -Application data is sent in one or more streams and delivery order is -determined by stream offsets encoded within STREAM frames.¶
-QUIC's packet number is strictly increasing within a packet number space, -and directly encodes transmission order. A higher packet number signifies -that the packet was sent later, and a lower packet number signifies that -the packet was sent earlier. When a packet containing ack-eliciting -frames is detected lost, QUIC rebundles necessary frames in a new packet -with a new packet number, removing ambiguity about which packet is -acknowledged when an ACK is received. Consequently, more accurate RTT -measurements can be made, spurious retransmissions are trivially detected, and -mechanisms such as Fast Retransmit can be applied universally, based only on -packet number.¶
-This design point significantly simplifies loss detection mechanisms for QUIC. -Most TCP mechanisms implicitly attempt to infer transmission ordering based on -TCP sequence numbers - a non-trivial task, especially when TCP timestamps are -not available.¶
-QUIC starts a loss epoch when a packet is lost and ends one when any packet -sent after the epoch starts is acknowledged. TCP waits for the gap in the -sequence number space to be filled, and so if a segment is lost multiple times -in a row, the loss epoch may not end for several round trips. Because both -should reduce their congestion windows only once per epoch, QUIC will do it -once for every round trip that experiences loss, while TCP may only do it -once across multiple round trips.¶
-QUIC ACKs contain information that is similar to TCP SACK, but QUIC does not -allow any acked packet to be reneged, greatly simplifying implementations on -both sides and reducing memory pressure on the sender.¶
-QUIC supports many ACK ranges, as opposed to TCP's 3 SACK ranges. In high-loss
environments, this speeds recovery, reduces spurious retransmits, and ensures
forward progress without relying on timeouts.¶
-QUIC endpoints measure the delay incurred between when a packet is received and -when the corresponding acknowledgment is sent, allowing a peer to maintain a -more accurate round-trip time estimate (see Section 13.2 of [QUIC-TRANSPORT]).¶
-QUIC uses a probe timeout (see Section 5.2), with a timer based on TCP's RTO -computation. QUIC's PTO includes the peer's maximum expected acknowledgement -delay instead of using a fixed minimum timeout. QUIC does not collapse the -congestion window until persistent congestion (Section 6.8) is -declared, unlike TCP, which collapses the congestion window upon expiry of an -RTO. Instead of collapsing the congestion window and declaring everything -in-flight lost, QUIC allows probe packets to temporarily exceed the congestion -window whenever the timer expires.¶
-In doing this, QUIC avoids unnecessary congestion window reductions, obviating -the need for correcting mechanisms such as F-RTO [RFC5682]. Since QUIC does -not collapse the congestion window on a PTO expiration, a QUIC sender is not -limited from sending more in-flight packets after a PTO expiration if it still -has available congestion window. This occurs when a sender is -application-limited and the PTO timer expires. This is more aggressive than -TCP's RTO mechanism when application-limited, but identical when not -application-limited.¶
-A single packet loss at the tail does not indicate persistent congestion, so -QUIC specifies a time-based definition to ensure one or more packets are sent -prior to a dramatic decrease in congestion window; see -Section 6.8.¶
-At a high level, an endpoint measures the time from when a packet was sent to -when it is acknowledged as a round-trip time (RTT) sample. The endpoint uses -RTT samples and peer-reported host delays (see Section 13.2 of -[QUIC-TRANSPORT]) to generate a statistical description of the network -path's RTT. An endpoint computes the following three values for each path: -the minimum value observed over the lifetime of the path (min_rtt), an -exponentially-weighted moving average (smoothed_rtt), and the mean deviation -(referred to as "variation" in the rest of this document) in the observed RTT -samples (rttvar).¶
-An endpoint generates an RTT sample on receiving an ACK frame that meets the -following two conditions:¶
-The RTT sample, latest_rtt, is generated as the time elapsed since the largest -acknowledged packet was sent:¶
--latest_rtt = ack_time - send_time_of_largest_acked -¶ -
An RTT sample is generated using only the largest acknowledged packet in the
received ACK frame. This is because a peer reports ACK delays for only the
largest acknowledged packet in an ACK frame. While the reported ACK delay is
not used by the RTT sample measurement, it is used to adjust the RTT sample in
subsequent computations of smoothed_rtt and rttvar; see Section 4.3.¶
-To avoid generating multiple RTT samples for a single packet, an ACK frame -SHOULD NOT be used to update RTT estimates if it does not newly acknowledge the -largest acknowledged packet.¶
-An RTT sample MUST NOT be generated on receiving an ACK frame that does not -newly acknowledge at least one ack-eliciting packet. A peer usually does not -send an ACK frame when only non-ack-eliciting packets are received. Therefore -an ACK frame that contains acknowledgements for only non-ack-eliciting packets -could include an arbitrarily large Ack Delay value. Ignoring -such ACK frames avoids complications in subsequent smoothed_rtt and rttvar -computations.¶
-A sender might generate multiple RTT samples per RTT when multiple ACK frames -are received within an RTT. As suggested in [RFC6298], doing so might result -in inadequate history in smoothed_rtt and rttvar. Ensuring that RTT estimates -retain sufficient history is an open research question.¶
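The two conditions above can be sketched as a guard around the latest_rtt computation (a minimal sketch; the parameter names and data structures are illustrative, not from this draft):

```python
def rtt_sample_on_ack(ack_time, largest_acked, newly_acked,
                      newly_acked_ack_eliciting, send_time):
    """Return latest_rtt for this ACK frame, or None when no sample
    may be generated.

    newly_acked: set of packet numbers first acknowledged by this ACK.
    newly_acked_ack_eliciting: True if at least one of those packets
    was ack-eliciting.
    send_time: mapping from packet number to its send time.
    """
    if largest_acked not in newly_acked:
        # The largest acknowledged packet was already acknowledged;
        # reusing it would produce a duplicate sample.
        return None
    if not newly_acked_ack_eliciting:
        # ACK covers only non-ack-eliciting packets; its Ack Delay
        # could be arbitrarily large, so ignore it.
        return None
    return ack_time - send_time[largest_acked]
```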
-min_rtt is the minimum RTT observed for a given network path. min_rtt is set
to the latest_rtt on the first RTT sample, and to the lesser of min_rtt and
latest_rtt on subsequent samples. In this document, min_rtt is used by loss
detection to reject implausibly small RTT samples.¶
-An endpoint uses only locally observed times in computing the min_rtt and does -not adjust for ACK delays reported by the peer. Doing so allows the endpoint -to set a lower bound for the smoothed_rtt based entirely on what it observes -(see Section 4.3), and limits potential underestimation due to -erroneously-reported delays by the peer.¶
-The RTT for a network path may change over time. If a path's actual RTT
decreases, the min_rtt will adapt immediately on the first low sample. If
the path's actual RTT increases, the min_rtt will not adapt to it, allowing
future RTT samples that are smaller than the new RTT to be included in
smoothed_rtt.¶
-smoothed_rtt is an exponentially-weighted moving average of an endpoint's RTT -samples, and rttvar is the variation in the RTT samples, estimated using a -mean variation.¶
-The calculation of smoothed_rtt uses path latency after adjusting RTT samples -for acknowledgement delays. These delays are computed using the ACK Delay -field of the ACK frame as described in Section 19.3 of [QUIC-TRANSPORT]. -For packets sent in the ApplicationData packet number space, a peer limits -any delay in sending an acknowledgement for an ack-eliciting packet to no -greater than the value it advertised in the max_ack_delay transport parameter. -Consequently, when a peer reports an Ack Delay that is greater than its -max_ack_delay, the delay is attributed to reasons out of the peer's control, -such as scheduler latency at the peer or loss of previous ACK frames. Any -delays beyond the peer's max_ack_delay are therefore considered effectively -part of path delay and incorporated into the smoothed_rtt estimate.¶
-When adjusting an RTT sample using peer-reported acknowledgement delays, an -endpoint:¶
-On the first RTT sample for a network path, the smoothed_rtt is set to the -latest_rtt.¶
-smoothed_rtt and rttvar are computed as follows, similar to [RFC6298]. On -the first RTT sample for a network path:¶
-
smoothed_rtt = latest_rtt
rttvar = latest_rtt / 2
¶
On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows:¶
-
ack_delay = min(Ack Delay in ACK Frame, max_ack_delay)
adjusted_rtt = latest_rtt
if (min_rtt + ack_delay < latest_rtt):
  adjusted_rtt = latest_rtt - ack_delay
smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt
rttvar_sample = abs(smoothed_rtt - adjusted_rtt)
rttvar = 3/4 * rttvar + 1/4 * rttvar_sample
¶
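The update rules above can be collected into a small estimator. This is a sketch following the pseudocode's update order; the class name and the assumption that times are in milliseconds are illustrative, not from this draft:

```python
class RttEstimator:
    """Tracks min_rtt, smoothed_rtt, and rttvar per the rules above."""

    def __init__(self):
        self.min_rtt = None
        self.smoothed_rtt = None
        self.rttvar = None

    def on_rtt_sample(self, latest_rtt, ack_delay, max_ack_delay):
        if self.smoothed_rtt is None:
            # First RTT sample on this network path
            self.min_rtt = latest_rtt
            self.smoothed_rtt = latest_rtt
            self.rttvar = latest_rtt / 2
            return
        # min_rtt ignores reported ack delay and only ever decreases
        self.min_rtt = min(self.min_rtt, latest_rtt)
        # Limit the reported delay to the peer's advertised max_ack_delay
        ack_delay = min(ack_delay, max_ack_delay)
        adjusted_rtt = latest_rtt
        # Only subtract the ack delay if doing so cannot push the
        # sample below min_rtt
        if self.min_rtt + ack_delay < latest_rtt:
            adjusted_rtt = latest_rtt - ack_delay
        self.smoothed_rtt = 7 / 8 * self.smoothed_rtt + 1 / 8 * adjusted_rtt
        rttvar_sample = abs(self.smoothed_rtt - adjusted_rtt)
        self.rttvar = 3 / 4 * self.rttvar + 1 / 4 * rttvar_sample

# First sample of 100 ms gives smoothed_rtt = 100, rttvar = 50; a second
# sample of 120 ms with a 10 ms ack delay is adjusted down to 110 ms.
est = RttEstimator()
est.on_rtt_sample(100, 0, 25)
est.on_rtt_sample(120, 10, 25)
print(est.smoothed_rtt, est.rttvar)
```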
QUIC senders use acknowledgements to detect lost packets, and a probe
timeout (see Section 5.2) to ensure acknowledgements are received. This section
provides a description of these algorithms.¶
-If a packet is lost, the QUIC transport needs to recover from that loss, such -as by retransmitting the data, sending an updated frame, or abandoning the -frame. For more information, see Section 13.3 of [QUIC-TRANSPORT].¶
-Acknowledgement-based loss detection implements the spirit of TCP's Fast -Retransmit [RFC5681], Early Retransmit [RFC5827], FACK [FACK], SACK loss -recovery [RFC6675], and RACK [RACK]. This section -provides an overview of how these algorithms are implemented in QUIC.¶
-A packet is declared lost if it meets all the following conditions:¶
-The acknowledgement indicates that a packet sent later was delivered, and the -packet and time thresholds provide some tolerance for packet reordering.¶
-Spuriously declaring packets as lost leads to unnecessary retransmissions and -may result in degraded performance due to the actions of the congestion -controller upon detecting loss. Implementations can detect spurious -retransmissions and increase the reordering threshold in packets or time to -reduce future spurious retransmissions and loss events. Implementations with -adaptive time thresholds MAY choose to start with smaller initial reordering -thresholds to minimize recovery latency.¶
-The RECOMMENDED initial value for the packet reordering threshold -(kPacketThreshold) is 3, based on best practices for TCP loss detection -[RFC5681] [RFC6675]. Implementations SHOULD NOT use a packet threshold -less than 3, to keep in line with TCP [RFC5681].¶
-Some networks may exhibit higher degrees of reordering, causing a sender to -detect spurious losses. Implementers MAY use algorithms developed for TCP, such -as TCP-NCR [RFC4653], to improve QUIC's reordering resilience.¶
-Once a later packet within the same packet number space has been acknowledged, -an endpoint SHOULD declare an earlier packet lost if it was sent a threshold -amount of time in the past. To avoid declaring packets as lost too early, this -time threshold MUST be set to at least the local timer granularity, as -indicated by the kGranularity constant. The time threshold is:¶
--max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity) -¶ -
If packets sent prior to the largest acknowledged packet cannot yet be declared -lost, then a timer SHOULD be set for the remaining time.¶
-Using max(smoothed_rtt, latest_rtt) protects from the two following cases:¶
-The RECOMMENDED time threshold (kTimeThreshold), expressed as a round-trip time -multiplier, is 9/8.¶
-Implementations MAY experiment with absolute thresholds, thresholds from -previous connections, adaptive thresholds, or including RTT variation. Smaller -thresholds reduce reordering resilience and increase spurious retransmissions, -and larger thresholds increase loss detection delay.¶
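The packet and time thresholds can be combined into a single loss-detection check. The sketch below assumes a timer granularity of 1 ms for kGranularity and a map from packet number to send time for unacknowledged packets; these names and structures are illustrative, not from this draft:

```python
K_PACKET_THRESHOLD = 3    # kPacketThreshold: RECOMMENDED initial value
K_TIME_THRESHOLD = 9 / 8  # kTimeThreshold: RECOMMENDED RTT multiplier
K_GRANULARITY = 1         # assumed timer granularity, in milliseconds

def detect_lost(unacked, largest_acked, now, smoothed_rtt, latest_rtt):
    """Return unacknowledged packet numbers that can be declared lost.

    A packet is lost when a packet sent later has been acknowledged and
    either the packet reordering threshold or the time threshold is
    exceeded."""
    loss_delay = max(K_TIME_THRESHOLD * max(smoothed_rtt, latest_rtt),
                     K_GRANULARITY)
    lost = []
    for pn, sent_time in unacked.items():
        if pn > largest_acked:
            continue  # nothing sent after this packet has been acked yet
        if (largest_acked - pn >= K_PACKET_THRESHOLD
                or sent_time <= now - loss_delay):
            lost.append(pn)
    return lost

# With an RTT of 100 ms the time threshold is 112.5 ms, so at time 120
# only packets 1 and 2 cross the packet threshold relative to packet 5.
print(detect_lost({1: 0, 2: 50, 4: 100}, 5, 120, 100, 100))
```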
-A Probe Timeout (PTO) triggers sending one or two probe datagrams when -ack-eliciting packets are not acknowledged within the expected period of -time or the handshake has not been completed. A PTO enables a connection to -recover from loss of tail packets or acknowledgements.¶
-As with loss detection, the probe timeout is per packet number space. -The PTO algorithm used in QUIC implements the reliability functions of -Tail Loss Probe [RACK], RTO [RFC5681], and F-RTO algorithms for -TCP [RFC5682]. The timeout computation is based on TCP's retransmission -timeout period [RFC6298].¶
-When an ack-eliciting packet is transmitted, the sender schedules a timer for -the PTO period as follows:¶
--PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay -¶ -
The PTO period is the amount of time that a sender ought to wait for an
acknowledgement of a sent packet. This time period includes the estimated
network roundtrip-time (smoothed_rtt), the variation in the estimate (4*rttvar),
and max_ack_delay, to account for the maximum time by which a receiver might
delay sending an acknowledgement. When the PTO is armed for the Initial or
Handshake packet number spaces, the max_ack_delay is 0, as specified in
Section 13.2.1 of [QUIC-TRANSPORT].¶
-The PTO value MUST be set to at least kGranularity, to avoid the timer expiring -immediately.¶
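The PTO computation, including the lower bound and the doubling on each consecutive expiration, can be sketched as follows (a minimal sketch assuming a 1 ms kGranularity and millisecond units; the function name and pto_count parameter are illustrative):

```python
K_GRANULARITY = 1  # assumed timer granularity, in milliseconds

def pto_period(smoothed_rtt, rttvar, max_ack_delay, pto_count=0):
    """PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay,
    doubled for each consecutive PTO expiration (pto_count).

    max_ack_delay is 0 when arming the PTO for the Initial or
    Handshake packet number spaces."""
    pto = smoothed_rtt + max(4 * rttvar, K_GRANULARITY) + max_ack_delay
    return pto * (2 ** pto_count)

# smoothed_rtt=100, rttvar=10, max_ack_delay=25 -> 100 + 40 + 25 = 165 ms;
# after two consecutive expirations the period quadruples to 660 ms.
print(pto_period(100, 10, 25))
print(pto_period(100, 10, 25, pto_count=2))
```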
-A sender computes its PTO timer every time an ack-eliciting packet is sent. -When ack-eliciting packets are in-flight in multiple packet number spaces, -the timer MUST be set for the packet number space with the earliest timeout, -except for ApplicationData, which MUST be ignored until the handshake -completes; see Section 4.1.1 of [QUIC-TLS]. Not arming the PTO for -ApplicationData prevents a client from retransmitting a 0-RTT packet on a PTO -expiration before confirming that the server is able to decrypt 0-RTT packets, -and prevents a server from sending a 1-RTT packet on a PTO expiration before it -has the keys to process an acknowledgement.¶
-When a PTO timer expires, the PTO period MUST be set to twice its current -value. This exponential reduction in the sender's rate is important because -consecutive PTOs might be caused by loss of packets or acknowledgements due to -severe congestion. Even when there are ack-eliciting packets in-flight in -multiple packet number spaces, the exponential increase in probe timeout -occurs across all spaces to prevent excess load on the network. For example, -a timeout in the Initial packet number space doubles the length of the timeout -in the Handshake packet number space.¶
-The life of a connection that is experiencing consecutive PTOs is limited by -the endpoint's idle timeout.¶
-The probe timer MUST NOT be set if the time threshold loss detection timer
(Section 5.1.2) is set. The time threshold loss detection timer is expected
to both expire earlier than the PTO and be less likely to spuriously retransmit
data.¶
-The initial probe timeout for a new connection or new path SHOULD be -set to twice the initial RTT. Resumed connections over the same network -MAY use the previous connection's final smoothed RTT value as the resumed -connection's initial RTT. If no previous RTT is available, the initial RTT -SHOULD be set to 500ms, resulting in a 1 second initial timeout as recommended -in [RFC6298].¶
-A connection MAY use the delay between sending a PATH_CHALLENGE and receiving a -PATH_RESPONSE to set the initial RTT (see kInitialRtt in -Appendix A.2) for a new path, but the delay SHOULD NOT be -considered an RTT sample.¶
-Until the server has validated the client's address on the path, the amount of -data it can send is limited to three times the amount of data received, -as specified in Section 8.1 of [QUIC-TRANSPORT]. If no data can be sent, -then the PTO alarm MUST NOT be armed until datagrams have been received from -the client.¶
-Since the server could be blocked until more packets are received from the -client, it is the client's responsibility to send packets to unblock the server -until it is certain that the server has finished its address validation -(see Section 8 of [QUIC-TRANSPORT]). That is, the client MUST set the -probe timer if the client has not received an acknowledgement for one of its -Handshake or 1-RTT packets, and has not received a HANDSHAKE_DONE frame.¶
-Prior to handshake completion, when few or no RTT samples have been
generated, it is possible that the probe timer expiration is due to an
incorrect RTT estimate at the client. To allow the client to improve its RTT
estimate, the new packet that it sends MUST be ack-eliciting. If Handshake
keys are available to the client, it MUST send a Handshake packet, and
otherwise it MUST send an Initial packet in a UDP datagram of at least 1200
bytes.¶
-Initial packets and Handshake packets could be never acknowledged, but they are -removed from bytes in flight when the Initial and Handshake keys are discarded, -as described below in Section Section 5.4. When Initial or Handshake -keys are discarded, the PTO and loss detection timers MUST be reset, because -discarding keys indicates forward progress and the loss detection timer might -have been set for a now discarded packet number space.¶
When a server receives an Initial packet containing duplicate CRYPTO data, it can assume the client did not receive all of the server's CRYPTO data sent in Initial packets, or that the client's estimated RTT is too small. When a client receives Handshake or 1-RTT packets prior to obtaining Handshake keys, it may assume some or all of the server's Initial packets were lost.¶
To speed up handshake completion under these conditions, an endpoint MAY send a packet containing unacknowledged CRYPTO data earlier than the PTO expiry, subject to address validation limits; see Section 8.1 of [QUIC-TRANSPORT].¶
Peers can also use coalesced packets to ensure that each datagram elicits at least one acknowledgement. For example, clients can coalesce an Initial packet containing PING and PADDING frames with a 0-RTT data packet, and a server can coalesce an Initial packet containing a PING frame with one or more packets in its first flight.¶
When a PTO timer expires, a sender MUST send at least one ack-eliciting packet in the packet number space as a probe, unless there is no data available to send. An endpoint MAY send up to two full-sized datagrams containing ack-eliciting packets, to avoid an expensive consecutive PTO expiration due to a single lost datagram, or to transmit data from multiple packet number spaces.¶
In addition to sending data in the packet number space for which the timer expired, the sender SHOULD send ack-eliciting packets from other packet number spaces with in-flight data, coalescing packets if possible.¶
If the sender wants to elicit a faster acknowledgement on PTO, it can skip a packet number to eliminate the ack delay.¶
When the PTO timer expires, and there is new or previously sent unacknowledged data, it MUST be sent.¶
It is possible the sender has no new or previously-sent data to send. As an example, consider the following sequence of events: new application data is sent in a STREAM frame, deemed lost, then retransmitted in a new packet, and then the original transmission is acknowledged. When there is no data to send, the sender SHOULD send a PING or other ack-eliciting frame in a single packet, re-arming the PTO timer.¶
Alternatively, instead of sending an ack-eliciting packet, the sender MAY mark any packets still in flight as lost. Doing so avoids sending an additional packet, but increases the risk that loss is declared too aggressively, resulting in an unnecessary rate reduction by the congestion controller.¶
Consecutive PTO periods increase exponentially, and as a result, connection recovery latency increases exponentially as packets continue to be dropped in the network. Sending two packets on PTO expiration increases resilience to packet drops, thus reducing the probability of consecutive PTO events.¶
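The exponential backoff described above can be sketched as follows (Python; the function name is hypothetical, but the formula matches the PTO computation used in the example pseudocode: smoothed_rtt + max(4 * rttvar, kGranularity) + max_ack_delay, doubled for each consecutive PTO):

```python
def pto_duration_ms(smoothed_rtt, rttvar, max_ack_delay,
                    pto_count, granularity=1):
    # Base PTO: RTT estimate plus RTT variation and peer ack delay.
    base = smoothed_rtt + max(4 * rttvar, granularity) + max_ack_delay
    # Each consecutive PTO doubles the timeout (exponential backoff).
    return base * (2 ** pto_count)
```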
Probe packets sent on a PTO MUST be ack-eliciting. A probe packet SHOULD carry new data when possible. A probe packet MAY carry retransmitted unacknowledged data when new data is unavailable, when flow control does not permit new data to be sent, or to opportunistically reduce loss recovery delay. Implementations MAY use alternative strategies for determining the content of probe packets, including sending new or retransmitted data based on the application's priorities.¶
When the PTO timer expires multiple times and new data cannot be sent, implementations must choose between sending the same payload every time or sending different payloads. Sending the same payload may be simpler and ensures the highest priority frames arrive first. Sending different payloads each time reduces the chances of spurious retransmission.¶
Delivery or loss of packets in flight is established when an ACK frame is received that newly acknowledges one or more packets.¶
A PTO timer expiration event does not indicate packet loss and MUST NOT cause prior unacknowledged packets to be marked as lost. When an acknowledgement is received that newly acknowledges packets, loss detection proceeds as dictated by packet and time threshold mechanisms; see Section 5.1.¶
A Retry packet causes a client to send another Initial packet, effectively restarting the connection process. A Retry packet indicates that the Initial was received, but not processed. A Retry packet cannot be treated as an acknowledgment, because it does not indicate that a packet was processed or specify the packet number.¶
Clients that receive a Retry packet reset congestion control and loss recovery state, including resetting any pending timers. Other connection state, in particular cryptographic handshake messages, is retained; see Section 17.2.5 of [QUIC-TRANSPORT].¶
The client MAY compute an RTT estimate to the server as the time period from when the first Initial was sent to when a Retry or a Version Negotiation packet is received. The client MAY use this value in place of its default for the initial RTT estimate.¶
When packet protection keys are discarded (see Section 4.10 of [QUIC-TLS]), all packets that were sent with those keys can no longer be acknowledged because their acknowledgements cannot be processed anymore. The sender MUST discard all recovery state associated with those packets and MUST remove them from the count of bytes in flight.¶
Endpoints stop sending and receiving Initial packets once they start exchanging Handshake packets (see Section 17.2.2.1 of [QUIC-TRANSPORT]). At this point, recovery state for all in-flight Initial packets is discarded.¶
When 0-RTT is rejected, recovery state for all in-flight 0-RTT packets is discarded.¶
If a server accepts 0-RTT, but does not buffer 0-RTT packets that arrive before Initial packets, early 0-RTT packets will be declared lost, but that is expected to be infrequent.¶
It is expected that keys are discarded after packets encrypted with them would be acknowledged or declared lost. Initial secrets, however, might be destroyed sooner, as soon as handshake keys are available (see Section 4.10.1 of [QUIC-TLS]).¶
This document specifies a congestion controller for QUIC similar to TCP NewReno [RFC6582].¶
The signals QUIC provides for congestion control are generic and are designed to support different algorithms. Endpoints can unilaterally choose a different algorithm to use, such as Cubic [RFC8312].¶
If an endpoint uses a different controller than that specified in this document, the chosen controller MUST conform to the congestion control guidelines specified in Section 3.1 of [RFC8085].¶
Similar to TCP, packets containing only ACK frames do not count towards bytes in flight and are not congestion controlled. Unlike TCP, QUIC can detect the loss of these packets and MAY use that information to adjust the congestion controller or the rate of ACK-only packets being sent, but this document does not describe a mechanism for doing so.¶
The algorithm in this document specifies and uses the controller's congestion window in bytes.¶
An endpoint MUST NOT send a packet if it would cause bytes_in_flight (see Appendix B.2) to be larger than the congestion window, unless the packet is sent on a PTO timer expiration (see Section 5.2).¶
If a path has been verified to support ECN [RFC3168] [RFC8311], QUIC treats a Congestion Experienced (CE) codepoint in the IP header as a signal of congestion. This document specifies an endpoint's response when its peer receives packets with the ECN-CE codepoint.¶
QUIC begins every connection in slow start with the congestion window set to an initial value. Endpoints SHOULD use an initial congestion window of 10 times the maximum datagram size (max_datagram_size), limited to the larger of 14720 or twice the maximum datagram size. This follows the analysis and recommendations in [RFC6928], increasing the byte limit to account for the smaller 8 byte overhead of UDP compared to the 20 byte overhead for TCP.¶
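The initial window recommendation above reduces to a one-line computation; this sketch (the function name is illustrative) assumes all sizes are in bytes:

```python
def initial_window(max_datagram_size):
    # 10 times the maximum datagram size, limited to the larger of
    # 14720 bytes or twice the maximum datagram size.
    return min(10 * max_datagram_size,
               max(2 * max_datagram_size, 14720))

# For a typical 1200-byte datagram the limit does not bind (12000 bytes);
# for very large datagrams, twice the datagram size becomes the cap.
```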
The minimum congestion window is the smallest value the congestion window can decrease to as a response to loss, ECN-CE, or persistent congestion. The RECOMMENDED value is 2 * max_datagram_size.¶
While in slow start, QUIC increases the congestion window by the number of bytes acknowledged when each acknowledgment is processed, resulting in exponential growth of the congestion window.¶
QUIC exits slow start upon loss or upon increase in the ECN-CE counter. When slow start is exited, the congestion window halves and the slow start threshold is set to the new congestion window. QUIC re-enters slow start any time the congestion window is less than the slow start threshold, which only occurs after persistent congestion is declared.¶
Slow start exits to congestion avoidance. Congestion avoidance uses an Additive Increase Multiplicative Decrease (AIMD) approach that increases the congestion window by one maximum packet size per congestion window acknowledged. When a loss or ECN-CE marking is detected, NewReno halves the congestion window, sets the slow start threshold to the new congestion window, and then enters the recovery period.¶
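A minimal sketch of the window arithmetic described above, assuming a loss reduction factor of one half and the initial and minimum windows recommended in this document (the class and method names are illustrative, not normative):

```python
class NewRenoSketch:
    # Slow start adds the acknowledged bytes to the window;
    # congestion avoidance adds one max_datagram_size per window
    # acknowledged; a congestion event halves the window, floored
    # at the minimum window of 2 * max_datagram_size.
    def __init__(self, max_datagram_size=1200):
        self.mds = max_datagram_size
        self.cwnd = 10 * max_datagram_size
        self.ssthresh = float("inf")
        self.min_window = 2 * max_datagram_size

    def on_ack(self, acked_bytes):
        if self.cwnd < self.ssthresh:
            self.cwnd += acked_bytes                    # slow start
        else:
            self.cwnd += self.mds * acked_bytes / self.cwnd  # AIMD increase

    def on_congestion_event(self):
        self.cwnd = max(self.cwnd / 2, self.min_window)
        self.ssthresh = self.cwnd
```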
A recovery period is entered when loss or ECN-CE marking of a packet is detected in congestion avoidance after the congestion window and slow start threshold have been decreased. A recovery period ends when a packet sent during the recovery period is acknowledged. This is slightly different from TCP's definition of recovery, which ends when the lost packet that started recovery is acknowledged.¶
The recovery period aims to limit congestion window reduction to once per round trip. Therefore, during recovery, the congestion window remains unchanged irrespective of new losses or increases in the ECN-CE counter.¶
When entering recovery, a single packet MAY be sent even if bytes in flight now exceeds the recently reduced congestion window. This speeds up loss recovery if the data in the lost packet is retransmitted and is similar to TCP as described in Section 5 of [RFC6675]. If further packets are lost while the sender is in recovery, sending any packets in response MUST obey the congestion window limit.¶
During the handshake, some packet protection keys might not be available when a packet arrives and the receiver can choose to drop the packet. In particular, Handshake and 0-RTT packets cannot be processed until the Initial packets arrive, and 1-RTT packets cannot be processed until the handshake completes. Endpoints MAY ignore the loss of Handshake, 0-RTT, and 1-RTT packets that might have arrived before the peer had packet protection keys to process those packets. Endpoints MUST NOT ignore the loss of packets that were sent after the earliest acknowledged packet in a given packet number space.¶
Probe packets MUST NOT be blocked by the congestion controller. A sender MUST however count these packets as being additionally in flight, since these packets add network load without establishing packet loss. Note that sending probe packets might cause the sender's bytes in flight to exceed the congestion window until an acknowledgement is received that establishes loss or delivery of packets.¶
When an ACK frame is received that establishes loss of all in-flight packets sent over a long enough period of time, the network is considered to be experiencing persistent congestion. Commonly, this can be established by consecutive PTOs, but since the PTO timer is reset when a new ack-eliciting packet is sent, an explicit duration must be used to account for those cases where PTOs do not occur or are substantially delayed. This duration is computed as follows:¶
(smoothed_rtt + 4 * rttvar + max_ack_delay) * kPersistentCongestionThreshold¶
For example, assume:¶
smoothed_rtt = 1
rttvar = 0
max_ack_delay = 0
kPersistentCongestionThreshold = 3¶
If an ack-eliciting packet is sent at time t = 0, the following scenario would illustrate persistent congestion:¶
t=0   Send Pkt #1 (App Data)
t=1   Send Pkt #2 (PTO 1)
t=3   Send Pkt #3 (PTO 2)
t=7   Send Pkt #4 (PTO 3)
t=8   Recv ACK of Pkt #4¶
The first three packets are determined to be lost when the acknowledgement of packet 4 is received at t=8. The congestion period is calculated as the time between the oldest and newest lost packets: (3 - 0) = 3. The duration for persistent congestion is equal to: (1 * kPersistentCongestionThreshold) = 3. Because the threshold was reached and because none of the packets between the oldest and the newest packets are acknowledged, the network is considered to have experienced persistent congestion.¶
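This example can be checked numerically; the helper below (hypothetical name) computes the persistent congestion duration from the formula above and compares it against the span of the lost packets:

```python
def persistent_congestion_duration(smoothed_rtt, rttvar,
                                   max_ack_delay, threshold=3):
    # (smoothed_rtt + 4 * rttvar + max_ack_delay) *
    #     kPersistentCongestionThreshold
    return (smoothed_rtt + 4 * rttvar + max_ack_delay) * threshold

# Values from the example: smoothed_rtt=1, rttvar=0, max_ack_delay=0.
duration = persistent_congestion_duration(1, 0, 0)
# Lost packets were sent at t=0, t=1, and t=3; the span is 3 - 0 = 3.
span = 3 - 0
# span >= duration, so persistent congestion is declared.
```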
When persistent congestion is established, the sender's congestion window MUST be reduced to the minimum congestion window (kMinimumWindow). This response of collapsing the congestion window on persistent congestion is functionally similar to a sender's response on a Retransmission Timeout (RTO) in TCP [RFC5681] after Tail Loss Probes (TLP) [RACK].¶
This document does not specify a pacer, but it is RECOMMENDED that a sender pace sending of all in-flight packets based on input from the congestion controller. For example, a pacer might distribute the congestion window over the smoothed RTT when used with a window-based controller, or a pacer might use the rate estimate of a rate-based controller.¶
An implementation should take care to architect its congestion controller to work well with a pacer. For instance, a pacer might wrap the congestion controller and control the availability of the congestion window, or a pacer might pace out packets handed to it by the congestion controller.¶
Timely delivery of ACK frames is important for efficient loss recovery. Packets containing only ACK frames SHOULD therefore not be paced, to avoid delaying their delivery to the peer.¶
Sending multiple packets into the network without any delay between them creates a packet burst that might cause short-term congestion and losses. Implementations MUST either use pacing or limit such bursts to the initial congestion window, which is recommended to be the minimum of 10 * max_datagram_size and max(2 * max_datagram_size, 14720), where max_datagram_size is the current maximum size of a datagram for the connection, not including UDP or IP overhead.¶
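As an illustration of window-based pacing, the inter-packet spacing could be derived by spreading the congestion window over one smoothed RTT. This is a simplification (the function name is hypothetical, and real pacers typically apply a gain factor to avoid under-utilizing the window):

```python
def pacing_interval_s(smoothed_rtt_s, cwnd_bytes, packet_bytes):
    # Spread cwnd_bytes evenly over one smoothed RTT:
    # the gap between packets is rtt * packet_size / cwnd.
    return smoothed_rtt_s * packet_bytes / cwnd_bytes

# With a 100 ms RTT and a 12000-byte window, 1200-byte packets
# are released roughly every 10 ms.
```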
As an example of a well-known and publicly available implementation of a flow pacer, implementers are referred to the Fair Queue packet scheduler (fq qdisc) in Linux (3.11 onwards).¶
When bytes in flight is smaller than the congestion window and sending is not pacing limited, the congestion window is under-utilized. When this occurs, the congestion window SHOULD NOT be increased in either slow start or congestion avoidance. This can happen due to insufficient application data or flow control limits.¶
A sender MAY use the pipeACK method described in Section 4.3 of [RFC7661] to determine if the congestion window is sufficiently utilized.¶
A sender that paces packets (see Section 6.9) might delay sending packets and not fully utilize the congestion window due to this delay. A sender SHOULD NOT consider itself application limited if it would have fully utilized the congestion window without pacing delay.¶
A sender MAY implement alternative mechanisms to update its congestion window after periods of under-utilization, such as those proposed for TCP in [RFC7661].¶
Congestion control fundamentally involves the consumption of signals, both loss and ECN codepoints, from unauthenticated entities. On-path attackers can spoof or alter these signals. An attacker can cause endpoints to reduce their sending rate by dropping packets, or alter send rate by changing ECN codepoints.¶
Packets that carry only ACK frames can be heuristically identified by observing packet size. Acknowledgement patterns may expose information about link characteristics or application behavior. Endpoints can use PADDING frames or bundle acknowledgments with other frames to reduce leaked information.¶
A receiver can misreport ECN markings to alter the congestion response of a sender. Suppressing reports of ECN-CE markings could cause a sender to increase their send rate. This increase could result in congestion and loss.¶
A sender MAY attempt to detect suppression of reports by marking occasional packets that they send with ECN-CE. If a packet sent with ECN-CE is not reported as having been CE marked when the packet is acknowledged, then the sender SHOULD disable ECN for that path.¶
Reporting additional ECN-CE markings will cause a sender to reduce their sending rate, which is similar in effect to advertising reduced connection flow control limits, and so no advantage is gained by doing so.¶
Endpoints choose the congestion controller that they use. Though congestion controllers generally treat reports of ECN-CE markings as equivalent to loss [RFC8311], the exact response for each controller could be different. Failure to correctly respond to information about ECN markings is therefore difficult to detect.¶
This document has no IANA actions.¶
We now describe an example implementation of the loss detection mechanisms described in Section 5.¶
To correctly implement congestion control, a QUIC sender tracks every ack-eliciting packet until the packet is acknowledged or lost. It is expected that implementations will be able to access this information by packet number and crypto context and store the per-packet fields (Appendix A.1.1) for loss recovery and congestion control.¶
After a packet is declared lost, the endpoint can track it for an amount of time comparable to the maximum expected packet reordering, such as 1 RTT. This allows for detection of spurious retransmissions.¶
Sent packets are tracked for each packet number space, and ACK processing only applies to a single space.¶
Constants used in loss recovery are based on a combination of RFCs, papers, and common practice.¶
enum kPacketNumberSpace {
  Initial,
  Handshake,
  ApplicationData,
}¶
Variables required to implement the loss detection mechanisms are described in this section.¶
At the beginning of the connection, initialize the loss detection variables as follows:¶
loss_detection_timer.reset()
pto_count = 0
latest_rtt = 0
smoothed_rtt = 0
rttvar = 0
min_rtt = 0
max_ack_delay = 0
for pn_space in [ Initial, Handshake, ApplicationData ]:
  largest_acked_packet[pn_space] = infinite
  time_of_last_sent_ack_eliciting_packet[pn_space] = 0
  loss_time[pn_space] = 0¶
After a packet is sent, information about the packet is stored. The parameters to OnPacketSent are described in detail above in Appendix A.1.1.¶
Pseudocode for OnPacketSent follows:¶
OnPacketSent(packet_number, pn_space, ack_eliciting,
             in_flight, sent_bytes):
  sent_packets[pn_space][packet_number].packet_number =
                                           packet_number
  sent_packets[pn_space][packet_number].time_sent = now
  sent_packets[pn_space][packet_number].ack_eliciting =
                                           ack_eliciting
  sent_packets[pn_space][packet_number].in_flight = in_flight
  if (in_flight):
    if (ack_eliciting):
      time_of_last_sent_ack_eliciting_packet[pn_space] = now
    OnPacketSentCC(sent_bytes)
    sent_packets[pn_space][packet_number].size = sent_bytes
    SetLossDetectionTimer()¶
When an ACK frame is received, it may newly acknowledge any number of packets.¶
Pseudocode for OnAckReceived and UpdateRtt follow:¶
OnAckReceived(ack, pn_space):
  if (largest_acked_packet[pn_space] == infinite):
    largest_acked_packet[pn_space] = ack.largest_acked
  else:
    largest_acked_packet[pn_space] =
        max(largest_acked_packet[pn_space], ack.largest_acked)

  // Nothing to do if there are no newly acked packets.
  newly_acked_packets = DetermineNewlyAckedPackets(ack, pn_space)
  if (newly_acked_packets.empty()):
    return

  // If the largest acknowledged is newly acked and
  // at least one ack-eliciting was newly acked, update the RTT.
  if (sent_packets[pn_space].contains(ack.largest_acked) &&
      IncludesAckEliciting(newly_acked_packets)):
    latest_rtt =
      now - sent_packets[pn_space][ack.largest_acked].time_sent
    ack_delay = 0
    if (pn_space == ApplicationData):
      ack_delay = ack.ack_delay
    UpdateRtt(ack_delay)

  // Process ECN information if present.
  if (ACK frame contains ECN information):
    ProcessECN(ack, pn_space)

  for acked_packet in newly_acked_packets:
    OnPacketAcked(acked_packet.packet_number, pn_space)

  DetectLostPackets(pn_space)

  pto_count = 0

  SetLossDetectionTimer()


UpdateRtt(ack_delay):
  // First RTT sample.
  if (smoothed_rtt == 0):
    min_rtt = latest_rtt
    smoothed_rtt = latest_rtt
    rttvar = latest_rtt / 2
    return

  // min_rtt ignores ack delay.
  min_rtt = min(min_rtt, latest_rtt)
  // Limit ack_delay by max_ack_delay
  ack_delay = min(ack_delay, max_ack_delay)
  // Adjust for ack delay if plausible.
  adjusted_rtt = latest_rtt
  if (latest_rtt > min_rtt + ack_delay):
    adjusted_rtt = latest_rtt - ack_delay

  rttvar = 3/4 * rttvar + 1/4 * abs(smoothed_rtt - adjusted_rtt)
  smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt¶
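For illustration, the UpdateRtt pseudocode above translates almost line for line into executable Python; this is an unofficial sketch, not normative:

```python
class RttEstimator:
    # Executable rendering of UpdateRtt: keeps min_rtt, smoothed_rtt,
    # and rttvar, with the peer's reported ack delay limited by
    # max_ack_delay and only subtracted when plausible.
    def __init__(self, max_ack_delay=0):
        self.smoothed_rtt = 0
        self.rttvar = 0
        self.min_rtt = 0
        self.max_ack_delay = max_ack_delay

    def update(self, latest_rtt, ack_delay):
        if self.smoothed_rtt == 0:
            # First RTT sample.
            self.min_rtt = latest_rtt
            self.smoothed_rtt = latest_rtt
            self.rttvar = latest_rtt / 2
            return
        # min_rtt ignores ack delay.
        self.min_rtt = min(self.min_rtt, latest_rtt)
        # Limit ack_delay by max_ack_delay.
        ack_delay = min(ack_delay, self.max_ack_delay)
        # Adjust for ack delay if plausible.
        adjusted_rtt = latest_rtt
        if latest_rtt > self.min_rtt + ack_delay:
            adjusted_rtt = latest_rtt - ack_delay
        self.rttvar = (3 / 4 * self.rttvar +
                       1 / 4 * abs(self.smoothed_rtt - adjusted_rtt))
        self.smoothed_rtt = 7 / 8 * self.smoothed_rtt + 1 / 8 * adjusted_rtt
```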
When a packet is acknowledged for the first time, the following OnPacketAcked function is called. Note that a single ACK frame may newly acknowledge several packets. OnPacketAcked must be called once for each of these newly acknowledged packets.¶
OnPacketAcked takes two parameters: acked_packet, which is the struct detailed in Appendix A.1.1, and the packet number space that this ACK frame was sent for.¶
Pseudocode for OnPacketAcked follows:¶
OnPacketAcked(acked_packet, pn_space):
  if (acked_packet.in_flight):
    OnPacketAckedCC(acked_packet)
  sent_packets[pn_space].remove(acked_packet.packet_number)¶
QUIC loss detection uses a single timer for all timeout loss detection. The duration of the timer is based on the timer's mode, which is set in the packet and timer events further below. The function SetLossDetectionTimer defined below shows how the single timer is set.¶
This algorithm may result in the timer being set in the past, particularly if timers wake up late. Timers set in the past SHOULD fire immediately.¶
Pseudocode for SetLossDetectionTimer follows:¶
GetEarliestTimeAndSpace(times):
  time = times[Initial]
  space = Initial
  for pn_space in [ Handshake, ApplicationData ]:
    if (times[pn_space] != 0 &&
        (time == 0 || times[pn_space] < time) &&
        # Skip ApplicationData until handshake completion.
        (pn_space != ApplicationData ||
         IsHandshakeComplete())):
      time = times[pn_space]
      space = pn_space
  return time, space

PeerNotAwaitingAddressValidation():
  # Assume clients validate the server's address implicitly.
  if (endpoint is server):
    return true
  # Servers complete address validation when a
  # protected packet is received.
  return has received Handshake ACK ||
         has received 1-RTT ACK ||
         has received HANDSHAKE_DONE

SetLossDetectionTimer():
  earliest_loss_time, _ = GetEarliestTimeAndSpace(loss_time)
  if (earliest_loss_time != 0):
    // Time threshold loss detection.
    loss_detection_timer.update(earliest_loss_time)
    return

  if (server is at anti-amplification limit):
    // The server's alarm is not set if nothing can be sent.
    loss_detection_timer.cancel()
    return

  if (no ack-eliciting packets in flight &&
      peer not awaiting address validation):
    // There is nothing to detect lost, so no timer is set.
    // However, the client needs to arm the timer if the
    // server might be blocked by the anti-amplification limit.
    loss_detection_timer.cancel()
    return

  // Use a default timeout if there are no RTT measurements
  if (smoothed_rtt == 0):
    timeout = 2 * kInitialRtt
  else:
    // Calculate PTO duration
    timeout = smoothed_rtt + max(4 * rttvar, kGranularity) +
              max_ack_delay
    timeout = timeout * (2 ^ pto_count)

  sent_time, _ = GetEarliestTimeAndSpace(
    time_of_last_sent_ack_eliciting_packet)
  loss_detection_timer.update(sent_time + timeout)¶
When the loss detection timer expires, the timer's mode determines the action to be performed.¶
Pseudocode for OnLossDetectionTimeout follows:¶
OnLossDetectionTimeout():
  earliest_loss_time, pn_space =
    GetEarliestTimeAndSpace(loss_time)
  if (earliest_loss_time != 0):
    // Time threshold loss detection
    DetectLostPackets(pn_space)
    SetLossDetectionTimer()
    return

  if (bytes_in_flight > 0):
    // PTO. Send new data if available, else retransmit old data.
    // If neither is available, send a single PING frame.
    _, pn_space = GetEarliestTimeAndSpace(
      time_of_last_sent_ack_eliciting_packet)
    SendOneOrTwoAckElicitingPackets(pn_space)
  else:
    assert(endpoint is client without 1-RTT keys)
    // Client sends an anti-deadlock packet: Initial is padded
    // to earn more anti-amplification credit,
    // a Handshake packet proves address ownership.
    if (has Handshake keys):
      SendOneAckElicitingHandshakePacket()
    else:
      SendOneAckElicitingPaddedInitialPacket()

  pto_count++
  SetLossDetectionTimer()¶
DetectLostPackets is called every time an ACK is received and operates on the sent_packets for that packet number space.¶
Pseudocode for DetectLostPackets follows:¶
DetectLostPackets(pn_space):
  assert(largest_acked_packet[pn_space] != infinite)
  loss_time[pn_space] = 0
  lost_packets = {}
  loss_delay = kTimeThreshold * max(latest_rtt, smoothed_rtt)

  // Minimum time of kGranularity before packets are deemed lost.
  loss_delay = max(loss_delay, kGranularity)

  // Packets sent before this time are deemed lost.
  lost_send_time = now() - loss_delay

  foreach unacked in sent_packets[pn_space]:
    if (unacked.packet_number > largest_acked_packet[pn_space]):
      continue

    // Mark packet as lost, or set time when it should be marked.
    if (unacked.time_sent <= lost_send_time ||
        largest_acked_packet[pn_space] >=
          unacked.packet_number + kPacketThreshold):
      sent_packets[pn_space].remove(unacked.packet_number)
      if (unacked.in_flight):
        lost_packets.insert(unacked)
    else:
      if (loss_time[pn_space] == 0):
        loss_time[pn_space] = unacked.time_sent + loss_delay
      else:
        loss_time[pn_space] = min(loss_time[pn_space],
                                  unacked.time_sent + loss_delay)

  // Inform the congestion controller of lost packets and
  // let it decide whether to retransmit immediately.
  if (!lost_packets.empty()):
    OnPacketsLost(lost_packets)¶
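The two loss conditions in DetectLostPackets can be isolated into a small executable check (Python sketch; kPacketThreshold of 3 and kTimeThreshold of 9/8 are the values recommended in this document, and the function name is illustrative):

```python
def packet_is_lost(packet_number, time_sent, largest_acked,
                   now, latest_rtt, smoothed_rtt,
                   k_packet_threshold=3, k_time_threshold=9/8,
                   k_granularity=0.001):
    # Only packets sent before the largest acknowledged packet
    # can be declared lost.
    if packet_number > largest_acked:
        return False
    # Time threshold: a fraction above the larger RTT estimate,
    # with a kGranularity floor.
    loss_delay = max(k_time_threshold * max(latest_rtt, smoothed_rtt),
                     k_granularity)
    # Lost if sent long enough ago, or if kPacketThreshold packets
    # sent after it have been acknowledged.
    return (time_sent <= now - loss_delay or
            largest_acked >= packet_number + k_packet_threshold)
```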
We now describe an example implementation of the congestion controller described in Section 6.¶
Constants used in congestion control are based on a combination of RFCs, papers, and common practice.¶
Variables required to implement the congestion control mechanisms are described in this section.¶
At the beginning of the connection, initialize the congestion control variables as follows:¶
congestion_window = kInitialWindow
bytes_in_flight = 0
congestion_recovery_start_time = 0
ssthresh = infinite
for pn_space in [ Initial, Handshake, ApplicationData ]:
  ecn_ce_counters[pn_space] = 0¶
Whenever a packet is sent, and it contains non-ACK frames, the packet increases bytes_in_flight.¶
OnPacketSentCC(bytes_sent):
  bytes_in_flight += bytes_sent¶
Invoked from loss detection's OnPacketAcked and is supplied with the acked_packet from sent_packets.¶
InCongestionRecovery(sent_time):
  return sent_time <= congestion_recovery_start_time

OnPacketAckedCC(acked_packet):
  // Remove from bytes_in_flight.
  bytes_in_flight -= acked_packet.size
  if (InCongestionRecovery(acked_packet.time_sent)):
    // Do not increase congestion window in recovery period.
    return
  if (IsAppOrFlowControlLimited()):
    // Do not increase congestion_window if application
    // limited or flow control limited.
    return
  if (congestion_window < ssthresh):
    // Slow start.
    congestion_window += acked_packet.size
  else:
    // Congestion avoidance.
    congestion_window += max_datagram_size * acked_packet.size
                         / congestion_window¶
Invoked from ProcessECN and OnPacketsLost when a new congestion event is detected. May start a new recovery period and reduces the congestion window.¶
CongestionEvent(sent_time):
  // Start a new congestion event if packet was sent after the
  // start of the previous congestion recovery period.
  if (!InCongestionRecovery(sent_time)):
    congestion_recovery_start_time = Now()
    congestion_window *= kLossReductionFactor
    congestion_window = max(congestion_window, kMinimumWindow)
    ssthresh = congestion_window
    // A packet can be sent to speed up loss recovery.
    MaybeSendOnePacket()¶
Invoked when an ACK frame with an ECN section is received from the peer.¶
ProcessECN(ack, pn_space):
  // If the ECN-CE counter reported by the peer has increased,
  // this could be a new congestion event.
  if (ack.ce_counter > ecn_ce_counters[pn_space]):
    ecn_ce_counters[pn_space] = ack.ce_counter
    CongestionEvent(sent_packets[ack.largest_acked].time_sent)¶
Invoked from DetectLostPackets when packets are deemed lost.¶
InPersistentCongestion(largest_lost_packet):
  pto = smoothed_rtt + max(4 * rttvar, kGranularity) +
        max_ack_delay
  congestion_period = pto * kPersistentCongestionThreshold
  // Determine if all packets in the time period before the
  // newest lost packet, including the edges, are marked
  // lost.
  return AreAllPacketsLost(largest_lost_packet,
                           congestion_period)

OnPacketsLost(lost_packets):
  // Remove lost packets from bytes_in_flight.
  for (lost_packet : lost_packets):
    bytes_in_flight -= lost_packet.size
  largest_lost_packet = lost_packets.last()
  CongestionEvent(largest_lost_packet.time_sent)

  // Collapse congestion window if persistent congestion
  if (InPersistentCongestion(largest_lost_packet)):
    congestion_window = kMinimumWindow¶
When Initial or Handshake keys are discarded, packets from the space are discarded and loss detection state is updated.¶
Pseudocode for OnPacketNumberSpaceDiscarded follows:¶
OnPacketNumberSpaceDiscarded(pn_space):
  assert(pn_space != ApplicationData)
  // Remove any unacknowledged packets from flight.
  foreach packet in sent_packets[pn_space]:
    if packet.in_flight:
      bytes_in_flight -= packet.size
  sent_packets[pn_space].clear()
  // Reset the loss detection and PTO timer
  time_of_last_sent_ack_eliciting_packet[pn_space] = 0
  loss_time[pn_space] = 0
  SetLossDetectionTimer()¶
Issue and pull request numbers are listed with a leading octothorp.¶
No changes.¶
No significant changes.¶
No significant changes.¶
No significant changes.¶
No significant changes.¶
No significant changes.¶
No significant changes.¶
The IETF QUIC Working Group received an enormous amount of support from many people. The following people provided substantive contributions to this document: Alessandro Ghedini, Benjamin Saunders, Gorry Fairhurst, 奥 一穂 (Kazuho Oku), Lars Eggert, Magnus Westerlund, Marten Seemann, Martin Duke, Martin Thomson, Nick Banks, Praveen Balasubramaniam.¶
Internet-Draft          Using TLS to Secure QUIC               March 2020¶
Thomson & Turner        Expires 22 September 2020¶
This document describes how Transport Layer Security (TLS) is used to secure QUIC.¶
Discussion of this draft takes place on the QUIC working group mailing list (quic@ietf.org), which is archived at https://mailarchive.ietf.org/arch/search/?email_list=quic.¶
Working Group information can be found at https://github.com/quicwg; source code and issues list for this draft can be found at https://github.com/quicwg/base-drafts/labels/-tls.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 22 September 2020.¶
Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
-This document describes how QUIC [QUIC-TRANSPORT] is secured using TLS -[TLS13].¶
-TLS 1.3 provides critical latency improvements for connection establishment over -previous versions. Absent packet loss, most new connections can be established -and secured within a single round trip; on subsequent connections between the -same client and server, the client can often send application data immediately, -that is, using a zero round trip setup.¶
-This document describes how TLS acts as a security component of QUIC.¶
-The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", -"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this -document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] -when, and only when, they appear in all capitals, as shown here.¶
-This document uses the terminology established in [QUIC-TRANSPORT].¶
-For brevity, the acronym TLS is used to refer to TLS 1.3, though a newer version -could be used (see Section 4.2).¶
-TLS provides two endpoints with a way to establish a means of communication over -an untrusted medium (that is, the Internet) that ensures that messages they -exchange cannot be observed, modified, or forged.¶
-Internally, TLS is a layered protocol, with the structure shown in -Figure 1.¶
-Each Handshake layer message (e.g., Handshake, Alerts, and Application Data) is -carried as a series of typed TLS records by the Record layer. Records are -individually cryptographically protected and then transmitted over a reliable -transport (typically TCP) which provides sequencing and guaranteed delivery.¶
-The TLS authenticated key exchange occurs between two endpoints: client and -server. The client initiates the exchange and the server responds. If the key -exchange completes successfully, both client and server will agree on a secret. -TLS supports both pre-shared key (PSK) and Diffie-Hellman over either finite -fields or elliptic curves ((EC)DHE) key exchanges. PSK is the basis for 0-RTT; -the latter provides perfect forward secrecy (PFS) when the (EC)DHE keys are -destroyed.¶
-After completing the TLS handshake, the client will have learned and -authenticated an identity for the server and the server is optionally able to -learn and authenticate an identity for the client. TLS supports X.509 -[RFC5280] certificate-based authentication for both server and client.¶
-The TLS key exchange is resistant to tampering by attackers and it produces -shared secrets that cannot be controlled by either participating peer.¶
-TLS provides two basic handshake modes of interest to QUIC:¶
A simplified TLS handshake with 0-RTT application data is shown in Figure 2. Note that this omits the EndOfEarlyData message, which is not used in QUIC (see Section 8.3). Likewise, neither ChangeCipherSpec nor KeyUpdate messages are used by QUIC; ChangeCipherSpec is redundant in TLS 1.3, and QUIC has defined its own key update mechanism (Section 6).¶
-Data is protected using a number of encryption levels:¶
- -Application Data may appear only in the Early Data and Application Data -levels. Handshake and Alert messages may appear in any level.¶
-The 0-RTT handshake is only possible if the client and server have previously -communicated. In the 1-RTT handshake, the client is unable to send protected -Application Data until it has received all of the Handshake messages sent by the -server.¶
-QUIC [QUIC-TRANSPORT] assumes responsibility for the confidentiality and -integrity protection of packets. For this it uses keys derived from a TLS -handshake [TLS13], but instead of carrying TLS records over QUIC (as with -TCP), TLS Handshake and Alert messages are carried directly over the QUIC -transport, which takes over the responsibilities of the TLS record layer, as -shown in Figure 3.¶
-QUIC also relies on TLS for authentication and negotiation of parameters that -are critical to security and performance.¶
-Rather than a strict layering, these two protocols cooperate: QUIC uses the TLS -handshake; TLS uses the reliability, ordered delivery, and record layer provided -by QUIC.¶
-At a high level, there are two main interactions between the TLS and QUIC -components:¶
-Figure 4 shows these interactions in more detail, with the QUIC packet -protection being called out specially.¶
-Unlike TLS over TCP, QUIC applications which want to send data do not send it -through TLS "application_data" records. Rather, they send it as QUIC STREAM -frames or other frame types which are then carried in QUIC packets.¶
-QUIC carries TLS handshake data in CRYPTO frames, each of which consists of a -contiguous block of handshake data identified by an offset and length. Those -frames are packaged into QUIC packets and encrypted under the current TLS -encryption level. As with TLS over TCP, once TLS handshake data has been -delivered to QUIC, it is QUIC's responsibility to deliver it reliably. Each -chunk of data that is produced by TLS is associated with the set of keys that -TLS is currently using. If QUIC needs to retransmit that data, it MUST use the -same keys even if TLS has already updated to newer keys.¶
-One important difference between TLS records (used with TCP) and QUIC CRYPTO -frames is that in QUIC multiple frames may appear in the same QUIC packet as -long as they are associated with the same packet number space. For instance, -an endpoint can bundle a Handshake message and an ACK for some Handshake data -into the same packet.¶
-Some frames are prohibited in different packet number spaces. The rules here -generalize those of TLS, in that frames associated with establishing the -connection can usually appear in packets in any packet number space, whereas -those associated with transferring data can only appear in the application -data packet number space:¶
-Note that it is not possible to send the following frames in 0-RTT packets for -various reasons: ACK, CRYPTO, HANDSHAKE_DONE, NEW_TOKEN, PATH_RESPONSE, and -RETIRE_CONNECTION_ID. A server MAY treat receipt of these frames in 0-RTT -packets as a connection error of type PROTOCOL_VIOLATION.¶
-Because packets could be reordered on the wire, QUIC uses the packet type to -indicate which keys were used to protect a given packet, as shown in -Table 1. When packets of different types need to be sent, -endpoints SHOULD use coalesced packets to send them in the same UDP datagram.¶
-Packet Type | -Encryption Keys | -PN Space | -
---|---|---|
Initial | -Initial secrets | -Initial | -
0-RTT Protected | -0-RTT | -Application data | -
Handshake | -Handshake | -Handshake | -
Retry | -Retry | -N/A | -
Version Negotiation | -N/A | -N/A | -
Short Header | -1-RTT | -Application data | -
Section 17 of [QUIC-TRANSPORT] shows how packets at the various encryption -levels fit into the handshake process.¶
-As shown in Figure 4, the interface from QUIC to TLS consists of four -primary functions:¶
-Additional functions might be needed to configure TLS.¶
-In this document, the TLS handshake is considered complete when the TLS stack -has reported that the handshake is complete. This happens when the TLS stack -has both sent a Finished message and verified the peer's Finished message. -Verifying the peer's Finished provides the endpoints with an assurance that -previous handshake messages have not been modified. Note that the handshake -does not complete at both endpoints simultaneously. Consequently, any -requirement that is based on the completion of the handshake depends on the -perspective of the endpoint in question.¶
-In this document, the TLS handshake is considered confirmed at the server when -the handshake completes. At the client, the handshake is considered confirmed -when a HANDSHAKE_DONE frame is received.¶
-A client MAY consider the handshake to be confirmed when it receives an -acknowledgement for a 1-RTT packet. This can be implemented by recording the -lowest packet number sent with 1-RTT keys, and comparing it to the Largest -Acknowledged field in any received 1-RTT ACK frame: once the latter is greater -than or equal to the former, the handshake is confirmed.¶
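The acknowledgement-based confirmation rule above can be sketched as follows. This is an illustrative state machine, not code from any QUIC implementation; all names are hypothetical.

```python
class ClientHandshakeState:
    """Tracks when a client may consider the handshake confirmed."""

    def __init__(self):
        self.lowest_1rtt_pn_sent = None   # lowest packet number sent with 1-RTT keys
        self.handshake_confirmed = False

    def on_1rtt_packet_sent(self, packet_number):
        # Record the lowest packet number sent with 1-RTT keys.
        if self.lowest_1rtt_pn_sent is None:
            self.lowest_1rtt_pn_sent = packet_number

    def on_1rtt_ack(self, largest_acknowledged):
        # Once the Largest Acknowledged field in a 1-RTT ACK frame is
        # greater than or equal to the lowest 1-RTT packet number sent,
        # the handshake is confirmed.
        if (self.lowest_1rtt_pn_sent is not None
                and largest_acknowledged >= self.lowest_1rtt_pn_sent):
            self.handshake_confirmed = True

    def on_handshake_done_frame(self):
        # Receipt of HANDSHAKE_DONE always confirms the handshake.
        self.handshake_confirmed = True
```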
-In order to drive the handshake, TLS depends on being able to send and receive -handshake messages. There are two basic functions on this interface: one where -QUIC requests handshake messages and one where QUIC provides handshake packets.¶
-Before starting the handshake QUIC provides TLS with the transport parameters -(see Section 8.2) that it wishes to carry.¶
-A QUIC client starts TLS by requesting TLS handshake bytes from TLS. The client -acquires handshake bytes before sending its first packet. A QUIC server starts -the process by providing TLS with the client's handshake bytes.¶
-At any time, the TLS stack at an endpoint will have a current sending -encryption level and receiving encryption level. Encryption levels determine -the packet type and keys that are used for protecting data.¶
-Each encryption level is associated with a different sequence of bytes, which is -reliably transmitted to the peer in CRYPTO frames. When TLS provides handshake -bytes to be sent, they are appended to the current flow. Any packet that -includes the CRYPTO frame is protected using keys from the corresponding -encryption level. Four encryption levels are used, producing keys for Initial, -0-RTT, Handshake, and 1-RTT packets. CRYPTO frames are carried in just three of -these levels, omitting the 0-RTT level. These four levels correspond to three -packet number spaces: Initial and Handshake encrypted packets use their own -separate spaces; 0-RTT and 1-RTT packets use the application data packet number -space.¶
-QUIC takes the unprotected content of TLS handshake records as the content of -CRYPTO frames. TLS record protection is not used by QUIC. QUIC assembles -CRYPTO frames into QUIC packets, which are protected using QUIC packet -protection.¶
QUIC is only capable of conveying TLS handshake records in CRYPTO frames. TLS alerts are turned into QUIC CONNECTION_CLOSE error codes; see Section 4.9. TLS application data and other message types cannot be carried by QUIC at any encryption level; it is an error if they are received from the TLS stack.¶
-When an endpoint receives a QUIC packet containing a CRYPTO frame from the -network, it proceeds as follows:¶
-Each time that TLS is provided with new data, new handshake bytes are requested -from TLS. TLS might not provide any bytes if the handshake messages it has -received are incomplete or it has no data to send.¶
-Once the TLS handshake is complete, this is indicated to QUIC along with any -final handshake bytes that TLS needs to send. TLS also provides QUIC with the -transport parameters that the peer advertised during the handshake.¶
-Once the handshake is complete, TLS becomes passive. TLS can still receive data -from its peer and respond in kind, but it will not need to send more data unless -specifically requested - either by an application or QUIC. One reason to send -data is that the server might wish to provide additional or updated session -tickets to a client.¶
-When the handshake is complete, QUIC only needs to provide TLS with any data -that arrives in CRYPTO streams. In the same way that is done during the -handshake, new data is requested from TLS after providing received data.¶
-As keys for new encryption levels become available, TLS provides QUIC with those -keys. Separately, as keys at a given encryption level become available to TLS, -TLS indicates to QUIC that reading or writing keys at that encryption level are -available. These events are not asynchronous; they always occur immediately -after TLS is provided with new handshake bytes, or after TLS produces handshake -bytes.¶
-TLS provides QUIC with three items as a new encryption level becomes available:¶
-These values are based on the values that TLS negotiates and are used by QUIC to -generate packet and header protection keys (see Section 5 and -Section 5.4).¶
-If 0-RTT is possible, it is ready after the client sends a TLS ClientHello -message or the server receives that message. After providing a QUIC client with -the first handshake bytes, the TLS stack might signal the change to 0-RTT -keys. On the server, after receiving handshake bytes that contain a ClientHello -message, a TLS server might signal that 0-RTT keys are available.¶
-Although TLS only uses one encryption level at a time, QUIC may use more than -one level. For instance, after sending its Finished message (using a CRYPTO -frame at the Handshake encryption level) an endpoint can send STREAM data (in -1-RTT encryption). If the Finished message is lost, the endpoint uses the -Handshake encryption level to retransmit the lost message. Reordering or loss -of packets can mean that QUIC will need to handle packets at multiple encryption -levels. During the handshake, this means potentially handling packets at higher -and lower encryption levels than the current encryption level used by TLS.¶
-In particular, server implementations need to be able to read packets at the -Handshake encryption level at the same time as the 0-RTT encryption level. A -client could interleave ACK frames that are protected with Handshake keys with -0-RTT data and the server needs to process those acknowledgments in order to -detect lost Handshake packets.¶
-QUIC also needs access to keys that might not ordinarily be available to a TLS -implementation. For instance, a client might need to acknowledge Handshake -packets before it is ready to send CRYPTO frames at that encryption level. TLS -therefore needs to provide keys to QUIC before it might produce them for its own -use.¶
-Figure 5 summarizes the exchange between QUIC and TLS for both -client and server. Each arrow is tagged with the encryption level used for that -transmission.¶
-Figure 5 shows the multiple packets that form a single "flight" of -messages being processed individually, to show what incoming messages trigger -different actions. New handshake messages are requested after all incoming -packets have been processed. This process might vary depending on how QUIC -implementations and the packets they receive are structured.¶
-This document describes how TLS 1.3 [TLS13] is used with QUIC.¶
-In practice, the TLS handshake will negotiate a version of TLS to use. This -could result in a newer version of TLS than 1.3 being negotiated if both -endpoints support that version. This is acceptable provided that the features -of TLS 1.3 that are used by QUIC are supported by the newer version.¶
-A badly configured TLS implementation could negotiate TLS 1.2 or another older -version of TLS. An endpoint MUST terminate the connection if a version of TLS -older than 1.3 is negotiated.¶
-The first Initial packet from a client contains the start or all of its first -cryptographic handshake message, which for TLS is the ClientHello. Servers -might need to parse the entire ClientHello (e.g., to access extensions such as -Server Name Identification (SNI) or Application Layer Protocol Negotiation -(ALPN)) in order to decide whether to accept the new incoming QUIC connection. -If the ClientHello spans multiple Initial packets, such servers would need to -buffer the first received fragments, which could consume excessive resources if -the client's address has not yet been validated. To avoid this, servers MAY -use the Retry feature (see Section 8.1 of [QUIC-TRANSPORT]) to only buffer -partial ClientHello messages from clients with a validated address.¶
QUIC packets and framing add at least 36 bytes of overhead to the ClientHello message. That overhead increases if the client chooses a non-zero-length connection ID. Overheads also do not include the token or a connection ID longer than 8 bytes, both of which might be required if a server sends a Retry packet.¶
-A typical TLS ClientHello can easily fit into a 1200 byte packet. However, in -addition to the overheads added by QUIC, there are several variables that could -cause this limit to be exceeded. Large session tickets, multiple or large key -shares, and long lists of supported ciphers, signature algorithms, versions, -QUIC transport parameters, and other negotiable parameters and extensions could -cause this message to grow.¶
-For servers, in addition to connection IDs and tokens, the size of TLS session -tickets can have an effect on a client's ability to connect efficiently. -Minimizing the size of these values increases the probability that clients can -use them and still fit their ClientHello message in their first Initial packet.¶
-The TLS implementation does not need to ensure that the ClientHello is -sufficiently large. QUIC PADDING frames are added to increase the size of the -packet as necessary.¶
-The requirements for authentication depend on the application protocol that is -in use. TLS provides server authentication and permits the server to request -client authentication.¶
-A client MUST authenticate the identity of the server. This typically involves -verification that the identity of the server is included in a certificate and -that the certificate is issued by a trusted entity (see for example -[RFC2818]).¶
-A server MAY request that the client authenticate during the handshake. A server -MAY refuse a connection if the client is unable to authenticate when requested. -The requirements for client authentication vary based on application protocol -and deployment.¶
-A server MUST NOT use post-handshake client authentication (as defined in -Section 4.6.2 of [TLS13]), because the multiplexing offered by QUIC prevents -clients from correlating the certificate request with the application-level -event that triggered it (see [HTTP2-TLS13]). -More specifically, servers MUST NOT send post-handshake TLS CertificateRequest -messages and clients MUST treat receipt of such messages as a connection error -of type PROTOCOL_VIOLATION.¶
-To communicate their willingness to process 0-RTT data, servers send a -NewSessionTicket message that contains the "early_data" extension with a -max_early_data_size of 0xffffffff; the amount of data which the client can send -in 0-RTT is controlled by the "initial_max_data" transport parameter supplied -by the server. Servers MUST NOT send the "early_data" extension with a -max_early_data_size set to any value other than 0xffffffff. A client MUST -treat receipt of a NewSessionTicket that contains an "early_data" extension -with any other value as a connection error of type PROTOCOL_VIOLATION.¶
-A client that wishes to send 0-RTT packets uses the "early_data" extension in -the ClientHello message of a subsequent handshake (see Section 4.2.10 of -[TLS13]). It then sends the application data in 0-RTT packets.¶
-A server accepts 0-RTT by sending an early_data extension in the -EncryptedExtensions (see Section 4.2.10 of [TLS13]). The server then -processes and acknowledges the 0-RTT packets that it receives.¶
-A server rejects 0-RTT by sending the EncryptedExtensions without an early_data -extension. A server will always reject 0-RTT if it sends a TLS -HelloRetryRequest. When rejecting 0-RTT, a server MUST NOT process any 0-RTT -packets, even if it could. When 0-RTT was rejected, a client SHOULD treat -receipt of an acknowledgement for a 0-RTT packet as a connection error of type -PROTOCOL_VIOLATION, if it is able to detect the condition.¶
-When 0-RTT is rejected, all connection characteristics that the client assumed -might be incorrect. This includes the choice of application protocol, transport -parameters, and any application configuration. The client therefore MUST reset -the state of all streams, including application state bound to those streams.¶
-A client MAY attempt to send 0-RTT again if it receives a Retry or Version -Negotiation packet. These packets do not signify rejection of 0-RTT.¶
-When a server receives a ClientHello with the "early_data" extension, it has to -decide whether to accept or reject early data from the client. Some of this -decision is made by the TLS stack (e.g., checking that the cipher suite being -resumed was included in the ClientHello; see Section 4.2.10 of [TLS13]). Even -when the TLS stack has no reason to reject early data, the QUIC stack or the -application protocol using QUIC might reject early data because the -configuration of the transport or application associated with the resumed -session is not compatible with the server's current configuration.¶
-QUIC requires additional transport state to be associated with a 0-RTT session -ticket. One common way to implement this is using stateless session tickets and -storing this state in the session ticket. Application protocols that use QUIC -might have similar requirements regarding associating or storing state. This -associated state is used for deciding whether early data must be rejected. For -example, HTTP/3 ([QUIC-HTTP]) settings determine how early data from the -client is interpreted. Other applications using QUIC could have different -requirements for determining whether to accept or reject early data.¶
-In TLS over TCP, the HelloRetryRequest feature (see Section 4.1.4 of -[TLS13]) can be used to correct a client's incorrect KeyShare extension as -well as for a stateless round-trip check. From the perspective of QUIC, this -just looks like additional messages carried in Initial packets. Although it is -in principle possible to use this feature for address verification in QUIC, -QUIC implementations SHOULD instead use the Retry feature (see Section 8.1 of -[QUIC-TRANSPORT]). HelloRetryRequest is still used to request key shares.¶
-If TLS experiences an error, it generates an appropriate alert as defined in -Section 6 of [TLS13].¶
-A TLS alert is turned into a QUIC connection error by converting the one-byte -alert description into a QUIC error code. The alert description is added to -0x100 to produce a QUIC error code from the range reserved for CRYPTO_ERROR. -The resulting value is sent in a QUIC CONNECTION_CLOSE frame of type 0x1c.¶
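The alert conversion described above is a fixed offset into the CRYPTO_ERROR range; a minimal sketch (function name is illustrative):

```python
CRYPTO_ERROR_BASE = 0x100  # base of the range reserved for CRYPTO_ERROR

def alert_to_quic_error_code(alert_description):
    # alert_description is the one-byte TLS AlertDescription value.
    assert 0 <= alert_description <= 0xFF
    return CRYPTO_ERROR_BASE + alert_description
```

For example, the TLS handshake_failure alert (description 40) maps to error code 0x128, which is then sent in a CONNECTION_CLOSE frame of type 0x1c.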
-The alert level of all TLS alerts is "fatal"; a TLS stack MUST NOT generate -alerts at the "warning" level.¶
-After QUIC moves to a new encryption level, packet protection keys for previous -encryption levels can be discarded. This occurs several times during the -handshake, as well as when keys are updated; see Section 6.¶
-Packet protection keys are not discarded immediately when new keys are -available. If packets from a lower encryption level contain CRYPTO frames, -frames that retransmit that data MUST be sent at the same encryption level. -Similarly, an endpoint generates acknowledgements for packets at the same -encryption level as the packet being acknowledged. Thus, it is possible that -keys for a lower encryption level are needed for a short time after keys for a -newer encryption level are available.¶
An endpoint cannot discard keys for a given encryption level unless it has both received and acknowledged all CRYPTO frames for that encryption level and had all CRYPTO frames it sent at that encryption level acknowledged by its peer. However, this does not guarantee that no further packets will need to be received or sent at that encryption level, because a peer might not have received all the acknowledgements necessary to reach the same state.¶
-Though an endpoint might retain older keys, new data MUST be sent at the highest -currently-available encryption level. Only ACK frames and retransmissions of -data in CRYPTO frames are sent at a previous encryption level. These packets -MAY also include PADDING frames.¶
-Packets protected with Initial secrets (Section 5.2) are not -authenticated, meaning that an attacker could spoof packets with the intent to -disrupt a connection. To limit these attacks, Initial packet protection keys -can be discarded more aggressively than other keys.¶
-The successful use of Handshake packets indicates that no more Initial packets -need to be exchanged, as these keys can only be produced after receiving all -CRYPTO frames from Initial packets. Thus, a client MUST discard Initial keys -when it first sends a Handshake packet and a server MUST discard Initial keys -when it first successfully processes a Handshake packet. Endpoints MUST NOT -send Initial packets after this point.¶
-This results in abandoning loss recovery state for the Initial encryption level -and ignoring any outstanding Initial packets.¶
-An endpoint MUST discard its handshake keys when the TLS handshake is confirmed -(Section 4.1.2). The server MUST send a HANDSHAKE_DONE frame as soon -as it completes the handshake.¶
-0-RTT and 1-RTT packets share the same packet number space, and clients do not -send 0-RTT packets after sending a 1-RTT packet (Section 5.6).¶
-Therefore, a client SHOULD discard 0-RTT keys as soon as it installs 1-RTT -keys, since they have no use after that moment.¶
-Additionally, a server MAY discard 0-RTT keys as soon as it receives a 1-RTT -packet. However, due to packet reordering, a 0-RTT packet could arrive after -a 1-RTT packet. Servers MAY temporarily retain 0-RTT keys to allow decrypting -reordered packets without requiring their contents to be retransmitted with -1-RTT keys. After receiving a 1-RTT packet, servers MUST discard 0-RTT keys -within a short time; the RECOMMENDED time period is three times the Probe -Timeout (PTO, see [QUIC-RECOVERY]). A server MAY discard 0-RTT keys earlier -if it determines that it has received all 0-RTT packets, which can be done by -keeping track of missing packet numbers.¶
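A server's 0-RTT key retention policy from the paragraph above can be expressed as a simple time check (a hypothetical sketch; a real server might also discard earlier once all 0-RTT packets are accounted for):

```python
def should_discard_0rtt_keys(now, first_1rtt_received_at, pto):
    # Servers MUST discard 0-RTT keys within a short time after receiving
    # the first 1-RTT packet; the RECOMMENDED period is three times the
    # Probe Timeout (PTO).
    if first_1rtt_received_at is None:
        return False  # no 1-RTT packet received yet
    return now >= first_1rtt_received_at + 3 * pto
```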
-As with TLS over TCP, QUIC protects packets with keys derived from the TLS -handshake, using the AEAD algorithm negotiated by TLS.¶
-QUIC derives packet protection keys in the same way that TLS derives record -protection keys.¶
-Each encryption level has separate secret values for protection of packets sent -in each direction. These traffic secrets are derived by TLS (see Section 7.1 of -[TLS13]) and are used by QUIC for all encryption levels except the Initial -encryption level. The secrets for the Initial encryption level are computed -based on the client's initial Destination Connection ID, as described in -Section 5.2.¶
-The keys used for packet protection are computed from the TLS secrets using the -KDF provided by TLS. In TLS 1.3, the HKDF-Expand-Label function described in -Section 7.1 of [TLS13] is used, using the hash function from the negotiated -cipher suite. Other versions of TLS MUST provide a similar function in order to -be used with QUIC.¶
-The current encryption level secret and the label "quic key" are input to the -KDF to produce the AEAD key; the label "quic iv" is used to derive the IV; see -Section 5.3. The header protection key uses the "quic hp" label; see -Section 5.4. Using these labels provides key separation between QUIC -and TLS; see Section 9.5.¶
-The KDF used for initial secrets is always the HKDF-Expand-Label function from -TLS 1.3 (see Section 5.2).¶
-Initial packets are protected with a secret derived from the Destination -Connection ID field from the client's Initial packet. Specifically:¶
initial_salt = 0xc3eef712c72ebb5a11a7d2432bb46365bef9f502
initial_secret = HKDF-Extract(initial_salt,
                              client_dst_connection_id)

client_initial_secret = HKDF-Expand-Label(initial_secret,
                                          "client in", "",
                                          Hash.length)
server_initial_secret = HKDF-Expand-Label(initial_secret,
                                          "server in", "",
                                          Hash.length)
¶
The hash function for HKDF when deriving initial secrets and keys is SHA-256 -[SHA].¶
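The derivation can be reproduced with a pure-Python HKDF, assuming SHA-256 and the HkdfLabel encoding from Section 7.1 of [TLS13]; the Destination Connection ID below is an arbitrary example value:

```python
import hashlib
import hmac

def hkdf_extract(salt, ikm):
    # HKDF-Extract (RFC 5869) with SHA-256.
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand_label(secret, label, context, length):
    # HKDF-Expand-Label from TLS 1.3: HKDF-Expand over an HkdfLabel
    # structure whose label is prefixed with "tls13 ".
    full_label = b"tls13 " + label
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + bytes([len(context)]) + context)
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(secret, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]

initial_salt = bytes.fromhex("c3eef712c72ebb5a11a7d2432bb46365bef9f502")
dcid = bytes.fromhex("8394c8f03e515708")  # example Destination Connection ID

initial_secret = hkdf_extract(initial_salt, dcid)
client_initial_secret = hkdf_expand_label(initial_secret, b"client in", b"", 32)
server_initial_secret = hkdf_expand_label(initial_secret, b"server in", b"", 32)
```

Both derived secrets are 32 bytes (the SHA-256 output length) and differ only by label, giving directional key separation from a single extracted secret.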
-The connection ID used with HKDF-Expand-Label is the Destination Connection ID -in the Initial packet sent by the client. This will be a randomly-selected -value unless the client creates the Initial packet after receiving a Retry -packet, where the Destination Connection ID is selected by the server.¶
-The value of initial_salt is a 20 byte sequence shown in the figure in -hexadecimal notation. Future versions of QUIC SHOULD generate a new salt value, -thus ensuring that the keys are different for each version of QUIC. This -prevents a middlebox that only recognizes one version of QUIC from seeing or -modifying the contents of packets from future versions.¶
-The HKDF-Expand-Label function defined in TLS 1.3 MUST be used for Initial -packets even where the TLS versions offered do not include TLS 1.3.¶
-The secrets used for protecting Initial packets change when a server sends a -Retry packet to use the connection ID value selected by the server. The secrets -do not change when a client changes the Destination Connection ID it uses in -response to an Initial packet from the server.¶
-Appendix A contains test vectors for packet encryption.¶
The Authenticated Encryption with Associated Data (AEAD) [AEAD] function used for QUIC packet protection is the AEAD that is negotiated for use with the TLS connection. For example, if TLS is using the TLS_AES_128_GCM_SHA256 cipher suite, the AEAD_AES_128_GCM function is used.¶
-Packets are protected prior to applying header protection (Section 5.4). -The unprotected packet header is part of the associated data (A). When removing -packet protection, an endpoint first removes the header protection.¶
-All QUIC packets other than Version Negotiation and Retry packets are protected -with an AEAD algorithm [AEAD]. Prior to establishing a shared secret, packets -are protected with AEAD_AES_128_GCM and a key derived from the Destination -Connection ID in the client's first Initial packet (see Section 5.2). -This provides protection against off-path attackers and robustness against QUIC -version unaware middleboxes, but not against on-path attackers.¶
-QUIC can use any of the ciphersuites defined in [TLS13] with the exception of -TLS_AES_128_CCM_8_SHA256. A ciphersuite MUST NOT be negotiated unless a header -protection scheme is defined for the ciphersuite. This document defines a -header protection scheme for all ciphersuites defined in [TLS13] aside from -TLS_AES_128_CCM_8_SHA256. These ciphersuites have a 16-byte authentication tag -and produce an output 16 bytes larger than their input.¶
-The key and IV for the packet are computed as described in Section 5.1. -The nonce, N, is formed by combining the packet protection IV with the packet -number. The 62 bits of the reconstructed QUIC packet number in network byte -order are left-padded with zeros to the size of the IV. The exclusive OR of the -padded packet number and the IV forms the AEAD nonce.¶
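The nonce construction above is a left-padded XOR; a minimal sketch (function name is illustrative):

```python
def aead_nonce(iv, packet_number):
    # Left-pad the reconstructed packet number with zeros to the size of
    # the IV, then XOR it with the IV to form the AEAD nonce.
    pn_bytes = packet_number.to_bytes(len(iv), "big")
    return bytes(i ^ p for i, p in zip(iv, pn_bytes))
```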
-The associated data, A, for the AEAD is the contents of the QUIC header, -starting from the flags byte in either the short or long header, up to and -including the unprotected packet number.¶
-The input plaintext, P, for the AEAD is the payload of the QUIC packet, as -described in [QUIC-TRANSPORT].¶
-The output ciphertext, C, of the AEAD is transmitted in place of P.¶
-Some AEAD functions have limits for how many packets can be encrypted under the -same key and IV (see for example [AEBounds]). This might be lower than the -packet number limit. An endpoint MUST initiate a key update (Section 6) -prior to exceeding any limit set for the AEAD that is in use.¶
-Parts of QUIC packet headers, in particular the Packet Number field, are protected using a key that is derived separately from the packet protection key and IV. The key derived using the "quic hp" label is used to provide confidentiality protection for those fields that are not exposed to on-path elements.¶
-This protection applies to the least-significant bits of the first byte, plus the Packet Number field. The four least-significant bits of the first byte are protected for packets with long headers; the five least-significant bits of the first byte are protected for packets with short headers. For both header forms, this covers the reserved bits and the Packet Number Length field; the Key Phase bit is also protected for packets with a short header.¶
-The same header protection key is used for the duration of the connection, with -the value not changing after a key update (see Section 6). This allows -header protection to be used to protect the key phase.¶
-This process does not apply to Retry or Version Negotiation packets, which do -not contain a protected payload or any of the fields that are protected by this -process.¶
-Header protection is applied after packet protection is applied (see Section 5.3). -The ciphertext of the packet is sampled and used as input to an encryption -algorithm. The algorithm used depends on the negotiated AEAD.¶
-The output of this algorithm is a 5-byte mask which is applied to the protected header fields using exclusive OR. The least significant bits of the first byte of the packet are masked by the least significant bits of the first mask byte, and the packet number is masked with the remaining bytes. Any bytes of the mask left over by a shorter packet number encoding are unused.¶
-Figure 6 shows a sample algorithm for applying header protection. Removing -header protection only differs in the order in which the packet number length -(pn_length) is determined.¶
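The application and removal of the mask can be sketched in Python as follows. This is an illustrative sketch, not the specification's Figure 6; pn_offset is the offset of the Packet Number field, and the function names are invented for the example:

```python
def protect_header(packet: bytearray, mask: bytes, pn_offset: int) -> None:
    # Sender side: the packet number length is read before the first byte
    # is masked.
    pn_length = (packet[0] & 0x03) + 1
    if packet[0] & 0x80:                 # long header: low 4 bits protected
        packet[0] ^= mask[0] & 0x0F
    else:                                # short header: low 5 bits protected
        packet[0] ^= mask[0] & 0x1F
    for i in range(pn_length):           # mask the Packet Number field
        packet[pn_offset + i] ^= mask[1 + i]

def remove_header_protection(packet: bytearray, mask: bytes, pn_offset: int) -> None:
    # Receiver side: identical, except the packet number length is
    # determined only after the first byte has been unmasked.
    if packet[0] & 0x80:
        packet[0] ^= mask[0] & 0x0F
    else:
        packet[0] ^= mask[0] & 0x1F
    pn_length = (packet[0] & 0x03) + 1
    for i in range(pn_length):
        packet[pn_offset + i] ^= mask[1 + i]
```

Because masking is an exclusive OR, protecting and then removing protection with the same mask restores the original header.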
-Figure 7 shows the protected fields of long and short headers marked with -an E. Figure 7 also shows the sampled fields.¶
-Before a TLS ciphersuite can be used with QUIC, a header protection algorithm -MUST be specified for the AEAD used with that ciphersuite. This document -defines algorithms for AEAD_AES_128_GCM, AEAD_AES_128_CCM, AEAD_AES_256_GCM -(all AES AEADs are defined in [AEAD]), and -AEAD_CHACHA20_POLY1305 [CHACHA]. Prior to TLS selecting a -ciphersuite, AES header protection is used (Section 5.4.3), matching the -AEAD_AES_128_GCM packet protection.¶
-The header protection algorithm uses both the header protection key and a sample -of the ciphertext from the packet Payload field.¶
-The same number of bytes are always sampled, but an allowance needs to be made -for the endpoint removing protection, which will not know the length of the -Packet Number field. In sampling the packet ciphertext, the Packet Number field -is assumed to be 4 bytes long (its maximum possible encoded length).¶
-An endpoint MUST discard packets that are not long enough to contain a complete -sample.¶
-To ensure that sufficient data is available for sampling, packets are padded so that the combined length of the encoded packet number and protected payload is at least 4 bytes longer than the sample required for header protection. The ciphersuites defined in [TLS13] - other than TLS_AES_128_CCM_8_SHA256, for which a header protection scheme is not defined in this document - have 16-byte expansions and 16-byte header protection samples. This results in needing at least 3 bytes of frames in the unprotected payload if the packet number is encoded on a single byte, or 2 bytes of frames for a 2-byte packet number encoding.¶
-The sampled ciphertext for a packet with a short header can be determined by the -following pseudocode:¶
--sample_offset = 1 + len(connection_id) + 4 - -sample = packet[sample_offset..sample_offset+sample_length] -¶ -
For example, for a packet with a short header, an 8 byte connection ID, and -protected with AEAD_AES_128_GCM, the sample takes bytes 13 to 28 inclusive -(using zero-based indexing).¶
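The short-header sampling pseudocode above translates directly into Python; the function name is illustrative:

```python
def sample_short_header(packet: bytes, connection_id_len: int,
                        sample_length: int = 16) -> bytes:
    # The Packet Number field is assumed to be its maximum encoded length
    # of 4 bytes, so the sample always starts at a fixed offset.
    sample_offset = 1 + connection_id_len + 4
    return packet[sample_offset : sample_offset + sample_length]
```

For an 8-byte connection ID and a 16-byte sample, this yields bytes 13 through 28 of the packet, matching the example above.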
-A packet with a long header is sampled in the same way, noting that multiple -QUIC packets might be included in the same UDP datagram and that each one is -handled separately.¶
--sample_offset = 7 + len(destination_connection_id) + - len(source_connection_id) + - len(payload_length) + 4 -if packet_type == Initial: - sample_offset += len(token_length) + - len(token) - -sample = packet[sample_offset..sample_offset+sample_length] -¶ -
This section defines the header protection algorithm for AEAD_AES_128_GCM, -AEAD_AES_128_CCM, and AEAD_AES_256_GCM. AEAD_AES_128_GCM and -AEAD_AES_128_CCM use 128-bit AES [AES] in -electronic codebook (ECB) mode. AEAD_AES_256_GCM uses -256-bit AES in ECB mode.¶
-This algorithm samples 16 bytes from the packet ciphertext. This value is used -as the input to AES-ECB. In pseudocode:¶
--mask = AES-ECB(hp_key, sample) -¶ -
When AEAD_CHACHA20_POLY1305 is in use, header protection uses the raw ChaCha20 -function as defined in Section 2.4 of [CHACHA]. This uses a 256-bit key and -16 bytes sampled from the packet protection output.¶
-The first 4 bytes of the sampled ciphertext are the block counter. A ChaCha20 -implementation could take a 32-bit integer in place of a byte sequence, in -which case the byte sequence is interpreted as a little-endian value.¶
-The remaining 12 bytes are used as the nonce. A ChaCha20 implementation might -take an array of three 32-bit integers in place of a byte sequence, in which -case the nonce bytes are interpreted as a sequence of 32-bit little-endian -integers.¶
-The encryption mask is produced by invoking ChaCha20 to protect 5 zero bytes. In -pseudocode:¶
--counter = sample[0..3] -nonce = sample[4..15] -mask = ChaCha20(hp_key, counter, nonce, {0,0,0,0,0}) -¶ -
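Preparing the ChaCha20 inputs from the sample can be sketched in Python. Only the input preparation is shown, since the standard library provides no raw ChaCha20 function; the helper name is illustrative:

```python
def chacha20_hp_inputs(sample: bytes):
    # The first 4 sampled bytes form the block counter, interpreted as a
    # little-endian 32-bit integer; the remaining 12 bytes are the nonce.
    counter = int.from_bytes(sample[0:4], "little")
    nonce = sample[4:16]
    return counter, nonce
```

The mask is then the ChaCha20 keystream produced for 5 zero bytes under hp_key with this counter and nonce.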
Once an endpoint successfully receives a packet with a given packet number, it -MUST discard all packets in the same packet number space with higher packet -numbers if they cannot be successfully unprotected with either the same key, or -- if there is a key update - the next packet protection key (see -Section 6). Similarly, a packet that appears to trigger a key update but -cannot be successfully unprotected MUST be discarded.¶
-Failure to unprotect a packet does not necessarily indicate the existence of a -protocol error in a peer or an attack. The truncated packet number encoding -used in QUIC can cause packet numbers to be decoded incorrectly if they are -delayed significantly.¶
-If 0-RTT keys are available (see Section 4.5), the lack of replay protection -means that restrictions on their use are necessary to avoid replay attacks on -the protocol.¶
-A client MUST only use 0-RTT keys to protect data that is idempotent. A client -MAY wish to apply additional restrictions on what data it sends prior to the -completion of the TLS handshake. A client otherwise treats 0-RTT keys as -equivalent to 1-RTT keys, except that it MUST NOT send ACKs with 0-RTT keys.¶
-A client that receives an indication that its 0-RTT data has been accepted by a -server can send 0-RTT data until it receives all of the server's handshake -messages. A client SHOULD stop sending 0-RTT data if it receives an indication -that 0-RTT data has been rejected.¶
-A server MUST NOT use 0-RTT keys to protect packets; it uses 1-RTT keys to -protect acknowledgements of 0-RTT packets. A client MUST NOT attempt to -decrypt 0-RTT packets it receives and instead MUST discard them.¶
-Once a client has installed 1-RTT keys, it MUST NOT send any more 0-RTT -packets.¶
-Due to reordering and loss, protected packets might be received by an endpoint -before the final TLS handshake messages are received. A client will be unable -to decrypt 1-RTT packets from the server, whereas a server will be able to -decrypt 1-RTT packets from the client. Endpoints in either role MUST NOT -decrypt 1-RTT packets from their peer prior to completing the handshake.¶
-Even though 1-RTT keys are available to a server after receiving the first -handshake messages from a client, it is missing assurances on the client state:¶
-Therefore, a server's use of 1-RTT keys before the handshake is complete MUST be limited to sending data. A server MUST NOT process incoming 1-RTT protected packets before the TLS handshake is complete. Because sending acknowledgments indicates that all frames in a packet have been processed, a server cannot send acknowledgments for 1-RTT packets until the TLS handshake is complete. Received packets protected with 1-RTT keys MAY be stored and later decrypted and used once the handshake is complete.¶
-The requirement for the server to wait for the client Finished message creates -a dependency on that message being delivered. A client can avoid the -potential for head-of-line blocking that this implies by sending its 1-RTT -packets coalesced with a handshake packet containing a copy of the CRYPTO frame -that carries the Finished message, until one of the handshake packets is -acknowledged. This enables immediate server processing for those packets.¶
-A server could receive packets protected with 0-RTT keys prior to receiving a -TLS ClientHello. The server MAY retain these packets for later decryption in -anticipation of receiving a ClientHello.¶
-Retry packets (see the Retry Packet section of [QUIC-TRANSPORT]) carry a -Retry Integrity Tag that provides two properties: it allows discarding -packets that have accidentally been corrupted by the network, and it diminishes -off-path attackers' ability to send valid Retry packets.¶
-The Retry Integrity Tag is a 128-bit field that is computed as the output of -AEAD_AES_128_GCM [AEAD] used with the following inputs:¶
-The secret key and the nonce are values derived by calling HKDF-Expand-Label -using 0x656e61e336ae9417f7f0edd8d78d461e2aa7084aba7a14c1e9f726d55709169a as the -secret, with labels being "quic key" and "quic iv" (Section 5.1).¶
-The Retry Pseudo-Packet is not sent over the wire. It is computed by taking -the transmitted Retry packet, removing the Retry Integrity Tag and prepending -the two following fields:¶
-Once the handshake is confirmed (see Section 4.1.2), an endpoint MAY -initiate a key update.¶
-The Key Phase bit indicates which packet protection keys are used to protect the -packet. The Key Phase bit is initially set to 0 for the first set of 1-RTT -packets and toggled to signal each subsequent key update.¶
-The Key Phase bit allows a recipient to detect a change in keying material -without needing to receive the first packet that triggered the change. An -endpoint that notices a changed Key Phase bit updates keys and decrypts the -packet that contains the changed value.¶
-This mechanism replaces the TLS KeyUpdate message. Endpoints MUST NOT send a -TLS KeyUpdate message. Endpoints MUST treat the receipt of a TLS KeyUpdate -message as a connection error of type 0x10a, equivalent to a fatal TLS alert of -unexpected_message (see Section 4.9).¶
-Figure 9 shows a key update process, where the initial set of keys used -(identified with @M) are replaced by updated keys (identified with @N). The -value of the Key Phase bit is indicated in brackets [].¶
-Endpoints maintain separate read and write secrets for packet protection. An -endpoint initiates a key update by updating its packet protection write secret -and using that to protect new packets. The endpoint creates a new write secret -from the existing write secret as performed in Section 7.2 of [TLS13]. This -uses the KDF function provided by TLS with a label of "quic ku". The -corresponding key and IV are created from that secret as defined in -Section 5.1. The header protection key is not updated.¶
-For example, to update write keys with TLS 1.3, HKDF-Expand-Label is used as:¶
--secret_<n+1> = HKDF-Expand-Label(secret_<n>, "quic ku", - "", Hash.length) -¶ -
The endpoint toggles the value of the Key Phase bit and uses the updated key and -IV to protect all subsequent packets.¶
-An endpoint MUST NOT initiate a key update prior to having confirmed the handshake (Section 4.1.2). An endpoint MUST NOT initiate a subsequent key update unless it has received an acknowledgment for a packet that was sent protected with keys from the current key phase. This ensures that keys are available to both peers before another key update can be initiated. This can be implemented by tracking the lowest packet number sent with each key phase, and the highest acknowledged packet number in the 1-RTT space: once the latter is higher than or equal to the former, another key update can be initiated.¶
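The packet-number tracking described above can be sketched as a small state machine. The class and method names are illustrative, not from the specification:

```python
class KeyUpdatePolicy:
    """Track whether a subsequent key update is permitted (a sketch)."""

    def __init__(self):
        self.lowest_pn_in_phase = None  # lowest 1-RTT pn sent in current phase
        self.highest_acked_pn = -1      # highest acknowledged 1-RTT pn

    def on_send(self, packet_number: int):
        # Record the first packet number sent with the current key phase.
        if self.lowest_pn_in_phase is None:
            self.lowest_pn_in_phase = packet_number

    def on_ack(self, packet_number: int):
        self.highest_acked_pn = max(self.highest_acked_pn, packet_number)

    def may_initiate_update(self) -> bool:
        # A packet from the current phase must have been acknowledged.
        return (self.lowest_pn_in_phase is not None
                and self.highest_acked_pn >= self.lowest_pn_in_phase)

    def on_update(self):
        # New phase: wait for an acknowledgment of a packet sent in it.
        self.lowest_pn_in_phase = None
```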
-The endpoint that initiates a key update also updates the keys that it uses for -receiving packets. These keys will be needed to process packets the peer sends -after updating.¶
-An endpoint SHOULD retain old keys so that packets sent by its peer prior to -receiving the key update can be processed. Discarding old keys too early can -cause delayed packets to be discarded. Discarding packets will be interpreted -as packet loss by the peer and could adversely affect performance.¶
-A peer is permitted to initiate a key update after receiving an acknowledgement of a packet in the current key phase. An endpoint detects a key update when processing a packet with a key phase that differs from the value used to protect the last packet it sent. To process this packet, the endpoint uses the next packet protection key and IV. See Section 6.3 for considerations about generating these keys.¶
-If a packet is successfully processed using the next key and IV, then the peer -has initiated a key update. The endpoint MUST update its send keys to the -corresponding key phase in response, as described in Section 6.1. -Sending keys MUST be updated before sending an acknowledgement for the packet -that was received with updated keys. By acknowledging the packet that triggered -the key update in a packet protected with the updated keys, the endpoint signals -that the key update is complete.¶
-An endpoint can defer sending the packet or acknowledgement according to its -normal packet sending behaviour; it is not necessary to immediately generate a -packet in response to a key update. The next packet sent by the endpoint will -use the updated keys. The next packet that contains an acknowledgement will -cause the key update to be completed. If an endpoint detects a second update -before it has sent any packets with updated keys containing an -acknowledgement for the packet that initiated the key update, it indicates that -its peer has updated keys twice without awaiting confirmation. An endpoint MAY -treat consecutive key updates as a connection error of type KEY_UPDATE_ERROR.¶
-An endpoint that receives an acknowledgement that is carried in a packet -protected with old keys where any acknowledged packet was protected with newer -keys MAY treat that as a connection error of type KEY_UPDATE_ERROR. This -indicates that a peer has received and acknowledged a packet that initiates a -key update, but has not updated keys in response.¶
-Endpoints responding to an apparent key update MUST NOT generate a timing -side-channel signal that might indicate that the Key Phase bit was invalid (see -Section 9.3). Endpoints can use dummy packet protection keys in -place of discarded keys when key updates are not yet permitted. Using dummy -keys will generate no variation in the timing signal produced by attempting to -remove packet protection, and results in all packets with an invalid Key Phase -bit being rejected.¶
-The process of creating new packet protection keys for receiving packets could -reveal that a key update has occurred. An endpoint MAY perform this process as -part of packet processing, but this creates a timing signal that can be used by -an attacker to learn when key updates happen and thus the value of the Key Phase -bit in certain packets. Endpoints MAY instead defer the creation of the next -set of receive packet protection keys until some time after a key update -completes, up to three times the PTO; see Section 6.5.¶
-Once generated, the next set of packet protection keys SHOULD be retained, even -if the packet that was received was subsequently discarded. Packets containing -apparent key updates are easy to forge and - while the process of key update -does not require significant effort - triggering this process could be used by -an attacker for DoS.¶
-For this reason, endpoints MUST be able to retain two sets of packet protection -keys for receiving packets: the current and the next. Retaining the previous -keys in addition to these might improve performance, but this is not essential.¶
-An endpoint always sends packets that are protected with the newest keys. Keys -used for packet protection can be discarded immediately after switching to newer -keys.¶
-Packets with higher packet numbers MUST be protected with either the same or -newer packet protection keys than packets with lower packet numbers. An -endpoint that successfully removes protection with old keys when newer keys were -used for packets with lower packet numbers MUST treat this as a connection error -of type KEY_UPDATE_ERROR.¶
-For receiving packets during a key update, packets protected with older keys -might arrive if they were delayed by the network. Retaining old packet -protection keys allows these packets to be successfully processed.¶
-As packets protected with keys from the next key phase use the same Key Phase -value as those protected with keys from the previous key phase, it can be -necessary to distinguish between the two. This can be done using packet -numbers. A recovered packet number that is lower than any packet number from -the current key phase uses the previous packet protection keys; a recovered -packet number that is higher than any packet number from the current key phase -requires the use of the next packet protection keys.¶
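The selection rule above can be sketched as a small function. This is an illustrative sketch, not a constant-time implementation (see the side-channel caveat that follows), and the names are invented for the example:

```python
def keys_for_packet(packet_phase_bit: int, current_phase_bit: int,
                    pn: int, lowest_current_pn: int) -> str:
    # A matching Key Phase bit selects the current keys.
    if packet_phase_bit == current_phase_bit:
        return "current"
    # Previous and next phases share the other bit value, so the recovered
    # packet number distinguishes them: lower than any packet number from
    # the current phase means previous keys; higher means next keys.
    return "previous" if pn < lowest_current_pn else "next"
```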
-Some care is necessary to ensure that any process for selecting between -previous, current, and next packet protection keys does not expose a timing side -channel that might reveal which keys were used to remove packet protection. See -Section 9.4 for more information.¶
-Alternatively, endpoints can retain only two sets of packet protection keys, -swapping previous for next after enough time has passed to allow for reordering -in the network. In this case, the Key Phase bit alone can be used to select -keys.¶
-An endpoint MAY allow a period of approximately the Probe Timeout (PTO; see -[QUIC-RECOVERY]) after a key update before it creates the next set of packet -protection keys. These updated keys MAY replace the previous keys at that time. -With the caveat that PTO is a subjective measure - that is, a peer could have a -different view of the RTT - this time is expected to be long enough that any -reordered packets would be declared lost by a peer even if they were -acknowledged and short enough to allow for subsequent key updates.¶
-Endpoints need to allow for the possibility that a peer might not be able to -decrypt packets that initiate a key update during the period when it retains old -keys. Endpoints SHOULD wait three times the PTO before initiating a key update -after receiving an acknowledgment that confirms that the previous key update was -received. Failing to allow sufficient time could lead to packets being -discarded.¶
-An endpoint SHOULD retain old read keys for no more than three times the PTO. -After this period, old read keys and their corresponding secrets SHOULD be -discarded.¶
-Key updates MUST be initiated before usage limits on packet protection keys are -exceeded. For the cipher suites mentioned in this document, the limits in -Section 5.5 of [TLS13] apply. Other cipher suites MUST define usage limits -in order to be used with QUIC.¶
-The KEY_UPDATE_ERROR error code (0xE) is used to signal errors related to key -updates.¶
-Initial packets are not protected with a secret key, so they are subject to -potential tampering by an attacker. QUIC provides protection against attackers -that cannot read packets, but does not attempt to provide additional protection -against attacks where the attacker can observe and inject packets. Some forms -of tampering - such as modifying the TLS messages themselves - are detectable, -but some - such as modifying ACKs - are not.¶
-For example, an attacker could inject a packet containing an ACK frame that makes it appear that a packet had not been received or to create a false impression of the state of the connection (e.g., by modifying the ACK Delay). Note that such a packet could cause a legitimate packet to be dropped as a duplicate. Implementations SHOULD use caution in relying on any data contained in Initial packets that is not otherwise authenticated.¶
-It is also possible for the attacker to tamper with data that is carried in -Handshake packets, but because that tampering requires modifying TLS handshake -messages, that tampering will cause the TLS handshake to fail.¶
-QUIC uses the TLS handshake for more than just negotiation of cryptographic -parameters. The TLS handshake provides preliminary values for QUIC transport -parameters and allows a server to perform return routability checks on clients.¶
-QUIC requires that the cryptographic handshake provide authenticated protocol negotiation. TLS uses Application Layer Protocol Negotiation (ALPN) [ALPN] to select an application protocol. Unless another mechanism is used for agreeing on an application protocol, endpoints MUST use ALPN for this purpose. When using ALPN, endpoints MUST immediately close a connection (see Section 10.3 in [QUIC-TRANSPORT]) with a no_application_protocol TLS alert (QUIC error code 0x178, see Section 4.9) if an application protocol is not negotiated. While [ALPN] only specifies that servers use this alert, QUIC clients MUST also use it to terminate a connection when ALPN negotiation fails.¶
-An application protocol MAY restrict the QUIC versions that it can operate over. -Servers MUST select an application protocol compatible with the QUIC version -that the client has selected. The server MUST treat the inability to select a -compatible application protocol as a connection error of type 0x178 -(no_application_protocol). Similarly, a client MUST treat the selection of an -incompatible application protocol by a server as a connection error of type -0x178.¶
-QUIC transport parameters are carried in a TLS extension. Different versions of -QUIC might define a different method for negotiating transport configuration.¶
-Including transport parameters in the TLS handshake provides integrity -protection for these values.¶
-- enum { - quic_transport_parameters(0xffa5), (65535) - } ExtensionType; -¶ -
The extension_data field of the quic_transport_parameters extension contains a value that is defined by the version of QUIC that is in use.¶
The quic_transport_parameters extension is carried in the ClientHello and the -EncryptedExtensions messages during the handshake. Endpoints MUST send the -quic_transport_parameters extension; endpoints that receive ClientHello or -EncryptedExtensions messages without the quic_transport_parameters extension -MUST close the connection with an error of type 0x16d (equivalent to a fatal TLS -missing_extension alert, see Section 4.9).¶
-While the transport parameters are technically available prior to the completion -of the handshake, they cannot be fully trusted until the handshake completes, -and reliance on them should be minimized. However, any tampering with the -parameters will cause the handshake to fail.¶
-Endpoints MUST NOT send this extension in a TLS connection that does not use -QUIC (such as the use of TLS with TCP defined in [TLS13]). A fatal -unsupported_extension alert MUST be sent by an implementation that supports this -extension if the extension is received when the transport is not QUIC.¶
-The TLS EndOfEarlyData message is not used with QUIC. QUIC does not rely on -this message to mark the end of 0-RTT data or to signal the change to Handshake -keys.¶
-Clients MUST NOT send the EndOfEarlyData message. A server MUST treat receipt -of a CRYPTO frame in a 0-RTT packet as a connection error of type -PROTOCOL_VIOLATION.¶
-As a result, EndOfEarlyData does not appear in the TLS handshake transcript.¶
-There are likely to be some real clangers here eventually, but the current set -of issues is well captured in the relevant sections of the main text.¶
-Never assume that because it isn't in the security considerations section it -doesn't affect security. Most of this document does.¶
-As described in Section 8 of [TLS13], use of TLS early data comes with an -exposure to replay attack. The use of 0-RTT in QUIC is similarly vulnerable to -replay attack.¶
-Endpoints MUST implement and use the replay protections described in [TLS13], -however it is recognized that these protections are imperfect. Therefore, -additional consideration of the risk of replay is needed.¶
-QUIC is not vulnerable to replay attack, except via the application protocol -information it might carry. The management of QUIC protocol state based on the -frame types defined in [QUIC-TRANSPORT] is not vulnerable to replay. -Processing of QUIC frames is idempotent and cannot result in invalid connection -states if frames are replayed, reordered or lost. QUIC connections do not -produce effects that last beyond the lifetime of the connection, except for -those produced by the application protocol that QUIC serves.¶
-A server that accepts 0-RTT on a connection incurs a higher cost than accepting -a connection without 0-RTT. This includes higher processing and computation -costs. Servers need to consider the probability of replay and all associated -costs when accepting 0-RTT.¶
-Ultimately, the responsibility for managing the risks of replay attacks with -0-RTT lies with an application protocol. An application protocol that uses QUIC -MUST describe how the protocol uses 0-RTT and the measures that are employed to -protect against replay attack. An analysis of replay risk needs to consider -all QUIC protocol features that carry application semantics.¶
-Disabling 0-RTT entirely is the most effective defense against replay attack.¶
-QUIC extensions MUST describe how replay attacks affect their operation, or -prohibit their use in 0-RTT. Application protocols MUST either prohibit the use -of extensions that carry application semantics in 0-RTT or provide replay -mitigation strategies.¶
-A small ClientHello that results in a large block of handshake messages from a -server can be used in packet reflection attacks to amplify the traffic generated -by an attacker.¶
-QUIC includes three defenses against this attack. First, the packet containing a -ClientHello MUST be padded to a minimum size. Second, if responding to an -unverified source address, the server is forbidden to send more than three UDP -datagrams in its first flight (see Section 8.1 of [QUIC-TRANSPORT]). Finally, -because acknowledgements of Handshake packets are authenticated, a blind -attacker cannot forge them. Put together, these defenses limit the level of -amplification.¶
-[NAN] analyzes authenticated encryption algorithms which provide nonce privacy, referred to as "Hide Nonce" (HN) transforms. The general header protection construction in this document is one of those algorithms (HN1). Header protection uses the output of the packet protection AEAD to derive sample, and then encrypts the header field using a pseudorandom function (PRF) as follows:¶
-protected_field = field XOR PRF(hp_key, sample) -¶ -
The header protection variants in this document use a pseudorandom permutation -(PRP) in place of a generic PRF. However, since all PRPs are also PRFs [IMC], -these variants do not deviate from the HN1 construction.¶
-As hp_key is distinct from the packet protection key, it follows that header protection achieves AE2 security as defined in [NAN] and therefore guarantees privacy of field, the protected packet header. Future header protection variants based on this construction MUST use a PRF to ensure equivalent security guarantees.¶
Use of the same key and ciphertext sample more than once risks compromising -header protection. Protecting two different headers with the same key and -ciphertext sample reveals the exclusive OR of the protected fields. Assuming -that the AEAD acts as a PRF, if L bits are sampled, the odds of two ciphertext -samples being identical approach 2^(-L/2), that is, the birthday bound. For the -algorithms described in this document, that probability is one in 2^64.¶
-To prevent an attacker from modifying packet headers, the header is transitively -authenticated using packet protection; the entire packet header is part of the -authenticated additional data. Protected fields that are falsified or modified -can only be detected once the packet protection is removed.¶
-An attacker could guess values for packet numbers or Key Phase and have an endpoint confirm guesses through timing side channels. Similarly, guesses for the packet number length can be trialed and exposed. If the recipient of a packet discards packets with duplicate packet numbers without attempting to remove packet protection, they could reveal through timing side-channels that the packet number matches a received packet. For authentication to be free from side-channels, the entire process of header protection removal, packet number recovery, and packet protection removal MUST be applied together without timing and other side-channels.¶
-For the sending of packets, construction and protection of packet payloads and -packet numbers MUST be free from side-channels that would reveal the packet -number or its encoded size.¶
-During a key update, the time taken to generate new keys could reveal through -timing side-channels that a key update has occurred. Alternatively, where an -attacker injects packets this side-channel could reveal the value of the Key -Phase on injected packets. After receiving a key update, an endpoint SHOULD -generate and save the next set of receive packet protection keys, as described -in Section 6.3. By generating new keys before a key update is -received, receipt of packets will not create timing signals that leak the value -of the Key Phase.¶
-This depends on not performing key generation during packet processing, and it can require that endpoints maintain three sets of packet protection keys for receiving: for the previous key phase, for the current key phase, and for the next key phase. Endpoints can instead choose to defer generation of the next receive packet protection keys until they discard old keys so that only two sets of receive keys need to be retained at any point in time.¶
-In using TLS, the central key schedule of TLS is used. As a result of the TLS -handshake messages being integrated into the calculation of secrets, the -inclusion of the QUIC transport parameters extension ensures that handshake and -1-RTT keys are not the same as those that might be produced by a server running -TLS over TCP. To avoid the possibility of cross-protocol key synchronization, -additional measures are provided to improve key separation.¶
-The QUIC packet protection keys and IVs are derived using a different label than -the equivalent keys in TLS.¶
-To preserve this separation, a new version of QUIC SHOULD define new labels for -key derivation for packet protection key and IV, plus the header protection -keys. This version of QUIC uses the string "quic". Other versions can use a -version-specific label in place of that string.¶
-The initial secrets use a key that is specific to the negotiated QUIC version. -New QUIC versions SHOULD define a new salt value used in calculating initial -secrets.¶
-This document does not create any new IANA registries, but it registers the -values in the following registries:¶
-This section shows examples of packet protection so that implementations can be -verified incrementally. Samples of Initial packets from both client and server, -plus a Retry packet are defined. These packets use an 8-byte client-chosen -Destination Connection ID of 0x8394c8f03e515708. Some intermediate values are -included. All values are shown in hexadecimal.¶
-The labels generated by the HKDF-Expand-Label function are:¶
-The initial secret is common:¶
--initial_secret = HKDF-Extract(initial_salt, cid) - = 524e374c6da8cf8b496f4bcb69678350 - 7aafee6198b202b4bc823ebf7514a423 -¶ -
The secrets for protecting client packets are:¶
--client_initial_secret - = HKDF-Expand-Label(initial_secret, "client in", _, 32) - = fda3953aecc040e48b34e27ef87de3a6 - 098ecf0e38b7e032c5c57bcbd5975b84 - -key = HKDF-Expand-Label(client_initial_secret, "quic key", _, 16) - = af7fd7efebd21878ff66811248983694 - -iv = HKDF-Expand-Label(client_initial_secret, "quic iv", _, 12) - = 8681359410a70bb9c92f0420 - -hp = HKDF-Expand-Label(client_initial_secret, "quic hp", _, 16) - = a980b8b4fb7d9fbc13e814c23164253d -¶ -
The secrets for protecting server packets are:¶
--server_initial_secret - = HKDF-Expand-Label(initial_secret, "server in", _, 32) - = 554366b81912ff90be41f17e80222130 - 90ab17d8149179bcadf222f29ff2ddd5 - -key = HKDF-Expand-Label(server_initial_secret, "quic key", _, 16) - = 5d51da9ee897a21b2659ccc7e5bfa577 - -iv = HKDF-Expand-Label(server_initial_secret, "quic iv", _, 12) - = 5e5ae651fd1e8495af13508b - -hp = HKDF-Expand-Label(server_initial_secret, "quic hp", _, 16) - = a8ed82e6664f865aedf6106943f95fb8 -¶ -
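The derivation above can be checked with a short standard-library sketch. The HKDF-Extract and HKDF-Expand-Label helpers below follow RFC 5869 and RFC 8446 Section 7.1; the initial_salt value is the one defined for this draft version (0xff00001b) elsewhere in this document and is restated here as an assumption:

```python
import hashlib
import hmac

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # HKDF-Extract (RFC 5869) with SHA-256: PRK = HMAC-Hash(salt, IKM)
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand_label(secret: bytes, label: bytes, length: int) -> bytes:
    # HKDF-Expand-Label (RFC 8446, Section 7.1) with an empty Context.
    full_label = b"tls13 " + label
    info = (length.to_bytes(2, "big") + bytes([len(full_label)])
            + full_label + b"\x00")
    # A single HKDF-Expand block suffices here (all outputs are <= 32 bytes).
    return hmac.new(secret, info + b"\x01", hashlib.sha256).digest()[:length]

# initial_salt for this draft version (assumption; defined in the document body)
initial_salt = bytes.fromhex("c3eef712c72ebb5a11a7d2432bb46365bef9f502")
cid = bytes.fromhex("8394c8f03e515708")

initial_secret = hkdf_extract(initial_salt, cid)
client_initial_secret = hkdf_expand_label(initial_secret, b"client in", 32)
key = hkdf_expand_label(client_initial_secret, b"quic key", 16)
iv = hkdf_expand_label(client_initial_secret, b"quic iv", 12)
hp = hkdf_expand_label(client_initial_secret, b"quic hp", 16)
```

The server-side secrets follow the same pattern, substituting the "server in" label.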
The client sends an Initial packet. The unprotected payload of this packet -contains the following CRYPTO frame, plus enough PADDING frames to make a 1162 -byte payload:¶
--060040c4010000c003036660261ff947 cea49cce6cfad687f457cf1b14531ba1 -4131a0e8f309a1d0b9c4000006130113 031302010000910000000b0009000006 -736572766572ff01000100000a001400 12001d00170018001901000101010201 -03010400230000003300260024001d00 204cfdfcd178b784bf328cae793b136f -2aedce005ff183d7bb14952072366470 37002b0003020304000d0020001e0403 -05030603020308040805080604010501 060102010402050206020202002d0002 -0101001c00024001 -¶ -
-The unprotected header includes the connection ID and a 4-byte packet number encoding for a packet number of 2:¶
--c3ff00001b088394c8f03e5157080000449e00000002 -¶ -
-Protecting the payload produces output that is sampled for header protection. Because the header uses a 4-byte packet number encoding, the first 16 bytes of the protected payload are sampled and then applied to the header:¶
--sample = 535064a4268a0d9d7b1c9d250ae35516 - -mask = AES-ECB(hp, sample)[0..4] - = 833b343aaa - -header[0] ^= mask[0] & 0x0f - = c0 -header[18..21] ^= mask[1..4] - = 3b343aa8 -header = c0ff00001b088394c8f03e5157080000449e3b343aa8 -¶ -
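The masking step can be reproduced without an AES implementation by taking the mask value from the vector above as given; this sketch applies only the XOR described here, and the pn_offset of 18 is specific to this example header:

```python
def apply_header_protection(header: bytes, mask: bytes,
                            pn_offset: int, pn_length: int) -> bytes:
    out = bytearray(header)
    # Long header: only the low 4 bits of the first byte are protected.
    out[0] ^= mask[0] & 0x0F
    # The packet number bytes are XORed with the remaining mask bytes.
    for i in range(pn_length):
        out[pn_offset + i] ^= mask[1 + i]
    return bytes(out)

header = bytes.fromhex("c3ff00001b088394c8f03e5157080000449e00000002")
mask = bytes.fromhex("833b343aaa")  # AES-ECB(hp, sample)[0..4], from the vector
protected = apply_header_protection(header, mask, pn_offset=18, pn_length=4)
```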
The resulting protected packet is:¶
--c0ff00001b088394c8f03e5157080000 449e3b343aa8535064a4268a0d9d7b1c -9d250ae355162276e9b1e3011ef6bbc0 ab48ad5bcc2681e953857ca62becd752 -4daac473e68d7405fbba4e9ee616c870 38bdbe908c06d9605d9ac49030359eec -b1d05a14e117db8cede2bb09d0dbbfee 271cb374d8f10abec82d0f59a1dee29f -e95638ed8dd41da07487468791b719c5 5c46968eb3b54680037102a28e53dc1d -12903db0af5821794b41c4a93357fa59 ce69cfe7f6bdfa629eef78616447e1d6 -11c4baf71bf33febcb03137c2c75d253 17d3e13b684370f668411c0f00304b50 -1c8fd422bd9b9ad81d643b20da89ca05 25d24d2b142041cae0af205092e43008 -0cd8559ea4c5c6e4fa3f66082b7d303e 52ce0162baa958532b0bbc2bc785681f -cf37485dff6595e01e739c8ac9efba31 b985d5f656cc092432d781db95221724 -87641c4d3ab8ece01e39bc85b1543661 4775a98ba8fa12d46f9b35e2a55eb72d -7f85181a366663387ddc20551807e007 673bd7e26bf9b29b5ab10a1ca87cbb7a -d97e99eb66959c2a9bc3cbde4707ff77 20b110fa95354674e395812e47a0ae53 -b464dcb2d1f345df360dc227270c7506 76f6724eb479f0d2fbb6124429990457 -ac6c9167f40aab739998f38b9eccb24f d47c8410131bf65a52af841275d5b3d1 -880b197df2b5dea3e6de56ebce3ffb6e 9277a82082f8d9677a6767089b671ebd -244c214f0bde95c2beb02cd1172d58bd f39dce56ff68eb35ab39b49b4eac7c81 -5ea60451d6e6ab82119118df02a58684 4a9ffe162ba006d0669ef57668cab38b -62f71a2523a084852cd1d079b3658dc2 f3e87949b550bab3e177cfc49ed190df -f0630e43077c30de8f6ae081537f1e83 da537da980afa668e7b7fb25301cf741 -524be3c49884b42821f17552fbd1931a 813017b6b6590a41ea18b6ba49cd48a4 -40bd9a3346a7623fb4ba34a3ee571e3c 731f35a7a3cf25b551a680fa68763507 -b7fde3aaf023c50b9d22da6876ba337e b5e9dd9ec3daf970242b6c5aab3aa4b2 -96ad8b9f6832f686ef70fa938b31b4e5 ddd7364442d3ea72e73d668fb0937796 -f462923a81a47e1cee7426ff6d922126 9b5a62ec03d6ec94d12606cb485560ba -b574816009e96504249385bb61a819be 04f62c2066214d8360a2022beb316240 -b6c7d78bbe56c13082e0ca272661210a bf020bf3b5783f1426436cf9ff418405 -93a5d0638d32fc51c5c65ff291a3a7a5 2fd6775e623a4439cc08dd25582febc9 -44ef92d8dbd329c91de3e9c9582e41f1 7f3d186f104ad3f90995116c682a2a14 -a3b4b1f547c335f0be710fc9fc03e0e5 
87b8cda31ce65b969878a4ad4283e6d5 -b0373f43da86e9e0ffe1ae0fddd35162 55bd74566f36a38703d5f34249ded1f6 -6b3d9b45b9af2ccfefe984e13376b1b2 c6404aa48c8026132343da3f3a33659e -c1b3e95080540b28b7f3fcd35fa5d843 b579a84c089121a60d8c1754915c344e -eaf45a9bf27dc0c1e784161691220913 13eb0e87555abd706626e557fc36a04f -cd191a58829104d6075c5594f627ca50 6bf181daec940f4a4f3af0074eee89da -acde6758312622d4fa675b39f728e062 d2bee680d8f41a597c262648bb18bcfc -13c8b3d97b1a77b2ac3af745d61a34cc 4709865bac824a94bb19058015e4e42d -38d3b779d72edc00c5cd088eff802b05 -¶ -
The server sends the following payload in response, including an ACK frame, a -CRYPTO frame, and no PADDING frames:¶
--0d0000000018410a020000560303eefc e7f7b37ba1d1632e96677825ddf73988 -cfc79825df566dc5430b9a045a120013 0100002e00330024001d00209d3c940d -89690b84d08a60993c144eca684d1081 287c834d5311bcf32bb9da1a002b0002 -0304 -¶ -
The header from the server includes a new connection ID and a 2-byte packet -number encoding for a packet number of 1:¶
--c1ff00001b0008f067a5502a4262b50040740001 -¶ -
As a result, after protection, the header protection sample is taken starting -from the third protected octet:¶
--sample = 7002596f99ae67abf65a5852f54f58c3 -mask = 38168a0c25 -header = c9ff00001b0008f067a5502a4262b5004074168b -¶ -
The final protected packet is then:¶
--c9ff00001b0008f067a5502a4262b500 4074168bf22b7002596f99ae67abf65a -5852f54f58c37c808682e2e40492d8a3 899fb04fc0afe9aabc8767b18a0aa493 -537426373b48d502214dd856d63b78ce e37bc664b3fe86d487ac7a77c53038a3 -cd32f0b5004d9f5754c4f7f2d1f35cf3 f7116351c92bd8c3a9528d2b6aca20f0 -8047d9f017f0 -¶ -
This shows a Retry packet that might be sent in response to the Initial packet -in Appendix A.2. The integrity check includes the client-chosen -connection ID value of 0x8394c8f03e515708, but that value is not -included in the final Retry packet:¶
--ffff00001b0008f067a5502a4262b574 6f6b656ea523cb5ba524695f6569f293 -a1359d8e -¶ -
Issue and pull request numbers are listed with a leading octothorp.¶
-Changes to integration of the TLS handshake (#829, #1018, #1094, #1165, #1190, -#1233, #1242, #1252, #1450)¶
-No significant changes.¶
-No significant changes.¶
-The IETF QUIC Working Group received an enormous amount of support from many -people. The following people provided substantive contributions to this -document: -Adam Langley, -Alessandro Ghedini, -Christian Huitema, -Christopher Wood, -David Schinazi, -Dragana Damjanovic, -Eric Rescorla, -Ian Swett, -Jana Iyengar, 奥 一穂 (Kazuho Oku), -Marten Seemann, -Martin Duke, -Mike Bishop, Mikkel Fahnøe Jørgensen, -Nick Banks, -Nick Harper, -Roberto Peon, -Rui Paulo, -Ryan Hamilton, -and Victor Vasiliev.¶
Internet-Draft | QUIC Transport Protocol | March 2020
Iyengar & Thomson | Expires 22 September 2020 | [Page]
This document defines the core of the QUIC transport protocol. Accompanying -documents describe QUIC's loss detection and congestion control and the use of -TLS for key negotiation.¶
-Discussion of this draft takes place on the QUIC working group mailing list -(quic@ietf.org), which is archived at -<https://mailarchive.ietf.org/arch/search/?email_list=quic>.¶
-Working Group information can be found at <https://github.com/quicwg>; source -code and issues list for this draft can be found at -<https://github.com/quicwg/base-drafts/labels/-transport>.¶
-- This Internet-Draft is submitted in full conformance with the - provisions of BCP 78 and BCP 79.¶
-- Internet-Drafts are working documents of the Internet Engineering Task - Force (IETF). Note that other groups may also distribute working - documents as Internet-Drafts. The list of current Internet-Drafts is - at https://datatracker.ietf.org/drafts/current/.¶
-- Internet-Drafts are draft documents valid for a maximum of six months - and may be updated, replaced, or obsoleted by other documents at any - time. It is inappropriate to use Internet-Drafts as reference - material or to cite them other than as "work in progress."¶
-- This Internet-Draft will expire on 22 September 2020.¶
-- Copyright (c) 2020 IETF Trust and the persons identified as the - document authors. All rights reserved.¶
-- This document is subject to BCP 78 and the IETF Trust's Legal - Provisions Relating to IETF Documents - (https://trustee.ietf.org/license-info) in effect on the date of - publication of this document. Please review these documents - carefully, as they describe your rights and restrictions with - respect to this document. Code Components extracted from this - document must include Simplified BSD License text as described in - Section 4.e of the Trust Legal Provisions and are provided without - warranty as described in the Simplified BSD License.¶
-QUIC is a multiplexed and secure general-purpose transport protocol that -provides:¶
-QUIC uses UDP as a substrate to avoid requiring changes to legacy client -operating systems and middleboxes. QUIC authenticates all of its headers and -encrypts most of the data it exchanges, including its signaling, to avoid -incurring a dependency on middleboxes.¶
-This document describes the core QUIC protocol and is structured as follows:¶
-Streams are the basic service abstraction that QUIC provides.¶
- -Connections are the context in which QUIC endpoints communicate.¶
-Packets and frames are the basic unit used by QUIC to communicate.¶
-Finally, encoding details of QUIC protocol elements are described in:¶
-Accompanying documents describe QUIC's loss detection and congestion control -[QUIC-RECOVERY], and the use of TLS for key negotiation [QUIC-TLS].¶
-This document defines QUIC version 1, which conforms to the protocol invariants -in [QUIC-INVARIANTS].¶
-The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", -"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this -document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] -when, and only when, they appear in all capitals, as shown here.¶
-Commonly used terms in the document are described below.¶
-Packet and frame diagrams in this document use the format described in Section -3.1 of [RFC2360], with the following additional conventions:¶
-Streams in QUIC provide a lightweight, ordered byte-stream abstraction to an -application. Streams can be unidirectional or bidirectional. An alternative -view of QUIC unidirectional streams is a "message" abstraction of practically -unlimited length.¶
-Streams can be created by sending data. Other processes associated with stream -management - ending, cancelling, and managing flow control - are all designed to -impose minimal overheads. For instance, a single STREAM frame (Section 19.8) -can open, carry data for, and close a stream. Streams can also be long-lived and -can last the entire duration of a connection.¶
-Streams can be created by either endpoint, can concurrently send data -interleaved with other streams, and can be cancelled. QUIC does not provide any -means of ensuring ordering between bytes on different streams.¶
-QUIC allows for an arbitrary number of streams to operate concurrently and for -an arbitrary amount of data to be sent on any stream, subject to flow control -constraints (see Section 4) and stream limits.¶
-Streams can be unidirectional or bidirectional. Unidirectional streams carry -data in one direction: from the initiator of the stream to its peer. -Bidirectional streams allow for data to be sent in both directions.¶
-Streams are identified within a connection by a numeric value, referred to as -the stream ID. A stream ID is a 62-bit integer (0 to 2^62-1) that is unique for -all streams on a connection. Stream IDs are encoded as variable-length integers -(see Section 16). A QUIC endpoint MUST NOT reuse a stream ID within a -connection.¶
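As an illustration of the variable-length integer encoding referenced here (Section 16), the sketch below encodes and decodes QUIC varints, where the two most significant bits of the first byte select a 1-, 2-, 4-, or 8-byte encoding; it is illustrative, not normative:

```python
def encode_varint(v: int) -> bytes:
    # QUIC variable-length integer: 2-bit length prefix in the first byte.
    if v < 2**6:
        return v.to_bytes(1, "big")
    if v < 2**14:
        return (v | (0x1 << 14)).to_bytes(2, "big")
    if v < 2**30:
        return (v | (0x2 << 30)).to_bytes(4, "big")
    if v < 2**62:
        return (v | (0x3 << 62)).to_bytes(8, "big")
    raise ValueError("value out of range for a 62-bit integer")

def decode_varint(data: bytes) -> int:
    # The top two bits of the first byte give the encoded length.
    length = 1 << (data[0] >> 6)
    return int.from_bytes(data[:length], "big") & ((1 << (8 * length - 2)) - 1)
```

For example, the 2-byte sequence 0x449e decodes to 1182, since the 0x40 prefix marks a 2-byte encoding and the remaining 14 bits carry the value.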
-The least significant bit (0x1) of the stream ID identifies the initiator of the -stream. Client-initiated streams have even-numbered stream IDs (with the bit -set to 0), and server-initiated streams have odd-numbered stream IDs (with the -bit set to 1).¶
-The second least significant bit (0x2) of the stream ID distinguishes between -bidirectional streams (with the bit set to 0) and unidirectional streams (with -the bit set to 1).¶
-The least significant two bits from a stream ID therefore identify a stream as -one of four types, as summarized in Table 1.¶
Bits | Stream Type
---|---
0x0 | Client-Initiated, Bidirectional
0x1 | Server-Initiated, Bidirectional
0x2 | Client-Initiated, Unidirectional
0x3 | Server-Initiated, Unidirectional
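The bit tests described above can be expressed directly; the helper names below are mine, not from the specification:

```python
def is_client_initiated(stream_id: int) -> bool:
    # Least significant bit (0x1): 0 = client-initiated, 1 = server-initiated.
    return stream_id & 0x1 == 0

def is_bidirectional(stream_id: int) -> bool:
    # Second least significant bit (0x2): 0 = bidirectional, 1 = unidirectional.
    return stream_id & 0x2 == 0

def nth_stream_id(n: int, stream_type: int) -> int:
    # Within a type, successive streams have stream IDs that differ by 4.
    return (n << 2) | stream_type
```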
Within each type, streams are created with numerically increasing stream IDs. A -stream ID that is used out of order results in all streams of that type with -lower-numbered stream IDs also being opened.¶
-The first bidirectional stream opened by the client has a stream ID of 0.¶
-STREAM frames (Section 19.8) encapsulate data sent by an application. An -endpoint uses the Stream ID and Offset fields in STREAM frames to place data in -order.¶
-Endpoints MUST be able to deliver stream data to an application as an ordered -byte-stream. Delivering an ordered byte-stream requires that an endpoint buffer -any data that is received out of order, up to the advertised flow control limit.¶
-QUIC makes no specific allowances for delivery of stream data out of -order. However, implementations MAY choose to offer the ability to deliver data -out of order to a receiving application.¶
-An endpoint could receive data for a stream at the same stream offset multiple -times. Data that has already been received can be discarded. The data at a -given offset MUST NOT change if it is sent multiple times; an endpoint MAY treat -receipt of different data at the same offset within a stream as a connection -error of type PROTOCOL_VIOLATION.¶
-Streams are an ordered byte-stream abstraction with no other structure visible -to QUIC. STREAM frame boundaries are not expected to be preserved when -data is transmitted, retransmitted after packet loss, or delivered to the -application at a receiver.¶
-An endpoint MUST NOT send data on any stream without ensuring that it is within -the flow control limits set by its peer. Flow control is described in detail in -Section 4.¶
-Stream multiplexing can have a significant effect on application performance if -resources allocated to streams are correctly prioritized.¶
-QUIC does not provide a mechanism for exchanging prioritization information. -Instead, it relies on receiving priority information from the application that -uses QUIC.¶
-A QUIC implementation SHOULD provide ways in which an application can indicate -the relative priority of streams. When deciding which streams to dedicate -resources to, the implementation SHOULD use the information provided by the -application.¶
-There are certain operations which an application MUST be able to perform when -interacting with QUIC streams. This document does not specify an API, but -any implementation of this version of QUIC MUST expose the ability to perform -the operations described in this section on a QUIC stream.¶
-On the sending part of a stream, application protocols need to be able to:¶
-On the receiving part of a stream, application protocols need to be able to:¶
-Applications also need to be informed of state changes on streams, including -when the peer has opened or reset a stream, when a peer aborts reading on a -stream, when new data is available, and when data can or cannot be written to -the stream due to flow control.¶
-This section describes streams in terms of their send or receive components. -Two state machines are described: one for the streams on which an endpoint -transmits data (Section 3.1), and another for streams on which an -endpoint receives data (Section 3.2).¶
-Unidirectional streams use the applicable state machine directly. Bidirectional -streams use both state machines. For the most part, the use of these state -machines is the same whether the stream is unidirectional or bidirectional. The -conditions for opening a stream are slightly more complex for a bidirectional -stream because the opening of either send or receive sides causes the stream -to open in both directions.¶
-An endpoint MUST open streams of the same type in increasing order of stream ID.¶
-Figure 1 shows the states for the part of a stream that sends -data to a peer.¶
-The sending part of a stream that the endpoint initiates (types 0 and 2 for clients, 1 and 3 for servers) is opened by the application. The "Ready" state represents a newly created stream that is able to accept data from the application. Stream data might be buffered in this state in preparation for sending.¶
-Sending the first STREAM or STREAM_DATA_BLOCKED frame causes a sending part of a -stream to enter the "Send" state. An implementation might choose to defer -allocating a stream ID to a stream until it sends the first STREAM frame and -enters this state, which can allow for better stream prioritization.¶
-The sending part of a bidirectional stream initiated by a peer (type 0 for a -server, type 1 for a client) starts in the "Ready" state when the receiving part -is created.¶
-In the "Send" state, an endpoint transmits - and retransmits as necessary - stream data in STREAM frames. The endpoint respects the flow control limits set by its peer, and continues to accept and process MAX_STREAM_DATA frames. An endpoint in the "Send" state generates STREAM_DATA_BLOCKED frames if it is blocked from sending by stream or connection flow control limits (Section 4.1).¶
-After the application indicates that all stream data has been sent and a STREAM -frame containing the FIN bit is sent, the sending part of the stream enters the -"Data Sent" state. From this state, the endpoint only retransmits stream data -as necessary. The endpoint does not need to check flow control limits or send -STREAM_DATA_BLOCKED frames for a stream in this state. MAX_STREAM_DATA frames -might be received until the peer receives the final stream offset. The endpoint -can safely ignore any MAX_STREAM_DATA frames it receives from its peer for a -stream in this state.¶
-Once all stream data has been successfully acknowledged, the sending part of the -stream enters the "Data Recvd" state, which is a terminal state.¶
-From any of the "Ready", "Send", or "Data Sent" states, an application can -signal that it wishes to abandon transmission of stream data. Alternatively, an -endpoint might receive a STOP_SENDING frame from its peer. In either case, the -endpoint sends a RESET_STREAM frame, which causes the stream to enter the "Reset -Sent" state.¶
-An endpoint MAY send a RESET_STREAM as the first frame that mentions a stream; -this causes the sending part of that stream to open and then immediately -transition to the "Reset Sent" state.¶
-Once a packet containing a RESET_STREAM has been acknowledged, the sending part -of the stream enters the "Reset Recvd" state, which is a terminal state.¶
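The sending-part transitions described above can be sketched as a small state machine; the class and method names are illustrative only, and a real implementation would also track offsets and acknowledgments:

```python
from enum import Enum

class SendState(Enum):
    READY = "Ready"
    SEND = "Send"
    DATA_SENT = "Data Sent"
    DATA_RECVD = "Data Recvd"    # terminal
    RESET_SENT = "Reset Sent"
    RESET_RECVD = "Reset Recvd"  # terminal

class SendingPart:
    def __init__(self):
        self.state = SendState.READY

    def send_stream_frame(self, fin: bool = False):
        # First STREAM or STREAM_DATA_BLOCKED frame: "Ready" -> "Send";
        # a STREAM frame carrying FIN: "Send" -> "Data Sent".
        if self.state is SendState.READY:
            self.state = SendState.SEND
        if fin and self.state is SendState.SEND:
            self.state = SendState.DATA_SENT

    def all_data_acked(self):
        if self.state is SendState.DATA_SENT:
            self.state = SendState.DATA_RECVD

    def send_reset_stream(self):
        # Allowed from "Ready", "Send", or "Data Sent".
        if self.state in (SendState.READY, SendState.SEND, SendState.DATA_SENT):
            self.state = SendState.RESET_SENT

    def reset_acked(self):
        if self.state is SendState.RESET_SENT:
            self.state = SendState.RESET_RECVD
```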
-Figure 2 shows the states for the part of a stream that -receives data from a peer. The states for a receiving part of a stream mirror -only some of the states of the sending part of the stream at the peer. The -receiving part of a stream does not track states on the sending part that cannot -be observed, such as the "Ready" state. Instead, the receiving part of a stream -tracks the delivery of data to the application, some of which cannot be observed -by the sender.¶
-The receiving part of a stream initiated by a peer (types 1 and 3 for a client, or 0 and 2 for a server) is created when the first STREAM, STREAM_DATA_BLOCKED, or RESET_STREAM is received for that stream. For bidirectional streams initiated by a peer, receipt of a MAX_STREAM_DATA or STOP_SENDING frame for the sending part of the stream also creates the receiving part. The initial state for the receiving part of a stream is "Recv".¶
-The receiving part of a stream enters the "Recv" state when the sending part of -a bidirectional stream initiated by the endpoint (type 0 for a client, type 1 -for a server) enters the "Ready" state.¶
-An endpoint opens a bidirectional stream when a MAX_STREAM_DATA or STOP_SENDING -frame is received from the peer for that stream. Receiving a MAX_STREAM_DATA -frame for an unopened stream indicates that the remote peer has opened the -stream and is providing flow control credit. Receiving a STOP_SENDING frame for -an unopened stream indicates that the remote peer no longer wishes to receive -data on this stream. Either frame might arrive before a STREAM or -STREAM_DATA_BLOCKED frame if packets are lost or reordered.¶
-Before a stream is created, all streams of the same type with lower-numbered -stream IDs MUST be created. This ensures that the creation order for streams is -consistent on both endpoints.¶
-In the "Recv" state, the endpoint receives STREAM and STREAM_DATA_BLOCKED -frames. Incoming data is buffered and can be reassembled into the correct order -for delivery to the application. As data is consumed by the application and -buffer space becomes available, the endpoint sends MAX_STREAM_DATA frames to -allow the peer to send more data.¶
-When a STREAM frame with a FIN bit is received, the final size of the stream is known (see Section 4.4). The receiving part of the stream then enters the "Size Known" state. In this state, the endpoint no longer needs to send MAX_STREAM_DATA frames; it only receives any retransmissions of stream data.¶
-Once all data for the stream has been received, the receiving part enters the -"Data Recvd" state. This might happen as a result of receiving the same STREAM -frame that causes the transition to "Size Known". After all data has been -received, any STREAM or STREAM_DATA_BLOCKED frames for the stream can be -discarded.¶
-The "Data Recvd" state persists until stream data has been delivered to the -application. Once stream data has been delivered, the stream enters the "Data -Read" state, which is a terminal state.¶
-Receiving a RESET_STREAM frame in the "Recv" or "Size Known" states causes the -stream to enter the "Reset Recvd" state. This might cause the delivery of -stream data to the application to be interrupted.¶
-It is possible that all stream data is received when a RESET_STREAM is received -(that is, from the "Data Recvd" state). Similarly, it is possible for remaining -stream data to arrive after receiving a RESET_STREAM frame (the "Reset Recvd" -state). An implementation is free to manage this situation as it chooses.¶
-Sending RESET_STREAM means that an endpoint cannot guarantee delivery of stream data; however, there is no requirement that stream data not be delivered if a RESET_STREAM is received. An implementation MAY interrupt delivery of stream data, discard any data that was not consumed, and signal the receipt of the RESET_STREAM. A RESET_STREAM signal might be suppressed or withheld if stream data is completely received and is buffered to be read by the application. If the RESET_STREAM is suppressed, the receiving part of the stream remains in "Data Recvd".¶
-Once the application receives the signal indicating that the stream -was reset, the receiving part of the stream transitions to the "Reset Read" -state, which is a terminal state.¶
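The receiving-part transitions can be sketched similarly; the names here are again illustrative, and a real implementation tracks received ranges and the final size rather than a single flag:

```python
from enum import Enum

class RecvState(Enum):
    RECV = "Recv"
    SIZE_KNOWN = "Size Known"
    DATA_RECVD = "Data Recvd"
    DATA_READ = "Data Read"      # terminal
    RESET_RECVD = "Reset Recvd"
    RESET_READ = "Reset Read"    # terminal

class ReceivingPart:
    def __init__(self):
        self.state = RecvState.RECV

    def on_stream_frame(self, fin: bool, all_data_received: bool):
        # FIN makes the final size known; receiving all data (possibly in
        # the same frame) moves the stream to "Data Recvd".
        if fin and self.state is RecvState.RECV:
            self.state = RecvState.SIZE_KNOWN
        if all_data_received and self.state is RecvState.SIZE_KNOWN:
            self.state = RecvState.DATA_RECVD

    def on_reset_stream(self):
        # RESET_STREAM in "Recv" or "Size Known" moves to "Reset Recvd".
        if self.state in (RecvState.RECV, RecvState.SIZE_KNOWN):
            self.state = RecvState.RESET_RECVD

    def app_read_all(self):
        if self.state is RecvState.DATA_RECVD:
            self.state = RecvState.DATA_READ

    def app_read_reset(self):
        if self.state is RecvState.RESET_RECVD:
            self.state = RecvState.RESET_READ
```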
-The sender of a stream sends just three frame types that affect the state of a -stream at either sender or receiver: STREAM (Section 19.8), -STREAM_DATA_BLOCKED (Section 19.13), and RESET_STREAM -(Section 19.4).¶
-A sender MUST NOT send any of these frames from a terminal state ("Data Recvd" -or "Reset Recvd"). A sender MUST NOT send STREAM or STREAM_DATA_BLOCKED after -sending a RESET_STREAM; that is, in the terminal states and in the "Reset Sent" -state. A receiver could receive any of these three frames in any state, due to -the possibility of delayed delivery of packets carrying them.¶
-The receiver of a stream sends MAX_STREAM_DATA (Section 19.10) and -STOP_SENDING frames (Section 19.5).¶
-The receiver only sends MAX_STREAM_DATA in the "Recv" state. A receiver can send STOP_SENDING in any state where it has not received a RESET_STREAM frame; that is, in states other than "Reset Recvd" or "Reset Read". However, there is little value in sending a STOP_SENDING frame in the "Data Recvd" state, since all stream data has been received. A sender could receive either of these two frames in any state as a result of delayed delivery of packets.¶
-A bidirectional stream is composed of sending and receiving parts. -Implementations may represent states of the bidirectional stream as composites -of sending and receiving stream states. The simplest model presents the stream -as "open" when either sending or receiving parts are in a non-terminal state and -"closed" when both sending and receiving streams are in terminal states.¶
-Table 2 shows a more complex mapping of bidirectional stream -states that loosely correspond to the stream states in HTTP/2 -[HTTP2]. This shows that multiple states on sending or receiving -parts of streams are mapped to the same composite state. Note that this is just -one possibility for such a mapping; this mapping requires that data is -acknowledged before the transition to a "closed" or "half-closed" state.¶
Sending Part | Receiving Part | Composite State
---|---|---
No Stream/Ready | No Stream/Recv *1 | idle
Ready/Send/Data Sent | Recv/Size Known | open
Ready/Send/Data Sent | Data Recvd/Data Read | half-closed (remote)
Ready/Send/Data Sent | Reset Recvd/Reset Read | half-closed (remote)
Data Recvd | Recv/Size Known | half-closed (local)
Reset Sent/Reset Recvd | Recv/Size Known | half-closed (local)
Reset Sent/Reset Recvd | Data Recvd/Data Read | closed
Reset Sent/Reset Recvd | Reset Recvd/Reset Read | closed
Data Recvd | Data Recvd/Data Read | closed
Data Recvd | Reset Recvd/Reset Read | closed
If an application is no longer interested in the data it is receiving on a -stream, it can abort reading the stream and specify an application error code.¶
-If the stream is in the "Recv" or "Size Known" states, the transport SHOULD -signal this by sending a STOP_SENDING frame to prompt closure of the stream in -the opposite direction. This typically indicates that the receiving application -is no longer reading data it receives from the stream, but it is not a guarantee -that incoming data will be ignored.¶
-STREAM frames received after sending STOP_SENDING are still counted toward -connection and stream flow control, even though these frames can be discarded -upon receipt.¶
-A STOP_SENDING frame requests that the receiving endpoint send a RESET_STREAM -frame. An endpoint that receives a STOP_SENDING frame MUST send a RESET_STREAM -frame if the stream is in the Ready or Send state. If the stream is in the Data -Sent state and any outstanding data is declared lost, an endpoint SHOULD send a -RESET_STREAM frame in lieu of a retransmission.¶
-An endpoint SHOULD copy the error code from the STOP_SENDING frame to the -RESET_STREAM frame it sends, but MAY use any application error code. The -endpoint that sends a STOP_SENDING frame MAY ignore the error code carried in -any RESET_STREAM frame it receives.¶
-If the STOP_SENDING frame is received on a stream that is already in the -"Data Sent" state, an endpoint that wishes to cease retransmission of -previously-sent STREAM frames on that stream MUST first send a RESET_STREAM -frame.¶
-STOP_SENDING SHOULD only be sent for a stream that has not been reset by the -peer. STOP_SENDING is most useful for streams in the "Recv" or "Size Known" -states.¶
-An endpoint is expected to send another STOP_SENDING frame if a packet -containing a previous STOP_SENDING is lost. However, once either all stream -data or a RESET_STREAM frame has been received for the stream - that is, the -stream is in any state other than "Recv" or "Size Known" - sending a -STOP_SENDING frame is unnecessary.¶
-An endpoint that wishes to terminate both directions of a bidirectional stream -can terminate one direction by sending a RESET_STREAM, and it can encourage -prompt termination in the opposite direction by sending a STOP_SENDING frame.¶
-It is necessary to limit the amount of data that a receiver could buffer, to prevent a fast sender from overwhelming a slow receiver, or to prevent a malicious sender from consuming a large amount of memory at a receiver. To enable a receiver to limit memory commitment to a connection and to apply back pressure on the sender, streams are flow controlled both individually and as an aggregate. A QUIC receiver controls the maximum amount of data the sender can send on a stream at any time, as described in Section 4.1 and Section 4.2.¶
-Similarly, to limit concurrency within a connection, a QUIC endpoint controls -the maximum cumulative number of streams that its peer can initiate, as -described in Section 4.5.¶
-Data sent in CRYPTO frames is not flow controlled in the same way as stream -data. QUIC relies on the cryptographic protocol implementation to avoid -excessive buffering of data; see [QUIC-TLS]. The implementation SHOULD -provide an interface to QUIC to tell it about its buffering limits so that there -is not excessive buffering at multiple layers.¶
-QUIC employs a credit-based flow-control scheme similar to that in HTTP/2 -[HTTP2], where a receiver advertises the number of bytes it is prepared to -receive on a given stream and for the entire connection. This leads to two -levels of data flow control in QUIC:¶
-A receiver sets initial credits for all streams by sending transport parameters -during the handshake (Section 7.3). A receiver sends -MAX_STREAM_DATA (Section 19.10) or MAX_DATA (Section 19.9) -frames to the sender to advertise additional credit.¶
-A receiver advertises credit for a stream by sending a MAX_STREAM_DATA frame -with the Stream ID field set appropriately. A MAX_STREAM_DATA frame indicates -the maximum absolute byte offset of a stream. A receiver could use the current -offset of data consumed to determine the flow control offset to be advertised. -A receiver MAY send MAX_STREAM_DATA frames in multiple packets in order to make -sure that the sender receives an update before running out of flow control -credit, even if one of the packets is lost.¶
-A receiver advertises credit for a connection by sending a MAX_DATA frame, which -indicates the maximum of the sum of the absolute byte offsets of all streams. A -receiver maintains a cumulative sum of bytes received on all streams, which is -used to check for flow control violations. A receiver might use a sum of bytes -consumed on all streams to determine the maximum data limit to be advertised.¶
-A receiver can advertise a larger offset by sending MAX_STREAM_DATA or MAX_DATA -frames. Once a receiver advertises an offset, it MAY advertise a smaller -offset, but this has no effect.¶
-A receiver MUST close the connection with a FLOW_CONTROL_ERROR error -(Section 11) if the sender violates the advertised connection or stream -data limits.¶
-A sender MUST ignore any MAX_STREAM_DATA or MAX_DATA frames that do not increase -flow control limits.¶
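A sender-side view of the connection-level limit can be sketched as follows; the class is illustrative only, combining the rule that non-increasing MAX_DATA frames are ignored with a simple credit check:

```python
class ConnectionFlowControl:
    """Sender-side connection-level flow control (illustrative sketch)."""

    def __init__(self, max_data: int):
        self.max_data = max_data  # latest advertised connection limit
        self.sent = 0             # sum of bytes sent across all streams

    def on_max_data(self, new_limit: int):
        # MAX_DATA frames that do not increase the limit MUST be ignored.
        if new_limit > self.max_data:
            self.max_data = new_limit

    def can_send(self, n: int) -> bool:
        return self.sent + n <= self.max_data

    def record_sent(self, n: int):
        # A sender that runs out of credit is blocked and should signal
        # that with a DATA_BLOCKED frame rather than exceed the limit.
        if not self.can_send(n):
            raise RuntimeError("blocked: would exceed advertised MAX_DATA")
        self.sent += n
```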
-If a sender runs out of flow control credit, it will be unable to send new data -and is considered blocked. A sender SHOULD send a STREAM_DATA_BLOCKED or -DATA_BLOCKED frame to indicate it has data to write but is blocked by flow -control limits. If a sender is blocked for a period longer than the idle -timeout (Section 10.2), the connection might be closed even when data is -available for transmission. To keep the connection from closing, a sender that -is flow control limited SHOULD periodically send a STREAM_DATA_BLOCKED or -DATA_BLOCKED frame when it has no ack-eliciting packets in flight.¶
-Implementations decide when and how much credit to advertise in MAX_STREAM_DATA -and MAX_DATA frames, but this section offers a few considerations.¶
-To avoid blocking a sender, a receiver can send a MAX_STREAM_DATA or MAX_DATA -frame multiple times within a round trip or send it early enough to allow for -recovery from loss of the frame.¶
-Control frames contribute to connection overhead. Therefore, frequently sending -MAX_STREAM_DATA and MAX_DATA frames with small changes is undesirable. On the -other hand, if updates are less frequent, larger increments to limits are -necessary to avoid blocking a sender, requiring larger resource commitments at -the receiver. There is a trade-off between resource commitment and overhead -when determining how large a limit is advertised.¶
-A receiver can use an autotuning mechanism to tune the frequency and amount of -advertised additional credit based on a round-trip time estimate and the rate at -which the receiving application consumes data, similar to common TCP -implementations. As an optimization, an endpoint could send frames related to -flow control only when there are other frames to send or when a peer is blocked, -ensuring that flow control does not cause extra packets to be sent.¶
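One possible shape for such an autotuning policy is sketched below; the function names, the half-window threshold, and the doubling rule are illustrative assumptions, not requirements of the specification:

```python
def next_max_stream_data(consumed_offset, current_limit, window):
    """Advertise a new absolute offset only once less than half the window
    of credit remains, limiting MAX_STREAM_DATA frame overhead while
    staying ahead of the sender. Returns None when no update is needed."""
    remaining = current_limit - consumed_offset
    if remaining < window / 2:
        return consumed_offset + window
    return None


def autotune_window(window, bytes_consumed, elapsed, rtt, max_window=1 << 20):
    """Grow the advertised window when the application drains data faster
    than roughly half a window per round trip."""
    if elapsed > 0 and bytes_consumed / elapsed > window / (2 * rtt):
        window = min(2 * window, max_window)
    return window
```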
-A blocked sender is not required to send STREAM_DATA_BLOCKED or DATA_BLOCKED -frames. Therefore, a receiver MUST NOT wait for a STREAM_DATA_BLOCKED or -DATA_BLOCKED frame before sending a MAX_STREAM_DATA or MAX_DATA frame; doing so -could result in the sender being blocked for the rest of the connection. Even if -the sender sends these frames, waiting for them will result in the sender being -blocked for at least an entire round trip.¶
-When a sender receives credit after being blocked, it might be able to send a -large amount of data in response, resulting in short-term congestion; see -Section 6.9 in [QUIC-RECOVERY] for a discussion of how a sender can avoid this -congestion.¶
-Endpoints need to eventually agree on the amount of flow control credit that has -been consumed, to avoid either exceeding flow control limits or deadlocking.¶
-On receipt of a RESET_STREAM frame, an endpoint will tear down state for the -matching stream and ignore further data arriving on that stream. Without the -offset included in RESET_STREAM, the two endpoints could disagree on -the number of bytes that count towards connection flow control.¶
-To remedy this issue, a RESET_STREAM frame (Section 19.4) includes the -final size of data sent on the stream. On receiving a RESET_STREAM frame, a -receiver definitively knows how many bytes were sent on that stream before the -RESET_STREAM frame, and the receiver MUST use the final size of the stream to -account for all bytes sent on the stream in its connection level flow -controller.¶
-RESET_STREAM terminates one direction of a stream abruptly. For a bidirectional -stream, RESET_STREAM has no effect on data flow in the opposite direction. Both -endpoints MUST maintain flow control state for the stream in the unterminated -direction until that direction enters a terminal state, or until one of the -endpoints sends CONNECTION_CLOSE.¶
-The final size is the amount of flow control credit that is consumed by a -stream. Assuming that every contiguous byte on the stream was sent once, the -final size is the number of bytes sent. More generally, this is one higher -than the offset of the byte with the largest offset sent on the stream, or zero -if no bytes were sent.¶
-For a stream that is reset, the final size is carried explicitly in a RESET_STREAM frame. Otherwise, the final size is the offset plus the length of a STREAM frame marked with a FIN flag.¶
-An endpoint will know the final size for a stream when the receiving part of the -stream enters the "Size Known" or "Reset Recvd" state (Section 3).¶
-An endpoint MUST NOT send data on a stream at or beyond the final size.¶
-Once a final size for a stream is known, it cannot change. If a RESET_STREAM or -STREAM frame is received indicating a change in the final size for the stream, -an endpoint SHOULD respond with a FINAL_SIZE_ERROR error (see -Section 11). A receiver SHOULD treat receipt of data at or beyond the -final size as a FINAL_SIZE_ERROR error, even after a stream is closed. -Generating these errors is not mandatory, but only because requiring that an -endpoint generate these errors also means that the endpoint needs to maintain -the final size state for closed streams, which could mean a significant state -commitment.¶
-An endpoint limits the cumulative number of incoming streams a peer can open. Only streams with a stream ID less than (max_streams * 4 + initial_stream_id_for_type) can be opened (see Table 5). Initial limits are set in the transport parameters (see Section 18.2) and subsequently limits are advertised using MAX_STREAMS frames (Section 19.11). Separate limits apply to unidirectional and bidirectional streams.¶
-If a max_streams transport parameter or MAX_STREAMS frame is received with a -value greater than 2^60, this would allow a maximum stream ID that cannot be -expressed as a variable-length integer (see Section 16). -If either is received, the connection MUST be closed immediately with a -connection error of type STREAM_LIMIT_ERROR (see Section 10.3).¶
-Endpoints MUST NOT exceed the limit set by their peer. An endpoint that -receives a frame with a stream ID exceeding the limit it has sent MUST treat -this as a connection error of type STREAM_LIMIT_ERROR (Section 11).¶
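The two checks from the preceding paragraphs can be sketched as follows; the function names are hypothetical, and the initial stream IDs per type are 0x0 through 0x3 as in Table 5:

```python
def stream_allowed(stream_id, max_streams, initial_stream_id_for_type):
    """A stream may be opened only if its ID is below
    max_streams * 4 + initial_stream_id_for_type."""
    return stream_id < max_streams * 4 + initial_stream_id_for_type


def validate_max_streams(value):
    """Values above 2^60 would allow stream IDs that cannot be encoded as
    variable-length integers; treated here as STREAM_LIMIT_ERROR."""
    if value > 2 ** 60:
        raise ValueError("STREAM_LIMIT_ERROR")
    return value
```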
-Once a receiver advertises a stream limit using the MAX_STREAMS frame, -advertising a smaller limit has no effect. A receiver MUST ignore any -MAX_STREAMS frame that does not increase the stream limit.¶
-As with stream and connection flow control, this document leaves when and how -many streams to advertise to a peer via MAX_STREAMS to implementations. -Implementations might choose to increase limits as streams close to keep the -number of streams available to peers roughly consistent.¶
-An endpoint that is unable to open a new stream due to the peer's limits SHOULD -send a STREAMS_BLOCKED frame (Section 19.14). This signal is -considered useful for debugging. An endpoint MUST NOT wait to receive this -signal before advertising additional credit, since doing so will mean that the -peer will be blocked for at least an entire round trip, and potentially for -longer if the peer chooses to not send STREAMS_BLOCKED frames.¶
-QUIC's connection establishment combines version negotiation with the -cryptographic and transport handshakes to reduce connection establishment -latency, as described in Section 7. Once established, a connection -may migrate to a different IP or port at either endpoint as -described in Section 9. Finally, a connection may be terminated by either -endpoint, as described in Section 10.¶
-Each connection possesses a set of connection identifiers, or connection IDs, -each of which can identify the connection. Connection IDs are independently -selected by endpoints; each endpoint selects the connection IDs that its peer -uses.¶
-The primary function of a connection ID is to ensure that changes in addressing -at lower protocol layers (UDP, IP) don't cause packets for a QUIC -connection to be delivered to the wrong endpoint. Each endpoint selects -connection IDs using an implementation-specific (and perhaps -deployment-specific) method which will allow packets with that connection ID to -be routed back to the endpoint and identified by the endpoint upon receipt.¶
-Connection IDs MUST NOT contain any information that can be used by an external -observer (that is, one that does not cooperate with the issuer) to correlate -them with other connection IDs for the same connection. As a trivial example, -this means the same connection ID MUST NOT be issued more than once on the same -connection.¶
-Packets with long headers include Source Connection ID and Destination -Connection ID fields. These fields are used to set the connection IDs for new -connections; see Section 7.2 for details.¶
-Packets with short headers (Section 17.3) only include the Destination -Connection ID and omit the explicit length. The length of the Destination -Connection ID field is expected to be known to endpoints. Endpoints using a -load balancer that routes based on connection ID could agree with the load -balancer on a fixed length for connection IDs, or agree on an encoding scheme. -A fixed portion could encode an explicit length, which allows the entire -connection ID to vary in length and still be used by the load balancer.¶
-A Version Negotiation (Section 17.2.1) packet echoes the connection IDs -selected by the client, both to ensure correct routing toward the client and to -allow the client to validate that the packet is in response to an Initial -packet.¶
-A zero-length connection ID can be used when a connection ID is not needed to -route to the correct endpoint. However, multiplexing connections on the same -local IP address and port while using zero-length connection IDs will cause -failures in the presence of peer connection migration, NAT rebinding, and client -port reuse; and therefore MUST NOT be done unless an endpoint is certain that -those protocol features are not in use.¶
-When an endpoint uses a non-zero-length connection ID, it needs to ensure that -the peer has a supply of connection IDs from which to choose for packets sent to -the endpoint. These connection IDs are supplied by the endpoint using the -NEW_CONNECTION_ID frame (Section 19.15).¶
-Each connection ID has an associated sequence number to assist in deduplicating messages. The initial connection ID issued by an endpoint is sent in the Source Connection ID field of the long packet header (Section 17.2) during the handshake. The sequence number of the initial connection ID is 0. If the preferred_address transport parameter is sent, the sequence number of the supplied connection ID is 1.¶
-Additional connection IDs are communicated to the peer using NEW_CONNECTION_ID -frames (Section 19.15). The sequence number on each newly-issued -connection ID MUST increase by 1. The connection ID randomly selected by the -client in the Initial packet and any connection ID provided by a Retry packet -are not assigned sequence numbers unless a server opts to retain them as its -initial connection ID.¶
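Sequence-number assignment for issued connection IDs can be sketched as below; `ConnectionIdIssuer` and its fields are hypothetical names:

```python
class ConnectionIdIssuer:
    """Tracks connection IDs issued to the peer with their sequence numbers."""

    def __init__(self, handshake_cid, preferred_address_cid=None):
        # The initial connection ID from the long header is sequence 0.
        self.active = {0: handshake_cid}
        self.next_seq = 1
        if preferred_address_cid is not None:
            # A connection ID supplied via preferred_address is sequence 1.
            self.active[1] = preferred_address_cid
            self.next_seq = 2

    def issue(self, cid):
        """Issue a further connection ID (as a NEW_CONNECTION_ID frame would);
        each newly issued ID increases the sequence number by 1."""
        seq = self.next_seq
        self.next_seq += 1
        self.active[seq] = cid
        return seq
```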
-When an endpoint issues a connection ID, it MUST accept packets that carry this -connection ID for the duration of the connection or until its peer invalidates -the connection ID via a RETIRE_CONNECTION_ID frame -(Section 19.16). Connection IDs that are issued and not -retired are considered active; any active connection ID is valid for use with -the current connection at any time, in any packet type. This includes the -connection ID issued by the server via the preferred_address transport -parameter.¶
-An endpoint SHOULD ensure that its peer has a sufficient number of available and -unused connection IDs. Endpoints store received connection IDs for future use -and advertise the number of connection IDs they are willing to store with the -active_connection_id_limit transport parameter. An endpoint MUST NOT provide -more connection IDs than the peer's limit. An endpoint that receives more -connection IDs than its advertised active_connection_id_limit MUST close the -connection with an error of type CONNECTION_ID_LIMIT_ERROR.¶
-An endpoint SHOULD supply a new connection ID when the peer retires a connection -ID. If an endpoint provided fewer connection IDs than the peer's -active_connection_id_limit, it MAY supply a new connection ID when it receives -a packet with a previously unused connection ID. An endpoint MAY limit the -frequency or the total number of connection IDs issued for each connection to -avoid the risk of running out of connection IDs; see Section 10.4.2.¶
-An endpoint that initiates migration and requires non-zero-length connection IDs -SHOULD ensure that the pool of connection IDs available to its peer allows the -peer to use a new connection ID on migration, as the peer will close the -connection if the pool is exhausted.¶
-An endpoint can change the connection ID it uses for a peer to another available -one at any time during the connection. An endpoint consumes connection IDs in -response to a migrating peer; see Section 9.5 for more.¶
-An endpoint maintains a set of connection IDs received from its peer, any of -which it can use when sending packets. When the endpoint wishes to remove a -connection ID from use, it sends a RETIRE_CONNECTION_ID frame to its peer. -Sending a RETIRE_CONNECTION_ID frame indicates that the connection ID will not -be used again and requests that the peer replace it with a new connection ID -using a NEW_CONNECTION_ID frame.¶
-As discussed in Section 9.5, endpoints limit the use of a -connection ID to a single network path where possible. Endpoints SHOULD retire -connection IDs when no longer actively using the network path on which the -connection ID was used.¶
-An endpoint can cause its peer to retire connection IDs by sending a -NEW_CONNECTION_ID frame with an increased Retire Prior To field. Upon receipt, -the peer MUST first retire the corresponding connection IDs using -RETIRE_CONNECTION_ID frames and then add the newly provided connection ID to the -set of active connection IDs. Failure to retire the connection IDs within -approximately one PTO can cause packets to be delayed, lost, or cause the -original endpoint to send a stateless reset in response to a connection ID it -can no longer route correctly.¶
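Receipt of an increased Retire Prior To field could be handled as in this sketch; the function and argument names are assumptions:

```python
def handle_new_connection_id(active, seq, cid, retire_prior_to):
    """On NEW_CONNECTION_ID: first retire every active connection ID whose
    sequence number is below Retire Prior To, then add the newly provided ID.
    Returns the sequence numbers to send in RETIRE_CONNECTION_ID frames."""
    to_retire = sorted(s for s in active if s < retire_prior_to)
    for s in to_retire:
        del active[s]
    active[seq] = cid
    return to_retire
```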
-An endpoint MAY discard a connection ID for which retirement has been requested -once an interval of no less than 3 PTO has elapsed since an acknowledgement is -received for the NEW_CONNECTION_ID frame requesting that retirement. Until -then, the endpoint SHOULD be prepared to receive packets that contain the -connection ID that it has requested be retired. Subsequent incoming packets -using that connection ID could elicit a response with the corresponding -stateless reset token.¶
-Incoming packets are classified on receipt. Packets can either be associated -with an existing connection, or - for servers - potentially create a new -connection.¶
-Endpoints try to associate a packet with an existing connection. If the packet -has a non-zero-length Destination Connection ID corresponding to an existing -connection, QUIC processes that packet accordingly. Note that more than one -connection ID can be associated with a connection; see Section 5.1.¶
-If the Destination Connection ID is zero length and the addressing information -in the packet matches the addressing information the endpoint uses to identify a -connection with a zero-length connection ID, QUIC processes the packet as part -of that connection. An endpoint can use just destination IP and port or both -source and destination addresses for identification, though this makes -connections fragile as described in Section 5.1.¶
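The two matching rules can be combined as in this sketch; the `Packet` record and the lookup-table names are assumptions for illustration:

```python
from collections import namedtuple

# dcid: Destination Connection ID bytes (b"" when zero length);
# src/dst: (address, port) tuples taken from the UDP datagram.
Packet = namedtuple("Packet", ["dcid", "src", "dst"])

def match_connection(packet, by_cid, by_addr):
    """Return the connection a packet belongs to, or None. A non-zero-length
    Destination Connection ID is matched first; zero-length IDs fall back to
    the addressing information."""
    if packet.dcid:
        return by_cid.get(packet.dcid)
    return by_addr.get((packet.src, packet.dst))
```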
-Endpoints can send a Stateless Reset (Section 10.4) for any packets that -cannot be attributed to an existing connection. A stateless reset allows a peer -to more quickly identify when a connection becomes unusable.¶
-Packets that are matched to an existing connection are discarded if the packets -are inconsistent with the state of that connection. For example, packets are -discarded if they indicate a different protocol version than that of the -connection, or if the removal of packet protection is unsuccessful once the -expected keys are available.¶
-Invalid packets without packet protection, such as Initial, Retry, or Version -Negotiation, MAY be discarded. An endpoint MUST generate a connection error if -it commits changes to state before discovering an error.¶
-Valid packets sent to clients always include a Destination Connection ID that -matches a value the client selects. Clients that choose to receive -zero-length connection IDs can use the local address and port to identify a -connection. Packets that don't match an existing connection are discarded.¶
-Due to packet reordering or loss, a client might receive packets for a -connection that are encrypted with a key it has not yet computed. The client MAY -drop these packets, or MAY buffer them in anticipation of later packets that -allow it to compute the key.¶
-If a client receives a packet that has an unsupported version, it MUST discard -that packet.¶
-If a server receives a packet that has an unsupported version, but the packet is -sufficiently large to initiate a new connection for any version supported by the -server, it SHOULD send a Version Negotiation packet as described in -Section 6.1. Servers MAY rate control these packets to avoid storms of Version -Negotiation packets. Otherwise, servers MUST drop packets that specify -unsupported versions.¶
-The first packet for an unsupported version can use different semantics and -encodings for any version-specific field. In particular, different packet -protection keys might be used for different versions. Servers that do not -support a particular version are unlikely to be able to decrypt the payload of -the packet. Servers SHOULD NOT attempt to decode or decrypt a packet from an -unknown version, but instead send a Version Negotiation packet, provided that -the packet is sufficiently long.¶
-Packets with a supported version, or no version field, are matched to a -connection using the connection ID or - for packets with zero-length connection -IDs - the local address and port. If the packet doesn't match an existing -connection, the server continues below.¶
-If the packet is an Initial packet fully conforming with the specification, the -server proceeds with the handshake (Section 7). This commits the server to -the version that the client selected.¶
-If a server isn't currently accepting any new connections, it SHOULD send an -Initial packet containing a CONNECTION_CLOSE frame with error code -SERVER_BUSY.¶
-If the packet is a 0-RTT packet, the server MAY buffer a limited number of these -packets in anticipation of a late-arriving Initial packet. Clients are not able -to send Handshake packets prior to receiving a server response, so servers -SHOULD ignore any such packets.¶
-Servers MUST drop incoming packets under all other circumstances.¶
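The server-side decision chain from the paragraphs above can be summarized as a sketch; the `RawPacket` fields and the action strings are illustrative assumptions:

```python
from collections import namedtuple

RawPacket = namedtuple("RawPacket", ["version", "kind", "size"])

def server_dispatch(pkt, supported_versions, min_initial_size, accepting):
    """Classify a packet that matched no existing connection."""
    if pkt.version not in supported_versions:
        # Large enough to have initiated a connection on some supported
        # version: answer with Version Negotiation; otherwise drop it.
        if pkt.size >= min_initial_size:
            return "send_version_negotiation"
        return "drop"
    if pkt.kind == "initial":
        return "proceed_with_handshake" if accepting else "close_server_busy"
    if pkt.kind == "0rtt":
        return "buffer_awaiting_initial"
    return "drop"
```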
-A QUIC connection is a stateful interaction between a client and server, the -primary purpose of which is to support the exchange of data by an application -protocol. Streams (Section 2) are the primary means by which an application -protocol exchanges information.¶
-Each connection starts with a handshake phase, during which client and server -establish a shared secret using the cryptographic handshake protocol -[QUIC-TLS] and negotiate the application protocol. The handshake -(Section 7) confirms that both endpoints are willing to communicate -(Section 8.1) and establishes parameters for the connection -(Section 7.3).¶
-An application protocol can also operate in a limited fashion during the -handshake phase. 0-RTT allows application messages to be sent by a client -before receiving any messages from the server. However, 0-RTT lacks certain key -security guarantees. In particular, there is no protection against replay -attacks in 0-RTT; see [QUIC-TLS]. Separately, a server can also send -application data to a client before it receives the final cryptographic -handshake messages that allow it to confirm the identity and liveness of the -client. These capabilities allow an application protocol to offer the option to -trade some security guarantees for reduced latency.¶
-The use of connection IDs (Section 5.1) allows connections to migrate to a -new network path, both as a direct choice of an endpoint and when forced by a -change in a middlebox. Section 9 describes mitigations for the security and -privacy issues associated with migration.¶
-For connections that are no longer needed or desired, there are several ways for -a client and server to terminate a connection (Section 10).¶
-There are certain operations which an application MUST be able to perform when -interacting with the QUIC transport. This document does not specify an API, but -any implementation of this version of QUIC MUST expose the ability to perform -the operations described in this section on a QUIC connection.¶
-When implementing the client role, applications need to be able to:¶
-open a connection, which begins the exchange described in Section 7;¶
-enable 0-RTT when available; and¶
-be informed when 0-RTT has been accepted or rejected by a server.¶
-When implementing the server role, applications need to be able to:¶
-listen for incoming connections, which prepares for the exchange described in Section 7;¶
-if Early Data is supported, embed application-controlled data in the TLS resumption ticket sent to the client; and¶
-if Early Data is supported, retrieve application-controlled data from the client's resumption ticket and accept or reject Early Data based on that information.¶
-In either role, applications need to be able to:¶
-configure minimum values for the initial number of permitted streams of each type, as communicated in the transport parameters (Section 7.3);¶
-control resource allocation of various types, including flow control and the number of permitted streams of each type;¶
-identify whether the handshake has completed successfully or is still ongoing;¶
-keep a connection from silently closing, either by generating PING frames (Section 19.2) or by requesting that the transport send additional frames before the idle timeout expires (Section 10.2); and¶
-immediately close (Section 10.3) the connection.¶
-Version negotiation ensures that client and server agree to a QUIC version -that is mutually supported. A server sends a Version Negotiation packet in -response to each packet that might initiate a new connection; see -Section 5.2 for details.¶
-The size of the first packet sent by a client will determine whether a server -sends a Version Negotiation packet. Clients that support multiple QUIC versions -SHOULD pad the first packet they send to the largest of the minimum packet sizes -across all versions they support. This ensures that the server responds if there -is a mutually supported version.¶
-If the version selected by the client is not acceptable to the server, the -server responds with a Version Negotiation packet (see Section 17.2.1). -This includes a list of versions that the server will accept. An endpoint MUST -NOT send a Version Negotiation packet in response to receiving a Version -Negotiation packet.¶
-This system allows a server to process packets with unsupported versions without retaining state. Though either the Initial packet or the Version Negotiation packet that is sent in response could be lost, the client will send new packets until it successfully receives a response or it abandons the connection attempt.¶
-A server MAY limit the number of Version Negotiation packets it sends. For -instance, a server that is able to recognize packets as 0-RTT might choose not -to send Version Negotiation packets in response to 0-RTT packets with the -expectation that it will eventually receive an Initial packet.¶
-When a client receives a Version Negotiation packet, it MUST abandon the -current connection attempt. Version Negotiation packets are designed to allow -future versions of QUIC to negotiate the version in use between endpoints. -Future versions of QUIC might change how implementations that support multiple -versions of QUIC react to Version Negotiation packets when attempting to -establish a connection using this version. How to perform version negotiation -is left as future work defined by future versions of QUIC. In particular, -that future work will need to ensure robustness against version downgrade -attacks; see Section 21.10.¶
-[[RFC editor: please remove this section before publication.]]¶
-When a draft implementation receives a Version Negotiation packet, it MAY use -it to attempt a new connection with one of the versions listed in the packet, -instead of abandoning the current connection attempt; see Section 6.2.¶
-The client MUST check that the Destination and Source Connection ID fields -match the Source and Destination Connection ID fields in a packet that the -client sent. If this check fails, the packet MUST be discarded.¶
-Once the Version Negotiation packet is determined to be valid, the client then -selects an acceptable protocol version from the list provided by the server. -The client then attempts to create a new connection using that version. The new -connection MUST use a new random Destination Connection ID different from the -one it had previously sent.¶
-Note that this mechanism does not protect against downgrade attacks and -MUST NOT be used outside of draft implementations.¶
-For a server to use a new version in the future, clients need to correctly -handle unsupported versions. To help ensure this, a server SHOULD include a -version that is reserved for forcing version negotiation (0x?a?a?a?a as defined -in Section 15) when generating a Version Negotiation packet.¶
-The design of version negotiation permits a server to avoid maintaining state -for packets that it rejects in this fashion.¶
-A client MAY send a packet using a version that is reserved for forcing version -negotiation. This can be used to solicit a list of supported versions from a -server.¶
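A version of the form 0x?a?a?a?a has the low nibble of every byte equal to 0xa, so an endpoint can test for a reserved version with a mask:

```python
def is_reserved_version(version):
    """True for version numbers of the form 0x?a?a?a?a, which are reserved
    for forcing version negotiation."""
    return version & 0x0F0F0F0F == 0x0A0A0A0A
```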
-QUIC relies on a combined cryptographic and transport handshake to minimize connection establishment latency. QUIC uses the CRYPTO frame (Section 19.6) to transmit the cryptographic handshake. Version 0x00000001 of QUIC uses TLS as described in [QUIC-TLS]; a different QUIC version number could indicate that a different cryptographic handshake protocol is in use.¶
-QUIC provides reliable, ordered delivery of the cryptographic handshake -data. QUIC packet protection is used to encrypt as much of the handshake -protocol as possible. The cryptographic handshake MUST provide the following -properties:¶
-authenticated key exchange, where¶
-a server is always authenticated,¶
-a client is optionally authenticated,¶
-every connection produces distinct and unrelated keys, and¶
-keying material is usable for packet protection for both 0-RTT and 1-RTT packets;¶
-authenticated values for transport parameters of both endpoints, and confidentiality protection for server transport parameters; and¶
-authenticated negotiation of an application protocol (TLS uses ALPN for this purpose).¶
- -An endpoint can verify support for Explicit Congestion Notification (ECN) in the -first packets it sends, as described in Section 13.4.2.¶
-The CRYPTO frame can be sent in different packet number spaces (Section 12.3). The offsets used by CRYPTO frames to ensure ordered delivery of cryptographic handshake data start from zero in each packet number space.¶
-Endpoints MUST explicitly negotiate an application protocol. This avoids -situations where there is a disagreement about the protocol that is in use.¶
-Details of how TLS is integrated with QUIC are provided in [QUIC-TLS], but -some examples are provided here. An extension of this exchange to support -client address validation is shown in Section 8.1.2.¶
-Once any address validation exchanges are complete, the -cryptographic handshake is used to agree on cryptographic keys. The -cryptographic handshake is carried in Initial (Section 17.2.2) and Handshake -(Section 17.2.4) packets.¶
-Figure 3 provides an overview of the 1-RTT handshake. Each line -shows a QUIC packet with the packet type and packet number shown first, followed -by the frames that are typically contained in those packets. So, for instance -the first packet is of type Initial, with packet number 0, and contains a CRYPTO -frame carrying the ClientHello.¶
-Note that multiple QUIC packets - even of different packet types - can be -coalesced into a single UDP datagram (see Section 12.2), and so this -handshake may consist of as few as 4 UDP datagrams, or any number more. For -instance, the server's first flight contains Initial packets, -Handshake packets, and "0.5-RTT data" in 1-RTT packets with a short header.¶
-Figure 4 shows an example of a connection with a 0-RTT handshake -and a single packet of 0-RTT data. Note that as described in -Section 12.3, the server acknowledges 0-RTT data in 1-RTT packets, and -the client sends 1-RTT packets in the same packet number space.¶
-A connection ID is used to ensure consistent routing of packets, as described in -Section 5.1. The long header contains two connection IDs: the Destination -Connection ID is chosen by the recipient of the packet and is used to provide -consistent routing; the Source Connection ID is used to set the Destination -Connection ID used by the peer.¶
-During the handshake, packets with the long header (Section 17.2) are used to -establish the connection IDs in each direction. Each endpoint uses the Source -Connection ID field to specify the connection ID that is used in the Destination -Connection ID field of packets being sent to them. Upon receiving a packet, each -endpoint sets the Destination Connection ID it sends to match the value of the -Source Connection ID that it receives.¶
-When an Initial packet is sent by a client that has not previously received an -Initial or Retry packet from the server, the client populates the Destination -Connection ID field with an unpredictable value. This Destination Connection ID -MUST be at least 8 bytes in length. Until a packet is received from the server, -the client MUST use the same Destination Connection ID value on all packets in -this connection. This Destination Connection ID is used to determine packet -protection keys for Initial packets.¶
-The client populates the Source Connection ID field with a value of its choosing -and sets the SCID Len field to indicate the length.¶
-The first flight of 0-RTT packets uses the same Destination Connection ID and Source Connection ID values as the client's first Initial packet.¶
-Upon first receiving an Initial or Retry packet from the server, the client uses -the Source Connection ID supplied by the server as the Destination Connection ID -for subsequent packets, including all subsequent 0-RTT packets. This means that -a client might have to change the connection ID it sets in the Destination -Connection ID field twice during connection establishment: once in response to a -Retry, and once in response to an Initial packet from the server. Once a client -has received an Initial packet from the server, it MUST discard any subsequent -packet it receives with a different Source Connection ID.¶
-A client MUST change the Destination Connection ID it uses for sending packets -in response to only the first received Initial or Retry packet. A server MUST -set the Destination Connection ID it uses for sending packets based on the first -received Initial packet. Any further changes to the Destination Connection ID -are only permitted if the values are taken from any received -NEW_CONNECTION_ID frames; if subsequent Initial packets include a different -Source Connection ID, they MUST be discarded. This avoids unpredictable -outcomes that might otherwise result from stateless processing of multiple -Initial packets with different Source Connection IDs.¶
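The client's handling of the server's Source Connection ID can be sketched as below. This simplified model uses hypothetical names and ignores Retry, after which the client would perform the update once more:

```python
class ClientCidState:
    """Destination Connection ID handling on the client during the handshake."""

    def __init__(self, initial_dcid):
        self.dcid = initial_dcid  # unpredictable value chosen by the client
        self.updated = False

    def on_server_initial(self, scid):
        """Adopt the Source Connection ID from the first server Initial; later
        Initials with a different Source Connection ID are discarded.
        Returns False when the packet must be discarded."""
        if not self.updated:
            self.dcid = scid
            self.updated = True
            return True
        return scid == self.dcid
```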
-The Destination Connection ID that an endpoint sends can change over the -lifetime of a connection, especially in response to connection migration -(Section 9); see Section 5.1.1 for details.¶
-During connection establishment, both endpoints make authenticated declarations -of their transport parameters. Endpoints are required to comply with the -restrictions implied by these parameters; the description of each parameter -includes rules for its handling.¶
-Transport parameters are declarations that are made unilaterally by each -endpoint. Each endpoint can choose values for transport parameters independent -of the values chosen by its peer.¶
-The encoding of the transport parameters is detailed in -Section 18.¶
-QUIC includes the encoded transport parameters in the cryptographic handshake. -Once the handshake completes, the transport parameters declared by the peer are -available. Each endpoint validates the value provided by its peer.¶
-Definitions for each of the defined transport parameters are included in -Section 18.2.¶
-An endpoint MUST treat receipt of a transport parameter with an invalid value as -a connection error of type TRANSPORT_PARAMETER_ERROR.¶
-An endpoint MUST NOT send a parameter more than once in a given transport -parameters extension. An endpoint SHOULD treat receipt of duplicate transport -parameters as a connection error of type TRANSPORT_PARAMETER_ERROR.¶
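Duplicate detection while decoding the extension can be sketched as follows (function and error-message names assumed):

```python
def decode_transport_parameters(pairs):
    """Collect (id, value) pairs from a transport parameters extension,
    rejecting duplicates as a TRANSPORT_PARAMETER_ERROR."""
    params = {}
    for param_id, value in pairs:
        if param_id in params:
            raise ValueError("TRANSPORT_PARAMETER_ERROR: duplicate parameter")
        params[param_id] = value
    return params
```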
-A server MUST include the original_connection_id transport parameter -(Section 18.2) if it sent a Retry packet to enable -validation of the Retry, as described in Section 17.2.5.¶
-Both endpoints store the value of the server transport parameters from -a connection and apply them to any 0-RTT packets that are sent in -subsequent connections to that peer, except for transport parameters that -are explicitly excluded. Remembered transport parameters apply to the new -connection until the handshake completes and the client starts sending -1-RTT packets. Once the handshake completes, the client uses the transport -parameters established in the handshake.¶
-The definition of new transport parameters (Section 7.3.2) MUST -specify whether they MUST, MAY, or MUST NOT be stored for 0-RTT. A client need -not store a transport parameter it cannot process.¶
A client MUST NOT use remembered values for the following parameters: original_connection_id, preferred_address, stateless_reset_token, and ack_delay_exponent. The client MUST use the server's new values in the handshake instead or, absent new values from the server, the default values.¶
-A client that attempts to send 0-RTT data MUST remember all other transport -parameters used by the server. The server can remember these transport -parameters, or store an integrity-protected copy of the values in the ticket -and recover the information when accepting 0-RTT data. A server uses the -transport parameters in determining whether to accept 0-RTT data.¶
-If 0-RTT data is accepted by the server, the server MUST NOT reduce any -limits or alter any values that might be violated by the client with its -0-RTT data. In particular, a server that accepts 0-RTT data MUST NOT set -values for the following parameters (Section 18.2) -that are smaller than the remembered value of the parameters.¶
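A server might verify this constraint before accepting 0-RTT data, as in the following sketch. The parameter names follow Section 18.2; the function name and dict representation are assumptions:

```python
# A server accepting 0-RTT must not shrink any of these limits below the
# value the client remembered from the previous connection.
LIMIT_PARAMS = (
    "initial_max_data",
    "initial_max_stream_data_bidi_local",
    "initial_max_stream_data_bidi_remote",
    "initial_max_stream_data_uni",
    "initial_max_streams_bidi",
    "initial_max_streams_uni",
)

def can_accept_0rtt(remembered, new):
    """Both arguments are dicts of transport parameter values."""
    return all(new.get(p, 0) >= remembered.get(p, 0) for p in LIMIT_PARAMS)
```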
-Omitting or setting a zero value for certain transport parameters can result in -0-RTT data being enabled, but not usable. The applicable subset of transport -parameters that permit sending of application data SHOULD be set to non-zero -values for 0-RTT. This includes initial_max_data and either -initial_max_streams_bidi and initial_max_stream_data_bidi_remote, or -initial_max_streams_uni and initial_max_stream_data_uni.¶
-A server MUST either reject 0-RTT data or abort a handshake if the implied -values for transport parameters cannot be supported.¶
-When sending frames in 0-RTT packets, a client MUST only use remembered -transport parameters; importantly, it MUST NOT use updated values that it learns -from the server's updated transport parameters or from frames received in 1-RTT -packets. Updated values of transport parameters from the handshake apply only -to 1-RTT packets. For instance, flow control limits from remembered transport -parameters apply to all 0-RTT packets even if those values are increased by the -handshake or by frames sent in 1-RTT packets. A server MAY treat use of updated -transport parameters in 0-RTT as a connection error of type PROTOCOL_VIOLATION.¶
-New transport parameters can be used to negotiate new protocol behavior. An -endpoint MUST ignore transport parameters that it does not support. Absence of -a transport parameter therefore disables any optional protocol feature that is -negotiated using the parameter. As described in Section 18.1, -some identifiers are reserved in order to exercise this requirement.¶
-New transport parameters can be registered according to the rules in -Section 22.2.¶
-Implementations need to maintain a buffer of CRYPTO data received out of order. -Because there is no flow control of CRYPTO frames, an endpoint could -potentially force its peer to buffer an unbounded amount of data.¶
-Implementations MUST support buffering at least 4096 bytes of data received in -CRYPTO frames out of order. Endpoints MAY choose to allow more data to be -buffered during the handshake. A larger limit during the handshake could allow -for larger keys or credentials to be exchanged. An endpoint's buffer size does -not need to remain constant during the life of the connection.¶
-Being unable to buffer CRYPTO frames during the handshake can lead to a -connection failure. If an endpoint's buffer is exceeded during the handshake, it -can expand its buffer temporarily to complete the handshake. If an endpoint -does not expand its buffer, it MUST close the connection with a -CRYPTO_BUFFER_EXCEEDED error code.¶
-Once the handshake completes, if an endpoint is unable to buffer all data in a -CRYPTO frame, it MAY discard that CRYPTO frame and all CRYPTO frames received in -the future, or it MAY close the connection with a CRYPTO_BUFFER_EXCEEDED error -code. Packets containing discarded CRYPTO frames MUST be acknowledged because -the packet has been received and processed by the transport even though the -CRYPTO frame was discarded.¶
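The buffering rules in the preceding paragraphs can be sketched as follows. The class shape and return conventions are assumptions; the 4096-byte floor and the discard-after-handshake allowance come from the text above:

```python
class CryptoBuffer:
    """Sketch of out-of-order CRYPTO reassembly with the 4096-byte floor."""
    def __init__(self, limit=4096):
        self.limit = limit
        self.pending = {}      # offset -> bytes buffered out of order
        self.next_offset = 0   # next in-order offset expected

    def receive(self, offset, data, handshake_done=False):
        """Returns contiguous data now deliverable, b"" if still gapped,
        or None if the frame was discarded (permitted after the handshake)."""
        if offset != self.next_offset:
            if sum(map(len, self.pending.values())) + len(data) > self.limit:
                if handshake_done:
                    return None                  # MAY discard post-handshake
                raise ConnectionError("CRYPTO_BUFFER_EXCEEDED")
            self.pending[offset] = data
            return b""
        out = data
        self.next_offset += len(data)
        while self.next_offset in self.pending:  # drain now-contiguous data
            chunk = self.pending.pop(self.next_offset)
            out += chunk
            self.next_offset += len(chunk)
        return out
```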
-Address validation is used by QUIC to avoid being used for a traffic -amplification attack. In such an attack, a packet is sent to a server with -spoofed source address information that identifies a victim. If a server -generates more or larger packets in response to that packet, the attacker can -use the server to send more data toward the victim than it would be able to send -on its own.¶
The primary defense against amplification attacks is verifying that an endpoint is able to receive packets at the transport address that it claims. Address validation is performed both during connection establishment (see Section 8.1) and during connection migration (see Section 8.2).¶
-Connection establishment implicitly provides address validation for both -endpoints. In particular, receipt of a packet protected with Handshake keys -confirms that the client received the Initial packet from the server. Once the -server has successfully processed a Handshake packet from the client, it can -consider the client address to have been validated.¶
-Prior to validating the client address, servers MUST NOT send more than three -times as many bytes as the number of bytes they have received. This limits the -magnitude of any amplification attack that can be mounted using spoofed source -addresses. In determining this limit, servers only count the size of -successfully processed packets.¶
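The three-times limit above amounts to a simple byte-counting gate, sketched below. The class and method names are illustrative:

```python
# Pre-validation anti-amplification gate: a server may send at most three
# times the bytes received in successfully processed packets.
class AmplificationLimiter:
    FACTOR = 3

    def __init__(self):
        self.received = 0       # bytes in successfully processed packets
        self.sent = 0
        self.validated = False  # set once the client address is validated

    def on_packet_processed(self, size):
        self.received += size

    def can_send(self, size):
        return self.validated or self.sent + size <= self.FACTOR * self.received

    def on_send(self, size):
        self.sent += size
```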
-Clients MUST ensure that UDP datagrams containing Initial packets have UDP -payloads of at least 1200 bytes, adding padding to packets in the datagram as -necessary. Sending padded datagrams ensures that the server is not overly -constrained by the amplification restriction.¶
-Loss of an Initial or Handshake packet from the server can cause a deadlock if -the client does not send additional Initial or Handshake packets. A deadlock -could occur when the server reaches its anti-amplification limit and the client -has received acknowledgements for all the data it has sent. In this case, when -the client has no reason to send additional packets, the server will be unable -to send more data because it has not validated the client's address. To prevent -this deadlock, clients MUST send a packet on a probe timeout -(PTO, see Section 5.3 of [QUIC-RECOVERY]). Specifically, the client MUST send -an Initial packet in a UDP datagram of at least 1200 bytes if it does not have -Handshake keys, and otherwise send a Handshake packet.¶
-A server might wish to validate the client address before starting the -cryptographic handshake. QUIC uses a token in the Initial packet to provide -address validation prior to completing the handshake. This token is delivered to -the client during connection establishment with a Retry packet (see -Section 8.1.2) or in a previous connection using the NEW_TOKEN frame (see -Section 8.1.3).¶
-In addition to sending limits imposed prior to address validation, servers are -also constrained in what they can send by the limits set by the congestion -controller. Clients are only constrained by the congestion controller.¶
A token sent in a NEW_TOKEN frame or a Retry packet MUST be constructed in a way that allows the server to identify how it was provided to a client. These tokens are carried in the same field but require different handling from servers.¶
-Upon receiving the client's Initial packet, the server can request address -validation by sending a Retry packet (Section 17.2.5) containing a token. This -token MUST be repeated by the client in all Initial packets it sends for that -connection after it receives the Retry packet. In response to processing an -Initial containing a token, a server can either abort the connection or permit -it to proceed.¶
-As long as it is not possible for an attacker to generate a valid token for -its own address (see Section 8.1.4) and the client is able to return -that token, it proves to the server that it received the token.¶
-A server can also use a Retry packet to defer the state and processing costs of -connection establishment. Requiring the server to provide a different -connection ID, along with the original_connection_id transport parameter defined -in Section 18.2, forces the server to demonstrate that -it, or an entity it cooperates with, received the original Initial packet from -the client. Providing a different connection ID also grants a server some -control over how subsequent packets are routed. This can be used to direct -connections to a different server instance.¶
-If a server receives a client Initial that can be unprotected but contains an -invalid Retry token, it knows the client will not accept another Retry token. -The server can discard such a packet and allow the client to time out to -detect handshake failure, but that could impose a significant latency penalty on -the client. Instead, the server SHOULD immediately close (Section 10.3) -the connection with an INVALID_TOKEN error. Note that a server has not -established any state for the connection at this point and so does not enter the -closing period.¶
-A flow showing the use of a Retry packet is shown in Figure 5.¶
-A server MAY provide clients with an address validation token during one -connection that can be used on a subsequent connection. Address validation is -especially important with 0-RTT because a server potentially sends a significant -amount of data to a client in response to 0-RTT data.¶
The server uses the NEW_TOKEN frame (Section 19.7) to provide the client with an address validation token that can be used to validate future connections. The client includes this token in Initial packets to provide address validation in a future connection. The client MUST include the token in all Initial packets it sends, unless a Retry replaces the token with a newer one. The client MUST NOT use the token provided in a Retry for future connections. Servers MAY discard any Initial packet that does not carry the expected token.¶
-Unlike the token that is created for a Retry packet, which is used immediately, -the token sent in the NEW_TOKEN frame might be used after some period of -time has passed. Thus, a token SHOULD have an expiration time, which could -be either an explicit expiration time or an issued timestamp that can be -used to dynamically calculate the expiration time. A server can store the -expiration time or include it in an encrypted form in the token.¶
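Using an issued timestamp, the expiration check might look like the following sketch. The lifetime value and function name are assumed policy, not spec requirements:

```python
import time

# Derive the expiration from an issue timestamp embedded in the token.
NEW_TOKEN_LIFETIME = 7 * 24 * 3600  # one week; an illustrative server policy

def token_is_fresh(issued_at, now=None, lifetime=NEW_TOKEN_LIFETIME):
    now = time.time() if now is None else now
    return issued_at <= now < issued_at + lifetime
```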
-A token issued with NEW_TOKEN MUST NOT include information that would allow -values to be linked by an observer to the connection on which it was -issued, unless the values are encrypted. For example, it cannot include the -previous connection ID or addressing information. A server MUST ensure that -every NEW_TOKEN frame it sends is unique across all clients, with the exception -of those sent to repair losses of previously sent NEW_TOKEN frames. Information -that allows the server to distinguish between tokens from Retry and NEW_TOKEN -MAY be accessible to entities other than the server.¶
-It is unlikely that the client port number is the same on two different -connections; validating the port is therefore unlikely to be successful.¶
-A token received in a NEW_TOKEN frame is applicable to any server that the -connection is considered authoritative for (e.g., server names included in the -certificate). When connecting to a server for which the client retains an -applicable and unused token, it SHOULD include that token in the Token field of -its Initial packet. Including a token might allow the server to validate the -client address without an additional round trip. A client MUST NOT include a -token that is not applicable to the server that it is connecting to, unless the -client has the knowledge that the server that issued the token and the server -the client is connecting to are jointly managing the tokens. A client MAY use a -token from any previous connection to that server.¶
-A token allows a server to correlate activity between the connection where the -token was issued and any connection where it is used. Clients that want to -break continuity of identity with a server MAY discard tokens provided using the -NEW_TOKEN frame. In comparison, a token obtained in a Retry packet MUST be used -immediately during the connection attempt and cannot be used in subsequent -connection attempts.¶
-A client SHOULD NOT reuse a NEW_TOKEN token for different connection attempts. -Reusing a token allows connections to be linked by entities on the network path; -see Section 9.5.¶
Clients might receive multiple tokens on a single connection. Aside from preventing linkability, any token can be used in any connection attempt. Servers can send additional tokens to either enable address validation for multiple connection attempts or to replace older tokens that might become invalid. For a client, this ambiguity means that sending the most recent unused token is most likely to be effective. Though saving and using older tokens has no negative consequences, clients can regard older tokens as being less likely to be useful to the server for address validation.¶
-When a server receives an Initial packet with an address validation token, it -MUST attempt to validate the token, unless it has already completed address -validation. If the token is invalid then the server SHOULD proceed as if -the client did not have a validated address, including potentially sending -a Retry. If the validation succeeds, the server SHOULD then allow the -handshake to proceed.¶
-In a stateless design, a server can use encrypted and authenticated tokens to -pass information to clients that the server can later recover and use to -validate a client address. Tokens are not integrated into the cryptographic -handshake and so they are not authenticated. For instance, a client might be -able to reuse a token. To avoid attacks that exploit this property, a server -can limit its use of tokens to only the information needed to validate client -addresses.¶
-Clients MAY use tokens obtained on one connection for any connection attempt -using the same version. When selecting a token to use, clients do not need to -consider other properties of the connection that is being attempted, including -the choice of possible application protocols, session tickets, or other -connection properties.¶
-Attackers could replay tokens to use servers as amplifiers in DDoS attacks. To -protect against such attacks, servers SHOULD ensure that tokens sent in Retry -packets are only accepted for a short time. Tokens that are provided in -NEW_TOKEN frames (see Section 19.7) need to be valid for longer, but -SHOULD NOT be accepted multiple times in a short period. Servers are encouraged -to allow tokens to be used only once, if possible.¶
-An address validation token MUST be difficult to guess. Including a large -enough random value in the token would be sufficient, but this depends on the -server remembering the value it sends to clients.¶
-A token-based scheme allows the server to offload any state associated with -validation to the client. For this design to work, the token MUST be covered by -integrity protection against modification or falsification by clients. Without -integrity protection, malicious clients could generate or guess values for -tokens that would be accepted by the server. Only the server requires access to -the integrity protection key for tokens.¶
-There is no need for a single well-defined format for the token because the -server that generates the token also consumes it. A token could include -information about the claimed client address (IP and port), a timestamp, and any -other supplementary information the server will need to validate the token in -the future.¶
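One possible shape for such a token, assuming an HMAC-based integrity check as described above. The field names and encoding are entirely illustrative, since the format is server-chosen:

```python
import base64
import hashlib
import hmac
import json

# The server authenticates the claimed address and timestamp with a key
# that only it holds; clients cannot forge or usefully modify the token.
def make_token(key, ip, port, issued_at):
    body = json.dumps({"ip": ip, "port": port, "t": issued_at}).encode()
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(body + tag)

def check_token(key, token):
    raw = base64.urlsafe_b64decode(token)
    body, tag = raw[:-32], raw[-32:]
    expect = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expect, tag):
        return None  # forged or corrupted token
    return json.loads(body)
```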
-Path validation is used during connection migration (see Section 9 and -Section 9.6) by the migrating endpoint to verify reachability of a -peer from a new local address. In path validation, endpoints test reachability -between a specific local address and a specific peer address, where an address -is the two-tuple of IP address and port.¶
-Path validation tests that packets (PATH_CHALLENGE) can be both sent to and -received (PATH_RESPONSE) from a peer on the path. Importantly, it validates -that the packets received from the migrating endpoint do not carry a spoofed -source address.¶
-Path validation can be used at any time by either endpoint. For instance, an -endpoint might check that a peer is still in possession of its address after a -period of quiescence.¶
Path validation is not designed as a NAT traversal mechanism. Though the mechanism described here might be effective for the creation of NAT bindings that support NAT traversal, the expectation is that one or the other peer is able to receive packets without first having sent a packet on that path. Effective NAT traversal needs additional synchronization mechanisms that are not provided here.¶
An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that are used for path validation with other frames. In particular, an endpoint can pad a packet carrying a PATH_CHALLENGE for PMTU discovery, or it can bundle a PATH_RESPONSE with its own PATH_CHALLENGE.¶
-When probing a new path, an endpoint might want to ensure that its peer has an -unused connection ID available for responses. The endpoint can send -NEW_CONNECTION_ID and PATH_CHALLENGE frames in the same packet. This ensures -that an unused connection ID will be available to the peer when sending a -response.¶
-To initiate path validation, an endpoint sends a PATH_CHALLENGE frame containing -a random payload on the path to be validated.¶
-An endpoint MAY send multiple PATH_CHALLENGE frames to guard against packet -loss. However, an endpoint SHOULD NOT send multiple PATH_CHALLENGE frames in a -single packet. An endpoint SHOULD NOT send a PATH_CHALLENGE more frequently -than it would an Initial packet, ensuring that connection migration is no more -load on a new path than establishing a new connection.¶
-The endpoint MUST use unpredictable data in every PATH_CHALLENGE frame so that -it can associate the peer's response with the corresponding PATH_CHALLENGE.¶
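Tracking outstanding challenge payloads per probed path makes this association straightforward, as in the sketch below (class and method names are assumptions; the 8-byte random payload matches the PATH_CHALLENGE frame format):

```python
import os

# Match each PATH_RESPONSE to the path whose PATH_CHALLENGE it echoes.
class PathValidator:
    def __init__(self):
        self.outstanding = {}  # challenge data -> path being probed

    def send_challenge(self, path):
        data = os.urandom(8)   # unpredictable 8-byte challenge payload
        self.outstanding[data] = path
        return data

    def on_response(self, data):
        # Returns the path this response validates, or None if unmatched.
        return self.outstanding.pop(data, None)
```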
-On receiving a PATH_CHALLENGE frame, an endpoint MUST respond immediately by -echoing the data contained in the PATH_CHALLENGE frame in a PATH_RESPONSE frame.¶
-An endpoint MUST NOT send more than one PATH_RESPONSE frame in response to one -PATH_CHALLENGE frame (see Section 13.3). The peer is -expected to send more PATH_CHALLENGE frames as necessary to evoke additional -PATH_RESPONSE frames.¶
-A new address is considered valid when a PATH_RESPONSE frame is received that -contains the data that was sent in a previous PATH_CHALLENGE. Receipt of an -acknowledgment for a packet containing a PATH_CHALLENGE frame is not adequate -validation, since the acknowledgment can be spoofed by a malicious peer.¶
-Note that receipt on a different local address does not result in path -validation failure, as it might be a result of a forwarded packet (see -Section 9.3.3) or misrouting. It is possible that a valid PATH_RESPONSE -might be received in the future.¶
-Path validation only fails when the endpoint attempting to validate the path -abandons its attempt to validate the path.¶
-Endpoints SHOULD abandon path validation based on a timer. When setting this -timer, implementations are cautioned that the new path could have a longer -round-trip time than the original. A value of three times the larger of the -current Probe Timeout (PTO) or the initial timeout (that is, 2*kInitialRtt) as -defined in [QUIC-RECOVERY] is RECOMMENDED. That is:¶
validation_timeout = max(3*PTO, 6*kInitialRtt)¶
Note that the endpoint might receive packets containing other frames on the new path, but a PATH_RESPONSE frame with appropriate data is required for path validation to succeed.¶
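The recommended abandonment timer can be written out directly. The kInitialRtt constant is defined in [QUIC-RECOVERY]; the 333 ms value used here is an assumption for illustration:

```python
# Recommended path validation timeout: three times the larger of the
# current PTO and the initial timeout (2 * kInitialRtt).
K_INITIAL_RTT = 0.333  # seconds; assumed value of kInitialRtt

def path_validation_timeout(pto):
    return max(3 * pto, 6 * K_INITIAL_RTT)
```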
When an endpoint abandons path validation, it determines that the path is unusable. This does not necessarily imply a failure of the connection; endpoints can continue sending packets over other paths as appropriate. If no paths are available, an endpoint can wait for a new path to become available or close the connection.¶
-A path validation might be abandoned for other reasons besides -failure. Primarily, this happens if a connection migration to a new path is -initiated while a path validation on the old path is in progress.¶
-The use of a connection ID allows connections to survive changes to endpoint -addresses (IP address and port), such as those caused by an -endpoint migrating to a new network. This section describes the process by -which an endpoint migrates to a new address.¶
-The design of QUIC relies on endpoints retaining a stable address for the -duration of the handshake. An endpoint MUST NOT initiate connection migration -before the handshake is confirmed, as defined in section 4.1.2 of [QUIC-TLS].¶
An endpoint also MUST NOT send packets from a different local address, actively initiating migration, if the peer sent the disable_active_migration transport parameter during the handshake. An endpoint that has sent this transport parameter, but detects that a peer has nonetheless migrated to a different network, MUST either drop the incoming packets on that path without generating a stateless reset or proceed with path validation and allow the peer to migrate. Generating a stateless reset or closing the connection would allow third parties in the network to cause connections to close by spoofing or otherwise manipulating observed traffic.¶
Not all changes of peer address are intentional, or active, migrations. The peer could experience NAT rebinding: a change of address due to a middlebox, usually a NAT, allocating a new outgoing port or even a new outgoing IP address for a flow. An endpoint MUST perform path validation (Section 8.2) if it detects any change to a peer's address, unless it has previously validated that address.¶
-When an endpoint has no validated path on which to send packets, it MAY discard -connection state. An endpoint capable of connection migration MAY wait for a -new path to become available before discarding connection state.¶
-This document limits migration of connections to new client addresses, except as -described in Section 9.6. Clients are responsible for initiating all -migrations. Servers do not send non-probing packets (see Section 9.1) toward a -client address until they see a non-probing packet from that address. If a -client receives packets from an unknown server address, the client MUST discard -these packets.¶
An endpoint MAY probe for peer reachability from a new local address using path validation (Section 8.2) prior to migrating the connection to the new local address. Failure of path validation simply means that the new path is not usable for this connection. Failure to validate a path does not cause the connection to end unless there are no valid alternative paths available.¶
-An endpoint uses a new connection ID for probes sent from a new local address; -see Section 9.5 for further discussion. An endpoint that uses -a new local address needs to ensure that at least one new connection ID is -available at the peer. That can be achieved by including a NEW_CONNECTION_ID -frame in the probe.¶
-Receiving a PATH_CHALLENGE frame from a peer indicates that the peer is probing -for reachability on a path. An endpoint sends a PATH_RESPONSE in response as per -Section 8.2.¶
-PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames are -"probing frames", and all other frames are "non-probing frames". A packet -containing only probing frames is a "probing packet", and a packet containing -any other frame is a "non-probing packet".¶
-An endpoint can migrate a connection to a new local address by sending packets -containing non-probing frames from that address.¶
-Each endpoint validates its peer's address during connection establishment. -Therefore, a migrating endpoint can send to its peer knowing that the peer is -willing to receive at the peer's current address. Thus an endpoint can migrate -to a new local address without first validating the peer's address.¶
-When migrating, the new path might not support the endpoint's current sending -rate. Therefore, the endpoint resets its congestion controller, as described in -Section 9.4.¶
-The new path might not have the same ECN capability. Therefore, the endpoint -verifies ECN capability as described in Section 13.4.¶
Receiving acknowledgments for data sent on the new path serves as proof of the peer's reachability from the new address. Note that since acknowledgments may be received on any path, return reachability on the new path is not established. To establish return reachability on the new path, an endpoint MAY concurrently initiate path validation (Section 8.2) on the new path.¶
-Receiving a packet from a new peer address containing a non-probing frame -indicates that the peer has migrated to that address.¶
-In response to such a packet, an endpoint MUST start sending subsequent packets -to the new peer address and MUST initiate path validation (Section 8.2) -to verify the peer's ownership of the unvalidated address.¶
-An endpoint MAY send data to an unvalidated peer address, but it MUST protect -against potential attacks as described in Section 9.3.1 and -Section 9.3.2. An endpoint MAY skip validation of a peer address if that -address has been seen recently. In particular, if an endpoint returns to a -previously-validated path after detecting some form of spurious migration, -skipping address validation and restoring loss detection and congestion state -can reduce the performance impact of the attack.¶
-An endpoint only changes the address that it sends packets to in response to the -highest-numbered non-probing packet. This ensures that an endpoint does not send -packets to an old peer address in the case that it receives reordered packets.¶
-After changing the address to which it sends non-probing packets, an endpoint -could abandon any path validation for other addresses.¶
-Receiving a packet from a new peer address might be the result of a NAT -rebinding at the peer.¶
-After verifying a new client address, the server SHOULD send new address -validation tokens (Section 8) to the client.¶
-It is possible that a peer is spoofing its source address to cause an endpoint -to send excessive amounts of data to an unwilling host. If the endpoint sends -significantly more data than the spoofing peer, connection migration might be -used to amplify the volume of data that an attacker can generate toward a -victim.¶
-As described in Section 9.3, an endpoint is required to validate a -peer's new address to confirm the peer's possession of the new address. Until a -peer's address is deemed valid, an endpoint MUST limit the rate at which it -sends data to this address. The endpoint MUST NOT send more than a minimum -congestion window's worth of data per estimated round-trip time (kMinimumWindow, -as defined in [QUIC-RECOVERY]). In the absence of this limit, an endpoint -risks being used for a denial of service attack against an unsuspecting victim. -Note that since the endpoint will not have any round-trip time measurements to -this address, the estimate SHOULD be the default initial value (see -[QUIC-RECOVERY]).¶
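The pre-validation rate limit above can be sketched as a per-RTT byte budget. kMinimumWindow and the default initial RTT are defined in [QUIC-RECOVERY]; the concrete values below are assumptions for illustration:

```python
# At most one minimum congestion window of data per estimated round-trip
# time toward an unvalidated peer address.
K_MINIMUM_WINDOW = 2 * 1200  # illustrative: 2 * max_datagram_size
K_INITIAL_RTT = 0.333        # seconds; no RTT sample exists for the address

class UnvalidatedAddressLimiter:
    def __init__(self, now):
        self.window_start = now
        self.sent_in_window = 0

    def can_send(self, size, now):
        if now - self.window_start >= K_INITIAL_RTT:
            self.window_start = now     # a new RTT window has begun
            self.sent_in_window = 0
        return self.sent_in_window + size <= K_MINIMUM_WINDOW

    def on_send(self, size):
        self.sent_in_window += size
```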
-If an endpoint skips validation of a peer address as described in -Section 9.3, it does not need to limit its sending rate.¶
-An on-path attacker could cause a spurious connection migration by copying and -forwarding a packet with a spoofed address such that it arrives before the -original packet. The packet with the spoofed address will be seen to come from -a migrating connection, and the original packet will be seen as a duplicate and -dropped. After a spurious migration, validation of the source address will fail -because the entity at the source address does not have the necessary -cryptographic keys to read or respond to the PATH_CHALLENGE frame that is sent -to it even if it wanted to.¶
-To protect the connection from failing due to such a spurious migration, an -endpoint MUST revert to using the last validated peer address when validation of -a new peer address fails.¶
-If an endpoint has no state about the last validated peer address, it MUST close -the connection silently by discarding all connection state. This results in new -packets on the connection being handled generically. For instance, an endpoint -MAY send a stateless reset in response to any further incoming packets.¶
-Note that receipt of packets with higher packet numbers from the legitimate peer -address will trigger another connection migration. This will cause the -validation of the address of the spurious migration to be abandoned.¶
-An off-path attacker that can observe packets might forward copies of genuine -packets to endpoints. If the copied packet arrives before the genuine packet, -this will appear as a NAT rebinding. Any genuine packet will be discarded as a -duplicate. If the attacker is able to continue forwarding packets, it might be -able to cause migration to a path via the attacker. This places the attacker on -path, giving it the ability to observe or drop all subsequent packets.¶
-Unlike the attack described in Section 9.3.2, the attacker can ensure -that the new path is successfully validated.¶
-This style of attack relies on the attacker using a path that is approximately -as fast as the direct path between endpoints. The attack is more reliable if -relatively few packets are sent or if packet loss coincides with the attempted -attack.¶
-A non-probing packet received on the original path that increases the maximum -received packet number will cause the endpoint to move back to that path. -Eliciting packets on this path increases the likelihood that the attack is -unsuccessful. Therefore, mitigation of this attack relies on triggering the -exchange of packets.¶
-In response to an apparent migration, endpoints MUST validate the previously -active path using a PATH_CHALLENGE frame. This induces the sending of new -packets on that path. If the path is no longer viable, the validation attempt -will time out and fail; if the path is viable, but no longer desired, the -validation will succeed, but only results in probing packets being sent on the -path.¶
-An endpoint that receives a PATH_CHALLENGE on an active path SHOULD send a -non-probing packet in response. If the non-probing packet arrives before any -copy made by an attacker, this results in the connection being migrated back to -the original path. Any subsequent migration to another path restarts this -entire process.¶
This defense is imperfect, but this is not considered a serious problem. If the path via the attacker is reliably faster than the original path despite multiple attempts to use that original path, it is not possible to distinguish between an attack and an improvement in routing.¶
An endpoint could also use heuristics to improve detection of this style of attack. For instance, NAT rebinding is improbable if packets were recently received on the old path; similarly, rebinding is rare on IPv6 paths. Endpoints can also look for duplicated packets. Conversely, a change in connection ID is more likely to indicate an intentional migration than an attack.¶
-The capacity available on the new path might not be the same as the old path. -Packets sent on the old path MUST NOT contribute to congestion control or RTT -estimation for the new path.¶
-On confirming a peer's ownership of its new address, an endpoint MUST -immediately reset the congestion controller and round-trip time estimator for -the new path to initial values (see Sections A.3 and B.3 in [QUIC-RECOVERY]) -unless it has knowledge that a previous send rate or round-trip time estimate is -valid for the new path. For instance, an endpoint might infer that a change in -only the client's port number is indicative of a NAT rebinding, meaning that the -new path is likely to have similar bandwidth and round-trip time. However, this -determination will be imperfect. If the determination is incorrect, the -congestion controller and the RTT estimator are expected to adapt to the new -path. Generally, implementations are advised to be cautious when using previous -values on a new path.¶
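A minimal sketch of this reset, assuming per-path state held in a dict (the field names and constants are illustrative; the initial values come from [QUIC-RECOVERY]):

```python
K_INITIAL_RTT = 0.333          # seconds; assumed initial RTT value
K_INITIAL_WINDOW = 10 * 1200   # illustrative initial congestion window

def reset_path_state(path, keep_estimates=False):
    """Reset congestion and RTT state for a newly confirmed path, unless
    the old estimates are known to carry over (e.g., a port-only rebinding)."""
    if not keep_estimates:
        path["cwnd"] = K_INITIAL_WINDOW
        path["smoothed_rtt"] = K_INITIAL_RTT
        path["rtt_samples"] = []
    return path
```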
-There may be apparent reordering at the receiver when an endpoint sends data and -probes from/to multiple addresses during the migration period, since the two -resulting paths may have different round-trip times. A receiver of packets on -multiple paths will still send ACK frames covering all received packets.¶
-While multiple paths might be used during connection migration, a single -congestion control context and a single loss recovery context (as described in -[QUIC-RECOVERY]) may be adequate. For instance, an endpoint might delay -switching to a new congestion control context until it is confirmed that an old -path is no longer needed (such as the case in Section 9.3.3).¶
-A sender can make exceptions for probe packets so that their loss detection is -independent and does not unduly cause the congestion controller to reduce its -sending rate. An endpoint might set a separate timer when a PATH_CHALLENGE is -sent, which is cancelled if the corresponding PATH_RESPONSE is received. If -the timer fires before the PATH_RESPONSE is received, the endpoint might send a -new PATH_CHALLENGE, and restart the timer for a longer period of time. -This timer SHOULD be set as described in Section 5.3 of [QUIC-RECOVERY] and -MUST NOT be more aggressive.¶
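The PATH_CHALLENGE timer described above can be sketched as follows. This is a minimal illustration, not normative: the class and method names are invented, and the doubling backoff is one policy consistent with the requirement that the timer never be more aggressive than the PTO-based value from Section 5.3 of [QUIC-RECOVERY].

```python
import os

class PathValidator:
    """Hypothetical sketch of the PATH_CHALLENGE retry timer described above."""

    def __init__(self, pto: float):
        # The timer starts from the PTO-derived value and MUST NOT be
        # more aggressive than it (Section 5.3 of [QUIC-RECOVERY]).
        self.timeout = pto
        self.challenge = None

    def send_challenge(self) -> bytes:
        # PATH_CHALLENGE carries 8 bytes of unpredictable data.
        self.challenge = os.urandom(8)
        return self.challenge

    def on_timer(self) -> bytes:
        # Timer fired before a PATH_RESPONSE arrived: send a new
        # challenge and back off the timer (doubling is an assumption).
        self.timeout *= 2
        return self.send_challenge()

    def on_response(self, data: bytes) -> bool:
        # The path is validated only if the response echoes the
        # most recently sent challenge data.
        return self.challenge is not None and data == self.challenge
```

The corresponding PATH_RESPONSE is cancelled out of band by simply discarding the validator once `on_response` returns true.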
-Using a stable connection ID on multiple network paths allows a passive observer -to correlate activity between those paths. An endpoint that moves between -networks might not wish to have their activity correlated by any entity other -than their peer, so different connection IDs are used when sending from -different local addresses, as discussed in Section 5.1. For this to be -effective endpoints need to ensure that connection IDs they provide cannot be -linked by any other entity.¶
-At any time, endpoints MAY change the Destination Connection ID they send to a -value that has not been used on another path.¶
-An endpoint MUST use a new connection ID if it initiates connection migration as -described in Section 9.2 or probes a new network path as described -in Section 9.1. An endpoint MUST use a new connection ID in response to a -change in the address of a peer if the packet with the new peer address uses an -active connection ID that has not been previously used by the peer.¶
-Using different connection IDs for packets sent in both directions on each new -network path eliminates the use of the connection ID for linking packets from -the same connection across different network paths. Header protection ensures -that packet numbers cannot be used to correlate activity. This does not prevent -other properties of packets, such as timing and size, from being used to -correlate activity.¶
-Unintentional changes in path without a change in connection ID are possible. -For example, after a period of network inactivity, NAT rebinding might cause -packets to be sent on a new path when the client resumes sending.¶
-A client might wish to reduce linkability by employing a new connection ID and source UDP port when sending traffic after a period of inactivity. Changing the UDP port from which it sends packets at the same time causes the traffic to appear to the peer as a connection migration. This ensures that the mechanisms that support migration are exercised even for clients that do not experience NAT rebindings or genuine migrations. Changing the port number can cause a peer to reset its congestion state (see Section 9.4), so the port SHOULD only be changed infrequently.¶
-An endpoint that exhausts available connection IDs cannot probe new paths or -initiate migration, nor can it respond to probes or attempts by its peer to -migrate. To ensure that migration is possible and packets sent on different -paths cannot be correlated, endpoints SHOULD provide new connection IDs before -peers migrate; see Section 5.1.1. If a peer might have exhausted available -connection IDs, a migrating endpoint could include a NEW_CONNECTION_ID frame in -all packets sent on a new network path.¶
-QUIC allows servers to accept connections on one IP address and attempt to -transfer these connections to a more preferred address shortly after the -handshake. This is particularly useful when clients initially connect to an -address shared by multiple servers but would prefer to use a unicast address to -ensure connection stability. This section describes the protocol for migrating a -connection to a preferred server address.¶
-Migrating a connection to a new server address mid-connection is left for future -work. If a client receives packets from a new server address not indicated by -the preferred_address transport parameter, the client SHOULD discard these -packets.¶
-A server conveys a preferred address by including the preferred_address -transport parameter in the TLS handshake.¶
-Servers MAY communicate a preferred address for each address family (IPv4 and IPv6) to allow clients to pick the one most suited to their network attachment.¶
-Once the handshake is finished, the client SHOULD select one of the server's preferred addresses and initiate path validation (see Section 8.2) of that address using the connection ID provided in the preferred_address transport parameter.¶
-If path validation succeeds, the client SHOULD immediately begin sending all -future packets to the new server address using the new connection ID and -discontinue use of the old server address. If path validation fails, the client -MUST continue sending all future packets to the server's original IP address.¶
-A server might receive a packet addressed to its preferred IP address at any -time after it accepts a connection. If this packet contains a PATH_CHALLENGE -frame, the server sends a PATH_RESPONSE frame as per Section 8.2. The -server MUST send other non-probing frames from its original address until it -receives a non-probing packet from the client at its preferred address and until -the server has validated the new path.¶
-The server MUST probe on the path toward the client from its preferred address. -This helps to guard against spurious migration initiated by an attacker.¶
-Once the server has completed its path validation and has received a non-probing -packet with a new largest packet number on its preferred address, the server -begins sending non-probing packets to the client exclusively from its preferred -IP address. It SHOULD drop packets for this connection received on the old IP -address, but MAY continue to process delayed packets.¶
-A client might need to perform a connection migration before it has migrated to -the server's preferred address. In this case, the client SHOULD perform path -validation to both the original and preferred server address from the client's -new address concurrently.¶
-If path validation of the server's preferred address succeeds, the client MUST -abandon validation of the original address and migrate to using the server's -preferred address. If path validation of the server's preferred address fails -but validation of the server's original address succeeds, the client MAY migrate -to its new address and continue sending to the server's original address.¶
-If the connection to the server's preferred address is not from the same client -address, the server MUST protect against potential attacks as described in -Section 9.3.1 and Section 9.3.2. In addition to intentional -simultaneous migration, this might also occur because the client's access -network used a different NAT binding for the server's preferred address.¶
-Servers SHOULD initiate path validation to the client's new address upon -receiving a probe packet from a different address. Servers MUST NOT send more -than a minimum congestion window's worth of non-probing packets to the new -address before path validation is complete.¶
-A client that migrates to a new address SHOULD use a preferred address from the -same address family for the server.¶
-Endpoints that send data using IPv6 SHOULD apply an IPv6 flow label -in compliance with [RFC6437], unless the local API does not allow -setting IPv6 flow labels.¶
-The IPv6 flow label SHOULD be a pseudo-random function of the source -and destination addresses, source and destination UDP ports, and the destination -CID. The flow label generation MUST be designed to minimize the chances of -linkability with a previously used flow label, as this would enable correlating -activity on multiple paths (see Section 9.5).¶
-A possible implementation is to compute the flow label as a cryptographic hash -function of the source and destination addresses, source and destination -UDP ports, destination CID, and a local secret.¶
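The possible implementation above can be sketched as follows. The choice of HMAC-SHA256 and the exact byte encoding of the inputs are assumptions for illustration; any keyed, cryptographically secure pseudo-random function over the same inputs satisfies the text.

```python
import hmac
import hashlib

def flow_label(secret: bytes, src_addr: bytes, dst_addr: bytes,
               src_port: int, dst_port: int, dcid: bytes) -> int:
    """Sketch of the hash-based flow label generation described above.
    The input encoding and hash choice are illustrative assumptions."""
    msg = (src_addr + dst_addr
           + src_port.to_bytes(2, "big")
           + dst_port.to_bytes(2, "big")
           + dcid)
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    # An IPv6 flow label is 20 bits wide (RFC 6437).
    return int.from_bytes(digest[:3], "big") & 0xFFFFF
```

Because a local secret is mixed in, an observer cannot link flow labels across paths, while the same 5-tuple and destination CID always yield the same label.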
-An established QUIC connection can be terminated in one of three ways:¶
-An endpoint MAY discard connection state if it does not have a validated path on -which it can send packets (see Section 8.2).¶
-The closing and draining connection states exist to ensure that connections -close cleanly and that delayed or reordered packets are properly discarded. -These states SHOULD persist for at least three times the current Probe Timeout -(PTO) interval as defined in [QUIC-RECOVERY].¶
-An endpoint enters a closing period after initiating an immediate close -(Section 10.3). While closing, an endpoint MUST NOT send packets unless -they contain a CONNECTION_CLOSE frame (see Section 10.3 for details). An -endpoint retains only enough information to generate a packet containing a -CONNECTION_CLOSE frame and to identify packets as belonging to the connection. -The endpoint's selected connection ID and the QUIC version are sufficient -information to identify packets for a closing connection; an endpoint can -discard all other connection state. An endpoint MAY retain packet protection -keys for incoming packets to allow it to read and process a CONNECTION_CLOSE -frame.¶
-The draining state is entered once an endpoint receives a signal that its peer -is closing or draining. While otherwise identical to the closing state, an -endpoint in the draining state MUST NOT send any packets. Retaining packet -protection keys is unnecessary once a connection is in the draining state.¶
-An endpoint MAY transition from the closing period to the draining period if it -receives a CONNECTION_CLOSE frame or stateless reset, both of which indicate -that the peer is also closing or draining. The draining period SHOULD end when -the closing period would have ended. In other words, the endpoint can use the -same end time, but cease retransmission of the closing packet.¶
-Disposing of connection state prior to the end of the closing or draining period -could cause delayed or reordered packets to generate an unnecessary stateless -reset. Endpoints that have some alternative means to ensure that late-arriving -packets on the connection do not induce a response, such as those that are able -to close the UDP socket, MAY use an abbreviated draining period which can allow -for faster resource recovery. Servers that retain an open socket for accepting -new connections SHOULD NOT exit the closing or draining period early.¶
-Once the closing or draining period has ended, an endpoint SHOULD discard all -connection state. This results in new packets on the connection being handled -generically. For instance, an endpoint MAY send a stateless reset in response -to any further incoming packets.¶
-The draining and closing periods do not apply when a stateless reset -(Section 10.4) is sent.¶
-An endpoint is not expected to handle key updates when it is closing or -draining. A key update might prevent the endpoint from moving from the closing -state to draining, but it otherwise has no impact.¶
-While in the closing period, an endpoint could receive packets from a new source -address, indicating a connection migration (Section 9). An endpoint in the -closing state MUST strictly limit the number of packets it sends to this new -address until the address is validated (see Section 8.2). A server in -the closing state MAY instead choose to discard packets received from a new -source address.¶
-If a max_idle_timeout is specified by either peer in its transport parameters (Section 18.2), the connection is silently closed and its state is discarded when it remains idle for longer than the minimum of both peers' max_idle_timeout values and three times the current Probe Timeout (PTO).¶
-Each endpoint advertises a max_idle_timeout, but the effective value -at an endpoint is computed as the minimum of the two advertised values. By -announcing a max_idle_timeout, an endpoint commits to initiating an immediate -close (Section 10.3) if it abandons the connection prior to the effective -value.¶
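The computation of the effective value can be sketched as follows. The treatment of an absent timeout as the value 0 follows the transport parameter encoding, where 0 (or omission) disables the idle timeout; the function name is an assumption.

```python
from typing import Optional

def effective_idle_timeout(local_ms: int, peer_ms: int) -> Optional[int]:
    """Effective idle timeout as described above: the minimum of the
    advertised values. A value of 0 means the endpoint did not enable
    an idle timeout; if neither peer enabled one, there is none."""
    advertised = [t for t in (local_ms, peer_ms) if t > 0]
    return min(advertised) if advertised else None
```

An endpoint that intends to abandon the connection sooner than this value commits to sending a CONNECTION_CLOSE rather than going silent.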
-An endpoint restarts its idle timer when a packet from its peer is received and -processed successfully. An endpoint also restarts its idle timer when sending an -ack-eliciting packet if no other ack-eliciting packets have been sent since last -receiving and processing a packet. Restarting this timer when sending a packet -ensures that connections are not closed after new activity is initiated.¶
-An endpoint might need to send ack-eliciting packets to avoid an idle timeout -if it is expecting response data, but does not have or is unable to send -application data.¶
-An endpoint that sends packets close to the effective timeout risks having -them be discarded at the peer, since the peer might enter its draining state -before these packets arrive. An endpoint can send a PING or another -ack-eliciting frame to test the connection for liveness if the peer could -time out soon, such as within a PTO; see Section 6.6 of [QUIC-RECOVERY]. -This is especially useful if any available application data cannot be safely -retried. Note that the application determines what data is safe to retry.¶
-An endpoint sends a CONNECTION_CLOSE frame (Section 19.19) to -terminate the connection immediately. A CONNECTION_CLOSE frame causes all -streams to immediately become closed; open streams can be assumed to be -implicitly reset.¶
-After sending a CONNECTION_CLOSE frame, an endpoint immediately enters the -closing state.¶
-During the closing period, an endpoint that sends a CONNECTION_CLOSE frame -SHOULD respond to any incoming packet that can be decrypted with another packet -containing a CONNECTION_CLOSE frame. Such an endpoint SHOULD limit the number -of packets it generates containing a CONNECTION_CLOSE frame. For instance, an -endpoint could wait for a progressively increasing number of received packets or -amount of time before responding to a received packet.¶
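The progressively increasing packet threshold suggested above can be sketched as follows; the doubling policy and the names are illustrative assumptions, not requirements of the specification.

```python
class ClosingResponder:
    """Sketch of rate-limiting CONNECTION_CLOSE responses during the
    closing period: respond only after a progressively increasing
    number of received packets (doubling policy is an assumption)."""

    def __init__(self):
        self.received = 0
        self.next_response_at = 1

    def on_packet(self) -> bool:
        # Returns True when a packet containing a CONNECTION_CLOSE
        # frame should be (re)sent in response.
        self.received += 1
        if self.received >= self.next_response_at:
            self.next_response_at *= 2
            return True
        return False
```

With this policy the endpoint responds to the 1st, 2nd, 4th, 8th, ... received packet, so the cost of servicing a closing connection stays logarithmic in the number of incoming packets.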
-An endpoint is allowed to drop the packet protection keys when entering the closing period (Section 10.1) and send a packet containing a CONNECTION_CLOSE frame in response to any UDP datagram that is received. However, an endpoint without the packet protection keys cannot identify and discard invalid packets. To avoid creating an unwitting amplification attack, such endpoints MUST reduce the frequency with which they send packets containing a CONNECTION_CLOSE frame. To minimize the state that an endpoint maintains for a closing connection, endpoints MAY send the exact same packet.¶
-New packets from unverified addresses could be used to create an amplification -attack (see Section 8). To avoid this, endpoints MUST either limit -transmission of CONNECTION_CLOSE frames to validated addresses or drop packets -without response if the response would be more than three times larger than the -received packet.¶
-After receiving a CONNECTION_CLOSE frame, endpoints enter the draining state. An endpoint that receives a CONNECTION_CLOSE frame MAY send a single packet containing a CONNECTION_CLOSE frame before entering the draining state, using a NO_ERROR code if appropriate. An endpoint MUST NOT send further packets; doing so could result in a constant exchange of CONNECTION_CLOSE frames until the closing period on either peer ends.¶
-An immediate close can be used after an application protocol has arranged to close a connection. This might be after the application protocol negotiates a graceful shutdown. The application protocol exchanges whatever messages are needed to cause both endpoints to agree to close the connection, after which the application requests that the connection be closed. The application protocol can use a CONNECTION_CLOSE frame with an appropriate error code to signal closure.¶
-When sending CONNECTION_CLOSE, the goal is to ensure that the peer will process -the frame. Generally, this means sending the frame in a packet with the highest -level of packet protection to avoid the packet being discarded. After the -handshake is confirmed (see Section 4.1.2 of [QUIC-TLS]), an endpoint MUST -send any CONNECTION_CLOSE frames in a 1-RTT packet. However, prior to -confirming the handshake, it is possible that more advanced packet protection -keys are not available to the peer, so another CONNECTION_CLOSE frame MAY be -sent in a packet that uses a lower packet protection level. More specifically:¶
-Sending a CONNECTION_CLOSE of type 0x1d in an Initial or Handshake packet could -expose application state or be used to alter application state. A -CONNECTION_CLOSE of type 0x1d MUST be replaced by a CONNECTION_CLOSE of type -0x1c when sending the frame in Initial or Handshake packets. Otherwise, -information about the application state might be revealed. Endpoints MUST clear -the value of the Reason Phrase field and SHOULD use the APPLICATION_ERROR code -when converting to a CONNECTION_CLOSE of type 0x1c.¶
-CONNECTION_CLOSE frames sent in multiple packet types can be coalesced into a -single UDP datagram; see Section 12.2.¶
-An endpoint might send a CONNECTION_CLOSE frame in an Initial packet or in -response to unauthenticated information received in Initial or Handshake -packets. Such an immediate close might expose legitimate connections to a -denial of service. QUIC does not include defensive measures for on-path attacks -during the handshake; see Section 21.1. However, at the cost of reducing -feedback about errors for legitimate peers, some forms of denial of service can -be made more difficult for an attacker if endpoints discard illegal packets -rather than terminating a connection with CONNECTION_CLOSE. For this reason, -endpoints MAY discard packets rather than immediately close if errors are -detected in packets that lack authentication.¶
-An endpoint that has not established state, such as a server that detects an -error in an Initial packet, does not enter the closing state. An endpoint that -has no state for the connection does not enter a closing or draining period on -sending a CONNECTION_CLOSE frame.¶
-A stateless reset is provided as an option of last resort for an endpoint that -does not have access to the state of a connection. A crash or outage might -result in peers continuing to send data to an endpoint that is unable to -properly continue the connection. An endpoint MAY send a stateless reset in -response to receiving a packet that it cannot associate with an active -connection.¶
-A stateless reset is not appropriate for signaling error conditions. An -endpoint that wishes to communicate a fatal connection error MUST use a -CONNECTION_CLOSE frame if it has sufficient state to do so.¶
-To support this process, a token is sent by endpoints. The token is carried in the Stateless Reset Token field of a NEW_CONNECTION_ID frame. Servers can also specify a stateless_reset_token transport parameter during the handshake that applies to the connection ID that it selected during the handshake; clients cannot use this transport parameter because their transport parameters don't have confidentiality protection. These tokens are protected by encryption, so only the client and server know their value. Tokens are invalidated when their associated connection ID is retired via a RETIRE_CONNECTION_ID frame (Section 19.16).¶
-An endpoint that receives packets that it cannot process sends a packet in the -following layout:¶
-This design ensures that a stateless reset packet is, to the extent possible, indistinguishable from a regular packet with a short header.¶
-A stateless reset uses an entire UDP datagram, starting with the first two bits of the packet header. The remainder of the first byte, and an arbitrary number of bytes following it, are set to unpredictable values. The last 16 bytes of the datagram contain a Stateless Reset Token.¶
-To entities other than its intended recipient, a stateless reset will appear to -be a packet with a short header. For the stateless reset to appear as a valid -QUIC packet, the Unpredictable Bits field needs to include at least 38 bits of -data (or 5 bytes, less the two fixed bits).¶
-A minimum size of 21 bytes does not guarantee that a stateless reset is difficult to distinguish from other packets if the recipient requires the use of a connection ID. To prevent a resulting stateless reset from being trivially distinguishable from a valid packet, all packets sent by an endpoint SHOULD be padded to at least 22 bytes longer than the minimum connection ID length that the endpoint might use. An endpoint that sends a stateless reset in response to a packet that is 43 bytes or shorter SHOULD send a stateless reset that is one byte shorter than the packet it responds to.¶
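The sizing and layout rules can be sketched as follows, assuming a 16-byte token and the 21-byte minimum from this section. The fixed 43-byte cap for larger triggers is an illustrative choice, not part of the specification.

```python
import os
from typing import Optional

MIN_RESET_LEN = 21  # 1 header byte + 4 unpredictable bytes + 16-byte token

def build_stateless_reset(token: bytes, trigger_len: int) -> Optional[bytes]:
    """Sketch of stateless reset construction per the rules above.
    Returns None when the triggering packet is too small to answer
    without producing an implausibly short (and loop-prone) reset."""
    assert len(token) == 16
    if trigger_len <= 43:
        length = trigger_len - 1   # one byte shorter than the trigger
    else:
        length = 43                # illustrative fixed size above the threshold
    if length < MIN_RESET_LEN:
        return None
    payload = bytearray(os.urandom(length))      # unpredictable filler
    payload[0] = (payload[0] & 0x3F) | 0x40      # 0b01xxxxxx: short header form
    payload[-16:] = token                        # trailing Stateless Reset Token
    return bytes(payload)
```

Because the reset is always shorter than the packet that triggered it, an exchange of mutual stateless resets shrinks each round and eventually stops.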
-These values assume that the Stateless Reset Token is the same as the minimum -expansion of the packet protection AEAD. Additional unpredictable bytes are -necessary if the endpoint could have negotiated a packet protection scheme with -a larger minimum expansion.¶
-An endpoint MUST NOT send a stateless reset that is three or more times larger than the packet it receives, to avoid being used for amplification. Section 10.4.3 describes additional limits on stateless reset size.¶
-Endpoints MUST discard packets that are too small to be valid QUIC packets. -With the set of AEAD functions defined in [QUIC-TLS], packets that are smaller -than 21 bytes are never valid.¶
-Endpoints MUST send stateless reset packets formatted as a packet with a short -header. However, endpoints MUST treat any packet ending in a valid stateless -reset token as a stateless reset, as other QUIC versions might allow the use of -a long header.¶
-An endpoint MAY send a stateless reset in response to a packet with a long -header. Sending a stateless reset is not effective prior to the stateless reset -token being available to a peer. In this QUIC version, packets with a long -header are only used during connection establishment. Because the stateless -reset token is not available until connection establishment is complete or near -completion, ignoring an unknown packet with a long header might be as effective -as sending a stateless reset.¶
-An endpoint cannot determine the Source Connection ID from a packet with a short -header, therefore it cannot set the Destination Connection ID in the stateless -reset packet. The Destination Connection ID will therefore differ from the -value used in previous packets. A random Destination Connection ID makes the -connection ID appear to be the result of moving to a new connection ID that was -provided using a NEW_CONNECTION_ID frame (Section 19.15).¶
-Using a randomized connection ID results in two problems:¶
-This stateless reset design is specific to QUIC version 1. An endpoint that -supports multiple versions of QUIC needs to generate a stateless reset that will -be accepted by peers that support any version that the endpoint might support -(or might have supported prior to losing state). Designers of new versions of -QUIC need to be aware of this and either reuse this design, or use a portion of -the packet other than the last 16 bytes for carrying data.¶
-An endpoint detects a potential stateless reset using the trailing 16 bytes of -the UDP datagram. An endpoint remembers all Stateless Reset Tokens associated -with the connection IDs and remote addresses for datagrams it has recently sent. -This includes Stateless Reset Tokens from NEW_CONNECTION_ID frames and the -server's transport parameters but excludes Stateless Reset Tokens associated -with connection IDs that are either unused or retired. The endpoint identifies -a received datagram as a stateless reset by comparing the last 16 bytes of the -datagram with all Stateless Reset Tokens associated with the remote address on -which the datagram was received.¶
-This comparison can be performed for every inbound datagram. Endpoints MAY skip -this check if any packet from a datagram is successfully processed. However, -the comparison MUST be performed when the first packet in an incoming datagram -either cannot be associated with a connection, or cannot be decrypted.¶
-An endpoint MUST NOT check for any Stateless Reset Tokens associated with -connection IDs it has not used or for connection IDs that have been retired.¶
-When comparing a datagram to Stateless Reset Token values, endpoints MUST -perform the comparison without leaking information about the value of the token. -For example, performing this comparison in constant time protects the value of -individual Stateless Reset Tokens from information leakage through timing side -channels. Another approach would be to store and compare the transformed values -of Stateless Reset Tokens instead of the raw token values, where the -transformation is defined as a cryptographically-secure pseudo-random function -using a secret key (e.g., block cipher, HMAC [RFC2104]). An endpoint is not -expected to protect information about whether a packet was successfully -decrypted, or the number of valid Stateless Reset Tokens.¶
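The detection step can be sketched as follows. `hmac.compare_digest` provides the constant-time comparison the text calls for; the 21-byte minimum and the per-address token list come from the surrounding text, while the function name is an assumption.

```python
import hmac

def is_stateless_reset(datagram: bytes, tokens_for_addr: list) -> bool:
    """Sketch of stateless reset detection: compare the trailing
    16 bytes of the datagram against every Stateless Reset Token
    associated with the remote address, in constant time per token."""
    if len(datagram) < 21:
        # Too small to be a valid QUIC packet, hence not a reset.
        return False
    tail = datagram[-16:]
    # compare_digest avoids leaking token bytes via timing. Note the
    # text does not require hiding *how many* tokens were checked.
    return any(hmac.compare_digest(tail, t) for t in tokens_for_addr)
```

On a match, the endpoint enters the draining period and stops sending on the connection, as required by this section.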
-If the last 16 bytes of the datagram are identical in value to a Stateless Reset -Token, the endpoint MUST enter the draining period and not send any further -packets on this connection.¶
-The stateless reset token MUST be difficult to guess. In order to create a -Stateless Reset Token, an endpoint could randomly generate [RFC4086] a secret -for every connection that it creates. However, this presents a coordination -problem when there are multiple instances in a cluster or a storage problem for -an endpoint that might lose state. Stateless reset specifically exists to -handle the case where state is lost, so this approach is suboptimal.¶
-A single static key can be used across all connections to the same endpoint by -generating the proof using a second iteration of a preimage-resistant function -that takes a static key and the connection ID chosen by the endpoint (see -Section 5.1) as input. An endpoint could use HMAC [RFC2104] (for -example, HMAC(static_key, connection_id)) or HKDF [RFC5869] (for example, -using the static key as input keying material, with the connection ID as salt). -The output of this function is truncated to 16 bytes to produce the Stateless -Reset Token for that connection.¶
-An endpoint that loses state can use the same method to generate a valid -Stateless Reset Token. The connection ID comes from the packet that the -endpoint receives.¶
-This design relies on the peer always sending a connection ID in its packets so -that the endpoint can use the connection ID from a packet to reset the -connection. An endpoint that uses this design MUST either use the same -connection ID length for all connections or encode the length of the connection -ID such that it can be recovered without state. In addition, it cannot provide -a zero-length connection ID.¶
-Revealing the Stateless Reset Token allows any entity to terminate the -connection, so a value can only be used once. This method for choosing the -Stateless Reset Token means that the combination of connection ID and static key -MUST NOT be used for another connection. A denial of service attack is possible -if the same connection ID is used by instances that share a static key, or if an -attacker can cause a packet to be routed to an instance that has no state but -the same static key; see Section 21.9. A connection ID from a connection -that is reset by revealing the Stateless Reset Token MUST NOT be reused for new -connections at nodes that share a static key.¶
-The same Stateless Reset Token MUST NOT be used for multiple connection IDs. -Endpoints are not required to compare new values against all previous values, -but a duplicate value MAY be treated as a connection error of type -PROTOCOL_VIOLATION.¶
-Note that Stateless Reset packets do not have any cryptographic protection.¶
-The design of a Stateless Reset is such that without knowing the stateless reset -token it is indistinguishable from a valid packet. For instance, if a server -sends a Stateless Reset to another server it might receive another Stateless -Reset in response, which could lead to an infinite exchange.¶
-An endpoint MUST ensure that every Stateless Reset that it sends is smaller than -the packet which triggered it, unless it maintains state sufficient to prevent -looping. In the event of a loop, this results in packets eventually being too -small to trigger a response.¶
-An endpoint can remember the number of Stateless Reset packets that it has sent -and stop generating new Stateless Reset packets once a limit is reached. Using -separate limits for different remote addresses will ensure that Stateless Reset -packets can be used to close connections when other peers or connections have -exhausted limits.¶
-Reducing the size of a Stateless Reset below 41 bytes means that the packet -could reveal to an observer that it is a Stateless Reset, depending upon the -length of the peer's connection IDs. Conversely, refusing to send a Stateless -Reset in response to a small packet might result in Stateless Reset not being -useful in detecting cases of broken connections where only very small packets -are sent; such failures might only be detected by other means, such as timers.¶
-An endpoint that detects an error SHOULD signal the existence of that error to -its peer. Both transport-level and application-level errors can affect an -entire connection (see Section 11.1), while only application-level -errors can be isolated to a single stream (see Section 11.2).¶
-The most appropriate error code (Section 20) SHOULD be included in the -frame that signals the error. Where this specification identifies error -conditions, it also identifies the error code that is used; though these are -worded as requirements, different implementation strategies might lead to -different errors being reported. In particular, an endpoint MAY use any -applicable error code when it detects an error condition; a generic error code -(such as PROTOCOL_VIOLATION or INTERNAL_ERROR) can always be used in place of -specific error codes.¶
-A stateless reset (Section 10.4) is not suitable for any error that can -be signaled with a CONNECTION_CLOSE or RESET_STREAM frame. A stateless reset -MUST NOT be used by an endpoint that has the state necessary to send a frame on -the connection.¶
-Errors that result in the connection being unusable, such as an obvious -violation of protocol semantics or corruption of state that affects an entire -connection, MUST be signaled using a CONNECTION_CLOSE frame -(Section 19.19). An endpoint MAY close the connection in this -manner even if the error only affects a single stream.¶
-Application protocols can signal application-specific protocol errors using the -application-specific variant of the CONNECTION_CLOSE frame. Errors that are -specific to the transport, including all those described in this document, are -carried in the QUIC-specific variant of the CONNECTION_CLOSE frame.¶
-A CONNECTION_CLOSE frame could be sent in a packet that is lost. An endpoint -SHOULD be prepared to retransmit a packet containing a CONNECTION_CLOSE frame if -it receives more packets on a terminated connection. Limiting the number of -retransmissions and the time over which this final packet is sent limits the -effort expended on terminated connections.¶
-An endpoint that chooses not to retransmit packets containing a CONNECTION_CLOSE -frame risks a peer missing the first such packet. The only mechanism available -to an endpoint that continues to receive data for a terminated connection is to -use the stateless reset process (Section 10.4).¶
-If an application-level error affects a single stream, but otherwise leaves the -connection in a recoverable state, the endpoint can send a RESET_STREAM frame -(Section 19.4) with an appropriate error code to terminate just the -affected stream.¶
-Resetting a stream without the involvement of the application protocol could -cause the application protocol to enter an unrecoverable state. RESET_STREAM -MUST only be instigated by the application protocol that uses QUIC.¶
-The semantics of the application error code carried in RESET_STREAM are -defined by the application protocol. Only the application protocol is able to -cause a stream to be terminated. A local instance of the application protocol -uses a direct API call and a remote instance uses the STOP_SENDING frame, which -triggers an automatic RESET_STREAM.¶
-Application protocols SHOULD define rules for handling streams that are -prematurely cancelled by either endpoint.¶
-QUIC endpoints communicate by exchanging packets. Packets have confidentiality -and integrity protection (see Section 12.1) and are carried in UDP -datagrams (see Section 12.2).¶
-This version of QUIC uses the long packet header (see Section 17.2) during -connection establishment. Packets with the long header are Initial -(Section 17.2.2), 0-RTT (Section 17.2.3), Handshake (Section 17.2.4), -and Retry (Section 17.2.5). Version negotiation uses a version-independent -packet with a long header (see Section 17.2.1).¶
-Packets with the short header (Section 17.3) are designed for minimal -overhead and are used after a connection is established and 1-RTT keys are -available.¶
-All QUIC packets except Version Negotiation packets use authenticated -encryption with additional data (AEAD) [RFC5116] to provide confidentiality -and integrity protection. Retry packets use AEAD to provide integrity -protection. Details of packet protection are found in [QUIC-TLS]; this -section includes an overview of the process.¶
-Initial packets are protected using keys that are statically derived. This -packet protection is not effective confidentiality protection. Initial -protection only exists to ensure that the sender of the packet is on the network -path. Any entity that receives the Initial packet from a client can recover the -keys necessary to remove packet protection or to generate packets that will be -successfully authenticated.¶
-All other packets are protected with keys derived from the cryptographic handshake. The packet type from the long header, or the key phase from the short header, is used to identify which encryption keys are used. Packets protected with 0-RTT and 1-RTT keys are expected to have confidentiality and data origin authentication; the cryptographic handshake ensures that only the communicating endpoints receive the corresponding keys.¶
-The packet number field contains a packet number, which has additional -confidentiality protection that is applied after packet protection is applied -(see [QUIC-TLS] for details). The underlying packet number increases with -each packet sent in a given packet number space; see Section 12.3 for -details.¶
-Initial (Section 17.2.2), 0-RTT (Section 17.2.3), and Handshake -(Section 17.2.4) packets contain a Length field, which determines the end -of the packet. The length includes both the Packet Number and Payload -fields, both of which are confidentiality protected and initially of unknown -length. The length of the Payload field is learned once header protection is -removed.¶
-Using the Length field, a sender can coalesce multiple QUIC packets into one UDP -datagram. This can reduce the number of UDP datagrams needed to complete the -cryptographic handshake and start sending data. This can also be used to -construct PMTU probes (see Section 14.3.1). Receivers MUST be able to -process coalesced packets.¶
-Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, -Handshake, 1-RTT; see Section 4.1.4 of [QUIC-TLS]) makes it more likely the -receiver will be able to process all the packets in a single pass. A packet -with a short header does not include a length, so it can only be the last -packet included in a UDP datagram. An endpoint SHOULD NOT coalesce multiple -packets at the same encryption level.¶
-Senders MUST NOT coalesce QUIC packets for different connections into a single -UDP datagram. Receivers SHOULD ignore any subsequent packets with a different -Destination Connection ID than the first packet in the datagram.¶
-Every QUIC packet that is coalesced into a single UDP datagram is separate and -complete. The receiver of coalesced QUIC packets MUST individually process each -QUIC packet and separately acknowledge them, as if they were received as the -payload of different UDP datagrams. For example, if decryption fails (because -the keys are not available or any other reason), the receiver MAY either discard -or buffer the packet for later processing and MUST attempt to process the -remaining packets.¶
-Retry packets (Section 17.2.5), Version Negotiation packets -(Section 17.2.1), and packets with a short header (Section 17.3) do not -contain a Length field and so cannot be followed by other packets in the same -UDP datagram. Note also that there is no situation where a Retry or Version -Negotiation packet is coalesced with another packet.¶
-The packet number is an integer in the range 0 to 2^62-1. This number is used -in determining the cryptographic nonce for packet protection. Each endpoint -maintains a separate packet number for sending and receiving.¶
-Packet numbers are limited to this range because they need to be representable -in whole in the Largest Acknowledged field of an ACK frame (Section 19.3). -When present in a long or short header however, packet numbers are reduced and -encoded in 1 to 4 bytes (see Section 17.1).¶
-Version Negotiation (Section 17.2.1) and Retry (Section 17.2.5) packets -do not include a packet number.¶
-Packet numbers are divided into 3 spaces in QUIC: the Initial space (all Initial packets), the Handshake space (all Handshake packets), and the Application data space (all 0-RTT and 1-RTT packets).¶
-As described in [QUIC-TLS], each packet type uses different protection keys.¶
-Conceptually, a packet number space is the context in which a packet can be -processed and acknowledged. Initial packets can only be sent with Initial -packet protection keys and acknowledged in packets which are also Initial -packets. Similarly, Handshake packets are sent at the Handshake encryption -level and can only be acknowledged in Handshake packets.¶
-This enforces cryptographic separation between the data sent in the different -packet sequence number spaces. Packet numbers in each space start at packet -number 0. Subsequent packets sent in the same packet number space MUST increase -the packet number by at least one.¶
-0-RTT and 1-RTT data exist in the same packet number space to make loss recovery -algorithms easier to implement between the two packet types.¶
-A QUIC endpoint MUST NOT reuse a packet number within the same packet number -space in one connection. If the packet number for sending reaches 2^62 - 1, the -sender MUST close the connection without sending a CONNECTION_CLOSE frame or any -further packets; an endpoint MAY send a Stateless Reset (Section 10.4) in -response to further packets that it receives.¶
-A receiver MUST discard a newly unprotected packet unless it is certain that it -has not processed another packet with the same packet number from the same -packet number space. Duplicate suppression MUST happen after removing packet -protection for the reasons described in Section 9.3 of [QUIC-TLS]. An -efficient algorithm for duplicate suppression can be found in Section 3.4.3 of -[RFC4303].¶
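The sliding-window algorithm referenced above (Section 3.4.3 of [RFC4303]) can be sketched as follows; the class name and window size are illustrative assumptions, not part of this specification:

```python
class ReplayWindow:
    """Sliding-window duplicate suppression in the style of RFC 4303,
    Section 3.4.3: a bitmap tracks which of the most recent packet
    numbers have already been processed."""

    def __init__(self, size: int = 64) -> None:
        self.size = size       # how far behind the highest PN we can judge
        self.bitmap = 0        # bit i set => (highest - i) already seen
        self.highest = -1      # highest packet number processed so far

    def check_and_update(self, pn: int) -> bool:
        """Return True if pn is new (process it), False if duplicate
        or too old to judge (discard it)."""
        if pn > self.highest:
            shift = pn - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = pn
            return True
        offset = self.highest - pn
        if offset >= self.size:
            return False       # fell out of the window; reject conservatively
        if self.bitmap & (1 << offset):
            return False       # duplicate
        self.bitmap |= 1 << offset
        return True
```

A real implementation sizes the window to cover the largest plausible reordering on the path.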
-Packet number encoding at a sender and decoding at a receiver are described in -Section 17.1.¶
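As a rough sketch of the decoding side, a receiver reconstructs the full packet number by choosing the candidate closest to one more than the largest packet number received so far, following the approach described in Section 17.1; the function name is illustrative:

```python
def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Recover a full 62-bit packet number from its truncated encoding
    by picking the candidate nearest to the expected next packet number
    (largest_pn + 1)."""
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits          # range covered by the truncated encoding
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn
    if candidate_pn <= expected_pn - pn_hwin and candidate_pn < (1 << 62) - pn_win:
        return candidate_pn + pn_win
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win
    return candidate_pn
```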
-The payload of QUIC packets, after removing packet protection, consists of a -sequence of complete frames, as shown in Figure 7. Version -Negotiation, Stateless Reset, and Retry packets do not contain frames.¶
-The payload of a packet that contains frames MUST contain at least one frame, -and MAY contain multiple frames and multiple frame types. Frames always fit -within a single QUIC packet and cannot span multiple packets.¶
-Each frame begins with a Frame Type, indicating its type, followed by -additional type-dependent fields:¶
-The frame types defined in this specification are listed in Table 3. -The Frame Type in ACK, STREAM, MAX_STREAMS, STREAMS_BLOCKED, and -CONNECTION_CLOSE frames is used to carry other frame-specific flags. For all -other frames, the Frame Type field simply identifies the frame. These -frames are explained in more detail in Section 19.¶
Type Value | Frame Type Name | Definition | Packets
---|---|---|---
0x00 | PADDING | Section 19.1 | IH01
0x01 | PING | Section 19.2 | IH01
0x02 - 0x03 | ACK | Section 19.3 | IH_1
0x04 | RESET_STREAM | Section 19.4 | __01
0x05 | STOP_SENDING | Section 19.5 | __01
0x06 | CRYPTO | Section 19.6 | IH_1
0x07 | NEW_TOKEN | Section 19.7 | ___1
0x08 - 0x0f | STREAM | Section 19.8 | __01
0x10 | MAX_DATA | Section 19.9 | __01
0x11 | MAX_STREAM_DATA | Section 19.10 | __01
0x12 - 0x13 | MAX_STREAMS | Section 19.11 | __01
0x14 | DATA_BLOCKED | Section 19.12 | __01
0x15 | STREAM_DATA_BLOCKED | Section 19.13 | __01
0x16 - 0x17 | STREAMS_BLOCKED | Section 19.14 | __01
0x18 | NEW_CONNECTION_ID | Section 19.15 | __01
0x19 | RETIRE_CONNECTION_ID | Section 19.16 | __01
0x1a | PATH_CHALLENGE | Section 19.17 | __01
0x1b | PATH_RESPONSE | Section 19.18 | __01
0x1c - 0x1d | CONNECTION_CLOSE | Section 19.19 | ih01
0x1e | HANDSHAKE_DONE | Section 19.20 | ___1
The "Packets" column in Table 3 does not form part of the IANA registry (see Section 22.3). This column lists the types of packets that each frame type could appear in, indicated by the following characters: "I" for Initial (Section 17.2.2), "H" for Handshake (Section 17.2.4), "0" for 0-RTT (Section 17.2.3), and "1" for packets with a short header (Section 17.3); "ih" indicates that only a CONNECTION_CLOSE frame of type 0x1c can appear in Initial or Handshake packets, and "_" marks packet types in which the frame cannot appear.¶
-Section 4 of [QUIC-TLS] provides more detail about these restrictions. Note -that all frames can appear in 1-RTT packets.¶
-An endpoint MUST treat the receipt of a frame of unknown type as a connection -error of type FRAME_ENCODING_ERROR.¶
-All QUIC frames are idempotent in this version of QUIC. That is, a valid -frame does not cause undesirable side effects or errors when received more -than once.¶
-The Frame Type field uses a variable length integer encoding (see -Section 16) with one exception. To ensure simple and efficient -implementations of frame parsing, a frame type MUST use the shortest possible -encoding. For frame types defined in this document, this means a single-byte -encoding, even though it is possible to encode these values as a two-, four- -or eight-byte variable length integer. For instance, though 0x4001 is -a legitimate two-byte encoding for a variable-length integer with a value -of 1, PING frames are always encoded as a single byte with the value 0x01. -This rule applies to all current and future QUIC frame types. An endpoint -MAY treat the receipt of a frame type that uses a longer encoding than -necessary as a connection error of type PROTOCOL_VIOLATION.¶
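The shortest-encoding rule can be illustrated with a sketch of the variable-length integer encoding from Section 16, in which a 2-bit prefix selects a 1-, 2-, 4-, or 8-byte encoding; function names are illustrative:

```python
def encode_varint(v: int) -> bytes:
    """QUIC variable-length integer: the top two bits of the first byte
    give log2 of the length in bytes (00=1, 01=2, 10=4, 11=8)."""
    if v < 2**6:
        return v.to_bytes(1, "big")
    if v < 2**14:
        return (v | (0b01 << 14)).to_bytes(2, "big")
    if v < 2**30:
        return (v | (0b10 << 30)).to_bytes(4, "big")
    if v < 2**62:
        return (v | (0b11 << 62)).to_bytes(8, "big")
    raise ValueError("value out of range for a varint")

def decode_varint(data: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed)."""
    length = 1 << (data[0] >> 6)
    value = int.from_bytes(data[:length], "big") & ((1 << (8 * length - 2)) - 1)
    return value, length
```

The 0x4001 example from the text decodes to the value 1, but a frame parser enforcing the shortest-encoding rule would reject it as a frame type because `encode_varint(1)` is the single byte 0x01.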
-A sender bundles one or more frames in a QUIC packet (see Section 12.4).¶
-A sender can minimize per-packet bandwidth and computational costs by bundling -as many frames as possible within a QUIC packet. A sender MAY wait for a short -period of time to bundle multiple frames before sending a packet that is not -maximally packed, to avoid sending out large numbers of small packets. An -implementation MAY use knowledge about application sending behavior or -heuristics to determine whether and for how long to wait. This waiting period -is an implementation decision, and an implementation should be careful to delay -conservatively, since any delay is likely to increase application-visible -latency.¶
-Stream multiplexing is achieved by interleaving STREAM frames from multiple -streams into one or more QUIC packets. A single QUIC packet can include -multiple STREAM frames from one or more streams.¶
-One of the benefits of QUIC is avoidance of head-of-line blocking across -multiple streams. When a packet loss occurs, only streams with data in that -packet are blocked waiting for a retransmission to be received, while other -streams can continue making progress. Note that when data from multiple streams -is bundled into a single QUIC packet, loss of that packet blocks all those -streams from making progress. Implementations are advised to bundle as few -streams as necessary in outgoing packets without losing transmission efficiency -to underfilled packets.¶
-A packet MUST NOT be acknowledged until packet protection has been successfully -removed and all frames contained in the packet have been processed. For STREAM -frames, this means the data has been enqueued in preparation to be received by -the application protocol, but it does not require that data is delivered and -consumed.¶
-Once the packet has been fully processed, a receiver acknowledges receipt by -sending one or more ACK frames containing the packet number of the received -packet.¶
-Endpoints acknowledge all packets they receive and process. However, only -ack-eliciting packets cause an ACK frame to be sent within the maximum ack -delay. Packets that are not ack-eliciting are only acknowledged when an ACK -frame is sent for other reasons.¶
-When sending a packet for any reason, an endpoint SHOULD attempt to bundle an -ACK frame if one has not been sent recently. Doing so helps with timely loss -detection at the peer.¶
-In general, frequent feedback from a receiver improves loss and congestion -response, but this has to be balanced against excessive load generated by a -receiver that sends an ACK frame in response to every ack-eliciting packet. The -guidance offered below seeks to strike this balance.¶
-Every packet SHOULD be acknowledged at least once, and ack-eliciting packets MUST be acknowledged at least once within the maximum ack delay. An endpoint communicates its maximum delay using the max_ack_delay transport parameter; see Section 18.2. max_ack_delay declares an explicit contract: an endpoint promises to never intentionally delay acknowledgments of an ack-eliciting packet by more than the indicated value. If it does, any excess accrues to the RTT estimate and could result in spurious or delayed retransmissions from the peer. For Initial and Handshake packets, a max_ack_delay of 0 is used. The sender uses the receiver's max_ack_delay value in determining timeouts for timer-based retransmission, as detailed in Section 5.2.1 of [QUIC-RECOVERY].¶
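The role of max_ack_delay in timer-based retransmission can be illustrated with a simplified sketch of the probe timeout computation from [QUIC-RECOVERY]; the function name and default granularity are illustrative:

```python
def probe_timeout(smoothed_rtt: float, rttvar: float, max_ack_delay: float,
                  granularity: float = 0.001) -> float:
    """Probe timeout (PTO) sketch, per [QUIC-RECOVERY]: before declaring a
    probe necessary, the sender allows the peer its declared max_ack_delay
    on top of the RTT estimate and its variance (times are in seconds)."""
    return smoothed_rtt + max(4 * rttvar, granularity) + max_ack_delay
```

A peer that delays acknowledgments beyond its declared max_ack_delay inflates smoothed_rtt instead, which is exactly the "excess accrues to the RTT estimate" effect described above.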
An ACK frame SHOULD be generated for at least every second ack-eliciting packet. -This recommendation is in keeping with standard practice for TCP [RFC5681]. -A receiver could decide to send an ACK frame less frequently if it has -information about how frequently the sender's congestion controller -needs feedback, or if the receiver is CPU or bandwidth constrained.¶
-In order to assist loss detection at the sender, an endpoint SHOULD send an ACK -frame immediately on receiving an ack-eliciting packet that is out of order. The -endpoint MAY continue sending ACK frames immediately on each subsequently -received packet, but the endpoint SHOULD return to acknowledging every other -packet within a period of 1/8 x RTT, unless more ack-eliciting packets are -received out of order. If every subsequent ack-eliciting packet arrives out of -order, then an ACK frame SHOULD be sent immediately for every received -ack-eliciting packet.¶
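The guidance above, acknowledging at least every second ack-eliciting packet and acknowledging immediately on reordering, might be sketched as receiver-side logic; the class name and structure are illustrative assumptions:

```python
class AckPolicy:
    """Sketch of when a receiver sends an ACK frame immediately: on every
    second ack-eliciting packet, and at once for out-of-order ack-eliciting
    packets. A real stack also arms a timer bounded by max_ack_delay."""

    def __init__(self) -> None:
        self.largest_received = -1
        self.unacked_eliciting = 0

    def on_packet_received(self, pn: int, ack_eliciting: bool) -> bool:
        """Return True if an ACK frame should be generated now."""
        out_of_order = pn != self.largest_received + 1
        self.largest_received = max(self.largest_received, pn)
        if not ack_eliciting:
            return False   # acknowledged later, alongside other frames
        self.unacked_eliciting += 1
        if out_of_order or self.unacked_eliciting >= 2:
            self.unacked_eliciting = 0
            return True    # immediate ACK
        return False       # wait, but no longer than max_ack_delay
```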
-Similarly, packets marked with the ECN Congestion Experienced (CE) codepoint in -the IP header SHOULD be acknowledged immediately, to reduce the peer's response -time to congestion events.¶
-As an optimization, a receiver MAY process multiple packets before sending any -ACK frames in response. In this case the receiver can determine whether an -immediate or delayed acknowledgement should be generated after processing -incoming packets.¶
-Packets containing PADDING frames are considered to be in flight for congestion -control purposes [QUIC-RECOVERY]. Sending only PADDING frames might cause the -sender to become limited by the congestion controller with no acknowledgments -forthcoming from the receiver. Therefore, a sender SHOULD ensure that other -frames are sent in addition to PADDING frames to elicit acknowledgments from -the receiver.¶
-An endpoint that is only sending ACK frames will not receive acknowledgments -from its peer unless those acknowledgements are included in packets with -ack-eliciting frames. An endpoint SHOULD bundle ACK frames with other frames -when there are new ack-eliciting packets to acknowledge. When only -non-ack-eliciting packets need to be acknowledged, an endpoint MAY wait until an -ack-eliciting packet has been received to bundle an ACK frame with outgoing -frames.¶
-The algorithms in [QUIC-RECOVERY] are resilient to receivers that do not -follow guidance offered above. However, an implementor should only deviate from -these requirements after careful consideration of the performance implications -of doing so.¶
-Packets containing only ACK frames are not congestion controlled, so there are -limits on how frequently they can be sent. An endpoint MUST NOT send more than -one ACK-frame-only packet in response to receiving an ack-eliciting packet. An -endpoint MUST NOT send a non-ack-eliciting packet in response to a -non-ack-eliciting packet, even if there are packet gaps which precede the -received packet. Limiting ACK frames avoids an infinite feedback loop of -acknowledgements, which could prevent the connection from ever becoming idle. -However, the endpoint acknowledges non-ACK-eliciting packets when it sends an -ACK frame.¶
-An endpoint SHOULD treat receipt of an acknowledgment for a packet it did not -send as a connection error of type PROTOCOL_VIOLATION, if it is able to detect -the condition.¶
-When an ACK frame is sent, one or more ranges of acknowledged packets are -included. Including older packets reduces the chance of spurious retransmits -caused by losing previously sent ACK frames, at the cost of larger ACK frames.¶
-ACK frames SHOULD always acknowledge the most recently received packets, and the -more out-of-order the packets are, the more important it is to send an updated -ACK frame quickly, to prevent the peer from declaring a packet as lost and -spuriously retransmitting the frames it contains. An ACK frame is expected -to fit within a single QUIC packet. If it does not, then older ranges -(those with the smallest packet numbers) are omitted.¶
-Section 13.2.3 and Section 13.2.4 describe an exemplary approach for -determining what packets to acknowledge in each ACK frame.¶
-When a packet containing an ACK frame is sent, the largest acknowledged in that -frame may be saved. When a packet containing an ACK frame is acknowledged, the -receiver can stop acknowledging packets less than or equal to the largest -acknowledged in the sent ACK frame.¶
-In cases without ACK frame loss, this algorithm allows for a minimum of 1 RTT -of reordering. In cases with ACK frame loss and reordering, this approach does -not guarantee that every acknowledgement is seen by the sender before it is no -longer included in the ACK frame. Packets could be received out of order and -all subsequent ACK frames containing them could be lost. In this case, the -loss recovery algorithm could cause spurious retransmits, but the sender will -continue making forward progress.¶
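The bookkeeping described above can be sketched directly: remember the Largest Acknowledged of each ACK frame sent, and once a packet carrying such an ACK frame is itself acknowledged, stop acknowledging packet numbers at or below that value. Names are illustrative:

```python
def prune_ack_ranges(ranges: list[tuple[int, int]],
                     acked_largest: int) -> list[tuple[int, int]]:
    """Drop packet numbers <= acked_largest from the (smallest, largest)
    ACK ranges a receiver is tracking, where acked_largest is the Largest
    Acknowledged field of an ACK frame the peer has confirmed receiving."""
    pruned = []
    for lo, hi in ranges:
        if hi <= acked_largest:
            continue                                # fully confirmed; drop
        pruned.append((max(lo, acked_largest + 1), hi))
    return pruned
```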
-A receiver limits the number of ACK Ranges (Section 19.3.1) it remembers and -sends in ACK frames, both to limit the size of ACK frames and to avoid resource -exhaustion. After receiving acknowledgments for an ACK frame, the receiver -SHOULD stop tracking those acknowledged ACK Ranges.¶
-It is possible that retaining many ACK Ranges could cause an ACK frame to become too large. A receiver can discard unacknowledged ACK Ranges to limit ACK frame size, at the cost of increased retransmissions from the sender. This is necessary if an ACK frame would be too large to fit in a packet; however, receivers MAY also limit ACK frame size further to preserve space for other frames.¶
-When discarding unacknowledged ACK Ranges, a receiver SHOULD retain ACK Ranges -containing newly received packets or higher-numbered packets.¶
-A receiver that sends only non-ack-eliciting packets, such as ACK frames, might -not receive an acknowledgement for a long period of time. This could cause the -receiver to maintain state for a large number of ACK frames for a long period of -time, and ACK frames it sends could be unnecessarily large. In such a case, a -receiver could bundle a PING or other small ack-eliciting frame occasionally, -such as once per round trip, to elicit an ACK from the peer.¶
-A receiver MUST NOT bundle an ack-eliciting frame with all packets that would -otherwise be non-ack-eliciting, to avoid an infinite feedback loop of -acknowledgements.¶
-An endpoint measures the delays intentionally introduced between the time -the packet with the largest packet number is received and the time an -acknowledgment is sent. The endpoint encodes this delay in the Ack Delay -field of an ACK frame (see Section 19.3). This allows the receiver of the ACK -to adjust for any intentional delays, which is important for getting a better -estimate of the path RTT when acknowledgments are delayed. A packet might -be held in the OS kernel or elsewhere on the host before being processed. -An endpoint MUST NOT include delays that it does not control when populating -the Ack Delay field in an ACK frame.¶
-ACK frames MUST only be carried in a packet that has the same packet -number space as the packet being ACKed (see Section 12.1). For -instance, packets that are protected with 1-RTT keys MUST be -acknowledged in packets that are also protected with 1-RTT keys.¶
-Packets that a client sends with 0-RTT packet protection MUST be acknowledged by -the server in packets protected by 1-RTT keys. This can mean that the client is -unable to use these acknowledgments if the server cryptographic handshake -messages are delayed or lost. Note that the same limitation applies to other -data sent by the server protected by the 1-RTT keys.¶
-QUIC packets that are determined to be lost are not retransmitted whole. The -same applies to the frames that are contained within lost packets. Instead, the -information that might be carried in frames is sent again in new frames as -needed.¶
-New frames and packets are used to carry information that is determined to have -been lost. In general, information is sent again when a packet containing that -information is determined to be lost and sending ceases when a packet -containing that information is acknowledged.¶
-Endpoints SHOULD prioritize retransmission of data over sending new data, unless -priorities specified by the application indicate otherwise (see -Section 2.3).¶
-Even though a sender is encouraged to assemble frames containing up-to-date information every time it sends a packet, it is not forbidden to retransmit copies of frames from lost packets. A sender that retransmits copies of frames needs to handle decreases in available payload size due to changes in packet number length, connection ID length, and path MTU. A receiver MUST accept packets containing an outdated frame, such as a MAX_DATA frame carrying a smaller maximum data value than one found in an older packet.¶
-Upon detecting losses, a sender MUST take appropriate congestion control action. -The details of loss detection and congestion control are described in -[QUIC-RECOVERY].¶
-QUIC endpoints can use Explicit Congestion Notification (ECN) [RFC3168] to -detect and respond to network congestion. ECN allows a network node to indicate -congestion in the network by setting a codepoint in the IP header of a packet -instead of dropping it. Endpoints react to congestion by reducing their sending -rate in response, as described in [QUIC-RECOVERY].¶
-To use ECN, QUIC endpoints first determine whether a path supports ECN marking -and the peer is able to access the ECN codepoint in the IP header. A network -path does not support ECN if ECN marked packets get dropped or ECN markings are -rewritten on the path. An endpoint validates the use of ECN on the path, both -during connection establishment and when migrating to a new path -(Section 9).¶
-On receiving a QUIC packet with an ECT or CE codepoint, an ECN-enabled endpoint -that can access the ECN codepoints from the enclosing IP packet increases the -corresponding ECT(0), ECT(1), or CE count, and includes these counts in -subsequent ACK frames (see Section 13.2 and Section 19.3). Note -that this requires being able to read the ECN codepoints from the enclosing IP -packet, which is not possible on all platforms.¶
-A packet detected by a receiver as a duplicate does not affect the receiver's local ECN codepoint counts; see Section 21.8 for relevant security concerns.¶
-If an endpoint receives a QUIC packet without an ECT or CE codepoint in the IP -packet header, it responds per Section 13.2 with an ACK frame without -increasing any ECN counts. If an endpoint does not implement ECN -support or does not have access to received ECN codepoints, it does not increase -ECN counts.¶
-Coalescing packets (see Section 12.2) means that several QUIC packets can share the same IP header. The ECN counter for the ECN codepoint received in the associated IP header is incremented once for each QUIC packet, not once per enclosing IP packet or UDP datagram.¶
-Each packet number space maintains separate acknowledgement state and separate -ECN counts. For example, if one each of an Initial, 0-RTT, Handshake, and 1-RTT -QUIC packet are coalesced, the corresponding counts for the Initial and -Handshake packet number space will be incremented by one and the counts for the -1-RTT packet number space will be increased by two.¶
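The worked example above can be expressed in code; the mapping of packet types to number spaces is a sketch with illustrative names:

```python
# Sketch: each coalesced QUIC packet increments the ECN count of its own
# packet number space once, even though all share one IP header.
SPACE_OF = {"initial": "initial", "handshake": "handshake",
            "0rtt": "application", "1rtt": "application"}

def ce_counts_for_datagram(coalesced_packet_types: list[str]) -> dict[str, int]:
    """Per-space CE count increments for one CE-marked UDP datagram."""
    counts = {"initial": 0, "handshake": 0, "application": 0}
    for ptype in coalesced_packet_types:
        counts[SPACE_OF[ptype]] += 1   # once per QUIC packet, not per datagram
    return counts
```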
-It is possible for faulty network devices to corrupt or erroneously drop packets -with ECN markings. To provide robust connectivity in the presence of such -devices, each endpoint independently validates ECN counts and disables ECN if -errors are detected.¶
-Endpoints validate ECN for packets sent on each network path independently. An -endpoint thus validates ECN on new connection establishment, when switching to a -new server preferred address, and on active connection migration to a new path. -Appendix B describes one possible algorithm for testing paths for ECN support.¶
-Even if an endpoint does not use ECN markings on packets it transmits, the -endpoint MUST provide feedback about ECN markings received from the peer if they -are accessible. Failing to report ECN counts will cause the peer to disable ECN -marking.¶
-To start ECN validation, an endpoint SHOULD do the following when sending -packets on a new path to a peer:¶
-To reduce the chances of misinterpreting congestive loss as packets dropped by a -faulty network element, an endpoint could set the ECT(0) codepoint in the first -ten outgoing packets on a path, or for a period of three RTTs, whichever occurs -first.¶
-Implementations MAY experiment with and use other strategies for use of ECN. -Other methods of probing paths for ECN support are possible, as are different -marking strategies. Implementations can also use the ECT(1) codepoint, as -specified in [RFC8311].¶
-An endpoint that sets ECT(0) or ECT(1) codepoints on packets it transmits MUST -use the following steps on receiving an ACK frame to validate ECN.¶
-Processing ECN counts out of order can result in validation failure. An -endpoint SHOULD NOT perform this validation if this ACK frame does not advance -the largest packet number acknowledged in this connection.¶
-An endpoint could miss acknowledgements for a packet when ACK frames are lost. -It is therefore possible for the total increase in ECT(0), ECT(1), and CE counts -to be greater than the number of packets acknowledged in an ACK frame. When -this happens, and if validation succeeds, the local reference counts MUST be -increased to match the counts in the ACK frame.¶
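One of the validation checks involved can be sketched in Python for a sender that marks packets with ECT(0) only; the function and parameter names are illustrative assumptions:

```python
def ecn_counts_plausible(newly_acked_ect0: int, ect0_increase: int,
                         ect1_increase: int, ce_increase: int) -> bool:
    """Sketch of ECN count validation for an ECT(0)-marking sender: the
    ECT(0) and CE count increases together must account for every newly
    acknowledged packet sent with ECT(0), and ECT(1) must not appear when
    it was never sent. Increases MAY exceed the newly acknowledged packet
    count, since earlier ACK frames can be lost."""
    if ect1_increase != 0:
        return False   # we never sent ECT(1)
    return ect0_increase + ce_increase >= newly_acked_ect0
```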
-If validation fails, then the endpoint stops sending ECN markings in subsequent -IP packets with the expectation that either the network path or the peer does -not support ECN.¶
-Upon successful validation, an endpoint can continue to set ECT codepoints in -subsequent packets with the expectation that the path is ECN-capable. Network -routing and path elements can change mid-connection however; an endpoint MUST -disable ECN if validation fails at any point in the connection.¶
-Even if validation fails, an endpoint MAY revalidate ECN on the same path at any -later time in the connection.¶
-The QUIC packet size includes the QUIC header and protected payload, but not the -UDP or IP header.¶
-A client MUST expand the payload of all UDP datagrams carrying Initial packets -to at least 1200 bytes, by adding PADDING frames to the Initial packet or by -coalescing the Initial packet (see Section 12.2). Sending a UDP datagram -of this size ensures that the network path from the client to the server -supports a reasonable Maximum Transmission Unit (MTU). Padding datagrams also -helps reduce the amplitude of amplification attacks caused by server responses -toward an unverified client address; see Section 8.¶
-Datagrams containing Initial packets MAY exceed 1200 bytes if the client -believes that the Path Maximum Transmission Unit (PMTU) supports the size that -it chooses.¶
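The 1200-byte requirement can be met by computing the number of PADDING frames before protecting the Initial packet; this sketch ignores header and AEAD expansion, which a real packet builder must account for, and the names are illustrative:

```python
MIN_INITIAL_DATAGRAM = 1200  # bytes, per this section

def padding_frames_needed(initial_payload_len: int, coalesced_len: int = 0) -> int:
    """Number of PADDING frames (each a single 0x00 byte) to add inside a
    client Initial packet so the whole UDP datagram, including any
    coalesced packets, reaches 1200 bytes."""
    return max(0, MIN_INITIAL_DATAGRAM - initial_payload_len - coalesced_len)
```

Coalescing a 0-RTT packet into the same datagram reduces the padding needed, which is why the text offers coalescing as an alternative to PADDING frames.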
-UDP datagrams MUST NOT be fragmented at the IP layer. In IPv4 -[IPv4], the DF bit MUST be set to prevent fragmentation on the path.¶
-A server MUST discard an Initial packet that is carried in a UDP datagram that -is smaller than 1200 bytes. A server MAY also immediately close the connection -by sending a CONNECTION_CLOSE frame with an error code of PROTOCOL_VIOLATION; -see Section 10.3.1.¶
-The server MUST also limit the number of bytes it sends before validating the -address of the client; see Section 8.¶
-The PMTU is the maximum size of the entire IP packet including the IP header, -UDP header, and UDP payload. The UDP payload includes the QUIC packet header, -protected payload, and any authentication fields. The PMTU can depend upon the -current path characteristics. Therefore, the current largest UDP payload an -implementation will send is referred to as the QUIC maximum packet size.¶
-QUIC depends on a PMTU of at least 1280 bytes. This is the IPv6 minimum size -[RFC8200] and is also supported by most modern IPv4 networks. All QUIC -packets (except for PMTU probe packets) SHOULD be sized to fit within the -maximum packet size to avoid the packet being fragmented or dropped -[RFC8085].¶
-An endpoint SHOULD use Datagram Packetization Layer PMTU Discovery -([DPLPMTUD]) or implement Path MTU Discovery -(PMTUD) [RFC1191] [RFC8201] to determine whether the path to a destination -will support a desired message size without fragmentation.¶
-In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP packets -larger than 1280 bytes. Assuming the minimum IP header size, this results in a -QUIC maximum packet size of 1232 bytes for IPv6 and 1252 bytes for IPv4. A QUIC -implementation MAY be more conservative in computing the QUIC maximum packet -size to allow for unknown tunnel overheads or IP header options/extensions.¶
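The arithmetic above can be stated directly, assuming minimum IP header sizes (40 bytes for IPv6, 20 for IPv4) and the 8-byte UDP header:

```python
def quic_max_packet_size(ip_version: int, ip_packet_size: int = 1280) -> int:
    """Largest QUIC packet that fits in an IP packet of the given size,
    assuming a minimum-size IP header and an 8-byte UDP header; real
    deployments may subtract more for tunnels or IP options/extensions."""
    ip_header = 40 if ip_version == 6 else 20
    return ip_packet_size - ip_header - 8
```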
-Each pair of local and remote addresses could have a different PMTU. QUIC -implementations that implement any kind of PMTU discovery therefore SHOULD -maintain a maximum packet size for each combination of local and remote IP -addresses.¶
-If a QUIC endpoint determines that the PMTU between any pair of local and remote -IP addresses has fallen below the size needed to support the smallest allowed -maximum packet size, it MUST immediately cease sending QUIC packets, except for -PMTU probe packets, on the affected path. An endpoint MAY terminate the -connection if an alternative path cannot be found.¶
-PMTU discovery [RFC1191] [RFC8201] relies on reception of ICMP messages -(e.g., IPv6 Packet Too Big messages) that indicate when a packet is dropped -because it is larger than the local router MTU. DPLPMTUD can also optionally use -these messages. This use of ICMP messages is potentially vulnerable to off-path -attacks that successfully guess the addresses used on the path and reduce the -PMTU to a bandwidth-inefficient value.¶
-An endpoint MUST ignore an ICMP message that claims the PMTU has decreased below -1280 bytes.¶
-The requirements for generating ICMP ([RFC1812], [RFC4443]) state that the -quoted packet should contain as much of the original packet as possible without -exceeding the minimum MTU for the IP version. The size of the quoted packet can -actually be smaller, or the information unintelligible, as described in Section -1.1 of [DPLPMTUD].¶
-QUIC endpoints SHOULD validate ICMP messages to protect from off-path injection -as specified in [RFC8201] and Section 5.2 of [RFC8085]. This validation -SHOULD use the quoted packet supplied in the payload of an ICMP message to -associate the message with a corresponding transport connection [DPLPMTUD].¶
-ICMP message validation MUST include matching IP addresses and UDP ports -[RFC8085] and, when possible, connection IDs to an active QUIC session.¶
-Further validation can also be provided:¶
-The endpoint SHOULD ignore all ICMP messages that fail validation.¶
-An endpoint MUST NOT increase PMTU based on ICMP messages. Any reduction in the -QUIC maximum packet size MAY be provisional until QUIC's loss detection -algorithm determines that the quoted packet has actually been lost.¶
-Section 6.3 of [DPLPMTUD] provides considerations for implementing Datagram -Packetization Layer PMTUD (DPLPMTUD) with QUIC.¶
-When implementing the algorithm in Section 5 of [DPLPMTUD], the initial -value of BASE_PMTU SHOULD be consistent with the minimum QUIC packet size (1232 -bytes for IPv6 and 1252 bytes for IPv4).¶
-PING and PADDING frames can be used to generate PMTU probe packets. These frames -might not be retransmitted if a probe packet containing them is lost. However, -these frames do consume congestion window, which could delay the transmission of -subsequent application data.¶
-A PING frame can be included in a PMTU probe to ensure that a valid probe is -acknowledged.¶
-The considerations for processing ICMP messages in the previous section also -apply if these messages are used by DPLPMTUD.¶
-Endpoints that rely on the destination connection ID for routing QUIC packets -are likely to require that the connection ID be included in PMTU probe packets -to route any resulting ICMP messages (Section 14.2) back to the correct -endpoint. However, only long header packets (Section 17.2) contain source -connection IDs, and long header packets are not decrypted or acknowledged by -the peer once the handshake is complete. One way to construct a PMTU probe is -to coalesce (see Section 12.2) a Handshake packet (Section 17.2.4) -with a short header packet in a single UDP datagram. If the UDP datagram -reaches the endpoint, the Handshake packet will be ignored, but the short header -packet will be acknowledged. If the UDP datagram elicits an ICMP message, that -message will likely contain the source connection ID within the quoted portion -of the UDP datagram.¶
-QUIC versions are identified using a 32-bit unsigned number.¶
-The version 0x00000000 is reserved to represent version negotiation. This -version of the specification is identified by the number 0x00000001.¶
-Other versions of QUIC might have different properties to this version. The -properties of QUIC that are guaranteed to be consistent across all versions of -the protocol are described in [QUIC-INVARIANTS].¶
-Version 0x00000001 of QUIC uses TLS as a cryptographic handshake protocol, as -described in [QUIC-TLS].¶
-Versions with the most significant 16 bits of the version number cleared are -reserved for use in future IETF consensus documents.¶
-Versions that follow the pattern 0x?a?a?a?a are reserved for use in forcing version negotiation to be exercised. That is, any version number where the low four bits of all bytes are 1010 (in binary). A client or server MAY advertise support for any of these reserved versions.¶
-Reserved version numbers will never represent a real protocol; a client MAY use -one of these version numbers with the expectation that the server will initiate -version negotiation; a server MAY advertise support for one of these versions -and can expect that clients ignore the value.¶
-[[RFC editor: please remove the remainder of this section before -publication.]]¶
-The version number for the final version of this specification (0x00000001), is -reserved for the version of the protocol that is published as an RFC.¶
-Version numbers used to identify IETF drafts are created by adding the draft -number to 0xff000000. For example, draft-ietf-quic-transport-13 would be -identified as 0xff00000D.¶
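As a non-normative illustration, the reserved-version pattern and the draft-number mapping described above can be sketched as follows (function names are hypothetical):

```python
def is_reserved_version(version):
    # 0x?a?a?a?a: the low four bits of every byte are 0b1010
    return all((version >> shift) & 0x0F == 0x0A
               for shift in (0, 8, 16, 24))

def draft_version(draft_number):
    # IETF draft versions: 0xff000000 plus the draft number
    return 0xFF000000 + draft_number
```

For example, 0x1a2a3a4a is reserved, while 0x00000001 (this version) is not, and draft 13 maps to 0xff00000d.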
-Implementors are encouraged to register version numbers of QUIC that they are -using for private experimentation on the GitHub wiki at -<https://github.com/quicwg/base-drafts/wiki/QUIC-Versions>.¶
-QUIC packets and frames commonly use a variable-length encoding for non-negative -integer values. This encoding ensures that smaller integer values need fewer -bytes to encode.¶
-The QUIC variable-length integer encoding reserves the two most significant bits -of the first byte to encode the base 2 logarithm of the integer encoding length -in bytes. The integer value is encoded on the remaining bits, in network byte -order.¶
-This means that integers are encoded on 1, 2, 4, or 8 bytes and can encode 6, -14, 30, or 62 bit values respectively. Table 4 summarizes the -encoding properties.¶
2Bit | Length | Usable Bits | Range |
---|---|---|---|
00 | 1 | 6 | 0-63 |
01 | 2 | 14 | 0-16383 |
10 | 4 | 30 | 0-1073741823 |
11 | 8 | 62 | 0-4611686018427387903 |
For example, the eight byte sequence c2 19 7c 5e ff 14 e8 8c (in hexadecimal) -decodes to the decimal value 151288809941952652; the four byte sequence 9d 7f 3e -7d decodes to 494878333; the two byte sequence 7b bd decodes to 15293; and the -single byte 25 decodes to 37 (as does the two byte sequence 40 25).¶
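As a non-normative sketch, the encoding rule above (two prefix bits giving the base-2 logarithm of the length, remaining bits in network byte order) can be implemented and checked against the worked examples:

```python
def decode_varint(data):
    """Return (value, bytes consumed). The two most significant bits
    of the first byte encode log2 of the total encoding length."""
    length = 1 << (data[0] >> 6)
    value = data[0] & 0x3F
    for b in data[1:length]:
        value = (value << 8) | b
    return value, length

def encode_varint(value):
    """Encode value in the shortest of the 1/2/4/8-byte forms."""
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        if value < 1 << (8 * length - 2):
            out = bytearray(value.to_bytes(length, "big"))
            out[0] |= prefix << 6
            return bytes(out)
    raise ValueError("value exceeds 2^62 - 1")
```

Decoding c2 19 7c 5e ff 14 e8 8c yields 151288809941952652, and both 25 and 40 25 decode to 37, matching the examples above.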
-Error codes (Section 20) and versions (Section 15) are described using -integers, but do not use this encoding.¶
-All numeric values are encoded in network byte order (that is, big-endian) and -all field sizes are in bits. Hexadecimal notation is used for describing the -value of fields.¶
-Packet numbers are integers in the range 0 to 2^62-1 (Section 12.3). When -present in long or short packet headers, they are encoded in 1 to 4 bytes. The -number of bits required to represent the packet number is reduced by including -the least significant bits of the packet number.¶
-The encoded packet number is protected as described in Section 5.4 of -[QUIC-TLS].¶
-The sender MUST use a packet number size able to represent more than twice as large a range as the difference between the largest acknowledged packet and the packet number being sent. A peer receiving the packet will then correctly decode the packet number, unless the packet is delayed in transit such that it arrives after many higher-numbered packets have been received. An endpoint SHOULD use a large enough packet number encoding to allow the packet number to be recovered even if the packet arrives after packets that are sent afterwards.¶
-As a result, the size of the packet number encoding is at least one bit more -than the base-2 logarithm of the number of contiguous unacknowledged packet -numbers, including the new packet.¶
-For example, if an endpoint has received an acknowledgment for packet 0xabe8bc, -sending a packet with a number of 0xac5c02 requires a packet number encoding -with 16 bits or more; whereas the 24-bit packet number encoding is needed to -send a packet with a number of 0xace8fe.¶
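As a non-normative sketch, the rule above (at least one bit more than the base-2 logarithm of the number of contiguous unacknowledged packets) gives the encoding sizes in the example; the function name is hypothetical:

```python
def packet_number_length(pn, largest_acked=None):
    """Bytes needed to encode pn: at least one bit more than
    log2 of the contiguous unacknowledged range, including pn."""
    if largest_acked is None:
        num_unacked = pn + 1  # nothing acknowledged yet
    else:
        num_unacked = pn - largest_acked
    min_bits = num_unacked.bit_length() + 1
    return (min_bits + 7) // 8  # round up to whole bytes
```

With 0xabe8bc acknowledged, sending 0xac5c02 needs 2 bytes (16 bits) and 0xace8fe needs 3 bytes (24 bits), matching the example.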
-At a receiver, protection of the packet number is removed prior to recovering -the full packet number. The full packet number is then reconstructed based on -the number of significant bits present, the value of those bits, and the largest -packet number received on a successfully authenticated packet. Recovering the -full packet number is necessary to successfully remove packet protection.¶
-Once header protection is removed, the packet number is decoded by finding the -packet number value that is closest to the next expected packet. The next -expected packet is the highest received packet number plus one. For example, if -the highest successfully authenticated packet had a packet number of 0xa82f30ea, -then a packet containing a 16-bit value of 0x9b32 will be decoded as 0xa82f9b32. -Example pseudo-code for packet number decoding can be found in -Appendix A.¶
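The decoding procedure referenced above (Appendix A of this document) can be transcribed into Python; this is a non-normative rendering of that pseudo-code:

```python
def decode_packet_number(largest_pn, truncated_pn, pn_nbits):
    """Recover the full packet number closest to largest_pn + 1
    that matches truncated_pn in its low pn_nbits bits."""
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn
    # Adjust by one window if the candidate is too far from expected,
    # staying within the 0..2^62-1 packet number range.
    if (candidate_pn <= expected_pn - pn_hwin
            and candidate_pn < (1 << 62) - pn_win):
        return candidate_pn + pn_win
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win
    return candidate_pn
```

Using the example above: with highest authenticated packet 0xa82f30ea, the 16-bit value 0x9b32 decodes to 0xa82f9b32.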
-Long headers are used for packets that are sent prior to the establishment -of 1-RTT keys. Once 1-RTT keys are available, -a sender switches to sending packets using the short header -(Section 17.3). The long form allows for special packets - such as the -Version Negotiation packet - to be represented in this uniform fixed-length -packet format. Packets that use the long header contain the following fields:¶
-In this version of QUIC, the following packet types with the long header are -defined:¶
Type | Name | Section |
---|---|---|
0x0 | Initial | Section 17.2.2 |
0x1 | 0-RTT | Section 17.2.3 |
0x2 | Handshake | Section 17.2.4 |
0x3 | Retry | Section 17.2.5 |
The header form bit, connection ID lengths byte, Destination and Source -Connection ID fields, and Version fields of a long header packet are -version-independent. The other fields in the first byte are version-specific. -See [QUIC-INVARIANTS] for details on how packets from different versions of -QUIC are interpreted.¶
-The interpretation of the fields and the payload are specific to a version and -packet type. While type-specific semantics for this version are described in -the following sections, several long-header packets in this version of QUIC -contain these additional fields:¶
-A Version Negotiation packet is inherently not version-specific. Upon receipt by -a client, it will be identified as a Version Negotiation packet based on the -Version field having a value of 0.¶
-The Version Negotiation packet is a response to a client packet that contains a -version that is not supported by the server, and is only sent by servers.¶
-The layout of a Version Negotiation packet is:¶
-The value in the Unused field is selected randomly by the server. Clients MUST -ignore the value of this field. Servers SHOULD set the most significant bit of -this field (0x40) to 1 so that Version Negotiation packets appear to have the -Fixed Bit field.¶
-The Version field of a Version Negotiation packet MUST be set to 0x00000000.¶
-The server MUST include the value from the Source Connection ID field of the -packet it receives in the Destination Connection ID field. The value for Source -Connection ID MUST be copied from the Destination Connection ID of the received -packet, which is initially randomly selected by a client. Echoing both -connection IDs gives clients some assurance that the server received the packet -and that the Version Negotiation packet was not generated by an off-path -attacker.¶
-As future versions of QUIC may support Connection IDs larger than the version 1 -limit, Version Negotiation packets could carry Connection IDs that are longer -than 20 bytes.¶
-The remainder of the Version Negotiation packet is a list of 32-bit versions -which the server supports.¶
-A Version Negotiation packet cannot be explicitly acknowledged in an ACK frame -by a client. Receiving another Initial packet implicitly acknowledges a Version -Negotiation packet.¶
-The Version Negotiation packet does not include the Packet Number and Length -fields present in other packets that use the long header form. Consequently, -a Version Negotiation packet consumes an entire UDP datagram.¶
-A server MUST NOT send more than one Version Negotiation packet in response to a -single UDP datagram.¶
-See Section 6 for a description of the version negotiation -process.¶
-An Initial packet uses long headers with a type value of 0x0. It carries the -first CRYPTO frames sent by the client and server to perform key exchange, and -carries ACKs in either direction.¶
-The Initial packet contains a long header as well as the Length and Packet Number fields. The first byte contains the Reserved and Packet Number Length bits. Between the SCID and Length fields, there are two additional fields specific to the Initial packet.¶
-In order to prevent tampering by version-unaware middleboxes, Initial packets -are protected with connection- and version-specific keys (Initial keys) as -described in [QUIC-TLS]. This protection does not provide confidentiality or -integrity against on-path attackers, but provides some level of protection -against off-path attackers.¶
-The client and server use the Initial packet type for any packet that contains -an initial cryptographic handshake message. This includes all cases where a new -packet containing the initial cryptographic message needs to be created, such as -the packets sent after receiving a Retry packet (Section 17.2.5).¶
-A server sends its first Initial packet in response to a client Initial. A -server may send multiple Initial packets. The cryptographic key exchange could -require multiple round trips or retransmissions of this data.¶
-The payload of an Initial packet includes a CRYPTO frame (or frames) containing -a cryptographic handshake message, ACK frames, or both. PING, PADDING, and -CONNECTION_CLOSE frames are also permitted. An endpoint that receives an -Initial packet containing other frames can either discard the packet as spurious -or treat it as a connection error.¶
-The first packet sent by a client always includes a CRYPTO frame that contains -the start or all of the first cryptographic handshake message. The first -CRYPTO frame sent always begins at an offset of 0 (see Section 7).¶
-Note that if the server sends a HelloRetryRequest, the client will send another -series of Initial packets. These Initial packets will continue the -cryptographic handshake and will contain CRYPTO frames starting at an offset -matching the size of the CRYPTO frames sent in the first flight of Initial -packets.¶
-A client stops both sending and processing Initial packets when it sends its -first Handshake packet. A server stops sending and processing Initial packets -when it receives its first Handshake packet. Though packets might still be in -flight or awaiting acknowledgment, no further Initial packets need to be -exchanged beyond this point. Initial packet protection keys are discarded (see -Section 4.10.1 of [QUIC-TLS]) along with any loss recovery and congestion -control state (see Section 6.5 of [QUIC-RECOVERY]).¶
-Any data in CRYPTO frames is discarded - and no longer retransmitted - when -Initial keys are discarded.¶
-A 0-RTT packet uses long headers with a type value of 0x1, followed by the -Length and Packet Number fields. The first byte contains the Reserved and Packet -Number Length bits. It is used to carry "early" data from the client to the -server as part of the first flight, prior to handshake completion. As part of -the TLS handshake, the server can accept or reject this early data.¶
-See Section 2.3 of [TLS13] for a discussion of 0-RTT data and its -limitations.¶
-Packet numbers for 0-RTT protected packets use the same space as 1-RTT protected packets.¶
-After a client receives a Retry packet, 0-RTT packets are likely to have been -lost or discarded by the server. A client SHOULD attempt to resend data in -0-RTT packets after it sends a new Initial packet.¶
-A client MUST NOT reset the packet number it uses for 0-RTT packets, since the -keys used to protect 0-RTT packets will not change as a result of responding to -a Retry packet. Sending packets with the same packet number in that case is -likely to compromise the packet protection for all 0-RTT packets because the -same key and nonce could be used to protect different content.¶
-A client only receives acknowledgments for its 0-RTT packets once the handshake -is complete. Consequently, a server might expect 0-RTT packets to start with a -packet number of 0. Therefore, in determining the length of the packet number -encoding for 0-RTT packets, a client MUST assume that all packets up to the -current packet number are in flight, starting from a packet number of 0. Thus, -0-RTT packets could need to use a longer packet number encoding.¶
-A client MUST NOT send 0-RTT packets once it starts processing 1-RTT packets -from the server. This means that 0-RTT packets cannot contain any response to -frames from 1-RTT packets. For instance, a client cannot send an ACK frame in a -0-RTT packet, because that can only acknowledge a 1-RTT packet. An -acknowledgment for a 1-RTT packet MUST be carried in a 1-RTT packet.¶
-A server SHOULD treat a violation of remembered limits as a connection error of -an appropriate type (for instance, a FLOW_CONTROL_ERROR for exceeding stream -data limits).¶
-A Handshake packet uses long headers with a type value of 0x2, followed by the -Length and Packet Number fields. The first byte contains the Reserved and -Packet Number Length bits. It is used to carry acknowledgments and -cryptographic handshake messages from the server and client.¶
-Once a client has received a Handshake packet from a server, it uses Handshake -packets to send subsequent cryptographic handshake messages and acknowledgments -to the server.¶
-The Destination Connection ID field in a Handshake packet contains a connection -ID that is chosen by the recipient of the packet; the Source Connection ID -includes the connection ID that the sender of the packet wishes to use (see -Section 7.2).¶
-Handshake packets are their own packet number space, and thus the first -Handshake packet sent by a server contains a packet number of 0.¶
-The payload of this packet contains CRYPTO frames and could contain PING, -PADDING, or ACK frames. Handshake packets MAY contain CONNECTION_CLOSE frames. -Endpoints MUST treat receipt of Handshake packets with other frames as a -connection error.¶
-Like Initial packets (see Section 17.2.2.1), data in CRYPTO frames for -Handshake packets is discarded - and no longer retransmitted - when Handshake -protection keys are discarded.¶
-A Retry packet uses a long packet header with a type value of 0x3. It carries -an address validation token created by the server. It is used by a server that -wishes to perform a retry (see Section 8.1).¶
-A Retry packet (shown in Figure 14) does not contain any protected -fields. The value in the Unused field is selected randomly by the server. In -addition to the long header, it contains these additional fields:¶
-The server populates the Destination Connection ID with the connection ID that -the client included in the Source Connection ID of the Initial packet.¶
-The server includes a connection ID of its choice in the Source Connection ID field. This value MUST NOT be equal to the Destination Connection ID field of the packet sent by the client. A client MUST discard a Retry packet that contains a Source Connection ID field that is identical to the Destination Connection ID field of its Initial packet. The client MUST use the value from the Source Connection ID field of the Retry packet in the Destination Connection ID field of subsequent packets that it sends.¶
-A server MAY send Retry packets in response to Initial and 0-RTT packets. A -server can either discard or buffer 0-RTT packets that it receives. A server -can send multiple Retry packets as it receives Initial or 0-RTT packets. A -server MUST NOT send more than one Retry packet in response to a single UDP -datagram.¶
-A client MUST accept and process at most one Retry packet for each connection -attempt. After the client has received and processed an Initial or Retry packet -from the server, it MUST discard any subsequent Retry packets that it receives.¶
-Clients MUST discard Retry packets that have a Retry Integrity Tag that cannot -be validated, see the Retry Packet Integrity section of [QUIC-TLS]. This -diminishes an off-path attacker's ability to inject a Retry packet and protects -against accidental corruption of Retry packets. A client MUST discard a Retry -packet with a zero-length Retry Token field.¶
-The client responds to a Retry packet with an Initial packet that includes the -provided Retry Token to continue connection establishment.¶
-A client sets the Destination Connection ID field of this Initial packet to the -value from the Source Connection ID in the Retry packet. Changing Destination -Connection ID also results in a change to the keys used to protect the Initial -packet. It also sets the Token field to the token provided in the Retry. The -client MUST NOT change the Source Connection ID because the server could include -the connection ID as part of its token validation logic (see -Section 8.1.4).¶
-The next Initial packet from the client uses the connection ID and token values -from the Retry packet (see Section 7.2). Aside from this, -the Initial packet sent by the client is subject to the same restrictions as the -first Initial packet. A client MUST use the same cryptographic handshake -message it includes in this packet. A server MAY treat a packet that -contains a different cryptographic handshake message as a connection error or -discard it.¶
-A client MAY attempt 0-RTT after receiving a Retry packet by sending 0-RTT -packets to the connection ID provided by the server. A client MUST NOT change -the cryptographic handshake message it sends in response to receiving a Retry.¶
-A client MUST NOT reset the packet number for any packet number space after -processing a Retry packet; Section 17.2.3 contains more information on this.¶
-A server acknowledges the use of a Retry packet for a connection using the -original_connection_id transport parameter (see -Section 18.2). If the server sends a Retry packet, it -MUST include the Destination Connection ID field from the client's first -Initial packet in the transport parameter.¶
-If the client received and processed a Retry packet, it MUST validate that the -original_connection_id transport parameter is present and correct; otherwise, it -MUST validate that the transport parameter is absent. A client MUST treat a -failed validation as a connection error of type TRANSPORT_PARAMETER_ERROR.¶
-A Retry packet does not include a packet number and cannot be explicitly -acknowledged by a client.¶
-This version of QUIC defines a single packet type which uses the -short packet header.¶
-The short header can be used after the version and 1-RTT keys are negotiated. -Packets that use the short header contain the following fields:¶
-The header form bit and the connection ID field of a short header packet are -version-independent. The remaining fields are specific to the selected QUIC -version. See [QUIC-INVARIANTS] for details on how packets from different -versions of QUIC are interpreted.¶
-The latency spin bit enables passive latency monitoring from observation points -on the network path throughout the duration of a connection. The spin bit is -only present in the short packet header, since it is possible to measure the -initial RTT of a connection by observing the handshake. Therefore, the spin bit -is available after version negotiation and connection establishment are -completed. On-path measurement and use of the latency spin bit is further -discussed in [QUIC-MANAGEABILITY].¶
-The spin bit is an OPTIONAL feature of QUIC. A QUIC stack that chooses to -support the spin bit MUST implement it as specified in this section.¶
-Each endpoint unilaterally decides if the spin bit is enabled or disabled for a -connection. Implementations MUST allow administrators of clients and servers -to disable the spin bit either globally or on a per-connection basis. Even when -the spin bit is not disabled by the administrator, endpoints MUST disable their -use of the spin bit for a random selection of at least one in every 16 network -paths, or for one in every 16 connection IDs. As each endpoint disables the -spin bit independently, this ensures that the spin bit signal is disabled on -approximately one in eight network paths.¶
-When the spin bit is disabled, endpoints MAY set the spin bit to any value, and -MUST ignore any incoming value. It is RECOMMENDED that endpoints set the spin -bit to a random value either chosen independently for each packet or chosen -independently for each connection ID.¶
-If the spin bit is enabled for the connection, the endpoint maintains a spin -value and sets the spin bit in the short header to the currently stored -value when a packet with a short header is sent out. The spin value is -initialized to 0 in the endpoint at connection start. Each endpoint also -remembers the highest packet number seen from its peer on the connection.¶
-When a server receives a short header packet that increments the highest -packet number seen by the server from the client, it sets the spin value to be -equal to the spin bit in the received packet.¶
-When a client receives a short header packet that increments the highest -packet number seen by the client from the server, it sets the spin value to the -inverse of the spin bit in the received packet.¶
-An endpoint resets its spin value to zero when sending the first packet of a -given connection with a new connection ID. This reduces the risk that transient -spin bit state can be used to link flows across connection migration or ID -change.¶
-With this mechanism, the server reflects the spin value received, while the -client 'spins' it after one RTT. On-path observers can measure the time -between two spin bit toggle events to estimate the end-to-end RTT of a -connection.¶
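As a non-normative sketch of the update rules above (the function name is hypothetical): on each short-header packet that advances the highest seen packet number, a server stores the received spin bit unchanged, while a client stores its inverse, so the bit toggles once per round trip:

```python
def next_spin_value(is_server, incoming_spin):
    # Server reflects the peer's spin bit; client inverts it.
    return incoming_spin if is_server else incoming_spin ^ 1
```

An on-path observer timing successive toggles of the bit thus sees one transition per end-to-end RTT.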
-The extension_data field of the quic_transport_parameters extension defined in [QUIC-TLS] contains the QUIC transport parameters. They are encoded as a sequence of transport parameters, as shown in Figure 16:¶
Each transport parameter is encoded as an (identifier, length, value) tuple, -as shown in Figure 17:¶
-The Transport Param Length field contains the length of the Transport -Parameter Value field.¶
-QUIC encodes transport parameters into a sequence of bytes, which are then -included in the cryptographic handshake.¶
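As a non-normative sketch, one (identifier, length, value) tuple can be serialized as shown below, assuming the identifier and length are variable-length integers as in Section 16 (the function names and the example identifier 0x21 are hypothetical):

```python
def encode_varint(value):
    """Shortest QUIC variable-length integer encoding of value."""
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        if value < 1 << (8 * length - 2):
            out = bytearray(value.to_bytes(length, "big"))
            out[0] |= prefix << 6
            return bytes(out)
    raise ValueError("value exceeds 2^62 - 1")

def encode_transport_parameter(param_id, value):
    # (identifier, length, value): id and length as varints,
    # followed by the opaque value bytes
    return encode_varint(param_id) + encode_varint(len(value)) + value
```

For instance, an illustrative parameter 0x21 with a two-byte value serializes as four bytes: id, length 2, then the value.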
-Transport parameters with an identifier of the form 31 * N + 27 for integer values of N are reserved to exercise the requirement that unknown transport parameters be ignored. These transport parameters have no semantics, and may carry arbitrary values.¶
This section details the transport parameters defined in this document.¶
-Many transport parameters listed here have integer values. Those transport -parameters that are identified as integers use a variable-length integer -encoding (see Section 16) and have a default value of 0 if the -transport parameter is absent, unless otherwise stated.¶
-The following transport parameters are defined:¶
-If present, transport parameters that set initial flow control limits -(initial_max_stream_data_bidi_local, initial_max_stream_data_bidi_remote, and -initial_max_stream_data_uni) are equivalent to sending a MAX_STREAM_DATA frame -(Section 19.10) on every stream of the corresponding type -immediately after opening. If the transport parameter is absent, streams of -that type start with a flow control limit of 0.¶
-A client MUST NOT include server-only transport parameters -(original_connection_id, stateless_reset_token, or preferred_address). A server -MUST treat receipt of any of these transport parameters as a connection error of -type TRANSPORT_PARAMETER_ERROR.¶
-As described in Section 12.4, packets contain one or more frames. This section -describes the format and semantics of the core QUIC frame types.¶
-The PADDING frame (type=0x00) has no semantic value. PADDING frames can be used -to increase the size of a packet. Padding can be used to increase an initial -client packet to the minimum required size, or to provide protection against -traffic analysis for protected packets.¶
-A PADDING frame has no content. That is, a PADDING frame consists of the single -byte that identifies the frame as a PADDING frame.¶
-Endpoints can use PING frames (type=0x01) to verify that their peers are still -alive or to check reachability to the peer. The PING frame contains no -additional fields.¶
-The receiver of a PING frame simply needs to acknowledge the packet containing -this frame.¶
-The PING frame can be used to keep a connection alive when an application or -application protocol wishes to prevent the connection from timing out. An -application protocol SHOULD provide guidance about the conditions under which -generating a PING is recommended. This guidance SHOULD indicate whether it is -the client or the server that is expected to send the PING. Having both -endpoints send PING frames without coordination can produce an excessive number -of packets and poor performance.¶
-A connection will time out if no packets are sent or received for a period -longer than the time negotiated using the max_idle_timeout transport parameter -(see Section 10). However, state in middleboxes might time out earlier -than that. Though REQ-5 in [RFC4787] recommends a 2 minute timeout -interval, experience shows that sending packets every 15 to 30 seconds is -necessary to prevent the majority of middleboxes from losing state for UDP -flows.¶
-Receivers send ACK frames (types 0x02 and 0x03) to inform senders of packets -they have received and processed. The ACK frame contains one or more ACK Ranges. -ACK Ranges identify acknowledged packets. If the frame type is 0x03, ACK frames -also contain the sum of QUIC packets with associated ECN marks received on the -connection up until this point. QUIC implementations MUST properly handle both -types and, if they have enabled ECN for packets they send, they SHOULD use the -information in the ECN section to manage their congestion state.¶
-QUIC acknowledgements are irrevocable. Once acknowledged, a packet remains -acknowledged, even if it does not appear in a future ACK frame. This is unlike -TCP SACKs ([RFC2018]).¶
-Packets from different packet number spaces can be identified using the same -numeric value. An acknowledgment for a packet needs to indicate both a packet -number and a packet number space. This is accomplished by having each ACK frame -only acknowledge packet numbers in the same space as the packet in which the -ACK frame is contained.¶
-Version Negotiation and Retry packets cannot be acknowledged because they do not -contain a packet number. Rather than relying on ACK frames, these packets are -implicitly acknowledged by the next Initial packet sent by the client.¶
-An ACK frame is shown in Figure 19.¶
-ACK frames contain the following fields:¶
-ack_delay_exponent transport parameter set by the sender of the ACK frame (see Section 18.2). Scaling in this fashion allows for a larger range of values with a shorter encoding at the cost of lower resolution. Because the receiver doesn't use the ACK Delay for Initial and Handshake packets, a sender SHOULD send a value of 0.¶
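As a non-normative sketch of the scaling described above, the decoded delay is the encoded value multiplied by two raised to the ack_delay_exponent; the function name is hypothetical, and 3 is the default exponent when the transport parameter is absent (Section 18.2):

```python
def ack_delay_microseconds(encoded_ack_delay, ack_delay_exponent=3):
    # decoded delay = encoded value * 2 ** ack_delay_exponent
    return encoded_ack_delay << ack_delay_exponent
```

An encoded value of 100 with the default exponent therefore represents 800 microseconds; a larger exponent trades resolution for range.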
-The ACK Ranges field consists of alternating Gap and ACK Range values in -descending packet number order. The number of Gap and ACK Range values is -determined by the ACK Range Count field; one of each value is present for each -value in the ACK Range Count field.¶
-ACK Ranges are structured as shown in Figure 20.¶
-The fields that form the ACK Ranges are:¶
-Gap and ACK Range values use a relative integer encoding for efficiency. Though each encoded value is positive, the values are subtracted, so that each ACK Range describes progressively lower-numbered packets.¶
-Each ACK Range acknowledges a contiguous range of packets by indicating the -number of acknowledged packets that precede the largest packet number in that -range. A value of zero indicates that only the largest packet number is -acknowledged. Larger ACK Range values indicate a larger range, with -corresponding lower values for the smallest packet number in the range. Thus, -given a largest packet number for the range, the smallest value is determined by -the formula:¶
smallest = largest - ack_range¶
An ACK Range acknowledges all packets between the smallest packet number and the largest, inclusive.¶
The largest value for an ACK Range is determined by cumulatively subtracting the size of all preceding ACK Ranges and Gaps.¶
Each Gap indicates a range of packets that are not being acknowledged. The number of packets in the gap is one higher than the encoded value of the Gap field.¶
The value of the Gap field establishes the largest packet number value for the subsequent ACK Range using the following formula:¶
largest = previous_smallest - gap - 2¶
If any computed packet number is negative, an endpoint MUST generate a connection error of type FRAME_ENCODING_ERROR.¶
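As an illustration, the two formulas above and the negative-value check can be combined into a short decoder. This is a sketch only; the function and exception names are illustrative and not part of the specification.

```python
class FrameEncodingError(Exception):
    """Raised when a computed packet number would be negative."""

def decode_ack_ranges(largest_ack, first_ack_range, gaps_and_ranges):
    """Return (smallest, largest) tuples for each acknowledged range.

    gaps_and_ranges is a list of (gap, ack_range) pairs in wire order,
    walked in descending packet number order.
    """
    largest = largest_ack
    smallest = largest - first_ack_range          # smallest = largest - ack_range
    if smallest < 0:
        raise FrameEncodingError
    ranges = [(smallest, largest)]
    for gap, ack_range in gaps_and_ranges:
        largest = smallest - gap - 2              # largest = previous_smallest - gap - 2
        smallest = largest - ack_range
        if smallest < 0:
            raise FrameEncodingError
        ranges.append((smallest, largest))
    return ranges

# Largest Acknowledged = 10, First ACK Range = 2 (acknowledges 8..10),
# then Gap = 0b encoded as 1 (packets 6 and 7 unacknowledged) and
# ACK Range = 0 (acknowledges only packet 5).
print(decode_ack_ranges(10, 2, [(1, 0)]))  # → [(8, 10), (5, 5)]
```

Note that a Gap value of 1 skips two packet numbers, since the number of packets in the gap is one higher than the encoded value.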
The ACK frame uses the least significant bit (that is, type 0x03) to indicate ECN feedback and report receipt of QUIC packets with associated ECN codepoints of ECT(0), ECT(1), or CE in the packet's IP header. ECN Counts are only present, and only parsed, when the ACK frame type is 0x03.¶
There are 3 ECN counts, as shown in Figure 21.¶
The three ECN Counts are:¶
ECN counts are maintained separately for each packet number space.¶
An endpoint uses a RESET_STREAM frame (type=0x04) to abruptly terminate the sending part of a stream.¶
After sending a RESET_STREAM, an endpoint ceases transmission and retransmission of STREAM frames on the identified stream. A receiver of RESET_STREAM can discard any data that it already received on that stream.¶
An endpoint that receives a RESET_STREAM frame for a send-only stream MUST terminate the connection with error STREAM_STATE_ERROR.¶
The RESET_STREAM frame is shown in Figure 22.¶
RESET_STREAM frames contain the following fields:¶
An endpoint uses a STOP_SENDING frame (type=0x05) to communicate that incoming data is being discarded on receipt at application request. STOP_SENDING requests that a peer cease transmission on a stream.¶
A STOP_SENDING frame can be sent for streams in the Recv or Size Known states (see Section 3.1). Receiving a STOP_SENDING frame for a locally-initiated stream that has not yet been created MUST be treated as a connection error of type STREAM_STATE_ERROR. An endpoint that receives a STOP_SENDING frame for a receive-only stream MUST terminate the connection with error STREAM_STATE_ERROR.¶
The STOP_SENDING frame is shown in Figure 23.¶
STOP_SENDING frames contain the following fields:¶
The CRYPTO frame (type=0x06) is used to transmit cryptographic handshake messages. It can be sent in all packet types except 0-RTT. The CRYPTO frame offers the cryptographic protocol an in-order stream of bytes. CRYPTO frames are functionally identical to STREAM frames, except that they do not bear a stream identifier; they are not flow controlled; and they do not carry markers for optional offset, optional length, and the end of the stream.¶
The CRYPTO frame is shown in Figure 24.¶
CRYPTO frames contain the following fields:¶
There is a separate flow of cryptographic handshake data in each encryption level, each of which starts at an offset of 0. This implies that each encryption level is treated as a separate CRYPTO stream of data.¶
The largest offset delivered on a stream (the sum of the offset and data length) cannot exceed 2^62-1. Receipt of a frame that exceeds this limit MUST be treated as a connection error of type FRAME_ENCODING_ERROR or CRYPTO_BUFFER_EXCEEDED.¶
Unlike STREAM frames, which include a Stream ID indicating to which stream the data belongs, the CRYPTO frame carries data for a single stream per encryption level. The stream does not have an explicit end, so CRYPTO frames do not have a FIN bit.¶
A server sends a NEW_TOKEN frame (type=0x07) to provide the client with a token to send in the header of an Initial packet for a future connection.¶
The NEW_TOKEN frame is shown in Figure 25.¶
NEW_TOKEN frames contain the following fields:¶
An endpoint might receive multiple NEW_TOKEN frames that contain the same token value if packets containing the frame are incorrectly determined to be lost. Endpoints are responsible for discarding duplicate values, which might be used to link connection attempts; see Section 8.1.3.¶
Clients MUST NOT send NEW_TOKEN frames. Servers MUST treat receipt of a NEW_TOKEN frame as a connection error of type PROTOCOL_VIOLATION.¶
STREAM frames implicitly create a stream and carry stream data. The STREAM frame takes the form 0b00001XXX (or the set of values from 0x08 to 0x0f). The value of the three low-order bits of the frame type determines the fields that are present in the frame.¶
An endpoint MUST terminate the connection with error STREAM_STATE_ERROR if it receives a STREAM frame for a locally-initiated stream that has not yet been created, or for a send-only stream.¶
The STREAM frames are shown in Figure 26.¶
STREAM frames contain the following fields:¶
When a Stream Data field has a length of 0, the offset in the STREAM frame is the offset of the next byte that would be sent.¶
The first byte in the stream has an offset of 0. The largest offset delivered on a stream (the sum of the offset and data length) cannot exceed 2^62-1, as it is not possible to provide flow control credit for that data. Receipt of a frame that exceeds this limit MUST be treated as a connection error of type FRAME_ENCODING_ERROR or FLOW_CONTROL_ERROR.¶
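The final-offset check described above is a single comparison. A minimal sketch (the helper name is hypothetical, not from the specification):

```python
# Largest value representable as a QUIC variable-length integer.
MAX_OFFSET = 2**62 - 1

def stream_frame_in_bounds(offset: int, length: int) -> bool:
    """True if the frame's largest delivered offset (offset + length) is
    representable; a frame exceeding the limit is treated as a connection
    error of type FRAME_ENCODING_ERROR or FLOW_CONTROL_ERROR."""
    return offset + length <= MAX_OFFSET
```

The same bound applies to CRYPTO frames, where exceeding it is a FRAME_ENCODING_ERROR or CRYPTO_BUFFER_EXCEEDED error.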
The MAX_DATA frame (type=0x10) is used in flow control to inform the peer of the maximum amount of data that can be sent on the connection as a whole.¶
The MAX_DATA frame is shown in Figure 27.¶
MAX_DATA frames contain the following fields:¶
All data sent in STREAM frames counts toward this limit. The sum of the largest received offsets on all streams (including streams in terminal states) MUST NOT exceed the value advertised by a receiver. An endpoint MUST terminate a connection with a FLOW_CONTROL_ERROR error if it receives more data than the maximum data value that it has sent, unless this is a result of a change in the initial limits (see Section 7.3.1).¶
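Receiver-side accounting for this limit can be sketched as follows; the class and method names are illustrative, not from the specification, and the sketch omits the initial-limits exception.

```python
class FlowControlError(Exception):
    """Connection error of type FLOW_CONTROL_ERROR."""

class ConnectionFlowControl:
    def __init__(self, max_data: int):
        self.max_data = max_data   # most recent MAX_DATA value sent to the peer
        self.largest = {}          # stream ID -> largest received offset

    def on_stream_data(self, stream_id: int, offset: int, length: int) -> None:
        # Track the largest received offset per stream; entries are kept
        # even after a stream reaches a terminal state.
        self.largest[stream_id] = max(self.largest.get(stream_id, 0),
                                      offset + length)
        # The sum over all streams must not exceed the advertised MAX_DATA.
        if sum(self.largest.values()) > self.max_data:
            raise FlowControlError("peer exceeded advertised MAX_DATA")
```

Note that retransmitted or reordered STREAM frames that do not raise a stream's largest received offset consume no additional credit under this accounting.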
The MAX_STREAM_DATA frame (type=0x11) is used in flow control to inform a peer of the maximum amount of data that can be sent on a stream.¶
A MAX_STREAM_DATA frame can be sent for streams in the Recv state (see Section 3.1). Receiving a MAX_STREAM_DATA frame for a locally-initiated stream that has not yet been created MUST be treated as a connection error of type STREAM_STATE_ERROR. An endpoint that receives a MAX_STREAM_DATA frame for a receive-only stream MUST terminate the connection with error STREAM_STATE_ERROR.¶
The MAX_STREAM_DATA frame is shown in Figure 28.¶
MAX_STREAM_DATA frames contain the following fields:¶
When counting data toward this limit, an endpoint accounts for the largest received offset of data that is sent or received on the stream. Loss or reordering can mean that the largest received offset on a stream can be greater than the total size of data received on that stream. Receiving STREAM frames might not increase the largest received offset.¶
The data sent on a stream MUST NOT exceed the largest maximum stream data value advertised by the receiver. An endpoint MUST terminate a connection with a FLOW_CONTROL_ERROR error if it receives more data than the largest maximum stream data that it has sent for the affected stream, unless this is a result of a change in the initial limits (see Section 7.3.1).¶
The MAX_STREAMS frames (type=0x12 and 0x13) inform the peer of the cumulative number of streams of a given type it is permitted to open. A MAX_STREAMS frame with a type of 0x12 applies to bidirectional streams, and a MAX_STREAMS frame with a type of 0x13 applies to unidirectional streams.¶
The MAX_STREAMS frames are shown in Figure 29.¶
MAX_STREAMS frames contain the following fields:¶
Loss or reordering can cause a MAX_STREAMS frame to be received which states a lower stream limit than an endpoint has previously received. MAX_STREAMS frames which do not increase the stream limit MUST be ignored.¶
An endpoint MUST NOT open more streams than permitted by the current stream limit set by its peer. For instance, a server that receives a unidirectional stream limit of 3 is permitted to open streams 3, 7, and 11, but not stream 15. An endpoint MUST terminate a connection with a STREAM_LIMIT_ERROR error if a peer opens more streams than was permitted.¶
Note that these frames (and the corresponding transport parameters) do not describe the number of streams that can be opened concurrently. The limit includes streams that have been closed as well as those that are open.¶
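The mapping from a cumulative stream limit to concrete stream IDs follows the Section 2.1 numbering, where the two low-order bits of a stream ID encode the initiator and directionality (0x3 marks server-initiated unidirectional streams). A sketch, with an illustrative helper name:

```python
def permitted_stream_ids(limit: int, type_bits: int) -> list:
    """Stream IDs an endpoint may open under a cumulative MAX_STREAMS limit.

    Successive streams of one type are numbered 4 apart; type_bits is the
    two-bit initiator/directionality value (0x0 to 0x3).
    """
    return [4 * n + type_bits for n in range(limit)]

# A unidirectional stream limit of 3 permits server streams 3, 7, and 11,
# but not stream 15.
print(permitted_stream_ids(3, 0x3))  # → [3, 7, 11]
```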
A sender SHOULD send a DATA_BLOCKED frame (type=0x14) when it wishes to send data, but is unable to due to connection-level flow control (see Section 4). DATA_BLOCKED frames can be used as input to tuning of flow control algorithms (see Section 4.2).¶
The DATA_BLOCKED frame is shown in Figure 30.¶
DATA_BLOCKED frames contain the following fields:¶
A sender SHOULD send a STREAM_DATA_BLOCKED frame (type=0x15) when it wishes to send data, but is unable to due to stream-level flow control. This frame is analogous to DATA_BLOCKED (Section 19.12).¶
An endpoint that receives a STREAM_DATA_BLOCKED frame for a send-only stream MUST terminate the connection with error STREAM_STATE_ERROR.¶
The STREAM_DATA_BLOCKED frame is shown in Figure 31.¶
STREAM_DATA_BLOCKED frames contain the following fields:¶
A sender SHOULD send a STREAMS_BLOCKED frame (type=0x16 or 0x17) when it wishes to open a stream, but is unable to due to the maximum stream limit set by its peer (see Section 19.11). A STREAMS_BLOCKED frame of type 0x16 is used to indicate reaching the bidirectional stream limit, and a STREAMS_BLOCKED frame of type 0x17 indicates reaching the unidirectional stream limit.¶
A STREAMS_BLOCKED frame does not open the stream, but informs the peer that a new stream was needed and the stream limit prevented the creation of the stream.¶
The STREAMS_BLOCKED frames are shown in Figure 32.¶
STREAMS_BLOCKED frames contain the following fields:¶
An endpoint sends a NEW_CONNECTION_ID frame (type=0x18) to provide its peer with alternative connection IDs that can be used to break linkability when migrating connections (see Section 9.5).¶
The NEW_CONNECTION_ID frame is shown in Figure 33.¶
NEW_CONNECTION_ID frames contain the following fields:¶
An endpoint MUST NOT send this frame if it currently requires that its peer send packets with a zero-length Destination Connection ID. Changing the length of a connection ID to or from zero-length makes it difficult to identify when the value of the connection ID changed. An endpoint that is sending packets with a zero-length Destination Connection ID MUST treat receipt of a NEW_CONNECTION_ID frame as a connection error of type PROTOCOL_VIOLATION.¶
Transmission errors, timeouts, and retransmissions might cause the same NEW_CONNECTION_ID frame to be received multiple times. Receipt of the same frame multiple times MUST NOT be treated as a connection error. A receiver can use the sequence number supplied in the NEW_CONNECTION_ID frame to distinguish new connection IDs from old ones.¶
If an endpoint receives a NEW_CONNECTION_ID frame that repeats a previously issued connection ID with a different Stateless Reset Token or a different sequence number, or if a sequence number is used for different connection IDs, the endpoint MAY treat that receipt as a connection error of type PROTOCOL_VIOLATION.¶
The Retire Prior To field counts connection IDs established during connection setup and the preferred_address transport parameter (see Section 5.1.2). The Retire Prior To field MUST be less than or equal to the Sequence Number field. Receiving a value greater than the Sequence Number MUST be treated as a connection error of type FRAME_ENCODING_ERROR.¶
Once a sender indicates a Retire Prior To value, smaller values sent in subsequent NEW_CONNECTION_ID frames have no effect. A receiver MUST ignore any Retire Prior To fields that do not increase the largest received Retire Prior To value.¶
An endpoint that receives a NEW_CONNECTION_ID frame with a sequence number smaller than the Retire Prior To field of a previously received NEW_CONNECTION_ID frame MUST immediately send a corresponding RETIRE_CONNECTION_ID frame that retires the newly received connection ID.¶
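The receiver-side rules for the Retire Prior To field can be sketched as follows. This is a simplified illustration (class and method names are hypothetical); it omits duplicate-frame and repeated-connection-ID checks.

```python
class FrameEncodingError(Exception):
    """Retire Prior To exceeded the frame's Sequence Number."""

class CidManager:
    def __init__(self):
        self.cids = {}             # sequence number -> active connection ID
        self.retire_prior_to = 0   # largest Retire Prior To received so far

    def on_new_connection_id(self, seq, cid, retire_prior_to):
        """Returns the sequence numbers to retire via RETIRE_CONNECTION_ID."""
        if retire_prior_to > seq:
            raise FrameEncodingError
        # Only increases to the largest received value take effect.
        if retire_prior_to > self.retire_prior_to:
            self.retire_prior_to = retire_prior_to
        if seq < self.retire_prior_to:
            # The newly received connection ID is retired immediately.
            return [seq]
        self.cids[seq] = cid
        retired = [s for s in self.cids if s < self.retire_prior_to]
        for s in retired:
            del self.cids[s]
        return retired
```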
An endpoint sends a RETIRE_CONNECTION_ID frame (type=0x19) to indicate that it will no longer use a connection ID that was issued by its peer. This may include the connection ID provided during the handshake. Sending a RETIRE_CONNECTION_ID frame also serves as a request to the peer to send additional connection IDs for future use (see Section 5.1). New connection IDs can be delivered to a peer using the NEW_CONNECTION_ID frame (Section 19.15).¶
Retiring a connection ID invalidates the stateless reset token associated with that connection ID.¶
The RETIRE_CONNECTION_ID frame is shown in Figure 34.¶
RETIRE_CONNECTION_ID frames contain the following fields:¶
Receipt of a RETIRE_CONNECTION_ID frame containing a sequence number greater than any previously sent to the peer MUST be treated as a connection error of type PROTOCOL_VIOLATION.¶
The sequence number specified in a RETIRE_CONNECTION_ID frame MUST NOT refer to the Destination Connection ID field of the packet in which the frame is contained. The peer MAY treat this as a connection error of type FRAME_ENCODING_ERROR.¶
An endpoint cannot send this frame if it was provided with a zero-length connection ID by its peer. An endpoint that provides a zero-length connection ID MUST treat receipt of a RETIRE_CONNECTION_ID frame as a connection error of type PROTOCOL_VIOLATION.¶
Endpoints can use PATH_CHALLENGE frames (type=0x1a) to check reachability to the peer and for path validation during connection migration.¶
The PATH_CHALLENGE frame is shown in Figure 35.¶
PATH_CHALLENGE frames contain the following fields:¶
A PATH_CHALLENGE frame containing 8 bytes that are hard to guess is sufficient to ensure that it is easier to receive the packet than it is to guess the value correctly.¶
The recipient of this frame MUST generate a PATH_RESPONSE frame (Section 19.18) containing the same Data.¶
The PATH_RESPONSE frame (type=0x1b) is sent in response to a PATH_CHALLENGE frame. Its format is identical to the PATH_CHALLENGE frame (Section 19.17).¶
If the content of a PATH_RESPONSE frame does not match the content of a PATH_CHALLENGE frame previously sent by the endpoint, the endpoint MAY generate a connection error of type PROTOCOL_VIOLATION.¶
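Generating and checking the challenge data is straightforward; a minimal sketch, assuming one list of outstanding challenges per path (function names are illustrative):

```python
import hmac  # compare_digest gives a constant-time byte comparison
import os

def new_path_challenge() -> bytes:
    """8 bytes that are hard to guess, as the Data field of PATH_CHALLENGE."""
    return os.urandom(8)

def path_response_matches(sent_challenges, response_data: bytes) -> bool:
    """True if a PATH_RESPONSE echoes any previously sent PATH_CHALLENGE."""
    return any(hmac.compare_digest(c, response_data) for c in sent_challenges)
```

A non-matching response MAY then be treated as a connection error of type PROTOCOL_VIOLATION.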
An endpoint sends a CONNECTION_CLOSE frame (type=0x1c or 0x1d) to notify its peer that the connection is being closed. The CONNECTION_CLOSE frame with a type of 0x1c is used to signal errors at only the QUIC layer, or the absence of errors (with the NO_ERROR code). The CONNECTION_CLOSE frame with a type of 0x1d is used to signal an error with the application that uses QUIC.¶
If there are open streams that haven't been explicitly closed, they are implicitly closed when the connection is closed.¶
The CONNECTION_CLOSE frames are shown in Figure 36.¶
CONNECTION_CLOSE frames contain the following fields:¶
The application-specific variant of CONNECTION_CLOSE (type 0x1d) can only be sent using 0-RTT or 1-RTT packets ([QUIC-TLS], Section 4). When an application wishes to abandon a connection during the handshake, an endpoint can send a CONNECTION_CLOSE frame (type 0x1c) with an error code of APPLICATION_ERROR in an Initial or a Handshake packet.¶
The server uses the HANDSHAKE_DONE frame (type=0x1e) to signal confirmation of the handshake to the client. The HANDSHAKE_DONE frame contains no additional fields.¶
This frame can only be sent by the server. Servers MUST NOT send a HANDSHAKE_DONE frame before completing the handshake. A server MUST treat receipt of a HANDSHAKE_DONE frame as a connection error of type PROTOCOL_VIOLATION.¶
QUIC frames do not use a self-describing encoding. An endpoint therefore needs to understand the syntax of all frames before it can successfully process a packet. This allows for efficient encoding of frames, but it means that an endpoint cannot send a frame of a type that is unknown to its peer.¶
An extension to QUIC that wishes to use a new type of frame MUST first ensure that a peer is able to understand the frame. An endpoint can use a single transport parameter to signal its willingness to receive one or more extension frame types.¶
Extensions that modify or replace core protocol functionality (including frame types) will be difficult to combine with other extensions that modify or replace the same functionality unless the behavior of the combination is explicitly defined. Such extensions SHOULD define their interaction with previously-defined extensions modifying the same protocol components.¶
Extension frames MUST be congestion controlled and MUST cause an ACK frame to be sent. The exception is extension frames that replace or supplement the ACK frame. Extension frames are not included in flow control unless specified in the extension.¶
An IANA registry is used to manage the assignment of frame types; see Section 22.3.¶
QUIC error codes are 62-bit unsigned integers.¶
This section lists the defined QUIC transport error codes that may be used in a CONNECTION_CLOSE frame. These errors apply to the entire connection.¶
See Section 22.4 for details of registering new error codes.¶
In defining these error codes, several principles are applied. Error conditions that might require specific action on the part of a recipient are given unique codes. Errors that represent common conditions are given specific codes. Absent either of these conditions, error codes are used to identify a general function of the stack, like flow control or transport parameter handling. Finally, generic errors are provided for conditions where implementations are unable or unwilling to use more specific codes.¶
Application protocol error codes are 62-bit unsigned integers, but the management of application error codes is left to application protocols. Application protocol error codes are used for the RESET_STREAM frame (Section 19.4), the STOP_SENDING frame (Section 19.5), and the CONNECTION_CLOSE frame with a type of 0x1d (Section 19.19).¶
As an encrypted and authenticated transport, QUIC provides a range of protections against denial of service. Once the cryptographic handshake is complete, QUIC endpoints discard most packets that are not authenticated, greatly limiting the ability of an attacker to interfere with existing connections.¶
Once a connection is established, QUIC endpoints might accept some unauthenticated ICMP packets (see Section 14.2), but the use of these packets is extremely limited. The only other type of packet that an endpoint might accept is a stateless reset (Section 10.4), which relies on the token being kept secret until it is used.¶
During the creation of a connection, QUIC only provides protection against attack from off the network path. All QUIC packets contain proof that the recipient saw a preceding packet from its peer.¶
Addresses cannot change during the handshake, so endpoints can discard packets that are received on a different network path.¶
The Source and Destination Connection ID fields are the primary means of protection against off-path attack during the handshake. These are required to match those set by a peer. Except for Initial and stateless reset packets, an endpoint only accepts packets that include a Destination Connection ID field that matches a value the endpoint previously chose. This is the only protection offered for Version Negotiation packets.¶
The Destination Connection ID field in an Initial packet is selected by a client to be unpredictable, which serves an additional purpose. The packets that carry the cryptographic handshake are protected with a key that is derived from this connection ID and a salt specific to the QUIC version. This allows endpoints to use the same process for authenticating packets that they receive as they use after the cryptographic handshake completes. Packets that cannot be authenticated are discarded. Protecting packets in this fashion provides a strong assurance that the sender of the packet saw the Initial packet and understood it.¶
These protections are not intended to be effective against an attacker that is able to receive QUIC packets prior to the connection being established. Such an attacker can potentially send packets that will be accepted by QUIC endpoints. This version of QUIC attempts to detect this sort of attack, but it expects that endpoints will fail to establish a connection rather than recovering. For the most part, the cryptographic handshake protocol [QUIC-TLS] is responsible for detecting tampering during the handshake.¶
Endpoints are permitted to use other methods to detect and attempt to recover from interference with the handshake. Invalid packets may be identified and discarded using other methods, but no specific method is mandated in this document.¶
An attacker might be able to receive an address validation token (Section 8) from a server and then release the IP address it used to acquire that token. At a later time, the attacker may initiate a 0-RTT connection with a server by spoofing this same address, which might now address a different (victim) endpoint. The attacker can thus potentially cause the server to send an initial congestion window's worth of data towards the victim.¶
Servers SHOULD provide mitigations for this attack by limiting the usage and lifetime of address validation tokens (see Section 8.1.3).¶
An endpoint that acknowledges packets it has not received might cause a congestion controller to permit sending at rates beyond what the network supports. An endpoint MAY skip packet numbers when sending packets to detect this behavior. An endpoint can then immediately close the connection with a connection error of type PROTOCOL_VIOLATION (see Section 10.3).¶
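Skipping packet numbers to catch optimistic acknowledgments can be sketched as follows. This is a hypothetical illustration; the class name and the skip probability are assumptions, not values from the specification.

```python
import random

class ProtocolViolation(Exception):
    """Connection error of type PROTOCOL_VIOLATION."""

class PacketNumberAllocator:
    def __init__(self, skip_probability: float = 0.01):
        self.next_pn = 0
        self.skipped = set()                 # packet numbers never sent
        self.skip_probability = skip_probability

    def allocate(self) -> int:
        """Return the next packet number, occasionally skipping one."""
        if random.random() < self.skip_probability:
            self.skipped.add(self.next_pn)   # this number is never used
            self.next_pn += 1
        pn = self.next_pn
        self.next_pn += 1
        return pn

    def on_ack(self, acked_pn: int) -> None:
        if acked_pn in self.skipped:
            # The peer acknowledged a packet that was never sent.
            raise ProtocolViolation("acknowledgment of a skipped packet number")
```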
The attacks commonly known as Slowloris [SLOWLORIS] try to keep many connections to the target endpoint open and hold them open as long as possible. These attacks can be executed against a QUIC endpoint by generating the minimum amount of activity necessary to avoid being closed for inactivity. This might involve sending small amounts of data, gradually opening flow control windows in order to control the sender rate, or manufacturing ACK frames that simulate a high loss rate.¶
QUIC deployments SHOULD provide mitigations for the Slowloris attacks, such as increasing the maximum number of clients the server will allow, limiting the number of connections a single IP address is allowed to make, imposing restrictions on the minimum transfer speed a connection is allowed to have, and restricting the length of time an endpoint is allowed to stay connected.¶
An adversarial sender might intentionally send fragments of stream data in order to cause disproportionate receive buffer memory commitment and/or creation of a large and inefficient data structure.¶
An adversarial receiver might intentionally not acknowledge packets containing stream data in order to force the sender to store the unacknowledged stream data for retransmission.¶
The attack on receivers is mitigated if flow control windows correspond to available memory. However, some receivers will over-commit memory and advertise flow control offsets in the aggregate that exceed actual available memory. The over-commitment strategy can lead to better performance when endpoints are well behaved, but renders endpoints vulnerable to the stream fragmentation attack.¶
QUIC deployments SHOULD provide mitigations against stream fragmentation attacks. Mitigations could consist of avoiding over-committing memory, limiting the size of tracking data structures, delaying reassembly of STREAM frames, implementing heuristics based on the age and duration of reassembly holes, or some combination.¶
An adversarial endpoint can open a large number of streams, exhausting state on an endpoint. The adversarial endpoint could repeat the process on a large number of connections, in a manner similar to SYN flooding attacks in TCP.¶
Normally, clients will open streams sequentially, as explained in Section 2.1. However, when several streams are initiated at short intervals, loss or reordering may cause STREAM frames that open streams to be received out of sequence. On receiving a higher-numbered stream ID, a receiver is required to open all intervening streams of the same type (see Section 3.2). Thus, on a new connection, opening stream 4000000 opens 1 million and 1 client-initiated bidirectional streams.¶
The number of active streams is limited by the initial_max_streams_bidi and initial_max_streams_uni transport parameters, as explained in Section 4.5. If chosen judiciously, these limits mitigate the effect of the stream commitment attack. However, setting the limit too low could affect performance when applications expect to open a large number of streams.¶
QUIC and TLS both contain messages that have legitimate uses in some contexts, but that can be abused to cause a peer to expend processing resources without having any observable impact on the state of the connection.¶
Messages can also be used to change and revert state in small or inconsequential ways, such as by sending small increments to flow control limits.¶
If processing costs are disproportionately large in comparison to bandwidth consumption or effect on state, then this could allow a malicious peer to exhaust processing capacity.¶
While there are legitimate uses for all messages, implementations SHOULD track the cost of processing relative to progress and treat excessive quantities of any non-productive packets as indicative of an attack. Endpoints MAY respond to this condition with a connection error, or by dropping packets.¶
An on-path attacker could manipulate the value of ECN codepoints in the IP header to influence the sender's rate. [RFC3168] discusses manipulations and their effects in more detail.¶
An on-the-side attacker can duplicate and send packets with modified ECN codepoints to affect the sender's rate. If duplicate packets are discarded by a receiver, an off-path attacker will need to race the duplicate packet against the original to be successful in this attack. Therefore, QUIC endpoints ignore the ECN codepoint field on an IP packet unless at least one QUIC packet in that IP packet is successfully processed; see Section 13.4.¶
Stateless resets create a possible denial of service attack analogous to a TCP reset injection. This attack is possible if an attacker is able to cause a stateless reset token to be generated for a connection with a selected connection ID. An attacker that can cause this token to be generated can reset an active connection with the same connection ID.¶
If a packet can be routed to different instances that share a static key, for example by changing an IP address or port, then an attacker can cause the server to send a stateless reset. To defend against this style of denial of service, endpoints that share a static key for stateless resets (see Section 10.4.2) MUST be arranged so that packets with a given connection ID always arrive at an instance that has connection state, unless that connection is no longer active.¶
In the case of a cluster that uses dynamic load balancing, it's possible that a change in load balancer configuration could happen while an active instance retains connection state; even if an instance retains connection state, the change in routing and resulting stateless reset will result in the connection being terminated. If there is no chance of the packet being routed to the correct instance, it is better to send a stateless reset than wait for connections to time out. However, this is acceptable only if the routing cannot be influenced by an attacker.¶
This document defines QUIC Version Negotiation packets in Section 6, which can be used to negotiate the QUIC version used between two endpoints. However, this document does not specify how this negotiation will be performed between this version and subsequent future versions. In particular, Version Negotiation packets do not contain any mechanism to prevent version downgrade attacks. Future versions of QUIC that use Version Negotiation packets MUST define a mechanism that is robust against version downgrade attacks.¶
Deployments should limit the ability of an attacker to target a new connection to a particular server instance. This means that client-controlled fields, such as the initial Destination Connection ID used on Initial and 0-RTT packets, SHOULD NOT be used by themselves to make routing decisions. Ideally, routing decisions are made independently of client-selected values; a Source Connection ID can be selected to route later packets to the same server.¶
A complete security analysis of QUIC is outside the scope of this document. This section provides an informal description of the desired security properties as an aid to implementors and to help guide protocol analysis.¶
QUIC assumes the threat model described in [SEC-CONS] and provides protections against many of the attacks that arise from that model.¶
For this purpose, attacks are divided into passive and active attacks. Passive attackers have the capability to read packets from the network, while active attackers also have the capability to write packets into the network. However, a passive attack may involve an attacker with the ability to cause a routing change or other modification in the path taken by packets that comprise a connection.¶
Attackers are additionally categorized as either on-path attackers or off-path attackers; see Section 3.5 of [SEC-CONS]. An on-path attacker can read, modify, or remove any packet it observes such that it no longer reaches its destination, while an off-path attacker observes the packets, but cannot prevent the original packet from reaching its intended destination. An off-path attacker can also transmit arbitrary packets.¶
Properties of the handshake, protected packets, and connection migration are considered separately.¶
The QUIC handshake incorporates the TLS 1.3 handshake and enjoys the cryptographic properties described in Appendix E.1 of [TLS13].¶
In addition to those properties, the handshake is intended to provide some defense against DoS attacks on the handshake, as described below.¶
Address validation (Section 8) is used to verify that an entity that claims a given address is able to receive packets at that address. Address validation limits amplification attack targets to addresses for which an attacker is either on-path or off-path.¶
Prior to validation, endpoints are limited in what they are able to send. During the handshake, a server cannot send more than three times the data it receives; clients that initiate new connections or migrate to a new network path are limited.¶
Computing the server's first flight for a full handshake is potentially expensive, requiring both a signature and a key exchange computation. In order to prevent computational DoS attacks, the Retry packet provides a cheap token exchange mechanism which allows servers to validate a client's IP address prior to doing any expensive computations, at the cost of a single round trip. After a successful handshake, servers can issue new tokens to a client which will allow new connection establishment without incurring this cost.¶
An on-path or off-path attacker can force a handshake to fail by replacing or racing Initial packets. Once valid Initial packets have been exchanged, subsequent Handshake packets are protected with the handshake keys, and an on-path attacker cannot force handshake failure other than by dropping packets to cause endpoints to abandon the attempt.¶
An on-path attacker can also replace the addresses of packets on either side and therefore cause the client or server to have an incorrect view of the remote addresses. Such an attack is indistinguishable from the functions performed by a NAT.¶
The entire handshake is cryptographically protected, with the Initial packets being encrypted with per-version keys and the Handshake and later packets being encrypted with keys derived from the TLS key exchange. Further, parameter negotiation is folded into the TLS transcript and thus provides the same security guarantees as ordinary TLS negotiation. Thus, an attacker can observe the client's transport parameters (as long as it knows the version-specific salt) but cannot observe the server's transport parameters and cannot influence parameter negotiation.¶
Connection IDs are unencrypted but integrity protected in all packets.¶
This version of QUIC does not incorporate a version negotiation mechanism; implementations of incompatible versions will simply fail to establish a connection.¶
-Packet protection (Section 12.1) provides authentication and encryption -of all packets except Version Negotiation packets, though Initial and Retry -packets have limited encryption and authentication based on version-specific -keys; see [QUIC-TLS] for more details. This section considers passive and -active attacks against protected packets.¶
-Both on-path and off-path attackers can mount a passive attack in which they -save observed packets for an offline attack against packet protection at a -future time; this is true for any observer of any packet on any network.¶
-A blind attacker, one who injects packets without being able to observe valid -packets for a connection, is unlikely to be successful, since packet protection -ensures that valid packets are only generated by endpoints which possess the -key material established during the handshake; see Section 7 and -Section 21.12.1. Similarly, any active attacker that observes packets -and attempts to insert new data or modify existing data in those packets should -not be able to generate packets deemed valid by the receiving endpoint.¶
-A spoofing attack, in which an active attacker rewrites unprotected parts of a -packet that it forwards or injects, such as the source or destination -address, is only effective if the attacker can forward packets to the original -endpoint. Packet protection ensures that the packet payloads can only be -processed by the endpoints that completed the handshake, and invalid -packets are ignored by those endpoints.¶
-An attacker can also modify the boundaries between packets and UDP datagrams, -causing multiple packets to be coalesced into a single datagram, or splitting -coalesced packets into multiple datagrams. Aside from datagrams containing -Initial packets, which require padding, modification of how packets are -arranged in datagrams has no functional effect on a connection, although it -might change some performance characteristics.¶
-Connection Migration (Section 9) provides endpoints with the ability to -transition between IP addresses and ports on multiple paths, using one path at a -time for transmission and receipt of non-probing frames. Path validation -(Section 8.2) establishes that a peer is both willing and able -to receive packets sent on a particular path. This helps reduce the effects of -address spoofing by limiting the number of packets sent to a spoofed address.¶
-This section describes the intended security properties of connection migration -when under various types of DoS attacks.¶
-An attacker that can cause a packet it observes to no longer reach its intended -destination is considered an on-path attacker. When an attacker is present -between a client and server, endpoints are required to send packets through the -attacker to establish connectivity on a given path.¶
-An on-path attacker can:¶
-An on-path attacker cannot:¶
-An on-path attacker has the opportunity to modify the packets that it observes, -however any modifications to an authenticated portion of a packet will cause it -to be dropped by the receiving endpoint as invalid, as packet payloads are both -authenticated and encrypted.¶
-In the presence of an on-path attacker, QUIC aims to provide the following -properties:¶
-An off-path attacker is not directly on the path between a client and server, -but could be able to obtain copies of some or all packets sent between the -client and the server. It is also able to send copies of those packets to -either endpoint.¶
-An off-path attacker can:¶
-An off-path attacker cannot:¶
-An off-path attacker can modify packets that it has observed and inject them -back into the network, potentially with spoofed source and destination -addresses.¶
-For the purposes of this discussion, it is assumed that an off-path attacker -has the ability to observe, modify, and re-inject a packet into the network -that will reach the destination endpoint prior to the arrival of the original -packet observed by the attacker. In other words, an attacker has the ability to -consistently "win" a race with the legitimate packets between the endpoints, -potentially causing the original packet to be ignored by the recipient.¶
-It is also assumed that an attacker has the resources necessary to affect NAT -state, potentially both causing an endpoint to lose its NAT binding, and an -attacker to obtain the same port for use with its traffic.¶
-In the presence of an off-path attacker, QUIC aims to provide the following -properties:¶
-A limited on-path attacker is an off-path attacker that has offered improved -routing of packets by duplicating and forwarding original packets between the -server and the client, causing those packets to arrive before the original -copies such that the original packets are dropped by the destination endpoint.¶
-A limited on-path attacker differs from an on-path attacker in that it is not on -the original path between endpoints, and therefore the original packets sent by -an endpoint are still reaching their destination. This means that a future -failure to route copied packets to the destination faster than their original -path will not prevent the original packets from reaching the destination.¶
-A limited on-path attacker can:¶
-A limited on-path attacker cannot:¶
-A limited on-path attacker can only delay packets up to the point that the -original packets arrive before the duplicate packets, meaning that it cannot -offer routing with worse latency than the original path. If a limited on-path -attacker drops packets, the original copy will still arrive at the destination -endpoint.¶
-In the presence of a limited on-path attacker, QUIC aims to provide the -following properties:¶
-Note that these guarantees are the same guarantees provided for any NAT, for the -same reasons.¶
-This document establishes several registries for the management of codepoints in -QUIC. These registries operate on a common set of policies as defined in -Section 22.1.¶
-All QUIC registries allow for both provisional and permanent registration of -codepoints. This section documents policies that are common to these -registries.¶
-Provisional registrations of codepoints are intended to allow for private use and -experimentation with extensions to QUIC. Provisional registrations only require -the inclusion of the codepoint value and contact information. However, -provisional registrations could be reclaimed and reassigned for another purpose.¶
-Provisional registrations require Expert Review, as defined in Section 4.5 of -[RFC8126]. Designated expert(s) are advised that only registrations for an -excessive proportion of remaining codepoint space or the very first unassigned -value (see Section 22.1.2) can be rejected.¶
-Provisional registrations will include a date field that indicates when the -registration was last updated. A request to update the date on any provisional -registration can be made without review from the designated expert(s).¶
-All QUIC registries include the following fields to support provisional -registration:¶
-Provisional registrations MAY omit the Specification and Notes fields, plus any -additional fields that might be required for a permanent registration. The Date -field is not required as part of requesting a registration as it is set to the -date the registration is created or updated.¶
-New uses of codepoints from QUIC registries SHOULD use a randomly selected -codepoint that excludes both existing allocations and the first unallocated -codepoint in the selected space. Requests for multiple codepoints MAY use a -contiguous range. This minimizes the risk that differing semantics are -attributed to the same codepoint by different implementations. Use of the first -codepoint in a range is intended for use by specifications that are developed -through the standards process [STD] and its allocation MUST be -negotiated with IANA before use.¶
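The selection rule above can be sketched as follows; the function name and the representation of the registry as a set of allocated values are assumptions for illustration:

```python
import random

def pick_provisional_codepoint(allocated, space_max=(1 << 62) - 1):
    """Pick a random codepoint, excluding both existing allocations and the
    first unallocated codepoint (reserved for standards-track specifications)."""
    first_unallocated = 0
    while first_unallocated in allocated:
        first_unallocated += 1
    excluded = set(allocated) | {first_unallocated}
    while True:
        candidate = random.randint(0, space_max)
        if candidate not in excluded:
            return candidate
```

Spreading provisional registrations randomly across the space in this way makes it unlikely that two independent experiments collide on the same codepoint.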
-For codepoints that are encoded in variable-length integers -(Section 16), such as frame types, codepoints that encode to four or -eight bytes (that is, values 2^14 and above) SHOULD be used unless the usage is -especially sensitive to having a longer encoding.¶
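As a sketch of why the 2^14 boundary matters, QUIC's variable-length integers use the two high bits of the first byte to signal a 1-, 2-, 4-, or 8-byte encoding, so values of 2^14 and above occupy at least four bytes:

```python
def varint_encode(v: int) -> bytes:
    # The two most significant bits of the first byte encode log2 of the length.
    if v < 1 << 6:
        return v.to_bytes(1, "big")
    if v < 1 << 14:
        return (v | (1 << 14)).to_bytes(2, "big")
    if v < 1 << 30:
        return (v | (2 << 30)).to_bytes(4, "big")
    if v < 1 << 62:
        return (v | (3 << 62)).to_bytes(8, "big")
    raise ValueError("value too large for a QUIC varint")
```

For example, 15293 encodes in two bytes as 0x7bbd, while 2^14 = 16384 already requires the four-byte form.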
-Applications to register codepoints in QUIC registries MAY include a codepoint -as part of the registration. IANA MUST allocate the selected codepoint unless -that codepoint is already assigned or the codepoint is the first unallocated -codepoint in the registry.¶
-A request might be made to remove an unused provisional registration from the -registry to reclaim space in a registry, or portion of the registry (such as the -64-16383 range for codepoints that use variable-length encodings). This SHOULD -be done only for the codepoints with the earliest recorded date, and entries that -have been updated less than a year prior SHOULD NOT be reclaimed.¶
-A request to remove a codepoint MUST be reviewed by the designated expert(s). -The expert(s) MUST attempt to determine whether the codepoint is still in use. -Experts are advised to contact the listed contacts for the registration, plus as -wide a set of protocol implementers as possible in order to determine whether -any use of the codepoint is known. The expert(s) are advised to allow at least -four weeks for responses.¶
-If any use of the codepoints is identified by this search or a request to update -the registration is made, the codepoint MUST NOT be reclaimed. Instead, the -date on the registration is updated. A note might be added for the registration -recording relevant information that was learned.¶
-If no use of the codepoint was identified and no request was made to update the -registration, the codepoint MAY be removed from the registry.¶
-This process also applies to requests to change a provisional registration into -a permanent registration, except that the goal is not to determine whether there -is no use of the codepoint, but to determine that the registration is an -accurate representation of any deployed usage.¶
-Permanent registrations in QUIC registries use the Specification Required policy -[RFC8126], unless otherwise specified. The designated expert(s) verify that -a specification exists and is readily accessible. Expert(s) are encouraged to -be biased towards approving registrations unless they are abusive, frivolous, or -actively harmful (not merely aesthetically displeasing, or architecturally -dubious). The creation of a registry MAY specify additional constraints on -permanent registrations.¶
-The creation of a registry MAY identify a range of codepoints where -registrations are governed by a different registration policy. For instance, -the registries for 62-bit codepoints in this document have stricter policies for -codepoints in the range from 0 to 63.¶
-Any stricter requirements for permanent registrations do not prevent provisional -registrations for affected codepoints. For instance, a provisional registration -for a frame type (Section 22.3) of 61 could be requested.¶
-All registrations made by Standards Track publications MUST be permanent.¶
-All registrations in this document are assigned a permanent status and list as -contact both the IESG (ietf@ietf.org) and the QUIC working group -(quic@ietf.org).¶
-IANA [SHALL add/has added] a registry for "QUIC Transport Parameters" under a -"QUIC" heading.¶
-The "QUIC Transport Parameters" registry governs a 62-bit space. This registry -follows the registration policy from Section 22.1. Permanent registrations -in this registry are assigned using the Specification Required policy -[RFC8126].¶
-In addition to the fields in Section 22.1.1, permanent registrations in -this registry MUST include the following fields:¶
-The initial contents of this registry are shown in Table 6.¶
Value | Parameter Name | Specification
--- | --- | ---
0x00 | original_connection_id | Section 18.2
0x01 | max_idle_timeout | Section 18.2
0x02 | stateless_reset_token | Section 18.2
0x03 | max_udp_payload_size | Section 18.2
0x04 | initial_max_data | Section 18.2
0x05 | initial_max_stream_data_bidi_local | Section 18.2
0x06 | initial_max_stream_data_bidi_remote | Section 18.2
0x07 | initial_max_stream_data_uni | Section 18.2
0x08 | initial_max_streams_bidi | Section 18.2
0x09 | initial_max_streams_uni | Section 18.2
0x0a | ack_delay_exponent | Section 18.2
0x0b | max_ack_delay | Section 18.2
0x0c | disable_active_migration | Section 18.2
0x0d | preferred_address | Section 18.2
0x0e | active_connection_id_limit | Section 18.2
Additionally, each value of the format 31 * N + 27 for integer values of N (that is, 27, 58, 89, ...) is reserved and MUST NOT be assigned by IANA.¶
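A registry implementation can test for these reserved values with a single modulus check; the helper name is illustrative:

```python
def is_reserved_transport_parameter(value: int) -> bool:
    # Values of the form 31 * N + 27 (27, 58, 89, ...) are reserved
    # and will never be assigned by IANA.
    return value % 31 == 27
```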
IANA [SHALL add/has added] a registry for "QUIC Frame Types" under a -"QUIC" heading.¶
-The "QUIC Frame Types" registry governs a 62-bit space. This registry follows -the registration policy from Section 22.1. Permanent registrations in this -registry are assigned using the Specification Required policy [RFC8126], -except for values between 0x00 and 0x3f (in hexadecimal; inclusive), which are -assigned using Standards Action or IESG Approval as defined in Section 4.9 and -4.10 of [RFC8126].¶
-In addition to the fields in Section 22.1.1, permanent registrations in -this registry MUST include the following fields:¶
-In addition to the advice in Section 22.1, specifications for new permanent -registrations SHOULD describe the means by which an endpoint might determine -that it can send the identified type of frame. An accompanying transport -parameter registration (see Section 22.2) is expected for most -registrations. Specifications for permanent registrations also need to -describe the format and assigned semantics of any fields in the frame.¶
-The initial contents of this registry are tabulated in Table 3.¶
-IANA [SHALL add/has added] a registry for "QUIC Transport Error Codes" under a -"QUIC" heading.¶
-The "QUIC Transport Error Codes" registry governs a 62-bit space. This space is -split into three spaces that are governed by different policies. Permanent -registrations in this registry are assigned using the Specification Required -policy [RFC8126], except for values between 0x00 and 0x3f (in hexadecimal; -inclusive), which are assigned using Standards Action or IESG Approval as -defined in Section 4.9 and 4.10 of [RFC8126].¶
-In addition to the fields in Section 22.1.1, permanent registrations in -this registry MUST include the following fields:¶
-The initial contents of this registry are shown in Table 7.¶
Value | Error | Description | Specification
--- | --- | --- | ---
0x0 | NO_ERROR | No error | Section 20
0x1 | INTERNAL_ERROR | Implementation error | Section 20
0x2 | SERVER_BUSY | Server currently busy | Section 20
0x3 | FLOW_CONTROL_ERROR | Flow control error | Section 20
0x4 | STREAM_LIMIT_ERROR | Too many streams opened | Section 20
0x5 | STREAM_STATE_ERROR | Frame received in invalid stream state | Section 20
0x6 | FINAL_SIZE_ERROR | Change to final size | Section 20
0x7 | FRAME_ENCODING_ERROR | Frame encoding error | Section 20
0x8 | TRANSPORT_PARAMETER_ERROR | Error in transport parameters | Section 20
0x9 | CONNECTION_ID_LIMIT_ERROR | Too many connection IDs received | Section 20
0xA | PROTOCOL_VIOLATION | Generic protocol violation | Section 20
0xB | INVALID_TOKEN | Invalid token received | Section 20
0xC | APPLICATION_ERROR | Application error | Section 20
0xD | CRYPTO_BUFFER_EXCEEDED | CRYPTO data buffer overflowed | Section 20
The pseudo-code in Figure 37 shows how an implementation can decode -packet numbers after header protection has been removed.¶
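Since the figure itself is not reproduced in this excerpt, the following is a sketch of that algorithm: it reconstructs the full packet number by placing the truncated bits into the expected value and then choosing the candidate within half an encoding window of the next expected packet number.

```python
def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Recover a full packet number from its truncated encoding, given the
    largest packet number received so far and the number of encoded bits."""
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1
    # Replace the low bits of the expected number with the truncated value,
    # then correct by one window if the candidate falls outside +/- half a
    # window of the expected number, staying within the 62-bit space.
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn
    if (candidate_pn <= expected_pn - pn_hwin and
            candidate_pn < (1 << 62) - pn_win):
        return candidate_pn + pn_win
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win
    return candidate_pn
```

For instance, with a largest received packet number of 0xa82f30ea, a 16-bit truncated value of 0x9b32 decodes to 0xa82f9b32.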
-Each time an endpoint commences sending on a new network path, it determines -whether the path supports ECN; see Section 13.4. If the path supports ECN, the goal -is to use ECN. Endpoints might also periodically reassess a path that was -determined to not support ECN.¶
-This section describes one method for testing new paths. This algorithm is -intended to show how a path might be tested for ECN support. Endpoints can -implement different methods.¶
-The path is assigned an ECN state that is one of "testing", "unknown", "failed", -or "capable". On paths with a "testing" or "capable" state the endpoint sends -packets with an ECT marking, by default ECT(0); otherwise, the endpoint sends -unmarked packets.¶
-To start testing a path, the ECN state is set to "testing" and existing ECN -counts are remembered as a baseline.¶
-The testing period runs for a number of packets or round-trip times, as -determined by the endpoint. The goal is not to limit the duration of the -testing period, but to ensure that enough marked packets are sent for received -ECN counts to provide a clear indication of how the path treats marked packets. -Section 13.4.2.2 suggests limiting this to 10 packets or 3 round-trip times.¶
-After the testing period ends, the ECN state for the path becomes "unknown". -From the "unknown" state, successful validation of the ECN counts in an ACK frame -(see Section 13.4.2.2) causes the ECN state for the path to become "capable", unless -no marked packet has been acknowledged.¶
-If validation of ECN counts fails at any time, the ECN state for the affected -path becomes "failed". An endpoint can also mark the ECN state for a path as -"failed" if marked packets are all declared lost or if they are all CE marked.¶
-Following this algorithm ensures that ECN is rarely disabled for paths that -properly support ECN. Any path that incorrectly modifies markings will cause -ECN to be disabled. For those rare cases where marked packets are discarded by -the path, the short duration of the testing period limits the number of losses -incurred.¶
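The state machine above can be sketched as follows. Only the state names and the 10-packet testing limit come from the text; the class shape, counters, and method names are illustrative assumptions:

```python
class EcnValidator:
    """Minimal sketch of the per-path ECN testing algorithm described above."""

    TESTING_PACKETS = 10  # suggested limit from Section 13.4.2.2

    def __init__(self):
        self.state = "testing"
        self.marked_sent = 0
        self.marked_acked = 0

    def sends_ect(self):
        # ECT(0) marking is applied only while testing or once capable.
        return self.state in ("testing", "capable")

    def on_packet_sent(self):
        if self.state == "testing":
            self.marked_sent += 1
            if self.marked_sent >= self.TESTING_PACKETS:
                self.state = "unknown"  # testing period has ended

    def on_ack_with_valid_counts(self, newly_acked_marked):
        # Called when the ECN counts in an ACK frame validate successfully.
        self.marked_acked += newly_acked_marked
        if self.state == "unknown" and self.marked_acked > 0:
            self.state = "capable"

    def on_validation_failure(self):
        # Counts failed to validate, or all marked packets were lost or CE marked.
        self.state = "failed"
```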
-Issue and pull request numbers are listed with a leading octothorp.¶
-Stateless reset changes (#2152, #2993)¶
-Rework the first byte (#2006)¶
- -Substantial editorial reorganization; no technical changes.¶
-Changes to integration of the TLS handshake (#829, #1018, #1094, #1165, #1190, -#1233, #1242, #1252, #1450, #1458)¶
-Streams are split into unidirectional and bidirectional (#643, #656, #720, -#872, #175, #885)¶
-Improvements to connection close¶
-Split some frames into separate connection- and stream-level frames (#443)¶
-Transport parameters for 0-RTT are retained from a previous connection (#405, -#513, #512)¶
-The original design and rationale behind this protocol draw significantly from -work by Jim Roskind [EARLY-DESIGN].¶
-The IETF QUIC Working Group received an enormous amount of support from many -people. The following people provided substantive contributions to this -document: -Alessandro Ghedini, -Alyssa Wilk, -Antoine Delignat-Lavaud, -Brian Trammell, -Christian Huitema, -Colin Perkins, -David Schinazi, -Dmitri Tikhonov, -Eric Kinnear, -Eric Rescorla, -Gorry Fairhurst, -Ian Swett, -Igor Lubashev, 奥 一穂 (Kazuho Oku), -Lucas Pardue, -Magnus Westerlund, -Marten Seemann, -Martin Duke, -Mike Bishop, Mikkel Fahnøe Jørgensen, Mirja Kühlewind, -Nick Banks, -Nick Harper, -Patrick McManus, -Roberto Peon, -Ryan Hamilton, -Subodh Iyengar, -Tatsuhiro Tsujikawa, -Ted Hardie, -Tom Jones, -and Victor Vasiliev.¶