Skip to content
cheyiliu edited this page Feb 17, 2016 · 7 revisions

The HTTP/1.1 protocal

1.4 Overall Operation

   Most HTTP communication is initiated by a user agent and consists of
   a request to be applied to a resource on some origin server. In the
   simplest case, this may be accomplished via a single connection (v)
   between the user agent (UA) and the origin server (O).

          request chain ------------------------>
       UA -------------------v------------------- O
          <----------------------- response chain

   A more complicated situation occurs when one or more intermediaries
   are present in the request/response chain. There are three common
   forms of intermediary: proxy, gateway, and tunnel. A proxy is a
   forwarding agent, receiving requests for a URI in its absolute form,
   rewriting all or part of the message, and forwarding the reformatted
   request toward the server identified by the URI. A gateway is a
   receiving agent, acting as a layer above some other server(s) and, if
   necessary, translating the requests to the underlying server's
   protocol. A tunnel acts as a relay point between two connections
   without changing the messages; tunnels are used when the
   communication needs to pass through an intermediary (such as a
   firewall) even when the intermediary cannot understand the contents
   of the messages.

          request chain -------------------------------------->
       UA -----v----- A -----v----- B -----v----- C -----v----- O
          <------------------------------------- response chain

   The figure above shows three intermediaries (A, B, and C) between the
   user agent and origin server. A request or response message that
   travels the whole chain will pass through four separate connections.
   This distinction is important because some HTTP communication options
   may apply only to the connection with the nearest, non-tunnel
   neighbor, only to the end-points of the chain, or to all connections
   along the chain. Although the diagram is linear, each participant may
   be engaged in multiple, simultaneous communications. For example, B
   may be receiving requests from many clients other than A, and/or
   forwarding requests to servers other than C, at the same time that it
   is handling A's request.

   Any party to the communication which is not acting as a tunnel may
   employ an internal cache for handling requests. The effect of a cache
   is that the request/response chain is shortened if one of the
   participants along the chain has a cached response applicable to that
   request. The following illustrates the resulting chain if B has a
   cached copy of an earlier response from O (via C) for a request which
   has not been cached by UA or A.

          request chain ---------->
       UA -----v----- A -----v----- B - - - - - - C - - - - - - O
          <--------- response chain

   Not all responses are usefully cacheable, and some requests may
   contain modifiers which place special requirements on cache behavior.
   HTTP requirements for cache behavior and cacheable responses are
   defined in section 13.

   HTTP communication usually takes place over TCP/IP connections. The
   default port is TCP 80 [19], but other ports can be used. This does
   not preclude HTTP from being implemented on top of any other protocol
   on the Internet, or on other networks. HTTP only presumes a reliable
   transport; any protocol that provides such guarantees can be used;
   the mapping of the HTTP/1.1 request and response structures onto the
   transport data units of the protocol in question is outside the scope
   of this specification.
   In HTTP/1.0, most implementations used a new connection for each
   request/response exchange. In HTTP/1.1, a connection may be used for
   one or more request/response exchanges, although connections may be
   closed for a variety of reasons (see section 8.1).

2 Notational Conventions and Generic Grammar

  • 略,基本语法规则

3.1 HTTP Version

   The HTTP version of an application is the highest HTTP version for
   which the application is at least conditionally compliant.

   Proxy and gateway applications need to be careful when forwarding
   messages in protocol versions different from that of the application.
   Since the protocol version indicates the protocol capability of the
   sender, a proxy/gateway MUST NOT send a message with a version
   indicator which is greater than its actual version. If a higher
   version request is received, the proxy/gateway MUST either downgrade
   the request version, or respond with an error, or switch to tunnel
   behavior.

3.2.1 General Syntax

   The HTTP protocol does not place any a priori limit on the length of
   a URI. Servers MUST be able to handle the URI of any resource they
   serve, and SHOULD be able to handle URIs of unbounded length if they
   provide GET-based forms that could generate such URIs. A server
   SHOULD return 414 (Request-URI Too Long) status if a URI is longer
   than the server can handle (see section 10.4.15).

      Note: Servers ought to be cautious about depending on URI lengths
      above 255 bytes, because some older client or proxy
      implementations might not properly support these lengths.

3.2.2 http URL

   The "http" scheme is used to locate network resources via the HTTP
   protocol. This section defines the scheme-specific syntax and
   semantics for http URLs.

   http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

   If the port is empty or not given, port 80 is assumed. The semantics
   are that the identified resource is located at the server listening
   for TCP connections on that port of that host, and the Request-URI
   for the resource is abs_path (section 5.1.2). The use of IP addresses
   in URLs SHOULD be avoided whenever possible (see RFC 1900 [24]). If
   the abs_path is not present in the URL, it MUST be given as "/" when
   used as a Request-URI for a resource (section 5.1.2). If a proxy
   receives a host name which is not a fully qualified domain name, it
   MAY add its domain to the host name it received. If a proxy receives
   a fully qualified domain name, the proxy MUST NOT change the host
   name.

3.2.3 URI Comparison

   When comparing two URIs to decide if they match or not, a client
   SHOULD use a case-sensitive octet-by-octet comparison of the entire
   URIs, with these exceptions:

      - A port that is empty or not given is equivalent to the default
        port for that URI-reference;

        - Comparisons of host names MUST be case-insensitive;

        - Comparisons of scheme names MUST be case-insensitive;

        - An empty abs_path is equivalent to an abs_path of "/".

   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

   For example, the following three URIs are equivalent:

      http://abc.com:80/~smith/home.html
      http://ABC.com/%7Esmith/home.html
      http://ABC.com:/%7esmith/home.html

4.1 Message Types

   HTTP messages consist of requests from client to server and responses
   from server to client.

       HTTP-message   = Request | Response     ; HTTP/1.1 messages

   Request (section 5) and Response (section 6) messages use the generic
   message format of RFC 822 [9] for transferring entities (the payload
   of the message). Both types of message consist of a start-line, zero
   or more header fields (also known as "headers"), an empty line (i.e.,
   a line with nothing preceding the CRLF) indicating the end of the
   header fields, and possibly a message-body.

        generic-message = start-line
                          *(message-header CRLF)
                          CRLF
                          [ message-body ]
        start-line      = Request-Line | Status-Line

5 Request

   A request message from a client to a server includes, within the
   first line of that message, the method to be applied to the resource,
   the identifier of the resource, and the protocol version in use.

        Request       = Request-Line              ; Section 5.1
                        *(( general-header        ; Section 4.5
                         | request-header         ; Section 5.3
                         | entity-header ) CRLF)  ; Section 7.1
                        CRLF
                        [ message-body ]          ; Section 4.3

5.1 Request-Line

   The Request-Line begins with a method token, followed by the
   Request-URI and the protocol version, and ending with CRLF. The
   elements are separated by SP characters. No CR or LF is allowed
   except in the final CRLF sequence.

        Request-Line   = Method SP Request-URI SP HTTP-Version CRLF

5.1.1 Method

   The Method  token indicates the method to be performed on the
   resource identified by the Request-URI. The method is case-sensitive.

       Method         = "OPTIONS"                ; Section 9.2
                      | "GET"                    ; Section 9.3
                      | "HEAD"                   ; Section 9.4
                      | "POST"                   ; Section 9.5
                      | "PUT"                    ; Section 9.6
                      | "DELETE"                 ; Section 9.7
                      | "TRACE"                  ; Section 9.8
                      | "CONNECT"                ; Section 9.9
                      | extension-method
       extension-method = token

   The list of methods allowed by a resource can be specified in an
   Allow header field (section 14.7). The return code of the response
   always notifies the client whether a method is currently allowed on a
   resource, since the set of allowed methods can change dynamically. An
   origin server SHOULD return the status code 405 (Method Not Allowed)
   if the method is known by the origin server but not allowed for the
   requested resource, and 501 (Not Implemented) if the method is
   unrecognized or not implemented by the origin server. The methods GET
   and HEAD MUST be supported by all general-purpose servers. All other
   methods are OPTIONAL; however, if the above methods are implemented,
   they MUST be implemented with the same semantics as those specified
   in section 9.

6 Response

   After receiving and interpreting a request message, a server responds
   with an HTTP response message.

       Response      = Status-Line               ; Section 6.1
                       *(( general-header        ; Section 4.5
                        | response-header        ; Section 6.2
                        | entity-header ) CRLF)  ; Section 7.1
                       CRLF
                       [ message-body ]          ; Section 7.2

6.1 Status-Line

   The first line of a Response message is the Status-Line, consisting
   of the protocol version followed by a numeric status code and its
   associated textual phrase, with each element separated by SP
   characters. No CR or LF is allowed except in the final CRLF sequence.

       Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

6.1.1 Status Code and Reason Phrase

   The Status-Code element is a 3-digit integer result code of the
   attempt to understand and satisfy the request. These codes are fully
   defined in section 10. The Reason-Phrase is intended to give a short
   textual description of the Status-Code. The Status-Code is intended
   for use by automata and the Reason-Phrase is intended for the human
   user. The client is not required to examine or display the Reason-
   Phrase.

   The first digit of the Status-Code defines the class of response. The
   last two digits do not have any categorization role. There are 5
   values for the first digit:

      - 1xx: Informational - Request received, continuing process

      - 2xx: Success - The action was successfully received,
        understood, and accepted

      - 3xx: Redirection - Further action must be taken in order to
        complete the request

      - 4xx: Client Error - The request contains bad syntax or cannot
        be fulfilled

      - 5xx: Server Error - The server failed to fulfill an apparently
        valid request

   The individual values of the numeric status codes defined for
   HTTP/1.1, and an example set of corresponding Reason-Phrase's, are
   presented below. The reason phrases listed here are only
   recommendations -- they MAY be replaced by local equivalents without
   affecting the protocol.

      Status-Code    =
            "100"  ; Section 10.1.1: Continue
          | "101"  ; Section 10.1.2: Switching Protocols

          | "200"  ; Section 10.2.1: OK
          | "201"  ; Section 10.2.2: Created
          | "202"  ; Section 10.2.3: Accepted
          | "203"  ; Section 10.2.4: Non-Authoritative Information
          | "204"  ; Section 10.2.5: No Content
          | "205"  ; Section 10.2.6: Reset Content
          | "206"  ; Section 10.2.7: Partial Content

          | "300"  ; Section 10.3.1: Multiple Choices
          | "301"  ; Section 10.3.2: Moved Permanently
          | "302"  ; Section 10.3.3: Found
          | "303"  ; Section 10.3.4: See Other
          | "304"  ; Section 10.3.5: Not Modified
          | "305"  ; Section 10.3.6: Use Proxy
          | "307"  ; Section 10.3.8: Temporary Redirect

          | "400"  ; Section 10.4.1: Bad Request
          | "401"  ; Section 10.4.2: Unauthorized
          | "402"  ; Section 10.4.3: Payment Required
          | "403"  ; Section 10.4.4: Forbidden
          | "404"  ; Section 10.4.5: Not Found
          | "405"  ; Section 10.4.6: Method Not Allowed
          | "406"  ; Section 10.4.7: Not Acceptable

          | "407"  ; Section 10.4.8: Proxy Authentication Required
          | "408"  ; Section 10.4.9: Request Time-out
          | "409"  ; Section 10.4.10: Conflict
          | "410"  ; Section 10.4.11: Gone
          | "411"  ; Section 10.4.12: Length Required
          | "412"  ; Section 10.4.13: Precondition Failed
          | "413"  ; Section 10.4.14: Request Entity Too Large
          | "414"  ; Section 10.4.15: Request-URI Too Large
          | "415"  ; Section 10.4.16: Unsupported Media Type
          | "416"  ; Section 10.4.17: Requested range not satisfiable
          | "417"  ; Section 10.4.18: Expectation Failed

          | "500"  ; Section 10.5.1: Internal Server Error
          | "501"  ; Section 10.5.2: Not Implemented
          | "502"  ; Section 10.5.3: Bad Gateway
          | "503"  ; Section 10.5.4: Service Unavailable
          | "504"  ; Section 10.5.5: Gateway Time-out
          | "505"  ; Section 10.5.6: HTTP Version not supported
          | extension-code

      extension-code = 3DIGIT
      Reason-Phrase  = *<TEXT, excluding CR, LF>

   HTTP status codes are extensible. HTTP applications are not required
   to understand the meaning of all registered status codes, though such
   understanding is obviously desirable. However, applications MUST
   understand the class of any status code, as indicated by the first
   digit, and treat any unrecognized response as being equivalent to the
   x00 status code of that class, with the exception that an
   unrecognized response MUST NOT be cached. For example, if an
   unrecognized status code of 431 is received by the client, it can
   safely assume that there was something wrong with its request and
   treat the response as if it had received a 400 status code. In such
   cases, user agents SHOULD present to the user the entity returned
   with the response, since that entity is likely to include human-
   readable information which will explain the unusual status.

7 Entity

   Request and Response messages MAY transfer an entity if not otherwise
   restricted by the request method or response status code. An entity
   consists of entity-header fields and an entity-body, although some
   responses will only include the entity-headers.

   In this section, both sender and recipient refer to either the client
   or the server, depending on who sends and who receives the entity.

7.1 Entity Header Fields

   Entity-header fields define metainformation about the entity-body or,
   if no body is present, about the resource identified by the request.
   Some of this metainformation is OPTIONAL; some might be REQUIRED by
   portions of this specification.

       entity-header  = Allow                    ; Section 14.7
                      | Content-Encoding         ; Section 14.11
                      | Content-Language         ; Section 14.12
                      | Content-Length           ; Section 14.13
                      | Content-Location         ; Section 14.14
                      | Content-MD5              ; Section 14.15
                      | Content-Range            ; Section 14.16
                      | Content-Type             ; Section 14.17
                      | Expires                  ; Section 14.21
                      | Last-Modified            ; Section 14.29
                      | extension-header

       extension-header = message-header

   The extension-header mechanism allows additional entity-header fields
   to be defined without changing the protocol, but these fields cannot
   be assumed to be recognizable by the recipient. Unrecognized header
   fields SHOULD be ignored by the recipient and MUST be forwarded by
   transparent proxies.

7.2 Entity Body

   The entity-body (if any) sent with an HTTP request or response is in
   a format and encoding defined by the entity-header fields.

       entity-body    = *OCTET

   An entity-body is only present in a message when a message-body is
   present, as described in section 4.3. The entity-body is obtained
   from the message-body by decoding any Transfer-Encoding that might
   have been applied to ensure safe and proper transfer of the message.

8 Connections

8.1 Persistent Connections

8.1.1 Purpose

   Prior to persistent connections, a separate TCP connection was
   established to fetch each URL, increasing the load on HTTP servers
   and causing congestion on the Internet. The use of inline images and
   other associated data often require a client to make multiple
   requests of the same server in a short amount of time. Analysis of
   these performance problems and results from a prototype
   implementation are available [26] [30]. Implementation experience and
   measurements of actual HTTP/1.1 (RFC 2068) implementations show good
   results [39]. Alternatives have also been explored, for example,
   T/TCP [27].

   Persistent HTTP connections have a number of advantages:

      - By opening and closing fewer TCP connections, CPU time is saved
        in routers and hosts (clients, servers, proxies, gateways,
        tunnels, or caches), and memory used for TCP protocol control
        blocks can be saved in hosts.

      - HTTP requests and responses can be pipelined on a connection.
        Pipelining allows a client to make multiple requests without
        waiting for each response, allowing a single TCP connection to
        be used much more efficiently, with much lower elapsed time.

      - Network congestion is reduced by reducing the number of packets
        caused by TCP opens, and by allowing TCP sufficient time to
        determine the congestion state of the network.

      - Latency on subsequent requests is reduced since there is no time
        spent in TCP's connection opening handshake.

      - HTTP can evolve more gracefully, since errors can be reported
        without the penalty of closing the TCP connection. Clients using
        future versions of HTTP might optimistically try a new feature,
        but if communicating with an older server, retry with old
        semantics after an error is reported.

   HTTP implementations SHOULD implement persistent connections.

8.1.2 Overall Operation

   A significant difference between HTTP/1.1 and earlier versions of
   HTTP is that persistent connections are the default behavior of any
   HTTP connection. That is, unless otherwise indicated, the client
   SHOULD assume that the server will maintain a persistent connection,
   even after error responses from the server.

   Persistent connections provide a mechanism by which a client and a
   server can signal the close of a TCP connection. This signaling takes
   place using the Connection header field (section 14.10). Once a close
   has been signaled, the client MUST NOT send any more requests on that
   connection.

8.1.2.2 Pipelining

   A client that supports persistent connections MAY "pipeline" its
   requests (i.e., send multiple requests without waiting for each
   response). A server MUST send its responses to those requests in the
   same order that the requests were received.

   Clients which assume persistent connections and pipeline immediately
   after connection establishment SHOULD be prepared to retry their
   connection if the first pipelined attempt fails. If a client does
   such a retry, it MUST NOT pipeline before it knows the connection is
   persistent. Clients MUST also be prepared to resend their requests if
   the server closes the connection before sending all of the
   corresponding responses.

   Clients SHOULD NOT pipeline requests using non-idempotent methods or
   non-idempotent sequences of methods (see section 9.1.2). Otherwise, a
   premature termination of the transport connection could lead to
   indeterminate results. A client wishing to send a non-idempotent
   request SHOULD wait to send that request until it has received the
   response status for the previous request.

NOTE, idempotent(幂等) http://baike.baidu.com/view/2067025.htm

9 Method Definitions

Clone this wiki locally