diff --git a/doc/modules/ROOT/images/ClassHierarchy.odg b/doc/modules/ROOT/images/ClassHierarchy.odg index 1ee5a75e..6f043272 100644 Binary files a/doc/modules/ROOT/images/ClassHierarchy.odg and b/doc/modules/ROOT/images/ClassHierarchy.odg differ diff --git a/doc/modules/ROOT/images/ClassHierarchy.svg b/doc/modules/ROOT/images/ClassHierarchy.svg index 2381776d..d25b73a3 100644 --- a/doc/modules/ROOT/images/ClassHierarchy.svg +++ b/doc/modules/ROOT/images/ClassHierarchy.svg @@ -1,6 +1,6 @@ - + diff --git a/doc/modules/ROOT/nav.adoc b/doc/modules/ROOT/nav.adoc index 95b69808..3aede057 100644 --- a/doc/modules/ROOT/nav.adoc +++ b/doc/modules/ROOT/nav.adoc @@ -1,20 +1,14 @@ +* xref:1.primer.adoc[] +* xref:2.messages.adoc[] * xref:sans_io_philosophy.adoc[] - * xref:http_protocol_basics.adoc[] - * xref:header_containers.adoc[] - * xref:message_bodies.adoc[] - * Serializing - * Parsing - * xref:Message.adoc[] - * Design Requirements ** xref:design_requirements/serializer.adoc[Serializer] ** xref:design_requirements/parser.adoc[Parser] - // * xref:reference:boost/http_proto.adoc[Reference] * xref:reference.adoc[Reference] diff --git a/doc/modules/ROOT/pages/1.primer.adoc b/doc/modules/ROOT/pages/1.primer.adoc new file mode 100644 index 00000000..949f878e --- /dev/null +++ b/doc/modules/ROOT/pages/1.primer.adoc @@ -0,0 +1,115 @@ +// +// Copyright (c) 2025 Vinnie Falco (vinnie.falco@gmail.com) +// +// Distributed under the Boost Software License, Version 1.0. (See accompanying +// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) +// +// Official repository: https://github.com/cppalliance/buffers +// + += HTTP Primer + +HTTP is a stream-oriented protocol between two connected programs: one acting +as the client, the other as the server. While the connection is open, the client +sends an HTTP request, which the server reads and answers with an HTTP response. +These _messages_ are paired in order; each request has exactly one +corresponding response. 
This exchange of structured messages continues until +either peer closes the connection, whether normally or due to an error. + +HTTP messages consist of three parts: the start line, the headers, and the +message body. The start line differs between requests and responses, while +the headers and body share the same structure. Headers are made up of zero +or more fields, each expressed as a name–value pair. Both the start line and +the header fields use a line-oriented text format, with each line terminated +by a CRLF sequence (carriage return followed by line feed, i.e. bytes +`0x0D 0x0A`). The message body is a sequence of bytes of defined length, +with content determined by the semantics of the start line and headers. + +The following table shows an example HTTP request and HTTP response: + +[cols="1a,1a"] +|=== +|HTTP Request|HTTP Response + +| +[source] +---- +GET /index.html HTTP/1.1\r\n +User-Agent: Boost\r\n +\r\n +---- +| +[source] +---- +HTTP/1.1 200 OK\r\n +Server: Boost.Http.Proto\r\n +Content-Length: 13\r\n +\r\n +Hello, world! +---- + +|=== + +More formally, the ABNF for HTTP messages is defined as follows: + +[cols="1a,4a"] +|=== +|Name|ABNF + +|message +|[literal] +HTTP-message = ( request-line / status-line ) + *( header-field CRLF ) + CRLF + [ message-body ] + +|request-line +|[literal] +request-line = method SP request-target SP HTTP-version CRLF + +|status-line +|[literal] +status-line = HTTP-version SP status-code SP reason-phrase CRLF + +|=== + + +Most HTTP header field values are domain-specific or application-defined, while +certain fields commonly recur. The library understands the following fields and takes +appropriate action to ensure RFC compliance: + +[cols="1a,4a"] +|=== +|Field|Description + +a| +https://tools.ietf.org/html/rfc7230#section-6.1[*Connection*] + +https://tools.ietf.org/html/rfc7230#appendix-A.1.2[*Proxy-Connection*] + +|This field lets the sender specify control options for the current connection. 
+Typical values include close, keep-alive, and upgrade. + +|https://tools.ietf.org/html/rfc7230#section-3.3.2[*Content-Length*] +|When present, this field tells the recipient the exact size of the message +body, measured in bytes, that follows the header. + +|https://tools.ietf.org/html/rfc7230#section-3.3.1[*Transfer-Encoding*] +|This optional field specifies the sequence of transfer codings that have been, +or will be, applied to the content payload to produce the message body. + +The library supports the +chunked, +gzip, +deflate, and +brotli +encoding schemes, +in any valid combination. Encodings can be automatically applied or removed +as needed by the caller. + +|https://tools.ietf.org/html/rfc7230#section-6.7[*Upgrade*] +|The Upgrade header field provides a mechanism to transition from HTTP/1.1 to +another protocol on the same connection. For example, it is the mechanism used +by WebSocket's initial HTTP handshake to establish a WebSocket connection. + +|=== + diff --git a/doc/modules/ROOT/pages/2.messages.adoc b/doc/modules/ROOT/pages/2.messages.adoc new file mode 100644 index 00000000..b003c7a3 --- /dev/null +++ b/doc/modules/ROOT/pages/2.messages.adoc @@ -0,0 +1,134 @@ +// +// Copyright (c) 2023 Vinnie Falco (vinnie.falco@gmail.com) +// +// Distributed under the Boost Software License, Version 1.0. (See accompanying +// file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt) +// +// Official repository: https://github.com/cppalliance/http_proto +// + += Messages + +The library provides both modifiable containers and immutable views for +requests, responses, and standalone field sets such as trailers. These can +be used to store incoming messages or construct outgoing ones. Unlike other +libraries, such as its predecessor Boost.Beast, this library keeps the message +body separate. In other words, the containers and views it offers +do not include the body. 
+ +NOTE: By omitting the body from its container and view types, the library avoids +the need for templates, unlike the message container in Boost.Beast. Experience +has shown that templated containers create poor ergonomics, a design flaw this +library corrects. + +The following table lists the types used to model containers and views: + +[cols="1a,4a"] +|=== +|Type|Description + +|cpp:fields[] +|A modifiable container of header fields. + +|cpp:fields_view[] +|A read-only view to a cpp:fields[] + +|cpp:message[] +|A modifiable container holding a start-line and header fields, with + accompanying metadata. + +|cpp:message_view[] +|A read-only view to a cpp:message[] + +|cpp:request[] +|A modifiable container holding a request-line and header fields, with + accompanying metadata. + +|cpp:request_view[] +|A read-only view to a cpp:request[] + +|cpp:response[] +|A modifiable container holding a status-line and header fields, with + accompanying metadata. + +|cpp:response_view[] +|A read-only view to a cpp:response[] + +|=== + +== Construction + +All containers maintain the following invariants: + +* The container’s contents are always stored in serialized form that is + syntactically valid. + +* Any modification that would produce a malformed field or start line + throws an exception, with strong exception safety guaranteed. + +To satisfy these invariants, default-constructed containers +initially consist of a valid default start line and no fields: + +[source,cpp] +---- +request req; +response res; + +assert(req.buffer() == "GET / HTTP/1.1\r\n\r\n"); +assert(res.buffer() == "HTTP/1.1 200 OK\r\n\r\n"); +---- + +The `buffer` function runs in constant time, never throws exceptions, +and returns a cpp:boost::core::string_view[] representing the complete +serialized object. 
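The two invariants above can be illustrated with a small standalone sketch. It does not use the library and is not how the library is implemented: `mini_request` and its helper functions are hypothetical names invented for this example, and the value check is a deliberate simplification of the real field grammar. The sketch only shows the general technique of keeping the message permanently in serialized form and validating before mutating, so that a failed modification leaves the buffer unchanged.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical illustration only; NOT part of Boost.HTTP.Proto.
// The request is always held as a syntactically valid serialized string.
class mini_request
{
    std::string buf_;

    // "tchar" from the HTTP token grammar: characters allowed in a field name.
    static bool is_tchar(char c)
    {
        if (c >= '0' && c <= '9') return true;
        if (c >= 'a' && c <= 'z') return true;
        if (c >= 'A' && c <= 'Z') return true;
        return std::string("!#$%&'*+-.^_`|~").find(c) != std::string::npos;
    }

    // Simplified value check (an assumption of this sketch): allow HTAB and
    // any byte >= 0x20 except DEL, which rejects CR, LF, and other controls.
    static bool is_value_char(char c)
    {
        unsigned char u = static_cast<unsigned char>(c);
        return c == '\t' || (u >= 0x20 && u != 0x7f);
    }

public:
    // A default-constructed object is already a complete, valid request.
    mini_request() : buf_("GET / HTTP/1.1\r\n\r\n") {}

    // Appends "name: value\r\n" before the terminating empty line.
    void append(std::string const& name, std::string const& value)
    {
        // Validate everything before touching the buffer.
        if (name.empty())
            throw std::invalid_argument("empty field name");
        for (char c : name)
            if (!is_tchar(c))
                throw std::invalid_argument("invalid field name");
        for (char c : value)
            if (!is_value_char(c))
                throw std::invalid_argument("invalid field value");

        // Build the new contents in a copy and swap it in, so an exception
        // (including allocation failure) leaves buf_ untouched.
        std::string next = buf_;
        next.insert(next.size() - 2, name + ": " + value + "\r\n");
        buf_.swap(next);
    }

    // Constant time and non-throwing: the serialized form already exists.
    std::string const& buffer() const noexcept { return buf_; }
};
```

With this sketch, `append("User-Agent", "Boost")` on a default-constructed object yields `GET / HTTP/1.1\r\nUser-Agent: Boost\r\n\r\n`, while `append("Bad Name", "x")` throws `std::invalid_argument` and leaves the buffer exactly as it was. Copy-and-swap is only one way to obtain the strong guarantee; the source does not say which technique the library uses.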
+ +Each header field consists of a name and a value, both stored as strings, +with prescribed ABNF format: + +[source] +---- +field-name = token +field-value = *( field-content / obs-fold ) +field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ] +field-vchar = VCHAR / obs-text + +obs-fold = CRLF 1*( SP / HTAB ) + ; obsolete line folding +---- + +Operations that create or modify fields throw an exception if the name or +value violates the syntactic requirements of the protocol. + +Although fields may be identified by comparing their names, the library +provides the field enumeration, which defines a wide set of constants for +well-known field names. Internally, containers maintain a lookup table so +that specifying fields by enumeration replaces costly string comparisons +with efficient integer comparisons. + +The following example builds an +https://tools.ietf.org/html/rfc7231#section-4.3.1[HTTP GET] +request: + +[cols="1a,1a"] +|=== +|Code|Serialized Result + +| +[source,cpp] +---- +request req( method::get, "/index.htm", version::http_1_1 ); +req.append( field::accept, "text/html" ); +req.append( "User-Agent", "Boost" ); +---- +| +[literal] +GET /index.htm HTTP/1.1\r\n +Accept: text/html\r\n +User-Agent: Boost\r\n +\r\n + +|=== + +NOTE: Field-specific syntax (e.g., for date values) is not fully validated by +this library. It is the application’s responsibility to follow the relevant +specifications to ensure correct behavior. diff --git a/doc/modules/ROOT/pages/design_requirements/parser.adoc b/doc/modules/ROOT/pages/design_requirements/parser.adoc index 1b5379ce..204756eb 100644 --- a/doc/modules/ROOT/pages/design_requirements/parser.adoc +++ b/doc/modules/ROOT/pages/design_requirements/parser.adoc @@ -9,6 +9,51 @@ = Parser Design Requirements +== Design + +=== Comparison to Boost.Beast + +This library builds on the experiences learned from Boost.Beast's seven years +of success. 
Beast brings these unique design strengths: + +* Body type named requirements +* First-class message container +* Individual parser and serializer objects + +Beast's message container suffers from these problems: + +* Templated on the body type +* Templated on the allocator type +* Node-based implementation +* Serialization is too costly + +Meanwhile, Beast's parsers and serializers suffer from these problems: + +* Buffer-at-a-time operation is clumsy +* Objects are not easily reused +* Parser is a class template because of body types + +==== Message Container + +In HTTP.Proto the message container implementation always stores the complete +message or fields in its correctly serialized form. Insertions and modifications +are performed in linear time. When the container is reused, the amortized cost +of reallocation becomes zero. A small lookup table is stored past the end of +the serialized message, permitting iteration in constant time. + +==== Parser + +The HTTP.Proto parser is designed to persist for the lifetime of the connection +or application. It allocates a fixed-size memory buffer upon construction and +uses this memory region to perform type-erasure and apply or remove content +encodings to the body. The parser is a regular class instead of a class +template, which greatly improves its ease of use over the Beast parser design. + +==== Serializer + +As with the parser, the serializer is designed to persist for the lifetime of +the connection or application and also allocates a fixed-size buffer. 
+ == Memory Allocation and Memory Utilization The `parser` must use a single block of memory allocated during construction and diff --git a/doc/modules/ROOT/pages/header_containers.adoc b/doc/modules/ROOT/pages/header_containers.adoc deleted file mode 100644 index 9c0fa11d..00000000 --- a/doc/modules/ROOT/pages/header_containers.adoc +++ /dev/null @@ -1 +0,0 @@ -= Header Containers diff --git a/doc/modules/ROOT/pages/index.adoc b/doc/modules/ROOT/pages/index.adoc index 0ac41ab6..975cfb76 100644 --- a/doc/modules/ROOT/pages/index.adoc +++ b/doc/modules/ROOT/pages/index.adoc @@ -9,33 +9,46 @@ = Boost.HTTP.Proto -Boost.HTTP.Proto is a portable C++ library which provides containers and -algorithms which implement the HTTP/1.1 protocol, widely used to deliver content -on the Internet. It adheres strictly to the HTTP/1.1 RFC specification -(henceforth referred to as https://datatracker.ietf.org/doc/html/rfc9110[rfc9110,window=blank_]). - -This library understands the grammars related to HTTP nessages and provides -functionality to validate, parse, examine, and modify messages. - -== Features +This is a portable C++ library offering containers and algorithms for implementing +the HTTP/1.1 protocol. The protocol is widely used to deliver content on the Internet, +and this implementation adheres strictly to the +https://datatracker.ietf.org/doc/html/rfc9110[HTTP/1.1 RFC specification], +henceforth referred to as the RFC. The library is distinguished by these +features: + +* Sans-I/O approach +* Requires only C++11 +* Works without exceptions +* Fast compilation, few templates +* Advanced handling of memory (RAM) + +== Sans-I/O + +While this library implements the HTTP protocol, it does so without performing +any actual network activity as the logic is completely isolated from the +underlying I/O operations. 
The implementation manages state, ensures RFC +compliance, and provides the application-level interface for building and +inspecting HTTP messages and their payloads. To exchange messages over a +network, the application must use or write an I/O layer on top of HTTP.Proto. + +The companion library Boost.HTTP.IO uses Boost.HTTP.Proto to implement network +I/O using Boost.Asio. The +https://sans-io.readthedocs.io/[sans-I/O] website goes into more depth regarding +this innovative approach to designing protocol libraries. == Requirements -The library requires a compiler supporting at least C++11. - -Standard types such as `error_code` or `string_view` use their Boost equivalents. +* Requires Boost and a compiler supporting at least C++11 +* Link to a static or dynamic build of this library +* Supports `-fno-exceptions`, detected automatically == Tested Compilers -Boost.HTTP.Proto has been tested with the following compilers: - -* clang: -* gcc: -* msvc: - -and these architectures: x86, x64 +Boost.HTTP.Proto is tested with the following compiler versions: -We do not test and support gcc 8.0.1. +* gcc: 5 to 14 (except 8.0.1) +* clang: 3.9, 4 to 18 +* msvc: 14.1 to 14.42 == Quality Assurance @@ -43,7 +56,6 @@ The development infrastructure for the library includes these per-commit analyse * Coverage reports * Compilation and tests on Drone.io and GitHub Actions -* Regular code audits for security == ABNF @@ -52,58 +64,13 @@ https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form[Backus-Naur Form,window=b (ABNF) notation of https://datatracker.ietf.org/doc/html/rfc5234[rfc5234,window=blank_] to specify particular grammars used by algorithms and containers. -While a complete understanding of the notation is not a requirement for using the -library, it may help for an understanding of how valid components of URLs are defined. -In particular, this is of interest to users who wish to compose parsing algorithms -using the combinators provided by the library. 
+While a complete understanding of the notation is not a requirement for using +the library, it may help for an understanding of how valid components of HTTP +messages are defined. In particular, this is of interest to users who wish to +compose parsing algorithms using the combinators provided by the library. == Acknowledgments This library wouldn't be where it is today without the help of https://github.com/pdimov[Peter Dimov,window=blank_] for design advice and general assistance. - -== Design - -=== Comparison to Boost.Beast - -This library builds on the experiences learned from Boost.Beast's seven years -of success. Beast brings these unique design strengths: - -* Body type named requirements -* First-class message container -* Individual parser and serializer objects - -The message container suffers from these problems: - -* Templated on the body type. -* Templated on Allocator -* Node-based implementation -* Serialization is too costly - -Meanwhile parsers and serializes suffer from these problems: - -* Buffer-at-a-time operation is clumsy. -* Objects are not easily re-used -* Parser is a class template because of body types - -==== Message Container - -In HTTP.Proto the message container implementation always stores the complete -message or fields in its correctly serialized form. Insertions and modifications -are performed in linear time. When the container is reused, the amortized cost -of reallocation becomes zero. A small lookup table is stored past the end of -the serialized message, permitting iteration in constant time. - -==== Parser - -The HTTP.Proto parser is designed to persist for the lifetime of the connection -or application. It allocates a fixed size memory buffer upon construction and -uses this memory region to perform type-erasure and apply or remove content -encodings to the body. The parser is a regular class instead of a class -template, which greatly improves its ease of use over the Beast parser design. 
- -==== Serializer - -As with the parser, the serializer is designed to persist for the lifetime of -the connection or application and also allocates a fixed size buffer.