HTTP response/request compression (gzip via zlib-ng) — phase 1 #8

@EdmondDantes

Goal

Add HTTP body compression for responses and decoding for requests. One codec on the wire in phase 1 (gzip), one engine (zlib-ng), shared across H1/H2/H3 since Content-Encoding is a representation-level mechanism (RFC 9110 §8.4) — independent of transport. HPACK/QPACK handle header compression in H2/H3 already and are out of scope.

Future codecs (Brotli, zstd) are deferred to phase 2; pre-compressed static assets to phase 3. The encoder API is shaped as a vtable from day one so phase 2 plugs in without touching the response pipeline.

Scope (phase 1)

  • gzip response encoding via zlib-ng (fallback to system zlib if zlib-ng absent)
  • gzip request body decoding with anti-bomb size cap
  • Accept-Encoding negotiation: q-values, identity;q=0, *, 406 when unsatisfiable
  • Skip rules: status 1xx/204/304, HEAD, Range, already-encoded, body below threshold, MIME not in whitelist
  • Vary: Accept-Encoding on compressed responses; Content-Length dropped on streaming, kept on buffered small-body fast path
  • Per-response opt-out ($response->setNoCompression()) for BREACH-sensitive endpoints
  • Common path above H1 chunked / H2 DATA / H3 DATA — single implementation, three transports
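
The negotiation bullet above can be sketched roughly as follows. This is a minimal illustration, not the project's parser: `negotiate_encoding`, `neg_result_t`, and `parse_q` are hypothetical names. It assumes gzip is chosen whenever the client accepts it with q > 0 (no ranking against identity), and that a request without an Accept-Encoding header is served uncompressed.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical result type; NEG_NOT_ACCEPTABLE would map to a 406 response. */
typedef enum { NEG_IDENTITY, NEG_GZIP, NEG_NOT_ACCEPTABLE } neg_result_t;

/* Read an optional ";q=<value>" parameter in [p, end). Default weight is 1.0. */
static double parse_q(const char *p, const char *end) {
    while (p < end) {
        while (p < end && (*p == ';' || *p == ' ' || *p == '\t')) p++;
        if (end - p >= 2 && (p[0] == 'q' || p[0] == 'Q') && p[1] == '=') {
            double q = strtod(p + 2, NULL);  /* stops at ',', ';' or NUL */
            if (q < 0.0) q = 0.0;
            if (q > 1.0) q = 1.0;
            return q;
        }
        while (p < end && *p != ';') p++;    /* skip an unrelated parameter */
    }
    return 1.0;
}

neg_result_t negotiate_encoding(const char *header) {
    double q_gzip = -1.0, q_identity = -1.0, q_star = -1.0;  /* -1 = not listed */
    if (header == NULL) return NEG_IDENTITY;                 /* assumption: no header, no gzip */

    const char *p = header;
    while (*p) {
        const char *end = strchr(p, ',');
        if (end == NULL) end = p + strlen(p);
        while (p < end && (*p == ' ' || *p == '\t')) p++;
        const char *name_end = p;
        while (name_end < end && *name_end != ';' &&
               *name_end != ' ' && *name_end != '\t') name_end++;
        size_t n = (size_t)(name_end - p);
        double q = parse_q(name_end, end);

        if (n == 4 && strncasecmp(p, "gzip", 4) == 0) q_gzip = q;
        else if (n == 8 && strncasecmp(p, "identity", 8) == 0) q_identity = q;
        else if (n == 1 && *p == '*') q_star = q;

        p = (*end == ',') ? end + 1 : end;
    }

    /* "*" stands in for any coding that is not listed explicitly. */
    double gzip = q_gzip >= 0.0 ? q_gzip : (q_star >= 0.0 ? q_star : 0.0);
    if (gzip > 0.0) return NEG_GZIP;

    /* identity stays acceptable unless excluded directly or through "*;q=0". */
    double identity = q_identity >= 0.0 ? q_identity : (q_star >= 0.0 ? q_star : 1.0);
    return identity > 0.0 ? NEG_IDENTITY : NEG_NOT_ACCEPTABLE;
}
```

With this shape, `gzip;q=0` falls back to identity, while `identity;q=0` with no acceptable coding yields the unsatisfiable case.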

HttpServerConfig knobs

All editable via setters on HttpServerConfig, frozen after HttpServer::__construct (matches existing setWs* pattern):

| Setter | Default | Validation |
| --- | --- | --- |
| setCompressionEnabled(bool) | true | |
| setCompressionLevel(int) | 6 | 1..9 |
| setCompressionMinSize(int) | 1024 | ≥ 0 |
| setCompressionMimeTypes(array) | whitelist below | non-empty strings, lowercased + trimmed |
| setRequestMaxDecompressedSize(int) | 10 MiB | ≥ 0 (0 = disabled) |

getCompressionMimeTypes() returns the materialized list so introspection reflects the actual policy.
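
As an illustration of the validation column, the checks for the level range and the MIME entries might look like the sketch below. The helper names `compression_level_is_valid` and `normalize_mime_type` are hypothetical and not taken from the codebase; the real setters presumably operate on PHP values rather than C strings.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Accept only the 1..9 range listed for setCompressionLevel(). */
bool compression_level_is_valid(int level) {
    return level >= 1 && level <= 9;
}

/* Trim surrounding whitespace and lowercase `in` into `out`.
 * Returns false for NULL input, an empty result, or a value that
 * does not fit into `out_cap` bytes (including the terminator). */
bool normalize_mime_type(const char *in, char *out, size_t out_cap) {
    if (in == NULL || out == NULL) return false;

    while (*in && isspace((unsigned char)*in)) in++;           /* leading */
    size_t len = strlen(in);
    while (len > 0 && isspace((unsigned char)in[len - 1])) len--;  /* trailing */

    if (len == 0 || len >= out_cap) return false;
    for (size_t i = 0; i < len; i++) {
        out[i] = (char)tolower((unsigned char)in[i]);
    }
    out[len] = '\0';
    return true;
}
```

Normalizing once at configuration time keeps the per-response MIME comparison a plain byte-wise match.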

Default MIME whitelist

Lives in a dedicated TU (src/http/compression/http_compression_defaults.c) exporting extern const char *const http_compression_default_mime_types[] (NULL-terminated, lowercase, sorted). Materialized into the config HashTable at init; setCompressionMimeTypes() replaces wholesale (nginx semantics).

```
application/javascript
application/json
application/xml
image/svg+xml
text/css
text/html
text/javascript
text/plain
text/xml
```
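
Because the default list is sorted and lowercase, a binary search over it is one way to test a response's Content-Type. The sketch below works on a local copy named `demo_default_mime_types` (a stand-in for the exported array) and a hypothetical `mime_is_compressible` helper; the issue itself says the list is materialized into a HashTable, so this only illustrates the param-stripped match, not the actual lookup.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for the default list: NULL-terminated, lowercase, sorted. */
static const char *const demo_default_mime_types[] = {
    "application/javascript", "application/json", "application/xml",
    "image/svg+xml", "text/css", "text/html", "text/javascript",
    "text/plain", "text/xml", NULL
};

static int cmp_mime(const void *key, const void *elem) {
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Strip parameters (";charset=..."), trim, lowercase, then search the list. */
bool mime_is_compressible(const char *content_type) {
    char buf[128];
    size_t n = 0;

    if (content_type == NULL) return false;
    while (*content_type == ' ' || *content_type == '\t') content_type++;

    while (content_type[n] != '\0' && content_type[n] != ';') {
        if (n + 1 >= sizeof buf) return false;       /* too long to be a known type */
        buf[n] = (char)tolower((unsigned char)content_type[n]);
        n++;
    }
    while (n > 0 && (buf[n - 1] == ' ' || buf[n - 1] == '\t')) n--;
    buf[n] = '\0';

    /* Element count excludes the trailing NULL sentinel. */
    size_t count = sizeof demo_default_mime_types / sizeof demo_default_mime_types[0] - 1;
    return bsearch(buf, demo_default_mime_types, count,
                   sizeof demo_default_mime_types[0], cmp_mime) != NULL;
}
```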

Build

  • --enable-http-compression in config.m4, equivalent CMake flag
  • pkg-config probe: zlib-ng preferred, zlib fallback; print chosen engine in build banner
  • Compression module isolated under src/http/compression/ so phase-2 codec backends drop in alongside

Implementation plan (commits)

  1. build + skeleton — feature flag, pkg-config, http_encoder_t vtable, gzip-backend stub. No call sites yet.
  2. HttpServerConfig knobs — five setters above + freeze + validation; phpt for setter validation and immutability after bind.
  3. Accept-Encoding parser + negotiation — q-values, identity, *, MIME match (param-stripped), skip heuristics; unit tests in tests/unit/.
  4. gzip encoder via zlib-ng (streaming) — vtable impl, window=15+16 (gzip wrap), correct teardown on early disconnect / write-error; standalone unit tests.
  5. Wire into response pipeline — single hook above H1 chunked / H2 DATA / H3 DATA; Content-Encoding, Vary, Content-Length handling; per-response setNoCompression() opt-out; phpt covering H1 + H2.
  6. Request body decoding — transparent gzip decoding in request body pipeline, anti-bomb cap → 413, unknown coding → 415; phpt for golden path, bomb limit, unknown coding.
  7. Docs + CHANGELOG.
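
To make the vtable idea from steps 1 and 4 concrete, here is a rough sketch. The issue does not give the layout of http_encoder_t, so every name below (`encoder_vtable_t`, `encoder_t`, `passthru_encoder`, `encode_all`) is hypothetical, and a pass-through backend stands in for gzip so the example builds without zlib.

```c
#include <stddef.h>
#include <string.h>

typedef struct encoder encoder_t;

/* Hypothetical codec interface: one instance per backend. */
typedef struct {
    const char *content_coding;   /* token a backend would put in Content-Encoding */
    int  (*init)(encoder_t *enc, int level);
    /* Encode `in` into `out`; returns bytes produced, or -1 on error. */
    long (*write)(encoder_t *enc, const void *in, size_t in_len,
                  void *out, size_t out_cap);
    /* Flush any trailer; returns bytes produced, or -1 on error. */
    long (*finish)(encoder_t *enc, void *out, size_t out_cap);
    void (*destroy)(encoder_t *enc);
} encoder_vtable_t;

struct encoder {
    const encoder_vtable_t *vt;
    void *state;                  /* backend-private, e.g. a stream object */
};

/* Pass-through backend, used here only to exercise the interface. */
static int passthru_init(encoder_t *enc, int level) {
    (void)level;
    enc->state = NULL;
    return 0;
}
static long passthru_write(encoder_t *enc, const void *in, size_t in_len,
                           void *out, size_t out_cap) {
    (void)enc;
    if (in_len > out_cap) return -1;
    memcpy(out, in, in_len);
    return (long)in_len;
}
static long passthru_finish(encoder_t *enc, void *out, size_t out_cap) {
    (void)enc; (void)out; (void)out_cap;
    return 0;                     /* no trailer for pass-through */
}
static void passthru_destroy(encoder_t *enc) { enc->state = NULL; }

const encoder_vtable_t passthru_encoder = {
    "identity", passthru_init, passthru_write, passthru_finish, passthru_destroy
};

/* The caller only goes through the vtable, so adding a backend
 * does not change this function. */
long encode_all(const encoder_vtable_t *vt, int level,
                const void *in, size_t in_len, void *out, size_t out_cap) {
    encoder_t enc = { vt, NULL };
    if (vt->init(&enc, level) != 0) return -1;

    long n = vt->write(&enc, in, in_len, out, out_cap);
    long tail = -1;
    if (n >= 0) tail = vt->finish(&enc, (char *)out + n, out_cap - (size_t)n);

    vt->destroy(&enc);            /* teardown runs on success and on error */
    return (n < 0 || tail < 0) ? -1 : n + tail;
}
```

A gzip backend in this shape would presumably wrap zlib's `deflateInit2()` with windowBits 15+16 in `init`, `deflate()` in `write` and `finish`, and `deflateEnd()` in `destroy`, which is where the early-disconnect teardown from step 4 would live.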

phpt coverage

Response (H1 + H2 where feasible):

  • identity → not compressed
  • gzip in Accept-Encoding → compressed + Vary set
  • gzip;q=0 → not compressed
  • below threshold → not compressed
  • non-whitelist MIME (e.g. image/png) → not compressed
  • HEAD → not compressed
  • Range response → not compressed
  • setNoCompression() → not compressed
  • handler already set Content-Encoding → no double-encode
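
The response cases above all reduce to a single yes/no decision. A minimal sketch of such a predicate, assuming the body length is known up front (the buffered path) and using a hypothetical `response_facts_t` struct and `should_compress` function:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what the pipeline knows about one response. */
typedef struct {
    int    status;               /* HTTP status code */
    bool   is_head;              /* request method was HEAD */
    bool   is_range;             /* response answers a Range request */
    bool   already_encoded;      /* handler set Content-Encoding itself */
    bool   opted_out;            /* setNoCompression() was called */
    bool   client_accepts_gzip;  /* outcome of Accept-Encoding negotiation */
    bool   mime_whitelisted;     /* Content-Type matched the whitelist */
    size_t body_len;             /* body size in bytes */
} response_facts_t;

/* Returns true only when none of the skip rules applies. */
bool should_compress(const response_facts_t *r, size_t min_size) {
    if (!r->client_accepts_gzip || r->opted_out) return false;
    if ((r->status >= 100 && r->status < 200) ||
        r->status == 204 || r->status == 304) return false;
    if (r->is_head || r->is_range || r->already_encoded) return false;
    if (r->body_len < min_size) return false;     /* below threshold */
    return r->mime_whitelisted;
}
```

Keeping the decision in one pure function would let each phpt case map to one field flipped away from the compressible baseline.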

Request:

  • gzipped POST → handler sees decoded body
  • decompressed size exceeds limit → 413
  • unknown Content-Encoding → 415
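
The two request-side error paths can be sketched without the decoder itself. In the hypothetical helpers below, `request_coding_status` rejects any coding other than gzip, and `decode_budget_consume` is called with the number of bytes each inflate step produced. Treating a cap of 0 as "disabled" follows the config table; the names and the struct are assumptions.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

enum { REQ_OK = 0, REQ_TOO_LARGE = 413, REQ_UNSUPPORTED = 415 };

/* Running total of decoded bytes against the configured cap. */
typedef struct {
    size_t max_decoded;   /* 0 = cap disabled */
    size_t decoded;       /* bytes produced so far */
} decode_budget_t;

/* Absent Content-Encoding or gzip is accepted; anything else maps to 415. */
int request_coding_status(const char *content_encoding) {
    if (content_encoding == NULL || *content_encoding == '\0') return REQ_OK;
    if (strcasecmp(content_encoding, "gzip") == 0) return REQ_OK;
    return REQ_UNSUPPORTED;
}

/* Account for `produced` freshly decoded bytes; 413 once the cap is exceeded.
 * The comparison is written as a subtraction so it cannot overflow. */
int decode_budget_consume(decode_budget_t *b, size_t produced) {
    if (b->max_decoded != 0 && produced > b->max_decoded - b->decoded) {
        return REQ_TOO_LARGE;
    }
    b->decoded += produced;
    return REQ_OK;
}
```

Checking the budget after every output chunk, rather than once at the end, is what keeps a decompression bomb from being fully inflated before it is rejected.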

H3 cross-version test gated on whether CI has an lsquic client harness; otherwise tracked as a follow-up.

Security

  • BREACH mitigation policy: setNoCompression() documented for endpoints reflecting user input next to secrets.
  • Anti-bomb cap on request decoding is on by default (10 MiB); 0 disables it and must be set explicitly.
  • Whitelist semantics on MIME (not blacklist): content of an unknown type is left uncompressed rather than risk corrupting binaries served with a mistaken MIME type.

Out of scope (phase 2+)

  • Brotli (libbrotli), zstd (libzstd) — same vtable, additional backends
  • Threadpool offload for large bodies — measure first, optimize on evidence
  • Pre-compressed static assets (*.gz, *.br on disk served via sendfile)
  • deflate codec (historical raw-vs-zlib-wrapped ambiguity — skip)

Metadata

Labels: enhancement
Status: Done