Skip to content

net.http: add HPACK header compression (RFC 7541) for HTTP/2#27353

Merged
JalonSolov merged 2 commits into
vlang:masterfrom
quaesitor-scientiam:net-http2-hpack
Jun 6, 2026
Merged

net.http: add HPACK header compression (RFC 7541) for HTTP/2#27353
JalonSolov merged 2 commits into
vlang:masterfrom
quaesitor-scientiam:net-http2-hpack

Conversation

@quaesitor-scientiam

Copy link
Copy Markdown
Contributor

What

Adds a self-contained HPACK (RFC 7541) encoder and decoder to net.http.
HPACK is the header-compression layer required by HTTP/2; this is the first,
independently-reviewable building block toward HTTP/2 support.

This PR is purely additive: it adds new files in the http module and does
not modify any existing net.http code path. Nothing calls it yet, so there is
no user-visible behaviour change.

Why split it out

HTTP/2 is large; landing it as one PR is unreviewable (see the discussion on
#26776). HPACK is a stable, well-specified, pure-data-transformation component
that can be reviewed and tested entirely on its own, against the RFC's own
worked examples.

Files

File Contents
h2_hpack.v Integer codec, dynamic table, H2HpackEncoder, H2HpackDecoder, public H2HeaderField
h2_hpack_static.v The 61-entry static table (RFC 7541 Appendix A)
h2_hpack_huffman.v Huffman encode/decode + decode map
h2_hpack_huffman_table.v Canonical Huffman code table (Appendix B), generated
h2_hpack_test.v Tests (RFC Appendix C vectors + more)

Design notes

  • Decoder handles every field representation (indexed; literal with
    incremental indexing, without indexing, and never-indexed) plus dynamic table
    size updates, including the ordering/bounds rules from Section 6 (e.g. a size
    update may not follow a header field; an update may not exceed the advertised
    limit).
  • Encoder uses indexed representations for static-table hits and literal
    representations otherwise (never-indexed for sensitive headers such as
    cookie / authorization). It deliberately does not add to the dynamic
    table, which keeps encoder and decoder state trivially consistent while
    remaining fully interoperable. Dynamic-table-aware encoding can be added later
    without changing the decoder.
  • Security: the integer codec rejects overflowing continuations; the Huffman
    decoder rejects an explicit EOS symbol and invalid padding (RFC 7541 §5).

Tests

h2_hpack_test.v (internal module http test) covers:

  • RFC 7541 Appendix C vectors, byte-for-byte: integer (C.1), all four
    representations (C.2), and the request sequences C.3 (no Huffman) and C.4
    (Huffman) on a shared decoder. The C.3/C.4 byte strings were cross-checked
    against the python-hyper/hpack test suite.
  • Huffman round-trips and a known vector.
  • Dynamic-table eviction on add, oversized-entry-empties-table, and resize
    eviction.
  • Encode/decode round-trip including a sensitive (cookie) header.
  • Adversarial inputs: integer overflow, EOS in a Huffman string, non-all-ones
    Huffman padding, index 0, out-of-range index, size update after a field, and
    size update over the limit.
./vnew test vlib/net/http/h2_hpack_test.v          # passes
./vnew -W -cstrict -cc clang test vlib/net/http/h2_hpack_test.v  # passes
./vnew -silent test vlib/net/http/                 # full suite: 15 passed, 1 skipped (Windows)

🤖 Generated with Claude Code

Add a self-contained HPACK encoder and decoder, the first building block for
HTTP/2 support. This does not touch any existing net.http code paths; it only
adds new files in the http module.

Included:
- Variable-length integer codec (Section 5.1), with overflow rejection.
- Static Huffman codec (Section 5.2 / Appendix B), with rejection of an
  explicit EOS symbol and of invalid (too long / non-all-ones) padding. The
  code table is generated from the canonical RFC table.
- The 61-entry static table (Appendix A).
- A size-bounded dynamic table with FIFO eviction and dynamic size updates
  (Sections 2.3.2, 4).
- H2HpackDecoder: handles all field representations (indexed, literal with
  incremental indexing, without indexing, never indexed) and dynamic table
  size updates, with the ordering and bounds checks required by Section 6.
- H2HpackEncoder: emits indexed representations for static-table hits and
  literal (never-indexed for sensitive headers like cookie/authorization)
  otherwise. It does not mutate the dynamic table, keeping encoder/decoder
  state trivially consistent while remaining fully interoperable.

Tests cover the RFC 7541 Appendix C worked examples (integer C.1, all four
representations C.2, the request sequences C.3 without and C.4 with Huffman,
verified byte-for-byte against the spec), Huffman round-trips, dynamic table
eviction, encode/decode round-trips, and adversarial inputs (integer overflow,
EOS in a Huffman string, bad padding, index 0, out-of-range index, and a
dynamic table size update placed after a header field or over the limit).
Passes under -W -cstrict -cc clang.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 58606a97ba

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread vlib/net/http/h2_hpack.v Outdated
Comment thread vlib/net/http/h2_hpack.v Outdated
Validate HPACK indexes and string lengths in u64 space before narrowing to
int, so a malicious peer cannot use an oversized value to truncate past a
bounds check:

- lookup: compute the dynamic index as u64 and reject it when it exceeds the
  dynamic table length, before casting. Previously an index of (2^32 + N)
  truncated to N could resolve onto a valid dynamic entry and return the wrong
  header instead of an out-of-range error.
- read_string: compare the string length against the remaining buffer in u64
  space. Previously an oversized length truncated to a small/negative int could
  bypass the bounds check and cause an invalid slice (panic) or misdecode.

Adds regression tests that build both oversized values (2^32 + n) and assert a
clean decode error.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@JalonSolov

Copy link
Copy Markdown
Collaborator

compress.deflate also has Huffman encoding/decoding. I would rather see both use a common hash.huffman than duplicate the code.

What is here may be more efficient than what is currently in compress.deflate because of the static tables.

@quaesitor-scientiam

Copy link
Copy Markdown
Contributor Author

Good call — I looked into it, and you're right that there's real overlap. A couple of findings that might shape how we factor it:

The tables are both canonical. I verified the HPACK Huffman table (RFC 7541 Appendix B) is a canonical code: all 257 codes rebuild exactly from the bit-lengths alone using the same bl_count/next_code assignment compress.deflate already uses. So the code-assignment step (lengths[] -> codes) is genuinely common ground, and HPACK could ship just the lengths and derive the codes with that shared routine.

But the decoders can't share one structure. DEFLATE caps code lengths at 15 bits and decodes via a single flat 2^max_bits table (32K entries). HPACK codes go up to 30 bits, so a flat 2^30 table is infeasible — it needs a tree / multi-level table / bit-at-a-time approach. The two also use opposite bit order (DEFLATE is LSB-first and bit_reverses every code; HPACK is MSB-first), plus different padding/EOS semantics.

So the realistically shareable piece is a canonical-code builder (code_lengths -> {code table, decode structure}), not a full shared codec. That's worth doing, but it's a cross-cutting refactor of compress.deflate (the HuffTree there is unexported and used by zlib/gzip), so I'd rather not bundle it into this HPACK PR and risk the same "too big to review" problem.

Proposal: keep this PR as-is, and I'll open a follow-up issue to design a shared hash.huffman (or similar) canonical-Huffman builder that both compress.deflate and HPACK can use, with care taken not to regress DEFLATE's flat-table decode. As a smaller down payment I could also change HPACK here to build its codes from the lengths array (dropping the explicit codes table), which would make that later extraction nearly mechanical.

Does that split sound right to you, or would you prefer the shared module land first and HPACK depend on it?

@JalonSolov

Copy link
Copy Markdown
Collaborator

That split is fine, and would prevent (nearly) duplicate code. A separate PR to make a common hash.huffman works.

The new hash.huffman could take parameters to tell it how many bits, and lsb vs msb, so it could be shared between this code and compress.deflate, as well as allowing use in future applications.

It could be argued either direction as to if those parameters should have defaults (I'm leaning towards "No"), or if the params should be marked with required or noinit on the struct (if "No" wins, then these would make more sense).

@quaesitor-scientiam

Copy link
Copy Markdown
Contributor Author

Sounds good — I'll keep this PR as-is and open a separate PR for the shared hash.huffman.

Agreed on parameterizing by bit-width and LSB-vs-MSB; that maps cleanly onto the two real differences here (DEFLATE ≤15-bit codes + LSB-first, HPACK ≤30-bit + MSB-first), and the bit-width param is also what lets the builder pick its decode representation (a flat 2^bits table is fine at 15 bits but not at 30). I'll capture your lean toward no defaults + required/@[noinit] in the follow-up so callers must be explicit.

I've filed a tracking issue for the design/work so this doesn't get lost: see below. Happy to take the first cut at it once this lands.

@JalonSolov JalonSolov merged commit c5fca67 into vlang:master Jun 6, 2026
75 of 83 checks passed
JalonSolov pushed a commit that referenced this pull request Jun 6, 2026
* net.http: add synchronous HTTP/2 client connection (H2Conn)

Build on the ALPN (#27343), HPACK (#27353), and frame-codec (#27356) PRs to add
a minimal, synchronous, single-stream HTTP/2 client. Additive only: new files in
the http module, no change to existing code paths. Nothing wires it into
http.fetch yet (that ALPN shim is a follow-up), so there is no user-visible
behaviour change.

- H2Transport interface — the byte transport the connection runs over. Its
  read/write signatures match net.ssl.SSLConn, so an ALPN-negotiated `h2` TLS
  socket satisfies it directly; tests use an in-memory mock, so the connection
  is exercised without a socket.
- Connection preface + SETTINGS handshake (sent lazily on the first request).
- H2Conn.do(req): HPACK-encodes the request headers, sends HEADERS (plus DATA
  for a body, chunked to the peer's max frame size and bounded by the
  connection flow-control window), then reads frames until the stream closes.
- Inline servicing of connection-level frames: SETTINGS (apply + ACK), PING
  (echo ACK), WINDOW_UPDATE (grow the send window), GOAWAY (fail). Receive-side
  flow control is replenished with a WINDOW_UPDATE per DATA frame.
- CONTINUATION reassembly for response header blocks; RST_STREAM on the request
  stream is surfaced as an error.

Hermetic tests over the mock transport cover: a basic GET, a multi-DATA
response, a POST with a body, GOAWAY, RST_STREAM, CONTINUATION-split response
headers, and PING ACK. Passes under -W -cstrict -cc clang; the full
vlib/net/http suite is green.

Not included here, by design (follow-ups): stream multiplexing with a
background reader thread, connection pooling, wiring into http.fetch via ALPN,
and the server side.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* net.http: HTTP/2 client header splitting and per-stream flow control

Address two send-path correctness gaps in H2Conn:

- Split request header blocks larger than the peer's SETTINGS_MAX_FRAME_SIZE
  into a HEADERS frame followed by CONTINUATION frames (RFC 7540 Section 4.3),
  via send_header_block. Previously a large header set (e.g. big Cookie or
  Authorization) was sent as one oversized HEADERS frame that a compliant peer
  would reject.

- Track the per-stream send window in addition to the connection window
  (RFC 7540 Section 6.9). DATA is now bounded by min(connection, stream)
  window, stream-level WINDOW_UPDATE credits the active stream, and a change to
  SETTINGS_INITIAL_WINDOW_SIZE retroactively adjusts the stream window by the
  delta (Section 6.9.2). Previously only the connection window was tracked, so
  a peer lowering the initial window or granting stream-level updates could be
  overrun or could stall the client.

Adds tests for a request whose header block spans HEADERS + CONTINUATION, and
for a body larger than the initial window that resumes after the peer grows
both windows (no DATA frame exceeding the max frame size).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Richard Wheeler <quaesitor.scientiam@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants