Merged
5 changes: 2 additions & 3 deletions site/content/docs/add-framework/directory-structure.md
@@ -41,9 +41,8 @@ The benchmark runner mounts these paths into your container (read-only):
| Path | Purpose |
|------|---------|
| `/data/dataset.json` | 50-item dataset for `/json` endpoint |
| `/data/benchmark.db` | SQLite database (100K rows) for `/db` endpoint |
| `/data/static/` | 20 static files (CSS, JS, HTML, fonts, images) |
| `/data/static/` | 20 static assets (CSS, JS, HTML, fonts, images) — 15 ship with `.gz` and `.br` sibling files for precompression-aware frameworks |
| `/certs/server.crt` | TLS certificate for HTTPS/H2/H3 |
| `/certs/server.key` | TLS private key for HTTPS/H2/H3 |

All data mounts are provided unconditionally — your container always has access to all files regardless of which profiles it participates in.
Postgres (profiles `async-db`, `crud`, `api-4`, `api-16`, and the compose-orchestrated gateway + production-stack) is provided by a separate sidecar container, reachable via the `DATABASE_URL` environment variable — not a mount. Redis (profile `crud`) is similarly reachable via `REDIS_URL`. See [Configuration](../../running-locally/configuration/) for the full env var list.
@@ -10,13 +10,17 @@ Tuned entries have more freedom. They can use non-default configurations, experi
- Alternative JSON serializers (simd-json, sonic-json, etc.)
- Custom buffer sizes and TCP socket options
- Experimental or unstable framework flags
- Pre-computed responses and response caching
- Memory-mapped files and in-memory static file caching
- Custom thread pools and worker configurations
- Non-default GC settings without documentation requirement
- Framework-specific performance flags not recommended for production
- Any compression approach for static files — custom compression, pre-compressed file serving, alternative compression libraries

## What is NOT allowed

- **Pre-computed response bodies** — serializing a fixed response at startup and returning the same bytes per request (e.g. caching a JSON blob and writing it back unchanged). The serialization + compression work is the workload; bypassing it defeats the measurement.
- **Response caching** — memoizing the full HTTP response body keyed by URL/params and replaying it. This is distinct from upstream data caching (DB query results, JWT verification, etc.), which remains allowed where the profile calls for it (e.g. the CRUD profile's read cache).
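
To make the allowed/forbidden line concrete, here is a hypothetical handler pair (names and shapes are illustrative, not from any framework in the repo): the first replays fixed bytes and is banned; the second caches only the upstream data and still pays the per-request serialization cost, which is allowed.

```python
import json

# FORBIDDEN: serialize once at startup, replay the identical bytes per request.
CACHED_BODY = json.dumps({"items": list(range(50))}).encode()

def handler_forbidden(request):
    return CACHED_BODY  # bypasses per-request serialization — defeats the measurement

# ALLOWED: cache the upstream data (e.g. a DB query result) where the profile
# permits it, but serialize the response body on every request.
DB_RESULT = [{"id": i} for i in range(50)]  # stand-in for a cached query result

def handler_allowed(request):
    return json.dumps({"items": DB_RESULT}).encode()
```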

## What is still required

- Must use the framework's HTTP server (not a raw socket replacement)
51 changes: 14 additions & 37 deletions site/content/docs/add-framework/meta-json.md
@@ -9,7 +9,7 @@ Create a `meta.json` file in your framework directory:
"display_name": "your-framework",
"language": "Go",
"engine": "net/http",
"type": "framework",
"type": "production",
"description": "Short description of the framework and its key features.",
"repo": "https://github.com/org/repo",
"enabled": true,
@@ -39,53 +39,30 @@ Create a `meta.json` file in your framework directory:
| `baseline` | HTTP/1.1 | `/baseline11` |
| `pipelined` | HTTP/1.1 | `/pipeline` |
| `limited-conn` | HTTP/1.1 | `/baseline11` |
| `json` | HTTP/1.1 | `/json/{count}` |
| `json` | HTTP/1.1 | `/json/{count}?m=N` |
| `json-comp` | HTTP/1.1 | `/json/{count}?m=N` (must honor `Accept-Encoding: gzip, br`) |
| `json-tls` | HTTP/1.1 + TLS | `/json/{count}?m=N` on port 8081 (ALPN `http/1.1`) |
| `json-tls` | HTTP/1.1 + TLS | `/json/{count}?m=N` (port 8081, ALPN `http/1.1`) |
| `upload` | HTTP/1.1 | `/upload` |
| `static` | HTTP/1.1 | `/static/*` (port 8080) |
| `async-db` | HTTP/1.1 | `/async-db?limit=N` (requires `DATABASE_URL` env var) |
| `api-4` | HTTP/1.1 | `/baseline11`, `/json/{count}`, `/async-db` (4 CPU, 16 GB) |
| `api-16` | HTTP/1.1 | `/baseline11`, `/json/{count}`, `/async-db` (16 CPU, 32 GB) |
| `static` | HTTP/1.1 | `/static/*` (port 8080) |
| `async-db` | HTTP/1.1 | `/async-db?min=X&max=Y&limit=N` (requires `DATABASE_URL`) |
| `crud` | HTTP/1.1 | `/api/items`, `/api/items/{id}` (GET/POST/PUT; requires `DATABASE_URL`, optional `REDIS_URL`) |
| `baseline-h2` | HTTP/2 | `/baseline2` (TLS, port 8443) |
| `static-h2` | HTTP/2 | `/static/*` (TLS, port 8443) |
| `gateway-64` | HTTP/2 | `/static/*`, `/json`, `/async-db` via reverse proxy (TLS, port 8443) |
| `baseline-h2c` | HTTP/2 cleartext | `/baseline2` (port 8082, prior-knowledge) |
| `json-h2c` | HTTP/2 cleartext | `/json/{count}?m=N` (port 8082, prior-knowledge) |
| `baseline-h3` | HTTP/3 | `/baseline2` (QUIC, port 8443) |
| `static-h3` | HTTP/3 | `/static/*` (QUIC, port 8443) |
| `gateway-64` | HTTP/2 | Compose stack serving `/static/*`, `/json`, `/async-db`, `/baseline2` (TLS, port 8443) |
| `gateway-h3` | HTTP/3 | Compose stack serving `/static/*`, `/json`, `/async-db`, `/baseline2` (QUIC, port 8443) |
| `production-stack` | HTTP/2 | Compose stack: edge + JWT auth sidecar + Redis + server (TLS, port 8443) |
| `unary-grpc` | gRPC | `BenchmarkService/GetSum` (h2c, port 8080) |
| `unary-grpc-tls` | gRPC | `BenchmarkService/GetSum` (TLS, port 8443) |
| `stream-grpc` | gRPC | `BenchmarkService/StreamSum` (h2c, port 8080) |
| `stream-grpc-tls` | gRPC | `BenchmarkService/StreamSum` (TLS, port 8443) |
| `echo-ws` | WebSocket | `/ws` echo (port 8080) |

Only include profiles your framework supports. Frameworks missing a profile simply don't appear in that profile's leaderboard.

### async-db

The `async-db` profile requires an async PostgreSQL driver. The benchmark script starts a Postgres sidecar with 100K rows and passes `DATABASE_URL=postgres://bench:bench@localhost:5432/benchmark` to your container. Your framework must:

1. Connect to Postgres using the `DATABASE_URL` environment variable
2. Implement `GET /async-db?min=X&max=Y&limit=N` that queries: `SELECT id, name, category, price, quantity, active, tags, rating_score, rating_count FROM items WHERE price BETWEEN $1 AND $2 LIMIT $3`
3. Return JSON: `{"items": [...], "count": N}` with nested `rating: {score, count}` and `tags` as a JSON array
4. Return `{"items":[],"count":0}` if the database is unavailable
5. Use lazy connection initialization — retry connecting if Postgres isn't ready at startup
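
A sketch of the query string and the response shaping from requirements 2–4 (helper names are hypothetical; the `$1`-style placeholders assume an asyncpg-like driver):

```python
import json

# Parameterized query from requirement 2 (asyncpg-style positional placeholders).
QUERY = (
    "SELECT id, name, category, price, quantity, active, tags, "
    "rating_score, rating_count FROM items "
    "WHERE price BETWEEN $1 AND $2 LIMIT $3"
)

def shape_row(row: dict) -> dict:
    # Nest the rating columns and ensure tags is a JSON array (requirement 3).
    return {
        "id": row["id"], "name": row["name"], "category": row["category"],
        "price": row["price"], "quantity": row["quantity"], "active": row["active"],
        "tags": json.loads(row["tags"]) if isinstance(row["tags"], str) else row["tags"],
        "rating": {"score": row["rating_score"], "count": row["rating_count"]},
    }

def empty_response() -> dict:
    # Fallback body when Postgres is unreachable (requirement 4).
    return {"items": [], "count": 0}
```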

### gateway-64

The `gateway-64` profile tests your framework as part of a complete deployment stack over HTTP/2 with TLS. Unlike other tests that run a single container, this test uses **Docker Compose** to orchestrate multi-container deployments — typically a reverse proxy in front of an application server, but any architecture is allowed.

**Quick start:**

1. Create a `compose.gateway.yml` in your framework directory
2. Define your services (proxy, server, cache — whatever you need)
3. Pin each service to specific CPUs using `cpuset` — total must be exactly 64 logical CPUs (0-31 + 64-95), always in physical+SMT pairs (core N and N+64 together)
4. All services must use `network_mode: host`, `security_opt: [seccomp:unconfined]`, and appropriate ulimits
5. Use `${CERTS_DIR}`, `${DATA_DIR}`, and `${DATABASE_URL}` env vars — they are exported by the benchmark script
6. Port **8443** must serve HTTPS/H2 — this is where the load generator sends requests
7. The stack must implement `/static/*`, `/json`, `/async-db`, and `/baseline2` endpoints
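
The steps above can be sketched as a minimal two-tier `compose.gateway.yml` (service names, images, and the 32/32 CPU split are illustrative assumptions, not a prescribed layout):

```yaml
# compose.gateway.yml — illustrative two-tier sketch
services:
  proxy:
    image: nginx:latest            # example proxy; any edge is allowed
    network_mode: host
    security_opt: ["seccomp:unconfined"]
    cpuset: "0-15,64-79"           # 32 logical CPUs, in physical+SMT pairs (N and N+64)
    volumes:
      - ${CERTS_DIR}:/certs:ro     # terminates HTTPS/H2 on 8443
      - ${DATA_DIR}:/data:ro       # proxy can serve /static/* directly
  server:
    build: .
    network_mode: host
    security_opt: ["seccomp:unconfined"]
    cpuset: "16-31,80-95"          # remaining 32 — total is exactly 64
    environment:
      - DATABASE_URL=${DATABASE_URL}
    ulimits:
      nofile: 1048576
```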

**What makes this different from other tests:**
- You control the full architecture via Docker Compose
- Multiple containers compete for a shared 64-CPU budget
- The proxy, caching layer, and internal protocol choices are all part of the benchmark
- Static files can be served directly by the proxy (e.g., Nginx) instead of the application server

See the [Gateway-64 implementation guide](/docs/test-profiles/gateway/gateway-h2/implementation) for detailed documentation, three complete compose examples (two-tier, three-tier, and single-tier), CPU topology rules, and proxy configuration options.
Per-profile endpoint contracts, request/response shapes, and validation rules live under the [Test Profiles](/docs/test-profiles/) section — link to the specific profile's Implementation page from your PR description when adding a new framework.
1 change: 1 addition & 0 deletions site/content/docs/load-generators/_index.md
Expand Up @@ -11,4 +11,5 @@ HttpArena uses a different load generator for each transport / workload.
{{< card link="h2" title="HTTP/2" subtitle="h2load — nghttp2's load generator with TLS and stream multiplexing." icon="globe-alt" >}}
{{< card link="h3" title="HTTP/3" subtitle="h2load-h3 — nghttp2 + ngtcp2 for QUIC-based HTTP/3 benchmarks." icon="globe-alt" >}}
{{< card link="grpc" title="gRPC" subtitle="ghz — proto-aware gRPC load tester for streaming and unary RPCs." icon="globe-alt" >}}
{{< card link="ws" title="WebSocket" subtitle="gcannon --ws — io_uring WebSocket echo driver reusing the HTTP/1.1 engine with a frame-aware send/recv loop." icon="globe-alt" >}}
{{< /cards >}}
56 changes: 56 additions & 0 deletions site/content/docs/load-generators/ws/_index.md
@@ -0,0 +1,56 @@
---
title: WebSocket
---

HttpArena drives the `echo-ws` profile with **gcannon in `--ws` mode**. The same io_uring engine documented under [HTTP/1.1 → gcannon](../h1/gcannon/) is reused here — worker threads, per-thread provided-buffer rings, multishot receives, per-connection state — with a frame-aware send/recv loop layered on top. Using one tool across transports keeps the client-side ceiling, threading model, and CPU-pinning behavior consistent so differences in the measurement land on the server, not the generator.

## Handshake

Each worker opens TCP connections and issues an HTTP/1.1 upgrade request to the target URL (typically `http://localhost:8080/ws`). The server must respond with `HTTP/1.1 101 Switching Protocols` and the correct `Sec-WebSocket-Accept` value derived from the client's `Sec-WebSocket-Key`. Connections that fail the handshake are reported as reconnects; the validator ([WebSocket validation](../../test-profiles/ws/echo/validation/)) checks the handshake path separately and catches framework-side bugs before benchmarks run.
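
The `Sec-WebSocket-Accept` derivation the validator checks is the standard RFC 6455 one — SHA-1 of the client key concatenated with a fixed GUID, then base64:

```python
import base64, hashlib

GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455

def ws_accept(sec_websocket_key: str) -> str:
    # Server-side value for the Sec-WebSocket-Accept response header.
    digest = hashlib.sha1((sec_websocket_key + GUID).encode()).digest()
    return base64.b64encode(digest).decode()
```

For example, the RFC's sample key `dGhlIHNhbXBsZSBub25jZQ==` must yield `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`.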

## Echo loop

Once upgraded, each connection runs the steady-state loop:

1. Build a masked client-to-server text frame with a short payload
2. Send the frame via `io_uring_prep_send`
3. Wait for the server to echo it back (matched server-to-client frame)
4. On receipt, increment the per-thread frame counter and immediately send the next frame

Pipeline depth is 1 for the `echo-ws` profile — one message in flight per connection — so the measurement is effectively a back-to-back request/response loop rather than a batched burst. With thousands of concurrent connections each running this loop in parallel, the steady-state throughput reflects the server's ability to multiplex WebSocket frames across a large connection count without head-of-line blocking.

Both text frames (opcode `0x1`) and binary frames (opcode `0x2`) are exercised against the server during validation; benchmark runs use the text shape for simplicity. Framing follows RFC 6455: masked from client to server, unmasked from server to client, FIN bit set on every frame (no fragmented messages in the benchmark path).

## Command-line usage

```bash
gcannon http://localhost:8080/ws --ws \
-c <connections> -t <threads> -d <duration> -p 1
```

| Flag | Description |
|------|-------------|
| `<url>` | The WebSocket endpoint served over HTTP/1.1 (uses `http://` scheme; the upgrade is implicit) |
| `--ws` | Switches gcannon from HTTP request mode into WebSocket echo mode |
| `-c` | Total concurrent connections (distributed evenly across `-t` threads) |
| `-t` | Worker threads (each owns an io_uring and a slice of connections; defaults to `$THREADS=64`) |
| `-d` | Test duration — `5s` for `echo-ws` |
| `-p` | Pipeline depth — fixed at `1` for `echo-ws` (one message in flight per connection) |

The profile dispatcher (`scripts/lib/tools/gcannon.sh:ws-echo`) wires all of this automatically when you invoke `./scripts/benchmark.sh <framework> echo-ws`.

## Output shape

gcannon reports WebSocket results with the same layout as HTTP requests, except the summary line reads "frames sent / frames received" instead of "requests / responses":

```
2400000 frames sent in 5.00s, 2400000 frames received
Throughput: 480.00K frames/s
WS frames: 2400000
```

The parser (`gcannon_parse ws-echo`) records `frames received` as the `status_2xx` equivalent and divides by the measured duration to produce the headline RPS number shown on the [WebSocket leaderboard](/leaderboards/websocket/). One echo round-trip counts as one unit — the frames-received count from the client side, not frames-sent, because the metric is "how many echoes the framework completed," not "how many messages the benchmarker pushed into the socket."
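
The arithmetic can be sketched as a small parse of the summary line (a sketch of what `gcannon_parse ws-echo` computes, not its actual implementation):

```python
import re

def ws_frames_per_sec(summary_line: str) -> float:
    # "2400000 frames sent in 5.00s, 2400000 frames received"
    m = re.match(r"(\d+) frames sent in ([\d.]+)s, (\d+) frames received", summary_line)
    sent, duration, received = int(m.group(1)), float(m.group(2)), int(m.group(3))
    return received / duration  # completed echoes per second, not frames pushed
```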

## Why not a dedicated WebSocket tool

The two common alternatives — `wrk2` with a Lua WebSocket plugin, or `artillery` — either can't saturate the server at 64-core scale (GC + per-connection Lua overhead becomes the bottleneck) or produce non-deterministic per-thread CPU pinning that makes cross-framework comparison unreliable. Reusing gcannon means the generator's tuning story is the same one already vetted against the HTTP/1.1 profiles, and the operator-side flags (`$GCANNON_CPUS`, cpuset pinning, provided buffer ring sizing) compose identically.
17 changes: 9 additions & 8 deletions site/content/docs/running-locally/configuration.md
@@ -14,18 +14,19 @@ Defined in `scripts/lib/common.sh`. Override by exporting before you run the scr
| `DURATION` | `5s` | Load-test duration per run (`-d`/`-D` passed through to the tool). |
| `RUNS` | `3` | Measurement iterations per (profile, connection count). Best wins. |
| `THREADS` | `64` | gcannon / wrk worker threads. |
| `H2THREADS` | `128` | h2load worker threads (HTTP/2, h2c gRPC). |
| `H2THREADS` | `64` | h2load worker threads (HTTP/2, h2c gRPC). |
| `H3THREADS` | `64` | h2load-h3 worker threads (HTTP/3 over QUIC). |

In `benchmark-lite.sh`, `THREADS` / `H2THREADS` / `H3THREADS` all default to `nproc / 2` instead.
In `benchmark-lite.sh`, `THREADS` defaults to `max(nproc / 2, 1)` and `H2THREADS` / `H3THREADS` mirror `$THREADS`. Pass `--load-threads N` to override all three in one shot.

## Ports

| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | HTTP/1.1 — also h2c for gRPC. |
| `H2PORT` | `8443` | HTTPS, HTTP/2 TLS, HTTP/3 QUIC, gRPC-TLS. |
| `H1TLS_PORT` | `8081` | HTTP/1.1 + TLS, used only by the `json-tls` profile. |
| `PORT` | `8080` | HTTP/1.1 plaintext (all `h1*` profiles + `echo-ws`); also h2c for gRPC (`unary-grpc`, `stream-grpc` — prior-knowledge on the same socket). |
| `H2PORT` | `8443` | HTTPS / HTTP/2 over TLS (`baseline-h2`, `static-h2`, gateway + production-stack), HTTP/3 over QUIC (`baseline-h3`, `static-h3`, `gateway-h3`), and gRPC-TLS (`unary-grpc-tls`, `stream-grpc-tls`). |
| `H1TLS_PORT` | `8081` | HTTP/1.1 + TLS, used only by the `json-tls` profile (ALPN `http/1.1`). |
| `H2C_PORT` | `8082` | HTTP/2 cleartext prior-knowledge for the `baseline-h2c` and `json-h2c` profiles. Must be a dedicated listener that refuses HTTP/1.1 — the validator checks this explicitly. |

Every framework `Dockerfile` reads the same defaults from its env, so you rarely need to change these.

@@ -83,10 +84,10 @@ From `endpoint_tool()` in `scripts/lib/profiles.sh`:
| Endpoint | Tool |
|---|---|
| `static`, `json-tls` | wrk |
| `h2`, `static-h2`, `gateway-64`, `grpc`, `grpc-tls` | h2load |
| `h3`, `static-h3` | h2load-h3 |
| `h2`, `static-h2`, `h2c`, `json-h2c`, `gateway-64`, `grpc`, `grpc-tls`, `production-stack` | h2load |
| `h3`, `static-h3`, `gateway-h3` | h2load-h3 |
| `grpc-stream`, `grpc-stream-tls` | ghz |
| everything else (`""`, `pipeline`, `upload`, `api-4`, `api-16`, `async-db`, `json`, `json-compressed`, `ws-echo`, …) | gcannon |
| everything else (`""`, `pipeline`, `upload`, `api-4`, `api-16`, `async-db`, `crud`, `json`, `json-compressed`, `ws-echo`) | gcannon |
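
The table above amounts to a lookup with a gcannon fallback — a sketch of the dispatch logic (the real implementation is `endpoint_tool()` in `scripts/lib/profiles.sh`):

```python
WRK = {"static", "json-tls"}
H2LOAD = {"h2", "static-h2", "h2c", "json-h2c", "gateway-64",
          "grpc", "grpc-tls", "production-stack"}
H2LOAD_H3 = {"h3", "static-h3", "gateway-h3"}
GHZ = {"grpc-stream", "grpc-stream-tls"}

def endpoint_tool(endpoint: str) -> str:
    if endpoint in WRK:
        return "wrk"
    if endpoint in H2LOAD:
        return "h2load"
    if endpoint in H2LOAD_H3:
        return "h2load-h3"
    if endpoint in GHZ:
        return "ghz"
    return "gcannon"  # default path: "", pipeline, upload, json, crud, ws-echo, ...
```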

## Small-machine overrides

4 changes: 2 additions & 2 deletions site/content/docs/running-locally/scripts/benchmark-lite.md
@@ -12,8 +12,8 @@ weight: 4
| Default load generators | Native binaries | **Always** docker (forced — no env override) |
| CPU pinning | Per-profile `--cpuset-cpus` | None — all containers see every core |
| `THREADS` default | 64 | `nproc / 2` |
| `H2THREADS` / `H3THREADS` default | 128 / 64 | Same as `THREADS` |
| Profile set | 21 profiles | 15 — skips `api-4`, `api-16`, `json-tls`, `gateway-64`, `stream-grpc`, `stream-grpc-tls` |
| `H2THREADS` / `H3THREADS` default | 64 / 64 | Same as `THREADS` |
| Profile set | 26 profiles | 15 — skips `api-4`, `api-16`, `json-tls`, `crud`, `baseline-h2c`, `json-h2c`, `gateway-64`, `gateway-h3`, `production-stack`, `stream-grpc`, `stream-grpc-tls` |
| Connection counts | Varies (512, 1024, 4096, 16384, …) | One per profile (mostly 512; upload 128; h3 64) |
| Framework selection | One framework, always | Optional — runs every enabled framework if omitted |

15 changes: 8 additions & 7 deletions site/content/docs/running-locally/scripts/benchmark.md
@@ -64,16 +64,17 @@ Set via `VAR=value ./scripts/benchmark.sh ...` or `export VAR=value`.
| `DURATION` | `5s` | `-d`/`-D` value passed to each load generator. |
| `RUNS` | `3` | Measurement iterations per (profile, conns). Best result wins. |
| `THREADS` | `64` | Load-generator threads for gcannon, wrk, and the default path. |
| `H2THREADS` | `128` | h2load worker threads (h2, h2c gRPC). |
| `H2THREADS` | `64` | h2load worker threads (h2, h2c gRPC). |
| `H3THREADS` | `64` | h2load-h3 worker threads (HTTP/3 over QUIC). |

### Ports

| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | HTTP/1.1 (and h2c for gRPC). |
| `H2PORT` | `8443` | HTTPS, HTTP/2 TLS, HTTP/3 QUIC, gRPC-TLS. |
| `PORT` | `8080` | HTTP/1.1 plaintext (all `h1*` profiles + `echo-ws`); also h2c for gRPC (`unary-grpc`, `stream-grpc`). |
| `H2PORT` | `8443` | HTTPS / HTTP/2 TLS (`baseline-h2`, `static-h2`, gateway + production-stack), HTTP/3 QUIC (`baseline-h3`, `static-h3`, `gateway-h3`), gRPC-TLS (`unary-grpc-tls`, `stream-grpc-tls`). |
| `H1TLS_PORT` | `8081` | HTTP/1.1 + TLS — only used by the `json-tls` profile. |
| `H2C_PORT` | `8082` | HTTP/2 cleartext prior-knowledge for `baseline-h2c` and `json-h2c`. Must refuse HTTP/1.1 — the validator checks this. |

### Load generator selection

@@ -93,11 +94,11 @@ LOADGEN_DOCKER=true ./scripts/benchmark.sh aspnet-minimal

| Variable | Default | Used for |
|---|---|---|
| `GCANNON` | `gcannon` | Native binary — baseline, pipelined, limited-conn, json, json-comp, upload, api-4/16, async-db, echo-ws. |
| `GCANNON` | `gcannon` | Native binary — baseline, pipelined, limited-conn, json, json-comp, upload, api-4/16, async-db, crud, echo-ws. |
| `GCANNON_IMAGE` | `gcannon:latest` | Docker image when `LOADGEN_DOCKER=true`. |
| `H2LOAD` | `h2load` | Native binary — baseline-h2, static-h2, unary-grpc, unary-grpc-tls, gateway-64. |
| `H2LOAD` | `h2load` | Native binary — baseline-h2, static-h2, baseline-h2c, json-h2c, unary-grpc, unary-grpc-tls, gateway-64, production-stack. |
| `H2LOAD_IMAGE` | `h2load:latest` | Docker image (Ubuntu 24.04 + glibc build; do **not** use the alpine/musl image — it's 20–40% slower). |
| `H2LOAD_H3` | `h2load-h3` | Native binary — baseline-h3, static-h3. |
| `H2LOAD_H3` | `h2load-h3` | Native binary — baseline-h3, static-h3, gateway-h3. |
| `H2LOAD_H3_IMAGE` | `h2load-h3:local` | Docker image with `quictls` + `nghttp3` + `ngtcp2` + `nghttp2 --enable-http3` built from source. |
| `WRK` | `wrk` | Native binary — static, json-tls. |
| `WRK_IMAGE` | `wrk:local` | Docker image. |
@@ -109,7 +110,7 @@ LOADGEN_DOCKER=true ./scripts/benchmark.sh aspnet-minimal
| Variable | Default | Description |
|---|---|---|
| `PG_CONTAINER` | `httparena-postgres` | Name of the sidecar container. |
| `DATABASE_URL` | `postgres://bench:bench@localhost:5432/benchmark` | Passed to framework containers for `async-db`, `api-4`, `api-16`, `gateway-64`. |
| `DATABASE_URL` | `postgres://bench:bench@localhost:5432/benchmark` | Passed to framework containers for `async-db`, `crud`, `api-4`, `api-16`, `gateway-64`, `gateway-h3`, `production-stack`. |

## Profiles
