Wire-additive over v0.4. v0.4 → v0.5 happy-path bytes identical. v0.4.1 → v0.5.0 is non-breaking; bump the package version and existing code keeps working.
Wire-protocol additions (all opt-in)
- Delta-varint stream encoding (
stream_format: "msgpack-delta"/"protobuf-delta") — ~10-15% wire reduction pre-zstd - Discoverable Zstandard dictionaries at
<origin>/.well-known/codec/dicts/<sha256>.zstd— hash-pinned, closes the v0.4.1 silent-COPY-dicts-drop regression class - GPU-side latent quantize fast path —
LatentStreamEncoderOptions.gpu_quantize=Truefortorch.cuda.Tensorinputs, ~75% PCIe reduction on int4 SDXL - Bolt-on tool dispatcher contract — engine dispatches to manifest-published tools without ever detokenizing the model's
<tool_call>region
Client cohort — 11 artifacts at 0.5.0
- npm:
@codecai/{web,web-safety,web-llm,maps-cli,mcp-leaf,tool-kit,wire-compress} - PyPI:
codecai - NuGet:
Codec.Net - crates:
codec-rs - Maven Central:
ai.codec:codec
New cross-cohort surfaces: content-aware + per-stack-aware compression picker rewrite with typed PickReasonCode enum, policies-enumerate subcommand on @codecai/maps-cli (resolves v0.4-OQ4), @codecai/tool-kit promoted to first-class family member.
Engine cohort — 5 images live on Docker Hub
wdunn001/codec-{sglang,vllm,llamacpp,comfyui,diffusers}:v0.5.0 and :latest. Each image now bakes the canonical zstd dicts at /opt/codec/dicts/, ships /opt/codec/check-dict-availability.sh (§1.7 sub-gate 2 probe), and is dep-verified for import brotli, zstandard, msgpack before push (§1.9).
Upstream PRs filed: sgl-project/sglang#25544, vllm-project/vllm#42896. Both DCO-signed.
wdunn001/codec-tgi is DROPPED — TGI treated as a dead project; cohort is sglang + vLLM + llama.cpp + ComfyUI + diffusers only.
Benchmarks
Full cohort at packages/bench/results/2026-05-17T23-06-45Z/.
§1b engine-output @ 2K tokens, Codec msgpack + dict-zstd:
| Engine | JSON-SSE | Best Codec | Reduction |
|---|---|---|---|
| llama.cpp | 528.8 KB | 140 B | 3,868× |
| sglang | 485.2 KB | 291 B | 1,707× |
| vllm | 517.8 KB | 3.9 KB | 137× |
Byte-identical to v0.4.1 — confirms the wire-additive invariant.
§2 cross-language interop: 72/72 wire-unanimous + 72/72 decode-unanimous across 3 engines × 6 client languages.
Release-process additions
docs/RELEASE_CHECKLIST.md§1.7 — zstd dictionary availability gate (4 sub-gates)docs/RELEASE_CHECKLIST.md§1.9 — engine image protocol-critical dep auditdocs/RELEASE_CHECKLIST.md§3.5 — bench surface coverage gate (codifies per-surface re-run triggers)packages/bench/scripts/release-bench.sh— one-shot orchestrator for the full §3 + §3.5 cohort
IETF Internet-Draft
docs/submissions/draft-dunn-codec-00.md ready for submission as an Independent Submission. Companion SUBMITTING.md walkthrough covers the kdrfc → datatracker workflow.
Full notes: CHANGELOG.md. Spec: spec/versions/v0.5.md. Versioning policy: spec/versions/v0.4.md#versioning-policy.