v0.5.0 — efficiency + observability + cohort honesty

@wdunn001 wdunn001 released this 18 May 05:13
· 8 commits to main since this release

Wire-additive over v0.4: happy-path bytes are identical between v0.4 and v0.5. Upgrading from v0.4.1 to v0.5.0 is non-breaking; bump the package version and existing code keeps working.

Wire-protocol additions (all opt-in)

  • Delta-varint stream encoding (stream_format: "msgpack-delta" / "protobuf-delta") — ~10-15% wire reduction pre-zstd
  • Discoverable Zstandard dictionaries at <origin>/.well-known/codec/dicts/<sha256>.zstd — hash-pinned, closes the v0.4.1 silent-COPY-dicts-drop regression class
  • GPU-side latent quantize fast path: LatentStreamEncoderOptions.gpu_quantize=True for torch.cuda.Tensor inputs, ~75% PCIe reduction on int4 SDXL
  • Bolt-on tool dispatcher contract — engine dispatches to manifest-published tools without ever detokenizing the model's <tool_call> region
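
To illustrate the general idea behind a delta-varint stream format, here is a minimal sketch, not the Codec wire format itself. These notes do not say which fields are delta-coded or how the bytes are framed, so the sketch assumes a plain list of integers (token IDs, say), zigzag-mapped deltas, and LEB128-style varints. All function names are hypothetical.

```python
# Illustrative only: generic delta + zigzag + LEB128 varint coding.
# The actual "msgpack-delta" / "protobuf-delta" layouts are defined by the spec,
# not by this sketch.

def zigzag(n: int) -> int:
    """Map a signed int to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def unzigzag(z: int) -> int:
    """Inverse of zigzag()."""
    return (z >> 1) if (z & 1) == 0 else -((z + 1) >> 1)


def encode_varint(u: int) -> bytes:
    """Encode a non-negative int as 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)  # continuation bit: more groups follow
        else:
            out.append(byte)
            return bytes(out)


def delta_varint_encode(values: list[int]) -> bytes:
    """Encode each value as the varint of its zigzagged delta from the previous one."""
    out = bytearray()
    prev = 0
    for v in values:
        out += encode_varint(zigzag(v - prev))
        prev = v
    return bytes(out)


def delta_varint_decode(data: bytes) -> list[int]:
    """Inverse of delta_varint_encode()."""
    values: list[int] = []
    prev = acc = shift = 0
    for b in data:
        acc |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            prev += unzigzag(acc)
            values.append(prev)
            acc = shift = 0
    if shift:
        raise ValueError("truncated varint at end of input")
    return values
```

Small deltas fit in a single byte, which is where the savings come from when neighbouring values are close; the ~10-15% figure above is the project's own measurement and is not something this sketch reproduces.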

Client cohort — 11 artifacts at 0.5.0

  • npm: @codecai/{web,web-safety,web-llm,maps-cli,mcp-leaf,tool-kit,wire-compress}
  • PyPI: codecai
  • NuGet: Codec.Net
  • crates: codec-rs
  • Maven Central: ai.codec:codec

New cross-cohort surfaces:

  • Content-aware and per-stack-aware compression picker rewrite, with a typed PickReasonCode enum
  • policies-enumerate subcommand on @codecai/maps-cli (resolves v0.4-OQ4)
  • @codecai/tool-kit promoted to first-class family member

Engine cohort — 5 images live on Docker Hub

wdunn001/codec-{sglang,vllm,llamacpp,comfyui,diffusers}:v0.5.0 and :latest. Each image now bakes the canonical zstd dicts at /opt/codec/dicts/, ships /opt/codec/check-dict-availability.sh (§1.7 sub-gate 2 probe), and is dep-verified for import brotli, zstandard, msgpack before push (§1.9).
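
As a rough illustration of what "hash-pinned" buys you, the following sketch audits a directory of baked dictionaries. It assumes each file is named <sha256>.zstd where the stem is the lowercase hex SHA-256 of the file's own bytes; these notes do not state exactly what is hashed, so treat that as an assumption. It is not a reimplementation of check-dict-availability.sh, whose contents are not shown here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def dict_matches_pin(path: Path) -> bool:
    """Return True if the file name stem equals the SHA-256 of the file contents.

    Assumption: the pin is the hex digest of the raw dictionary bytes.
    """
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return digest == path.stem.lower()


def audit_dict_dir(directory: Path) -> dict[str, bool]:
    """Map every *.zstd file in `directory` to whether it matches its pin."""
    return {p.name: dict_matches_pin(p) for p in sorted(directory.glob("*.zstd"))}
```

Inside an engine image this could be run as `audit_dict_dir(Path("/opt/codec/dicts"))`. Any `False` entry would indicate a dictionary whose bytes no longer match the name a client would request.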

Upstream PRs filed: sgl-project/sglang#25544, vllm-project/vllm#42896. Both DCO-signed.

wdunn001/codec-tgi is DROPPED — TGI treated as a dead project; cohort is sglang + vLLM + llama.cpp + ComfyUI + diffusers only.

Benchmarks

Full cohort at packages/bench/results/2026-05-17T23-06-45Z/.

§1b engine-output @ 2K tokens, Codec msgpack + dict-zstd:

Engine      JSON-SSE    Best Codec   Reduction
llama.cpp   528.8 KB    140 B        3,868×
sglang      485.2 KB    291 B        1,707×
vllm        517.8 KB    3.9 KB       137×

Byte-identical to v0.4.1 — confirms the wire-additive invariant.
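
For readers who want to sanity-check the Reduction column, here is a small sketch of the arithmetic. It assumes 1 KB = 1024 bytes, which is an inference from the figures rather than something stated in these notes; under that assumption the llama.cpp and sglang rows reproduce exactly. The vllm row is left out because its Codec size is only given rounded to 3.9 KB, so its ratio cannot be recovered precisely from the table.

```python
KB = 1024  # assumption: the table's "KB" means 1024 bytes


def reduction(json_sse_bytes: float, codec_bytes: float) -> int:
    """Ratio of JSON-SSE size to Codec size, rounded to the nearest integer."""
    return round(json_sse_bytes / codec_bytes)


llama_cpp = reduction(528.8 * KB, 140)  # 3868
sglang = reduction(485.2 * KB, 291)     # 1707
```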

§2 cross-language interop: 72/72 wire-unanimous + 72/72 decode-unanimous across 3 engines × 6 client languages.

Release-process additions

  • docs/RELEASE_CHECKLIST.md §1.7 — zstd dictionary availability gate (4 sub-gates)
  • docs/RELEASE_CHECKLIST.md §1.9 — engine image protocol-critical dep audit
  • docs/RELEASE_CHECKLIST.md §3.5 — bench surface coverage gate (codifies per-surface re-run triggers)
  • packages/bench/scripts/release-bench.sh — one-shot orchestrator for the full §3 + §3.5 cohort

IETF Internet-Draft

docs/submissions/draft-dunn-codec-00.md ready for submission as an Independent Submission. Companion SUBMITTING.md walkthrough covers the kdrfc → datatracker workflow.


Full notes: CHANGELOG.md. Spec: spec/versions/v0.5.md. Versioning policy: spec/versions/v0.4.md#versioning-policy.