Releases: headroomlabs-ai/headroom
Releases · headroomlabs-ai/headroom
Release v0.27.0
0.27.0 (2026-06-22)
Features
- cli: add headroom doctor setup diagnostics (#926) (e45cf4e)
- cli: add headroom update command and release banner (#1088) (26be2c3)
- compression extraction — Rust knob exposure, CCR hardening, traffic audits (#818) (b7be381)
- measure and surface token throughput (tokens/sec) through the proxy (#983) (0d89c67)
- output-token reduction — verbosity shaper, per-user learning, counterfactual savings (#965) (a99dc61)
- policy: decay P_alive from idle time near cache TTL (#856 P3b) (#1028) (fe4f9ee)
- providers: add Cortex Code (Snowflake CoCo) as a supported agent (#1190) (d9d0bf4)
- proxy: cc-switch reconciler — keep Headroom in the request path alongside cc-switch (#1030) (e8fc8a0)
- proxy: hot-reload live env knobs so a reused proxy picks them up without a restart (#1090) (6904d47)
- proxy: make COMPRESSION_TIMEOUT_SECONDS configurable via env (#946) (#991) (addebdb)
- transforms: tabular + spreadsheet (.xlsx/.xls) compression (#1128) (d789a7c)
- vertex: turnkey Claude Code + Vertex compression (+ fixes from the Vertex review) (#1113) (0e05915)
Bug Fixes
- ccr: accept 12-char SmartCrusher hashes in tool injection (#1095) (#1141) (9f7f3ad)
- ccr: return stored content when headroom_retrieve query matches nothing (#1213) (#1236) (08fb845)
- content-router: honor target_ratio in compression cache + add proxy --target-ratio flag (#1108) (8894ee0)
- dashboard: light-mode backgrounds + aligned savings tables (#1064) (5eae32b)
- deps: make litellm optional on Python 3.14 (#956) (#993) (b2f04e4)
- e2e: align Codex wrap e2e with global-only RTK guidance (#1240) (#1254) (bc12ace)
- init: set ENABLE_TOOL_SEARCH=true so Claude Code keeps deferring tools (#746) (#995) (500ec2b)
- kompress: never block the request path on the cold-cache model download (#1161) (3fc2a78)
- memory: use ONNX embedder for
wrap --memorysync (#1092) (#1262) (4f9feda) - openclaw: wrap plugin export as {register} object for OpenClaw 2026.x compatibility (#1218) (2e6c442)
- providers: update DeepSeek V3 context limit from 128K to 1M (#1038) (#1137) (bcabc5c)
- proxy: allow disabling periodic TOIN stats logging (#1265) (b5f63d8)
- proxy: honor HEADROOM_EXCLUDE_TOOLS for Codex /v1/responses tool outputs (#940) (#1053) (f03e77b)
- proxy: preserve byte-faithful Anthropic tool forwarding (#1222) (1f18d59)
- proxy: route Codex OAuth image requests (#1215) (381d771)
- proxy: scope CORS to loopback + gate operator/content endpoints (#1226) (bd55a42)
- proxy: stamp X-Client: codex on Responses endpoint for unidentified callers (#1036) (b0cd032)
- proxy: treat NODE_EXTRA_CA_CERTS as additive, not replacement (#998) (#1031) (c987283)
- telemetry: switch anonymous telemetry to opt-in (off by default) (#1223) (b998697)
- tokenizers: bound tiktoken vocab load so a stalled download cannot hang requests (#956) (#994) (7e86baf)
- unwrap: remove ANTHROPIC_BASE_URL + ENABLE_TOOL_SEARCH and init hooks on unwrap (#992) (5b84691)
- wrap: keep Codex RTK guidance global (#1240) (7c26a54)
- wrap: percent-encode non-ASCII cwd names in X-Headroom-Project header (#1071) (9f712cc)
- wrap: write env.ANTHROPIC_BASE_URL to settings.json so daemon-spawned conversations inherit proxy (#951) (#1078) (a554c3a)
Release v0.26.0
0.26.0 (2026-06-16)
Features
- add Copilot BYOK provider wrapper utilities and CLI support (#1041) (e67ee2a)
- add dashboard agent usage stats (#814) (6d3f39f)
- Add support for Mistral Vibe CLI (#935) (0932b8b)
- attribute reread waste to over-compression via marker check (#901) (f928576)
- bedrock: cross-region + Converse compression; bundle proxy binary in images (#999) (0dc2e1c)
- dashboard: surface compression-vs-cache net impact in Prefix Cache panel (#913) (2a4d300)
- evals: adversarial-input robustness grid for compressors (#918) (5939004)
- parser: detect re-issued identical tool calls as reread waste (#909) (7d4ae86)
- policy: batch deep edits through one cache-bust (#856 P3a) (#1015) (c2e52fe)
- policy: consume net-cost mutation gate in ContentRouter (#856 P2) (#905) (553ade4)
- proxy: compress AWS Bedrock InvokeModel requests via configurable upstream (#720) (7edb27a)
Bug Fixes
- anthropic: strip styled Claude model ids (#651) (0c5c89d)
- anyllm: forward openai api_base/api_key to the any-llm backend (#942) (#954) (a7ee8a6)
- cache: guard None exemplar embeddings in dynamic detector (#950) (1ec9320)
- cache: name the missing piece in semantic detector guard (#1018) (3b0bcee)
- ci: check out repo in PR Governance label job (#1021) (4558bc2)
- ci: make PR governance advisory (#1047) (74dff94)
- codex: compute waste signals on the OpenAI Responses path (#898) (b9e2761)
- codex: poll /wham/usage for subscription limits (handshake no longer sends x-codex-* headers) (#924) (8c00f71)
- codex: PR health label check state (#986) (99c874d)
- codex: retag thread providers so history menu stays whole across the proxy boundary (#1034) (74ae781)
- codex: write canonical hooks feature flag and migrate deprecated codex_hooks (#743) (dff6a19)
- compression: convert tree-sitter byte offsets to char offsets (#892) (b1f700f)
- compression: correct JSON array item counting and entropy gate (#887) (d6f0f0f)
- compression: keep container bodies compressible in code handler (#890) (16ed73b)
- compression: measure short-value threshold on payload, not token (#889) (65b0e8c)
- compression: use thread-local tree-sitter parsers in code handler (#893) (6cdb846)
- gemini: surface functionResponse payloads to waste-signal detection (#897) (9b0c840)
- learn: decode directory names with spaces in Windows project paths (#997) (#1027) (2d3701b)
- learn: scan subagent and workflow transcripts (#1045) (0ddd4ed)
- openclaw: declare headroom_retrieve tool contract (#947) (7c8c909)
- policy: correct warm-cache penalty in net_mutation_gain to (S + dT) (#903) (0632eba)
- proxy: add native Bedrock converse-stream route (#917) (b08ec15)
- proxy: keep codex image-generation WS turns alive through the relay (#1000) (7dbbb40)
- proxy: make budget enforcement actually work (#885) (a14ab45)
- proxy: read RTK gain stats globally by default (#957) (b70fccb)
- route v1internal code assist requests to cloudcode-pa.googleapis… (#821) (e20f16b)
- serena: stop the Serena dashboard popup and make --no-serena actually disable Serena (#1003) (919379a)
- support Copilot Business subscription auth (#641) (0b4a4bd)
- wire HEADROOM_EXCLUDE_TOOLS / HEADROOM_TOOL_PROFILES into Click proxy entrypoint (#943) (9b7b436)
- wrap: avoid duplicate top-level keys when injecting codex provider (#884) (dd22cfd)
Code Refactoring
Release v0.25.0
0.25.0 (2026-06-12)
Features
- add differential network capture harness (#761) (11ab5f8)
- add light mode for dashboard (#834) (c425893)
- add OAuth2 client-credentials upstream-auth proxy extension (#778) (#784) (eb2e50f)
- add Vertex AI proxy routing (#793) (3c77e52)
- cli: comprehensive help text, validation, and exception handling improvements (#640) (028efab)
- compression safety rails — error-output protection, pipeline circuit breaker, library inflation guard (#851) (c0cadcc)
- dashboard: per-model savings breakdown and expected-vs-actual cost on historical charts (#807) (34dafe6)
- detect re-served tool results as over-compression waste signal (#854) (5f1d88a)
- evals: add zero-cost tool schema compaction integrity eval (#817) (53a08c6)
- gated Markdown-KV compaction formatter (serialization-aware output) (#859) (06b2625)
- kompress: warn on unrecognized HEADROOM_KOMPRESS_BACKEND + document backend selection (#204) (6367d0b)
- memory: add opt-in Apple-GPU (MPS) embedding runtime (#766) (c71592d)
- net-cost cache mutation formula on CompressionPolicy (#856 P1) (#857) (d5f5802)
- plugins: Hermes agent headroom_retrieve plugin (#824) (058bced)
- probe-based retention scoring of recorded compression events (#862) (c2106cb)
- proxy: add CLI opt-outs for CCR injection (compression-only mode) (#823) (693d9d2)
- proxy: attribute savings history rollups per provider (#791) (0b8b8d9)
- proxy: log compressed messages alongside original request (#261) (2269e40)
- proxy: per-project savings breakdown on the dashboard (claude, codex, aider, copilot, cursor) (#803) (914a60a)
- support Python 3.14+ via pyo3 abi3 stable ABI (#516) (19eac8e)
- switch Kompress default to kompress-v2-base with weight-only int8 ONNX (#799) (74392b2)
- transforms: attribute read_lifecycle + smart_crush tags (#249) (8f37426)
Bug Fixes
- anthropic: CCR exception must re-raise, not silently swallow (#838) (8db5efc)
- ccr: key Rust search/diff/log markers with explicit_hash (#852) (bfcb07d)
- ccr: make retrieval TTL configurable (#715) (2533f77)
- ccr: skip CCR when model calls headroom_retrieve alongside user tools (#839) (30078f8)
- ccr: use shared compression store (#875) (249af6c)
- ci: correct comments, timeouts, and pip reliability in native e2e workflows (#878) (b716c8c)
- ci: pin cosign-installer to v3 (v4 does not exist) (#774) (199d693)
- codex: respect CODEX_HOME for wrap config (#731) (96abf38)
- content_router: guard against empty compression output causing Anthropic 400 (#771) (2f9ff07)
- copilot: use responses API for subscription reasoning models (#647) (84ac332)
- correct preserved-entry index mapping in Gemini content round-trip (#836) (0ffe2b6)
- dashboard: stable 'Proxy $ Saved' hero tile under --workers > 1 (#481) (fd73b88)
- don't inject empty tools:[] when client omitted the tools field (#772) (574bbae)
- harden Copilot API auth token handling (#557) (6b0c09f)
- health: readyz verifies upstream connectivity, not just process liveness (#744) (5dfb446)
- init: guard persistent task startup (#616) (9252d85)
- init: normalize Windows hook paths to forward slashes (#788) (6ea6e31)
- init: suppress hook recovery output (#760) (b439599)
- learn: claude-cli streams output with idle timeout (#373) (9bff575)
- make headroom wrap readiness probe timeout configurable for slow ML imports (#581) (163677b)
- parser: detect waste signals in Anthropic tool_result content blocks (#815) (929698a)
- proxy: F4 — trust ...
Release v0.24.0
0.24.0 (2026-06-08)
Features
- perf: add --format {text,json,csv} to
headroom perf(#648) (9fe4886) - proxy: show resolved upstream API targets in startup banner (#586) (8dbe7ad), closes #583
- relevance: weight BM25 score_batch by corpus IDF (#646) (88177bd)
- support CLAUDE_CODE_USE_FOUNDRY and custom upstream gateways (#726) (d90cdce)
Bug Fixes
- ci: restore green lint gate on main (fe50f9d)
- codex: auto-enable fail-open on compression timeout in headroom wrap codex (#531) (5f5f261)
- copilot: restore generic endpoint for non-subscription OAuth (#610) (#612) (18925b8)
- deps: move gunicorn to [proxy-prod] extra, add Windows guard (#537) (fa558c5)
- proxy: fail-open on corrupt golden bytes instead of RuntimeError (#603) (2170a1b)
- proxy: route Claude Code model metadata to Anthropic (#627) (30c1ac8)
- security: patch loopback guard, retry None raise, async subprocess, and cache race (06d7cb9)
- security: patch loopback guard, retry None raise, blocking subprocess, and cache stats race (78f3a4d)
- startup: move HF/httpx log suppression before sentence_transformers init (#622) (176d4c7)
- startup: suppress proxy startup log noise (#619) (4555901)
- wrap: report unbindable proxy ports (#602) (6dfcaa8)
[Unreleased]
Bug Fixes
- deps: move
gunicornto[proxy-prod]extra withsys_platform != 'win32'guard; removed from[proxy]to avoid forcing a Unix-only package on dev, CI, and Windows users (#537) - startup: suppress proxy startup log noise — litellm banner, trafilatura parse errors, HuggingFace Hub unauthenticated warnings, tiktoken fallback warning, and httpx INFO lines from sentence_transformers HEAD checks. Affected files:
headroom/providers/litellm.py,headroom/transforms/html_extractor.py,headroom/memory/adapters/embedders.py,headroom/providers/anthropic.py,headroom/providers/registry.py,headroom/image/onnx_router.py,headroom/transforms/kompress_compressor.py.
Release v0.23.0
0.23.0 (2026-06-04)
Features
- copilot: GitHub Copilot subscription mode through Headroom (f4dff9b)
Bug Fixes
- ccr: scope proactive expansion by workspace (cross-project leak) (197601b)
- ccr: scope proactive expansion by workspace (cross-project leak) (1bc163f)
- codex: keep init model_provider at config root (#260) (304dcc7)
- codex: keep init model_provider at config root (#260) (849b46d)
- copilot: deterministic subscription token handoff to the proxy (72da461)
- copilot: support subscription auth through Headroom (ff4a0c6)
- correct tiktoken encoding for unknown gpt-4 model snapshots (#552) (0e551de)
- decode/encode owned config, state and template assets as UTF-8 (2f1538a)
- decode/encode owned config, state and template assets as UTF-8 (fixes #533) (92075b9)
- docker: upgrade base images to Python 3.13 / debian13 (e6bf7a0)
- docker: upgrade base images to Python 3.13 / debian13, drop digest pinning (08a2197)
- docs: bump next.js to 16.2.6 for GHSA-h64f-5h5j-jqjh (CVE-2026-44577) (a6a09e6)
- docs: mkdocs configuration to build with correct folder (#543) (5557944)
- docs: update brace-expansion to 5.0.6 to remediate GHSA-jxxr-4gwj-5jf2 (CVE-2026-45149) (6eb6fb5)
- docs: update bun.lock to next 16.2.6 for GHSA-h64f-5h5j-jqjh (CVE-2026-44577) (91e0937)
- ignore brackets inside JSON strings when splitting mixed content (#553) (bdcfc32)
- learn: decode Unix home dirs whose username contains '.', '-' or '_' (211daae)
- learn: decode Unix home dirs whose username contains '.', '-' or '_' (491a8b3)
- learn: finish gemini-flash-latest default model sweep (982d01b)
- learn: finish gemini-flash-latest default model sweep (#532) (d797366)
- memory: READ-ONLY framing + fail-closed unresolved-project fallback (a178249)
- memory: READ-ONLY framing + fail-closed unresolved-project fallback (482f80e)
- update dashboard doc link (#544) (378d77e)
- Update Next.js to 16.2.4 in docs/bun.lock to address GHSA-gx5p-jg67-6x7h (CVE-2026-44580) (0b9f11a)
- Update Next.js to 16.2.6 in docs/package.json and package-lock.json to address GHSA-h64f-5h5j-jqjh (CVE-2026-44577) (db5d15f)
- Upgrade litellm to 1.86.2 to remediate CVE-2026-42271 (07581b9)
Code Refactoring
Release v0.22.4
What's Changed
- fix(cli): wrap CLI breadth — cline, continue, goose, openhands by @chopratejas in #492
- fix(subscription): wire tokens_saved_rtk data plane by @chopratejas in #493
- fix(observability): RTK metrics + Rust observability (Phase H blocker) by @chopratejas in #494
- ci(release): adopt release-please for gated publishes by @chopratejas in #495
- fix(release): tag format vX.Y.Z (drop release-please component prefix) by @chopratejas in #497
Full Changelog: v0.22.3...v0.22.4
Release v0.22.2
[0.22.2] - 2026-05-20
Bug Fixes
- memory: expose memory IDs in auto-tail + memory_list tool + ID-usage guidance (f844f64)
Release v0.22.1
[0.22.1] - 2026-05-20
Refactors
- proxy: MemoryRanker + ImageCompressionDecision + branch-aware version-sync (c79aa22)
Release v0.22.0
[0.22.0] - 2026-05-19
Features
- proxy: add --exclude-tools flag + HEADROOM_EXCLUDE_TOOLS env var (1058043)
Bug Fixes
- proxy: MemoryDecision contract + 3 bypass bugs + drop 500-char query cap (4e1b218)
- proxy: thread tags into 13 outcome sites; synthesize /v1/models for ChatGPT auth (6d62985)
- proxy: surface CompressionDecision.passthrough_reason in tags (88983ad)
Refactors
Release v0.21.38
[0.21.38] - 2026-05-15
Refactors
- proxy: introduce RequestOutcome funnel; collapse 3 streaming finalizers (3762f63)