v0.2.0 — same answers, measured
The credibility release: knitbrain now proves the answers survive, not just that tokens shrink.
Headline numbers (real transcripts, reproduce with one command)
- 48.7% of ALL tool-result tokens saved (3.33M tokens, 63 real sessions; 55.4% on sizable blocks ≥400 chars); reproduce with `npx knitbrain profile`
- Answer-preservation gates PASS on 4,500+ real blocks; reproduce with `npx knitbrain evals`: error-fidelity 100%, summary-fidelity 100%, identifier-fidelity 99.4%, round-trip lossless 100%, never-expand 100%
- Holding the fidelity line cost ~1pp of savings vs 0.1.x: paid deliberately, both numbers published
New
- `knitbrain evals`: deterministic answer-preservation suite (no LLM judge), exit 1 on any gate failure
- Search / log / diff handlers: grep dumps collapse per-file, test logs keep every error + summary, diffs keep headers with ±counts; error and result lines are never elided by any handler
- Real CacheAligner: volatile lines moved out of the system prefix; Anthropic `cache_control` breakpoints inserted only when the client has none; stacks with compression because compression is deterministic
- 11 AST grammars: Go, Rust, Java, C++, C#, Ruby, PHP, Bash join TS/TSX/Python
- `knitbrain learn`: offline failure mining with success correlation; writes precision-gated, secret-redacted corrections into CLAUDE.md
- Puppeteer agents: complex tasks materialize pre-briefed, scope-guarded `.claude/agents/*.md` files (same persistence model as skills); creation events are team-board and hub visible
- Plan-mode adherence: complex verdicts carry an explicit ENTER-PLAN-MODE imperative; the full closed-loop protocol rides the MCP handshake (zero per-project setup, any MCP client)
- Library API: `import { createOptimizer } from "knitbrain"`
- Platforms: Claude Code, Cursor, VS Code + Copilot agent mode, Windsurf written natively by `knitbrain setup`; Codex / Copilot CLI / Zed snippets; `knitbrain prompt` for everything else
- Real-shape CI bench: fixture mix mirrors the profiled burn distribution with per-shape regression floors; best-case fixtures clearly labeled, never quoted as real-world
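
The CacheAligner rule in the list above ("breakpoints inserted only when the client has none") can be pictured with a small sketch. This is an illustration, not knitbrain's implementation: the type names and the `addBreakpointIfAbsent` helper are hypothetical, the request shape is a reduced subset of Anthropic's Messages API, and placing the breakpoint on the last system block is an assumption.

```typescript
// Reduced, illustrative subset of an Anthropic Messages request.
type CacheControl = { type: "ephemeral" };
type TextBlock = { type: "text"; text: string; cache_control?: CacheControl };
type ChatMessage = { role: "user" | "assistant"; content: string | TextBlock[] };
type MessagesRequest = { system?: string | TextBlock[]; messages: ChatMessage[] };

// True if the client already placed a cache_control breakpoint anywhere.
function hasClientBreakpoint(req: MessagesRequest): boolean {
  const blocks: TextBlock[] = [];
  if (Array.isArray(req.system)) blocks.push(...req.system);
  for (const m of req.messages) {
    if (Array.isArray(m.content)) blocks.push(...m.content);
  }
  return blocks.some((b) => b.cache_control !== undefined);
}

// Hypothetical helper: mark the end of the system prefix as cacheable,
// but only when the client has not set any breakpoint itself.
function addBreakpointIfAbsent(req: MessagesRequest): MessagesRequest {
  if (!req.system || hasClientBreakpoint(req)) return req;
  // Normalize a string system prompt into block form; copy existing blocks.
  const system: TextBlock[] =
    typeof req.system === "string"
      ? [{ type: "text", text: req.system }]
      : req.system.map((b) => ({ ...b }));
  if (system.length === 0) return req;
  const last = system[system.length - 1];
  system[system.length - 1] = { ...last, cache_control: { type: "ephemeral" } };
  return { ...req, system };
}
```

Returning the request untouched when the client has its own breakpoints keeps the client's caching layout authoritative; the sketch only fills the gap when none exists.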
Verified
196 tests · per-tool live E2E (25/25 tools over stdio) · production audit 50/50 (cold clone → packed install → tools + proxy + hook + dashboard + hub) · cold-registry check: npx knitbrain@0.2.0 profile → 48.6%, evals → PASS