You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
llmtrim discover finds where compressible tokens still escape compression. A
read-only scan over the before/after capture corpus (written when LLMTRIM_CAPTURE_DIR
is set) that re-buckets each request's token surface by block kind (system, user,
assistant, tool_result, tool_call_args, document, tool_schema) and, with --by-tool, by
the tool behind each tool_result. Each row reports the residual still in the compressed
request, its share of the corpus-wide residual, and how much compression already removed
(before→after) — so the next compression target is chosen from real traffic instead of
guesswork. --json for the machine-readable report, --dir/--limit to scope the scan.
llmtrim wrap <agent> convenience launcher. A one-command way to run a coding agent
through the interceptor: llmtrim wrap claude, llmtrim wrap codex -- --model …, or any
binary on PATH. It is sugar over setup plus a subprocess launch (no per-agent config and
no base-URL rewriting) and it refuses to launch when HTTPS_PROXY isn't pointing at
llmtrim in the current shell, so a wrapped agent can't silently bypass compression. Starts
the daemon for you if the environment is wired but the interceptor is down.
Fixed
setup now sets NO_PROXY so localhost and LAN traffic bypasses the interceptor. The
managed shell-profile block (and HKCU\Environment on Windows) wired HTTPS_PROXY for the
whole user, which made every proxy-aware program — not just LLM tools — funnel its local
and LAN calls at 127.0.0.1:<port>, where they failed with "couldn't connect" whenever the
interceptor was down or on a stale port (e.g. Plex device discovery, localhost dev servers).
The block now also exports NO_PROXY/no_proxy (both casings — curl and Go only read the
lowercase form): localhost, 127.0.0.1, ::1, *.local, plus the private LAN CIDR ranges.
The literal hosts and *.local are honored by nearly every client; CIDR-based LAN bypass is
best-effort (curl and Node/undici match only exact host or domain suffix, not CIDR). Existing
installs self-heal: the daemon rewrites a pre-NO_PROXY block in place when it starts at
login (reusing the wired port), so no setup re-run is needed. Already-running apps still
need a one-time restart to pick up the new environment.
Token-F1 bench scorer now char-tokenizes CJK, fixing degenerate scores on
Chinese/Japanese/Korean. The scorer split on whitespace, which CJK text doesn't use, so a
whole CJK answer collapsed to one or two "tokens" and the reported F1 was meaningless. It now
character-tokenizes CJK runs, so quality benchmarks over CJK corpora report a real F1.
tool_trim stage now appears as "tool_trim" in capture stages lists (was "tools").
When description trimming is active (agent/aggressive presets), the ToolStage stage name
is now "tool_trim" instead of "tools". This lets QA auditors correctly identify
description-trimming runs as lossy and not flag them as category-4 bugs (lossless-only runs
that dropped content). Lossless-only uses (selection + schema minification without trimming)
continue to appear as "tools".