Release v0.0.2 · ryan-mt/tomte

0.0.2

Taught the agent to see a task through by default — a # Seeing a task through section in the system prompt every turn inherits: plan, write the failing test first (TDD), finish the job, then prove it with build/test/lint and loop until green; scaled so a one-line fix stays light.
Added a glass-box pre-flight — before a write or shell command runs, one calm line states what it changes and how far it reaches (plus a leash note for destructive ones); reads and searches stay cardless.
Surfaced a file's recorded decisions as house rules in the pre-flight — an edit to a file with recorded decisions lists them first, so the agent re-reads its own constraints before it could break one.
Added a cross-model decision trail — record_decision logs why a non-obvious change was made to an append-only decisions.jsonl, re-injected each session so a later session or a different model inherits the reasoning.
Added tomte why <loc> / tomte why --all / /why to read the decision trail back, and tomte blame <file> for the greppable, one-decision-per-line file view.
Added --json to tomte why and tomte blame (and tomte why --reconcile --json) — the decision trail and the drift report emit machine-readable JSON alongside the rendered text, so a script or CI drift-gate can read the trail (and assert no stale/ambiguous decisions) instead of parsing prose.
Added an end-of-turn receipt — a turn that changes something closes with one line: files touched, tests run (pass/fail), and the why it recorded.
Added an agent-writable memory tool — project-scoped notes that persist across sessions, with a MEMORY.md index re-injected each session.
Added an OS-level sandbox for run_shell — Landlock + seccomp (Linux) / sandbox-exec (macOS) confine writes to the workspace and block outbound network by default (read-only / workspace-write / danger-full-access).
Added per-run sandbox overrides — --sandbox <mode> / --sandbox-allow-net and TOMTE_SANDBOX_* env vars (never persisted); Linux adds conservative rlimits, Windows tears the tree down via a kill-on-close Job Object.
Added tomte doctor and /doctor — a read-only setup health check (auth, config, model routing, MCP, external tools) that runs headless and exits non-zero on failure.
Added tomte mcp — manage MCP servers from the CLI instead of hand-editing settings.json: tomte mcp add <name> -- <command> [args…] (with repeatable --env KEY=VALUE) writes the mcp_servers entry atomically while preserving every other key, and list / get / remove round it out (env values are never printed back).
Added a TOMTE_CONFIG_DIR override to relocate the whole config tree (config, auth, sessions, logs) on every platform — also the portable way to isolate tests.
Added composer prefixes — @<path> attaches a file via gitignore-aware typeahead, !<command> runs a shell command inline (!! past the guard), #<note> appends to CLAUDE.md.
Added automatic, provider-agnostic model failover — a rate-limited or overloaded model switches to the next in fallback_models; off by default, only before any answer has streamed.
Added project-scoped config — a .tomte/config.json overrides safe fields only (model, reasoning_effort, verbosity, auto_compact, fallback_models); security keys are ignored.
Added left-drag text selection in the TUI — drag to highlight and copy on release (no Shift), handling wide CJK/emoji; /help documents it plus history recall (↑/↓).
Added a live context gauge to the status line — N% ctx next to the model, colored calm → warning → danger toward the ~85% auto-compact threshold.
Added tomte-website/ — the static Next.js marketing & docs site, deployed at https://tomte-website.vercel.app.
Added runnable, std-only previews of the planned pillar concepts (hand-compiled, invisible to cargo/CI), including a cross-provider cost demo.
Rebuilt the welcome card into a full first-screen panel — pixel-pet, brand/version, live setup (model · effort · account), workspace, a /init-style house-rules check, and a shortcuts footer; spans the full terminal width.
Reworked the turn spinner — a flickering-hearthfire glyph (▁▂▄▆█▆▄▂) instead of braille and a ~245-word tomte-voiced pool that holds a word ~8s then drifts on a pure seed × elapsed schedule, so it never flickers; a running todo's active_form takes the line instead.
Made the spinner words configurable — spinner_verbs { verbs, exclude_default } in config.json appends to or replaces the built-in pool.
Gave a finished sub-agent in the fleet view a settled past-tense verb (e.g. Forged · 4 steps · 1m 12s) instead of a stale in-flight phrase.
Rewrote the edit_file / multi_edit diff into a real hunk — shared lines collapse into uncolored context, the -/+ counts reflect only real changes, and line numbers follow the unified-diff convention.
Unified todo glyphs across the inline todo_write checklist and the pinned panel (✓ / ▪ / □, in-progress now a filled ▪), and added the (Ctrl+O for more) hint to truncated diff/error bodies.
Rebuilt /context as a visual context-window breakdown — a proportional colored grid and per-category legend with token estimates; /context all expands the lists.
Made /cost accurate and per-model — spend tallied per model and billing class (input / output / cache read / cache write), and it survives /resume.
Gave the composer a cozy face — a ✿ prompt gutter and a what shall we build today? placeholder.
Made grep work with no external tools — a native, dependency-free fallback covers content / files_with_matches / count with context and path scoping when neither ripgrep nor grep can be spawned; the recursive walk is shared with glob.
Made tomte doctor warn (not hard-error) when neither ripgrep nor grep is installed, now that grep / glob have a native fallback.
Made read_file render a Jupyter .ipynb as cells (ids + text outputs; image/rich outputs omitted) instead of dumping raw JSON, pairing with notebook_edit; a sliced read (offset/limit) still returns the raw JSON.
Gave read_file vision — a whole-file read of an image (PNG/JPEG/GIF/WebP) or PDF now attaches the bytes as media so a vision model can SEE it (the Anthropic translator emits image/document blocks in the tool_result), instead of the old text-only "binary file" note. Tool results carry optional media end-to-end via a new execute_rich (the 26 text tools are untouched; only read_file overrides it); the OpenAI wire degrades to the text note since its function_call_output doesn't accept media.
Surfaced project-local skills and custom commands in the / slash menu — skills under .tomte/skills (and .claude/.codex) plus commands/*.md now appear as /<name> entries (tagged by scope, e.g. skill (.tomte)) so you can trigger them manually, and typing /<skill-name> loads that skill's instructions into the composer. Global skills stay out of the quick menu (the model still loads any of them on demand via the skill tool).
Made read_file and list_dir give a clear, self-correcting error when handed the wrong kind of path — read_file on a directory points to list_dir/glob, and list_dir on a file points to read_file, instead of surfacing a raw OS error.
Settled on the full-screen alternate-buffer renderer as the default; the inline viewport is now opt-in via TOMTE_INLINE=1, with a slimmer height and a bottom-anchored live tail.
Unified the TUI's ~70 scattered color literals into one calm palette — an achromatic base, a single muted sage-teal accent, and muted semantic colors.
Gave the harness prompt a voice with a spine — it pushes back on weak plans, states confidence, anchors claims to receipts, and drops sycophancy and emoji.
Made tool calls self-correcting — a failed-to-parse argument gets an expected-args summary, and an unknown tool name suggests the closest real one (Did you mean: read_file?).
Made headless chat / run read-only by default — side-effecting tools are denied so a prompt-injected model can't act unattended; pass --dangerously-skip-permissions to allow them.
Grounded the OpenAI model catalog against the current API — removed ids auto-migrate, the -pro family clamps to high effort, and a future Opus/Sonnet inherits the 1M window.
Improved retry/timeout behavior — Retry-After is honored in HTTP-date form too, backoff is jittered so sub-agents don't retry in lockstep, and non-streaming calls get a 300s cap.
Reorganized the codebase so every Rust source file is ≤500 lines — large modules split into focused submodules, later renamed to semantic names (canonical_args, todo_reminder, slash_ops / slash_meta, content-named *_tests); pure refactor, no behavior change.
Removed the unused keyring dependency for a smaller build and attack surface — credentials stay in auth.json with 0o600 perms.
Renamed the project from opencli to tomte — binary, crates, config dir (~/.config/tomte), TOMTE_* env vars, logo, and user-agent — breaking: the old ~/.config/opencli is no longer read, so re-run tomte login.
Hardened the decision-trail reconcile write — a failed atomic rewrite of decisions.jsonl is now logged instead of silently swallowed, and the staging temp uses a unique per-process name so two concurrent reconciles can't clobber each other's temp before the rename.
Fixed sign-in on Windows — auth.json now persists under %APPDATA%\tomte (owner-only via icacls), so an OAuth login can complete instead of looping the sign-in picker.
Made the Windows credential-file ACL tighten observable — a failed or skipped icacls owner-only grant (e.g. USERNAME unset) is now logged instead of silently leaving the file on its inherited %APPDATA% ACL.
Fixed MCP servers failing to spawn on Windows — a bare npx / node / pnpm command (a .cmd shim) now resolves against PATH×PATHEXT, since CreateProcessW only appends .exe.
Fixed edit_file / multi_edit failing on CRLF (Windows) files — line endings are reconciled so an \n-joined old_string matches the \r\n on disk, and CRLF is preserved.
Fixed glob on machines without ripgrep — replaced the Unix find fallback (absent on Windows) with a native recursive walk that skips .git and never loops on symlinks.
Fixed grep path normalization — a hyphenated segment (tomte-website\…) no longer leaves backslashes in the Windows result.
Fixed a Windows session-persistence test that wrote to the real %APPDATA% because dirs honors XDG_CONFIG_HOME only on Unix (now uses TOMTE_CONFIG_DIR).
Fixed the drag text-selection drifting off its content on mouse-wheel scroll — the highlighted rows now shift with the scroll.
Broadened the run_shell destructive-command guard — block-device writes (dd / shred / wipefs / tee / truncate / cp), rm -rf root-globs and $VAR / ~user targets, find -delete / -exec rm, and git's wider destructive surface (force-push, branch -D, reflog expire, gc --prune, stash drop).
Hardened the destructive-command guard further — interpreter pipes behind wrappers (curl … | sudo sh, xargs / env / nohup), shell grouping or exec (| { sh; }, | exec sh), raw block-device redirects (>|, &>), and single-quoted paths/devices (dd … of='/dev/sda') are all flagged, without false-positiving a benign grep sh.
Closed more destructive-command classifier bypasses — output piped into a shell from any source (cat x | sh, base64 -d | bash, not just curl/wget), command substitution in a delete/chmod target (rm -rf `…`, rm -rf $(…)), runtime-assembled commands (eval / PowerShell iex), wider git surface (reset --merge/--keep, rm -r/-f, push --prune, worktree remove --force), recursive chmod/chown on any system/home/root path (not just /), and Windows verbs (del/rd /s, format X:, Remove-Item -Recurse -Force).
Fixed a conscience self-check bypass in headless runs — a model-raised edit conflict no longer hard-approves the write; the conflict path falls back to the baseline approval gate (which denies side effects unattended), so the self-check can only add friction, never turn a headless edit the gate would deny into an executed one.
Closed a dozen run_shell permission-rule bypasses — rules now match the whole command through quotes, wrappers, command substitution, subshells, brace groups, redirections, and env prefixes; path globs are normalized, symlink-resolved, and case-folded, and a malformed *** glob can't O(n^k) DoS.
Closed shell-expansion bypasses in the destructive/deny scanners — escaped names (r\m), empty expansions (r${EMPTY:-}m), and $IFS splitting are normalized before matching, while '$HOME' stays non-expanding.
Tightened permission-rule trust — a project .tomte/permissions.json is honored for deny only, a classifier-flagged command always prompts, and dangerous_override is ignored non-interactively.
Fixed a permission-deny bypass — a dir/** rule now also blocks dir itself (e.g. list_dir(.git)), not just its children.
Extended the secret-env scrub before shell / MCP / hooks — now also strips SSH_AUTH_SOCK, KUBECONFIG, DOCKER_AUTH_CONFIG, NETRC, DATABASE_URL, and *PASSPHRASE* / *_KEY / *_PWD / *WEBHOOK* vars — and extended it to read-only helper subprocesses (grep / glob, git-root discovery, the @ picker, /diff).
Widened the secret-env scrub to secrets that don't follow the *_KEY / *_TOKEN / *_SECRET convention — JWT / BEARER / OAUTH / *_SID / *SIGNING* / *ENCRYPTION* / MNEMONIC and common vendor prefixes (Stripe / Twilio / SendGrid / Doppler), chosen to avoid colliding with benign vars like PATH.
Hardened SSRF defenses in web_fetch / web_search — redirects follow only non-blocked http(s), internal/metadata IPs and file:// / javascript: / data: are dropped, IPv6-embedded internal IPv4 is blocked, the whole 0.0.0.0/8 range is blocked, and the vetted DNS address is pinned for the connection.
Fixed the web_fetch SSRF guard for IPv6-literal URLs — a bracketed host like [::1] is parsed straight to a socket address and vetted, instead of failing with a misleading DNS error.
Hardened the untrusted-input posture — tool output is treated as data not instructions, project-local subagents stay read-only even under Auto mode, and resume no longer restores the read_files set.
Hardened inherited instruction loading against symlink exfiltration — AGENTS.md / CLAUDE.md are read through a non-symlink, size-capped regular-file loader, and a planted block marker can't truncate the inherited-memory block.
Hardened filesystem safety against symlink/TOCTOU tricks — write_file re-resolves its target after creating parent dirs, allow-rule writes use O_NOFOLLOW owner-only, and session / skill / agent files go through a shared size-capped, regular-file-only helper.
Fixed the read-before-overwrite guard — a failed or partial read_file (limit: 0, over-cap, large-file-without-limit, offset/limit) no longer marks a file as read, so write_file / edit_file can't clobber content the model never saw.
Hardened auth-token redaction — bare-JWT and unprefixed custom keys are redacted by exact value, the sk- / Bearer heuristic matches at word boundaries (so disk-usage survives), and provider error/SSE bodies are bounded and redacted.
Fixed OAuth sign-in robustness — redirect_uri=localhost for ChatGPT/Codex, state (CSRF) verification for manual Anthropic login, late/ mismatched-callback safety, and Esc cancels a pending login.
Fixed login-screen input — bracketed paste now lands the code/key in the active login field, and the API-key prompt ignores key-release/repeat (no doubled keys on Windows/kitty).
Hardened credential-file and state-dir permissions — auth.json self-heals to 0o600, the config / session / logs dirs are created 0o700 / 0o600, config_dir never falls back to cwd, and a custom base_url must be https.
Hardened multi-provider auth refresh — concurrent OpenAI/Anthropic refreshes reload-and-merge before saving so neither clobbers the other's fresh single-use token, and a glued secret (token_sk-…) is still redacted.
Fixed a token-refresh lockout — when persisting refreshed OAuth tokens fails, the in-flight turn proceeds on the new access token (with a warning) instead of dying on the already-consumed refresh token.
Made headless chat strip terminal control sequences from untrusted model/tool text — including the 8-bit C1 (CSI/OSC/DCS) introducers — so a payload can't rewrite the terminal, set the title, or inject clipboard data.
Surfaced content_filter and refusal stops as errors across OpenAI Chat / Responses and Anthropic, instead of finalizing a silent empty turn.
Hardened context-overflow recovery — an HTTP 413 triggers the same shed-stale-output-and-retry, recovery is gated on no committed output, auto-compaction re-arms after a failed summary, and extreme token counts are handled safely.
Made an OAuth-rejected OpenAI model fail fast naming the supported models, instead of a raw 400 mid-turn.
Fixed Anthropic streaming accounting — the final usage folds every message_delta field over the start snapshot, and redacted_thinking blocks survive the tool loop so a redacted-thinking-then-tool turn isn't rejected.
Fixed resume dropping Anthropic reasoning — signed / redacted_thinking blocks are now persisted, so a resumed turn replaying a tool_use no longer 400s.
Fixed the Anthropic refusal error losing its explanation when a trailing usage-only message_delta reset stop_details.
Fixed OpenAI tool-call argument streaming — a bare null / {} / [] mid-stream is kept verbatim, an id-less continuation routes to the in-flight call, and an out-of-range index no longer collides into a wrong slot.
Fixed /cost over-reporting on auto-recovered context overflow — a failed-then-retried turn no longer bills the rejected request's input tokens.
Fixed model- or injection-supplied values panicking the turn — memory view rejects a reversed range, grep / glob saturate a near-usize::MAX head_limit, and compaction thresholds handle extreme counts.
Fixed notebook_edit — it now requires the notebook read this session and unchanged on disk, and delete no longer treats a numeric cell_id as a position.
Fixed read_file on a large binary — a non-UTF-8 file over the 5 MB cap is described as binary (with its true size) instead of a raw decode error.
Bounded attacker-influenced buffers — tool-argument streams (16 MiB), web-search bodies (8 MiB), and MCP messages (16 MiB per JSON-RPC line).
Confined @-mentions to the workspace and the prompt only, bounded the picker at 5000 entries, and hardened !/# behavior (backgrounded with a 120s timeout, #-notes on their own line, mid-stream queued).
Fixed shell-lifecycle leaks and hangs — a foreground run_shell is bounded even when a backgrounded child holds the stdout pipe, background shells are killed at session end, and teardown won't SIGKILL a recycled same-uid pgid.
Fixed UI deadlocks and freezes — resume/undo and the session picker wait for an in-flight turn, a hook that prints before reading stdin no longer deadlocks, and a large paste inserts in one operation.
Made the Linux sandbox fail closed when Landlock is inactive or its confinement helper can't be built — commands are refused instead of running unconfined, and tomte doctor probes /sys/kernel/security/lsm and warns.
Tightened the Linux read-only sandbox — /dev/shm is no longer always writable, closing a tmpfs persistence escape while keeping /dev/null available.
Closed a ReDoS in the path-permission glob matcher — adjacent ** groups (**a**a…) no longer backtrack O(text^k); the matcher is memoized to O(pattern·text).
Bounded skill discovery against a symlink fan-out — a visited-set caps a self-referential .tomte/skills/ walk while still following legitimate symlinked skill dirs.
Closed a sub-agent confinement bypass — a project-local agent matched only by its frontmatter name is confined to read-only tools, instead of running mutating tools under Auto / --dangerously-skip-permissions.
Fixed assorted TUI/tool issues — /clear resets the model's conversation history (not just the transcript), a --- after a line containing | isn't misrendered as a table, a shell result can't spoof its rendered exit code, a failed resume reopens the picker, /buddy guards an empty rarity tier, off-screen sub-agent rows ignore clicks, and /export uses safe Markdown fences.
Locked the markdown table renderer's bounds-safety with regression tests — a malformed table from model output (ragged columns, missing cells, a header-only table with no body) is now covered, so a later refactor can't reintroduce an out-of-bounds panic on the tbl[2..] / cells[c] accesses; an audit confirmed those paths are already safe, so there is no behavior change.
Added tomte hooks — one-line list / enable <id> / disable <id> for built-in hook presets (rustfmt → cargo fmt, gofmt, prettier) that auto-run after tomte edits a matching file, so the agent self-triggers a tidy-up without you hand-editing settings.json. Writes are merged into settings.json so mcp_servers and any hand-added hooks are preserved; enable is idempotent and disable removes an emptied block. The preset commands are plain program + args invocations chosen to behave identically under sh -c and cmd /C.
Made lifecycle hooks cross-platform — the hook runner now falls back to cmd /C on a stock Windows box with no sh on PATH (Git Bash is still preferred when present), instead of silently never running on a machine without a POSIX shell; Linux/macOS are unchanged. The shell choice is a pure function, unit-tested for every OS branch. tomte doctor now counts hooks across every event, not just PreToolUse.
Added a Hooks section to tomte doctor — it lists every configured hook (across all events), warns when a hook command's program isn't on PATH so a typo'd or missing tool surfaces before it silently fails the first time the hook fires (softened, since it may be a shell builtin or alias), and names the shell that runs hooks on this OS (sh -c or cmd /C).
Added tomte hooks run <id> — runs a preset's real command once through the same OS-appropriate shell and reports its exit code and output, so you can confirm a hook actually works on your machine before relying on it.
Fixed more run_shell destructive-command bypasses where the action is opaque to the token scan — a non-shell interpreter running an inline program (python -c 'shutil.rmtree("/")', node -e …, perl/ruby/php, awk 'BEGIN{system(…)}', PowerShell -EncodedCommand), output piped into any interpreter (… | node/ruby/php/pwsh/deno, not just sh/bash/python), find … | xargs rm -rf, and a command word built by substitution (`echo rm` -rf /, sh <(echo rm -rf /)) now all clear the override prompt; they auto-ran under a run_shell(…:*) grant or bypass mode, and on Windows (no OS sandbox) ran unconfined.
Fixed gaps in the classifier's Windows and disk coverage — PowerShell Format-Volume / Clear-Disk, a drive-root del c:\*, and Remove-Item -Recurse -Force via abbreviated flags (ri -r -fo), plus low-level disk destroyers blkdiscard / sgdisk / parted / fdisk / mke2fs / hdparm --security-erase / tar→device; a routine Unix rm -r -f node_modules and a read-only fdisk -l stay unflagged.
Fixed a brief window where the Windows credential file was group-/world-readable — the config dir is now tightened with an inheriting owner-only ACL before the temp auth.json is written, so the file is owner-only from birth even on a profile whose %APPDATA% (or a custom TOMTE_CONFIG_DIR) isn't already owner-restricted.
Fixed a latent secret-in-logs risk — Credential / StoredTokens / AuthRecord / TokenSet now redact their token fields from Debug, so a future tracing::debug!(?cred) or {record:?} context can't dump a live token into the owner-readable log; non-secret fields (provider, account id, expiry) stay visible.
Fixed gaps in the child-process env scrub — secret-store / provider names whose auth material doesn't follow the *_KEY / *_TOKEN convention (VAULT, e.g. VAULT_ROLE_ID, plus GitLab / Cloudflare / Heroku / DigitalOcean / Slack / Discord) are now stripped, each collision-checked against benign vars (DEFAULT/FAULT carry no VAULT).
Fixed an indirect prompt-injection foothold in MCP — a server's tools/call text is now wrapped in a labeled <untrusted-mcp-output> block (framework markers neutralized, a forged closing tag broken, label values stripped of structural characters) and the searchable-tools manifest descriptions are defanged, so a malicious or compromised server's text reaches the model as data, not instructions.
Fixed injected skill content not being defanged — a project SKILL.md name/description (manifest) and body (the skill tool) now run through the same block-marker neutralizer as inherited memory, so a planted  can't make a later prompt-block stripper truncate unrelated content.
Fixed read_file hanging on a non-regular file — a FIFO / socket / device (e.g. a planted named pipe inside the workspace) is now refused with a clear error instead of blocking the tool forever waiting for a writer.
Fixed the host-side /undo to be atomic and permission-preserving — it now restores through the same temp-then-rename helper as the undo_last_edit tool, so a crash mid-restore can't leave a half-written file and the restored file keeps its original permissions instead of the umask default.
Fixed project custom-command files being read unbounded — they now go through the capped, symlink-rejecting loader the rest of the codebase uses, so a planted .tomte/commands/*.md symlink to a huge or non-regular file can't be slurped when the slash menu enumerates commands.
Fixed the pasted Anthropic authorization code showing in cleartext on the login screen — it is now masked like the API-key field, so the secret isn't exposed to a shoulder-surfer or a screen-share / recording.
Fixed a latent SSRF / credential-leak surface — removed an unused OpenAI raw_post helper that joined a caller-supplied path onto the API base with the live bearer attached.
Added built-in provider presets for well-known OpenAI-compatible endpoints — <id>/<model> now works out of the box for Groq, OpenRouter, DeepSeek, xAI, Together, Fireworks, Cerebras, Mistral, Ollama, and LM Studio, reading the key from the conventional <ID>_API_KEY env var (local servers need none) without hand-writing a providers entry; a user's own config.providers[<id>] still overrides the preset, and the routing and context-limit lookups share one fallback.
Made tomte why <file:line> drift-resilient — the CLI lookup now heals a decision whose anchored line has moved (in memory, without rewriting the trail) before matching, so it finds the decision even after the code shifted, matching the reconcile the injected trail already runs; a drifted line no longer reports "no decision recorded".
Rewrote internal source comments and one effort-picker label to describe behavior directly in tomte's own voice — comment/documentation cleanup only, no functional change.
Stopped the inline markdown renderer swallowing a line on an unmatched marker — a glob like *.rs, a 2 * 3, or an unterminated `code now renders literally instead of italic/bold/code-styling the rest of the line; emphasis follows CommonMark's flanking rule (a marker only opens before non-whitespace and closes after it), so a stray asterisk in normal prose stays a stray asterisk.
Taught the assistant-markdown renderer to render headings, lists, and blockquotes instead of leaking raw markers — ## H shows without its #s, - x / 1. x get a •/number with a hanging indent so a wrapped item aligns under its text (not back at the marker), and > q gets a quiet │ rail.
Fixed diff rows truncating by character count rather than display width — a removed/added line containing CJK or emoji no longer overflows the gutter and the right border; the cut is now wide-char aware.
Unified the context-window readout — the turn spinner now leads with the same N% ctx the status line shows (the raw token count moved behind a used label), so the one "context fullness" fact reads identically in both places instead of as two differently-shaped "ctx" figures.
Softened the success-with-stderr hint from "N stderr lines suppressed" to "N lines on stderr" (it no longer reads like an error was hidden on a command that succeeded), and the still-running tool body now shows working… instead of a bare … that collided with the truncation marker in the same column.
Sanitized two raw model/tool strings that reached the terminal unfiltered — the headless chat error message and the run_shell one-line Bash header now strip control sequences (and the header flattens newlines), so neither a provider error nor a heredoc command can rewrite the screen or desync the row.
Fixed the conscience-conflict gate recording an override before the edit was allowed to run — a Supersede that a PreToolUse hook (or the headless baseline gate) then blocks no longer appends a supersede to the decision trail or shows a "decision overturned" card for a write that never happened; the trail append and the event are deferred until the edit clears the gate.
Stopped a response.failed event from emitting a live Usage / context-warning for tokens it deliberately does not bill — a failed-then-retried turn no longer double-counts its input in the live readout, since the occupancy refresh now goes through a pure parse that records and emits nothing.
Fixed background-shell output that overflows the 4 MiB cap trimming through the middle of a multi-byte UTF-8 character — the front trim now lands on a char boundary, so the next bash_output read no longer surfaces a spurious U+FFFD where a valid character was split.
Named the offline auth state in the status line — the "not authenticated" case now reads ● offline instead of a bare red dot, since that one state (the one you must not miss) was conveyed by hue alone, ambiguous against the signed-in dots and invisible to a colour-blind reader; the signed-in states stay dot-only to keep the line calm.
Fixed tomte doctor reporting a false routing error for a built-in provider preset — a <id>/<model> like groq/llama-3.3-70b (any of the ten presets, plus the keyless local ollama/lmstudio) is now checked against the preset's own credential like the real run path (LlmClient::for_config), instead of being misrouted to the OpenAI/Anthropic credential check and flagged as a hard error when no OpenAI/Anthropic credential is configured — exactly the setup doctor exists to confirm.
Brought the TUI's terminal-control sanitizer to parity with the headless one — sanitize_display now also drops the 8-bit C1 controls (U+0080..=U+009F) that some terminals treat as CSI/OSC/DCS introducers, so model/tool text rendered in the TUI (tool output, code blocks, diffs) can't smuggle a pure-C1 escape past the filter that the headless path already blocked.
Corrected the docs against the shipped code — the README tool belt now lists undo_last_edit (and notes tool_search is MCP-conditional), the sample config shows the real default model gpt-5.5, tomte login is described as the interactive picker it is, the reasoning-effort list is complete, and CONTRIBUTING.md no longer claims run_shell runs "with no sandbox yet" (it has shipped the OS sandbox since this release).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v0.0.2

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

0.0.2

Uh oh!