Releases: ryan-mt/tomte

v0.0.4

10 Jun 17:22

0.0.4

Added

  • Added a think tool — a no-op scratchpad the model calls to reason in the open mid-task (plan the next move after a batch of tool results, weigh a trade-off and state the rejected option, check a plan against a record_decision before editing). It writes nothing to disk, fetches nothing, and returns nothing new — the thought just lands in the conversation so the reasoning is in the record and survives into later turns. Read-only (so it never needs approval) and kept for sub-agents (it survives the wildcard filter and is whitelistable by name); the system prompt's tool-discipline list teaches the timing — use it to think, not to narrate, and skip it when the next step is already known. Aliased (scratchpad) and tolerant of the field spellings models reach for (thought/text/note/reasoning).

  • Added MCP resources — list_mcp_resources and read_mcp_resource: when a connected MCP server advertises the resources capability at handshake, tomte registers one shared pair of tools so the model can pull a server's files/docs/configs into context by URI (parity with Claude Code's ListMcpResources/ReadMcpResource). list_mcp_resources with no server aggregates every resource-capable server under a header; with a server it lists just that one. read_mcp_resource reads one URI, disambiguating with server only when several servers expose resources (a sole server is used automatically; an unknown name errors with the known servers listed). Both are read-only (auto-approvable), and the server output is wrapped in the same <untrusted-mcp-output> provenance fence as tool results — a compromised server can't forge the close tag or smuggle a framework marker. The tools are never deferred behind tool_search (they aren't mcp__-namespaced), and they only appear when a server can actually serve them, so they don't clutter the tool list otherwise.

  • Added TOMTE_MAX_CONTEXT_TOKENS — an env override for the resolved context-window size, mirroring Claude Code's CLAUDE_CODE_MAX_CONTEXT_TOKENS. Pin the window for a gateway/proxy whose real limit tomte can't infer from the model name, or shrink it to force earlier compaction. Accepts a bare integer or a k/m suffix (200000, 200k, 1m; case-insensitive, underscores tolerated); a valid value is clamped to [8k, 20m] so a typo can't make every turn read as ≥100% full and thrash the compaction path, and an unset or unparseable value leaves the catalog/provider value untouched. It flows through the single effective_context_limit choke point, so the status-bar gauge, /context, the warn/auto-compact thresholds, and microcompaction all honor it together.

  • Made the collapsed "Thought for Xs" line click-to-expand. Once the model's live reasoning collapses into the compact Thought for Xs line, you can now left-click it to re-show the full thought (and click again to hide it) — the same click-target affordance as the "Jump to bottom" bar and the fleet rows. The reasoning text is retained instead of discarded on collapse, the line carries a dim (click to show) / (click to hide) hint, and expanding works even with live thinking display turned off (it's an explicit request). Implemented without disturbing the render cache's fast path: each visible thought line's screen rect is mapped per frame (append-only marks tracked alongside the cached lines), and toggling re-wraps only the affected block. Alt-screen renderer; the opt-in inline viewport keeps the collapsed line without the click target.

  • Made the busy spinner narrate what tomte is actually doing. It already borrowed an in-progress todo's active_form as the live word (Claude-parity), but most short turns carry no todo, so the line fell straight to a whimsical pool word (Pottering…). Now, when there's no in-progress todo, the spinner shows the running tool's plain action verb — Reading…, Running…, Searching…, Editing…, Delegating… — derived from the most recent tool call whose result hasn't arrived yet. The precedence is todo active_form → running-tool verb → the drifting pool word (kept for the gaps between tool calls, so tomte's voice survives). Meta tools (todo_write/goal_update/wait), MCP tools, and unknown names fall through to the pool rather than reading something unhelpful.

  • Added a per-read output ceiling to read_file, matching how Claude Code keeps a single file read token-bounded. The 2000-line and 2000-char/line caps bounded line count and width but not total size, so a file of long-but-under-2000-char lines could dump hundreds of KB — hundreds of thousands of tokens — in one read, burning context and the user's limit. A read now stops once its rendered output passes ~25k tokens (Claude Code's CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS default; tomte estimates ≈4 bytes/token) — even under 2000 lines — and emits the same continue with offset=N notice as the line cap, so nothing is lost, just paged. A TOMTE_READ_MAX_TOKENS env override mirrors Claude Code's knob (clamped to a sane band). Normal small/medium reads are unchanged; the description now nudges toward reading only the slice you need (offset/limit, or grep first), which is cheaper than a capped full read. Applies to both the in-memory and large-file (>5 MB streamed) read paths.

  • Added tomte receipt — the work receipt: one Markdown / HTML / JSON artifact that proves a stretch of work instead of transcribing it, ready to attach to a PR. It bundles a fresh Proof Capsule (the files git reports changed plus the real exit codes of the project's own test/typecheck/lint/build, run by the CLI right now), whether HEAD carries a verified Commit Seal (checked with the same binding rules as tomte seal verify), what the session actually did — the shell commands run and files edited, read from the persisted session log (the CLI's own record of executed tool calls, never a model's recollection) with the per-model token/cost receipt — and the newest recorded decisions with the drift-watch counts. Sections degrade gracefully (outside a repo, with no saved sessions, with an empty trail, the receipt says so), lists are capped with "and N more" pointers at the full stores, and the HTML page is standalone with all interpolated text escaped. --session <id> picks an older session (default: the project's newest), --json/--html pick the format, --out writes a file. It always renders, red or green — the gates remain tomte prove and tomte seal verify.

  • Added the official GitHub Action — "Done means verified" as a PR gate. uses: ryan-mt/tomte@v0.0.4 installs the released binary (checksum-verified against the published .sha256), runs tomte prove (the project's own test/typecheck/lint/build, real exit codes) and tomte rounds (drift watch, risk risers, hot-and-untested files; --no-proof automatically when prove already paid for the check suite), optionally requires a bound Commit Seal on HEAD (seal-verify: "true"), and fails the job when any selected gate is red. The full evidence lands in the job's step summary (the PR check page) with long outputs truncated to a stated tail — never silently — and comment: "true" posts it as one self-updating PR comment (needs pull-requests: write). Inputs select the gates, the release version, and the working-directory; the action exposes a verified output for downstream steps. It deliberately brings no toolchain — the project's own setup steps run first, and tomte measures that project's checks.

  • Added monorepo coverage to the Proof Capsule (tomte prove, and everything built on it — /prove, seal, rounds, receipt, the race judge, the GitHub Action). The capsule used to verify only the primary ecosystem at the root (a Cargo.toml beat everything), so a repo carrying a second stack — a Next.js site beside a Rust workspace, a src-tauri/ beside a Node app — could ship that stack completely unverified while the card read green. Now an immediate sub-directory holding a different ecosystem's manifest gets its own checks too, named <dir>:<check> (e.g. tomte-website:lint), run inside that directory, and folded into the same verdict — a red website lint fails tomte prove exactly like a red cargo test. Same-kind sub-projects are deliberately left to the root toolchain (a cargo workspace or npm workspaces already run their members), so nothing runs or bills twice; hidden directories and dependency/build dirs (node_modules, target, dist, …) are never treated as project roots, sub-project discovery is sorted so equal trees plan equal capsules, and the reproduce line wraps each sub-check as (cd <dir> && …) so the pasted command still runs everything exactly where the capsule did.

  • CI now covers the website: a website job runs npm ci, npm run lint, and npm run build for tomte-website/ on every push and PR, so a broken site can no longer ride in under green Rust checks.

  • Added tomte why diff [base] (and /why diff [base] in a session) — review the reasoning, not just the code. A PR review reads the diff; this reads the decision trail against the same range (the merge-base with base, defaulting to the first of origin/main/main/origin/master/master that resolves) and answers what the diff can't: which decisions are new in this range (each flagged when it points outside the changed files), which earlier decisions were superseded here — promises deliberately broken, shown as was → now — and which changed files carry no recorded why at all (the reviewer's gap list; committed, uncommitted, and untracked files all count). Everything is computed from real state — git for the range, the project's own trail for the decisions — the analysis core is pure and fully unit-tested, and --json emits the report for scripting. This is the view a clone can't fake quickly: it needs the store, the anchors, and the supersede links the trail has been accumulating all along.

  • Added tomte models (and ...
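
The TOMTE_MAX_CONTEXT_TOKENS parsing rule in the list above (bare integer or k/m suffix, case-insensitive, underscores tolerated, clamped, unparseable means "leave the catalog value alone") can be sketched roughly as follows. This is an illustrative Python sketch, not tomte's code: the function name parse_context_override is hypothetical, and it assumes k means 1,000 and m means 1,000,000, so that the stated [8k, 20m] band becomes [8,000, 20,000,000].

```python
import re

# Assumed decimal reading of the "[8k, 20m]" clamp band from the release note.
MIN_TOKENS = 8_000
MAX_TOKENS = 20_000_000
_SUFFIX = {"": 1, "k": 1_000, "m": 1_000_000}
_PATTERN = re.compile(r"([0-9]+)([km]?)")


def parse_context_override(raw):
    """Return the clamped token count, or None to keep the catalog value.

    Hypothetical helper: mirrors the described behavior, not tomte's source.
    """
    if raw is None:
        return None
    # Case-insensitive, underscores tolerated ("200_000", "200K").
    text = raw.strip().lower().replace("_", "")
    match = _PATTERN.fullmatch(text)
    if match is None:
        return None  # unparseable: leave the catalog/provider value untouched
    digits, suffix = match.groups()
    value = int(digits) * _SUFFIX[suffix]
    # Clamp so a typo cannot make every turn look at or above 100% full.
    return max(MIN_TOKENS, min(MAX_TOKENS, value))
```

Under these assumptions, parse_context_override("200k") and parse_context_override("200_000") both give 200000, a too-small value such as "5" is raised to the lower bound, and "abc" yields None so the resolved window is left as it was.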

v0.0.3

10 Jun 01:29

0.0.3

Added

  • Added a first-run setup card — on startup tomte auto-detects your OS and package manager and, if an important external tool is missing (currently git), shows the exact install command for your platform (winget/scoop/choco, brew, apt/dnf/pacman/…, or a download link). It only shows the command — it never runs an installer — and no card appears when the environment is ready.
  • Added an OS terminal window/tab title — tomte names the window tomte on launch and tomte — <task> after the first prompt, resetting to tomte on /clear so the next prompt re-titles it, so several tomte sessions are easy to tell apart. Cross-platform via crossterm (SetConsoleTitle on Windows, the OSC title escape on macOS/Linux/Windows Terminal); the task text is the first prompt line with control characters stripped, so a crafted prompt can't inject its own terminal escape.
  • Added an optional focus to /compact: /compact <what to keep> steers the summary toward the topic you name (e.g. /compact the auth refactor and the failing test) while still producing a self-contained summary, so a compaction at 85% doesn't drop the thread you care about. A bare /compact is unchanged, an auto-compaction never carries a steer, and a blank focus (/compact ) is treated as no focus; the steer is consumed once per run so it can't leak into the next summary. (The instruction is appended to the existing compaction prompt; history replacement and checkpoint handling are untouched.)
  • Added /rewind — restore the session to an earlier turn: it truncates the conversation back to a chosen turn AND reverts the file edits made since (the custodian you can follow and undo). A checkpoint is recorded at every turn; /rewind opens a picker of them (newest first), each row showing its blast radius before you commit — … · drops N later turns · reverts M files (Pillar 1) — and selecting one reverts each touched file to its pre-turn content, newest-edit-first so stacked edits to one file collapse to a single restore. A file you changed outside tomte is reported and left as-is, never clobbered; run_shell side effects can't be undone and are counted in the calm summary (↩ rewound to: … · reverted N files · M shell effects could not be undone). In-session only — checkpoints reset on /clear, /compact, and /resume, since they index the runtime undo stack. Reuses the same atomic-restore + staleness guard as /undo; edits since a checkpoint are tracked by a monotonic counter so the capped undo stack's eviction can't miscount.
  • Added a live thinking display — while the model reasons, tomte now shows the reasoning text in muted italic (like Claude Code) so you can follow its thought, then collapses it to a compact Thought for Xs line the moment the answer starts. On by default; /thoughts off (or show_thinking: false in config.json) hides the text and keeps only the spinner's thinking cue, /thoughts on brings it back. Provider-agnostic — it renders whatever reasoning the active model streams (Anthropic thinking, OpenAI reasoning), so it carries across a model switch. (/thinking is unchanged — it still picks reasoning effort; /thoughts is the new display toggle. The reasoning was already captured per turn; this surfaces it instead of suppressing it.)
  • Added tomte --continue (-c) — resume the most recent session in the current directory immediately, skipping the /resume picker (parity with claude --continue). It reuses the exact restore path the picker uses (history, reasoning, and the active goal), and a directory with no saved session starts fresh with a one-line note instead of erroring. tomte resume still opens the picker to choose among older sessions.
  • Added the Proof Capsule — "done means verified." /prove (in a session) and tomte prove (headless) collect an evidence bundle the CLI gathers itself: the files git reports changed, plus the real exit codes of the project's own verification scripts — test, typecheck, lint, build — which tomte runs and observes. The model never supplies these numbers and can't fabricate a green capsule; at most it explains one the CLI already collected. The card reads ✅ Verified / ❌ Not verified / ⚠️ Unverified, lists each check (✅ test passed cargo test), shows a failing check's output tail, and ends with a one-line reproduce command. A check the project could define but doesn't (a Node project with no typecheck script) surfaces as a deterministic "⚠️ not verified", never silently dropped — so an absent test suite can't masquerade as a passing one. The toolchain is auto-detected per ecosystem: Rust (cargo test/check/clippy/build), Node (package.json scripts via the detected npm/pnpm/yarn/bun, resolving typecheck/type-check/tsc aliases), Go (go test/vet/build ./...), and Python (pytest/mypy/ruff, each present only when its tool is on PATH). tomte prove exits non-zero when any check fails, so it can gate a commit hook or CI step; tomte prove --json emits the capsule for scripting. In the TUI the collection runs on a background task (it can shell out to a full build/test suite) so the UI keeps animating, and a second /prove while one is in flight is a no-op. Cross-platform (runs each script through cmd /C on Windows, sh -c elsewhere; secret-looking env vars are scrubbed from the child as everywhere else).
  • Added the Repo Twin / Context X-Ray — a verifiable map of the repository the agent (and you) can trust. tomte twin builds five indexes straight from the source — file/import graph, symbol/function graph, test→source map, git recent-change map, and project conventions (AGENTS.md/README/docs) — and caches them as JSON beside the memory/decision stores, rebuilding automatically when the working tree changes (--rebuild forces a fresh scan, --json emits the summary). tomte why-context <file|symbol|file:line> is the headline query: given a seed — a file, a stack-trace location, or a symbol name — it prints the files a maintainer would pull into context, each tagged with the index it came from (import / symbol / test / git / recorded decision), and the nearby files it deliberately leaves out, each with the reason it's unreachable. Every claim is grounded in a real import edge, definition, test, commit, or decision — never an invented "this project uses pattern X": the symbol graph only traces globally-distinctive names and skips method/field accesses (so a generic name like append can't manufacture a false reference), and recorded decisions on the seed are shown with a freshness flag (fresh / drifted / stale) so you can see which memory has gone out of date. Multi-language (Rust, JavaScript/TypeScript, Python, Go) and cross-platform, with no native database — pure Rust/JSON. The same map is reachable from inside a session: /twin [--rebuild] shows the index summary and /why-context <seed> (alias /xray) runs the X-ray query, each on a blocking task off the UI thread so the first full scan doesn't freeze the session.
  • Added the Handoff capsule — tomte handoff (and /handoff in a session): the shift report that lets the next session pick the house up where this one left it, whether that's a colleague, tomorrow's you, or a different model entirely — the decision trail is cross-model on purpose, and this is the door it walks through. One paste-ready markdown capsule collects, from real state and never from a model's summary: where the tree stands (branch, HEAD, dirty files capped with an "and N more", recent commits), why things are the way they are (the newest recorded decisions with who recorded them, a pointer to the full trail, and a drift-watch line — anchors holding / healed / needing eyes — from the same reconcile /why --reconcile runs), the twin's five-index map summary, and the top of the Repo Pulse. Sections degrade gracefully (outside a git repo it says so; an empty trail points at record_decision) and it ends with the house rule: tomte prove — done means verified. --json for scripts, --out HANDOFF.md to write a file; in the TUI the collection runs off the UI thread.
  • Added Repo Pulse — tomte pulse (and /pulse in a session): which files are most likely to break next, scored from the Repo Twin's own indexes with a formula printed right on the card — risk = commits in the recent git window × (import fan-in + 1) × 2 when no test covers the file. Change heat says the code is in play, blast radius says others lean on it, and a missing test means the regression lands silently — every factor is a real index entry (a GitStat, an ImportEdge, a TestEdge), so the verdict is reproducible: rerun it, get the same card, argue with the numbers instead of a model's vibe. The card lists the top 10 with the most recent commit subject for each, plus two vitals — how many hot source files have no covering test, and the file with the widest import fan-in (shown only at ≥2 importers, since a Rust mod tree gives every file exactly one declaring parent). Test files and non-source files are never scored; ties break by path so equal twins render byte-identical cards; --json emits the report for scripts and --rebuild re-scans first. Costs nothing beyond the cached twin load — pure index math, no shell-outs, no model.
  • Added Claude Fable 5 (claude-fable-5, GA June 9, 2026) — Anthropic's new top tier above Opus — to the model catalog: it appears first in the Anthropic picker/login list (tagged 1M ctx · most capable · top tier), carries its published facts (1M context window by default, 128K max output, adaptive-only thinking, xhigh/max effort honoured instead of clamped), prices at the published $10/$50 per MTok in /cost (cache read $1.00, 5m cache write $12.50 — the same 0.1×/1.25× rule as the other Claude tiers), and model: fable in a subagent file resolves to the concrete id like opus/sonnet/haiku do. The request shape was alre...
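
The Repo Pulse formula quoted in the list above (risk = commits in the recent git window × (import fan-in + 1), doubled when no test covers the file, with ties broken by path and the top 10 shown) is simple enough to restate as code. The Python sketch below is only an illustration of that arithmetic: FileStats, risk, and top_risks are hypothetical names, and the real scoring reads the twin's GitStat, ImportEdge, and TestEdge entries rather than a flat record like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical flattened view of one source file's index entries."""
    path: str
    recent_commits: int  # commits in the recent git window
    fan_in: int          # number of files importing this one
    has_test: bool       # whether any test covers the file


def risk(stats):
    """risk = commits x (fan-in + 1), times 2 when no test covers the file."""
    score = stats.recent_commits * (stats.fan_in + 1)
    return score if stats.has_test else score * 2


def top_risks(files, limit=10):
    """Highest risk first; equal scores are ordered by path for determinism."""
    return sorted(files, key=lambda f: (-risk(f), f.path))[:limit]
```

For example, a file with 3 recent commits, 2 importers, and no covering test scores 3 × (2 + 1) × 2 = 18, while the same file with a test scores 9. Sorting on the pair (negated score, path) is one way to get the reproducible ordering the note describes.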

v0.0.2

08 Jun 17:11

0.0.2

  • Taught the agent to see a task through by default — a # Seeing a task through section in the system prompt every turn inherits: plan, write the failing test first (TDD), finish the job, then prove it with build/test/lint and loop until green; scaled so a one-line fix stays light.
  • Added a glass-box pre-flight — before a write or shell command runs, one calm line states what it changes and how far it reaches (plus a leash note for destructive ones); reads and searches stay cardless.
  • Surfaced a file's recorded decisions as house rules in the pre-flight — an edit to a file with recorded decisions lists them first, so the agent re-reads its own constraints before it could break one.
  • Added a cross-model decision trail — record_decision logs why a non-obvious change was made to an append-only decisions.jsonl, re-injected each session so a later session or a different model inherits the reasoning.
  • Added tomte why <loc> / tomte why --all / /why to read the decision trail back, and tomte blame <file> for the greppable, one-decision-per-line file view.
  • Added --json to tomte why and tomte blame (and tomte why --reconcile --json) — the decision trail and the drift report emit machine-readable JSON alongside the rendered text, so a script or CI drift-gate can read the trail (and assert no stale/ambiguous decisions) instead of parsing prose.
  • Added an end-of-turn receipt — a turn that changes something closes with one line: files touched, tests run (pass/fail), and the why it recorded.
  • Added an agent-writable memory tool — project-scoped notes that persist across sessions, with a MEMORY.md index re-injected each session.
  • Added an OS-level sandbox for run_shell — Landlock + seccomp (Linux) / sandbox-exec (macOS) confine writes to the workspace and block outbound network by default (read-only / workspace-write / danger-full-access).
  • Added per-run sandbox overrides — --sandbox <mode> / --sandbox-allow-net and TOMTE_SANDBOX_* env vars (never persisted); Linux adds conservative rlimits, Windows tears the tree down via a kill-on-close Job Object.
  • Added tomte doctor and /doctor — a read-only setup health check (auth, config, model routing, MCP, external tools) that runs headless and exits non-zero on failure.
  • Added tomte mcp — manage MCP servers from the CLI instead of hand-editing settings.json: tomte mcp add <name> -- <command> [args…] (with repeatable --env KEY=VALUE) writes the mcp_servers entry atomically while preserving every other key, and list / get / remove round it out (env values are never printed back).
  • Added a TOMTE_CONFIG_DIR override to relocate the whole config tree (config, auth, sessions, logs) on every platform — also the portable way to isolate tests.
  • Added composer prefixes — @<path> attaches a file via gitignore-aware typeahead, !<command> runs a shell command inline (!! past the guard), #<note> appends to CLAUDE.md.
  • Added automatic, provider-agnostic model failover — a rate-limited or overloaded model switches to the next in fallback_models; off by default, only before any answer has streamed.
  • Added project-scoped config — a .tomte/config.json overrides safe fields only (model, reasoning_effort, verbosity, auto_compact, fallback_models); security keys are ignored.
  • Added left-drag text selection in the TUI — drag to highlight and copy on release (no Shift), handling wide CJK/emoji; /help documents it plus history recall (↑/↓).
  • Added a live context gauge to the status line — N% ctx next to the model, colored calm → warning → danger toward the ~85% auto-compact threshold.
  • Added tomte-website/ — the static Next.js marketing & docs site, deployed at https://tomte-website.vercel.app.
  • Added runnable, std-only previews of the planned pillar concepts (hand-compiled, invisible to cargo/CI), including a cross-provider cost demo.
  • Rebuilt the welcome card into a full first-screen panel — pixel-pet, brand/version, live setup (model · effort · account), workspace, a /init-style house-rules check, and a shortcuts footer; spans the full terminal width.
  • Reworked the turn spinner — a flickering-hearthfire glyph (▁▂▄▆█▆▄▂) instead of braille and a ~245-word tomte-voiced pool that holds a word ~8s then drifts on a pure seed × elapsed schedule, so it never flickers; a running todo's active_form takes the line instead.
  • Made the spinner words configurable — spinner_verbs { verbs, exclude_default } in config.json appends to or replaces the built-in pool.
  • Gave a finished sub-agent in the fleet view a settled past-tense verb (e.g. Forged · 4 steps · 1m 12s) instead of a stale in-flight phrase.
  • Rewrote the edit_file / multi_edit diff into a real hunk — shared lines collapse into uncolored context, the -/+ counts reflect only real changes, and line numbers follow the unified-diff convention.
  • Unified todo glyphs across the inline todo_write checklist and the pinned panel (in-progress is now a filled glyph), and added the (Ctrl+O for more) hint to truncated diff/error bodies.
  • Rebuilt /context as a visual context-window breakdown — a proportional colored grid and per-category legend with token estimates; /context all expands the lists.
  • Made /cost accurate and per-model — spend tallied per model and billing class (input / output / cache read / cache write), and it survives /resume.
  • Gave the composer a cozy face — a prompt gutter and a what shall we build today? placeholder.
  • Made grep work with no external tools — a native, dependency-free fallback covers content / files_with_matches / count with context and path scoping when neither ripgrep nor grep can be spawned; the recursive walk is shared with glob.
  • Made tomte doctor warn (not hard-error) when neither ripgrep nor grep is installed, now that grep / glob have a native fallback.
  • Made read_file render a Jupyter .ipynb as cells (ids + text outputs; image/rich outputs omitted) instead of dumping raw JSON, pairing with notebook_edit; a sliced read (offset/limit) still returns the raw JSON.
  • Gave read_file vision — a whole-file read of an image (PNG/JPEG/GIF/WebP) or PDF now attaches the bytes as media so a vision model can SEE it (the Anthropic translator emits image/document blocks in the tool_result), instead of the old text-only "binary file" note. Tool results carry optional media end-to-end via a new execute_rich (the 26 text tools are untouched; only read_file overrides it); the OpenAI wire degrades to the text note since its function_call_output doesn't accept media.
  • Surfaced project-local skills and custom commands in the / slash menu — skills under .tomte/skills (and .claude/.codex) plus commands/*.md now appear as /<name> entries (tagged by scope, e.g. skill (.tomte)) so you can trigger them manually, and typing /<skill-name> loads that skill's instructions into the composer. Global skills stay out of the quick menu (the model still loads any of them on demand via the skill tool).
  • Made read_file and list_dir give a clear, self-correcting error when handed the wrong kind of path — read_file on a directory points to list_dir/glob, and list_dir on a file points to read_file, instead of surfacing a raw OS error.
  • Settled on the full-screen alternate-buffer renderer as the default; the inline viewport is now opt-in via TOMTE_INLINE=1, with a slimmer height and a bottom-anchored live tail.
  • Unified the TUI's ~70 scattered color literals into one calm palette — an achromatic base, a single muted sage-teal accent, and muted semantic colors.
  • Gave the harness prompt a voice with a spine — it pushes back on weak plans, states confidence, anchors claims to receipts, and drops sycophancy and emoji.
  • Made tool calls self-correcting — a failed-to-parse argument gets an expected-args summary, and an unknown tool name suggests the closest real one (Did you mean: read_file?).
  • Made headless chat / run read-only by default — side-effecting tools are denied so a prompt-injected model can't act unattended; pass --dangerously-skip-permissions to allow them.
  • Grounded the OpenAI model catalog against the current API — removed ids auto-migrate, the -pro family clamps to high effort, and a future Opus/Sonnet inherits the 1M window.
  • Improved retry/timeout behavior — Retry-After is honored in HTTP-date form too, backoff is jittered so sub-agents don't retry in lockstep, and non-streaming calls get a 300s cap.
  • Reorganized the codebase so every Rust source file is ≤500 lines — large modules split into focused submodules, later renamed to semantic names (canonical_args, todo_reminder, slash_ops / slash_meta, content-named *_tests); pure refactor, no behavior change.
  • Removed the unused keyring dependency for a smaller build and attack surface — credentials stay in auth.json with 0o600 perms.
  • Renamed the project from opencli to tomte — binary, crates, config dir (~/.config/tomte), TOMTE_* env vars, logo, and user-agent — breaking: the old ~/.config/opencli is no longer read, so re-run tomte login.
  • Hardened the decision-trail reconcile write — a failed atomic rewrite of decisions.jsonl is now logged instead of silently swallowed, and the staging temp uses a unique per-process name so two concurrent reconciles can't clobber each other's temp before the rename.
  • Fixed sign-in on Windows — auth.json now persists under %APPDATA%\tomte (owner-only via icacls), so an OAuth login can complete instead of looping the sign-in picker.
  • Made the Windows credential-file ACL tightening observable: a failed or skipped icacls owner-only grant (e.g. USERNAME unset) is now logged instead of silently leaving the file on its inherited %APPDATA% ACL.
  • Fixed MCP servers failing to spawn on Windows — a bare npx / node / pnpm command (a .cmd shim)...
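
The retry item in the list above says Retry-After is honored in HTTP-date form too and that backoff is jittered so sub-agents don't retry in lockstep. A minimal Python sketch of both ideas follows. It is not tomte's implementation: the function names, the base and cap values, and the choice of "full jitter" (a uniform draw up to an exponential ceiling) are all assumptions made for illustration.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header, now=None):
    """Parse Retry-After as delta-seconds or an HTTP-date; None if unusable."""
    if header is None:
        return None
    value = header.strip()
    if value.isascii() and value.isdigit():
        return float(value)  # delta-seconds form, e.g. "120"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date already in the past means "retry now", not a negative wait.
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random):
    """Assumed full-jitter backoff, never shorter than the server's hint."""
    ceiling = min(cap, base * (2 ** attempt))
    delay = rng.uniform(0.0, ceiling)
    return max(delay, retry_after or 0.0)
```

Because each caller draws its own random delay, several sub-agents hitting the same rate limit spread their retries out instead of waking together, while an explicit server hint still sets the floor.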

v0.0.1-beta.4

02 Jun 01:52

v0.0.1-beta.4 Pre-release

v0.0.1-beta.3

01 Jun 04:52
299cf69

v0.0.1-beta.3 Pre-release

What's Changed

  • Release 0.0.1-beta.3 — context, streamed tool calls, reasoning portability by @ryan-mt in #1

Full Changelog: v0.0.1-beta.2...v0.0.1-beta.3

v0.0.1-beta.2

01 Jun 01:45

v0.0.1-beta.2 Pre-release

v0.0.1-beta.1

01 Jun 00:19

v0.0.1-beta.1 Pre-release

OpenCLI 0.0.1 beta build.

Highlights:

  • Hardened OpenAI-compatible strict tool schemas, including dispatch_agent optional fields.
  • Improved cross-platform shell/tool output behavior for Linux, macOS, and Windows.
  • Published release artifacts for Linux x86_64, macOS x86_64, macOS arm64, and Windows x86_64.
  • Updated release CI to current GitHub Actions versions.

Full changelog: https://github.com/ryan-mt/opencli/commits/v0.0.1-beta.1