v0.2.0

github-actions released this on 31 May at 14:12 · 226 commits to main since this release

First release since 0.1.0 — 122 commits. The theme: making the harness do more of the work so cheaper/open models perform like expensive ones, plus a major reliability and permission-system pass.

Install: cargo install dirge-agent (then run dirge).

Highlights

Permission system — "allow always" actually sticks now

The recurring "I allow npx, then keep getting re-prompted" bug had three compounding root causes, all fixed (#271):

  • Command patterns weren't compiled in DOTALL mode, so the wildcard stopped at the first newline and a grant never matched a multi-line command (npx tsx -e "…\n…").
  • cmd ** required trailing args, so bare cargo test / ls re-prompted.
  • Benign prefixes (export, set, …) were skipped in the suggestion but not auto-allowed.
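
To make the first two fixes concrete, here is a minimal sketch of a grant matcher with the corrected behaviour. It is not dirge's implementation: the function name grant_matches and the exact grant syntax are assumptions for illustration, and it uses plain string operations rather than the regex-style patterns the notes describe. It only shows the two properties: trailing arguments are optional, and the argument tail may span lines.

```rust
/// Returns true when `command` is covered by `grant`.
/// Assumed (hypothetical) syntax: a trailing " **" means
/// "this prefix plus any arguments, including none".
fn grant_matches(grant: &str, command: &str) -> bool {
    let command = command.trim();
    match grant.strip_suffix(" **") {
        Some(prefix) => match command.strip_prefix(prefix) {
            // Bare command: trailing args are optional.
            Some("") => true,
            // Otherwise the prefix must end at whitespace. The rest is not
            // inspected, so it may contain newlines (the DOTALL-like part).
            Some(rest) => rest.starts_with(char::is_whitespace),
            None => false,
        },
        // No wildcard: require an exact match.
        None => command == grant,
    }
}

fn main() {
    let multi_line = "npx tsx -e \"console.log(1)\nconsole.log(2)\"";
    println!("{}", grant_matches("npx **", multi_line));
    println!("{}", grant_matches("cargo test **", "cargo test"));
    println!("{}", grant_matches("npx **", "npxx evil"));
}
```

The whitespace check after the prefix keeps a grant for npx from also covering an unrelated binary whose name merely starts with npx.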

Security tightening: npx / node / python / python3 run arbitrary (possibly remote) code and are no longer default-allowed — they prompt once, then the grant sticks. Project-scoped tools (cargo/go/make/pytest) stay trusted. Backed by a deterministic gating test corpus.

Optional LLM auto-approval (approval_provider) (#272)

Point approval_provider at a model, and it judges permission prompts against a safety rubric instead of pausing for you. The design is fail-safe: an unclear verdict is denied, an LLM error falls back to a human prompt, and the model can never override a hard deny.
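
The fail-safe rules above can be sketched as a small decision function. The type and function names here (Verdict, Outcome, resolve) are hypothetical and chosen for illustration; only the mapping itself comes from the notes.

```rust
/// What the approval model said about a request (hypothetical type).
#[derive(Debug, PartialEq)]
enum Verdict {
    Approve,
    Deny,
    Unclear,
}

/// What the harness finally does with the request (hypothetical type).
#[derive(Debug, PartialEq)]
enum Outcome {
    Allow,
    Deny,
    AskHuman,
}

/// Combines a hard-deny rule with the model's verdict.
/// `llm` is Err when the approval call itself failed.
fn resolve(hard_denied: bool, llm: Result<Verdict, String>) -> Outcome {
    // A hard deny wins before the model's answer is even considered.
    if hard_denied {
        return Outcome::Deny;
    }
    match llm {
        Ok(Verdict::Approve) => Outcome::Allow,
        // Unclear is treated exactly like an explicit deny.
        Ok(Verdict::Deny) | Ok(Verdict::Unclear) => Outcome::Deny,
        // Provider errors fall back to the normal human prompt.
        Err(_) => Outcome::AskHuman,
    }
}

fn main() {
    println!("{:?}", resolve(false, Ok(Verdict::Approve)));
    println!("{:?}", resolve(false, Ok(Verdict::Unclear)));
    println!("{:?}", resolve(false, Err("timeout".to_string())));
    println!("{:?}", resolve(true, Ok(Verdict::Approve)));
}
```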

Background MCP tool loading — instant UI (#265, #266, #267, #268)

The TUI now draws immediately; MCP servers connect concurrently in the background and their tools are injected into the live agent when ready (search-gated under dynamic_tool_search). No more multi-second startup stall.
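
As a rough illustration of the pattern, the sketch below connects to several servers on background threads and hands each server's tools to the foreground over a channel as soon as they are ready. It uses only the standard library; connect, load_in_background, the server names, and the simulated delays are all hypothetical stand-ins, and the real harness may well use an async runtime instead.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Stand-in for a real MCP handshake: sleeps, then reports one tool.
fn connect(server: &str, delay_ms: u64) -> Vec<String> {
    thread::sleep(Duration::from_millis(delay_ms));
    vec![format!("{server}/tool")]
}

/// Starts one worker per server and returns immediately.
/// Each worker sends its tool list when its connection completes.
fn load_in_background(servers: Vec<(&'static str, u64)>) -> mpsc::Receiver<Vec<String>> {
    let (tx, rx) = mpsc::channel();
    for (name, delay_ms) in servers {
        let tx = tx.clone();
        thread::spawn(move || {
            let tools = connect(name, delay_ms);
            // The receiver may already be gone; ignore the send error.
            let _ = tx.send(tools);
        });
    }
    // The original sender is dropped here, so the receiver closes once
    // every worker has finished.
    rx
}

fn main() {
    let rx = load_in_background(vec![("fs", 30), ("git", 10)]);
    println!("UI drawn before any server has connected");

    let mut live_tools: Vec<String> = Vec::new();
    for batch in rx {
        // In a real event loop this would be a non-blocking poll per frame.
        live_tools.extend(batch);
    }
    live_tools.sort();
    println!("{live_tools:?}");
}
```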

Reasoning & tool-use guidance suite

A model-agnostic, research-backed steering layer baked into the loop:

  • F1: few-shot tool-use exemplars.
  • F2: a finishing self-check plus a definition of done.
  • F3: progress narration.
  • F4: reflexion memory.
  • F5: ask-vs-proceed calibration.
  • F6: an outcome-aware verifier gate plus a bounded in-loop LLM critic.
  • Also: DeepSeek-aware model-family steering and reflect-then-pivot loop intervention.

Reliability

  • No more provider 400s from orphaned tool_call_ids (partial storm suppression / interrupted batches now backfill a synthetic error result) (#275).
  • Active steering isn't killed by the max_turns cap — a fresh user steer resets the turn budget (#277).
  • Memory compaction: when the budget is full, the oldest entries are evicted to make room instead of failing the write (#276).
  • Transient-failure retry budget raised 3 → 5; doom-loop guard no longer hard-denies calls you keep approving.
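
The orphaned tool_call_id fix in the first bullet can be pictured with a short sketch. The Message type and backfill_orphans function are hypothetical, and the sketch simply appends the synthetic results at the end of the history; a real provider may require each result to sit directly after its call.

```rust
use std::collections::HashSet;

/// Simplified conversation entries (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
enum Message {
    ToolCall { id: String },
    ToolResult { id: String, content: String, is_error: bool },
}

/// Adds a synthetic error result for every tool call that has no result,
/// and returns how many were added.
fn backfill_orphans(history: &mut Vec<Message>) -> usize {
    // Ids that already have a result.
    let answered: HashSet<String> = history
        .iter()
        .filter_map(|m| match m {
            Message::ToolResult { id, .. } => Some(id.clone()),
            _ => None,
        })
        .collect();

    // Calls whose id never shows up in a result.
    let orphans: Vec<String> = history
        .iter()
        .filter_map(|m| match m {
            Message::ToolCall { id } if !answered.contains(id) => Some(id.clone()),
            _ => None,
        })
        .collect();

    for id in &orphans {
        history.push(Message::ToolResult {
            id: id.clone(),
            content: "tool call was interrupted before it produced a result".to_string(),
            is_error: true,
        });
    }
    orphans.len()
}

fn main() {
    let mut history = vec![
        Message::ToolCall { id: "call_1".to_string() },
        Message::ToolResult { id: "call_1".to_string(), content: "ok".to_string(), is_error: false },
        Message::ToolCall { id: "call_2".to_string() }, // interrupted, never answered
    ];
    let added = backfill_orphans(&mut history);
    println!("backfilled {added} orphaned call(s)");
}
```

Running the function a second time adds nothing, because every call then has a matching result.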

Tooling & feedback

  • Tree-sitter pre-write syntax validation with actionable, named missing-token + delimiter-balance feedback (so the model fixes broken code on the same turn).
  • Background/detached shells (Claude-Code model: unbounded, read/kill by id).
  • LSP extend_extensions config; LSP exposed to Janet plugins via harness/lsp.
  • New plugins: PlanSearch (/plan) and a working bundled nREPL plugin.
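
To give a feel for the delimiter-balance feedback in the first bullet, here is a deliberately simplified stand-in. It does not use Tree-sitter: it is a plain stack-based scan that ignores strings and comments, and the function names and message wording are assumptions. The point is only the shape of the feedback, which names the missing token and where its opener lives.

```rust
/// Maps an opening delimiter to its closing counterpart.
fn closer(open: char) -> char {
    match open {
        '(' => ')',
        '[' => ']',
        _ => '}',
    }
}

/// Checks that (), [] and {} are balanced; on failure, returns a message
/// naming the offending or missing token. Strings and comments are NOT
/// skipped, which a real parser-based check would handle.
fn check_delimiters(source: &str) -> Result<(), String> {
    let mut stack: Vec<(char, usize)> = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        let line_no = idx + 1;
        for ch in line.chars() {
            match ch {
                '(' | '[' | '{' => stack.push((ch, line_no)),
                ')' | ']' | '}' => match stack.pop() {
                    Some((open, _)) if closer(open) == ch => {}
                    Some((open, open_line)) => {
                        return Err(format!(
                            "line {line_no}: found '{ch}' but '{open}' opened on line {open_line} is still unclosed"
                        ));
                    }
                    None => return Err(format!("line {line_no}: unmatched closing '{ch}'")),
                },
                _ => {}
            }
        }
    }
    match stack.pop() {
        Some((open, open_line)) => Err(format!(
            "missing closing '{}' for '{}' opened on line {}",
            closer(open),
            open,
            open_line
        )),
        None => Ok(()),
    }
}

fn main() {
    let broken = "fn main() {\n    foo(1, 2);\n";
    match check_delimiters(broken) {
        Ok(()) => println!("ok to write"),
        Err(msg) => println!("rejected: {msg}"),
    }
}
```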

UI

Configurable visible panes (/display) and key bindings, left-panel session vitals (context gauge / activity / git), subagent rendering, slash-command ghost-text autocomplete, double-click word select, Ctrl+R reverse history search, background-agent count in the status bar, and a content-sized MODIFIED panel.

Housekeeping

Speed-optimized release profile (opt-level 3 + fat LTO), session-test isolation (no more recent-sessions noise), and a docs split (docs/).
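
For readers unfamiliar with these settings, a Cargo release profile with opt-level 3 and fat LTO typically looks like the fragment below. This is a generic illustration of the two options named above, not a copy of the project's Cargo.toml, which may set additional keys.

```toml
# Illustrative fragment only; the project's actual profile may differ.
[profile.release]
opt-level = 3   # maximum optimization for speed
lto = "fat"     # whole-program link-time optimization across all crates
```

Fat LTO generally lengthens release builds in exchange for a faster binary, so it mainly affects people building from source.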

Full diff: v0.1.0...v0.2.0