v0.2.0
First release since 0.1.0 — 122 commits. The theme: making the harness do more of the work so cheaper/open models perform like expensive ones, plus a major reliability and permission-system pass.
Install: cargo install dirge-agent (then run dirge).
Highlights
Permission system — "allow always" actually sticks now
The recurring "I allow npx, then keep getting re-prompted" bug had three compounding root causes, all fixed (#271):
- Command patterns weren't DOTALL, so a grant never matched a multi-line command (npx tsx -e "…\n…").
- cmd ** required trailing args, so bare cargo test / ls re-prompted.
- Benign prefixes (export, set, …) were skipped in the suggestion but not auto-allowed.
Security tightening: npx / node / python / python3 run arbitrary (possibly remote) code and are no longer default-allowed — they prompt once, then the grant sticks. Project-scoped tools (cargo/go/make/pytest) stay trusted. Backed by a deterministic gating test corpus.
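To make the first two root causes concrete, here is a minimal Python sketch of grant matching. It is an illustration only, not dirge's implementation: the function names and the reading of "cmd **" as "cmd plus optional arguments" are assumptions.

```python
import re

def grant_to_regex(pattern: str) -> re.Pattern:
    """Compile a hypothetical grant pattern such as 'npx **' into a regex.

    Assumption: 'cmd **' means 'cmd, optionally followed by any arguments'.
    """
    if pattern.endswith(" **"):
        head = re.escape(pattern[:-3])
        # Trailing args are optional, so a bare `cargo test` still matches
        # the grant `cargo test **`.
        body = head + r"(?:\s.*)?"
    else:
        body = re.escape(pattern)
    # re.DOTALL lets '.' match newlines, so multi-line commands still match.
    return re.compile(body, re.DOTALL)

def is_granted(command: str, grants: list[str]) -> bool:
    """Return True if any stored grant covers the whole command."""
    return any(grant_to_regex(g).fullmatch(command.strip()) for g in grants)
```

With grants of "npx **" and "cargo test **", both a multi-line npx tsx -e "…" command and a bare cargo test are covered, while cargo testx is not. Dropping re.DOTALL reproduces the first bug, because '.' then stops at the first newline.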
Optional LLM auto-approval (approval_provider) (#272)
Point approval_provider at a model and permission prompts are judged by it (with a safety rubric) instead of pausing for you — fail-safe (unclear → deny; LLM error → human prompt), and it can never override a hard deny.
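The fail-safe ordering can be sketched roughly as follows. This is a hypothetical Python illustration of the rules stated above, not the actual code: decide, is_hard_denied, and judge are invented names, and the judge is assumed to return a plain string verdict.

```python
from typing import Callable

ALLOW, DENY, ASK_HUMAN = "allow", "deny", "ask_human"

def decide(command: str,
           is_hard_denied: Callable[[str], bool],
           judge: Callable[[str], str]) -> str:
    """Resolve a permission prompt using an LLM judge, failing safe."""
    # A hard deny always wins; the judge is never consulted.
    if is_hard_denied(command):
        return DENY
    try:
        verdict = judge(command)
    except Exception:
        # LLM error: fall back to prompting the human.
        return ASK_HUMAN
    # Only an explicit "allow" approves; anything unclear is denied.
    return ALLOW if verdict == "allow" else DENY
```

Checking the hard deny first is what guarantees the judge can never override it, whatever it returns.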
Background MCP tool loading — instant UI (#265, #266, #267, #268)
The TUI now draws immediately; MCP servers connect concurrently in the background and their tools are injected into the live agent when ready (search-gated under dynamic_tool_search). No more multi-second startup stall.
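The startup ordering looks roughly like this asyncio sketch. It is a stand-in for the general pattern, not the project's code: connect, the server names, and the delays are all hypothetical.

```python
import asyncio

async def connect(server: str, delay: float) -> list[str]:
    """Stand-in for connecting to an MCP server and listing its tools."""
    await asyncio.sleep(delay)
    return [f"{server}.tool"]

async def run(servers: dict[str, float]) -> tuple[list[str], list[str]]:
    events: list[str] = []
    tools: list[str] = []

    async def load(server: str, delay: float) -> None:
        # Inject tools into the live tool list as soon as this server is ready.
        tools.extend(await connect(server, delay))
        events.append(f"loaded {server}")

    # Start every connection concurrently, without awaiting any of them yet.
    tasks = [asyncio.create_task(load(s, d)) for s, d in servers.items()]
    events.append("ui drawn")  # the UI is usable before any server responds
    await asyncio.gather(*tasks)
    return events, tools
```

Because the tasks are only scheduled, not awaited, before the UI event is recorded, "ui drawn" is always the first entry regardless of how slow the servers are.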
Reasoning & tool-use guidance suite
A model-agnostic, research-backed steering layer baked into the loop: few-shot tool-use exemplars (F1), a finishing self-check + definition-of-done (F2), progress narration (F3), reflexion memory (F4), ask-vs-proceed calibration (F5), an outcome-aware verifier gate + a bounded in-loop LLM critic (F6), plus DeepSeek-aware model-family steering and reflect-then-pivot loop intervention.
Reliability
- No more provider 400s from orphaned tool_call_ids (partial storm suppression / interrupted batches now backfill a synthetic error result) (#275).
- Active steering isn't killed by the max_turns cap; a fresh user steer resets the turn budget (#277).
- Memory compaction: when the budget is full, the oldest entries are evicted to make room instead of failing the write (#276).
- Transient-failure retry budget raised 3 → 5; doom-loop guard no longer hard-denies calls you keep approving.
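For the orphaned tool_call_id fix, the backfill idea can be sketched as below. This assumes an OpenAI-style chat message layout (assistant messages with tool_calls, tool messages with tool_call_id); the function name and the error text are hypothetical, and the real implementation may differ.

```python
def backfill_orphans(messages: list[dict]) -> list[dict]:
    """Give every assistant tool call a result, synthesizing errors if needed."""
    # IDs that already have a real tool result somewhere in the history.
    answered = {m["tool_call_id"] for m in messages if m.get("role") == "tool"}
    out: list[dict] = []
    for m in messages:
        out.append(m)
        if m.get("role") != "assistant":
            continue
        for call in m.get("tool_calls") or []:
            if call["id"] not in answered:
                # Orphaned call: add a synthetic error result so the provider
                # sees a matching tool message for every tool_call_id.
                out.append({
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "content": "error: tool call was interrupted before producing a result",
                })
    return out
```

Running this before each provider request leaves complete histories unchanged and only adds messages for calls that never received a result.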
Tooling & feedback
- Tree-sitter pre-write syntax validation with actionable, named missing-token + delimiter-balance feedback (so the model fixes broken code on the same turn).
- Background/detached shells (Claude-Code model: unbounded, read/kill by id).
- LSP extend_extensions config; LSP exposed to Janet plugins via harness/lsp.
- New plugins: PlanSearch /plan, a working bundled nREPL plugin.
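As a rough picture of what named delimiter-balance feedback means, here is a simple stack-based Python sketch. It is a deliberately simplified stand-in: it does not use Tree-sitter, it ignores strings and comments, and the message wording is invented.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}          # closer -> opener
CLOSER = {opener: closer for closer, opener in PAIRS.items()}

def delimiter_feedback(source: str) -> list[str]:
    """Report unbalanced delimiters with line numbers and the missing token."""
    stack: list[tuple[str, int]] = []  # (opener, line number)
    problems: list[str] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for ch in line:
            if ch in CLOSER:
                stack.append((ch, lineno))
            elif ch in PAIRS:
                if stack and stack[-1][0] == PAIRS[ch]:
                    stack.pop()
                else:
                    problems.append(f"line {lineno}: unexpected '{ch}'")
    # Anything left on the stack was opened but never closed.
    for opener, lineno in stack:
        problems.append(
            f"line {lineno}: '{opener}' is never closed, missing '{CLOSER[opener]}'"
        )
    return problems
```

Feedback in this shape names the token and the line, which gives the model enough to repair the file on the same turn.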
UI
Configurable visible panes (/display) and key bindings, left-panel session vitals (context gauge / activity / git), subagent rendering, slash-command ghost-text autocomplete, double-click word select, Ctrl+R reverse history search, background-agent count in the status bar, and a content-sized MODIFIED panel.
Housekeeping
Speed-optimized release profile (opt-level 3 + fat LTO), session-test isolation (no more recent-sessions noise), and a docs split (docs/).
Full diff: v0.1.0...v0.2.0