feat: sprout-agent + sprout-dev-mcp — minimal ACP coding agent#493
Merged
Minimal ACP agent in Rust replacing goose. Speaks ACP over stdio, calls LLM (Anthropic or OpenAI-compatible) non-streaming, uses MCP tools via rmcp. Five files (types, mcp, llm, agent, main). Provider enum (no trait). Typed HistoryItem prevents tool_call_id pairing bugs. One session, one prompt in flight. Env-var config only. Tests not yet added.
Adds tests/fake_llm.rs with a minimal HTTP/1.1 server returning canned
LLM responses. Three tests:
- text_only_end_turn: verifies initialize, session/new, session/prompt,
agent_message_chunk, end_turn stopReason.
- tool_call_then_text: verifies tool_call(pending) →
request_permission → permission reply → tool_call_update(failed)
[unknown tool synthetic error] → next round → end_turn.
- rejects_concurrent_prompts: verifies -32602 on second in-flight
session/prompt.
Also fixes a writer race: notifications must flush before the response
they precede. Added biased select! + try_recv drain to guarantee
agent_message_chunk arrives before stopReason on the wire.
1. agent.rs - only record assistant history when tool_calls is non-empty
2. mcp.rs - collapse_content takes max_bytes and stops appending once full
3. types.rs - clamp() always returns <= max bytes, even with tiny max
4. llm.rs - bound LLM response body to 16 MiB via chunk loop
5. agent.rs/main.rs - cancellation cleans up pending permission entry and emits a terminal tool_call_update(failed,cancelled)
6. main.rs - explicit JSON-RPC classification (request/notification/response/malformed); -32600 for requests with missing params
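Item 3 is easy to get wrong when the cut point lands inside a multi-byte character. A minimal sketch of a byte-bounded truncation, assuming a `clamp(&str, usize) -> &str` signature (the actual signature in types.rs is not shown in this PR text):

```rust
/// Truncate `s` to at most `max` bytes without splitting a UTF-8 character.
/// The signature is an assumption made for illustration.
fn clamp(s: &str, max: usize) -> &str {
    if s.len() <= max {
        return s;
    }
    // Walk back from `max` to the nearest char boundary. Index 0 is always
    // a boundary, so the loop terminates even when `max` is tiny.
    let mut end = max;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

With a tiny `max` the result may be empty, which still satisfies the "<= max bytes" guarantee and avoids a slicing panic.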
- mcp.rs: push_bounded helper bounds ALL collapse_content branches (text, image, audio, resource-link, resource elision)
- agent.rs: always remove pending permission entry on any non-success exit (cancel, wire send failure, dropped oneshot)
- main.rs: own history directly in the prompt task -- no Arc<Mutex>; Session.history is taken on prompt start and restored on completion
- main.rs: don't hold app.state across MCP child spawn; quick-check, release, spawn, re-check before installing the session
- main.rs: reject unterminated partial frames at EOF -- log and close
- main.rs: session_token() reads 8 random bytes from /dev/urandom (falls back to nanos^pid<<32) instead of leaking a stack address
- New src/wire.rs: typed ACP framing, classify(), JSON-RPC helpers, bounded line reader, writer task, ACP request param types.
- New src/config.rs: Config + env parsing + protocol/byte constants.
- src/types.rs: pure domain types only.
- src/agent.rs: RunCtx struct (was 9-arg run_prompt). Rejects unsupported ContentBlock with -32602; treats provider ToolUse + 0 calls as error.
- src/main.rs: thin dispatch. Validates cwd is absolute (-32602), protocolVersion (-32602), strict JSON-RPC classify (no-method-no-id => -32600), never responds to notifications. session_token() uses getrandom.
- Rename ACP_SEED_* env vars to SPROUT_AGENT_* in src/ and tests/.
- Remove Session.history Option workaround (mem::take on Vec).
- Drop #[allow(clippy::too_many_arguments)].
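The strict classification described above can be sketched over a pre-decoded frame. `Frame`, `Kind`, and `reject_code` are hypothetical names for illustration; only `classify()` and the -32600 code come from the commit message, and the real code in wire.rs operates on parsed JSON rather than on presence flags.

```rust
/// Minimal view of a decoded JSON-RPC object: which members are present.
/// Hypothetical type; the real wire.rs works on parsed JSON values.
struct Frame<'a> {
    method: Option<&'a str>,
    has_id: bool,
    has_params: bool,
    has_result_or_error: bool,
}

#[derive(Debug, PartialEq)]
enum Kind<'a> {
    Request(&'a str),
    Notification(&'a str),
    Response,
    Malformed,
}

/// Classify a frame by the members it carries.
fn classify<'a>(f: &Frame<'a>) -> Kind<'a> {
    match (f.method, f.has_id, f.has_result_or_error) {
        (Some(m), true, false) => Kind::Request(m),
        (Some(m), false, false) => Kind::Notification(m),
        (None, true, true) => Kind::Response,
        // Includes the no-method-no-id case.
        _ => Kind::Malformed,
    }
}

/// Error code to send back, if any. Notifications never get a reply.
fn reject_code(f: &Frame) -> Option<i32> {
    match classify(f) {
        Kind::Malformed => Some(-32600),
        Kind::Request(_) if !f.has_params => Some(-32600),
        _ => None,
    }
}
```

Keeping classification separate from the reply decision makes the "never responds to notifications" rule a property of one small function.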
- tests/golden_transcripts.rs: 10 tests that read like ACP spec examples. Covers initialize handshake, version check, cwd absolute validation, text-only response, full tool-call transcript (pending -> failed), permission flow, unsupported content block, malformed JSON-RPC, oversized line, concurrent prompt rejection, cancel notification.
- main.rs: introduce decode() and reject() helpers to collapse 4 stages of duplicated error-emit code; trim from 317 to 303 LOC.
- README.md: add metadata + canonical SPROUT_AGENT_* env vars.
- regressions.rs picked up SPROUT_AGENT_* rename.
cargo clippy -p sprout-agent --tests -- -D warnings: clean.
cargo test -p sprout-agent: 25/25 passing (15 prior + 10 golden).
3 MCP tools (shell, todo, str_replace) plus an rg PATH shim that prefers the system ripgrep and falls back to a built-in matcher.
- shell: ephemeral bash -c, workdir param, timeout with process-group kill via setsid + killpg, tail-heavy truncation (last ~8KB) with full output saved to a per-session artifact ring buffer (last 8).
- todo: persistent file under the session tempdir; replace-all when content is given, read otherwise.
- str_replace: atomic find-and-replace via NamedTempFile + persist, unique-match enforcement, unified diff output, fuzzy line hint (similar crate) on misses.
- rg shim: per-session mkdtemp (0700) hardlinked to the binary; prepended to PATH only inside the shell tool's env. argv0 dispatch re-execs the system rg with the shim removed from PATH; falls back to a small built-in supporting --files, -n, -i, -l, -g, -C.
- Bootstrap: ServerInfo.instructions describes tools, working directory, detected stack (Cargo.toml, package.json, go.mod, ...).
938 LOC across 6 files. No clap/anyhow/thiserror/tracing.
…th boundary, no panics)
Must fix (safety/correctness):
- str_replace: 10MB file size cap before read
- str_replace: count_occurrences_capped stops at 2 matches (memory)
- rg fallback: stream BufReader line-by-line; bounded sink caps total output at 50KB / 2000 lines (matches shell tool caps)
- todo: 1MB content cap + atomic write (temp + rename)
- shell: spawn failure now returns CallToolResult::error (is_error=true) instead of fake success with exit_code -1
- shell: artifact wording is honest about the 10MB capture cap
- rg: streaming design eliminates the m + opts.context overflow path
Should fix (quality):
- shim: drop guard renamed _dir (private), removed allow(dead_code)
- shell: kill / wait / try_wait errors surface in response notes field instead of being swallowed
- str_replace: nearest_line_hint capped at first 200 lines
Tests (16, all passing):
- str_replace: count_occurrences_capped, resolve_within rejects escape, basic replace + diff, outside-workspace rejection, file-too-large
- todo: read/write round-trip, oversize rejection
- shell: basic echo, timeout fires, workdir honored
- rg: parse basic / files-only / unknown flag, CappedSink byte limit, glob matching, scan_file finds match
Verification:
- cargo clippy --all-targets clean
- cargo test pass (16/16)
- zero expect/unwrap in production code
- production LOC: 1193 (under 1200 budget)
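Unique-match enforcement only needs to distinguish zero, one, and "more than one" match, which is why the count can stop at 2. A minimal sketch, assuming a `(haystack, needle, cap)` signature and a hypothetical `is_unique_match` helper (neither is shown in this PR text):

```rust
/// Count non-overlapping occurrences of `needle`, stopping once `cap` is hit.
/// `match_indices` is lazy, so `take(cap)` ends the scan early instead of
/// walking the rest of a large file.
fn count_occurrences_capped(haystack: &str, needle: &str, cap: usize) -> usize {
    if needle.is_empty() {
        // An empty needle matches everywhere; treat it as no match.
        return 0;
    }
    haystack.match_indices(needle).take(cap).count()
}

/// Hypothetical helper: a replacement is allowed only on exactly one match.
fn is_unique_match(haystack: &str, needle: &str) -> bool {
    count_occurrences_capped(haystack, needle, 2) == 1
}
```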
Split kill_process_group into two strategies:
- kill_process_group_graceful (async): SIGTERM → tokio::time::sleep(200ms) → SIGKILL
- kill_process_group_immediate (sync): SIGKILL only, for Drop where async is unavailable
Eliminates 200ms thread::sleep that blocked the current_thread runtime.
The LLM could previously pass any directory as workdir, escaping the workspace boundary. Now both shell and str_replace canonicalize the provided workdir and reject it if it doesn't start_with the server's initial cwd. Symlink escapes are caught by canonicalization.
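A minimal sketch of that containment check, using only the standard library. The name `resolve_within` appears in the test list earlier in this PR, but the signature and error type here are assumptions:

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `requested` against `root` and reject anything outside `root`.
/// Both sides are canonicalized, so `..` segments and symlinks are resolved
/// before the prefix comparison.
fn resolve_within(root: &Path, requested: &Path) -> io::Result<PathBuf> {
    let root = root.canonicalize()?;
    let joined = if requested.is_absolute() {
        requested.to_path_buf()
    } else {
        root.join(requested)
    };
    // canonicalize() fails if the path does not exist, which also rejects
    // workdirs that have not been created yet.
    let resolved = joined.canonicalize()?;
    // Path::starts_with compares whole components, so "/ws-evil" does not
    // count as being inside "/ws".
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "workdir escapes workspace",
        ))
    }
}
```

Comparing canonical paths component-wise is what catches the symlink case: a link inside the workspace that points outside resolves to its target before the check runs.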
…ols" This reverts commit bd9011c.
…not 0) A pathological server that starts fine but deadlocks on every tool call would previously get infinite restart attempts because kill_server reset attempts to 0. Now it starts at 1, so repeated kill→restart cycles eventually exhaust the budget.
MCP server initialization (spawn + handshake) can take up to 30s. Previously this blocked the single reader task, preventing session/cancel from being processed during that window. Now spawned like session/prompt.
- todo.rs: remove anti-removal enforcement, control-char rejection, deny_unknown_fields. Keep end-turn gate. 405 → 287 lines.
- tree.rs: new PATH shim (like rg). Shows directory structure with line counts. 102 lines, zero comments, codex-approved.
- Remove 8 unhelpful comments across agent.rs, mcp.rs, shell.rs.
- Deduplicate MCP descriptions: server instructions shrink to 3 lines, tool descriptions are self-contained with no overlap.
- shim.rs: symlink tree alongside rg.
- main.rs: add mod tree + argv[0] dispatch.
tree / was hanging forever — it walked the entire filesystem before truncating output. Now collect() checks out.len() >= line_budget at every recursion entry and before each file/dir, stopping as soon as we have enough lines to fill the output cap.
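The fix can be illustrated on an in-memory tree. `Node` and this `collect` signature are hypothetical; the real tree.rs walks the filesystem. The point is that the budget is checked on entry and before each child, so the walk stops instead of truncating after the fact.

```rust
/// Hypothetical in-memory stand-in for a directory tree.
enum Node {
    File(&'static str),
    Dir(&'static str, Vec<Node>),
}

/// Depth-first listing that stops as soon as `line_budget` lines exist.
fn collect(node: &Node, depth: usize, line_budget: usize, out: &mut Vec<String>) {
    // Check at every recursion entry.
    if out.len() >= line_budget {
        return;
    }
    let indent = "  ".repeat(depth);
    match node {
        Node::File(name) => out.push(format!("{}{}", indent, name)),
        Node::Dir(name, children) => {
            out.push(format!("{}{}/", indent, name));
            for child in children {
                // Check again before each file/dir so siblings are not visited
                // once the output cap is already reached.
                if out.len() >= line_budget {
                    break;
                }
                collect(child, depth + 1, line_budget, out);
            }
        }
    }
}
```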
…add, -- support
Fixes from codex audit:
- Skip files >10MB instead of reading into memory (OOM prevention)
- Use writeln! to stdout lock instead of println! (no panic on broken pipe)
- saturating_add on line count totals (overflow prevention)
- Support -- terminator for paths starting with -
Replaced manual read_dir + SKIP_DIRS with ignore::WalkBuilder. Handles .gitignore, .ignore, global gitignore, nested ignores, and hidden files. Single-pass with stack-based directory totals. Still bounded (line budget, file size cap, broken pipe safe).
VISION_AGENT.md: 10 files (not 9), 39 tests (not 25), ~2,900 LOC (not ~2,500), ~4,400 total (not ~4,000), sprout-dev-mcp 7 files (not 6). README.md: ten files, seven deps, updated architecture diagram, ~2,900 LOC.
The while-loop guard ensures stack is non-empty, making the unwrap logically unreachable. Replace with let-else to eliminate the panic path entirely and satisfy #![forbid(unsafe_code)] + zero-panic claims.
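A minimal sketch of the pattern, on a hypothetical `drain_sum` function rather than the actual tree.rs loop:

```rust
/// Sum a stack of line counts without any panic path.
fn drain_sum(mut stack: Vec<u64>) -> u64 {
    let mut total: u64 = 0;
    while !stack.is_empty() {
        // Before: `let top = stack.pop().unwrap();`
        // The guard makes `None` unreachable, but let-else removes the
        // panic path from the code instead of relying on that argument.
        let Some(top) = stack.pop() else { break };
        total = total.saturating_add(top);
    }
    total
}
```

The `else` branch must diverge (`break`, `return`, `continue`, or a panic), which is what lets `top` be bound without an `Option` wrapper.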
…pact)
Remove the 295-LOC synthetic todo tool from sprout-agent and replace it with a generic MCP hook system. Any MCP server can now participate in agent lifecycle events by exposing tools prefixed with _.
Hook system (~80 LOC in agent):
- call_hooks(): single DRY dispatch function for all lifecycle points
- _Stop: called before honoring end_turn; objections continue the loop
- _PostCompact: called after context handoff; re-injects server state
- Tools prefixed with _ are filtered from LLM and rejected if called directly
- Fail-open: timeouts kill the server, errors are silent, hooks never block
- Agent sovereignty: 500ms timeout, 3-rejection session budget, consecutive end_turn without tool calls = respect LLM's decision
Todo reimplemented in sprout-dev-mcp:
- Regular 'todo' MCP tool (CRUD, same schema/validation as before)
- _Stop hook returns objection when open items exist
- _PostCompact hook returns full list for context re-injection
- schemars annotations expose constraints (max 50 items, id<=9999, etc)
Configuration:
- MCP_HOOK_SERVERS: operator allowlist (unset=no hooks, *=all)
- SPROUT_AGENT_HOOK_TIMEOUT_MS: per-hook timeout (default 500)
- SPROUT_AGENT_STOP_MAX_REJECTIONS: session budget (default 3, 0=disable)
Compatible with Open Plugin Spec hook naming conventions. MCP_HOOK_SERVERS is a standard env var name for cross-agent adoption.
Tests: 64 passing (15 regression, 12 config unit, 10 golden, 4 transcript, 23 dev-mcp). New coverage: stop-blocks, budget-exhausted, consecutive-end, timeout-failopen, hidden-tool-rejection, post-compact-injection.
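The allowlist, the hidden-tool filter, and the rejection budget can be sketched as three small pure functions. All type and function names below are hypothetical; only the env var semantics (unset = no hooks, * = all), the `_` prefix rule, and the budget rules come from the commit message. Treating an empty value like an unset one is an assumption.

```rust
/// Parsed form of MCP_HOOK_SERVERS (hypothetical type).
#[derive(Debug, PartialEq)]
enum HookAllowlist {
    Disabled,
    All,
    Named(Vec<String>),
}

/// `raw` is the env var value, or None when unset.
fn parse_allowlist(raw: Option<&str>) -> HookAllowlist {
    match raw.map(str::trim) {
        // Assumption: an empty value behaves like an unset one.
        None | Some("") => HookAllowlist::Disabled,
        Some("*") => HookAllowlist::All,
        Some(list) => HookAllowlist::Named(
            list.split(',')
                .map(str::trim)
                .filter(|s| !s.is_empty())
                .map(String::from)
                .collect(),
        ),
    }
}

impl HookAllowlist {
    fn allows(&self, server: &str) -> bool {
        match self {
            HookAllowlist::Disabled => false,
            HookAllowlist::All => true,
            HookAllowlist::Named(names) => names.iter().any(|n| n.as_str() == server),
        }
    }
}

/// Tools prefixed with `_` are hooks and are never shown to the LLM.
fn visible_tools<'a>(tools: &[&'a str]) -> Vec<&'a str> {
    tools.iter().copied().filter(|t| !t.starts_with('_')).collect()
}

/// Whether a `_Stop` objection may still override end_turn.
fn stop_may_object(rejections_used: u32, max_rejections: u32, consecutive_end_turn: bool) -> bool {
    // max_rejections == 0 disables the gate; a second end_turn in a row
    // without tool calls is respected regardless of remaining budget.
    max_rejections != 0 && rejections_used < max_rejections && !consecutive_end_turn
}
```

Keeping these decisions pure makes the fail-open behaviour easy to test without spawning an MCP server.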
Remove numeric IDs from todo items. Schema is now [{text, done}] —
position is display-only. The LLM no longer invents/tracks IDs.
Add silent-removal detection: if open items disappear from the list
without being marked done, the tool response includes a soft warning.
This closes the _Stop bypass where the LLM could just delete open items.
Hardening:
- Reject control characters and Unicode trickery (bidi, zero-width, etc)
- Reject duplicate text (after trim normalization)
- deny_unknown_fields on all param structs
- Trim text on storage for consistent identity
- Length validation on trimmed form
- Atomic write+render under single lock hold
39 tests in sprout-dev-mcp (was 23). Codex GPT 5.5 scored 9/10 —
remaining items are exotic Unicode normalization and the intentional
advisory (not authoritative) design of the removal warning.
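Silent-removal detection reduces to a set difference over open items, keyed by trimmed text. A minimal sketch, assuming a hypothetical `Item` struct that mirrors the `{text, done}` schema and a hypothetical `silently_removed` function:

```rust
/// Mirrors the `[{text, done}]` schema (hypothetical struct).
#[derive(Clone, Debug, PartialEq)]
struct Item {
    text: String,
    done: bool,
}

impl Item {
    fn new(text: &str, done: bool) -> Self {
        Item { text: text.to_string(), done }
    }
}

/// Open items from `old` whose trimmed text no longer appears in `new`.
/// An item that is still present but now marked done is not reported.
fn silently_removed<'a>(old: &'a [Item], new: &[Item]) -> Vec<&'a str> {
    old.iter()
        .filter(|o| !o.done)
        .filter(|o| !new.iter().any(|n| n.text.trim() == o.text.trim()))
        .map(|o| o.text.as_str())
        .collect()
}
```

Because identity is the text itself, rewording an open item looks the same as deleting it, which is consistent with the warning being advisory rather than authoritative.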
Documentation:
- README: fix 'no hooks'/'no compaction' claims to match implementation
- README: fix tool-name separator from <server>.<tool> to server__tool
- README: fix 'text never reaches client' — agent emits agent_message_chunk
- README: fix stale method count (was 'six methods', now accurate)
- README/VISION_AGENT.md/Cargo.toml: remove all LOC claims (they drift)
- Fix misleading handoff log message ('-> 1 item' -> actual count)
Logging:
- Migrate both crates from eprintln!/custom log_*! macros to tracing
(consistent with the rest of the sprout codebase)
- Delete both log.rs files
- Init tracing-subscriber writing to stderr in both main() functions
When a session is cancelled, the agent now sends notifications/cancelled to in-flight MCP servers via rmcp's cancellable request API. sprout-dev-mcp observes the cancellation token and kills the running shell process group.
Agent side:
- do_call uses send_cancellable_request + select! cancel vs response
- fire_and_forget_cancel() helper sends notification without blocking
- execute_parallel closes semaphore on cancel (cooperative drain, 5s bound)
- Early borrow() check prevents missed cancellation on pre-cancelled receivers
- Typed AgentError::Cancelled variant (no string matching)
Server side:
- RequestContext<RoleServer>.ct threaded into shell::run
- Shell wait loop observes ct.cancelled() → SIGKILL process group + bounded reap
- PgidGuard only disarmed after successful reap
Tests:
- End-to-end: sleep 60 killed on cancel, PID verified dead via kill -0
- Protocol-level: fake_mcp receives notifications/cancelled with correct requestId
- fake_mcp refactored to channel-based reader thread for notification capture
- All waits use bounded polling (no fixed sleeps)
This was referenced May 28, 2026
What
Two new crates: a minimal, auditable coding agent and its MCP tool server.
sprout-agent speaks ACP over stdio, calls an LLM, executes MCP tools. Multiple concurrent sessions (configurable cap, default 8), each with its own MCP servers, history, and context. Internal context handoff when history fills. MCP-driven lifecycle hooks for task enforcement. Works with Zed, JetBrains, sprout-acp, or anything that speaks ACP.
sprout-dev-mcp is an MCP server providing `shell`, `str_replace`, and `todo` tools plus `_Stop` and `_PostCompact` lifecycle hooks. Ephemeral processes with process-group kill on every exit path. Bounded output. `rg` and `tree` on PATH (gitignore-aware, line counts). Works with any MCP client.
Together: ~4,200 lines of Rust purpose-built for headless autonomous coding work.
Why
See VISION_AGENT.md for the full rationale. The short version:
- `#![forbid(unsafe_code)]`, zero panics, bounded everything
Architecture
Key Features
Parallel Tool Calls
When the LLM returns multiple tool calls in one turn, they execute concurrently (default limit: 8). Semaphore-bounded JoinSet with cancel drain.
MCP Server Lifecycle
2-state machine (Healthy/Dead). Transport errors kill the process group and mark dead. Lazy restart with exponential backoff + jitter. Application-level errors returned to the LLM — server is healthy.
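The lifecycle above can be sketched as a transition function plus a delay calculation. The type names, the `attempts` counter, the "half fixed, half random" jitter scheme, and the millisecond parameters are all assumptions for illustration; the PR text only states two states, exponential backoff, and jitter.

```rust
/// Two-state lifecycle; `attempts` counts restarts spent so far (assumed field).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ServerState {
    Healthy,
    Dead { attempts: u32 },
}

enum CallError {
    /// The pipe broke or the call timed out: the server is unusable.
    Transport,
    /// The tool returned an error result: the server itself is fine.
    Application,
}

fn on_error(state: ServerState, err: CallError) -> ServerState {
    match (state, err) {
        // Application-level errors go back to the LLM; state is unchanged.
        (s, CallError::Application) => s,
        (ServerState::Healthy, CallError::Transport) => ServerState::Dead { attempts: 1 },
        (ServerState::Dead { attempts }, CallError::Transport) => ServerState::Dead {
            attempts: attempts.saturating_add(1),
        },
    }
}

/// Delay before restart number `attempts` (1-based), in milliseconds.
/// `rand` is any random u64 supplied by the caller.
fn backoff_ms(attempts: u32, base_ms: u64, cap_ms: u64, rand: u64) -> u64 {
    // Clamp the exponent so the shift cannot overflow.
    let exp = attempts.saturating_sub(1).min(20);
    let ceiling = base_ms.saturating_mul(1u64 << exp).min(cap_ms);
    // Half of the delay is fixed, the other half is random, so the result
    // always lies in [ceiling / 2, ceiling].
    let half = ceiling / 2;
    half + rand % (ceiling - half + 1)
}
```

Passing the random value in as a parameter keeps the function deterministic under test; a caller would draw it from whatever RNG the crate already uses.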
Context Handoff
When history exceeds 75% of budget, the agent summarizes and resets. Original task + `_PostCompact` hook state preserved across handoff.
Tree Shim (gitignore-aware, line counts)
`tree` is a PATH shim: it shows directory structure with line counts and respects .gitignore. Bounded output (2000 lines / 50KB).
Todo Tool (MCP-native)
The `todo` tool lives in sprout-dev-mcp as a regular MCP tool. Same CRUD interface as before (full-list replacement, max 50 items, validation). Additionally exposes `_Stop` (returns objection if open items exist) and `_PostCompact` (returns full list for re-injection after handoff).
MCP Lifecycle Hooks (_Stop, _PostCompact)
The agent has a generic hook system compatible with the Open Plugin Spec. Any MCP server can participate in agent lifecycle events by exposing tools prefixed with `_`:
- `_Stop`: called before honoring `end_turn`. If the hook returns non-empty text (an objection), the agent injects it as a tool result and continues. The todo server uses this to enforce task completion.
- `_PostCompact`: called after context handoff/compaction. The hook response is injected into the fresh context so MCP servers can re-establish state visibility.
Hooks are:
- Tools prefixed with `_` are filtered from the tool list and rejected if the LLM tries to call them directly
- `MCP_HOOK_SERVERS` env var controls which servers get hook access (unset = no hooks, `*` = all, `dev,policy` = named list)
Safety Properties
Every input is bounded. Every exit path kills the process group. Every truncation is marked.