feat: replace subagent and skills with unified summon extension#6964
feat: replace subagent and skills with unified summon extension#6964tlongwell-block merged 37 commits intomainfrom
Conversation
Implements unified 'summon' platform extension with two intuitive tools:
- load: Inject knowledge into current context ("teach me this")
- delegate: Run tasks in isolated subagents ("do this for me")
Key features:
- Source discovery from recipes, skills, agents with priority ordering
- Skill/agent frontmatter parsing with Claude model shorthand translation
- Sync and async delegation with background task tracking
- MOIM status reporting for background tasks with rounded durations
- Nested delegation prevention (subagents cannot spawn subagents)
- 60s TTL caching for source discovery
Removes:
- subagent_tool.rs (523 LOC)
- skills_extension.rs (865 LOC)
- builtin_skills/ directory
Adds:
- summon_extension.rs (1834 LOC) - consolidated implementation
The summon extension provides a cleaner mental model while maintaining
all existing functionality. Builtin skills are now inlined in the
extension.
Ref: GitHub Discussion #6202
- Skill-only delegation now sets prompt: 'Apply the skill knowledge to produce a useful result.' - Agent-only delegation now sets prompt: 'Proceed with your expertise to produce a useful result.' - This prevents subagents from receiving the meaningless 'Begin.' prompt Also includes: - Agent model override now correctly applied from recipe.settings - Model/provider/temperature precedence: params > recipe.settings > session - DEFAULT_SUBAGENT_MAX_TURNS updated to 50 for consistency Crossfire review scores: - Default subagent: 8.5/10 - goose-gpt-5-2: 8/10 No blocking issues found. Ready to merge.
Implement meta-based tool ownership to enable platform extensions to expose tools without the extension__ prefix. Changes: - Add unprefixed_tools flag to PlatformExtensionDef (enabled for summon) - Embed goose_extension in Tool.meta for all tools during fetch - Add resolve_tool() helper using meta-based ownership lookup - Simplify dispatch_tool_call() to use unified resolution path - Update filter_tools() and reply_parts.rs to use meta ownership - Add collision detection for duplicate tool names - Remove unused get_client_for_tool() method This allows the summon extension's load and delegate tools to appear without prefix while maintaining correct dispatch and filtering.
Clean up implementation by removing explanatory inline comments. The code is self-documenting through clear naming and structure. Removed ~55 comments including: - Implementation detail explanations - Precedence order comments - Test section labels - Inline explanatory notes All doc comments (///) preserved for public API documentation.
Replace inline string literals with compile-time bundled .md files using include_dir!, matching the pattern used for system prompts. - Add builtin_skills module with include_dir! macro - Restore goose_doc_guide.md skill file from main - Update summon_extension to use builtin_skills::get_all() Benefits: - Easier to maintain with full editor support - Syntax highlighting for markdown content - Simple to add new skills (just create .md files) - Consistent with prompts/ directory pattern
- Add prefix-based fallback in resolve_tool() to handle prefixed tool calls without triggering tools/list, fixing MCP replay test failures - Set unprefixed_tools: true for code_execution extension to maintain backward compatibility with LLMs calling execute_code without prefix - Refactor subagent_handler with OnMessageCallback type and run_subagent_task_with_callback() to eliminate code duplication - Update summon_extension to use shared subagent infrastructure - Fix CLI output.rs to recognize unprefixed tool names (execute_code, delegate) for proper rendering All 548 lib tests and 4 MCP integration tests pass.
- Replace subagent references with delegate tool tests - Add load tool testing (discovery, builtin skills, knowledge injection) - Add async delegate test with MOIM monitoring - Add nested delegation prevention test (critical security) - Add source-based delegate test - Rename test_phases parameter value from 'subagents' to 'delegation' - Clean up trailing whitespace throughout file
When async delegates complete, their results are now persisted in
completed_tasks instead of being discarded. The agent can retrieve
completed task outputs by calling load(task_id).
Changes:
- Add CompletedTask struct to store task results with metadata
- Add completed_tasks field to SummonClient
- Modify cleanup_completed_tasks() to move finished tasks to completed_tasks
- Add handle_load_task_result() to retrieve and format task output
- Update handle_load() to check for task_ prefix and call cleanup first
- Update get_moim() to show completed tasks with retrieval hints
- Update handle_load_discovery() to list completed tasks awaiting retrieval
- Add comprehensive test_async_task_result_lifecycle test
MOIM now shows:
- Running tasks with sleep hint
- Completed tasks with 'use load("task_id") to get result' hint
Implements TASK-09A from #6202
Review feedback: - Remove doc comments that restate what code does (extension.rs, extension_manager.rs, subagent_handler.rs, summon_extension.rs) - Consolidate add_local_*, add_global_*, add_recipe_path_* functions into a single discover_filesystem_sources() method that builds directory lists declaratively and iterates through them The source discovery consolidation reduces ~80 LOC by eliminating 8 separate add_* methods and replacing them with inline directory list construction using iterator chains.
…mon_extension Removed doc comments that merely restate what the code does: - extension_manager.rs: get_tool_owner, is_unprefixed_extension - summon_extension.rs: skill/agent directory comments, handle_load_source, get_task_description These comments added no value beyond what the function names and code already communicate clearly.
- Update delegate tool description: 'Parallel execution requires async: true' - Expand goose-self-test.yaml parallel delegate test to explicitly validate: - Sync delegates run sequentially even when called in same message - Async delegates run in parallel when called in same message - Test includes timestamp comparison to verify behavior This documents the expected behavior due to MCP protocol constraints where sync tool calls are serialized per extension connection.
Model names should be specified in full by the agent/recipe. The provider layer handles validation. Removes unnecessary maintenance burden of keeping shorthand mappings up to date.
Replace custom task_id generation with session ID (YYYYMMDD_N format). This simplifies the code and makes task IDs match session IDs for easier debugging and correlation.
Resolved conflicts: - Updated McpClientTrait::call_tool signature to include working_dir parameter - Re-exported SUBAGENT_TOOL_REQUEST_TYPE from agents module for goose-cli - Integrated origin/main's working_dir support with summon_extension's changes
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Add ability to cancel running background tasks via load(source: "task_id", cancel: true). - Store CancellationToken in BackgroundTask instead of creating orphan tokens - Add `cancel` parameter to load tool schema - Cancel triggers token, waits up to 5s for graceful shutdown, then aborts - Cancelled tasks return partial output with "⊘ Cancelled" status - Update MOIM hint to show cancel option when tasks are running - Add cancellation test to goose-self-test.yaml
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
- Add Drop impl to cancel background tasks on shutdown (best-effort via try_lock) - Propagate ModelConfig::new error instead of unwrap - Document unprefixed_tools field in PlatformExtensionDef
|
I haven't looked at the details here, but this feels like the right way to do this. |
* origin/main: fix: detect context length errors in GCP Vertex AI provider (#6976) Added the ability to escape Jinja variables in recipes to include them in prompts (#6975) Bug Fix: bump pctx (#6967) fix(acp): fixtures now raise content mismatch errors (#6912) custom provider form minor improvements (#6966) Fix 'Edit In Place' feature appending instead of editing messages (#6955) docs: change RPI slash commands (#6963)
Parallel async delegates were causing 'database is locked' errors due to concurrent write operations (create_session, add_message) competing for SQLite's single writer lock. Solution: Add a tokio::sync::Mutex to SessionStorage that serializes all write operations. This gracefully handles intra-process contention while SQLite's busy_timeout (5s) handles any cross-process scenarios. Protected methods: - create_session - add_message - apply_update - replace_conversation - delete_session - truncate_conversation - update_message_metadata
The load_callback_configs function was failing when encountering unprefixed tools (like summon's load/delegate) because it expected all tool names to contain '__'. Now uses get_tool_owner() to get the namespace for unprefixed tools, making them available in code execution mode as Summon.load() and Summon.delegate().
…ptions Two fixes for subrecipes support: 1. CLI now stores recipe on session after creation, allowing the summon extension to discover sub_recipes defined in the parent recipe. This matches the behavior of goose-server and scheduler. 2. Subrecipe descriptions now show available parameters by loading the actual recipe file and extracting parameter keys. Example output: 'Gather file statistics (params: directory, fast_mode)'
|
@tlongwell-block so I think it is good to revisit skills, my main Q is - for the case of just skills, will it still have thge lightweight approach of name/description pairs in system prompt (always) an then a simple tool to load it progressively? as that is key for skills to be useful, dont' want to confuse it with other things it could do (even claude is not doing a great job at skills now as it gets distracted) - that is my main feedback/query but otherwise I think this makes a lot of sense and probably makes it more efficient I would hope. |
Right now, the agent has to call load() to see the list of skills and a brief description. I was considering putting the list in the system prompt or the moim, but internally folks are just putting a 70+ skill repo in path, which... that's not ideal Hence this current setup. I'm not sure exactly the best approach |
|
With my setup, I've found the only consistent way to get it to use skills is to reference them in AGENTS.md or my global .goosehints |
The summon extension adds goose_extension metadata to tool _meta fields, which changes the SHA256 hash used for scenario test replay matching. Since _meta is internal routing metadata not seen by the LLM, strip it before hashing so recordings remain stable across metadata changes. Signed-off-by: Travis Longwell <travis@block.xyz>
* origin/main: (55 commits) test(mcp): add image tool test and consolidate MCP test fixtures (#7019) fix: remove Option from model listing return types, propagate errors (#7074) fix: lazy provider creation for goose acp (#7026) (#7066) Smoke tests: split compaction test and use debug build (#6984) fix(deps): trim bat to resolve RUSTSEC-2024-0320 (#7061) feat: expose AGENT_SESSION_ID env var to extension child processes (#7072) fix: add XML tool call parsing fallback for Qwen3-coder via Ollama (#6882) Remove clippy too_many_lines lint and decompose long functions (#7064) refactor: move disable_session_naming into AgentConfig (#7062) Add global config switch to disable automatic session naming (#7052) docs: add blog post - 8 Things You Didn't Know About Code Mode (#7059) fix: ensure animated elements are visible when prefers-reduced-motion is enabled (#7047) Show recommended model on failture (#7040) feat(ui): add session content search via API (#7050) docs: fix img url (#7053) Desktop UI for deleting custom providers (#7042) Add blog post: How I Used RPI to Build an OpenClaw Alternative (#7051) Remove build-dependencies section from Cargo.toml (#6946) add /rp-why skill blog post (#6997) fix: fix snake_case function names in code_execution instructions (#7035) ... # Conflicts: # scripts/test_subrecipes.sh
After merging main, AgentConfig::new gained a 5th parameter (disable_session_naming: bool). Update both call sites in summon_extension.rs to pass true for subagents. Signed-off-by: Travis Longwell <travis@block.xyz>
gpt-3.5-turbo consistently fails tool-calling smoke tests. It's an older model with known flaky tool-calling behavior. Add it to ALLOWED_FAILURES so it still runs and reports but doesn't block PRs. Signed-off-by: Travis Longwell <travis@block.xyz>
jamadeo
left a comment
There was a problem hiding this comment.
I mostly didn't look at the moved code, assuming that's unchanged for the most part. Looks overall good to me, with a few suggestions and questions. I would prefer not to merge until learning more about the db lock though as the last time we thought we needed that, it turned out to be a different issue.
* origin/main: Docs: require auth optional for custom providers (#7098) fix: improve text-muted contrast for better readability (#7095) Always sync bundled extensions (#7057) feat: Add tom (Top Of Mind) platform extension (#7073) chore(docs): update GOOSE_SESSION_ID -> AGENT_SESSION_ID (#6669) fix(ci): switch from cargo-audit to cargo-deny for advisory scanning (#7032) chore(deps): bump @isaacs/brace-expansion from 5.0.0 to 5.0.1 in /evals/open-model-gym/suite (#7085) chore(deps): bump @modelcontextprotocol/sdk from 1.25.3 to 1.26.0 in /evals/open-model-gym/mcp-harness (#7086) fix: switch to windows msvc (#7080) fix: allow unlisted models for CLI providers (#7090) Use goose port (#7089) chore: strip posthog for sessions/models/daily only (#7079) tidy: clean up old benchmark and add gym (#7081) fix: use command.process_group(0) for CLI providers, not just MCP (#7083) added build notify (#6891)
…vel write_lock Remove the tokio::sync::Mutex write_lock from SessionStorage and increase SQLite's busy_timeout from 5s to 30s. Under heavy concurrent subagent load, multiple connections competing for SQLite's single-writer lock can exceed the 5s retry window, causing 'database is locked' errors. A 30s timeout absorbs this contention reliably (tested up to 1000 concurrent agents). SQLite's WAL mode already serializes writers at the database level, so the application-level Mutex was redundant — it only protected intra-process callers and added unnecessary serialization overhead. Letting SQLite handle contention via its native busy-retry mechanism is simpler and sufficient.
* origin/main: feat: add AGENT=goose environment variable for cross-tool compatibility (#7017) fix: strip empty extensions array when deeplink also (#7096) [docs] update authors.yaml file (#7114) Implement manpage generation for goose-cli (#6980) docs: tool output optimization (#7109) Fix duplicated output in Code Mode by filtering content by audience (#7117) Enable tom (Top Of Mind) platform extension by default (#7111) chore: added notification for canary build failure (#7106) fix: fix windows bundle random failure and optimise canary build (#7105) feat(acp): add model selection support for session/new and session/set_model (#7112) fix: isolate claude-code sessions via stream-json session_id (#7108) ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088) docs: codex subscription support (#7104) chore: add a new scenario (#7107) fix: Goose Desktop missing Calendar and Reminders entitlements (#7100) Fix 'Edit In Place' and 'Fork Session' features (#6970) Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082) # Conflicts: # crates/goose/src/agents/extension.rs
Replace bare "goose_extension" string literals with a TOOL_EXTENSION_META_KEY constant so the key is defined once and referenced everywhere. Addresses review feedback on PR #6964.
* origin/main: (30 commits) docs: GCP Vertex AI org policy filtering & update OnboardingProviderSetup component (#7125) feat: replace subagent and skills with unified summon extension (#6964) feat: add AGENT=goose environment variable for cross-tool compatibility (#7017) fix: strip empty extensions array when deeplink also (#7096) [docs] update authors.yaml file (#7114) Implement manpage generation for goose-cli (#6980) docs: tool output optimization (#7109) Fix duplicated output in Code Mode by filtering content by audience (#7117) Enable tom (Top Of Mind) platform extension by default (#7111) chore: added notification for canary build failure (#7106) fix: fix windows bundle random failure and optimise canary build (#7105) feat(acp): add model selection support for session/new and session/set_model (#7112) fix: isolate claude-code sessions via stream-json session_id (#7108) ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088) docs: codex subscription support (#7104) chore: add a new scenario (#7107) fix: Goose Desktop missing Calendar and Reminders entitlements (#7100) Fix 'Edit In Place' and 'Fork Session' features (#6970) Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082) Docs: require auth optional for custom providers (#7098) ...
* origin/main: (107 commits) feat: Allow overriding default bat themes using environment variables (#7140) Make the system prompt smaller (#6991) Pre release script (#7145) Spelling (#7137) feat(mcp): upgrade rmcp to 0.15.0 and advertise MCP Apps UI extension capability (#6927) fix: ensure assistant messages with tool_calls include content field (#7076) fix(canonical): handle gcp_vertex_ai model mapping correctly (#6836) Group dependencies in root Cargo.toml (#6948) refactor: updated elevenLabs API module and `remove button` UX (#6781) fix: we were missing content from langfuse traces (#7135) docs: update username in authors.yml (#7132) fix extension selector syncing issues (#7133) fix(acp): per-session Agent for model isolation and load_session restore (#7115) fix(claude-code): defensive coding improvements for model switching (#7131) feat(claude-code): dynamic model listing and mid-session model switching (#7120) Inline worklet source (#7128) [docs] One shot prompting is dead - Blog Post (#7113) fix: correct spelling of Debbie O'Brien's name in authors.yml (#7127) docs: GCP Vertex AI org policy filtering & update OnboardingProviderSetup component (#7125) feat: replace subagent and skills with unified summon extension (#6964) ... # Conflicts: # Cargo.lock # Cargo.toml
…k#6964) Signed-off-by: Travis Longwell <travis@block.xyz>
…k#6964) Signed-off-by: Travis Longwell <travis@block.xyz>
…k#6964) Signed-off-by: Travis Longwell <travis@block.xyz>
Introduces the
summonextension to unify goose's delegation and knowledge-loading capabilities.New Tools
loadload()lists all,load(source: "rust-patterns")loads contentdelegatedelegate(instructions: "...")ordelegate(source: "review", async: true)Features
extension__prefixRemoved
subagent_tool.rs,skills_extension.rs,builtin_skills/- replaced by summonCloses #6202