
feat: replace subagent and skills with unified summon extension #6964

Merged
tlongwell-block merged 37 commits into main from summon_extension
Feb 10, 2026

Conversation

@tlongwell-block
Collaborator

@tlongwell-block tlongwell-block commented Feb 4, 2026

Introduces the summon extension to unify goose's delegation and knowledge-loading capabilities.

New Tools

| Tool | Purpose | Example |
|------|---------|---------|
| load | Discover sources or inject knowledge | load() lists all, load(source: "rust-patterns") loads content |
| delegate | Run task in isolated subagent | delegate(instructions: "...") or delegate(source: "review", async: true) |

Features

  • Source discovery: Recipes, skills, agents from local/global paths with priority resolution
  • Async execution: Background tasks with turn tracking and MOIM status reporting
  • Unprefixed tools: Cleaner tool names without extension__ prefix
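
One way to picture the "priority resolution" in source discovery is first-seen-wins deduplication by name. The sketch below is illustrative only: the `Source` struct, the `resolve_by_priority` helper, and the assumption that local sources are listed ahead of global ones are hypothetical and not taken from the PR.

```rust
use std::collections::HashSet;

/// Hypothetical discovered source: a name plus where it was found.
#[derive(Debug, Clone, PartialEq)]
struct Source {
    name: String,
    origin: &'static str,
}

/// Keeps the first occurrence of each name. The caller is expected to pass
/// candidates ordered from highest to lowest priority (an assumption here).
fn resolve_by_priority(candidates: Vec<Source>) -> Vec<Source> {
    let mut seen = HashSet::new();
    candidates
        .into_iter()
        // `insert` returns false when the name was already present,
        // so lower-priority duplicates are filtered out.
        .filter(|source| seen.insert(source.name.clone()))
        .collect()
}
```

With a local and a global source both named "review", only the one listed first survives, so ordering the input is what encodes the priority.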

Removed

  • subagent_tool.rs, skills_extension.rs, builtin_skills/ - replaced by summon

Closes #6202

Implements unified 'summon' platform extension with two intuitive tools:
- load: Inject knowledge into current context ("teach me this")
- delegate: Run tasks in isolated subagents ("do this for me")

Key features:
- Source discovery from recipes, skills, agents with priority ordering
- Skill/agent frontmatter parsing with Claude model shorthand translation
- Sync and async delegation with background task tracking
- MOIM status reporting for background tasks with rounded durations
- Nested delegation prevention (subagents cannot spawn subagents)
- 60s TTL caching for source discovery
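
The 60s TTL cache for source discovery could look roughly like the following. This is a minimal sketch, not the extension's code: `TtlCache` and `get_or_refresh` are hypothetical names, and the clock is passed in explicitly only to keep the example easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical single-entry cache that serves a stored value until it is
/// older than `ttl`, then recomputes it.
struct TtlCache<T> {
    ttl: Duration,
    entry: Option<(Instant, T)>,
}

impl<T: Clone> TtlCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entry: None }
    }

    /// Returns the cached value if it is still fresh at `now`; otherwise
    /// calls `refresh`, stores the result with the `now` timestamp, and
    /// returns it.
    fn get_or_refresh(&mut self, now: Instant, refresh: impl FnOnce() -> T) -> T {
        if let Some((stored_at, value)) = &self.entry {
            if now.duration_since(*stored_at) < self.ttl {
                return value.clone();
            }
        }
        let value = refresh();
        self.entry = Some((now, value.clone()));
        value
    }
}
```

Constructed with `Duration::from_secs(60)`, a lookup 30 seconds after the first one reuses the stored list, while a lookup 61 seconds later triggers a fresh discovery pass.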

Removes:
- subagent_tool.rs (523 LOC)
- skills_extension.rs (865 LOC)
- builtin_skills/ directory

Adds:
- summon_extension.rs (1834 LOC) - consolidated implementation

The summon extension provides a cleaner mental model while maintaining
all existing functionality. Builtin skills are now inlined in the
extension.

Ref: GitHub Discussion #6202

- Skill-only delegation now sets prompt: 'Apply the skill knowledge to produce a useful result.'
- Agent-only delegation now sets prompt: 'Proceed with your expertise to produce a useful result.'
- This prevents subagents from receiving the meaningless 'Begin.' prompt

Also includes:
- Agent model override now correctly applied from recipe.settings
- Model/provider/temperature precedence: params > recipe.settings > session
- DEFAULT_SUBAGENT_MAX_TURNS updated to 50 for consistency
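
The precedence rule (params > recipe.settings > session) amounts to a per-field fallback chain. The sketch below assumes a hypothetical `ModelSettings` struct with optional fields; the real types in goose may be shaped differently.

```rust
/// Hypothetical bundle of optional overrides; each layer may leave any
/// field unset.
#[derive(Debug, Clone, Default, PartialEq)]
struct ModelSettings {
    model: Option<String>,
    provider: Option<String>,
    temperature: Option<f32>,
}

/// Resolves each field independently: delegate params win, then the
/// recipe's settings, then the parent session's values.
fn resolve_settings(
    params: &ModelSettings,
    recipe: &ModelSettings,
    session: &ModelSettings,
) -> ModelSettings {
    ModelSettings {
        model: params
            .model
            .clone()
            .or_else(|| recipe.model.clone())
            .or_else(|| session.model.clone()),
        provider: params
            .provider
            .clone()
            .or_else(|| recipe.provider.clone())
            .or_else(|| session.provider.clone()),
        temperature: params
            .temperature
            .or(recipe.temperature)
            .or(session.temperature),
    }
}
```

Resolving per field means a delegate call can override only the model while still inheriting the provider from the recipe and the temperature from the session.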

Crossfire review scores:
- Default subagent: 8.5/10
- goose-gpt-5-2: 8/10

No blocking issues found. Ready to merge.

Implement meta-based tool ownership to enable platform extensions to
expose tools without the extension__ prefix.

Changes:
- Add unprefixed_tools flag to PlatformExtensionDef (enabled for summon)
- Embed goose_extension in Tool.meta for all tools during fetch
- Add resolve_tool() helper using meta-based ownership lookup
- Simplify dispatch_tool_call() to use unified resolution path
- Update filter_tools() and reply_parts.rs to use meta ownership
- Add collision detection for duplicate tool names
- Remove unused get_client_for_tool() method

This allows the summon extension's load and delegate tools to appear
without prefix while maintaining correct dispatch and filtering.
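
As a rough illustration of meta-based ownership with collision detection, the sketch below builds a name-to-owner index from fetched tools. `ToolInfo` and `build_owner_index` are hypothetical stand-ins, and treating a collision as a hard error is an assumption; the PR only says that collisions are detected.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a fetched tool: `owner` plays the role of the
/// `goose_extension` value embedded in `Tool.meta`.
struct ToolInfo {
    name: String,
    owner: String,
}

/// Builds a tool-name -> owning-extension index. A name claimed by two
/// different extensions is reported as an error (assumed behaviour).
fn build_owner_index(tools: &[ToolInfo]) -> Result<HashMap<String, String>, String> {
    let mut index: HashMap<String, String> = HashMap::new();
    for tool in tools {
        if let Some(existing) = index.get(&tool.name) {
            if existing != &tool.owner {
                return Err(format!(
                    "tool '{}' is exposed by both '{}' and '{}'",
                    tool.name, existing, tool.owner
                ));
            }
        }
        index.insert(tool.name.clone(), tool.owner.clone());
    }
    Ok(index)
}
```

Because ownership travels with the tool rather than being parsed out of its name, unprefixed names such as load and delegate can still be routed to the right extension.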
Clean up implementation by removing explanatory inline comments.
The code is self-documenting through clear naming and structure.

Removed ~55 comments including:
- Implementation detail explanations
- Precedence order comments
- Test section labels
- Inline explanatory notes

All doc comments (///) preserved for public API documentation.

Replace inline string literals with compile-time bundled .md files
using include_dir!, matching the pattern used for system prompts.

- Add builtin_skills module with include_dir! macro
- Restore goose_doc_guide.md skill file from main
- Update summon_extension to use builtin_skills::get_all()

Benefits:
- Easier to maintain with full editor support
- Syntax highlighting for markdown content
- Simple to add new skills (just create .md files)
- Consistent with prompts/ directory pattern

- Add prefix-based fallback in resolve_tool() to handle prefixed tool
  calls without triggering tools/list, fixing MCP replay test failures
- Set unprefixed_tools: true for code_execution extension to maintain
  backward compatibility with LLMs calling execute_code without prefix
- Refactor subagent_handler with OnMessageCallback type and
  run_subagent_task_with_callback() to eliminate code duplication
- Update summon_extension to use shared subagent infrastructure
- Fix CLI output.rs to recognize unprefixed tool names (execute_code,
  delegate) for proper rendering

All 548 lib tests and 4 MCP integration tests pass.
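
The prefix-based fallback can be sketched as a two-step lookup: try the ownership index first, then fall back to splitting on `__`. This is a simplified, hypothetical version of `resolve_tool()`; the real function's signature and return type are not shown in this PR text, and splitting at the first `__` is an assumption.

```rust
use std::collections::HashMap;

/// Returns `(extension, tool_name)` for a requested tool.
/// 1. An exact match in the owner index handles unprefixed tools.
/// 2. Otherwise a name like `developer__shell` is split at the first `__`,
///    which needs no tool listing at all.
fn resolve_tool<'a>(
    name: &'a str,
    owners: &'a HashMap<String, String>,
) -> Option<(&'a str, &'a str)> {
    if let Some(owner) = owners.get(name) {
        return Some((owner.as_str(), name));
    }
    name.split_once("__")
        .filter(|(ext, tool)| !ext.is_empty() && !tool.is_empty())
}
```

The fallback path never consults the index, which matches the stated goal of handling prefixed calls without triggering tools/list.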
- Replace subagent references with delegate tool tests
- Add load tool testing (discovery, builtin skills, knowledge injection)
- Add async delegate test with MOIM monitoring
- Add nested delegation prevention test (critical security)
- Add source-based delegate test
- Rename test_phases parameter value from 'subagents' to 'delegation'
- Clean up trailing whitespace throughout file

When async delegates complete, their results are now persisted in
completed_tasks instead of being discarded. The agent can retrieve
completed task outputs by calling load(task_id).

Changes:
- Add CompletedTask struct to store task results with metadata
- Add completed_tasks field to SummonClient
- Modify cleanup_completed_tasks() to move finished tasks to completed_tasks
- Add handle_load_task_result() to retrieve and format task output
- Update handle_load() to check for task_ prefix and call cleanup first
- Update get_moim() to show completed tasks with retrieval hints
- Update handle_load_discovery() to list completed tasks awaiting retrieval
- Add comprehensive test_async_task_result_lifecycle test

MOIM now shows:
- Running tasks with sleep hint
- Completed tasks with 'use load("task_id") to get result' hint

Implements TASK-09A from #6202
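
The lifecycle described above (running task, then completed task, then retrieval through load) might be modelled as below. All names are hypothetical simplifications of `SummonClient`, `CompletedTask`, and `cleanup_completed_tasks()`, and removing a result once it has been retrieved is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical running task; `output` becomes `Some` once it finishes.
struct RunningTask {
    description: String,
    output: Option<String>,
}

/// Result kept around until the agent asks for it.
struct CompletedTask {
    description: String,
    output: String,
}

#[derive(Default)]
struct TaskTable {
    running: HashMap<String, RunningTask>,
    completed: HashMap<String, CompletedTask>,
}

impl TaskTable {
    /// Moves finished tasks into `completed` instead of discarding them.
    fn cleanup_completed(&mut self) {
        let finished: Vec<String> = self
            .running
            .iter()
            .filter(|(_, task)| task.output.is_some())
            .map(|(id, _)| id.clone())
            .collect();
        for id in finished {
            if let Some(task) = self.running.remove(&id) {
                if let Some(output) = task.output {
                    let done = CompletedTask { description: task.description, output };
                    self.completed.insert(id, done);
                }
            }
        }
    }

    /// Runs cleanup first, then hands out (and forgets) the stored result.
    fn take_result(&mut self, id: &str) -> Option<CompletedTask> {
        self.cleanup_completed();
        self.completed.remove(id)
    }
}
```

Calling cleanup before the lookup mirrors the note that handle_load() calls cleanup first, so a task that finished moments ago is already retrievable.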
Review feedback:
- Remove doc comments that restate what code does (extension.rs,
  extension_manager.rs, subagent_handler.rs, summon_extension.rs)
- Consolidate add_local_*, add_global_*, add_recipe_path_* functions
  into a single discover_filesystem_sources() method that builds
  directory lists declaratively and iterates through them

The source discovery consolidation reduces ~80 LOC by eliminating
8 separate add_* methods and replacing them with inline directory
list construction using iterator chains.
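
A declarative directory list built with iterator chains could look like this. The directory names, their ordering, and the `candidate_dirs` helper are hypothetical; only the overall shape (one list, one pass, instead of eight add_* methods) comes from the commit message.

```rust
use std::path::{Path, PathBuf};

// Hypothetical relative locations; the real extension may use other paths.
const LOCAL_SUBDIRS: [&str; 2] = [".goose/skills", ".goose/agents"];
const GLOBAL_SUBDIRS: [&str; 2] = [".config/goose/skills", ".config/goose/agents"];

/// Builds the ordered list of directories to scan: local first, then any
/// explicit recipe paths, then global (the ordering is an assumption).
fn candidate_dirs(
    working_dir: &Path,
    home_dir: &Path,
    recipe_paths: &[PathBuf],
) -> Vec<PathBuf> {
    let local = LOCAL_SUBDIRS.iter().map(|rel| working_dir.join(*rel));
    let recipes = recipe_paths.iter().cloned();
    let global = GLOBAL_SUBDIRS.iter().map(|rel| home_dir.join(*rel));
    local.chain(recipes).chain(global).collect()
}
```

A single discovery method can then iterate over the returned list, so adding a new location means adding one entry rather than one more method.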
…mon_extension

Removed doc comments that merely restate what the code does:
- extension_manager.rs: get_tool_owner, is_unprefixed_extension
- summon_extension.rs: skill/agent directory comments, handle_load_source,
  get_task_description

These comments added no value beyond what the function names and code
already communicate clearly.

- Update delegate tool description: 'Parallel execution requires async: true'
- Expand goose-self-test.yaml parallel delegate test to explicitly validate:
  - Sync delegates run sequentially even when called in same message
  - Async delegates run in parallel when called in same message
  - Test includes timestamp comparison to verify behavior

This documents the expected behavior due to MCP protocol constraints where
sync tool calls are serialized per extension connection.

Model names should be specified in full by the agent/recipe.
The provider layer handles validation. Removes unnecessary
maintenance burden of keeping shorthand mappings up to date.

Replace custom task_id generation with session ID (YYYYMMDD_N format).
This simplifies the code and makes task IDs match session IDs for
easier debugging and correlation.

Resolved conflicts:
- Updated McpClientTrait::call_tool signature to include working_dir parameter
- Re-exported SUBAGENT_TOOL_REQUEST_TYPE from agents module for goose-cli
- Integrated origin/main's working_dir support with summon_extension's changes
@tlongwell-block

This comment was marked as resolved.

@github-actions

This comment was marked as resolved.

Add ability to cancel running background tasks via load(source: "task_id", cancel: true).

- Store CancellationToken in BackgroundTask instead of creating orphan tokens
- Add `cancel` parameter to load tool schema
- Cancel triggers token, waits up to 5s for graceful shutdown, then aborts
- Cancelled tasks return partial output with "⊘ Cancelled" status
- Update MOIM hint to show cancel option when tasks are running
- Add cancellation test to goose-self-test.yaml
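
The cancel flow (signal the task, wait for a grace period, return partial output) can be approximated with the standard library alone. This sketch deliberately swaps the async CancellationToken for an `AtomicBool` and a thread, so it is an analogy rather than the PR's implementation; a plain thread also cannot be force-aborted, so the final "then aborts" step is only represented by the `None` return.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical handle: the cancel flag is stored with the task so that the
/// same flag the worker checks is the one that gets triggered.
struct BackgroundTask {
    cancel: Arc<AtomicBool>,
    done: mpsc::Receiver<String>,
}

/// Spawns a worker that appends one line per step and stops early when the
/// cancel flag is set, sending whatever output it has so far.
fn spawn_task(steps: u32) -> BackgroundTask {
    let cancel = Arc::new(AtomicBool::new(false));
    let (tx, rx) = mpsc::channel();
    let flag = Arc::clone(&cancel);
    thread::spawn(move || {
        let mut output = String::new();
        for i in 0..steps {
            if flag.load(Ordering::SeqCst) {
                break;
            }
            output.push_str(&format!("step {i}\n"));
            thread::sleep(Duration::from_millis(10));
        }
        let _ = tx.send(output);
    });
    BackgroundTask { cancel, done: rx }
}

/// Triggers cancellation and waits up to `grace` for the partial output.
/// `None` means the worker did not stop in time.
fn cancel_task(task: BackgroundTask, grace: Duration) -> Option<String> {
    task.cancel.store(true, Ordering::SeqCst);
    task.done.recv_timeout(grace).ok()
}
```

Keeping the flag inside the task handle is the point of the first bullet above: a token created elsewhere and never shared with the worker cannot stop it.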
@tlongwell-block

This comment was marked as resolved.

@github-actions

This comment was marked as resolved.

- Add Drop impl to cancel background tasks on shutdown (best-effort via try_lock)
- Propagate ModelConfig::new error instead of unwrap
- Document unprefixed_tools field in PlatformExtensionDef
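
A best-effort `Drop` using `try_lock` might look like the following. It uses `std::sync::Mutex` and `AtomicBool` flags as stand-ins for the extension's async types, and `Client` is a hypothetical name; the idea shown is only that `drop` must never block waiting for a lock.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical client holding one cancel flag per background task.
struct Client {
    tasks: Arc<Mutex<HashMap<String, Arc<AtomicBool>>>>,
}

impl Drop for Client {
    fn drop(&mut self) {
        // Best effort: if another holder has the lock right now, skip
        // cancellation rather than block (or deadlock) during shutdown.
        if let Ok(tasks) = self.tasks.try_lock() {
            for flag in tasks.values() {
                flag.store(true, Ordering::SeqCst);
            }
        }
    }
}
```

The trade-off is that a task can be left running if the lock happens to be held at shutdown, which is why the commit message calls this best-effort.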
@DOsinga
Collaborator

DOsinga commented Feb 5, 2026

I haven't looked at the details here, but this feels like the right way to do this.

@block block deleted a comment from github-actions bot Feb 5, 2026
* origin/main:
  fix: detect context length errors in GCP Vertex AI provider (#6976)
  Added the ability to escape Jinja variables in recipes to include them in prompts (#6975)
  Bug Fix: bump pctx (#6967)
  fix(acp): fixtures now raise content mismatch errors (#6912)
  custom provider form minor improvements (#6966)
  Fix 'Edit In Place' feature appending instead of editing messages (#6955)
  docs: change RPI slash commands (#6963)

Parallel async delegates were causing 'database is locked' errors due to
concurrent write operations (create_session, add_message) competing for
SQLite's single writer lock.

Solution: Add a tokio::sync::Mutex to SessionStorage that serializes all
write operations. This gracefully handles intra-process contention while
SQLite's busy_timeout (5s) handles any cross-process scenarios.

Protected methods:
- create_session
- add_message
- apply_update
- replace_conversation
- delete_session
- truncate_conversation
- update_message_metadata

The load_callback_configs function was failing when encountering
unprefixed tools (like summon's load/delegate) because it expected
all tool names to contain '__'. Now uses get_tool_owner() to get
the namespace for unprefixed tools, making them available in code
execution mode as Summon.load() and Summon.delegate().
…ptions

Two fixes for subrecipes support:

1. CLI now stores recipe on session after creation, allowing the summon
   extension to discover sub_recipes defined in the parent recipe. This
   matches the behavior of goose-server and scheduler.

2. Subrecipe descriptions now show available parameters by loading the
   actual recipe file and extracting parameter keys. Example output:
   'Gather file statistics (params: directory, fast_mode)'
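
The description format in the second fix can be reproduced with a small helper. `describe_subrecipe` is a hypothetical name, and leaving the description unchanged when there are no parameters is an assumption.

```rust
/// Appends the parameter keys to a subrecipe description, in the
/// "(params: a, b)" form shown in the example output above.
fn describe_subrecipe(description: &str, param_keys: &[&str]) -> String {
    if param_keys.is_empty() {
        // Assumed behaviour: no suffix when the recipe takes no parameters.
        description.to_string()
    } else {
        format!("{description} (params: {})", param_keys.join(", "))
    }
}
```

For example, `describe_subrecipe("Gather file statistics", &["directory", "fast_mode"])` yields the string quoted in the commit message.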
@michaelneale
Collaborator

@tlongwell-block so I think it is good to revisit skills. My main question: for the case of just skills, will it still have the lightweight approach of name/description pairs in the system prompt (always) and then a simple tool to load them progressively? That is key for skills to be useful; we don't want to confuse it with other things it could do (even Claude is not doing a great job at skills now, as it gets distracted). That is my main feedback/query, but otherwise I think this makes a lot of sense and probably makes it more efficient, I would hope.

@tlongwell-block
Collaborator Author

> @tlongwell-block so I think it is good to revisit skills. My main question: for the case of just skills, will it still have the lightweight approach of name/description pairs in the system prompt (always) and then a simple tool to load them progressively? That is key for skills to be useful; we don't want to confuse it with other things it could do (even Claude is not doing a great job at skills now, as it gets distracted). That is my main feedback/query, but otherwise I think this makes a lot of sense and probably makes it more efficient, I would hope.

Right now, the agent has to call load() to see the list of skills and a brief description. I was considering putting the list in the system prompt or the moim, but internally folks are just putting a 70+ skill repo in path, which... that's not ideal

Hence this current setup. I'm not sure exactly the best approach

@tlongwell-block
Copy link
Collaborator Author

tlongwell-block commented Feb 9, 2026

With my setup, I've found the only consistent way to get it to use skills is to reference them in AGENTS.md or my global .goosehints

The summon extension adds goose_extension metadata to tool _meta fields,
which changes the SHA256 hash used for scenario test replay matching.
Since _meta is internal routing metadata not seen by the LLM, strip it
before hashing so recordings remain stable across metadata changes.
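
Stripping routing metadata before hashing can be sketched as follows. The example substitutes the standard library's `DefaultHasher` for SHA256 and a hypothetical `ToolRecord` struct for the real tool type, so it shows the idea rather than the test harness's actual code.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical tool record; `meta` stands in for the `_meta` field that
/// carries internal routing data such as `goose_extension`.
#[derive(Clone, Hash)]
struct ToolRecord {
    name: String,
    description: String,
    meta: Option<BTreeMap<String, String>>,
}

/// Hashes only what the LLM sees: `meta` is cleared on a copy of each tool
/// before it is fed to the hasher (DefaultHasher here, not SHA256).
fn stable_hash(tools: &[ToolRecord]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for tool in tools {
        let mut stripped = tool.clone();
        stripped.meta = None;
        stripped.hash(&mut hasher);
    }
    hasher.finish()
}
```

Two tool lists that differ only in `meta` produce the same value, which is the property that keeps recorded scenarios matching after ownership metadata is added.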

Signed-off-by: Travis Longwell <travis@block.xyz>
* origin/main: (55 commits)
  test(mcp): add image tool test and consolidate MCP test fixtures (#7019)
  fix: remove Option from model listing return types, propagate errors (#7074)
  fix: lazy provider creation for goose acp (#7026) (#7066)
  Smoke tests: split compaction test and use debug build (#6984)
  fix(deps): trim bat to resolve RUSTSEC-2024-0320 (#7061)
  feat: expose AGENT_SESSION_ID env var to extension child processes (#7072)
  fix: add XML tool call parsing fallback for Qwen3-coder via Ollama (#6882)
  Remove clippy too_many_lines lint and decompose long functions (#7064)
  refactor: move disable_session_naming into AgentConfig (#7062)
  Add global config switch to disable automatic session naming (#7052)
  docs: add blog post - 8 Things You Didn't Know About Code Mode (#7059)
  fix: ensure animated elements are visible when prefers-reduced-motion is enabled (#7047)
  Show recommended model on failture (#7040)
  feat(ui): add session content search via API (#7050)
  docs: fix img url (#7053)
  Desktop UI for deleting custom providers (#7042)
  Add blog post: How I Used RPI to Build an OpenClaw Alternative (#7051)
  Remove build-dependencies section from Cargo.toml (#6946)
  add /rp-why skill blog post (#6997)
  fix: fix snake_case function names in code_execution instructions (#7035)
  ...

# Conflicts:
#	scripts/test_subrecipes.sh
After merging main, AgentConfig::new gained a 5th parameter
(disable_session_naming: bool). Update both call sites in
summon_extension.rs to pass true for subagents.

Signed-off-by: Travis Longwell <travis@block.xyz>

gpt-3.5-turbo consistently fails tool-calling smoke tests. It's an
older model with known flaky tool-calling behavior. Add it to
ALLOWED_FAILURES so it still runs and reports but doesn't block PRs.

Signed-off-by: Travis Longwell <travis@block.xyz>
Collaborator

@jamadeo jamadeo left a comment


I mostly didn't look at the moved code, assuming that's unchanged for the most part. Looks overall good to me, with a few suggestions and questions. I would prefer not to merge until learning more about the db lock though as the last time we thought we needed that, it turned out to be a different issue.

* origin/main:
  Docs: require auth optional for custom providers (#7098)
  fix: improve text-muted contrast for better readability (#7095)
  Always sync bundled extensions (#7057)
  feat: Add tom (Top Of Mind) platform extension (#7073)
  chore(docs): update GOOSE_SESSION_ID -> AGENT_SESSION_ID (#6669)
  fix(ci): switch from cargo-audit to cargo-deny for advisory scanning (#7032)
  chore(deps): bump @isaacs/brace-expansion from 5.0.0 to 5.0.1 in /evals/open-model-gym/suite (#7085)
  chore(deps): bump @modelcontextprotocol/sdk from 1.25.3 to 1.26.0 in /evals/open-model-gym/mcp-harness (#7086)
  fix: switch to windows msvc (#7080)
  fix: allow unlisted models for CLI providers (#7090)
  Use goose port (#7089)
  chore: strip posthog for sessions/models/daily only (#7079)
  tidy: clean up old benchmark and add gym (#7081)
  fix: use command.process_group(0) for CLI providers, not just MCP (#7083)
  added build notify (#6891)
…vel write_lock

Remove the tokio::sync::Mutex write_lock from SessionStorage and increase
SQLite's busy_timeout from 5s to 30s. Under heavy concurrent subagent load,
multiple connections competing for SQLite's single-writer lock can exceed the
5s retry window, causing 'database is locked' errors. A 30s timeout absorbs
this contention reliably (tested up to 1000 concurrent agents).

SQLite's WAL mode already serializes writers at the database level, so the
application-level Mutex was redundant — it only protected intra-process
callers and added unnecessary serialization overhead. Letting SQLite handle
contention via its native busy-retry mechanism is simpler and sufficient.
* origin/main:
  feat: add AGENT=goose environment variable for cross-tool compatibility (#7017)
  fix: strip empty extensions array when deeplink also (#7096)
  [docs] update authors.yaml file (#7114)
  Implement manpage generation for goose-cli (#6980)
  docs: tool output optimization (#7109)
  Fix duplicated output in Code Mode by filtering content by audience (#7117)
  Enable tom (Top Of Mind) platform extension by default (#7111)
  chore: added notification for canary build failure (#7106)
  fix: fix windows bundle random failure and optimise canary build (#7105)
  feat(acp): add model selection support for session/new and session/set_model (#7112)
  fix: isolate claude-code sessions via stream-json session_id (#7108)
  ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088)
  docs: codex subscription support (#7104)
  chore: add a new scenario (#7107)
  fix: Goose Desktop missing Calendar and Reminders entitlements (#7100)
  Fix 'Edit In Place' and 'Fork Session' features (#6970)
  Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082)

# Conflicts:
#	crates/goose/src/agents/extension.rs
Replace bare "goose_extension" string literals with a
TOOL_EXTENSION_META_KEY constant so the key is defined once
and referenced everywhere.

Addresses review feedback on PR #6964.
@tlongwell-block tlongwell-block added this pull request to the merge queue Feb 10, 2026
Merged via the queue into main with commit 7ea19f5 Feb 10, 2026
19 checks passed
@tlongwell-block tlongwell-block deleted the summon_extension branch February 10, 2026 19:23
jh-block added a commit that referenced this pull request Feb 10, 2026
* origin/main: (30 commits)
  docs: GCP Vertex AI org policy filtering & update OnboardingProviderSetup component (#7125)
  feat: replace subagent and skills with unified summon extension (#6964)
  feat: add AGENT=goose environment variable for cross-tool compatibility (#7017)
  fix: strip empty extensions array when deeplink also (#7096)
  [docs] update authors.yaml file (#7114)
  Implement manpage generation for goose-cli (#6980)
  docs: tool output optimization (#7109)
  Fix duplicated output in Code Mode by filtering content by audience (#7117)
  Enable tom (Top Of Mind) platform extension by default (#7111)
  chore: added notification for canary build failure (#7106)
  fix: fix windows bundle random failure and optimise canary build (#7105)
  feat(acp): add model selection support for session/new and session/set_model (#7112)
  fix: isolate claude-code sessions via stream-json session_id (#7108)
  ci: enable agentic provider live tests (claude-code, codex, gemini-cli) (#7088)
  docs: codex subscription support (#7104)
  chore: add a new scenario (#7107)
  fix: Goose Desktop missing Calendar and Reminders entitlements (#7100)
  Fix 'Edit In Place' and 'Fork Session' features (#6970)
  Fix: Only send command content to command injection classifier (excluding part of tool call dict) (#7082)
  Docs: require auth optional for custom providers (#7098)
  ...
tlongwell-block added a commit that referenced this pull request Feb 11, 2026
* origin/main: (107 commits)
  feat: Allow overriding default bat themes using environment variables (#7140)
  Make the system prompt smaller (#6991)
  Pre release script (#7145)
  Spelling (#7137)
  feat(mcp): upgrade rmcp to 0.15.0 and advertise MCP Apps UI extension capability (#6927)
  fix: ensure assistant messages with tool_calls include content field (#7076)
  fix(canonical): handle gcp_vertex_ai model mapping correctly (#6836)
  Group dependencies in root Cargo.toml (#6948)
  refactor: updated elevenLabs API module and `remove button` UX (#6781)
  fix: we were missing content from langfuse traces (#7135)
  docs: update username in authors.yml (#7132)
  fix extension selector syncing issues (#7133)
  fix(acp): per-session Agent for model isolation and load_session restore (#7115)
  fix(claude-code): defensive coding improvements for model switching (#7131)
  feat(claude-code): dynamic model listing and mid-session model switching (#7120)
  Inline worklet source (#7128)
  [docs] One shot prompting is dead - Blog Post (#7113)
  fix: correct spelling of Debbie O'Brien's name in authors.yml (#7127)
  docs: GCP Vertex AI org policy filtering & update OnboardingProviderSetup component (#7125)
  feat: replace subagent and skills with unified summon extension (#6964)
  ...

# Conflicts:
#	Cargo.lock
#	Cargo.toml
Tyler-Hardin pushed a commit to Tyler-Hardin/goose that referenced this pull request Feb 11, 2026
Tyler-Hardin pushed a commit to Tyler-Hardin/goose that referenced this pull request Feb 11, 2026
Tyler-Hardin pushed a commit to Tyler-Hardin/goose that referenced this pull request Feb 11, 2026
4 participants