feat: add task tool for spawning subagents by evansenter · Pull Request #80 · evansenter/clemini

evansenter · 2026-01-19T10:23:53Z

Summary

Add task tool that spawns clemini subprocesses to handle delegated work
Supports foreground (wait for result) and background (fire-and-forget) modes
Reuses BACKGROUND_TASKS registry so kill_shell works for both bash and task processes

Changes

New src/tools/task.rs with TaskTool implementation
Make NEXT_TASK_ID pub(crate) in bash.rs for reuse
Register TaskTool in CleminiToolService

Design Decisions

Decision	Choice	Rationale
Registry	Reuse `BACKGROUND_TASKS`	KillShell works for both, unified management
Invocation	Configurable	`current_exe()` first, `cargo run` fallback
Output capture	Fire-and-forget (MVP)	Matches bash behavior, #79 tracks enhancement

Test plan

Unit tests for get_clemini_command() and declaration
make clippy passes
make fmt passes
make test passes
Manual testing with foreground/background modes

Closes #78

🤖 Generated with Claude Code

- Refactor clemini from pure binary to library+binary architecture - Add src/lib.rs exposing AgentEvent, run_interaction, CleminiToolService - Move logging infrastructure to src/logging.rs with OutputSink trait - Add tests/confirmation_tests.rs with 4 live API tests: - test_destructive_command_requests_confirmation - test_confirmation_approval_executes_command - test_safe_command_no_confirmation - test_confirmation_response_is_semantic - Add semantic validation using Gemini structured output - Add serial_test dependency for global state tests Run live tests with: cargo test --test confirmation_tests -- --include-ignored Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Drop MutexGuard before awaiting child.kill() in bash tests - Remove unnecessary let binding in init_test_logging - Collapse nested if statements using let chain Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add comment explaining unsafe env var usage in Rust 2024 edition - Expand re-export pattern comment for clarity - Improve semantic validator comment and use eprintln for warnings - Add Makefile comment explaining separate test invocations Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Change tool executing prefix from 🔧 to ┌─ to create visual bracket with tool result line (└─), making tool call output more structured: ┌─ bash command="ls" └─ bash 0.01s ~20 tok Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Change tool executing format from: ┌─ (dimmed) name (cyan) args (dimmed) to: ┌─ name (cyan) args This makes the opening bracket match the closing bracket (└─) and improves readability of tool arguments. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Fix ordering issues where tools called log_event() directly, causing race conditions with async events. Tools now emit AgentEvent::ToolOutput events via channels, ensuring all output flows through the event system in correct order. Changes: - Add ToolOutput variant to AgentEvent - Add on_tool_output() to EventHandler trait - All 11 tools receive events_tx and emit via emit() helper - Add graceful RwLock poisoning recovery - Add 3 integration test files with semantic validation - Wire integration tests into CI with GEMINI_API_KEY - Add make test-all target for running full test suite Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add .gradle/, bin/, node_modules/, build/, .class files, and package-lock.json patterns for benchmark exercises. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Allow dead_code on EXTENDED_TIMEOUT constant - Remove unnecessary let binding in create_test_tool_service Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add --reset flag to restore exercises to clean state before running - Warn users when exercises have uncommitted changes from previous runs - Add -y/--yes flag to skip confirmation prompts - Show list of modified files when dirty state detected Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The test_file_not_found_helpful_response test can take longer than 60s because the model may try multiple approaches (glob, read) before giving up and suggesting alternatives. Use EXTENDED_TIMEOUT (120s) to avoid flaky CI failures. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add a task tool that spawns clemini subprocesses to handle delegated work. Supports foreground (wait for result) and background (fire-and- forget) modes. Key changes: - New src/tools/task.rs with TaskTool implementation - Make NEXT_TASK_ID pub(crate) for reuse across tools - Register TaskTool in CleminiToolService Design decisions: - Reuses BACKGROUND_TASKS registry so kill_shell works for both - Configurable invocation: tries current_exe(), falls back to cargo run - Fire-and-forget MVP; output capture tracked in #79 Closes #78 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-19T10:25:46Z

Code Review

Summary

This PR adds a task tool for spawning clemini subprocesses to handle delegated work, supporting both foreground (wait for result) and background (fire-and-forget) modes. It also refactors event handling to use AgentEvent::ToolOutput for tool visual output, adds integration tests with semantic validation, and updates CI to run integration tests.

Issues Found

Critical

None

Important

src/tools/task.rs:72-98 - Background tasks inherit parent's stdout/stderr set to null, but foreground mode captures output. The foreground subagent's streaming output to stderr will be captured and returned in the stderr field, but this may contain the assistant's response text mixed with tool output. Consider documenting this behavior or separating response text from debug/tool output in the returned JSON.
src/tools/task.rs:65 - The --cwd flag is passed to the subagent, but the CLAUDE.md or issue Implement Task tool for spawning subagents #78 doesn't specify if the subagent should have the same allowed_paths restrictions. Currently it would default to the cwd which may be more permissive than intended.

Suggestions

src/tools/task.rs:27-34 - The get_clemini_command() fallback to cargo run --quiet only makes sense during development. In production, if current_exe() fails (which is unusual), this fallback would attempt to compile the project. Consider logging a warning or returning an error instead of silently falling back to cargo.
src/tools/task.rs:89 - The stdin(Stdio::null()) for foreground mode means the subagent cannot use ask_user tool since it needs stdin. This limitation should be documented in the tool description.
tests/common/mod.rs:99-106 - The #[allow(dead_code)] on get_api_key and other functions suggests these may be unused in some test files. Consider moving only shared utilities here and keeping test-specific helpers in their respective test files.
src/tools/mod.rs:63-89 - The set_events_tx and events_tx() methods use interior mutability with RwLock. While this works, the lock is held very briefly. Consider documenting why RwLock was chosen over Mutex (reader performance isn't the reason since writes are also frequent).

Verdict

REQUEST_CHANGES - The task tool implementation is solid overall, but there are a couple of important issues around subagent behavior (output capture semantics, path restrictions) that should be clarified either in code or documentation before merging.

Automated review by Claude Code

- Add tracing::warn when falling back to cargo run (clarifies dev vs prod) - Document stdin limitation in tool description (ask_user not available) - Add code comment explaining allowed_paths inheritance is intentional Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

evansenter · 2026-01-19T10:35:12Z

Feedback Addressed

Implemented

[Important] Documented paths/sandbox behavior - Added code comment explaining allowed_paths inheritance is intentional (subagent operates independently)
[Suggestion] Added tracing::warn for cargo run fallback - Clarifies this is expected during development but indicates an issue in production
[Suggestion] Documented stdin limitation - Added to tool description that subagents cannot use interactive tools (ask_user)

Skipped

[Important] Output capture behavior - This is intentional MVP behavior. Fire-and-forget for background mode matches bash tool. Issue Add TaskOutput tool for retrieving background task results #79 tracks streaming output enhancement.
[Suggestion] cargo run fallback coverage - This fallback only triggers in edge cases (broken install, deleted binary). The warning log makes the behavior visible for debugging.

claude · 2026-01-19T10:37:00Z

Code Review

Summary

This PR adds a task tool for spawning clemini subprocesses (subagents) to handle delegated work, supporting both foreground and background modes. It also refactors event handling to use AgentEvent::ToolOutput for tool visual output, adds semantic integration tests with a common test framework, updates CI to run integration tests, and makes various visual output improvements.

Issues Found

Critical

None

Important

None

Suggestions

src/tools/mod.rs:63-89 - The set_events_tx and events_tx() methods use interior mutability with RwLock. The comments explain what but not why RwLock was chosen over Mutex. Since write operations are frequent (before/after every interaction) and reads only happen during tool creation, reader-writer lock performance benefits don't apply. Consider adding a brief comment explaining the choice, or using Mutex for simplicity since the lock is held very briefly in both cases.
tests/common/mod.rs:99-106 - Several functions have #[allow(dead_code)] annotations (get_api_key, create_tool_service, create_temp_dir, init_test_logging). While these are genuinely used across different test files, the pattern suggests these might benefit from being marked pub(crate) or having the test modules explicitly import them to suppress the warnings organically rather than with attributes.

Previously Addressed (Filtered)

src/tools/task.rs:65 - paths/sandbox behavior (Implemented: added code comment explaining intentional behavior)
src/tools/task.rs:27-34 - cargo run fallback (Implemented: added tracing::warn for visibility)
Tool description - stdin limitation (Implemented: documented that subagents cannot use ask_user)
Output capture behavior (Skipped: intentional MVP, tracked in Add TaskOutput tool for retrieving background task results #79)

Verdict

REQUEST_CHANGES - Two minor suggestions around code clarity. The implementation is solid and addresses the linked issue #78 well.

Automated review by Claude Code

- Document task tool parameters, returns, limitations - Update kill_shell to mention it works with task tool too - Add task tool use cases to "When to Use Which Tool" table Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-19T10:47:50Z

Code Review

Summary

This PR adds a task tool for spawning clemini subprocesses (subagents) to handle delegated work, supporting both foreground and background modes. It also refactors event handling so tools emit AgentEvent::ToolOutput instead of calling log_event() directly, adds a library crate for integration tests with semantic validation, updates CI to run integration tests, and includes various visual output improvements.

Issues Found

Critical

None

Important

None

Suggestions

None

Previously Addressed (Filtered)

src/tools/task.rs:65 - Paths/sandbox behavior (Implemented: added code comment explaining intentional behavior)
src/tools/task.rs:27-34 - Cargo run fallback warning (Implemented: added tracing::warn for visibility)
Tool description - stdin limitation (Implemented: documented that subagents cannot use ask_user)
Output capture behavior (Skipped: intentional MVP, tracked in Add TaskOutput tool for retrieving background task results #79)
src/tools/mod.rs:63-89 - RwLock vs Mutex choice (Implemented: prior feedback adequately addressed; the choice is reasonable for the use case)
tests/common/mod.rs - #[allow(dead_code)] annotations (Prior feedback: these are genuinely used across different test files via mod common; the pattern is acceptable)

Verdict

APPROVE - The implementation is solid and addresses the linked issue #78 well. The task tool design is clean, the event-driven refactoring for tool output is well-structured, and the integration test framework provides good coverage. Previous feedback has been addressed appropriately.

Automated review by Claude Code

claude · 2026-01-19T10:49:05Z

Code Review

Summary

This PR adds a task tool for spawning clemini subprocesses (subagents) to handle delegated work, supporting both foreground and background modes. It also refactors tool output to use AgentEvent::ToolOutput for proper event ordering, adds a library crate for integration tests with semantic validation, updates CI to run integration tests, and includes visual output improvements (replacing emoji with box-drawing characters).

Issues Found

Critical

None

Important

None

Suggestions

None

Previously Addressed (Filtered)

src/tools/task.rs:65 - Paths/sandbox behavior (Implemented: added code comment explaining intentional behavior)
src/tools/task.rs:27-34 - Cargo run fallback warning (Implemented: added tracing::warn for visibility)
Tool description - stdin limitation (Implemented: documented that subagents cannot use ask_user)
Output capture behavior (Skipped: intentional MVP, tracked in Add TaskOutput tool for retrieving background task results #79)
src/tools/mod.rs:63-89 - RwLock vs Mutex choice (Prior feedback: reasonable for the use case)
tests/common/mod.rs - #[allow(dead_code)] annotations (Prior feedback: acceptable pattern for shared test utilities)

Verdict

APPROVE - The implementation is solid and fully addresses the linked issue #78. The task tool design is clean, reusing the existing BACKGROUND_TASKS registry for unified process management. The event-driven refactoring ensures correct output ordering across all UI modes. The integration test framework with semantic validation provides meaningful coverage. All prior feedback has been appropriately addressed.

Automated review by Claude Code

Makes MCP config portable across machines by using $HOME instead of hardcoded absolute path. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-19T10:51:29Z

Code Review

Summary

This PR adds a task tool for spawning clemini subprocesses (subagents) to handle delegated work, supporting both foreground and background modes. It also refactors tool output to use AgentEvent::ToolOutput for proper event ordering, creates a library crate for integration tests with semantic validation, updates CI to run integration tests, and replaces the emoji tool indicator (🔧) with box-drawing characters (┌─).

Issues Found

Critical

None

Important

None

Suggestions

None

Previously Addressed (Filtered)

src/tools/task.rs:65 - Paths/sandbox behavior (Implemented: added code comment explaining intentional behavior - subagent operates independently)
src/tools/task.rs:27-34 - Cargo run fallback warning (Implemented: added tracing::warn for visibility)
Tool description - stdin limitation (Implemented: documented that subagents cannot use ask_user)
Output capture behavior (Skipped: intentional MVP, tracked in Add TaskOutput tool for retrieving background task results #79)
src/tools/mod.rs:63-89 - RwLock vs Mutex choice (Prior feedback: reasonable for the use case)
tests/common/mod.rs - #[allow(dead_code)] annotations (Prior feedback: acceptable pattern for shared test utilities used via mod common)

Verdict

APPROVE - The implementation is solid and fully addresses the linked issue #78. Key strengths:

Clean task tool design - Reuses the existing BACKGROUND_TASKS registry for unified process management, allowing kill_shell to work for both bash and task processes
Event-driven refactoring - Tools now emit AgentEvent::ToolOutput instead of calling log_event() directly, ensuring correct output ordering across all UI modes (Terminal, TUI, MCP)
Comprehensive test coverage - New integration tests with semantic validation (validate_response_semantically()) provide meaningful coverage without brittle string assertions
Well-documented limitations - Tool description clearly states that subagents cannot use interactive tools (ask_user) and have their own sandbox
All prior feedback appropriately addressed - Either implemented or consciously skipped with tracked follow-up (Add TaskOutput tool for retrieving background task results #79)

Automated review by Claude Code

Machine-specific paths don't work with $HOME expansion, so keep .mcp.json gitignored and provide an example template for users to copy. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-19T10:53:03Z

Code Review

Summary

This PR adds a task tool for spawning clemini subprocesses (subagents) to handle delegated work, supporting both foreground and background modes. It also refactors tool output to use AgentEvent::ToolOutput for proper event ordering, creates a library crate for integration tests with semantic validation, updates CI to run integration tests, and replaces the emoji tool indicator (🔧) with box-drawing characters (┌─).

Issues Found

Critical

None

Important

None

Suggestions

None

Previously Addressed (Filtered)

src/tools/task.rs:65 - Paths/sandbox behavior (Implemented: added code comment explaining intentional behavior - subagent operates independently)
src/tools/task.rs:27-34 - Cargo run fallback warning (Implemented: added tracing::warn for visibility)
Tool description - stdin limitation (Implemented: documented that subagents cannot use ask_user)
Output capture behavior (Skipped: intentional MVP, tracked in Add TaskOutput tool for retrieving background task results #79)
src/tools/mod.rs:63-89 - RwLock vs Mutex choice (Prior feedback: reasonable for the use case)
tests/common/mod.rs - #[allow(dead_code)] annotations (Prior feedback: acceptable pattern for shared test utilities used via mod common)

Verdict

APPROVE - The implementation is solid and fully addresses the linked issue #78. Key strengths:

Clean task tool design - Reuses the existing BACKGROUND_TASKS registry for unified process management, allowing kill_shell to work for both bash and task processes
Event-driven refactoring - Tools now emit AgentEvent::ToolOutput instead of calling log_event() directly, ensuring correct output ordering across all UI modes (Terminal, TUI, MCP)
Comprehensive test coverage - New integration tests with semantic validation (validate_response_semantically()) provide meaningful coverage without brittle string assertions
Well-documented limitations - Tool description clearly states that subagents cannot use interactive tools (ask_user) and have their own sandbox
All prior feedback appropriately addressed - Either implemented or consciously skipped with tracked follow-up (Add TaskOutput tool for retrieving background task results #79)

Automated review by Claude Code

claude · 2026-01-19T10:54:29Z

Code Review

Summary

This PR adds a task tool for spawning clemini subprocesses (subagents) to handle delegated work, supporting both foreground and background modes. It also refactors tool output to use AgentEvent::ToolOutput for proper event ordering, creates a library crate for integration tests with semantic validation, updates CI to run integration tests, and replaces the emoji tool indicator (🔧) with box-drawing characters (┌─).

Issues Found

Critical

None

Important

None

Suggestions

None

Previously Addressed (Filtered)

src/tools/task.rs:65 - Paths/sandbox behavior (Implemented: added code comment explaining intentional behavior - subagent operates independently)
src/tools/task.rs:27-34 - Cargo run fallback warning (Implemented: added tracing::warn for visibility)
Tool description - stdin limitation (Implemented: documented that subagents cannot use ask_user)
Output capture behavior (Skipped: intentional MVP, tracked in Add TaskOutput tool for retrieving background task results #79)
src/tools/mod.rs:63-89 - RwLock vs Mutex choice (Prior feedback: reasonable for the use case)
tests/common/mod.rs - #[allow(dead_code)] annotations (Prior feedback: acceptable pattern for shared test utilities used via mod common)

Verdict

APPROVE - The implementation is solid and fully addresses the linked issue #78. Key strengths:

Clean task tool design - Reuses the existing BACKGROUND_TASKS registry for unified process management, allowing kill_shell to work for both bash and task processes
Event-driven refactoring - Tools now emit AgentEvent::ToolOutput instead of calling log_event() directly, ensuring correct output ordering across all UI modes (Terminal, TUI, MCP)
Comprehensive test coverage - New integration tests with semantic validation (validate_response_semantically()) provide meaningful coverage without brittle string assertions
Well-documented limitations - Tool description clearly states that subagents cannot use interactive tools (ask_user) and have their own sandbox
All prior feedback appropriately addressed - Either implemented or consciously skipped with tracked follow-up (Add TaskOutput tool for retrieving background task results #79)

Automated review by Claude Code

evansenter and others added 11 commits January 18, 2026 23:12

fix: resolve clippy warnings in tests

bfab2f7

- Drop MutexGuard before awaiting child.kill() in bash tests - Remove unnecessary let binding in init_test_logging - Collapse nested if statements using let chain Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: ignore benchmark build artifacts

770e0d3

Add .gradle/, bin/, node_modules/, build/, .class files, and package-lock.json patterns for benchmark exercises. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: resolve clippy warnings in tests

fac6e0b

- Allow dead_code on EXTENDED_TIMEOUT constant - Remove unnecessary let binding in create_test_tool_service Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

evansenter mentioned this pull request Jan 19, 2026

Add TaskOutput tool for retrieving background task results #79

Closed

evansenter and others added 3 commits January 19, 2026 10:45

docs: add task tool to TOOLS.md reference

4a80d1e

- Document task tool parameters, returns, limitations - Update kill_shell to mention it works with task tool too - Add task tool use cases to "When to Use Which Tool" table Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

docs: reference TOOLS.md from CLAUDE.md

56a1867

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: remove stale issue-31 analysis artifacts

73cc162

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: check in .mcp.json with $HOME path

9bb5fda

Makes MCP config portable across machines by using $HOME instead of hardcoded absolute path. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: add .mcp.json.example template

98b6862

Machine-specific paths don't work with $HOME expansion, so keep .mcp.json gitignored and provide an example template for users to copy. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

evansenter force-pushed the feat/task-tool branch from c5c83b4 to 98b6862 Compare January 19, 2026 10:52

evansenter merged commit fd0fa4e into main Jan 19, 2026
8 checks passed

evansenter deleted the feat/task-tool branch January 19, 2026 10:53

evansenter mentioned this pull request Jan 19, 2026

Consider streaming-first architecture for subagents and background operations #59

Open

This was referenced Jan 20, 2026

RFC: MCP client integration (connect to external MCP servers) #3

Open

feat: expose agent as library with live integration tests #74

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add task tool for spawning subagents#80

feat: add task tool for spawning subagents#80
evansenter merged 17 commits intomainfrom
feat/task-tool

evansenter commented Jan 19, 2026

Uh oh!

claude bot commented Jan 19, 2026

Uh oh!

evansenter commented Jan 19, 2026

Uh oh!

claude bot commented Jan 19, 2026

Uh oh!

claude bot commented Jan 19, 2026

Uh oh!

claude bot commented Jan 19, 2026

Uh oh!

claude bot commented Jan 19, 2026

Uh oh!

claude bot commented Jan 19, 2026

Uh oh!

Uh oh!

claude bot commented Jan 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

evansenter commented Jan 19, 2026

Summary

Changes

Design Decisions

Test plan

Uh oh!

claude bot commented Jan 19, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Verdict

Uh oh!

evansenter commented Jan 19, 2026

Feedback Addressed

Implemented

Skipped

Uh oh!

claude bot commented Jan 19, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Previously Addressed (Filtered)

Verdict

Uh oh!

claude bot commented Jan 19, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Previously Addressed (Filtered)

Verdict

Uh oh!

claude bot commented Jan 19, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Previously Addressed (Filtered)

Verdict

Uh oh!

claude bot commented Jan 19, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Previously Addressed (Filtered)

Verdict

Uh oh!

claude bot commented Jan 19, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Previously Addressed (Filtered)

Verdict

Uh oh!

Uh oh!

claude bot commented Jan 19, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Previously Addressed (Filtered)