Skip to content

[Feature]: Add mandatory plan step before task execution#37

Merged
louisdevzz merged 1 commit intomainfrom
feat/mandatory-plan-step
Mar 5, 2026
Merged

[Feature]: Add mandatory plan step before task execution#37
louisdevzz merged 1 commit intomainfrom
feat/mandatory-plan-step

Conversation

@louisdevzz
Copy link
Collaborator

@louisdevzz louisdevzz commented Mar 5, 2026

Summary

  • Base branch target: main
  • Problem: Agent executes tasks immediately without showing user what it plans to do
  • Why it matters: Users need transparency and control before agent modifies files/runs commands
  • What changed: Added mandatory plan step that shows LLM-generated plan before tool execution
  • What did not change: Read-only operations skip plan; factory_build workflow unchanged

Label Snapshot (required)

  • Risk label: risk: medium
  • Size label: size: M
  • Scope labels: agent,core
  • Module labels: agent: orchestration
  • Contributor tier label: trusted contributor

Change Metadata

  • Change type: feature
  • Primary scope: agent

Linked Issue

Validation Evidence

cargo fmt --all -- --check  #
cargo clippy --locked --all-targets -- -D clippy::correctness  # ✓ (warnings only)
cargo test --lib  # ✓ 3139 passed

Security Impact

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? No
  • File system access scope changed? No

Privacy and Data Hygiene

  • Data-hygiene status: pass
  • Redaction/anonymization notes: N/A
  • Neutral wording confirmation: Yes

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

i18n Follow-Through

  • i18n follow-through triggered? No

Rollback Plan

  • Fast rollback: git revert or revert PR
  • Feature flags: None
  • Observable failure: User reports, test failures

Risks and Mitigations

  • Risk: UX friction from extra step
    • Mitigation: Read-only ops exempt; simple yes/no flow
  • Risk: Channel compatibility (Signal/WhatsApp)
    • Mitigation: Plan shown as normal message

Implement mandatory plan-before-execute requirement for all agent tasks:

Agent changes (src/agent/agent.rs):
- Add generate_and_confirm_plan() method that creates execution plan
- Plan is triggered on first tool iteration when write operations detected
- Plan lists: files to modify, commands to execute, expected outcomes
- User must explicitly approve (yes/no) before execution proceeds
- Read-only operations (file_read, etc.) are exempt from plan confirmation
- Plan is added to conversation history for audit purposes

Documentation (AGENTS.md):
- Update section 5.2 'Plan enforcement' with detailed behavior
- Document implementation location and exemption rules

Features:
- Automatic plan generation based on actual tool calls
- Clear bullet-point plan format
- User confirmation required before any state changes
- Graceful rejection handling
- Test-friendly implementation (all 3139 tests pass)

Fixes #21
@github-actions
Copy link

github-actions bot commented Mar 5, 2026

PR Intake Checks - Warnings (non-blocking)

The following are recommendations:

  • Missing sections: Problem statement, Proposed solution, Acceptance criteria

@coderabbitai
Copy link

coderabbitai bot commented Mar 5, 2026

Note

.coderabbit.yaml has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)
Validation error: Unrecognized key(s) in object: 'version', 'tests', 'ignore'
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Walkthrough

Updates documentation and implements mandatory plan confirmation before state-modifying tool execution. Adds a private generate_and_confirm_plan method that inspects upcoming tool calls, builds a textual plan for write operations, prompts user approval, and integrates planning into the turn flow at the first iteration.

Changes

Cohort / File(s) Summary
Documentation
AGENTS.md
Clarifies plan enforcement requirement: plan generation/confirmation must occur before executing state-modifying tools. Documents mandatory plan-step behavior for first tool iteration (listing actions, requiring "yes" approval, aborting on rejection, read-only ops exempt).
Agent Planning Logic
src/agent/agent.rs
Adds private generate_and_confirm_plan() async method that inspects tool calls for write operations, generates formatted plan, prompts user approval, and appends approved plan to history. Integrates mandatory planning into turn flow at iteration == 0 before tool execution.

Sequence Diagram

sequenceDiagram
    participant User
    participant Agent as Agent System
    participant Tools as Tool Execution
    
    User->>Agent: Submit task
    Agent->>Agent: Evaluate next tool calls
    alt Has Write Operations
        Agent->>Agent: Generate plan from tool metadata
        Agent->>User: Present plan for approval
        User->>User: Review planned actions
        User-->>Agent: Approve ("yes")
        Agent->>Agent: Append plan to history
        Agent->>Tools: Execute approved tools
        Tools-->>Agent: Tool results
        Agent->>User: Return execution results
    else Read-Only Operations
        Agent->>Tools: Execute tools directly
        Tools-->>Agent: Tool results
        Agent->>User: Return execution results
    else User Rejects Plan
        User-->>Agent: Reject plan
        Agent->>User: Abort with no changes notice
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and concisely describes the main feature being implemented—adding a mandatory plan step before task execution.
Description check ✅ Passed The PR description comprehensively covers all required template sections including summary, metadata, validation evidence, security/privacy/compatibility assessments, verification details, rollback plan, and risks.
Linked Issues check ✅ Passed The implementation fully addresses issue #21 objectives: mandatory plan generation with user confirmation, read-only exemptions, plan logging in conversation history, and AGENTS.md documentation updates.
Out of Scope Changes check ✅ Passed All changes are scoped to the stated objectives. Only AGENTS.md and src/agent/agent.rs are modified to implement the mandatory plan step; no extraneous refactoring or unrelated changes are present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/mandatory-plan-step

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@louisdevzz louisdevzz self-assigned this Mar 5, 2026
@louisdevzz louisdevzz added the enhancement small enhancement / improvement from existing feature label Mar 5, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/agent/agent.rs (3)

442-446: Unused model parameter.

The model parameter is accepted but never used within generate_and_confirm_plan. Either remove it or document the intended future use.

♻️ Proposed fix
     async fn generate_and_confirm_plan(
         &mut self,
-        model: &str,
         calls: &[ParsedToolCall],
     ) -> Result<bool> {

And update the call site at line 602:

                 let plan_confirmed = self
-                    .generate_and_confirm_plan(&effective_model, &calls)
+                    .generate_and_confirm_plan(&calls)
                     .await?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/agent.rs` around lines 442 - 446, The generate_and_confirm_plan
function currently accepts an unused model parameter; remove the unused
parameter from the function signature (async fn generate_and_confirm_plan(&mut
self, calls: &[ParsedToolCall]) -> Result<bool>) and update every call site that
passed the model (e.g., the invocation near the previous call site) to stop
supplying that argument, or alternatively if the model is intended to be used
later, document the parameter and use it in the function body; prefer removing
the parameter now and update callers to match the new signature (search for
generate_and_confirm_plan and adjust callers accordingly).

787-871: Consider adding test coverage for the plan confirmation path.

The existing tests exercise non-write-op scenarios where plan confirmation is skipped. The new generate_and_confirm_plan logic isn't directly tested. Consider:

  1. A unit test for generate_and_confirm_plan with a mock stdin, or
  2. Refactoring confirmation input behind a trait for testability

This could be addressed in a follow-up PR given the complexity of mocking stdin.

Would you like me to open an issue to track adding test coverage for the plan confirmation workflow?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/agent.rs` around lines 787 - 871, Tests currently don't cover the
plan confirmation path; add a unit test that exercises generate_and_confirm_plan
by either (A) refactoring the confirmation input into an injectable trait (e.g.,
a ConfirmProvider) and adding a mock implementation passed via Agent::builder,
or (B) writing a test that mocks stdin to simulate user confirmation, then
calling Agent::turn to trigger generate_and_confirm_plan and asserting the
plan-confirmation behavior (e.g., the agent proceeds and expected history
entries appear). Modify Agent::builder to accept the injectable confirmation
provider (or wiring for mocked stdin) and add a test that validates the
confirmed-plan flow so generate_and_confirm_plan is exercised.

506-510: Blocking stdin read in async context may limit non-CLI usage.

std::io::stdin().read_line() blocks the tokio runtime thread. While this works for CLI mode, it could cause issues when:

  • The agent runs via non-interactive channels (Telegram, Discord, Slack)
  • Running in automated tests or CI pipelines

Consider abstracting the confirmation mechanism behind a trait or using tokio::io::stdin() with AsyncBufReadExt::read_line() for async-compatible I/O. Alternatively, document that this method is CLI-only.

💡 Alternative: async stdin
use tokio::io::{AsyncBufReadExt, BufReader};

// In generate_and_confirm_plan:
let mut reader = BufReader::new(tokio::io::stdin());
let mut input = String::new();
reader.read_line(&mut input).await
    .map_err(|e| anyhow::anyhow!("Failed to read user input: {e}"))?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/agent.rs` around lines 506 - 510, The blocking
std::io::stdin().read_line() call inside generate_and_confirm_plan blocks the
tokio runtime and prevents non-CLI usage; replace it with an async-compatible
confirmation abstraction: either (1) introduce a trait (e.g., ConfirmPrompt)
with an async method confirm(&mut self) -> Result<String> and inject an
implementation used in generate_and_confirm_plan so non-CLI callers can provide
alternate behavior, or (2) switch to tokio::io::stdin() + tokio::io::BufReader
and AsyncBufReadExt::read_line(). Update generate_and_confirm_plan to depend on
the async trait or await the async read_line and propagate errors as
anyhow::Error, and ensure tests/CLI wiring construct the CLI implementation
while non-interactive callers can stub/mock the trait.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agent/agent.rs`:
- Around line 448-458: The current write-op detection in has_write_ops (the
closure on calls.iter().any that matches call.name.as_str()) is missing several
sandbox- and GitHub-related state-modifying tool names; update that matches list
to also include sandbox_create, sandbox_kill, sandbox_save_snapshot,
sandbox_restore_snapshot and the GitHub mutating tools (github_create_issue,
github_create_pr, github_create_issue_with_hashtags, github_edit_issue,
github_close_issue, github_comment_issue, github_comment_pr,
github_reply_comment, github_review_pr, github_review_pr_with_checklist) so
these calls trigger the plan-confirmation gate, or alternatively refactor the
logic in the has_write_ops computation to invert the check to an explicit
allowlist of read-only tool names (e.g., file_read, sandbox_read_file,
sandbox_list_files, etc.) to ensure a safer default.

---

Nitpick comments:
In `@src/agent/agent.rs`:
- Around line 442-446: The generate_and_confirm_plan function currently accepts
an unused model parameter; remove the unused parameter from the function
signature (async fn generate_and_confirm_plan(&mut self, calls:
&[ParsedToolCall]) -> Result<bool>) and update every call site that passed the
model (e.g., the invocation near the previous call site) to stop supplying that
argument, or alternatively if the model is intended to be used later, document
the parameter and use it in the function body; prefer removing the parameter now
and update callers to match the new signature (search for
generate_and_confirm_plan and adjust callers accordingly).
- Around line 787-871: Tests currently don't cover the plan confirmation path;
add a unit test that exercises generate_and_confirm_plan by either (A)
refactoring the confirmation input into an injectable trait (e.g., a
ConfirmProvider) and adding a mock implementation passed via Agent::builder, or
(B) writing a test that mocks stdin to simulate user confirmation, then calling
Agent::turn to trigger generate_and_confirm_plan and asserting the
plan-confirmation behavior (e.g., the agent proceeds and expected history
entries appear). Modify Agent::builder to accept the injectable confirmation
provider (or wiring for mocked stdin) and add a test that validates the
confirmed-plan flow so generate_and_confirm_plan is exercised.
- Around line 506-510: The blocking std::io::stdin().read_line() call inside
generate_and_confirm_plan blocks the tokio runtime and prevents non-CLI usage;
replace it with an async-compatible confirmation abstraction: either (1)
introduce a trait (e.g., ConfirmPrompt) with an async method confirm(&mut self)
-> Result<String> and inject an implementation used in generate_and_confirm_plan
so non-CLI callers can provide alternate behavior, or (2) switch to
tokio::io::stdin() + tokio::io::BufReader and AsyncBufReadExt::read_line().
Update generate_and_confirm_plan to depend on the async trait or await the async
read_line and propagate errors as anyhow::Error, and ensure tests/CLI wiring
construct the CLI implementation while non-interactive callers can stub/mock the
trait.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6a70bfe3-277c-4ec0-8047-8269981a9578

📥 Commits

Reviewing files that changed from the base of the PR and between ada445f and 27f0f12.

📒 Files selected for processing (2)
  • AGENTS.md
  • src/agent/agent.rs

@louisdevzz louisdevzz merged commit 7fea63e into main Mar 5, 2026
6 of 19 checks passed
louisdevzz added a commit that referenced this pull request Mar 5, 2026
Added detailed FIX_PLAN_BUGS.md documenting:
- Current implementation and flow
- 5 specific bugs with acceptance criteria
- Step-by-step implementation guide
- Code structure and key methods
- Testing strategy

This document will help any agent understand and fix the bugs
in the mandatory plan-before-execute feature.

Related: #38, #37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent docs enhancement small enhancement / improvement from existing feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Add mandatory plan-before-execute step for agent tasks

1 participant