@jerop jerop commented Jan 16, 2026

Summary

This PR refines the experimental Plan Mode with a strict "secure-by-default" policy that permits only read and search operations. It also updates the tool scheduler to halt agent execution as soon as the model attempts a prohibited tool, which prevents retry loops and gives the user clear guidance.

Closes #16625

Details

Policy Implementation

  • Introduced packages/core/src/policy/policies/plan.toml with a default-deny rule for all tools in Plan mode.
  • Allow-listed only the read and search tools (read_file, web_fetch, google_web_search, etc.); every other tool falls through to the default deny.
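
For reviewers who want a mental model of the default-deny plus allow-list behavior, here is a minimal sketch. It is not the contents of plan.toml or the real PolicyEngine: the `PolicyRule` shape, the `priority` values, and the `decide` helper are all assumptions made for illustration. Only the three tool names come from the description above.

```typescript
// Hypothetical sketch: these types and names are illustrative and are NOT
// taken from plan.toml or the actual PolicyEngine.
type Decision = "allow" | "deny";

interface PolicyRule {
  toolName: string; // "*" matches any tool
  decision: Decision;
  priority: number; // assumed convention: higher priority wins
}

// Assumed rule set mirroring the description: one catch-all deny plus
// higher-priority allows for the read/search tools named in this PR.
const planRules: PolicyRule[] = [
  { toolName: "*", decision: "deny", priority: 0 },
  { toolName: "read_file", decision: "allow", priority: 10 },
  { toolName: "web_fetch", decision: "allow", priority: 10 },
  { toolName: "google_web_search", decision: "allow", priority: 10 },
];

// Pick the highest-priority rule that matches the tool name.
function decide(toolName: string, rules: PolicyRule[]): Decision {
  let best: PolicyRule | undefined;
  for (const rule of rules) {
    const matches = rule.toolName === "*" || rule.toolName === toolName;
    if (matches && (best === undefined || rule.priority > best.priority)) {
      best = rule;
    }
  }
  // No matching rule at all also means deny (secure by default).
  return best ? best.decision : "deny";
}
```

Under these assumptions, `decide("read_file", planRules)` yields `"allow"`, while any tool not on the list (for example a hypothetical `write_file`) yields `"deny"`.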

Execution Control

  • Updated CoreToolScheduler to return a STOP_EXECUTION error type when a tool is denied in Plan mode.
  • This triggers the existing "Stop" logic in the agent loop, preventing the model from retrying the same blocked action.
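
The control flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the CoreToolScheduler code: only the STOP_EXECUTION name and the Plan Mode error message come from this PR, while `ToolResult`, `scheduleTool`, `runAgentLoop`, the `EXECUTION_FAILED` type, and the generic denial message are hypothetical stand-ins.

```typescript
// Hypothetical sketch: only STOP_EXECUTION and the Plan Mode message come from
// this PR; every type and function below is illustrative.
type ToolErrorType = "STOP_EXECUTION" | "EXECUTION_FAILED";

interface ToolResult {
  output?: string;
  error?: { type: ToolErrorType; message: string };
}

interface ToolCall {
  name: string;
}

const PLAN_MODE_DENIAL =
  "Tool execution denied by policy. You are in Plan Mode - adjust your prompt to only use read and search tools.";

// Scheduler side: a denied tool in Plan mode yields a STOP_EXECUTION error
// instead of an ordinary failure that the model might retry.
function scheduleTool(
  call: ToolCall,
  isAllowed: (name: string) => boolean,
  inPlanMode: boolean,
): ToolResult {
  if (!isAllowed(call.name)) {
    return inPlanMode
      ? { error: { type: "STOP_EXECUTION", message: PLAN_MODE_DENIAL } }
      : { error: { type: "EXECUTION_FAILED", message: "Tool execution denied by policy." } };
  }
  return { output: `ran ${call.name}` };
}

// Agent-loop side: STOP_EXECUTION ends the turn immediately; other errors
// are skipped here (a real loop would feed them back to the model).
function runAgentLoop(
  calls: ToolCall[],
  isAllowed: (name: string) => boolean,
  inPlanMode: boolean,
): { executed: string[]; stoppedWith?: string } {
  const executed: string[] = [];
  for (const call of calls) {
    const result = scheduleTool(call, isAllowed, inPlanMode);
    if (result.error?.type === "STOP_EXECUTION") {
      return { executed, stoppedWith: result.error.message };
    }
    if (!result.error) {
      executed.push(call.name);
    }
  }
  return { executed };
}
```

The point of the distinct error type is that the loop can tell "this tool failed, try something else" apart from "stop now and tell the user", which is what breaks the retry cycle.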

How to Validate

  1. Build the project:

    npm run build
  2. Run in Plan Mode:

    # Ensure experimental.plan is enabled in your settings
    npm start -- --approval-mode=plan 
  3. Attempt a Write Operation:

    • Ask: "Create a new file called test.txt"
    • Expected: The agent stops immediately with the error: "Tool execution denied by policy. You are in Plan Mode - adjust your prompt to only use read and search tools."
  4. Attempt a Read Operation:

    • Ask: "Read package.json"
    • Expected: The tool executes successfully.

Pre-Merge Checklist

  • Updated relevant documentation and README (Self-documenting policy files)
  • Added/updated tests (Integration and Unit tests added)
  • Noted breaking changes (None, Plan mode is experimental)
  • Validated on required platforms/methods:
    • macOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
    • Linux

@jerop jerop requested a review from a team as a code owner January 16, 2026 17:00
@jerop jerop force-pushed the feat/plan-mode-refinement branch 2 times, most recently from d4fb3bf to a75fd86 Compare January 16, 2026 17:05
jerop added 2 commits January 16, 2026 12:06
Introduces a default-deny policy for Plan mode that explicitly allows only safe read and search tools. Includes integration tests verifying tool enforcement and priority logic.
Updates the tool scheduler to return a STOP_EXECUTION error type when a tool is denied in Plan mode. This breaks the agent's retry loop and provides a clear instructional error message. Includes unit tests for the new denial behavior.
@jerop jerop force-pushed the feat/plan-mode-refinement branch from a75fd86 to 476394f Compare January 16, 2026 17:07
Updates the Plan mode unit test to use the correct MockTool constructor signature and proper type casting for the mocked PolicyEngine.
@github-actions
Size Change: +375 B (0%)

Total Size: 23.1 MB

Filename                                        Size      Change
./bundle/gemini.js                              23.1 MB   +375 B (0%)
./bundle/sandbox-macos-permissive-closed.sb     1.03 kB   0 B
./bundle/sandbox-macos-permissive-open.sb       890 B     0 B
./bundle/sandbox-macos-permissive-proxied.sb    1.31 kB   0 B
./bundle/sandbox-macos-restrictive-closed.sb    3.29 kB   0 B
./bundle/sandbox-macos-restrictive-open.sb      3.36 kB   0 B
./bundle/sandbox-macos-restrictive-proxied.sb   3.56 kB   0 B

compressed-size-action

@jerop jerop enabled auto-merge January 16, 2026 17:17
@jacob314 jacob314 left a comment

lgtm

@jerop jerop added this pull request to the merge queue Jan 16, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 16, 2026
@jerop jerop added this pull request to the merge queue Jan 16, 2026
Merged via the queue into main with commit 5241174 Jan 16, 2026
43 of 44 checks passed
@jerop jerop deleted the feat/plan-mode-refinement branch January 16, 2026 18:08
Linked issue (may be closed by merging this pull request): Configure Plan Mode Policy Rules