
Update planner to verify screenshot first. #54

Merged
droid-ash merged 1 commit into main from update-planner-to-rely-on-vision-first
Apr 5, 2026

Conversation

@arnoldlaishram
Contributor

@arnoldlaishram arnoldlaishram commented Apr 5, 2026

The thinking level is also changed: the planner now uses high and the grounder uses medium.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced screenshot validation to prevent actions targeting invisible or mispositioned elements
    • Strengthened hierarchy validation to ensure only visually confirmed elements are targeted
    • Improved planning phase with more thorough analytical processing for better decision-making

… changed for planner to high and grounder to medium
@coderabbitai

coderabbitai bot commented Apr 5, 2026

📝 Walkthrough

Two files are updated in the goal-executor package: AIAgent.ts raises the Google provider thinking level from 'medium' to 'high' for the planner phase and from 'minimal' to 'medium' for all other phases, and the planner prompt now enforces screenshot-first validation before issuing tap, long_press, or input_text actions.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Planner Enhancements: `packages/goal-executor/src/ai/AIAgent.ts`, `packages/goal-executor/src/prompts/planner.md` | Increased Google provider thinking level from 'medium' to 'high' for planner phase; added screenshot-first validation rules to planner prompt requiring visual confirmation of targets before issuing tap, long_press, or input_text actions, treating invisible targets as "ghosts" and strengthening post_action_hierarchy validation guidance. |
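
The screenshot-first rule lives in the planner prompt rather than in code, but its intent can be illustrated as a guard function. The following is a minimal sketch, not the repository's implementation: the `PlannedAction` and `HierarchyNode` shapes, the `visibleInScreenshot` flag, the `swipe` action, and the `isGhostTarget` helper are all hypothetical names introduced for illustration. Only the three targeted action names (tap, long_press, input_text) come from the PR summary.

```typescript
// Hypothetical types for illustration only; the planner's real data model is not shown in this PR.
// 'swipe' is an assumed example of an action that needs no element target.
type ActionType = 'tap' | 'long_press' | 'input_text' | 'swipe';

interface HierarchyNode {
  id: string;
  // Assumed flag: true only when the node was visually confirmed in the screenshot.
  visibleInScreenshot: boolean;
}

interface PlannedAction {
  type: ActionType;
  targetId?: string;
}

// The three actions the PR summary says require visual confirmation of their target.
const TARGETED_ACTIONS = new Set<ActionType>(['tap', 'long_press', 'input_text']);

// Returns true when a targeted action points at a "ghost": a node that is
// missing from the hierarchy or present but not visible in the screenshot.
function isGhostTarget(action: PlannedAction, hierarchy: HierarchyNode[]): boolean {
  if (!TARGETED_ACTIONS.has(action.type)) {
    return false; // untargeted actions are outside the rule
  }
  const node = hierarchy.find((n) => n.id === action.targetId);
  return node === undefined || !node.visibleInScreenshot;
}
```

Under these assumptions, a planner loop could refuse to emit any action for which `isGhostTarget` returns true and re-plan from the screenshot instead.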

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

🐰 The planner thinks deeper than before,
With screenshots to light the way,
No ghost-taps shall knock upon the door—
We see, then act, hip-hip-hooray! 🐇✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change in the planner prompt, which implements screenshot-first validation. However, it omits the other significant change: adjusting the thinking level in AIAgent.ts from medium to high for the planner phase. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/goal-executor/src/ai/AIAgent.ts`:
- Line 420: The planner behavior was changed in AIAgent (thinkingLevel set to
'high' when phase === 'planner'), so update the test in AIAgent.test (the
assertion around the planner case that currently expects 'medium' between lines
~133-148) to expect 'high' instead; locate the test that inspects the agent's
thinkingLevel for the 'planner' phase (look for references to thinkingLevel or
phase === 'planner' in the test) and change the expected value to 'high' so
tests match the new behavior.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7c5869b0-52ef-4abb-a1ec-471ee1061899

📥 Commits

Reviewing files that changed from the base of the PR and between 22b53a3 and 27058c0.

📒 Files selected for processing (2)
  • packages/goal-executor/src/ai/AIAgent.ts
  • packages/goal-executor/src/prompts/planner.md

 google: {
   thinkingConfig: {
-    thinkingLevel: phase === 'planner' ? 'medium' : 'minimal',
+    thinkingLevel: phase === 'planner' ? 'high' : 'medium',

⚠️ Potential issue | 🟠 Major

Update planner reasoning-level test expectation to match this behavior change.

Line 420 intentionally changes the planner thinkingLevel to 'high', but packages/goal-executor/src/ai/AIAgent.test.ts (lines 133-148 in the provided snippet) still asserts 'medium'. This will cause a test failure and leave behavior, docs, and tests out of sync.

✅ Suggested test update
-test('AIAgent uses medium Gemini 3 reasoning defaults for planner calls', () => {
+test('AIAgent uses high Gemini 3 reasoning defaults for planner calls', () => {
   const providerOptions = getProviderOptions({
     provider: 'google',
     modelName: 'gemini-3.1-pro-preview',
     phase: 'planner',
   });

   assert.deepEqual(providerOptions, {
     google: {
       thinkingConfig: {
-        thinkingLevel: 'medium',
+        thinkingLevel: 'high',
         includeThoughts: false,
       },
     },
   });
 });
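
The diff also moves the non-planner branch from 'minimal' to 'medium', which the suggested update does not cover. The sketch below shows both outcomes side by side. It uses a hypothetical stand-in, `getProviderOptionsSketch`, that mirrors only the changed line. The real `getProviderOptions` in AIAgent.ts may handle more providers and options, and the `'grounder'` phase string is an assumption based on the PR description.

```typescript
// Hypothetical stand-in mirroring only the changed line of the diff.
// The 'grounder' phase name is assumed from the PR description.
type Phase = 'planner' | 'grounder';
type ThinkingLevel = 'minimal' | 'medium' | 'high';

interface GoogleProviderOptions {
  google: {
    thinkingConfig: {
      thinkingLevel: ThinkingLevel;
      includeThoughts: boolean;
    };
  };
}

function getProviderOptionsSketch(args: {
  provider: 'google';
  modelName: string;
  phase: Phase;
}): GoogleProviderOptions {
  return {
    google: {
      thinkingConfig: {
        // After this PR: planner -> 'high', every other phase -> 'medium'.
        thinkingLevel: args.phase === 'planner' ? 'high' : 'medium',
        includeThoughts: false,
      },
    },
  };
}

const plannerLevel = getProviderOptionsSketch({
  provider: 'google',
  modelName: 'gemini-3.1-pro-preview',
  phase: 'planner',
}).google.thinkingConfig.thinkingLevel; // 'high'

const grounderLevel = getProviderOptionsSketch({
  provider: 'google',
  modelName: 'gemini-3.1-pro-preview',
  phase: 'grounder',
}).google.thinkingConfig.thinkingLevel; // 'medium'
```

If the existing test file has an assertion for a non-planner phase that still expects 'minimal', it would need the same kind of update as the planner case.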

@droid-ash droid-ash merged commit a951d9f into main Apr 5, 2026
1 check passed
@droid-ash droid-ash mentioned this pull request Apr 6, 2026
2 tasks

2 participants