Update planner to verify screenshot first. by arnoldlaishram · Pull Request #54 · final-run/finalrun-agent

arnoldlaishram · 2026-04-05T05:52:51Z

The thinking level is also changed: planner to high and grounder to medium.

Summary by CodeRabbit

Bug Fixes
- Enhanced screenshot validation to prevent actions targeting invisible or mispositioned elements
- Strengthened hierarchy validation to ensure only visually confirmed elements are targeted
- Raised the planner's thinking level from 'medium' to 'high' so the planning phase reasons more thoroughly before choosing an action


coderabbitai · 2026-04-05T05:53:06Z

📝 Walkthrough

Two files in the goal-executor package are updated. `AIAgent.ts` raises the Google provider thinking level from 'medium' to 'high' for the planner phase (and from 'minimal' to 'medium' for other phases), and the planner prompt now enforces screenshot-first validation before issuing tap, long_press, or input_text actions.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Planner Enhancements: `packages/goal-executor/src/ai/AIAgent.ts`, `packages/goal-executor/src/prompts/planner.md` | Increased Google provider thinking level from 'medium' to 'high' for planner phase; added screenshot-first validation rules to planner prompt requiring visual confirmation of targets before issuing tap, long_press, or input_text actions, treating invisible targets as "ghosts" and strengthening post_action_hierarchy validation guidance. |
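The screenshot-first rule lives in the planner prompt rather than in code, but its logic can be illustrated as a guard. The following is a minimal TypeScript sketch under stated assumptions: the `PlannedAction` and `ScreenState` types, the `validateScreenshotFirst` function, and the element ids are all hypothetical and do not appear in the PR.

```typescript
// Hypothetical illustration of the prompt rule; none of these names come from the PR.
type ActionType = 'tap' | 'long_press' | 'input_text' | 'scroll' | 'wait';

interface PlannedAction {
  type: ActionType;
  targetId?: string; // element id taken from the UI hierarchy (assumed shape)
}

interface ScreenState {
  hierarchyIds: string[];         // elements reported by the hierarchy dump
  visuallyConfirmedIds: string[]; // elements actually seen in the screenshot
}

interface Verdict {
  ok: boolean;
  reason?: string;
}

// Only these action types need a visually confirmed target.
const TARGETED_ACTIONS: ActionType[] = ['tap', 'long_press', 'input_text'];

function validateScreenshotFirst(action: PlannedAction, screen: ScreenState): Verdict {
  // Actions without a target (scroll, wait) are not subject to the rule.
  if (TARGETED_ACTIONS.indexOf(action.type) === -1) {
    return { ok: true };
  }
  if (action.targetId === undefined) {
    return { ok: false, reason: 'missing target' };
  }
  // The screenshot is checked first; the hierarchy alone is never enough.
  if (screen.visuallyConfirmedIds.indexOf(action.targetId) === -1) {
    const inHierarchy = screen.hierarchyIds.indexOf(action.targetId) !== -1;
    return {
      ok: false,
      reason: inHierarchy
        ? 'ghost: target is in the hierarchy but not visible in the screenshot'
        : 'unknown target',
    };
  }
  return { ok: true };
}
```

In this sketch, a tap on an element that exists only in the hierarchy is rejected as a "ghost", which is the behavior the prompt change asks the planner to adopt.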

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

🐰 The planner thinks deeper than before,
With screenshots to light the way,
No ghost-taps shall knock upon the door—
We see, then act, hip-hip-hooray! 🐇✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change in the planner prompt, which implements screenshot-first validation. However, it omits the other significant change: adjusting the thinking level in AIAgent.ts from medium to high for the planner phase. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/goal-executor/src/ai/AIAgent.ts`:
- Line 420: The planner behavior was changed in AIAgent (thinkingLevel set to
'high' when phase === 'planner'), so update the test in AIAgent.test (the
assertion around the planner case that currently expects 'medium' between lines
~133-148) to expect 'high' instead; locate the test that inspects the agent's
thinkingLevel for the 'planner' phase (look for references to thinkingLevel or
phase === 'planner' in the test) and change the expected value to 'high' so
tests match the new behavior.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7c5869b0-52ef-4abb-a1ec-471ee1061899

📥 Commits

Reviewing files that changed from the base of the PR and between 22b53a3 and 27058c0.

📒 Files selected for processing (2)

packages/goal-executor/src/ai/AIAgent.ts
packages/goal-executor/src/prompts/planner.md

coderabbitai · 2026-04-05T05:55:51Z

packages/goal-executor/src/ai/AIAgent.ts

          google: {
            thinkingConfig: {
-              thinkingLevel: phase === 'planner' ? 'medium' : 'minimal',
+              thinkingLevel: phase === 'planner' ? 'high' : 'medium',


⚠️ Potential issue | 🟠 Major

Update planner reasoning-level test expectation to match this behavior change.

Line 420 intentionally changes the planner thinkingLevel to 'high', but packages/goal-executor/src/ai/AIAgent.test.ts (lines 133-148 in the provided snippet) still asserts 'medium'. This will cause a test failure and leave behavior, docs, and tests out of sync.

✅ Suggested test update

 test('AIAgent uses medium Gemini 3 reasoning defaults for planner calls', () => {
   const providerOptions = getProviderOptions({
     provider: 'google',
     modelName: 'gemini-3.1-pro-preview',
     phase: 'planner',
   });

   assert.deepEqual(providerOptions, {
     google: {
       thinkingConfig: {
-        thinkingLevel: 'medium',
+        thinkingLevel: 'high',
         includeThoughts: false,
       },
     },
   });
 });
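To make the mismatch concrete, here is a minimal, self-contained TypeScript sketch of the phase-to-thinking-level mapping shown in the diff. It is not the repository's `getProviderOptions`: the function name `sketchProviderOptions`, the `Phase` union (with 'grounder' taken from the PR description), and the assumption that `includeThoughts` is `false` for every phase are all illustrative.

```typescript
// Hypothetical re-creation; the real getProviderOptions in AIAgent.ts may differ.
type Phase = 'planner' | 'grounder'; // assumed set of phases
type ThinkingLevel = 'minimal' | 'medium' | 'high';

interface GoogleProviderOptions {
  google: {
    thinkingConfig: {
      thinkingLevel: ThinkingLevel;
      includeThoughts: boolean;
    };
  };
}

function sketchProviderOptions(phase: Phase): GoogleProviderOptions {
  return {
    google: {
      thinkingConfig: {
        // Mirrors the changed line in the diff:
        // planner -> 'high', every other phase -> 'medium'.
        thinkingLevel: phase === 'planner' ? 'high' : 'medium',
        // Assumed constant; the source only shows this value for the planner case.
        includeThoughts: false,
      },
    },
  };
}
```

With this mapping, an assertion that still expects 'medium' for the planner phase fails, which is why the test expectation has to change together with the code. The test title in the suggested update still says "medium", so renaming it may also be worth considering.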


Commit 27058c0: Update planner to verify screenshot first. also the thinking level is changed for planner to high and grounder to medium

coderabbitai bot reviewed Apr 5, 2026

droid-ash self-requested a review April 5, 2026 06:01

droid-ash assigned arnoldlaishram Apr 5, 2026

droid-ash approved these changes Apr 5, 2026

droid-ash merged commit a951d9f into main Apr 5, 2026
1 check passed

droid-ash mentioned this pull request Apr 6, 2026

chore: bump version to 0.1.4 #65

Merged

2 tasks

coderabbitai bot mentioned this pull request Apr 12, 2026

Restructure planner prompt for clarity and stricter retry rules #84

Open

3 tasks

droid-ash merged 1 commit into main from update-planner-to-rely-on-vision-first
