Description
App Version
3.25.17
API Provider
AWS Bedrock
Model Used
Claude 3.7 Sonnet
Roo Code Task Links (Optional)
Browser Action Tool: Screenshots Display as Base64 Text After ~5 Actions
Bug Description
After approximately 5 browser actions, the browser_action tool stops displaying screenshots as images and instead shows massive walls of base64-encoded text in the chat interface. This makes browser automation workflows extremely difficult to follow and debug.
Expected Behavior
Screenshots should consistently display as rendered images throughout the entire browser session, providing visual feedback for each action.
Actual Behavior
- First ~5 actions: Screenshots display correctly as images
- Subsequent actions: Screenshots appear as large blocks of base64 text like:
{"screenshot":"data:image/webp;base64,UklGRjAKAABXRUJQVlA4ICQKAADwJACdASqEA..."}
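For illustration, here is a minimal sketch of how a chat view could recognize this kind of payload and hand the data URI to an image element instead of printing it. The function name `extractScreenshotDataUri` and the accepted image formats are assumptions made for this example; none of this is taken from Roo Code's actual source.

```typescript
// Hypothetical helper: not part of Roo Code's actual codebase.
// Pulls a screenshot data URI out of a JSON text payload, if one is present.
const DATA_URI_PATTERN = /^data:image\/(png|jpeg|webp);base64,[A-Za-z0-9+/]+={0,2}$/;

function extractScreenshotDataUri(text: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return null; // not JSON, so treat it as ordinary chat text
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const screenshot = (parsed as Record<string, unknown>)["screenshot"];
  if (typeof screenshot !== "string" || !DATA_URI_PATTERN.test(screenshot)) {
    return null; // no usable screenshot field
  }
  return screenshot;
}
```

A caller could use a non-null result as the `src` of an image element and fall back to plain text only when the function returns null.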
Impact
- Severe usability degradation: Walls of base64 text flood the interface, making it nearly impossible to read conversation history
- Loss of visual debugging: Cannot see what's happening on the page during automation
- Workflow disruption: Long browser sessions become unusable due to interface clutter
- Poor user experience: The core visual feedback feature becomes counterproductive
Steps to Reproduce
- Launch a browser session with browser_action
- Perform 5+ sequential actions (click, type, scroll, etc.)
- Observe that screenshots transition from images to base64 text blocks
Environment
- Roo Code Version: 3.25.17
- OS: Windows 10
Suggested Solutions
- Maintain consistent image rendering: Continue displaying screenshots as images regardless of session length
- Add configuration option: Allow users to control screenshot display format
- Implement smart fallbacks: If memory is a concern, consider thumbnail previews instead of base64 dumps
- Session management: Provide automatic browser session reset options to maintain visual feedback
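As a rough sketch of the "smart fallback" idea, the snippet below collapses any inline base64 image data in a message into a short placeholder, so that a failed render never floods the chat. The function name `collapseInlineImages` and the placeholder format are hypothetical; this is one possible approach under those assumptions, not how Roo Code renders messages.

```typescript
// Hypothetical fallback: not Roo Code's actual rendering code.
// Replaces inline base64 image data in a message with a short placeholder.
const INLINE_IMAGE = /data:image\/([a-z0-9.+-]+);base64,([A-Za-z0-9+/]+={0,2})/g;

function collapseInlineImages(message: string): string {
  return message.replace(
    INLINE_IMAGE,
    (_match: string, format: string, data: string): string => {
      // Every 4 base64 characters encode 3 bytes; padding is ignored in this estimate.
      const approxKiB = Math.ceil((data.length * 3) / 4 / 1024);
      return `[screenshot: ${format}, ~${approxKiB} KiB]`;
    }
  );
}
```

Text without inline image data passes through unchanged, so the function can be applied to every message before display.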
Workarounds Currently Used
- Manually restarting browser sessions every 4-5 actions
- Breaking complex workflows into smaller tasks
- Avoiding longer browser automation sequences
Priority
High - This significantly impacts the usability of browser automation, which is a core feature of Roo Code.
Additional Notes: The base64 payload suggests the screenshots are still being captured correctly (the sample begins with "UklGR", which is how the "RIFF" signature of a WebP file appears in base64). The likely cause is a UI rendering threshold or memory management issue that makes the display fall back to raw data instead of rendered images.
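One way to check the capture side independently is to decode the first few bytes of the payload and look for the WebP container signature. The sketch below is a hypothetical diagnostic named `looksLikeWebp`; it assumes an environment where the global `atob` exists (browsers and recent Node.js versions).

```typescript
// Hypothetical diagnostic, written for illustration only.
// Checks whether base64 data begins with a WebP container header:
// bytes 0-3 are "RIFF", bytes 4-7 hold the file size, bytes 8-11 are "WEBP".
function looksLikeWebp(base64Data: string): boolean {
  if (base64Data.length < 16) {
    return false; // 16 base64 characters are needed to cover 12 bytes
  }
  try {
    // Decode only the first 16 characters (exactly 12 bytes, no padding needed).
    const header = atob(base64Data.slice(0, 16));
    return header.slice(0, 4) === "RIFF" && header.slice(8, 12) === "WEBP";
  } catch {
    return false; // atob throws on characters outside the base64 alphabet
  }
}
```

Applied to the visible prefix of the sample in this report, the check passes, which is consistent with the screenshots being captured correctly and the problem lying in how they are displayed.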