Skip to content

Conversation

@olesho
Copy link
Collaborator

@olesho olesho commented Nov 20, 2025

Summary by CodeRabbit

  • New Features

    • Added JavaScript execution via a new /page/execute API endpoint
    • Page content retrieval can now optionally include iframe content
    • Responses may include a frame count when iframe inclusion is used
  • Documentation

    • Added full API docs and examples for the new execution endpoint and iframe options
    • Updated server docs to reflect extended page content behavior

✏️ Tip: You can customize this high-level summary in your review settings.

@olesho olesho requested a review from tysonthomas9 November 20, 2025 00:31
@claude
Copy link

claude bot commented Nov 20, 2025

Claude finished @olesho's task —— View job


PR Review: Feat/capture iframes

I've completed a thorough review of this pull request that adds iframe capturing functionality to the browser-operator-core project.

Todo List

  • Read repository CLAUDE.md for guidelines
  • Review agent-server/README.md changes
  • Review agent-server/nodejs/CLAUDE.md changes
  • Review agent-server/nodejs/src/api-server.js changes
  • Review agent-server/nodejs/src/lib/BrowserAgentServer.js changes
  • Check for test coverage
  • Provide comprehensive feedback summary

Review Summary

This PR successfully implements iframe capture functionality across the agent-server API. The implementation is well-structured and follows the existing architectural patterns.

Code Quality and Best Practices ✅

Strengths:

  • Clean API design: The includeIframes parameter in /page/content endpoint is intuitive and backward-compatible
  • Proper error handling: Each iframe capture failure is logged but doesn't break the entire operation
  • Comprehensive logging: Excellent logging throughout the capture process for debugging
  • Documentation: Both README.md and CLAUDE.md are thoroughly updated with clear examples

Areas for improvement:

  • Method signature inconsistency: getPageText doesn't support iframe options while getPageHTML does (agent-server/nodejs/src/lib/BrowserAgentServer.js:1277)
  • Consider adding iframe support to text extraction for consistency

Potential Bugs and Issues 🟡

Minor Issues:

  1. Error handling: In captureAllFramesHTML method, failed frames are logged but silently skipped. Consider adding a summary of skipped frames in the response for transparency
  2. Resource cleanup: Multiple CDP connections are created for frame contexts but there's no explicit cleanup mentioned for the isolated worlds created via Page.createIsolatedWorld

Recommendations:

  • Add a skippedFrames counter to the response metadata
  • Consider implementing frame capture timeouts to prevent hanging on problematic frames

Performance Considerations 🟡

Concerns:

  1. Sequential frame processing: Frames are captured sequentially which could be slow for pages with many iframes (BrowserAgentServer.js:1191-1241)
  2. Memory usage: Concatenating all frame HTML in memory could be problematic for large pages with many frames

Suggestions:

  • Consider parallel frame capture with Promise.all() for better performance
  • Add configuration limits for maximum frames to capture
  • Consider streaming or chunked responses for very large combined HTML

Security Considerations ✅

Good practices observed:

  • Uses CDP's isolated world execution contexts for frame access
  • Proper iframe sandboxing through CDP rather than direct DOM manipulation
  • No direct script injection or eval usage

Test Coverage ❌

Critical Gap:

  • No test files found in the entire agent-server directory
  • No unit tests for the new iframe capture functionality
  • No integration tests for the enhanced /page/content endpoint

Recommendations:

  • Add unit tests for captureAllFramesHTML() method
  • Add integration tests for the /page/content endpoint with includeIframes=true
  • Consider adding test cases for error scenarios (unreachable frames, permission issues)

Implementation Details Review

API Server Changes (api-server.js:287-324):

  • Clean parameter extraction with sensible defaults
  • Proper error messaging for required fields
  • Good logging for debugging

Browser Agent Server Changes (BrowserAgentServer.js:1113-1255):

  • Well-structured recursive frame tree traversal
  • Appropriate use of CDP's Page.getFrameTree and isolated world execution
  • Good separation between main frame and child frame handling

Documentation Quality ✅

Excellent documentation updates:

  • README.md examples are clear and practical
  • CLAUDE.md includes comprehensive API documentation
  • Proper parameter documentation with types and defaults

Final Recommendation

APPROVE with minor suggestions - This is a solid implementation that adds valuable functionality while maintaining code quality and following existing patterns. The main concern is the lack of test coverage, which should be addressed in a follow-up PR.

Suggested Follow-up Tasks

  1. Add comprehensive test suite for iframe capture functionality
  2. Consider adding iframe support to getPageText() method for consistency
  3. Implement performance optimizations for parallel frame capture
  4. Add configuration options for frame capture limits

@coderabbitai
Copy link

coderabbitai bot commented Nov 20, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Adds a new POST /page/execute API documented for remote JavaScript execution and extends page content retrieval with an includeIframes option. BrowserAgentServer gains traversal helpers to capture and aggregate HTML from all frames and to report a frameCount.

Changes

Cohort / File(s) Summary
Documentation / API
agent-server/README.md, agent-server/nodejs/CLAUDE.md
Documented new POST /page/execute endpoint (request/response schema, examples) and expanded /page/content docs to include includeIframes, frameCount, and usage notes.
API Server
agent-server/nodejs/src/api-server.js
getPageContent now accepts includeIframes from payload, forwards it to page fetchers, logs the flag, and includes frameCount in responses when present.
Browser Agent Server
agent-server/nodejs/src/lib/BrowserAgentServer.js
getPageHTML(tabId, options = {}) accepts includeIframes. Added captureAllFramesHTML(tabId, frameTree) to recursively collect frame HTML (with markers) using isolated worlds, and countFrames(frameTree) to compute total frames. Responses may include aggregated content and frameCount.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant APIServer
    participant BrowserAgentServer
    participant CDP as Chrome DevTools Protocol

    Client->>APIServer: POST /page/content (includeIframes: true)
    APIServer->>BrowserAgentServer: getPageHTML(tabId, { includeIframes: true })

    alt includeIframes = true
        BrowserAgentServer->>CDP: Page.getFrameTree()
        CDP-->>BrowserAgentServer: frameTree
        loop each frame
            BrowserAgentServer->>CDP: createIsolatedWorld / Runtime.evaluate(outerHTML) in frame
            CDP-->>BrowserAgentServer: frame HTML
            BrowserAgentServer->>BrowserAgentServer: append HTML with frame markers
        end
        BrowserAgentServer->>BrowserAgentServer: countFrames(frameTree)
        BrowserAgentServer-->>APIServer: { content: aggregatedHTML, frameCount, timestamp }
    else includeIframes = false
        BrowserAgentServer->>CDP: Runtime.evaluate(outerHTML) (main frame)
        CDP-->>BrowserAgentServer: main frame HTML
        BrowserAgentServer-->>APIServer: { content, timestamp }
    end

    APIServer-->>Client: HTTP response (result, frameCount?, timestamp)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

  • Review points:
    • agent-server/nodejs/src/lib/BrowserAgentServer.js: validate recursive frame traversal, isolated world creation, error handling per-frame, HTML concatenation and marker format.
    • agent-server/nodejs/src/api-server.js: ensure includeIframes is correctly threaded, logged, and frameCount surfaced only when appropriate.
    • Docs: confirm /page/execute and /page/content examples and schemas match implementation.

Possibly related PRs

  • Feat/capture iframes #72 — Adds iframe-capture support and frame counting; closely related to includeIframes, captureAllFramesHTML, and countFrames changes.

Suggested reviewers

  • tysonthomas9

Poem

🐰 Through nested frames I nimbly creep,
I gather HTML where rabbits leap,
Markers placed, each frame in line,
Execute the script — the results are mine,
Hopping pages, one snapshot at a time!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Feat/capture iframes' directly describes the main feature addition across all modified files: capturing HTML content from iframes in addition to the main frame.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d859b2 and ba87432.

📒 Files selected for processing (2)
  • agent-server/nodejs/CLAUDE.md (4 hunks)
  • agent-server/nodejs/src/lib/BrowserAgentServer.js (2 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
agent-server/README.md (1)

188-211: Document the includeIframes parameter (same issue as CLAUDE.md).

Similar to the CLAUDE.md documentation, the /page/content endpoint documentation should include the includeIframes parameter. See the review comment on CLAUDE.md lines 216-227 for the suggested documentation addition.

🧹 Nitpick comments (3)
agent-server/nodejs/src/lib/BrowserAgentServer.js (1)

1207-1211: Consider uniqueness for isolated world names.

The hardcoded world name 'iframe-capture' could cause issues if multiple concurrent calls to captureAllFramesHTML occur on the same tab, as each frame will try to create an isolated world with the same name.

Consider appending a unique identifier:

           const contextResult = await this.sendCDPCommandToTarget(tabId, 'Page.createIsolatedWorld', {
             frameId: frame.id,
             grantUniversalAccess: true,
-            worldName: 'iframe-capture'
+            worldName: `iframe-capture-${Date.now()}-${i}`
           });
agent-server/nodejs/src/api-server.js (2)

287-287: Add type validation for includeIframes parameter.

The includeIframes parameter is not validated. Non-boolean values could cause unexpected behavior due to JavaScript's truthy/falsy coercion (e.g., "false" would be truthy).

Add validation after line 299:

     if (!['html', 'text'].includes(format)) {
       throw new Error('Format must be either "html" or "text"');
     }
+
+    if (includeIframes !== undefined && typeof includeIframes !== 'boolean') {
+      throw new Error('includeIframes must be a boolean value');
+    }

306-308: Document or handle that includeIframes only works with HTML format.

Looking at BrowserAgentServer.js, only the getPageHTML method implements the includeIframes feature. The getPageText method doesn't have this parameter. Passing includeIframes: true with format: 'text' will silently ignore the option, which could confuse users.

Consider either:

  1. Adding a validation check
  2. Documenting this limitation in the endpoint docs

Option 1 (validation):

+    if (includeIframes && format === 'text') {
+      throw new Error('includeIframes is only supported with format="html"');
+    }
+
     // Call appropriate method based on format
     const result = format === 'html'
       ? await this.browserAgentServer.getPageHTML(tabId, { includeIframes })
       : await this.browserAgentServer.getPageText(tabId, { includeIframes });

Option 2 (only pass to HTML):

     // Call appropriate method based on format
     const result = format === 'html'
       ? await this.browserAgentServer.getPageHTML(tabId, { includeIframes })
-      : await this.browserAgentServer.getPageText(tabId, { includeIframes });
+      : await this.browserAgentServer.getPageText(tabId);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ab6446 and 1d859b2.

📒 Files selected for processing (4)
  • agent-server/README.md (1 hunks)
  • agent-server/nodejs/CLAUDE.md (4 hunks)
  • agent-server/nodejs/src/api-server.js (2 hunks)
  • agent-server/nodejs/src/lib/BrowserAgentServer.js (2 hunks)
🧰 Additional context used
🪛 ESLint
agent-server/nodejs/src/lib/BrowserAgentServer.js

[error] 1169-1169: There must be no hyphen before @param description.

(jsdoc/require-hyphen-before-param-description)


[error] 1170-1170: There must be no hyphen before @param description.

(jsdoc/require-hyphen-before-param-description)


[error] 1177-1177: Unexpected parentheses around single function argument.

(@stylistic/arrow-parens)


[error] 1259-1259: There must be no hyphen before @param description.

(jsdoc/require-hyphen-before-param-description)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: claude-review
🔇 Additional comments (7)
agent-server/nodejs/CLAUDE.md (1)

228-317: LGTM! Comprehensive endpoint documentation.

The /page/execute endpoint documentation is thorough and well-structured, including:

  • Clear parameter descriptions
  • Response field explanations
  • Multiple practical examples
  • Use cases that help users understand when to use this endpoint
agent-server/nodejs/src/lib/BrowserAgentServer.js (4)

1113-1165: LGTM! Clean implementation with backward compatibility.

The enhanced getPageHTML method properly:

  • Maintains backward compatibility with includeIframes defaulting to false
  • Branches cleanly between main-frame-only and iframe-inclusive paths
  • Returns appropriate metadata (frameCount) when iframes are included
  • Includes clear logging for both paths

1233-1240: LGTM! Appropriate error handling for frame capture failures.

Individual frame capture failures are logged as warnings but don't fail the entire operation. This is the right approach since:

  • Some iframes may be cross-origin and inaccessible
  • Partial content is better than no content
  • The main frame capture (which is most important) is handled separately

1262-1270: LGTM! Clean recursive frame counting.

The countFrames method correctly implements recursive counting with proper base case and accumulation.


1207-1211: grantUniversalAccess: true usage verified as safe for the current read-only application.

The isolated world is used exclusively to execute a single, hardcoded DOM query (document.documentElement.outerHTML) with no dynamic code execution or untrusted input. While grantUniversalAccess: true does break same-origin isolation and allow cross-origin DOM access/modification—creating a significant security capability, the risk here is minimal because:

  • Only a static, read-only expression evaluates in the context
  • Results are returned as HTML strings with no code execution
  • No untrusted data flows into the isolated world

The current implementation is safe. No changes required.

agent-server/nodejs/src/api-server.js (1)

310-324: LGTM! Clean response construction with optional frameCount.

The response construction properly handles the optional frameCount field, only including it when present in the result. This maintains backward compatibility while exposing new metadata.

agent-server/README.md (1)

213-297: LGTM! Complete and user-friendly endpoint documentation.

The /page/execute documentation is excellent:

  • Clear parameter descriptions with defaults
  • Detailed response field explanations
  • Multiple practical curl examples
  • Well-explained use cases
  • Proper context about how it complements /page/content

const frames = [];

// Helper to recursively collect frames
const collectFrames = (node) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix ESLint arrow-parens violation.

The single-parameter arrow function has unnecessary parentheses.

Apply this diff:

-    const collectFrames = (node) => {
+    const collectFrames = node => {
       frames.push(node.frame);
       if (node.childFrames) {
         node.childFrames.forEach(collectFrames);
       }
     };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const collectFrames = (node) => {
const collectFrames = node => {
frames.push(node.frame);
if (node.childFrames) {
node.childFrames.forEach(collectFrames);
}
};
🧰 Tools
🪛 ESLint

[error] 1177-1177: Unexpected parentheses around single function argument.

(@stylistic/arrow-parens)

🤖 Prompt for AI Agents
In agent-server/nodejs/src/lib/BrowserAgentServer.js around line 1177, the arrow
function parameter is wrapped in unnecessary parentheses causing an ESLint
arrow-parens violation; change the declaration from using parentheses to the
single-parameter form (remove the surrounding parentheses so it reads node =>
...) and ensure the file still passes linting after the update.

@olesho olesho merged commit c2f55c3 into main Nov 22, 2025
1 of 2 checks passed
@claude
Copy link

claude bot commented Nov 22, 2025

Claude finished @olesho's task —— View job


PR Review: Feat/capture iframes

Todo List

  • Read repository CLAUDE.md for guidelines
  • Review agent-server/README.md changes
  • Review agent-server/nodejs/CLAUDE.md changes
  • Review agent-server/nodejs/src/api-server.js changes
  • Review agent-server/nodejs/src/lib/BrowserAgentServer.js changes
  • Check for test coverage
  • Provide comprehensive feedback summary

Review Summary

This PR successfully implements iframe capturing functionality across the agent-server API. The implementation is well-structured and follows the existing architectural patterns.

Code Quality and Best Practices ✅

Strengths:

  • Clean API design: The includeIframes parameter in /page/content endpoint is intuitive and backward-compatible
  • Proper error handling: Each iframe capture failure is logged but doesn't break the entire operation
  • Comprehensive logging: Excellent logging throughout the capture process for debugging
  • Documentation: Both README.md and CLAUDE.md are thoroughly updated with clear examples

Areas for improvement:

  • Method signature inconsistency: getPageText doesn't support iframe options while getPageHTML does (agent-server/nodejs/src/lib/BrowserAgentServer.js:1277)
  • Consider adding iframe support to text extraction for consistency

Potential Bugs and Issues 🟡

Minor Issues:

  1. Error handling: In captureAllFramesHTML method, failed frames are logged but silently skipped. Consider adding a summary of skipped frames in the response for transparency
  2. Resource cleanup: Multiple CDP connections are created for frame contexts but there's no explicit cleanup mentioned for the isolated worlds created via Page.createIsolatedWorld

Recommendations:

  • Add a skippedFrames counter to the response metadata
  • Consider implementing frame capture timeouts to prevent hanging on problematic frames

Performance Considerations 🟡

Concerns:

  1. Sequential frame processing: Frames are captured sequentially which could be slow for pages with many iframes (BrowserAgentServer.js:1191-1241)
  2. Memory usage: Concatenating all frame HTML in memory could be problematic for large pages with many frames

Suggestions:

  • Consider parallel frame capture with Promise.all() for better performance
  • Add configuration limits for maximum frames to capture
  • Consider streaming or chunked responses for very large combined HTML

Security Considerations ✅

Good practices observed:

  • Uses CDP's isolated world execution contexts for frame access
  • Proper iframe sandboxing through CDP rather than direct DOM manipulation
  • No direct script injection or eval usage

Test Coverage ❌

Critical Gap:

  • No test files found in the entire agent-server directory
  • No unit tests for the new iframe capture functionality
  • No integration tests for the enhanced /page/content endpoint

Recommendations:

  • Add unit tests for captureAllFramesHTML() method
  • Add integration tests for the /page/content endpoint with includeIframes=true
  • Consider adding test cases for error scenarios (unreachable frames, permission issues)

Implementation Details Review

API Server Changes (api-server.js:287-324):

  • Clean parameter extraction with sensible defaults
  • Proper error messaging for required fields
  • Good logging for debugging

Browser Agent Server Changes (BrowserAgentServer.js:1113-1255):

  • Well-structured recursive frame tree traversal
  • Appropriate use of CDP's Page.getFrameTree and isolated world execution
  • Good separation between main frame and child frame handling

Documentation Quality ✅

Excellent documentation updates:

  • README.md examples are clear and practical
  • CLAUDE.md includes comprehensive API documentation
  • Proper parameter documentation with types and defaults

Final Recommendation

APPROVE with minor suggestions - This is a solid implementation that adds valuable functionality while maintaining code quality and following existing patterns. The main concern is the lack of test coverage, which should be addressed in a follow-up PR.

Suggested Follow-up Tasks

  1. Add comprehensive test suite for iframe capture functionality
  2. Consider adding iframe support to getPageText() method for consistency
  3. Implement performance optimizations for parallel frame capture
  4. Add configuration options for frame capture limits

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants