Skip to content

feat: inject snapshot bundle in browser context #672

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: develop
Choose a base branch
from

Conversation

amhsirak
Copy link
Member

@amhsirak amhsirak commented Jun 30, 2025

closes: #671 #656

Summary by CodeRabbit

Summary by CodeRabbit

  • Refactor
    • Removed network resource caching and monitoring features from browser management.
    • Simplified DOM snapshot process to rely on rrweb’s built-in options for inlining images and collecting fonts.
    • Replaced network-based triggers with a debounced network quiet period for DOM snapshots.
    • Streamlined initialization and snapshot logic for improved performance and maintainability.
  • Bug Fixes
    • Improved navigation reliability by updating page load wait strategy and adding a post-navigation delay.
  • Style
    • Simplified CSS injection in snapshot rendering, removing embedded font and stylesheet styling for cleaner iframe display.

Copy link

coderabbitai bot commented Jun 30, 2025

"""

Walkthrough

The changes remove all network resource caching and monitoring logic from the RemoteBrowser class. This includes deleting related fields, methods, event listeners, and snapshot processing that handled resource proxying. The DOM snapshot now uses rrweb's built-in options for inlining images and collecting fonts, with simplified initialization and snapshot logic.

Changes

File(s) Change Summary
server/src/browser-management/classes/RemoteBrowser.ts Removed all network resource caching/monitoring fields and methods; simplified rrweb snapshot processing; adjusted initialization and event listeners.
server/src/browser-management/inputHandlers.ts Modified navigation in handleChangeUrl to use waitUntil: "domcontentloaded" with shorter timeout and added fixed 2-second wait after navigation.
src/components/recorder/DOMBrowserRenderer.tsx Removed additional CSS injection for fonts and styles derived from snapshot resources; simplified injected CSS for iframe rendering.

Assessment against linked issues

Objective Addressed Explanation
Fix broken DOM UI for certain websites (#671)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes detected.

Suggested reviewers

  • amhsirak

Poem

A bunny hopped through lines of code,
Sweeping caches off the road.
No more watching network streams,
Just snapshots now—so light, it seems!
With rrweb's magic, fonts and pics in tow,
The browser's lighter—onward we go!
🐇✨
"""


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c924996 and 98770e3.

📒 Files selected for processing (3)
  • server/src/browser-management/classes/RemoteBrowser.ts (10 hunks)
  • server/src/browser-management/inputHandlers.ts (1 hunks)
  • src/components/recorder/DOMBrowserRenderer.tsx (4 hunks)
🔇 Additional comments (10)
server/src/browser-management/inputHandlers.ts (1)

495-496: Ensure navigation timing strategy is intentional and consistent

I see that in server/src/browser-management/inputHandlers.ts you’ve switched from

await page.goto(url, { waitUntil: "networkidle", timeout: 100000 });

to

await page.goto(url, { waitUntil: "domcontentloaded", timeout: 30000 });
await page.waitForTimeout(2000);

Meanwhile:

  • In maxun-core/src/interpret.ts you’re still using waitForNavigation({ waitUntil: 'networkidle', timeout: 15000 }).
  • In server/src/routes/proxy.ts you call page.goto with no options (which defaults to the 'load' event).

These differences can lead to inconsistent page-load behaviors and potential flakiness. Please verify that:

  • A 2-second hard wait post-DOMContentLoaded always suffices for your downstream logic.
  • You’re OK reducing the timeout from 100 s to 30 s (and to 15 s elsewhere).
  • If you truly no longer need to wait for network quiescence, document the reason and consider aligning or standardizing the strategy across modules.
  • Instead of a fixed waitForTimeout, injecting targeted waitForSelector/waitForLoadState('networkidle') calls might be more robust.
src/components/recorder/DOMBrowserRenderer.tsx (3)

299-299: LGTM: Code formatting improvements.

The addition of trailing commas improves code consistency and reduces diff noise in future changes.

Also applies to: 306-306, 311-311, 315-315


673-675: LGTM: Consistent formatting applied.

The formatting aligns with the rest of the codebase improvements.


796-826: CSS simplification aligns with architectural changes.

The removal of font-family, background color, and image styling from the enhanced CSS aligns with the backend changes that eliminated custom resource caching and proxying. The rrweb snapshotting now uses built-in options (inlineImages: true, collectFonts: true) to handle these resources directly.

This simplification should maintain visual fidelity while reducing complexity in the resource handling pipeline.

server/src/browser-management/classes/RemoteBrowser.ts (6)

201-203: Network timeout fields are now properly utilized.

These fields are appropriately used in the network monitoring logic (lines 445-466) for implementing network quiet period detection. The implementation correctly manages pending network requests and timeouts.


246-246: Good simplification of snapshot processing.

The removal of complex resource processing logic aligns well with the PR objective to simplify snapshot bundle injection. The method now returns the snapshot directly without unnecessary resource transformation.

Also applies to: 253-253


540-544: Excellent simplification of rrweb snapshot capture.

The simplified approach using rrweb's built-in inlineImages and collectFonts options is much cleaner than the previous complex resource processing. This should improve both performance and maintainability while ensuring essential resources are still captured.


600-605: Proper cleanup of network monitoring state.

The cleanup logic correctly handles network request timeouts and pending requests when DOM streaming stops, ensuring no memory leaks or stale state.


619-621: Initialization order correctly fixed.

The rrweb initialization script is now properly added to the context before creating the page, which resolves the critical initialization order bug identified in previous reviews.


706-710: Good defensive programming practice.

The page closure check prevents errors when attempting to extract list data from a closed page. The early return with an empty array and warning log is the appropriate handling for this edge case.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@amhsirak amhsirak added the Scope: Recorder All issues/PRs related to recorder label Jun 30, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (3)
server/src/browser-management/classes/RemoteBrowser.ts (3)

278-293: Remove references to deleted networkResourceCache

The processCSS method still references this.networkResourceCache which has been removed according to the changes. This will cause runtime errors.

Since network resource caching has been removed, this code block should be updated to either:

  1. Remove the caching logic entirely
  2. Use an alternative approach for handling font resources

Consider removing or refactoring this entire block that depends on the removed cache.


495-609: Remove dead code methods

The shouldCacheResource and cleanupResourceCache methods are no longer used after the removal of network resource caching functionality.

Remove these unused methods to keep the codebase clean:

-    /**
-     * Check if a resource should be cached based on its MIME type and URL
-     * @private
-     */
-    private shouldCacheResource(mimeType: string, url: string): boolean {
-      // ... entire method body
-    }
-
-    /**
-     * Clean up old cached resources to prevent memory leaks
-     * @private
-     */
-    private cleanupResourceCache(): void {
-      // ... entire method body  
-    }

958-971: Clean up network monitoring initialization logic

The conditional blocks still check for this.client and have comments about initializing network monitoring, but the initializeNetworkMonitoring method has been removed.

Remove the dead code paths:

                 if (this.client) {
-                  await this.initializeNetworkMonitoring();
-                  logger.info("Network monitoring initialized successfully");
+                  // CDP client initialized successfully
                 }
🧹 Nitpick comments (1)
server/src/browser-management/classes/RemoteBrowser.ts (1)

772-784: Improve documentation for commented code

The comment "DO NOT REMOVE THIS CODE - MIGHT BE NEEDED LATER" is too vague. Consider adding more context about when and why this code might be re-enabled.

Replace the comment with a more descriptive TODO:

-      // DO NOT REMOVE THIS CODE - MIGHT BE NEEDED LATER
+      // TODO: Re-enable response-based DOM snapshot triggers when implementing 
+      // dynamic content monitoring for SPAs. This was disabled to reduce overhead
+      // after removing network resource caching. 
+      // Related issue: [add issue number if applicable]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 64d53fd and 830bf15.

📒 Files selected for processing (1)
  • server/src/browser-management/classes/RemoteBrowser.ts (5 hunks)
🔇 Additional comments (1)
server/src/browser-management/classes/RemoteBrowser.ts (1)

1862-1865: LGTM! Good use of rrweb's built-in options

The change to use rrweb's built-in inlineImages and collectFonts options is a good simplification that replaces the custom resource caching logic.

@amhsirak amhsirak added the Type: Enhancement Improvements to existing features label Jun 30, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
server/src/browser-management/classes/RemoteBrowser.ts (1)

230-275: Update processing stats to reflect simplified resource handling.

The method now returns empty resource arrays but still calculates processing stats as if resources were processed. This is misleading for debugging and monitoring.

      return {
        snapshot,
        resources,
        baseUrl,
        viewport,
        timestamp: Date.now(),
        processingStats: {
          discoveredResources: {
-            images: resources.images.length,
-            stylesheets: resources.stylesheets.length,
-            scripts: resources.scripts.length,
-            fonts: resources.fonts.length,
-            media: resources.media.length,
+            images: 0,
+            stylesheets: 0,
+            scripts: 0,
+            fonts: 0,
+            media: 0,
          },
          cachedResources: {
-            stylesheets: resources.stylesheets.length,
-            images: resources.images.length,
-            fonts: resources.fonts.length,
-            scripts: resources.scripts.length,
-            media: resources.media.length,
+            stylesheets: 0,
+            images: 0,
+            fonts: 0,
+            scripts: 0,
+            media: 0,
          },
        },
      };
♻️ Duplicate comments (1)
server/src/browser-management/classes/RemoteBrowser.ts (1)

602-604: Critical initialization order bug still exists.

This is the same bug identified in previous reviews. The code calls addInitScript on this.currentPage before it's assigned, which will cause a TypeError.

Apply the fix from the previous review:

-await this.currentPage.addInitScript({ path: './server/src/browser-management/classes/rrweb-bundle.js' });
-
 this.currentPage = await this.context.newPage();
+
+await this.currentPage.addInitScript({ path: './server/src/browser-management/classes/rrweb-bundle.js' });
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 830bf15 and f17c729.

📒 Files selected for processing (1)
  • server/src/browser-management/classes/RemoteBrowser.ts (6 hunks)
🔇 Additional comments (3)
server/src/browser-management/classes/RemoteBrowser.ts (3)

438-450: LGTM! Response event listener appropriately disabled.

The commenting out of the response event listener aligns with the removal of network monitoring functionality. The note to keep the code for potential future use is appropriate.


528-531: Excellent use of rrweb's built-in resource handling.

The addition of inlineImages: true and collectFonts: true options correctly replaces the previous network resource caching approach with rrweb's native capabilities. This simplifies the codebase while maintaining functionality.


625-637: Network monitoring removal is consistent with the refactoring.

The removal of initializeNetworkMonitoring() calls is appropriate and consistent with the overall goal of simplifying the codebase by removing network resource caching functionality.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
server/src/browser-management/classes/RemoteBrowser.ts (1)

230-275: Implement or Remove Resource Processing for processRRWebSnapshot

The PR has stripped out all resource extraction, but client code still consumes ProcessedSnapshot.resources. As a result, snapshots will never include stylesheets, fonts, images, scripts or media, breaking DOM rendering and styling. You need to either restore the resource‐processing logic in processRRWebSnapshot or update the UI to stop expecting these fields.

Files requiring your attention:

  • server/src/browser-management/classes/RemoteBrowser.ts
    – the processRRWebSnapshot method and ProcessedSnapshot interface
  • src/components/recorder/DOMBrowserRenderer.tsx
    – references to snapshotData.resources.stylesheets and resources.fonts
  • src/components/browser/BrowserWindow.tsx
    – references to currentSnapshot?.resources

Please choose one of the following:

  • Reintroduce resource collection (CSS, fonts, images, etc.) in RemoteBrowser.processRRWebSnapshot, so downstream components continue working unchanged.
  • Remove all resource‐dependent code paths in the renderer components and adjust ProcessedSnapshot accordingly to only carry the raw snapshot.
♻️ Duplicate comments (1)
server/src/browser-management/classes/RemoteBrowser.ts (1)

201-203: Remove unused network request timeout fields

These fields were previously identified as unused and marked as addressed in commit eecfa43, but they still appear in the current code. Since the network monitoring logic has been removed, these fields serve no purpose.

-    private networkRequestTimeout: NodeJS.Timeout | null = null;
-    private pendingNetworkRequests: string[] = [];
-    private readonly NETWORK_QUIET_PERIOD = 8000;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eecfa43 and c924996.

📒 Files selected for processing (1)
  • server/src/browser-management/classes/RemoteBrowser.ts (6 hunks)
🔇 Additional comments (2)
server/src/browser-management/classes/RemoteBrowser.ts (2)

537-540: Approve the rrweb snapshot configuration

The addition of inlineImages: true and collectFonts: true options is appropriate for the new approach of relying on rrweb's built-in resource handling instead of custom network caching.


619-623: Confirm the initialization order fix

The order has been corrected to create the page first, then add the init script, which addresses the critical initialization bug identified in previous reviews.

Comment on lines +445 to 466
this.pendingNetworkRequests.push(url);

if (this.networkRequestTimeout) {
clearTimeout(this.networkRequestTimeout);
this.networkRequestTimeout = null;
}

logger.debug(
`Network request received: ${url}. Total pending: ${this.pendingNetworkRequests.length}`
);

this.networkRequestTimeout = setTimeout(async () => {
logger.info(
`Network quiet period reached. Processing ${this.pendingNetworkRequests.length} requests`
);

this.pendingNetworkRequests = [];
this.networkRequestTimeout = null;

await this.makeAndEmitDOMSnapshot();
}, 800);
}, this.NETWORK_QUIET_PERIOD);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Review the new network quiet period implementation

The replacement of complex network monitoring with a simpler timeout-based approach is reasonable, but there are potential issues with the current implementation:

  1. The pendingNetworkRequests array grows but is only cleared when the timeout fires
  2. No bounds checking on the array size could lead to memory leaks
  3. The timeout resets on every request, which could delay snapshots indefinitely on busy sites

Consider adding safeguards:

  this.pendingNetworkRequests.push(url);
+ 
+ // Prevent memory leaks by limiting queue size
+ if (this.pendingNetworkRequests.length > 100) {
+   this.pendingNetworkRequests = this.pendingNetworkRequests.slice(-50);
+ }

  if (this.networkRequestTimeout) {
    clearTimeout(this.networkRequestTimeout);
    this.networkRequestTimeout = null;
  }

  logger.debug(
    `Network request received: ${url}. Total pending: ${this.pendingNetworkRequests.length}`
  );

+ // Force snapshot after maximum wait time regardless of ongoing requests
+ const maxWaitTime = Math.min(this.NETWORK_QUIET_PERIOD * 2, 16000);
  this.networkRequestTimeout = setTimeout(async () => {
    logger.info(
      `Network quiet period reached. Processing ${this.pendingNetworkRequests.length} requests`
    );

    this.pendingNetworkRequests = [];
    this.networkRequestTimeout = null;

    await this.makeAndEmitDOMSnapshot();
-   }, this.NETWORK_QUIET_PERIOD);
+   }, this.NETWORK_QUIET_PERIOD);

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In server/src/browser-management/classes/RemoteBrowser.ts around lines 445 to
466, the current network quiet period implementation pushes URLs to
pendingNetworkRequests without bounds checking, which risks memory leaks, and
resets the timeout on every request, potentially delaying snapshots
indefinitely. To fix this, add a maximum size limit to pendingNetworkRequests
and discard or handle excess entries to prevent unbounded growth. Also, modify
the timeout logic so it does not reset indefinitely on continuous requests, for
example by allowing the timeout to run once or by implementing a maximum wait
time before forcing a snapshot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Recorder All issues/PRs related to recorder Type: Enhancement Improvements to existing features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: Broken DOM UI for certain websites
2 participants