[Bug]: connect_over_cdp: Page objects fail to sync after navigating from about:blank via Bookmark/Address bar (Empty page.url & Timeouts) #39483

@Hmily88

Description

Version

1.57.0

Steps to reproduce

1. Launch Microsoft Edge (or Chrome) manually with the remote debugging port enabled. Use your default profile or a persistent session so that bookmarks are available:
msedge.exe --remote-debugging-port=9222

2. In the opened browser, click the "+" button to create a new blank tab (about:blank).

3. Immediately click any website on your Favorites/Bookmark bar (e.g., https://www.google.com) to navigate.

4. Only after completing the manual navigation, run the Python script to connect via CDP.
(Note: This is NOT a runtime monitoring issue; it is a discovery issue at the moment of connection.)

Minimal reproduction code:

import json
from urllib.request import urlopen
from playwright.sync_api import sync_playwright

def test_cdp_issue():
    # 1. Inspect raw CDP targets via HTTP API
    try:
        raw_targets = json.loads(urlopen("http://127.0.0.1:9222/json/list").read().decode("utf-8"))
        print("--- Raw CDP Targets (from /json/list) ---")
        for t in raw_targets:
            if t.get("type") == "page":
                # str(...) guards against a target entry that lacks an "id" field
                print(f"ID: {str(t.get('id', ''))[:8]} | URL: {t.get('url')} | Title: {t.get('title')}")
    except Exception as e:
        print(f"Cannot fetch /json/list: {e}. Is browser running with --remote-debugging-port=9222?")

    with sync_playwright() as p:
        # Connect to existing browser
        # IMPORTANT: Manually navigate the browser FIRST, then run this script.
        print("\nConnecting to CDP...")
        browser = p.chromium.connect_over_cdp("http://127.0.0.1:9222")
        
        if not browser.contexts:
            print("Error: No contexts found.")
            return
            
        context = browser.contexts[0]
        print(f"Playwright Context Pages: {len(context.pages)}")

        for i, page in enumerate(context.pages):
            print(f"\n[Page {i}]")
            print(f"  Playwright URL: {repr(page.url)}")
            
            # Probe via CDP Session
            cdp = context.new_cdp_session(page)
            try:
                target_info = cdp.send("Target.getTargetInfo").get("targetInfo", {})
                print(f"  CDP Target URL: {repr(target_info.get('url'))}")
                
                # This usually SUCCEEDS via raw CDP
                result = cdp.send("Runtime.evaluate", {
                    "expression": "document.title", 
                    "returnByValue": True
                })
                js_title = result.get("result", {}).get("value")
                print(f"  CDP Runtime.evaluate (title): {js_title}")
                
                # This usually FAILS / TIMES OUT via the high-level Playwright API in this scenario
                print("  Attempting Playwright wait_for_selector('body')...")
                page.wait_for_selector("body", timeout=3000)
                print("  Playwright Interaction: OK")
            except Exception as e:
                error_msg = str(e).splitlines()[0] if str(e) else "Timeout"
                print(f"  Playwright Interaction: FAILED -> {type(e).__name__}: {error_msg}")
            finally:
                try:
                    cdp.detach()
                except Exception:
                    pass

if __name__ == "__main__":
    test_cdp_issue()

Expected behavior

1. All tab targets visible in http://127.0.0.1:9222/json (or /json/list) should be correctly mapped and returned as Page objects in Playwright's context.pages.

2. These Page objects should have synchronized internal states (URL, frames, etc.) and be fully interactable.
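
To make the first expectation checkable, here is a minimal sketch that compares the raw target list with the URLs Playwright reports. The helper name `unmapped_page_targets` is hypothetical, and matching targets to pages by URL is an assumption of this sketch: two tabs with the same URL would be indistinguishable.

```python
from typing import Iterable, List


def unmapped_page_targets(raw_targets: Iterable[dict], playwright_urls: Iterable[str]) -> List[dict]:
    """Return the 'page' targets from /json/list whose URL Playwright does not report.

    raw_targets: the decoded JSON list from http://127.0.0.1:9222/json/list
    playwright_urls: e.g. [page.url for page in context.pages]
    """
    known = set(playwright_urls)
    return [
        target
        for target in raw_targets
        # Only tabs are expected in context.pages; skip workers, extensions, etc.
        if target.get("type") == "page" and target.get("url") not in known
    ]
```

In the reproduction script this could be called as `unmapped_page_targets(raw_targets, [p.url for p in context.pages])`. A non-empty result would list the tabs that the browser exposes but Playwright either did not adopt or adopted with an empty URL.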

Actual behavior

1. Target Discovery Failure at Connection: When the script connects to the browser, context.pages does not include the tab that was navigated from about:blank via a bookmark, even though the tab is physically present and active in the browser.

2. Inconsistency with CDP HTTP API: If I check http://127.0.0.1:9222/json, the problematic tab IS VISIBLE and has the correct URL/Title. This proves the browser's CDP server recognizes the target, but Playwright's initialization logic fails to "adopt" it into the BrowserContext.

3. Missing Binding: Even if I wait for several seconds after connecting, the Page object never appears in context.pages, or if it does, it remains uninitialized (page.url == "").
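
The "wait for several seconds" check in point 3 can be sketched as a small polling helper. The names `hollow_pages` and `wait_for_synced_pages` are hypothetical, and treating `page.url == ""` as the marker of an uninitialized page is an assumption taken from the description above. The `sleep` parameter is injectable because, with the sync API, `time.sleep` may prevent Playwright from processing events; passing something based on `page.wait_for_timeout` is probably the safer choice there.

```python
import time
from typing import Callable, Iterable, List, Tuple


def hollow_pages(pages: Iterable) -> List:
    """Pages whose Playwright-side URL is still the empty string."""
    return [p for p in pages if p.url == ""]


def wait_for_synced_pages(
    get_pages: Callable[[], Iterable],
    timeout_s: float = 5.0,
    interval_s: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, List]:
    """Poll get_pages() until at least one page exists and none is hollow.

    Returns (True, pages) on success, or (False, last_seen_pages) on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        pages = list(get_pages())
        if pages and not hollow_pages(pages):
            return True, pages
        if time.monotonic() >= deadline:
            return False, pages
        sleep(interval_s)
```

A call such as `wait_for_synced_pages(lambda: context.pages)` returning `(False, ...)` after the timeout would correspond to the behavior reported here.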

Additional context

Important Note on Reproducibility:
I have tested this multiple times. While the bug occurs in the majority of attempts, there are rare occasions where the page syncs correctly. This suggests a race condition during the initial target discovery/attachment phase. I kindly ask the team to perform multiple test runs if it doesn't reproduce on the first try. Please do not close the issue immediately, as the "hollow page" state is a significant blocker for RPA scenarios involving manual browser intervention.

Cross-Machine Consistency: I have confirmed this issue on multiple different computers, indicating it is not a machine-specific configuration error but a systemic synchronization flaw.

Edge-Specific Observation: My testing primarily focused on Microsoft Edge, where the issue is highly reproducible.
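
Because the failure is intermittent, a tally over repeated connection attempts can help quantify it. The sketch below is not part of the original reproduction: `tally_outcomes` and the `run_once` callable are hypothetical. `run_once` is assumed to connect via `connect_over_cdp`, inspect `context.pages`, and return a label such as "synced" or "hollow".

```python
from collections import Counter
from typing import Callable


def tally_outcomes(run_once: Callable[[], str], attempts: int = 10) -> Counter:
    """Call run_once() `attempts` times and count each returned label.

    Exceptions are counted under 'error:<ExceptionName>' so one failed
    attempt does not abort the whole series.
    """
    counts: Counter = Counter()
    for _ in range(attempts):
        try:
            counts[run_once()] += 1
        except Exception as exc:
            counts[f"error:{type(exc).__name__}"] += 1
    return counts
```

Whether reconnecting repeatedly to the same already-navigated tab exercises the same race as a fresh manual reproduction is an open question. The manual steps may need to be repeated between attempts.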

Environment

- Playwright Python: 1.57.0
- Browser: Microsoft Edge 145.0.3800.82
- CDP Protocol-Version: 1.3
- OS: Windows (Windows NT 10.0 x64)
- Connection mode: `chromium.connect_over_cdp("http://127.0.0.1:9222")`
