[Bug]: IFrame Tab returned by get_frame() on remote Chrome connection uses WebSocket URL with port "None" and crashes when calling page_source #305

@avn-coder

Checklist before reporting

  • I have searched for similar issues and didn't find a duplicate.
  • I have updated to the latest version of pydoll to verify the issue still exists.

pydoll Version

2.12.0

Python Version

3.12.10

Operating System

Windows

Bug Description

When connecting to a remote Chrome instance using Chrome.connect(ws_url) (browser-level connection as described in the Remote Connections docs) and then working with an iframe via tab.get_frame(iframe_element), any subsequent call on the iframe Tab (for example await iframe_tab.page_source) fails.

Internally, the iframe tab's ConnectionHandler tries to establish a new WebSocket connection using a URL that contains :None as the port. This leads to a ValueError: Port could not be cast to integer value as 'None' coming from websockets.uri.parse_uri.
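The parsing failure can be reproduced without Pydoll or a running browser. This is a minimal sketch, assuming the handler interpolates an unset port into the address string; how Pydoll actually builds the address is an assumption, and the host and TARGET_ID below are placeholders. It uses the standard library's urllib.parse, which raises a ValueError with the same wording as the one reported here.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the port is None on the iframe tab's handler and is formatted
# into the address as-is. Host and target ID are placeholders.
connection_port = None
ws_address = f"ws://localhost:{connection_port}/devtools/page/TARGET_ID"


def read_port(url: str) -> Optional[int]:
    """Return the numeric port of url; raises ValueError if it is not a number."""
    return urlsplit(url).port


try:
    read_port(ws_address)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)

print(ws_address)
print(error_message)
```

Formatting None into the string succeeds silently, so the error only surfaces later, when something reads the port back out of the URL.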

This only happens on the iframe Tab created by get_frame(). The top-level Tab returned by Chrome.connect(ws_url) works correctly.

Steps to Reproduce

  1. Start a Chromium-based browser with remote debugging enabled. Windows example:

"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=54321

  2. From Python, query http://127.0.0.1:54321/json/version and read the webSocketDebuggerUrl field.

  3. Connect Pydoll to the remote browser with Chrome.connect(ws_url), as described in the Remote Connections documentation.

  4. Navigate to any page that contains an iframe.

  5. Locate the iframe element with a CSS selector.

  6. Call iframe_tab = await tab.get_frame(element).

  7. Call await iframe_tab.page_source (or any other method that forces the iframe Tab to establish its own CDP connection).

  8. Observe that a ValueError is raised from websockets.uri.parse_uri, complaining about port 'None'.

Code Example

import asyncio
import aiohttp

from pydoll.browser.chromium import Chrome

SOLVECAPTCHA_RECAPTCHA2_DEMO = "https://www.google.com/recaptcha/api2/demo"

async def connect_remote_chromium_get_ws_url():
    port = 54321  # must match the --remote-debugging-port value used to start Chrome
    async with aiohttp.ClientSession() as session:
        url = f"http://127.0.0.1:{port}/json/version"
        async with session.get(url) as response:
            data = await response.json()
            ws_url = data["webSocketDebuggerUrl"]

            print("Server info:")
            print(f"  Browser: {data.get('Browser')}")
            print(f"  Protocol: {data.get('Protocol-Version')}")
            print(f"  WebSocket: {ws_url}")

    return ws_url

async def solve_with_iframe(tab):
    recaptcha2_iframe_css = "iframe[title='reCAPTCHA']"
    recaptcha_iframe_element = await tab.query(recaptcha2_iframe_css, timeout=10)
    iframe_tab = await tab.get_frame(recaptcha_iframe_element)

    page_source = await iframe_tab.page_source # This line triggers the bug
    print(page_source)

async def main():
    ws_url = await connect_remote_chromium_get_ws_url()
    chrome = Chrome()
    tab = await chrome.connect(ws_url)

    print("\n[SUCCESS] Connected to remote Chrome server!")

    await tab.go_to(SOLVECAPTCHA_RECAPTCHA2_DEMO)
    await solve_with_iframe(tab)

    await chrome.close()


if __name__ == "__main__":
    asyncio.run(main())

Expected Behavior

According to the IFrames documentation, tab.get_frame(iframe_element) should return a Tab instance that can be used like any other tab, including calling find, query, execute_script, page_source, and so on, with its own CDP target and separate WebSocket connection.

So in this example, I would expect:

  • iframe_tab = await tab.get_frame(recaptcha_iframe_element) to succeed.
  • await iframe_tab.page_source to return the HTML source of the iframe document without errors.
  • Subsequent operations like await iframe_tab.find(id="recaptcha-anchor") to also work.

Actual Behavior

iframe_tab = await tab.get_frame(recaptcha_iframe_element) returns an object without raising any error.

On the next line, when I run page_source = await iframe_tab.page_source, Pydoll attempts to establish a WebSocket connection for the iframe tab and fails with a ValueError, because the URL it passes to websockets seems to contain :None as the port.

Relevant Log Output

Additional Context

Workaround:

Setting the connection_port manually, something like:

iframe_tab._connection_handler._connection_port = 54321

before calling await iframe_tab.page_source fixes the issue, thus confirming that the missing port is the direct cause.
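To avoid hard-coding the port in the workaround, it can be read from the ws_url that was passed to Chrome.connect. This is a rough sketch using only the standard library; port_from_ws_url is a hypothetical helper name, not part of Pydoll, and the example URL is a placeholder.

```python
from urllib.parse import urlsplit


def port_from_ws_url(ws_url: str) -> int:
    """Return the explicit port of a DevTools WebSocket URL.

    Hypothetical helper for the workaround; not a Pydoll API.
    """
    port = urlsplit(ws_url).port
    if port is None:
        raise ValueError(f"no explicit port in {ws_url!r}")
    return port


# Placeholder URL in the shape returned by /json/version.
example_ws_url = "ws://127.0.0.1:54321/devtools/browser/EXAMPLE-ID"
print(port_from_ws_url(example_ws_url))

# Applied to the workaround (relies on the same private attributes):
# iframe_tab._connection_handler._connection_port = port_from_ws_url(ws_url)
```

This still depends on private attributes of the iframe Tab, so it is only a stopgap until the port is propagated by the library itself.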

Request:

It would be great if the iframe Tab created by get_frame() could reuse the necessary connection information from the parent Tab or from the original ws_url passed to Chrome.connect, so that iframe tabs work seamlessly in the remote connection scenario, similar to how they work when you start the browser locally via browser.start().
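One way the connection information could be reused is to derive the iframe target's address from the browser-level ws_url, keeping its scheme, host, and port. The sketch below is an illustration of that idea, not Pydoll's implementation: page_ws_url is a hypothetical name, the target ID is a placeholder, and the /devtools/page/<target id> path layout is assumed from the usual DevTools endpoint convention.

```python
from urllib.parse import urlsplit, urlunsplit


def page_ws_url(browser_ws_url: str, target_id: str) -> str:
    """Build a page-level WebSocket URL on the same host and port.

    Hypothetical helper; assumes the /devtools/page/<target_id> layout.
    """
    parts = urlsplit(browser_ws_url)
    if not parts.netloc:
        raise ValueError(f"no host in {browser_ws_url!r}")
    # Keep scheme and netloc (host:port), swap only the path.
    return urlunsplit((parts.scheme, parts.netloc, f"/devtools/page/{target_id}", "", ""))


# Placeholder values for illustration.
browser_url = "ws://127.0.0.1:54321/devtools/browser/EXAMPLE-ID"
print(page_ws_url(browser_url, "IFRAME-TARGET-ID"))
```

Because the scheme and netloc are copied unchanged, the same approach would carry over to a wss:// URL on a non-local host.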
