[BUG] Async Playwright with Jupyter throws NotImplementedError #723

Closed
kxdan opened this issue May 26, 2021 · 4 comments

@kxdan

kxdan commented May 26, 2021

Context:

  • Playwright Version: 1.11.0-1620331022000
  • Operating System: Windows
  • Python Version: 3.8.10; also reproduces on 3.9.x. It works in one particular conda environment, but a clone of that environment does not behave the same way.

# Get web page contents using Playwright
import asyncio

from playwright.async_api import async_playwright


async def get_browser_content():
    async with async_playwright() as p:
        browser = await p.webkit.launch()
        context = await browser.new_context()
        page = await context.new_page()

        await page.goto('https://www.accuweather.com/en/us/new-york/10007/november-weather/349727?year=2020')
        await page.wait_for_load_state()
        # Pause without blocking the event loop (time.sleep would block it)
        await asyncio.sleep(4)

        page_content = await page.content()
        await browser.close()
        return page_content


# Top-level await, as supported in a Jupyter cell
result = await get_browser_content()

An interaction between Jupyter and Playwright causes a NotImplementedError on some machines and Python versions.

Playwright detects that it is being used in an async environment (the Jupyter cell) and requires the async API, with proper async functions and await calls. This works in some virtual environments but not in others, and the behavior varies across machines (even on the same Python version) and across Python versions (the latter is expected). As far as we can tell, this is a Windows-only issue. Potentially related: jupyter/notebook#4613

Top-level await works for other uses in Jupyter (VS Code notebooks), so this appears to be caused by the way Playwright uses subprocesses.
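
For anyone trying to reproduce this, checking which event loop class the notebook is actually running may help narrow things down. The following is a minimal sketch using only the standard library; `describe_running_loop` is a hypothetical helper name, not part of Playwright or Jupyter.

```python
import asyncio


async def describe_running_loop():
    """Return the fully qualified class name of the currently running event loop."""
    loop = asyncio.get_running_loop()
    cls = type(loop)
    return f"{cls.__module__}.{cls.__qualname__}"


if __name__ == "__main__":
    # In a plain script, start a loop explicitly.
    # In a Jupyter cell, use `await describe_running_loop()` instead,
    # because the kernel already has a loop running.
    print(asyncio.run(describe_running_loop()))
```

If the name reported inside the notebook on Windows is a selector-based loop rather than a proactor-based one, that would be consistent with the subprocess failure in the stack trace below.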

I can patch some virtual environments to work by modifying the asyncio package code itself to call asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()) (this must run before any event loop starts, so it cannot be done in a cell), as outlined in tornadoweb/tornado#2608

I can get Playwright working in one virtual environment (though even a clone of that environment does not behave the same way), but in fresh environments and on other team members' machines the following error appears:

Full stack trace:

NotImplementedError                       Traceback (most recent call last)
in
     27
     28
---> 29 result = await get_browser_content()
     30 soup = BeautifulSoup(result)
     31

in get_browser_content()
     12 # Get WebPage contents using playwright
     13 async def get_browser_content():
---> 14 async with async_playwright() as p:
     15 browser = await p.webkit.launch()
     16 context = await browser.new_context()

c:\prose\MultiModality\Python.env\lib\site-packages\playwright\async_api\_context_manager.py in __aenter__(self)
     45 if not playwright_future.done():
     46 playwright_future.cancel()
---> 47 playwright = AsyncPlaywright(next(iter(done)).result())
     48 playwright.stop = self.__aexit__ # type: ignore
     49 return playwright

c:\prose\MultiModality\Python.env\lib\site-packages\playwright\_impl\_transport.py in run(self)
    100
    101 try:
--> 102 self._proc = proc = await asyncio.create_subprocess_exec(
    103 str(self._driver_executable),
    104 "run-driver",

C:\Python38\lib\asyncio\subprocess.py in create_subprocess_exec(program, stdin, stdout, stderr, loop, limit, *args, **kwds)
    234 protocol_factory = lambda: SubprocessStreamProtocol(limit=limit,
    235 loop=loop)
--> 236 transport, protocol = await loop.subprocess_exec(
    237 protocol_factory,
    238 program, *args,

C:\Python38\lib\asyncio\base_events.py in subprocess_exec(self, protocol_factory, program, stdin, stdout, stderr, universal_newlines, shell, bufsize, encoding, errors, text, *args, **kwargs)
   1628 debug_log = f'execute program {program!r}'
   1629 self._log_subprocess(debug_log, stdin, stdout, stderr)
-> 1630 transport = await self._make_subprocess_transport(
   1631 protocol, popen_args, False, stdin, stdout, stderr,
   1632 bufsize, **kwargs)

C:\Python38\lib\asyncio\base_events.py in _make_subprocess_transport(self, protocol, args, shell, stdin, stdout, stderr, bufsize, extra, **kwargs)
    489 extra=None, **kwargs):
    490 """Create subprocess transport."""
--> 491 raise NotImplementedError
    492
    493 def _write_to_self(self):

NotImplementedError:

@mxschmitt
Member

Seems related to #178 to me, but unsure since I've never used Jupyter before.

So asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()) fixes it or not?

@kumaraditya303
Contributor

@mxschmitt fyi:

So asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()) fixes it or not?

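It does not fix the issue, because Playwright starts its driver with an asyncio subprocess, and on Windows asyncio subprocesses require ProactorEventLoop (the selector event loop does not implement them).

The issue also cannot be fixed from the Playwright side, because Tornado requires SelectorEventLoop. Hence it is a wont-fix.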
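
To illustrate the point about subprocess support, here is a minimal sketch that probes whether a given event loop can spawn a subprocess at all. It uses only the standard library; `loop_supports_subprocess` and `_spawn_probe` are hypothetical helper names introduced for this example.

```python
import asyncio
import sys


async def _spawn_probe():
    """Start a trivial child process (the current interpreter) and wait for it to exit."""
    proc = await asyncio.create_subprocess_exec(sys.executable, "-c", "pass")
    return await proc.wait()


def loop_supports_subprocess(loop):
    """Return True if `loop` can run asyncio subprocesses, False if it raises NotImplementedError."""
    try:
        loop.run_until_complete(_spawn_probe())
        return True
    except NotImplementedError:
        return False
    finally:
        loop.close()


if __name__ == "__main__":
    print("default loop:", loop_supports_subprocess(asyncio.new_event_loop()))
    if sys.platform == "win32":
        # On Windows the two loop implementations are expected to differ here.
        print("selector loop:", loop_supports_subprocess(asyncio.SelectorEventLoop()))
        print("proactor loop:", loop_supports_subprocess(asyncio.ProactorEventLoop()))
```

This has to be run as a script, not in a notebook cell, since `run_until_complete` cannot be called while the kernel's own loop is running in the same thread.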

@r0h1tr

r0h1tr commented May 21, 2023

So any solutions for it?

@Wector

Wector commented Jun 28, 2023

So any solutions for it?

You can fix this at the point where you create your event loop.
If your code runs on Windows, you can use ProactorEventLoop, an alternative event loop that ships with asyncio.

Replace

loop = asyncio.get_event_loop()

with:

loop = asyncio.ProactorEventLoop()
asyncio.set_event_loop(loop)
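
Inside a Jupyter cell the kernel's loop is already running, so the loop cannot simply be swapped out there. One possible way to apply the same idea is to run the coroutine on a separate loop in a worker thread. This is only a sketch under that assumption and was not confirmed by the maintainers in this thread; `run_in_fresh_loop`, `run_in_worker_thread` and `demo_task` are hypothetical names.

```python
import asyncio
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_fresh_loop(coro_factory):
    """Create a new event loop in the current thread and run coro_factory() on it."""
    if sys.platform == "win32":
        # ProactorEventLoop supports asyncio subprocesses on Windows.
        loop = asyncio.ProactorEventLoop()
    else:
        loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro_factory())
    finally:
        loop.close()


def run_in_worker_thread(coro_factory):
    """Run coro_factory() on its own loop in a worker thread and return the result."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(run_in_fresh_loop, coro_factory).result()


async def demo_task():
    """Stand-in coroutine so the sketch runs without Playwright installed."""
    await asyncio.sleep(0)
    return 42


if __name__ == "__main__":
    print(run_in_worker_thread(demo_task))
```

In a notebook this would be called as `result = run_in_worker_thread(get_browser_content)`, passing the function itself, not an awaited call. The cell blocks until the worker finishes, and any Playwright objects should be created and closed inside the coroutine, since they belong to the worker's loop.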
