This repository has been archived by the owner on May 8, 2020. It is now read-only.

#159 [SEVERE] The communication with Chromium are disconnected after 20 seconds. #160

Open
wants to merge 1 commit into
base: dev

@PCXDME PCXDME commented Nov 6, 2018

#159 [SEVERE] The communication with Chromium are disconnected after 20 seconds.

This is fixed in https://github.com/pyppeteer/pyppeteer2

See #160 (comment)

@aronsky

aronsky commented Nov 6, 2018

Thank you for this!

@jrmlhermitte

When will this be merged? The best workaround right now is to open and close a browser for every page interaction, keeping each session under 20 s. :-(

@nurettin

Perhaps we need to keep the connection alive with ping/pong instead of setting the timeouts to None.

@PCXDME
Author

PCXDME commented Nov 27, 2018

@nurettin I think the problem here is that Chrome does not send a pong back. When our WebSocket client sends a ping and receives no pong within the 20-second timeout, it assumes the connection is lost and disconnects.
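
The mechanism described above can be illustrated with a minimal, standard-library-only sketch. It does not use the websockets package; `keepalive`, `silent_peer` and `run_demo` are hypothetical names, and the short intervals are chosen only so the demo finishes quickly. It assumes a client that sends a ping every `ping_interval` seconds and gives up if no pong arrives within `ping_timeout` seconds.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def keepalive(
    send_ping: Callable[[], Awaitable[None]],
    ping_interval: Optional[float],
    ping_timeout: Optional[float],
) -> str:
    """Ping the peer periodically; give up when a pong does not arrive in time."""
    if ping_interval is None:
        # Keepalive disabled: no pings are sent, so a silent peer is never dropped.
        return "keepalive disabled"
    while True:
        await asyncio.sleep(ping_interval)
        pong_waiter = send_ping()
        try:
            await asyncio.wait_for(pong_waiter, ping_timeout)
        except asyncio.TimeoutError:
            # No pong within ping_timeout: the client treats the link as dead.
            return "connection dropped"


def silent_peer() -> "asyncio.Future[None]":
    """Stand-in for a peer that receives pings but never answers with a pong."""
    return asyncio.get_running_loop().create_future()


def run_demo(ping_interval: Optional[float], ping_timeout: Optional[float] = 0.01) -> str:
    """Run the keepalive loop against the silent peer and report the outcome."""
    return asyncio.run(keepalive(silent_peer, ping_interval, ping_timeout))
```

With a silent peer, any finite interval ends in a dropped connection, while `ping_interval=None` skips the check entirely, which is the behaviour this PR opts into.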

@nurettin

@PCXDME Perhaps then we need a way to check if _ws really did disconnect from the page instance when we set the timeout to None?

@PCXDME
Author

PCXDME commented Nov 27, 2018

@nurettin If I remember correctly, a timeout is the only way the WebSocket client can tell whether the connection has really dropped. Otherwise, the timeout mechanism would have to be built on some other request/response message instead of ping/pong. But anyways, having timeout set to none is better than having timeout without Chrome responding pong. Disconnections also won't happen regularly, as Chrome and pyppeteer are usually on the same (local) machine; they would happen only when Chrome crashes or exits. When we close the browser with pyppeteer, there is nothing to worry about, because we close the WebSocket connection along with it. I would suggest to fix one problem at a time. This seems to be more important as you can not use the library for more than 20 seconds.
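
One possible shape for such a request/response check is sketched below. This is an illustration only, not part of this PR: `is_alive`, `responsive_probe`, `hung_probe` and `run_demo` are hypothetical names, and the probe is any cheap coroutine that round-trips to the browser.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple


async def is_alive(probe: Callable[[], Awaitable[Any]], timeout: float = 5.0) -> bool:
    """Return True if the probe request is answered within `timeout` seconds."""
    try:
        await asyncio.wait_for(probe(), timeout)
    except Exception:
        # A timeout or any transport error is treated as "not alive".
        return False
    return True


async def responsive_probe() -> str:
    """Stand-in for a request that the browser answers immediately."""
    return "ok"


async def hung_probe() -> None:
    """Stand-in for a request that never gets a response."""
    await asyncio.sleep(3600)


def run_demo() -> Tuple[bool, bool]:
    """Check one responsive and one hung probe with a short timeout."""
    async def main() -> Tuple[bool, bool]:
        return (
            await is_alive(responsive_probe, 0.05),
            await is_alive(hung_probe, 0.05),
        )

    return asyncio.run(main())
```

With pyppeteer, the probe could presumably be a lightweight call such as `browser.version`, on the assumption that it sends a request to Chrome and waits for the reply; a long-running service could call `is_alive` periodically and relaunch the browser when it returns False.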

@nurettin

@PCXDME

But anyways, having timeout set to none is better than having timeout without Chrome responding pong. 

I agree

I would suggest to fix one problem at a time. This seems to be more important as you can not use the library for more than 20 seconds.

I'm pretty sure it fixes the issue for now (thank you). I just have a long-running process, so I was looking for ways to harden that service and thought this would be a suitable place to discuss it.

@obsd

obsd commented Nov 30, 2018

Another workaround is to pin websockets==6.0; it works until this PR is merged and released.

@alfred82santa

I think this PR cannot be approved unless the setup.py requirements change (websockets>=7.0). The ping_interval and ping_timeout parameters do not exist in websockets 6.0.

https://websockets.readthedocs.io/en/6.0/api.html#websockets.client.connect
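
For code that has to run against either release, one option is to pass the keepalive arguments only when the installed version supports them. The helper below is a hypothetical sketch (the name `keepalive_kwargs` is not from pyppeteer or websockets) and assumes, as stated above, that the parameters are available from websockets 7.0 onward.

```python
from typing import Any, Dict


def keepalive_kwargs(websockets_version: str) -> Dict[str, Any]:
    """Return the connect() keyword arguments that disable keepalive pings.

    Versions before 7.0 do not accept these parameters, so an empty dict
    is returned for them.
    """
    major = int(websockets_version.split(".")[0])
    if major >= 7:
        return {"ping_interval": None, "ping_timeout": None}
    return {}
```

The version string could be obtained with `importlib.metadata.version("websockets")` on Python 3.8+, and the result unpacked into the connect call as `**keepalive_kwargs(...)`.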

@nurettin

nurettin commented Dec 1, 2018

Another workaround is to pin websockets==6.0; it works until this PR is merged and released.

I did it like that in the beginning, but I get disconnects with websockets==6.0 more often than 7.0 for some reason.

@zxwild

zxwild commented Dec 11, 2018

It's actually a Chromium bug; other Puppeteer implementations suffer from it too, so the only workaround seems to be disabling pings.
I tested with a 15-minute delay, and headless Chrome still responded after that period.

https://bugs.chromium.org/p/chromium/issues/detail?id=865002

@stolati

stolati commented Dec 20, 2018

For those who want to hack before the patch arrives.

def patch_pyppeteer():
    import pyppeteer.connection
    original_method = pyppeteer.connection.websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    pyppeteer.connection.websockets.client.connect = new_method
patch_pyppeteer()

@kiwi0fruit

kiwi0fruit commented Jan 15, 2019

By the way, the patching approach is a nice one! There's another patch that switches the Chromium download to validated HTTPS instead of the insecure one used now.

@lwabish lwabish left a comment

this works when I use requests_html to visit 91porn and render its js

@PCXDME
Author

PCXDME commented Jun 10, 2019

@lwabish Do you have write access? Could you also merge this? I could not merge because of the Travis checks.

@lwabish

lwabish commented Jun 12, 2019

@lwabish Do you have write access? Could you also merge this? I could not merge because of the Travis checks.

@PCXDME I just ran into this problem while occasionally using the requests_html module. I guess I can't merge this.

@ingmferrer

10 months now and this hasn't been fixed in master. At this point, this library is unmaintained.

@Alex-Bogdanov

10 months now and this hasn't been fixed in master. At this point, this library is unmaintained.

Has anybody tried contacting the library author?

@Q0 Q0 left a comment

It really works. Thanks

PMK9 added a commit to Penny-AI/pyppeteer that referenced this pull request Oct 18, 2019
It's to stop getting this error for recaptcha for isagenix login: miyakogi#252
PMK9 added a commit to Penny-AI/pyppeteer that referenced this pull request Oct 19, 2019
byt3bl33d3r added a commit to byt3bl33d3r/WitnessMe that referenced this pull request Nov 3, 2019
- Applied patch from miyakogi/pyppeteer#160 in
order to fix miyakogi/pyppeteer#62, which made
the websocket connection to Chromium close after ~20s

- Reworked logic to use a producer/consumer pattern
@Yang-z

Yang-z commented Dec 27, 2019

def patch_pyppeteer():
    import pyppeteer.connection
    original_method = pyppeteer.connection.websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    pyppeteer.connection.websockets.client.connect = new_method

patch_pyppeteer()

It works.
Thanks mate!!!

@mjpieters

mjpieters commented Jan 7, 2020

def patch_pyppeteer():
    import pyppeteer.connection
    original_method = pyppeteer.connection.websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    pyppeteer.connection.websockets.client.connect = new_method
patch_pyppeteer()

Note that you are patching the websockets.client module itself here! pyppeteer.connection.websockets is just the module-global reference to the websockets package. You may as well just use

def patch_websockets():
    import websockets.client
    original_method = websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    websockets.client.connect = new_method

patch_websockets()
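
The aliasing point can be checked with a small standard-library sketch. Here `consumer` is a hypothetical stand-in for pyppeteer.connection (a module that has imported another package), `json` plays the role of websockets, and `demo_alias_patch` is an illustrative name.

```python
import json
import types

# Hypothetical stand-in for pyppeteer.connection: a module that did `import json`.
consumer = types.ModuleType("consumer")
consumer.json = json


def demo_alias_patch() -> bool:
    """Patch json.dumps through the alias and report whether json itself changed."""
    original = json.dumps
    try:
        consumer.json.dumps = lambda obj, **kwargs: "patched"
        # consumer.json and json are the same module object, so the
        # replacement is visible to every other user of json as well.
        return json.dumps({"a": 1}) == "patched"
    finally:
        # Restore the original so the demo leaves no side effects behind.
        json.dumps = original
```

Because both names refer to one module object, reaching through the alias changes behaviour for every importer, not only for the module the patch was addressed to.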

Instead of patching an innocent 3rd-party library, I'm patching pyppeteer itself:

def _patch_pyppeteer():
    from typing import Any
    from pyppeteer import connection, launcher
    import websockets.client

    class PatchedConnection(connection.Connection):  # type: ignore
        def __init__(self, *args: Any, **kwargs: Any) -> None:
            super().__init__(*args, **kwargs)
            # the _ws argument is not yet connected, can simply be replaced with another
            # with better defaults.
            self._ws = websockets.client.connect(
                self._url,
                loop=self._loop,
                # the following parameters are all passed to WebSocketCommonProtocol
                # which marks all three as Optional, but connect() doesn't, hence the liberal
                # use of type: ignore on these lines.
                # fixed upstream but not yet released, see aaugustin/websockets#93ad88
                max_size=None,  # type: ignore
                ping_interval=None,  # type: ignore
                ping_timeout=None,  # type: ignore
            )

    connection.Connection = PatchedConnection
    # also imported as a global in pyppeteer.launcher
    launcher.Connection = PatchedConnection

_patch_pyppeteer()

@danilofuchs

@miyakogi, could you please take a look? This breaks the library for many users and has a simple fix.

@BobCashStory

BobCashStory commented Jan 31, 2020

def _patch_pyppeteer():
    from typing import Any
    from pyppeteer import connection, launcher
    import websockets.client

    class PatchedConnection(connection.Connection):  # type: ignore
        def __init__(self, *args: Any, **kwargs: Any) -> None:
            super().__init__(*args, **kwargs)
            # the _ws argument is not yet connected, can simply be replaced with another
            # with better defaults.
            self._ws = websockets.client.connect(
                self._url,
                loop=self._loop,
                # the following parameters are all passed to WebSocketCommonProtocol
                # which marks all three as Optional, but connect() doesn't, hence the liberal
                # use of type: ignore on these lines.
                # fixed upstream but not yet released, see aaugustin/websockets#93ad88
                max_size=None,  # type: ignore
                ping_interval=None,  # type: ignore
                ping_timeout=None,  # type: ignore
            )

    connection.Connection = PatchedConnection
    # also imported as a global in pyppeteer.launcher
    launcher.Connection = PatchedConnection

_patch_pypuppeteer()

@mjpieters

you have a typo in your patch def _patch_pypuppeteer

@mjpieters

@BobCashStory

you have a typo in your patch def _patch_pypuppeteer

Oopsie, fixed now. Thanks for pointing that out!

@Mattwmaster58

Mattwmaster58 commented Apr 20, 2020

This library seems to have been abandoned; however, others and I have been working on an updated fork, pyppeteer2. It's up on PyPI and the fix has already been applied.

@PCXDME / @aronsky / @jrmlhermitte Could you maybe include this information in your topmost post so that others don't have to scroll through other workarounds?
