Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout exception in test_important_plugin #5742

Open
cdecker opened this issue Nov 27, 2022 · 0 comments
Open

Timeout exception in test_important_plugin #5742

cdecker opened this issue Nov 27, 2022 · 0 comments

Comments

@cdecker
Copy link
Member

cdecker commented Nov 27, 2022

Issue and Steps to Reproduce

#5741 introduced a new flake, or failure, on the valgrind-4 confguration:

    def test_important_plugin(node_factory):
        # Cache it here.
        pluginsdir = os.path.join(os.path.dirname(__file__), "plugins")
    
        n = node_factory.get_node(options={"important-plugin": os.path.join(pluginsdir, "nonexistent")},
                                  may_fail=True, expect_fail=True,
                                  allow_broken_log=True, start=False)
    
        n.daemon.start(wait_for_initialized=False, stderr_redir=True)
        # Will exit with failure code.
        assert n.daemon.wait() == 1
        assert n.daemon.is_in_stderr(r"Failed to register .*nonexistent: No such file or directory")
    
        # Check we exit if the important plugin dies.
        n.daemon.opts['important-plugin'] = os.path.join(pluginsdir, "fail_by_itself.py")
    
        n.daemon.start(wait_for_initialized=False)
        # Will exit with failure code.
>       assert n.daemon.wait() == 1

tests/test_plugin.py:2128: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
contrib/pyln-testing/pyln/testing/utils.py:657: in wait
    self.proc.wait(timeout)
/opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/subprocess.py:1019: in wait
    return self._wait(timeout=timeout)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <subprocess.Popen object at 0x7efd8b29f150>, timeout = 900

    def _wait(self, timeout):
        """Internal implementation of wait() on POSIX."""
        if self.returncode is not None:
            return self.returncode
    
        if timeout is not None:
            endtime = _time() + timeout
            # Enter a busy loop if we have a timeout.  This busy loop was
            # cribbed from Lib/threading.py in Thread.wait() at r71065.
            delay = 0.0005 # 500 us -> initial delay of 1 ms
            while True:
                if self._waitpid_lock.acquire(False):
                    try:
                        if self.returncode is not None:
                            break  # Another thread waited.
                        (pid, sts) = self._try_wait(os.WNOHANG)
                        assert pid == self.pid or pid == 0
                        if pid == self.pid:
                            self._handle_exitstatus(sts)
                            break
                    finally:
                        self._waitpid_lock.release()
                remaining = self._remaining_time(endtime)
                if remaining <= 0:
>                   raise TimeoutExpired(self.args, timeout)
E                   subprocess.TimeoutExpired: Command '['valgrind', '-q', '--trace-children=yes', '--trace-children-skip=*python*,*bitcoin-cli*,*elements-cli*,*cln-grpc', '--error-exitcode=7', '--log-file=/tmp/ltests-tj13wttf/test_important_plugin_1/lightning-1//valgrind-errors.%p', '--read-inline-info=no', '--suppressions=/home/runner/work/lightning/lightning/tests/valgrind-suppressions.txt', '--gen-suppressions=all', 'lightningd/lightningd', '--log-level=debug', '--cltv-delta=6', '--cltv-final=5', '--watchtime-blocks=5', '--rescan=1', '--disable-dns', '--lightning-dir=/tmp/ltests-tj13wttf/test_important_plugin_1/lightning-1/', '--addr=127.0.0.1:34521', '--allow-deprecated-apis=false', '--network=regtest', '--ignore-fee-limits=false', '--bitcoin-rpcuser=rpcuser', '--bitcoin-rpcpassword=rpcpass', '--bitcoin-datadir=/tmp/ltests-tj13wttf/test_important_plugin_1/lightning-1/', '--dev-fast-gossip', '--dev-bitcoind-poll=1', '--log-file=-', '--log-file=/tmp/ltests-tj13wttf/test_important_plugin_1/lightning-1/log', '--log-prefix=lightningd-1 ', '--dev-fail-on-subdaemon-fail', '--dev-no-version-checks', '--dev-no-plugin-checksum', '--dev-no-reconnect', '--important-plugin=/home/runner/work/lightning/lightning/tests/plugins/fail_by_itself.py', '--bitcoin-rpcport=60953']' timed out after 900 seconds

900 seconds seems too long for it to just be slow, so it seems we may indeed end up waiting forever for the exiting plugin to complete the handshake.

getinfo output

Commit 626998e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant