Skip to content

Exception 'signal only works in main thread' when Connection.run command with pty=True #2204

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
francesco-giordano opened this issue Mar 28, 2022 · 10 comments

Comments

@francesco-giordano
Copy link

francesco-giordano commented Mar 28, 2022

Describe the bug
When running a command with Connection.run and argument pty=True in a thread we get that error. fabric version 2.7.0

To Reproduce
Steps to reproduce the behaviour (please attach a minimal example):
Running the code below on fabric 2.6.0 everything work fine, while if it is run with 2.7.0 it fails due to this line

signal.signal(signal.SIGWINCH, self.handle_window_change)

from fabric import Connection
import threading
def function(connection):
    command="uptime"
    connection.run(command, warn=True, pty=True, hide=False, timeout=None)

connection_kwargs = {
    "host":  <MY_IP>,
    "user": <MY_USER>,
    "forward_agent": False,
    "connect_kwargs": {"key_filename": "<PATH TO MY KEY>"},
}
connection = Connection(**connection_kwargs)
command="uptime"

connection.run(command, warn=True, pty=True, hide=False, timeout=None)
t1=threading.Thread(target=function, args=(connection,))
t1.start()
t1.join()

Expected behaviour
fabric works within a thread

Environment
Bug seen in python 3.7.2 and 3.10.1

  • What version of the Python interpreter are you using? Are you using an
    alternative interpreter such as PyPy?
  • What operating system are you using both client & server-side? Ubuntu, centos and amazon linux as server and mac as client
  • Are you using OpenSSH server or something else? OpenSSH
  • Which version or versions of the software are you using?
    • Have you already tried the latest release? yes
    • Have you, or can you, try some older releases to pin down where the bug
      appeared? 2.6.0 work fine
  • How can the developers recreate the bug on their end? If possible, include a
    copy of your code, the command you used to invoke it, and the full output of
    your run (if applicable.)
 13:28:46 up 3 days, 20:17,  1 user,  load average: 0.00, 0.00, 0.00
Exception in thread Thread-5 (function):
Traceback (most recent call last):
  File "/Users/giordafr/.pyenv/versions/3.10.1/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/Users/giordafr/.pyenv/versions/3.10.1/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/test.py", line 5, in function
    connection.run(command, warn=True, pty=True, hide=False, timeout=None)
  File "<decorator-gen-3>", line 2, in run
  File "/Users/giordafr/.pyenv/versions/parallel-cluster-tests/lib/python3.10/site-packages/fabric/connection.py", line 30, in opens
    return method(self, *args, **kwargs)
  File "/Users/giordafr/.pyenv/versions/parallel-cluster-tests/lib/python3.10/site-packages/fabric/connection.py", line 725, in run
    return self._run(self._remote_runner(), command, **kwargs)
  File "/Users/giordafr/.pyenv/versions/parallel-cluster-tests/lib/python3.10/site-packages/invoke/context.py", line 102, in _run
    return runner.run(command, **kwargs)
  File "/Users/giordafr/.pyenv/versions/parallel-cluster-tests/lib/python3.10/site-packages/fabric/runners.py", line 72, in run
    return super(Remote, self).run(command, **kwargs)
  File "/Users/giordafr/.pyenv/versions/parallel-cluster-tests/lib/python3.10/site-packages/invoke/runners.py", line 379, in run
    return self._run_body(command, **kwargs)
  File "/Users/giordafr/.pyenv/versions/parallel-cluster-tests/lib/python3.10/site-packages/invoke/runners.py", line 430, in _run_body
    self.start(command, self.opts["shell"], self.env)
  File "/Users/giordafr/.pyenv/versions/parallel-cluster-tests/lib/python3.10/site-packages/fabric/runners.py", line 46, in start
    signal.signal(signal.SIGWINCH, self.handle_window_change)
  File "/Users/giordafr/.pyenv/versions/3.10.1/lib/python3.10/signal.py", line 47, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: signal only works in main thread of the main interpreter
francesco-giordano added a commit to francesco-giordano/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to francesco-giordano/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to francesco-giordano/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to francesco-giordano/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to francesco-giordano/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to aws/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to aws/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to aws/aws-parallelcluster that referenced this issue Mar 28, 2022
francesco-giordano added a commit to francesco-giordano/aws-parallelcluster that referenced this issue Mar 28, 2022
demartinofra pushed a commit to aws/aws-parallelcluster that referenced this issue Mar 28, 2022
@Shooshp
Copy link

Shooshp commented Apr 14, 2022

Quite annoying bug, have to rollback from 2.7.0 on 2.6.0.
Environment - python:3.9-slim docker container.

@ghost
Copy link

ghost commented Apr 17, 2022

Fabric 2.7 has bugs,
Fabric 2.6 works fine

@edsharp
Copy link

edsharp commented Apr 28, 2022

We are seeing a similar signal only works in main thread regression in behaviour in 2.7 when invoking .run with

                            "warn": True,
                            "echo": True,
                            "pty": True,

Fabric 2.6.0 worked without issue.

In both cases we were using python 3.8.

@demartinofra
Copy link

Hello, is there any update on a potential fix for this issue? Thank you!

@Crux-One
Copy link

v2.7 likely contains bugs.
Here is my Pipfile. It works.

[packages]
fabric = "==2.6"

[requires]
python_version = "3.9.9"

@juchen-dev
Copy link

Same issue on python 3.9

@ghost
Copy link

ghost commented Apr 18, 2023

Weird that this is happening on Pythons before 3.9, since I dug into this and found https://docs.python.org/3.9/whatsnew/3.9.html containing a note about seemingly exactly this error message (just search for 'signal').

This likely started happening in Fabric 2.7 since that's when the SIGWINCH change using the signal library landed.

Off the top of my head the right fix is probably to make that signal registration conditional on being the main thread; someone running Fabric inside worker threads is rather unlikely to care about highly interactive remote apps, which is what SIGWINCH is for.

Side note: I have notes from internal users at my job that pty=False skips this code path and should serve as a workaround until this gets patched.

@gmarciani
Copy link

Hello, is there any update on the patch?

@juchen-dev
Copy link

I use fabric as a library, and was able to get around this by monkey patching the problematic part:

from invoke.terminals import pty_size
from fabric.main import Argument, Config, Executor, Fab

def patched_start(self, command, shell, env, timeout=None):
    self.channel = self.context.create_session()
    if self.using_pty:
        cols, rows = pty_size()
        self.channel.get_pty(width=cols, height=rows)
        # NOTE: Everything else is the same except for these two commented lines
        # if hasattr(signal, "SIGWINCH"):
        #     signal.signal(signal.SIGWINCH, self.handle_window_change)
    if env:
        if self.inline_env:
            parameters = " ".join(["{}={}".format(k, v) for k, v in sorted(env.items())])
            command = "export {} && {}".format(parameters, command)
        else:
            self.channel.update_environment(env)
    self.send_start_message(command)

class MyProgram(Fab):
    def update_config(self):
        # This lets us patch Remote.start with the function above
        self.config.runners.remote.start = patched_start
        super().update_config()

program = MyProgram(
    name="Fabric",
    version="1.0.0",
    executor_class=Executor,
    config_class=Config,
)

sys.exit(program.run())

@bitprophet
Copy link
Member

Ran into this on CircleCI while fixing a related issue - it looks like the simple check of "are we the main thread?" is sufficient to avoid the problem.

That fix will be released imminently in Fabric 3.2.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants