Client clash with Tornado IOLoop #33697
Comments
@asloboda-cisco I really don't consider this a bug in Salt. We use a Tornado ioloop on salt's main thread in the running daemons. If your application is running in the same process, you need to either use Salt's IOLoop or re-architect it such that your loop is running on its own thread. |
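The "own thread" option suggested above can be sketched roughly as follows. This is a minimal illustration, not Salt's or Tornado's code: it uses Python's standard `asyncio` as a stand-in for a Tornado IOLoop, and `fetch_value`, `blocking_client_call`, `handler`, and `main` are hypothetical names. The idea is that the blocking, loop-driving call gets a private loop on a worker thread, so it never touches the loop that is already serving requests.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fetch_value():
    # Hypothetical stand-in for the asynchronous work a client does internally.
    return "pong"


def blocking_client_call():
    # Runs on a worker thread: create a private event loop and drive it
    # to completion, mimicking a synchronous client wrapper.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fetch_value())
    finally:
        loop.close()


async def handler(executor):
    # Runs on the application's main loop (the web server's loop in the
    # scenario discussed here). The blocking call is pushed to the executor
    # so the main loop is never re-entered.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, blocking_client_call)


def main():
    with ThreadPoolExecutor(max_workers=1) as executor:
        return asyncio.run(handler(executor))
```

Whether this layout suits a given application depends on how the client objects are shared between threads; the sketch only shows the loop separation itself.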
@cachedout I get that this kind of separation is a requirement, but I got a bit lost in your answer. Is the Salt daemon the Salt master server? If so, how does it affect what's going on in the client, and why was the change that breaks things required? AFAIK it used an IOLoop before that too. I can't find anything about IOthread in Salt; can you point me to it? |
@asloboda-cisco This may be related to the ZMQ version. I solved the problem by updating python-zmq and zeromq. |
@ypoison I read that somewhere but apparently pyzmq==14.0.1 (& ZMQ: 4.0.4) should be fine. Which version do you suggest? |
@asloboda-cisco Sorry, I don't know why I wrote Could you clarify how you're using the salt client and give more details about how your application is interacting with Salt? |
@cachedout It is a WSGI app that uses the Tornado web server to serve pages, and when it starts the HTTPServer it also starts the IOLoop. I did not write the code, but judging by the Tornado documentation that seems to be the thing to do. It works just fine until a request results in an action that tries to initialize the Salt client. |
@cachedout I managed to get together an example which works with previous Salt version:
but does this now:
Here's the code:
|
I'm having the very same issue after upgrading from 2015.8.10 to 2016.3.0. Can you guys please take a deeper look at this issue? Thanks in advance. |
@danlsgiga Which salt-api backend? CherryPy or Tornado? |
@cachedout rest_cherrypy |
@danlsgiga Oh, wow. OK. I had assumed Tornado. Interesting. We'll take a look. |
@cachedout It seems the problem isn't in salt-api... Here's a relatively easy way to reproduce:
Setup
The error_runner.py looks like this:
import salt.client

def test_func():
    caller = salt.client.Caller()
    return caller.cmd('test.ping')
Steps to Reproduce Issue
$ sudo salt-run error_runner.test_func
[ERROR ] Exception in callback <functools.partial object at 0x7fccda2dd8e8>
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/tornado/ioloop.py", line 592, in _run_callback
ret = callback()
File "/usr/lib/python2.7/dist-packages/tornado/stack_context.py", line 343, in wrapped
raise_exc_info(exc)
File "/usr/lib/python2.7/dist-packages/tornado/stack_context.py", line 314, in wrapped
ret = fn(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/tornado/ioloop.py", line 598, in <lambda>
self.add_future(ret, lambda f: f.result())
File "/usr/lib/python2.7/dist-packages/tornado/concurrent.py", line 215, in result
raise_exc_info(self._exc_info)
File "/usr/lib/python2.7/dist-packages/tornado/gen.py", line 230, in wrapper
yielded = next(result)
File "/usr/lib/python2.7/dist-packages/salt/crypt.py", line 462, in _authenticate
io_loop=self.io_loop)
File "/usr/lib/python2.7/dist-packages/salt/transport/client.py", line 105, in factory
return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 84, in __new__
new_obj.__singleton_init__(opts, **kwargs)
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 155, in __singleton_init__
io_loop=self._io_loop,
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 820, in __init__
self._init_socket()
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 869, in _init_socket
self.stream = zmq.eventloop.zmqstream.ZMQStream(self.socket, io_loop=self.io_loop)
File "/usr/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 107, in __init__
self._init_io_state()
File "/usr/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 528, in _init_io_state
self.io_loop.add_handler(self.socket, self._handle_events, self._state)
File "/usr/lib/python2.7/dist-packages/tornado/ioloop.py", line 704, in add_handler
self._impl.register(fd, events | self.ERROR)
TypeError: argument must be an int, or have a fileno() method.
@Ch3LL has bisected it to a37a270. This has been blocking us from upgrading to 2016.3.x |
hey @fracklen, thanks for this. Good catch guys!! Thanks!! |
@cachedout @Ch3LL would it be possible to create an integration test for this, based on the relatively simple runner with the |
I'm looking at this today but also wanted to bring in @skizunov for additional insight. |
@cachedout : The code seems to be using a Tornado IO Loop
In theory, the code should correctly select which type of IO Loop it uses (based on whether or not ZMQ is installed). Perhaps there is a bug in that logic or there is a missed use case. |
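The traceback earlier in the thread ends with Tornado's `add_handler` being handed a ZMQ socket, which is why the choice of loop class matters. A selection step of the kind described here might look roughly like the sketch below. It is an assumption for illustration, not Salt's actual logic: `select_loop_class` and its `candidates` parameter are hypothetical, and the two module/class pairs are simply tried in order of preference.

```python
import importlib


def select_loop_class(candidates=(
    ("zmq.eventloop.ioloop", "ZMQIOLoop"),  # preferred when pyzmq is available
    ("tornado.ioloop", "IOLoop"),           # plain Tornado fallback
)):
    """Return the first loop class that can be imported, or None."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # This library is not installed; try the next candidate.
            continue
        loop_class = getattr(module, attr_name, None)
        if loop_class is not None:
            return loop_class
    return None
```

A missed use case would then be any code path that builds its loop directly instead of going through a single selection point like this one.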
@asloboda-cisco @danlsgiga @fracklen can you please test my pr: |
@thatch45 Your fix works fine!!! I just applied it to my dev environment, and my custom runners that use the salt client internally are running fine now. Thanks a lot!! Any idea which release will include this fix? |
good to hear! We are trying to finalize 2016.3.2 right now, if this fix is good we are down to just one blocker bug I think. |
@thatch45 works like a dream. Awesome! Sounds great if it could hit 2016.3.2. |
it is merged into the branch, so this fix will be in 2016.3.2, glad I could help :) |
with multiple validations on the fix and the pr being merged I am going to close this out as fixed |
@thatch45 wrt the tests, can this be an issue for other transports as well? Would be nice to have a test that verifies that a runner can connect to a running IOLoop using the client - using each transport. |
Yes, good call. I did validate this fix on the other transports, and a test that checks for this will be run on our multiple transport test platforms. |
After upgrading to 2016.3 (from 2015.8), my application started crashing in the Salt client. The application uses Tornado 4 as well, and the client crashes with
IOLoop is already running
I tracked down the relevant piece of code to the SMinion constructor
io_loop.run_sync
and by making a change to this io_loop initialization I made it go away:
I am not familiar with the nuances of io_loop constructors or why this changed between Salt versions, but it doesn't seem to break anything.
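The failure mode reported here, a synchronous helper trying to drive a loop that is already running, can be illustrated without Salt or Tornado installed. The sketch below is an analogy only: it uses Python's standard `asyncio` in place of a Tornado IOLoop, and `fetch_value`, `sync_client_call`, `handler`, and `reproduce` are hypothetical names rather than anything from Salt.

```python
import asyncio


async def fetch_value():
    # Hypothetical stand-in for the client's internal asynchronous work.
    return 42


def sync_client_call(loop):
    # Mimics a synchronous wrapper that drives the given loop to completion,
    # in the spirit of the run_sync call mentioned above.
    coro = fetch_value()
    try:
        return loop.run_until_complete(coro)
    finally:
        # Avoid a "never awaited" warning when the call is rejected.
        coro.close()


async def handler():
    # We are already inside a running loop, like a request handler would be.
    loop = asyncio.get_running_loop()
    try:
        sync_client_call(loop)
    except RuntimeError as exc:
        return str(exc)
    return None


def reproduce():
    # Returns the error message raised when the running loop is re-entered.
    return asyncio.run(handler())
```

Calling `sync_client_call` with a fresh, idle loop succeeds, which is consistent with the observation that changing how the loop is initialized makes the error go away.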