[BUG] Empty formula makes salt-call to hang forever. #57456
Comments
I ran into this as well. Indeed, adding any bit of data to the SLS outside of the if statement, so that it is never empty, makes it work. |
Apologies! I can't seem to get the labels correct today. My mistake in removing P3; I merely wanted to add the severity with the definition that includes "hanging", which ATM is |
I am seeing the same thing on Ubuntu 20.04. This also smells like #56332. It seems anything that touches salt:// hangs. Hangs, where testfile is an empty file:
Doesn't hang:
Doesn't hang:
Doesn't hang:
Hangs:
Hangs:
Doesn't hang:
Hangs and salt://testtest doesn't exist:
Hangs and salt://saltstack does exist:
|
I thought this might relate to gitfs, but a simple git clone with file_roots does the same thing. I also tried stripping it down. No pillars, no custom grains, a super basic top.sls, and an example state from above. Even the most basic of setups hangs with 3000.3, py3, and salt-call. |
Ubuntu 18.04, salt 3000.3 - no hang. It's only a |
Can you provide, at minimum, verbose versions output for both platforms? I suspect that a PY2 vs PY3 difference is what matters here. |
I feel that it's somewhere at a low level, in the way the event loop is handled. It looks like that the final Indeed:
The Next question where is that |
Nope Works fine
Hangs
|
In case it's relevant, with debug logging the end is
|
Is this still a thing? I tried to reproduce it with the code from git as of 2020-06-09 and it looks OK:
The versions being used are:
|
I can't reproduce it on current master, @b-a-t please confirm |
Testing 3001 release on Ubuntu 20.04. |
I've tried this a couple of times and haven't been able to repro. I didn't try on Ubuntu 20.04, but I'll give that a shot tomorrow and see if I can repro. |
IMO, the issue is pretty serious. Could it be considered for 3001.2? Waiting till 3002 (October?) is too much... |
Did a couple of tests with different python versions (using
When I place the
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    import ipdb; ipdb.set_trace()
    sys.exit(
        load_entry_point('salt==3001', 'console_scripts', 'salt-call')()
    )
and step over the
but it is a debugger-induced problem and it hangs somewhere else without it. The interesting thing is that The hanging issue could be related to https://bugs.python.org/issue6721 (a pure guess). |
Just found this one while digging into a similar issue (Ubuntu 20.04, salt 3001). Also checked #53319 and #57954, but this one seems to be the most relevant. My state is pretty short:
Minion output:
This is 100% reproducible in my current setup. |
Thank you @defanator, this is a critical issue and we attempted to fix it in the 3001.1 point release, but competing priorities and some core team members not being able to reproduce it caused us to delay. The issue and its fix have been moved to the Magnesium release. The engineer assigned has reproduced it and will provide a fix. We tried here and didn't succeed. |
@sagetherage thanks! @cmcmarrow please let me know if there's anything I could provide in order to help address this one as soon as possible. I'd also be happy to run a few tests once a fix is available. TIA! |
It hangs at the garbage collection stage:
Potential references (links are intentionally broken to avoid spamming upstream issues): https://github.com/zeromq/ pyzmq/issues/1167 |
Workaround: set |
@max-arnold yeah, same here:
zmq is waiting for something. forever. |
I'm not sure about this particular case, but I've been hunting down the one where an incorrect state is provided, e.g.
vs
The first one stalls out, at a I'm pretty sure that it is something within zeromq that's calling it, because if I simply manually do an I don't know if the loader is a red herring or not, though. I was also able to workaround it at the end of 🤷 There are a lot of things that I know at this point, but I think there are actually more things that I don't know than when I started 😝 |
@waynew @max-arnold Works for me after downgrading pyzmq to |
@DmitryKuzmenko @waynew @max-arnold @cmcmarrow unfortunately I can't invest much time into further debugging of this one, but a number of quick experiments show that downgrading pyzmq to 18.0.1 also resolves the issue for us (Ubuntu 20.04 / Python 3.8.2 / libzmq 4.3.2). We've switched to TCP transport until there's a clean fix. |
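For anyone wanting to check a host against the versions reported in this thread, here is a minimal Python sketch. It only encodes what commenters observed (hangs with PyZMQ 19.0.0, no hang after downgrading to 18.0.1); treating every release newer than 18.0.1 as suspect is an assumption, and the helper names (`parse_version`, `is_suspect`, `installed_pyzmq_version`) are made up for illustration.

```python
from importlib import metadata

# Assumption: the thread reports hangs with pyzmq 19.0.0 and none with 18.0.1.
# Flagging everything newer than 18.0.1 is a guess, not a confirmed range.
LAST_REPORTED_GOOD = (18, 0, 1)


def parse_version(text):
    """Turn '19.0.0' into (19, 0, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_suspect(version_text):
    """True if the version is newer than the last one reported to work."""
    return parse_version(version_text) > LAST_REPORTED_GOOD


def installed_pyzmq_version():
    """Return the installed pyzmq version string, or None if it is absent."""
    try:
        return metadata.version("pyzmq")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyzmq_version()
    if found is None:
        print("pyzmq is not installed")
    elif is_suspect(found):
        print(f"pyzmq {found} is newer than 18.0.1; the hang may apply")
    else:
        print(f"pyzmq {found} is at or below the version reported to work")
```

This says nothing about libzmq itself; later comments in the thread suggest the libzmq version alone did not change the outcome for at least one reporter.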
same here
|
This hack will stop the hang but will skip cleanup and may leak resources.
|
The bug does not sit in |
Even though this is an upstream bug, we are working on a patch and will open a pull request with libzmq |
I'm sorry, but I downgraded from libzmq 4.3.2 to libzmq 4.3.1 and it still hangs. |
@disaster123 I think you need to rebuild |
Maybe #57856 is still a different bug. I did:
but salt still hangs at the end of execution. |
OK, the working fix is just to install pyzmq=18.0.1; no matter which version of libzmq is installed, it works with both 4.3.2 and 4.3.1 for me. The important thing is that you use:
|
@disaster123 thanks for the info. What state are you using to test this? I want to see if your state causes the hang on my box. |
@cmcmarrow i'm coming from #57856 where salt-call hangs on EVERY state. So for me it didn't matter what kind of state it is. |
@cmcmarrow where is the upstream issue being tracked? |
I solved this issue (fepitre@4c5e18b) by defining the attribute for compatibility with newer python3.9 python/cpython@21da76d. |
Description
Empty formula makes salt-call to hang forever.
Setup
We have a simple formula that allows adding some custom states during the highstate, based on the minion's FQDN:
custom_minion_state/init.sls:
In 99% of the cases that renders to an empty SLS file, as no customization is necessary.
Steps to Reproduce the behavior
Applying that formula to a host that doesn't require any customization produces:
For Salt 2018.3.3, salt-call immediately returns to the shell. But for Salt 3000.3 it hangs after this output forever(?). Well, long enough to lose one's temper and kill it with Ctrl-C.
Expected behavior
Application of the empty state should exit immediately after the state has been evaluated and applied.
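The expected behavior can be checked without waiting at the terminal. Below is a rough Python sketch that runs a command and reports whether it exited on its own. The helper name `finishes_within` is hypothetical, and any time limit is an assumption, since the report never says how long "forever" was.

```python
import subprocess
import sys


def finishes_within(cmd, timeout_seconds):
    """Return True if cmd exits before the timeout, False if it was killed.

    subprocess.run() kills the child and raises TimeoutExpired when the
    timeout elapses, which is how a hang is detected here.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # a non-zero exit still counts as "finished"
        )
    except subprocess.TimeoutExpired:
        return False
    return True


# Stand-in commands so the sketch runs anywhere, without Salt installed.
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
```

On an affected minion, something like `finishes_within(["salt-call", "state.apply", "custom_minion_state"], 60)` would be expected to return False. Those salt-call arguments and the 60-second limit are assumptions, because the original command line was not preserved in this report.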
Versions Report
salt --versions-report
Salt Version:
Salt: 3000.3
Dependency Versions:
cffi: 1.11.5
cherrypy: Not Installed
dateutil: 2.6.1
docker-py: 4.2.0
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 2.10.1
libgit2: Not Installed
M2Crypto: 0.35.2
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.6.2
mysql-python: Not Installed
pycparser: 2.14
pycrypto: Not Installed
pycryptodome: 3.9.7
pygit2: Not Installed
Python: 3.6.8 (default, Nov 21 2019, 19:31:34)
python-gnupg: Not Installed
PyYAML: 3.12
PyZMQ: 19.0.0
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.3.2
System Versions:
dist: centos 8.1.1911 Core
locale: UTF-8
machine: x86_64
release: 4.18.0-147.8.1.el8_1.x86_64
system: Linux
version: CentOS Linux 8.1.1911 Core
Additional context
Running the same command with -l debug gives a bit more information:
Running through strace shows:
Applying the same state from the master completes immediately, but with a note that the state completed with an ERROR.