Minion never returns from state and/or master never sees state completion #4975
Comments
Is this using an init script or upstart? I imagine this is related to the issue where scripts block in some cases. We are actively looking into this problem; I will cross-reference the issue when I find it. @UtahDave has been at the forefront of this one.
It's using upstart. Installed via the Ubuntu salt-minion package, 0.15.1.
Thanks for the feedback, we should be able to reproduce this.
Let me know if I can help in any way. Thanks.
If you want to try, open up the file:
I've tried this to no avail. That was to be expected I guess because
Any other ideas I can try?
No, this is only applied if the platform is Windows. Add the line outside of the if block.
Hmm, I do see my debug message there: Besides, I did add the line outside the if block, without success.
OK, that helps our data; it means that this is lower level.
(Not sure if this applies at all to Ubuntu and Upstart.) I saw something similar with an internal application package. It turned out that the init.d script started the app as a different user using /bin/sh, but only redirected stdout to a file. Once a redirect of stderr to /dev/null was added, the package installed properly. I'm not quite sure why, but I assume the application/shell was holding on to the stderr fd, causing Python to wait. When running on the command line, your current shell is still active, so that fd isn't closed. I'd try adding a debug logging line before Popen, between Popen and communicate, and after communicate in cmdmod.py (https://github.com/saltstack/salt/blob/develop/salt/modules/cmdmod.py#L298). If you find that it gets to communicate but stops there, have a look at your processes and see if there are any zombie processes (i.e. [defunct] processes). I had one when it happened, and once I killed the app it went away and Salt continued on. I'd also try running the minion in non-daemonised mode under strace to see what the last call is: just prefix your minion command with 'strace'.
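A minimal sketch of the behaviour described in this comment, assuming a POSIX shell and the `sleep` utility are available. `run_and_time` is a hypothetical helper written for illustration, not code from Salt's cmdmod.py. It places the three suggested debug lines around Popen and communicate, and compares a background job that redirects only stdout with one that also redirects stderr.

```python
import logging
import subprocess
import time

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("popen_demo")


def run_and_time(command):
    """Return the seconds elapsed from Popen() to the end of communicate()."""
    log.debug("before Popen: %s", command)
    start = time.monotonic()
    proc = subprocess.Popen(
        command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    log.debug("between Popen and communicate (pid %d)", proc.pid)
    # communicate() reads until EOF on both pipes, i.e. until every process
    # that inherited a write end has closed it, not just the shell itself.
    proc.communicate()
    log.debug("after communicate")
    return time.monotonic() - start


# The background job redirects only stdout, so it inherits the stderr pipe.
leaky = run_and_time("sleep 2 > /dev/null &")
# Redirecting stderr as well leaves no stray holder of either pipe.
clean = run_and_time("sleep 2 > /dev/null 2>&1 &")
print("only stdout redirected: %.1fs" % leaky)
print("stdout and stderr redirected: %.1fs" % clean)
```

In the first call the shell exits at once, but communicate should only return when the background `sleep` ends, because that process still holds the stderr pipe open. In the second call it should return almost immediately. A daemon that never exits would therefore keep the caller waiting indefinitely.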
I tried what you suggested and looks like we never return from
Is there any fix for this, aside from modifying the CouchDB init.d script? There is indeed a defunct process, but killing it does not help.
Good point. Although looking at the
The problem is that Salt is a remote execution system, not just configuration management. Someone will eventually want to run
I think non-blocking I/O in subprocess is the only way to put an end to all of these issues with misbehaving init scripts, e.g. http://www.python.org/dev/peps/pep-3145/ Something like:
It would need a lot of testing though!
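As a rough illustration of the non-blocking idea, and not the commenter's code or anything from Salt, the sketch below reads both pipes through a selector and stops once the command itself has exited and the pipes have gone idle, rather than waiting for EOF. `run_until_exit` and `poll_interval` are hypothetical names. It assumes a POSIX system and Python 3.5 or later.

```python
import os
import selectors
import subprocess
import time


def run_until_exit(command, poll_interval=0.1):
    """Run a shell command and return (returncode, stdout, stderr).

    Hypothetical helper: it stops reading once the command has exited and
    no more output arrives, instead of waiting for EOF on the pipes.
    """
    proc = subprocess.Popen(
        command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    selector = selectors.DefaultSelector()
    chunks = {proc.stdout: [], proc.stderr: []}
    for stream in chunks:
        os.set_blocking(stream.fileno(), False)
        selector.register(stream, selectors.EVENT_READ)

    while selector.get_map():
        exited = proc.poll() is not None
        events = selector.select(timeout=poll_interval)
        for key, _mask in events:
            data = os.read(key.fd, 65536)
            if data:
                chunks[key.fileobj].append(data)
            else:
                # EOF: every holder of this pipe's write end has closed it.
                selector.unregister(key.fileobj)
        if exited and not events:
            # The command is done and the pipes are idle: stop waiting even
            # if a lingering daemon still holds a write end open.
            break

    selector.close()
    proc.wait()
    stdout = b"".join(chunks[proc.stdout])
    stderr = b"".join(chunks[proc.stderr])
    proc.stdout.close()
    proc.stderr.close()
    return proc.returncode, stdout, stderr


start = time.monotonic()
returncode, stdout, stderr = run_until_exit("echo started; sleep 2 > /dev/null &")
elapsed = time.monotonic() - start
print(returncode, stdout, "%.1fs" % elapsed)
```

The trade-off is that output written by a lingering background process after the command exits is dropped. That is one reason an approach like this would need the testing mentioned above.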
Duplicate: same underlying problem as #4410.
Trying to install CouchDB on a minion when salt-minion was started from its init script fails. When started from the command line it works.
Installing other packages works. So far, only CouchDB has been found not to work.
The command:
The state file:
Note the following command does not work either (the logs below are from running this command):
Versions:
salt-minion --version: 0.15.1
salt-master --version: 0.15.1
Both running on Ubuntu 12.10 under AWS EC2.
The output from pkg.install couchdb:
At this point, CouchDB is installed on the minion. I can query it: couchdb -V returns couchdb - Apache CouchDB 1.2.0. No matter how long I wait, it keeps going.
Running the minion in debug mode shows that the following command is executed repeatedly:
An excerpt from the minion debug log:
The corresponding master log:
This:
returns nothing. And this:
returns:
Running the minion from the command line works with no problem at all. Things only break when it is started from its init script.