Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exit code allway zero #24

Closed
ramary opened this issue Jun 26, 2011 · 34 comments
Closed

exit code allway zero #24

ramary opened this issue Jun 26, 2011 · 34 comments

Comments

@ramary
Copy link

ramary commented Jun 26, 2011

I want to use the exit code for nagios but it's allways 0.

ecample 1:
zope@plone:> ./bin/supervisorctl status
http://127.0.0.1:9001 refused connection
zope@plone:
> echo $?
0

Could you change exit LSB conform ?
http://refspecs.linuxfoundation.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

example 2:
./bin/supervisorctl status
balancer RUNNING pid 23753, uptime 0:00:50
instance0 RUNNING pid 23751, uptime 0:00:50
instance1 EXITED Jun 26 05:40 PM
varnish RUNNING pid 23754, uptime 0:00:50
zeo RUNNING pid 23750, uptime 0:00:50
zope@plone:~> echo $?
0

@mnaberez
Copy link
Member

@mnaberez
Copy link
Member

mnaberez commented May 9, 2012

See also issue #116.

@mgedmin
Copy link

mgedmin commented Nov 21, 2012

FWIW supervisorctl pid used to exit with return code 2 when the daemon was not running in supervisor 3.0a8 (as packaged for Ubuntu 12.10), but this regressed back to return code 0 with supervisor 3.0b1 (latest on PyPI).

@mnaberez
Copy link
Member

I looked into that one specifically. The pid command in supervisorctl never intentionally set an exit code in any version. The other commands do not set exit codes either. The only condition that causes exit code 2 is when an unhandled exception occurs inside supervisorctl.

A bug in the pid command caused an unhandled exception, and that's why you saw pid return exit code 2 when the daemon was not running. The bug in pid has since been fixed, and that's why it no longer behaves this way (instead it now behaves like the other commands).

I agree we should set meaningful exit codes for the commands in a future version.

@rascalking
Copy link

Any word on accepting the pull request @GotenXiao submitted? Always exiting with a 0 means automating service bounces with tools like fabric will have no way to tell that the bounce attempt failed.

@dnephin
Copy link

dnephin commented Apr 9, 2014

+1

@dnephin
Copy link

dnephin commented May 8, 2014

I think you can work around this issue using a plugin. Something like this: https://gist.github.com/dnephin/f61646745bd004afba8b

@lukeweber
Copy link
Contributor

I took a stab at this starting where the previous patch left off. I turned every error message into an exit. It's a bit crude, but thought I might share. I didn't attempt to rewrite the unittests because I figure someone might take a different approach anyways.

lukeweber@04d47d0

@gdm85
Copy link

gdm85 commented Nov 4, 2014

Why is this marked as "enhancement"? it's a bug. Supervisor should return an exit code different than 0 when the operation requested by user failed.

For example when the daemon is not running (so when the socket is not even there, if you are using a socket), supervisorctl still returns 0.

@lukeweber I am going to try your patch, thanks

@wwalker
Copy link

wwalker commented Dec 8, 2014

+1 - this is a bug

@exi
Copy link

exi commented Mar 26, 2015

This ticket is years old and concerns a serious shortcoming. Could we bump it's priority or provide some kind of sensible workaround?

@gdm85
Copy link

gdm85 commented Mar 26, 2015

@exi I have been using @lukeweber's patch, but obviously already switching to runit for the long term/production.

@rtwair
Copy link

rtwair commented Apr 1, 2015

Labeling this as an enhancement makes absolutely no sense.

+1 - definitely a bug

@neutrinus
Copy link

+1

This issue makes us cry at least once a week. As supervisord is not returning anything but 0, ansible has to parse the "human" strings to check the state.

@mattwiller
Copy link

+1 — this is a serious issue that makes it very difficult to integrate supervisor with our deployment infrastructure

@castarco
Copy link

+1 - That's a serious bug. This shoudn't be seen as a low priority enhancement

It difficults the integration of supervisord with other automatization tools (rather than checking the return code, the only generic way to know if the command failed is to analyze the supervisorctl output to stdout....).

@vincent-io
Copy link

+1, this is a bug

@lukeweber
Copy link
Contributor

+1 for getting this fixed. It's pretty unusable for production for me without the patch. I've been using my patched version for a year now in production. @mnaberez would you merge this if there were tests or is there some other holdup? At 4 years pending I'm a bit curious why it's still not done. I recall it looking like a bit of a rewrite would be in order to test exit codes adequately.

@mnaberez
Copy link
Member

@mnaberez would you merge this if there were tests or is there some other holdup?

I'm not sure what you are referring to. This is an issue not a pull request. As for tests, this project doesn't typically merge patches without tests.

@msabramo
Copy link
Contributor

@lukeweber has a commit with a patch (lukeweber/supervisor@04d47d0), but he has not filed a pull request I think.

I know of at least two PRs - mine (#620) and one from @GotenXiao (#292).

Mine needs tests which I have not gotten to yet. I have not looked at the changes from @GotenXiao or @lukeweber. I guess I'm curious which of the 3 is closest to what is desired and then I'll focus on the best one if I get some time.

@msabramo
Copy link
Contributor

At first glance, lukeweber/supervisor@04d47d0 looks comprehensive and seems to have tests, so maybe it's a better starting point than my own PR.

@lukeweber: Do you want to submit a PR?

@lukeweber
Copy link
Contributor

@mnaberez -

  1. Is supervisorctl.py code used for anything besides single commands or interactive mode(two states), or is it reused in libraries (third state - not interactive, raise exceptions and don't exit on errors)?

Implementation could be the following that I think would work for both:

  1. implement handle_error(message=None, fatal=False)
  2. Wrap any output that contains "Error" or "ERROR" in supervisorctl.py with a handle_error(message=...) method.
  3. Wrap any blank "raise" with handle_error(fatal=True)
  4. Add an option exit_on_error in main() in supervisorctl.py defined as !interactive
  5. Create duplicate tests for all tests that check for different Exceptions in test_supervisorctl.py to check for SystemExit when options.exit_on_error = True

Handle error example code:

    def handle_error(self, message=None, fatal=False):
        if message:
            self.ctl.output(message)

        if self.ctl.options.exit_on_error:
            raise SystemExit(2)
        elif fatal:
            raise

@lukeweber
Copy link
Contributor

Lets get this merged: #641

@lukeweber
Copy link
Contributor

Would be cool if anyone wanted to test out the latest code. It's just such a chunk would feel better if a few people ran through some real world cases.

@gdm85
Copy link

gdm85 commented Aug 8, 2015

@lukeweber what about adding some script-based integration tests? I can see that a simple command like sh -c 'sleep 3 && true' and sh -c 'sleep 3 && false' could do the job

lukeweber added a commit to lukeweber/supervisor that referenced this issue Sep 30, 2015
* error_status being set in supervisorctl.py raises a SystemExit of that code on exit.
* handle_error wraps all error output and sets error_status if not in interactive mode.
* onecmd is the entry point for single command and where we enforce SystemExit.
lukeweber added a commit to lukeweber/supervisor that referenced this issue Sep 30, 2015
lukeweber added a commit to lukeweber/supervisor that referenced this issue Sep 30, 2015
lukeweber added a commit to lukeweber/supervisor that referenced this issue Oct 7, 2015
* error_status being set in supervisorctl.py raises a SystemExit of that code on exit.
* handle_error wraps all error output and sets error_status if not in interactive mode.
* onecmd is the entry point for single command and where we enforce SystemExit.
lukeweber added a commit to lukeweber/supervisor that referenced this issue Oct 7, 2015
@tbg
Copy link

tbg commented Dec 10, 2015

also very surprised that such a basic thing isn't working yet. Ping https://github.com/cockroachdb/cockroach-prod/issues/59

@c0b
Copy link

c0b commented Jan 27, 2016

+1 for a severe bug of supervisorctl

from 2011 till now 2016 why hasn't it be fixed yet?

@nocive
Copy link

nocive commented May 17, 2016

+1 in which version of supervisor will this be available?
pretty shocking this took 5 years to be fixed...

@AaronOpfer
Copy link

Also curious what version of supervisor this bug will be fixed in.

Thanks supervisor contributors for making such a useful tool!

@mnaberez
Copy link
Member

This has been merged into the master branch which will be Supervisor 4.0.

@pawelad
Copy link

pawelad commented Dec 13, 2017

Is there any chance of backporting this to 3.x version? I'd be happy to help.

@kujiy
Copy link

kujiy commented Apr 6, 2018

I use supervisorctl status cron | grep RUNNING instead for a while on 3.x. It returns 1 when got Fatal.

@talawahtech
Copy link

talawahtech commented Oct 29, 2018

Hi, the current docs say the following:

supervisorctl status all would return non-zero if any single process was not running

However this doesn't seem to be the case when I try it. Are the docs incorrect or is there something else going on. Is there any version of supervisor where supervisorctl returns non-zero exit codes?

alexsilva pushed a commit to alexsilva/supervisor that referenced this issue Jun 22, 2019
* error_status being set in supervisorctl.py raises a SystemExit of that code on exit.
* handle_error wraps all error output and sets error_status if not in interactive mode.
* onecmd is the entry point for single command and where we enforce SystemExit.

(cherry picked from commit a11acd6)
@StefCoders
Copy link

I still have this problem! I don't know how to fix it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests