Add ability to send arbitrary UNIX signals to processes: #477

Merged
merged 11 commits into from Aug 19, 2014

Projects

None yet

9 participants

@msabramo
Contributor
  • Extend the Process class to handle non-killing signals
  • Expose this via the XML-RPC API
  • Add supervisorctl signal command

This is an update of PR #228 and hopefully moves the ball forward on #179, #53, etc.

Cc: @squeed, @weebly, @mnaberez, @sontek

@msabramo Casey Callendrello Add ability to send arbitrary UNIX signals to processes:
* Extend the Process class to handle non-killing signals
* Expose this via the XML-RPC API
9a7b160
@msabramo
Contributor

I just fixed a few Python 3 issues.

@msabramo
Contributor

Here's an example of using the RPC interface from IPython:

❯ ipython
Python 2.7.6 (default, Nov 11 2013, 13:08:06)
Type "copyright", "credits" or "license" for more information.

IPython 2.0.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from xmlrpclib import ServerProxy, Error

In [2]: server = ServerProxy('http://localhost:9002')

In [3]: [m for m in server.system.listMethods() if 'Signal' in m]
Out[3]: ['supervisor.sendGroupSignal', 'supervisor.sendProcessSignal']

In [4]: server.supervisor.sendProcessSignal('dog:4', 1)
Out[4]: True

In [5]: server.supervisor.sendGroupSignal('dog')
Out[5]:
[{'description': 'OK', 'group': 'dog', 'name': '1', 'status': 80},
 {'description': 'OK', 'group': 'dog', 'name': '0', 'status': 80},
 {'description': 'OK', 'group': 'dog', 'name': '3', 'status': 80},
 {'description': 'OK', 'group': 'dog', 'name': '2', 'status': 80},
 {'description': 'OK', 'group': 'dog', 'name': '4', 'status': 80}]
@msabramo
Contributor

I added frontend (supervisorctl) support in 21ddfff. Looks like this:

❯ supervisorctl
cat:0                            RUNNING   pid 57305, uptime 0:00:07
cat:1                            RUNNING   pid 57304, uptime 0:00:07
cat:2                            RUNNING   pid 57307, uptime 0:00:07
cat:3                            RUNNING   pid 57306, uptime 0:00:07
cat:4                            RUNNING   pid 57308, uptime 0:00:07
dog:0                            RUNNING   pid 57300, uptime 0:00:07
dog:1                            RUNNING   pid 57299, uptime 0:00:07
dog:2                            RUNNING   pid 57302, uptime 0:00:07
dog:3                            RUNNING   pid 57301, uptime 0:00:07
dog:4                            RUNNING   pid 57303, uptime 0:00:07
supervisor> help signal
signal <signal name> <name>           Signal a process
signal <signal name> <gname>:*        Signal all processes in a group
signal <signal name> <name> <name>    Signal multiple processes or groups
supervisor> signal 1 dog:3 dog:4
dog:3: signalled
dog:4: signalled
supervisor> signal HUP dog:3 dog:4
dog:3: signalled
dog:4: signalled
supervisor> signal HUP dog:*
dog:1: signalled
dog:0: signalled
dog:3: signalled
dog:2: signalled
dog:4: signalled
supervisor> signal USR1 dog:1 dog:2
dog:1: signalled
dog:2: signalled
@msabramo msabramo Remove extra space around =
for consistency and PEP8
068b946
@sontek
sontek commented Aug 14, 2014

I think a restartsignal configuration option should also be introduced. The reasoning for this would be if you run sudo supervisorctl restart all you might want to send a HUP signal where as when you run sudo supervisorctl stop all you probably want to send TERM

@msabramo
Contributor

from @mcdonc on IRC (#pyramid):

restartsignal i'm +0 on.. i'd never use it myself but it seems to fit peoples' brains
its a little weird because it changes the meaning of "restart", and we dont have prior art for anything
that changes the meaning of a command except for "stopsignal"
i just have a feeling that what people actually want to do is "signal USR2 someprogram"
they'd just rather sort it in their brains as "restart"
i mean i guess its ok
they dont have to use it
it's just for people who need to map things that way in their heads i suppose

@msabramo
Contributor

We have had folks who wanted quick restarts put stopsignal = HUP in their supervisor config, which is of course, a bad idea, because then supervisor doesn't actually stop things.

@mcdonc
Member
mcdonc commented Aug 14, 2014

If the signal people use as "restartsignal" doesnt actually exit the process, they'll be in for a similarly bad time.

@sontek
sontek commented Aug 14, 2014

Yeah, I'm totally convinced that restartsignal is not needed.

@mcdonc
Member
mcdonc commented Aug 14, 2014

Convincing the thousand folks who think they need it in #53 is probably going to be tricky.

@msabramo
Contributor

Well, give 'em the signal command and then see what happens.

@mnaberez
Member

I remember seeing a fork that had another implementation of this same feature. I couldn't find it with GitHub search, so I cloned all the forks and grepped for signal in their rpcinterface.py files. The only ones I came up with were fedosov (which is for a different issue, restartsignal) and this one. I may have missed it or it may have been deleted.

Edit: Found it, see below.

@mnaberez mnaberez and 1 other commented on an outdated diff Aug 14, 2014
supervisor/rpcinterface.py
@@ -464,6 +465,82 @@ def stopAllProcesses(self, wait=True):
killall.rpcinterface = self
return killall # deferred
+
+ def _getSignalFromString(self, name):
@msabramo
msabramo Aug 15, 2014 Contributor

Oh cool! Fixed in 315b30e.

@mnaberez mnaberez and 1 other commented on an outdated diff Aug 14, 2014
supervisor/rpcinterface.py
+
+ sig = self._getSignalFromString(signal)
+
+ if process.get_state() not in RUNNING_STATES:
+ raise RPCError(Faults.NOT_RUNNING)
+
+ msg = process.signal(sig)
+
+ if not msg is None:
+ raise RPCError(Faults.FAILED, msg)
+
+ cb = lambda: True
+ cb.delay = 0
+ cb.rpcinterface = self
+
+ return cb
@mnaberez
mnaberez Aug 14, 2014 Member

It has already called process.signal() above so the work is done. I don't think it needs to return a callback here.

@msabramo
msabramo Aug 15, 2014 Contributor

Fixed in 873c8d9.

@mnaberez mnaberez and 1 other commented on an outdated diff Aug 14, 2014
supervisor/rpcinterface.py
+ """ Send an arbitrary UNIX signal to the process named by name
+
+ @param string name The name of the process to signal (or 'group:name')
+ @param int signal the integer UNIX signal to send. SIGHUP by default.
+ @return boolean result
+ """
+
+ self._update('sendProcessSignal')
+
+ group, process = self._getGroupAndProcess(name)
+
+ if process is None:
+ group_name, process_name = split_namespec(name)
+ return self.sendGroupSignal(group_name, signal=signal)
+
+ sig = self._getSignalFromString(signal)
@mnaberez
mnaberez Aug 14, 2014 Member

I'm not sure what happens here if the signal name is bad but I think we should check it and bail out with a BAD_SIGNAL fault or similar if lookup fails.

@msabramo
msabramo Aug 15, 2014 Contributor
supervisor> signal FOO dog:1 dog:2
error: <type 'exceptions.ValueError'>, Unknown result code 2 for dog:1: file: /Users/marca/dev/git-repos/supervisor/.tox/py27/lib/python2.7/site-packages/supervisor/supervisorctl.py line: 755
@msabramo
msabramo Aug 15, 2014 Contributor

I added error checking in 9abc7ca -- now it looks like this:

supervisor> signal FOO dog:1 dog:2
dog:1: ERROR (BAD_SIGNAL: FOO)
dog:2: ERROR (BAD_SIGNAL: FOO)
@mnaberez mnaberez commented on the diff Aug 14, 2014
supervisor/rpcinterface.py
+ @param int signal The signal to be sent. SIGHUP by default
+ @return array result
+ """
+
+ self._update('sendGroupSignal')
+
+ group = self.supervisord.process_groups.get(name)
+
+ if group is None:
+ raise RPCError(Faults.BAD_NAME, name)
+
+ processes = list(group.processes.values())
+ processes.sort()
+ processes = [(group, process) for process in processes]
+
+ sendall = make_allfunc(processes, isRunning, self.sendProcessSignal,
@mnaberez
mnaberez Aug 14, 2014 Member

Does this need to be done async?

@mnaberez mnaberez commented on the diff Aug 14, 2014
supervisor/process.py
@@ -432,6 +432,45 @@ def kill(self, sig):
return None
+ def signal(self, sig):
@mnaberez
mnaberez Aug 14, 2014 Member

Much of this code is copied from kill above it. We should probably refactor to remove the duplication.

@mnaberez
Member

Found that other implementation:
https://github.com/moriyoshi/supervisor/tree/feature/moriyoshi/add-signal-command

This one should be reviewed as well. One thing I noticed is that it has tests for the new signal command in supervisorctl, which this patch does not. We should at least pull those tests in.

@mcdonc
Member
mcdonc commented Aug 18, 2014
  • sendXSignal XMLRPC API methods should probably be named signalX (e.g.
    sendProcessSignal -> signalProcess).
  • As Mike mentioned in his inline notes, I suspect sendGroupSignal doesn't need
    to do anything async. Unlike stopping processes, there's nothing to wait for,
    and sending even a thousand signals wouldn't block the mainloop for any
    considerable time.
  • We might consider adding a signalAllProcesses API, if only for foolish consistency with
    stopAllProcesses.
  • We should probably merge the _stopresult and _signalresult code in
    supervisorctl.
  • I wonder if we need a "signalasgroup" configuration option, like killasgroup. I don't think
    we do, but it does mean that signal HUP someprocess won't be (and can't be made)
    completely equivalent to stop someprocess, which is fine by me, but probably needs
    to be called out in the docs.
  • We should review the supervisorctl tests in
    https://github.com/moriyoshi/supervisor/tree/feature/moriyoshi/add-signal-command
    and adapt them to this implementation.
@mcdonc mcdonc merged commit e34d41c into Supervisor:master Aug 19, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details
@mcdonc mcdonc added a commit that referenced this pull request Aug 19, 2014
@mcdonc mcdonc as per comment attached to #477, rename sendXSignal to signalX, dont …
…do anything async in signalGroup, merge logic for _stopresult and _signalresult in supervisorctl
09fd3e4
@mcdonc
Member
mcdonc commented Aug 19, 2014

Merged to master. I took care of most of the stuff in my comment, delta mining the moriyoshi code for supervisorctl tests.

@adnam
adnam commented Nov 12, 2014

My feeling on this is that a "restart" signal should completely stop and then start, just as it does now, so I agree with @sontek that restartsignal is not needed.

I think the supporters of #53 would be happy to have a "graceful" command which can be provided by specifying a "graceful_signal". For gunicorn this would be "HUP" and Apache2 this would be "SIGUSR1". If the command "graceful-restart" is issued for a service for which "gracefulrestartsignal" is not defined, it simply does nothing and issues a warning. Similarly, "graceful-stop" can be provided the same way.

The only remaining problem would be restarting groups of services where the desired behaviour would be:

  • If the service has a graceful signal defined, use it.
  • Otherwise restart "normally" via stop/start

So I would add a "--force" option to the graceful command which falls back to stop/start if no graceful signal is defined.

@IvanAnishchuk

it simply does nothing and issues a warning
So I would add a "--force" option to the graceful command which falls back to stop/start if no graceful signal is defined.

I believe that should be the default behavior with option to do nothing if there's no graceful signal defined (--calm?). And warning could be issued while doing the job, I don't think anybody would go and edit the config when they need to restart something and --force option will only increase the command line required for the most needed function (and re-sending rate, sometimes causing unneeded consecutive restart).

@lonetwin

Just curious what version of supervisor are these changes set to be released in ? I am looking to just use the signal command and have no preference either ways about also having a 'graceful restart' config option.

@mnaberez
Member

The signal command from this pull request will be included in the next release, 3.2.

@jgadling

Any ETA on the 3.2 release?

@mcdonc mcdonc added a commit that referenced this pull request Aug 18, 2015
@mcdonc @mnaberez mcdonc + mnaberez as per comment attached to #477, rename sendXSignal to signalX, dont …
…do anything async in signalGroup, merge logic for _stopresult and _signalresult in supervisorctl
813787f
@astlock
astlock commented Jan 14, 2016

Thank you for the signal implementation! I found some wrong behaviour with signals and with normal restarts too:

[koshevoy@koshevoy-staging.dal05 ~]$ supervisorctl signal sigusr2 app-api
app-api: ERROR (no such process)
[koshevoy@koshevoy-staging.dal05 ~]$ echo $?
0
[koshevoy@koshevoy-staging.dal05 ~]$ supervisorctl restart elf
elf: ERROR (no such process)
elf: ERROR (no such process)
[koshevoy@koshevoy-staging.dal05 ~]$ echo $?
0

If the restart or signaling failed for any reason, there should be exit code 1. Am I right?

@mnaberez
Member

If the restart or signaling failed for any reason, there should be exit code 1. Am I right?

The exit status always being 0 is not limited to signals. It is tracked as #24 and there is now a pull request to change this in #688 that should be merged for a future release.

@astlock
astlock commented Jan 14, 2016

I missed this issue. Thank you for the explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment