New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Waiting for termination after ExecStop #13284
Comments
The documented assumption is that any command configured in ExecStop= is synchronous and complete, i.e. more than an asynchronous, friendly request to shutdown. i.e. so that any processes left after ExecStop= is left-over stuff that can be killed immediately. This assumption builds on the fact that sysvinit required such synchronous stop behaviour too to work correctly, and ExecStop= is mostly about mimicking compat with that. Modern services should not need ExecStop= anyway: if they react correctly to SIGTERM (and exit cleanly) they don't need to define ExecStop= at all, and can set KillMode=mixed and thus get an asynchronous SIGTERM to the main PID to which they then can react asynchronously to. So, summary: any service that implements an asynchronous ExecStop= is not in line with either systemd's nor sysvinit#s expected behaviour. I'd really prefer if the service would be fixed to follow this logic (and ideally not require any ExecStop= command at all, just by handling SIGTERM correctly, as mentioned). |
The service in question is written in Java, which doesn't really lend itself well to signal handling: You can register shutdown hooks, but they tend to be rather fragile and while they work reliably in single-threaded applications, they can cause problems in multithreaded applications like this one. If this is too much of a niche case to be handled in systemd I understand, and will simply stick to the 5 second sleep as a second ExecStop= command. This works reliably as the service really doesn't need long to exit, it's just not quite immediate after the command exits. In any case, while this is documented in the ExecStop= section, it might be a good idea to add a note to TimeoutStopSec= that this value explicitly does not wait between ExecStop= and KillSignal=. It surprised me and might surprise other readers of the documentation as well. |
This is unrealistic. There is absolutely no sane way Anything that |
Do IPC, talk to the service, and wait until it reports back that it is almost finish going down now. Then wait for the EOF on the IPC channel and assume the thing is done with that. But quite frankly: don't use ExecStop= at all, just handle SIGTERM properly. There must be a way to do this with Java.
Well, my assumption is that new-style stuff would just handle SIGTERM properly, and old style stuff already has some hacky solution in place. ExecStop= for service shutdown is a legacy concept if you so will, you don't need it for clean services that properly handle SIGTERM. |
Another use case for this is nginx. To gracefully stops nginx, you need to send SIGSTOP, wait a bit, send SIGTERM if still needed, wait a bit, send SIGKILL if really needed. In Debian, this is implemented using ExecStop=-/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid
TimeoutStopSec=5
KillMode=mixed It would be nice to not have to rely on Debian-specific tools. With the proposed change, we could implement the QUIT/5/TERM/5/KILL with the following snippet: ExecStop=/usr/bin/kill -QUIT $MAINPID
TimeoutStopSec=5 nginx cannot conflates the two signals into one as they have different purposes:
An alternative configuration could be: KillSignal=QUIT
TimeoutSec=30s But if a connection is lingering for 30s, logs may not be flushed correctly due to nginx receiving SIGKILL and not SIGTERM. |
This is exactly what happened to me, I wrote a service that should cleanly shutdown a program within a screen session. For this it is necessary to send a specific text command to the program within the screen session. The ExecStop command was executed but immediately afterwards the SIGTERM ensured that the screen was closed before the program could be ended properly. At this point the documentation of TimeoutStopSecis is really a bit misleading and should be clarified. It took me some time to realize that there is no option for a timeout between ExecStop and the KillSignal. Therefore I had the choice to put a SIGTERM wrapper around the screen or just tinker with the behavior of systemd. Yes, in a perfect world you would simply adapt any software so that it reacts cleanly to a SIGTERM signal and would avoid yourself all the problems, but we don't live in this perfect world and have to get along with what we have, at least for the time being. |
While I agree that it would be ideal if everything handled
It's ugly and terrible, and ideally it wouldn't be necessary, but it gets the job done. |
That's definitely ugly, but a nice trick to know... I had never heard of kill -0. thx |
The following workaround works just fine for ugly Java app I have to work with:
I think this ticket can be safely closed - it's really not systemd responsibility to cater for broken Java apps. |
I would also like to see better tools here. While I agree that it would be nice if all services handled I think the default behaviour is quite reasonable, but it would nice to see an option to change it. For me the best solution would be the following:
Basically since Yes, this can be implemented by the user by adding a polling loop at the end of their script. But this is tedious, inefficient and probably breaks with PID reuse. For now I will be using zabbal's workaround which is pretty close to the desired behaviour. But it would be nice to be able to still maintain a |
Is your feature request related to a problem? Please describe.
I have a service which can only be terminated by sending a command to the service via an external tool. That external tool sends a "shutdown" command to the service. The service then reports back to the tool "I am now shutting down", which the tool interprets as success. At this point, the tool terminates, and the ExecStop= command terminates successfully.
At this point, the service is still in the process of shutting down, but systemd immediately sends SIGTERM (or whatever is specified in KillSignal=), and the process is killed abruptly.
Describe the solution you'd like
I would like systemd to honor TimeoutStopSec= in this case. TimeoutStopSec is honored before sending FinalKillSignal later, but it does not seem to be honored in this case.
Describe alternatives you've considered
Currently the unit file simply contains a second ExecStop=sleep 5 line, which is enough time for the service to actually terminate. This solution is obviously ugly.
Another solution would be the introduction of another Timeout-like variable that exists precisely for this use case, in order not to change behaviour in existing units.
I realize that documentation states that "The specified command should hence be a synchronous operation, not an asynchronous one.". However, at least to someone not deeply familiar with systemd internals; this behaviour was very surprising, I would completely expect TimeoutStopSec to also apply to the gap between ExecStop= and KillSignal=.
The text was updated successfully, but these errors were encountered: