Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ES_STOP_TIMEOUT configuration variable to packages #3973

Conversation

spinscale
Copy link
Contributor

This variable allows to configure the waiting time after a TERM signal has
been sent. After that waiting time a KILL signal is sent to ensure the
service is stopped.

In case of a bigger installation the default values might be to slow, so it
now is configurable.

Original work done by @tmclaugh at the PR #3719

Closes #3719
Closes #3972

@s1monw
Copy link
Contributor

s1monw commented Aug 21, 2014

any reason this has not been resolved yet? @spinscale ?

@spinscale
Copy link
Contributor Author

@s1monw I think we need to fix the timeouts, as they are too low (20 seconds for TERM, 40 for KILL or something) and I would like to have someone with packaging knowledge review this one, if it is useful at all.

@drewr @electrical can you have a look maybe, if this is good & useful?

Going to extend the timeouts asap

@spinscale spinscale force-pushed the packaging-configurable-shutdown-delay branch 2 times, most recently from de4f259 to 720ceae Compare August 22, 2014 07:57
@electrical
Copy link
Contributor

Looks good to me.
I'm a bit scared by the kill -9 part though.
I believe that still should be a manual action instead of automated.
If it fails to stop normally there would be something wrong with the process.

Having the stop time configured defo makes sense.

@spinscale
Copy link
Contributor Author

@electrical The more I think about it the more I agree that the only one to execute kill -9 should be the sysadmin and not an init script. If a process does not finish after TERM, manual intervention should be required.

any other/different opinions on that? @drew @s1monw ?

@rjernst
Copy link
Member

rjernst commented Aug 27, 2014

+1 to not do a kill -9 from a script...

@clintongormley
Copy link

+1, especially in light of #7563 which will perform a flush on shutdown

This variable allows to configure the waiting time after a TERM signal has
been sent. After that waiting time a KILL signal is sent to ensure the
service is stopped.

In case of a bigger installation the default values might be to slow, so it
now is configurable.

Also, we opted out from sending a KILL signal by default. This has to be
explicitely done by using the ES_KILL_TIMEOUT variable.

Original work done by @tmclaugh at the PR elastic#3719

Closes elastic#3719
Closes elastic#3972
@spinscale spinscale force-pushed the packaging-configurable-shutdown-delay branch from 720ceae to 9b736f2 Compare September 8, 2014 11:48
@spinscale
Copy link
Contributor Author

removed all the KILL relevant parts... note that by default, redhats killproc function still sends a KILL signal after the timeout passed and the process is still there...

should this explicitely be disabled (which is possible), to make sure an init mechanism never sends a KILL signal, even though it is part of the standard shutdown mechanism in init scripts?

@spinscale
Copy link
Contributor Author

@electrical can you have a look and comment on my last question, if it makes sense to change the killproc default behaviour in your opinion?

@electrical
Copy link
Contributor

On the one hand we should not deviate from standards handled by a service manager.
On the other hand with systems like ES I'm very afraid of SIGTERM's to it.

Imho when a stop fails there must be something wrong with it, for example shutdown takes longer or some other weird error.
It would be safer for ES to never execute a Kill / SIGTERM and let the sysadmin do that manually.

@clintongormley clintongormley added the :Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts label Nov 11, 2014
@t-lo
Copy link

t-lo commented Dec 3, 2014

It is worth noting that the systemd service configuration does use a kill mechanism (and a timeout of 20 seconds) already.

@s1monw
Copy link
Contributor

s1monw commented Dec 5, 2014

we decided to not allow for explicit configuration of this in our packages. If other 3rd party tool do that already that's a different story. If there is further evidence that we need to do that too we can still reopen the issue.

@s1monw s1monw closed this Dec 5, 2014
@t-lo
Copy link

t-lo commented Dec 5, 2014

@s1monw this is partially right. As I wrote above - the elasticsearch systemd configuration uses an explicit kill timeout of 20 seconds (see https://github.com/elasticsearch/elasticsearch/blob/master/src/rpm/systemd/elasticsearch.service#L17), I suppose this is a bug, then?

@s1monw
Copy link
Contributor

s1monw commented Dec 5, 2014

suppose this is a bug, then?

I don't think so - I'd rather say it's maybe an inconsistency but we can fix it though. I don't have strong feelings about it to be honest I'd vote fore removing the kill there too?

@t-lo
Copy link

t-lo commented Dec 5, 2014

+1 for being consistent - users would expect the same behaviour on different init systems.

@t-lo
Copy link

t-lo commented Dec 5, 2014

Please note that there's a pull request for adding a systemd service file to the .deb package (since Debian will switch to systemd soon) which still has the kill timeout: #8765
@s1monw ping me if you want the timeout removed and I'll update that pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts Team:Delivery Meta label for Delivery team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make shutdown wait time configurable for packages
7 participants