service logstash restart does not take KILL_ON_STOP_TIMEOUT into account #4991

Closed
purbon opened this issue Apr 4, 2016 · 12 comments

@purbon
Contributor

purbon commented Apr 4, 2016

Tested with the Ubuntu 14.04 package: when calling service logstash restart, the script does not take the KILL_ON_STOP_TIMEOUT variable into account, leaving the former LS instance alive.

see:

root@vagrant-ubuntu-trusty-64:/etc/init.d# echo $KILL_ON_STOP_TIMEOUT
1
root@vagrant-ubuntu-trusty-64:/etc/init.d# /etc/init.d/logstash restart
Killing logstash (pid 13256) with SIGTERM
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
Waiting logstash (pid 13256) to die...
logstash stop failed; still running.
logstash started.

This might happen when LS resists stopping, for example when there is a stalled output.

The expected workaround is to call stop and then start separately when relying on KILL_ON_STOP_TIMEOUT.

@cgreatorexatonshape
Contributor

+1 with additional information, cross-post from the forums...

When logstash doesn't stop in a timely manner, the init script hits a bug while trying to handle the timeout, and when called with restart it ends up starting multiple copies.

sudo service logstash restart
Killing logstash (pid 23698) with SIGTERM
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
Waiting logstash (pid 23698) to die...
/etc/init.d/logstash: 99: [: -eq: unexpected operator
logstash stop failed; still running.
logstash started.

Seen in version 2.3.1 on Ubuntu 14.04 LTS.

The relevant line in the code is...

if [ $KILL_ON_STOP_TIMEOUT -eq 1 ] ; then
...and that variable doesn't exist anywhere in a default configuration, hence the init script bugs out.
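
A minimal sketch of a guard that avoids this failure, assuming the variable may be unset whenever the defaults file omits it. With the variable unset, the unquoted test expands to [ -eq 1 ], which is what produces the "unexpected operator" message in the output above. The function name should_force_kill is hypothetical and is not taken from the actual init script.

```shell
#!/bin/sh
# Hypothetical helper, not the real init script code.
# Succeeds only when KILL_ON_STOP_TIMEOUT is set to 1.
should_force_kill() {
  # ${VAR:-0} substitutes 0 when the variable is unset or empty,
  # and the quotes keep the test well-formed in every case.
  [ "${KILL_ON_STOP_TIMEOUT:-0}" -eq 1 ]
}

if should_force_kill; then
  echo "timeout reached: would send SIGKILL"
else
  echo "timeout reached: would leave the process running"
fi
```

Placing the fallback inside the script means the check behaves the same whether or not the defaults file defines the variable.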

Additionally, after reviewing the code further: when the else clause of the line with the bug fires, the script should not start another copy. The stop function should return an error so that "stop && start" does not run the start.
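
The "stop && start" behaviour can be sketched as follows. This is only an illustration under stated assumptions: the functions are hypothetical stand-ins for the init script's own stop and start, and the STILL_RUNNING variable is invented here to simulate a process that survives SIGTERM.

```shell
#!/bin/sh
# Hypothetical stand-ins; the real functions signal and poll the process.
# STILL_RUNNING is an assumption for this sketch: 1 means the process
# survived the stop attempt, 0 means it exited.

stop() {
  if [ "${STILL_RUNNING:-1}" -eq 1 ]; then
    echo "logstash stop failed; still running."
    return 1   # non-zero status tells the caller the stop failed
  fi
  echo "logstash stopped."
  return 0
}

start() {
  echo "logstash started."
}

restart() {
  # start runs only when stop reports success
  stop && start
}
```

With STILL_RUNNING=1, calling restart prints only the failure message and returns a non-zero status, so no second copy is launched.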

@KnightOfNight
Contributor

I checked out the source code in anticipation of submitting a patch for this, and I found that 'KILL_ON_STOP_TIMEOUT' does exist in the stock defaults file that ships with logstash. The problem is that when puppet installs the defaults file, the variable doesn't get included. A default should exist in the script itself, so that if the variable is not present in the defaults file the script doesn't bug out trying to check the value. My soon-to-be-released patch will reflect this. I also renamed the variable to start with LS to match the other variables.

@suyograo
Contributor

@KnightOfNight thanks much! I would love to include your patch in 2.3.2 which will happen next week.

suyograo pushed a commit to suyograo/logstash that referenced this issue Apr 23, 2016
…dded code to the stop function to handle inability to stop when not forcing

Fixes elastic#4991
suyograo pushed a commit that referenced this issue Apr 23, 2016
…dded code to the stop function to handle inability to stop when not forcing

Fixes #4991, #5168
ph pushed a commit to ph/logstash that referenced this issue Apr 26, 2016
…dded code to the stop function to handle inability to stop when not forcing

Fixes elastic#4991, elastic#5168
@Slach

Slach commented Apr 26, 2016

Currently this still does not work on Ubuntu 14.04, because init.d/logstash has #!/bin/sh as its first line.

@jordansissel
Contributor

@Slach I'm not sure I understand. Why is having #!/bin/sh as the first line a problem?

@Slach

Slach commented Apr 26, 2016

http://stackoverflow.com/questions/3411048/unexpected-operator-in-shell-programming

When I run
service logstash restart
I get the following error:

Killing logstash (pid 20736) with SIGTERM
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
Waiting logstash (pid 20736) to die...
/etc/init.d/logstash: 100: [: 1: unexpected operator
logstash stop failed; still running.

@jordansissel
Contributor

There's a bug in the patch: it should use either = or -eq, not == (double equals).

@jordansissel
Contributor

@Slach I have stackoverflow blackholed, so I can't read that link. Reviewing the patch, I see a bug. The Bourne shell equality operator is a single equals sign, not a double one.

@Slach

Slach commented Apr 26, 2016

Yep, the bug is on line 100; it needs =.

@KnightOfNight
Contributor

Bash accepts either = or ==, but /bin/sh accepts only =, and I failed to notice the shebang at the top of the script.

I'll fix and PR again.
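
For reference, a small sketch of the two portable comparison forms mentioned in this thread. The function names are hypothetical and this is not the patched init script; it only illustrates that a single = (string comparison) and -eq (integer comparison) both work under a plain /bin/sh such as dash, whereas == inside [ ] is not portable.

```shell
#!/bin/sh
# Hypothetical helpers illustrating portable tests.

# String comparison: a single = is accepted by every POSIX shell.
is_enabled() {
  [ "$1" = "1" ]
}

# Integer comparison: -eq is equally portable. The redirect hides the
# shell's complaint when the argument is not a number.
is_enabled_numeric() {
  [ "$1" -eq 1 ] 2>/dev/null
}

if is_enabled "1" && is_enabled_numeric "1"; then
  echo "both portable forms accept the value 1"
fi
```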

@jordansissel
Contributor

@KnightOfNight thank you much :)

@purbon
Contributor Author

purbon commented Jun 10, 2016

This error has resurfaced somehow; working on a proper issue for this.

@purbon purbon self-assigned this Jun 10, 2016
@purbon purbon closed this as completed Jun 10, 2016

6 participants