Prevent init script from returning when the service isn't actually started #6909

Closed
wants to merge 1 commit into from

5 participants
@sbraz
commented Jul 17, 2014

Hello,
Since the init script uses start-stop-daemon's -b option, it might return from start before the service is actually started. If the init script's status action is called immediately after that, it will report "elasticsearch is not running" and exit with a non-zero code.
This causes systems like Heartbeat to incorrectly start multiple instances of Elasticsearch at once.
This pull request aims to fix this issue by waiting until the Elasticsearch process is started.
There might be a nicer way to achieve this, but this is what came to mind.
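
For readers following along, the waiting approach described above could be sketched roughly as follows. This is not the code from the pull request: the function name wait_for_pid and its arguments are hypothetical, and the sketch assumes the service writes its PID to a file once it is up.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not the PR's actual code).
# Waits until the PID stored in a PID file refers to a live process.
# Usage: wait_for_pid PID_FILE TIMEOUT_SECONDS
wait_for_pid() {
    pid_file=$1
    timeout=$2
    i=0
    while [ "$i" -lt "$timeout" ]; do
        # The PID file may not exist yet, or may still be empty.
        # kill -0 sends no signal; it only checks that the process exists.
        if [ -s "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    # Timed out: the caller can report a failed start.
    return 1
}
```

An init script's start action could call such a helper right after start-stop-daemon returns and propagate its exit status, so that a status check made immediately afterwards sees a running process.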

@clintongormley

Member

commented Oct 20, 2014

@spinscale could you please take a look?

@electrical

Contributor

commented Oct 20, 2014

With all the tests I do with the packages and init scripts, I have never seen this sort of race condition happen.

i=0
timeout=10
# Wait for the process to be properly started before exiting
until { cat "$PID_FILE" | xargs kill -0; } >/dev/null 2>&1

@clintongormley

clintongormley Oct 21, 2014

Member

This line seems weird to me. You're sending a kill -0 signal to check if a process with the PID exists, but the PID_FILE will only exist if the process has started because it is that process which writes the PID_FILE... What are you trying to achieve here?

@sbraz

sbraz Oct 21, 2014

Author

If the PID file doesn't exist, xargs kill returns 123 so the loop keeps going.
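
A small sketch of that behaviour, assuming GNU xargs (which runs the command once even on empty input; other implementations may differ). The function name check_pid_file and the temporary path are hypothetical and only used for illustration.

```shell
#!/bin/sh
# Mirrors the check from the diff: succeeds only if the PID file names a
# live process. check_pid_file is a hypothetical name for illustration.
check_pid_file() {
    { cat "$1" | xargs kill -0; } >/dev/null 2>&1
}

# A path that should not exist (assumption: nothing else created it).
missing="/tmp/no-such-pid-file.$$"

# With no PID file, cat prints nothing, so kill -0 runs without a PID and
# fails; xargs then reports a non-zero status and the until loop keeps going.
status=0
check_pid_file "$missing" || status=$?
echo "missing PID file: exit status $status"
```

Because the whole pipeline is wrapped in one redirection, the error messages from cat and kill stay hidden while the loop polls.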

@clintongormley

Member

commented Oct 21, 2014

@sbraz Have you seen this problem occur, or is this a theoretical concern?

@sbraz

Author

commented Oct 21, 2014

I have seen it occur when starting Elasticsearch with Pacemaker: it would sometimes start ES, check its status, and find it stopped.

jpountz added a commit that referenced this pull request Dec 5, 2014

@jpountz jpountz removed the feedback_needed label Dec 5, 2014

@jpountz jpountz closed this in 6c2abcc Dec 5, 2014

@jpountz

Contributor

commented Dec 5, 2014

Sorry for the delay, but I just merged this PR. Thanks!
