rabbitmqctl: "wait" could wait forever if rabbitmq-server fails to create a pidfile #463
Comments
Is there an approximate timeline to fix this?
Not yet. We have more critical fixes to do first and we are busy with the 3.6.0 release at the moment.
Could this make it into the 3.6.0 release? I've hit this quite a bit and it's rather annoying to have to Ctrl-C and kill the process.
Hi @jmoney8080! No, unfortunately, it won't make it into 3.6.0. It requires non-trivial changes and we are too far into the release cycle.
It possibly can make it into
Thanks! If you need an external source to test an RC with it, I can help. This issue causes a lot of annoyance when our automation runs (not all the time, but when it does cause issues it's annoying to fix).
@jmoney8080 we'll keep you posted ;)
To distinguish a node which hasn't completed start-up yet from a node which failed to create its PID file, we could consider the node to have failed after retrying or timing out. Specifying a timeout on the command line would be a friendly way to give the user control. The danger is that the broker may complete start-up after we consider it to have failed.
@dumbbell & @michaelklishin: To clarify what I meant above: if the script doesn't get a response from the server (within a given time) saying that it has started successfully, then the script could exit with status code 1, otherwise 0.
Just a random thought while working on something unrelated: we could start an ephemeral Erlang node before starting RabbitMQ itself and use
We will investigate if there are reasonably safe ways to fix this in
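The timeout-and-exit-status idea from the two comments above can be sketched roughly as follows. This is only an illustration in Python, not how rabbitmqctl is implemented; the function name wait_for_pidfile, the default timeout, and the command-line shape are all hypothetical. It also keeps the caveat mentioned above: a node that finishes starting after the deadline is still reported as failed.

```python
import sys
import time


def wait_for_pidfile(path, timeout=10.0, interval=0.1):
    """Poll `path` until it holds a PID; return the PID, or None on timeout.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path) as f:
                text = f.read().strip()
            if text.isdigit():
                return int(text)
        except FileNotFoundError:
            pass  # pidfile not created (yet); keep polling until the deadline
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


def main(argv):
    # Hypothetical usage: wait_pidfile.py PIDFILE [TIMEOUT_SECONDS]
    path = argv[1]
    timeout = float(argv[2]) if len(argv) > 2 else 10.0
    # Exit status 0 if the pidfile appeared in time, 1 otherwise.
    return 0 if wait_for_pidfile(path, timeout) is not None else 1


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

An init script could then treat a non-zero exit status as a start-up failure instead of blocking forever.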
- Use socket-activated epmd - that way there won't be any trouble when more than one erlang system is used within a single host.
- Use new automation-friendly configuration file format
- Use systemd notifications instead of buggy 'rabbitmqctl wait' for confirming successful server startup. 'wait' bug: rabbitmq/rabbitmq-server#463
- Use 'rabbitmqctl shutdown' instead of 'stop', because it's not pid-file based
- Use sane systemd unit defaults from RabbitMQ repo: https://github.com/rabbitmq/rabbitmq-server/blob/master/docs/rabbitmq-server.service.example
- Support for external plugins
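For readers unfamiliar with the systemd notifications mentioned in the commit message above, here is a minimal sketch of the readiness protocol. It is written in Python purely for illustration and is not how RabbitMQ itself sends the notification; the helper name sd_notify is an assumption. It assumes the service unit uses Type=notify, in which case systemd sets NOTIFY_SOCKET and waits for a READY=1 datagram rather than polling a pidfile.

```python
import os
import socket


def sd_notify(state="READY=1"):
    """Send `state` to the socket named by NOTIFY_SOCKET; return True if sent.

    Hypothetical helper for illustration only.
    """
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False  # not started by a notify-aware service manager
    if addr.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True
```

With this approach the service manager learns about successful start-up directly from the service, and its own start timeout covers the case where the service never becomes ready.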
IMHO this is fixed by 69454027 and can be closed, thanks! EDIT: actually it seems like
The current implementation of rabbitmqctl wait can't distinguish between a node that has not started yet and a node that failed to create its pidfile. In both cases, rabbitmqctl wait will wait indefinitely. In the latter case (e.g. when used in an init script), one would expect the command to return with a non-zero exit status so that the init script can report the startup failure.