Supervisor doesn't restart server process if it couldn't listen on a port #1072
Comments
I don't think repeated restarts help in this situation. If another daemon process is already listening on the port specified in the configuration file (e.g. a syslog daemon vs. in_syslog), repeated restarts keep the supervisor running, so users might not notice that Fluentd wasn't launched correctly. @frsyuki the problem you meant is about
Time to implement this feature? > fluent/fluent-plugin-multiprocess#3
@tagomoris one process does start successfully.
@frsyuki Yes. But taken as a whole, the configuration is still broken in that case.
I don't think Fluentd core can accept this proposal. It breaks behavioral compatibility and goes against users' expectations. IMO, fluent-plugin-multiprocess should implement it.
Overview
The supervisor doesn't restart the server process if it couldn't start within 1 second, and it logs this message:
This also happens when the server fails to start due to an "Address already in use" error because the port is temporarily held by another process by accident (e.g. a leftover fluentd process that is about to exit).
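To illustrate the failure mode (this is not Fluentd code), here is a minimal Python sketch of a listen attempt that distinguishes "port busy" from other errors. The helper name `try_listen` and the loopback host are assumptions made for the example.

```python
import errno
import socket


def try_listen(port, host="127.0.0.1"):
    """Return a listening socket, or None if the port is already in use.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError as exc:
        sock.close()
        # "Address already in use" is the case described in this issue.
        if exc.errno == errno.EADDRINUSE:
            return None
        # Any other error (e.g. permission denied) is not a port conflict.
        raise
```

Once the process holding the port exits, the same call succeeds, which is why a later retry can recover without any configuration change.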
Especially in production environments where td-agent is launched through the /etc/init.d/td-agent script, or where the process is started by the in_multiprocess plugin, it would be helpful if the supervisor kept restarting the server process (with some sleep in between, possibly exponential) until it succeeds. Repeated error logs are OK because this is a genuinely problematic situation that should trigger an alert.
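The proposed behavior could look roughly like the following Python sketch. It is not the Fluentd supervisor; `start_with_backoff`, its parameters, and the default delays are all assumptions chosen for illustration.

```python
import errno
import time


def start_with_backoff(start, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call start() until it succeeds, doubling the sleep between attempts.

    Hypothetical helper: `start` stands in for whatever launches the
    server process and raises OSError when the port is busy.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return start()
        except OSError as exc:
            # Retry only on "Address already in use"; give up on the last
            # attempt or on any other error.
            if exc.errno != errno.EADDRINUSE or attempt == max_attempts:
                raise
            # Log every failure so the problem stays visible to operators.
            print(f"attempt {attempt} failed: {exc}; retrying in {delay}s")
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

Capping the delay with `max_delay` keeps the retry interval bounded, and logging on every failed attempt keeps the misconfiguration visible instead of silently looping.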
Problem
The actual problem happened with the in_multiprocess plugin.
A minimal config was as follows:
child1.conf:
child2.conf:
Local port 24225 was in use by another fluentd process that was running by accident, so child2 didn't start.
Environment