Supervisor doesn't restart server process if it couldn't listen on a port #1072

Closed
frsyuki opened this issue Jul 1, 2016 · 5 comments

@frsyuki
Member

frsyuki commented Jul 1, 2016

Overview

The supervisor doesn't restart the server process if it dies within 1 second of starting. Instead, the supervisor exits with this message:

2016-07-01 10:39:53 -0700 [warn]: process died within 1 second. exit.

This also happens when the server fails to start because of an "Address already in use" error, even if the port is only temporarily held by another process by accident (e.g. a leftover fluentd process that is about to exit).

Especially in production environments that run td-agent through the /etc/init.d/td-agent script, or when the process is started by the in_multiprocess plugin, it would be helpful if the supervisor kept restarting the server process (with some, possibly exponential, sleep in between) until it succeeds. Repeated error logs are OK because this is a genuinely problematic situation that should raise an alert.
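
As a rough illustration of the proposed behavior (not Fluentd's actual supervisor code), a restart loop with exponential sleep could look like the sketch below. The function name `supervise` and all of its parameters and defaults are hypothetical, and the sketch stops supervising once the child has stayed up for `min_uptime` seconds, which a real supervisor would not do.

```python
import subprocess
import sys
import time


def supervise(cmd, initial_delay=1.0, max_delay=60.0, min_uptime=1.0,
              max_attempts=None, sleep=time.sleep):
    """Relaunch `cmd` until it survives for at least `min_uptime` seconds.

    Returns (attempts, process). `process` is None if `max_attempts` was
    exhausted. Names and defaults are hypothetical, not Fluentd settings.
    """
    delay = initial_delay
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        started = time.monotonic()
        proc = subprocess.Popen(cmd)
        try:
            proc.wait(timeout=min_uptime)
        except subprocess.TimeoutExpired:
            # Still running after min_uptime: treat the start as successful.
            return attempts, proc
        uptime = time.monotonic() - started
        print(f"process died after {uptime:.2f}s; retrying in {delay:.1f}s",
              file=sys.stderr)
        sleep(delay)
        # Double the wait after every failure, up to max_delay.
        delay = min(delay * 2, max_delay)
    return attempts, None
```

With a cap such as `max_delay`, the error log keeps repeating at a bounded interval, so the failure stays visible to alerting without flooding the log.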

Problem

The actual problem occurred with the in_multiprocess plugin.
A minimal config was as follows:

<source>
  @type multiprocess
  <process>
    cmdline -c child1.conf
  </process>
  <process>
    cmdline -c child2.conf
  </process>
</source>

child1.conf:

<source>
  @type forward
  port 24224
</source>

child2.conf:

<source>
  @type forward
  port 24225  # this local port was used by someone temporarily
</source>

Local port 24225 was in use by another fluentd process that was running by accident, so child2 didn't start.
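
For readers unfamiliar with the failure mode, here is a minimal sketch in Python (not Fluentd code) of what "Address already in use" means at the socket level: a second bind to a TCP port that another socket is already listening on fails with `EADDRINUSE`, and succeeds again once the first holder goes away. The helper name `try_listen` is hypothetical.

```python
import errno
import socket


def try_listen(port, host="127.0.0.1"):
    """Return a listening socket, or None if the port is already in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            # Another process (or socket) currently holds this port.
            return None
        raise
```

Because the condition clears as soon as the other process exits, a later attempt on the same port can succeed without any configuration change.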

Environment

  • td-agent 2.3.1
  • Ubuntu 14.04

@tagomoris
Member

I don't think repeated restarts help in this situation. If another daemon process is listening on the port specified in the configuration file (e.g. a syslog daemon vs. in_syslog), repeated restarts keep the supervisor running, and users might not notice that Fluentd wasn't launched correctly.

@frsyuki the problem you describe is specific to the in_multiprocess plugin. It should fail to launch if any child process fails to start.
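
A minimal sketch of that all-or-nothing launch, in Python rather than in the plugin's own code: start every child, watch them for a short grace period, and if any child has exited, stop the rest and report failure. The function name `launch_all_or_fail` and the `grace` parameter are hypothetical and are not part of fluent-plugin-multiprocess.

```python
import subprocess
import time


def launch_all_or_fail(cmdlines, grace=1.0):
    """Start all commands; raise if any child exits within `grace` seconds."""
    procs = [subprocess.Popen(cmd) for cmd in cmdlines]
    deadline = time.monotonic() + grace
    # Poll until the grace period ends or some child has already exited.
    while time.monotonic() < deadline:
        if any(p.poll() is not None for p in procs):
            break
        time.sleep(0.05)
    dead = [p for p in procs if p.poll() is not None]
    if dead:
        # One child failed, so stop the survivors and fail as a whole.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        for p in procs:
            p.wait()
        raise RuntimeError(
            f"{len(dead)} of {len(procs)} child processes exited during startup")
    return procs
```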

@repeatedly
Member

Time to implement this feature? > fluent/fluent-plugin-multiprocess#3

@frsyuki
Member Author

frsyuki commented Jul 6, 2016

@tagomoris one of the processes does start successfully.

@tagomoris
Member

@frsyuki Yes, but taken as a whole, the configuration is still broken in that case.

@tagomoris
Member

I don't think Fluentd core can accept this proposal. It would change existing behavior in a way that breaks compatibility and goes against users' expectations. IMO, fluent-plugin-multiprocess should implement it.
Let me close this issue. Please reopen if you think this should be implemented in Fluentd core (probably after the v1 release).
