Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timeout option to haproxy input #7506

Closed
wants to merge 1 commit into from
Closed

Conversation

danielnelson
Copy link
Contributor

closes #7458

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

@danielnelson danielnelson added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label May 14, 2020
@danielnelson danielnelson added this to the 1.15.0 milestone May 14, 2020
@danielnelson danielnelson requested a review from reimda May 14, 2020 08:05
Copy link
Contributor

@reimda reimda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this approach will fix the problem in the issue. The timeouts in the issue are for socket unix connections but these changes are for HTTP connections. Sockets go through gatherServerSocket so these changes won't have any effect.

Actually, I don't understand why the input doesn't already return partial responses. The Gather func loops over the endpoints (the sockets) and gathers each in its own goroutine. It seems like this should add metrics for the sockets that are responding even if some are not responding. There's something else going on here.

If the problem is around connecting to the socket or writing to it, the user should be getting errors from telegraf. It seems more likely that when the haproxy process isn't responding it doesn't accept connections which should log an error "Could not connect to socket". There's also a chance it's accepting connection but not accepting write so they'd see "Could not write to socket".

If it's a write problem, Maybe we need to add a deadline on the connection in gatherServerSocket. After dial returns

err := c.SetDeadline(time.Now().Add(4*time.Second))

but we should confirm that they see "Could not write to socket" in their logs before adding that fix.

Also, once it's more clear what the failure is, we ought to have a unit test for the workaround. haproxy_test.go already does the work to stub/mock haproxy expected response so a new unit test won't be too much trouble. We can copy most of TestHaproxyGeneratesMetricsUsingSocket but add a new handler mock that doesn't accept connections or writes or whatever, along with the existing mocks of expected response.

@danielnelson danielnelson modified the milestones: 1.15.0, Planned Jul 2, 2020
@reimda reimda closed this Dec 11, 2020
@sspaink sspaink deleted the haproxy-timeout branch October 19, 2021 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HAProxy input : being able to set a timeout
3 participants