Made the router skip health checks when there is one endpoint #16643

Contributor

@knobunc knobunc commented Oct 2, 2017

If a route has only one endpoint, there is no point in doing
health checks: if the endpoint is down, haproxy will simply fail to connect.
Skipping the checks helps tremendously on servers with large numbers
of routes, because the router no longer spends time on health checks
that cannot change where traffic is sent.

Fixes bug 1492189 (https://bugzilla.redhat.com/show_bug.cgi?id=1492189)
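
A minimal sketch of the effect on the generated configuration, assuming a simplified `Endpoint` type. Only `NoHealthCheck` is named in this PR's discussion; the other fields, the `serverLine` helper, and the `5000ms` interval are hypothetical and used for illustration only. `check` and `inter` are standard options on an HAProxy `server` line.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint is a simplified stand-in for the router's endpoint type.
// NoHealthCheck comes from the PR discussion; the other fields are assumed.
type Endpoint struct {
	ID            string
	IP            string
	Port          string
	NoHealthCheck bool
}

// serverLine renders one HAProxy "server" line for an endpoint.
// The "check inter <interval>" options are appended only when
// health checking has not been disabled for the endpoint.
func serverLine(e Endpoint, interval string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "server %s %s:%s", e.ID, e.IP, e.Port)
	if !e.NoHealthCheck {
		fmt.Fprintf(&b, " check inter %s", interval)
	}
	return b.String()
}

func main() {
	// A route with a single endpoint: no health check is emitted.
	only := Endpoint{ID: "pod-a", IP: "10.0.0.1", Port: "8080", NoHealthCheck: true}
	fmt.Println(serverLine(only, "5000ms"))

	// An endpoint on a route with several endpoints keeps its check.
	one := Endpoint{ID: "pod-b", IP: "10.0.0.2", Port: "8080"}
	fmt.Println(serverLine(one, "5000ms"))
}
```

With a single backend there is nothing to fail over to, so dropping `check` changes only how much probing the router does, not where requests end up.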

@knobunc knobunc self-assigned this Oct 2, 2017
@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Oct 2, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 2, 2017
@knobunc knobunc added component/routing and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 2, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 2, 2017
@knobunc
Contributor Author

knobunc commented Oct 2, 2017

@openshift/networking @openshift/sig-networking PTAL

// Count the number of endpoint addresses (we have to do this first since we need the result when we loop below)
// But the two loop conditions must be the same
numEndpoints := 0
for _, s := range subsets {
Contributor

Run this loop only if wasIdled is false, to be more efficient.

Contributor

Or rather, run a loop at the end over the 'out' array only if wasIdled=false and len(out)==1, and set NoHealthCheck=true on all elements of 'out'. This way we would only loop if the length is 1.
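
The suggestion above can be sketched roughly as follows. This is not the merged code: `markSingleEndpoint` is a hypothetical helper name, and `Endpoint` is reduced to the fields needed here, with only `NoHealthCheck`, `wasIdled`, and `out` taken from the discussion.

```go
package main

import "fmt"

// Endpoint is a reduced stand-in for the router's endpoint type.
type Endpoint struct {
	ID            string
	NoHealthCheck bool
}

// markSingleEndpoint disables health checks when the service was not
// idled and exactly one endpoint was built. In every other case the
// slice is left untouched, so no extra counting pass is needed.
func markSingleEndpoint(out []Endpoint, wasIdled bool) {
	if wasIdled || len(out) != 1 {
		return
	}
	// Index into the slice so the stored element is modified,
	// not a copy of it.
	for i := range out {
		out[i].NoHealthCheck = true
	}
}

func main() {
	single := []Endpoint{{ID: "pod-a"}}
	markSingleEndpoint(single, false)
	fmt.Println(single[0].NoHealthCheck)

	pair := []Endpoint{{ID: "pod-a"}, {ID: "pod-b"}}
	markSingleEndpoint(pair, false)
	fmt.Println(pair[0].NoHealthCheck, pair[1].NoHealthCheck)
}
```

Because the check happens after `out` is built, the cost is a length comparison in the common case rather than a separate loop over the subsets.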

Contributor Author

Fair enough... but I'm also considering adding metrics to the router for:

  • Num idled
  • Num endpoints
  • Num health checked endpoints

And so on... so I may need to flip this back. However, until we do that, I'll go for the faster version.
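
For illustration only, the counters listed above might be tallied along these lines. Nothing here exists in the router: `RouterStats` and `Observe` are hypothetical names, and the sketch assumes "idled" is counted per service.

```go
package main

import "fmt"

// Endpoint is a reduced stand-in for the router's endpoint type.
type Endpoint struct {
	ID            string
	NoHealthCheck bool
}

// RouterStats holds hypothetical counters matching the three metrics
// mentioned in the comment above.
type RouterStats struct {
	Idled         int // services that were idled (assumed unit)
	Endpoints     int // endpoints seen across all services
	HealthChecked int // endpoints that still get a health check
}

// Observe adds one service's endpoints to the running totals.
func (s *RouterStats) Observe(eps []Endpoint, wasIdled bool) {
	if wasIdled {
		s.Idled++
	}
	for _, e := range eps {
		s.Endpoints++
		if !e.NoHealthCheck {
			s.HealthChecked++
		}
	}
}

func main() {
	var stats RouterStats
	stats.Observe([]Endpoint{{ID: "pod-a", NoHealthCheck: true}}, false)
	stats.Observe([]Endpoint{{ID: "pod-b"}, {ID: "pod-c"}}, false)
	stats.Observe(nil, true)
	fmt.Printf("%+v\n", stats)
}
```

Counting like this needs a pass over every endpoint regardless of how many there are, which is why adding metrics could bring back the loop the faster version avoids.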


// TODO: review me for sanity
// Now build the actual endpoints we pass to the template
out := make([]Endpoint, 0, numEndpoints)
Contributor

I would just do this, and if there is more than 1 endpoint (after line 325), loop over the endpoints setting the flag. The argument for this change suggests that most of the time the count will be 1, so looping just to find that out seems inefficient.

@knobunc knobunc force-pushed the fix/router-skip-health-when-one-endpoint branch from e200b1b to 4833eb5 Compare October 4, 2017 14:43
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 4, 2017
@knobunc
Contributor Author

knobunc commented Oct 4, 2017

/test

Contributor

@pecameron pecameron left a comment

/lgtm

@openshift-merge-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knobunc, pecameron

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 4, 2017
@knobunc knobunc added the kind/bug Categorizes issue or PR as related to a bug. label Oct 5, 2017
@openshift-merge-robot
Contributor

Automatic merge from submit-queue (batch tested with PRs 16545, 16684, 16643, 16459, 16682).

@openshift-merge-robot openshift-merge-robot merged commit f6a5067 into openshift:master Oct 5, 2017
@knobunc knobunc deleted the fix/router-skip-health-when-one-endpoint branch June 7, 2018 12:39
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. component/routing kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants