Bug 1829824: Remove dead member from LB pool. #612
Conversation
@gryf: This pull request references Bugzilla bug 1829824, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Looks good, just some minor issues, plus the job failures.
@@ -481,6 +483,11 @@ func BootstrapKuryr(conf *operv1.NetworkSpec, kubeClient client.Client) (*bootst
}
}

err := purgeOpenStackLbPoolMember(client, poolId, addresses)
The variable err already exists; no need to recreate it, just update it.
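For clarity, a sketch of the requested change, reusing the names from the diff above (this is a fragment of BootstrapKuryr, not a standalone snippet, and the error message is illustrative):

```go
// err is already declared earlier in BootstrapKuryr, e.g.:
//   portList, err := listOpenStackPortsMatchingPattern(client, tag, r)
// so the purge call can assign to the existing variable with `=` instead of
// re-declaring it with `:=`:
err = purgeOpenStackLbPoolMember(client, poolId, addresses)
if err != nil {
	return nil, errors.Wrap(err, "failed to purge LB pool members")
}
```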
ack.
@@ -411,3 +411,34 @@ func listOpenStackOctaviaProviders(client *gophercloud.ServiceClient) ([]provide
return providersList, nil
}
}

// Iterate on pool members and check their address against provided list.
// Remove all surplus members, which address doesn't existst on that list.
Typo: exist. Nit: better to specify what the provided list is.
ack.
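To illustrate both points, the doc comment could read something along these lines (a suggestion only; it assumes the provided list holds the addresses of the current nodes, as discussed further down in this review):

```go
// purgeOpenStackLbPoolMember iterates over the pool's members and compares
// each member's address against the provided list of current node addresses.
// Every surplus member whose address does not exist on that list is removed.
```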
@@ -481,6 +483,11 @@ func BootstrapKuryr(conf *operv1.NetworkSpec, kubeClient client.Client) (*bootst
}
}

err := purgeOpenStackLbPoolMember(client, poolId, addresses)
if err != nil {
return nil, errors.Wrap(err, "Failed on purging LB pool.")
Nit: LB pool member.
Actually, it's an error raised while purging the pool of invalid members ;)
return errors.Wrap(err, "failed to extract LB members list")
}

for _, member := range members {
As we have already added the "new members", does it make sense to first compare the lengths of members and addresses here, and if they are the same, assume the members are already up to date?
Seems tempting, although it won't save us any Octavia calls, right? And the full check ensures that the members' addresses are the same as the nodes' addresses (as they should be) :)
I support @gryf here; we could miss some kind of failover of a master VM. Regardless of whether that's possible or not, doing a full comparison costs almost nothing. It's just that golang makes it look long and clumsy. ;)
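For illustration only, a sketch of the full comparison using a set, which keeps the Go a little shorter (the names follow the diff above; this is not the code that was merged):

```go
// Collect the expected addresses (the node/port addresses gathered earlier
// in BootstrapKuryr) into a set for constant-time lookups.
expected := make(map[string]struct{}, len(addresses))
for _, address := range addresses {
	expected[address] = struct{}{}
}

// Any member whose address is not expected (e.g. the removed bootstrap node)
// is surplus and should be deleted from the pool.
for _, member := range members {
	if _, ok := expected[member.Address]; !ok {
		// delete the surplus member here
	}
}
```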
/retest
/test e2e-metal-ipi
/lgtm
During installation a bootstrap node is used as an API node at first, but then gets removed. In Kuryr's bootstrap we never remove it from the API loadbalancer pool, which leads to Octavia reporting the LB as degraded. With these changes we prevent that by making sure that on reconciliation we remove loadbalancer members we no longer need. This should help with scaledown of masters too.
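For context, here is a minimal sketch of what such a purge helper could look like on top of gophercloud's Octavia v2 pools API; it is an illustration of the approach under stated assumptions (package name, exact error messages), not the exact code merged in this PR:

```go
package network // package name assumed for illustration

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/loadbalancer/v2/pools"
	"github.com/pkg/errors"
)

// purgeOpenStackLbPoolMember removes every pool member whose address is not
// on the provided list of expected addresses (sketch only; the real helper
// may differ in details).
func purgeOpenStackLbPoolMember(client *gophercloud.ServiceClient, poolId string, addresses []string) error {
	page, err := pools.ListMembers(client, poolId, pools.ListMembersOpts{}).AllPages()
	if err != nil {
		return errors.Wrap(err, "failed to get LB members list")
	}
	members, err := pools.ExtractMembers(page)
	if err != nil {
		return errors.Wrap(err, "failed to extract LB members list")
	}

	for _, member := range members {
		found := false
		for _, address := range addresses {
			if address == member.Address {
				found = true
				break
			}
		}
		if !found {
			// This member is no longer backed by a node (e.g. the bootstrap
			// node after it was removed), so drop it from the pool.
			err = pools.DeleteMember(client, poolId, member.ID).ExtractErr()
			if err != nil {
				return errors.Wrap(err, "failed to delete LB member")
			}
		}
	}
	return nil
}
```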
/lgtm
Looks good to me, minor feedback in the comments.
@@ -457,9 +457,11 @@ func BootstrapKuryr(conf *operv1.NetworkSpec, kubeClient client.Client) (*bootst
log.Print("Creating OpenShift API loadbalancer pool members")
r, _ := regexp.Compile(fmt.Sprintf("^%s-(master-port-[0-9]+|bootstrap-port)$", clusterID))
portList, err := listOpenStackPortsMatchingPattern(client, tag, r)
addresses := make([]string, 0)
No need for make() in that case; just var addresses []string to create a nil slice should be fine, and append will work with it.
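A quick sketch of the suggested form (the address values are placeholders):

```go
// No make() needed: append works on a nil slice and allocates on first use.
var addresses []string
for _, ip := range []string{"10.0.0.5", "10.0.0.6"} { // placeholder values
	addresses = append(addresses, ip)
}
```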
return errors.Wrap(err, "failed to extract LB members list")
}

for _, member := range members {
for _, address := range addresses {
if address == member.Address {
found = true
break
Instead of using a found flag you could just use loop labels: mark the loop in 429 as members_loop and make this a continue members_loop.
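A sketch of the labeled-loop variant being suggested, with the label name taken from the comment and the surrounding names from the diff (not the merged code):

```go
members_loop:
	for _, member := range members {
		for _, address := range addresses {
			if address == member.Address {
				// The address is still expected; keep this member and move on.
				continue members_loop
			}
		}
		// No matching address was found, so this member is surplus and
		// would be removed here.
	}
```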
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dulek, gryf, luis5tb, MaysaMacedo. The full list of commands accepted by this bot can be found here. The pull request process is described here.
@gryf: The following test failed, say /retest to rerun all failed tests. Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
/retest
Please review the full test history for this PR and help us cut down flakes.
@gryf: All pull requests linked via external trackers have merged: openshift/cluster-network-operator#612. Bugzilla bug 1829824 has been moved to the MODIFIED state.