Fixes service controller update race condition #55336
Conversation
@jhorwit2: GitHub didn't allow me to request PR reviews from the following users: prydie. Note that only kubernetes members can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This is WIP because I still need to run more tests in an actual cluster. Functionally, though, this is ready for review.
@@ -435,6 +446,8 @@ func (s *serviceCache) delete(serviceName string) {
	delete(s.serviceMap, serviceName)
}

// needsUpdate checks to see if there were any changes between the old and new service that would require a load balancer update.
// This method does not and should not check if the hosts have changed.
👍
Thanks for this! This LGTM
(Force-pushed from e18d8be to 26f9dd7)
Assign back to me once it is LGTM'ed and ready for approval. I'm giving @wlan0 the review on this one.
@wlan0 I ran some tests on an existing cluster and the results were good. I removed the WIP label. The first test was cordoning a node; as the logs indicate, UpdateLoadBalancer was called instead of EnsureLoadBalancer.
For the next test, I uncordoned the same node. Once I saw the update start happening, I updated the service. As expected, they were processed in order by the workqueue: EnsureLoadBalancer was called for the service update and UpdateLoadBalancer for updating the two load balancers for the node change.
if !s.needsUpdate(cachedService.state, service) {
	// The service does not require an update which means it was placed on the work queue
	// by the node sync loop and indicates that the hosts need to be updated.
	err := s.updateLoadBalancerHosts(service)
Inferring source from state seems a little odd… does doing host updates when needsUpdate returns false mean this controller will do work on every service even when it should be in steady state?
Services are only added to the queue under three conditions:
- The service requires an update, as determined by needsUpdate, which is checked in the informer OnUpdate.
- The node sync loop determined that the hosts changed since the last sync.
- An error occurred, so retry.
The UpdateLoadBalancer method is supposed to be cheap for cloud providers. It was added so that cloud providers could have a method that handles only updating load balancer hosts. This approach should not (and, from what I can tell in my tests, doesn't) add any extra calls compared to before.
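The branch being discussed can be sketched as follows. This is a minimal, self-contained model of the decision (stub types and made-up field names, not the real controller code): when needsUpdate is false for a key pulled off the queue, the only remaining reason it was enqueued is the node sync loop, so the cheaper host-only path is taken.

```go
package main

import "fmt"

// service is a stand-in for *v1.Service; spec stands in for the
// load-balancer-relevant spec fields, hosts for the node list.
type service struct {
	spec  string
	hosts []string
}

type controller struct{ calls []string }

// needsUpdate mirrors the check discussed above: it compares only the
// LB-relevant spec and deliberately ignores the hosts.
func (c *controller) needsUpdate(old, cur *service) bool {
	return old.spec != cur.spec
}

func (c *controller) updateLoadBalancerHosts(s *service) {
	c.calls = append(c.calls, "UpdateLoadBalancer")
}

func (c *controller) ensureLoadBalancer(s *service) {
	c.calls = append(c.calls, "EnsureLoadBalancer")
}

// processUpdate is the branch from the diff: an unchanged spec means the
// key came from the node sync loop, so only the hosts are updated.
func (c *controller) processUpdate(cached, cur *service) {
	if !c.needsUpdate(cached, cur) {
		c.updateLoadBalancerHosts(cur)
		return
	}
	c.ensureLoadBalancer(cur)
}

func main() {
	c := &controller{}
	cached := &service{spec: "v1"}
	c.processUpdate(cached, &service{spec: "v1", hosts: []string{"node-b"}}) // node sync
	c.processUpdate(cached, &service{spec: "v2"})                            // service update
	fmt.Println(c.calls) // [UpdateLoadBalancer EnsureLoadBalancer]
}
```

Note the inference here: the source of the enqueue is never recorded, it is reconstructed from the absence of a spec change, which is exactly what the reviewer below questions.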
From #56443. I think the logic here is too hacky and may not cover all corner cases. What we really want is to distinguish a "service update" from a "nodeSync update". The condition !s.needsUpdate(cachedService.state, service) is too broad in that it also includes the retry case of a "service update" (given how we cache services).
Besides, putting "nodeSync update" into the same work queue as "service update" might introduce another problem: one update could override the other. Ref #52495 (comment): the work queue does not store duplicate keys, so if both a "nodeSync update" and a "service update" come in before either leaves the queue, it will end up as only one update (which one depends on how we decide what update it is). It seems to me that the work queue mechanism also needs to be adjusted before we can put "nodeSync update" into it.
cc @bowei
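The deduplication behavior being referred to can be modeled with a minimal queue. This is a simplified sketch of the relevant property of client-go's workqueue, not its actual implementation:

```go
package main

import "fmt"

// dedupQueue models the workqueue property under discussion: a key that
// is already pending is not added a second time, so a "service update"
// and a "nodeSync update" for the same service collapse into one item.
type dedupQueue struct {
	pending map[string]bool
	items   []string
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: map[string]bool{}}
}

func (q *dedupQueue) add(key string) {
	if q.pending[key] {
		return // duplicate key dropped while still queued
	}
	q.pending[key] = true
	q.items = append(q.items, key)
}

func (q *dedupQueue) len() int { return len(q.items) }

func main() {
	q := newDedupQueue()
	q.add("default/my-svc") // service update enqueues the key
	q.add("default/my-svc") // nodeSync update for the same service is dropped
	fmt.Println(q.len())    // 1
}
```

The real client-go workqueue achieves this with its dirty/processing sets; the point of the sketch is only that the second add is silently coalesced, which is why the eventual sync must be able to serve both kinds of update.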
Besides, putting "nodeSync update" into the same work queue as "service update" might introduce another problem that one update could override the other
One update won't override the other, because the sync checks whether the service needs an update. Both cloud provider calls (EnsureLoadBalancer and UpdateLoadBalancer) will update the hosts, so if both happen it will go with EnsureLoadBalancer, which is what we want.
As I mentioned in the other PR #56448 (comment), the finalizer support will clean this all up and I think we should revert this until the finalizer PR cleans up all the cache/delete logic.
Both cloud provider calls (EnsureLoadBalancer and UpdateLoadBalancer) will update the hosts, so if both happen it will go with EnsureLoadBalancer
I think this assumption is inaccurate. Ref the LoadBalancer interface: it doesn't explicitly require that EnsureLoadBalancer() update the hosts:
kubernetes/pkg/cloudprovider/cloud.go, lines 100 to 109 in e5aec86:
// EnsureLoadBalancer creates a new load balancer 'name', or updates the existing one. Returns the status of the balancer
// Implementations must treat the *v1.Service and *v1.Node
// parameters as read-only and not modify them.
// Parameter 'clusterName' is the name of the cluster as presented to kube-controller-manager
EnsureLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error)
// UpdateLoadBalancer updates hosts under the specified load balancer.
// Implementations must treat the *v1.Service and *v1.Node
// parameters as read-only and not modify them.
// Parameter 'clusterName' is the name of the cluster as presented to kube-controller-manager
UpdateLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) error
And in fact, EnsureLoadBalancer() in the GCE cloud provider doesn't update hosts. Hence for GCE this is the case where a "service update" overrides a "nodeSync update".
creates a new load balancer 'name', or updates the existing one
That should include the backends (nodes). Why would it only partially update the load balancer?
@MrHohn from what I see GCE does make sure that backends are up-to-date on EnsureLoadBalancer calls via this method, which is called by ensureInternalLoadBalancer
Thanks for gathering the links. I hadn't looked into the internal one before; it does seem to check for host updates. Though at the moment the external one doesn't:
kubernetes/pkg/cloudprovider/providers/gce/gce_loadbalancer_external.go, lines 675 to 676 in e57accb:
// Doesn't check whether the hosts have changed, since host updating is handled
// separately.
Ah, missed that comment. The external one is definitely more complex than the internal one 👼
It sounds like we should definitely revert this, then, and we need to come to an agreement on what EnsureLoadBalancer and UpdateLoadBalancer should do for each cloud provider. It was my understanding that EnsureLoadBalancer should completely update the load balancer, which is how AWS, Azure, Oracle & DigitalOcean handle it (the only ones I checked). cc @wlan0 @luxas
I'll open a PR to revert this for 1.9 @MrHohn
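The contract under debate, that EnsureLoadBalancer fully reconciles the load balancer, backends included, could look like this in a provider. This is a purely illustrative fake (names and structure invented for the sketch, taken from no real provider):

```go
package main

import "fmt"

// fakeCloud illustrates the contract discussed above: EnsureLoadBalancer
// reconciles everything, including the backend node set, so a host
// change coalesced into a "service update" is never lost.
type fakeCloud struct {
	listeners []string
	backends  []string
}

// EnsureLoadBalancer performs a full reconcile of the (fake) load
// balancer: listeners and backends are both brought to the desired state.
func (c *fakeCloud) EnsureLoadBalancer(listeners, nodes []string) {
	c.listeners = append([]string(nil), listeners...)
	// Hosts are synced here too, not only in UpdateLoadBalancer.
	c.backends = append([]string(nil), nodes...)
}

// UpdateLoadBalancer is the cheap host-only path.
func (c *fakeCloud) UpdateLoadBalancer(nodes []string) {
	c.backends = append([]string(nil), nodes...)
}

func main() {
	c := &fakeCloud{}
	c.UpdateLoadBalancer([]string{"node-a"})
	// A service update arrives after a node change; the full sync still
	// carries the new node set.
	c.EnsureLoadBalancer([]string{"tcp/80"}, []string{"node-a", "node-b"})
	fmt.Println(c.backends) // [node-a node-b]
}
```

If a provider's EnsureLoadBalancer skipped the backend step (as the GCE external path did at the time), the queue coalescing described earlier would silently drop the node change, which is the corner case motivating the revert.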
@@ -233,30 +258,44 @@ func TestUpdateNodesInExternalLoadBalancer(t *testing.T) {
	{Service: newService("s3", "999", v1.ServiceTypeLoadBalancer), Hosts: nodes},
	},
},
{
	// One service has an external load balancer and one is nil: one call.
	services: []*v1.Service{
Why was this case removed?
It's not possible for a service to be nil when updateLoadBalancerHosts is called. I didn't see value in adding a nil check here, because then I should also add it to createLoadBalancerIfNeeded and other methods.
Other than the concerns raised by @liggitt, I'm also wondering whether there will be a noticeable delay in updating hosts for LBs now, since we are adding the work back to the queue instead of processing it right away. Is that something to consider, or is it insignificant?
@wlan0 Prior to this PR if a call to |
I meant the normal case, not just the failure case. But your answer gave me the information. If that delay was acceptable, this will be fine.
Ah, yeah, you could have a slightly longer wait, but you'd have to be creating/updating/deleting a service at the time a node change occurs. In a subsequent PR I'd like to expose the number of workers as a flag, which will speed things up significantly (given you stay under your rate limits for a given cloud).
@liggitt PTAL
/status in-progress
You must be a member of the kubernetes/kubernetes-milestone-maintainers github team to add status labels.
@wlan0 PTAL
This LGTM. I don't see any changes since my last review.
Thanks! /assign @thockin
@wlan0 you have to say /lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jhorwit2, thockin Associated issue: 53462 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/test pull-kubernetes-unit
/priority critical-urgent /remove-priority important-longterm adjusting priorities for code freeze
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
Automatic merge from submit-queue (batch tested with PRs 56520, 53764). If you want to cherry-pick this change to another branch, please follow the instructions here.

Revert "Merge pull request #55336 from oracle/for/upstream/master/53462"

This reverts commit ccb15fb, reversing changes made to 4904037.

**What this PR does / why we need it**: Reverting this PR due to the discussion in #56448 (comment) and #55336 (comment).

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #56443

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
/cc @thockin @luxas @wlan0 @MrHohn /priority critical-urgent
What this PR does / why we need it:
Fixes service controller update race condition that can happen with the node sync loop and the worker(s). This PR allows the node sync loop to utilize the same work queue as service updates so that the queue can ensure the service is being acted upon by only one goroutine.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #53462
Special notes for your reviewer:
Release note:
/cc @wlan0 @luxas @prydie @andrewsykim
/sig cluster-lifecycle
/area cloudprovider