
Fixes service controller update race condition #55336

Merged

merged 2 commits into kubernetes:master on Nov 23, 2017

Conversation

@jhorwit2 (Contributor) commented Nov 8, 2017

What this PR does / why we need it:

Fixes service controller update race condition that can happen with the node sync loop and the worker(s). This PR allows the node sync loop to utilize the same work queue as service updates so that the queue can ensure the service is being acted upon by only one goroutine.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #53462

Special notes for your reviewer:

Release note:

NONE

/cc @wlan0 @luxas @prydie @andrewsykim

/sig cluster-lifecycle
/area cloudprovider

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Nov 8, 2017
@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Nov 8, 2017
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 8, 2017
@k8s-ci-robot (Contributor) commented:

@jhorwit2: GitHub didn't allow me to request PR reviews from the following users: prydie.

Note that only kubernetes members can review this PR, and authors cannot review their own PRs.

In response to this:

(the PR description above is quoted here verbatim)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 8, 2017
@jhorwit2 (Contributor, Author) commented Nov 8, 2017

This is WIP because I still need to run more tests in an actual cluster. Functionally this is ready for review though.

@@ -435,6 +446,8 @@ func (s *serviceCache) delete(serviceName string) {
delete(s.serviceMap, serviceName)
}

// needsUpdate checks to see if there were any changes between the old and new service that would require a load balancer update.
// This method does not and should not check if the hosts have changed.
Member review comment:

👍
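
For context on the documented behavior: a simplified, hypothetical sketch of the kind of spec comparison a needsUpdate-style check performs, deliberately ignoring host/node membership (illustrative only, not the PR's actual implementation):

```go
package sketch

import (
	"reflect"

	v1 "k8s.io/api/core/v1"
)

// loadBalancerNeedsUpdate is a hypothetical, trimmed-down comparison: it looks
// only at service fields that affect the cloud load balancer's configuration.
// Host/node changes are deliberately NOT considered here; the node sync loop
// handles those separately.
func loadBalancerNeedsUpdate(oldSvc, newSvc *v1.Service) bool {
	if oldSvc.Spec.Type != newSvc.Spec.Type {
		return true
	}
	if !reflect.DeepEqual(oldSvc.Spec.Ports, newSvc.Spec.Ports) {
		return true
	}
	if !reflect.DeepEqual(oldSvc.Spec.ExternalIPs, newSvc.Spec.ExternalIPs) {
		return true
	}
	if !reflect.DeepEqual(oldSvc.Spec.LoadBalancerSourceRanges, newSvc.Spec.LoadBalancerSourceRanges) {
		return true
	}
	return false
}
```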

@wlan0 (Member) commented Nov 9, 2017

Thanks for this! This LGTM

@thockin (Member) commented Nov 9, 2017

Assign back to me once it is LGTM'ed and ready for approval. I'll give @wlan0 the review on this one.

@thockin thockin removed their assignment Nov 9, 2017
@jhorwit2 jhorwit2 changed the title [WIP] Fixes service controller update race condition Fixes service controller update race condition Nov 10, 2017
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 10, 2017
@jhorwit2 (Contributor, Author) commented Nov 10, 2017

@wlan0 I ran some tests on an existing cluster and the results were good. I removed the WIP label.

The first test was cordoning a node. As the logs indicate, UpdateLoadBalancer was called instead of EnsureLoadBalancer.

I1110 21:16:05.957196       7 service_controller.go:666] Detected change in list of current cluster nodes. New node set: [host1 host2 host3 host4]
I1110 21:16:05.957297       7 load_balancer.go:406] Attempting to update load balancer 'loadbalancer-service-1'
I1110 21:16:07.067208       7 load_balancer.go:286] Applying "update" action on backend set `TCP-443` for lb `loadbalancer-1-ocid`
I1110 21:16:32.780236       7 load_balancer.go:286] Applying "update" action on backend set `TCP-80` for lb `loadbalancer-1-ocid`
I1110 21:16:47.265296       7 load_balancer.go:217] Successfully ensured load balancer "loadbalancer-service-1"
I1110 21:16:47.265401       7 load_balancer.go:406] Attempting to update load balancer 'loadbalancer-service-2'
I1110 21:16:47.265471       7 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"test", Name:"ingress-controller-internal", UID:"171b9fe9-bb30-11e7-bf4c-0000170092e3", APIVersion:"v1", ResourceVersion:"15751", FieldPath:""}): type: 'Normal' reason: 'UpdatedLoadBalancer' Updated load balancer with new hosts
I1110 21:16:48.458252       7 load_balancer.go:286] Applying "update" action on backend set `TCP-443` for lb `loadbalancer-2-ocid`
I1110 21:17:10.260149       7 load_balancer.go:286] Applying "update" action on backend set `TCP-80` for lb `loadbalancer-2-ocid`
I1110 21:17:21.425777       7 load_balancer.go:217] Successfully ensured load balancer "k8s-us-phx-a-1748042b-bb30-11e7-bf4c-0000170092e3"
I1110 21:17:21.425913       7 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"test", Name:"ingress-controller-public", UID:"1748042b-bb30-11e7-bf4c-0000170092e3", APIVersion:"v1", ResourceVersion:"16082", FieldPath:""}): type: 'Normal' reason: 'UpdatedLoadBalancer' Updated load balancer with new hosts

For the next test, I uncordoned the same node. Once I saw the update start happening, I updated the service. As expected, the items were processed in order by the workqueue: EnsureLoadBalancer was called for the service update, and UpdateLoadBalancer was called for updating the two load balancers after the node change.

I1110 21:19:25.957522       7 service_controller.go:666] Detected change in list of current cluster nodes. New node set: [host1 host5 host2 host3 host4]
I1110 21:19:25.957635       7 load_balancer.go:406] Attempting to update load balancer 'loadbalancer-service-1'
I1110 21:19:27.038952       7 load_balancer.go:286] Applying "update" action on backend set `TCP-443` for lb `loadbalancer-1-ocid`
I1110 21:19:43.324369       7 load_balancer.go:286] Applying "update" action on backend set `TCP-80` for lb `loadbalancer-1-ocid`
I1110 21:19:59.128539       7 load_balancer.go:217] Successfully ensured load balancer "loadbalancer-service-1"
I1110 21:19:59.128699       7 load_balancer.go:406] Attempting to update load balancer 'loadbalancer-service-2'
I1110 21:19:59.128740       7 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"test", Name:"ingress-controller-internal", UID:"171b9fe9-bb30-11e7-bf4c-0000170092e3", APIVersion:"v1", ResourceVersion:"15751", FieldPath:""}): type: 'Normal' reason: 'UpdatedLoadBalancer' Updated load balancer with new hosts
I1110 21:20:00.264083       7 load_balancer.go:286] Applying "update" action on backend set `TCP-443` for lb `loadbalancer-2-ocid`
I1110 21:20:11.254631       7 load_balancer.go:286] Applying "update" action on backend set `TCP-80` for lb `loadbalancer-2-ocid`
I1110 21:20:20.577953       7 load_balancer.go:217] Successfully ensured load balancer "k8s-us-phx-a-1748042b-bb30-11e7-bf4c-0000170092e3"
I1110 21:20:20.578069       7 service_controller.go:309] Ensuring LB for service test/ingress-controller-internal
I1110 21:20:20.578137       7 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"test", Name:"ingress-controller-public", UID:"1748042b-bb30-11e7-bf4c-0000170092e3", APIVersion:"v1", ResourceVersion:"16082", FieldPath:""}): type: 'Normal' reason: 'UpdatedLoadBalancer' Updated load balancer with new hosts
I1110 21:20:20.578211       7 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"test", Name:"ingress-controller-internal", UID:"171b9fe9-bb30-11e7-bf4c-0000170092e3", APIVersion:"v1", ResourceVersion:"3874587", FieldPath:""}): type: 'Normal' reason: 'EnsuringLoadBalancer' Ensuring load balancer
I1110 21:20:21.700978       7 load_balancer.go:286] Applying "update" action on backend set `TCP-443` for lb `loadbalancer-1-ocid`
I1110 21:20:47.476569       7 load_balancer.go:352] Applying "delete" action on listener `TCP-80` for lb `loadbalancer-1-ocid`
I1110 21:21:06.944680       7 load_balancer.go:286] Applying "delete" action on backend set `TCP-80` for lb `loadbalancer-1-ocid`
I1110 21:21:22.450553       7 load_balancer.go:286] Applying "create" action on backend set `TCP-8080` for lb `loadbalancer-1-ocid`
I1110 21:21:47.989922       7 load_balancer.go:352] Applying "create" action on listener `TCP-8080` for lb `loadbalancer-1-ocid`
I1110 21:22:12.294323       7 load_balancer.go:217] Successfully ensured load balancer "loadbalancer-service-1"
I1110 21:22:12.294379       7 service_controller.go:334] Not persisting unchanged LoadBalancerStatus for service test/ingress-controller-internal to registry.
I1110 21:22:12.294470       7 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"test", Name:"ingress-controller-internal", UID:"171b9fe9-bb30-11e7-bf4c-0000170092e3", APIVersion:"v1", ResourceVersion:"3874587", FieldPath:""}): type: 'Normal' reason: 'EnsuredLoadBalancer' Ensured load balancer

if !s.needsUpdate(cachedService.state, service) {
	// The service does not require an update which means it was placed on the work queue
	// by the node sync loop and indicates that the hosts need to be updated.
	err := s.updateLoadBalancerHosts(service)
Member review comment:

Inferring source from state seems a little odd… does doing host updates when needsUpdate returns false mean this controller will do work on every service even when it should be in steady state?

@jhorwit2 (Contributor, Author) replied:

Services are only added to the queue under three conditions:

  1. The service requires an update. That's determined by needsUpdate, which is checked in the informer's OnUpdate handler.
  2. The node sync loop determined that the hosts changed since the last sync.
  3. An error occurred, so the service is retried.

The UpdateLoadBalancer method is supposed to be cheap for cloud providers. It was added so that cloud providers could have a method that handled only updating load balancer hosts.

This approach should not (and, from what I can tell in my tests, doesn't) add any extra calls compared to before.
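
To make that flow concrete, here is a minimal sketch (assumed names and types, not the PR's actual code) of a worker draining the single shared queue: a key enqueued by either the informer or the node sync loop is handled by one goroutine at a time, and the needsUpdate check decides between a full ensure and a hosts-only update:

```go
package sketch

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/util/workqueue"
)

// serviceController is a hypothetical, trimmed-down stand-in used only to
// illustrate the single-shared-queue flow described above.
type serviceController struct {
	queue workqueue.Interface
	cache map[string]*v1.Service // previously observed spec, keyed by namespace/name

	// Stubs standing in for the real cloud-provider and lookup calls.
	ensureLoadBalancer      func(*v1.Service) error
	updateLoadBalancerHosts func(*v1.Service) error
	needsUpdate             func(oldSvc, newSvc *v1.Service) bool
	getService              func(key string) *v1.Service
}

// worker drains the shared queue. Both the informer handlers and the node
// sync loop Add() keys to the same queue, so any given service is processed
// by only one goroutine at a time.
func (s *serviceController) worker() {
	for {
		item, quit := s.queue.Get()
		if quit {
			return
		}
		key := item.(string)
		s.process(key)
		// Done marks the key as no longer in flight; until then a re-Add of
		// the same key is deferred rather than handed to a second worker.
		s.queue.Done(item)
	}
}

func (s *serviceController) process(key string) {
	service := s.getService(key)
	if service == nil {
		return
	}
	if old, ok := s.cache[key]; ok && !s.needsUpdate(old, service) {
		// Spec unchanged: the key came from the node sync loop (or a retry),
		// so only the load balancer hosts need refreshing.
		_ = s.updateLoadBalancerHosts(service)
		return
	}
	// Spec changed (or first time seeing the service): full ensure.
	_ = s.ensureLoadBalancer(service)
	s.cache[key] = service
}
```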

Member review comment:

From #56443: I think the logic here is too hacky and may not cover all corner cases. What we really want here is to distinguish between a "service update" and a "nodeSync update". The condition !s.needsUpdate(cachedService.state, service) is too broad in that it also includes the retry case of a "service update" (given how we cache services).

Besides, putting the "nodeSync update" into the same work queue as the "service update" might introduce another problem: one update could override the other. Ref #52495 (comment): the workqueue does not store duplicate keys, so if both a "nodeSync update" and a "service update" come in before either leaves the queue, only one update remains (which one depends on how we decide what kind of update it is). It seems to me that the workqueue mechanism also needs to be adjusted before we can put the "nodeSync update" into it.

cc @bowei
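
The deduplication behaviour referred to here can be seen directly with client-go's workqueue package: adding the same key twice before it is picked up yields a single queued item (a small illustrative snippet, not taken from the PR):

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	q := workqueue.New()
	defer q.ShutDown()

	// Imagine a "service update" and a "nodeSync update" both enqueue the
	// same namespace/name key before any worker picks it up.
	q.Add("default/my-service") // from the informer (service update)
	q.Add("default/my-service") // from the node sync loop

	// The queue deduplicates identical keys, so only one item is queued and
	// the worker gets a single chance to decide what kind of update to do.
	fmt.Println("queue length:", q.Len()) // prints: queue length: 1
}
```

This is why the comment above argues that, without extra bookkeeping, a "service update" and a "nodeSync update" for the same service can collapse into a single queue item.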

@jhorwit2 (Contributor, Author) replied Nov 28, 2017:

Besides, putting "nodeSync update" into the same work queue as "service update" might introduce another problem that one update could override the other

One update won't override the other, because the sync checks whether the service needs an update. Both cloud provider calls (EnsureLoadBalancer and UpdateLoadBalancer) will update the hosts, so if both happen it will go with EnsureLoadBalancer, which is what we want.

@jhorwit2 (Contributor, Author) replied:

As I mentioned in the other PR #56448 (comment), the finalizer support will clean this all up and I think we should revert this until the finalizer PR cleans up all the cache/delete logic.

Member reply:

Both cloud provider calls (EnsureLoadBalancer and UpdateLoadBalancer) will update the hosts, so if both happen it will go with EnsureLoadBalancer

I think this assumption is inaccurate. Referring to the LoadBalancer interface, it doesn't explicitly state that EnsureLoadBalancer() should update the hosts:

// EnsureLoadBalancer creates a new load balancer 'name', or updates the existing one. Returns the status of the balancer
// Implementations must treat the *v1.Service and *v1.Node
// parameters as read-only and not modify them.
// Parameter 'clusterName' is the name of the cluster as presented to kube-controller-manager
EnsureLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error)
// UpdateLoadBalancer updates hosts under the specified load balancer.
// Implementations must treat the *v1.Service and *v1.Node
// parameters as read-only and not modify them.
// Parameter 'clusterName' is the name of the cluster as presented to kube-controller-manager
UpdateLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) error

And in fact, EnsureLoadBalancer() in the GCE cloud provider doesn't update hosts. Hence for GCE this is the case where a "service update" overrides a "nodeSync update".

@jhorwit2 (Contributor, Author) replied:

creates a new load balancer 'name', or updates the existing one

Should include the backends (nodes). Why would it partially update the load balancer?

@jhorwit2 (Contributor, Author) replied Nov 28, 2017:

@MrHohn from what I see GCE does make sure that backends are up-to-date on EnsureLoadBalancer calls via this method, which is called by ensureInternalLoadBalancer

Member reply:

Thanks for gathering the links, I didn't look into the internal one before, seems like it does check for hosts update. Though ATM the external one doesn't check for hosts update.

// Doesn't check whether the hosts have changed, since host updating is handled
// separately.

@jhorwit2 (Contributor, Author) replied:

Ah, missed that comment. The external one is definitely more complex than the internal one 👼

It sounds like we should definitely revert this then, and we need to come to an agreement on what EnsureLoadBalancer and UpdateLoadBalancer should do for each cloud provider. It was my understanding that EnsureLoadBalancer should completely update the load balancer, which is how AWS, Azure, Oracle, and DigitalOcean handle it (the only ones I checked). cc @wlan0 @luxas

I'll open a PR to revert this for 1.9 @MrHohn
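
To illustrate the interpretation described above (EnsureLoadBalancer fully reconciles the load balancer, backends included, so a skipped UpdateLoadBalancer would not leave hosts stale), here is a hedged sketch of a hypothetical provider; the type and its storage are invented for illustration, and the method signatures follow the interface quoted earlier:

```go
package sketch

import (
	v1 "k8s.io/api/core/v1"
)

// fakeProvider is a hypothetical cloud provider used only to illustrate the
// "EnsureLoadBalancer fully reconciles, including backends" interpretation.
type fakeProvider struct {
	// lbs maps load balancer name -> currently configured backend node names.
	lbs map[string][]string
}

func lbName(clusterName string, service *v1.Service) string {
	return clusterName + "-" + service.Namespace + "-" + service.Name
}

// EnsureLoadBalancer creates the load balancer if needed and reconciles its
// entire configuration, including the backend set, to match nodes.
func (p *fakeProvider) EnsureLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error) {
	name := lbName(clusterName, service)
	backends := make([]string, 0, len(nodes))
	for _, n := range nodes {
		backends = append(backends, n.Name)
	}
	p.lbs[name] = backends // full reconcile: listeners, backend sets, hosts
	return &v1.LoadBalancerStatus{}, nil
}

// UpdateLoadBalancer only refreshes the backend hosts; it is the cheap path
// used by the node sync loop.
func (p *fakeProvider) UpdateLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) error {
	name := lbName(clusterName, service)
	backends := make([]string, 0, len(nodes))
	for _, n := range nodes {
		backends = append(backends, n.Name)
	}
	p.lbs[name] = backends
	return nil
}
```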

@@ -233,30 +258,44 @@ func TestUpdateNodesInExternalLoadBalancer(t *testing.T) {
{Service: newService("s3", "999", v1.ServiceTypeLoadBalancer), Hosts: nodes},
},
},
{
// One service has an external load balancer and one is nil: one call.
services: []*v1.Service{
Member review comment:

Why was this case removed?

@jhorwit2 (Contributor, Author) replied Nov 11, 2017:

It's not possible for a service to be nil when updateLoadBalancerHosts is called. I didn't see value in adding a nil check here, because then I should also add it to createLoadBalancerIfNeeded and the other methods.

@wlan0 (Member) commented Nov 11, 2017

Other than the concerns raised by @liggitt, I'm also wondering whether there will be a noticeable delay in updating hosts for LBs now, since we are adding them back to the queue instead of processing them right away.

Is that something to consider, or is it too insignificant?

@jhorwit2 (Contributor, Author) commented Nov 11, 2017

@wlan0 Prior to this PR, if a call to UpdateLoadBalancer failed, the service would be added to the servicesToUpdate list on the controller object. It wasn't retried immediately; it was only retried on the next sync loop, which runs every 100s, so retries occur faster this way.

@wlan0 (Member) commented Nov 11, 2017

I meant the normal case, not just the failure case. But your answer gave me the information.

It wasn't retried immediately; it was only retried on the next sync loop, which runs every 100s, so retries occur faster this way.

If that delay was acceptable, this will be fine.
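
For reference, a minimal sketch (assumed helper names, not from this PR) of the usual controller retry pattern with a rate-limited workqueue, which is what lets failed updates retry with backoff rather than waiting for the next full node sync:

```go
package sketch

import (
	"k8s.io/client-go/util/workqueue"
)

// processWithRetry shows the common controller pattern: on failure the key is
// re-added with exponential backoff, so a retry happens in milliseconds to
// seconds rather than waiting for the next periodic node sync.
func processWithRetry(q workqueue.RateLimitingInterface, sync func(key string) error) {
	item, quit := q.Get()
	if quit {
		return
	}
	defer q.Done(item)

	key := item.(string)
	if err := sync(key); err != nil {
		// Backoff-based requeue; the delay grows with NumRequeues(key).
		q.AddRateLimited(key)
		return
	}
	// Success: clear the rate limiter's history for this key.
	q.Forget(key)
}
```

A queue of this kind is typically constructed with workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter()), so failures back off per key instead of waiting for the periodic resync.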

@jhorwit2 (Contributor, Author) commented:

Ah, yeah, you could have a slightly longer wait, but you'd have to be creating/updating/deleting a service at the time a node change occurs. In a subsequent PR I'd like to expose the number of workers as a flag, which will speed things up significantly (provided you stay under your rate limits for a given cloud).
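
A sketch of what running multiple workers over the same queue could look like; the function and parameter names are assumptions, not part of this PR:

```go
package sketch

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// runWorkers starts concurrentServiceSyncs goroutines that all drain the same
// shared queue; the queue guarantees a given key is never handled by two
// workers at once, so raising the count only increases parallelism across
// different services.
func runWorkers(concurrentServiceSyncs int, worker func(), stopCh <-chan struct{}) {
	for i := 0; i < concurrentServiceSyncs; i++ {
		go wait.Until(worker, time.Second, stopCh)
	}
	<-stopCh
}
```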

@jhorwit2 (Contributor, Author) commented:

@liggitt PTAL

@jhorwit2 (Contributor, Author) commented:

/status in-progress

@k8s-ci-robot (Contributor) commented:

You must be a member of the kubernetes/kubernetes-milestone-maintainers github team to add status labels.

@jhorwit2 (Contributor, Author) commented:

@wlan0 PTAL

@wlan0 (Member) commented Nov 22, 2017

This LGTM. I don't see any changes since my last review.

@jhorwit2 (Contributor, Author) commented:

Thanks!

/assign @thockin
For approval.

@thockin (Member) commented Nov 22, 2017

@wlan0 you have to say /lgtm :)

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 22, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 22, 2017
@k8s-github-robot commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jhorwit2, thockin

Associated issue: 53462

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@jhorwit2 (Contributor, Author) commented:

/test pull-kubernetes-unit

@jberkus commented Nov 22, 2017

/priority critical-urgent

/remove-priority important-longterm

adjusting priorities for code freeze

@k8s-ci-robot k8s-ci-robot added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Nov 22, 2017
@k8s-github-robot commented:

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot commented:

[MILESTONENOTIFIER] Milestone Pull Request Current

@bowei @jhorwit2 @thockin @wlan0

Note: This pull request is marked as priority/critical-urgent, and must be updated every 1 day during code freeze.

Example update:

ACK.  In progress
ETA: DD/MM/YYYY
Risks: Complicated fix required
Pull Request Labels
  • sig/cluster-lifecycle sig/scheduling: Pull Request will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/bug: Fixes a bug discovered during the current release.

@k8s-github-robot commented:

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit ccb15fb into kubernetes:master Nov 23, 2017
jhorwit2 added a commit to jhorwit2/kubernetes that referenced this pull request Nov 29, 2017
…master/53462"

This reverts commit ccb15fb, reversing
changes made to 4904037.
jhorwit2 added a commit to oracle/kubernetes that referenced this pull request Nov 29, 2017
…master/53462"

This reverts commit ccb15fb, reversing
changes made to 4904037.
k8s-github-robot pushed a commit that referenced this pull request Nov 29, 2017
Automatic merge from submit-queue (batch tested with PRs 56520, 53764). If you want to cherry-pick this change to another branch, please follow the instructions here.

Revert "Merge pull request #55336 from oracle/for/upstream/master/53462"

This reverts commit ccb15fb, reversing
changes made to 4904037.



**What this PR does / why we need it**:

Reverting this PR due to the discussion #56448 (comment) and #55336 (comment). 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #56443

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/cc @thockin @luxas @wlan0 @MrHohn

/priority critical-urgent
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/cloudprovider
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
  • release-note-none: Denotes a PR that doesn't merit a release note.
  • sig/cluster-lifecycle: Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.
  • sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
Successfully merging this pull request may close these issues.

service controller race condition on updating the same service in multiple goroutines