
Reduce latency to node ready after CIDR is assigned. #67031

Merged

Conversation

@krzysztof-jastrzebski (Contributor) commented Aug 6, 2018

This adds code to execute an immediate runtime and node status update when the Kubelet sees that it has a CIDR, which significantly decreases the latency to node ready.

Release note: Speed up kubelet start time by executing an immediate runtime and node status update when the Kubelet sees that it has a CIDR.
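
For context, here is a rough sketch of how such a fast path can be hooked into kubelet startup. This is not the exact diff in this PR; the fastStatusUpdateOnce name is only settled later in the review, and the exact registration point inside Run() is an assumption for illustration.

```go
// Sketch only: start a one-shot fast status update alongside the normal
// periodic node status sync when the kubelet starts. Exact placement in
// Run() is an assumption, not the literal diff.
if kl.kubeClient != nil {
	// Existing periodic node status sync (unchanged).
	go wait.Until(kl.syncNodeStatus, kl.nodeStatusUpdateFrequency, wait.NeverStop)
	// New: report runtime and node status as soon as a pod CIDR is observed.
	go kl.fastStatusUpdateOnce()
}
```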

@k8s-ci-robot added labels on Aug 6, 2018: do-not-merge/release-note-label-needed (indicates that a PR should not merge because it's missing one of the release note labels), size/M (denotes a PR that changes 30-99 lines, ignoring generated files), cncf-cla: yes (indicates the PR's author has signed the CNCF CLA), and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test).
@krzysztof-jastrzebski (Contributor, Author)

/assign mtaufen

@@ -56,22 +56,23 @@ func (kl *Kubelet) providerRequiresNetworkingConfiguration() bool {

// updatePodCIDR updates the pod CIDR in the runtime state if it is different
// from the current CIDR.
-func (kl *Kubelet) updatePodCIDR(cidr string) {
+func (kl *Kubelet) updatePodCIDR(cidr string) error {
podCIDR := kl.runtimeState.podCIDR()
Contributor Author

I'm not sure if this is thread-safe.

Contributor

RuntimeState is already protected by an internal mutex.
Concurrent calls to kl.updatePodCIDR could race though (good catch), so I'd add a lock for this too.

Contributor

"You get a lock, you get a lock, everybody gets a lock!" ;)
If there's a reasonably small refactor that makes all of this more composable and thread-safe without lock proliferation, I would welcome it in this PR :).
Not necessarily a blocker though.

Contributor Author

done
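
To make the outcome of this thread concrete, here is a minimal sketch of updatePodCIDR guarded by its own mutex, assuming the updatePodCIDRMux field that a later diff in this PR adds. The error message and the setPodCIDR helper are assumptions based on the function's documented purpose, not a quote of the final diff.

```go
// Sketch only: serialize updatePodCIDR with a dedicated mutex so concurrent
// callers cannot race, per the discussion above.
func (kl *Kubelet) updatePodCIDR(cidr string) error {
	kl.updatePodCIDRMux.Lock()
	defer kl.updatePodCIDRMux.Unlock()

	podCIDR := kl.runtimeState.podCIDR()
	if podCIDR == cidr {
		return nil
	}

	// kubelet -> generic runtime -> runtime shim -> network plugin
	// docker/non-cri implementations have a passthrough UpdatePodCIDR
	if err := kl.getRuntime().UpdatePodCIDR(cidr); err != nil {
		// Return the error instead of logging here; callers log it (see the
		// later review thread about removing the internal log line).
		return fmt.Errorf("failed to update pod CIDR from %q to %q: %v", podCIDR, cidr, err)
	}
	kl.runtimeState.setPodCIDR(cidr)
	return nil
}
```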

glog.Errorf(err.Error())
continue
}
kl.updateRuntimeUp()
Contributor Author

I'm not sure if this is thread-safe.

Contributor

Hmm, yes. Maybe we should grab the status update lock at the top of the for loop and release it just before calling syncNodeStatus?

Contributor

That's kind of a jagged, fragile solution though; possibly we should just use fine-grained locks for the functions we call independently.

Contributor Author

What exactly is not thread-safe?
updateRuntimeUp and syncNodeStatus can be executed in parallel, as they are executed in parallel even without this change.
I'm not sure if two updateRuntimeUp calls can be executed in parallel.
I'm also not sure if updatePodCIDR and syncNodeStatus can be executed in parallel.
Grabbing a lock at the top of the loop won't solve the problem. I could grab a lock in updateRuntimeUp, but if I use the same mutex as in syncNodeStatus it might slow things down.
Should I use another lock in updatePodCIDR?

Contributor Author

Friendly ping:)

Contributor

> updateRuntimeUp and syncNodeStatus can be executed in parallel, as they are executed in parallel even without this change.

Yes, I think this should be fine.

> I'm not sure if two updateRuntimeUp calls can be executed in parallel.

Me neither; to be safe, let's give updateRuntimeUp its own lock, and have updateRuntimeUp grab/release it. I would use a new mutex for updateRuntimeUp, not the same one as syncNodeStatus. They are separate paths and we should avoid sharing the lock.

Contributor Author

done
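
As a self-contained illustration of the pattern being settled on here (one mutex per independently triggered update path, taken and released inside the function itself), here is a small runnable example. The type and field names are invented for the demo and are not kubelet code.

```go
package main

import (
	"fmt"
	"sync"
)

// demo mimics the locking pattern discussed above: each update path that can
// be called from more than one goroutine owns its own mutex and serializes
// itself, so callers never need to coordinate.
type demo struct {
	syncNodeStatusMux sync.Mutex
	updateRuntimeMux  sync.Mutex
	statusUpdates     int
	runtimeUpdates    int
}

func (d *demo) syncNodeStatus() {
	d.syncNodeStatusMux.Lock()
	defer d.syncNodeStatusMux.Unlock()
	d.statusUpdates++
}

func (d *demo) updateRuntimeUp() {
	d.updateRuntimeMux.Lock()
	defer d.updateRuntimeMux.Unlock()
	d.runtimeUpdates++
}

func main() {
	d := &demo{}
	var wg sync.WaitGroup
	// A periodic loop and a startup fast path may call these concurrently;
	// the per-function mutexes keep each path race-free without a shared lock.
	for i := 0; i < 4; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); d.syncNodeStatus() }()
		go func() { defer wg.Done(); d.updateRuntimeUp() }()
	}
	wg.Wait()
	fmt.Println(d.statusUpdates, d.runtimeUpdates) // 4 4
}
```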

glog.Errorf(err.Error())
continue
}
kl.updateRuntimeUp()
Member

This isn't thread-safe; we might need to use the node update lock.

Contributor Author

done

@neolit123 (Member)

/sig node

@krzysztof-jastrzebski
please add a release note, given this is user facing.

@k8s-ci-robot added the sig/node label (categorizes an issue or PR as relevant to SIG Node) on Aug 6, 2018.

// and node statuses ASAP.
// TODO(mtaufen): potentially generalize this to a fast "node ready" status update by
// factoring a readiness predicate out of setNodeReadyCondition in kubelet_node_status.go.
fastStatusUpdate := func() {
Contributor

Can you convert this to a Kubelet method, rather than inline?
Inlining was just a shortcut I took while prototyping.

Contributor Author

done


@tallclair removed their request for review on August 14, 2018, 01:44.
@mwielgus (Contributor)

/ok-to-test

@k8s-ci-robot added the release-note label (denotes a PR that will be considered when it comes time to generate release notes) and removed the do-not-merge/release-note-label-needed label on Aug 17, 2018.
@mwielgus removed the needs-ok-to-test label on Aug 17, 2018.
@mtaufen (Contributor) commented Aug 20, 2018

/ok-to-test

// updatePodCIDRMux is a lock on updating pod CIDR, because this path is not thread-safe.
updatePodCIDRMux sync.Mutex

// updateRuntimeUpMux is a lock on updating runtime, because this path is not thread-safe.
Contributor

Either change the name of the variable to updateRuntimeUpMux or fix the godoc.

Contributor Author

done

@@ -2079,6 +2089,9 @@ func (kl *Kubelet) LatestLoopEntryTime() time.Time {
// and returns an error if the status check fails. If the status check is OK,
// update the container runtime uptime in the kubelet runtimeState.
func (kl *Kubelet) updateRuntimeUp() {
kl.syncNodeStatusMux.Lock()
defer kl.syncNodeStatusMux.Unlock()
Contributor

Should these be updateRuntimeMux?

Contributor Author

done
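
The fix agreed on in this thread amounts to taking the runtime-specific mutex rather than the node status one. A sketch, with the unchanged body elided:

```go
// Sketch only: updateRuntimeUp guarded by its own mutex instead of the
// node status mutex, per the review comment above.
func (kl *Kubelet) updateRuntimeUp() {
	kl.updateRuntimeMux.Lock()
	defer kl.updateRuntimeMux.Unlock()
	// ... existing runtime status checks and runtimeState bookkeeping ...
}
```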

podCIDR := kl.runtimeState.podCIDR()

if podCIDR == cidr {
-return
+return nil
}

// kubelet -> generic runtime -> runtime shim -> network plugin
// docker/non-cri implementations have a passthrough UpdatePodCIDR
if err := kl.getRuntime().UpdatePodCIDR(cidr); err != nil {
glog.Errorf("Failed to update pod CIDR: %v", err)
Contributor

Let's remove this log line now that we are just returning an error. We can log the error at relevant return sites (pkg/kubelet/kubelet.go needs to update the klet.updatePodCIDR(kubeCfg.PodCIDR) call to log the error in this case).

Contributor Author

done
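
A sketch of the call-site change being requested here, logging in pkg/kubelet/kubelet.go now that updatePodCIDR returns the error; the exact log message is illustrative.

```go
// Sketch only: the construction-time call site logs the error itself now
// that updatePodCIDR no longer logs internally.
if err := klet.updatePodCIDR(kubeCfg.PodCIDR); err != nil {
	glog.Errorf("Pod CIDR update failed: %v", err)
}
```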

@mtaufen (Contributor) commented Aug 20, 2018

/lgtm

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) on Aug 20, 2018.
@krzysztof-jastrzebski (Contributor, Author)

/test pull-kubernetes-e2e-gce
/test pull-kubernetes-e2e-gce-device-plugin-gpu

@mtaufen (Contributor) commented Aug 21, 2018

/assign @yujuhong
for approval

@yujuhong (Contributor) left a comment

I am not too thrilled about adding three locks to the kubelet structure just for this one-off startup use case, but I don't have a better solution short of more refactoring...

I think we can live with this until the fate of pod CIDR is finalized (#62288).

The change mostly looks good. I've left some minor comments and will approve after they are addressed.

@@ -1011,6 +1013,15 @@ type Kubelet struct {
// as it takes time to gather all necessary node information.
nodeStatusUpdateFrequency time.Duration

// syncNodeStatusMux is a lock on updating the node status, because this path is not thread-safe.
Contributor

Please add the function that uses this lock in the comment, and state that the lock must not be used outside of the function.

Contributor Author

done

// syncNodeStatusMux is a lock on updating the node status, because this path is not thread-safe.
syncNodeStatusMux sync.Mutex

// updatePodCIDRMux is a lock on updating pod CIDR, because this path is not thread-safe.
Contributor

Please add the function that uses this lock in the comment, and state that the lock must not be used outside of the function.

Contributor Author

done

// updatePodCIDRMux is a lock on updating pod CIDR, because this path is not thread-safe.
updatePodCIDRMux sync.Mutex

// updateRuntimeMux is a lock on updating runtime, because this path is not thread-safe.
Contributor

Please add the function that uses this lock in the comment, and state that the lock must not be used outside of the function.

Contributor Author

done
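
For all three mutex fields, the requested godoc style amounts to naming the single function that may take the lock, roughly:

```go
// syncNodeStatusMux is a lock on updating the node status, because this path is not thread-safe.
// It is used by Kubelet.syncNodeStatus and must not be used anywhere else.
syncNodeStatusMux sync.Mutex

// updatePodCIDRMux is a lock on updating the pod CIDR, because this path is not thread-safe.
// It is used by Kubelet.updatePodCIDR and must not be used anywhere else.
updatePodCIDRMux sync.Mutex

// updateRuntimeMux is a lock on updating the runtime, because this path is not thread-safe.
// It is used by Kubelet.updateRuntimeUp and must not be used anywhere else.
updateRuntimeMux sync.Mutex
```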

// and tries to update pod CIDR immediately. After pod CIDR is updated it fires off a runtime
// update and a node status update.
// This should significantly improve latency to ready node by updating pod CIDR, runtime status
// and node statuses ASAP.
Contributor

Please expand the comment to make it clear that this only expedites the node status updates when kubelet first starts up. After one successful update, the function returns.

It'd also be good to reflect that in the function name. Maybe fastStatusUpdateOnce()?

Contributor Author

done
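
Putting the points from this review together (a named Kubelet method instead of an inline closure, an expanded comment, and returning after one successful update), the function would look roughly like the sketch below. The poll interval and the use of GetNode are assumptions for illustration; updatePodCIDR, updateRuntimeUp, and syncNodeStatus are the helpers discussed above.

```go
// fastStatusUpdateOnce starts a loop that polls the kubelet's view of the Node
// object until a pod CIDR has been assigned, applies it, and then fires off a
// runtime update and a node status update. It returns after one successful
// node status update and only runs once, at kubelet startup, to shorten the
// time until the node reports Ready.
func (kl *Kubelet) fastStatusUpdateOnce() {
	for {
		time.Sleep(100 * time.Millisecond) // poll interval is illustrative
		node, err := kl.GetNode()
		if err != nil {
			glog.Errorf(err.Error())
			continue
		}
		if node.Spec.PodCIDR != "" {
			if err := kl.updatePodCIDR(node.Spec.PodCIDR); err != nil {
				glog.Errorf(err.Error())
				continue
			}
			kl.updateRuntimeUp()
			kl.syncNodeStatus()
			return
		}
	}
}
```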

@k8s-ci-robot added the area/kubelet label and removed the lgtm label on Aug 22, 2018.
@krzysztof-jastrzebski (Contributor, Author)

/test pull-kubernetes-e2e-kops-aws

@yujuhong (Contributor)

/approve

@k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Aug 22, 2018.
@mtaufen (Contributor) commented Aug 23, 2018

/lgtm

@k8s-ci-robot added the lgtm label on Aug 23, 2018.
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: krzysztof-jastrzebski, mtaufen, yujuhong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Labels: approved, area/kubelet, cncf-cla: yes, lgtm, release-note, sig/node, size/M