
Reduce latency to node ready after CIDR is assigned. #67031

Merged

Conversation

@krzysztof-jastrzebski (Contributor) commented Aug 6, 2018

This adds code to execute an immediate runtime and node status update when the Kubelet sees that it has a CIDR, which significantly decreases the latency to node ready.

Release note: Speed up kubelet start time by executing an immediate runtime and node status update when the Kubelet sees that it has a CIDR.
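
For context, here is a rough sketch of how such a fast path can be hooked into kubelet startup. This is not the exact diff in this PR; the fastStatusUpdateOnce name is only settled later in the review, and the exact registration point inside Run() is an assumption for illustration.

```go
// Sketch only: start a one-shot fast status update alongside the normal
// periodic node status sync when the kubelet starts. Exact placement in
// Run() is an assumption, not the literal diff.
if kl.kubeClient != nil {
	// Existing periodic node status sync (unchanged).
	go wait.Until(kl.syncNodeStatus, kl.nodeStatusUpdateFrequency, wait.NeverStop)
	// New: report runtime and node status as soon as a pod CIDR is observed.
	go kl.fastStatusUpdateOnce()
}
```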

@k8s-ci-robot added labels on Aug 6, 2018: do-not-merge/release-note-label-needed (indicates that a PR should not merge because it's missing one of the release note labels), size/M (denotes a PR that changes 30-99 lines, ignoring generated files), cncf-cla: yes (indicates the PR's author has signed the CNCF CLA), and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test).
@krzysztof-jastrzebski (Contributor, Author)

/assign mtaufen

@@ -56,22 +56,23 @@ func (kl *Kubelet) providerRequiresNetworkingConfiguration() bool {

// updatePodCIDR updates the pod CIDR in the runtime state if it is different
// from the current CIDR.
-func (kl *Kubelet) updatePodCIDR(cidr string) {
+func (kl *Kubelet) updatePodCIDR(cidr string) error {
podCIDR := kl.runtimeState.podCIDR()
Contributor Author

I'm not sure if this is thread-safe.

Contributor

RuntimeState is already protected by an internal mutex.
Concurrent calls to kl.updatePodCIDR could race though (good catch), so I'd add a lock for this too.

Contributor

"You get a lock, you get a lock, everybody gets a lock!" ;)
If there's a reasonably small refactor that makes all of this more composable and thread-safe without lock proliferation, I would welcome it in this PR :).
Not necessarily a blocker though.

Contributor Author

done
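
To make the outcome of this thread concrete, here is a minimal sketch of updatePodCIDR guarded by its own mutex, assuming the updatePodCIDRMux field that a later diff in this PR adds. The error message and the setPodCIDR helper are assumptions based on the function's documented purpose, not a quote of the final diff.

```go
// Sketch only: serialize updatePodCIDR with a dedicated mutex so concurrent
// callers cannot race, per the discussion above.
func (kl *Kubelet) updatePodCIDR(cidr string) error {
	kl.updatePodCIDRMux.Lock()
	defer kl.updatePodCIDRMux.Unlock()

	podCIDR := kl.runtimeState.podCIDR()
	if podCIDR == cidr {
		return nil
	}

	// kubelet -> generic runtime -> runtime shim -> network plugin
	// docker/non-cri implementations have a passthrough UpdatePodCIDR
	if err := kl.getRuntime().UpdatePodCIDR(cidr); err != nil {
		// Return the error instead of logging here; callers log it (see the
		// later review thread about removing the internal log line).
		return fmt.Errorf("failed to update pod CIDR from %q to %q: %v", podCIDR, cidr, err)
	}
	kl.runtimeState.setPodCIDR(cidr)
	return nil
}
```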

glog.Errorf(err.Error())
continue
}
kl.updateRuntimeUp()
Contributor Author

I'm not sure if this is thread-safe.

Contributor

Hmm, yes. Maybe we should grab the status update lock at the top of the for loop and release it just before calling syncNodeStatus?

Contributor

That's kind of a jagged, fragile solution though; possibly we should just use fine-grained locks for the functions we call independently.

Contributor Author

What exactly is not thread-safe?
updateRuntimeUp and syncNodeStatus can be executed in parallel, as they are executed in parallel even without this change.
I'm not sure if two updateRuntimeUp calls can be executed in parallel.
I'm also not sure if updatePodCIDR and syncNodeStatus can be executed in parallel.
Grabbing a lock at the top of the loop won't solve the problem. I could grab a lock in updateRuntimeUp, but if I use the same mutex as in syncNodeStatus it might slow things down.
Should I use another lock in updatePodCIDR?

Contributor Author

Friendly ping:)

Contributor

> updateRuntimeUp and syncNodeStatus can be executed in parallel, as they are executed in parallel even without this change.

Yes, I think this should be fine.

> I'm not sure if two updateRuntimeUp calls can be executed in parallel.

Me neither; to be safe, let's give updateRuntimeUp its own lock, and have updateRuntimeUp grab/release it. I would use a new mutex for updateRuntimeUp, not the same one as syncNodeStatus. They are separate paths and we should avoid sharing the lock.

Contributor Author

done
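
As a self-contained illustration of the pattern being settled on here (one mutex per independently triggered update path, taken and released inside the function itself), here is a small runnable example. The type and field names are invented for the demo and are not kubelet code.

```go
package main

import (
	"fmt"
	"sync"
)

// demo mimics the locking pattern discussed above: each update path that can
// be called from more than one goroutine owns its own mutex and serializes
// itself, so callers never need to coordinate.
type demo struct {
	syncNodeStatusMux sync.Mutex
	updateRuntimeMux  sync.Mutex
	statusUpdates     int
	runtimeUpdates    int
}

func (d *demo) syncNodeStatus() {
	d.syncNodeStatusMux.Lock()
	defer d.syncNodeStatusMux.Unlock()
	d.statusUpdates++
}

func (d *demo) updateRuntimeUp() {
	d.updateRuntimeMux.Lock()
	defer d.updateRuntimeMux.Unlock()
	d.runtimeUpdates++
}

func main() {
	d := &demo{}
	var wg sync.WaitGroup
	// A periodic loop and a startup fast path may call these concurrently;
	// the per-function mutexes keep each path race-free without a shared lock.
	for i := 0; i < 4; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); d.syncNodeStatus() }()
		go func() { defer wg.Done(); d.updateRuntimeUp() }()
	}
	wg.Wait()
	fmt.Println(d.statusUpdates, d.runtimeUpdates) // 4 4
}
```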

glog.Errorf(err.Error())
continue
}
kl.updateRuntimeUp()
Member

This isn't thread-safe; we might need to use the node update lock.

Contributor Author

done

@neolit123 (Member)

/sig node

@krzysztof-jastrzebski
please add a release note, given this is user facing.

@k8s-ci-robot added the sig/node label (categorizes an issue or PR as relevant to SIG Node) on Aug 6, 2018.

// and node statuses ASAP.
// TODO(mtaufen): potentially generalize this to a fast "node ready" status update by
// factoring a readiness predicate out of setNodeReadyCondition in kubelet_node_status.go.
fastStatusUpdate := func() {
Contributor

Can you convert this to a Kubelet method, rather than inline?
Inlining was just a shortcut I took while prototyping.

Contributor Author

done


@tallclair removed their request for review on August 14, 2018, 01:44.
@mwielgus (Contributor)

/ok-to-test

@k8s-ci-robot added the release-note label (denotes a PR that will be considered when it comes time to generate release notes) and removed the do-not-merge/release-note-label-needed label on Aug 17, 2018.
@mwielgus removed the needs-ok-to-test label on Aug 17, 2018.
@mtaufen (Contributor) commented Aug 20, 2018

/ok-to-test

// updatePodCIDRMux is a lock on updating pod CIDR, because this path is not thread-safe.
updatePodCIDRMux sync.Mutex

// updateRuntimeUpMux is a lock on updating runtime, because this path is not thread-safe.
Contributor

Either change the name of the variable to updateRuntimeUpMux or fix the godoc.

Contributor Author

done

@@ -2079,6 +2089,9 @@ func (kl *Kubelet) LatestLoopEntryTime() time.Time {
// and returns an error if the status check fails. If the status check is OK,
// update the container runtime uptime in the kubelet runtimeState.
func (kl *Kubelet) updateRuntimeUp() {
kl.syncNodeStatusMux.Lock()
defer kl.syncNodeStatusMux.Unlock()
Contributor

Should these be updateRuntimeMux?

Contributor Author

done
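
The fix agreed on in this thread amounts to taking the runtime-specific mutex rather than the node status one. A sketch, with the unchanged body elided:

```go
// Sketch only: updateRuntimeUp guarded by its own mutex instead of the
// node status mutex, per the review comment above.
func (kl *Kubelet) updateRuntimeUp() {
	kl.updateRuntimeMux.Lock()
	defer kl.updateRuntimeMux.Unlock()
	// ... existing runtime status checks and runtimeState bookkeeping ...
}
```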

podCIDR := kl.runtimeState.podCIDR()

if podCIDR == cidr {
-return
+return nil
}

// kubelet -> generic runtime -> runtime shim -> network plugin
// docker/non-cri implementations have a passthrough UpdatePodCIDR
if err := kl.getRuntime().UpdatePodCIDR(cidr); err != nil {
glog.Errorf("Failed to update pod CIDR: %v", err)
Contributor

Let's remove this log line now that we are just returning an error. We can log the error at relevant return sites (pkg/kubelet/kubelet.go needs to update the klet.updatePodCIDR(kubeCfg.PodCIDR) call to log the error in this case).

Contributor Author

done
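
A sketch of the call-site change being requested here, logging in pkg/kubelet/kubelet.go now that updatePodCIDR returns the error; the exact log message is illustrative.

```go
// Sketch only: the construction-time call site logs the error itself now
// that updatePodCIDR no longer logs internally.
if err := klet.updatePodCIDR(kubeCfg.PodCIDR); err != nil {
	glog.Errorf("Pod CIDR update failed: %v", err)
}
```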

@mtaufen (Contributor) commented Aug 20, 2018

/lgtm

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) on Aug 20, 2018.
@krzysztof-jastrzebski (Contributor, Author)

/test pull-kubernetes-e2e-gce
/test pull-kubernetes-e2e-gce-device-plugin-gpu

@mtaufen (Contributor) commented Aug 21, 2018

/assign @yujuhong
for approval

@yujuhong (Contributor) left a comment

I am not too thrilled about adding three locks to the kubelet structure just for this one-off startup use case, but I don't have a better solution short of more refactoring...

I think we can live with this until the fate of pod CIDR is finalized (#62288).

The change mostly looks good. I've left some minor comments and will approve after they are addressed.

@@ -1011,6 +1013,15 @@ type Kubelet struct {
// as it takes time to gather all necessary node information.
nodeStatusUpdateFrequency time.Duration

// syncNodeStatusMux is a lock on updating the node status, because this path is not thread-safe.
Contributor

Please add the function that uses this lock in the comment, and state that the lock must not be used outside of the function.

Contributor Author

done

// syncNodeStatusMux is a lock on updating the node status, because this path is not thread-safe.
syncNodeStatusMux sync.Mutex

// updatePodCIDRMux is a lock on updating pod CIDR, because this path is not thread-safe.
Contributor

Please add the function that uses this lock in the comment, and state that the lock must not be used outside of the function.

Contributor Author

done

// updatePodCIDRMux is a lock on updating pod CIDR, because this path is not thread-safe.
updatePodCIDRMux sync.Mutex

// updateRuntimeMux is a lock on updating runtime, because this path is not thread-safe.
Contributor

Please add the function that uses this lock in the comment, and state that the lock must not be used outside of the function.

Contributor Author

done
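
For all three mutex fields, the requested godoc style amounts to naming the single function that may take the lock, roughly:

```go
// syncNodeStatusMux is a lock on updating the node status, because this path is not thread-safe.
// It is used by Kubelet.syncNodeStatus and must not be used anywhere else.
syncNodeStatusMux sync.Mutex

// updatePodCIDRMux is a lock on updating the pod CIDR, because this path is not thread-safe.
// It is used by Kubelet.updatePodCIDR and must not be used anywhere else.
updatePodCIDRMux sync.Mutex

// updateRuntimeMux is a lock on updating the runtime, because this path is not thread-safe.
// It is used by Kubelet.updateRuntimeUp and must not be used anywhere else.
updateRuntimeMux sync.Mutex
```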

// and tries to update pod CIDR immediately. After pod CIDR is updated it fires off a runtime
// update and a node status update.
// This should significantly improve latency to ready node by updating pod CIDR, runtime status
// and node statuses ASAP.
Contributor

Please expand the comment to make it clear that this only expedites the node status updates when kubelet first starts up. After one successful update, the function returns.

It'd also be good to reflect that in the function name. Maybe fastStatusUpdateOnce()?

Contributor Author

done
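
Putting the points from this review together (a named Kubelet method instead of an inline closure, an expanded comment, and returning after one successful update), the function would look roughly like the sketch below. The poll interval and the use of GetNode are assumptions for illustration; updatePodCIDR, updateRuntimeUp, and syncNodeStatus are the helpers discussed above.

```go
// fastStatusUpdateOnce starts a loop that polls the kubelet's view of the Node
// object until a pod CIDR has been assigned, applies it, and then fires off a
// runtime update and a node status update. It returns after one successful
// node status update and only runs once, at kubelet startup, to shorten the
// time until the node reports Ready.
func (kl *Kubelet) fastStatusUpdateOnce() {
	for {
		time.Sleep(100 * time.Millisecond) // poll interval is illustrative
		node, err := kl.GetNode()
		if err != nil {
			glog.Errorf(err.Error())
			continue
		}
		if node.Spec.PodCIDR != "" {
			if err := kl.updatePodCIDR(node.Spec.PodCIDR); err != nil {
				glog.Errorf(err.Error())
				continue
			}
			kl.updateRuntimeUp()
			kl.syncNodeStatus()
			return
		}
	}
}
```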

@k8s-ci-robot added the area/kubelet label and removed the lgtm label on Aug 22, 2018.
@krzysztof-jastrzebski (Contributor, Author)

/test pull-kubernetes-e2e-kops-aws

@yujuhong (Contributor)

/approve

@k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Aug 22, 2018.
@mtaufen (Contributor) commented Aug 23, 2018

/lgtm

@k8s-ci-robot added the lgtm label on Aug 23, 2018.
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: krzysztof-jastrzebski, mtaufen, yujuhong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Labels: approved, area/kubelet, cncf-cla: yes, lgtm, release-note, sig/node, size/M