
Delete node lease if the corresponding node is deleted #70034

Merged

merged 1 commit into kubernetes:master from wangzhen127:cleanup-node-lease on Nov 10, 2018

Conversation

wangzhen127 (Member)

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR ensures that a node's lease is deleted when the corresponding node is deleted. This is done in the node lifecycle controller. The OwnerReference approach is not feasible because an owning object must be in the same namespace as the owned object, and nodes are not namespaced.

This is part of KEP-0009, feature #589 and issue #14733.

Does this PR introduce a user-facing change?:
Yes. When the node lease feature is enabled, with this PR, users will see the node lease deleted when the corresponding node is deleted.

Release note:

NodeLifecycleController: when the node lease feature is enabled, the node lease will be deleted when the corresponding node is deleted.

/sig node
/milestone v1.13

@k8s-ci-robot (Contributor)

@wangzhen127: You must be a member of the kubernetes/kubernetes-milestone-maintainers GitHub team to set the milestone.

In response to this:

(the PR description above, quoted in full)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 19, 2018
@wangzhen127 (Member Author)

/assign @wojtek-t
/assign @yujuhong

@@ -669,6 +669,11 @@ func (nc *Controller) monitorNodeHealth() error {
glog.V(1).Infof("Controller observed a Node deletion: %v", deleted[i].Name)
nodeutil.RecordNodeEvent(nc.recorder, deleted[i].Name, string(deleted[i].UID), v1.EventTypeNormal, "RemovingNode", fmt.Sprintf("Removing Node %v from Controller", deleted[i].Name))
delete(nc.knownNodeSet, deleted[i].Name)
if utilfeature.DefaultFeatureGate.Enabled(features.NodeLease) {
if err := nc.kubeClient.CoordinationV1beta1().Leases(v1.NamespaceNodeLease).Delete(deleted[i].Name, metav1.NewDeleteOptions(0)); err != nil {
Member:
This has a problem: if this call fails (or if the controller-manager is killed or stopped in the meantime, or something like that), the lease object will be left behind.

Wouldn't it be possible to set the ownerReference of the Lease object to the corresponding Node object and rely on GC? @janetkuo - thoughts?

Member Author:
This has a problem: if this call fails (or if the controller-manager is killed or stopped in the meantime, or something like that), the lease object will be left behind.

The alternative is to compare the existing nodes and leases and do the cleanup every time monitorNodeHealth() runs.

Wouldn't it be possible to set the ownerReference of the Lease object to the corresponding Node object and rely on GC? @janetkuo - thoughts?

IIUC, the OwnerReference approach is not feasible because an owning object must be in the same namespace as the owned object, and nodes are not namespaced. See here. I also included this in the PR description.

The alternative is to compare the existing nodes and leases and do the cleanup every time monitorNodeHealth() runs.

@wangzhen127 do the leases have information about which node acquired them?

IIUC, the OwnerReference approach is not feasible because an owning object must be in the same namespace as the owned object, and nodes are not namespaced. See here. I also included this in the PR description.

@janetkuo @caesarxuchao Does GC work on non-namespaced objects?

Member:
I really think we need something more robust than this.
Listing all leases (from the local cache should be good enough) and comparing them with existing nodes sounds reasonable.

[In case we aren't able to use GC - I pinged Janet and Chao about that offline.]

Member:
GC supports the owner being non-namespaced while the dependents are namespaced (see #65200).

Member:
@wangzhen127 - if that mechanism works, then we should use it.

Member Author:
SG. Will try owner reference and update later.
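
For reference, a minimal sketch of what the owner-reference approach looks like (a sketch, not the PR's exact code; the function name and wiring are illustrative): the lease carries an OwnerReference pointing at its Node, so the garbage collector deletes the lease once the Node is gone, with no extra bookkeeping in the controller-manager.

```go
package sketch

import (
	coordv1beta1 "k8s.io/api/coordination/v1beta1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newLeaseWithOwnerRef builds a node lease owned by the given Node.
// The owner is cluster-scoped and the dependent is namespaced, which
// GC supports (see #65200), so the lease is cleaned up with the Node.
func newLeaseWithOwnerRef(node *corev1.Node, leaseDurationSeconds int32) *coordv1beta1.Lease {
	return &coordv1beta1.Lease{
		ObjectMeta: metav1.ObjectMeta{
			Name:      node.Name,
			Namespace: corev1.NamespaceNodeLease,
			OwnerReferences: []metav1.OwnerReference{{
				APIVersion: "v1",
				Kind:       "Node",
				Name:       node.Name,
				UID:        node.UID,
			}},
		},
		Spec: coordv1beta1.LeaseSpec{
			HolderIdentity:       &node.Name,
			LeaseDurationSeconds: &leaseDurationSeconds,
		},
	}
}
```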

if actions[0].GetVerb() != "delete" {
t.Fatalf("unexpected action verb: %s", actions[0].GetVerb())
}
if actions[0].GetResource().Resource != "leases" {

Is this the place where you verify that the delete action was called for the leases resource?

We should also verify the name of the lease being deleted, and that it was indeed held by the node that got deleted. A sketch of such assertions follows below.
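
A sketch of the tightened assertions (illustrative; `actions`, `t`, and `fakeNode` are assumed from the surrounding fake-client test):

```go
// Assert the verb, the resource, and that the deleted lease is named
// after the deleted node (the lease name equals the node name).
deleteAction, ok := actions[0].(clienttesting.DeleteAction)
if !ok {
	t.Fatalf("expected a delete action, got %+v", actions[0])
}
if deleteAction.GetResource().Resource != "leases" {
	t.Fatalf("unexpected resource: %s", deleteAction.GetResource().Resource)
}
if deleteAction.GetName() != fakeNode.Name {
	t.Fatalf("deleted lease %q, want %q", deleteAction.GetName(), fakeNode.Name)
}
```

Here `clienttesting` is the `k8s.io/client-go/testing` package, whose `DeleteAction` interface exposes the name of the deleted object.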

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/kubelet sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 30, 2018
@wangzhen127 (Member Author)

@yujuhong, @wojtek-t, I have updated this PR to use owner reference. PTAL.

/test pull-kubernetes-node-e2e-alpha

@wojtek-t (Member)

This LGTM.

@wangzhen127 (Member Author)

/retest

@guineveresaenger (Contributor)

Hi @wangzhen127 - I'm the Enhancements Shadow for 1.13. This looks like it is nearing completion! Just a friendly reminder that code slush begins on 11/9 and code freeze is 11/15, so if you could follow up with your reviewers and get this in, that would be awesome.

Thank you!

@wangzhen127 (Member Author)

Ping @yujuhong :)

@yujuhong yujuhong added this to the v1.13 milestone Nov 7, 2018
Expect(err).To(BeNil())
systemPodsNo = int32(len(systemPods))
if strings.Index(framework.TestContext.CloudConfig.NodeInstanceGroup, ",") >= 0 {
framework.Failf("Test dose not support cluster setup with more than one MIG: %s", framework.TestContext.CloudConfig.NodeInstanceGroup)
Contributor:
Should the test be skipped if there is more than one MIG?

Member Author:
I am not entirely sure about the reasoning here. I copied it from the resize_nodes tests:
https://github.com/kubernetes/kubernetes/blob/master/test/e2e/lifecycle/resize_nodes.go#L57

}
if pass {
return nil
} else {
Contributor:
nit: don't need the else

Member Author:
Done

Expect(deletedNodeName).NotTo(Equal(""))
Eventually(func() error {
if _, err := leaseClient.Get(deletedNodeName, metav1.GetOptions{}); err == nil {
return fmt.Errorf("node lease is not deleted yet")
Contributor:
Include the deleted node name in the error message for debugging.

Member Author:
Done

}
}
return nil
}, 5*time.Minute, 5*time.Second).Should(BeNil())
Contributor:
Do we really need to wait 5 minutes for this to be true? Seems excessive to me.

Member Author:
Changing to 1m now.
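
Put together, the adjusted e2e check would look roughly like this (a simplified sketch; `leaseClient` and `deletedNodeName` come from the surrounding test, with the node name in the error and the shortened 1-minute timeout):

```go
Eventually(func() error {
	// The lease should disappear once GC processes the deleted Node.
	if _, err := leaseClient.Get(deletedNodeName, metav1.GetOptions{}); err == nil {
		return fmt.Errorf("node lease for node %q is not deleted yet", deletedNodeName)
	}
	return nil
}, 1*time.Minute, 5*time.Second).Should(BeNil())
```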

} else {
return fmt.Errorf("some node lease is not ready")
}
}, 5*time.Minute, 5*time.Second).Should(BeNil())
Contributor:
Can we shorten the timeout here?

Member Author:
Changing to 1m now.

targetNodes := framework.GetReadySchedulableNodesOrDie(c)
Expect(len(targetNodes.Items)).To(Equal(int(targetNumNodes)))

By("verify node lease is deleted for the deleted node")
Contributor:
What's the expected latency for the lease to be deleted?

Member Author:
That depends on GC. From what I observed previously, the lease is deleted very soon after the node is deleted. Changing the timeout to 1m now.

},
}
} else {
glog.Infof("failed to get node %q when trying to set owner ref to the node lease: %v", c.holderIdentity, err)
Contributor:
Why is this Infof as opposed to Errorf?

Member Author:
Done
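
The resulting log line after this fix (same message, raised to error level):

```go
glog.Errorf("failed to get node %q when trying to set owner ref to the node lease: %v", c.holderIdentity, err)
```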

@wangzhen127 (Member Author) left a comment:

Updated. PTAL

@wangzhen127 (Member Author)

wangzhen127 commented Nov 8, 2018

The failures in pull-kubernetes-e2e-gce-alpha-features do not seem related to this PR, because the gci-gce-alpha-features CI jobs are also failing. The node lease tests are passing.

@yujuhong (Contributor)

yujuhong commented Nov 8, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2018
@yujuhong (Contributor)

yujuhong commented Nov 8, 2018

/approve

1 similar comment
@wojtek-t (Member)

wojtek-t commented Nov 9, 2018

/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wangzhen127, wojtek-t, yujuhong

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 9, 2018
@wangzhen127 (Member Author)

/retest

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@wangzhen127 (Member Author)

Do we have to wait for pull-kubernetes-e2e-gce-alpha-features to fully pass before merging, given that it is not a required job? The failures are in CSI tests, which are unrelated to this PR and are tracked in #70760.

Since the node lease tests are passing, can we merge ignoring pull-kubernetes-e2e-gce-alpha-features?
/cc @krzyzacy
/cc @AishSundar

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@AishSundar (Contributor)

/priority important-soon

@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 9, 2018
@saad-ali (Member)

saad-ali commented Nov 9, 2018

Do we have to wait for pull-kubernetes-e2e-gce-alpha-features to fully pass before merging, given that it is not a required job? The failures are in CSI tests, which are unrelated to this PR and are tracked in #70760.

Yes, it looks like these tests are bad. They should not block 1.13. @mkimuram, please open a PR to disable the tests, then fix them before re-enabling.

@krzyzacy (Member)

krzyzacy commented Nov 9, 2018

/override pull-kubernetes-e2e-gce-alpha-features

@krzyzacy (Member)

krzyzacy commented Nov 9, 2018

gah... I think we need to wait for the test to finish before we can override the context

@wangzhen127 (Member Author)

/override pull-kubernetes-e2e-gce-alpha-features

1 similar comment
@krzyzacy (Member)

krzyzacy commented Nov 9, 2018

/override pull-kubernetes-e2e-gce-alpha-features

@krzyzacy (Member)

krzyzacy commented Nov 9, 2018

/skip

@wangzhen127 (Member Author)

/test pull-kubernetes-e2e-gce-100-performance

1 similar comment
@wangzhen127 (Member Author)

/test pull-kubernetes-e2e-gce-100-performance

@k8s-ci-robot (Contributor)

k8s-ci-robot commented Nov 10, 2018

@wangzhen127: The following test failed; say /retest to rerun it:

Test name: pull-kubernetes-e2e-gce-alpha-features
Commit: b8b1279 (link)
Rerun command: /test pull-kubernetes-e2e-gce-alpha-features

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit be71fbe into kubernetes:master Nov 10, 2018
@wangzhen127 wangzhen127 deleted the cleanup-node-lease branch November 10, 2018 07:22