Kubelet recreating node lease after node object has been deleted #13259

Closed
olemarkus opened this issue Feb 15, 2022 · 0 comments · Fixed by #13289
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/office-hours

Comments

@olemarkus
Member

During rolling upgrades the following happens:

  • Drain node
  • Terminate instance
  • Delete node object
  • Node lease object deleted as part of the cascade since the lease has the node object as owner

After the node object is deleted, and the lease with it, there is a high chance that kubelet will recreate the lease object, especially after enabling graceful shutdowns. Since the node object is gone, kubelet does not set the ownerRef on the new lease, which is causing our E2E tests to fail. This behavior is somewhat questionable: kubelet is aware that it is being shut down, and the lease doesn't make much sense without the node object in the first place.

The consequence is that lease objects are created and never cleaned up.
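To illustrate why the recreated lease leaks, here is a minimal sketch of owner-based garbage collection. The `Lease` and `OwnerRef` types and the `survivesGC` helper are hypothetical stand-ins, not the real Kubernetes API types or the actual garbage collector code. The sketch only assumes the general rule that an object is collected once all of its owners are gone, while an object with no owners is left alone.

```go
package main

import "fmt"

// OwnerRef is a simplified, hypothetical stand-in for an owner reference.
type OwnerRef struct {
	Kind string
	UID  string
}

// Lease is a simplified, hypothetical stand-in for a node lease object.
type Lease struct {
	Name   string
	Owners []OwnerRef
}

// survivesGC reports whether owner-based garbage collection leaves the
// lease in place. A lease with no owners is never collected; a lease with
// owners is collected once none of those owners exist any more.
func survivesGC(l Lease, liveUIDs map[string]bool) bool {
	if len(l.Owners) == 0 {
		return true // nothing ties this lease to another object
	}
	for _, o := range l.Owners {
		if liveUIDs[o.UID] {
			return true // at least one owner still exists
		}
	}
	return false
}

func main() {
	// The node object has already been deleted, so no owner UID is live.
	liveUIDs := map[string]bool{}

	// Lease created while the node existed: it carries the node as owner.
	owned := Lease{Name: "node-a", Owners: []OwnerRef{{Kind: "Node", UID: "uid-1"}}}

	// Lease recreated by kubelet after the node was deleted: no ownerRef.
	recreated := Lease{Name: "node-a"}

	fmt.Println("owned lease survives:", survivesGC(owned, liveUIDs))
	fmt.Println("recreated lease survives:", survivesGC(recreated, liveUIDs))
}
```

In this model the original lease is removed with its node, while the recreated one survives indefinitely because nothing references it as a dependent.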

There are a couple of ways this could be solved:

  • Change behavior in kubelet
  • cloudup or kops-controller cleans up lease objects of terminated instances
  • We stop deleting node objects and let CCM clean them up. In general this is not a problem, as CCM cleans up nodes faster than new nodes are provisioned. There is a comment noting that GCP will recycle node names, but I am not sure whether we should make an exception for GCP or just leave it be.
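For comparison, the second option (cleaning up leases of terminated instances) could start from a check like the one below. This is only a sketch under stated assumptions: `staleLeases` is a hypothetical helper, it works on plain name lists rather than API objects, and it relies on node leases being named after their node. A real controller would probably also need a grace period so it does not race with a node that is still registering.

```go
package main

import (
	"fmt"
	"sort"
)

// staleLeases returns, in sorted order, the names of leases that have no
// node with the same name. Hypothetical helper for illustration only.
func staleLeases(leaseNames, nodeNames []string) []string {
	nodes := make(map[string]struct{}, len(nodeNames))
	for _, n := range nodeNames {
		nodes[n] = struct{}{}
	}

	var stale []string
	for _, l := range leaseNames {
		if _, ok := nodes[l]; !ok {
			stale = append(stale, l)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	// Example data: node-a and node-c were terminated, but their leases
	// were recreated by kubelet after the node objects were deleted.
	leases := []string{"node-c", "node-a", "node-b"}
	nodes := []string{"node-b"}

	fmt.Println(staleLeases(leases, nodes))
}
```

The returned names would be the candidates for deletion from the lease namespace.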

I would prefer the last option: let CCM discover that instances are terminated and delete the nodes.

/kind bug
/kind office-hours

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. kind/office-hours labels Feb 15, 2022