During rolling upgrades the following happens:
The node lease object is deleted as part of the cascade, since the lease has the node object as its owner.
After the node object is deleted, and the lease with it, there is a high chance that kubelet will recreate the lease object, especially with graceful shutdown enabled. Since the node object is gone, kubelet does not set the ownerRef on the new lease, which is causing our E2E to fail. This behavior is somewhat questionable, as kubelet is aware it is being shut down, and the lease doesn't make much sense without the node object in the first place.
Either way, the consequence is that lease objects are created and never cleaned up.
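To make the failure mode concrete, here is a minimal sketch of how such leftover leases could be spotted. It assumes the leases and node names have already been fetched and are represented as plain dicts shaped like the Kubernetes object JSON (`metadata.name`, `metadata.ownerReferences`). The helper name `find_orphaned_leases` is hypothetical and is not part of kops or any client library.

```python
def find_orphaned_leases(leases, node_names):
    """Return the names of leases that no existing Node owns.

    leases: list of dicts shaped like Lease objects (assumed already fetched).
    node_names: set of names of Node objects that currently exist.
    """
    orphaned = []
    for lease in leases:
        meta = lease.get("metadata", {})
        owners = meta.get("ownerReferences") or []
        # A lease is healthy only if a live Node is listed as its owner.
        owned_by_live_node = any(
            ref.get("kind") == "Node" and ref.get("name") in node_names
            for ref in owners
        )
        if not owned_by_live_node:
            orphaned.append(meta.get("name"))
    return orphaned


leases = [
    # Lease created while the Node existed: ownerRef is set.
    {"metadata": {"name": "node-a",
                  "ownerReferences": [{"kind": "Node", "name": "node-a"}]}},
    # Lease recreated after the Node was deleted: no ownerRef.
    {"metadata": {"name": "node-b"}},
]
print(find_orphaned_leases(leases, {"node-a"}))  # ['node-b']
```

A lease without an ownerRef is never picked up by the garbage collector cascade, which is why it stays around until something removes it explicitly.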
There are a couple of ways this could be solved:
Change behavior in kubelet
cloudup or kops-controller cleans up lease objects of terminated instances
We stop deleting node objects and just let CCM clean them up. In general this is not a problem, as CCM will clean up nodes faster than new nodes are provisioned. There is a comment noting that GCP will recycle node names, but I am not sure whether we should make an exception for GCP or just leave it be.
I want to do the latter: let CCM discover that instances are terminated and delete the nodes.
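The reason the ordering matters can be shown with a toy model. This is only a sketch under simplifying assumptions: the `Cluster` class and both scenario functions are hypothetical, and they model the garbage collector cascade and kubelet's lease renewal in the simplest possible way rather than reproducing kubelet, kops, or CCM code.

```python
class Cluster:
    """Toy model of Node and Lease objects (hypothetical, not a real API)."""

    def __init__(self, node_names):
        self.nodes = set(node_names)
        # lease name -> owning node name, or None if no ownerRef was set
        self.leases = {name: name for name in node_names}

    def kubelet_renew_lease(self, name):
        # A still-running kubelet recreates a missing lease; it can only
        # set the owner if the Node object exists at that moment.
        if name not in self.leases:
            self.leases[name] = name if name in self.nodes else None

    def delete_node(self, name):
        self.nodes.discard(name)
        # Cascade: leases owned by the deleted Node are removed with it.
        self.leases = {k: owner for k, owner in self.leases.items()
                       if owner != name}

    def orphaned_leases(self):
        return sorted(k for k, owner in self.leases.items() if owner is None)


def delete_then_terminate():
    """Current behavior: Node deleted while kubelet is still running."""
    cluster = Cluster(["node-a"])
    cluster.delete_node("node-a")
    cluster.kubelet_renew_lease("node-a")  # kubelet is still alive
    return cluster.orphaned_leases()


def terminate_then_delete():
    """Proposed behavior: the Node is deleted only after the instance is gone."""
    cluster = Cluster(["node-a"])
    # Instance is terminated first, so kubelet never renews again.
    cluster.delete_node("node-a")
    return cluster.orphaned_leases()


print(delete_then_terminate())  # ['node-a']
print(terminate_then_delete())  # []
```

In the second scenario nothing is left running that could recreate the lease, so the cascade removes it for good.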
/kind bug
/kind office-hours