I recently observed that node lease objects stay around even after their nodes are deleted. It seems there is no reconcile logic for deleting orphaned node lease objects anywhere in the k8s control plane. This leads to stale objects piling up in etcd over time, and it gets worse if nodes are created and deleted at a high rate.
Wojtek pointed me to kubelet code here where we come close to identifying this situation, but no lease cleanup happens there either. In any case, we need some controller outside of kubelet to do this cleanup (as the kubelet can go poof at any time and the node be deleted). Maybe node-lifecycle-controller?
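To make the cleanup idea concrete, here is a minimal sketch of the orphan check such a controller might run. It is not taken from any existing controller: the function name `orphanedLeases` and the plain string slices are hypothetical stand-ins for listing Node and Lease objects, and it assumes a node's lease carries the same name as the node.

```go
package main

import (
	"fmt"
	"sort"
)

// orphanedLeases is a hypothetical helper: it returns the names of leases
// that have no Node with the same name. It assumes each node lease is named
// after its node.
func orphanedLeases(nodeNames, leaseNames []string) []string {
	// Build a set of existing node names for constant-time lookups.
	nodes := make(map[string]struct{}, len(nodeNames))
	for _, n := range nodeNames {
		nodes[n] = struct{}{}
	}

	var orphans []string
	for _, l := range leaseNames {
		if _, ok := nodes[l]; !ok {
			// No matching node: this lease is a cleanup candidate.
			orphans = append(orphans, l)
		}
	}
	sort.Strings(orphans) // deterministic order for logging and tests
	return orphans
}

func main() {
	nodes := []string{"node-a", "node-c"}
	leases := []string{"node-a", "node-b", "node-c", "node-d"}
	fmt.Println(orphanedLeases(nodes, leases)) // prints [node-b node-d]
}
```

A real controller would presumably also have to avoid racing with a kubelet that created its lease just before its Node object, which is the ordering question raised in the comments that follow.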
@wojtek-t - can you share thoughts you were suggesting in our offline discussion?
I think that at the kubelet level, we should introduce an invariant that the Lease object is created only after the Node object is created - i.e. by changing
That said, given that we may have leaked some Leases already, we need to extend nodelifecyclecontroller to:
Actually, I looked deeper into the code, and generally, even if we create the Lease without OwnerReferences, in the next update we will try to update it:
So the only situation when we will not try to update it is when going via the fallback path:
I'm not sure we really need to add the GC component, since the probability of leaking a Lease this way is really low.
I'm leaning towards adding a test that ensures OwnerReferences are set, and adding a condition that the Lease is always created with OwnerReferences.
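As a rough illustration of what such a test could assert, the sketch below checks that a lease carries an owner reference pointing at its node. The `OwnerReference` and `Lease` types here are simplified, hypothetical stand-ins for the real API types, and `hasNodeOwner` is an invented helper name, not something from the kubelet code.

```go
package main

import "fmt"

// OwnerReference is a simplified stand-in for the real API type (assumption).
type OwnerReference struct {
	Kind string
	Name string
	UID  string
}

// Lease is a simplified stand-in holding only the fields this check needs.
type Lease struct {
	Name            string
	OwnerReferences []OwnerReference
}

// hasNodeOwner reports whether the lease has an owner reference of kind
// "Node" with the given node name.
func hasNodeOwner(l Lease, nodeName string) bool {
	for _, ref := range l.OwnerReferences {
		if ref.Kind == "Node" && ref.Name == nodeName {
			return true
		}
	}
	return false
}

func main() {
	owned := Lease{
		Name:            "node-a",
		OwnerReferences: []OwnerReference{{Kind: "Node", Name: "node-a", UID: "uid-1"}},
	}
	leaked := Lease{Name: "node-b"} // created without OwnerReferences

	fmt.Println(hasNodeOwner(owned, "node-a"))  // prints true
	fmt.Println(hasNodeOwner(leaked, "node-b")) // prints false
}
```

The point of asserting this is that a lease which always references its Node as owner can be removed by the existing garbage collector when the Node goes away, so no separate cleanup component would be needed for newly created leases.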