Node controller to not force delete pods #35235
Conversation
Force-pushed from 6b7c2b6 to 2c48694
@foxish why did we abandon this feature? I think force-deleting pods may be useful in some situations. |
@AlexMioMio, please see the discussion here #35145 and #34160, and feel free to comment on it. This is still a WIP. |
We are not abandoning this feature; we are making it the administrator's responsibility to decide when to force delete. The node controller does not know whether a partitioned machine is coming back or not, so if it deletes those pods it can cause split brain in the cluster for PetSets. |
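The split-brain risk above can be illustrated with a small sketch (not Kubernetes code; the pod records and identities here are hypothetical): if the controller force-deletes a pod from a partitioned node and a replacement is scheduled, the old kubelet may still be running the original container, so two pods with the same PetSet identity run at once.

```python
# Illustrative sketch of the split-brain hazard, not actual controller code.

def running_copies(pods):
    """Count running pods per PetSet identity."""
    counts = {}
    for pod in pods:
        if pod["phase"] == "Running":
            counts[pod["identity"]] = counts.get(pod["identity"], 0) + 1
    return counts

# A PetSet member "db-0" runs on node-1, which becomes unreachable (partitioned).
pods = [{"identity": "db-0", "node": "node-1", "phase": "Running"}]

# If the node controller force-deletes the pod record, a replacement "db-0"
# is created on node-2 -- but the partitioned kubelet may still be running
# the original container, so two "db-0" instances exist at the same time.
pods.append({"identity": "db-0", "node": "node-2", "phase": "Running"})

assert running_copies(pods)["db-0"] == 2  # split brain: two copies of one identity
```

Only an administrator (or cloud-provider code) knows whether the node is truly gone, which is why the deletion decision is pushed out of the node controller.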
With this change, if you want to delete a node (because it's no longer running), delete the node and the pods will be cleaned up. If you want to delete a pod, run |
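The admin actions described above might look like the following (a hedged sketch; the node and pod names are placeholders, and exact flags vary by kubectl version):

```shell
# Delete a node that is no longer running; its pods are then cleaned up:
kubectl delete node <node-name>

# Force-delete a single pod immediately, skipping the grace period:
kubectl delete pod <pod-name> --grace-period=0 --force
```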
Force-pushed from 2c48694 to 84e1676
Force-pushed from 02f5c1c to fa74c29
Waiting for #35581 to merge before rebase. The other parts are now ready for review. |
Force-pushed from fa74c29 to 92f4f21
Force-pushed from e1a15b3 to cf9c787
Force-pushed from cf9c787 to 7194101
Addressed comments by @erictune, will reapply LGTM after tests pass. |
Automatic merge from submit-queue |
I understand the request came from #35145 and #34160, but we get side effects because of this change. We run Kubernetes in an AWS Auto Scaling group, so nodes keep scaling up and down. If the terminated nodes keep NotReady status, a lot of DaemonSet pods are stuck in Pending status. If we start to disable this feature |
That violates the safety guarantees of the cluster - if those nodes aren't coming back, you should delete the node. Can you describe exactly why the nodes are in not ready state? That seems like a flaw in the autoscaler or the cloud provider - if the instance is deleted in AWS, the cloud provider should be removing the instance from the API (the node should be deleted) which should clear the pods off the node, regardless of readiness. |
Thanks, @smarterclayton. We set up Kubernetes masters and nodes both with AWS Auto Scaling groups (ASG). The version is 1.4.6. When a node is scaled down by the ASG in AWS, the node has been terminated, but it stays in the node list (kubectl get nodes) as NotReady forever. My colleague runs another Kubernetes stack which is only v1.2; after nodes are terminated by the ASG in that environment, they disappear from the node list in a very short time. I tried to fix this issue but failed. I think the problem may be related to this pull request. |
I think that's the issue here. The nodes should be kept in sync with the underlying infrastructure always. The node object should be removed by either the cloud-provider specific code, or be deleted by an external loop. It shouldn't stick around indefinitely as "NotReady". In your case, I'd expect the cloud provider specific code to figure out that the instance is gone and remove the node object. |
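The "external loop" described above can be sketched as a simple reconciliation pass (hypothetical helper names, not the actual node controller): any Node object registered in the API but no longer present in the cloud provider should be deleted rather than left NotReady.

```python
# Sketch of a node-reconciliation loop: delete API Node objects whose
# backing cloud instance no longer exists. Names are illustrative.

def reconcile_nodes(api_nodes, cloud_instances):
    """Return the nodes to delete: registered in the API but gone from the cloud."""
    alive = set(cloud_instances)
    return [node for node in api_nodes if node not in alive]

api_nodes = ["node-a", "node-b", "node-c"]   # what `kubectl get nodes` shows
cloud_instances = ["node-a", "node-c"]       # what the ASG still reports

# node-b was terminated by the ASG, so it should be removed from the API
# instead of lingering as NotReady.
assert reconcile_nodes(api_nodes, cloud_instances) == ["node-b"]
```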
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/node/nodecontroller.go#L563 is what is supposed to be firing when we detect that the node no longer exists in AWS. (From the quoted reply: the rationale for this change is also explained in https://kubernetes.io/docs/admin/node/#condition. /cc @justinsb for AWS.) |
I confirmed that the node object does get deleted when the instance is deleted in the case of GCE and GKE. So the fault here is likely in the AWS cloud-provider-specific code. |
I guess it is also possible that the instance is being stopped, but not deleted in this case. If that is the case, the cloud provider code may still find the instance, and the NC wouldn't delete it (not sure about this). Then the responsibility lies with the cluster admin to clean up such nodes. |
Maybe we should delete stopped Nodes as well? |
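The stopped-versus-deleted distinction discussed above can be sketched as a small decision function (the state names here are assumptions for illustration, not the actual cloud-provider API): the node controller only removes a Node when the provider reports the instance as not found; a merely stopped instance is still "found", so the Node lingers for the admin to clean up.

```python
# Illustrative sketch of the node-deletion decision, not actual provider code.

def node_action(instance_state):
    """Decide what the node controller does based on what the cloud reports."""
    if instance_state == "not-found":
        # Provider answers NotFound: the VM is gone, delete the Node object.
        return "delete-node"
    # "running" or "stopped": the instance still exists, so the Node stays;
    # cleaning up a stopped instance is the cluster admin's responsibility.
    return "keep-node"

assert node_action("not-found") == "delete-node"
assert node_action("stopped") == "keep-node"
assert node_action("running") == "keep-node"
```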
Thanks for the comments above. I am still not sure whether this is related to my environment only, because this PR was merged into version 1.5, while my Kubernetes version is only 1.4.6 and should have |
The only thing that was changed here was the handling of Pods on not-ready Nodes. The code that is supposed to remove Nodes when they are not present in the cloud provider was untouched (i.e. a Node should disappear when the corresponding VM was deleted and the cloud provider starts answering with "NotFound" when asked about it). We also moved the code that was responsible for deleting Pods from nonexistent Nodes, but that should have been a no-op. |
Fixes #35145
Release note:
This change is Reviewable.