support drain timeout #743
What should happen after the drainTimeout has elapsed? A pod not terminating may be by design, e.g. it's waiting for an IO operation with a long timeout.
it should force-delete the pod
Related to aws/karpenter-provider-aws#2391
@grosser, can you elaborate a bit on what can cause the node to get stuck? Do you have
broken pdbs
I wonder if we should just hardcode something like 10 minutes if eviction is returning a 500. Do you need to configure this value?
I don't think it's good to hardcode any timeout since by default the system should be safe for the workloads and fail the cluster rollout instead.
What would you set it to?
nothing in production and 10-15m in staging
The other wrinkle here is that we won't attempt (iirc, or perhaps it's just deprioritized) to deprovision a node that has a blocked PDB, so we'll need handling for both disruption and termination flows.
Would you want to treat a 500 (misconfigured) differently than a 429 (pdb unfulfilled)?
I'd treat them the same, either way:
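For context, a minimal sketch of what "treating them the same" could look like in a drain loop, using client-go's eviction helper. The drainTimeout plumbing and the force-delete fallback are assumptions for illustration, not Karpenter's actual termination code.

```go
package drainsketch

import (
	"context"
	"time"

	policyv1 "k8s.io/api/policy/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// evictOrForceDelete tries the eviction API and, once drainTimeout has elapsed,
// falls back to a zero-grace delete regardless of whether the blocker was a
// 429 (PDB currently unfulfilled) or a 500 (misconfigured/overlapping PDBs).
func evictOrForceDelete(ctx context.Context, c kubernetes.Interface, ns, name string, drainStarted time.Time, drainTimeout time.Duration) error {
	err := c.CoreV1().Pods(ns).EvictV1(ctx, &policyv1.Eviction{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: ns},
	})
	if err == nil {
		return nil // evicted normally
	}
	if apierrors.IsTooManyRequests(err) || apierrors.IsInternalError(err) {
		// Both statuses are treated identically: keep retrying until the
		// (optional) timeout is exceeded, then bypass the eviction API.
		if drainTimeout > 0 && time.Since(drainStarted) > drainTimeout {
			grace := int64(0)
			return c.CoreV1().Pods(ns).Delete(ctx, name, metav1.DeleteOptions{GracePeriodSeconds: &grace})
		}
	}
	return err // requeue and retry later
}
```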
Great feedback, thank you!
Related / extra credit: manage machine drains through a custom resource, similar to how the node maintenance operator does it. This could include reporting progress via
How does https://kubernetes.io/blog/2023/01/06/unhealthy-pod-eviction-policy-for-pdbs/ fit into your use-case? Does it solve it? It's behind a Beta feature gate in 1.27.
it solves it for completely unhealthy applications and should not have a big impact for regular operations
@jonathan-innis has anyone started working on this issue yet?
Is it possible to expose and implement a timeout in the finalizer? If nothing is set (the default), it never times out; but if it is set, it should be easy enough for the finalizer to simply annotate the machine/node with the time the drain started, and if now - annotation > timeout, force-kill the node/machine.
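A rough sketch of the annotation idea above, assuming a hypothetical `example.com/drain-started-at` annotation stamped by the finalizer on its first reconcile; the annotation key and helper are made up for illustration.

```go
package drainsketch

import (
	"time"

	corev1 "k8s.io/api/core/v1"
)

// drainStartedAnnotation is a hypothetical key the finalizer would write the
// first time it sees the node being deleted; it is not a real Karpenter annotation.
const drainStartedAnnotation = "example.com/drain-started-at"

// drainTimedOut reports whether the node's drain has exceeded the configured
// timeout. A zero timeout means "never force-kill", matching the proposed default.
func drainTimedOut(node *corev1.Node, timeout time.Duration, now time.Time) bool {
	if timeout == 0 {
		return false
	}
	raw, ok := node.Annotations[drainStartedAnnotation]
	if !ok {
		return false // finalizer hasn't stamped a start time yet
	}
	started, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return false // unparseable stamp; fail safe and keep waiting
	}
	return now.Sub(started) > timeout
}
```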
I found out recently that graceful deletion for custom resources (as in CRDs) doesn't really seem to be a thing.
Can you provide some details? I think you are correct, but shouldn't a finalizer on the CRD (machine) work equally well? Trying to understand what specific API you need from K8s.
Graceful deletion has an associated time; finalizers don't have any time stored in the API.
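For reference, a small sketch of the metav1.ObjectMeta fields involved: the API stores a grace period only for resources that support graceful deletion (such as Pods), while finalizers are bare strings with no time attached.

```go
package drainsketch

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// printDeletionState dumps the deletion-related metadata the API server stores.
// DeletionGracePeriodSeconds is only meaningful for resources that implement
// graceful deletion (e.g. Pods); a custom resource blocked by finalizers gets a
// DeletionTimestamp, but the finalizer entries themselves carry no deadline.
func printDeletionState(meta metav1.ObjectMeta) {
	fmt.Println("deletionTimestamp:", meta.DeletionTimestamp)
	fmt.Println("deletionGracePeriodSeconds:", meta.DeletionGracePeriodSeconds)
	fmt.Println("finalizers:", meta.Finalizers)
}
```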
Ah ya, that's unfortunate.
I think upstreaming that change would take a while. I wonder if there is a short- to mid-term solution? I believe we might just fork Karpenter core and add the ability to the finalizer to pick up the timeout from the ConfigMap, plus some kind of annotation on the machine marking the time of the deletion attempt.
We had talked about including a timeout here that would be inferred from the NodePool or the node itself to control the drain/eviction logic and forcefully evict once a drain/termination has reached the timeout. We'd essentially need to mock the graceful deletion ourselves, as I don't think there's any graceful deletion for nodes. We had a PR #466 that explored this, but haven't had the time to get to it.
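One way "mocking graceful deletion" could look, assuming a hypothetical terminationTimeout value carried on the NodePool: once the node's own deletionTimestamp plus that timeout has passed, the terminator would stop waiting on evictions and force the remaining pods out. The field name is invented here, not the API explored in PR #466.

```go
package drainsketch

import (
	"time"

	corev1 "k8s.io/api/core/v1"
)

// pastDrainDeadline mocks graceful deletion for a node: the deadline is the
// node's deletionTimestamp plus a NodePool-level timeout (hypothetical field).
// Past the deadline, the terminator would skip eviction and force-delete pods.
func pastDrainDeadline(node *corev1.Node, terminationTimeout time.Duration, now time.Time) bool {
	if node.DeletionTimestamp == nil || terminationTimeout == 0 {
		return false // deletion not requested, or no timeout configured
	}
	return now.After(node.DeletionTimestamp.Add(terminationTimeout))
}
```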
Just for reference, this is the field that supports this behavior in ClusterAPI.
In our case, we set this value to 24h. We've discussed lowering it to something much shorter once we have a better story for automated traffic failover between our clusters.
Should support for volume detach timeouts be opened as another issue? We use EBS CSI and find that some volumes refuse to detach due to timeouts that aren't handled well by the EBS CSI controller. We've opened several issues there, but these problems still occur after years of using EBS CSI. This blocks removal of nodes and can greatly delay the process of updating nodes in a cluster if left to manual intervention.
I think there is a general need/desire from cluster operators for more control over how node shutdown happens, from timeouts to how the drain itself happens. Wondering if we should prioritize #740 to expand how node termination happens and expose more control to users?
Supportive of a node drain timeout proposal. Anyone want to take this on?
I'm interested; I opened a PR with a design for this above.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
cc: @akestner
Tell us about your request
Sometimes nodes get stuck when a pod cannot be drained, so we have to alert ourselves and then manually kill it, which is not ideal.
It would be nice to set an optional drainTimeout so we can ensure nodes always die after x hours/days.
Are you currently working around this issue?
Helper pod that kills pods on stuck nodes so they finish draining.
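For illustration only, the requested knob might look like the following CRD field; the struct and field names are placeholders, not Karpenter's actual API.

```go
package drainsketch

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// DrainSettings sketches where an optional drainTimeout could live.
// Leaving it nil keeps today's behavior (wait forever); setting e.g. "10m" in
// staging or "24h" in production bounds how long a stuck pod can block a node.
type DrainSettings struct {
	// DrainTimeout is the maximum time to wait for pods to drain before the
	// node is forcefully terminated. Nil means never force-terminate.
	DrainTimeout *metav1.Duration `json:"drainTimeout,omitempty"`
}
```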