Description
Which component are you using?:
Cluster Autoscaler
area/cluster-autoscaler
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:
Describe the solution you'd like.:
I'd like to be able to manually mark a node as ready for termination, so that the cluster autoscaler can skip the logic it normally runs to decide whether that node can be removed. When operating with thousands of nodes and pods, node deletion seems to be very slow while the cluster autoscaler works out which nodes are removable.
Ideally this would be done by putting a given label on the node, but I'm open to other alternatives that make sense for the cluster autoscaler. The high-level goal is to reduce the time it takes to scale down from 2k nodes to 0.
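To make the request concrete, here is a rough sketch of the behavior I have in mind. It is an illustration only: the label key `example.com/ready-for-termination` is hypothetical and, as far as I know, the cluster autoscaler does not recognize any such label today. The helper names are also made up for this example. Nodes are represented as plain dicts shaped like the Kubernetes Node object.

```python
import json

# Hypothetical label key, used only for illustration.
# Cluster Autoscaler does not act on this label today.
REMOVABLE_LABEL = "example.com/ready-for-termination"


def build_node_label_patch(value="true"):
    """Return a JSON merge patch body that sets the proposed label on a node."""
    return {"metadata": {"labels": {REMOVABLE_LABEL: value}}}


def is_marked_removable(node):
    """True if the node dict carries the proposed label with value "true"."""
    labels = node.get("metadata", {}).get("labels") or {}
    return labels.get(REMOVABLE_LABEL) == "true"


def split_nodes(nodes):
    """Separate pre-marked nodes from nodes that still need the normal checks."""
    marked = [n for n in nodes if is_marked_removable(n)]
    unmarked = [n for n in nodes if not is_marked_removable(n)]
    return marked, unmarked


if __name__ == "__main__":
    nodes = [
        {"metadata": {"name": "node-a", "labels": {REMOVABLE_LABEL: "true"}}},
        {"metadata": {"name": "node-b", "labels": {}}},
    ]
    marked, unmarked = split_nodes(nodes)
    print(json.dumps(build_node_label_patch()))
    print([n["metadata"]["name"] for n in marked])
    print([n["metadata"]["name"] for n in unmarked])
```

The idea is that nodes in the `marked` list could go straight to deletion, while only the `unmarked` ones would go through the existing removability evaluation.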
Describe any alternative solutions you've considered.:
Additional context.:
I am operating a cluster where a service manually creates and deletes singleton pods, with no other k8s controller involved. Each of these pods is expected to run on its own node. I enforce this with pod anti-affinity (which is not ideal), and I'm open to alternatives here too if they would improve performance.
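For reference, the one-pod-per-node setup looks roughly like the sketch below. The pod name, image, and the `app: singleton-workload` label are placeholders rather than my real values. The anti-affinity rule uses the standard `requiredDuringSchedulingIgnoredDuringExecution` field with `kubernetes.io/hostname` as the topology key, so no two pods carrying the same label can land on the same node.

```python
import json


def build_singleton_pod(name, image, group_label="singleton-workload"):
    """Build a Pod manifest that refuses to share a node with its peers.

    The name, image, and label value are placeholders for illustration.
    """
    labels = {"app": group_label}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "affinity": {
                "podAntiAffinity": {
                    # Hard requirement: never schedule next to a pod
                    # with the same label on the same hostname.
                    "requiredDuringSchedulingIgnoredDuringExecution": [
                        {
                            "labelSelector": {"matchLabels": labels},
                            "topologyKey": "kubernetes.io/hostname",
                        }
                    ]
                }
            },
            "containers": [{"name": "main", "image": image}],
        },
    }


if __name__ == "__main__":
    pod = build_singleton_pod("worker-1", "registry.example.com/worker:latest")
    print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, which accepts JSON as well as YAML.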