Cluster AutoScaler: scale up timeout nodes should not be removed by fixIncorrectNodeGroupSizes #6746

Open
xrmzju opened this issue Apr 23, 2024 · 0 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Which component are you using?:

cluster-autoscaler

What version of the component are you using?:

Component version: cluster-autoscaler-release-1.30

What k8s version are you using (kubectl version)?:

v1.26

What behaviour did you expect to see?

When the node group size is corrected after a scale-up timeout, the nodes that timed out should be the ones removed, not randomly chosen nodes.

What happened instead?:

Nodes are removed randomly after a scale-up timeout, without regard to which nodes actually timed out.

How to reproduce it (as minimally and precisely as possible):

  1. Initiate a scale-up request.
  2. The cloud instances are created successfully.
  3. A node registers successfully but stays in the 'NotReady' state (for example, because of a CNI or container runtime failure).
  4. The scale-up request is removed after it times out (refer to: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/clusterstate/clusterstate.go#L272), and an 'incorrectSize' status is observed.
  5. After MaxNodeProvisionTime elapses, cloud instances are removed during the fixNodeGroupSize reconciliation. However, the DecreaseTargetSize function deletes cloud instances at random instead of removing the newly created nodes.
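
To make the expected behaviour concrete, here is a minimal sketch of how the nodes to delete could be chosen explicitly instead of leaving the choice to a target-size decrease. Everything in it is hypothetical and simplified for illustration: the `instance` type, the `minutes` helper, and `selectTimedOut` are not part of cluster-autoscaler, and "timed out" is assumed to mean "not Ready and older than the max provision time". In the real code base the resulting nodes would presumably be passed to something like the node group's `DeleteNodes` method rather than shrinking the group with `DecreaseTargetSize`.

```go
package main

import (
	"sort"
	"time"
)

// instance is a hypothetical, simplified view of one cloud instance
// in a node group. It is NOT a cluster-autoscaler type.
type instance struct {
	Name  string
	Ready bool          // whether the Kubernetes node reports Ready
	Age   time.Duration // time elapsed since the instance was created
}

// minutes is a small helper for building durations.
func minutes(n int) time.Duration { return time.Duration(n) * time.Minute }

// selectTimedOut returns the names of at most `excess` instances to delete.
// Only instances that are not Ready and at least maxProvisionTime old are
// considered, so healthy nodes and nodes that are still booting are never
// picked. Candidates are ordered newest first, then by name, which keeps
// the result deterministic.
func selectTimedOut(group []instance, maxProvisionTime time.Duration, excess int) []string {
	if excess <= 0 {
		return nil
	}

	var candidates []instance
	for _, inst := range group {
		if !inst.Ready && inst.Age >= maxProvisionTime {
			candidates = append(candidates, inst)
		}
	}

	sort.Slice(candidates, func(i, j int) bool {
		if candidates[i].Age != candidates[j].Age {
			return candidates[i].Age < candidates[j].Age
		}
		return candidates[i].Name < candidates[j].Name
	})

	if len(candidates) > excess {
		candidates = candidates[:excess]
	}

	names := make([]string, 0, len(candidates))
	for _, c := range candidates {
		names = append(names, c.Name)
	}
	return names
}
```

With a group containing a long-running Ready node, a Ready node from the latest scale-up, an unready node that has exceeded the provision time, and a node that is still booting, only the unready, timed-out node is returned, even if the group is considered oversized by more than one instance.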

Anything else we need to know?:

xrmzju added the kind/bug label Apr 23, 2024