[Bug]: cluster-autoscaler does not wait long enough for new server to become available #1278

Closed
UBaggeler opened this issue Mar 15, 2024 · 2 comments · Fixed by #1279
Labels
bug Something isn't working

Comments

@UBaggeler
Contributor

Description

Since midnight, our autoscaler has been failing to launch new nodes (scaling from 1 to 3). Creating the new servers seems to take longer than the default timeout of 5 minutes:

failed to create error: failed to start server hcloud-autoscaled-xyz error: timeout waiting for server hcloud-autoscaled-xyz

When this happens, cluster-autoscaler deletes the nodes and repeatedly tries to spin up new servers, without success.

Manually increasing the timeout (for example, to 15 minutes) by setting the environment variable HCLOUD_SERVER_CREATION_TIMEOUT on the autoscaler deployment resolves the issue.
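
For illustration, the manual workaround might look roughly like the fragment below when applied to the autoscaler Deployment. This is only a sketch: the container name is an assumption, the surrounding fields are trimmed, and the value is assumed to be interpreted as a number of minutes (matching the 5 minute default mentioned above).

```yaml
# Fragment of the cluster-autoscaler Deployment (other fields omitted).
# The container name below is an assumption, not taken from the template.
spec:
  template:
    spec:
      containers:
        - name: cluster-autoscaler
          env:
            # Assumed to be read as minutes; env values must be strings.
            - name: HCLOUD_SERVER_CREATION_TIMEOUT
              value: "15"
```

Any change made directly on the live Deployment like this would presumably be overwritten the next time the manifest is rendered from the template, which is why support in the template itself is needed.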

Unfortunately, the autoscaler.yaml.tpl template does not currently allow setting this environment variable.

Kube.tf file

n/a

Screenshots

No response

Platform

Linux

@zarevavasyl

Thanks!

@mysticaltech
Collaborator

@UBaggeler Your fix was just released as part of v2.13.4 🚀
