
Cleanup pending pods on scale down #817

Open
BitTheByte opened this issue Sep 16, 2023 · 2 comments

Comments


BitTheByte commented Sep 16, 2023

Currently the operator retires workers through the scheduler's HTTP or RPC APIs, but those only control workers that have already connected to Dask. The operator should also take into account Kubernetes worker pods that are still in the Pending phase: they trigger a pointless Kubernetes cluster scale-up, then connect to Dask only to be retired. A scale down should therefore retire active workers and also prevent Pending pods from ever reaching the Running state.

jacobtomlinson (Member) commented

Agreed.

We could add a check here for any Pods that aren't in a Running phase and delete those before calling retire_workers (if that's even necessary any more).

```python
if workers_needed < 0:
    worker_ids = await retire_workers(
```
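A rough sketch of what that check could look like, written against the official kubernetes_asyncio client rather than whatever client the operator uses internally. The function name and label selector are assumptions for illustration only:

```python
from kubernetes_asyncio import client, config


async def delete_non_running_worker_pods(cluster_name: str, namespace: str) -> int:
    """Delete worker pods that are not Running so they never become
    workers that would immediately be retired again. Returns the count deleted."""
    # Inside the operator this would be in-cluster config; kubeconfig is used
    # here so the sketch also runs from a laptop.
    await config.load_kube_config()
    async with client.ApiClient() as api:
        v1 = client.CoreV1Api(api)
        pods = await v1.list_namespaced_pod(
            namespace,
            # Assumed labels; adjust to whatever the operator actually
            # applies to worker pods.
            label_selector=f"dask.org/cluster-name={cluster_name},dask.org/component=worker",
        )
        deleted = 0
        for pod in pods.items:
            if pod.status.phase != "Running":
                await v1.delete_namespaced_pod(pod.metadata.name, namespace)
                deleted += 1
        return deleted
```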


BitTheByte commented Sep 18, 2023

Looks good to me. We should also subtract the pending workers we delete from the number of workers passed to retire_workers.
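For example, building on the earlier sketch; here retire_workers is passed in as a stand-in for the operator's existing call, whose real signature may differ:

```python
from typing import Awaitable, Callable


async def scale_down(
    workers_needed: int,
    cluster_name: str,
    namespace: str,
    retire_workers: Callable[[int], Awaitable[list[str]]],
) -> list[str]:
    """Hypothetical wiring: delete non-Running pods first, then retire only
    as many connected workers as are still needed to reach the target."""
    if workers_needed >= 0:
        return []
    n_to_remove = -workers_needed
    deleted = await delete_non_running_worker_pods(cluster_name, namespace)
    # Pods deleted while Pending already count towards the requested reduction.
    remaining = n_to_remove - deleted
    if remaining > 0:
        return await retire_workers(remaining)
    return []
```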
