[Feature] Finalizer to block deletion of RayCluster with running jobs #1740
Comments
Thoughts, @kevin85421?
The |
Btw, this pertains to deletion, not suspension.
@kevin85421 here's the use case I am thinking about:
Note that the finalizer would be optional; blocking deletion until jobs complete would not be the default behavior. I agree with your previous comment that we don't need to cover this for suspension.
We have a similar use case:
Search before asking
Description
I would like to introduce a finalizer that can be used with RayCluster to block deletion until all jobs in the Ray cluster are completed.
Use case
This feature would let you request deletion of a Ray cluster while jobs are still running. By querying the Ray head service, the finalizer would ensure that all jobs have completed before resources are cleaned up. This is handy when you want resources cleaned up automatically as soon as a long-running training job finishes, and it matters even more for larger jobs, where resources need to be released as soon as possible to save costs.
This can also be used as a safety measure to ensure RayClusters with running jobs can't be accidentally deleted.
While RayJob can be used for similar use-cases, it is not a viable option for longer-lived RayClusters that can accept multiple jobs before being deleted.
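To make the proposal concrete, here is a rough sketch of the decision the finalizer would make on each reconcile pass. It is only an illustration under several assumptions, not KubeRay code: the finalizer name is made up, `reconcile_deletion` and `has_active_jobs` are hypothetical helpers, and `list_jobs` stands in for whatever call queries the Ray head service (for example the Ray Jobs REST API) and returns one dict per job with a `status` field. The terminal statuses are the ones Ray reports for submitted jobs.

```python
from typing import Callable, Iterable, List

# Ray job statuses after which a job does no further work.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}

# Hypothetical finalizer name, used only for this illustration.
FINALIZER = "ray.io/block-deletion-until-jobs-complete"


def has_active_jobs(jobs: Iterable[dict]) -> bool:
    """Return True if any job is not in a terminal status.

    A job with a missing or unknown status is treated as active,
    so the cluster is kept rather than deleted by mistake.
    """
    return any(job.get("status") not in TERMINAL_STATUSES for job in jobs)


def reconcile_deletion(
    finalizers: List[str],
    deletion_requested: bool,
    list_jobs: Callable[[], Iterable[dict]],
) -> List[str]:
    """Return the finalizer list the RayCluster should have after this pass.

    list_jobs is an assumed callback that queries the Ray head service.
    """
    # Nothing to do unless deletion was requested and the user opted in.
    if not deletion_requested or FINALIZER not in finalizers:
        return list(finalizers)
    # Jobs still running: keep the finalizer so deletion stays blocked.
    if has_active_jobs(list_jobs()):
        return list(finalizers)
    # All jobs finished: drop the finalizer so cleanup can proceed.
    return [f for f in finalizers if f != FINALIZER]
```

Because the finalizer is opt-in, a RayCluster without it would be deleted exactly as today. A real controller would also need to decide what to do when the head service is unreachable, which this sketch leaves out.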
Related issues
No response
Are you willing to submit a PR?