[Core] Ray workers are not killed by SIGTERM #40182
Labels
bug: Something that is supposed to be working, but isn't
core: Issues that should be addressed in Ray Core
P0: Issues that should be fixed in short order
ray 2.8
release-blocker: P0 issue that blocks the release
What happened + What you expected to happen
It looks like Ray workers are not killed by SIGTERM. Several places in our code use SIGTERM to terminate worker processes, so those workers always end up being terminated ungracefully. As a result, they never run critical destructors, e.g., the ones that clean up child processes, which is important for ML workloads.
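To make "graceful" concrete, here is a minimal standard-library sketch of the behavior expected from a worker-like process: SIGTERM is converted into a normal Python exit so that cleanup code runs and child processes are terminated. This is not Ray's implementation; `run_worker`, `_raise_system_exit`, and the `work(children)` callback are hypothetical names used only for illustration, and the sketch assumes a POSIX system with the handler installed from the main thread.

```python
import signal
import subprocess
import sys


def _raise_system_exit(signum, frame):
    # Turn SIGTERM into a regular Python exit so that `finally` blocks
    # (and atexit hooks) get a chance to run before the process dies.
    raise SystemExit(128 + signum)


def run_worker(work):
    """Run work(children) and always terminate the children it registered.

    `work` is a hypothetical callback that appends any subprocess.Popen
    objects it starts to the `children` list it is given.
    """
    children = []
    previous = signal.signal(signal.SIGTERM, _raise_system_exit)
    try:
        work(children)
    finally:
        # Cleanup runs on normal return and when SIGTERM raised SystemExit.
        for child in children:
            if child.poll() is None:
                child.terminate()
            child.wait()
        signal.signal(signal.SIGTERM, previous)
```

With a handler like this, sending SIGTERM to the process leaves no orphaned children behind. Without one (or if the signal is ignored and the process is later force-killed), the `finally` block never runs.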
Versions / Dependencies
master
Reproduction script
Send SIGTERM to a Ray worker process and observe whether it exits.
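A possible way to script the step above, as a rough sketch rather than the exact script used for this report: start an actor, ask it for its PID, send SIGTERM to that PID, and check whether the process is still there after a short wait. The names `pid_is_alive`, `reproduce`, and `Idle`, as well as the 5-second wait, are assumptions made for illustration. The sketch is POSIX-only, and a worker that has exited but has not yet been reaped would still be reported as alive.

```python
import os
import signal
import time


def pid_is_alive(pid):
    """Return True if a process with this PID still exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def reproduce(wait_seconds=5.0):
    """Send SIGTERM to a Ray actor's worker; return True if it is still alive."""
    import ray  # imported lazily so pid_is_alive is usable without Ray installed

    @ray.remote
    class Idle:
        def pid(self):
            return os.getpid()

    ray.init()
    try:
        actor = Idle.remote()
        worker_pid = ray.get(actor.pid.remote())
        os.kill(worker_pid, signal.SIGTERM)
        time.sleep(wait_seconds)  # assumed grace period for the worker to exit
        return pid_is_alive(worker_pid)
    finally:
        ray.shutdown()
```

Calling `reproduce()` and getting `True` back would match the behavior described in this issue; `False` would mean the worker exited after SIGTERM.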
Issue Severity
None