Closed
Labels
P1 (Issue that should be fixed within a few weeks), bug (Something that is supposed to be working; but isn't), usability
Description
Start an AWS cluster with 0 workers (m5.large, default YAML, idle timeout set to 1).
Then run this code:
import time

import ray

# Connect to the already running cluster.
ray.init(address="auto")

@ray.remote(num_cpus=2)
def f():
    time.sleep(60)

# Launch two tasks and wait for both to finish.
ray.get([f.remote() for _ in range(2)])
This results in 1 worker being spun up. When that worker becomes idle, the autoscaler terminates it, but the following warning is generated:
>>> 2021-02-16 04:52:40,168 WARNING worker.py:1034 -- The node with node id 610641aa77c977618fefc691d433d4606876e904 has been marked dead because the detector has missed too many heartbeats from it. This can happen when a raylet crashes unexpectedly or has lagging heartbeats.
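One way to check whether the node id in the warning belongs to the worker that the autoscaler just terminated is to list the cluster's nodes from the driver. The following is a minimal sketch, assuming each entry returned by `ray.nodes()` carries `NodeID` and `Alive` keys; `split_nodes_by_liveness` is a hypothetical helper written for this issue, not part of Ray.

```python
from typing import Dict, List


def split_nodes_by_liveness(nodes: List[dict]) -> Dict[str, List[str]]:
    """Group node ids by the "Alive" flag found in each node record."""
    groups: Dict[str, List[str]] = {"alive": [], "dead": []}
    for node in nodes:
        key = "alive" if node.get("Alive") else "dead"
        groups[key].append(node.get("NodeID", "<unknown>"))
    return groups


def main() -> None:
    # Imported here so the helper above can be used without a cluster.
    import ray

    ray.init(address="auto")
    # ray.nodes() returns one dict per node the cluster has known about.
    print(split_nodes_by_liveness(ray.nodes()))


if __name__ == "__main__":
    main()
```

If the id from the warning shows up under `dead` right after the scale-down, the warning most likely refers to the intentionally terminated worker rather than to a crashed raylet.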
- [ ] I have verified my script runs in a clean environment and reproduces the issue.
- [ ] I have verified the issue also occurs with the [latest wheels](https://docs.ray.io/en/master/installation.html).