[BUG] Mars worker not recovered on ray master #3079

Closed
chaokunyang opened this issue May 24, 2022 · 0 comments · Fixed by #3080


Describe the bug
On Ray master, if an actor created with max_restarts=-1 is restarting, calling one of its methods raises an exception instead of pending in the caller, which breaks Ray worker failover in Mars.

To Reproduce
To help us reproduce this bug, please provide the information below:

  1. Your Python version: 3.7.9
  2. The version of Mars you use: master
  3. Versions of crucial packages, such as ray, numpy, scipy and pandas: ray master
  4. Full stack of the error.
    (stack trace attached as an image)
  5. Minimized code to reproduce the error.
import os
import time

import ray

@ray.remote(max_restarts=-1)
class A:
  def __init__(self):
    # A restarted instance exits right away, so it never finishes initializing.
    if ray.get_runtime_context().was_current_actor_reconstructed:
      os._exit(-1)
    print(ray.get_runtime_context().was_current_actor_reconstructed)
    time.sleep(3)

  def f(self):
    return 1

  def f1(self):
    time.sleep(30)

a = A.remote()
print(ray.get(a.f.remote()))
r = a.f1.remote()
# Kill the actor while f1 is in flight, but allow it to restart.
ray.kill(a, no_restart=False)
try:
  ray.get(r)
except Exception:
  pass
# On Ray master this call raises instead of pending in the caller.
print(ray.get(a.f.remote()))

Expected behavior
This can be fixed by specifying max_retries=-1 when calling actor methods.
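
To make the intended semantics concrete, here is a minimal pure-Python sketch of what "retry without limit" means for the caller. It does not use Ray and is not how Ray or Mars implements retries: `ActorRestartingError`, `call_with_retries`, and `FlakyActor` are hypothetical names made up for this illustration, and the assumption is only that -1 means "keep retrying until the actor answers".

```python
import time


class ActorRestartingError(Exception):
    """Hypothetical stand-in for the error seen while an actor restarts."""


def call_with_retries(method, *args, max_retries=-1, delay=0.0, **kwargs):
    """Call `method`, retrying while it raises ActorRestartingError.

    max_retries=-1 retries without limit; any other value re-raises the
    error once that many retries have been used up.
    """
    attempt = 0
    while True:
        try:
            return method(*args, **kwargs)
        except ActorRestartingError:
            if max_retries != -1 and attempt >= max_retries:
                raise
            attempt += 1
            time.sleep(delay)


class FlakyActor:
    """Fails the first `restarts` calls, then answers normally."""

    def __init__(self, restarts):
        self.remaining = restarts

    def f(self):
        if self.remaining > 0:
            self.remaining -= 1
            raise ActorRestartingError("actor is restarting")
        return 1


# With unlimited retries the caller eventually gets the result.
print(call_with_retries(FlakyActor(restarts=3).f, max_retries=-1))
```

With a finite limit such as max_retries=1, the same call would surface the error to the caller after the second failed attempt, which is the behavior this issue wants to avoid during failover.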

