[core] Better error message for task/actors when unschedulable (integrate with autoscaler) #15933
Comments
I am commenting here because the related issues seem to have been closed. If this should be on another issue, please let me know. I received the same blocking message as reported in #13905: "The actor or task cannot be scheduled right now." My use case is Population Based Training (PBT) replays: [https://docs.ray.io/en/master/tune/tutorials/tune-advanced-tutorial.html#replaying-a-pbt-run]. In my case, the PBT policy text file specifies 6 workers. As another user reported, if I manually change that value to 1, the replay runs as expected, albeit slowly, since it uses only one worker. The replay also runs if local mode is activated. FYI, I have hit the same issue when loading and running checkpoints not associated with PBT. A better error message would be useful, particularly if it described how to resolve the problem without giving up the benefits of multiple workers. I have code and sample PBT policy files that I would be happy to share if you feel they would be useful. Thanks. David Wilt |
@DLWCMD thanks a lot! After handling the first half of TODO, I will ping you if the error message will be useful for your case. |
Thank you. Let me know if I can assist.
David L. Wilt
(In reply to SangBin Cho's message of Monday, August 9, 2021, 1:55 PM.)
|
#15962 will also be done as part of this work. |
The problem
This error message is not very actionable or usable:
A few improvements are needed:
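
To make the discussion concrete, here is a minimal sketch of what a more actionable message could look like. It is not Ray's implementation: `explain_unschedulable` is a hypothetical helper, and the resource dictionaries are plain Python dicts (in a live Ray session, similar dicts could be obtained from `ray.cluster_resources()` and `ray.available_resources()`). The idea is to separate a request that can never fit the cluster from one that is only waiting for busy resources, and to say what the user can do in each case.

```python
def explain_unschedulable(demand, total, available):
    """Hypothetical helper: explain why a resource request is not scheduled.

    demand, total and available map resource names (e.g. "CPU") to amounts.
    Returns None if the request fits into the currently available resources.
    """
    # Case 1: the request exceeds what the whole cluster owns, so it can never run.
    infeasible = {k: v for k, v in demand.items() if v > total.get(k, 0)}
    if infeasible:
        details = ", ".join(
            f"{k}: requested {v}, cluster total {total.get(k, 0)}"
            for k, v in sorted(infeasible.items())
        )
        return (
            f"Infeasible: the request exceeds cluster capacity ({details}). "
            "Add nodes or lower the requested resources."
        )

    # Case 2: the cluster is large enough, but other work holds the resources.
    busy = {k: v for k, v in demand.items() if v > available.get(k, 0)}
    if busy:
        details = ", ".join(
            f"{k}: requested {v}, currently free {available.get(k, 0)}"
            for k, v in sorted(busy.items())
        )
        return (
            f"Pending: resources are in use ({details}). "
            "The request will be scheduled once running work releases them."
        )

    return None


# Assumed numbers for illustration only: 6 one-CPU workers on a 4-CPU machine.
print(explain_unschedulable({"CPU": 6}, {"CPU": 4}, {"CPU": 4}))
```

With the assumed numbers above, the message names the requested amount and the cluster total, which would point directly at the 6-worker setting described in the comments.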