-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include FAILED and UPSTREAM_FAILED tasks in count_pending #1926
Conversation
@daveFNbuck, thanks for your PR! By analyzing the history of the files in this pull request, we identified @javrasya, @Tarrasch and @erikbern to be potential reviewers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach in that PR didn't really work well, as it missed cases where the root task is failed or where all leaf tasks are failed
Oh really? It makes sense wen you say it. I suppose it was quite a blunder from our side.
@@ -363,17 +363,19 @@ def prune(self, config): | |||
if self.last_active + config.worker_disconnect_delay < time.time(): | |||
return True | |||
|
|||
def get_pending_tasks(self, state): | |||
def get_pending_tasks(self, state, include_failed=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to instead implement a get_statuses([list of statuses])
? I've always found it a bit weird that get_pending_tasks
also returns running tasks. This could be an opportunity to fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll try to do that tomorrow.
@@ -1774,7 +1793,7 @@ def requires(self): | |||
|
|||
self.assertTrue(w3.run()) | |||
self.assertFalse(w2.run()) | |||
self.assertFalse(w1.run()) | |||
self.assertTrue(w1.run()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just briefly, what changed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually not sure what happened here. I was planning to look into it but forgot.
Nice refactoring. I think the code is more readable now! :) |
These functions are just special cases of more general functions to fetch tasks by status, so we just implement those instead.
59a709f
to
14a0d67
Compare
ping |
Thanks! |
No problem, just trying to clean up :) |
Description
Reverts #1789 and replaces it by counting FAILED tasks in count_pending
Motivation and Context
#1789 was meant to allow workers to stay alive in order to keep working on FAILED tasks after they become PENDING again. The approach in that PR didn't really work well, as it missed cases where the root task is failed or where all leaf tasks are failed. Instead, we just directly count FAILED tasks as part of the pending count, as well as counting UPSTREAM_FAILED tasks as PENDING. UPSTREAM_DISABLED tasks are still not counted, even if only one dependency is DISABLED.
Have you tested this? If so, how?
I have included unit tests.