The number of Running and Done tasks can get more than the total number of tasks #569

Open
pooya opened this issue Aug 5, 2014 · 3 comments

Member

pooya commented Aug 5, 2014

[Screenshot: disco_ui_issue]
This happens when we have a large number of workers (100 in this case) for each slave node.

@pooya pooya added the ux label Aug 5, 2014
@pooya pooya self-assigned this Aug 5, 2014
Member Author

pooya commented Aug 5, 2014

This happens because the "running" and "done" counts come from two different sources. We first query the disco_server for the "running" tasks and then query the job event handlers for the "done" tasks. Any task that finishes in the small window between the two queries is counted as both running and done, which produces the inconsistency.

One way to avoid this problem is to get the "done" tasks first and then the "running" tasks. In that case, a task that finishes between the two queries is counted in neither set, so it shows up as a "waiting" task, which is more acceptable.
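
To illustrate the ordering argument, here is a minimal Python sketch. It is a toy model, not Disco's code: `Cluster`, `counts_running_first`, and `counts_done_first` are hypothetical names, and "waiting" is assumed to be derived as total minus running minus done.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for the two sources of task state (hypothetical, not Disco's API)."""
    total: int
    running: set = field(default_factory=set)
    done: set = field(default_factory=set)

    def finish(self, task):
        # A task leaves the running set and enters the done set.
        self.running.discard(task)
        self.done.add(task)


def make_cluster():
    # Four tasks in total, all currently running, none done yet.
    return Cluster(total=4, running={0, 1, 2, 3})


def counts_running_first(cluster, finish_between):
    """Query order described as the cause: running first, then done."""
    running = len(cluster.running)
    for task in finish_between:      # tasks finishing inside the window
        cluster.finish(task)
    done = len(cluster.done)
    return running, done, cluster.total - running - done


def counts_done_first(cluster, finish_between):
    """Proposed order: done first, then running."""
    done = len(cluster.done)
    for task in finish_between:      # same window, same finishing tasks
        cluster.finish(task)
    running = len(cluster.running)
    return running, done, cluster.total - running - done


# Each tuple is (running, done, waiting); task 0 finishes between the two queries.
print(counts_running_first(make_cluster(), [0]))
print(counts_done_first(make_cluster(), [0]))
```

With running queried first, the finishing task is counted twice and the derived waiting count goes negative. With done queried first, the same task is counted in neither set and is reported as waiting, so the totals never exceed the number of tasks.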


aegray commented Oct 9, 2014

Your explanation of the cause makes sense, but it seems to suggest this is only a UI issue. Do you have any idea why, when we see this, the job always seems to hang indefinitely with a negative waiting count? No further progress is made, and nothing is actually run for the job once we see the count go negative in the UI.

Member Author

pooya commented Oct 9, 2014

There was a bug in 0.5.2 with the same symptoms that caused the job to hang. Please upgrade to 0.5.3. If you still have this issue in 0.5.3, then it is a different issue and should be tracked and fixed separately.
