The reason this happens is that we get the information about the "running" and "done" tasks from two different sources. We first consult the disco_server for the "running" task counts, and then we consult the job event handlers for the "done" task counts. If a task finishes in the small window between these two queries, it is counted both as a running task and as a done task, which produces the inconsistency.
One way to avoid this problem is to query the "done" tasks first and the running tasks second. In that case, a task that finishes between the two queries is counted in neither, so the inconsistency shows up as an extra "waiting" task, which is more acceptable.
Your explanation of the cause makes sense, but it seems to suggest this is purely a UI issue. Do you have any idea why, whenever we see this, the job always seems to hang indefinitely with a negative waiting count? No further progress is made, and nothing actually runs on the job once the count goes negative in the UI.
There was a bug in 0.5.2 with the same symptoms that caused the job to hang. Please upgrade to 0.5.3. If you still have this issue in 0.5.3, then it is a different issue and should be tracked and fixed separately.
This happens when we have a large number of workers (100 in this case) for each slave node.