
Nested mapTasks invocations can cause deadlock #4

Closed
jwiegley opened this issue Jan 19, 2015 · 5 comments · Fixed by #26

Comments

@jwiegley
Owner

Pinging @spl

If mapTasks invokes mapTasks, deadlock can occur. Scenario:

  • mapTasks, which enqueues all of its jobs immediately in the execution graph, starts executing the first N tasks. It also waits until all tasks are finished before returning to the caller.
  • If one of those tasks calls mapTasks, it will enqueue M more tasks and wait for them to complete as well. However, the inner mapTasks can starve because no execution slots may ever free up: every slot is now held by a blocked job, each waiting on enqueued jobs that cannot start until a slot frees up (see the sketch after this list).
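
A minimal reproduction sketch of this scenario, assuming the Control.Concurrent.Async.Pool API (withTaskGroup and mapTasks): with two slots, the outer mapTasks occupies both, and each outer task then calls mapTasks again and blocks on inner tasks that can never be scheduled.

```haskell
import Control.Concurrent.Async.Pool (withTaskGroup, mapTasks)

main :: IO ()
main = withTaskGroup 2 $ \tg -> do
  -- Two outer tasks fill both slots; each then enqueues four inner tasks
  -- and waits for them, but no slot can free up to run the inner tasks.
  _ <- mapTasks tg
         [ mapTasks tg [ pure n | n <- [1 .. 4 :: Int] ]  -- inner call starves for slots
         | _ <- [1 .. 2 :: Int] ]
  putStrLn "not reached if the pool deadlocks"
```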

Solution: If we notice a call to wait in a thread whose threadId matches that of a running job, raise an exception indicating a possible deadlock.

We could either raise this exception always (which makes it easier for the developer to avoid the situation), or only raise it if there are no available slots at the moment wait is called.
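
A hedged sketch of that check; the TVar holding the ThreadIds of running jobs is an illustrative assumption, not an existing field of the library's TaskGroup:

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Concurrent.STM (TVar, readTVarIO)
import Control.Exception (ErrorCall (..), throwIO)
import Control.Monad (when)
import qualified Data.Set as Set

-- Before blocking, compare the caller's ThreadId against the ThreadIds of
-- currently running jobs and raise if they overlap.
guardWait :: TVar (Set.Set ThreadId) -> IO a -> IO a
guardWait runningJobs blockingWait = do
  me      <- myThreadId
  running <- readTVarIO runningJobs   -- hypothetical bookkeeping, for illustration
  when (me `Set.member` running) $
    throwIO (ErrorCall "wait called from inside a running task: possible deadlock")
  blockingWait
```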

@jwiegley
Owner Author

This issue is split off from #2.

@depp

depp commented Oct 19, 2016

Raising an exception when calling wait from a running job seems like a pessimal solution, since it makes it difficult to run a set of jobs when the dependency graph is not known ahead of time. Only raising when all slots are taken is even worse, since it means the error could happen nondeterministically.

My expectation is that any finite poset of jobs should eventually complete. This implies that if a worker runs a job that calls wait on another job in the same group, the worker should execute that job itself.
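
A self-contained toy model of that "execute it yourself" idea (not async-pool's implementation): pending jobs live in a shared queue, and a waiter keeps running pending jobs on its own thread until the job it cares about reports itself done.

```haskell
import Control.Concurrent.STM

type PendingQueue = TVar [IO ()]

-- Instead of blocking outright, the waiter drains pending jobs until the
-- supplied STM check says the awaited job has completed.
helpUntil :: PendingQueue -> STM Bool -> IO ()
helpUntil pending done = loop
  where
    loop = do
      next <- atomically $ do
        finished <- done
        if finished
          then pure Nothing                                        -- awaited job completed
          else do
            jobs <- readTVar pending
            case jobs of
              (j:rest) -> writeTVar pending rest >> pure (Just j)  -- take over a pending job
              []       -> retry                                    -- nothing runnable yet; block
      maybe (pure ()) (\job -> job >> loop) next
```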

@expipiplus1
Contributor

expipiplus1 commented May 14, 2020

@depp, unfortunately that's not always correct, as the waited-on Async may be bound to run on a particular capability, one which is different from the waiter's capability.

A correct solution would probably be more along the lines of temporarily increasing the number of concurrent tasks (or, equivalently, decreasing the number of running tasks) while the wait is in progress. This highlights a distinction not currently made in the library between

  • The number of tasks currently executing
  • The number of tasks which have started and not yet finished.

Because of this distinction, any solution would require some care.
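
A hedged sketch of the slot-releasing idea, modelling the pool's concurrency limit as a plain QSem (an assumption for illustration; async-pool does not expose such a semaphore):

```haskell
import Control.Concurrent.QSem (QSem, signalQSem, waitQSem)
import Control.Exception (bracket_)

-- A task about to block in wait hands its execution slot back for the
-- duration of the wait, so the tasks it depends on can be scheduled,
-- then takes a slot again before resuming its own work.
waitReleasingSlot :: QSem -> IO a -> IO a
waitReleasingSlot slots blockingWait =
  bracket_ (signalQSem slots)   -- give our slot back: executing count drops by one
           (waitQSem slots)     -- re-acquire a slot once the wait returns
           blockingWait
```

In this model the started-but-unfinished count is unchanged; only the currently-executing count drops while the wait blocks, which is exactly the distinction above.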

@l29ah
Contributor

l29ah commented Oct 3, 2022

This would require us to somehow figure out whether we're being run from one of the tasks; otherwise it would spawn one more thread than the caller requested.

l29ah added a commit to l29ah/async-pool that referenced this issue Oct 3, 2022
l29ah added a commit to l29ah/async-pool that referenced this issue Oct 8, 2022
a dumb fix not accounting for whether we're run from one of the tasks
that would spawn one more thread than requested by caller

fixes jwiegley#4
@expipiplus1
Contributor

thanks, @l29ah
