mapTasks_ does not work #2

Open
spl opened this issue Jan 13, 2015 · 19 comments

spl commented Jan 13, 2015

Sorry about the vague title. I haven't looked into the cause, but I thought it was significant enough to report.

I have the following code:

liftIO $ withTaskGroup 50 $ \tg -> mapTasks_ tg tasks

This was the first time I tried async-pool, so when I ran it, I was quite surprised that the tasks did not run. I swapped in mapTasks for mapTasks_:

liftIO $ withTaskGroup 50 $ \tg -> void $ mapTasks tg tasks

And the tasks ran just fine.

Since the problem seems to be pretty clearly due to mapTasks_, I hope it's an easy fix for you. Otherwise, if you need a test case or something, let me know.
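
For reference, here is a minimal, self-contained sketch of the two calls above (the putStrLn calls stand in for the real tasks, and the Control.Concurrent.Async.Pool import is assumed):

import Control.Concurrent.Async.Pool (withTaskGroup, mapTasks, mapTasks_)
import Control.Monad (void)

main :: IO ()
main = do
  let tasks = [putStrLn ("task " ++ show n) | n <- [1 .. 10 :: Int]]

  -- Reported not to produce any visible work:
  withTaskGroup 50 $ \tg -> mapTasks_ tg tasks

  -- Reported to run every task as expected:
  withTaskGroup 50 $ \tg -> void (mapTasks tg tasks)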

@jwiegley (Owner)

@spl Thanks for the report! Also, I see your point about nested mapTasks, although I'm not sure what to do there. async-pool is creating all the tasks immediately, but only spawning threads for a few jobs. But as you say, within those jobs is another mapTasks, which will indeed deadlock waiting for a free slot.

How is this for a solution: Each time we spawn a thread within a task group, we remember the threadId used for that job. Then, anytime a new job is enqueued for the same task group, and we see that it's being enqueued from within that threadId, and that we will be waiting on the result, we allow those jobs to additionally execute within that thread (at worst serially, but if slots become free they can run concurrently again).

I'm not sure exactly how this will work in the code, but we do need some way for the blocked worker thread to make progress in these situations. I'll create a test to reproduce the deadlock and then see what I can do.

Thanks for trying async-pool! If you think it's ready for prime-time, I was thinking of announcing it to Haskell-Cafe this weekend.

@jwiegley (Owner)

As for mapTasks_: The main difference here is that it does not wait before returning. Perhaps this is making it appear as though the tasks aren't running, because they haven't had a chance to yet? mapTasks does wait, which is perhaps too different of a semantic to justify simply dropping the _.

@jwiegley (Owner)

To further clarify the deadlock-resolving solution:

When calling wait on a job, and that job was enqueued within the current threadId, replace the job in the task group with a thunk waiting on a TMVar and immediately execute the corresponding IO action in that thread, putting to the TMVar on completion.

This way we can commandeer execution to avoid the delay, while not upsetting anyone depending on the result via the task group.
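
Roughly, in code (these helper names are invented for the sketch and are not part of async-pool):

import Control.Concurrent.STM

-- Run the job's action in the current (blocked) thread and publish the
-- result, so anything still waiting on the job through the task group can
-- read the same TMVar instead of waiting for a pool slot.
commandeer :: TMVar a -> IO a -> IO a
commandeer slot action = do
  result <- action
  atomically (putTMVar slot result)
  return result

-- The "thunk" left behind in place of the original job would then just be:
waitOnSlot :: TMVar a -> STM a
waitOnSlot = readTMVar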

spl commented Jan 17, 2015

@jwiegley Thanks for the response. I don't have time to answer fully right now, but there are a couple of things I wanted to mention.

I did see the code containing mapTasks_ complete, which makes sense if it simply doesn't wait. Each of the tasks forked by mapTasks_ performed database updates, and they were not waiting on anything other than free slots. I never saw those database updates. I'm not sure whether those threads were somehow killed or what.

I ran the new mapTasks-based code on a production migration last night. Unfortunately, something made it get stuck in the middle. No idea what it was.

@jwiegley (Owner)

@spl Ok, I really want to get to the bottom of this, so anything I can do to help, let me know.

I tried implementing the deadlock avoidance that I described above, but it is not possible: At the point where the deadlock happens, I don't have access to IO, only STM.

So what about this for an alternate solution: Waiting, within an active task, on a task spawned from that thread throws an exception, with a suggestion to use a secondary task group?
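
A sketch of what that check might look like (NestedWait, guardedWait, and the idea of tracking worker ThreadIds in a Set are all illustrative assumptions, not existing API):

import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (Exception, throwIO)
import qualified Data.Set as Set

data NestedWait = NestedWait
  deriving Show

instance Exception NestedWait

-- Refuse a wait issued from a thread that is itself running a task of the
-- same group, since that wait could otherwise block the group forever.
guardedWait :: Set.Set ThreadId -> IO a -> IO a
guardedWait workers realWait = do
  me <- myThreadId
  if me `Set.member` workers
    then throwIO NestedWait
    else realWait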

spl commented Jan 19, 2015

I feel I'm getting a bit lost. There are two independent issues, right?

  1. mapTasks_ semantics and implementation
  2. Nested use of a TaskGroup

To keep the discussion focused, I suggest creating a new GitHub issue for the second issue. I think it's much more difficult to resolve. At this point, I would add documentation that functions such as mapTasks create all tasks immediately, so you do not want to have one of those tasks using the same TaskGroup.
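
A hedged illustration of the pattern such a note would warn against (with a pool of one slot, the outer task holds the only slot while waiting on inner tasks that share the same TaskGroup and so can never be scheduled):

import Control.Concurrent.Async.Pool (TaskGroup, withTaskGroup, mapTasks)
import Control.Monad (void)

inner :: TaskGroup -> IO ()
inner tg = void (mapTasks tg [print n | n <- [1 .. 5 :: Int]])

main :: IO ()
main =
  withTaskGroup 1 $ \tg ->
    -- The inner call shares tg with the outer task that waits on it,
    -- so with a small pool this can block forever.
    void (mapTasks tg [inner tg])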

As for the first issue, were you able to reproduce my problem? If not, I'll try to make a small test case. It should be easy to do using my app+database setup, but I can try to do it without the database, perhaps with a threadDelay or something.

That said, time to respond to some of your comments:

As for mapTasks_: The main difference here is that it does not wait before returning. [...] mapTasks does wait, which is perhaps too different of a semantic to justify simply dropping the _.

I do think it is unexpected to have the _ version of a function be so radically different from the non-_ version. I glanced at the implementation of mapTasks_ when I ran into the problem. I was, admittedly, surprised to see that it wasn't mapTasks with void, but I didn't dig into it further. If it is the case that mapTasks_ will be so different, it should be documented in bold.

So what about this for an alternate solution: Waiting, within an active task, on a task spawned from that thread throws an exception, with a suggestion to use a secondary task group?

This sounds like some form of deadlock detection. How certain can you be that the exception doesn't get thrown for a “legitimate” wait?

@jwiegley (Owner)

My apologies for confusing this ticket. I'll continue discussing mapTasks_, but will open a new issue for the deadlock problem.

@jwiegley (Owner)

As for mapTasks_, a reproducible test case would be a great help. As for the semantic difference, I think we should just underscore it in bold in the documentation for now.

@jwiegley (Owner)

@spl After thinking about it a lot more, I think that mapTasks_ f xs should be equivalent to void $ mapTasks f xs. If one wishes to start a group of tasks without waiting for any result, they can use mapM_ async xs.
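
To spell that out, a sketch using the API names from this thread (the two helper names are made up for illustration):

import Control.Concurrent.Async.Pool (TaskGroup, async, mapTasks)
import Control.Monad (void)

-- Proposed meaning of mapTasks_: run everything, wait, discard the results.
mapTasksDiscard :: TaskGroup -> [IO a] -> IO ()
mapTasksDiscard tg tasks = void (mapTasks tg tasks)

-- Fire-and-forget alternative: submit the tasks and return immediately.
fireAndForget :: TaskGroup -> [IO a] -> IO ()
fireAndForget tg tasks = mapM_ (async tg) tasks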

spl commented Jan 24, 2015

That makes sense, and it would be good to document the async alternative.

Sorry I haven't gotten back to you with the test case. I've been rather busy lately. I've got a deadline next weekend, so I'm busy this weekend as well.

Do you still want the test case for mapTasks_?

@jwiegley (Owner)

@spl Whenever you have the time, I'd really like to know what's going on. Thanks!

clrnd commented Oct 3, 2015

This is a great library; I'd love it if this one got fixed. Maybe I can help somehow?

jwiegley commented Oct 3, 2015

@Alvare That would be great! If you could write some tests to demonstrate the exact problem, it shouldn't be too hard to find the issue and fix it.

Unrelatedly, Simon Marlow has pointed out a potential race condition that simply can't be fixed -- but we should have tests to demonstrate it, at the very least as documentation. For example, if you have a task group with 1 slot, and a task A that manually depends on B (using an MVar, instead of the dependency logic in async-pool), and A gets started first by the scheduler, we'll deadlock without detection.

The answer in this case is to make use of the dependency graph in async-pool itself, but it does raise the point that manually using blocking constructions can lead to undetected deadlocks. This doesn't happen with regular async, because every thread is allowed to start right away.
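
A hedged reconstruction of that scenario, for an eventual test (assuming A happens to be scheduled first; the program is expected to hang rather than detect the deadlock):

import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.Async.Pool (withTaskGroup, async, wait)

main :: IO ()
main = do
  sync <- newEmptyMVar
  withTaskGroup 1 $ \tg -> do
    a <- async tg (takeMVar sync)             -- A blocks on an MVar that only B fills
    _ <- async tg (putMVar sync (42 :: Int))  -- B never gets the single slot
    print =<< wait a                          -- expected to hang here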

clrnd commented Oct 4, 2015

@jwiegley well that's easy:

https://gist.github.com/alvare/67c5717affe6bba891f3

Replace mapTasks_ with mapTasks and it prints the numbers. mapTasks_ is just not waiting for its tasks to finish; add a simple threadDelay and you see the numbers get printed.
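
A minimal sketch along those lines (not the gist verbatim):

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async.Pool (withTaskGroup, mapTasks_)

main :: IO ()
main =
  withTaskGroup 4 $ \tg -> do
    mapTasks_ tg [print n | n <- [1 .. 10 :: Int]]
    threadDelay 1000000  -- keep the group alive; remove this and nothing is printed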

spl commented Oct 4, 2015

@Alvare Thanks for doing that.

@jwiegley I'm sorry I didn't get around to writing something up. I had moved on to other problems and never made time for this one.

dzhus commented May 9, 2016

Is this problem with mapConcurrently the same as the one being discussed?

λ> withTaskGroup 2 $ \g -> mapConcurrently g (const $ getCurrentTime) [1, 2, 3]
*** Exception: thread blocked indefinitely in an STM transaction

@neongreen

It looks like the latest version on GitHub doesn't have this bug anymore (at least the example above definitely works), but I'm not sure that the other functions aren't also affected.

@0xd34df00d

mapConcurrently works for me too on the current HEAD, but not on 0.9.0.2 from Hackage, with a more involved usage pattern. Perhaps worth a version bump?

jchia commented Jul 3, 2024

What's the status of the mapTasks_ issue? I have a test program (but not a 'test'): https://github.com/jchia/async-pool-test
The expected output is 12, but the actual output is 2. It reproduces every time for me.
