
MVar deadlock exceptions cause waitCatch to throw an exception #14

Closed
snoyberg opened this issue Jul 31, 2014 · 8 comments

@snoyberg
Contributor

I originally discovered this problem when working around an issue with hlint (so pinging @ndmitchell, who may be interested). Consider the following program:

import Control.Concurrent
import Control.Concurrent.Async
import Control.Concurrent.MVar
import Control.Exception

main :: IO ()
main = do
    -- Block 1: deadlock in the main thread; try catches the exception.
    e <- try $ newEmptyMVar >>= takeMVar
    print (e :: Either SomeException Int)

    -- Block 2: the same thing in a child thread; again the exception is caught.
    _ <- forkIO $ do
        e <- try $ newEmptyMVar >>= takeMVar
        print ("thread", e :: Either SomeException Int)
    threadDelay 1000000

    -- Block 3: the same thing via async/waitCatch.
    putStrLn "calling async"
    x <- async $ newEmptyMVar >>= takeMVar
    putStrLn "calling waitCatch"
    y <- waitCatch x
    putStrLn "completed waitCatch"
    print (y :: Either SomeException Int)

    putStrLn "Exiting..."

There are three logical blocks in this program, each doing essentially the same thing: creating an MVar deadlock exception and trying to catch it. The first approach does it in the main program thread, and the second in a child thread. Both of them work as expected: the exception is caught and printed, and the program continues executing.

However, the third block behaves differently. The MVar exception does not appear to be caught by the call to try in asyncUsing, so the TMVar is never filled. As a result, the later call to waitCatch throws an exception and the program exits prematurely. I've seen the same behavior with asyncBound, so I do not believe this is related to rawForkIO.

Here's an example output from my system:

$ ghc --make -O2 -threaded deadlock.hs && ./deadlock 
[1 of 1] Compiling Main             ( deadlock.hs, deadlock.o )
Linking deadlock ...
Left thread blocked indefinitely in an MVar operation
("thread",Left thread blocked indefinitely in an MVar operation)
calling async
calling waitCatch
deadlock: thread blocked indefinitely in an STM transaction

I'm on Ubuntu 12.04 64-bit, using GHC 7.8.3, stm-2.4.3 and async-2.0.1.5.

@snoyberg
Contributor Author

snoyberg commented Aug 3, 2014

I reduced this to a bug reproducible with just base, so this does not appear to be a bug in async. I've opened a GHC trac item for this:

https://ghc.haskell.org/trac/ghc/ticket/9401

Feel free to close this issue if you feel it makes more sense to just follow this in GHC trac.

@simonmar
Owner

simonmar commented Aug 4, 2014

See the comment on https://ghc.haskell.org/trac/ghc/ticket/9401: this isn't really a bug, just a consequence of the way deadlock detection works.

If you want a particular thread to be immune to deadlock detection, you can use this trick:

  t <- myThreadId
  s <- newStablePtr t
  -- now I'm immune!
  freeStablePtr s
  -- now I'm vulnerable again!
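
A minimal packaging of that trick as a bracketed helper (a sketch, not code from this thread; the name withDeadlockImmunity is hypothetical and not from any library):

import Control.Concurrent
import Control.Exception (bracket)
import Foreign.StablePtr

-- Hypothetical helper: run an action with the calling thread's ThreadId
-- pinned behind a StablePtr, so the RTS deadlock detector ignores this
-- thread for the duration. While pinned, a blocked takeMVar or STM retry
-- simply stays blocked instead of receiving a BlockedIndefinitely* exception.
withDeadlockImmunity :: IO a -> IO a
withDeadlockImmunity act = do
    t <- myThreadId
    bracket (newStablePtr t) freeStablePtr (\_ -> act)

Whether pinning actually recovers from a given deadlock depends on which thread the detector would otherwise flag; as the follow-up below notes, in the enclosed-exceptions test suite this approach simply hung.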

@simonmar simonmar closed this as completed Aug 4, 2014
snoyberg added a commit to snoyberg/enclosed-exceptions that referenced this issue Aug 4, 2014
Pinging @simonmar. Thank you for your response in the tickets. However,
the StablePtr technique simply caused this test suite to hang
indefinitely. Instead, I'm trying to catch the BlockedIndefinitelyOnSTM
exception sent by the RTS, which seems to work reliably.

Pinging @sol. This change needs to be included in hspec as well. Without
a patch, for example, the enclosed-exceptions test suite still fails,
since hspec ends up letting the test die due to the
BlockedIndefinitelyOnSTM exception being sent to it as well. I can send
a pull request after this is reviewed/accepted into enclosed-exceptions.
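
As a sketch of the catch-and-retry approach described above, applied to the third block of the original example (this is only an illustration of the approach, not the enclosed-exceptions patch itself):

import Control.Concurrent
import Control.Concurrent.Async
import Control.Exception

main :: IO ()
main = do
    x <- async $ newEmptyMVar >>= takeMVar
    -- When the RTS flags the deadlock, the worker receives
    -- BlockedIndefinitelyOnMVar (turned into a Left by async's internal try)
    -- and this thread receives BlockedIndefinitelyOnSTM; retrying the wait
    -- then succeeds because the result TMVar has been filled.
    y <- waitCatch x `catch` \BlockedIndefinitelyOnSTM -> waitCatch x
    print (y :: Either SomeException Int)
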
@snoyberg
Contributor Author

snoyberg commented Aug 4, 2014

Thanks for the clarification. While the StablePtr trick didn't work for me, catching the BlockedIndefinitelyOnSTM exception does seem to have solved the problem. I've created pull request jcristovao/enclosed-exceptions#5.

@simonmar
Owner

simonmar commented Aug 4, 2014

Making a StablePtr will prevent the main thread from receiving deadlock exceptions, so if you need that exception to recover from a main thread deadlock then it will just hang instead. The StablePtr trick works for me on the example in https://ghc.haskell.org/trac/ghc/ticket/9401.
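
An illustrative sketch of that failure mode (not code from the ticket): if the main thread itself deadlocks while its ThreadId is pinned, no BlockedIndefinitelyOnMVar is ever delivered and the program just hangs.

import Control.Concurrent
import Foreign.StablePtr

main :: IO ()
main = do
    t <- myThreadId
    _ <- newStablePtr t   -- main thread is now invisible to deadlock detection
    v <- newEmptyMVar
    -- Without the StablePtr the RTS would deliver BlockedIndefinitelyOnMVar
    -- here; with it, this takeMVar blocks forever.
    () <- takeMVar v
    putStrLn "never reached"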

snoyberg added a commit to fpco/async that referenced this issue Aug 4, 2014
This uses the same technique I described in my enclosed-exceptions pull
request. I think it would be better to fix this in async itself, so that
other packages trying to use async for reliable exception handling get
the semantics they're looking for.
simonmar added a commit that referenced this issue Aug 4, 2014
Add workaround to waitCatch for #14
bitonic added a commit to fpco/async that referenced this issue Jul 12, 2016
`waitCatch` is used to have the fix for simonmar#14 also in `cancel`
@mitchellwrosen

Here is the fix implemented in async:

wait :: Async a -> IO a
wait = tryAgain . atomically . waitSTM
  where
    -- See: https://github.com/simonmar/async/issues/14
    tryAgain f = f `catch` \BlockedIndefinitelyOnSTM -> f

But my colleague @tstat pointed out that catch runs its handler (the RHS) inside an "implicit mask", as documented in Catching Exceptions (but curiously missing from the haddocks of catch itself):

The difference between using try and catch for recovery is that in catch the handler is inside an implicit mask (see "Asynchronous Exceptions") which is important when catching asynchronous exceptions, but when catching other kinds of exception it is unnecessary. Furthermore it is possible to accidentally stay inside the implicit mask by tail-calling rather than returning from the handler, which is why we recommend using try rather than catch for ordinary exception recovery.

For that reason, do we want to use try instead, so the RHS (the second call to f) isn't run in a MaskedInterruptible state?

wait :: Async a -> IO a
wait = tryAgain . atomically . waitSTM
  where
    -- See: https://github.com/simonmar/async/issues/14
    tryAgain f = try f >>= \case 
      Left BlockedIndefinitelyOnSTM -> f
      Right x -> pure x
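
A small, self-contained illustration of the distinction being discussed (a sketch; getMaskingState is from Control.Exception, and the IOException here just stands in for any synchronous exception):

{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception

main :: IO ()
main = do
    -- The handler passed to catch runs with asynchronous exceptions
    -- implicitly masked.
    throwIO (userError "boom") `catch` \(_ :: IOException) ->
        getMaskingState >>= print            -- prints MaskedInterruptible
    -- Code that recovers after try runs in the caller's original state.
    r <- try (throwIO (userError "boom"))
    case (r :: Either IOException ()) of
        Left _  -> getMaskingState >>= print -- prints Unmasked
        Right _ -> pure ()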

@simonmar
Owner

simonmar commented Sep 7, 2021

I don't think it actually matters, because the only thing that f does is block in STM, and mask will be a no-op since blocking is interruptible. try would be slightly less efficient.

@mitchellwrosen

@simonmar Thanks for the clarification. I'm still a little bit puzzled - could it not be the case that on the second invocation of

atomically (waitSTM x)

(with async exceptions masked due to catch), the STM action might not retry, and if not, any pending asynchronous exception would not be delivered until after the transaction commits?

A related question - if the transaction gets rolled back and retried due to an inconsistent view of memory, does that also count as an "interruptible" moment for the runtime to poll for an async exception?

Overall I agree there is not a very significant distinction here, but I am curious about the nitty-gritty timings :)

Thanks!

@simonmar
Owner

simonmar commented Nov 8, 2021

(with async exceptions masked due to catch), the STM action might not retry, and if not, any pending asynchronous exception would not be delivered until after the transaction commits?

That's true, but you never have any guarantees about when an asynchronous exception will be delivered anyway.

I mean, strictly speaking you're probably right. I'll accept a pull request to fix it.

A related question - if the transaction gets rolled back and retried due to an inconsistent view of memory, does that also count as an "interruptible" moment for the runtime to poll for an async exception?

I don't believe so.
