Skip to content
Permalink
Browse files

Merge #1453

1453: Add uninterruptibleMask wrapper to a takeMVar in ImmDB and VolDB  r=nfrisby a=nfrisby

Fixes #1452.

This PR prevents wraps the `takeMVar` calls mentioned in the Issue in the sledgehammer `uninterruptibleMask_`. I've scanned for other uses of the `STM` variables to ensure that they're never empty for very long (`uninterruptibleMask_` should not be used on calls that may block for "long" durations) and also added another couple `mask` calls to ensure that every `take` is always paired with a subsequent `put`.

I'm opening this as a Draft PR (edit: we're proceeding, see Issue 1464) because:

  * I want another developer to confirm that `uninterruptibleMask_` is the desired solution here. Maybe we can re-architect instead? A note: the immediately surrounding `bracket` starts with `takeMVar`, so the `uninterruptibleMask_` isn't really necessary for that one. But in the repros (currently only on my local PR 1419, sadly) there are more outer layers of `mask` (actually `bracket`, I think) that do other things before reaching this `takeMVar`.
  * The timing involved on this problem seems delicate enough that it's not obvious to me how to add a repro to the test suite. (Beyond being delicate, I'm unsure I can disentangle my current repros from PR 1419.) Maybe there's a promising way to add this to `test-storage` (with which I am not yet familiar).

@mrBliss @edsko, can you advise?

Co-authored-by: Nicolas Frisby <nick.frisby@iohk.io>
  • Loading branch information
iohk-bors and nfrisby committed Jan 14, 2020
2 parents 8154a9c + 245219a commit 56bc665a604e11aa504cac833ae7f8a6dedb9170
@@ -107,7 +107,7 @@ import Data.Functor (($>), (<&>))

import GHC.Stack (HasCallStack)

import Control.Monad.Class.MonadThrow (bracket, finally)
import Control.Monad.Class.MonadThrow (bracket, bracketOnError, finally)

import Ouroboros.Consensus.Block (IsEBB (..))
import Ouroboros.Consensus.Util (SomePair (..))
@@ -289,8 +289,8 @@ closeDBImpl ImmutableDBEnv {..} = releaseAll _dbRegistry `finally` do
case internalState of
-- Already closed
DbClosed _ -> do
traceWith _dbTracer $ DBAlreadyClosed
putMVar _dbInternalState internalState
traceWith _dbTracer $ DBAlreadyClosed
DbOpen OpenState {..} -> do
let !closedState = closedStateFromInternalState internalState
-- Close the database before doing the file-system operations so that
@@ -309,20 +309,16 @@ reopenImpl
=> ImmutableDBEnv m hash
-> ValidationPolicy
-> m ()
reopenImpl ImmutableDBEnv {..} valPol = do
internalState <- takeMVar _dbInternalState
case internalState of
reopenImpl ImmutableDBEnv {..} valPol = bracketOnError
(takeMVar _dbInternalState)
-- Important: put back the state when an error is thrown, otherwise we have
-- an empty TMVar.
(putMVar _dbInternalState) $ \internalState -> case internalState of
-- When still open,
DbOpen _ -> do
putMVar _dbInternalState internalState
throwUserError _dbErr OpenDBError
DbOpen _ -> throwUserError _dbErr OpenDBError

-- Closed, so we can try to reopen
DbClosed ClosedState {..} ->
-- Important: put back the state when an error is thrown, otherwise we
-- have an empty TMVar.
onException hasFsErr _dbErr
(putMVar _dbInternalState internalState) $ do
DbClosed ClosedState {..} -> do
let validateEnv = ValidateEnv
{ hasFS = _dbHasFS
, err = _dbErr
@@ -200,7 +200,8 @@ modifyOpenState ImmutableDBEnv { _dbHasFS = hasFS :: HasFS m h, .. } action = do
-- r)@).

open :: m (InternalState m hash h)
open = takeMVar _dbInternalState
-- TODO Is uninterruptibleMask_ absolutely necessary here?
open = uninterruptibleMask_ $ takeMVar _dbInternalState

close :: InternalState m hash h
-> ExitCase (Either ImmutableDBError (r, OpenState m hash h))
@@ -671,7 +671,8 @@ modifyState VolatileDBEnv{_dbHasFS = hasFS :: HasFS m h, ..} action = do
HasFS{..} = hasFS

open :: m (OpenOrClosed blockId h)
open = takeMVar _dbInternalState
-- TODO Is uninterruptibleMask_ absolutely necessary here?
open = uninterruptibleMask_ $ takeMVar _dbInternalState

close
:: OpenOrClosed blockId h

0 comments on commit 56bc665

Please sign in to comment.
You can’t perform that action at this time.