Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add atomicModifyIORef_ and atomicModifyIORef'_ to 'Data.IORef' #101

Closed
chshersh opened this issue Oct 20, 2022 · 26 comments
Closed

Add atomicModifyIORef_ and atomicModifyIORef'_ to 'Data.IORef' #101

chshersh opened this issue Oct 20, 2022 · 26 comments
Labels
withdrawn Withdrawn by proposer

Comments

@chshersh
Copy link
Member

I propose to add the following functions to the Data.IORef module in base:

-- | Variant of 'atomicModifyIORef' which ignores the return value
atomicModifyIORef_ :: IO a -> (a -> a) -> IO ()
atomicModifyIORef_ ref f = atomicModifyIORef ref $ \a -> (f a, ())

-- | Variant of 'atomicModifyIORef'' which ignores the return value
atomicModifyIORef'_ :: IO a -> (a -> a) -> IO ()
atomicModifyIORef_ ref f = atomicModifyIORef' ref $ \a -> (f a, ())

The exact documentation with examples and more explanation will be added as a part of changes.

This functions are rather common and I personally use them much more often than the original atomicModifyIORef versions. Currenty, I need to write atomicModifyIORef' ref $ \a -> (myFunction a, ()) which is:

  1. Contains more syntactic noise.
  2. Requires me to check the type of atomicModifyIORef every single time because I always forget where is the return value: is it the first element of the tuple or the second one?

This was proposed before as a GHC proposal but was rejected because it should be a CLC proposal:

A few common libraries out there already implement this function:

Unfortunately, base itself already has functions with these names but different type in the GHC.IORef module:

atomicModifyIORef'_ :: IORef a -> (a -> a) -> IO (a, a) 
@nomeata
Copy link

nomeata commented Oct 20, 2022

I’m in support, I ran into this odd omission in the API just last month.

@Bodigrim
Copy link
Collaborator

I must admit that I don't really understand the idea behind existing API.

atomicModifyIORef :: IORef a -> (a -> (a, b)) -> IO b 

What's the expected use case when b is not ()?

@gbaz
Copy link

gbaz commented Oct 26, 2022

That's one of the most useful ones in my opinion! Imagine one has a map behind an IO ref. As one example, that function can be used to both insert into the map, and also return a bool indicating if there was a prior value in the map.

That said, I think the versions already in base are slightly more useful and general, although they might sometimes require discarding the result to avoid warnings.

@Bodigrim
Copy link
Collaborator

Thanks @gbaz, it's a pity that haddocks do not provide such example.

Would it be too bad to define atomicModifyIORef_' :: IO a -> (a -> a) -> IO () to avoid name clash with GHC.IORef.atomicModifyIORef'_?

@treeowl
Copy link

treeowl commented Oct 26, 2022

It's been a while since I wrote it, but the code used to implement atomicModifyIORef' has some extra power that I believe is still not exposed. I'd like to see that happen.

@Jashweii
Copy link

#101 (comment)

What's the expected use case when b is not ()?

If you did a prior read instead, the combined block is not atomic (there could have been a write in between the two).
I assume this avoids that.

@chshersh
Copy link
Member Author

I must admit that I don't really understand the idea behind existing API.

atomicModifyIORef :: IORef a -> (a -> (a, b)) -> IO b 

What's the expected use case when b is not ()?

I'm using the existing interface behind atomicModifyIORef in dr-cabal like this here:

command <- atomicModifyIORef' watchRef popAction

popAction :: [WatchAction] -> ([WatchAction], WorkerCommand)
popAction [] = ([], Wait)
popAction (x : xs) = case x of
    Start        -> (xs, Greeting)
    Consume l    -> (xs, WriteLine l)
    End path lns -> ([], Finish path lns)

But that's about the only time I used this default interface. More often I need the one that's being proposed here.

@tomjaguarpaw
Copy link
Member

I'm confused then why one would use atomicModifyIORef_/' rather than modifyIORef/'. If you're not both writing and reading then what about these operations is "atomic"?

@tomjaguarpaw
Copy link
Member

Oh, it's the modification itself that's not atomic. modifyIORef is

modifyIORef ref f = readIORef ref >>= writeIORef ref . f

🤯

@nomeata
Copy link

nomeata commented Oct 27, 2022

I wonder: could we simply make modifyIORef atomic? It seems a reasonable default (and I certainly expected it to be atomic when I first looked at that). What would we lose?

@treeowl
Copy link

treeowl commented Oct 27, 2022

I wonder: could we simply make modifyIORef atomic? It seems a reasonable default (and I certainly expected it to be atomic when I first looked at that). What would we lose?

That would have rather bad performance effects for code using an IORef for things other than inter-thread communication. Extra allocation for sure; maybe also more synchronization cost.

@Bodigrim
Copy link
Collaborator

I made a documentation MR for the current state of Data.IORef as it's full of pitfalls, especially on ARM: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9349/diffs Anyone up to review?

@tomjaguarpaw
Copy link
Member

Reviewed with suggestions. Thanks for doing that.

@treeowl
Copy link

treeowl commented Mar 15, 2023

Here's how you'd want to actually implement these, I believe:

atomicModifyIORef_ :: IO a -> (a -> a) -> IO ()
atomicModifyIORef_ (IORef (STRef ref)) f = IO $ \s ->
  case atomicModifyMutVar2# ref (MkSolo . f) s of
    (# s', _old, MkSolo _new #) -> (# s', () #)

atomicModifyIORef'_ :: IO a -> (a -> a) -> IO ()
atomicModifyIORef'_ (IORef (STRef ref)) f = IO $ \s ->
  case atomicModifyMutVar_# ref f s of
    (# s', _old, !_new #) -> (# s', () #)

But ... why throw away all the information? The following two functions should be (at least nearly) as convenient, just as efficient (assuming decent optimization), and can be used for extra things.

-- | Returns the old value from the IORef and the new one installed in it.
foobar :: IO a -> (a -> a) -> IO (a, a)
foobar (IORef (STRef ref)) f = IO $ \s ->
  case atomicModifyMutVar2# ref (MkSolo . f) s of
    (# s', old, MkSolo new #) -> (# s', (old, new) #)
{-# INLINE foobar  #-}

-- | Returns the old value from the IORef and the new one installed in it.
baz :: IO a -> (a -> a) -> IO (a, a)
baz (IORef (STRef ref)) f = IO $ \s ->
  case atomicModifyMutVar_# ref f s of
    (# s', old, !new #) -> (# s', (old, new) #)
{-# INLINE baz #-}

Note (2023-03-25): I updated the implementations of atomicModifyIORef'_ and baznow that I remembered that theatomicModifyMutVar_#` operation exists.

@mixphix
Copy link
Collaborator

mixphix commented Mar 15, 2023

So convenient I'd almost rather name the other ones the same with a _ on the end to indicate that they throw the values away. But that's a breaking change. And clashing naming idioms.

@treeowl
Copy link

treeowl commented Mar 15, 2023

@mixphix , I don't understand your comment. Could you be more specific about what you're comparing to what?

@mixphix
Copy link
Collaborator

mixphix commented Mar 15, 2023

It seems that the functions you've named foobar and baz are the more general functions and should bear the unadulterated "name" for these functions, whereas the ones that exist currently ignore information and would in other circumstances be indicated with a postfix _.

@mixphix
Copy link
Collaborator

mixphix commented Mar 15, 2023

Frankly I'm not a fan of any of the names in that module. I can't make heads or tails of how each one is supposed to relate to the details of what they do differently from each other. Perhaps such precise functions deserve precise (long) names?

@treeowl
Copy link

treeowl commented Mar 15, 2023

@mixphix The current ones indeed throw away information they should keep, but they take functions of a different type. The function currently named GHC.IORef.atomicModifyIORef2 is the improvement of atomicModifyIORef. I've proposed exposing that through Data.IORef in #138.

@treeowl
Copy link

treeowl commented Mar 15, 2023

Frankly I'm not a fan of any of the names in that module. I can't make heads or tails of how each one is supposed to relate to the details of what they do differently from each other. Perhaps such precise functions deserve precise (long) names?

There's a reason I use names that are utter nonsense; coming up with good ones is hard. I believe that the fact atomicModifyIORef' forces its "result" (the second component of the pair) after forcing the new IORef contents is almost certainly just an artifact of the way it used to be implemented. The strict pairish functions I want are

data LSP a b = LSP !a b
-- Just force the new IORef contents, not the extracted info
bloop :: IORef a -> (a -> LSP a b) -> IO (a, LSP a b)

data SP a b = SP !a !b
-- Force everything in whichever thread gets there first
bleep :: IORef a -> (a -> SP a b) -> IO (a, SP a b)

@Bodigrim
Copy link
Collaborator

Let's not diverge from the original proposal.

Unfortunately, base itself already has functions with these names but different type in the GHC.IORef module:

This is the main point of contention in my opinion. The names of atomicModifyIORef* operations are already terrible: there are atomicModifyIORef, atomicModifyIORef', atomicModifyIORef'_, atomicModifyIORef2, atomicModifyIORef2Lazy, atomicModifyIORefLazy_, atomicModifyIORefP, atomicModifyMutVar#, atomicModifyMutVar2#, atomicModifyMutVar_# - and adding two more, one of which clashes with an existent one, will not make it better.

I'm very sympathetic to @mixphix's idea (#138 (comment)) to introduce Data.IORef.Atomic with

write :: a -> IORef a -> IO ()
modify :: (a -> a) -> IORef a -> IO ()
modify' :: (a -> a) -> IORef a -> IO ()
update :: (a -> (a, b)) -> IORef a -> IO (a, (a, b))
update' :: (a -> (a, b)) -> IORef a -> IO (a, (a, b))

matching existent API of non-atomic operations in Data.IORef. Seems to be a clear improvement in terms of explorability, clarity and safety (if I imported Data.IORef.Atomic only, I cannot acidentally use a non-atomic call). Any opinions?

@chshersh
Copy link
Member Author

@Bodigrim I actually do like the idea of having a separate module Data.IORef.Atomic. This looks like a much better approach, and I'm happy to sunset my proposal in favour of this new one.

I don't have the capacity to steer this new proposal, but I'm more than happy to encourage and support anyone else who'll take it 🤗

@treeowl
Copy link

treeowl commented Mar 26, 2023

@Bodigrim There are reasons I'm "diverging from the proposal". I think we want a reasonably full set of the basic functionality offered by the primops. So adding less capable operations first and using up good names without discussing what to name the rest in the set seems like a bad idea. I have commented on several problems with @mixphix's suggestions in the other thread.

@Bodigrim
Copy link
Collaborator

@Bodigrim There are reasons I'm "diverging from the proposal". I think we want a reasonably full set of the basic functionality offered by the primops.

AFAIU the original proposal here, it had nothing to do with a full set of primops wrappers or intricate laziness/strictness properties. @chshersh just wanted a set of atomic helpers matching existent non-atomic helpers.

I mean, providing a full set of the basic primops functionality is a reasonable goal, just a different one from this proposal.

@tomjaguarpaw
Copy link
Member

I too like the idea of exposing the functions from a new module, making it harder to use the non-atomic ones by accident.

@chshersh
Copy link
Member Author

I'm closing this issue as I don't want to pursue the current design anyway.

@chshersh chshersh added the withdrawn Withdrawn by proposer label Mar 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
withdrawn Withdrawn by proposer
Projects
None yet
Development

No branches or pull requests

8 participants