Add monadic traversal with accumulator to Data.Traversable #65

lykahb · 2022-05-17T17:18:24Z

I suggest adding new utility functions to Data.Traversable that allow monadic effects when traversing the data with accumulator:

mapAccumM :: (Monad m, Traversable t) => (a -> b -> m (a, c)) -> a -> t b -> m (a, t c)
forAccumM :: (Monad m, Traversable t) => a -> t b -> (a -> b -> m (a, c)) -> m (a, t c)

Cabal has an implementation of it at Distribution.Utils.MapAccum:

mapAccumM f s t = runStateM (traverse (\x -> StateM (\s' -> f s' x)) t) s

We can reuse this implementation.

Motivation

My use case is processing a list of entries with the cache of auxiliary data that needs expensive IO operations to compute. Cabal also uses mapAccumM for caching and discards the accumulator.

Alternative

It gets easier to use larger functions when a function is the last argument. A few times Cabal flips the arguments of mapAccumM or defines the processing function outside of the mapAccumM call. So, this may be more ergonomic:

forAccumM :: (Monad m, Traversable t) => a -> t b -> (a -> b -> m (a, c)) -> m (a, t c)

Impact

A search of "mapAccumM" on github yields nearly 200 results. After this change, a name clash would break many of those modules. However, this also shows that there is a need for the function.

The text was updated successfully, but these errors were encountered:

treeowl · 2022-05-17T17:57:44Z

If you change the argument order a bit, you'll see that mapAccumM is really just mapM in the StateM monad. I think it should use mapM rather than traverse because there is an (admittedly not huge) chance that the Traversable in question has a specially optimized mapM for "strict" monads. I also think the Functor and Applicative instances for StateM should use monadic implementations to avoid accumulating thunks to inspect pairs. All together:

import Control.Applicative
import Data.Traversable
import Control.Monad
import Data.Coerce

myMapAccumM
  :: (Monad m, Traversable t)
  => (a -> s -> m (s, b))
  -> t a -> s -> m (s, t b)
myMapAccumM f = runStateM #. mapM (StateM #. f)

newtype StateM s m a = StateM
  { runStateM :: s -> m (s, a) }

instance Monad m => Functor (StateM s m) where
  fmap = liftM

instance Monad m => Applicative (StateM s m) where
  pure a = StateM $ \s -> pure (s, a)
  liftA2 = liftM2
  (<*>) = ap
  m *> n = StateM $ \s -> runStateM m s >>= \(s', _) -> runStateM n s'

instance Monad m => Monad (StateM s m) where
  m >>= f = StateM $ \s -> runStateM m s >>= \(s', a) -> runStateM (f a) s'
  (>>) = (*>)

-- Stolen from profunctors
(#.) :: Coercible b c => p b c -> (a -> b) -> a -> c
(#.) _ = coerce
{-# INLINE (#.) #-}

Of course, you can define your mapAccumM in terms of myMapAccumM by just flipping arguments around. The argument order is probably mostly a matter of taste, but there might be some minor performance implications in some cases.

andreasabel · 2022-05-19T15:15:10Z

I think mapAccumL is already a sort of monster that can be avoided by using mapM with State.
So, why not just using StateT on your monad to get the accumulator?

Of course, your monad might already implementing MonadState. The lesson we learned in Agda development is to not make our main monad TCM instance of MonadState, MonadExcept, MonadReader etc. This way, we can stack additional effects like StateT on top of TCM whenever we need them temporarily.
https://github.com/agda/agda/blob/78986123103190e62ba175f85957c7432615285d/src/full/Agda/TypeChecking/Monad/Base.hs#L4207-L4212

tomjaguarpaw · 2022-05-19T15:20:51Z

Of course, your monad might already implementing MonadState

I don't think that matters, does it? Unless I'm missing something, naked MonadState operations will be handled by the StateT that's providing the "accumulating" functionality, and lift of MonadState operations will be delegated to the wrapped monad.

andreasabel · 2022-05-20T11:27:03Z

Indeed, MonadState does not lift through StateT, but always refers to the outermost StateT (https://hackage.haskell.org/package/mtl-2.3/docs/Control-Monad-State-Class.html#t:MonadState):

type My = State Integer

f :: Char -> StateT String My Int
f c = do
  modify (c:)
  lift $ modify succ
  return $ ord c

test :: (([Int], String), Integer)
test = mapM f "Hello, world!" `runStateT` "" `runState` 0

Bodigrim · 2022-05-30T21:05:16Z

@lykahb could you please prepare a GHC merge request, so that we have a specific design to vote on? Does it fit into Data.Traversable better than into Control.Monad?

lykahb · 2022-05-30T22:40:49Z

@Bodigrim Do you think it is better to split the voting into two parts? At first, vote whether we need mapAccumM/forAccumM, and then vote for implementation or discuss it at the merge request?

In my opinion, Data.Traversable is the most fitting module. Putting the functions there would follow the pattern for the functions mapM and forM, which also have both the Traversable and Monad constraints. They are defined in Data.Traversable and re-exported by the module Control.Monad. I do not have a strong opinion if they should be re-exported.

Bodigrim · 2022-05-30T22:43:24Z

Yes, we can have separate votes, but please put up a draft MR first, so that we know that there exists at least one working implementation of the proposal.

lykahb · 2022-05-31T20:53:14Z

Submitted a merge request https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8384.

treeowl · 2022-05-31T20:55:42Z

Your decision to use a lazy state transformer needs explanation/justification.

treeowl · 2022-05-31T21:14:46Z

To be more explicit, I think this should probably use a strict state transformer, like Control.Monad.Trans.State.Strict. The lazy one is only good for rather special purposes, and I think is quite likely to lead to memory leaks in this context.

lykahb · 2022-05-31T21:15:02Z

I believe that the lazy versions of the functions are the default. And there is a convention that the lazy functions have a plain name, and the strict ones have a ' suffix. The related module Data.Foldable has foldMap/foldMap', foldr/foldr'.

Do you suggest adding the strict versions or making the transformer strict by default?

As for the details of the implementation I think that it is better to re-use an existing implementation of the state monad, rather than create a new one. So, the definitions are copied from transformers library.

treeowl · 2022-05-31T21:28:15Z

The strict state monad transformer doesn't force the state; it only forces the pairs. It should be thought of as the "default" state monad transformer. The lazy version is a weird beast. It doesn't strictly obey the monad laws, and it tends to break intuition about what gets calculated when. As a general rule, it should only be used when its special behavior is specifically desired. It would be valuable to consider actual applications for mapAccumM. Which ones would benefit from a lazy state transformer? How counterintuitive will that laziness be in contexts where it's not needed?

lykahb · 2022-05-31T22:13:17Z

This is a convincing argument. The lazy variant seems to be more general, and it is the default in the transformers library. But for the use cases I've seen the laziness wouldn't be useful. Updated the MR to use the strict state transformer.

tomjaguarpaw · 2022-06-01T07:19:58Z

The lazy variant seems to be more general

I believe that it's the other way round: the strict variant is more general, in the sense that you can use the strict variant to implement a variant with the behaviour and space asymptotics of the lazy version, but not vice versa.

EDIT: Having thought about it for a bit I suspect that neither can be used to implement the other.

konsumlamm · 2022-06-01T15:39:14Z

Since Data.Traversable has mapAccumL/mapAccumR, would it make sense to also have left/right variants here?

lykahb · 2022-06-03T17:11:59Z

It seems to me that the functions mapAccumM/forAccumM are similar to mapM/forM/sequence/sequenceA and other traversals that have an Applicative or Monad constraint. They go from left to right. The right-to-left functions like foldr can be powerful tools when there is laziness. For the monadic traversals where the evaluation is done for every element, the right variant wouldn't have more expressive power. Edit: evaluation is not necessarily done for every element if the effects short-circuit.

If someone needs to traverse a structure right-to-left, they reverse the structure. I'd be curious to see the use cases for right-to-left monadic traversal in the wild. I couldn't find examples of it with search on hoogle and github. If it is done manually with recursion, it would be hard to find, though.

lykahb · 2022-06-22T16:35:16Z

I updated the implementation to address the feedback three weeks ago. The Merge Request introduces these two functions:

mapAccumM
  :: (Monad m, Traversable t)
  => (s -> a -> m (s, b)) -> s -> t a -> m (s, t b)

forAccumM
  :: (Monad m, Traversable t)
  => s -> t a -> (s -> a -> m (s, b)) -> m (s, t b)

Without feedback I do not have work to do for this proposal. Should this be put to vote?

Bodigrim · 2022-06-22T19:16:38Z

With regards to prior art and impact, Hackage Search reveals that mapAccumM was implemented (with small variations) in 19 packages. There is however no prior art at all for forAccumM.

@lykahb sorry for delay. I posted this proposal to Reddit, hopefully we can source a bit more feedback before putting this to vote.

why-not-try-calmer · 2022-06-23T08:53:09Z

With regards to prior art and impact, Hackage Search reveals that mapAccumM was implemented (with small variations) in 19 packages. There is however no prior art at all for forAccumM.

@lykahb sorry for delay. I posted this proposal to Reddit, hopefully we can source a bit more feedback before putting this to vote.

Please don't forget about https://discourse.haskell.org/. It's a great fit for such decisions.

benjamin-hodgson · 2022-06-24T00:15:21Z

A minor suggestion: It took me a moment to think about why the constraint needed to be Monad and not Applicative. Just wanted to suggest adding a note to the documentation explaining why that is. (Although it may be the sort of thing where the skills required to understand the docs are the same as the skills required to figure it out for yourself, idk.)

Bodigrim · 2022-06-25T13:20:43Z

Please don't forget about https://discourse.haskell.org/. It's a great fit for such decisions.

Feel free to post a link there, but please encourage to discuss the proposal here, not on Discourse.

Bodigrim · 2022-07-06T22:21:14Z

@lykahb happy to trigger a vote?

lykahb · 2022-07-06T22:54:17Z

@Bodigrim I updated the proposal to include both mapAccumM and forAccumM. This matches the implementation in MR. Happy to trigger a vote.

treeowl · 2022-07-07T07:25:41Z

As I commented on the MR, the Functor instance for StateT should have a Monad constraint. You can then write fmap = liftM (or inline that; whatever). Currently it introduces weird laziness.

Bodigrim · 2022-07-12T21:24:31Z

I agree that instance Functor StateT should be expressed via Monad, but it's unused anyway, so this is a tangential question. Let's get to the crux first.

Dear CLC members, the proposal is to expand Data.Traversable with

mapAccumM
  :: forall m t s a b. (Monad m, Traversable t)
  => (s -> a -> m (s, b))
  -> s -> t a -> m (s, t b)
mapAccumM f s t = coerce (mapM @t @(StateT s m) @a @b) (StateT #. flip f) t s

and

forAccumM
  :: (Monad m, Traversable t)
  => s -> t a -> (s -> a -> m (s, b)) -> m (s, t b)
forAccumM s t f = mapAccumM f s t

The full implementation is available at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8384/diffs

Please vote on these two additions separately. If you disagree with the proposal, please indicate whether some changes may sway your opinion (e. g., move to Control.Monad or somewhere else, rename, change constraints).

@tomjaguarpaw @chessai @mixphix @emilypi @cgibbard

+1 for mapAccumM, -1 for forAccumM for the reasons explained in #65 (comment): there are plenty or prior art for mapAccumM and none for forAccumM.

cgibbard · 2022-07-12T21:30:50Z

+1 for both. I think given the rest of the contents of the module, lacking a for-variant would seem out of place.

treeowl · 2022-07-12T21:35:22Z

I agree that instance Functor StateT should be expressed via Monad, but it's unused anyway, so this is a tangential question. Let's get to the crux first.

I agree that it's tangential, but your claim that it's not used is unjustified. mapM for the relevant container may use it.

mixphix · 2022-07-12T22:34:54Z

+1 for both!

lykahb · 2022-07-13T18:41:11Z

Updated the MR with fmap = liftM.

tomjaguarpaw · 2022-07-14T11:23:17Z

+1 for both

The prior art for mapAccumM is convincing (and there is additional prior art for the same function under the name mapAccumLM) and I don't think we should add the map version without the for version.

tomjaguarpaw · 2022-07-14T11:25:04Z

Even more tangential ...

but your claim that it's not used is unjustified. mapM for the relevant container may use it.

How? StateT and its functions are not exposed in the public API.

treeowl · 2022-07-14T12:57:36Z

@tomjaguarpaw traverse may (lawfully) use fmap, pure, <*>, and/or liftA2. mapM may additionally (lawfully) use >>=.

tomjaguarpaw · 2022-07-15T08:53:31Z

Ah, interesting. Of course, the point of the Monad interface is so that it can be used without direct access to the concrete operations and the underlying type. I think it's the fact that there are two interacting typeclasses (Monad and Traversable) that makes this unintuitive for me.

Bodigrim · 2022-07-15T19:55:35Z

@chessai @emilypi just a gentle reminder to vote.

chessai · 2022-07-15T20:34:53Z

+1 for both

Bodigrim · 2022-07-16T19:40:57Z

...which gives as 5 votes in favor of mapAccumM and 4 votes in favor of forAccumM, so both are approved. Thanks all.

Approved by Core Libraries Committee in haskell/core-libraries-committee#65 (comment)

chshersh · 2023-03-22T16:16:03Z

I'm trying to summarise the state of this proposal as part of my volunteering effort to track the progress of all approved CLC proposals.

Field	Value
Authors	@lykahb
Status	merged
`base` version	`4.18.0.0`
Merge Request (MR)	https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8384
Blocked by	nothing
CHANGELOG entry	present
Migration guide	not needed

Please, let me know if you find any mistakes 🙂

Bodigrim closed this as completed Jul 16, 2022

Bodigrim added the approved Approved by CLC vote label Jul 16, 2022

ghc-mirror-bot pushed a commit to ghc/ghc that referenced this issue Jul 19, 2022

Add mapAccumM, forAccumM to Data.Traversable

226aba8

Approved by Core Libraries Committee in haskell/core-libraries-committee#65 (comment)

ghc-mirror-bot pushed a commit to ghc/ghc that referenced this issue Jul 19, 2022

Add mapAccumM, forAccumM to Data.Traversable

4b98c5c

Approved by Core Libraries Committee in haskell/core-libraries-committee#65 (comment)

chshersh added the base-4.18 Implemented in base-4.18 (GHC 9.6) label Mar 22, 2023

Add monadic traversal with accumulator to Data.Traversable #65

Add monadic traversal with accumulator to Data.Traversable #65

Comments

lykahb commented May 17, 2022 • edited

Motivation

Alternative

Impact

treeowl commented May 17, 2022

andreasabel commented May 19, 2022

tomjaguarpaw commented May 19, 2022

andreasabel commented May 20, 2022

Bodigrim commented May 30, 2022

lykahb commented May 30, 2022

Bodigrim commented May 30, 2022

lykahb commented May 31, 2022

treeowl commented May 31, 2022

treeowl commented May 31, 2022

lykahb commented May 31, 2022

treeowl commented May 31, 2022

lykahb commented May 31, 2022

tomjaguarpaw commented Jun 1, 2022 • edited

konsumlamm commented Jun 1, 2022

lykahb commented Jun 3, 2022 • edited

lykahb commented Jun 22, 2022

Bodigrim commented Jun 22, 2022

why-not-try-calmer commented Jun 23, 2022

benjamin-hodgson commented Jun 24, 2022

Bodigrim commented Jun 25, 2022 • edited

Bodigrim commented Jul 6, 2022

lykahb commented Jul 6, 2022

treeowl commented Jul 7, 2022

Bodigrim commented Jul 12, 2022

cgibbard commented Jul 12, 2022

treeowl commented Jul 12, 2022

mixphix commented Jul 12, 2022

lykahb commented Jul 13, 2022

tomjaguarpaw commented Jul 14, 2022

tomjaguarpaw commented Jul 14, 2022

treeowl commented Jul 14, 2022

tomjaguarpaw commented Jul 15, 2022

Bodigrim commented Jul 15, 2022

chessai commented Jul 15, 2022

Bodigrim commented Jul 16, 2022

chshersh commented Mar 22, 2023

lykahb commented May 17, 2022 •

edited

tomjaguarpaw commented Jun 1, 2022 •

edited

lykahb commented Jun 3, 2022 •

edited

Bodigrim commented Jun 25, 2022 •

edited