Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weaken constraint on mapM #132

Open
andrewthad opened this issue Jul 28, 2016 · 13 comments
Open

Weaken constraint on mapM #132

andrewthad opened this issue Jul 28, 2016 · 13 comments

Comments

@andrewthad
Copy link
Contributor

The type signatures of all of the monadic mapping functions (mapM, imapM, etc) have a Monad constraint on m. However, this can be weakened to Applicative. I do not know if this causes degradation of performance for certain Monads though.

@RyanGlScott
Copy link
Member

I imagine using Monad here is intentional, since the mapM function from Traversable deliberately uses a Monad constraint. You're right that this could be weakened on base-4.8 or later, but for backwards compatibility reasons it'd be prudent to keep around variants that use Monad rather than Applicative, since there is still some old code floating around that declares Monad instances but not Applicative ones. Similarly for imapM, forM, and sequence.

That being said, Data.Traversable does contain Applicative counterparts to all of these functions (sequenceA vs. sequence, traverse vs. mapM, for vs. forM). It doesn't look like the vector library implements them directly (aside from this inefficient implementation), so it might be prudent to add them.

@Shimuuar
Copy link
Contributor

It would require changes in stream fusion framework. Currently streams use Monad for sequencing of effects

@andrewthad
Copy link
Contributor Author

Thanks for the feedback from both of you. It seems like weakening the constraints on the existing functions would be bad for multiple reasons. Would introducing something like itraverse, which I am under the impression would be incompatible with the stream fusion framework, be alright? That is, is there a rule in place that everything that operates on immutable vectors must have a streaming rewrite, or do some function lack this?

@cartazio
Copy link
Contributor

It might not be possible because of the current fusion rep. If you can
weaken the internals to get by with applicative it may be possible

On Thursday, July 28, 2016, Andrew Martin notifications@github.com wrote:

Thanks for the feedback from both of you. It seems like weakening the
constraints on the existing functions would be bad for multiple reasons.
Would introducing something like itraverse, which I am under the
impression would be incompatible with the stream fusion framework, be
alright? That is, is there a rule in place that everything that operates on
immutable vectors must have a streaming rewrite, or do some function lack
this?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#132 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAAQwp35xul8A7tVfmazMUY6meEOHwiKks5qaLEwgaJpZM4JXL0F
.

@andrewthad
Copy link
Contributor Author

@cartazio Sorry I wasn't clear. What I was asking is if it's alright to have a function that operates on immutable vectors that does not fuse.

@cartazio
Copy link
Contributor

Boxed vectors have traverse already right?
What would this applicative allow for the other formats? I'm a tad slow
this morning so please expound.

On Thursday, July 28, 2016, Andrew Martin notifications@github.com wrote:

@cartazio https://github.com/cartazio Sorry I wasn't clear. What I was
asking is if it's alright to have a function that operates on immutable
vectors that does not fuse.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#132 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAAQwroly50Zz64ZrGgS0ygzkISfkws0ks5qaLmvgaJpZM4JXL0F
.

@andrewthad
Copy link
Contributor Author

The original reason I brought this up was because the indexed traversal imapM has a Monad constraint. It would be nice to have itraverse, which would weaken that to Applicative.

@RyanGlScott
Copy link
Member

To be clear, I think there's two separate issues here:

  1. The current fusion representation is not amenable to implementing traverse :: Applicative m => (a -> m b) -> Vector a -> m (Vector b) (a.k.a. mapM with its Monad constraint relaxed to Applicative) natively, and as a consequence we can't implement itraverse natively either. To see why, consider the current definition of mapM in Data.Vector.Fusion.Streaming.Monadic:

    mapM :: Monad m => (a -> m b) -> Stream m a -> Stream m b
    mapM f (Stream step t) = Stream step' t
     where
       step' s = do
                   r <- step s
                   case r of
                     Yield x s' -> liftM  (`Yield` s') (f x)
                     Skip    s' -> return (Skip    s')
                     Done       -> return Done

    You'd need to somehow figure out how to make this Applicative. I can't see a way to do this (that doesn't involve changing the internals of Step/Stream somehow, which I don't feel qualified to do).

  2. vector doesn't export an standalone traverse/itraverse functions. To be honest, I don't see this as a problem, since the Traversable instance for Vector gives you an (inefficient) traverse, and it's straightforward to adopt that implementation for other Vectors. Similarly, you can get an (inefficient) itraverse via the TraversableWithKey Vector instance from vector-instances (TraversableWithKey comes from keys) or the TraversableWithIndex instance for Vector from lens.

@sjakobi
Copy link
Member

sjakobi commented Nov 29, 2019

Thanks for explaining the situation @RyanGlScott! :)

Are there any other array-like data structures that would offer an efficient itraverse? The TraverseWithKey instance for Array doesn't look so great either…

@lehins
Copy link
Contributor

lehins commented Nov 30, 2019

@sjakobi I am not sure why traverse doesn't get added to vector, I personally don't see anything preventing it from happening. Current implementation of mapM goes through a list, so the same can be done for traverse. Here is how mapM is implemented

mapM f = unstreamM . Bundle.mapM f . stream

unstreamM s = do
                xs <- MBundle.toList s
                return $ unstream $ Bundle.unsafeFromList (MBundle.size s) xs

So here is a traverse that does the same thing and has the same performance as mapM. Sure, it probably breaks fusion, but so does indexing into a vector.

traverseVector :: (Applicative f, Vector v1 a, Vector v2 b) => (a -> f b) -> v1 a -> f (v2 b)
traverseVector f v =
  unstream . Bundle.unsafeFromList (MBundle.size s) <$>
  Traversable.traverse f (unId (MBundle.toList s))
  where
    s = stream v

Side note - I'd submit a PR with this, but development process is pretty weird in vector, don't even know which branch to submit a PR to.

To answer your question

Are there any other array-like data structures that would offer an efficient itraverse?

That all being said:

  • you can't have a Traversable (or FoldableWithKey for that matter) instance, because it is defined forall elements, but elements in vector have restrictions on them (i.e. Unbox, Prim and Storable)
  • traversing is significantly (x50) slower with Applicative than in a PrimMonad which can mutate an array while doing traversing

@cartazio
Copy link
Contributor

cartazio commented Nov 30, 2019 via email

@lehins
Copy link
Contributor

lehins commented Nov 30, 2019

@cartazio If you wanna write it in a way such that it fuses similarly to how mapM does, then here is something that should work.

traverseVector f =
  (\s -> unstream . Bundle.unsafeFromList (MBundle.size s) <$>
  Traversable.traverse f (unId (MBundle.toList s))) . stream

As far as the input goes it should fuse (i.e stream . unstream rule), but for the output, I don't think any fusion is possible, that is why mapM output isn't fused either, I believe. Consider traversing with the function (a -> Maybe b), then you can't know if a new stream will be produced or will it be a Nothing until you applied that function to the last element of the input vector.

As far as primitive and massiv, the approach for traversal avoids any lists, but that doesn't make it faster or anything, it's just a different way of implementing it. I noted them just to answer the @sjakobi's question about other array libraries.

@cartazio
Copy link
Contributor

cartazio commented Nov 30, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants