Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generateM is broken for monads encoding nondeterminism #24

Closed
Shimuuar opened this issue Jun 8, 2018 · 7 comments
Closed

generateM is broken for monads encoding nondeterminism #24

Shimuuar opened this issue Jun 8, 2018 · 7 comments

Comments

@Shimuuar
Copy link

Shimuuar commented Jun 8, 2018

Illustration is quite simple:

λ> import qualified Data.Massiv.Array as A
λ> xs = (A.generateM A.Seq 3 (\i -> [0..i]) :: [A.Array A.U Int Int])
λ> xs
[(Array U Seq (3)
  [ 0,0,0 ]),(Array U Seq (3)
  [ 0,0,1 ]),(Array U Seq (3)
  [ 0,0,2 ]),(Array U Seq (3)
  [ 0,1,0 ]),(Array U Seq (3)
  [ 0,1,1 ]),(Array U Seq (3)
  [ 0,1,2 ])]
λ> xs
[(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ])]

As it could be seen all vectors share same buffer so it becomes overwritten. I thing only possible fix is to strengthen Monad constraint to PrimMonad and use it to thread state token.

@Shimuuar
Copy link
Author

Shimuuar commented Jun 9, 2018

Yes proposed fix seems to fix problem for list monad but it relies on particular order of list traversal and thus very fragile. So array updates and copying of buffer are interleaved just in correct way so for example with A.generateM A.Seq 3 (\i -> [0..1]) we'll have following sequence of evaluation:

buf[0] = 0
buf[1] = 0
buf[2] = 0
freeze!
buf[2] = 1
freeze!
buf[1] = 1
buf[2] = 0
freeze!
buf[2] = 1
freeze!
...

Thus buffer updates do not interfere with each other since they're done in stack-like manner. But should we change order of traversal of possible lists it's no longer the case! Should we take for example Omega monad fix will break. It's not proper monad since associativity only holds up to permutations of resulting list. Still I think massiv should break on such monads

λ> A.generateM A.Seq 3 (\i -> Omega [0..1]) :: Omega (A.Array A.P Int Int)
Omega {runOmega = [(Array P Seq (3)
  [ 0,0,0 ]),(Array P Seq (3)
  [ 1,0,0 ]),(Array P Seq (3)
  [ 1,0,0 ]),(Array P Seq (3)
  [ 1,1,0 ]),(Array P Seq (3)
  [ 1,1,0 ]),(Array P Seq (3)
  [ 1,1,1 ]),(Array P Seq (3)
  [ 1,1,1 ]),(Array P Seq (3)
  [ 1,1,1 ])]}

All in all it seems that only solution that works for any monad will require accumulation of results in some persistent data structure as opposed to ephemeral (aka mutable) buffer and corresponding performance hit. Another possibility is to constrain monad enough so we'll have only single sequence of events (IO, ST, StateT, Reader, Identity, what else?) However it's not clear to me how to attain that

@Shimuuar
Copy link
Author

I thought about this problem a bit more and this problem requires systematic treatment. Basically what we're doing is working in STT monad transformer:

newtype STT m a = STT (State# s  m (State# s, a))

In this formulation problem becomes much more obvious: in monads supporting nondeterminism state token could be reused which will cause problems with mutable state. So it's only safe to use this trick with monads where token will be used at most once: state, writer, reader, error. Question is how to express it

@lehins
Copy link
Owner

lehins commented Nov 12, 2018

I'll close this issue in favor of #52, since previous buggy implementation has been disabled and needs a complete rewrite.

@lehins lehins closed this as completed Nov 12, 2018
@lehins
Copy link
Owner

lehins commented Dec 13, 2018

I think I found a solution to this problem, but I am still not 100% confident. I posted a minimal example on stackoverflow: Is it safe to interleave manual realWorld# state passing with an arbitrary Monad. @Shimuuar, I'd really appreciate your input if you get a chance to take a look at it.

Regardless of the answer, doing mapM through a list (converting an array to a list, mapM over the list and convert it back to array) seems to result in decent performance thanks to fusion, so that we'll be a fallback implementation

@Shimuuar
Copy link
Author

I think it should work. Essentially you build a closure which will write into supplied buffer in the end. Very similar to expressing foldr via foldl. One question is performance since it's not too different from using lists as intermediate. You still build persistent data structure only in form of closures instead of lists

I said it should work but I'm not 100% sure. Maybe there's some really perverted monad where this approach will break

@lehins
Copy link
Owner

lehins commented Dec 15, 2018

That was precisely my thinking. I guess, we'll not know for 100% until we put it out into the wild and let the world find out, if there is such perverted monad (applicative) out there :)
As far as performance goes, I did benchmark it against using lists and it is about 2 times faster, but only for ghc-8.0 and 8.2, not the latest ones. Important part is that it is never slower.

@lehins
Copy link
Owner

lehins commented Dec 15, 2018

@Shimuuar Thanks for checking up on it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants