sampling from distributions over lazy lists? #32

Closed · currymj opened this issue Mar 17, 2017 · 4 comments

currymj commented Mar 17, 2017

For educational purposes, I modified your HMM example to work as an even simpler Markov chain. I decided to really live the Haskell lifestyle and try to define a distribution over lazy infinite lists.

import Control.Monad (liftM2)

states :: [Int]
states = [-1, 0, 1]

-- Transition kernel: a distribution over the next state given the current one.
trans :: MonadDist m => Int -> m Int
trans (-1) = categorical $ zip states [0.1, 0.4, 0.5]
trans 0    = categorical $ zip states [0.2, 0.6, 0.2]
trans 1    = categorical $ zip states [0.15, 0.7, 0.15]

-- Advance a distribution over states by one transition.
step :: MonadDist m => m Int -> m Int
step startDist = startDist >>= trans

-- Uniform initial distribution.
start :: MonadDist m => m Int
start = uniformD states

-- A distribution over infinite state sequences, built corecursively.
mmlazy :: MonadDist m => m Int -> m [Int]
mmlazy startDist = liftM2 (:) startDist (mmlazy (step startDist))

Ideally, I'd be able to, say, run sampleIO $ fmap head (mmlazy start) and get a sample of just the first state, and likewise use take, etc. In practice, although this typechecks, the computation just hangs.

Perhaps there are good reasons why I shouldn't even be trying to do this -- in which case, is there any piece of the code, paper, etc. that I might want to be looking at to gain a better understanding of why it won't work?

adscib (Contributor) commented Mar 17, 2017

Indeed, it is very tempting to sample infinite data structures. There are no fundamental reasons why you shouldn't do it, but there are certain technical challenges involved that made me decide not to support it in monad-bayes at this point.

The reason sampleIO isn't lazy is that it modifies a random number generator using the IO monad. Sampling a random variable changes the state of the RNG, so its final state can't be computed until the whole list has been sampled, and IO forces the computation of that final state.

If we used the State monad instead of IO and didn't care about the final state of the RNG, we would be able to sample an infinite list lazily. Unfortunately, this approach doesn't compose, since we can't sample anything else after the infinite list: that would require knowing the final state of the RNG after sampling the whole list. For a truly lazy monadic random sampler we need to split the RNG in half at every bind, but this may not be very efficient, computationally or statistically. We used to have an implementation of this at some point, and you can still find it in src/Control/Monad/Bayes/Sampler/RandomFu.hs, but it's out of date, so it doesn't match the current typeclass setup in monad-bayes. I could update it, though.
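
To make the splitting idea concrete, here's a minimal standalone sketch of such a sampler, built on System.Random's splittable StdGen rather than anything in monad-bayes (Lazy and uniformD' are made-up names, not monad-bayes API):

import System.Random (StdGen, mkStdGen, randomR, split)

newtype Lazy a = Lazy { runLazy :: StdGen -> a }

instance Functor Lazy where
  fmap f (Lazy m) = Lazy (f . m)

instance Applicative Lazy where
  pure x = Lazy (const x)
  Lazy f <*> Lazy x = Lazy $ \g -> let (g1, g2) = split g in f g1 (x g2)

instance Monad Lazy where
  Lazy m >>= k = Lazy $ \g ->
    let (g1, g2) = split g          -- independent generators for each half
    in  runLazy (k (m g1)) g2

-- A uniform draw from a finite list, as an example primitive.
uniformD' :: [a] -> Lazy a
uniformD' xs = Lazy $ \g -> xs !! fst (randomR (0, length xs - 1) g)

-- Because neither half of a bind forces the other half's generator,
-- infinite lists can be consumed lazily, e.g. in GHCi:
--   take 5 (runLazy (sequence (repeat (uniformD' [-1, 0, 1 :: Int]))) (mkStdGen 42))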

The main problem with laziness, however, is in inference. We allow conditioning in arbitrary places in the program, and those conditions can have a global effect on the distribution the program defines, so we can't really do inference lazily: we need to make sure we find all the conditions. Since the main focus of monad-bayes is inference, I decided not to bother with lazy sampling at all.

Having said that, if you put restrictions on where conditions are allowed, you may be able to do inference on lazy data structures. This is what the original code did; you can still find it on the haskell2015 branch.

currymj (Author) commented Mar 17, 2017

Thanks! This is a very helpful explanation.

bredelings commented

Hi @adscib. I've been working on something that seems to intersect with your work here, and I'm curious if you have any thoughts on this.

I've been working on a similar paradigm to monad-bayes, and I ran into some of the problems that you mentioned above. For example, if you make the monad interpreter lazy, then it may never evaluate the observation statements, since their result is never used, among other problems.

In order to allow the interpreter to be lazy, I split the random monad into two parts:

  • Part 1 allows random sampling, but no observations. Part 1 is interpreted lazily.
  • Part 2 allows observations, and is allowed to call into Part 1 for random samples. Part 2 is interpreted strictly.

Perhaps this is what you mean by "putting restrictions on where conditions are allowed". The main annoyance here seems to be the extra verbosity needed to call into Part 1; a sketch of the split follows.
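
Concretely, here is a minimal sketch of the two layers in plain Haskell (Sample, Infer, random, and observe are illustrative names, matching the snippet below rather than monad-bayes or my actual code):

import System.Random (StdGen, randomR, split)

-- Part 1: pure sampling, interpreted lazily by splitting the generator at bind.
newtype Sample a = Sample { runSample :: StdGen -> a }

instance Functor Sample where
  fmap f (Sample m) = Sample (f . m)

instance Applicative Sample where
  pure x = Sample (const x)
  Sample f <*> Sample x = Sample $ \g -> let (g1, g2) = split g in f g1 (x g2)

instance Monad Sample where
  Sample m >>= k = Sample $ \g -> let (g1, g2) = split g in runSample (k (m g1)) g2

-- An example primitive for Part 1.
bernoulli :: Double -> Sample Bool
bernoulli p = Sample $ \g -> fst (randomR (0, 1 :: Double) g) < p

-- Part 2: adds observations and accumulates a log-weight, interpreted strictly.
newtype Infer a = Infer { runInfer :: StdGen -> (a, Double) }

instance Functor Infer where
  fmap f (Infer m) = Infer $ \g -> let (x, w) = m g in (f x, w)

instance Applicative Infer where
  pure x  = Infer $ \_ -> (x, 0)
  mf <*> mx = mf >>= \f -> fmap f mx

instance Monad Infer where
  Infer m >>= k = Infer $ \g ->
    let (g1, g2) = split g
        (x, w1)  = m g1
        (y, w2)  = runInfer (k x) g2
    in  w1 `seq` w2 `seq` (y, w1 + w2)   -- strict in the weights, so
                                         -- observations are always evaluated

-- The only bridge between the layers: Part 2 can call into Part 1,
-- which is where the extra verbosity comes from.
random :: Sample a -> Infer a
random s = Infer $ \g -> (runSample s g, 0)

-- Observing contributes a log-density term to the weight.
observe :: Double -> Infer ()
observe logDensity = Infer $ \_ -> ((), logDensity)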

I investigated using observations inside the random sampling part to generate conditioned samples, but I could not see how to make this work on contingent variables for MCMC. Perhaps I'm missing something...?

import Probability

model = do
  x <- sample $ normal 0.0 1.0
  ys <- sample $ list (repeat $ normal 0.0 1.0) -- a distribution over an infinite list of standard normals
  return $ (x*x):(take 10 ys)

main = do
  zs <- random $ model
  observe (normal (zs!!2) 1.0) 10.0
  return $ log_all [ zs %% "zs"]

reubenharry (Contributor) commented

LazyPPL has a lazy sampling monad. As discussed above, it interacts with the monad-bayes transformer stack only to a limited degree, but it will nevertheless be made available, with examples and documentation, in an upcoming release of monad-bayes.
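
For a flavour, here is the Markov chain from the top of the thread written against LazyPPL's lazy Prob monad (this assumes LazyPPL exports Prob and uniform :: Prob Double, as in its published examples; categorical' and chain are names invented here):

import LazyPPL (Prob, uniform)

states :: [Int]
states = [-1, 0, 1]

-- A categorical built from a single uniform draw by inverse-CDF lookup.
categorical' :: [(a, Double)] -> Prob a
categorical' xs = do
  u <- uniform
  let go _ [(x, _)]        = x   -- fall through to the last element
      go v ((x, p) : rest) = if v <= p then x else go (v - p) rest
      go _ []              = error "categorical': empty support"
  return (go u xs)

trans :: Int -> Prob Int
trans (-1) = categorical' $ zip states [0.1, 0.4, 0.5]
trans 0    = categorical' $ zip states [0.2, 0.6, 0.2]
trans _    = categorical' $ zip states [0.15, 0.7, 0.15]

-- An infinite trajectory from a given start state. Because Prob hands
-- independent lazy randomness to each bind, take n <$> chain s forces
-- only a finite prefix instead of hanging.
chain :: Int -> Prob [Int]
chain s = do
  next <- trans s
  rest <- chain next
  return (s : rest)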
