In [17]:
:l Plotting.hs
:l ../src/Control/Monad/Bayes/Class.hs
:l ../src/Control/Monad/Bayes/Enumerator.hs
:l ../src/Control/Monad/Bayes/Sampler.hs

import Control.Monad
import Data.List
import Data.Ord
import Control.Arrow (first)
import Data.Text (pack, Text)


# Sampling

Before discussing inference, we should understand how to sample from models. This notebook explains how to do that.

We'll start with a very simple model, a single Bernoulli draw that is `True` with probability 0.7:

In [18]:
model :: MonadSample m => m Bool
model = bernoulli 0.7

To take a sample, do:

In [19]:
sampleIO model

True

Or with a fixed seed, so that the result is the same on every run:

In [20]:
sampleIOfixed model

True

To take multiple samples, you could rerun `sampleIO` many times. It is more in the spirit of probabilistic programming, though, to define a distribution over multiple draws from `model` and then sample from that distribution once:

In [21]:
multipleDraws :: MonadSample m => m [Bool]
multipleDraws = replicateM 10 model

draws <- sampleIO multipleDraws

draws

[False,True,False,True,True,True,True,False,False,True]

We can write a function to convert a list of samples to an empirical distribution, like so:

In [22]:

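-- Give every sample weight 1, merge equal values with `compact`,
-- normalize the weights to sum to 1, and render each value as Text for plotting.
toEmpirical :: (Show a, Fractional b, Ord a) => [a] -> [(Text, b)]
toEmpirical ls = fmap (first (pack . show)) $ normalizeWeights $ compact (zip ls (repeat 1))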

emp = toEmpirical draws

emp



[("False",0.4),("True",0.6)]
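To make the counting step concrete, here is a minimal, library-free sketch of the same idea. The name `empirical` is hypothetical and not part of monad-bayes; it only assumes that the goal is to count how often each distinct value occurs and divide by the number of samples, which is roughly what `compact` followed by `normalizeWeights` does here.

```haskell
import Data.List (group, sort)

-- Hypothetical helper (not from monad-bayes): the relative frequency
-- of each distinct value in a list of samples.
empirical :: Ord a => [a] -> [(a, Double)]
empirical xs =
  [ (x, fromIntegral (length g) / total)
  | g@(x : _) <- group (sort xs)  -- sorting puts equal values next to each other
  ]
  where
    total = fromIntegral (length xs)
```

Applied to the ten draws above, which contain four `False` and six `True` values, this gives the same proportions of 0.4 and 0.6.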

In [23]:
plotVega emp

In fact, we could lean even further into the spirit of probabilistic programming, and transform `model` into a distribution over plots, and sample from that:

In [24]:
distributionOverPlots :: MonadSample m => m VegaLiteLab -- the type of plots
distributionOverPlots = plotVega . toEmpirical <$> replicateM 10 model

sampleIO distributionOverPlots


Now for a continuous distribution, consider:

In [25]:
model2 :: MonadSample m => m Double
model2 = normal 0 1

Sampling works exactly as before:

In [26]:
sampleIO model2

-1.5870710686421936

And as before, to obtain multiple draws:

In [27]:
multipleDraws2 :: MonadSample m => m [Double]
multipleDraws2 = replicateM 10 model2

draws2 <- sampleIO multipleDraws2

draws2

[-1.185852793082832,0.316858019718148,-1.7798815770412686,1.3536879738771843,-0.9497122508672085,0.29585199966066783,-1.2748917173305718,1.7266318539752605,0.5264427621701333,1.7041225136233003]

We'd like to view a histogram of the samples, which in the limit of many samples should approach the PDF of a normal distribution. We could bin the list `draws2` directly, but it's nicer to do the binning inside the model: mapping a binning function over many draws from `model2` defines a distribution over histograms, from which we then sample once.

In [51]:
-- Replace each sample by the first component of the bin that `toBin` assigns it to.
toBins binWidth = fmap (fst . toBin binWidth)
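As an illustration of what binning does, the following sketch rounds each sample down to the lower edge of its bin. The names `binLowerEdge` and `binAll` are hypothetical stand-ins written for this example, and the library's `toBin` may be implemented differently.

```haskell
-- Hypothetical stand-in for the library's binning helper: the lower edge
-- of the bin of width w that contains x.
binLowerEdge :: Double -> Double -> Double
binLowerEdge w x = w * fromIntegral (floor (x / w) :: Integer)

-- Bin every sample in a container, mirroring the shape of toBins.
binAll :: Functor f => Double -> f Double -> f Double
binAll w = fmap (binLowerEdge w)
```

With a bin width of 0.5, for example, `binAll 0.5 [1.3, -0.2]` maps 1.3 to 1.0 and -0.2 to -0.5, so samples that fall in the same bin become equal and can be counted together.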

In [52]:
sampleIO $ plotVega . toEmpirical . toBins 0.05 <$> replicateM 100000 model2



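The same approach works for more complex models, such as a mixture of two normal distributions:

In [53]:
model3 :: MonadSample m => m Double
model3 = do
    -- choose the first component with probability 0.7
    useFirst <- bernoulli 0.7
    if useFirst then normal 0 1 else normal 3 1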

In [54]:
sampleIO $ plotVega . toEmpirical . toBins 0.05 <$> replicateM 100000 model3
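For comparison, the density that this histogram should approximate can be written down directly. The sketch below is not part of the notebook's code; `normalPdf` and `mixturePdf` are hypothetical helpers that assume the second argument of `normal` is a standard deviation, and they use only the weights and parameters that appear in `model3`.

```haskell
-- Density of a normal distribution with mean mu and standard deviation sigma.
normalPdf :: Double -> Double -> Double -> Double
normalPdf mu sigma x =
  exp (negate ((x - mu) ^ (2 :: Int)) / (2 * sigma * sigma)) / (sigma * sqrt (2 * pi))

-- Density of the mixture in model3: weight 0.7 on Normal(0, 1)
-- and weight 0.3 on Normal(3, 1).
mixturePdf :: Double -> Double
mixturePdf x = 0.7 * normalPdf 0 1 x + 0.3 * normalPdf 3 1 x
```

Because `toEmpirical` normalizes the bin counts to sum to 1, each bar in the plot is a probability rather than a density, so its height should be roughly `mixturePdf x` times the bin width of 0.05.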
