In [30]:
:load Plotting.hs

import Control.Monad.Bayes.Class
import Control.Monad.Bayes.Weighted
import Control.Monad.Bayes.Population
import Control.Monad.Bayes.Sampler
import Control.Monad.Bayes.Inference.PMMH
import Control.Monad.Bayes.Traced
import Control.Monad.Bayes.Inference.RMSMC
import Control.Monad.Bayes.Inference.SMC2

import qualified Data.Text as T
import Numeric.Log
import Control.Arrow (first,second)

import Control.Monad

import Graphics.Vega.VegaLite hiding (density)
import qualified Graphics.Vega.VegaLite as VL
import IHaskell.Display.Hvega (vlShow)

:e OverloadedStrings
:e BlockArguments

:e TupleSections


monad-bayes offers several more advanced inference methods, which are modular combinations of `SMC` and `MCMC`. 

# Resample Move Sequential Monte Carlo (RMSMC)

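RMSMC is fundamentally an SMC technique: it creates and updates a population of weighted samples. The clever part is that, after each resampling step, every particle is moved by a few MCMC steps. Resampling leaves many duplicate particles, and the MCMC walk spreads them out again while keeping the population targeted at the same distribution.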
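The resample-then-move loop can be sketched without any library. The code below is a minimal, deterministic illustration rather than monad-bayes's implementation: `systematicResample` and `resampleMove` are hypothetical names, the random offset `u` is passed in explicitly, and the `move` argument stands in for an MCMC kernel that leaves the target distribution unchanged.

```haskell
-- | Systematic resampling (illustrative sketch).
-- Given an offset u in [0, 1) and a non-empty list of weighted particles,
-- pick n particles at the evenly spaced positions (u + i) / n of the
-- cumulative normalized weights.
systematicResample :: Double -> [(a, Double)] -> [a]
systematicResample u ps = map pick positions
  where
    n = length ps
    total = sum (map snd ps)
    cumulative = scanl1 (+) [w / total | (_, w) <- ps]
    positions = [(u + fromIntegral i) / fromIntegral n | i <- [0 .. n - 1]]
    pick p = case [x | ((x, _), c) <- zip ps cumulative, c >= p] of
      (x : _) -> x
      []      -> fst (last ps) -- guard against floating-point rounding

-- | One resample-move step (illustrative sketch).
-- Resample, apply the move kernel to every survivor, and give each
-- particle the mean of the old weights so the total mass is preserved.
-- In real RMSMC `move` would be a few random MCMC steps; a pure function
-- is used here only to keep the example deterministic.
resampleMove :: (a -> a) -> Double -> [(a, Double)] -> [(a, Double)]
resampleMove move u ps = [(move x, meanWeight) | x <- systematicResample u ps]
  where
    meanWeight = sum (map snd ps) / fromIntegral (length ps)
```

For example, `systematicResample 0.5 [('a', 1), ('b', 1), ('c', 6)]` yields `"bcc"`: the heavy particle is duplicated and a light one is dropped, which is the loss of diversity that the move step is meant to repair.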

To motivate this more sophisticated inference method, let's pick a relatively hard inference problem: inferring the position of a moving point mass from measurements of its bearings.
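What makes this problem hard is that a bearing fixes only a direction: every point on the same ray from the observer produces the same measurement. A minimal sketch, assuming the observer sits at the origin as in the `atan2 y x` observation model used in this notebook (`bearing` is a hypothetical helper name):

```haskell
-- | Bearing of a position as seen from the origin, in radians.
-- Assumption: the observer is fixed at (0, 0).
bearing :: (Double, Double) -> Double
bearing (x, y) = atan2 y x

-- | Two positions are indistinguishable from a single measurement when
-- their bearings agree (up to a small tolerance).
sameBearing :: (Double, Double) -> (Double, Double) -> Bool
sameBearing p q = abs (bearing p - bearing q) < 1e-12
```

Here `sameBearing (1, 1) (3, 3)` holds even though the two points are far apart, so the distance has to be inferred from how the bearings change over time.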

Here it is in practice:

TODO: visualize the MCMC steps to show that they're efficient

In [45]:
:l ../src/Lib.hs
import qualified Pipes.Prelude as P
import Pipes (Producer, (>->), MonadTrans (lift))


In [46]:


samples <- sampleIOfixed $ P.toListM (latent >-> P.map fst >-> P.take 100)



In [47]:
plotVega (zip samples $ Prelude.repeat 1.0)

In [39]:
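-- Each observation is the bearing (angle from the origin) of a latent position.
observations = fmap (\(x, y) -> atan2 y x) samples
-- Baseline: 1000 steps of plain Metropolis-Hastings.
s <- sampleIO $ prior $ mh 1000 $ P.toListM $ model observations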




In [40]:

plotVega $ (zip (head s) (Prelude.repeat 1.0))

In [42]:
import Control.Monad.Bayes.Inference.SMC 
import Control.Monad.Bayes.Population

observations = fmap (\(x, y) -> atan2 y x) samples
-- smcSystematic takes the number of resampling points (100) and the population size (1000).
s <- sampleIO $ runPopulation $ smcSystematic 100 1000 $ P.toListM $ model observations




In [43]:
a = join (fst <$> s)
-- b = join (replicate 1000 ((ln . exp . snd) <$> s))

s' = zip a (Prelude.repeat 1.0)

plotVega $ take 10000 s'


-- plotVega $ zip (fst $ head s) (Prelude.repeat 1.0)


In [48]:
samplesSingle <- sampleIO $ P.toListM (latent >-> P.map fst >-> P.take 100)
observationsSingle = fmap (\(x, y) -> atan2 y x) samplesSingle



In [49]:
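-- rmsmc takes the number of resampling points, the number of particles and
-- the number of MH steps applied after each resampling (100 each here).
s <- sampleIO $ runPopulation $ rmsmc 100 100 100 $ P.toListM $ model observationsSingle

plotVega $ zip samplesSingle (Prelude.repeat 1.0)
plotVega $ zip (fst $ head s) (Prelude.repeat 1.0)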

# Particle Marginal Metropolis Hastings (PMMH)

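PMMH is fundamentally an MCMC technique. It performs a random walk through parameter space, and at each step it scores the proposed parameters with an unbiased estimate of the marginal likelihood, obtained by running a particle filter (a population of weighted samples) over the latent variables.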
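The two ingredients of a PMMH step can be written down in a few lines. This is a textbook-style sketch, not monad-bayes's internals: it assumes a symmetric proposal and a filter that resamples at every time step, it works with plain probabilities rather than log-space values, and the names `likelihoodEstimate` and `acceptanceProbability` are hypothetical.

```haskell
-- | Mean of a non-empty list of weights.
mean :: [Double] -> Double
mean ws = sum ws / fromIntegral (length ws)

-- | Particle-filter estimate of the marginal likelihood.
-- Input: one list of incremental particle weights per time step.
-- Output: the product over time steps of the mean weight.
likelihoodEstimate :: [[Double]] -> Double
likelihoodEstimate = product . map mean

-- | Metropolis-Hastings acceptance probability for a symmetric proposal,
-- with the exact likelihood replaced by the particle estimate.
-- Each argument is (prior density, estimated likelihood).
acceptanceProbability :: (Double, Double) -> (Double, Double) -> Double
acceptanceProbability (priorOld, likOld) (priorNew, likNew) =
  min 1 ((priorNew * likNew) / (priorOld * likOld))
```

For instance, `likelihoodEstimate [[1, 3], [0.5, 0.25]]` is `0.75` (mean weights 2 and 0.375). A practical implementation would keep these quantities in log space, as monad-bayes does with `Log Double`, to avoid underflow over long observation sequences.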

In [61]:
import Control.Monad

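latent = normal 0 2.0
-- NOTE: the original definition is truncated after `bernoulli`. The logistic
-- link below is an assumed completion, chosen only so that the cell type-checks.
generative x = bernoulli (1 / (1 + exp (-x)))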

-- samples <- sampleIO $ prior $ pmmh 20000 1 10 latent generative
-- samples <- sampleIO $ prior $ mh 10000 (latent >>= generative)
samples <- replicateM 10000 $ sampleIO $ prior (latent >>= generative)

toSample pop = fst $ head pop

a = take 5000 $ samples
-- a = runWeighted $ prior $ do
--     let x = pmmh 10 1 10 latent generative
--     -- return x
--     b <- (proper . fromWeightedList . fmap head) x
--     return b

-- :t head samples

:t toSample
:t samples
toDouble :: Int -> Double
toDouble = fromIntegral
plotVega [("true" :: T.Text, toDouble $ length $ filter id a), ("false", toDouble $ length $ filter not a)]

In [64]:
paramPrior :: MonadSample m => m (Double, Double, Double, Double)
paramPrior = do
    slope <- normal 0 2
    intercept <- normal 0 2
    noise <- gamma 1 1
    prob_outlier <- uniform 0 0.5 
    return (slope, intercept, noise, prob_outlier)

forward (slope, intercept, noise, probOutlier) x = do
    isOutlier <- bernoulli probOutlier
    let meanParams = if isOutlier
                    then (0, 20)
                    else (x*slope + intercept, noise)
    return (meanParams, isOutlier)

regressionWithOutliersData :: (MonadSample m, Traversable t) => t Double -> m (t ((Double, Double), Bool))
regressionWithOutliersData xs = do
    params <- paramPrior

    forM xs \x -> do
        ((mu, std), isOutlier) <- forward params x
        y <- normal mu std
        return ((x, y), isOutlier)

In [129]:

-- plotVega $ fmap (second (ln . exp) . first (T.pack . show)) samps

[(False,1.0),(False,1.0),(False,1.0),(False,1.0),(False,1.0),(False,1.0),(False,1.0),(False,1.0),(False,1.0),(False,1.0)]

In [66]:
range = [-10,-9.9..10] :: [Double]
samples <- sampleIOfixed $ regressionWithOutliersData range
plotVega (fmap (second (T.pack . show)) samples)


In [87]:
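-- The parameters are passed in rather than sampled here, so that pmmh can
-- propose them separately from the per-point outlier indicators.
regressionWithOutliers :: MonadInfer m =>
    [Double] -> [Double] -> (Double, Double, Double, Double)
    -> m ((Double, Double, Double, Double), [Bool])
regressionWithOutliers xs ys params = do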
    outliers <- forM (zip xs ys) \(x, y) -> do
        ((mu, std), isOutlier) <- forward params x
        factor $ normalPdf mu std y
        return isOutlier
    return (params, outliers)

In [120]:
mhRuns <- sampleIOfixed $ prior $ pmmh 1000 200 10
    paramPrior
    (regressionWithOutliers range (snd . fst <$> samples))





1001

In [123]:
m <- mapM (sampleIO . prior . proper . fromWeightedList . return) $ mhRuns



In [69]:


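-- For each data point, count how often it is flagged as an outlier across the
-- posterior samples, then return the log ratio of non-outlier to outlier counts.
outlierProb s = logRatio <$> foldr tally (Prelude.repeat (0, 0)) s
  where
    logRatio (nOut, nIn) = log (fromIntegral nIn / fromIntegral nOut)
    tally (_, flags) counts =
        [ if b then (nOut + 1, nIn) else (nOut, nIn + 1)
        | (b, (nOut, nIn)) <- zip flags counts ]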


In [130]:
plotVega $ take 1000 (zip (fst <$> samples) (outlierProb m))

As the plot above shows, this works nicely. The `slope`, `intercept`, `noise` and `prob_outlier` variables are inferred by a random walk through parameter space, and each proposed step in this walk is accepted or rejected using a score from a particle filter, which guesses after each observation whether that point is an outlier.
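The value that `outlierProb` pairs with each point is a log ratio of counts, $\log(n_{\text{inlier}} / n_{\text{outlier}})$, which is awkward to read directly. A small sketch for turning it back into the fraction of posterior samples that flag the point as an outlier (`outlierFraction` is a hypothetical helper, not part of the notebook):

```haskell
-- | Convert the log ratio log (nInlier / nOutlier) into the fraction
-- nOutlier / (nInlier + nOutlier).
-- The infinite cases behave sensibly: a point never flagged as an outlier
-- has log ratio +Infinity and maps to 0, and a point always flagged has
-- log ratio -Infinity and maps to 1.
outlierFraction :: Double -> Double
outlierFraction logRatio = 1 / (1 + exp logRatio)
```

A log ratio of 0 corresponds to a point flagged in half of the samples, and `log 3` to a point flagged in a quarter of them.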

TODO: make color gradient not fade to white

# $SMC^2$

The culmination of monad-bayes is $SMC^2$, which combines all the previous pieces into one inference algorithm.

It is $RMSMC$, but with the $MCMC$ walk performed using $PMMH$. 