# Haskell koan: probabilistic programming in < 100 LoC

Let us begin with our incantations to GHC:

In [1]:
{-# LANGUAGE GADTs #-}
import Control.Monad (ap, replicateM)
import System.Random (getStdGen, getStdRandom, randomR)
import qualified Data.Map as M

First, we create a new monad called `P`, for *probability*.
It supports only two operations:
- `Ret` to convert a pure value into a `P` value,
- `Sample01` to draw a sample from the uniform distribution on [0, 1].

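We encode the monad in continuation-passing style (CPS): `Sample01` carries the rest of
the computation as a function of the sampled number, which gives us a monad instance
almost for free. For more modularity, use the free monad technique.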

In [2]:
data P a where
  Ret :: a -> P a -- ^ lift a pure value
  Sample01 :: (Float -> P a) -> P a -- ^ use a float uniformly sampled from [0, 1]
  
instance Functor P where
  fmap f (Ret x) = Ret (f x)
  fmap f (Sample01 rand2pa) = Sample01 $ \r -> fmap f (rand2pa r)
  
instance Monad P where
  return = Ret
  Ret a >>= a2pb = a2pb a
  Sample01 r2pa >>= a2pb = Sample01 $ \r -> r2pa r >>= a2pb

instance Applicative P where
  pure = return
  (<*>) = ap
  
sample01 :: P Float
sample01 = Sample01 Ret
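
The free monad remark can be made concrete with a minimal sketch. Nothing below is part of the notebook's code: `Free`, `SampleF`, `PFree`, `sample01F` and `runConst` are hypothetical names, and the free monad is hand-rolled rather than taken from a library. The idea is that the instruction set (`SampleF`) is separated from the monad plumbing (`Free`), so new instructions or interpreters can be added without touching the instances.

```haskell
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad (ap)

-- | Generic free monad over a functor f (hand-rolled, not from a library).
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure  = Pure
  (<*>) = ap

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- | The only instruction of the language: ask for a uniform Float in [0, 1].
newtype SampleF next = SampleF (Float -> next) deriving Functor

-- | Hypothetical free-monad counterpart of P.
type PFree = Free SampleF

sample01F :: PFree Float
sample01F = Free (SampleF Pure)

-- | A toy interpreter that answers every draw with the same number.
runConst :: Float -> PFree a -> a
runConst _ (Pure a)           = a
runConst r (Free (SampleF k)) = runConst r (k r)
```

With this encoding, `runConst 0.5 ((+) <$> sample01F <*> sample01F)` answers both draws with 0.5 and adds them. An `IO` interpreter in the style of `runP` would be a second function over the same `PFree` programs.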

Now that we have a way to build expressions in this language, let's build interesting objects from this.
We start with a coin, and then we use the coin to approximate a normal distribution by adding many coins.

In [3]:
-- | 'coin p' returns 1 with probability p, 0 with probability (1 - p)
coin :: Num a => Float -> P a
coin p = do
  r <- sample01
  return $ if r < p then 1 else 0
  
-- | Approximate a normal distribution using the central limit theorem by adding many
-- fair coins, i.e. uniformly distributed {0, 1} variables. Ten coins sum to 5 on
-- average, so dividing by 5 gives mean = 1
normal :: Fractional a => P a
normal = do
  as <- replicateM 10 (coin 0.5)
  return $ sum as / 5
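
Since `normal` is just ten fair coins added up and divided by 5, its exact distribution can be enumerated instead of sampled. The sketch below is an aside, not part of the notebook; `choose`, `exactNormal`, `meanOf` and `varianceOf` are hypothetical helper names. It uses exact rationals so the mean and variance come out without rounding.

```haskell
import Data.Ratio ((%))

-- | Binomial coefficient "n choose k".
choose :: Integer -> Integer -> Integer
choose n k = product [n - k + 1 .. n] `div` product [1 .. k]

-- | Exact distribution of (sum of 10 fair coins) / 5 as (value, probability) pairs.
exactNormal :: [(Rational, Rational)]
exactNormal = [ (k % 5, choose 10 k % 2 ^ (10 :: Int)) | k <- [0 .. 10] ]

-- | Mean of a finite distribution.
meanOf :: [(Rational, Rational)] -> Rational
meanOf dist = sum [ v * p | (v, p) <- dist ]

-- | Variance of a finite distribution.
varianceOf :: [(Rational, Rational)] -> Rational
varianceOf dist = sum [ (v - m) ^ (2 :: Int) * p | (v, p) <- dist ]
  where m = meanOf dist
```

Here `meanOf exactNormal` is exactly 1 and `varianceOf exactNormal` is exactly 1/10. The distribution only ever takes the 11 values 0, 0.2, ..., 2, so it is a coarse stand-in for a normal distribution.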

We need a way to run our computation and plot its values, so let's build that capability.

In [4]:
-- | Sample a random value from a `P a`
runP :: P a -> IO a
runP (Ret a) = return a
runP (Sample01 rand2pa) = do
  r <- getStdRandom $ randomR (0, 1)
  runP (rand2pa r)

-- | Sample `n` random values from a `P a`
runsP :: Int -> P a -> IO [a]
runsP n p = replicateM n (runP p)

Let's also write a tiny utility to plot these values onto the command line with fancy ASCII art.
Notice that this will take us more code than everything we have done so far.

In [5]:
-- | List of characters that represent sparklines
sparkchars :: String
sparkchars = "_▁▂▃▄▅▆▇█"

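-- | Convert a value to a sparkline character, scaled against the max value
num2spark :: RealFrac a => a -- ^ Max value
  -> a -- ^ Current value
  -> Char
num2spark maxv curv
  | maxv <= 0 = sparkchars !! 0 -- avoid dividing by zero when every value is 0
  | otherwise =
      sparkchars !!
        (floor $ (curv / maxv) * (fromIntegral (length sparkchars - 1)))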

-- | Print sparklines with title
printvals :: RealFrac a => String -> [a] -> IO ()
printvals title vs = do 
  let maxv = if null vs then 0 else maximum vs
  putStrLn $ title ++ " " ++ map (num2spark maxv) vs
  
-- | Create a histogram from values. Buckets with no values are skipped.
histogram :: RealFrac a
          => String -- ^ title
          -> Int -- ^ number of buckets
          -> [a] -- ^ values
          -> IO ()
histogram title nbuckets as = do
        let minv = minimum as
        let maxv = maximum as
        let perbucket = (maxv - minv) / (fromIntegral nbuckets)
        -- bucket index, measured from the minimum value
        let bucket v = floor ((v - minv) / perbucket) :: Int
        let bucketed = foldl (\m v -> M.insertWith (+) (bucket v) 1 m) mempty as
        printvals title $ map snd . M.toList $ bucketed
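
To see what the bucketing step does in isolation, here is a small pure sketch. It is not the notebook's `histogram`: `bucketCounts` is a hypothetical helper, it assumes bucket indices are measured from the minimum value, and it uses `Data.List` instead of `Data.Map` so that it depends only on base.

```haskell
import Data.List (group, sort)

-- | Count how many values fall into each bucket of width (max - min) / nbuckets.
-- Returns (bucket index, count) pairs; buckets with no values do not appear.
-- Like the plotting code, it assumes a non-empty list with at least two
-- distinct values.
bucketCounts :: Int -> [Double] -> [(Int, Int)]
bucketCounts nbuckets vs =
  [ (b, length g) | g@(b:_) <- group (sort (map bucket vs)) ]
  where
    minv = minimum vs
    maxv = maximum vs
    perbucket = (maxv - minv) / fromIntegral nbuckets
    bucket v = floor ((v - minv) / perbucket)
```

For example, `bucketCounts 4 [0, 1, 2, 3, 4]` produces five entries, not four, because the maximum value lands in a bucket of its own with index 4. The same effect would explain why a 20-bucket histogram can print 21 bars, and skipped empty buckets would explain why a histogram can print fewer bars than requested.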

We now have all the pieces we need to experiment!

In [6]:
runsP 20 (coin 0) >>=  printvals "bias 0"
runsP 20 (coin 0.2) >>=  printvals "bias 0.2"
runsP 20 (coin 0.5) >>=  printvals "bias 0.5"
runsP 20 (coin 0.8) >>=  printvals "bias 0.8"
runsP 20 (coin 1) >>=  printvals "bias 1"
runsP 1000 normal >>= histogram "normal" 10 

bias 0 ____________________

bias 0.2 _____█_____________█

bias 0.5 __████_██___█████_█_

bias 0.8 ██_████_███__███__██

bias 1 ████████████████████

normal __▂▃█▁___

So, we can create coins, gaussians, and whatever we want _from the building block of a uniform distribution_.
However, we still don't have a way to take samples from _arbitrary distributions_. What if we want
to sample from a distribution of the form `p(x) = x^{-2}`? Enter MCMC.

In [7]:
-- | Take a proposal and build a new probabilistic value that either moves to the
-- proposed value or stays at the current one, based on the ratio of their
-- scores (Metropolis-Hastings)
mhStep :: (a -> Float) -- ^ score function
   -> (a -> P a) -- ^ proposal: given a value, provide values around that value
   -> a -- ^ current value
   -> P a -- ^ next value
mhStep scoring proposer a = do
 a' <- proposer a
 -- plain score ratio: valid for symmetric proposals, and assumes scoring a > 0
 let accept = scoring a' / scoring a
 u <- sample01
 return $ if u < accept then a' else a

-- | Take N steps of Metropolis-Hastings
mhSteps :: Int -> (a -> Float) -> (a -> P a) -> a -> P a
mhSteps 0 _ _ a = return a
mhSteps n scoring proposer a = mhStep scoring proposer a >>= mhSteps (n - 1) scoring proposer


sampleFloatUniform :: Float -> Float ->  P Float
sampleFloatUniform l h = do
  u <- sample01
  return $ l + u * (h - l)
  
-- | Sample a Float from the interval (lo, hi) given a scoring function, using
-- 10 steps of Metropolis-Hastings with uniform proposals, starting from 0.5
mhFloat :: (Float, Float) -> (Float -> Float) -> P Float
mhFloat (lo, hi) scorer = mhSteps 10 scorer (const $ sampleFloatUniform lo hi) 0.5
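
A single Metropolis-Hastings step is easier to follow when the random numbers are fixed by hand. The sketch below repeats `P`, `sample01`, `sampleFloatUniform` and `mhStep` so that it compiles on its own, and adds two hypothetical helpers that are not part of the notebook: `runPure`, which feeds a list of pre-chosen numbers to a `P` computation, and `stepFrom3`, which runs one step from the value 3.0 on (0, 6) with score x^2.

```haskell
import Control.Monad (ap)

-- Same shape as P above, repeated so that this block compiles on its own.
data P a = Ret a | Sample01 (Float -> P a)

instance Functor P where
  fmap f (Ret x)      = Ret (f x)
  fmap f (Sample01 k) = Sample01 (fmap f . k)

instance Applicative P where
  pure  = Ret
  (<*>) = ap

instance Monad P where
  Ret a      >>= f = f a
  Sample01 k >>= f = Sample01 (\r -> k r >>= f)

sample01 :: P Float
sample01 = Sample01 Ret

sampleFloatUniform :: Float -> Float -> P Float
sampleFloatUniform l h = do
  u <- sample01
  return $ l + u * (h - l)

mhStep :: (a -> Float) -> (a -> P a) -> a -> P a
mhStep scoring proposer a = do
  a' <- proposer a
  let accept = scoring a' / scoring a
  u <- sample01
  return $ if u < accept then a' else a

-- | Hypothetical pure interpreter: answer each draw with the next number in
-- the list, or give up with Nothing if the list runs out.
runPure :: [Float] -> P a -> Maybe a
runPure _      (Ret a)      = Just a
runPure (r:rs) (Sample01 k) = runPure rs (k r)
runPure []     (Sample01 _) = Nothing

-- | One step from 3.0 on (0, 6) with score x^2. The first draw picks the
-- proposal, the second decides whether it is accepted.
stepFrom3 :: Float -> Float -> Maybe Float
stepFrom3 proposalDraw acceptDraw =
  runPure [proposalDraw, acceptDraw]
          (mhStep (^ (2 :: Int)) (const (sampleFloatUniform 0 6)) 3.0)
```

With a proposal draw of 0.25 the proposed value is 1.5 and the score ratio is 2.25 / 9 = 0.25, so `stepFrom3 0.25 0.5` stays at 3.0 while `stepFrom3 0.25 0.125` moves to 1.5. With a proposal draw of 0.75 the proposed value 4.5 has a ratio above 1 and is accepted for any second draw.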

Let's now check if this works! The function we pass to `mhFloat` is the (unnormalized) probability density that we wish to sample from, so if it works, each histogram should look like the function it was sampled from.

In [8]:
runsP 1000 (mhFloat (0, 6) (const 1.0)) >>= histogram "uniform" 20
runsP 1000 (mhFloat (0, 6) (^2)) >>= histogram "x^2" 20
runsP 1000 (mhFloat (0, 6) (abs . sin)) >>= histogram "|sin x|" 20
runsP 1000 (mhFloat (0, 6) (abs . cos)) >>= histogram "|cos x|" 20

uniform ▇▇▆▆▇▆▆▆▇█▅▇▆▆▆▆▆▆▅▆_

x^2 ______▁▁▁▂▂▂▃▄▄▅▅▆█_

|sin x| ▁▂▄▅▆▆▇▅▃▂_▂▃▅▅█▅▇▅▄▁

|cos x| ▇▅▄▃▁▁▂▄▄▇▆█▅▄▂_▁▂▅▆_

Great, all of them seem to work. Let's now use similar ideas to estimate the bias of a coin.

In [9]:
-- | Given a list of observations from a coin and a candidate bias, return the
-- likelihood of the observations, which is proportional to the probability of
-- the coin having that bias (under a uniform prior)
estimateBias :: [Int] -> Float -> Float
estimateBias obs bias = 
  product $ map (\o -> if o == 1 then bias else 1 - bias) obs

replicateList :: Int -> [a] -> [a]
replicateList n as = mconcat $ replicate n as 

runsP 1000 (mhFloat (0, 1) $ estimateBias []) >>= histogram "estimate with no data" 20
runsP 1000 (mhFloat (0, 1) $ estimateBias [1]) >>= histogram "estimate with [1]" 20
runsP 1000 (mhFloat (0, 1) $ estimateBias [0]) >>= histogram "estimate with [0]" 20
runsP 1000 (mhFloat (0, 1) $ estimateBias [0, 1]) >>= histogram "estimate with [0, 1]" 20
runsP 1000 (mhFloat (0, 1) $ estimateBias [1, 0]) >>= histogram "estimate with [1, 0]" 20
runsP 1000 (mhFloat (0, 1) $ estimateBias [1, 0, 1, 0]) >>= histogram "estimate with [1, 0]x2" 20
runsP 1000 (mhFloat (0, 1) $ estimateBias (replicateList 8 [1, 0])) >>= histogram "estimate with [1, 0]x8" 20
runsP 1000 (mhFloat (0, 1) $ estimateBias (replicateList 20 [1, 0])) >>= histogram "estimate with [1, 0]x20" 20

estimate with no data ▅▆▇▄▅▇▆▆▅▆▅▅▇▅▅▆▅▅█▆_

estimate with [1] __▁▁▂▁▂▃▃▃▄▄▅▆▅▇▆▇▅█▁

estimate with [0] ▇▆▇█▇▅▅▅▅▅▄▃▃▂▂▁▂▁___

estimate with [0, 1] ▁▂▃▃▃▅▄▆▅▄█▇▆▆▆▄▄▃▂▁_

estimate with [1, 0] _▁▄▃▄▅▆▅▇█▇▆▅▆▅▅▄▃▂▁_

estimate with [1, 0]x2 ___▁▃▃▅▆▅█▇▇▅▆▆▆▃▂▁▁_

estimate with [1, 0]x8 __▁▂▂▄▅▆▆█▇▄▄▃▂▂▁___

estimate with [1, 0]x20 ___▁▂▂▂▃█▄▃▂▂▁_____
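
These histograms can be compared against a closed form. The score passed to `mhFloat` is the likelihood `bias^heads * (1 - bias)^tails`, and with a flat prior on (0, 1) the normalized version of that function is the Beta(heads + 1, tails + 1) density. The sketch below is an aside, not part of the notebook's code; `posterior` is a hypothetical helper that returns the mean and standard deviation of that Beta distribution. It describes what the sampler is aiming for, assuming the 10-step chain has mixed well enough.

```haskell
-- | Mean and standard deviation of Beta(heads + 1, tails + 1), the posterior
-- over the bias under a flat prior after seeing `heads` ones and `tails` zeros.
posterior :: Int -> Int -> (Double, Double)
posterior heads tails = (mean, sqrt var)
  where
    a    = fromIntegral heads + 1
    b    = fromIntegral tails + 1
    mean = a / (a + b)
    var  = a * b / ((a + b) ^ (2 :: Int) * (a + b + 1))
```

For instance, `posterior 1 0` has mean 2/3, in line with the right-leaning histogram for `[1]`. `posterior 20 20` has mean 0.5 and a standard deviation of about 0.076, compared with about 0.29 for `posterior 0 0`, which matches the way the histograms tighten around 0.5 as balanced data is added.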