Switch branches/tags
Nothing to show

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

## Files

Failed to load latest commit information.
Type
Name
Commit time

Makes it easy to create, manipulate and sample probability distributions.

## Installation

Include this in your sbt config:

``````"org.jliszka" %% "probability-monad" % "1.0.3"
``````

## Examples

Here's how you would code up the following problem: You are given either a fair coin or a biased coin with equal probability. If you flip it 5 times and it comes up heads each time, what is the probability you have the fair coin?

```case class Trial(haveFairCoin: Boolean, flips: List[Coin])

def bayesianCoin(nflips: Int): Distribution[Trial] = {
for {
haveFairCoin <- tf()
c = if (haveFairCoin) coin else biasedCoin(0.9)
flips <- c.repeat(nflips)
} yield Trial(haveFairCoin, flips)
}

bayesianCoin(5).given(_.flips.forall(_ == H)).pr(_.haveFairCoin)```

Or: You repeatedly roll a 6-sided die and keep a running sum. What is the probability the sum reaches exactly 30?

```def dieSum(rolls: Int): Distribution[List[Int]] = {
markov(rolls, List(0))(runningSum => for {
d <- die
} yield (d + runningSum.head) :: runningSum)
}

dieSum(30).pr(_ contains 30)```

Or: Each family has children until it has a boy, and then stops. What is the expected fraction of girls in the population?

```sealed abstract class Child
case object Boy extends Child
case object Girl extends Child

def family = {
discreteUniform(List(Boy, Girl)).until(_ contains Boy)
}

def population(families: Int) = {
for {
children <- family.repeat(families).map(_.flatten)
val girls = children.count(_ == Girl)
} yield 1.0 * girls / children.length
}

population(4).ev```

## How it works

A `Distribution[T]` represents a random variable that, when sampled, produces values of type `T` according to a particular probability distribution. For example, `Distribution.uniform` is a `Distribution[Double]` that produces `Double` values between 0.0 and 1.0, uniformly distributed. `Distribution.coin` is a `Distribution[Coin]` that produces the values `H` and `T` with equal probability, and `Distribution.biasedCoin(0.3)` is a `Distribution[Coin]` that produces the value `H` 30% of the time and the value `T` 70% of the time.

You can think of a `Distribution[T]` as a collection like any other scala collection that you can `map`, `flatMap` and `filter` over. The presence of these methods allow you to use scala's for-comprehensions to manipulate distributions. For example, here's how you would create a distribution that represents the sum of 2 die rolls:

```val dice = for {
d1 <- die
d2 <- die
} yield d1 + d2```

Here, `die` is a `Distribution[Int]`, and `d1` and `d2` are both `Int`s. The type of `dice` is `Distribution[Int]`. You can see that for-comprehensions are an easy way to define new a distribution in terms of individual samples from other distributions.

You can visualize a distribution with `hist`:

``````scala> dice.hist
2  2.61% ##
3  5.48% #####
4  8.70% ########
5 10.53% ##########
6 14.21% ##############
7 16.90% ################
8 13.90% #############
9 11.43% ###########
10  8.35% ########
11  5.17% #####
12  2.72% ##
``````

If you want more control over the display of continuous distributions, use `bucketedHist`:

``````scala> normal.map(_ * 2 + 1).bucketedHist(20)   // 20 buckets, min & max determined automatically
-7.0  0.02%
-6.0  0.03%
-5.0  0.22%
-4.0  1.01% #
-3.0  2.69% ##
-2.0  6.43% ######
-1.0 12.19% ############
0.0 17.07% #################
1.0 19.74% ###################
2.0 17.55% #################
3.0 12.17% ############
4.0  6.55% ######
5.0  2.92% ##
6.0  1.10% #
7.0  0.23%
8.0  0.06%
9.0  0.01%
10.0  0.01%

scala> cauchy.bucketedHist(-10, 10, 20)   // min=-10, max=10, #buckets=20
-10.0  0.20%
-9.0  0.38%
-8.0  0.44%
-7.0  0.55%
-6.0  0.82%
-5.0  1.23% #
-4.0  1.85% #
-3.0  2.92% ##
-2.0  6.78% ######
-1.0 16.78% ################
0.0 30.04% ##############################
1.0 16.64% ################
2.0  6.22% ######
3.0  3.06% ###
4.0  1.76% #
5.0  1.26% #
6.0  0.84%
7.0  0.67%
8.0  0.48%
9.0  0.42%
10.0  0.14%
``````

This probability monad is based on sampling, so the values and plots produced will be inexact and will vary between runs.

``````scala> normal.stdev
res9: Double = 1.0044818262040809

scala> normal.stdev
res10: Double = 1.0071194147525722
``````

# Code and examples

Distribution.scala contains code for creating and manipulating probability distributions. Built-in distributions include:

• uniform discrete (including die and fair coin)
• weighted discrete (biased coin, uses the alias method)
• bernoulli
• geometric
• binomial
• negative binomial
• poisson
• zipf
• uniform continuous
• normal
• cauchy
• chi2
• pareto
• exponential
• lognormal
• student's t-distribution
• gamma
• beta

Methods for manipulating distributions include:

• adding (convolution), subracting (cross-correlation), multiplying and dividing distributions
• joint distributions (a flatMap)
• marginal distributions (a filter)
• creating Markov chains (an iterated flatMap)
• finding the probability of arbitrary predicates, conditional probabililty
• finding expected values (mean), standard deviation, variance, skewness and kurtosis
• sampling, histogram

Examples.scala contains some example uses, and possibly a RISK simulator.

To try out some examples, do

``````\$ ./sbt console

scala> runBayesianCoin(5)
``````

Contributions welcome!

No description, website, or topics provided.

## Releases

No releases published

## Packages 0

No packages published