In [1]:
import sympy as sp
import numpy as np

# Continuous Examples of Bayesian Methods


## Example - Unfair Coins

Suppose we have an unfair coin that returns heads with some probability $p$. (Before we get into it, it is worth noting that you can buy such coins from a magic or toy shop.) Before we gather any data, *Bayesian thinking* says that we should treat $p$ as a random variable that can take values between 0 and 1. We could take a prior such as $f(p) = 2p$. We might choose this prior if we suspect, before flipping it, that the coin favors heads.

In this case we are going to flip the coin once and call that our observation. So this is a case where we have a continuous prior and a discrete likelihood. We compute the likelihoods of observing heads or tails:

$$ L( x=\mbox{heads} | p) = p \qquad L(x = \mbox{tails} | p) = 1-p $$

Then the total probability of observing $x$ is given by:

$$ P(x=\mbox{heads}) = \int_0^1 L( x=\mbox{heads} | p) f(p) dp = \int_0^1 2 p^2 dp = \frac{2}{3} $$

$$ P(x=\mbox{tails}) = \int_0^1 L( x=\mbox{tails} | p) f(p) dp = \int_0^1 2 (1-p) p dp = $$

In [50]:
p = sp.Symbol('p')

sp.integrate( 2*(1-p)*p, (p, 0, 1) )

1/3

The *posterior estimate* in each case then becomes:

$$ f(p | x=\mbox{heads}) = \frac{L( x=\mbox{heads}|p) f(p) }{ P(x=\mbox{heads}) } = 3 p^2 $$

and in the other case

$$ f(p | x=\mbox{tails}) = 6 (1-p)p $$
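
As a check on the two posteriors above, the whole update can be written as one small SymPy function. This is a minimal sketch: the helper name `posterior` is hypothetical and introduced only for illustration; the prior and likelihoods are the ones from the text.

```python
import sympy as sp

p = sp.Symbol('p')

def posterior(prior, likelihood):
    """Return (total probability, posterior density) for p on [0, 1].

    `posterior` is an illustrative helper, not part of SymPy.
    """
    evidence = sp.integrate(likelihood * prior, (p, 0, 1))
    return evidence, sp.factor(likelihood * prior / evidence)

prior = 2 * p                                   # prior weighted towards heads
ev_heads, post_heads = posterior(prior, p)      # likelihood of heads is p
ev_tails, post_tails = posterior(prior, 1 - p)  # likelihood of tails is 1 - p

print(ev_heads, post_heads)
print(ev_tails, post_tails)
```

The two total probabilities should sum to 1, and each posterior should integrate to 1 over $[0, 1]$.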

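Whichever outcome we flip, the prior is updated to the corresponding posterior. We can repeat this as we continue to flip the coin and record the results, each time using the current posterior as the new prior. Every density obtained this way is a Beta distribution: the prior $2p$ is Beta(2, 1), and the posteriors $3p^2$ and $6(1-p)p$ are Beta(3, 1) and Beta(2, 2).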
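
The repeated updating described here can be sketched as a loop in which each posterior becomes the prior for the next flip. The flip sequence below is made up for illustration, and `update` is a hypothetical helper name.

```python
import sympy as sp

p = sp.Symbol('p')

def update(density, flip):
    """One Bayesian update of a density on [0, 1] for a single coin flip."""
    likelihood = p if flip == 'H' else 1 - p
    unnormalized = likelihood * density
    return sp.factor(unnormalized / sp.integrate(unnormalized, (p, 0, 1)))

density = 2 * p                      # the prior from the text
for flip in ['H', 'T', 'H']:         # hypothetical sequence of flips
    density = update(density, flip)
    print(flip, density)
```

Each printed density is a constant times $p^a (1-p)^b$, which is the form of a Beta density.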

### Discussion

Note that our prior estimate was weighted towards heads, and the total probability we compute reflected that with 2/3 to 1/3. 

Note that the two posteriors we end with differ. One is the result of the flip being heads and the other the result of the flip being tails.


## Flat Prior 

As we've learned, an important step in Bayesian analysis is understanding the role of the prior for our particular problem. In the ideal case we have enough data that the prior is swamped by the likelihood of the data. However, if the sample size is fixed, we may not be able to avoid the effects of the prior. In the case of our coin, for example, our initial prior was weighted towards heads.

We could try a *flat prior*, which is just a uniformly distributed prior where every value of $p$ is equally likely before we see any data:

$$ f(p) = 1$$

Note that our likelihoods do not change; however, the total probabilities will:

$$ P( x = \mbox{heads}) = \int_0^1 p dp = \frac{1}{2} $$

and 

$$ P( x = \mbox{tails}) = \int_0^1 (1-p) dp = \frac{1}{2} $$

Then we compute the posterior estimates:

$$ f(p | x= \mbox{heads}) = 2 p $$

and 

$$ f(p | x=\mbox{tails}) = 2 (1- p) $$
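
These flat-prior results can also be checked by simulation rather than integration. The sketch below draws $p$ from the uniform prior, flips each simulated coin once, and looks at the values of $p$ that produced heads or tails. The sample size and seed are arbitrary choices. Under the posteriors above, the mean of $p$ should be close to $\int_0^1 p \cdot 2p \, dp = 2/3$ given heads and close to $1/3$ given tails.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed for reproducibility
n = 200_000                           # arbitrary number of simulated coins

p_samples = rng.uniform(0.0, 1.0, size=n)          # draw p from the flat prior
heads = rng.uniform(0.0, 1.0, size=n) < p_samples  # one flip per coin

frac_heads = heads.mean()                    # estimates P(x = heads)
mean_p_given_heads = p_samples[heads].mean()     # estimates E[p | heads]
mean_p_given_tails = p_samples[~heads].mean()    # estimates E[p | tails]

print(frac_heads, mean_p_given_heads, mean_p_given_tails)
```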

### Discussion

Note the symmetry: the two posteriors are mirror images of each other under $p \to 1 - p$.

Note that the posterior in the event that the first flip is heads, $2p$, is precisely the prior we started with in the unfair-coin example.





## Using Our Posterior to Make a Prediction

So let's start with our flat prior. We flip the coin once and obtain heads, so our posterior density for $p$ is now $f(p | x_1 = \mbox{heads}) = 2p$.

We flip the coin again and get another heads so that our updated posterior is now:

$$ f(p | x_1 = x_2 = \mbox{heads}) = 3 p^2 $$

What do we think is going to happen next?

The probability that a third flip is heads, averaging the likelihood over our posterior for $p$, is:

$$ P(x_3 = \mbox{heads} | x_1=x_2 =\mbox{heads}) = \int_0^1 L(x_3=\mbox{heads} | p) f(p | x_1 = x_2 = \mbox{heads}) dp = \int_0^1 3 p^3 dp = \frac{3}{4} $$

and therefore

$$ P(x_3 = \mbox{tails} | x_1=x_2 =\mbox{heads}) = \frac{1}{4} $$
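
The prediction above can be reproduced in SymPy as a quick check. This is a minimal sketch using the flat prior and the two observed heads from the text; the variable names are chosen only for illustration.

```python
import sympy as sp

p = sp.Symbol('p')

prior = sp.Integer(1)                 # flat prior on [0, 1]
unnormalized = p**2 * prior           # likelihood of two heads times the prior
posterior = unnormalized / sp.integrate(unnormalized, (p, 0, 1))

# Average the likelihood of a third head over the posterior.
pred_heads = sp.integrate(p * posterior, (p, 0, 1))
pred_tails = 1 - pred_heads

print(posterior, pred_heads, pred_tails)
```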

## Example - Radioactive Decay

An unknown radioactive isotope has a lifetime modeled by an exponential distribution:

$$ f(x) = \lambda \exp(-\lambda x) $$

with mean lifetime $1/\lambda$. Suppose we detect a decay after $x$ seconds. What is our estimate of the mean lifetime of this isotope?
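
The text does not specify a prior for $\lambda$, so the following is only a sketch of how the same recipe would apply here. Purely as an assumption for illustration, it takes the prior $f(\lambda) = \exp(-\lambda)$ on $\lambda > 0$; a different prior would give a different answer.

```python
import sympy as sp

lam, x = sp.symbols('lambda x', positive=True)

likelihood = lam * sp.exp(-lam * x)   # density of observing a decay at time x
prior = sp.exp(-lam)                  # ASSUMPTION: not given in the text

unnormalized = likelihood * prior
evidence = sp.integrate(unnormalized, (lam, 0, sp.oo))
posterior = unnormalized / evidence

# Posterior expectation of the mean lifetime 1/lambda.
mean_lifetime = sp.simplify(sp.integrate(posterior / lam, (lam, 0, sp.oo)))

print(sp.simplify(posterior))
print(mean_lifetime)
```

Under this assumed prior the integrals can be done exactly, which is the situation the discussion below refers to.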

### Discussion

That's the game. The only barrier is whether the integrals being computed can be found exactly or not. 

Some general parting thoughts about Bayesian Methods:

- There are two points of view. One is that Bayesian methods are an alternative way of thinking about statistics and experiments that is more intuitive and closer to the meaning a non-expert assumes we intend when we describe conclusions.
- Another is that Bayesian Methods are one more tool. They are a tool that is useful if you are going to try and make a prediction following an experiment (what happens next). 

