## Part 0: Introduction to Bayesian Analysis

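The following formula underpins all Bayesian analysis:

$$P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}$$

Here $\theta$ stands for the parameter(s) and $D$ for the observed data.
It is **very important** to understand what each of the terms is. Do not
move on until you have read and understood this section.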

**Prior Probability**:
- A PMF/PDF representing your initial beliefs about the parameter(s).
- The prior's influence on the posterior shrinks as more data is incorporated.

**Likelihood**:
- The probability of observing the data given the parameter(s).
- e.g. What is the probability of 3 heads in a row, given that the probability of heads is 0.7?
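
The coin example above can be worked out directly. This is a minimal sketch that assumes the flips are independent; the variable names are illustrative only.

```python
# Likelihood of three heads in a row, assuming independent flips
# with P(heads) = 0.7 (the parameter we condition on).
p_heads = 0.7
n_heads = 3  # the observed data: three heads in a row

# Independent events multiply, so the likelihood is p_heads ** n_heads.
likelihood = p_heads ** n_heads
print(round(likelihood, 3))  # 0.343
```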

**Posterior Probability**:
- The product of the prior and the likelihood, divided by the normalizing constant (a Bayesian update).
- The posterior probability becomes the prior for the next Bayesian update.

**Normalizing Constant**:
- The probability of observing the data, summed (or integrated) over all possible parameter values.
- In Bayesian analysis, this term ensures that the posterior probabilities sum (or integrate) to 1.
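
The four terms fit together in a single update step. The sketch below is one possible way to write it for a finite set of hypotheses; `bayes_update` and the two-coin example are hypothetical names and data chosen for illustration, not part of the exercise.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over a finite set of hypotheses.

    prior:      dict mapping hypothesis -> prior probability
    likelihood: dict mapping hypothesis -> P(data | hypothesis)
    """
    # Numerator of Bayes' rule for each hypothesis.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    # Normalizing constant: the total probability of the data.
    normalizing_constant = sum(unnormalized.values())
    return {h: u / normalizing_constant for h, u in unnormalized.items()}


# Hypothetical example: is a coin fair (p = 0.5) or biased (p = 0.7)?
prior = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5 ** 3, "biased": 0.7 ** 3}  # three heads in a row
posterior = bayes_update(prior, likelihood)
print(posterior)
```

Feeding `posterior` back in as the next `prior` gives the repeated update described above. The sketch does not guard against data that every hypothesis rules out, in which case the normalizing constant would be zero.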

https://github.com/mrdtirado/power-bayesian/blob/master/pair.md

## Part 1: Bayesian Analysis (Discrete example)

We're going to start with a discrete example.

A box contains a 4-sided die, a 6-sided die, an 8-sided die,
a 12-sided die, and a 20-sided die. A die is selected at random, and the
rest are destroyed.

We would like to determine which die I have selected, given only the values that I roll.

You should write the solutions to these in a text or markdown file.

1. What is the prior associated with choosing any one die?


In [1]:
# The prior for each die is 1/5, because the die is selected uniformly at random

2. What is the likelihood function? You should assume that the dice are all fair.

In [2]:
# The likelihood function is the probability of the observed roll given the die:
# P(roll = x | n-sided die) = 1/n if 1 <= x <= n, and 0 otherwise.
# For several independent rolls, the likelihoods multiply.
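
The likelihood for a fair die can be written as a small function. This is a sketch; `roll_likelihood` is an illustrative name, not something defined in the exercise.

```python
def roll_likelihood(roll, sides):
    """P(roll | fair die with `sides` faces)."""
    # A fair die gives each face probability 1/sides;
    # values outside 1..sides are impossible.
    return 1 / sides if 1 <= roll <= sides else 0.0


dice = [4, 6, 8, 12, 20]
# Likelihood of rolling an 8 under each candidate die.
print({sides: roll_likelihood(8, sides) for sides in dice})
```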

3. Say I roll an 8. After one Bayesian update, what is the probability that I chose each of the dice?


In [3]:
# Posterior is proportional to prior * likelihood = (1/5) * (1/n) for n >= 8, then normalized
# 0: The probability we chose the 4-sided die given we rolled an 8
# 0: The probability we chose the 6-sided die given we rolled an 8
# 15/31 (about 0.48): The probability we chose the 8-sided die given we rolled an 8
# 10/31 (about 0.32): The probability we chose the 12-sided die given we rolled an 8
# 6/31 (about 0.19): The probability we chose the 20-sided die given we rolled an 8
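
The update for question 3 can be computed exactly with Python's `fractions` module. This is a minimal sketch under the stated assumptions (uniform prior, fair dice, a single roll of 8); the variable names are illustrative.

```python
from fractions import Fraction

dice = [4, 6, 8, 12, 20]
roll = 8

# Uniform prior: each die is equally likely to have been drawn.
prior = {d: Fraction(1, 5) for d in dice}
# Likelihood of the roll under each fair die (0 if the die has too few sides).
likelihood = {d: Fraction(1, d) if roll <= d else Fraction(0) for d in dice}

unnormalized = {d: prior[d] * likelihood[d] for d in dice}
evidence = sum(unnormalized.values())  # normalizing constant P(roll = 8)
posterior = {d: unnormalized[d] / evidence for d in dice}

for d in dice:
    print(f"{d}-sided: {posterior[d]} (about {float(posterior[d]):.3f})")
```

Because every die has the same prior, the posterior is driven entirely by the likelihood, so the smallest die that can produce an 8 comes out as the most probable.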

4. Comment on the difference in the posteriors if I had rolled the die 50 times instead of 1.


In [5]:
# The posterior would be far more concentrated after 50 rolls than after 1.
# As the number of rolls grows, the likelihood dominates the update, so
# even an incorrect prior would be drowned out by the data
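
To see how quickly the posterior sharpens, the sketch below compares 1, 5, and 50 rolls. It assumes hypothetical data in which every roll is at most 8 and at least one roll is exactly 8, so that only the number of rolls and the largest value matter; `posterior_after` is an illustrative helper.

```python
dice = [4, 6, 8, 12, 20]


def posterior_after(n_rolls, max_roll=8):
    """Posterior over dice after n_rolls fair rolls whose largest value
    is max_roll, starting from a uniform prior (assumed data)."""
    unnormalized = {
        # Each roll has probability 1/d under a d-sided die, and a die
        # with fewer than max_roll sides could not have produced the data.
        d: (1 / 5) * (1 / d) ** n_rolls if d >= max_roll else 0.0
        for d in dice
    }
    total = sum(unnormalized.values())
    return {d: u / total for d, u in unnormalized.items()}


for n in (1, 5, 50):
    print(n, {d: round(p, 4) for d, p in posterior_after(n).items()})
```

With more rolls, the probability mass moves almost entirely onto the smallest die that is consistent with the data.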

5. Which one of these two sets of data gives you a more certain posterior and why?
`[1, 1, 1, 3, 1, 2]` or `[10, 10, 10, 10, 8, 8]`


In [6]:
# The second data set gives a more certain posterior. A roll of 10 rules out
# the 4-, 6-, and 8-sided dice, so the die must be 12- or 20-sided, and six rolls
# put about 96% of the posterior on the 12-sided die. The first data set rules
# out nothing, and its most likely die (4-sided) gets about 91%
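
The two data sets can be compared numerically. This sketch assumes a uniform prior and fair, independent rolls; the helper name `posterior` is illustrative.

```python
from math import prod

dice = [4, 6, 8, 12, 20]


def posterior(rolls):
    """Posterior over dice for a list of rolls, from a uniform prior."""
    unnormalized = {
        # Likelihood of the whole data set: product of per-roll likelihoods.
        d: (1 / 5) * prod(1 / d if r <= d else 0.0 for r in rolls)
        for d in dice
    }
    total = sum(unnormalized.values())
    return {d: u / total for d, u in unnormalized.items()}


post_a = posterior([1, 1, 1, 3, 1, 2])
post_b = posterior([10, 10, 10, 10, 8, 8])

# Report the most probable die and its posterior probability for each set.
for name, post in (("first", post_a), ("second", post_b)):
    best = max(post, key=post.get)
    print(f"{name} data set: {best}-sided die with probability {post[best]:.3f}")
```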

6. Say that I modify my prior by my belief that bigger dice are more likely to be drawn from the box. This is my prior distribution:

    ```
    4-sided die: 8%
    6-sided die: 12%
    8-sided die: 16%
    12-sided die: 24%
    20-sided die: 40%
    ```

    What are my posteriors for each die after rolling the 8?

    Which die do we think is most likely? Is this different than what you got with the previous prior?

In [None]:
# P(Die=4 | Roll=8) = 0
# P(Die=6 | Roll=8) = 0
# P(Die=8 | Roll=8) = P(Roll=8 | Die=8) P(Die=8) / P(Roll=8)
# P(Die=12 | Roll=8) = P(Roll=8 | Die=12) P(Die=12) / P(Roll=8)
# P(Die=20 | Roll=8) = P(Roll=8 | Die=20) P(Die=20) / P(Roll=8)

In [17]:
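# Priors for the dice that can produce an 8
prior_8 = .16
prior_12 = .24
prior_20 = .40
# Likelihood of rolling an 8 with each fair die
lik_8 = 1/8
lik_12 = 1/12
lik_20 = 1/20
# Normalizing constant P(Roll=8); the 4- and 6-sided dice contribute 0
marginal = prior_8*1/8 + prior_12*1/12 + prior_20*1/20

In [18]:
print('Posterior of die8 given a roll of 8 is {}'.format(((lik_8)*prior_8)/marginal))
print('Posterior of die12 given a roll of 8 is {}'.format(((lik_12)*prior_12)/marginal))
print('Posterior of die20 given a roll of 8 is {}'.format(((lik_20)*prior_20)/marginal))
# The three remaining dice are equally likely: the larger prior on bigger dice
# exactly cancels their smaller likelihood. Under a uniform prior the 8-sided
# die would be favoured, since it has the largest likelihood.

Posterior of die8 given a roll of 8 is 0.33333333333333337
Posterior of die12 given a roll of 8 is 0.3333333333333333
Posterior of die20 given a roll of 8 is 0.3333333333333334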


7. Say you keep the same prior and you roll the die 50 times and get values 1-8 every time. What would you expect of the posterior? How different do you think it would be if you'd used the uniform prior?

In [19]:
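# Assuming at least one roll is a 7 or an 8, the 4- and 6-sided dice are ruled out
# and the posterior sits almost entirely on the 8-sided die: 50 rolls of 8 or
# less are extremely unlikely with a 12- or 20-sided die. The uniform prior would
# give nearly the same posterior, because 50 rolls overwhelm either prior.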
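
The effect of the prior after 50 rolls can be checked with a short sketch. It assumes hypothetical data in which all 50 rolls are at most 8 and at least one is exactly 8; `posterior_after_rolls` is an illustrative helper.

```python
dice = [4, 6, 8, 12, 20]
uniform_prior = {d: 0.2 for d in dice}
skewed_prior = {4: 0.08, 6: 0.12, 8: 0.16, 12: 0.24, 20: 0.40}


def posterior_after_rolls(prior, n_rolls=50, max_roll=8):
    """Posterior after n_rolls fair rolls whose largest value is max_roll."""
    unnormalized = {
        # Dice with fewer than max_roll sides are impossible; the others
        # assign probability (1/d) to each of the n_rolls observations.
        d: prior[d] * (1 / d) ** n_rolls if d >= max_roll else 0.0
        for d in dice
    }
    total = sum(unnormalized.values())
    return {d: u / total for d, u in unnormalized.items()}


post_uniform = posterior_after_rolls(uniform_prior)
post_skewed = posterior_after_rolls(skewed_prior)

# Probability left on any die other than the 8-sided one, under each prior.
print("uniform prior:", 1 - post_uniform[8])
print("skewed prior: ", 1 - post_skewed[8])
```

Both priors leave only a vanishing amount of probability on the larger dice, which is the sense in which the data drown out the prior.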