# Chapter 2. Bayes's Theorem
[Link to chapter online](https://allendowney.github.io/ThinkBayes2/chap02.html)

$P(A|B) = \frac{P(A)P(B|A)}{P(B)}$

## Warning

The content of this file may be incorrect, erroneous and/or harmful. Use it at your own risk.

## Imports

In [None]:
import DataFrames as Dfs

## Functionality developed in this chapter

In [None]:
"""
    update!(df)

Compute the posterior probabilities and store them in `df`.
Adds (or overwrites) the columns `unnorm` and `posterior`.

# Arguments
- df: Dfs.DataFrame, must contain columns named: prior, likelihood
"""
function update!(df::Dfs.DataFrame)
    df.unnorm = df.prior .* df.likelihood
    df.posterior = df.unnorm ./ sum(df.unnorm)
end

## The Cookie Problem

Suppose there are two bowls of cookies.
- Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies.
- Bowl 2 contains 20 vanilla cookies and 20 chocolate cookies.

Now suppose you choose one of the bowls at random and, without looking, choose a cookie at random. If the cookie is vanilla, what is the probability that it came from Bowl 1?

My (BL) logical reasoning:
- fact: I got a vanilla cookie
- vanilla cookies in total (B1 + B2) = (30+20) = 50
- prob that a vanilla cookie came from Bowl 1 = $\frac{30}{50}$ = 0.6

The calculation above works only because both bowls hold the same number of cookies (40 each) and each bowl is equally likely to be chosen, so every one of the 80 cookies is equally likely to be drawn.

Using Bayes's Theorem:
$P(A|B) = \frac{P(A)P(B|A)}{P(B)}$

Where:
- $P(A|B)$ is $P(B1|Vanilla)$
- $P(A)$ is $P(B1)$ = $\frac{1}{2}$
- $P(B|A)$ is $P(Vanilla|B1)$ = $\frac{30}{40}$ = $\frac{3}{4}$
- $P(B)$ is $P(Vanilla)$ = $\frac{30+20}{40+40}$ = $\frac{50}{80}$ = $\frac{5}{8}$

$P(B1|Vanilla) = \frac{\frac{1}{2} * \frac{3}{4}}{\frac{5}{8}}$

$P(B1|Vanilla) = \frac{\frac{3}{8}}{\frac{5}{8}}$

$P(B1|Vanilla) = \frac{3}{8} * \frac{8}{5}$ (dividing by a fraction is the same as multiplying by its reciprocal)

$P(B1|Vanilla) = \frac{24}{40} = \frac{24/4}{40/4}= \frac{6}{10} = 0.6$
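
The same arithmetic can be checked with Julia's built-in rational numbers. This is only a minimal sketch: the variable names are made up for illustration and are not used elsewhere in the chapter, and $P(Vanilla)$ is computed from the two bowls separately (law of total probability) instead of by counting all 80 cookies.

```julia
# Exact check of the Cookie Problem using rationals (Base Julia only).
# All variable names are illustrative assumptions.
p_b1 = 1 // 2                # P(B1): each bowl is equally likely
p_vanilla_b1 = 30 // 40      # P(Vanilla|B1)
p_vanilla_b2 = 20 // 40      # P(Vanilla|B2)

# P(Vanilla) = P(B1)P(Vanilla|B1) + P(B2)P(Vanilla|B2)
p_vanilla = p_b1 * p_vanilla_b1 + (1 - p_b1) * p_vanilla_b2

# Bayes's Theorem: P(B1|Vanilla) = P(B1)P(Vanilla|B1) / P(Vanilla)
p_b1_vanilla = p_b1 * p_vanilla_b1 / p_vanilla
println(p_b1_vanilla)        # prints 3//5
```

Both routes give $P(Vanilla) = \frac{5}{8}$ and $P(B1|Vanilla) = \frac{3}{5} = 0.6$.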

## Diachronic Bayes

"diachronic" means "related to change over time"; in this case the probability of the hypotheses changes as we see new data.

Rewriting Bayes's Theorem, from:

$P(A|B) = \frac{P(A)P(B|A)}{P(B)}$

replacements:
A = H (hypothesis), B = D (new data)

new form:

$P(H|D) = \frac{P(H)P(D|H)}{P(D)}$, where

- P(H) - probability of the hypothesis before we see data, **prior**
- P(H|D) - probability of the hypothesis after we see data, **posterior**
- P(D|H) - probability of the data under the hypothesis, **likelihood**
- P(D) - the total probability of the data, under any hypothesis

We can compute $P(D)$ using the law of total probability (from ch01):

$P(A) = P(B_1\ and\ A) + P(B_2\ and\ A) + ...$

And Theorem 2 [remember: $P(A\ and\ B) = P(B\ and\ A)$ (the order of the events does not matter; "and" is commutative)]:

$P(A\ and\ B) = P(B)P(A|B)$

After applying Theorem 2 to the law of total probability we get:

$P(A) = P(B_1)P(A|B_1) + P(B_2)P(A|B_2) + ...$

If we replace:
- A with D,
- $B_i$ with $H_i$

We get:

$P(D) = P(H_1)P(D|H_1) + P(H_2)P(D|H_2) + ...$

Here rewritten as:

$P(D) = \sum_iP(H_i)~P(D|H_i)$

The process in this section, using data and a prior probability to compute a posterior probability, is called a **Bayesian update**.
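
A Bayesian update can be sketched on plain vectors, without a data frame. The helper below, `bayes_update`, is a hypothetical name introduced only for this illustration. It assumes the hypotheses are mutually exclusive and collectively exhaustive, so that the sum in $P(D)$ covers every case.

```julia
# Hypothetical helper (not part of the chapter): a Bayesian update on vectors.
# prior[i] = P(H_i), likelihood[i] = P(D|H_i)
function bayes_update(prior, likelihood)
    unnorm = prior .* likelihood    # P(H_i) * P(D|H_i)
    prob_data = sum(unnorm)         # P(D), by the law of total probability
    return unnorm ./ prob_data      # P(H_i|D)
end

# The Cookie Problem again: two bowls, data = a vanilla cookie
posterior = bayes_update([1 // 2, 1 // 2], [3 // 4, 1 // 2])
println(posterior)
```

For the Cookie Problem this yields $\frac{3}{5}$ for Bowl 1 and $\frac{2}{5}$ for Bowl 2, and the posteriors sum to 1.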

## Bayes Tables

A convenient tool for doing a Bayesian update is a Bayes table.

In [None]:
table = Dfs.DataFrame(
    (;
    names=["Bowl1", "Bowl2"],
    prior=[0.5, 0.5], # P(H): each bowl is equally likely to be chosen (1/2 and 1/2)
    # P(D|H): prob of a vanilla cookie from each bowl (30/40 and 20/40)
    likelihood=[0.75, 0.5])
    )

You might notice that the likelihoods don’t add up to 1. That’s OK; each of them is a probability conditioned on a different hypothesis. There’s no reason they should add up to 1 and no problem if they don’t.

In [None]:
# unnorm_i = P(H_i) * P(D|H_i), the unnormalized posterior (not yet P(D))
# see: numerator in The Cookie Problem
table.unnorm = table.prior .* table.likelihood
table

In [None]:
# P(D) = \sum_iP(H_i) * P(D|H_i)
# denominator in The Cookie Problem ((30+20)/(40+40) = 50/80 = 5/8)
probData = sum(table.unnorm)
probData

In [None]:
# normalization
# ((3/8)/(5/8)) = 24/40 = 6/10 = 0.6 (division in The Cookie Problem)
# ((1/4)/(5/8)) = 1/4 * 8/5 = 2/5 = 0.4 (not performed in The Cookie Problem)
table.posterior = table.unnorm ./ probData
table

## The Dice Problem

A Bayes table can also solve problems with more than two hypotheses. For example:

> Suppose I have a box with a 6-sided die, an 8-sided die, and a 12-sided die. I choose one of the dice at random, roll it, and report that the outcome is a 1. What is the probability that I chose the 6-sided die?

In [None]:
table2 = Dfs.DataFrame(
    (;
    qs=[6, 8, 12],
    prior=repeat([1//3], 3),
    likelihood=[1//6, 1//8, 1//12]
    )
)

In [None]:
update!(table2)
table2
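
As a hand check of the table (worked by hand here, not copied from notebook output), each unnormalized value is the prior times the likelihood:

$\frac{1}{3} \cdot \frac{1}{6} = \frac{1}{18}, \quad \frac{1}{3} \cdot \frac{1}{8} = \frac{1}{24}, \quad \frac{1}{3} \cdot \frac{1}{12} = \frac{1}{36}$

Their sum is the total probability of the data:

$P(D) = \frac{1}{18} + \frac{1}{24} + \frac{1}{36} = \frac{4 + 3 + 2}{72} = \frac{1}{8}$

Dividing each unnormalized value by $P(D)$ gives the posteriors:

$P(6|D) = \frac{1/18}{1/8} = \frac{4}{9}, \quad P(8|D) = \frac{1/24}{1/8} = \frac{1}{3}, \quad P(12|D) = \frac{1/36}{1/8} = \frac{2}{9}$

So the probability that I chose the 6-sided die is $\frac{4}{9}$.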

## The Monty Hall Problem 

The Monty Hall problem is based on a game show called 'Let’s Make a Deal'. If you are a contestant on the show, here’s how the game works:
- The host, Monty Hall, shows you three closed doors – numbered 1, 2, and 3 – and tells you that there is a prize behind each door.
- One prize is valuable (traditionally a car), the other two are less valuable (traditionally goats).
- The object of the game is to guess which door has the car. If you guess right, you get to keep the car.

Suppose you pick Door 1. Before opening the door you chose, Monty opens Door 3 and reveals a goat. Then Monty offers you the option to stick with your original choice or switch to the remaining unopened door.

To maximize your chance of winning the car, should you stick with Door 1 or switch to Door 2?

In [None]:
table3 = Dfs.DataFrame(
    (;
    qs=[1, 2, 3],
    prior=[1//3, 1//3, 1//3],
    )
)

The data is that Monty opened Door 3 and revealed a goat. So let’s consider the probability of the data under each hypothesis:
- If the car is behind Door 1, Monty chooses Door 2 or 3 at random, so the probability he opens Door 3 is $1/2$.
- If the car is behind Door 2, Monty has to open Door 3, so the probability of the data under this hypothesis is 1.
- If the car is behind Door 3, Monty does not open it, so the probability of the data under this hypothesis is 0.

In [None]:
table3.likelihood = [1//2, 1, 0]

In [None]:
update!(table3)
table3
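
The table gives posteriors of $\frac{1}{3}$, $\frac{2}{3}$ and $0$ for Doors 1, 2 and 3, so switching wins two times out of three. If that still feels wrong, a quick simulation may help. The sketch below is not from the chapter: `simulate_monty` is a hypothetical name, and it assumes that the contestant always picks Door 1 and that Monty chooses at random when both remaining doors hide goats. It keeps only the games in which Monty opened Door 3 and reports how often the car was behind Door 2.

```julia
import Random

# Hypothetical simulation (not part of the chapter).
# Returns the fraction of games, among those where Monty opened Door 3,
# in which the car was behind Door 2 (i.e. switching wins).
function simulate_monty(n; rng=Random.default_rng())
    opened3 = 0     # games where Monty opened Door 3
    car_at_2 = 0    # of those, games with the car behind Door 2
    for _ in 1:n
        car = rand(rng, 1:3)
        # Monty never opens Door 1 (the contestant's pick) or the car's door
        monty = car == 1 ? rand(rng, 2:3) : (car == 2 ? 3 : 2)
        if monty == 3
            opened3 += 1
            car_at_2 += (car == 2)
        end
    end
    return car_at_2 / opened3
end

rng = Random.MersenneTwister(42)   # fixed seed for repeatability
frac_switch_wins = simulate_monty(100_000; rng=rng)
println(frac_switch_wins)          # should be close to 2/3
```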

As this example shows, our intuition for probability is not always reliable. Bayes’s Theorem can help by providing a divide-and-conquer strategy:
- First, write down the hypotheses and the data.
- Next, figure out the prior probabilities.
- Finally, compute the likelihood of the data under each hypothesis.

The Bayes table does the rest.

## Summary

The Bayes table can make it easier to compute the total probability of the data, especially for problems with more than two hypotheses.

Now, on to some exercises.

## Exercises

### Exercise 1

Suppose you have two coins in a box. One is a normal coin with heads on one side and tails on the other, and one is a trick coin with heads on both sides. You choose a coin at random and see that one of the sides is heads. What is the probability that you chose the trick coin?

In [None]:
ex1 = Dfs.DataFrame(
    (;
    coins=["HT", "HH"],
    prior=[1//2, 1//2],
    likelihood=[1//2, 1]
    )
)

In [None]:
update!(ex1)
ex1
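
A hand check of the table (worked by hand, not copied from notebook output):

$P(HT)P(heads|HT) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}, \quad P(HH)P(heads|HH) = \frac{1}{2} \cdot 1 = \frac{1}{2}$

$P(heads) = \frac{1}{4} + \frac{1}{2} = \frac{3}{4}$

$P(HH|heads) = \frac{1/2}{3/4} = \frac{2}{3}, \quad P(HT|heads) = \frac{1/4}{3/4} = \frac{1}{3}$

So the probability that you chose the trick coin is $\frac{2}{3}$.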