# Definitions

**Probability**: Number in the range [0,1] that represents a degree of belief in a fact or prediction. A 50% probability represents a predicted outcome that is as likely to occur as to not occur.

**Conditional probability**: Denoted $p(A|B)$, represents the probability of $A$ given that $B$ is true.

**Conjoint probability**: Denoted $p(A,B)$, represents the probability that $A$ and $B$ are both true. In general, $p(A, B) = p(B|A)p(A)$. This reduces to $p(A, B) = p(A)p(B)$ if and only if $A$ and $B$ are independent, i.e. $p(B|A) = p(B)$.
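As a quick sanity check of these definitions, the sketch below enumerates two fair coin flips. The events `A`, `B`, and `C` and the helpers `prob` and `cond` are illustrative names chosen here, not part of the notes above.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, four equally likely outcomes.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond(event, given):
    """Conditional probability: restrict the sample space to `given`."""
    restricted = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in restricted if event(o)), len(restricted))

def A(o): return o[0] == "H"   # first flip is heads
def B(o): return o[1] == "H"   # second flip is heads
def C(o): return "H" in o      # at least one flip is heads

p_ab = prob(lambda o: A(o) and B(o))
p_ac = prob(lambda o: A(o) and C(o))

# A and B are independent, so the product rule simplifies.
print(p_ab == prob(A) * prob(B))      # True
# A and C are not independent, so only the general rule holds.
print(p_ac == prob(A) * prob(C))      # False
print(p_ac == cond(C, A) * prob(A))   # True
```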

# Derivation of Bayes's Theorem

\begin{align*}
    p(A, B) &= p(B, A) \quad &\text{ by commutative property} \\
    p(B|A)p(A) &= p(A|B)p(B) \quad &\text{ by the definition of conjoint probability} \\
    p(A|B) &= \frac{p(B|A)p(A)}{p(B)} \quad &\text{ dividing by } p(B) \text{, assuming } p(B) > 0
\end{align*}

Bayes's Theorem is useful when we want the conditional probability $p(A|B)$ but $p(B|A)$ is easier to compute.

Another way to read the theorem is the *diachronic interpretation*. "Diachronic" refers to something that happens over time: in the case of Bayes's Theorem, the probability of a hypothesis $H$ changes over time as we see new data $D$.

$$p(H|D) = \frac{p(H)p(D|H)}{p(D)}$$ where,

- $p(H|D)$ is the *posterior* and what we want to compute
- $p(H)$ is the *prior* - the probability of the hypothesis before we see the data
- $p(D|H)$ is the *likelihood* - the probability of data under the hypothesis
- $p(D)$ is the *normalizing constant* - the probability of data under any hypothesis

We usually look at problems with hypotheses that are *mutually exclusive* (at most one hypothesis is true) and *collectively exhaustive* (at least one hypothesis is true).
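When the hypotheses are mutually exclusive and collectively exhaustive, the normalizing constant can be computed as the sum of $p(H)p(D|H)$ over all hypotheses. The following is a minimal sketch of one such update; the function name `bayes_update` and the two-hypothesis numbers are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def bayes_update(priors, likelihoods):
    """Return the posterior p(H|D) for each hypothesis H.

    priors:      dict mapping hypothesis -> p(H)
    likelihoods: dict mapping hypothesis -> p(D|H)
    Assumes the hypotheses are mutually exclusive and collectively exhaustive.
    """
    # Unnormalized posteriors: p(H) * p(D|H)
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}
    # Normalizing constant p(D): sum over all hypotheses
    p_data = sum(unnorm.values())
    if p_data == 0:
        raise ValueError("the data has zero probability under every hypothesis")
    return {h: u / p_data for h, u in unnorm.items()}

# Hypothetical example: two equally likely hypotheses.
priors = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}
likelihoods = {"H1": Fraction(1, 4), "H2": Fraction(3, 4)}
posteriors = bayes_update(priors, likelihoods)
print(posteriors["H1"], posteriors["H2"])  # 1/4 3/4
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to compare against hand-computed results.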

# Examples

## Cookie Problem
**Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each. Suppose you choose a bowl at random and select a cookie without looking. You draw a vanilla cookie. What is the probability that it came from Bowl 1?**

We want to compute $p(Bowl 1|vanilla)$ but it is much easier to compute $p(vanilla|Bowl 1) = 30/40 = 3/4$.
Plugging into Bayes's Theorem,

\begin{align*}
    p(Bowl 1|vanilla) &= \frac{p(vanilla|Bowl 1)p(Bowl 1)}{p(vanilla)} \\
    &= \frac{p(vanilla|Bowl 1)p(Bowl 1)}{p(vanilla|Bowl 1)p(Bowl 1) + p(vanilla|Bowl 2)p(Bowl 2)} \\
    &= \frac{(3/4)(1/2)}{(3/4)(1/2) + (1/2)(1/2)} \\
    &= 0.6
\end{align*}

There is a 60% probability that the cookie came from Bowl 1, which makes sense since Bowl 1 has a higher proportion of vanilla cookies than Bowl 2.
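The same arithmetic can be checked with exact fractions. This is only a sketch of the calculation above; the variable names are illustrative.

```python
from fractions import Fraction

# Prior: each bowl is equally likely to be chosen.
p_bowl = {"Bowl 1": Fraction(1, 2), "Bowl 2": Fraction(1, 2)}
# Likelihood of drawing vanilla from each bowl.
p_vanilla_given_bowl = {"Bowl 1": Fraction(30, 40), "Bowl 2": Fraction(20, 40)}

# Normalizing constant p(vanilla), summed over both bowls.
p_vanilla = sum(p_bowl[b] * p_vanilla_given_bowl[b] for b in p_bowl)

# Posterior p(Bowl 1 | vanilla) by Bayes's Theorem.
p_bowl1_given_vanilla = p_bowl["Bowl 1"] * p_vanilla_given_bowl["Bowl 1"] / p_vanilla
print(p_bowl1_given_vanilla)  # 3/5
```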

## M&M Problem
**In 1994, the color mix in a bag of M&M's was 30% brown, 20% yellow, 20% red, 10% green, 10% orange, and 10% tan. In 1996, the color mix was 24% blue, 20% green, 16% orange, 14% yellow, 13% red, and 13% brown. Suppose we have two bags of M&M's, one from 1994 and the other from 1996. Suppose we draw a candy from each bag. One is yellow and the other is green. We don't know which comes from which bag. What is the probability that the yellow one is from the 1994 bag (and therefore the green one is from the 1996 bag)?**

We will use the diachronic interpretation of Bayes's Theorem. The hypotheses are as follows:

- $A$: Yellow M&M comes from 1994 bag (and therefore the green one comes from the 1996 bag)
- $B$: Yellow M&M comes from 1996 bag (and therefore the green one comes from the 1994 bag)

We calculate each term:
\begin{tabular}{|c|c|c|c|c|}
    \hline
    & prior & likelihood & unnormalized posterior & posterior \\
    & $p(H)$ & $p(D|H)$ & $p(H)p(D|H)$ & $p(H|D)$ \\
    \hline
    $A$ & 1/2 & (0.2)(0.2) & 0.02 & 0.02/(0.02 + 0.007) = 20/27 \\
    \hline
    $B$ & 1/2 & (0.1)(0.14) & 0.007 & 0.007/(0.02 + 0.007) = 7/27 \\
    \hline
\end{tabular}

So the probability that the yellow M&M came from the 1994 bag is 20/27, or about 74%.
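The table can be reproduced with a short script. This is a sketch under the assumptions of the problem statement; the dictionary and variable names are illustrative.

```python
from fractions import Fraction

# Color mixes from the problem statement, in percent.
mix_1994 = {"brown": 30, "yellow": 20, "red": 20, "green": 10, "orange": 10, "tan": 10}
mix_1996 = {"blue": 24, "green": 20, "orange": 16, "yellow": 14, "red": 13, "brown": 13}

def p(mix, color):
    """Probability of drawing `color` from a bag with the given mix."""
    return Fraction(mix[color], 100)

priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
likelihoods = {
    "A": p(mix_1994, "yellow") * p(mix_1996, "green"),  # yellow from 1994, green from 1996
    "B": p(mix_1996, "yellow") * p(mix_1994, "green"),  # yellow from 1996, green from 1994
}

# Multiply prior by likelihood, then normalize.
unnorm = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnorm.values())
posteriors = {h: u / total for h, u in unnorm.items()}
print(posteriors["A"], posteriors["B"])  # 20/27 7/27
```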

## Monty Hall Problem

**There are three doors (A, B, C). Behind one is a car and behind the other two are goats. Suppose you choose Door A. Before Monty reveals what's behind Door A, he opens B or C (whichever does not hide the car; if neither does, he picks one at random). Should you stick with your original choice or switch?**

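We define the three hypotheses $A$, $B$, and $C$ corresponding to which door the car is behind. Suppose he opened door B to reveal a goat; this is the data $D$. If the car is behind A, Monty opens B or C at random, so $p(D|A) = 1/2$. If the car is behind B, he cannot open B, so $p(D|B) = 0$. If the car is behind C, he must open B, so $p(D|C) = 1$.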

We calculate each term:
\begin{tabular}{|c|c|c|c|c|}
    \hline
    & prior & likelihood & unnormalized posterior & posterior \\
    & $p(H)$ & $p(D|H)$ & $p(H)p(D|H)$ & $p(H|D)$ \\
    \hline
    $A$ & 1/3 & 1/2 & 1/6 & (1/6)/(1/6 + 0 + 1/3) = 1/3 \\
    \hline
    $B$ & 1/3 & 0 & 0 & 0/(1/6 + 0 + 1/3) = 0 \\
    \hline
    $C$ & 1/3 & 1 & 1/3 & (1/3)/(1/6 + 0 + 1/3) = 2/3 \\
    \hline
\end{tabular}

Since the probability of the car being behind door C is higher, we should switch.
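A simulation offers an independent check of this result. The sketch below assumes the rules stated in the problem (the player always picks A; Monty opens a goat door among B and C, choosing at random when both hide goats) and keeps only the trials in which Monty opens door B. The function name, trial count, and seed are arbitrary choices.

```python
import random

def simulate(trials=100_000, seed=0):
    """Estimate p(car behind C | Monty opened B) by simulation."""
    rng = random.Random(seed)
    opened_b = 0   # trials where Monty opened door B
    car_c = 0      # of those, trials where the car was behind C
    for _ in range(trials):
        car = rng.choice("ABC")
        # The player has picked A, so Monty opens B or C.
        if car == "A":
            opened = rng.choice("BC")   # both hide goats: pick at random
        elif car == "B":
            opened = "C"                # he cannot reveal the car
        else:
            opened = "B"
        if opened == "B":
            opened_b += 1
            if car == "C":
                car_c += 1
    return car_c / opened_b

estimate = simulate()
print(round(estimate, 2))  # close to 2/3
```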

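Notice that if Monty's strategy were instead to open door B whenever it does not hide the car, then seeing him open door B would give no benefit to switching. The likelihood under $A$ becomes 1, the same as under $C$: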

\begin{tabular}{|c|c|c|c|c|}
    \hline
    & prior & likelihood & unnormalized posterior & posterior \\
    & $p(H)$ & $p(D|H)$ & $p(H)p(D|H)$ & $p(H|D)$ \\
    \hline
    $A$ & 1/3 & 1 & 1/3 & 1/2 \\
    \hline
    $B$ & 1/3 & 0 & 0 & 0 \\
    \hline
    $C$ & 1/3 & 1 & 1/3 & 1/2 \\
    \hline
\end{tabular}
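Both tables are special cases of a single calculation in which only $p(D|A)$ changes. The sketch below introduces a hypothetical parameter `q`, the probability that Monty opens door B when the car is behind door A; `q = 1/2` corresponds to the random strategy and `q = 1` to always preferring door B.

```python
from fractions import Fraction

def posterior_after_opening_b(q):
    """Posteriors for the car's location given that Monty opened door B.

    q: probability that Monty opens B when the car is behind A
       (hypothetical parameter describing Monty's strategy).
    """
    prior = Fraction(1, 3)
    # p(D|H): B is never opened if it hides the car; it must be opened if C does.
    likelihood = {"A": Fraction(q), "B": Fraction(0), "C": Fraction(1)}
    unnorm = {h: prior * l for h, l in likelihood.items()}
    total = sum(unnorm.values())
    return {h: u / total for h, u in unnorm.items()}

random_choice = posterior_after_opening_b(Fraction(1, 2))
always_b = posterior_after_opening_b(1)
print(random_choice["A"], random_choice["C"])  # 1/3 2/3
print(always_b["A"], always_b["C"])            # 1/2 1/2
```

For any `q`, the posterior for door C works out to $1/(1+q)$, so switching never does worse than sticking.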