## Probability Rules

- $\left | A \cap B \right | = \left | A \right | + \left | B \right | - \left | A \cup B \right |$
- Independence: $Pr(A|B) = Pr(A)$
- $Pr(A \cap B) = Pr(A) \cdot Pr(B)$, assuming independence
- $Pr(A \cap B) = Pr(A) \cdot Pr(B | A)$, no need for independence assumption
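
The rules above can be checked by brute-force enumeration. This is a minimal sketch, assuming two fair six-sided dice and two events picked purely for illustration (`A`: the first die is even, `B`: the dice sum to 7). Neither event comes from these notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two fair six-sided dice (assumed example).
outcomes = set(product(range(1, 7), repeat=2))

# Illustrative events, not taken from the notes.
A = {o for o in outcomes if o[0] % 2 == 0}  # first die is even
B = {o for o in outcomes if sum(o) == 7}    # the two dice sum to 7


def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(outcomes))


def cond_prob(event, given):
    """Pr(event | given) = Pr(event and given) / Pr(given)."""
    return prob(event & given) / prob(given)


# Counting rule: |A n B| = |A| + |B| - |A u B|
inclusion_exclusion_holds = len(A & B) == len(A) + len(B) - len(A | B)

# Independence: Pr(A | B) = Pr(A)
independent = cond_prob(A, B) == prob(A)

# Product rule, which holds with or without independence.
product_rule_holds = prob(A & B) == prob(A) * cond_prob(B, A)
```

Using `Fraction` keeps the comparisons exact, so the equality checks do not depend on floating-point rounding.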

- Birthday problem
    - 30 people at a party. Which is more probable: that at least two people share a birthday, or that no two people share a birthday?
    - Assume 365 days in a year, each equally likely as a birthday
    - Probability no two people have same birthday
    $$
        \begin{align}
            \frac{365 \cdot 364 \cdot 363 \cdots 336}{365^{30}} &= \frac{365!/335!}{365^{30}} \\
            &\approx 0.294
        \end{align} 
    $$

    - Probability at least two people have same birthday
        - 1 - 0.294 = 0.706
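
The arithmetic above can be reproduced in a few lines. This is a sketch under the same assumption of 365 equally likely birthdays; the function name `prob_all_distinct` is introduced here for illustration.

```python
import math


def prob_all_distinct(n_people, n_days=365):
    """Probability that n_people all have different birthdays."""
    if n_people > n_days:
        return 0.0
    # math.perm(n, k) = n! / (n - k)! counts ordered choices of distinct days.
    return math.perm(n_days, n_people) / n_days ** n_people


p_distinct = prob_all_distinct(30)
p_shared = 1 - p_distinct
print(round(p_distinct, 3), round(p_shared, 3))  # 0.294 0.706
```

`math.perm` requires Python 3.8 or later.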

- Bayes
$$
\begin{align}
Pr(A|B) &= \frac{Pr(A) * Pr(B|A)}{(Pr(A) * Pr(B|A)) + (Pr(A^{'}) * Pr(B | A^{'}))} \\
&= \frac{Pr(A) * Pr(B|A)}{Pr(B)} \\
&= \frac{Pr(B|A) \cdot Pr(A)} {Pr(B)}
\end{align}
$$
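
The denominator of the first line equals $Pr(B)$ by the law of total probability. A short derivation, using the product rule from the list of rules and the fact that $A$ and $A^{'}$ are disjoint and cover every case:

$$
\begin{align}
Pr(B) &= Pr(A \cap B) + Pr(A^{'} \cap B) \\
&= Pr(A) \cdot Pr(B|A) + Pr(A^{'}) \cdot Pr(B|A^{'})
\end{align}
$$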

- Some examples:
    - $$\begin{align}
        Pr(\text{Heads} | \text{Heads Prev}) &= \frac{Pr(\text{Heads Prev} | \text{Heads}) * Pr(\text{Heads})} {Pr(\text{Heads Prev} | \text{Heads}) * Pr(\text{Heads}) + Pr(\text{Heads Prev} | \text{Not Heads}) * Pr(\text{Not Heads})} \\
        &= \frac{0.5 * 0.5}{0.5 * 0.5 + 0.5 * 0.5} \\
        &= 0.5
        \end{align}$$
    - $$\begin{align}
        Pr(\text{spam} | \text{"lottery"}) &= \frac{Pr(\text{"lottery"} | \text{spam}) * Pr(\text{spam})} {Pr(\text{"lottery"} | \text{spam}) * Pr(\text{spam}) + Pr(\text{"lottery"} | \text{not spam}) * Pr(\text{not spam})}
        \end{align}$$
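
Both examples are instances of one small function. This is a minimal sketch: the coin inputs are the 0.5 values used above, while the spam inputs (20% of mail is spam, "lottery" appears in 10% of spam and 0.5% of non-spam) are hypothetical numbers chosen only to make the formula concrete.

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return Pr(A | B) using Bayes' rule with Pr(B) expanded over A and not-A."""
    # Pr(B) = Pr(A) Pr(B|A) + Pr(A') Pr(B|A')
    p_b = prior_a * p_b_given_a + (1 - prior_a) * p_b_given_not_a
    if p_b == 0:
        raise ValueError("Pr(B) is zero, so Pr(A | B) is undefined")
    return prior_a * p_b_given_a / p_b


# Coin example: the previous flip tells us nothing about the next one.
p_heads = bayes_posterior(prior_a=0.5, p_b_given_a=0.5, p_b_given_not_a=0.5)

# Spam example with hypothetical (made-up) inputs.
p_spam_given_lottery = bayes_posterior(
    prior_a=0.2,            # assumed Pr(spam)
    p_b_given_a=0.1,        # assumed Pr("lottery" | spam)
    p_b_given_not_a=0.005,  # assumed Pr("lottery" | not spam)
)
```

With these assumed inputs the posterior for spam comes out at about 0.83, well above the 0.2 prior, because "lottery" is far more common in spam than in other mail.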

- Naive Bayes model
    - Let's extend the "spam" model above. Suppose we want to know $Pr(\text{spam} | \text{"lottery"} \& \text{"prize"})$
        - From the Bayes formula, we need to compute $Pr(\text{"lottery"} \& \text{"prize"} | \text{spam})$ and $Pr(\text{"lottery"} \& \text{"prize"} | \text{not spam})$
        - Because both words must now appear, fewer emails match, so there is less data to estimate each probability from
    - Imagine doing this for hundreds of words. Estimating the joint probability directly becomes almost impossible, because it is very unlikely that any single email contains all 100 words, even if each word carries some information on its own.
    
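    - The Naive Bayes model simplifies this by assuming that the occurrences of these words are independent of each other given the class (spam or not spam). That is, $Pr(\text{"lottery"} \& \text{"prize"} | \text{spam}) = Pr(\text{"lottery"} | \text{spam}) * Pr(\text{"prize"} | \text{spam})$, and likewise for not spam. We substitute this product wherever the joint term appears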

    - This reduces the formula to
    $$\begin{align}
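        Pr(\text{spam} | \text{"lottery"} \& \text{"prize"}) &= \frac{Pr(\text{"lottery"} \& \text{"prize"} | \text{spam}) * Pr(\text{spam})}{Pr(\text{"lottery"} \& \text{"prize"} | \text{spam}) * Pr(\text{spam}) + Pr(\text{"lottery"} \& \text{"prize"} | \text{not spam}) * Pr(\text{not spam})} \\
        &= \frac{Pr(\text{"lottery"} | \text{spam}) * Pr(\text{"prize"} | \text{spam}) * Pr(\text{spam})}{Pr(\text{"lottery"} | \text{spam}) * Pr(\text{"prize"} | \text{spam}) * Pr(\text{spam}) + Pr(\text{"lottery"} | \text{not spam}) * Pr(\text{"prize"} | \text{not spam}) * Pr(\text{not spam})}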
    \end{align}$$
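
The final formula translates almost directly into code. This is a sketch only: the function name `naive_bayes_posterior` and every probability in the two dictionaries are hypothetical, invented to show the mechanics. In practice the per-word probabilities would be estimated from labelled emails.

```python
import math


def naive_bayes_posterior(words, p_word_given_spam, p_word_given_not_spam, p_spam):
    """Pr(spam | all words present), assuming words are independent given the class."""
    # Under the naive assumption, each joint likelihood is a product of per-word terms.
    like_spam = math.prod(p_word_given_spam[w] for w in words)
    like_not_spam = math.prod(p_word_given_not_spam[w] for w in words)
    numerator = like_spam * p_spam
    denominator = numerator + like_not_spam * (1 - p_spam)
    return numerator / denominator


# Hypothetical per-word probabilities, made up for illustration only.
p_word_given_spam = {"lottery": 0.2, "prize": 0.1}
p_word_given_not_spam = {"lottery": 0.01, "prize": 0.02}

posterior = naive_bayes_posterior(
    ["lottery", "prize"], p_word_given_spam, p_word_given_not_spam, p_spam=0.5
)
```

Each word needs only two per-class estimates, so adding a word adds two numbers to estimate rather than requiring emails that contain every word at once. For long word lists, summing logarithms instead of multiplying probabilities avoids floating-point underflow.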
