# 1. Conditional Probability
- Conditional probability
    + Measures the probability of an event given that another event has occurred
    + Describes how the occurrence of one or more events changes the probability of another event

- Example
    + Given
        + X: It rains today, $P(X) = 30\%$
        + Y: You go outside, $P(Y) = 50\%$
    + Conditional probability
        + $P(Y|X)$: Chance you go outside given it rains today
        + $P(X|Y)$: Chance it rains today given you go outside

\begin{align*}
    P(Y|X) &= \frac{P(\text{X and Y})}{P(X)} \\
    P(\text{X and Y}) &= P(X)*P(Y|X) = P(Y)*P(X|Y)
\end{align*}

- **Problem**: What is the probability of drawing 2 Kings in a row from a standard 52-card deck, without replacement?
- **Solution**
    + X: Draw the 1st card = King => $P(X) = \frac{4}{52}$
    + Y: Draw the 2nd card = King
    + Y|X: Draw the 2nd card = King given the 1st card = King => $P(Y|X) = \frac{3}{51}$
    + Probability of getting 2 Kings:  
        $P(\text{X and Y}) = P(X)*P(Y|X) = \frac{4}{52} * \frac{3}{51} = \frac{1}{221}$
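
The chain rule used above can be checked numerically. The following is a minimal sketch: the exact part uses Python's `fractions` module, and the Monte Carlo part (the helper `simulate_two_kings`, its `trials` and `seed` parameters, and the deck encoding are illustrative assumptions, not part of the original notes) draws two cards without replacement many times.

```python
from fractions import Fraction
import random

# Exact computation with the chain rule P(X and Y) = P(X) * P(Y|X)
p_first_king = Fraction(4, 52)               # P(X): 4 Kings among 52 cards
p_second_king_given_first = Fraction(3, 51)  # P(Y|X): 3 Kings left among 51 cards
p_two_kings = p_first_king * p_second_king_given_first
print(p_two_kings)  # 1/221


def simulate_two_kings(trials=100_000, seed=0):
    """Estimate P(two Kings) by drawing 2 cards without replacement."""
    rng = random.Random(seed)
    deck = ["K"] * 4 + ["other"] * 48  # only King / not King matters here
    hits = 0
    for _ in range(trials):
        first, second = rng.sample(deck, 2)  # sample() never reuses a card
        if first == "K" and second == "K":
            hits += 1
    return hits / trials


# The estimate should land close to 1/221 (about 0.0045)
print(simulate_two_kings())
```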

# 2. Bayes Probability

## 2.1 Discrete Bayes Distribution
- Marginal Distribution: $P(A)$, $P(B)$
- Joint Distribution: $P(A \cap B)$
- Conditional distribution: $P(A|B)$, $P(B|A)$
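- Let
    + The events $A_i$ be mutually exclusive and together cover $\Omega_A$: $\bigcup\limits_i A_i = \Omega_A$, so $\sum\limits_iP(A_i) = 1$
    + The events $B_j$ be mutually exclusive and together cover $\Omega_B$: $\bigcup\limits_j B_j = \Omega_B$, so $\sum\limits_jP(B_j) = 1$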

#### Joint distribution
$$P(A_i \cap B_j) = P(A_i)P(B_j|A_i) = P(B_j)P(A_i|B_j)$$ 

#### Marginal distribution
$$P(A_i) = \sum\limits_jP(A_i\cap B_j) = \sum\limits_j \left[P(B_j)P(A_i|B_j) \right],\ \text{with} \sum\limits_jP(B_j) = 1$$
$$P(B_j) = \sum\limits_iP(A_i\cap B_j) = \sum\limits_i \left[P(A_i)P(B_j|A_i) \right],\ \text{with} \sum\limits_iP(A_i) = 1$$

#### Conditional distribution

$$P(A_i|B_j) = \frac{P(A_i \cap B_j)}{P(B_j)} = \frac{P(A_i)P(B_j|A_i)}{ \sum\limits_i \left[P(A_i)P(B_j|A_i) \right]},\ \text{with} \sum\limits_iP(A_i) = 1$$

$$P(B_j|A_i) = \frac{P(A_i \cap B_j)}{P(A_i)} = \frac{P(B_j)P(A_i|B_j)}{\sum\limits_j \left[P(B_j)P(A_i|B_j) \right]},\ \text{with} \sum\limits_jP(B_j) = 1$$
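
The relations between joint, marginal, and conditional distributions can be sketched on a small table. The 2x2 joint table below is made up for illustration (it is not from the notes), and the helper names `p_a_given_b`, `p_b_given_a`, and `bayes_a_given_b` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(A_i and B_j): rows are A_i, columns are B_j
joint = [
    [F(1, 10), F(2, 10)],
    [F(3, 10), F(4, 10)],
]

# Marginals: sum the joint over the other variable
p_a = [sum(row) for row in joint]        # P(A_i) = sum_j P(A_i and B_j)
p_b = [sum(col) for col in zip(*joint)]  # P(B_j) = sum_i P(A_i and B_j)


def p_a_given_b(i, j):
    """P(A_i | B_j) = P(A_i and B_j) / P(B_j)."""
    return joint[i][j] / p_b[j]


def p_b_given_a(j, i):
    """P(B_j | A_i) = P(A_i and B_j) / P(A_i)."""
    return joint[i][j] / p_a[i]


def bayes_a_given_b(i, j):
    """P(A_i | B_j) rebuilt from P(A_k) and P(B_j | A_k) only."""
    numerator = p_a[i] * p_b_given_a(j, i)
    evidence = sum(p_a[k] * p_b_given_a(j, k) for k in range(len(p_a)))
    return numerator / evidence


print(p_a_given_b(0, 0), bayes_a_given_b(0, 0))  # 1/4 1/4
```

Both routes give the same value because the denominator in the Bayes form is just the marginal $P(B_j)$ written as a sum.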

## 2.2 Continuous Bayes Distribution

#### Joint Distribution
- The joint pdf of X and Y is defined as
    + p = pdf (probability density function)

$$p(x_0, y_0) = \lim\limits_{\substack{\Delta x \to 0 \\ \Delta y \to 0}} \frac{P(X \in [x_0, x_0 + \Delta x] \cap Y \in [y_0, y_0 + \Delta y] )}{\Delta x \Delta y} $$

- Joint pdf

$$p(x,y)$$

#### Marginal distribution

$$p(x) = \int\limits_{-\infty}^{\infty} p(x,y)dy = \int\limits_{-\infty}^{\infty} p(y)p(x|y)dy$$
$$p(y) = \int\limits_{-\infty}^{\infty} p(x,y)dx = \int\limits_{-\infty}^{\infty} p(x)p(y|x)dx$$

#### Conditional distribution

$$p(x|y) = \frac{p(x,y)}{p(y)} = \frac{p(x)p(y|x)}{\int\limits_{-\infty}^{\infty} p(x,y)dx} = \frac{p(x)p(y|x)}{\int\limits_{-\infty}^{\infty} p(x)p(y|x)dx}$$
$$p(y|x) = \frac{p(x,y)}{p(x)} = \frac{p(y)p(x|y)}{\int\limits_{-\infty}^{\infty} p(x,y)dy} = \frac{p(y)p(x|y)}{\int\limits_{-\infty}^{\infty} p(y)p(x|y)dy}$$
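
The continuous formulas can be checked numerically on a case with a known answer. This sketch assumes a standard normal prior $p(x)$ and a likelihood $p(y|x)$ that is normal with mean $x$ and variance 1; these choices, the midpoint-rule integrator, and all function names are assumptions made for illustration. For this model the marginal $p(y)$ is normal with mean 0 and variance 2, and the posterior $p(x|y)$ is normal with mean $y/2$ and variance $1/2$.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def prior(x):
    """Assumed prior p(x): standard normal."""
    return normal_pdf(x, 0.0, 1.0)


def likelihood(y, x):
    """Assumed likelihood p(y|x): normal centred on x with variance 1."""
    return normal_pdf(y, x, 1.0)


def integrate(f, lo=-10.0, hi=10.0, steps=4000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width


def marginal_y(y):
    """p(y) = integral of p(x) p(y|x) dx."""
    return integrate(lambda x: prior(x) * likelihood(y, x))


def posterior(x, y):
    """p(x|y) = p(x) p(y|x) / p(y)."""
    return prior(x) * likelihood(y, x) / marginal_y(y)


# Compare the numerical results with the known closed forms
print(marginal_y(1.0), normal_pdf(1.0, 0.0, 2.0))
print(posterior(0.5, 1.0), normal_pdf(0.5, 0.5, 0.5))
```

The infinite integration limits are truncated to $[-10, 10]$, which is harmless here because the integrand is negligible outside that range.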


# 3. Bayes Theorem
- A way of finding $P(A|B)$ when we know $P(B|A)$, $P(A)$, and $P(B)$

$$P(A|B) = \frac{P(B|A)*P(A)}{P(B)} = \frac{P(B|A)*P(A)}{P(B|A)*P(A)\ +\ P(B|\bar{A})*P(\bar{A})}$$
- Notation
    + $P(A|B)$: How often A happens given that B happens
    + $P(B|A)$: How often B happens given that A happens
    + $P(A)$: How likely A is, on its own
    + $P(B)$: How likely B is, on its own

- Example: Spam filter
    + By analyzing the words in a message, we can compute its probability of being spam using Bayes’ Theorem
    + $P(spam|words) = \frac{P(spam)*P(words|spam)}{P(words)}$

#### Problem 1
- **Problem**: A test for having an allergy
    + The test is 80% accurate for people who have the allergy: it says “Positive” for 80% of them
    + The test says “Positive” for 10% of all patients
    + 1% of the population actually has the allergy
- **Solution**
    + $P(Allergy) = 0.01$: probability that a person has the allergy
    + $P(Positive|Allergy) = 0.8$: probability that the test says “Positive” given that the person has the allergy
    + $P(Positive) = 0.1$: probability of the test saying “Positive” to a random person
    + Bayes Theorem
        + $P(Allergy|Positive)$: The chance that a person actually has the allergy given a positive test
        + $P(Allergy|Positive) = \frac{P(Allergy)*P(Positive|Allergy)}{P(Positive)} = \frac{0.01*0.8}{0.1} = 8\%$
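
The arithmetic can be reproduced with a few lines of Python. This is a minimal sketch; the function name `bayes` and its parameter names are illustrative. The last value, the false positive rate implied by the three given numbers, is derived here and is not stated in the problem.

```python
def bayes(prior, likelihood, evidence):
    """Posterior P(A|B) = P(A) * P(B|A) / P(B)."""
    return prior * likelihood / evidence


p_allergy = 0.01                 # P(Allergy)
p_positive_given_allergy = 0.8   # P(Positive|Allergy)
p_positive = 0.1                 # P(Positive)

p_allergy_given_positive = bayes(p_allergy, p_positive_given_allergy, p_positive)
print(f"{p_allergy_given_positive:.1%}")  # 8.0%

# Derived (not given in the problem): P(Positive | no Allergy), obtained from
# P(Positive) = P(Positive|Allergy) P(Allergy) + P(Positive|no Allergy) P(no Allergy)
implied_false_positive_rate = (
    p_positive - p_positive_given_allergy * p_allergy
) / (1 - p_allergy)
print(f"{implied_false_positive_rate:.1%}")  # 9.3%
```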

#### Problem 2
- **Problem**: A test to detect a disease
    + 0.1% of the population has this disease
    + The test is 99% effective in detecting an infected person
    + The test's false positive rate is 0.5%
    + If a person tests positive for the disease, what is the probability that they actually have it?
- **Solution**
    + $P(D)\ =\ 0.001$: Probability that a person has the disease
    + $P(T|D)\ =\ 0.99$: Probability of a positive test given that the person has the disease
    + $P(T|\bar{D})\ =\ 0.005$: Probability of a positive test given that the person does not have the disease
    + Bayes theorem
        + $P(D|T)$: Probability that a person has the disease given a positive test
        + $P(D|T)\ =\ \frac{P(T|D)*P(D)}{P(T|D)*P(D)\ +\ P(T|\bar{D})*P(\bar{D})} = \frac{0.99*0.001}{0.99*0.001\ +\ 0.005*0.999}\ =\ 16.5\%$
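
The same result can be seen by counting people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 1,000,000 (any size works; this one keeps every count a whole number) and applies the three rates from the problem.

```python
population = 1_000_000  # hypothetical population size, chosen for round counts

infected = population * 1 // 1000      # 0.1% have the disease
healthy = population - infected        # everyone else

true_positives = infected * 99 // 100  # 99% of infected people test positive
false_positives = healthy * 5 // 1000  # 0.5% of healthy people test positive

# P(D|T): among all positive tests, the share that comes from infected people
p_disease_given_positive = true_positives / (true_positives + false_positives)

print(true_positives, false_positives)    # 990 4995
print(f"{p_disease_given_positive:.1%}")  # 16.5%
```

Even with a 99% detection rate, false positives from the much larger healthy group outnumber true positives about five to one, which is why the posterior stays low.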

[More problems](http://gtribello.github.io/mathNET/bayes-theorem-problems.html)

# 4. [Bayes Theorem in terms of Hypothesis and Evidence](https://www.youtube.com/watch?v=HZGCoVF3YvM)
- $P(H|E)\ =\ \frac{P(H)\ \cdot\ P(E|H)}{P(E)}$
    + $P(H)$: Probability that the **Hypothesis** is true (before any evidence)
    + $P(E|H)$: Probability of seeing the **Evidence** if the **Hypothesis** is true
    + $P(E)$: Probability of seeing the **Evidence**
    + $P(H|E)$: Probability that the **Hypothesis** is true given the **Evidence**

## Example Problem
- Steve is shy and withdrawn. He is an introvert with a need for order and structure, and a passion for detail.
    + Is Steve more likely to be a librarian or a farmer?
    + Assume Steve can only be a librarian or a farmer; there is no other option

#### Normal people
- Most people guess that Steve is a librarian, based on his characteristics

#### Frequentist stats
- The ratio of farmers/librarians in the US is 20/1
- Based on this ratio alone, Steve is likely a farmer

#### Bayesian stats
- Prior: from the fact that the ratio of farmers/librarians in the US is 20/1
    + P(farmer) = 20/21 = 95.24 %
    + P(librarian) = 1/21 = 4.76 %

- Data: We need data on how common Steve's characteristics are in each group. Suppose we have
    + P(char | farmer) = 10 %
        + The fraction of farmers who have Steve's characteristics
    + P(char | librarian) = 40 %
        + The fraction of librarians who have Steve's characteristics

- Apply Bayes theorem
    + P(farmer | char): Probability that Steve is a farmer given that he has these characteristics
        + $\begin{split}
            P({farmer}\ | {char}) &= \frac{P({farmer})\ \cdot\ P({char}\ |\ {farmer})}{P({char})} \\
            &= \frac{P({farmer})\ \cdot\ P({char}\ |\ {farmer})}{P({farmer})\ \cdot\ P({char}\ |\ {farmer})\ +\ P({\overline{farmer}})\ \cdot\ P(char\ |\ \overline{farmer})} \\
            &= \frac{P({farmer})\ \cdot\ P({char}\ |\ {farmer})}{P({farmer})\ \cdot\ P({char}\ |\ {farmer})\ +\ P({librarian})\ \cdot\ P({char}\ |\ librarian)} \\
            &= \frac{\frac{20}{21} \cdot 0.1}{\frac{20}{21} \cdot 0.1\ +\ \frac{1}{21} \cdot 0.4} = \frac{5}{6} \approx 83.33 \%
        \end{split}$

    + P(librarian | char): Probability that Steve is a librarian given that he has these characteristics
        + $\begin{split}
            P({librarian}\ | {char}) &= \frac{P({librarian})\ \cdot\ P({char}\ |\ {librarian})}{P({char})} \\
            &= \frac{P({librarian})\ \cdot\ P({char}\ |\ {librarian})}{P({librarian})\ \cdot\ P({char}\ |\ {librarian})\ +\ P({\overline{librarian}})\ \cdot\ P(char\ |\ \overline{librarian})} \\
            &= \frac{P({librarian})\ \cdot\ P({char}\ |\ {librarian})}{P({librarian})\ \cdot\ P({char}\ |\ {librarian})\ +\ P({farmer})\ \cdot\ P({char}\ |\ farmer)} \\
            &= \frac{\frac{1}{21} \cdot 0.4}{\frac{1}{21} \cdot 0.4\ +\ \frac{20}{21} \cdot 0.1} = \frac{1}{6} \approx 16.67 \%
        \end{split}$
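
The two posteriors can be computed together by normalizing prior times likelihood over all hypotheses. This is a minimal sketch using exact fractions; the function name `posterior` and the dictionary layout are illustrative choices, while the priors (20/21, 1/21) and likelihoods (10%, 40%) are the values assumed in the example above.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return P(H|E) for every hypothesis H, given P(H) and P(E|H)."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())  # P(E) by the law of total probability
    return {h: value / evidence for h, value in unnormalized.items()}


priors = {"farmer": Fraction(20, 21), "librarian": Fraction(1, 21)}
likelihoods = {"farmer": Fraction(1, 10), "librarian": Fraction(4, 10)}

result = posterior(priors, likelihoods)
for hypothesis, p in result.items():
    print(f"{hypothesis}: {p} = {float(p):.2%}")
# farmer: 5/6 = 83.33%
# librarian: 1/6 = 16.67%
```

Working with `Fraction` avoids the rounding that creeps in when the priors are first rounded to 95.24% and 4.76%.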

#### Conclusion
- In this case
    + Hypothesis = Steve is a librarian (or Steve is a farmer) 
    + Evidence (data) = Steve's characteristics
- Bayes theorem is a tool to update our belief (Hypothesis) based on new data (Evidence)