## Motivating Example

The ELISA test is used to screen blood for HIV.

- When the blood contains HIV, it gives a positive result 98% of the time.
- When the blood does not contain HIV, it gives a negative result 94% of the time.

The prevalence of HIV is about 1% in the adult male population. A patient has just tested positive and wants to know the probability that he has HIV. What would you tell him?

## Theory

- Theorem 9.1: Bayes theorem is actually quite intuitive. I'll state it in its raw form first, and we'll break down the details later. It holds whenever $Pr(B) > 0$.

$$
    Pr(A|B) = \frac{Pr(B|A) \cdot Pr(A)}{Pr(B)}
$$

- The derivation of Bayes theorem is actually very simple.
    - We want to compute the conditional probability $Pr(A|B)$
    - Let's picture the conditional probability as a Venn diagram, with A and B as intersecting circles
        - In the Venn diagram, $Pr(A|B)$ is simply the area of the intersection of A and B, divided by the area of B
        - Intuitively, since you are already "given" that B has occurred, your probability space should only include the area covered by B. That is, $Pr(B)$
        - Hence, $Pr(A|B)$ is simply the area where both A and B occur (i.e. $Pr(A,B)$) as a proportion of the area where B occurs, $Pr(B)$
        - $Pr(A|B) = \frac{Pr(A,B)}{Pr(B)}$
    
    - We now have an intermediate representation that is somewhat closer to the Bayes theorem formula above. Let's think a little more closely about $Pr(A,B)$
        - We know that $Pr(A,B)$ is the intersection of circles A and B in the Venn diagram
        - We also know from the argument above that $Pr(A|B) = \frac{Pr(A,B)}{Pr(B)}$
        - By symmetry, this must also mean that $Pr(B|A) = \frac{Pr(A,B)}{Pr(A)}$
        - Rearranging, it must follow that $Pr(A,B) = Pr(A|B)\cdot Pr(B) = Pr(B|A)\cdot Pr(A)$
        - Hence, $Pr(A|B) = \frac{Pr(A,B)}{Pr(B)} = \frac{Pr(B|A)\cdot Pr(A)}{Pr(B)}$

- Let's extend the logic above to a more realistic scenario
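    - Assume $A$ can take on values $A_1$, $A_2$, and $A_3$. That is, the events $A_1$, $A_2$, and $A_3$ form a partition of the sample space: they are mutually exclusive and exactly one of them must occur (see section 8 notes)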
    - We want to know, what is $Pr(A_1 | B)$
    - Let's further suppose that $Pr(B | A_1)$, $Pr(B | A_2)$, and $Pr(B | A_3)$ are known
    - In such a case, given that A **must** be either $A_1$, $A_2$, or $A_3$
        - $Pr(A_1) + Pr(A_2) + Pr(A_3) = 1$
        - $Pr(B | A_1) * Pr(A_1) + Pr(B | A_2) * Pr(A_2) + Pr(B | A_3) * Pr(A_3) = Pr(B)$
        - $B$ must occur with $A_1$, $A_2$, or $A_3$, so summing the weighted probabilities that $B$ occurs conditioning on each value of $A$ must give you the probability of $B$. See section 8 notes, law of total probability
    - $Pr(A_1 | B)$ is the probability that both $A_1$ and $B$ occur, as a proportion of the probability that $B$ occurs
    - So this is simply $\frac{Pr(B | A_1) * Pr(A_1)}{Pr(B | A_1) * Pr(A_1) + Pr(B | A_2) * Pr(A_2) + Pr(B | A_3) * Pr(A_3)} = \frac{Pr(B | A_1) * Pr(A_1)}{Pr(B)}$
    - Bayes theorem pops up naturally!
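
The partition form above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `bayes_posteriors` and the example numbers are made up for this sketch and do not come from the notes.

```python
import math


def bayes_posteriors(priors, likelihoods):
    """Return Pr(A_i | B) for every event A_i in a partition.

    priors[i] is Pr(A_i); likelihoods[i] is Pr(B | A_i).
    """
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("priors must sum to 1 for a partition")
    # Joint probabilities: Pr(A_i, B) = Pr(B | A_i) * Pr(A_i)
    joints = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Law of total probability: Pr(B) is the sum of the joint probabilities
    pr_b = sum(joints)
    if pr_b == 0:
        raise ValueError("Pr(B) is 0, so Pr(A_i | B) is undefined")
    return [joint / pr_b for joint in joints]


# Made-up numbers, purely for illustration (not from the notes)
priors = [0.5, 0.3, 0.2]        # Pr(A_1), Pr(A_2), Pr(A_3)
likelihoods = [0.1, 0.4, 0.5]   # Pr(B | A_1), Pr(B | A_2), Pr(B | A_3)
posteriors = bayes_posteriors(priors, likelihoods)
print(posteriors)
```

Because every posterior shares the same denominator $Pr(B)$, the returned values always sum to 1, which is a handy sanity check.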
    

### Solving the motivating question

- Let $P$ represent whether the test returns positive
- Let $H$ represent whether the patient has HIV

- Given
    - $Pr(P | H) = 0.98$
    - $Pr(!P | !H) = 0.94$
    - $Pr(H) = 0.01$

- Deduced
    - $Pr(!P | H) = 0.02$
    - $Pr(P | !H) = 0.06$
    - $Pr(!H) = 0.99$

- Question: What is $Pr(H | P)$?

- For simplicity, let's assume a population of 10000 people
    - 0.01 * 10000 = 100 people have HIV, and 0.99 * 10000 = 9900 do not
    - When applying the test to the 100 HIV+ people, 0.98 * 100 = 98 will be positive, and 0.02 * 100 = 2 will not be positive
    - When applying the test to the 9900 HIV- people, 0.06 * 9900 = 594 will be positive, and 0.94 * 9900 = 9306 will not be positive
    - In total, 594 + 98 = 692 people test positive
    - 98/692 ≈ 14.2% of people who test positive actually have HIV
    - Generally, we are trying to find $\frac{Pr(H, P)}{Pr(H, P) + Pr(!H, P)}$, i.e. the true positives as a proportion of all positives


- Let's generalise this procedure

$$\begin{align}
    Pr(H|P) &= \frac{Pr(P|H) * Pr(H)}{Pr(P)} \\
    &= \frac{Pr(H, P)}{Pr(H, P) + Pr(!H, P)} \\
    &= \frac{Pr(P|H) * Pr(H)}{Pr(P|H) * Pr(H) + Pr(P|!H) * Pr(!H)} \\
    &= \frac{0.98 * 0.01}{(0.98 * 0.01) + (0.06 * 0.99)} \\
    &= \frac{0.0098}{0.0692} \\
    &\approx 0.142
\end{align}$$
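
As a quick check, the same calculation can be reproduced in Python, both with the formula and with the 10000-person head count. This is only a sketch: the function name `posterior_given_positive` and its parameter names are hypothetical, chosen for this example.

```python
def posterior_given_positive(sensitivity, specificity, prevalence):
    """Pr(H | P) via Bayes theorem, using the partition {H, !H}."""
    true_pos = sensitivity * prevalence                # Pr(P | H) * Pr(H)
    false_pos = (1 - specificity) * (1 - prevalence)   # Pr(P | !H) * Pr(!H)
    return true_pos / (true_pos + false_pos)


sensitivity = 0.98   # Pr(P | H)
specificity = 0.94   # Pr(!P | !H)
prevalence = 0.01    # Pr(H)

p = posterior_given_positive(sensitivity, specificity, prevalence)
print(round(p, 3))  # 0.142

# Same answer by counting heads in a population of 10000
population = 10_000
hiv_pos = round(prevalence * population)                            # 100
true_pos_count = round(sensitivity * hiv_pos)                       # 98
false_pos_count = round((1 - specificity) * (population - hiv_pos)) # 594
p_by_count = true_pos_count / (true_pos_count + false_pos_count)
print(round(p_by_count, 3))  # 0.142
```

Even with a fairly accurate test, the low prevalence means false positives (594) far outnumber true positives (98), which is why the posterior is only about 14%.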
