# Bayes' Theorem
__Math 3080: Fundamentals of Data Science__

Reading:
* [Matt Parker and Hannah Fry on Bayes' Theorem](https://www.youtube.com/watch?v=7GgLSnQ48os)

-----
## The Theorem
Thomas Bayes came up with a thought experiment and wrote it down. After he died, a family friend, Richard Price, found it. Mathematicians eventually came up with a formula to explain this thought experiment, which came to be known as __Bayes' Theorem__.

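To build up to the theorem, start with the definitions of probability and conditional probability, and the multiplication rule: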
$$P(A) = \frac{\text{\# of outcomes in A}}{\text{Total \# of outcomes}}$$
$$P(B|A) = \frac{\text{\# of outcomes in both A and B}}{\text{\# of outcomes in A}}$$
$$P(A~and~B) = P(A)P(B|A) \qquad P(B~and~A) = P(B)P(A|B)$$

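As an example, draw one card from a standard 52-card deck. Let $Q$ be the event that the card is a queen and $H$ the event that it is a heart: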
$$P(Q) = 4/52 = 1/13 \qquad P(H) = 13/52 = 1/4$$
$$P(H|Q) = 1/4$$
$$P(Q~and~H) = P(Q)P(H|Q) = \frac{4}{52}\frac{1}{4} = \frac{1}{52}$$
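The card probabilities above can be checked by brute-force counting. The following is a minimal sketch, assuming a standard 52-card deck where $Q$ means "queen" and $H$ means "heart". The helper `prob` and the names `is_queen` and `is_heart` are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs.
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))

def prob(event, given=None):
    """P(event | given) by counting equally likely outcomes (hypothetical helper)."""
    # Restrict the sample space to the outcomes where `given` holds.
    space = [card for card in deck if given is None or given(card)]
    favorable = sum(1 for card in space if event(card))
    return Fraction(favorable, len(space))

def is_queen(card):
    return card[0] == 'Q'

def is_heart(card):
    return card[1] == 'hearts'

P_Q = prob(is_queen)                                      # 4/52
P_H = prob(is_heart)                                      # 13/52
P_H_given_Q = prob(is_heart, given=is_queen)              # 1 heart among 4 queens
P_Q_and_H = prob(lambda c: is_queen(c) and is_heart(c))   # only the queen of hearts

# The multiplication rule: P(Q and H) = P(Q) * P(H|Q)
assert P_Q_and_H == P_Q * P_H_given_Q

print(P_Q, P_H, P_H_given_Q, P_Q_and_H)  # 1/13 1/4 1/4 1/52
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand calculation.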

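Since "A and B" and "B and A" describe the same event, the two products in the multiplication rule are equal:
$$P(A)P(B|A) = P(B)P(A|B)$$
Dividing both sides by $P(A)$ gives:
$$P(B|A) = \frac{P(B)P(A|B)}{P(A)}$$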
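As a quick sanity check, the formula can be applied to the card example, assuming $Q$ is "the card is a queen" and $H$ is "the card is a heart". Taking $A = H$ and $B = Q$:
$$P(Q|H) = \frac{P(Q)P(H|Q)}{P(H)} = \frac{\frac{1}{13}\cdot\frac{1}{4}}{\frac{1}{4}} = \frac{1}{13}$$
This agrees with counting directly: exactly one of the 13 hearts is a queen.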

This is Bayes' Theorem. But what does it mean?

* Do the thought experiment
* Notice that no single piece of information determined what you believed was true. Rather, each piece of information *updated* your belief about what is true.
    * The accumulation of information led to a belief

So, what does this have to do with Bayes' experiment? Let's give names to A and B:
$$P(Belief|Information) = \frac{P(Belief)P(Information|Belief)}{P(Information)}$$

or perhaps another set of names:
$$P(Hypothesis|Evidence) = \frac{P(Hypothesis)P(Evidence|Hypothesis)}{P(Evidence)}$$

* $P(Hypothesis)$ is known as the __Prior__
* $P(Evidence|Hypothesis)$ is known as the __Likelihood__
* $P(Hypothesis|Evidence)$ is known as the __Posterior__
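The idea that each piece of evidence updates a belief can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original example. The function `update` is hypothetical, the probabilities are made-up numbers, and $P(Evidence)$ is expanded with the law of total probability over "hypothesis true" and "hypothesis false".

```python
from fractions import Fraction

def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(Hypothesis | Evidence) from Bayes' Theorem (hypothetical helper).

    P(Evidence) is computed with the law of total probability:
    P(E) = P(H) P(E|H) + P(not H) P(E|not H)
    """
    evidence = prior * p_evidence_if_true + (1 - prior) * p_evidence_if_false
    return prior * p_evidence_if_true / evidence

# Made-up numbers: start undecided, then see the same kind of evidence twice.
belief = Fraction(1, 2)              # prior P(Hypothesis)
p_if_true = Fraction(4, 5)           # likelihood P(Evidence | Hypothesis)
p_if_false = Fraction(3, 10)         # P(Evidence | not Hypothesis)

history = [belief]
for _ in range(2):
    # The posterior from one step becomes the prior for the next.
    belief = update(belief, p_if_true, p_if_false)
    history.append(belief)

print([str(b) for b in history])  # ['1/2', '8/11', '64/73']
```

No single observation settles the question; each one moves the belief a little further, which is the accumulation described above.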

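Take the example of a doctor trying to diagnose a patient. Here is the data from previous patients, where `Condition` is the test result and `Diagnosis` is 1 if the patient was diagnosed and 0 otherwise:
* Condition = 0 says the condition is not present
* Condition = 1 says the condition may or may not be present
* Condition = 2 says the condition is present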

In [10]:
import numpy as np
import pandas as pd
import seaborn as sns

numbers = {
    #'Condition':[0,0,1,0,2,1,0,2,2,1,0,0,2,2,1,2,0,2,1,0,1,1,2,2,0,0,1,0,2],
    #'Diagnosis':[0,0,1,0,1,1,0,1,1,0,0,1,0,1,1,0,0,1,1,0,0,1,1,1,0,1,0,0,0]
    'Condition':[0,0,1,0,2,1,0,2,2,1,0,0,2,2],
    'Diagnosis':[0,0,1,0,1,1,0,1,1,0,0,1,0,1]
}

numbers = pd.DataFrame(numbers)
display(numbers)

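| | Condition | Diagnosis |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 0 |
| 2 | 1 | 1 |
| 3 | 0 | 0 |
| 4 | 2 | 1 |
| 5 | 1 | 1 |
| 6 | 0 | 0 |
| 7 | 2 | 1 |
| 8 | 2 | 1 |
| 9 | 1 | 0 |

(First 10 of 14 rows shown.)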


In [14]:
#numbers.pivot_table(index='Condition', columns='Diagnosis', values=1, aggfunc='sum')
numbers_long = numbers.groupby(numbers.columns.tolist(), as_index=False).size()
nums = numbers_long.pivot_table(index='Diagnosis', columns='Condition', values='size')
nums

| Diagnosis \ Condition | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 5.0 | 1.0 | 1.0 |
| 1 | 1.0 | 2.0 | 4.0 |


In [31]:
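# What is the probability of being diagnosed if your test comes out as a 2?
# Bayes' Theorem: P(D=1 | C=2) = P(D=1) * P(C=2 | D=1) / P(C=2)

# Prior: P(D=1)
P_D = numbers[numbers['Diagnosis'] == 1]['Diagnosis'].count() / len(numbers)
# Likelihood: P(C=2 | D=1). Divide by the number of diagnosed patients,
# not by all patients (that would give the joint probability P(C=2 and D=1)).
P_CD = numbers[(numbers['Diagnosis'] == 1) & (numbers['Condition'] == 2)]['Diagnosis'].count() / numbers[numbers['Diagnosis'] == 1]['Diagnosis'].count()
# Evidence: P(C=2)
P_C = numbers[numbers['Condition'] == 2]['Diagnosis'].count() / len(numbers)

print(P_D, P_C, P_CD)

# Posterior: P(D=1 | C=2)
P_D1C2 = P_D*P_CD/P_C
print(f"{P_D1C2:0.2f}")

0.5 0.35714285714285715 0.5714285714285714
0.80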
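As a cross-check, the same conditional probability can be read straight from the counts: among patients whose test came out as 2, what fraction were diagnosed? The sketch below uses plain Python lists rather than the notebook's DataFrame so that it runs on its own. The variable names are chosen for this illustration, and the data is the 14-patient sample from the cell above.

```python
from collections import Counter
from fractions import Fraction

# Same 14 patients as in the DataFrame above.
condition = [0, 0, 1, 0, 2, 1, 0, 2, 2, 1, 0, 0, 2, 2]
diagnosis = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1]

# Count each (test result, diagnosis) pair, like the pivot table.
counts = Counter(zip(condition, diagnosis))
n = len(condition)

n_test2 = sum(v for (c, d), v in counts.items() if c == 2)  # patients with test result 2
n_diag = sum(v for (c, d), v in counts.items() if d == 1)   # diagnosed patients
n_both = counts[(2, 1)]                                     # test result 2 AND diagnosed

# Direct conditional probability: restrict to test result 2, then count diagnoses.
direct = Fraction(n_both, n_test2)

# The same quantity through Bayes' Theorem.
prior = Fraction(n_diag, n)            # P(D=1)
likelihood = Fraction(n_both, n_diag)  # P(C=2 | D=1)
evidence = Fraction(n_test2, n)        # P(C=2)
via_bayes = prior * likelihood / evidence

assert direct == via_bayes
print(direct, float(direct))  # 4/5 0.8
```

Both routes give the same answer, which is a useful sanity check that the likelihood is a conditional probability and not a joint one.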
