## Understanding Bayes' Theorem

### Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)

Prior: P(A)

Likelihood: P(B|A)

Posterior:  P(A|B)

Evidence (data): P(B)

Bayes' Theorem is used to calculate a conditional probability: how likely a hypothesis A is once we have observed some data B. It starts from a prior, P(A), which captures what we know or assume about A before seeing any evidence. The likelihood, P(B|A), is the probability of observing the data if A is true. The evidence, P(B), is the overall probability of observing the data, whether or not A is true. The posterior, P(A|B), is the result of the Bayesian analysis: the updated probability of A given the data and the model. The cycle can repeat, with the posterior serving as the prior for the next piece of evidence.
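As a minimal sketch of the formula, the function below computes the posterior for a hypothesis that is either true or false, expanding the evidence as P(B) = P(B|A) * P(A) + P(B|not A) * P(not A). The function name `bayes_posterior`, its parameter names, and the numbers in the usage line are illustrative assumptions and are not taken from the example that follows.

```python
def bayes_posterior(prior, likelihood, likelihood_if_not_a):
    """Return P(A|B) for a binary hypothesis A (hypothetical helper).

    prior               -- P(A)
    likelihood          -- P(B|A)
    likelihood_if_not_a -- P(B|not A)
    """
    # Evidence via the law of total probability:
    # P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    evidence = likelihood * prior + likelihood_if_not_a * (1 - prior)
    return likelihood * prior / evidence


# Illustrative numbers only: an even prior and a fairly reliable signal.
example_posterior = bayes_posterior(prior=0.5, likelihood=0.9, likelihood_if_not_a=0.1)
print(round(example_posterior, 3))
```

Passing P(B|not A) instead of P(B) is a design choice: in practice the evidence term is rarely known directly and usually has to be assembled from the two conditional rates.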

### A Classic Example

Here is a classic example:

1% of women at age 40 who participate in a routine screening have breast cancer.
80% of women with breast cancer get positive mammographies.
9.6% of women without breast cancer get positive mammographies.

A 40-year-old woman participates in a routine screening and has a positive mammography. What is the probability that she has breast cancer?

Let's use a population of 10,000 women.

In [1]:
# Out of 10,000 women who go to a routine screening, 1% will have breast cancer.
with_cancer = 0.01*10000
with_cancer

100.0

Note that the prior probability that a woman has breast cancer is 1%. This is the probability before she gets any result from a mammography test: P(A) = 0.01.

What is the probability of having breast cancer, given a positive test?

Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)

In [2]:
# 80% of women who have breast cancer get a positive mammography result,
# i.e. they are correctly flagged by the test.
# This rate is the likelihood, P(B|A) = 0.80.
true_positive = 0.80 * with_cancer
true_positive
# 80 women have a true positive result

80.0

In [3]:
# We can deduce that 20% of women who have breast cancer get a negative mammography result.
# The test said that they do not have breast cancer but, in fact, they do.
false_negative = 0.20 * with_cancer
false_negative
# 20 women have a false negative result.

20.0

In [4]:
# We can deduce that at age 40, 99% of the 10,000 women who go to a routine 
# screening do not have breast cancer.

without_cancer = 0.99*10000
without_cancer
# 9900 women do not have breast cancer

9900.0

In [5]:
# We know that 9.6% of women without breast cancer get a positive mammography.
false_positive = 0.096 * without_cancer
false_positive
# About 950 women have a false positive result

950.4

In [6]:
# We can deduce that 90.4% of women without breast cancer get a negative mammography.
true_negative = 0.904 * without_cancer
true_negative
# About 8950 women (8949.6) have a true negative result

8949.6
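The four counts computed so far form a 2x2 table of true condition against test result. The sketch below recomputes them in one place and checks that they account for the whole population; the dictionary layout and the variable names `sensitivity` and `false_positive_rate` are illustrative choices, not part of the original cells.

```python
population = 10_000
prevalence = 0.01            # P(A): share of women with breast cancer
sensitivity = 0.80           # P(positive | cancer)
false_positive_rate = 0.096  # P(positive | no cancer)

with_cancer = prevalence * population
without_cancer = population - with_cancer

# Rows: true condition; columns: test result.
counts = {
    "cancer": {
        "positive": sensitivity * with_cancer,          # true positives
        "negative": (1 - sensitivity) * with_cancer,    # false negatives
    },
    "no cancer": {
        "positive": false_positive_rate * without_cancer,        # false positives
        "negative": (1 - false_positive_rate) * without_cancer,  # true negatives
    },
}

for condition, row in counts.items():
    print(f"{condition:>9}: positive={row['positive']:.1f}, negative={row['negative']:.1f}")

# Every woman falls into exactly one of the four cells.
total = sum(value for row in counts.values() for value in row.values())
```

Summing the cells is a quick sanity check: if `total` differs noticeably from `population`, one of the rates was applied to the wrong group.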

What is the probability of having breast cancer, given a positive test?

Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)

In [7]:
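# Positives = (positive and cancer) + (positive and no cancer)
# Note: despite the variable name, this is a count of women, not a probability.
# Dividing it by the population of 10,000 gives P(B), the evidence (data)
# that we use to update the prior.
probability_positive = (true_positive + false_positive)
probability_positive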

1030.4

In [8]:
# Calculate the proportion of positives in the population, P(B)
probability_positive / 10000

0.10304

In [9]:
# Recall how many women have true positive result
true_positive

80.0

In [10]:
# Recall that the true positive rate, P(B|A), is 0.80

In [11]:
# Posterior probability, P(A|B): the share of positive results that are true positives.
# Equivalently: posterior_probability = true_positive/(true_positive + false_positive)

posterior_probability = true_positive/probability_positive
posterior_probability

0.07763975155279502
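The same posterior can be obtained without choosing a population size, by working with probabilities directly. The sketch below is a cross-check of the count-based calculation above; the variable names are illustrative.

```python
import math

p_cancer = 0.01               # prior, P(A)
p_pos_given_cancer = 0.80     # likelihood, P(B|A)
p_pos_given_healthy = 0.096   # P(B|not A)

# Evidence, P(B), by the law of total probability.
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)

# Posterior, P(A|B), by Bayes' Theorem.
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos

# The count-based route from a population of 10,000 gives the same value,
# because the population size cancels out of the ratio.
count_based = 80.0 / (80.0 + 950.4)
print(math.isclose(p_cancer_given_pos, count_based))
```

The comparison uses `math.isclose` instead of `==` because the two routes round differently in floating point.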

### Summary of Calculations

What is the probability that a woman has breast cancer, given that she had a positive mammography test?

Bayes' Theorem: P(A|B) = (P(B|A) * P(A)) / P(B) = (0.80 * 0.01) / 0.10304

P(A), the prior, is 0.01.

P(B|A), the likelihood, is 0.80.

P(B), the evidence (data) that we use to update the prior, is 0.10304.

P(A|B), the posterior, is approximately 0.0776.

The probability that a woman has breast cancer, given a positive mammography test, is about 7.8%.
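The introduction notes that the cycle can repeat as beliefs are updated. As a hedged sketch of that idea, suppose the same woman took a second mammography and it was also positive. This assumes the two results are conditionally independent given her true condition and that the same 80% and 9.6% rates apply to the repeat test; both are assumptions made here for illustration, not facts from the example. The helper name `update` is hypothetical.

```python
def update(prior, p_pos_given_cancer=0.80, p_pos_given_healthy=0.096):
    """One Bayesian update after a positive test (hypothetical helper)."""
    evidence = p_pos_given_cancer * prior + p_pos_given_healthy * (1 - prior)
    return p_pos_given_cancer * prior / evidence


belief = 0.01  # prior before any test
history = []
for test_number in (1, 2):
    # The posterior from the previous test becomes the prior for this one.
    belief = update(belief)
    history.append(belief)
    print(f"after positive test {test_number}: {belief:.3f}")
```

Under these assumptions the first update reproduces the 7.8% result and a second positive raises the probability to roughly 41%. Real repeat tests are unlikely to be fully independent, so the second figure should be read as an upper-end illustration.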
