# Introduction to Bayes' Theorem
Bayes' Theorem provides a principled way of calculating a conditional probability. This tutorial builds the intuition for estimating probabilities using Bayes' Theorem.

**Conditional probability** is the probability of one event given the occurrence of another event. It is often described in terms of events A and B from two dependent random variables, e.g. `X` and `Y`.

The conditional probability can be calculated from the joint probability:

P(A|B) = P(A ∩ B) / P(B)

The conditional probability is not symmetrical, i.e., in general P(A|B) ≠ P(B|A).
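
As a quick illustration of the asymmetry, the sketch below computes both conditional probabilities from a small joint distribution. The numbers in `joint` are made up for the example, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary events A and B
# (illustrative numbers only; the four entries sum to 1).
joint = {
    (True, True): Fraction(1, 10),    # P(A and B)
    (True, False): Fraction(2, 10),   # P(A and not B)
    (False, True): Fraction(3, 10),   # P(not A and B)
    (False, False): Fraction(4, 10),  # P(not A and not B)
}

# Marginals: sum the joint probability over the other event.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)

# Conditional probability = joint probability / marginal of the condition.
p_a_given_b = joint[(True, True)] / p_b
p_b_given_a = joint[(True, True)] / p_a

print('P(A|B) =', p_a_given_b)  # 1/4
print('P(B|A) =', p_b_given_a)  # 1/3
```

Both values share the same numerator, P(A ∩ B), but divide by different marginals, which is why they differ whenever P(A) ≠ P(B).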

One conditional probability can be calculated using the other conditional
probability. This is called Bayes' Theorem and can be stated as

                              `P(A|B) = (P(B|A) * P(A)) / P(B)`

P(A|B) = posterior probability

P(A) = Prior Probability

P(B|A) = Likelihood

P(B) = Evidence.

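Bayes' Theorem provides a principled way of calculating a conditional probability and an alternative to using the joint probability. When P(B) is not known directly, it can be expanded over the two cases A and not A, P(B) = P(B|A) * P(A) + P(B|not A) * P(not A), which gives

                  `P(A|B) = (P(B|A) * P(A)) / (P(B|A) * P(A) + P(B|not A) * P(not A))`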

In words, Bayes' Theorem can be restated as

                    Posterior = (Likelihood * Prior) / Evidence

In terms of a binary classifier, where A is the actual positive class and B is a positive prediction, the probabilities are named as follows:

1. P(not B | not A): True Negative Rate or TNR (**specificity**).

2. P(B | not A): False Positive Rate or FPR.

3. P(not B | A): False Negative Rate or FNR.

4. P(B | A): True Positive Rate or TPR (**sensitivity or recall**).

5. P(A | B): Positive Predictive Value or PPV (**Precision**).

We can then restate Bayes' Theorem as

                            PPV = (TPR * P(A)) / ((TPR * P(A)) + (FPR * P(not A)))
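
To see that this restatement agrees with the usual definition of precision, the sketch below computes PPV two ways from a confusion matrix. The counts are hypothetical, chosen only for illustration, and `Fraction` keeps the comparison exact.

```python
from fractions import Fraction

# Hypothetical confusion-matrix counts (not from the tutorial).
tp, fn = 90, 10    # actual positives: detected / missed
fp, tn = 45, 855   # actual negatives: falsely flagged / correctly cleared

total = tp + fn + fp + tn
p_a = Fraction(tp + fn, total)   # base rate P(A)
tpr = Fraction(tp, tp + fn)      # P(B | A), sensitivity
fpr = Fraction(fp, fp + tn)      # P(B | not A)

# PPV from Bayes' theorem: (TPR * P(A)) / (TPR * P(A) + FPR * P(not A))
ppv_bayes = (tpr * p_a) / (tpr * p_a + fpr * (1 - p_a))

# PPV counted directly: true positives among all positive predictions
ppv_counts = Fraction(tp, tp + fp)

print(ppv_bayes, ppv_counts)  # 2/3 2/3
```

The two routes give the same value because TPR * P(A) is the share of true positives in the whole population, and FPR * P(not A) is the share of false positives.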

# Let's solve some questions and find the answers with a Python script

1. Consider the case where an elderly person (over 80 years of age) falls. What is the probability that they will die from the fall? Let's assume that the base rate of an elderly person dying, P(A), is 10%, the base rate of an elderly person falling, P(B), is 5%, and that 7% of the elderly people who die had a fall, P(B | A).

```python
# calculate the probability of an elderly person dying from a fall

# calculate P(A|B) given P(B|A), P(A) and P(B)
def bayes_theorem(p_a, p_b, p_b_given_a):
    # calculate P(A|B) = P(B|A) * P(A) / P(B)
    p_a_given_b = (p_b_given_a * p_a) / p_b
    return p_a_given_b

# P(A)
p_a = 0.10
# P(B)
p_b = 0.05
# P(B|A)
p_b_given_a = 0.07

# calculate P(A|B)
result = bayes_theorem(p_a, p_b, p_b_given_a)

# summarize
print('P(A|B) = %.3f%%' % (result * 100))
```

P(A|B) = 14.000%


That is, if an elderly person falls, there is a 14% probability that they will die from the fall.

# Email Spam Detection

Suppose 2% of the email we receive is spam, P(A). Let's assume the spam detector is really good: when an email is spam, it detects it 99% of the time, P(B | A), and when an email is not spam, it marks it as spam at a very low rate of 0.1%, P(B | not A). We need to find P(Spam | Detected).

## Solution
Start from Bayes' Theorem:

P(A|B) = (P(B|A) * P(A)) / P(B)

With A = Spam and B = Detected, this becomes:

P(Spam|Detected) = (P(Detected|Spam) * P(Spam)) / P(Detected)

One value is missing: P(B), that is, P(Detected). It can be estimated from the values we already have by splitting it over the spam and not-spam cases:

                  

```
P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
P(Detected) = P(Detected|Spam) * P(Spam) + P(Detected|not Spam) * P(not Spam)
```
We know P(Detected|not Spam), which is 0.1%, and we can calculate P(not Spam) as 1 - P(Spam):

P(not Spam) = 1 - 0.02 = 0.98

Substituting the values gives:

P(Detected) = 0.99 * 0.02 + 0.001 * 0.98 = 0.0198 + 0.00098 = 0.02078

Now that we have all the values, let's estimate P(Spam | Detected).

```python
# calculate the probability of an email in the spam folder being spam
# calculate P(A|B) given P(A), P(B|A), P(B|not A)
def bayes_theorem(p_a, p_b_given_a, p_b_given_not_a):
    # calculate P(not A)
    not_a = 1 - p_a
    # calculate P(B)
    p_b = p_b_given_a * p_a + p_b_given_not_a * not_a
    # calculate P(A|B)
    p_a_given_b = (p_b_given_a * p_a) / p_b
    return p_a_given_b

# P(A)
p_a = 0.02
# P(B|A)
p_b_given_a = 0.99
# P(B|not A)
p_b_given_not_a = 0.001
# calculate P(A|B)
result = bayes_theorem(p_a, p_b_given_a, p_b_given_not_a)
# summarize
print('P(A|B) = %.3f%%' % (result * 100))
```

P(A|B) = 95.284%


Hence, if an email is in the spam folder, there is a 95.3% probability that it is in fact spam.
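
The same answer can be checked by counting emails instead of multiplying probabilities. The sketch below assumes a hypothetical batch of 100,000 emails (the batch size is arbitrary) and applies the rates from the problem statement.

```python
# Hypothetical batch size; chosen so that every count is a whole number.
total_emails = 100_000

spam = total_emails * 2 // 100        # 2% are spam -> 2,000
not_spam = total_emails - spam        # the remaining 98,000
spam_flagged = spam * 99 // 100       # 99% of spam is detected -> 1,980
ham_flagged = not_spam * 1 // 1000    # 0.1% of non-spam is flagged -> 98

flagged = spam_flagged + ham_flagged  # everything that lands in the spam folder
p_spam_given_flagged = spam_flagged / flagged

print('%d of %d flagged emails are spam' % (spam_flagged, flagged))
print('P(Spam|Detected) = %.3f%%' % (p_spam_given_flagged * 100))
```

Of the 2,078 flagged emails, 1,980 are spam, which is the same ratio as 0.0198 / 0.02078.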

# Liars and Lie Detectors

Consider the case where a person is tested with a lie detector and gets a positive result, suggesting that they are lying. What is the probability that the person is indeed lying?
Let's assume that most people who are tested are telling the truth, say 98%, meaning (1 - 0.98) or 2% are liars, P(A). Let's also assume that when someone is lying, the test detects it reasonably well but not perfectly, 72% of the time, P(B|A). Finally, let's assume that when someone is telling the truth, the machine correctly says they are not lying 97% of the time, P(not B | not A).

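We need to find:

P(Lying | Positive) = (P(Positive | Lying) * P(Lying)) / P(Positive)

Again one value is missing: P(Positive). The false positive rate is P(Positive | not Lying) = 1 - P(not Positive | not Lying) = 1 - 0.97 = 0.03, so

P(Positive) = 0.72 * 0.02 + 0.03 * 0.98 = 0.0144 + 0.0294 = 0.0438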



```python
# calculate the probability of a person lying given a positive lie detector result
# calculate P(A|B) given P(A), P(B|A), P(not B|not A)
def bayes_theorem(p_a, p_b_given_a, p_not_b_given_not_a):
    # calculate P(not A)
    not_a = 1 - p_a
    # calculate P(B|not A)
    p_b_given_not_a = 1 - p_not_b_given_not_a
    # calculate P(B)
    p_b = p_b_given_a * p_a + p_b_given_not_a * not_a
    # calculate P(A|B)
    p_a_given_b = (p_b_given_a * p_a) / p_b
    return p_a_given_b

# P(A), base rate
p_a = 0.02
# P(B|A)
p_b_given_a = 0.72
# P(not B|not A)
p_not_b_given_not_a = 0.97
# calculate P(A|B)
result = bayes_theorem(p_a, p_b_given_a, p_not_b_given_not_a)
# summarize
print('P(A|B) = %.3f%%' % (result * 100))
```

P(A|B) = 32.877%


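That is, if the lie detector test comes back with a positive result, there is only a 32.9%
probability that the person is in fact lying. Because only 2% of those tested are lying, false positives (0.03 * 0.98 = 0.0294) outnumber true positives (0.72 * 0.02 = 0.0144). It's a poor test!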