# Bayes' Theorem

<img src="./img/bayes.jpg" width = '500' alt="Dice" style="float: center; margin-left: 10px;" />
<caption><left> Image source: [Mattbuck](https://commons.wikimedia.org/wiki/User:Mattbuck) </left></caption>

## Lesson Objectives

By the end of class, students will be able to:

- Apply Bayes' Theorem to a word problem.
- Define and identify the likelihood, prior, and posterior.
- Explain the difference between Bayesian theory and Frequentist theory.

## Review of Conditional Probability

- In probability theory, Bayes’ theorem (alternatively Bayes’ law or Bayes’ rule) describes the probability of an event, **based on prior knowledge of conditions** that might be related to the event.

- In mathematical notation it is expressed as:

$$ P(A|B) = \frac{P(A)P(B|A)}{P(B)} $$

__Notation__

$P(A|B)$ -- probability of A given B

$P(B|A)$ -- probability of B given A

$P(A)$ -- probability of A

$P(B)$ -- probability of B


__Note:__ This relation follows directly from the definition of conditional probability, together with the fact that $A\cap B$ and $B\cap A$ are the same event:

$$ P(A|B) = \frac{P(A\cap B)}{P(B)}$$

$$
\left.
\begin{array}{l}
P(A|B) = \frac{P(A \cap B)}{P(B)} \Rightarrow P(A|B)P(B) = P(A \cap B) \\
P(B|A) = \frac{P(B \cap A)}{P(A)} \Rightarrow P(B|A)P(A) = P(B \cap A)
\end{array}
\right\} \Rightarrow P(A|B) = \frac{P(A)P(B|A)}{P(B)}
$$

Where:

$P(A\cap B)$ -- the probability of the intersection of A and B

$P(B\cap A)$ -- the probability of the intersection of B and A
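
The derivation can be checked numerically on a small sample space. The sketch below assumes a hypothetical example that is not part of the lesson: one roll of a fair six-sided die, with A = "the roll is even" and B = "the roll is greater than 3". The helper `prob` is introduced only for this illustration.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# Definition of conditional probability, applied in both directions.
p_a_given_b = prob(A & B) / prob(B)
p_b_given_a = prob(A & B) / prob(A)

# Right-hand side of Bayes' theorem: P(A) P(B|A) / P(B)
bayes_rhs = prob(A) * p_b_given_a / prob(B)

print(p_a_given_b, bayes_rhs)  # 2/3 2/3
```

Both sides agree because the intersection `A & B` is the same set whichever event is written first.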




[3Blue1Brown Video](https://www.youtube.com/watch?v=HZGCoVF3YvM)


## Family with two children

[Example is from: Data Science From Scratch](https://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/149190142X)

- Assume that each child is equally likely to be a girl or boy

- The gender of the second child is independent of the gender of the first child


__Q:__ What is the probability of the event “both children are girls” (B) conditional on the event “the older child is a girl” (G)?


$P(\text{two girls} \mid \text{older is girl}) = \frac{P(\text{two girls} \cap \text{older is girl})}{P(\text{older is girl})}$


There are four equally likely combinations for two children:
- Boy, Girl
- Boy, Boy
- Girl, Boy
- Girl, Girl

In [3]:
## your answer here

## The names here are for instructional purposes only.
## Don't name your variables this way in your own code!

probability_B = 1/4  # both children are girls

probability_G = 1/2  # the older child is a girl

probability_B_and_G = 1/4  # both girls AND older is a girl (same outcome as B)

probability_B_given_G = probability_B_and_G/probability_G

probability_B_given_G


0.5

__Q:__ We could also ask about the probability of the event “both children are girls” conditional on the event “at least one of the children is a girl” (L).

In [2]:
probability_B = 1/4 #both girls

probability_L = 3/4 #at least 1 is a girl

probability_B_and_L = 1/4 #both girls AND at least 1 is a girl (same outcome as B)

probability_B_given_L = probability_B_and_L/probability_L

probability_B_given_L

0.3333333333333333
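
Both answers can also be obtained by listing the four families and counting, instead of typing the probabilities by hand. This is a minimal sketch; the helper `conditional` and the three event functions are names introduced here for illustration, and each family is assumed to be written as an (older, younger) pair.

```python
from fractions import Fraction
from itertools import product

# Every family as an (older, younger) pair; the four pairs are equally likely.
families = list(product(["girl", "boy"], repeat=2))

def conditional(event, given):
    """Return P(event | given) by counting equally likely families."""
    given_families = [f for f in families if given(f)]
    matching = [f for f in given_families if event(f)]
    return Fraction(len(matching), len(given_families))

def both_girls(family):
    return family == ("girl", "girl")

def older_is_girl(family):
    return family[0] == "girl"

def at_least_one_girl(family):
    return "girl" in family

p_both_given_older = conditional(both_girls, older_is_girl)
p_both_given_either = conditional(both_girls, at_least_one_girl)
print(p_both_given_older, p_both_given_either)  # 1/2 1/3
```

Conditioning on "the older child is a girl" leaves two families, while conditioning on "at least one girl" leaves three, which is why the two answers differ.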

Let's simulate this situation

In [4]:
import random
def random_kid():
    return random.choice(["boy", "girl"])

both_girls = 0
older_girl = 0
either_girl = 0

random.seed(0)
for _ in range(10000):
    younger = random_kid()
    older = random_kid()
    if older == "girl":
        older_girl += 1
    if older == "girl" and younger == "girl":
        both_girls += 1
    if older == "girl" or younger == "girl":
        either_girl += 1

print("P(both | older):", both_girls / older_girl )     # close to 1/2
print("P(both | either): ", both_girls / either_girl)   # close to 1/3

P(both | older): 0.5007089325501317
P(both | either):  0.3311897106109325


## Discussion of the terms in the theorem

$P(A|B)$ -- Posterior:  the probability that A is true after the data is considered

$P(B|A)$ -- Likelihood: the probability of observing the data B if A is true

$P(A)$ -- Prior:  the probability that A is true before the data is considered
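
The denominator $P(B)$ is not listed above, but word problems usually do not state it directly. A common way to obtain it, sketched here with $\neg A$ denoting "A is false" (notation not used elsewhere in this lesson), is to split B according to whether A holds:

$$ P(B) = P(B|A)P(A) + P(B|\neg A)P(\neg A) $$

Substituting this into Bayes' theorem gives a form that needs only the prior and the two likelihoods:

$$ P(A|B) = \frac{P(A)P(B|A)}{P(A)P(B|A) + P(\neg A)P(B|\neg A)} $$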




# Bayesian vs Frequentist

- Most people want statistics to tell them the probability of their hypothesis given their data.
- Bayes will tell you the relative probability of your hypothesis given your data.
- Frequentism will tell you the “objective” probability of your data given your (null) hypothesis.


Additionally,
- Bayesian statistics treats the data as fixed and the parameters as varying.
- Frequentist statistics treats the data as varying and the parameters as fixed.


1. A **Frequentist** Approach to the Coin Toss Experiment

 - Fix a value for the parameter p, the probability of getting heads on a single toss.
 
 - Collect data (say, toss the coin many times).
 
 - Then calculate the probability of getting such data under this parameter value.
 
2. A **Bayesian** Approach to the Coin Toss Experiment:

 - Start with a prior belief about the parameter p (the probability of getting heads). For example, p can be any number between 0 and 1, with every value equally likely.
    
 - Collect data.
    
 - Then update your prior belief about the distribution of p to obtain the posterior.
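
The two approaches can be sketched side by side in code. Everything specific here is an assumption made for illustration: the data (7 heads in 10 tosses), the fixed value p = 0.5 on the frequentist side, and the use of a 101-point grid of candidate p values to approximate the uniform prior on the Bayesian side.

```python
from math import comb

heads, tosses = 7, 10   # hypothetical data, chosen only for illustration

def binomial_probability(p):
    """Probability of the observed number of heads if each toss has P(heads) = p."""
    return comb(tosses, heads) * p**heads * (1 - p)**(tosses - heads)

# Frequentist-style question: fix p, then ask how probable the data is.
p_fixed = 0.5   # assumed parameter value
prob_data_given_p = binomial_probability(p_fixed)

# Bayesian-style question: fix the data, then ask which values of p are plausible.
grid = [i / 100 for i in range(101)]            # candidate values of p
prior = [1 / len(grid)] * len(grid)             # uniform prior over the grid
likelihood = [binomial_probability(p) for p in grid]
unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
evidence = sum(unnormalized)                    # P(data), the normalizing constant
posterior = [u / evidence for u in unnormalized]

posterior_mean = sum(p * w for p, w in zip(grid, posterior))
most_plausible_p = grid[posterior.index(max(posterior))]

print("P(data | p = 0.5):", prob_data_given_p)
print("posterior mean of p:", round(posterior_mean, 3))
print("most plausible p:", most_plausible_p)
```

The frequentist calculation returns a single probability for the data, while the Bayesian calculation returns a whole distribution over p that can be summarized by its mean or its peak.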


[Intuitive Example of Difference between Bayesian and Frequentist Statistics](https://youtu.be/r76oDIvwETI)

[Coin Flip- Bayesian vs Frequentist](https://youtu.be/YsJ4W1k0hUg)

[World Cup Win Bayesian vs Frequentist](https://youtu.be/XJqOEvzUG38)



## Practice!
A company has found that 80% of its new management hires are meeting expectations,
while 20% are not. Of the satisfactory hires, 75% had sales experience, while of the
unsatisfactory hires, 55% had sales experience. What is the probability that a new hire with
sales experience will meet expectations? 


#### In your group do the following:

- Write out the definition of each term in the context of this scenario:
    - posterior
    - prior
    - likelihood
    
- Using Bayes' Theorem, solve for the probability that a new hire with sales experience will meet expectations.

Posterior: the probability that a new hire will meet expectations GIVEN that they have sales experience, P(X|E)

Prior: the probability that a new hire meets expectations, P(X) = 0.8

Likelihood: the probability of having sales experience GIVEN that expectations have been met, P(E|X) = 0.75

In [6]:
# your code here
prior = 0.8 # P(meets expectations)
likelihood1 = 0.75 # P(sales experience GIVEN meets expectations)
likelihood2 = 0.55 # P(sales experience GIVEN does not meet expectations)

# the denominator is P(sales experience), summed over both kinds of hire
posterior = (prior * likelihood1)/((prior * likelihood1)+((1-prior)*likelihood2))
posterior

0.8450704225352113
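
The same calculation can be wrapped in a small reusable function. This is a sketch, and the function name `posterior_probability` and its parameter names are introduced here for illustration only.

```python
def posterior_probability(prior, likelihood, likelihood_if_not):
    """Return P(A|B) from P(A), P(B|A), and P(B|not A)."""
    # P(B), split over the cases "A is true" and "A is false"
    evidence = prior * likelihood + (1 - prior) * likelihood_if_not
    return prior * likelihood / evidence

# Hiring problem: A = meets expectations, B = has sales experience
p_meets_given_experience = posterior_probability(
    prior=0.8, likelihood=0.75, likelihood_if_not=0.55
)
print(round(p_meets_given_experience, 4))  # 0.8451
```

Writing the denominator as a separate `evidence` variable makes it clear that it is P(B), built from the prior and both likelihoods.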

## More Practice!

A certain disease has an incidence rate of 2%. If the false negative rate is 10% and the
false positive rate is 1%, compute the probability that a person who tests positive actually
has the disease.

Imagine 10,000 people who are tested. Of these 10,000, 200 will have the disease; 10% of
them, or 20, will test negative and the remaining 180 will test positive. Of the 9800 who do
not have the disease, 1% of them, or 98, will test positive. So 180 + 98 = 278 people test
positive, and 180 of them actually have the disease: 180/278 ≈ 0.647.

#### In your group do the following:

- Write out the definition of each term in the context of this scenario:
    - posterior
    - prior
    - likelihood
    
- Using Bayes' Theorem, solve for the probability that a person who tests positive actually has the disease.


Posterior: Having the disease GIVEN a positive test, P(A|B); this is what we want to find

Prior: Having the disease, P(A) = 0.02

Likelihood: Positive test GIVEN having the disease, P(B|A) = 1 - 0.10 = 0.90 (one minus the false negative rate, NOT the false positive rate)

In [9]:
# your code here
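prior = 0.02  # P(disease): incidence rate
lh = 0.9      # P(positive | disease) = 1 - false negative rate
lh2 = 0.01    # P(positive | no disease): false positive rate

# Bayes' theorem with P(positive) expanded over disease / no disease
ans = (prior * lh)/(((prior * lh))+((1-prior)*lh2))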
ans 

0.6474820143884893
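
The counting argument with 10,000 tested people gives the same answer and can be reproduced directly. The sketch below uses whole-number percentages so that every count stays an integer; the variable names are introduced here for illustration.

```python
population = 10_000

# Rates from the problem statement, written as whole percentages.
incidence_pct = 2
false_negative_pct = 10
false_positive_pct = 1

sick = population * incidence_pct // 100                     # 200 people
healthy = population - sick                                  # 9800 people
true_positives = sick * (100 - false_negative_pct) // 100    # 180 people
false_positives = healthy * false_positive_pct // 100        # 98 people

# Among everyone who tests positive, the fraction who are actually sick.
p_sick_given_positive = true_positives / (true_positives + false_positives)
print(p_sick_given_positive)
```

Dividing the true positives by all positives is Bayes' theorem expressed in counts instead of probabilities.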