<a href="https://colab.research.google.com/github/iPrinka/MITx-Micromasters-Statistics-Data-Science/blob/main/bayes_I.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Bayes Rule: Introduction

**OBJECTIVES**

- Understand the notation for conditional probability
- Use Bayes Theorem to understand inverse probability
- Use Bayes Rule to update probabilities upon observation
- Solve basic probability problems using Bayes Rule

### Motivating Example

![](https://i.pinimg.com/originals/e0/57/bc/e057bc49e590d0b6a6c6450fbb6c1e5c.jpg)

```
Steve is very shy and withdrawn, invariably helpful, but with little interest in people, or in the world of reality.
A meek and tidy soul, he has a need for order and structure, and a passion for detail.
```

Is Steve a librarian or a farmer? -- [Source](https://www2.psych.ubc.ca/~schaller/Psyc590Readings/TverskyKahneman1974.pdf)

```
Dick is a 30 year old man. He is married with no children. A man of high ability and high motivation,
he promises to be quite successful in his field. He is well liked by his colleagues.
```

### Social Media Applications

Social media applications tend to have a younger audience. Among adult cell phone users (18 years and over), 47% of those aged 18 to 29 use social media applications, compared with 21% of those aged 30 to 49 and 7% of those aged 50 or over.

We also know that 29% of cell phone users are 18 to 29, 47% are 30 to 49, and 24% are 50 and over.



In [None]:
# What is the probability that a randomly chosen adult cell phone user uses social media applications?
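
One way to answer this is the law of total probability: weight each age group's usage rate by that group's share of cell phone users. The sketch below uses only the percentages quoted above; the dictionary and variable names are illustrative, not part of the original notebook.

```python
# Age-group shares among adult cell phone users: P(A_i)
p_age = {"18-29": 0.29, "30-49": 0.47, "50+": 0.24}

# Share of each age group that uses social media applications: P(C | A_i)
p_social_given_age = {"18-29": 0.47, "30-49": 0.21, "50+": 0.07}

# Law of total probability: P(C) = sum_i P(A_i) * P(C | A_i)
p_social = sum(p_age[g] * p_social_given_age[g] for g in p_age)
print(round(p_social, 4))  # 0.2518
```

So roughly one in four adult cell phone users uses social media applications under these figures.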


In [1]:
from IPython.display import IFrame
import matplotlib.pyplot as plt
import numpy as np

In [2]:
#draw a tree diagram
IFrame(src = '', width = 400, height = 300)

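**Another Question**: What percent of social media users are aged 18 to 29? Let $A_1$ be the event that a user is aged 18 to 29 and $C$ the event that a user uses social media applications.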

$$P(A_1 | C) = \frac{P(A_1 ~ \text{and} ~ C)}{P(C)}$$
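
As a hedged numeric check of this formula, the sketch below assumes $A_1$ is the 18 to 29 age group and $C$ is the event of using social media applications, and plugs in the percentages given earlier. The variable names are illustrative.

```python
# P(A_1): share of adult cell phone users aged 18 to 29
p_a1 = 0.29
# P(C | A_1): share of that group using social media applications
p_c_given_a1 = 0.47

# P(A_1 and C) = P(A_1) * P(C | A_1)
p_a1_and_c = p_a1 * p_c_given_a1

# P(C) by the law of total probability over the three age groups
p_c = 0.29 * 0.47 + 0.47 * 0.21 + 0.24 * 0.07

p_a1_given_c = p_a1_and_c / p_c
print(round(p_a1_given_c, 4))  # 0.5413
```

Although only 29% of cell phone users are aged 18 to 29, they make up about 54% of social media users in this example.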

### Example: Bolts

Two boxes contain long bolts and short bolts.  Suppose that one box contains 60 long bolts and 40 short bolts, and the other box contains 10 long bolts and 20 short bolts.  Suppose that one box is selected at random and a bolt is then selected at random from that box.  What is the probability that the bolt is long?

In [None]:
IFrame(src = '', width = 500, height = 200)

In [None]:
# P(long) = (1/2)(60/100) + (1/2)(10/30) = 7/15
7/15

0.4666666666666667

### Bayes Rule

$$ P(A\mid B)={\frac {P(B\mid A)P(A)}{P(B)}}$$

- $P(A\mid B)$ is a conditional probability: the probability of event $A$ occurring given that $B$ is true.
- $P(B\mid A)$ is also a conditional probability: the probability of event $B$ occurring given that $A$ is true.
- $P(A)$ and $P(B)$ are the probabilities of observing $A$ and $B$ respectively, without conditioning on the other event; they are known as the marginal probabilities.
- The formula requires $P(B) > 0$.

**EXAMPLE**: Bolts Again

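Suppose now we have selected a bolt from one of the two boxes, but we cannot tell which box it came from. Let $B_1$ be the event that the bolt came from the first box (60 long, 40 short) and $B_2$ the event that it came from the second box (10 long, 20 short). Given that a long bolt was selected (call this event $A$), compute the two probabilities: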

- $P(B_1 | A)$
- $P(B_2 | A)$
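
A minimal sketch of this computation follows. It assumes $B_1$ is the box with 60 long and 40 short bolts, $B_2$ is the box with 10 long and 20 short bolts, and each box is equally likely to be chosen. Exact fractions are used to avoid rounding; the names are illustrative.

```python
from fractions import Fraction

# Prior: each box is selected with probability 1/2
prior = {"B1": Fraction(1, 2), "B2": Fraction(1, 2)}

# P(A | B_i): probability of drawing a long bolt from each box
p_long = {"B1": Fraction(60, 100), "B2": Fraction(10, 30)}

# P(A) by the law of total probability (equals 7/15)
evidence = sum(prior[b] * p_long[b] for b in prior)

# Bayes rule: P(B_i | A) = P(B_i) * P(A | B_i) / P(A)
posterior = {b: prior[b] * p_long[b] / evidence for b in prior}
print(posterior["B1"], posterior["B2"])  # 9/14 5/14
```

The two posteriors sum to 1, as they must, and the long bolt makes the first box the more likely source.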

**EXAMPLE**:

CVS is giving free COVID tests. Their test is 90 percent reliable: if a person has COVID, the probability is 0.9 that they will test positive; if a person does not have the disease, there is a probability of only 0.1 that the test gives a positive response.

Currently, data in your county indicate that you have a 1 in 10,000 chance of having the disease. Because the test is free and you are there, you decide to take it. You test positive. What is the chance that you have the disease?

- Let $B_1$ represent having the disease and $B_2$ that you don't
- Let $A$ represent the positive test

Here, we want to use Bayes Theorem to determine $P(B_1 | A)$.
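
The calculation can be sketched as follows, using only the numbers stated in the problem (prior 1 in 10,000, true positive rate 0.9, false positive rate 0.1). The variable names are illustrative.

```python
# P(B_1): prior probability of having the disease
p_disease = 1 / 10_000
# P(A | B_1): probability of a positive test given the disease
p_pos_given_disease = 0.9
# P(A | B_2): probability of a positive test without the disease
p_pos_given_healthy = 0.1

# P(A): total probability of a positive test
p_pos = (p_disease * p_pos_given_disease
         + (1 - p_disease) * p_pos_given_healthy)

# Bayes rule: P(B_1 | A)
p_disease_given_pos = p_disease * p_pos_given_disease / p_pos
print(f"{p_disease_given_pos:.6f}")  # 0.000899
```

Even after a positive test, the chance of having the disease is under 0.1 percent, because the disease is so rare that false positives vastly outnumber true positives.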

In [None]:
IFrame(src = '', width = 300, height = 400)

In [None]:
##ANSWER:


**EXAMPLE**

Three machines $M_1, M_2, M_3$ produce similar items.  Suppose that 20, 30, and 50 percent of the parts are produced by $M_1, M_2, M_3$ respectively.  Further, suppose that 1, 2, and 3 percent from $M_1, M_2, M_3$ respectively are defective.

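Suppose we select a single item and find it defective. What is the probability that it was produced by machine $M_2$? Let $B_j$ be the event that the item was produced by $M_j$ and $A$ the event that it is defective.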

$$P(B_2 | A) = \frac{P(B_2)P(A | B_2)}{\sum_{j = 1}^3 P(B_j) P(A | B_j)}$$
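
A short sketch of this formula, assuming $B_j$ means "produced by $M_j$" and $A$ means "the item is defective". Exact fractions keep the arithmetic transparent; the names are illustrative.

```python
from fractions import Fraction

# P(B_j): share of production from each machine
prior = {"M1": Fraction(20, 100), "M2": Fraction(30, 100), "M3": Fraction(50, 100)}

# P(A | B_j): defect rate of each machine
p_defect = {"M1": Fraction(1, 100), "M2": Fraction(2, 100), "M3": Fraction(3, 100)}

# Denominator: sum_j P(B_j) * P(A | B_j)
evidence = sum(prior[m] * p_defect[m] for m in prior)

# Numerator over denominator for machine M2
p_m2_given_defect = prior["M2"] * p_defect["M2"] / evidence
print(p_m2_given_defect, round(float(p_m2_given_defect), 4))  # 6/23 0.2609
```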

In [None]:
##ANSWER


**EXAMPLE**

Suppose that a box contains one fair coin and one coin with a head on each side.  Suppose also that one coin is selected at random and when tossed, a head is obtained.  Determine the probability that the coin is fair.

Let $B_1$ be the event that the coin is fair, $B_2$ that it has two heads, and $H_1$ the event that a head is obtained.

$$P(B_1 | H_1 ) = \frac{P(B_1)P(H_1 | B_1)}{P(B_1)P(H_1 | B_1) + P(B_2)P(H_1 | B_2 )}$$
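
Plugging in the values from the problem gives a quick check. This sketch assumes each coin is chosen with probability 1/2, that the fair coin lands heads with probability 1/2, and that the two-headed coin always lands heads. The variable names are illustrative.

```python
from fractions import Fraction

p_fair = Fraction(1, 2)                   # P(B_1)
p_two_headed = Fraction(1, 2)             # P(B_2)
p_head_given_fair = Fraction(1, 2)        # P(H_1 | B_1)
p_head_given_two_headed = Fraction(1)     # P(H_1 | B_2)

numerator = p_fair * p_head_given_fair
denominator = numerator + p_two_headed * p_head_given_two_headed

p_fair_given_head = numerator / denominator
print(p_fair_given_head)  # 1/3
```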

In [None]:
###ANSWER


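Suppose the coin is tossed again and another head is obtained. Let $H_2$ be the event that the second toss is a head, and determine the updated probability that the coin is fair.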

$$P(B_1 | H_1 \cap H_2) = \frac{P(B_1 | H_1) P(H_2 | B_1 \cap H_1 )}{P(B_1 | H_1)P(H_2 | B_1 \cap H_1) + P(B_2 | H_1)P(H_2 | B_2 \cap H_1)}$$
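
The formula uses the posterior from the first toss as the prior for the second. A minimal sketch of that sequential update follows, assuming the tosses are independent given which coin was picked, so that $P(H_2 | B_1 \cap H_1) = 1/2$ and $P(H_2 | B_2 \cap H_1) = 1$. The helper `update` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction

def update(prior, likelihood):
    """Apply Bayes rule once; both arguments are dicts keyed by hypothesis."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}

# Prior before any toss, and the probability of a head under each hypothesis
prior = {"fair": Fraction(1, 2), "two_headed": Fraction(1, 2)}
p_head = {"fair": Fraction(1, 2), "two_headed": Fraction(1)}

after_first = update(prior, p_head)         # condition on H_1
after_second = update(after_first, p_head)  # then condition on H_2
print(after_first["fair"], after_second["fair"])  # 1/3 1/5
```

Each additional head shifts more weight toward the two-headed coin, so the probability that the coin is fair drops from 1/2 to 1/3 to 1/5.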

In [None]:
##ANSWER


In [None]:
IFrame(src = '', height = 300, width = 300)

In [None]:
!pip install pymc3 -U
