# Bayes's Theorem

$$P(A|B) = \frac{P(A)P(B|A)}{P(B)}$$

As an example, we used data from the General Social Survey and Bayes’s Theorem to compute conditional probabilities. But since we had the complete dataset, we didn’t really need Bayes’s Theorem. It was easy enough to compute the left side of the equation directly, and no easier to compute the right side.

But often we don’t have a complete dataset, and in that case Bayes’s Theorem is more useful. In this chapter, we’ll use it to solve several more challenging problems related to conditional probability.

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("gss_bayes.csv")
df.head()

Unnamed: 0,caseid,year,age,sex,polviews,partyid,indus10
0,1,1974,21.0,1,4.0,2.0,4970.0
1,2,1974,41.0,1,5.0,0.0,9160.0
2,5,1974,58.0,2,6.0,1.0,2670.0
3,6,1974,30.0,1,5.0,4.0,6870.0
4,7,1974,48.0,1,5.0,4.0,7860.0


## The Cookie Problem
Suppose there are two bowls of cookies.

- Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies.

- Bowl 2 contains 20 vanilla cookies and 20 chocolate cookies.

Now suppose you choose one of the bowls at random and, without looking, choose a cookie at random. If the cookie is vanilla, what is the probability that it came from Bowl 1?

What we're trying to find is $P(B_1 | V)$, the probability that the cookie came from Bowl 1 given that it is vanilla.

$$P(B_1 | V) = \frac{P(B_1)P(V|B_1)}{P(V)}$$

- $P(B_1)$, or `prob(bowl_1)`, is 1/2 because the bowl was chosen at random
- $P(V|B_1)$, or `conditional(vanilla, given=bowl_1)`, is 3/4 because 30 of the 40 cookies in Bowl 1 are vanilla
- $P(V)$ is 5/8: the bowls are equally likely and each holds 40 cookies, so every cookie is equally likely to be chosen, and 50 of the 80 cookies are vanilla
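
The value of $P(V)$ can also be checked with the law of total probability, summing over the two bowls. Here $B_2$ stands for Bowl 2, a symbol introduced only for this check:

$$P(V) = P(B_1)P(V|B_1) + P(B_2)P(V|B_2) = \frac{1}{2}\cdot\frac{3}{4} + \frac{1}{2}\cdot\frac{1}{2} = \frac{5}{8}$$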

In [3]:
p_b1 = .5
p_v_given_b1 = .75
p_v = 5/8

In [5]:
p_b1_given_vanilla = p_b1 * p_v_given_b1 / p_v
p_b1_given_vanilla

0.6

In [7]:
def prob(a):
    """Probability of a proposition: the fraction of True values in a Boolean Series."""
    return a.mean()

In [9]:
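def conditional(proposition, given):
    """Probability of proposition conditioned on given."""
    # Keep only the rows where given is True, then take the fraction where proposition holds
    return prob(proposition[given])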
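
As a rough check of how these two helpers behave, the sketch below applies them to a small made-up table with one row per cookie from the Cookie Problem. The `cookies` DataFrame and its `bowl` and `flavor` columns are assumptions for illustration only; they are not part of `gss_bayes.csv`. The helpers are repeated so the block runs on its own.

```python
import pandas as pd

def prob(a):
    """Fraction of True values in a Boolean Series."""
    return a.mean()

def conditional(proposition, given):
    """Probability of proposition among the rows where given is True."""
    return prob(proposition[given])

# Hypothetical toy data: one row per cookie, 40 cookies in each bowl
cookies = pd.DataFrame({
    "bowl": [1] * 40 + [2] * 40,
    "flavor": ["vanilla"] * 30 + ["chocolate"] * 10
              + ["vanilla"] * 20 + ["chocolate"] * 20,
})

# Boolean Series, one entry per cookie
bowl1 = cookies["bowl"] == 1
vanilla = cookies["flavor"] == "vanilla"

p_vanilla = prob(vanilla)                                # 50 of 80 cookies
p_vanilla_given_bowl1 = conditional(vanilla, given=bowl1)  # 30 of 40 cookies
p_bowl1_given_vanilla = conditional(bowl1, given=vanilla)  # 30 of 50 cookies

print(p_vanilla, p_vanilla_given_bowl1, p_bowl1_given_vanilla)
```

Because every cookie is equally likely to be drawn in this setup, counting rows gives the same answer as Bayes's Theorem: `p_bowl1_given_vanilla` comes out to 0.6.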

## Diachronic Bayes
- "Diachronic" comes from "dia" + "chronos", meaning "through time"
- Another way of thinking about Bayes's Theorem: it gives us a way to update the probability of a hypothesis, $H$, given some body of data, $D$


$$P(H|D) = \frac{P(H)P(D|H)}{P(D)}$$

- $P(H)$ is the *prior*, the probability of the hypothesis before we see the data
- $P(H|D)$ is the *posterior*, the probability of the hypothesis after we see the data
- $P(D|H)$ is the *likelihood*, the probability of the data under the hypothesis. "The probability of the data given the hypothesis"
- $P(D)$ is the *total probability of the data* under any hypothesis
- Remember that $P(D) = \sum_i P(H_i)P(D|H_i)$
- This sum is valid when the hypotheses $H_i$ are mutually exclusive and collectively exhaustive

In some cases the prior is subjective; that is, reasonable people might disagree, either because they use different background information or because they interpret the same information differently.

The likelihood is usually the easiest part to compute.

Computing the total probability of the data can be tricky. It is supposed to be the probability of seeing the data under any hypothesis at all, but it can be hard to nail down what that means.

Most often we simplify things by specifying a set of hypotheses that are:

- Mutually exclusive, which means that only one of them can be true, and
- Collectively exhaustive, which means one of them must be true.
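
With such a set of hypotheses, the total probability of the data is the sum of prior times likelihood over the hypotheses, and the update can be written in a few lines. The sketch below is one possible way to do it, using the Cookie Problem as input. The function name `update` and the dictionary layout are assumptions made for illustration, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

def update(priors, likelihoods):
    """Return (posteriors, p_data) for hypotheses keyed by name.

    Assumes the hypotheses are mutually exclusive and collectively
    exhaustive, so the priors must sum to 1.
    """
    if sum(priors.values()) != 1:
        raise ValueError("priors must sum to 1")
    # Numerator of Bayes's Theorem for each hypothesis: P(H) * P(D|H)
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    # Total probability of the data: sum over all hypotheses
    p_data = sum(unnormalized.values())
    # Divide through by P(D) to get P(H|D)
    posteriors = {h: u / p_data for h, u in unnormalized.items()}
    return posteriors, p_data

# Cookie Problem: the data D is "the cookie is vanilla"
priors = {"Bowl 1": Fraction(1, 2), "Bowl 2": Fraction(1, 2)}
likelihoods = {"Bowl 1": Fraction(3, 4), "Bowl 2": Fraction(1, 2)}

posteriors, p_data = update(priors, likelihoods)
print(p_data)                # 5/8
print(posteriors["Bowl 1"])  # 3/5
print(posteriors["Bowl 2"])  # 2/5
```

The posteriors sum to 1 by construction, which is a quick sanity check that the hypothesis set was treated as exhaustive.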