# Bayes’s Theorem

In [9]:
# format the book
import sys
sys.path.insert(0,'../LIBRARY')

In [None]:
import numpy as np
import pandas as pd
import LIBRARY.thinkbayes2.thinkbayes2

In [None]:
# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'

# Conditional probability

The fundamental idea behind all Bayesian statistics is Bayes’s theorem, which is surprisingly easy to derive, provided that you understand conditional probability. So we’ll start with probability, then conditional probability, then Bayes’s theorem, and on to Bayesian statistics.

A probability is a number between 0 and 1 (including both) that represents a degree of belief in a fact or prediction. The value 1 represents certainty that a fact is true, or that a prediction will come true. The value 0 represents certainty that the fact is false.

# How to use Bayes’s theorem
At this point we have everything we need to derive Bayes’s theorem. We’ll start with the observation that conjunction is commutative; that is

p(𝐴 and 𝐵)=p(𝐵 and 𝐴)

for any events 𝐴 and 𝐵.

Next, we write the probability of a conjunction:

p(𝐴 and 𝐵)=p(𝐴) p(𝐵|𝐴)

Since we have not said anything about what 𝐴 and 𝐵 mean, they are interchangeable. Interchanging them yields

p(𝐵 and 𝐴)=p(𝐵) p(𝐴|𝐵)

That’s all we need. Pulling those pieces together, we get

p(𝐵) p(𝐴|𝐵)=p(𝐴) p(𝐵|𝐴)

This means there are two ways to compute the conjunction. If you have p(𝐴), you multiply by the conditional probability p(𝐵|𝐴). Or you can do it the other way around: if you know p(𝐵), you multiply by p(𝐴|𝐵). Either way you should get the same thing.
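As a quick numerical check of this symmetry, the sketch below builds a small joint distribution over two binary events and computes the conjunction both ways. The counts and the helper names (`prob`, `conditional`) are made up for illustration and are not part of the original notebook.

```python
from fractions import Fraction

# Hypothetical counts of outcomes for two binary events A and B
# (illustrative numbers only).
counts = {
    (True, True): 3,    # A and B
    (True, False): 1,   # A and not B
    (False, True): 2,   # not A and B
    (False, False): 4,  # neither A nor B
}
total = sum(counts.values())


def prob(predicate):
    """Probability that predicate(a, b) holds under the counts above."""
    matching = sum(n for (a, b), n in counts.items() if predicate(a, b))
    return Fraction(matching, total)


def conditional(event, given):
    """p(event | given) = p(event and given) / p(given)."""
    return prob(lambda a, b: event(a, b) and given(a, b)) / prob(given)


def is_a(a, b):
    return a


def is_b(a, b):
    return b


p_a_and_b = prob(lambda a, b: a and b)
way_1 = prob(is_a) * conditional(is_b, is_a)  # p(A) p(B|A)
way_2 = prob(is_b) * conditional(is_a, is_b)  # p(B) p(A|B)

print(p_a_and_b, way_1, way_2)
```

Using `Fraction` keeps the arithmetic exact, so the three values can be compared with `==` rather than with a floating-point tolerance.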

Finally, we can divide through by p(𝐵):

p(𝐴|𝐵)=p(𝐴) p(𝐵|𝐴) / p(𝐵)
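The final formula translates directly into a one-line function. The sketch below is only an illustration: the name `bayes_posterior` and the input values are assumptions, not something defined in this notebook.

```python
from fractions import Fraction


def bayes_posterior(p_a, p_b_given_a, p_b):
    """Return p(A|B) = p(A) * p(B|A) / p(B)."""
    if p_b == 0:
        raise ValueError("p(B) must be nonzero to condition on B")
    return p_a * p_b_given_a / p_b


# Illustrative inputs: p(A) = 2/5, p(B|A) = 3/4, p(B) = 1/2
posterior = bayes_posterior(Fraction(2, 5), Fraction(3, 4), Fraction(1, 2))
print(posterior)
```

The guard on `p_b` reflects the one condition the derivation relies on: dividing through by p(𝐵) only makes sense when p(𝐵) is not zero.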

# Solving a question:
I’ll write 𝐵1 for the hypothesis that we chose All India and 𝑉 for the observation of an average literacy rate level. Plugging these into Bayes’s theorem, we get

p(𝐵1|𝑉)=p(𝐵1) p(𝑉|𝐵1) / p(𝑉)

The term on the left is what we want: the probability that we chose All India, given that we observed the average literacy rate level. The terms on the right are:

- p(𝐵1): This is the probability that we chose All India, unconditioned by what kind of average literacy rate we got. Since the problem says we chose a state at random, we can assume p(𝐵1)=1/2.

- p(𝑉|𝐵1): This is the probability of getting a vanilla cookie from Bowl 1, which is 3/4.

- p(𝑉): This is the probability of drawing a vanilla cookie from either bowl. Since we had an equal chance of choosing either bowl and the bowls contain the same number of cookies, we had the same chance of choosing any cookie. Between the two bowls there are 50 vanilla and 30 chocolate cookies, so p(𝑉)=5/8.

Putting it together, we have

p(𝐵1|𝑉)=(1/2) (3/4) / (5/8)

which reduces to 3/5. So the vanilla cookie is evidence in favor of the hypothesis that we chose Bowl 1, because vanilla cookies are more likely to come from Bowl 1.
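The arithmetic can be checked with exact fractions. In the sketch below, the per-bowl counts are inferred from the figures in the text (equal bowl sizes, Bowl 1 is 3/4 vanilla, 50 vanilla and 30 chocolate in total), so the Bowl 2 counts are an assumption derived from those totals rather than something stated directly.

```python
from fractions import Fraction

# Counts inferred from the text: 80 cookies in two equal bowls means
# 40 per bowl; Bowl 1 is 3/4 vanilla, and Bowl 2 holds the remainder.
bowls = {
    "Bowl 1": {"vanilla": 30, "chocolate": 10},
    "Bowl 2": {"vanilla": 20, "chocolate": 20},
}

# Equal chance of choosing either bowl.
prior = {name: Fraction(1, 2) for name in bowls}

# p(V | bowl): fraction of vanilla cookies in each bowl.
likelihood = {
    name: Fraction(c["vanilla"], c["vanilla"] + c["chocolate"])
    for name, c in bowls.items()
}

# p(V) by the law of total probability.
p_v = sum(prior[name] * likelihood[name] for name in bowls)

# Bayes's theorem for Bowl 1.
posterior_bowl_1 = prior["Bowl 1"] * likelihood["Bowl 1"] / p_v

print(p_v, posterior_bowl_1)
```

Computing p(𝑉) as a sum over the bowls gives the same 5/8 as counting vanilla cookies directly, because every cookie is equally likely to be drawn.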

This example demonstrates one use of Bayes’s theorem: it provides a strategy to get from p(𝐵|𝐴) to p(𝐴|𝐵). This strategy is useful in cases, like the cookie problem, where it is easier to compute the terms on the right side of Bayes’s theorem than the term on the left.

# Bayesian updates using the table method

This notebook demonstrates a way of doing simple Bayesian updates using the table method, with a Pandas DataFrame as the table.

# The BayesTable class

Here's the class that represents a Bayesian table.

In [3]:
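class BayesTable(pd.DataFrame):
    """A table with one row per hypothesis, used for Bayesian updates."""

    def __init__(self, hypothesis, prior=1):
        columns = ['hypothesis', 'prior', 'likelihood', 'unnormalized', 'posterior']
        super().__init__(columns=columns)
        # Assigning to an existing column name fills that column.
        self.hypothesis = hypothesis
        # A scalar prior is broadcast to every row, giving equal priors.
        self.prior = prior

    def multiply(self):
        """Compute the unnormalized posteriors: prior times likelihood."""
        self.unnormalized = self.prior * self.likelihood

    def normalize(self):
        """Scale the unnormalized values to sum to 1; return the normalizing constant."""
        nc = np.sum(self.unnormalized)
        self.posterior = self.unnormalized / nc
        return nc

    def update(self):
        """Multiply and normalize in one step; return the normalizing constant."""
        self.multiply()
        return self.normalize()

    def reset(self):
        """Return a new table whose priors are this table's posteriors."""
        return BayesTable(self.hypothesis, self.posterior)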

In [4]:
table = BayesTable(['All India', 'Andhra Pradesh'])
table

|   | hypothesis | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|---|
| 0 | All India | 1 | NaN | NaN | NaN |
| 1 | Andhra Pradesh | 1 | NaN | NaN | NaN |


Since we didn't specify prior probabilities, the default value is equal priors for all hypotheses.

Now we can specify the likelihoods:

- The likelihood of observing the average literacy rate level for All India is 1/2.

- The likelihood of observing the average literacy rate level for Andhra Pradesh is 1/2.

Here's how we plug the likelihoods in:


In [5]:
table.likelihood = [1/2, 1/2]
table

|   | hypothesis | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|---|
| 0 | All India | 1 | 0.5 | NaN | NaN |
| 1 | Andhra Pradesh | 1 | 0.5 | NaN | NaN |


In [6]:
table.multiply()
table

|   | hypothesis | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|---|
| 0 | All India | 1 | 0.5 | 0.5 | NaN |
| 1 | Andhra Pradesh | 1 | 0.5 | 0.5 | NaN |


In [7]:
table.normalize()

1.0

In [8]:
table

|   | hypothesis | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|---|
| 0 | All India | 1 | 0.5 | 0.5 | 0.5 |
| 1 | Andhra Pradesh | 1 | 0.5 | 0.5 | 0.5 |


We can read the posterior probabilities from the last column: the probability that we chose All India, given the observed average literacy rate level, is 50%.
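Because both likelihoods are equal here, the update leaves the equal priors unchanged. For contrast, the sketch below runs the same multiply and normalize steps on the cookie problem, where the likelihoods differ. It uses plain Python dictionaries rather than `BayesTable` so the arithmetic stays visible; the Bowl 2 likelihood of 1/2 is inferred from the cookie totals quoted earlier, and the variable names are illustrative.

```python
# One row per hypothesis, mirroring the columns of the table.
# The Bowl 2 likelihood (1/2) is an inference from the cookie totals.
rows = [
    {"hypothesis": "Bowl 1", "prior": 0.5, "likelihood": 0.75},
    {"hypothesis": "Bowl 2", "prior": 0.5, "likelihood": 0.5},
]

# Multiply step: unnormalized posterior = prior * likelihood.
for row in rows:
    row["unnormalized"] = row["prior"] * row["likelihood"]

# Normalize step: divide by the total so the posteriors sum to 1.
normalizing_constant = sum(row["unnormalized"] for row in rows)
for row in rows:
    row["posterior"] = row["unnormalized"] / normalizing_constant

for row in rows:
    print(row["hypothesis"], row["posterior"])
print("normalizing constant:", normalizing_constant)
```

The posteriors come out at about 0.6 for Bowl 1 and 0.4 for Bowl 2, matching the 3/5 obtained from Bayes’s theorem, and the normalizing constant equals p(𝑉)=5/8.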
