# Bayes's Theorem

## Table of Contents

1. [Introduction](#Introduction)
2. [The Cookie Problem](#The-Cookie-Problem)
3. [Diachronic Bayes](#Diachronic-Bayes)
4. [Bayes Tables](#Bayes-Tables)
5. [The Dice Problem](#The-Dice-Problem)
6. [The Monty Hall Problem](#The-Monty-Hall-Problem)


## Introduction

- In the previous chapter we derived Bayes's Theorem. Because we had a dataset with the full population data, there was no real need to use the theorem: we already had all the information we needed.
- In that chapter it was easier to calculate the left side:
$$P(A|B)$$
than the right:
$$\frac{P(A)P(B|A)}{P(B)}$$
because we already had the data to compute it directly.
- Often, though, we don't have the complete dataset, and that is when Bayes's Theorem becomes more useful.

## The Cookie Problem

- One of the first examples we will look at is a thinly disguised version of the `Urn Problem`:
    - Suppose there are two bowls of cookies:
        - Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies.
        - Bowl 2 contains 20 vanilla cookies and 20 chocolate cookies.
    - Now suppose that you pick a bowl at random and, without looking, take a cookie from it.
    - If the cookie is vanilla, what is the probability that it came from Bowl 1?
    - In mathematical terms this is `P(B1|V)`: the conditional probability that we picked Bowl 1 given that we have a vanilla cookie.
- From the description of the problem we have the following information:
    - The conditional probability of getting a vanilla cookie given Bowl 1, `P(V|B1)`.
    - The conditional probability of getting a vanilla cookie given Bowl 2, `P(V|B2)`.
- Neither of these is the quantity the question asks for, but with Bayes's Theorem we can calculate a conditional probability from its inverse conditional probability.
- Looking back at the problem, we have the following:
    - The probability of picking Bowl 1, `P(B1)`, which is 0.5 because there are two bowls and we pick one at random (this is an assumption on our side).
    - The probability of getting a vanilla cookie given Bowl 1, `P(V|B1)`, which is 0.75 (30 of the 40 cookies in Bowl 1 are vanilla).
    - The probability of getting a vanilla cookie given Bowl 2, `P(V|B2)`, which is 0.5 (20 of the 40 cookies in Bowl 2 are vanilla).
    - The probability of getting a vanilla cookie from either bowl, `P(V)`, which we still need to calculate.
- To calculate `P(V)` we can use the law of total probability:
$$P(V) = P(B_1)P(V|B_1) + P(B_2)P(V|B_2)$$
$$= (0.5)(0.75) + (0.5)(0.5) = \frac{5}{8}$$
- We could also have calculated this result directly:
    - The bowls have an equal chance of being picked and each bowl holds the same number of cookies, so every cookie has an equal chance of being selected.
    - Between the two bowls there are 50 vanilla cookies and 30 chocolate ones, so `P(V)` is 50/80 = 5/8.
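
As a quick check that the two routes to `P(V)` agree, here is a minimal sketch using exact fractions. The dictionary layout and variable names are illustrative choices, not part of the original problem.

```python
from fractions import Fraction

# Cookie counts taken from the problem statement.
bowls = {
    "Bowl 1": {"vanilla": 30, "chocolate": 10},
    "Bowl 2": {"vanilla": 20, "chocolate": 20},
}

# Each bowl is equally likely to be chosen (the stated assumption).
prior = Fraction(1, len(bowls))

# Law of total probability: sum of P(B_i) * P(V | B_i).
p_vanilla_total = sum(
    prior * Fraction(c["vanilla"], c["vanilla"] + c["chocolate"])
    for c in bowls.values()
)

# Direct count: pool all the cookies. This shortcut is valid here only
# because both bowls hold the same number of cookies and are equally likely.
all_vanilla = sum(c["vanilla"] for c in bowls.values())
all_cookies = sum(sum(c.values()) for c in bowls.values())
p_vanilla_pooled = Fraction(all_vanilla, all_cookies)

print(p_vanilla_total, p_vanilla_pooled)  # 5/8 5/8
```

If the bowls held different numbers of cookies, the pooled count would no longer match, while the law of total probability would still apply.
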
- Finally, we apply Bayes's Theorem to solve the problem:
$$P(B_1|V) = \frac{P(B_1)P(V|B_1)}{P(V)} = \frac{(\frac{1}{2})(\frac{3}{4})}{\frac{5}{8}} = \frac{3}{5}$$
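
The same arithmetic can be written as a small function. This is only a sketch: the function name `bayes` and its argument names are made up for illustration, and `Fraction` is used so the result comes out as an exact ratio.

```python
from fractions import Fraction


def bayes(prior, likelihood, total):
    """Return P(A|B) = P(A) * P(B|A) / P(B)."""
    return prior * likelihood / total


p_b1 = Fraction(1, 2)          # P(B1): bowls are picked at random
p_v_given_b1 = Fraction(3, 4)  # P(V|B1): 30 of 40 cookies are vanilla
p_v = Fraction(5, 8)           # P(V): from the law of total probability

p_b1_given_v = bayes(p_b1, p_v_given_b1, p_v)
print(p_b1_given_v)  # 3/5
```
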

## Diachronic Bayes

- There is another way to think about Bayes's Theorem: it gives us a way to update the probability of a hypothesis, `H`, given some body of data, `D`.
- This interpretation is diachronic, which means "related to change over time": the probability of the hypothesis changes as we see new data.
- Substituting the new variables into Bayes's Theorem gives:
$$P(H|D) = \frac{P(H)P(D|H)}{P(D)} $$
- In this interpretation each term has a name:
    - `P(H)` is the probability of the hypothesis before we see the data, called the prior probability, or just the prior.
    - `P(H|D)` is the probability of the hypothesis after we have seen the data, called the posterior.
    - `P(D|H)` is the probability of the data under the hypothesis, called the likelihood.
    - `P(D)` is the total probability of the data, under any hypothesis.
- Sometimes the prior can be calculated from the information given. In the cookie problem, for instance, we are told that a bowl is chosen at random, so each bowl has a prior of 0.5.
- In other cases the prior is subjective, and reasonable people may disagree about what value it should have.
- The likelihood is usually the easiest part to calculate. In the cookie problem we were given the number of cookies in each bowl, so we could calculate the probability of the data under each hypothesis.
- Computing the total probability of the data can be tricky: it is supposed to be the probability of the data under any hypothesis at all.
- Most often we simplify things by specifying a set of hypotheses that are:
    - Mutually exclusive: only one of them can be true.
    - Collectively exhaustive: one of them must be true, so the correct answer is in the set.
- When these conditions apply, we can calculate the total probability with the law of total probability. For two hypotheses:
$$P(D) = P(H_1)P(D|H_1) + P(H_2)P(D|H_2)$$
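
The two-hypothesis formula is a special case. As a sketch of the general form, assume a set of `n` mutually exclusive and collectively exhaustive hypotheses `H_1, ..., H_n` (the indices `i`, `k` and `n` are notation introduced here for illustration):

$$P(D) = \sum_{i=1}^{n} P(H_i)P(D|H_i)$$

Substituting this into the diachronic form of Bayes's Theorem gives the posterior for any one hypothesis `H_k`:

$$P(H_k|D) = \frac{P(H_k)P(D|H_k)}{\sum_{i=1}^{n} P(H_i)P(D|H_i)}$$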

## Bayes Tables

- A convenient tool for doing a Bayesian update is a Bayes table. You can write a Bayes table on paper or build one in Python.
- We will first make an empty DataFrame with one row per hypothesis:


In [19]:
import pandas as pd
table = pd.DataFrame(index=["Bowl 1", "Bowl 2"])

 - Now we can add a column to represent the Priors:

In [20]:
table["priors"] = [0.5,0.5]
table

| | priors |
|---|---|
| Bowl 1 | 0.5 |
| Bowl 2 | 0.5 |


 - And a column for the likelihoods:

In [21]:
table["likelihood"] = 3/4, 1/2
table

| | priors | likelihood |
|---|---|---|
| Bowl 1 | 0.5 | 0.75 |
| Bowl 2 | 0.5 | 0.5 |


 - This is where the Bayes table differs from the previous method: we compute a result for every hypothesis, not just for Bowl 1:
    - The hypothesis that the vanilla cookie came from Bowl 1.
    - The hypothesis that the vanilla cookie came from Bowl 2.
 - You might be bothered that the likelihoods don't add up to 1. That is fine: each one is the probability of a vanilla cookie under a different hypothesis, so there is no reason for them to add up to 1.
 - The next step is similar to the previous method: we multiply the priors by the likelihoods.

In [22]:
table["unnorm"] = table["priors"] * table["likelihood"]
table

| | priors | likelihood | unnorm |
|---|---|---|---|
| Bowl 1 | 0.5 | 0.75 | 0.375 |
| Bowl 2 | 0.5 | 0.5 | 0.25 |


 - We call the result `unnorm` because these values are the unnormalized posteriors. Each of them is the product of a prior and a likelihood:
$$P(B_i)P(D|B_i)$$
 - This is the numerator of Bayes's Theorem. If we add them all up we get the law of total probability, which is the denominator `P(D)`:

In [23]:
table["unnorm"].sum()

0.625

 - This is the same value we got for `P(V)` in the previous method (5/8).
 - We can now calculate the posterior probabilities:

In [24]:
table["posterior"] = table["unnorm"] / table["unnorm"].sum()
table

| | priors | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Bowl 1 | 0.5 | 0.75 | 0.375 | 0.6 |
| Bowl 2 | 0.5 | 0.5 | 0.25 | 0.4 |


 - The posterior probability for Bowl 1 is 0.6, which is what we got when we used Bayes's Theorem directly, and as a bonus we also get the posterior for Bowl 2.
 - Dividing by the sum of the `unnorm` column normalizes the values so that the posteriors add up to 1.
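
The table steps (multiply, sum, divide) do not depend on pandas. As a minimal sketch of the same update in plain Python, the hypothetical helper `bayes_update` below takes two dictionaries keyed by hypothesis name; the name and signature are assumptions made for this example.

```python
def bayes_update(priors, likelihoods):
    """Return the posterior probability of each hypothesis as a dict.

    priors and likelihoods are dicts keyed by hypothesis name.
    """
    # Numerator of Bayes's Theorem for every hypothesis.
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}
    # Their sum is the total probability of the data.
    total = sum(unnorm.values())
    if total == 0:
        raise ValueError("the data are impossible under every hypothesis")
    # Dividing by the total makes the posteriors add up to 1.
    return {h: u / total for h, u in unnorm.items()}


posterior = bayes_update(
    priors={"Bowl 1": 0.5, "Bowl 2": 0.5},
    likelihoods={"Bowl 1": 0.75, "Bowl 2": 0.5},
)
print(posterior)  # {'Bowl 1': 0.6, 'Bowl 2': 0.4}
```

The DataFrame version is still handier for inspection, because it keeps the intermediate columns visible.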

## The Dice Problem

- We can also use a Bayes table to solve problems with more than two hypotheses. For example:
    - Suppose there is a box with a 6-sided die, an 8-sided die and a 12-sided die. I choose one die at random, roll it, and report that the outcome is 1. What is the probability that I chose the 6-sided die?
- In this example there are three hypotheses, one for each die that could have been rolled to produce a 1. Each die is equally likely to be chosen, so each prior is 1/3, and the likelihood of rolling a 1 is 1/6, 1/8 and 1/12 respectively.
- Let's create a Bayes table to solve it:

In [25]:
table_2 = pd.DataFrame(index = ["6 Sided", "8 Sided", "12 sided"])
table_2["priors"] = 1/3 , 1/3 , 1/3
table_2["likelihood"] = 1/6, 1/8, 1/12
table_2["unnorm"] = table_2.priors * table_2.likelihood
table_2["posterior"] = table_2.unnorm / table_2.unnorm.sum()
table_2

| | priors | likelihood | unnorm | posterior |
|---|---|---|---|---|
| 6 Sided | 0.333333 | 0.166667 | 0.055556 | 0.444444 |
| 8 Sided | 0.333333 | 0.125 | 0.041667 | 0.333333 |
| 12 sided | 0.333333 | 0.083333 | 0.027778 | 0.222222 |


- From this, the probability that I chose the 6-sided die, given that I rolled a 1, is 4/9, or about 0.44.
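
One way to sanity-check the table is to simulate the experiment many times and keep only the runs where a 1 was rolled. This is a rough sketch: the function name `simulate_dice`, the trial count and the seed are arbitrary choices for illustration, and the estimates are only approximate.

```python
import random


def simulate_dice(trials=200_000, seed=0):
    """Estimate P(die | rolled a 1) by repeated sampling."""
    rng = random.Random(seed)
    sides_options = [6, 8, 12]
    counts = {sides: 0 for sides in sides_options}
    for _ in range(trials):
        sides = rng.choice(sides_options)  # pick a die at random
        roll = rng.randint(1, sides)       # roll it
        if roll == 1:                      # keep only runs that match the data
            counts[sides] += 1
    total = sum(counts.values())
    return {sides: n / total for sides, n in counts.items()}


estimate = simulate_dice()
print(estimate)  # values should be close to 4/9, 3/9 and 2/9
```
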

## The Monty Hall Problem

- This is one of the most famous puzzles in probability.
- It is based on a game show with the following setup:
    - The contestant picks one of 3 doors, and a prize is behind one of them.
    - Once they have picked a door, the host (Monty) opens one of the other doors to show that the prize is not behind it, and then offers the contestant the choice to switch doors or stick with their original pick.
- To decide whether we should stick or switch, we make a few assumptions about the behaviour of the host:
    - The host always opens a door and offers the switch.
    - The host never opens the picked door or the door with the prize.
    - If the prize is behind the picked door, the host opens one of the other two doors at random.
- For this analysis, suppose the contestant has picked door 1 and Monty opens door 3.
- Let's use a Bayes table to analyse this problem:

In [26]:
table_3 = pd.DataFrame(index=["Prize Behind Door 1","Prize Behind Door 2","Prize Behind Door 3"])
table_3["prior"] = 1/3, 1/3, 1/3
table_3

| | prior |
|---|---|
| Prize Behind Door 1 | 0.333333 |
| Prize Behind Door 2 | 0.333333 |
| Prize Behind Door 3 | 0.333333 |


- We set the priors to be equal because the location of the prize is chosen at random, so all doors have the same chance of hiding it.
- The likelihood needs more thought, and it follows from the assumptions above. Remember that the data is "the host opens door 3", so each likelihood is the probability of that event under one hypothesis:
    - If the prize is behind door 1, the host chooses between doors 2 and 3 at random, so the likelihood is 0.5.
    - If the prize is behind door 2, the host has to open door 3, so the likelihood is 1.
    - If the prize is behind door 3, the host will not open it, so the likelihood is 0.

In [27]:
table_3["likelihood"] = 0.5,1,0
table_3["unnorm"] = table_3.prior * table_3.likelihood
table_3["posterior"] = table_3.unnorm / table_3.unnorm.sum()
table_3

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Prize Behind Door 1 | 0.333333 | 0.5 | 0.166667 | 0.333333 |
| Prize Behind Door 2 | 0.333333 | 1.0 | 0.333333 | 0.666667 |
| Prize Behind Door 3 | 0.333333 | 0.0 | 0.0 | 0.0 |


- Given that the contestant originally picked door 1 and the host opened door 3, the table tells us:
    - The posterior probability that the prize is behind door 2 is 2/3, compared with 1/3 for door 1.
    - So the contestant is better off switching.
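
Because the result is counter-intuitive, a simulation can help. The sketch below plays the game many times under the host assumptions listed earlier and keeps only the games that match the scenario (contestant picks door 1, host opens door 3). The function name `simulate_monty`, the trial count and the seed are illustrative choices, and the output is an estimate rather than an exact value.

```python
import random


def simulate_monty(trials=100_000, seed=0):
    """Estimate where the prize is, given pick = door 1 and host opens door 3."""
    rng = random.Random(seed)
    counts = {1: 0, 2: 0, 3: 0}
    pick = 1
    for _ in range(trials):
        prize = rng.choice([1, 2, 3])
        # The host never opens the picked door or the prize door.
        options = [d for d in (1, 2, 3) if d != pick and d != prize]
        opened = rng.choice(options)  # a random choice when two doors qualify
        if opened == 3:               # keep only games that match the scenario
            counts[prize] += 1
    total = sum(counts.values())
    return {door: n / total for door, n in counts.items()}


estimate = simulate_monty()
print(estimate)  # door 2 should come out near 2/3, door 1 near 1/3
```
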

## Source

[Chapter 2](http://allendowney.github.io/ThinkBayes2/chap02.html)

## Exercises from the chapter:

There are two coins in a box: a normal coin and a trick coin with heads on both sides. You pick a coin at random and flip it, and it lands heads. What is the probability that you picked the trick coin?

In [28]:
table_4 = pd.DataFrame(index = ["Normal Coin", "Trick Coin"])

table_4["prior"] = 1/2, 1/2
table_4["likelihood"] = 1/2, 1
table_4["unnorm"] = table_4["prior"] * table_4["likelihood"]
table_4["posterior"] = table_4["unnorm"] / sum(table_4["unnorm"])

table_4

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Normal Coin | 0.5 | 0.5 | 0.25 | 0.333333 |
| Trick Coin | 0.5 | 1.0 | 0.5 | 0.666667 |


You talk to someone who has two children, and they mention that at least one of the children is a girl. What is the probability that both are girls? (Hint: start with four equally likely hypotheses.)

In [29]:
table_5 = pd.DataFrame(index = ["Girl, Girl", "Girl, Boy", "Boy, Girl", "Boy, Boy"])

table_5["prior"] = 1/4, 1/4, 1/4, 1/4
table_5["likelihood"]= 1, 1, 1, 0
table_5["unnorm"] = table_5["prior"] * table_5["likelihood"]
table_5["posterior"] = table_5["unnorm"] / sum(table_5["unnorm"])
table_5

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Girl, Girl | 0.25 | 1 | 0.25 | 0.333333 |
| Girl, Boy | 0.25 | 1 | 0.25 | 0.333333 |
| Boy, Girl | 0.25 | 1 | 0.25 | 0.333333 |
| Boy, Boy | 0.25 | 0 | 0.0 | 0.0 |

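The four-row table for the two-children exercise can be cross-checked by listing the families explicitly. This is a small sketch that assumes, as the equal priors do, that boys and girls are equally likely and that the two children are independent. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All (first child, second child) combinations, assumed equally likely.
families = list(product(["Girl", "Boy"], repeat=2))

# Keep the families consistent with "at least one child is a girl".
with_girl = [f for f in families if "Girl" in f]

# Among those, count the families where both children are girls.
both_girls = [f for f in with_girl if f == ("Girl", "Girl")]

p_both_girls = Fraction(len(both_girls), len(with_girl))
print(p_both_girls)  # 1/3
```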

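The colour mix in a bag of M&M's changed between 1994 and 1996. In a 1994 bag, 20% of the candies are yellow and 10% are green. In a 1996 bag, 14% are yellow and 20% are green. A friend has one bag from each year and gives me one M&M from each bag: one is yellow and one is green. What is the probability that the yellow one came from the 1994 bag?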

In [30]:
table_6 = pd.DataFrame(index = ["94", "96"])
table_6["prior"] = 1/2, 1/2
table_6["likelihood"] = 0.2*0.2, 0.14*0.1
table_6["unnorm"] = table_6["prior"] * table_6["likelihood"]
table_6["posterior"] = table_6["unnorm"] / sum(table_6["unnorm"])
table_6

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| 94 | 0.5 | 0.04 | 0.02 | 0.740741 |
| 96 | 0.5 | 0.014 | 0.007 | 0.259259 |
