Bayes' theorem is used to compute a conditional probability called the posterior probability.

$$P(A|B) = \frac{P(A)P(B|A)}{P(B)}$$

## Problem 1: The Cookie Problem

There are two bowls containing vanilla and chocolate cookies.  Bowl 1 has 30 vanilla cookies and 10 chocolate cookies.  Bowl 2 has 20 vanilla cookies and 20 chocolate cookies.

Let's say that you choose a bowl at random and, without looking, pick a cookie from it.  Given that it's a vanilla cookie, what is the probability that it came from bowl 1?

This is not the same as calculating the probability of getting a vanilla cookie from bowl 1. Instead, the sample set is both bowls, one of which you choose at random without knowing which, and the question is: if you picked a vanilla cookie, what is the probability that it came from bowl 1? The question looks at the whole problem rather than at the outcome within one specific bowl.

The probability of choosing a vanilla cookie from bowl 1 is:
$$P(V|B_1)=\frac{n_{\text{vanilla},B_1}}{n_{B_1}}$$
$$P(V|B_1)=\frac{30}{40} = 3/4$$

The probability of randomly choosing a bowl is:
$$P(B_1) = P(B_2) = 0.5$$

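The total probability of a vanilla cookie across the entire problem set (data) is:
$$P(V) = P(B_1)P(V|B_1) + P(B_2)P(V|B_2) = \frac{1}{2}*\frac{3}{4} + \frac{1}{2}*\frac{1}{2} = 5/8$$

Because both bowls hold 40 cookies, this equals the overall share of vanilla cookies, 50/80.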

We can use Bayes' theorem to calculate the posterior:

$$P(B_1|V) = \frac{P(V|B_1)*P(B_1)}{P(V)}$$

$$P(B_1|V) = \frac{\frac{3}{4}*\frac{1}{2}}{\frac{5}{8}} = 3/5 $$
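
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using exact fractions; the variable names (`bowls`, `prior`, `likelihood`, `p_vanilla`) are illustrative and not part of the original notebook.

```python
from fractions import Fraction

# Cookie counts taken from the problem statement.
bowls = {
    "bowl_1": {"vanilla": 30, "chocolate": 10},
    "bowl_2": {"vanilla": 20, "chocolate": 20},
}

prior = Fraction(1, 2)  # each bowl is equally likely to be chosen

# Likelihood: P(V | B_i) = vanilla cookies in bowl i / all cookies in bowl i
likelihood = {
    name: Fraction(c["vanilla"], c["vanilla"] + c["chocolate"])
    for name, c in bowls.items()
}

# Total probability of vanilla: sum over bowls of P(B_i) * P(V | B_i)
p_vanilla = sum(prior * lk for lk in likelihood.values())

# Bayes' theorem for bowl 1
posterior_bowl_1 = prior * likelihood["bowl_1"] / p_vanilla
print(p_vanilla, posterior_bowl_1)  # 5/8 3/5
```

Working with `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.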

# Bayesian Update

A Bayesian update computes the probability of an event given the data, with the understanding that new data will keep revising that probability.

__prior__ : probability of the event or hypothesis before we see the data.

__likelihood__ : probability of the data given the hypothesis (that is, the probability of the observed outcome within one specific group of the sample set).

__total probability of the data__ : probability of the observed data across all hypotheses.

__posterior__ : probability of the event or hypothesis after we see the data (i.e. our prior belief, updated with the new data).
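
The four terms above fit together in one short function. The sketch below is one possible way to write it, not a method taken from the notebook; the name `bayes_update` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction


def bayes_update(priors, likelihoods):
    """Return the posterior for each hypothesis.

    Both arguments are dicts keyed by hypothesis name (a hypothetical
    layout chosen for this sketch).
    """
    # Unnormalized posterior: prior * likelihood for each hypothesis.
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}
    # Total probability of the data, summed over all hypotheses.
    total = sum(unnorm.values())
    if total == 0:
        raise ValueError("the data has zero probability under every hypothesis")
    return {h: u / total for h, u in unnorm.items()}


# Usage with the cookie problem: P(B_1) = P(B_2) = 1/2, P(V|B_1) = 3/4, P(V|B_2) = 1/2
cookie_posterior = bayes_update(
    {"bowl_1": Fraction(1, 2), "bowl_2": Fraction(1, 2)},
    {"bowl_1": Fraction(3, 4), "bowl_2": Fraction(1, 2)},
)
print(cookie_posterior["bowl_1"], cookie_posterior["bowl_2"])  # 3/5 2/5
```

The posteriors always sum to 1 because every unnormalized value is divided by the same total.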

## Problem 2: 

Suppose you have two coins in a box.  One is a normal coin with heads on one side and tails on the other, and one is a trick coin with heads on both sides.  You choose a coin at random and see that one of the sides is heads.  What is the probability that you chose the trick coin?

Let $C_n$ be the normal coin and $C_t$ the trick coin.

$$P(C_n)=P(C_t)=0.5$$

$$P(H|C_n)=\frac{1}{2}$$

$$P(H|C_t)=1$$

$$P(H) = P(C_n)*P(H|C_n) + P(C_t)*P(H|C_t) = \frac{1}{2}*\frac{1}{2} + \frac{1}{2}*1 = 3/4$$

$$P(C_t|H) = \frac{P(C_t)*P(H|C_t)}{P(H)} =\frac{\frac{1}{2}*1}{\frac{3}{4}} = 2/3$$
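
Another way to see the coin result is to enumerate the equally likely outcomes instead of applying the formula. The sketch below assumes that the coin and the visible face are both picked uniformly at random; the names `coins` and `outcomes` are illustrative.

```python
from fractions import Fraction

# Each coin is described by its two faces.
coins = {"normal": ("H", "T"), "trick": ("H", "H")}

# Enumerate the four equally likely (coin, visible face) outcomes.
outcomes = [(name, face) for name, faces in coins.items() for face in faces]

# Keep only the outcomes consistent with the observation "the face is heads".
heads = [o for o in outcomes if o[1] == "H"]
trick_and_heads = [o for o in heads if o[0] == "trick"]

p_heads = Fraction(len(heads), len(outcomes))
p_trick_given_heads = Fraction(len(trick_and_heads), len(heads))
print(p_heads, p_trick_given_heads)  # 3/4 2/3
```

Three of the four outcomes show heads, and two of those three belong to the trick coin, which gives 2/3.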

  

## Problem 3: 

Suppose you meet someone and learn that they have two children.  You ask if either child is a girl and they say yes.  What is the probability that both children are girls?

- H1 (GB): the first child is a girl and the second is a boy.
- H2 (BG): the first child is a boy and the second is a girl.
- H3 (GG): both children are girls.

$$P(GG|\text{at least 1 girl}) = \frac{P(GG)*P(\text{at least 1 girl}|GG)}{P(\text{at least 1 girl})}$$

In [1]:
import pandas as pd
df = pd.DataFrame(index = ['GB','BG','GG'])
df['prior'] = 1/3  # two boys (BB) is already ruled out, so the three remaining hypotheses are equally likely
df['likelihood'] = [1, 1, 1]  # P(at least one girl | hypothesis) is 1 for each hypothesis

In [2]:
unnorm = df['prior'] * df['likelihood']  # prior times likelihood, before normalizing
df['posterior'] = unnorm / unnorm.sum()
df

| | prior | likelihood | posterior |
|---|---|---|---|
| GB | 0.333333 | 1 | 0.333333 |
| BG | 0.333333 | 1 | 0.333333 |
| GG | 0.333333 | 1 | 0.333333 |


In [None]:
## The probability that both children are girls is 1/3 (the GG posterior, about 0.333).
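
The three-row table above starts from hypotheses that already exclude two boys. As a cross-check, here is a sketch that starts from all four birth orders and lets the likelihood remove BB. It assumes boys and girls are equally likely and that the two births are independent; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All birth orders for two children: GG, GB, BG, BB (assumed equally likely).
hypotheses = ["".join(pair) for pair in product("GB", repeat=2)]
prior = {h: Fraction(1, 4) for h in hypotheses}

# Likelihood of the answer "yes, at least one is a girl" under each hypothesis.
likelihood = {h: Fraction(1 if "G" in h else 0) for h in hypotheses}

# Bayesian update: multiply, then normalize by the total probability of the data.
unnorm = {h: prior[h] * likelihood[h] for h in hypotheses}
total = sum(unnorm.values())
posterior = {h: unnorm[h] / total for h in hypotheses}

print(total, posterior["GG"], posterior["BB"])  # 3/4 1/3 0
```

Both routes give 1/3 for two girls, because dropping BB up front and assigning it zero likelihood lead to the same normalization.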

## Problem 4: 

M&M's are small candy-coated chocolates that come in a variety of colors.  Mars, Inc., which makes M&M's, changes the mixture of colors from time to time.  In 1995, they introduced blue M&M's.  In 1994, the color mix in a bag of plain M&M's was 30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 10% Tan.

In 1996, it was 24% Blue, 20% Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown.

Suppose a friend of mine has two bags, and he tells me that one is from 1994 and the other from 1996.  He won't tell me which is which, but he gives me one M&M from each bag.  One is yellow and one is green.  What is the probability that the yellow one came from the 1994 bag?


In [12]:
import pandas as pd

mnm_df = pd.DataFrame(index = ['1994','1996'])
n = 100

P94 = P96 = 1/2 #equal probability of either bag

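# He gives one M&M from each bag: one is yellow and one is green.
# What is the probability that the yellow one is from the 1994 bag?
# Can we dismiss the fact that he gave two? No: the green M&M is evidence too,
# because green has a different probability in each bag. This first attempt uses only the yellow one.

Py94 = 0.2  # P(yellow | 1994 bag)
pg96 = 0.2  # P(green | 1996 bag); 0.16 is the orange share, not the green share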

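# Total probability of yellow from a randomly chosen bag (14 + 20 yellow per 2 * n M&M's)
Pyellow = (14+20)/200

# P(1994 bag | yellow), using the yellow M&M alone and ignoring the green one
pb94y = (P94*Py94)/Pyellow
pb94y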

0.5882352941176471

In [9]:
Pyg = 0.2*0.2  # P(yellow | 1994) * P(green | 1996); green was 20% in 1996
Pyg

0.04000000000000001

In [10]:
Pyellow 

0.17

In [11]:
Pyellow = 0.2*0.5 + 0.5*0.14
Pyellow

0.17

In [17]:
# Two hypotheses: the yellow M&M came from the 1994 bag and the green from the 1996 bag,
# or the yellow came from the 1996 bag and the green from the 1994 bag.
# The two draws are independent events, so their likelihoods are multiplied.
mnm_df['prior'] = 1/2
mnm_df['likelihood'] = [0.2*0.2, 0.14*0.1]  # [yellow94 * green96, yellow96 * green94]

mnm_df['unnorm'] = mnm_df['prior'] *mnm_df['likelihood']
mnm_df['posterior'] = mnm_df['unnorm'] / mnm_df['unnorm'].sum()
mnm_df

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| 1994 | 0.5 | 0.04 | 0.02 | 0.740741 |
| 1996 | 0.5 | 0.014 | 0.007 | 0.259259 |


In [None]:
## The yellow M&M is more likely to have come from the 1994 bag (posterior of about 0.74) than from the 1996 bag (about 0.26).
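
The posterior in the table above can also be reproduced with exact fractions, which avoids floating-point rounding. This is a sketch under the same assumptions as the notebook cell (equal priors, independent draws from the two bags); the dictionary names `mix_1994` and `mix_1996` are illustrative.

```python
from fractions import Fraction

# Color shares from the problem statement, in percent (only the colors needed here).
mix_1994 = {"yellow": 20, "green": 10}
mix_1996 = {"yellow": 14, "green": 20}


def share(mix, color):
    """Probability of drawing the given color from a bag with this mix."""
    return Fraction(mix[color], 100)


prior = Fraction(1, 2)

# Hypothesis A: yellow came from the 1994 bag, green from the 1996 bag.
# Hypothesis B: yellow came from the 1996 bag, green from the 1994 bag.
like_a = share(mix_1994, "yellow") * share(mix_1996, "green")
like_b = share(mix_1996, "yellow") * share(mix_1994, "green")

total = prior * like_a + prior * like_b
post_a = prior * like_a / total
post_b = prior * like_b / total
print(post_a, post_b)  # 20/27 7/27
```

The exact value 20/27 is approximately 0.740741, matching the `posterior` column for 1994.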