# Intro to Bayesian Statistics Lab

Complete the following set of exercises to solidify your knowledge of Bayesian statistics and Bayesian data analysis.

In [6]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## 1. Cookie Problem

Suppose we have two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each. You randomly pick one cookie out of one of the bowls, and it is vanilla. Use Bayes' Theorem to calculate the probability that the vanilla cookie you picked came from Bowl 1.

P(A|B) = P(B|A)*P(A) / P(B)
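
To make the mapping to the cookie problem explicit, one way to read the formula is to take A as "the cookie came from Bowl 1" and B as "the cookie is vanilla". Assuming each bowl is equally likely to be chosen (an assumption the problem statement implies but does not state), the numbers work out as:

$$
P(\text{Bowl 1} \mid \text{vanilla})
= \frac{P(\text{vanilla} \mid \text{Bowl 1})\, P(\text{Bowl 1})}{P(\text{vanilla})}
= \frac{0.75 \times 0.5}{0.75 \times 0.5 + 0.5 \times 0.5}
= \frac{0.375}{0.625}
= 0.6
$$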

In [7]:
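p_vanilla = (.75 * .5) + (.5 * .5)  # P(B): total probability of drawing a vanilla cookie
p_1 = .5  # P(A): prior probability of picking from Bowl 1
p_vanilla_1 = .75  # P(B|A): probability of vanilla given Bowl 1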

p_1_vanilla = p_vanilla_1 * p_1 / p_vanilla
p_1_vanilla

0.6

What is the probability that it came from Bowl 2?

In [8]:
p_vanilla_2 = 0.5
p_2 = 0.5

p_2_vanilla = p_vanilla_2 * p_2 / p_vanilla
p_2_vanilla

0.4

What if the cookie you had picked was chocolate? What are the probabilities that the chocolate cookie came from Bowl 1 and Bowl 2 respectively?

In [9]:
p_choco_1 = 1 - p_vanilla_1
p_choco_2 = 1 - p_vanilla_2
p_choco = 1 - p_vanilla

p_1_choco = p_choco_1 * p_1 / p_choco
p_2_choco = p_choco_2 * p_2 / p_choco

print(f'Bowl 1: {p_1_choco}')
print(f'Bowl 2: {p_2_choco}')

Bowl 1: 0.3333333333333333
Bowl 2: 0.6666666666666666


## 2. Candy Problem

Suppose you have two bags of candies:

- In Bag 1, the mix of colors is:
    - Brown - 30%
    - Yellow - 20%
    - Red - 20%
    - Green - 10%
    - Orange - 10%
    - Tan - 10%
    
- In Bag 2, the mix of colors is:
    - Blue - 24%
    - Green - 20%
    - Orange - 16%
    - Yellow - 14%
    - Red - 13%
    - Brown - 13%
    
Not knowing which bag is which, you randomly draw one candy from each bag. One is yellow and one is green. What is the probability that the yellow one came from Bag 1?

*Hint: For the likelihoods, you will need to multiply the probabilities of drawing yellow from one bag and green from the other bag and vice versa.*

In [10]:
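p_yellow_1 = 0.2
p_yellow_2 = 0.14
p_green_1 = 0.1
p_green_2 = 0.2
p_1 = 0.5  # prior: the yellow candy came from Bag 1
p_2 = 0.5  # prior: the yellow candy came from Bag 2

# Likelihood of the observed pair (one yellow, one green) under each hypothesis
like_1 = p_yellow_1 * p_green_2  # yellow from Bag 1, green from Bag 2
like_2 = p_yellow_2 * p_green_1  # yellow from Bag 2, green from Bag 1
p_data = (p_1 * like_1) + (p_2 * like_2)

p_1_yellow = like_1 * p_1 / p_data
round(p_1_yellow, 4)

0.7407

What is the probability that the yellow candy came from Bag 2?

In [11]:
p_2_yellow = like_2 * p_2 / p_data
round(p_2_yellow, 4)

0.2593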

What are the probabilities that the green one came from Bag 1 and Bag 2 respectively?

In [16]:
p_green_1 = 0.1
p_green_2 = 0.2

# The two draws are linked: the green candy came from Bag 1 exactly when the
# yellow one came from Bag 2, so each likelihood is the product of both draws.
like_green_1 = p_green_1 * p_yellow_2  # green from Bag 1, yellow from Bag 2
like_green_2 = p_green_2 * p_yellow_1  # green from Bag 2, yellow from Bag 1
p_pair = (p_1 * like_green_1) + (p_2 * like_green_2)

p_1_green = like_green_1 * p_1 / p_pair
round(p_1_green, 4)

0.2593

In [17]:
p_2_green = like_green_2 * p_2 / p_pair
round(p_2_green, 4)

0.7407
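
A simulation offers an independent check on the candy problem. The sketch below is not part of the original lab: it draws one candy from each bag many times, keeps only the trials where the pair is one yellow and one green, and reports how often the yellow candy was the one from Bag 1. The seed, the trial count, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
n_trials = 500_000

bag1 = {"brown": 0.30, "yellow": 0.20, "red": 0.20,
        "green": 0.10, "orange": 0.10, "tan": 0.10}
bag2 = {"blue": 0.24, "green": 0.20, "orange": 0.16,
        "yellow": 0.14, "red": 0.13, "brown": 0.13}

# One candy from each bag per trial
draw1 = rng.choice(list(bag1), size=n_trials, p=list(bag1.values()))
draw2 = rng.choice(list(bag2), size=n_trials, p=list(bag2.values()))

# Trials that match the observation: one yellow and one green
yellow_from_1 = (draw1 == "yellow") & (draw2 == "green")
yellow_from_2 = (draw1 == "green") & (draw2 == "yellow")
observed = yellow_from_1 | yellow_from_2

p_yellow_bag1 = yellow_from_1.sum() / observed.sum()
p_yellow_bag2 = yellow_from_2.sum() / observed.sum()
print(f"Yellow from Bag 1: {p_yellow_bag1:.3f}")
print(f"Yellow from Bag 2: {p_yellow_bag2:.3f}")
```

Because the estimate conditions on the full observed pair, it should land close to the exact ratio 0.04 / (0.04 + 0.014), up to sampling noise.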

## 3. Monty Hall Problem

Suppose you are a contestant on the popular game show *Let's Make a Deal*. The host of the show (Monty Hall) presents you with three doors - Door A, Door B, and Door C. He tells you that there is a sports car behind one of them and if you choose the correct one, you win the car!

You select Door A, but then Monty makes things a little more interesting. He opens Door B to reveal that there is no sports car behind it and asks you if you would like to stick with your choice of Door A or switch your choice to Door C. Given this new information, what are the probabilities of you winning the car if you stick with Door A versus if you switch to Door C?

- P(A|B) = P(B|A)*P(A) / P(B)
- P(A and B) = P(A) ⋅ P(B|A)

Before Door B is revealed, we have these probabilities:

In [44]:
# Probability that the car is behind each door:
p_a = 1/3
p_b = 1/3
p_c = 1/3

# Probability that the car is not behind each door:
p_not_a = p_b + p_c
p_not_b = p_a + p_c
p_not_c = p_a + p_b

# Probability that the car is not behind Door B, given that it is behind A or C:
p_not_b_a = 1
p_not_b_c = 1

# Probability that the car is behind A, given only that it is not behind B.
# This naive update ignores how Monty chooses which door to open:
p_a_not_b = p_not_b_a * p_a / p_not_b
p_a_not_b

0.5

After Door B is revealed, the probabilities change:

In [45]:
# Probability that Monty reveals Door B, given that the car is behind A, B, or C
# and the contestant selected Door A:
p_revelo_b_a_seleccionada_a = 1/2
p_revelo_b_b_seleccionada_a = 0
# This is the crucial probability. B is the only door Monty can open,
# because he cannot reveal the car.
p_revelo_b_c_seleccionada_a = 1

# Probability that Door B is revealed if we select Door A:
p_revelo_b = (p_revelo_b_a_seleccionada_a * p_a
              + p_revelo_b_b_seleccionada_a * p_b
              + p_revelo_b_c_seleccionada_a * p_c)

# Probability that the car is behind A, B, or C, given that A was selected and B was revealed:
p_a_seleccionada_a_revelo_b = p_revelo_b_a_seleccionada_a * p_a / p_revelo_b
p_b_seleccionada_a_revelo_b = 0
p_c_seleccionada_a_revelo_b = p_revelo_b_c_seleccionada_a * p_c / p_revelo_b

print(f'No switch: {p_a_seleccionada_a_revelo_b}')
print(f'Switch: {p_c_seleccionada_a_revelo_b}')

No switch: 0.3333333333333333
Switch: 0.6666666666666666
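
The result can also be checked by simulating the game. The sketch below is an illustration rather than part of the original lab. It assumes the standard rules: the car is placed uniformly at random, the contestant always picks Door A, and Monty opens a door that is neither the pick nor the car, choosing at random when he has two options. Only games in which he opens Door B are kept, to match the story. The seed, the number of games, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_games = 100_000

# Door hiding the car: 0 = A, 1 = B, 2 = C. The contestant always picks A.
car = rng.integers(0, 3, size=n_games)

# Monty opens a door that is neither Door A nor the car.
# If the car is behind A, he picks B or C at random.
random_goat_door = rng.integers(1, 3, size=n_games)
opened = np.where(car == 0, random_goat_door, 3 - car)

# Keep only the games that match the story: Monty opened Door B.
opened_b = opened == 1
stick_wins = np.mean(car[opened_b] == 0)   # car was behind A all along
switch_wins = np.mean(car[opened_b] == 2)  # car is behind C

print(f"No switch: {stick_wins:.3f}")
print(f"Switch: {switch_wins:.3f}")
```

The estimates should sit near 1/3 and 2/3, up to sampling noise.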


## 4. Bayesian Analysis 

Suppose you work for a landscaping company, and they want to advertise their service online. They create an ad and sit back waiting for the money to roll in. On the first day, the ad sends 100 visitors to the site and 14 of them sign up for landscaping services. Create a generative model to come up with the posterior distribution and produce a visualization of what the posterior distribution would look like given the observed data.

NOTE: P(A|B) = P(A) P(B|A) / P(B)

- P(A): prior, probability of the hypothesis before we see the data.
- P(B|A): likelihood, probability of the data under the hypothesis.
- P(B): marginal probability, probability of the data under any hypothesis. Can be computed as P(A) P(B|A) + P(¬A) P(B|¬A).
- P(A|B): posterior, probability after having seen the data.


- Hypothesis: how many people will sign up for the service for each ad?
- Prior: there is a 50/50 chance that someone who sees the ad signs up for the service.

In [47]:
def bayes_rule(priors, likelihoods):
    marg = sum(np.multiply(priors, likelihoods))
    post = np.divide(np.multiply(priors, likelihoods), marg)
    return post
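
One possible generative model for the ad data is rejection sampling: draw candidate sign-up rates from a prior, simulate 100 visitors for each rate, and keep the rates that reproduce the observed 14 sign-ups. The sketch below is not the lab's official solution. The uniform prior, the seed, the number of draws, and the `plot_posterior` helper are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n_draws = 200_000
n_visitors, n_signups = 100, 14

# Assumption: a uniform prior over the sign-up rate.
prior_rates = rng.uniform(0, 1, size=n_draws)

# Generative model: simulate the number of sign-ups for each candidate rate.
simulated_signups = rng.binomial(n_visitors, prior_rates)

# Keep only the rates whose simulation matches the observed data.
posterior = prior_rates[simulated_signups == n_signups]
print(f"Accepted {posterior.size} of {n_draws} draws")


def plot_posterior(samples, bins=40):
    """Histogram of posterior samples (hypothetical helper, not from the lab)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    plt.hist(samples, bins=bins, density=True)
    plt.xlabel("Sign-up rate")
    plt.ylabel("Posterior density")
    plt.title("Posterior after 14 sign-ups from 100 visitors")
    plt.show()
```

Calling `plot_posterior(posterior)` draws the histogram. Rejection sampling discards most draws, so a large `n_draws` is needed to keep a usable number of posterior samples.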

Produce a set of descriptive statistics for the posterior distribution.
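
A minimal sketch of such a summary follows. It assumes a uniform prior, under which the posterior for 14 sign-ups out of 100 visitors is a Beta(15, 87) distribution, so samples can be drawn from it directly. That prior, the seed, and the variable names are assumptions rather than part of the lab.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Assumption: uniform prior, so the posterior is Beta(14 + 1, 86 + 1).
samples = rng.beta(15, 87, size=50_000)

summary = {
    "mean": samples.mean(),
    "median": np.median(samples),
    "std": samples.std(),
    "min": samples.min(),
    "max": samples.max(),
}
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```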

What is the 90% credible interval range?
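
One way to approximate the interval is a grid over candidate sign-up rates, reusing the `bayes_rule` function defined earlier (repeated here so the block runs on its own). The sketch assumes a uniform prior and an equal-tailed interval, taking the 5% and 95% points of the posterior's cumulative sum. The grid size and variable names are illustrative choices.

```python
import numpy as np


def bayes_rule(priors, likelihoods):
    marg = sum(np.multiply(priors, likelihoods))
    post = np.divide(np.multiply(priors, likelihoods), marg)
    return post


rates = np.linspace(0, 1, 1001)                # candidate sign-up rates
priors = np.full(rates.size, 1 / rates.size)   # assumption: uniform prior
likelihoods = rates**14 * (1 - rates)**86      # binomial kernel for 14 of 100

post = bayes_rule(priors, likelihoods)
cdf = np.cumsum(post)

# Equal-tailed 90% interval: 5% of posterior mass in each tail
lower = rates[np.searchsorted(cdf, 0.05)]
upper = rates[np.searchsorted(cdf, 0.95)]
print(f"90% credible interval: [{lower:.3f}, {upper:.3f}]")
```

The binomial coefficient is omitted from the likelihood because it is the same for every rate and cancels in the normalization.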

What is the Maximum Likelihood Estimate?
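
For binomial data, the maximum likelihood estimate has the closed form k/n, here 14/100. The sketch below is an illustration with assumed variable names and grid resolution. It checks the closed form numerically by evaluating the log-likelihood on a grid of rates and taking the argmax.

```python
import numpy as np

n_visitors, n_signups = 100, 14

# Closed-form MLE for a binomial proportion
mle_closed_form = n_signups / n_visitors

# Numerical check: maximize the log-likelihood over a grid of rates.
# The endpoints 0 and 1 are excluded to avoid log(0).
rates = np.linspace(0.001, 0.999, 999)
log_likelihood = (n_signups * np.log(rates)
                  + (n_visitors - n_signups) * np.log(1 - rates))
mle_grid = rates[np.argmax(log_likelihood)]

print(f"Closed form: {mle_closed_form:.3f}")
print(f"Grid search: {mle_grid:.3f}")
```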