# Intro to Bayesian Statistics Lab

Complete the following set of exercises to solidify your knowledge of Bayesian statistics and Bayesian data analysis.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## 1. Cookie Problem

Suppose we have two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each. You randomly pick one cookie from one of the bowls, and it is vanilla. Use Bayes' theorem to calculate the probability that the vanilla cookie you picked came from Bowl 1.

In [None]:
# Given values
P_B_given_A1 = 30 / 40  # Probability of picking a vanilla cookie from Bowl 1
P_B_given_A2 = 20 / 40  # Probability of picking a vanilla cookie from Bowl 2
P_A1 = 1 / 2  # Probability of picking from Bowl 1
P_A2 = 1 / 2  # Probability of picking from Bowl 2

# Calculating P(B)
P_B = (P_B_given_A1 * P_A1) + (P_B_given_A2 * P_A2)

# Using Bayes' Theorem to calculate P(A1|B)
P_A1_given_B = (P_B_given_A1 * P_A1) / P_B

P_A1_given_B

What is the probability that it came from Bowl 2?

In [None]:
# Calculate P(A2|B) using the complement of P(A1|B)
P_A2_given_B = 1 - P_A1_given_B

P_A2_given_B

What if the cookie you picked had been chocolate? What are the probabilities that the chocolate cookie came from Bowl 1 and Bowl 2, respectively?

In [None]:
# Adjusted probabilities for picking a chocolate cookie
P_B_prime_given_A1 = 10 / 40  # Probability of picking a chocolate cookie from Bowl 1
P_B_prime_given_A2 = 20 / 40  # Probability of picking a chocolate cookie from Bowl 2

# Calculating P(B'), the total probability of picking a chocolate cookie
P_B_prime = (P_B_prime_given_A1 * P_A1) + (P_B_prime_given_A2 * P_A2)

# Using Bayes' Theorem to calculate P(A1|B') and P(A2|B')
P_A1_given_B_prime = (P_B_prime_given_A1 * P_A1) / P_B_prime
P_A2_given_B_prime = (P_B_prime_given_A2 * P_A2) / P_B_prime

P_A1_given_B_prime, P_A2_given_B_prime
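
The vanilla and chocolate calculations above follow the same pattern: multiply each prior by its likelihood, then normalize by the total. A minimal sketch of that pattern follows, using exact fractions so the results carry no floating-point rounding. The helper name `bayes_posterior` is an illustrative assumption and not part of the lab.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return the posterior for each hypothesis: prior * likelihood, normalized."""
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(unnormalized)  # total probability of the observation
    return [u / evidence for u in unnormalized]


# Equal priors for [Bowl 1, Bowl 2]
priors = [Fraction(1, 2), Fraction(1, 2)]

# Likelihood of each cookie type given [Bowl 1, Bowl 2]
vanilla_posterior = bayes_posterior(priors, [Fraction(30, 40), Fraction(20, 40)])
chocolate_posterior = bayes_posterior(priors, [Fraction(10, 40), Fraction(20, 40)])

print(vanilla_posterior)    # [Fraction(3, 5), Fraction(2, 5)]
print(chocolate_posterior)  # [Fraction(1, 3), Fraction(2, 3)]
```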


## 2. Candy Problem

Suppose you have two bags of candies:

- In Bag 1, the mix of colors is:
    - Brown - 30%
    - Yellow - 20%
    - Red - 20%
    - Green - 10%
    - Orange - 10%
    - Tan - 10%
    
- In Bag 2, the mix of colors is:
    - Blue - 24%
    - Green - 20%
    - Orange - 16%
    - Yellow - 14%
    - Red - 13%
    - Brown - 13%
    
Not knowing which bag is which, you randomly draw one candy from each bag. One is yellow and one is green. What is the probability that the yellow one came from Bag 1?

*Hint: For the likelihoods, you will need to multiply the probabilities of drawing yellow from one bag and green from the other bag and vice versa.*

In [None]:
# Probabilities of drawing specific colors from each bag
prob_yellow_bag1 = 0.20
prob_green_bag2 = 0.20
prob_green_bag1 = 0.10
prob_yellow_bag2 = 0.14

# Calculating the probabilities for the events
P_Yellow_Bag1_And_Green_Bag2 = prob_yellow_bag1 * prob_green_bag2
P_Green_Bag1_And_Yellow_Bag2 = prob_green_bag1 * prob_yellow_bag2

# Total probability of drawing one yellow and one green candy
P_B = P_Yellow_Bag1_And_Green_Bag2 + P_Green_Bag1_And_Yellow_Bag2

# Both assignments (yellow from Bag 1 or yellow from Bag 2) have the same prior of 0.5, so the prior cancels in Bayes' theorem
P_A_given_B = P_Yellow_Bag1_And_Green_Bag2 / P_B

P_A_given_B

What is the probability that the yellow candy came from Bag 2?

In [None]:
# Calculating the probability that the yellow candy came from Bag 2
P_yellow_from_Bag2 = 1 - P_A_given_B

P_yellow_from_Bag2


What are the probabilities that the green one came from Bag 1 and Bag 2, respectively?

In [None]:
# Assuming P_yellow_from_Bag2 and P_A_given_B are the probabilities calculated earlier

# Probability that the green candy came from Bag 1 is equivalent to the probability
# that the yellow candy came from Bag 2
P_green_from_Bag1 = P_yellow_from_Bag2

# Probability that the green candy came from Bag 2 is equivalent to the probability
# that the yellow candy came from Bag 1
P_green_from_Bag2 = P_A_given_B

print(f"Probability that the green candy came from Bag 1: {P_green_from_Bag1}")
print(f"Probability that the green candy came from Bag 2: {P_green_from_Bag2}")
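
All three candy questions can be answered from a single Bayes table with two hypotheses: H1, the yellow candy came from Bag 1 (so the green one came from Bag 2), and H2, the reverse. The sketch below uses exact fractions and keeps the 0.5 prior explicit to show that it cancels. The hypothesis labels and variable names are illustrative assumptions.

```python
from fractions import Fraction

# Color proportions from the problem statement, as exact fractions
bag1 = {"yellow": Fraction(20, 100), "green": Fraction(10, 100)}
bag2 = {"yellow": Fraction(14, 100), "green": Fraction(20, 100)}

prior = Fraction(1, 2)  # each assignment of colors to bags is equally likely a priori

# Likelihood of seeing one yellow and one green under each hypothesis
likelihood_h1 = bag1["yellow"] * bag2["green"]  # yellow from Bag 1, green from Bag 2
likelihood_h2 = bag2["yellow"] * bag1["green"]  # yellow from Bag 2, green from Bag 1

evidence = prior * likelihood_h1 + prior * likelihood_h2
posterior_h1 = prior * likelihood_h1 / evidence
posterior_h2 = prior * likelihood_h2 / evidence

print(posterior_h1, float(posterior_h1))  # 20/27, about 0.741
print(posterior_h2, float(posterior_h2))  # 7/27, about 0.259
```

Because the green candy's bag is fixed once the yellow candy's bag is known, `posterior_h2` is also the probability that the green candy came from Bag 1.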


## 3. Monty Hall Problem

Suppose you are a contestant on the popular game show *Let's Make a Deal*. The host of the show (Monty Hall) presents you with three doors - Door A, Door B, and Door C. He tells you that there is a sports car behind one of them and if you choose the correct one, you win the car!

You select Door A, but then Monty makes things a little more interesting. He opens Door B to reveal that there is no sports car behind it and asks you if you would like to stick with your choice of Door A or switch your choice to Door C. Given this new information, what are the probabilities of you winning the car if you stick with Door A versus if you switch to Door C?

In [None]:
import random


def monty_hall_simulation(num_trials=10000):
    """Estimate the probabilities of winning by staying and by switching."""
    win_by_staying = 0
    win_by_switching = 0

    for _ in range(num_trials):
        # There are three doors (0, 1, 2); one of them hides the car
        car_behind = random.randint(0, 2)
        contestant_choice = random.randint(0, 2)

        # Monty opens a door that the contestant did not pick and that does not hide the car
        possible_doors_to_open = [door for door in range(3) if door != contestant_choice and door != car_behind]
        door_opened_by_monty = random.choice(possible_doors_to_open)

        # Switching means taking the one door that is neither the original choice nor the opened one
        switched_choice = next(door for door in range(3) if door not in (contestant_choice, door_opened_by_monty))

        # Staying wins when the original choice hides the car
        if contestant_choice == car_behind:
            win_by_staying += 1

        # Switching wins when the remaining door hides the car
        if switched_choice == car_behind:
            win_by_switching += 1

    probability_of_winning_by_staying = win_by_staying / num_trials
    probability_of_winning_by_switching = win_by_switching / num_trials

    return probability_of_winning_by_staying, probability_of_winning_by_switching

# Simulate the Monty Hall problem
monty_hall_simulation()
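
The simulation can be cross-checked against an exact Bayes calculation for the scenario in the prompt (you pick Door A, Monty opens Door B). The sketch below assumes the standard rules, which the prompt does not spell out: Monty always opens an unchosen door that does not hide the car, and he picks at random when two such doors are available.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind each door
prior = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# Likelihood that Monty opens Door B, given the car's location and your pick of Door A
# (assumed rules: Monty never opens your door or the car's door)
likelihood_opens_b = {
    "A": Fraction(1, 2),  # car behind A: Monty picks B or C at random
    "B": Fraction(0),     # car behind B: Monty cannot open B
    "C": Fraction(1),     # car behind C: B is the only door Monty can open
}

unnormalized = {door: prior[door] * likelihood_opens_b[door] for door in prior}
evidence = sum(unnormalized.values())
posterior = {door: value / evidence for door, value in unnormalized.items()}

print(posterior["A"])  # 1/3 (win by staying)
print(posterior["C"])  # 2/3 (win by switching)
```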


## 4. Bayesian Analysis 

Suppose you work for a landscaping company, and they want to advertise their service online. They create an ad and sit back waiting for the money to roll in. On the first day, the ad sends 100 visitors to the site and 14 of them sign up for landscaping services. Create a generative model to come up with the posterior distribution and produce a visualization of what the posterior distribution would look like given the observed data.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

# Prior distribution parameters (assuming a uniform prior for simplicity)
alpha_prior = 1
beta_prior = 1

# Observed data
visitors = 100
signups = 14

# Posterior distribution parameters (conjugate Beta-Binomial update: add signups to alpha, non-signups to beta)
alpha_post = alpha_prior + signups
beta_post = beta_prior + visitors - signups

# Generate the posterior distribution
x = np.linspace(0, 1, 1000)
posterior = beta.pdf(x, alpha_post, beta_post)

# Plotting
plt.figure(figsize=(10, 6))
plt.plot(x, posterior, label='Posterior distribution')
plt.title('Posterior Distribution of Conversion Rate')
plt.xlabel('Conversion rate')
plt.ylabel('Density')
plt.legend()
plt.grid(True)
plt.show()
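
The cell above obtains the posterior analytically through the Beta-Binomial conjugate update. The exercise also asks for a generative model, which can be sketched with rejection sampling (a simple form of approximate Bayesian computation): draw a conversion rate from the prior, simulate a day of traffic, and keep only the draws that reproduce the observed 14 signups. The number of draws and the random seed below are arbitrary choices, not values from the lab.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility (arbitrary choice)

n_draws = 200_000
visitors = 100
observed_signups = 14

# 1. Draw candidate conversion rates from the uniform prior
prior_rates = rng.uniform(0, 1, size=n_draws)

# 2. Generative model: simulate the number of signups for each candidate rate
simulated_signups = rng.binomial(visitors, prior_rates)

# 3. Keep only the rates whose simulated data match the observation
posterior_samples = prior_rates[simulated_signups == observed_signups]

print(len(posterior_samples))    # number of accepted draws
print(posterior_samples.mean())  # should be close to the analytic mean 15/102
```

A histogram of `posterior_samples`, for example `plt.hist(posterior_samples, bins=30, density=True)`, should approximate the Beta(15, 87) density obtained from the conjugate update.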


Produce a set of descriptive statistics for the posterior distribution.

In [None]:
# Descriptive statistics for the posterior distribution
mean = beta.mean(alpha_post, beta_post)
variance = beta.var(alpha_post, beta_post)
std_dev = beta.std(alpha_post, beta_post)
median = beta.median(alpha_post, beta_post)
mode = (alpha_post - 1) / (alpha_post + beta_post - 2)  # Mode formula for beta distribution
credible_interval_95 = beta.interval(0.95, alpha_post, beta_post)  # equal-tailed 95% credible interval

descriptive_stats = {
    "Mean": mean,
    "Variance": variance,
    "Standard Deviation": std_dev,
    "Median": median,
    "Mode": mode,
    "95% Credible Interval": credible_interval_95
}

descriptive_stats


What is the 90% credible interval?

In [None]:
# 90% Credible interval for the posterior distribution
credible_interval_90 = beta.interval(0.90, alpha_post, beta_post)

credible_interval_90
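
`beta.interval` returns an equal-tailed interval: for 90%, it cuts 5% of the probability mass from each tail. The same idea can be sketched with random draws from the posterior, assuming the Beta(15, 87) posterior that follows from the uniform prior and the observed data. The sample size and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

# Posterior parameters: uniform prior plus 14 signups out of 100 visitors
alpha_post = 1 + 14
beta_post = 1 + 100 - 14

# Draw from the posterior and take the 5th and 95th percentiles
samples = rng.beta(alpha_post, beta_post, size=100_000)
lower, upper = np.percentile(samples, [5, 95])

print(lower, upper)  # approximate equal-tailed 90% credible interval
```

The sampled bounds should agree with `credible_interval_90` to roughly two or three decimal places, with the gap shrinking as the sample size grows.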


What is the Maximum Likelihood Estimate?

In [None]:
# Calculate the Maximum Likelihood Estimate (MLE) for the conversion rate
mle = signups / visitors

mle
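
The closed form `signups / visitors` is the rate that maximizes the binomial likelihood of the observed data. A short numerical check over a grid of candidate rates illustrates this; the grid resolution is an arbitrary choice.

```python
import math

visitors = 100
signups = 14


def log_likelihood(rate):
    """Binomial log-likelihood of the data, up to a constant that does not depend on rate."""
    return signups * math.log(rate) + (visitors - signups) * math.log(1 - rate)


# Candidate conversion rates 0.001, 0.002, ..., 0.999
grid = [i / 1000 for i in range(1, 1000)]
mle_numeric = max(grid, key=log_likelihood)

print(mle_numeric)  # 0.14
```

With the uniform Beta(1, 1) prior used earlier, the posterior mode, (15 - 1) / (15 + 87 - 2) = 0.14, coincides with this estimate.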
